Eng 567 pr2 saas config version #7688

Open
Vagoasdf wants to merge 59 commits into main from ENG-567_pr2-saas-config-version

Conversation


@Vagoasdf Vagoasdf commented Mar 18, 2026

Ticket []

Description Of Changes

Setting up the backend for SaaS config historical information

This PR sets up the backend for historical record keeping of SaaS integrations, creating two new models: SaaSConfigVersion, which stores version changes of out-of-the-box integrations, and ConnectionConfigSaaSHistory, which stores the history of a specific SaaS connection instance. It also creates the endpoints used to request data from these history versions.

Code Changes

  • Creates new models for storing SaaS config versions
  • Creates migrations for those models
  • Sets up endpoints to access them

Steps to Confirm

  1. Start up a server
  2. Check on the localhost API that the out-of-the-box integrations have set up their version template through /connector-templates/{connector_template_type}/versions
  3. Create a new integration connection
  4. Call /connection/{connection_key}/saas-history and check that the SaaS history is recorded

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required


vercel bot commented Mar 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
  • fides-plus-nightly: Ignored (Preview), updated Mar 27, 2026 7:52pm (UTC)
  • fides-privacy-center: Ignored, updated Mar 27, 2026 7:52pm (UTC)


@Vagoasdf Vagoasdf force-pushed the ENG-567_pr2-saas-config-version branch from bf2caf2 to b3ebb01 on March 18, 2026 15:56
Vagoasdf and others added 4 commits March 18, 2026 14:35
- Use conditional logger.info calls instead of mixing f-strings with loguru placeholders in log_start/log_retry
- Add comment explaining _saas_version is snapshotted at construction time
- Fix missing blank line between test functions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Vagoasdf and others added 2 commits March 25, 2026 15:18
- Fix unsorted imports (ruff I001) in base.py, seed.py, and connector_template_endpoints.py
- Fix mypy return type error in list_connector_template_versions — convert ORM rows to SaaSConfigVersionResponse explicitly
- Annotate saas_config_version and connection_config_saas_history tables in db_dataset.yml to fix fides_db_scan
- Add changelog entry for PR #7688

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Vagoasdf Vagoasdf marked this pull request as ready for review March 27, 2026 18:54
@Vagoasdf Vagoasdf requested a review from a team as a code owner March 27, 2026 18:54
@Vagoasdf Vagoasdf requested review from JadeCara and removed request for a team March 27, 2026 18:54

@claude claude bot left a comment


Code Review Summary

Good feature overall — the two-table design (template-level SaaSConfigVersion vs. connection-scoped ConnectionConfigSaaSHistory) is well-motivated and the append-only history approach is sound. Test coverage for the connection-history endpoints is thorough. A few things need attention before merge:

Critical (Must Fix)

  • .mcp.json should not be committed — contains hardcoded local machine paths (/Users/bruno/Ethyca/...). Add to .gitignore instead.
  • Unused imports — load_config_from_string and load_dataset_from_string are imported in connector_template_endpoints.py but never used.
  • Missing tests — the three new connector-template version endpoints (/versions, /versions/{version}/config, /versions/{version}/dataset) have no test coverage.

Suggestions

  • Model method doing cross-model work — update_saas_config now imports DatasetConfig and ConnectionConfigSaaSHistory, queries related rows, and creates a snapshot. This is heavy for a model method and makes it hard to test in isolation. Consider moving the orchestration to a service layer.
  • Business logic in the route handler — SaaSConfigVersion.upsert is called directly inside patch_saas_config. This belongs in a service or co-located with update_saas_config.
  • Double ConnectorRegistry.get_connector_template call — in patch_saas_config the template can be fetched twice. Fetching once before the if not existing_saas_config block and reusing the result is cleaner.
  • Missing index on connection_key — the column is denormalized specifically so history survives connection deletion, but no index backs it up.
  • DB queries in the endpoint file — _get_version_row and the inline queries in list_connector_template_versions should live in a service or repository, not directly in the endpoint module.
  • Startup queries one row per connector per boot — sync_oob_saas_config_versions can be made more efficient by fetching all existing (connector_type, version) pairs in one query and only inserting the missing ones.

Nice to Have

  • Migration filename prefix — both new migration files use xx_ rather than the hash-based prefix convention used elsewhere in this repo.
  • Bare dict in SaaSConfigVersionDetailResponse — use Dict[str, Any] to match the sibling schema's style.
  • Double None-coalescing — datasets if datasets else None at the call site combined with datasets or None inside create_snapshot is redundant; handle it in one place.

.mcp.json Outdated


Critical: Personal developer config should not be committed.

This file contains hardcoded paths to the author's local machine (/Users/bruno/Ethyca/...). It's a developer-specific MCP server configuration and should be added to .gitignore rather than committed to the repository. Anyone cloning the repo will have a broken/irrelevant .mcp.json pointing to non-existent paths on their machine.

from fides.api.service.connectors.saas.connector_registry_service import (
ConnectorRegistry,
CustomConnectorTemplateLoader,
)


Unused imports.

load_config_from_string and load_dataset_from_string are imported here but never called in any of the new endpoint handlers in this file. They should be removed.

connection_key=self.key,
version=saas_config.version,
config=self.saas_config,
datasets=datasets if datasets else None,


Double redundancy: datasets if datasets else None at the call site, then datasets or None inside create_snapshot; both yield None when the list is empty.

The call site passes datasets if datasets else None and the create_snapshot body stores datasets or None. Both expressions reduce to the same thing — the double guard is confusing and implies the distinction matters somewhere in between. Pick one location for the None-coalescing and remove the other.

More importantly, an empty [] (no associated datasets found) and None (datasets intentionally not provided) have different semantics. If the intent is always to store None when there are no datasets, that's fine, but it should be explicit rather than implicit via two chained falsy checks.
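A minimal sketch of collapsing the two checks into a single explicit place (the helper name is illustrative, not from the PR):

```python
from typing import Any, Dict, List, Optional


def normalize_datasets(
    datasets: Optional[List[Dict[str, Any]]],
) -> Optional[List[Dict[str, Any]]]:
    # The one spot where an empty list collapses to None, so the
    # "no datasets found" and "datasets not provided" cases are merged
    # deliberately rather than via two chained falsy checks.
    return datasets if datasets else None
```

Callers would then pass `datasets` through unchanged, and `create_snapshot` would store the value as-is.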

self.secrets = updated_secrets
self.saas_config = saas_config.model_dump(mode="json")

datasets = [


Model method doing cross-model querying and side-effect writes — consider moving to a service.

update_saas_config now imports DatasetConfig and ConnectionConfigSaaSHistory, queries related rows, and creates a history snapshot, all inside a model method. This makes the model aware of other models' internals and adds a hidden DB query on every call.

Ideally this orchestration belongs in a service layer that coordinates the update and the snapshot creation explicitly. At minimum, accepting a pre-fetched datasets list as an optional parameter to update_saas_config would keep the model's surface area narrow and let callers decide whether to include dataset context.
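A sketch of the narrower signature, using simplified stand-in types (the real method takes a SaaSConfig and a Session; the class and field names here are illustrative only):

```python
from typing import Any, Dict, List, Optional


class ConnectionConfigSketch:
    """Illustrative stand-in for the real ConnectionConfig model."""

    def __init__(self) -> None:
        self.saas_config: Optional[Dict[str, Any]] = None
        self.snapshots: List[Dict[str, Any]] = []

    def update_saas_config(
        self,
        saas_config: Dict[str, Any],
        datasets: Optional[List[Dict[str, Any]]] = None,
    ) -> None:
        # The model no longer queries DatasetConfig itself; a service
        # layer pre-fetches datasets and decides whether to pass them in.
        self.saas_config = saas_config
        self.snapshots.append({"config": saas_config, "datasets": datasets})
```

With this shape the model only mutates its own state, and the hidden per-call DB query disappears.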

version=saas_config.version,
config=saas_config.model_dump(mode="json"),
dataset=None, # PATCH only updates the config; dataset is managed separately
is_custom=template.custom if template else False,


Business logic directly in route handler.

Calling SaaSConfigVersion.upsert(...) here mixes persistence logic into the route layer. This belongs in a service (or at minimum, inside connection_config.update_saas_config()), so routes stay thin. The existing update_saas_config call just above already does the config update — the version upsert could be co-located there or delegated to a dedicated service method.

["connection_config_id"],
["connectionconfig.id"],
ondelete="SET NULL",
),


Missing index on connection_key.

connection_key is explicitly denormalized so history remains queryable after the parent ConnectionConfig is deleted (when connection_config_id becomes NULL). But without an index on connection_key, those orphaned rows can only be found via a full table scan.

If the intent is purely for audit/ad-hoc use this may be acceptable, but since the docstring calls out post-deletion queryability as a design goal, an index would make that goal actionable:

op.create_index(
    op.f("ix_connection_config_saas_history_connection_key"),
    "connection_config_saas_history",
    ["connection_key"],
    unique=False,
)

class SaaSConfigVersionDetailResponse(SaaSConfigVersionResponse):
"""Full detail for a single version, including config and dataset as raw dicts."""

config: dict


Use Dict[str, Any] instead of bare dict for consistency.

ConnectionConfigSaaSHistoryDetailResponse (the sibling schema) uses Dict[str, Any] for its config field. Using the bare dict annotation here is less explicit and inconsistent. Prefer Dict[str, Any] (already imported in the project's typing conventions) for both fields.
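A hedged sketch of the consistent annotation; the field set is assumed from the review text, not copied from the PR:

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel


class SaaSConfigVersionDetailResponse(BaseModel):
    """Full detail for a single version. Dict[str, Any] matches the
    sibling ConnectionConfigSaaSHistoryDetailResponse schema's style."""

    config: Dict[str, Any]
    dataset: Optional[Dict[str, Any]] = None
```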


@router.get(
CONNECTOR_TEMPLATES_VERSIONS,
dependencies=[Security(verify_oauth_client, scopes=[CONNECTOR_TEMPLATE_READ])],


No tests for the three new connector template version endpoints.

list_connector_template_versions, get_connector_template_version_config, and get_connector_template_version_dataset have no test coverage. The TestListSaaSConfigHistory / TestGetSaaSConfigHistoryByVersion classes cover the connection-scoped endpoints well — the same level of coverage (unauthenticated, wrong scope, not found, happy path) should exist for the template-version endpoints too.

) -> Optional[SaaSConfigVersion]:
"""
Returns the most recent stored row for (connector_type, version).



DB query helper living in the endpoint file.

_get_version_row performs a direct db.query(SaaSConfigVersion) inside connector_template_endpoints.py. Endpoint files in this codebase are generally thin — data access logic belongs in a service or repository layer. This helper (and the inline queries in list_connector_template_versions) should live in a service method.
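One possible shape for the extracted service, sketched against an in-memory stand-in for the DB session (class and method names are hypothetical; the real code would hold a SQLAlchemy Session and query the SaaSConfigVersion model):

```python
from typing import Any, Dict, List, Optional


class SaaSConfigVersionServiceSketch:
    """Hypothetical service wrapper for the version lookup."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        # `rows` stands in for the saas_config_version table.
        self._rows = rows

    def get_version_row(
        self, connector_type: str, version: str
    ) -> Optional[Dict[str, Any]]:
        # "Most recent stored row" is approximated here by insertion
        # order; a real query would order by created_at descending.
        matches = [
            r
            for r in self._rows
            if r["connector_type"] == connector_type
            and r["version"] == version
        ]
        return matches[-1] if matches else None
```

The endpoint handler would then depend on the service rather than issuing `db.query(...)` directly.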

SaaSConfig, # pylint: disable=import-outside-toplevel
)
from fides.api.service.connectors.saas.connector_registry_service import ( # pylint: disable=import-outside-toplevel
FileConnectorTemplateLoader,


Startup queries one row per OOB connector on every boot.

sync_oob_saas_config_versions iterates over every bundled connector template and issues a SELECT per connector to check whether the row already exists. For OOB connectors (immutable once written) this is a no-op after first deploy, but the queries still fire on every startup.

Consider a set-difference approach: fetch all (connector_type, version) pairs already in the table in a single query and only parse + insert the ones that are missing. This keeps startup overhead bounded regardless of how many OOB connectors exist.
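The set-difference approach can be sketched as follows (function and variable names are assumptions, not the PR's):

```python
from typing import Iterable, List, Set, Tuple


def versions_to_insert(
    bundled: Iterable[Tuple[str, str]],
    existing: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    # `existing` comes from a single SELECT of all (connector_type,
    # version) pairs already stored; only the missing pairs are then
    # parsed and inserted, keeping startup overhead bounded.
    return [pair for pair in bundled if pair not in existing]
```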

