Skip to content

Migration to peprs#155

Open
khoroshevskyi wants to merge 1 commit intomasterfrom
peprs
Open

Migration to peprs#155
khoroshevskyi wants to merge 1 commit intomasterfrom
peprs

Conversation

@khoroshevskyi
Copy link
Copy Markdown
Member

Changes:

  • Adding peprs to pepdbagent, to speed up validation and pep saving

TODO:

  • ❗ If this PR includes a new database schema migration, following steps are completed: Database Version Migration
  • Version of pepdbagent updated in __version__.py file
  • Changelog updated

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Migrates pepdbagent from peppy to peprs for project/sample handling, updating the raw PEP dict shape and adjusting tests + fixture PEPs accordingly.

Changes:

  • Replace peppy imports/usages with peprs across pepdbagent modules and tests.
  • Update raw project/sample dict conventions from _config/_sample_dict to config/samples (and subsamples where applicable).
  • Adjust test fixture PEP YAMLs and expectations to accommodate peprs parsing/inference behavior.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
pepdbagent/modules/project.py Switches project CRUD paths to peprs constants/dict shape; updates sample retrieval conversion.
pepdbagent/modules/sample.py Returns peprs.Sample when raw=False and updates sample-name lookup key handling.
pepdbagent/modules/view.py Updates view retrieval to return peprs projects / peprs-style raw dicts.
pepdbagent/models.py Updates pydantic raw-project model aliases to peprs constants.
pepdbagent/utils.py Switches SAMPLE_RAW_DICT_KEY import to peprs constant.
pepdbagent/const.py Removes outdated peppy-const comment.
requirements/requirements-all.txt Replaces peppy dependency with peprs.
tests/utils.py Uses peprs.Project(...) during DB seeding; adds skip-on-parse-failure behavior.
tests/test_project.py Updates project retrieval/serialization assertions to peprs dict shape and pandas conversion.
tests/test_updates.py Updates update flows to use samples list and peprs.Project.from_dict.
tests/test_samples.py Updates sample type assertions to peprs.Sample and adjusts expectations for inferred numeric types.
tests/test_project_history.py Updates history update flows to use peprs dict shape (samples, config).
tests/data/private_test/remove/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
tests/data/private_test/derive/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
tests/data/private_test/amendments1/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
tests/data/namespace3/remove/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
tests/data/namespace3/piface/project_config.yaml Adjusts pipeline_interfaces fixture value format.
tests/data/namespace2/derive/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
tests/data/namespace1/amendments1/project_config.yaml Removes a derive source entry to fit updated parsing/fixtures.
Comments suppressed due to low confidence (1)

pepdbagent/modules/project.py:103

  • The get() docstring still describes the old peppy raw-project keys (_config, _sample_dict, _subsample_dict). After migrating to peprs, the raw dict returned here uses config/samples/subsamples (via peprs.const). Please update the docstring so it matches the actual return shape.
        :return: peprs.Project object with found project or dict with unprocessed
            PEP elements: {
                name: str
                description: str
                _config: dict

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +330 to 332
if isinstance(project, peprs.Project):
proj_dict = project.to_dict(raw=True, by_sample=True)
elif isinstance(project, dict):
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using peprs.Project.to_dict(raw=True, by_sample=True) will preserve peprs/polars-inferred column dtypes (e.g., numeric columns become ints). This appears to be a behavioral change vs the prior peppy-based behavior (tests now expect time == 0 instead of '0') and can be a breaking API change for clients expecting strings. Consider normalizing sample values to strings before persisting, or explicitly documenting this type change as part of the migration.

Copilot uses AI. Check for mistakes.
Comment on lines +76 to +80
sub_project_dict = {"config": config, "samples": samples}
if raw:
return sub_project_dict
else:
return peppy.Project.from_dict(sub_project_dict)
return peprs.Project.from_dict(sub_project_dict)
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The view get() raw dict omits subsamples, so the returned structure differs from project.get(..., raw=True) (which includes subsamples, even if empty). For API consistency, consider returning subsamples: [] as well (and ideally using peprs.const keys).

Copilot uses AI. Check for mistakes.
Comment on lines 385 to +389
if raw:
return {"_config": config, "_sample_dict": samples, "_subsample_dict": None}
return {"config": config, "samples": samples}
else:
return peppy.Project.from_dict(
{"_config": config, "_sample_dict": samples, "_subsample_dict": None}
return peprs.Project.from_dict(
{"config": config, "samples": samples}
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_snap_view() returns/constructs a project dict without subsamples, which makes its output shape differ from other raw project-returning APIs. Consider including an explicit empty subsamples key (and/or using peprs.const keys) in both branches for consistency.

Copilot uses AI. Check for mistakes.
Comment on lines 146 to 150
sample_mapping.sample_name = sample_mapping.sample[
project_mapping.config.get(SAMPLE_TABLE_INDEX_KEY, "sample_name")
project_mapping.config.get("sample_table_index", "sample_name")
]
except KeyError:
raise KeyError(
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample_table_index is now referenced via the hard-coded string key here. To reduce duplication/typos and keep it consistent with the project module, consider using a shared constant (e.g., defined in pepdbagent.const) for this config key and reusing it in both the lookup and the error message.

Copilot uses AI. Check for mistakes.
Comment thread tests/utils.py
Comment on lines +112 to +113
warnings.warn(f"Skipping {namespace}/{name}: {e}")
continue
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching a broad Exception and skipping the project during test DB seeding can silently reduce the dataset and make test results depend on which projects failed to parse. For tests, it’s usually better to fail fast (or catch only expected parse/validation exceptions and re-raise with context) so failures are actionable.

Suggested change
warnings.warn(f"Skipping {namespace}/{name}: {e}")
continue
raise RuntimeError(
f"Failed to load test project {namespace}/{name} from {path}"
) from e

Copilot uses AI. Check for mistakes.
Comment on lines 1 to +3
sqlalchemy>=2.0.0
logmuse>=0.2.7
peppy>=0.40.6
peprs>=0.1.2
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description TODO mentions bumping pepdbagent’s version and updating the changelog, but this change set updates a core dependency (peprs) without updating pepdbagent/_version.py or docs/changelog.md. Please include the version/changelog updates as part of this migration so downstream users can track the breaking change.

Copilot uses AI. Check for mistakes.
Comment on lines +14 to 16
SAMPLE_NAME_ATTR = "sample_name"
SAMPLE_TABLE_INDEX_KEY = "sample_table_index"
from sqlalchemy import Select, and_, delete, select
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SAMPLE_NAME_ATTR / SAMPLE_TABLE_INDEX_KEY are defined before the import block finishes (imports resume right after). This breaks standard import ordering. Move these constants below all imports (or into pepdbagent.const) so the file keeps a clean imports-then-code structure.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants