Conversation
There was a problem hiding this comment.
Pull request overview
Migrates pepdbagent from peppy to peprs for project/sample handling, updating the raw PEP dict shape and adjusting tests + fixture PEPs accordingly.
Changes:
- Replace peppy imports/usages with peprs across pepdbagent modules and tests.
- Update raw project/sample dict conventions from
_config/_sample_dicttoconfig/samples(andsubsampleswhere applicable). - Adjust test fixture PEP YAMLs and expectations to accommodate peprs parsing/inference behavior.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
pepdbagent/modules/project.py |
Switches project CRUD paths to peprs constants/dict shape; updates sample retrieval conversion. |
pepdbagent/modules/sample.py |
Returns peprs.Sample when raw=False and updates sample-name lookup key handling. |
pepdbagent/modules/view.py |
Updates view retrieval to return peprs projects / peprs-style raw dicts. |
pepdbagent/models.py |
Updates pydantic raw-project model aliases to peprs constants. |
pepdbagent/utils.py |
Switches SAMPLE_RAW_DICT_KEY import to peprs constant. |
pepdbagent/const.py |
Removes outdated peppy-const comment. |
requirements/requirements-all.txt |
Replaces peppy dependency with peprs. |
tests/utils.py |
Uses peprs.Project(...) during DB seeding; adds skip-on-parse-failure behavior. |
tests/test_project.py |
Updates project retrieval/serialization assertions to peprs dict shape and pandas conversion. |
tests/test_updates.py |
Updates update flows to use samples list and peprs.Project.from_dict. |
tests/test_samples.py |
Updates sample type assertions to peprs.Sample and adjusts expectations for inferred numeric types. |
tests/test_project_history.py |
Updates history update flows to use peprs dict shape (samples, config). |
tests/data/private_test/remove/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
tests/data/private_test/derive/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
tests/data/private_test/amendments1/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
tests/data/namespace3/remove/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
tests/data/namespace3/piface/project_config.yaml |
Adjusts pipeline_interfaces fixture value format. |
tests/data/namespace2/derive/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
tests/data/namespace1/amendments1/project_config.yaml |
Removes a derive source entry to fit updated parsing/fixtures. |
Comments suppressed due to low confidence (1)
pepdbagent/modules/project.py:103
- The
get()docstring still describes the old peppy raw-project keys (_config,_sample_dict,_subsample_dict). After migrating to peprs, the raw dict returned here usesconfig/samples/subsamples(viapeprs.const). Please update the docstring so it matches the actual return shape.
:return: peprs.Project object with found project or dict with unprocessed
PEP elements: {
name: str
description: str
_config: dict
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if isinstance(project, peprs.Project): | ||
| proj_dict = project.to_dict(raw=True, by_sample=True) | ||
| elif isinstance(project, dict): |
There was a problem hiding this comment.
Using peprs.Project.to_dict(raw=True, by_sample=True) will preserve peprs/polars-inferred column dtypes (e.g., numeric columns become ints). This appears to be a behavioral change vs the prior peppy-based behavior (tests now expect time == 0 instead of '0') and can be a breaking API change for clients expecting strings. Consider normalizing sample values to strings before persisting, or explicitly documenting this type change as part of the migration.
| sub_project_dict = {"config": config, "samples": samples} | ||
| if raw: | ||
| return sub_project_dict | ||
| else: | ||
| return peppy.Project.from_dict(sub_project_dict) | ||
| return peprs.Project.from_dict(sub_project_dict) |
There was a problem hiding this comment.
The view get() raw dict omits subsamples, so the returned structure differs from project.get(..., raw=True) (which includes subsamples, even if empty). For API consistency, consider returning subsamples: [] as well (and ideally using peprs.const keys).
| if raw: | ||
| return {"_config": config, "_sample_dict": samples, "_subsample_dict": None} | ||
| return {"config": config, "samples": samples} | ||
| else: | ||
| return peppy.Project.from_dict( | ||
| {"_config": config, "_sample_dict": samples, "_subsample_dict": None} | ||
| return peprs.Project.from_dict( | ||
| {"config": config, "samples": samples} |
There was a problem hiding this comment.
get_snap_view() returns/constructs a project dict without subsamples, which makes its output shape differ from other raw project-returning APIs. Consider including an explicit empty subsamples key (and/or using peprs.const keys) in both branches for consistency.
| sample_mapping.sample_name = sample_mapping.sample[ | ||
| project_mapping.config.get(SAMPLE_TABLE_INDEX_KEY, "sample_name") | ||
| project_mapping.config.get("sample_table_index", "sample_name") | ||
| ] | ||
| except KeyError: | ||
| raise KeyError( |
There was a problem hiding this comment.
sample_table_index is now referenced via the hard-coded string key here. To reduce duplication/typos and keep it consistent with the project module, consider using a shared constant (e.g., defined in pepdbagent.const) for this config key and reusing it in both the lookup and the error message.
| warnings.warn(f"Skipping {namespace}/{name}: {e}") | ||
| continue |
There was a problem hiding this comment.
Catching a broad Exception and skipping the project during test DB seeding can silently reduce the dataset and make test results depend on which projects failed to parse. For tests, it’s usually better to fail fast (or catch only expected parse/validation exceptions and re-raise with context) so failures are actionable.
| warnings.warn(f"Skipping {namespace}/{name}: {e}") | |
| continue | |
| raise RuntimeError( | |
| f"Failed to load test project {namespace}/{name} from {path}" | |
| ) from e |
| sqlalchemy>=2.0.0 | ||
| logmuse>=0.2.7 | ||
| peppy>=0.40.6 | ||
| peprs>=0.1.2 |
There was a problem hiding this comment.
The PR description TODO mentions bumping pepdbagent’s version and updating the changelog, but this change set updates a core dependency (peprs) without updating pepdbagent/_version.py or docs/changelog.md. Please include the version/changelog updates as part of this migration so downstream users can track the breaking change.
| SAMPLE_NAME_ATTR = "sample_name" | ||
| SAMPLE_TABLE_INDEX_KEY = "sample_table_index" | ||
| from sqlalchemy import Select, and_, delete, select |
There was a problem hiding this comment.
SAMPLE_NAME_ATTR / SAMPLE_TABLE_INDEX_KEY are defined before the import block finishes (imports resume right after). This breaks standard import ordering. Move these constants below all imports (or into pepdbagent.const) so the file keeps a clean imports-then-code structure.
Changes:
TODO:
__version__.pyfile