Skip to content

Migrate POST /setup/tag endpoint (#64)#271

Open
igennova wants to merge 6 commits intoopenml:mainfrom
igennova:issue/64
Open

Migrate POST /setup/tag endpoint (#64)#271
igennova wants to merge 6 commits intoopenml:mainfrom
igennova:issue/64

Conversation

@igennova
Copy link
Contributor

@igennova igennova commented Mar 13, 2026

Description

Fixes: #64

  • Implemented the POST /setup/tag FastAPI endpoint and underlying database logic.
  • Modified the response to return the full array of remaining tags to match the old PHP API behavior.
  • Added comprehensive unit tests and py_api vs php_api migration parity tests.
  • Updated docs/migration.md to document the new tag array response behavior.

Checklist

Always:

  • I have performed a self-review of my own pull request to ensure it contains all relevant information, and the proposed changes are minimal but sufficient to accomplish their task.

Required for code changes:

  • Tests pass locally
  • I have commented my code in hard-to-understand areas, and provided or updated docstrings as needed
  • Changes are already covered under existing tests
  • I have added tests that cover the changes (only required if not already under coverage)

If applicable:

  • I have made corresponding changes to the documentation pages (/docs)

Extra context:

  • This PR and the commits have been created autonomously by a bot/agent.

Screenshot

Screenshot 2026-03-13 at 5 52 41 AM

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4a7eec51-eb4d-43a4-8d40-17a2e3228d29

📥 Commits

Reviewing files that changed from the base of the PR and between 99342f3 and e53b3e6.

📒 Files selected for processing (1)
  • tests/conftest.py

Walkthrough

Adds a POST /setup/tag endpoint, a new async database function tag() to insert into setup_tag, and updates router logic to validate existence, enforce case-insensitive duplicate detection (raising TagAlreadyExistsError), and return tags as a list (casefolded). Adjusts untag comparisons to be case-insensitive. Adds documentation and tests plus test fixtures (temporary_records/temporary_tags) to cover tag/untag, auth, nonexistent setups, and duplicate-tag cases.

Possibly related PRs

  • openml/server-api PR 246: Adds complementary tagging operations (get_tags and untag) and related database helpers, modifying the same setups router and database module.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Migrate POST /setup/tag endpoint (#64)' clearly describes the main change - implementation of the POST /setup/tag endpoint migration, matching the primary objective of the pull request.
Description check ✅ Passed The description is directly related to the changeset, detailing implementation of the POST /setup/tag endpoint, database logic, tests, and documentation updates that match the code changes.
Linked Issues check ✅ Passed The PR implements the POST /setup/tag endpoint as required by linked issue #64, with proper endpoint implementation, database logic, error handling (TagAlreadyExistsError), and tag array response behavior.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the POST /setup/tag endpoint: database layer (setups.py), router endpoint (openml/setups.py), migration tests, unit tests, test fixtures, and documentation updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The tag_setup flow does a read (get_tags) followed by a write (tag) to detect duplicates, which can still allow a concurrent duplicate insert; if this matters, consider enforcing uniqueness at the DB level (e.g. unique (id, LOWER(tag)) index) and mapping constraint violations to TagAlreadyExistsError instead of relying purely on the pre-check.
  • In tag_setup, tags are case-folded before being returned (casefold()), which may change the casing of existing tags; please confirm this matches the legacy API behavior and, if not, consider preserving original stored casing while still doing case-insensitive comparison for duplicate detection.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `tag_setup` flow does a read (`get_tags`) followed by a write (`tag`) to detect duplicates, which can still allow a concurrent duplicate insert; if this matters, consider enforcing uniqueness at the DB level (e.g. unique `(id, LOWER(tag))` index) and mapping constraint violations to `TagAlreadyExistsError` instead of relying purely on the pre-check.
- In `tag_setup`, tags are case-folded before being returned (`casefold()`), which may change the casing of existing tags; please confirm this matches the legacy API behavior and, if not, consider preserving original stored casing while still doing case-insensitive comparison for duplicate detection.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@codecov
Copy link

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 23.33333% with 23 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@f94808c). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/routers/openml/setups.py 20.00% 12 Missing ⚠️
tests/conftest.py 23.07% 10 Missing ⚠️
src/database/setups.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #271   +/-   ##
=======================================
  Coverage        ?   54.07%           
=======================================
  Files           ?       34           
  Lines           ?     1474           
  Branches        ?      122           
=======================================
  Hits            ?      797           
  Misses          ?      675           
  Partials        ?        2           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/routers/openml/setups_test.py (1)

105-117: Add a case-insensitive duplicate test variant.

The endpoint blocks duplicates via casefold(), but this test only validates exact-case duplicates. Please add a mixed-case scenario (e.g., stored Existing_Tag, request existing_tag) to lock in the intended behavior.

Coverage extension example
 `@pytest.mark.mut`
+@pytest.mark.parametrize(
+    ("stored_tag", "requested_tag"),
+    [("existing_tag_123", "existing_tag_123"), ("Existing_Tag_123", "existing_tag_123")],
+    ids=["exact-match", "case-insensitive-match"],
+)
 async def test_setup_tag_already_exists(
-    py_api: httpx.AsyncClient, expdb_test: AsyncConnection
+    py_api: httpx.AsyncClient, expdb_test: AsyncConnection, stored_tag: str, requested_tag: str
 ) -> None:
     await expdb_test.execute(
-        text("INSERT INTO setup_tag (id, tag, uploader) VALUES (1, 'existing_tag_123', 2);")
+        text("INSERT INTO setup_tag (id, tag, uploader) VALUES (1, :tag, 2);"),
+        parameters={"tag": stored_tag},
     )
     response = await py_api.post(
         f"/setup/tag?api_key={ApiKey.SOME_USER}",
-        json={"setup_id": 1, "tag": "existing_tag_123"},
+        json={"setup_id": 1, "tag": requested_tag},
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/setups_test.py` around lines 105 - 117, Add a new async
test alongside test_setup_tag_already_exists that verifies case-insensitive
duplicate detection: insert a setup_tag row with a mixed-case tag (e.g.,
'Existing_Tag') for setup id 1 using expdb_test.execute on the setup_tag table,
then POST to /setup/tag with json {"setup_id": 1, "tag": "existing_tag"} and
api_key=ApiKey.SOME_USER, and assert HTTPStatus.CONFLICT and that
response.json()["detail"] matches the existing-tag message for setup 1; mirror
the original test's assertions and use the same helpers (py_api, expdb_test) and
identifiers so the test ensures casefold-based duplicate blocking in
test_setup_tag_already_exists_variant (or similar name).
tests/routers/openml/migration/setups_migration_test.py (2)

224-228: Use multiset comparison instead of set(...) for tag parity.

set(tags) can hide duplicate-tag regressions. Prefer Counter(...) so parity checks keep multiplicity.

Small assertion hardening
+from collections import Counter
@@
-            assert set(tags) == set(new_tag["tag"])
+            assert Counter(tags) == Counter(new_tag["tag"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/migration/setups_migration_test.py` around lines 224 -
228, Replace the unordered set comparison with a multiset comparison so
duplicate tags are checked: when comparing original_tag (variable original_tag)
to new_tag["tag"] in the test, keep the existing branch for string tags but for
the sequence branch replace assert set(tags) == set(new_tag["tag"]) with a
Counter-based equality (use collections.Counter) to compare multiplicities; add
the necessary import (from collections import Counter) at the top of the test
file.

172-194: Extract temporary_tags to a shared helper in this module.

This helper duplicates logic already present earlier in the file. Consolidating it once will reduce drift across migration tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/migration/setups_migration_test.py` around lines 172 -
194, The temporary_tags async context manager is duplicated; extract it once as
a shared helper function named temporary_tags in this module (keeping the
signature temporary_tags(tags: Iterable[str], setup_id: int, *, persist: bool =
False) -> AsyncGenerator[None]) and replace the other duplicate
definitions/copies with calls to that single helper; ensure the implementation
preserves its INSERT/DELETE SQL, parameters using OWNER_USER.user_id, commit
behavior when persist is True, and that all tests import/use the consolidated
temporary_tags symbol instead of defining their own local versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/setups.py`:
- Around line 34-42: The current check-then-insert for tags can race; wrap the
await database.setups.tag(setup_id, tag, user.user_id, expdb_db) call in a
try/except that catches IntegrityError (import from sqlalchemy.exc) and raises
TagAlreadyExistsError with the same message used for the pre-check (e.g.,
f"Setup {setup_id} already has tag {tag!r}."); keep the existing pre-check for
fast-fail but treat the DB uniqueness violation as the source of truth, and only
append tag.casefold() to all_tags after a successful insert (or re-fetch tags if
you prefer) to avoid stale state.

---

Nitpick comments:
In `@tests/routers/openml/migration/setups_migration_test.py`:
- Around line 224-228: Replace the unordered set comparison with a multiset
comparison so duplicate tags are checked: when comparing original_tag (variable
original_tag) to new_tag["tag"] in the test, keep the existing branch for string
tags but for the sequence branch replace assert set(tags) == set(new_tag["tag"])
with a Counter-based equality (use collections.Counter) to compare
multiplicities; add the necessary import (from collections import Counter) at
the top of the test file.
- Around line 172-194: The temporary_tags async context manager is duplicated;
extract it once as a shared helper function named temporary_tags in this module
(keeping the signature temporary_tags(tags: Iterable[str], setup_id: int, *,
persist: bool = False) -> AsyncGenerator[None]) and replace the other duplicate
definitions/copies with calls to that single helper; ensure the implementation
preserves its INSERT/DELETE SQL, parameters using OWNER_USER.user_id, commit
behavior when persist is True, and that all tests import/use the consolidated
temporary_tags symbol instead of defining their own local versions.

In `@tests/routers/openml/setups_test.py`:
- Around line 105-117: Add a new async test alongside
test_setup_tag_already_exists that verifies case-insensitive duplicate
detection: insert a setup_tag row with a mixed-case tag (e.g., 'Existing_Tag')
for setup id 1 using expdb_test.execute on the setup_tag table, then POST to
/setup/tag with json {"setup_id": 1, "tag": "existing_tag"} and
api_key=ApiKey.SOME_USER, and assert HTTPStatus.CONFLICT and that
response.json()["detail"] matches the existing-tag message for setup 1; mirror
the original test's assertions and use the same helpers (py_api, expdb_test) and
identifiers so the test ensures casefold-based duplicate blocking in
test_setup_tag_already_exists_variant (or similar name).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a2a06757-f7cc-482c-a567-80f29d6e3d3f

📥 Commits

Reviewing files that changed from the base of the PR and between f94808c and dbf4844.

📒 Files selected for processing (5)
  • docs/migration.md
  • src/database/setups.py
  • src/routers/openml/setups.py
  • tests/routers/openml/migration/setups_migration_test.py
  • tests/routers/openml/setups_test.py

@igennova
Copy link
Contributor Author

@PGijsbers The PR is ready for review. Thank you

Copy link
Contributor

@PGijsbers PGijsbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor changes requests with regards to code structure, but the functional bits look good to me! Also, this PR accidentally got to the top of my pile, but in principle I am trying to work through the PRs in chronological order, so it may take a little bit of time for me to catch back up.

Comment on lines +172 to +193
@contextlib.asynccontextmanager
async def temporary_tags(
tags: Iterable[str], setup_id: int, *, persist: bool = False
) -> AsyncGenerator[None]:
for tag in tags:
await expdb_test.execute(
text(
"INSERT INTO setup_tag(`id`,`tag`,`uploader`) "
"VALUES (:setup_id, :tag, :user_id);"
),
parameters={"setup_id": setup_id, "tag": tag, "user_id": OWNER_USER.user_id},
)
if persist:
await expdb_test.commit()
yield
for tag in tags:
await expdb_test.execute(
text("DELETE FROM setup_tag WHERE `id`=:setup_id AND `tag`=:tag"),
parameters={"setup_id": setup_id, "tag": tag},
)
if persist:
await expdb_test.commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should extract this as a module level fixture to avoid repeating it in multiple tests. Then you can also use it for the other test where you currently have a try/finally construction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to have something even above module level that takes the queries as parameters, and then use that to define module level reusable context managers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, I've extracted it to a generic context manager in conftest.py and refactored the tests to use a single fixture here, including replacing that try/finally

@PGijsbers PGijsbers changed the title Migrate POST /setup/untag endpoint (#64) Migrate POST /setup/tag endpoint (#64) Mar 13, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/routers/openml/migration/setups_migration_test.py (1)

22-37: Materialize tags once before building both query lists.

tags is typed as Iterable[str]; a single-pass iterator would be exhausted by the first comprehension, leaving the second empty. All current call sites pass lists, but the type signature allows any iterable.

💡 Suggested fix
     async def _temporary_tags(
         tags: Iterable[str], setup_id: int, *, persist: bool = False
     ) -> AsyncGenerator[None]:
+        tags = tuple(tags)
         insert_queries = [
             (
                 "INSERT INTO setup_tag(`id`,`tag`,`uploader`) VALUES (:setup_id, :tag, :user_id);",
                 {"setup_id": setup_id, "tag": tag, "user_id": OWNER_USER.user_id},
             )
             for tag in tags
         ]
         delete_queries = [
             (
                 "DELETE FROM setup_tag WHERE `id`=:setup_id AND `tag`=:tag",
                 {"setup_id": setup_id, "tag": tag},
             )
             for tag in tags
         ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/migration/setups_migration_test.py` around lines 22 -
37, Materialize the iterable `tags` into a concrete sequence before building the
two comprehensions so a single-pass iterator isn't consumed by `insert_queries`
and left empty for `delete_queries`; e.g. assign `tags_list = list(tags)` and
use `tags_list` when constructing both `insert_queries` and `delete_queries`
(references: `tags`, `insert_queries`, `delete_queries`, `setup_id`,
`OWNER_USER`).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/conftest.py`:
- Around line 31-50: The async context manager temporary_records should protect
the cleanup steps with a try/finally: execute insert_queries and commit (if
persist) before yielding, then use try: yield finally: run the delete_queries
and commit (if persist) so cleanup always runs even if the body raises; update
the temporary_records function to wrap the yield in a try/finally and move the
delete_queries/commit into the finally block, referencing the existing
parameters connection, insert_queries, delete_queries, and persist.

---

Nitpick comments:
In `@tests/routers/openml/migration/setups_migration_test.py`:
- Around line 22-37: Materialize the iterable `tags` into a concrete sequence
before building the two comprehensions so a single-pass iterator isn't consumed
by `insert_queries` and left empty for `delete_queries`; e.g. assign `tags_list
= list(tags)` and use `tags_list` when constructing both `insert_queries` and
`delete_queries` (references: `tags`, `insert_queries`, `delete_queries`,
`setup_id`, `OWNER_USER`).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6e17f23a-3ba3-4802-ac15-6639ca3e9e2b

📥 Commits

Reviewing files that changed from the base of the PR and between dbf4844 and 99342f3.

📒 Files selected for processing (3)
  • src/routers/openml/setups.py
  • tests/conftest.py
  • tests/routers/openml/migration/setups_migration_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/routers/openml/setups.py

@igennova igennova requested a review from PGijsbers March 15, 2026 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

POST /setup/tag

2 participants