fix(application): sort query results by submitted_at before limit#495
fix(application): sort query results by submitted_at before limit#495olivermeyer merged 2 commits intomainfrom
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes run listing behavior in ApplicationService.application_runs() when using the query parameter by ensuring results are correctly limited per search and returned in newest-first order.
Changes:
- Fix tag-search limit budgeting so tag results aren’t starved when note results reach the overall
limit. - Sort the merged (note ∪ tag) result set by
submitted_atdescending before applying the final limit. - Expand unit test coverage to validate independent budgeting, deduplication, and ordering-by-recency.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/aignostics/application/_service.py |
Adjusts tag-loop break condition and sorts merged query results by submitted_at before slicing to limit. |
tests/aignostics/application/service_test.py |
Adds/updates unit tests to assert correct limiting semantics, dedup behavior, and newest-first ordering. |
Codecov Report✅ All modified and coverable lines are covered by tests.
|
|
|
Tests on |
| if run.run_id not in note_runs_dict: | ||
| tag_runs_dict[run.run_id] = run | ||
| if limit is not None and len(note_runs_dict) + len(tag_runs_dict) >= limit: | ||
| if limit is not None and len(tag_runs_dict) >= limit: |
There was a problem hiding this comment.
✅ Excellent fix! The independent limit for tag search is now correctly implemented.
What changed: Changed from len(note_runs_dict) + len(tag_runs_dict) >= limit to just len(tag_runs_dict) >= limit.
Why this matters: Previously, if the note search filled the entire limit quota, the tag search had zero budget. Now each search gets its own independent N-slot budget, ensuring both searches can contribute results up to the limit.
Example: With limit=3:
- Old behavior: note search finds 3 runs → tag search has 0 budget → only note results returned
- New behavior: note search finds 3 runs, tag search finds 3 unique runs → 6 total, then sorted and sliced to top 3 newest
The deduplication logic (line 680 check) ensures duplicates don't consume the tag search quota. 👍
|
|
||
| # Apply limit after union | ||
| if limit is not None and len(runs) > limit: | ||
| runs.sort(key=lambda r: r.submitted_at, reverse=True) |
There was a problem hiding this comment.
✅ Critical sorting fix! This addresses the "unsorted merge" bug described in the PR.
What changed: Added .sort(key=lambda r: r.submitted_at, reverse=True) before applying the limit.
Why this matters: Previously, the union concatenated runs in arbitrary order (note results first, then tag results), and [:limit] didn't guarantee the newest runs. Now the merged list is sorted by submission time descending before slicing.
Example:
- Old:
[note-day1, note-day2, tag-day10][:2]→[note-day1, note-day2](oldest wins! ❌) - New: Sort →
[tag-day10, note-day2, note-day1][:2]→[tag-day10, note-day2](newest wins! ✅)
The comprehensive test coverage in test_application_runs_query_sorts_results_by_submitted_at and test_application_runs_query_respects_limit validates this behavior. Excellent work!
| "list", | ||
| "--query", | ||
| "note_of_this_complex_test", | ||
| "--limit", |
There was a problem hiding this comment.
✅ Good performance optimization! Adding --limit to CLI test list operations prevents executing heavy queries repeatedly.
Why this matters: The test was listing runs without limits dozens of times per operation, which could cause:
- Slow test execution (especially in E2E against real API)
- Unnecessary load on the staging API
- Timeouts in CI if many runs exist
Using LIST_APPLICATION_RUNS_MAX_PAGE_SIZE as the limit is appropriate since:
- It's the max page size constant from the platform module
- Provides predictable test behavior
- Reduces API load without affecting test coverage
Applied consistently across all 10+ list operations in this test. 👍
| def test_application_runs_query_searches_note_and_tags(mock_get_client: MagicMock) -> None: | ||
| """Test that query parameter searches both note and tags with union semantics.""" | ||
| # Create mock runs | ||
| base_time = datetime(2024, 1, 1, tzinfo=UTC) |
There was a problem hiding this comment.
✅ Excellent test enhancement! All existing tests have been updated with submitted_at timestamps.
Why this matters: The fix adds sorting by submitted_at, so tests MUST have realistic timestamps to:
- Verify sort order is correct
- Prevent flaky tests from undefined ordering
- Document expected behavior clearly
Best practice observed:
base_time = datetime(2024, 1, 1, tzinfo=UTC)establishes a reference- Each run gets
base_time + timedelta(days=N)for deterministic ordering - Newer tests use explicit day offsets to test sort logic
Note: All 5 new/updated tests properly mock submitted_at. Great attention to detail! 🎯
| # After merge+sort+slice: tag-4(9), tag-3(8), tag-2(7) are the 3 newest. | ||
| assert len(results) == 3 | ||
| result_ids = {r.run_id for r in results} | ||
| assert result_ids == {"run-tag-2", "run-tag-3", "run-tag-4"} |
There was a problem hiding this comment.
✅ Comprehensive new test coverage! Four new unit tests added:
test_application_runs_query_respects_limit(lines 318-356): Verifies limit returns newest N runs across both searchestest_application_runs_query_tag_search_has_independent_limit(lines 359-402): Confirms tag search has its own budget (the key fix!)test_application_runs_query_sorts_results_by_submitted_at(lines 405-440): Validates newest-first ordering without limittest_application_runs_query_deduplicates_with_independent_budget(lines 443-485): Ensures duplicates don't consume tag quota
Test quality observations:
- ✅ All properly marked with
@pytest.mark.unit - ✅ Detailed docstrings explaining the scenario
- ✅ Realistic mock data with timestamps
- ✅ Uses
MagicMock(spec=RunData)for type safety - ✅
mock_runs.list_data.side_effectsimulates API pagination correctly - ✅ Assertions validate both result count and content
Coverage impact: These tests add 163 lines covering critical edge cases. Total test count increased from 20 → 24 tests (+20%). 📈
The test names and docstrings clearly communicate intent. Excellent TDD approach!
🎯 PR Review Summary: APPROVED ✅OverviewThis PR fixes two critical bugs in the ✅ CRITICAL CHECKS - ALL PASSED1. Test Markers ✅
2. Code Quality ✅
3. Test Coverage ✅
4. Conventional Commits ✅
🐛 Bugs FixedBug 1: Shared Budget Issue ✅Problem: Tag search shared limit budget with note search. If note search filled the limit, tag search had zero budget. Root Cause: Line 682 checked Fix: Changed to Impact: Both note and tag searches can contribute up to Bug 2: Unsorted Merge Issue ✅Problem: Union of note+tag results concatenated in arbitrary order before slicing, so Root Cause: No sorting before applying limit at line 689. Fix: Added Impact: Merged results now sorted newest-first before slicing, ensuring users get the most recent runs. 📊 Test Coverage SummaryNew Tests Added (4 unit tests, 163 lines):
Tests Updated (3 existing tests):
E2E Test Performance Optimization:
🏗️ Architecture Compliance ✅Modulith Design ✅
Service Pattern ✅
🔒 Medical Device & Security ✅
📈 Code Quality Metrics
💡 HighlightsWhat Makes This PR Excellent:
🚀 RecommendationAPPROVE AND MERGE ✅ This PR:
🙏 KudosExcellent work @olivermeyer! The attention to detail in:
This is a textbook example of a high-quality bug fix PR. The test coverage is outstanding and will prevent regressions. 🌟 📚 References
Review conducted by: Claude Code (Automated Review) |



Fixing two bugs when listing runs:
len(note_runs) + len(tag_runs) >= limit, so a full note search left zero budget for tags. Fixed by giving the tag search its own independent limit based onlen(tag_runs_dict)only. Duplicates (runs matching both searches) are still deduplicated and don't consume the tag quota.[:limit]didn't necessarily return the newest runs. Fixed by sorting bysubmitted_atdescending before applying the limit.Setting a limit on list operations in
test_cli_run_submit_and_describe_and_cancel_and_download_and_deleteto avoid executing heavy queries dozens of times per operation.