chore: update test suite name when reporting failures#494
Conversation
There was a problem hiding this comment.
Pull request overview
This PR clarifies failure reporting in the GitHub Actions test workflow by making E2E suite names explicit in the final “Assert no test failures” gate step, reducing ambiguity when multiple E2E variants exist.
Changes:
- Rename the reported failure labels for E2E suites from generic names (
e2e,long-running,very-long-running) to explicit E2E-scoped names (e2e-regular,e2e-long-running,e2e-very-long-running).
|
Claude finished @olivermeyer's task in 1m 18s —— View job Code Review Complete ✅
SummaryThis PR improves test failure reporting clarity by renaming E2E test step IDs from generic names ( 🔴 BLOCKING ISSUE: GitHub Actions Expression Syntax ErrorLocation: Problem: The Sentry bot correctly identified that underscores in step IDs work with dot notation, but the PR description's goal of using hyphens would break the workflow. Looking at the actual code changes, you're using underscores ( Current Code (After PR): id: e2e_regular # ✅ Underscores work with dot notation
id: e2e_long_running # ✅ Underscores work
id: e2e_very_long_running # ✅ Underscores work
# Referenced as:
${{ steps.e2e_regular.outcome }} # ✅ This works correctlySentry Bot Warning Was About: If you had used hyphens like the scheduled workflow does ( id: e2e-regular # ❌ Would require bracket notation: steps['e2e-regular']Verdict: Your implementation is CORRECT ✅You chose underscores which work perfectly with dot notation. The Sentry bot was warning about a potential issue if hyphens were used, but your actual code uses underscores throughout.
|
| Test Suite | _test.yml (This PR) |
_scheduled-test-daily.yml |
Consistent? |
|---|---|---|---|
| Regular E2E | e2e_regular |
e2e-regular |
❌ No |
| Long-running E2E | e2e_long_running |
e2e-long-running |
❌ No |
| Very long-running E2E | e2e_very_long_running |
e2e-very-long-running |
❌ No |
Impact: This inconsistency could confuse developers reading logs or comparing workflow outputs between PR CI and scheduled tests.
Recommendation: Choose one convention and apply consistently:
Option A (Recommended): Keep underscores everywhere (easiest, your current choice)
- Update
_scheduled-test-daily.ymlto use underscores:e2e_regular,e2e_long_running,e2e_very_long_running - Update bracket notation references in that file to dot notation
Option B: Use hyphens everywhere (matches existing scheduled workflow)
- Revert this PR to use hyphens:
e2e-regular,e2e-long-running,e2e-very-long-running - Update all references to use bracket notation:
steps['e2e-regular'].outcome
Personally, I recommend Option A (underscores) because:
- Simpler expression syntax (dot notation)
- Matches Python/programming conventions
- Less risk of typos with bracket notation
✅ What This PR Does Well
- Improves clarity: The new names (
e2e_regularvs generice2e) clearly indicate this is the "regular" E2E suite, not the long-running variant - Maintains alignment: Step IDs and the failure tracking logic stay synchronized
- Preserves functionality: The test failure detection logic works correctly
- Clean diff: Only touches what needs to change
📝 Minor Observations
- Whitespace alignment: Lines 282-286 now have better column alignment for the step ID references (good!)
- No functional changes: This is purely a naming/clarity improvement (appropriate for a
chore:commit) - Label addition: The PR correctly includes
skip:test_long_runninglabel to avoid unnecessarily running long tests
🔍 Testing Verification
The failing E2E test (test_cli_run_submit_and_describe_and_cancel_and_download_and_delete) is unrelated to this PR:
- It timed out after 1207s (20 minutes) waiting for an API response
- This is a flaky E2E test issue, not a workflow syntax issue
- The test has
@pytest.mark.flaky(retries=3, delay=5)indicating known flakiness - Root cause: SSL socket read timeout during
runs.list()API call
The workflow changes in this PR are syntactically correct and will not affect test execution.
📚 Documentation Check
No documentation updates needed because:
- This is an internal workflow implementation detail
.github/CLAUDE.mddescribes the test strategy at a conceptual level- Step IDs are not documented in user-facing docs
Final Recommendation
Must Do Before Merge:
Nothing! The code is correct as-is. ✅
Should Consider (Non-Blocking):
- Standardize naming across workflows: Update
_scheduled-test-daily.ymlto match the underscore convention in a follow-up PR - Document the convention: Add a comment in
.github/CLAUDE.mdabout the step ID naming convention (underscores preferred for dot notation compatibility)
Example Follow-up PR Changes:
In _scheduled-test-daily.yml:
- id: e2e-regular
+ id: e2e_regular
- id: e2e-long-running
+ id: e2e_long_running
- id: e2e-very-long-running
+ id: e2e_very_long_running
# And update all references:
- if [ "${{ steps.e2e-regular.outcome }}" == "failure" ]; then
+ if [ "${{ steps.e2e_regular.outcome }}" == "failure" ]; thenConclusion
Approve and merge ✅ - The changes are correct, improve clarity, and don't introduce any bugs. The Sentry bot's warning was a false alarm (it was worried about hyphens, but you correctly used underscores). The only suggestion is a non-blocking follow-up to standardize naming across all workflows.
Great attention to detail on making failure messages clearer! 🎉
d00095f to
15e5739
Compare
There was a problem hiding this comment.
Pull request overview
This PR clarifies how failed test suites are reported in the GitHub Actions “Assert no test failures” gate by renaming E2E step IDs and aligning the failure labels to those more specific suite names.
Changes:
- Renamed E2E-related workflow step IDs to be more explicit (
e2e_regular,e2e_long_running,e2e_very_long_running). - Updated the “Assert no test failures” gate to reference the new step IDs and report less ambiguous suite names.
❌ 2 Tests Failed:
View the top 2 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
15e5739 to
ed5e02d
Compare
ed5e02d to
4dec5b7
Compare
There was a problem hiding this comment.
Pull request overview
This PR clarifies how E2E test suites are identified in the GitHub Actions “Assert no test failures” gate by renaming step IDs to be more explicit, reducing ambiguity when failures are reported.
Changes:
- Renamed E2E step IDs to distinguish regular vs. long-running vs. very-long-running suites.
- Updated the failure-aggregation gate to reference the new step IDs and report the updated suite names.
|



Updating the test suite names in the
Assert no test failuresto make them less ambiguous.