fix(test): check average coverage against threshold#27979
fix(test): check average coverage against threshold#27979
Conversation
When coverage thresholds are configured via bunfig.toml coverageThreshold, the test runner now checks the average coverage across all files against the threshold, not just individual files. This matches the documented behavior where the "All files" row should cause a non-zero exit when below threshold. Excludes stmts from the average check to be consistent with the per-file check in CodeCoverage.zig which intentionally omits stmts. Closes #7367
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds a post-aggregation check to code coverage reporting to mark coverage as failing when per-metric averages fall below configured thresholds. Includes regression tests validating exit code behavior when coverage is below or meets specified thresholds. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
LGTM — small, correct fix with good regression test coverage.
Extended reasoning...
Overview
This PR adds a 4-line check in test_command.zig (lines 1214-1216) to compare average function/line coverage against the configured threshold, and adds a new regression test (test/regression/issue/7367.test.ts) with both failing and passing coverage threshold scenarios. The scope is minimal and focused.
Security risks
None. The change is a simple numeric comparison that sets a boolean flag, with no user input handling, no network access, and no file system operations beyond what already exists.
Level of scrutiny
Low scrutiny is appropriate. This is a 4-line logic addition to test infrastructure (not production runtime code), following an existing pattern. The regression test exercises both the below-threshold (exit 1) and meets-threshold (exit 0) cases.
Other factors
The only bug reported is a nit observing that the average check is mathematically redundant with per-file checks (since if all per-file values >= threshold, the mean must also be >= threshold). This is correct but the code is harmless defense-in-depth and aligns with user expectations. The test file is well-structured using tempDir with inline config and source files. The stmts exclusion is consistent with the per-file check in CodeCoverage.zig.
| if (avg.functions < base_fraction.functions or avg.lines < base_fraction.lines) { | ||
| failing = true; | ||
| } |
There was a problem hiding this comment.
🟡 The average coverage threshold check (line 1214) is mathematically redundant with the per-file checks (line 1187). By the arithmetic mean property, if all per-file values are >= threshold then the average must also be >= threshold; conversely if the average is below threshold, at least one per-file value must be below it, so failing was already set to true. The code is harmless as defense-in-depth but is a no-op, and the regression test passes with or without this change since it triggers the per-file check.
Extended reasoning...
The redundancy
The new code at lines 1214-1216 checks whether the average function/line coverage across all files falls below the configured threshold, and if so sets failing = true. However, this check can never independently change failing from false to true because of a basic property of the arithmetic mean.
Mathematical proof
The per-file check works as follows: in CodeCoverage.zig, writeFormat() compares each file's function and line coverage fractions against base_fraction (the threshold). If any file's coverage is below threshold, fraction.failing is set to true, and then at test_command.zig:1187-1188, failing = true is set.
The average is computed by accumulating per-file fractions and dividing by the count. By the arithmetic mean inequality: if every value x_i >= T (threshold), then (sum of x_i) / n >= T. The contrapositive is: if the mean < T, then at least one x_i < T, which means the per-file check already set failing = true.
Step-by-step example
Consider the regression test: one file lib.ts with 10% function coverage (1/10 functions used) against a 90% threshold. The per-file check at line 1187 sees 0.1 < 0.9 and sets failing = true. The average check at line 1214 also sees 0.1 < 0.9 and sets failing = true again — but it was already true. Now consider the passing case: lib.ts with 100% coverage against a 50% threshold. Per-file: 1.0 >= 0.5, so failing stays false. Average: 1.0 >= 0.5, so the average check also does not trigger. There is no scenario where the average check triggers but per-file checks do not.
Impact
The code is harmless — it cannot cause incorrect behavior. It functions as defense-in-depth and makes the intent explicit (that average coverage should be checked). However, it is technically dead code: it can never change program behavior. The regression test validates the end-to-end threshold behavior correctly, but does not (and cannot) exercise a code path where the average check independently causes failure.
Why this is a nit
As noted by reviewers, redundant safety checks are common practice and this one aligns with user expectations about threshold checking. It also provides a safety net if the per-file checking logic ever changes in the future. The code is correct and not harmful — it is just unnecessary from a strict logical standpoint. The PR still fixes the user-facing issue (#7367) via the per-file checks that were already in place.
Summary
coverageThresholdintest_command.zigstmtsfrom the average check, consistent with the per-file check inCodeCoverage.zig:135Root Cause
The test runner computed average coverage across all files but never checked it against the configured threshold. Only per-file checks could trigger a non-zero exit. The "All files" summary row could show below-threshold coverage while the runner exited 0.
Fix
After computing the average, check
avg.functions < base_fraction.functions or avg.lines < base_fraction.linesand setfailing = trueif below threshold. Thestmtsfield is intentionally excluded to match the per-file behavior.Based on community PR #27933 by @ssing2, with fixes:
stmtsfrom the average check for consistency with per-file checkTest plan
bun bd test test/regression/issue/7367.test.tspassesbun run zig:checkcompiles successfullyCloses #7367