Make LineCoverageTestFitness non-binary and add tests #143
Aditya-9215 wants to merge 1 commit into se2p:main
Hi, I followed your suggestion and split this into a separate PR. This PR focuses only on making LineCoverageTestFitness non-binary, without modifying DynaMOSA yet. Please let me know if this aligns with your expectations before I proceed further.
Pull request overview
This PR updates LineCoverageTestFitness to return a non-binary (distance-based) fitness value for uncovered lines, aiming to provide a smoother optimisation signal (e.g., for DynaMOSA), and adds unit tests for the new behaviour.
Changes:
- Replace binary line-coverage fitness (0/1) with a non-binary approximation based on execution trace data.
- Add tests covering “covered → 0.0” and “not covered → > 0.0”.
- Add `goal` properties for line-coverage and checked-coverage test fitness functions.
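
The difference between a binary and a distance-based signal can be sketched as below. This is an illustrative toy rather than Pynguin's implementation: `binary_fitness`, `graded_fitness`, and the `distance` argument are hypothetical names, and the `1.0 + distance` shape simply mirrors the form used in this PR.

```python
def binary_fitness(covered: bool) -> float:
    """Binary line-coverage fitness: 0.0 if the line was hit, 1.0 otherwise."""
    return 0.0 if covered else 1.0


def graded_fitness(covered: bool, distance: float) -> float:
    """Non-binary fitness: 0.0 if covered, otherwise 1.0 plus some distance.

    `distance` is an assumed, non-negative measure of how far execution
    was from the target line; how to compute it is the open question.
    """
    if covered:
        return 0.0
    return 1.0 + distance


# Two tests that both miss the line look identical to the binary fitness ...
same = binary_fitness(False) == binary_fitness(False)
# ... but can be ranked by the graded one if their distances differ.
closer_is_better = graded_fitness(False, 1.0) < graded_fitness(False, 3.0)
```

With the binary variant, a search algorithm sees a flat plateau until a test happens to hit the line; the graded variant only helps if `distance` actually shrinks as execution approaches the target.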
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/pynguin/ga/coveragegoals.py | Implements non-binary line-coverage fitness and adds goal properties; also modifies SPDX header. |
| tests/ga/test_linecoverage_fitness.py | Introduces unit tests for the new line fitness behaviour using dummy execution results. |
| tests/fixtures/simple_line_target.py | Adds a new fixture module intended for line-coverage testing. |

    # SPDX-License-Identifier: MIT
    # SPDX-FileCopyrightText: 2026 Aditya Sinha
    # SPDX-License-Identifier: MIT


    execution = DummyExecutionResult(covered_lines=[], executed_code_objects=[1, 2])
    chromosome = DummyChromosome(execution)

    # Monkey patch run method
    fitness._run_test_case_chromosome = lambda _: execution

    value = fitness.compute_fitness(chromosome)

    assert value > 0.0


    # SPDX-FileCopyrightText: 2026 Aditya Sinha
    # SPDX-License-Identifier: MIT


            # Otherwise → return non-binary distance
            return 1.0 + self._approximate_distance(result)

        def _approximate_distance(self, result: ExecutionResult) -> float:
            """Estimate how far execution was from reaching this line.

            This is a lightweight approximation based on execution trace information.
            It provides a non-binary signal without requiring full control-flow analysis.
            """
            try:
                trace = result.execution_trace

                # If nothing executed → maximum penalty
                if not trace.executed_code_objects:
                    return 1.0

                # Use number of executed code objects as rough distance signal
                return float(len(trace.executed_code_objects))

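
One property of this heuristic is worth making explicit: the value depends only on how many code objects ran, not on which ones. The sketch below replays the same logic on a hypothetical `FakeTrace` stand-in (not a Pynguin class); the code object ids are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTrace:
    """Hypothetical stand-in for an execution trace; not Pynguin's class."""

    executed_code_objects: set[int] = field(default_factory=set)


def approximate_distance(trace: FakeTrace) -> float:
    """Same logic as the heuristic shown in the diff snippet."""
    if not trace.executed_code_objects:
        return 1.0
    return float(len(trace.executed_code_objects))


# Assume, for illustration, that the target line lives in code object 7.
near = FakeTrace({1, 7})        # entered the code object holding the target
far = FakeTrace({1, 2})         # never entered it
busy = FakeTrace({1, 2, 3, 7})  # entered it, but also ran more code
empty = FakeTrace()             # executed nothing at all

d_near = approximate_distance(near)
d_far = approximate_distance(far)
d_busy = approximate_distance(busy)
d_empty = approximate_distance(empty)
```

Here `near` and `far` receive the same value, `busy` is penalised for executing more code even though it reached the relevant code object, and the "maximum penalty" for an empty trace is in fact the smallest value the function returns.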

    def foo(x: int) -> int:
        if x > 0:
            return x + 1
        elif x == 0:
            return 0
        else:
            return x - 1


    def bar(y: int) -> int:
        if y % 2 == 0:
            return y * 2
        return y + 3

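
To see which lines of such a fixture a given input actually executes, a small tracer is enough. The sketch below uses the standard `sys.settrace` hook purely as a stand-in; Pynguin relies on its own instrumentation, and `executed_lines` is a hypothetical helper name.

```python
import sys


def foo(x: int) -> int:
    if x > 0:
        return x + 1
    elif x == 0:
        return 0
    else:
        return x - 1


def executed_lines(func, *args):
    """Return line offsets (relative to the `def` line) that `func` executes."""
    code = func.__code__
    lines = set()

    def tracer(frame, event, arg):
        # Only record "line" events that belong to the function under test.
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous trace hook
    return lines


# Offset 4 is the `return 0` line of foo above.
hits_for_zero = executed_lines(foo, 0)
hits_for_five = executed_lines(foo, 5)
```

A covered/not-covered decision for `return 0` then reduces to checking whether offset 4 is in the recorded set.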
Hi, Pynguin can and does execute its generated tests against the SUT, and Pynguin's instrumentation already allows tracking what was covered and what was not (otherwise BranchCoverage would not work in the first place). Thus, there is certainly a more accurate way than approximating. Best
Hi @LuKrO2011, thanks, that makes sense. I agree that the current approximation based on executed code objects is too rough and not really goal-directed. Since Pynguin already tracks detailed execution information, it should be possible to use that instead of relying on a heuristic like this. I’ll take a step back and look into how the execution trace and existing fitness functions (like branch coverage) compute distance, and then try to adapt something similar for line coverage so that it reflects how close execution gets to a specific line. Does this direction sound reasonable, or is there something specific I should focus on instead? Best,
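
One goal-directed option along these lines is the classic "approach level plus normalised branch distance" from search-based testing, applied to the branch that controls the target line. The sketch below is hand-instrumented for a single target, the `return 0` line of a function shaped like `if x > 0: ... elif x == 0: return 0 else: ...`. It is an assumption-laden illustration: `normalise` and `distance_to_return_zero` are hypothetical names, and a real implementation would derive the predicates from Pynguin's instrumentation and control-dependence information rather than hard-coding them.

```python
def normalise(distance: float) -> float:
    """Map a non-negative distance into [0, 1), a common normalisation."""
    return distance / (distance + 1.0)


def distance_to_return_zero(x: int) -> float:
    """Approach level + normalised branch distance for the `return 0` line.

    Reaching the target requires `x > 0` to be false and then `x == 0`
    to be true, so there are two control-dependent predicates.
    """
    if x > 0:
        # Diverged at the outer predicate: approach level 1.
        # Branch distance for making `x > 0` false is x - 0.
        return 1.0 + normalise(float(x))
    if x != 0:
        # Diverged at the inner predicate: approach level 0.
        # Branch distance for making `x == 0` true is |x - 0|.
        return 0.0 + normalise(float(abs(x)))
    return 0.0  # target line executed


values = {x: distance_to_return_zero(x) for x in (5, 1, -3, -1, 0)}
```

Unlike a count of executed code objects, this value decreases as the input moves toward the condition that guards the target line, which is the property a search algorithm can exploit.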
This PR replaces the binary fitness computation of LineCoverageTestFitness with a distance-based approximation.
Previously:
Now:
This provides better guidance for search-based algorithms and is a prerequisite for supporting line coverage in DynaMOSA.
Unit tests are added to verify the new behavior.