Fix/redteam errored count tracking#46086

Draft
slister1001 wants to merge 1 commit into Azure:main from slister1001:fix/redteam-errored-count-tracking
Conversation

@slister1001
Member

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce breaking changes.
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Apr 2, 2026
@slister1001 slister1001 force-pushed the fix/redteam-errored-count-tracking branch from 16055c4 to bc36e7c Compare April 2, 2026 16:34
Previously, objectives that failed during attack execution or risk
categories with zero prepared objectives were silently dropped from the
pipeline. The result_counts.errored field always showed 0 because
_compute_result_count only counted existing output items.

Changes:
- _execution_manager.py: Record 0-objective categories as failed in
  red_team_info instead of silently skipping. Add expected_count to all
  red_team_info entries to track expected vs actual objectives.
- _result_processor.py: Add _extract_expected_total() to compute total
  expected objectives from red_team_info (de-duplicated by risk
  category). Pass expected_total to _compute_result_count() which now
  computes errored as the delta between expected and actual items. Add
  partial_failure to _determine_run_status failure detection.
- test_result_processor_errored.py: 31 new unit tests covering
  _compute_result_count with expected_total, _extract_expected_total
  de-duplication logic, and _determine_run_status failure detection.
- test_foundry.py: 3 new tests for 0-objective recording and
  expected_count propagation in FoundryExecutionManager.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@slister1001 slister1001 force-pushed the fix/redteam-errored-count-tracking branch from bc36e7c to 992ddd2 Compare April 2, 2026 18:03