HotFix 1.16.3 (released 4/1) --> main merge #46078
Open. w-javed wants to merge 21 commits into main from hotfix/azure-ai-evaluation/1.16.3
21 commits
c12b2eb (YoYoJa) Fix top sample data (#45214)
73084d5 (m7md7sien) [Agentic Evaluators]: Accept input string as is (#45159)
c20b189 (slister1001) Fix XPIA binary_path incompatibility for model targets (#5058420) (#4…
4d9781a (slister1001) Fix content-filter responses showing raw JSON in results (#5058447) (…
a0d4277 (slister1001) Extract token_usage from labels in Foundry path for row-level output …
d71e327 (slister1001) Fix legacy endpoint backwards compatibility for _use_legacy_endpoint …
7272bdb (slister1001) chore: Update CHANGELOG for azure-ai-evaluation 1.16.1 hotfix release
ef80815 (slister1001) docs: Backport CHANGELOG entries for azure-ai-evaluation 1.16.1 hotfi…
49a05b2 (slister1001) Fix adversarial chat target for Tense, Crescendo, and MultiTurn attac…
2470419 (slister1001) [Evaluation] Additional red team e2e tests (#45579)
3d76f79 (slister1001) chore: Clean up CHANGELOG for 1.16.2 hotfix release
2039e3e (slister1001) Extract RAI scorer token metrics into Score metadata and save to memo…
0694d4e (slister1001) chore: Add PR #45865 to CHANGELOG for 1.16.2 hotfix
6511e37 (slister1001) chore: Set release date for 1.16.2 hotfix (2026-03-24)
d96f216 (slister1001) Increment package version after release of azure-ai-evaluation (#46001)
fc6bf76 (slister1001) Fix ASR scoring: use score-based threshold instead of passed field (#…
1d3c3c1 (slister1001) Fix/redteam partial results (#45996)
490d3ed (slister1001) Fix evaluator token metrics not persisted in red teaming results (#46…
79769d4 (slister1001) Clean up CHANGELOG: remove empty sections, set release date 2026-04-01
a1ce738 (slister1001) Fix CHANGELOG spacing for 1.16.3 section
d8ccf28 (azure-sdk) Increment package version after release of azure-ai-evaluation (#46065)
The loop calls `_process_criteria_metrics(...)` and captures `sample`, but `sample` is no longer used to populate `top_sample`. This means rows without `inputs.sample.generated_sample_data` will now always return an empty top-level `sample`, even when criteria results include per-metric sample data. Consider restoring a fallback (e.g., set `top_sample` from the first non-empty `sample` when `top_sample` is still empty) to avoid regressions for callers that rely on the top-level sample payload.
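
The suggested fallback could look roughly like the sketch below. It is an illustration only, not the package's actual code: `pick_top_sample`, `row_sample`, and `criteria_samples` are hypothetical names, and it assumes the per-criteria samples returned by `_process_criteria_metrics(...)` have already been collected into a list of dictionaries.

```python
from typing import Any, Dict, List, Optional


def pick_top_sample(
    row_sample: Optional[Dict[str, Any]],
    criteria_samples: List[Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Choose the top-level sample for one result row (hypothetical helper).

    row_sample: sample taken from the row's own inputs, possibly empty or None.
    criteria_samples: per-criteria samples in evaluation order, possibly empty.
    """
    # Prefer the row-level sample whenever it carries any data.
    if row_sample:
        return dict(row_sample)

    # Fallback: use the first non-empty per-criteria sample.
    for sample in criteria_samples:
        if sample:
            return dict(sample)

    # Nothing available at either level.
    return {}
```

With this shape, rows that carry their own sample data behave as they do after the change, while rows without it still surface the first per-metric sample instead of an empty payload.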