
Add properties in Evaluation Result - Custom Evaluator extra fields. #46077

Open
w-javed wants to merge 9 commits into main from waqasjaved02/aoai-properties-passthrough

@w-javed w-javed commented Apr 2, 2026

No description provided.

@w-javed w-javed requested a review from a team as a code owner April 2, 2026 07:11
Copilot AI review requested due to automatic review settings April 2, 2026 07:11
@github-actions github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Apr 2, 2026

Copilot AI left a comment


Pull request overview

Adds support for passing through custom evaluator “extra fields” via a properties bag into the AOAI-style evaluation result objects produced by the evaluation results converter.

Changes:

  • Update _extract_metric_values to detect an outputs.<criteria>.properties dict and propagate it onto per-metric extracted values.
  • Update _create_result_object to include properties in the final AOAI result payload when present.
  • Add a unit test asserting properties is preserved and not flattened into the top-level result object.
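
The extraction step described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the SDK's `_extract_metric_values`: the function name `extract_metric_values_sketch`, the flat `outputs.<criteria>.<name>` row layout, and the rule that every non-`properties` key is a metric score are all assumptions made for the example.

```python
from typing import Any, Dict


def extract_metric_values_sketch(criteria: str, row: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical stand-in for the converter step; not the SDK implementation."""
    prefix = f"outputs.{criteria}."
    per_metric: Dict[str, Dict[str, Any]] = {}
    properties = None

    for key, value in row.items():
        if not key.startswith(prefix):
            continue  # belongs to inputs or to another criteria
        name = key[len(prefix):]
        if name == "properties":
            # Only a dict is accepted as a properties bag in this sketch.
            if isinstance(value, dict):
                properties = value
            continue
        # Assumption: every other key under the prefix is a metric score.
        per_metric[name] = {"score": value}

    if properties is not None:
        for metric_dict in per_metric.values():
            # Shallow copy so the metric entries do not share one dict object.
            metric_dict["properties"] = dict(properties)
    return per_metric
```

With a row such as `{"outputs.my_eval.relevance": 4, "outputs.my_eval.properties": {"source": "custom"}}`, the sketch returns one entry for `relevance` that carries both the score and the properties dict.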

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py | Propagates a per-criteria properties dict into per-metric result objects during AOAI conversion. |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_evaluate.py | Adds coverage validating properties passthrough behavior for a custom evaluator result row. |

…esults

Pass through evaluator properties dict in AOAI evaluation results.
When an evaluator returns a properties dict, it is included alongside
score, label, reason, threshold, and passed in the result object.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
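
As a rough illustration of the commit message above, the result object could be assembled like this. The helper name `create_result_object_sketch` and the `name` and `metric` keys are assumptions for the example; only the field names score, label, reason, threshold, passed, and properties come from the commit message.

```python
from typing import Any, Dict


def create_result_object_sketch(name: str, metric: str, values: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical builder; the real _create_result_object may differ."""
    result: Dict[str, Any] = {"name": name, "metric": metric}
    for field in ("score", "label", "reason", "threshold", "passed"):
        result[field] = values.get(field)

    properties = values.get("properties")
    if isinstance(properties, dict):
        # Kept as one nested dict; its keys are not flattened into the result.
        result["properties"] = properties
    return result
```

Keeping the bag nested means a custom key such as `source` cannot collide with a standard field like `score`.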
@w-javed w-javed force-pushed the waqasjaved02/aoai-properties-passthrough branch from 2fd8ce2 to c8f5958 Compare April 2, 2026 07:19
Update _extract_metric_values and _create_result_object docstrings
to document the new properties field and its expected dict type.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
w-javed and others added 5 commits April 2, 2026 18:33
Address PR review: warn users when their custom evaluator returns
'properties' as a non-dict type so they can fix the output format.
Also add properties to _create_result_object example input.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
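
The non-dict warning described in this commit might look something like the sketch below. The logger name `properties_sketch`, the helper `validated_properties`, and the message wording are all hypothetical; the commit only states that a warning is issued when `properties` is not a dict.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger("properties_sketch")  # hypothetical logger name


def validated_properties(criteria: str, value: Any) -> Optional[Dict[str, Any]]:
    """Return the properties bag if it is a dict; otherwise warn and return None."""
    if isinstance(value, dict):
        return value
    logger.warning(
        "Evaluator '%s' returned 'properties' of type %s; expected dict. Ignoring it.",
        criteria,
        type(value).__name__,
    )
    return None
```

A test for this branch could capture the log record with `unittest`'s `assertLogs` or pytest's `caplog` fixture and assert that the returned value is `None`.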
@slister1001 slister1001 force-pushed the waqasjaved02/aoai-properties-passthrough branch from e9b6431 to 5ec9b95 Compare April 3, 2026 14:24
slister1001 and others added 2 commits April 3, 2026 10:25
Remove erroneous space in self._eval_metric. value (two occurrences) that
would cause an AttributeError at runtime when building result keys for
_details and _total_tokens fields.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@nagkumar91 nagkumar91 left a comment


Review — Properties passthrough

  1. Shared dict reference (bug-risk) — In _extract_metric_values, the same properties object is assigned to every metric entry:
for metric_dict in result_per_metric.values():
    metric_dict["properties"] = properties  # same object reference

If anything downstream mutates one entry's properties, all entries are affected. Consider metric_dict["properties"] = properties.copy() (or copy.deepcopy if nested dicts matter).

  2. No test for the warning path — The isinstance(metric_value, dict) guard logs a warning when properties isn't a dict, but no test covers this branch. A quick test passing properties="not_a_dict" would confirm the warning fires and properties is omitted.

  3. Typo fixes in _base_rai_svc_eval.py — Good catch on self._eval_metric. value → self._eval_metric.value. Cosmetic (Python allows whitespace after dot) but worth cleaning up.
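
The shared-reference concern in the first point can be reproduced with a small standalone sketch. The `attach` helper and its `mode` parameter are hypothetical and exist only to compare the three assignment strategies side by side.

```python
import copy
from typing import Any, Dict


def attach(per_metric: Dict[str, Dict[str, Any]], properties: Dict[str, Any], mode: str) -> Dict[str, Dict[str, Any]]:
    """Attach properties to every metric entry using the chosen strategy."""
    for metric_dict in per_metric.values():
        if mode == "shared":
            metric_dict["properties"] = properties  # same object for every metric
        elif mode == "copy":
            metric_dict["properties"] = properties.copy()  # new top level, shared nested values
        else:
            metric_dict["properties"] = copy.deepcopy(properties)  # fully independent
    return per_metric


props = {"source": "custom", "nested": {"k": 1}}

shared = attach({"m1": {}, "m2": {}}, copy.deepcopy(props), "shared")
shared["m1"]["properties"]["source"] = "changed"
# m2 sees the change because both entries point at one dict.

shallow = attach({"m1": {}, "m2": {}}, copy.deepcopy(props), "copy")
shallow["m1"]["properties"]["source"] = "changed"
shallow["m1"]["properties"]["nested"]["k"] = 2
# Top-level keys are independent, but the nested dict is still shared.

deep = attach({"m1": {}, "m2": {}}, copy.deepcopy(props), "deep")
deep["m1"]["properties"]["nested"]["k"] = 2
# m2 is unaffected at every level.
```

`properties.copy()` is enough when the bag holds only scalar values; `copy.deepcopy` costs more but also isolates nested dicts and lists.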

