
feat: add code-based evaluator support #739

Open
jariy17 wants to merge 4 commits into main from code-based-evaluator


jariy17 (Collaborator) commented Mar 31, 2026

Summary

  • Add managed and external code-based evaluator support across schema, CLI flags, TUI wizard, and template scaffolding
  • EvaluatorConfigSchema becomes a mutually exclusive union of llmAsAJudge and codeBased
  • New CLI flags: --type code-based|llm-as-a-judge, --lambda-arn, --timeout
  • TUI wizard with 3 branching flows: LLM, managed code-based, external code-based
  • Scaffold Python Lambda template with @custom_code_based_evaluator() decorator
  • Block code-based evaluators from online eval configs at schema, CLI, and TUI layers
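
For readers skimming the summary, here is a minimal sketch of what the mutual-exclusion union and the online eval block could look like. It is not the PR's actual schema code. Only the key names llmAsAJudge and codeBased come from the summary; the field names inside each config, the Evaluator shape, and the helpers validateEvaluatorConfig and findBlockedEvaluators are hypothetical and exist only for illustration.

```typescript
// Hypothetical shapes: only the key names llmAsAJudge and codeBased come from
// the PR summary; every other field here is an assumption for illustration.
interface LlmAsAJudgeConfig {
  model: string;
  instructions: string;
}

interface CodeBasedConfig {
  // Assumed: an external evaluator points at an existing Lambda ARN,
  // while a managed one leaves this unset.
  lambdaArn?: string;
  timeoutSeconds?: number;
}

// Mutual exclusion in the type system: each variant forbids the other key.
type EvaluatorConfig =
  | { llmAsAJudge: LlmAsAJudgeConfig; codeBased?: never }
  | { codeBased: CodeBasedConfig; llmAsAJudge?: never };

interface Evaluator {
  name: string;
  config: EvaluatorConfig;
}

// Runtime check for untyped input (e.g. parsed JSON): exactly one key must be set.
function validateEvaluatorConfig(raw: Record<string, unknown>): string[] {
  const hasLlm = raw.llmAsAJudge !== undefined;
  const hasCode = raw.codeBased !== undefined;
  if (hasLlm && hasCode) return ["llmAsAJudge and codeBased are mutually exclusive"];
  if (!hasLlm && !hasCode) return ["one of llmAsAJudge or codeBased is required"];
  return [];
}

// Returns the referenced evaluator names that are code-based and would
// therefore be rejected in an online eval config.
function findBlockedEvaluators(referenced: string[], evaluators: Evaluator[]): string[] {
  return referenced.filter((name) =>
    evaluators.some((e) => e.name === name && e.config.codeBased !== undefined),
  );
}
```

Under these assumptions, an online eval config that references one LLM evaluator and one code-based evaluator would get back only the code-based name, which the schema, CLI, or TUI layer could then turn into an error message.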

Test plan

  • Unit tests for EvaluatorPrimitive (add, remove, previewRemove)
  • Online eval config blocking (remove blocked when referenced)
  • ESLint, Prettier, Secretlint pass
  • TUI manual testing (add managed, add external, remove)
  • Integration test with agentcore deploy

Note

Second commit vendors the SDK wheel temporarily until bedrock-agentcore is published to PyPI with code-based evaluator support.

jariy17 added 2 commits March 31, 2026 00:37
Add managed and external code-based evaluator support across schema,
CLI flags, TUI wizard, and template scaffolding. Block code-based
evaluators from online eval configs at schema, CLI, and TUI layers.

Vendor the SDK wheel and add binary-aware template rendering until the
SDK is published to PyPI. This is to be removed once the SDK is publicly available.
jariy17 requested a review from a team March 31, 2026 04:44
github-actions bot added the size/l (PR size: L) label Mar 31, 2026
github-actions bot (Contributor) commented

Package Tarball

aws-agentcore-0.5.0.tgz

How to install

npm install https://github.com/aws/agentcore-cli/releases/download/pr-739-tarball/aws-agentcore-0.5.0.tgz

jariy17 force-pushed the code-based-evaluator branch from 2637c56 to d182cc5 March 31, 2026 14:48
jariy17 force-pushed the code-based-evaluator branch from d182cc5 to 14b84b5 March 31, 2026 15:05
- Update asset file listing snapshot for new evaluator templates
- Regenerate package-lock.json to fix stale aws-cdk bundled dep
  (@aws-cdk/cloud-assembly-schema 52.2.0 -> 53.11.0)
jariy17 force-pushed the code-based-evaluator branch from 14b84b5 to 5e46523 March 31, 2026 15:28
Status command was hardcoding "LLM-as-a-Judge" for all evaluators.
Now derives the label from item.config.codeBased to distinguish
code-based evaluators.
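
A rough sketch of the label derivation described in this commit message, not the actual status command code. Only item.config.codeBased and the "LLM-as-a-Judge" string appear in the message. The EvaluatorStatusItem shape, the evaluatorTypeLabel helper, and the "Code-based" label text are assumptions.

```typescript
// Hypothetical item shape; only item.config.codeBased is named in the commit message.
interface EvaluatorStatusItem {
  name: string;
  config: { llmAsAJudge?: unknown; codeBased?: unknown };
}

// Derive the display label from the config instead of hardcoding one string.
// The "Code-based" text is an assumed label, not taken from the PR.
function evaluatorTypeLabel(item: EvaluatorStatusItem): string {
  return item.config.codeBased !== undefined ? "Code-based" : "LLM-as-a-Judge";
}
```

Keeping the check in one small helper means any other view that lists evaluators could reuse the same label logic.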
github-actions bot (Contributor) commented

Coverage Report

Status  Category    Percentage  Covered / Total
🔵      Lines       45.78%      6596 / 14407
🔵      Statements  45.34%      7007 / 15453
🔵      Functions   44.53%      1178 / 2645
🔵      Branches    45.97%      4369 / 9502
Generated in workflow #1554 for commit 0d7d519 by the Vitest Coverage Report Action

