feat(studio): achieve full convex-evals feature parity #811
Merged
Implements all 5 gaps from #810 plus low-priority items:

- Gap 1: File tree in Output/Task tabs with split layout (FileTree + Monaco)
- Gap 2: Category drill-down page at /runs/:runId/category/:category
- Gap 3: Landing page tabs (Recent Runs, Experiments, Targets)
- Gap 4: Experiment detail page at /experiments/:experimentName
- Gap 5: Breadcrumb navigation derived from TanStack Router matches

Low priority:

- Step timing badges on assertions
- Target/Experiment columns in run list
- Run metadata enrichment in API

New API endpoints: /api/experiments, /api/targets, /api/runs/:filename/evals/:evalId/files, /api/runs/:filename/evals/:evalId/files/*

Closes #810

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
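For readers unfamiliar with the approach in Gap 5, here is a minimal sketch of deriving breadcrumbs from a list of route matches. It is not the code from this PR: `RouteMatchLike`, `Crumb`, and `buildBreadcrumbs` are hypothetical names, and the match shape is a simplified stand-in for what TanStack Router provides.

```typescript
// Hypothetical, simplified match shape. Real TanStack Router matches carry
// more fields; only the pathname is needed for this sketch.
interface RouteMatchLike {
  pathname: string;
}

interface Crumb {
  label: string;
  href: string;
}

// Build one crumb per distinct matched pathname, labeled by its last segment.
function buildBreadcrumbs(matches: RouteMatchLike[]): Crumb[] {
  const crumbs: Crumb[] = [];
  const seen = new Set<string>();
  for (const match of matches) {
    // Strip trailing slashes so "/runs/" and "/runs" count as the same page.
    const href = match.pathname.replace(/\/+$/, "") || "/";
    if (seen.has(href)) continue;
    seen.add(href);
    const lastSegment = href.split("/").filter(Boolean).pop();
    crumbs.push({
      // The "Home" label for the root route is an assumption.
      label: lastSegment ? decodeURIComponent(lastSegment) : "Home",
      href,
    });
  }
  return crumbs;
}

// Example matches for the category drill-down route (values are made up).
const crumbs = buildBreadcrumbs([
  { pathname: "/" },
  { pathname: "/runs/run-1" },
  { pathname: "/runs/run-1/category/auth" },
]);
console.log(crumbs.map((c) => c.label).join(" / "));
```

Deriving crumbs from matches rather than hard-coding them per page means a new nested route gets a breadcrumb without extra wiring, which is presumably why the PR takes this route.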
Use LightweightResultRecord's experiment field directly instead of unsafe Record<string, unknown> casts. Fix import ordering and formatting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
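To illustrate the kind of change this commit describes, the sketch below contrasts a cast-based read with a typed field read. The interface here is a hypothetical subset of `LightweightResultRecord` (only `experiment` is confirmed by the commit message; `score`, the optional marker, and the `"default"` fallback are assumptions), and `groupByExperiment` is an illustrative helper, not a function from the PR.

```typescript
// Hypothetical subset of LightweightResultRecord. Only `experiment` is named
// in the commit message; `score` is an assumed placeholder field.
interface LightweightResultRecord {
  experiment?: string;
  score: number;
}

// Cast-based style the commit moves away from (shown for contrast only):
//   (record as unknown as Record<string, unknown>).experiment as string
// The typed access below lets the compiler check the field name and type.
function groupByExperiment(
  records: LightweightResultRecord[],
): Map<string, LightweightResultRecord[]> {
  const groups = new Map<string, LightweightResultRecord[]>();
  for (const record of records) {
    // Falling back to "default" for records without an experiment is an assumption.
    const key = record.experiment ?? "default";
    const bucket = groups.get(key) ?? [];
    bucket.push(record);
    groups.set(key, bucket);
  }
  return groups;
}

// Made-up sample data.
const grouped = groupByExperiment([
  { experiment: "baseline", score: 1 },
  { experiment: "baseline", score: 0 },
  { score: 1 },
]);
console.log([...grouped.keys()]);
```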
TargetSummary type used `passed`/`total` fields but the API returns `passed_count`/`eval_count`. Aligned the type and component.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
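A rough sketch of what the aligned type might look like follows. Only the `passed_count` and `eval_count` field names come from the commit message; the `name` field and both helper functions are hypothetical.

```typescript
// Sketch of the aligned type. `passed_count` and `eval_count` match the API
// fields named in the commit message; `name` is an assumed field.
interface TargetSummary {
  name: string;
  passed_count: number;
  eval_count: number;
}

// Fraction of evals that passed, guarding against division by zero.
function passRate(summary: TargetSummary): number {
  return summary.eval_count === 0 ? 0 : summary.passed_count / summary.eval_count;
}

// Hypothetical display helper a table cell might use.
function formatPassRate(summary: TargetSummary): string {
  const percent = Math.round(passRate(summary) * 100);
  return `${summary.passed_count}/${summary.eval_count} (${percent}%)`;
}

// Made-up sample value.
const example: TargetSummary = { name: "example-target", passed_count: 3, eval_count: 4 };
console.log(formatPassRate(example)); // prints "3/4 (75%)"
```

With the old `passed`/`total` names, reads against the API payload would have yielded `undefined` at runtime even though the code type-checked, which is the mismatch this commit removes.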
a626e62 to 7e627f9
Summary
Implements all 5 gaps from #810 plus low-priority items to achieve full convex-evals feature parity in AgentV Studio:
- /runs/:runId/category/:category with scoped stats and eval list
- /experiments/:experimentName with aggregate stats

New files
- FileTree.tsx, Breadcrumbs.tsx, ExperimentsTab.tsx, TargetsTab.tsx
- $runId_.category.$category.tsx, $experimentName.tsx

Modified files
- serve.ts: 4 new API endpoints (file tree, file content, experiments, targets)
- EvalDetail.tsx: split layout for Output/Task tabs with file tree
- RunDetail.tsx: category cards now navigate to drill-down pages
- Sidebar.tsx: context-aware for category and experiment pages
- Layout.tsx: breadcrumbs above content area
- index.tsx: tabbed landing page

Test plan
Closes #810