Labels: epic (tracking issue for a multi-issue initiative), wui (relates to the browser dashboard / web UI runtime)
Description
Summary
Upgrade the agentv dashboard from a read-only viewer into a full AgentV Studio — a management and analysis platform for quality gate enforcement, orchestration monitoring, regression detection, cost attribution, root cause diagnosis, and pattern synthesis.
Inspired by melagiri/code-insights (React + Vite + Hono dashboard with real-time session analysis, cost tracking, pattern detection).
Architecture Boundary Summary
| Layer | Scope | Issues |
|---|---|---|
| Platform foundation | React+Vite SPA, Hono server, history API, run management | #563 |
| Quality enforcement | Dashboard gate config, regression alerts, cost views | #334, #335, #635 |
| Orchestration control | Campaign monitoring, pause/resume/stop active loops | #785 |
| Deep analysis | Trace-driven root cause diagnosis, pattern synthesis | #786, #787 |
Current Implementation Status
| Component | Status |
|---|---|
| html-writer.ts (static HTML report) | Implemented |
| History repo architecture | Not started |
| React+Vite dashboard scaffold | Not started |
| Hono API server | Not started |
| SSE progressive visualization | Not started |
| Quality gate engine (severity, remediation) | Not started |
| Quality gate dashboard UI | Not started |
| Regression detection engine | Not started |
| Regression alert visualization | Not started |
| Cost computation engine | Not started |
| Cost attribution dashboard views | Not started |
| Orchestration monitor UI | Not started |
| Root cause explorer | Not started |
| Code-insights pattern synthesis | Not started |
Dependency Graph
Phase 1: Platform Foundation
#563 (AgentV Studio platform)
- React+Vite scaffold
- Hono API server
- History repo integration
- Run management views
Phase 2: Quality Enforcement (parallel tracks)
#334 (quality gates), #335 (regression alerts), #635 (cost attribution) -- all depend on #563 platform
Phase 3: Orchestration Control
#785 (orchestration monitor)
- depends on #563 platform
- depends on #748, #699, #746 engines
Phase 4: Deep Analysis (parallel tracks)
#786 (root cause explorer), #787 (code-insights) -- both depend on #563 platform + #335 regression data
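The graph above can be turned into execution waves mechanically. The following is a minimal sketch, not part of the planned implementation: `dependsOn` and `computeWaves` are hypothetical names, the edges are copied from the graph, and the engine issues #748, #699, and #746 are assumed to be already merged and are therefore omitted.

```typescript
// Edges copied from the dependency graph: issue -> issues it depends on.
// Assumption: engine issues #748, #699, #746 are already merged and left out.
const dependsOn: Record<number, number[]> = {
  563: [],
  334: [563],
  335: [563],
  635: [563],
  785: [563],
  786: [563, 335],
  787: [563, 335],
};

// Layered topological sort: an issue joins the earliest wave in which
// every one of its dependencies has already been placed in a prior wave.
function computeWaves(graph: Record<number, number[]>): number[][] {
  const remaining = new Set<number>(Object.keys(graph).map(Number));
  const done = new Set<number>();
  const waves: number[][] = [];

  while (remaining.size > 0) {
    const ready = Array.from(remaining)
      .filter((id) => graph[id].every((dep) => done.has(dep)))
      .sort((a, b) => a - b);

    // No progress means a cycle or a dependency missing from the graph.
    if (ready.length === 0) {
      throw new Error("Unresolvable dependencies (cycle or unknown issue)");
    }

    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

const waves = computeWaves(dependsOn);
console.log(waves);
```

Under these assumptions #785 lands in the same layer as the quality enforcement issues, since it depends only on the platform.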
Parallel Execution Waves
Wave 1 — Platform (sequential prerequisite)
- #563 ("feat: AgentV Studio — eval management platform with historical trends, quality gates, and orchestration"): AgentV Studio platform (React+Vite scaffold, Hono server, history API, run management, SSE)
Wave 2 — Quality Enforcement (parallel after Wave 1)
- #334 ("feat(eval): composable quality gates with auto-remediation triggers"): Quality gates: dashboard gate configuration UI, visual threshold editor, one-click remediation
- #335 ("feat(eval): iteration tracking, termination taxonomy, and cross-run regression detection"): Regression detection: alert feed, regression timeline, auto-clustering, git correlation
- #635 ("feat: compute costUsd from token usage via model pricing table"): Cost tracking: cost attribution views, budget tracking, optimization suggestions
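As an illustration of what #635 describes (deriving costUsd from token usage via a model pricing table), here is a minimal sketch. The type names, the `computeCostUsd` function, the model names, and all prices are placeholders invented for this example; the real pricing table and field names are defined in #635, not here.

```typescript
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Prices in USD per one million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Placeholder entries for hypothetical models; not real vendor pricing.
const pricingTable: Record<string, ModelPrice> = {
  "example-small": { inputPerMTok: 1, outputPerMTok: 4 },
  "example-large": { inputPerMTok: 10, outputPerMTok: 30 },
};

// Returns the cost in USD, or undefined when the model has no price entry,
// so callers can distinguish "free" (0) from "unknown".
function computeCostUsd(
  model: string,
  usage: TokenUsage,
  table: Record<string, ModelPrice> = pricingTable,
): number | undefined {
  const price: ModelPrice | undefined = table[model];
  if (price === undefined) return undefined;
  return (
    (usage.inputTokens * price.inputPerMTok +
      usage.outputTokens * price.outputPerMTok) /
    1e6
  );
}

console.log(
  computeCostUsd("example-small", { inputTokens: 2_000_000, outputTokens: 500_000 }),
);
```

Returning undefined for unknown models is one possible design choice; it keeps attribution views from silently reporting a zero cost for runs whose model is missing from the table.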
Wave 3 — Orchestration (after Wave 1, can overlap Wave 2)
- #785 ("feat(dashboard): orchestration monitor — campaign management UI"): Orchestration monitor: campaign list, score trajectory, controls (pause/resume/stop), cost burn rate
Wave 4 — Deep Analysis (after Waves 1-2)
- #786 ("feat(dashboard): root cause explorer — trace-driven failure diagnosis"): Root cause explorer: failure clustering, trace drill-down, side-by-side comparison, git correlation
- #787 ("feat(dashboard): code-insights integration — pattern synthesis from eval sessions"): Code-insights integration: insight extraction, pattern synthesis, friction heatmap, rule export
Merge Order (low-conflict default)
1. #563 (platform): foundational, merge first
2. #635 (cost): smallest scope, least conflict
3. #334 (quality gates): independent of #335
4. #335 (regression): independent of #334
5. #785 (orchestration): depends on platform only
6. #786 (root cause): depends on #335 regression data
7. #787 (code-insights): depends on platform + #334 for rule export
Subagent Operating Contract
- Each issue is independently implementable after its dependencies are merged
- Issues should not introduce cross-cutting schema changes without updating this tracking issue
- Dashboard components should follow the React+Vite architecture established in #563
- All dashboard views should support the SSE progressive update pattern from #563
- Backend API routes follow the Hono pattern established in #563
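For subagents unfamiliar with SSE, the wire format behind the progressive update pattern is small enough to sketch. This is an illustration only, assuming JSON payloads; `SseEvent`, `formatSseFrame`, and the `test_result` event name are hypothetical, and the actual event schema is whatever #563 establishes.

```typescript
interface SseEvent {
  event?: string; // optional event name; clients default to "message"
  id?: string;    // optional id, echoed back by clients as Last-Event-ID
  data: Record<string, unknown>;
}

// Serialize one event into a text/event-stream frame.
// JSON.stringify without an indent argument emits no raw newlines,
// so the payload always fits on a single "data:" line.
function formatSseFrame({ event, id, data }: SseEvent): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`data: ${JSON.stringify(data)}`);
  // A blank line terminates the frame.
  return lines.join("\n") + "\n\n";
}

const frame = formatSseFrame({
  event: "test_result", // hypothetical event name
  id: "1",
  data: { testId: "t1", score: 0.9 },
});
console.log(frame);
```

On the browser side, a view would typically consume such frames with the standard EventSource API and update incrementally as each frame arrives.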
Completion Criteria
- All 7 sub-issues closed
- agentv serve serves the dashboard as a React SPA
- Quality gates configurable from dashboard UI
- Regression alerts visible in real time
- Active campaigns monitorable and controllable
- Cost attribution visible per evaluator/category/target
- Root cause explorer links regressions to trace-level diagnosis
- Pattern synthesis extracts insights across runs
- Static --format html report continues to work independently
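To make the "quality gates configurable from dashboard UI" criterion concrete, here is one way a threshold gate with severity could be evaluated. This is a sketch under stated assumptions: the `QualityGate` shape, the `warn`/`block` severity levels, and the metric names are all hypothetical, and the real gate schema belongs to #334.

```typescript
type Severity = "warn" | "block";

// Hypothetical gate shape: a metric must reach at least minScore.
interface QualityGate {
  name: string;
  metric: string;
  minScore: number;
  severity: Severity;
}

interface GateResult {
  gate: string;
  passed: boolean;
  severity: Severity;
  actual: number | undefined;
}

// Evaluate every gate against a run's metrics.
// A metric missing from the run counts as a failure.
function evaluateGates(
  gates: QualityGate[],
  metrics: Record<string, number>,
): GateResult[] {
  return gates.map((g) => {
    const actual: number | undefined = metrics[g.metric];
    return {
      gate: g.name,
      passed: actual !== undefined && actual >= g.minScore,
      severity: g.severity,
      actual,
    };
  });
}

// A run is blocked only when a gate with "block" severity fails.
function isBlocked(results: GateResult[]): boolean {
  return results.some((r) => !r.passed && r.severity === "block");
}

const results = evaluateGates(
  [
    { name: "accuracy floor", metric: "accuracy", minScore: 0.8, severity: "block" },
    { name: "latency score", metric: "latency", minScore: 0.5, severity: "warn" },
  ],
  { accuracy: 0.75, latency: 0.9 },
);
console.log(results, isBlocked(results));
```

A dashboard threshold editor would then only need to edit `minScore` and `severity` per gate, leaving the evaluation logic unchanged.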
Research Source
- melagiri/code-insights — React+Vite+Hono dashboard, real-time session analysis, cost tracking, pattern detection, AI fluency scoring, rule generation
- DeepEval Confident AI — cloud eval dashboard with trends
- Convex Evals — React dashboard with category breakdown
- Anthropic skill-creator — eval-viewer with grading.json schema alignment
Status: Ready