tracking: AgentV Studio — eval management platform with quality gates, orchestration, and analysis #788

## Summary

Upgrade the agentv dashboard from a read-only viewer into a full **AgentV Studio** — a management and analysis platform for quality gate enforcement, orchestration monitoring, regression detection, cost attribution, root cause diagnosis, and pattern synthesis.

Inspired by [melagiri/code-insights](https://github.com/melagiri/code-insights) (React + Vite + Hono dashboard with real-time session analysis, cost tracking, pattern detection).

## Architecture Boundary Summary

| Layer | Scope | Issues |
|-------|-------|--------|
| Platform foundation | React+Vite SPA, Hono server, history API, run management | #563 |
| Quality enforcement | Dashboard gate config, regression alerts, cost views | #334, #335, #635 |
| Orchestration control | Campaign monitoring, pause/resume/stop active loops | #785 |
| Deep analysis | Trace-driven root cause diagnosis, pattern synthesis | #786, #787 |

## Current Implementation Status

| Component | Status |
|-----------|--------|
| `html-writer.ts` (static HTML report) | Implemented |
| History repo architecture | Not started |
| React+Vite dashboard scaffold | Not started |
| Hono API server | Not started |
| SSE progressive visualization | Not started |
| Quality gate engine (severity, remediation) | Not started |
| Quality gate dashboard UI | Not started |
| Regression detection engine | Not started |
| Regression alert visualization | Not started |
| Cost computation engine | Not started |
| Cost attribution dashboard views | Not started |
| Orchestration monitor UI | Not started |
| Root cause explorer | Not started |
| Code-insights pattern synthesis | Not started |

## Dependency Graph

```
Phase 1: Platform Foundation
  #563 (AgentV Studio platform)
    - React+Vite scaffold
    - Hono API server
    - History repo integration
    - Run management views

Phase 2: Quality Enforcement (parallel tracks, all depend on #563 platform)
  #334 (quality gates)
  #335 (regression alerts)
  #635 (cost attribution)

Phase 3: Orchestration Control
  #785 (orchestration monitor)
    - depends on #563 platform
    - depends on #748, #699, #746 engines

Phase 4: Deep Analysis (parallel tracks)
  #786 (root cause explorer)     -- both depend on #563 platform + #335 regression data
  #787 (code-insights)           -- also depends on #334 for rule export
```
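The phases above can be derived mechanically from the dependency edges. The following is a minimal sketch, not part of any existing AgentV code: `dependencies` and `computeWaves` are hypothetical names, the external engine issues (#748, #699, #746) are left out because they are not sub-issues of this tracker, and the edges for #787 are an assumption that combines the graph annotation with the merge-order notes.

```typescript
type IssueId = number;

// Each tracked issue maps to the tracked issues it depends on.
// External engines (#748, #699, #746) are intentionally omitted.
const dependencies: Record<IssueId, IssueId[]> = {
  563: [],
  334: [563],
  335: [563],
  635: [563],
  785: [563],
  786: [563, 335],
  // Assumption: union of the graph annotation (#335) and the merge-order note (#334).
  787: [563, 334, 335],
};

// Group issues into layers: an issue is placed in the first layer in which
// all of its dependencies have already been placed in earlier layers.
function computeWaves(deps: Record<IssueId, IssueId[]>): IssueId[][] {
  const remaining = new Set<IssueId>(Object.keys(deps).map(Number));
  const done = new Set<IssueId>();
  const waves: IssueId[][] = [];

  while (remaining.size > 0) {
    const ready = Array.from(remaining)
      .filter((id) => (deps[id] ?? []).every((d) => done.has(d)))
      .sort((a, b) => a - b);
    if (ready.length === 0) {
      throw new Error("Dependency cycle detected");
    }
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

console.log(computeWaves(dependencies));
// → [ [ 563 ], [ 334, 335, 635, 785 ], [ 786, 787 ] ]
```

The pure layering places #785 in the same layer as the quality enforcement issues, which matches the note that orchestration work can overlap Wave 2. Keeping it as its own phase is a scheduling choice rather than a dependency requirement among the tracked issues.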

## Parallel Execution Waves

### Wave 1 — Platform (sequential prerequisite)
- [ ] #563 — AgentV Studio platform (React+Vite scaffold, Hono server, history API, run management, SSE)

### Wave 2 — Quality Enforcement (parallel after Wave 1)
- [ ] #334 — Quality gates: dashboard gate configuration UI, visual threshold editor, one-click remediation
- [ ] #335 — Regression detection: alert feed, regression timeline, auto-clustering, git correlation
- [ ] #635 — Cost tracking: cost attribution views, budget tracking, optimization suggestions
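To make the quality gate item concrete, here is a rough sketch of how a threshold gate with a severity level might be evaluated. The tracker only names thresholds, severity, and remediation, so every type and function below (`QualityGate`, `evaluateGates`, `hasBlockingFailure`, the `"warn"`/`"block"` levels) is a hypothetical illustration rather than the schema #334 will define.

```typescript
// Hypothetical severity levels; the real set is up to #334.
type Severity = "warn" | "block";

interface QualityGate {
  metric: string;
  minScore: number; // inclusive lower bound
  severity: Severity;
}

interface GateResult {
  metric: string;
  actual: number | undefined;
  passed: boolean;
  severity: Severity;
}

// Compare each gate's threshold with the observed score.
// A metric with no recorded score is treated as a failure.
function evaluateGates(
  scores: Record<string, number>,
  gates: QualityGate[],
): GateResult[] {
  return gates.map((gate) => {
    const actual: number | undefined = scores[gate.metric];
    const passed = actual !== undefined && actual >= gate.minScore;
    return { metric: gate.metric, actual, passed, severity: gate.severity };
  });
}

// Only failed gates with "block" severity should fail the run.
function hasBlockingFailure(results: GateResult[]): boolean {
  return results.some((r) => !r.passed && r.severity === "block");
}

const gateResults = evaluateGates(
  { accuracy: 0.82, safety: 0.99 },
  [
    { metric: "accuracy", minScore: 0.9, severity: "warn" },
    { metric: "safety", minScore: 0.95, severity: "block" },
  ],
);
console.log(gateResults.map((r) => `${r.metric}: ${r.passed ? "pass" : "fail"}`));
console.log("blocking failure:", hasBlockingFailure(gateResults));
```

In this example the accuracy gate fails but only warns, so the run would not be blocked. A dashboard threshold editor would then only need to edit the `minScore` and `severity` fields.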

### Wave 3 — Orchestration (after Wave 1, can overlap Wave 2)
- [ ] #785 — Orchestration monitor: campaign list, score trajectory, controls (pause/resume/stop), cost burn rate

### Wave 4 — Deep Analysis (after Waves 1-2)
- [ ] #786 — Root cause explorer: failure clustering, trace drill-down, side-by-side comparison, git correlation
- [ ] #787 — Code-insights integration: insight extraction, pattern synthesis, friction heatmap, rule export

## Merge Order (low-conflict default)

1. #563 (platform) — foundational, merge first
2. #635 (cost) — smallest scope, least conflict
3. #334 (quality gates) — independent of #335
4. #335 (regression) — independent of #334
5. #785 (orchestration) — depends on platform only within this tracker; also needs the #748, #699, #746 engines
6. #786 (root cause) — depends on #335 regression data
7. #787 (code-insights) — depends on platform + #334 for rule export
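A proposed merge order can be checked against the dependency notes before anyone starts merging. The sketch below is illustrative only: `trackedDeps` and `findOrderViolations` are hypothetical names, and the edge list is an assumption read from the dependency graph and the notes in this list.

```typescript
// Dependencies among tracked issues (assumed from this tracking issue).
const trackedDeps: Record<number, number[]> = {
  563: [],
  635: [563],
  334: [563],
  335: [563],
  785: [563],
  786: [563, 335],
  787: [563, 334],
};

// The default order listed above.
const mergeOrder = [563, 635, 334, 335, 785, 786, 787];

// Report every issue that appears before (or without) one of its dependencies.
function findOrderViolations(
  order: number[],
  deps: Record<number, number[]>,
): string[] {
  const position = new Map<number, number>();
  order.forEach((issue, index) => position.set(issue, index));

  const violations: string[] = [];
  order.forEach((issue, index) => {
    for (const dep of deps[issue] ?? []) {
      const depIndex = position.get(dep);
      if (depIndex === undefined || depIndex > index) {
        violations.push(`#${issue} needs #${dep} merged first`);
      }
    }
  });
  return violations;
}

console.log(findOrderViolations(mergeOrder, trackedDeps)); // → []
```

An empty result means the listed order respects every tracked dependency. Reordering for conflict reasons stays safe as long as the check still returns no violations.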

## Subagent Operating Contract

- Each issue is independently implementable after its dependencies are merged
- Issues should not introduce cross-cutting schema changes without updating this tracking issue
- Dashboard components should follow the React+Vite architecture established in #563
- All dashboard views should support the SSE progressive update pattern from #563
- Backend API routes follow the Hono pattern established in #563
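For readers unfamiliar with the SSE pattern referenced in the contract, the wire format is small enough to sketch directly. This is a dependency-free illustration of how one server-sent event is framed, not the #563 implementation: `SseMessage`, `formatSseMessage`, the `test-result` event name, and the payload fields are all hypothetical.

```typescript
interface SseMessage {
  event?: string; // optional event name the client can subscribe to
  id?: string; // optional id, echoed back by the browser on reconnect
  data: string; // payload; may span multiple lines
}

// Serialize one message in the text/event-stream format:
// "field: value" lines, terminated by a blank line.
function formatSseMessage(msg: SseMessage): string {
  const lines: string[] = [];
  if (msg.event !== undefined) lines.push(`event: ${msg.event}`);
  if (msg.id !== undefined) lines.push(`id: ${msg.id}`);
  // Each line of the payload needs its own "data:" field.
  for (const line of msg.data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// Hypothetical progressive update: one frame per completed test case.
const frame = formatSseMessage({
  event: "test-result",
  id: "1",
  data: JSON.stringify({ testId: "case-1", score: 0.9 }),
});
console.log(JSON.stringify(frame));
// → "event: test-result\nid: 1\ndata: {\"testId\":\"case-1\",\"score\":0.9}\n\n"
```

On the dashboard side, a view following this pattern would typically open an `EventSource` on the stream URL and append each parsed payload to its state as frames arrive.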

## Completion Criteria

- [ ] All 7 sub-issues closed
- [ ] `agentv serve` serves the dashboard as a React SPA
- [ ] Quality gates configurable from dashboard UI
- [ ] Regression alerts visible in real time
- [ ] Active campaigns monitorable and controllable
- [ ] Cost attribution visible per evaluator/category/target
- [ ] Root cause explorer links regressions to trace-level diagnosis
- [ ] Pattern synthesis extracts insights across runs
- [ ] Static `--format html` report continues to work independently
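The cost attribution criterion amounts to grouping cost records along one of three dimensions. A minimal sketch follows, assuming a flat record per evaluator call. `CostRecord`, `attributeCost`, the integer `costCents` field, and the sample values are all hypothetical and not taken from the AgentV data model.

```typescript
interface CostRecord {
  evaluator: string;
  category: string;
  target: string;
  costCents: number; // integer cents avoid floating-point drift when summing
}

type Dimension = "evaluator" | "category" | "target";

// Sum cost per distinct value of the chosen dimension.
function attributeCost(
  records: CostRecord[],
  by: Dimension,
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const record of records) {
    const key = record[by];
    totals.set(key, (totals.get(key) ?? 0) + record.costCents);
  }
  return totals;
}

// Made-up sample data for illustration only.
const sampleCosts: CostRecord[] = [
  { evaluator: "llm-judge", category: "safety", target: "model-a", costCents: 120 },
  { evaluator: "llm-judge", category: "accuracy", target: "model-b", costCents: 80 },
  { evaluator: "code-judge", category: "accuracy", target: "model-a", costCents: 5 },
];

console.log(attributeCost(sampleCosts, "evaluator")); // llm-judge: 200, code-judge: 5
console.log(attributeCost(sampleCosts, "target")); // model-a: 125, model-b: 80
```

The same function serves all three dashboard views, since only the `by` argument changes.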

## Research Source

- [melagiri/code-insights](https://github.com/melagiri/code-insights) — React+Vite+Hono dashboard, real-time session analysis, cost tracking, pattern detection, AI fluency scoring, rule generation
- DeepEval Confident AI — cloud eval dashboard with trends
- Convex Evals — React dashboard with category breakdown
- Anthropic skill-creator — eval-viewer with grading.json schema alignment

