Agentic review loop with tool use

## Problem

DiffScope uses a fixed pipeline: parse diff → gather context → build prompt → single LLM call → parse output. Greptile's v3/v4 architecture replaced this kind of pipeline with an **agentic loop** in which the reviewer iteratively searches the codebase, reads files, checks git history, and challenges its own hypotheses. Because the agent can follow references beyond the diff, it can catch cross-file bugs that a single-call pipeline misses.

## How Greptile Does It

The v3/v4 agent operates in a **loop with a high iteration limit**, with these tools:

1. **Codebase search** — semantic search against pgvector embeddings + graph traversal
2. **File read** — read any file in the repo
3. **Git history** — inspect past commits to understand design decisions
4. **Learned rules** — query per-team rules from past PR feedback
5. **External context** — Jira tickets, Notion docs, project rule files

**The flow:**
1. Agent receives the PR diff
2. Performs initial analysis of changed code
3. Searches the codebase graph for related implementations, callers, and contracts
4. **Recursively follows function call chains** (multi-hop) — finds `calculateInvoiceTotal()` changed, discovers `generateMonthlyStatement()` calls it, checks downstream consumers
5. Retrieves git history for relevant files
6. **Challenges its own hypotheses** by searching for counter-evidence before raising a comment
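
The multi-hop step in the flow above can be pictured as a bounded breadth-first walk over a reverse call graph. The following is a minimal sketch under stated assumptions, not Greptile's implementation: the `Callers` map, `affected_callers`, and the extra function names `sendInvoiceEmail` and `runBillingCron` are invented for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Reverse call graph: function name -> functions that call it.
/// (Hypothetical shape; a real symbol graph would also carry file and line info.)
type Callers = HashMap<&'static str, Vec<&'static str>>;

/// Breadth-first walk from a changed function up through its callers,
/// stopping after `max_hops` levels. Returns (function, hop distance) pairs
/// in discovery order, excluding the starting function.
fn affected_callers(
    graph: &Callers,
    changed: &'static str,
    max_hops: usize,
) -> Vec<(&'static str, usize)> {
    let mut seen: HashSet<&str> = HashSet::from([changed]);
    let mut queue = VecDeque::from([(changed, 0usize)]);
    let mut out = Vec::new();

    while let Some((func, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // hop budget exhausted for this branch
        }
        for &caller in graph.get(func).into_iter().flatten() {
            // `insert` returns false for already-visited nodes, which breaks cycles.
            if seen.insert(caller) {
                out.push((caller, hops + 1));
                queue.push_back((caller, hops + 1));
            }
        }
    }
    out
}

/// Example graph using the function names from the flow above, plus two
/// invented callers and a deliberate cycle.
fn example_graph() -> Callers {
    HashMap::from([
        ("calculateInvoiceTotal", vec!["generateMonthlyStatement"]),
        ("generateMonthlyStatement", vec!["sendInvoiceEmail", "runBillingCron"]),
        ("runBillingCron", vec!["generateMonthlyStatement"]),
    ])
}
```

The hop limit keeps the context bounded on large graphs; each discovered caller is a candidate for a follow-up file read.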

**Results:** v4 achieved a 74% increase in addressed comments per PR (0.92 → 1.60) and raised the comment acceptance rate from 30% to 43%.

**Infrastructure:** The agent worker runs in a rootless Podman sandbox with kernel-level isolation (mount namespaces + pivot_root), so that a successful prompt injection cannot reach host resources.

## How CodeRabbit Does It

Five specialized agents running in parallel, each with tool access:
- Review Agent: reads files, queries AST patterns, runs linters
- Verification Agent: validates review output against actual code
- Chat Agent: handles conversations
- Pre-Merge Checks Agent: quality gates
- Finishing Touches Agent: runs on the Claude Agent SDK (Read, Write, Edit, Glob, Grep, Bash tools)

## Proposed Solution

### Phase 1: Single Agent with Tools
Replace the single LLM call in the review pipeline with a tool-using agent loop:

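```rust
use anyhow::Result; // assumption: any single-parameter `Result` alias works here
use async_trait::async_trait;

/// Drives the review: calls the model, runs the requested tools, repeats.
struct ReviewAgent {
    model: ModelConfig,
    tools: Vec<Box<dyn AgentTool>>,
    max_iterations: usize, // default: 10
}

/// A capability the agent can invoke during a review.
/// `Send + Sync` lets boxed tools be shared across async tasks.
#[async_trait]
trait AgentTool: Send + Sync {
    /// Unique name the model uses to request this tool.
    fn name(&self) -> &str;
    /// Natural-language description shown to the model.
    fn description(&self) -> &str;
    /// JSON Schema describing the accepted parameters.
    fn parameters_schema(&self) -> serde_json::Value;
    /// Runs the tool and returns its output as text for the model.
    async fn execute(&self, params: serde_json::Value) -> Result<String>;
}
```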
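
To make the control flow concrete, here is a rough, dependency-free sketch of the loop that would sit on top of these types. It is deliberately simplified and rests on several assumptions: the trait is synchronous and string-based (`Tool` rather than `AgentTool`), the model is stood in for by a closure, and `Step`, `Outcome`, `run_agent`, and `CannedTool` are hypothetical names that do not exist in DiffScope.

```rust
/// Synchronous, string-based stand-in for the `AgentTool` trait.
trait Tool {
    fn name(&self) -> &str;
    fn execute(&self, input: &str) -> Result<String, String>;
}

/// What the model asks for on each turn (hypothetical protocol).
enum Step {
    CallTool { tool: String, input: String },
    Finish { review: String },
}

/// How a run ended.
#[derive(Debug, PartialEq)]
enum Outcome {
    Finished { review: String, iterations: usize },
    IterationLimit,
}

/// Ask the model for a step, run the requested tool, append the observation
/// to the transcript, and repeat until the model finishes or the iteration
/// budget runs out.
fn run_agent(
    tools: &[Box<dyn Tool>],
    max_iterations: usize,
    mut model: impl FnMut(&[String]) -> Step,
) -> Outcome {
    let mut transcript: Vec<String> = Vec::new();

    for iteration in 1..=max_iterations {
        match model(&transcript) {
            Step::Finish { review } => {
                return Outcome::Finished { review, iterations: iteration };
            }
            Step::CallTool { tool, input } => {
                // Tool failures and unknown tools become observations rather
                // than aborting, so the model can read them and try again.
                let observation = match tools.iter().find(|t| t.name() == tool.as_str()) {
                    Some(t) => t.execute(&input).unwrap_or_else(|e| format!("error: {e}")),
                    None => format!("error: unknown tool `{tool}`"),
                };
                transcript.push(format!("{tool}({input}) -> {observation}"));
            }
        }
    }
    Outcome::IterationLimit
}

/// Canned tool used only to exercise the loop.
struct CannedTool {
    name: &'static str,
    reply: &'static str,
}

impl Tool for CannedTool {
    fn name(&self) -> &str {
        self.name
    }
    fn execute(&self, _input: &str) -> Result<String, String> {
        Ok(self.reply.to_string())
    }
}
```

Returning `IterationLimit` as a distinct outcome, instead of a partial review, would let the caller decide whether to fall back to the legacy pipeline.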

**Tools to implement:**
- `search_code` — semantic search (ties into RAG pipeline from #22)
- `read_file` — read any file in the repo with line range
- `search_symbols` — query the symbol graph for callers/callees/implementors
- `git_log` — recent history for a file/function
- `git_blame` — who last changed a line and why
- `run_grep` — regex search across the codebase
- `check_tests` — find related test files for changed code
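
As an illustration of how small the core of one tool can be, the line-range part of `read_file` might look like the sketch below. The function name `read_lines`, the 1-indexed inclusive range, and the numbered output format are assumptions made for this example; file I/O and the `AgentTool` wrapper are left out.

```rust
/// Return lines `start..=end` (1-indexed, inclusive) of `content`, each
/// prefixed with its line number. An `end` past the last line is clamped;
/// a zero or inverted range, or a `start` past the end of the file, yields
/// an error message the model can read and correct.
fn read_lines(content: &str, start: usize, end: usize) -> Result<String, String> {
    if start == 0 || start > end {
        return Err(format!(
            "invalid range {start}-{end}: lines are 1-indexed and start must be <= end"
        ));
    }
    let out: Vec<String> = content
        .lines()
        .enumerate()
        .skip(start - 1)
        .take(end - start + 1)
        .map(|(i, line)| format!("{}: {}", i + 1, line))
        .collect();
    if out.is_empty() {
        return Err(format!("start line {start} is past the end of the file"));
    }
    Ok(out.join("\n"))
}
```

Prefixing each line with its number means the model can cite exact locations in review comments without counting lines itself.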

### Phase 2: Multi-Agent Orchestration
Split into specialized agents (ties into #21):
- Security Agent — dedicated security-focused review with security tools
- Logic Agent — correctness review with codebase exploration
- Style Agent — convention enforcement with learned rules
- Orchestrator — dispatches to agents, merges results, deduplicates
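
One possible shape for the orchestrator's merge-and-deduplicate step is sketched below. The `Finding` struct, the numeric severity scale, and the rule "same file and line means duplicate, keep the more severe one" are all assumptions chosen to keep the example short.

```rust
use std::collections::HashMap;

/// A single review comment produced by one agent (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    agent: &'static str,
    file: String,
    line: u32,
    message: String,
    severity: u8, // higher = more severe (invented scale)
}

/// Merge findings from several agents. Two findings count as duplicates when
/// they point at the same file and line; the more severe one is kept, and on
/// a tie the earlier one wins. Output is sorted by file, then line, so the
/// result is deterministic.
fn merge_findings(per_agent: Vec<Vec<Finding>>) -> Vec<Finding> {
    let mut best: HashMap<(String, u32), Finding> = HashMap::new();

    for finding in per_agent.into_iter().flatten() {
        let key = (finding.file.clone(), finding.line);
        let keep = best
            .get(&key)
            .map_or(true, |existing| finding.severity > existing.severity);
        if keep {
            best.insert(key, finding);
        }
    }

    let mut merged: Vec<Finding> = best.into_values().collect();
    merged.sort_by(|a, b| a.file.cmp(&b.file).then(a.line.cmp(&b.line)));
    merged
}
```

Keying on file and line is crude: two agents can raise genuinely different issues on the same line, so a real orchestrator would probably also compare message content before discarding one.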

### Phase 3: Sandboxing
- Run the agent in a restricted environment (seccomp/landlock on Linux, sandbox-exec on macOS)
- Limit file access to the repository directory
- No network access beyond LLM API calls
- Timeout per agent iteration
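
Independently of OS-level sandboxing, tools such as `read_file` could reject paths that leave the repository before touching the filesystem. The sketch below is one way to do that, with a hypothetical helper name `confine_to_repo`. It is purely lexical, so it does not defend against symlinks that point outside the repository; that gap is what the kernel-level restrictions above would cover.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `requested` against `repo_root` and reject anything that would
/// land outside the root. Purely lexical: it never touches the filesystem,
/// so symlinks are NOT resolved or checked.
fn confine_to_repo(repo_root: &Path, requested: &str) -> Result<PathBuf, String> {
    let mut resolved = PathBuf::new();

    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {}
            Component::ParentDir => {
                // `pop` returns false when there is nothing left to remove,
                // which means the path tried to climb above the root.
                if !resolved.pop() {
                    return Err(format!("path escapes repository: {requested}"));
                }
            }
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("absolute paths are not allowed: {requested}"));
            }
        }
    }
    Ok(repo_root.join(resolved))
}
```

Returning the error as text fits the tool contract: the model sees why the request was refused and can retry with a repository-relative path.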

### Configuration
```yaml
agent:
  enabled: true  # false = use legacy pipeline
  max_iterations: 10
  tools:
    - search_code
    - read_file
    - search_symbols
    - git_log
    - git_blame
    - run_grep
  sandbox: true
```
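
On the Rust side, the configuration above could map to a struct with a validation pass that catches misspelled tool names before a review starts. This is a sketch only: `AgentConfig`, `KNOWN_TOOLS`, and `validate` are hypothetical names, and YAML parsing is omitted.

```rust
/// Tool names listed under "Tools to implement" in Phase 1.
const KNOWN_TOOLS: [&str; 7] = [
    "search_code",
    "read_file",
    "search_symbols",
    "git_log",
    "git_blame",
    "run_grep",
    "check_tests",
];

/// In-memory form of the `agent:` section (hypothetical struct).
#[derive(Debug, Clone, PartialEq)]
struct AgentConfig {
    enabled: bool, // false = use legacy pipeline
    max_iterations: usize,
    tools: Vec<String>,
    sandbox: bool,
}

impl Default for AgentConfig {
    /// Mirrors the values in the YAML example.
    fn default() -> Self {
        let tools = ["search_code", "read_file", "search_symbols", "git_log", "git_blame", "run_grep"];
        AgentConfig {
            enabled: true,
            max_iterations: 10,
            tools: tools.iter().map(|t| t.to_string()).collect(),
            sandbox: true,
        }
    }
}

impl AgentConfig {
    /// Collect every problem instead of stopping at the first one, so a user
    /// can fix the whole file in one pass.
    fn validate(&self) -> Result<(), Vec<String>> {
        let mut errors = Vec::new();
        if self.enabled && self.max_iterations == 0 {
            errors.push("max_iterations must be at least 1 when the agent is enabled".to_string());
        }
        for tool in &self.tools {
            if !KNOWN_TOOLS.contains(&tool.as_str()) {
                errors.push(format!("unknown tool: {tool}"));
            }
        }
        if errors.is_empty() { Ok(()) } else { Err(errors) }
    }
}
```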

## Relationship to Other Issues

- Depends on: #22 (RAG pipeline) for the `search_code` tool
- Enhances: #10 (deep codebase graph context) — the agent naturally queries the symbol graph
- Foundation for: #21 (multi-agent architecture)

## Priority

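**Critical: architectural shift.** Both Greptile and CodeRabbit have moved from a fixed pipeline to tool-using agents, and the market is heading the same way. Moving DiffScope from pipeline to agent is the single biggest architectural decision in this set of issues.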
