Description
Problem
DiffScope uses a fixed pipeline: parse diff → gather context → build prompt → single LLM call → parse output. Greptile's v3/v4 architecture replaced this with an agentic loop where the reviewer can iteratively search the codebase, read files, check git history, and challenge its own hypotheses. This is why they catch cross-file bugs that pipeline approaches miss.
How Greptile Does It
The v3/v4 agent runs in a loop with a high iteration limit and has access to these tools:
- Codebase search — semantic search against pgvector embeddings + graph traversal
- File read — read any file in the repo
- Git history — inspect past commits to understand design decisions
- Learned rules — query per-team rules from past PR feedback
- External context — Jira tickets, Notion docs, project rule files
The flow:
- Agent receives the PR diff
- Performs initial analysis of changed code
- Searches the codebase graph for related implementations, callers, and contracts
- Recursively follows function call chains (multi-hop): finds calculateInvoiceTotal() changed, discovers generateMonthlyStatement() calls it, checks downstream consumers
- Retrieves git history for relevant files
- Challenges its own hypotheses by searching for counter-evidence before raising a comment
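The multi-hop step in the flow above can be pictured as a bounded breadth-first walk over a caller graph. The following is only a minimal sketch under stated assumptions: `CallerGraph`, `transitive_callers`, and the `max_hops` parameter are hypothetical names introduced for illustration, and nothing here reflects how Greptile actually stores or traverses its graph.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Maps a function name to the functions that call it directly.
/// Hypothetical stand-in for a real symbol graph.
type CallerGraph = HashMap<String, Vec<String>>;

/// Breadth-first walk from a changed function to its transitive callers,
/// stopping after `max_hops` levels. Returns (caller, hop distance) pairs
/// in discovery order.
fn transitive_callers(
    graph: &CallerGraph,
    changed: &str,
    max_hops: usize,
) -> Vec<(String, usize)> {
    let mut seen: HashSet<String> = HashSet::new();
    seen.insert(changed.to_string());

    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    queue.push_back((changed.to_string(), 0));

    let mut found = Vec::new();
    while let Some((name, depth)) = queue.pop_front() {
        if depth == max_hops {
            continue; // hop budget exhausted on this branch
        }
        for caller in graph.get(&name).into_iter().flatten() {
            // `insert` returns false for already-visited nodes, which
            // also guards against cycles in the call graph.
            if seen.insert(caller.clone()) {
                found.push((caller.clone(), depth + 1));
                queue.push_back((caller.clone(), depth + 1));
            }
        }
    }
    found
}
```

With a graph in which generateMonthlyStatement calls calculateInvoiceTotal, a call such as `transitive_callers(&graph, "calculateInvoiceTotal", 3)` would surface generateMonthlyStatement at hop 1 and its own callers at hop 2. The hop limit keeps the walk cheap on large repositories.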
Results: v4 achieved a 74% increase in addressed comments per PR (0.92 → 1.60) and raised the comment acceptance rate from 30% to 43%.
Infrastructure: The agent worker runs in a rootless Podman sandbox with kernel-level isolation (mount namespaces + pivot_root), preventing prompt injection from accessing host resources.
How CodeRabbit Does It
Five specialized agents running in parallel, each with tool access:
- Review Agent: reads files, queries AST patterns, runs linters
- Verification Agent: validates review output against actual code
- Chat Agent: handles conversations
- Pre-Merge Checks Agent: quality gates
- Finishing Touches Agent: runs with Claude Agent SDK (Read, Write, Edit, Glob, Grep, Bash tools)
Proposed Solution
Phase 1: Single Agent with Tools
Replace the single LLM call in the review pipeline with a tool-using agent loop:
use async_trait::async_trait;

struct ReviewAgent {
    model: ModelConfig,
    tools: Vec<Box<dyn AgentTool>>,
    max_iterations: usize, // default: 10
}

// `Result` is assumed to be a one-parameter alias (for example anyhow::Result).
#[async_trait]
trait AgentTool {
    // Unique tool name the model uses to request a call.
    fn name(&self) -> &str;
    // Human-readable description included in the prompt.
    fn description(&self) -> &str;
    // JSON Schema describing the accepted parameters.
    fn parameters_schema(&self) -> serde_json::Value;
    // Runs the tool and returns its output as text for the model.
    async fn execute(&self, params: serde_json::Value) -> Result<String>;
}

Tools to implement:
- search_code: semantic search (ties into the RAG pipeline from #22)
- read_file: read any file in the repo, with an optional line range
- search_symbols: query the symbol graph for callers/callees/implementors
- git_log: recent history for a file/function
- git_blame: who last changed a line and why
- run_grep: regex search across the codebase
- check_tests: find related test files for changed code
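To make the Phase 1 loop concrete, here is a minimal sketch of how the agent might alternate between model turns and tool calls until it finishes or exhausts `max_iterations`. It is deliberately simplified and rests on assumptions: it is synchronous and uses plain strings instead of `serde_json::Value` so that it needs no external crates, and `Tool`, `Step`, `Model`, `run_agent`, `ScriptedModel`, and `FakeReadFile` are hypothetical names, not part of DiffScope.

```rust
/// Synchronous, string-based simplification of the `AgentTool` trait.
trait Tool {
    fn name(&self) -> &str;
    fn execute(&self, params: &str) -> Result<String, String>;
}

/// What the model asks for on each turn (hypothetical shape).
enum Step {
    CallTool { tool: String, params: String },
    Finish { review: String },
}

/// Stand-in for the LLM client: sees the transcript so far and picks a step.
trait Model {
    fn next_step(&self, transcript: &[String]) -> Step;
}

/// Runs the loop. Returns the review, or `None` if the iteration budget
/// runs out before the model finishes.
fn run_agent(
    model: &dyn Model,
    tools: &[Box<dyn Tool>],
    diff: &str,
    max_iterations: usize,
) -> Option<String> {
    let mut transcript = vec![format!("diff:\n{diff}")];
    for _ in 0..max_iterations {
        match model.next_step(&transcript) {
            Step::Finish { review } => return Some(review),
            Step::CallTool { tool, params } => {
                // Tool failures and unknown tools are fed back to the model
                // as observations instead of aborting the review.
                let observation = match tools.iter().find(|t| t.name() == tool.as_str()) {
                    Some(t) => t.execute(&params).unwrap_or_else(|e| format!("error: {e}")),
                    None => format!("error: unknown tool `{tool}`"),
                };
                transcript.push(format!("{tool}({params}) -> {observation}"));
            }
        }
    }
    None
}

/// Test double: requests one file read, then finishes.
struct ScriptedModel;

impl Model for ScriptedModel {
    fn next_step(&self, transcript: &[String]) -> Step {
        if transcript.len() == 1 {
            Step::CallTool { tool: "read_file".into(), params: "src/billing.rs".into() }
        } else {
            let calls = transcript.len() - 1;
            Step::Finish { review: format!("reviewed after {calls} tool call(s)") }
        }
    }
}

/// Test double for the `read_file` tool; returns a placeholder string.
struct FakeReadFile;

impl Tool for FakeReadFile {
    fn name(&self) -> &str {
        "read_file"
    }
    fn execute(&self, params: &str) -> Result<String, String> {
        Ok(format!("<contents of {params}>"))
    }
}
```

Returning `None` when the budget is exhausted leaves the fallback decision to the caller, for example running the legacy pipeline instead. A real implementation would presumably keep the async trait and structured parameters from the proposal.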
Phase 2: Multi-Agent Orchestration
Split into specialized agents (ties into #21):
- Security Agent — dedicated security-focused review with security tools
- Logic Agent — correctness review with codebase exploration
- Style Agent — convention enforcement with learned rules
- Orchestrator — dispatches to agents, merges results, deduplicates
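The orchestrator's merge and deduplicate step could look roughly like the sketch below. It assumes that two findings are duplicates when they point at the same file and line, and that the more severe one should win. The `Finding` struct, its `severity` scale, and `merge_findings` are hypothetical and only illustrate one possible policy.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// One review comment produced by a specialized agent (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    agent: &'static str,
    file: String,
    line: u32,
    severity: u8, // higher = more severe; assumed scale
    message: String,
}

/// Merges findings from all agents, keeping one finding per (file, line):
/// the most severe one, with ties going to the first seen. The result is
/// sorted by file, then line, for stable output.
fn merge_findings(per_agent: Vec<Vec<Finding>>) -> Vec<Finding> {
    let mut best: HashMap<(String, u32), Finding> = HashMap::new();
    for finding in per_agent.into_iter().flatten() {
        match best.entry((finding.file.clone(), finding.line)) {
            Entry::Occupied(mut slot) => {
                if finding.severity > slot.get().severity {
                    slot.insert(finding);
                }
            }
            Entry::Vacant(slot) => {
                slot.insert(finding);
            }
        }
    }
    let mut merged: Vec<Finding> = best.into_values().collect();
    merged.sort_by(|a, b| (&a.file, a.line).cmp(&(&b.file, b.line)));
    merged
}
```

Keying on (file, line) is a crude notion of "duplicate". Two agents can describe the same problem on adjacent lines, so a production orchestrator would likely need fuzzier matching.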
Phase 3: Sandboxing
- Run the agent in a restricted environment (seccomp/landlock on Linux, sandbox-exec on macOS)
- Limit file access to the repository directory
- No network access beyond LLM API calls
- Timeout per agent iteration
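As a complement to OS-level sandboxing, tools such as read_file can also refuse paths that escape the repository before touching the filesystem. The sketch below is a purely lexical check and makes no claim to be sufficient on its own: it does not follow symlinks, so it cannot replace seccomp/landlock restrictions. `resolve_in_repo` is a hypothetical helper name.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `requested` against `repo_root` lexically and rejects anything
/// that would land outside the root. Returns `None` for absolute paths and
/// for `..` sequences that climb above the repository directory.
fn resolve_in_repo(repo_root: &Path, requested: &str) -> Option<PathBuf> {
    let mut resolved = repo_root.to_path_buf();
    let mut depth = 0usize; // components pushed below repo_root so far

    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return None; // would escape the repository
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths (and Windows drive prefixes) are never allowed.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```

A tool would call this on every model-supplied path and return an error observation on `None`, so a prompt-injected request like `../../etc/passwd` is refused before any file access happens.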
Configuration
agent:
  enabled: true  # false = use legacy pipeline
  max_iterations: 10
  tools:
    - search_code
    - read_file
    - search_symbols
    - git_log
    - git_blame
    - run_grep
  sandbox: true
Relationship to Other Issues
- Depends on: Embedding-based RAG pipeline with function-level chunking #22 (RAG pipeline) for the search_code tool
- Enhances: Deep codebase graph context in review prompts #10 (deep codebase graph context); the agent naturally queries the symbol graph
- Foundation for: Multi-agent architecture: review + fix + test agents #21 (multi-agent architecture)
Priority
Critical — architectural shift. This is the direction the entire market is moving. Pipeline → agent is the single biggest architectural decision.