Agentic review loop with tool use #24

@haasonsaas

Problem

DiffScope uses a fixed pipeline: parse diff → gather context → build prompt → single LLM call → parse output. Greptile's v3/v4 architecture replaced this kind of pipeline with an agentic loop in which the reviewer iteratively searches the codebase, reads files, checks git history, and challenges its own hypotheses. That iteration is what lets it catch cross-file bugs that a single-pass pipeline misses.

How Greptile Does It

The v3/v4 agent operates in a loop with a high iteration limit, with these tools:

  1. Codebase search — semantic search against pgvector embeddings + graph traversal
  2. File read — read any file in the repo
  3. Git history — inspect past commits to understand design decisions
  4. Learned rules — query per-team rules from past PR feedback
  5. External context — Jira tickets, Notion docs, project rule files

The flow:

  1. Agent receives the PR diff
  2. Performs initial analysis of changed code
  3. Searches the codebase graph for related implementations, callers, and contracts
  4. Recursively follows function call chains (multi-hop): for example, it finds that calculateInvoiceTotal() changed, discovers that generateMonthlyStatement() calls it, then checks that function's downstream consumers
  5. Retrieves git history for relevant files
  6. Challenges its own hypotheses by searching for counter-evidence before raising a comment
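
The multi-hop step (4) can be illustrated as a breadth-first walk over a reverse call graph. This is only a minimal sketch, not Greptile's implementation: the `CallerGraph` shape, the `transitive_callers` function, and the `render_pdf` / `send_statement_email` consumers are hypothetical names introduced for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Reverse call graph: function name -> functions that call it directly.
/// (Hypothetical shape; a real index would come from the symbol graph.)
type CallerGraph = HashMap<&'static str, Vec<&'static str>>;

/// Breadth-first walk from a changed function to its transitive callers,
/// stopping after `max_hops` levels. Returns (caller, hop distance) pairs
/// in discovery order.
fn transitive_callers(
    graph: &CallerGraph,
    changed: &'static str,
    max_hops: usize,
) -> Vec<(&'static str, usize)> {
    let mut seen: HashSet<&'static str> = HashSet::from([changed]);
    let mut queue: VecDeque<(&'static str, usize)> = VecDeque::from([(changed, 0)]);
    let mut found = Vec::new();

    while let Some((func, hops)) = queue.pop_front() {
        if hops == max_hops {
            continue; // hop budget exhausted on this path
        }
        if let Some(callers) = graph.get(func) {
            for &caller in callers {
                // `insert` returns false for already-visited nodes, which
                // also protects against cycles (mutual recursion).
                if seen.insert(caller) {
                    found.push((caller, hops + 1));
                    queue.push_back((caller, hops + 1));
                }
            }
        }
    }
    found
}

/// Example graph using the function names from the flow above; the two
/// downstream consumers are made up for illustration.
fn example_graph() -> CallerGraph {
    HashMap::from([
        ("calculateInvoiceTotal", vec!["generateMonthlyStatement"]),
        ("generateMonthlyStatement", vec!["render_pdf", "send_statement_email"]),
        ("render_pdf", vec!["generateMonthlyStatement"]), // cycle, on purpose
    ])
}
```

Calling `transitive_callers(&example_graph(), "calculateInvoiceTotal", 3)` visits the statement generator at hop 1 and both consumers at hop 2. The hop limit plays the same role as the iteration limit on the agent: it bounds how far exploration can spread from the diff.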

Results: v4 achieved a 74% increase in addressed comments per PR (0.92 → 1.60) and raised the comment acceptance rate from 30% to 43%.

Infrastructure: The agent worker runs in a rootless Podman sandbox with kernel-level isolation (mount namespaces + pivot_root), so that an agent hijacked by prompt injection cannot reach host resources.

How CodeRabbit Does It

Five specialized agents running in parallel, each with tool access:

  • Review Agent: reads files, queries AST patterns, runs linters
  • Verification Agent: validates review output against actual code
  • Chat Agent: handles conversations
  • Pre-Merge Checks Agent: quality gates
  • Finishing Touches Agent: runs with Claude Agent SDK (Read, Write, Edit, Glob, Grep, Bash tools)

Proposed Solution

Phase 1: Single Agent with Tools

Replace the single LLM call in the review pipeline with a tool-using agent loop:

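use anyhow::Result; // assumption: any single-parameter Result alias works here
use async_trait::async_trait;

/// Drives the review loop: calls the model, runs the tools it requests, repeats.
struct ReviewAgent {
    model: ModelConfig,
    tools: Vec<Box<dyn AgentTool>>,
    max_iterations: usize,  // default: 10
}

/// A capability the agent can invoke during a review.
/// `Send + Sync` is an assumption so boxed tools can move between async tasks.
#[async_trait]
trait AgentTool: Send + Sync {
    /// Unique name the model uses to select the tool.
    fn name(&self) -> &str;
    /// Natural-language description shown to the model.
    fn description(&self) -> &str;
    /// JSON Schema describing the accepted parameters.
    fn parameters_schema(&self) -> serde_json::Value;
    /// Runs the tool and returns its output as text for the model.
    async fn execute(&self, params: serde_json::Value) -> Result<String>;
}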
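
To make the loop itself concrete, here is a minimal sketch of how such an agent might iterate. It is deliberately synchronous and dependency-free, so it uses a simplified `Tool` trait in place of the async `AgentTool` above. The `Step` enum, the `Model` trait, `run_agent`, and the `Upper` / `Scripted` test doubles are all hypothetical and stand in for the real LLM protocol, which this issue does not specify.

```rust
/// Synchronous stand-in for the `AgentTool` trait above (assumption: the
/// real trait is async and takes JSON parameters).
trait Tool {
    fn name(&self) -> &str;
    fn execute(&self, input: &str) -> Result<String, String>;
}

/// What the model asks for on each turn (hypothetical protocol).
enum Step {
    CallTool { tool: String, input: String },
    Finish { review: String },
}

/// Stand-in for the LLM: given the transcript so far, pick the next step.
trait Model {
    fn next_step(&mut self, transcript: &[String]) -> Step;
}

/// Runs the loop until the model finishes or `max_iterations` is reached.
/// Returns the final review, or `None` if the budget ran out.
fn run_agent(
    model: &mut dyn Model,
    tools: &[Box<dyn Tool>],
    diff: &str,
    max_iterations: usize,
) -> Option<String> {
    let mut transcript = vec![format!("diff:\n{}", diff)];

    for _ in 0..max_iterations {
        match model.next_step(&transcript) {
            Step::Finish { review } => return Some(review),
            Step::CallTool { tool, input } => {
                // Unknown tools and tool errors are fed back to the model
                // as observations instead of aborting the review.
                let observation = match tools.iter().find(|t| t.name() == tool.as_str()) {
                    Some(t) => t
                        .execute(&input)
                        .unwrap_or_else(|e| format!("error: {}", e)),
                    None => format!("error: unknown tool `{}`", tool),
                };
                transcript.push(format!("{}({}) -> {}", tool, input, observation));
            }
        }
    }
    None
}

/// Toy tool that echoes its input in upper case.
struct Upper;
impl Tool for Upper {
    fn name(&self) -> &str {
        "upper"
    }
    fn execute(&self, input: &str) -> Result<String, String> {
        Ok(input.to_uppercase())
    }
}

/// Scripted model: calls `upper` once, then finishes with the last
/// transcript entry as its "review".
struct Scripted;
impl Model for Scripted {
    fn next_step(&mut self, transcript: &[String]) -> Step {
        if transcript.len() == 1 {
            Step::CallTool { tool: "upper".into(), input: "abc".into() }
        } else {
            Step::Finish { review: transcript.last().cloned().unwrap_or_default() }
        }
    }
}
```

Returning `None` when the budget runs out leaves the caller free to choose a fallback, for example running the legacy pipeline or posting whatever partial findings exist.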

Tools to implement:

  • search_code — semantic search (ties into the embedding-based RAG pipeline with function-level chunking, #22)
  • read_file — read any file in the repo with line range
  • search_symbols — query the symbol graph for callers/callees/implementors
  • git_log — recent history for a file/function
  • git_blame — who last changed a line and why
  • run_grep — regex search across the codebase
  • check_tests — find related test files for changed code
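
As an illustration of the simplest tool in this list, read_file with a line range could look roughly like the following. This is a sketch under assumptions: the 1-based inclusive range, the line-number prefix, and the `slice_lines` helper are choices made here, not part of the proposal.

```rust
use std::path::Path;

/// Returns lines `start..=end` (1-based, inclusive) of `text`, each prefixed
/// with its line number so the model can cite exact locations.
/// An `end` past the last line is clamped; a zero or inverted range, or a
/// `start` past the end of the text, is an error.
fn slice_lines(text: &str, start: usize, end: usize) -> Result<String, String> {
    if start == 0 || start > end {
        return Err(format!("invalid line range {}-{}", start, end));
    }
    let selected: Vec<String> = text
        .lines()
        .enumerate()
        .skip(start - 1)
        .take(end - start + 1)
        .map(|(i, line)| format!("{:>4}: {}", i + 1, line))
        .collect();
    if selected.is_empty() {
        return Err(format!("line {} is past the end of the file", start));
    }
    Ok(selected.join("\n"))
}

/// File-backed wrapper: reads the whole file, then slices it.
fn read_file(path: &Path, start: usize, end: usize) -> Result<String, String> {
    let text = std::fs::read_to_string(path).map_err(|e| e.to_string())?;
    slice_lines(&text, start, end)
}
```

Returning errors as plain strings keeps them easy to pass back to the model as an observation, so a bad range becomes something the agent can correct on its next iteration.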

Phase 2: Multi-Agent Orchestration

Split into specialized agents (ties into #21):

  • Security Agent — dedicated security-focused review with security tools
  • Logic Agent — correctness review with codebase exploration
  • Style Agent — convention enforcement with learned rules
  • Orchestrator — dispatches to agents, merges results, deduplicates
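
The orchestrator's merge and deduplicate step might be sketched as below. The `Finding` struct, its fields, and the rule "same file and line means duplicate, highest confidence wins" are assumptions made for illustration; the issue does not define how duplicates are detected.

```rust
use std::collections::HashMap;

/// One review comment produced by an agent (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    agent: &'static str,
    file: String,
    line: u32,
    message: String,
    confidence: f32, // 0.0..=1.0, as reported by the agent
}

/// Merges findings from all agents. Two findings count as duplicates when
/// they target the same file and line; the higher-confidence one is kept
/// (the first one seen wins a tie). Output is sorted by (file, line).
fn merge_findings(per_agent: Vec<Vec<Finding>>) -> Vec<Finding> {
    let mut best: HashMap<(String, u32), Finding> = HashMap::new();

    for finding in per_agent.into_iter().flatten() {
        let key = (finding.file.clone(), finding.line);
        let is_better = best
            .get(&key)
            .map_or(true, |existing| finding.confidence > existing.confidence);
        if is_better {
            best.insert(key, finding);
        }
    }

    let mut merged: Vec<Finding> = best.into_values().collect();
    merged.sort_by(|a, b| a.file.cmp(&b.file).then(a.line.cmp(&b.line)));
    merged
}
```

Keying on file and line alone is crude: two agents can raise genuinely different issues on the same line. A real orchestrator would likely also compare message similarity or issue category before dropping one.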

Phase 3: Sandboxing

  • Run the agent in a restricted environment (seccomp/landlock on Linux, sandbox-exec on macOS)
  • Limit file access to the repository directory
  • No network access beyond LLM API calls
  • Timeout per agent iteration
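
Independently of OS-level sandboxing, file tools can refuse paths outside the repository before touching the disk. The following is a minimal sketch of that check; `resolve_in_repo` is a hypothetical helper, and because it is purely lexical it does not catch symlinks that point outside the repository, so it complements the sandbox and does not replace it.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `requested` against `repo_root` and rejects anything that would
/// land outside the root. Purely lexical: it never touches the filesystem,
/// so symlinks that escape the repository are NOT detected here.
fn resolve_in_repo(repo_root: &Path, requested: &str) -> Result<PathBuf, String> {
    let mut resolved = repo_root.to_path_buf();
    let mut depth = 0usize; // number of components below repo_root

    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return Err(format!("path escapes repository: {}", requested));
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths (and Windows drive prefixes) are never allowed.
            Component::RootDir | Component::Prefix(_) => {
                return Err(format!("absolute paths are not allowed: {}", requested));
            }
        }
    }
    Ok(resolved)
}
```

Tracking `depth` explicitly, instead of comparing prefixes after the fact, means a request such as `src/../../etc/passwd` is rejected at the exact component where it would step above the root.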

Configuration

agent:
  enabled: true  # false = use legacy pipeline
  max_iterations: 10
  tools:
    - search_code
    - read_file
    - search_symbols
    - git_log
    - git_blame
    - run_grep
  sandbox: true
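
On the Rust side, this block could map onto a struct along the following lines. This is a sketch only: the `AgentConfig` name, treating the values shown above as defaults, and the validation rules are assumptions, and the YAML parsing itself (for example via serde) is left out to keep the example dependency-free.

```rust
/// Tool names from the "Tools to implement" list above.
const KNOWN_TOOLS: [&str; 7] = [
    "search_code",
    "read_file",
    "search_symbols",
    "git_log",
    "git_blame",
    "run_grep",
    "check_tests",
];

/// In-memory form of the `agent:` config block (hypothetical struct).
#[derive(Debug, Clone, PartialEq)]
struct AgentConfig {
    enabled: bool, // false = use legacy pipeline
    max_iterations: usize,
    tools: Vec<String>,
    sandbox: bool,
}

impl Default for AgentConfig {
    /// Mirrors the example YAML; treating these values as the defaults is
    /// an assumption, except `max_iterations: 10`, which the issue states.
    fn default() -> Self {
        let tools = [
            "search_code",
            "read_file",
            "search_symbols",
            "git_log",
            "git_blame",
            "run_grep",
        ];
        AgentConfig {
            enabled: true,
            max_iterations: 10,
            tools: tools.iter().map(|t| t.to_string()).collect(),
            sandbox: true,
        }
    }
}

impl AgentConfig {
    /// Rejects configurations the agent loop could not run with.
    fn validate(&self) -> Result<(), String> {
        if !self.enabled {
            return Ok(()); // legacy pipeline ignores the agent settings
        }
        if self.max_iterations == 0 {
            return Err("max_iterations must be at least 1".to_string());
        }
        if let Some(unknown) = self
            .tools
            .iter()
            .find(|t| !KNOWN_TOOLS.contains(&t.as_str()))
        {
            return Err(format!("unknown tool: {}", unknown));
        }
        Ok(())
    }
}
```

Validating tool names at load time surfaces a typo in the config file as a startup error, before it can turn into an "unknown tool" observation in the middle of a review.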

Relationship to Other Issues

Priority

Critical — architectural shift. Competing reviewers such as Greptile and CodeRabbit already use tool-using agents, and moving DiffScope from pipeline to agent is the single biggest architectural decision it faces.
