Problem
Large PRs exceed LLM context windows. DiffScope has max_diff_chars and file_change_limit but lacks the progressive compression strategies that Qodo and CodeRabbit use. Currently, diffs that exceed the limit are likely truncated or the review is skipped entirely.
How Qodo Does It (4-stage progressive degradation)
Stage 1: Full Diff
- Generate diff with dynamic context for every file
- Count tokens via tokenizer (tiktoken for OpenAI, Anthropic API for Claude)
- If it fits within max_model_tokens - 1500 → done, use as-is
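A minimal sketch of that gate, where count_tokens stands in for whichever tokenizer the model uses and the 1500-token reserve is Qodo's fixed margin:

fn fits_as_is(full_prompt: &str, max_model_tokens: usize) -> bool {
    // Reserve 1500 tokens (Qodo's fixed margin) for the model's response.
    count_tokens(full_prompt) <= max_model_tokens.saturating_sub(1500)
}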
Stage 2: Compressed Diff
- Sort files by token count (largest first)
- Remove deletion-only hunks
- Iterate through files, accumulating until hitting token budget
- Remaining files go to remaining_files_list (see the sketch below)
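A sketch of that pass, assuming a FileDiff type with per-hunk access (FileDiff, is_deletion_only, and text are illustrative names, not DiffScope's actual API):

// Drop deletion-only hunks, sort files largest-first, then pack files
// into the prompt until the token budget is exhausted.
fn compress_diffs(mut files: Vec<FileDiff>, budget: usize) -> (Vec<FileDiff>, Vec<FileDiff>) {
    for f in &mut files {
        f.hunks.retain(|h| !h.is_deletion_only());
    }
    files.sort_by_key(|f| std::cmp::Reverse(count_tokens(&f.text())));
    let (mut kept, mut remaining, mut used) = (Vec::new(), Vec::new(), 0);
    for f in files {
        let cost = count_tokens(&f.text());
        if used + cost <= budget {
            used += cost;
            kept.push(f);
        } else {
            remaining.push(f); // ends up in remaining_files_list
        }
    }
    (kept, remaining)
}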
Stage 3: Per-File Policy
- clip: Truncate the file diff at a clean line boundary, append "...(truncated)" (sketched below)
- skip: Drop the file entirely with a warning
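A sketch of the clip policy; the 4-chars-per-token figure is the rough estimate from the Token Counting section, not an exact count:

// Truncate a file diff at a clean line boundary once the budget is hit.
fn clip(diff_text: &str, token_budget: usize) -> String {
    let char_budget = token_budget.saturating_mul(4); // ~4 chars per token
    let mut out = String::new();
    for line in diff_text.lines() {
        if out.len() + line.len() + 1 > char_budget {
            out.push_str("...(truncated)");
            return out;
        }
        out.push_str(line);
        out.push('\n');
    }
    out // whole diff fit; no marker needed
}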
Stage 4: Multi-Call Splitting
- Split remaining files into multiple LLM calls (up to max_ai_calls, default 5)
- Each call gets a subset that fits the token budget
- Run calls in parallel with asyncio.gather()
- Aggregate + deduplicate results across calls (dedup sketched below)
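The dedup step could fingerprint comments across calls. A sketch, assuming Comment carries file, line, and body fields (an assumption about the eventual type):

use std::collections::HashSet;

// Keep the first comment per (file, line, normalized body) fingerprint.
fn deduplicate(comments: Vec<Comment>) -> Vec<Comment> {
    let mut seen = HashSet::new();
    comments
        .into_iter()
        .filter(|c| seen.insert((c.file.clone(), c.line, c.body.trim().to_lowercase())))
        .collect()
}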
Remaining Files Metadata
Files that didn't fit are listed as supplementary metadata:
"Deleted files:\n"+ filenames"Additional modified files (not reviewed)..."with edit type- This list is itself token-budget-aware
How CodeRabbit Does It
- Files sorted by size, largest processed first
- Per-file token validation — files exceeding budget skipped
- Batch summarization (groups of 10, then summarize-the-summaries)
- Hunk-level splitting via regex (sketched after this list)
- Incremental review (only delta since last reviewed commit)
- Triage classification skips cosmetic files
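For the hunk-level splitting, a sketch using the regex crate to cut a unified diff on its @@ headers (CodeRabbit's exact pattern isn't documented here; this is the standard hunk-header form):

use regex::Regex;

// Split one file's unified diff into hunks on "@@ -a,b +c,d @@" headers.
fn split_hunks(file_diff: &str) -> Vec<&str> {
    let re = Regex::new(r"(?m)^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@").unwrap();
    let starts: Vec<usize> = re.find_iter(file_diff).map(|m| m.start()).collect();
    starts
        .iter()
        .enumerate()
        .map(|(i, &s)| {
            let end = starts.get(i + 1).copied().unwrap_or(file_diff.len());
            &file_diff[s..end]
        })
        .collect()
}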
Proposed Solution
Implementation
enum CompressionStrategy {
    Full,       // All context, no compression
    Compressed, // Remove deletion-only hunks, trim context
    Clipped,    // Truncate large files at line boundaries
    MultiCall,  // Split across multiple LLM calls
}

async fn review_with_adaptive_compression(
    diffs: Vec<UnifiedDiff>,
    config: &ReviewConfig,
) -> Vec<Comment> {
    let token_budget =
        config.max_model_tokens - config.response_token_budget - config.safety_margin_tokens;

    // Stage 1: Try the full diff
    let full = build_full_prompt(&diffs, config);
    if count_tokens(&full) <= token_budget {
        return single_review_call(full, config).await;
    }

    // Stage 2: Compress (remove deletion-only hunks, trim context)
    let compressed = compress_diffs(&diffs);
    if count_tokens(&compressed.prompt) <= token_budget {
        // with_skipped_files: helper that attaches a note listing files
        // dropped during compression
        return single_review_call(compressed.prompt, config).await
            .with_skipped_files(compressed.skipped);
    }

    // Stage 3: Clip large files at line boundaries
    let clipped = clip_large_files(&compressed, token_budget);
    if count_tokens(&clipped.prompt) <= token_budget {
        return single_review_call(clipped.prompt, config).await
            .with_skipped_files(clipped.skipped);
    }

    // Stage 4: Multi-call; each batch is a prompt that fits the budget
    let batches = split_into_batches(&diffs, token_budget, config.max_ai_calls);
    let results = futures::future::join_all(
        batches.into_iter().map(|b| single_review_call(b.prompt, config)),
    )
    .await;
    deduplicate(results.into_iter().flatten().collect())
}
Configuration
compression:
strategy: auto # auto, full, compressed, clipped, multi_call
max_ai_calls: 5
large_file_policy: clip # clip or skip
response_token_budget: 4000
safety_margin_tokens: 100
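A sketch of the matching config struct, assuming serde deserialization (struct and field names simply mirror the YAML above):

use serde::Deserialize;

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum StrategyOption {
    Auto,
    Full,
    Compressed,
    Clipped,
    MultiCall,
}

#[derive(Deserialize)]
struct CompressionConfig {
    strategy: StrategyOption,     // auto, full, compressed, clipped, multi_call
    max_ai_calls: usize,          // cap on parallel LLM calls (default 5)
    large_file_policy: String,    // "clip" or "skip"
    response_token_budget: usize, // tokens reserved for the model's response
    safety_margin_tokens: usize,  // extra headroom below the hard limit
}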
Token Counting
- Use tiktoken for OpenAI models
- Use Anthropic's token counting API for Claude
- Fallback: character-based estimation (chars / 4)
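A sketch of the fallback path behind a pluggable counter trait (the trait itself is an assumption; exact counters backed by tiktoken or Anthropic's counting API would implement it):

trait TokenCounter {
    fn count(&self, text: &str) -> usize;
}

// Last-resort estimator: ~4 characters per token, rounded up.
struct CharEstimate;

impl TokenCounter for CharEstimate {
    fn count(&self, text: &str) -> usize {
        (text.len() + 3) / 4
    }
}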
Priority
Medium: production reliability for large PRs. Currently, large PRs likely fail or produce degraded results.