Open
Labels
enhancement: New feature or request
Description
Problem
LLMs frequently produce malformed structured output. Qodo Merge has 9 sequential fallback strategies for parsing YAML output from LLMs — this is a hard-won production lesson. DiffScope parses LLM output via parsing/ modules but may not be as resilient to malformed responses.
How Qodo Does It
From pr_agent/algo/utils.py, function load_yaml() with try_fix_yaml():
- Direct yaml.safe_load(): the happy path
- YAML literal block (|-) conversion
- Pipe character replacement (| → |2)
- Root-level indentation fixes
- Snippet extraction between backticks (```yaml ... ```)
- Curly bracket removal (JSON-like output from LLM)
- Key-range extraction (grab just the relevant YAML section)
- Leading + character removal (diff artifacts bleeding into output)
- Tab-to-space conversion and encoding fallbacks (latin-1, utf-16)
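Several of these repairs are plain text transformations that do not depend on YAML at all, so they should carry over to DiffScope more or less directly. Below is a minimal Rust sketch of three of them (fenced-snippet extraction, leading + removal, tab-to-space conversion) using only the standard library. The function names are hypothetical, not taken from pr_agent or DiffScope, and the behavior is one reading of the list above rather than a port of try_fix_yaml():

```rust
/// Returns the body of the first fenced block whose fence starts with
/// "```<lang>", or None if no complete fenced block is found.
/// (Hypothetical helper; prefix match only, so "json" also matches "json5".)
fn extract_code_block(raw: &str, lang: &str) -> Option<String> {
    let open = format!("```{lang}");
    let after_tag = raw.find(&open)? + open.len();
    // Skip whatever remains of the opening fence line.
    let body_start = after_tag + raw[after_tag..].find('\n')? + 1;
    let body_len = raw[body_start..].find("```")?;
    Some(raw[body_start..body_start + body_len].trim_end().to_string())
}

/// Drops a single leading '+' from every line (diff markers copied into the output).
fn strip_diff_plus(text: &str) -> String {
    text.lines()
        .map(|line| line.strip_prefix('+').unwrap_or(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Replaces tabs with four spaces; YAML does not allow tabs for indentation.
fn tabs_to_spaces(text: &str) -> String {
    text.replace('\t', "    ")
}

/// Applies the three repairs in order. Falls back to the full text when
/// no fenced block is present.
fn clean_llm_text(raw: &str, lang: &str) -> String {
    let body = extract_code_block(raw, lang).unwrap_or_else(|| raw.to_string());
    tabs_to_spaces(&strip_diff_plus(&body))
}
```

Each repair is a separate function so that a fallback chain can apply them one at a time and retry the parse after each step, instead of always applying all of them.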
Proposed Solution
Audit and harden DiffScope's LLM output parsing:
1. Audit Current Parsing
- Review parsing/llm_response.rs and parsing/smart_review_response.rs
- Identify failure modes from production use
- Add error tracking to measure parse failure rates
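One way to make the failure modes measurable is to bucket each unparseable response by its shape before logging it. A rough sketch, assuming JSON output; the FailureMode enum and classify_failure function are hypothetical, and the categories are only a starting guess, not observed DiffScope data:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailureMode {
    Empty,        // nothing but whitespace
    Fenced,       // contains a markdown code fence
    LeadingProse, // does not start with '{' or '['
    Truncated,    // unclosed bracket or string at end of text
    Other,        // balanced but still invalid (e.g. trailing comma)
}

/// Returns true if the text ends inside a string or with unclosed brackets.
/// Brackets inside JSON string literals are ignored.
fn looks_truncated(text: &str) -> bool {
    let (mut depth, mut in_string, mut escaped) = (0i32, false, false);
    for c in text.chars() {
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' | '[' => depth += 1,
            '}' | ']' => depth -= 1,
            _ => {}
        }
    }
    depth > 0 || in_string
}

/// Buckets a response that already failed to parse. Checks run from the
/// cheapest and most specific to the most general.
fn classify_failure(raw: &str) -> FailureMode {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        FailureMode::Empty
    } else if trimmed.contains("```") {
        FailureMode::Fenced
    } else if !trimmed.starts_with(|c: char| c == '{' || c == '[') {
        FailureMode::LeadingProse
    } else if looks_truncated(trimmed) {
        FailureMode::Truncated
    } else {
        FailureMode::Other
    }
}
```

Counting responses per bucket would show which fallback strategies are worth implementing first.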
2. Implement Fallback Chain
    use anyhow::{anyhow, Result};

    /// Tries each parsing strategy in order and returns the first success.
    fn parse_llm_response(raw: &str) -> Result<Vec<Comment>> {
        // Strategy 1: Direct JSON parse
        if let Ok(comments) = serde_json::from_str(raw) { return Ok(comments); }
        // Strategy 2: Extract JSON from markdown code blocks
        if let Some(json_block) = extract_code_block(raw, "json") {
            if let Ok(comments) = serde_json::from_str(&json_block) { return Ok(comments); }
        }
        // Strategy 3: Fix common JSON issues (trailing commas, single quotes)
        let fixed = fix_common_json_issues(raw);
        if let Ok(comments) = serde_json::from_str(&fixed) { return Ok(comments); }
        // Strategy 4: Extract individual comment objects with regex
        if let Ok(comments) = extract_comments_regex(raw) { return Ok(comments); }
        // Strategy 5: Line-by-line structured extraction
        if let Ok(comments) = parse_structured_text(raw) { return Ok(comments); }
        // Strategy 6 (last resort): ask the LLM to reformat its output.
        // Not implemented in this sketch, so fall through to the error.
        Err(anyhow!("Failed to parse LLM output after all strategies"))
    }
3. Track Parse Failures
- Log parse failure rate per strategy
- Surface in metrics/analytics
- Feed back into prompt engineering (if certain models consistently produce bad output)
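A minimal sketch of per-strategy tracking, assuming each strategy can be expressed as a named function from the raw text to Option<T>. ParseStats, Strategy, and run_chain are hypothetical names, not existing DiffScope types, and a real version would export the counters to whatever metrics backend is in use:

```rust
use std::collections::BTreeMap;

/// Counters for the fallback chain (hypothetical type).
#[derive(Debug, Default)]
struct ParseStats {
    /// Strategy name -> (attempts, successes).
    per_strategy: BTreeMap<&'static str, (u64, u64)>,
    /// Responses that no strategy could parse.
    total_failures: u64,
}

impl ParseStats {
    fn record(&mut self, strategy: &'static str, ok: bool) {
        let entry = self.per_strategy.entry(strategy).or_insert((0, 0));
        entry.0 += 1;
        if ok {
            entry.1 += 1;
        }
    }

    /// Fraction of attempts that failed, or None if the strategy never ran.
    fn failure_rate(&self, strategy: &str) -> Option<f64> {
        let (attempts, successes) = *self.per_strategy.get(strategy)?;
        if attempts == 0 {
            return None;
        }
        Some((attempts - successes) as f64 / attempts as f64)
    }
}

/// A named parsing strategy: returns Some(parsed) on success.
type Strategy<T> = (&'static str, fn(&str) -> Option<T>);

/// Runs strategies in order, recording every attempt, and stops at the first success.
fn run_chain<T>(raw: &str, strategies: &[Strategy<T>], stats: &mut ParseStats) -> Option<T> {
    for &(name, strategy) in strategies {
        let result = strategy(raw);
        stats.record(name, result.is_some());
        if result.is_some() {
            return result;
        }
    }
    stats.total_failures += 1;
    None
}
```

Because later strategies only run when earlier ones fail, the attempt count of each strategy also shows how often the chain had to fall that far. A failure rate near 1.0 for the direct-parse strategy on a particular model would suggest changing the prompt, not adding more parser fallbacks.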
Priority
Medium — production reliability. Low effort, high resilience. Every tool in the space has learned this lesson the hard way.