Robust LLM output parsing with fallback strategies #28

## Problem

LLMs frequently produce malformed structured output. Qodo Merge uses **9 sequential fallback strategies** to parse YAML output from LLMs, a hard-won production lesson. DiffScope parses LLM output via the `parsing/` modules but may be less resilient to malformed responses.

## How Qodo Does It

From `pr_agent/algo/utils.py`, function `load_yaml()` with `try_fix_yaml()`:

1. Direct `yaml.safe_load()` — the happy path
2. YAML literal block (`|-`) conversion
3. Pipe character replacement (`|` → `|2`)
4. Root-level indentation fixes
5. Snippet extraction between backticks (` ```yaml ... ``` `)
6. Curly bracket removal (JSON-like output from LLM)
7. Key-range extraction (grab just the relevant YAML section)
8. Leading `+` character removal (diff artifacts bleeding into output)
9. Tab-to-space conversion and encoding fallbacks (latin-1, utf-16)
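To make the idea concrete, here is a minimal Rust sketch of three of the text-level repairs listed above (steps 5, 8, and 9). It is not a port of Qodo's Python code. The function names are hypothetical, and the candidates would still need to be fed to a YAML parser one at a time until one succeeds.

```rust
/// Return the body of the first ```yaml fenced block, if any (cf. step 5).
fn extract_fenced_yaml(raw: &str) -> Option<&str> {
    let open = "```yaml";
    let start = raw.find(open)? + open.len();
    let rest = &raw[start..];
    let end = rest.find("```")?;
    Some(rest[..end].trim())
}

/// Drop a single leading '+' from each line (cf. step 8, diff artifacts).
fn strip_leading_plus(raw: &str) -> String {
    raw.lines()
        .map(|line| line.strip_prefix('+').unwrap_or(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Replace tabs with spaces (cf. step 9); YAML does not allow tabs for indentation.
fn tabs_to_spaces(raw: &str) -> String {
    raw.replace('\t', "    ")
}

/// Candidate texts to try parsing, in order: the raw text first (happy path),
/// then each repaired variant.
fn yaml_candidates(raw: &str) -> Vec<String> {
    let mut out = vec![raw.to_string()];
    if let Some(inner) = extract_fenced_yaml(raw) {
        out.push(inner.to_string());
    }
    out.push(strip_leading_plus(raw));
    out.push(tabs_to_spaces(raw));
    out
}
```

Keeping each repair as a small pure function makes it easy to unit-test in isolation and to reorder the chain as real failure cases show up.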

## Proposed Solution

Audit and harden DiffScope's LLM output parsing:

### 1. Audit Current Parsing
- Review `parsing/llm_response.rs` and `parsing/smart_review_response.rs`
- Identify failure modes from production use
- Add error tracking to measure parse failure rates

### 2. Implement Fallback Chain
```rust
use anyhow::{anyhow, Result};

// `Comment` must implement `serde::Deserialize`. The helpers called below
// (`extract_code_block`, `fix_common_json_issues`, `extract_comments_regex`,
// `parse_structured_text`) are proposed and still need to be written.
fn parse_llm_response(raw: &str) -> Result<Vec<Comment>> {
    // Strategy 1: Direct JSON parse (the happy path)
    if let Ok(comments) = serde_json::from_str::<Vec<Comment>>(raw) {
        return Ok(comments);
    }

    // Strategy 2: Extract JSON from markdown code blocks
    if let Some(json_block) = extract_code_block(raw, "json") {
        if let Ok(comments) = serde_json::from_str::<Vec<Comment>>(&json_block) {
            return Ok(comments);
        }
    }

    // Strategy 3: Fix common JSON issues (trailing commas, single quotes)
    let fixed = fix_common_json_issues(raw);
    if let Ok(comments) = serde_json::from_str::<Vec<Comment>>(&fixed) {
        return Ok(comments);
    }

    // Strategy 4: Extract individual comment objects with regex
    if let Ok(comments) = extract_comments_regex(raw) {
        return Ok(comments);
    }

    // Strategy 5: Line-by-line structured extraction
    if let Ok(comments) = parse_structured_text(raw) {
        return Ok(comments);
    }

    // Strategy 6 (last resort): asking the LLM to reformat needs a model call,
    // so it is left to the caller. All local strategies failed at this point.
    Err(anyhow!("Failed to parse LLM output after all strategies"))
}
```
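The two helpers used by strategies 2 and 3 could look roughly like the sketch below. This is an illustration under assumptions, not existing DiffScope code: the signatures are inferred from the call sites above, and `fix_common_json_issues` here only removes trailing commas (single-quote repair is omitted because it is easy to get wrong inside string values).

```rust
/// Return the trimmed body of the first fenced block tagged with `lang`.
fn extract_code_block(raw: &str, lang: &str) -> Option<String> {
    let open = format!("```{lang}");
    let start = raw.find(&open)? + open.len();
    let rest = &raw[start..];
    let end = rest.find("```")?;
    Some(rest[..end].trim().to_string())
}

/// Remove commas that directly precede `}` or `]`, ignoring anything
/// inside double-quoted strings.
fn fix_common_json_issues(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut in_string = false;
    let mut escaped = false;
    for c in raw.chars() {
        if in_string {
            // Copy string contents verbatim, tracking backslash escapes.
            out.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => {
                in_string = true;
                out.push(c);
            }
            '}' | ']' => {
                // Drop a trailing comma, plus any whitespace after it.
                let trimmed_len = out.trim_end().len();
                if out[..trimmed_len].ends_with(',') {
                    out.truncate(trimmed_len - 1);
                }
                out.push(c);
            }
            _ => out.push(c),
        }
    }
    out
}
```

Tracking string state matters here: a naive find-and-replace would corrupt comment bodies that happen to contain `,}` or `,]`.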

### 3. Track Parse Failures
- Log parse failure rate per strategy
- Surface in metrics/analytics
- Feed back into prompt engineering (if certain models consistently produce bad output)
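One possible shape for this tracking is a small in-memory counter, sketched below. The `ParseStats` type and the strategy labels are hypothetical. A real implementation would likely export these counts to whatever metrics system DiffScope already uses.

```rust
use std::collections::BTreeMap;

/// Counts which strategy ended up parsing each response,
/// plus how many responses no strategy could parse.
#[derive(Default)]
struct ParseStats {
    successes: BTreeMap<&'static str, u64>,
    failures: u64,
}

impl ParseStats {
    /// Record that `strategy` produced a valid parse.
    fn record_success(&mut self, strategy: &'static str) {
        *self.successes.entry(strategy).or_insert(0) += 1;
    }

    /// Record a response that every strategy failed on.
    fn record_failure(&mut self) {
        self.failures += 1;
    }

    /// Total number of responses seen.
    fn total(&self) -> u64 {
        self.successes.values().sum::<u64>() + self.failures
    }

    /// Fraction of responses that could not be parsed at all.
    fn failure_rate(&self) -> f64 {
        let total = self.total();
        if total == 0 {
            0.0
        } else {
            self.failures as f64 / total as f64
        }
    }
}
```

If most successes land on a late strategy for a given model, that is a signal the prompt or output format for that model needs attention.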

## Priority

**Medium (production reliability).** Low effort, high resilience. Malformed LLM output is a recurring problem for tools in this space.
