Autonomous software engineering — powered by OpenAI · Claude · OpenCode
APE plans, reviews, patches, builds, and commits your code — while you stay in control. Every change is risk-classified and reviewed by the appropriate AI pipeline before a single byte hits disk. In debate mode a full Debate Viewer CLI UI renders each phase in real time — color-coded, structured, and engineer-friendly. Works with any language, any stack, any build system.
APE is a fully autonomous AI coding agent for any software project — Python services, TypeScript apps, Rust CLIs, Go microservices, Java backends, C/C++ firmware, and everything in between. You give it a goal in plain English; it figures out the task plan, generates minimal unified-diff patches, runs your build system, and iterates on failures — all with human approval gates at every critical step.
Version 2.0 introduces Risk-Gated Debate Mode: every proposed change is automatically routed to the right review pipeline based on a weighted risk classifier. Low-risk changes get a fast single-pass review; high-risk or safety-critical changes (data models, auth flows, transaction handlers, critical system code) go through a 4-phase adversarial AI debate before anything touches your codebase.
| | Feature | Detail |
|---|---|---|
| 🧠 | Multi-provider AI | OpenAI, Claude, and OpenCode — mix and match any model as proposer or critic |
| 🎯 | Risk-gated mode selection | Automatic LITE vs DEBATE routing per change-set |
| ⚔️ | 4-phase adversarial debate | Propose → Challenge → Rebuttal → Final audit |
| 🆓 | Free model support | OpenCode Zen API: glm-5-free, minimax-m2.5-free, kimi-k2.5-free, big-pickle |
| 🖥️ | Debate Viewer CLI UI | Structured, color-coded terminal panels for every debate phase |
| 🔥 | Risk heatmap | ASCII per-file risk bars rendered after each debate |
| 🗂️ | Debate session logs | Auto-persists _session.json, _patch_v1.diff, _patch_v2.diff |
| 🔒 | Firmware-safe guardrails | Blocks struct renames, ISR changes, oversized deletions, protected paths — web/script files exempt from deletion limits |
| 🩹 | Unified diff patches | All changes via git apply — no full-file overwrites |
| 🔨 | Build loop | Run your build after every patch; auto-fix on failure (up to 3 retries) |
| 💰 | Budget tracking | Per-provider (OpenAI / Claude) + per-phase USD and token accounting |
| 🧠 | Persistent memory | Architecture decisions, constraints, and errors survive sessions |
| ↩️ | Resume sessions | Pick up exactly where you left off with --resume |
| 🛑 | Human approval gates | Y/N prompts before every apply and commit — always |
| 🌵 | Dry-run by default | Nothing written to disk unless you explicitly pass --apply |
| 🔍 | Verbose debug mode | --verbose dumps full raw model JSON for every phase |
# Install
cd ape && npm install
cp .env.example .env # add OPENAI_API_KEY and ANTHROPIC_API_KEY
# Preview what APE would do (safe, no writes)
node index.js \
--goal="Add rate limiting middleware to the Express API" \
--type=node \
--build="npm test"
# Actually apply patches
node index.js \
--goal="Add rate limiting middleware to the Express API" \
--type=node \
--build="npm test" \
--apply

Risk-Gated Debate Mode is the centrepiece of APE v2.0. Every task is classified before any AI call is made:
Change set
    │
    ▼
┌──────────────────────────────┐
│  Risk Classifier (0-100)     │
│  • struct / enum     → +40   │
│  • ISR / IRAM_ATTR   → +30   │
│  • memory ops        → +15   │
│  • concurrency       → +15   │
│  • protected path    → +25   │
└──────────┬───────────────────┘
           │
 ┌─────────▼──────────┐
 │   Mode Selector    │◄── --lite-only / --debate-only
 └─────────┬──────────┘
           │
  ┌────────┴────────┐
  │                 │
  ▼                 ▼
LITE mode        DEBATE mode
(score < 55)     (score ≥ 55)
1 provider       4 phases:
call               Phase 1 — Proposer generates patch
~$0.01-0.03        Phase 2 — Critic challenges
                   Phase 3 — Proposer rebuts
                   Phase 4 — Critic final audit
                 ~$0.00-0.25 (free w/ OpenCode)
Force-debate conditions override score entirely:
- Any struct / typedef / enum keyword in the diff
- ISR or IRAM_ATTR in the diff
- Protected path (src/protocol, src/radio, src/routing) + patch > 80 lines
→ Full documentation: docs/risk-gated-debate.md
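The scoring and routing rules above can be sketched in a few lines of JavaScript. This is a minimal illustration, not APE's actual riskClassifier.js or modeSelector.js: the weights, the 55-point threshold, the force-debate conditions, and the protected paths come from the text above, while the function names, the regex patterns for memory and concurrency operations, the prefix match on protected paths, and the way patch lines are counted are all assumptions. It also assumes that --lite-only / --debate-only take precedence over the force-debate conditions.

```javascript
// Hypothetical sketch of risk scoring and mode selection.
// Weights and threshold follow the diagram; everything else is assumed.
const PROTECTED_PATHS = ['src/protocol', 'src/radio', 'src/routing'];
const DEBATE_THRESHOLD = 55;

function classifyRisk(diff, files) {
  const triggers = [];
  let score = 0;
  const hit = (label, weight) => {
    triggers.push(label);
    score += weight;
  };

  const hasTypeKeyword = /\b(struct|typedef|enum)\b/.test(diff);
  const hasIsr = /\b(ISR|IRAM_ATTR)\b/.test(diff);
  // Assumed: a file is "protected" when its path starts with a protected prefix.
  const touchesProtected = files.some((f) =>
    PROTECTED_PATHS.some((p) => f.startsWith(p))
  );
  // Rough count of changed lines: "+" / "-" lines, skipping "+++" / "---" headers.
  const patchLines = diff
    .split('\n')
    .filter((line) => /^[+-](?![+-])/.test(line)).length;

  if (hasTypeKeyword) hit('struct / enum keyword', 40);
  if (hasIsr) hit('ISR / IRAM_ATTR', 30);
  // The two patterns below are illustrative guesses, not APE's real detectors.
  if (/\b(malloc|free|memcpy|memset)\b/.test(diff)) hit('memory ops', 15);
  if (/\b(mutex|semaphore|atomic|volatile)\b/.test(diff)) hit('concurrency', 15);
  if (touchesProtected) hit('protected path', 25);

  const forceDebate =
    hasTypeKeyword || hasIsr || (touchesProtected && patchLines > 80);

  return { score: Math.min(score, 100), triggers, forceDebate };
}

function selectMode(risk, flags = {}) {
  if (flags.liteOnly) return 'lite';
  if (flags.debateOnly) return 'debate';
  if (risk.forceDebate) return 'debate';
  return risk.score >= DEBATE_THRESHOLD ? 'debate' : 'lite';
}
```

With these assumptions, a diff that adds a struct under src/routing scores 40 + 25 = 65 and is routed to debate both by score and by the force-debate rule.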
When running in debate mode, APE renders a full structured terminal UI as each phase completes:
══════════════════════════════════════════════════════════════
AI DEBATE SESSION [Task 3]
──────────────────────────────────────────────────────────────
Mode: debate
Risk Level: HIGH
Risk Score: 72
Triggers:
· struct keyword detected
· TTL logic modified
══════════════════════════════════════════════════════════════
[PHASE 1] GPT Proposal
──────────────────────────────────────────────────────────────
Files: src/mesh_rx.c, src/routing.c
Patch lines: 184
Self risk: medium
Confidence: 78%
[PHASE 2] Critic Challenge
──────────────────────────────────────────────────────────────
⚠ mesh_rx.c:142-168
Issue: Possible race condition on shared buffer
Severity: CRITICAL
⚠ packet.h:33-48
Issue: Enum order modified (protocol risk)
Severity: MEDIUM
[PHASE 3] Proposer Defense
──────────────────────────────────────────────────────────────
✔ Reverted enum reorder
✔ Added boundary guard for TTL decrement
✔ Wrapped shared buffer access in mutex
--- PATCH CHANGES (v1 → v2) ---
Lines removed: 12 Lines added: 18
[PHASE 4] Final Audit
──────────────────────────────────────────────────────────────
Remaining issues: none
Final Risk: LOW
Confidence: 84%
══════════════════════════════════════════════════════════════
FINAL DECISION
──────────────────────────────────────────────────────────────
Mode used: DEBATE
Allow Apply: YES
Allow Commit: NO (requires human)
Final Confidence: 82%
══════════════════════════════════════════════════════════════
Risk Heatmap:
mesh_rx.c ████████░░ 70%
routing.c ███░░░░░░░ 30%
packet.h ██████████ 90%
Apply patch? (y/n)
Color coding: RED = critical, YELLOW = medium/high, GREEN = safe/low.
Enable --verbose to print full raw model JSON after each phase.
Session artifacts persisted to <target>/.ape/sessions/:
.ape/sessions/
<ts>_session.json full 4-phase debate record
<ts>_patch_v1.diff original proposer patch
<ts>_patch_v2.diff revised patch after defense
| Flag | Default | Description |
|---|---|---|
| --goal | required | What to build or fix |
| --type | node | See Project Types table below. 23 types supported. |
| --build | (none) | Build command — e.g. npm test, cargo test, pytest, make, dotnet build |
| --target | cwd | Path to your project directory |
| --max-budget | 5.00 | USD spending cap |
| --max-tokens | 500000 | Total token cap |
| --resume | false | Resume from ape-state.json |
| --no-git | false | Skip all git operations |

| Flag | Default | Description |
|---|---|---|
| --lite-only | false | Force single-pass LITE review for all tasks |
| --debate-only | false | Force 4-phase DEBATE review for all tasks |

| Flag | Default | Description |
|---|---|---|
| --apply | false | Write patches to disk. Without this, APE is in dry-run mode |

| Flag | Default | Description |
|---|---|---|
| --allow-protected | false | Allow changes to protected paths (e.g. src/protocol, src/radio, src/routing) |
| --allow-isr | false | Allow patches touching ISR / IRAM_ATTR code |
| --confidence-threshold | 70 | Minimum AI confidence score (0-100) to allow apply |
| --auto-commit | false | Auto-propose commit after each task (human still approves) |

| Flag | Default | Description |
|---|---|---|
| --verbose | false | Print full raw model JSON responses after each debate phase |
→ Full reference: docs/cli-reference.md
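To make the defaults in the tables above concrete, here is a minimal sketch of how such flags could be parsed with Node's built-in util.parseArgs. It is not APE's actual index.js: the function name parseCliOptions, the shape of the returned object, and the rule that --lite-only and --debate-only are mutually exclusive are assumptions; only the flag names and default values are taken from the tables.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical parser mirroring the documented flags and defaults.
function parseCliOptions(argv) {
  const bool = { type: 'boolean', default: false };
  const { values } = parseArgs({
    args: argv,
    options: {
      goal: { type: 'string' },
      type: { type: 'string', default: 'node' },
      build: { type: 'string' },
      target: { type: 'string', default: process.cwd() },
      // parseArgs only knows strings and booleans; numbers are converted below.
      'max-budget': { type: 'string', default: '5.00' },
      'max-tokens': { type: 'string', default: '500000' },
      'confidence-threshold': { type: 'string', default: '70' },
      resume: bool,
      'no-git': bool,
      'lite-only': bool,
      'debate-only': bool,
      apply: bool,
      'allow-protected': bool,
      'allow-isr': bool,
      'auto-commit': bool,
      verbose: bool,
    },
  });

  if (!values.goal) throw new Error('--goal is required');
  // Assumed rule: forcing both review modes at once is treated as an error.
  if (values['lite-only'] && values['debate-only']) {
    throw new Error('--lite-only and --debate-only cannot be combined');
  }

  return {
    goal: values.goal,
    type: values.type,
    build: values.build ?? null,
    target: values.target,
    maxBudget: Number(values['max-budget']),
    maxTokens: Number(values['max-tokens']),
    confidenceThreshold: Number(values['confidence-threshold']),
    resume: values.resume,
    git: !values['no-git'],
    liteOnly: values['lite-only'],
    debateOnly: values['debate-only'],
    apply: values.apply, // false means dry-run
    allowProtected: values['allow-protected'],
    allowIsr: values['allow-isr'],
    autoCommit: values['auto-commit'],
    verbose: values.verbose,
  };
}
```

Called as parseCliOptions(process.argv.slice(2)), a run without --apply yields apply: false, which matches the dry-run default described above.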
Pass any of the following to --type. Each type sets the planner prompt, build conventions, and guardrail rules appropriate for that stack.
| Type | Stack | Build command hint |
|---|---|---|
| embedded | C/C++ firmware (ESP-IDF / Arduino / bare-metal) | idf.py build / pio run |
| cli | CLI tool in any language (Python, Go, Rust, Node, C…) | (varies by language) |
| node | Node.js (Express / general) | npm test |
| web | Generic browser front-end (HTML + JS) | (none) |
| htmlcss | Pure HTML + CSS, no build tool | (none) |
| python | Python 3 scripts / libraries | pytest |
| react | React 18 + Vite SPA | npm test |
| api | Node/Express REST API | npm test |
| rust | Rust (Cargo 2021) | cargo test |
| docker | Dockerfiles + Compose only | docker build . |
| arduino | Arduino / PlatformIO sketches | pio run |
| nextjs | Next.js 14 App Router + Tailwind | npm test |
| go | Go modules (go.mod) | go test ./... |
| fastapi | FastAPI + Pydantic v2 | pytest |
| bash | Bash scripts | shellcheck |
| svelte | SvelteKit + TypeScript | npm test |
| tauri | Tauri (Rust backend + web front-end) | cargo test |
| vscode-ext | VS Code extension | npm test |
| terraform | Terraform / HCL2 | terraform validate |
| platformio | PlatformIO (embedded) | pio test |
| dotnet | .NET 8 / C# | dotnet test |
| cpp | C++17/20 with CMake | cmake --build build |
| c | C11 with Makefile / CMake | make |
1. PLAN           Proposer model generates architecture + task list
                  Critic model reviews and refines the plan
                               │
2. FOR EACH TASK:              │
   ╔══════════════════════════╗│
   ║  Pre-guardrail           ║│
   ║  Protected path check    ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Risk Classifier         ║│
   ║  Score 0-100, detect ISR ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Mode Selector           ║│
   ║  LITE or DEBATE          ║│
   ╚══════════╤═══════════════╝│
              │                │
   ┌──────────▼──────────┐     │
   │  Review Pipeline    │     │
   │  (LITE or DEBATE)   │     │
   └──────────┬──────────┘     │
              │                │
   ╔══════════▼═══════════════╗│
   ║  Post-guardrail          ║│
   ║  checkPatch()            ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Consensus               ║│
   ║  allow_apply?            ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Dry-run gate            ║│ ← default: stop here
   ║  --apply required        ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Human Y/N               ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  git apply               ║│
   ║  hard fail if rejected   ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Build + fix loop        ║│
   ║  up to 3 retries         ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Human Y/N commit        ║│
   ╚══════════╤═══════════════╝│
              │                │
   ╔══════════▼═══════════════╗│
   ║  Save report artifact    ║│
   ╚══════════════════════════╝│
                               │
3. SUMMARY budget + phases ────┘
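The "Build + fix loop" stage in the pipeline above can be illustrated with a short sketch. This is an assumption-laden outline rather than APE's buildRunner.js: the names runBuild and buildWithRetries, the fixFn callback (standing in for whatever generates and applies a follow-up patch), and the injectable run parameter are hypothetical. Only the retry cap of 3 comes from the text.

```javascript
const { spawnSync } = require('node:child_process');

// Runs the user's build command in a shell and captures its output.
function runBuild(command, cwd) {
  const result = spawnSync(command, { cwd, shell: true, encoding: 'utf8' });
  return {
    ok: result.status === 0,
    output: `${result.stdout || ''}${result.stderr || ''}`,
  };
}

// Re-runs the build after each fix attempt, up to maxRetries times.
// fixFn receives the failing build output and the retry number.
function buildWithRetries(command, cwd, fixFn, { maxRetries = 3, run = runBuild } = {}) {
  let build = run(command, cwd);
  let retries = 0;
  while (!build.ok && retries < maxRetries) {
    retries += 1;
    fixFn(build.output, retries);
    build = run(command, cwd);
  }
  return { ok: build.ok, retries, output: build.output };
}
```

Passing the runner as a parameter keeps the loop testable without spawning a real build; in normal use the default runBuild would execute the --build command in the --target directory.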
ape/
├── index.js                            CLI entry — arg parsing, option assembly
└── src/
    │
    ├── orchestrator.js                 Master loop: risk → mode → review → apply
    │
    ├── ── Planning ─────────────────────────────────────────────────────────
    ├── planner.js                      GPT creates task list for any project type; Claude refines
    ├── taskManager.js                  Dependency-aware queue; done/failed/pending
    ├── memory.js                       Arch decisions, constraints, error history
    ├── stateTracker.js                 Iteration counter, current task, budget snapshot
    │
    ├── ── Risk & Mode ──────────────────────────────────────────────────────
    ├── riskClassifier.js               Weighted 0-100 score; detects ISR/struct/protected
    ├── modeSelector.js                 lite | debate; CLI flags override classifier
    │
    ├── ── Review Pipelines ─────────────────────────────────────────────────
    ├── liteReviewer.js                 Single pass → unified diff (LITE mode)
    ├── debateOrchestrator.js           4-phase adversarial debate (DEBATE mode)
    ├── debateViewer.js                 Debate Viewer CLI UI — panels, heatmap, prompts, log persist
    ├── critiqueParser.js               Parses/normalises all 4 phase JSON; safe fallbacks
    ├── rebuttalEngine.js               Phase 3 rebuttal; addressed_items tracking
    ├── consensus.js                    fromLite / fromDebate → CONSENSUS_OUTPUT
    │
    ├── ── Patch Lifecycle ──────────────────────────────────────────────────
    ├── patchApplier.js                 applyDiff / saveDiff / previewDiff
    ├── guardrails.js                   checkPatch / checkPaths / checkFiles (file-type-aware)
    ├── patchGenerator.js               Legacy helper; used for record saving
    │
    ├── ── Build & Git ──────────────────────────────────────────────────────
    ├── buildRunner.js                  Run build command; extract errors
    ├── gitManager.js                   Branch, stage, commit, awaitApproval
    │
    ├── ── AI Providers ─────────────────────────────────────────────────────
    ├── openai.js                       OpenAIProvider + legacy completeJSON helpers
    ├── claude.js                       ClaudeProvider + legacy completeJSON helpers
    ├── providers/LLMProvider.js        Abstract base — generate(prompt, options)
    ├── providers/providerFactory.js    createProvider('openai'|'claude'|'opencode')
    ├── providers/OpenCodeProvider.js   fetch-based; free model allowlist; 4096 token default
    ├── core/DebateSession.js           Session state: proposer/critic providers + model names
    │
    └── ── Infrastructure ───────────────────────────────────────────────────
        ├── budgetManager.js            Per-model + per-phase USD + token tracking
        ├── logger.js                   Coloured console output; modeDecision, riskScore…
        └── config.js                   .env validation; throws on missing keys
Every run writes structured artifacts into your project:
<your-project>/
└── .ape/
    ├── patches/      <ts>_<taskId>.diff     every attempted patch (audit trail)
    ├── debates/      <ts>_<taskId>.json     full 4-phase debate records
    ├── sessions/     <ts>_session.json      debate viewer session log
    │                 <ts>_patch_v1.diff     original proposer patch
    │                 <ts>_patch_v2.diff     revised patch after defense (if changed)
    ├── memory.json                          architecture decisions, constraints, error history
    └── state.json                           iteration counter, task status, budget snapshot
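As an illustration of the sessions/ layout above, the following sketch writes the three debate artifacts for one run. It is not the code in debateViewer.js: the function name persistSession, the timestamp format used for the ts prefix, and the session object shape are assumptions. The file suffixes and the rule that patch_v2 is only written when the patch changed follow the listing.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: persists one debate session under <target>/.ape/sessions/.
function persistSession(targetDir, session, patchV1, patchV2, now = new Date()) {
  const dir = path.join(targetDir, '.ape', 'sessions');
  fs.mkdirSync(dir, { recursive: true });

  // Assumed timestamp format: ISO 8601 with ":" and "." made filename-safe.
  const ts = now.toISOString().replace(/[:.]/g, '-');
  const written = [];
  const write = (suffix, content) => {
    const file = path.join(dir, `${ts}_${suffix}`);
    fs.writeFileSync(file, content, 'utf8');
    written.push(file);
  };

  write('session.json', JSON.stringify(session, null, 2));
  write('patch_v1.diff', patchV1);
  // Only store the revised patch when the defense phase actually changed it.
  if (patchV2 && patchV2 !== patchV1) write('patch_v2.diff', patchV2);

  return written;
}
```

Returning the list of written paths makes it easy to log them or attach them to a report.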
# 1. Install dependencies
cd ape && npm install
# 2. Configure API keys
cp .env.example .env

Edit .env:
# Required for OpenAI / Claude (default providers)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Optional: choose which model fills each debate role
# PROPOSER_PROVIDER=openai # openai | claude | opencode
# PROPOSER_MODEL=gpt-4.1
# CRITIC_PROVIDER=claude # openai | claude | opencode
# CRITIC_MODEL=claude-opus-4-5
# Optional: use OpenCode free models (no API key needed for free tier)
# OPENCODE_ZEN_BASE_URL=https://opencode.ai
# PROPOSER_PROVIDER=opencode
# PROPOSER_MODEL=glm-5-free
# CRITIC_PROVIDER=opencode
# CRITIC_MODEL=kimi-k2.5-free

# 3. Verify
node index.js --help

APE uses a proposer / critic model: one AI proposes the patch, another challenges it. You can assign any supported provider to either role via .env.
| Provider | Key | Models |
|---|---|---|
| OpenAI | openai | gpt-4.1, gpt-4o, any GPT model |
| Anthropic (Claude) | claude | claude-opus-4-5, any Claude model |
| OpenCode Zen API | opencode | glm-5-free, minimax-m2.5-free, kimi-k2.5-free, big-pickle (free) |
Set these four variables in your .env:
PROPOSER_PROVIDER=openai # who generates the patch
PROPOSER_MODEL=gpt-4.1
CRITIC_PROVIDER=claude # who challenges and audits it
CRITIC_MODEL=claude-opus-4-5

OpenCode exposes an OpenAI-compatible /zen/v1/chat/completions endpoint. The free models are allowlisted by default — no billing setup required.
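For readers who want to see what a call to that endpoint might look like, here is a hedged sketch using the fetch built into Node 18+. It is not the code in OpenCodeProvider.js: the function names, the use of a Bearer Authorization header when a key is present, and the response shape (choices[0].message.content, as in the OpenAI chat completions format) are assumptions based on the endpoint being described as OpenAI-compatible. The endpoint path, the environment variable names, and the 4096-token default come from this README.

```javascript
// Builds the request for the OpenCode Zen chat completions endpoint.
function buildZenRequest(prompt, options = {}) {
  const {
    baseUrl = process.env.OPENCODE_ZEN_BASE_URL || 'https://opencode.ai',
    apiKey = process.env.OPENCODE_ZEN_API_KEY || '',
    model = process.env.OPENCODE_DEFAULT_MODEL || 'glm-5-free',
    maxTokens = 4096,
  } = options;

  const headers = { 'Content-Type': 'application/json' };
  // Assumed: the free tier needs no key, so the header is only sent when one is set.
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`;

  return {
    url: new URL('/zen/v1/chat/completions', baseUrl).toString(),
    init: {
      method: 'POST',
      headers,
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: prompt }],
        max_tokens: maxTokens,
      }),
    },
  };
}

// Sends the request and returns the first completion's text.
async function generate(prompt, options) {
  const { url, init } = buildZenRequest(prompt, options);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`OpenCode request failed: HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content; // assumed OpenAI-style response shape
}
```

Separating request construction from the network call keeps the request shape easy to inspect without making a real API call.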
Step 1 — add to .env:
OPENCODE_ZEN_BASE_URL=https://opencode.ai
# OPENCODE_ZEN_API_KEY= ← leave blank for free tier
OPENCODE_DEFAULT_MODEL=glm-5-free

Step 2 — choose a debate pairing:
# All-free debate (GLM proposes, Kimi critiques)
PROPOSER_PROVIDER=opencode
PROPOSER_MODEL=glm-5-free
CRITIC_PROVIDER=opencode
CRITIC_MODEL=kimi-k2.5-free
# Mixed: GPT proposes, OpenCode critiques for free
PROPOSER_PROVIDER=openai
PROPOSER_MODEL=gpt-4.1
CRITIC_PROVIDER=opencode
CRITIC_MODEL=minimax-m2.5-free

Step 3 — run APE normally:
node index.js \
--goal="Add error handling to the data pipeline" \
--type=python --build="pytest" --apply

Allowlist: by default only glm-5-free, minimax-m2.5-free, kimi-k2.5-free, and big-pickle are accepted. Set OPENCODE_ALLOW_ANY_MODEL=1 to bypass the check for other model strings.
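The role assignment and the allowlist check described above could be combined roughly as follows. This is a sketch under stated assumptions, not the logic in providerFactory.js or config.js: the function names, the per-provider fallback models, and the error messages are hypothetical. The variable names, the three provider keys, the four allowlisted models, and the OPENCODE_ALLOW_ANY_MODEL=1 bypass come from this README.

```javascript
const SUPPORTED_PROVIDERS = ['openai', 'claude', 'opencode'];
const FREE_OPENCODE_MODELS = [
  'glm-5-free',
  'minimax-m2.5-free',
  'kimi-k2.5-free',
  'big-pickle',
];

// Resolves one debate role (PROPOSER or CRITIC) from environment variables.
function resolveRole(env, prefix, fallbackProvider) {
  const provider = env[`${prefix}_PROVIDER`] || fallbackProvider;
  if (!SUPPORTED_PROVIDERS.includes(provider)) {
    throw new Error(`${prefix}_PROVIDER must be one of: ${SUPPORTED_PROVIDERS.join(', ')}`);
  }

  // Assumed fallbacks when <ROLE>_MODEL is not set.
  const defaultModels = {
    openai: 'gpt-4.1',
    claude: 'claude-opus-4-5',
    opencode: env.OPENCODE_DEFAULT_MODEL || 'glm-5-free',
  };
  const model = env[`${prefix}_MODEL`] || defaultModels[provider];

  const allowAny = env.OPENCODE_ALLOW_ANY_MODEL === '1';
  if (provider === 'opencode' && !allowAny && !FREE_OPENCODE_MODELS.includes(model)) {
    throw new Error(`Model "${model}" is not on the OpenCode allowlist`);
  }
  return { provider, model };
}

function resolveDebateRoles(env = process.env) {
  return {
    proposer: resolveRole(env, 'PROPOSER', 'openai'),
    critic: resolveRole(env, 'CRITIC', 'claude'),
  };
}
```

With an empty environment this resolves to OpenAI as proposer and Claude as critic, which matches the default pairing shown in the .env example.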
# Safe preview — see the plan without touching any files
node index.js \
--goal="Add input validation and error handling to the user registration endpoint" \
--type=node
# FastAPI service
node index.js \
--goal="Add JWT authentication to the FastAPI backend" \
--type=fastapi \
--build="pytest" \
--apply --max-budget=5.00
# Force full adversarial debate on critical payment logic
node index.js \
--goal="Refactor the payment transaction rollback handler" \
--type=node \
--debate-only --allow-protected \
--build="npm test" \
--apply --max-budget=10.00
# Force debate with verbose model JSON output
node index.js \
--goal="Refactor routing layer" \
--type=embedded \
--debate-only --apply --verbose
# Rust CLI tool
node index.js \
--goal="Add async file processing with progress bar" \
--type=cli --target=./my-cli \
--build="cargo test" \
--apply --max-budget=3.00
# Resume an interrupted session
node index.js \
--goal="Add JWT authentication to the FastAPI backend" \
--type=fastapi --build="pytest" \
--resume --apply
# Zero-cost debate using OpenCode free models
node index.js \
--goal="Refactor the data pipeline module" \
--type=python --build="pytest" \
--debate-only --apply
# (set PROPOSER_PROVIDER=opencode CRITIC_PROVIDER=opencode in .env first)

| Doc | Description |
|---|---|
| docs/architecture.md | Full data-flow diagrams, module map, session state, risk scoring table |
| docs/risk-gated-debate.md | LITE mode, DEBATE mode (all 4 phases), debate viewer UI, consensus, budget fallback |
| docs/cli-reference.md | Every CLI flag with defaults, types, and examples |
| docs/guardrails.md | Pre/post guardrails, protected paths, deletion ratio, custom config |
| docs/modules.md | Full public API for every module in src/ |
MIT License
Copyright (c) 2026 APE Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.