Merged
25 changes: 17 additions & 8 deletions CLAUDE.md
@@ -6,7 +6,7 @@ Persistent codebase knowledge layer for AI agents. Pre-builds architecture, depe
- TypeScript, ESM (`"type": "module"`)
- tree-sitter (native N-API) + 27 language grammar packages
- @modelcontextprotocol/sdk - MCP server (stdio transport)
- commander - CLI (init, serve, update, status, symbols, search, modules, hotspots, hook, upgrade)
- commander - CLI (init, serve, update, inject, status, symbols, search, modules, hotspots, hook, upgrade)
- simple-git - git integration + temporal analysis
- zod - schema validation for LLM analysis results
- yaml - cortex.yaml manifest
@@ -45,16 +45,25 @@ Hybrid extraction:
- `codecortex symbols [query]` - browse and filter the symbol index
- `codecortex search <query>` - search across all knowledge files
- `codecortex modules [name]` - list modules or deep-dive into one
- `codecortex inject` - regenerate inline context in CLAUDE.md and agent config files
- `codecortex hotspots` - files ranked by risk (churn + coupling + bugs)
- `codecortex hook install|uninstall|status` - manage git hooks for auto-update
- `codecortex upgrade` - check for and install latest version

## MCP Tools (13)
Read (10): get_project_overview, get_module_context, get_session_briefing, search_knowledge, get_decision_history, get_dependency_graph, lookup_symbol, get_change_coupling, get_hotspots, get_edit_briefing
Write (3): record_decision, update_patterns, record_observation
## MCP Tools (5)
get_project_overview, get_dependency_graph, lookup_symbol, get_change_coupling, get_edit_briefing

All read tools include `_freshness` metadata (status, lastAnalyzed, filesChangedSince, changedFiles, message).
All read tools return context-safe responses (<10K chars) via truncation utilities in `src/utils/truncate.ts`.
## MCP Resources (3)
- `codecortex://project/overview` — constitution (architecture, risk map)
- `codecortex://project/hotspots` — risk-ranked files
- `codecortex://module/{name}` — module documentation (template)

## MCP Prompts (2)
- `start_session` — constitution + latest session for context
- `before_editing` — risk assessment for files you plan to edit

All tools include `_freshness` metadata (status, lastAnalyzed, filesChangedSince, changedFiles, message).
All tools return context-safe responses (<10K chars) via truncation utilities in `src/utils/truncate.ts`.
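
The `_freshness` fields listed above could be modelled roughly as follows. This is a sketch only: the field names come from the list above, but the types, the `fresh`/`stale` status values, and the `buildFreshness` helper are assumptions rather than the project's actual definitions.

```typescript
// Hypothetical shape for the `_freshness` metadata. Field names are taken from
// the description above; the types and status values are assumptions.
interface Freshness {
  status: 'fresh' | 'stale'
  lastAnalyzed: string       // ISO timestamp of the last analysis run
  filesChangedSince: number  // how many files changed since that run
  changedFiles: string[]     // paths of those files, capped for brevity
  message: string            // human-readable summary for the agent
}

// Hypothetical helper: derive the metadata from a list of changed paths.
function buildFreshness(
  lastAnalyzed: string,
  changedFiles: string[],
  maxListed: number = 10,
): Freshness {
  const stale = changedFiles.length > 0
  return {
    status: stale ? 'stale' : 'fresh',
    lastAnalyzed,
    filesChangedSince: changedFiles.length,
    changedFiles: changedFiles.slice(0, maxListed),
    message: stale
      ? `${changedFiles.length} file(s) changed since last analysis; consider running codecortex update.`
      : 'Knowledge is up to date.',
  }
}
```

Capping `changedFiles` while still reporting the full count in `filesChangedSince` is one way to keep the metadata itself from inflating a response that is meant to stay under the size limit.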

## Pre-Publish Checklist
Run ALL of these before `npm publish`. Do not skip any step.
@@ -72,7 +81,7 @@ Run ALL of these before `npm publish`. Do not skip any step.
- **Grammar smoke test** (`parser.test.ts`): Loads every language in `LANGUAGE_LOADERS` via `parseSource()`. Catches missing packages, broken native builds, wrong require paths. This is what would have caught the tree-sitter-liquid issue.
- **Version-check tests**: Update notification, cache lifecycle, PM detection, upgrade commands.
- **Hook tests**: Git hook install/uninstall/status integration tests.
- **MCP tests**: All 13 tools (read + write), simulation tests.
- **MCP tests**: All 5 tools, resources, prompts, simulation tests.

### Known limitations
- tree-sitter native bindings don't compile on Node 24 yet (upstream issue)
@@ -91,7 +100,7 @@ Run ALL of these before `npm publish`. Do not skip any step.
src/
cli/ - commander CLI (init, serve, update, status)
mcp/ - MCP server + tools
core/ - knowledge store (graph, modules, decisions, sessions, patterns, constitution, search, agent-instructions, freshness)
core/ - knowledge store (graph, modules, decisions, sessions, patterns, constitution, search, agent-instructions, context-injection, freshness)
extraction/ - tree-sitter native N-API (parser, symbols, imports, calls)
git/ - git diff, history, temporal analysis
types/ - TypeScript types + Zod schemas
55 changes: 30 additions & 25 deletions README.md
@@ -22,7 +22,7 @@ Every AI coding session starts with exploration — grepping, reading wrong file

## The Solution

CodeCortex eliminates the cold start. It pre-builds codebase knowledge — architecture, dependencies, risk areas, hidden coupling — so agents skip the exploration phase and go straight to the right files.
CodeCortex eliminates the cold start. It pre-builds codebase knowledge — architecture, dependencies, risk areas, hidden coupling — and injects it directly into your agent's context (CLAUDE.md, .cursorrules, etc.) so agents have project knowledge from the first prompt.

**Not a middleware. Not a proxy. Just knowledge your agent loads on day one.**

@@ -43,7 +43,7 @@ Three capabilities no other tool provides:

2. **Risk scores** — File X has been bug-fixed 7 times, has 6 hidden dependencies, and co-changes with 3 other files. Risk score: 35. You can't learn this from reading code.

3. **Cross-session memory** — Decisions, patterns, observations persist. The agent doesn't start from zero each session.
3. **Inline context injection** — Project knowledge is injected directly into CLAUDE.md, .cursorrules, and other agent config files with architecture, risk map, and editing directives. Agents use it without any setup.

**Example from a real codebase:**
- `schema.help.ts` and `schema.labels.ts` co-changed in 12/14 commits (86%) with **zero imports between them**
@@ -61,8 +61,8 @@ npm install -g codecortex-ai --legacy-peer-deps
cd /path/to/your-project
codecortex init

# Check knowledge freshness
codecortex status
# Regenerate inline context in CLAUDE.md and agent config files
codecortex inject
```

### Connect to Claude Code
@@ -101,7 +101,7 @@ Add to `.cursor/mcp.json`:

## What Gets Generated

All knowledge lives in `.codecortex/` as flat files in your repo:
All knowledge lives in `.codecortex/` as flat files in your repo; inline context is also injected into agent config files:

```
.codecortex/
@@ -111,11 +111,16 @@ All knowledge lives in `.codecortex/` as flat files in your repo:
graph.json # dependency graph (imports, calls, modules)
symbols.json # full symbol index (functions, classes, types...)
temporal.json # git coupling, hotspots, bug history
hotspots.md # risk-ranked files (static, always available)
AGENT.md # tool usage guide for AI agents
modules/*.md # per-module structural analysis
decisions/*.md # architectural decision records
sessions/*.md # session change logs
patterns.md # coding patterns and conventions

CLAUDE.md # ← inline context injected here
.cursorrules # ← and here (if exists)
.windsurfrules # ← and here (if exists)
```

## Six Knowledge Layers
@@ -129,37 +134,34 @@ All knowledge lives in `.codecortex/` as flat files in your repo:
| 5. Patterns | How code is written here | `patterns.md` |
| 6. Sessions | What changed between sessions | `sessions/*.md` |

## MCP Tools (13)
## MCP Tools (5)

### Navigation — "Where should I look?" (4 tools)
Five focused tools that provide capabilities agents can't get from reading code:

| Tool | Description |
|------|-------------|
| `get_project_overview` | Architecture, modules, risk map. Call this first. |
| `search_knowledge` | Find where a function/class/type is DEFINED by name. Ranked results. |
| `get_dependency_graph` | Import/export graph filtered by module or file. |
| `lookup_symbol` | Precise symbol lookup with kind and file path filters. |
| `get_module_context` | Module files, deps, temporal signals. Zoom into a module. |
| `get_change_coupling` | Files that must change together. Hidden dependencies flagged. |
| `get_edit_briefing` | Pre-edit risk: co-change warnings, hidden deps, bug history. **Always call before editing.** |

### Risk — "What could go wrong?" (4 tools)
### MCP Resources (3)

| Tool | Description |
|------|-------------|
| `get_edit_briefing` | Pre-edit risk: co-change warnings, hidden deps, bug history. **Always call before editing.** |
| `get_hotspots` | Files ranked by risk (churn x coupling x bugs). |
| `get_change_coupling` | Files that must change together. Hidden dependencies flagged. |
| `get_dependency_graph` | Import/export graph filtered by module or file. |
Static knowledge available without tool calls:

### Memory — "Remember this" (5 tools)
| Resource | Description |
|----------|-------------|
| `codecortex://project/overview` | Full project constitution |
| `codecortex://project/hotspots` | Risk-ranked file table |
| `codecortex://module/{name}` | Per-module documentation |

| Tool | Description |
|------|-------------|
| `get_session_briefing` | What changed since the last session. |
| `get_decision_history` | Why things were built this way. |
| `record_decision` | Save an architectural decision. |
| `update_patterns` | Document coding conventions. |
| `record_observation` | Record anything you learned about the codebase. |
### MCP Prompts (2)

All read tools include `_freshness` metadata and return context-safe responses (<10K chars) via size-adaptive caps.
| Prompt | Description |
|--------|-------------|
| `start_session` | Returns constitution + latest session context |
| `before_editing` | Takes file paths, returns risk/coupling/bug briefing |

## CLI Commands

@@ -168,6 +170,7 @@ All read tools include `_freshness` metadata and return context-safe responses (
| `codecortex init` | Discover project + extract symbols + analyze git history |
| `codecortex serve` | Start MCP server (stdio transport) |
| `codecortex update` | Re-extract changed files, update affected modules |
| `codecortex inject` | Regenerate inline context in CLAUDE.md and agent config files |
| `codecortex status` | Show knowledge freshness, stale modules, symbol counts |
| `codecortex symbols [query]` | Browse and filter the symbol index |
| `codecortex search <query>` | Search across symbols, file paths, and docs |
@@ -180,6 +183,8 @@ All read tools include `_freshness` metadata and return context-safe responses (

**Hybrid extraction:** tree-sitter native N-API for structure (symbols, imports, calls across 27 languages) + host LLM for semantics (what modules do, why they're built that way). Zero extra API keys.

**Inline context injection:** After analysis, CodeCortex injects a rich knowledge section directly into CLAUDE.md and other agent config files. This includes architecture overview, risk map with coupled file names, and editing directives — so agents have project context from the first prompt without needing MCP.

**Git hooks** keep knowledge fresh — `codecortex update` runs automatically on every commit, re-extracting changed files and updating temporal analysis.

**Size-adaptive responses** — CodeCortex classifies your project (micro → extra-large) and adjusts response caps accordingly. A 23-file project gets full detail. A 6,400-file project gets intelligent summaries. Every MCP tool response stays under 10K chars.
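
One way to picture the size-adaptive cap is sketched below. The intermediate size classes, the file-count thresholds, and the per-class item caps are illustrative assumptions; only the "micro" and "extra-large" endpoints and the 10K-character ceiling come from the text above.

```typescript
// Illustrative only: the classes between "micro" and "extra-large", the
// thresholds, and the item caps are assumptions, not CodeCortex's real values.
type SizeClass = 'micro' | 'small' | 'medium' | 'large' | 'extra-large'

const MAX_RESPONSE_CHARS = 10000 // ceiling stated in the README text

function classifyProject(fileCount: number): SizeClass {
  if (fileCount < 50) return 'micro'
  if (fileCount < 500) return 'small'
  if (fileCount < 2000) return 'medium'
  if (fileCount < 5000) return 'large'
  return 'extra-large'
}

// Hypothetical per-class caps on how many list items a response may include.
const ITEM_CAPS: Record<SizeClass, number> = {
  'micro': 200,
  'small': 100,
  'medium': 50,
  'large': 25,
  'extra-large': 10,
}

// Keep the first `cap` items, note how many were dropped, then hard-truncate
// so the final string is always shorter than MAX_RESPONSE_CHARS.
function renderCapped(items: string[], fileCount: number): string {
  const cap = ITEM_CAPS[classifyProject(fileCount)]
  const shown = items.slice(0, cap)
  const omitted = items.length - shown.length
  let text = shown.join('\n')
  if (omitted > 0) text += `\n... ${omitted} more omitted`
  if (text.length >= MAX_RESPONSE_CHARS) {
    const suffix = '\n[truncated]'
    text = text.slice(0, MAX_RESPONSE_CHARS - 1 - suffix.length) + suffix
  }
  return text
}
```

The two-stage design (cap the item count first, then truncate characters) means small projects usually pass through untouched, while the character limit acts as a backstop for unusually long items.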
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "codecortex-ai",
"version": "0.5.0",
"version": "0.6.0",
"description": "Persistent codebase knowledge layer for AI agents — architecture, dependencies, coupling, and risk served via MCP",
"type": "module",
"bin": {
23 changes: 19 additions & 4 deletions src/cli/commands/init.ts
@@ -14,6 +14,7 @@ import { writeFile, writeJsonStream, ensureDir, cortexPath } from '../../utils/f
import { readFile } from 'node:fs/promises'
import { generateStructuralModuleDocs } from '../../core/module-gen.js'
import { generateAgentInstructions } from '../../core/agent-instructions.js'
import { generateHotspotsMarkdown } from '../../git/temporal.js'
import { createDecision, writeDecision, listDecisions } from '../../core/decisions.js'
import type { SymbolRecord, ImportEdge, CallEdge, SymbolIndex, ProjectInfo } from '../../types/index.js'

@@ -40,6 +41,7 @@ export async function initCommand(opts: { root: string; days: string }): Promise
const allImports: ImportEdge[] = []
const allCalls: CallEdge[] = []
let extractionErrors = 0
const langStats = new Map<string, { files: number; symbols: number }>()

let parsed = 0
const parseable = project.files.filter(f => languageFromPath(f.path)).length
@@ -49,6 +51,9 @@ export async function initCommand(opts: { root: string; days: string }): Promise
const lang = languageFromPath(file.path)
if (!lang) continue

const stats = langStats.get(lang) || { files: 0, symbols: 0 }
stats.files++

try {
const tree = await parseFile(file.absolutePath, lang)
const source = await readFile(file.absolutePath, 'utf-8')
@@ -57,12 +62,14 @@ export async function initCommand(opts: { root: string; days: string }): Promise
const imports = extractImports(tree, file.path, lang)
const calls = extractCalls(tree, file.path, lang)

stats.symbols += symbols.length
allSymbols.push(...symbols)
allImports.push(...imports)
allCalls.push(...calls)
} catch {
extractionErrors++
}
langStats.set(lang, stats)
parsed++
if (showProgress && parsed % 5000 === 0) {
process.stdout.write(`\r Progress: ${parsed}/${parseable} files (${allSymbols.length} symbols)`)
@@ -74,6 +81,13 @@ export async function initCommand(opts: { root: string; days: string }): Promise
if (extractionErrors > 0) {
console.log(` (${extractionErrors} files skipped due to parse errors)`)
}

// Warn about languages with 0 symbols extracted
for (const [lang, stats] of langStats) {
if (stats.files > 0 && stats.symbols === 0) {
console.log(` \u26a0 Warning: ${lang} \u2014 ${stats.files} files parsed, 0 symbols extracted. Grammar may not support this language.`)
}
}
console.log('')

// Step 3: Build dependency graph
@@ -140,9 +154,10 @@ export async function initCommand(opts: { root: string; days: string }): Promise
// Write graph.json
await writeGraph(root, graph)

// Write temporal.json
// Write temporal.json + hotspots.md
if (temporalData) {
await writeFile(cortexPath(root, 'temporal.json'), JSON.stringify(temporalData, null, 2))
await writeFile(cortexPath(root, 'hotspots.md'), generateHotspotsMarkdown(temporalData))
}

// Write overview.md — compact summary only (no raw file listing)
@@ -162,7 +177,7 @@ export async function initCommand(opts: { root: string; days: string }): Promise
await writeManifest(root, manifest)

// Write patterns.md (empty template)
await writeFile(cortexPath(root, 'patterns.md'), '# Coding Patterns\n\nNo patterns recorded yet. Use `update_patterns` to add patterns.\n')
await writeFile(cortexPath(root, 'patterns.md'), '# Coding Patterns\n\nNo patterns recorded yet. Edit this file directly to add patterns.\n')

// Generate structural module docs
const moduleDocsGenerated = await generateStructuralModuleDocs(root, {
@@ -185,8 +200,8 @@ export async function initCommand(opts: { root: string; days: string }): Promise
console.log(' Written: constitution.md')
console.log('')

// Step 7: Agent onboarding
console.log('Step 7/7: Generating agent instructions...')
// Step 7: Agent onboarding + inline context injection
console.log('Step 7/7: Generating inline context...')
const updatedFiles = await generateAgentInstructions(root)

// Seed a starter decision (skip if decisions already exist)
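
The per-language tally added to `init.ts` in the hunks above can be exercised in isolation. A minimal sketch, assuming a simplified input of `{ lang, symbolCount }` records in place of the real parse loop; `ParsedFile`, `tallyByLanguage`, and `zeroSymbolWarnings` are hypothetical names introduced for illustration.

```typescript
// Simplified stand-in for one parsed file; the real loop works on tree-sitter output.
interface ParsedFile { lang: string; symbolCount: number }
type LangStats = Map<string, { files: number; symbols: number }>

// Count files and extracted symbols per language, mirroring the langStats map.
function tallyByLanguage(files: ParsedFile[]): LangStats {
  const stats: LangStats = new Map()
  for (const file of files) {
    const entry = stats.get(file.lang) || { files: 0, symbols: 0 }
    entry.files++
    entry.symbols += file.symbolCount
    stats.set(file.lang, entry)
  }
  return stats
}

// Report languages where files were seen but no symbols came out, which
// suggests the grammar or extraction queries do not cover that language.
function zeroSymbolWarnings(stats: LangStats): string[] {
  const warnings: string[] = []
  stats.forEach((s, lang) => {
    if (s.files > 0 && s.symbols === 0) {
      warnings.push(`${lang}: ${s.files} files parsed, 0 symbols extracted`)
    }
  })
  return warnings
}
```

For example, feeding it two TypeScript files with symbols and three Liquid files with none would yield a single warning for `liquid`.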
27 changes: 27 additions & 0 deletions src/cli/commands/inject.ts
@@ -0,0 +1,27 @@
import { resolve } from 'node:path'
import { existsSync } from 'node:fs'
import { cortexPath } from '../../utils/files.js'
import { injectAllAgentFiles } from '../../core/context-injection.js'

export async function injectCommand(opts: { root: string }): Promise<void> {
const root = resolve(opts.root)

if (!existsSync(cortexPath(root, 'cortex.yaml'))) {
console.error('Error: No CodeCortex knowledge found. Run `codecortex init` first.')
process.exitCode = 1
return
}

console.log('Regenerating inline context...')
const updated = await injectAllAgentFiles(root)

if (updated.length === 0) {
console.log(' All agent config files are already up to date.')
} else {
for (const file of updated) {
console.log(` Updated: ${file}`)
}
}
console.log('')
console.log('Done. Agent config files now contain inline project knowledge.')
}
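
`injectAllAgentFiles` itself is not shown in this view. A common way to make this kind of injection idempotent is to rewrite only a marker-delimited section of the target file, and the sketch below illustrates that technique. The marker strings and the `injectSection` helper are hypothetical and should not be read as CodeCortex's actual implementation.

```typescript
// Hypothetical markers delimiting the generated section inside an agent config file.
const SECTION_START = '<!-- codecortex:start -->'
const SECTION_END = '<!-- codecortex:end -->'

// Replace the marked section if present, otherwise append it. `changed` tells
// the caller whether the file needs rewriting, which is what would allow an
// "already up to date" message when nothing differs.
function injectSection(
  existing: string,
  knowledge: string,
): { content: string; changed: boolean } {
  const block = `${SECTION_START}\n${knowledge.trim()}\n${SECTION_END}`
  const startIdx = existing.indexOf(SECTION_START)
  const endIdx = existing.indexOf(SECTION_END)

  let content: string
  if (startIdx !== -1 && endIdx > startIdx) {
    // Swap only the text between the markers; user-written content is untouched.
    content =
      existing.slice(0, startIdx) + block + existing.slice(endIdx + SECTION_END.length)
  } else {
    // No section yet: append it after the existing content.
    const base = existing.replace(/\s+$/, '')
    content = (base.length > 0 ? base + '\n\n' : '') + block + '\n'
  }
  return { content, changed: content !== existing }
}
```

Running the helper twice with the same knowledge text leaves the file byte-identical on the second pass, so repeated `inject` or `update` runs would not churn the file in git.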
6 changes: 6 additions & 0 deletions src/cli/commands/update.ts
@@ -16,6 +16,8 @@ import { generateConstitution } from '../../core/constitution.js'
import { createSession, writeSession, getLatestSession } from '../../core/sessions.js'
import { readFile as fsRead } from 'node:fs/promises'
import { generateStructuralModuleDocs } from '../../core/module-gen.js'
import { generateHotspotsMarkdown } from '../../git/temporal.js'
import { injectAllAgentFiles } from '../../core/context-injection.js'
import type { SymbolRecord, ImportEdge, CallEdge, SymbolIndex } from '../../types/index.js'

export async function updateCommand(opts: { root: string; days: string }): Promise<void> {
@@ -100,6 +102,7 @@ export async function updateCommand(opts: { root: string; days: string }): Promi
await writeGraph(root, graph)
if (temporalData) {
await writeFile(cortexPath(root, 'temporal.json'), JSON.stringify(temporalData, null, 2))
await writeFile(cortexPath(root, 'hotspots.md'), generateHotspotsMarkdown(temporalData))
}

// Generate structural module docs (skip existing)
@@ -125,6 +128,9 @@ export async function updateCommand(opts: { root: string; days: string }): Promi
temporal: temporalData,
})

// Refresh inline context in agent config files
await injectAllAgentFiles(root)

// Create session log
const diff = await getUncommittedDiff(root).catch(() => ({ filesChanged: [], summary: 'no changes' }))
const previousSession = await getLatestSession(root)
2 changes: 1 addition & 1 deletion src/cli/grouped-help.ts
@@ -1,7 +1,7 @@
import type { Command, Help } from 'commander'

const COMMAND_GROUPS: Array<{ title: string; commands: string[] }> = [
{ title: 'Core', commands: ['init', 'serve', 'update', 'status'] },
{ title: 'Core', commands: ['init', 'serve', 'update', 'inject', 'status'] },
{ title: 'Query', commands: ['symbols', 'search', 'modules', 'hotspots'] },
{ title: 'Utility', commands: ['hook', 'upgrade'] },
]
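
The rest of `grouped-help.ts` is collapsed in this view. As a rough sketch of how a table like `COMMAND_GROUPS` might drive help output, the snippet below groups registered command names under their titles. `formatGroupedHelp` and its output layout are assumptions, and the sketch deliberately avoids commander's `Help` API.

```typescript
interface CommandGroup { title: string; commands: string[] }

// Same grouping as in the diff, including the new `inject` entry.
const GROUPS: CommandGroup[] = [
  { title: 'Core', commands: ['init', 'serve', 'update', 'inject', 'status'] },
  { title: 'Query', commands: ['symbols', 'search', 'modules', 'hotspots'] },
  { title: 'Utility', commands: ['hook', 'upgrade'] },
]

// Hypothetical formatter: list only commands that are actually registered,
// in the order the group declares them, and skip groups that end up empty.
function formatGroupedHelp(groups: CommandGroup[], registered: string[]): string {
  const lines: string[] = []
  for (const group of groups) {
    const present = group.commands.filter(name => registered.indexOf(name) !== -1)
    if (present.length === 0) continue
    lines.push(`${group.title}:`)
    for (const name of present) lines.push(`  ${name}`)
    lines.push('')
  }
  return lines.join('\n').replace(/\n+$/, '')
}
```

A command that is registered but missing from every group would silently vanish from output produced this way, which is why `inject` has to be added to the `Core` list in the same change that registers it.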
7 changes: 7 additions & 0 deletions src/cli/index.ts
@@ -5,6 +5,7 @@ import { Command } from 'commander'
import { initCommand } from './commands/init.js'
import { serveCommand } from './commands/serve.js'
import { updateCommand } from './commands/update.js'
import { injectCommand } from './commands/inject.js'
import { statusCommand } from './commands/status.js'
import { symbolsCommand } from './commands/symbols.js'
import { searchCommand } from './commands/search.js'
@@ -51,6 +52,12 @@ program
.option('-d, --days <number>', 'Days of git history to re-analyze', '90')
.action(updateCommand)

program
.command('inject')
.description('Regenerate inline context in CLAUDE.md and agent config files')
.option('-r, --root <path>', 'Project root directory', process.cwd())
.action(injectCommand)

program
.command('status')
.description('Show knowledge freshness and symbol counts')