Autonomous AI coding assistant for the terminal — powered by local LLMs on Apple Silicon.
Getting Started · Commands · Tool System · Project Creation · Configuration
UltraAgent is a terminal-based AI coding assistant that operates autonomously using an integrated tool system. It reads, writes, and edits files, runs shell commands, searches codebases, manages git, spawns sub-agents, persists memory, and browses the web — all controlled by a configurable three-tier permission system.
Runs 100% local on your machine — no API keys, no cloud, no data leaves your computer. Supports Ollama, LM Studio, MLX-LM, and llama.cpp.
- 100% local & private — no API keys, no cloud, all data stays on your machine
- Auto-setup — detects your hardware and running backends, then recommends the optimal model and settings
- 29 built-in tools — file I/O, shell, git, sub-agents, memory, web, planning
- Project creation wizard — scaffold production-ready projects from scratch
- Project scanner — deep analysis auto-injected into every LLM request
- Three-tier permissions — safe / confirm / dangerous, user approves before risky ops
- Agentic loop — LLM autonomously selects tools, executes, observes, iterates (up to 30 rounds)
- Sub-agents — spawn parallel agents for concurrent tasks
- Persistent memory — context that survives across sessions
- Apple Silicon optimized — curated model recommendations for M-series chips
- Smart caching — LM Studio KV-Cache, Flash Attention, quantized KV for optimal performance
- Real-time streaming — token-by-token output with repetition detection
```
npm install -g ultraagent
ultraagent local setup   # auto-detects hardware, backend, sets optimal config
ultraagent chat          # start coding
```

Or run from source:

```
git clone https://github.com/zurd46/UltraAgent.git
cd UltraAgent
npm install
npm run dev -- local setup   # auto-detects hardware, backend, sets optimal config
npm run dev -- chat          # start coding
```

The `local setup` command automatically:
- Detects your chip, RAM, GPU cores
- Finds running backends (LM Studio, Ollama, MLX-LM, llama.cpp)
- Recommends the best model for your hardware
- Sets optimal context length, KV-cache, batch size, flash attention
- Verifies the connection
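The detection step boils down to querying the OS for hardware facts. A minimal sketch of just the RAM check, under the assumption that the real implementation (in `src/utils/hardware.ts`) differs in detail:

```shell
# Rough sketch of the RAM-detection step (assumption: the actual
# implementation in src/utils/hardware.ts is more thorough).
if [ "$(uname)" = "Darwin" ]; then
  ram_bytes=$(sysctl -n hw.memsize)            # macOS: total physical memory
else
  kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
  ram_bytes=$(( kb * 1024 ))                   # Linux fallback
fi
ram_gb=$(( ram_bytes / 1073741824 ))
echo "Detected RAM: ${ram_gb} GB"
```

The detected value is what drives the model recommendation table below.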
To configure a backend manually instead:

```
npm run dev -- config set provider local-openai
npm run dev -- config set localModel qwen2.5-coder:14b
npm run dev -- config set localBaseUrl http://localhost:1234/v1
npm run dev -- chat
```

To build and install globally from source:

```
npm run build && npm install -g .
ultraagent local setup
ultraagent chat
```

Alias: `ua` works everywhere instead of `ultraagent`.
| Backend | Provider | Default Port | Description |
|---|---|---|---|
| LM Studio | `local-openai` | `localhost:1234` | Desktop app with Metal GPU, automatic prompt caching |
| Ollama | `ollama` | `localhost:11434` | Simplest setup, native Metal GPU support |
| MLX-LM | `local-openai` | `localhost:8080` | Apple's ML framework, fastest on M-series chips |
| llama.cpp | `local-openai` | `localhost:8080` | Maximum control, OpenAI-compatible API |
| RAM | Model | Size | Quality | Speed |
|---|---|---|---|---|
| 64GB+ | Llama 3.3 70B (Q4) | 40 GB | Excellent | Slow |
| 32GB | Qwen 2.5 Coder 32B (Q4) | 18 GB | Excellent | Medium |
| 32GB | Codestral 22B | 13 GB | Excellent | Medium |
| 16GB | Qwen 2.5 Coder 14B | 9 GB | Excellent | Fast |
| 16GB | DeepSeek Coder V2 16B | 9 GB | Excellent | Fast |
| 8GB | Llama 3.1 8B | 4.7 GB | Good | Fast |
| 4GB | Qwen 2.5 Coder 3B | 1.9 GB | Basic | Fast |
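The sizes in the table follow a simple rule of thumb: file size ≈ parameters × bits per weight ÷ 8. A sketch for the 14B row, assuming roughly 4.8 bits per weight for a Q4-class quantization (an approximation, not the exact figure for any specific GGUF variant):

```shell
# Estimate quantized model file size: params (billions) × bits/weight ÷ 8.
# 4.8 bits/weight is a rough average for Q4_K-class quants (assumption).
params_b=14
bpw_x10=48                                  # 4.8 bits × 10 for integer math
size_gb_x10=$(( params_b * bpw_x10 / 8 ))   # result in tenths of a GB
echo "~$(( size_gb_x10 / 10 )).$(( size_gb_x10 % 10 )) GB"
```

This lands near the table's 9 GB figure; the gap covers metadata and runtime buffers.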
```
ultraagent models --ram 32   # show recommendations for your RAM
```

When using LM Studio, set these in Settings > Server for best performance:
| Setting | Recommended | Why |
|---|---|---|
| Flash Attention | On (not Auto!) | Reduces memory for long contexts |
| KV Cache Quantization | Q8_0 (16-32GB) / F16 (64GB+) | Halves KV-cache memory, minimal quality loss |
| GPU Offload | Max (all layers) | Full Metal GPU acceleration |
| Prompt Caching | Auto | LM Studio caches automatically |
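The KV-cache row is easy to sanity-check: cache size = 2 (K and V) × layers × context length × KV heads × head dim × bytes per element. A sketch for a hypothetical 14B-class model with 40 layers, 8 KV heads, and head dim 128 (illustrative numbers, not any specific model's config):

```shell
# KV-cache memory at a full 32k context: F16 (2 bytes/elem) vs Q8_0 (~1 byte).
layers=40; ctx=32768; kv_heads=8; head_dim=128
f16_mib=$(( 2 * layers * ctx * kv_heads * head_dim * 2 / 1048576 ))
q8_mib=$((  2 * layers * ctx * kv_heads * head_dim * 1 / 1048576 ))
echo "F16: ${f16_mib} MiB  Q8_0: ${q8_mib} MiB"
```

Q8_0 cuts the cache roughly in half, which is exactly the table's claim.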
`ultraagent local setup` sets these values automatically based on your hardware.
```
ultraagent <command> [options]
```
| Command | Description |
|---|---|
| `new` | Interactive wizard to scaffold a complete new project |
| `scan` | Deep-scan the project into `docs/scan.md` (auto-injected into LLM context) |
| `create` | Create a project from a template |
| `analyze` | Analyze an existing project's structure and codebase |
| Command | Description |
|---|---|
| `chat` | Interactive session with full tool access |
| `run <prompt>` | Execute a single prompt non-interactively |
| `edit` | Edit or refactor existing code |
| `plan <prompt>` | Generate a structured project plan |
| `code <prompt>` | Generate code from a prompt |
| Command | Description |
|---|---|
| `config show` | Display current configuration |
| `config set <key> <value>` | Update a configuration value |
| `config setup` | Setup wizard (redirects to `local setup`) |
| `config reset` | Reset to defaults |
| `models` | Recommended local models for your hardware |
| `local setup` | Auto-detect hardware & set optimal config |
| `local status` | Check local LLM server health and current settings |
| `local pull [model]` | Download a model via Ollama |
| `local list` | List installed Ollama models |
UltraAgent ships with 29 tools that the LLM invokes autonomously during a session. Each tool has an assigned permission level.
| Level | Behavior |
|---|---|
| Safe | Executes immediately — no confirmation needed |
| Confirm | Requires user approval (file writes, commits) |
| Dangerous | Requires approval with highlighted warning (shell commands) |
When prompted: [y]es · [n]o · [a]lways allow · [d]eny always
Core File Tools
| Tool | Permission | Description |
|---|---|---|
| `read_file` | Safe | Read file contents with line numbers, offset, limit |
| `write_file` | Confirm | Create or overwrite files; auto-creates directories |
| `edit_file` | Confirm | Precise string replacement via diff-based editing |
| `bash` | Dangerous | Execute shell commands with timeout and output limits |
| `glob` | Safe | Find files by glob pattern |
| `grep` | Safe | Search file contents with regex and context lines |
Git Tools
| Tool | Permission | Description |
|---|---|---|
| `git_status` | Safe | Working tree status |
| `git_diff` | Safe | Staged and unstaged changes |
| `git_log` | Safe | Recent commit history |
| `git_commit` | Confirm | Create a commit |
| `git_branch` | Confirm | Create, switch, or list branches |
| `git_merge` | Dangerous | Merge branches |
| `git_revert` | Confirm | Revert a commit |
| `git_stash` | Confirm | Stash or pop changes |
| `git_cherry_pick` | Confirm | Cherry-pick commits |
| `git_rebase` | Dangerous | Rebase branches |
GitHub Tools
| Tool | Permission | Description |
|---|---|---|
| `gh_pr` | Confirm | Create, list, merge, review pull requests (requires `gh` CLI) |
| `gh_issue` | Confirm | Create, list, close, comment on issues (requires `gh` CLI) |
Advanced Tools
| Tool | Permission | Description |
|---|---|---|
| `sub_agent` | Confirm | Spawn a sub-agent for independent tasks |
| `sub_agent_batch` | Confirm | Run multiple sub-agents in true parallel |
| `sub_agent_status` | Safe | Check status and results of running sub-agents |
| `memory_save` | Safe | Persist context to memory |
| `memory_search` | Safe | Search persistent memory |
| `memory_delete` | Safe | Delete a memory entry |
| `web_search` | Safe | Search the web |
| `web_fetch` | Safe | Fetch content from a URL |
| `plan_create` | Safe | Create a structured task plan |
| `plan_status` | Safe | Show plan progress |
| `plan_update` | Safe | Update task status |
```
ultraagent new
```

Interactive wizard that scaffolds complete, production-ready projects.

```
# Non-interactive
ultraagent new --name my-app --type fullstack --stack nextjs-prisma-pg --prompt "E-commerce platform"
```

| Option | Description |
|---|---|
| `-n, --name <name>` | Project name |
| `-t, --type <type>` | Project type |
| `-s, --stack <stack>` | Tech stack |
| `-d, --dir <path>` | Parent directory |
| `-p, --prompt <prompt>` | Project description |
Supported Project Types & Stacks
| Type | Stacks |
|---|---|
| webapp | React+Vite, Next.js, Vue 3+Vite, Svelte, Astro |
| fullstack | Next.js+Prisma+PG, React+Express, Vue+FastAPI, T3 Stack |
| api | Express, Fastify, NestJS, FastAPI, Hono, Go+Gin, Rust+Actix |
| cli | TypeScript+Commander, Python+Click, Rust+Clap, Go+Cobra |
| library | TypeScript npm, Python PyPI, Rust Crate |
| mobile | React Native+Expo, React Native Bare |
| desktop | Electron+React, Tauri+React |
| monorepo | Turborepo, Nx, pnpm Workspaces |
| ai-ml | Python+LangChain, Python+PyTorch, TypeScript+LangChain |
| custom | Describe what you need |
Optional Features
- Docker + docker-compose
- CI/CD (GitHub Actions)
- Testing (Unit + Integration)
- Linting + Formatting (ESLint / Prettier)
- Authentication
- Database setup
- API Documentation (OpenAPI / Swagger)
- Environment variables (.env)
- Logging
- Error handling / Monitoring
A complete project with directory structure, config files, typed source code, test setup, README, .gitignore, build scripts, initialized git repo, installed dependencies, and auto-generated docs/scan.md.
```
ultraagent scan                  # scan current project
ultraagent scan --force          # force re-scan
ultraagent scan --dir /path/to/project
```

Creates `docs/scan.md` — a comprehensive project analysis that is automatically injected into every LLM request, giving the model full project context.
What scan.md contains
| Section | Content |
|---|---|
| Overview | Name, type, language, framework, package manager, file count, LOC |
| Git | Branch, remote, last commit |
| Directory Structure | Full tree (4 levels) |
| Key Files | Entry points with roles |
| Dependencies | All deps with versions |
| Scripts | npm scripts with commands |
| Configuration | tsconfig, eslint, prettier, docker, CI/CD, etc. |
| Test Files | All test file paths |
| API Routes | Detected route files |
| Environment Variables | From .env.example |
| Lines of Code | Total and per extension |
| Architecture | AI-generated analysis of patterns and conventions |
Cache: `scan.md` is valid for 24 hours. Use `--force` to regenerate.
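The 24-hour rule can be reproduced with `find -mmin`; a sketch of the idea, under the assumption that the actual check (in `src/commands/scan.ts`) may compare timestamps differently:

```shell
# Demonstrate the 24h (1440-minute) cache rule on a throwaway file.
tmp=$(mktemp -d)
touch "$tmp/scan.md"                         # freshly generated scan
if [ -z "$(find "$tmp/scan.md" -mmin +1440)" ]; then
  status=fresh                               # younger than 24h: reuse it
else
  status=stale                               # older than 24h: regenerate
fi
echo "scan.md is $status"
rm -rf "$tmp"
```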
| Mode | Purpose |
|---|---|
| `chat` | General assistance — full tool access, memory, planning |
| `create` | Project setup — structure, dependencies, config |
| `analyze` | Code review — architecture, patterns, quality |
| `edit` | Refactoring — precise edits, minimal changes |
| `plan` | Task planning — phases, dependencies, risks |
| `code` | Code generation — typed, tested, convention-aware |
Switch modes in-session with `/mode <mode>`.
| Command | Description |
|---|---|
| `/help` | Available commands |
| `/new` | Create a new project |
| `/scan` | Scan the project into `docs/scan.md` |
| `/mode <mode>` | Switch agent mode |
| `/dir <path>` | Change working directory |
| `/clear` | Clear terminal |
| `/status` | Session status and config |
| `/history` | Conversation history |
| `/undo` | Undo last file change |
| `/tokens` | Token usage |
| `/plan` | Current plan progress |
| `/exit` | End session |
Automate actions before or after tool calls. Config lives in `.ultraagent/hooks.json`.
```json
{
  "hooks": [
    {
      "timing": "after",
      "tool": "write_file",
      "command": "npx prettier --write ${file}",
      "enabled": true
    },
    {
      "timing": "after",
      "tool": "edit_file",
      "command": "npm test",
      "enabled": true
    }
  ]
}
```

| Field | Type | Description |
|---|---|---|
| `timing` | `before` \| `after` | When to run |
| `tool` | string | Tool name (`*` for all) |
| `command` | string | Shell command |
| `blocking` | boolean | Abort tool call on hook failure (`before` hooks only) |
| `enabled` | boolean | Active state |
Variables: `${file}` (file path), `${tool}` (tool name)
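The `blocking` field pairs naturally with `before` hooks: if the command exits non-zero, the tool call is aborted. A sketch that gates every file write behind a type check (assumes your project defines a `typecheck` npm script; adjust to taste):

```json
{
  "hooks": [
    {
      "timing": "before",
      "tool": "write_file",
      "command": "npm run typecheck",
      "blocking": true,
      "enabled": true
    }
  ]
}
```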
UltraAgent automatically enriches every LLM request with project context:
| Source | Description |
|---|---|
| Project detection | Language, framework, package manager, scripts |
| `docs/scan.md` | Full project scan |
| Persistent memory | Saved context from prior sessions |
| Active plan | Task plan with status |
| `ULTRAAGENT.md` | Project-specific custom instructions |
| Git status | Branch, changes, recent commits |
Priority: Mode prompt > Project context > scan.md > Memories > Plan > History
Create `ULTRAAGENT.md` in your project root:

```markdown
# Project Instructions

- TypeScript monorepo using pnpm
- Run `pnpm test` after changes
- Use conventional commits
```

Global instructions: `~/.ultraagent/instructions.md`
```
ultraagent local setup                          # auto-detect & configure (recommended)
ultraagent config set provider local-openai
ultraagent config set localModel qwen2.5-coder:14b
ultraagent config show                          # view config
ultraagent config reset                         # reset to defaults
```

Or via environment variables:

```
ULTRAAGENT_PROVIDER=local-openai
ULTRAAGENT_LOCAL_BASE_URL=http://localhost:1234/v1
ULTRAAGENT_LOCAL_MODEL=qwen2.5-coder:14b
ULTRAAGENT_LOCAL_CONTEXT_LENGTH=32768
ULTRAAGENT_LOCAL_TEMPERATURE=0.7
ULTRAAGENT_LOCAL_GPU_LAYERS=-1
ULTRAAGENT_LOCAL_BATCH_SIZE=512
ULTRAAGENT_LOCAL_FLASH_ATTENTION=true
ULTRAAGENT_LOCAL_KV_CACHE_TYPE=q8_0
```

All Config Options
| Option | Default | Description |
|---|---|---|
| `model` | `qwen2.5-coder:14b` | Model identifier |
| `provider` | `local-openai` | `ollama` or `local-openai` |
| `maxTokens` | `8192` | Max output tokens per response |
| `streaming` | `true` | Real-time token streaming |
| `theme` | `default` | `default` · `minimal` · `verbose` |
| `history` | `true` | Keep conversation history |
| `maxSubAgents` | `5` | Max concurrent sub-agents |
| `localBaseUrl` | (auto) | Local LLM server URL (auto-detected per provider) |
| `localModel` | — | Local model name |
| `localGpuLayers` | `-1` | GPU layers (`-1` = all) |
| `localContextLength` | `32768` | Context window size |
| `localTemperature` | `0.7` | Sampling temperature |
| `localBatchSize` | `512` | Inference batch size |
| `localFlashAttention` | `true` | Enable flash attention |
| `localKvCacheType` | `q8_0` | KV-cache quantization (`f16`, `q8_0`, `q4_0`) |
```
src/
├── cli.ts                    # CLI entry point (Commander.js)
├── index.ts                  # Programmatic API exports
├── commands/                 # Command implementations
│   ├── new.ts                # Project creation wizard
│   ├── scan.ts               # Project scanner
│   ├── chat.ts               # Interactive session
│   ├── edit.ts               # Code editing
│   ├── run.ts                # Single prompt execution
│   ├── config-cmd.ts         # Configuration management
│   ├── local.ts              # Local LLM auto-setup
│   └── models.ts             # Model recommendations
├── core/                     # Core engine
│   ├── agent-factory.ts      # Agent loop, streaming, context injection
│   ├── session.ts            # Interactive REPL with slash commands
│   ├── context.ts            # Git context & project instructions
│   ├── context-manager.ts    # Token-aware history management
│   ├── local-llm.ts          # Local LLM provider adapters
│   ├── planner.ts            # Plan creation & task tracking
│   ├── hooks.ts              # Pre/post tool execution hooks
│   ├── token-tracker.ts      # Usage tracking
│   └── undo.ts               # File change rollback
├── tools/                    # Tool system (29 tools)
│   ├── base.ts               # Registry, permission levels, definitions
│   ├── permissions.ts        # Permission manager & session allowlist
│   ├── read.ts / write.ts / edit.ts / bash.ts / glob.ts / grep.ts
│   ├── git.ts                # 10 git tools + 2 GitHub tools (gh_pr, gh_issue)
│   ├── sub-agent.ts          # Parallel sub-agent execution
│   ├── memory.ts             # memory_save, memory_search, memory_delete
│   ├── web.ts                # web_search, web_fetch
│   └── index.ts              # Barrel exports
├── ui/                       # Terminal UI
│   ├── theme.ts              # Colors, gradients, ASCII banner
│   ├── spinner.ts            # Loading spinners
│   └── permission-prompt.ts  # Interactive permission dialog
└── utils/
    ├── config.ts             # Zod-validated config with env support
    ├── hardware.ts           # Apple Silicon hardware detection
    ├── backend-detect.ts     # LLM backend auto-detection
    ├── model-recommender.ts  # RAM-based model recommendation engine
    └── files.ts              # Project detection & analysis
```
| Category | Technology |
|---|---|
| Language | TypeScript 5.7 (ES Modules) |
| AI | LangChain · LangGraph |
| LLM Backends | Ollama · LM Studio · MLX-LM · llama.cpp |
| CLI | Commander.js · Inquirer.js |
| Validation | Zod |
| UI | Chalk · Ora · Boxen · Gradient-string · cli-table3 |
| Testing | Vitest |
```
npm run build        # compile TypeScript -> dist/
npm run dev          # run via tsx (no build)
npm start            # run compiled output
npm test             # run tests
npm run test:watch   # tests in watch mode
npm run clean        # remove dist/
```

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

- Fork the repository
- Create your branch (`git checkout -b feature/my-feature`)
- Commit your changes
- Push to the branch
- Open a Pull Request
- Smart context management — automatically reduces tools and prompt size to fit within model's context window
- Improved output formatting — cleaner responses with structured formatting rules (status icons, tables, tree output)
- 11 new tools — `git_merge`, `git_revert`, `git_stash`, `git_cherry_pick`, `git_rebase`, `gh_pr`, `gh_issue`, `sub_agent_batch`, `sub_agent_status`, `plan_create`/`plan_status`/`plan_update`
- Context overflow detection — clear error message with actionable solutions when context window is too small
- Tool token estimation — dynamic tool binding based on available context budget
- Initial release with 18 tools, agentic loop, sub-agents, memory, project scanner, auto-setup