avacli is a single native binary that gives you an autonomous coding agent with a local web UI, shell access, file tools, and xAI Grok integration. No accounts, no cloud, no vendor lock-in.
One binary. No daemon, no Docker, no cloud account. Install it, add your xAI key, and start building.
Runs on your machine or VPS. Your code and API keys never leave your environment. No cloud dependency, no telemetry.
Reads, writes, and searches code across your entire project. Runs shell commands and plans multi-step tasks on its own.
Uses Grok models via your own xAI API key. You pay xAI directly at their published rates — zero markup, zero middlemen.
Streaming chat with reasoning display, WebIDE, file browser, and settings — all served from the binary on localhost:8080.
File I/O, regex search, shell execution, web & X search, image generation, persistent memory, todos, sub-agents, app databases, services — across three operating modes.
Deploy and manage Vultr cloud servers directly from the UI. Connect multiple nodes into a private P2P network and control them all from one chat.
Download, install, add your xAI key, and go. Zero registration, zero data collection, zero strings attached. MIT licensed.
Spin up cloud servers and link them into a private node mesh — all from the same chat interface you use for coding.
Pick a plan, region, and config profile — avacli provisions a Vultr instance with your agent pre-installed. Monitor balance, bandwidth, and billing in real time.
Every connected node sees every other node. Use @node mentions in chat to send commands to specific machines, or click a node to work directly in its remote UI.
Open any node’s full web interface from your local browser. It’s like sitting in front of the remote machine — edit files, run the agent, manage services — without SSH.
Three deploy profiles out of the box: Agent Only (minimal avacli install), Full LAMP Stack (Apache + MariaDB + PHP + Redis + avacli), or a blank server.
avacli isn’t just a chat interface. It’s a small self-hosted runtime for long-lived processes, per-app storage, and delegated agent work — all sandboxed, all xAI-native.
Host Discord bots, Python workers, or Node daemons as type:"process" services. Auto venv / npm ci bootstrap, exponential-backoff restart policies, live SSE log streams, and vault-backed env. Per-service workdir under ~/.avacli/services/.
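As an illustration of the restart policy, here is a minimal sketch of an exponential-backoff schedule; the base delay and cap below are assumptions for the example, not avacli's actual values:

```javascript
// Illustrative exponential-backoff restart schedule for a crashed service.
// baseMs and capMs are assumed values, not avacli's internal defaults.
function restartDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  // Double the wait on every consecutive failure, up to a cap.
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// attempts 0..3 → 1s, 2s, 4s, 8s; later attempts are clamped to the 60s cap
const schedule = [0, 1, 2, 3].map(a => restartDelayMs(a));
```

The cap keeps a permanently broken service from hammering the supervisor while still letting transient crashes recover quickly.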
Every agent-generated app gets its own authenticated database at ~/.avacli/app_data/<slug>.db, sandboxed with sqlite3_set_authorizer. Auto-injected _sdk.js exposes window.avacli.db, .main, and .ai — zero plumbing, no CORS, no secrets leaking to the browser.
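A sketch of what calling the injected SDK might look like from inside an app page. Only the window.avacli.db namespace comes from the description above; the method names (exec, query) are assumptions, backed here by an in-memory stub so the sketch runs outside the browser too:

```javascript
// Hypothetical usage of the auto-injected _sdk.js from an agent-generated app.
// exec/query are assumed method names; the stub below stands in for the real
// sandboxed per-app database when no browser window is present.
const win = globalThis.window ?? (() => {
  const stored = [];
  return {
    avacli: {
      db: {
        exec: async (_sql, params) => { stored.push({ body: params[0] }); },
        query: async _sql => stored.slice(),
      },
    },
  };
})();

async function saveNote(text) {
  await win.avacli.db.exec("INSERT INTO notes (body) VALUES (?)", [text]);
}

async function listNotes() {
  const rows = await win.avacli.db.query("SELECT body FROM notes");
  return rows.map(r => r.body);
}
```

The point of the shape: the page talks only to its own sandboxed database through the SDK, so no connection string, key, or secret ever reaches the browser.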
Delegate bounded work with spawn_subagent. Each child runs on its own thread with allowed_paths / allowed_tools whitelists and write leases so siblings can’t collide. Default-deny (max_depth=0) until you raise it in Settings; xAI-only by design.
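A hypothetical argument shape for spawn_subagent — the allowed_paths and allowed_tools field names come from the description above; the task text and tool names are illustrative:

```javascript
// Assumed spawn_subagent argument shape; only allowed_paths / allowed_tools
// are documented field names, the values here are made up for the example.
const childSpec = {
  task: "run the test suite and summarize any failures",
  allowed_paths: ["src/", "tests/"],          // whitelist; everything else denied
  allowed_tools: ["read_file", "shell_exec"], // bounded toolset for the child
};
```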
v2.3 re-engineers every request around the xAI prompt cache and adds a Batch API surface. Interactive chat reaches 70–90% cache hits from turn 2 onward; offline jobs stack a flat 50% batch discount on top.
The dynamic context (notes, todos, memory, edit history) is split into a separate system message after the stable prompt — so the prefix is identical across turns and users. Typical sessions go from ~0% cache hits to 70–90% from turn 2 onward.
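The layout described above can be sketched as follows; the message contents are illustrative, the structure (stable prompt first, mutable context in a second system message) is the documented idea:

```javascript
// Cache-friendly prompt layout: the stable system prompt is byte-identical
// on every turn; all mutable context (notes, todos, memory, edit history)
// lives in a second system message after it.
const STABLE_SYSTEM = "You are avacli, an autonomous coding agent."; // never changes

function buildMessages(dynamicContext, history, userMsg) {
  return [
    { role: "system", content: STABLE_SYSTEM },  // shared cacheable prefix
    { role: "system", content: dynamicContext }, // changes freely per turn
    ...history,
    { role: "user", content: userMsg },
  ];
}

const turn1 = buildMessages("todos: []", [], "hi");
const turn2 = buildMessages("todos: [fix login bug]", [], "continue");
// turn1[0] and turn2[0] are identical, so the provider's prefix cache can
// hit even though the dynamic block changed between turns.
```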
conv_id routing: a new X-Xai-Conv-Id header on chat completions plus a conv_id body field on the Responses API pin cache routing to the same shard for a conversation's lifetime. Conv IDs are minted on session create, persisted, and inherited by sub-agents so tool-heavy sessions stay warm.
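A minimal sketch of pinning cache routing with the persisted conversation ID. The X-Xai-Conv-Id header name is from the notes above; the surrounding request shape is an assumption:

```javascript
// Builds request options for a chat-completions call that pins cache routing.
// Only the X-Xai-Conv-Id header name is documented; the rest is illustrative.
function chatRequestInit(apiKey, convId, body) {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      "X-Xai-Conv-Id": convId, // same ID for the session's whole lifetime
    },
    body: JSON.stringify(body),
  };
}
```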
Every response logs [cache] prompt=X cached=Y ratio=Z%. cached_tokens, reasoning_tokens, billable_prompt_tokens, and cache_hit_rate are exposed on SSE token_usage frames, the headless JSON summary, and the /usage dashboard.
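The log line format above is fixed; this small helper just shows how the ratio is derived from the raw usage numbers:

```javascript
// Reproduces the documented "[cache] prompt=X cached=Y ratio=Z%" log line.
// The rounding choice is an assumption for the example.
function cacheLogLine(promptTokens, cachedTokens) {
  const ratio = promptTokens
    ? Math.round((cachedTokens / promptTokens) * 100)
    : 0;
  return `[cache] prompt=${promptTokens} cached=${cachedTokens} ratio=${ratio}%`;
}
```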
The Responses API path now sends only new items (tool results, next user message) plus previous_response_id — xAI rebuilds history server-side from its 30-day retention. Transparent full-resend fallback on stale-ID errors; persisted in session state.
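The delta-sending strategy can be sketched like this; the previous_response_id and input field names follow the common Responses-API shape and are assumptions here, as is the fallback trigger:

```javascript
// Send only new items plus previous_response_id when an ID exists; on the
// first turn, or after the server rejects a stale/expired ID, fall back to
// resending the full history. Field names are assumed, not verified.
function buildPayload(prevResponseId, newItems, fullHistory) {
  if (prevResponseId) {
    return { previous_response_id: prevResponseId, input: newItems };
  }
  return { input: fullHistory }; // transparent full-resend fallback
}
```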
Chat-completions reasoning models stream reasoning_content through a dedicated callback; it's stored per-message and echoed back byte-identically on replay so the thinking stays inside the cached prefix. No more paying twice for chain-of-thought.
Flat 50% discount on /v1/messages/batches that stacks with prompt caching for up to ~20× lower effective cost on offline workloads. New batches table, 30s background poller, REST surface under /api/batches/*, and five agent tools starting with batch_submit.
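Illustrative cost math for stacking the two discounts. The flat 50% batch discount is stated above; the 95% cache-hit rate and the cached-token price ratio (5% of the base rate) are assumptions chosen for the example:

```javascript
// Effective per-token cost multiplier when prompt caching and the batch
// discount stack. hitRate and cachedPriceRatio are assumed example values.
function effectiveCostMultiplier(hitRate, cachedPriceRatio, batchDiscount) {
  const promptCost = (1 - hitRate) + hitRate * cachedPriceRatio;
  return promptCost * (1 - batchDiscount);
}

// 95% hits, cached tokens at 5% price, 50% batch discount:
// (0.05 + 0.95 * 0.05) * 0.5 = 0.04875 → roughly 20× cheaper than list price
const m = effectiveCostMultiplier(0.95, 0.05, 0.5);
```

The exact multiplier depends on the real cache-hit rate and cached-token pricing of the workload; this just shows how the two discounts compound multiplicatively.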
Three steps. No account creation, no onboarding wizard, no credit card.
Grab a binary from the downloads page, or build from source using the docs.
Get a free API key from console.x.ai and add it in Settings once the UI is running.
Launch the web UI, open it in your browser, and start building with your autonomous AI agent.
We believe AI tools should be free, transparent, and self-hosted. avacli gives you the full power of an autonomous agent without sending your code to a third party. You bring your own API key, you keep your data, and you run it wherever you want — laptop, desktop, or cloud VPS. Deploy nodes, link them together, and orchestrate your entire infrastructure from one chat.