curl -fsSL https://dominic.computer/nanna-install.sh | bash
Nanna is a coding agent for coding agents. Agents delegate tasks to subagents to parallelize work and to keep the main agent (the orchestrator) lean.
Nanna lets you host subagents locally, allowing you to move some of the orchestrator's work onto different machines. Nanna's subagents can run dev containers on your own machine, or in the same dev environment as the orchestrator. You can provide the orchestrator with local small language models (SLMs) that you host on your machine, or with other models hosted through an AI Gateway. Nanna is built for cloud-based agents, but also works with local coding agents like Claude Code CLI.
Delegation out of the orchestrator's environment allows you to use your available hardware and choose the right models to expose to your orchestrators, saving electricity and money.
Nanna provides an interface (CLI, MCP) so that agents can delegate work to self-hosted models.
Nanna can run inside background agents' environments, or locally. The orchestrator gets six MCP tools: assign_task, poll_task, get_result, list_tasks, cancel_task, onboard_repo. The small API is designed to be token-efficient and unambiguous for orchestrators.
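As a rough illustration of how an orchestrator might chain these tools, here is a minimal Python sketch of an assign, poll, fetch loop. Only the tool names come from the list above. The argument names (`prompt`, `task_id`), the status values (`running`, `done`, `failed`), the result field (`output`), and the `call_tool` callable are all assumptions made for the example, not Nanna's documented schema. A fake in-memory backend stands in for a real MCP client so the sketch runs on its own.

```python
import time
from typing import Any, Callable, Dict

# Tool names are taken from the text; every argument and result shape below
# is a hypothetical stand-in, not Nanna's actual schema.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_delegated_task(
    call_tool: ToolCall,
    prompt: str,
    poll_interval: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Assign a task, poll until it settles, then fetch its result."""
    task_id = call_tool("assign_task", {"prompt": prompt})["task_id"]
    for _ in range(max_polls):
        status = call_tool("poll_task", {"task_id": task_id})["status"]
        if status == "done":
            return call_tool("get_result", {"task_id": task_id})["output"]
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(poll_interval)
    # Give up: cancel the task so the worker is not left running.
    call_tool("cancel_task", {"task_id": task_id})
    raise TimeoutError(f"task {task_id} did not finish in {max_polls} polls")


def make_fake_nanna(polls_until_done: int = 2) -> ToolCall:
    """In-memory stand-in for an MCP client, used only for this demo."""
    state = {"polls": 0, "cancelled": False}

    def call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name == "assign_task":
            return {"task_id": "task-1"}
        if name == "poll_task":
            state["polls"] += 1
            done = state["polls"] >= polls_until_done
            return {"status": "done" if done else "running"}
        if name == "get_result":
            return {"output": "patch applied"}
        if name == "cancel_task":
            state["cancelled"] = True
            return {"cancelled": True}
        raise ValueError(f"unknown tool: {name}")

    return call_tool


if __name__ == "__main__":
    fake = make_fake_nanna()
    print(run_delegated_task(fake, "fix the failing test", poll_interval=0))
```

In a real setup, `call_tool` would wrap whatever MCP client the orchestrator already uses; the loop itself only depends on the three tools it names plus `cancel_task` for cleanup.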
The orchestrator is responsible for global task completion. If the device or platform Nanna is running on fails, the orchestrator's tool calls to Nanna will fail. However, the orchestrator still has subagents it can fall back to. Nanna can thus exploit unreliable resources without compromising the reliability of the orchestrator, and can scale with resource availability, such as electrical grid surplus or unused personal devices.
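The fallback behaviour described above can be sketched as a small wrapper: try the Nanna-backed path first, and hand the task to the orchestrator's other subagents if the call fails. This is a sketch under stated assumptions, not Nanna's implementation. The function names and the choice of which exception types count as "Nanna is unavailable" are hypothetical.

```python
from typing import Callable, Tuple

# Hypothetical: which errors signal that the Nanna host is unreachable.
NANNA_UNAVAILABLE = (ConnectionError, TimeoutError, OSError)


def delegate_with_fallback(
    task: str,
    via_nanna: Callable[[str], str],
    via_fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (route, result), preferring Nanna and falling back on failure."""
    try:
        return "nanna", via_nanna(task)
    except NANNA_UNAVAILABLE:
        # The Nanna host may be asleep, offline, or out of capacity;
        # the orchestrator keeps the task moving with its own subagents.
        return "fallback", via_fallback(task)


def offline_nanna(task: str) -> str:
    """Simulates a laptop that has gone offline."""
    raise ConnectionError("nanna host unreachable")


def builtin_subagent(task: str) -> str:
    """Stand-in for the orchestrator's built-in subagent."""
    return f"handled by built-in subagent: {task}"


if __name__ == "__main__":
    print(delegate_with_fallback("rename module", offline_nanna, builtin_subagent))
```

Keeping the fallback decision in the orchestrator, rather than in Nanna, matches the division of responsibility in the text: Nanna is allowed to fail, and the orchestrator owns completion.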
SLMs use 10x to 100x less electricity than LLMs and can operate on-device. Self-hosting them brings the marginal cost to the price of electricity (1-2 orders of magnitude cheaper than token metering).
The size of the smallest models able to do meaningful work shrank at an incredible rate from 2022 through 2024, before stabilizing through 2025. Since then, the focus has shifted to specializing models for specific tasks, where they can perform as well as or better than general models with 100x their resource footprint.
Providers do route tasks to smaller models, but they are generally limited to their own models and expose few levers. Nanna lets you bring any model to a task, alongside a dev environment that's preconfigured for your specific work.
Aside from using less electricity, on-device work is distributed over the existing power grid. Globally, datacenters are difficult to integrate into existing grids, and rely partially on natural gas.