ECOSYSTEM

Local-first multi-agent AI coordination for teams who can't send data to the cloud.

A self-hosted ecosystem of four services: Consciousness Server for shared memory, Cortex for local AI agents, machines-server for infrastructure awareness, and a key server for ed25519 auth. All dual-licensed AGPL-3.0-only plus commercial.

In 30 seconds.

BuildOnAI is a shared-memory system that gives multiple AI agents one common awareness — across different machines, in different locations, even on different networks joined over VPN.

You install four services on your own hardware with a single Docker Compose file. Your AI agents share one memory, search the same documents, and see each other through one HTTP API.

Which agents you run is your choice. Cortex on Ollama stays fully local: no model traffic leaves the host. If you also run Claude Code or Codex, or wire Cortex's Recovery Engine to Anthropic or OpenAI, those agents read from the same shared memory, so be deliberate about what goes into the shared store when a cloud agent is in the loop. The ecosystem doesn't make that choice for you; it makes both options possible on the same infrastructure.

Useful when you build local AI tooling, when your organisation has data that must not leave the box for compliance reasons, or when you run a homelab with a GPU and don't want to assemble Vault + Postgres + ChromaDB + a memory framework yourself.

AGPL-3.0-only on every repository. Commercial licence available.

What ships, what's preview, what's in progress.

We label every component so you know what you can rely on today, what you can try with care, and what is still being designed.

Stable

running in production
  • Consciousness Server v1.1.0 — shared memory, semantic search, agent and skill registry, chat, machine awareness. The backbone six HTTP services agree on. Continuously used in real-world workloads by the author since mid-2025; public since April 2026.
  • Cortex v1.0.8 — local AI agent with CLI, Web UI, worker mode. Multi-model orchestration via Ollama. Policy Engine refuses dangerous tool calls regardless of who asked. 18 audit rounds, 57 tests green.
  • BuildOnAI Key Server v2.0.0 — ed25519 signature-per-request auth + on-disk vault for SSH keys and API tokens. Three AUTH_MODE values for safe migration from unsigned to signed. Single instance handles auth for the whole fleet.
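
A minimal sketch of how a three-step AUTH_MODE migration from unsigned to signed requests could behave. The mode names (off, observe, enforce) and the decide function are assumptions made for illustration; the text above only states that three AUTH_MODE values exist, and the ed25519 verification itself is reduced to a boolean input here.

```python
from enum import Enum
from typing import Tuple


class AuthMode(Enum):
    # Hypothetical names: the Key Server's real AUTH_MODE values may differ.
    OFF = "off"          # signatures are not checked at all
    OBSERVE = "observe"  # signatures are checked, failures are logged but allowed
    ENFORCE = "enforce"  # requests without a valid signature are rejected


def decide(mode: AuthMode, signature_valid: bool) -> Tuple[bool, bool]:
    """Return (allow_request, log_warning) for a single incoming request."""
    if mode is AuthMode.OFF:
        return True, False
    if mode is AuthMode.OBSERVE:
        # Let traffic through, but surface every client that is not signing yet.
        return True, not signature_valid
    # ENFORCE: only correctly signed requests pass.
    return signature_valid, not signature_valid


for mode in AuthMode:
    allow, warn = decide(mode, signature_valid=False)
    print(f"{mode.value}: allow={allow} warn={warn}")
```

The middle mode is what makes such a migration safe: every unsigned client shows up in the logs before enforcement is switched on.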

Public preview

works, schema may shift
  • Document Processor (pre-1.0) — desktop app (Tauri + Rust + Svelte) for PDF / DOCX / TXT with image-and-context extraction. For documents that must not leave the host. Storage schema being frozen before 1.0.

In active refinement

running internally, repos not yet public
  • Notes — local note-taking with semantic search built into the editor. Used daily by the author. Being moved onto the unified Tauri / Svelte / Rust stack shared with Document Processor.
  • Inbox — one place where agents drop questions and humans answer — chat, mail, and tasks merged into a single triage queue. Rebuild in progress so it is built on the same CS APIs as the rest.
  • Vox — voice surface shared by Notes, Inbox, and Cortex. Local speech-to-text and voice command layer. Dictate a note in Notes, answer an agent in Inbox by voice, command Cortex from the workshop without touching a keyboard.

Planned

design phase, not built yet
  • Machine orchestration beyond compute. Today the platform sees every host in your fleet — hardware profile, available models, live telemetry. Tomorrow it controls them: 3D printers, inverters, edge nodes (small servers running close to the equipment they serve, e.g. next to a printer in the workshop), each treated as a first-class entity — not interchangeable compute slots.
  • BuildOnAI Mesh. Local search and knowledge layer: SearXNG (self-hosted meta-search that queries Google/Bing/DuckDuckGo on your behalf without revealing who asked) + nomic-embed-text + entity relations. Gives every agent a "search the web + recall my own knowledge" tool without tying its queries to you at the search engine.

How the pieces fit.

Agents talk HTTP to Consciousness Server (CS). CS orchestrates Redis (working state), ChromaDB (semantic search via Ollama embeddings), and three more services for machine awareness, auth, and execution.

What BuildOnAI brings to the table.

Four capabilities bundled into one stack:

  • One memory, one auth layer. Agents from different frameworks (Cortex, Claude Code, your scripts) talk to the same notes / tasks / chat store — not N parallel memory implementations that drift apart over time.
  • Machine awareness as a first-class concept. The platform knows each host's hardware profile (GPU, free RAM, locally installed models) and routes tasks accordingly — a job that needs a 26B model lands on the workstation with the RTX, not on a Raspberry Pi. Each machine is a known entity with a profile, not anonymous compute.
  • Structural security via Cortex Policy Engine. Dangerous tool calls refused regardless of who asked — human, model, or prompt injection. CI-enforced invariants, not runtime checks you can switch off.
  • Honest project status (see above). No "production-ready" marketing on top of a pre-1.0 component.

BuildOnAI is built for several agents on several machines sharing state under local-first constraints. If that's your situation, this is for you.

What "the platform knows the hardware" actually means.

In a heterogeneous fleet (a workstation with a 24 GB GPU, a mini-PC in the workshop, a five-year-old laptop without graphics), picking the right machine for each job matters. Route a 26B-parameter model to a Raspberry Pi and the job never returns. Route it to the RTX workstation and it finishes in seconds.

BuildOnAI handles that routing. Every machine in your fleet carries a profile in machines-server: GPU model, free memory, locally installed Ollama models, attached peripherals (3D printer, camera, inverter). The router reads that profile and picks the host that fits each task.

Three example hosts in a typical setup:

  • a workstation with an RTX 4090, 64 GB RAM and qwen3:30b installed locally,
  • a mini-PC in the workshop with an ARM CPU and a USB camera attached,
  • a five-year-old laptop with 16 GB RAM and no GPU at all.

For each, BuildOnAI knows what it can do — a heavy model lands on the RTX, a vision task on the mini-PC with the camera, a light background job on the laptop. The agent never has to guess where to run.
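
The routing described above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual router: MachineProfile, Task, fits, and pick_host are hypothetical names, the mini-PC's RAM size and the 20 GB VRAM requirement are made-up figures, and the tie-break rule (prefer the least specialised host that fits) is one plausible policy among several.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class MachineProfile:
    # Hypothetical shape; the real machines-server schema may differ.
    name: str
    gpu_vram_gb: int = 0                       # 0 means no GPU
    ram_gb: int = 0
    models: FrozenSet[str] = frozenset()       # locally installed Ollama models
    peripherals: FrozenSet[str] = frozenset()  # e.g. "camera", "3d-printer"


@dataclass(frozen=True)
class Task:
    name: str
    model: Optional[str] = None       # model that must already be installed
    min_vram_gb: int = 0              # assumed requirement, set by the caller
    peripheral: Optional[str] = None  # device that must be attached


def fits(machine: MachineProfile, task: Task) -> bool:
    """True when the machine satisfies every requirement of the task."""
    if machine.gpu_vram_gb < task.min_vram_gb:
        return False
    if task.model is not None and task.model not in machine.models:
        return False
    if task.peripheral is not None and task.peripheral not in machine.peripherals:
        return False
    return True


def pick_host(fleet: List[MachineProfile], task: Task) -> Optional[MachineProfile]:
    """Pick the least specialised machine that fits, keeping scarce hardware free."""
    candidates = [m for m in fleet if fits(m, task)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: (m.gpu_vram_gb, len(m.peripherals), len(m.models)))


FLEET = [
    MachineProfile("workstation", gpu_vram_gb=24, ram_gb=64, models=frozenset({"qwen3:30b"})),
    MachineProfile("workshop-mini-pc", ram_gb=8, peripherals=frozenset({"camera"})),  # RAM size assumed
    MachineProfile("old-laptop", ram_gb=16),
]

heavy = Task("summarise-archive", model="qwen3:30b", min_vram_gb=20)
vision = Task("inspect-print", peripheral="camera")
light = Task("rotate-logs")

for task in (heavy, vision, light):
    host = pick_host(FLEET, task)
    print(task.name, "->", host.name if host else "no host fits")
```

With this fleet the heavy job goes to the workstation, the vision job to the mini-PC with the camera, and the light job to the laptop, matching the three cases above. A task nobody can serve returns None rather than landing on an unsuitable host.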

What people use this for.

Multi-agent dev lab

Ships

One developer, several AI processes (Cortex CLI, Cortex worker, Claude Code in tmux). All log to the same Consciousness Server. Shared memory between sessions; coordination via chat + broadcast.

Cortex · Consciousness Server · semantic-search · machines-server

Regulated-industry RAG

Tested

Law office or research lab ingests its archive locally with Document Processor; agents query it with Cortex. Audit log via Key Server. Documents never leave the host.

Document Processor · Consciousness Server · Cortex · Key Server

Heterogeneous fleet

Ships

Workstation with GPU runs Cortex with a 26B model; CPU host runs the same Cortex with a 4B model; both share state via Consciousness Server. Tasks route to whichever node has headroom.

Cortex × N · Consciousness Server · machines-server

Regulated industries

Law offices, medical research, clinical labs, public-sector operators, finance. Data sovereignty is a regulatory requirement, not a preference.

Engineering & specialised manufacturing

R&D teams, design firms, any company protecting proprietary work. Anywhere the competitive edge is a trade secret rather than a patent — because filing a patent publishes the know-how to anyone watching. Internal documents stay internal.

Security-conscious homelabs and self-hosters

Engineers running their own GPU + storage who want a working multi-agent setup without standing up Vault, Postgres, ChromaDB, and a memory framework themselves.

Open-source contributors

Developers working on local-first AI tooling. AGPL-3.0-only protects every fork from being absorbed into a closed product.

PLATFORM

Machines aren't just servers.

Most agent platforms model agents but ignore the hardware they run on. BuildOnAI models both. Every machine in your fleet (workstation with GPU, app host, edge node, eventually a 3D printer or an inverter) is a first-class entity with a hardware profile, available models, and live telemetry. Today it routes agents; tomorrow it orchestrates the shop floor.

01

Local-first by design

Every service runs on hardware you control. No cloud calls by default, no phone-home telemetry, no vendor lock-in. The fallback to a hosted LLM is opt-in via API key; without one the ecosystem stays fully offline.

02

Honest about what we are

Not enterprise-deployed. Not battle-tested against a determined attacker. Continuously used in real-world workloads by the author since mid-2025; public since April 2026. We label what ships, what's tested, and what's planned — and don't dress one as the other.

03

Structural security, not posture

Cortex's Policy Engine refuses dangerous tool calls regardless of who or what asked — human, model, or prompt injection. Security invariants are CI-enforced; disabling them voids commercial-licence guarantees.
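
To make "regardless of who or what asked" concrete, here is a toy sketch of a requester-blind policy check. It is not Cortex's Policy Engine: the ToolCall shape, the is_allowed function, and the deny list are all hypothetical, and naive prefix matching is far weaker than a real policy. The point is only that the decision never reads the requester field.

```python
from dataclasses import dataclass

# Hypothetical deny rules, chosen for illustration only.
DENIED_COMMAND_PREFIXES = ("rm -rf /", "mkfs", "dd if=")


@dataclass(frozen=True)
class ToolCall:
    tool: str          # e.g. "shell" or "read_file"
    argument: str      # the command or path passed to the tool
    requested_by: str  # kept for the audit trail, never consulted by the policy


def is_allowed(call: ToolCall) -> bool:
    """Decide from the call itself; the requester is deliberately ignored."""
    if call.tool != "shell":
        return True
    command = call.argument.strip()
    # str.startswith accepts a tuple and matches any of the prefixes.
    return not command.startswith(DENIED_COMMAND_PREFIXES)


for requester in ("human", "model", "injected-prompt"):
    call = ToolCall("shell", "rm -rf / --no-preserve-root", requester)
    print(requester, "->", "allowed" if is_allowed(call) else "refused")
```

All three requesters get the same refusal, because the requester is recorded but never used as an input to the decision.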

Which licence do you need?

AGPL-3.0-only — free

Free

Right choice if you're publishing your own modifications, running it for yourself, doing open-source research, or building something you intend to release as AGPL. The catch: AGPL extends to network services — if you offer a modified BuildOnAI as a service, you must publish your modifications.

Commercial licence

Paid

Right choice if you want to embed BuildOnAI in a closed-source product, run it as a SaaS without open-sourcing your modifications, or your organisation can't accept AGPL's network-service obligation. Pricing is bespoke (project size, support level, deployment model). Email [email protected] with a short description of the use case for a quote.