Connecting Cortex — BuildOnAI Docs

Prerequisites

Run through the Quickstart first — you need consciousness-server up on :3032. Verify with:

curl -s http://localhost:3032/health | jq

If that returns "status": "ok", you're ready. AUTH_MODE=off (the default) is fine for this guide.
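
If you would rather run the same check from a script, here is a minimal Python sketch. It assumes only what the curl command above shows: that /health returns a JSON object with "status": "ok". The function names parse_health and cs_is_ready, and the 2-second timeout, are illustrative and not part of Cortex or consciousness-server.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True when a /health response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def cs_is_ready(base_url: str = "http://localhost:3032", timeout: float = 2.0) -> bool:
    """Fetch <base_url>/health and report whether the server looks ready.

    The timeout value is an assumption made for this sketch.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or an undecodable body.
        return False
```

Calling cs_is_ready() with no arguments checks the default address used in this guide and returns False instead of raising when the server is unreachable.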

Step 1 — Clone Cortex

Cortex lives in its own repository. Clone it anywhere on your machine — it does not need to sit next to consciousness-server.

git clone https://github.com/build-on-ai/cortex.git
cd cortex

Step 2 — Make sure Ollama is ready

Cortex routes prompts through Ollama on the host. Pull at least one model that supports tool calling.

# Cortex needs Ollama on the host. If you don't already have it:
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4:e4b   # 3 GB, runs on CPU
# or
ollama pull gemma4:26b   # 17 GB, needs a GPU
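
To confirm a model was actually pulled, you can ask Ollama's local HTTP API for its model list. The sketch below assumes Ollama is listening on its default address (http://localhost:11434) and uses its /api/tags endpoint. The helper names are illustrative. Note that this only checks that a model is present, not whether it supports tool calling.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response payload."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Return True when the wanted model tag appears in the payload."""
    return wanted in model_names(tags_payload)


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> dict:
    """Fetch the list of locally available models from Ollama."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

For example, has_model(fetch_tags(), "gemma4:e4b") should be True once the first pull command above has finished.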

Step 3 — Run Cortex

./run.sh agent

On first run, run.sh creates a Python virtualenv and installs dependencies (about a minute). Then you should see the banner:

+==========================================+
|  CORTEX  |  gemma4:e4b   |
|  Local AI Agent                          |
+==========================================+
Type /help for commands

  + Policy Engine loaded (10 tools)
  + Recovery Engine (fallback: none)
  + Context Compactor (limit: 16000 tokens)
  + Discovered Consciousness Server at http://localhost:3032
  + Briefing from Consciousness Server loaded

>

The line + Discovered Consciousness Server at http://localhost:3032 means the auto-discovery probe succeeded. Cortex is now registered with Consciousness Server (CS), the briefing has loaded, and tools that hit CS endpoints (notes, tasks, memory) work without further configuration.

Type a message at the > prompt — it goes through the local model. Try /status to see the live picture.

CS on a different host?

Auto-discovery only checks localhost:3032. If your CS lives on another machine on the LAN, set CS_URL explicitly. AGENT_NAME is the identifier other agents see, so give each Cortex instance a unique one if you run several:

CS_URL=http://10.0.0.5:3032 AGENT_NAME=cortex-laptop ./run.sh agent

Three agents coordinating through CS

The promise of the ecosystem isn't one Cortex talking to a memory store — it's many agents sharing state and dropping tasks for each other. The simplest demo: two workers polling for work, plus an interactive operator.

tmux new-session -d -s cortex-demo

# Pane 1 — autonomous worker
tmux send-keys -t cortex-demo \
  "AGENT_NAME=worker-A ./run.sh worker" Enter

# Pane 2 — second worker
tmux split-window -t cortex-demo -h
tmux send-keys -t cortex-demo \
  "AGENT_NAME=worker-B ./run.sh worker" Enter

# Pane 3 — operator (interactive CLI, drops tasks)
tmux split-window -t cortex-demo -v
tmux send-keys -t cortex-demo \
  "AGENT_NAME=operator ./run.sh agent" Enter

tmux attach -t cortex-demo

You'll see three panes. Each worker registers with CS on start, sends a heartbeat every few seconds, and polls /api/tasks/pending/<AGENT_NAME>. The operator is the same Cortex CLI you used in Step 3.

Drop a task and watch it get picked up

# From any terminal — drop a task into the queue.
curl -s -X POST http://localhost:3032/api/tasks \
  -H 'Content-Type: application/json' \
  -d '{
    "title": "summarise the README",
    "description": "Read README.md, write 3-bullet summary as a CS note.",
    "assigned_to": "worker-A"
  }'

# Watch worker-A pick it up — within ~5 s the task status flips
# to "in_progress", and on completion a note appears.
curl -s "http://localhost:3032/api/tasks?assigned_to=worker-A" | jq
curl -s "http://localhost:3032/api/notes?agent=worker-A" | jq

Within a polling cycle (5 s by default) worker-A claims the task, executes it through its local model, and publishes a note with the result. Repeat with assigned_to: "worker-B" to balance work, or assign to operator for a human-in-the-loop step.
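
If you want to drop tasks from a script rather than curl, the same POST can be built with Python's standard library. This sketch mirrors the curl call above (same path, header, and three JSON fields). The function names are illustrative, and treating the response body as JSON is an assumption.

```python
import json
import urllib.request


def build_task_request(
    cs_url: str, title: str, description: str, assigned_to: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/tasks request."""
    payload = {
        "title": title,
        "description": description,
        "assigned_to": assigned_to,
    }
    return urllib.request.Request(
        url=f"{cs_url.rstrip('/')}/api/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes it easy to inspect or log the payload first, and to reuse the builder with assigned_to set to "worker-B" or "operator".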

Verify the fleet

curl -s http://localhost:3032/api/agents | jq
# You'll see worker-A, worker-B, and operator each with a recent
# heartbeat timestamp and "online" status.

This is the same pattern a production fleet uses: add more panes (or more machines, see Multi-machine fleet) and assign tasks to whichever agent has the right model loaded.
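
To turn the fleet check into something scriptable, you could compare the agent list against the names you expect. The response schema of /api/agents is not shown in this guide, so the sketch below assumes each entry is an object with "name" and "status" keys. Both key names are hypothetical, so adjust them to match what jq actually prints.

```python
from typing import Iterable, Mapping


def missing_agents(agents: Iterable[Mapping], expected: Iterable[str]) -> list[str]:
    """Return the expected agent names that are not reported as online.

    Assumes (hypothetically) that each agent entry carries "name" and
    "status" keys, with "online" marking a live agent.
    """
    online = {a.get("name") for a in agents if a.get("status") == "online"}
    return sorted(set(expected) - online)
```

An empty result means every expected agent is online. Any names returned are the ones to investigate.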

Auto-discovery details

On startup, Cortex checks http://localhost:3032/health with a 1-second timeout. If CS responds, that URL becomes CS_URL. If the probe fails for any reason (CS not running, wrong port, network error), CS_URL stays empty and Cortex runs in standalone mode: nothing else degrades, you just don't get CS-backed tools.

An explicit CS_URL env var always wins. To disable the probe, or aim it elsewhere:

# Disable the probe (no CS unless CS_URL is set explicitly)
CORTEX_AUTO_DISCOVER_CS=0 ./run.sh agent

# Or override the probe URL — useful if CS is on a non-default port
CORTEX_AUTO_DISCOVER_URL=http://localhost:13032 ./run.sh agent
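
The precedence described in this section can be summarised in a few lines of Python. This is a sketch of the documented behaviour, not Cortex's source. The environment variable names, default address, and 1-second timeout come from this guide. Appending /health to CORTEX_AUTO_DISCOVER_URL and treating only an HTTP 200 as "CS responds" are assumptions, and the function names are illustrative.

```python
import urllib.request
from typing import Callable, Mapping

DEFAULT_PROBE_URL = "http://localhost:3032"


def probe_health(base_url: str, timeout: float = 1.0) -> bool:
    """Return True when <base_url>/health answers with HTTP 200 in time."""
    try:
        url = f"{base_url.rstrip('/')}/health"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # CS not running, wrong port, network error, or malformed URL.
        return False


def resolve_cs_url(
    env: Mapping[str, str],
    probe: Callable[[str], bool] = probe_health,
) -> str:
    """Resolve CS_URL; an empty string means standalone mode."""
    explicit = env.get("CS_URL", "")
    if explicit:
        return explicit  # an explicit CS_URL always wins
    if env.get("CORTEX_AUTO_DISCOVER_CS", "1") == "0":
        return ""  # probe disabled
    candidate = env.get("CORTEX_AUTO_DISCOVER_URL", DEFAULT_PROBE_URL)
    return candidate if probe(candidate) else ""
```

Passing os.environ as env reproduces the three command lines above. The probe parameter exists only so the logic can be exercised without a live server.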

Going further

Want to write your own agent instead? See Your first agent → for the same five HTTP calls Cortex makes, in 40 lines of Python.
Many machines, not just panes? See Multi-machine fleet →
Ready to leave AUTH_MODE=off? See Secure deployment →
Cortex docs and source? github.com/build-on-ai/cortex
