Context Lattice
By Private Memory Corp
Guide 1

Installation Guide

Launch Context Lattice in Lite or Full mode with verified health checks and explicit startup commands.

Quick Path

Fastest clean install route

  1. Less technical macOS users: download and run the DMG bootstrap launcher.
  2. Less technical Linux users: download and run the Linux bootstrap bundle installer.
  3. Run gmake quickstart.
  4. Choose launch mode: gmake mem-up-lite or gmake mem-up-full.
  5. Validate /health and authenticated /status.
  6. If needed, move high-growth data paths to external NVMe before heavy ingest.

All details remain below. Use the on-page nav to jump directly to any section.

Default launch command

gmake mem-up is the standard stack launch entrypoint

Required

gmake mem-up uses the active COMPOSE_PROFILES value from .env (or core if unset).

  • Force full mode: gmake mem-up-full
  • Force core mode: gmake mem-up-core
  • Lite compose stack: gmake mem-up-lite
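Under the hood, profile selection amounts to reading COMPOSE_PROFILES from .env and falling back to core when it is unset. A minimal sketch of that resolution (the throwaway .env here is created only for the demo; the real Makefile logic may differ):

```shell
# Sketch of how the launch profile is resolved (illustrative only).
# A temporary .env is created so the snippet is self-contained.
cd "$(mktemp -d)"
printf 'COMPOSE_PROFILES=full\n' > .env
profile="$(grep -E '^COMPOSE_PROFILES=' .env | cut -d= -f2- || true)"
echo "resolved profile: ${profile:-core}"   # prints: resolved profile: full
```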

Foolproof start

Use gmake quickstart on first install

Required

This command creates missing env wiring, runs secure bootstrap, starts services, and validates health/auth so you do not get stuck on early 401 errors.

gmake quickstart

Easy monitoring

Use built-in open/check commands

Recommended

gmake monitor-open
# health + status + fanout checks only:
gmake monitor-check
  • Dashboard URL: http://127.0.0.1:3000 (default local)
  • Health URL: http://127.0.0.1:8075/health
  • Status URL: http://127.0.0.1:8075/status (requires x-api-key)
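If you want a scriptable version of these checks, a small polling helper can wrap the health URL until the stack comes up. This is a sketch only; the helper name, retry count, and one-second interval are arbitrary choices, not part of the toolkit:

```shell
# Poll a URL until it answers, up to `tries` attempts one second apart.
# (wait_healthy is an ad-hoc helper, not a Context Lattice command.)
wait_healthy() {
  url="$1"; tries="${2:-30}"; i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy: $url"
      return 0
    fi
    i=$((i + 1)); sleep 1
  done
  echo "unhealthy after $tries attempts: $url" >&2
  return 1
}

# wait_healthy http://127.0.0.1:8075/health
```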

Distribution note

Homebrew can be added, but Docker + gmake quickstart stays canonical

Advanced

Context Lattice is a multi-service stack, so Homebrew is best used as a convenience wrapper (bootstrap scripts, templates, health checks), not as the primary runtime package manager.

  • Recommended today: clone repo + run gmake quickstart.
  • Homebrew fit: optional installer for CLI helpers and launch diagnostics.
  • Runtime truth: Docker Compose remains the source of truth for actual services.

Auth note

Secure default means some endpoints require API key

Required

Use /health unauthenticated. Use x-api-key for /status, /memory/*, and /telemetry/*.

Resource requirements

Size your machine by lane and deployment mode

Required
  • Public v3.3.x Hugging Face / Glama lite: 2-4 vCPU, 4-8 GB RAM, 20-50 GB SSD
  • Public v3.3.x local Lite: 2-4 vCPU, 8-12 GB RAM, 25-80 GB SSD
  • Public v3.3.x local Full (no spike-lab): 6-8 vCPU, 12-20 GB RAM, 100-180 GB SSD
  • Public v3.3.x local Full + spike-lab: 8-12 vCPU, 24-32 GB RAM, 180-300 GB SSD/NVMe
  • Public-paid / private v4 local premium: 8-12 vCPU, 24-48 GB RAM, 250 GB-1 TB SSD/NVMe
  • Private v4 hosted baseline: 16+ vCPU host, 64+ GB RAM, GPU lane, 1-2 TB NVMe for retrieval indexes/snapshots
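To compare a host against these lane targets, vCPU and RAM can be read portably from the shell. The detection logic below is a macOS/Linux sketch, not a supported tool; which lane the numbers qualify for is your call:

```shell
# Report host vCPUs and RAM so you can match a lane above.
# (Portable sketch: /proc/meminfo on Linux, sysctl on macOS.)
cpus="$(getconf _NPROCESSORS_ONLN)"
if [ -r /proc/meminfo ]; then
  mem_gb="$(awk '/^MemTotal:/ {printf "%d", $2/1024/1024}' /proc/meminfo)"
else
  mem_gb="$(( $(sysctl -n hw.memsize) / 1073741824 ))"
fi
echo "host: ${cpus} vCPU, ${mem_gb} GiB RAM"
```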

Storage layout

Hot local code + cold external data (recommended)

Recommended

Keep active project source local for speed, and map high-growth service data paths to external NVMe before heavy ingest.

# keep source local
~/Documents/projects

# move service data to external SSD
/Volumes/<external-ssd>/contextlattice/qdrant
/Volumes/<external-ssd>/contextlattice/mongo
/Volumes/<external-ssd>/contextlattice/mindsdb
/Volumes/<external-ssd>/contextlattice/letta
/Volumes/<external-ssd>/contextlattice/orchestrator
  • Why: avoids laptop internal SSD exhaustion during fanout + rehydrate windows.
  • When: set mount paths before first full-mode launch.
  • Check: verify free space before running rehydrate/backfill jobs.
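The free-space check above can be run directly from the shell. The mount path here is an example; substitute the external volume you actually mapped:

```shell
# Print free space (GiB) on a data mount before rehydrate/backfill.
# (MOUNT is illustrative; point it at your external volume.)
MOUNT="${MOUNT:-/}"
free_gb="$(df -Pk "$MOUNT" | awk 'NR==2 {printf "%d", $4/1024/1024}')"
echo "free on ${MOUNT}: ${free_gb} GiB"
```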

Flexible Launch

Choose your install mode

Required

Hugging Face / Glama lite

Single-container lane for lowest operational overhead and maximum compatibility.

  • App version lane: Public v3.3.x
  • Services: Gateway + orchestrator in one container, topic rollups retrieval lane
  • Retrieval lanes: topic_rollups default, async continuation where configured
  • Resource target: 2-4 vCPU, 4-8 GB RAM, 20-50 GB SSD
  • Startup path: Dockerfile.hf-lite on port 7860

Lite installation

Runs the core memory services only. Best for laptop-first or resource-constrained local deployments.

  • App version lane: Public v3.3.x
  • Services: Gateway-Go frontdoor, orchestrator core, Memory Bank MCP, Mongo raw, Qdrant, MCP hub
  • Retrieval lanes: fast staged sources prioritize topic_rollups + qdrant + postgres_pgvector
  • Resource target: 2-4 vCPU, 8-12 GB RAM, 25-80 GB SSD
  • Startup command: gmake mem-up-lite

Full installation

Enables full retrieval fabric and operations stack including analytics, observability, and deeper RAG support.

  • App version lane: Public v3.3.x Full, and baseline for private v4 tuning
  • Services: Lite stack plus MindsDB, Letta, and observability profile
  • Deep continuation lane: async coverage from mindsdb + mongo_raw + letta + memory_bank
  • Resource target: 6-8 vCPU, 12-20 GB RAM, 100-180 GB SSD (without spike-lab)
  • If spike-lab adapters are active: 8-12 vCPU, 24-32 GB RAM, 180-300 GB SSD/NVMe
  • Startup command: gmake mem-up-full (or gmake mem-mode-full then gmake mem-up)

Prerequisites

Before first launch

Required
  • Container runtime: a Compose v2-compatible engine (docker compose), such as Docker Desktop, Docker Engine, or another runtime with Compose v2 support
  • Supported host environments: macOS, Linux, or Windows (WSL2)
  • Host machine sized for selected profile (lite vs full) with enough CPU, RAM, and disk
  • Recommended guardrail: keep at least 40 GB free at storage-governance root to avoid pressure-band throttling
  • CLI tools on PATH: gmake, jq, rg, python3, and curl
  • Tested baseline: macOS 13+ with Docker Desktop
  • Project root initialized with .env and compose env symlink
cp .env.example .env
ln -svf ../../.env infra/compose/.env

# foolproof first-run bootstrap (generates API key + runs smoke)
BOOTSTRAP=1 scripts/first_run.sh

# helper for authenticated orchestrator endpoints
ORCH_KEY="$(awk -F= '/^CONTEXTLATTICE_ORCHESTRATOR_API_KEY=/{print substr($0,index($0,"=")+1)}' .env)"
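Before the first launch it can help to confirm the CLI prerequisites above are actually on PATH. A throwaway helper sketch (the function name is arbitrary and not part of the repo):

```shell
# Report which of the required CLIs are missing from PATH.
# (check_tools is an ad-hoc helper, not a Context Lattice command.)
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "all tools present"
}

# check_tools gmake jq rg python3 curl
```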

Install path

Lite mode steps

Required (Lite)
  1. Configure environment from the repository root.
  2. Start the Lite profile.
  3. Check service state and endpoint health.
gmake mem-up-lite
gmake mem-ps-lite
curl -fsS http://127.0.0.1:8075/health | jq
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/status | jq

Install path

Full mode steps

Recommended (Full)
  1. Set full profile mode for compose-backed startup.
  2. Bring up the stack and watch logs.
  3. Validate API, queue telemetry, and retention telemetry.
gmake mem-mode-full
gmake mem-up
# or force full directly:
gmake mem-up-full
gmake mem-ps
gmake logs
curl -fsS http://127.0.0.1:8075/health | jq
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/telemetry/fanout | jq
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/telemetry/fanout | jq '.lettaAutoPrune'
curl -fsS -X POST -H "x-api-key: ${ORCH_KEY}" \
  "http://127.0.0.1:8075/telemetry/fanout/letta/auto-prune/run?force=false" \
  | jq '.result | {ran, skipped, deleted: (.prune.deleted // 0)}'
curl -fsS -H "x-api-key: ${ORCH_KEY}" http://127.0.0.1:8075/telemetry/retention | jq

Verification

Expected healthy outcomes

Required
  • /health returns ok for orchestrator and connected services.
  • /status (with x-api-key) lists Memory Bank and Qdrant connectivity.
  • /telemetry/fanout (with x-api-key) shows queue depth/retry counters and lettaAutoPrune.state.
  • POST /telemetry/fanout/letta/auto-prune/run triggers a threshold-gated prune pass and returns ran vs skipped.
  • /telemetry/retention (with x-api-key) reports retention cadence and sweep activity.
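These expectations can be folded into a one-shot smoke check. This is a sketch, not a shipped script: it assumes ORCH_KEY is exported as shown in Prerequisites, and it only tests endpoint reachability, not payload shape:

```shell
# Minimal reachability smoke check over the endpoints listed above.
# (smoke_check is an ad-hoc helper; base-URL handling is illustrative.)
smoke_check() {
  base="${1:-http://127.0.0.1:8075}"
  curl -fsS "$base/health" >/dev/null || { echo "health FAIL"; return 1; }
  curl -fsS -H "x-api-key: ${ORCH_KEY}" "$base/status" >/dev/null \
    || { echo "status FAIL"; return 1; }
  curl -fsS -H "x-api-key: ${ORCH_KEY}" "$base/telemetry/fanout" >/dev/null \
    || { echo "fanout FAIL"; return 1; }
  echo "smoke OK"
}

# smoke_check http://127.0.0.1:8075
```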