Layered architecture for long-horizon memory, retrieval, and learning.
Context view shows boundaries. Container view shows responsibilities and protocol flow. Runtime view shows execution order.
Quick Path
How to read this architecture set
- Start with System Context (C4 Level 1) to understand boundaries and external dependencies.
- Use Service Map to see the single ingress orchestrator and subsystem relationships.
- Review Write + Retrieval Flows for runtime movement of data and learning feedback.
- Finish with Task Coordination + Ops for agent orchestration, messaging surfaces, and reliability controls.
C4 Level 1
System Context View
Context view keeps boundary and interaction semantics clear before implementation detail.
Legend + Roles
What each context block means
- [A] External Actors: clients submitting memory writes and search requests.
- [B] Orchestrator: single control plane for ingress, auth, validation, and routing.
- [C] Memory Plane: durable stores used for fanout and retrieval synthesis.
- [D] Ops + Safety: reliability controls for queue, storage, and policy.
- [E] Task Coordination + Agent Communication: orchestrator-managed task routing for internal/external agents plus messaging ingress/egress.
- [F] Dependencies: pluggable model/tool providers and optional cloud BYO services.
C4 Level 2
Service and Flow Views
Container Notes
How to read these diagrams quickly
- Panel 1 (Service Map): where each subsystem lives and who depends on it.
- Panel 2 (Write Flow): how writes move from ingress to durable fanout sinks.
- Panel 3 (Retrieval Flow): how recall/rerank returns context and captures learning signals.
- Panel 4 (Task Coordination): how the orchestrator plans tasks, fans out workers, and communicates through messaging channels.
- HTTP-first ingress: orchestrator remains the single write/search entrypoint.
- Messaging surfaces: can operate bidirectionally (human ingress + response egress).
- Local-first default: local stores first, cloud BYO remains optional.
Runtime Ownership
Which language runs what, and why
Python: fallback and compatibility lane
- Runs: rollback ingress on :18075, compatibility adapters, and selected operator tooling paths.
- Why: preserves rollback safety and broad integration coverage while Go/Rust remain default.
- Efficacy impact: fail-safe continuity when primary runtime components are unavailable.
Rust: memory and retrieval hot path
- Runs: codec, memory engine, retrieval engine, and staged retrieval proxy paths.
- Why: lower-latency execution, predictable memory behavior, and safer high-throughput concurrency.
- Efficacy impact: smaller p95/p99 tails and better retrieval throughput under sustained load.
Go: ingress and orchestration services
- Runs: primary external ingress on :8075, staged retrieval policy, scheduler/gateway services, batching, retries, and backpressure control.
- Why: efficient long-running service concurrency and operational simplicity for scheduling workloads.
- Efficacy impact: lower user-visible latency with more predictable tail behavior.
Current Runtime Default
Rust+Go enabled, Python fallback retained
- Default posture: Rust+Go runtime path is enabled for performance-critical execution.
- Fallback posture: Python fallback remains available for rollback safety.
- Verification endpoint: query GET /migration/runtime to confirm active runtime mode.
- Efficacy focus: optimize for correct outcomes per request, not raw benchmark numbers alone.
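A minimal sketch of calling the verification endpoint, assuming the primary ingress is reachable at a base URL such as http://localhost:8075 and that the endpoint answers with a JSON object. The response fields are not specified in this document, so the helper returns the decoded body unchanged. The function names `runtime_url` and `fetch_runtime` are hypothetical, and any API-key header the deployment requires would have to be added.

```python
import json
import urllib.request
from urllib.parse import urljoin


def runtime_url(base_url: str) -> str:
    """Build the verification URL from an ingress base URL."""
    return urljoin(base_url.rstrip("/") + "/", "migration/runtime")


def fetch_runtime(base_url: str, timeout: float = 5.0) -> dict:
    """GET /migration/runtime and return the decoded body.

    Assumption: the endpoint returns a JSON object. Its field names
    are not documented here, so nothing is extracted from it.
    """
    with urllib.request.urlopen(runtime_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like `fetch_runtime("http://localhost:8075")`, with the host adjusted to the actual deployment.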
Release Boundaries
Public v3.2 lane vs private v4 lane
Public v3.2 (operator default)
- Ingress: gateway-go on :8075
- Fallback: Python rollback lane on :18075
- Memory-bank default: icm_spike
- Slow-source handling: async continuation by default for tail-latency control
- Objective: stable performance with clear operational behavior
Private v4 (experimentation lane)
- Ingress contract: unchanged from public v3.2
- Policy: stricter tuning loops and candidate backend promotions
- Promotion gate: benchmark delta + recall parity + runtime soak
- Rollback: keep v3.2-compatible path available
- Objective: find the next proven step change before public release
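The promotion gate (benchmark delta + recall parity + runtime soak) can be read as a conjunction of three checks. The sketch below illustrates that reading only: the `GateEvidence` fields, the `may_promote` name, and every threshold are hypothetical placeholders, not values taken from the v4 lane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateEvidence:
    benchmark_delta_pct: float  # candidate vs. baseline, positive means better
    recall_candidate: float     # recall measured on the candidate backend
    recall_baseline: float      # recall measured on the current default
    soak_hours: float           # how long the candidate ran under load
    soak_errors: int            # errors observed during the soak


def may_promote(
    e: GateEvidence,
    min_delta_pct: float = 5.0,      # hypothetical threshold
    recall_tolerance: float = 0.01,  # hypothetical threshold
    min_soak_hours: float = 24.0,    # hypothetical threshold
) -> bool:
    """Return True only when all three gate conditions hold."""
    benchmark_ok = e.benchmark_delta_pct >= min_delta_pct
    parity_ok = e.recall_candidate >= e.recall_baseline - recall_tolerance
    soak_ok = e.soak_hours >= min_soak_hours and e.soak_errors == 0
    return benchmark_ok and parity_ok and soak_ok
```

Requiring all three conditions means a faster candidate that loses recall is still rejected, which matches the "hard gate evidence" wording in the lane flow.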
Runtime lane flow
flowchart TD
A["Caller"] --> B["gateway-go :8075"]
B --> C["Rust retrieval + staged policy"]
C --> D["Fast now: topic_rollups, qdrant, postgres_pgvector"]
C --> E["Slow async: mindsdb, mongo_raw, letta, memory_bank"]
E --> F["Continuation events + cache warm"]
B --> G["Python fallback :18075 (rollback)"]
H["Private v4 tuning lane"] --> I["Same external API contract"]
I --> J["Adaptive backend experiments"]
J --> K["Promote only with hard gate evidence"]
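The fast/slow split in the flow above can be sketched with asyncio: fast sources are awaited and answer the request, while slow sources finish in a background continuation that warms a cache. This is a sketch under stated assumptions. The source names come from the flowchart, but `query_source`, `staged_retrieve`, the delays, and the dict-based cache are hypothetical stand-ins for the real backends and continuation events.

```python
import asyncio

FAST_SOURCES = ("topic_rollups", "qdrant", "postgres_pgvector")
SLOW_SOURCES = ("mindsdb", "mongo_raw", "letta", "memory_bank")


async def query_source(name: str, query: str, delay: float) -> list[str]:
    """Stand-in for a real backend call; returns one fake hit after a delay."""
    await asyncio.sleep(delay)
    return [f"{name}:{query}"]


async def staged_retrieve(query: str, cache: dict):
    """Answer from fast sources now; let slow sources warm the cache later."""
    fast = await asyncio.gather(
        *(query_source(s, query, 0.0) for s in FAST_SOURCES)
    )
    hits = [h for group in fast for h in group]

    async def continuation():
        # Slow sources run after the caller already has its response.
        slow = await asyncio.gather(
            *(query_source(s, query, 0.01) for s in SLOW_SOURCES)
        )
        cache[query] = hits + [h for group in slow for h in group]

    # The caller keeps the task handle so the continuation is not lost.
    task = asyncio.create_task(continuation())
    return hits, task
```

The caller receives the fast hits immediately. Awaiting the returned task is optional and only needed when the warmed cache entry must be observed.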
Dynamic View: Write Flow
Write Flow
Order of operations
- Authenticate and normalize write payload.
- Persist raw event for replay durability.
- Append async outbox fanout jobs.
- Dispatch sink writes with retries/backpressure.
- Record queue and sink telemetry.
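The five steps above can be sketched as a small in-memory pipeline. Everything here is illustrative: the `WritePipeline` class, the payload fields (`project`, `text`), the key set, and the retry limit are assumptions, and real durability would need persistent storage rather than Python lists. Backpressure is left out to keep the sketch short.

```python
import hashlib
import json
from collections import Counter


class WritePipeline:
    def __init__(self, api_keys, sinks):
        self.api_keys = set(api_keys)
        self.sinks = tuple(sinks)
        self.raw_log = []           # stand-in for the durable raw event store
        self.outbox = []            # pending fanout jobs
        self.telemetry = Counter()  # queue and sink counters

    def ingest(self, api_key, payload):
        # 1. Authenticate and normalize the write payload.
        if api_key not in self.api_keys:
            raise PermissionError("invalid API key")
        event = {
            "project": payload["project"].strip().lower(),
            "text": payload["text"].strip(),
        }
        digest = hashlib.sha256(json.dumps(event, sort_keys=True).encode())
        event["id"] = digest.hexdigest()[:16]
        # 2. Persist the raw event for replay durability.
        self.raw_log.append(event)
        # 3. Append one outbox job per sink.
        for sink in self.sinks:
            self.outbox.append({"event": event, "sink": sink, "attempts": 0})
        self.telemetry["accepted"] += 1
        return event["id"]

    def dispatch(self, write_to_sink, max_attempts=3):
        # 4. Dispatch sink writes, keeping failed jobs for a later retry.
        remaining = []
        for job in self.outbox:
            job["attempts"] += 1
            try:
                write_to_sink(job["sink"], job["event"])
                self.telemetry["sink_ok"] += 1  # 5. Record telemetry.
            except Exception:
                self.telemetry["sink_error"] += 1
                if job["attempts"] < max_attempts:
                    remaining.append(job)
                else:
                    self.telemetry["deadletter"] += 1
        self.outbox[:] = remaining
```

Persisting the raw event before any fanout is what makes replay possible: a sink failure affects only its outbox job, never the accepted write.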
Dynamic View: Retrieval Flow
Retrieval Flow
Order of operations
- Receive scoped query (project/topic aware).
- Run parallel retrieval across sinks.
- Merge and rank candidates for retrieval egress.
- Prompt for context-quality feedback when useful.
- Write feedback signals to improve future ranking.
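A minimal sketch of the retrieval steps above, assuming each sink is a callable that returns (doc_id, score) pairs. The merge rule (best score per document) and the additive feedback boost are illustrative choices, since the document does not specify how ranking works; `retrieve` and `record_feedback` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve(query, scope, sinks, feedback_boost):
    """Query all sinks in parallel, merge candidates, and rank them.

    sinks: mapping of name -> callable(query, scope) -> [(doc_id, score)]
    feedback_boost: mapping of doc_id -> score adjustment from past feedback
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sinks))) as pool:
        futures = [pool.submit(fn, query, scope) for fn in sinks.values()]
        results = [f.result() for f in futures]

    # Merge: keep the best score seen for each document across sinks.
    best = {}
    for hits in results:
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, float("-inf")))

    # Rank: base score plus any learned feedback adjustment.
    return sorted(
        best,
        key=lambda d: best[d] + feedback_boost.get(d, 0.0),
        reverse=True,
    )


def record_feedback(feedback_boost, doc_id, helpful, step=0.1):
    """Nudge a document up or down based on context-quality feedback."""
    delta = step if helpful else -step
    feedback_boost[doc_id] = feedback_boost.get(doc_id, 0.0) + delta
```

Because the boost is stored separately from sink scores, feedback can change ordering on the next query without rewriting anything in the sinks themselves.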
Ops + Safety
Controls that keep long-horizon operation stable
- Queue durability: retries, deadletters, and replay workflows.
- Write pressure management: backpressure + coalescing under burst load.
- Storage hygiene: retention sweeps and external cold-path handoff.
- Security posture: strict API-key auth and secrets redaction/block controls.
- Operational confidence: health probes, telemetry, and update automation.
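The write pressure control (backpressure + coalescing) can be sketched as a bounded buffer keyed by write target. This is a sketch under stated assumptions: last-write-wins coalescing per key and rejecting new keys when full are illustrative policies, and `CoalescingBuffer` is a hypothetical name, not a component named in this architecture.

```python
from collections import OrderedDict


class CoalescingBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self._pending = OrderedDict()  # key -> latest pending value

    def offer(self, key, value):
        """Queue a write; return False when the caller should back off."""
        if key in self._pending:
            # Coalesce: a newer write replaces the queued one for this key.
            self._pending[key] = value
            return True
        if len(self._pending) >= self.capacity:
            # Backpressure: the buffer is full, so the new key is rejected.
            return False
        self._pending[key] = value
        return True

    def drain(self, max_items):
        """Remove and return up to max_items pending writes, oldest first."""
        batch = []
        while self._pending and len(batch) < max_items:
            batch.append(self._pending.popitem(last=False))
        return batch
```

Under burst load, repeated writes to the same key consume one slot instead of many, so the buffer fills more slowly and sinks receive fewer redundant writes.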