lore-engine/docs/13-microservice-decomposition.md
Kaysser Kayyali 45ca1d962d docs: sweep remaining 'Neo4j or Kuzu' references after ADR 0008
The earlier commit missed 8 spots that still presented the graph
backend as undecided (00-overview x2, 12-storage x2, 13-microservice
x2, 06-ingestion, plan/05). All now pinned to Neo4j per ADR 0008.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 22:57:16 -04:00


13 — Microservice Decomposition: Iterate at Micro and Macro

v1.2 update: with Cognee as the substrate, the mcp-server-binary problem is gone. Cognee is the gateway. The Lore Engine is one in-process extension. This document is now about how the Lore Engine extension is organized (the in-process module layout, the template-watcher data-pipeline, the optional external-handler protocol) rather than about decomposing a monolithic Go binary.

The original v1.1 design (now superseded) had a single mcp-server Go binary exposing 45 tools. With a 1144-line main.go, adding a new tool required an edit, a recompile, a redeploy, and a restart. That iteration loop was the dominant cost of the entire program. That's the wrong shape for a system the world-builder is going to extend indefinitely.

On Cognee, the iteration loop is minutes, not hours. The Cognee MCP server is the gateway; the Lore Engine is one Python package registered as a Cognee data-model and tool extension. Adding a new tool means: write the tool function, register it in tools/__init__.py, restart the Cognee process. For the polymorphic template-driven tools, no restart is needed — the template-watcher data-pipeline picks up YAML changes and re-registers the tools hot.
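
As a rough illustration of the "write the tool function, register it" step, the sketch below uses a hypothetical `TOOL_REGISTRY` dict and `register_tool` decorator as stand-ins for whatever registration hook Cognee actually exposes. Neither name comes from Cognee, and the `lookup` body is a placeholder rather than the real tool.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the Cognee tool registry (not a real Cognee API).
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records a tool function under its MCP tool name."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


# What a module such as tools/lookup.py might contain (placeholder body).
@register_tool("lookup")
def lookup(name: str) -> dict:
    """Placeholder: a real implementation would query the graph."""
    return {"query": name, "candidates": []}


def call_tool(tool_name: str, arguments: dict) -> Any:
    """Dispatch a tool call by name, roughly as the gateway would."""
    if tool_name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {tool_name}")
    return TOOL_REGISTRY[tool_name](**arguments)
```

With this shape, adding a tool touches only the new module plus the import that triggers its registration, which is the property the paragraph above relies on.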

The principle: core is stable, edges are extensible

┌─────────────────────────────────────────────────────────────────────┐
│                    COGNEE MCP SERVER (Python)                       │
│  ────────────────────────────────────                              │
│  - JSON-RPC over HTTP                                               │
│  - Session registry (Cognee-managed)                                │
│  - Active context tracking                                          │
│  - Tool discovery proxy (iterates the registered tools)             │
│  - Tool call routing (parses request, dispatches to handler)        │
│  - Cognee handles: storage, extraction, embedding, sessions, vector │
│  Stable. Cognee owns this layer.                                    │
└─────────────────────────────────────────────────────────────────────┘
                                │
                                │  in-process (same Python process)
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│              LORE ENGINE EXTENSION (Python, in-process)             │
│  ──────────────────────────────────────────────                    │
│  The 45 MCP tools, organized by Group (per 05-mcp-tools.md):        │
│    tools/lookup.py                  # Group 1 — disambiguation      │
│    tools/entity_context.py                                           │
│    tools/was_true_at.py             # Group 2 — time-aware           │
│    tools/state_at.py                                                 │
│    tools/list_lineage.py            # Group 3 — lineage             │
│    tools/event_chain.py             # Group 4 — causal              │
│    tools/lore_about.py              # Group 5 — knowledge            │
│    tools/consistency_tools.py       # Group 6 — consistency          │
│    tools/generation_tools.py        # Group 7 — generation           │
│    tools/worldbuilder_tools.py      # Group 8 — writes               │
│  Each is a small, focused module. New tools can be added without   │
│  touching the others.                                                │
│                                                                     │
│  Hot-reloadable template tools (auto-registered):                   │
│    tools/generated/                # template-driven tool runner    │
│    template-watcher (Cognee data-pipeline)                          │
│      - watches ./templates/                                         │
│      - validates YAML                                               │
│      - registers tools in the Cognee tool registry                  │
│      - notifies subscribed LLM clients: "new tools available"       │
└─────────────────────────────────────────────────────────────────────┘
                                │
                                │  the tools themselves
                                │  compose across Cognee storage + operational Postgres
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    COGNEE STORAGE (managed by Cognee)               │
│  Graph (Neo4j)  ·  Vector store  ·  Postgres metadata               │
│  + Lore Engine operational tables (setting, lore_event, retcon, ...)│
└─────────────────────────────────────────────────────────────────────┘

The gateway (Cognee) is the part that talks to the LLM. It's small, stable, and changes rarely — Cognee owns it. The Lore Engine extension is the part that knows how to answer questions about a high-fantasy world. It's small, focused, and can be added/replaced/swapped without touching Cognee. The storage is Cognee's responsibility; the Lore Engine only owns the operational tables on top.

The tool-handler protocol (for external handlers)

In v1.2 the 45 tools are in-process with Cognee. There is no HTTP handler protocol for the Lore Engine's own tools — they're Python functions in the same process. But Cognee itself supports external handlers (services written in any language, exposing MCP-compatible endpoints). The Lore Engine inherits that capability.

When an external handler is needed (e.g. a heavy ML model, a community-maintained bestiary database, a Python service that uses spaCy for entity resolution), it speaks the Cognee MCP protocol and registers itself with the Cognee gateway. The LLM doesn't know or care.
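
To make the external-handler idea concrete, here is a minimal sketch of the request handling such a service might do. It assumes the handler answers JSON-RPC 2.0 `tools/list` and `tools/call` requests in the usual MCP result shapes. How a handler registers itself with the Cognee gateway is not described in this document and is left out. The `link_entities` tool, its schema, and its capitalised-word heuristic are hypothetical placeholders for a real model such as a spaCy pipeline.

```python
import json
from typing import Any, Dict

# Hypothetical tool served by the external handler; name and schema are illustrative.
TOOLS = {
    "link_entities": {
        "description": "Resolve entity mentions in a passage of text.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }
}


def link_entities(text: str) -> list:
    """Placeholder for a heavy model: capitalised words count as mentions."""
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]


def handle_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Answer one JSON-RPC 2.0 request (tools/list or tools/call)."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        tools = [{"name": name, **spec} for name, spec in TOOLS.items()]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": tools}}
    if method == "tools/call":
        params = request.get("params", {})
        if params.get("name") != "link_entities":
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32602, "message": "unknown tool"}}
        mentions = link_entities(**params.get("arguments", {}))
        content = [{"type": "text", "text": json.dumps(mentions)}]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"content": content}}
    # Standard JSON-RPC code for an unsupported method.
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": -32601, "message": "method not found"}}
```

The function is transport-agnostic on purpose: wrapping it in an HTTP endpoint is a separate concern from the protocol shape shown here.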

When to use an external handler

  • The handler needs to be written in a language other than Python.
  • The handler needs to scale independently of the gateway.
  • The handler is a community or third-party contribution.
  • The handler depends on heavy native libraries (CUDA, system-level ML toolkits).

For the Lore Engine's 45 core tools, none of these apply. They're in-process Python functions. The external handler protocol is a capability, not a requirement — it's there for the cases that need it.

The dynamic tool generator

The most leveraged feature in the v1.2 architecture is the template-driven tool generator. It runs as a Cognee data-pipeline (the template-watcher) that watches ./templates/ for YAML changes and auto-generates tool registrations.

On ./templates/thieves_guild/mission.yaml file change:

  1. Cognee data-pipeline picks up the file change event.
  2. YAML parser validates against the template schema
  3. For each tool spec in the template:
       - Generate the JSON schema for params
       - Generate the query code (data-defined, not hand-written)
       - Register the tool in the Cognee tool registry
  4. Notify subscribed LLM clients: "new tools available"
  5. LLM clients see the new tools in their next tools/list call
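
Steps 2 and 3 above could look roughly like the following sketch. It assumes the YAML file has already been parsed into a plain dict (for example with a YAML library), and it invents a small template shape (`type`, `tools`, and per-tool `name`, `params`, `query`) purely for illustration; the real template schema is not given in this document. `REGISTRY` is a hypothetical stand-in for the Cognee tool registry, and the client notification in step 4 is not covered.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for the Cognee tool registry.
REGISTRY: Dict[str, Dict[str, Any]] = {}

# Assumed set of param types a template may declare.
ALLOWED_TYPES = {"string", "integer", "number", "boolean"}


def validate_template(template: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the template is usable."""
    errors: List[str] = []
    if not isinstance(template.get("type"), str):
        errors.append("template needs a string 'type'")
    tools = template.get("tools")
    if not isinstance(tools, list) or not tools:
        errors.append("template needs a non-empty 'tools' list")
        return errors
    for i, tool in enumerate(tools):
        if not isinstance(tool, dict):
            errors.append(f"tools[{i}] must be a mapping")
            continue
        for key in ("name", "query"):
            if not isinstance(tool.get(key), str):
                errors.append(f"tools[{i}] needs a string '{key}'")
        for pname, ptype in (tool.get("params") or {}).items():
            if not isinstance(ptype, str) or ptype not in ALLOWED_TYPES:
                errors.append(f"tools[{i}].params.{pname}: unknown type {ptype!r}")
    return errors


def params_schema(params: Dict[str, str]) -> Dict[str, Any]:
    """Build a JSON Schema object for the declared params (all required here)."""
    return {
        "type": "object",
        "properties": {name: {"type": t} for name, t in params.items()},
        "required": sorted(params),
    }


def register_template(template: Dict[str, Any]) -> List[str]:
    """Validate, then register every tool spec; returns the registered names."""
    errors = validate_template(template)
    if errors:
        raise ValueError("; ".join(errors))
    names = []
    for tool in template["tools"]:
        REGISTRY[tool["name"]] = {
            "description": tool.get("description", ""),
            "inputSchema": params_schema(tool.get("params") or {}),
            "query": tool["query"],
        }
        names.append(tool["name"])
    return names
```

Validating the whole template before registering anything means a bad file never leaves the registry half-updated.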

The generated tool handler is a generic runner that knows the template spec. For each tool, the runner:

  1. Parses the arguments against the template's params schema.
  2. Builds a query (Cypher for graph ops, SQL for time-series, composed for cross-store).
  3. Executes the query.
  4. Returns the result.
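
The four runner steps can be sketched as a single function. This is only an illustration under stated assumptions: the tool spec carries a parameterised Cypher string taken from the template, the database call is injected as an `execute` callable, and the names `EXAMPLE_SPEC`, `check_arguments`, and `run_generated_tool` are all hypothetical. Composed SQL and cross-store queries are not covered.

```python
from typing import Any, Callable, Dict, List

# JSON Schema type name -> Python type used for the argument check.
PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

# Hypothetical spec, as the template-driven generator might have registered it.
EXAMPLE_SPEC = {
    "inputSchema": {
        "type": "object",
        "properties": {"guild": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["guild", "limit"],
    },
    "query": "MATCH (m:Mission)-[:ISSUED_BY]->(g:Guild {name: $guild}) "
             "RETURN m.title AS title LIMIT $limit",
}


def check_arguments(schema: Dict[str, Any], arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Step 1: reject unknown, missing, or wrongly typed arguments."""
    props = schema.get("properties", {})
    unknown = set(arguments) - set(props)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    missing = [p for p in schema.get("required", []) if p not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    for name, value in arguments.items():
        type_name = props[name]["type"]
        # bool is a subclass of int in Python, so rule it out explicitly.
        if isinstance(value, bool) and type_name != "boolean":
            raise TypeError(f"{name} must be {type_name}")
        if not isinstance(value, PY_TYPES[type_name]):
            raise TypeError(f"{name} must be {type_name}")
    return dict(arguments)


def run_generated_tool(
    spec: Dict[str, Any],
    arguments: Dict[str, Any],
    execute: Callable[[str, Dict[str, Any]], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Shared handler for every generated tool."""
    params = check_arguments(spec["inputSchema"], arguments)  # step 1: parse
    query = spec["query"]                                     # step 2: query text is template data
    rows = execute(query, params)                             # step 3: execute
    return {"rows": rows, "count": len(rows)}                 # step 4: return
```

Argument values are passed as query parameters rather than interpolated into the query text, so a template author controls the query shape while callers only supply data.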

The generator is written once. The runtime handler is shared across all generated tools, so adding a new domain type means writing a template, not writing a handler.

The phase plan: which capabilities when

The Cognee-based architecture is a progressive layering, not a big-bang rewrite. The MVP (Phases 0-3 per 09-roadmap.md) ships with the in-process extension. Later phases add the template-driven tool generator (Phase 5) and the optional external handler protocol (Phase 6).

Phases 0-3 (MVP): the in-process extension

lore-engine-extension/                # one Python package
├── lore_engine/                      # Cognee data-model extension
│   ├── __init__.py                   # registers labels, constraints, indexes
│   ├── ontology/                     # typed labels (Person, Faction, ...)
│   ├── time_model/                   # time_in_window, era-tree helpers
│   ├── consistency/                  # 4 rule categories
│   └── templates/                    # TypeTemplate validator + registry
│
├── tools/                            # the 45 MCP tools
│   ├── lookup.py
│   ├── entity_context.py
│   ├── ...                           # one file per Group from 05-mcp-tools.md
│   └── worldbuilder_tools.py
│
├── pipelines/                        # Cognee data-pipelines
│   └── consistency_pipeline.py       # nightly + on-demand rule run
│
├── parsers/                          # structured YAML ingest
│   ├── timeline.py
│   ├── family_tree.py
│   └── ...
│
└── schema/                           # init.cognify / init.cypher
    ├── init.cypher                   # constraints, indexes
    └── udfs/                         # time_in_window, time_windows_overlap

Same process, multiple modules. The 45 tools are split into per-Group files, registered at Cognee startup. The MVP ships with 45 manually-defined tools; no template-driven tools yet.

Phase 5: add the template-driven tool generator

This phase adds the template-watcher data-pipeline and the dynamic tool runner, per Phase 5 of the Cognee roadmap in 09-roadmap.md. It is the phase that addresses the "arbitrary new concept" question.

lore-engine-extension/
├── tools/
│   ├── ...                           # the 45 manually-defined tools
│   └── generated/                    # NEW: the template-driven tool runner
│       └── runner.py
│
└── pipelines/
    ├── consistency_pipeline.py
    └── template_watcher.py           # NEW: watches ./templates/

Deliverable: world-builders can add a new domain type by writing YAML, hitting POST /admin/templates/reload, and the gateway picks it up within 5 seconds. No Cognee restart, no Lore Engine restart.
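
One simple way to meet a "within 5 seconds" pickup target is to poll the templates directory and compare file signatures. The sketch below does that with the standard library only. It is an assumption about mechanism, not a description of how Cognee data-pipelines are triggered, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Set, Tuple

# File path -> (modification time in ns, size in bytes).
Snapshot = Dict[str, Tuple[int, int]]


def take_snapshot(root: Path) -> Snapshot:
    """Record a signature for every YAML file under the templates directory."""
    result: Snapshot = {}
    for path in sorted(root.rglob("*.yaml")):
        stat = path.stat()
        result[str(path)] = (stat.st_mtime_ns, stat.st_size)
    return result


def diff_snapshots(old: Snapshot, new: Snapshot) -> Tuple[Set[str], Set[str]]:
    """Return (changed_or_added, removed) paths between two snapshots."""
    changed = {path for path, sig in new.items() if old.get(path) != sig}
    removed = set(old) - set(new)
    return changed, removed


def poll_once(root: Path, previous: Snapshot) -> Tuple[Snapshot, Set[str], Set[str]]:
    """One polling step: take a fresh snapshot and report what moved."""
    current = take_snapshot(root)
    changed, removed = diff_snapshots(previous, current)
    return current, changed, removed
```

Calling `poll_once` every second or so and re-registering only the changed templates would keep the reload latency well under the stated budget, at the cost of a small amount of idle filesystem traffic.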

Phase 6: the external handler protocol (optional)

The Cognee MCP server already supports external handlers; the Lore Engine does not need to implement anything new. This phase is only needed if a world-builder or community contributor wants to add a handler in a language other than Python, or wants to scale a handler independently of the gateway.

Deliverable: a Python service (e.g. an entity-linker using spaCy) is registered with the Cognee gateway. The gateway dispatches to it transparently. The LLM doesn't know or care.

What this means for the world-builder

The world-builder's experience is unchanged at the API level:

  • Write YAML → cognee.add() or POST /ingest/structured → data lands.
  • Write template YAML → POST /admin/templates/reload → tools appear.
  • Query via MCP.

What changes is the iteration loop:

| Task | v1 (monolith) | v1.2 (Cognee extension) |
|---|---|---|
| Add a new domain type | Edit Go, edit Cypher, rebuild, redeploy | Write template YAML, hot-reload. Done. |
| Add a new tool for an existing type | Edit Go, rebuild, redeploy | Add a tool spec to the template, hot-reload. Done. |
| Fix a bug in a tool | Find the right handler, rebuild, redeploy gateway | Edit the Python function, restart Cognee. |
| Add a heavy ML-based tool | Rewrite in Go or call out to Python via subprocess | Write a Python service, register as external Cognee handler. |
| Scale a single tool to 100x | Scale the whole gateway | Scale just that handler (if external) or Cognee (if in-process). |

The iteration loop drops from hours (code change, build, deploy) to minutes (YAML change, hot-reload) for template-driven tools, and ~5 minutes (Python edit, restart) for manually-defined tools. This is what makes the engine tractable for a world that's going to grow indefinitely.

What this means for the LLM

The LLM doesn't know or care whether the tool is in-process, template-driven, or external. It sees a single MCP server with a list of tools. The tool names, descriptions, and params are the same.

The LLM does get a new tool: list_template_tools(). This returns the auto-generated tools for all currently-loaded templates. The LLM can use this to discover what domain types are available without having to be told.

{
  "name": "list_template_tools",
  "description": "List all tools auto-generated from loaded type templates. Use this to discover what domain types are available in the current world.",
  "params": ["type_filter", "limit"]
}
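
A handler behind this descriptor might look like the sketch below. The meanings given to `type_filter` (exact match on the template's type name) and `limit` (maximum number of entries returned) are assumptions, as is the `GENERATED_TOOLS` mapping, which stands in for whatever view of the registry the real tool reads.

```python
from typing import Any, Dict, List, Optional

# Hypothetical view of the generated tools, keyed by tool name.
GENERATED_TOOLS: Dict[str, Dict[str, str]] = {
    "missions_for_guild": {"type": "Mission", "description": "List missions issued by a guild."},
    "mission_status": {"type": "Mission", "description": "Report the state of one mission."},
    "spells_by_school": {"type": "Spell", "description": "List spells in a school of magic."},
}


def list_template_tools(type_filter: Optional[str] = None, limit: int = 50) -> List[Dict[str, Any]]:
    """Return generated tools, optionally restricted to one domain type."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    matches = [
        {"name": name, **meta}
        for name, meta in sorted(GENERATED_TOOLS.items())
        if type_filter is None or meta["type"] == type_filter
    ]
    return matches[:limit]
```

Sorting by name keeps the output stable between calls, which makes it easier for a client to notice when new tools appear.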

The Lore Engine extension directory layout

When the system is fully built:

lore-engine-extension/                # the Lore Engine on Cognee
├── lore_engine/                      # data-model extension
│   ├── ontology/                     # typed labels + constraints
│   ├── time_model/                   # time_in_window, era-tree helpers
│   ├── consistency/                  # 4 rule categories + 10 starter rules
│   └── templates/                    # TypeTemplate validator + registry
│
├── tools/                            # the 45 MCP tools (in-process)
│   ├── lookup.py
│   ├── entity_context.py
│   ├── was_true_at.py
│   ├── ...
│   ├── consistency_tools.py
│   ├── generation_tools.py
│   ├── worldbuilder_tools.py
│   └── generated/                    # template-driven tool runner
│       └── runner.py
│
├── pipelines/                        # Cognee data-pipelines
│   ├── consistency_pipeline.py       # nightly + on-demand rule run
│   └── template_watcher.py           # watches ./templates/, hot-reload
│
├── parsers/                          # structured YAML ingest
│   ├── timeline.py
│   ├── family_tree.py
│   ├── gazetteer.py
│   ├── bestiary.py
│   ├── magic_system.py
│   └── culture.py
│
└── schema/                           # init.cypher
    ├── init.cypher                   # constraints, indexes
    └── udfs/                         # time_in_window, time_windows_overlap

The entire extension is one Python package, registered with Cognee at startup. There are no separate Go workers; all Lore Engine code runs in-process with Cognee.

The cost: less, not more

The Cognee-based architecture has fewer services to operate than the v1.1 plan called for. The Go services in the old plan (mcp-gateway, consistency-runner, consistency-monitor, template-watcher, template-registry, structured-ingestor, lore-extractor, etc.) collapse to:

  • Cognee MCP server (1 process, the gateway)
  • Lore Engine extension (in-process with Cognee)
  • Cognee data-pipelines (template-watcher, consistency-pipeline — also in-process)
  • Storage backend (Neo4j — ADR 0008, managed by Cognee)
  • Postgres (Cognee metadata + Lore Engine operational tables)
  • Vector store (Cognee-managed, pgvector or Qdrant)

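That's 3-4 infrastructure components to operate (the Cognee process, Neo4j, Postgres, and a separate vector store only when Qdrant is used instead of pgvector), not 8+ Go services. The operational footprint is smaller than the v1.1 plan.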

The decomposition is the answer to Kay's question

"We need to be able to take an arbitrary new concept, define how it associates with larger constructs, but also have flexibility to get as detailed as we need."

The v1.2 architecture answers this in three ways:

  1. New concept (macro): a YAML template. No code change. Hot-reload. (~1 hour of world-builder work.)

  2. Detail on the concept (micro): instance YAML with relations to other entities. The template defines the structure; the instance fills it in. (~10 minutes per instance.)

  3. New kind of data the template doesn't cover (escape hatch): an external handler in the language of choice, registered with the Cognee gateway. (~1 day of work for a small, focused handler.)

The macro level is fast and declarative. The micro level is structured and bounded. The escape hatch is open and arbitrary. Three layers, three iteration speeds.

What this is NOT

  • Not a "we use Cognee so we don't have to write any code" architecture. The Lore Engine extension is real Python code: the 45 tools, the time model, the consistency rules. Cognee doesn't replace the domain layer; it provides the substrate the domain layer runs on.
  • Not a SaaS architecture. The system runs as one docker-compose stack on one host. Cognee doesn't make it cloud-native; the world-builder still operates it as a single system.
  • Not a guarantee that all extensions are easy. Some extensions still need code — anything that requires a new query language feature, a new database, or a new ML model. The escape hatch (external handler) is for those, and it costs more.
  • Not a rewrite of the world-builder's experience. The world-builder still writes YAML, hits endpoints, sees tools in the LLM. The difference is the speed of the loop.