lore-engine/docs/22-cognee-boundary.md
Kaysser Kayyali ba314bc664 docs(adr): 0009 — reified Relation edges; recall + tool-dispatch reframes
Three findings from the Cognee-API review:

ADR 0009 (the big one): edges with time/confidence are reified
:Relation nodes, promoted v1.1 -> v1. Cognee's graph_model can't
carry valid_from/valid_until/confidence on a native edge (an edge
is a nested DataPoint field; the Edge object only has weight +
relationship_type). So any edge the time model, consistency engine,
disputed-edge machinery, and retcon policy operate on is a Relation
node. Structural edges (is_type, template-wiring) stay native.
Propagated: 11-extensibility (Relation now v1, +disputed/retcon
fields), 04-consistency (Category A + B Cypher match through Relation
nodes, materialize is_disputed/disputed_with), 00-overview count,
CONTEXT.md (+Relation term), slice 1/3/6 notes.

Finding 1: cognee.recall is not 'low-precision' — it returns scored
multi-source RecallResponse objects (incl cypher/triplet/temporal
kinds), session-aware. It's the fallback because results are
un-typed/un-cited/un-time-bounded, not low-precision. Reframed in
07-reasoning-harness + 05-mcp-tools.

Finding 3: 'register our 45 tools with Cognee's dispatch' was false.
Cognee ships cognee-mcp (a fixed 14-tool surface) — a reference
server, not a registry we extend. Lore Engine runs its own MCP
server (45 tools), calls Cognee's Python API in-process. Reframed
in 00-overview + 22-cognee-boundary.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 23:20:26 -04:00

22 — Cognee ↔ Lore Engine boundary

Status: 📋 planned. The contract between the substrate (Cognee) and the domain layer (Lore Engine).

Goal

Make explicit what Cognee owns, what the Lore Engine owns, and what the boundary looks like. This is the doc that gets read when someone asks "could we swap Cognee for X?"

What Cognee owns

  • Storage. The graph database is Neo4j, pinned by ADR 0008 (Cognee's default is Kuzu; we override it because Neo4j is battle-tested and supports the Java UDFs the time model needs). The vector store (pgvector or Qdrant) is Cognee's choice.
  • Ingestion pipeline. The cognee.add / cognee.cognify lifecycle, including chunking and embedding.
  • Extraction. The LLM call that turns chunks into entities and relations — unless the Lore Engine overrides the prompt.
  • Embedding. All vector operations and similarity scoring.
  • Retrieval. The cognee.recall API and its query-understanding layer.
  • Session/agent API. The remember/recall/forget surface that agent clients (Claude, etc.) call. Cognee also ships its own MCP server (cognee-mcp, a fixed 14-tool surface) — but that's a reference server, not our tool registry. The Lore Engine runs its own MCP server (45 tools) and calls Cognee's Python API in-process; we don't register into cognee-mcp.

What the Lore Engine owns

  • Typed ontology. The 36 node labels and ~70 edge types from docs/01-ontology.md, registered with Cognee as a data-model file.
  • Time model. The time_in_window / time_windows_overlap primitives (slice 1), era-tree membership, the current token. Implemented as a Cognee plugin / Neo4j UDF.
  • Consistency engine. The 4-category rule system from docs/04-consistency.md. Runs as a Cognee data-pipeline on cognee.cognify completion.
  • The 45 MCP tools. All domain operations (was_true_at, entity_context, state_at, etc.) are Lore Engine handlers, not Cognee primitives.
  • NPC knowledge scoping. The knows_about model per-Person, enforced in the was_true_at response.
  • TypeTemplate. The polymorphic extension system (slice 5) — DomainEntity + Relation + TypeTemplate — runs as a Cognee data-pipeline watching ./templates/.
  • Plane model. The v1.2 Setting/Plane nodes and the four plane-relation edge types (slice 6).
  • Codex ingestion. The markdown + YAML parsing layer that feeds Cognee.
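
As an illustration of the time model primitives named in the list above, here is a minimal sketch of time_in_window and time_windows_overlap in plain Python. It is not the slice 1 implementation: the integer-tick time representation, the half-open window convention, and the use of None for an unbounded side are all assumptions made for this example.

```python
from typing import Optional

# Assumption: times are integer ticks on the world's timeline and
# None means "unbounded" on that side of the window.
Time = int
Bound = Optional[Time]


def time_in_window(at: Time, valid_from: Bound, valid_until: Bound) -> bool:
    """Return True if `at` falls inside the half-open window [valid_from, valid_until)."""
    if valid_from is not None and at < valid_from:
        return False
    if valid_until is not None and at >= valid_until:
        return False
    return True


def time_windows_overlap(a_from: Bound, a_until: Bound,
                         b_from: Bound, b_until: Bound) -> bool:
    """Return True if two half-open windows share at least one instant."""
    # Two windows are disjoint exactly when one ends at or before the other starts.
    if a_until is not None and b_from is not None and a_until <= b_from:
        return False
    if b_until is not None and a_from is not None and b_until <= a_from:
        return False
    return True
```

With half-open windows, a fact that ends at tick 10 and a fact that starts at tick 10 do not overlap, which keeps back-to-back reigns or eras from being flagged as conflicting. The sketch does not validate that valid_from precedes valid_until.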

The boundary

World-Builder Authoring (markdown / YAML / dialogue JSON)
                ↓
   Lore Engine Codex Parser
                ↓
   cognee.add(data) | cognee.cognify()
                ↓
   ─────────── Cognee boundary ───────────
                ↓
   Cognee Storage (graph + vectors)
                ↓
   Lore Engine Extension Layer:
     - Time model plugin
     - Consistency engine
     - TypeTemplate watcher
                ↓
   Lore Engine MCP tools (45 tools) + cognee.recall() fallback
                ↓
   LLM Client with the reasoning-harness system prompt

The boundary is a data-model file (the Lore Engine ontology expressed as Cognee data) and the Lore Engine's own handler registry (the 45 tools). Cognee knows about the Lore Engine's labels and edge types; it does not know about the time model, consistency, or TypeTemplate. Those are computed by the Lore Engine extension layer on top of Cognee's storage.
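
The handler-registry half of the boundary can be sketched as a small name-to-function map that lives entirely in the Lore Engine. This is a hypothetical shape, not the actual server code: ToolRegistry, build_registry, the recall_fallback tool name, and the placeholder handler body are invented for illustration, and recall_fn stands in for a thin wrapper around cognee.recall, whose real signature is not assumed here.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[..., Any]


class ToolRegistry:
    """Maps MCP tool names to Lore Engine handler functions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def tool(self, name: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler under `name`."""
        def register(fn: Handler) -> Handler:
            if name in self._handlers:
                raise ValueError(f"duplicate tool name: {name}")
            self._handlers[name] = fn
            return fn
        return register

    def dispatch(self, name: str, **kwargs: Any) -> Any:
        """Call the handler registered under `name` with keyword arguments."""
        if name not in self._handlers:
            raise KeyError(f"unknown tool: {name}")
        return self._handlers[name](**kwargs)


def build_registry(recall_fn: Callable[[str], List[Any]]) -> ToolRegistry:
    """Build a registry with one typed tool and the untyped recall fallback."""
    registry = ToolRegistry()

    @registry.tool("was_true_at")
    def was_true_at(fact_id: str, at_time: int) -> Dict[str, Any]:
        # Placeholder body: a real handler would query the graph.
        return {"tool": "was_true_at", "fact_id": fact_id, "at_time": at_time}

    @registry.tool("recall_fallback")
    def recall_fallback(query: str) -> List[Any]:
        # Delegates to the injected substrate call; results are untyped.
        return recall_fn(query)

    return registry
```

Injecting recall_fn rather than importing Cognee inside the handlers keeps the substrate dependency in one place, which matches the intent that the tools are Lore Engine handlers and not Cognee primitives.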

What changes if we swap Cognee

| Lore Engine concern | Cognee-specific? | Swap cost |
| --- | --- | --- |
| Typed ontology | yes (Cognee data model) | medium: re-encode in new format |
| Time model | no (pure logic) | none: port to new storage |
| Consistency engine | no (rule engine) | none: port to new storage |
| 45 MCP tools | mostly no (Cypher-based) | low: most are Cypher against the graph |
| NPC knowledge scoping | no (graph property) | none: port to new storage |
| TypeTemplate | yes (Cognee pipeline) | medium: re-implement watcher |
| Plane model | no (graph nodes) | none: port to new storage |
| Codex ingestion | yes (cognee.add) | high: replace with substrate's ingestion |

The Lore Engine's domain logic (time, consistency, NPC knowledge scoping, TypeTemplate semantics, planes) is substrate-agnostic. The plumbing (the Cognee data model, ingestion, and pipeline registration, including the TypeTemplate watcher) is Cognee-specific. If we ever swap, the domain layer ports cleanly and the plumbing is rewritten.
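
One way to keep that split honest is to have domain code talk to the substrate only through a narrow interface. The sketch below is an assumption about how such a seam could look, not an existing module: GraphSubstrate, InMemorySubstrate, facts_about, and the Cypher text (including the subject property) are all hypothetical.

```python
from typing import Any, Dict, List, Protocol

Row = Dict[str, Any]


class GraphSubstrate(Protocol):
    """Hypothetical seam: the only calls the domain layer makes on storage."""

    def ingest(self, documents: List[str]) -> None: ...

    def run_cypher(self, query: str, params: Dict[str, Any]) -> List[Row]: ...


class InMemorySubstrate:
    """Test double standing in for a Cognee-backed or replacement adapter."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.ingested: List[str] = []

    def ingest(self, documents: List[str]) -> None:
        self.ingested.extend(documents)

    def run_cypher(self, query: str, params: Dict[str, Any]) -> List[Row]:
        # A real adapter would send `query` to the graph database.
        # This fake ignores it and filters canned rows by subject.
        return [r for r in self.rows if r["subject"] == params["subject"]]


def facts_about(substrate: GraphSubstrate, subject: str) -> List[Row]:
    """Domain-level query that depends only on the GraphSubstrate interface."""
    # Illustrative Cypher; the property name `subject` is an assumption.
    query = "MATCH (r:Relation {subject: $subject}) RETURN r"
    return substrate.run_cypher(query, {"subject": subject})
```

Under this arrangement, swapping Cognee would mean writing a new class that satisfies GraphSubstrate, while functions such as facts_about stay unchanged.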

Troubleshooting

"cognee.cognify() returns nothing"

  • Check that the codex parser emitted data (python3 -m lore_engine.debug.dump_codex).
  • Check Cognee's logs (docker compose logs cognee).
  • Check the LLM key (echo $OPENAI_API_KEY | head -c 10).
  • Check that the data-model file loaded (cognee.config.datasets.list() should show lore-engine).

"The engine returns facts but no sources"

The LLM is bypassing the citation rule. Per docs/07-reasoning-harness.md, this is a prompt-level failure. Re-run the harness (docs/18-eval-policy.md); if the citation rate drops below 90%, the system prompt needs revision.
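
The 90% check can be expressed as a small scoring function over harness output. This is a sketch under assumptions: the record shape (a mapping with a sources list) and the function names are hypothetical, and the actual metric definition lives in docs/18-eval-policy.md.

```python
from typing import Mapping, Sequence

# From the text above: a citation rate below 90% means the prompt needs revision.
CITATION_THRESHOLD = 0.90


def citation_rate(answers: Sequence[Mapping[str, object]]) -> float:
    """Fraction of answers that carry at least one source."""
    if not answers:
        raise ValueError("no answers to score")
    cited = sum(1 for a in answers if a.get("sources"))
    return cited / len(answers)


def prompt_needs_revision(answers: Sequence[Mapping[str, object]]) -> bool:
    """True when the citation rate falls below the threshold."""
    return citation_rate(answers) < CITATION_THRESHOLD
```

An answer with a missing or empty sources list counts as uncited, so a run with 9 cited answers out of 10 sits exactly at the threshold and passes.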

"Time-bounded queries return cross-era answers"

The LLM is ignoring at_time. Same fix as above — re-run the harness, watch the time-window violation metric.
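
A time-window violation metric of this kind could be computed as follows. The sketch assumes each harness answer records the at_time it was asked about and the validity window of every fact it cited; the Answer and CitedFact shapes, the integer time ticks, and the half-open window convention are hypothetical.

```python
from typing import List, Optional, TypedDict


class CitedFact(TypedDict):
    valid_from: Optional[int]   # None means unbounded in the past
    valid_until: Optional[int]  # None means still valid


class Answer(TypedDict):
    at_time: int
    facts: List[CitedFact]


def violates_window(answer: Answer) -> bool:
    """True if any cited fact was not valid at the requested at_time."""
    t = answer["at_time"]
    for fact in answer["facts"]:
        start, end = fact["valid_from"], fact["valid_until"]
        if (start is not None and t < start) or (end is not None and t >= end):
            return True
    return False


def violation_rate(answers: List[Answer]) -> float:
    """Fraction of answers that cite at least one out-of-window fact."""
    if not answers:
        return 0.0
    return sum(violates_window(a) for a in answers) / len(answers)
```

A nonzero rate points at answers that mixed eras, which helps separate a prompt problem from a data problem when re-running the harness.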

Cross-references

  • docs/adr/0006-cognee-version-pin.md — version pin and upgrade policy
  • docs/cognee-integration.md — the recipe for overriding Cognee's default extraction prompt and routing through LiteLLM
  • docs/18-eval-policy.md — what we re-test on every substrate change