docs(adr): 0009 — reified Relation edges; recall + tool-dispatch reframes

Three findings from the Cognee-API review.

ADR 0009 (the big one): edges with time/confidence are reified :Relation nodes, promoted v1.1 -> v1. Cognee's graph_model can't carry valid_from/valid_until/confidence on a native edge (an edge is a nested DataPoint field; the Edge object only has weight + relationship_type). So any edge the time model, consistency engine, disputed-edge machinery, and retcon policy operate on is a Relation node. Structural edges (is_type, template-wiring) stay native.

Propagated:
- 11-extensibility (Relation now v1, +disputed/retcon fields)
- 04-consistency (Category A + B Cypher match through Relation nodes, materialize is_disputed/disputed_with)
- 00-overview count
- CONTEXT.md (+Relation term)
- slice 1/3/6 notes

Finding 1: cognee.recall is not 'low-precision'. It returns scored multi-source RecallResponse objects (incl cypher/triplet/temporal kinds) and is session-aware. It's the fallback because results are un-typed/un-cited/un-time-bounded, not because it is low-precision. Reframed in 07-reasoning-harness + 05-mcp-tools.

Finding 3: 'register our 45 tools with Cognee's dispatch' was false. Cognee ships cognee-mcp (a fixed 14-tool surface), which is a reference server, not a registry we extend. Lore Engine runs its own MCP server (45 tools) and calls Cognee's Python API in-process. Reframed in 00-overview + 22-cognee-boundary.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 23:20:26 -04:00
parent 45ca1d962d
commit ba314bc664
11 changed files with 163 additions and 22 deletions
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -42,7 +42,10 @@ The floor — `min(extraction_confidence × source_confidence)` across every sou
 _Avoid_: confidence (unqualified — say which dimension)

 **Disputed edge**:
-When two sources produce the same (subject, relation, object) with *conflicting* time bounds. Kept as separate Edge records, both marked `is_disputed: true`, linked via `disputed_with` — not merged. (ADR 0002.)
+When two sources produce the same (subject, relation, object) with *conflicting* time bounds. Kept as separate `Relation` records, both marked `is_disputed: true`, linked via `disputed_with` — not merged. (ADR 0002.)
+
+**Relation**:
+A reified-edge *node* — the representation for any edge that carries time bounds or confidence (`RULED`, `MEMBER_OF`, `PARTICIPATED_IN`, …). Carries `valid_from`/`valid_until`/`extraction_confidence`/`source_confidence`/`is_disputed`/`superseded_by`. Native edges (Cognee field-nesting) are used only for structural edges that never need time (`is_type`, template-wiring). Promoted from v1.1 to v1 (ADR 0009).

 **Contradiction**:
 A first-class *node* flagging incompatible claims between LoreSources. Built from scratch in slice 2 — Cognee ships no contradiction machinery.
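The disputed-edge rule above can be sketched in a few lines of Python. This is a hedged illustration, not the engine's code: the `Claim` record and the `mark_disputes` helper are hypothetical, times are assumed to be plain integer years, and "conflicting time bounds" is simplified to "the two records do not share an identical `(valid_from, valid_until)` pair", which may differ from the rule ADR 0002 actually applies.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional


@dataclass
class Claim:
    """Hypothetical stand-in for a Relation record; only the fields this sketch needs."""
    id: str
    subject_id: str
    relation: str
    object_id: str
    valid_from: Optional[int] = None   # assumption: integer year, None = unbounded
    valid_until: Optional[int] = None  # assumption: integer year, None = unbounded
    is_disputed: bool = False
    disputed_with: list[str] = field(default_factory=list)


def mark_disputes(claims: list[Claim]) -> None:
    """Flag pairs with the same (subject, relation, object) but different time bounds.

    Records are never merged: each keeps its own bounds and gains a link to its sibling.
    """
    for a, b in combinations(claims, 2):  # each unordered pair is visited once
        same_triple = (a.subject_id, a.relation, a.object_id) == (
            b.subject_id, b.relation, b.object_id)
        bounds_differ = (a.valid_from, a.valid_until) != (b.valid_from, b.valid_until)
        if same_triple and bounds_differ:
            a.is_disputed = b.is_disputed = True
            if b.id not in a.disputed_with:  # membership check keeps re-runs idempotent
                a.disputed_with.append(b.id)
            if a.id not in b.disputed_with:
                b.disputed_with.append(a.id)


# Invented example data, for illustration only.
c1 = Claim("rel_1", "person_a", "RULED", "realm_x", 380, 390)
c2 = Claim("rel_2", "person_a", "RULED", "realm_x", 382, 390)
c3 = Claim("rel_3", "person_a", "RULED", "realm_y", 380, 390)
mark_disputes([c1, c2, c3])
mark_disputes([c1, c2, c3])  # a second pass adds no duplicate links
```

Visiting each unordered pair once and checking membership before appending keeps `disputed_with` free of duplicates when the check is re-run.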
--- a/docs/00-overview.md
+++ b/docs/00-overview.md
@@ -24,11 +24,11 @@ The Lore Engine is built on top of [Cognee](https://github.com/topoteretes/cogne
 | Embedding pipeline (vector embeddings of chunks + entities) | Production | Semantic search back-end. Cognee manages the embedding store; we query it through the `lore_about` and `cite` tools. |
 | Agent-native `remember/recall/forget` API | Production | Storage-agnostic interface. The Lore Engine wraps `recall` with typed ontology + time model. |
 | `DataPoint` / `Entity` schema | Production | Base node types. The Lore Engine adds `Person`, `Faction`, `Location`, etc. as typed extensions. |
-| Session / task registry | Production | MCP server pattern. We register our 45 tools with Cognee's tool dispatch. |
+| Session / task registry | Production | The Lore Engine ships its **own** MCP server exposing the 45 typed tools; it calls Cognee's Python API (`remember`/`recall`/`add_data_points`) in-process. Cognee's `cognee-mcp` (a fixed 14-tool surface) is a separate reference server — we don't register our tools into it. |

 ## What we add

- **Deeper ontology** — Era, Calendar, Lineage, Culture, Deity, MagicSystem, Spell, Language, Title, Item (covers Artifact), Region, plus 2 v1.2 nodes (Plane, Setting) and 5 consistency nodes. Roughly 36 node labels total (7 base + 19 v1 core + 2 v1.2 planes + 6 v1.1 polymorphic + 5 consistency). The 7 base types are Lore Engine originals built on Cognee's `DataPoint` — not inherited as-is (Cognee ships no `Contradiction`/`Message`/etc. nodes of its own). The v1.1 docs (`11-extensibility.md`) add the polymorphic `DomainEntity`, `Relation`, `TypeTemplate`, `NPC`, `PC`, and `Human` labels.
+- **Deeper ontology** — Era, Calendar, Lineage, Culture, Deity, MagicSystem, Spell, Language, Title, Item (covers Artifact), Region, `Relation` (reified-edge node, promoted to v1 per ADR 0009), plus 2 v1.2 nodes (Plane, Setting) and 5 consistency nodes. Roughly 36 node labels total (7 base + 19 v1 core + 2 v1.2 planes + 5 v1.1 polymorphic + 5 consistency). The 7 base types are Lore Engine originals built on Cognee's `DataPoint` — not inherited as-is (Cognee ships no `Contradiction`/`Message`/etc. nodes of its own). The v1.1 docs (`11-extensibility.md`) add the polymorphic `DomainEntity`, `TypeTemplate`, `NPC`, `PC`, and `Human` labels (`Relation` graduated to v1).
 - **Time as a first-class concept** — temporal validity windows on relations, era filters, "was X true at time T?" via dedicated tools. The LLM no longer has to guess which version of Aldric it is talking to.
 - **Structured ingestion** — not just `.md` files. We ingest `timeline.yaml`, `family-tree.yaml`, `gazetteer.yaml`, `bestiary.yaml` as first-class sources. Free prose stays, but is no longer the only path.
 - **A consistency engine** — Cypher-based rules that flag anachronisms (Aldric can't be at the Battle of Black Spire if it happened 200 years before his birth), missing lineage (a noble with no recorded parents), and ontological violations (a region claimed to be inside two non-overlapping kingdoms).
@@ -109,5 +109,6 @@ If you want to challenge it: jump to `10-critique.md` first. I tried to break it
 - `0006-cognee-version-pin.md` — Cognee pinned at 1.1.2; harness is the upgrade gate
 - `0007-graph-model-ontology-contract.md` — ontology enforced via `graph_model=`, not RDF/OWL
 - `0008-graph-backend-neo4j.md` — Neo4j (not Cognee's Kuzu default); time model ships as Java UDF
+- `0009-reified-relation-edges.md` — time-bounded/confident edges are `:Relation` nodes (Cognee `graph_model` can't carry edge properties)

 **Build plan (`plan/`):** the 10 slice specs and an index.
--- a/docs/04-consistency.md
+++ b/docs/04-consistency.md
@@ -61,26 +61,34 @@ A `LoreSource` makes a claim about an entity that another `LoreSource` contradic
 - `BELONGS_TO` (a Person belongs to one culture at a time, unless `valid_from`/`valid_until` differ)
 - `EXISTED_DURING` (a Person/Faction/Location/Item has one existence window; multiple non-contiguous windows are valid for reincarnated deities etc., but they must not overlap)

-**Cypher (general pattern):**
+**Cypher (general pattern)** — edges are reified `:Relation` nodes per ADR 0009, so the contradiction check matches Relation nodes by `type` and overlapping time windows:
 ```cypher
-MATCH (a)-[r1:RELATION_TYPE]->(b)
-MATCH (a)-[r2:RELATION_TYPE]->(c)
-WHERE b <> c
+MATCH (a)
+MATCH (r1:Relation {type: "RELATION_TYPE"}) WHERE r1.from_id = a.id
+MATCH (r2:Relation {type: "RELATION_TYPE"}) WHERE r2.from_id = a.id
+  AND r1.to_id <> r2.to_id AND r1.id < r2.id  // second WHERE was invalid; id ordering visits each pair once
  AND time_windows_overlap(r1.valid_from, r1.valid_until, r2.valid_from, r2.valid_until)
 MERGE (contra:Contradiction {subject: a.name, predicate: "RELATION_TYPE",
-                              claim_a: b.name, claim_b: c.name})
-ON CREATE SET contra.detected_at = timestamp()
-WITH a, contra
+                              claim_a: r1.to_id, claim_b: r2.to_id})
+ON CREATE SET contra.detected_at = timestamp(), contra.is_disputed = true
+WITH a, r1, r2, contra
+SET r1.is_disputed = true, r2.is_disputed = true,
+    r1.disputed_with = coalesce(r1.disputed_with, []) + [r2.id],
+    r2.disputed_with = coalesce(r2.disputed_with, []) + [r1.id]
 MERGE (a)-[:HAS_CONTRADICTION]->(contra)
 ```

+(Note: the `is_disputed` / `disputed_with` mutation here is the consistency engine materializing ADR 0002's disputed-edge state, not the LLM inventing it.)
+
 ### Category B: Anachronism detection

 For every edge of type `PARTICIPATED_IN`, `WITNESSED`, `LOCATED_IN`, `POSSESSES`, `CAUSED`, `CREATED` — verify the subject's existence window contains the event/object's time.

-**Cypher (anachronism: entity before birth):**
+**Cypher (anachronism: entity before birth)** — `PARTICIPATED_IN` is a reified `:Relation` (ADR 0009):
 ```cypher
-MATCH (p:Person)-[r:PARTICIPATED_IN]->(e:Event)
+MATCH (p:Person)
+MATCH (r:Relation {type: "PARTICIPATED_IN"}) WHERE r.from_id = p.id
+MATCH (e:Event {id: r.to_id})
 WHERE p.birth IS NOT NULL
  AND time_in_window(e.in_fiction_date, p.birth, p.death) = false
 MERGE (an:Anachronism {entity_name: p.name, event_name: e.name,
--- a/docs/05-mcp-tools.md
+++ b/docs/05-mcp-tools.md
@@ -384,7 +384,7 @@ The user can disable any rule by ID, and add new ones via `add_ontology_rule`.

 ## Tool count: 45 total (8 base + 37 domain)

-The full catalog: 8 base tools (one wraps `cognee.recall`; the rest are Lore Engine originals) + 37 domain tools across Groups 1–8 = **45 MCP tools** — all Lore Engine handlers. Cognee ships no MCP tool catalog of its own; a handful of tools delegate to Cognee primitives (`cognee.recall`, `cognee.add`) as a low-precision fallback. That's well past the empirical LLM tool-use ceiling (~25 in a single system prompt), so the Phase 6 reasoning-harness validation measures usage and collapses the long tail. The LLM uses 5–8 of them 90% of the time; the long tail exists for edge cases the LLM will sometimes need and shouldn't have to fall back to free-text generation for.
+The full catalog: 8 base tools (one wraps `cognee.recall`; the rest are Lore Engine originals) + 37 domain tools across Groups 1–8 = **45 MCP tools**, all Lore Engine handlers, exposed through the **Lore Engine's own MCP server** (Cognee's `cognee-mcp` is a separate, fixed 14-tool surface; we don't register into it). The server calls Cognee's Python API (`remember`/`recall`/`add_data_points`) in-process. A handful of tools delegate to `cognee.recall` as a fallback, not because recall is low-precision (it returns scored, multi-source results) but because its results are un-typed, un-cited, and un-time-bounded. A catalog of 45 tools is well past the empirical LLM tool-use ceiling (~25 in a single system prompt), so the Phase 6 reasoning-harness validation measures usage and collapses the long tail. The LLM uses 5–8 of them 90% of the time; the long tail exists for edge cases the LLM will sometimes need and shouldn't have to fall back to free-text generation for.

 This is on the high end of what an LLM can effectively use in a single context. We mitigate by:
 - The reasoning harness documents which 8 to use first.
--- a/docs/07-reasoning-harness.md
+++ b/docs/07-reasoning-harness.md
@@ -46,7 +46,13 @@ When a tool returns an error, surface the error to the user and stop.

 Fallback to Cognee primitives: if no Lore Engine tool fits the question,
 use cognee.recall("free-text query") for semantic search over the chunk store.
-This is a low-precision fallback — prefer the typed tools when possible.
+`recall` is not low-precision — it returns scored, multi-source results
+(`list[RecallResponse]`, including `cypher`/`triplet_completion`/`temporal`
+kinds) and is session-aware. It's the fallback because its results are
+**un-typed, un-cited, and un-time-bounded** — exactly the three things the
+typed tools add. Prefer the typed tools when the question has a subject,
+a time, or needs a citation; use `recall` for breadth ("what does the
+chronicle say about…") and when no typed tool fits.
 ```

 This is the bedrock. The patterns below build on it.
--- a/docs/11-extensibility.md
+++ b/docs/11-extensibility.md
@@ -1,6 +1,8 @@
 # 11 — Extensibility: Polymorphic Type Templates

-The v1 ontology has 36 hard-coded labels (7 inherited from Cognee + 19 v1 core + 2 v1.2 planes + 6 v1.1 polymorphic + 5 consistency). That's *fine for the first world* but it has a ceiling: a thieves-guild mission is forced into `:Event`, a war campaign is forced into `:Faction`-with-properties, a black-market trade log is forced into `:Item`-with-properties. The LLM can *talk* about these things, but the engine can't *reason* over their structure.
+The v1 ontology has 36 hard-coded labels (7 base + 19 v1 core + 2 v1.2 planes + 5 v1.1 polymorphic + 5 consistency; the v1 core count includes `Relation`, promoted from v1.1 per ADR 0009). That's *fine for the first world* but it has a ceiling: a thieves-guild mission is forced into `:Event`, a war campaign is forced into `:Faction`-with-properties, a black-market trade log is forced into `:Item`-with-properties. The LLM can *talk* about these things, but the engine can't *reason* over their structure.
+
+> **Note (ADR 0009):** `Relation` now ships in **v1**, not v1.1. It's a general reified-edge node (any edge with time bounds or confidence is a `Relation` node, because Cognee's `graph_model` can't put those properties on a native edge). The `DomainEntity` + `TypeTemplate` polymorphic system below is still v1.1; `Relation` is the one piece that graduated. See `docs/adr/0009-reified-relation-edges.md`.

 **"What missions has the Crimson Hand run in Mardsville over the last year, sorted by payout?"** is unanswerable today. The data lives in `summary` text fields, the relationships are implicit in prose, and the LLM has to reconstruct it via `semantic_search` and hope.

@@ -115,23 +117,30 @@ A single new node label and a single new edge label. This is the polymorphic bac
 })

 // A relation between any two nodes (DomainEntity, Person, Faction, etc.)
+// — promoted to v1 core (ADR 0009). This is THE representation for any
+// edge that carries time bounds or confidence, used by the time model,
+// consistency engine, disputed-edge machinery, and retcon policy.
 (:Relation {
  id: "rel_abc",
  from_id: "mission_4471",
  to_id: "person_vex_silent",
-  type: "GIVEN_BY",                      // matches a template's allowed_relations
-  properties: {                          // typed by template
+  type: "GIVEN_BY",                      // the edge verb; matches a template's allowed_relations
+  properties: {                          // typed by template (v1.1 polymorphism)
    at: "3rd_age.year_385",
    agreed_payout: 500
  },
  valid_from: "3rd_age.year_385",
  valid_until: null,
+  extraction_confidence: 0.88,           // ADR 0001
+  source_confidence: 0.9,                // ADR 0001
  sources: ["crimson_hand_records.yaml"],
-  source_confidence: 0.9
+  is_disputed: false,                    // ADR 0002
+  disputed_with: [],                     // sibling Relation ids
+  superseded_by: null                    // retcon policy (docs/19-retcon-policy.md)
 })
 ```

-Two new labels. ~200 lines of Cypher. Stable forever.
+`Relation` is v1 core. `DomainEntity` + `TypeTemplate` (below) are the v1.1 polymorphic layer that builds on top of it.

 ### Indexes for Layer 2

--- a/docs/22-cognee-boundary.md
+++ b/docs/22-cognee-boundary.md
@@ -25,7 +25,12 @@ when someone asks "could we swap Cognee for X?"
 - **Retrieval.** The `cognee.recall` API and its
  query-understanding layer.
 - **Session/agent API.** The `remember/recall/forget`
-  surface that agent clients (Claude, etc.) call.
+  surface that agent clients (Claude, etc.) call. Cognee *also*
+  ships its own MCP server (`cognee-mcp`, a fixed 14-tool surface)
+  — but that's a reference server, not our tool registry. The
+  Lore Engine runs its **own** MCP server (45 tools) and calls
+  Cognee's Python API in-process; we don't register into
+  `cognee-mcp`.

 ## What the Lore Engine owns

--- a/docs/adr/0009-reified-relation-edges.md
+++ b/docs/adr/0009-reified-relation-edges.md
@@ -0,0 +1,101 @@
+# Edges with time/confidence are reified `:Relation` nodes
+
+**Status:** accepted.
+
+The Lore Engine's edges carry `valid_from`, `valid_until`,
+`extraction_confidence`, `source_confidence`, `is_disputed`,
+and `superseded_by` (ADRs 0001, 0002, + the retcon policy).
+Cognee's `graph_model` extraction cannot put those properties on
+a native edge: in a `graph_model`, an edge is a nested `DataPoint`
+*field* (field name = edge label, direction = owner → target), and
+the runtime `Edge` object carries only `weight` and
+`relationship_type`. There is no field for `valid_from` on an
+extracted edge — the LLM fills DataPoint fields, and a
+field-nesting edge has nowhere to put a time bound.
+
+So any edge that carries time bounds or confidence is a
+**reified `:Relation` node** — a `DataPoint` *node* representing
+the edge, with the time/confidence/dispute/retcon properties as
+node fields the LLM *can* emit via `graph_model`.
+
+## What this means
+
+- **Promote `Relation` from v1.1 to v1.** It was introduced in
+  `11-extensibility.md` alongside the polymorphic `DomainEntity`/
+  `TypeTemplate` system, but it doesn't depend on them. It's a
+  general reified-edge node between *any* two nodes. It now ships
+  in v1.
+- **Two edge representations, by purpose:**
+  - **Reified `:Relation` nodes** — any edge with time bounds or
+    confidence (`RULED`, `MEMBER_OF`, `PARTICIPATED_IN`,
+    `POSSESSES`, `SPOUSE_OF`, `PARENT_OF`, …). These are the ones
+    the time model, consistency engine, disputed-edge machinery,
+    and retcon policy operate on.
+  - **Native edges** (Cognee field-nesting): structural edges
+    that never carry time or confidence, such as `is_type` and
+    template-wiring. `EXISTS_IN` (entity → plane) is provisionally
+    *not* in this group; see "Open question" below. Cheap, direct,
+    no `Relation` node.
+- **Both ingest paths produce Relation nodes for time-bounded
+  edges.** The structured-YAML path (slice 1, `add_data_points`)
+  *could* attach properties to a native Neo4j edge, but doesn't —
+  it creates `Relation` nodes too, so the graph has one edge
+  representation regardless of ingest path. (Consistency beats
+  the small per-edge cost.)
+
+## The `Relation` node (v1 core)
+
+```cypher
+(:Relation {
+  id, from_id, to_id,
+  type,                      // the edge verb: RULED, MEMBER_OF, ...
+  valid_from, valid_until,
+  extraction_confidence, source_confidence,   // ADR 0001
+  sources[],                 // LoreSource ids
+  is_disputed,               // ADR 0002
+  disputed_with[],           // sibling Relation ids
+  superseded_by              // retcon policy (docs/19-retcon-policy.md)
+})
+```
+
+Queries match through the Relation node:
+```cypher
+MATCH (r:Relation {type: "RULED"})
+WHERE r.from_id = $subject_id AND r.to_id = $object_id
+  AND time_in_window($at_time, r.valid_from, r.valid_until)
+RETURN r, r.sources
+```
+
+## Trade-off acknowledged
+
+- **More nodes.** Every time-bounded edge is a node + two
+  structural edges (`from`/`to`) instead of one direct edge.
+  Acceptable: the engine's value is *reasoning over time and
+  conflict*, and that needs the properties somewhere queryable.
+- **Cypher is one hop longer.** `was_true_at` matches a
+  `Relation` node instead of a direct edge. The
+  `relation_time_window` index (`valid_from`, `valid_until`) keeps
+  it fast.
+- **Structural edges stay native.** We don't reify `is_type` or
+  template-wiring — they'd just add nodes with no payoff.
+
+## Open question (deferred)
+
+Is `EXISTS_IN` (entity → plane) time-bounded? An entity can move
+between planes (a character who plane-shifts to the Shadow
+Plane). If so, `EXISTS_IN` is a `Relation` node with time bounds;
+if plane membership is permanent, it's a native edge. Slice 6
+resolves this when the plane model ships. Default assumption for
+now: **`EXISTS_IN` is a `Relation` node** (time-bounded), since
+planar travel is a stock high-fantasy trope. Plane-to-plane
+edges (`REFLECTS`, `LAYER_OF`, `ADJACENT_TO`, `ACCESSIBLE_VIA`)
+are structural → native.
+
+## Cross-references
+
+- `docs/11-extensibility.md` — the `Relation` definition (now v1)
+- `docs/04-consistency.md` — Cypher now matches through Relation nodes
+- `docs/adr/0001-aggregate-confidence-floor.md` — confidence fields
+- `docs/adr/0002-disputed-edges-stay-separate.md` — `is_disputed`
+- `docs/19-retcon-policy.md` — `superseded_by`
+- `docs/adr/0007-graph-model-ontology-contract.md` — why `graph_model` can't carry edge properties
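To make the ADR's query pattern concrete, here is a minimal Python sketch of the reified-edge idea outside the database. It is an illustration under stated assumptions, not the engine's implementation: times are plain integers rather than the calendar-aware values the real time model handles, a `None` bound is treated as open-ended, windows are treated as closed intervals, and `relations_true_at` is a hypothetical helper. The names `time_in_window` and `time_windows_overlap` mirror the functions called in the Cypher, but their semantics here are guesses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Relation:
    """In-memory stand-in for a reified :Relation node (field names follow ADR 0009)."""
    id: str
    from_id: str
    to_id: str
    type: str                          # the edge verb, e.g. "RULED"
    valid_from: Optional[int] = None   # assumption: integer year; None = unbounded
    valid_until: Optional[int] = None  # assumption: integer year; None = unbounded
    extraction_confidence: float = 1.0
    source_confidence: float = 1.0
    sources: list[str] = field(default_factory=list)
    is_disputed: bool = False
    disputed_with: list[str] = field(default_factory=list)
    superseded_by: Optional[str] = None


def time_in_window(t: int, start: Optional[int], end: Optional[int]) -> bool:
    """True if t lies in [start, end]; a None bound is open on that side."""
    return (start is None or start <= t) and (end is None or t <= end)


def time_windows_overlap(a_start, a_end, b_start, b_end) -> bool:
    """True if the closed windows [a_start, a_end] and [b_start, b_end] share a point."""
    a_starts_in_time = a_start is None or b_end is None or a_start <= b_end
    b_starts_in_time = b_start is None or a_end is None or b_start <= a_end
    return a_starts_in_time and b_starts_in_time


def relations_true_at(relations, rel_type, subject_id, object_id, at_time):
    """Python analogue of the RULED query: filter by verb, endpoints, and time."""
    return [
        r for r in relations
        if r.type == rel_type
        and r.from_id == subject_id
        and r.to_id == object_id
        and time_in_window(at_time, r.valid_from, r.valid_until)
    ]


# Invented example data, for illustration only.
reign = Relation(id="rel_1", from_id="person_a", to_id="realm_x", type="RULED",
                 valid_from=380, valid_until=390, sources=["timeline.yaml"])
hits = relations_true_at([reign], "RULED", "person_a", "realm_x", at_time=385)
misses = relations_true_at([reign], "RULED", "person_a", "realm_x", at_time=400)
```

Because the time bounds live on the record itself, the filter needs nothing beyond the `Relation` fields, which is the property the reified representation is meant to provide.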
--- a/docs/plan/01-slice-structured-yaml.md
+++ b/docs/plan/01-slice-structured-yaml.md
@@ -2,7 +2,9 @@

 **Status:** 📋 planned. The slice that makes `was_true_at` actually
 have something to filter against (real `valid_from` / `valid_until`
-edges).
+on reified `:Relation` nodes — per ADR 0009, time-bounded edges are
+`Relation` nodes, not native edges; the YAML path creates them via
+`add_data_points`).

 ## Goal

--- a/docs/plan/03-slice-llm-extraction.md
+++ b/docs/plan/03-slice-llm-extraction.md
@@ -26,7 +26,11 @@ Wire up an LLM-backed extraction pipeline that:
 2. Custom extraction prompt that emits the 36 typed labels from
   `docs/01-ontology.md`.
 3. Custom relation extraction prompt that emits the ~70 typed edge
-   types.
+   types. Per ADR 0009, edges with time bounds or confidence are
+   reified `:Relation` nodes in the `graph_model` (Cognee can't
+   carry those properties on a native field-nesting edge) — the
+   prompt emits `Relation` nodes with `valid_from`/`valid_until`/
+   `extraction_confidence`, not bare edges.
 4. Entity resolution: pre-computed embeddings of entity names,
   top-K by similarity to the chunk being extracted (addresses
   critique S1.3). M3's 1M context window means the prompt can
--- a/docs/plan/06-slice-planes.md
+++ b/docs/plan/06-slice-planes.md
@@ -16,6 +16,8 @@ types. Multi-setting queries, planar relationships, and the
 2. `Plane` node: `(id, setting_id, name, kind)`.
 3. `EXISTS_IN` edge: every other entity gets
   `setting_id` + `plane_id` properties pointing through this edge.
+   Per ADR 0009's default assumption, `EXISTS_IN` is a reified
+   `:Relation` node (time-bounded, for planar travel), not a native edge.
 4. Four plane-relation edge types:
   - `REFLECTS` — Plane A reflects Plane B
   - `LAYER_OF` — Plane A is a layer of Plane B