docs(adr): 0009 — reified Relation edges; recall + tool-dispatch reframes

Three findings from the Cognee-API review:

ADR 0009 (the big one): edges with time/confidence are reified
:Relation nodes, promoted v1.1 -> v1. Cognee's graph_model can't
carry valid_from/valid_until/confidence on a native edge (an edge
is a nested DataPoint field; the Edge object only has weight +
relationship_type). So any edge the time model, consistency engine,
disputed-edge machinery, and retcon policy operate on is a Relation
node. Structural edges (is_type, template-wiring) stay native.
Propagated: 11-extensibility (Relation now v1, +disputed/retcon
fields), 04-consistency (Category A + B Cypher match through Relation
nodes, materialize is_disputed/disputed_with), 00-overview count,
CONTEXT.md (+Relation term), slice 1/3/6 notes.

Finding 1: cognee.recall is not 'low-precision' — it returns scored
multi-source RecallResponse objects (incl cypher/triplet/temporal
kinds), session-aware. It's the fallback because results are
un-typed/un-cited/un-time-bounded, not low-precision. Reframed in
07-reasoning-harness + 05-mcp-tools.

Finding 3: 'register our 45 tools with Cognee's dispatch' was false.
Cognee ships cognee-mcp (a fixed 14-tool surface) — a reference
server, not a registry we extend. Lore Engine runs its own MCP
server (45 tools), calls Cognee's Python API in-process. Reframed
in 00-overview + 22-cognee-boundary.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 23:20:26 -04:00
parent 45ca1d962d
commit ba314bc664
11 changed files with 163 additions and 22 deletions

@@ -42,7 +42,10 @@ The floor — `min(extraction_confidence × source_confidence)` across every sou
_Avoid_: confidence (unqualified — say which dimension)
**Disputed edge**:
When two sources produce the same (subject, relation, object) with *conflicting* time bounds. Kept as separate Edge records, both marked `is_disputed: true`, linked via `disputed_with` — not merged. (ADR 0002.)
When two sources produce the same (subject, relation, object) with *conflicting* time bounds. Kept as separate `Relation` records, both marked `is_disputed: true`, linked via `disputed_with` — not merged. (ADR 0002.)
**Relation**:
A reified-edge *node* — the representation for any edge that carries time bounds or confidence (`RULED`, `MEMBER_OF`, `PARTICIPATED_IN`, …). Carries `valid_from`/`valid_until`/`extraction_confidence`/`source_confidence`/`is_disputed`/`superseded_by`. Native edges (Cognee field-nesting) are used only for structural edges that never need time (`is_type`, template-wiring). Promoted from v1.1 to v1 (ADR 0009).
**Contradiction**:
A first-class *node* flagging incompatible claims between LoreSources. Built from scratch in slice 2 — Cognee ships no contradiction machinery.

@@ -24,11 +24,11 @@ The Lore Engine is built on top of [Cognee](https://github.com/topoteretes/cogne
| Embedding pipeline (vector embeddings of chunks + entities) | Production | Semantic search back-end. Cognee manages the embedding store; we query it through the `lore_about` and `cite` tools. |
| Agent-native `remember/recall/forget` API | Production | Storage-agnostic interface. The Lore Engine wraps `recall` with typed ontology + time model. |
| `DataPoint` / `Entity` schema | Production | Base node types. The Lore Engine adds `Person`, `Faction`, `Location`, etc. as typed extensions. |
| Session / task registry | Production | MCP server pattern. We register our 45 tools with Cognee's tool dispatch. |
| Session / task registry | Production | The Lore Engine ships its **own** MCP server exposing the 45 typed tools; it calls Cognee's Python API (`remember`/`recall`/`add_data_points`) in-process. Cognee's `cognee-mcp` (a fixed 14-tool surface) is a separate reference server — we don't register our tools into it. |
## What we add
- **Deeper ontology** — Era, Calendar, Lineage, Culture, Deity, MagicSystem, Spell, Language, Title, Item (covers Artifact), Region, plus 2 v1.2 nodes (Plane, Setting) and 5 consistency nodes. Roughly 36 node labels total (7 base + 19 v1 core + 2 v1.2 planes + 6 v1.1 polymorphic + 5 consistency). The 7 base types are Lore Engine originals built on Cognee's `DataPoint` — not inherited as-is (Cognee ships no `Contradiction`/`Message`/etc. nodes of its own). The v1.1 docs (`11-extensibility.md`) add the polymorphic `DomainEntity`, `Relation`, `TypeTemplate`, `NPC`, `PC`, and `Human` labels.
- **Deeper ontology** — Era, Calendar, Lineage, Culture, Deity, MagicSystem, Spell, Language, Title, Item (covers Artifact), Region, `Relation` (reified-edge node, promoted to v1 per ADR 0009), plus 2 v1.2 nodes (Plane, Setting) and 5 consistency nodes. Roughly 36 node labels total (7 base + 19 v1 core + 2 v1.2 planes + 5 v1.1 polymorphic + 5 consistency). The 7 base types are Lore Engine originals built on Cognee's `DataPoint` — not inherited as-is (Cognee ships no `Contradiction`/`Message`/etc. nodes of its own). The v1.1 docs (`11-extensibility.md`) add the polymorphic `DomainEntity`, `TypeTemplate`, `NPC`, `PC`, and `Human` labels (`Relation` graduated to v1).
- **Time as a first-class concept** — temporal validity windows on relations, era filters, "was X true at time T?" via dedicated tools. The LLM no longer has to guess which version of Aldric it is talking to.
- **Structured ingestion** — not just `.md` files. We ingest `timeline.yaml`, `family-tree.yaml`, `gazetteer.yaml`, `bestiary.yaml` as first-class sources. Free prose stays, but is no longer the only path.
- **A consistency engine** — Cypher-based rules that flag anachronisms (Aldric can't be at the Battle of Black Spire if it happened 200 years before his birth), missing lineage (a noble with no recorded parents), and ontological violations (a region claimed to be inside two non-overlapping kingdoms).
@@ -109,5 +109,6 @@ If you want to challenge it: jump to `10-critique.md` first. I tried to break it
- `0006-cognee-version-pin.md` — Cognee pinned at 1.1.2; harness is the upgrade gate
- `0007-graph-model-ontology-contract.md` — ontology enforced via `graph_model=`, not RDF/OWL
- `0008-graph-backend-neo4j.md` — Neo4j (not Cognee's Kuzu default); time model ships as Java UDF
- `0009-reified-relation-edges.md` — time-bounded/confident edges are `:Relation` nodes (Cognee `graph_model` can't carry edge properties)
**Build plan (`plan/`):** the 10 slice specs and an index.

@@ -61,26 +61,34 @@ A `LoreSource` makes a claim about an entity that another `LoreSource` contradic
- `BELONGS_TO` (a Person belongs to one culture at a time, unless `valid_from`/`valid_until` differ)
- `EXISTED_DURING` (a Person/Faction/Location/Item has one existence window; multiple non-contiguous windows are valid for reincarnated deities etc., but they must not overlap)
**Cypher (general pattern):**
**Cypher (general pattern)** — edges are reified `:Relation` nodes per ADR 0009, so the contradiction check matches Relation nodes by `type` and overlapping time windows:
```cypher
MATCH (a)-[r1:RELATION_TYPE]->(b)
MATCH (a)-[r2:RELATION_TYPE]->(c)
WHERE b <> c
MATCH (a)
MATCH (r1:Relation {type: "RELATION_TYPE"}) WHERE r1.from_id = a.id
MATCH (r2:Relation {type: "RELATION_TYPE"}) WHERE r2.from_id = a.id
AND r1.to_id <> r2.to_id
AND time_windows_overlap(r1.valid_from, r1.valid_until, r2.valid_from, r2.valid_until)
MERGE (contra:Contradiction {subject: a.name, predicate: "RELATION_TYPE",
claim_a: b.name, claim_b: c.name})
ON CREATE SET contra.detected_at = timestamp()
WITH a, contra
claim_a: r1.to_id, claim_b: r2.to_id})
ON CREATE SET contra.detected_at = timestamp(), contra.is_disputed = true
WITH a, r1, r2, contra
SET r1.is_disputed = true, r2.is_disputed = true,
r1.disputed_with = coalesce(r1.disputed_with, []) + [r2.id],
r2.disputed_with = coalesce(r2.disputed_with, []) + [r1.id]
MERGE (a)-[:HAS_CONTRADICTION]->(contra)
```
(Note: the `is_disputed` / `disputed_with` mutation here is the consistency engine materializing ADR 0002's disputed-edge state, not the LLM inventing it.)
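The same check can be sketched outside the database. This is a minimal in-memory illustration, not the engine's implementation. It assumes times are plain integers (the real time model is a Neo4j UDF over in-fiction dates), that `None` means an open bound, and that windows are closed intervals. The `Relation` dataclass and `find_disputes` helper are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional


@dataclass
class Relation:
    # Hypothetical in-memory stand-in for a :Relation node.
    id: str
    from_id: str
    to_id: str
    type: str
    valid_from: Optional[int] = None   # None = open bound (assumption)
    valid_until: Optional[int] = None  # None = open bound (assumption)
    is_disputed: bool = False
    disputed_with: list = field(default_factory=list)


def time_windows_overlap(a_from, a_until, b_from, b_until):
    """Closed-interval overlap; None is treated as an open bound (assumption)."""
    a_starts_in_time = a_from is None or b_until is None or a_from <= b_until
    b_starts_in_time = b_from is None or a_until is None or b_from <= a_until
    return a_starts_in_time and b_starts_in_time


def find_disputes(relations, relation_type):
    """Mark pairs with the same subject, different objects, overlapping windows."""
    pairs = []
    candidates = [r for r in relations if r.type == relation_type]
    for r1, r2 in combinations(candidates, 2):
        if r1.from_id != r2.from_id or r1.to_id == r2.to_id:
            continue
        if not time_windows_overlap(r1.valid_from, r1.valid_until,
                                    r2.valid_from, r2.valid_until):
            continue
        # Materialize the disputed state symmetrically on both records.
        for a, b in ((r1, r2), (r2, r1)):
            a.is_disputed = True
            if b.id not in a.disputed_with:
                a.disputed_with.append(b.id)
        pairs.append((r1.id, r2.id))
    return pairs
```

The sketch visits each unordered pair once, so a sibling id is never appended to `disputed_with` twice.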
### Category B: Anachronism detection
For every edge of type `PARTICIPATED_IN`, `WITNESSED`, `LOCATED_IN`, `POSSESSES`, `CAUSED`, `CREATED` — verify the subject's existence window contains the event/object's time.
**Cypher (anachronism: entity before birth):**
**Cypher (anachronism: entity before birth)** — `PARTICIPATED_IN` is a reified `:Relation` (ADR 0009):
```cypher
MATCH (p:Person)-[r:PARTICIPATED_IN]->(e:Event)
MATCH (p:Person)
MATCH (r:Relation {type: "PARTICIPATED_IN"}) WHERE r.from_id = p.id
MATCH (e:Event {id: r.to_id})
WHERE p.birth IS NOT NULL
AND time_in_window(e.in_fiction_date, p.birth, p.death) = false
MERGE (an:Anachronism {entity_name: p.name, event_name: e.name,

@@ -384,7 +384,7 @@ The user can disable any rule by ID, and add new ones via `add_ontology_rule`.
## Tool count: 45 total (8 base + 37 domain)
The full catalog: 8 base tools (one wraps `cognee.recall`; the rest are Lore Engine originals) + 37 domain tools across Groups 1–8 = **45 MCP tools** — all Lore Engine handlers. Cognee ships no MCP tool catalog of its own; a handful of tools delegate to Cognee primitives (`cognee.recall`, `cognee.add`) as a low-precision fallback. That's well past the empirical LLM tool-use ceiling (~25 in a single system prompt), so the Phase 6 reasoning-harness validation measures usage and collapses the long tail. The LLM uses 5–8 of them 90% of the time; the long tail exists for edge cases the LLM will sometimes need and shouldn't have to fall back to free-text generation for.
The full catalog: 8 base tools (one wraps `cognee.recall`; the rest are Lore Engine originals) + 37 domain tools across Groups 1–8 = **45 MCP tools** — all Lore Engine handlers, exposed through the **Lore Engine's own MCP server** (Cognee's `cognee-mcp` is a separate, fixed 14-tool surface; we don't register into it). The server calls Cognee's Python API (`remember`/`recall`/`add_data_points`) in-process. A handful of tools delegate to `cognee.recall` as a fallback — not because recall is low-precision (it returns scored, multi-source results) but because its results are un-typed, un-cited, and un-time-bounded. A 45-tool catalog is well past the empirical LLM tool-use ceiling (~25 in a single system prompt), so the Phase 6 reasoning-harness validation measures usage and collapses the long tail. The LLM uses 5–8 of them 90% of the time; the long tail exists for edge cases the LLM will sometimes need and shouldn't have to fall back to free-text generation for.
This is on the high end of what an LLM can effectively use in a single context. We mitigate by:
- The reasoning harness documents which 8 to use first.

@@ -46,7 +46,13 @@ When a tool returns an error, surface the error to the user and stop.
Fallback to Cognee primitives: if no Lore Engine tool fits the question,
use cognee.recall("free-text query") for semantic search over the chunk store.
This is a low-precision fallback — prefer the typed tools when possible.
`recall` is not low-precision — it returns scored, multi-source results
(`list[RecallResponse]`, including `cypher`/`triplet_completion`/`temporal`
kinds) and is session-aware. It's the fallback because its results are
**un-typed, un-cited, and un-time-bounded** — exactly the three things the
typed tools add. Prefer the typed tools when the question has a subject,
a time, or needs a citation; use `recall` for breadth ("what does the
chronicle say about…") and when no typed tool fits.
```
This is the bedrock. The patterns below build on it.
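The preference stated in the prompt can be written as a small routing rule. This is only a sketch of the decision, assuming the harness has already extracted a few features from the question. The `Question` fields and `choose_path` function are hypothetical and not part of the Lore Engine or Cognee.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    # Hypothetical features a harness might extract from a user question.
    subject: Optional[str] = None
    at_time: Optional[str] = None
    needs_citation: bool = False


def choose_path(q: Question, typed_tool_available: bool = True) -> str:
    """Return "typed" or "recall", following the preference in the prompt."""
    wants_typed = (
        q.subject is not None or q.at_time is not None or q.needs_citation
    )
    if wants_typed and typed_tool_available:
        return "typed"
    # Breadth questions, or questions no typed tool fits, fall back to recall.
    return "recall"
```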

@@ -1,6 +1,8 @@
# 11 — Extensibility: Polymorphic Type Templates
The v1 ontology has 36 hard-coded labels (7 inherited from Cognee + 19 v1 core + 2 v1.2 planes + 6 v1.1 polymorphic + 5 consistency). That's *fine for the first world* but it has a ceiling: a thieves-guild mission is forced into `:Event`, a war campaign is forced into `:Faction`-with-properties, a black-market trade log is forced into `:Item`-with-properties. The LLM can *talk* about these things, but the engine can't *reason* over their structure.
The v1 ontology has 36 hard-coded labels (7 base + 19 v1 core — including `Relation`, promoted from v1.1 per ADR 0009 + 2 v1.2 planes + 5 v1.1 polymorphic + 5 consistency). That's *fine for the first world* but it has a ceiling: a thieves-guild mission is forced into `:Event`, a war campaign is forced into `:Faction`-with-properties, a black-market trade log is forced into `:Item`-with-properties. The LLM can *talk* about these things, but the engine can't *reason* over their structure.
> **Note (ADR 0009):** `Relation` now ships in **v1**, not v1.1. It's a general reified-edge node (any edge with time bounds or confidence is a `Relation` node, because Cognee's `graph_model` can't put those properties on a native edge). The `DomainEntity` + `TypeTemplate` polymorphic system below is still v1.1; `Relation` is the one piece that graduated. See `docs/adr/0009-reified-relation-edges.md`.
**"What missions has the Crimson Hand run in Mardsville over the last year, sorted by payout?"** is unanswerable today. The data lives in `summary` text fields, the relationships are implicit in prose, and the LLM has to reconstruct it via `semantic_search` and hope.
@@ -115,23 +117,30 @@ A single new node label and a single new edge label. This is the polymorphic bac
})
// A relation between any two nodes (DomainEntity, Person, Faction, etc.)
// — promoted to v1 core (ADR 0009). This is THE representation for any
// edge that carries time bounds or confidence, used by the time model,
// consistency engine, disputed-edge machinery, and retcon policy.
(:Relation {
id: "rel_abc",
from_id: "mission_4471",
to_id: "person_vex_silent",
type: "GIVEN_BY", // matches a template's allowed_relations
properties: { // typed by template
type: "GIVEN_BY", // the edge verb; matches a template's allowed_relations
properties: { // typed by template (v1.1 polymorphism)
at: "3rd_age.year_385",
agreed_payout: 500
},
valid_from: "3rd_age.year_385",
valid_until: null,
extraction_confidence: 0.88, // ADR 0001
source_confidence: 0.9, // ADR 0001
sources: ["crimson_hand_records.yaml"],
source_confidence: 0.9
is_disputed: false, // ADR 0002
disputed_with: [], // sibling Relation ids
superseded_by: null // retcon policy (docs/19-retcon-policy.md)
})
```
Two new labels. ~200 lines of Cypher. Stable forever.
`Relation` is v1 core. `DomainEntity` + `TypeTemplate` (below) are the v1.1 polymorphic layer that builds on top of it.
### Indexes for Layer 2

@@ -25,7 +25,12 @@ when someone asks "could we swap Cognee for X?"
- **Retrieval.** The `cognee.recall` API and its
query-understanding layer.
- **Session/agent API.** The `remember/recall/forget`
surface that agent clients (Claude, etc.) call.
surface that agent clients (Claude, etc.) call. Cognee *also*
ships its own MCP server (`cognee-mcp`, a fixed 14-tool surface)
— but that's a reference server, not our tool registry. The
Lore Engine runs its **own** MCP server (45 tools) and calls
Cognee's Python API in-process; we don't register into
`cognee-mcp`.
## What the Lore Engine owns

@@ -0,0 +1,101 @@
# Edges with time/confidence are reified `:Relation` nodes
**Status:** accepted.
The Lore Engine's edges carry `valid_from`, `valid_until`,
`extraction_confidence`, `source_confidence`, `is_disputed`,
and `superseded_by` (ADRs 0001, 0002, + the retcon policy).
Cognee's `graph_model` extraction cannot put those properties on
a native edge: in a `graph_model`, an edge is a nested `DataPoint`
*field* (field name = edge label, direction = owner → target), and
the runtime `Edge` object carries only `weight` and
`relationship_type`. There is no field for `valid_from` on an
extracted edge — the LLM fills DataPoint fields, and a
field-nesting edge has nowhere to put a time bound.
So any edge that carries time bounds or confidence is a
**reified `:Relation` node** — a `DataPoint` *node* representing
the edge, with the time/confidence/dispute/retcon properties as
node fields the LLM *can* emit via `graph_model`.
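To make the distinction concrete, here is a minimal sketch that uses plain Python dataclasses as a stand-in for Cognee `DataPoint` models. The real models would subclass Cognee's `DataPoint`; the class names, field names, and ids below are illustrative assumptions. It shows why a field-nesting edge has no slot for a time bound, while a reified node does.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EntityType:
    name: str


@dataclass
class Person:
    id: str
    name: str
    # Native (field-nesting) edge: the field name is the edge label and the
    # nested object is the target. There is nowhere to put valid_from.
    is_type: Optional[EntityType] = None


@dataclass
class Relation:
    # Reified edge: an ordinary node whose fields carry the edge metadata.
    id: str
    from_id: str
    to_id: str
    type: str
    valid_from: Optional[str] = None
    valid_until: Optional[str] = None
    extraction_confidence: Optional[float] = None
    source_confidence: Optional[float] = None


aldric = Person(id="person_aldric", name="Aldric", is_type=EntityType("Person"))
ruled = Relation(
    id="rel_001",
    from_id=aldric.id,
    to_id="faction_example",  # hypothetical target id
    type="RULED",
    valid_from="3rd_age.year_385",
    extraction_confidence=0.88,
    source_confidence=0.9,
)
relation_fields = {f.name for f in fields(Relation)}
person_fields = {f.name for f in fields(Person)}
```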
## What this means
- **Promote `Relation` from v1.1 to v1.** It was introduced in
`11-extensibility.md` alongside the polymorphic `DomainEntity`/
`TypeTemplate` system, but it doesn't depend on them. It's a
general reified-edge node between *any* two nodes. It now ships
in v1.
- **Two edge representations, by purpose:**
- **Reified `:Relation` nodes** — any edge with time bounds or
confidence (`RULED`, `MEMBER_OF`, `PARTICIPATED_IN`,
`POSSESSES`, `SPOUSE_OF`, `PARENT_OF`, …). These are the ones
the time model, consistency engine, disputed-edge machinery,
and retcon policy operate on.
- **Native edges** (Cognee field-nesting) — structural edges
that never carry time or confidence: `is_type` and
template-wiring. (`EXISTS_IN` is provisionally a `Relation`
node, not a native edge; see "Open question" below.) Cheap, direct,
no `Relation` node.
- **Both ingest paths produce Relation nodes for time-bounded
edges.** The structured-YAML path (slice 1, `add_data_points`)
*could* attach properties to a native Neo4j edge, but doesn't —
it creates `Relation` nodes too, so the graph has one edge
representation regardless of ingest path. (Consistency beats
the small per-edge cost.)
## The `Relation` node (v1 core)
```cypher
(:Relation {
id, from_id, to_id,
type, // the edge verb: RULED, MEMBER_OF, ...
valid_from, valid_until,
extraction_confidence, source_confidence, // ADR 0001
sources[], // LoreSource ids
is_disputed, // ADR 0002
disputed_with[], // sibling Relation ids
superseded_by // retcon policy (docs/19-retcon-policy.md)
})
```
Queries match through the Relation node:
```cypher
MATCH (r:Relation {type: "RULED"})
WHERE r.from_id = $subject_id AND r.to_id = $object_id
AND time_in_window($at_time, r.valid_from, r.valid_until)
RETURN r, r.sources
```
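The filter in that query can be mirrored in a few lines of Python for unit-testing the intended semantics. This is a sketch under stated assumptions: times are integers rather than in-fiction date strings, `None` is an open bound, and windows are inclusive at both ends. The actual `time_in_window` is a database function whose exact boundary behaviour is not specified here, and the in-memory `was_true_at` helper is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Relation:
    # Hypothetical in-memory stand-in for a :Relation node.
    id: str
    from_id: str
    to_id: str
    type: str
    valid_from: Optional[int] = None
    valid_until: Optional[int] = None
    sources: list = field(default_factory=list)


def time_in_window(at_time, valid_from, valid_until):
    """True if at_time lies in [valid_from, valid_until]; None = open bound."""
    after_start = valid_from is None or valid_from <= at_time
    before_end = valid_until is None or at_time <= valid_until
    return after_start and before_end


def was_true_at(relations, subject_id, object_id, rel_type, at_time):
    """Return the Relation records asserting the edge at the given time."""
    return [
        r for r in relations
        if r.type == rel_type
        and r.from_id == subject_id
        and r.to_id == object_id
        and time_in_window(at_time, r.valid_from, r.valid_until)
    ]
```

Returning the matching records, rather than a bare boolean, keeps `sources` available for citation, as the `RETURN r, r.sources` clause does.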
## Trade-off acknowledged
- **More nodes.** Every time-bounded edge is a node + two
structural edges (`from`/`to`) instead of one direct edge.
Acceptable: the engine's value is *reasoning over time and
conflict*, and that needs the properties somewhere queryable.
- **Cypher is one hop longer.** `was_true_at` matches a
`Relation` node instead of a direct edge. The
`relation_time_window` index (`valid_from`, `valid_until`) keeps
it fast.
- **Structural edges stay native.** We don't reify `is_type` or
template-wiring — they'd just add nodes with no payoff.
## Open question (deferred)
Is `EXISTS_IN` (entity → plane) time-bounded? An entity can move
between planes (a character who plane-shifts to the Shadow
Plane). If so, `EXISTS_IN` is a `Relation` node with time bounds;
if plane membership is permanent, it's a native edge. Slice 6
resolves this when the plane model ships. Default assumption for
now: **`EXISTS_IN` is a `Relation` node** (time-bounded), since
planar travel is a stock high-fantasy trope. Plane-to-plane
edges (`REFLECTS`, `LAYER_OF`, `ADJACENT_TO`, `ACCESSIBLE_VIA`)
are structural → native.
## Cross-references
- `docs/11-extensibility.md` — the `Relation` definition (now v1)
- `docs/04-consistency.md` — Cypher now matches through Relation nodes
- `docs/adr/0001-aggregate-confidence-floor.md` — confidence fields
- `docs/adr/0002-disputed-edges-stay-separate.md` — `is_disputed`
- `docs/19-retcon-policy.md` — `superseded_by`
- `docs/adr/0007-graph-model-ontology-contract.md` — why `graph_model` can't carry edge properties

@@ -2,7 +2,9 @@
**Status:** 📋 planned. The slice that makes `was_true_at` actually
have something to filter against (real `valid_from` / `valid_until`
edges).
on reified `:Relation` nodes — per ADR 0009, time-bounded edges are
`Relation` nodes, not native edges; the YAML path creates them via
`add_data_points`).
## Goal

@@ -26,7 +26,11 @@ Wire up an LLM-backed extraction pipeline that:
2. Custom extraction prompt that emits the 36 typed labels from
`docs/01-ontology.md`.
3. Custom relation extraction prompt that emits the ~70 typed edge
types.
types. Per ADR 0009, edges with time bounds or confidence are
reified `:Relation` nodes in the `graph_model` (Cognee can't
carry those properties on a native field-nesting edge) — the
prompt emits `Relation` nodes with `valid_from`/`valid_until`/
`extraction_confidence`, not bare edges.
4. Entity resolution: pre-computed embeddings of entity names,
top-K by similarity to the chunk being extracted (addresses
critique S1.3). M3's 1M context window means the prompt can

@@ -16,6 +16,8 @@ types. Multi-setting queries, planar relationships, and the
2. `Plane` node: `(id, setting_id, name, kind)`.
3. `EXISTS_IN` edge: every other entity gets
`setting_id` + `plane_id` properties pointing through this edge.
Per ADR 0009, `EXISTS_IN` is a reified `:Relation` node
(time-bounded — planar travel), not a native edge.
4. Four plane-relation edge types:
- `REFLECTS` — Plane A reflects Plane B
- `LAYER_OF` — Plane A is a layer of Plane B