Kaysser Kayyali 50d8deab55 docs: reframe consistency engine as from-scratch on Cognee; add CONTEXT.md glossary

Research into Cognee's actual API (docs.cognee.ai) confirmed the
docs made a load-bearing false claim: that the Lore Engine
'inherits and generalizes' a Contradiction node, get_contradictions
tool, 8 inherited MCP tools, and neo4j-init.cypher from the substrate.

Cognee ships NONE of that. Cognee provides DataPoint + custom graph
models + remember/recall + a Cypher/APOC graph-rule pattern. So:
  - Slice 2 (consistency) is a from-scratch BUILD, not a generalization
  - Categories A/B/D (Contradiction/Anachronism/Orphan) are ours
  - Category C (declarative OntologyRule) rides Cognee's Cypher pattern
  - '8 inherited tools' -> '8 base tools' (one wraps cognee.recall)
  - '7 inherited labels' -> '7 base types' (Lore Engine originals on DataPoint)

Fixed across 04-consistency, 01-ontology, 05-mcp-tools, 00-overview,
09-roadmap, 15-related-work, 16-comparison. Historical GraphMCP
comparisons left intact.

Added CONTEXT.md (glossary) — the grill-with-docs skill mandates it
and 6 ADRs' worth of resolved terms (Lineage/Faction/Region/Plane/
LoreSource/extraction+source confidence/disputed edge/retcon/Setting/
ConsistencyRun/Cognee) had no single home. New readers no longer mine
ADR prose for the vocabulary.

Co-Authored-By: Claude <noreply@anthropic.com>

2026-06-17 22:36:07 -04:00

09 — Roadmap

A phased build plan. MVP first, modules second. Each phase produces a working artifact that adds value, not a half-built layer that has to wait for the next.

Phase 0: Pre-flight (1 day)

Before any new code:

Read this design end-to-end. Find the contradictions (they're there; see 10-critique.md).
Resolve open questions in 10-critique.md#open-questions.
Make the world-builder pick the first YAML schema to formalize. Recommendation: family_tree.yaml first, because lineage is the highest-stakes data and the prose path's failure rate is highest there.

Deliverable: sign-off on this design, the YAML schemas for at least 3 source types, and a 1-line summary of the first world to be modeled.

Phase 1: Schema + UDFs (3–5 days)

The substrate. No LLM-facing changes; just the data layer.

Add all new constraints and indexes from 08-architecture.md#schema-bootstrap to neo4j-init.cypher.
Implement time_in_window and time_windows_overlap UDFs in Java (Neo4j user-defined function). Unit-test against 30+ known cases including era-tree membership and current resolution.
Add the :Now config node to the schema (a single :Now node with current_time: "3rd_age.year_380" or similar).
Add 5 starter :OntologyRule nodes to the schema (the 5 most common from 05-mcp-tools.md#starter-rules).
Document the canonical time format as a comment block in the schema file.

Deliverable: running Neo4j with the extended schema, both UDFs working, a test script that validates UDF behavior against 30+ cases. Cognee's default recall still works for unstructured queries alongside the typed model.
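The intended behavior of the two UDFs can be pinned down with a small reference model before the Java version exists. The sketch below is an assumption-laden illustration, not the UDF itself: the `ERA_ORDER` table, the `era.year_N` pattern, and the choice of closed (inclusive) windows are all guesses at the canonical time format, and the real era tree lives in the graph.

```python
import re

# Assumed era ordering; hypothetical stand-in for the era tree in the graph.
ERA_ORDER = {"1st_age": 1, "2nd_age": 2, "3rd_age": 3}

# Assumed canonical form: "<era>.year_<number>", e.g. "3rd_age.year_345".
_TIME_RE = re.compile(r"^(?P<era>[a-z0-9_]+)\.year_(?P<year>\d+)$")


def parse_time(value):
    """Turn '3rd_age.year_345' into a sortable (era_rank, year) pair."""
    match = _TIME_RE.match(value)
    if match is None or match["era"] not in ERA_ORDER:
        raise ValueError(f"unrecognized time value: {value!r}")
    return ERA_ORDER[match["era"]], int(match["year"])


def time_in_window(t, start, end):
    """True when t falls inside the closed window [start, end]."""
    return parse_time(start) <= parse_time(t) <= parse_time(end)


def time_windows_overlap(a_start, a_end, b_start, b_end):
    """True when two closed windows share at least one instant."""
    return (parse_time(a_start) <= parse_time(b_end)
            and parse_time(b_start) <= parse_time(a_end))
```

A reference model like this can double as the oracle for the 30+ unit-test cases: each case is run through both the Java UDF and the model, and any disagreement is a bug in one of them.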

Verify:

# Assumes NEO4J_PASSWORD holds the Neo4j password.
docker exec -i neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" < schema/init.cypher
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352')"
# → true

Phase 2: `time_in_window`-aware tools (3–5 days)

Add 4 of the 5 time-aware MCP tools from 05-mcp-tools.md#group-2:

was_true_at
true_during
entities_present
timeline

(Defer state_at until Phase 4, when the ontology is fully populated.)

Each tool is a single Go function in mcp-server/main.go with one Cypher query. The existing pattern (session-based, JSON-RPC, tool registration) carries over.

Deliverable: 4 new MCP tools registered, working end-to-end against a small seed dataset. Manual test: ask Claude a time-bounded question, get a sourced answer.

Verify:

curl -X POST http://localhost:9000/mcp -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"RULED","subject":"House Vyr","object":"Valdorn","at_time":"3rd_age.year_345"}}}'
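
For scripted checks, the same request body can be built programmatically rather than hand-written. This is a minimal sketch using only the standard library; `build_tool_call` is a hypothetical helper name, and the payload shape is copied from the curl command above rather than from any MCP client library.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Serialize a JSON-RPC 2.0 tools/call request body (hypothetical helper)."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(payload)


# Same arguments as the manual curl check.
body = build_tool_call(
    "was_true_at",
    {
        "relation": "RULED",
        "subject": "House Vyr",
        "object": "Valdorn",
        "at_time": "3rd_age.year_345",
    },
)
```

Building the body from a dict keeps the quoting correct when subject or object names contain apostrophes or quotes, which is easy to get wrong in a shell one-liner.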

Phase 3: Structured ingestion (5–7 days)

The new pipeline. This is the single highest-leverage phase.

Build services/structured-ingestor/ Go worker.
Implement parsers for timeline.yaml, family_tree.yaml, gazetteer.yaml (the three highest-stakes types). Defer bestiary.yaml, magic_system.yaml, culture.yaml to Phase 5.
Add a POST /ingest/structured endpoint to the ingestion worker.
Wire the new Redis stream (raw.structured) into docker-compose.
Write 3 example YAMLs for the seed world.
Add a CLI wrapper tea add-source <file> (optional but nice).

Deliverable: world-builder can write a family_tree.yaml, post it, see the lineage nodes and PARENT_OF edges in Neo4j within 5 seconds. No LLM involved.
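
The deterministic step at the heart of this phase is flattening a parsed family tree into PARENT_OF edges. The sketch below assumes a hypothetical `people` / `children` layout for family_tree.yaml (the real schema is decided in Phase 0) and starts from the dict a YAML loader would return, so it needs no third-party parser. All names except Aldric are invented for illustration.

```python
# Hypothetical shape of a parsed family_tree.yaml, as a YAML loader would return it.
family_tree = {
    "people": [
        {"name": "Aldric", "children": ["Maren", "Tobin"]},
        {"name": "Maren", "children": ["Isolde"]},
        {"name": "Tobin"},
        {"name": "Isolde"},
    ]
}


def parent_of_edges(tree):
    """Flatten the tree into (parent, child) pairs, rejecting unknown children."""
    known = {person["name"] for person in tree["people"]}
    edges = []
    for person in tree["people"]:
        for child in person.get("children", []):
            if child not in known:
                # Fail loudly instead of silently creating a dangling node.
                raise ValueError(f"{person['name']} lists unknown child {child!r}")
            edges.append((person["name"], child))
    return edges
```

Rejecting unknown children at parse time is one possible design choice; it keeps typos in the YAML from surfacing later as orphan nodes.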

Verify:

curl -X POST http://localhost:8080/ingest/structured -F "file=@test-family-tree.yaml"
# Assumes NEO4J_PASSWORD holds the Neo4j password.
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "MATCH (a:Person)-[:PARENT_OF]->(b:Person) RETURN a.name, b.name"

Phase 4: `state_at` + `entity_context` + `lookup` (3–5 days)

The disambiguation and snapshot tools. These unlock the rest.

lookup(query, type?) — the entry point. Uses string similarity + the :Entity hub node.
entity_context(name, at_time?) — one-hop summary.
state_at(entity, at_time) — composes multiple queries.
Active context tracking in the MCP server (session-scoped).

Deliverable: the LLM can ask "who is X?" and get a one-call answer, and can disambiguate ambiguous references via lookup.
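
One way to picture the string-similarity half of lookup is a ranked fuzzy match over candidate entity names. The sketch below uses Python's standard `difflib.SequenceMatcher`; the in-memory `entities` list, the `entity_type` parameter name, and the `min_score` cutoff are assumptions for illustration, and the real tool would query the :Entity hub node instead.

```python
from difflib import SequenceMatcher


def lookup(query, entities, entity_type=None, limit=3, min_score=0.5):
    """Rank candidate entities by name similarity to the query."""
    scored = []
    for entity in entities:
        # Optional type filter, mirroring lookup(query, type?).
        if entity_type is not None and entity["type"] != entity_type:
            continue
        score = SequenceMatcher(None, query.lower(), entity["name"].lower()).ratio()
        if score >= min_score:
            scored.append({**entity, "score": round(score, 3)})
    # Best match first; the caller disambiguates among the top few.
    return sorted(scored, key=lambda e: e["score"], reverse=True)[:limit]


# Hypothetical candidates; only Aldric and Valdorn appear elsewhere in this doc.
entities = [
    {"name": "Aldric", "type": "Person"},
    {"name": "Aldric's Keep", "type": "Location"},
    {"name": "Valdorn", "type": "Location"},
]
```

Returning several scored candidates rather than a single winner lets the LLM ask a follow-up question when two entities are close matches.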

Phase 5: Remaining structured ingestors (3–5 days)

bestiary.yaml parser
magic_system.yaml parser
culture.yaml parser
language.yaml parser (could merge with culture)

Deliverable: the world can be fully described in structured YAML, and the engine ingests it deterministically.

Phase 6: Lineage & hierarchy tools (2–3 days)

list_lineage(person)
list_offspring(person)
ancestors_of(person, generations?)
descendants_of(person, generations?)
location_hierarchy(location, direction?)

Deliverable: the LLM can navigate bloodlines and geography in a single tool call.
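
The traversal behind ancestors_of(person, generations?) is a bounded breadth-first walk up PARENT_OF edges. The sketch below works on an in-memory edge list with invented names. It is only meant to make the semantics of the optional `generations` limit concrete, and the real tool would be a single Cypher query.

```python
from collections import deque

# Hypothetical (parent, child) pairs standing in for PARENT_OF edges.
PARENT_OF = [("Aldric", "Maren"), ("Maren", "Isolde"), ("Tobin", "Isolde")]


def ancestors_of(person, edges, generations=None):
    """Map each ancestor to its distance in generations from person."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)

    found = {}
    queue = deque([(person, 0)])
    while queue:
        current, depth = queue.popleft()
        # Stop climbing once the generation limit is reached.
        if generations is not None and depth >= generations:
            continue
        for parent in parents.get(current, []):
            if parent not in found:  # also guards against cycles in bad data
                found[parent] = depth + 1
                queue.append((parent, depth + 1))
    return found
```

descendants_of follows the same pattern with the edge direction reversed.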

Phase 7: Consistency engine (5–7 days)

services/consistency-runner/ Go worker.
Implement all 4 rule categories from 04-consistency.md.
Implement the 10 starter :OntologyRule nodes.
Implement services/consistency-monitor/ HTTP service.
Expose get_contradictions, get_anachronisms, get_ontology_violations, get_orphans, flag_for_review, explain_violation, run_consistency_check, latest_run MCP tools.
Schedule the runner at 03:00 daily (cron in docker-compose).

Deliverable: the engine flags its first real contradiction. This is the highest-leverage phase after structured ingestion.

Verify: ingest two sources that disagree on the same fact; confirm a Contradiction node is created; call get_contradictions(subject=X) and see it.
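
The verification scenario above reduces to a simple grouping check. The sketch below assumes each claim is a flat record and that the relation is single-valued at a given time (one ruler per realm per year); both are illustrative assumptions, and the rule definitions in 04-consistency.md remain authoritative. House Tarn and the source names are invented.

```python
from collections import defaultdict

# Hypothetical claim records extracted from two sources.
claims = [
    {"subject": "Valdorn", "relation": "RULED_BY", "object": "House Vyr",
     "at_time": "3rd_age.year_345", "source": "chronicle_a"},
    {"subject": "Valdorn", "relation": "RULED_BY", "object": "House Tarn",
     "at_time": "3rd_age.year_345", "source": "chronicle_b"},
    {"subject": "Valdorn", "relation": "RULED_BY", "object": "House Vyr",
     "at_time": "3rd_age.year_350", "source": "chronicle_a"},
]


def find_contradictions(claims):
    """Flag facts where sources disagree on the object of a single-valued relation."""
    by_fact = defaultdict(list)
    for claim in claims:
        key = (claim["subject"], claim["relation"], claim["at_time"])
        by_fact[key].append(claim)

    contradictions = []
    for (subject, relation, at_time), group in by_fact.items():
        objects = {c["object"] for c in group}
        if len(objects) > 1:
            contradictions.append({
                "subject": subject,
                "relation": relation,
                "at_time": at_time,
                "objects": sorted(objects),
                "sources": sorted({c["source"] for c in group}),
            })
    return contradictions
```

Each returned record carries the disagreeing sources, which is the information a Contradiction node would need for get_contradictions(subject=X) to give a sourced answer.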

Phase 8: Lore extension tools (2–3 days)

expand_context(entity, hops, relation_filter, min_confidence, limit)
event_chain(event, depth)
significance_of(entity)
lore_about(entity, type?, limit)
cite(claim)

Deliverable: the LLM can answer "how are A and B connected?" and "what do the chronicles say about X?" in single calls.

Phase 9: Generation tools (3–5 days)

summarize_chain(entity, depth, style) — uses LiteLLM proxy.
narrate_arc(start_event, end_event, perspective?) — composes multiple queries.
World-builder tools (add_entity, add_relation, add_lore_source, etc.).

Deliverable: the LLM can produce grounded narrative text on demand.

Phase 10: Reasoning harness + integration test (3–5 days)

Write the system prompt from 07-reasoning-harness.md.
Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
Iterate on the system prompt and tool descriptions.

Deliverable: the LLM, with the system prompt and the tool surface, can answer 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.

Phase 11: Polish (open-ended)

UI for the consistency engine (browse contradictions, anachronisms, orphans).
UI for world-builders (YAML editor with autocomplete, validation, preview).
Export: render the world as a wiki, a book, a campaign primer.
Versioning: graph snapshots, time-travel queries.
Cross-setting queries: the engine is per-setting, but a future version supports multiple.

Total scope estimate

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer (Cognee extension) | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| MVP (Phases 0–3) | 16 days | end of phase 3 |

The table uses the unified Cognee-based phase numbering, not the Phase 0–11 breakdown above. The MVP is the end of Phase 3: a validated Cognee substrate, the typed Lore Engine ontology, the time model + UDFs, and the 45 MCP tools exposed through Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions. The consistency engine and the TypeTemplate polymorphic extension are the v1.1 follow-ups; per the v1.1 plan below, they land in Phases 4 and 5 of the unified roadmap.

What to cut if you're under time pressure

In strict order:

Phase 7 — Polish. Trivially deferrable.
Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The non-cuttable core is Phases 0–3 + Phase 4. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

What NOT to do

Do not skip the Cognee spike. The substrate decision is the highest-leverage call in the project. If Cognee can't represent the typed ontology or the time model, the spike fails fast and the v1 needs a different foundation. The 1–2 days is the cheapest insurance available.
Do not skip the UDF unit tests. Every time-aware query depends on time_in_window and time_windows_overlap. If they have a bug, every consistency check is wrong. Test first, then trust.
Do not over-invest in prose extraction. Cognee handles the prose path; the structured YAML path is what the Lore Engine adds on top. Structured is exact, prose is fuzzy; high-stakes data goes through structured paths.
Do not try to support 100% of question types in the first build. Ship the 5 patterns from 07-reasoning-harness.md and iterate. The LLM is forgiving of missing tools if the existing ones are reliable.

Phases 4–7: From MVP to production-ready

After the v1 MVP (Phases 0–3) is built and the time model works end-to-end, four phases add the consistency engine, the polymorphic extension model, the reasoning-harness validation, and the polish layer. This is a single ~17-day follow-up that lands everything in the v1.1/v1.2 design docs on top of the Cognee-based v1.

Phase 4: Consistency engine (~6 days)

The 4-category rule system from 04-consistency.md: Contradiction, Anachronism, Orphan, OntologyViolation. Implemented as a Cognee data-pipeline that runs on a schedule and on demand, materializing violation nodes in the same graph the LLM queries.

Implement the 4 rule categories from 04-consistency.md as a Cognee data-pipeline.
Implement the 10 starter :OntologyRule nodes from 05-mcp-tools.md#starter-rules.
Expose the 10 Group 6 consistency tools: get_contradictions, get_anachronisms, get_ontology_violations, get_orphans, flag_for_review, explain_violation, run_consistency_check, latest_run, add_ontology_rule, list_ontology_rules.
Schedule the consistency pipeline to run nightly (Cognee task scheduler).

Verify: ingest two sources that disagree on the same fact; confirm a Contradiction node is created; call get_contradictions(subject=X) and see it. Test anachronism detection against a known historical claim that has a Person participating in an Event outside their lifespan.
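
The anachronism case named in the verification step (a Person participating in an Event outside their lifespan) can be sketched as a range check. For brevity the example uses plain year numbers within a single era; lifespans, event names, and record shapes are all hypothetical, and a real rule would compare canonical time values through the UDFs.

```python
# Hypothetical lifespans (born, died) and participations within one era.
lifespans = {"Aldric": (310, 372), "Isolde": (355, 420)}
participations = [
    {"person": "Aldric", "event": "Siege of Valdorn", "year": 345},
    {"person": "Isolde", "event": "Siege of Valdorn", "year": 345},
]


def find_anachronisms(lifespans, participations):
    """Return participations whose year falls outside the person's lifespan."""
    flagged = []
    for p in participations:
        born, died = lifespans[p["person"]]
        if not born <= p["year"] <= died:
            # Keep the lifespan on the record so the violation explains itself.
            flagged.append({**p, "lifespan": (born, died)})
    return flagged
```

Here Isolde is flagged because the event predates her birth, while Aldric's participation passes.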

Phase 5: TypeTemplate polymorphic extension (~7 days)

The big one. The DomainEntity, Relation, and TypeTemplate labels, the template-watcher service, and the dynamic tool generator. Per 11-extensibility.md and 12-storage-strategy.md, this is the v1.1 extension model that makes new domain types a YAML exercise.

Register the DomainEntity, Relation, and TypeTemplate labels as a Cognee data-model extension.
Build the template-watcher service: watches ./templates/, validates YAML, registers templates.
Build the template-registry: persists template specs alongside the Cognee storage layer.
Implement the dynamic tool generator: a generic handler that runs queries generated from TypeTemplate specs.
Add the list_template_tools MCP tool.
Ship the four example templates from 14-examples.md (thieves-guild mission, war campaign, black-market lot, NPC secret knowledge).
Update the reasoning harness to mention template tools.

Verify: write templates/thieves_guild/mission.yaml, hit POST /admin/templates/reload, see 6 new tools in tools/list, ingest a mission, query it via list_missions, get a coherent answer. No Go code change between "template added" and "tool available."

Phase 6: Reasoning harness + validation (~4 days)

The system prompt from 07-reasoning-harness.md, the test harness, and the validation pass.

Write the system prompt.
Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
Iterate on the system prompt and tool descriptions.
Measure tool-selection accuracy across the 45-tool surface; collapse the long tail if the LLM is tool-confused.

Verify: the LLM, with the system prompt and the tool surface, answers 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.
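
The 80% gate is easy to make mechanical. The sketch below assumes each harness result is a record with an `id` and a boolean `correct` field; the field names and the `score_run` helper are hypothetical, and grading a single answer as correct is a separate problem this does not address.

```python
def score_run(results, threshold=0.80):
    """Summarize a harness run and check it against the pass-rate gate."""
    if not results:
        raise ValueError("no results to score")
    passed = sum(1 for r in results if r["correct"])
    rate = passed / len(results)
    return {
        "passed": passed,
        "total": len(results),
        "pass_rate": rate,
        "meets_gate": rate >= threshold,
        # Failing question ids feed the documented known-limitations list.
        "known_limitations": [r["id"] for r in results if not r["correct"]],
    }


# Hypothetical run: 50 questions, the last 9 answered incorrectly.
run = [{"id": f"q{i:02d}", "correct": i <= 41} for i in range(1, 51)]
summary = score_run(run)
```

Emitting the failing ids alongside the rate keeps the "known limitations" list tied to actual harness output instead of being maintained by hand.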

Phase 7: Polish (open-ended)

UI for the consistency engine (browse contradictions, anachronisms, orphans).
UI for world-builders (YAML editor with autocomplete, validation, preview).
Export: render the world as a wiki, a book, a campaign primer.
Versioning: graph snapshots, time-travel queries.
Cross-setting queries: the engine is per-setting, but a future version supports multiple.

Total scope: v1 + v1.1 on Cognee

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| Total to v1+ext (Phases 0–6) | 33 days | end of phase 6 |

The MVP is the end of Phase 3 (16 days): schema, UDFs, 45 MCP tools, structured ingestion, and lookup/entity_context/state_at, all on Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions.

v1 + extensions is the end of Phase 6 (33 days). It adds the consistency engine, the TypeTemplate polymorphic extension model, and the reasoning-harness validation. At that point the full 18-doc design contract is implemented.

Compared to the original v1+v1.1 plan (43 days on GraphMCP-Example), the Cognee-based plan saves ~10 days by inheriting the storage abstraction, the extraction pipeline, the embedding store, and the agent-native API. The 17 days of v1.1 modularization work collapses into the 7-day Phase 5 (TypeTemplate) plus the 6-day Phase 4 (consistency engine), because Cognee handles the gateway and decomposition story.
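
The day counts quoted in this section can be re-derived from the per-phase estimates, which is a useful check whenever an estimate changes. The sketch below only restates numbers already given in this document; the dictionary keys are shortened labels.

```python
from itertools import accumulate

# Per-phase estimates in days, as listed in the scope table (Polish is open-ended).
phase_days = {
    "0 Cognee spike": 2,
    "1 Ontology on Cognee": 5,
    "2 Time model + UDFs": 4,
    "3 MCP tool layer": 5,
    "4 Consistency engine": 6,
    "5 TypeTemplate extension": 7,
    "6 Reasoning harness": 4,
}

# Running total after each phase.
cumulative = dict(zip(phase_days, accumulate(phase_days.values())))

mvp_days = cumulative["3 MCP tool layer"]        # end of Phase 3
total_days = cumulative["6 Reasoning harness"]   # end of Phase 6
follow_up_days = total_days - mvp_days           # Phases 4-6
saved_vs_original = 43 - total_days              # against the 43-day GraphMCP plan
```

The derived values match the text: 16 days to MVP, a 17-day follow-up, 33 days in total, and 10 days saved against the original plan.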

What to cut from the full plan if you're under time pressure

In strict order:

Phase 7 — Polish. Trivially deferrable.
Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful addition but not a v1.1 blocker.
Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The non-cuttable core is Phases 0–3 + Phase 4. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

The recommended order: spike → MVP → validate → extensions

Phase 0 — Cognee spike (2 days). Stand up Cognee locally. Ingest a 10-document sample world. Run cognee.recall("Who is Aldric?"). If the spike fails, the substrate decision is wrong and the v1 needs a different foundation. The 2-day validation is the cheapest insurance available.
Phases 1–3 — MVP (14 days; 16 cumulative including the spike). Typed ontology, time model, MCP tool layer. The LLM can answer time-bounded questions against a hand-crafted world.
Phase 4 — Consistency engine (6 days). The engine flags its first real contradiction.
Phase 5 — TypeTemplate polymorphic extension (7 days). World-builders add new domain types as YAML. This is the phase that unlocks the "arbitrary new concept" question. Ship this second because it's the highest-leverage single change after the v1 data layer.
Phase 6 — Reasoning harness + validation (4 days). Measure how often the LLM answers correctly, how often it surfaces contradictions, and how often it hallucinates. This is the validation gate. If the numbers aren't good, the design has a bug and the v2 should address it.

Operational docs that ship with the engine

In parallel with the build, the operational story lives in:

docs/21-quickstart.md — the 1-page guide for new world-builders. Update as the smoke-test command changes.
docs/18-eval-policy.md — the threshold and cadence for the 50-question harness. Read by the CI that runs on every PR.
docs/22-cognee-boundary.md — the contract that explains what the Lore Engine owns vs. Cognee. Read by anyone considering a substrate swap.
docs/19-retcon-policy.md and docs/20-multi-setting-policy.md — domain-specific policies. Read by world-builders when they declare a retcon or a cross-setting reference.
docs/cognee-integration.md — the recipe for the substrate-specific code (extraction prompt override, LiteLLM routing). Read by anyone debugging ingestion.
docs/prompts/ and docs/models/ — the prompt and model registries. Updated whenever a prompt or model changes; the harness is the gate.
