09 — Roadmap
A phased build plan. MVP first, modules second. Each phase produces a working artifact that adds value, not a half-built layer that has to wait for the next.
Phase 0: Pre-flight (1 day)
Before any new code:
- Read this design end-to-end. Find the contradictions (they're there; see `10-critique.md`).
- Resolve open questions in `10-critique.md#open-questions`.
- Make the world-builder pick the first YAML schema to formalize. Recommendation: `family_tree.yaml` first, because lineage is the highest-stakes data and the prose path's failure rate is highest there.
Deliverable: sign-off on this design, the YAML schemas for at least 3 source types, and a 1-line summary of the first world to be modeled.
Phase 1: Schema + UDFs (3–5 days)
The substrate. No LLM-facing changes; just the data layer.
- Add all new constraints and indexes from `08-architecture.md#schema-bootstrap` to `neo4j-init.cypher`.
- Implement `time_in_window` and `time_windows_overlap` UDFs in Java (Neo4j user-defined functions). Unit-test against 30+ known cases, including era-tree membership and `current` resolution.
- Add the `:Now` config node to the schema (a single `:Now` node with `current_time: "3rd_age.year_380"` or similar).
- Add 5 starter `:OntologyRule` nodes to the schema (the 5 most common from `05-mcp-tools.md#starter-rules`).
- Document the canonical time format as a comment block in the schema file.
Deliverable: running Neo4j with the extended schema, both UDFs working, a test script that validates UDF behavior against 30+ cases. Cognee's default recall still works for unstructured queries alongside the typed model.
Verify:
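# NEO4J_PASSWORD is an assumed variable name; $PWD would expand to the shell's working directory.
docker exec -i neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" < schema/init.cypher
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352')"
# → true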
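The UDFs themselves are planned in Java, but a small reference model is handy for generating the 30+ unit-test cases. The sketch below is one possible reading of the semantics, not the actual implementation: the era ordering in `ERA_ORDER`, the inclusive bounds, and the `now` default (standing in for the `:Now` node) are all assumptions.

```python
import re

# Assumed era ordering; the real era tree lives in the graph, not in code.
ERA_ORDER = {"1st_age": 1, "2nd_age": 2, "3rd_age": 3}
_TIME_RE = re.compile(r"^(?P<era>[a-z0-9_]+)\.year_(?P<year>\d+)$")


def parse_time(value, now="3rd_age.year_380"):
    """Turn '3rd_age.year_345' into a sortable (era_rank, year) tuple."""
    if value == "current":
        value = now  # stand-in for reading the :Now config node
    match = _TIME_RE.match(value)
    if match is None or match["era"] not in ERA_ORDER:
        raise ValueError(f"not a canonical time: {value!r}")
    return ERA_ORDER[match["era"]], int(match["year"])


def time_in_window(t, start, end, now="3rd_age.year_380"):
    """True when start <= t <= end (inclusive bounds are an assumption)."""
    return parse_time(start, now) <= parse_time(t, now) <= parse_time(end, now)


def time_windows_overlap(a_start, a_end, b_start, b_end, now="3rd_age.year_380"):
    """True when two inclusive windows share at least one instant."""
    return (parse_time(a_start, now) <= parse_time(b_end, now)
            and parse_time(b_start, now) <= parse_time(a_end, now))
```

Comparing `(era_rank, year)` tuples keeps cross-era comparisons correct without converting to a single absolute year, which would require knowing how long each era lasted.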
Phase 2: time_in_window-aware tools (3–5 days)
Add 4 of the 5 time-aware MCP tools from 05-mcp-tools.md#group-2:
- `was_true_at`
- `true_during`
- `entities_present`
- `timeline`
(Defer state_at until Phase 4, when the ontology is fully populated.)
Each tool is a single Go function in mcp-server/main.go with one Cypher query. The existing pattern (session-based, JSON-RPC, tool registration) carries over.
Deliverable: 4 new MCP tools registered, working end-to-end against a small seed dataset. Manual test: ask Claude a time-bounded question, get a sourced answer.
Verify:
curl -X POST http://localhost:9000/mcp -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"RULED","subject":"House Vyr","object":"Valdorn","at_time":"3rd_age.year_345"}}}'
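Hand-writing the JSON-RPC envelope for every manual test gets error-prone. A minimal sketch of a helper that builds the same `tools/call` body as the command above follows; the function name `tools_call` is hypothetical, and it only constructs the payload, leaving the HTTP call to whatever client is in use.

```python
import json


def tools_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` envelope as a compact JSON string."""
    envelope = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # Compact separators reproduce the single-line form used with curl -d.
    return json.dumps(envelope, separators=(",", ":"))


body = tools_call(
    "was_true_at",
    {
        "relation": "RULED",
        "subject": "House Vyr",
        "object": "Valdorn",
        "at_time": "3rd_age.year_345",
    },
)
```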
Phase 3: Structured ingestion (5–7 days)
The new pipeline. This is the most leveraged single phase.
- Build the `services/structured-ingestor/` Go worker.
- Implement parsers for `timeline.yaml`, `family_tree.yaml`, `gazetteer.yaml` (the three highest-stakes types). Defer `bestiary.yaml`, `magic_system.yaml`, `culture.yaml` to Phase 5.
- Add a `POST /ingest/structured` endpoint to the ingestion worker.
- Wire the new Redis stream (`raw.structured`) into docker-compose.
- Write 3 example YAMLs for the seed world.
- Add a CLI wrapper `tea add-source <file>` (optional but nice).
Deliverable: world-builder can write a family_tree.yaml, post it, see the lineage nodes and PARENT_OF edges in Neo4j within 5 seconds. No LLM involved.
Verify:
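# NEO4J_PASSWORD is an assumed variable name; $PWD would expand to the shell's working directory.
curl -X POST http://localhost:8080/ingest/structured -F "file=@test-family-tree.yaml"
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "MATCH (a:Person)-[:PARENT_OF]->(b:Person) RETURN a.name, b.name"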
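To illustrate why the structured path is deterministic, here is a minimal sketch of turning an already-parsed family tree into idempotent `PARENT_OF` writes. The dictionary shape (`people`, `name`, `children`) is a hypothetical stand-in for whatever `family_tree.yaml` ends up looking like, and the person names are illustrative only.

```python
# Hypothetical parsed form of a family_tree.yaml file.
family_tree = {
    "people": [
        {"name": "Aldric", "children": ["Mira", "Tobin"]},
        {"name": "Mira", "children": ["Sefa"]},
    ]
}

# MERGE makes re-ingesting the same file a no-op instead of duplicating nodes.
MERGE_PARENT_OF = (
    "MERGE (a:Person {name: $parent}) "
    "MERGE (b:Person {name: $child}) "
    "MERGE (a)-[:PARENT_OF]->(b)"
)


def parent_of_edges(tree):
    """Return a sorted, de-duplicated list of (parent, child) pairs."""
    edges = set()
    for person in tree.get("people", []):
        for child in person.get("children", []):
            edges.add((person["name"], child))
    return sorted(edges)


def to_statements(tree):
    """Pair the Cypher template with one parameter map per edge."""
    return [
        (MERGE_PARENT_OF, {"parent": parent, "child": child})
        for parent, child in parent_of_edges(tree)
    ]
```

Sorting the edges means the same input always yields the same statement order, which makes ingestion runs easy to diff.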
Phase 4: state_at + entity_context + lookup (3–5 days)
The disambiguation and snapshot tools. These unlock the rest.
- `lookup(query, type?)`: the entry point. Uses string similarity + the `:Entity` hub node.
- `entity_context(name, at_time?)`: one-hop summary.
- `state_at(entity, at_time)`: composes multiple queries.
- Active context tracking in the MCP server (session-scoped).
Deliverable: the LLM can ask "who is X?" and get a one-call answer, and can disambiguate ambiguous references via lookup.
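The string-similarity half of `lookup` can be sketched with the standard library. This is an illustration under assumptions rather than the planned implementation: the in-memory `ENTITIES` list stands in for the `:Entity` hub node, `entity_type` mirrors the optional `type?` argument, and the `min_score` cutoff is arbitrary.

```python
from difflib import SequenceMatcher

# Stand-in for entities reachable through the :Entity hub node.
ENTITIES = [
    {"name": "House Vyr", "type": "Faction"},
    {"name": "Valdorn", "type": "Region"},
    {"name": "Aldric", "type": "Person"},
    {"name": "Aldric the Younger", "type": "Person"},
]


def lookup(query, entity_type=None, limit=3, min_score=0.4):
    """Rank entities by case-insensitive similarity to the query."""
    scored = []
    for entity in ENTITIES:
        if entity_type is not None and entity["type"] != entity_type:
            continue
        score = SequenceMatcher(None, query.lower(), entity["name"].lower()).ratio()
        if score >= min_score:
            scored.append((round(score, 3), entity["name"], entity["type"]))
    # Best score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda row: (-row[0], row[1]))
    return scored[:limit]
```

Pure character similarity can rank an unrelated name that happens to share letters above a true variant, which is one reason the optional type filter matters for disambiguation.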
Phase 5: Remaining structured ingestors (3–5 days)
- `bestiary.yaml` parser
- `magic_system.yaml` parser
- `culture.yaml` parser
- `language.yaml` parser (could merge with culture)
Deliverable: the world can be fully described in structured YAML, and the engine ingests it deterministically.
Phase 6: Lineage & hierarchy tools (2–3 days)
- `list_lineage(person)`
- `list_offspring(person)`
- `ancestors_of(person, generations?)`
- `descendants_of(person, generations?)`
- `location_hierarchy(location, direction?)`
Deliverable: the LLM can navigate bloodlines and geography in a single tool call.
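As a rough model of what `ancestors_of(person, generations?)` has to compute, the sketch below walks `PARENT_OF` edges breadth-first over an in-memory edge list. The edge data and the choice to return generation distances are assumptions made for illustration; the real tool would run against the graph.

```python
# Illustrative (parent, child) pairs standing in for PARENT_OF edges.
PARENT_OF = [
    ("Aldric", "Mira"),
    ("Aldric", "Tobin"),
    ("Mira", "Sefa"),
    ("Joran", "Mira"),
]


def ancestors_of(person, generations=None):
    """Return {ancestor: generation_distance}; None means no depth limit."""
    parents = {}
    for parent, child in PARENT_OF:
        parents.setdefault(child, []).append(parent)

    found = {}
    frontier = [person]
    depth = 0
    while frontier and (generations is None or depth < generations):
        depth += 1
        next_frontier = []
        for name in frontier:
            for parent in parents.get(name, []):
                # The membership check also guards against cyclic bad data.
                if parent not in found and parent != person:
                    found[parent] = depth
                    next_frontier.append(parent)
        frontier = next_frontier
    return found
```

`descendants_of` is the same walk with the edge direction reversed.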
Phase 7: Consistency engine (5–7 days)
- `services/consistency-runner/` Go worker.
- Implement all 4 rule categories from `04-consistency.md`.
- Implement the 10 starter `:OntologyRule` nodes.
- Implement the `services/consistency-monitor/` HTTP service.
- Expose the `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, and `latest_run` MCP tools.
- Schedule the runner at 03:00 daily (cron in docker-compose).
Deliverable: the engine flags its first real contradiction. This is the most leveraged phase after structured ingestion.
Verify: ingest two sources that disagree on the same fact; confirm a Contradiction node is created; call get_contradictions(subject=X) and see it.
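The verification scenario above can be modeled in a few lines. This sketch assumes a deliberately narrow definition of contradiction (the same subject and relation with different objects across sources) and ignores time windows, which a real check would need so that facts valid at different times are not flagged. The claim fields, source names, and the `BORN_IN` relation are hypothetical.

```python
from collections import defaultdict


def find_contradictions(claims):
    """Group claims by (subject, relation) and report conflicting objects."""
    by_fact = defaultdict(dict)  # (subject, relation) -> {object: [sources]}
    for claim in claims:
        key = (claim["subject"], claim["relation"])
        by_fact[key].setdefault(claim["object"], []).append(claim["source"])

    contradictions = []
    for (subject, relation), objects in sorted(by_fact.items()):
        if len(objects) > 1:
            contradictions.append({
                "subject": subject,
                "relation": relation,
                "values": {obj: sorted(srcs) for obj, srcs in objects.items()},
            })
    return contradictions


claims = [
    {"subject": "Aldric", "relation": "BORN_IN", "object": "Valdorn", "source": "chronicle_a"},
    {"subject": "Aldric", "relation": "BORN_IN", "object": "Karth", "source": "chronicle_b"},
    {"subject": "Mira", "relation": "BORN_IN", "object": "Valdorn", "source": "chronicle_a"},
]
```

Keeping the sources attached to each conflicting value is what lets a tool like `get_contradictions` return a sourced answer rather than a bare flag.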
Phase 8: Lore extension tools (2–3 days)
- `expand_context(entity, hops, relation_filter, min_confidence, limit)`
- `event_chain(event, depth)`
- `significance_of(entity)`
- `lore_about(entity, type?, limit)`
- `cite(claim)`
Deliverable: the LLM can answer "how are A and B connected?" and "what do the chronicles say about X?" in single calls.
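One way to read the `expand_context(entity, hops, relation_filter, min_confidence, limit)` signature is as a bounded neighborhood walk with edge filters. The sketch below is an assumption-laden illustration: edges are treated as undirected, the `MEMBER_OF` and `VISITED` relations and all confidence values are made up, and the default arguments are arbitrary.

```python
# Illustrative (source, relation, target, confidence) edges.
EDGES = [
    ("House Vyr", "RULED", "Valdorn", 0.9),
    ("Aldric", "MEMBER_OF", "House Vyr", 0.8),
    ("Aldric", "PARENT_OF", "Mira", 0.95),
    ("Mira", "VISITED", "Valdorn", 0.3),
]


def expand_context(entity, hops=1, relation_filter=None, min_confidence=0.0, limit=50):
    """Collect edges within `hops` steps of `entity` that pass the filters."""
    seen = {entity}
    frontier = {entity}
    results = []
    for _ in range(hops):
        reached = set()
        for src, rel, dst, conf in EDGES:
            if conf < min_confidence:
                continue
            if relation_filter is not None and rel not in relation_filter:
                continue
            # Edges are treated as undirected for context gathering (assumption).
            if src in frontier or dst in frontier:
                edge = (src, rel, dst, conf)
                if edge not in results:
                    results.append(edge)
                reached.update({src, dst} - seen)
        seen |= reached
        frontier = reached
    return results[:limit]
```

Filtering before traversal, rather than after, means a low-confidence edge cannot pull unrelated entities into the context.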
Phase 9: Generation tools (3–5 days)
- `summarize_chain(entity, depth, style)`: uses the LiteLLM proxy.
- `narrate_arc(start_event, end_event, perspective?)`: composes multiple queries.
- World-builder tools (`add_entity`, `add_relation`, `add_lore_source`, etc.).
Deliverable: the LLM can produce grounded narrative text on demand.
Phase 10: Reasoning harness + integration test (3–5 days)
- Write the system prompt from `07-reasoning-harness.md`.
- Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- Iterate on the system prompt and tool descriptions.
Deliverable: the LLM, with the system prompt and the tool surface, can answer 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.
Phase 11: Polish (open-ended)
- UI for the consistency engine (browse contradictions, anachronisms, orphans).
- UI for world-builders (YAML editor with autocomplete, validation, preview).
- Export: render the world as a wiki, a book, a campaign primer.
- Versioning: graph snapshots, time-travel queries.
- Cross-world queries: the engine is per-world, but a future version supports multiple.
Total scope estimate
| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer (Cognee extension) | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| MVP (Phases 0–3) | 16 days | end of phase 3 |
The MVP is end of phase 3. That's: validated Cognee substrate, the typed Lore Engine ontology, the time model + UDFs, and the 45 MCP tools exposed through Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions. The consistency engine and the TypeTemplate polymorphic extension are the v1.1 follow-ups; per the v1.1 plan below, they land in Phases 4 and 5 of the unified roadmap.
What to cut if you're under time pressure
In strict order:
- Phase 7 — Polish. Trivially deferrable.
- Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
- Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
- Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.
The non-cuttable core is Phases 0–3 + Phase 4. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.
What NOT to do
- Do not skip the Cognee spike. The substrate decision is the highest-leverage call in the project. If Cognee can't represent the typed ontology or the time model, the spike fails fast and the v1 needs a different foundation. The 1–2 days is the cheapest insurance available.
- Do not skip the UDF unit tests. Every time-aware query depends on `time_in_window` and `time_windows_overlap`. If they have a bug, every consistency check is wrong. Test first, then trust.
- Do not over-invest in prose extraction. Cognee handles the prose path; the structured YAML path is what the Lore Engine adds on top. Structured is exact, prose is fuzzy; high-stakes data goes through structured paths.
- Do not try to support 100% of question types in the first build. Ship the 5 patterns from `07-reasoning-harness.md` and iterate. The LLM is forgiving of missing tools if the existing ones are reliable.
Phases 4–7: From MVP to production-ready
After the v1 MVP (Phases 0–3) is built and the time model works end-to-end, four phases add the consistency engine, the polymorphic extension model, the reasoning-harness validation, and the polish layer. This is a single ~17-day follow-up that lands everything in the v1.1/v1.2 design docs on top of the Cognee-based v1.
Phase 4: Consistency engine (~6 days)
The 4-category rule system from 04-consistency.md: Contradiction, Anachronism, Orphan, OntologyViolation. Implemented as a Cognee data-pipeline that runs on a schedule and on demand, materializing violation nodes in the same graph the LLM queries.
- Implement the 4 rule categories from `04-consistency.md` as a Cognee data-pipeline.
- Implement the 10 starter `:OntologyRule` nodes from `05-mcp-tools.md#starter-rules`.
- Expose the 10 Group 6 consistency tools: `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run`, `add_ontology_rule`, `list_ontology_rules`.
- Schedule the consistency pipeline to run nightly (Cognee task scheduler).
Verify: ingest two sources that disagree on the same fact; confirm a Contradiction node is created; call get_contradictions(subject=X) and see it. Test anachronism detection against a known historical claim that has a Person participating in an Event outside their lifespan.
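The anachronism test described above (a Person participating in an Event outside their lifespan) reduces to a range check. The sketch below uses plain integer years as a stand-in for the canonical time format, and every name and date in it is illustrative rather than taken from a real world file.

```python
def find_anachronisms(people, events, participations):
    """Flag participations whose event year falls outside the person's lifespan."""
    flagged = []
    for person_name, event_name in participations:
        born, died = people[person_name]
        year = events[event_name]
        # died is None for people who are still alive at the current time.
        if year < born or (died is not None and year > died):
            flagged.append((person_name, event_name, year))
    return flagged


people = {"Aldric": (310, 362), "Mira": (335, None)}
events = {"Siege of Valdorn": 345, "Treaty of Karth": 371}
participations = [
    ("Aldric", "Siege of Valdorn"),
    ("Aldric", "Treaty of Karth"),  # nine years after Aldric's death
    ("Mira", "Treaty of Karth"),
]
```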
Phase 5: TypeTemplate polymorphic extension (~7 days)
The big one. The DomainEntity, Relation, and TypeTemplate labels, the template-watcher service, and the dynamic tool generator. Per 11-extensibility.md and 12-storage-strategy.md, this is the v1.1 extension model that makes new domain types a YAML exercise.
- Register the `DomainEntity`, `Relation`, and `TypeTemplate` labels as a Cognee data-model extension.
- Build the template-watcher service: watches `./templates/`, validates YAML, registers templates.
- Build the template-registry: persists template specs alongside the Cognee storage layer.
- Implement the dynamic tool generator: a generic handler that runs queries generated from `TypeTemplate` specs.
- Add the `list_template_tools` MCP tool.
- Ship the four example templates from `14-examples.md` (thieves-guild mission, war campaign, black-market lot, NPC secret knowledge).
- Update the reasoning harness to mention template tools.
Verify: write templates/thieves_guild/mission.yaml, hit POST /admin/templates/reload, see 6 new tools in tools/list, ingest a mission, query it via list_missions, get a coherent answer. No Go code change between "template added" and "tool available."
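The "no code change between template added and tool available" property comes from generating handlers out of data. A minimal sketch of that idea follows; the template keys (`type`, `plural`, `fields`), the in-memory `store`, and the entity names are all hypothetical, and only the `list_missions` tool name comes from the verification step above.

```python
def make_list_tool(template, store):
    """Return (tool_name, handler) for a `list_<plural>` tool built from a template."""
    type_name = template["type"]
    tool_name = f"list_{template['plural']}"
    allowed = set(template["fields"])

    def handler(**filters):
        # Reject filters the template does not declare, instead of ignoring them.
        unknown = set(filters) - allowed
        if unknown:
            raise ValueError(f"unknown filter(s) for {type_name}: {sorted(unknown)}")
        return [
            entity for entity in store
            if entity["type"] == type_name
            and all(entity.get(key) == value for key, value in filters.items())
        ]

    return tool_name, handler


# Hypothetical parsed form of templates/thieves_guild/mission.yaml.
template = {"type": "mission", "plural": "missions", "fields": ["name", "status"]}
store = [
    {"type": "mission", "name": "The Gilded Key", "status": "open"},
    {"type": "mission", "name": "Ash Ledger", "status": "closed"},
    {"type": "lot", "name": "Crate 12", "status": "open"},
]
tool_name, list_missions = make_list_tool(template, store)
```

Validating filters against the template's declared fields gives the LLM an explicit error when it guesses at a parameter, which is usually easier to recover from than an empty result.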
Phase 6: Reasoning harness + validation (~4 days)
The system prompt from 07-reasoning-harness.md, the test harness, and the validation pass.
- Write the system prompt.
- Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- Iterate on the system prompt and tool descriptions.
- Measure tool-selection accuracy across the 45-tool surface; collapse the long tail if the LLM is tool-confused.
Verify: the LLM, with the system prompt and the tool surface, answers 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.
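The 80% gate and the tool-selection measurement can share one scoring pass. The sketch below assumes each harness case records whether the answer was judged correct plus the expected and actual tool sequences; the field names and the exact-match rule for tool sequences are assumptions, not part of the design docs.

```python
def score_run(cases, threshold=0.80):
    """Summarize a harness run: answer accuracy, tool accuracy, pass/fail."""
    total = len(cases)
    if total == 0:
        raise ValueError("no test cases to score")
    answers_ok = sum(1 for case in cases if case["answer_ok"])
    # Exact sequence match is strict; a looser rule could compare tool sets.
    tools_ok = sum(1 for case in cases if case["expected_tools"] == case["actual_tools"])
    answer_accuracy = answers_ok / total
    return {
        "answer_accuracy": answer_accuracy,
        "tool_accuracy": tools_ok / total,
        "passed": answer_accuracy >= threshold,
    }


CASES = [
    {"answer_ok": True, "expected_tools": ["lookup", "was_true_at"], "actual_tools": ["lookup", "was_true_at"]},
    {"answer_ok": True, "expected_tools": ["lookup", "state_at"], "actual_tools": ["lookup", "state_at"]},
    {"answer_ok": True, "expected_tools": ["timeline"], "actual_tools": ["lookup", "timeline"]},
    {"answer_ok": True, "expected_tools": ["get_contradictions"], "actual_tools": ["get_contradictions"]},
    {"answer_ok": False, "expected_tools": ["ancestors_of"], "actual_tools": ["list_lineage"]},
]
summary = score_run(CASES)
```

Tracking tool accuracy separately from answer accuracy shows whether a correct answer was reached by the intended route or by luck.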
Phase 7: Polish (open-ended)
- UI for the consistency engine (browse contradictions, anachronisms, orphans).
- UI for world-builders (YAML editor with autocomplete, validation, preview).
- Export: render the world as a wiki, a book, a campaign primer.
- Versioning: graph snapshots, time-travel queries.
- Cross-setting queries: the engine is per-setting, but a future version supports multiple.
Total scope: v1 + v1.1 on Cognee
| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| Total to v1+ext (Phases 0–6) | 33 days | end of phase 6 |
The MVP is end of phase 3 (16 days). Schema, UDFs, 45 MCP tools, structured ingestion, lookup/entity_context/state_at, all on Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions.
The v1 + extensions is end of phase 6 (33 days). Adds the consistency engine, the TypeTemplate polymorphic extension model, and the reasoning-harness validation. The full 18-doc design contract is implemented.
Compared to the original v1+v1.1 plan (43 days on GraphMCP-Example), the Cognee-based plan saves ~10 days by inheriting the storage abstraction, the extraction pipeline, the embedding store, and the agent-native API. The 17 days of v1.1 modularization work collapses into the 7-day Phase 5 (TypeTemplate) plus the 6-day Phase 4 (consistency engine), because Cognee handles the gateway and decomposition story.
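The figures in the table and the comparison above can be re-derived from the per-phase estimates. The short check below uses only numbers already stated in this roadmap, including the 43-day baseline.

```python
from itertools import accumulate

# Per-phase day estimates, Phases 0 through 6, from the scope table.
PHASE_DAYS = [2, 5, 4, 5, 6, 7, 4]

cumulative = list(accumulate(PHASE_DAYS))  # [2, 7, 11, 16, 22, 29, 33]
mvp_days = cumulative[3]                   # end of Phase 3: 16
total_days = cumulative[6]                 # end of Phase 6: 33
follow_up_days = total_days - mvp_days     # Phases 4 to 6: 17
days_saved = 43 - total_days               # versus the original plan: 10
```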
What to cut from the full plan if you're under time pressure
In strict order:
- Phase 7 — Polish. Trivially deferrable.
- Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
- Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful addition but not a v1.1 blocker.
- Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.
The non-cuttable core is Phases 0–3 + Phase 4. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.
The recommended order: spike → MVP → validate → extensions
- Phase 0: Cognee spike (2 days). Stand up Cognee locally. Ingest a 10-document sample world. Run `cognee.recall("Who is Aldric?")`. If the spike fails, the substrate decision is wrong and the v1 needs a different foundation. The 2-day validation is the cheapest insurance available.
- Phases 1–3: MVP (14 days of work; day 16 cumulative, counting the spike). Typed ontology, time model, MCP tool layer. The LLM can answer time-bounded questions against a hand-crafted world.
- Phase 4: Consistency engine (6 days). The engine flags its first real contradiction.
- Phase 5: TypeTemplate polymorphic extension (7 days). World-builders add new domain types as YAML. This is the phase that unlocks the "arbitrary new concept" question. Ship this second because it's the highest-leverage single change after the v1 data layer.
- Phase 6: Reasoning harness + validation (4 days). Measure: how often does the LLM answer correctly? How often does it surface contradictions? How often does it hallucinate? This is the validation gate. If the numbers aren't good, the design has a bug and the v2 should address it.
Operational docs that ship with the engine
In parallel with the build, the operational story lives in:
- `docs/21-quickstart.md`: the 1-page guide for new world-builders. Update as the smoke-test command changes.
- `docs/18-eval-policy.md`: the threshold and cadence for the 50-question harness. Read by the CI that runs on every PR.
- `docs/22-cognee-boundary.md`: the contract that explains what the Lore Engine owns vs. Cognee. Read by anyone considering a substrate swap.
- `docs/19-retcon-policy.md` and `docs/20-multi-setting-policy.md`: domain-specific policies. Read by world-builders when they declare a retcon or a cross-setting reference.
- `docs/cognee-integration.md`: the recipe for the substrate-specific code (extraction prompt override, LiteLLM routing). Read by anyone debugging ingestion.
- `docs/prompts/` and `docs/models/`: the prompt and model registries. Updated whenever a prompt or model changes; the harness is the gate.