lore-engine/docs/09-roadmap.md
Kaysser Kayyali 50d8deab55 docs: reframe consistency engine as from-scratch on Cognee; add CONTEXT.md glossary
Research into Cognee's actual API (docs.cognee.ai) confirmed the
docs made a load-bearing false claim: that the Lore Engine
'inherits and generalizes' a Contradiction node, get_contradictions
tool, 8 inherited MCP tools, and neo4j-init.cypher from the substrate.

Cognee ships NONE of that. Cognee provides DataPoint + custom graph
models + remember/recall + a Cypher/APOC graph-rule pattern. So:
  - Slice 2 (consistency) is a from-scratch BUILD, not a generalization
  - Categories A/B/D (Contradiction/Anachronism/Orphan) are ours
  - Category C (declarative OntologyRule) rides Cognee's Cypher pattern
  - '8 inherited tools' -> '8 base tools' (one wraps cognee.recall)
  - '7 inherited labels' -> '7 base types' (Lore Engine originals on DataPoint)

Fixed across 04-consistency, 01-ontology, 05-mcp-tools, 00-overview,
09-roadmap, 15-related-work, 16-comparison. Historical GraphMCP
comparisons left intact.

Added CONTEXT.md (glossary) — the grill-with-docs skill mandates it
and 6 ADRs' worth of resolved terms (Lineage/Faction/Region/Plane/
LoreSource/extraction+source confidence/disputed edge/retcon/Setting/
ConsistencyRun/Cognee) had no single home. New readers no longer mine
ADR prose for the vocabulary.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-17 22:36:07 -04:00

# 09 — Roadmap
A phased build plan. **MVP first, modules second.** Each phase produces a working artifact that adds value, not a half-built layer that has to wait for the next.
## Phase 0: Pre-flight (1 day)
Before any new code:
- [ ] Read this design end-to-end. Find the contradictions (they're there; see `10-critique.md`).
- [ ] Resolve open questions in `10-critique.md#open-questions`.
- [ ] Make the world-builder pick the first YAML schema to formalize. Recommendation: **`family_tree.yaml` first**, because lineage is the highest-stakes data and the prose path's failure rate is highest there.
**Deliverable:** sign-off on this design, the YAML schemas for at least 3 source types, and a 1-line summary of the first world to be modeled.
## Phase 1: Schema + UDFs (3-5 days)
The substrate. No LLM-facing changes; just the data layer.
- [ ] Add all new constraints and indexes from `08-architecture.md#schema-bootstrap` to `neo4j-init.cypher`.
- [ ] Implement `time_in_window` and `time_windows_overlap` UDFs in Java (Neo4j user-defined function). Unit-test against 30+ known cases including era-tree membership and `current` resolution.
- [ ] Add the `:Now` config node to the schema (a single `:Now` node with `current_time: "3rd_age.year_380"` or similar).
- [ ] Add 5 starter `:OntologyRule` nodes to the schema (the 5 most common from `05-mcp-tools.md#starter-rules`).
- [ ] Document the canonical time format as a comment block in the schema file.
**Deliverable:** running Neo4j with the extended schema, both UDFs working, a test script that validates UDF behavior against 30+ cases. Cognee's default `recall` still works for unstructured queries alongside the typed model.
**Verify:**
```bash
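# NEO4J_PASSWORD (assumed variable name) holds the Neo4j password.
# -i keeps stdin open so the redirected schema file reaches cypher-shell.
docker exec -i neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" < schema/init.cypher
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352')"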
# → true
```
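When writing the 30+ unit-test cases, it may help to have a plain reference model to compare the Java UDFs against. The following is a minimal Python sketch, not the UDF itself. It assumes the canonical format is `<era>.year_<n>` as in the example above, that window bounds are inclusive, and that `ERA_ORDER` (a hypothetical stand-in for the era tree) gives the era ordering. It does not model `current` resolution.

```python
import re

# Hypothetical era ordering; the real one would come from the era tree.
ERA_ORDER = ["1st_age", "2nd_age", "3rd_age"]

_TIME_RE = re.compile(r"^(?P<era>[a-z0-9_]+)\.year_(?P<year>\d+)$")


def parse_time(value: str) -> tuple[int, int]:
    """Turn '3rd_age.year_345' into a sortable (era_index, year) pair."""
    match = _TIME_RE.match(value)
    if match is None:
        raise ValueError(f"not a canonical time: {value!r}")
    return ERA_ORDER.index(match["era"]), int(match["year"])


def time_in_window(t: str, start: str, end: str) -> bool:
    """True when start <= t <= end (inclusive bounds are an assumption)."""
    return parse_time(start) <= parse_time(t) <= parse_time(end)


def time_windows_overlap(a_start: str, a_end: str, b_start: str, b_end: str) -> bool:
    """Two closed windows overlap when each one starts before the other ends."""
    return (parse_time(a_start) <= parse_time(b_end)
            and parse_time(b_start) <= parse_time(a_end))
```

Comparing tuples keeps cross-era comparisons correct without flattening eras into a single year count.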
## Phase 2: `time_in_window`-aware tools (3-5 days)
Add 4 of the 5 time-aware MCP tools from `05-mcp-tools.md#group-2`:
- [ ] `was_true_at`
- [ ] `true_during`
- [ ] `entities_present`
- [ ] `timeline`
(Defer `state_at` until Phase 4, when the ontology is fully populated.)
Each tool is a single Go function in `mcp-server/main.go` with one Cypher query. The existing pattern (session-based, JSON-RPC, tool registration) carries over.
**Deliverable:** 4 new MCP tools registered, working end-to-end against a small seed dataset. Manual test: ask Claude a time-bounded question, get a sourced answer.
**Verify:**
```bash
curl -X POST http://localhost:9000/mcp -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"RULED","subject":"House Vyr","object":"Valdorn","at_time":"3rd_age.year_345"}}}'
```
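For scripted checks, the same request body can be built programmatically instead of hand-writing JSON. The sketch below only constructs the JSON-RPC message shown in the `curl` call above; `build_tool_call` is a hypothetical helper name, and sending the request (including any session handling the server requires) is left out.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Same arguments as the curl example above.
payload = build_tool_call("was_true_at", {
    "relation": "RULED",
    "subject": "House Vyr",
    "object": "Valdorn",
    "at_time": "3rd_age.year_345",
})
```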
## Phase 3: Structured ingestion (5-7 days)
The new pipeline. This is the most leveraged single phase.
- [ ] Build `services/structured-ingestor/` Go worker.
- [ ] Implement parsers for `timeline.yaml`, `family_tree.yaml`, `gazetteer.yaml` (the three highest-stakes types). Defer `bestiary.yaml`, `magic_system.yaml`, `culture.yaml` to Phase 5.
- [ ] Add a `POST /ingest/structured` endpoint to the ingestion worker.
- [ ] Wire the new Redis stream (`raw.structured`) into docker-compose.
- [ ] Write 3 example YAMLs for the seed world.
- [ ] Add a CLI wrapper `tea add-source <file>` (optional but nice).
**Deliverable:** world-builder can write a `family_tree.yaml`, post it, see the lineage nodes and `PARENT_OF` edges in Neo4j within 5 seconds. No LLM involved.
**Verify:**
```bash
curl -X POST http://localhost:8080/ingest/structured -F "file=@test-family-tree.yaml"
# NEO4J_PASSWORD (assumed variable name) holds the Neo4j password.
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "MATCH (a:Person)-[:PARENT_OF]->(b:Person) RETURN a.name, b.name"
```
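The deterministic step at the heart of this phase is turning a parsed file into edges with no LLM involved. The following Python sketch illustrates the idea only: the `people`/`children` layout is a hypothetical schema (the real `family_tree.yaml` format is decided in Phase 0), the names are placeholders, and the dict stands in for the output of a YAML loader.

```python
# Hypothetical shape of a parsed family_tree.yaml (after a YAML loader has run).
family_tree = {
    "people": [
        {"name": "Aldric", "children": ["Berin", "Cael"]},
        {"name": "Berin", "children": ["Dara"]},
        {"name": "Cael"},
        {"name": "Dara"},
    ]
}


def parent_of_edges(tree: dict) -> list[tuple[str, str]]:
    """Return (parent, child) pairs, rejecting children that are not declared people."""
    known = {person["name"] for person in tree["people"]}
    edges = []
    for person in tree["people"]:
        for child in person.get("children", []):
            if child not in known:
                raise ValueError(f"{person['name']} lists unknown child {child!r}")
            edges.append((person["name"], child))
    return edges
```

Failing loudly on an undeclared child is one possible design choice; it keeps typos in the YAML from silently creating stray nodes.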
## Phase 4: `state_at` + `entity_context` + `lookup` (3-5 days)
The disambiguation and snapshot tools. These unlock the rest.
- [ ] `lookup(query, type?)` — the entry point. Uses string similarity + the `:Entity` hub node.
- [ ] `entity_context(name, at_time?)` — one-hop summary.
- [ ] `state_at(entity, at_time)` — composes multiple queries.
- [ ] Active context tracking in the MCP server (session-scoped).
**Deliverable:** the LLM can ask "who is X?" and get a one-call answer, and can disambiguate ambiguous references via `lookup`.
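As an illustration of the string-similarity half of `lookup`, here is a small Python sketch using the standard library's `difflib`. The `ENTITIES` list is a hypothetical stand-in for what the `:Entity` hub would return, the 0.5 threshold is arbitrary, and the parameter is named `type_filter` rather than `type` only to avoid shadowing the Python builtin.

```python
from difflib import SequenceMatcher

# Hypothetical entity index: (name, type) pairs that would come from the :Entity hub.
ENTITIES = [
    ("House Vyr", "Faction"),
    ("Valdorn", "Region"),
    ("Aldric", "Person"),
    ("Aldric the Younger", "Person"),
]


def lookup(query, type_filter=None, min_score=0.5):
    """Rank entities by case-insensitive string similarity to the query."""
    scored = []
    for name, entity_type in ENTITIES:
        if type_filter is not None and entity_type != type_filter:
            continue
        score = SequenceMatcher(None, query.lower(), name.lower()).ratio()
        if score >= min_score:
            scored.append((score, name, entity_type))
    # Highest similarity first; several hits signal an ambiguous reference.
    return sorted(scored, reverse=True)
```

Returning every candidate above the threshold, rather than only the best one, is what lets the caller notice that "Aldric" is ambiguous and ask for clarification.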
## Phase 5: Remaining structured ingestors (3-5 days)
- [ ] `bestiary.yaml` parser
- [ ] `magic_system.yaml` parser
- [ ] `culture.yaml` parser
- [ ] `language.yaml` parser (could merge with culture)
**Deliverable:** the world can be fully described in structured YAML, and the engine ingests it deterministically.
## Phase 6: Lineage & hierarchy tools (2-3 days)
- [ ] `list_lineage(person)`
- [ ] `list_offspring(person)`
- [ ] `ancestors_of(person, generations?)`
- [ ] `descendants_of(person, generations?)`
- [ ] `location_hierarchy(location, direction?)`
**Deliverable:** the LLM can navigate bloodlines and geography in a single tool call.
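The traversal behind `ancestors_of(person, generations?)` is a bounded breadth-first walk up the `PARENT_OF` edges. The Python sketch below shows that logic over an in-memory map; in the engine it would presumably be a single Cypher query. The `PARENTS` data and the return shape are assumptions made for illustration.

```python
# Hypothetical child -> parents map, as PARENT_OF edges would provide it.
PARENTS = {
    "Dara": ["Berin"],
    "Berin": ["Aldric", "Mira"],
    "Aldric": ["Osric"],
}


def ancestors_of(person, generations=None):
    """Return {ancestor: generation_distance}, walking PARENT_OF edges upward."""
    found = {}
    frontier = [person]
    depth = 0
    while frontier and (generations is None or depth < generations):
        depth += 1
        next_frontier = []
        for current in frontier:
            for parent in PARENTS.get(current, []):
                if parent not in found:  # guards against cycles in bad data
                    found[parent] = depth
                    next_frontier.append(parent)
        frontier = next_frontier
    return found
```

`descendants_of` would be the same walk over the inverted map.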
## Phase 7: Consistency engine (5-7 days)
- [ ] `services/consistency-runner/` Go worker.
- [ ] Implement all 4 rule categories from `04-consistency.md`.
- [ ] Implement the 10 starter `:OntologyRule` nodes.
- [ ] Implement `services/consistency-monitor/` HTTP service.
- [ ] Expose `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run` MCP tools.
- [ ] Schedule the runner at 03:00 daily (cron in docker-compose).
**Deliverable:** the engine flags its first real contradiction. This is the most leveraged phase after structured ingestion.
**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it.
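The verify step above boils down to grouping claims about the same fact and checking whether sources disagree. The following Python sketch is one way to express that check; it assumes each relation is single-valued at a given time, and the claim tuples, relation names, and source names are hypothetical. The real rule set lives in `04-consistency.md`.

```python
from collections import defaultdict

# Hypothetical claims: (subject, relation, at_time, object, source).
CLAIMS = [
    ("Valdorn", "RULED_BY", "3rd_age.year_345", "House Vyr", "chronicle_a"),
    ("Valdorn", "RULED_BY", "3rd_age.year_345", "House Morn", "chronicle_b"),
    ("Aldric", "BORN_IN", "3rd_age.year_310", "Valdorn", "chronicle_a"),
]


def find_contradictions(claims):
    """Group claims by (subject, relation, time); flag groups with more than one object."""
    objects_by_fact = defaultdict(dict)
    for subject, relation, at_time, obj, source in claims:
        objects_by_fact[(subject, relation, at_time)].setdefault(obj, []).append(source)
    return [
        {"subject": key[0], "relation": key[1], "at_time": key[2], "claims": by_object}
        for key, by_object in objects_by_fact.items()
        if len(by_object) > 1
    ]
```

Keeping the sources attached to each competing object is what would let a tool like `explain_violation` cite both sides.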
## Phase 8: Lore extension tools (2-3 days)
- [ ] `expand_context(entity, hops, relation_filter, min_confidence, limit)`
- [ ] `event_chain(event, depth)`
- [ ] `significance_of(entity)`
- [ ] `lore_about(entity, type?, limit)`
- [ ] `cite(claim)`
**Deliverable:** the LLM can answer "how are A and B connected?" and "what do the chronicles say about X?" in single calls.
## Phase 9: Generation tools (3-5 days)
- [ ] `summarize_chain(entity, depth, style)` — uses LiteLLM proxy.
- [ ] `narrate_arc(start_event, end_event, perspective?)` — composes multiple queries.
- [ ] World-builder tools (`add_entity`, `add_relation`, `add_lore_source`, etc.).
**Deliverable:** the LLM can produce grounded narrative text on demand.
## Phase 10: Reasoning harness + integration test (3-5 days)
- [ ] Write the system prompt from `07-reasoning-harness.md`.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.
**Deliverable:** the LLM, with the system prompt and the tool surface, can answer 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.
## Phase 11: Polish (open-ended)
- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-world queries: the engine is per-world, but a future version supports multiple.
---
## Total scope estimate
| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer (Cognee extension) | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **MVP (Phases 0-3)** | **16 days** | **end of phase 3** |
**The MVP is end of phase 3.** That's: validated Cognee substrate, the typed Lore Engine ontology, the time model + UDFs, and the 45 MCP tools exposed through Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions. The consistency engine and the TypeTemplate polymorphic extension are the v1.1 follow-ups; per the v1.1 plan below, they land in Phases 4 and 5 of the unified roadmap.
## What to cut if you're under time pressure
In strict order:
1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.
The **non-cuttable core is Phases 0-3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.
## What NOT to do
- **Do not skip the Cognee spike.** The substrate decision is the highest-leverage call in the project. If Cognee can't represent the typed ontology or the time model, the spike fails fast and the v1 needs a different foundation. Those 1-2 days are the cheapest insurance available.
- **Do not skip the UDF unit tests.** Every time-aware query depends on `time_in_window` and `time_windows_overlap`. If they have a bug, every consistency check is wrong. Test first, then trust.
- **Do not over-invest in prose extraction.** Cognee handles the prose path; the structured YAML path is what the Lore Engine adds on top. Structured is exact, prose is fuzzy; high-stakes data goes through structured paths.
- **Do not try to support 100% of question types in the first build.** Ship the 5 patterns from `07-reasoning-harness.md` and iterate. The LLM is forgiving of missing tools if the existing ones are reliable.
---
## Phases 4-7: From MVP to production-ready
After the v1 MVP (Phases 0-3) is built and the time model works end-to-end, four phases add the consistency engine, the polymorphic extension model, the reasoning-harness validation, and the polish layer. This is a single ~17-day follow-up that lands everything in the v1.1/v1.2 design docs on top of the Cognee-based v1.
### Phase 4: Consistency engine (~6 days)
The 4-category rule system from `04-consistency.md`: Contradiction, Anachronism, Orphan, OntologyViolation. Implemented as a Cognee data-pipeline that runs on a schedule and on demand, materializing violation nodes in the same graph the LLM queries.
- [ ] Implement the 4 rule categories from `04-consistency.md` as a Cognee data-pipeline.
- [ ] Implement the 10 starter `:OntologyRule` nodes from `05-mcp-tools.md#starter-rules`.
- [ ] Expose the 10 Group 6 consistency tools: `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run`, `add_ontology_rule`, `list_ontology_rules`.
- [ ] Schedule the consistency pipeline to run nightly (Cognee task scheduler).
**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it. Test anachronism detection against a known historical claim that has a Person participating in an Event outside their lifespan.
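The anachronism test case described above (a Person participating in an Event outside their lifespan) reduces to a window check per participation. The Python sketch below shows that check in isolation; it assumes every timestamp is in the same era so that only the year matters, and the people, events, and dates are hypothetical seed data.

```python
def year_of(value: str) -> int:
    """Extract the year from '<era>.year_<n>'; assumes all values share one era."""
    era, _, year = value.partition(".year_")
    if not era or not year.isdigit():
        raise ValueError(f"not a canonical time: {value!r}")
    return int(year)


def find_anachronisms(people, participations):
    """Flag (person, event) pairs where the event falls outside the person's lifespan."""
    flagged = []
    for person, event, event_time in participations:
        born, died = people[person]
        if not year_of(born) <= year_of(event_time) <= year_of(died):
            flagged.append((person, event))
    return flagged


# Hypothetical seed data for illustration.
PEOPLE = {"Aldric": ("3rd_age.year_310", "3rd_age.year_372")}
PARTICIPATIONS = [
    ("Aldric", "Siege of Valdorn", "3rd_age.year_345"),
    ("Aldric", "Treaty of Ash", "3rd_age.year_380"),
]
```

In the pipeline itself the comparison would go through the `time_in_window` UDF so that cross-era lifespans are handled correctly.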
### Phase 5: TypeTemplate polymorphic extension (~7 days)
The big one. The `DomainEntity`, `Relation`, and `TypeTemplate` labels, the template-watcher service, and the dynamic tool generator. Per `11-extensibility.md` and `12-storage-strategy.md`, this is the v1.1 extension model that makes new domain types a YAML exercise.
- [ ] Register the `DomainEntity`, `Relation`, and `TypeTemplate` labels as a Cognee data-model extension.
- [ ] Build the template-watcher service: watches `./templates/`, validates YAML, registers templates.
- [ ] Build the template-registry: persists template specs alongside the Cognee storage layer.
- [ ] Implement the dynamic tool generator: a generic handler that runs queries generated from `TypeTemplate` specs.
- [ ] Add the `list_template_tools` MCP tool.
- [ ] Ship the four example templates from `14-examples.md` (thieves-guild mission, war campaign, black-market lot, NPC secret knowledge).
- [ ] Update the reasoning harness to mention template tools.
**Verify:** write `templates/thieves_guild/mission.yaml`, hit `POST /admin/templates/reload`, see 6 new tools in `tools/list`, ingest a mission, query it via `list_missions`, get a coherent answer. **No Go code change between "template added" and "tool available."**
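To make the "no Go code change" claim concrete, here is a rough Python sketch of how tool descriptors could be derived from a template spec alone. Everything in it is an assumption for illustration: the spec fields, the three tool patterns, and the naive pluralization. Only the `list_missions` name comes from this roadmap; the actual generated tool set is defined in `11-extensibility.md`.

```python
# Hypothetical TypeTemplate spec, as it might look after parsing mission.yaml.
MISSION_TEMPLATE = {
    "type": "mission",
    "fields": ["target", "status", "assigned_to"],
}

# Hypothetical tool patterns; only `list_missions` is named in this roadmap.
TOOL_PATTERNS = ["list_{plural}", "get_{singular}", "add_{singular}"]


def generate_tools(template):
    """Derive tool descriptors from a template without any per-type code."""
    singular = template["type"]
    plural = singular + "s"  # naive pluralization, enough for a sketch
    return [
        {
            "name": pattern.format(singular=singular, plural=plural),
            "entity_type": singular,
            "fields": list(template["fields"]),
        }
        for pattern in TOOL_PATTERNS
    ]
```

A generic handler would then dispatch on `entity_type` and build its query from `fields`, which is why adding a template needs no new handler code.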
### Phase 6: Reasoning harness + validation (~4 days)
The system prompt from `07-reasoning-harness.md`, the test harness, and the validation pass.
- [ ] Write the system prompt.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.
- [ ] Measure tool-selection accuracy across the 45-tool surface; collapse the long tail if the LLM is tool-confused.
**Verify:** the LLM, with the system prompt and the tool surface, answers 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.
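Scoring a harness run against the 80% target, and spotting which tools the LLM fails to select, can be done with a few lines. The Python sketch below assumes a hypothetical result record with `expected_tool`, `chosen_tool`, and `answer_ok` fields; the real harness format is not specified here.

```python
from collections import Counter

PASS_THRESHOLD = 0.80  # from the 80%+ target above


def score_run(results):
    """results: list of dicts with 'expected_tool', 'chosen_tool', 'answer_ok'."""
    answered = sum(1 for r in results if r["answer_ok"])
    misses = Counter(
        r["expected_tool"] for r in results if r["chosen_tool"] != r["expected_tool"]
    )
    rate = answered / len(results)
    return {
        "answer_rate": rate,
        "passed": rate >= PASS_THRESHOLD,
        "tool_misses": dict(misses),  # expected tools the LLM failed to pick
    }
```

Tools that show up repeatedly in `tool_misses` are candidates for a better description or for being collapsed into a neighbor.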
### Phase 7: Polish (open-ended)
- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-setting queries: the engine is per-setting, but a future version supports multiple.
---
## Total scope: v1 + v1.1 on Cognee
| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **Total to v1+ext (Phases 0-6)** | **33 days** | **end of phase 6** |
**The MVP is end of phase 3 (16 days).** Schema, UDFs, 45 MCP tools, structured ingestion, lookup/entity_context/state_at, all on Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions.
**The v1 + extensions is end of phase 6 (33 days).** Adds the consistency engine, the TypeTemplate polymorphic extension model, and the reasoning-harness validation. The full 18-doc design contract is implemented.
Compared to the original v1+v1.1 plan (43 days on GraphMCP-Example), the Cognee-based plan saves ~10 days by inheriting the storage abstraction, the extraction pipeline, the embedding store, and the agent-native API. The 17 days of v1.1 modularization work collapses into the 7-day Phase 5 (TypeTemplate) plus the 6-day Phase 4 (consistency engine), because Cognee handles the gateway and decomposition story.
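The figures quoted above can be rechecked from the per-phase estimates in the table. The short Python sketch below only reproduces that arithmetic; the 43-day baseline is the number stated in the preceding paragraph.

```python
from itertools import accumulate

# Per-phase estimates in days, copied from the table above (Phase 7 is open-ended).
PHASE_DAYS = {
    "0 Cognee spike": 2,
    "1 Lore Engine ontology on Cognee": 5,
    "2 Time model + UDFs": 4,
    "3 MCP tool layer": 5,
    "4 Consistency engine": 6,
    "5 TypeTemplate polymorphic extension": 7,
    "6 Reasoning harness + validation": 4,
}

cumulative = list(accumulate(PHASE_DAYS.values()))
mvp_days = cumulative[3]                  # end of Phase 3
total_days = cumulative[-1]               # end of Phase 6
followup_days = total_days - mvp_days     # Phases 4-6
saved_vs_original = 43 - total_days       # against the original 43-day plan
```

Keeping the estimates in one place like this makes it easy to see how a changed estimate shifts the MVP and total dates.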
## What to cut from the full plan if you're under time pressure
In strict order:
1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful addition but not a v1.1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.
The **non-cuttable core is Phases 0-3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.
## The recommended order: spike → MVP → validate → extensions
1. **Phase 0 — Cognee spike (2 days).** Stand up Cognee locally. Ingest a 10-document sample world. Run `cognee.recall("Who is Aldric?")`. **If the spike fails, the substrate decision is wrong and the v1 needs a different foundation.** The 2-day validation is the cheapest insurance available.
2. **Phases 1-3 — MVP (14 days; 16 cumulative with the spike).** Typed ontology, time model, MCP tool layer. The LLM can answer time-bounded questions against a hand-crafted world.
3. **Phase 4 — Consistency engine (6 days).** The engine flags its first real contradiction.
4. **Phase 5 — TypeTemplate polymorphic extension (7 days).** World-builders add new domain types as YAML. **This is the phase that unlocks the "arbitrary new concept" question. Ship this second because it's the highest-leverage single change after the v1 data layer.**
5. **Phase 6 — Reasoning harness + validation (4 days).** Measure: how often does the LLM answer correctly? how often does it surface contradictions? how often does it hallucinate? **This is the validation gate.** If the numbers aren't good, the design has a bug and the v2 should address it.
## Operational docs that ship with the engine
In parallel with the build, the operational story lives in:
- `docs/21-quickstart.md` — the 1-page guide for new world-builders. **Update as the smoke-test command changes.**
- `docs/18-eval-policy.md` — the threshold and cadence for the 50-question harness. Read by the CI that runs on every PR.
- `docs/22-cognee-boundary.md` — the contract that explains what the Lore Engine owns vs. Cognee. Read by anyone considering a substrate swap.
- `docs/19-retcon-policy.md` and `docs/20-multi-setting-policy.md` — domain-specific policies. Read by world-builders when they declare a retcon or a cross-setting reference.
- `docs/cognee-integration.md` — the recipe for the substrate-specific code (extraction prompt override, LiteLLM routing). Read by anyone debugging ingestion.
- `docs/prompts/` and `docs/models/` — the prompt and model registries. Updated whenever a prompt or model changes; the harness is the gate.