# 09 — Roadmap

A phased build plan. **MVP first, modules second.** Each phase produces a working artifact that adds value, not a half-built layer that has to wait for the next.

## Phase 0: Pre-flight (1 day)

Before any new code:

- [ ] Read this design end-to-end. Find the contradictions (they're there; see `10-critique.md`).
- [ ] Resolve open questions in `10-critique.md#open-questions`.
- [ ] Make the world-builder pick the first YAML schema to formalize. Recommendation: **`family_tree.yaml` first**, because lineage is the highest-stakes data and the prose path's failure rate is highest there.

**Deliverable:** sign-off on this design, the YAML schemas for at least 3 source types, and a 1-line summary of the first world to be modeled.

## Phase 1: Schema + UDFs (3–5 days)

The substrate. No LLM-facing changes; just the data layer.

- [ ] Add all new constraints and indexes from `08-architecture.md#schema-bootstrap` to `neo4j-init.cypher`.
- [ ] Implement `time_in_window` and `time_windows_overlap` UDFs in Java (Neo4j user-defined function). Unit-test against 30+ known cases including era-tree membership and `current` resolution.
- [ ] Add the `:Now` config node to the schema (a single `:Now` node with `current_time: "3rd_age.year_380"` or similar).
- [ ] Add 5 starter `:OntologyRule` nodes to the schema (the 5 most common from `05-mcp-tools.md#starter-rules`).
- [ ] Document the canonical time format as a comment block in the schema file.

**Deliverable:** running Neo4j with the extended schema, both UDFs working, a test script that validates UDF behavior against 30+ cases. Cognee's default `recall` still works for unstructured queries alongside the typed model.
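The window semantics that the Phase 1 UDFs need to satisfy can be sketched outside Neo4j. The following is a minimal Go illustration, not the Java UDF itself. It assumes the canonical format is `<era>.year_<n>` (inferred from values such as `3rd_age.year_345`), that window bounds are inclusive, and that all arguments share one era. Era-tree membership and `current` resolution are deliberately left out. The helper names `parsePoint` and `parseSameEra` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// point is a parsed canonical time such as "3rd_age.year_345".
type point struct {
	era  string
	year int
}

// parsePoint splits "<era>.year_<n>" into its era and year (assumed format).
func parsePoint(s string) (point, error) {
	era, rest, ok := strings.Cut(s, ".year_")
	if !ok || era == "" {
		return point{}, fmt.Errorf("malformed time %q", s)
	}
	year, err := strconv.Atoi(rest)
	if err != nil {
		return point{}, fmt.Errorf("malformed year in %q: %w", s, err)
	}
	return point{era: era, year: year}, nil
}

// parseSameEra parses every value and rejects mixed eras, because ordering
// across eras needs the era tree, which this sketch does not model.
func parseSameEra(values ...string) ([]point, error) {
	pts := make([]point, 0, len(values))
	for _, v := range values {
		p, err := parsePoint(v)
		if err != nil {
			return nil, err
		}
		if len(pts) > 0 && p.era != pts[0].era {
			return nil, fmt.Errorf("cross-era comparison not supported: %q vs %q", pts[0].era, p.era)
		}
		pts = append(pts, p)
	}
	return pts, nil
}

// TimeInWindow reports whether t lies in [start, end]; inclusive bounds are an assumption.
func TimeInWindow(t, start, end string) (bool, error) {
	p, err := parseSameEra(t, start, end)
	if err != nil {
		return false, err
	}
	return p[1].year <= p[0].year && p[0].year <= p[2].year, nil
}

// TimeWindowsOverlap reports whether [aStart, aEnd] and [bStart, bEnd] share at least one year.
func TimeWindowsOverlap(aStart, aEnd, bStart, bEnd string) (bool, error) {
	p, err := parseSameEra(aStart, aEnd, bStart, bEnd)
	if err != nil {
		return false, err
	}
	return p[0].year <= p[3].year && p[2].year <= p[1].year, nil
}

func main() {
	ok, err := TimeInWindow("3rd_age.year_345", "3rd_age.year_340", "3rd_age.year_352")
	fmt.Println(ok, err) // prints "true <nil>"
}
```

A table of cases like the one in `main` is a convenient seed for the 30+ unit tests: each row pins one boundary decision (inclusive ends, malformed input, mixed eras) before the Java implementation is trusted.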

**Verify:**

```bash
# $PWD is the shell's working directory, not a password, so the password is
# read from NEO4J_PASSWORD here (the variable name is an assumption).
# -i keeps stdin open so the redirected schema file reaches cypher-shell.
docker exec -i neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" < schema/init.cypher
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352')"
# → true
```

## Phase 2: `time_in_window`-aware tools (3–5 days)

Add 4 of the 5 time-aware MCP tools from `05-mcp-tools.md#group-2`:

- [ ] `was_true_at`
- [ ] `true_during`
- [ ] `entities_present`
- [ ] `timeline`

(Defer `state_at` until Phase 4, when the ontology is fully populated.)

Each tool is a single Go function in `mcp-server/main.go` with one Cypher query. The existing pattern (session-based, JSON-RPC, tool registration) carries over.

**Deliverable:** 4 new MCP tools registered, working end-to-end against a small seed dataset. Manual test: ask Claude a time-bounded question, get a sourced answer.

**Verify:**

```bash
# Send the body as JSON; curl -d alone defaults to form encoding.
curl -X POST http://localhost:9000/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"RULED","subject":"House Vyr","object":"Valdorn","at_time":"3rd_age.year_345"}}}'
```

## Phase 3: Structured ingestion (5–7 days)

The new pipeline. This is the most leveraged single phase.

- [ ] Build `services/structured-ingestor/` Go worker.
- [ ] Implement parsers for `timeline.yaml`, `family_tree.yaml`, `gazetteer.yaml` (the three highest-stakes types). Defer `bestiary.yaml`, `magic_system.yaml`, `culture.yaml` to Phase 5.
- [ ] Add a `POST /ingest/structured` endpoint to the ingestion worker.
- [ ] Wire the new Redis stream (`raw.structured`) into docker-compose.
- [ ] Write 3 example YAMLs for the seed world.
- [ ] Add a CLI wrapper `tea add-source ` (optional but nice).

**Deliverable:** world-builder can write a `family_tree.yaml`, post it, see the lineage nodes and `PARENT_OF` edges in Neo4j within 5 seconds. No LLM involved.
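The deterministic step in the Phase 3 deliverable, turning a decoded `family_tree.yaml` into `PARENT_OF` edges, might look roughly like the sketch below. The `Person` struct shape and the sample children are made up for illustration, since the YAML schema is not defined in this document. Only the `:Person` label, the `name` property, and the `PARENT_OF` relationship come from the text. YAML decoding and the Neo4j driver call are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Person mirrors one decoded entry of a hypothetical family_tree.yaml.
type Person struct {
	Name     string
	Children []string
}

// Edge is one PARENT_OF relationship to write to the graph.
type Edge struct {
	Parent, Child string
}

// mergeParentOf uses MERGE throughout, so re-posting the same file
// does not create duplicate nodes or relationships.
const mergeParentOf = `MERGE (a:Person {name: $parent})
MERGE (b:Person {name: $child})
MERGE (a)-[:PARENT_OF]->(b)`

// ParentEdges flattens the tree into a de-duplicated, sorted edge list,
// so the same input always yields the same writes in the same order.
func ParentEdges(people []Person) []Edge {
	seen := make(map[Edge]bool)
	var edges []Edge
	for _, p := range people {
		for _, c := range p.Children {
			e := Edge{Parent: p.Name, Child: c}
			if p.Name == "" || c == "" || seen[e] {
				continue // skip blanks and repeats
			}
			seen[e] = true
			edges = append(edges, e)
		}
	}
	sort.Slice(edges, func(i, j int) bool {
		if edges[i].Parent != edges[j].Parent {
			return edges[i].Parent < edges[j].Parent
		}
		return edges[i].Child < edges[j].Child
	})
	return edges
}

func main() {
	// Sample data is invented; "Bran" is listed twice to show de-duplication.
	tree := []Person{{Name: "Aldric", Children: []string{"Bran", "Cela", "Bran"}}}
	for _, e := range ParentEdges(tree) {
		// A real worker would run mergeParentOf with these parameters
		// through a Neo4j driver session instead of printing them.
		fmt.Printf("parent=%s child=%s\n%s\n\n", e.Parent, e.Child, mergeParentOf)
	}
}
```

Keeping the edge list sorted and de-duplicated before any write is one way to make ingestion repeatable, which is the property the "No LLM involved" path is meant to guarantee.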

**Verify:**

```bash
curl -X POST http://localhost:8080/ingest/structured -F "file=@test-family-tree.yaml"
# $PWD is the working directory, not a password; NEO4J_PASSWORD is an assumed variable name.
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "MATCH (a:Person)-[:PARENT_OF]->(b:Person) RETURN a.name, b.name"
```

## Phase 4: `state_at` + `entity_context` + `lookup` (3–5 days)

The disambiguation and snapshot tools. These unlock the rest.

- [ ] `lookup(query, type?)` — the entry point. Uses string similarity + the `:Entity` hub node.
- [ ] `entity_context(name, at_time?)` — one-hop summary.
- [ ] `state_at(entity, at_time)` — composes multiple queries.
- [ ] Active context tracking in the MCP server (session-scoped).

**Deliverable:** the LLM can ask "who is X?" and get a one-call answer, and can disambiguate ambiguous references via `lookup`.

## Phase 5: Remaining structured ingestors (3–5 days)

- [ ] `bestiary.yaml` parser
- [ ] `magic_system.yaml` parser
- [ ] `culture.yaml` parser
- [ ] `language.yaml` parser (could merge with culture)

**Deliverable:** the world can be fully described in structured YAML, and the engine ingests it deterministically.

## Phase 6: Lineage & hierarchy tools (2–3 days)

- [ ] `list_lineage(person)`
- [ ] `list_offspring(person)`
- [ ] `ancestors_of(person, generations?)`
- [ ] `descendants_of(person, generations?)`
- [ ] `location_hierarchy(location, direction?)`

**Deliverable:** the LLM can navigate bloodlines and geography in a single tool call.

## Phase 7: Consistency engine (5–7 days)

- [ ] `services/consistency-runner/` Go worker.
- [ ] Implement all 4 rule categories from `04-consistency.md`.
- [ ] Implement the 10 starter `:OntologyRule` nodes.
- [ ] Implement `services/consistency-monitor/` HTTP service.
- [ ] Expose `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run` MCP tools.
- [ ] Schedule the runner at 03:00 daily (cron in docker-compose).

**Deliverable:** the engine flags its first real contradiction.
This is the most leveraged phase after structured ingestion.

**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it.

## Phase 8: Lore extension tools (2–3 days)

- [ ] `expand_context(entity, hops, relation_filter, min_confidence, limit)`
- [ ] `event_chain(event, depth)`
- [ ] `significance_of(entity)`
- [ ] `lore_about(entity, type?, limit)`
- [ ] `cite(claim)`

**Deliverable:** the LLM can answer "how are A and B connected?" and "what do the chronicles say about X?" in single calls.

## Phase 9: Generation tools (3–5 days)

- [ ] `summarize_chain(entity, depth, style)` — uses LiteLLM proxy.
- [ ] `narrate_arc(start_event, end_event, perspective?)` — composes multiple queries.
- [ ] World-builder tools (`add_entity`, `add_relation`, `add_lore_source`, etc.).

**Deliverable:** the LLM can produce grounded narrative text on demand.

## Phase 10: Reasoning harness + integration test (3–5 days)

- [ ] Write the system prompt from `07-reasoning-harness.md`.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.

**Deliverable:** the LLM, with the system prompt and the tool surface, can answer 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.

## Phase 11: Polish (open-ended)

- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-world queries: the engine is per-world, but a future version supports multiple.
---

## Total scope estimate

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer (Cognee extension) | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **MVP (Phases 0–3)** | **16 days** | **end of phase 3** |

**The MVP is end of phase 3.** That's: validated Cognee substrate, the typed Lore Engine ontology, the time model + UDFs, and the 45 MCP tools exposed through Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions. The consistency engine and the TypeTemplate polymorphic extension are the v1.1 follow-ups; per the v1.1 plan below, they land in Phases 4 and 5 of the unified roadmap.

## What to cut if you're under time pressure

In strict order:

1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The **non-cuttable core is Phases 0–3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

## What NOT to do

- **Do not skip the Cognee spike.** The substrate decision is the highest-leverage call in the project. If Cognee can't represent the typed ontology or the time model, the spike fails fast and the v1 needs a different foundation. The 1–2 days is the cheapest insurance available.
- **Do not skip the UDF unit tests.** Every time-aware query depends on `time_in_window` and `time_windows_overlap`. If they have a bug, every consistency check is wrong. Test first, then trust.
- **Do not over-invest in prose extraction.** Cognee handles the prose path; the structured YAML path is what the Lore Engine adds on top. Structured is exact, prose is fuzzy; high-stakes data goes through structured paths.
- **Do not try to support 100% of question types in the first build.** Ship the 5 patterns from `07-reasoning-harness.md` and iterate. The LLM is forgiving of missing tools if the existing ones are reliable.

---

## Phases 4–7: From MVP to production-ready

After the v1 MVP (Phases 0–3) is built and the time model works end-to-end, four phases add the consistency engine, the polymorphic extension model, the reasoning-harness validation, and the polish layer. This is a single ~17-day follow-up that lands everything in the v1.1/v1.2 design docs on top of the Cognee-based v1.

### Phase 4: Consistency engine (~6 days)

The 4-category rule system from `04-consistency.md`: Contradiction, Anachronism, Orphan, OntologyViolation. Implemented as a Cognee data-pipeline that runs on a schedule and on demand, materializing violation nodes in the same graph the LLM queries.

- [ ] Implement the 4 rule categories from `04-consistency.md` as a Cognee data-pipeline.
- [ ] Implement the 10 starter `:OntologyRule` nodes from `05-mcp-tools.md#starter-rules`.
- [ ] Expose the 10 Group 6 consistency tools: `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run`, `add_ontology_rule`, `list_ontology_rules`.
- [ ] Schedule the consistency pipeline to run nightly (Cognee task scheduler).

**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it.
Test anachronism detection against a known historical claim that has a Person participating in an Event outside their lifespan.

### Phase 5: TypeTemplate polymorphic extension (~7 days)

The big one. The `DomainEntity`, `Relation`, and `TypeTemplate` labels, the template-watcher service, and the dynamic tool generator. Per `11-extensibility.md` and `12-storage-strategy.md`, this is the v1.1 extension model that makes new domain types a YAML exercise.

- [ ] Register the `DomainEntity`, `Relation`, and `TypeTemplate` labels as a Cognee data-model extension.
- [ ] Build the template-watcher service: watches `./templates/`, validates YAML, registers templates.
- [ ] Build the template-registry: persists template specs alongside the Cognee storage layer.
- [ ] Implement the dynamic tool generator: a generic handler that runs queries generated from `TypeTemplate` specs.
- [ ] Add the `list_template_tools` MCP tool.
- [ ] Ship the four example templates from `14-examples.md` (thieves-guild mission, war campaign, black-market lot, NPC secret knowledge).
- [ ] Update the reasoning harness to mention template tools.

**Verify:** write `templates/thieves_guild/mission.yaml`, hit `POST /admin/templates/reload`, see 6 new tools in `tools/list`, ingest a mission, query it via `list_missions`, get a coherent answer. **No Go code change between "template added" and "tool available."**

### Phase 6: Reasoning harness + validation (~4 days)

The system prompt from `07-reasoning-harness.md`, the test harness, and the validation pass.

- [ ] Write the system prompt.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.
- [ ] Measure tool-selection accuracy across the 45-tool surface; collapse the long tail if the LLM is tool-confused.
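Scoring for the harness in the checklist above could be as small as the sketch below. It is an assumption about how results might be recorded, not the harness design: the `Case` and `Report` types are hypothetical, tool-selection accuracy is taken to mean an exact match of the expected tool sequence, and the pass gate reuses the 80% answer threshold stated for this phase.

```go
package main

import "fmt"

// Case is one worked question with the observed run attached (hypothetical shape).
type Case struct {
	Question      string
	ExpectedTools []string
	ActualTools   []string
	AnswerCorrect bool
}

// Report summarizes a harness run.
type Report struct {
	AnswerAccuracy float64 // share of questions answered correctly
	ToolAccuracy   float64 // share of runs whose tool sequence matched exactly
	Pass           bool    // true when AnswerAccuracy meets the threshold
}

// sameSequence reports whether two tool sequences are identical, order included.
func sameSequence(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// Score computes both accuracies; an empty run never passes.
func Score(cases []Case, threshold float64) Report {
	if len(cases) == 0 {
		return Report{}
	}
	var answers, tools int
	for _, c := range cases {
		if c.AnswerCorrect {
			answers++
		}
		if sameSequence(c.ExpectedTools, c.ActualTools) {
			tools++
		}
	}
	n := float64(len(cases))
	r := Report{AnswerAccuracy: float64(answers) / n, ToolAccuracy: float64(tools) / n}
	r.Pass = r.AnswerAccuracy >= threshold
	return r
}

func main() {
	cases := []Case{
		{
			Question:      "Did House Vyr rule Valdorn in 3rd_age.year_345?",
			ExpectedTools: []string{"lookup", "was_true_at"},
			ActualTools:   []string{"lookup", "was_true_at"},
			AnswerCorrect: true,
		},
		{
			Question:      "Who is Aldric?",
			ExpectedTools: []string{"lookup", "entity_context"},
			ActualTools:   []string{"entity_context"}, // skipped disambiguation
			AnswerCorrect: false,
		},
	}
	r := Score(cases, 0.80)
	fmt.Printf("answers %.0f%%, tool sequences %.0f%%, pass=%v\n",
		r.AnswerAccuracy*100, r.ToolAccuracy*100, r.Pass)
}
```

Tracking tool-sequence accuracy separately from answer accuracy makes it easier to tell a tool-confused model from one that picks the right tools but misreads their output.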

**Verify:** the LLM, with the system prompt and the tool surface, answers 80%+ of the test questions correctly. The remaining 20% are documented as known limitations.

### Phase 7: Polish (open-ended)

- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-setting queries: the engine is per-setting, but a future version supports multiple.

---

## Total scope: v1 + v1.1 on Cognee

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **Total to v1+ext (Phases 0–6)** | **33 days** | **end of phase 6** |

**The MVP is end of phase 3 (16 days).** Schema, UDFs, 45 MCP tools, structured ingestion, lookup/entity_context/state_at, all on Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions.

**The v1 + extensions is end of phase 6 (33 days).** Adds the consistency engine, the TypeTemplate polymorphic extension model, and the reasoning-harness validation. The full 18-doc design contract is implemented.

Compared to the original v1+v1.1 plan (43 days on GraphMCP-Example), the Cognee-based plan saves ~10 days by inheriting the storage abstraction, the extraction pipeline, the embedding store, and the agent-native API. The 17 days of v1.1 modularization work collapses into the 7-day Phase 5 (TypeTemplate) plus the 6-day Phase 4 (consistency engine), because Cognee handles the gateway and decomposition story.

## What to cut from the full plan if you're under time pressure

In strict order:

1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful addition but not a v1.1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The **non-cuttable core is Phases 0–3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

## The recommended order: spike → MVP → validate → extensions

1. **Phase 0 — Cognee spike (2 days).** Stand up Cognee locally. Ingest a 10-document sample world. Run `cognee.recall("Who is Aldric?")`. **If the spike fails, the substrate decision is wrong and the v1 needs a different foundation.** The 2-day validation is the cheapest insurance available.
2. **Phases 1–3 — MVP (14 days, 16 cumulative with the spike).** Typed ontology, time model, MCP tool layer. The LLM can answer time-bounded questions against a hand-crafted world.
3. **Phase 4 — Consistency engine (6 days).** The engine flags its first real contradiction.
4. **Phase 5 — TypeTemplate polymorphic extension (7 days).** World-builders add new domain types as YAML. **This is the phase that unlocks the "arbitrary new concept" question. Ship this second because it's the highest-leverage single change after the v1 data layer.**
5. **Phase 6 — Reasoning harness + validation (4 days).** Measure: how often does the LLM answer correctly? how often does it surface contradictions? how often does it hallucinate? **This is the validation gate.** If the numbers aren't good, the design has a bug and the v2 should address it.
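The per-phase day counts in the order above can be cross-checked against the cumulative column of the scope table with a running total. The sketch below only restates the table's numbers; the `Phase` type and `Cumulative` function are hypothetical names.

```go
package main

import "fmt"

// Phase is one row of the scope table.
type Phase struct {
	Name string
	Days int
}

// Cumulative returns the running total of days after each phase.
func Cumulative(phases []Phase) []int {
	totals := make([]int, len(phases))
	sum := 0
	for i, p := range phases {
		sum += p.Days
		totals[i] = sum
	}
	return totals
}

func main() {
	// Day counts copied from the scope table; Phase 7 (Polish) is open-ended and omitted.
	phases := []Phase{
		{"0 Cognee spike", 2},
		{"1 Lore Engine ontology on Cognee", 5},
		{"2 Time model + UDFs", 4},
		{"3 MCP tool layer", 5},
		{"4 Consistency engine", 6},
		{"5 TypeTemplate polymorphic extension", 7},
		{"6 Reasoning harness + validation", 4},
	}
	for i, total := range Cumulative(phases) {
		fmt.Printf("%-40s %2d days, cumulative %2d\n", phases[i].Name, phases[i].Days, total)
	}
}
```

The running total reaches 16 after Phase 3 and 33 after Phase 6, which matches the two milestones in the table; Phases 1–3 on their own account for 14 of the first 16 days.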
## Operational docs that ship with the engine

In parallel with the build, the operational story lives in:

- `docs/21-quickstart.md` — the 1-page guide for new world-builders. **Update as the smoke-test command changes.**
- `docs/18-eval-policy.md` — the threshold and cadence for the 50-question harness. Read by the CI that runs on every PR.
- `docs/22-cognee-boundary.md` — the contract that explains what the Lore Engine owns vs. Cognee. Read by anyone considering a substrate swap.
- `docs/19-retcon-policy.md` and `docs/20-multi-setting-policy.md` — domain-specific policies. Read by world-builders when they declare a retcon or a cross-setting reference.
- `docs/cognee-integration.md` — the recipe for the substrate-specific code (extraction prompt override, LiteLLM routing). Read by anyone debugging ingestion.
- `docs/prompts/` and `docs/models/` — the prompt and model registries. Updated whenever a prompt or model changes; the harness is the gate.