lore-engine/docs/09-roadmap.md

# 09 — Roadmap

A phased build plan. **MVP first, modules second.** Each phase produces a working artifact that adds value, not a half-built layer that has to wait for the next.

## Phase 0: Pre-flight (1 day)

Before any new code:

- [ ] Read this design end-to-end. Find the contradictions (they're there; see `10-critique.md`).
- [ ] Resolve open questions in `10-critique.md#open-questions`.
- [ ] Have the world-builder pick the first YAML schema to formalize. Recommendation: **`family_tree.yaml` first**, because lineage is the highest-stakes data and the prose path's failure rate is highest there.

**Deliverable:** sign-off on this design, the YAML schemas for at least 3 source types, and a 1-line summary of the first world to be modeled.

## Phase 1: Schema + UDFs (3–5 days)

The substrate. No LLM-facing changes; just the data layer.

- [ ] Add all new constraints and indexes from `08-architecture.md#schema-bootstrap` to `neo4j-init.cypher`.
- [ ] Implement `time_in_window` and `time_windows_overlap` UDFs in Java (Neo4j user-defined function). Unit-test against 30+ known cases including era-tree membership and `current` resolution.
- [ ] Add the `:Now` config node to the schema (a single `:Now` node with `current_time: "3rd_age.year_380"` or similar).
- [ ] Add 5 starter `:OntologyRule` nodes to the schema (the 5 most common from `05-mcp-tools.md#starter-rules`).
- [ ] Document the canonical time format as a comment block in the schema file.

**Deliverable:** running Neo4j with the extended schema, both UDFs working, a test script that validates UDF behavior against 30+ cases. Cognee's default `recall` still works for unstructured queries alongside the typed model.

**Verify:**
```bash
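# NEO4J_PASSWORD (assumed variable name) holds the Neo4j password; export it first.
# -i keeps stdin open so the redirected schema file reaches cypher-shell.
docker exec -i neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" < schema/init.cypher
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352')"
# → true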
```
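
The UDFs themselves are specified as Java, but the window semantics can be pinned down in a small reference model that generates expected values for the 30+ test cases. The following Go sketch is such a model under stated assumptions, not the UDF implementation: it assumes the `<era>.year_<n>` shape seen in the example above, treats windows as closed on both ends, and refuses cross-era comparisons instead of consulting the era tree. `current` resolution is not modeled. The names `Instant`, `ParseInstant`, `TimeInWindow`, and `TimeWindowsOverlap` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Instant is a parsed time of the assumed form "<era>.year_<n>".
type Instant struct {
	Era  string
	Year int
}

// ParseInstant parses strings such as "3rd_age.year_345".
func ParseInstant(s string) (Instant, error) {
	era, rest, ok := strings.Cut(s, ".year_")
	if !ok || era == "" {
		return Instant{}, fmt.Errorf("malformed time %q", s)
	}
	year, err := strconv.Atoi(rest)
	if err != nil {
		return Instant{}, fmt.Errorf("malformed year in %q: %w", s, err)
	}
	return Instant{Era: era, Year: year}, nil
}

// parseSameEra parses every value and rejects mixed eras.
// Assumption: ordering across eras needs the era tree, which this sketch omits.
func parseSameEra(vals ...string) ([]Instant, error) {
	out := make([]Instant, 0, len(vals))
	for _, v := range vals {
		in, err := ParseInstant(v)
		if err != nil {
			return nil, err
		}
		if len(out) > 0 && in.Era != out[0].Era {
			return nil, fmt.Errorf("cross-era comparison (%s vs %s) not handled", out[0].Era, in.Era)
		}
		out = append(out, in)
	}
	return out, nil
}

// TimeInWindow reports whether t lies in the closed window [from, to].
func TimeInWindow(t, from, to string) (bool, error) {
	in, err := parseSameEra(t, from, to)
	if err != nil {
		return false, err
	}
	return in[1].Year <= in[0].Year && in[0].Year <= in[2].Year, nil
}

// TimeWindowsOverlap reports whether [aFrom, aTo] and [bFrom, bTo] share a year.
func TimeWindowsOverlap(aFrom, aTo, bFrom, bTo string) (bool, error) {
	in, err := parseSameEra(aFrom, aTo, bFrom, bTo)
	if err != nil {
		return false, err
	}
	return in[0].Year <= in[3].Year && in[2].Year <= in[1].Year, nil
}

func main() {
	ok, err := TimeInWindow("3rd_age.year_345", "3rd_age.year_340", "3rd_age.year_352")
	fmt.Println(ok, err) // true <nil>
}
```

Whether windows are closed or half-open is a design decision the canonical time format comment should settle; boundary years such as `year_352` in the example are the cases most worth including in the test table.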

## Phase 2: `time_in_window`-aware tools (3–5 days)

Add 4 of the 5 time-aware MCP tools from `05-mcp-tools.md#group-2`:

- [ ] `was_true_at`
- [ ] `true_during`
- [ ] `entities_present`
- [ ] `timeline`

(Defer `state_at` until Phase 4, when the ontology is fully populated.)

Each tool is a single Go function in `mcp-server/main.go` with one Cypher query. The existing pattern (session-based, JSON-RPC, tool registration) carries over.
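
As an illustration of the "one Go function, one Cypher query" shape, here is a minimal sketch of the query-building half of `was_true_at`. It is not the project's handler: the function name `WasTrueAtQuery`, the `name` property, and the `valid_from`/`valid_to` relationship properties are assumptions, and the driver call that would execute the query is left out so the block stays standard-library only. Because the relationship type is interpolated into the query text rather than bound like the `$subject` style parameters, the sketch validates it against a strict pattern first.

```go
package main

import (
	"fmt"
	"regexp"
)

// relTypePattern restricts relationship types to upper-snake-case identifiers,
// so interpolating one into the query text cannot inject Cypher.
var relTypePattern = regexp.MustCompile(`^[A-Z][A-Z0-9_]*$`)

// WasTrueAtQuery builds the Cypher text and parameters for a was_true_at call.
// Assumed property names: name, valid_from, valid_to.
func WasTrueAtQuery(relation, subject, object, atTime string) (string, map[string]any, error) {
	if !relTypePattern.MatchString(relation) {
		return "", nil, fmt.Errorf("invalid relation type %q", relation)
	}
	query := fmt.Sprintf(
		"MATCH (s {name: $subject})-[r:%s]->(o {name: $object}) "+
			"WHERE time_in_window($at_time, r.valid_from, r.valid_to) "+
			"RETURN count(r) > 0 AS was_true", relation)
	params := map[string]any{
		"subject": subject,
		"object":  object,
		"at_time": atTime,
	}
	return query, params, nil
}

func main() {
	q, p, err := WasTrueAtQuery("RULED", "House Vyr", "Valdorn", "3rd_age.year_345")
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
	fmt.Println(p["at_time"])
}
```

Keeping query construction separate from execution also makes each tool testable without a running Neo4j instance.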

**Deliverable:** 4 new MCP tools registered, working end-to-end against a small seed dataset. Manual test: ask Claude a time-bounded question, get a sourced answer.

**Verify:**
```bash
curl -X POST http://localhost:9000/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"RULED","subject":"House Vyr","object":"Valdorn","at_time":"3rd_age.year_345"}}}'
```

## Phase 3: Structured ingestion (5–7 days)

The new pipeline. This is the most leveraged single phase.

- [ ] Build `services/structured-ingestor/` Go worker.
- [ ] Implement parsers for `timeline.yaml`, `family_tree.yaml`, `gazetteer.yaml` (the three highest-stakes types). Defer `bestiary.yaml`, `magic_system.yaml`, `culture.yaml` to Phase 5.
- [ ] Add a `POST /ingest/structured` endpoint to the ingestion worker.
- [ ] Wire the new Redis stream (`raw.structured`) into docker-compose.
- [ ] Write 3 example YAMLs for the seed world.
- [ ] Add a CLI wrapper `tea add-source <file>` (optional but nice).

**Deliverable:** world-builder can write a `family_tree.yaml`, post it, see the lineage nodes and `PARENT_OF` edges in Neo4j within 5 seconds. No LLM involved.

**Verify:**
```bash
curl -X POST http://localhost:8080/ingest/structured -F "file=@test-family-tree.yaml"
# NEO4J_PASSWORD (assumed variable name) holds the Neo4j password; export it first.
docker exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" "MATCH (a:Person)-[:PARENT_OF]->(b:Person) RETURN a.name, b.name"
```
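
The "no LLM involved" property comes from the ingestor being a plain, deterministic transformation from parsed YAML to Cypher. The sketch below shows that step only, under assumptions: the `Person` struct stands in for whatever `family_tree.yaml` parses into (the real schema is decided in Phase 0), YAML decoding is omitted because it needs a third-party library, and `LineageStatements` is a hypothetical name. Using `MERGE` for both nodes and the edge keeps re-ingesting the same file idempotent.

```go
package main

import "fmt"

// Person is one entry of a hypothetical family_tree.yaml after parsing.
type Person struct {
	Name     string
	Children []string
}

// Statement is one parameterized Cypher statement ready for the driver.
type Statement struct {
	Cypher string
	Params map[string]any
}

// mergeParentOf creates both people and the edge only if they do not exist yet.
const mergeParentOf = "MERGE (a:Person {name: $parent}) " +
	"MERGE (b:Person {name: $child}) " +
	"MERGE (a)-[:PARENT_OF]->(b)"

// LineageStatements emits one statement per parent-child pair, in input order.
// People without children produce no statement in this sketch.
func LineageStatements(people []Person) []Statement {
	var out []Statement
	for _, p := range people {
		for _, c := range p.Children {
			out = append(out, Statement{
				Cypher: mergeParentOf,
				Params: map[string]any{"parent": p.Name, "child": c},
			})
		}
	}
	return out
}

func main() {
	// "Bren" and "Cael" are made-up names for illustration.
	tree := []Person{{Name: "Aldric", Children: []string{"Bren", "Cael"}}}
	for _, s := range LineageStatements(tree) {
		fmt.Println(s.Params["parent"], "->", s.Params["child"])
	}
}
```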

## Phase 4: `state_at` + `entity_context` + `lookup` (3–5 days)

The disambiguation and snapshot tools. These unlock the rest.

- [ ] `lookup(query, type?)` — the entry point. Uses string similarity + the `:Entity` hub node.
- [ ] `entity_context(name, at_time?)` — one-hop summary.
- [ ] `state_at(entity, at_time)` — composes multiple queries.
- [ ] Active context tracking in the MCP server (session-scoped).

**Deliverable:** the LLM can ask "who is X?" and get a one-call answer, and can disambiguate ambiguous references via `lookup`.
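
To make the "string similarity" part of `lookup` concrete, the following sketch ranks candidate names by edit distance. It is one possible similarity measure, chosen here for illustration; the design does not name a specific one. The candidate list would come from the `:Entity` hub in practice, and `Lookup`, `Match`, and the `maxDistance` cutoff are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// Match pairs a candidate name with its distance to the query.
type Match struct {
	Name     string
	Distance int
}

// Lookup returns candidates within maxDistance edits of query, nearest first.
// Comparison is case-insensitive; ties keep the input order.
func Lookup(query string, names []string, maxDistance int) []Match {
	q := strings.ToLower(query)
	var out []Match
	for _, n := range names {
		if d := levenshtein(q, strings.ToLower(n)); d <= maxDistance {
			out = append(out, Match{Name: n, Distance: d})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Distance < out[j].Distance })
	return out
}

func main() {
	names := []string{"Valdorn", "Aldric", "House Vyr"}
	fmt.Println(Lookup("Aldrik", names, 2))
}
```

Returning the distance alongside the name lets the tool report several near matches and leave the final disambiguation to the LLM.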

## Phase 5: Remaining structured ingestors (3–5 days)

- [ ] `bestiary.yaml` parser
- [ ] `magic_system.yaml` parser
- [ ] `culture.yaml` parser
- [ ] `language.yaml` parser (could merge with culture)

**Deliverable:** the world can be fully described in structured YAML, and the engine ingests it deterministically.

## Phase 6: Lineage & hierarchy tools (2–3 days)

- [ ] `list_lineage(person)`
- [ ] `list_offspring(person)`
- [ ] `ancestors_of(person, generations?)`
- [ ] `descendants_of(person, generations?)`
- [ ] `location_hierarchy(location, direction?)`

**Deliverable:** the LLM can navigate bloodlines and geography in a single tool call.
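
The optional `generations?` argument on `ancestors_of` is a depth cap on a parent-link traversal. In Neo4j this would most likely be a variable-length path over `PARENT_OF`; the in-memory sketch below only models the intended semantics so expected results can be written down for tests. The function name `AncestorsOf`, the map-based input, and the convention that a non-positive cap means "no limit" are assumptions.

```go
package main

import "fmt"

// AncestorsOf walks child-to-parent links breadth-first and returns each
// ancestor with the generation at which it was first reached (1 = parent).
// maxGen <= 0 means no limit. parents maps a person to their known parents.
func AncestorsOf(parents map[string][]string, person string, maxGen int) map[string]int {
	found := map[string]int{}
	frontier := []string{person}
	for gen := 1; len(frontier) > 0 && (maxGen <= 0 || gen <= maxGen); gen++ {
		var next []string
		for _, p := range frontier {
			for _, anc := range parents[p] {
				if _, seen := found[anc]; seen || anc == person {
					continue // skip repeats and guard against cyclic data
				}
				found[anc] = gen
				next = append(next, anc)
			}
		}
		frontier = next
	}
	return found
}

func main() {
	// "Bren" and "Cael" are made-up names for illustration.
	parents := map[string][]string{
		"Cael": {"Bren"},
		"Bren": {"Aldric"},
	}
	fmt.Println(AncestorsOf(parents, "Cael", 0)) // map[Aldric:2 Bren:1]
}
```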

## Phase 7: Consistency engine (5–7 days)

- [ ] `services/consistency-runner/` Go worker.
- [ ] Implement all 4 rule categories from `04-consistency.md`.
- [ ] Implement the 10 starter `:OntologyRule` nodes.
- [ ] Implement `services/consistency-monitor/` HTTP service.
- [ ] Expose `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run` MCP tools.
- [ ] Schedule the runner at 03:00 daily (cron in docker-compose).

**Deliverable:** the engine flags its first real contradiction. This is the most leveraged phase after structured ingestion.

**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it.
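
The verification scenario above reduces to a pairwise check. The sketch below is one way to express it, not the rule definition from `04-consistency.md`: it assumes each claim has been normalized to a single-valued fact key (for example "who rules Valdorn"), that validity windows are closed year ranges within one era, and that two claims contradict when they give different values for the same key over overlapping windows. `Claim`, `Contradiction`, `FindContradictions`, and the sample source names are hypothetical.

```go
package main

import "fmt"

// Claim is one sourced assertion about a single-valued fact.
// From and To bound a closed validity window in years of one era.
type Claim struct {
	Key    string // normalized fact identity, e.g. "ruler_of:Valdorn"
	Value  string
	Source string
	From   int
	To     int
}

// Contradiction pairs two claims that cannot both hold.
type Contradiction struct {
	A, B Claim
}

// FindContradictions flags every pair with the same key, different values,
// and overlapping windows. It is quadratic, which is fine for a sketch.
func FindContradictions(claims []Claim) []Contradiction {
	var out []Contradiction
	for i := 0; i < len(claims); i++ {
		for j := i + 1; j < len(claims); j++ {
			a, b := claims[i], claims[j]
			overlap := a.From <= b.To && b.From <= a.To
			if a.Key == b.Key && a.Value != b.Value && overlap {
				out = append(out, Contradiction{A: a, B: b})
			}
		}
	}
	return out
}

func main() {
	// "House Orn" and the chronicle names are made up for illustration.
	claims := []Claim{
		{Key: "ruler_of:Valdorn", Value: "House Vyr", Source: "chronicle_a", From: 340, To: 352},
		{Key: "ruler_of:Valdorn", Value: "House Orn", Source: "chronicle_b", From: 350, To: 360},
	}
	for _, c := range FindContradictions(claims) {
		fmt.Printf("%s: %q (%s) vs %q (%s)\n", c.A.Key, c.A.Value, c.A.Source, c.B.Value, c.B.Source)
	}
}
```

Multi-valued relations such as `PARENT_OF` would need a different rule, since two different children for one parent is not a disagreement.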

## Phase 8: Lore extension tools (2–3 days)

- [ ] `expand_context(entity, hops, relation_filter, min_confidence, limit)`
- [ ] `event_chain(event, depth)`
- [ ] `significance_of(entity)`
- [ ] `lore_about(entity, type?, limit)`
- [ ] `cite(claim)`

**Deliverable:** the LLM can answer "how are A and B connected?" and "what do the chronicles say about X?" in single calls.

## Phase 9: Generation tools (3–5 days)

- [ ] `summarize_chain(entity, depth, style)` — uses LiteLLM proxy.
- [ ] `narrate_arc(start_event, end_event, perspective?)` — composes multiple queries.
- [ ] World-builder tools (`add_entity`, `add_relation`, `add_lore_source`, etc.).

**Deliverable:** the LLM can produce grounded narrative text on demand.

## Phase 10: Reasoning harness + integration test (3–5 days)

- [ ] Write the system prompt from `07-reasoning-harness.md`.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.

**Deliverable:** the LLM, with the system prompt and the tool surface, can answer 80%+ of the test questions correctly. The remaining failures (at most 20%) are documented as known limitations.

## Phase 11: Polish (open-ended)

- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-world queries: the engine is per-world, but a future version supports multiple.

---

## Total scope estimate

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer (Cognee extension) | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **MVP (Phases 0–3)** | **16 days** | **end of phase 3** |

**The MVP is end of phase 3.** That's: validated Cognee substrate, the typed Lore Engine ontology, the time model + UDFs, and the 45 MCP tools exposed through Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions. The consistency engine and the TypeTemplate polymorphic extension are the v1.1 follow-ups; per the v1.1 plan below, they land in Phases 4 and 5 of the unified roadmap.

## What to cut if you're under time pressure

In strict order:

1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The **non-cuttable core is Phases 0–3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

## What NOT to do

- **Do not skip the Cognee spike.** The substrate decision is the highest-leverage call in the project. If Cognee can't represent the typed ontology or the time model, the spike fails fast and the v1 needs a different foundation. Those 1–2 days are the cheapest insurance available.
- **Do not skip the UDF unit tests.** Every time-aware query depends on `time_in_window` and `time_windows_overlap`. If they have a bug, every consistency check is wrong. Test first, then trust.
- **Do not over-invest in prose extraction.** Cognee handles the prose path; the structured YAML path is what the Lore Engine adds on top. Structured is exact, prose is fuzzy; high-stakes data goes through structured paths.
- **Do not try to support 100% of question types in the first build.** Ship the 5 patterns from `07-reasoning-harness.md` and iterate. The LLM is forgiving of missing tools if the existing ones are reliable.

---

## Phases 4–7: From MVP to production-ready

After the v1 MVP (Phases 0–3) is built and the time model works end-to-end, four phases add the consistency engine, the polymorphic extension model, the reasoning-harness validation, and the polish layer. This is a single ~17-day follow-up that lands everything in the v1.1/v1.2 design docs on top of the Cognee-based v1.

### Phase 4: Consistency engine (~6 days)

The 4-category rule system from `04-consistency.md`: Contradiction, Anachronism, Orphan, OntologyViolation. Implemented as a Cognee data-pipeline that runs on a schedule and on demand, materializing violation nodes in the same graph the LLM queries.

- [ ] Implement the 4 rule categories from `04-consistency.md` as a Cognee data-pipeline.
- [ ] Implement the 10 starter `:OntologyRule` nodes from `05-mcp-tools.md#starter-rules`.
- [ ] Expose the 10 Group 6 consistency tools: `get_contradictions`, `get_anachronisms`, `get_ontology_violations`, `get_orphans`, `flag_for_review`, `explain_violation`, `run_consistency_check`, `latest_run`, `add_ontology_rule`, `list_ontology_rules`.
- [ ] Schedule the consistency pipeline to run nightly (Cognee task scheduler).

**Verify:** ingest two sources that disagree on the same fact; confirm a `Contradiction` node is created; call `get_contradictions(subject=X)` and see it. Test anachronism detection against a known historical claim that has a Person participating in an Event outside their lifespan.
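
The anachronism test case described above (a Person participating in an Event outside their lifespan) can be modeled as a range check. The sketch below illustrates that check in plain Go under assumptions: years are integers within one era, both lifespan bounds are known, and people missing from the lifespan table are skipped because they are a matter for the Orphan category. `Lifespan`, `Participation`, `FindAnachronisms`, and the event names are hypothetical; the real rule would run inside the Cognee pipeline.

```go
package main

import "fmt"

// Lifespan holds birth and death years within one era (both assumed known).
type Lifespan struct {
	Born, Died int
}

// Participation records that a person took part in an event in a given year.
type Participation struct {
	Person string
	Event  string
	Year   int
}

// FindAnachronisms returns participations dated outside the person's lifespan.
// People with no recorded lifespan are skipped rather than flagged.
func FindAnachronisms(lifespans map[string]Lifespan, parts []Participation) []Participation {
	var out []Participation
	for _, p := range parts {
		life, known := lifespans[p.Person]
		if !known {
			continue
		}
		if p.Year < life.Born || p.Year > life.Died {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Years and event names are made up for illustration.
	lifespans := map[string]Lifespan{"Aldric": {Born: 310, Died: 352}}
	parts := []Participation{
		{Person: "Aldric", Event: "siege_a", Year: 345},
		{Person: "Aldric", Event: "siege_b", Year: 360},
	}
	for _, a := range FindAnachronisms(lifespans, parts) {
		fmt.Println(a.Person, "cannot have attended", a.Event, "in year", a.Year)
	}
}
```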

### Phase 5: TypeTemplate polymorphic extension (~7 days)

The big one. The `DomainEntity`, `Relation`, and `TypeTemplate` labels, the template-watcher service, and the dynamic tool generator. Per `11-extensibility.md` and `12-storage-strategy.md`, this is the v1.1 extension model that makes new domain types a YAML exercise.

- [ ] Register the `DomainEntity`, `Relation`, and `TypeTemplate` labels as a Cognee data-model extension.
- [ ] Build the template-watcher service: watches `./templates/`, validates YAML, registers templates.
- [ ] Build the template-registry: persists template specs alongside the Cognee storage layer.
- [ ] Implement the dynamic tool generator: a generic handler that runs queries generated from `TypeTemplate` specs.
- [ ] Add the `list_template_tools` MCP tool.
- [ ] Ship the four example templates from `14-examples.md` (thieves-guild mission, war campaign, black-market lot, NPC secret knowledge).
- [ ] Update the reasoning harness to mention template tools.

**Verify:** write `templates/thieves_guild/mission.yaml`, hit `POST /admin/templates/reload`, see 6 new tools in `tools/list`, ingest a mission, query it via `list_missions`, get a coherent answer. **No Go code change between "template added" and "tool available."**

### Phase 6: Reasoning harness + validation (~4 days)

The system prompt from `07-reasoning-harness.md`, the test harness, and the validation pass.

- [ ] Write the system prompt.
- [ ] Build a test harness: 50 worked questions, expected tool sequences, expected answer shape.
- [ ] Run a "red team" session: deliberately adversarial questions, edge cases, contradiction traps. Document the failure modes.
- [ ] Iterate on the system prompt and tool descriptions.
- [ ] Measure tool-selection accuracy across the 45-tool surface; collapse the long tail if the LLM is tool-confused.

**Verify:** the LLM, with the system prompt and the tool surface, answers 80%+ of the test questions correctly. The remaining failures (at most 20%) are documented as known limitations.
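
The 80% gate is simple arithmetic, but encoding it once avoids rounding disputes in CI. The following sketch shows one way to score a harness run and collect the failures that would become the documented known limitations. `Result`, `Score`, and `MeetsThreshold` are hypothetical names, and how a single answer is judged correct is outside this sketch.

```go
package main

import "fmt"

// Result is the judged outcome of one harness question.
type Result struct {
	Question string
	Correct  bool
}

// Score counts correct answers and lists the questions that failed.
func Score(results []Result) (passed int, failures []string) {
	for _, r := range results {
		if r.Correct {
			passed++
		} else {
			failures = append(failures, r.Question)
		}
	}
	return passed, failures
}

// MeetsThreshold reports whether passed/total reaches minPercent.
// Integer arithmetic avoids floating-point edge cases at exactly 80%.
func MeetsThreshold(passed, total, minPercent int) bool {
	return total > 0 && passed*100 >= minPercent*total
}

func main() {
	// A made-up run: 50 questions, the first 41 answered correctly.
	results := make([]Result, 0, 50)
	for i := 0; i < 50; i++ {
		results = append(results, Result{Question: fmt.Sprintf("q%02d", i+1), Correct: i < 41})
	}
	passed, failures := Score(results)
	fmt.Println(passed, len(failures), MeetsThreshold(passed, len(results), 80)) // 41 9 true
}
```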

### Phase 7: Polish (open-ended)

- [ ] UI for the consistency engine (browse contradictions, anachronisms, orphans).
- [ ] UI for world-builders (YAML editor with autocomplete, validation, preview).
- [ ] Export: render the world as a wiki, a book, a campaign primer.
- [ ] Versioning: graph snapshots, time-travel queries.
- [ ] Cross-setting queries: the engine is per-setting, but a future version supports multiple.

---

## Total scope: v1 + v1.1 on Cognee

| Phase | Days | Cumulative |
|---|---|---|
| 0 — Cognee spike | 2 | 2 |
| 1 — Lore Engine ontology on Cognee | 5 | 7 |
| 2 — Time model + UDFs | 4 | 11 |
| 3 — MCP tool layer | 5 | 16 |
| 4 — Consistency engine | 6 | 22 |
| 5 — TypeTemplate polymorphic extension | 7 | 29 |
| 6 — Reasoning harness + validation | 4 | 33 |
| 7 — Polish | — | — |
| **Total to v1+ext (Phases 0–6)** | **33 days** | **end of phase 6** |

**The MVP is end of phase 3 (16 days).** Schema, UDFs, 45 MCP tools, structured ingestion, lookup/entity_context/state_at, all on Cognee. World-builders can start writing YAML and the LLM can start answering time-bounded questions.

**The v1 + extensions is end of phase 6 (33 days).** Adds the consistency engine, the TypeTemplate polymorphic extension model, and the reasoning-harness validation. The full 18-doc design contract is implemented.

Compared to the original v1+v1.1 plan (43 days on GraphMCP-Example), the Cognee-based plan saves ~10 days by inheriting the storage abstraction, the extraction pipeline, the embedding store, and the agent-native API. The 17 days of v1.1 modularization work collapse into the 7-day Phase 5 (TypeTemplate) plus the 6-day Phase 4 (consistency engine), 13 days in total, because Cognee handles the gateway and decomposition story.

## What to cut from the full plan if you're under time pressure

In strict order:

1. Phase 7 — Polish. Trivially deferrable.
2. Phase 6 — Reasoning harness validation depth. Ship with 20 test questions instead of 50; iterate from observed failure modes.
3. Phase 5 — TypeTemplate polymorphic extension. The v1 ontology covers the macro structure; the polymorphic wrapper is a powerful v1.1 addition but not a v1 blocker.
4. Phase 4 — Consistency engine. Start with the 3 most valuable rule categories (Contradiction, Anachronism, Orphan) and skip OntologyViolation in the first build.

The **non-cuttable core is Phases 0–3 + Phase 4**. Substrate validation, typed ontology, time model, MCP tool layer, and the basic consistency engine. Everything else is enhancement.

## The recommended order: spike → MVP → validate → extensions

1. **Phase 0 — Cognee spike (2 days).** Stand up Cognee locally. Ingest a 10-document sample world. Run `cognee.recall("Who is Aldric?")`. **If the spike fails, the substrate decision is wrong and the v1 needs a different foundation.** The 2-day validation is the cheapest insurance available.
2. **Phases 1–3 — MVP (14 days; 16 cumulative including the spike).** Typed ontology, time model, MCP tool layer. The LLM can answer time-bounded questions against a hand-crafted world.
3. **Phase 4 — Consistency engine (6 days).** The engine flags its first real contradiction.
4. **Phase 5 — TypeTemplate polymorphic extension (7 days).** World-builders add new domain types as YAML. **This is the phase that unlocks the "arbitrary new concept" question. Ship it second among the post-MVP phases, directly after the consistency engine, because it's the highest-leverage single change after the v1 data layer.**
5. **Phase 6 — Reasoning harness + validation (4 days).** Measure: how often does the LLM answer correctly? how often does it surface contradictions? how often does it hallucinate? **This is the validation gate.** If the numbers aren't good, the design has a bug and the v2 should address it.

## Operational docs that ship with the engine

In parallel with the build, the operational story lives in:

- `docs/21-quickstart.md` — the 1-page guide for new world-builders. **Update as the smoke-test command changes.**
- `docs/18-eval-policy.md` — the threshold and cadence for the 50-question harness. Read by the CI that runs on every PR.
- `docs/22-cognee-boundary.md` — the contract that explains what the Lore Engine owns vs. Cognee. Read by anyone considering a substrate swap.
- `docs/19-retcon-policy.md` and `docs/20-multi-setting-policy.md` — domain-specific policies. Read by world-builders when they declare a retcon or a cross-setting reference.
- `docs/cognee-integration.md` — the recipe for the substrate-specific code (extraction prompt override, LiteLLM routing). Read by anyone debugging ingestion.
- `docs/prompts/` and `docs/models/` — the prompt and model registries. Updated whenever a prompt or model changes; the harness is the gate.