lore-engine/docs/18-substrate-merge-plan.md
hermes-agent 0846dacdd9 docs(18): substrate merge plan — 6-phase execution roadmap for the lore-engine + GraphMCP merge
Phase 0 inventory (gate) → Phase 1 substrate merge → Phase 2 ontology + time + planes → Phase 3 consistency engine → Phase 4 structured + dialogue ingestion → Phase 5 bot integration → Phase 6 connector template + first new source.

Each phase produces a PR with verify.sh per [[Verify Gate]]. Cost estimate: ~11.5h wall clock, ~$9.20 LLM at minimax-m3 rates. Comfortable within Ollama $100/mo + Gemini $20/day caps.

Open questions (deferred to the phase that needs them): bot language (Python decided 2026-06-26, mardonar-npcs uses Pydantic v2), first new source to template (TBD before phase 6), plane taxonomy for non-Mardonari settings (TBD for v1.3).

Companion to [[2026-06-26 Lore Engine GraphMCP Merge]].
2026-06-26 20:23:31 +00:00


18 — Substrate Merge Plan

The execution plan that turns the lore-engine design into a working merged runtime that absorbs the GraphMCP Example substrate and serves the Mardonar Specs bot. Companion to [[2026-06-26 Lore Engine GraphMCP Merge]].

What we're building

A single MCP runtime that:

  • Hosts the 14-node lore ontology + time model + v1.2 planes on top of GraphMCP's existing Person/Location/Faction/Event/Encounter graph
  • Preserves all 7 GraphMCP ingestion workers (Go, Redis Streams)
  • Adds 2 new workers (structured-ingestor, dialogue-processor)
  • Exposes a unified MCP surface: 8 inherited GraphMCP tools + 12 lore-engine POC plugins + 4 new v1.2 plane tools + the consistency generalizations
  • Lets mardonar-npcs (Discord bot) call query_as_npc + log_encounter and publish NPC dialogue to raw.dialogue

Repo layout after the merge

kaykayyali/lore-engine/                 # design docs (unchanged — current main=7d81a76)
  docs/
    00-overview.md ... 17-planes.md
    18-substrate-merge-plan.md          # THIS FILE
    19-mcp-server-contract.md          # the contract for the merged MCP surface
    20-verification.md                  # the verify-gate for the merged stack

kaykayyali/lore-engine-poc/             # runtime — gains workers + structured-ingest + dialogue-processor
  docker-compose.yml                    # neo4j + postgres + minio + redis + gateway + 7 workers + 2 new
  plugins/
    world.py, lineage.py, trade.py, images.py, embeddings.py, consistency.py   # existing 6
    nsc/                                # NEW: NPC scoping plugin (query_as_npc, log_encounter, witness_who)
    planes/                             # NEW: v1.2 plane tools
  workers/                              # NEW: Go services (port from GraphMCP-Example)
    discord-connector/                  # copied, env-adapted
    discord-filter/
    lore-watcher/
    ingestion-worker/
    entity-extractor/  + entity-extractor-2/
    lore-extractor/    + lore-extractor-2/
    encounter-processor/ + encounter-processor-2/
    structured-ingestor/                # NEW
    dialogue-processor/                 # NEW
  gateway/server.py                     # gains the new plugins
  neo4j/init.cypher                     # extended with ontology + time + planes
  postgres/init.sql                     # extended with witness + dialogue tables
  verify-merge.sh                       # NEW: exercises every plugin + every inherited tool
  tests/                                # gains E2E tests for the merge

kaykayyali/mardonar-specs/              # UNCHANGED — content only
  *.yaml                                # 9 encounter specs

kaykayyali/mardonar-npcs/               # NEW
  src/
    spec/loader.py                      # Pydantic v2 EncounterSpecSchema
    bot/main.py                         # Discord client
    bot/dm.py                           # LLM narrative driver
  specs/                                # build-time injected from mardonar-specs
  Dockerfile                            # has SPECS_GIT_URL build arg
  docker-compose.yml                    # bot + MCP client config
  docs/spec-authoring-guide.md          # the canonical guide (moved from README reference)
  tests/
    test_spec.py
    test_bot_encounter.py

Phased execution

The merge runs as a Phase 0 inventory gate followed by 6 execution phases (P1 to P6). Each phase is one kanban wave with the shape [dev-implementation-cards] → [tester-validation] → [integration-gate].

Phase 0 — Schema + worker inventory (gate)

Owner: dev + tester. Outputs: one PR with a docs/merge/00-inventory.md listing every worker, every tool, every stream, every Cypher query, every LLM call in GraphMCP-Example today — so the merge knows what must be preserved.

Why first: without the inventory, "preserved as-is" is a vibe, not a contract. The orchestrator auto-rejects any merge card that references something not in the inventory.
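The auto-reject rule amounts to a set-membership check. The sketch below is a minimal illustration, assuming the inventory is flattened into a set of names and each merge card lists the names it references. `find_unknown_references` and `review_card` are hypothetical helpers, not the orchestrator's actual API.

```python
from typing import Iterable, List, Set


def find_unknown_references(inventory: Set[str], card_refs: Iterable[str]) -> List[str]:
    """Return the references on a merge card that are absent from the inventory."""
    return sorted(ref for ref in set(card_refs) if ref not in inventory)


def review_card(inventory: Set[str], card_refs: Iterable[str]) -> str:
    """Reject a card if it mentions anything the inventory does not list."""
    unknown = find_unknown_references(inventory, card_refs)
    if unknown:
        return "rejected: not in inventory: " + ", ".join(unknown)
    return "accepted"


# Sample inventory entries for illustration only; the real list comes from
# docs/merge/00-inventory.md.
inventory = {"discord-connector", "lore-watcher", "raw.messages"}
print(review_card(inventory, ["lore-watcher", "raw.messages"]))
print(review_card(inventory, ["lore-watcher", "mystery-worker"]))
```

Keeping the check this mechanical is what turns "preserved as-is" into something a reviewer can verify.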

Phase 1 — Substrate merge

Owner: dev. Outputs: lore-engine-poc gains Redis, 7 GraphMCP workers ported in (with env var adaptations: NEO4J_URL, REDIS_URL, the 3 lore-engine-poc container names), and all 8 GraphMCP tools re-implemented as a single nsc (NPC scoping) Python plugin. The gateway gains a tools/list that shows the merged 24+ tools.

Verification: verify-merge.sh exercises every tool. Neo4j Browser shows the merged graph. The gateway at :8765/mcp advertises the full surface. docker compose ps shows all 11 services healthy.
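One way to check that the gateway advertises the full surface is to compare a tools/list response against the expected tool names. The sketch below assumes the standard MCP JSON-RPC shape for tools/list. The canned response and the `missing_tools` helper are illustrative, and a real check would also need to perform the MCP initialization handshake and send the request to the gateway's /mcp endpoint.

```python
import json
from typing import Dict, List, Set


def tools_list_request(request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body for the MCP tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def missing_tools(response: Dict, expected: Set[str]) -> List[str]:
    """Compare the tool names in a tools/list response against an expected set."""
    advertised = {tool["name"] for tool in response["result"]["tools"]}
    return sorted(expected - advertised)


# Canned response standing in for the gateway (assumption: no network call here).
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": "query_as_npc"}, {"name": "log_encounter"}]},
}
expected = {"query_as_npc", "log_encounter", "witness_who"}
print(missing_tools(sample_response, expected))
```

An empty list from `missing_tools` would mean every expected tool is advertised.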

Phase 2 — Ontology + time + planes

Owner: dev. Outputs: the 14-node ontology + time-bounded relations + v1.2 Setting/Plane nodes land on top of GraphMCP's existing nodes via the lore-engine's existing Cypher schema (neo4j/init.cypher is extended). Migration script: collapse the 2 Roland nodes into 1 with two LOCATED_IN edges, deprecate the v1.1 world_id strings.

Verification: existing 12-test test.sh still green + 4 new tests for the plane model. Neo4j Browser shows Setting/Plane nodes + the plane relations. entity_context(Roland) returns one node with 2 planes.
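The Roland collapse can be modelled in memory before it is written as Cypher. This is a sketch only: it assumes v1.1 nodes are duplicated per world and carry a world_id string, and it merges by name. The real migration script and its matching rule may differ, and the world_id values below are made up.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (node name, relation, plane)


def collapse_by_name(nodes: List[Dict[str, str]]) -> Tuple[List[Dict[str, str]], List[Edge]]:
    """Merge nodes that share a name; turn each world_id into a LOCATED_IN edge."""
    merged: Dict[str, Dict[str, str]] = {}
    edges: List[Edge] = []
    for node in nodes:
        name = node["name"]
        # The first node seen for a name becomes the surviving node,
        # and the world_id property is dropped from it.
        merged.setdefault(name, {"name": name})
        edge = (name, "LOCATED_IN", node["world_id"])
        if edge not in edges:
            edges.append(edge)
    return list(merged.values()), edges


# Two v1.1-style Roland nodes (hypothetical world_id values).
v11_nodes = [
    {"name": "Roland", "world_id": "plane-a"},
    {"name": "Roland", "world_id": "plane-b"},
]
nodes, edges = collapse_by_name(v11_nodes)
print(nodes)
print(edges)
```

The result mirrors the verification target: one Roland node with two planes attached.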

Phase 3 — Consistency engine

Owner: dev. Outputs: generalize get_contradictions → find_contradictions, add find_anachronisms + find_ontology_violations + find_orphans + find_plane_violations. The existing consistency.py plugin in lore-engine-poc gets these; they're already designed in 04-consistency.md.

Verification: every planted contradiction in the seed data surfaces. A new adversarial test injects 5 contradictions and verifies all 5 are caught.
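The adversarial test has a simple shape: start from clean data, plant a known number of conflicts, and assert that exactly those are reported. The sketch below uses a toy definition of a contradiction (one subject and predicate carrying two different values). That definition is an assumption for illustration; the real rules live in 04-consistency.md.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str, str]  # (subject, predicate, value)
Key = Tuple[str, str]  # (subject, predicate)


def find_contradictions(facts: List[Fact]) -> List[Key]:
    """Return (subject, predicate) pairs that carry more than one distinct value."""
    values: Dict[Key, Set[str]] = defaultdict(set)
    for subject, predicate, value in facts:
        values[(subject, predicate)].add(value)
    return sorted(key for key, seen in values.items() if len(seen) > 1)


def inject(facts: List[Fact], count: int) -> Tuple[List[Fact], List[Key]]:
    """Append `count` conflicting facts and return the keys that should be flagged."""
    poisoned = list(facts)
    planted: List[Key] = []
    for subject, predicate, value in facts[:count]:
        poisoned.append((subject, predicate, value + "-conflict"))
        planted.append((subject, predicate))
    return poisoned, sorted(planted)


# Clean seed with one value per key; entity names are placeholders.
seed = [(f"npc-{i}", "born_in", f"town-{i}") for i in range(8)]
poisoned, planted = inject(seed, 5)
caught = find_contradictions(poisoned)
print(len(caught), caught == planted)
```

Comparing the caught set against the planted set, rather than only counting, also catches false positives.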

Phase 4 — Structured + dialogue ingestion

Owner: dev. Outputs: 2 new Go workers (structured-ingestor, dialogue-processor) + the POST /ingest/structured + POST /ingest/dialogue HTTP endpoints on ingestion-worker. The lore-engine-poc gateway gains add_timeline_yaml, add_family_tree_yaml, add_dialogue MCP tools (forwarders to the HTTP endpoints, since the workers are Go).

Verification: curl -F file=@timeline.yaml -F source_type=timeline http://localhost:8080/ingest/structured → 202 + raw.structured stream entry. structured-ingestor consumes it within 1s. Cypher shows the new Date/Event/Edge nodes. find_anachronisms doesn't flag them.
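A gateway-side forwarder for add_timeline_yaml could look roughly like the following. It mirrors the curl call above (a `file` part plus a `source_type` field, with 202 as success). The function names, the injected `post` callable, and the placeholder YAML are assumptions made so the sketch runs without a live ingestion-worker.

```python
import uuid
from typing import Callable, Dict, Tuple

INGEST_URL = "http://localhost:8080/ingest/structured"


def encode_multipart(fields: Dict[str, str], filename: str, content: bytes) -> Tuple[bytes, str]:
    """Encode form fields plus one part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append((header + value + "\r\n").encode())
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        + "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def add_timeline_yaml(yaml_text: str, post: Callable[[str, bytes, str], int]) -> bool:
    """Forward a timeline document to the structured-ingest endpoint; True on 202."""
    body, content_type = encode_multipart(
        {"source_type": "timeline"}, "timeline.yaml", yaml_text.encode()
    )
    return post(INGEST_URL, body, content_type) == 202


def fake_post(url: str, body: bytes, content_type: str) -> int:
    """Stand-in for the HTTP call (assumption: always accepted)."""
    return 202


print(add_timeline_yaml("events: []\n", fake_post))
```

Injecting `post` keeps the forwarder testable; in the gateway it would wrap a real HTTP client call.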

Phase 5 — Bot integration

Owner: dev + tester. Outputs: kaykayyali/mardonar-npcs repo created with Pydantic v2 EncounterSpecSchema + Discord client + DM narrative driver. SPECS_GIT_URL=mardonar-specs baked into the Dockerfile. The bot calls the merged MCP server for query_as_npc + log_encounter and publishes NPC dialogue to raw.dialogue.

Verification: bot loads the-clock-maker.yaml, starts an encounter in Discord, narrates the opening scene, the player interacts, the bot calls log_encounter synchronously, the second query_as_npc returns the new encounter in the witness graph. Live test against the hp-grey-public deployment.
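The synchronous write followed by a read can be exercised against an in-memory stand-in. In this sketch `FakeMCP`, the retry count, the `ConnectionError` failure type, and the encounter fields are all assumptions; only the tool names query_as_npc and log_encounter come from the plan.

```python
import time
from typing import Callable, Dict, List


def log_with_retry(log_encounter: Callable[[Dict], None], encounter: Dict,
                   attempts: int = 3, delay_s: float = 0.0) -> int:
    """Call log_encounter synchronously, retrying on failure; return attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            log_encounter(encounter)
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise ValueError("attempts must be at least 1")


class FakeMCP:
    """In-memory stand-in for the merged MCP server; fails the first write."""

    def __init__(self) -> None:
        self.encounters: List[Dict] = []
        self.calls = 0

    def log_encounter(self, encounter: Dict) -> None:
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient failure")
        self.encounters.append(encounter)

    def query_as_npc(self, npc: str) -> List[Dict]:
        return [e for e in self.encounters if npc in e["witnesses"]]


mcp = FakeMCP()
used = log_with_retry(mcp.log_encounter, {"id": "enc-1", "witnesses": ["clock-maker"]})
print(used, mcp.query_as_npc("clock-maker"))
```

Because the write returns only after it succeeds, the next query_as_npc call sees the new encounter, which is the property the live test checks.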

Phase 6 — Connector template + first new source

Owner: dev. Outputs: workers/connector-template/ Go service that becomes the copy-paste starting point for any new *-connector. README explains the 5 producer shapes (connector / watcher / structured / LLM-extractor / event). Two worked examples: slack-connector and pdf-watcher (both real, both wire up to existing extractors).

Verification: cp -r connector-template slack-connector && sed -i 's/TEMPLATE/SLACK/' ... produces a working service that polls a Slack workspace and writes to raw.messages. New workers appear in docker compose ps healthy.
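The copy-and-substitute step can also be scripted. The sketch below is a rough Python analogue of the cp + sed pair, assuming the template marks its placeholders with the literal token TEMPLATE. Unlike `sed 's/TEMPLATE/SLACK/'`, it replaces every occurrence on a line, and the main.go content is invented for the demonstration.

```python
import shutil
import tempfile
from pathlib import Path


def instantiate_template(template_dir: Path, target_dir: Path, token: str, replacement: str) -> int:
    """Copy a template tree and replace `token` in every text file; return files changed."""
    shutil.copytree(template_dir, target_dir)
    changed = 0
    for path in target_dir.rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files
        if token in text:
            path.write_text(text.replace(token, replacement), encoding="utf-8")
            changed += 1
    return changed


# Build a throwaway template so the sketch runs anywhere (hypothetical file content).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    template = root / "connector-template"
    template.mkdir()
    (template / "main.go").write_text('const name = "TEMPLATE"\n', encoding="utf-8")
    count = instantiate_template(template, root / "slack-connector", "TEMPLATE", "SLACK")
    result = (root / "slack-connector" / "main.go").read_text(encoding="utf-8")

print(count, result.strip())
```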

Dependencies between phases

P0 (inventory)
 └── P1 (substrate merge)
      └── P2 (ontology + time + planes)
           └── P3 (consistency engine)
                └── P4 (structured + dialogue ingestion)
                     └── P5 (bot integration)
                          └── P6 (connector template + first new source)

P0 → P1 → P2 → P3 → P4 → P5 → P6 is a strict linear chain. P6 can fan out into multiple connector-* workers once the template ships.

Each phase produces a PR. Each PR ships with verify.sh per the Verify Gate pattern. Merging into main auto-promotes the next phase's task in kanban.
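The promotion rule is small enough to state as code. This is a sketch of the ordering logic only; `next_phase` and `can_start` are hypothetical names, and the kanban automation itself is not shown.

```python
from typing import Optional, Set

PHASES = ["P0", "P1", "P2", "P3", "P4", "P5", "P6"]


def next_phase(merged: str) -> Optional[str]:
    """Return the phase to promote after `merged` lands on main, or None after P6."""
    index = PHASES.index(merged)
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


def can_start(phase: str, merged_phases: Set[str]) -> bool:
    """A phase may start only when every earlier phase has merged."""
    return all(p in merged_phases for p in PHASES[: PHASES.index(phase)])


print(next_phase("P2"), next_phase("P6"), can_start("P3", {"P0", "P1"}))
```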

Risk register (excerpt; full version in merge ADR research file)

| Risk | Mitigation | Phase |
| --- | --- | --- |
| Lore-engine-poc plugins break when GraphMCP worker writes hit Neo4j | Phase 1 includes verify-merge.sh that exercises every plugin against merged stack | P1 |
| WITNESSED-edge semantics drift when plane model lands | Spell out: WITNESSED is Person↔Encounter, NOT Person↔Plane; orthogonal | P2 |
| Two-LLM arbitration (-2 replicas) writes conflicting nodes | Add source_lv property check in find_contradictions | P3 |
| world_id → plane migration corrupts existing Mardonar data | One-shot Cypher with rollback, run against v1.2 seed; 2 Roland nodes collapse to 1 with two LOCATED_IN | P2 |
| Bot log_encounter writes fail during active DM | Sync write is the contract; failure → bot retries; encounter graph is source of truth | P5 |

Open questions (deferred to the phase that needs them)

  • P5: confirm Python for mardonar-npcs (decision made 2026-06-26; Pydantic v2 replaces Zod)
  • P6: pick first real new source to template (Slack / RSS / Foundry CSV export / other)
  • P2: plane taxonomy for non-Mardonari settings (Eberron already has data; Dark Sun / Forgotten Realms / homebrew TBD)

Cost + time estimate

Per-phase rough budget on the existing worker fleet (dev/tester, minimax-m3):

| Phase | Worker turns | Wall clock | LLM cost (est.) |
| --- | --- | --- | --- |
| P0 inventory | 60 | 30 min | ~$0.40 |
| P1 substrate merge | 240 | 2 h | ~$1.60 |
| P2 ontology + planes | 240 | 2 h | ~$1.60 |
| P3 consistency engine | 180 | 1.5 h | ~$1.20 |
| P4 structured + dialogue | 240 | 2 h | ~$1.60 |
| P5 bot integration | 240 | 2 h | ~$1.60 |
| P6 connector template | 180 | 1.5 h | ~$1.20 |
| Total | 1380 turns | ~11.5 h | ~$9.20 |

Runs cleanly within the Ollama $100/mo + Gemini $20/day caps. Use minimax-m3 throughout (already the active model). Sub-agent routing per the LLM Cost Caps pattern.
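The totals can be re-derived from the per-phase rows. The check below copies the estimates into a list (costs in cents to avoid float rounding) and also confirms that every row implies the same rates, roughly 2 worker turns per minute and 150 turns per dollar. Those rates are inferred from the figures, not stated in the plan.

```python
# (phase, worker turns, wall clock in minutes, LLM cost in cents), copied from the estimate.
BUDGET = [
    ("P0", 60, 30, 40),
    ("P1", 240, 120, 160),
    ("P2", 240, 120, 160),
    ("P3", 180, 90, 120),
    ("P4", 240, 120, 160),
    ("P5", 240, 120, 160),
    ("P6", 180, 90, 120),
]

total_turns = sum(row[1] for row in BUDGET)
total_hours = sum(row[2] for row in BUDGET) / 60
total_dollars = sum(row[3] for row in BUDGET) / 100

# turns == 2 * minutes means 2 turns per minute;
# 2 * turns == 3 * cents means 1.5 turns per cent, i.e. 150 turns per dollar.
same_rate = all(
    turns == 2 * minutes and 2 * turns == 3 * cents
    for _, turns, minutes, cents in BUDGET
)

print(total_turns, total_hours, total_dollars, same_rate)
```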

How to start

  1. Architecture review gate — Kay reviews this plan + the merge ADR research file. (You are here.)
  2. Phase 0 dispatch — one kanban task to dev: produce the inventory doc. Worker is gated on approval, not on review of later phases.
  3. Phases 1-6 — kanban board lore-engine-merge (new), one wave per phase. Board watchdog cron at 1m cadence during active phases, 15m when dormant.

See also