lore-engine/docs/08-architecture.md
Hermes Agent 7d81a761f9 docs(v1.2): planes as first-class graph nodes (Setting, Plane, EXISTS_IN, REFLECTS, LAYER_OF, ADJACENT_TO, ACCESSIBLE_VIA)
Replaces the v1.1 'world_id is a string' model with a graph-of-planes. Driven by Kay's Q2 ('worlds are planes') and the v1.2 design review.

- 17-planes.md (NEW): the plane taxonomy, the four relation types, Cypher patterns, migration from world_id, open questions
- 01-ontology.md: Plane and Setting as first-class nodes; the 6 plane-relation edge types
- 02-time-model.md: plane-aware time (entity_planes_at_time as the 6th time-aware primitive)
- 08-architecture.md: data flow for plane questions (the 'can I get from X to Y' pattern)
- 11-extensibility.md: how to add custom planes and plane-relations without code
- 12-storage-strategy.md: planes are pure graph (no Postgres/Redis/Qdrant/S3 changes)
- 14-examples.md: example 5 — full Setting + planes + Roland + Asmodeus + LLM tool calls
- README.md: v1.2 entry + doc 17 in the table of contents

The POC rebuild (T10) is the next step: migrate the existing 4 world_id values to Setting/Plane nodes and update the plugin queries to use EXISTS_IN/LOCATED_IN.
2026-06-17 03:17:15 +00:00

08 — Architecture

The Lore Engine is a thin layer on top of the existing GraphMCP-Example stack. Same transport, same data substrate, same worker model — extended with new schema, new workers, new tools, and a consistency engine.

System diagram

                            ┌─────────────────────────────────────┐
                            │         World-Builder Authoring      │
                            │   markdown · YAML · dialogue JSON    │
                            └──────────────┬──────────────────────┘
                                           │
              ┌────────────────────────────┼────────────────────────────┐
              │                            │                            │
              ▼                            ▼                            ▼
      ┌───────────────┐         ┌────────────────────┐         ┌────────────────────┐
      │ prose path    │         │ structured path    │         │ dialogue path      │
      │ (LLM extract) │         │ (YAML parse)       │         │ (HTTP POST)        │
      └───────┬───────┘         └─────────┬──────────┘         └─────────┬──────────┘
              │                           │                              │
              ▼                           ▼                              ▼
       Redis Streams:             Redis Streams:                   Redis Streams:
       raw.lore                   raw.structured                   raw.dialogue
              │                           │                              │
              ▼                           ▼                              ▼
      ┌────────────────┐         ┌────────────────────┐         ┌────────────────────┐
      │ lore-extractor │         │ structured-ingestor│         │ dialogue-processor │
      │ (Go worker)    │         │ (Go worker)        │         │ (Go worker)        │
      │ LLM extract    │         │ YAML → Cypher      │         │ Cypher write       │
      │ fuzzy          │         │ exact              │         │ exact              │
      └────────┬───────┘         └─────────┬──────────┘         └─────────┬──────────┘
               │                           │                              │
               └───────────────────────────┼──────────────────────────────┘
                                           │
                                           ▼
                              ┌─────────────────────────┐
                              │       Neo4j Graph       │
                              │  ─────────────────────  │
                              │  Person, Faction,       │
                              │  Location, Era,         │
                              │  Lineage, MagicSystem,  │
                              │  Item, ...              │
                              │  ─────────────────────  │
                              │  + LoreChunk (vectors)  │
                              │  + Encounter            │
                              │  + Contradiction        │
                              │  + Anachronism          │
                              │  + Orphan               │
                              └────────────┬────────────┘
                                           │
                                           │  Cypher / Vector queries
                                           │
                              ┌────────────▼────────────┐
                              │      mcp-server (Go)    │
                              │      HTTP :9000         │
                              │      SSE :9000          │
                              │  ─────────────────────  │
                              │  8 inherited tools      │
                              │  22 new tools           │
                              └────────────┬────────────┘
                                           │  JSON-RPC over HTTP
                                           │
                              ┌────────────▼────────────┐
                              │   LLM Client            │
                              │   (Claude, gpt-4, etc.) │
                              │   with system prompt    │
                              │   from reasoning harness│
                              └─────────────────────────┘

   Background jobs:
   ┌─────────────────────────┐         ┌─────────────────────────┐
   │  consistency-runner     │         │  consistency-monitor    │
   │  (cron, 03:00 daily)    │         │  (HTTP /run-check)      │
   │  runs all rules         │         │  ad-hoc rule run        │
   │  materializes nodes     │         │  for live verification  │
   └─────────────────────────┘         └─────────────────────────┘

Data flow: a question

1. User → LLM Client
   "Did House Vyr rule Valdorn in 340 TA?"

2. LLM Client → LLM (with system prompt + active context)
   LLM picks tool: was_true_at(RULED, "House Vyr", "Valdorn", "3rd_age.year_340")

3. LLM Client → MCP Server (JSON-RPC POST /mcp)
   {"method": "tools/call", "params": {"name": "was_true_at", "arguments": {...}}}

4. MCP Server → Neo4j
   Cypher:
   MATCH (f:Faction {name: "House Vyr"})-[r:RULED]->(l:Location {name: "Valdorn"})
   WHERE time_in_window("3rd_age.year_340", r.valid_from, r.valid_until)
   RETURN r, r.valid_from, r.valid_until, sources

5. Neo4j → MCP Server
   [{valid_from: "3rd_age.year_312", valid_until: "3rd_age.year_360", sources: [...]}]

6. MCP Server → LLM Client (JSON-RPC response)
   {"was_true": true, "valid_from": "...", "valid_until": "...", "sources": [...]}

7. LLM Client → LLM
   Adds the result to its context.

8. LLM → User
   "Yes — House Vyr ruled Valdorn from 312 TA to 360 TA, which covers 340 TA.
    Sources: chronicles-vyr.md."

End-to-end latency target: <500ms for a single-tool call, <2s for a 3-tool chain.
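
Step 3 of the flow above can be sketched in Go. This is a minimal illustration of building the JSON-RPC 2.0 envelope for a tools/call request, not the engine's actual client code. The names BuildToolCall, rpcRequest, and callParams are hypothetical, and so are the argument keys (relation, subject, object, at), since the flow shows was_true_at only with positional arguments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 envelope for an MCP "tools/call" request.
type rpcRequest struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams names the tool and carries its arguments.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// BuildToolCall serializes one tool invocation into a request body.
func BuildToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
}

func main() {
	// Argument keys are assumptions made for this sketch.
	body, err := BuildToolCall(1, "was_true_at", map[string]any{
		"relation": "RULED",
		"subject":  "House Vyr",
		"object":   "Valdorn",
		"at":       "3rd_age.year_340",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The resulting bytes would be the body of the POST to /mcp. encoding/json writes map keys in sorted order, so the serialized arguments are stable across runs.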

Data flow: a plane question (v1.2)

The plane model (per 17-planes.md) is the new substrate for multi-setting, multi-plane worlds. A question that traverses planes looks like this:

1. User → LLM Client
   "Can Asmodeus reach the Material Plane from the Nine Hells? And what planes
    is the Roland of Mardonari connected to in 430 TA?"

2. LLM Client → LLM (with system prompt + active context)
   LLM picks tools: list_accessible_targets(plane='nine_hells'),
                     entity_planes_at_time(entity='roland_raventhorne', at='3rd_age.year_430')

3. LLM Client → MCP Server (JSON-RPC POST /mcp)
   Two tool calls, possibly chained.

4. MCP Server → Neo4j
   Cypher for accessibility:
     MATCH (start:Plane {id: 'mardonari.nine_hells'})-[:ACCESSIBLE_VIA|ADJACENT_TO*1..2]->(target:Plane)
     RETURN DISTINCT target.id, target.kind
   Cypher for time-bounded location:
     MATCH (p:Person {id: 'roland_raventhorne'})-[r:LOCATED_IN]->(plane:Plane)
     WHERE time_in_window('3rd_age.year_430', r.valid_from, r.valid_until)
     RETURN plane.id, plane.kind, plane.name

5. Neo4j → MCP Server
   Combined response: { reachable_planes: [...], rolands_planes_at_430: [...] }

6. MCP Server → LLM Client (JSON-RPC response)

7. LLM Client → LLM
   Adds both results to its context.

8. LLM → User
   "Yes — Plane Shift (a 7th-level spell) connects the Nine Hells to the Material.
    And in 430 TA, Roland was in the Mardonari Material Plane (pottery workshop) and
    had recently returned from a 2-year stint in Voldramir (a Mardonari demiplane)."

Plane relations (REFLECTS, LAYER_OF, ADJACENT_TO, ACCESSIBLE_VIA) make these questions a single Cypher traversal instead of a multi-step string-matching exercise.
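
To make the bounded traversal in the accessibility query concrete, here is an in-memory Go sketch of what a 1..2-hop pattern computes. It is an illustration of the semantics only: the real traversal runs inside Neo4j, the PlaneGraph type and ReachableWithin function are hypothetical, and every plane id other than mardonari.nine_hells is invented for the example. The sketch also ignores Cypher's rule that a relationship may not repeat within one path, which matters only for self-loops at this depth.

```go
package main

import (
	"fmt"
	"sort"
)

// PlaneGraph maps a plane id to the ids reachable over one outgoing
// ACCESSIBLE_VIA or ADJACENT_TO edge.
type PlaneGraph map[string][]string

// ReachableWithin returns the distinct planes reachable from start in
// 1..maxHops hops, sorted for stable output.
func ReachableWithin(g PlaneGraph, start string, maxHops int) []string {
	found := map[string]bool{}
	frontier := []string{start}
	for hop := 0; hop < maxHops && len(frontier) > 0; hop++ {
		var next []string
		queued := map[string]bool{} // avoids duplicates within one level
		for _, from := range frontier {
			for _, to := range g[from] {
				found[to] = true
				if !queued[to] {
					queued[to] = true
					next = append(next, to)
				}
			}
		}
		frontier = next
	}
	out := make([]string, 0, len(found))
	for id := range found {
		out = append(out, id)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical edges, for illustration only.
	g := PlaneGraph{
		"mardonari.nine_hells": {"mardonari.astral"},
		"mardonari.astral":     {"mardonari.material"},
		"mardonari.material":   {"mardonari.voldramir"},
	}
	// prints [mardonari.astral mardonari.material]
	fmt.Println(ReachableWithin(g, "mardonari.nine_hells", 2))
}
```

The third plane in the chain is three hops away and is therefore excluded, which is the practical effect of the upper bound in the pattern.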

Data flow: structured ingestion

1. World-builder writes timeline.yaml, family_tree.yaml, gazetteer.yaml.

2. World-builder → POST /ingest/structured (multipart)
   Files attached, source_type per file.

3. HTTP server (mcp-server or a thin gateway) → Redis Stream raw.structured
   Each file becomes a stream entry with the YAML body and source_type tag.

4. structured-ingestor worker consumes the entry.
   - Reads source_type, dispatches to the right parser.
   - YAML parser validates against the per-type schema.
   - Materializes Cypher (MERGE nodes, MERGE edges with time bounds).
   - Tags the LoreSource with source_type: <type>.

5. consistency-runner (live mode) gets the write event.
   - Runs the relevant anachronism + contradiction checks on the new edges.
   - Materializes any new :Contradiction / :Anachronism / :Orphan nodes.

6. Neo4j state is now consistent.
   The next MCP tool call sees the new data.
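
The dispatch in step 4 could look roughly like the following Go sketch. It assumes a registry keyed by the source_type tag; the Parser and Dispatcher types, the "timeline" tag value, and the stub Cypher string are all hypothetical, and a real parser would validate the YAML against its per-type schema before emitting anything.

```go
package main

import (
	"fmt"
)

// Parser turns one YAML body into Cypher statements.
type Parser func(body []byte) ([]string, error)

// Dispatcher routes a stream entry to a parser by its source_type tag.
type Dispatcher struct {
	parsers map[string]Parser
}

// NewDispatcher returns an empty registry.
func NewDispatcher() *Dispatcher {
	return &Dispatcher{parsers: map[string]Parser{}}
}

// Register binds a source_type to its parser.
func (d *Dispatcher) Register(sourceType string, p Parser) {
	d.parsers[sourceType] = p
}

// Dispatch runs the parser for sourceType, or fails if none is registered.
func (d *Dispatcher) Dispatch(sourceType string, body []byte) ([]string, error) {
	p, ok := d.parsers[sourceType]
	if !ok {
		return nil, fmt.Errorf("no parser registered for source_type %q", sourceType)
	}
	return p(body)
}

func main() {
	d := NewDispatcher()
	// Stub parser standing in for a real timeline parser.
	d.Register("timeline", func(body []byte) ([]string, error) {
		return []string{"MERGE (e:Era {slug: $slug})"}, nil
	})

	stmts, err := d.Dispatch("timeline", []byte("eras: []"))
	fmt.Println(stmts, err)

	_, err = d.Dispatch("recipes", nil)
	fmt.Println(err)
}
```

Rejecting unknown source_type values up front keeps a malformed upload from reaching the graph at all.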

Services layout (extends GraphMCP-Example)

services/
├── ingestion-worker/         [inherited]  markdown chunks + embeddings
├── lore-extractor/           [inherited]  LLM entity extraction
├── entity-extractor/         [inherited]  message entity extraction
├── encounter-processor/      [inherited]  encounter graph writes
├── lore-watcher/             [inherited]  ./lore-data/ watcher
├── discord-connector/        [inherited]  Discord → raw.messages
├── mcp-server/               [extended]   +22 new tools
│
├── structured-ingestor/      [NEW]        YAML → Cypher
│   ├── main.go                            # dispatcher
│   ├── parsers/timeline.go                # timeline.yaml
│   ├── parsers/family_tree.go             # family_tree.yaml
│   ├── parsers/gazetteer.go               # gazetteer.yaml
│   ├── parsers/bestiary.go                # bestiary.yaml
│   ├── parsers/magic_system.go            # magic_system.yaml
│   ├── parsers/culture.go                 # culture.yaml
│   └── parsers/validator.go               # strict YAML validation
│
├── dialogue-processor/       [NEW]        POST /ingest/dialogue → Cypher
│   └── main.go
│
├── consistency-runner/       [NEW]        nightly batch
│   ├── main.go                            # cron entry point
│   ├── rules/source_contradictions.go     # category A
│   ├── rules/anachronism.go               # category B
│   ├── rules/ontology.go                  # category C (declarative)
│   └── rules/orphans.go                   # category D
│
├── consistency-monitor/      [NEW]        HTTP /run-check
│   └── main.go                            # exposes run_consistency_check tool
│
└── era-tagger/               [NEW, optional] LLM-assisted era inference
    └── main.go                            # backfills temporal_hint on prose-extracted edges

The era-tagger is optional and used only during the bootstrap phase, when prose has been ingested without canonical era tags. It calls an LLM to backfill temporal_hint on prose-extracted edges whose temporal_hint still reads Era: unknown. Once the structured corpus has been ingested, this worker can be disabled.

Schema bootstrap (new Cypher)

The full schema lives in schema/init.cypher (to be generated during the build phase). Key additions to the GraphMCP-Example schema:

// New label constraints
CREATE CONSTRAINT era_slug IF NOT EXISTS FOR (e:Era) REQUIRE e.slug IS UNIQUE;
CREATE CONSTRAINT calendar_name IF NOT EXISTS FOR (c:Calendar) REQUIRE c.name IS UNIQUE;
CREATE CONSTRAINT date_slug IF NOT EXISTS FOR (d:Date) REQUIRE d.slug IS UNIQUE;
CREATE CONSTRAINT lineage_name IF NOT EXISTS FOR (l:Lineage) REQUIRE l.name IS UNIQUE;
CREATE CONSTRAINT culture_name IF NOT EXISTS FOR (c:Culture) REQUIRE c.name IS UNIQUE;
CREATE CONSTRAINT magic_system_name IF NOT EXISTS FOR (m:MagicSystem) REQUIRE m.name IS UNIQUE;
CREATE CONSTRAINT language_name IF NOT EXISTS FOR (l:Language) REQUIRE l.name IS UNIQUE;
CREATE CONSTRAINT deity_name IF NOT EXISTS FOR (d:Deity) REQUIRE d.name IS UNIQUE;
CREATE CONSTRAINT spell_name IF NOT EXISTS FOR (s:Spell) REQUIRE s.name IS UNIQUE;
CREATE CONSTRAINT title_name IF NOT EXISTS FOR (t:Title) REQUIRE (t.name, t.domain) IS UNIQUE;
CREATE CONSTRAINT region_name IF NOT EXISTS FOR (r:Region) REQUIRE r.name IS UNIQUE;
CREATE CONSTRAINT material_name IF NOT EXISTS FOR (m:Material) REQUIRE m.name IS UNIQUE;

// New violation label constraints
CREATE CONSTRAINT anachronism_id IF NOT EXISTS FOR (a:Anachronism) REQUIRE a.id IS UNIQUE;
CREATE CONSTRAINT ontology_violation_id IF NOT EXISTS FOR (o:OntologyViolation) REQUIRE o.id IS UNIQUE;
CREATE CONSTRAINT orphan_id IF NOT EXISTS FOR (o:Orphan) REQUIRE o.id IS UNIQUE;
CREATE CONSTRAINT ontology_rule_id IF NOT EXISTS FOR (r:OntologyRule) REQUIRE r.id IS UNIQUE;
CREATE CONSTRAINT consistency_run_id IF NOT EXISTS FOR (c:ConsistencyRun) REQUIRE c.id IS UNIQUE;

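// Composite time-window indexes for the most common query: "what was X like at T?"
// A Neo4j range index covers a single relationship type, so each type gets its own index.
CREATE INDEX ruled_time_window IF NOT EXISTS FOR ()-[r:RULED]-() ON (r.valid_from, r.valid_until);
CREATE INDEX controls_time_window IF NOT EXISTS FOR ()-[r:CONTROLS]-() ON (r.valid_from, r.valid_until);
CREATE INDEX located_in_time_window IF NOT EXISTS FOR ()-[r:LOCATED_IN]-() ON (r.valid_from, r.valid_until);
CREATE INDEX member_of_time_window IF NOT EXISTS FOR ()-[r:MEMBER_OF]-() ON (r.valid_from, r.valid_until);
CREATE INDEX participated_in_time_window IF NOT EXISTS FOR ()-[r:PARTICIPATED_IN]-() ON (r.valid_from, r.valid_until);
CREATE INDEX allied_with_time_window IF NOT EXISTS FOR ()-[r:ALLIED_WITH]-() ON (r.valid_from, r.valid_until);
CREATE INDEX enemy_of_time_window IF NOT EXISTS FOR ()-[r:ENEMY_OF]-() ON (r.valid_from, r.valid_until);
CREATE INDEX possesses_time_window IF NOT EXISTS FOR ()-[r:POSSESSES]-() ON (r.valid_from, r.valid_until);
CREATE INDEX spouse_of_time_window IF NOT EXISTS FOR ()-[r:SPOUSE_OF]-() ON (r.valid_from, r.valid_until);
CREATE INDEX parent_of_time_window IF NOT EXISTS FOR ()-[r:PARENT_OF]-() ON (r.valid_from, r.valid_until);
CREATE INDEX worships_time_window IF NOT EXISTS FOR ()-[r:WORSHIPS]-() ON (r.valid_from, r.valid_until);
CREATE INDEX practices_time_window IF NOT EXISTS FOR ()-[r:PRACTICES]-() ON (r.valid_from, r.valid_until);
CREATE INDEX speaks_time_window IF NOT EXISTS FOR ()-[r:SPEAKS]-() ON (r.valid_from, r.valid_until);
CREATE INDEX belongs_to_time_window IF NOT EXISTS FOR ()-[r:BELONGS_TO]-() ON (r.valid_from, r.valid_until);
CREATE INDEX claims_title_time_window IF NOT EXISTS FOR ()-[r:CLAIMS_TITLE]-() ON (r.valid_from, r.valid_until);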

// Index for era-tree traversal
// (slug lookups are already served by the index backing the era_slug uniqueness constraint)
CREATE INDEX era_parent IF NOT EXISTS FOR (e:Era) ON (e.parent_era);

// Index for violation queries
CREATE INDEX anachronism_flagged IF NOT EXISTS FOR (a:Anachronism) ON (a.flagged);
CREATE INDEX ontology_violation_flagged IF NOT EXISTS FOR (o:OntologyViolation) ON (o.flagged);
CREATE INDEX orphan_flagged IF NOT EXISTS FOR (o:Orphan) ON (o.flagged);

User-defined functions (UDFs)

Two critical UDFs. They live in schema/udfs/:

time_in_window(t, valid_from, valid_until) → bool

The heart of the time model. See 02-time-model.md for the full specification.

// pseudocode, not runnable Cypher
RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352');
// → true

RETURN time_in_window('3rd_age.year_360', '3rd_age.year_340', '3rd_age.year_352');
// → false

Implementation notes:

  • Resolves current against the :Now config node.
  • Walks the era tree for parent-era membership.
  • Treats null as open-ended.
  • Implementation will live in a Neo4j user-defined function, written in Java or another JVM language (Neo4j does not run Python UDFs), packaged as a plugin JAR and registered at startup.
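
The window check itself can be sketched in Go to show the intended behaviour on the two examples above. This is a deliberately reduced model, not the specification in 02-time-model.md: it handles only slugs of the form <era>.year_<n>, treats both bounds as inclusive (an assumption), refuses cross-era comparisons instead of walking the era tree, and does not resolve current. The names TimeInWindow, parseInstant, and boundYear are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// instant is a parsed "<era>.year_<n>" slug.
type instant struct {
	era  string
	year int
}

// parseInstant splits a slug into its era and year parts.
func parseInstant(s string) (instant, error) {
	era, yr, ok := strings.Cut(s, ".year_")
	if !ok || era == "" {
		return instant{}, fmt.Errorf("unsupported time slug %q", s)
	}
	n, err := strconv.Atoi(yr)
	if err != nil {
		return instant{}, fmt.Errorf("bad year in %q: %w", s, err)
	}
	return instant{era: era, year: n}, nil
}

// boundYear parses a bound in the same era as t; ok is false for a nil bound.
func boundYear(bound *string, t instant) (year int, ok bool, err error) {
	if bound == nil {
		return 0, false, nil
	}
	b, err := parseInstant(*bound)
	if err != nil {
		return 0, false, err
	}
	if b.era != t.era {
		return 0, false, fmt.Errorf("cross-era comparison %q vs %q is not modeled", b.era, t.era)
	}
	return b.year, true, nil
}

// TimeInWindow reports whether t falls inside [from, until].
// A nil bound is open-ended; bounds are inclusive (assumption).
func TimeInWindow(t string, from, until *string) (bool, error) {
	ti, err := parseInstant(t)
	if err != nil {
		return false, err
	}
	lo, hasLo, err := boundYear(from, ti)
	if err != nil {
		return false, err
	}
	hi, hasHi, err := boundYear(until, ti)
	if err != nil {
		return false, err
	}
	if hasLo && ti.year < lo {
		return false, nil
	}
	if hasHi && ti.year > hi {
		return false, nil
	}
	return true, nil
}

func main() {
	from, until := "3rd_age.year_340", "3rd_age.year_352"
	fmt.Println(TimeInWindow("3rd_age.year_345", &from, &until)) // true <nil>
	fmt.Println(TimeInWindow("3rd_age.year_360", &from, &until)) // false <nil>
	fmt.Println(TimeInWindow("3rd_age.year_360", &from, nil))    // true <nil>
}
```

Returning an error for cross-era input, rather than guessing, keeps the sketch honest about the part of the model it leaves out.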

time_windows_overlap(from_a, until_a, from_b, until_b) → bool

For contradiction detection. Two windows overlap unless one ends before the other starts.

RETURN time_windows_overlap('3rd_age.year_340', '3rd_age.year_360',
                            '3rd_age.year_345', '3rd_age.year_370');
// → true
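
The overlap rule stated above ("unless one ends before the other starts") reduces to two comparisons. The Go sketch below works on plain integer years within a single era and treats a nil bound as open-ended; the Window type and the WindowsOverlap, endsBefore, and year helpers are hypothetical names, and era-slug parsing is left out on purpose.

```go
package main

import "fmt"

// Window is a validity interval in years within a single era.
// A nil bound is open-ended.
type Window struct {
	From, Until *int
}

// endsBefore reports whether a ends strictly before b starts.
// An open end or an open start can never satisfy this.
func endsBefore(a, b Window) bool {
	return a.Until != nil && b.From != nil && *a.Until < *b.From
}

// WindowsOverlap is true unless one window ends before the other starts.
func WindowsOverlap(a, b Window) bool {
	return !endsBefore(a, b) && !endsBefore(b, a)
}

// year is a small helper for building pointer bounds.
func year(n int) *int { return &n }

func main() {
	a := Window{From: year(340), Until: year(360)}
	b := Window{From: year(345), Until: year(370)}
	c := Window{From: year(361)} // open-ended

	fmt.Println(WindowsOverlap(a, b)) // true
	fmt.Println(WindowsOverlap(a, c)) // false
	fmt.Println(WindowsOverlap(b, c)) // true
}
```

Under this reading, two windows that share a single boundary year count as overlapping, because neither ends strictly before the other starts.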

The time_in_window and time_windows_overlap UDFs are the only ones the engine needs. Everything else is regular Cypher.

Hosting & deployment

The Lore Engine adds no new infrastructure dependencies on top of GraphMCP-Example:

  • Neo4j 5.x — same instance, more schema. APOC plugin required for apoc.create.addLabels and apoc.merge.relationship. Already enabled in GraphMCP-Example.
  • Redis 7 — same instance, two new streams (raw.structured, raw.dialogue).
  • LiteLLM proxy at localhost:4000 — used by the lore-extractor worker. New LLM calls go through the same proxy. The era-tagger worker is the only new LLM consumer.
  • Go MCP server — same binary, new tool registrations.

The deployment story is: add the new workers to docker-compose.yml, add the new schema migration, restart, ingest. No new infrastructure.

Resource budget (delta from GraphMCP-Example)

| Component | Memory | Notes |
| --- | --- | --- |
| structured-ingestor | ~50MB | YAML parsing, no LLM. Light. |
| consistency-runner (nightly) | ~200MB | Cypher eval, short-lived. |
| consistency-monitor | ~30MB | HTTP, stateless. |
| era-tagger (bootstrap only) | ~200MB | LLM calls, only during initial backfill. |
| Neo4j (additional indexes) | ~+200MB | New indexes are small. |
| Total delta | ~700MB | Negligible on the 58GB host. |

The Lore Engine is cheaper to run than the existing stack it extends.

What is intentionally not in this architecture

  • No real-time updates. The consistency engine runs on a schedule and on demand. Streaming consistency (per-write checks) is for a future v2.
  • No LLM in the read path. Tool calls are pure Cypher. Only summarize_chain and narrate_arc call an LLM, and only on the generation side.
  • No external world-data sources. Wikipedia imports, fantasy-name generators, etc. are explicitly out. The engine reasons about the world as defined by its sources, not the world at large.
  • No user authentication. The MCP server is internal. The reasoning harness runs in a trusted context.
  • No graph versioning. Edits overwrite. If you need history, that's a v2 feature (and a real cost in storage and Cypher complexity).

The architecture is a deliberate floor. The features we don't have are features we can add when we have evidence they're needed, not features we can remove once added.