Hermes Agent 7d81a761f9 docs(v1.2): planes as first-class graph nodes (Setting, Plane, EXISTS_IN, REFLECTS, LAYER_OF, ADJACENT_TO, ACCESSIBLE_VIA)

Replaces the v1.1 'world_id is a string' model with a graph-of-planes. Driven by Kay's Q2 ('worlds are planes') and the v1.2 design review.

- 17-planes.md (NEW): the plane taxonomy, the four relation types, Cypher patterns, migration from world_id, open questions
- 01-ontology.md: Plane and Setting as first-class nodes; the 6 plane-relation edge types
- 02-time-model.md: plane-aware time (entity_planes_at_time as the 6th time-aware primitive)
- 08-architecture.md: data flow for plane questions (the 'can I get from X to Y' pattern)
- 11-extensibility.md: how to add custom planes and plane-relations without code
- 12-storage-strategy.md: planes are pure graph (no Postgres/Redis/Qdrant/S3 changes)
- 14-examples.md: example 5 — full Setting + planes + Roland + Asmodeus + LLM tool calls
- README.md: v1.2 entry + doc 17 in the table of contents

The POC rebuild (T10) is the next step: migrate the existing 4 world_id values to Setting/Plane nodes and update the plugin queries to use EXISTS_IN/LOCATED_IN.

2026-06-17 03:17:15 +00:00

08 — Architecture

The Lore Engine is a thin layer on top of the existing GraphMCP-Example stack. Same transport, same data substrate, same worker model — extended with new schema, new workers, new tools, and a consistency engine.

System diagram

                            ┌─────────────────────────────────────┐
                            │         World-Builder Authoring      │
                            │   markdown · YAML · dialogue JSON    │
                            └──────────────┬──────────────────────┘
                                           │
              ┌────────────────────────────┼────────────────────────────┐
              │                            │                            │
              ▼                            ▼                            ▼
      ┌───────────────┐         ┌────────────────────┐         ┌────────────────────┐
      │ prose path    │         │ structured path    │         │ dialogue path      │
      │ (LLM extract) │         │ (YAML parse)       │         │ (HTTP POST)        │
      └───────┬───────┘         └─────────┬──────────┘         └─────────┬──────────┘
              │                           │                              │
              ▼                           ▼                              ▼
       Redis Streams:             Redis Streams:                   Redis Streams:
       raw.lore                   raw.structured                   raw.dialogue
              │                           │                              │
              ▼                           ▼                              ▼
      ┌────────────────┐         ┌────────────────────┐         ┌────────────────────┐
      │ lore-extractor │         │ structured-ingestor│         │ dialogue-processor │
      │ (Go worker)    │         │ (Go worker)        │         │ (Go worker)        │
      │ LLM extract    │         │ YAML → Cypher      │         │ Cypher write       │
      │ fuzzy          │         │ exact              │         │ exact              │
      └────────┬───────┘         └─────────┬──────────┘         └─────────┬──────────┘
               │                           │                              │
               └───────────────────────────┼──────────────────────────────┘
                                           │
                                           ▼
                              ┌─────────────────────────┐
                              │       Neo4j Graph       │
                              │  ─────────────────────  │
                              │  Person, Faction,       │
                              │  Location, Era,         │
                              │  Lineage, MagicSystem,  │
                              │  Item, ...              │
                              │  ─────────────────────  │
                              │  + LoreChunk (vectors)  │
                              │  + Encounter            │
                              │  + Contradiction        │
                              │  + Anachronism          │
                              │  + Orphan               │
                              └────────────┬────────────┘
                                           │
                                           │  Cypher / Vector queries
                                           │
                              ┌────────────▼────────────┐
                              │      mcp-server (Go)    │
                              │      HTTP :9000         │
                              │      SSE :9000          │
                              │  ─────────────────────  │
                              │  8 inherited tools      │
                              │  22 new tools           │
                              └────────────┬────────────┘
                                           │  JSON-RPC over HTTP
                                           │
                              ┌────────────▼────────────┐
                              │   LLM Client            │
                              │   (Claude, gpt-4, etc.) │
                              │   with system prompt    │
                              │   from reasoning harness│
                              └─────────────────────────┘

   Background jobs:
   ┌─────────────────────────┐         ┌─────────────────────────┐
   │  consistency-runner     │         │  consistency-monitor    │
   │  (cron, 03:00 daily)    │         │  (HTTP /run-check)      │
   │  runs all rules         │         │  ad-hoc rule run        │
   │  materializes nodes     │         │  for live verification  │
   └─────────────────────────┘         └─────────────────────────┘

Data flow: a question

1. User → LLM Client
   "Did House Vyr rule Valdorn in 340 TA?"

2. LLM Client → LLM (with system prompt + active context)
   LLM picks tool: was_true_at(RULED, "House Vyr", "Valdorn", "3rd_age.year_340")

3. LLM Client → MCP Server (JSON-RPC POST /mcp)
   {"method": "tools/call", "params": {"name": "was_true_at", "arguments": {...}}}

4. MCP Server → Neo4j
   Cypher:
   MATCH (f:Faction {name: "House Vyr"})-[r:RULED]->(l:Location {name: "Valdorn"})
   WHERE time_in_window("3rd_age.year_340", r.valid_from, r.valid_until)
   RETURN r, r.valid_from, r.valid_until, sources

5. Neo4j → MCP Server
   [{valid_from: "3rd_age.year_312", valid_until: "3rd_age.year_360", sources: [...]}]

6. MCP Server → LLM Client (JSON-RPC response)
   {"was_true": true, "valid_from": "...", "valid_until": "...", "sources": [...]}

7. LLM Client → LLM
   Adds the result to its context.

8. LLM → User
   "Yes — House Vyr ruled Valdorn from 312 TA to 360 TA, which covers 340 TA.
    Sources: chronicles-vyr.md."

End-to-end latency target: <500ms for a single-tool call, <2s for a 3-tool chain.
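
Step 3 of the flow above can be sketched as follows. This is a minimal illustration, not the client's actual code: the argument names (`relation`, `subject`, `object`, `at`) are hypothetical, since the trace only shows positional values, and the envelope assumes plain JSON-RPC 2.0 with `result`/`error` members.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a caller-chosen id


def build_tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in a JSON-RPC 2.0 `tools/call` envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def parse_tool_result(response: dict) -> dict:
    """Return the result payload, raising if the server reported an error."""
    if "error" in response:
        raise RuntimeError(f"tool call failed: {response['error']}")
    return response["result"]


request = build_tool_call(
    "was_true_at",
    {
        # Hypothetical argument names; only the values come from the trace.
        "relation": "RULED",
        "subject": "House Vyr",
        "object": "Valdorn",
        "at": "3rd_age.year_340",
    },
)
body = json.dumps(request)  # the string that would be POSTed to /mcp
```

Keeping the envelope construction separate from the transport makes it easy to log or replay a tool call when chasing the latency targets above.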

Data flow: a plane question (v1.2)

The plane model (per 17-planes.md) is the new substrate for multi-setting, multi-plane worlds. A question that traverses planes looks like this:

1. User → LLM Client
   "Can Asmodeus reach the Material Plane from the Nine Hells? And what planes
    is Roland of Mardonari connected to in 430 TA?"

2. LLM Client → LLM (with system prompt + active context)
   LLM picks tools: list_accessible_targets(plane='nine_hells'),
                     entity_planes_at_time(entity='roland_raventhorne', at='3rd_age.year_430')

3. LLM Client → MCP Server (JSON-RPC POST /mcp)
   Two tool calls, possibly chained.

4. MCP Server → Neo4j
   Cypher for accessibility:
     MATCH (start:Plane {id: 'mardonari.nine_hells'})-[:ACCESSIBLE_VIA|ADJACENT_TO*1..2]->(target:Plane)
     RETURN DISTINCT target.id, target.kind
   Cypher for time-bounded location:
     MATCH (p:Person {id: 'roland_raventhorne'})-[r:LOCATED_IN]->(plane:Plane)
     WHERE time_in_window('3rd_age.year_430', r.valid_from, r.valid_until)
     RETURN plane.id, plane.kind, plane.name

5. Neo4j → MCP Server
   Combined response: { reachable_planes: [...], rolands_planes_at_430: [...] }

6. MCP Server → LLM Client (JSON-RPC response)

7. LLM Client → LLM
   Adds both results to its context.

8. LLM → User
   "Yes — Plane Shift (a 7th-level spell) connects the Nine Hells to the Material.
    And in 430 TA, Roland was in the Mardonari Material Plane (pottery workshop) and
    had recently returned from a 2-year stint in Voldramir (a Mardonari demiplane)."

Plane relations (REFLECTS, LAYER_OF, ADJACENT_TO, ACCESSIBLE_VIA) make these questions a single Cypher traversal instead of a multi-step string-matching exercise.
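
To make the accessibility traversal concrete, here is a small in-memory sketch of what the `[:ACCESSIBLE_VIA|ADJACENT_TO*1..2]` pattern computes: every plane within two directed hops over those two relation types. The plane ids other than `mardonari.nine_hells` are made up for illustration, and unlike the Cypher pattern this version never returns the starting plane itself (Cypher could, via a two-hop round trip).

```python
from collections import deque


def reachable_planes(edges, start, max_hops=2,
                     follow=("ACCESSIBLE_VIA", "ADJACENT_TO")):
    """Planes reachable from `start` in 1..max_hops directed hops."""
    adjacency = {}
    for source, relation, target in edges:
        if relation in follow:  # other plane relations are not traversed
            adjacency.setdefault(source, []).append(target)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        plane = queue.popleft()
        if hops[plane] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(plane, []):
            if neighbour not in hops:
                hops[neighbour] = hops[plane] + 1
                queue.append(neighbour)
    return sorted(p for p in hops if p != start)


# Hypothetical plane graph for illustration only.
edges = [
    ("mardonari.nine_hells", "ACCESSIBLE_VIA", "mardonari.material"),
    ("mardonari.material", "ADJACENT_TO", "mardonari.ethereal"),
    ("mardonari.ethereal", "ADJACENT_TO", "mardonari.astral"),
    ("mardonari.nine_hells", "REFLECTS", "mardonari.shadow"),
]
targets = reachable_planes(edges, "mardonari.nine_hells")
```

In this toy graph the astral plane sits three hops away and the shadow plane is linked only by `REFLECTS`, so neither appears in `targets`.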

Data flow: structured ingestion

1. World-builder writes timeline.yaml, family_tree.yaml, gazetteer.yaml.

2. World-builder → POST /ingest/structured (multipart)
   Files attached, source_type per file.

3. HTTP server (mcp-server or a thin gateway) → Redis Stream raw.structured
   Each file becomes a stream entry with the YAML body and source_type tag.

4. structured-ingestor worker consumes the entry.
   - Reads source_type, dispatches to the right parser.
   - YAML parser validates against the per-type schema.
   - Materializes Cypher (MERGE nodes, MERGE edges with time bounds).
   - Tags the LoreSource with source_type: <type>.

5. consistency-runner (live mode) gets the write event.
   - Runs the relevant anachronism + contradiction checks on the new edges.
   - Materializes any new :Contradiction / :Anachronism / :Orphan nodes.

6. Neo4j state is now consistent.
   The next MCP tool call sees the new data.
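
Step 4 (dispatch on source_type, validate, materialize Cypher) might look roughly like the sketch below. It is written in Python for brevity although the real worker is a Go service, it starts from an already-parsed entry rather than raw YAML, and the field names, the hard-coded Faction/Location labels, and the allowlist contents are all assumptions made for the example.

```python
ALLOWED_RELATIONS = {"RULED", "CONTROLS", "LOCATED_IN", "MEMBER_OF"}
REQUIRED_FIELDS = {"subject", "relation", "object"}


def timeline_entry_to_cypher(entry: dict, source_id: str):
    """Turn one parsed timeline entry into a (query, params) pair."""
    missing = REQUIRED_FIELDS - entry.keys()
    if missing:
        raise ValueError(f"timeline entry missing fields: {sorted(missing)}")
    relation = entry["relation"]
    # Cypher cannot take a relationship type as a parameter, so the type is
    # checked against an allowlist before being placed in the query text.
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relation type: {relation}")
    query = (
        "MERGE (s:Faction {name: $subject}) "
        "MERGE (o:Location {name: $object}) "
        f"MERGE (s)-[r:{relation}]->(o) "
        "SET r.valid_from = $valid_from, r.valid_until = $valid_until, "
        "r.source = $source_id"
    )
    params = {
        "subject": entry["subject"],
        "object": entry["object"],
        "valid_from": entry.get("valid_from"),    # None means open-ended
        "valid_until": entry.get("valid_until"),  # None means open-ended
        "source_id": source_id,
    }
    return query, params


# One parser per source_type; only the timeline parser is sketched here.
PARSERS = {"timeline": timeline_entry_to_cypher}


def dispatch(source_type: str, entry: dict, source_id: str):
    """Route an entry to the parser registered for its source_type."""
    if source_type not in PARSERS:
        raise ValueError(f"no parser for source_type: {source_type}")
    return PARSERS[source_type](entry, source_id)
```

All values travel as query parameters; only the allowlisted relationship type is interpolated, which keeps the exact (non-fuzzy) path safe against malformed input.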

Services layout (extends GraphMCP-Example)

services/
├── ingestion-worker/         [inherited]  markdown chunks + embeddings
├── lore-extractor/           [inherited]  LLM entity extraction
├── entity-extractor/         [inherited]  message entity extraction
├── encounter-processor/      [inherited]  encounter graph writes
├── lore-watcher/             [inherited]  ./lore-data/ watcher
├── discord-connector/        [inherited]  Discord → raw.messages
├── mcp-server/               [extended]   +22 new tools
│
├── structured-ingestor/      [NEW]        YAML → Cypher
│   ├── main.go                            # dispatcher
│   ├── parsers/timeline.go                # timeline.yaml
│   ├── parsers/family_tree.go             # family_tree.yaml
│   ├── parsers/gazetteer.go               # gazetteer.yaml
│   ├── parsers/bestiary.go                # bestiary.yaml
│   ├── parsers/magic_system.go            # magic_system.yaml
│   ├── parsers/culture.go                 # culture.yaml
│   └── parsers/validator.go               # strict YAML validation
│
├── dialogue-processor/       [NEW]        POST /ingest/dialogue → Cypher
│   └── main.go
│
├── consistency-runner/       [NEW]        nightly batch
│   ├── main.go                            # cron entry point
│   ├── rules/source_contradictions.go     # category A
│   ├── rules/anachronism.go               # category B
│   ├── rules/ontology.go                  # category C (declarative)
│   └── rules/orphans.go                   # category D
│
├── consistency-monitor/      [NEW]        HTTP /run-check
│   └── main.go                            # exposes run_consistency_check tool
│
└── era-tagger/               [NEW, optional] LLM-assisted era inference
    └── main.go                            # backfills temporal_hint on prose-extracted edges

The era-tagger is optional and only used during the bootstrap phase, when prose has been ingested without canonical era tags. It calls an LLM to backfill temporal_hint on events whose temporal_hint still reads Era: unknown. Once the structured corpus is in, this worker can be disabled.

Schema bootstrap (new Cypher)

The full schema lives in schema/init.cypher (to be generated during the build phase). Key additions to the GraphMCP-Example schema:

// New label constraints
CREATE CONSTRAINT era_slug IF NOT EXISTS FOR (e:Era) REQUIRE e.slug IS UNIQUE;
CREATE CONSTRAINT calendar_name IF NOT EXISTS FOR (c:Calendar) REQUIRE c.name IS UNIQUE;
CREATE CONSTRAINT date_slug IF NOT EXISTS FOR (d:Date) REQUIRE d.slug IS UNIQUE;
CREATE CONSTRAINT lineage_name IF NOT EXISTS FOR (l:Lineage) REQUIRE l.name IS UNIQUE;
CREATE CONSTRAINT culture_name IF NOT EXISTS FOR (c:Culture) REQUIRE c.name IS UNIQUE;
CREATE CONSTRAINT magic_system_name IF NOT EXISTS FOR (m:MagicSystem) REQUIRE m.name IS UNIQUE;
CREATE CONSTRAINT language_name IF NOT EXISTS FOR (l:Language) REQUIRE l.name IS UNIQUE;
CREATE CONSTRAINT deity_name IF NOT EXISTS FOR (d:Deity) REQUIRE d.name IS UNIQUE;
CREATE CONSTRAINT spell_name IF NOT EXISTS FOR (s:Spell) REQUIRE s.name IS UNIQUE;
CREATE CONSTRAINT title_name IF NOT EXISTS FOR (t:Title) REQUIRE (t.name, t.domain) IS UNIQUE;
CREATE CONSTRAINT region_name IF NOT EXISTS FOR (r:Region) REQUIRE r.name IS UNIQUE;
CREATE CONSTRAINT material_name IF NOT EXISTS FOR (m:Material) REQUIRE m.name IS UNIQUE;

// New violation label constraints
CREATE CONSTRAINT anachronism_id IF NOT EXISTS FOR (a:Anachronism) REQUIRE a.id IS UNIQUE;
CREATE CONSTRAINT ontology_violation_id IF NOT EXISTS FOR (o:OntologyViolation) REQUIRE o.id IS UNIQUE;
CREATE CONSTRAINT orphan_id IF NOT EXISTS FOR (o:Orphan) REQUIRE o.id IS UNIQUE;
CREATE CONSTRAINT ontology_rule_id IF NOT EXISTS FOR (r:OntologyRule) REQUIRE r.id IS UNIQUE;
CREATE CONSTRAINT consistency_run_id IF NOT EXISTS FOR (c:ConsistencyRun) REQUIRE c.id IS UNIQUE;

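// Composite indexes for the most common query: "what was X like at T?"
// A Neo4j range index covers a single relationship type, so the time-window
// index is declared once per type rather than with a |-separated list.
CREATE INDEX ruled_time_window IF NOT EXISTS
FOR ()-[r:RULED]-() ON (r.valid_from, r.valid_until);
CREATE INDEX located_in_time_window IF NOT EXISTS
FOR ()-[r:LOCATED_IN]-() ON (r.valid_from, r.valid_until);
// Repeat for CONTROLS, MEMBER_OF, PARTICIPATED_IN, ALLIED_WITH, ENEMY_OF,
// POSSESSES, SPOUSE_OF, PARENT_OF, WORSHIPS, PRACTICES, SPEAKS, BELONGS_TO,
// and CLAIMS_TITLE.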

// Index for era-tree traversal
CREATE INDEX era_parent IF NOT EXISTS FOR (e:Era) ON (e.parent_era);
// e.slug needs no separate index: the era_slug uniqueness constraint above already backs it with one.

// Index for violation queries
CREATE INDEX anachronism_flagged IF NOT EXISTS FOR (a:Anachronism) ON (a.flagged);
CREATE INDEX ontology_violation_flagged IF NOT EXISTS FOR (o:OntologyViolation) ON (o.flagged);
CREATE INDEX orphan_flagged IF NOT EXISTS FOR (o:Orphan) ON (o.flagged);

User-defined functions (UDFs)

Two critical UDFs. They live in schema/udfs/:

`time_in_window(t, valid_from, valid_until) → bool`

The heart of the time model. See 02-time-model.md for the full specification.

// pseudocode, not runnable Cypher
RETURN time_in_window('3rd_age.year_345', '3rd_age.year_340', '3rd_age.year_352');
// → true

RETURN time_in_window('3rd_age.year_360', '3rd_age.year_340', '3rd_age.year_352');
// → false

Implementation notes:

Resolves current against the :Now config node.
Walks the era tree for parent-era membership.
Treats null as open-ended.
Implementation will live in a Neo4j user-defined function. Neo4j loads these as JVM plugins, so it will be written in Java (or another JVM language) and registered at startup.
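
As a reference for the intended behaviour, here is a simplified Python model of the function. It is a sketch under stated assumptions, not the specification in 02-time-model.md: it assumes slugs of the form `<N>th_age.year_<Y>`, treats both bounds as inclusive, takes the value of the :Now node as an explicit `now` argument, and does not walk the era tree.

```python
import re

# Assumed slug shape, inferred from the examples in this document.
_SLUG = re.compile(r"^(\d+)(?:st|nd|rd|th)_age\.year_(\d+)$")


def parse_slug(slug: str) -> tuple:
    """Map '3rd_age.year_345' to the sortable pair (3, 345)."""
    match = _SLUG.match(slug)
    if match is None:
        raise ValueError(f"unrecognised time slug: {slug!r}")
    return int(match.group(1)), int(match.group(2))


def time_in_window(t, valid_from, valid_until, now=None) -> bool:
    """True if t falls inside [valid_from, valid_until]; None is open-ended."""

    def resolve(value):
        if value == "current":
            if now is None:
                raise ValueError("'current' needs a value for `now`")
            value = now  # stand-in for the :Now config node lookup
        return parse_slug(value)

    point = resolve(t)
    if valid_from is not None and point < resolve(valid_from):
        return False
    if valid_until is not None and point > resolve(valid_until):
        return False
    return True
```

With these assumptions the two examples above evaluate to true and false respectively, and an edge with a null valid_until stays valid for any later date.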

`time_windows_overlap(from_a, until_a, from_b, until_b) → bool`

For contradiction detection. Two windows overlap unless one ends before the other starts.

RETURN time_windows_overlap('3rd_age.year_340', '3rd_age.year_360',
                            '3rd_age.year_345', '3rd_age.year_370');
// → true

The time_in_window and time_windows_overlap UDFs are the only ones the engine needs. Everything else is regular Cypher.
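
The overlap rule stated above ("two windows overlap unless one ends before the other starts") can be modelled in a few lines. As with any sketch here, the slug format and the inclusive treatment of endpoints are assumptions; a null bound is treated as open-ended, in line with the implementation notes for time_in_window.

```python
import re

# Assumed slug shape, inferred from the examples in this document.
_SLUG = re.compile(r"^(\d+)(?:st|nd|rd|th)_age\.year_(\d+)$")


def to_key(slug):
    """Map a slug to a sortable (age, year) pair; None stays None."""
    if slug is None:
        return None
    match = _SLUG.match(slug)
    if match is None:
        raise ValueError(f"unrecognised time slug: {slug!r}")
    return int(match.group(1)), int(match.group(2))


def time_windows_overlap(from_a, until_a, from_b, until_b) -> bool:
    """True unless one window ends strictly before the other starts."""
    a_start, a_end = to_key(from_a), to_key(until_a)
    b_start, b_end = to_key(from_b), to_key(until_b)
    # An open end (None) can never precede anything, and an open start
    # can never be preceded, so those cases fall through to "overlap".
    a_ends_first = a_end is not None and b_start is not None and a_end < b_start
    b_ends_first = b_end is not None and a_start is not None and b_end < a_start
    return not (a_ends_first or b_ends_first)
```

Under the inclusive assumption, two windows that merely touch (one ends in the year the other starts) count as overlapping; a contradiction rule that wants handovers to be legal would need a strict comparison instead.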

Hosting & deployment

The Lore Engine adds no new infrastructure dependencies on top of GraphMCP-Example:

Neo4j 5.x — same instance, more schema. APOC plugin required for apoc.create.addLabels and apoc.merge.relationship. Already enabled in GraphMCP-Example.
Redis 7 — same instance, two new streams (raw.structured, raw.dialogue).
LiteLLM proxy at localhost:4000 — used by the lore-extractor worker. New LLM calls go through the same proxy. The era-tagger worker is the only new LLM consumer.
Go MCP server — same binary, new tool registrations.

The deployment story is: add the new workers to docker-compose.yml, add the new schema migration, restart, ingest. No new infrastructure.

Resource budget (delta from GraphMCP-Example)

Component	Memory	Notes
`structured-ingestor`	~50MB	YAML parsing, no LLM. Light.
`consistency-runner` (nightly)	~200MB	Cypher eval, short-lived.
`consistency-monitor`	~30MB	HTTP, stateless.
`era-tagger` (bootstrap only)	~200MB	LLM calls, only during initial backfill.
Neo4j (additional indexes)	~+200MB	New indexes are small.
Total delta	~700MB	Negligible on the 58GB host.
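
The total can be checked from the rows above. The figures are the table's own approximations; the only assumptions in this sketch are that 1GB is counted as 1024MB and that "steady state" means the two workers the table marks as short-lived or bootstrap-only are not running.

```python
# Approximate memory deltas from the table, in MB.
deltas_mb = {
    "structured-ingestor": 50,
    "consistency-runner": 200,   # nightly, short-lived
    "consistency-monitor": 30,
    "era-tagger": 200,           # bootstrap only
    "neo4j-indexes": 200,
}

total_mb = sum(deltas_mb.values())  # worst case: everything resident at once

# Footprint once the bootstrap backfill is done and outside the nightly run.
steady_state_mb = total_mb - deltas_mb["era-tagger"] - deltas_mb["consistency-runner"]

host_mb = 58 * 1024
share_of_host = total_mb / host_mb

print(f"total {total_mb}MB, steady state {steady_state_mb}MB, "
      f"{share_of_host:.1%} of host memory")
```

The rows sum to 680MB, which the table rounds up to ~700MB; either figure is a little over 1% of the host.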

The Lore Engine's additions cost less to run than the existing stack they extend.

What is intentionally not in this architecture

No real-time updates. The consistency engine runs on a schedule and on demand. Streaming consistency (per-write checks) is for a future v2.
No LLM in the read path. Tool calls are pure Cypher. Only summarize_chain and narrate_arc call an LLM, and only on the generation side.
No external world-data sources. Wikipedia imports, fantasy-name generators, etc. are explicitly out. The engine reasons about the world as defined by its sources, not the world at large.
No user authentication. The MCP server is internal. The reasoning harness runs in a trusted context.
No graph versioning. Edits overwrite. If you need history, that's a v2 feature (and a real cost in storage and Cypher complexity).

The architecture is a deliberate floor. Features left out can be added once there is evidence they are needed; features already shipped are much harder to take back out.