Replaces the v1.1 'world_id is a string' model with a graph-of-planes. Driven by Kay's Q2 ('worlds are planes') and the v1.2 design review.
- 17-planes.md (NEW): the plane taxonomy, the four relation types, Cypher patterns, migration from world_id, open questions
- 01-ontology.md: Plane and Setting as first-class nodes; the 6 plane-relation edge types
- 02-time-model.md: plane-aware time (entity_planes_at_time as the 6th time-aware primitive)
- 08-architecture.md: data flow for plane questions (the 'can I get from X to Y' pattern)
- 11-extensibility.md: how to add custom planes and plane-relations without code
- 12-storage-strategy.md: planes are pure graph (no Postgres/Redis/Qdrant/S3 changes)
- 14-examples.md: example 5 — full Setting + planes + Roland + Asmodeus + LLM tool calls
- README.md: v1.2 entry + doc 17 in the table of contents
The POC rebuild (T10) is the next step: migrate the existing 4 world_id values to Setting/Plane nodes and update the plugin queries to use EXISTS_IN/LOCATED_IN.
12 — Storage Strategy: Which Data Goes Where
The v1 design treats Neo4j as the universal substrate. For v1.1, with polymorphic domain entities, vector embeddings, time-series events, and high-volume operational logs (trade lots, mission outcomes, campaign movements), Neo4j alone is the wrong tool for the job. Different data has different access patterns, and forcing them all into one graph makes the graph bad at everything.
This document is the storage role split: which database stores which kind of data, why, and how the engine queries across them.
The principle: pick the right tool for the access pattern
Neo4j is excellent at: relationship traversal, graph pattern matching, recursive lineage, spatial aggregation. It is mediocre at: full-text search, large property blobs, high-volume time-series ingestion, free-form JSON querying.
If we force high-volume operational data (every trade, every mission step, every army movement) into Neo4j properties, the graph bloats, indexes fragment, and queries slow down. The right move is to store each kind of data where it naturally lives, and have the engine compose across stores.
The five stores
| Store | What it holds | Why |
|---|---|---|
| Neo4j | The world graph. People, factions, locations, lineage, era trees, type templates, domain entities with relationships, time-bounded edges, ontology rules, violation nodes. | Graph traversal is the primary access pattern. |
| PostgreSQL | Operational records with structured schemas. Trade logs, mission step logs, campaign event streams, audit trails, world-builder write history, session state, MCP tool call logs. | Relational, high-volume, time-series-friendly, transactional. |
| Qdrant (or pgvector) | Vector embeddings of LoreChunk, Message, and DomainEntity.summary text. | Semantic search is the primary access pattern. |
| Redis | Active MCP session state, per-session world_time, tool-call rate-limit counters, in-flight transaction state, ephemeral caches. | Sub-millisecond, ephemeral. |
| S3-compatible object store (MinIO) | Full text of lore sources, images, audio, large attachments, archival snapshots. | Blob storage, cheap, durable. |
Existing GraphMCP-Example has Neo4j + Redis + the LLM proxy. We add PostgreSQL (the big new one), Qdrant (or pgvector, for self-contained deployments), and MinIO (or any S3 bucket).
What goes in Neo4j
The macro world graph. Anything where the LLM will say "traverse from A" or "find all X related to Y" or "is X connected to Y?" — that lives in Neo4j.
- Core entities: Person, Faction, Location, Item, Era, Date, Lineage, Culture, Deity, Language, MagicSystem, Title, Region, Material.
- Planes (v1.2): Setting and Plane are first-class graph nodes. Plane relations (REFLECTS, LAYER_OF, ADJACENT_TO, ACCESSIBLE_VIA) are first-class edges. The EXISTS_IN and LOCATED_IN relations between entities and planes are stored here too. See 17-planes.md.
- Time-bounded relations between core entities: RULED, MEMBER_OF, LOCATED_IN, PARTICIPATED_IN, ALLIED_WITH, POSSESSES, etc. Always time-bounded. Always queryable via time_in_window.
- Polymorphic domain entities (:DomainEntity with a template_id): a thieves-guild Mission, a war Campaign, a Spellbook, a TradeLot, a Ritual. The entity itself and its relations to other entities (Person, Faction, Location, other DomainEntities) live in Neo4j.
- Type templates (:TypeTemplate): the YAML-defined schemas, stored as parsed JSON for the consistency engine and LLM to query.
- Violation nodes (:Contradiction, :Anachronism, :Orphan, :OntologyViolation, :ConsistencyRun): the consistency engine's output.
- Lore source metadata (:LoreSource): title, source_type, author, ingested_at, version. The text lives in object storage; the metadata is in Neo4j.
- Indexes: all property indexes from 01-ontology.md and 08-architecture.md. Plus (:DomainEntity).type, (:DomainEntity).world_id, (:Relation).type, (:Relation).valid_from/until.
What does NOT go in Neo4j:
- The full text of a lore source. (Goes in S3, with a pointer in Neo4j.)
- The full text of a domain entity's summary (above some length threshold). (Goes in S3; short summaries stay in Neo4j for semantic-search embedding.)
- The step-by-step log of a mission. (Goes in Postgres; only the aggregate outcome lives in Neo4j as the Mission node.)
- Vector embeddings. (Goes in Qdrant; Neo4j's vector index is OK but not great for high-volume semantic search.)
- High-volume time-series operational data.
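The summary rule in the list above can be sketched as a small routing function. This is a minimal sketch, not the engine's code: the 2000-character cutoff, the object-key layout, and the names place_summary and SummaryPlacement are all assumptions, since the design only says "above some length threshold".

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoff; the design does not fix a number.
SUMMARY_INLINE_MAX_CHARS = 2000


@dataclass
class SummaryPlacement:
    inline_text: Optional[str]  # kept on the Neo4j node when the summary is short
    object_key: Optional[str]   # pointer to the S3/MinIO object when it is long


def place_summary(entity_id: str, summary: str,
                  max_chars: int = SUMMARY_INLINE_MAX_CHARS) -> SummaryPlacement:
    """Decide whether a DomainEntity summary stays in the graph or goes to S3."""
    if len(summary) <= max_chars:
        return SummaryPlacement(inline_text=summary, object_key=None)
    # Hypothetical key layout; only this pointer would be stored on the node.
    return SummaryPlacement(inline_text=None,
                            object_key=f"summaries/{entity_id}.md")


short = place_summary("mission_42", "A quiet job at the docks.")
long_ = place_summary("mission_42", "x" * 5000)
```

Either way the graph node carries something small: the text itself or a pointer to it.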
What goes in PostgreSQL
Operational records that are append-mostly, high-volume, and not primarily about relationships.
The shape that Postgres handles well: rows of typed columns, indexed on time, with foreign keys back to Neo4j IDs.
Schema overview
-- World, version, and migration state
CREATE TABLE world (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
current_era TEXT NOT NULL, -- canonical time string
schema_version TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
-- Operational event log (every meaningful state change)
CREATE TABLE lore_event (
id BIGSERIAL PRIMARY KEY,
world_id TEXT REFERENCES world(id),
event_type TEXT NOT NULL, -- 'mission_logged', 'trade_completed', 'army_moved', ...
entity_id TEXT, -- DomainEntity.id from Neo4j
entity_type TEXT, -- discriminator
occurred_at TIMESTAMPTZ NOT NULL,
in_fiction_time TEXT, -- canonical time string
payload JSONB NOT NULL, -- type-specific structured data
sources TEXT[],
actor_id TEXT, -- who/what triggered this
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON lore_event (world_id, occurred_at DESC);
CREATE INDEX ON lore_event (entity_id, occurred_at DESC);
CREATE INDEX ON lore_event (event_type, occurred_at DESC);
CREATE INDEX ON lore_event USING GIN (payload);
-- Trade log (every lot, every transaction)
CREATE TABLE trade_log (
id BIGSERIAL PRIMARY KEY,
world_id TEXT REFERENCES world(id),
lot_id TEXT NOT NULL,
item_id TEXT, -- DomainEntity.id of the Item or Material
buyer_id TEXT, -- Person or Faction id
seller_id TEXT,
quantity NUMERIC,
unit TEXT, -- 'gp', 'soulglass_shards', etc.
unit_price NUMERIC,
total_price NUMERIC,
occurred_at TIMESTAMPTZ NOT NULL,
in_fiction_time TEXT,
location_id TEXT, -- Location.id
secrecy TEXT, -- 'public', 'faction_internal', ...
payload JSONB, -- type-specific extras
sources TEXT[]
);
CREATE INDEX ON trade_log (world_id, occurred_at DESC);
CREATE INDEX ON trade_log (lot_id);
CREATE INDEX ON trade_log (buyer_id, occurred_at DESC);
CREATE INDEX ON trade_log (seller_id, occurred_at DESC);
CREATE INDEX ON trade_log (item_id, occurred_at DESC);
CREATE INDEX ON trade_log (location_id, occurred_at DESC);
CREATE INDEX ON trade_log USING GIN (payload);
-- Mission step log (per-mission timeline of events)
CREATE TABLE mission_log (
id BIGSERIAL PRIMARY KEY,
mission_id TEXT NOT NULL, -- DomainEntity.id
step_no INT NOT NULL,
step_type TEXT, -- 'planned', 'briefed', 'infiltrated', 'completed', 'botched', 'paid'
occurred_at TIMESTAMPTZ NOT NULL,
in_fiction_time TEXT,
party TEXT[], -- Person ids present
location_id TEXT,
outcome TEXT,
notes TEXT,
sources TEXT[],
UNIQUE (mission_id, step_no)
);
-- War campaign movement log
CREATE TABLE campaign_event (
id BIGSERIAL PRIMARY KEY,
campaign_id TEXT NOT NULL, -- DomainEntity.id of the Campaign
event_type TEXT, -- 'army_moved', 'battle', 'siege_begun', 'siege_lifted', ...
occurred_at TIMESTAMPTZ NOT NULL,
in_fiction_time TEXT,
faction_id TEXT,
location_id TEXT,
army_size INT,
casualties INT,
outcome TEXT,
payload JSONB,
sources TEXT[]
);
CREATE INDEX ON campaign_event (campaign_id, occurred_at DESC);
CREATE INDEX ON campaign_event (faction_id, occurred_at DESC);
CREATE INDEX ON campaign_event (location_id, occurred_at DESC);
-- MCP tool call log (for the consistency monitor + audit)
CREATE TABLE tool_call (
id BIGSERIAL PRIMARY KEY,
session_id TEXT,
tool_name TEXT NOT NULL,
arguments JSONB,
result JSONB,
duration_ms INT,
error TEXT,
called_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON tool_call (tool_name, called_at DESC);
CREATE INDEX ON tool_call (session_id, called_at DESC);
-- Retcon history (Kay's Q4)
CREATE TABLE retcon (
id BIGSERIAL PRIMARY KEY,
world_id TEXT REFERENCES world(id),
target_kind TEXT, -- 'entity' | 'relation' | 'property'
target_id TEXT NOT NULL,
before JSONB, -- snapshot of what was there
after JSONB, -- what it was changed to
reason TEXT,
actor_id TEXT, -- world-builder id
retconned_at TIMESTAMPTZ DEFAULT now(),
sources TEXT[]
);
CREATE INDEX ON retcon (target_id, retconned_at DESC);
-- NPC dialogue history (for NPC knowledge scoping)
CREATE TABLE dialogue_log (
id BIGSERIAL PRIMARY KEY,
world_id TEXT REFERENCES world(id),
npc_id TEXT NOT NULL, -- Person.id
session_id TEXT,
message TEXT NOT NULL,
in_fiction_time TEXT,
occurred_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON dialogue_log (npc_id, occurred_at DESC);
These tables are the operational backbone. They're what gets high-volume writes, transactional integrity, and time-series queries.
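To make the time-series claim concrete, here is a minimal sketch of a "losses per year" query against a cut-down campaign_event table. SQLite from the Python standard library stands in for PostgreSQL so the example runs anywhere; the sample rows, the index name, and the function losses_per_year are made up for illustration. In Postgres the timestamp column would be TIMESTAMPTZ and the year bucket would more likely come from date_trunc.

```python
import sqlite3

# SQLite (stdlib) stands in for PostgreSQL; only a subset of the
# campaign_event columns from the schema above is reproduced.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE campaign_event (
        id          INTEGER PRIMARY KEY,
        campaign_id TEXT NOT NULL,   -- DomainEntity.id from Neo4j
        event_type  TEXT,
        occurred_at TEXT NOT NULL,   -- ISO-8601 text here; TIMESTAMPTZ in Postgres
        faction_id  TEXT,
        casualties  INTEGER,
        outcome     TEXT
    )
    """
)
conn.execute(
    "CREATE INDEX campaign_event_faction_time "
    "ON campaign_event (faction_id, occurred_at DESC)"
)

# Made-up rows for illustration.
conn.executemany(
    "INSERT INTO campaign_event "
    "(campaign_id, event_type, occurred_at, faction_id, casualties, outcome) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("camp_1", "battle", "2024-03-01T00:00:00Z", "house_vyr", 120, "loss"),
        ("camp_1", "battle", "2024-07-15T00:00:00Z", "house_vyr", 40, "win"),
        ("camp_2", "battle", "2025-01-10T00:00:00Z", "house_vyr", 300, "loss"),
    ],
)


def losses_per_year(db, faction_id):
    """Count lost battles and total casualties per year for one faction."""
    cur = db.execute(
        """
        SELECT substr(occurred_at, 1, 4) AS yr,
               COUNT(*)                  AS battles,
               SUM(casualties)           AS total_casualties
        FROM campaign_event
        WHERE faction_id = ?
          AND event_type = 'battle'
          AND outcome = 'loss'
        GROUP BY yr
        ORDER BY yr
        """,
        (faction_id,),
    )
    return cur.fetchall()


vyr_losses = losses_per_year(conn, "house_vyr")
```

The query never touches the graph: the faction id is just a text key that happens to match a Neo4j node.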
What goes in Qdrant (or pgvector)
Vector embeddings for semantic search. Three collections:
| Collection | Source | Dimension | Use |
|---|---|---|---|
| lore_chunks | (:LoreChunk).text | 768 | Semantic search over lore documents. |
| messages | (:Message).content | 768 | Semantic search over dialogue. |
| domain_summaries | (:DomainEntity).summary (if present) | 768 | Semantic search over domain entities. |
The first two inherit from GraphMCP-Example. The third is new and only populated for domain entities that opt in (the embedded: true field in the template).
Qdrant vs pgvector: Qdrant is faster and has better filtering. pgvector is simpler (one less service to run) and stays in the same DB family. For self-hosted homelab deployments where minimizing moving parts matters, pgvector is the right call for v1.1. We can swap to Qdrant later without changing the engine's API.
The lore-engine-Example already uses Neo4j's vector index. We can keep that for lore_chunks and add pgvector for domain_summaries, or migrate everything to pgvector. Decision: pgvector for everything in v1.1. Neo4j's vector index is fine but pgvector keeps our vector storage in one place.
What goes in Redis
Ephemeral state. Lost on restart, not backed up.
- Active session context (replaces the in-process sessionRegistry in the existing MCP server). Each MCP session gets a Redis key with the active entity context, world_time override, and tool-call budget.
- Tool-call rate limits (per session, per IP).
- Embedding cache (frequently-searched queries → cached embedding).
- Pub/sub for hot-reload notifications ("new template registered").
- In-flight transactions (for multi-step writes that need atomicity across Neo4j + Postgres).
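The per-session rate limit in the list above is commonly built as a fixed-window counter on top of Redis INCR and EXPIRE. The sketch below assumes that approach; the key layout, the default limit, and the function allow_tool_call are hypothetical. It takes any client with incr and expire methods (redis-py's client has both), and uses a tiny in-memory stand-in so it runs without a Redis server.

```python
import time


class InMemoryStore:
    """Stand-in for a Redis client; records TTLs but never expires keys."""

    def __init__(self):
        self.counts = {}
        self.ttls = {}

    def incr(self, name):
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name]

    def expire(self, name, seconds):
        self.ttls[name] = seconds
        return True


def allow_tool_call(store, session_id, limit=30, window_s=60, now=None):
    """Fixed-window limiter: at most `limit` calls per session per window."""
    now = time.time() if now is None else now
    window = int(now // window_s)
    # Hypothetical key layout: one counter per session per window.
    key = f"ratelimit:{session_id}:{window}"
    count = store.incr(key)
    if count == 1:
        # First call in this window: let the counter clean itself up.
        store.expire(key, window_s)
    return count <= limit


store = InMemoryStore()
first_window = [allow_tool_call(store, "s1", limit=2, now=1000.0) for _ in range(3)]
next_window = allow_tool_call(store, "s1", limit=2, now=1060.0)
```

Because the counters expire on their own, losing them on a Redis restart only resets the current window, which fits the "ephemeral" role described here.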
What goes in S3 (MinIO)
- Full text of every LoreSource (the YAML/markdown the world-builder wrote).
- Full text of every long DomainEntity.summary.
- Attachments: images, audio, videos, large files referenced by lore.
- Snapshots: world state exports, consistency engine reports, retcon bundles.
MinIO is the self-hosted S3. Same protocol, no AWS dependency.
The cross-store query layer
The MCP tools the LLM uses compose across stores. The engine handles the cross-store joins; the LLM sees a unified response.
Example: "What was the Crimson Hand's biggest heist in Mardsville last year?"
LLM → tool: list_missions(filter_by={faction: "crimson_hand", location: "mardsville", since: "1_year_ago"}, sort_by="payout_gp", limit=5)
Engine:
  1. Neo4j: MATCH (m:DomainEntity {type: "ThievesGuildMission"})
       -[:TARGETS|F2]-> (loc:Location {name: "Mardsville"})
       -[:LOGGED_IN]-> (lot:DomainEntity {type: "TradeLogEntry"})
       -[:GIVEN_BY]-> (npc:Person {faction: "Crimson Hand"})
     RETURN m, lot, npc
  2. Postgres: SELECT * FROM mission_log WHERE mission_id IN (...) ORDER BY step_no
  3. Postgres: SELECT * FROM trade_log WHERE lot_id IN (...) AND occurred_at > ...
  4. Compose: return top 5 by payout_gp, with mission step timeline + trade details
LLM: gets a unified response. Doesn't know it crossed two stores in three queries.
Example: "What battles did the Vyrs lose?"
LLM → tool: list_campaign_events(filter_by={faction: "house_vyr", outcome: "loss"})
Engine:
  1. Neo4j: get the Campaign nodes tied to house_vyr
  2. Postgres: SELECT * FROM campaign_event
       WHERE campaign_id IN (...) AND outcome = 'loss'
       ORDER BY occurred_at DESC
  3. Compose: return list with Neo4j faction details + Postgres battle details
LLM: unified response.
The engine exposes composed tools like list_missions, list_campaign_events. The LLM calls one tool; the engine fans out across stores.
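A composed tool of this kind might look like the sketch below. It is an illustration under assumptions, not the engine's implementation: the two fetch callables stand in for the Neo4j and Postgres queries, payout_gp is assumed to be available on the mission record, and the stub data is invented.

```python
from typing import Callable, Dict, List


def list_missions(
    fetch_missions: Callable[[], List[dict]],
    fetch_steps: Callable[[List[str]], List[dict]],
    sort_by: str = "payout_gp",
    limit: int = 5,
) -> List[dict]:
    """Compose graph results with relational step logs into one response."""
    # 1. Graph store: candidate missions, already filtered by the caller.
    missions = fetch_missions()
    # 2. Rank and trim before touching the second store.
    top = sorted(missions, key=lambda m: m.get(sort_by, 0), reverse=True)[:limit]
    # 3. Relational store: step rows for just the surviving mission ids.
    steps_by_mission: Dict[str, List[dict]] = {}
    for row in fetch_steps([m["id"] for m in top]):
        steps_by_mission.setdefault(row["mission_id"], []).append(row)
    # 4. Join in memory on the mission id and order each timeline.
    return [
        {**m, "steps": sorted(steps_by_mission.get(m["id"], []),
                              key=lambda r: r["step_no"])}
        for m in top
    ]


# Stubs standing in for the Neo4j and Postgres calls.
def fake_graph():
    return [
        {"id": "m1", "name": "Vault job", "payout_gp": 900},
        {"id": "m2", "name": "Dock skim", "payout_gp": 150},
    ]


def fake_steps(ids):
    rows = [
        {"mission_id": "m1", "step_no": 2, "step_type": "completed"},
        {"mission_id": "m1", "step_no": 1, "step_type": "planned"},
        {"mission_id": "m2", "step_no": 1, "step_type": "planned"},
    ]
    return [r for r in rows if r["mission_id"] in ids]


response = list_missions(fake_graph, fake_steps, limit=1)
```

Passing the store access in as callables keeps the join logic testable without either database running.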
The cross-store consistency story
The consistency engine operates across stores. A :Contradiction node in Neo4j can reference a Postgres row. An OntologyRule in Neo4j can include Cypher that joins with a Postgres query via the APOC procedure apoc.load.jdbc, which requires the APOC plugin and a Postgres JDBC driver on the Neo4j server.
The rules that go cross-store:
- "A :DomainEntity of type TradeLot referenced in Neo4j must have a corresponding row in trade_log."
- "A mission marked status: 'completed' in Neo4j must have a step_type = 'completed' row in mission_log."
- "A campaign event's army_size in Postgres must be within 10% of the :DomainEntity aggregate of the participating factions' Person.count."
These rules are written in Cypher with apoc.load.jdbc calls. They run in the nightly batch.
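The logic of two of these rules is simple once both sides are in hand. The sketch below is not the Cypher plus apoc.load.jdbc form described above; it restates the same checks as plain Python functions over ids and counts already fetched from each store. The function names and sample values are hypothetical.

```python
from typing import Iterable, List


def missing_trade_rows(graph_lot_ids: Iterable[str],
                       logged_lot_ids: Iterable[str]) -> List[str]:
    """TradeLot ids present in the graph but absent from trade_log."""
    return sorted(set(graph_lot_ids) - set(logged_lot_ids))


def army_size_consistent(logged_size: int, graph_total: int,
                         tolerance: float = 0.10) -> bool:
    """True if the logged army_size is within `tolerance` of the graph total."""
    if graph_total == 0:
        return logged_size == 0
    return abs(logged_size - graph_total) <= tolerance * graph_total


orphans = missing_trade_rows(["lot_1", "lot_2", "lot_3"], ["lot_1", "lot_3"])
ok = army_size_consistent(1080, 1000)
too_big = army_size_consistent(1150, 1000)
```

Each non-empty result would presumably become a violation node in Neo4j that points back at the offending Postgres row.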
Why this is better than one big Neo4j
| Concern | Neo4j-only | Polyglot |
|---|---|---|
| High-volume writes (mission steps) | Bloats graph, slows down traversal | Postgres handles it cleanly |
| Time-series queries (battles over time) | Requires traversal every query | Postgres filters on the indexed occurred_at column and aggregates with GROUP BY |
| Full-text search over millions of words | Slow, requires external index | Postgres full-text search (tsvector with a GIN index) for keywords; pgvector or Qdrant for semantic matches |
| Vector search | OK, but coupled to graph | Dedicated vector store, decoupled |
| Blob storage (full lore text, attachments) | Don't do this in Neo4j | S3, cheap, durable |
| Sub-millisecond ephemeral state | Possible but ugly | Redis, designed for it |
| Graph traversal and pattern matching | Excellent | Still excellent (Neo4j) |
The graph stays the graph. Operational data lives where it belongs. The LLM gets unified responses via composed tools.
The cost: cross-store transactions
When the world-builder writes a new mission, we touch Neo4j (entity, relations), Postgres (mission_log row), and S3 (any attachments). These three writes are not atomic. A partial failure leaves the world in an inconsistent state.
Mitigation: the saga pattern.
saga: log_mission:
  step 1: Postgres INSERT INTO mission_log
  step 2: Neo4j MERGE (:DomainEntity) + relations
  step 3: S3 PUT attachments (if any)
  step 4: Neo4j MERGE (:ConsistencyRun {saga_id: ...}) SET status = 'committed'
  on failure at step 2: rollback step 1 (Postgres DELETE)
  on failure at step 3: mark mission as 'attachments_pending', retry later
  on failure at step 4: log to dead-letter queue, alert world-builder
Sagas take more code than a single transaction, but every partial failure has a defined outcome: a rollback, a retry, or an alert. The alternative, putting everything in Neo4j and hoping, is the trap.
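A saga runner for the flow above can be quite small. The following is a minimal sketch under assumptions: SagaStep, SagaResult, and run_saga are hypothetical names, the store calls are replaced by lambdas that append to a list, and the dead-letter handling for step 4 is left out. It covers the two behaviours described for steps 2 and 3: compensate completed steps in reverse order, or defer a non-critical step for retry.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]
    compensate: Optional[Callable[[], None]] = None
    # If True, a failure is recorded for retry instead of triggering rollback.
    defer_on_failure: bool = False


@dataclass
class SagaResult:
    status: str  # 'committed', 'pending', or 'rolled_back'
    deferred: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None


def run_saga(steps: List[SagaStep]) -> SagaResult:
    completed: List[SagaStep] = []
    deferred: List[str] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            if step.defer_on_failure:
                deferred.append(step.name)
                continue
            # Undo everything that already succeeded, newest first.
            for done in reversed(completed):
                if done.compensate is not None:
                    done.compensate()
            return SagaResult("rolled_back", deferred, step.name)
    return SagaResult("pending" if deferred else "committed", deferred)


# Demo: the Neo4j step fails, so the Postgres insert is compensated.
log: List[str] = []


def fail_merge():
    raise RuntimeError("Neo4j unavailable")


result = run_saga([
    SagaStep("postgres_insert", lambda: log.append("insert"),
             lambda: log.append("delete")),
    SagaStep("neo4j_merge", fail_merge),
    SagaStep("s3_put", lambda: log.append("put"), defer_on_failure=True),
])
```

Compensations run in reverse so that later writes are undone before the rows they depend on.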
What this is not
- Not a microservices overhaul. The 5 stores run in 1 docker-compose stack. The engine still looks like one system to the LLM.
- Not eventual-consistency-everywhere. Most operations are single-store. The saga is for the multi-store cases.
- Not a "use Postgres for everything" anti-pattern. We use Postgres for what it's good at, Neo4j for what it's good at, and the cross-store compose layer for the rest.
- Not free. Postgres and MinIO are two more services, three if Qdrant is used instead of pgvector. On the 58GB host, this is fine (~1-2GB extra). On a Raspberry Pi, it would be wrong.
Summary
The storage strategy is the part of the design that lets the engine scale to the whole world, not just the macro structure. Neo4j is the nervous system — the relations, the time model, the consistency engine. Postgres is the muscle memory — the high-volume operational data. Qdrant/pgvector is the cortex — the semantic search. Redis is the short-term memory — the session state. S3 is the archive — the durable storage.
Each store is the right tool for its job. The engine is the integration layer that makes them feel like one world.