Kay bcda8eff00 Merge branch 'wt/t5-consistency-impl' into main
This merge also brings wt/t6-multi-world (4f92289) since T5 was rebased
on top of T6's namespace work.

Conflicts resolved (always took theirs — the implementation supersedes
the stub):
- README.md: updated 'What's not in this POC' to reflect that T5
  detection is now real, and marked T5 done in 'Next steps'
- plugins/consistency.py: T5's full 254-line implementation replaces
  T1's 153-line stub (find_contradictions / find_anachronisms /
  find_orphans / find_ontology_violations all backed by real Cypher)
2026-06-17 00:40:28 +00:00

lore-engine-poc

Proof of concept for the Lore Engine v1.1 architecture.

Five-minute goal: prove that with mock data, we can run a multi-database backend (Neo4j for the world graph, Postgres for operational records, MinIO for blob/image storage) and expose it all through a plugin-driven MCP gateway — where adding a new domain type is a new file in plugins/, not a Go change.

What's running

| Container | Image | Port | Role |
| --- | --- | --- | --- |
| lore-neo4j | neo4j:5.26-community | 7474 (browser), 7687 (bolt) | The world graph: people, factions, eras, events, lineage, time-bounded relations |
| lore-postgres | pgvector/pgvector:pg16 | 5432 | Trade log, image manifests, audit, image embeddings |
| lore-minio | minio/minio:latest | 9000 (S3), 9001 (console) | Image blob storage |
| lore-gateway | built locally | 8765 (MCP JSON-RPC) | The plugin-driven gateway |

The plugins (this is the proof)

plugins/
├── world.py       # entity_context, was_true_at, state_at   (Neo4j)
├── lineage.py     # ancestors_of, descendants_of, lineage_of (Neo4j)
├── trade.py       # log_trade, trades_by_buyer, market_price (Postgres)
├── images.py      # register_image, recall_images, search_images_by_caption
│                  #                                          (MinIO + Postgres + Neo4j)
└── embeddings.py  # embed_images, search_images_semantic    (Postgres + pgvector)

Each plugin is a single file with a register(registry) entry point. The gateway auto-loads every .py file in plugins/ at startup. No server.py change needed to add a new tool — drop a new file in, restart the container, the new tools appear in tools/list.
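
As a rough illustration of that loading model, the sketch below writes a throwaway plugin file to a temporary directory and loads it the way the paragraph above describes. The Registry class, its add_tool method, the load_plugins helper, and the rumors_about tool are all hypothetical stand-ins; the real registry interface in server.py may look different.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


class Registry:
    """Stand-in for the gateway's tool registry (assumed interface)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def add_tool(self, name: str, handler: Callable[..., Any]) -> None:
        self.tools[name] = handler


# Contents of a hypothetical plugins/rumors.py: one tool, one register() hook.
PLUGIN_SOURCE = '''
def rumors_about(name):
    # A real plugin would query one of the stores here.
    return {"entity": name, "rumors": []}


def register(registry):
    registry.add_tool("rumors_about", rumors_about)
'''


def load_plugins(plugin_dir: Path, registry: Registry) -> None:
    """Import every .py file in plugin_dir and call its register() hook."""
    for path in sorted(plugin_dir.glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        module.register(registry)


with tempfile.TemporaryDirectory() as tmp:
    plugin_dir = Path(tmp)
    (plugin_dir / "rumors.py").write_text(PLUGIN_SOURCE)
    registry = Registry()
    load_plugins(plugin_dir, registry)

print(sorted(registry.tools))  # ['rumors_about']
```

The point of the shape is that the loader never names a specific plugin: anything in the directory that exposes register() contributes tools.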

How to run it

cd /root/lore-engine-poc
docker compose up -d --build
# wait ~30s for neo4j + postgres + minio to be ready
docker exec -i lore-neo4j cypher-shell -u neo4j -p lore-dev-password < neo4j/init.cypher
docker compose exec -T postgres psql -U lore -d lore < postgres/init.sql
python3 seed.py
# gateway is now live on :8765

The seed.py script is idempotent (uses MERGE and ON CONFLICT). It loads:

  • 3 eras (1st Age, 2nd Age, Age of Iron)
  • 10 people (Theron, Maric, Aldric, Elara, Cael, Yssa, Vex, Alessia, Kael, Guildmaster Torren)
  • 3 factions (House Vyr, The Crimson Pact, Merchants Guild)
  • 4 locations (Valdorn, Mardsville, Thornwall Keep, Black Spire Pass)
  • 4 items (Sword of Eventide, The Pale Ledger, Ruby Eye of Kael, Elara's Locket)
  • 6 events
  • 1 lineage group
  • ~20 time-bounded relations
  • 3 trade log entries
  • 4 generated images (portraits + landscape + battle scene) uploaded to MinIO
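
To make the idempotency claim concrete, here is a minimal sketch of the ON CONFLICT half of the pattern. It uses an in-memory SQLite database as a stand-in for Postgres (both accept this ON CONFLICT ... DO NOTHING form), and the trade_log table, its columns, and the row values are invented for illustration; the real schema is whatever postgres/init.sql defines. The Cypher string shows the analogous MERGE shape with a hypothetical label and properties and is not executed.

```python
import sqlite3

# Analogous Neo4j statement (hypothetical label and properties, not executed):
# MERGE matches an existing node or creates it, so reruns do not duplicate.
EXAMPLE_MERGE = "MERGE (p:Person {id: $id}) SET p.name = $name"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trade_log ("
    "trade_id TEXT PRIMARY KEY, buyer TEXT, item_id TEXT, price INTEGER)"
)


def seed_trades(connection: sqlite3.Connection) -> None:
    """Insert the mock rows; rows whose key already exists are skipped."""
    rows = [("trade_1", "example_buyer", "example_item", 100)]
    connection.executemany(
        "INSERT INTO trade_log (trade_id, buyer, item_id, price) "
        "VALUES (?, ?, ?, ?) ON CONFLICT (trade_id) DO NOTHING",
        rows,
    )
    connection.commit()


# Running the seed twice leaves exactly one row.
seed_trades(conn)
seed_trades(conn)
row_count = conn.execute("SELECT COUNT(*) FROM trade_log").fetchone()[0]
print(row_count)  # 1
```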

Try the gateway

List all tools

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python3 -m json.tool

Look up Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"entity_context","arguments":{"name":"Aldric Raventhorne"}}
  }' | python3 -m json.tool
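
The same calls can be made from Python instead of curl. The helper below is a sketch using only the standard library; the function names build_request and call_tool are made up for this example, and the endpoint is the one used in the curl commands. Only the payload is built and printed here, since call_tool needs the gateway to be running.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

GATEWAY_URL = "http://localhost:8765/mcp"  # endpoint from the curl examples


def build_request(method: str, params: Optional[Dict[str, Any]] = None,
                  request_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request body."""
    payload: Dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return payload


def call_tool(name: str, arguments: Dict[str, Any],
              url: str = GATEWAY_URL) -> Dict[str, Any]:
    """POST a tools/call request and return the decoded JSON response."""
    payload = build_request("tools/call", {"name": name, "arguments": arguments})
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the "Look up Aldric" curl call.
example = build_request(
    "tools/call",
    {"name": "entity_context", "arguments": {"name": "Aldric Raventhorne"}},
)
print(json.dumps(example, indent=2))
```

With the stack up, call_tool("entity_context", {"name": "Aldric Raventhorne"}) should return the same JSON the curl command prints.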

Time-bounded query: was House Vyr allied with the Merchants Guild in 230 TA?

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{
      "name":"was_true_at",
      "arguments":{
        "relation":"ALLIED_WITH",
        "subject":"House Vyr",
        "object":"Merchants Guild",
        "at_time":"2nd_age.year_230"
      }
    }
  }' | python3 -m json.tool
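
Conceptually, the query above asks whether a point in time falls inside an edge's validity window. The sketch below shows that check in plain Python under several assumptions: the era keys other than 2nd_age, their ordering (taken from the order of the seeded eras), the valid_from and valid_to parameter names, and the example window are all invented here. The gateway does this in Cypher, and its actual property names may differ.

```python
from typing import Optional, Tuple

# Assumed era keys and ordering; only "2nd_age" appears in the README.
ERA_ORDER = {"1st_age": 0, "2nd_age": 1, "age_of_iron": 2}


def parse_time(value: str) -> Tuple[int, int]:
    """Turn 'era.year_N' into a sortable (era_index, year) tuple."""
    era, year = value.split(".year_")
    return ERA_ORDER[era], int(year)


def holds_at(valid_from: str, valid_to: Optional[str], at_time: str) -> bool:
    """True if at_time lies within [valid_from, valid_to]; None means open-ended."""
    moment = parse_time(at_time)
    if moment < parse_time(valid_from):
        return False
    return valid_to is None or moment <= parse_time(valid_to)


# Invented window, not the seeded alliance dates.
print(holds_at("2nd_age.year_200", "2nd_age.year_250", "2nd_age.year_230"))  # True
```

Because the comparison is a tuple ordering, a window can span eras without special cases.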

Lineage: Aldric's ancestors

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"ancestors_of","arguments":{"person":"Aldric Raventhorne","generations":5}}
  }' | python3 -m json.tool
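
The generations argument bounds how far up the tree the lookup walks. A depth-limited traversal of that kind can be sketched as follows; the parent links are invented placeholders rather than the seed data, and the return shape is an assumption, not the gateway's actual response format.

```python
from typing import Dict, List

# Invented parent links for illustration; not the seeded lineage.
PARENTS: Dict[str, List[str]] = {
    "child": ["parent_a", "parent_b"],
    "parent_a": ["grandparent_a"],
    "grandparent_a": ["great_grandparent_a"],
}


def ancestors_of(person: str, generations: int) -> List[dict]:
    """Breadth-first walk up the parent links, at most `generations` levels."""
    found: List[dict] = []
    seen = {person}
    frontier = [person]
    for depth in range(1, generations + 1):
        next_frontier: List[str] = []
        for current in frontier:
            for parent in PARENTS.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    found.append({"name": parent, "generation": depth})
                    next_frontier.append(parent)
        if not next_frontier:
            break  # ran out of recorded ancestors before hitting the limit
        frontier = next_frontier
    return found


print([a["name"] for a in ancestors_of("child", 2)])
# ['parent_a', 'parent_b', 'grandparent_a']
```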

Image recall: show me pictures of Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"recall_images","arguments":{"entity_id":"aldric"}}
  }' | python3 -m json.tool

The response includes a presigned_url — a MinIO URL valid for 1 hour. The LLM (or the calling client) can fetch the actual PNG from there.
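
A client that holds on to a presigned URL may want to know when it stops working. Assuming MinIO signs with AWS Signature Version 4 (its default), the URL carries X-Amz-Date and X-Amz-Expires query parameters, and the expiry can be read from them as in this sketch. The presigned_expiry helper and the example URL (bucket, object name, signature) are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 presigned URL expires."""
    query = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at + lifetime


# Placeholder URL; 3600 seconds matches the 1-hour validity described above.
example_url = (
    "http://localhost:9000/example-bucket/example.png"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20260101T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-Signature=placeholder"
)
print(presigned_expiry(example_url).isoformat())  # 2026-01-01T13:00:00+00:00
```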

Search images by caption

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_by_caption","arguments":{"q":"aldric"}}
  }' | python3 -m json.tool

Semantic image search (pgvector)

The embeddings plugin encodes each image's caption into a 384-dim vector with a local sentence-transformer model (all-MiniLM-L6-v2) and stores it in Postgres via the pgvector extension. Queries are encoded the same way and ranked by cosine distance. Unlike search_images_by_caption, this works on natural-language descriptions and doesn't require keyword overlap.

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_semantic","arguments":{"q":"a noble lord with a scar"}}
  }' | python3 -m json.tool

Returns Aldric's portrait as the top match. Try "a sneaky thief in a hood" for Vex. The first call triggers a one-time ~80MB model download on the gateway host; the model is then cached in ~/.cache/torch, so subsequent calls skip the download.
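
The ranking step itself is small. The sketch below computes cosine distance by hand and sorts by it, which is the ordering pgvector's <=> operator produces. The 3-dimensional vectors and image ids are toy stand-ins invented for this example; the real vectors are 384-dimensional model outputs.

```python
import math
from typing import Dict, List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """1 - cosine similarity; 0 means same direction, larger means less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy caption embeddings keyed by placeholder image ids.
caption_vectors: Dict[str, List[float]] = {
    "image_a": [0.9, 0.1, 0.0],
    "image_b": [0.1, 0.9, 0.2],
    "image_c": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # stands in for the encoded query text

# Closest caption first.
ranked = sorted(
    caption_vectors,
    key=lambda image_id: cosine_distance(query_vector, caption_vectors[image_id]),
)
print(ranked)  # ['image_a', 'image_b', 'image_c']
```

Because the score depends on vector direction rather than shared words, a query can match a caption it has no keywords in common with.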

If you add new images via register_image, embeddings are computed in the background by a daemon thread on the gateway — no separate job queue needed. Re-running embed_images is a no-op for images that already have embeddings.
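
One way such a background step could look is sketched below: the caller gets control back immediately while a daemon thread computes and stores the vector, and an image that already has an embedding is skipped. Everything here is hypothetical, including the in-memory embeddings dict (the gateway stores vectors in Postgres), the fake_embed placeholder for the model call, and the embed_in_background name.

```python
import threading
from typing import Dict, List

embeddings: Dict[str, List[float]] = {}  # stand-in for the Postgres table
_lock = threading.Lock()


def fake_embed(caption: str) -> List[float]:
    """Placeholder for the real sentence-transformer call."""
    return [float(len(caption))]


def embed_in_background(image_id: str, caption: str) -> threading.Thread:
    """Start a daemon thread that embeds the caption unless one already exists."""

    def work() -> None:
        with _lock:
            if image_id in embeddings:
                return  # already embedded: nothing to do
        vector = fake_embed(caption)
        with _lock:
            embeddings.setdefault(image_id, vector)

    worker = threading.Thread(target=work, daemon=True)
    worker.start()
    return worker


worker = embed_in_background("image_a", "example caption")
worker.join(timeout=5)  # only so the example can print a result
print(embeddings)
```

A daemon thread does not block process shutdown, so an embedding that is still in flight when the gateway stops is simply lost; rerunning embed_images afterwards would pick it up.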

Market price for the Pale Ledger

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"market_price","arguments":{"item_id":"pale_ledger"}}
  }' | python3 -m json.tool

What this proves

  1. The plugin boundary works. A new domain type (trade, images) is a new file in plugins/. No change to server.py, no change to docker-compose, no new container. Restart the gateway and the new tools are live.

  2. Polyglot storage is real, not aspirational. Neo4j holds the typed world graph. Postgres holds the time-series operational data and image manifests. MinIO holds the image bytes. Each store does what it's good at; the gateway composes the answers.

  3. Time is a first-class query primitive. was_true_at checks time-bounded edges with a single Cypher query — no LLM, no inference. Year-level precision works against the mock data (see 2nd_age.year_230 example above).

  4. Image recall works. Images are stored in MinIO, linked to entities in Neo4j ((:Image)-[:DEPICTS]->(:Person)), and discoverable by entity id, by tag, or by caption substring search. Presigned URLs are generated on the fly.

  5. The world is small but real. 10 people, 6 events, 4 images, ~20 relations — enough to demonstrate the architecture end-to-end. Scaling is a separate problem; this is the proof of shape.

What's not in this POC

  • No LLM in the loop. The MCP gateway is a tool server; the LLM client (Claude, GPT, anything) is the consumer. This is intentional — the POC validates the data and tool layers, not the LLM reasoning. The reasoning harness is in the design docs (lore-engine/docs/07-reasoning-harness.md) and would be added as a system prompt in a real deployment.

  • Consistency detection is real (v2.T5). The 4 tools (find_contradictions, find_anachronisms, find_orphans, find_ontology_violations) query pre-materialized violation nodes in Neo4j. The seed (seed.py:seed_violations) computes the violations from the same heuristics (overlapping MEMBER_OF windows, Person.born > event_year, world entities with no relations, and :OntologyRule-driven checks) so the math is visible in plain Python — not hidden in Cypher.

  • No world-builder UI. Everything is curl and cypher-shell. The UI is a v2 feature.

  • No reflective memory or behavior layer. The Stanford Generative Agents pattern (memory stream + reflection + planning) is a v2 borrow per the comparison in lore-engine/docs/16-comparison.md.
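
For readers who want a feel for the consistency heuristics named above, here is a plain-Python sketch of three of them. It is not the code in seed.py: the record shapes, field names, and data are invented, and it assumes a contradiction means one person holding overlapping memberships in two different factions, which the README does not spell out.

```python
from typing import Dict, List, Set, Tuple

# Invented records for illustration only.
memberships = [
    {"person": "person_a", "faction": "faction_x", "start": 100, "end": 150},
    {"person": "person_a", "faction": "faction_y", "start": 140, "end": 180},
    {"person": "person_b", "faction": "faction_x", "start": 100, "end": 120},
]
births: Dict[str, int] = {"person_a": 90, "person_b": 130}
participations: List[Tuple[str, str, int]] = [("person_b", "event_1", 125)]
entities: Set[str] = {"person_a", "person_b", "item_1"}


def find_contradictions(rows: List[dict]) -> List[Tuple[str, str, str]]:
    """Pairs of MEMBER_OF windows for one person that overlap in time."""
    hits = []
    for i, a in enumerate(rows):
        for b in rows[i + 1:]:
            same_person = a["person"] == b["person"]
            different_faction = a["faction"] != b["faction"]
            overlap = a["start"] <= b["end"] and b["start"] <= a["end"]
            if same_person and different_faction and overlap:
                hits.append((a["person"], a["faction"], b["faction"]))
    return hits


def find_anachronisms(born: Dict[str, int],
                      events: List[Tuple[str, str, int]]) -> List[Tuple[str, str]]:
    """People recorded at an event that predates their birth year."""
    return [(person, event) for person, event, year in events if born[person] > year]


def find_orphans(all_entities: Set[str], rows: List[dict],
                 events: List[Tuple[str, str, int]]) -> List[str]:
    """Entities that appear in no relation at all."""
    connected = {m["person"] for m in rows} | {person for person, _, _ in events}
    return sorted(all_entities - connected)


print(find_contradictions(memberships))           # [('person_a', 'faction_x', 'faction_y')]
print(find_anachronisms(births, participations))  # [('person_b', 'event_1')]
print(find_orphans(entities, memberships, participations))  # ['item_1']
```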

Next steps after this POC

  • Implement the consistency detection rules behind the 4 stub tools (T5). Done.
  • Add the embedding-based semantic search plugin (uses the Image.caption and any future Person.summary text). Done for Image.caption (plugins/embeddings.py).
  • Add an LLM client that consumes the gateway with the reasoning harness system prompt and runs the 5 question types from the design.

The v1 design in lore-engine/docs/ is the contract. This POC is the proof of shape.
