lore-engine-poc
Proof of concept for the Lore Engine v1.1 architecture.
Five-minute goal: prove that with mock data, we can run a multi-database backend (Neo4j for the world graph, Postgres for operational records, MinIO for blob/image storage) and expose it all through a plugin-driven MCP gateway — where adding a new domain type is a new file in plugins/, not a Go change.
What's running
| Container | Image | Port | Role |
|---|---|---|---|
| lore-neo4j | neo4j:5.26-community | 7474 (browser), 7687 (bolt) | The world graph: people, factions, eras, events, lineage, time-bounded relations |
| lore-postgres | pgvector/pgvector:pg16 | 5432 | Trade log, image manifests, audit, image embeddings |
| lore-minio | minio/minio:latest | 9000 (S3), 9001 (console) | Image blob storage |
| lore-gateway | built locally | 8765 (MCP JSON-RPC) | The plugin-driven gateway |
The plugins (this is the proof)
plugins/
├── world.py # entity_context, was_true_at, state_at (Neo4j)
├── lineage.py # ancestors_of, descendants_of, lineage_of (Neo4j)
├── trade.py # log_trade, trades_by_buyer, market_price (Postgres)
├── images.py # register_image, recall_images, search_images_by_caption
│ # (MinIO + Postgres + Neo4j)
└── embeddings.py # embed_images, search_images_semantic (Postgres + pgvector)
Each plugin is a single file with a register(registry) entry point. The gateway auto-loads every .py file in plugins/ at startup. No server.py change needed to add a new tool — drop a new file in, restart the container, the new tools appear in tools/list.
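As a rough illustration of the plugin contract described above, the sketch below shows what a new plugin file might look like. Only the register(registry) entry point comes from this README; the Registry class, its add_tool method, the plugins/weather.py filename, and the weather_at tool are hypothetical stand-ins, since the gateway's real registry API is not shown here.

```python
# plugins/weather.py (hypothetical example plugin, not part of the POC)
from typing import Any, Callable, Dict


class Registry:
    """Stand-in for the gateway's registry; the real API may differ."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def add_tool(self, name: str, handler: Callable[..., Any]) -> None:
        # Assumed method name: maps a tool name to its handler.
        self.tools[name] = handler


def weather_at(location: str) -> Dict[str, str]:
    """Hypothetical tool: return canned weather for a location."""
    return {"location": location, "weather": "fog"}


def register(registry: Registry) -> None:
    """Entry point the gateway calls once when it loads this file."""
    registry.add_tool("weather_at", weather_at)


# Simulate what the gateway does at startup: load the file, call register().
registry = Registry()
register(registry)
result = registry.tools["weather_at"]("Valdorn")
```

The point of the shape is that everything a tool needs lives in one file, so adding a domain never touches the gateway code itself.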
How to run it
cd /root/lore-engine-poc
docker compose up -d --build
# wait ~30s for neo4j + postgres + minio to be ready
docker exec -i lore-neo4j cypher-shell -u neo4j -p lore-dev-password < neo4j/init.cypher
docker compose exec -T postgres psql -U lore -d lore < postgres/init.sql
python3 seed.py
# gateway is now live on :8765
The seed.py script is idempotent (uses MERGE and ON CONFLICT). It loads:
- 3 eras (1st Age, 2nd Age, Age of Iron)
- 10 people (Theron, Maric, Aldric, Elara, Cael, Yssa, Vex, Alessia, Kael, Guildmaster Torren)
- 3 factions (House Vyr, The Crimson Pact, Merchants Guild)
- 4 locations (Valdorn, Mardsville, Thornwall Keep, Black Spire Pass)
- 4 items (Sword of Eventide, The Pale Ledger, Ruby Eye of Kael, Elara's Locket)
- 6 events
- 1 lineage group
- ~20 time-bounded relations
- 3 trade log entries
- 4 generated images (portraits + landscape + battle scene) uploaded to MinIO
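The idempotency claim can be illustrated with a small upsert sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres (both accept INSERT ... ON CONFLICT ... DO UPDATE); the trade_log table, its columns, and the sample prices are hypothetical, not the schema or data in postgres/init.sql and seed.py.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres container.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trade_log (id TEXT PRIMARY KEY, item_id TEXT, price INTEGER)"
)


def seed_trades(conn):
    """Insert the mock trades; re-running updates rows instead of duplicating."""
    rows = [
        ("t1", "pale_ledger", 120),        # made-up values
        ("t2", "sword_of_eventide", 900),
        ("t3", "ruby_eye_of_kael", 450),
    ]
    conn.executemany(
        "INSERT INTO trade_log (id, item_id, price) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET "
        "item_id = excluded.item_id, price = excluded.price",
        rows,
    )
    conn.commit()


# Seeding twice leaves the same three rows in place.
seed_trades(conn)
seed_trades(conn)
row_count = conn.execute("SELECT COUNT(*) FROM trade_log").fetchone()[0]
```

The Neo4j side follows the same idea with MERGE, which matches an existing node or relationship before creating a new one.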
Try the gateway
List all tools
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python3 -m json.tool
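For scripting, the same request can be built in Python. This is a minimal sketch using only the standard library; the helper names build_request and call_gateway are hypothetical, and the URL is the one used by the curl examples.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8765/mcp"


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body like the ones the curl examples send."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return body


def call_gateway(method, params=None, url=GATEWAY_URL):
    """POST a request to the gateway and return the decoded JSON reply."""
    data = json.dumps(build_request(method, params)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building a request needs no network access.
list_request = build_request("tools/list")

# With the gateway container running, a call would look like:
#   call_gateway("tools/call",
#                {"name": "entity_context",
#                 "arguments": {"name": "Aldric Raventhorne"}})
```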
Look up Aldric
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"entity_context","arguments":{"name":"Aldric Raventhorne"}}
}' | python3 -m json.tool
Time-bounded query: was House Vyr allied with the Merchants Guild in 230 TA?
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{
"name":"was_true_at",
"arguments":{
"relation":"ALLIED_WITH",
"subject":"House Vyr",
"object":"Merchants Guild",
"at_time":"2nd_age.year_230"
}
}
}' | python3 -m json.tool
Lineage: Aldric's ancestors
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"ancestors_of","arguments":{"person":"Aldric Raventhorne","generations":5}}
}' | python3 -m json.tool
Image recall: show me pictures of Aldric
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"recall_images","arguments":{"entity_id":"aldric"}}
}' | python3 -m json.tool
The response includes a presigned_url — a MinIO URL valid for 1 hour. The LLM (or the calling client) can fetch the actual PNG from there.
Search images by caption
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"search_images_by_caption","arguments":{"q":"aldric"}}
}' | python3 -m json.tool
Semantic image search (pgvector)
The embeddings plugin encodes each image's caption into a 384-dim vector
with a local sentence-transformer model (all-MiniLM-L6-v2) and stores it
in Postgres via the pgvector extension. Queries are encoded the same
way and ranked by cosine distance. Unlike search_images_by_caption, this
works on natural-language descriptions and doesn't require keyword overlap.
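The ranking step can be sketched without the model. The snippet below ranks toy 3-dimensional vectors by cosine distance in plain Python; the vectors and image ids are invented for illustration, whereas the real plugin uses 384-dimensional embeddings and lets pgvector do the comparison.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query_vec, stored):
    """Return image ids ordered from closest to farthest from the query."""
    return sorted(stored, key=lambda image_id: cosine_distance(query_vec, stored[image_id]))


# Toy stand-ins for caption embeddings (invented values).
stored = {
    "aldric_portrait": [0.9, 0.1, 0.0],
    "vex_portrait": [0.1, 0.9, 0.1],
    "valdorn_landscape": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend encoding of the search text
ranking = rank(query, stored)
```

Because the score depends only on the angle between vectors, a query can match a caption with which it shares no words, as long as the model places them close together.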
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"search_images_semantic","arguments":{"q":"a noble lord with a scar"}}
}' | python3 -m json.tool
Returns Aldric's portrait as the top match. Try "a sneaky thief in a hood"
for Vex. The first call triggers a one-time ~80MB model download on the
gateway host; subsequent calls reuse the copy cached in ~/.cache/torch.
If you add new images via register_image, embeddings are computed in
the background by a daemon thread on the gateway — no separate job queue
needed. Re-running embed_images is a no-op for images that already have
embeddings.
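The background-embedding behaviour might look roughly like the sketch below: a daemon thread drains a queue of new images and skips ids that already have an embedding. The queue, the embeddings dict, and fake_embed are hypothetical stand-ins; the real plugin's internals are not shown in this README.

```python
import queue
import threading

embeddings = {}          # image_id -> vector (stand-in for the Postgres table)
pending = queue.Queue()  # images waiting to be embedded


def fake_embed(caption):
    """Placeholder for the sentence-transformer call."""
    return [float(len(caption))]


def worker():
    """Embed queued images forever; already-embedded ids are a no-op."""
    while True:
        image_id, caption = pending.get()
        try:
            if image_id not in embeddings:
                embeddings[image_id] = fake_embed(caption)
        finally:
            pending.task_done()


# daemon=True lets the process exit without waiting for the worker.
threading.Thread(target=worker, daemon=True).start()

pending.put(("img_1", "a noble lord with a scar"))
pending.put(("img_1", "a noble lord with a scar"))  # duplicate: skipped
pending.put(("img_2", "a sneaky thief in a hood"))
pending.join()  # block until the queue is drained
```

A single in-process worker keeps the deployment to one container, at the cost of losing queued work if the gateway restarts before the queue drains.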
Market price for the Pale Ledger
curl -s -X POST http://localhost:8765/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc":"2.0","id":1,"method":"tools/call",
"params":{"name":"market_price","arguments":{"item_id":"pale_ledger"}}
}' | python3 -m json.tool
What this proves
- The plugin boundary works. A new domain type (trade, images) is a new file in plugins/. No change to server.py, no change to docker-compose, no new container. Restart the gateway and the new tools are live.
- Polyglot storage is real, not aspirational. Neo4j holds the typed world graph. Postgres holds the time-series operational data and image manifests. MinIO holds the image bytes. Each store does what it's good at; the gateway composes the answers.
- Time is a first-class query primitive. was_true_at checks time-bounded edges with a single Cypher query, with no LLM and no inference. Year-level precision works against the mock data (see the 2nd_age.year_230 example above).
- Image recall works. Images are stored in MinIO, linked to entities in Neo4j ((:Image)-[:DEPICTS]->(:Person)), and discoverable by entity id, by tag, or by caption substring search. Presigned URLs are generated on the fly.
- The world is small but real. 10 people, 6 events, 4 images, ~20 relations: enough to demonstrate the architecture end-to-end. Scaling is a separate problem; this is the proof of shape.
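The time-bounded check behind was_true_at reduces to an interval test. The following is a plain-Python sketch of that test, not the Cypher the plugin actually runs; the parse_time helper, the era ids and their ordering, the edge dictionary shape, and the alliance dates are all assumptions made for illustration.

```python
# Assumed era ids, ordered oldest to newest.
ERA_ORDER = {"1st_age": 0, "2nd_age": 1, "age_of_iron": 2}


def parse_time(value):
    """Turn '2nd_age.year_230' into a sortable (era_index, year) tuple."""
    era, year_part = value.split(".")
    if not year_part.startswith("year_"):
        raise ValueError(f"unexpected time format: {value!r}")
    return (ERA_ORDER[era], int(year_part[len("year_"):]))


def was_true_at(edge, at_time):
    """Return True if at_time falls inside the edge's validity window."""
    start = parse_time(edge["valid_from"])
    # A missing end means the relation is still in force.
    end = parse_time(edge["valid_to"]) if edge["valid_to"] else None
    t = parse_time(at_time)
    return start <= t and (end is None or t <= end)


# Hypothetical edge; the dates are invented, not taken from the seed data.
edge = {
    "relation": "ALLIED_WITH",
    "subject": "House Vyr",
    "object": "Merchants Guild",
    "valid_from": "2nd_age.year_200",
    "valid_to": "2nd_age.year_250",
}
answer_230 = was_true_at(edge, "2nd_age.year_230")
answer_300 = was_true_at(edge, "2nd_age.year_300")
```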
What's not in this POC
- No LLM in the loop. The MCP gateway is a tool server; the LLM client (Claude, GPT, anything) is the consumer. This is intentional: the POC validates the data and tool layers, not the LLM reasoning. The reasoning harness is in the design docs (lore-engine/docs/07-reasoning-harness.md) and would be added as a system prompt in a real deployment.
- Consistency detection is real (v2.T5). The 4 tools (find_contradictions, find_anachronisms, find_orphans, find_ontology_violations) query pre-materialized violation nodes in Neo4j. The seed (seed.py:seed_violations) computes the violations from the same heuristics the design calls for (overlapping MEMBER_OF windows, Person.born > event_year, world entities with no relations, and :OntologyRule-driven checks) so the math is visible in plain Python, not hidden in Cypher.
- No world-builder UI. Everything is curl and cypher-shell. The UI is a v2 feature.
- No reflective memory or behavior layer. The Stanford Generative Agents pattern (memory stream + reflection + planning) is a v2 borrow per the comparison in lore-engine/docs/16-comparison.md.
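Two of the heuristics named above can be sketched in a few lines of Python. This is not the code in seed.py:seed_violations; the record shapes, field names, and sample values are hypothetical, years are plain integers for simplicity, and treating an overlap as a violation only across different factions is an assumption.

```python
from itertools import combinations


def find_overlapping_memberships(memberships):
    """Flag people whose MEMBER_OF windows in two different factions overlap."""
    by_person = {}
    for m in memberships:
        by_person.setdefault(m["person"], []).append(m)
    violations = []
    for person, rows in by_person.items():
        for a, b in combinations(rows, 2):
            # Two closed intervals overlap when each starts before the other ends.
            overlaps = a["start"] <= b["end"] and b["start"] <= a["end"]
            if a["faction"] != b["faction"] and overlaps:
                violations.append((person, a["faction"], b["faction"]))
    return violations


def find_anachronisms(people, participations):
    """Flag participation in an event dated before the person was born."""
    born = {p["name"]: p["born"] for p in people}
    return [
        (x["person"], x["event"])
        for x in participations
        if born[x["person"]] > x["event_year"]
    ]


# Invented sample records, not the seeded violations.
memberships = [
    {"person": "Vex", "faction": "House Vyr", "start": 210, "end": 240},
    {"person": "Vex", "faction": "The Crimson Pact", "start": 230, "end": 260},
    {"person": "Kael", "faction": "Merchants Guild", "start": 200, "end": 220},
]
people = [{"name": "Yssa", "born": 250}, {"name": "Kael", "born": 180}]
participations = [
    {"person": "Yssa", "event": "event_a", "event_year": 240},
    {"person": "Kael", "event": "event_a", "event_year": 240},
]

contradictions = find_overlapping_memberships(memberships)
anachronisms = find_anachronisms(people, participations)
```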
Next steps after this POC
- Implement the consistency detection rules behind the 4 stub tools (T5). Done.
- Add the embedding-based semantic search plugin (uses the Image.caption and any future Person.summary text).
- Add an LLM client that consumes the gateway with the reasoning harness system prompt and runs the 5 question types from the design.
The v1 design in lore-engine/docs/ is the contract. This POC is the proof of shape.