
Hermes 8261c2dcc1 v2.T5: implement 4 consistency tools — 5/5 violations surfaced

The 4 tools (find_contradictions, find_anachronisms, find_orphans,
find_ontology_violations) now read pre-materialized violation nodes
from Neo4j, populated by seed.py:seed_violations. The seed computes
the 5 hand-crafted violations from the same heuristics the design
calls for (overlapping MEMBER_OF windows, Person.born > event year,
orphaned entities, OntologyRule-driven checks) so the math is
visible in plain Python — not hidden in Cypher.

* plugins/consistency.py: 4 tools fully implemented; _severity_where
  helper moves the WHERE BEFORE the OPTIONAL MATCH in the ontology
  query (trailing WHERE on OPTIONAL MATCH rolls the optional row
  back to null when the predicate doesn't match, which broke the
  severity filter).
* seed.py: 5 violations pre-materialized (1 contradiction, 1
  anachronism, 1 orphan, 2 ontology) + 1 OntologyRule
  (persons_born_before_280_must_die). Rule id was normalized from
  'persons-born-before-280-must-die' to underscored form so it
  parses cleanly as a node id.
* examples/test_consistency.sh: 10 assertions across 4 tools
  (severity filter variants), exits 0.
* tests/test_consistency.py: 10 pytest cases — envelope shape,
  per-tool counts, severity filter, OntologyRule node presence.
* README.md: T5 marked done.

Verification:
  pytest tests/test_consistency.py     10/10 PASS
  bash examples/test_consistency.sh    10/10 assertions, exit 0
  bash test.sh                          no regressions, exit 0
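
The WHERE placement fix described for plugins/consistency.py can be sketched as a small query builder. This is a minimal illustration, not the code in the repository: the `Violation` label, the `BREAKS` relationship, the `kind` and `severity` properties, the `"high"` value, and the helper names are all assumptions made for the example.

```python
def severity_where(severity, alias="v"):
    """Return (clause, params); the clause is empty when no filter is requested."""
    if severity is None:
        return "", {}
    return f"WHERE {alias}.severity = $severity", {"severity": severity}


def ontology_query(severity=None):
    """Build the query with the filter attached to MATCH, not to OPTIONAL MATCH.

    A WHERE that trails an OPTIONAL MATCH is part of the optional pattern:
    when the predicate fails, the row survives with nulls instead of being
    dropped. Attached to the preceding MATCH, the same predicate filters rows.
    """
    where, params = severity_where(severity)
    lines = [
        "MATCH (v:Violation {kind: 'ontology'})",        # hypothetical schema
        where,                                           # filter binds to MATCH
        "OPTIONAL MATCH (v)-[:BREAKS]->(r:OntologyRule)",
        "RETURN v, r",
    ]
    return "\n".join(line for line in lines if line), params


query, params = ontology_query("high")
print(query)
```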

2026-06-16 23:14:34 +00:00


lore-engine-poc

Proof of concept for the Lore Engine v1.1 architecture.

Five-minute goal: prove that with mock data, we can run a multi-database backend (Neo4j for the world graph, Postgres for operational records, MinIO for blob/image storage) and expose it all through a plugin-driven MCP gateway — where adding a new domain type is a new file in plugins/, not a Go change.

What's running

| Container | Image | Port | Role |
| --- | --- | --- | --- |
| `lore-neo4j` | `neo4j:5.26-community` | 7474 (browser), 7687 (bolt) | The world graph: people, factions, eras, events, lineage, time-bounded relations |
| `lore-postgres` | `pgvector/pgvector:pg16` | 5432 | Trade log, image manifests, audit, image embeddings |
| `lore-minio` | `minio/minio:latest` | 9000 (S3), 9001 (console) | Image blob storage |
| `lore-gateway` | built locally | 8765 (MCP JSON-RPC) | The plugin-driven gateway |

The plugins (this is the proof)

plugins/
├── world.py       # entity_context, was_true_at, state_at   (Neo4j)
├── lineage.py     # ancestors_of, descendants_of, lineage_of (Neo4j)
├── trade.py       # log_trade, trades_by_buyer, market_price (Postgres)
├── images.py      # register_image, recall_images, search_images_by_caption
│                  #                                          (MinIO + Postgres + Neo4j)
└── embeddings.py  # embed_images, search_images_semantic    (Postgres + pgvector)

Each plugin is a single file with a register(registry) entry point. The gateway auto-loads every .py file in plugins/ at startup. No server.py change needed to add a new tool — drop a new file in, restart the container, the new tools appear in tools/list.
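
The auto-load behaviour described above can be sketched in a few lines. This is a minimal sketch, not the gateway's actual loader: the `Registry` class, its `tool()` method, and the `hello` plugin are hypothetical stand-ins. Only the `register(registry)` entry point and the "every .py file in plugins/" rule come from this README.

```python
import importlib.util
import tempfile
from pathlib import Path


class Registry:
    """Hypothetical tool registry: maps tool names to callables."""

    def __init__(self):
        self.tools = {}

    def tool(self, name, fn):
        self.tools[name] = fn


def load_plugins(plugin_dir, registry):
    """Import every .py file in plugin_dir and call its register(registry)."""
    for path in sorted(Path(plugin_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "register"):
            module.register(registry)
    return registry


# A stand-in plugin, written to a temp dir so the sketch runs on its own.
PLUGIN_SOURCE = '''
def hello(arguments):
    return {"greeting": "hello " + arguments["name"]}

def register(registry):
    registry.tool("hello", hello)
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "hello.py").write_text(PLUGIN_SOURCE)
    registry = load_plugins(tmp, Registry())

print(sorted(registry.tools))
```

Under this shape, adding a tool means adding a file. The loader never needs to know tool names in advance.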

How to run it

cd /root/lore-engine-poc
docker compose up -d --build
# wait ~30s for neo4j + postgres + minio to be ready
docker exec -i lore-neo4j cypher-shell -u neo4j -p lore-dev-password < neo4j/init.cypher
docker compose exec -T postgres psql -U lore -d lore < postgres/init.sql
python3 seed.py
# gateway is now live on :8765

The seed.py script is idempotent (uses MERGE and ON CONFLICT). It loads:

3 eras (1st Age, 2nd Age, Age of Iron)
10 people (Theron, Maric, Aldric, Elara, Cael, Yssa, Vex, Alessia, Kael, Guildmaster Torren)
3 factions (House Vyr, The Crimson Pact, Merchants Guild)
4 locations (Valdorn, Mardsville, Thornwall Keep, Black Spire Pass)
4 items (Sword of Eventide, The Pale Ledger, Ruby Eye of Kael, Elara's Locket)
6 events
1 lineage group
~20 time-bounded relations
3 trade log entries
4 generated images (portraits + landscape + battle scene) uploaded to MinIO
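
What idempotent means here can be shown with a small upsert sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres so it runs without containers. The `trades` table, its columns, and the three rows are invented for illustration and are not the seed's real schema or data. SQLite's `ON CONFLICT ... DO UPDATE` follows the same upsert idea as Postgres; Neo4j's `MERGE` plays the equivalent role on the graph side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (id TEXT PRIMARY KEY, item_id TEXT, price INTEGER)"
)


def seed_trades(conn):
    """Insert the mock rows; re-running updates them in place instead of duplicating."""
    rows = [
        ("t1", "item_a", 120),  # made-up values
        ("t2", "item_a", 135),
        ("t3", "item_b", 900),
    ]
    conn.executemany(
        "INSERT INTO trades (id, item_id, price) VALUES (?, ?, ?) "
        "ON CONFLICT (id) DO UPDATE SET "
        "item_id = excluded.item_id, price = excluded.price",
        rows,
    )
    conn.commit()


seed_trades(conn)
seed_trades(conn)  # second run must not add rows
count = conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
print(count)  # prints 3
```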

Try the gateway

List all tools

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python3 -m json.tool

Look up Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"entity_context","arguments":{"name":"Aldric Raventhorne"}}
  }' | python3 -m json.tool

Time-bounded query: was House Vyr allied with the Merchants Guild in 230 TA?

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{
      "name":"was_true_at",
      "arguments":{
        "relation":"ALLIED_WITH",
        "subject":"House Vyr",
        "object":"Merchants Guild",
        "at_time":"2nd_age.year_230"
      }
    }
  }' | python3 -m json.tool

Lineage: Aldric's ancestors

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"ancestors_of","arguments":{"person":"Aldric Raventhorne","generations":5}}
  }' | python3 -m json.tool

Image recall: show me pictures of Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"recall_images","arguments":{"entity_id":"aldric"}}
  }' | python3 -m json.tool

The response includes a presigned_url — a MinIO URL valid for 1 hour. The LLM (or the calling client) can fetch the actual PNG from there.

Search images by caption

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_by_caption","arguments":{"q":"aldric"}}
  }' | python3 -m json.tool

Semantic image search (pgvector)

The embeddings plugin encodes each image's caption into a 384-dim vector with a local sentence-transformer model (all-MiniLM-L6-v2) and stores it in Postgres via the pgvector extension. Queries are encoded the same way and ranked by cosine distance. Unlike search_images_by_caption, this works on natural-language descriptions and doesn't require keyword overlap.

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_semantic","arguments":{"q":"a noble lord with a scar"}}
  }' | python3 -m json.tool

Returns Aldric's portrait as the top match. Try "a sneaky thief in a hood" for Vex. The first call triggers a one-time ~80MB model download on the gateway host; the model is then cached in ~/.cache/torch, so subsequent calls skip the download.
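
The ranking step can be illustrated without the model or the database. The sketch below computes cosine distance (1 minus cosine similarity, which is what pgvector's `<=>` operator returns) in plain Python and sorts by it. The 3-dimensional vectors and item names are toy values standing in for the 384-dim caption embeddings.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, larger means less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query_vec, items):
    """Sort (name, vector) pairs from nearest to farthest."""
    return sorted(items, key=lambda item: cosine_distance(query_vec, item[1]))


# Toy vectors, invented for the example.
items = [
    ("portrait_a", [0.9, 0.1, 0.0]),
    ("landscape", [0.0, 0.2, 0.9]),
    ("portrait_b", [0.1, 0.9, 0.1]),
]
query = [1.0, 0.2, 0.0]

ranked = [name for name, _ in rank(query, items)]
print(ranked)  # prints ['portrait_a', 'portrait_b', 'landscape']
```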

If you add new images via register_image, embeddings are computed in the background by a daemon thread on the gateway — no separate job queue needed. Re-running embed_images is a no-op for images that already have embeddings.
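
One way to get this behaviour is a single daemon thread draining an in-process queue. This is a sketch under assumptions, not the plugin's code: `fake_embed`, the `pending` queue, the in-memory `embeddings` dict, and this `register_image` signature are hypothetical. The real plugin calls the sentence-transformer model and writes to Postgres.

```python
import queue
import threading

pending = queue.Queue()   # in-process only; no external job queue
embeddings = {}           # stand-in for the Postgres embeddings table


def fake_embed(caption):
    """Placeholder for the model call; returns a tiny deterministic vector."""
    return [float(len(caption)), float(caption.count(" "))]


def worker():
    """Drain the queue forever; skip images that already have an embedding."""
    while True:
        image_id, caption = pending.get()
        try:
            if image_id not in embeddings:
                embeddings[image_id] = fake_embed(caption)
        finally:
            pending.task_done()


# daemon=True: the thread never blocks gateway shutdown.
threading.Thread(target=worker, daemon=True).start()


def register_image(image_id, caption):
    """Return immediately; the embedding is computed in the background."""
    pending.put((image_id, caption))


register_image("img_1", "a noble lord with a scar")
register_image("img_1", "a noble lord with a scar")  # second request is a no-op
pending.join()  # only needed here so the sketch can print a result
print(sorted(embeddings))
```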

Market price for the Pale Ledger

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"market_price","arguments":{"item_id":"pale_ledger"}}
  }' | python3 -m json.tool
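
Every call in this section uses the same JSON-RPC envelope, so a client can wrap it once. The sketch below mirrors the curl commands using only the standard library. The helper names `build_call` and `call_tool` are invented for the example; the URL, method, and payload shape are taken from the commands above.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8765/mcp"


def build_call(name, arguments, request_id=1):
    """Build the tools/call envelope used by the curl examples."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def call_tool(name, arguments, url=GATEWAY_URL, timeout=10):
    """POST the envelope to the gateway and return the decoded JSON response."""
    body = json.dumps(build_call(name, arguments)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the payload needs no running gateway.
payload = build_call("market_price", {"item_id": "pale_ledger"})
print(json.dumps(payload, indent=2))
```

With the containers up, `call_tool("market_price", {"item_id": "pale_ledger"})` should send the same request as the last curl command.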

What this proves

The plugin boundary works. A new domain type (trade, images) is a new file in plugins/. No change to server.py, no change to docker-compose, no new container. Restart the gateway and the new tools are live.
Polyglot storage is real, not aspirational. Neo4j holds the typed world graph. Postgres holds the time-series operational data and image manifests. MinIO holds the image bytes. Each store does what it's good at; the gateway composes the answers.
Time is a first-class query primitive. was_true_at checks time-bounded edges with a single Cypher query — no LLM, no inference. Year-level precision works against the mock data (see 2nd_age.year_230 example above).
Image recall works. Images are stored in MinIO, linked to entities in Neo4j ((:Image)-[:DEPICTS]->(:Person)), and discoverable by entity id, by tag, or by caption substring search. Presigned URLs are generated on the fly.
The world is small but real. 10 people, 6 events, 4 images, ~20 relations — enough to demonstrate the architecture end-to-end. Scaling is a separate problem; this is the proof of shape.
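
The time-bounded check behind was_true_at comes down to an interval test. The sketch below restates that predicate in plain Python for readability; the gateway itself does this in Cypher. The era ids other than `2nd_age`, the era ordering, the `Edge` fields, and the alliance window (years 200 to 250) are assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed era ids and ordering; only "2nd_age" appears in the examples above.
ERA_ORDER = {"1st_age": 0, "2nd_age": 1, "age_of_iron": 2}


def parse_time(value):
    """Turn '2nd_age.year_230' into a sortable (era_index, year) tuple."""
    era, year = value.split(".year_")
    return (ERA_ORDER[era], int(year))


@dataclass
class Edge:
    relation: str
    subject: str
    obj: str
    valid_from: str
    valid_to: Optional[str] = None  # None means the relation is still open


def was_true_at(edges, relation, subject, obj, at_time):
    """True if any matching edge's [valid_from, valid_to] window contains at_time."""
    t = parse_time(at_time)
    for edge in edges:
        if (edge.relation, edge.subject, edge.obj) != (relation, subject, obj):
            continue
        start = parse_time(edge.valid_from)
        end = parse_time(edge.valid_to) if edge.valid_to else None
        if start <= t and (end is None or t <= end):
            return True
    return False


# Made-up window, for illustration only.
edges = [
    Edge("ALLIED_WITH", "House Vyr", "Merchants Guild",
         "2nd_age.year_200", "2nd_age.year_250"),
]
answer = was_true_at(edges, "ALLIED_WITH", "House Vyr", "Merchants Guild",
                     "2nd_age.year_230")
print(answer)  # prints True
```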

What's not in this POC

No LLM in the loop. The MCP gateway is a tool server; the LLM client (Claude, GPT, anything) is the consumer. This is intentional — the POC validates the data and tool layers, not the LLM reasoning. The reasoning harness is in the design docs (lore-engine/docs/07-reasoning-harness.md) and would be added as a system prompt in a real deployment.
Consistency detection is no longer a gap (v2.T5). The 4 tools (find_contradictions, find_anachronisms, find_orphans, find_ontology_violations) query pre-materialized violation nodes in Neo4j. The seed (seed.py:seed_violations) computes the violations from the same heuristics the design calls for (overlapping MEMBER_OF windows, Person.born > event_year, world entities with no relations, and :OntologyRule-driven checks), so the math is visible in plain Python rather than hidden in Cypher.
No world-builder UI. Everything is curl and cypher-shell. The UI is a v2 feature.
No reflective memory or behavior layer. The Stanford Generative Agents pattern (memory stream + reflection + planning) is a v2 borrow per the comparison in lore-engine/docs/16-comparison.md.
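
The three non-ontology heuristics named in the consistency item above fit in a few lines of plain Python. This is a sketch of the idea, not a copy of seed.py:seed_violations. The tuple layouts, the placeholder names (`person_a`, `faction_x`, and so on), the years, and the rule that overlapping memberships in different factions count as a contradiction are all assumptions.

```python
def windows_overlap(a_start, a_end, b_start, b_end):
    """Closed intervals overlap when each starts before the other ends."""
    return a_start <= b_end and b_start <= a_end


def find_contradictions(memberships):
    """memberships: list of (person, faction, start_year, end_year)."""
    hits = []
    for i, (p1, f1, s1, e1) in enumerate(memberships):
        for p2, f2, s2, e2 in memberships[i + 1:]:
            if p1 == p2 and f1 != f2 and windows_overlap(s1, e1, s2, e2):
                hits.append((p1, f1, f2))
    return hits


def find_anachronisms(born, participations):
    """Flag people who take part in an event dated before their birth year."""
    return [(p, ev) for p, ev, year in participations if born[p] > year]


def find_orphans(entities, relations):
    """Entities that appear in no relation at all."""
    connected = {node for a, b in relations for node in (a, b)}
    return sorted(set(entities) - connected)


# Placeholder data, invented for the example.
memberships = [
    ("person_a", "faction_x", 100, 150),
    ("person_a", "faction_y", 140, 180),  # overlaps 140-150
    ("person_b", "faction_x", 100, 120),
]
born = {"person_a": 90, "person_c": 300}
participations = [("person_a", "battle", 120), ("person_c", "battle", 120)]

contradictions = find_contradictions(memberships)
anachronisms = find_anachronisms(born, participations)
orphans = find_orphans(["person_a", "person_b", "lost_item"],
                       [("person_a", "person_b")])
print(contradictions, anachronisms, orphans)
```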

Next steps after this POC

~~Implement the consistency detection rules behind the 4 stub tools (T5).~~ Done.
~~Add the embedding-based semantic search plugin.~~ Done for Image.caption (plugins/embeddings.py); extending it to any future Person.summary text remains open.
Add an LLM client that consumes the gateway with the reasoning harness system prompt and runs the 5 question types from the design.

The v1 design in lore-engine/docs/ is the contract. This POC is the proof of shape.
