kanban-dev 99535a8f3a docs(v2): T8 — update README + CHANGELOG + 3 worked-example docs

- README.md: 5 plugins / 19 tools (matches /healthz); 'what this proves'
  now lists consistency engine, multi-world namespace, LLM consumer;
  'next steps' section replaced with 'shipped in v2'
- docs/CONSISTENCY_DEMO.md: 4 tools, 5 violations, all output verified
  against live bash examples/test_consistency.sh
- docs/MULTI_WORLD_DEMO.md: list_worlds() + entity_context in both
  worlds + cross-world isolation tests, all output verified live
- docs/LLM_CONSUMER_DEMO.md: 5 question types, 9 distinct tools, all
  output traced to examples/results/*.json
- CHANGELOG.md: v1 -> v2 entry, all 9 task refs (T1-T9)
- examples/test_e2e.sh: T7 E2E validation script (untracked)

2026-06-17 00:45:30 +00:00

docs

docs(v2): T8 — update README + CHANGELOG + 3 worked-example docs

2026-06-17 00:45:30 +00:00

examples

docs(v2): T8 — update README + CHANGELOG + 3 worked-example docs

2026-06-17 00:45:30 +00:00

gateway

T2: pgvector image embeddings — plugin, schema, seed, hook, tests

2026-06-16 14:30:10 +00:00

neo4j

T2: pgvector image embeddings — plugin, schema, seed, hook, tests

2026-06-16 14:30:10 +00:00

plugins

Merge branch 'wt/t2-pgvector' into main

2026-06-17 00:39:40 +00:00

postgres

T6: multi-world namespace — world_id on every node, list_worlds(), arda_greyscale seed

2026-06-16 23:09:40 +00:00

scripts

docs(v2): T1 — push v1 + open v2 milestone board

2026-06-16 14:23:31 +00:00

tests

v2.T5: implement 4 consistency tools — 5/5 violations surfaced

2026-06-16 23:14:34 +00:00

.env.seed

T3: consistency plugin skeleton (4 violation tools, 4 Neo4j labels)

2026-06-16 14:21:48 +00:00

.gitignore

T2: pgvector image embeddings — plugin, schema, seed, hook, tests

2026-06-16 14:30:10 +00:00

CHANGELOG.md

docs(v2): T8 — update README + CHANGELOG + 3 worked-example docs

2026-06-17 00:45:30 +00:00

docker-compose.yml

T2: pgvector image embeddings — plugin, schema, seed, hook, tests

2026-06-16 14:30:10 +00:00

README.md

docs(v2): T8 — update README + CHANGELOG + 3 worked-example docs

2026-06-17 00:45:30 +00:00

seed.py

v2.T5: implement 4 consistency tools — 5/5 violations surfaced

2026-06-16 23:14:34 +00:00

test.sh

T6: multi-world namespace — world_id on every node, list_worlds(), arda_greyscale seed

2026-06-16 23:09:40 +00:00

README.md

lore-engine-poc

Proof of concept for the Lore Engine v1.1 architecture.

Five-minute goal: prove that with mock data, we can run a multi-database backend (Neo4j for the world graph, Postgres for operational records, MinIO for blob/image storage) and expose it all through a plugin-driven MCP gateway — where adding a new domain type is a new file in plugins/, not a Go change.

What's running

| Container | Image | Port | Role |
|---|---|---|---|
| `lore-neo4j` | `neo4j:5.26-community` | 7474 (browser), 7687 (bolt) | The world graph: people, factions, eras, events, lineage, time-bounded relations |
| `lore-postgres` | `pgvector/pgvector:pg16` | 5432 | Trade log, image manifests, audit, image embeddings |
| `lore-minio` | `minio/minio:latest` | 9000 (S3), 9001 (console) | Image blob storage |
| `lore-gateway` | built locally | 8765 (MCP JSON-RPC) | The plugin-driven gateway |

The five plugins (this is the proof)

plugins/
├── world.py       # entity_context, was_true_at, state_at   (Neo4j)
├── lineage.py     # ancestors_of, descendants_of, lineage_of (Neo4j)
├── trade.py       # log_trade, trades_by_buyer, market_price (Postgres)
├── images.py      # register_image, recall_images, search_images_by_caption
│                  #                                          (MinIO + Postgres + Neo4j)
├── embeddings.py  # embed_images, search_images_semantic    (Postgres + pgvector)
└── consistency.py # find_contradictions, find_anachronisms, find_orphans,
                   # find_ontology_violations                (Neo4j)

The gateway also exposes one admin tool for the world namespace: list_worlds.

Tool counts and plugin membership are reported live by the gateway itself: curl -s http://localhost:8765/healthz returns the canonical list. As of v2, /healthz reports 19 tools across 5 plugins. See docs/LLM_CONSUMER_DEMO.md for an end-to-end driver that exercises them.

Each plugin is a single file with a register(registry) entry point. The gateway auto-loads every .py file in plugins/ at startup. Adding a tool needs no change to server.py: drop a new file in, restart the container, and the new tools appear in tools/list.
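
A minimal sketch of what such a plugin file could look like. Only the register(registry) entry point comes from this README; the Registry class, its tool() method, and the count_members tool are hypothetical stand-ins, since the real registry API lives in the gateway source.

```python
# Hypothetical sketch: only the register(registry) entry point is from the README.
# Registry and its tool() method are stand-ins for the gateway's real API.
from typing import Any, Callable, Dict


class Registry:
    """Toy registry mapping tool names to callables (assumed shape)."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn


def count_members(faction: str) -> Dict[str, Any]:
    """Illustrative tool body; a real plugin would query Neo4j or Postgres."""
    roster = {"example_faction": ["person_a", "person_b"]}  # mock data, not the seed
    members = roster.get(faction, [])
    return {"faction": faction, "count": len(members)}


def register(registry: Registry) -> None:
    """Entry point the gateway would call once per plugin file at startup."""
    registry.tool("count_members", count_members)


# Simulate what the gateway might do when it loads the file.
registry = Registry()
register(registry)
result = registry.tools["count_members"]("example_faction")
```

The point of the shape is that the plugin file owns both the tool body and its registration, so the loader never needs to know tool names in advance.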

How to run it

cd /root/lore-engine-poc
docker compose up -d --build
# wait ~30s for neo4j + postgres + minio to be ready
docker exec -i lore-neo4j cypher-shell -u neo4j -p lore-dev-password < neo4j/init.cypher
docker compose exec -T postgres psql -U lore -d lore < postgres/init.sql
python3 seed.py
# gateway is now live on :8765

The seed.py script is idempotent (uses MERGE and ON CONFLICT). It loads:

3 eras (1st Age, 2nd Age, Age of Iron)
10 people (Theron, Maric, Aldric, Elara, Cael, Yssa, Vex, Alessia, Kael, Guildmaster Torren)
3 factions (House Vyr, The Crimson Pact, Merchants Guild)
4 locations (Valdorn, Mardsville, Thornwall Keep, Black Spire Pass)
4 items (Sword of Eventide, The Pale Ledger, Ruby Eye of Kael, Elara's Locket)
6 events
1 lineage group
~20 time-bounded relations
3 trade log entries
4 generated images (portraits + landscape + battle scene) uploaded to MinIO
5 hand-crafted consistency violations pre-materialized as :Contradiction, :Anachronism, :Orphan, and :OntologyViolation nodes (see docs/CONSISTENCY_DEMO.md)
1 parallel world, arda_greyscale — a minimal mirror of the default world with no overlapping node ids (see docs/MULTI_WORLD_DEMO.md)
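
The idempotency mentioned above (MERGE on the Neo4j side, ON CONFLICT on the Postgres side) can be illustrated with a small sketch. It uses the standard-library sqlite3 module as a stand-in for Postgres, and the era table and its ids are hypothetical; the real schema is in postgres/init.sql and neo4j/init.cypher.

```python
# Sketch of an idempotent seed step, assuming a hypothetical "era" table.
# sqlite3 stands in for Postgres; both accept INSERT ... ON CONFLICT DO NOTHING.
import sqlite3


def seed(conn: sqlite3.Connection) -> int:
    """Insert the mock rows and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS era (id TEXT PRIMARY KEY, label TEXT)"
    )
    rows = [
        ("1st_age", "1st Age"),          # ids are illustrative assumptions
        ("2nd_age", "2nd Age"),
        ("age_of_iron", "Age of Iron"),
    ]
    conn.executemany(
        "INSERT INTO era (id, label) VALUES (?, ?) ON CONFLICT(id) DO NOTHING",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM era").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = seed(conn)
second_run = seed(conn)  # re-running adds nothing and raises no error
```

Running the seed twice leaves the same three rows, which is the property that makes python3 seed.py safe to repeat.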

Try the gateway

List all tools

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python3 -m json.tool

Look up Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"entity_context","arguments":{"name":"Aldric Raventhorne"}}
  }' | python3 -m json.tool
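
The same call can be made from Python. This is a sketch using only the standard library; the endpoint and the JSON-RPC envelope are copied from the curl examples, while the helper names build_call and call_tool are illustrative and not part of the repository.

```python
# Sketch of a Python client for the gateway's tools/call method.
# build_call and call_tool are hypothetical helper names.
import json
import urllib.request
from typing import Any, Dict

GATEWAY_URL = "http://localhost:8765/mcp"  # endpoint used by the curl examples


def build_call(name: str, arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Build the JSON-RPC 2.0 envelope for a tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def call_tool(name: str, arguments: Dict[str, Any],
              url: str = GATEWAY_URL, timeout: float = 10.0) -> Dict[str, Any]:
    """POST the request and return the decoded JSON response."""
    body = json.dumps(build_call(name, arguments)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Building the payload needs no running gateway.
payload = build_call("entity_context", {"name": "Aldric Raventhorne"})
```

With the containers up, call_tool("entity_context", {"name": "Aldric Raventhorne"}) should return the same JSON that the curl command prints.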

Time-bounded query: was House Vyr allied with the Merchants Guild in 230 TA?

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{
      "name":"was_true_at",
      "arguments":{
        "relation":"ALLIED_WITH",
        "subject":"House Vyr",
        "object":"Merchants Guild",
        "at_time":"2nd_age.year_230"
      }
    }
  }' | python3 -m json.tool
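
The check behind a time-bounded query can be sketched in plain Python. The real tool runs a Cypher query; here the era ids and their order, the valid_from/valid_to names, the inclusive bounds, and the example window are all assumptions made for illustration, not values from the seed.

```python
# Sketch of a time-bounded edge check, assuming keys shaped like "<era>.year_<n>".
from typing import Optional, Tuple

ERA_ORDER = {"1st_age": 0, "2nd_age": 1, "age_of_iron": 2}  # assumed ids and order


def parse_time(key: str) -> Tuple[int, int]:
    """Turn '2nd_age.year_230' into a sortable (era_index, year) pair."""
    era, _, year_part = key.partition(".year_")
    return ERA_ORDER[era], int(year_part)


def was_true_at(valid_from: str, valid_to: Optional[str], at_time: str) -> bool:
    """True if at_time falls inside the window; valid_to=None means still open."""
    t = parse_time(at_time)
    if t < parse_time(valid_from):
        return False
    return valid_to is None or t <= parse_time(valid_to)


# Illustrative window only; the seed's actual alliance dates may differ.
inside = was_true_at("2nd_age.year_200", "2nd_age.year_250", "2nd_age.year_230")
after = was_true_at("2nd_age.year_200", "2nd_age.year_250", "age_of_iron.year_1")
open_ended = was_true_at("2nd_age.year_200", None, "age_of_iron.year_1")
```

Comparing (era_index, year) tuples keeps the ordering correct across era boundaries without converting everything to one absolute calendar.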

Lineage: Aldric's ancestors

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"ancestors_of","arguments":{"person":"Aldric Raventhorne","generations":5}}
  }' | python3 -m json.tool

Image recall: show me pictures of Aldric

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"recall_images","arguments":{"entity_id":"aldric"}}
  }' | python3 -m json.tool

The response includes a presigned_url — a MinIO URL valid for 1 hour. The LLM (or the calling client) can fetch the actual PNG from there.

Search images by caption

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_by_caption","arguments":{"q":"aldric"}}
  }' | python3 -m json.tool

Semantic image search (pgvector)

The embeddings plugin encodes each image's caption into a 384-dim vector with a local sentence-transformer model (all-MiniLM-L6-v2) and stores it in Postgres via the pgvector extension. Queries are encoded the same way and ranked by cosine distance. Unlike search_images_by_caption, this works on natural-language descriptions and doesn't require keyword overlap.

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"search_images_semantic","arguments":{"q":"a noble lord with a scar"}}
  }' | python3 -m json.tool

Returns Aldric's portrait as the top match. Try "a sneaky thief in a hood" for Vex. The first call triggers a one-time ~80MB model download on the gateway host; the model is then cached in ~/.cache/torch, so subsequent calls skip the download.
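
The ranking step can be sketched without a database. pgvector provides cosine distance as its <=> operator; whether the plugin uses that exact operator is an assumption. The toy 3-dim vectors and image ids below are made up and stand in for the 384-dim caption embeddings.

```python
# Sketch of ranking by cosine distance over toy vectors (not real embeddings).
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query: Sequence[float],
         images: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (image_id, distance) pairs, nearest first."""
    scored = [(image_id, cosine_distance(query, vec)) for image_id, vec in images.items()]
    return sorted(scored, key=lambda pair: pair[1])


images = {
    "portrait_a": [0.9, 0.1, 0.0],   # hypothetical ids and vectors
    "landscape_b": [0.0, 0.2, 0.9],
}
ranked = rank([1.0, 0.0, 0.0], images)
```

Because the score depends only on vector direction, a query can match a caption with which it shares no keywords.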

If you add new images via register_image, embeddings are computed in the background by a daemon thread on the gateway — no separate job queue needed. Re-running embed_images is a no-op for images that already have embeddings.

Market price for the Pale Ledger

curl -s -X POST http://localhost:8765/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"tools/call",
    "params":{"name":"market_price","arguments":{"item_id":"pale_ledger"}}
  }' | python3 -m json.tool

What this proves

The plugin boundary works. A new domain type (trade, images, embeddings, consistency) is a new file in plugins/. No change to server.py, no change to docker-compose, no new container. Restart the gateway and the new tools are live. The consistency plugin (added in v2.T5) is the most recent example — four violation-detection tools, all in one file.
Polyglot storage is real, not aspirational. Neo4j holds the typed world graph. Postgres holds the time-series operational data, image manifests, and the image_embedding vectors (pgvector). MinIO holds the image bytes. Each store does what it's good at; the gateway composes the answers.
Time is a first-class query primitive. was_true_at checks time-bounded edges with a single Cypher query, with no LLM and no inference involved. Year-level precision works against the mock data (see the 2nd_age.year_230 example above).
Image recall works. Images are stored in MinIO, linked to entities in Neo4j ((:Image)-[:DEPICTS]->(:Person)), and discoverable by entity id, by tag, by caption substring search, or by natural-language description via the search_images_semantic (pgvector) tool. Presigned URLs are generated on the fly.
The consistency engine is real. The four find_* tools query pre-materialized violation nodes in Neo4j and return structured {violations, count} envelopes — not booleans, not error strings. The seed.py:seed_violations step computes the violations from the same heuristics (overlapping MEMBER_OF windows, Person.born > event_year, orphan entities, and :OntologyRule-driven checks) so the math is visible in plain Python — not hidden in Cypher. See docs/CONSISTENCY_DEMO.md for the five hand-crafted violations the seed surfaces.
Multiple worlds live in one graph. Every world-scoped node and edge carries a world_id property, and the read tools accept a world_id argument (defaulting to "default"). The v2.T6 seed loads a parallel arda_greyscale world with no overlapping node ids, and list_worlds() returns both. See docs/MULTI_WORLD_DEMO.md for the worked example.
An LLM can drive the whole surface. examples/llm_consumer.py is a real driver that takes a natural-language question, calls the gateway's tools/list, picks the right tool(s), and answers in prose — all wired through the local LiteLLM proxy. 5 question types × 9 distinct tools exercised, all answers hand-verified against the seed. See docs/LLM_CONSUMER_DEMO.md and examples/REPORT.md.
The world is small but real. 10 people + 9 greyscale-world people, 6 events, 5 images (4 default + 1 greyscale), ~20 relations — enough to demonstrate the architecture end-to-end across two parallel worlds. Scaling is a separate problem; this is the proof of shape.
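
One of the consistency heuristics named above, overlapping MEMBER_OF windows, can be sketched as follows. The row layout, integer years within a single era, and the violation fields are assumptions made for illustration; only the {violations, count} envelope and the heuristic itself come from this README.

```python
# Sketch of the overlapping-membership heuristic with a {violations, count} envelope.
from itertools import combinations
from typing import Any, Dict, List, Tuple

# (person, faction, start_year, end_year); hypothetical row shape
Membership = Tuple[str, str, int, int]


def find_overlapping_memberships(rows: List[Membership]) -> Dict[str, Any]:
    """Flag a person who belongs to two different factions at the same time."""
    violations: List[Dict[str, Any]] = []
    for a, b in combinations(rows, 2):
        same_person = a[0] == b[0]
        different_faction = a[1] != b[1]
        overlap = a[2] <= b[3] and b[2] <= a[3]  # closed-interval intersection
        if same_person and different_faction and overlap:
            violations.append({
                "type": "Contradiction",
                "person": a[0],
                "factions": sorted([a[1], b[1]]),
                "overlap": [max(a[2], b[2]), min(a[3], b[3])],
            })
    return {"violations": violations, "count": len(violations)}


rows = [
    ("p1", "faction_a", 100, 200),   # mock rows, not the seed data
    ("p1", "faction_b", 150, 250),
    ("p2", "faction_a", 100, 120),
]
report = find_overlapping_memberships(rows)
```

Returning a structured envelope rather than a boolean lets a caller report which memberships clash and over which years.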

What's not in this POC

No LLM in the loop at runtime. The LLM consumer is a separate example: the MCP gateway itself is a tool server, and the LLM client (Claude, GPT, anything reachable via the LiteLLM proxy) is the consumer. This is intentional, because the POC validates the data and tool layers, not the LLM reasoning. The reasoning harness is in the design docs (lore-engine/docs/07-reasoning-harness.md); examples/llm_consumer.py implements v1.1 of that harness against the live gateway.
No world-builder UI. Everything is curl and cypher-shell. The UI is a v3 feature.
No reflective memory or behavior layer. The Stanford Generative Agents pattern (memory stream + reflection + planning) is a v3 borrow per the comparison in lore-engine/docs/16-comparison.md.

Shipped in v2

What was on the v1 "next steps" list, and what it became in v2:

~~Implement the consistency detection rules behind the 4 stub tools (T5).~~ Done — see plugins/consistency.py and docs/CONSISTENCY_DEMO.md. 4 tools, 5 violations surfaced from the seed.
~~Add the embedding-based semantic search plugin (uses the Image.caption and any future Person.summary text).~~ Done — see plugins/embeddings.py and docs/LLM_CONSUMER_DEMO.md. 384-dim MiniLM, pgvector cosine distance, background embedding on register_image.
~~Add an LLM client that consumes the gateway with the reasoning harness system prompt and runs the 5 question types from the design.~~ Done — see examples/llm_consumer.py and examples/REPORT.md. 5 questions, 9 distinct tools, all hand-verified against seed ground truth.
v2 extras not on the v1 list: the multi-world namespace with the arda_greyscale parallel seed (T6); the :OntologyViolation rule-driven detection in addition to the original three classes (T5); and a fresh-clone smoke test (scripts/ci-smoke.sh) that exercises the gateway end-to-end from a clean state (T1).

Languages

Go 47.5%

Python 45.4%

Shell 5.6%

Dockerfile 0.8%

Cypher 0.7%
