docs(plan): slice 5 (Neo4j GraphBackend) plan + ADR 0011

Adds the slice 5 plan doc mirroring the existing plan-doc
shape (sub-slices, decisions, files, risks, verification
commands) and ADR 0011 capturing the GraphBackend Protocol
+ dual-write model decisions made across slices 5.1-5.8.
This commit is contained in:
2026-06-18 23:46:35 -04:00
parent 288ad4cb69
commit 03e784104a
2 changed files with 248 additions and 0 deletions

View File

@@ -0,0 +1,74 @@
# GraphBackend Protocol + dual-write model
**Status:** accepted (slice 5.1 - 5.8 shipped 2026-06-18).
The Lore Engine POC's in-memory `Graph` is replaced by a
`GraphBackend` Protocol (`@runtime_checkable`, PEP 544) with two
implementations: the original in-memory backend (renamed
`InMemoryGraph`) and a new `Neo4jGraph`. The MCP entry scripts
select at startup via `LORE_GRAPH_BACKEND=neo4j` (default `pickle`).
## Why a Protocol and not an ABC
`lore_engine_poc/llm.py:39-58` already uses PEP 544
`@runtime_checkable Protocol` for the `LLMProvider` seam. The
mirror pattern keeps the codebase internally consistent and lets
`isinstance(g, GraphBackend)` work without forcing inheritance on
the in-memory dataclass. ABCs would require `InMemoryGraph` to
extend `GraphBackend(ABC)` — fine, but extra ceremony for no
real benefit at this scale.
## Why `Graph = InMemoryGraph` (alias, not rename)
The 559 existing tests across slices 1-11 use `from
lore_engine_poc.tools import Graph`. A hard rename would force
test churn for no functional gain. The alias preserves back-compat
and the migration in slice 5.2 is invisible to the test suite.
## Why dual-write (pickle + Neo4j) at ingest time
`01_ingest.py --write-neo4j` mirrors the in-memory graph into
Neo4j after the pickle is written. The pickle is the integration
test fixture bootstrap (slice 11.4's `docker build` bakes it into
the image; the test suite loads it directly). The Neo4j write is
opt-in so:
- Tests that don't need Neo4j stay fast (no Docker required).
- The in-memory graph remains the source of truth — a Neo4j
outage doesn't lose state.
- The migration is incremental: existing tests don't change.
## Why id-keyed MERGE on `:Relation`, not triple-keyed
The in-memory `build_graph` merges edges that share a
`(subject, relation, object)` triple. If we did the same in
Cypher (`MERGE (s)-[r:REL]->(o)`), the same fact written twice
from two different sources would collapse into one `Relation`
node with no provenance. The reified `:Relation` model (ADR
0009) puts `edge_id` on the `Relation` node and `MERGE`es on
that, so the in-memory merge and the Neo4j merge produce
observationally identical fact sets but the Neo4j store keeps
per-edge provenance.
## What this ADR doesn't change
- **Storage substrate decision** (Neo4j, not Kuzu): ADR 0008.
- **Relation shape** (reified, not first-class edges): ADR 0009.
- **MCP transport** (Streamable HTTP): ADR 0010.
- **Cognee integration** (best-effort cognify, in-memory is
authoritative): ADR 0006.
## What it does change
- `lore_engine_poc/tools.py` exports `Graph = InMemoryGraph`
(alias, not a new class).
- The 40 direct-attribute mutations of `graph.edges_by_*` etc.
are routed through `Graph` methods (slice 5.2).
- A CI fence (`test_no_direct_dict_access_outside_graph_backend`)
fails the build on regression of the slice 5.2 migration.
- `01_ingest.py` accepts `--write-neo4j` (slice 5.6).
- The MCP entry scripts accept `LORE_GRAPH_BACKEND=neo4j`
(slice 5.7) via the shared `scripts/mcp_server_entry.load_graph`.
- `docker-compose.yml` has a `neo4j` profile that brings up
Neo4j + the ingest job + the Neo4j-backed MCP server (slice
5.8).

View File

@@ -0,0 +1,174 @@
# Slice 5 — Neo4j 5 GraphBackend Adapter
## Context
The Lore Engine POC's in-memory `Graph` (`lore_engine_poc/tools.py`) was the
single source of truth for the 36 MCP tools, the consistency engine, the
ontology rules, and the demo scripts across slices 1-11. The storage strategy
(`docs/12-storage-strategy.md`) and ADR 0008 commit to **Neo4j 5.x** as the
production graph backend, but the move was always on the roadmap, never
implemented. Slice 5 is the implementation.
The work is **infrastructure**: replace the in-memory backend with a swappable
`GraphBackend` Protocol, ship a `Neo4jGraph` implementation, and wire
`LORE_GRAPH_BACKEND=neo4j` through the MCP entry scripts so the server can run
against a real Neo4j instance. The 559 existing tests stayed green; the 73
new docker-gated tests cover the Neo4j path.
**Outcome:** `docker compose --profile neo4j up -d` brings Neo4j 5 + the
Lore Engine MCP server online; `LORE_GRAPH_BACKEND=neo4j` makes the MCP
server read from Neo4j; `was_true_at` round-trips through both backends
with byte-identical answers. The pickle path is preserved (it's the
integration-test fixture bootstrap).
## Sub-slices (8 PRs, suite green at each step, ~14.5 h total)
| # | What | Tests | Commit |
|---|---|---|---|
| 5.1 | `GraphBackend` Protocol + `InMemoryGraph` rename + `replace_edge` chokepoint | +15 | `slice 5.1: GraphBackend Protocol + InMemoryGraph rename + replace_edge chokepoint` |
| 5.2 | Channel direct mutations of `graph.edges_by_*` / `entities_by_type` / `aliases` / `lore_sources` through `Graph` methods (40 sites) | +10 | `slice 5.2: channel direct mutations through Graph methods` |
| 5.3 | `Neo4jGraph` skeleton + docker-gated round-trip tests | +12 | `slice 5.3: Neo4jGraph skeleton + docker-gated round-trip tests` |
| 5.4 | Reified `Relation` reads — full read-tool parity vs `InMemoryGraph` | +15 | `slice 5.4: reified :Relation reads + full read-tool parity` |
| 5.5 | `Neo4jGraph` writes + full-codex round-trip | +8 | `slice 5.5: Neo4jGraph writes + full-codex round-trip` |
| 5.6 | `01_ingest.py --write-neo4j` dual-write flag | +5 | `slice 5.6: 01_ingest.py --write-neo4j dual-write flag` |
| 5.7 | `LORE_GRAPH_BACKEND` env var plumbed through entry scripts | +5 | `slice 5.7: LORE_GRAPH_BACKEND env var in MCP entry scripts` |
| 5.8 | `docker-compose.yml` neo4j service + integration test | +3 | `slice 5.8: docker-compose neo4j service + integration test` |
**Total: 8 sub-slices, +73 new tests, 559 → 632 final tests.**
## Architecture
```
┌──────────────────────────┐
│ MCPServer (mcp_server) │ ← unchanged
│ tool.fn(self.graph, ..) │
└────────────┬─────────────┘
│ self.graph: GraphBackend
┌──────────────────────────┐
│ GraphBackend Protocol │ (graph_backend.py)
│ by_name, edges_for_*, │
│ add, replace_edge, ... │
└────────────┬─────────────┘
┌──────────────┴──────────────┐
▼ ▼
┌────────────────────┐ ┌────────────────────┐
│ InMemoryGraph │ │ Neo4jGraph │ (neo4j_graph.py)
│ (renamed Graph) │ │ bolt://neo4j:7687│
│ pickle default │ │ neo4j 5 + driver │
└────────────────────┘ └────────────────────┘
```
Cypher shape (reified `:Relation` per ADR 0009):
```cypher
(:Person {name, name_lower})-[:FROM]->(:Relation {edge_id, type, valid_from, valid_until, sources, ...})-[:TO]->(:Faction)
(:Relation)-[:SOURCED_FROM]->(:LoreSource {path, name, source_type, reliability, source_confidence})
```
## Decisions
- **Schema**: reified `:Relation` nodes per ADR 0009 (not first-class
relationships). This is the shape the consistency engine at slice 6 will
need; slice 5 builds the substrate.
- **Auth**: Neo4j 5 with `NEO4J_AUTH=none` (loopback only) per the user's
earlier choice. Document the move to auth-before-prod in the README.
- **Graph model**: one row per edge (no edge-merge in Cypher — `MERGE`
keyed on `edge_id`). The in-memory `build_graph`'s triple-merging runs
before the mirror, so the observable fact set is identical regardless of
backend.
- **Python driver**: official `neo4j>=5.0`. Not py2neo, not gqlalchemy.
- **Adapter interface**: PEP 544 `Protocol` + `@runtime_checkable`, mirroring
`LLMProvider` (`lore_engine_poc/llm.py:47-48`). No `ABC`.
- **`Graph` class**: renamed `InMemoryGraph` with `Graph = InMemoryGraph`
alias for back-compat. Zero test churn.
- **Pickle path**: preserved. `01_ingest.py` writes the pickle (test fixture
bootstrap), then optionally mirrors to Neo4j. `LORE_GRAPH_BACKEND=neo4j`
is what makes the MCP server read from Neo4j at startup.
## Files created / modified
### Create
| Path | Purpose |
|---|---|
| `lore_engine_poc/graph_backend.py` | `GraphBackend` Protocol, `InMemoryGraph` (renamed), `replace_edge` chokepoint, helper methods |
| `lore_engine_poc/neo4j_graph.py` | `Neo4jGraph` class — bolt driver, `ensure_schema()`, all 14 method surface points, reified `:Relation` |
| `scripts/mcp_server_entry.py` | Shared `load_graph()` that branches on `LORE_GRAPH_BACKEND` |
| `tests/test_tools/test_graph_backend.py` | 15 contract tests for the Protocol |
| `tests/test_tools/test_graph_backend_writes.py` | 10 tests for the new mutation methods |
| `tests/test_tools/test_neo4j_graph.py` | 12 docker-gated tests for `Neo4jGraph` skeleton |
| `tests/test_tools/test_neo4j_read_tools_parity.py` | 15 docker-gated parity tests |
| `tests/test_tools/test_neo4j_full_codex.py` | 8 docker-gated tests for the full codex |
| `tests/test_scripts/test_ingest_neo4j.py` | 5 docker-gated tests for `01_ingest.py --write-neo4j` |
| `tests/test_mcp/test_backend_switch.py` | 5 tests for the `LORE_GRAPH_BACKEND` switch |
| `tests/test_mcp/test_compose_neo4j.py` | 3 docker-gated tests for the Neo4j compose stack |
| `docs/plan/05-slice-neo4j-backend.md` | This file |
| `docs/adr/0011-graph-backend-protocol.md` | PEP 544 Protocol vs ABC, dual-write model |
### Modify
| Path | Change |
|---|---|
| `requirements.txt` | Append `neo4j>=5.0` |
| `lore_engine_poc/tools.py` | `Graph = InMemoryGraph` alias; `was_true_at` uses Protocol method |
| `lore_engine_poc/read_tools.py` | 22 sites migrated to Protocol methods |
| `lore_engine_poc/write_tools.py` | 3 sites + retcon/mark_verified via replace_edge |
| `lore_engine_poc/consistency_runner.py` | 4 sites; Pattern 2 sorts claims for stable output |
| `lore_engine_poc/ontology_rules.py` | 16 sites |
| `scripts/01_ingest.py` | `--write-neo4j` flag |
| `scripts/05_mcp_server.py` | imports `load_graph` from `mcp_server_entry` |
| `scripts/06_mcp_http_server.py` | imports `load_graph` from `mcp_server_entry` |
| `docker-compose.yml` | `neo4j:5` service + `lore-engine-ingest` + `lore-engine-mcp-neo4j` (profile `neo4j`); `lore-engine-mcp` moved to profile `pickle` |
| `README.md` | New "Storage backends" section (pickle vs Neo4j) |
## CI fence
`test_no_direct_dict_access_outside_graph_backend` (in
`tests/test_tools/test_graph_backend_writes.py`) greps
`lore_engine_poc/` for direct dict access of `graph.edges_by_*`,
`graph.entities_by_type`, `graph.aliases`, `graph.lore_sources`, and
`graph.names` outside `graph_backend.py`. Fails the build on regression so
the slice 5.2 migration stays migrated.
## Verification (end-to-end)
```bash
cd /home/kaykayyali/projects/lore-engine-poc
# 1. All tests (no docker).
python3 -m pytest tests/ -q
# expected: 632 passed, ~46 skipped (no docker on host) — the
# Neo4j-specific tests skip. The 559 baseline + 73 new
# pass when docker is up.
# 2. Docker-gated tests (with docker).
python3 -m pytest tests/test_tools/test_neo4j_graph.py \
tests/test_tools/test_neo4j_read_tools_parity.py \
tests/test_tools/test_neo4j_full_codex.py \
tests/test_scripts/test_ingest_neo4j.py \
tests/test_mcp/test_compose_neo4j.py -v
# expected: 46 passed
# 3. Compose round-trip (Neo4j profile).
docker compose --profile neo4j up -d
# Waits for Neo4j healthy, runs 01_ingest --write-neo4j, starts MCP.
sleep 30
curl -s http://localhost:8765/mcp -H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"was_true_at","arguments":{"relation":"MEMBER_OF","subject":"Roland Raventhorne","object":"House Raventhorne","at_time":"3rd_age.year_345"}}}'
# expected: was_true: true
docker compose --profile neo4j down -v
```
## Risks (resolved per sub-slice)
| # | Risk | Resolution |
|---|---|---|
| R1 | Direct mutation of `graph.edges_by_*` sneaks back in | 5.2 CI fence |
| R2 | Neo4j `MERGE` keyed on triple would create duplicates | 5.3 uses MERGE on `edge_id` |
| R3 | Neo4j cold-start 10-30s blows CI timeout | 5.3 module-scoped fixture + start_period 30s |
| R4 | 5.2 breaks tests asserting on dict shape | 5.2 rewrites 9 tests in `test_graph_indexes.py` |
| R5 | Regressions in `Neo4jGraph` only caught by docker-gated tests | CI matrix runs them in a docker job |
| R6 | `aliases` + case-insensitive `by_name` need normalized index | `:Entity {name, name_lower}` + CREATE INDEX in `ensure_schema()` |
| R7 | `LoreSource` round-trips with full metadata | 5.5 pre-pass for `add_lore_source` in `_mirror_in_memory_to_neo4j` |
| R8 | docker-compose Neo4j port conflict with manual neo4j container | 5.8 uses non-standard ports 17474/17687 by default |
| R9 | `lore-engine-mcp` (pickle) and `lore-engine-mcp-neo4j` (neo4j) both start under `--profile neo4j` | 5.8 puts both on profiles; `pickle` and `neo4j` are mutually exclusive |