Per docs/plan/exec/07-harness.md sub-slice 7.2:
- lore_engine_poc/prompts/system_prompt.md — the
canonical system prompt. 5 question types with
canonical tool sequences, the citation rule
("cite every claim"), the time-window rule
(default at_time, explicit time in answer), the
contradiction rule (surface, don't resolve), the
6 failure modes the LLM must avoid. v1.2-aware:
mentions the slice 5T TypeTemplate tools and the
slice 6 Setting/Plane setting= filter.
- lore_engine_poc/prompts/registry.json — the
version registry. Pins the system prompt to v1.2
with model_target=minimax-m3:cloud. Old runs stay
comparable when the prompt iterates (D3).
- lore_engine_poc/prompts/loader.py — the loader.
list_registered_prompts() and load_current_system_prompt()
are the canonical entry points; the harness
runner uses them to fetch the prompt + stamp
results with the version.
- tests/harness/test_system_prompt.py — 9 tests:
registry well-formed, system_prompt registered,
path resolves, loader returns (text, version),
prompt has 5 question types, citation rule
present, time-window rule present, mentions
template tools, mentions setting filter.
Track A only (no API key). Track B uses the loader
when executing the harness.
Suite: 767 → 776 (+9).
Co-Authored-By: Claude <noreply@anthropic.com>
Two changes:
1. apply_plane_migration now counts only *newly
materialised* planes (existence-check before add),
so the PlaneMigrationSummary.planes_added field
reflects the graph delta on re-runs. Previously it
counted planes processed, which made "idempotency"
invisible in the summary shape.
2. New tests/test_planes/test_killer_demo.py — the
slice 6 end-to-end regression net. Wires the 6.4
backfill + 6.6 migration + 6.5 setting filter +
6.2 graph layer in one graph. Pins:
- cross-setting facts are filtered out under
setting="mardonari"
- within-setting facts survive the filter
- entities_present + events_during honour
the setting filter
- the full slice 6 pipeline is idempotent at
both the plane layer and the LAYER_OF edge
layer
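Illustrative sketch of the existence-check-before-add counting described in change 1. This is not the shipped apply_plane_migration: ToyGraph, planes_seen, and the (plane_id, kind) candidate shape are assumptions made for the example; only planes_added, find_plane, and add_plane are names taken from these notes.

```python
from dataclasses import dataclass, field


@dataclass
class PlaneMigrationSummary:
    planes_added: int = 0  # graph delta: planes that did not exist before
    planes_seen: int = 0   # hypothetical extra counter, for contrast only


@dataclass
class ToyGraph:
    # Hypothetical stand-in for the real graph backend.
    planes: dict = field(default_factory=dict)

    def find_plane(self, plane_id):
        return self.planes.get(plane_id)

    def add_plane(self, plane_id, kind):
        self.planes[plane_id] = {"id": plane_id, "kind": kind}


def apply_plane_migration_sketch(graph, candidates):
    """Count only planes that were newly materialised on this run."""
    summary = PlaneMigrationSummary()
    for plane_id, kind in candidates:
        summary.planes_seen += 1
        if graph.find_plane(plane_id) is None:  # existence check before add
            graph.add_plane(plane_id, kind)
            summary.planes_added += 1
    return summary
```

Under these assumptions a second run over the same candidates reports planes_added == 0, which is what makes idempotency visible in the summary.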
Suite: 756 → 761 (+5).
Co-Authored-By: Claude <noreply@anthropic.com>
Adds the slice 6.6 migration as a library helper
(lore_engine_poc.migration: scan_codex_for_planes /
apply_plane_migration) plus a thin CLI wrapper at
scripts/05_migrate_planes.py.
Discriminator: frontmatter signals — plane: true,
tags contains plane, OR type: plane — promote an
entry to :Plane. Default Material Plane convention
remains mardonari.material (added by the 6.4
backfill). Idempotent: re-running on the same codex
produces the same graph state.
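A minimal sketch of the discriminator, assuming the frontmatter has already been parsed into a dict. The function name is_plane_entry and the tolerance for a single-string tags value are assumptions for illustration, not the library's API.

```python
def is_plane_entry(frontmatter: dict) -> bool:
    """Return True if any of the three frontmatter signals is present."""
    if frontmatter.get("plane") is True:       # plane: true
        return True
    tags = frontmatter.get("tags") or []
    if isinstance(tags, str):                  # assumed: a lone tag may be a string
        tags = [tags]
    if "plane" in tags:                        # tags contains plane
        return True
    return frontmatter.get("type") == "plane"  # type: plane
```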
LAYER_OF edges are created only between co-referenced
Planes (markdown body [[X]] → Plane node X). Direction
follows docs/17-planes.md: (:Plane A)-[:LAYER_OF]->
(:Plane B) where A is the layer and B is the parent.
In the voldramir fixture, Voldramir(demiplane) LAYER_OF
Underdark(plane).
The script supports --dry-run (print planned changes,
exit 0) and --codex / --setting overrides.
Suite: 747 → 756 tests (+9). No regressions.
Co-Authored-By: Claude <noreply@anthropic.com>
Per AC 6.5, 6.6 — adds a keyword-only 'setting' parameter to:
- lookup (filter on matched name's setting membership)
- entity_context (entity-level filter; empty shape if excluded)
- was_true_at (both subject and object must be in setting)
- true_during (subject must be in setting)
- entities_present (located entity must be in setting)
- events_during (event subject must be in setting)
The filter resolves via graph.setting_entities(setting_id) — O(1)
reverse lookup from slice 6.2's Protocol methods. An unknown
setting returns empty results (defensive). Omitting 'setting'
preserves slice 4 / 9 behaviour (back-compat fence).
MCP tool schemas updated for all 6 entries to expose 'setting'
as an optional [string, null] parameter; opt_string_params
toggled so 'null' is coerced to None by the dispatcher.
The cross-setting fact test (Roland ENCOUNTERED The Wanderer)
is the canonical LLM-target: with setting='mardonari', Roland's
home setting, the answer is was_true=False because The Wanderer
is in the_wild_dream.
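Sketch of the was_true_at rule (both subject and object must be in the setting), using the Roland / The Wanderer case. The helper name passes_setting_filter and the plain dict standing in for graph.setting_entities() are assumptions for the example.

```python
def passes_setting_filter(members_by_setting, subject, obj, *, setting=None):
    """Return True if the fact survives the optional setting filter."""
    if setting is None:
        # Omitting 'setting' keeps the pre-filter behaviour.
        return True
    # An unknown setting has no members, so every fact is filtered out.
    members = members_by_setting.get(setting, set())
    return subject in members and obj in members


members = {
    "mardonari": {"Roland"},
    "the_wild_dream": {"The Wanderer"},
}
cross_setting = passes_setting_filter(
    members, "Roland", "The Wanderer", setting="mardonari"
)
```

Here cross_setting is False because The Wanderer is not a member of mardonari.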
+8 tests (739 → 747). All green. No regressions.
Adds lore_engine_poc.migration.migrate_setting_id_to_exists_in()
— idempotent helper that materialises a :Setting node, a default
Material Plane, and per-entity EXISTS_IN facts. Used by:
- 01_ingest.py on every ingest (safe; idempotent)
- scripts/05_migrate_planes.py (slice 6.6 region↔plane pass)
Per docs/17-planes.md: the Material Plane's id is
{setting_id}.material and its kind is 'material'. The EXISTS_IN
fact is the timeless type-assertion (per docs/17-planes.md +
ADR 0009); time-bounded membership is the slice 6.5 reified
:Relation work.
API:
migrate_setting_id_to_exists_in(
graph, setting_id, *,
entity_ids=None, # explicit; no all_names() walk
current_era='unspecified',
schema_version='1.2',
kind='campaign',
) -> BackfillSummary
Returns a BackfillSummary so the caller can verify what changed
(test surface: 7 tests covering idempotency, default-plane
materialisation, partial overlap, multi-setting isolation, and
custom metadata).
+7 tests (732 → 739). All green. No regressions.
Extends the GraphBackend Protocol with 8 new methods (add_setting,
find_setting, add_plane, find_plane, planes_in_setting,
add_exists_in, entity_planes, setting_entities). InMemoryGraph
implements them with O(1) reverse lookups (planes_by_setting,
entities_by_setting, settings_by_entity). Neo4jGraph gains
NotImplementedError stubs so isinstance(neo4j, GraphBackend) keeps
passing until the slice 6 follow-up mirrors the Cypher.
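Sketch of the reverse-lookup idea, assuming two dict-of-set indexes kept in sync on every write. SettingIndex is a hypothetical class for illustration; the real InMemoryGraph fields may be shaped differently.

```python
from collections import defaultdict


class SettingIndex:
    """Toy index: both directions are updated together on each write."""

    def __init__(self):
        self.entities_by_setting = defaultdict(set)
        self.settings_by_entity = defaultdict(set)

    def add_exists_in(self, entity, setting_id):
        self.entities_by_setting[setting_id].add(entity)
        self.settings_by_entity[entity].add(setting_id)

    def setting_entities(self, setting_id):
        # Single dict lookup; .get() avoids creating an entry for unknown ids.
        return set(self.entities_by_setting.get(setting_id, ()))
```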
EXISTS_IN is the timeless type-assertion per docs/17-planes.md;
time-bounded membership is the slice 6.5 reified :Relation work.
+8 tests (718 → 726). All green. No regressions.
Promotes Setting and Plane from the write-tools allowlist to
first-class Layer-1 NODE_LABELS. Adds 5 plane-relation edge types
to EDGE_TYPES (EXISTS_IN, REFLECTS, LAYER_OF, ADJACENT_TO,
ACCESSIBLE_VIA).
Per docs/17-planes.md: EXISTS_IN is the timeless type-assertion
that an entity belongs to a Setting; time-bounded planar
membership is carried by a separate reified :Relation (slice 6.5).
+6 tests (712 → 718). All green. No regressions.
Templates module (package init) and the runtime registry:
* TemplateRegistry
- .reload() rescans templates_dir and rebuilds the
{template_id: TemplateSpec} dict atomically
- .query(template_id, qid, args, graph) dispatches to the
graph backend via query_cypher()
- Duplicate template ids in the same dir raise
TemplateRegistryError with both source paths
- Every loaded query is re-validated through the Cypher
allowlist, so a bad body never reaches a backend
- Coerces string-typed optional params the same way the
core tools do ("null" / "None" / "" -> None)
* dynamic_tools.build_dynamic_tools(registry)
- One MCP ToolEntry per declared query: name = query.id,
description = query.description, inputSchema derived
from parameters (required = non-optional names).
- One list_template_tools discovery entry that returns
{templates: [{id, queries: [...]}, ...]} and accepts
an optional template_id filter.
- Same _make_adapter shape as the core tools so the
existing MCPServer dispatcher serves them with no
changes.
* 15 dedicated tests in test_template_registry.py cover:
empty dir, single template, query dispatch, duplicate-id
rejection, reload invalidates cache, dynamic tool schema
shape, list_template_tools (with and without filter),
merge into TOOL_REGISTRY visible in tools/list, end-to-end
tool call (with and without rows), missing required param
yields [], unknown template id / query id raise, and the
defining test: drop a new template file, reload(), see a
new tool appear with no code change.
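Sketch of the inputSchema derivation (required = non-optional names). The parameter dict shape with 'name', 'type', and 'optional' keys is an assumption for the example; the real TemplateSpec may differ.

```python
def build_input_schema(parameters):
    """Derive a JSON Schema object from a list of parameter declarations."""
    properties = {}
    required = []
    for param in parameters:
        properties[param["name"]] = {"type": param.get("type", "string")}
        if not param.get("optional", False):
            required.append(param["name"])
    return {"type": "object", "properties": properties, "required": required}


schema = build_input_schema([
    {"name": "entity", "type": "string"},
    {"name": "at_time", "type": "string", "optional": True},
])
```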
Suite: 685 -> 700 (+15).
Co-Authored-By: Claude <noreply@anthropic.com>
The safety boundary for every template query. Every Cypher
body passes through validate() before it ever reaches the
in-memory matcher or the Neo4j driver.
What the allowlist does:
- Accepts read-only constructs: MATCH, OPTIONAL MATCH, WHERE,
RETURN, ORDER BY, SKIP, LIMIT.
- Accepts allowlisted functions (aggregate and scalar) in RETURN
(count, coalesce, min, max, sum, avg, toLower, toUpper,
length, id).
- Rejects mutations and control flow: CREATE, MERGE, SET,
DELETE, DETACH, REMOVE, CALL, UNION, WITH, FOREACH, LOAD,
USING -- with the offending keyword and a 1-based line
number.
- Rejects variable-length path patterns ('*', '*1..3').
- Enforces parameter consistency between the body and the
template's parameters: section (typo guard both ways).
- The body is never concatenated with parameter values;
the in-memory matcher / Neo4j driver binds $name via
the parameter API, so a value like "O'Brien); MERGE ..."
is a string, not Cypher.
The allowlist is the safety boundary. Everything downstream
trusts the body once validate() returns.
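A deliberately naive sketch of the keyword rejection with a 1-based line number. This is not the shipped validate(): CypherRejected and validate_sketch are hypothetical names, and the regex does not understand string literals, so it over-rejects (for example STARTS WITH). It also skips the variable-length-path and parameter-consistency checks.

```python
import re

FORBIDDEN = (
    "CREATE", "MERGE", "SET", "DELETE", "DETACH", "REMOVE",
    "CALL", "UNION", "WITH", "FOREACH", "LOAD", "USING",
)
_FORBIDDEN_RE = re.compile(r"\b(" + "|".join(FORBIDDEN) + r")\b", re.IGNORECASE)


class CypherRejected(ValueError):
    """Raised when a template body contains a forbidden keyword."""


def validate_sketch(body: str) -> None:
    # enumerate(..., start=1) gives the 1-based line number for the error.
    for lineno, line in enumerate(body.splitlines(), start=1):
        match = _FORBIDDEN_RE.search(line)
        if match:
            keyword = match.group(1).upper()
            raise CypherRejected(f"forbidden keyword {keyword} on line {lineno}")
```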
Suite: 665 -> 685 (+20).
Co-Authored-By: Claude <noreply@anthropic.com>
The docker-compose Neo4j test re-ran 01_ingest.py (via the
lore-engine-ingest service), which rebuilt the .graph.pkl
fixture. The new bytes are identical to the previous build
except for the pickle metadata header timestamp.
No semantic change.
Updates the POC README with a 'Storage backends' section
documenting the GraphBackend Protocol, the two implementations
(InMemoryGraph and Neo4jGraph), and the LORE_GRAPH_BACKEND
env-var selection in the MCP entry scripts. Adds the slice 5
plan doc (docs/plan/05-slice-neo4j-backend.md in the design
repo) and ADR 0011 capturing the Protocol + dual-write
model decisions.
Three docker-gated tests for the full Neo4j compose stack:
* test_compose_neo4j_profile_healthy: docker compose
--profile neo4j up -d brings neo4j + lore-engine-ingest
+ lore-engine-mcp-neo4j to a healthy state within 60s.
* test_compose_neo4j_was_true_at_round_trip: was_true_at
through the Neo4j-backed MCP server returns the same
answer as the pickle-backed server for a known fact
(Roland Raventhorne / House Raventhorne / 3rd_age.year_345
→ was_true: true).
* test_compose_neo4j_down_cleans_volumes: docker compose
--profile neo4j down -v removes the neo4j_data volume.
docker-compose.yml changes:
* New neo4j:5 service with NEO4J_AUTH=none, loopback
HTTP + Bolt ports (17474/17687 by default to avoid
conflict with a developer's manual neo4j on the standard
7474/7687 ports), 1GiB mem_limit, pids_limit, healthcheck
via wget on the HTTP root.
* New lore-engine-ingest service (profile neo4j) that
runs scripts/01_ingest.py --skip-cognee --write-neo4j
after Neo4j is healthy. One-shot; no restart policy.
* The pickle-backed lore-engine-mcp service moved onto
the pickle profile (so it doesn't conflict on the
same host port when the neo4j profile is active).
* New lore-engine-mcp-neo4j service (profile neo4j)
that depends on both neo4j (service_healthy) and
lore-engine-ingest (service_completed_successfully).
Same hardening as the pickle service: cap_drop ALL,
no-new-privileges, mem_limit 512m, read_only rootfs,
tmpfs /tmp.
* Named volume neo4j_data for the Neo4j store.
Profile split (pickle | neo4j) keeps the two stacks from
colliding on the same host port: each profile starts only its own MCP service.
Run with docker compose --profile pickle up -d for the
default or --profile neo4j up -d for the production
graph substrate.
Slice 11.4 test update:
* tests/test_mcp/test_dockerfile.py test_docker_compose_up_and_round_trip
now uses --profile pickle so only the pickle service
activates.
Pre-prod hardening noted in compose yml: NEO4J_AUTH=none
is loopback-only; switch to a username/password and update
LORE_NEO4J_URI before exposing beyond loopback. Tracked in
docs/plan/05-slice-neo4j-backend.md.
Suite: 629 -> 632 passed (+3 compose-neo4j tests, all 559
baseline + 50 Neo4j + consistency + ingest + backend-switch
+ compose-neo4j tests preserved). The plan's 632 final-test
target is reached.
Both MCP entry scripts (05_mcp_server.py for stdio and
06_mcp_http_server.py for Streamable HTTP) now select their
graph backend at startup through a shared loader
(scripts/mcp_server_entry.load_graph):
* LORE_GRAPH_BACKEND=pickle (default) — load the
.graph.pkl built by 01_ingest.py.
* LORE_GRAPH_BACKEND=neo4j — connect to Neo4j at
$LORE_NEO4J_URI (default bolt://127.0.0.1:7687) and
load the mirrored graph.
* Anything else — clear error and exit 4.
Exit codes:
* 0: graph loaded. For the supported backends load_graph()
returns normally; it calls sys.exit() only on the failure
paths below.
* 1: pickle path missing.
* 2: neo4j_graph not importable.
* 3: neo4j unreachable.
* 4: unknown backend value.
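Sketch of the env-var dispatch and the exit-4 path only. select_backend is a hypothetical helper, not the real load_graph; the pickle and Neo4j loading steps (exit codes 1 to 3) are omitted.

```python
import os
import sys

SUPPORTED_BACKENDS = ("pickle", "neo4j")


def select_backend(env=None):
    """Return the backend name, or exit 4 on an unknown value."""
    env = os.environ if env is None else env
    backend = env.get("LORE_GRAPH_BACKEND", "pickle")  # pickle is the default
    if backend not in SUPPORTED_BACKENDS:
        print(f"unknown LORE_GRAPH_BACKEND={backend!r}", file=sys.stderr)
        sys.exit(4)
    return backend
```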
Neo4jGraph.__init__ now eagerly calls verify_connectivity()
so the loader fails loudly at startup rather than on the
first query — the driver pool opens sockets lazily otherwise,
and the first session.run would be too late for the
entry scripts to log a clear error.
Refactors:
* scripts/05_mcp_server.py: removed inline _load_graph(),
now imports from scripts.mcp_server_entry.
* scripts/06_mcp_http_server.py: same.
* lore_engine_poc/neo4j_graph.py: Neo4jGraph.__init__
eagerly verifies connectivity.
Tests:
* tests/test_mcp/test_backend_switch.py — 5 docker-gated
tests (pickle default, neo4j up, neo4j down exits 3,
garbage backend exits 4, trivial registry works with
both backends).
Suite: 624 -> 629 passed (+5 backend-switch tests, all 559
baseline + 38 Neo4j + consistency + ingest + backend-switch
tests preserved).
After the in-memory graph + pickle are written, the new flag
mirrors the full graph into the Neo4j 5 container at
$LORE_NEO4J_URI (default bolt://127.0.0.1:7687). The flag
is opt-in (default off) so the existing test suite's
invocations of 01_ingest.py without Docker still work.
The mirror logic:
* Pre-pass for LoreSource nodes (full metadata via
add_lore_source so SOURCED_FROM links find them with
name, source_type, reliability, source_confidence).
* Pre-pass for bare names (entities registered without
any edge participation — keeps :Entity count in sync
with in-memory all_names()).
* Then the edges, add()-ed one by one.
Failure semantics:
* Neo4j unreachable at startup → log + exit 3.
* neo4j_graph not importable → log + exit 2.
* Pickle is always written before the mirror attempt, so
a flaky Neo4j container never loses the in-memory state.
Consistency runner stability:
_detect_contradictions Pattern 2 (same object, different
subjects) now sorts the two claims alphabetically so
claim_a / claim_b are stable across runs. The
graph.all_names() set iteration order is otherwise
non-deterministic across Python processes and across
the in-memory / Neo4j backends, and the original
dict-iteration insertion order broke when slice 5.4
migrated to all_names().
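Sketch of the ordering contract, assuming each claim can be reduced to a sortable key (plain strings here). stable_pair is a hypothetical helper, not the runner's code.

```python
def stable_pair(claim_x, claim_y):
    """Return (claim_a, claim_b) in lexicographic order, whatever the input order."""
    claim_a, claim_b = sorted((claim_x, claim_y))
    return claim_a, claim_b
```

With this in place the pair is the same whichever claim the set iteration yields first.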
Tests:
* tests/test_scripts/test_ingest_neo4j.py — 5 docker-gated
tests (exits zero, entity count, relation count,
default-off untouched, fails loud on unreachable URI).
* tests/test_consistency/test_runner_categories.py — one
test updated to assert claim_a/claim_b as a set rather
than a specific order (matches the runner's new
lexicographic-sort contract).
Suite: 619 -> 624 passed (+5 ingest-neo4j tests, all 559
baseline + 32 Neo4j + consistency + ingest tests preserved).
Eight docker-gated tests covering the write + full-codex
round-trip against the real .graph.pkl fixture:
* Build a Neo4jGraph from the real codex (smoke test).
* Entity count matches in-memory (corrected for the
LoreSource-as-subject edge case).
* Relation count matches in-memory.
* LoreSource count matches in-memory.
* was_true_at Roland / House Raventhorne / 3rd_age.year_345
— both backends agree.
* was_true_at Aldric / Maric sibling query.
* was_true_at Voldramir / Mardonus PART_OF query.
* Add a new edge via the Neo4j backend and verify
was_true_at against Neo4j sees it (write/read
round-trip in the same process).
The mirror helper (_mirror_in_memory_to_neo4j) pre-passes
LoreSource nodes (full metadata) and bare registered names
(entities that don't participate in any edge) so the Neo4j
backend is observationally equal to the in-memory graph
for the full codex.
Suite: 611 -> 619 passed (+8 full-codex tests, all 559
baseline + 27 Neo4j tests preserved).
Implements the read + write surface of Neo4jGraph against the
reified :Relation shape (ADR 0009). The read tools (slice 4) and
the consistency runner / ontology rules (slice 2) are migrated
to use only GraphBackend Protocol methods, so the same Python
code works against both InMemoryGraph and Neo4jGraph.
Reads (Neo4jGraph):
* edges_for_subject(name, relation=None) -> list[Edge]
* edges_for_object(name) -> list[Edge]
* find_edge_by_id(edge_id) -> Edge | None
* by_name, all_names, all_entity_types, entities_of_type,
lore_source (slice 5.3)
* Round-trip: each Edge field is stored as a :Relation node
property and rehydrated on read; Cypher ORDER BY edge_id
so list order matches the in-memory insertion order
Writes (Neo4jGraph):
* add(edge): MERGE subject + object :Entity nodes, upsert
:Relation (id-keyed), link :FROM/:TO, link :SOURCED_FROM
to each :LoreSource in the edge's sources list
* replace_edge(old_id, new_edge): in-place property update
for same (subject, relation, object); drop+re-add for
different endpoints (preserves edge_id for retcon audit)
* remove_entity(name): DETACH DELETE the :Entity + alias
cleanup; returns the number of edges that were attached
* remove_entity_of_type(name, type_): REMOVE n:Label
* rename_entity(old, new): rename + register old as alias
* resolve_alias, register_name, register_alias, add_lore_source,
add_entity_of_type (slice 5.3)
Migrations (read tools + consistency + ontology):
* tools.py: was_true_at uses graph.edges_for_subject(...)
* read_tools.py: 22 sites of graph.edges_by_subject.get /
.items / .values / graph.edges_by_object.get / graph.entities_by_type
.items / graph.lore_sources.get / graph.names migrated to the
Protocol methods
* consistency_runner.py: 4 sites (all_edges flatten,
anachronism detector, orphan detector)
* ontology_rules.py: 13 sites (10 ontology rules + helper)
* write_tools.py: 3 sites (label membership check, era walk)
CI fence (test_graph_backend_writes.py):
test_no_direct_dict_access_outside_graph_backend now greps for
the broader pattern (bracket, .get, .items, .values, .keys on
graph.edges_by_*, entities_by_type, lore_sources, aliases; and
bare graph.names). Fails the build on regression.
Parity tests (test_neo4j_read_tools_parity.py): 15 docker-gated
tests, one per read tool, asserting the in-memory and Neo4j
backends produce matching answers for a known fixture.
Suite: 596 -> 611 passed (+15 parity tests, 559 baseline preserved)
- Migrate add_lore_source direct mutation in write_tools.py:189
to graph.add_lore_source(source)
- Migrate update_entity type relabel to use graph.all_entity_types() +
graph.remove_entity_of_type() + graph.add_entity_of_type() (loop
over an explicit list since dict shape is no longer public)
- Migrate retcon (36 lines of inlined index surgery) to single
graph.replace_edge(edge_id, new_edge) call
- Migrate mark_verified (9 lines of inlined index surgery) to
graph.replace_edge(edge_id, new_edge)
- Add all_entity_types() method to InMemoryGraph (returns keys
of entities_by_type; was reachable as a private attr before)
- Add 9 tests in test_graph_backend_writes.py: add_lore_source,
remove_entity_of_type, register_name, entities_of_type,
retcon+mark_verified chokepoint contracts, edges_for_subject
with None, replace_edge preserves list position, CI fence
- CI fence: greps lore_engine_poc/ for graph.edges_by_*[,
graph.entities_by_type[, graph.lore_sources[, graph.aliases[
outside graph_backend.py; fails the build on regression
Suite: 575 -> 584 passed (+9 new tests, 559 baseline preserved)
- Lift Graph dataclass from tools.py into graph_backend.py as
InMemoryGraph (the slice-0/4/10 body, byte-identical).
- New GraphBackend Protocol (PEP 544 + @runtime_checkable) with
14 method points (7 read, 7 write). Mirrors the LLMProvider
pattern in lore_engine_poc/llm.py:47-48.
- tools.Graph is now a back-compat alias (Graph = InMemoryGraph).
Zero test churn across the 559 existing tests.
- New replace_edge(old_id, new_edge) chokepoint. Lifts the
inlined index surgery that lived in write_tools.py retcon +
mark_verified. Same-endpoint swap is in-place; subject/
relation/object change drops + re-adds.
- New helpers: edges_for_subject, edges_for_object, entities_of_type,
lore_source, all_names, add_lore_source, remove_entity_of_type,
register_alias, register_name.
- 16 contract tests in tests/test_tools/test_graph_backend.py.
- Suite: 559 -> 575. No regressions.
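Sketch of the PEP 544 + @runtime_checkable pattern with two of the method names listed above. GraphBackendSketch and ToyGraph are illustrative only; the real Protocol has more methods. Note that a runtime isinstance() check against a Protocol only verifies that the methods exist, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class GraphBackendSketch(Protocol):
    def edges_for_subject(self, name, relation=None): ...
    def all_names(self): ...


class ToyGraph:
    """Satisfies the protocol structurally; no inheritance needed."""

    def __init__(self):
        self._edges = {}  # subject -> list of (subject, relation, object)

    def edges_for_subject(self, name, relation=None):
        edges = self._edges.get(name, [])
        return [e for e in edges if relation is None or e[1] == relation]

    def all_names(self):
        return set(self._edges)
```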
Co-Authored-By: Claude <noreply@anthropic.com>
- MAX_BODY_BYTES = 1 MiB; reject with HTTP 413 + -32600 envelope
before the JSON parser allocates a Python object. Closes the
OOM-by-giant-body DoS vector.
- Drop dead try/except ImportError fallback for ERR_* constants —
always import from mcp_server (same package).
- stream() typing: AsyncIterator[bytes] (was Iterable[bytes]).
- build_app(graph: Graph, tool_registry: list) parameter types.
- Drop unused CONTENT_JSON constant.
- New test: test_post_oversized_body_rejected (HTTP 413).
- New test: test_post_unknown_method_returns_32601 — symmetric
with the stdio server.py coverage of the same path.
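Sketch of the size gate, assuming the raw request bytes are available before parsing. check_body_size, the error message text, and the (status, envelope) return shape are assumptions for the example; only MAX_BODY_BYTES, HTTP 413, and code -32600 come from these notes.

```python
MAX_BODY_BYTES = 1024 * 1024  # 1 MiB


def check_body_size(raw: bytes):
    """Return (status, envelope) when the body is too large, else None."""
    if len(raw) > MAX_BODY_BYTES:
        envelope = {
            "jsonrpc": "2.0",
            "id": None,
            "error": {"code": -32600, "message": "Request body too large"},
        }
        return 413, envelope
    return None  # caller may now hand the bytes to the JSON parser
```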
Co-Authored-By: Claude <noreply@anthropic.com>
* Add _wants_sse() and SSE branch in mcp_endpoint:
text/event-stream in Accept -> StreamingResponse with one
'event: message\ndata: <json>\n\n' frame. Default JSON path
unchanged. Empty body now rejected with 400 + -32600
(previously coerced to {}).
* 5 new in-process tests (10-14): Accept routing, SSE body shape,
GET 405, empty body 400. 538 -> 543 green.
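Sketch of the Accept check and the single SSE frame described above. wants_sse and sse_frame are hypothetical standalone helpers, not the server's _wants_sse(); the framework wiring (StreamingResponse) is left out.

```python
import json


def wants_sse(accept_header):
    """True when the client lists text/event-stream in Accept."""
    return "text/event-stream" in (accept_header or "").lower()


def sse_frame(payload: dict) -> bytes:
    """Encode one JSON-RPC response as a single 'message' event."""
    # json.dumps emits no raw newlines by default, so one data: line suffices.
    return f"event: message\ndata: {json.dumps(payload)}\n\n".encode("utf-8")
```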
- mcp_tools.TOOL_REGISTRY goes 24 → 36 entries (12 new write tools)
- Exposes: add_entity, add_relation, add_lore_source (slice 4.7 trio
that had been callable from scripts/02_demo.py only), plus
set_alias, update_entity, delete_entity (10.1), retcon,
mark_verified, merge_entities (10.2), define_calendar,
define_era, define_date (10.3)
- Hand-written JSON Schema per tool; trailing-underscore wire
fields (name_, object_) match the Python kwarg convention
used by the underlying functions
- test_tool_registry.py: EXPECTED_TOOLS / EXPECTED_FN grown to 36
entries; the schema-vs-signature drift detector (already in
place) validates the trailing-underscore convention
- test_protocol.py: tools/list count 24 → 36
- test_slice10_dispatch.py: 12 new dispatch tests, one per
new tool; retcon / mark_verified verify envelope shape only
because edge_id doesn't survive a subprocess restart (in-memory
graph) — actual mutation behaviour is covered in
test_write_tools_slice10b.py
- Suite 529/529 green (was 517; +12)
Co-Authored-By: Claude <noreply@anthropic.com>
- ALLOWED_LABELS gains 'Date' (Era, Calendar were already there)
- write_tools.define_calendar: name + optional days_per_year / months;
rejects empty/duplicate name and non-positive days/months
- write_tools.define_era: name + calendar + start [+ end]; validates
time bounds; stamps PART_OF Calendar edge and, when applicable,
PRECEDED edge to the most recent prior era in the same calendar
(linear ordering; world-builder can override with retcon)
- write_tools.define_date: calendar + year [+ month + day + era];
canonical time atom is '{era}.year_{Y}.month_{M}.day_{D}' (era
prefix optional); stamps INSTANCE_OF Calendar + DURING Era;
idempotent — calling twice with the same args returns the same
canonical and does not duplicate the date node
- 24 new tests in tests/test_tools/test_write_tools_slice10c.py
- Suite 517/517 green (was 493; +24)
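Sketch of the canonical time atom format. canonical_date is a hypothetical helper (the real define_date also takes a calendar), and rejecting a day without a month is an assumption made here, not documented behaviour.

```python
def canonical_date(year, month=None, day=None, era=None):
    """Build '{era}.year_{Y}.month_{M}.day_{D}' with optional parts omitted."""
    if day is not None and month is None:
        raise ValueError("day requires month")  # assumed rule for the sketch
    parts = []
    if era:
        parts.append(era)  # era prefix is optional
    parts.append(f"year_{year}")
    if month is not None:
        parts.append(f"month_{month}")
    if day is not None:
        parts.append(f"day_{day}")
    return ".".join(parts)
```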
Co-Authored-By: Claude <noreply@anthropic.com>
- Edge.edge_id: stable per-edge identity (8 hex chars, default factory)
- Graph.edges_by_id: dict[str, Edge] reverse index, populated by add()
- Graph.find_edge_by_id(id): O(1) lookup
- Graph.rename_entity: also registers old name as alias of new canonical
(merge_entities depends on this)
- Graph.remove_entity: keeps edges_by_id consistent with subject/object
indexes
- add_relation: returns the actual edge.edge_id (was fabricating a separate
uuid), so retcon / mark_verified can target it directly
- Edge.retcon_at / retcon_note: audit metadata stamped by retcon
- Edge.verified_by / verified_at / verified_note: stamped by mark_verified
- write_tools.retcon: amend edge bounds/relation/object; in-place mutation
via dataclasses.replace; validates time bounds; refuses inverted bounds
- write_tools.mark_verified: appends (1.0, 1.0, 'human_verified') source
tuple so aggregate confidence floors to 1.0
- write_tools.merge_entities: folds from_name into to_name, refuses if the
two have different labels, preserves from_name as an alias
- 25 new tests in tests/test_tools/test_write_tools_slice10b.py
- Suite 493/493 green (was 468; +25)
Co-Authored-By: Claude <noreply@anthropic.com>
- Graph.aliases: dict[str, set[str]] field; Graph.by_name follows aliases
- Graph.remove_entity(name) -> int: cascades through edges_by_subject/object,
type index, and aliases; returns edges removed
- Graph.rename_entity(old, new) -> int: re-points edges via dataclasses.replace,
preserves old name as alias of new canonical
- write_tools.set_alias: register alt name; rejects empty / duplicate
- write_tools.update_entity: label/rename with edge cascade; name_ kwarg to
avoid colliding with positional name (MCP layer maps user-facing name JSON
field to name_)
- write_tools.delete_entity: removes entity + all touching edges + aliases
- 20 new tests in tests/test_tools/test_write_tools_slice10.py
- Suite 468/468 green (was 448; +20)
Co-Authored-By: Claude <noreply@anthropic.com>
The 12 read_tools in lore_engine_poc.read_tools (entity_context,
true_during, entities_present, timeline, list_lineage, list_offspring,
ancestors_of, descendants_of, location_hierarchy, event_chain,
events_during, lore_about) were already implemented + unit-tested
in tests/test_tools/ but had not been exposed over the MCP wire.
This slice is pure registration: hand-written JSON Schema + adapter
binding for each tool, no changes to the underlying functions.
- mcp_tools.py: TOOL_REGISTRY goes from 12 → 24 entries. Docstring
updated to reflect the new total.
- test_tool_registry.py: EXPECTED_TOOLS / EXPECTED_FN grown to 24
entries; new tools' signatures cross-checked against the schema
by the existing schema-vs-signature test (caught zero drift).
- test_protocol.py: tools/list test updated to 24 tools; the
"multiple requests on one connection" test likewise.
- test_slice9_dispatch.py: 13 new subprocess tests, one per new
tool (entity_context has 2: happy path + unknown entity). Each
test boots scripts/05_mcp_server.py and verifies the response
shape against the real seed codex.
Live smoke: mcp_server.tools/list returns 24 tools, and tools/call
returns correct data for list_offspring, ancestors_of, etc.
448/448 tests pass (was 435 pre-slice; +13 from new dispatch tests).
- 01_ingest.py: LORE_INGEST_LLM=1 enables LLM extraction after the
deterministic path; build_graph is now called AFTER LLM triples
merge in (the 3.4 ordering fix).
- LORE_INGEST_FAKE_LLM=1 + LORE_INGEST_FAKE_LLM_SCRIPT=path selects
FakeProvider for offline/CI runs.
- Missing OLLAMA_API_KEY degrades gracefully: stderr warning, rc=0,
deterministic graph still built (no crash, no LLM triples).
- scripts/06_llm_smoke.py: one-shot manual smoke for the real
Ollama Cloud provider; loads one NPC, runs extractor, prints
triples. Skips (rc=0, helpful message) when OLLAMA_API_KEY unset.
- FakeProvider gains dict-style {match_any, response} / {match_any,
raise} entries so tests can skip exact-prompt matching when the
body is large.
- tests/test_extraction/test_ingest_wiring.py: 8 subprocess tests
covering default-off, enabled, idempotency (x2), adds-fact,
provider-failure tolerance, bad-JSON tolerance, and missing-key
fallback.
- tests/fixtures/llm_empty_script.json: [] (used by the enabled-
path test where no triples are expected).
435/435 tests pass (was 382 pre-slice; +53). End-to-end ingest with
--skip-cognee runs cleanly on default-off path.
- New lore_engine_poc/consistency_config.py: ConsistencyConfig dataclass
with disable_rules[], severity (default 'warn' per AC 2.8),
confidence_threshold (per-rule floor), acknowledged set (AC 2.9).
is_disabled(rule_id), is_acknowledged(id), acknowledge(id) helpers.
- ConsistencyRunner.run() now accepts an optional config parameter;
applies severity override, skips disabled rules, suppresses below
threshold, suppresses acknowledged violations.
- Anachronism dataclass now carries source_confidences (parallel
to sources) so confidence_threshold can suppress low-confidence
findings. Default = 1.0 when not set.
- get_anachronisms() got an include_flagged param (default False);
flagged violations are hidden by default.
- 9/9 new tests; full suite 245/245 (was 236).
Co-Authored-By: Claude <noreply@anthropic.com>
- New lore_engine_poc/consistency_tools.py: 10 tools from
docs/05-mcp-tools.md. Each is a thin function over a singleton
ConsistencyRunner plus its last-violations list.
Tool                     Function
-----------------------  ---------------------------------------------
run_consistency_check    Force-run the engine; returns ConsistencyRun
latest_run               Most recent run summary (or None)
get_contradictions       Filter by subject/severity/limit
get_anachronisms         Filter by entity/limit
get_orphans              Filter by reason/limit
get_ontology_violations  Filter by rule_id/severity/limit
flag_for_review          Set flagged=True on a violation (acknowledge)
explain_violation        Return rule + edges + sources for a violation
add_ontology_rule        Register a new rule (ValueError on dup id)
list_ontology_rules      All registered rules (≥10 starter rules)
- Tools share a module-level singleton ConsistencyRunner. Tests
reset via the private _reset_runner hook.
- 19/19 new tests; full suite 236/236 (was 217).
Co-Authored-By: Claude <noreply@anthropic.com>
- New lore_engine_poc/ontology_rules.py: 10 starter rules from
docs/05-mcp-tools.md#starter-rules, each as a pure-Python callable
that takes a Graph and returns a list of OntologyViolation nodes.
Rule ids: no-overlapping-rulers, no-overlapping-spouses,
no-anachronism-participation, no-anachronism-rule, no-orphan-events,
no-orphan-locations, lineage-continuity, magic-system-coherence,
deity-worship-coherence, item-lineage. Severity: 'error' for the two
no-overlapping-* rules, 'warn' for the rest per AC 2.8.
- ConsistencyRunner.run() now invokes every registered rule (Category C)
in addition to A/B/D. rules_run=10 in the ConsistencyRun summary.
- Improved _edge_window_overlap: standard interval-overlap test
[a_from, a_until) ∩ [b_from, b_until) (half-open: a_from == b_until
is NOT overlap).
- 13/13 new tests; full suite 217/217 (was 204).
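Sketch of the half-open interval-overlap test, assuming both windows have comparable numeric bounds. windows_overlap is a hypothetical helper; the real _edge_window_overlap works on time atoms and may handle open-ended windows, which this does not.

```python
def windows_overlap(a_from, a_until, b_from, b_until):
    """Half-open overlap: [a_from, a_until) and [b_from, b_until)."""
    # Strict inequalities make touching windows (a_from == b_until) non-overlapping.
    return a_from < b_until and b_from < a_until
```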
Co-Authored-By: Claude <noreply@anthropic.com>
- New lore_engine_poc/consistency.py: 4 violation dataclasses (Contradiction,
Anachronism, Orphan, OntologyViolation) + ConsistencyRun summary node.
All severity=warn by default per AC 2.8; flagged=False for acknowledge
mechanism per AC 2.9; 4 distinct classes per AC 2.1.
- New lore_engine_poc/consistency_runner.py: ConsistencyRunner walks an
in-memory Graph and emits:
* Category A (Contradiction) — two patterns: same-subject-different-object
(e.g. Aldric in two Factions at once) and same-object-different-subject
(e.g. two family trees give Aldric different fathers). Time-window
overlap required.
* Category B (Anachronism) — Person participates in event outside
inferred lifespan. Lifespan inferred from MEMBER_OF(Lineage) edges.
* Category D (Orphan) — entity with no outgoing edges and not referenced
as object anywhere. Uses pre-baked reason vocabulary from
docs/04-consistency.md §Category D.
* ConsistencyRun summary node with id, started_at, finished_at,
duration_ms, rules_run, *_found counts. latest_run() returns the
most recent summary.
- 21/21 new tests pass; full suite 204/204 (was 183; no slice 0/1 regressions).
Co-Authored-By: Claude <noreply@anthropic.com>