docs(plan): 01-slice-impl-plan.md — TDD sub-slices for structured YAML ingest
176
docs/plan/01-slice-impl-plan.md
Normal file
@@ -0,0 +1,176 @@
# Slice 1: TDD Implementation Plan

**Owner:** this loop (Claude).
**Scope:** `docs/plan/01-slice-structured-yaml.md` (the AC table is
the contract). Implementation lives in
`~/projects/lore-engine-poc/` alongside slice 0.
**TDD rule:** every new behaviour ships with a failing test first;
the test names follow `test_<AC>_<description>` so a single
`pytest --collect-only` makes AC coverage visible.
## Decision points locked before coding starts

| # | Decision | Source | Implication |
|---|---|---|---|
| D1 | Edges with time bounds are reified `:Relation` nodes (not native edges) | ADR 0009 | The in-memory `Edge` from slice 0 stays the substrate; YAML parsers emit time-bounded edges with `valid_from`/`valid_until` populated. |
| D2 | `Lineage` ≠ `Faction` | ADR 0003 | `family_tree.yaml` produces `Lineage` nodes only; `factions.yaml` produces `Faction` nodes only. No "House" leakage between them. |
| D3 | `LoreSource` is a first-class node | slice 1 AC 1.9, 1.10 | Move the existing `LoreSource` dataclass out of `parsers.py` into a graph-side module so it's not just an attribute on triples. |
| D4 | `time_in_window` is the only time predicate | 02-time-model.md + slice 0 | All structured-YAML edges go through `time_in_window` (which the demo's `was_true_at` already calls). |
| D5 | YAML loader: `PyYAML` `safe_load` only | 06-ingestion.md §YAML | Reject the Norway problem (YAML 1.1 resolves a bare `NO` to boolean `False`) via strict schema, not the YAML parser. |
| D6 | File-relative validity: a YAML file is one source | slice 0 dual-confidence | Every triple in `family_tree.yaml` shares the file's `reliability` (default `canonical`). |
## Sub-slice ordering and parallelisation

The 7 sub-slices below respect the dependency order
`parsers → LoreSource → validation → time tests → seed YAMLs → demo`.
Independent items can fan out to sub-agents in the same iteration.
```
1.1 family_tree parser ───┐
                          ├──> 1.3 LoreSource as graph node ──┐
1.2 factions parser ──────┘                                   │
                                                              ├──> 1.5 schema validation + idempotent re-ingest
                     ┌──> 1.4 timeline/gazetteer/             │
                     │    bestiary/magic/culture              │
                     │                                        │
                     └──> 1.6 time_model ≥30 cases            │
                                                              │
                                                              ├──> 1.7 seed/ YAMLs + demo extension
```
- **1.1 ∥ 1.2** (no shared code) → sub-agent A and B in parallel.
- **1.4** (5 parsers) splits internally into 5 sub-tasks; one parser per sub-agent, all reading the same schema-validation contract from 1.5's draft.
- **1.6** is independent of 1.1–1.5: it only touches `time_model.py`. Run it as a stand-alone sub-agent.
- **1.3, 1.5, 1.7** are integration points; do them in the main loop.
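The fan-out above can be sketched with the standard-library `graphlib` module. The dependency edges below are an assumption read off the diagram and bullets (1.3 waits on 1.1 and 1.2; 1.5 waits on 1.3 and 1.4; 1.7 waits on 1.5 and 1.6); the plan does not state them as data, and `parallel_batches` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Assumed dependency edges: sub-slice -> the sub-slices it waits on.
DEPENDS_ON = {
    "1.1": set(),
    "1.2": set(),
    "1.3": {"1.1", "1.2"},
    "1.4": set(),
    "1.5": {"1.3", "1.4"},
    "1.6": set(),
    "1.7": {"1.5", "1.6"},
}


def parallel_batches(depends_on):
    """Group sub-slices into batches whose members can run concurrently."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the plan is cyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # unblock whatever waited on this batch
    return batches


BATCHES = parallel_batches(DEPENDS_ON)
```

Under these assumed edges the first batch holds every sub-slice with no prerequisites, which is the set that can go to sub-agents in one iteration.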
## AC → test map

Each AC row gets at least one pytest. Tests live in
`tests/test_parsers/test_<parser>.py` and
`tests/test_time_model.py`. The mapping:

| AC | Test name | Sub-slice |
|---|---|---|
| 1.1 | `test_all_six_parsers_emit_expected_edge_shape` | 1.4 |
| 1.2 | `test_family_tree_edges_carry_non_null_bounds` | 1.1 |
| 1.3 | (pytest parametrize, 30+ cases) `test_time_model_<case>` | 1.6 |
| 1.4 | `test_was_true_at_filters_by_window` | 1.1 (integration) |
| 1.5 | `test_schema_validator_rejects_malformed_with_line` | 1.5 |
| 1.6 | `test_parent_dies_before_child_birth_raises` | 1.1 |
| 1.7 | `test_reingest_is_idempotent` | 1.5 |
| 1.8 | `test_seed_yaml_files_exist` (filesystem fixture) | 1.7 |
| 1.9 | `test_lore_source_is_graph_node` | 1.3 |
| 1.10 | `test_yaml_default_reliability_canonical` | 1.3 |
| 1.11 | `test_demo_queries_exercise_time_in_window` | 1.7 (integration) |
| 1.12 | `test_family_tree_emits_lineage_not_faction` | 1.1 |
| 1.13 | `test_faction_member_has_reason_field` | 1.2 |
| 1.14 | `test_multiple_memberships_non_overlapping` | 1.2 |
| 1.15 | `test_cross_lineage_marriage_child_in_named_lineage` | 1.1 |
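One way to make the map checkable is to compare it against the node ids that `pytest --collect-only -q` prints (one `path::test_name` per line). The sketch below is an assumption about how that check could look; `AC_TO_TEST` holds only a few rows from the table, and `collected_test_names` and `missing_acs` are hypothetical helper names.

```python
import re

# A few rows copied from the map above; extend with the rest as needed.
AC_TO_TEST = {
    "1.2": "test_family_tree_edges_carry_non_null_bounds",
    "1.7": "test_reingest_is_idempotent",
    "1.9": "test_lore_source_is_graph_node",
}


def collected_test_names(collect_output):
    """Extract bare test function names from `pytest --collect-only -q` output."""
    names = set()
    for line in collect_output.splitlines():
        if "::" not in line:
            continue  # skip blank lines and the trailing summary line
        node = line.rsplit("::", 1)[1].strip()
        names.add(re.sub(r"\[.*\]$", "", node))  # drop parametrize ids
    return names


def missing_acs(collect_output, ac_to_test=AC_TO_TEST):
    """Return the ACs whose mapped test was not collected."""
    collected = collected_test_names(collect_output)
    return sorted(ac for ac, test in ac_to_test.items() if test not in collected)
```

An empty result from `missing_acs` would mean every mapped AC has at least one collected test; it says nothing about whether those tests pass.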
## Sub-slice briefs

### 1.1 family_tree.yaml parser (TDD-first)

**First failing test** (this is the gate for the whole slice):
```python
import textwrap

# Assumed import path; the plan only says the parsers live in `parsers.py`.
from lore_engine_poc.parsers import parse_structured_yaml


def test_family_tree_emits_lineage_not_faction(tmp_path):
    yaml_file = tmp_path / "ashveil.yaml"
    yaml_file.write_text(textwrap.dedent('''
        lineage: "ashveil_bloodline"
        founding_ancestor: "theron_ashveil"
        members:
          - id: "theron_ashveil"
            name: "Theron Ashveil"
            born: "1st_age.year_412"
            died: "2nd_age.year_87"
            parents: []
          - id: "aldric_raventhorne"
            name: "Aldric Raventhorne"
            born: "3rd_age.year_300"
            parents: ["theron_ashveil"]
    '''))
    # The parser takes the directory, not the single file.
    entities, triples = parse_structured_yaml(str(tmp_path))
    node_labels = {e.type for e in entities}
    assert "Lineage" in node_labels
    assert "Faction" not in node_labels
    # ...assert the exact triple list
```
Then make it pass with the smallest code possible.
### 1.2 factions.yaml parser

Same shape, parallel track. Emit one `Faction` node per file, plus
`MEMBER_OF(Faction)` edges with `valid_from = member.joined`,
`valid_until = member.left`, and a `reason` field stored as a
property on the triple (slice 4 will lift it into the tool layer).
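A minimal sketch of that mapping, assuming the file has already been loaded into a dict. The `Triple` dataclass here is a hypothetical stand-in for the slice 0 type, the top-level `faction` key is an assumption, and the sample member data is made up for illustration; only `joined`, `left`, and `reason` come from the brief.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """Hypothetical stand-in for the slice 0 triple type."""
    subject: str
    relation: str
    obj: str
    valid_from: Optional[str] = None
    valid_until: Optional[str] = None
    reason: Optional[str] = None


def membership_triples(faction_doc):
    """Emit one MEMBER_OF triple per member entry of a loaded factions file."""
    faction_id = faction_doc["faction"]  # assumed top-level key
    return [
        Triple(
            subject=member["id"],
            relation="MEMBER_OF",
            obj=faction_id,
            valid_from=member.get("joined"),
            valid_until=member.get("left"),  # None: no recorded departure
            reason=member.get("reason"),
        )
        for member in faction_doc.get("members", [])
    ]


# Made-up sample data, not taken from the seed files.
DOC = {
    "faction": "house_raventhorne",
    "members": [
        {"id": "aldric_raventhorne", "joined": "3rd_age.year_320",
         "left": "3rd_age.year_345", "reason": "sworn at the muster"},
    ],
}
TRIPLES = membership_triples(DOC)
```

Keeping `reason` on the triple itself, rather than on the member node, means two memberships of the same person can each carry their own reason.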
### 1.3 LoreSource as a graph node

Move `LoreSource` from `parsers.py` into a new
`lore_engine_poc/lore_source.py` module and expose it on `Graph` as a
separate index, `graph.sources_by_path: dict[str, LoreSource]`. Also
add a `SOURCED_FROM` triple to the output list: `source_path` is
currently only a property, and making it a first-class edge lets slice
2's consistency engine reason about it. Tests assert that
every `Triple` produced by `extract_triples` has a matching
`SOURCED_FROM` triple pointing at a `LoreSource` node.
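The assertion in that last sentence could be sketched as below. What "matching" means is an assumption here: a content triple counts as sourced when a `SOURCED_FROM` triple links its subject to its `source_path` and that path is registered in the source index. `Triple` is a hypothetical minimal type and `unsourced` a hypothetical helper name.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical minimal triple; the real type carries more fields."""
    subject: str
    relation: str
    obj: str
    source_path: str


def unsourced(triples, sources_by_path):
    """Return content triples lacking a SOURCED_FROM edge to a registered source."""
    provenance = {
        (t.subject, t.obj) for t in triples if t.relation == "SOURCED_FROM"
    }
    return [
        t for t in triples
        if t.relation != "SOURCED_FROM"
        and ((t.subject, t.source_path) not in provenance
             or t.source_path not in sources_by_path)
    ]
```

A test in this style would assert `unsourced(triples, graph.sources_by_path) == []` after ingesting a fixture directory.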
### 1.4 Five more parsers

Same TDD-first pattern. Each parser's test takes a `tmp_path` fixture
with one YAML file and asserts:
1. The expected node labels appear.
2. The expected (subject, relation, object, valid_from, valid_until)
   triples appear.
3. Re-running yields the same set (idempotency).
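The three assertions can be shared across the five parsers with one helper. This is a sketch under assumptions: `check_parser_contract` is a hypothetical name, entities are modelled as `(id, label)` pairs, and `toy_timeline_parser` is a made-up parser over an already-loaded mapping rather than any real parser from the project.

```python
def check_parser_contract(parse, source, expected_labels, expected_triples):
    """Run the three assertions from the list above against one parser.

    `parse` is any callable returning (entities, triples), where entities
    are (id, label) pairs and triples are
    (subject, relation, object, valid_from, valid_until) tuples.
    """
    entities, triples = parse(source)
    assert expected_labels <= {label for _, label in entities}
    assert expected_triples <= set(triples)
    # Idempotency: a second run must yield the same set of triples.
    _, again = parse(source)
    assert set(again) == set(triples)


def toy_timeline_parser(doc):
    """Made-up parser, for illustration only."""
    event = doc["event"]
    entities = [(event, "Event"), (doc["location"], "Location")]
    triples = [(event, "OCCURRED_AT", doc["location"], doc["start"], doc["end"])]
    return entities, triples


DOC = {"event": "battle_of_black_spire", "location": "black_spire",
       "start": "3rd_age.year_345", "end": "3rd_age.year_346"}
check_parser_contract(
    toy_timeline_parser, DOC,
    expected_labels={"Event", "Location"},
    expected_triples={("battle_of_black_spire", "OCCURRED_AT", "black_spire",
                       "3rd_age.year_345", "3rd_age.year_346")},
)
```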
### 1.5 Schema validation + idempotent re-ingest

A single `validate_family_tree(data, source_path) -> None` raises
`FamilyTreeSchemaError(message, line=N)` with the offending line.
Tests assert:
- `NO: false` (the Norway problem) is rejected.
- `parents: null` on a non-root is rejected.
- A duplicate `(lineage, member_id)` on re-ingest is silently merged,
  not duplicated (AC 1.7).
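A minimal sketch of those three checks, operating on data that has already been loaded. Several pieces are assumptions: the extra `line_of` parameter (a mapping from key or member id to line number, supplied by whatever loader tracks positions), reading "non-root" as "not the `founding_ancestor`", and the `merge_members` helper name. The Norway case relies on PyYAML's YAML 1.1 behaviour, where `safe_load("NO: false")` yields `{False: False}`.

```python
class FamilyTreeSchemaError(ValueError):
    """Schema violation in a family_tree file, with the offending line if known."""

    def __init__(self, message, line=None):
        suffix = f" (line {line})" if line is not None else ""
        super().__init__(message + suffix)
        self.line = line


def validate_family_tree(data, source_path, line_of=None):
    """Raise FamilyTreeSchemaError on the malformed shapes listed above."""
    line_of = line_of or {}  # hypothetical key/member-id -> line mapping
    for key in data:
        # A bare NO/yes/on key arrives as a bool, so any non-string
        # top-level key is treated as a Norway-problem casualty.
        if not isinstance(key, str):
            raise FamilyTreeSchemaError(
                f"{source_path}: non-string key {key!r}", line=line_of.get(key))
    founder = data.get("founding_ancestor")
    for member in data.get("members", []):
        member_id = member.get("id")
        # Null (or missing) parents are only allowed on the root ancestor.
        if member.get("parents") is None and member_id != founder:
            raise FamilyTreeSchemaError(
                f"{source_path}: member {member_id!r} has null parents",
                line=line_of.get(member_id))


def merge_members(store, data):
    """Upsert members keyed by (lineage, member_id) so re-ingest never duplicates."""
    for member in data.get("members", []):
        store[(data["lineage"], member["id"])] = member
    return store
```

Keying the store on `(lineage, member_id)` makes the second ingest of the same file an overwrite rather than an append, which is one simple way to get the silent merge AC 1.7 asks for.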
### 1.6 time_model.py ≥30 cases

Parametrize the existing 13 self-tests, plus add:
- month/day precision (`3rd_age.year_345.month_3.day_17`)
- `:Now` config node resolution (the slice 0 code path already
  takes `current_time=`; promote it to module-level config)
- half-open window edge cases
- both bounds null with `at=None` → True
- a malformed time string raises `ValueError`

Target: 30+ cases, all passing.
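To show the shape such cases could take, here is a self-contained sketch. It is not the real `time_model.py`: `parse_time` and `in_window` are hypothetical re-implementations with an assumed signature, missing month/day are assumed to default to 1, and a missing query time with no `current_time` is assumed to match only fully unbounded edges. The window is half-open, `valid_from <= at < valid_until`.

```python
import re

_TIME_RE = re.compile(
    r"^(\d+)(?:st|nd|rd|th)_age\.year_(\d+)(?:\.month_(\d+)(?:\.day_(\d+))?)?$"
)


def parse_time(text):
    """Parse '3rd_age.year_345.month_3.day_17' into a sortable tuple.

    Missing month/day default to 1, i.e. the start of the coarser period
    (an assumption of this sketch).
    """
    match = _TIME_RE.match(text)
    if match is None:
        raise ValueError(f"malformed time string: {text!r}")
    age, year, month, day = match.groups()
    return (int(age), int(year), int(month or 1), int(day or 1))


def in_window(valid_from, valid_until, at, current_time=None):
    """Half-open check: valid_from <= at < valid_until; None bounds are open."""
    at = at if at is not None else current_time
    if at is None:
        # No query time at all: only an unbounded edge counts as true.
        return valid_from is None and valid_until is None
    moment = parse_time(at)
    if valid_from is not None and moment < parse_time(valid_from):
        return False
    if valid_until is not None and moment >= parse_time(valid_until):
        return False
    return True


# (valid_from, valid_until, at, expected)
CASES = [
    ("3rd_age.year_300", "3rd_age.year_345", "3rd_age.year_300", True),   # lower bound inclusive
    ("3rd_age.year_300", "3rd_age.year_345", "3rd_age.year_345", False),  # upper bound exclusive
    ("3rd_age.year_300", None, "3rd_age.year_345.month_3.day_17", True),  # open-ended
    (None, None, None, True),                                             # unbounded, no query time
]
```

A list like `CASES` drops straight into `pytest.mark.parametrize`, so each tuple becomes one reported test case and the 30+ target is easy to count.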
### 1.7 seed/ YAMLs + demo

Three example YAMLs in `lore_engine_poc/seed/yaml/`:
- `ashveil_bloodline.yaml`: lineage (the AC 1.15 cross-lineage case)
- `house_raventhorne.yaml`: faction (AC 1.14 multi-membership)
- `battle_of_black_spire.yaml`: timeline event (AC 1.11 demo query)

`scripts/02_demo.py` gains three queries that *only* return true
inside the window (the negative case proves the time filter is real,
not bypassed).
## Spot-check protocol

After every sub-slice lands, a single `pytest -q` runs in
`lore-engine-poc/`. Sub-agent commits get reviewed before merging
into `main`:
- `git diff main..wt/<branch>`: small, scoped to one AC family
- `pytest -q` from the sub-agent's branch tip
- If anything looks out-of-scope, surface it instead of merging.
## Out of scope (deferred)

- LLM extraction (slice 3): separate track, separate slice.
- Consistency engine (slice 2): needs both 1 and 3, lands later.
- Neo4j UDF port: slice 0's pure-Python port stays as the test
  oracle; Neo4j Java port is a polish item per ADR 0008.