docs(plan): dual-confidence model + LoreSource as first-class node

Slice 0 acceptance criteria now distinguish three things the slice
proves (time filter, integration, dual-confidence). 5 new criteria
(0.11-0.15) verify the dual-confidence model in tests.

The model:
- extraction_confidence: did we extract this edge correctly?
  Frontmatter = 1.0, body-text heuristic = 0.6.
- source_confidence: how reliable is the source document?
  Lives on a LoreSource node as reliability
  (canonical=1.0 | factional=0.75 | rumor=0.5 | dialogue=0.4
  | fanon=0.3).
- aggregate confidence returned to callers = the minimum of
  (extraction * source) over all sources on the edge.

Slice 1 picks up LoreSource as a first-class graph node and
SOURCED_FROM edges from every typed edge. Path-based reliability
inference (Quests/Random/ -> rumor) ships in slice 0; slice 1
adds YAML frontmatter override and the graph node itself.

Co-Authored-By: Claude <noreply@anthropic.com>
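
The model described in the commit message can be sketched roughly as follows. This is an illustration only, not the code under `tests/test_confidence.py`: the class names `SourceConfidence` and `Edge` and their fields are hypothetical, while the reliability weights are the ones listed in the message.

```python
from dataclasses import dataclass, field

# Reliability weights as listed in the commit message.
RELIABILITY = {
    "canonical": 1.0,
    "factional": 0.75,
    "rumor": 0.5,
    "dialogue": 0.4,
    "fanon": 0.3,
}


@dataclass(frozen=True)
class SourceConfidence:
    """One source backing an edge (hypothetical shape)."""
    path: str
    extraction: float   # 1.0 for frontmatter, 0.6 for body-text heuristic
    reliability: str    # key into RELIABILITY

    @property
    def source(self) -> float:
        return RELIABILITY[self.reliability]

    @property
    def combined(self) -> float:
        # Per-source confidence: extraction * source.
        return self.extraction * self.source


@dataclass
class Edge:
    """A typed edge that keeps every per-source confidence (hypothetical shape)."""
    subject: str
    predicate: str
    obj: str
    sources: list[SourceConfidence] = field(default_factory=list)

    def aggregate(self) -> float:
        # Aggregate = minimum of extraction * source over all sources.
        if not self.sources:
            raise ValueError("an edge must cite at least one source")
        return min(s.combined for s in self.sources)
```

Under these assumptions, an edge backed by one canonical frontmatter source aggregates to 1.0, and adding a second, weaker source can only lower the aggregate, because the minimum is taken while both per-source records stay on the edge.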
@@ -36,6 +36,45 @@ time-bounded edges, the `was_true_at` query, source attribution.
 | 0.8 | Every positive result has a non-empty `sources[]` pointing to a real file | ✅ |
 | 0.9 | Cognee import works, `cognee.cognify()` reaches the LLM-call step | ✅ (fails on missing key, gracefully) |
 | 0.10 | `scripts/03_reset.py` wipes the in-memory cache and (best-effort) the Cognee dataset | ✅ |
+| 0.11 | Dual-confidence model: extraction and source dimensions are tracked separately | ✅ `tests/test_confidence.py` 6/6 |
+| 0.12 | A frontmatter edge reports `extraction=1.0, source=1.0, aggregate=1.0` | ✅ |
+| 0.13 | A body-text-inferred edge reports `extraction=0.6, source=1.0, aggregate=0.6` | ✅ |
+| 0.14 | A rumor-sourced edge reports `extraction=1.0, source=0.5, aggregate=0.5` | ✅ |
+| 0.15 | Two agreeing sources on the same fact merge into one Edge with both per-source confidences preserved | ✅ |
+
+### What this slice proves vs. what it doesn't
+
+The acceptance criteria above prove three **independent** things, and
+it's worth being explicit about which is which so slice 1 doesn't
+duplicate effort:
+
+- **0.3 proves the time filter.** `time_in_window` is the
+  load-bearing primitive. 13 self-tests cover era-tree membership,
+  `current` resolution, sub-era windows, and open bounds. **This is
+  the only place in the slice where the time logic is actually
+  exercised end-to-end** — the demo queries pass `at_time` to
+  `was_true_at`, but every edge in the POC has
+  `valid_from = valid_until = null`, so the time filter accepts
+  everything by default.
+- **0.1–0.2, 0.4–0.10 prove the integration.** Cognee substrate is
+  installable, the codex parser produces typed triples, the
+  `was_true_at` tool resolves names, walks the graph, returns the
+  documented response shape, and cites sources.
+- **0.11–0.15 prove the dual-confidence model.** Two dimensions
+  are tracked: **extraction confidence** (did we extract this edge
+  correctly? Frontmatter=1.0, body-text heuristic=0.6) and
+  **source confidence** (how reliable is the document? lives on a
+  `LoreSource` node as `reliability: canonical | factional | rumor
+  | dialogue | fanon`). The aggregate confidence returned to
+  callers is `min(extraction * source)` across all sources on the
+  edge. This unblocks the `family_tree.yaml` (slice 1) and the
+  `LoreSource` node (slice 1) without retrofitting.
+
+**Slice 1 is what couples the three.** When `family_tree.yaml`
+ships with `valid_from` / `valid_until` per edge, the demo queries
+will exercise the time filter end-to-end. When `LoreSource` ships
+as a first-class node, the `reliability` field becomes structured
+data instead of a path-inferred heuristic.
 
 ## Test plan
 
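
The time-filter bullet in the hunk above notes that edges with `valid_from = valid_until = null` pass the filter unconditionally. A minimal sketch of that open-bounds behaviour is given here, assuming plain integer years and inclusive bounds; the real `time_in_window` also handles era-tree membership and `current` resolution, which this sketch leaves out, and the signature shown is an assumption.

```python
from typing import Optional


def time_in_window(
    at_time: int,
    valid_from: Optional[int],
    valid_until: Optional[int],
) -> bool:
    """Return True if at_time falls inside [valid_from, valid_until].

    A bound of None is treated as open, so an edge with both bounds
    unset matches every query time. Inclusive bounds are an assumption.
    """
    if valid_from is not None and at_time < valid_from:
        return False
    if valid_until is not None and at_time > valid_until:
        return False
    return True
```

With both bounds set to `None` the function returns `True` for any `at_time`, which is why the slice 0 demo queries do not really exercise the filter until slice 1 supplies bounded edges.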
@@ -43,11 +43,22 @@ fuzziness. Every edge traces to a YAML line.
    `Spell` with `PRACTICES` edges.
 6. `lore_engine_poc/parsers/culture.py` — `Culture`, `Language`,
    `Deity` with `WORSHIPS` and `SPEAKS` edges.
-7. Schema validation: strict, fails loudly with line numbers (YAML
-   "gotchas" — `NO: false` parsing as `True`, tab/space sensitivity).
-8. `time_model.py` test suite grows: era-tree membership, month/day
-   precision, `current` token resolution against `:Now` config node,
-   null bounds semantics.
+7. **`LoreSource` as a first-class node** — every YAML file
+   becomes a `LoreSource` node with a `reliability` field
+   (`canonical | factional | rumor | dialogue | fanon`). Each
+   edge points to one or more `LoreSource` nodes via a
+   `SOURCED_FROM` edge. This is the structured-data form of
+   the dual-confidence model the POC already implements
+   (`tests/test_confidence.py`). The `reliability` field is
+   overridable per-file via YAML frontmatter; the default is
+   `canonical` for `*.yaml` files and is path-inferred for
+   prose files (Quests/Random/ → `rumor`).
+8. Schema validation: strict, fails loudly with line numbers
+   (YAML "gotchas" — `NO: false` parsing as `True`,
+   tab/space sensitivity).
+9. `time_model.py` test suite grows: era-tree membership,
+   month/day precision, `current` token resolution against
+   `:Now` config node, null bounds semantics.
 
 ## Acceptance criteria
 
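
Item 7 in the hunk above describes three ways a file gets its `reliability`: a frontmatter override, a `canonical` default for `*.yaml` files, and path inference for prose files. The sketch below shows one way that precedence might be resolved. The function name `infer_reliability`, the `PATH_RULES` table, and the fallback to `canonical` for prose files that match no rule are all assumptions; the plan names only the `Quests/Random/` rule.

```python
from pathlib import PurePosixPath
from typing import Optional

RELIABILITY_LEVELS = ("canonical", "factional", "rumor", "dialogue", "fanon")

# Directory prefixes mapped to a reliability level. Only the
# Quests/Random/ rule comes from the plan; extend as needed.
PATH_RULES = {
    ("Quests", "Random"): "rumor",
}


def infer_reliability(path: str, frontmatter: Optional[dict] = None) -> str:
    """Resolve a file's reliability (hypothetical helper).

    Precedence: frontmatter override, then the *.yaml default,
    then path-based rules for prose files.
    """
    override = (frontmatter or {}).get("reliability")
    if override is not None:
        if override not in RELIABILITY_LEVELS:
            raise ValueError(f"unknown reliability: {override!r}")
        return override

    p = PurePosixPath(path)
    if p.suffix == ".yaml":
        return "canonical"

    parts = p.parts
    for prefix, level in PATH_RULES.items():
        n = len(prefix)
        # Match the prefix anywhere in the directory chain.
        if any(parts[i:i + n] == prefix for i in range(len(parts) - n + 1)):
            return level

    # Assumption: prose files with no matching rule are canonical.
    return "canonical"
```

Validating the override against the known levels keeps a typo in frontmatter from silently creating a new reliability class, which fits the plan's "fails loudly" stance on schema validation.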
@@ -61,6 +72,9 @@ fuzziness. Every edge traces to a YAML line.
 | 1.6 | Anachronism check flags a parent whose death precedes a child's birth |
 | 1.7 | Re-ingest is idempotent (`MERGE`, not `CREATE`) |
 | 1.8 | Three example YAMLs ship in `seed/` for demo |
+| 1.9 | `LoreSource` is a first-class node with `reliability` field, `SOURCED_FROM` edges from every typed edge |
+| 1.10 | YAML files default to `reliability: canonical`; frontmatter can override |
+| 1.11 | Time-bounded edges (from `family_tree.yaml` PARENT_OF) carry `valid_from` and `valid_until`; the demo's `was_true_at` queries actually exercise `time_in_window` |
 
 ## Test plan
 
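
Criterion 1.6 in the hunk above asks for an anachronism check that flags a parent whose death precedes a child's birth. A rough sketch of such a check follows, assuming year-precision integers and a hypothetical `Person` record; the plan does not specify the data shapes, so every name here is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Minimal person record (hypothetical shape)."""
    name: str
    born: Optional[int] = None   # year, None if unknown
    died: Optional[int] = None   # year, None if unknown or still alive


def find_anachronisms(
    people: dict[str, Person],
    parent_of: list[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return (parent, child) pairs where the parent died before the child was born.

    Pairs with a missing date are skipped rather than flagged.
    """
    flagged = []
    for parent_name, child_name in parent_of:
        parent = people[parent_name]
        child = people[child_name]
        if parent.died is None or child.born is None:
            continue
        if parent.died < child.born:
            flagged.append((parent_name, child_name))
    return flagged
```

Skipping pairs with unknown dates is a design choice in this sketch: it avoids false positives on incomplete seed data, at the cost of not reporting edges that cannot be verified.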