lore-engine/docs/03-macro-micro.md
Hermes Agent 7c4ed58a5b docs: initial Lore Engine design (11 docs, ~140KB)
- 00-overview: goals, what we inherit from GraphMCP-Example, naming
- 01-ontology: 14 node labels, 40+ edge types, time-bound properties
- 02-time-model: era hierarchy, {era}.{year} canonical format, time_in_window UDF
- 03-macro-micro: 3 association patterns, lookup+active context, expand_context
- 04-consistency: Contradiction/Anachronism/Orphan/OntologyViolation, 4 rule categories
- 05-mcp-tools: 30 tools (8 inherited + 22 new), 5 composition patterns, 10 starter rules
- 06-ingestion: 3 paths (prose, structured YAML, dialogue), YAML schemas for 6 source types
- 07-reasoning-harness: 5 question types, system prompt, failure modes, worked example
- 08-architecture: system diagram, services layout, UDFs, schema bootstrap
- 09-roadmap: 11 phases, MVP = 19 days end of phase 4
- 10-critique: pressure-test, S1-S4 severity, open questions
2026-06-16 04:59:12 +00:00

# 03 — Macro ↔ Micro Association
A high-fantasy world is full of micro details that only make sense in macro context:
- *Aldric carries a dagger* (micro) is meaningless without *Aldric is a Vyr* (macro) and *the Vyrs are a noble house* (more macro).
- *The tavern is on fire* (micro) is meaningless without *the tavern is in Mardsville* (macro) and *Mardsville is contested in the Border Wars* (more macro).
The engine's job is to make these associations *navigable*, not to make the LLM traverse a five-edge chain by hand every time. This document is how we do that.
## The principle: every node knows where it lives
Every node in the engine is implicitly connected, via a small number of well-indexed edges, to the macro structures it belongs to. The LLM can ask a question about *anything* and reach *everything* relevant in one to three hops.
For a `Person`, those connections are:
```
Person
├── MEMBER_OF → Faction(s) (which house / order / company)
├── BELONGS_TO → Culture(s) (which people they are)
├── WORSHIPS → Deity/Deities (what they believe)
├── PRACTICES → MagicSystem(s) (what magic they can use)
├── SPEAKS → Language(s) (what they speak)
├── LOCATED_IN → Location (where they are)
├── CLAIMS_TITLE → Title (what office they hold)
├── PARENT_OF → Person(s) (children)
├── SPOUSE_OF → Person(s) (partner)
├── BURIED_AT → Location (final rest)
└── EXISTED_DURING → Era(s) (when they lived)
```
For a `Faction`:
```
Faction
├── MEMBER_OF → Faction(s) (sub-group, vassal, parent org)
├── RULES → Location / Faction (sovereignty)
├── CONTROLS → Location(s) (military hold)
├── LOCATED_IN → Location (headquarters)
├── BELONGS_TO → Culture(s)
├── POSSESSES → Item(s) (holdings, relics)
├── CREATED → Item(s) (artifacts forged)
├── ALLIED_WITH → Faction(s)
├── ENEMY_OF → Faction(s)
└── EXISTED_DURING → Era(s)
```
For a `Location`:
```
Location
├── PART_OF → Location / Region (geographic hierarchy)
├── LOCATED_IN → Location / Region
├── RULES ← Person / Faction (sovereign; incoming edge)
├── CONTROLS ← Faction(s) (occupier; incoming edge)
├── NEAR → Location(s) (geographic proximity)
└── CULTURE_OF → Culture(s) (homeland of)
```
For an `Item`:
```
Item
├── POSSESSED_BY → Person / Faction (current holder)
├── CREATED ← Person / Faction (maker; incoming edge)
├── FORGED_FROM → Material(s)
├── INHERITED_BY → Person(s) (lineage of ownership)
└── ORIGINATES_FROM → Location (where it was forged)
```
These are the **default association paths**. The MCP tool layer exposes them as composable queries.
## Three macro-association patterns
### Pattern 1: Direct association (one hop)
The simplest case. The LLM asks about Aldric and gets his faction, his location, his culture, and his titles, all in a single `entity_context` call.
**Cypher:**
```cypher
MATCH (p:Person {name: "Aldric Raventhorne"})
// Each OPTIONAL MATCH keeps Aldric in the result even when that edge is absent.
OPTIONAL MATCH (p)-[:MEMBER_OF]->(f:Faction)
OPTIONAL MATCH (p)-[:LOCATED_IN]->(loc:Location)
OPTIONAL MATCH (p)-[:BELONGS_TO]->(c:Culture)
OPTIONAL MATCH (p)-[:CLAIMS_TITLE]->(t:Title)
// DISTINCT collapses the duplicates produced by combining the optional matches.
RETURN p,
       collect(DISTINCT f) AS factions,
       collect(DISTINCT loc) AS locations,
       collect(DISTINCT c) AS cultures,
       collect(DISTINCT t) AS titles
```
**Tool:** `entity_context(name)` returns the one-hop summary. See `05-mcp-tools.md`.
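
As a rough illustration of what the one-hop summary could look like once the rows come back, here is a minimal Python sketch over an in-memory edge list. The `(source, relation, target)` tuple layout, the `ONE_HOP_RELATIONS` mapping, the return shape, and the sample data are all assumptions made for illustration; the actual `entity_context` contract lives in `05-mcp-tools.md`.

```python
# Hypothetical mapping from relation type to the summary key it fills.
ONE_HOP_RELATIONS = {
    "MEMBER_OF": "factions",
    "LOCATED_IN": "locations",
    "BELONGS_TO": "cultures",
    "CLAIMS_TITLE": "titles",
}

def entity_context(name, edges):
    """Group the outgoing one-hop edges of `name` by relation type.

    `edges` is an iterable of (source, relation, target) tuples, standing in
    for the rows the Cypher query above would return.
    """
    summary = {key: [] for key in ONE_HOP_RELATIONS.values()}
    for source, relation, target in edges:
        key = ONE_HOP_RELATIONS.get(relation)
        if source == name and key is not None and target not in summary[key]:
            summary[key].append(target)  # de-duplicate, like collect(DISTINCT ...)
    return summary

# Illustrative sample data, not taken from the real graph.
edges = [
    ("Aldric Raventhorne", "MEMBER_OF", "House Vyr"),
    ("Aldric Raventhorne", "LOCATED_IN", "Thornwall Keep"),
    ("Aldric Raventhorne", "MEMBER_OF", "House Vyr"),  # duplicate row
    ("House Vyr", "RULES", "Valdorn"),                  # not one hop from Aldric
]
context = entity_context("Aldric Raventhorne", edges)
```

Relations with no matching edge come back as empty lists rather than missing keys, which mirrors how `OPTIONAL MATCH` plus `collect` behaves in the query.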
### Pattern 2: Lineage chain (variable depth)
Aldric → his father → his grandfather → the Vyr bloodline. The LLM asks "what bloodline does Aldric belong to?" and gets an answer without traversing `PARENT_OF` 12 times.
This is why we have `Lineage` as a *node*, not just a property on `Person`. `Lineage` is a typed, queryable group with:
- `founding_ancestor` (the `Person` who started the bloodline)
- `cadet_branches[]` (sub-`Lineage` nodes)
- `MEMBER_OF` connections from `Person` to `Lineage`
The LLM's question *"What is Aldric's lineage, and who else is in it?"* becomes:
```cypher
MATCH (a:Person {name: "Aldric Raventhorne"})
      -[:MEMBER_OF]->(lin:Lineage)
      <-[:MEMBER_OF]-(relative:Person)
WHERE relative.name <> a.name
RETURN lin, collect(relative.name) AS bloodline_members
```
**Tool:** `list_lineage(person)` returns the bloodline, its cadet branches, the founding ancestor, and all known members.
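
To show why a `Lineage` node turns a deep ancestry walk into a single lookup, here is a small Python sketch. The `memberships` dictionary stands in for `MEMBER_OF` edges from `Person` to `Lineage`; its shape, the function name `bloodline_members`, and the placeholder relatives are assumptions of this sketch, not part of the engine.

```python
def bloodline_members(person, memberships):
    """Return (lineage, other members) for `person`, or (None, []) if unknown.

    `memberships` maps a person name to a lineage name, standing in for
    MEMBER_OF edges from Person to Lineage.
    """
    lineage = memberships.get(person)
    if lineage is None:
        return None, []
    relatives = sorted(
        name for name, lin in memberships.items()
        if lin == lineage and name != person
    )
    return lineage, relatives

# Placeholder data: the relatives are hypothetical names for illustration.
memberships = {
    "Aldric Raventhorne": "Vyr",
    "Relative A": "Vyr",
    "Relative B": "Vyr",
    "Outsider": "Other",
}
lineage, relatives = bloodline_members("Aldric Raventhorne", memberships)
```

The lookup cost does not depend on how many generations separate Aldric from the founding ancestor, which is the point of modelling the bloodline as its own node.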
### Pattern 3: Geographic / political hierarchy (region tree)
`Region` and `Location` form a tree via `PART_OF`. To go from *Aldric's dagger* (micro) to *the Kingdom of Valdorn's stance in the Border Wars* (macro), the engine traverses:
```
Aldric's dagger (Item)
→ CREATED_BY Aldric (Person)
→ MEMBER_OF House Vyr (Faction)
→ RULES Valdorn (Location)
→ PART_OF Northern Reaches (Region)
→ CULTURE_OF Valdorni (Culture)
→ LOCATED_IN King Aelric's Court (Location)
→ CONTESTED_BY Crimson Pact (Faction) in the Border Wars (Event)
```
**Seven hops.** With proper indexing, that's roughly 10-50 ms in Neo4j. We make it a single tool call: `expand_context(entity, hops=7)`.
**Tool:** `expand_context(entity, hops, relation_filter)` returns the n-hop neighborhood filtered to specified relation types.
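
The n-hop neighborhood can be sketched as a breadth-first search with a hop budget. The version below is a minimal in-memory Python illustration, not the engine's implementation: the edge-tuple layout, the `{node: distance}` return shape, and following edges only in their stated direction are assumptions.

```python
from collections import deque

def expand_context(start, edges, hops=2, relation_filter=None):
    """Return {node: hop distance} for nodes within `hops` of `start`.

    `edges` is a list of (source, relation, target) tuples, traversed in the
    direction given. `relation_filter`, when set, restricts which relation
    types may be followed.
    """
    allowed = set(relation_filter) if relation_filter is not None else None
    adjacency = {}
    for source, relation, target in edges:
        if allowed is None or relation in allowed:
            adjacency.setdefault(source, []).append(target)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == hops:
            continue  # do not expand past the hop budget
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the neighborhood excludes the entity itself
    return distances

edges = [
    ("Aldric's dagger", "CREATED_BY", "Aldric"),
    ("Aldric", "MEMBER_OF", "House Vyr"),
    ("House Vyr", "RULES", "Valdorn"),
    ("Valdorn", "PART_OF", "Northern Reaches"),
]
near = expand_context("Aldric's dagger", edges, hops=2)
filtered = expand_context("Aldric", edges, hops=3,
                          relation_filter=["MEMBER_OF", "RULES"])
```

With `hops=2` the search stops at House Vyr; with the relation filter, the `PART_OF` edge is never followed even though the hop budget would allow it.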
## The micro-anchoring problem
Here's the design risk: in a world with tens of thousands of `Item` nodes, "Aldric's dagger" may be one of 50,000 daggers. The LLM shouldn't have to say "Aldric's dagger"; it should be able to say *"the dagger"* and let the engine infer which one.
**Solution: the active context.** Every LLM query in the engine runs against an *active context*: a working set of entities the LLM has been talking about, separate from the model's own token context window. The MCP server tracks it per session:
```json
{
"active_context": [
{"type": "Person", "name": "Aldric Raventhorne", "id": "uuid-1"},
{"type": "Location", "name": "Thornwall Keep", "id": "uuid-2"},
{"type": "Item", "name": "Sword of Eventide", "id": "uuid-3"}
]
}
```
When the LLM calls `lookup("the dagger")`, the engine:
1. First checks the active context for `Item` nodes.
2. If exactly one `Item` matches "dagger" among them, returns it.
3. If multiple, returns the disambiguation list and asks the LLM to pick.
4. If none, falls back to a global fuzzy search.
The active context is updated automatically by other tools: every `state_at`, every `query_faction_at_time`, every `get_context` adds entities to the working set. The LLM doesn't manage it; the engine does.
**Tool:** `lookup(query)` is the disambiguating entry point.
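
The four lookup steps above can be sketched in Python as follows. This is an illustration under stated assumptions: the `(status, payload)` return shape, the simple word-matching rule, the `global_search` callback, and the extra "Aldric's Dagger" and "Ceremonial Dagger" entries are hypothetical, and the sketch only considers `Item` nodes, as in the worked example.

```python
def lookup(query, active_context, global_search):
    """Resolve a loose reference like "the dagger" to a single entity.

    Returns ("match", entity), ("ambiguous", candidates), or
    ("fallback", results). The return shape is an assumption of this sketch.
    """
    # Reduce the query to its significant words ("the dagger" -> {"dagger"}).
    terms = {w for w in query.lower().split() if w not in {"the", "a", "an"}}
    # Step 1: look only at Item nodes in the active context.
    candidates = [
        entity for entity in active_context
        if entity["type"] == "Item"
        and all(term in entity["name"].lower() for term in terms)
    ]
    if len(candidates) == 1:
        return "match", candidates[0]      # step 2: exactly one match
    if len(candidates) > 1:
        return "ambiguous", candidates     # step 3: let the LLM pick
    return "fallback", global_search(query)  # step 4: global fuzzy search

active_context = [
    {"type": "Person", "name": "Aldric Raventhorne", "id": "uuid-1"},
    {"type": "Location", "name": "Thornwall Keep", "id": "uuid-2"},
    {"type": "Item", "name": "Sword of Eventide", "id": "uuid-3"},
    {"type": "Item", "name": "Aldric's Dagger", "id": "uuid-4"},  # hypothetical
]

def no_results(query):
    """Stand-in for the global fuzzy search; always returns nothing."""
    return []

status, entity = lookup("the dagger", active_context, no_results)

crowded = active_context + [
    {"type": "Item", "name": "Ceremonial Dagger", "id": "uuid-5"},  # hypothetical
]
crowded_status, crowded_candidates = lookup("the dagger", crowded, no_results)
missing_status, missing_results = lookup("the crown", active_context, no_results)
```

Returning the candidate list on ambiguity, rather than guessing, keeps the disambiguation decision with the LLM.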
## The "why does this matter" link
Some micro details matter only because of a macro context. Aldric's dagger matters because it's *the dagger that killed the Emperor*. The engine models this as a `SIGNIFICANCE_OF` edge from the item to a specific `Event`:
```cypher
(:Item {name: "Aldric's Dagger"})
-[:SIGNIFICANCE_OF {
role: "weapon",
context: "the assassination of Emperor Vael of the Crimson Throne"
}]->
(:Event {name: "Assassination of Emperor Vael"})
```
An LLM asking "why is this dagger famous?" gets the answer in a single tool call: `significance_of(item)`.
**Tool:** `significance_of(entity)` returns the historical events, cultural significance, or macro context that makes this entity noteworthy.
## Composition patterns (what the LLM does, not what the schema has)
These are the *recipes* the LLM uses, documented in `07-reasoning-harness.md` but prefigured here:
| Question | Tool sequence |
|---|---|
| "Who is Aldric?" | `entity_context(Aldric)` + `list_lineage(Aldric)` + `significance_of(Aldric)` |
| "What was happening in Valdorn in 340 TA?" | `state_at(Valdorn, "3rd_age.year_340")` + `entities_present(Valdorn, "3rd_age.year_340")` |
| "Why does Aldric care about the Crimson Throne?" | `entity_context(Aldric)` → get `MEMBER_OF House Vyr` → `expand_context(House Vyr, hops=2)` → find `RULES Crimson Throne` |
| "Is the dagger in the museum the real one?" | `lookup("the dagger")` → `significance_of(dagger)` → if Event-tagged, `get_event_chain(dagger)` to verify lineage of possession |
| "Was the Long Winter caused by the Sundering?" | `get_event_chain(Sundering)` → check for `CAUSED` edges to Long Winter |
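
As one possible reading of the last recipe, the check for `CAUSED` edges can be sketched as a reachability search over the event chain. The Python below assumes that indirect causation (through intermediate events) should count, which the table does not state; the `(cause, effect)` edge layout, the function name `caused`, and the intermediate event are hypothetical.

```python
def caused(source, target, caused_edges):
    """Return True if `target` is reachable from `source` via CAUSED edges.

    `caused_edges` is a list of (cause, effect) pairs, standing in for what
    a get_event_chain call might return.
    """
    seen = set()
    stack = [source]
    while stack:
        event = stack.pop()
        if event == target:
            return True
        if event in seen:
            continue  # guard against cycles in the event graph
        seen.add(event)
        stack.extend(effect for cause, effect in caused_edges if cause == event)
    return False

# "Intermediate Event" is a placeholder, not an event from the world.
chain = [
    ("Sundering", "Intermediate Event"),
    ("Intermediate Event", "Long Winter"),
]
sundering_caused_winter = caused("Sundering", "Long Winter", chain)
winter_caused_sundering = caused("Long Winter", "Sundering", chain)
```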
## The risk: traversal explosion
A naive `expand_context` with `hops=10` on a dense world will return thousands of nodes. The LLM's context window will explode. We mitigate in four ways:
1. **Default hops=2.** The LLM must explicitly request deeper traversal, and the tool warns at hops=4.
2. **Relation filters.** `expand_context(Aldric, hops=2, relation_filter=["MEMBER_OF", "RULES"])` returns only those.
3. **Confidence thresholds.** Nodes with `source_confidence < 0.5` are excluded unless requested.
4. **Result caps.** A hard cap of 200 nodes per call. The LLM paginates.
**Tool:** `expand_context(entity, hops, relation_filter, min_confidence, limit)` is rate-limited and capped.
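
The confidence threshold, hard cap, and pagination can be sketched as a post-processing step over whatever the traversal returns. The Python below is a minimal illustration: the function name `apply_guards`, the `offset` parameter, and the `(page, next_offset)` return shape are assumptions, while the numbers (default 2 hops, warning at 4, threshold 0.5, cap 200) come from the list above.

```python
import warnings

DEFAULT_HOPS = 2
WARN_AT_HOPS = 4
HARD_CAP = 200

def apply_guards(candidates, hops=DEFAULT_HOPS, min_confidence=0.5,
                 limit=HARD_CAP, offset=0):
    """Filter and page a list of candidate nodes from a traversal.

    Each candidate is a dict with a "source_confidence" key. Returns
    (page, next_offset), where next_offset is None when nothing remains.
    """
    if hops >= WARN_AT_HOPS:
        warnings.warn(f"hops={hops} may return a very large neighborhood")
    limit = min(limit, HARD_CAP)  # callers cannot raise the hard cap
    kept = [c for c in candidates if c["source_confidence"] >= min_confidence]
    page = kept[offset:offset + limit]
    next_offset = offset + limit if offset + limit < len(kept) else None
    return page, next_offset

# 500 synthetic nodes, half of them below the confidence threshold.
candidates = [
    {"id": i, "source_confidence": 0.9 if i % 2 == 0 else 0.3}
    for i in range(500)
]
first_page, next_offset = apply_guards(candidates)
second_page, final_offset = apply_guards(candidates, offset=next_offset)
```

Filtering before paging means the offsets refer to the filtered list, so a caller paging through results never sees a low-confidence node unless it lowers `min_confidence` explicitly.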
## Summary
The macro↔micro layer is the part of the engine that makes the difference between *"a graph full of facts"* and *"a world you can reason about."* The ontology is the data; this is the access pattern.
Three rules of thumb:
- If the LLM has to do a 3+ hop traversal to answer a basic question, the ontology is missing an edge.
- If the LLM can't disambiguate *"the dagger"*, the active context is broken.
- If the LLM doesn't know *why* a fact matters, the `SIGNIFICANCE_OF` pattern is missing.