lore-engine/docs/03-macro-micro.md

# 03 — Macro ↔ Micro Association

A high-fantasy world is full of micro details that only make sense in macro context:

- *Aldric carries a dagger* (micro) is meaningless without *Aldric is a Vyr* (macro) and *the Vyrs are a noble house* (more macro).
- *The tavern is on fire* (micro) is meaningless without *the tavern is in Mardsville* (macro) and *Mardsville is contested in the Border Wars* (more macro).

The engine's job is to make these associations *navigable*, not to make the LLM traverse a five-edge chain by hand every time. This document describes how we do that.

## The principle: every node knows where it lives

Every node in the engine is connected, via a small number of well-indexed edges, to the macro structures it belongs to. For most questions, the LLM can start from *any* node and reach the relevant macro context in one to three hops.

For a `Person`, those connections are:

```
Person
  ├── MEMBER_OF     → Faction(s)            (which house / order / company)
  ├── BELONGS_TO    → Culture(s)            (which people they are)
  ├── WORSHIPS      → Deity/Deities         (what they believe)
  ├── PRACTICES     → MagicSystem(s)        (what magic they can use)
  ├── SPEAKS        → Language(s)           (what they speak)
  ├── LOCATED_IN    → Location              (where they are)
  ├── CLAIMS_TITLE  → Title                 (what office they hold)
  ├── PARENT_OF     → Person(s)             (children)
  ├── SPOUSE_OF     → Person(s)             (partner)
  ├── BURIED_AT     → Location              (final rest)
  └── EXISTED_DURING → Era(s)               (when they lived)
```

For a `Faction`:

```
Faction
  ├── MEMBER_OF     → Faction(s)            (sub-group, vassal, parent org)
  ├── RULES          → Location / Faction    (sovereignty)
  ├── CONTROLS       → Location(s)           (military hold)
  ├── LOCATED_IN     → Location              (headquarters)
  ├── BELONGS_TO     → Culture(s)
  ├── POSSESSES      → Item(s)               (holdings, relics)
  ├── CREATED        → Item(s)               (artifacts forged)
  ├── ALLIED_WITH    → Faction(s)
  ├── ENEMY_OF       → Faction(s)
  └── EXISTED_DURING → Era(s)
```

For a `Location`:

```
Location
  ├── PART_OF        → Location / Region     (geographic hierarchy)
  ├── LOCATED_IN     → Location / Region
  ├── RULES          ← Person / Faction      (sovereign; incoming edge)
  ├── CONTROLS       ← Faction(s)            (occupier; incoming edge)
  ├── NEAR           → Location(s)           (geographic proximity)
  └── CULTURE_OF     → Culture(s)            (homeland of)
```

For an `Item`:

```
Item
  ├── POSSESSED_BY   → Person / Faction      (current holder)
  ├── CREATED_BY     → Person / Faction      (maker)
  ├── FORGED_FROM    → Material(s)
  ├── INHERITED_BY   → Person(s)             (lineage of ownership)
  └── ORIGINATES_FROM → Location             (where it was forged)
```

These are the **default association paths**. The MCP tool layer exposes them as composable queries.

## Three macro-association patterns

### Pattern 1: Direct association (one hop)

The simplest case. The LLM asks about Aldric and gets his faction, location, culture, and titles in a single `entity_context` call.

**Cypher:**
```cypher
MATCH (p:Person {name: "Aldric Raventhorne"})
OPTIONAL MATCH (p)-[r1:MEMBER_OF]->(f:Faction)
OPTIONAL MATCH (p)-[r2:LOCATED_IN]->(loc:Location)
OPTIONAL MATCH (p)-[r3:BELONGS_TO]->(c:Culture)
OPTIONAL MATCH (p)-[r4:CLAIMS_TITLE]->(t:Title)
RETURN p, collect(DISTINCT f) AS factions,
       collect(DISTINCT loc) AS locations,
       collect(DISTINCT c) AS cultures,
       collect(DISTINCT t) AS titles
```

**Tool:** `entity_context(name)` returns the one-hop summary. See `05-mcp-tools.md`.
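
The one-hop query above follows a regular shape: one `OPTIONAL MATCH` per association path. As a minimal sketch of how a tool layer might generate it from the edge tables in this document, the helper below builds the query text from a relation-to-label mapping. The names `PERSON_ASSOCIATIONS` and `build_entity_context_query` are hypothetical, not part of the engine; `$name` is an ordinary Cypher query parameter.

```python
# Hypothetical table of one-hop association paths for Person, copied from
# the edge list earlier in this document: relation type -> target label.
PERSON_ASSOCIATIONS = {
    "MEMBER_OF": "Faction",
    "LOCATED_IN": "Location",
    "BELONGS_TO": "Culture",
    "CLAIMS_TITLE": "Title",
}


def build_entity_context_query(label: str, associations: dict) -> str:
    """Build a one-hop summary query with one OPTIONAL MATCH per relation."""
    lines = [f"MATCH (n:{label} {{name: $name}})"]
    returns = ["n"]
    for i, (relation, target) in enumerate(associations.items()):
        alias = f"t{i}"
        lines.append(f"OPTIONAL MATCH (n)-[:{relation}]->({alias}:{target})")
        returns.append(f"collect(DISTINCT {alias}) AS {relation.lower()}")
    lines.append("RETURN " + ", ".join(returns))
    return "\n".join(lines)


query = build_entity_context_query("Person", PERSON_ASSOCIATIONS)
print(query)
```

Labels and relation types are interpolated into the query text here, so in this sketch they should come only from a fixed table like the one above, never from user input. The entity name stays a parameter.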

### Pattern 2: Lineage chain (variable depth)

Aldric → his father → his grandfather → the Vyr bloodline. The LLM asks "what bloodline does Aldric belong to?" and gets an answer without traversing `PARENT_OF` 12 times.

This is why we have `Lineage` as a *node*, not just a property on `Person`. `Lineage` is a typed, queryable group with:

- `founding_ancestor` (the `Person` who started the bloodline)
- `cadet_branches[]` (sub-`Lineage` nodes)
- `MEMBER_OF` connections from `Person` to `Lineage`

The LLM's question *"What is Aldric's lineage, and who else is in it?"* becomes:

```cypher
MATCH (a:Person {name: "Aldric Raventhorne"})
      -[:MEMBER_OF]->(lin:Lineage)
      <-[:MEMBER_OF]-(relative:Person)
WHERE relative <> a
RETURN lin, collect(relative.name) AS bloodline_members
```

**Tool:** `list_lineage(person)` returns the bloodline, its cadet branches, the founding ancestor, and all known members.
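
To make the shape of that result concrete, here is a small in-memory sketch of a lineage with cadet branches. It is an illustration under stated assumptions, not the engine's implementation: the `Lineage` dataclass, the result keys, and every name marked "placeholder" are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lineage:
    name: str
    founding_ancestor: str
    members: List[str] = field(default_factory=list)
    cadet_branches: List["Lineage"] = field(default_factory=list)


def all_members(lineage: Lineage) -> List[str]:
    """Collect members of a lineage and, recursively, of its cadet branches."""
    found = list(lineage.members)
    for branch in lineage.cadet_branches:
        found.extend(all_members(branch))
    return found


def list_lineage(person: str, lineages: List[Lineage]) -> Optional[dict]:
    """Return the top-level lineage whose tree contains `person`, if any."""
    for lineage in lineages:
        members = all_members(lineage)
        if person in members:
            return {
                "lineage": lineage.name,
                "founding_ancestor": lineage.founding_ancestor,
                "cadet_branches": [b.name for b in lineage.cadet_branches],
                "members": sorted(m for m in members if m != person),
            }
    return None


# Placeholder data: only Aldric and the Vyr bloodline come from this document.
cadet = Lineage("Cadet branch (placeholder)", "Founder B (placeholder)",
                members=["Relative C (placeholder)"])
vyr = Lineage("Vyr bloodline", "Founder A (placeholder)",
              members=["Aldric Raventhorne", "Relative A (placeholder)"],
              cadet_branches=[cadet])

result = list_lineage("Aldric Raventhorne", [vyr])
```

The point of the sketch is that membership is a direct lookup on the group, so the answer does not depend on how many generations separate Aldric from the founding ancestor.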

### Pattern 3: Geographic / political hierarchy (region tree)

`Region` and `Location` form a tree via `PART_OF`. To go from *Aldric's dagger* (micro) to *the Kingdom of Valdorn's stance in the Border Wars* (macro), the engine traverses:

```
Aldric's dagger (Item)
  → CREATED_BY Aldric (Person)
  → MEMBER_OF House Vyr (Faction)
  → RULES Valdorn (Location)
  → PART_OF Northern Reaches (Region)
  → CULTURE_OF Valdorni (Culture)
  → LOCATED_IN King Aelric's Court (Location)
  → CONTESTED_BY Crimson Pact (Faction) in the Border Wars (Event)
```

**Seven hops.** With proper indexing, that's ~10–50 ms in Neo4j. We make it a single tool call: `expand_context(entity, hops=7)`.

**Tool:** `expand_context(entity, hops, relation_filter)` returns the n-hop neighborhood filtered to specified relation types.
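
A minimal sketch of what an n-hop expansion with a relation filter could look like, using a breadth-first walk over an in-memory edge list instead of Neo4j. The assumptions are loud ones: the `EDGES` list is a hand-copied fragment of the chain above plus one placeholder edge, only outgoing edges are followed, and the return shape (node name mapped to hop distance) is invented for the example.

```python
from collections import deque

# Hypothetical edge list: (source, relation, target).
EDGES = [
    ("Aldric's dagger", "CREATED_BY", "Aldric"),
    ("Aldric", "MEMBER_OF", "House Vyr"),
    ("House Vyr", "RULES", "Valdorn"),
    ("Valdorn", "PART_OF", "Northern Reaches"),
    ("Aldric", "WORSHIPS", "Unnamed deity (placeholder)"),
]


def expand_context(entity, hops, relation_filter=None, edges=EDGES):
    """Breadth-first walk along outgoing edges; returns {node: hop distance}."""
    allowed = set(relation_filter) if relation_filter else None
    adjacency = {}
    for source, relation, target in edges:
        if allowed is None or relation in allowed:
            adjacency.setdefault(source, []).append(target)

    distance = {entity: 0}
    queue = deque([entity])
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand past the hop budget
        for neighbor in adjacency.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[entity]  # the starting entity is not part of its own neighborhood
    return distance


unfiltered = expand_context("Aldric's dagger", hops=2)
filtered = expand_context(
    "Aldric's dagger", hops=3,
    relation_filter=["CREATED_BY", "MEMBER_OF", "RULES"],
)
```

With the filter applied, the placeholder `WORSHIPS` edge is never followed, which is how a relation filter keeps a deep traversal on the political chain and away from unrelated neighbors.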

## The micro-anchoring problem

Here's the design risk: in a world with thousands of `Item` nodes, "Aldric's dagger" is one of many daggers. The LLM shouldn't have to say "Aldric's dagger" every time; it should be able to say *"the dagger"* and have the engine infer which one is meant.

**Solution: the active context.** Every LLM query in the engine runs against an *active context*: a working set of entities the LLM has been talking about. This is distinct from the LLM's own context window. The MCP server tracks the active context per session:

```json
{
  "active_context": [
    {"type": "Person", "name": "Aldric Raventhorne", "id": "uuid-1"},
    {"type": "Location", "name": "Thornwall Keep", "id": "uuid-2"},
    {"type": "Item", "name": "Sword of Eventide", "id": "uuid-3"}
  ]
}
```

When the LLM calls `lookup("the dagger")`, the engine:

1. First checks the active context for `Item` nodes.
2. If exactly one `Item` matches "dagger" among them, returns it.
3. If multiple, returns the disambiguation list and asks the LLM to pick.
4. If none, falls back to a global fuzzy search.
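
The four steps above can be sketched as a short function. This is an illustration only: the `Entity` dataclass, the status strings, the crude keyword extraction, and the `global_search` callback are all assumptions made for the example, and the type check is hard-coded to `Item` because the running example is a dagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    type: str
    name: str
    id: str


def lookup(query, active_context, global_search):
    """Resolve a phrase like "the dagger" against the active context first."""
    # Crude keyword extraction: lowercase and drop a leading article.
    keyword = query.lower().removeprefix("the ").strip()
    matches = [
        e for e in active_context
        if e.type == "Item" and keyword in e.name.lower()
    ]
    if len(matches) == 1:
        return {"status": "resolved", "entity": matches[0]}
    if len(matches) > 1:
        return {"status": "ambiguous", "candidates": matches}
    # Nothing in the working set: defer to the global fuzzy search.
    return {"status": "global", "candidates": global_search(keyword)}


# The active context from the JSON example above.
context = [
    Entity("Person", "Aldric Raventhorne", "uuid-1"),
    Entity("Location", "Thornwall Keep", "uuid-2"),
    Entity("Item", "Sword of Eventide", "uuid-3"),
]


def no_results(keyword):
    """Stub standing in for the global fuzzy search."""
    return []


sword = lookup("the sword", context, no_results)
dagger = lookup("the dagger", context, no_results)
```

With that working set, "the sword" resolves directly, while "the dagger" finds no `Item` match and falls through to the global search.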

The active context is updated automatically by other tools: every `state_at`, every `query_faction_at_time`, every `get_context` adds entities to the working set. The LLM doesn't manage it; the engine does.
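
One way to get that automatic behavior is to wrap every tool so that the entities it returns are recorded on the session. The sketch below assumes a hypothetical `Session` class, a hypothetical `tracks_context` decorator, and a convention that each tool returns a dict with an `"entities"` list; none of these are confirmed details of the MCP server, and the `state_at` body is a stub.

```python
import functools


class Session:
    """Holds the working set for one MCP session (hypothetical class)."""

    def __init__(self):
        self.active_context = {}  # entity id -> entity dict, insertion-ordered

    def remember(self, entities):
        for entity in entities:
            self.active_context[entity["id"]] = entity


def tracks_context(tool):
    """Wrap a tool so every entity it returns joins the session's working set."""
    @functools.wraps(tool)
    def wrapper(session, *args, **kwargs):
        result = tool(session, *args, **kwargs)
        session.remember(result["entities"])
        return result
    return wrapper


@tracks_context
def state_at(session, entity_name, when):
    # Stub standing in for the real tool: returns one fixed entity.
    return {"entities": [{"type": "Location", "name": entity_name, "id": "uuid-2"}]}


session = Session()
state_at(session, "Thornwall Keep", "3rd_age.year_340")
state_at(session, "Thornwall Keep", "3rd_age.year_340")  # same id, no duplicate
```

Keying the working set by entity id means repeated calls about the same entity do not grow it.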

**Tool:** `lookup(query)` is the disambiguating entry point.

## The "why does this matter" link

Some micro details matter only because of a macro context. Aldric's dagger matters because it's *the dagger that killed the Emperor*. The engine models this as a `SIGNIFICANCE_OF` edge from the item to a specific `Event`:

```cypher
// Link the item to the event that makes it noteworthy.
MATCH (i:Item {name: "Aldric's Dagger"})
MATCH (e:Event {name: "Assassination of Emperor Vael"})
MERGE (i)-[:SIGNIFICANCE_OF {
    role: "weapon",
    context: "the assassination of Emperor Vael of the Crimson Throne"
  }]->(e)
```

The LLM asking "why is this dagger famous?" gets its answer from a single tool call: `significance_of(item)`.

**Tool:** `significance_of(entity)` returns the historical events, cultural significance, or macro context that makes this entity noteworthy.

## Composition patterns (what the LLM does, not what the schema has)

These are the *recipes* the LLM uses, documented in `07-reasoning-harness.md` but prefigured here:

| Question | Tool sequence |
|---|---|
| "Who is Aldric?" | `entity_context(Aldric)` + `list_lineage(Aldric)` + `significance_of(Aldric)` |
| "What was happening in Valdorn in 340 TA?" | `state_at(Valdorn, "3rd_age.year_340")` + `entities_present(Valdorn, "3rd_age.year_340")` |
| "Why does Aldric care about the Crimson Throne?" | `entity_context(Aldric)` → get `MEMBER_OF House Vyr` → `expand_context(House Vyr, hops=2)` → find `RULES Crimson Throne` |
| "Is the dagger in the museum the real one?" | `lookup("the dagger")` → `significance_of(dagger)` → if Event-tagged, `get_event_chain(dagger)` to verify lineage of possession |
| "Was the Long Winter caused by the Sundering?" | `get_event_chain(Sundering)` → check for `CAUSED` edges to Long Winter |
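
The last recipe in the table amounts to a reachability check over `CAUSED` edges. A minimal sketch follows, assuming that indirect causation counts (the Sundering causes something that in turn causes the Long Winter) and that the edges arrive as plain `(cause, effect)` pairs. The function name `caused` and the intermediate event are placeholders, not engine API or lore.

```python
def caused(cause, effect, caused_edges):
    """Return True if `effect` is reachable from `cause` via CAUSED edges."""
    graph = {}
    for source, target in caused_edges:
        graph.setdefault(source, []).append(target)

    seen = set()
    stack = [cause]
    while stack:
        event = stack.pop()
        for following in graph.get(event, []):
            if following == effect:
                return True
            if following not in seen:  # guard against cycles in the data
                seen.add(following)
                stack.append(following)
    return False


# Placeholder edges for illustration only.
edges = [
    ("Sundering", "Intermediate event (placeholder)"),
    ("Intermediate event (placeholder)", "Long Winter"),
]

answer = caused("Sundering", "Long Winter", edges)
reverse = caused("Long Winter", "Sundering", edges)
```

If only direct causes should count, the check reduces to testing whether the pair `("Sundering", "Long Winter")` appears in the edge list.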

## The risk: traversal explosion

A naive `expand_context` with `hops=10` on a dense world will return thousands of nodes and overflow the LLM's context window. We mitigate in four ways:

1. **Default hops=2.** The LLM must explicitly request deeper traversal, and the tool warns at hops=4 and above.
2. **Relation filters.** `expand_context(Aldric, hops=2, relation_filter=["MEMBER_OF", "RULES"])` returns only those.
3. **Confidence thresholds.** Nodes with `source_confidence < 0.5` are excluded unless requested.
4. **Result caps.** A hard cap of 200 nodes per call. The LLM paginates.

**Tool:** `expand_context(entity, hops, relation_filter, min_confidence, limit)` is rate-limited and capped.
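
The default depth, confidence threshold, and result cap can be sketched as a post-processing step over whatever the traversal returns. This is a hedged illustration: the function name `apply_guardrails`, the `offset`/`next_offset` pagination scheme, and the use of Python's `warnings` module are assumptions. Only the numbers (2, 4, 0.5, 200) and the `source_confidence` property come from the list above.

```python
import warnings

DEFAULT_HOPS = 2
WARN_AT_HOPS = 4
MIN_CONFIDENCE = 0.5
HARD_CAP = 200


def apply_guardrails(nodes, hops=DEFAULT_HOPS, min_confidence=MIN_CONFIDENCE,
                     limit=HARD_CAP, offset=0):
    """Filter and page candidate nodes produced by a traversal.

    Each node is a dict with a "source_confidence" key.
    """
    if hops >= WARN_AT_HOPS:
        warnings.warn(f"hops={hops} may return a very large neighborhood")
    limit = min(limit, HARD_CAP)  # callers cannot raise the hard cap
    kept = [n for n in nodes if n["source_confidence"] >= min_confidence]
    page = kept[offset:offset + limit]
    next_offset = offset + limit if offset + limit < len(kept) else None
    return {"nodes": page, "next_offset": next_offset}


# 500 synthetic nodes; every second one falls below the confidence threshold.
candidates = [
    {"id": i, "source_confidence": 0.9 if i % 2 == 0 else 0.3}
    for i in range(500)
]

first = apply_guardrails(candidates)
second = apply_guardrails(candidates, offset=first["next_offset"])
```

Passing `min_confidence=0.0` is how a caller would explicitly request the low-confidence nodes in this sketch; the hard cap still applies to each page.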

## Summary

The macro↔micro layer is the part of the engine that makes the difference between *"a graph full of facts"* and *"a world you can reason about."* The ontology is the data; this is the access pattern.

Three rules of thumb:

- If the LLM has to do a 3+ hop traversal to answer a basic question, the ontology is missing an edge.
- If the LLM can't disambiguate *"the dagger"*, the active context is broken.
- If the LLM doesn't know *why* a fact matters, the `SIGNIFICANCE_OF` pattern is missing.