# 03 — Macro ↔ Micro Association

A high-fantasy world is full of micro details that only make sense in macro context:

- *Aldric carries a dagger* (micro) is meaningless without *Aldric is a Vyr* (macro) and *the Vyrs are a noble house* (more macro).
- *The tavern is on fire* (micro) is meaningless without *the tavern is in Mardsville* (macro) and *Mardsville is contested in the Border Wars* (more macro).

The engine's job is to make these associations *navigable*, not to make the LLM traverse a five-edge chain by hand every time. This document describes how we do that.

## The principle: every node knows where it lives

Every node in the engine is connected, via a small number of well-indexed edges, to the macro structures it belongs to. The LLM can ask a question about *anything* and reach *everything* relevant in one to three hops.

For a `Person`, those connections are:

```
Person
├── MEMBER_OF       → Faction(s)        (which house / order / company)
├── BELONGS_TO      → Culture(s)        (which people they are)
├── WORSHIPS        → Deity/Deities     (what they believe)
├── PRACTICES       → MagicSystem(s)    (what magic they can use)
├── SPEAKS          → Language(s)       (what they speak)
├── LOCATED_IN      → Location          (where they are)
├── CLAIMS_TITLE    → Title             (what office they hold)
├── PARENT_OF       → Person(s)         (children)
├── SPOUSE_OF       → Person(s)         (partner)
├── BURIED_AT       → Location          (final rest)
└── EXISTED_DURING  → Era(s)            (when they lived)
```

For a `Faction`:

```
Faction
├── MEMBER_OF       → Faction(s)          (sub-group, vassal, parent org)
├── RULES           → Location / Faction  (sovereignty)
├── CONTROLS        → Location(s)         (military hold)
├── LOCATED_IN      → Location            (headquarters)
├── BELONGS_TO      → Culture(s)
├── POSSESSES       → Item(s)             (holdings, relics)
├── CREATED         → Item(s)             (artifacts forged)
├── ALLIED_WITH     → Faction(s)
├── ENEMY_OF        → Faction(s)
└── EXISTED_DURING  → Era(s)
```

For a `Location`:

```
Location
├── PART_OF         → Location / Region   (geographic hierarchy)
├── LOCATED_IN      → Location / Region
├── RULES           → Person / Faction    (sovereign)
├── CONTROLS        → Faction(s)          (occupier)
├── NEAR            → Location(s)         (geographic proximity)
└── CULTURE_OF      → Culture(s)          (homeland of)
```

For an `Item`:

```
Item
├── POSSESSED_BY    → Person / Faction    (current holder)
├── CREATED_BY      → Person / Faction
├── FORGED_FROM     → Material(s)
├── INHERITED_BY    → Person(s)           (lineage of ownership)
└── ORIGINATES_FROM → Location            (where it was forged)
```

These are the **default association paths**. The MCP tool layer exposes them as composable queries.

## Three macro-association patterns

### Pattern 1: Direct association (one hop)

The simplest case. The LLM asks about Aldric and gets his faction, his location, his culture, and his titles, all in a single `entity_context` call.

**Cypher:**

```cypher
MATCH (p:Person {name: "Aldric Raventhorne"})
OPTIONAL MATCH (p)-[r1:MEMBER_OF]->(f:Faction)
OPTIONAL MATCH (p)-[r2:LOCATED_IN]->(loc:Location)
OPTIONAL MATCH (p)-[r3:BELONGS_TO]->(c:Culture)
OPTIONAL MATCH (p)-[r4:CLAIMS_TITLE]->(t:Title)
RETURN p,
       collect(DISTINCT f)   AS factions,
       collect(DISTINCT loc) AS locations,
       collect(DISTINCT c)   AS cultures,
       collect(DISTINCT t)   AS titles
```

**Tool:** `entity_context(name)` returns the one-hop summary. See `05-mcp-tools.md`.

### Pattern 2: Lineage chain (variable depth)

Aldric → his father → his grandfather → the Vyr bloodline. The LLM asks "what bloodline does Aldric belong to?" and gets an answer without traversing `PARENT_OF` 12 times.

This is why we have `Lineage` as a *node*, not just a property on `Person`.
`Lineage` is a typed, queryable group with:

- `founding_ancestor` (the `Person` who started the bloodline)
- `cadet_branches[]` (sub-`Lineage` nodes)
- `MEMBER_OF` connections from `Person` to `Lineage`

The LLM's question *"What is Aldric's lineage, and who else is in it?"* becomes:

```cypher
MATCH (a:Person {name: "Aldric Raventhorne"})
      -[:MEMBER_OF]->(lin:Lineage)
      <-[:MEMBER_OF]-(relative:Person)
WHERE relative.name <> a.name
RETURN lin, collect(relative.name) AS bloodline_members
```

**Tool:** `list_lineage(person)` returns the bloodline, its cadet branches, the founding ancestor, and all known members.

### Pattern 3: Geographic / political hierarchy (region tree)

`Region` and `Location` form a tree via `PART_OF`. To go from *Aldric's dagger* (micro) to *the Kingdom of Valdorn's stance in the Border Wars* (macro), the engine traverses:

```
Aldric's dagger (Item)
  → CREATED_BY    Aldric (Person)
  → MEMBER_OF     House Vyr (Faction)
  → RULES         Valdorn (Location)
  → PART_OF       Northern Reaches (Region)
  → CULTURE_OF    Valdorni (Culture)
  → LOCATED_IN    King Aelric's Court (Location)
  → CONTESTED_BY  Crimson Pact (Faction) in the Border Wars (Event)
```

**Seven hops.** With proper indexing, that's roughly 10–50 ms in Neo4j. We make it a single tool call: `expand_context(entity, hops=7)`.

**Tool:** `expand_context(entity, hops, relation_filter)` returns the n-hop neighborhood filtered to the specified relation types.

## The micro-anchoring problem

Here's the design risk: in a world with tens of thousands of `Item` nodes, "Aldric's dagger" is one of 50,000 daggers. The LLM shouldn't have to say "Aldric's dagger"; it should be able to say *"the dagger"* and have the engine infer which one.

**Solution: the context window.** Every LLM query in the engine happens in a *context window*: a working set of entities the LLM has been talking about.
The MCP server tracks the active context per session:

```json
{
  "active_context": [
    {"type": "Person",   "name": "Aldric Raventhorne", "id": "uuid-1"},
    {"type": "Location", "name": "Thornwall Keep",     "id": "uuid-2"},
    {"type": "Item",     "name": "Sword of Eventide",  "id": "uuid-3"}
  ]
}
```

When the LLM calls `lookup("the dagger")`, the engine:

1. First checks the active context for `Item` nodes.
2. If exactly one `Item` among them matches "dagger", returns it.
3. If several match, returns the disambiguation list and asks the LLM to pick.
4. If none match, falls back to a global fuzzy search.

The active context is updated automatically by other tools: every `state_at`, every `query_faction_at_time`, every `get_context` adds entities to the working set. The LLM doesn't manage it; the engine does.

**Tool:** `lookup(query)` is the disambiguating entry point.

## The "why does this matter" link

Some micro details matter only because of a macro context. Aldric's dagger matters because it's *the dagger that killed the Emperor*. The engine models this as a `SIGNIFICANCE_OF` edge from the item to a specific `Event`:

```cypher
(:Item {name: "Aldric's Dagger"})
  -[:SIGNIFICANCE_OF {
      role: "weapon",
      context: "the assassination of Emperor Vael of the Crimson Throne"
  }]->
(:Event {name: "Assassination of Emperor Vael"})
```

The LLM asking "why is this dagger famous?" gets its answer from a single tool call: `significance_of(item)`.

**Tool:** `significance_of(entity)` returns the historical events, cultural significance, or macro context that makes this entity noteworthy.

## Composition patterns (what the LLM does, not what the schema has)

These are the *recipes* the LLM uses, documented in `07-reasoning-harness.md` but prefigured here:

| Question | Tool sequence |
|---|---|
| "Who is Aldric?" | `entity_context(Aldric)` + `list_lineage(Aldric)` + `significance_of(Aldric)` |
| "What was happening in Valdorn in 340 TA?" | `state_at(Valdorn, "3rd_age.year_340")` + `entities_present(Valdorn, "3rd_age.year_340")` |
| "Why does Aldric care about the Crimson Throne?" | `entity_context(Aldric)` → get `MEMBER_OF House Vyr` → `expand_context(House Vyr, hops=2)` → find `RULES Crimson Throne` |
| "Is the dagger in the museum the real one?" | `lookup("the dagger")` → `significance_of(dagger)` → if Event-tagged, `get_event_chain(dagger)` to verify lineage of possession |
| "Was the Long Winter caused by the Sundering?" | `get_event_chain(Sundering)` → check for `CAUSED` edges to Long Winter |

## The risk: traversal explosion

A naive `expand_context` with `hops=10` on a dense world will return thousands of nodes and overflow the LLM's own context window. We mitigate in four ways:

1. **Default hops=2.** The LLM must explicitly request deeper traversal, and the tool warns at hops=4 or more.
2. **Relation filters.** `expand_context(Aldric, hops=2, relation_filter=["MEMBER_OF", "RULES"])` returns only nodes reached through those relations.
3. **Confidence thresholds.** Nodes with `source_confidence < 0.5` are excluded unless requested.
4. **Result caps.** A hard cap of 200 nodes per call. The LLM paginates.

**Tool:** `expand_context(entity, hops, relation_filter, min_confidence, limit)` is rate-limited and capped.

## Summary

The macro↔micro layer is the part of the engine that makes the difference between *"a graph full of facts"* and *"a world you can reason about."* The ontology is the data; this is the access pattern.

Three rules of thumb:

- If the LLM has to do a 3+ hop traversal to answer a basic question, the ontology is missing an edge.
- If the LLM can't disambiguate *"the dagger"*, the active context is broken.
- If the LLM doesn't know *why* a fact matters, the `SIGNIFICANCE_OF` pattern is missing.
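
The traversal limits listed under "The risk: traversal explosion" can be sketched as a bounded breadth-first search. This is a minimal in-memory illustration in Python, not the engine's implementation. The `Edge` tuple, the `confidence` lookup table, the returned dict shape, and the choice to follow outgoing edges only are all assumptions made for the example. Rate limiting and pagination are left out.

```python
from collections import deque
from typing import NamedTuple


class Edge(NamedTuple):
    """Hypothetical in-memory edge; the real engine keeps these in the graph."""
    source: str
    relation: str
    target: str


def expand_context(edges, entity, hops=2, relation_filter=None,
                   min_confidence=0.5, limit=200, confidence=None):
    """Return the n-hop neighborhood of `entity`, bounded by hop count,
    relation filter, confidence threshold, and a hard result cap."""
    confidence = confidence or {}  # node name -> source_confidence (assumed shape)
    warnings = []
    if hops >= 4:
        warnings.append(f"hops={hops} may return a very large neighborhood")

    # Index outgoing edges by source node (assumption: outgoing edges only).
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge.source, []).append(edge)

    seen = {entity}
    nodes = []  # (name, depth) pairs in breadth-first order
    queue = deque([(entity, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == hops:
            continue  # hop budget is spent on this branch
        for edge in outgoing.get(current, []):
            if relation_filter is not None and edge.relation not in relation_filter:
                continue
            if edge.target in seen:
                continue
            # Nodes with no recorded confidence are treated as fully trusted.
            if confidence.get(edge.target, 1.0) < min_confidence:
                continue
            if len(nodes) >= limit:
                # Another qualifying node exists beyond the cap.
                return {"nodes": nodes, "truncated": True, "warnings": warnings}
            seen.add(edge.target)
            nodes.append((edge.target, depth + 1))
            queue.append((edge.target, depth + 1))
    return {"nodes": nodes, "truncated": False, "warnings": warnings}


# Toy data using names from this document; the confidence value is made up.
EDGES = [
    Edge("Aldric Raventhorne", "MEMBER_OF", "House Vyr"),
    Edge("Aldric Raventhorne", "LOCATED_IN", "Thornwall Keep"),
    Edge("House Vyr", "RULES", "Valdorn"),
    Edge("House Vyr", "POSSESSES", "Sword of Eventide"),
    Edge("Valdorn", "PART_OF", "Northern Reaches"),
]
CONFIDENCE = {"Sword of Eventide": 0.3}

result = expand_context(EDGES, "Aldric Raventhorne", confidence=CONFIDENCE)
print(result["nodes"])
# [('House Vyr', 1), ('Thornwall Keep', 1), ('Valdorn', 2)]
```

With the defaults, the low-confidence sword is dropped and Northern Reaches sits one hop past the budget. Passing `relation_filter={"MEMBER_OF", "RULES"}` narrows the result to House Vyr and Valdorn, and `limit=1` returns only House Vyr with `truncated` set, which is the signal a caller would use to request the next page.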