# 05 — MCP Tool Catalog

The complete tool surface the LLM uses to reason about the world. Each tool has one job. Higher-level patterns are compositions, not bigger tools.

Base tools. None of these are inherited as-is from Cognee; all 45 tools in this catalog are Lore Engine handlers. `semantic_search` delegates to `cognee.recall`; the rest are Lore Engine originals:

- `semantic_search` — vector search over chunks (wraps `cognee.recall`)
- `graph_traverse` — n-hop traversal
- `get_context` — full context for a single entity
- `get_person_profile` — entity summary
- `query_as_npc` — NPC-scoped query
- `log_encounter` — write an encounter (world-builder write tool)
- `get_unresolved` — list provisional entities (built in slice 2)
- `get_contradictions` — list contradictions (built in slice 2)

New tools are grouped by function. Signatures use TypeScript-ish notation for clarity; the actual JSON-RPC schemas live in the MCP server source.

---

## Group 1: Identity & disambiguation

### `lookup(query, type?)`

The disambiguating entry point. *"The dagger"* → which one?

**Parameters:**
- `query` (string) — name, alias, or partial name
- `type` (string, optional) — restrict to one label (`Person`, `Item`, etc.)

**Returns:** Array of matching nodes with name, type, ID, aliases, and a `match_confidence` score. The LLM picks one (or asks the user).

**Why this exists:** Without it, the LLM has to guess entity IDs, which it cannot do reliably. This is the single highest-leverage tool for reducing hallucination.
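A minimal sketch of the disambiguation step, assuming an in-memory entity list and fuzzy string matching from Python's standard `difflib`. The `ENTITIES` records, the 0.5 cutoff, and the scoring method are illustrative assumptions; the real handler may score matches quite differently.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the graph's entity index (sample data only).
ENTITIES = [
    {"id": "item-1", "type": "Item", "name": "The Dagger of Vyr", "aliases": ["the dagger"]},
    {"id": "item-2", "type": "Item", "name": "Crimson Dagger", "aliases": []},
    {"id": "person-1", "type": "Person", "name": "Aldric Raventhorne", "aliases": ["Aldric"]},
]

def lookup(query, type=None):
    """Return candidate entities sorted by match_confidence, best first."""
    q = query.lower()
    results = []
    for entity in ENTITIES:
        if type is not None and entity["type"] != type:
            continue
        # Score against the name and every alias, and keep the best score.
        names = [entity["name"]] + entity["aliases"]
        score = max(SequenceMatcher(None, q, n.lower()).ratio() for n in names)
        if score >= 0.5:  # assumed cutoff, not a documented value
            results.append({**entity, "match_confidence": round(score, 2)})
    return sorted(results, key=lambda r: r["match_confidence"], reverse=True)
```

Returning every candidate above the cutoff, rather than only the best one, is what lets the LLM ask the user when two daggers score similarly.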

---

### `entity_context(name, at_time?)`

One-hop summary of an entity. *"Who is Aldric?"* → answer in one call.

**Parameters:**
- `name` (string) — entity name
- `at_time` (string, optional) — canonical time, e.g. `3rd_age.year_345`. If omitted, returns the entity's "current" state (uses `current` reserved token).

**Returns:**
```json
{
  "entity": { "type": "Person", "name": "Aldric Raventhorne", "id": "uuid-..." },
  "at_time": "3rd_age.year_345",
  "factions": [{"name": "House Vyr", "valid_from": "...", "valid_until": "..."}],
  "locations": [{"name": "Thornwall Keep", ...}],
  "cultures": [...],
  "titles": [...],
  "languages": [...],
  "magic_systems": [...],
  "deities": [...],
  "items_possessed": [...],
  "alive": true,
  "lifespan": {"from": "3rd_age.year_300", "until": "3rd_age.year_360"}
}
```

**Why this exists:** Most questions start with "who is X?" or "what is X?" This is the cheapest possible answer. The LLM can always drill deeper with `expand_context`.

---

## Group 2: Time-aware queries

### `was_true_at(relation, subject, object, at_time)`

The most common time-aware query. *"Were House Vyr and the Crimson Pact allied in 340 TA?"*

**Parameters:**
- `relation` (string) — edge type, e.g. `ALLIED_WITH`, `RULES`, `POSSESSES`
- `subject` (string) — subject name
- `object` (string) — object name
- `at_time` (string) — canonical time

**Returns:**
```json
{
  "was_true": true,
  "valid_from": "3rd_age.year_312",
  "valid_until": "3rd_age.year_345",
  "sources": ["chronicles-vyr.md", "pact-treaties.md"],
  "confidence": 0.92
}
```

If no matching edge is found, the result is `"was_true": false`. The LLM is told that results come from the canonical sources; if `confidence < 0.5`, it should qualify the claim.
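The interval check behind this tool can be sketched as follows. The canonical-time grammar in `parse_time` is inferred from the examples in this document (`3rd_age.year_345`) and is an assumption, as are the `EDGES` store and the choice to treat both interval endpoints as inclusive.

```python
import re

def parse_time(t):
    """Parse '3rd_age.year_345' into a sortable (age, year) tuple.

    The grammar is an assumption based on the examples in this document.
    """
    m = re.fullmatch(r"(\d+)(?:st|nd|rd|th)_age\.year_(\d+)", t)
    if m is None:
        raise ValueError(f"unrecognized canonical time: {t!r}")
    return int(m.group(1)), int(m.group(2))

# Hypothetical edge store; the values mirror the example response above.
EDGES = [
    {"relation": "ALLIED_WITH", "subject": "House Vyr", "object": "Crimson Pact",
     "valid_from": "3rd_age.year_312", "valid_until": "3rd_age.year_345",
     "sources": ["chronicles-vyr.md", "pact-treaties.md"], "confidence": 0.92},
]

def was_true_at(relation, subject, object, at_time):
    """Return the matching edge's validity window, or was_true=False."""
    t = parse_time(at_time)
    for e in EDGES:
        if (e["relation"], e["subject"], e["object"]) != (relation, subject, object):
            continue
        # Assumption: both endpoints of the validity interval are inclusive.
        if parse_time(e["valid_from"]) <= t <= parse_time(e["valid_until"]):
            keys = ("valid_from", "valid_until", "sources", "confidence")
            return {"was_true": True, **{k: e[k] for k in keys}}
    return {"was_true": False}
```

Comparing `(age, year)` tuples keeps ordering correct across ages without converting everything to a single absolute year.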

---

### `true_during(relation, subject, object, era)`

*"During the Third Age, when was House Vyr at war with anyone?"*

**Parameters:**
- `relation` (string) — edge type
- `subject` (string)
- `object` (string, optional) — if omitted, returns all `object`s of that relation in the time window
- `era` (string) — canonical era, e.g. `3rd_age`

**Returns:** Array of intervals with `valid_from`, `valid_until`, `object` (or `subject`), `sources`.

---

### `state_at(entity, at_time)`

*"What was Valdorn like in 340 TA?"* — full snapshot.

**Parameters:**
- `entity` (string)
- `at_time` (string)

**Returns:** A comprehensive state object: ruling faction(s), controlling faction(s), notable persons present, ongoing events, current contradictions, magic systems in use, deities worshipped, languages spoken, items of note.

**Why this exists:** This is the answer to "what was the world like at time T?" It composes `entity_context` with `entities_present` and `event_chain` filtered by time.

---

### `entities_present(location, at_time, type?)`

*"Who was in Mardsville in 340 TA?"*

**Parameters:**
- `location` (string)
- `at_time` (string)
- `type` (string, optional) — restrict to `Person`, `Faction`, `Creature`, `Item`

**Returns:** Array of entities whose `LOCATED_IN` (or `CONTROLS`, for factions) edge was valid at `at_time`.
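As a rough illustration of the edge selection described above, the sketch below filters a hypothetical edge list. Times are assumed to be already resolved to integer years for brevity, and the sample edges (including the person named Mira) are made up for the example.

```python
# Hypothetical edge records; times are pre-resolved to integer years.
EDGES = [
    {"relation": "LOCATED_IN", "subject": "Aldric", "subject_type": "Person",
     "object": "Mardsville", "valid_from": 330, "valid_until": 342},
    {"relation": "CONTROLS", "subject": "House Vyr", "subject_type": "Faction",
     "object": "Mardsville", "valid_from": 300, "valid_until": 345},
    {"relation": "LOCATED_IN", "subject": "Mira", "subject_type": "Person",
     "object": "Mardsville", "valid_from": 350, "valid_until": 360},
]

def entities_present(location, at_time, type=None):
    """List entities whose presence edge for `location` was valid at `at_time`."""
    present = []
    for e in EDGES:
        # Factions count as present through CONTROLS; everything else
        # counts through LOCATED_IN.
        wanted = "CONTROLS" if e["subject_type"] == "Faction" else "LOCATED_IN"
        if e["relation"] != wanted or e["object"] != location:
            continue
        if type is not None and e["subject_type"] != type:
            continue
        if e["valid_from"] <= at_time <= e["valid_until"]:
            present.append({"name": e["subject"], "type": e["subject_type"]})
    return present
```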

---

### `timeline(entity, relation_type?, from?, to?)`

*"What did Aldric do, in order?"*

**Parameters:**
- `entity` (string)
- `relation_type` (string, optional) — filter to one edge type
- `from` (string, optional) — start time
- `to` (string, optional) — end time

**Returns:** Chronologically sorted array of events/relations involving the entity, with the source document for each. The LLM can hand this back to the user as a "biography."
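The filter-then-sort behaviour might look like the sketch below. The `RECORDS` list and its source file names are placeholders, times are assumed to be integer years, and the `from` parameter is spelled `from_` because `from` is a reserved word in Python.

```python
# Hypothetical records; sources are placeholder file names.
RECORDS = [
    {"entity": "Aldric", "relation": "PARTICIPATED_IN", "target": "Border Wars",
     "time": 341, "source": "doc-a.md"},
    {"entity": "Aldric", "relation": "RULES", "target": "Thornwall Keep",
     "time": 345, "source": "doc-b.md"},
    {"entity": "Aldric", "relation": "MEMBER_OF", "target": "House Vyr",
     "time": 300, "source": "doc-a.md"},
]

def timeline(entity, relation_type=None, from_=None, to=None):
    """Return the entity's records in chronological order, after filtering."""
    rows = [
        r for r in RECORDS
        if r["entity"] == entity
        and (relation_type is None or r["relation"] == relation_type)
        and (from_ is None or r["time"] >= from_)
        and (to is None or r["time"] <= to)
    ]
    return sorted(rows, key=lambda r: r["time"])
```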

---

## Group 3: Lineage & hierarchy

### `list_lineage(person, depth?)`

*"What bloodline is Aldric part of, and who else?"*

**Parameters:**
- `person` (string)
- `depth` (integer, optional, default=2) — how many `PARENT_OF`/`DESCENDED_FROM` hops to traverse

**Returns:**
```json
{
  "lineage": { "name": "House Vyr (bloodline)", "founding_ancestor": "..." },
  "members": [{"name": "Aldric", "relation": "self"}, ...],
  "cadet_branches": [...],
  "depth_covered": 2
}
```

---

### `list_offspring(person)`

*"Who are Aldric's children?"* Direct children, no recursion. Cheaper than `list_lineage` for simple cases.

---

### `ancestors_of(person, generations?)`

*"Who were Aldric's grandparents and great-grandparents?"* Walks `PARENT_OF` upward, returns the chain.

---

### `descendants_of(person, generations?)`

The inverse — walks `PARENT_OF` downward.
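Both walks can share one helper, as in this sketch. The `PARENT_OF` pairs use placeholder names, and the default of two generations is an assumption, since the catalog does not state a default for `generations`.

```python
# Hypothetical PARENT_OF edges as (parent, child) pairs with placeholder names.
PARENT_OF = [
    ("Grandsire", "Father"),
    ("Father", "Aldric"),
    ("Mother", "Aldric"),
    ("Aldric", "Child"),
]

def _walk(person, generations, upward):
    """Return one sorted list of names per generation, nearest first."""
    chain, frontier, seen = [], {person}, {person}
    for _ in range(generations):
        if upward:
            nxt = {p for p, c in PARENT_OF if c in frontier}
        else:
            nxt = {c for p, c in PARENT_OF if p in frontier}
        nxt -= seen  # guard against cycles in bad data
        if not nxt:
            break
        chain.append(sorted(nxt))
        seen |= nxt
        frontier = nxt
    return chain

def ancestors_of(person, generations=2):  # default is an assumption
    return _walk(person, generations, upward=True)

def descendants_of(person, generations=2):  # default is an assumption
    return _walk(person, generations, upward=False)
```

With `generations=1`, `descendants_of` returns the same direct children that `list_offspring` is described as returning.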

---

### `location_hierarchy(location, direction?)`

*"What is Thornwall Keep part of?"* (up) or *"What is part of Valdorn?"* (down).

**Parameters:**
- `location` (string)
- `direction` (string, optional) — `up` (parent regions/kingdoms) or `down` (sub-locations)

**Returns:** The geographic / political hierarchy above or below the location, with `RULES`/`CONTROLS` annotations.

---

## Group 4: Causal & event chains

### `event_chain(event, depth?)`

*"What caused the Sundering, and what did it cause?"*

**Parameters:**
- `event` (string)
- `depth` (integer, optional, default=2) — `CAUSED`/`PRECEDED`/`CONCURRENT_WITH` traversal depth

**Returns:** A graph structure with the event, its causes (depth 1+), its effects (depth 1+), and any concurrent events. Critical for "why did X happen" and "what were the consequences of X."
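A depth-limited traversal in both directions could be sketched like this. Only `CAUSED` edges are modelled here; `PRECEDED` and `CONCURRENT_WITH` are left out to keep the example short, and the event names other than the Sundering are placeholders.

```python
from collections import deque

# Hypothetical CAUSED edges as (cause, effect) pairs.
CAUSED = [
    ("Event A", "Event B"),
    ("Event B", "The Sundering"),
    ("The Sundering", "Event C"),
    ("Event C", "Event D"),
]

def _reach(start, depth, forward):
    """Breadth-first walk; returns {event: hop_distance} within `depth` hops."""
    found, queue = {}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue
        for cause, effect in CAUSED:
            src, dst = (cause, effect) if forward else (effect, cause)
            if src == node and dst not in found and dst != start:
                found[dst] = dist + 1
                queue.append((dst, dist + 1))
    return found

def event_chain(event, depth=2):
    """Return causes and effects of `event`, each tagged with its hop distance."""
    return {
        "event": event,
        "causes": _reach(event, depth, forward=False),
        "effects": _reach(event, depth, forward=True),
    }
```

Tagging each node with its hop distance lets the caller tell a direct cause from a remote one without re-walking the graph.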

---

### `events_during(era, location?, type?)`

*"What battles happened in the Northern Reaches during the Third Age?"*

**Parameters:**
- `era` (string)
- `location` (string, optional)
- `type` (string, optional) — `Event` label or a sub-classification if you create them

**Returns:** Array of events, sorted by in-fiction date.

---

## Group 5: Knowledge & lore

### `lore_about(entity, type?, limit?)`

*"What do the chronicles say about Aldric?"*

**Parameters:**
- `entity` (string)
- `type` (string, optional) — `LoreSource` source_type filter: `prose`, `timeline`, `family_tree`, etc.
- `limit` (integer, optional, default=10)

**Returns:** Array of `LoreSource` documents that mention the entity, with the relevant chunks and a relevance score.

---

### `cite(claim)`

*"Where does the engine get that from?"* — given a claim (a string), return the source documents and the specific chunks that support it.

**Parameters:**
- `claim` (string) — natural language claim

**Returns:** Array of source chunks with similarity scores. The LLM can then say "according to..." and name the source.

**Why this exists:** Source attribution is a first-class feature. The LLM must always be able to back up its claims. This is the inverse of `semantic_search` — instead of "find me chunks that match X," it's "for this claim, where did it come from?"
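To make the claim-to-source direction concrete, here is a toy sketch that ranks chunks by token overlap (Jaccard similarity). This is a deliberately simple stand-in for whatever vector similarity the engine actually uses; the chunk texts, the `bestiary.md` file name, and the `min_score` cutoff are all made up for the example.

```python
import re

# Hypothetical chunk store (sample text only).
CHUNKS = [
    {"source": "chronicles-vyr.md",
     "text": "House Vyr and the Crimson Pact were allied for three decades."},
    {"source": "pact-treaties.md",
     "text": "The treaty bound the Crimson Pact to House Vyr."},
    {"source": "bestiary.md", "text": "Marsh drakes nest in reed beds."},
]

def _tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def cite(claim, min_score=0.2):
    """Rank chunks by Jaccard token overlap with the claim, best first."""
    claim_tokens = _tokens(claim)
    hits = []
    for chunk in CHUNKS:
        chunk_tokens = _tokens(chunk["text"])
        union = claim_tokens | chunk_tokens
        score = len(claim_tokens & chunk_tokens) / len(union) if union else 0.0
        if score >= min_score:  # assumed cutoff for "supports the claim"
            hits.append({**chunk, "similarity": round(score, 2)})
    return sorted(hits, key=lambda h: h["similarity"], reverse=True)
```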

---

## Group 6: Consistency (see `04-consistency.md` for full details)

| Tool | Purpose |
|---|---|
| `get_contradictions(subject?, severity?, limit?)` | List flagged contradictions |
| `get_anachronisms(entity?, limit?)` | List flagged anachronisms |
| `get_ontology_violations(rule_id?, severity?, limit?)` | List ontology rule violations |
| `get_orphans(reason?, limit?)` | List entities with missing structural data |
| `flag_for_review(node_id, reason)` | LLM marks a node suspicious |
| `explain_violation(node_id)` | Returns the rule, edges, and sources behind a violation |
| `run_consistency_check(scope?)` | Force a check over `entity`/`era`/`all` |
| `latest_run()` | Most recent `ConsistencyRun` summary |
| `add_ontology_rule(id, cypher, description, severity)` | World-builder only |
| `list_ontology_rules()` | Browse rules |

---

## Group 7: Generation & narrative (advanced)

### `summarize_chain(entity, depth?, style?)`

*"Summarize the chain of events that produced Aldric's reign."*

Walks the `event_chain` from a starting point, condenses the result into a narrative paragraph or bullet list, and returns it with citations.

**Parameters:**
- `entity` (string)
- `depth` (integer, optional, default=3)
- `style` (string, optional) — `bullet`, `paragraph`, `chronicle`, `whispers` (in-character first-person)

**Returns:** A formatted text block plus a citation map. The LLM uses this as a *base* for its response, then can rewrite in any voice.

**Caveat:** This tool calls an LLM internally to produce the prose. It's the one place in the engine that does narrative generation. The LLM is told to use the returned text as raw material, not as a final answer.

---

### `narrate_arc(start_event, end_event, perspective?)`

*"Tell me the story of the Border Wars, from the Valdorni perspective."*

Composes `event_chain`, `entities_present`, and `summarize_chain` into an arc-narrative.

**Parameters:**
- `start_event` (string)
- `end_event` (string)
- `perspective` (string, optional) — a `Person`, `Faction`, or `Culture` whose `WITNESSED` / `PARTICIPATED_IN` edges filter the events

**Returns:** A multi-paragraph narrative, perspective-filtered, with a timeline of citations at the end.

**Why this exists:** This is the "narrative mode" tool. The LLM can hand the result to the user as a short story seed, or use it as the spine of a longer piece.

---

## Group 8: World-building (not for the LLM during inference)

These are tools for the *human world-builder*, exposed via the MCP server but not in the LLM's primary tool list. The LLM can use them but rarely needs to.

| Tool | Purpose |
|---|---|
| `add_entity(label, name, properties)` | Create a new entity |
| `add_relation(from, relation, to, valid_from?, valid_until?)` | Create a time-bound edge |
| `add_lore_source(title, source_type, content, author?)` | Ingest a new document |
| `merge_entities(id_a, id_b)` | Merge two entities that refer to the same thing |
| `set_alias(entity_id, alias)` | Add an alias |
| `define_era(name, parent_era?, start, end)` | Add a new era |
| `define_calendar(name, months)` | Add a new calendar |
| `define_date(slug, label, era, year, month?, day?)` | Add a new Date node |
| `delete_node(id, reason)` | Soft-delete a node |

These exist so the MCP server is the *only* write surface for the graph. The LLM doesn't need them but the world-builder does, and having one canonical surface is critical for consistency.

---

## Composition patterns (the recipes)

The LLM doesn't use these tools in isolation. It uses *patterns*. Five patterns cover about 90% of world-reasoning questions:

### Pattern 1: "Who/what is X?"

```
entity_context(X) → if insufficient, expand_context(X, hops=2)
```

### Pattern 2: "Did X happen at time T?" / "Was X true at time T?"

```
was_true_at(RELATION, subject, object, T)
```

### Pattern 3: "What was the world like at time T?" / "What was happening in Y at T?"

```
state_at(Y, T) + entities_present(Y, T)
```

### Pattern 4: "How is X connected to Y?" / "What's the relationship?"

```
expand_context(X, hops=3, relations=[...]) → filter for Y
```

### Pattern 5: "Why did X happen?" / "What are the consequences of X?"

```
event_chain(X, depth=3) + cite(claims_from_chain)
```

These five are the recipes the reasoning harness (`07-reasoning-harness.md`) spells out for the LLM.
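Pattern 5 can be read as a small composition function. The sketch below assumes the tools are passed in as plain callables and that `event_chain` returns flat lists of causes and effects; both are simplifications, and `stub_tools` exists only so the example runs without a server.

```python
def why_did_it_happen(event, tools, depth=3):
    """Pattern 5: walk the causal chain, then attach citations to each claim."""
    chain = tools["event_chain"](event, depth)
    claims = [f"{cause} contributed to {event}" for cause in chain["causes"]]
    claims += [f"{event} contributed to {effect}" for effect in chain["effects"]]
    return [{"claim": c, "citations": tools["cite"](c)} for c in claims]

# Stub tools standing in for real MCP calls (hypothetical return shapes).
stub_tools = {
    "event_chain": lambda event, depth: {"causes": ["Event A"], "effects": ["Event B"]},
    "cite": lambda claim: [{"source": "doc-a.md", "similarity": 0.5}],
}

answer = why_did_it_happen("The Sundering", stub_tools)
```

Keeping the composition outside the tools matches the design principle at the top of this catalog: each tool has one job, and patterns are compositions.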

---

## Starter ontology rules (out of the box)

The first 10 rules that ship with the engine:

1. `no-overlapping-rulers` — A location cannot have two `RULES` edges active at the same time.
2. `no-overlapping-spouses` — A person cannot have two `SPOUSE_OF` edges active at the same time.
3. `no-anachronism-participation` — A person cannot have a `PARTICIPATED_IN` edge to an event outside their lifespan.
4. `no-anachronism-rule` — A faction cannot have a `RULES` edge to a location before its founding or after its dissolution.
5. `no-orphan-events` — Every `Event` must have both an `OCCURRED_AT` and an `OCCURRED_DURING` edge.
6. `no-orphan-locations` — Every `Location` must have a `PART_OF` edge (to a parent region, even if it's `Unmapped Lands`).
7. `lineage-continuity` — Every `Lineage` must have a `founding_ancestor` and at least one `MEMBER_OF` Person.
8. `magic-system-coherence` — A `Spell` cannot exist in a `MagicSystem` that has no `PRACTICES` edges in the relevant era.
9. `deity-worship-coherence` — A `Person` cannot have a `WORSHIPS` edge to a `Deity` that does not exist in their era.
10. `item-lineage` — An `Item` with `INHERITED_BY` edges must have a `CREATED` edge (it was made by someone).

The user can disable any rule by ID, and add new ones via `add_ontology_rule`.
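The logic of rule 1 (`no-overlapping-rulers`) can be illustrated in plain Python. The shipped rules are expressed as Cypher (see `add_ontology_rule`), so this is only a rendering of the same check over hypothetical sample edges, with integer years and inclusive endpoints assumed.

```python
from itertools import combinations

# Hypothetical RULES edges; times are integer years with inclusive endpoints.
RULES_EDGES = [
    {"ruler": "House Vyr", "location": "Valdorn", "valid_from": 300, "valid_until": 345},
    {"ruler": "Crimson Pact", "location": "Valdorn", "valid_from": 340, "valid_until": 360},
    {"ruler": "House Vyr", "location": "Thornwall Keep", "valid_from": 300, "valid_until": 360},
]

def no_overlapping_rulers(edges):
    """Return one violation per pair of RULES edges that overlap on a location."""
    violations = []
    for a, b in combinations(edges, 2):
        if a["location"] != b["location"]:
            continue
        # Two closed intervals overlap when each starts before the other ends.
        if a["valid_from"] <= b["valid_until"] and b["valid_from"] <= a["valid_until"]:
            violations.append({
                "rule_id": "no-overlapping-rulers",
                "location": a["location"],
                "rulers": [a["ruler"], b["ruler"]],
            })
    return violations
```

Whether a handover in a single shared year should count as an overlap is a modelling decision; this sketch treats it as one.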

---

## Tool count: 45 total (8 base + 37 domain)

The full catalog is 8 base tools plus 37 domain tools across Groups 1–8, for **45 MCP tools** in total. All are Lore Engine handlers, exposed through the **Lore Engine's own MCP server**. Cognee's `cognee-mcp` is a separate, fixed 14-tool surface, and we don't register into it. The server calls Cognee's Python API (`remember`/`recall`/`add_data_points`) in-process.

Of the base tools, `semantic_search` wraps `cognee.recall` directly, and a handful of other tools delegate to it as a fallback. Recall is a fallback not because it is low-precision (it returns scored, multi-source results) but because its results are un-typed, un-cited, and un-time-bounded.

Forty-five tools is well past the empirical LLM tool-use ceiling (~25 in a single system prompt), so the Phase 6 reasoning-harness validation measures usage and collapses the long tail. The LLM uses 5–8 of the tools 90% of the time; the long tail exists for edge cases where the LLM would otherwise have to fall back to free-text generation.

We mitigate the size of the catalog in three ways:
- The reasoning harness documents which 8 to use first.
- Tools are grouped by function in the system prompt.
- The `lookup` tool + active context reduce the need to remember entity IDs.

If the LLM gets tool-confused in practice, the next move is to collapse `entities_present` into `state_at`, and `narrate_arc` into `summarize_chain` with a parameter. **We start big; we collapse based on observed usage.**
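One way to ground the collapse decision in observed usage is to split a tool-call log into a head and a long tail. This is a sketch under stated assumptions: the log format (a flat list of tool names) and the 0.9 coverage target are illustrative, not a description of the Phase 6 harness.

```python
from collections import Counter

def split_head_and_tail(call_log, coverage=0.9):
    """Return (head, tail): the most-used tools that together reach
    `coverage` of all calls, and every remaining tool."""
    counts = Counter(call_log)
    total = sum(counts.values())
    head, covered = [], 0
    for tool, n in counts.most_common():
        if covered / total >= coverage:
            break
        head.append(tool)
        covered += n
    tail = [t for t in counts if t not in head]
    return head, tail

# Made-up call log for illustration only.
log = (["lookup"] * 50 + ["entity_context"] * 30 + ["was_true_at"] * 15
       + ["narrate_arc"] * 3 + ["define_calendar"] * 2)
head, tail = split_head_and_tail(log)
```

Tools that stay in the tail across many sessions are the natural candidates for merging into a neighbouring tool with an extra parameter.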