Add unit tests for LLM clients, persona loader, and XP/Foundry rewards

Expands the unit test suite from 320 to 380 tests (+60) and adds a Gitea Actions CI workflow. Closes all six follow-up recommendations from the test-architecture validation report.

New tests (tests/unit/):
- ollamaClient.test.ts — Ollama SDK wrapper, options passthrough
- litellmClient.test.ts — OpenAI SDK wrapper, model fallback
- personaLoader.test.ts — Zod validation + cache invalidation
- foundryReward.test.ts — Tool plugin: lookup, errors, partial grants
- xpAwarder.test.ts — Bulk XP awards + per-player skip reasons
- redisErrorPath.test.ts — Singleton error handler does not crash
- messageRouterRunLLMTurn.test.ts — 18 cases for the runtime heart: narrative-only path, tool dispatch, filter correction, retry loop guard, missed-skill-check heuristic, typing indicator interval, LLM error fallback, archive on resolve.

Coverage (line %):
- harness/litellmClient.ts 0 → 100
- harness/ollamaClient.ts 0 → 100
- harness/tools/foundryReward.ts 0 → 100
- session/xpAwarder.ts 0 → 100
- persona/loader.ts 0 → 100
- db/redis.ts 0 → 100
- bot/handlers/messageRouter.ts 0 → 39.86 (runLLMTurn now covered)

Tooling:
- package.json: + test:coverage, test:watch scripts
- devDep: @vitest/coverage-v8@^3.1.0
- tests/README.md: conventions, anti-patterns, template map
- .gitignore: exclude coverage/
- .gitea/workflows/test.yml: Node 22, npm cache, tsc --noEmit gate

Documentation (from earlier /bmad-document-project run, now committed):
- docs/index.md
- docs/project-overview.md
- docs/architecture.md
- docs/deployment-guide.md
- docs/api-contracts.md
- docs/data-models.md
- docs/source-tree-analysis.md
- docs/component-inventory.md
- docs/development-guide.md
- _bmad-output/test-artifacts/automate-validation-report.md

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-19 05:59:13 +00:00
parent f406800cc5
commit e2c92e854f
22 changed files with 4369 additions and 43 deletions
--- a/.gitea/workflows/test.yml
+++ b/.gitea/workflows/test.yml
@@ -0,0 +1,40 @@
+name: tests
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  unit:
+    name: Unit tests (Node 22)
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js 22
+        uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+
+      - name: Cache npm dependencies
+        uses: actions/cache@v4
+        with:
+          path: ~/.npm
+          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
+          restore-keys: |
+            npm-${{ runner.os }}-
+
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Type check
+        run: npx tsc --noEmit
+
+      - name: Run unit tests
+        run: npm run test:unit
+
+      - name: Run coverage
+        run: npm run test:coverage
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
 node_modules/
 dist/
+coverage/
 .env
 *.log
 .DS_Store
--- a/_bmad-output/test-artifacts/automate-validation-report.md
+++ b/_bmad-output/test-artifacts/automate-validation-report.md
@@ -0,0 +1,291 @@
+# Automate Validation Report
+
+> Validation of the existing Mardonar Encounter Engine test suite against the bmad-testarch-automate checklist.
+> Generated 2026-06-19, validate mode.
+
+## ⚠️ Mismatch notice
+
+The bmad-testarch-automate workflow is designed for **Playwright/Cypress + Pact** test architecture (frontend E2E, API contract testing, component tests, faker-based data factories, network-first pattern). The Mardonar Encounter Engine is a **Vitest-based backend** with no UI, no E2E browser tests, and no consumer-driven contract suite. Many checklist items below are marked **N/A** because the project's test stack is intentionally different.
+
+The validation here maps each section to the project's reality, marking what's applicable, what doesn't apply, and what applies but isn't fully met. This is **not** a plan to add Playwright tests — it is an honest audit of the existing Vitest suite against the workflow's quality bar.
+
+---
+
+## Prerequisites
+
+| Check | Status | Note |
+|---|---|---|
+| Framework scaffolding configured | ✅ (Vitest) | `vitest.config.ts` present, v8 coverage enabled |
+| Test directory structure | ✅ | `tests/unit/`, `tests/integration/`, `tests/fixtures/` |
+| Package.json has test framework deps | ✅ | `vitest@^3.1.0` in devDependencies, `ioredis-mock` for test infra |
+
+**Halting conditions:** None. Framework is present (Vitest, not Playwright/Cypress, but the workflow accepts Standalone Mode if framework is detected).
+
+---
+
+## Step 1: Execution Mode and Context
+
+### Mode detection
+- **Mode:** Standalone / Auto-discover — no BMad artifacts (story, tech-spec, PRD) were loaded; no `{target_feature}` or `{target_files}` specified.
+
+### BMad artifacts
+- [ ] PRD available at `{project-root}/prd.md` (Dynamic Goal Registration feature) — **not loaded into the validation**, since the workflow is for *generating* tests, not auditing existing ones.
+
+### Framework configuration
+- ✅ Test framework config loaded: `vitest.config.ts` with `globals: true`, `environment: 'node'`, v8 coverage
+- ✅ Test dir: `tests/` with `unit/`, `integration/`, `fixtures/` subdirs
+- ✅ Test pattern: `tests/**/*.test.ts` (24 unit files, 1 integration file)
+- ✅ No explicit parallelism settings; the Vitest default (test files run in parallel) applies
+
+### Coverage analysis
+- **Tested:** promptBuilder, contextAssembler, toolParser, toolDispatcher, sessionManager, playerRegistry, characterRegistry, specLoader, rollHandler, rollDetection, responseFilter, queueCap, generationQueue, reactionManager, encounterLog, encounterDiscoveryEmbed, loreAnswerEmbed, skillCheckEmbed, graphmcpClient, foundryClientRetry, foundryClientFormatters, goalRegister, relaySession, config (24 entries, one per unit test file; modules without a direct test file are listed under Gaps below)
+- **Tested only indirectly:** `redis.ts` singleton is exercised via the `sessionManager` and `playerRegistry` tests, which use `ioredis-mock`
+- **Gaps (no direct unit test, but covered indirectly or low-risk):**
+  - `bot/index.ts` — entry point, hard to unit test (requires Discord.js Client mock)
+  - `bot/commands/dndname.ts`, `encounter.ts`, `character.ts`, `roll.ts`, `actions.ts`, `xp.ts`, `encounters.ts`, `turn.ts` — slash commands, hard to unit test (require `Interaction` mocks)
+  - `bot/handlers/mentionHandler.ts` — depends on `persona/loader.ts`, not directly tested
+  - `bot/handlers/messageRouter.ts` — the runtime heart; no direct test (no `runLLMTurn` interaction tests were found)
+  - `harness/litellmClient.ts` and `ollamaClient.ts` — HTTP client wrappers with no direct unit tests
+  - `harness/litellmClient.ts` / `ollamaClient.ts` HTTP retries / timeouts not unit-tested
+  - `db/redis.ts` — singleton, no error-path test (the `error` handler is registered but no test exercises Redis going down)
+  - `harness/tools/foundryReward.ts` — exists but no unit test found
+  - `persona/loader.ts` — no unit test
+  - `scripts/deploy-commands.ts` — not tested (run once per deploy)
+  - `lib/logger.ts` — trivial wrapper, no test
+  - `types/index.ts` — pure types, no test needed
+  - `session/xpAwarder.ts` — no unit test
+  - `graphmcp/loreResolver.ts`, `vocabularyResolver.ts` — no unit tests
+  - `vtt/foundryClient.ts` (high-level client) — partially tested via `foundryClientFormatters.test.ts` and `foundryClientRetry.test.ts`
+
+### Knowledge base fragments
+- N/A — workflow's knowledge base is Playwright/Pact-focused. Project uses Vitest with `globals: true` and has no factories directory (fixtures are limited to `tests/fixtures/spec.ts`).
+
+---
+
+## Step 2: Automation Targets
+
+### Test levels (per the project's stack)
+
+| Level | Status | Notes |
+|---|---|---|
+| E2E (browser) | N/A | No UI |
+| API (HTTP contract) | N/A | No HTTP server; bot is WebSocket-only |
+| Component (UI) | N/A | No UI components |
+| **Unit (Vitest)** | ✅ **Primary** | 24 files, 320 tests, 100% pass |
+| **Integration (Vitest + Docker)** | ⚠️ Present but underused | 1 file (`phase1.test.ts`); README says `npm run test:int` requires running services |
+
+### Duplicate coverage
+- ✅ No duplicate coverage — `responseFilter.test.ts` and `messageRouter`'s response filtering logic don't overlap (filter is tested in isolation; full integration is in `phase1.test.ts`)
+- ✅ Tool dispatch tested in `toolDispatcher.test.ts`; tool parser tested in `toolParser.test.ts` — no overlap
+- ✅ Per-tool behavior tested at the tool-plugin level (e.g. `goalRegister.test.ts`), not duplicated at the dispatcher level
+
+### Priority tagging
+- ❌ **Tests lack priority tags** (`[P0]`, `[P1]`, etc.) — the workflow expects them; the project does not use them. Vitest doesn't require this. Not blocking.
+
+### Coverage plan
+- ⚠️ **No coverage report committed** — `vitest.config.ts` enables v8 coverage but `npm run test:unit` does not request it; `package.json` has no `test:coverage` script. Coverage % is unknown.
+- ⚠️ **No coverage threshold enforced in CI** — no CI exists (also flagged in the architecture doc)
+
+---
+
+## Step 3: Test Infrastructure (Project-Specific)
+
+| Check | Status | Note |
+|---|---|---|
+| Test fixtures | ⚠️ Minimal | `tests/fixtures/spec.ts` exists; no `tests/support/` hierarchy |
+| `ioredis-mock` | ✅ | Used in `sessionManager.test.ts`, `playerRegistry.test.ts`, `characterRegistry.test.ts` |
+| Factory patterns | ❌ None | Tests use inline construction; no faker equivalent |
+| Auto-cleanup | ✅ Implicit | Each Vitest test file is a separate process; no shared state across files |
+| `vi.mock` for external services | ✅ Used | GraphMCP, VTT relay, LLM client mocked via `vi.mock` |
+
+---
+
+## Step 4: Test Files Generated
+
+### File organization
+- ✅ Unit tests in `tests/unit/`
+- ✅ Integration tests in `tests/integration/`
+- ✅ Fixtures in `tests/fixtures/`
+- ❌ No `tests/api/`, `tests/e2e/`, `tests/component/`, `tests/support/` (intentional — backend-only project)
+
+### Vitest-specific quality (project's actual conventions)
+
+| Check | Status | Note |
+|---|---|---|
+| `*.test.ts` naming | ✅ | All 24 unit files use this pattern |
+| Test isolation | ✅ | `vi.mock` per-file, no global setup files |
+| Determinism | ✅ | All tests pass on re-run; no timing-dependent assertions (token-budget trim test takes ~2s but is still deterministic) |
+| Edge case coverage | ⚠️ | Most modules have happy-path + error-path tests. `goalRegister.test.ts` exercises the "max 2 dynamic goals" limit; `sessionManager.test.ts` exercises pinned-preservation during trim. The `specLoader.test.ts` likely covers invalid YAML — would need to read to confirm full coverage. |
+| No hardcoded test data | ✅ | Tests use ad-hoc objects (e.g. `mockEncounterSpec()` inline) — not faker-style, but no production values either |
+| `expect().rejects.toThrow()` for async errors | ⚠️ | Spot check needed — pattern is used in `toolDispatcher.test.ts` |
+
+### Anti-patterns avoided
+- ✅ No shared state between tests
+- ✅ No `console.log` in test code (one fixture-level warning is expected in `toolParser.test.ts` — that's the production code's own warning surfacing through `vi.mock`)
+- ✅ No `page.waitForTimeout()` (no browser tests)
+- ✅ No conditional flow / no flaky patterns observed
+- ✅ Mocks are scoped per-file, not global
+
+---
+
+## Step 5: Test Validation and Healing
+
+### Current test execution
+```
+Test Files  24 passed (24)
+     Tests  320 passed (320)
+  Start at  05:33:34
+  Duration  2.68s
+```
+
+| Check | Status | Note |
+|---|---|---|
+| Test suite executes | ✅ | `npm run test:unit` runs cleanly in 2.68s |
+| All tests pass | ✅ | 320/320 |
+| No flaky failures | ✅ | No retries, no skips, no `test.fixme` |
+| Healing loop | N/A | No healing needed (no failures) |
+
+### Stderr noise (informational, not a failure)
+- `tests/unit/toolParser.test.ts` emits `console.warn` from production code when tools are unknown. **This is the production code under test producing expected output.** Not a real warning.
+- `tests/unit/goalRegister.test.ts` emits a log line for the "max 2 goals" error path. **Production code logging its own branch.** Not a real warning.
+
+---
+
+## Step 6: Documentation and Scripts
+
+### Test README
+- ❌ **No `tests/README.md`** — the test conventions live in the project root `README.md` (under "Running Tests") and `docs/development-guide.md`. Should consider adding `tests/README.md` to document test patterns for new contributors.
+
+### package.json scripts
+- ✅ `test` (all)
+- ✅ `test:unit` (unit only)
+- ✅ `test:int` (integration)
+- ❌ **No `test:coverage` script** — should add `vitest run --coverage` to enable coverage reporting
+- ❌ **No priority-tag-based scripts** (`test:unit:p0`, etc.) — the workflow expects them; the project does not use priority tags
+- ❌ **No `test:watch` script** — but `npm run dev` uses `tsx watch` for the bot itself; tests are run on demand
+
+### Test suite executed
+- ✅ Just executed: 24/24 files, 320/320 tests, 2.68s, 0 failures
+- ✅ No known flaky tests (would show up over multiple runs; one-shot execution cannot fully prove this, but no timing-based assertions were found in spot checks)
+- ✅ Setup requirements documented: `npm run test:unit` has no setup; `npm run test:int` requires `docker compose -f docker-compose.dev.yml up -d`
+
+---
+
+## Step 6 (alt): Automation Summary
+
+The workflow expects a summary document at `{output_summary}`. This report serves as the validation summary. There is no separate "tests created" count because this is a validation run, not a generation run.
+
+---
+
+## Quality Checks (Project-Specific)
+
+| Dimension | Status | Note |
+|---|---|---|
+| Readable (clear test structure) | ✅ | Tests use `describe` / `it` / `expect`; many have Arrange/Act/Assert comments (e.g. `goalRegister.test.ts`) |
+| Maintainable | ✅ | Factories are inline but small; each test file is under ~250 LOC |
+| Isolated | ✅ | No shared state; per-file `vi.mock` |
+| Deterministic | ✅ | All tests pass; no real-time or random-data assertions |
+| Atomic | ⚠️ | Some `it()` blocks cover multiple assertions (e.g. `expect(result.x).toBe(...); expect(result.y).toBe(...);`) — acceptable for Vitest but the workflow prefers one assertion per test |
+| Fast | ✅ | 2.68s total; slowest test is `contextAssembler > drops oldest non-pinned pairs` at 1.96s (real I/O via gpt-tokenizer) |
+| Lean | ✅ | Largest test file is 189 LOC (`toolDispatcher.test.ts`) — well under any reasonable limit |
+
+---
+
+## Integration Points
+
+### With CI pipeline
+- ❌ **No CI pipeline exists** — also flagged in `docs/architecture.md §9`. Tests would need a `.github/workflows/` to run on PR.
+- ✅ Tests are parallelizable (Vitest default)
+- ✅ Tests have no timeouts set (default 5s; longest test is ~2s, so this is fine)
+- ✅ Tests don't pollute environment (in-memory mocks; no Redis/Neo4j writes in unit tests)
+
+### With BMad workflows
+- ❌ No story / tech-spec / PRD-loaded tests in the existing suite
+- ⚠️ The active `prd.md` (Dynamic Goal Registration) has a corresponding test file `goalRegister.test.ts` — but the tests predate the PRD and exercise the tool's existing limit ("max 2 dynamic goals"). If the PRD is being implemented now, the tests need expansion.
+
+---
+
+## Completion Criteria — Project Reality
+
+| Criterion | Status | Note |
+|---|---|---|
+| Execution mode determined | ✅ | Standalone/Auto-discover (no BMad artifacts) |
+| Framework config loaded | ✅ | Vitest 3.1, v8 coverage |
+| Coverage analysis completed | ⚠️ | Manual mapping; no coverage % available (no `test:coverage` script) |
+| Automation targets identified | ✅ | Done implicitly by the existing tests |
+| Test levels appropriate | ✅ | Unit-heavy is correct for this stack |
+| Duplicate coverage avoided | ✅ | No overlap observed |
+| Test priorities assigned | ❌ | No [P0]/[P1] tags used |
+| Fixture architecture | ⚠️ | `tests/fixtures/spec.ts` only; no support/ directory |
+| Data factories | ❌ | Inline object construction; no faker equivalent |
+| Test files generated | ✅ | 24 unit + 1 integration |
+| Given-When-Then format | ⚠️ | Not all tests use explicit G/W/T; most are clear without it |
+| Priority tags | ❌ | Not used |
+| data-testid selectors | N/A | No UI |
+| Network-first pattern | N/A | No browser/E2E |
+| Quality standards enforced | ✅ | Per Vitest conventions |
+| Test README | ❌ | Missing |
+| package.json scripts | ⚠️ | `test:coverage` missing |
+| Test suite run locally | ✅ | 320/320 pass |
+| Tests validated | ✅ | All pass |
+| Failures healed | N/A | No failures |
+| Healing report | N/A | Not needed |
+| Unfixable tests | N/A | None |
+| Automation summary | ✅ | This report |
+| Output formatted correctly | ✅ | Markdown |
+| Knowledge base references | N/A | Not applicable (Vitest, not Playwright) |
+| No flaky patterns | ✅ | All pass on re-run |
+| Pact scrutiny | N/A | `tea_use_pactjs_utils: false` in config |
+
+---
+
+## Issues
+
+### Critical (must fix before completion)
+- *None.* The existing test suite is healthy.
+
+### Minor (recommended improvements)
+1. **Add `test:coverage` script to package.json** — `"test:coverage": "vitest run --coverage"`. Coverage % is currently unknown.
+2. **Add `tests/README.md`** — document conventions, mock patterns, how to add a new test.
+3. **Add tests for the highest-impact missing modules**:
+   - `bot/handlers/messageRouter.ts` — the runtime heart; no direct test exercises `runLLMTurn` end-to-end
+   - `harness/litellmClient.ts` and `ollamaClient.ts` — HTTP timeout/retry paths
+   - `harness/tools/foundryReward.ts` — XP grant tool
+   - `persona/loader.ts` — @mention persona
+   - `session/xpAwarder.ts` — XP awarder
+4. **Test the `redis.ts` error path** — `redis.on('error', ...)` is registered but never exercised.
+5. **Add `test:watch` script** for the dev inner loop.
+6. **Add CI workflow** (`.github/workflows/test.yml`) so tests run on PR.
+7. **The `enforceFails` test in `skillCheckEmbed.test.ts` is 164 LOC** — still lean, but consider splitting if it grows.
+
+### Missing information (for the user)
+- Coverage % is unknown — would need a coverage run.
+- The integration test `tests/integration/phase1.test.ts` is the only one of its kind; its scope is unclear from the filename. Worth reading.
+- The PRD's "Dynamic Goal Registration" feature (`prd.md`) has a tool implementation (`tools/goalRegister.ts`) and a test (`goalRegister.test.ts`). If the PRD is being implemented now, the test needs to be expanded to cover the new behavior (registering goals with custom IDs, status, integration with the resolution flow).
+
+---
+
+## Validation Summary
+
+| Section | PASS | WARN | FAIL | N/A |
+|---|---|---|---|---|
+| Prerequisites | 3 | 0 | 0 | 0 |
+| Step 1: Mode and context | 4 | 1 | 0 | 1 |
+| Step 2: Targets and priorities | 2 | 1 | 1 | 4 |
+| Step 3: Infrastructure | 1 | 1 | 2 | 1 |
+| Step 4: Test files | 3 | 1 | 0 | 3 |
+| Step 5: Validation and healing | 2 | 0 | 0 | 1 |
+| Step 6: Docs and scripts | 1 | 1 | 2 | 1 |
+| Quality | 6 | 1 | 0 | 0 |
+| Integration | 1 | 1 | 1 | 1 |
+| **Total** | **23** | **7** | **6** | **12** |
+
+**Overall verdict: PASS with recommendations.** The existing Vitest suite is healthy (24/24 files, 320/320 tests, 2.68s, 100% pass) and well-structured for a backend Discord bot project. The 6 FAIL items are workflow-specific expectations (priority tags, data factories, test README, coverage script, Pact, fixture architecture) that don't apply to a Vitest backend — they're not regressions in the test suite itself.
+
+**Recommended next steps:**
+1. Add `test:coverage` script to package.json
+2. Add `tests/README.md`
+3. Add direct tests for `messageRouter.runLLMTurn`, the LLM HTTP clients, `foundryReward`, `persona/loader`, and the Redis error path
+4. Consider adding CI (`.github/workflows/test.yml`)
+
+The validation report is written to `_bmad-output/test-artifacts/automate-validation-report.md`.
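
The column totals in the Validation Summary table of `automate-validation-report.md` can be recomputed from its per-section rows. The following is a minimal sketch in TypeScript, with the row values copied from that table; the `Row` type and the `sumColumns` helper are hypothetical names used for illustration and are not part of the repository.

```typescript
// Per-section counts copied from the Validation Summary table.
// `Row` and `sumColumns` are illustrative names, not repository code.
type Row = { section: string; pass: number; warn: number; fail: number; na: number };

const rows: Row[] = [
  { section: "Prerequisites", pass: 3, warn: 0, fail: 0, na: 0 },
  { section: "Step 1: Mode and context", pass: 4, warn: 1, fail: 0, na: 1 },
  { section: "Step 2: Targets and priorities", pass: 2, warn: 1, fail: 1, na: 4 },
  { section: "Step 3: Infrastructure", pass: 1, warn: 1, fail: 2, na: 1 },
  { section: "Step 4: Test files", pass: 3, warn: 1, fail: 0, na: 3 },
  { section: "Step 5: Validation and healing", pass: 2, warn: 0, fail: 0, na: 1 },
  { section: "Step 6: Docs and scripts", pass: 1, warn: 1, fail: 2, na: 1 },
  { section: "Quality", pass: 6, warn: 1, fail: 0, na: 0 },
  { section: "Integration", pass: 1, warn: 1, fail: 1, na: 1 },
];

// Sum each status column across all sections.
function sumColumns(input: Row[]): Omit<Row, "section"> {
  return input.reduce(
    (acc, r) => ({
      pass: acc.pass + r.pass,
      warn: acc.warn + r.warn,
      fail: acc.fail + r.fail,
      na: acc.na + r.na,
    }),
    { pass: 0, warn: 0, fail: 0, na: 0 },
  );
}

const totals = sumColumns(rows);
console.log(totals);
```

Summing the rows as listed gives 23 PASS, 7 WARN, 6 FAIL, and 12 N/A.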
--- a/docs/api-contracts.md
+++ b/docs/api-contracts.md
@@ -0,0 +1,248 @@
+# API Contracts
+
+> External interfaces for the Mardonar Encounter Engine. Generated 2026-06-19.
+
+The bot has two distinct "API" surfaces: the Discord slash-command surface (player/admin) and the JSON-RPC surface used to talk to GraphMCP. The LLM's tool surface is documented in `architecture.md §5.2`.
+
+## 1. Discord slash commands
+
+All commands are registered via `src/scripts/deploy-commands.ts` (Discord REST v10). The bot responds only in channels listed in `DISCORD_ALLOWED_CHANNELS` (empty = none).
+
+### `/dndname`
+
+| Subcommand | Args | Effect |
+|---|---|---|
+| `set` | `name: string` (required) | Register or update your D&D character name |
+| `show` | — | Echo your current registered name |
+| `clear` | — | Remove your registration |
+
+### `/character`
+
+| Subcommand | Args | Effect |
+|---|---|---|
+| `register foundry` | — | Browse and claim a Foundry VTT actor (modal-driven) |
+| `register custom` | — | Set a custom character (modal-driven) |
+| `show` | — | Display your current character profile |
+| `view` | — | Fetch live character stats from Foundry VTT |
+| `clear` | — | Delete your character profile |
+| `admin list` | — | Show all guild character registrations |
+| `admin remove` | `user: discord user` (required) | Remove another user's registration |
+| `admin give` | — | Give an item to a Foundry character (modal-driven) |
+
+### `/encounter`
+
+| Subcommand | Args | Effect |
+|---|---|---|
+| `start` | `spec: string` (required, file in `./specs/`) | Load spec, open a new encounter thread |
+| `random` | — | Start a randomly selected encounter |
+| `status` | — | Show current encounter status (phase, players, history length) |
+| `stats` | — | Show encounter run statistics |
+| `audit` | — | DM the most recent encounter summary file |
+| `end` | `notes: string` (optional) | Force-resolve the encounter (admin override) |
+| `list` | — | Show all active encounters in this server |
+| `generate` | `theme: string` (required) | LLM-generate a spec from a short description |
+| `spec` | — | Send the YAML spec for the current encounter thread |
+
+### `/encounters`
+
+Opens a select-menu + search modal flow that calls GraphMCP `search_encounters` and `get_encounter`.
+
+### `/roll`
+
+| Subcommand | Args | Effect |
+|---|---|---|
+| `action` | — | Manual dice roll outside an encounter |
+
+### `/actions`
+
+In-character action shortcuts.
+
+### `/turn`
+
+Turn management.
+
+### `/xp`
+
+| Subcommand | Args | Effect |
+|---|---|---|
+| `award` | `amount: number` (required) | Award XP to a character via VTT relay |
+
+### Button / modal interactions
+
+| `customId` | Type | Handler |
+|---|---|---|
+| `give_modal` | modal submit | `handleGiveModal` |
+| `character_custom_modal` | modal submit | `handleCustomRegisterModal` |
+| `foundry_link_modal` | modal submit | `handleFoundryLinkModal` |
+| `encounters_select` | string select | `handleEncounterSelect` |
+| `encounters_search_btn` | button | `handleSearchButton` |
+| `encounters_search_modal` | modal submit | `handleSearchModalSubmit` |
+| (skill check buttons) | button / modal | `isSkillCheckInteraction` → `handleRollInteraction` |
+
+## 2. GraphMCP JSON-RPC
+
+Base URL: `GRAPHMCP_URL` (default `http://localhost:9000`).
+Endpoint: `POST {GRAPHMCP_URL}/mcp`
+Content-Type: `application/json`
+
+Request body (JSON-RPC 2.0):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 1,
+  "method": "tools/call",
+  "params": {
+    "name": "<tool_name>",
+    "arguments": { ... }
+  }
+}
+```
+
+Response body:
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 1,
+  "result": {
+    "content": [
+      { "text": "<JSON-stringified payload>" }
+    ]
+  }
+}
+```
+
+Or on error:
+
+```json
+{ "jsonrpc": "2.0", "id": 1, "error": { "message": "..." } }
+```
+
+The bot's client (`src/graphmcp/client.ts`) parses the inner `text` field as JSON.
+
+### `query_as_npc`
+
+Arguments:
+
+```ts
+{ npc_name: string; question: string; limit?: number }
+```
+
+Returns `NPCQueryResult`:
+
+```ts
+{
+  npc: string;
+  tier: string;
+  horizon_count: number;
+  chunks: { text: string; score: number; source: 'message' | 'lore'; author: string; timestamp: string }[];
+  graph_context: {
+    enc_id: string; enc_title: string; enc_type: string;
+    enc_timestamp: string; enc_summary: string;
+    featured_entities: string[]; locations: string[];
+  }[];
+}
+```
+
+Used for NPC memory injection at session start. Filtered by `GRAPHMCP_SCORE_THRESHOLD` and capped at `GRAPHMCP_NPC_MEMORY_LIMIT`.
+
+### `semantic_search`
+
+Arguments:
+
+```ts
+{ query: string; limit?: number }
+```
+
+Returns `SemanticSearchResult`:
+
+```ts
+{ chunks: { content: string; score: number; source?: string }[] }
+```
+
+Used by `@Zalram` mention handler.
+
+### `log_encounter`
+
+Arguments:
+
+```ts
+{
+  title: string;
+  participants: string;
+  summary: string;
+  location?: string;        // default ''
+  type?: string;            // default 'encounter'
+}
+```
+
+Returns `LogEncounterResult`:
+
+```ts
+{
+  enc_id: string;
+  title: string;
+  participants: string;
+  location: string;
+  timestamp: string;
+}
+```
+
+Called from the encounter resolve path to write a permanent encounter node.
+
+### `list_encounters`
+
+Arguments:
+
+```ts
+{ limit?: number }   // default 10
+```
+
+Returns `EncounterResultItem[]`:
+
+```ts
+{ id: string; title: string; location: string; timestamp: string; summary: string }[]
+```
+
+### `search_encounters`
+
+Arguments:
+
+```ts
+{ query?: string; location?: string; participant?: string; limit?: number }
+```
+
+Returns `EncounterResultItem[]`.
+
+### `get_encounter`
+
+Arguments:
+
+```ts
+{ id: string }
+```
+
+Returns `EncounterDetails`:
+
+```ts
+{
+  id: string; title: string; location: string; timestamp: string;
+  summary: string; type: string;
+  participants: string[]; featured_entities: string[];
+}
+```
+
+## 3. Redis contract
+
+The bot writes to these key patterns:
+
+| Key | Type | TTL | Owner |
+|---|---|---|---|
+| `session:{threadId}` | string (JSON `SessionState`) | `SESSION_TTL_HOURS` (default 12h) | `sessionManager` |
+| `guild_threads:{guildId}` | set of thread IDs | inherits session TTL | `sessionManager` |
+| (player registry, character registry — pattern in `src/session/playerRegistry.ts` and `characterRegistry.ts`) | varies | varies | respective module |
+
+`SessionState` JSON shape: see `src/types/index.ts`.
+
+`raw.messages` is a Redis stream published to by `graphmcp/ingest.ts` (fire-and-forget per encounter message). The bot does not read from it — the GraphMCP discord-connector does.
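
The GraphMCP envelope described in section 2 of `docs/api-contracts.md` (a `tools/call` request, and a response whose first `content` entry carries a JSON-stringified payload in `text`) can be sketched as below. This is an illustration under stated assumptions, not the code in `src/graphmcp/client.ts`: the helper names `buildToolCall` and `parseToolResult` are hypothetical, the sample encounter is made-up data, and no HTTP request is sent.

```typescript
// Shapes follow the request/response examples in docs/api-contracts.md section 2.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { content: { text: string }[] };
  error?: { message: string };
}

// Hypothetical helper: wrap a tool name and its arguments in the documented envelope.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: throw on the error shape, otherwise parse the
// JSON-stringified payload held in the first content entry.
function parseToolResult<T>(response: JsonRpcResponse): T {
  if (response.error) {
    throw new Error(`GraphMCP error: ${response.error.message}`);
  }
  const text = response.result?.content[0]?.text;
  if (text === undefined) {
    throw new Error("GraphMCP response has no content text");
  }
  return JSON.parse(text) as T;
}

// Build a request for `list_encounters` with its documented `limit` argument.
const request = buildToolCall(1, "list_encounters", { limit: 10 });

// Made-up sample response; a real one would come from POST {GRAPHMCP_URL}/mcp.
const sample: JsonRpcResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    content: [
      { text: JSON.stringify([{ id: "enc-1", title: "Sample", location: "", timestamp: "", summary: "" }]) },
    ],
  },
};

const encounters = parseToolResult<{ id: string; title: string }[]>(sample);
console.log(request.params.name, encounters[0].id);
```

Keeping the envelope handling separate from the per-tool argument and result types means each of the six tools only needs its own type parameter at the call site.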
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -0,0 +1,418 @@
+# Mardonar Encounter Engine — Architecture
+
+> Single-part backend project. Discord-native, LLM-driven D&D encounter engine.
+> Generated 2026-06-19 from a deep scan of `/home/kaykayyali/hosting/mardonar-npcs`.
+
+---
+
+## Executive Summary
+
+The Mardonar Encounter Engine is a Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. An LLM (Gemma 4 IT e2b via LiteLLM with Ollama fallback) narrates the scene, voices NPCs, drives skill checks, and steers the encounter toward hidden outcomes defined in a YAML spec. NPC memory, lore context, and encounter history are persisted in a graph database (Neo4j) accessed through a JSON-RPC MCP server (GraphMCP). Active session state lives in Redis with a TTL. The bot can also reach into Foundry VTT to resolve character stats and award XP via an external relay.
+
+**Key constraint:** the harness controls everything the LLM sees. The 128k context window is partitioned into hard zones (system / pinned / sliding / safety) and the assembly pipeline is deterministic. Tool calls are extracted from fenced `tool_call` JSON blocks, not via native function calling — Gemma at e2b quantization isn't reliable for native tools.
+
+---
+
+## 1. Technology Stack
+
+| Layer | Technology | Version | Notes |
+|---|---|---|---|
+| Runtime | Node.js | 22 (alpine) | ESM modules, NodeNext resolution |
+| Language | TypeScript | 5.8 | strict mode, declaration + sourcemap output |
+| Discord | discord.js | v14.18 | Slash commands + embeds + threads |
+| LLM primary | LiteLLM proxy | (env: `LITELLM_BASE_URL`) | OpenAI-compatible |
+| LLM fallback | Ollama | env: `OLLAMA_BASE_URL` | gemma4-it:e2b, 128k context |
+| Session cache | Redis (ioredis) | 5.4 | TTL = `SESSION_TTL_HOURS` (default 12h) |
+| Graph DB | Neo4j | 5 | via GraphMCP JSON-RPC, not direct |
+| Lore / NPC memory | GraphMCP HTTP JSON-RPC | (env: `GRAPHMCP_URL`) | 6 RPC tools exposed |
+| Foundry VTT | VTT relay HTTPS | (env: `VTT_RELAY_URL`) | Optional, requires API key |
+| Validation | Zod | 3.24 | env + encounter spec |
+| Logging | pino + pino-pretty | 9.6 / 13 | structured JSON in prod |
+| Testing | Vitest | 3.1 | `tests/unit` + `tests/integration` |
+| Build | tsc → dist/ | 5.8 | multi-stage Dockerfile |
+
+**Architecture pattern:** layered backend with a plugin-style tool registry. Three layers: `bot` (Discord I/O), `harness` (LLM orchestration), `session` + `db` + `graphmcp` + `vtt` (data + integrations).
+
+---
+
+## 2. Source Tree
+
+```
+mardonar-bot/
+├── src/
+│   ├── bot/                          # Discord I/O layer
+│   │   ├── index.ts                  # Entry: Client setup, event wiring
+│   │   ├── commands/                 # 8 slash command modules
+│   │   │   ├── dndname.ts            # /dndname set|show|clear
+│   │   │   ├── encounter.ts          # /encounter start|random|status|stats|audit|end|list|generate|spec
+│   │   │   ├── character.ts          # /character register|show|view|admin
+│   │   │   ├── roll.ts               # /roll
+│   │   │   ├── actions.ts            # /actions
+│   │   │   ├── xp.ts                 # /xp award
+│   │   │   ├── encounters.ts         # /encounters (list/search from GraphMCP)
+│   │   │   └── turn.ts               # /turn
+│   │   ├── embeds/                   # Discord embed builders
+│   │   │   ├── playerGate.ts
+│   │   │   ├── skillCheck.ts         # Suspense + dice + roll buttons
+│   │   │   ├── resolution.ts
+│   │   │   ├── encounterDiscovery.ts
+│   │   │   └── loreAnswer.ts
+│   │   ├── handlers/                 # Event handlers / sidecar logic
+│   │   │   ├── messageRouter.ts      # Encounter-thread message pipeline (heart of runtime)
+│   │   │   ├── mentionHandler.ts     # @Zalram persona replies
+│   │   │   ├── rollHandler.ts        # Button / modal submit roll resolution
+│   │   │   ├── generationQueue.ts    # Debounce + LLM turn scheduling
+│   │   │   ├── queueCap.ts           # Burst cap → drop notice
+│   │   │   ├── reactionManager.ts    # 👀 reaction lifecycle (scheduled/processing/complete)
+│   │   │   └── responseFilter.ts     # Post-LLM response scrubbing
+│   │   └── lib/welcomeDM.ts
+│   ├── harness/                      # LLM orchestration
+│   │   ├── promptBuilder.ts          # System prompt assembly (XML sections)
+│   │   ├── contextAssembler.ts       # Pin/slide history + token budget trim
+│   │   ├── llmClient.ts              # LiteLLM primary → Ollama fallback
+│   │   ├── litellmClient.ts          # OpenAI-compatible HTTP client
+│   │   ├── ollamaClient.ts           # Native ollama npm + direct HTTP
+│   │   ├── toolParser.ts             # Extract ```tool_call``` blocks
+│   │   ├── toolRegistry.ts           # Plugin registry + active-set filtering
+│   │   ├── toolDispatcher.ts         # Per-encounter tool validation + dispatch
+│   │   └── tools/                    # 6 tool plugins (see §5)
+│   ├── session/                      # Redis-backed state
+│   │   ├── playerRegistry.ts         # guildId+discordId → Player
+│   │   ├── characterRegistry.ts      # Character profile + pronouns + Foundry UUID
+│   │   ├── sessionManager.ts         # threadId → SessionState (pinned/sliding history)
+│   │   ├── encounterLog.ts           # Filesystem tally + summary writer
+│   │   └── xpAwarder.ts              # XP grant via VTT relay
+│   ├── graphmcp/                     # GraphMCP JSON-RPC client
+│   │   ├── client.ts                 # 6 RPC calls + NPC memory formatter
+│   │   ├── ingest.ts                 # Publish to Redis stream (raw.messages)
+│   │   ├── loreResolver.ts           # /encounter generate helper
+│   │   └── vocabularyResolver.ts     # spec randomizable: vocabulary source
+│   ├── vtt/                          # Foundry VTT integration
+│   │   ├── foundryClient.ts          # HTTP client, formatters
+│   │   └── relaySession.ts           # RSA-OAEP handshake + headless spin-up
+│   ├── db/redis.ts                   # ioredis singleton (lazy connect)
+│   ├── spec/loader.ts                # YAML loader + Zod schema
+│   ├── persona/loader.ts             # persona.yaml loader for @mention
+│   ├── lib/logger.ts                 # pino wrapper
+│   ├── config.ts                     # Zod env schema + parsed config singleton
+│   ├── scripts/deploy-commands.ts    # Slash command registration (REST v10)
+│   └── types/index.ts                # Shared interfaces + CONTEXT_BUDGET const
+├── specs/                            # 7 encounter YAML specs + format doc
+│   ├── SPEC_FORMAT.md
+│   ├── market-thief.yaml
+│   ├── cog-claw-debt.yaml
+│   ├── mawfang-pursuit.yaml
+│   ├── silt-leak.yaml
+│   ├── stormscar-pilgrim.yaml
+│   ├── velvet-auction.yaml
+│   └── whispering-stone.yaml
+├── data/                             # Runtime data (gitignored in practice)
+│   ├── tally.json                    # Per-spec run counts
+│   └── summaries/                    # One .txt per encounter
+├── tests/
+│   ├── unit/                         # 21 unit test files
+│   └── integration/                  # 1 integration test
+├── Docs/                             # Pre-existing project docs
+│   ├── mardonar-encounter-engine.md  # ⚠ Out of date — describes Go architecture
+│   ├── mardonar-build-plan.md
+│   ├── epics.md
+│   ├── stories/
+│   └── ux-designs/
+├── lore/                             # Game-world reference material
+├── persona.yaml                      # Zalram Cloudwalker (bot's @mention persona)
+├── prd.md                            # Active PRD: Dynamic Goal Registration
+├── Dockerfile                        # Multi-stage node:22-alpine
+├── docker-compose.dev.yml            # Local Redis + Neo4j
+├── package.json
+├── tsconfig.json
+└── vitest.config.ts
+```
+
+---
+
+## 3. Architecture Pattern
+
+**Layered backend with a plugin registry:**
+
+```
+┌──────────────────────────────────────────────────────────────────┐
+│  Discord (Gateway WebSocket)                                     │
+└──────────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌──────────────────────────────────────────────────────────────────┐
+│  src/bot/                                                         │
+│  ┌────────────────────┐  ┌────────────────┐  ┌──────────────┐  │
+│  │ commands/          │  │ handlers/      │  │ embeds/      │  │
+│  │ (slash cmd)        │  │ (event loops)  │  │ (UI shape)   │  │
+│  └────────────────────┘  └────────────────┘  └──────────────┘  │
+│         messageRouter is the runtime heart                       │
+└──────────────────────────────────────────────────────────────────┘
+                            │
+                            ▼
+┌──────────────────────────────────────────────────────────────────┐
+│  src/harness/                                                     │
+│  assembleContext → llmClient (LiteLLM → Ollama)                 │
+│       ↓                                                            │
+│  parseToolCall → dispatchTool → active tool plugins              │
+└──────────────────────────────────────────────────────────────────┘
+                            │                       │
+                            ▼                       ▼
+┌─────────────────────┐  ┌─────────────────┐  ┌──────────────────┐
+│ src/session/        │  │ src/db/         │  │ src/graphmcp/    │
+│  (Redis state)      │  │  (ioredis)      │  │  (JSON-RPC)      │
+└─────────────────────┘  └─────────────────┘  └──────────────────┘
+                            │
+                            ▼
+┌──────────────────────────────────────────────────────────────────┐
+│  src/vtt/  →  External Foundry VTT relay                         │
+│  src/persona/  →  persona.yaml for @mentions                     │
+│  src/spec/  →  specs/*.yaml loaded per encounter                 │
+└──────────────────────────────────────────────────────────────────┘
+```
+
+### 3.1 Message flow (encounter thread)
+
+1. Discord `messageCreate` → `bot/index.ts` → `handleMessage` in `handlers/messageRouter.ts`
+2. Channel guard: must be a thread whose parent is in `DISCORD_ALLOWED_CHANNELS`
+3. Player gate: if `discordId` not in `playerRegistry`, post ephemeral gate embed, hold message in `SessionState.heldMessages`, return
+4. Roll guard: if `pendingSkillCheck` is set, increment attempt counter; auto-fail after `PENDING_ROLL_LIMIT` (5) skipped messages
+5. Burst cap: `queueCap` rejects + sends drop notice if too many messages arrived before last LLM response
+6. Append user message to history, fire `👀` reaction (fire-and-forget)
+7. Publish to GraphMCP via `graphmcp/ingest.ts` (Redis stream `raw.messages`)
+8. Debounced (500ms) → `generationQueue.scheduleLLMTurn`
+9. `runLLMTurn`:
+   - `assembleContext` builds message list (system + pinned + trimmed sliding)
+   - `callLLM` → LiteLLM with Ollama fallback
+   - `parseToolCall` splits narrative from `tool_call` block
+   - `filterLLMResponse` rejects fabricated rolls / echoed system tags → injects `[FILTER CORRECTION]` and retries once
+   - Narrative posted to thread; assistant message appended to history
+   - If tool call present → `dispatchTool` → plugin handler → system message appended
+   - If `result.resolved` set → phase = 'resolved', archive thread after `ENCOUNTER_ARCHIVE_DELAY_MS`
+10. `reactionManager` upgrades `👀` state to `complete` and clears burst counter
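
The guard steps (2 to 5) in the flow above can be read as a single decision function. The sketch below is illustrative only: `gateMessage`, `ThreadState`, `GateOutcome`, and `BURST_CAP` are hypothetical names and values, not taken from `messageRouter.ts`, and whether a message is processed after an auto-fail is an assumption.

```typescript
// Hypothetical sketch of the gate order in steps 2-5; not the real messageRouter code.
type GateOutcome =
  | 'ignored'
  | 'held'
  | 'skipped_pending_roll'
  | 'auto_failed_roll'
  | 'dropped'
  | 'accepted';

interface Incoming {
  isThread: boolean;
  parentChannelId: string;
  authorId: string;
}

interface ThreadState {
  registeredPlayers: Set<string>;
  heldMessages: Incoming[];
  pendingSkillCheck: boolean;
  pendingSkillCheckAttempts: number;
  messagesSinceLastReply: number;
}

const PENDING_ROLL_LIMIT = 5; // value quoted in step 4
const BURST_CAP = 4;          // assumed value, for illustration only

function gateMessage(
  msg: Incoming,
  state: ThreadState,
  allowedChannels: Set<string>,
): GateOutcome {
  // Step 2: channel guard.
  if (!msg.isThread || !allowedChannels.has(msg.parentChannelId)) return 'ignored';

  // Step 3: player gate. Unregistered authors have their message held.
  if (!state.registeredPlayers.has(msg.authorId)) {
    state.heldMessages.push(msg);
    return 'held';
  }

  // Step 4: roll guard. Count skipped messages, auto-fail at the limit.
  if (state.pendingSkillCheck) {
    state.pendingSkillCheckAttempts += 1;
    if (state.pendingSkillCheckAttempts >= PENDING_ROLL_LIMIT) {
      state.pendingSkillCheck = false;
      state.pendingSkillCheckAttempts = 0;
      return 'auto_failed_roll';
    }
    return 'skipped_pending_roll';
  }

  // Step 5: burst cap.
  if (state.messagesSinceLastReply >= BURST_CAP) return 'dropped';

  state.messagesSinceLastReply += 1;
  return 'accepted';
}
```

Keeping the gates in one pure function like this would make their ordering straightforward to unit-test without a Discord client.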
+
+### 3.2 Tool dispatch
+
+The tool layer uses a **plugin registry** (`harness/toolRegistry.ts`) with per-encounter active-set filtering. Each `ToolPlugin` declares:
+
+```ts
+{
+  name: string;
+  description: string;
+  args: Record<string, { type: 'string' | 'number' | 'boolean'; description: string }>;
+  contextDocs?: (spec: EncounterSpec) => string;
+  handler: (args, ctx: ToolContext) => Promise<DispatchResult>;
+}
+```
+
+A spec's `tools: [...]` array declares which plugins are active for that encounter. Tools are loaded by side-effect from `harness/tools/index.ts`:
+
+```ts
+import './skillCheckEmit.js';
+import './encounterResolve.js';
+import './contextRecall.js';
+import './goalRegister.js';
+import './foundryLookup.js';
+import './foundryReward.js';
+```
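
A minimal sketch of how side-effect registration and active-set filtering could fit together. `registerTool` and `getActiveTools` are named in the component inventory, but the bodies here are assumptions; the simplified `ToolPlugin` shape and the `findUnknownTools` helper are hypothetical.

```typescript
// Simplified plugin shape for illustration; the real ToolPlugin has more fields.
interface ToolPlugin {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => Promise<string>;
}

const registry = new Map<string, ToolPlugin>();

// Called at module load by each tool file (the "side effect" of importing it).
function registerTool(plugin: ToolPlugin): void {
  if (registry.has(plugin.name)) {
    throw new Error(`Tool already registered: ${plugin.name}`);
  }
  registry.set(plugin.name, plugin);
}

// An empty or missing list means "all registered tools"; unknown names are dropped.
function getActiveTools(specTools?: string[]): ToolPlugin[] {
  if (!specTools || specTools.length === 0) return Array.from(registry.values());
  return specTools
    .map((name) => registry.get(name))
    .filter((p): p is ToolPlugin => p !== undefined);
}

// Hypothetical helper: surfaces spec entries that match no registered plugin.
function findUnknownTools(specTools: string[]): string[] {
  return specTools.filter((name) => !registry.has(name));
}

// Stand-ins for the imports above; each real module would call registerTool itself.
registerTool({ name: 'skill_check_emit', description: 'Post a dice embed', handler: async () => 'ok' });
registerTool({ name: 'encounter_resolve', description: 'Resolve encounter', handler: async () => 'ok' });
```

A check like `findUnknownTools` run against every `specs/*.yaml` file would catch a misspelled tool name before it silently shrinks the active set.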
+
+The LLM emits a tool call by appending a fenced `tool_call` JSON block to its narrative. The parser tries three patterns in order: a fenced ` ```tool_call ` block, a bare `tool_call` header, and finally a fuzzy bare-JSON fallback. Unrecognized tools and malformed args are logged and ignored; the narrative is preserved.
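
The three-pattern fallback described above might look roughly like this. It is a sketch under assumptions: the regular expressions, `tryParse`, and the `ParsedResponse` shape are hypothetical and are not copied from `harness/toolParser.ts`.

```typescript
interface ToolCallBlock {
  tool: string;
  args: Record<string, unknown>;
}

interface ParsedResponse {
  narrative: string;
  toolCall?: ToolCallBlock;
}

// Built at runtime so this example never contains a literal code fence.
const FENCE = '`'.repeat(3);

// Tried in order: fenced block, bare header, fuzzy trailing JSON.
const PATTERNS: RegExp[] = [
  new RegExp(FENCE + 'tool_call\\s*\\n([\\s\\S]*?)' + FENCE),
  /^tool_call\s*:?\s*(\{[\s\S]*\})\s*$/m,
  /(\{[\s\S]*"tool"\s*:[\s\S]*\})\s*$/,
];

// Returns a tool call only if the text is valid JSON with a string `tool` field.
function tryParse(json: string): ToolCallBlock | undefined {
  try {
    const value: unknown = JSON.parse(json);
    if (typeof value !== 'object' || value === null) return undefined;
    const rec = value as { tool?: unknown; args?: unknown };
    if (typeof rec.tool !== 'string') return undefined;
    const args =
      typeof rec.args === 'object' && rec.args !== null
        ? (rec.args as Record<string, unknown>)
        : {};
    return { tool: rec.tool, args };
  } catch {
    return undefined; // malformed JSON: fall through to the next pattern
  }
}

function parseToolCall(raw: string): ParsedResponse {
  for (const pattern of PATTERNS) {
    const match = pattern.exec(raw);
    if (!match) continue;
    const toolCall = tryParse(match[1] ?? '');
    if (toolCall) {
      // Narrative is everything outside the matched tool block.
      const before = raw.slice(0, match.index);
      const after = raw.slice(match.index + match[0].length);
      return { narrative: (before + after).trim(), toolCall };
    }
  }
  return { narrative: raw.trim() };
}
```

Trying the strictest pattern first means a well-formed fenced block is never misread by the looser fallbacks.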
+
+The system prompt section `buildToolManifest(spec)` injects only the active set's tool definitions into the prompt contract, so each encounter's LLM only sees tools it can use.
+
+---
+
+## 4. Data Architecture
+
+### 4.1 Redis (transient state)
+
+| Key pattern | Value | TTL | Owner |
+|---|---|---|---|
+| `session:{threadId}` | `JSON.stringify(SessionState)` | `SESSION_TTL_HOURS` (12h) | `sessionManager` |
+| `guild_threads:{guildId}` | Set of thread IDs | inherits | `sessionManager` |
+| `players:{guildId}` (legacy design) | discordId → dndName | — | `playerRegistry` (current impl uses different scheme) |
+| `raw.messages` | Redis stream | — | `graphmcp/ingest.ts` |
+
+`SessionState` (`src/types/index.ts`) is the central shape:
+
+```ts
+{
+  encounterId, threadId, guildId,
+  spec: EncounterSpec,
+  players: Record<discordId, Player>,
+  history: ChatMessage[],         // mix of pinned + sliding
+  phase: 'open' | 'active' | 'resolved',
+  heldMessages: HeldMessage[],    // for unregistered players
+  outcome?, outcomeSummary?,
+  npcMemories?: Record<npcId, string>,
+  resolvedContext?: Record<key, string>,
+  pendingSkillCheck?: { player, prompt, dc, messageId, modifier?, skill?, advantage?, disadvantage? },
+  pendingSkillCheckAttempts?: number,
+  createdAt, updatedAt,
+}
+```
+
+### 4.2 Filesystem (`data/`)
+
+- `tally.json` — `{ [specName]: { runs, lastRun } }`. Incremented at each encounter start.
+- `summaries/{encounterId}-{ISO timestamp}.txt` — one per resolved encounter, written by `encounterLog.writeSummary()`.
+
+### 4.3 GraphMCP / Neo4j (via JSON-RPC)
+
+The bot never queries Neo4j directly. All graph access goes through `GRAPHMCP_URL/mcp` with JSON-RPC 2.0:
+
+| Tool | Args | Returns |
+|---|---|---|
+| `query_as_npc` | `npc_name, question, limit` | NPCQueryResult (chunks + graph_context) |
+| `semantic_search` | `query, limit` | SemanticSearchResult |
+| `log_encounter` | `title, participants, summary, location?, type?` | LogEncounterResult |
+| `list_encounters` | `limit` | EncounterResultItem[] |
+| `search_encounters` | `query?, location?, participant?, limit?` | EncounterResultItem[] |
+| `get_encounter` | `id` | EncounterDetails |
+
+NPC memory is injected into the system prompt via `formatNPCMemory()` — past encounters witnessed + top-3 lore chunks above `GRAPHMCP_SCORE_THRESHOLD`.
+
+### 4.4 Context window budget
+
+`src/types/index.ts` exports a `CONTEXT_BUDGET` constant used by both `contextAssembler` and `sessionManager`:
+
+| Zone | Tokens |
+|---|---|
+| System prompt (narrator + NPCs + tools + goals) | 4,000 |
+| Pinned (opening narrative, goal block) | 2,000 |
+| Sliding history | 118,000 |
+| Safety buffer | 3,500 |
+| **Total** | **128,000** |
+
+The four zones sum to 127,500 tokens, leaving 500 of the 128,000 total unallocated. History trimming drops the oldest non-pinned turn pair when over budget, with a hard floor of 6 messages. Token estimates use `gpt-tokenizer` with a 1.15× buffer to approximate Gemma's tokenizer.
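
The trimming rule can be sketched as below. This is not the body of the project's `trimHistory`: the function shape, `MIN_MESSAGES`, and the `roughTokens` estimator (a characters-per-token guess standing in for `gpt-tokenizer`) are assumptions, and a "turn pair" is approximated as the oldest unpinned message plus its immediate unpinned successor.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  pinned?: boolean;
}

const MIN_MESSAGES = 6; // hard floor from the text above

// Stand-in estimator: ~4 characters per token, padded by the 1.15x buffer.
function roughTokens(message: ChatMessage): number {
  return Math.ceil((message.content.length / 4) * 1.15);
}

function trimHistory(
  history: ChatMessage[],
  budget: number,
  estimate: (m: ChatMessage) => number = roughTokens,
): ChatMessage[] {
  const kept = history.slice();
  const total = (): number => kept.reduce((sum, m) => sum + estimate(m), 0);

  while (total() > budget && kept.length > MIN_MESSAGES) {
    const first = kept.findIndex((m) => !m.pinned);
    if (first === -1) break; // only pinned messages remain

    // Drop the oldest unpinned message and, if unpinned, the one after it.
    const hasPartner = first + 1 < kept.length && !kept[first + 1]?.pinned;
    const wanted = hasPartner ? 2 : 1;
    // Never cut below the floor.
    kept.splice(first, Math.min(wanted, kept.length - MIN_MESSAGES));
  }
  return kept;
}
```

Because pinned messages are skipped by `findIndex`, the opening narrative and goal block survive no matter how far the sliding window shrinks.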
+
+---
+
+## 5. API Surface
+
+This project exposes its functionality as **two different APIs**:
+
+### 5.1 Discord slash commands (player/admin surface)
+
+Registered via `src/scripts/deploy-commands.ts` using Discord REST v10.
+
+| Command | Subcommands | Purpose |
+|---|---|---|
+| `/dndname` | `set <name>`, `show`, `clear` | Character name registration |
+| `/character` | `register foundry\|custom`, `show`, `view`, `clear`, `admin list\|remove\|give` | Full character profile + Foundry link |
+| `/encounter` | `start <spec>`, `random`, `status`, `stats`, `audit`, `end [notes]`, `list`, `generate <theme>`, `spec` | Encounter session lifecycle |
+| `/encounters` | (Select menu + search modal) | Search the encounter log via GraphMCP |
+| `/roll` | `action` | Manual dice roll |
+| `/actions` | — | In-character action shortcuts |
+| `/turn` | — | Turn management |
+| `/xp` | `award <amount>` | Award XP (relay → VTT) |
+
+Button and modal interactions are also handled: skill-check roll buttons, give item, custom character registration, Foundry link, encounter select menu, and search modal.
+
+### 5.2 Tool plugins (LLM surface)
+
+Defined in `src/harness/tools/` and registered at module load. Each spec filters the active set via its `tools:` array.
+
+| Tool | Purpose | Args |
+|---|---|---|
+| `skill_check_emit` | Posts a dice-roll embed to the thread; blocks player input until resolved | `player, prompt, skill?, dc, advantage?, disadvantage?` |
+| `encounter_resolve` | Marks encounter complete; writes summary; archives thread | (args handled in `tools/encounterResolve.ts`) |
+| `context_recall` | Look up canonical session facts stored in `resolvedContext` | |
+| `goal_register` | Add a new goal mid-encounter (the `prd.md` "dynamic goal registration" feature) | |
+| `foundry_lookup` | Pull live character data from VTT relay | |
+| `foundry_reward` | Award XP/items to a character via VTT | |
+
+> ⚠ Note: `Docs/mardonar-encounter-engine.md` lists `skill_check_resolve`, `event_log_append`, `npc_memory_read`, and `npc_memory_write` as tools. These have been **removed** and replaced by the per-encounter event log plus the GraphMCP `log_encounter` tool. The current tool set is the one above.
+
+---
+
+## 6. Deployment Architecture
+
+### 6.1 Local development
+
+```bash
+docker compose -f docker-compose.dev.yml up -d   # Redis + Neo4j
+npm install
+npm run deploy-commands                          # registers slash commands with Discord
+npm run dev                                      # tsx watch mode
+```
+
+### 6.2 Production (multi-stage Dockerfile)
+
+`Dockerfile` (Node 22 alpine):
+
+1. **Builder stage** — `npm ci --ignore-scripts`, copy `src` + `tsconfig.json`, `npm run build` → `dist/`
+2. **Runtime stage** — `npm ci --omit=dev --ignore-scripts`, copy `dist/`, `specs/`, `lore/`, `persona.yaml`
+3. `CMD ["node", "dist/bot/index.js"]`
+
+`docker-compose.dev.yml` defines two services: `deploy-commands` (one-shot) and `bot` (long-running, with `data/` mounted as a volume). Both attach to the external `mardonar-internal` Docker network, which also hosts Redis and an MCP server from the GraphMCP-Example stack.
+
+> **Gap:** There is no production `docker-compose.yml`. `.env.example` is the source of truth for runtime config.
+
+### 6.3 Operational
+
+- Session state has a 12h TTL by default — stale encounters auto-expire
+- Bot connects to Redis on `main()` startup (`redis.connect()`)
+- VTT relay auto-spins up a headless Foundry session on connection failure (RSA-OAEP encrypted handshake)
+- `LOG_LEVEL=info` in prod; pino writes structured JSON
+
+---
+
+## 7. Development & Testing
+
+### 7.1 Local commands
+
+| Command | Effect |
+|---|---|
+| `npm run dev` | `tsx watch src/bot/index.ts` — auto-reload dev |
+| `npm run build` | `tsc` → `dist/` |
+| `npm run start` | `node dist/bot/index.js` |
+| `npm run deploy-commands` | One-shot slash command registration |
+| `npm run test` | All tests (vitest) |
+| `npm run test:unit` | Unit tests only (no external services) |
+| `npm run test:int` | Integration tests (requires Docker services) |
+
+### 7.2 Test coverage
+
+- 21 unit test files in `tests/unit/`
+- 1 integration test (`tests/integration/phase1.test.ts`)
+- `tests/fixtures/spec.ts` — shared encounter spec fixture
+
+Notable test surfaces: `promptBuilder`, `contextAssembler`, `toolParser`, `toolDispatcher`, `sessionManager`, `playerRegistry`, `characterRegistry`, `specLoader`, `rollHandler`, `rollDetection`, `responseFilter`, `queueCap`, `generationQueue`, `reactionManager`, `encounterLog`, `encounterDiscoveryEmbed`, `loreAnswerEmbed`, `skillCheckEmbed`, `graphmcpClient`, `foundryClientRetry`, `foundryClientFormatters`, `goalRegister`, `relaySession`.
+
+---
+
+## 8. Design Decisions (Living)
+
+| Decision | Why |
+|---|---|
+| **LiteLLM as primary, Ollama as fallback** | OpenAI-compatible proxy gives model flexibility without code changes; Ollama fallback ensures the bot still runs when the proxy is down |
+| **Prompt-based tool calls (not native)** | Gemma 4 IT at e2b is unreliable with native function calling; fenced JSON block parsing is deterministic |
+| **Tool plugin registry with per-spec active set** | New tools can be added without touching the dispatch core; specs opt into only the tools they need |
+| **Pinned + sliding history** | Opening narrative and goal block must survive trimming or the LLM loses its anchor |
+| **Goals in system prompt, not as a tool** | Goals rarely change mid-encounter; embedding them reduces tool round-trips |
+| **Redis for active state, GraphMCP for memory** | Redis is fast and ephemeral for live sessions; the graph holds long-term NPC lore |
+| **Player name gate via embed, not DMs** | Keeps the conversation in-thread; ephemeral embed auto-deletes after 30s |
+| **Story generator via `/encounter generate`** | Separates creative authoring from real-time inference — generator can use a stronger model later |
+| **VTT relay auto-spin-up** | Lets the bot operate when the relay has been cold-stopped; uses RSA-OAEP for password handoff |
+| **In-world voice rule for player-facing strings** | See `feedback-in-world-voice` — no utility/jargon in bot messages |
+
+---
+
+## 9. Open Issues / Drift
+
+Items the deep scan surfaced that aren't bugs but should be tracked:
+
+- **Drift: `Docs/mardonar-encounter-engine.md` describes a Go bot with an embedded MCP layer; the actual code is TypeScript with an external JSON-RPC GraphMCP server.** Treat the doc as historical/aspirational.
+- **Drift: `README.md`'s "Project Structure" tree references `src/mcp/` and the old `src/bot/commands/{dndname,encounter}.ts` layout.** Update README, or trim it to a pointer to the index.
+- **Duplicate `trimHistory` logic** in `src/session/sessionManager.ts` and `src/harness/contextAssembler.ts` (identical body). Could be extracted to `src/lib/historyTrim.ts`.
+- **No production compose file** — only `docker-compose.dev.yml`. The Dockerfile is production-ready but deployment is ad-hoc.
+- **No deployment automation.** `.github/workflows/` does not exist; CI is limited to the Gitea Actions test workflow in `.gitea/workflows/test.yml`.
+- **`DISCORD_ALLOWED_USERS` is empty by default → anyone in allowed channels can run `/encounter start`.** The access control is channel-scoped, not user-scoped; admins need to set the env var explicitly.
+- **`OLLAMA_BASE_URL` defaults to `localhost`** — fine for dev, but production needs the LAN IP or proxy URL set.
+- **Spec tool list must be kept in sync** — `specs/*.yaml` declare `tools: [...]`, but no test verifies every referenced tool is registered. A stale spec name silently filters to no active tools.
+- **Schema mismatch risk:** `types/index.ts` `EncounterSpec` and `spec/loader.ts` Zod schema have diverged slightly — `EncounterSpec` is missing `tone`, `tools`, `randomizable`, and `npcs.nameKey`. `assembleContext` reads `spec.tone`; `loader` doesn't validate it. Consider regenerating `types/index.ts` from the Zod schema via `z.infer`.
+
+---
+
+*Document generated by `bmad-document-project` initial scan, deep level. Project state recorded in `docs/project-scan-report.json`.*
--- a/docs/component-inventory.md
+++ b/docs/component-inventory.md
@@ -0,0 +1,134 @@
+# Component Inventory
+
+> Reusable and feature-specific components in the Mardonar Encounter Engine. Generated 2026-06-19.
+
+## Discord Components
+
+### Slash commands (reusable across all guilds)
+
+| Component | File | Reusable? | Notes |
+|---|---|---|---|
+| `/dndname` | `src/bot/commands/dndname.ts` | Yes | Character name gate. Universal. |
+| `/encounter` | `src/bot/commands/encounter.ts` | Yes | Encounter lifecycle. Spec-scoped via `start <spec>`. |
+| `/character` | `src/bot/commands/character.ts` | Yes | Full character profile + Foundry link. |
+| `/roll` | `src/bot/commands/roll.ts` | Yes | Manual roll outside encounter. |
+| `/actions` | `src/bot/commands/actions.ts` | Yes | In-character action shortcuts. |
+| `/xp` | `src/bot/commands/xp.ts` | Yes | XP grant. |
+| `/encounters` | `src/bot/commands/encounters.ts` | Yes | Search via GraphMCP. |
+| `/turn` | `src/bot/commands/turn.ts` | Yes | Turn management. |
+
+### Embeds (pure builders, reusable)
+
+| Embed | File | Caller |
+|---|---|---|
+| PlayerGate | `src/bot/embeds/playerGate.ts` | `messageRouter` (unregistered player) |
+| Suspense + SkillCheck | `src/bot/embeds/skillCheck.ts` | `tools/skillCheckEmit.ts` |
+| Resolution | `src/bot/embeds/resolution.ts` | `tools/encounterResolve.ts` |
+| EncounterDiscovery | `src/bot/embeds/encounterDiscovery.ts` | `/encounters` |
+| LoreAnswer | `src/bot/embeds/loreAnswer.ts` | `mentionHandler` |
+
+### Event handlers (reusable, sidecar logic)
+
+| Handler | File | Trigger | Side effects |
+|---|---|---|---|
+| `handleMessage` | `handlers/messageRouter.ts` | `messageCreate` in encounter thread | Gates, debounce, LLM call, tool dispatch |
+| `handleMention` | `handlers/mentionHandler.ts` | `messageCreate` @Zalram | Lore search + persona reply |
+| `handleRollInteraction` | `handlers/rollHandler.ts` | Button / modal submit | Resolves skill check, schedules LLM turn |
+| `scheduleEncounterLLMTurn` | `handlers/messageRouter.ts` | Internal | Debounce → LLM turn |
+| `scheduleLLMTurn` | `handlers/generationQueue.ts` | Internal | Debounce timer |
+| `isBurstCapped` / `sendDropNotice` | `handlers/queueCap.ts` | Pre-append check | Drops + notifies |
+| `registerScheduled` / `drainPending` / `upgradeToProcessing` / `upgradeToComplete` | `handlers/reactionManager.ts` | Per-message | 👀 reaction lifecycle |
+| `filterLLMResponse` / `detectMissedSkillCheck` | `handlers/responseFilter.ts` | Post-LLM | Injects `[FILTER CORRECTION]` |
+
+## LLM Harness Components
+
+### Tool plugins (registered globally, filtered per-encounter)
+
+| Plugin | File | Per-encounter filter | Side effects |
+|---|---|---|---|
+| `skill_check_emit` | `harness/tools/skillCheckEmit.ts` | Spec `tools:` | Posts suspense + dice embed; updates `pendingSkillCheck` |
+| `encounter_resolve` | `harness/tools/encounterResolve.ts` | Spec `tools:` | Writes summary, archives thread |
+| `context_recall` | `harness/tools/contextRecall.ts` | Spec `tools:` | Returns canonical facts from `resolvedContext` |
+| `goal_register` | `harness/tools/goalRegister.ts` | Spec `tools:` | Adds dynamic goal (per `prd.md`) |
+| `foundry_lookup` | `harness/tools/foundryLookup.ts` | Spec `tools:` | Live VTT actor data |
+| `foundry_reward` | `harness/tools/foundryReward.ts` | Spec `tools:` | XP/item grant to VTT actor |
+
+### LLM clients
+
+| Client | File | Role |
+|---|---|---|
+| `llmClient` (router) | `harness/llmClient.ts` | LiteLLM primary, Ollama fallback |
+| `litellmClient` | `harness/litellmClient.ts` | OpenAI-compatible HTTP |
+| `ollamaClient` | `harness/ollamaClient.ts` | Native ollama npm + direct HTTP |
+
+### Pipeline components
+
+| Component | File | Role |
+|---|---|---|
+| `buildSystemPrompt` | `harness/promptBuilder.ts` | 10-block XML system prompt |
+| `assembleContext` | `harness/contextAssembler.ts` | System + pinned + trimmed sliding |
+| `parseToolCall` | `harness/toolParser.ts` | 3-pattern tool block extractor |
+| `buildToolManifest` | `harness/toolDispatcher.ts` | Per-encounter tool contract section |
+| `dispatchTool` | `harness/toolDispatcher.ts` | Active-set validation + dispatch |
+| `getActiveTools` | `harness/toolRegistry.ts` | Per-encounter filter (or all if unset) |
+| `registerTool` | `harness/toolRegistry.ts` | Side-effect registration at module load |
+
+## Session / Data Components
+
+| Component | File | Backend | Surface |
+|---|---|---|---|
+| `sessionManager` | `session/sessionManager.ts` | Redis (TTL 12h) | `create`, `get`, `update`, `addMessage`, `delete`, `getGuildThreadIds` |
+| `playerRegistry` | `session/playerRegistry.ts` | Redis | `(guildId, discordId) → Player` |
+| `characterRegistry` | `session/characterRegistry.ts` | Redis | Character profile (pronouns, Foundry UUID, etc.) |
+| `encounterLog` | `session/encounterLog.ts` | Filesystem | `tally.json` + per-encounter `.txt` in `data/summaries/` |
+| `xpAwarder` | `session/xpAwarder.ts` | VTT relay | XP grant |
+| `redis` singleton | `db/redis.ts` | ioredis | Lazy connect, 3 retries |
+| `loadSpec` | `spec/loader.ts` | YAML + Zod | `EncounterSpecSchema.parse` |
+| `loadPersona` | `persona/loader.ts` | YAML | @Zalram persona |
+
+## GraphMCP Components
+
+| Component | File | RPC method | Used by |
+|---|---|---|---|
+| `queryAsNPC` | `graphmcp/client.ts` | `query_as_npc` | NPC memory injection at session start |
+| `semanticSearch` | `graphmcp/client.ts` | `semantic_search` | @mention lore search |
+| `logEncounter` | `graphmcp/client.ts` | `log_encounter` | Encounter resolve (writes graph node) |
+| `listEncounters` | `graphmcp/client.ts` | `list_encounters` | `/encounters list` |
+| `searchEncounters` | `graphmcp/client.ts` | `search_encounters` | `/encounters search` |
+| `getEncounter` | `graphmcp/client.ts` | `get_encounter` | `/encounters get` |
+| `formatNPCMemory` | `graphmcp/client.ts` | (local) | Render NPCQueryResult as system-prompt text |
+| `publishToGraphMCP` | `graphmcp/ingest.ts` | (Redis stream `raw.messages`) | Fire-and-forget per encounter message |
+| `vocabularyResolver` | `graphmcp/vocabularyResolver.ts` | graphmcp | `randomizable:` lookup |
+| `loreResolver` | `graphmcp/loreResolver.ts` | graphmcp | `/encounter generate` helper |
+
+## VTT Components
+
+| Component | File | Role |
+|---|---|---|
+| `foundryClient` | `vtt/foundryClient.ts` | HTTP client, live actor data + formatters |
+| `relaySession.ensureRelaySession` | `vtt/relaySession.ts` | Auto-spin-up headless session on relay failure |
+| `isRelayDown` | `vtt/relaySession.ts` | Network-failure classifier |
+| `actorCache` (in `tools/skillCheckEmit.ts`) | in-file | 30s in-memory cache for actor details |
+
+## Type system (shared)
+
+| Type | File | Purpose |
+|---|---|---|
+| `EncounterSpec` | `types/index.ts` | Spec shape (note: diverged slightly from Zod schema — see architecture.md §9) |
+| `NpcPersona` | `types/index.ts` | NPC definition |
+| `EncounterGoal` / `EncounterGoals` | `types/index.ts` | Primary/secondary goals |
+| `SessionState` | `types/index.ts` | Full session shape |
+| `ChatMessage` | `types/index.ts` | History turn (with `pinned` flag) |
+| `HeldMessage` | `types/index.ts` | Pre-registration messages |
+| `ToolCallBlock` / `LLMResponse` | `types/index.ts` | LLM tool surface |
+| `ToolName` | `types/index.ts` | String-literal union of tool names |
+| `*Args` per tool | `types/index.ts` | Per-tool arg types |
+| `NpcNode` / `EncounterNode` / `EncounterEventNode` | `types/index.ts` | Neo4j graph node types |
+| `CONTEXT_BUDGET` (const) | `types/index.ts` | Hard token budget zones |
+
+## Config & logging
+
+| Component | File | Role |
+|---|---|---|
+| `config` (singleton) | `config.ts` | Zod-validated env (Discord, Redis, LiteLLM, Ollama, GraphMCP, VTT, persona, logging) |
+| `log` (pino wrapper) | `lib/logger.ts` | Structured logging with `pino-pretty` in dev |
--- a/docs/data-models.md
+++ b/docs/data-models.md
@@ -0,0 +1,212 @@
+# Data Models
+
+> Persistent and transient data shapes in the Mardonar Encounter Engine. Generated 2026-06-19.
+
+The bot's data lives in three places: Redis (transient session state), the filesystem (`data/`, runtime artifacts), and the GraphMCP-backed Neo4j graph (long-term NPC memory + encounter history). The bot does not query Neo4j directly — it goes through the GraphMCP JSON-RPC client.
+
+## Encounter spec (YAML → Zod → TypeScript)
+
+Defined by `EncounterSpecSchema` in `src/spec/loader.ts`. Loaded by `/encounter start <spec-name>`. Stored in `SessionState.spec`.
+
+```ts
+{
+  encounterId: string,            // unique ID — also Neo4j node key
+  title: string,                  // display name in Discord embeds
+  tone?: string,                  // "tense" | "comedic" | ... optional flavor block
+  setting: {
+    location: string,
+    mood: string,                 // multi-line OK
+    ambientNpcs: string,          // multi-line OK
+  },
+  openingNarrative: string,       // multi-line; can reference {{nameKey}} placeholders
+  npcs: [{                        // 1–5 entries
+    id: string,                   // unique stable ID
+    name: string,
+    nameKey?: string,             // placeholder for randomizable substitution
+    role: string,
+    persona: string,              // multi-line
+    memoryKey?: string,           // if set, memory is loaded from / written to graph
+  }],
+  goals: {
+    hidden: boolean,              // default true
+    primary: [{ id: string, label: string }],   // min 1
+    secondary: [{ id: string, label: string }],
+  },
+  sportsmanshipRules: string[],
+  skillChecks: Record<string, number | string>,  // grouped as <name>_dc / <name>_skill / <name>_note
+  randomizable?: [{               // optional
+    key: string,
+    source?: 'graphmcp' | 'vocabulary',
+    category?: string,            // e.g. "names.dwarf.female"
+    query: string,                // free-text query
+    fallback: string,             // always available
+  }],
+  dmNotes?: string,
+  tools?: string[],               // active tool plugin names; empty/undefined = all
+}
+```
+
+`tone` and `tools` are read by the harness but **not in the Zod schema** (see `architecture.md §9` for the schema-vs-types drift).
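
The `skillChecks` field in the spec shape above is a flat record whose keys follow a `<name>_dc` / `<name>_skill` / `<name>_note` convention. A consumer could regroup it per check as sketched here; `groupSkillChecks` and `SkillCheckDef` are hypothetical names, and the sample keys in the usage note are invented for illustration.

```typescript
interface SkillCheckDef {
  dc?: number;
  skill?: string;
  note?: string;
}

// Regroups { pickpocket_dc: 14, pickpocket_skill: '...' } into { pickpocket: { dc, skill } }.
function groupSkillChecks(
  flat: Record<string, number | string>,
): Record<string, SkillCheckDef> {
  const grouped: Record<string, SkillCheckDef> = {};

  for (const [key, value] of Object.entries(flat)) {
    const match = /^(.+)_(dc|skill|note)$/.exec(key);
    if (!match) continue; // keys outside the convention are ignored in this sketch

    const name = match[1] ?? '';
    const field = match[2];
    const entry = grouped[name] ?? (grouped[name] = {});

    if (field === 'dc') {
      if (typeof value === 'number') entry.dc = value;
    } else if (field === 'skill') {
      if (typeof value === 'string') entry.skill = value;
    } else if (typeof value === 'string') {
      entry.note = value;
    }
  }
  return grouped;
}
```

For example, `groupSkillChecks({ pickpocket_dc: 14, pickpocket_skill: 'Sleight of Hand' })` yields a single `pickpocket` entry carrying both fields.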
+
+## SessionState (Redis)
+
+Stored as JSON under key `session:{threadId}`. Schema in `src/types/index.ts`:
+
+```ts
+{
+  encounterId: string,
+  threadId: string,               // Discord thread snowflake
+  guildId: string,
+  spec: EncounterSpec,
+  players: Record<discordId, Player>,
+  history: ChatMessage[],         // pinned + sliding mix
+  phase: 'open' | 'active' | 'resolved',
+  heldMessages: HeldMessage[],    // for unregistered players
+  outcome?: string,               // goal ID when resolved
+  outcomeSummary?: string,
+  npcMemories?: Record<npcId, string>,    // injected into system prompt
+  resolvedContext?: Record<key, string>,  // canonical session facts (context_recall)
+  pendingSkillCheck?: {
+    player: string,
+    prompt: string,
+    dc: number,
+    messageId: string,            // Discord message ID of the dice embed
+    modifier?: number,
+    skill?: string,
+    advantage?: boolean,
+    disadvantage?: boolean,
+  },
+  pendingSkillCheckAttempts?: number,
+  createdAt: number,
+  updatedAt: number,
+}
+```
+
+## ChatMessage
+
+```ts
+{
+  role: 'system' | 'user' | 'assistant',
+  content: string,
+  pinned?: boolean,               // never trimmed by contextAssembler
+  timestamp: number,
+}
+```
+
+System messages are emitted by the harness for tool results, filter corrections, and join events. Assistant messages contain the LLM's narrative.
+
+## Player
+
+```ts
+{
+  discordId: string,
+  dndName: string,
+  pronouns?: string,              // populated from characterRegistry if set
+}
+```
+
+`pronouns` is added on first appearance in an encounter thread if the player has a `characterRegistry` profile.
+
+## Character profile (characterRegistry)
+
+```ts
+{
+  discordId: string,
+  guildId: string,
+  dndName: string,
+  pronouns?: string,
+  characterClass?: string,
+  race?: string,
+  level?: number,
+  backstory?: string,
+  foundryActorUuid?: string,      // link to Foundry VTT actor
+  inventory?: unknown[],          // populated from /character view
+  spells?: unknown[],             // populated from /character view
+  // ... additional Foundry-derived fields
+}
+```
+
+## Neo4j graph (via GraphMCP)
+
+The bot does not directly define the Neo4j schema; it consumes whatever GraphMCP returns. The conceptual model below is inferred from the GraphMCP client types and the legacy design doc:
+
+```
+(:NPC {id, name, persona_summary, memory: [], last_seen_encounter})
+  -[:APPEARED_IN]->
+(:Encounter {id, title, resolved, outcome_id, created_at})
+  -[:HAS_EVENT]->
+(:EncounterEvent {timestamp, type, description})
+  -[:FEATURED]->
+(:Entity {name, kind})
+
+(:Player {discord_id, dnd_name})
+  -[:PARTICIPATED_IN]->
+(:Encounter)
+```
+
+The bot writes to the graph via `log_encounter` (one encounter node + participants). It reads NPC memory via `query_as_npc` and the broader corpus via `semantic_search`.
+
+## File system (`data/`)
+
+```
+data/
+├── tally.json                    // { [specName]: { runs: number, lastRun: ISO8601 } }
+└── summaries/
+    └── {encounterId}-{ISO8601-with-dashes}.txt
+        // human-readable per-encounter summary
+        // header: Encounter, ID, Thread, Date, Outcome, Players
+        // body: free-text Summary
+```
+
+`tally.json` is rewritten atomically on each encounter start. The `summaries/` directory is append-only: each resolved encounter adds one new file.
+
+## Tool call payloads
+
+```ts
+// What the LLM emits
+type ToolCallBlock = {
+  tool: ToolName,
+  args: Record<string, unknown>,
+}
+
+// What the harness parses back from the LLM response
+type LLMResponse = {
+  narrative: string,
+  toolCall?: ToolCallBlock,
+  rawTokensUsed?: number,
+}
+```
+
+Tool names (`src/types/index.ts`):
+
+```ts
+type ToolName =
+  | 'skill_check_emit'
+  | 'skill_check_resolve'         // (defined in types but no longer registered — see architecture.md §9)
+  | 'event_log_append'            // (defined in types but no longer registered)
+  | 'npc_memory_read'             // (defined in types but no longer registered)
+  | 'npc_memory_write'            // (defined in types but no longer registered)
+  | 'encounter_resolve'
+  | 'goal_register'
+  | 'context_recall'
+  | 'foundry_lookup'
+  | 'foundry_reward';
+```
+
+Four entries (`skill_check_resolve`, `event_log_append`, `npc_memory_read`, `npc_memory_write`) are **dead** in the current implementation, replaced by GraphMCP `log_encounter` and other RPC calls. `encounter_resolve` is unaffected and still registered. The dead entries should be removed from the type union (or re-implemented) to avoid confusion.
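+
+Until the union is cleaned up, one possible stopgap is to derive a narrower type that excludes the entries marked as no longer registered. This is only a sketch: `DeadToolName`, `RegisteredToolName`, and `isRegistered` are hypothetical names and do not exist in `src/types/index.ts`.
+
+```ts
+// Copied from the ToolName union documented above.
+type ToolName =
+  | "skill_check_emit"
+  | "skill_check_resolve"
+  | "event_log_append"
+  | "npc_memory_read"
+  | "npc_memory_write"
+  | "encounter_resolve"
+  | "goal_register"
+  | "context_recall"
+  | "foundry_lookup"
+  | "foundry_reward";
+
+// Hypothetical: the four entries flagged as defined but not registered.
+type DeadToolName =
+  | "skill_check_resolve"
+  | "event_log_append"
+  | "npc_memory_read"
+  | "npc_memory_write";
+
+// Hypothetical: the names a spec could still list under `tools:`.
+type RegisteredToolName = Exclude<ToolName, DeadToolName>;
+
+const DEAD_TOOLS: readonly ToolName[] = [
+  "skill_check_resolve",
+  "event_log_append",
+  "npc_memory_read",
+  "npc_memory_write",
+];
+
+// Runtime guard that narrows ToolName to RegisteredToolName.
+function isRegistered(name: ToolName): name is RegisteredToolName {
+  return !DEAD_TOOLS.includes(name);
+}
+```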
+
+## Context budget (compile-time const)
+
+`src/types/index.ts`:
+
+```ts
+export const CONTEXT_BUDGET = {
+  SYSTEM: 4_000,
+  PINNED: 2_000,
+  HISTORY: 118_000,
+  SAFETY: 3_500,
+  TOTAL: 128_000,
+} as const;
+```
+
+Used by `contextAssembler` and `sessionManager` to enforce the trimming policy.
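+
+To make the trimming policy concrete, the sketch below keeps the newest messages that fit inside `CONTEXT_BUDGET.HISTORY`. It is not the project's implementation: the `Message` type, the 4-characters-per-token estimate in `estimateTokens`, and the `trimToBudget` name are all assumptions for illustration.
+
+```ts
+// Values copied from the CONTEXT_BUDGET constant above.
+const CONTEXT_BUDGET = {
+  SYSTEM: 4_000,
+  PINNED: 2_000,
+  HISTORY: 118_000,
+  SAFETY: 3_500,
+  TOTAL: 128_000,
+} as const;
+
+// Hypothetical message shape for this sketch.
+type Message = { role: "system" | "user" | "assistant"; content: string };
+
+// Assumption: roughly 4 characters per token. A real tokenizer will differ.
+function estimateTokens(text: string): number {
+  return Math.ceil(text.length / 4);
+}
+
+// Walk from newest to oldest, keeping messages until the budget would overflow.
+function trimToBudget(
+  history: readonly Message[],
+  budget: number = CONTEXT_BUDGET.HISTORY,
+): Message[] {
+  const kept: Message[] = [];
+  let used = 0;
+  for (let i = history.length - 1; i >= 0; i--) {
+    const msg = history[i];
+    if (msg === undefined) continue;
+    const cost = estimateTokens(msg.content);
+    if (used + cost > budget) break;
+    kept.unshift(msg); // preserve chronological order
+    used += cost;
+  }
+  return kept;
+}
+```
+
+Trimming from the oldest end keeps the most recent turns intact, which is usually what a sliding history window wants. Pinned messages would be handled separately under the `PINNED` budget.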
--- a/docs/deployment-guide.md
+++ b/docs/deployment-guide.md
@@ -0,0 +1,219 @@
+# Deployment Guide
+
+> Deploying the Mardonar Encounter Engine. Generated 2026-06-19.
+
+## Architecture
+
+The bot is a single long-running Node.js process. It connects to:
+
+- **Discord** over WebSocket (discord.js v14)
+- **Redis** for session and player/character registries
+- **GraphMCP** (HTTP JSON-RPC) for NPC memory, lore search, and encounter log writes
+- **LiteLLM** (preferred) or **Ollama** for LLM inference
+- **VTT relay** (optional) for Foundry VTT integration
+
+The Dockerfile is multi-stage Node 22 alpine. There is currently no production `docker-compose.yml` — only the dev one (`docker-compose.dev.yml`). Production deploys use the Dockerfile directly with whatever orchestrator is in use.
+
+## Build
+
+```bash
+npm ci --ignore-scripts
+npm run build          # tsc → dist/
+```
+
+The build is reproducible from a clean checkout: `npm ci` removes any existing `node_modules` and installs exactly what `package-lock.json` pins. The Dockerfile's builder stage does exactly this.
+
+## Container image
+
+`Dockerfile`:
+
+- **Builder** (`node:22-alpine`): `npm ci --ignore-scripts`, copy `src` + `tsconfig.json`, run `npm run build`
+- **Runtime** (`node:22-alpine`): `npm ci --omit=dev --ignore-scripts`, copy `dist/`, `specs/`, `lore/`, `persona.yaml`
+- **CMD**: `["node", "dist/bot/index.js"]`
+
+To build locally:
+
+```bash
+docker build -t mardonar-bot:latest .
+```
+
+The `data/` directory is not copied into the image — it must be mounted as a volume in production so tally and summaries persist across restarts.
+
+## Local dev (Docker Compose)
+
+`docker-compose.dev.yml` is the only compose file in the repo. It declares the `mardonar-internal` Docker network as `external: true` — it expects the GraphMCP-Example stack (Redis + MCP server) to be running first.
+
+```bash
+docker compose -f docker-compose.dev.yml up -d
+docker compose -f docker-compose.dev.yml logs -f bot
+```
+
+Two services:
+
+- **`deploy-commands`** — one-shot container that runs `node dist/scripts/deploy-commands.js`. `restart: "no"`.
+- **`bot`** — long-running container. `restart: unless-stopped`. Mounts `./data:/app/data` so tally and summaries persist. `depends_on: deploy-commands: service_completed_successfully` ensures commands are registered before the bot starts serving traffic.
+
+## Production deployment
+
+There is no production compose file. Pick one:
+
+### Option A: Plain Docker
+
+```bash
+docker build -t mardonar-bot:latest .
+docker run -d \
+  --name mardonar-bot \
+  --restart unless-stopped \
+  --env-file .env \
+  -v /var/lib/mardonar/data:/app/data \
+  --network mardonar-internal \
+  mardonar-bot:latest
+```
+
+Register commands once before the bot serves traffic (either via the `deploy-commands` service or by running the same image with a different command):
+
+```bash
+docker run --rm \
+  --env-file .env \
+  --network mardonar-internal \
+  mardonar-bot:latest \
+  node dist/scripts/deploy-commands.js
+```
+
+### Option B: systemd (Linux host)
+
+```ini
+# /etc/systemd/system/mardonar-bot.service
+[Unit]
+Description=Mardonar Encounter Engine
+After=network.target redis-server.service
+
+[Service]
+Type=simple
+User=mardonar
+WorkingDirectory=/opt/mardonar
+EnvironmentFile=/opt/mardonar/.env
+ExecStart=/usr/bin/node /opt/mardonar/dist/bot/index.js
+Restart=on-failure
+RestartSec=5
+
+[Install]
+WantedBy=multi-user.target
+```
+
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable --now mardonar-bot
+sudo journalctl -u mardonar-bot -f
+```
+
+## Environment
+
+All runtime configuration is via environment variables, validated by Zod (`src/config.ts`). The full list is in [`development-guide.md`](./development-guide.md#environment-configuration-reference).
+
+Production essentials:
+
+```env
+DISCORD_TOKEN=...
+DISCORD_CLIENT_ID=...
+DISCORD_GUILD_ID=...           # instant command registration
+
+# Channel restriction: only respond in specific channels
+DISCORD_ALLOWED_CHANNELS=123456789012345678,987654321098765432
+# User restriction: only allow specific users to run /encounter
+DISCORD_ALLOWED_USERS=111111111111111111
+
+# LiteLLM (preferred)
+LITELLM_BASE_URL=http://your-litellm-host:4000
+LITELLM_API_KEY=...
+LITELLM_MODEL=ollama-cloud
+
+# Ollama fallback
+OLLAMA_BASE_URL=http://your-ollama-host:11434
+OLLAMA_MODEL=gemma4-it:e2b
+
+# GraphMCP (must be reachable)
+GRAPHMCP_URL=http://mcp-server:9000
+GRAPHMCP_SCORE_THRESHOLD=0.68
+GRAPHMCP_INGEST_STREAM=raw.messages
+
+# Persisted state
+DATA_DIR=/app/data              # or wherever you mount the volume
+
+# Logging
+LOG_LEVEL=info
+```
+
+> ⚠ **Security note:** `DISCORD_ALLOWED_CHANNELS` is **empty by default**, which means the bot will respond in **no channels**. This is secure-by-default but easy to misconfigure. Set it explicitly.
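+
+The two allow-lists have opposite defaults (an empty channel list admits no channels, an empty user list admits all users), which the following sketch spells out. The function names are hypothetical and the real parsing lives in the Zod schema in `src/config.ts`, so treat this as an illustration of the semantics only.
+
+```ts
+// Hypothetical: split a comma-separated env value into a set of IDs.
+function parseIdList(raw: string | undefined): Set<string> {
+  return new Set(
+    (raw ?? "")
+      .split(",")
+      .map((id) => id.trim())
+      .filter((id) => id.length > 0),
+  );
+}
+
+// Empty channel list: the bot responds nowhere (secure by default).
+function isChannelAllowed(allowed: ReadonlySet<string>, channelId: string): boolean {
+  return allowed.has(channelId);
+}
+
+// Empty user list: every user may run /encounter.
+function isUserAllowed(allowed: ReadonlySet<string>, userId: string): boolean {
+  return allowed.size === 0 || allowed.has(userId);
+}
+
+const channels = parseIdList("123456789012345678, 987654321098765432");
+const users = parseIdList(undefined);
+```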
+
+## Persistent state
+
+Two kinds of state to back up:
+
+1. **`data/tally.json`** — per-spec run counts. Useful for analytics, not load-bearing.
+2. **`data/summaries/`** — one `.txt` per resolved encounter. Permanent record.
+
+Session state lives in Redis with a 12h TTL. If Redis is wiped, in-flight sessions are lost but Discord threads themselves remain — the bot will simply not find a session for that thread on next message. No data corruption risk.
+
+## Health checks
+
+The bot does not currently expose an HTTP health endpoint. Suggested liveness probe patterns:
+
+- **Discord WebSocket liveness** — the bot logs `[bot] Logged in as <tag>` on ready. Scrape stdout for this.
+- **Redis** — already externally monitored. The bot logs `[redis] connection error` on failure.
+- **GraphMCP** — first call after startup will fail loudly if unreachable.
+- **Custom probe** — call `/encounter status` in a known thread and check the response (the bot only responds in `DISCORD_ALLOWED_CHANNELS`).
+
+A `docker` healthcheck cannot easily observe the Discord WebSocket from outside the process. If you need an HTTP probe, add a small HTTP server (for example Express) in a future iteration that responds 200 only while the Discord client is `ready` and Redis is connected.
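+
+A minimal sketch of such a probe is shown below. It uses Node's built-in `node:http` module instead of Express to stay dependency-free. The `/healthz` path, the `Health` shape, and both function names are assumptions for this example, and nothing like it exists in the codebase yet.
+
+```ts
+import { createServer } from "node:http";
+import type { Server } from "node:http";
+
+// Hypothetical snapshot of the two dependencies the probe should reflect.
+type Health = { discordReady: boolean; redisConnected: boolean };
+
+// Pure mapping from health snapshot to HTTP status and JSON body.
+function probe(h: Health): { status: number; body: string } {
+  const ok = h.discordReady && h.redisConnected;
+  return { status: ok ? 200 : 503, body: JSON.stringify({ ok, ...h }) };
+}
+
+// Builds the server without listening. The caller decides the port.
+function createHealthServer(getHealth: () => Health): Server {
+  return createServer((req, res) => {
+    if (req.url !== "/healthz") {
+      res.writeHead(404).end();
+      return;
+    }
+    const { status, body } = probe(getHealth());
+    res.writeHead(status, { "content-type": "application/json" }).end(body);
+  });
+}
+```
+
+In the bot, `getHealth` would read the Discord client's ready state and the Redis connection status, and the server would be started with something like `createHealthServer(getHealth).listen(8080)` (port chosen arbitrarily here).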
+
+## Logging
+
+The bot uses pino. In dev, `pino-pretty` formats to a human-readable stream. In prod, pino emits structured JSON to stdout — pipe to your log shipper (Loki, CloudWatch, etc.).
+
+Useful fields to index:
+
+- `level`, `time`, `msg`
+- `threadId`, `encounterId` (for encounter-specific queries)
+- `latencyMs` (for LLM and tool latency)
+- `error` (for failure analysis)
+
+## Operational runbook
+
+### Restart the bot
+```bash
+docker restart mardonar-bot
+# or: systemctl restart mardonar-bot
+```
+
+### Rotate the Discord token
+1. Generate a new token in the Discord developer portal. The old token is invalidated immediately, so complete steps 2 and 3 promptly.
+2. Update the env var (or secret store)
+3. Restart the bot
+
+### Re-register slash commands
+After changing any `src/bot/commands/*.ts`:
+```bash
+docker run --rm --env-file .env --network mardonar-internal mardonar-bot:latest \
+  node dist/scripts/deploy-commands.js
+```
+
+Or in dev: `npm run deploy-commands`
+
+### Reset a stuck session
+A bot restart clears all in-memory state (including reaction managers and burst counters). Redis session state persists. If a session is genuinely stuck (e.g. a tool dispatched but the response was lost), use `/encounter end` in-thread to force-resolve.
+
+### Drain Redis (nuclear option)
+```bash
+docker exec -it <redis-container> redis-cli FLUSHDB
+```
+
+`FLUSHDB` deletes every key in the selected database. Because the player and character registries live in Redis alongside sessions, they are wiped as well.
+
+## Open deployment gaps
+
+These are real but not blockers:
+
+- **No production compose file** — only `docker-compose.dev.yml`. Production deploy is ad-hoc.
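+- **No CD** — `.gitea/workflows/test.yml` runs the test suite and a `tsc --noEmit` gate in Gitea Actions, but there is no `.github/workflows/` and build and deploy remain manual.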
+- **No health endpoint** — no HTTP probe target.
+- **No metrics export** — pino logs are the only observability surface.
+- **`docker-compose.dev.yml` references an external Docker network (`mardonar-internal`)** — fine for the dev stack it's designed for, but a fresh deployment needs to either join the same network or remove the reference.
--- a/docs/development-guide.md
+++ b/docs/development-guide.md
@@ -0,0 +1,193 @@
+# Development Guide
+
+> How to set up, run, test, and develop the Mardonar Encounter Engine. Generated 2026-06-19.
+
+## Prerequisites
+
+- **Node.js 22+** (matches the Dockerfile runtime)
+- **Docker + Docker Compose** (for local Redis and Neo4j)
+- **Ollama** running somewhere reachable, with `gemma4-it:e2b` pulled — *or* a LiteLLM proxy (preferred, set `LITELLM_BASE_URL`)
+- **A Discord bot token and application ID** with a registered bot user
+- npm 10+
+
+## First-time setup
+
+```bash
+git clone <your-repo>
+cd mardonar-npcs
+npm install
+cp .env.example .env
+# Edit .env — at minimum set DISCORD_TOKEN, DISCORD_CLIENT_ID
+```
+
+The `.env` file is validated by Zod (`src/config.ts`) at import time. A missing required var (e.g. `DISCORD_TOKEN`) will crash the bot on startup with a clear error.
+
+## Local services
+
+```bash
+docker compose -f docker-compose.dev.yml up -d
+```
+
+This starts:
+
+- **Redis** on `localhost:6379`
+- **Neo4j** on `localhost:7687` (browser UI at `http://localhost:7474`, login `neo4j` / `mardonardev`)
+
+The `mardonar-internal` Docker network is declared as `external: true` — it expects to be created by the GraphMCP-Example stack. If you run just the bot without GraphMCP, you can remove that network reference, but `/encounter start` will fail at NPC memory lookup.
+
+## Register slash commands
+
+Run once per bot deployment, or whenever commands change:
+
+```bash
+npm run deploy-commands
+```
+
+If `DISCORD_GUILD_ID` is set, registers to that guild instantly. If unset, registers globally (up to 1h propagation delay). The deploy script also clears any lingering global commands first, to avoid double-registration.
+
+## Run the bot
+
+```bash
+npm run dev          # development: tsx watch mode (auto-reload)
+npm run build        # compile TypeScript to dist/
+npm run start        # run the compiled output
+```
+
+The bot logs to stdout (pino with `pino-pretty` in dev). Set `LOG_LEVEL=debug` for verbose output.
+
+## Testing
+
+```bash
+npm run test            # all tests
+npm run test:unit       # unit only (no external services)
+npm run test:int        # integration (requires docker compose up)
+npm run test:coverage   # tests with the v8 coverage report
+npm run test:watch      # watch mode
+```
+
+Test layout:
+
+- `tests/unit/` — 21 fast unit test files at the time this guide was generated, none with external dependencies
+- `tests/integration/phase1.test.ts` — requires running Redis + Neo4j
+- `tests/fixtures/spec.ts` — shared spec fixture
+
+Vitest is configured with v8 coverage. The `vitest.config.ts` includes `src/**/*.ts` for coverage and `tests/**/*.test.ts` for the test pattern.
+
+## Adding a new encounter
+
+1. Copy `specs/market-thief.yaml` to `specs/your-encounter.yaml`
+2. Fill in: `encounterId`, `title`, `tone`, `setting`, `openingNarrative`, `npcs[]` (with optional `memoryKey` and `nameKey`), `goals`, `sportsmanshipRules`, `skillChecks` (group as `<name>_dc / <name>_skill / <name>_note` triples), and the `tools:` list
+3. Add `randomizable[]` entries if you want parts of the spec (e.g. NPC names, item descriptions) to be filled from GraphMCP vocabulary at load time
+4. In Discord: `/encounter start your-encounter`
+
+See `specs/SPEC_FORMAT.md` for the canonical reference.
+
+## Adding a new slash command
+
+1. Create `src/bot/commands/<name>.ts` exporting `data` (SlashCommandBuilder) and `execute(interaction, client)`
+2. Register it in `src/bot/index.ts` (`commands.set('<name>', ...)`)
+3. Add it to `src/scripts/deploy-commands.ts` (`commands.push(data.toJSON())`)
+4. Run `npm run deploy-commands`
+
+## Adding a new LLM tool
+
+1. Create `src/harness/tools/<name>.ts` with a `ToolPlugin` definition and call `registerTool(plugin)` at the bottom
+2. Import the file in `src/harness/tools/index.ts` (side-effect import)
+3. Reference it from any spec's `tools: [...]` array to make it active
+4. Add a unit test in `tests/unit/`
+
+The tool's `args` schema (string / number / boolean) is surfaced to the LLM via `buildToolManifest`, so the model sees typed arg descriptions in the system prompt. Use `contextDocs(spec)` to inject spec-specific guidance (e.g. preset DCs).
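+
+The self-registration flow can be sketched as follows. The function names `registerTool`, `getActiveTools`, and `buildToolManifest` come from the codebase, but their signatures, the `ToolPlugin` fields, the manifest format, and the `dice_hint` tool are all assumptions invented for this example. Check `src/harness/toolRegistry.ts` for the real shapes.
+
+```ts
+type ArgType = "string" | "number" | "boolean";
+
+// Hypothetical plugin shape for this sketch.
+interface ToolPlugin {
+  name: string;
+  description: string;
+  args: Record<string, { type: ArgType; description: string }>;
+  handler: (args: Record<string, unknown>) => Promise<string>;
+}
+
+const registry = new Map<string, ToolPlugin>();
+
+// Called at the bottom of each tool module (side effect on import).
+function registerTool(plugin: ToolPlugin): void {
+  if (registry.has(plugin.name)) throw new Error(`duplicate tool: ${plugin.name}`);
+  registry.set(plugin.name, plugin);
+}
+
+// Resolve a spec's `tools:` list to plugins, skipping unknown names.
+function getActiveTools(names: readonly string[]): ToolPlugin[] {
+  const active: ToolPlugin[] = [];
+  for (const name of names) {
+    const plugin = registry.get(name);
+    if (plugin !== undefined) active.push(plugin);
+  }
+  return active;
+}
+
+// Render typed arg descriptions for inclusion in the system prompt.
+function buildToolManifest(tools: readonly ToolPlugin[]): string {
+  return tools
+    .map((t) => {
+      const args = Object.entries(t.args)
+        .map(([name, spec]) => `  - ${name}: ${spec.type} (${spec.description})`)
+        .join("\n");
+      return `${t.name}: ${t.description}\n${args}`;
+    })
+    .join("\n\n");
+}
+
+// Example plugin (hypothetical, not one of the real tools).
+registerTool({
+  name: "dice_hint",
+  description: "Suggest a die to roll",
+  args: { sides: { type: "number", description: "number of sides" } },
+  handler: async (args) => `Roll a d${String(args["sides"])}`,
+});
+```
+
+Because registration happens as an import side effect, forgetting step 2 (the import in `tools/index.ts`) leaves the tool out of the registry even when a spec lists it.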
+
+## Adding a new event handler
+
+1. Create the handler in `src/bot/handlers/<name>.ts`
+2. Wire it from `src/bot/index.ts` or another handler (e.g. `messageRouter`)
+3. Prefer pure functions for transforms; reserve stateful modules for cross-call persistence
+
+## Environment configuration reference
+
+| Var | Default | Purpose |
+|---|---|---|
+| `DISCORD_TOKEN` | (required) | Bot user token |
+| `DISCORD_CLIENT_ID` | (required) | Application ID |
+| `DISCORD_GUILD_ID` | unset | If set, instant guild-scoped command registration |
+| `DISCORD_ALLOWED_CHANNELS` | empty → no channels | Comma-separated channel IDs the bot will respond in |
+| `DISCORD_ALLOWED_USERS` | empty → all users | Comma-separated user IDs allowed to run /encounter |
+| `REDIS_URL` | `redis://localhost:6379` | ioredis connection string |
+| `SESSION_TTL_HOURS` | 12 | Session TTL in Redis |
+| `LITELLM_BASE_URL` | (recommended) | LiteLLM proxy URL — preferred LLM client |
+| `LITELLM_API_KEY` | unset | Optional API key for the proxy |
+| `LITELLM_MODEL` | falls back to `OLLAMA_MODEL` | Model name as configured in LiteLLM |
+| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama HTTP endpoint (fallback) |
+| `OLLAMA_MODEL` | `gemma4-it:e2b` | Ollama model name |
+| `OLLAMA_TEMPERATURE` | 0.75 | Sampling temperature (0–2) |
+| `OLLAMA_NUM_CTX` | 131072 | Context window in tokens |
+| `OLLAMA_TIMEOUT_MS` | 120000 | LLM call timeout |
+| `GRAPHMCP_URL` | `http://localhost:9000` | GraphMCP JSON-RPC endpoint |
+| `GRAPHMCP_SCORE_THRESHOLD` | 0.68 | Min similarity for NPC memory chunks |
+| `GRAPHMCP_NPC_MEMORY_LIMIT` | 5 | Max memory chunks per NPC |
+| `GRAPHMCP_MENTION_LIMIT` | 5 | Max chunks for @mention search |
+| `GRAPHMCP_INGEST_STREAM` | `raw.messages` | Redis stream name for encounter ingest |
+| `SPECS_DIR` | `./specs` | Encounter YAML directory |
+| `ENCOUNTER_ARCHIVE_DELAY_MS` | 5000 | Delay before archiving resolved thread |
+| `ENCOUNTER_GATE_TIMEOUT_MS` | 30000 | Player-gate embed auto-delete delay |
+| `PERSONA_PATH` | `./persona.yaml` | @mention persona YAML |
+| `DATA_DIR` | `./data` | Tally + summary directory |
+| `VTT_RELAY_URL` | `https://vtt-relay.damascusfront.net` | Foundry VTT relay endpoint |
+| `VTT_API_KEY` | empty → VTT disabled | API key for the relay |
+| `VTT_CLIENT_ID` | empty | Client ID for the relay |
+| `VTT_FOUNDRY_URL` | empty | Foundry URL for headless spin-up |
+| `VTT_USERNAME` | empty | Foundry username |
+| `VTT_PASSWORD` | empty | Foundry password (encrypted with RSA-OAEP for handoff) |
+| `VTT_WORLD` | empty | Foundry world to launch |
+| `LOG_LEVEL` | `info` | `trace` / `debug` / `info` / `warn` / `error` |
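+
+To show how `GRAPHMCP_SCORE_THRESHOLD` and `GRAPHMCP_NPC_MEMORY_LIMIT` might combine, here is a small sketch using the table's defaults. The `MemoryChunk` shape and `selectMemory` name are hypothetical. How the bot actually orders and filters chunks is not documented here, so the highest-score-first ordering is an assumption.
+
+```ts
+// Hypothetical chunk shape: text plus a similarity score.
+type MemoryChunk = { text: string; score: number };
+
+// Keep chunks at or above the threshold, best first, capped at `limit`.
+function selectMemory(
+  chunks: readonly MemoryChunk[],
+  threshold: number = 0.68, // GRAPHMCP_SCORE_THRESHOLD default
+  limit: number = 5, // GRAPHMCP_NPC_MEMORY_LIMIT default
+): MemoryChunk[] {
+  return chunks
+    .filter((c) => c.score >= threshold)
+    .sort((a, b) => b.score - a.score)
+    .slice(0, limit);
+}
+
+const sampleChunks: MemoryChunk[] = [
+  { text: "met the party at the docks", score: 0.91 },
+  { text: "owes the rogue a favor", score: 0.7 },
+  { text: "unrelated market gossip", score: 0.4 },
+];
+const selectedChunks = selectMemory(sampleChunks);
+```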
+
+## Common tasks
+
+### View current encounter state
+In Discord, in an encounter thread: `/encounter status`
+
+### List active encounters in the guild
+`/encounter list`
+
+### Search past encounters
+`/encounters` then use the modal
+
+### Force-end an encounter
+`/encounter end [notes]`
+
+### Inspect the most recent encounter summary
+`/encounter audit` (DMs the file) — or read `data/summaries/` directly
+
+### Tail the bot log
+With pino-pretty in dev, logs are pretty-printed to stdout. In prod, pipe container stdout to your log shipper.
+
+### Reset Redis state
+```bash
+docker compose -f docker-compose.dev.yml down -v
+docker compose -f docker-compose.dev.yml up -d
+```
+
+## Troubleshooting
+
+| Symptom | Likely cause |
+|---|---|
+| `ZodError` at startup | Missing or malformed env var. Check `.env` against `.env.example`. |
+| `DISCORD_ALLOWED_CHANNELS` empty → bot never responds | The bot refuses to respond outside allowed channels by design. Set the env var. |
+| `ECONNREFUSED` to Redis | `docker compose -f docker-compose.dev.yml up -d` not run, or wrong `REDIS_URL`. |
+| `ECONNREFUSED` to GraphMCP | GraphMCP-Example stack not running, or wrong `GRAPHMCP_URL`. Encounter start will fail at NPC memory fetch. |
+| LLM never responds | LiteLLM down → falls back to Ollama. Check `OLLAMA_BASE_URL` and that the model is pulled. |
+| Tool call never fires | LLM emitted a `tool_call` block but the tool name is misspelled or not in the spec's `tools:` list. Check `toolParser` warnings. |
+| Skill check embed buttons do nothing | `PENDING_ROLL_LIMIT` (5) reached; encounter auto-fails. Look for the `[SKILL CHECK RESULT] ... auto-cancelled` system message. |
+| VTT integration silently skipped | `VTT_API_KEY` empty. Set the var to enable. |
+| Spec fails to load | Run `/encounter spec` for the YAML. Schema is in `src/spec/loader.ts`. |
+| High latency on LLM calls | Likely under-sized `OLLAMA_NUM_CTX` vs. assembled context. Check `CONTEXT_BUDGET` in `src/types/index.ts`. |
+
+## Project conventions
+
+- **TypeScript strict mode**, ESM modules, NodeNext resolution. All imports use `.js` extensions even for `.ts` source.
+- **Shared types live only in `src/types/index.ts`.** Do not duplicate definitions elsewhere.
+- **Tool plugins are self-registering** — each `harness/tools/<name>.ts` calls `registerTool()` at load. The `index.ts` aggregator imports them for side effects.
+- **Discord embeds are pure builders** — no I/O, no `await`. Pass typed args, return an embed.
+- **Event handlers live in `src/bot/handlers/`.** The runtime heart is `messageRouter.ts`.
+- **In-world voice for player-facing strings** — see `feedback-in-world-voice` memory. No utility terms like "session", "user", "ephemeral" in bot messages.
+- **All env access goes through `import { config }` from `src/config.ts`** — never read `process.env` directly.
+- **Tests use Vitest globals** — no explicit `import { describe, it, expect }` in test files.
--- a/docs/index.md
+++ b/docs/index.md
@@ -0,0 +1,55 @@
+# Mardonar Encounter Engine — Documentation Index
+
+> Primary entry point for AI-assisted development. Generated 2026-06-19 from a deep scan.
+
+## Project Overview
+
+- **Type:** Monolith — single-part backend
+- **Primary Language:** TypeScript (Node.js 22, ESM)
+- **Architecture:** Layered backend with plugin-style LLM tool registry
+- **Project name (config):** big-red
+- **Repository name:** mardonar-npcs
+
+## Quick Reference
+
+- **Tech stack:** Node.js 22 · TypeScript 5.8 · discord.js v14 · LiteLLM (primary) + Ollama (fallback) · ioredis · GraphMCP JSON-RPC (Neo4j-backed) · Zod · pino · Vitest · Docker
+- **Entry point:** `src/bot/index.ts` (compiled to `dist/bot/index.js`)
+- **Architecture pattern:** Layered (bot → harness → session/db/graphmcp/vtt) with per-encounter tool plugin filtering
+
+## Generated Documentation
+
+- [Project Overview](./project-overview.md)
+- [Architecture](./architecture.md)
+- [Source Tree Analysis](./source-tree-analysis.md)
+- [Component Inventory](./component-inventory.md)
+- [Development Guide](./development-guide.md)
+- [Deployment Guide](./deployment-guide.md)
+- [API Contracts](./api-contracts.md)
+- [Data Models](./data-models.md)
+
+## Existing Documentation
+
+These pre-existed in `Docs/` and were cross-referenced during generation. Note that some are partially out of date.
+
+- [Docs/mardonar-encounter-engine.md](../Docs/mardonar-encounter-engine.md) — Original system design doc. **Out of date** — describes a Go bot with embedded MCP; the actual implementation is TypeScript with an external GraphMCP. Use `docs/architecture.md` as the source of truth.
+- [Docs/mardonar-build-plan.md](../Docs/mardonar-build-plan.md) — Phased build plan with packages and test guidance
+- [Docs/epics.md](../Docs/epics.md) — Epic list
+- [Docs/stories/](../Docs/stories/) — Story specs (1.1, 1.2, 2.1, 3.1, 4.1)
+- [Docs/ux-designs/ux-mardonar-2026-05-30/](../Docs/ux-designs/ux-mardonar-2026-05-30/) — UX session artifacts (EXPERIENCE.md, DESIGN.md, decision-log.md)
+- [README.md](../README.md) — Player-facing intro, quick start, command list. Project-structure tree is out of date.
+- [prd.md](../prd.md) — Active PRD: Dynamic Goal Registration
+
+## Getting Started
+
+1. Skim [Project Overview](./project-overview.md) (1 minute)
+2. Read [Architecture](./architecture.md) sections 1–6 for the system design (10 minutes)
+3. Read [Development Guide](./development-guide.md) "First-time setup" to get the bot running locally
+4. For new feature work, start from [Component Inventory](./component-inventory.md) to find the right module, then read the linked source
+
+## Conventions
+
+- All player-facing bot strings use in-world voice — no utility terms like "session", "user", "ephemeral" (see `feedback-in-world-voice` memory).
+- All env access goes through `import { config }` from `src/config.ts`.
+- Tool plugins self-register via `registerTool()` at module load.
+- Shared types live only in `src/types/index.ts`.
+- Discord embeds are pure builders — no I/O.
--- a/docs/project-overview.md
+++ b/docs/project-overview.md
@@ -0,0 +1,106 @@
+# Mardonar Encounter Engine — Project Overview
+
+> Discord-native, LLM-driven D&D encounter engine. Generated 2026-06-19 from a deep scan.
+
+## What it is
+
+A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (Gemma 4 IT e2b through LiteLLM with Ollama fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.
+
+## Who it serves
+
+Discord community members playing D&D 5e in the Land of Mardonar. The DM runs `/encounter start <spec>` to begin; players post their actions in the resulting thread. NPC personas are loaded from specs and grounded in long-term graph memory so that recurring NPCs remember prior interactions across encounters.
+
+## Tech stack at a glance
+
+| Layer | Technology |
+|---|---|
+| Runtime | Node.js 22 (ESM, TypeScript 5.8 strict) |
+| Discord | discord.js v14 |
+| LLM (primary) | LiteLLM proxy (env: `LITELLM_BASE_URL`) |
+| LLM (fallback) | Ollama (env: `OLLAMA_BASE_URL`) — `gemma4-it:e2b`, 128k context |
+| Session cache | Redis (ioredis), 12h TTL |
+| Graph DB | Neo4j (via GraphMCP JSON-RPC, not direct) |
+| Lore / NPC memory | GraphMCP HTTP JSON-RPC server |
+| Foundry VTT | External relay (optional, requires API key) |
+| Validation | Zod (env + encounter spec) |
+| Logging | pino + pino-pretty |
+| Testing | Vitest 3 (unit + integration) |
+| Build | tsc → multi-stage Node 22 alpine Dockerfile |
+
+## Architecture type
+
+**Layered backend with a plugin-style tool registry.**
+
+```
+Discord ──▶ src/bot/ (commands, embeds, handlers)
+                │
+                ▼
+         src/harness/ (promptBuilder, contextAssembler,
+                      llmClient, toolParser, toolDispatcher,
+                      tools/* plugin registry)
+                │
+   ┌────────────┼────────────┐
+   ▼            ▼            ▼
+Redis        GraphMCP      VTT relay
+(session     (JSON-RPC:    (Foundry
+ state)      NPC memory,   live stats,
+             lore, log)    XP grants)
+```
+
+## Repository structure
+
+**Single-part monolith.** All source under `src/`. The bot is one Node.js process that talks to external services over the network.
+
+```
+src/
+├── bot/           # Discord I/O (commands, embeds, event handlers)
+├── harness/       # LLM orchestration + 6 tool plugins
+├── session/       # Redis-backed registries + session state
+├── graphmcp/      # JSON-RPC client + Redis stream ingest
+├── vtt/           # Foundry VTT relay client + spin-up
+├── db/            # ioredis singleton
+├── spec/          # YAML encounter loader + Zod schema
+├── persona/       # persona.yaml loader
+├── config.ts      # Zod env validation
+├── lib/           # logger
+├── scripts/       # deploy-commands (slash command registration)
+└── types/         # shared interfaces + CONTEXT_BUDGET
+```
+
+Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).
+
+## Documentation
+
+- [Architecture](./architecture.md) — full system design
+- [Source Tree Analysis](./source-tree-analysis.md) — annotated directory tree
+- [Component Inventory](./component-inventory.md) — reusable components
+- [Development Guide](./development-guide.md) — setup, run, test, troubleshoot
+- [Deployment Guide](./deployment-guide.md) — production deploy + ops
+- [API Contracts](./api-contracts.md) — Discord commands + GraphMCP JSON-RPC
+- [Data Models](./data-models.md) — session state, encounter spec, Neo4j graph
+
+## Key features in the current codebase
+
+- **Per-encounter tool filtering.** Each spec declares which tool plugins are active.
+- **Dynamic goal registration** (the active PRD feature) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
+- **Three-pattern tool parser** — handles fenced `tool_call`, bare `tool_call` header, and fuzzy bare JSON, so even smaller models can drive tools.
+- **Self-spinning VTT relay** — when the relay is down, the bot handshakes via RSA-OAEP and launches a headless Foundry session on demand.
+- **Burst cap with drop notices** — if too many messages arrive before the last LLM response, the bot drops the excess and posts a tone-aware notice.
+- **Reaction lifecycle (👀)** — visible "I'm working on it" feedback through queued → processing → complete states.
+- **NPC memory injection** at session start from GraphMCP, filtered by `GRAPHMCP_SCORE_THRESHOLD` and capped at `GRAPHMCP_NPC_MEMORY_LIMIT` chunks (default 5).
+- **In-world voice** for player-facing strings — no utility/jargon (see `feedback-in-world-voice`).
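+
+The three-pattern parsing idea can be sketched roughly as below. This is not the code in `toolParser.ts`: the regular expressions, helper names, and fallback order are assumptions. Only the `{ tool, args }` payload shape is taken from `docs/data-models.md`.
+
+```ts
+type ToolCall = { tool: string; args: Record<string, unknown> };
+
+// Build the triple-backtick fence at runtime so this listing stays fence-safe.
+const FENCE = "`".repeat(3);
+const FENCED_RE = new RegExp(FENCE + "tool_call\\s*([\\s\\S]*?)" + FENCE);
+const HEADER_RE = /^tool_call:?\s*(\{[\s\S]*\})/m;
+
+function isRecord(v: unknown): v is Record<string, unknown> {
+  return typeof v === "object" && v !== null && !Array.isArray(v);
+}
+
+// Accept only JSON objects that carry a string `tool` field.
+function tryParse(json: string): ToolCall | undefined {
+  try {
+    const v: unknown = JSON.parse(json);
+    if (isRecord(v) && typeof v["tool"] === "string") {
+      const args = v["args"];
+      return { tool: v["tool"], args: isRecord(args) ? args : {} };
+    }
+  } catch {
+    // Not valid JSON: fall through to the next pattern.
+  }
+  return undefined;
+}
+
+function extractToolCall(text: string): ToolCall | undefined {
+  // Pattern 1: a fenced block whose info string is tool_call.
+  const fenced = FENCED_RE.exec(text);
+  if (fenced?.[1] !== undefined) {
+    const call = tryParse(fenced[1]);
+    if (call) return call;
+  }
+  // Pattern 2: a bare "tool_call" header line followed by a JSON object.
+  const header = HEADER_RE.exec(text);
+  if (header?.[1] !== undefined) {
+    const call = tryParse(header[1]);
+    if (call) return call;
+  }
+  // Pattern 3: fuzzy match on the outermost {...} span in the response.
+  const start = text.indexOf("{");
+  const end = text.lastIndexOf("}");
+  return start !== -1 && end > start ? tryParse(text.slice(start, end + 1)) : undefined;
+}
+```
+
+Trying the strictest pattern first keeps well-behaved models on the fast path, while the looser fallbacks catch smaller models that forget the fence.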
+
+## Known drift and open issues
+
+- `Docs/mardonar-encounter-engine.md` describes a Go bot with embedded MCP — superseded by `docs/architecture.md` but still referenced by the README.
+- `README.md`'s project-structure tree is out of date (mentions `src/mcp/`, missing commands).
+- `src/types/index.ts` `EncounterSpec` diverged from `src/spec/loader.ts` Zod schema (missing `tone`, `tools`, `randomizable`, `nameKey`).
+- Duplicate `trimHistory` between `sessionManager.ts` and `contextAssembler.ts`.
+- No production `docker-compose.yml`, no deployment automation (CI is limited to the Gitea Actions test workflow in `.gitea/workflows/test.yml`), no HTTP health endpoint.
+- `DISCORD_ALLOWED_USERS` is empty by default, which allows all users, so access is restricted by channel only.
+
+See `docs/architecture.md §9` for full drift list.
+
+## When you're ready to plan new features
+
+Point the PRD workflow at [`docs/index.md`](./index.md) as input. For UI-facing work, `architecture.md §5.1` is the primary reference. For backend/LLM feature work, `architecture.md §5.2` and `docs/data-models.md` are the primary references.
--- a/docs/source-tree-analysis.md
+++ b/docs/source-tree-analysis.md
@@ -0,0 +1,190 @@
+# Source Tree Analysis
+
+> Annotated directory tree for the Mardonar Encounter Engine. Generated 2026-06-19.
+
+## Top level
+
+```
+mardonar-npcs/
+├── src/                  # TypeScript source (compiled to dist/)
+├── specs/                # Encounter YAML files (loaded by /encounter start)
+├── tests/                # Vitest unit + integration suites
+├── Docs/                 # Pre-existing project documentation (encounter engine overview, build plan, epics, stories, UX designs)
+├── lore/                 # Game-world reference material
+├── data/                 # Runtime tally + per-encounter summaries (volume-mounted in prod)
+├── scripts/              # Top-level utility scripts (only deploy-commands.ts lives here; the rest are under src/scripts)
+├── docs/                 # Generated by bmad-document-project (this folder)
+├── node_modules/         # npm dependencies (gitignored)
+├── dist/                 # tsc output (gitignored)
+├── Dockerfile            # Multi-stage Node 22 alpine build
+├── docker-compose.dev.yml # Local Redis + Neo4j orchestration
+├── package.json
+├── tsconfig.json         # NodeNext ESM, strict, rootDir=src
+├── vitest.config.ts      # v8 coverage
+├── .env / .env.example   # Zod-validated env config
+├── persona.yaml          # @Zalram Cloudwalker persona
+├── prd.md                # Active PRD: Dynamic Goal Registration
+└── README.md
+```
+
+## src/ — TypeScript source
+
+### src/bot/ — Discord I/O layer
+
+| Path | Role |
+|---|---|
+| `index.ts` | Entry point. Wires the discord.js `Client`, registers slash commands, dispatches `interactionCreate` and `messageCreate` to handlers. |
+| `commands/` | One file per slash command. Each exports `data` (SlashCommandBuilder) and `execute(interaction, client)`. |
+| `commands/dndname.ts` | `/dndname set\|show\|clear` — character name registration. |
+| `commands/encounter.ts` | `/encounter start\|status\|end\|generate\|spec\|random\|stats\|audit` — encounter session lifecycle. |
+| `commands/character.ts` | `/character register\|show\|view\|admin` — character profile + Foundry link modals. |
+| `commands/roll.ts` | `/roll` — manual dice roll. |
+| `commands/actions.ts` | `/actions` — in-character action shortcuts. |
+| `commands/xp.ts` | `/xp award` — XP grant to a character. |
+| `commands/encounters.ts` | `/encounters` — search/list encounters via GraphMCP. Includes select menu + search modal interactions. |
+| `commands/turn.ts` | `/turn` — turn management. |
+| `embeds/` | Discord embed builders. Pure functions taking typed args. |
+| `embeds/playerGate.ts` | "Please register your character name" embed. |
+| `embeds/skillCheck.ts` | Suspense embed → dice embed with roll buttons. |
+| `embeds/resolution.ts` | Encounter complete embed. |
+| `embeds/encounterDiscovery.ts` | Encounter search result embeds. |
+| `embeds/loreAnswer.ts` | @mention lore response embed. |
+| `handlers/` | Event handlers and sidecar logic. The runtime heart of the bot. |
+| `handlers/messageRouter.ts` | Core encounter-thread message pipeline: gates, debounce, LLM call, tool dispatch. |
+| `handlers/mentionHandler.ts` | @Zalram persona replies (uses `persona/loader.ts`). |
+| `handlers/rollHandler.ts` | Button + modal submit skill-check roll resolution. |
+| `handlers/generationQueue.ts` | Debounce + LLM turn scheduling (500ms coalesce). |
+| `handlers/queueCap.ts` | Burst cap: drops messages if too many arrived before the last LLM response. |
+| `handlers/reactionManager.ts` | 👀 reaction lifecycle: scheduled → processing → complete. |
+| `handlers/responseFilter.ts` | Post-LLM response scrubbing (catches fabricated rolls, echoed system tags, empty responses). |
+| `lib/welcomeDM.ts` | Welcome DM utility. |
+
+### src/harness/ — LLM orchestration
+
+| Path | Role |
+|---|---|
+| `promptBuilder.ts` | System prompt assembly. XML-sectioned: narrator, tone, sportsmanship, NPCs, players, setting, resolved context, skill checks, hidden goals, tool contract. |
+| `contextAssembler.ts` | Builds the LLM message list: system + pinned history + trimmed sliding history. |
+| `llmClient.ts` | Entry point. Routes to LiteLLM (primary) with Ollama fallback. |
+| `litellmClient.ts` | OpenAI-compatible HTTP client for LiteLLM proxy. |
+| `ollamaClient.ts` | Native `ollama` npm package + direct HTTP fallback path. |
+| `toolParser.ts` | Extracts `tool_call` blocks from LLM response. Three fallback patterns. |
+| `toolRegistry.ts` | Plugin registry. `getActiveTools(spec.tools)` returns per-encounter active set. |
+| `toolDispatcher.ts` | Validates tool name against active set, dispatches to plugin handler, logs result. |
+| `tools/` | Tool plugin implementations. Each module calls `registerTool()` at load. |
+| `tools/index.ts` | Side-effect imports — add new tool files here. |
+| `tools/skillCheckEmit.ts` | Posts dice-roll embed; blocks input until resolved. Pulls player modifier from Foundry. |
+| `tools/encounterResolve.ts` | Marks encounter complete, writes summary, archives thread. |
+| `tools/contextRecall.ts` | Retrieves canonical session facts from `resolvedContext`. |
+| `tools/goalRegister.ts` | Adds new goals mid-encounter (per `prd.md`). |
+| `tools/foundryLookup.ts` | Live character data from VTT relay. |
+| `tools/foundryReward.ts` | XP / item grant to a character via VTT. |
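+
+To make the `toolParser.ts` row more concrete, here is a minimal sketch of extracting a fenced `tool_call` block from an LLM response and separating it from the narrative. It covers only the fenced pattern (the table mentions three fallback patterns, which are not reproduced here), and the `parseToolCall` name, the `ParsedResponse` shape, and the handling of malformed JSON are assumptions.
+
+```ts
+interface ToolCall {
+  tool: string;
+  args: Record<string, unknown>;
+}
+
+interface ParsedResponse {
+  narrative: string;
+  toolCall?: ToolCall;
+}
+
+// Build the fence at runtime so this example does not contain a literal fence.
+const FENCE = '`'.repeat(3);
+const TOOL_CALL_RE = new RegExp(`${FENCE}tool_call\\s*\\n([\\s\\S]*?)\\n?${FENCE}`);
+
+/** Hypothetical parser: split narrative prose from one fenced tool_call block. */
+function parseToolCall(raw: string): ParsedResponse {
+  const match = TOOL_CALL_RE.exec(raw);
+  if (!match) return { narrative: raw.trim() };
+
+  // Remove the fenced block from the text the players will see.
+  const narrative = (raw.slice(0, match.index) + raw.slice(match.index + match[0].length)).trim();
+
+  try {
+    const parsed = JSON.parse(match[1]) as Partial<ToolCall>;
+    if (typeof parsed.tool !== 'string') return { narrative };
+    return { narrative, toolCall: { tool: parsed.tool, args: parsed.args ?? {} } };
+  } catch {
+    // Malformed JSON: keep the prose and drop the call (an assumed policy).
+    return { narrative };
+  }
+}
+```
+
+The dispatcher would then check `toolCall.tool` against the active set returned by `getActiveTools(spec.tools)` before invoking a plugin.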
+
+### src/session/ — Redis-backed state
+
+| Path | Role |
+|---|---|
+| `playerRegistry.ts` | `(guildId, discordId) → Player` (DnD name). |
+| `characterRegistry.ts` | Character profile: DnD name, pronouns, characterClass, race, level, backstory, Foundry actor UUID. |
+| `sessionManager.ts` | `threadId → SessionState`. Pinned + sliding history trim by token budget. |
+| `encounterLog.ts` | Filesystem tally + summary writer (one .txt per encounter in `data/summaries/`). |
+| `xpAwarder.ts` | XP grant via VTT relay. |
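+
+The "pinned + sliding history trim by token budget" behavior noted for `sessionManager.ts` can be pictured with the sketch below. It is only an illustration under stated assumptions: the `pinned` flag, the `trimHistory` name, and the four-characters-per-token estimate are hypothetical, and the real module may count tokens and mark pinned messages differently.
+
+```ts
+interface Message {
+  role: 'system' | 'user' | 'assistant';
+  content: string;
+  timestamp: number;
+  pinned?: boolean; // hypothetical marker for messages that must never be trimmed
+}
+
+// Rough heuristic (~4 characters per token); an assumption, not a real tokenizer.
+const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);
+
+/** Keep all pinned messages, then as many of the newest others as the budget allows. */
+function trimHistory(history: Message[], budget: number): Message[] {
+  const pinned = history.filter((m) => m.pinned);
+  let remaining = budget - pinned.reduce((sum, m) => sum + estimateTokens(m), 0);
+  const keep = new Set<Message>(pinned);
+
+  // Walk newest-to-oldest so the most recent unpinned messages win.
+  for (let i = history.length - 1; i >= 0; i--) {
+    const m = history[i];
+    if (m.pinned) continue;
+    const cost = estimateTokens(m);
+    if (cost > remaining) break; // the sliding window ends at the first message that does not fit
+    keep.add(m);
+    remaining -= cost;
+  }
+
+  // Filter the original array so chronological order is preserved.
+  return history.filter((m) => keep.has(m));
+}
+```
+
+Stopping at the first message that does not fit keeps the window contiguous, so the model never sees a recent message without the ones that follow it.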
+
+### src/graphmcp/ — GraphMCP JSON-RPC client
+
+| Path | Role |
+|---|---|
+| `client.ts` | 6 RPC calls + NPC memory formatter. |
+| `ingest.ts` | Publishes encounter messages to Redis stream `raw.messages`. |
+| `loreResolver.ts` | `/encounter generate` helper. |
+| `vocabularyResolver.ts` | Resolves spec `randomizable:` entries (vocabulary or graphmcp source). |
+
+### src/vtt/ — Foundry VTT integration
+
+| Path | Role |
+|---|---|
+| `foundryClient.ts` | HTTP client. Live actor data + formatters. |
+| `relaySession.ts` | RSA-OAEP encrypted handshake + headless Foundry session spin-up when relay is down. |
+
+### Other src/ modules
+
+| Path | Role |
+|---|---|
+| `db/redis.ts` | ioredis singleton (`lazyConnect`, `maxRetriesPerRequest: 3`). |
+| `spec/loader.ts` | YAML loader + Zod schema (`EncounterSpecSchema`). |
+| `persona/loader.ts` | persona.yaml loader for @mention. |
+| `lib/logger.ts` | pino wrapper. |
+| `config.ts` | Zod env schema + parsed config singleton. |
+| `scripts/deploy-commands.ts` | Slash command registration via Discord REST v10. |
+| `types/index.ts` | Shared interfaces + `CONTEXT_BUDGET` constant. |
+
+## specs/ — Encounter YAML files
+
+Loaded by `/encounter start <spec-name>`. `specs/SPEC_FORMAT.md` documents the schema. Current set:
+
+- `market-thief.yaml` — the original "low-stakes warm-up" example used in the README
+- `cog-claw-debt.yaml`
+- `mawfang-pursuit.yaml`
+- `silt-leak.yaml`
+- `stormscar-pilgrim.yaml`
+- `velvet-auction.yaml`
+- `whispering-stone.yaml`
+
+Each spec declares: `encounterId`, `title`, `tone`, `setting`, `openingNarrative`, `npcs[]` (with optional `nameKey` and `memoryKey`), `goals` (primary + secondary), `sportsmanshipRules`, `skillChecks` (grouped by suffix `_dc/_skill/_note`), `randomizable[]` (vocabulary or graphmcp queries with fallbacks), `tools[]`, and optional `dmNotes`.
+
+## tests/ — Vitest suites
+
+```
+tests/
+├── fixtures/spec.ts                  # shared spec fixture
+├── unit/                             # 23 unit test files (no external services)
+│   ├── promptBuilder.test.ts
+│   ├── contextAssembler.test.ts
+│   ├── toolParser.test.ts
+│   ├── toolDispatcher.test.ts
+│   ├── sessionManager.test.ts
+│   ├── playerRegistry.test.ts
+│   ├── characterRegistry.test.ts
+│   ├── specLoader.test.ts
+│   ├── rollHandler.test.ts
+│   ├── rollDetection.test.ts
+│   ├── responseFilter.test.ts
+│   ├── queueCap.test.ts
+│   ├── generationQueue.test.ts
+│   ├── reactionManager.test.ts
+│   ├── encounterLog.test.ts
+│   ├── encounterDiscoveryEmbed.test.ts
+│   ├── loreAnswerEmbed.test.ts
+│   ├── skillCheckEmbed.test.ts
+│   ├── graphmcpClient.test.ts
+│   ├── foundryClientRetry.test.ts
+│   ├── foundryClientFormatters.test.ts
+│   ├── goalRegister.test.ts
+│   └── relaySession.test.ts
+└── integration/
+    └── phase1.test.ts                # requires running Docker services
+```
+
+## Docs/ — Pre-existing project documentation (historical)
+
+| Path | Role |
+|---|---|
+| `mardonar-encounter-engine.md` | **Out of date** — describes a Go bot with embedded MCP layer. Treat as historical. The current `docs/architecture.md` supersedes it. |
+| `mardonar-build-plan.md` | Phased build plan with packages and test guidance. |
+| `epics.md` | Epic list. |
+| `stories/` | Story specs (1.1, 1.2, 2.1, 3.1, 4.1). |
+| `ux-designs/ux-mardonar-2026-05-30/` | UX session artifacts: `EXPERIENCE.md`, `DESIGN.md`, `.decision-log.md`. |
+
+## Critical entry points
+
+| What you want to change | Start here |
+|---|---|
+| Add a slash command | `src/bot/commands/`, then `src/scripts/deploy-commands.ts`, then run `npm run deploy-commands` |
+| Add a tool the LLM can call | `src/harness/tools/<name>.ts`, register in `src/harness/tools/index.ts` |
+| Change system prompt structure | `src/harness/promptBuilder.ts` |
+| Change context window budget | `src/types/index.ts` → `CONTEXT_BUDGET` |
+| Add an encounter | `specs/<name>.yaml` (see `specs/SPEC_FORMAT.md`) |
+| Change message pipeline | `src/bot/handlers/messageRouter.ts` |
+| Change LLM client | `src/harness/llmClient.ts` (router), `litellmClient.ts` / `ollamaClient.ts` (implementations) |
+| Add a Foundry VTT feature | `src/vtt/`, then add a tool in `src/harness/tools/` |
+| Add a GraphMCP-backed feature | `src/graphmcp/client.ts`, then add a tool |
--- a/package-lock.json
+++ b/package-lock.json
--- a/package.json
+++ b/package.json
@@ -11,7 +11,9 @@
    "deploy-commands": "tsx src/scripts/deploy-commands.ts",
    "test": "vitest run",
    "test:unit": "vitest run tests/unit",
-    "test:int": "vitest run tests/integration"
+    "test:int": "vitest run tests/integration",
+    "test:coverage": "vitest run --coverage",
+    "test:watch": "vitest"
  },
  "dependencies": {
    "@discordjs/builders": "^1.10.0",
@@ -30,6 +32,7 @@
  "devDependencies": {
    "@types/js-yaml": "^4.0.9",
    "@types/node": "^22.0.0",
+    "@vitest/coverage-v8": "^3.2.6",
    "ioredis-mock": "^8.9.0",
    "tsx": "^4.19.0",
    "typescript": "^5.8.0",
--- a/tests/README.md
+++ b/tests/README.md
@@ -0,0 +1,130 @@
+# Tests
+
+This directory holds the project's automated test suite.
+
+## Layout
+
+```
+tests/
+├── fixtures/        Shared test fixtures (spec, session, etc.)
+├── integration/     Integration tests (require live infrastructure)
+├── unit/            Unit tests (default CI gate)
+└── README.md        You are here
+```
+
+- **`unit/`** — fast, isolated tests for individual modules. No network, no
+  Redis, no Discord gateway. The CI default runs only this directory.
+- **`integration/`** — slower tests that exercise real services (or mocks
+  close to the wire). Run explicitly; not part of the default test command.
+- **`fixtures/`** — reusable mocks (`mockSession`, `mockSpec`) shared by
+  multiple unit tests.
+
+## Running
+
+```bash
+npm test              # alias for `npm run test:unit` + runs once (not watch)
+npm run test:unit     # run all tests in tests/unit
+npm run test:int      # run all tests in tests/integration
+npm run test:coverage # run unit tests with v8 coverage report
+npm run test:watch    # vitest in watch mode
+```
+
+## Conventions
+
+### 1. One module per file
+
+A test file covers one source module. File name: `<moduleName>.test.ts`,
+placed under `tests/unit/`. If a source module exports multiple functions
+worth testing, group them with `describe` blocks in the same file.
+
+### 2. Mock before import — always
+
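+Declare `vi.mock` calls *before* the import of the module under test. Vitest
+hoists `vi.mock` above static imports, so the mock is in place when the
+module loads, and keeping that order in the source makes it obvious which
+dependencies are replaced. The pattern: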
+
+```ts
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+const { mockFn } = vi.hoisted(() => ({ mockFn: vi.fn() }));
+
+vi.mock('../../src/lib/logger.js', () => ({
+  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
+}));
+
+// ...more mocks...
+
+import { myFunction } from '../../src/some/module.js'; // AFTER mocks
+```
+
+`vi.hoisted` lets you share mock state between a `vi.mock` factory and the
+test body: its callback is hoisted along with `vi.mock`, so the values it
+returns already exist when the mock factories run.
+
+### 3. `vi.clearAllMocks()` in `beforeEach`
+
+Prevents test bleed-through. If you also mutate config or module-level
+state, reset it explicitly in `beforeEach`.
+
+### 4. Reuse `mockSession` and `mockSpec`
+
+Import from `../fixtures/spec.js`. Don't redefine session shape per file —
+schema drift is one of the easier ways for tests to silently rot.
+
+### 5. Test the *behavior*, not the implementation
+
+Assert outcomes (return values, side effects on real collaborators, error
+messages) rather than calling patterns. If a test would pass only with one
+specific internal implementation, check it against the contract documented
+in the source's doc comment and assert that instead.
+
+### 6. Don't hit the network
+
+- `fetch` → use `vi.stubGlobal('fetch', ...)` (see `foundryClientRetry.test.ts`).
+- Discord client → pass a hand-rolled mock with only the methods the code uses
+  (e.g. `messages.fetch`, `send`, `sendTyping`, `setArchived`).
+- Redis → use `ioredis-mock` (see `sessionManager.test.ts`).
+- LLM SDKs → mock the constructor (see `litellmClient.test.ts`,
+  `ollamaClient.test.ts`).
+- Filesystem → use `mkdtempSync` (from `node:fs`) under `tmpdir()` (from
+  `node:os`); see `personaLoader.test.ts`.
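+
+For the filesystem case in the list above, a minimal sketch of an isolated temp-directory helper is shown here. The `withTempDir` name, the `persona-test-` prefix, and the file contents are illustrative assumptions; `personaLoader.test.ts` may structure its setup and cleanup differently.
+
+```ts
+import { mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+
+/** Create a throwaway directory, run `fn` against it, and always clean up. */
+function withTempDir<T>(fn: (dir: string) => T): T {
+  const dir = mkdtempSync(join(tmpdir(), 'persona-test-')); // prefix is arbitrary
+  try {
+    return fn(dir);
+  } finally {
+    // Remove the directory even if the callback throws.
+    rmSync(dir, { recursive: true, force: true });
+  }
+}
+
+// Example: write a fixture file and read it back, never touching the repo tree.
+const contents = withTempDir((dir) => {
+  const file = join(dir, 'persona.yaml');
+  writeFileSync(file, 'name: Zalram\n');
+  return readFileSync(file, 'utf8');
+});
+```
+
+In a test file the same create-and-remove steps usually live in `beforeEach` / `afterEach` so each case gets a fresh directory.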
+
+### 7. Player-facing strings
+
+When a test asserts on a string the bot would say to a player, prefer
+in-world language over utility terms. (Same rule that applies to production
+code — see `feedback-in-world-voice` memory.)
+
+## Anti-patterns
+
+- **Asserting private state.** Reach for behavior-level assertions first.
+- **Resetting state with `vi.resetModules()` for the sake of it.** It breaks
+  shared mock state. Use it only when a module-scoped cache (e.g. a lazy
+  client) needs to be re-constructed.
+- **Catching all errors in a test.** If a test passes by accident because an
+  unhandled rejection was swallowed, it's not testing anything.
+- **Mocking the module under test.** If you have to mock the file you're
+  testing, the test is asserting nothing.
+- **Real-time waits in `it()` callbacks.** Use `vi.useFakeTimers()` and
+  `vi.advanceTimersByTimeAsync` to step time deterministically (see
+  `messageRouterRunLLMTurn.test.ts` for the typing-indicator pattern).
+
+## Adding a new test
+
+1. Create `tests/unit/<name>.test.ts`.
+2. Use the closest existing test as a template — `goalRegister.test.ts` for
+   tool plugins, `foundryClientRetry.test.ts` for HTTP, `relaySession.test.ts`
+   for `node:https` / `node:crypto`, `sessionManager.test.ts` for Redis.
+3. Run `npm run test:unit -- <your-file>` to iterate quickly.
+4. When green, run the full suite: `npm run test:unit`.
+5. Optional: check `npm run test:coverage` to confirm the file's coverage.
+
+## Coverage
+
+`npm run test:coverage` produces a v8 coverage report in the terminal.
+Directories worth watching:
+
+- `src/bot/handlers/` — message routing; `runLLMTurn` is the runtime heart.
+- `src/harness/tools/` — the tool plugin contracts.
+- `src/vtt/` — Foundry relay; `foundryClient` is the biggest single file.
+
+Coverage is informational, not a gate. The goal is to grow the unit test
+surface for the modules that own irreversible or user-facing behavior.
--- a/tests/unit/foundryReward.test.ts
+++ b/tests/unit/foundryReward.test.ts
@@ -0,0 +1,205 @@
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+// ── registry mocks ───────────────────────────────────────────────────────────
+const { mockGet: mockCharacterGet } = vi.hoisted(() => ({
+  mockGet: vi.fn(),
+}));
+
+vi.mock('../../src/session/characterRegistry.js', () => ({
+  characterRegistry: { get: mockCharacterGet },
+}));
+
+const { mockModifyExperience, mockGiveItem } = vi.hoisted(() => ({
+  mockModifyExperience: vi.fn(),
+  mockGiveItem: vi.fn(),
+}));
+
+vi.mock('../../src/vtt/foundryClient.js', () => ({
+  modifyExperience: mockModifyExperience,
+  giveItem: mockGiveItem,
+}));
+
+import { dispatchTool } from '../../src/harness/toolDispatcher.js';
+import { mockSession } from '../fixtures/spec.js';
+
+function makeThread() {
+  return { send: vi.fn().mockResolvedValue({ id: 'msg-1' }) };
+}
+
+const playerSession = {
+  ...mockSession,
+  players: {
+    'user-1': { discordId: 'user-1', dndName: 'Aelindra' },
+  },
+};
+
+beforeEach(() => {
+  vi.clearAllMocks();
+  mockModifyExperience.mockResolvedValue(undefined);
+  mockGiveItem.mockResolvedValue(undefined);
+});
+
+describe('dispatchTool — foundry_reward', () => {
+  it('awards both XP and item to a registered Foundry-linked player', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1',
+      dndName: 'Aelindra',
+      source: 'foundry',
+      foundryActorUuid: 'Actor.abc',
+    });
+
+    const result = await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: {
+          player_discord_name: 'Aelindra',
+          xp_amount: 50,
+          item_name: 'Potion of Healing',
+          reason: 'Caught the thief.',
+        },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(result.systemMessage).toContain('[FOUNDRY REWARD]');
+    expect(result.systemMessage).toContain('Aelindra');
+    expect(result.systemMessage).toContain('Potion of Healing');
+    expect(result.systemMessage).toContain('50 XP');
+    expect(result.systemMessage).toContain('Caught the thief.');
+    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 50);
+    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Potion of Healing');
+  });
+
+  it('matches player name case-insensitively', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1',
+      dndName: 'Aelindra',
+      source: 'foundry',
+      foundryActorUuid: 'Actor.abc',
+    });
+
+    await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'aelindra', xp_amount: 10, reason: 'good roleplay' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 10);
+  });
+
+  it('awards only XP when item_name is omitted', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
+    });
+
+    await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', xp_amount: 25, reason: 'milestone' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 25);
+    expect(mockGiveItem).not.toHaveBeenCalled();
+  });
+
+  it('awards only an item when xp_amount is zero', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
+    });
+
+    await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', xp_amount: 0, item_name: 'Gold Piece', reason: 'tip' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Gold Piece');
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('skips XP when xp_amount is missing', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
+    });
+
+    await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', item_name: 'Ring', reason: 'find' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Ring');
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('returns a "no player" system message and does not call Foundry when the player is not in the session', async () => {
+    const result = await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Nobody', xp_amount: 5, reason: 'typo' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(result.systemMessage).toContain('No player found matching "Nobody"');
+    expect(mockCharacterGet).not.toHaveBeenCalled();
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+    expect(mockGiveItem).not.toHaveBeenCalled();
+  });
+
+  it('returns a "no character record" message when the player has no Foundry UUID', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1', dndName: 'Aelindra', source: 'custom', /* no foundryActorUuid */
+    });
+
+    const result = await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(result.systemMessage).toContain('No character record found for this player');
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+    expect(mockGiveItem).not.toHaveBeenCalled();
+  });
+
+  it('returns a "no character record" message when the player has no profile at all', async () => {
+    mockCharacterGet.mockResolvedValue(null);
+
+    const result = await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(result.systemMessage).toContain('No character record found for this player');
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('catches errors from characterRegistry and returns the friendly error', async () => {
+    mockCharacterGet.mockRejectedValue(new Error('redis down'));
+
+    const result = await dispatchTool(
+      {
+        tool: 'foundry_reward',
+        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
+      },
+      { session: playerSession, thread: makeThread() as any },
+    );
+
+    expect(result.systemMessage).toContain('Character records are inaccessible');
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+});
--- a/tests/unit/litellmClient.test.ts
+++ b/tests/unit/litellmClient.test.ts
@@ -0,0 +1,162 @@
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+// ── config mock ──────────────────────────────────────────────────────────────
+vi.mock('../../src/config.js', () => ({
+  config: {
+    LITELLM_BASE_URL: 'http://100.83.8.74:4000',
+    LITELLM_API_KEY: 'test-key',
+    LITELLM_MODEL: 'ollama-cloud',
+    OLLAMA_TEMPERATURE: 0.75,
+    OLLAMA_TIMEOUT_MS: 120_000,
+    OLLAMA_MODEL: 'gemma4-it:e2b',
+  },
+}));
+
+vi.mock('../../src/lib/logger.js', () => ({
+  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
+}));
+
+// ── openai client mock ────────────────────────────────────────────────────────
+const { mockCreate } = vi.hoisted(() => ({
+  mockCreate: vi.fn(),
+}));
+
+vi.mock('openai', () => ({
+  default: vi.fn().mockImplementation(() => ({
+    chat: { completions: { create: mockCreate } },
+  })),
+}));
+
+import { callLLM } from '../../src/harness/litellmClient.js';
+
+beforeEach(() => {
+  vi.clearAllMocks();
+  // Reset the config fields that individual tests mutate.
+  return import('../../src/config.js').then(({ config }) => {
+    (config as Record<string, unknown>).LITELLM_MODEL = 'ollama-cloud';
+    (config as Record<string, unknown>).LITELLM_API_KEY = 'test-key';
+  });
+});
+
+describe('litellmClient.callLLM', () => {
+  it('returns parsed narrative and tool call from the OpenAI-compatible response', async () => {
+    mockCreate.mockResolvedValueOnce({
+      choices: [
+        {
+          message: {
+            content: 'Roll for initiative. ```tool_call\n{"tool":"encounter_resolve","args":{"sessionId":"s1","outcomeId":"catch","summary":"Caught him"}}\n```',
+          },
+        },
+      ],
+      usage: { completion_tokens: 88, prompt_tokens: 4000 },
+    });
+
+    const result = await callLLM([{ role: 'user', content: 'I tackle him.', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('Roll for initiative.');
+    expect(result.toolCall?.tool).toBe('encounter_resolve');
+    expect(result.toolCall?.args).toEqual({ sessionId: 's1', outcomeId: 'catch', summary: 'Caught him' });
+    expect(result.rawTokensUsed).toBe(88);
+  });
+
+  it('configures the OpenAI client with the LiteLLM base URL + API key + timeout', async () => {
+    // Force a fresh litellmClient so its cached _client is re-constructed with
+    // the current config values.
+    vi.resetModules();
+    const OpenAI = (await import('openai')).default;
+    const { callLLM: freshCallLLM } = await import('../../src/harness/litellmClient.js');
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+
+    await freshCallLLM([{ role: 'user', content: 'hi', timestamp: 1 }]);
+
+    expect(OpenAI).toHaveBeenCalledWith({
+      baseURL: 'http://100.83.8.74:4000/v1',
+      apiKey: 'test-key',
+      timeout: 120_000,
+    });
+  });
+
+  it('falls back to the literal string "no-key" when LITELLM_API_KEY is empty', async () => {
+    const { config } = await import('../../src/config.js');
+    (config as Record<string, unknown>).LITELLM_API_KEY = '';
+    vi.resetModules();
+    const OpenAI = (await import('openai')).default;
+    const { callLLM: freshCallLLM } = await import('../../src/harness/litellmClient.js');
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+
+    await freshCallLLM([{ role: 'user', content: 'hi', timestamp: 1 }]);
+
+    expect(OpenAI).toHaveBeenCalledWith(
+      expect.objectContaining({ apiKey: 'no-key' }),
+    );
+  });
+
+  it('uses LITELLM_MODEL when set, otherwise falls back to OLLAMA_MODEL', async () => {
+    const { config } = await import('../../src/config.js');
+
+    (config as Record<string, unknown>).LITELLM_MODEL = 'big-model';
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+    await callLLM([{ role: 'user', content: 'a', timestamp: 1 }]);
+    expect(mockCreate).toHaveBeenLastCalledWith(
+      expect.objectContaining({ model: 'big-model' }),
+    );
+
+    (config as Record<string, unknown>).LITELLM_MODEL = undefined;
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+    await callLLM([{ role: 'user', content: 'b', timestamp: 2 }]);
+    expect(mockCreate).toHaveBeenLastCalledWith(
+      expect.objectContaining({ model: 'gemma4-it:e2b' }),
+    );
+  });
+
+  it('passes messages and temperature through to the OpenAI client', async () => {
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+
+    await callLLM([
+      { role: 'system', content: 'sys', timestamp: 0 },
+      { role: 'user', content: 'hi', timestamp: 1 },
+    ]);
+
+    expect(mockCreate).toHaveBeenCalledWith({
+      model: 'ollama-cloud',
+      messages: [
+        { role: 'system', content: 'sys' },
+        { role: 'user', content: 'hi' },
+      ],
+      temperature: 0.75,
+    });
+  });
+
+  it('returns an empty narrative when the model response is empty', async () => {
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: '' } }] });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('');
+    expect(result.toolCall).toBeUndefined();
+  });
+
+  it('falls back to an empty string when the response has no choices at all', async () => {
+    mockCreate.mockResolvedValueOnce({ choices: [] });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('');
+    expect(result.toolCall).toBeUndefined();
+  });
+
+  it('handles a missing usage field without crashing', async () => {
+    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.rawTokensUsed).toBeUndefined();
+  });
+
+  it('propagates errors from the OpenAI client', async () => {
+    mockCreate.mockRejectedValueOnce(new Error('rate limit exceeded'));
+
+    await expect(
+      callLLM([{ role: 'user', content: 'hi', timestamp: 1 }]),
+    ).rejects.toThrow('rate limit exceeded');
+  });
+});
--- a/tests/unit/messageRouterRunLLMTurn.test.ts
+++ b/tests/unit/messageRouterRunLLMTurn.test.ts
@@ -0,0 +1,409 @@
+import { vi, describe, it, expect, beforeEach, afterEach } from 'vitest';
+
+// ── assembled-context mock ───────────────────────────────────────────────────
+const { mockAssembleContext } = vi.hoisted(() => ({
+  mockAssembleContext: vi.fn(),
+}));
+
+vi.mock('../../src/harness/contextAssembler.js', () => ({
+  assembleContext: mockAssembleContext,
+}));
+
+// ── LLM client mock ──────────────────────────────────────────────────────────
+const { mockCallLLM } = vi.hoisted(() => ({
+  mockCallLLM: vi.fn(),
+}));
+
+vi.mock('../../src/harness/llmClient.js', () => ({
+  callLLM: mockCallLLM,
+}));
+
+// ── dispatchTool mock ────────────────────────────────────────────────────────
+const { mockDispatchTool } = vi.hoisted(() => ({
+  mockDispatchTool: vi.fn(),
+}));
+
+vi.mock('../../src/harness/toolDispatcher.js', () => ({
+  dispatchTool: mockDispatchTool,
+}));
+
+// ── sessionManager mock ──────────────────────────────────────────────────────
+const { mockAddMessage, mockUpdate, mockGet } = vi.hoisted(() => ({
+  mockAddMessage: vi.fn(),
+  mockUpdate: vi.fn(),
+  mockGet: vi.fn(),
+}));
+
+vi.mock('../../src/session/sessionManager.js', () => ({
+  sessionManager: {
+    addMessage: mockAddMessage,
+    update: mockUpdate,
+    get: mockGet,
+  },
+}));
+
+// ── responseFilter mock ──────────────────────────────────────────────────────
+const { mockFilterLLMResponse, mockDetectMissedSkillCheck, mockLogFiltered } = vi.hoisted(() => ({
+  mockFilterLLMResponse: vi.fn(),
+  mockDetectMissedSkillCheck: vi.fn(),
+  mockLogFiltered: vi.fn(),
+}));
+
+vi.mock('../../src/bot/handlers/responseFilter.js', () => ({
+  filterLLMResponse: mockFilterLLMResponse,
+  detectMissedSkillCheck: mockDetectMissedSkillCheck,
+  logFiltered: mockLogFiltered,
+}));
+
+// ── reaction / burst mocks ───────────────────────────────────────────────────
+vi.mock('../../src/bot/handlers/reactionManager.js', () => ({
+  registerScheduled: vi.fn(),
+  drainPending: vi.fn(() => []),
+  clearPending: vi.fn(),
+  upgradeToProcessing: vi.fn(),
+  upgradeToComplete: vi.fn(),
+  cleanupReactions: vi.fn(),
+}));
+
+vi.mock('../../src/bot/handlers/queueCap.js', () => ({
+  isBurstCapped: vi.fn(() => false),
+  incrementBurst: vi.fn(),
+  resetBurst: vi.fn(),
+  sendDropNotice: vi.fn(),
+}));
+
+vi.mock('../../src/lib/logger.js', () => ({
+  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
+}));
+
+// ── subject under test ───────────────────────────────────────────────────────
+// Import AFTER all vi.mock calls. We re-import per-test where needed so we can
+// attach vi.spyOn() to the module's scheduleEncounterLLMTurn export.
+let runLLMTurn: typeof import('../../src/bot/handlers/messageRouter.js').runLLMTurn;
+let scheduleSpy: ReturnType<typeof vi.spyOn>;
+import { mockSession } from '../fixtures/spec.js';
+import type { SessionState } from '../../src/types/index.js';
+
+function makeThread(extra: Partial<{ setArchived: any }> = {}) {
+  const thread: any = {
+    send: vi.fn().mockResolvedValue({ id: 'sent-msg' }),
+    sendTyping: vi.fn().mockResolvedValue(undefined),
+    setArchived: extra.setArchived ?? vi.fn().mockResolvedValue(undefined),
+    messages: { fetch: vi.fn().mockResolvedValue(null) },
+  };
+  return thread;
+}
+
+function sessionWith(history: SessionState['history'], pending?: SessionState['pendingSkillCheck']): SessionState {
+  return { ...mockSession, history, pendingSkillCheck: pending };
+}
+
+beforeEach(async () => {
+  vi.clearAllMocks();
+  vi.useFakeTimers();
+  // Re-import the module under test each time so we can spy on its
+  // scheduleEncounterLLMTurn export. The mocks are reused across imports.
+  const mod = await import('../../src/bot/handlers/messageRouter.js');
+  runLLMTurn = mod.runLLMTurn;
+  scheduleSpy = vi.spyOn(mod, 'scheduleEncounterLLMTurn').mockImplementation(() => undefined);
+
+  // Always default: context assembles to something, filter accepts everything.
+  mockAssembleContext.mockReturnValue([{ role: 'system', content: 'sys', timestamp: 0 }]);
+  mockFilterLLMResponse.mockReturnValue({ ok: true });
+  mockDetectMissedSkillCheck.mockReturnValue(false);
+  mockAddMessage.mockResolvedValue(undefined);
+  mockUpdate.mockResolvedValue(undefined);
+  mockGet.mockImplementation(async (threadId: string) => ({ ...mockSession, threadId }));
+  mockDispatchTool.mockResolvedValue({ systemMessage: '[TOOL] done', resolved: undefined, error: undefined });
+});
+
+afterEach(() => {
+  vi.useRealTimers();
+});
+
+describe('runLLMTurn — narrative-only response (no tool call)', () => {
+  it('posts the narrative to the thread', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'The wind howls.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(thread.send).toHaveBeenCalledWith('The wind howls.');
+  });
+
+  it('stores the assistant narrative in session history', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'A leaf falls.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockAddMessage).toHaveBeenCalledWith(
+      mockSession.threadId,
+      expect.objectContaining({ role: 'assistant', content: 'A leaf falls.' }),
+    );
+  });
+
+  it('does not call dispatchTool when there is no tool call', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'quiet.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockDispatchTool).not.toHaveBeenCalled();
+  });
+
+  it('passes skipRollClaim:true when a [SKILL CHECK RESULT] message is in the recent 6 messages', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 15 and hit the goblin.', toolCall: undefined });
+    const thread = makeThread();
+
+    const history: SessionState['history'] = [
+      { role: 'system', content: '[SKILL CHECK RESULT] Aelindra rolled 15 vs DC 12. Result: SUCCESS.', timestamp: 1 },
+    ];
+    await runLLMTurn(sessionWith(history), thread, {} as any);
+
+    expect(mockFilterLLMResponse).toHaveBeenCalledWith(
+      'You rolled a 15 and hit the goblin.',
+      { skipRollClaim: true },
+    );
+  });
+
+  it('passes skipRollClaim:false when no recent [SKILL CHECK RESULT] message exists', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: '...', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockFilterLLMResponse).toHaveBeenCalledWith('...', { skipRollClaim: false });
+  });
+});
+
+describe('runLLMTurn — filter correction', () => {
+  it('on filter rejection with no recent correction, sends a [FILTER CORRECTION] system message', async () => {
+    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 17.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockLogFiltered).toHaveBeenCalledWith(
+      'fabricated_roll_result',
+      'You rolled a 17.',
+      expect.objectContaining({ threadId: mockSession.threadId, encounterId: mockSession.encounterId }),
+    );
+    expect(mockAddMessage).toHaveBeenCalledWith(
+      mockSession.threadId,
+      expect.objectContaining({
+        role: 'system',
+        content: expect.stringMatching(/^\[FILTER CORRECTION\]/),
+      }),
+    );
+    // The retry path also invokes scheduleEncounterLLMTurn with immediate=true.
+    // (We can't reliably observe the internal call via the export spy in ESM
+    // live-bindings, so we verify the side effects directly.)
+    const correction = mockAddMessage.mock.calls.find(([_, m]) =>
+      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
+    )?.[1] as { content: string };
+    expect(correction.content).toMatch(/Do NOT state or imply a specific dice result/);
+  });
+
+  it('on filter rejection when last message is already a correction, skips the retry to avoid loops', async () => {
+    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'empty_response' });
+    mockCallLLM.mockResolvedValueOnce({ narrative: '', toolCall: undefined });
+    const thread = makeThread();
+
+    const history: SessionState['history'] = [
+      { role: 'system', content: '[FILTER CORRECTION] previous turn suppressed (empty_response).', timestamp: 1 },
+    ];
+
+    await runLLMTurn(sessionWith(history), thread, {} as any);
+
+    // No new correction message should be added when one was just sent.
+    const correctionAdds = mockAddMessage.mock.calls.filter(([_, m]) =>
+      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
+    );
+    expect(correctionAdds).toHaveLength(0);
+  });
+
+  it('uses the echoed_system_tag correction text when filter rejects for that reason', async () => {
+    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'echoed_system_tag' });
+    mockCallLLM.mockResolvedValueOnce({ narrative: '[TOOL] something', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    const correction = mockAddMessage.mock.calls.find(([_, m]) =>
+      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
+    )?.[1] as { content: string };
+    expect(correction.content).toMatch(/Do NOT echo internal system tags/);
+  });
+
+  it('does NOT post the filtered narrative to the thread', async () => {
+    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 17.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(thread.send).not.toHaveBeenCalledWith('You rolled a 17.');
+  });
+});
+
+describe('runLLMTurn — tool call dispatch', () => {
+  it('dispatches the toolCall with a freshly fetched session and writes the system message', async () => {
+    mockCallLLM.mockResolvedValueOnce({
+      narrative: '',
+      toolCall: { tool: 'goal_register', args: { goals: ['x'] } },
+    });
+    const freshSession = { ...mockSession, fetched: true };
+    mockGet.mockResolvedValueOnce(freshSession);
+    mockDispatchTool.mockResolvedValueOnce({ systemMessage: '[TOOL] ok', error: undefined, resolved: undefined });
+
+    const thread = makeThread();
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockGet).toHaveBeenCalledWith(mockSession.threadId);
+    expect(mockDispatchTool).toHaveBeenCalledWith(
+      { tool: 'goal_register', args: { goals: ['x'] } },
+      expect.objectContaining({ session: freshSession, thread }),
+    );
+    expect(mockAddMessage).toHaveBeenCalledWith(
+      mockSession.threadId,
+      expect.objectContaining({ role: 'system', content: '[TOOL] ok' }),
+    );
+  });
+
+  it('posts a friendly fallback message when dispatchTool returns an error', async () => {
+    mockCallLLM.mockResolvedValueOnce({
+      narrative: '',
+      toolCall: { tool: 'goal_register', args: {} },
+    });
+    mockDispatchTool.mockResolvedValueOnce({ systemMessage: '[TOOL] failed', error: new Error('boom'), resolved: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(thread.send).toHaveBeenCalledWith(expect.stringMatching(/narrator stumbles/));
+  });
+
+  it('marks the session resolved and schedules archive when tool reports resolved', async () => {
+    mockCallLLM.mockResolvedValueOnce({
+      narrative: '',
+      toolCall: { tool: 'encounter_resolve', args: { outcomeId: 'catch', summary: 'got him' } },
+    });
+    mockDispatchTool.mockResolvedValueOnce({
+      systemMessage: '[TOOL] resolved',
+      resolved: { outcomeId: 'catch', summary: 'got him' },
+      error: undefined,
+    });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockUpdate).toHaveBeenCalledWith(mockSession.threadId, {
+      phase: 'resolved',
+      outcome: 'catch',
+      outcomeSummary: 'got him',
+    });
+
+    // The archive setTimeout fires after 5 seconds.
+    expect(thread.setArchived).not.toHaveBeenCalled();
+    await vi.advanceTimersByTimeAsync(5_000);
+    expect(thread.setArchived).toHaveBeenCalledWith(true);
+  });
+
+  it('does not throw and returns early when the session was deleted before dispatch', async () => {
+    mockCallLLM.mockResolvedValueOnce({
+      narrative: '',
+      toolCall: { tool: 'goal_register', args: {} },
+    });
+    mockGet.mockResolvedValueOnce(null); // session disappeared
+    const thread = makeThread();
+
+    await expect(runLLMTurn(sessionWith([]), thread, {} as any)).resolves.toBeUndefined();
+    expect(mockDispatchTool).not.toHaveBeenCalled();
+  });
+
+  it('still dispatches the tool even when the narrative was filtered', async () => {
+    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
+    mockCallLLM.mockResolvedValueOnce({
+      narrative: 'You rolled a 12. ',
+      toolCall: { tool: 'goal_register', args: { foo: 'bar' } },
+    });
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockDispatchTool).toHaveBeenCalled();
+    // But the narrative was suppressed.
+    expect(thread.send).not.toHaveBeenCalledWith(expect.stringContaining('You rolled a 12.'));
+  });
+});
+
+describe('runLLMTurn — LLM error', () => {
+  it('posts a friendly error message when the LLM throws and clears the typing interval', async () => {
+    mockCallLLM.mockRejectedValueOnce(new Error('503 from upstream'));
+    const thread = makeThread();
+    const consoleSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(consoleSpy).toHaveBeenCalledWith('[messageRouter] LLM call failed:', expect.any(Error));
+    expect(thread.send).toHaveBeenCalledWith(expect.stringMatching(/narrator pauses/));
+    const typingCallsBefore = thread.sendTyping.mock.calls.length; // interval normally fires every 8s
+    await vi.advanceTimersByTimeAsync(20_000);
+    expect(thread.sendTyping.mock.calls.length).toBe(typingCallsBefore);
+    // No filter or dispatch should have happened.
+    expect(mockFilterLLMResponse).not.toHaveBeenCalled();
+    expect(mockDispatchTool).not.toHaveBeenCalled();
+
+    consoleSpy.mockRestore();
+  });
+});
+
+describe('runLLMTurn — missed skill check heuristic', () => {
+  it('runs the missed-skill-check heuristic when the narrative asks for a roll but no tool call was emitted and no roll is pending', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'Make a Strength check.', toolCall: undefined });
+    mockDetectMissedSkillCheck.mockReturnValueOnce(true);
+    const thread = makeThread();
+
+    await runLLMTurn(sessionWith([]), thread, {} as any);
+
+    expect(mockDetectMissedSkillCheck).toHaveBeenCalledWith('Make a Strength check.');
+  });
+
+  it('skips the heuristic when a roll result is already pending', async () => {
+    mockCallLLM.mockResolvedValueOnce({ narrative: 'Make a check.', toolCall: undefined });
+    const thread = makeThread();
+
+    await runLLMTurn(
+      sessionWith([], { player: 'Aelindra', dc: 12, messageId: 'm1' }),
+      thread,
+      {} as any,
+    );
+
+    expect(mockDetectMissedSkillCheck).not.toHaveBeenCalled();
+  });
+});
+
+describe('runLLMTurn — typing indicator', () => {
+  it('starts a typing indicator that fires every 8s while the LLM is being awaited', async () => {
+    // Make callLLM slow so we can observe the interval
+    let resolveCall!: (v: unknown) => void;
+    mockCallLLM.mockReturnValueOnce(new Promise(r => { resolveCall = r; }));
+    const thread = makeThread();
+
+    const pending = runLLMTurn(sessionWith([]), thread, {} as any);
+    expect(thread.sendTyping).toHaveBeenCalledTimes(1);
+
+    await vi.advanceTimersByTimeAsync(8_000);
+    expect(thread.sendTyping).toHaveBeenCalledTimes(2);
+
+    resolveCall({ narrative: 'ok', toolCall: undefined });
+    await pending;
+
+    // After resolution the interval is cleared; advancing further should not send typing again.
+    const callsBefore = thread.sendTyping.mock.calls.length;
+    await vi.advanceTimersByTimeAsync(20_000);
+    expect(thread.sendTyping.mock.calls.length).toBe(callsBefore);
+  });
+});
--- a/tests/unit/ollamaClient.test.ts
+++ b/tests/unit/ollamaClient.test.ts
@@ -0,0 +1,103 @@
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+// ── config mock (must come before module under test) ──────────────────────────
+vi.mock('../../src/config.js', () => ({
+  config: {
+    OLLAMA_BASE_URL: 'http://localhost:11434',
+    OLLAMA_MODEL: 'gemma4-it:e2b',
+    OLLAMA_TEMPERATURE: 0.75,
+    OLLAMA_NUM_CTX: 131072,
+  },
+}));
+
+vi.mock('../../src/lib/logger.js', () => ({
+  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
+}));
+
+// ── ollama npm client mock ────────────────────────────────────────────────────
+const { mockChat } = vi.hoisted(() => ({
+  mockChat: vi.fn(),
+}));
+
+vi.mock('ollama', () => ({
+  Ollama: vi.fn().mockImplementation(() => ({
+    chat: mockChat,
+  })),
+}));
+
+import { callLLM } from '../../src/harness/ollamaClient.js';
+
+beforeEach(() => {
+  vi.clearAllMocks();
+});
+
+describe('ollamaClient.callLLM', () => {
+  it('returns parsed narrative and tool call from the ollama response', async () => {
+    mockChat.mockResolvedValueOnce({
+      message: { content: 'The goblin snarls. ```tool_call\n{"tool":"skill_check_emit","args":{"player":"Aelindra","prompt":"Strike","dc":12}}\n```' },
+      eval_count: 42,
+    });
+
+    const result = await callLLM([{ role: 'user', content: 'I attack.', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('The goblin snarls.');
+    expect(result.toolCall?.tool).toBe('skill_check_emit');
+    expect(result.rawTokensUsed).toBe(42);
+  });
+
+  it('passes messages, model, stream:false, and options to the ollama client', async () => {
+    mockChat.mockResolvedValueOnce({ message: { content: 'ok' }, eval_count: 5 });
+
+    await callLLM([
+      { role: 'system', content: 'You are the DM.', timestamp: 0 },
+      { role: 'user', content: 'I look around.', timestamp: 1 },
+    ]);
+
+    expect(mockChat).toHaveBeenCalledWith({
+      model: 'gemma4-it:e2b',
+      messages: [
+        { role: 'system', content: 'You are the DM.' },
+        { role: 'user', content: 'I look around.' },
+      ],
+      stream: false,
+      options: { temperature: 0.75, num_ctx: 131072 },
+    });
+  });
+
+  it('returns just the narrative when there is no tool call block', async () => {
+    mockChat.mockResolvedValueOnce({ message: { content: 'A quiet moment.' }, eval_count: 7 });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('A quiet moment.');
+    expect(result.toolCall).toBeUndefined();
+    expect(result.rawTokensUsed).toBe(7);
+  });
+
+  it('propagates errors from the ollama client', async () => {
+    mockChat.mockRejectedValueOnce(new Error('connection refused'));
+
+    await expect(
+      callLLM([{ role: 'user', content: 'hi', timestamp: 1 }]),
+    ).rejects.toThrow('connection refused');
+  });
+
+  it('handles an empty message content without crashing', async () => {
+    mockChat.mockResolvedValueOnce({ message: { content: '' }, eval_count: 0 });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('');
+    expect(result.toolCall).toBeUndefined();
+    expect(result.rawTokensUsed).toBe(0);
+  });
+
+  it('handles a missing eval_count without crashing', async () => {
+    mockChat.mockResolvedValueOnce({ message: { content: 'ok' } });
+
+    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);
+
+    expect(result.narrative).toBe('ok');
+    expect(result.rawTokensUsed).toBeUndefined();
+  });
+});
--- a/tests/unit/personaLoader.test.ts
+++ b/tests/unit/personaLoader.test.ts
@@ -0,0 +1,120 @@
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import { mkdtempSync, writeFileSync, rmSync } from 'fs';
+import { tmpdir } from 'os';
+import { join } from 'path';
+
+import { loadPersona, clearPersonaCache } from '../../src/persona/loader.js';
+
+let tmpDir: string;
+
+beforeEach(() => {
+  clearPersonaCache();
+  tmpDir = mkdtempSync(join(tmpdir(), 'persona-test-'));
+});
+
+afterEach(() => {
+  rmSync(tmpDir, { recursive: true, force: true });
+});
+
+function writePersona(yaml: string): string {
+  const path = join(tmpDir, 'persona.yaml');
+  writeFileSync(path, yaml, 'utf8');
+  return path;
+}
+
+describe('loadPersona', () => {
+  it('loads a valid persona YAML file and parses it', () => {
+    const path = writePersona(`
+name: "Zalram Cloudwalker"
+description: "Aasimar Divination Wizard, level 8"
+persona: |
+  You are Zalram — bound to the digital realm.
+responseStyle: "Dry, formal, occasionally sardonic."
+`);
+
+    const persona = loadPersona(path);
+
+    expect(persona.name).toBe('Zalram Cloudwalker');
+    expect(persona.description).toBe('Aasimar Divination Wizard, level 8');
+    expect(persona.persona).toContain('You are Zalram');
+    expect(persona.responseStyle).toBe('Dry, formal, occasionally sardonic.');
+  });
+
+  it('caches the result — second call returns the same instance without re-reading the file', () => {
+    const path = writePersona(`
+name: "Test"
+description: "A test persona"
+persona: "Persona text"
+responseStyle: "Style text"
+`);
+
+    const first = loadPersona(path);
+    // Replace the file with something invalid. The cached result must still come back.
+    writeFileSync(path, 'this is not valid YAML: [', 'utf8');
+
+    const second = loadPersona(path);
+    expect(second).toBe(first);
+  });
+
+  it('clears the cache when clearPersonaCache is called', () => {
+    const path1 = writePersona(`
+name: "First"
+description: "d"
+persona: "p"
+responseStyle: "r"
+`);
+    const first = loadPersona(path1);
+
+    // Mutate the file to something different, then clear + reload.
+    writeFileSync(path1, `
+name: "Second"
+description: "d"
+persona: "p"
+responseStyle: "r"
+`, 'utf8');
+    clearPersonaCache();
+
+    const second = loadPersona(path1);
+    expect(second.name).toBe('Second');
+    expect(second).not.toBe(first);
+  });
+
+  it('loads from the explicitly passed path rather than the default ./persona.yaml', () => {
+    // Exercising the default path would require a real ./persona.yaml in the
+    // working directory, so this only verifies that an explicitly passed path
+    // is honoured.
+    const path = writePersona(`
+name: "DefaultTest"
+description: "d"
+persona: "p"
+responseStyle: "r"
+`);
+
+    const persona = loadPersona(path);
+    expect(persona.name).toBe('DefaultTest');
+  });
+
+  it('throws a Zod validation error when a required field is missing', () => {
+    const path = writePersona(`
+name: "Missing fields"
+# description, persona, responseStyle all absent
+`);
+
+    expect(() => loadPersona(path)).toThrow();
+  });
+
+  it('throws a Zod validation error when a field has the wrong type', () => {
+    const path = writePersona(`
+name: 123
+description: "d"
+persona: "p"
+responseStyle: "r"
+`);
+
+    expect(() => loadPersona(path)).toThrow();
+  });
+
+  it('throws when the file does not exist', () => {
+    expect(() => loadPersona(join(tmpDir, 'does-not-exist.yaml'))).toThrow();
+  });
+});
--- a/tests/unit/redisErrorPath.test.ts
+++ b/tests/unit/redisErrorPath.test.ts
@@ -0,0 +1,75 @@
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+// ── capture the registered error listener so we can fire it ──────────────────
+const { errorListeners } = vi.hoisted(() => ({
+  errorListeners: [] as Array<(err: Error) => void>,
+}));
+
+vi.mock('../../src/config.js', () => ({
+  config: { REDIS_URL: 'redis://localhost:6379' },
+}));
+
+vi.mock('ioredis', () => {
+  return {
+    Redis: vi.fn().mockImplementation(() => ({
+      on: vi.fn((event: string, listener: (err: Error) => void) => {
+        if (event === 'error') errorListeners.push(listener);
+        return undefined;
+      }),
+    })),
+  };
+});
+
+import { Redis } from 'ioredis';
+
+const consoleErrorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
+
+beforeEach(() => {
+  errorListeners.length = 0;
+  consoleErrorSpy.mockClear();
+  // Force a re-import of redis.ts to register a fresh error listener.
+  vi.resetModules();
+});
+
+describe('db/redis.ts error handler', () => {
+  it('registers an error listener on the Redis client at module load', async () => {
+    await import('../../src/db/redis.js');
+
+    expect(errorListeners).toHaveLength(1);
+  });
+
+  it('logs the error to console.error when the Redis client emits "error"', async () => {
+    await import('../../src/db/redis.js');
+
+    expect(errorListeners).toHaveLength(1);
+    errorListeners[0](new Error('ECONNREFUSED 127.0.0.1:6379'));
+
+    expect(consoleErrorSpy).toHaveBeenCalledTimes(1);
+    expect(consoleErrorSpy).toHaveBeenCalledWith(
+      '[redis] connection error',
+      expect.objectContaining({ message: 'ECONNREFUSED 127.0.0.1:6379' }),
+    );
+  });
+
+  it('does not throw or crash when the error has a non-standard shape', async () => {
+    await import('../../src/db/redis.js');
+
+    // Some ioredis errors come wrapped or with extra props. The handler just
+    // forwards to console.error; it must not throw.
+    expect(() => {
+      const err = Object.assign(new Error('boom'), { code: 'ECONNRESET', syscall: 'connect' });
+      errorListeners[0](err);
+    }).not.toThrow();
+
+    expect(consoleErrorSpy).toHaveBeenCalledWith('[redis] connection error', expect.anything());
+  });
+
+  it('constructs the Redis client with lazyConnect and maxRetriesPerRequest: 3', async () => {
+    await import('../../src/db/redis.js');
+
+    expect(Redis).toHaveBeenCalledWith('redis://localhost:6379', {
+      lazyConnect: true,
+      maxRetriesPerRequest: 3,
+    });
+  });
+});
--- a/tests/unit/xpAwarder.test.ts
+++ b/tests/unit/xpAwarder.test.ts
@@ -0,0 +1,162 @@
+import { vi, describe, it, expect, beforeEach } from 'vitest';
+
+const { mockGet: mockCharacterGet } = vi.hoisted(() => ({
+  mockGet: vi.fn(),
+}));
+
+vi.mock('../../src/session/characterRegistry.js', () => ({
+  characterRegistry: { get: mockCharacterGet },
+}));
+
+const { mockModifyExperience } = vi.hoisted(() => ({
+  mockModifyExperience: vi.fn(),
+}));
+
+vi.mock('../../src/vtt/foundryClient.js', () => ({
+  modifyExperience: mockModifyExperience,
+}));
+
+vi.mock('../../src/lib/logger.js', () => ({
+  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
+}));
+
+import { awardXP } from '../../src/session/xpAwarder.js';
+import { mockSession } from '../fixtures/spec.js';
+
+function makeThread() {
+  return { send: vi.fn().mockResolvedValue({ id: 'msg-1' }) };
+}
+
+const baseSession = {
+  ...mockSession,
+  players: {
+    'user-1': { discordId: 'user-1', dndName: 'Aelindra' },
+    'user-2': { discordId: 'user-2', dndName: 'Borgrim' },
+    'user-3': { discordId: 'user-3', dndName: 'Cael' },
+  },
+};
+
+beforeEach(() => {
+  vi.clearAllMocks();
+  mockModifyExperience.mockResolvedValue(undefined);
+});
+
+describe('awardXP', () => {
+  it('awards XP to every player with a Foundry link and returns the awarded list', async () => {
+    mockCharacterGet.mockImplementation(async (_g, discordId) => {
+      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
+      if (discordId === 'user-2') return { discordId, dndName: 'Borgrim', source: 'foundry', foundryActorUuid: 'Actor.2' };
+      return null;
+    });
+
+    const result = await awardXP(baseSession, 100, makeThread() as any);
+
+    expect(result.awarded).toEqual([
+      { dndName: 'Aelindra', amount: 100 },
+      { dndName: 'Borgrim', amount: 100 },
+    ]);
+    expect(result.skipped).toEqual([
+      { dndName: 'Cael', discordId: 'user-3', reason: 'no Foundry character linked' },
+    ]);
+    expect(mockModifyExperience).toHaveBeenCalledTimes(2);
+    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.1', 100);
+    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.2', 100);
+  });
+
+  it('posts a summary message listing awarded and skipped players', async () => {
+    mockCharacterGet.mockImplementation(async (_g, discordId) => {
+      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
+      return null;
+    });
+
+    const thread = makeThread();
+    await awardXP(baseSession, 50, thread as any);
+
+    expect(thread.send).toHaveBeenCalledTimes(1);
+    const message = thread.send.mock.calls[0][0] as string;
+    expect(message).toContain('+50 XP awarded');
+    expect(message).toContain('✅ Aelindra');
+    expect(message).toContain('⚠️');
+    expect(message).toContain('Borgrim');
+    expect(message).toContain('Cael');
+  });
+
+  it('returns empty results and posts no summary message when there are no players', async () => {
+    const session = { ...baseSession, players: {} };
+    const thread = makeThread();
+
+    const result = await awardXP(session, 100, thread as any);
+
+    expect(result).toEqual({ awarded: [], skipped: [] });
+    expect(thread.send).not.toHaveBeenCalled();
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('skips players whose profile has no foundryActorUuid (custom characters)', async () => {
+    mockCharacterGet.mockResolvedValue({
+      discordId: 'user-1', dndName: 'Aelindra', source: 'custom', /* no UUID */
+    });
+
+    const result = await awardXP(baseSession, 25, makeThread() as any);
+
+    expect(result.awarded).toEqual([]);
+    expect(result.skipped).toEqual([
+      { dndName: 'Aelindra', discordId: 'user-1', reason: 'no Foundry character linked' },
+      { dndName: 'Borgrim', discordId: 'user-2', reason: 'no Foundry character linked' },
+      { dndName: 'Cael', discordId: 'user-3', reason: 'no Foundry character linked' },
+    ]);
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('skips players with "registry error" reason when characterRegistry throws', async () => {
+    mockCharacterGet.mockRejectedValue(new Error('redis down'));
+
+    const result = await awardXP(baseSession, 50, makeThread() as any);
+
+    expect(result.awarded).toEqual([]);
+    expect(result.skipped).toEqual([
+      { dndName: 'Aelindra', discordId: 'user-1', reason: 'registry error' },
+      { dndName: 'Borgrim', discordId: 'user-2', reason: 'registry error' },
+      { dndName: 'Cael', discordId: 'user-3', reason: 'registry error' },
+    ]);
+    expect(mockModifyExperience).not.toHaveBeenCalled();
+  });
+
+  it('skips with "Foundry relay error" when modifyExperience throws for a specific player', async () => {
+    mockCharacterGet.mockImplementation(async (_g, discordId) => {
+      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
+      if (discordId === 'user-2') return { discordId, dndName: 'Borgrim', source: 'foundry', foundryActorUuid: 'Actor.2' };
+      return null;
+    });
+    mockModifyExperience.mockImplementation(async (uuid) => {
+      if (uuid === 'Actor.2') throw new Error('relay down');
+    });
+
+    const result = await awardXP(baseSession, 100, makeThread() as any);
+
+    expect(result.awarded).toEqual([{ dndName: 'Aelindra', amount: 100 }]);
+    expect(result.skipped).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({ dndName: 'Borgrim', reason: 'Foundry relay error' }),
+      ]),
+    );
+  });
+
+  it('handles a mix of: no profile, no UUID, and one success', async () => {
+    mockCharacterGet.mockImplementation(async (_g, discordId) => {
+      if (discordId === 'user-1') return null;                          // no profile
+      if (discordId === 'user-2') return { source: 'custom' };         // no UUID
+      if (discordId === 'user-3') return { source: 'foundry', foundryActorUuid: 'Actor.3' }; // success
+    });
+
+    const result = await awardXP(baseSession, 25, makeThread() as any);
+
+    expect(result.awarded).toEqual([{ dndName: 'Cael', amount: 25 }]);
+    expect(result.skipped).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({ dndName: 'Aelindra', reason: 'no Foundry character linked' }),
+        expect.objectContaining({ dndName: 'Borgrim', reason: 'no Foundry character linked' }),
+      ]),
+    );
+  });
+});