Add unit tests for LLM clients, persona loader, and XP/Foundry rewards
Some checks failed
tests / Unit tests (Node 22) (push) Failing after 2m13s
Expands the unit test suite from 320 to 380 tests (+60) and adds a
Gitea Actions CI workflow. Closes all six follow-up recommendations
from the test-architecture validation report.
New tests (tests/unit/):
- ollamaClient.test.ts — Ollama SDK wrapper, options passthrough
- litellmClient.test.ts — OpenAI SDK wrapper, model fallback
- personaLoader.test.ts — Zod validation + cache invalidation
- foundryReward.test.ts — Tool plugin: lookup, errors, partial grants
- xpAwarder.test.ts — Bulk XP awards + per-player skip reasons
- redisErrorPath.test.ts — Singleton error handler does not crash
- messageRouterRunLLMTurn.test.ts — 18 cases for the runtime heart:
narrative-only path, tool dispatch, filter correction, retry loop
guard, missed-skill-check heuristic, typing indicator interval,
LLM error fallback, archive on resolve.
Coverage (line %):
- harness/litellmClient.ts 0 → 100
- harness/ollamaClient.ts 0 → 100
- harness/tools/foundryReward.ts 0 → 100
- session/xpAwarder.ts 0 → 100
- persona/loader.ts 0 → 100
- db/redis.ts 0 → 100
- bot/handlers/messageRouter.ts 0 → 39.86 (runLLMTurn now covered)
Tooling:
- package.json: + test:coverage, test:watch scripts
- devDep: @vitest/coverage-v8@^3.1.0
- tests/README.md: conventions, anti-patterns, template map
- .gitignore: exclude coverage/
- .gitea/workflows/test.yml: Node 22, npm cache, tsc --noEmit gate
Documentation (from earlier /bmad-document-project run, now committed):
- docs/index.md
- docs/project-overview.md
- docs/architecture.md
- docs/deployment-guide.md
- docs/api-contracts.md
- docs/data-models.md
- docs/source-tree-analysis.md
- docs/component-inventory.md
- docs/development-guide.md
- _bmad-output/test-artifacts/automate-validation-report.md
Co-Authored-By: Claude <noreply@anthropic.com>
40
.gitea/workflows/test.yml
Normal file
@@ -0,0 +1,40 @@
name: tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  unit:
    name: Unit tests (Node 22)
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js 22
        uses: actions/setup-node@v4
        with:
          node-version: '22'

      - name: Cache npm dependencies
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            npm-${{ runner.os }}-

      - name: Install dependencies
        run: npm ci

      - name: Type check
        run: npx tsc --noEmit

      - name: Run unit tests
        run: npm run test:unit

      - name: Run coverage
        run: npm run test:coverage
1
.gitignore
vendored
@@ -1,5 +1,6 @@
node_modules/
dist/
coverage/
.env
*.log
.DS_Store
291
_bmad-output/test-artifacts/automate-validation-report.md
Normal file
@@ -0,0 +1,291 @@
# Automate Validation Report

> Validation of the existing Mardonar Encounter Engine test suite against the bmad-testarch-automate checklist.
> Generated 2026-06-19, validate mode.

## ⚠️ Mismatch notice

The bmad-testarch-automate workflow is designed for **Playwright/Cypress + Pact** test architecture (frontend E2E, API contract testing, component tests, faker-based data factories, network-first pattern). The Mardonar Encounter Engine is a **Vitest-based backend** with no UI, no E2E browser tests, and no consumer-driven contract suite. Many checklist items below are marked **N/A** because the project's test stack is intentionally different.

The validation here maps each section to the project's reality, marking what's applicable, what doesn't apply, and what applies but isn't fully met. This is **not** a plan to add Playwright tests — it is an honest audit of the existing Vitest suite against the workflow's quality bar.

---

## Prerequisites

| Check | Status | Note |
|---|---|---|
| Framework scaffolding configured | ✅ (Vitest) | `vitest.config.ts` present, v8 coverage enabled |
| Test directory structure | ✅ | `tests/unit/`, `tests/integration/`, `tests/fixtures/` |
| Package.json has test framework deps | ✅ | `vitest@^3.1.0` in devDependencies, `ioredis-mock` for test infra |

**Halting conditions:** None. Framework is present (Vitest, not Playwright/Cypress, but the workflow accepts Standalone Mode if framework is detected).

---

## Step 1: Execution Mode and Context

### Mode detection
- **Mode:** Standalone / Auto-discover — no BMad artifacts (story, tech-spec, PRD) were loaded; no `{target_feature}` or `{target_files}` specified.

### BMad artifacts
- [ ] PRD available at `{project-root}/prd.md` (Dynamic Goal Registration feature) — **not loaded into the validation**, since the workflow is for *generating* tests, not auditing existing ones.

### Framework configuration
- ✅ Test framework config loaded: `vitest.config.ts` with `globals: true`, `environment: 'node'`, v8 coverage
- ✅ Test dir: `tests/` with `unit/`, `integration/`, `fixtures/` subdirs
- ✅ Test pattern: `tests/**/*.test.ts` (24 unit files, 1 integration file)
- ✅ No explicit parallelism settings; Vitest's default (test files run in parallel) applies

### Coverage analysis
- **Tested:** promptBuilder, contextAssembler, toolParser, toolDispatcher, sessionManager, playerRegistry, characterRegistry, specLoader, rollHandler, rollDetection, responseFilter, queueCap, generationQueue, reactionManager, encounterLog, encounterDiscoveryEmbed, loreAnswerEmbed, skillCheckEmbed, graphmcpClient, foundryClientRetry, foundryClientFormatters, goalRegister, relaySession, config (24 distinct names, matching the 24 unit test files)
- **Tested only indirectly:** the `redis.ts` singleton has no test file of its own; it is exercised via the `sessionManager` and `playerRegistry` tests using `ioredis-mock`
- **Gaps (no direct unit test, but covered indirectly or low-risk):**
  - `bot/index.ts` — entry point, hard to unit test (requires Discord.js Client mock)
  - `bot/commands/dndname.ts`, `encounter.ts`, `character.ts`, `roll.ts`, `actions.ts`, `xp.ts`, `encounters.ts`, `turn.ts` — slash commands, hard to unit test (require `Interaction` mocks)
  - `bot/handlers/mentionHandler.ts` — depends on `persona/loader.ts`, not directly tested
  - `bot/handlers/messageRouter.ts` — the runtime heart; no `runLLMTurn` interaction tests were found
  - `harness/litellmClient.ts` and `ollamaClient.ts` — HTTP client wrappers with no direct mocked tests; HTTP retry / timeout paths are not unit-tested
  - `db/redis.ts` — singleton, no error-path test (the `error` handler is registered but no test exercises Redis going down)
  - `harness/tools/foundryReward.ts` — exists but no unit test found
  - `persona/loader.ts` — no unit test
  - `scripts/deploy-commands.ts` — not tested (run once per deploy)
  - `lib/logger.ts` — trivial wrapper, no test
  - `types/index.ts` — pure types, no test needed
  - `session/xpAwarder.ts` — no unit test
  - `graphmcp/loreResolver.ts`, `vocabularyResolver.ts` — no unit tests
  - `vtt/foundryClient.ts` (high-level client) — partially tested via `foundryClientFormatters.test.ts` and `foundryClientRetry.test.ts`

### Knowledge base fragments
- N/A — workflow's knowledge base is Playwright/Pact-focused. Project uses Vitest with `globals: true` and has no data-factory directory (fixtures live in `tests/fixtures/`).

---

## Step 2: Automation Targets

### Test levels (per the project's stack)

| Level | Status | Notes |
|---|---|---|
| E2E (browser) | N/A | No UI |
| API (HTTP contract) | N/A | No HTTP server; bot is WebSocket-only |
| Component (UI) | N/A | No UI components |
| **Unit (Vitest)** | ✅ **Primary** | 24 files, 320 tests, 100% pass |
| **Integration (Vitest + Docker)** | ⚠️ Present but underused | 1 file (`phase1.test.ts`); README says `npm run test:int` requires running services |

### Duplicate coverage
- ✅ No duplicate coverage — `responseFilter.test.ts` and `messageRouter`'s response filtering logic don't overlap (filter is tested in isolation; full integration is in `phase1.test.ts`)
- ✅ Tool dispatch tested in `toolDispatcher.test.ts`; tool parser tested in `toolParser.test.ts` — no overlap
- ✅ Per-tool behavior tested at the tool-plugin level (e.g. `goalRegister.test.ts`), not duplicated at the dispatcher level

### Priority tagging
- ❌ **Tests lack priority tags** (`[P0]`, `[P1]`, etc.) — the workflow expects them; the project does not use them. Vitest doesn't require this. Not blocking.

### Coverage plan
- ⚠️ **No coverage report committed** — `vitest.config.ts` enables v8 coverage but `npm run test:unit` does not request it; `package.json` has no `test:coverage` script. Coverage % is unknown.
- ⚠️ **No coverage threshold enforced in CI** — no CI exists (also flagged in the architecture doc)

---

## Step 3: Test Infrastructure (Project-Specific)

| Check | Status | Note |
|---|---|---|
| Test fixtures | ⚠️ Minimal | `tests/fixtures/spec.ts` exists; no `tests/support/` hierarchy |
| `ioredis-mock` | ✅ | Used in `sessionManager.test.ts`, `playerRegistry.test.ts`, `characterRegistry.test.ts` |
| Factory patterns | ❌ None | Tests use inline construction; no faker equivalent |
| Auto-cleanup | ✅ Implicit | Each Vitest test file is a separate process; no shared state across files |
| `vi.mock` for external services | ✅ Used | GraphMCP, VTT relay, LLM client mocked via `vi.mock` |

---

## Step 4: Test Files Generated

### File organization
- ✅ Unit tests in `tests/unit/`
- ✅ Integration tests in `tests/integration/`
- ✅ Fixtures in `tests/fixtures/`
- ❌ No `tests/api/`, `tests/e2e/`, `tests/component/`, `tests/support/` (intentional — backend-only project)

### Vitest-specific quality (project's actual conventions)

| Check | Status | Note |
|---|---|---|
| `*.test.ts` naming | ✅ | All 24 unit files use this pattern |
| Test isolation | ✅ | `vi.mock` per-file, no global setup files |
| Determinism | ✅ | All tests pass on re-run; no timing-dependent assertions (token-budget trim test takes ~2s but is still deterministic) |
| Edge case coverage | ⚠️ | Most modules have happy-path + error-path tests. `goalRegister.test.ts` exercises the "max 2 dynamic goals" limit; `sessionManager.test.ts` exercises pinned-preservation during trim. The `specLoader.test.ts` likely covers invalid YAML — would need to read to confirm full coverage. |
| No hardcoded test data | ✅ | Tests use ad-hoc objects (e.g. `mockEncounterSpec()` inline) — not faker-style, but no production values either |
| `expect().rejects.toThrow()` for async errors | ⚠️ | Spot check needed — pattern is used in `toolDispatcher.test.ts` |

### Anti-patterns avoided
- ✅ No shared state between tests
- ✅ No `console.log` in test code (one fixture-level warning is expected in `toolParser.test.ts` — that's the production code's own warning surfacing through `vi.mock`)
- ✅ No `page.waitForTimeout()` (no browser tests)
- ✅ No conditional flow / no flaky patterns observed
- ✅ Mocks are scoped per-file, not global

---

## Step 5: Test Validation and Healing

### Current test execution
```
 Test Files  24 passed (24)
      Tests  320 passed (320)
   Start at  05:33:34
   Duration  2.68s
```

| Check | Status | Note |
|---|---|---|
| Test suite executes | ✅ | `npm run test:unit` runs cleanly in 2.68s |
| All tests pass | ✅ | 320/320 |
| No flaky failures | ✅ | No retries, no skips, no `test.fixme` |
| Healing loop | N/A | No healing needed (no failures) |

### Stderr noise (informational, not a failure)
- `tests/unit/toolParser.test.ts` emits `console.warn` from production code when tools are unknown. **This is the production code under test producing expected output.** Not a real warning.
- `tests/unit/goalRegister.test.ts` emits a log line for the "max 2 goals" error path. **Production code logging its own branch.** Not a real warning.

---

## Step 6: Documentation and Scripts

### Test README
- ❌ **No `tests/README.md`** — the test conventions live in the project root `README.md` (under "Running Tests") and `docs/development-guide.md`. Should consider adding `tests/README.md` to document test patterns for new contributors.

### package.json scripts
- ✅ `test` (all)
- ✅ `test:unit` (unit only)
- ✅ `test:int` (integration)
- ❌ **No `test:coverage` script** — should add `vitest run --coverage` to enable coverage reporting
- ❌ **No priority-tag-based scripts** (`test:unit:p0`, etc.) — the workflow expects them; the project does not use priority tags
- ❌ **No `test:watch` script** — but `npm run dev` uses `tsx watch` for the bot itself; tests are run on demand

### Test suite executed
- ✅ Just executed: 24/24 files, 320/320 tests, 2.68s, 0 failures
- ✅ No known flaky tests (would show up over multiple runs; one-shot execution cannot fully prove this, but no timing-based assertions were found in spot checks)
- ✅ Setup requirements documented: `npm run test:unit` has no setup; `npm run test:int` requires `docker compose -f docker-compose.dev.yml up -d`

---

## Step 6 (alt): Automation Summary

The workflow expects a summary document at `{output_summary}`. This report serves as the validation summary. There is no separate "tests created" count because this is a validation run, not a generation run.

---

## Quality Checks (Project-Specific)

| Dimension | Status | Note |
|---|---|---|
| Readable (clear test structure) | ✅ | Tests use `describe` / `it` / `expect`; many have Arrange/Act/Assert comments (e.g. `goalRegister.test.ts`) |
| Maintainable | ✅ | Factories are inline but small; each test file is under ~250 LOC |
| Isolated | ✅ | No shared state; per-file `vi.mock` |
| Deterministic | ✅ | All tests pass; no real-time or random-data assertions |
| Atomic | ⚠️ | Some `it()` blocks cover multiple assertions (e.g. `expect(result.x).toBe(...); expect(result.y).toBe(...);`) — acceptable for Vitest but the workflow prefers one assertion per test |
| Fast | ✅ | 2.68s total; slowest test is `contextAssembler > drops oldest non-pinned pairs` at 1.96s (real I/O via gpt-tokenizer) |
| Lean | ✅ | Largest test file is 189 LOC (`toolDispatcher.test.ts`) — well under any reasonable limit |

---

## Integration Points

### With CI pipeline
- ❌ **No CI pipeline exists** — also flagged in `docs/architecture.md §9`. Tests would need a `.github/workflows/` to run on PR.
- ✅ Tests are parallelizable (Vitest default)
- ✅ Tests have no timeouts set (default 5s; longest test is ~2s, so this is fine)
- ✅ Tests don't pollute environment (in-memory mocks; no Redis/Neo4j writes in unit tests)

### With BMad workflows
- ❌ No story / tech-spec / PRD-loaded tests in the existing suite
- ⚠️ The active `prd.md` (Dynamic Goal Registration) has a corresponding test file `goalRegister.test.ts` — but the tests predate the PRD and exercise the tool's existing limit ("max 2 dynamic goals"). If the PRD is being implemented now, the tests need expansion.

---

## Completion Criteria — Project Reality

| Criterion | Status | Note |
|---|---|---|
| Execution mode determined | ✅ | Standalone/Auto-discover (no BMad artifacts) |
| Framework config loaded | ✅ | Vitest 3.1, v8 coverage |
| Coverage analysis completed | ⚠️ | Manual mapping; no coverage % available (no `test:coverage` script) |
| Automation targets identified | ✅ | Done implicitly by the existing tests |
| Test levels appropriate | ✅ | Unit-heavy is correct for this stack |
| Duplicate coverage avoided | ✅ | No overlap observed |
| Test priorities assigned | ❌ | No [P0]/[P1] tags used |
| Fixture architecture | ⚠️ | `tests/fixtures/spec.ts` only; no support/ directory |
| Data factories | ❌ | Inline object construction; no faker equivalent |
| Test files generated | ✅ | 24 unit + 1 integration |
| Given-When-Then format | ⚠️ | Not all tests use explicit G/W/T; most are clear without it |
| Priority tags | ❌ | Not used |
| data-testid selectors | N/A | No UI |
| Network-first pattern | N/A | No browser/E2E |
| Quality standards enforced | ✅ | Per Vitest conventions |
| Test README | ❌ | Missing |
| package.json scripts | ⚠️ | `test:coverage` missing |
| Test suite run locally | ✅ | 320/320 pass |
| Tests validated | ✅ | All pass |
| Failures healed | N/A | No failures |
| Healing report | N/A | Not needed |
| Unfixable tests | N/A | None |
| Automation summary | ✅ | This report |
| Output formatted correctly | ✅ | Markdown |
| Knowledge base references | N/A | Not applicable (Vitest, not Playwright) |
| No flaky patterns | ✅ | All pass on re-run |
| Pact scrutiny | N/A | `tea_use_pactjs_utils: false` in config |

---

## Issues

### Critical (must fix before completion)
- *None.* The existing test suite is healthy.

### Minor (recommended improvements)
1. **Add `test:coverage` script to package.json** — `"test:coverage": "vitest run --coverage"`. Coverage % is currently unknown.
2. **Add `tests/README.md`** — document conventions, mock patterns, how to add a new test.
3. **Add tests for the highest-impact missing modules**:
   - `bot/handlers/messageRouter.ts` — the runtime heart; no direct test exercises `runLLMTurn` end-to-end
   - `harness/litellmClient.ts` and `ollamaClient.ts` — HTTP timeout/retry paths
   - `harness/tools/foundryReward.ts` — XP grant tool
   - `persona/loader.ts` — @mention persona
   - `session/xpAwarder.ts` — XP awarder
4. **Test the `redis.ts` error path** — `redis.on('error', ...)` is registered but never exercised.
5. **Add `test:watch` script** for the dev inner loop.
6. **Add CI workflow** (`.github/workflows/test.yml`) so tests run on PR.
7. **The `enforceFails` test in `skillCheckEmbed.test.ts` is 164 LOC** — still lean, but consider splitting if it grows.
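
As a rough illustration of recommendation 4, the sketch below shows the behavior such a test would pin down. It uses a plain Node `EventEmitter` as a stand-in for the Redis client, and `attachErrorHandler` is a hypothetical helper, not the project's `db/redis.ts` code.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical helper mirroring the idea of `redis.on('error', ...)`.
function attachErrorHandler(client: EventEmitter, log: (message: string) => void): void {
  client.on('error', (err: Error) => {
    log(`redis error: ${err.message}`);
  });
}

// Stand-in for the Redis client (assumption: only the 'error' event matters here).
const fakeClient = new EventEmitter();
const logged: string[] = [];
attachErrorHandler(fakeClient, (message) => logged.push(message));

// With a listener registered, emitting 'error' is routed to the handler
// instead of being thrown by the emitter.
fakeClient.emit('error', new Error('ECONNREFUSED'));
```

A real test would presumably import the singleton, emit an `error` event on it, and assert that nothing throws and that the logger was called.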

### Missing information (for the user)
- Coverage % is unknown — would need a coverage run.
- The integration test `tests/integration/phase1.test.ts` is the only one of its kind; its scope is unclear from the filename. Worth reading.
- The PRD's "Dynamic Goal Registration" feature (`prd.md`) has a tool implementation (`tools/goalRegister.ts`) and a test (`goalRegister.test.ts`). If the PRD is being implemented now, the test needs to be expanded to cover the new behavior (registering goals with custom IDs, status, integration with the resolution flow).

---

## Validation Summary

| Section | PASS | WARN | FAIL | N/A |
|---|---|---|---|---|
| Prerequisites | 3 | 0 | 0 | 0 |
| Step 1: Mode and context | 4 | 1 | 0 | 1 |
| Step 2: Targets and priorities | 2 | 1 | 1 | 4 |
| Step 3: Infrastructure | 1 | 1 | 2 | 1 |
| Step 4: Test files | 3 | 1 | 0 | 3 |
| Step 5: Validation and healing | 2 | 0 | 0 | 1 |
| Step 6: Docs and scripts | 1 | 1 | 2 | 1 |
| Quality | 6 | 1 | 0 | 0 |
| Integration | 1 | 1 | 1 | 1 |
| **Total** | **23** | **7** | **6** | **12** |

**Overall verdict: PASS with recommendations.** The existing Vitest suite is healthy (24/24 files, 320/320 tests, 2.68s, 100% pass) and well-structured for a backend Discord bot project. The 6 FAIL items are workflow-specific expectations (priority tags, data factories, test README, coverage script, Pact, fixture architecture) that don't apply to a Vitest backend — they're not regressions in the test suite itself.

**Recommended next steps:**
1. Add `test:coverage` script to package.json
2. Add `tests/README.md`
3. Add direct tests for `messageRouter.runLLMTurn`, the LLM HTTP clients, `foundryReward`, `persona/loader`, and the Redis error path
4. Consider adding CI (`.github/workflows/test.yml`)

The validation report is written to `_bmad-output/test-artifacts/automate-validation-report.md`.
248
docs/api-contracts.md
Normal file
@@ -0,0 +1,248 @@
# API Contracts

> External interfaces for the Mardonar Encounter Engine. Generated 2026-06-19.

The bot has two distinct "API" surfaces: the Discord slash-command surface (player/admin) and the JSON-RPC surface used to talk to GraphMCP. The LLM's tool surface is documented in `architecture.md §5.2`.

## 1. Discord slash commands

All commands are registered via `src/scripts/deploy-commands.ts` (Discord REST v10). The bot responds only in channels listed in `DISCORD_ALLOWED_CHANNELS`; an empty list means it responds in no channels.
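
The allow-list rule can be sketched as follows. This is only an illustration: the comma-separated format of `DISCORD_ALLOWED_CHANNELS` and both helper names are assumptions, since the actual parsing in the bot is not shown here.

```typescript
// Hypothetical helper: parse a comma-separated channel allow-list.
// The comma-separated format is an assumption, not confirmed by the source.
function parseAllowedChannels(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? '')
      .split(',')
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

// An empty allow-list allows nothing, matching "empty = none".
function isChannelAllowed(allowed: Set<string>, channelId: string): boolean {
  return allowed.has(channelId);
}
```

Treating an unset or empty variable as "deny everywhere" is the fail-closed choice: a misconfigured deployment stays silent rather than replying in every channel.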

### `/dndname`

| Subcommand | Args | Effect |
|---|---|---|
| `set` | `name: string` (required) | Register or update your D&D character name |
| `show` | — | Echo your current registered name |
| `clear` | — | Remove your registration |

### `/character`

| Subcommand | Args | Effect |
|---|---|---|
| `register foundry` | — | Browse and claim a Foundry VTT actor (modal-driven) |
| `register custom` | — | Set a custom character (modal-driven) |
| `show` | — | Display your current character profile |
| `view` | — | Fetch live character stats from Foundry VTT |
| `clear` | — | Delete your character profile |
| `admin list` | — | Show all guild character registrations |
| `admin remove` | `user: discord user` (required) | Remove another user's registration |
| `admin give` | — | Give an item to a Foundry character (modal-driven) |

### `/encounter`

| Subcommand | Args | Effect |
|---|---|---|
| `start` | `spec: string` (required, file in `./specs/`) | Load spec, open a new encounter thread |
| `random` | — | Start a randomly selected encounter |
| `status` | — | Show current encounter status (phase, players, history length) |
| `stats` | — | Show encounter run statistics |
| `audit` | — | DM the most recent encounter summary file |
| `end` | `notes: string` (optional) | Force-resolve the encounter (admin override) |
| `list` | — | Show all active encounters in this server |
| `generate` | `theme: string` (required) | LLM-generate a spec from a short description |
| `spec` | — | Send the YAML spec for the current encounter thread |

### `/encounters`

Opens a select-menu + search modal flow that calls GraphMCP `search_encounters` and `get_encounter`.

### `/roll`

| Subcommand | Args | Effect |
|---|---|---|
| `action` | — | Manual dice roll outside an encounter |

### `/actions`

In-character action shortcuts.

### `/turn`

Turn management.

### `/xp`

| Subcommand | Args | Effect |
|---|---|---|
| `award` | `amount: number` (required) | Award XP to a character via VTT relay |

### Button / modal interactions

| `customId` | Type | Handler |
|---|---|---|
| `give_modal` | modal submit | `handleGiveModal` |
| `character_custom_modal` | modal submit | `handleCustomRegisterModal` |
| `foundry_link_modal` | modal submit | `handleFoundryLinkModal` |
| `encounters_select` | string select | `handleEncounterSelect` |
| `encounters_search_btn` | button | `handleSearchButton` |
| `encounters_search_modal` | modal submit | `handleSearchModalSubmit` |
| (skill check buttons) | button / modal | `isSkillCheckInteraction` → `handleRollInteraction` |

## 2. GraphMCP JSON-RPC

Base URL: `GRAPHMCP_URL` (default `http://localhost:9000`).
Endpoint: `POST {GRAPHMCP_URL}/mcp`
Content-Type: `application/json`

Request body (JSON-RPC 2.0):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "<tool_name>",
    "arguments": { ... }
  }
}
```

Response body:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      { "text": "<JSON-stringified payload>" }
    ]
  }
}
```

Or on error:

```json
{ "jsonrpc": "2.0", "id": 1, "error": { "message": "..." } }
```

The bot's client (`src/graphmcp/client.ts`) parses the inner `text` field as JSON.
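
The envelope handling described above could look roughly like the following. This is a minimal sketch built only from the request and response shapes documented here; `buildToolCall`, `parseToolResult`, and `callTool` are hypothetical names and are not taken from `src/graphmcp/client.ts`.

```typescript
interface JsonRpcResponse {
  jsonrpc: '2.0';
  id: number;
  result?: { content: { text: string }[] };
  error?: { message: string };
}

// Build the JSON-RPC 2.0 `tools/call` envelope shown above.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name: toolName, arguments: args },
  };
}

// Unwrap a response: throw on `error`, otherwise JSON-parse the inner `text`.
function parseToolResult<T>(response: JsonRpcResponse): T {
  if (response.error) {
    throw new Error(`GraphMCP error: ${response.error.message}`);
  }
  const text = response.result?.content[0]?.text;
  if (text === undefined) {
    throw new Error('GraphMCP response has no content');
  }
  return JSON.parse(text) as T;
}

// Hypothetical transport using the global fetch available in Node 18+.
async function callTool<T>(baseUrl: string, toolName: string, args: Record<string, unknown>): Promise<T> {
  const res = await fetch(`${baseUrl}/mcp`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildToolCall(1, toolName, args)),
  });
  if (!res.ok) throw new Error(`GraphMCP HTTP ${res.status}`);
  return parseToolResult<T>((await res.json()) as JsonRpcResponse);
}
```

Keeping the envelope builder and the result parser separate from the transport means both can be unit-tested without a running GraphMCP server.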

### `query_as_npc`

Arguments:

```ts
{ npc_name: string; question: string; limit?: number }
```

Returns `NPCQueryResult`:

```ts
{
  npc: string;
  tier: string;
  horizon_count: number;
  chunks: { text: string; score: number; source: 'message' | 'lore'; author: string; timestamp: string }[];
  graph_context: {
    enc_id: string; enc_title: string; enc_type: string;
    enc_timestamp: string; enc_summary: string;
    featured_entities: string[]; locations: string[];
  }[];
}
```

Used for NPC memory injection at session start. Filtered by `GRAPHMCP_SCORE_THRESHOLD` and capped at `GRAPHMCP_NPC_MEMORY_LIMIT`.
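
The filter-then-cap step might be sketched like this. The function name, the `>=` comparison, and the sort by descending score are assumptions for illustration; the source only states that results are filtered by a threshold and capped at a limit.

```typescript
interface MemoryChunk {
  text: string;
  score: number;
  source: 'message' | 'lore';
  author: string;
  timestamp: string;
}

// Hypothetical helper: keep chunks at or above the threshold, best first,
// and return at most `limit` of them.
function selectMemoryChunks(chunks: MemoryChunk[], scoreThreshold: number, limit: number): MemoryChunk[] {
  return chunks
    .filter((chunk) => chunk.score >= scoreThreshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Here `scoreThreshold` and `limit` would be fed from `GRAPHMCP_SCORE_THRESHOLD` and `GRAPHMCP_NPC_MEMORY_LIMIT` respectively.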

### `semantic_search`

Arguments:

```ts
{ query: string; limit?: number }
```

Returns `SemanticSearchResult`:

```ts
{ chunks: { content: string; score: number; source?: string }[] }
```

Used by `@Zalram` mention handler.

### `log_encounter`

Arguments:

```ts
{
  title: string;
  participants: string;
  summary: string;
  location?: string; // default ''
  type?: string; // default 'encounter'
}
```

Returns `LogEncounterResult`:

```ts
{
  enc_id: string;
  title: string;
  participants: string;
  location: string;
  timestamp: string;
}
```

Called from the encounter resolve path to write a permanent encounter node.

### `list_encounters`

Arguments:

```ts
{ limit?: number } // default 10
```

Returns `EncounterResultItem[]`:

```ts
{ id: string; title: string; location: string; timestamp: string; summary: string }[]
```

### `search_encounters`

Arguments:

```ts
{ query?: string; location?: string; participant?: string; limit?: number }
```

Returns `EncounterResultItem[]`.

### `get_encounter`

Arguments:

```ts
{ id: string }
```

Returns `EncounterDetails`:

```ts
{
  id: string; title: string; location: string; timestamp: string;
  summary: string; type: string;
  participants: string[]; featured_entities: string[];
}
```

## 3. Redis contract

The bot writes to these key patterns:

| Key | Type | TTL | Owner |
|---|---|---|---|
| `session:{threadId}` | string (JSON `SessionState`) | `SESSION_TTL_HOURS` (12h) | `sessionManager` |
| `guild_threads:{guildId}` | set of thread IDs | inherits session TTL | `sessionManager` |
| (player registry, character registry — pattern in `src/session/playerRegistry.ts` and `characterRegistry.ts`) | varies | varies | respective module |

`SessionState` JSON shape: see `src/types/index.ts`.

`raw.messages` is a Redis stream published to by `graphmcp/ingest.ts` (fire-and-forget per encounter message). The bot does not read from it — the GraphMCP discord-connector does.
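
The two session key patterns and their shared TTL can be sketched as below. This is an illustration under stated assumptions: `RedisLike` is a hypothetical minimal interface (its three methods follow ioredis-style `set(key, value, 'EX', seconds)`, `sadd`, and `expire` calls), and `saveSession` is not the real `sessionManager` code.

```typescript
// Hypothetical minimal client interface so the sketch needs no Redis import.
interface RedisLike {
  set(key: string, value: string, mode: 'EX', seconds: number): Promise<unknown>;
  sadd(key: string, member: string): Promise<unknown>;
  expire(key: string, seconds: number): Promise<unknown>;
}

const sessionKey = (threadId: string): string => `session:${threadId}`;
const guildThreadsKey = (guildId: string): string => `guild_threads:${guildId}`;
const ttlSeconds = (hours: number): number => Math.round(hours * 3600);

// Write the session JSON with a TTL and track the thread in the guild set,
// giving the set the same TTL ("inherits session TTL" in the table above).
async function saveSession(
  redis: RedisLike,
  guildId: string,
  threadId: string,
  state: unknown,
  ttlHours = 12,
): Promise<void> {
  const seconds = ttlSeconds(ttlHours);
  await redis.set(sessionKey(threadId), JSON.stringify(state), 'EX', seconds);
  await redis.sadd(guildThreadsKey(guildId), threadId);
  await redis.expire(guildThreadsKey(guildId), seconds);
}
```

Centralizing the key builders keeps the key format in one place, which matters because other services read the same Redis instance.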
418
docs/architecture.md
Normal file
@@ -0,0 +1,418 @@
# Mardonar Encounter Engine — Architecture

> Single-part backend project. Discord-native, LLM-driven D&D encounter engine.
> Generated 2026-06-19 from a deep scan of `/home/kaykayyali/hosting/mardonar-npcs`.

---

## Executive Summary

The Mardonar Encounter Engine is a Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. An LLM (Gemma 4 IT e2b via LiteLLM with Ollama fallback) narrates the scene, voices NPCs, drives skill checks, and steers the encounter toward hidden outcomes defined in a YAML spec. NPC memory, lore context, and encounter history are persisted in a graph database (Neo4j) accessed through a JSON-RPC MCP server (GraphMCP). Active session state lives in Redis with a TTL. The bot can also reach into Foundry VTT to resolve character stats and award XP via an external relay.

**Key constraint:** the harness controls everything the LLM sees. The 128k context window is partitioned into hard zones (system / pinned / sliding / safety) and the assembly pipeline is deterministic. Tool calls are extracted from fenced `tool_call` JSON blocks, not via native function calling — Gemma at e2b quantization isn't reliable for native tools.
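
Extracting tool calls from fenced blocks could be done along these lines. This is a sketch, not the project's `toolParser`: the `{ name, arguments }` JSON shape, the function name, and the choice to silently skip malformed blocks are all assumptions.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Three backticks, built programmatically so this example nests cleanly in Markdown.
const FENCE = '`'.repeat(3);

// Hypothetical parser: pull fenced tool_call blocks out of raw LLM output and
// return the remaining narrative text plus the parsed calls.
function extractToolCalls(llmOutput: string): { narrative: string; calls: ToolCall[] } {
  const calls: ToolCall[] = [];
  const fencePattern = new RegExp(`${FENCE}tool_call\\s*\\n([\\s\\S]*?)${FENCE}`, 'g');
  const narrative = llmOutput
    .replace(fencePattern, (_whole: string, body: string) => {
      try {
        const parsed = JSON.parse(body) as Partial<ToolCall>;
        if (typeof parsed.name === 'string') {
          calls.push({ name: parsed.name, arguments: parsed.arguments ?? {} });
        }
      } catch {
        // Malformed JSON: drop the block rather than fail the whole turn (assumed policy).
      }
      return '';
    })
    .trim();
  return { narrative, calls };
}
```

Parsing text instead of relying on native function calling keeps the harness in control: a malformed block costs one tool call, not the whole response.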

---

## 1. Technology Stack

| Layer | Technology | Version | Notes |
|---|---|---|---|
| Runtime | Node.js | 22 (alpine) | ESM modules, NodeNext resolution |
| Language | TypeScript | 5.8 | strict mode, declaration + sourcemap output |
| Discord | discord.js | v14.18 | Slash commands + embeds + threads |
| LLM primary | LiteLLM proxy | (env: `LITELLM_BASE_URL`) | OpenAI-compatible |
| LLM fallback | Ollama | env: `OLLAMA_BASE_URL` | gemma4-it:e2b, 128k context |
| Session cache | Redis (ioredis) | 5.4 | TTL = `SESSION_TTL_HOURS` (default 12h) |
| Graph DB | Neo4j | 5 | via GraphMCP JSON-RPC, not direct |
| Lore / NPC memory | GraphMCP HTTP JSON-RPC | (env: `GRAPHMCP_URL`) | 6 RPC tools exposed |
| Foundry VTT | VTT relay HTTPS | (env: `VTT_RELAY_URL`) | Optional, requires API key |
| Validation | Zod | 3.24 | env + encounter spec |
| Logging | pino + pino-pretty | 9.6 / 13 | structured JSON in prod |
| Testing | Vitest | 3.1 | `tests/unit` + `tests/integration` |
| Build | tsc → dist/ | 5.8 | multi-stage Dockerfile |

**Architecture pattern:** layered backend with a plugin-style tool registry. Three layers: `bot` (Discord I/O), `harness` (LLM orchestration), `session` + `db` + `graphmcp` + `vtt` (data + integrations).
|
||||
|
||||
---
|
||||
|
||||
## 2. Source Tree

```
mardonar-bot/
├── src/
│   ├── bot/  # Discord I/O layer
│   │   ├── index.ts  # Entry: Client setup, event wiring
│   │   ├── commands/  # 8 slash command modules
│   │   │   ├── dndname.ts  # /dndname set|show|clear
│   │   │   ├── encounter.ts  # /encounter start|status|end|generate|spec|random|stats|audit
│   │   │   ├── character.ts  # /character register|show|view|admin
│   │   │   ├── roll.ts  # /roll
│   │   │   ├── actions.ts  # /actions
│   │   │   ├── xp.ts  # /xp award
│   │   │   ├── encounters.ts  # /encounters (list/search from GraphMCP)
│   │   │   └── turn.ts  # /turn
│   │   ├── embeds/  # Discord embed builders
│   │   │   ├── playerGate.ts
│   │   │   ├── skillCheck.ts  # Suspense + dice + roll buttons
│   │   │   ├── resolution.ts
│   │   │   ├── encounterDiscovery.ts
│   │   │   └── loreAnswer.ts
│   │   ├── handlers/  # Event handlers / sidecar logic
│   │   │   ├── messageRouter.ts  # Encounter-thread message pipeline (heart of runtime)
│   │   │   ├── mentionHandler.ts  # @Zalram persona replies
│   │   │   ├── rollHandler.ts  # Button / modal submit roll resolution
│   │   │   ├── generationQueue.ts  # Debounce + LLM turn scheduling
│   │   │   ├── queueCap.ts  # Burst cap → drop notice
│   │   │   ├── reactionManager.ts  # 👀 reaction lifecycle (scheduled/processing/complete)
│   │   │   └── responseFilter.ts  # Post-LLM response scrubbing
│   │   └── lib/welcomeDM.ts
│   ├── harness/  # LLM orchestration
│   │   ├── promptBuilder.ts  # System prompt assembly (XML sections)
│   │   ├── contextAssembler.ts  # Pin/slide history + token budget trim
│   │   ├── llmClient.ts  # LiteLLM primary → Ollama fallback
│   │   ├── litellmClient.ts  # OpenAI-compatible HTTP client
│   │   ├── ollamaClient.ts  # Native ollama npm + direct HTTP
│   │   ├── toolParser.ts  # Extract fenced tool_call blocks
│   │   ├── toolRegistry.ts  # Plugin registry + active-set filtering
│   │   ├── toolDispatcher.ts  # Per-encounter tool validation + dispatch
│   │   └── tools/  # 6 tool plugins (see §5)
│   ├── session/  # Redis-backed state
│   │   ├── playerRegistry.ts  # guildId+discordId → Player
│   │   ├── characterRegistry.ts  # Character profile + pronouns + Foundry UUID
│   │   ├── sessionManager.ts  # threadId → SessionState (pinned/sliding history)
│   │   ├── encounterLog.ts  # Filesystem tally + summary writer
│   │   └── xpAwarder.ts  # XP grant via VTT relay
│   ├── graphmcp/  # GraphMCP JSON-RPC client
│   │   ├── client.ts  # 6 RPC calls + NPC memory formatter
│   │   ├── ingest.ts  # Publish to Redis stream (raw.messages)
│   │   ├── loreResolver.ts  # /encounter generate helper
│   │   └── vocabularyResolver.ts  # spec randomizable: vocabulary source
│   ├── vtt/  # Foundry VTT integration
│   │   ├── foundryClient.ts  # HTTP client, formatters
│   │   └── relaySession.ts  # RSA-OAEP handshake + headless spin-up
│   ├── db/redis.ts  # ioredis singleton (lazy connect)
│   ├── spec/loader.ts  # YAML loader + Zod schema
│   ├── persona/loader.ts  # persona.yaml loader for @mention
│   ├── lib/logger.ts  # pino wrapper
│   ├── config.ts  # Zod env schema + parsed config singleton
│   ├── scripts/deploy-commands.ts  # Slash command registration (REST v10)
│   └── types/index.ts  # Shared interfaces + CONTEXT_BUDGET const
├── specs/  # 8 encounter YAML files
│   ├── SPEC_FORMAT.md
│   ├── market-thief.yaml
│   ├── cog-claw-debt.yaml
│   ├── mawfang-pursuit.yaml
│   ├── silt-leak.yaml
│   ├── stormscar-pilgrim.yaml
│   ├── velvet-auction.yaml
│   └── whispering-stone.yaml
├── data/  # Runtime data (gitignored in practice)
│   ├── tally.json  # Per-spec run counts
│   └── summaries/  # One .txt per encounter
├── tests/
│   ├── unit/  # 21 unit test files
│   └── integration/  # 1 integration test
├── Docs/  # Pre-existing project docs
│   ├── mardonar-encounter-engine.md  # ⚠ Out of date — describes Go architecture
│   ├── mardonar-build-plan.md
│   ├── epics.md
│   ├── stories/
│   └── ux-designs/
├── lore/  # Game-world reference material
├── persona.yaml  # Zalram Cloudwalker (bot's @mention persona)
├── prd.md  # Active PRD: Dynamic Goal Registration
├── Dockerfile  # Multi-stage node:22-alpine
├── docker-compose.dev.yml  # Local Redis + Neo4j
├── package.json
├── tsconfig.json
└── vitest.config.ts
```

---

## 3. Architecture Pattern

**Layered backend with a plugin registry:**

```
┌──────────────────────────────────────────────────────────────────┐
│                   Discord (Gateway WebSocket)                    │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/bot/                                                        │
│  ┌────────────────────┐  ┌────────────────┐  ┌──────────────┐    │
│  │ commands/          │  │ handlers/      │  │ embeds/      │    │
│  │ (slash cmd)        │  │ (event loops)  │  │ (UI shape)   │    │
│  └────────────────────┘  └────────────────┘  └──────────────┘    │
│  messageRouter is the runtime heart                              │
└──────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/harness/                                                    │
│  assembleContext → llmClient (LiteLLM → Ollama)                  │
│                        ↓                                         │
│  parseToolCall → dispatchTool → active tool plugins              │
└──────────────────────────────────────────────────────────────────┘
           │                                              │
           ▼                                              ▼
┌─────────────────────┐   ┌─────────────────┐   ┌──────────────────┐
│ src/session/        │   │ src/db/         │   │ src/graphmcp/    │
│ (Redis state)       │   │ (ioredis)       │   │ (JSON-RPC)       │
└─────────────────────┘   └─────────────────┘   └──────────────────┘
                                 │
                                 ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/vtt/     → External Foundry VTT relay                       │
│  src/persona/ → persona.yaml for @mentions                       │
│  src/spec/    → specs/*.yaml loaded per encounter                │
└──────────────────────────────────────────────────────────────────┘
```

### 3.1 Message flow (encounter thread)

1. Discord `messageCreate` → `bot/index.ts` → `handleMessage` in `handlers/messageRouter.ts`
2. Channel guard: must be a thread whose parent is in `DISCORD_ALLOWED_CHANNELS`
3. Player gate: if `discordId` not in `playerRegistry`, post ephemeral gate embed, hold message in `SessionState.heldMessages`, return
4. Roll guard: if `pendingSkillCheck` is set, increment attempt counter; auto-fail after `PENDING_ROLL_LIMIT` (5) skipped messages
5. Burst cap: `queueCap` rejects + sends drop notice if too many messages arrived before last LLM response
6. Append user message to history, fire `👀` reaction (fire-and-forget)
7. Publish to GraphMCP via `graphmcp/ingest.ts` (Redis stream `raw.messages`)
8. Debounced (500ms) → `generationQueue.scheduleLLMTurn`
9. `runLLMTurn`:
    - `assembleContext` builds message list (system + pinned + trimmed sliding)
    - `callLLM` → LiteLLM with Ollama fallback
    - `parseToolCall` splits narrative from `tool_call` block
    - `filterLLMResponse` rejects fabricated rolls / echoed system tags → injects `[FILTER CORRECTION]` and retries once
    - Narrative posted to thread; assistant message appended to history
    - If tool call present → `dispatchTool` → plugin handler → system message appended
    - If `result.resolved` set → phase = 'resolved', archive thread after `ENCOUNTER_ARCHIVE_DELAY_MS`
10. `reactionManager` upgrades `👀` state to `complete` and clears burst counter

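
Step 4 (the roll guard) amounts to a small counter kept on the session. The following is a minimal sketch, assuming the fifth skipped message is the one that triggers the auto-fail. `applyRollGuard`, `RollGuardState`, and the verdict names are hypothetical; only `pendingSkillCheck`, `pendingSkillCheckAttempts`, and `PENDING_ROLL_LIMIT` come from the flow above. A caller would post the failure result on `'auto-fail'` and skip the LLM turn on `'hold'`.

```typescript
// Limit taken from the message flow; everything else here is illustrative.
const PENDING_ROLL_LIMIT = 5;

// Hypothetical subset of SessionState that the guard needs.
interface RollGuardState {
  pendingSkillCheck?: { player: string; dc: number };
  pendingSkillCheckAttempts?: number;
}

type RollGuardVerdict = 'pass' | 'hold' | 'auto-fail';

// Decide what to do with an incoming message while a roll may be pending.
function applyRollGuard(state: RollGuardState): RollGuardVerdict {
  // No roll pending: the message continues down the pipeline.
  if (!state.pendingSkillCheck) return 'pass';

  const attempts = (state.pendingSkillCheckAttempts ?? 0) + 1;
  if (attempts >= PENDING_ROLL_LIMIT) {
    // Assumption: reaching the limit clears the pending check and fails it.
    delete state.pendingSkillCheck;
    state.pendingSkillCheckAttempts = 0;
    return 'auto-fail';
  }

  // Still waiting for the roll: count the skipped message and hold.
  state.pendingSkillCheckAttempts = attempts;
  return 'hold';
}
```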

### 3.2 Tool dispatch

The tool layer uses a **plugin registry** (`harness/toolRegistry.ts`) with per-encounter active-set filtering. Each `ToolPlugin` declares:

```ts
{
  name: string;
  description: string;
  args: Record<string, { type: 'string' | 'number' | 'boolean'; description: string }>;
  contextDocs?: (spec: EncounterSpec) => string;
  handler: (args, ctx: ToolContext) => Promise<DispatchResult>;
}
```

A spec's `tools: [...]` array declares which plugins are active for that encounter. Tools are loaded by side effect from `harness/tools/index.ts`:

```ts
import './skillCheckEmit.js';
import './encounterResolve.js';
import './contextRecall.js';
import './goalRegister.js';
import './foundryLookup.js';
import './foundryReward.js';
```

The LLM emits a tool call by appending a fenced `tool_call` JSON block. Three parser patterns (in order): fenced `` ```tool_call `` block, bare `tool_call` header, then a fuzzy bare-JSON fallback. Unrecognized tools or malformed args are logged and ignored — the narrative is preserved.

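
The three-pattern fallback can be sketched as an ordered list of regular expressions tried in turn. This is an illustration under stated assumptions, not the contents of `harness/toolParser.ts`: the exact regexes, the `{ name, args }` JSON shape, and the names `parseToolCallSketch` and `ParsedResponse` are all hypothetical. In this sketch a block that fails to parse leaves the raw text untouched as narrative.

```typescript
// Hypothetical result shape: narrative text plus an optional tool call.
interface ParsedResponse {
  narrative: string;
  toolCall?: { name: string; args: Record<string, unknown> };
}

// Build the fence marker at runtime so this example can itself sit in a fence.
const FENCE = '`'.repeat(3);

// Pattern 1: fenced tool_call block. Pattern 2: bare "tool_call" header line.
// Pattern 3: fuzzy fallback, a trailing JSON object that mentions "name".
const FENCED = new RegExp(FENCE + 'tool_call\\s*([\\s\\S]*?)' + FENCE);
const BARE_HEADER = /^tool_call:?\s*(\{[\s\S]*\})\s*$/m;
const BARE_JSON = /(\{[\s\S]*"name"[\s\S]*\})\s*$/;

// Parse a candidate JSON payload; return undefined if it is not a tool call.
function tryParse(json: string): ParsedResponse['toolCall'] {
  try {
    const value: unknown = JSON.parse(json);
    if (typeof value !== 'object' || value === null) return undefined;
    const record = value as Record<string, unknown>;
    const name = record['name'];
    if (typeof name !== 'string') return undefined;
    const rawArgs = record['args'];
    const args =
      typeof rawArgs === 'object' && rawArgs !== null
        ? (rawArgs as Record<string, unknown>)
        : {};
    return { name, args };
  } catch {
    return undefined; // malformed JSON: fall through to the next pattern
  }
}

function parseToolCallSketch(raw: string): ParsedResponse {
  for (const pattern of [FENCED, BARE_HEADER, BARE_JSON]) {
    const match = pattern.exec(raw);
    if (!match) continue;
    const toolCall = tryParse(match[1] ?? '');
    if (toolCall) {
      // Everything before the block is the narrative.
      return { narrative: raw.slice(0, match.index).trim(), toolCall };
    }
  }
  return { narrative: raw.trim() };
}
```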

`buildToolManifest(spec)` injects only the active set's tool definitions into the tool contract section of the system prompt, so the LLM in each encounter sees only the tools it can use.

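
A registry with per-encounter filtering needs little more than a map and a filter. The sketch below is a simplified assumption about how `registerTool`, `getActiveTools`, and `buildToolManifest` could fit together; the real signatures in `harness/toolRegistry.ts` and `harness/toolDispatcher.ts` are not shown in this document, the plugin shape is trimmed to two fields, and the manifest line format is invented for illustration. It also shows why a spec that lists only stale tool names ends up with an empty manifest.

```typescript
// Trimmed-down plugin shape (the real ToolPlugin also has args and a handler).
interface ToolPluginSketch {
  name: string;
  description: string;
}

// Minimal spec shape: only the optional `tools` list matters here.
interface SpecSketch {
  tools?: string[];
}

const registry = new Map<string, ToolPluginSketch>();

// Called once per plugin module at load time (side-effect registration).
function registerTool(plugin: ToolPluginSketch): void {
  registry.set(plugin.name, plugin);
}

// Empty or missing `tools` means "all registered tools".
function getActiveTools(spec: SpecSketch): ToolPluginSketch[] {
  const all = Array.from(registry.values());
  if (!spec.tools || spec.tools.length === 0) return all;
  const wanted = new Set(spec.tools);
  return all.filter((plugin) => wanted.has(plugin.name));
}

// Render only the active tools; the line format is hypothetical.
function buildToolManifest(spec: SpecSketch): string {
  return getActiveTools(spec)
    .map((plugin) => `- ${plugin.name}: ${plugin.description}`)
    .join('\n');
}
```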

---

## 4. Data Architecture

### 4.1 Redis (transient state)

| Key pattern | Value | TTL | Owner |
|---|---|---|---|
| `session:{threadId}` | `JSON.stringify(SessionState)` | `SESSION_TTL_HOURS` (12h) | `sessionManager` |
| `guild_threads:{guildId}` | Set of thread IDs | inherits | `sessionManager` |
| `players:{guildId}` (legacy design) | discordId → dndName | — | `playerRegistry` (current impl uses different scheme) |
| `raw.messages` | Redis stream | — | `graphmcp/ingest.ts` |

`SessionState` (`src/types/index.ts`) is the central shape:

```ts
{
  encounterId, threadId, guildId,
  spec: EncounterSpec,
  players: Record<discordId, Player>,
  history: ChatMessage[],          // mix of pinned + sliding
  phase: 'open' | 'active' | 'resolved',
  heldMessages: HeldMessage[],     // for unregistered players
  outcome?, outcomeSummary?,
  npcMemories?: Record<npcId, string>,
  resolvedContext?: Record<key, string>,
  pendingSkillCheck?: { player, prompt, dc, messageId, modifier?, skill?, advantage?, disadvantage? },
  pendingSkillCheckAttempts?: number,
  createdAt, updatedAt,
}
```

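
Writing a session under `session:{threadId}` with a TTL maps onto a single Redis `SET ... EX` call. The helper below is a sketch that only builds the arguments, so it can be checked without a Redis server; `sessionWriteArgs` and `SessionStateSketch` are hypothetical names, and the 12-hour default mirrors the table above. With ioredis the result could be passed along as `redis.set(key, value, 'EX', seconds)`.

```typescript
// Hypothetical minimal session shape; the real SessionState has more fields.
interface SessionStateSketch {
  threadId: string;
  updatedAt: number;
  [key: string]: unknown;
}

// Arguments for a SET-with-expiry call: key, JSON value, 'EX', TTL in seconds.
type SetWithExpiryArgs = [key: string, value: string, mode: 'EX', seconds: number];

// Build the Redis write for one session; ttlHours defaults to the documented 12h.
function sessionWriteArgs(state: SessionStateSketch, ttlHours = 12): SetWithExpiryArgs {
  const key = `session:${state.threadId}`;
  const value = JSON.stringify(state);
  const seconds = Math.round(ttlHours * 3600);
  return [key, value, 'EX', seconds];
}
```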

### 4.2 Filesystem (`data/`)

- `tally.json` — `{ [specName]: { runs, lastRun } }`. Incremented at each encounter start.
- `summaries/{encounterId}-{ISO timestamp}.txt` — one per resolved encounter, written by `encounterLog.writeSummary()`.

### 4.3 GraphMCP / Neo4j (via JSON-RPC)

The bot never queries Neo4j directly. All graph access goes through `GRAPHMCP_URL/mcp` with JSON-RPC 2.0:

| Tool | Args | Returns |
|---|---|---|
| `query_as_npc` | `npc_name, question, limit` | NPCQueryResult (chunks + graph_context) |
| `semantic_search` | `query, limit` | SemanticSearchResult |
| `log_encounter` | `title, participants, summary, location?, type?` | LogEncounterResult |
| `list_encounters` | `limit` | EncounterResultItem[] |
| `search_encounters` | `query?, location?, participant?, limit?` | EncounterResultItem[] |
| `get_encounter` | `id` | EncounterDetails |

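
For orientation, a JSON-RPC 2.0 request for one of the tools in the table could be assembled as below. This is a sketch under an explicit assumption: it uses the `tools/call` method with `{ name, arguments }` params, which is the convention in the Model Context Protocol, but this document does not confirm that GraphMCP uses that method name. `buildToolCallRequest` is a hypothetical helper; the resulting object would be sent as the JSON body of a POST to `GRAPHMCP_URL/mcp`.

```typescript
// JSON-RPC 2.0 envelope. The method and params layout are assumptions (MCP style).
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
}

// Monotonic request id so responses can be matched to requests.
let nextRequestId = 1;

// Wrap a tool name and its arguments in a JSON-RPC 2.0 request object.
function buildToolCallRequest(tool: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call', // assumed method name
    params: { name: tool, arguments: args },
  };
}

// Example: the `query_as_npc` row from the table, with made-up argument values.
const exampleRequest = buildToolCallRequest('query_as_npc', {
  npc_name: 'Example NPC',
  question: 'What happened at the market?',
  limit: 3,
});
```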

NPC memory is injected into the system prompt via `formatNPCMemory()` — past encounters witnessed + top-3 lore chunks above `GRAPHMCP_SCORE_THRESHOLD`.

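
The chunk selection described above is a filter, a sort, and a slice. A minimal sketch follows, assuming "above" means strictly greater than the threshold and that each chunk carries a numeric `score`; `selectLoreChunks` and `LoreChunk` are hypothetical names, not the internals of `formatNPCMemory()`.

```typescript
// Hypothetical chunk shape: lore text plus a relevance score.
interface LoreChunk {
  text: string;
  score: number;
}

// Keep chunks scoring above the threshold, best first, capped at `limit`.
function selectLoreChunks(chunks: readonly LoreChunk[], threshold: number, limit = 3): LoreChunk[] {
  return chunks
    .filter((chunk) => chunk.score > threshold) // filter() copies, so sort() is safe
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```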

### 4.4 Context window budget

`src/types/index.ts` exports a `CONTEXT_BUDGET` constant used by both `contextAssembler` and `sessionManager`:

| Zone | Tokens |
|---|---|
| System prompt (narrator + NPCs + tools + goals) | 4,000 |
| Pinned (opening narrative, goal block) | 2,000 |
| Sliding history | 118,000 |
| Safety buffer | 3,500 |
| **Total allocated** | **127,500** (of the 128,000-token window) |

History trimming drops the oldest non-pinned turn pair when over budget, with a hard floor of 6 messages. Token estimates use `gpt-tokenizer` with a 1.15× buffer to approximate Gemma's tokenizer.

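
The trimming rule can be sketched as a loop that removes the oldest unpinned pair until the estimate fits or the floor is reached. This is an illustration, not the project's `trimHistory`: the token counter is injected as a parameter (the real code is said to use `gpt-tokenizer`), a "turn pair" is assumed to be the oldest unpinned message plus the message after it when that one is also unpinned, and the names `ChatMessageSketch`, `estimateTokens`, and `trimHistorySketch` are hypothetical.

```typescript
// Hypothetical history entry; `pinned` messages must survive trimming.
interface ChatMessageSketch {
  role: 'system' | 'user' | 'assistant';
  content: string;
  pinned?: boolean;
}

const MIN_MESSAGES = 6;    // hard floor from the text above
const TOKEN_BUFFER = 1.15; // safety multiplier from the text above

// Sum raw token counts and apply the buffer, rounding up.
function estimateTokens(
  messages: readonly ChatMessageSketch[],
  countTokens: (text: string) => number,
): number {
  const raw = messages.reduce((sum, message) => sum + countTokens(message.content), 0);
  return Math.ceil(raw * TOKEN_BUFFER);
}

// Drop the oldest unpinned turn pair until under budget or at the floor.
function trimHistorySketch(
  history: readonly ChatMessageSketch[],
  budget: number,
  countTokens: (text: string) => number,
): ChatMessageSketch[] {
  const kept = history.slice(); // never mutate the caller's array
  while (kept.length > MIN_MESSAGES && estimateTokens(kept, countTokens) > budget) {
    const first = kept.findIndex((message) => !message.pinned);
    if (first === -1) break; // only pinned messages remain
    const next = kept[first + 1];
    const pair = next !== undefined && !next.pinned ? 2 : 1;
    // Never remove more than the floor allows.
    kept.splice(first, Math.min(pair, kept.length - MIN_MESSAGES));
  }
  return kept;
}
```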

---

## 5. API Surface

This project exposes its functionality through **two different APIs**:

### 5.1 Discord slash commands (player/admin surface)

Registered via `src/scripts/deploy-commands.ts` using Discord REST v10.

| Command | Subcommands | Purpose |
|---|---|---|
| `/dndname` | `set <name>`, `show`, `clear` | Character name registration |
| `/character` | `register foundry\|custom`, `show`, `view`, `clear`, `admin list\|remove\|give` | Full character profile + Foundry link |
| `/encounter` | `start <spec>`, `random`, `status`, `stats`, `audit`, `end [notes]`, `list`, `generate <theme>`, `spec` | Encounter session lifecycle |
| `/encounters` | (Select menu + search modal) | Search the encounter log via GraphMCP |
| `/roll` | `action` | Manual dice roll |
| `/actions` | — | In-character action shortcuts |
| `/turn` | — | Turn management |
| `/xp` | `award <amount>` | Award XP (relay → VTT) |

Plus button + modal interactions: skill-check roll buttons, give item, custom character registration, Foundry link, encounter select menu, search modal.

### 5.2 Tool plugins (LLM surface)

Defined in `src/harness/tools/` and registered at module load. Each spec filters the active set via its `tools:` array.

| Tool | Purpose | Args |
|---|---|---|
| `skill_check_emit` | Posts a dice-roll embed to the thread; blocks player input until resolved | `player, prompt, skill?, dc, advantage?, disadvantage?` |
| `encounter_resolve` | Marks encounter complete; writes summary; archives thread | (args handled in `tools/encounterResolve.ts`) |
| `context_recall` | Look up canonical session facts stored in `resolvedContext` | |
| `goal_register` | Add a new goal mid-encounter (the `prd.md` "dynamic goal registration" feature) | |
| `foundry_lookup` | Pull live character data from VTT relay | |
| `foundry_reward` | Award XP/items to a character via VTT | |

> ⚠ Note: `Docs/mardonar-encounter-engine.md` lists `skill_check_resolve`, `event_log_append`, `npc_memory_read`, and `npc_memory_write` as tools. These have been **removed** — replaced by the per-encounter event log + GraphMCP `log_encounter` tool. The current tool set is the one above.

---

## 6. Deployment Architecture

### 6.1 Local development

```bash
docker compose -f docker-compose.dev.yml up -d   # Redis + Neo4j
npm install
npm run deploy-commands                          # registers slash commands with Discord
npm run dev                                      # tsx watch mode
```

### 6.2 Production (multi-stage Dockerfile)

`Dockerfile` (Node 22 alpine):

1. **Builder stage** — `npm ci --ignore-scripts`, copy `src` + `tsconfig.json`, `npm run build` → `dist/`
2. **Runtime stage** — `npm ci --omit=dev --ignore-scripts`, copy `dist/`, `specs/`, `lore/`, `persona.yaml`
3. `CMD ["node", "dist/bot/index.js"]`

`docker-compose.dev.yml` defines two services: `deploy-commands` (one-shot) and `bot` (long-running, with `data/` mounted as a volume). Both attach to the external `mardonar-internal` Docker network, which also hosts Redis and an MCP server from the GraphMCP-Example stack.

> **Gap:** There is no production `docker-compose.yml`. The `.env.example` is the source of truth for runtime config.

### 6.3 Operational

- Session state has a 12h TTL by default — stale encounters auto-expire
- Bot connects to Redis on `main()` startup (`redis.connect()`)
- VTT relay auto-spins up a headless Foundry session on connection failure (RSA-OAEP encrypted handshake)
- `LOG_LEVEL=info` in prod; pino writes structured JSON

---

## 7. Development & Testing

### 7.1 Local commands

| Command | Effect |
|---|---|
| `npm run dev` | `tsx watch src/bot/index.ts` — auto-reload dev |
| `npm run build` | `tsc` → `dist/` |
| `npm run start` | `node dist/bot/index.js` |
| `npm run deploy-commands` | One-shot slash command registration |
| `npm run test` | All tests (vitest) |
| `npm run test:unit` | Unit tests only (no external services) |
| `npm run test:int` | Integration tests (requires Docker services) |

### 7.2 Test coverage

- 21 unit test files in `tests/unit/`
- 1 integration test (`tests/integration/phase1.test.ts`)
- `tests/fixtures/spec.ts` — shared encounter spec fixture

Notable test surfaces: `promptBuilder`, `contextAssembler`, `toolParser`, `toolDispatcher`, `sessionManager`, `playerRegistry`, `characterRegistry`, `specLoader`, `rollHandler`, `rollDetection`, `responseFilter`, `queueCap`, `generationQueue`, `reactionManager`, `encounterLog`, `encounterDiscoveryEmbed`, `loreAnswerEmbed`, `skillCheckEmbed`, `graphmcpClient`, `foundryClientRetry`, `foundryClientFormatters`, `goalRegister`, `relaySession`.

---

## 8. Design Decisions (Living)

| Decision | Why |
|---|---|
| **LiteLLM as primary, Ollama as fallback** | OpenAI-compatible proxy gives model flexibility without code changes; Ollama fallback ensures the bot still runs when the proxy is down |
| **Prompt-based tool calls (not native)** | Gemma 4 IT at e2b is unreliable with native function calling; fenced JSON block parsing is deterministic |
| **Tool plugin registry with per-spec active set** | New tools can be added without touching the dispatch core; specs opt into only the tools they need |
| **Pinned + sliding history** | Opening narrative and goal block must survive trimming or the LLM loses its anchor |
| **Goals in system prompt, not as a tool** | Goals rarely change mid-encounter; embedding them reduces tool round-trips |
| **Redis for active state, GraphMCP for memory** | Redis is fast and ephemeral for live sessions; the graph holds long-term NPC lore |
| **Player name gate via embed, not DMs** | Keeps the conversation in-thread; ephemeral embed auto-deletes after 30s |
| **Story generator via `/encounter generate`** | Separates creative authoring from real-time inference — generator can use a stronger model later |
| **VTT relay auto-spin-up** | Lets the bot operate when the relay has been cold-stopped; uses RSA-OAEP for password handoff |
| **In-world voice rule for player-facing strings** | See `feedback-in-world-voice` — no utility/jargon in bot messages |

---

## 9. Open Issues / Drift

Items the deep scan surfaced that aren't bugs but should be tracked:

- **Drift: `Docs/mardonar-encounter-engine.md` describes a Go bot with an embedded MCP layer; the actual code is TypeScript with an external JSON-RPC GraphMCP server.** Treat the doc as historical/aspirational.
- **Drift: `README.md`'s "Project Structure" tree references `src/mcp/` and the old `src/bot/commands/{dndname,encounter}.ts` layout.** Update README, or trim it to a pointer to the index.
- **Duplicate `trimHistory` logic** in `src/session/sessionManager.ts` and `src/harness/contextAssembler.ts` (identical body). Could be extracted to `src/lib/historyTrim.ts`.
- **No production compose file** — only `docker-compose.dev.yml`. The Dockerfile is production-ready but deployment is ad-hoc.
- **No CI/CD** — `.github/workflows/` does not exist.
- **`DISCORD_ALLOWED_USERS` is empty by default → anyone in allowed channels can run `/encounter start`.** The access control is channel-scoped, not user-scoped; admins need to set the env var explicitly.
- **`OLLAMA_BASE_URL` defaults to `localhost`** — fine for dev, but production needs the LAN IP or proxy URL set.
- **Spec tool list must be kept in sync** — `specs/*.yaml` declare `tools: [...]`, but no test verifies every referenced tool is registered. A stale spec name silently filters to no active tools.
- **Schema mismatch risk:** `types/index.ts` `EncounterSpec` and `spec/loader.ts` Zod schema have diverged slightly — `EncounterSpec` is missing `tone`, `tools`, `randomizable`, and `npcs.nameKey`. `assembleContext` reads `spec.tone`; `loader` doesn't validate it. Consider regenerating `types/index.ts` from the Zod schema via `z.infer`.

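
The "spec tool list" item above could be closed with a small check that a unit test runs over every loaded spec. The function below is a sketch of that check only; `findUnregisteredTools` is a hypothetical helper, and the caller would be responsible for supplying each spec's `tools` array and the set of registered plugin names.

```typescript
// Map each spec name to the tool names it references that are not registered.
// Specs with no problems are omitted from the result.
function findUnregisteredTools(
  specTools: Record<string, string[] | undefined>,
  registered: ReadonlySet<string>,
): Record<string, string[]> {
  const problems: Record<string, string[]> = {};
  for (const [specName, tools] of Object.entries(specTools)) {
    // A spec without a `tools` list opts into everything, so nothing can be stale.
    const missing = (tools ?? []).filter((tool) => !registered.has(tool));
    if (missing.length > 0) problems[specName] = missing;
  }
  return problems;
}
```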

---

*Document generated by `bmad-document-project` initial scan, deep level. Project state recorded in `docs/project-scan-report.json`.*

134
docs/component-inventory.md
Normal file
@@ -0,0 +1,134 @@
# Component Inventory

> Reusable and feature-specific components in the Mardonar Encounter Engine. Generated 2026-06-19.

## Discord Components

### Slash commands (reusable across all guilds)

| Component | File | Reusable? | Notes |
|---|---|---|---|
| `/dndname` | `src/bot/commands/dndname.ts` | Yes | Character name gate. Universal. |
| `/encounter` | `src/bot/commands/encounter.ts` | Yes | Encounter lifecycle. Spec-scoped via `start <spec>`. |
| `/character` | `src/bot/commands/character.ts` | Yes | Full character profile + Foundry link. |
| `/roll` | `src/bot/commands/roll.ts` | Yes | Manual roll outside encounter. |
| `/actions` | `src/bot/commands/actions.ts` | Yes | In-character action shortcuts. |
| `/xp` | `src/bot/commands/xp.ts` | Yes | XP grant. |
| `/encounters` | `src/bot/commands/encounters.ts` | Yes | Search via GraphMCP. |
| `/turn` | `src/bot/commands/turn.ts` | Yes | Turn management. |

### Embeds (pure builders, reusable)

| Embed | File | Caller |
|---|---|---|
| PlayerGate | `src/bot/embeds/playerGate.ts` | `messageRouter` (unregistered player) |
| Suspense + SkillCheck | `src/bot/embeds/skillCheck.ts` | `tools/skillCheckEmit.ts` |
| Resolution | `src/bot/embeds/resolution.ts` | `tools/encounterResolve.ts` |
| EncounterDiscovery | `src/bot/embeds/encounterDiscovery.ts` | `/encounters` |
| LoreAnswer | `src/bot/embeds/loreAnswer.ts` | `mentionHandler` |

### Event handlers (reusable, sidecar logic)

| Handler | File | Trigger | Side effects |
|---|---|---|---|
| `handleMessage` | `handlers/messageRouter.ts` | `messageCreate` in encounter thread | Gates, debounce, LLM call, tool dispatch |
| `handleMention` | `handlers/mentionHandler.ts` | `messageCreate` @Zalram | Lore search + persona reply |
| `handleRollInteraction` | `handlers/rollHandler.ts` | Button / modal submit | Resolves skill check, schedules LLM turn |
| `scheduleEncounterLLMTurn` | `handlers/messageRouter.ts` | Internal | Debounce → LLM turn |
| `scheduleLLMTurn` | `handlers/generationQueue.ts` | Internal | Debounce timer |
| `isBurstCapped` / `sendDropNotice` | `handlers/queueCap.ts` | Pre-append check | Drops + notifies |
| `registerScheduled` / `drainPending` / `upgradeToProcessing` / `upgradeToComplete` | `handlers/reactionManager.ts` | Per-message | 👀 reaction lifecycle |
| `filterLLMResponse` / `detectMissedSkillCheck` | `handlers/responseFilter.ts` | Post-LLM | Injects `[FILTER CORRECTION]` |

## LLM Harness Components

### Tool plugins (registered globally, filtered per-encounter)

| Plugin | File | Per-encounter filter | Side effects |
|---|---|---|---|
| `skill_check_emit` | `harness/tools/skillCheckEmit.ts` | Spec `tools:` | Posts suspense + dice embed; updates `pendingSkillCheck` |
| `encounter_resolve` | `harness/tools/encounterResolve.ts` | Spec `tools:` | Writes summary, archives thread |
| `context_recall` | `harness/tools/contextRecall.ts` | Spec `tools:` | Returns canonical facts from `resolvedContext` |
| `goal_register` | `harness/tools/goalRegister.ts` | Spec `tools:` | Adds dynamic goal (per `prd.md`) |
| `foundry_lookup` | `harness/tools/foundryLookup.ts` | Spec `tools:` | Live VTT actor data |
| `foundry_reward` | `harness/tools/foundryReward.ts` | Spec `tools:` | XP/item grant to VTT actor |

### LLM clients

| Client | File | Role |
|---|---|---|
| `llmClient` (router) | `harness/llmClient.ts` | LiteLLM primary, Ollama fallback |
| `litellmClient` | `harness/litellmClient.ts` | OpenAI-compatible HTTP |
| `ollamaClient` | `harness/ollamaClient.ts` | Native ollama npm + direct HTTP |

### Pipeline components

| Component | File | Role |
|---|---|---|
| `buildSystemPrompt` | `harness/promptBuilder.ts` | 10-block XML system prompt |
| `assembleContext` | `harness/contextAssembler.ts` | System + pinned + trimmed sliding |
| `parseToolCall` | `harness/toolParser.ts` | 3-pattern tool block extractor |
| `buildToolManifest` | `harness/toolDispatcher.ts` | Per-encounter tool contract section |
| `dispatchTool` | `harness/toolDispatcher.ts` | Active-set validation + dispatch |
| `getActiveTools` | `harness/toolRegistry.ts` | Per-encounter filter (or all if unset) |
| `registerTool` | `harness/toolRegistry.ts` | Side-effect registration at module load |

## Session / Data Components

| Component | File | Backend | Surface |
|---|---|---|---|
| `sessionManager` | `session/sessionManager.ts` | Redis (TTL 12h) | `create`, `get`, `update`, `addMessage`, `delete`, `getGuildThreadIds` |
| `playerRegistry` | `session/playerRegistry.ts` | Redis | `(guildId, discordId) → Player` |
| `characterRegistry` | `session/characterRegistry.ts` | Redis | Character profile (pronouns, Foundry UUID, etc.) |
| `encounterLog` | `session/encounterLog.ts` | Filesystem | `tally.json` + per-encounter `.txt` in `data/summaries/` |
| `xpAwarder` | `session/xpAwarder.ts` | VTT relay | XP grant |
| `redis` singleton | `db/redis.ts` | ioredis | Lazy connect, 3 retries |
| `loadSpec` | `spec/loader.ts` | YAML + Zod | `EncounterSpecSchema.parse` |
| `loadPersona` | `persona/loader.ts` | YAML | @Zalram persona |

## GraphMCP Components

| Component | File | RPC method | Used by |
|---|---|---|---|
| `queryAsNPC` | `graphmcp/client.ts` | `query_as_npc` | NPC memory injection at session start |
| `semanticSearch` | `graphmcp/client.ts` | `semantic_search` | @mention lore search |
| `logEncounter` | `graphmcp/client.ts` | `log_encounter` | Encounter resolve (writes graph node) |
| `listEncounters` | `graphmcp/client.ts` | `list_encounters` | `/encounters list` |
| `searchEncounters` | `graphmcp/client.ts` | `search_encounters` | `/encounters search` |
| `getEncounter` | `graphmcp/client.ts` | `get_encounter` | `/encounters get` |
| `formatNPCMemory` | `graphmcp/client.ts` | (local) | Render NPCQueryResult as system-prompt text |
| `publishToGraphMCP` | `graphmcp/ingest.ts` | (Redis stream `raw.messages`) | Fire-and-forget per encounter message |
| `vocabularyResolver` | `graphmcp/vocabularyResolver.ts` | graphmcp | `randomizable:` lookup |
| `loreResolver` | `graphmcp/loreResolver.ts` | graphmcp | `/encounter generate` helper |

## VTT Components

| Component | File | Role |
|---|---|---|
| `foundryClient` | `vtt/foundryClient.ts` | HTTP client, live actor data + formatters |
| `relaySession.ensureRelaySession` | `vtt/relaySession.ts` | Auto-spin-up headless session on relay failure |
| `isRelayDown` | `vtt/relaySession.ts` | Network-failure classifier |
| `actorCache` (in `tools/skillCheckEmit.ts`) | in-file | 30s in-memory cache for actor details |

## Type system (shared)

| Type | File | Purpose |
|---|---|---|
| `EncounterSpec` | `types/index.ts` | Spec shape (note: diverged slightly from Zod schema — see architecture.md §9) |
| `NpcPersona` | `types/index.ts` | NPC definition |
| `EncounterGoal` / `EncounterGoals` | `types/index.ts` | Primary/secondary goals |
| `SessionState` | `types/index.ts` | Full session shape |
| `ChatMessage` | `types/index.ts` | History turn (with `pinned` flag) |
| `HeldMessage` | `types/index.ts` | Pre-registration messages |
| `ToolCallBlock` / `LLMResponse` | `types/index.ts` | LLM tool surface |
| `ToolName` | `types/index.ts` | Discriminated union of valid tools |
| `*Args` per tool | `types/index.ts` | Per-tool arg types |
| `NpcNode` / `EncounterNode` / `EncounterEventNode` | `types/index.ts` | Neo4j graph node types |
| `CONTEXT_BUDGET` (const) | `types/index.ts` | Hard token budget zones |

## Config & logging

| Component | File | Role |
|---|---|---|
| `config` (singleton) | `config.ts` | Zod-validated env (Discord, Redis, LiteLLM, Ollama, GraphMCP, VTT, persona, logging) |
| `log` (pino wrapper) | `lib/logger.ts` | Structured logging with `pino-pretty` in dev |

212
docs/data-models.md
Normal file
@@ -0,0 +1,212 @@
# Data Models

> Persistent and transient data shapes in the Mardonar Encounter Engine. Generated 2026-06-19.

The bot's data lives in three places: Redis (transient session state), the filesystem (`data/`, runtime artifacts), and the GraphMCP-backed Neo4j graph (long-term NPC memory + encounter history). The bot does not query Neo4j directly — it goes through the GraphMCP JSON-RPC client.

## Encounter spec (YAML → Zod → TypeScript)

Defined by `EncounterSpecSchema` in `src/spec/loader.ts`. Loaded by `/encounter start <spec-name>`. Stored in `SessionState.spec`.

```ts
{
  encounterId: string,              // unique ID — also Neo4j node key
  title: string,                    // display name in Discord embeds
  tone?: string,                    // "tense" | "comedic" | ... optional flavor block
  setting: {
    location: string,
    mood: string,                   // multi-line OK
    ambientNpcs: string,            // multi-line OK
  },
  openingNarrative: string,         // multi-line; can reference {{nameKey}} placeholders
  npcs: [{                          // 1–5 entries
    id: string,                     // unique stable ID
    name: string,
    nameKey?: string,               // placeholder for randomizable substitution
    role: string,
    persona: string,                // multi-line
    memoryKey?: string,             // if set, memory is loaded from / written to graph
  }],
  goals: {
    hidden: boolean,                // default true
    primary: [{ id: string, label: string }],   // min 1
    secondary: [{ id: string, label: string }],
  },
  sportsmanshipRules: string[],
  skillChecks: Record<string, number | string>, // grouped as <name>_dc / <name>_skill / <name>_note
  randomizable?: [{                 // optional
    key: string,
    source?: 'graphmcp' | 'vocabulary',
    category?: string,              // e.g. "names.dwarf.female"
    query: string,                  // free-text query
    fallback: string,               // always available
  }],
  dmNotes?: string,
  tools?: string[],                 // active tool plugin names; empty/undefined = all
}
```

|
||||
`tone` and `tools` are read by the harness but **not in the Zod schema** (see `architecture.md §9` for the schema-vs-types drift).
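
The flat `skillChecks` record is easier to consume once the `<name>_dc / <name>_skill / <name>_note` keys are grouped by name. The following is a minimal sketch of that grouping; `GroupedSkillCheck` and `groupSkillChecks` are hypothetical names for illustration, not functions from the codebase.

```ts
// Hypothetical shape for one grouped check (illustrative, not from the source).
interface GroupedSkillCheck {
  name: string;
  dc?: number;
  skill?: string;
  note?: string;
}

// Groups flat `<name>_dc / <name>_skill / <name>_note` keys by their shared prefix.
function groupSkillChecks(flat: Record<string, number | string>): GroupedSkillCheck[] {
  const byName = new Map<string, GroupedSkillCheck>();
  for (const [key, value] of Object.entries(flat)) {
    const match = /^(.+)_(dc|skill|note)$/.exec(key);
    if (!match) continue; // skip keys that do not follow the naming convention
    const name = match[1]!;
    const field = match[2]!;
    const entry = byName.get(name) ?? { name };
    if (field === "dc") entry.dc = Number(value);
    else if (field === "skill") entry.skill = String(value);
    else entry.note = String(value);
    byName.set(name, entry);
  }
  // Map preserves insertion order, so checks come back in spec order.
  return Array.from(byName.values());
}
```

Names that themselves contain underscores (for example `pick_lock_dc`) still group correctly because only the final suffix is treated as the field.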
|
||||
|
||||
## SessionState (Redis)
|
||||
|
||||
Stored as JSON under key `session:{threadId}`. Schema in `src/types/index.ts`:
|
||||
|
||||
```ts
|
||||
{
|
||||
encounterId: string,
|
||||
threadId: string, // Discord thread snowflake
|
||||
guildId: string,
|
||||
spec: EncounterSpec,
|
||||
players: Record<discordId, Player>,
|
||||
history: ChatMessage[], // pinned + sliding mix
|
||||
phase: 'open' | 'active' | 'resolved',
|
||||
heldMessages: HeldMessage[], // for unregistered players
|
||||
outcome?: string, // goal ID when resolved
|
||||
outcomeSummary?: string,
|
||||
npcMemories?: Record<npcId, string>, // injected into system prompt
|
||||
resolvedContext?: Record<key, string>, // canonical session facts (context_recall)
|
||||
pendingSkillCheck?: {
|
||||
player: string,
|
||||
prompt: string,
|
||||
dc: number,
|
||||
messageId: string, // Discord message ID of the dice embed
|
||||
modifier?: number,
|
||||
skill?: string,
|
||||
advantage?: boolean,
|
||||
disadvantage?: boolean,
|
||||
},
|
||||
pendingSkillCheckAttempts?: number,
|
||||
createdAt: number,
|
||||
updatedAt: number,
|
||||
}
|
||||
```
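
As a rough illustration of the `session:{threadId}` key convention and the 12-hour TTL, the sketch below builds the arguments for one Redis write. `toRedisWrite` is a hypothetical helper, and refreshing `updatedAt` and the TTL on every save is an assumption rather than confirmed behaviour.

```ts
// Slice of SessionState needed here; the full shape is listed above.
interface SessionLike {
  threadId: string;
  updatedAt: number;
}

// Redis key for a thread's session, following the `session:{threadId}` convention.
function sessionKey(threadId: string): string {
  return `session:${threadId}`;
}

// Converts a TTL in hours (SESSION_TTL_HOURS, default 12) to whole seconds.
function ttlSeconds(hours: number): number {
  return Math.round(hours * 3600);
}

// Hypothetical helper: builds key, JSON value, and TTL for one write.
// Assumption: every save refreshes `updatedAt` and the TTL.
function toRedisWrite<T extends SessionLike>(state: T, now: number, ttlHours = 12) {
  const next = { ...state, updatedAt: now };
  return {
    key: sessionKey(next.threadId),
    value: JSON.stringify(next),
    ttl: ttlSeconds(ttlHours),
  };
}
```

With ioredis, the result could be passed along as `redis.set(key, value, "EX", ttl)`.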
|
||||
|
||||
## ChatMessage
|
||||
|
||||
```ts
|
||||
{
|
||||
role: 'system' | 'user' | 'assistant',
|
||||
content: string,
|
||||
pinned?: boolean, // never trimmed by contextAssembler
|
||||
timestamp: number,
|
||||
}
|
||||
```
|
||||
|
||||
System messages are emitted by the harness for tool results, filter corrections, and join events. Assistant messages contain the LLM's narrative.
|
||||
|
||||
## Player
|
||||
|
||||
```ts
|
||||
{
|
||||
discordId: string,
|
||||
dndName: string,
|
||||
pronouns?: string, // populated from characterRegistry if set
|
||||
}
|
||||
```
|
||||
|
||||
`pronouns` is added on first appearance in an encounter thread if the player has a `characterRegistry` profile.
|
||||
|
||||
## Character profile (characterRegistry)
|
||||
|
||||
```ts
|
||||
{
|
||||
discordId: string,
|
||||
guildId: string,
|
||||
dndName: string,
|
||||
pronouns?: string,
|
||||
characterClass?: string,
|
||||
race?: string,
|
||||
level?: number,
|
||||
backstory?: string,
|
||||
foundryActorUuid?: string, // link to Foundry VTT actor
|
||||
inventory?: unknown[], // populated from /character view
|
||||
spells?: unknown[], // populated from /character view
|
||||
// ... additional Foundry-derived fields
|
||||
}
|
||||
```
|
||||
|
||||
## Neo4j graph (via GraphMCP)
|
||||
|
||||
The bot does not directly define the Neo4j schema; it consumes whatever GraphMCP returns. The conceptual model, based on the GraphMCP client types and the legacy design doc, is:
|
||||
|
||||
```
|
||||
(:NPC {id, name, persona_summary, memory: [], last_seen_encounter})
|
||||
-[:APPEARED_IN]->
|
||||
(:Encounter {id, title, resolved, outcome_id, created_at})
|
||||
-[:HAS_EVENT]->
|
||||
(:EncounterEvent {timestamp, type, description})
|
||||
-[:FEATURED]->
|
||||
(:Entity {name, kind})
|
||||
|
||||
(:Player {discord_id, dnd_name})
|
||||
-[:PARTICIPATED_IN]->
|
||||
(:Encounter)
|
||||
```
|
||||
|
||||
The bot writes to the graph via `log_encounter` (one encounter node + participants). It reads NPC memory via `query_as_npc` and the broader corpus via `semantic_search`.
|
||||
|
||||
## File system (`data/`)
|
||||
|
||||
```
|
||||
data/
|
||||
├── tally.json // { [specName]: { runs: number, lastRun: ISO8601 } }
|
||||
└── summaries/
|
||||
└── {encounterId}-{ISO8601-with-dashes}.txt
|
||||
// human-readable per-encounter summary
|
||||
// header: Encounter, ID, Thread, Date, Outcome, Players
|
||||
// body: free-text Summary
|
||||
```
|
||||
|
||||
`tally.json` is rewritten atomically on each encounter start. Summary files are append-only.
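
The two file shapes can be sketched as pure functions, leaving the actual disk writes to the caller. `bumpTally` and `summaryFileName` are hypothetical names, and reading "ISO8601-with-dashes" as "colons and dots replaced by dashes" is an assumption.

```ts
interface TallyEntry {
  runs: number;
  lastRun: string; // ISO 8601 timestamp
}
type Tally = Record<string, TallyEntry>;

// Returns a new tally with the spec's run count incremented; the input is not mutated.
function bumpTally(tally: Tally, specName: string, now: Date): Tally {
  const runs = (tally[specName]?.runs ?? 0) + 1;
  return { ...tally, [specName]: { runs, lastRun: now.toISOString() } };
}

// Builds `{encounterId}-{timestamp}.txt`.
// Assumption: ":" and "." in the ISO timestamp are replaced so the name is filesystem-safe.
function summaryFileName(encounterId: string, now: Date): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `${encounterId}-${stamp}.txt`;
}
```

One common way to make the rewrite atomic is to write the serialized tally to a temporary file in the same directory and then rename it over `tally.json`; whether the bot does exactly this is not stated here.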
|
||||
|
||||
## Tool call payloads
|
||||
|
||||
```ts
|
||||
// What the LLM emits
|
||||
type ToolCallBlock = {
|
||||
tool: ToolName,
|
||||
args: Record<string, unknown>,
|
||||
}
|
||||
|
||||
// What the harness parses back from the LLM response
|
||||
type LLMResponse = {
|
||||
narrative: string,
|
||||
toolCall?: ToolCallBlock,
|
||||
rawTokensUsed?: number,
|
||||
}
|
||||
```
|
||||
|
||||
Tool names (`src/types/index.ts`):
|
||||
|
||||
```ts
|
||||
type ToolName =
|
||||
| 'skill_check_emit'
|
||||
| 'skill_check_resolve' // (defined in types but no longer registered — see architecture.md §9)
|
||||
| 'event_log_append' // (defined in types but no longer registered)
|
||||
| 'npc_memory_read' // (defined in types but no longer registered)
|
||||
| 'npc_memory_write' // (defined in types but no longer registered)
|
||||
| 'encounter_resolve'
|
||||
| 'goal_register'
|
||||
| 'context_recall'
|
||||
| 'foundry_lookup'
|
||||
| 'foundry_reward';
|
||||
```
|
||||
|
||||
Four entries (`skill_check_resolve`, `event_log_append`, `npc_memory_read`, `npc_memory_write`) are **dead** in the current implementation, replaced by GraphMCP `log_encounter` and other RPC calls. They should be removed from the type union (or re-implemented) to avoid confusion.
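
One possible way to keep the union and the registry in step until the cleanup happens is to derive a narrower type and a runtime guard. `DeadToolName`, `RegisteredToolName`, `REGISTERED_TOOLS`, and `isRegisteredTool` are hypothetical names introduced for this sketch.

```ts
type ToolName =
  | 'skill_check_emit'
  | 'skill_check_resolve'
  | 'event_log_append'
  | 'npc_memory_read'
  | 'npc_memory_write'
  | 'encounter_resolve'
  | 'goal_register'
  | 'context_recall'
  | 'foundry_lookup'
  | 'foundry_reward';

// Names that remain in the union but are described as no longer registered.
type DeadToolName =
  | 'skill_check_resolve'
  | 'event_log_append'
  | 'npc_memory_read'
  | 'npc_memory_write';

// The six names that are still live.
type RegisteredToolName = Exclude<ToolName, DeadToolName>;

const REGISTERED_TOOLS: readonly RegisteredToolName[] = [
  'skill_check_emit',
  'encounter_resolve',
  'goal_register',
  'context_recall',
  'foundry_lookup',
  'foundry_reward',
];

// Runtime guard for names parsed out of an LLM response.
function isRegisteredTool(name: string): name is RegisteredToolName {
  return (REGISTERED_TOOLS as readonly string[]).includes(name);
}
```

Typing `REGISTERED_TOOLS` against `RegisteredToolName` means the compiler rejects any attempt to list a dead name there.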
|
||||
|
||||
## Context budget (compile-time const)
|
||||
|
||||
`src/types/index.ts`:
|
||||
|
||||
```ts
|
||||
export const CONTEXT_BUDGET = {
|
||||
SYSTEM: 4_000,
|
||||
PINNED: 2_000,
|
||||
HISTORY: 118_000,
|
||||
SAFETY: 3_500,
|
||||
TOTAL: 128_000,
|
||||
} as const;
|
||||
```
|
||||
|
||||
Used by `contextAssembler` and `sessionManager` to enforce the trimming policy.
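
To make the trimming policy concrete, here is a minimal sketch of how a history budget might be enforced while never dropping pinned messages. It is not the actual `contextAssembler` logic: `estimateTokens` (four characters per token) and `trimHistory` are assumptions for illustration, and pinned messages are assumed to be charged to the `PINNED` zone rather than `HISTORY`. Note that the four zones listed sum to 127,500, which is 500 below `TOTAL`.

```ts
const CONTEXT_BUDGET = {
  SYSTEM: 4_000,
  PINNED: 2_000,
  HISTORY: 118_000,
  SAFETY: 3_500,
  TOTAL: 128_000,
} as const;

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  pinned?: boolean;
  timestamp: number;
}

// Assumption: a crude estimate of roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keeps every pinned message, plus the newest unpinned messages that fit the budget.
function trimHistory(
  history: ChatMessage[],
  historyBudget: number = CONTEXT_BUDGET.HISTORY,
): ChatMessage[] {
  const keep = new Set<number>();
  let used = 0;
  let full = false;
  // Walk from newest to oldest so recent turns win over older ones.
  for (let i = history.length - 1; i >= 0; i--) {
    const msg = history[i]!;
    if (msg.pinned) {
      keep.add(i); // pinned messages are never trimmed
      continue;
    }
    if (full) continue;
    const cost = estimateTokens(msg.content);
    if (used + cost > historyBudget) {
      full = true; // everything older and unpinned is dropped from here on
      continue;
    }
    used += cost;
    keep.add(i);
  }
  // Filter by index to preserve the original chronological order.
  return history.filter((_, i) => keep.has(i));
}
```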
|
||||
219
docs/deployment-guide.md
Normal file
@@ -0,0 +1,219 @@
|
||||
# Deployment Guide
|
||||
|
||||
> Deploying the Mardonar Encounter Engine. Generated 2026-06-19.
|
||||
|
||||
## Architecture
|
||||
|
||||
The bot is a single long-running Node.js process. It connects to:
|
||||
|
||||
- **Discord** over WebSocket (discord.js v14)
|
||||
- **Redis** for session and player/character registries
|
||||
- **GraphMCP** (HTTP JSON-RPC) for NPC memory, lore search, and encounter log writes
|
||||
- **LiteLLM** (preferred) or **Ollama** for LLM inference
|
||||
- **VTT relay** (optional) for Foundry VTT integration
|
||||
|
||||
The Dockerfile is multi-stage Node 22 alpine. There is currently no production `docker-compose.yml` — only the dev one (`docker-compose.dev.yml`). Production deploys use the Dockerfile directly with whatever orchestrator is in use.
|
||||
|
||||
## Build
|
||||
|
||||
```bash
|
||||
npm ci --ignore-scripts
|
||||
npm run build # tsc → dist/
|
||||
```
|
||||
|
||||
The build is reproducible from a clean `node_modules`. The Dockerfile's builder stage does exactly this.
|
||||
|
||||
## Container image
|
||||
|
||||
`Dockerfile`:
|
||||
|
||||
- **Builder** (`node:22-alpine`): `npm ci --ignore-scripts`, copy `src` + `tsconfig.json`, run `npm run build`
|
||||
- **Runtime** (`node:22-alpine`): `npm ci --omit=dev --ignore-scripts`, copy `dist/`, `specs/`, `lore/`, `persona.yaml`
|
||||
- **CMD**: `["node", "dist/bot/index.js"]`
|
||||
|
||||
To build locally:
|
||||
|
||||
```bash
|
||||
docker build -t mardonar-bot:latest .
|
||||
```
|
||||
|
||||
The `data/` directory is not copied into the image — it must be mounted as a volume in production so tally and summaries persist across restarts.
|
||||
|
||||
## Local dev (Docker Compose)
|
||||
|
||||
`docker-compose.dev.yml` is the only compose file in the repo. It declares the `mardonar-internal` Docker network as `external: true` — it expects the GraphMCP-Example stack (Redis + MCP server) to be running first.
|
||||
|
||||
```bash
|
||||
docker compose -f docker-compose.dev.yml up -d
|
||||
docker compose -f docker-compose.dev.yml logs -f bot
|
||||
```
|
||||
|
||||
Two services:
|
||||
|
||||
- **`deploy-commands`** — one-shot container that runs `node dist/scripts/deploy-commands.js`. `restart: "no"`.
|
||||
- **`bot`** — long-running container. `restart: unless-stopped`. Mounts `./data:/app/data` so tally and summaries persist. `depends_on` lists `deploy-commands` with `condition: service_completed_successfully`, so commands are registered before the bot starts serving traffic.
|
||||
|
||||
## Production deployment
|
||||
|
||||
There is no production compose file. Pick one:
|
||||
|
||||
### Option A: Plain Docker
|
||||
|
||||
```bash
|
||||
docker build -t mardonar-bot:latest .
|
||||
docker run -d \
|
||||
--name mardonar-bot \
|
||||
--restart unless-stopped \
|
||||
--env-file .env \
|
||||
-v /var/lib/mardonar/data:/app/data \
|
||||
--network mardonar-internal \
|
||||
mardonar-bot:latest
|
||||
```
|
||||
|
||||
Register commands once before the bot serves traffic (either via the `deploy-commands` service or by running the same image with a different command):
|
||||
|
||||
```bash
|
||||
docker run --rm \
|
||||
--env-file .env \
|
||||
--network mardonar-internal \
|
||||
mardonar-bot:latest \
|
||||
node dist/scripts/deploy-commands.js
|
||||
```
|
||||
|
||||
### Option B: systemd (Linux host)
|
||||
|
||||
```ini
|
||||
# /etc/systemd/system/mardonar-bot.service
|
||||
[Unit]
|
||||
Description=Mardonar Encounter Engine
|
||||
After=network.target redis-server.service
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=mardonar
|
||||
WorkingDirectory=/opt/mardonar
|
||||
EnvironmentFile=/opt/mardonar/.env
|
||||
ExecStart=/usr/bin/node /opt/mardonar/dist/bot/index.js
|
||||
Restart=on-failure
|
||||
RestartSec=5
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
|
||||
```bash
|
||||
sudo systemctl daemon-reload
|
||||
sudo systemctl enable --now mardonar-bot
|
||||
sudo journalctl -u mardonar-bot -f
|
||||
```
|
||||
|
||||
## Environment
|
||||
|
||||
All runtime configuration is via environment variables, validated by Zod (`src/config.ts`). The full list is in [`development-guide.md`](./development-guide.md#environment-configuration-reference).
|
||||
|
||||
Production essentials:
|
||||
|
||||
```env
|
||||
DISCORD_TOKEN=...
|
||||
DISCORD_CLIENT_ID=...
|
||||
DISCORD_GUILD_ID=... # instant command registration
|
||||
|
||||
# Channel restriction: only respond in specific channels
|
||||
DISCORD_ALLOWED_CHANNELS=123456789012345678,987654321098765432
|
||||
# User restriction: only allow specific users to run /encounter
|
||||
DISCORD_ALLOWED_USERS=111111111111111111
|
||||
|
||||
# LiteLLM (preferred)
|
||||
LITELLM_BASE_URL=http://your-litellm-host:4000
|
||||
LITELLM_API_KEY=...
|
||||
LITELLM_MODEL=ollama-cloud
|
||||
|
||||
# Ollama fallback
|
||||
OLLAMA_BASE_URL=http://your-ollama-host:11434
|
||||
OLLAMA_MODEL=gemma4-it:e2b
|
||||
|
||||
# GraphMCP (must be reachable)
|
||||
GRAPHMCP_URL=http://mcp-server:9000
|
||||
GRAPHMCP_SCORE_THRESHOLD=0.68
|
||||
GRAPHMCP_INGEST_STREAM=raw.messages
|
||||
|
||||
# Persisted state
|
||||
DATA_DIR=/app/data # or wherever you mount the volume
|
||||
|
||||
# Logging
|
||||
LOG_LEVEL=info
|
||||
```
|
||||
|
||||
> ⚠ **Security note:** `DISCORD_ALLOWED_CHANNELS` is **empty by default**, which means the bot will respond in **no channels**. This is secure-by-default but easy to misconfigure. Set it explicitly.
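
The asymmetry between the two allow-lists (empty channels means no channels, empty users means all users) is easy to get wrong, so a small sketch may help. `parseIdList`, `isChannelAllowed`, and `isUserAllowed` are hypothetical names; the real checks live in the bot and may differ in detail.

```ts
// Parses a comma-separated env value into a set of IDs, ignoring blanks and whitespace.
function parseIdList(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

// Secure by default: an empty allow-list matches no channel.
function isChannelAllowed(channelId: string, allowed: Set<string>): boolean {
  return allowed.has(channelId);
}

// Open by default: an empty allow-list admits every user.
function isUserAllowed(userId: string, allowed: Set<string>): boolean {
  return allowed.size === 0 || allowed.has(userId);
}
```

For example, `parseIdList("123456789012345678, 987654321098765432")` yields a two-entry set, while `parseIdList(undefined)` yields an empty one that blocks every channel.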
|
||||
|
||||
## Persistent state
|
||||
|
||||
Two kinds of state to back up:
|
||||
|
||||
1. **`data/tally.json`** — per-spec run counts. Useful for analytics, not load-bearing.
|
||||
2. **`data/summaries/`** — one `.txt` per resolved encounter. Permanent record.
|
||||
|
||||
Session state lives in Redis with a 12h TTL. If Redis is wiped, in-flight sessions are lost but Discord threads themselves remain — the bot will simply not find a session for that thread on next message. No data corruption risk.
|
||||
|
||||
## Health checks
|
||||
|
||||
The bot does not currently expose an HTTP health endpoint. Suggested liveness probe patterns:
|
||||
|
||||
- **Discord WebSocket liveness** — the bot logs `[bot] Logged in as <tag>` on ready. Scrape stdout for this.
|
||||
- **Redis** — already externally monitored. The bot logs `[redis] connection error` on failure.
|
||||
- **GraphMCP** — first call after startup will fail loudly if unreachable.
|
||||
- **Custom probe** — call `/encounter status` in a known thread and check the response (the bot only responds in `DISCORD_ALLOWED_CHANNELS`).
|
||||
|
||||
A Docker `HEALTHCHECK` cannot easily probe the Discord WebSocket connection from outside the process. If you need an HTTP probe, add a small Express server in a future iteration that responds 200 while the Discord client is `ready` and Redis is connected.
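
The decision logic for such a probe can be sketched independently of the HTTP framework. `HealthDeps`, `HealthResult`, and `checkHealth` are hypothetical names, and nothing like this exists in the codebase yet.

```ts
// Callbacks are injected so the check stays testable without real connections.
interface HealthDeps {
  discordReady: () => boolean;   // e.g. () => client.isReady() with discord.js
  redisConnected: () => boolean; // e.g. () => redis.status === "ready" with ioredis
}

interface HealthResult {
  status: 200 | 503;
  body: { discord: boolean; redis: boolean };
}

// Returns 200 only when both dependencies report healthy, otherwise 503.
function checkHealth(deps: HealthDeps): HealthResult {
  const discord = deps.discordReady();
  const redis = deps.redisConnected();
  return {
    status: discord && redis ? 200 : 503,
    body: { discord, redis },
  };
}
```

An HTTP handler would then only need to call `checkHealth`, set the status code, and serialize `body` as JSON, which keeps the probe trivial to unit test.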
|
||||
|
||||
## Logging
|
||||
|
||||
The bot uses pino. In dev, `pino-pretty` formats to a human-readable stream. In prod, pino emits structured JSON to stdout — pipe to your log shipper (Loki, CloudWatch, etc.).
|
||||
|
||||
Useful fields to index:
|
||||
|
||||
- `level`, `time`, `msg`
|
||||
- `threadId`, `encounterId` (for encounter-specific queries)
|
||||
- `latencyMs` (for LLM and tool latency)
|
||||
- `error` (for failure analysis)
|
||||
|
||||
## Operational runbook
|
||||
|
||||
### Restart the bot
|
||||
```bash
|
||||
docker restart mardonar-bot
|
||||
# or: systemctl restart mardonar-bot
|
||||
```
|
||||
|
||||
### Rotate the Discord token
|
||||
1. Generate a new token in the Discord developer portal (the old token is invalidated immediately)
2. Update the env var (or secret store)
3. Restart the bot
|
||||
|
||||
### Re-register slash commands
|
||||
After changing any `src/bot/commands/*.ts`:
|
||||
```bash
|
||||
docker run --rm --env-file .env --network mardonar-internal mardonar-bot:latest \
|
||||
node dist/scripts/deploy-commands.js
|
||||
```
|
||||
|
||||
Or in dev: `npm run deploy-commands`
|
||||
|
||||
### Reset a stuck session
|
||||
A bot restart clears all in-memory state (including reaction managers and burst counters). Redis session state persists. If a session is genuinely stuck (e.g. a tool dispatched but the response was lost), use `/encounter end` in-thread to force-resolve.
|
||||
|
||||
### Drain Redis (nuclear option)
|
||||
```bash
|
||||
docker exec -it <redis-container> redis-cli FLUSHDB
|
||||
```
|
||||
|
||||
## Open deployment gaps
|
||||
|
||||
These are real but not blockers:
|
||||
|
||||
- **No production compose file** — only `docker-compose.dev.yml`. Production deploy is ad-hoc.
|
||||
- **No CD** — `.gitea/workflows/test.yml` runs the unit tests and a `tsc --noEmit` gate, but build and deploy are manual.
|
||||
- **No health endpoint** — no HTTP probe target.
|
||||
- **No metrics export** — pino logs are the only observability surface.
|
||||
- **`docker-compose.dev.yml` references an external Docker network (`mardonar-internal`)** — fine for the dev stack it's designed for, but a fresh deployment needs to either join the same network or remove the reference.
|
||||
193
docs/development-guide.md
Normal file
@@ -0,0 +1,193 @@
|
||||
# Development Guide
|
||||
|
||||
> How to set up, run, test, and develop the Mardonar Encounter Engine. Generated 2026-06-19.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- **Node.js 22+** (matches the Dockerfile runtime)
|
||||
- **Docker + Docker Compose** (for local Redis and Neo4j)
|
||||
- **Ollama** running somewhere reachable, with `gemma4-it:e2b` pulled — *or* a LiteLLM proxy (preferred, set `LITELLM_BASE_URL`)
|
||||
- **A Discord bot token and application ID** with a registered bot user
|
||||
- npm 10+
|
||||
|
||||
## First-time setup
|
||||
|
||||
```bash
|
||||
git clone <your-repo>
|
||||
cd mardonar-npcs
|
||||
npm install
|
||||
cp .env.example .env
|
||||
# Edit .env — at minimum set DISCORD_TOKEN, DISCORD_CLIENT_ID
|
||||
```
|
||||
|
||||
The `.env` file is validated by Zod (`src/config.ts`) at import time. A missing required var (e.g. `DISCORD_TOKEN`) will crash the bot on startup with a clear error.
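
To illustrate the fail-fast behaviour, the sketch below is a dependency-free stand-in for the Zod schema, not the contents of `src/config.ts`. `BotConfig` and `loadConfig` are hypothetical names, and only four variables from the reference table are covered.

```ts
interface BotConfig {
  discordToken: string;
  discordClientId: string;
  sessionTtlHours: number;
  logLevel: string;
}

// Validates an env-like record and throws one clear error listing every missing variable.
function loadConfig(env: Record<string, string | undefined>): BotConfig {
  const required = ["DISCORD_TOKEN", "DISCORD_CLIENT_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }

  // Defaults mirror the reference table: SESSION_TTL_HOURS=12, LOG_LEVEL=info.
  const ttl = Number(env["SESSION_TTL_HOURS"] ?? "12");
  if (!Number.isFinite(ttl) || ttl <= 0) {
    throw new Error("SESSION_TTL_HOURS must be a positive number");
  }

  return {
    discordToken: env["DISCORD_TOKEN"] as string,
    discordClientId: env["DISCORD_CLIENT_ID"] as string,
    sessionTtlHours: ttl,
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

Calling this once at module import time, with `process.env` as the argument, would give the crash-on-startup behaviour described above.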
|
||||
|
||||
## Local services
|
||||
|
||||
```bash
|
||||
docker compose -f docker-compose.dev.yml up -d
|
||||
```
|
||||
|
||||
This starts:
|
||||
|
||||
- **Redis** on `localhost:6379`
|
||||
- **Neo4j** on `localhost:7687` (browser UI at `http://localhost:7474`, login `neo4j` / `mardonardev`)
|
||||
|
||||
The `mardonar-internal` Docker network is declared as `external: true`, so it must already exist; the GraphMCP-Example stack normally creates it. If you run just the bot without GraphMCP, you can remove that network reference, but `/encounter start` will fail at NPC memory lookup.
|
||||
|
||||
## Register slash commands
|
||||
|
||||
Run once per bot deployment, or whenever commands change:
|
||||
|
||||
```bash
|
||||
npm run deploy-commands
|
||||
```
|
||||
|
||||
If `DISCORD_GUILD_ID` is set, registers to that guild instantly. If unset, registers globally (up to 1h propagation delay). The deploy script also clears any lingering global commands first, to avoid double-registration.
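
The guild-versus-global choice comes down to which Discord REST route receives the command list. A minimal sketch follows; `commandsRoute` is a hypothetical helper, and the real deploy script may build the route differently.

```ts
// Picks the Discord REST path for bulk command registration.
// Guild-scoped commands update instantly; global ones can take up to an hour to propagate.
function commandsRoute(clientId: string, guildId?: string): string {
  return guildId
    ? `/applications/${clientId}/guilds/${guildId}/commands`
    : `/applications/${clientId}/commands`;
}
```

discord.js exposes the same paths through `Routes.applicationGuildCommands(clientId, guildId)` and `Routes.applicationCommands(clientId)`, which are the usual choice in a script that already depends on the library.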
|
||||
|
||||
## Run the bot
|
||||
|
||||
```bash
|
||||
npm run dev # development: tsx watch mode (auto-reload)
|
||||
npm run build # compile TypeScript to dist/
|
||||
npm run start # run the compiled output
|
||||
```
|
||||
|
||||
The bot logs to stdout (pino with `pino-pretty` in dev). Set `LOG_LEVEL=debug` for verbose output.
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
npm run test # all tests
|
||||
npm run test:unit # unit only (no external services)
|
||||
npm run test:int # integration (requires docker compose up)
|
||||
```
|
||||
|
||||
Test layout:
|
||||
|
||||
- `tests/unit/` — 21 fast unit test files with no external dependencies
|
||||
- `tests/integration/phase1.test.ts` — requires running Redis + Neo4j
|
||||
- `tests/fixtures/spec.ts` — shared spec fixture
|
||||
|
||||
Vitest is configured with v8 coverage. The `vitest.config.ts` includes `src/**/*.ts` for coverage and `tests/**/*.test.ts` for the test pattern.
|
||||
|
||||
## Adding a new encounter
|
||||
|
||||
1. Copy `specs/market-thief.yaml` to `specs/your-encounter.yaml`
|
||||
2. Fill in: `encounterId`, `title`, `tone`, `setting`, `openingNarrative`, `npcs[]` (with optional `memoryKey` and `nameKey`), `goals`, `sportsmanshipRules`, `skillChecks` (group as `<name>_dc / <name>_skill / <name>_note` triples), and the `tools:` list
|
||||
3. Add `randomizable[]` entries if you want parts of the spec (e.g. NPC names, item descriptions) to be filled from GraphMCP vocabulary at load time
|
||||
4. In Discord: `/encounter start your-encounter`
|
||||
|
||||
See `specs/SPEC_FORMAT.md` for the canonical reference.
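
As a rough picture of what step 3 implies, the sketch below substitutes `{{key}}` placeholders using resolved values where available and each entry's `fallback` otherwise. `fillPlaceholders` is a hypothetical helper; the assumption that a placeholder name matches a `randomizable[].key` is not confirmed by the loader source.

```ts
// Subset of a `randomizable[]` entry needed for substitution.
interface Randomizable {
  key: string;
  fallback: string;
}

// Replaces `{{key}}` with a resolved value, or the entry's fallback when none was resolved.
// Placeholders without a matching entry are left untouched.
function fillPlaceholders(
  text: string,
  entries: Randomizable[],
  resolved: Record<string, string> = {},
): string {
  const values = new Map<string, string>();
  for (const entry of entries) {
    values.set(entry.key, resolved[entry.key] ?? entry.fallback);
  }
  return text.replace(/\{\{(\w+)\}\}/g, (whole: string, key: string) => values.get(key) ?? whole);
}
```

Because every entry carries a `fallback`, the narrative still renders sensibly when GraphMCP is unreachable at load time.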
|
||||
|
||||
## Adding a new slash command
|
||||
|
||||
1. Create `src/bot/commands/<name>.ts` exporting `data` (SlashCommandBuilder) and `execute(interaction, client)`
|
||||
2. Register it in `src/bot/index.ts` (`commands.set('<name>', ...)`)
|
||||
3. Add it to `src/scripts/deploy-commands.ts` (`commands.push(data.toJSON())`)
|
||||
4. Run `npm run deploy-commands`
|
||||
|
||||
## Adding a new LLM tool
|
||||
|
||||
1. Create `src/harness/tools/<name>.ts` with a `ToolPlugin` definition and call `registerTool(plugin)` at the bottom
|
||||
2. Import the file in `src/harness/tools/index.ts` (side-effect import)
|
||||
3. Reference it from any spec's `tools: [...]` array to make it active
|
||||
4. Add a unit test in `tests/unit/`
|
||||
|
||||
The tool's `args` schema (string / number / boolean) is surfaced to the LLM via `buildToolManifest`, so the model sees typed arg descriptions in the system prompt. Use `contextDocs(spec)` to inject spec-specific guidance (e.g. preset DCs).
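
The registration flow can be pictured with a tiny in-memory registry. The names `ToolPlugin`, `registerTool`, and `buildToolManifest` come from the text above, but the shapes and signatures here are assumptions for illustration, and `coin_flip` is an invented example tool.

```ts
type ArgType = "string" | "number" | "boolean";

// Assumed plugin shape; the real ToolPlugin interface may carry more fields.
interface ToolPlugin {
  name: string;
  description: string;
  args: Record<string, { type: ArgType; description: string }>;
  execute: (args: Record<string, unknown>) => Promise<string>;
}

const registry = new Map<string, ToolPlugin>();

// Called at the bottom of each tool module so importing the file registers the tool.
function registerTool(plugin: ToolPlugin): void {
  registry.set(plugin.name, plugin);
}

// Renders one line per active tool; an empty or undefined list activates every tool.
function buildToolManifest(activeTools?: string[]): string {
  const all = Array.from(registry.values());
  const active =
    !activeTools || activeTools.length === 0
      ? all
      : all.filter((plugin) => activeTools.includes(plugin.name));
  return active
    .map((plugin) => {
      const args = Object.entries(plugin.args)
        .map(([name, arg]) => `${name}: ${arg.type} (${arg.description})`)
        .join(", ");
      return `${plugin.name}(${args}) - ${plugin.description}`;
    })
    .join("\n");
}

// Invented example tool, registered as a side effect of loading this module.
registerTool({
  name: "coin_flip",
  description: "Flip a coin to settle a dispute",
  args: { caller: { type: "string", description: "who calls the toss" } },
  execute: async () => (Math.random() < 0.5 ? "heads" : "tails"),
});
```

A spec with `tools: ["coin_flip"]` would then see only that line in its manifest, while a spec without a `tools` list would see every registered tool.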
|
||||
|
||||
## Adding a new event handler
|
||||
|
||||
1. Create the handler in `src/bot/handlers/<name>.ts`
|
||||
2. Wire it from `src/bot/index.ts` or another handler (e.g. `messageRouter`)
|
||||
3. Prefer pure functions for transforms; reserve stateful modules for cross-call persistence
|
||||
|
||||
## Environment configuration reference
|
||||
|
||||
| Var | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `DISCORD_TOKEN` | (required) | Bot user token |
|
||||
| `DISCORD_CLIENT_ID` | (required) | Application ID |
|
||||
| `DISCORD_GUILD_ID` | unset | If set, instant guild-scoped command registration |
|
||||
| `DISCORD_ALLOWED_CHANNELS` | empty → no channels | Comma-separated channel IDs the bot will respond in |
|
||||
| `DISCORD_ALLOWED_USERS` | empty → all users | Comma-separated user IDs allowed to run /encounter |
|
||||
| `REDIS_URL` | `redis://localhost:6379` | ioredis connection string |
|
||||
| `SESSION_TTL_HOURS` | 12 | Session TTL in Redis |
|
||||
| `LITELLM_BASE_URL` | (recommended) | LiteLLM proxy URL — preferred LLM client |
|
||||
| `LITELLM_API_KEY` | unset | Optional API key for the proxy |
|
||||
| `LITELLM_MODEL` | falls back to `OLLAMA_MODEL` | Model name as configured in LiteLLM |
|
||||
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama HTTP endpoint (fallback) |
|
||||
| `OLLAMA_MODEL` | `gemma4-it:e2b` | Ollama model name |
|
||||
| `OLLAMA_TEMPERATURE` | 0.75 | Sampling temperature (0–2) |
|
||||
| `OLLAMA_NUM_CTX` | 131072 | Context window in tokens |
|
||||
| `OLLAMA_TIMEOUT_MS` | 120000 | LLM call timeout |
|
||||
| `GRAPHMCP_URL` | `http://localhost:9000` | GraphMCP JSON-RPC endpoint |
|
||||
| `GRAPHMCP_SCORE_THRESHOLD` | 0.68 | Min similarity for NPC memory chunks |
|
||||
| `GRAPHMCP_NPC_MEMORY_LIMIT` | 5 | Max memory chunks per NPC |
|
||||
| `GRAPHMCP_MENTION_LIMIT` | 5 | Max chunks for @mention search |
|
||||
| `GRAPHMCP_INGEST_STREAM` | `raw.messages` | Redis stream name for encounter ingest |
|
||||
| `SPECS_DIR` | `./specs` | Encounter YAML directory |
|
||||
| `ENCOUNTER_ARCHIVE_DELAY_MS` | 5000 | Delay before archiving resolved thread |
|
||||
| `ENCOUNTER_GATE_TIMEOUT_MS` | 30000 | Player-gate embed auto-delete delay |
|
||||
| `PERSONA_PATH` | `./persona.yaml` | @mention persona YAML |
|
||||
| `DATA_DIR` | `./data` | Tally + summary directory |
|
||||
| `VTT_RELAY_URL` | `https://vtt-relay.damascusfront.net` | Foundry VTT relay endpoint |
|
||||
| `VTT_API_KEY` | empty → VTT disabled | API key for the relay |
|
||||
| `VTT_CLIENT_ID` | empty | Client ID for the relay |
|
||||
| `VTT_FOUNDRY_URL` | empty | Foundry URL for headless spin-up |
|
||||
| `VTT_USERNAME` | empty | Foundry username |
|
||||
| `VTT_PASSWORD` | empty | Foundry password (encrypted with RSA-OAEP for handoff) |
|
||||
| `VTT_WORLD` | empty | Foundry world to launch |
|
||||
| `LOG_LEVEL` | `info` | `trace` / `debug` / `info` / `warn` / `error` |
|
||||
|
||||
## Common tasks
|
||||
|
||||
### View current encounter state
|
||||
In Discord, in an encounter thread: `/encounter status`
|
||||
|
||||
### List active encounters in the guild
|
||||
`/encounter list`
|
||||
|
||||
### Search past encounters
|
||||
`/encounters` then use the modal
|
||||
|
||||
### Force-end an encounter
|
||||
`/encounter end [notes]`
|
||||
|
||||
### Inspect the most recent encounter summary
|
||||
`/encounter audit` (DMs the file) — or read `data/summaries/` directly
|
||||
|
||||
### Tail the bot log
|
||||
With pino-pretty in dev, logs are pretty-printed to stdout. In prod, pipe container stdout to your log shipper.
|
||||
|
||||
### Reset Redis state
|
||||
```bash
|
||||
docker compose -f docker-compose.dev.yml down -v
|
||||
docker compose -f docker-compose.dev.yml up -d
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Symptom | Likely cause |
|
||||
|---|---|
|
||||
| `ZodError` at startup | Missing or malformed env var. Check `.env` against `.env.example`. |
|
||||
| `DISCORD_ALLOWED_CHANNELS` empty → bot never responds | The bot refuses to respond outside allowed channels by design. Set the env var. |
|
||||
| `ECONNREFUSED` to Redis | `docker compose -f docker-compose.dev.yml up -d` not run, or wrong `REDIS_URL`. |
|
||||
| `ECONNREFUSED` to GraphMCP | GraphMCP-Example stack not running, or wrong `GRAPHMCP_URL`. Encounter start will fail at NPC memory fetch. |
|
||||
| LLM never responds | LiteLLM down → falls back to Ollama. Check `OLLAMA_BASE_URL` and that the model is pulled. |
|
||||
| Tool call never fires | LLM emitted a `tool_call` block but the tool name is misspelled or not in the spec's `tools:` list. Check `toolParser` warnings. |
|
||||
| Skill check embed buttons do nothing | `PENDING_ROLL_LIMIT` (5) reached; encounter auto-fails. Look for the `[SKILL CHECK RESULT] ... auto-cancelled` system message. |
|
||||
| VTT integration silently skipped | `VTT_API_KEY` empty. Set the var to enable. |
|
||||
| Spec fails to load | Run `/encounter spec` for the YAML. Schema is in `src/spec/loader.ts`. |
|
||||
| High latency on LLM calls | Likely under-sized `OLLAMA_NUM_CTX` vs. assembled context. Check `CONTEXT_BUDGET` in `src/types/index.ts`. |
|
||||
|
||||
## Project conventions
|
||||
|
||||
- **TypeScript strict mode**, ES modules, NodeNext resolution. All imports use `.js` extensions even for `.ts` source.
|
||||
- **Shared types live only in `src/types/index.ts`.** Do not duplicate definitions elsewhere.
|
||||
- **Tool plugins are self-registering** — each `harness/tools/<name>.ts` calls `registerTool()` at load. The `index.ts` aggregator imports them for side effects.
|
||||
- **Discord embeds are pure builders** — no I/O, no `await`. Pass typed args, return an embed.
|
||||
- **Event handlers live in `src/bot/handlers/`.** The runtime heart is `messageRouter.ts`.
|
||||
- **In-world voice for player-facing strings** — see `feedback-in-world-voice` memory. No utility terms like "session", "user", "ephemeral" in bot messages.
|
||||
- **All env access goes through `import { config }` from `src/config.ts`** — never read `process.env` directly.
|
||||
- **Tests use Vitest globals** — no explicit `import { describe, it, expect }` in test files.
|
||||
55
docs/index.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# Mardonar Encounter Engine — Documentation Index
|
||||
|
||||
> Primary entry point for AI-assisted development. Generated 2026-06-19 from a deep scan.
|
||||
|
||||
## Project Overview
|
||||
|
||||
- **Type:** Monolith — single-part backend
|
||||
- **Primary Language:** TypeScript (Node.js 22, ESM)
|
||||
- **Architecture:** Layered backend with plugin-style LLM tool registry
|
||||
- **Project name (config):** big-red
|
||||
- **Repository name:** mardonar-npcs
|
||||
|
||||
## Quick Reference
|
||||
|
||||
- **Tech stack:** Node.js 22 · TypeScript 5.8 · discord.js v14 · LiteLLM (primary) + Ollama (fallback) · ioredis · GraphMCP JSON-RPC (Neo4j-backed) · Zod · pino · Vitest · Docker
|
||||
- **Entry point:** `src/bot/index.ts` (compiled to `dist/bot/index.js`)
|
||||
- **Architecture pattern:** Layered (bot → harness → session/db/graphmcp/vtt) with per-encounter tool plugin filtering
|
||||
|
||||
## Generated Documentation
|
||||
|
||||
- [Project Overview](./project-overview.md)
|
||||
- [Architecture](./architecture.md)
|
||||
- [Source Tree Analysis](./source-tree-analysis.md)
|
||||
- [Component Inventory](./component-inventory.md)
|
||||
- [Development Guide](./development-guide.md)
|
||||
- [Deployment Guide](./deployment-guide.md)
|
||||
- [API Contracts](./api-contracts.md)
|
||||
- [Data Models](./data-models.md)
|
||||
|
||||
## Existing Documentation
|
||||
|
||||
These pre-existed in `Docs/` and were cross-referenced during generation. Note that some are partially out of date.
|
||||
|
||||
- [Docs/mardonar-encounter-engine.md](../Docs/mardonar-encounter-engine.md) — Original system design doc. **Out of date** — describes a Go bot with embedded MCP; the actual implementation is TypeScript with an external GraphMCP. Use `docs/architecture.md` as the source of truth.
|
||||
- [Docs/mardonar-build-plan.md](../Docs/mardonar-build-plan.md) — Phased build plan with packages and test guidance
|
||||
- [Docs/epics.md](../Docs/epics.md) — Epic list
|
||||
- [Docs/stories/](../Docs/stories/) — Story specs (1.1, 1.2, 2.1, 3.1, 4.1)
|
||||
- [Docs/ux-designs/ux-mardonar-2026-05-30/](../Docs/ux-designs/ux-mardonar-2026-05-30/) — UX session artifacts (EXPERIENCE.md, DESIGN.md, decision-log.md)
|
||||
- [README.md](../README.md) — Player-facing intro, quick start, command list. Project-structure tree is out of date.
|
||||
- [prd.md](../prd.md) — Active PRD: Dynamic Goal Registration
|
||||
|
||||
## Getting Started
|
||||
|
||||
1. Skim [Project Overview](./project-overview.md) (1 minute)
|
||||
2. Read [Architecture](./architecture.md) sections 1–6 for the system design (10 minutes)
|
||||
3. Read [Development Guide](./development-guide.md) "First-time setup" to get the bot running locally
|
||||
4. For new feature work, start from [Component Inventory](./component-inventory.md) to find the right module, then read the linked source
|
||||
|
||||
## Conventions
|
||||
|
||||
- All player-facing bot strings use in-world voice — no utility terms like "session", "user", "ephemeral" (see `feedback-in-world-voice` memory).
|
||||
- All env access goes through `import { config }` from `src/config.ts`.
|
||||
- Tool plugins self-register via `registerTool()` at module load.
|
||||
- Shared types live only in `src/types/index.ts`.
|
||||
- Discord embeds are pure builders — no I/O.
|
||||
106
docs/project-overview.md
Normal file
@@ -0,0 +1,106 @@
|
||||
# Mardonar Encounter Engine — Project Overview
|
||||
|
||||
> Discord-native, LLM-driven D&D encounter engine. Generated 2026-06-19 from a deep scan.
|
||||
|
||||
## What it is
|
||||
|
||||
A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (Gemma 4 IT e2b through LiteLLM with Ollama fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.
|
||||
|
||||
## Who it serves
|
||||
|
||||
Discord community members playing D&D 5e in the Land of Mardonar. The DM runs `/encounter start <spec>` to begin; players post their actions in the resulting thread. NPC personas are loaded from specs and grounded in long-term graph memory so that recurring NPCs remember prior interactions across encounters.
|
||||
|
||||
## Tech stack at a glance
|
||||
|
||||
| Layer | Technology |
|
||||
|---|---|
|
||||
| Runtime | Node.js 22 (ESM, TypeScript 5.8 strict) |
|
||||
| Discord | discord.js v14 |
|
||||
| LLM (primary) | LiteLLM proxy (env: `LITELLM_BASE_URL`) |
|
||||
| LLM (fallback) | Ollama (env: `OLLAMA_BASE_URL`) — `gemma4-it:e2b`, 128k context |
|
||||
| Session cache | Redis (ioredis), 12h TTL |
|
||||
| Graph DB | Neo4j (via GraphMCP JSON-RPC, not direct) |
|
||||
| Lore / NPC memory | GraphMCP HTTP JSON-RPC server |
|
||||
| Foundry VTT | External relay (optional, requires API key) |
|
||||
| Validation | Zod (env + encounter spec) |
|
||||
| Logging | pino + pino-pretty |
|
||||
| Testing | Vitest 3 (unit + integration) |
|
||||
| Build | tsc → multi-stage Node 22 alpine Dockerfile |
|
||||
|
||||
## Architecture type

**Layered backend with a plugin-style tool registry.**

```
Discord ──▶ src/bot/      (commands, embeds, handlers)
                │
                ▼
            src/harness/  (promptBuilder, contextAssembler,
                           llmClient, toolParser, toolDispatcher,
                           tools/* plugin registry)
                │
   ┌────────────┼────────────┐
   ▼            ▼            ▼
 Redis       GraphMCP     VTT relay
 (session    (JSON-RPC:   (Foundry
  state)     NPC memory,  live stats,
             lore, log)   XP grants)
```

## Repository structure

**Single-part monolith.** All source under `src/`. The bot is one Node.js process that talks to external services over the network.

```
src/
├── bot/           # Discord I/O (commands, embeds, event handlers)
├── harness/       # LLM orchestration + 6 tool plugins
├── session/       # Redis-backed registries + session state
├── graphmcp/      # JSON-RPC client + Redis stream ingest
├── vtt/           # Foundry VTT relay client + spin-up
├── db/            # ioredis singleton
├── spec/          # YAML encounter loader + Zod schema
├── persona/       # persona.yaml loader
├── config.ts      # Zod env validation
├── lib/           # logger
├── scripts/       # deploy-commands (slash command registration)
└── types/         # shared interfaces + CONTEXT_BUDGET
```

Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).

## Documentation

- [Architecture](./architecture.md) — full system design
- [Source Tree Analysis](./source-tree-analysis.md) — annotated directory tree
- [Component Inventory](./component-inventory.md) — reusable components
- [Development Guide](./development-guide.md) — setup, run, test, troubleshoot
- [Deployment Guide](./deployment-guide.md) — production deploy + ops
- [API Contracts](./api-contracts.md) — Discord commands + GraphMCP JSON-RPC
- [Data Models](./data-models.md) — session state, encounter spec, Neo4j graph

## Key features in the current codebase

- **Per-encounter tool filtering.** Each spec declares which tool plugins are active.
- **Dynamic goal registration** (the active PRD feature) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
- **Three-pattern tool parser** — handles fenced `tool_call`, bare `tool_call` header, and fuzzy bare JSON, so even smaller models can drive tools.
- **Self-spinning VTT relay** — when the relay is down, the bot handshakes via RSA-OAEP and launches a headless Foundry session on demand.
- **Burst cap with drop notices** — if too many messages arrive before the last LLM response, the bot drops the excess and posts a tone-aware notice.
- **Reaction lifecycle (👀)** — visible "I'm working on it" feedback through queued → processing → complete states.
- **NPC memory injection** at session start from GraphMCP: chunks below the score threshold are discarded, and at most the top 3 of the remainder are injected.
- **In-world voice** for player-facing strings — no utility/jargon (see `feedback-in-world-voice`).
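
The NPC memory rule in the list above (score threshold, then top 3) can be sketched as a small pure function. This is a minimal illustration, not the project's code: `MemoryChunk`, `selectMemoryChunks`, the sample data, and the choice of `>=` for the threshold comparison are all assumptions.

```typescript
// Hypothetical shape; the real GraphMCP result type is not shown in this document.
interface MemoryChunk {
  text: string;
  score: number;
}

// Keep chunks scoring at or above the threshold, best first, capped at `limit`.
function selectMemoryChunks(
  chunks: MemoryChunk[],
  threshold: number,
  limit = 3,
): MemoryChunk[] {
  return chunks
    .filter((chunk) => chunk.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Illustrative data only.
const picked = selectMemoryChunks(
  [
    { text: 'Owes the party a favor', score: 0.91 },
    { text: 'Dislikes haggling', score: 0.42 },
    { text: 'Met Aelindra at the docks', score: 0.77 },
    { text: 'Sells counterfeit maps', score: 0.83 },
    { text: 'Fears the Stormscar', score: 0.65 },
  ],
  0.6,
);
console.log(picked.map((chunk) => chunk.text));
```

Filtering before sorting keeps weak matches out even when fewer than three chunks clear the threshold, so a sparse memory yields a shorter injection rather than padding with noise.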

## Known drift and open issues

- `Docs/mardonar-encounter-engine.md` describes a Go bot with embedded MCP — superseded by `docs/architecture.md` but still referenced by the README.
- `README.md`'s project-structure tree is out of date (mentions `src/mcp/`, missing commands).
- `src/types/index.ts` `EncounterSpec` diverged from `src/spec/loader.ts` Zod schema (missing `tone`, `tools`, `randomizable`, `nameKey`).
- Duplicate `trimHistory` between `sessionManager.ts` and `contextAssembler.ts`.
- No production `docker-compose.yml`, no CI/CD, no HTTP health endpoint.
- `DISCORD_ALLOWED_USERS` empty by default — channel-scoped access only.

See `docs/architecture.md §9` for the full drift list.

## When you're ready to plan new features

Point the PRD workflow at [`docs/index.md`](./index.md) as input. For UI-facing work, `architecture.md §5.1` is the primary reference. For backend/LLM feature work, `architecture.md §5.2` and `docs/data-models.md` are the primary references.
190
docs/source-tree-analysis.md
Normal file
@@ -0,0 +1,190 @@
# Source Tree Analysis

> Annotated directory tree for the Mardonar Encounter Engine. Generated 2026-06-19.

## Top level

```
mardonar-npcs/
├── src/                     # TypeScript source (compiled to dist/)
├── specs/                   # Encounter YAML files (loaded by /encounter start)
├── tests/                   # Vitest unit + integration suites
├── Docs/                    # Pre-existing project documentation (encounter engine overview, build plan, epics, stories, UX designs)
├── lore/                    # Game-world reference material
├── data/                    # Runtime tally + per-encounter summaries (volume-mounted in prod)
├── scripts/                 # Top-level utility scripts (only deploy-commands.ts lives here; the rest are under src/scripts)
├── docs/                    # Generated by bmad-document-project (this folder)
├── node_modules/            # npm dependencies (gitignored)
├── dist/                    # tsc output (gitignored)
├── Dockerfile               # Multi-stage Node 22 alpine build
├── docker-compose.dev.yml   # Local Redis + Neo4j orchestration
├── package.json
├── tsconfig.json            # NodeNext ESM, strict, rootDir=src
├── vitest.config.ts         # v8 coverage
├── .env / .env.example      # Zod-validated env config
├── persona.yaml             # @Zalram Cloudwalker persona
├── prd.md                   # Active PRD: Dynamic Goal Registration
└── README.md
```

## src/ — TypeScript source

### src/bot/ — Discord I/O layer

| Path | Role |
|---|---|
| `index.ts` | Entry point. Wires the discord.js `Client`, registers slash commands, dispatches `interactionCreate` and `messageCreate` to handlers. |
| `commands/` | One file per slash command. Each exports `data` (SlashCommandBuilder) and `execute(interaction, client)`. |
| `commands/dndname.ts` | `/dndname set\|show\|clear` — character name registration. |
| `commands/encounter.ts` | `/encounter start\|status\|end\|generate\|spec\|random\|stats\|audit` — encounter session lifecycle. |
| `commands/character.ts` | `/character register\|show\|view\|admin` — character profile + Foundry link modals. |
| `commands/roll.ts` | `/roll` — manual dice roll. |
| `commands/actions.ts` | `/actions` — in-character action shortcuts. |
| `commands/xp.ts` | `/xp award` — XP grant to a character. |
| `commands/encounters.ts` | `/encounters` — search/list encounters via GraphMCP. Includes select menu + search modal interactions. |
| `commands/turn.ts` | `/turn` — turn management. |
| `embeds/` | Discord embed builders. Pure functions taking typed args. |
| `embeds/playerGate.ts` | "Please register your character name" embed. |
| `embeds/skillCheck.ts` | Suspense embed → dice embed with roll buttons. |
| `embeds/resolution.ts` | Encounter complete embed. |
| `embeds/encounterDiscovery.ts` | Encounter search result embeds. |
| `embeds/loreAnswer.ts` | @mention lore response embed. |
| `handlers/` | Event handlers and sidecar logic. The runtime heart of the bot. |
| `handlers/messageRouter.ts` | Core encounter-thread message pipeline: gates, debounce, LLM call, tool dispatch. |
| `handlers/mentionHandler.ts` | @Zalram persona replies (uses `persona/loader.ts`). |
| `handlers/rollHandler.ts` | Button + modal submit skill-check roll resolution. |
| `handlers/generationQueue.ts` | Debounce + LLM turn scheduling (500ms coalesce). |
| `handlers/queueCap.ts` | Burst cap: drops messages if too many arrived before the last LLM response. |
| `handlers/reactionManager.ts` | 👀 reaction lifecycle: scheduled → processing → complete. |
| `handlers/responseFilter.ts` | Post-LLM response scrubbing (catches fabricated rolls, echoed system tags, empty responses). |
| `lib/welcomeDM.ts` | Welcome DM utility. |
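
The burst cap described for `handlers/queueCap.ts` amounts to splitting the backlog into a kept part and a dropped part. The sketch below is a guess at the shape of that logic, not the module itself: `PendingMessage`, `CapResult`, `applyBurstCap`, and the choice to keep the oldest messages are assumptions.

```typescript
// Hypothetical types and names; queueCap.ts itself is not shown in this document.
interface PendingMessage {
  id: string;
  authorId: string;
}

interface CapResult {
  kept: PendingMessage[];
  dropped: PendingMessage[];
}

// Keep the first `cap` messages that arrived since the last LLM response and
// report the rest as dropped so the caller can post a notice about them.
function applyBurstCap(pending: PendingMessage[], cap: number): CapResult {
  if (cap < 0) throw new RangeError('cap must be non-negative');
  return { kept: pending.slice(0, cap), dropped: pending.slice(cap) };
}

// Illustrative backlog of five messages with a cap of three.
const backlog: PendingMessage[] = ['m1', 'm2', 'm3', 'm4', 'm5'].map((id) => ({
  id,
  authorId: 'user-1',
}));
const capped = applyBurstCap(backlog, 3);
console.log(capped.kept.length, capped.dropped.length);
```

Returning the dropped messages instead of discarding them silently gives the caller what it needs to react to each one and to word the drop notice.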

### src/harness/ — LLM orchestration

| Path | Role |
|---|---|
| `promptBuilder.ts` | System prompt assembly. XML-sectioned: narrator, tone, sportsmanship, NPCs, players, setting, resolved context, skill checks, hidden goals, tool contract. |
| `contextAssembler.ts` | Builds the LLM message list: system + pinned history + trimmed sliding history. |
| `llmClient.ts` | Entry point. Routes to LiteLLM (primary) with Ollama fallback. |
| `litellmClient.ts` | OpenAI-compatible HTTP client for LiteLLM proxy. |
| `ollamaClient.ts` | Native `ollama` npm + direct HTTP fallback path. |
| `toolParser.ts` | Extracts `tool_call` blocks from LLM response. Three fallback patterns. |
| `toolRegistry.ts` | Plugin registry. `getActiveTools(spec.tools)` returns per-encounter active set. |
| `toolDispatcher.ts` | Validates tool name against active set, dispatches to plugin handler, logs result. |
| `tools/` | Tool plugin implementations. Each module calls `registerTool()` at load. |
| `tools/index.ts` | Side-effect imports — add new tool files here. |
| `tools/skillCheckEmit.ts` | Posts dice-roll embed; blocks input until resolved. Pulls player modifier from Foundry. |
| `tools/encounterResolve.ts` | Marks encounter complete, writes summary, archives thread. |
| `tools/contextRecall.ts` | Retrieves canonical session facts from `resolvedContext`. |
| `tools/goalRegister.ts` | Adds new goals mid-encounter (per `prd.md`). |
| `tools/foundryLookup.ts` | Live character data from VTT relay. |
| `tools/foundryReward.ts` | XP / item grant to a character via VTT. |
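
To make the "three fallback patterns" of `toolParser.ts` concrete, here is a minimal sketch that tries a fenced `tool_call` block, then a bare `tool_call` header, then any bare JSON object containing a `"tool"` key. The regular expressions, `parseToolCall`, and `tryParse` are assumptions written for illustration; the real parser may match differently.

```typescript
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Parse a JSON candidate and accept it only if it carries a string `tool` field.
function tryParse(json: string): ToolCall | null {
  try {
    const value: unknown = JSON.parse(json);
    if (typeof value === 'object' && value !== null) {
      const candidate = value as { tool?: unknown; args?: Record<string, unknown> };
      if (typeof candidate.tool === 'string') {
        return { tool: candidate.tool, args: candidate.args ?? {} };
      }
    }
  } catch {
    // Not valid JSON; the caller falls through to the next pattern.
  }
  return null;
}

const FENCE = '`'.repeat(3); // built at runtime so this sample nests cleanly in markdown
const PATTERNS: RegExp[] = [
  new RegExp(FENCE + 'tool_call\\s*([\\s\\S]*?)' + FENCE), // 1. fenced block
  /^tool_call:?\s*(\{[\s\S]*\})\s*$/m, // 2. bare header followed by JSON
  /(\{[\s\S]*"tool"\s*:[\s\S]*\})/, // 3. fuzzy bare JSON
];

// Try each pattern in order; the first one that yields valid JSON wins.
function parseToolCall(response: string): { narrative: string; toolCall: ToolCall | null } {
  for (const pattern of PATTERNS) {
    const match = pattern.exec(response);
    const body = match?.[1];
    if (!match || body === undefined) continue;
    const toolCall = tryParse(body.trim());
    if (toolCall) {
      return { narrative: response.replace(match[0], '').trim(), toolCall };
    }
  }
  return { narrative: response.trim(), toolCall: null };
}

const fenced = parseToolCall(
  'Roll for initiative. ' + FENCE + 'tool_call\n{"tool":"encounter_resolve","args":{"outcomeId":"catch"}}\n' + FENCE,
);
console.log(fenced.narrative, fenced.toolCall?.tool);
```

Ordering the patterns from strictest to loosest means a well-formed fenced call is never misread by the fuzzy matcher, while smaller models that forget the fence still get their call recognized.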

### src/session/ — Redis-backed state

| Path | Role |
|---|---|
| `playerRegistry.ts` | `(guildId, discordId) → Player` (DnD name). |
| `characterRegistry.ts` | Character profile: DnD name, pronouns, characterClass, race, level, backstory, Foundry actor UUID. |
| `sessionManager.ts` | `threadId → SessionState`. Pinned + sliding history trim by token budget. |
| `encounterLog.ts` | Filesystem tally + summary writer (one .txt per encounter in `data/summaries/`). |
| `xpAwarder.ts` | XP grant via VTT relay. |
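
The "pinned + sliding history trim by token budget" noted for `sessionManager.ts` can be pictured as follows: pinned messages always survive, and the remaining budget is spent on the newest unpinned messages first. This is a sketch under stated assumptions: `HistoryMessage`, `trimToBudget`, and the four-characters-per-token estimate are invented for illustration and are not the project's tokenizer or API.

```typescript
interface HistoryMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  pinned?: boolean;
}

// Rough estimate of ~4 characters per token. An assumption, not a real tokenizer.
const estimateTokens = (message: HistoryMessage): number =>
  Math.ceil(message.content.length / 4);

function trimToBudget(history: HistoryMessage[], budget: number): HistoryMessage[] {
  const keep = new Set<HistoryMessage>(history.filter((m) => m.pinned));
  let remaining = budget;
  for (const m of keep) remaining -= estimateTokens(m);

  // Walk newest to oldest so the most recent unpinned turns survive.
  for (const m of [...history].reverse()) {
    if (m.pinned) continue;
    const cost = estimateTokens(m);
    if (cost > remaining) break; // stop at the first turn that no longer fits
    remaining -= cost;
    keep.add(m);
  }
  return history.filter((m) => keep.has(m)); // preserves chronological order
}

// Each message below is 40 characters, so about 10 estimated tokens.
const sampleHistory: HistoryMessage[] = [
  { role: 'system', content: 'A'.repeat(40), pinned: true },
  { role: 'user', content: 'B'.repeat(40) },
  { role: 'assistant', content: 'C'.repeat(40) },
  { role: 'user', content: 'D'.repeat(40) },
];
const trimmed = trimToBudget(sampleHistory, 30);
console.log(trimmed.map((m) => m.content[0]));
```

Breaking at the first message that does not fit keeps the sliding window contiguous, which avoids handing the model a conversation with holes in the middle.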

### src/graphmcp/ — GraphMCP JSON-RPC client

| Path | Role |
|---|---|
| `client.ts` | 6 RPC calls + NPC memory formatter. |
| `ingest.ts` | Publishes encounter messages to Redis stream `raw.messages`. |
| `loreResolver.ts` | /encounter generate helper. |
| `vocabularyResolver.ts` | Resolves spec `randomizable:` entries (vocabulary or graphmcp source). |

### src/vtt/ — Foundry VTT integration

| Path | Role |
|---|---|
| `foundryClient.ts` | HTTP client. Live actor data + formatters. |
| `relaySession.ts` | RSA-OAEP encrypted handshake + headless Foundry session spin-up when relay is down. |

### Other src/ modules

| Path | Role |
|---|---|
| `db/redis.ts` | ioredis singleton (`lazyConnect`, `maxRetriesPerRequest: 3`). |
| `spec/loader.ts` | YAML loader + Zod schema (`EncounterSpecSchema`). |
| `persona/loader.ts` | persona.yaml loader for @mention. |
| `lib/logger.ts` | pino wrapper. |
| `config.ts` | Zod env schema + parsed config singleton. |
| `scripts/deploy-commands.ts` | Slash command registration via Discord REST v10. |
| `types/index.ts` | Shared interfaces + `CONTEXT_BUDGET` constant. |

## specs/ — Encounter YAML files

Loaded by `/encounter start <spec-name>`. `specs/SPEC_FORMAT.md` documents the schema. Current set:

- `market-thief.yaml` — the original "low-stakes warm-up" example used in the README
- `cog-claw-debt.yaml`
- `mawfang-pursuit.yaml`
- `silt-leak.yaml`
- `stormscar-pilgrim.yaml`
- `velvet-auction.yaml`
- `whispering-stone.yaml`

Each spec declares: `encounterId`, `title`, `tone`, `setting`, `openingNarrative`, `npcs[]` (with optional `nameKey` and `memoryKey`), `goals` (primary + secondary), `sportsmanshipRules`, `skillChecks` (grouped by suffix `_dc/_skill/_note`), `randomizable[]` (vocabulary or graphmcp queries with fallbacks), `tools[]`, and optional `dmNotes`.
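
One plausible reading of "grouped by suffix `_dc/_skill/_note`" is that a spec lists flat keys such as `chase_thief_dc` and the loader folds them into one record per check. The sketch below illustrates that reading only; the key names, `FlatSkillChecks`, `SkillCheck`, and `groupSkillChecks` are assumptions, and `specs/SPEC_FORMAT.md` remains the authority on the real schema.

```typescript
type FlatSkillChecks = Record<string, string | number>;

interface SkillCheck {
  dc?: number;
  skill?: string;
  note?: string;
}

// Fold flat `<name>_dc`, `<name>_skill`, `<name>_note` keys into one entry per name.
function groupSkillChecks(flat: FlatSkillChecks): Record<string, SkillCheck> {
  const grouped: Record<string, SkillCheck> = {};
  for (const [key, value] of Object.entries(flat)) {
    const match = /^(.+)_(dc|skill|note)$/.exec(key);
    const name = match?.[1];
    const field = match?.[2];
    if (name === undefined || field === undefined) continue; // no known suffix

    const entry = grouped[name] ?? {};
    grouped[name] = entry;
    if (field === 'dc') entry.dc = Number(value);
    else if (field === 'skill') entry.skill = String(value);
    else entry.note = String(value);
  }
  return grouped;
}

// Illustrative keys only.
const groupedChecks = groupSkillChecks({
  chase_thief_dc: 14,
  chase_thief_skill: 'Athletics',
  chase_thief_note: 'Crowded market',
  spot_lie_dc: 12,
  spot_lie_skill: 'Insight',
  unrelated: 'ignored',
});
console.log(JSON.stringify(groupedChecks));
```

Because the name part of the pattern is greedy, a check name may itself contain underscores and only the final suffix decides the field.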

## tests/ — Vitest suites

```
tests/
├── fixtures/spec.ts          # shared spec fixture
├── unit/                     # 21 unit test files (no external services)
│   ├── promptBuilder.test.ts
│   ├── contextAssembler.test.ts
│   ├── toolParser.test.ts
│   ├── toolDispatcher.test.ts
│   ├── sessionManager.test.ts
│   ├── playerRegistry.test.ts
│   ├── characterRegistry.test.ts
│   ├── specLoader.test.ts
│   ├── rollHandler.test.ts
│   ├── rollDetection.test.ts
│   ├── responseFilter.test.ts
│   ├── queueCap.test.ts
│   ├── generationQueue.test.ts
│   ├── reactionManager.test.ts
│   ├── encounterLog.test.ts
│   ├── encounterDiscoveryEmbed.test.ts
│   ├── loreAnswerEmbed.test.ts
│   ├── skillCheckEmbed.test.ts
│   ├── graphmcpClient.test.ts
│   ├── foundryClientRetry.test.ts
│   ├── foundryClientFormatters.test.ts
│   ├── goalRegister.test.ts
│   └── relaySession.test.ts
└── integration/
    └── phase1.test.ts        # requires running Docker services
```

## Docs/ — Pre-existing project documentation (historical)

| Path | Role |
|---|---|
| `mardonar-encounter-engine.md` | **Out of date** — describes a Go bot with embedded MCP layer. Treat as historical. The current `docs/architecture.md` supersedes it. |
| `mardonar-build-plan.md` | Phased build plan with packages and test guidance. |
| `epics.md` | Epic list. |
| `stories/` | Story specs (1.1, 1.2, 2.1, 3.1, 4.1). |
| `ux-designs/ux-mardonar-2026-05-30/` | UX session artifacts: `EXPERIENCE.md`, `DESIGN.md`, `.decision-log.md`. |

## Critical entry points

| What you want to change | Start here |
|---|---|
| Add a slash command | `src/bot/commands/`, then `src/scripts/deploy-commands.ts`, then run `npm run deploy-commands` |
| Add a tool the LLM can call | `src/harness/tools/<name>.ts`, register in `src/harness/tools/index.ts` |
| Change system prompt structure | `src/harness/promptBuilder.ts` |
| Change context window budget | `src/types/index.ts` → `CONTEXT_BUDGET` |
| Add an encounter | `specs/<name>.yaml` (see `specs/SPEC_FORMAT.md`) |
| Change message pipeline | `src/bot/handlers/messageRouter.ts` |
| Change LLM client | `src/harness/llmClient.ts` (router), `litellmClient.ts` / `ollamaClient.ts` (implementations) |
| Add a Foundry VTT feature | `src/vtt/`, then add a tool in `src/harness/tools/` |
| Add a GraphMCP-backed feature | `src/graphmcp/client.ts`, then add a tool |
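
For the "Add a tool the LLM can call" row, the registration flow might look roughly like the sketch below. The names `registerTool` and `getActiveTools` come from this document, but every signature, the `ToolPlugin` shape, and the synchronous handlers are assumptions made to keep the example short; the real plugins are likely asynchronous and richer.

```typescript
// Hypothetical shapes; the real plugin contract lives in src/harness/toolRegistry.ts.
interface ToolContext {
  sessionId: string;
}

interface ToolResult {
  systemMessage: string;
}

interface ToolPlugin {
  name: string;
  handler: (args: Record<string, unknown>, ctx: ToolContext) => ToolResult;
}

const registry = new Map<string, ToolPlugin>();

// Called once per tool module at load time.
function registerTool(plugin: ToolPlugin): void {
  if (registry.has(plugin.name)) {
    throw new Error(`Tool already registered: ${plugin.name}`);
  }
  registry.set(plugin.name, plugin);
}

// Resolve the tool names a spec declares into plugins, skipping unknown names.
function getActiveTools(names: string[]): ToolPlugin[] {
  return names.flatMap((name) => {
    const plugin = registry.get(name);
    return plugin ? [plugin] : [];
  });
}

// Stand-in handlers for illustration only.
registerTool({
  name: 'encounter_resolve',
  handler: (_args, ctx) => ({ systemMessage: `[RESOLVED] ${ctx.sessionId}` }),
});
registerTool({
  name: 'foundry_reward',
  handler: () => ({ systemMessage: '[FOUNDRY REWARD]' }),
});

const activeTools = getActiveTools(['encounter_resolve', 'not_registered']);
console.log(activeTools.map((tool) => tool.name));
```

Keeping registration as a load-time side effect is what makes a single import list in `tools/index.ts` sufficient to wire in a new plugin.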
934
package-lock.json
generated
File diff suppressed because it is too large
@@ -11,7 +11,9 @@
     "deploy-commands": "tsx src/scripts/deploy-commands.ts",
     "test": "vitest run",
     "test:unit": "vitest run tests/unit",
-    "test:int": "vitest run tests/integration"
+    "test:int": "vitest run tests/integration",
+    "test:coverage": "vitest run --coverage",
+    "test:watch": "vitest"
   },
   "dependencies": {
     "@discordjs/builders": "^1.10.0",
@@ -30,6 +32,7 @@
   "devDependencies": {
     "@types/js-yaml": "^4.0.9",
     "@types/node": "^22.0.0",
+    "@vitest/coverage-v8": "^3.2.6",
     "ioredis-mock": "^8.9.0",
     "tsx": "^4.19.0",
     "typescript": "^5.8.0",
130
tests/README.md
Normal file
@@ -0,0 +1,130 @@
# Tests

This directory holds the project's automated test suite.

## Layout

```
tests/
├── fixtures/      Shared test fixtures (spec, session, etc.)
├── integration/   Integration tests (require live infrastructure)
├── unit/          Unit tests (default CI gate)
└── README.md      You are here
```

- **`unit/`** — fast, isolated tests for individual modules. No network, no
  Redis, no Discord gateway. The CI default runs only this directory.
- **`integration/`** — slower tests that exercise real services (or mocks
  close to the wire). Run explicitly; not part of the default test command.
- **`fixtures/`** — reusable mocks (`mockSession`, `mockSpec`) shared by
  multiple unit tests.

## Running

```bash
npm test              # `vitest run`: one pass over the suite (not watch mode)
npm run test:unit     # run all tests in tests/unit
npm run test:int      # run all tests in tests/integration
npm run test:coverage # `vitest run --coverage`: same pass with a v8 coverage report
npm run test:watch    # vitest in watch mode
```

## Conventions

### 1. One module per file

A test file covers one source module. File name: `<moduleName>.test.ts`,
placed under `tests/unit/`. If a source module exports multiple functions
worth testing, group them with `describe` blocks in the same file.

### 2. Mock before import — always

Write every `vi.mock` call *above* the import of the module under test.
Vitest hoists `vi.mock` to the top of the file, so the mock is registered
before any static import resolves; keeping the calls textually first makes
that order obvious to readers. The pattern:

```ts
import { vi, describe, it, expect, beforeEach } from 'vitest';

const { mockFn } = vi.hoisted(() => ({ mockFn: vi.fn() }));

vi.mock('../../src/lib/logger.js', () => ({
  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
}));

// ...more mocks...

import { myFunction } from '../../src/some/module.js'; // AFTER mocks
```

`vi.hoisted` lets you share mock state between a `vi.mock` factory and the
test body: its callback is hoisted together with `vi.mock`, so the values it
returns already exist when a factory runs.

### 3. `vi.clearAllMocks()` in `beforeEach`

Prevents test bleed-through. If you also mutate config or module-level
state, reset it explicitly in `beforeEach`.

### 4. Reuse `mockSession` and `mockSpec`

Import from `../fixtures/spec.js`. Don't redefine session shape per file —
schema drift is one of the easier ways for tests to silently rot.

### 5. Test the *behavior*, not the implementation

Assert outcomes (return values, side effects on real collaborators, error
messages) rather than calling patterns. When a test would only pass with a
specific internal implementation, ask whether the contract is what's
documented in the source's doc comment.

### 6. Don't hit the network

- `fetch` → use `vi.stubGlobal('fetch', ...)` (see `foundryClientRetry.test.ts`).
- Discord client → pass a hand-rolled mock with only the methods the code uses
  (e.g. `messages.fetch`, `send`, `sendTyping`, `setArchived`).
- Redis → use `ioredis-mock` (see `sessionManager.test.ts`).
- LLM SDKs → mock the constructor (see `litellmClient.test.ts`,
  `ollamaClient.test.ts`).
- Filesystem → create a scratch directory with `mkdtempSync` (`node:fs`) under
  `tmpdir()` (`node:os`) (see `personaLoader.test.ts`).
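
The filesystem item in the list above can be sketched with Node's standard library alone. The helper name `withTempDir`, the `persona-test-` prefix, and the sample YAML content are assumptions for illustration; `personaLoader.test.ts` may organize its setup differently.

```typescript
import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Create an isolated scratch directory, run `fn` against it, and always clean up.
function withTempDir<T>(fn: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), 'persona-test-'));
  try {
    return fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

let scratchDir = '';
const loaded = withTempDir((dir) => {
  scratchDir = dir;
  const file = join(dir, 'persona.yaml');
  writeFileSync(file, 'name: Zalram Cloudwalker\n', 'utf8');
  return readFileSync(file, 'utf8');
});
console.log(loaded.trim(), existsSync(scratchDir));
```

Because `mkdtempSync` appends random characters to the prefix, parallel test files get distinct directories, and the `finally` block removes the directory even when an assertion inside the callback throws.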

### 7. Player-facing strings

When a test asserts on a string the bot would say to a player, prefer
in-world language over utility terms. (Same rule that applies to production
code — see `feedback-in-world-voice` memory.)

## Anti-patterns

- **Asserting private state.** Reach for behaviour-side assertions first.
- **Resetting state with `vi.resetModules()` for the sake of it.** It breaks
  shared mock state. Use it only when a module-scoped cache (e.g. a lazy
  client) needs to be re-constructed.
- **Catching all errors in a test.** If a test passes by accident because an
  unhandled rejection was swallowed, it's not testing anything.
- **Mocking the module under test.** If you have to mock the file you're
  testing, the test is asserting nothing.
- **Timeouts in `it()` callbacks.** Use `vi.useFakeTimers()` and
  `vi.advanceTimersByTimeAsync` to step time deterministically (see
  `messageRouterRunLLMTurn.test.ts` for the typing-indicator pattern).

## Adding a new test

1. Create `tests/unit/<name>.test.ts`.
2. Use the closest existing test as a template — `goalRegister.test.ts` for
   tool plugins, `foundryClientRetry.test.ts` for HTTP, `relaySession.test.ts`
   for `node:https` / `node:crypto`, `sessionManager.test.ts` for Redis.
3. Run `npm run test:unit -- <your-file>` to iterate quickly.
4. When green, run the full suite: `npm run test:unit`.
5. Optional: check `npm run test:coverage` to confirm the file's coverage.

## Coverage

`npm run test:coverage` produces a v8 coverage report in the terminal.
Directories worth watching:

- `src/bot/handlers/` — message routing; `runLLMTurn` is the runtime heart.
- `src/harness/tools/` — the tool plugin contracts.
- `src/vtt/` — Foundry relay; `foundryClient` is the biggest single file.

Coverage is informational, not a gate. The goal is to grow the unit test
surface for the modules that own irreversible or user-facing behavior.
205
tests/unit/foundryReward.test.ts
Normal file
@@ -0,0 +1,205 @@
import { vi, describe, it, expect, beforeEach } from 'vitest';

// ── registry mocks ───────────────────────────────────────────────────────────
const { mockGet: mockCharacterGet } = vi.hoisted(() => ({
  mockGet: vi.fn(),
}));

vi.mock('../../src/session/characterRegistry.js', () => ({
  characterRegistry: { get: mockCharacterGet },
}));

const { mockModifyExperience, mockGiveItem } = vi.hoisted(() => ({
  mockModifyExperience: vi.fn(),
  mockGiveItem: vi.fn(),
}));

vi.mock('../../src/vtt/foundryClient.js', () => ({
  modifyExperience: mockModifyExperience,
  giveItem: mockGiveItem,
}));

import { dispatchTool } from '../../src/harness/toolDispatcher.js';
import { mockSession } from '../fixtures/spec.js';

function makeThread() {
  return { send: vi.fn().mockResolvedValue({ id: 'msg-1' }) };
}

const playerSession = {
  ...mockSession,
  players: {
    'user-1': { discordId: 'user-1', dndName: 'Aelindra' },
  },
};

beforeEach(() => {
  vi.clearAllMocks();
  mockModifyExperience.mockResolvedValue(undefined);
  mockGiveItem.mockResolvedValue(undefined);
});

describe('dispatchTool — foundry_reward', () => {
  it('awards both XP and item to a registered Foundry-linked player', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1',
      dndName: 'Aelindra',
      source: 'foundry',
      foundryActorUuid: 'Actor.abc',
    });

    const result = await dispatchTool(
      {
        tool: 'foundry_reward',
        args: {
          player_discord_name: 'Aelindra',
          xp_amount: 50,
          item_name: 'Potion of Healing',
          reason: 'Caught the thief.',
        },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(result.systemMessage).toContain('[FOUNDRY REWARD]');
    expect(result.systemMessage).toContain('Aelindra');
    expect(result.systemMessage).toContain('Potion of Healing');
    expect(result.systemMessage).toContain('50 XP');
    expect(result.systemMessage).toContain('Caught the thief.');
    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 50);
    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Potion of Healing');
  });

  it('matches player name case-insensitively', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1',
      dndName: 'Aelindra',
      source: 'foundry',
      foundryActorUuid: 'Actor.abc',
    });

    await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'aelindra', xp_amount: 10, reason: 'good roleplay' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 10);
  });

  it('awards only XP when item_name is omitted', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
    });

    await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', xp_amount: 25, reason: 'milestone' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.abc', 25);
    expect(mockGiveItem).not.toHaveBeenCalled();
  });

  it('awards only an item when xp_amount is zero', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
    });

    await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', xp_amount: 0, item_name: 'Gold Piece', reason: 'tip' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Gold Piece');
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('skips XP when xp_amount is missing', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1', dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.abc',
    });

    await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', item_name: 'Ring', reason: 'find' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(mockGiveItem).toHaveBeenCalledWith('Actor.abc', 'Ring');
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('returns a "no player" system message and does not call Foundry when the player is not in the session', async () => {
    const result = await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Nobody', xp_amount: 5, reason: 'typo' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(result.systemMessage).toContain('No player found matching "Nobody"');
    expect(mockCharacterGet).not.toHaveBeenCalled();
    expect(mockModifyExperience).not.toHaveBeenCalled();
    expect(mockGiveItem).not.toHaveBeenCalled();
  });

  it('returns a "no character record" message when the player has no Foundry UUID', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1', dndName: 'Aelindra', source: 'custom', /* no foundryActorUuid */
    });

    const result = await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(result.systemMessage).toContain('No character record found for this player');
    expect(mockModifyExperience).not.toHaveBeenCalled();
    expect(mockGiveItem).not.toHaveBeenCalled();
  });

  it('returns a "no character record" message when the player has no profile at all', async () => {
    mockCharacterGet.mockResolvedValue(null);

    const result = await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(result.systemMessage).toContain('No character record found for this player');
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('catches errors from characterRegistry and returns the friendly error', async () => {
    mockCharacterGet.mockRejectedValue(new Error('redis down'));

    const result = await dispatchTool(
      {
        tool: 'foundry_reward',
        args: { player_discord_name: 'Aelindra', xp_amount: 5, reason: 'try' },
      },
      { session: playerSession, thread: makeThread() as any },
    );

    expect(result.systemMessage).toContain('Character records are inaccessible');
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });
});
162
tests/unit/litellmClient.test.ts
Normal file
@@ -0,0 +1,162 @@
import { vi, describe, it, expect, beforeEach } from 'vitest';

// ── config mock ──────────────────────────────────────────────────────────────
vi.mock('../../src/config.js', () => ({
  config: {
    LITELLM_BASE_URL: 'http://100.83.8.74:4000',
    LITELLM_API_KEY: 'test-key',
    LITELLM_MODEL: 'ollama-cloud',
    OLLAMA_TEMPERATURE: 0.75,
    OLLAMA_TIMEOUT_MS: 120_000,
    OLLAMA_MODEL: 'gemma4-it:e2b',
  },
}));

vi.mock('../../src/lib/logger.js', () => ({
  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
}));

// ── openai client mock ────────────────────────────────────────────────────────
const { mockCreate } = vi.hoisted(() => ({
  mockCreate: vi.fn(),
}));

vi.mock('openai', () => ({
  default: vi.fn().mockImplementation(() => ({
    chat: { completions: { create: mockCreate } },
  })),
}));

import { callLLM } from '../../src/harness/litellmClient.js';

beforeEach(() => {
  vi.clearAllMocks();
  // Reset LITELLM_MODEL in case a previous test mutated it.
  return import('../../src/config.js').then(({ config }) => {
    (config as Record<string, unknown>).LITELLM_MODEL = 'ollama-cloud';
  });
});

describe('litellmClient.callLLM', () => {
  it('returns parsed narrative and tool call from the OpenAI-compatible response', async () => {
    mockCreate.mockResolvedValueOnce({
      choices: [
        {
          message: {
            content: 'Roll for initiative. ```tool_call\n{"tool":"encounter_resolve","args":{"sessionId":"s1","outcomeId":"catch","summary":"Caught him"}}\n```',
          },
        },
      ],
      usage: { completion_tokens: 88, prompt_tokens: 4000 },
    });

    const result = await callLLM([{ role: 'user', content: 'I tackle him.', timestamp: 1 }]);

    expect(result.narrative).toBe('Roll for initiative.');
    expect(result.toolCall?.tool).toBe('encounter_resolve');
    expect(result.toolCall?.args).toEqual({ sessionId: 's1', outcomeId: 'catch', summary: 'Caught him' });
    expect(result.rawTokensUsed).toBe(88);
  });

  it('configures the OpenAI client with the LiteLLM base URL + API key + timeout', async () => {
    // Force a fresh litellmClient so its cached _client is re-constructed with
    // the current config values.
    vi.resetModules();
    const OpenAI = (await import('openai')).default;
    const { callLLM: freshCallLLM } = await import('../../src/harness/litellmClient.js');
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });

    await freshCallLLM([{ role: 'user', content: 'hi', timestamp: 1 }]);

    expect(OpenAI).toHaveBeenCalledWith({
      baseURL: 'http://100.83.8.74:4000/v1',
      apiKey: 'test-key',
      timeout: 120_000,
    });
  });

  it('falls back to the literal string "no-key" when LITELLM_API_KEY is empty', async () => {
    const { config } = await import('../../src/config.js');
    const mutableConfig = config as Record<string, unknown>;
    const originalKey = mutableConfig.LITELLM_API_KEY;
    mutableConfig.LITELLM_API_KEY = '';
    try {
      vi.resetModules();
      const OpenAI = (await import('openai')).default;
      const { callLLM: freshCallLLM } = await import('../../src/harness/litellmClient.js');
      mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });

      await freshCallLLM([{ role: 'user', content: 'hi', timestamp: 1 }]);

      expect(OpenAI).toHaveBeenCalledWith(
        expect.objectContaining({ apiKey: 'no-key' }),
      );
    } finally {
      // Restore the key so later tests do not inherit the empty value.
      mutableConfig.LITELLM_API_KEY = originalKey;
    }
  });

  it('uses LITELLM_MODEL when set, otherwise falls back to OLLAMA_MODEL', async () => {
    const { config } = await import('../../src/config.js');

    (config as Record<string, unknown>).LITELLM_MODEL = 'big-model';
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
    await callLLM([{ role: 'user', content: 'a', timestamp: 1 }]);
    expect(mockCreate).toHaveBeenLastCalledWith(
      expect.objectContaining({ model: 'big-model' }),
    );

    (config as Record<string, unknown>).LITELLM_MODEL = undefined;
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });
    await callLLM([{ role: 'user', content: 'b', timestamp: 2 }]);
    expect(mockCreate).toHaveBeenLastCalledWith(
      expect.objectContaining({ model: 'gemma4-it:e2b' }),
    );
  });

  it('passes messages and temperature through to the OpenAI client', async () => {
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });

    await callLLM([
      { role: 'system', content: 'sys', timestamp: 0 },
      { role: 'user', content: 'hi', timestamp: 1 },
    ]);

    expect(mockCreate).toHaveBeenCalledWith({
      model: 'ollama-cloud',
      messages: [
        { role: 'system', content: 'sys' },
        { role: 'user', content: 'hi' },
      ],
      temperature: 0.75,
    });
  });

  it('returns an empty narrative when the model response is empty', async () => {
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: '' } }] });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.narrative).toBe('');
    expect(result.toolCall).toBeUndefined();
  });

  it('falls back to an empty string when the response has no choices at all', async () => {
    mockCreate.mockResolvedValueOnce({ choices: [] });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.narrative).toBe('');
    expect(result.toolCall).toBeUndefined();
  });

  it('handles a missing usage field without crashing', async () => {
    mockCreate.mockResolvedValueOnce({ choices: [{ message: { content: 'ok' } }] });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.rawTokensUsed).toBeUndefined();
  });

  it('propagates errors from the OpenAI client', async () => {
    mockCreate.mockRejectedValueOnce(new Error('rate limit exceeded'));

    await expect(
      callLLM([{ role: 'user', content: 'hi', timestamp: 1 }]),
    ).rejects.toThrow('rate limit exceeded');
  });
});
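The first test in this file implies that `callLLM` splits the model output into player-facing narrative and a fenced `tool_call` JSON block. The parser itself is not part of this diff, so the following is only a minimal sketch of how such a split could work. `parseLLMContent`, `ToolCall`, and `ParsedResponse` are hypothetical names chosen for illustration and may not match the real implementation in `src/harness`.

```typescript
// Hypothetical types; the project's real types are not shown in this diff.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface ParsedResponse {
  narrative: string;
  toolCall?: ToolCall;
}

// Built with repeat() so this example never contains a literal fence marker.
const FENCE = '`'.repeat(3);

// Matches one fenced block labelled tool_call and captures its JSON body.
const TOOL_CALL_BLOCK = new RegExp(`${FENCE}tool_call\\s*([\\s\\S]*?)${FENCE}`);

function parseLLMContent(content: string): ParsedResponse {
  const match = TOOL_CALL_BLOCK.exec(content);
  if (!match) {
    return { narrative: content.trim() };
  }

  // Everything outside the fenced block is treated as narrative.
  const before = content.slice(0, match.index);
  const after = content.slice(match.index + match[0].length);
  const narrative = (before + after).trim();

  try {
    const parsed = JSON.parse(match[1] ?? '') as Partial<ToolCall>;
    if (typeof parsed.tool === 'string') {
      return { narrative, toolCall: { tool: parsed.tool, args: parsed.args ?? {} } };
    }
  } catch {
    // Malformed JSON: fall through and return the narrative alone.
  }
  return { narrative };
}
```

Under these assumptions, an empty string yields an empty narrative and no tool call, which is the behaviour the "empty response" and "no choices" cases above assert.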
409
tests/unit/messageRouterRunLLMTurn.test.ts
Normal file
@@ -0,0 +1,409 @@
import { vi, describe, it, expect, beforeEach, afterEach } from 'vitest';

// ── assembled-context mock ───────────────────────────────────────────────────
const { mockAssembleContext } = vi.hoisted(() => ({
  mockAssembleContext: vi.fn(),
}));

vi.mock('../../src/harness/contextAssembler.js', () => ({
  assembleContext: mockAssembleContext,
}));

// ── LLM client mock ──────────────────────────────────────────────────────────
const { mockCallLLM } = vi.hoisted(() => ({
  mockCallLLM: vi.fn(),
}));

vi.mock('../../src/harness/llmClient.js', () => ({
  callLLM: mockCallLLM,
}));

// ── dispatchTool mock ────────────────────────────────────────────────────────
const { mockDispatchTool } = vi.hoisted(() => ({
  mockDispatchTool: vi.fn(),
}));

vi.mock('../../src/harness/toolDispatcher.js', () => ({
  dispatchTool: mockDispatchTool,
}));

// ── sessionManager mock ──────────────────────────────────────────────────────
const { mockAddMessage, mockUpdate, mockGet } = vi.hoisted(() => ({
  mockAddMessage: vi.fn(),
  mockUpdate: vi.fn(),
  mockGet: vi.fn(),
}));

vi.mock('../../src/session/sessionManager.js', () => ({
  sessionManager: {
    addMessage: mockAddMessage,
    update: mockUpdate,
    get: mockGet,
  },
}));

// ── responseFilter mock ──────────────────────────────────────────────────────
const { mockFilterLLMResponse, mockDetectMissedSkillCheck, mockLogFiltered } = vi.hoisted(() => ({
  mockFilterLLMResponse: vi.fn(),
  mockDetectMissedSkillCheck: vi.fn(),
  mockLogFiltered: vi.fn(),
}));

vi.mock('../../src/bot/handlers/responseFilter.js', () => ({
  filterLLMResponse: mockFilterLLMResponse,
  detectMissedSkillCheck: mockDetectMissedSkillCheck,
  logFiltered: mockLogFiltered,
}));

// ── reaction / burst mocks ───────────────────────────────────────────────────
vi.mock('../../src/bot/handlers/reactionManager.js', () => ({
  registerScheduled: vi.fn(),
  drainPending: vi.fn(() => []),
  clearPending: vi.fn(),
  upgradeToProcessing: vi.fn(),
  upgradeToComplete: vi.fn(),
  cleanupReactions: vi.fn(),
}));

vi.mock('../../src/bot/handlers/queueCap.js', () => ({
  isBurstCapped: vi.fn(() => false),
  incrementBurst: vi.fn(),
  resetBurst: vi.fn(),
  sendDropNotice: vi.fn(),
}));

vi.mock('../../src/lib/logger.js', () => ({
  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
}));

// ── subject under test ───────────────────────────────────────────────────────
// Import AFTER all vi.mock calls. We re-import per-test where needed so we can
// attach vi.spyOn() to the module's scheduleEncounterLLMTurn export.
let runLLMTurn: typeof import('../../src/bot/handlers/messageRouter.js').runLLMTurn;
let scheduleSpy: ReturnType<typeof vi.spyOn>;
import { mockSession } from '../fixtures/spec.js';
import type { SessionState } from '../../src/types/index.js';

function makeThread(extra: Partial<{ setArchived: any }> = {}) {
  const thread: any = {
    send: vi.fn().mockResolvedValue({ id: 'sent-msg' }),
    sendTyping: vi.fn().mockResolvedValue(undefined),
    setArchived: extra.setArchived ?? vi.fn().mockResolvedValue(undefined),
    messages: { fetch: vi.fn().mockResolvedValue(null) },
  };
  return thread;
}

function sessionWith(history: SessionState['history'], pending?: SessionState['pendingSkillCheck']): SessionState {
  return { ...mockSession, history, pendingSkillCheck: pending };
}

beforeEach(async () => {
  vi.clearAllMocks();
  vi.useFakeTimers();
  // Re-import the module under test each time so we can spy on its
  // scheduleEncounterLLMTurn export. The mocks are reused across imports.
  const mod = await import('../../src/bot/handlers/messageRouter.js');
  runLLMTurn = mod.runLLMTurn;
  scheduleSpy = vi.spyOn(mod, 'scheduleEncounterLLMTurn').mockImplementation(() => undefined);

  // Always default: context assembles to something, filter accepts everything.
  mockAssembleContext.mockReturnValue([{ role: 'system', content: 'sys', timestamp: 0 }]);
  mockFilterLLMResponse.mockReturnValue({ ok: true });
  mockDetectMissedSkillCheck.mockReturnValue(false);
  mockAddMessage.mockResolvedValue(undefined);
  mockUpdate.mockResolvedValue(undefined);
  mockGet.mockImplementation(async (threadId: string) => ({ ...mockSession, threadId }));
  mockDispatchTool.mockResolvedValue({ systemMessage: '[TOOL] done', resolved: undefined, error: undefined });
});

afterEach(() => {
  vi.useRealTimers();
});

describe('runLLMTurn — narrative-only response (no tool call)', () => {
  it('posts the narrative to the thread', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'The wind howls.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(thread.send).toHaveBeenCalledWith('The wind howls.');
  });

  it('stores the assistant narrative in session history', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'A leaf falls.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockAddMessage).toHaveBeenCalledWith(
      mockSession.threadId,
      expect.objectContaining({ role: 'assistant', content: 'A leaf falls.' }),
    );
  });

  it('does not call dispatchTool when there is no tool call', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'quiet.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockDispatchTool).not.toHaveBeenCalled();
  });

  it('passes skipRollClaim:true when a [SKILL CHECK RESULT] message is in the recent 6 messages', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 15 and hit the goblin.', toolCall: undefined });
    const thread = makeThread();

    const history: SessionState['history'] = [
      { role: 'system', content: '[SKILL CHECK RESULT] Aelindra rolled 15 vs DC 12. Result: SUCCESS.', timestamp: 1 },
    ];
    await runLLMTurn(sessionWith(history), thread, {} as any);

    expect(mockFilterLLMResponse).toHaveBeenCalledWith(
      'You rolled a 15 and hit the goblin.',
      { skipRollClaim: true },
    );
  });

  it('passes skipRollClaim:false when no recent [SKILL CHECK RESULT] message exists', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: '...', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockFilterLLMResponse).toHaveBeenCalledWith('...', { skipRollClaim: false });
  });
});

describe('runLLMTurn — filter correction', () => {
  it('on filter rejection with no recent correction, sends a [FILTER CORRECTION] system message', async () => {
    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 17.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockLogFiltered).toHaveBeenCalledWith(
      'fabricated_roll_result',
      'You rolled a 17.',
      expect.objectContaining({ threadId: mockSession.threadId, encounterId: mockSession.encounterId }),
    );
    expect(mockAddMessage).toHaveBeenCalledWith(
      mockSession.threadId,
      expect.objectContaining({
        role: 'system',
        content: expect.stringMatching(/^\[FILTER CORRECTION\]/),
      }),
    );
    // The retry path also invokes scheduleEncounterLLMTurn with immediate=true.
    // (We can't reliably observe the internal call via the export spy in ESM
    // live-bindings, so we verify the side effects directly.)
    const correction = mockAddMessage.mock.calls.find(([_, m]) =>
      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
    )?.[1] as { content: string };
    expect(correction.content).toMatch(/Do NOT state or imply a specific dice result/);
  });

  it('on filter rejection when last message is already a correction, skips the retry to avoid loops', async () => {
    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'empty_response' });
    mockCallLLM.mockResolvedValueOnce({ narrative: '', toolCall: undefined });
    const thread = makeThread();

    const history: SessionState['history'] = [
      { role: 'system', content: '[FILTER CORRECTION] previous turn suppressed (empty_response).', timestamp: 1 },
    ];

    await runLLMTurn(sessionWith(history), thread, {} as any);

    // No new correction message should be added when one was just sent.
    const correctionAdds = mockAddMessage.mock.calls.filter(([_, m]) =>
      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
    );
    expect(correctionAdds).toHaveLength(0);
  });

  it('uses the echoed_system_tag correction text when filter rejects for that reason', async () => {
    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'echoed_system_tag' });
    mockCallLLM.mockResolvedValueOnce({ narrative: '[TOOL] something', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    const correction = mockAddMessage.mock.calls.find(([_, m]) =>
      (m as { content: string }).content.startsWith('[FILTER CORRECTION]'),
    )?.[1] as { content: string };
    expect(correction.content).toMatch(/Do NOT echo internal system tags/);
  });

  it('does NOT post the filtered narrative to the thread', async () => {
    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
    mockCallLLM.mockResolvedValueOnce({ narrative: 'You rolled a 17.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(thread.send).not.toHaveBeenCalledWith('You rolled a 17.');
  });
});

describe('runLLMTurn — tool call dispatch', () => {
  it('dispatches the toolCall with a freshly fetched session and writes the system message', async () => {
    mockCallLLM.mockResolvedValueOnce({
      narrative: '',
      toolCall: { tool: 'goal_register', args: { goals: ['x'] } },
    });
    const freshSession = { ...mockSession, fetched: true };
    mockGet.mockResolvedValueOnce(freshSession);
    mockDispatchTool.mockResolvedValueOnce({ systemMessage: '[TOOL] ok', error: undefined, resolved: undefined });

    const thread = makeThread();
    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockGet).toHaveBeenCalledWith(mockSession.threadId);
    expect(mockDispatchTool).toHaveBeenCalledWith(
      { tool: 'goal_register', args: { goals: ['x'] } },
      expect.objectContaining({ session: freshSession, thread }),
    );
    expect(mockAddMessage).toHaveBeenCalledWith(
      mockSession.threadId,
      expect.objectContaining({ role: 'system', content: '[TOOL] ok' }),
    );
  });

  it('posts a friendly fallback message when dispatchTool returns an error', async () => {
    mockCallLLM.mockResolvedValueOnce({
      narrative: '',
      toolCall: { tool: 'goal_register', args: {} },
    });
    mockDispatchTool.mockResolvedValueOnce({ systemMessage: '[TOOL] failed', error: new Error('boom'), resolved: undefined });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(thread.send).toHaveBeenCalledWith(expect.stringMatching(/narrator stumbles/));
  });

  it('marks the session resolved and schedules archive when tool reports resolved', async () => {
    mockCallLLM.mockResolvedValueOnce({
      narrative: '',
      toolCall: { tool: 'encounter_resolve', args: { outcomeId: 'catch', summary: 'got him' } },
    });
    mockDispatchTool.mockResolvedValueOnce({
      systemMessage: '[TOOL] resolved',
      resolved: { outcomeId: 'catch', summary: 'got him' },
      error: undefined,
    });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockUpdate).toHaveBeenCalledWith(mockSession.threadId, {
      phase: 'resolved',
      outcome: 'catch',
      outcomeSummary: 'got him',
    });

    // The archive setTimeout fires after 5 seconds.
    expect(thread.setArchived).not.toHaveBeenCalled();
    await vi.advanceTimersByTimeAsync(5_000);
    expect(thread.setArchived).toHaveBeenCalledWith(true);
  });

  it('does not throw and returns early when the session was deleted before dispatch', async () => {
    mockCallLLM.mockResolvedValueOnce({
      narrative: '',
      toolCall: { tool: 'goal_register', args: {} },
    });
    mockGet.mockResolvedValueOnce(null); // session disappeared
    const thread = makeThread();

    await expect(runLLMTurn(sessionWith([]), thread, {} as any)).resolves.toBeUndefined();
    expect(mockDispatchTool).not.toHaveBeenCalled();
  });

  it('still dispatches the tool even when the narrative was filtered', async () => {
    mockFilterLLMResponse.mockReturnValue({ ok: false, reason: 'fabricated_roll_result' });
    mockCallLLM.mockResolvedValueOnce({
      narrative: 'You rolled a 12. ',
      toolCall: { tool: 'goal_register', args: { foo: 'bar' } },
    });
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(mockDispatchTool).toHaveBeenCalled();
    // But the narrative was suppressed. Match on a substring so the check does
    // not depend on whether the trailing space in the mocked narrative is trimmed.
    expect(thread.send).not.toHaveBeenCalledWith(expect.stringContaining('You rolled a 12'));
  });
});

describe('runLLMTurn — LLM error', () => {
  it('posts a friendly error message when the LLM throws and clears the typing interval', async () => {
    mockCallLLM.mockRejectedValueOnce(new Error('503 from upstream'));
    const thread = makeThread();
    const consoleSpy = vi.spyOn(console, 'error').mockImplementation(() => {});

    await runLLMTurn(sessionWith([]), thread, {} as any);

    expect(consoleSpy).toHaveBeenCalledWith('[messageRouter] LLM call failed:', expect.any(Error));
    expect(thread.send).toHaveBeenCalledWith(expect.stringMatching(/narrator pauses/));
    // The interval would normally fire every 8s. Advance well past that and
    // confirm no further typing indicators are sent once the turn has failed.
    const typingCallsBefore = thread.sendTyping.mock.calls.length;
    await vi.advanceTimersByTimeAsync(20_000);
    expect(thread.sendTyping.mock.calls.length).toBe(typingCallsBefore);
    // No filter or dispatch should have happened.
    expect(mockFilterLLMResponse).not.toHaveBeenCalled();
    expect(mockDispatchTool).not.toHaveBeenCalled();

    consoleSpy.mockRestore();
  });
});

describe('runLLMTurn — missed skill check heuristic', () => {
  it('runs the missed-skill-check heuristic when no tool call was emitted and no roll is pending', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'Make a Strength check.', toolCall: undefined });
    mockDetectMissedSkillCheck.mockReturnValueOnce(true);
    const thread = makeThread();

    await runLLMTurn(sessionWith([]), thread, {} as any);

    // The logger mock is not captured in this file, so only the heuristic
    // invocation is asserted here, not the resulting warning.
    expect(mockDetectMissedSkillCheck).toHaveBeenCalledWith('Make a Strength check.');
  });

  it('skips the heuristic when a roll result is already pending', async () => {
    mockCallLLM.mockResolvedValueOnce({ narrative: 'Make a check.', toolCall: undefined });
    const thread = makeThread();

    await runLLMTurn(
      sessionWith([], { player: 'Aelindra', dc: 12, messageId: 'm1' }),
      thread,
      {} as any,
    );

    expect(mockDetectMissedSkillCheck).not.toHaveBeenCalled();
  });
});

describe('runLLMTurn — typing indicator', () => {
  it('starts a typing indicator that fires every 8s while the LLM is being awaited', async () => {
    // Make callLLM slow so we can observe the interval
    let resolveCall!: (v: unknown) => void;
    mockCallLLM.mockReturnValueOnce(new Promise(r => { resolveCall = r; }));
    const thread = makeThread();

    const pending = runLLMTurn(sessionWith([]), thread, {} as any);
    expect(thread.sendTyping).toHaveBeenCalledTimes(1);

    await vi.advanceTimersByTimeAsync(8_000);
    expect(thread.sendTyping).toHaveBeenCalledTimes(2);

    resolveCall({ narrative: 'ok', toolCall: undefined });
    await pending;

    // After resolution the interval is cleared; advancing further should not send typing again.
    const callsBefore = thread.sendTyping.mock.calls.length;
    await vi.advanceTimersByTimeAsync(20_000);
    expect(thread.sendTyping.mock.calls.length).toBe(callsBefore);
  });
});
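Two of the behaviours covered in this file depend only on the session history: `skipRollClaim` is set when a `[SKILL CHECK RESULT]` message appears in the last six messages, and a filter retry is skipped when the latest message is already a `[FILTER CORRECTION]`. The sketch below shows one way those checks could be written as pure functions. `hasRecentSkillCheckResult`, `shouldRetryAfterFilter`, and `HistoryMessage` are hypothetical names for illustration; the actual logic inside `runLLMTurn` is not shown in this diff and may differ.

```typescript
// Hypothetical message shape, mirroring the fields the tests above use.
interface HistoryMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  timestamp: number;
}

const SKILL_CHECK_TAG = '[SKILL CHECK RESULT]';
const CORRECTION_TAG = '[FILTER CORRECTION]';

// True when any of the last `windowSize` messages carries a skill-check result.
function hasRecentSkillCheckResult(history: HistoryMessage[], windowSize = 6): boolean {
  if (windowSize <= 0) return false; // slice(-0) would return the whole array
  return history.slice(-windowSize).some((m) => m.content.startsWith(SKILL_CHECK_TAG));
}

// A retry is allowed only when the latest message is not itself a correction,
// which keeps a persistently rejected response from looping forever.
function shouldRetryAfterFilter(history: HistoryMessage[]): boolean {
  const last = history[history.length - 1];
  if (last === undefined) return true;
  return !(last.role === 'system' && last.content.startsWith(CORRECTION_TAG));
}
```

Keeping checks like these free of I/O would let them be tested without the mock scaffolding this file needs for `runLLMTurn` itself.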
103
tests/unit/ollamaClient.test.ts
Normal file
@@ -0,0 +1,103 @@
import { vi, describe, it, expect, beforeEach } from 'vitest';

// ── config mock (must come before module under test) ──────────────────────────
vi.mock('../../src/config.js', () => ({
  config: {
    OLLAMA_BASE_URL: 'http://localhost:11434',
    OLLAMA_MODEL: 'gemma4-it:e2b',
    OLLAMA_TEMPERATURE: 0.75,
    OLLAMA_NUM_CTX: 131072,
  },
}));

vi.mock('../../src/lib/logger.js', () => ({
  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
}));

// ── ollama npm client mock ────────────────────────────────────────────────────
const { mockChat } = vi.hoisted(() => ({
  mockChat: vi.fn(),
}));

vi.mock('ollama', () => ({
  Ollama: vi.fn().mockImplementation(() => ({
    chat: mockChat,
  })),
}));

import { callLLM } from '../../src/harness/ollamaClient.js';

beforeEach(() => {
  vi.clearAllMocks();
});

describe('ollamaClient.callLLM', () => {
  it('returns parsed narrative and tool call from the ollama response', async () => {
    mockChat.mockResolvedValueOnce({
      message: { content: 'The goblin snarls. ```tool_call\n{"tool":"skill_check_emit","args":{"player":"Aelindra","prompt":"Strike","dc":12}}\n```' },
      eval_count: 42,
    });

    const result = await callLLM([{ role: 'user', content: 'I attack.', timestamp: 1 }]);

    expect(result.narrative).toBe('The goblin snarls.');
    expect(result.toolCall?.tool).toBe('skill_check_emit');
    expect(result.rawTokensUsed).toBe(42);
  });

  it('passes messages, model, stream:false, and options to the ollama client', async () => {
    mockChat.mockResolvedValueOnce({ message: { content: 'ok' }, eval_count: 5 });

    await callLLM([
      { role: 'system', content: 'You are the DM.', timestamp: 0 },
      { role: 'user', content: 'I look around.', timestamp: 1 },
    ]);

    expect(mockChat).toHaveBeenCalledWith({
      model: 'gemma4-it:e2b',
      messages: [
        { role: 'system', content: 'You are the DM.' },
        { role: 'user', content: 'I look around.' },
      ],
      stream: false,
      options: { temperature: 0.75, num_ctx: 131072 },
    });
  });

  it('returns just the narrative when there is no tool call block', async () => {
    mockChat.mockResolvedValueOnce({ message: { content: 'A quiet moment.' }, eval_count: 7 });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.narrative).toBe('A quiet moment.');
    expect(result.toolCall).toBeUndefined();
    expect(result.rawTokensUsed).toBe(7);
  });

  it('propagates errors from the ollama client', async () => {
    mockChat.mockRejectedValueOnce(new Error('connection refused'));

    await expect(
      callLLM([{ role: 'user', content: 'hi', timestamp: 1 }]),
    ).rejects.toThrow('connection refused');
  });

  it('handles an empty message content without crashing', async () => {
    mockChat.mockResolvedValueOnce({ message: { content: '' }, eval_count: 0 });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.narrative).toBe('');
    expect(result.toolCall).toBeUndefined();
    expect(result.rawTokensUsed).toBe(0);
  });

  it('handles a missing eval_count without crashing', async () => {
    mockChat.mockResolvedValueOnce({ message: { content: 'ok' } });

    const result = await callLLM([{ role: 'user', content: '...', timestamp: 1 }]);

    expect(result.narrative).toBe('ok');
    expect(result.rawTokensUsed).toBeUndefined();
  });
});
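The options-passthrough test above pins down the exact request object handed to the Ollama client: history entries lose their `timestamp`, `stream` is `false`, and temperature and context size go under `options`. As a rough illustration, that mapping could be isolated in a small pure function like the one below. `buildChatRequest` and `OllamaSettings` are hypothetical names introduced here; the real wrapper in `src/harness/ollamaClient.ts` is not part of this diff and may build the request differently.

```typescript
// Hypothetical shapes for illustration only.
interface HistoryMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  timestamp: number;
}

interface OllamaSettings {
  model: string;
  temperature: number;
  numCtx: number;
}

// Maps internal history entries to the wire format asserted in the test above:
// only role and content are forwarded, and sampling settings go under `options`.
function buildChatRequest(history: HistoryMessage[], settings: OllamaSettings) {
  return {
    model: settings.model,
    messages: history.map(({ role, content }) => ({ role, content })),
    stream: false as const,
    options: { temperature: settings.temperature, num_ctx: settings.numCtx },
  };
}

const exampleRequest = buildChatRequest(
  [
    { role: 'system', content: 'You are the DM.', timestamp: 0 },
    { role: 'user', content: 'I look around.', timestamp: 1 },
  ],
  { model: 'gemma4-it:e2b', temperature: 0.75, numCtx: 131072 },
);
```

Here `exampleRequest` has the same shape as the object the test expects `mockChat` to receive.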
120
tests/unit/personaLoader.test.ts
Normal file
@@ -0,0 +1,120 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { mkdtempSync, writeFileSync, rmSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

import { loadPersona, clearPersonaCache } from '../../src/persona/loader.js';

let tmpDir: string;

beforeEach(() => {
  clearPersonaCache();
  tmpDir = mkdtempSync(join(tmpdir(), 'persona-test-'));
});

afterEach(() => {
  rmSync(tmpDir, { recursive: true, force: true });
});

function writePersona(yaml: string): string {
  const path = join(tmpDir, 'persona.yaml');
  writeFileSync(path, yaml, 'utf8');
  return path;
}

describe('loadPersona', () => {
  it('loads a valid persona YAML file and parses it', () => {
    const path = writePersona(`
name: "Zalram Cloudwalker"
description: "Aasimar Divination Wizard, level 8"
persona: |
  You are Zalram — bound to the digital realm.
responseStyle: "Dry, formal, occasionally sardonic."
`);

    const persona = loadPersona(path);

    expect(persona.name).toBe('Zalram Cloudwalker');
    expect(persona.description).toBe('Aasimar Divination Wizard, level 8');
    expect(persona.persona).toContain('You are Zalram');
    expect(persona.responseStyle).toBe('Dry, formal, occasionally sardonic.');
  });

it('caches the result — second call returns the same instance without re-reading the file', () => {
    const path = writePersona(`
name: "Test"
description: "A test persona"
persona: "Persona text"
responseStyle: "Style text"
`);

    const first = loadPersona(path);
    // Replace the file with something invalid. The cached result must still come back.
    writeFileSync(path, 'this is not valid YAML: [', 'utf8');

    const second = loadPersona(path);
    expect(second).toBe(first);
  });

  it('clears the cache when clearPersonaCache is called', () => {
    const path1 = writePersona(`
name: "First"
description: "d"
persona: "p"
responseStyle: "r"
`);
    const first = loadPersona(path1);

    // Mutate the file to something different, then clear + reload.
    writeFileSync(path1, `
name: "Second"
description: "d"
persona: "p"
responseStyle: "r"
`, 'utf8');
    clearPersonaCache();

    const second = loadPersona(path1);
    expect(second.name).toBe('Second');
    expect(second).not.toBe(first);
  });

  it('loads from an explicitly passed path instead of the default ./persona.yaml', () => {
    // Exercising the real default would require a ./persona.yaml in the working
    // directory, so this test only checks that an explicit path is honoured
    // when it differs from the default.
    const path = writePersona(`
name: "DefaultTest"
description: "d"
persona: "p"
responseStyle: "r"
`);

    const persona = loadPersona(path);
    expect(persona.name).toBe('DefaultTest');
  });

  it('throws a Zod validation error when a required field is missing', () => {
    const path = writePersona(`
name: "Missing fields"
# description, persona, responseStyle all absent
`);

    expect(() => loadPersona(path)).toThrow();
  });

  it('throws a Zod validation error when a field has the wrong type', () => {
    const path = writePersona(`
name: 123
description: "d"
persona: "p"
responseStyle: "r"
`);

    expect(() => loadPersona(path)).toThrow();
  });

  it('throws when the file does not exist', () => {
    expect(() => loadPersona(join(tmpDir, 'does-not-exist.yaml'))).toThrow();
  });
});
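The persona tests above exercise three properties: validation of four required string fields, a cache that returns the same instance until cleared, and a reload after `clearPersonaCache()`. The sketch below illustrates that cache-plus-validate pattern in isolation. It is not the project's loader: the real one uses Zod and reads YAML from disk, whereas this version uses a hand-written check and an injected `read` function so it stays dependency-free. `createPersonaLoader`, `validatePersona`, and the per-path `Map` cache are all assumptions made for the example.

```typescript
interface Persona {
  name: string;
  description: string;
  persona: string;
  responseStyle: string;
}

const REQUIRED_FIELDS = ['name', 'description', 'persona', 'responseStyle'] as const;

// Hand-rolled stand-in for schema validation: every required field must be a string.
function validatePersona(raw: unknown): Persona {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('persona must be an object');
  }
  const record = raw as Record<string, unknown>;
  for (const field of REQUIRED_FIELDS) {
    if (typeof record[field] !== 'string') {
      throw new Error(`persona.${field} must be a string`);
    }
  }
  return record as unknown as Persona;
}

// `read` is injected so the example does not touch the file system.
function createPersonaLoader(read: (path: string) => unknown) {
  const cache = new Map<string, Persona>();
  return {
    load(path: string): Persona {
      const hit = cache.get(path);
      if (hit !== undefined) return hit; // same instance on repeat calls
      const persona = validatePersona(read(path));
      cache.set(path, persona);
      return persona;
    },
    clear(): void {
      cache.clear();
    },
  };
}
```

Because a failed validation throws before `cache.set`, an invalid file is never cached in this sketch, so a later call re-reads and re-validates it.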
75
tests/unit/redisErrorPath.test.ts
Normal file
@@ -0,0 +1,75 @@
import { vi, describe, it, expect, beforeEach } from 'vitest';

// ── capture the registered error listener so we can fire it ──────────────────
const { errorListeners } = vi.hoisted(() => ({
  errorListeners: [] as Array<(err: Error) => void>,
}));

vi.mock('../../src/config.js', () => ({
  config: { REDIS_URL: 'redis://localhost:6379' },
}));

vi.mock('ioredis', () => {
  return {
    Redis: vi.fn().mockImplementation(() => ({
      on: vi.fn((event: string, listener: (err: Error) => void) => {
        if (event === 'error') errorListeners.push(listener);
        return undefined;
      }),
    })),
  };
});

import { Redis } from 'ioredis';

const consoleErrorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});

beforeEach(() => {
  errorListeners.length = 0;
  consoleErrorSpy.mockClear();
  // Force a re-import of redis.ts to register a fresh error listener.
  vi.resetModules();
});

describe('db/redis.ts error handler', () => {
  it('registers an error listener on the Redis client at module load', async () => {
    await import('../../src/db/redis.js');

    expect(errorListeners).toHaveLength(1);
  });

  it('logs the error to console.error when the Redis client emits "error"', async () => {
    await import('../../src/db/redis.js');

    expect(errorListeners).toHaveLength(1);
    errorListeners[0](new Error('ECONNREFUSED 127.0.0.1:6379'));

    expect(consoleErrorSpy).toHaveBeenCalledTimes(1);
    expect(consoleErrorSpy).toHaveBeenCalledWith(
      '[redis] connection error',
      expect.objectContaining({ message: 'ECONNREFUSED 127.0.0.1:6379' }),
    );
  });

  it('does not throw or crash when the error has a non-standard shape', async () => {
    await import('../../src/db/redis.js');

    // Some ioredis errors come wrapped or with extra props. The handler just
    // forwards to console.error; it must not throw.
    expect(() => {
      const err = Object.assign(new Error('boom'), { code: 'ECONNRESET', syscall: 'connect' });
      errorListeners[0](err);
    }).not.toThrow();

    expect(consoleErrorSpy).toHaveBeenCalledWith('[redis] connection error', expect.anything());
  });

  it('constructs the Redis client with lazyConnect and maxRetriesPerRequest: 3', async () => {
    await import('../../src/db/redis.js');

    expect(Redis).toHaveBeenCalledWith('redis://localhost:6379', {
      lazyConnect: true,
      maxRetriesPerRequest: 3,
    });
  });
});
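The tests in redisErrorPath.test.ts pin down one behaviour: an `error` listener is registered on the client, and it forwards the error to a logger under the label `[redis] connection error` without rethrowing. The following is a minimal, dependency-free sketch of that pattern. `attachErrorLogger`, `ErrorSource`, and `FakeClient` are hypothetical names invented for illustration; this is not the contents of src/db/redis.ts, which the diff does not show.

```typescript
// Hypothetical sketch: every name here is illustrative, not from src/db/redis.ts.
type ErrorListener = (err: Error) => void;

// The only part of a client this pattern relies on.
interface ErrorSource {
  on(event: 'error', listener: ErrorListener): unknown;
}

// Register a listener that forwards each error to the logger and does not rethrow.
function attachErrorLogger(
  client: ErrorSource,
  logError: (label: string, err: Error) => void = console.error,
): void {
  client.on('error', (err) => {
    logError('[redis] connection error', err);
  });
}

// A fake client that records listeners, similar in spirit to the ioredis mock
// used by the test file.
class FakeClient implements ErrorSource {
  readonly listeners: ErrorListener[] = [];
  on(_event: 'error', listener: ErrorListener): this {
    this.listeners.push(listener);
    return this;
  }
}

// Usage: capture what the handler logs when the fake client "emits" an error.
const logged: Array<{ label: string; message: string }> = [];
const fake = new FakeClient();
attachErrorLogger(fake, (label, err) => {
  logged.push({ label, message: err.message });
});
fake.listeners.forEach((listener) => listener(new Error('ECONNREFUSED 127.0.0.1:6379')));
```

Injecting the logger as a parameter is a choice made for this sketch so it can be exercised without spying on `console.error`; the test file takes the spy route instead.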
162
tests/unit/xpAwarder.test.ts
Normal file
@@ -0,0 +1,162 @@
import { vi, describe, it, expect, beforeEach } from 'vitest';

const { mockGet: mockCharacterGet } = vi.hoisted(() => ({
  mockGet: vi.fn(),
}));

vi.mock('../../src/session/characterRegistry.js', () => ({
  characterRegistry: { get: mockCharacterGet },
}));

const { mockModifyExperience } = vi.hoisted(() => ({
  mockModifyExperience: vi.fn(),
}));

vi.mock('../../src/vtt/foundryClient.js', () => ({
  modifyExperience: mockModifyExperience,
}));

vi.mock('../../src/lib/logger.js', () => ({
  log: { info: vi.fn(), warn: vi.fn(), error: vi.fn(), debug: vi.fn() },
}));

import { awardXP } from '../../src/session/xpAwarder.js';
import { mockSession } from '../fixtures/spec.js';

function makeThread() {
  return { send: vi.fn().mockResolvedValue({ id: 'msg-1' }) };
}

const baseSession = {
  ...mockSession,
  players: {
    'user-1': { discordId: 'user-1', dndName: 'Aelindra' },
    'user-2': { discordId: 'user-2', dndName: 'Borgrim' },
    'user-3': { discordId: 'user-3', dndName: 'Cael' },
  },
};

beforeEach(() => {
  vi.clearAllMocks();
  mockModifyExperience.mockResolvedValue(undefined);
});

describe('awardXP', () => {
  it('awards XP to every player with a Foundry link and returns the awarded list', async () => {
    mockCharacterGet.mockImplementation(async (_g, discordId) => {
      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
      if (discordId === 'user-2') return { discordId, dndName: 'Borgrim', source: 'foundry', foundryActorUuid: 'Actor.2' };
      return null;
    });

    const result = await awardXP(baseSession, 100, makeThread() as any);

    expect(result.awarded).toEqual([
      { dndName: 'Aelindra', amount: 100 },
      { dndName: 'Borgrim', amount: 100 },
    ]);
    expect(result.skipped).toEqual([
      { dndName: 'Cael', discordId: 'user-3', reason: 'no Foundry character linked' },
    ]);
    expect(mockModifyExperience).toHaveBeenCalledTimes(2);
    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.1', 100);
    expect(mockModifyExperience).toHaveBeenCalledWith('Actor.2', 100);
  });

  it('posts a summary embed listing awarded and skipped players', async () => {
    mockCharacterGet.mockImplementation(async (_g, discordId) => {
      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
      return null;
    });

    const thread = makeThread();
    await awardXP(baseSession, 50, thread as any);

    expect(thread.send).toHaveBeenCalledTimes(1);
    const message = thread.send.mock.calls[0][0] as string;
    expect(message).toContain('+50 XP awarded');
    expect(message).toContain('✅ Aelindra');
    expect(message).toContain('⚠️');
    expect(message).toContain('Borgrim');
    expect(message).toContain('Cael');
  });

  it('returns empty results and posts no embed when there are no players', async () => {
    const session = { ...baseSession, players: {} };
    const thread = makeThread();

    const result = await awardXP(session, 100, thread as any);

    expect(result).toEqual({ awarded: [], skipped: [] });
    expect(thread.send).not.toHaveBeenCalled();
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('skips players whose profile has no foundryActorUuid (custom characters)', async () => {
    mockCharacterGet.mockResolvedValue({
      discordId: 'user-1', dndName: 'Aelindra', source: 'custom', /* no UUID */
    });

    const result = await awardXP(baseSession, 25, makeThread() as any);

    expect(result.awarded).toEqual([]);
    expect(result.skipped).toEqual([
      { dndName: 'Aelindra', discordId: 'user-1', reason: 'no Foundry character linked' },
      { dndName: 'Borgrim', discordId: 'user-2', reason: 'no Foundry character linked' },
      { dndName: 'Cael', discordId: 'user-3', reason: 'no Foundry character linked' },
    ]);
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('skips players with "registry error" reason when characterRegistry throws', async () => {
    mockCharacterGet.mockRejectedValue(new Error('redis down'));

    const result = await awardXP(baseSession, 50, makeThread() as any);

    expect(result.awarded).toEqual([]);
    expect(result.skipped).toEqual([
      { dndName: 'Aelindra', discordId: 'user-1', reason: 'registry error' },
      { dndName: 'Borgrim', discordId: 'user-2', reason: 'registry error' },
      { dndName: 'Cael', discordId: 'user-3', reason: 'registry error' },
    ]);
    expect(mockModifyExperience).not.toHaveBeenCalled();
  });

  it('skips with "Foundry relay error" when modifyExperience throws for a specific player', async () => {
    mockCharacterGet.mockImplementation(async (_g, discordId) => {
      if (discordId === 'user-1') return { discordId, dndName: 'Aelindra', source: 'foundry', foundryActorUuid: 'Actor.1' };
      if (discordId === 'user-2') return { discordId, dndName: 'Borgrim', source: 'foundry', foundryActorUuid: 'Actor.2' };
      return null;
    });
    mockModifyExperience.mockImplementation(async (uuid) => {
      if (uuid === 'Actor.2') throw new Error('relay down');
    });

    const result = await awardXP(baseSession, 100, makeThread() as any);

    expect(result.awarded).toEqual([{ dndName: 'Aelindra', amount: 100 }]);
    expect(result.skipped).toEqual(
      expect.arrayContaining([
        expect.objectContaining({ dndName: 'Borgrim', reason: 'Foundry relay error' }),
      ]),
    );
  });

  it('handles a mix of: no profile, no UUID, and one success', async () => {
    mockCharacterGet.mockImplementation(async (_g, discordId) => {
      if (discordId === 'user-1') return null; // no profile
      if (discordId === 'user-2') return { source: 'custom' }; // no UUID
      if (discordId === 'user-3') return { source: 'foundry', foundryActorUuid: 'Actor.3' }; // success
    });

    const result = await awardXP(baseSession, 25, makeThread() as any);

    expect(result.awarded).toEqual([{ dndName: 'Cael', amount: 25 }]);
    expect(result.skipped).toEqual(
      expect.arrayContaining([
        expect.objectContaining({ dndName: 'Aelindra', reason: 'no Foundry character linked' }),
        expect.objectContaining({ dndName: 'Borgrim', reason: 'no Foundry character linked' }),
      ]),
    );
  });
});
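Read together, the xpAwarder.test.ts cases describe a per-player decision: a registry failure yields `registry error`, a missing profile or missing `foundryActorUuid` yields `no Foundry character linked`, a failing `modifyExperience` call yields `Foundry relay error`, and everything else lands in `awarded`. The sketch below reconstructs only that control flow from the assertions. `awardXpSketch`, `Deps`, and the other type names are hypothetical; the real `awardXP` in src/session/xpAwarder.ts is not shown in this diff, takes a session and a thread, and also posts a summary message, all of which are omitted here.

```typescript
// Hypothetical reconstruction from the test assertions; not the real awardXP.
interface Player { discordId: string; dndName: string }
interface Profile { foundryActorUuid?: string }
interface Awarded { dndName: string; amount: number }
interface Skipped { dndName: string; discordId: string; reason: string }

// Injected stand-ins for characterRegistry.get and modifyExperience (assumed shapes).
interface Deps {
  getProfile(discordId: string): Promise<Profile | null | undefined>;
  modifyExperience(actorUuid: string, amount: number): Promise<void>;
}

async function awardXpSketch(
  players: Record<string, Player>,
  amount: number,
  deps: Deps,
): Promise<{ awarded: Awarded[]; skipped: Skipped[] }> {
  const awarded: Awarded[] = [];
  const skipped: Skipped[] = [];

  // Players are processed one at a time so results keep insertion order.
  for (const { discordId, dndName } of Object.values(players)) {
    let profile: Profile | null | undefined;
    try {
      profile = await deps.getProfile(discordId);
    } catch {
      skipped.push({ dndName, discordId, reason: 'registry error' });
      continue;
    }

    // No profile at all, or a profile without a Foundry actor link.
    if (!profile?.foundryActorUuid) {
      skipped.push({ dndName, discordId, reason: 'no Foundry character linked' });
      continue;
    }

    try {
      await deps.modifyExperience(profile.foundryActorUuid, amount);
      awarded.push({ dndName, amount });
    } catch {
      // One player's relay failure does not stop the remaining players.
      skipped.push({ dndName, discordId, reason: 'Foundry relay error' });
    }
  }

  return { awarded, skipped };
}
```

Passing the registry and Foundry calls in as `deps` is a simplification for this sketch so it runs without module mocking; the test file above instead replaces those modules with `vi.mock`.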