Mardonar Encounter Engine — Architecture

Single-part backend project. Discord-native, LLM-driven D&D encounter engine. Generated 2026-06-19 from a deep scan of /home/kaykayyali/hosting/mardonar-npcs.


Executive Summary

The Mardonar Encounter Engine is a Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. An LLM (Gemma 4 IT e2b via LiteLLM with Ollama fallback) narrates the scene, voices NPCs, drives skill checks, and steers the encounter toward hidden outcomes defined in a YAML spec. NPC memory, lore context, and encounter history are persisted in a graph database (Neo4j) accessed through a JSON-RPC MCP server (GraphMCP). Active session state lives in Redis with a TTL. The bot can also reach into Foundry VTT to resolve character stats and award XP via an external relay.

Key constraint: the harness controls everything the LLM sees. The 128k context window is partitioned into hard zones (system / pinned / sliding / safety) and the assembly pipeline is deterministic. Tool calls are extracted from fenced tool_call JSON blocks, not via native function calling — Gemma at e2b quantization isn't reliable for native tools.


1. Technology Stack

| Layer | Technology | Version | Notes |
|---|---|---|---|
| Runtime | Node.js | 22 (alpine) | ESM modules, NodeNext resolution |
| Language | TypeScript | 5.8 | strict mode, declaration + sourcemap output |
| Discord | discord.js | v14.18 | Slash commands + embeds + threads |
| LLM primary | LiteLLM proxy | (env: LITELLM_BASE_URL) | OpenAI-compatible |
| LLM fallback | Ollama | (env: OLLAMA_BASE_URL) | gemma4-it:e2b, 128k context |
| Session cache | Redis (ioredis) | 5.4 | TTL = SESSION_TTL_HOURS (default 12h) |
| Graph DB | Neo4j | 5 | via GraphMCP JSON-RPC, not direct |
| Lore / NPC memory | GraphMCP | HTTP JSON-RPC (env: GRAPHMCP_URL) | 6 RPC tools exposed |
| Foundry VTT | VTT relay | HTTPS (env: VTT_RELAY_URL) | Optional, requires API key |
| Validation | Zod | 3.24 | env + encounter spec |
| Logging | pino + pino-pretty | 9.6 / 13 | structured JSON in prod |
| Testing | Vitest | 3.1 | tests/unit + tests/integration |
| Build | tsc → dist/ | 5.8 | multi-stage Dockerfile |

Architecture pattern: layered backend with a plugin-style tool registry. Three layers: bot (Discord I/O), harness (LLM orchestration), session + db + graphmcp + vtt (data + integrations).


2. Source Tree

mardonar-bot/
├── src/
│   ├── bot/                          # Discord I/O layer
│   │   ├── index.ts                  # Entry: Client setup, event wiring
│   │   ├── commands/                 # 8 slash command modules
│   │   │   ├── dndname.ts            # /dndname set|show|clear
│   │   │   ├── encounter.ts          # /encounter start|status|end|generate|spec|random|stats|audit
│   │   │   ├── character.ts          # /character register|show|view|admin
│   │   │   ├── roll.ts               # /roll
│   │   │   ├── actions.ts            # /actions
│   │   │   ├── xp.ts                 # /xp award
│   │   │   ├── encounters.ts         # /encounters (list/search from GraphMCP)
│   │   │   └── turn.ts               # /turn
│   │   ├── embeds/                   # Discord embed builders
│   │   │   ├── playerGate.ts
│   │   │   ├── skillCheck.ts         # Suspense + dice + roll buttons
│   │   │   ├── resolution.ts
│   │   │   ├── encounterDiscovery.ts
│   │   │   └── loreAnswer.ts
│   │   ├── handlers/                 # Event handlers / sidecar logic
│   │   │   ├── messageRouter.ts      # Encounter-thread message pipeline (heart of runtime)
│   │   │   ├── mentionHandler.ts     # @Zalram persona replies
│   │   │   ├── rollHandler.ts        # Button / modal submit roll resolution
│   │   │   ├── generationQueue.ts    # Debounce + LLM turn scheduling
│   │   │   ├── queueCap.ts           # Burst cap → drop notice
│   │   │   ├── reactionManager.ts    # 👀 reaction lifecycle (scheduled/processing/complete)
│   │   │   └── responseFilter.ts     # Post-LLM response scrubbing
│   │   └── lib/welcomeDM.ts
│   ├── harness/                      # LLM orchestration
│   │   ├── promptBuilder.ts          # System prompt assembly (XML sections)
│   │   ├── contextAssembler.ts       # Pin/slide history + token budget trim
│   │   ├── llmClient.ts              # LiteLLM primary → Ollama fallback
│   │   ├── litellmClient.ts          # OpenAI-compatible HTTP client
│   │   ├── ollamaClient.ts           # Native ollama npm + direct HTTP
│   │   ├── toolParser.ts             # Extract ```tool_call``` blocks
│   │   ├── toolRegistry.ts           # Plugin registry + active-set filtering
│   │   ├── toolDispatcher.ts         # Per-encounter tool validation + dispatch
│   │   └── tools/                    # 6 tool plugins (see §5)
│   ├── session/                      # Redis-backed state
│   │   ├── playerRegistry.ts         # guildId+discordId → Player
│   │   ├── characterRegistry.ts      # Character profile + pronouns + Foundry UUID
│   │   ├── sessionManager.ts         # threadId → SessionState (pinned/sliding history)
│   │   ├── encounterLog.ts           # Filesystem tally + summary writer
│   │   └── xpAwarder.ts              # XP grant via VTT relay
│   ├── graphmcp/                     # GraphMCP JSON-RPC client
│   │   ├── client.ts                 # 6 RPC calls + NPC memory formatter
│   │   ├── ingest.ts                 # Publish to Redis stream (raw.messages)
│   │   ├── loreResolver.ts           # /encounter generate helper
│   │   └── vocabularyResolver.ts     # spec randomizable: vocabulary source
│   ├── vtt/                          # Foundry VTT integration
│   │   ├── foundryClient.ts          # HTTP client, formatters
│   │   └── relaySession.ts           # RSA-OAEP handshake + headless spin-up
│   ├── db/redis.ts                   # ioredis singleton (lazy connect)
│   ├── spec/loader.ts                # YAML loader + Zod schema
│   ├── persona/loader.ts             # persona.yaml loader for @mention
│   ├── lib/logger.ts                 # pino wrapper
│   ├── config.ts                     # Zod env schema + parsed config singleton
│   ├── scripts/deploy-commands.ts    # Slash command registration (REST v10)
│   └── types/index.ts                # Shared interfaces + CONTEXT_BUDGET const
├── specs/                            # 8 encounter YAML files
│   ├── SPEC_FORMAT.md
│   ├── market-thief.yaml
│   ├── cog-claw-debt.yaml
│   ├── mawfang-pursuit.yaml
│   ├── silt-leak.yaml
│   ├── stormscar-pilgrim.yaml
│   ├── velvet-auction.yaml
│   └── whispering-stone.yaml
├── data/                             # Runtime data (gitignored in practice)
│   ├── tally.json                    # Per-spec run counts
│   └── summaries/                    # One .txt per encounter
├── tests/
│   ├── unit/                         # 21 unit test files
│   └── integration/                  # 1 integration test
├── Docs/                             # Pre-existing project docs
│   ├── mardonar-encounter-engine.md  # ⚠ Out of date — describes Go architecture
│   ├── mardonar-build-plan.md
│   ├── epics.md
│   ├── stories/
│   └── ux-designs/
├── lore/                             # Game-world reference material
├── persona.yaml                      # Zalram Cloudwalker (bot's @mention persona)
├── prd.md                            # Active PRD: Dynamic Goal Registration
├── Dockerfile                        # Multi-stage node:22-alpine
├── docker-compose.dev.yml            # Local Redis + Neo4j
├── package.json
├── tsconfig.json
└── vitest.config.ts

3. Architecture Pattern

Layered backend with a plugin registry:

┌──────────────────────────────────────────────────────────────────┐
│  Discord (Gateway WebSocket)                                     │
└──────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/bot/                                                         │
│  ┌────────────────────┐  ┌────────────────┐  ┌──────────────┐  │
│  │ commands/          │  │ handlers/      │  │ embeds/      │  │
│  │ (slash cmd)        │  │ (event loops)  │  │ (UI shape)   │  │
│  └────────────────────┘  └────────────────┘  └──────────────┘  │
│         messageRouter is the runtime heart                       │
└──────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/harness/                                                     │
│  assembleContext → llmClient (LiteLLM → Ollama)                 │
│       ↓                                                            │
│  parseToolCall → dispatchTool → active tool plugins              │
└──────────────────────────────────────────────────────────────────┘
                            │                       │
                            ▼                       ▼
┌─────────────────────┐  ┌─────────────────┐  ┌──────────────────┐
│ src/session/        │  │ src/db/         │  │ src/graphmcp/    │
│  (Redis state)      │  │  (ioredis)      │  │  (JSON-RPC)      │
└─────────────────────┘  └─────────────────┘  └──────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/vtt/  →  External Foundry VTT relay                         │
│  src/persona/  →  persona.yaml for @mentions                     │
│  src/spec/  →  specs/*.yaml loaded per encounter                 │
└──────────────────────────────────────────────────────────────────┘

3.1 Message flow (encounter thread)

  1. Discord messageCreate → bot/index.ts → handleMessage in handlers/messageRouter.ts
  2. Channel guard: must be a thread whose parent is in DISCORD_ALLOWED_CHANNELS
  3. Player gate: if discordId not in playerRegistry, post ephemeral gate embed, hold message in SessionState.heldMessages, return
  4. Roll guard: if pendingSkillCheck is set, increment attempt counter; auto-fail after PENDING_ROLL_LIMIT (5) skipped messages
  5. Burst cap: queueCap rejects the message and sends a drop notice if too many messages have arrived since the last LLM response
  6. Append user message to history, fire 👀 reaction (fire-and-forget)
  7. Publish to GraphMCP via graphmcp/ingest.ts (Redis stream raw.messages)
  8. Debounced (500ms) → generationQueue.scheduleLLMTurn
  9. runLLMTurn:
    • assembleContext builds message list (system + pinned + trimmed sliding)
    • callLLM → LiteLLM with Ollama fallback
    • parseToolCall splits narrative from tool_call block
    • filterLLMResponse rejects fabricated rolls / echoed system tags → injects [FILTER CORRECTION] and retries once
    • Narrative posted to thread; assistant message appended to history
    • If tool call present → dispatchTool → plugin handler → system message appended
    • If result.resolved set → phase = 'resolved', archive thread after ENCOUNTER_ARCHIVE_DELAY_MS
  10. reactionManager upgrades 👀 state to complete and clears burst counter
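
The guard steps (2 to 5) can be read as a pure decision function that runs before any LLM work is scheduled. The sketch below is illustrative only: the `GuardInput` and `GuardDecision` shapes, the `decide` name, and the `BURST_CAP` value are assumptions, not taken from messageRouter.ts. It also assumes the auto-fail triggers on the fifth skipped message.

```typescript
// Hypothetical, flattened view of what the router knows about an incoming message.
type GuardInput = {
  isThread: boolean;
  parentChannelId: string;
  allowedChannels: string[];     // stands in for DISCORD_ALLOWED_CHANNELS
  authorRegistered: boolean;     // result of a playerRegistry lookup
  hasPendingSkillCheck: boolean;
  pendingAttempts: number;       // messages already sent while a roll was pending
  queuedSinceLastReply: number;  // burst counter
};

type GuardDecision =
  | { action: 'ignore' }                        // not an encounter thread
  | { action: 'gate' }                          // post gate embed, hold the message
  | { action: 'remind_roll'; attempts: number } // a roll is still pending
  | { action: 'auto_fail_roll' }
  | { action: 'drop' }                          // burst cap exceeded
  | { action: 'accept' };                       // append to history, schedule LLM turn

const PENDING_ROLL_LIMIT = 5; // value quoted in step 4
const BURST_CAP = 3;          // assumed value, not taken from the source

function decide(input: GuardInput): GuardDecision {
  if (!input.isThread || !input.allowedChannels.includes(input.parentChannelId)) {
    return { action: 'ignore' };
  }
  if (!input.authorRegistered) return { action: 'gate' };
  if (input.hasPendingSkillCheck) {
    const attempts = input.pendingAttempts + 1;
    return attempts >= PENDING_ROLL_LIMIT
      ? { action: 'auto_fail_roll' }
      : { action: 'remind_roll', attempts };
  }
  if (input.queuedSinceLastReply >= BURST_CAP) return { action: 'drop' };
  return { action: 'accept' };
}
```

Keeping the guards free of Discord and Redis calls makes the ordering easy to unit-test; the real handler interleaves these checks with I/O.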

3.2 Tool dispatch

The tool layer uses a plugin registry (harness/toolRegistry.ts) with per-encounter active-set filtering. Each ToolPlugin declares:

{
  name: string;
  description: string;
  args: Record<string, { type: 'string' | 'number' | 'boolean'; description: string }>;
  contextDocs?: (spec: EncounterSpec) => string;
  handler: (args, ctx: ToolContext) => Promise<DispatchResult>;
}

A spec's tools: [...] array declares which plugins are active for that encounter. Tools are loaded by side-effect from harness/tools/index.ts:

import './skillCheckEmit.js';
import './encounterResolve.js';
import './contextRecall.js';
import './goalRegister.js';
import './foundryLookup.js';
import './foundryReward.js';

The LLM emits a tool call by appending a fenced tool_call JSON block. Three parser patterns (in order): fenced ```tool_call block, bare tool_call header, then a fuzzy bare-JSON fallback. Unrecognized tools or malformed args are logged and ignored — the narrative is preserved.
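
A minimal sketch of the three-pattern extraction described above. The regular expressions, the `{ name, args }` JSON shape, and the helper names are assumptions for illustration; the real toolParser.ts may differ. When a pattern matches but the JSON does not parse, this sketch simply tries the next pattern and otherwise returns the text untouched.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ParsedTurn = { narrative: string; toolCall?: ToolCall };

// Built at runtime so this example can itself sit inside a fenced block.
const FENCE = '`'.repeat(3);

const PATTERNS: RegExp[] = [
  // 1. Fenced block: FENCE + "tool_call", a JSON object, closing FENCE.
  new RegExp(FENCE + 'tool_call\\s*(\\{[\\s\\S]*?\\})\\s*' + FENCE),
  // 2. Bare header: a "tool_call" line followed by a JSON object at the end.
  /(?:^|\n)tool_call:?\s*(\{[\s\S]*\})\s*$/,
  // 3. Fuzzy fallback: a trailing JSON object that contains a "name" key.
  /(\{[\s\S]*"name"[\s\S]*\})\s*$/,
];

// Assumed payload shape: { "name": string, "args"?: object }.
function isToolCall(v: unknown): v is { name: string; args?: Record<string, unknown> } {
  if (typeof v !== 'object' || v === null) return false;
  const o = v as Record<string, unknown>;
  const args = o['args'];
  return typeof o['name'] === 'string' &&
    (args === undefined || (typeof args === 'object' && args !== null));
}

function parseToolCall(raw: string): ParsedTurn {
  for (const pattern of PATTERNS) {
    const match = pattern.exec(raw);
    if (!match) continue;
    try {
      const parsed: unknown = JSON.parse(match[1] ?? '');
      if (isToolCall(parsed)) {
        // Remove the matched block; whatever is left is the narrative.
        const narrative =
          (raw.slice(0, match.index) + raw.slice(match.index + match[0].length)).trim();
        return { narrative, toolCall: { name: parsed.name, args: parsed.args ?? {} } };
      }
    } catch {
      // Malformed JSON: fall through to the next, looser pattern.
    }
  }
  return { narrative: raw.trim() };
}
```

Trying the strictest pattern first keeps the fuzzy fallback from capturing JSON that was already inside a well-formed fenced block.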

The system prompt section buildToolManifest(spec) injects only the active set's tool definitions into the prompt contract, so each encounter's LLM only sees tools it can use.
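
To make the registry and active-set idea concrete, here is a small sketch under stated assumptions: `registerTool`, `activeTools`, and `renderManifest` are hypothetical names, the handler field is omitted, and the manifest line format is invented for illustration. Only the plugin fields (name, description, args) come from the shape shown earlier.

```typescript
type ArgSpec = { type: 'string' | 'number' | 'boolean'; description: string };
type Plugin = { name: string; description: string; args: Record<string, ArgSpec> };

// Module-level registry; each tool file would call registerTool() when imported.
const registry = new Map<string, Plugin>();

function registerTool(plugin: Plugin): void {
  if (registry.has(plugin.name)) throw new Error(`duplicate tool: ${plugin.name}`);
  registry.set(plugin.name, plugin);
}

// Resolve a spec's tools array to registered plugins; unknown names are skipped.
function activeTools(specTools: string[]): Plugin[] {
  return specTools.flatMap((name) => {
    const plugin = registry.get(name);
    return plugin ? [plugin] : [];
  });
}

// Render one manifest line per active tool (the line format here is an assumption).
function renderManifest(specTools: string[]): string {
  return activeTools(specTools)
    .map((p) => {
      const args = Object.entries(p.args)
        .map(([key, spec]) => `${key}: ${spec.type}`)
        .join(', ');
      return `- ${p.name}(${args}): ${p.description}`;
    })
    .join('\n');
}
```

Because unknown names are skipped silently in this sketch, a misspelled tool in a spec produces an empty manifest instead of an error, which is the failure mode listed under Open Issues.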


4. Data Architecture

4.1 Redis (transient state)

| Key pattern | Value | TTL | Owner |
|---|---|---|---|
| session:{threadId} | JSON.stringify(SessionState) | SESSION_TTL_HOURS (12h) | sessionManager |
| guild_threads:{guildId} | Set of thread IDs | inherits | sessionManager |
| players:{guildId} (legacy design) | discordId → dndName | | playerRegistry (current impl uses different scheme) |
| raw.messages | Redis stream | | graphmcp/ingest.ts |
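
As a small illustration of the key scheme, the helpers below build the two sessionManager keys and convert the TTL to seconds. The function names are hypothetical; only the key patterns and the 12h default come from the table.

```typescript
const DEFAULT_SESSION_TTL_HOURS = 12; // default quoted in the table; the real value comes from env

function sessionKey(threadId: string): string {
  return `session:${threadId}`;
}

function guildThreadsKey(guildId: string): string {
  return `guild_threads:${guildId}`;
}

// Redis expirations are expressed in seconds.
function sessionTtlSeconds(hours: number = DEFAULT_SESSION_TTL_HOURS): number {
  return Math.round(hours * 3600);
}

// With ioredis, a write could then look roughly like this (not executed here):
//   await redis.set(sessionKey(threadId), JSON.stringify(state), 'EX', sessionTtlSeconds());
```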

SessionState (src/types/index.ts) is the central shape:

{
  encounterId, threadId, guildId,
  spec: EncounterSpec,
  players: Record<discordId, Player>,
  history: ChatMessage[],         // mix of pinned + sliding
  phase: 'open' | 'active' | 'resolved',
  heldMessages: HeldMessage[],    // for unregistered players
  outcome?, outcomeSummary?,
  npcMemories?: Record<npcId, string>,
  resolvedContext?: Record<key, string>,
  pendingSkillCheck?: { player, prompt, dc, messageId, modifier?, skill?, advantage?, disadvantage? },
  pendingSkillCheckAttempts?: number,
  createdAt, updatedAt,
}

4.2 Filesystem (data/)

  • tally.json: { [specName]: { runs, lastRun } }. Incremented at each encounter start.
  • summaries/{encounterId}-{ISO timestamp}.txt — one per resolved encounter, written by encounterLog.writeSummary().
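
The two filesystem artifacts can be sketched as pure helpers. This is a hedged illustration: `bumpTally` and `summaryFileName` are hypothetical names, and storing `lastRun` as an ISO string is an assumption; the actual encounterLog.ts also performs the file I/O.

```typescript
type Tally = Record<string, { runs: number; lastRun: string }>;

// Return a new tally with the run count for specName incremented.
function bumpTally(tally: Tally, specName: string, now: Date): Tally {
  const previous = tally[specName];
  return {
    ...tally,
    [specName]: { runs: (previous?.runs ?? 0) + 1, lastRun: now.toISOString() },
  };
}

// Build the summary file name in the {encounterId}-{ISO timestamp}.txt form.
function summaryFileName(encounterId: string, now: Date): string {
  return `${encounterId}-${now.toISOString()}.txt`;
}
```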

4.3 GraphMCP / Neo4j (via JSON-RPC)

The bot never queries Neo4j directly. All graph access goes through GRAPHMCP_URL/mcp with JSON-RPC 2.0:

| Tool | Args | Returns |
|---|---|---|
| query_as_npc | npc_name, question, limit | NPCQueryResult (chunks + graph_context) |
| semantic_search | query, limit | SemanticSearchResult |
| log_encounter | title, participants, summary, location?, type? | LogEncounterResult |
| list_encounters | limit | EncounterResultItem[] |
| search_encounters | query?, location?, participant?, limit? | EncounterResultItem[] |
| get_encounter | id | EncounterDetails |
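
For orientation, a JSON-RPC 2.0 request for one of these tools might be built as below. This is a sketch that assumes GraphMCP follows the standard MCP convention of a `tools/call` method with `name` and `arguments` params; the source does not state the method name, and `buildToolCallRequest` is a hypothetical helper.

```typescript
type JsonRpcRequest = {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
};

let nextRequestId = 1;

// Assumption: tools are invoked through the MCP "tools/call" method.
function buildToolCallRequest(tool: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name: tool, arguments: args },
  };
}

// Example payload for query_as_npc; the NPC name and question are made-up values.
const exampleRequest = buildToolCallRequest('query_as_npc', {
  npc_name: 'Zalram',
  question: 'Who owes the Cog Claw a debt?',
  limit: 3,
});
```

The body would then be sent as an HTTP POST to GRAPHMCP_URL/mcp with a JSON content type.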

NPC memory is injected into the system prompt via formatNPCMemory() — past encounters witnessed + top-3 lore chunks above GRAPHMCP_SCORE_THRESHOLD.
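
The lore-chunk selection can be sketched as a filter, sort, and slice. The `Chunk` shape and the `topLoreChunks` name are assumptions, and "above" is read here as strictly greater than the threshold.

```typescript
type Chunk = { text: string; score: number };

// Keep chunks scoring strictly above the threshold, best first, capped at `limit`.
function topLoreChunks(chunks: Chunk[], threshold: number, limit: number = 3): Chunk[] {
  return chunks
    .filter((chunk) => chunk.score > threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```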

4.4 Context window budget

src/types/index.ts exports a CONTEXT_BUDGET constant used by both contextAssembler and sessionManager:

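| Zone | Tokens |
|---|---|
| System prompt (narrator + NPCs + tools + goals) | 4,000 |
| Pinned (opening narrative, goal block) | 2,000 |
| Sliding history | 118,000 |
| Safety buffer | 3,500 |
| Total | 128,000 |

Note: the four zones as listed sum to 127,500, so 500 tokens of the 128,000 total are not assigned to any zone in this table.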

History trimming drops the oldest non-pinned turn pair when over budget, with a hard floor of 6 messages. Token estimates use gpt-tokenizer with a 1.15× buffer to approximate Gemma's tokenizer.
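
The trimming rule can be sketched as follows. This is an illustration under assumptions: the `Msg` shape and `pinned` flag are hypothetical, only non-pinned messages are counted against the sliding budget, and a rough four-characters-per-token estimate stands in for gpt-tokenizer so the example has no dependencies.

```typescript
type Msg = { role: 'system' | 'user' | 'assistant'; content: string; pinned?: boolean };

const SLIDING_BUDGET = 118_000; // tokens, from the table above
const MIN_MESSAGES = 6;         // hard floor on history length
const TOKEN_BUFFER = 1.15;      // safety multiplier quoted above

// Stand-in estimator (about 4 characters per token); the source uses gpt-tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil((text.length / 4) * TOKEN_BUFFER);
}

function trimHistory(history: Msg[], budget: number = SLIDING_BUDGET): Msg[] {
  const out = [...history];
  const slidingTokens = (): number =>
    out.reduce((sum, m) => (m.pinned ? sum : sum + estimateTokens(m.content)), 0);

  while (slidingTokens() > budget && out.length > MIN_MESSAGES) {
    const oldest = out.findIndex((m) => !m.pinned);
    if (oldest === -1) break; // only pinned messages remain
    // Drop the oldest non-pinned message plus its partner when the partner is also non-pinned.
    const hasPartner = oldest + 1 < out.length && out[oldest + 1]?.pinned !== true;
    // Never remove more than the floor allows.
    const count = Math.min(hasPartner ? 2 : 1, out.length - MIN_MESSAGES);
    out.splice(oldest, count);
  }
  return out;
}
```

The function returns a new array and stops at the floor even if the budget is still exceeded, leaving the safety buffer to absorb the difference.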


5. API Surface

The project exposes two surfaces: Discord slash commands for players and admins, and tool plugins for the LLM.

5.1 Discord slash commands (player/admin surface)

Registered via src/scripts/deploy-commands.ts using Discord REST v10.

| Command | Subcommands | Purpose |
|---|---|---|
| /dndname | set <name>, show, clear | Character name registration |
| /character | register foundry\|custom, show, view, clear, admin list\|remove\|give | Full character profile + Foundry link |
| /encounter | start <spec>, random, status, stats, audit, end [notes], list, generate <theme>, spec | Encounter session lifecycle |
| /encounters | (Select menu + search modal) | Search the encounter log via GraphMCP |
| /roll | action | Manual dice roll |
| /actions | | In-character action shortcuts |
| /turn | | Turn management |
| /xp | award <amount> | Award XP (relay → VTT) |

Plus button + modal interactions: skill-check roll buttons, give item, custom character registration, Foundry link, encounter select menu, search modal.

5.2 Tool plugins (LLM surface)

Defined in src/harness/tools/ and registered at module load. Each spec filters the active set via its tools: array.

| Tool | Purpose | Args |
|---|---|---|
| skill_check_emit | Posts a dice-roll embed to the thread; blocks player input until resolved | player, prompt, skill?, dc, advantage?, disadvantage? |
| encounter_resolve | Marks encounter complete; writes summary; archives thread | (args handled in tools/encounterResolve.ts) |
| context_recall | Look up canonical session facts stored in resolvedContext | |
| goal_register | Add a new goal mid-encounter (the prd.md "dynamic goal registration" feature) | |
| foundry_lookup | Pull live character data from VTT relay | |
| foundry_reward | Award XP/items to a character via VTT | |

⚠ Note: Docs/mardonar-encounter-engine.md lists skill_check_resolve, event_log_append, npc_memory_read, and npc_memory_write as tools. These have been removed and replaced by the per-encounter event log plus the GraphMCP log_encounter tool. The current tool set is the one above.


6. Deployment Architecture

6.1 Local development

docker compose -f docker-compose.dev.yml up -d   # Redis + Neo4j
npm install
npm run deploy-commands                          # registers slash commands with Discord
npm run dev                                      # tsx watch mode

6.2 Production (multi-stage Dockerfile)

Dockerfile (Node 22 alpine):

  1. Builder stage: npm ci --ignore-scripts, copy src + tsconfig.json, npm run build → dist/
  2. Runtime stage: npm ci --omit=dev --ignore-scripts, copy dist/, specs/, lore/, persona.yaml
  3. CMD ["node", "dist/bot/index.js"]

docker-compose.dev.yml defines two services: deploy-commands (one-shot) and bot (long-running, with data/ mounted as a volume). Both attach to the external Docker network mardonar-internal, which also hosts Redis and an MCP server from the GraphMCP-Example stack.

Gap: There is no production docker-compose.yml. The .env.example is the source of truth for runtime config.

6.3 Operational

  • Session state has a 12h TTL by default — stale encounters auto-expire
  • Bot connects to Redis on main() startup (redis.connect())
  • VTT relay auto-spins up a headless Foundry session on connection failure (RSA-OAEP encrypted handshake)
  • LOG_LEVEL=info in prod; pino writes structured JSON

7. Development & Testing

7.1 Local commands

| Command | Effect |
|---|---|
| npm run dev | tsx watch src/bot/index.ts (auto-reload dev) |
| npm run build | tsc → dist/ |
| npm run start | node dist/bot/index.js |
| npm run deploy-commands | One-shot slash command registration |
| npm run test | All tests (vitest) |
| npm run test:unit | Unit tests only (no external services) |
| npm run test:int | Integration tests (requires Docker services) |

7.2 Test coverage

  • 21 unit test files in tests/unit/
  • 1 integration test (tests/integration/phase1.test.ts)
  • tests/fixtures/spec.ts — shared encounter spec fixture

Notable test surfaces: promptBuilder, contextAssembler, toolParser, toolDispatcher, sessionManager, playerRegistry, characterRegistry, specLoader, rollHandler, rollDetection, responseFilter, queueCap, generationQueue, reactionManager, encounterLog, encounterDiscoveryEmbed, loreAnswerEmbed, skillCheckEmbed, graphmcpClient, foundryClientRetry, foundryClientFormatters, goalRegister, relaySession.


8. Design Decisions (Living)

| Decision | Why |
|---|---|
| LiteLLM as primary, Ollama as fallback | OpenAI-compatible proxy gives model flexibility without code changes; Ollama fallback ensures the bot still runs when the proxy is down |
| Prompt-based tool calls (not native) | Gemma 4 IT at e2b is unreliable with native function calling; fenced JSON block parsing is deterministic |
| Tool plugin registry with per-spec active set | New tools can be added without touching the dispatch core; specs opt into only the tools they need |
| Pinned + sliding history | Opening narrative and goal block must survive trimming or the LLM loses its anchor |
| Goals in system prompt, not as a tool | Goals rarely change mid-encounter; embedding them reduces tool round-trips |
| Redis for active state, GraphMCP for memory | Redis is fast and ephemeral for live sessions; the graph holds long-term NPC lore |
| Player name gate via embed, not DMs | Keeps the conversation in-thread; ephemeral embed auto-deletes after 30s |
| Story generator via /encounter generate | Separates creative authoring from real-time inference; generator can use a stronger model later |
| VTT relay auto-spin-up | Lets the bot operate when the relay has been cold-stopped; uses RSA-OAEP for password handoff |
| In-world voice rule for player-facing strings | See feedback-in-world-voice: no utility/jargon in bot messages |
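
The primary-then-fallback arrangement in the first row can be sketched as a small wrapper. `ChatFn` and `withFallback` are hypothetical names, and the real llmClient.ts may add timeouts, retries, or logging that this sketch leaves out.

```typescript
type ChatFn = (prompt: string) => Promise<string>;

// Try the primary client; on any thrown error, report it and use the fallback instead.
function withFallback(
  primary: ChatFn,
  fallback: ChatFn,
  onFallback?: (err: unknown) => void,
): ChatFn {
  return async (prompt: string): Promise<string> => {
    try {
      return await primary(prompt);
    } catch (err) {
      onFallback?.(err);
      return fallback(prompt);
    }
  };
}
```

A caller would pass the LiteLLM client as `primary` and the Ollama client as `fallback`. Errors from the fallback itself propagate to the caller unchanged.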

9. Open Issues / Drift

Items the deep scan surfaced that aren't bugs but should be tracked:

  • Drift: Docs/mardonar-encounter-engine.md describes a Go bot with an embedded MCP layer; the actual code is TypeScript with an external JSON-RPC GraphMCP server. Treat the doc as historical/aspirational.
  • Drift: README.md's "Project Structure" tree references src/mcp/ and the old src/bot/commands/{dndname,encounter}.ts layout. Update README, or trim it to a pointer to the index.
  • Duplicate trimHistory logic in src/session/sessionManager.ts and src/harness/contextAssembler.ts (identical body). Could be extracted to src/lib/historyTrim.ts.
  • No production compose file — only docker-compose.dev.yml. The Dockerfile is production-ready but deployment is ad-hoc.
  • No CI/CD: .github/workflows/ does not exist.
  • DISCORD_ALLOWED_USERS is empty by default → anyone in allowed channels can run /encounter start. The access control is channel-scoped, not user-scoped; admins need to set the env var explicitly.
  • OLLAMA_BASE_URL defaults to localhost — fine for dev, but production needs the LAN IP or proxy URL set.
  • Spec tool list must be kept in sync: specs/*.yaml declare tools: [...], but no test verifies every referenced tool is registered. A stale spec name silently filters to no active tools.
  • Schema mismatch risk: types/index.ts EncounterSpec and spec/loader.ts Zod schema have diverged slightly — EncounterSpec is missing tone, tools, randomizable, and npcs.nameKey. assembleContext reads spec.tone; loader doesn't validate it. Consider regenerating types/index.ts from the Zod schema via z.infer.
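
One way to close the spec/tool sync gap is a check that compares each spec's tools array against the registered names. The helper below is a hypothetical sketch, not existing project code. It takes plain data so it can run inside a unit test after the specs are loaded and the tool modules imported.

```typescript
// Map each spec name to the tool names it references that are not registered.
function findUnregisteredTools(
  specs: Record<string, string[]>,
  registered: ReadonlySet<string>,
): Record<string, string[]> {
  const missing: Record<string, string[]> = {};
  for (const [specName, tools] of Object.entries(specs)) {
    const unknown = tools.filter((tool) => !registered.has(tool));
    if (unknown.length > 0) missing[specName] = unknown;
  }
  return missing;
}
```

A test could then assert that the returned object has no keys, which turns a silently empty tool set into a failing check.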

Document generated by bmad-document-project initial scan, deep level. Project state recorded in docs/project-scan-report.json.