zalbot/docs/architecture.md
Kaysser Kayyali fbd991a2b0
feat: docs pass, test fixes, advanced review
2026-06-19 16:15:06 +00:00


Mardonar Encounter Engine — Architecture

Single-part backend project. Discord-native, LLM-driven D&D encounter engine. Generated 2026-06-19 from a deep scan of /home/kaykayyali/hosting/mardonar-npcs.


Executive Summary

The Mardonar Encounter Engine is a Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. An LLM (Gemma 4 IT e2b via LiteLLM with Ollama fallback) narrates the scene, voices NPCs, drives skill checks, and steers the encounter toward hidden outcomes defined in a YAML spec. NPC memory, lore context, and encounter history are persisted in a graph database (Neo4j) accessed through a JSON-RPC MCP server (GraphMCP). Active session state lives in Redis with a TTL. The bot can also reach into Foundry VTT to resolve character stats and award XP via an external relay.

Key constraint: the harness controls everything the LLM sees. The 128k context window is partitioned into hard zones (system / pinned / sliding / safety) and the assembly pipeline is deterministic. Tool calls are extracted from fenced tool_call JSON blocks, not via native function calling — Gemma at e2b quantization isn't reliable for native tools.


1. Technology Stack

| Layer | Technology | Version | Notes |
|---|---|---|---|
| Runtime | Node.js | 22 (alpine) | ESM modules, NodeNext resolution |
| Language | TypeScript | 5.8 | strict mode, declaration + sourcemap output |
| Discord | discord.js | v14.18 | Slash commands + embeds + threads |
| LLM primary | LiteLLM | proxy (env: LITELLM_BASE_URL) | OpenAI-compatible |
| LLM fallback | Ollama | env: OLLAMA_BASE_URL | gemma4-it:e2b, 128k context |
| Session cache | Redis (ioredis) | 5.4 | TTL = SESSION_TTL_HOURS (default 12h) |
| Graph DB | Neo4j | 5 | via GraphMCP JSON-RPC, not direct |
| Lore / NPC memory | GraphMCP | HTTP JSON-RPC (env: GRAPHMCP_URL) | 6 RPC tools exposed |
| Foundry VTT | VTT relay | HTTPS (env: VTT_RELAY_URL) | Optional, requires API key |
| Validation | Zod | 3.24 | env + encounter spec |
| Logging | custom (src/lib/logger.ts) | | plaintext stdout; no env-driven level filter |
| Testing | Vitest | 3.1 | tests/unit + tests/integration |
| Build | tsc → dist/ | 5.8 | multi-stage Dockerfile |

Architecture pattern: layered backend with a plugin-style tool registry. Three layers: bot (Discord I/O), harness (LLM orchestration), session + db + graphmcp + vtt (data + integrations).


2. Source Tree

mardonar-bot/
├── src/
│   ├── bot/                          # Discord I/O layer
│   │   ├── index.ts                  # Entry: Client setup, event wiring
│   │   ├── commands/                 # 8 slash command modules
│   │   │   ├── dndname.ts            # /dndname set|show|clear
│   │   │   ├── encounter.ts          # /encounter start|status|end|generate|spec|random|stats|audit
│   │   │   ├── character.ts          # /character register|show|view|admin
│   │   │   ├── roll.ts               # /roll
│   │   │   ├── actions.ts            # /actions
│   │   │   ├── xp.ts                 # /xp award
│   │   │   ├── encounters.ts         # /encounters (list/search from GraphMCP)
│   │   │   └── turn.ts               # /turn
│   │   ├── embeds/                   # Discord embed builders
│   │   │   ├── playerGate.ts
│   │   │   ├── skillCheck.ts         # Suspense + dice + roll buttons
│   │   │   ├── resolution.ts
│   │   │   ├── encounterDiscovery.ts
│   │   │   └── loreAnswer.ts
│   │   ├── handlers/                 # Event handlers / sidecar logic
│   │   │   ├── messageRouter.ts      # Encounter-thread message pipeline (heart of runtime)
│   │   │   ├── mentionHandler.ts     # @Zalram persona replies
│   │   │   ├── rollHandler.ts        # Button / modal submit roll resolution
│   │   │   ├── generationQueue.ts    # Debounce + LLM turn scheduling
│   │   │   ├── queueCap.ts           # Burst cap → drop notice
│   │   │   ├── reactionManager.ts    # 👀 reaction lifecycle (scheduled/processing/complete)
│   │   │   └── responseFilter.ts     # Post-LLM response scrubbing
│   │   └── lib/welcomeDM.ts
│   ├── harness/                      # LLM orchestration
│   │   ├── promptBuilder.ts          # System prompt assembly (XML sections)
│   │   ├── contextAssembler.ts       # Pin/slide history + token budget trim
│   │   ├── llmClient.ts              # LiteLLM primary → Ollama fallback
│   │   ├── litellmClient.ts          # OpenAI-compatible HTTP client
│   │   ├── ollamaClient.ts           # Native ollama npm + direct HTTP
│   │   ├── toolParser.ts             # Extract ```tool_call``` blocks
│   │   ├── toolRegistry.ts           # Plugin registry + active-set filtering
│   │   ├── toolDispatcher.ts         # Per-encounter tool validation + dispatch
│   │   └── tools/                    # 6 tool plugins (see §5)
│   ├── session/                      # Redis-backed state
│   │   ├── playerRegistry.ts         # guildId+discordId → Player
│   │   ├── characterRegistry.ts      # Character profile + pronouns + Foundry UUID
│   │   ├── sessionManager.ts         # threadId → SessionState (pinned/sliding history)
│   │   ├── encounterLog.ts           # Filesystem tally + summary writer
│   │   └── xpAwarder.ts              # XP grant via VTT relay
│   ├── graphmcp/                     # GraphMCP JSON-RPC client
│   │   ├── client.ts                 # 6 RPC calls + NPC memory formatter
│   │   ├── ingest.ts                 # Publish to Redis stream (raw.messages)
│   │   ├── loreResolver.ts           # /encounter generate helper
│   │   └── vocabularyResolver.ts     # spec randomizable: vocabulary source
│   ├── vtt/                          # Foundry VTT integration
│   │   ├── foundryClient.ts          # HTTP client, formatters
│   │   └── relaySession.ts           # RSA-OAEP handshake + headless spin-up
│   ├── db/redis.ts                   # ioredis singleton (lazy connect)
│   ├── spec/loader.ts                # YAML loader + Zod schema
│   ├── persona/loader.ts             # persona.yaml loader for @mention
│   ├── lib/logger.ts                 # custom tag+message logger (plaintext stdout)
│   ├── config.ts                     # Zod env schema + parsed config singleton
│   ├── scripts/deploy-commands.ts    # Slash command registration (REST v10)
│   └── types/index.ts                # Shared interfaces + CONTEXT_BUDGET const
├── specs/                            # 8 encounter YAML files
│   ├── SPEC_FORMAT.md
│   ├── market-thief.yaml
│   ├── cog-claw-debt.yaml
│   ├── mawfang-pursuit.yaml
│   ├── silt-leak.yaml
│   ├── stormscar-pilgrim.yaml
│   ├── velvet-auction.yaml
│   └── whispering-stone.yaml
├── data/                             # Runtime data (gitignored in practice)
│   ├── tally.json                    # Per-spec run counts
│   └── summaries/                    # One .txt per encounter
├── tests/
│   ├── unit/                         # 33 unit test files
│   └── integration/                  # 1 integration test
├── Docs/                             # Pre-existing project docs
│   ├── mardonar-encounter-engine.md  # ⚠ Out of date — describes Go architecture
│   ├── mardonar-build-plan.md
│   ├── epics.md
│   ├── stories/
│   └── ux-designs/
├── lore/                             # Game-world reference material
├── persona.yaml                      # Zalram Cloudwalker (bot's @mention persona)
├── prd.md                            # Active PRD: Dynamic Goal Registration
├── Dockerfile                        # Multi-stage node:22-alpine
├── docker-compose.dev.yml            # Builds the bot image; expects Redis + GraphMCP on the external `mardonar-internal` network
├── package.json
├── tsconfig.json
└── vitest.config.ts

3. Architecture Pattern

Layered backend with a plugin registry:

┌──────────────────────────────────────────────────────────────────┐
│  Discord (Gateway WebSocket)                                     │
└──────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/bot/                                                         │
│  ┌────────────────────┐  ┌────────────────┐  ┌──────────────┐  │
│  │ commands/          │  │ handlers/      │  │ embeds/      │  │
│  │ (slash cmd)        │  │ (event loops)  │  │ (UI shape)   │  │
│  └────────────────────┘  └────────────────┘  └──────────────┘  │
│         messageRouter is the runtime heart                       │
└──────────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/harness/                                                     │
│  assembleContext → llmClient (LiteLLM → Ollama)                 │
│       ↓                                                            │
│  parseToolCall → dispatchTool → active tool plugins              │
└──────────────────────────────────────────────────────────────────┘
                            │                       │
                            ▼                       ▼
┌─────────────────────┐  ┌─────────────────┐  ┌──────────────────┐
│ src/session/        │  │ src/db/         │  │ src/graphmcp/    │
│  (Redis state)      │  │  (ioredis)      │  │  (JSON-RPC)      │
└─────────────────────┘  └─────────────────┘  └──────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────────┐
│  src/vtt/  →  External Foundry VTT relay                         │
│  src/persona/  →  persona.yaml for @mentions                     │
│  src/spec/  →  specs/*.yaml loaded per encounter                 │
└──────────────────────────────────────────────────────────────────┘

3.1 Message flow (encounter thread)

  1. Discord messageCreate → bot/index.ts → handleMessage in handlers/messageRouter.ts
  2. Channel guard: must be a thread whose parent is in DISCORD_ALLOWED_CHANNELS
  3. Player gate: if discordId not in playerRegistry, post ephemeral gate embed, hold message in SessionState.heldMessages, return
  4. Roll guard: if pendingSkillCheck is set, increment attempt counter; auto-fail after PENDING_ROLL_LIMIT (5) skipped messages
  5. Burst cap: queueCap rejects + sends drop notice if too many messages arrived before last LLM response
  6. Append user message to history, fire 👀 reaction (fire-and-forget)
  7. Publish to GraphMCP via graphmcp/ingest.ts (Redis stream raw.messages)
  8. Debounced (500ms) → generationQueue.scheduleLLMTurn
  9. runLLMTurn:
    • assembleContext builds message list (system + pinned + trimmed sliding)
    • callLLM → LiteLLM with Ollama fallback
    • parseToolCall splits narrative from tool_call block
    • filterLLMResponse rejects fabricated rolls / echoed system tags → injects [FILTER CORRECTION] and retries once
    • Narrative posted to thread; assistant message appended to history
    • If tool call present → dispatchTool → plugin handler → system message appended
    • If result.resolved set → phase = 'resolved', archive thread after ENCOUNTER_ARCHIVE_DELAY_MS
  10. reactionManager upgrades 👀 state to complete and clears burst counter
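
The guard steps above (2 to 5) can be read as a short decision chain. The sketch below is illustrative only: `routeMessage`, `GuardState`, the `Decision` labels, and the `burstLimit` parameter are hypothetical names. It assumes that a message sent while a roll is pending is skipped, and that the fifth skipped message triggers the auto-fail. The real logic lives in handlers/messageRouter.ts and may differ.

```typescript
// Hypothetical sketch of guard steps 2-5; not the project's actual code.
const PENDING_ROLL_LIMIT = 5;

type Decision =
  | "ignore"            // not an allowed encounter thread
  | "gate"              // unregistered player: show gate embed, hold message
  | "skip_pending_roll" // a skill check is pending: count the skipped message
  | "auto_fail_roll"    // too many skipped messages: fail the pending check
  | "drop_burst"        // burst cap reached: send drop notice
  | "accept";           // append to history and schedule an LLM turn

interface IncomingMessage {
  parentChannelId: string | null;
  discordId: string;
}

interface GuardState {
  players: Record<string, unknown>;
  pendingSkillCheck: boolean;
  pendingSkillCheckAttempts: number;
  queuedSinceLastReply: number;
}

function routeMessage(
  msg: IncomingMessage,
  state: GuardState,
  allowedChannels: Set<string>,
  burstLimit: number, // assumed knob; the real cap is defined by queueCap
): Decision {
  // Step 2: channel guard.
  if (msg.parentChannelId === null || !allowedChannels.has(msg.parentChannelId)) {
    return "ignore";
  }
  // Step 3: player gate.
  if (!(msg.discordId in state.players)) return "gate";
  // Step 4: roll guard.
  if (state.pendingSkillCheck) {
    state.pendingSkillCheckAttempts += 1;
    if (state.pendingSkillCheckAttempts >= PENDING_ROLL_LIMIT) {
      state.pendingSkillCheck = false;
      state.pendingSkillCheckAttempts = 0;
      return "auto_fail_roll";
    }
    return "skip_pending_roll";
  }
  // Step 5: burst cap.
  if (state.queuedSinceLastReply >= burstLimit) return "drop_burst";
  state.queuedSinceLastReply += 1;
  return "accept";
}

const exampleState: GuardState = {
  players: { "user-1": {} },
  pendingSkillCheck: true,
  pendingSkillCheckAttempts: 0,
  queuedSinceLastReply: 0,
};
const exampleDecision = routeMessage(
  { parentChannelId: "chan-1", discordId: "user-1" },
  exampleState,
  new Set(["chan-1"]),
  3,
);
```

Keeping the chain as one ordered function makes the precedence explicit: a pending roll is checked before the burst cap, so skipped messages never consume burst quota in this sketch.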

3.2 Tool dispatch

The tool layer uses a plugin registry (harness/toolRegistry.ts) with per-encounter active-set filtering. Each ToolPlugin declares:

{
  name: string;
  description: string;
  args: Record<string, { type: 'string' | 'number' | 'boolean'; description: string }>;
  contextDocs?: (spec: EncounterSpec) => string;
  handler: (args, ctx: ToolContext) => Promise<DispatchResult>;
}

A spec's tools: [...] array declares which plugins are active for that encounter. Tools are loaded by side-effect from harness/tools/index.ts:

import './skillCheckEmit.js';
import './encounterResolve.js';
import './contextRecall.js';
import './goalRegister.js';
import './foundryLookup.js';
import './foundryReward.js';

The LLM emits a tool call by appending a fenced tool_call JSON block. Three parser patterns (in order): fenced ```tool_call block, bare tool_call header, then a fuzzy bare-JSON fallback. Unrecognized tools or malformed args are logged and ignored — the narrative is preserved.
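
A minimal sketch of the three-pattern fallback described above. The payload shape `{ "tool": ..., "args": ... }`, the regular expressions, and the `ParsedResponse` type are assumptions made for illustration; the actual patterns in harness/toolParser.ts are not shown in this document.

```typescript
// Hypothetical sketch; the JSON payload shape is assumed, not taken from the source.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface ParsedResponse {
  narrative: string;
  toolCall: ToolCall | null;
}

// Built with repeat() so this example contains no literal code fence.
const FENCE = "`".repeat(3);

// Tried in order: fenced block, bare "tool_call" header, trailing bare JSON.
const PATTERNS: RegExp[] = [
  new RegExp(FENCE + "tool_call\\s*\\n([\\s\\S]*?)" + FENCE),
  /^tool_call\s*\n(\{[\s\S]*\})\s*$/m,
  /(\{[\s\S]*"tool"\s*:[\s\S]*\})\s*$/,
];

function tryParseToolCall(json: string): ToolCall | null {
  try {
    const value: unknown = JSON.parse(json);
    if (typeof value !== "object" || value === null) return null;
    const candidate = value as { tool?: unknown; args?: unknown };
    if (typeof candidate.tool !== "string") return null;
    const args =
      typeof candidate.args === "object" && candidate.args !== null
        ? (candidate.args as Record<string, unknown>)
        : {};
    return { tool: candidate.tool, args };
  } catch {
    return null; // malformed JSON: caller keeps the narrative untouched
  }
}

function parseToolCall(text: string): ParsedResponse {
  for (const pattern of PATTERNS) {
    const match = pattern.exec(text);
    if (!match) continue;
    const toolCall = tryParseToolCall((match[1] ?? "").trim());
    if (!toolCall) continue;
    // Remove the matched block so only the narrative is posted to the thread.
    const narrative = (
      text.slice(0, match.index) + text.slice(match.index + match[0].length)
    ).trim();
    return { narrative, toolCall };
  }
  return { narrative: text.trim(), toolCall: null };
}

const exampleParsed = parseToolCall(
  "The guard squints.\n" +
    FENCE + "tool_call\n" +
    '{"tool":"skill_check_emit","args":{"dc":12}}\n' +
    FENCE,
);
```

In this sketch a block that matches a pattern but fails JSON parsing falls through to the next pattern and finally to "no tool call", which mirrors the stated behavior that malformed calls are ignored and the narrative is preserved.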

The system prompt section buildToolManifest(spec) injects only the active set's tool definitions into the prompt contract, so each encounter's LLM only sees tools it can use.
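
The registry and active-set filtering can be sketched as follows. This is a simplified illustration: `registerTool`, `activeTools`, `SpecLike`, and the manifest text format are hypothetical, the `handler` and `contextDocs` fields are omitted, and only the name `buildToolManifest` comes from the text above.

```typescript
// Hypothetical sketch of a plugin registry with per-spec active-set filtering.
interface ToolArg {
  type: "string" | "number" | "boolean";
  description: string;
}

interface ToolPlugin {
  name: string;
  description: string;
  args: Record<string, ToolArg>;
}

// Assumed minimal subset of EncounterSpec: only the tools array matters here.
interface SpecLike {
  tools: string[];
}

const registry = new Map<string, ToolPlugin>();

function registerTool(plugin: ToolPlugin): void {
  if (registry.has(plugin.name)) {
    throw new Error(`duplicate tool: ${plugin.name}`);
  }
  registry.set(plugin.name, plugin);
}

// Only tools that are both registered and listed by the spec are active.
function activeTools(spec: SpecLike): ToolPlugin[] {
  return spec.tools.flatMap((name) => {
    const plugin = registry.get(name);
    return plugin ? [plugin] : [];
  });
}

// Renders the active set as plain text for the system prompt (format assumed).
function buildToolManifest(spec: SpecLike): string {
  return activeTools(spec)
    .map((tool) => {
      const args = Object.entries(tool.args)
        .map(([name, arg]) => `    ${name} (${arg.type}): ${arg.description}`)
        .join("\n");
      return `- ${tool.name}: ${tool.description}` + (args ? `\n${args}` : "");
    })
    .join("\n");
}

registerTool({
  name: "skill_check_emit",
  description: "Post a dice-roll embed to the thread",
  args: { dc: { type: "number", description: "Difficulty class" } },
});
registerTool({
  name: "foundry_lookup",
  description: "Pull live character data",
  args: {},
});

// This spec opts into one registered tool and one unknown name.
const exampleManifest = buildToolManifest({
  tools: ["skill_check_emit", "not_registered"],
});
```

Here unknown names are silently skipped when building the manifest; a stricter variant could throw instead, which is closer to what the spec consistency test in the repository enforces.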


4. Data Architecture

4.1 Redis (transient state)

| Key pattern | Value | TTL | Owner |
|---|---|---|---|
| session:{threadId} | JSON.stringify(SessionState) | SESSION_TTL_HOURS (12h) | sessionManager |
| guild_threads:{guildId} | Set of thread IDs | inherits | sessionManager |
| players:{guildId} (legacy design) | discordId → dndName | | playerRegistry (current impl uses different scheme) |
| raw.messages | Redis stream | | graphmcp/ingest.ts |

SessionState (src/types/index.ts) is the central shape:

{
  encounterId, threadId, guildId,
  spec: EncounterSpec,
  players: Record<discordId, Player>,
  history: ChatMessage[],         // mix of pinned + sliding
  phase: 'open' | 'active' | 'resolved',
  heldMessages: HeldMessage[],    // for unregistered players
  outcome?, outcomeSummary?,
  npcMemories?: Record<npcId, string>,
  resolvedContext?: Record<key, string>,
  pendingSkillCheck?: { player, prompt, dc, messageId, modifier?, skill?, advantage?, disadvantage? },
  pendingSkillCheckAttempts?: number,
  createdAt, updatedAt,
}
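
As a rough illustration of how a session might be written under the `session:{threadId}` key with its TTL, the helper below builds the arguments for a Redis `SET ... EX` call. `sessionKey`, `sessionSetArgs`, and `SessionLike` are hypothetical names; how sessionManager actually serializes and writes state is not shown in this document.

```typescript
// Hypothetical sketch; only the key pattern and the 12h default come from the doc.
interface SessionLike {
  threadId: string;
  [field: string]: unknown;
}

function sessionKey(threadId: string): string {
  return `session:${threadId}`;
}

// Returns [key, value, "EX", seconds], the argument order of Redis SET with expiry.
function sessionSetArgs(
  state: SessionLike,
  ttlHours = 12,
): [key: string, value: string, mode: "EX", seconds: number] {
  return [
    sessionKey(state.threadId),
    JSON.stringify(state),
    "EX",
    Math.round(ttlHours * 3600),
  ];
}

const exampleSetArgs = sessionSetArgs({ threadId: "1234", phase: "open" });
// With an ioredis client these would be passed as:
//   await redis.set(key, value, "EX", seconds);
```

Re-issuing the write with the same TTL on every update would slide the expiry forward, so an active encounter would not expire mid-session; whether the project does this is an assumption.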

4.2 Filesystem (data/)

  • tally.json: { [specName]: { runs, lastRun } }. Incremented at each encounter start.
  • summaries/{encounterId}-{ISO timestamp}.txt — one per resolved encounter, written by encounterLog.writeSummary().
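
The two filesystem shapes above can be sketched as pure helpers. `bumpTally` and `summaryFileName` are hypothetical names, and storing `lastRun` as an ISO string is an assumption; the real encounterLog module may format or sanitize these values differently.

```typescript
// Hypothetical sketch of the tally.json shape and the summary file naming.
interface TallyEntry {
  runs: number;
  lastRun: string; // assumed to be an ISO-8601 timestamp
}

type Tally = Record<string, TallyEntry>;

// Returns a new tally with the given spec's run count incremented.
function bumpTally(tally: Tally, specName: string, now: Date): Tally {
  const previous = tally[specName];
  return {
    ...tally,
    [specName]: {
      runs: (previous?.runs ?? 0) + 1,
      lastRun: now.toISOString(),
    },
  };
}

// Builds "{encounterId}-{ISO timestamp}.txt" as described above.
function summaryFileName(encounterId: string, now: Date): string {
  return `${encounterId}-${now.toISOString()}.txt`;
}

const exampleNow = new Date("2026-06-19T16:15:06.000Z");
const exampleTally = bumpTally(bumpTally({}, "market-thief", exampleNow), "market-thief", exampleNow);
const exampleSummaryName = summaryFileName("enc-1", exampleNow);
```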

4.3 GraphMCP / Neo4j (via JSON-RPC)

The bot never queries Neo4j directly. All graph access goes through GRAPHMCP_URL/mcp with JSON-RPC 2.0:

| Tool | Args | Returns |
|---|---|---|
| query_as_npc | npc_name, question, limit | NPCQueryResult (chunks + graph_context) |
| semantic_search | query, limit | SemanticSearchResult |
| log_encounter | title, participants, summary, location?, type? | LogEncounterResult |
| list_encounters | limit | EncounterResultItem[] |
| search_encounters | query?, location?, participant?, limit? | EncounterResultItem[] |
| get_encounter | id | EncounterDetails |
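
For orientation, a JSON-RPC 2.0 envelope for one of these tools might look like the sketch below. It assumes GraphMCP follows the MCP convention of a `tools/call` method with `name` and `arguments` params, which this document does not confirm; `buildToolRequest` and `unwrapResponse` are hypothetical helpers, and the example question is invented.

```typescript
// Hypothetical sketch; the "tools/call" method is an assumption based on MCP conventions.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface JsonRpcError {
  code: number;
  message: string;
}

interface JsonRpcResponse<T> {
  jsonrpc: "2.0";
  id: number;
  result?: T;
  error?: JsonRpcError;
}

let nextRequestId = 1;

function buildToolRequest(
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// A JSON-RPC 2.0 response carries either "result" or "error", never both.
function unwrapResponse<T>(response: JsonRpcResponse<T>): T {
  if (response.error) {
    throw new Error(
      `GraphMCP error ${response.error.code}: ${response.error.message}`,
    );
  }
  if (response.result === undefined) {
    throw new Error("GraphMCP response had neither result nor error");
  }
  return response.result;
}

const exampleRequest = buildToolRequest("query_as_npc", {
  npc_name: "Zalram",
  question: "What happened at the market?",
  limit: 3,
});
```

The request object would be serialized with `JSON.stringify` and POSTed to GRAPHMCP_URL/mcp; the transport code is left out so the sketch stays free of network dependencies.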

NPC memory is injected into the system prompt via formatNPCMemory() — past encounters witnessed + top-3 lore chunks above GRAPHMCP_SCORE_THRESHOLD.
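
The "top-3 chunks above the threshold" selection could look roughly like this. The `LoreChunk` fields, the XML-like tag names, and the four-argument signature are assumptions for illustration; only the function name `formatNPCMemory`, the limit of three, and the threshold concept come from the text above. Whether the threshold comparison is inclusive is not stated, so the sketch uses a strict comparison.

```typescript
// Hypothetical sketch; field names and output format are assumed.
interface LoreChunk {
  text: string;
  score: number;
}

// Keeps chunks scoring strictly above the threshold, best first, capped at `limit`.
function selectLoreChunks(
  chunks: LoreChunk[],
  threshold: number,
  limit = 3,
): LoreChunk[] {
  return chunks
    .filter((chunk) => chunk.score > threshold) // filter() copies, so sort() is safe
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

function formatNPCMemory(
  npcName: string,
  pastEncounters: string[],
  chunks: LoreChunk[],
  threshold: number,
): string {
  const lines = [`<npc_memory name="${npcName}">`];
  for (const encounter of pastEncounters) {
    lines.push(`  <witnessed>${encounter}</witnessed>`);
  }
  for (const chunk of selectLoreChunks(chunks, threshold)) {
    lines.push(`  <lore>${chunk.text}</lore>`);
  }
  lines.push("</npc_memory>");
  return lines.join("\n");
}

const exampleChunks: LoreChunk[] = [
  { text: "a", score: 0.9 },
  { text: "b", score: 0.4 },
  { text: "c", score: 0.7 },
  { text: "d", score: 0.8 },
  { text: "e", score: 0.95 },
];
const exampleSelected = selectLoreChunks(exampleChunks, 0.5);
const exampleMemory = formatNPCMemory("Zalram", ["market-thief"], exampleChunks, 0.5);
```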

4.4 Context window budget

src/types/index.ts exports a CONTEXT_BUDGET constant used by both contextAssembler and sessionManager:

| Zone | Tokens |
|---|---|
| System prompt (narrator + NPCs + tools + goals) | 4,000 |
| Pinned (opening narrative, goal block) | 2,000 |
| Sliding history | 118,000 |
| Safety buffer | 3,500 |
| Total allocated | 127,500 (of the 128,000-token window) |

History trimming drops the oldest non-pinned turn pair when over budget, with a hard floor of 6 messages. Token estimates use gpt-tokenizer with a 1.15× buffer to approximate Gemma's tokenizer.
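
A minimal sketch of the trimming rule, under stated assumptions: a characters-divided-by-four estimator stands in for gpt-tokenizer, a `pinned` flag marks pinned messages, and a "turn pair" is taken to be the oldest non-pinned message plus the message directly after it. The shared implementation in src/lib/historyTrim.ts is not reproduced here and may differ.

```typescript
// Hypothetical sketch of budget trimming with a message floor.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
  pinned?: boolean; // assumed marker for pinned history
}

const SLIDING_BUDGET_TOKENS = 118_000; // sliding-history zone from the table above
const MIN_MESSAGES = 6;                // hard floor
const TOKEN_BUFFER = 1.15;             // safety multiplier on the estimate

// Stand-in estimator (about 4 characters per token); the project uses gpt-tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil((text.length / 4) * TOKEN_BUFFER);
}

// Only non-pinned messages count against the sliding budget in this sketch.
function slidingTokens(history: ChatMessage[]): number {
  return history
    .filter((message) => !message.pinned)
    .reduce((sum, message) => sum + estimateTokens(message.content), 0);
}

function trimHistory(
  history: ChatMessage[],
  budget = SLIDING_BUDGET_TOKENS,
): ChatMessage[] {
  const kept = [...history];
  while (slidingTokens(kept) > budget && kept.length > MIN_MESSAGES) {
    const oldest = kept.findIndex((message) => !message.pinned);
    if (oldest === -1) break; // nothing left that may be dropped
    kept.splice(oldest, 1);
    // Drop the partner of the pair too, unless that would breach the floor.
    const partner = kept[oldest];
    if (partner && !partner.pinned && kept.length > MIN_MESSAGES) {
      kept.splice(oldest, 1);
    }
  }
  return kept;
}

const exampleHistory: ChatMessage[] = [
  { role: "assistant", content: "Opening narrative", pinned: true },
  ...Array.from({ length: 10 }, (_, i): ChatMessage => ({
    role: i % 2 === 0 ? "user" : "assistant",
    content: "x".repeat(400),
  })),
];
const exampleTrimmed = trimHistory(exampleHistory, 500);
```

With the deliberately tiny budget of 500 tokens in the example, trimming stops at the six-message floor even though the remaining history is still over budget, which is the trade-off the hard floor implies.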


5. API Surface

This project exposes its functionality as two different APIs:

5.1 Discord slash commands (player/admin surface)

Registered via src/scripts/deploy-commands.ts using Discord REST v10.

| Command | Subcommands | Purpose |
|---|---|---|
| /dndname | set <name>, show, clear | Character name registration |
| /character | register foundry\|custom, show, view, clear, admin list\|remove\|give | Full character profile + Foundry link |
| /encounter | start <spec>, random, status, stats, audit, end [notes], list, generate <theme>, spec | Encounter session lifecycle |
| /encounters | (Select menu + search modal) | Search the encounter log via GraphMCP |
| /roll | action | Manual dice roll |
| /actions | | In-character action shortcuts |
| /turn | | Turn management |
| /xp | award <amount> | Award XP (relay → VTT) |

Plus button + modal interactions: skill-check roll buttons, give item, custom character registration, Foundry link, encounter select menu, search modal.

5.2 Tool plugins (LLM surface)

Defined in src/harness/tools/ and registered at module load. Each spec filters the active set via its tools: array.

| Tool | Purpose | Args |
|---|---|---|
| skill_check_emit | Posts a dice-roll embed to the thread; blocks player input until resolved | player, prompt, skill?, dc, advantage?, disadvantage? |
| encounter_resolve | Marks encounter complete; writes summary; archives thread | (args handled in tools/encounterResolve.ts) |
| context_recall | Look up canonical session facts stored in resolvedContext | |
| goal_register | Add a new goal mid-encounter (the prd.md "dynamic goal registration" feature) | |
| foundry_lookup | Pull live character data from VTT relay | |
| foundry_reward | Award XP/items to a character via VTT | |

⚠ Note: Docs/mardonar-encounter-engine.md lists skill_check_resolve, event_log_append, npc_memory_read, and npc_memory_write as tools. These have been removed and replaced by the per-encounter event log plus the GraphMCP log_encounter tool. The current tool set is the one above.


6. Deployment Architecture

6.1 Local development

docker compose -f docker-compose.dev.yml up -d   # Builds + runs bot; relies on Redis + GraphMCP already running on the `mardonar-internal` Docker network (see `docs/deployment-guide.md`)
npm install
npm run deploy-commands                          # registers slash commands with Discord
npm run dev                                      # tsx watch mode

6.2 Production (multi-stage Dockerfile)

Dockerfile (Node 22 alpine):

  1. Builder stage: npm ci --ignore-scripts, copy src + tsconfig.json, npm run build → dist/
  2. Runtime stage: npm ci --omit=dev --ignore-scripts, copy dist/, specs/, lore/, persona.yaml
  3. CMD ["node", "dist/bot/index.js"]

docker-compose.dev.yml defines two services: deploy-commands (one-shot) and bot (long-running, with data/ mounted as a volume). Both attach to the external mardonar-internal Docker network, which also hosts Redis and an MCP server from the GraphMCP-Example stack.

Gap: There is no production docker-compose.yml. The .env.example is the source of truth for runtime config.

6.3 Operational

  • Session state has a 12h TTL by default — stale encounters auto-expire
  • Bot connects to Redis on main() startup (redis.connect())
  • VTT relay auto-spins up a headless Foundry session on connection failure (RSA-OAEP encrypted handshake)
  • Logging: src/lib/logger.ts writes plaintext to stdout. No LOG_LEVEL env knob; callers pick the level per-call. (Earlier docs claimed pino + structured JSON — that was aspirational; the pino deps were unused and have been removed.)

7. Development & Testing

7.1 Local commands

| Command | Effect |
|---|---|
| npm run dev | tsx watch src/bot/index.ts (auto-reload dev) |
| npm run build | tsc → dist/ |
| npm run start | node dist/bot/index.js |
| npm run deploy-commands | One-shot slash command registration |
| npm run test | All tests (vitest) |
| npm run test:unit | Unit tests only (no external services) |
| npm run test:int | Integration tests (requires Docker services) |

7.2 Test coverage

  • 33 unit test files in tests/unit/ (393 tests, 2 skipped)
  • 1 integration test (tests/integration/phase1.test.ts)
  • tests/fixtures/spec.ts — shared encounter spec fixture

Notable test surfaces: promptBuilder, contextAssembler, historyTrim, toolParser, toolDispatcher, toolRegistry, sessionManager, playerRegistry, characterRegistry, specLoader, rollHandler, rollDetection, responseFilter, queueCap, generationQueue, reactionManager, encounterLog, encounterDiscoveryEmbed, loreAnswerEmbed, skillCheckEmbed, graphmcpClient, foundryClientRetry, foundryClientFormatters, goalRegister, relaySession, litellmClient, ollamaClient, personaLoader, foundryReward, xpAwarder, redisErrorPath, messageRouterRunLLMTurn, specsToolsConsistency (the last is a structural-consistency guard, not a module surface).


8. Design Decisions (Living)

| Decision | Why |
|---|---|
| LiteLLM as primary, Ollama as fallback | OpenAI-compatible proxy gives model flexibility without code changes; Ollama fallback ensures the bot still runs when the proxy is down |
| Prompt-based tool calls (not native) | Gemma 4 IT at e2b is unreliable with native function calling; fenced JSON block parsing is deterministic |
| Tool plugin registry with per-spec active set | New tools can be added without touching the dispatch core; specs opt into only the tools they need |
| Pinned + sliding history | Opening narrative and goal block must survive trimming or the LLM loses its anchor |
| Goals in system prompt, not as a tool | Goals rarely change mid-encounter; embedding them reduces tool round-trips |
| Redis for active state, GraphMCP for memory | Redis is fast and ephemeral for live sessions; the graph holds long-term NPC lore |
| Player name gate via embed, not DMs | Keeps the conversation in-thread; ephemeral embed auto-deletes after 30s |
| Story generator via /encounter generate | Separates creative authoring from real-time inference; the generator can use a stronger model later |
| VTT relay auto-spin-up | Lets the bot operate when the relay has been cold-stopped; uses RSA-OAEP for password handoff |
| In-world voice rule for player-facing strings | See feedback-in-world-voice: no utility/jargon in bot messages |
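
The first decision listed above (LiteLLM primary, Ollama fallback) amounts to a try-then-fallback wrapper. The sketch below is an assumption about its general shape: the two clients are injected as plain functions, and the `onFallback` hook is hypothetical. The real harness/llmClient.ts may add retries, timeouts, or error classification.

```typescript
// Hypothetical sketch of primary-then-fallback LLM calls; clients are injected.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

type ChatFn = (messages: ChatMessage[]) => Promise<string>;

async function callLLM(
  messages: ChatMessage[],
  primary: ChatFn,   // e.g. a LiteLLM-backed client
  fallback: ChatFn,  // e.g. an Ollama-backed client
  onFallback?: (error: unknown) => void,
): Promise<string> {
  try {
    return await primary(messages);
  } catch (error) {
    // Any primary failure routes the same messages to the fallback.
    onFallback?.(error);
    return fallback(messages);
  }
}
```

Injecting both clients keeps the wrapper testable without a running proxy: a test can pass a primary that always throws and check that the fallback's reply is returned.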

9. Open Issues / Drift

Items the deep scan surfaced that aren't bugs but should be tracked:

  • Drift: Docs/mardonar-encounter-engine.md describes a Go bot with an embedded MCP layer; the actual code is TypeScript with an external JSON-RPC GraphMCP server. Treat the doc as historical/aspirational.
  • Resolved 2026-06-19 — README.md's "Project Structure" tree referenced src/mcp/ and the old 2-command layout. README now reflects the actual 8-command structure, src/graphmcp/ (Neo4j/src/mcp/ retired), and includes a callout noting Docs/mardonar-encounter-engine.md is historical.
  • Resolved 2026-06-19 — Duplicate trimHistory logic in src/session/sessionManager.ts and src/harness/contextAssembler.ts was extracted to src/lib/historyTrim.ts. tests/unit/historyTrim.test.ts covers the shared module at 100%.
  • No production compose file — only docker-compose.dev.yml. The Dockerfile is production-ready but deployment is ad-hoc.
  • Resolved 2026-06-19 — No CI/CD: .gitea/workflows/test.yml runs tsc --noEmit, npm run test:unit, and npm run test:coverage on push/PR to main (Node 22, cached npm).
  • DISCORD_ALLOWED_USERS is empty by default → anyone in allowed channels can run /encounter start. The access control is channel-scoped, not user-scoped; admins need to set the env var explicitly.
  • OLLAMA_BASE_URL defaults to localhost — fine for dev, but production needs the LAN IP or proxy URL set.
  • Resolved 2026-06-19 — Spec tool list must be kept in sync: tests/unit/specsToolsConsistency.test.ts walks every specs/*.yaml, asserts each entry in tools: [...] is registered in the tool plugin registry, and fails loudly with the file and unknown name if drift appears. Also asserts every registered tool is referenced by at least one spec.
  • Resolved 2026-06-19 — Schema mismatch risk: src/types/index.ts now re-exports EncounterSpec (and its sub-shapes) derived from z.infer<typeof EncounterSpecSchema>. The static type and the runtime validator are now the same source of truth — drift is structurally impossible. Side effect: loadSpec now also validates xpReward as a number (was previously typed but unenforced).
  • Resolved 2026-06-19 — Logging drift: the architecture previously claimed pino + pino-pretty + structured JSON. The actual logger is the custom src/lib/logger.ts (plaintext stdout, no env-driven level filter). The unused pino and pino-pretty dependencies were removed from package.json; §2.1, §2.2, and §6.3 now describe reality.
  • Resolved 2026-06-19 — README drift: README.md was significantly out of date: it told new contributors to set a no-op LOG_LEVEL=debug, run the non-existent npm run validate-spec, and look at src/mcp/ (renamed to src/graphmcp/) and src/db/neo4j.ts (no Neo4j in the project). It also linked Docs/mardonar-encounter-engine.md (Go architecture, historical) as the current architecture doc. The dead top-level scripts/deploy-commands.ts — a stale duplicate of src/scripts/deploy-commands.ts that only knew about 2 of 8 commands — was removed. The README now reflects the actual layout, command set, and persistence layer.
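
To make the DISCORD_ALLOWED_USERS item above concrete, the check it describes behaves roughly like the sketch below. `parseIdList` and `mayStartEncounter` are hypothetical names, and the comma-separated env format is an assumption; the point is only that an empty user list means "no user restriction".

```typescript
// Hypothetical sketch of channel-scoped access with an optional user allowlist.
function parseIdList(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

function mayStartEncounter(
  userId: string,
  parentChannelId: string,
  allowedChannels: Set<string>,
  allowedUsers: Set<string>,
): boolean {
  // Channel scope always applies.
  if (!allowedChannels.has(parentChannelId)) return false;
  // An empty allowlist imposes no user restriction.
  return allowedUsers.size === 0 || allowedUsers.has(userId);
}

const exampleChannels = parseIdList("111, 222");
const exampleOpen = mayStartEncounter("u1", "111", exampleChannels, parseIdList(undefined));
const exampleRestricted = mayStartEncounter("u1", "111", exampleChannels, parseIdList("u2"));
```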

Document generated by bmad-document-project initial scan, deep level. Project state recorded in docs/project-scan-report.json.