From 37a1a3d421f272b96e77b60acb84ff933279db67 Mon Sep 17 00:00:00 2001
From: Kaysser Kayyali
Date: Mon, 22 Jun 2026 16:02:00 +0000
Subject: [PATCH] feat(specs): velvet-auction exercises group tools; FU-9 playtest gate; docs drift fix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

FU-12 — velvet-auction.yaml now uses the group-encounter tools:
- minPlayers: 3 (lobby-gated party heist, matches PRD UJ-1)
- passiveReveals: Insight/15 (notices Karr's tell — Feature B)
- group_stealth skillChecks entry (group Stealth, successRule: majority,
  durationSeconds: 60) + skill_check_group_emit and character_status added
  to the tools list.
- specsToolsConsistency: emptied the NOT_YET_REFERENCED allowlist
  (skill_check_group_emit + character_status are now referenced); all 8
  registered tools are reachable from specs.
Validated: specLoader + specsToolsConsistency + full unit suite (527) pass.

FU-9 — docs/release-playtest-checklist.md: the 7-step manual pre-release
multi-player playtest checklist checked into the repo as a release gate (was
buried only in the arch doc). Includes pass criteria (no orphaned thread / lost
roll / raw-JSON leak) + the NFR-3/NFR-4 latency checklist.

docs/project-overview.md drift fix: pino -> src/lib/logger.ts (custom
plaintext, ADR-002); primary LLM -> minimax-m3 via LiteLLM (LITELLM_MODEL);
test count 22 -> 58; lib/ description; relabel dynamic goal registration as
delivered.

Co-Authored-By: Claude
---
 docs/project-overview.md                 | 12 ++--
 docs/release-playtest-checklist.md       | 76 ++++++++++++++++++++++++
 specs/velvet-auction.yaml                | 23 +++++++
 tests/unit/specsToolsConsistency.test.ts |  8 ++-
 4 files changed, 110 insertions(+), 9 deletions(-)
 create mode 100644 docs/release-playtest-checklist.md

diff --git a/docs/project-overview.md b/docs/project-overview.md
index 784bbb9..6fbf590 100644
--- a/docs/project-overview.md
+++ b/docs/project-overview.md
@@ -4,7 +4,7 @@
 
 ## What it is
 
-A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (Gemma 4 IT e2b through LiteLLM with Ollama fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.
+A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (minimax-m3 through LiteLLM, with Ollama as fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.
 
 ## Who it serves
 
@@ -16,14 +16,14 @@ Discord community members playing D&D 5e in the Land of Mardonar. The DM runs `/
 |---|---|
 | Runtime | Node.js 22 (ESM, TypeScript 5.8 strict) |
 | Discord | discord.js v14 |
-| LLM (primary) | LiteLLM proxy (env: `LITELLM_BASE_URL`) |
+| LLM (primary) | LiteLLM proxy — minimax-m3 (env: `LITELLM_BASE_URL`, `LITELLM_MODEL`) |
 | LLM (fallback) | Ollama (env: `OLLAMA_BASE_URL`) — `gemma4-it:e2b`, 128k context |
 | Session cache | Redis (ioredis), 12h TTL |
 | Graph DB | Neo4j (via GraphMCP JSON-RPC, not direct) |
 | Lore / NPC memory | GraphMCP HTTP JSON-RPC server |
 | Foundry VTT | External relay (optional, requires API key) |
 | Validation | Zod (env + encounter spec) |
-| Logging | pino + pino-pretty |
+| Logging | `src/lib/logger.ts` (custom plaintext — pino removed) |
 | Testing | Vitest 3 (unit + integration) |
 | Build | tsc → multi-stage Node 22 alpine Dockerfile |
 
@@ -62,12 +62,12 @@ src/
 ├── spec/ # YAML encounter loader + Zod schema
 ├── persona/ # persona.yaml loader
 ├── config.ts # Zod env validation
-├── lib/ # logger
+├── lib/ # logger (custom plaintext), historyTrim, skillCheckMessages
 ├── scripts/ # deploy-commands (slash command registration)
 └── types/ # shared interfaces + CONTEXT_BUDGET
 ```
 
-Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).
+Plus `specs/` (8 encounter YAML files), `tests/` (58 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).
 
 ## Documentation
 
@@ -82,7 +82,7 @@ Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runti
 ## Key features in the current codebase
 
 - **Per-encounter tool filtering.** Each spec declares which tool plugins are active.
-- **Dynamic goal registration** (the active PRD feature) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
+- **Dynamic goal registration** (delivered) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
 - **Three-pattern tool parser** — handles fenced `tool_call`, bare `tool_call` header, and fuzzy bare JSON, so even smaller models can drive tools.
 - **Self-spinning VTT relay** — when the relay is down, the bot handshakes via RSA-OAEP and launches a headless Foundry session on demand.
 - **Burst cap with drop notices** — if too many messages arrive before the last LLM response, the bot drops the excess and posts a tone-aware notice.
diff --git a/docs/release-playtest-checklist.md b/docs/release-playtest-checklist.md
new file mode 100644
index 0000000..07f1d9c
--- /dev/null
+++ b/docs/release-playtest-checklist.md
@@ -0,0 +1,76 @@
+# Multi-Player Playtest Checklist (Release Gate)
+
+Manual pre-release checklist for **group-encounter** features (Features A–E + FR-43).
+Required before any release that touches the lobby, group checks, passive reveals,
+timed checks, or story-status surfaces.
+
+> **Why this exists.** Group checks have **unit + integration coverage only — no
+> live E2E tier**. The one-token constraint makes true multi-player live E2E
+> impossible without a synthetic-Interaction forge (integration, not live) or a
+> second bot token (violates the constraint). The deterministic core is fully
+> unit/integration covered; the Discord fan-out surface is shared with existing
+> single-player live ACs. **This manual checklist is the safety net for the
+> residual risk** (real Discord fan-out, gateway event ordering, ephemeral-in-thread
+> quirks, burst rate-limiting). Source: `_bmad-output/arch/arch-mardonar-encounter-engine-2026-06-20/architecture.md §8` (closes Murat #5).
+
+---
+
+## Pass criteria
+A pass = **no orphaned thread, no lost roll, no raw-JSON leak to players.**
+Any of those = fail the release and fix before shipping.
+
+## The 7 steps
+Run these against a real Discord guild with ≥3 test players and a group spec
+(e.g. `velvet-auction`, `minPlayers: 3`).
+
+1. **Lobby → start.** N players join the lobby; `Start` enables at `minPlayers`;
+   starter presses Start. Opening narrative posts; passive reveals fire for
+   qualifying players. Confirm the auto `[SESSION] entered` announcement is
+   **suppressed** for the group encounter.
+2. **Group check, all roll.** LLM emits a group skill check. Every targeted
+   player clicks **Roll**; each gets an ephemeral with their d20+mod vs DC; the
+   central scoreboard fills live and finalizes with a group SUCCESS/FAILURE +
+   `[SKILL CHECK RESULT]` system message. Confirm no double rolls, no lost rolls.
+3. **Timed group check.** A group check with `durationSeconds`. Watch the
+   countdown (10s increments), the hourglass GIF in the final stretch, and the
+   "final sands" text cue. Let one player roll early and one let it expire →
+   expiry finalizes correctly (unrolled = failure) without hanging the thread.
+4. **Latecomer joins a running encounter.** A non-joined player tries to post →
+   their message is auto-deleted (FR-28/29). They join via the persistent
+   **Join** button on the lobby embed **and** via `/encounter join`; their
+   messages are then accepted. Confirm they are **not** retro-added to an
+   in-flight group check's target set.
+5. **Non-joined message deleted + guided.** A non-joined member posts during the
+   lobby phase and during a running group encounter; the bot deletes it and
+   guides them to Join. Confirm **no false-positive deletions** of joined
+   players' messages, and that missing `Manage Messages` degrades safely (logs +
+   skips deletion, does not crash — NFR-7).
+6. **No-show.** A targeted player doesn't roll. Untimed: the no-show grace period
+   (~60s) passes → they count as a failure, check finalizes. Timed: timer
+   expires → timeout finalize. Either way the thread does not hang.
+7. **Bot restart mid-group-check.** With a group check in flight, restart the
+   bot. The boot sweep rehydrates `groupcheck:{threadId}` (FR-44) and the
+   `encounter:{threadId}:active` flag; in-flight checks rehydrate for remaining
+   players to finish, and any check whose deadline passed finalizes as a
+   timeout. Confirm no orphaned thread and no lost roll state.
+
+---
+
+## Latency checklist (NFR-3 / NFR-4)
+While playtesting, record observed p95 from the bot's perspective (non-LLM
+overhead, i.e. excluding the LLM generation wait):
+- **Single-roll narration path:** p95 ≤ **8s**.
+- **Group-resolution path:** p95 ≤ **15s**.
+
+Record the observed numbers in the release notes. A miss = a perf follow-up, not
+an automatic fail, but investigate before shipping if either is exceeded by >50%.
+
+---
+
+## Sign-off
+- [ ] All 7 steps run; pass criteria met (no orphaned thread / lost roll / raw-JSON leak).
+- [ ] Latency p95 recorded (single ≤8s, group ≤15s).
+- [ ] Tester: ______ Date: ______ Release/commit: ______
+
+> File issues for any failure. Do not ship a group-encounter release without a
+> completed checklist.
\ No newline at end of file
diff --git a/specs/velvet-auction.yaml b/specs/velvet-auction.yaml
index 03b33c0..d3c764a 100644
--- a/specs/velvet-auction.yaml
+++ b/specs/velvet-auction.yaml
@@ -2,6 +2,9 @@ encounterId: "mardonar-velvet-auction-006"
 title: "The Velvet Quill Auction"
 tone: "mysterious"
 
+# Group encounter (Feature D) — requires a party; the lobby gates until 3 join.
+minPlayers: 3
+
 setting:
   location: "Upper District — private lounge in the Velvet Quill parlor"
   mood: >
@@ -101,6 +104,24 @@ skillChecks:
   negotiate_vesper_note: >
     Offering Madame Vesper secrets or trade arrangements of greater value than Karr's gold.
 
+  group_stealth_dc: 15
+  group_stealth_skill: "Stealth"
+  group_stealth_note: >
+    A coordinated group Stealth check when the party moves on the artifact together
+    (e.g. during a staged distraction). Emit as a GROUP check via skill_check_group_emit
+    targeting all joined players, with successRule: majority and durationSeconds: 60.
+    Failure means the guards or the abjuration wards notice the coordinated movement.
+
+# Passive skill reveals (Feature B) — bot-applied at encounter start, group-visible,
+# attributed to the qualifying player. threshold is a passive DC (integer); revealText
+# is outcome prose only — no dice results (the engine owns rolls).
+passiveReveals:
+  - skill: "Insight"
+    threshold: 15
+    revealText: >
+      Notices a faint tremor in Karr's grip as he raises his bid paddle — his swagger
+      is a veneer, and something about this lot has him badly rattled.
+
 randomizable:
   - key: broker_name
     source: vocabulary
@@ -121,11 +142,13 @@ randomizable:
 
 tools:
   - skill_check_emit
+  - skill_check_group_emit
   - encounter_resolve
   - context_recall
   - goal_register
   - foundry_lookup
   - foundry_reward
+  - character_status
 
 dmNotes: >
   This is a social heist encounter. Direct combat is highly discouraged by the presence of abjuration wards
diff --git a/tests/unit/specsToolsConsistency.test.ts b/tests/unit/specsToolsConsistency.test.ts
index db2df2f..6b047ed 100644
--- a/tests/unit/specsToolsConsistency.test.ts
+++ b/tests/unit/specsToolsConsistency.test.ts
@@ -78,9 +78,11 @@ describe('specs/*.yaml tool references', () => {
 
   it('every registered tool is referenced by at least one spec (sanity: the registry is reachable from the default active set)', () => {
     // Tools registered ahead of their spec are allowlisted here — remove the
-    // entry once a spec references the tool. skill_check_group_emit lands a
-    // group spec with the lobby (Story 9).
-    const NOT_YET_REFERENCED = new Set(['skill_check_group_emit', 'character_status']);
+    // entry once a spec references the tool. As of 2026-06-22, velvet-auction
+    // references skill_check_group_emit (group Stealth) and character_status,
+    // so the allowlist is empty. Re-add a name here only when a new tool is
+    // registered ahead of its landing spec.
+    const NOT_YET_REFERENCED = new Set([]);
     const referenced = new Set();
     for (const { raw } of specFiles) {
       if (Array.isArray(raw.tools)) {