feat(specs): velvet-auction exercises group tools; FU-9 playtest gate; docs drift fix

FU-12 — velvet-auction.yaml now uses the group-encounter tools:
- minPlayers: 3 (lobby-gated party heist, matches PRD UJ-1)
- passiveReveals: Insight/15 (notices Karr's tell — Feature B)
- group_stealth skillChecks entry (group Stealth, successRule: majority,
  durationSeconds: 60) + skill_check_group_emit and character_status added
  to the tools list.
- specsToolsConsistency: emptied the NOT_YET_REFERENCED allowlist
  (skill_check_group_emit + character_status are now referenced); all 8
  registered tools are reachable from specs.

Validated: specLoader + specsToolsConsistency + full unit suite (527) pass.

FU-9 — docs/release-playtest-checklist.md: the 7-step manual pre-release
multi-player playtest checklist checked into the repo as a release gate (was
buried only in the arch doc). Includes pass criteria (no orphaned thread / lost
roll / raw-JSON leak) + the NFR-3/NFR-4 latency checklist.

docs/project-overview.md drift fix: pino -> src/lib/logger.ts (custom
plaintext, ADR-002); primary LLM -> minimax-m3 via LiteLLM (LITELLM_MODEL);
test count 22 -> 58; lib/ description; relabel dynamic goal registration as
delivered.

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -4,7 +4,7 @@

 ## What it is

-A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (Gemma 4 IT e2b through LiteLLM with Ollama fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.
+A Discord bot that runs structured D&D encounters. Each Discord thread is an encounter session. The bot loads a YAML spec, narrates the scene via an LLM (minimax-m3 through LiteLLM, with Ollama as fallback), voices NPCs with stable personas, runs skill checks via Discord embeds, and persists NPC memory + encounter history into a graph database through GraphMCP (JSON-RPC over HTTP). Optional Foundry VTT integration pulls live character stats and awards XP via an external relay.

 ## Who it serves

@@ -16,14 +16,14 @@ Discord community members playing D&D 5e in the Land of Mardonar. The DM runs `/
 |---|---|
 | Runtime | Node.js 22 (ESM, TypeScript 5.8 strict) |
 | Discord | discord.js v14 |
-| LLM (primary) | LiteLLM proxy (env: `LITELLM_BASE_URL`) |
+| LLM (primary) | LiteLLM proxy — minimax-m3 (env: `LITELLM_BASE_URL`, `LITELLM_MODEL`) |
 | LLM (fallback) | Ollama (env: `OLLAMA_BASE_URL`) — `gemma4-it:e2b`, 128k context |
 | Session cache | Redis (ioredis), 12h TTL |
 | Graph DB | Neo4j (via GraphMCP JSON-RPC, not direct) |
 | Lore / NPC memory | GraphMCP HTTP JSON-RPC server |
 | Foundry VTT | External relay (optional, requires API key) |
 | Validation | Zod (env + encounter spec) |
-| Logging | pino + pino-pretty |
+| Logging | `src/lib/logger.ts` (custom plaintext — pino removed) |
 | Testing | Vitest 3 (unit + integration) |
 | Build | tsc → multi-stage Node 22 alpine Dockerfile |

@@ -62,12 +62,12 @@ src/
 ├── spec/ # YAML encounter loader + Zod schema
 ├── persona/ # persona.yaml loader
 ├── config.ts # Zod env validation
-├── lib/ # logger
+├── lib/ # logger (custom plaintext), historyTrim, skillCheckMessages
 ├── scripts/ # deploy-commands (slash command registration)
 └── types/ # shared interfaces + CONTEXT_BUDGET
 ```

-Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).
+Plus `specs/` (8 encounter YAML files), `tests/` (58 test files), `data/` (runtime tally + summaries), and `Docs/` (pre-existing project documentation, partially out of date).

 ## Documentation

@@ -82,7 +82,7 @@ Plus `specs/` (8 encounter YAML files), `tests/` (22 test files), `data/` (runti
 ## Key features in the current codebase

 - **Per-encounter tool filtering.** Each spec declares which tool plugins are active.
-- **Dynamic goal registration** (the active PRD feature) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
+- **Dynamic goal registration** (delivered) — `tools/goalRegister.ts` lets the LLM add new goals mid-encounter.
 - **Three-pattern tool parser** — handles fenced `tool_call`, bare `tool_call` header, and fuzzy bare JSON, so even smaller models can drive tools.
 - **Self-spinning VTT relay** — when the relay is down, the bot handshakes via RSA-OAEP and launches a headless Foundry session on demand.
 - **Burst cap with drop notices** — if too many messages arrive before the last LLM response, the bot drops the excess and posts a tone-aware notice.
76
docs/release-playtest-checklist.md
Normal file
@@ -0,0 +1,76 @@
+# Multi-Player Playtest Checklist (Release Gate)
+
+Manual pre-release checklist for **group-encounter** features (Features A–E + FR-43).
+Required before any release that touches the lobby, group checks, passive reveals,
+timed checks, or story-status surfaces.
+
+> **Why this exists.** Group checks have **unit + integration coverage only — no
+> live E2E tier**. The one-token constraint makes true multi-player live E2E
+> impossible without a synthetic-Interaction forge (integration, not live) or a
+> second bot token (violates the constraint). The deterministic core is fully
+> unit/integration covered; the Discord fan-out surface is shared with existing
+> single-player live ACs. **This manual checklist is the safety net for the
+> residual risk** (real Discord fan-out, gateway event ordering, ephemeral-in-thread
+> quirks, burst rate-limiting). Source: `_bmad-output/arch/arch-mardonar-encounter-engine-2026-06-20/architecture.md §8` (closes Murat #5).
+
+---
+
+## Pass criteria
+A pass = **no orphaned thread, no lost roll, no raw-JSON leak to players.**
+Any of those = fail the release and fix before shipping.
+
+## The 7 steps
+Run these against a real Discord guild with ≥3 test players and a group spec
+(e.g. `velvet-auction`, `minPlayers: 3`).
+
+1. **Lobby → start.** N players join the lobby; `Start` enables at `minPlayers`;
+   starter presses Start. Opening narrative posts; passive reveals fire for
+   qualifying players. Confirm the auto `[SESSION] entered` announcement is
+   **suppressed** for the group encounter.
+2. **Group check, all roll.** LLM emits a group skill check. Every targeted
+   player clicks **Roll**; each gets an ephemeral with their d20+mod vs DC; the
+   central scoreboard fills live and finalizes with a group SUCCESS/FAILURE +
+   `[SKILL CHECK RESULT]` system message. Confirm no double rolls, no lost rolls.
+3. **Timed group check.** A group check with `durationSeconds`. Watch the
+   countdown (10s increments), the hourglass GIF in the final stretch, and the
+   "final sands" text cue. Let one player roll early and one let it expire →
+   expiry finalizes correctly (unrolled = failure) without hanging the thread.
+4. **Latecomer joins a running encounter.** A non-joined player tries to post →
+   their message is auto-deleted (FR-28/29). They join via the persistent
+   **Join** button on the lobby embed **and** via `/encounter join`; their
+   messages are then accepted. Confirm they are **not** retro-added to an
+   in-flight group check's target set.
+5. **Non-joined message deleted + guided.** A non-joined member posts during the
+   lobby phase and during a running group encounter; the bot deletes it and
+   guides them to Join. Confirm **no false-positive deletions** of joined
+   players' messages, and that missing `Manage Messages` degrades safely (logs +
+   skips deletion, does not crash — NFR-7).
+6. **No-show.** A targeted player doesn't roll. Untimed: the no-show grace period
+   (~60s) passes → they count as a failure, check finalizes. Timed: timer
+   expires → timeout finalize. Either way the thread does not hang.
+7. **Bot restart mid-group-check.** With a group check in flight, restart the
+   bot. The boot sweep rehydrates `groupcheck:{threadId}` (FR-44) and the
+   `encounter:{threadId}:active` flag; in-flight checks rehydrate for remaining
+   players to finish, and any check whose deadline passed finalizes as a
+   timeout. Confirm no orphaned thread and no lost roll state.
+
+---
+
+## Latency checklist (NFR-3 / NFR-4)
+While playtesting, record observed p95 from the bot's perspective (non-LLM
+overhead, i.e. excluding the LLM generation wait):
+- **Single-roll narration path:** p95 ≤ **8s**.
+- **Group-resolution path:** p95 ≤ **15s**.
+
+Record the observed numbers in the release notes. A miss = a perf follow-up, not
+an automatic fail, but investigate before shipping if either is exceeded by >50%.
+
+---
+
+## Sign-off
+- [ ] All 7 steps run; pass criteria met (no orphaned thread / lost roll / raw-JSON leak).
+- [ ] Latency p95 recorded (single ≤8s, group ≤15s).
+- [ ] Tester: ______ Date: ______ Release/commit: ______
+
+> File issues for any failure. Do not ship a group-encounter release without a
+> completed checklist.
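
Steps 2, 3, 4 and 6 of the checklist all depend on one finalize rule: the target set is fixed when the check is emitted, an unrolled target counts as a failure, and the group outcome follows `successRule: majority`. The sketch below shows one way that rule could look. The type and function names are hypothetical (the engine's real code is not part of this diff), and reading "majority" as strictly more than half is an assumption.

```typescript
// Hypothetical shapes for illustration; the bot's real types are not shown in this commit.
type RollOutcome = "success" | "failure";

interface GroupCheck {
  dc: number;
  // Player ids fixed when the check was emitted (latecomers are not added).
  targets: string[];
  // Player id -> d20 + modifier total, recorded when that player clicks Roll.
  rolls: Map<string, number>;
}

interface GroupResult {
  outcomes: Map<string, RollOutcome>;
  successes: number;
  passed: boolean;
}

function finalizeGroupCheck(check: GroupCheck): GroupResult {
  const outcomes = new Map<string, RollOutcome>();
  let successes = 0;
  for (const playerId of check.targets) {
    const total = check.rolls.get(playerId);
    // No recorded roll (no-show or timer expiry) counts as a failure.
    const outcome: RollOutcome =
      total !== undefined && total >= check.dc ? "success" : "failure";
    if (outcome === "success") successes += 1;
    outcomes.set(playerId, outcome);
  }
  // ASSUMPTION: "majority" means strictly more than half of the targets.
  const passed = successes * 2 > check.targets.length;
  return { outcomes, successes, passed };
}

// Three targets at DC 15: two rolled and met the DC, one never rolled.
const result = finalizeGroupCheck({
  dc: 15,
  targets: ["ayla", "bren", "cato"],
  rolls: new Map([["ayla", 17], ["bren", 15]]),
});
console.log(result.passed, result.successes); // true 2
```

Because the loop iterates over `targets` and not over `rolls`, a roll from a player outside the target set is ignored, which is the behaviour step 4 asks the tester to confirm. If the engine instead treats an even split as a pass, the comparison would become `>=`.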
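
The latency section of the checklist asks the tester to record a p95 and compare it with the 8s and 15s budgets, flagging anything more than 50% over. The checklist does not say how the p95 is computed, so the sketch below assumes the nearest-rank method. The function names and verdict labels are hypothetical.

```typescript
// Nearest-rank p95: the smallest sample with at least 95% of samples at or below it.
// ASSUMPTION: the checklist does not name a percentile method; nearest-rank is one option.
function p95(samplesMs: readonly number[]): number {
  if (samplesMs.length === 0) {
    throw new Error("p95 needs at least one sample");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Integer arithmetic avoids floating-point drift in the rank.
  const rank = Math.ceil((95 * sorted.length) / 100);
  const value = sorted[rank - 1];
  if (value === undefined) {
    throw new Error("rank out of range");
  }
  return value;
}

// Hypothetical labels for the three outcomes the checklist describes.
type LatencyVerdict = "within-budget" | "perf-follow-up" | "investigate-before-shipping";

function judgeLatency(observedP95Ms: number, budgetMs: number): LatencyVerdict {
  if (observedP95Ms <= budgetMs) return "within-budget";
  // More than 50% over budget: investigate before shipping.
  return observedP95Ms > budgetMs * 1.5 ? "investigate-before-shipping" : "perf-follow-up";
}

// Budgets from the checklist: single-roll narration 8s, group resolution 15s.
const SINGLE_ROLL_BUDGET_MS = 8_000;
const GROUP_RESOLUTION_BUDGET_MS = 15_000;

const singleRollSamples = [2100, 2400, 3000, 3100, 3500, 4200, 5100, 6000, 7900, 9000];
console.log(judgeLatency(p95(singleRollSamples), SINGLE_ROLL_BUDGET_MS));
```

With only a handful of playtest samples the nearest-rank p95 is simply the slowest or second-slowest observation, so a single outlier decides the verdict. That sensitivity is worth noting next to the recorded numbers.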
@@ -2,6 +2,9 @@ encounterId: "mardonar-velvet-auction-006"
 title: "The Velvet Quill Auction"
 tone: "mysterious"

+# Group encounter (Feature D) — requires a party; the lobby gates until 3 join.
+minPlayers: 3
+
 setting:
   location: "Upper District — private lounge in the Velvet Quill parlor"
   mood: >
@@ -101,6 +104,24 @@ skillChecks:
   negotiate_vesper_note: >
     Offering Madame Vesper secrets or trade arrangements of greater value than Karr's gold.

+  group_stealth_dc: 15
+  group_stealth_skill: "Stealth"
+  group_stealth_note: >
+    A coordinated group Stealth check when the party moves on the artifact together
+    (e.g. during a staged distraction). Emit as a GROUP check via skill_check_group_emit
+    targeting all joined players, with successRule: majority and durationSeconds: 60.
+    Failure means the guards or the abjuration wards notice the coordinated movement.
+
+# Passive skill reveals (Feature B) — bot-applied at encounter start, group-visible,
+# attributed to the qualifying player. threshold is a passive DC (integer); revealText
+# is outcome prose only — no dice results (the engine owns rolls).
+passiveReveals:
+  - skill: "Insight"
+    threshold: 15
+    revealText: >
+      Notices a faint tremor in Karr's grip as he raises his bid paddle — his swagger
+      is a veneer, and something about this lot has him badly rattled.
+
 randomizable:
   - key: broker_name
     source: vocabulary
@@ -121,11 +142,13 @@ randomizable:

 tools:
   - skill_check_emit
+  - skill_check_group_emit
   - encounter_resolve
   - context_recall
   - goal_register
   - foundry_lookup
   - foundry_reward
+  - character_status

 dmNotes: >
   This is a social heist encounter. Direct combat is highly discouraged by the presence of abjuration wards
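
The `passiveReveals` entry added to velvet-auction describes a threshold that is "a passive DC (integer)", applied by the bot at encounter start and attributed to the qualifying player. The sketch below shows how that comparison could work, assuming the engine uses the D&D 5e passive score (10 + skill modifier) and that meeting the threshold exactly qualifies. `PlayerSheet`, `passiveScore` and `applyPassiveReveals` are hypothetical names, and the reveal text is abbreviated.

```typescript
// Hypothetical shapes; PassiveReveal's field names mirror the YAML keys in the spec.
interface PassiveReveal {
  skill: string;
  threshold: number;
  revealText: string;
}

interface PlayerSheet {
  name: string;
  // Skill name -> total modifier (ability + proficiency), e.g. { Insight: 5 }.
  skillModifiers: Record<string, number>;
}

interface AttributedReveal {
  player: string;
  skill: string;
  revealText: string;
}

// D&D 5e passive score: 10 + the skill modifier. No dice are rolled.
function passiveScore(player: PlayerSheet, skill: string): number {
  return 10 + (player.skillModifiers[skill] ?? 0);
}

function applyPassiveReveals(
  reveals: readonly PassiveReveal[],
  players: readonly PlayerSheet[],
): AttributedReveal[] {
  const fired: AttributedReveal[] = [];
  for (const reveal of reveals) {
    for (const player of players) {
      // ASSUMPTION: a passive score equal to the threshold qualifies.
      if (passiveScore(player, reveal.skill) >= reveal.threshold) {
        fired.push({ player: player.name, skill: reveal.skill, revealText: reveal.revealText });
      }
    }
  }
  return fired;
}

const fired = applyPassiveReveals(
  [{ skill: "Insight", threshold: 15, revealText: "Notices a faint tremor in Karr's grip." }],
  [
    { name: "Ayla", skillModifiers: { Insight: 5 } }, // passive 15, qualifies
    { name: "Bren", skillModifiers: { Insight: 4 } }, // passive 14, does not
  ],
);
console.log(fired.map((r) => r.player));
```

The result carries only the player name and the prose, which matches the spec comment that `revealText` is outcome prose with no dice results.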
@@ -78,9 +78,11 @@ describe('specs/*.yaml tool references', () => {

   it('every registered tool is referenced by at least one spec (sanity: the registry is reachable from the default active set)', () => {
     // Tools registered ahead of their spec are allowlisted here — remove the
-    // entry once a spec references the tool. skill_check_group_emit lands a
-    // group spec with the lobby (Story 9).
-    const NOT_YET_REFERENCED = new Set(['skill_check_group_emit', 'character_status']);
+    // entry once a spec references the tool. As of 2026-06-22, velvet-auction
+    // references skill_check_group_emit (group Stealth) and character_status,
+    // so the allowlist is empty. Re-add a name here only when a new tool is
+    // registered ahead of its landing spec.
+    const NOT_YET_REFERENCED = new Set<string>([]);
     const referenced = new Set<string>();
     for (const { raw } of specFiles) {
       if (Array.isArray(raw.tools)) {
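
The hunk above cuts off inside the loop, so the rest of the reachability check is not visible. The standalone sketch below shows what such a check amounts to: collect every tool name referenced by any spec, then report registered tools that are neither referenced nor allowlisted. `unreferencedTools`, `LoadedSpec` and `solo.yaml` are hypothetical, and treating the eight names in velvet-auction's `tools` list as the full registry is an assumption based on the commit message.

```typescript
// Hypothetical helper; the real test inlines this logic inside a Vitest `it` block.
interface LoadedSpec {
  file: string;
  raw: { tools?: unknown };
}

function unreferencedTools(
  registered: readonly string[],
  specFiles: readonly LoadedSpec[],
  notYetReferenced: ReadonlySet<string>,
): string[] {
  const referenced = new Set<string>();
  for (const { raw } of specFiles) {
    // Specs without a tools array contribute nothing.
    if (Array.isArray(raw.tools)) {
      for (const name of raw.tools) {
        if (typeof name === "string") referenced.add(name);
      }
    }
  }
  return registered.filter((name) => !referenced.has(name) && !notYetReferenced.has(name));
}

// ASSUMPTION: the registry holds exactly the eight names in velvet-auction's tools list.
const registeredTools = [
  "skill_check_emit",
  "skill_check_group_emit",
  "encounter_resolve",
  "context_recall",
  "goal_register",
  "foundry_lookup",
  "foundry_reward",
  "character_status",
];

const exampleSpecs: LoadedSpec[] = [
  { file: "velvet-auction.yaml", raw: { tools: registeredTools } },
  // Hypothetical single-player spec with a smaller tool set.
  { file: "solo.yaml", raw: { tools: ["skill_check_emit", "encounter_resolve"] } },
];

// With velvet-auction present and an empty allowlist, nothing is left unreferenced.
console.log(unreferencedTools(registeredTools, exampleSpecs, new Set<string>()));
```

Dropping the velvet-auction entry from `exampleSpecs` makes the helper report the tools that only that spec references, including the two names this commit removes from the allowlist.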