docs: group-encounters PRD + UX + architecture (finalized 2026-06-20)

PRD (46 FRs, 5 features), UX (DESIGN + EXPERIENCE), and architecture
(READY FOR IMPLEMENTATION) for the group-encounters feature set, plus
decision logs. Built via bmad-prd / bmad-ux / bmad-create-architecture
with party-mode validation.

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
2026-06-20 23:21:58 +00:00
parent b884a13d98
commit b6c7c87f60
15 changed files with 3930 additions and 0 deletions

View File

@@ -0,0 +1,19 @@
# Decision Log — Architecture: Group Encounters (2026-06-20)
Canonical memory and audit trail for this architecture run. Every decision,
change, and override is recorded here as the conversation unfolds.
## Decisions
| Date | Decision | Rationale | Status |
|------|----------|-----------|--------|
| 2026-06-20 | Opened architecture run (fresh) from finalized PRD + UX | Downstream architecture for the group-encounters feature set; epics/stories to follow | open |
| 2026-06-20 | Party-mode panel (Winston/Amelia/Murat/Mary) additions accepted into Project Context Analysis | Converged: Feature C is a singular→plural migration of pendingSkillCheck (6 gates + PENDING_ROLL_LIMIT per-player + sc_mod_modal id), not additive; scoreboard read-modify-write is a distinct race needing Redis compare-and-set (Lua/WATCH) + coalesced edits; FR-45 = sessionManager.atomicMutate via Lua, migrate pending writes; FR-43 = store discordId on PendingSkillCheck + validate in submitResult + fail-open if absent (name-match fragility); Feature A restart = resolve-by-deadline (persist startedAt), not blanket fail; restart sweep = four cases + boot barrier + idempotency + durable encounter:{threadId}:active flag; group checks need a straggler deadline (spec gap → flag to PM); Feature E conditions relay-blocked (new slice + interface field) — ship read RPC first, in parallel, with capability handshake; enrichment is multiplicative context-budget pressure coupled to historyTrim → L0/L1/L2 enrichment tier + per-player charctx budget + characterContextCache (5min); D+E adversarial → sequence as a pair; LLM-call-granularity (once-per-roll vs once-per-check) is a spec/arch gap; successRule evaluation stays in code (pure fn, narrator gets pre-computed verdict); inject timed/group outcomes as one system-observation shape; gate submitResult's scheduleEncounterLLMTurn on group completion + aggregate [GROUP CHECK RESULT] tag; character_status capability-escalation → whitelist + DM>LLM priority; per-variant filter registration locked to parser set (test); FR-43 is a breaking change to shipped skill_check_emit (deprecation/migration obligation); Redis key registry (SCAN not KEYS); CAP-17 before group specs; GraphMCP containment exception flagged (party lookup → updates AC6); test-architecture: 7 net-new harnesses incl. group-check-live-E2E decision (one-token impossibility); unknowns to list: max party N, concurrent campaigns, timed-roll latency budget. | decided |
| 2026-06-20 | Party-mode round 2 (Winston devil's-advocate / Sally UX / John PM) — simplifications + corrections accepted | Winston retracts over-engineering for single-process monolith: atomicMutate = in-process per-threadId mutex (NOT Lua/WATCH; keep abstraction + document swap path); scoreboard embed edit = final step inside locked section (drop separate coalescer/lag-invariant; keep only a rate-limit debounce); Feature A = restart=fail w/ pendingSkillCheck Redis TTL for zombie cleanup (drop resolve-by-deadline/startedAt); Feature C = TTL'd encounter:{threadId}:active piggybacking group-check lifecycle (drop separate durable flag) + boot barrier reaffirmed + sweep = two cases not four; Foundry = drop runtime capability handshake → build-time version coupling + CI pair-check + typed client + single Foundry-status→Mardonar-condition mapping seam; straggler-deadline retracted as spec field → DM "resolve with current rolls" UX affordance (reaffirm TTL expiry→timeout spec clarification). Reaffirmed: character_status whitelist/DM>LLM/key registry/SCAN/boot barrier. Sally — surface hierarchy flip: ephemeral = authoritative personal + accessible primary, scoreboard = shared visual summary (may lag, SR-noisy); ephemeral renders BEFORE scoreboard edit + in-world "tally catches up" text; state-aware "My Roll" persistent button (offer roll / show existing); P1-extended = Join on the LIVE scoreboard (pin lobby when Manage Messages, graceful-degrade); P3 failure-mode clause = edit failure never spawns a new public message (stale board > tornado); one ephemeral/player/check edited in place; group timer = one shared scoreboard timer + ephemerals snapshots + ONE urgency edit per ephemeral at final-sands threshold (cue in an announced field, not footer/image). John locks (no LLM-discretion): LLM-call granularity = once-per-check + templated per-roll non-LLM ack + single LLM call at resolution (per-roll O(N) rejected → O(1)/check); defaults MAX_PARTY_N=8, MAX_CONCURRENT_ENCOUNTERS=10 (env-configurable); latency p95 ≤8s single-roll, ≤15s group-resolution; FR-43 = deprecation window + SKILL_CHECK_SURFACE=v1|v2 flag (default v1 on intro, v2 one release after; FR-43 behind flag in Feature C epic, flip in follow-up release); CAP-17 conditional (only if group specs need schema changes) + scoped minimal (schema adds + existing-spec migration + encounter-wizard reconciliation; full refactor separate non-blocking epic); D+E paired (two stories one epic, characterContextCache critical path; optional D-preview behind flag at L0). PRD amendments queued: FR-47 (straggler deadline default + DM /encounter resolve force-evaluate, LLM never decides), FR-48 (per-roll templated ack + single LLM call per group-check), FR-49 (skill_check_emit deprecation window), FR-50 (SKILL_CHECK_SURFACE flag), NFR-3 (p95 ≤8s single-roll), NFR-4 (p95 ≤15s group-resolution), CONFIG defaults, CAP-17 scope, Architectural Assumptions section (7 items: max-party-N ceiling, LLM-call granularity, straggler-deadline mechanism, characterContextCache ownership, FR-43 flag evaluation point, FR-45 CAS failure mode, GraphMCP containment closed set). Sense-check flagged: straggler-deadline default value (John's "72h" is a misremember — PENDING_ROLL_LIMIT=5 msgs, session TTL=12h — pick a sensible default, not 72h). | decided |
| 2026-06-20 | Party-mode round 3 (Murat test / Mary analyst+contrarian / Paige docs) — open items closed; FR-43 process trimmed | Murat locks group-check testing = unit+integration-only w/ documented live-E2E gap (synthetic-Interaction "live" path is mislabeled integration; one-token constraint → no true multi-player live E2E; manual pre-release multi-player playtest checklist = safety net). Live-eligible: group-check init/scoreboard post, minPlayers=1 Roll-lock+ephemeral+successRule collapse, once-per-check LLM guard, TTL/restart boot sweep, templated per-roll ack. Unit+integration only: simultaneous multi-player fan-out, successRule N>1, per-user ephemeral delivery, second-claimant rejection. Mary: consent = Redis system of record + Foundry best-effort mirror + hourly reconciliation cron (zero new SPOF; removes a relay round-trip from E's start path; unblocks E). Relay conditions RPC = contract-first schema owned by bot team + ConditionsReader interface (StubConditionsReader from YAML fixture + RelayConditionsReader behind flag); relay maintainer builds to bot-owned schema; E-L2 sole relay-dependent tier → D + E-L1 ship independently of relay delivery. Enrichment tier order = L0 (today) → L1 (names+archetypes, zero external deps, highest value/effort) → L2 (full conditions, sole relay-dependent); D ships at L1; D-preview-at-L0 only if trivial. FR-43 CONTRARIAN (accepted over round-2 John): cut the SKILL_CHECK_SURFACE v1/v2 flag + two-milestone cutover (sized for public launch, not internal/shared-use) → ship v2 + break v1 in one release; old buttons emit a PLAIN system notice (not in-world narration — in-world voice is for narration, not migration messaging); DM announcement = pinned Discord message (not a process); CAP-17 wizard-reconciliation → non-blocking follow-up (encounter-builder-skill migration note stays as the real authoring-path doc). Mary holds: D+E pairing, defaults, once-per-check granularity, the conditions-reader stub. Paige: Documentation & Authoring Impact subsection — spec-authoring-guide new pitfalls (no dice in passiveReveals.revealText; successRule is a tool arg NOT a spec field; minPlayers/maxPlayers default-and-omit semantics; story-status never in spec prose; passiveReveals.threshold is a DC integer) + amplified existing (new tool names, action-named keys for passiveReveals, [GROUP CHECK RESULT] tag); wizard reconciliation = 2 new sub-skills (Players & Lobby, Passive Reveals) + edits to 2 existing, gated on schema-first, ~half-dayday, CAP-17 wizard item = named sub-task w/ own AC (not a footnote); tool-contract docs (skill_check_group_emit successRule enum + n/m + threshold args + [GROUP CHECK RESULT] authoritative; character_status whitelist + DM>LLM priority; character_spell_lookup opt-in per spec; system-observation shape = engine-injected context, not a tool); in-world voice = centralize system-string templates in one module (src/lib/systemStrings.ts) + named registry + forbidden-word grep on RENDERED output + rewrite "tally catches up shortly" (no temporal-backoff language); 8 doc-debts the architecture must specify (FR-43 deprecation mechanics, minPlayers/maxPlayers enforcement boundary, passiveReveals trigger timing, successRule n_of_m/sum-threshold exact semantics, L0/L1/L2 tier contents+escalation, story-status TTL reset behavior, specs-tools-consistency test as contract surface, wizard schema-first ordering). PRD amendments revised: FR-49/FR-50 RETIRED (Mary's trim); FR-47/FR-48/NFR-3/NFR-4/CONFIG defaults/CAP-17 scope/Architectural Assumptions stand. | decided |
| 2026-06-20 | Steps 36 saved (starter=foundation, core decisions, patterns, structure) + Step 7 validation via party-mode panel (Winston/Murat/Paige — all READY WITH MINOR GAPS, no critical blockers) | Gap resolutions folded: successRule Zod discriminated union + semantics (majority/all/n_of_m{n,m}/sum_threshold{t,of}); minPlayers/maxPlayers enforcement at Join+start+mid-run(continue, no auto-abort); to-resolve closed (OQ-7 boot permission check, no-Foundry-character silent skip, /character status set/clear/show, hourglass GIF in-repo+static fallback, OQ-8 open-rolls default, OQ-1 Foundry fields + conditions slice /dnd5e/get-actor-conditions); character_spell_lookup formally added (opt-in); ConditionsReader flag FOUNDRY_CONDITIONS_ENABLED; boot-sweep→A timeout finalize; character_status Zod StoryStatus{label,setAt,setter,expiresAt}; FR-45 lock ~2s + in-world ephemeral; story-status TTL DM/LLM re-set resets, new-encounter does NOT, expiry silent drop; NFR-3/NFR-4 perf harness tests/perf/latency.test.ts + playtest; enumerated test cases (FR-43 fail-open+second-claimant, FR-45 three contention modes, character_status priority-rejection); 7-step manual playtest checklist; interleaving harness = synthetic Interaction+ioredis-mock (NOT live); Foundry retry test targets writes; systemStrings.voice = 7th harness; GraphMCP closed-set clarified. 16/16 checklist [x]. | decided |
| 2026-06-20 | Architecture COMPLETE — status: complete, READY FOR IMPLEMENTATION, high confidence | All 8 steps done. Validation gate via party mode per user steer. PRD amendments still queued to route back (FR-47/48, NFR-3/4, CONFIG defaults, CAP-17 scope, Architectural Assumptions; FR-49/50 retired). Next: bmad-create-epics-and-stories. | final |
## Changes & Overrides
_(Recorded here as they occur.)_

View File

@@ -0,0 +1,574 @@
---
stepsCompleted: [1, 2, 3, 4, 5, 6, 7, 8]
lastStep: 8
status: complete
completedAt: 2026-06-20
inputDocuments:
- prd.md
- addendum.md
- DESIGN.md
- EXPERIENCE.md
- docs/architecture.md
- SPEC.md
- stack.md
- foundry-integration.md
- slash-commands.md
- encounter-spec-fields.md
- campaign-state.md
- voice-rules.md
workflowType: 'architecture'
project_name: 'Mardonar Encounter Engine — Group Encounters'
user_name: 'Kaysser'
date: '2026-06-20'
---
# Architecture Decision Document — Group Encounters & New Skill-Check Tools
_This document builds collaboratively through step-by-step discovery. Sections
are appended as we work through each architectural decision together. Source of
truth for decisions: the finalized PRD (`_bmad-output/prds/prd-mardonar-encounter-engine-2026-06-20/prd.md`)
+ UX (`_bmad-output/ux/ux-mardonar-encounter-engine-2026-06-20/`). Decisions are
mirrored in `.decision-log.md`._
## Locked decisions (from PRD/UX — not re-litigated here)
- **Tool factoring (OQ-2)** — Option C: extend `skill_check_emit` with optional
`durationSeconds`; dedicated `skill_check_group_emit`; one `character_status`
tool with `action: set|clear`.
- **Timers** — in-memory `setTimeout` + a Redis-backed **restart sweep**
(NFR-2/FR-44); restart finalizes expired pending checks as timeouts.
- **Single player-locked Roll (FR-43)** — retire Adv/Dis/Custom-Modifier buttons
on all skill-check surfaces incl. shipped `skill_check_emit`; adv/dis decided
upstream (LLM emit arg overrides → Foundry-derived → straight roll).
- **Lobby + story-status** — Redis-backed with TTLs (lobby ~30m idle, story
status ~24h); `minPlayers` default 1, optional `maxPlayers` cap.
- **Foundry** — passives (Perception/Investigation direct, others computed
`10+mod+prof`), equipment, conditions; engine-tracked mutable story status.
- **successRule** — `majority` (≥ceil(N/2)) / `all` / `n_of_m` / `sum-threshold`
(≥ t); default majority.
- **Manage Messages** — required at invite scope for FR-28/FR-29; safe
degradation if missing.
## To resolve in this design
- Exact Foundry fields/endpoints (passives, equipment, conditions,
advantage-granting) — OQ-1.
- Scoreboard stat-privacy — OQ-8 (open rolls default).
- Hourglass GIF asset hosting.
- `/character status` command arg shape.
- No-Foundry-character branch UX.
- Manage Messages re-invite plan — OQ-7.
- Central-embed edit coalescing strategy.
- `successRule` arg schema (Zod shape).
## Project Context Analysis
### Requirements Overview
**Functional Requirements (46 PRD FRs, 5 features + cross-cutting):**
- **A — Timed checks (FR-15):** extend `skill_check_emit` with `durationSeconds`; in-memory timer integrated into the existing `pendingSkillCheck` flow; timeout finalization feeds the LLM. **Restart=fail** with Redis-TTL cleanup of zombie pending state (drop resolve-by-deadline/`startedAt`).
- **B — Passive checks (FR-610):** new `EncounterSpec.passiveReveals` (`skill`, `threshold` DC int, `revealText` — no `visibility`); bot-applied at encounter start from Foundry passives (already fetched); deterministic, not LLM-triggered.
- **C — Group checks (FR-1117):** new `skill_check_group_emit` tool; **multi-player pending state — a singular→plural migration of `pendingSkillCheck`** (6 gate sites + per-player `PENDING_ROLL_LIMIT` + parameterized `sc_mod_modal:{checkId}`), not an additive extension; central scoreboard embed (edit = final step inside the locked section; rate-limit debounce only) + per-player ephemeral roll views; deterministic `successRule` (`majority`≥ceil(N/2) / `all` / `n_of_m`(n,m) / `sum-threshold`≥t) evaluated **in code**; new `groupcheck:{threadId}` Redis state; **once-per-check LLM call** + templated per-roll non-LLM ack.
- **D — Lobby + message regulation (FR-1830):** new `minPlayers` (default 1) / optional `maxPlayers` schema fields; `lobby:{threadId}` Redis state; lobby embed + Join/Leave/Start/Cancel via the global `interactionCreate` customId router; non-joined delete+guide (Manage Messages, graceful-degrade); `/encounter join`; suppress `[SESSION] entered` for group encounters. **= CAP-12/13 at the data-model layer.**
- **E — Enriched character context (FR-3135):** passives/equipment/spells already in `FoundryActorDetails` (inject w/ budget caps); **conditions relay-blocked** (new slice + interface field); engine-tracked mutable story status (DM command + `character_status` LLM tool, Redis TTL ~24h, whitelist + DM>LLM priority).
- **Cross-cutting (FR-3646):** tool-registry additions; `filterLLMResponse` routing for new result paths (per-variant registration locked to parser set via test); schema + authoring-guide + specs-tools consistency; GraphMCP events; customId prefixes; Manage Messages; **single player-locked Roll (FR-43)** — store `discordId` on `PendingSkillCheck` + validate in `submitResult` + fail-open if absent + state-aware "My Roll" button; **group-check Redis + restart sweep (FR-44)**; **atomic roll registration (FR-45)**.
**Non-Functional Requirements:** NFR-1 coalesced/debounced embed edits · NFR-2 Redis-backed lobby/story-status/group-check + **restart sweep** (two cases, boot barrier) · NFR-3 all new paths through `toolDispatcher`+`filterLLMResponse` · NFR-4 backward compat (FR-43 button change intentional, ship-and-break in one release) · NFR-5 ephemeral ~15min window / central-embed fallback · NFR-6 GraphMCP + `src/lib/logger.ts` · NFR-7 Manage Messages w/ safe degradation. **+ p95 ≤8s single-roll, ≤15s group-resolution** (queued PRD NFRs).
**Scale & Complexity:** backend, single-process Node monolith, Discord bot, LLM-driven, Redis + GraphMCP + Foundry relay. **Medium-high** (real-time multi-player, concurrency, timers, platform constraints, LLM-contract integrity, relay extension, schema evolution) — not enterprise. Components touched: ~10.
### Technical Constraints & Dependencies
- discord.js v14: global `interactionCreate` customId-prefix routing; buttons have **no player binding today**; ephemeral ~15min token window; no live-broadcast ephemerals; embed edit rate limits (~5/5s route).
- **`sessionManager.update` is a non-atomic read-modify-write** (sessionManager.ts:25-30) — safe today only because no `await` separates read from write; Feature C breaks that → `sessionManager.atomicMutate` (in-process per-threadId mutex, NOT Lua/WATCH; lock wraps full read→decide→write→persist, per-threadId not global; document Lua swap path for future multi-instance).
- **`pendingSkillCheck` singular across 6 gates** + `PENDING_ROLL_LIMIT` session-wide counter + `sc_mod_modal` singleton → Feature C is a **migration**.
- No native function calling — prompt-based fenced `tool_call` (Gemma, unreliable); tool manifest + `filterLLMResponse` discipline mandatory.
- 128k context budget (4k system/2k pinned/118k sliding/3.5k safety) — enriched charctx is **multiplicative** under group checks; `historyTrim` couples to live charctx size.
- No persistent scheduler — in-memory `setTimeout` only; **restart sweep + boot barrier** (synchronous Redis-reachable + single SCAN-republish before accepting interactions).
- No Foundry cache at prompt-build time → `characterContextCache` (keyed by actorUuid, ~5min TTL, background-refreshed).
- Foundry relay: conditions + advantage-as-state **not exposed** (advantage is a roll-time opt on `rollSkillCheck`/etc.); **conditions = new relay slice + interface field**; ship read RPC first, contract-first schema owned by the bot team, `ConditionsReader` stub+real behind a flag, build-time version coupling + CI pair-check (no runtime handshake); single Foundry-status→Mardonar-condition mapping seam file.
- **Consent storage = Redis system of record, Foundry best-effort mirror + hourly reconciliation cron** (zero new SPOF; removes a relay round-trip from E's start path).
- Live Discord serialized (one token = one gateway session).
### Cross-Cutting Concerns
- **Concurrency & locking** — `atomicMutate` (in-process per-threadId mutex); scoreboard edit = final step inside locked section (rate-limit debounce only); FR-43 `discordId`-on-`PendingSkillCheck` + `submitResult` validation + fail-open + state-aware "My Roll" button; `character_status` whitelist + DM>LLM priority + store setter.
- **Restart-recovery** — Feature A restart=fail w/ TTL cleanup; Feature C TTL'd `encounter:{threadId}:active` piggybacking group-check lifecycle + boot barrier + two-case sweep (present→republish, absent→nothing); token-TTL recovery-class split (durable re-hydrate vs in-flight unrecoverable >15min); fix existing silent-active-session-on-restart bug; `/join` restores from `party[]`. **Straggler deadline = DM "resolve with current rolls" UX affordance** (not a new spec field); TTL expiry→timeout resolution + LLM narration must be named in the spec.
- **Context budget** — per-player charctx token budget; spells = name+level+school (full mechanics via on-demand `character_spell_lookup` tool), equipment = names+active bonuses, prepared-only; **L0/L1/L2 enrichment tier** (L0 spec-only, L1 names+archetypes, L2 full conditions); `characterContextCache`; **D+E adversarial → paired** (two stories one epic, `characterContextCache` critical path; optional D-preview behind flag at L0).
- **LLM-contract integrity** — `successRule` in code (pure fn; narrator gets pre-computed verdict); timed/group outcomes injected as **one system-observation shape**; per-variant filter registration locked to parser set (test); **once-per-check LLM call** + templated per-roll ack (per-roll O(N) rejected → O(1)/check); aggregate `[GROUP CHECK RESULT]` tag + extend `rollResultRecent` exemption; gate `submitResult`'s `scheduleEncounterLLMTurn` on group completion; routing assertions `filterLLMResponse` in every new narrator path; per-path trailing-system spec; in-world-voice grep test on rendered system-string templates.
- **UX surface hierarchy (Sally)** — ephemeral = authoritative personal + accessible primary; scoreboard = shared visual summary (may lag, SR-noisy, never the only path to one's own result or to joining). Ephemeral renders before scoreboard edit; P3 failure-mode clause (edit failure never spawns a new public message — stale board > tornado); P1-extended (Join on the live scoreboard; pin lobby when Manage Messages; `/encounter join` = keyboard fallback); group timer = one shared scoreboard timer + ephemerals snapshots + ONE urgency edit per ephemeral at final-sands (cue in an announced field).
- **CAP alignment** — lobby/group = CAP-12/13 data-model layer; reconcile campaign party vs lobby `maxPlayers`; **CAP-17 conditional** (only if group specs need schema the old format can't express) + scoped minimal (schema adds + existing-spec migration + encounter-wizard reconciliation as a **named sub-task w/ own AC**; full refactor separate non-blocking epic); **GraphMCP containment exception = closed set** (updates AC6); Redis key registry (SCAN not KEYS).
### Architectural Assumptions (state explicitly; first epic validates each)
1. Max party N ceiling = **8**. 2. LLM-call granularity = **once-per-check**. 3. successRule straggler-deadline mechanism (where the timer lives + DM force-resolve interaction). 4. `characterContextCache` ownership (E-only vs platform — D reads too?). 5. FR-43 surface ship-and-break in one release (no flag) — evaluation point = registration-time (one surface compiled in) since no coexistence. 6. FR-45 mutex contention failure mode (user-facing behavior — retry w/ backoff / in-world "the moment slips away, try again"). 7. GraphMCP containment exception boundary (closed set).
### Standing Constraints / Config
- **Breaking-change surface inventory** — any shipped-surface change = a migration event; for internal/shared-use, ship-and-break in one release + changelog + pinned Discord message + builder-skill migration note (no flag, no in-world deprecation narration).
- **LLM cost envelope** — O(1)/check (chosen) vs O(N)/roll.
- **Discord UX latency thresholds** — p95 ≤8s single-roll, ≤15s group-resolution.
- **Config defaults** — `MAX_PARTY_N=8`, `MAX_CONCURRENT_ENCOUNTERS=10` (env-configurable).
- Spec schema as load-bearing dependency (wizard reconciliation, schema-first ordering).
### Test Architecture (Murat, round 3)
**Group checks = unit + integration coverage only; no live E2E tier.** One-token constraint makes true multi-player live E2E impossible without a synthetic-Interaction forge (integration, not live — rejected as mislabeled) or a second bot token (violates the constraint). **Live-eligible:** group-check init/scoreboard post; `minPlayers=1` Roll-lock + ephemeral + successRule collapse; once-per-check LLM guard; TTL/restart boot sweep; templated per-roll ack. **Unit+integration only (documented gap):** simultaneous multi-player fan-out; successRule N>1; per-user ephemeral delivery; second-claimant rejection. **Residual risk accepted:** real Discord fan-out, gateway event ordering, ephemeral-in-thread quirks, burst rate-limiting — mitigated by deterministic core fully unit/integration covered + Discord surface shared with existing single-player live ACs + a **manual pre-release multi-player playtest checklist item** in the release gate. Net-new harnesses: interleaving/interaction-replay; boot-barrier + idempotent sweep (Redis fixtures + clock injection); prompt-budget gate (fat payloads); `filterLLMResponse` routing assertions; Foundry capability/version pair-check + non-idempotent retry test; rate-limit fixture for embed coalescing.
### Documentation & Authoring Impact (Paige, round 3)
**spec-authoring-guide new pitfalls:** no dice in `passiveReveals.revealText`; `successRule` is a tool arg NOT a spec field; `minPlayers`/`maxPlayers` default-and-omit semantics; story-status never in spec prose; `passiveReveals.threshold` is a DC integer. **Amplified:** new registered tool names (`skill_check_group_emit`, `character_status`, `character_spell_lookup`); action-named keys for passiveReveals; `[GROUP CHECK RESULT]` engine-emitted tag. **Wizard:** 2 new sub-skills (Players & Lobby, Passive Reveals) + edits to 2 existing, **schema MUST land before wizard reconciliation**, ~half-dayday, CAP-17 wizard item = named sub-task w/ own AC. **Tool-contract docs:** `skill_check_group_emit` (`successRule` enum + n/m + threshold args, semantics in description, `[GROUP CHECK RESULT]` authoritative); `character_status` (whitelist + DM>LLM priority); `character_spell_lookup` (opt-in per spec); system-observation shape = engine-injected context, not a tool. **In-world voice:** centralize system-string templates in one module (`src/lib/systemStrings.ts`) + named registry + forbidden-word grep on **rendered** output (catches runtime concatenation); rewrite "tally catches up shortly" (no temporal-backoff language — "the moment holds"). **Doc-debts the architecture must specify:** FR-43 ship-and-break mechanics; `minPlayers`/`maxPlayers` enforcement boundary (lobby vs start vs mid-run drop); `passiveReveals` trigger timing; `successRule` n_of_m/sum-threshold exact semantics; L0/L1/L2 tier contents + escalation; story-status TTL reset behavior; specs-tools-consistency test as contract surface; wizard schema-first ordering.
### Open Items Closed This Round
- **Consent storage** → Redis (system of record) + Foundry (mirror + hourly cron).
- **Relay conditions RPC** → contract-first bot-owned schema + `ConditionsReader` stub/real behind a flag; E-L2 sole relay-dependent tier.
- **Enrichment tier order** → L0 → L1 → L2; D ships at L1.
- **FR-43 process** → ship-and-break in one release, plain system notice, no flag (Mary's trim, accepted).
- **Group-check live E2E** → unit+integration-only + manual playtest (Murat).
### PRD Amendments Queued (to route back after architecture)
FR-47 (straggler deadline: default + DM `/encounter resolve` force-evaluate, LLM never decides) · FR-48 (per-roll templated ack + single LLM call per group-check) · NFR-3 (p95 ≤8s single-roll) · NFR-4 (p95 ≤15s group-resolution) · CONFIG defaults (`MAX_PARTY_N=8`, `MAX_CONCURRENT_ENCOUNTERS=10`) · CAP-17 scope (minimal + wizard sub-task) · Architectural Assumptions section (7 items). **FR-49/FR-50 RETIRED** (Mary's trim — no deprecation window/flag). Sense-check: straggler-deadline default value (not 72h).
## Starter Template Evaluation
### Primary Technology Domain
Backend — single-process Node ESM monolith, Discord bot, LLM-driven. **Brownfield**:
the engine is substantially built (Phases 14 complete); this architecture **extends
the existing codebase**. No greenfield starter template is evaluated — the existing
engine *is* the foundation.
### Existing Foundation (the "starter" = current codebase)
Exact installed versions (`package.json`):
- **Runtime:** Node 22 ESM (`"type": "module"`, `@types/node ^22`), TypeScript
`^5.8` strict, `tsx ^4.19` (dev), `tsc``node dist/`.
- **Discord:** `discord.js ^14.18.0`, `@discordjs/builders ^1.10.0`,
`@discordjs/rest ^2.4.0`.
- **State/cache:** `ioredis ^5.4.0` + **`ioredis-mock ^8.9.0`** (dev — enables the
Redis-fixture unit tests Murat's test architecture demands: `atomicMutate`,
restart sweep, `successRule`).
- **LLM:** `openai ^6.39.0` (LiteLLM OpenAI-compatible) + `ollama ^0.5.0`
(fallback, `gemma4-it:e2b`); **`gpt-tokenizer ^2.8.0`** (token counting → the
128k context-budget gate test).
- **Spec/validation:** `zod ^3.24.0`, `js-yaml ^4.1.0`.
- **Config:** `dotenv ^16.4.0`.
- **Tests:** `vitest ^3.1.0` + `@vitest/coverage-v8 ^3.2.6`; gates
`RUN_FULL_E2E`/`RUN_GRAPHMCP_LIVE`/`RUN_FOUNDRY_LIVE`.
- **Logging:** `src/lib/logger.ts` (custom plaintext; pino retired).
### Selected Starter: the existing engine (extend in place)
**Rationale:** brownfield — the new features layer onto existing modules
(`spec/loader`, `harness/tools`, `bot/embeds`, `bot/handlers`, `session`, `vtt`,
`db/redis`, `graphmcp`, `promptBuilder`). No project re-init, no starter CLI.
`ioredis-mock` + `gpt-tokenizer` are already present, covering the round-3
test-architecture demands (Redis fixtures + budget gate) with zero new deps.
**Architectural patterns inherited (not re-decided):** ESM · Zod-as-type-source
(`EncounterSpec = z.infer`) · tool-plugin registry (`registerTool` + side-effect
imports) · global `interactionCreate` customId-prefix routing ·
`session:{threadId}` Redis-TTL model · prompt-based fenced `tool_call` (no native
function-calling) · `filterLLMResponse` last-line defense · in-world voice.
**First implementation story is NOT project init** — it's the **cross-cutting
schema additions** (CAP-17 minimal: `minPlayers`/`maxPlayers`/`passiveReveals` in
`EncounterSpecSchema` + authoring-guide + specs-tools-consistency test), which
unblock Features AE. **Schema MUST land before wizard reconciliation** (Paige's
ordering constraint).
## Core Architectural Decisions
### Already Decided (from party rounds + PRD + existing foundation — not re-litigated)
- **Data:** Redis (`ioredis 5.4` + `ioredis-mock` for tests); keyspace registry
(`session`/`groupcheck`/`lobby`/`encounter:active`/`character_status`/`character:{guildId}`/`campaign`);
Zod schema (`EncounterSpec = z.infer`); `characterContextCache` (5min TTL,
background refresh); CAP-17 minimal schema migration.
- **Concurrency:** `sessionManager.atomicMutate` (in-process per-threadId mutex,
not Lua); scoreboard edit = final step inside lock + rate-limit debounce; FR-43
`discordId`-on-`PendingSkillCheck` + `submitResult` validation + fail-open +
state-aware "My Roll" button.
- **Restart:** restart=fail (A) + TTL cleanup; two-case boot sweep + barrier;
TTL'd `encounter:{threadId}:active`.
- **LLM-contract:** once-per-check + templated per-roll ack; `successRule` in code
(pure fn); one system-observation shape; `[GROUP CHECK RESULT]` tag;
`filterLLMResponse` routing; per-variant filter registration locked to parser
set (test).
- **Foundry:** contract-first bot-owned schema + `ConditionsReader` stub/real
behind flag; build-time version coupling + CI pair-check; single
Foundry-status→Mardonar-condition mapping seam; consent in Redis + Foundry
mirror.
- **Enrichment:** L0→L1→L2; D ships at L1; D+E paired.
- **FR-43:** ship-and-break in one release, no flag, plain system notice.
- **Testing:** unit+integration only for group checks + manual playtest; net-new
harnesses (interleaving, boot-sweep, budget-gate, routing assertions, relay
pair-check, rate-limit fixture).
- **Config/NFRs:** `MAX_PARTY_N=8`, `MAX_CONCURRENT_ENCOUNTERS=10`; p95 ≤8s
single-roll, ≤15s group-resolution.
### Remaining Decisions Resolved This Step
- **A. successRule straggler-deadline mechanism** — `groupcheck:{threadId}` carries
a TTL = max check lifetime (**default 10 min**, not 72h); FR-16's **60s no-show
grace** finalizes when all-but-one have rolled; **DM `/encounter resolve`**
force-evaluates early; the LLM never decides the deadline. Restart sweep covers
the TTL.
- **B. `characterContextCache` ownership** — **platform-scoped module**
(`src/harness/characterContext.ts`); **E-L2 populates from Foundry**; **L1 is
spec-sourced** (names+archetypes, no Foundry fetch, no cache hit); D consumes the
module but D-L1 doesn't touch Foundry. Platform infra, E-owned population,
Foundry caching L2-only.
- **C. FR-45 mutex contention failure mode** — losing click rejected with an
**in-world ephemeral** *"the moment slips away — try again"* (no silent retry,
no raw error); same-player double-click = **idempotent no-op reject**;
non-target spam = **rate-limited** ephemeral.
- **D. `character_status` whitelist** — LLM-settable (enum):
`{wounded, inspired, hidden, exhausted, sick, cursed, disguised, frightened}`;
**DM may set any key** (free-form); **DM > LLM** priority (LLM call is a
no-op/logged rejection on conflict with an active DM status); TTL ~24h.
- **E. GraphMCP containment closed set** — **unchanged** `{encounter.ts,
encounters.ts}` for direct reads; group-check/lobby/status events are **writes
via `graphmcp/ingest.ts`** (existing path); **party/roster lives in Redis, not
GraphMCP** → no new direct GraphMCP read caller. AC6 containment test stays green.
- **F. Consent grant model** — **DM grants consent** (stored Redis), gating Foundry
writes (CAP-16) + L2 enrichment reads; bot-mediated writes (Redis first, Foundry
mirror fire-and-forget); player may revoke. DM-only grant keeps it simple for a
D&D table.
### Decision Impact Analysis
- **Implementation sequence:** CAP-17 schema additions → `atomicMutate` + keyspace
→ Feature A (timed) + FR-43 single-Roll (ship-and-break) → Feature C group-check
(migration) + `characterContextCache` (platform) → Feature D lobby (CAP-12/13)
+ Feature E-L1 (paired) → Feature E-L2 (relay conditions RPC, long-lead, parallel).
- **Cross-component dependencies:** schema→wizard (schema-first); `atomicMutate`→C/D;
`characterContextCache`→E (and D-L2); `ConditionsReader` stub→E-L2 cutover;
`filterLLMResponse` per-variant test→all new narrator paths.
## Implementation Patterns & Consistency Rules
### Naming Patterns
- **Files:** camelCase `.ts` (`messageRouter.ts`, `skillCheckEmit.ts`); commands
lowercase. New: `src/harness/tools/{skillCheckGroupEmit,characterStatus,
characterSpellLookup}.ts`; `src/bot/embeds/{lobby,groupScoreboard,timedCheck,
passiveReveal,statusConfirmation}.ts`; `src/harness/characterContext.ts`;
`src/lib/systemStrings.ts`.
- **Code:** camelCase functions, PascalCase types (`PendingGroupCheck`,
`LobbyState`, `ConditionsReader`), UPPER_SNAKE constants
(`EMBED_COLOR.GATHERING`, `GROUPCHECK_TTL_MS`, `MAX_PARTY_N`).
- **customId prefixes** (snake_case, global `interactionCreate` router):
existing `sc_*`/`encounters_search_btn`; new `lobby_join`/`lobby_leave`/
`lobby_start`/`lobby_cancel`/`grp_roll`. **No player id in customId** (validate
against session). **Parameterize modal ids** `sc_mod_modal:{checkId}` (singleton
was a bug). Max 100 chars.
- **Redis keys** (colon-delimited, TTL'd, `SCAN` not `KEYS`): `session:{threadId}`,
`guild_threads:{guildId}`, `characters:{guildId}`, `groupcheck:{threadId}`,
`lobby:{threadId}`, `encounter:{threadId}:active`,
`character_status:{guildId}:{discordId}`, `campaign:{campaignId}` — all in the
key registry (prefix, owner, TTL, sweep behavior).
- **System tags** (bracketed uppercase, role `system`, engine-emitted, never
hand-typed): `[TOOL]`, `[SKILL CHECK RESULT]`, `[GROUP CHECK RESULT]` (new),
`[FILTER CORRECTION]`, `[NO RESPONSE]`, `[SESSION]`.
### Structure Patterns
- **Tests:** `tests/unit/<module>.test.ts` + `tests/integration/` + live AC
suite; **`ioredis-mock`** for Redis-touching unit tests; `gpt-tokenizer` for
budget tests. New: `groupCheck`, `lobby`, `storyStatus`, `characterContext`,
`atomicMutate`, `restartSweep`, `systemStrings.voice` (forbidden-word grep),
`promptBudget`.
- **Tools:** `src/harness/tools/<tool>.ts` self-registers via `registerTool` +
side-effect import in `tools/index.ts`; active set via `spec.tools`;
`specsToolsConsistency.test.ts` locks parity.
- **Embeds:** `src/bot/embeds/<surface>.ts` exporting `build<Name>Embed` +
`EMBED_COLOR` constants; one embed per surface, edited in place.
- **Character context:** `src/harness/characterContext.ts` (platform module) +
`ConditionsReader` interface (`StubConditionsReader`/`RelayConditionsReader`).
- **System strings:** `src/lib/systemStrings.ts` (centralized templates + named
registry).
### Format Patterns
- **LLM tool contract:** fenced ` ```tool_call` JSON;
`DispatchResult = {systemMessage, error?, resolved?}`; results appended as
`system` messages; new tools follow the exact `ToolPlugin` shape.
- **Embed status colors:** `EMBED_COLOR` (pending/success/failure/gathering/
urgent/notice/neutral) — **every color paired with emoji + text label** (a11y).
- **Timer:** 10s-increment text + ~10s GIF below 10s + text cue "Final sands";
never per-second.
- **Results:** `[GROUP CHECK RESULT]` aggregate + per-player breakdown; in-world
ephemeral on contention loss (*"the moment slips away — try again"*).
### Communication Patterns
- **Events (GraphMCP):** existing `setnx`+`xadd`+`expire` ingest pattern; new
event types via `graphmcp/ingest.ts` (**write path — no new direct GraphMCP
read caller**; closed set stays `{encounter.ts, encounters.ts}`).
- **State updates:** **all `SessionState` mutations through
`sessionManager.atomicMutate(threadId, mutator)`** (in-process per-threadId
mutex); never raw `sessionManager.update` for pending/players/group-check
fields.
- **Logging:** `src/lib/logger.ts``log.info('namespace', 'event', {…})`
plaintext stdout; no pino.
### Process Patterns
- **Error handling:** tool throws → `dispatchTool` catches → `[TOOL ERROR]`;
`filterLLMResponse` last-line defense (never `thread.send(raw)`); graceful
in-world degradation on Foundry/relay failure.
- **Validation:** Zod at `loadSpec` (schema = type source); responseFilter for
LLM output; per-variant filter registration locked to parser set (test).
- **Restart:** boot barrier (Redis-reachable + SCAN-republish before gateway
ready); two-case sweep; idempotent across double-restart.
- **Enrichment escalation:** L0 → L1 (spec-sourced) → L2 (Foundry conditions,
relay-gated); `characterContextCache` L2-only.
### Enforcement Guidelines
**All agents MUST:** register new tools via `registerTool` + `tools/index.ts`
import · route every new narrator path through `filterLLMResponse` · mutate
SessionState only via `atomicMutate` · add new Redis keys to the registry (SCAN,
TTL) · pair every status color with emoji+text · centralize player-facing
strings in `systemStrings.ts` · add a GraphMCP event for every new mechanic ·
keep GraphMCP direct reads in the closed set.
**Verification:** `specsToolsConsistency.test.ts` (tools↔registry) · per-variant
filter registration test (parser↔filter) · `systemStrings.voice.test.ts`
(forbidden-word grep on rendered output) · `promptBudget.test.ts` (token gate)
· AC6 (GraphMCP containment) · boot-sweep + atomicMutate tests.
**Anti-patterns:** raw `sessionManager.update` for pending/players · per-roll
LLM calls · customId-embedded player ids · per-second timer edits · new public
messages on embed-edit failure · runtime Foundry capability handshake ·
LLM-discretion on successRule/deadline/granularity · utility words in
player-facing strings.
## Project Structure & Boundaries
### Complete Project Directory Structure (existing + new )
```
mardonar-npcs/
├── package.json, tsconfig.json, vitest.config.ts, Dockerfile, docker-compose.dev.yml, .env(.example), README.md
├── index.ts / promptBuilder.ts # ⚠️ root dupes of src/ — verify/remove (cleanup candidate)
├── scripts/ # lore-options, pull-specs.mjs, validate-spec, list-tools
├── specs/ # spec YAML corpus + SPEC_FORMAT.md (SPECS_DIR)
├── src/
│ ├── bot/
│ │ ├── index.ts # gateway + global interactionCreate router ( lobby_*/grp_* prefixes)
│ │ ├── commands/ # /encounter join (CAP-13), /character status (story-status)
│ │ ├── embeds/ # skillCheck, encounterDiscovery, loreAnswer, playerGate, resolution
│ │ │ └── lobby, groupScoreboard, timedCheck, passiveReveal, statusConfirmation
│ │ ├── handlers/ # messageRouter, rollHandler, responseFilter, generationQueue, queueCap, reactionManager, mentionHandler
│ │ │ └── lobbyHandler (interaction routing), messageRegulation (delete+guide), restartSweep (boot)
│ │ └── lib/welcomeDM.ts
│ ├── harness/
│ │ ├── promptBuilder, contextAssembler, llmClient, litellmClient, ollamaClient, llmMessages, toolParser, toolRegistry, toolDispatcher
│ │ ├── characterContext.ts # platform module + ConditionsReader (stub/real)
│ │ └── tools/ # contextRecall, encounterResolve, foundryLookup, foundryReward, goalRegister, skillCheckEmit, index.ts
│ │ └── skillCheckGroupEmit, characterStatus, characterSpellLookup
│ ├── session/ # sessionManager ( atomicMutate), characterRegistry, playerRegistry, encounterLog, xpAwarder
│ ├── db/redis.ts
│ ├── graphmcp/ # client, ingest, loreResolver, vocabularyResolver (direct reads from encounter.ts/encounters.ts only)
│ ├── vtt/ # foundryClient ( conditions slice + mapping seam), relaySession
│ ├── spec/loader.ts # minPlayers/maxPlayers/passiveReveals
│ ├── persona/loader.ts
│ ├── lib/ # logger, historyTrim ( couples to charctx), systemStrings
│ ├── types/index.ts # PendingGroupCheck, LobbyState, GroupCheckResult, CharacterContextEntry, ConditionsReader
│ └── config.ts # MAX_PARTY_N, MAX_CONCURRENT_ENCOUNTERS, GROUPCHECK_TTL_MS
└── tests/
├── unit/ # existing + groupCheck, lobby, storyStatus, characterContext, atomicMutate, restartSweep, systemStrings.voice, promptBudget, conditionsReader
├── integration/ + graphmcp/ # existing + group-check integration (ConditionsReader stub), boot-sweep
└── fixtures/spec.ts # group-encounter fixtures
```
### Architectural Boundaries
- **Tool** — `src/harness/tools/*` = LLM-callable `ToolPlugin`s; registered via `toolRegistry`+`index.ts`; active per `spec.tools`; `toolDispatcher``DispatchResult`→system message.
- **Embed** — `src/bot/embeds/*` = pure builders (no Discord I/O); called by handlers/tools.
- **Handler** — `src/bot/handlers/*` = Discord I/O + routing; global `interactionCreate` dispatches by customId prefix.
- **Session** — `src/session/*` = Redis state; all `SessionState` mutation via `sessionManager.atomicMutate`.
- **LLM-contract** — `promptBuilder`/`contextAssembler`/`toolDispatcher`/`toolParser` + `responseFilter` = the LLM interface; `filterLLMResponse` is last-line defense for every narrator path.
- **Foundry** — `src/vtt/foundryClient.ts` = the relay seam; conditions via new slice + single status→condition mapping; build-time version coupling.
- **GraphMCP** — direct reads ONLY from `commands/{encounter,encounters}.ts` (closed set); writes via `graphmcp/ingest.ts`; AC6 enforces.
- **Character context** — `src/harness/characterContext.ts` = platform module; L1 spec-sourced, L2 Foundry; `characterContextCache` L2-only.
### Requirements → Structure Mapping
- **A Timed (FR-15):** `skillCheckEmit` (durationSeconds) + `embeds/timedCheck` + `rollHandler` (timeout finalize) + `atomicMutate` (pending TTL).
- **B Passive (FR-610):** `spec/loader` (passiveReveals) + `embeds/passiveReveal` + `characterContext` (passives) + encounter-start hook.
- **C Group (FR-1117):** `tools/skillCheckGroupEmit` + `embeds/groupScoreboard` + `rollHandler` (grp_roll + atomic registration) + `sessionManager` (groupcheck) + `toolDispatcher` (aggregate `[GROUP CHECK RESULT]`).
- **D Lobby (FR-1830):** `spec/loader` (minPlayers/maxPlayers) + `embeds/lobby` + `lobbyHandler` + `messageRegulation` + `commands/encounter` (join) + `sessionManager` (lobby).
- **E Enriched (FR-3135):** `vtt/foundryClient` (conditions slice) + `characterContext` + `tools/characterStatus` + `commands/character` (status) + `sessionManager` (character_status).
- **Cross-cutting (FR-3646):** `toolRegistry`/`index.ts` + `responseFilter` + `toolDispatcher` (manifest) + `specsToolsConsistency.test` + `graphmcp/ingest` + `bot/index.ts` (customId prefixes) + `lib/systemStrings` + `config.ts`.
### Integration Points
- **Internal:** tool → `dispatchTool` → system message → `scheduleEncounterLLMTurn`; button → `interactionCreate` → handler → `atomicMutate` → embed edit; restart → boot barrier → SCAN sweep.
- **External:** Foundry relay (conditions read RPC, contract-first); GraphMCP (writes); Discord (embeds/ephemerals/buttons).
- **Data flow:** message → `messageRouter``contextAssembler` (reads `characterContextCache`) → `promptBuilder``llmClient``toolParser``responseFilter``toolDispatcher` → embed + system message → `graphmcp/ingest`.
- **Assets:** hourglass GIF — host in-repo `assets/` or a URL; referenced from `embeds/timedCheck.ts`.
## Architecture Validation Results
### Coherence Validation ✅
**Decision Compatibility:** all locked decisions are mutually consistent — the in-process
per-threadId mutex is consistent with once-per-check LLM granularity (the LLM call is
gated on group completion inside the locked section) and with the two-case boot sweep
(no cross-process state to recover); FR-43 ship-and-break is orthogonal to
`specsToolsConsistency.test` (it retires buttons, not the `skill_check_emit` tool
registration); consent-in-Redis + Foundry mirror cleanly removes the relay round-trip
from E's start path; the `ConditionsReader` stub correctly decouples D + E-L1 from relay
delivery.
**Pattern Consistency:** naming/structure/communication/process patterns align with the
decisions and the existing codebase conventions.
**Structure Alignment:** the project structure supports all decisions; boundaries
respected (GraphMCP closed set, tool/embed/handler/session seams).
### Requirements Coverage Validation ✅
All 46 PRD FRs (A 15, B 610, C 1117, D 1830, E 3135, cross-cutting 3646) have a
home in the Requirements→Structure Mapping; no FR is orphaned. NFRs addressed: NFR-1
debounced edits · NFR-2 Redis-backed + restart sweep · NFR-3 `filterLLMResponse` routing ·
NFR-4 backward compat (FR-43 ship-and-break) · NFR-5 ephemeral window/fallback · NFR-6
GraphMCP+logger · NFR-7 Manage Messages degrade · p95 ≤8s/≤15s (perf harness + playtest).
### Implementation Readiness Validation ✅
**Decision Completeness:** all critical decisions documented with versions; the
`successRule` contract, enforcement boundary, and remaining "to-resolve" items are closed
below. **Structure Completeness:** complete directory tree with new files marked.
**Pattern Completeness:** conflict points addressed; enforcement + verification named.
### Gap Analysis Results (validation panel: Winston/Murat/Paige — all READY WITH MINOR GAPS)
**Critical:** none.
**Important (closed below):** `successRule` n_of_m/sum-threshold semantics ·
`minPlayers`/`maxPlayers` enforcement boundary · unresolved to-resolve items (OQ-7,
no-Foundry-character, `/character status` shape, GIF hosting, OQ-8, OQ-1) ·
`character_spell_lookup` formal decision · under-specified items (ConditionsReader flag,
boot-sweep→A mapping, `character_status` Zod shape, FR-45 lock timeout, story-status TTL
reset) · NFR-3/NFR-4 latency harness · enumerated test cases for locked decisions · manual
playtest checklist content.
**Minor (closed below):** GraphMCP closed-set clarification · story-status TTL expiry ·
harness-count consistency.
### Validation Issues Addressed (resolutions folded into the architecture)
**1. `successRule` Contract (closes Paige #1 / Winston #3 / Murat #3)**
```ts
successRule: z.union([
z.object({ kind: z.literal('majority') }),
z.object({ kind: z.literal('all') }),
z.object({ kind: z.literal('n_of_m'), n: z.number().int().min(1), m: z.number().int().min(1) }),
z.object({ kind: z.literal('sum_threshold'), t: z.number().int(), of: z.enum(['roll','total']) }),
]).default({ kind: 'majority' });
```
Semantics — `majority`: `count(successes) ≥ ceil(N/2)`, N = rolled count. `all`: every
roller succeeds. `n_of_m`: `count(successes) ≥ n` of `m` rolled; **`m` must equal the
targeted roller count N at emit** (reject the tool call if `m > N`; `n ≥ 1`). `sum_threshold`:
`Σ(values) ≥ t` where `of` names the unit — `roll` = raw d20 face, `total` = d20+modifier.
Unrolled players at finalization count as failures (not zero). Default `majority`. The
pure-fn evaluator, the tool-description string, the unit tests, and the authoring-guide
pitfall all bind to this definition.
**2. `minPlayers`/`maxPlayers` enforcement boundary (closes Paige #2 / Winston #4 / Murat #3)**
Enforced at three points: (a) **lobby Join gate** — Join disabled at `maxPlayers` cap;
overflow = in-world ephemeral reject. (b) **`/encounter start` gate** — `minPlayers ≤ 1`
→ immediate start; `≥ 2` → lobby; Begin disabled below min. (c) **mid-run drop below
`minPlayers`** → the encounter **continues** (no auto-abort) + an in-world notice + the DM
may `/encounter resolve`; `maxPlayers` does **not** re-check mid-run (a leaving player
does not auto-free a slot — latecomers use `/encounter join`, gated by roster < cap).
Campaign-bound encounters: `maxPlayers ≤ party size`.
**3. To-resolve items closed (closes Winston #1)**
- **OQ-7 (Manage Messages):** ops re-invite with the Manage Messages scope; the bot checks
`channel.permissionsFor(bot)` at boot and logs a warning if missing; NFR-7 safe
degradation (skip delete, still guide). Added to the deploy/ops checklist.
- **no-Foundry-character branch (FR-46):** silent skip — the player rolls at +0 (no Foundry
modifier), no passive reveal fires for them, no Foundry-derived advantage; no special
player-facing message; the Roll button stays enabled.
- **`/character status` arg shape:** `set @user <label> [duration_minutes]` /
`clear @user [label]` / `show @user`; DM-only (permission check); ephemeral confirm.
- **hourglass GIF:** in-repo `assets/timer-hourlass.gif` (~10s loop) or `TIMER_GIF_URL`
env; `setImage` from `embeds/timedCheck.ts`; **static fallback** until the asset is
sourced (non-blocking — timed checks ship with a still image until the GIF lands).
- **OQ-8 (stat-privacy):** scoreboard shows each player's modifier by default (open
rolls); no hide flag for v1 (deferred).
- **OQ-1 (Foundry fields):** passives from `skills.{skill}.passive` (already fetched);
equipment from `items[]` (equipped flag); conditions = new relay slice
`/dnd5e/get-actor-conditions` returning `[{id,name,icon,description,durationRemaining,
stacks,concentration,source}]` (contract-first, bot-owned); advantage-granting derived
via the mapping seam (not a relay field).
**4. `character_spell_lookup` formal decision (closes Winston #2)**
Added to Option C: opt-in per `spec.tools`; args `{player, spellName}`; returns full spell
mechanics; used by the LLM when the L2 charctx budget trim omits spell detail (name+level+
school only). Decided — joins `skill_check_group_emit` + `character_status` + optional
`character_spell_lookup` in the registered tool set.
**5. Under-specified items closed (closes Winston under-specified)**
- **ConditionsReader flag:** `FOUNDRY_CONDITIONS_ENABLED` env (default `false` until the
relay ships); cutover = flip `true` + run the integration test against the real relay;
stub returns a canned YAML fixture.
- **boot-sweep → Feature A:** on boot, scan `encounter:*:active` + sessions with
`pendingSkillCheck`/`groupcheck`; pending timed check → finalize as timeout FAILURE
(push `[SKILL CHECK RESULT] … FAILURE (timer expired)` + clear pending + schedule LLM
turn); pending group check → apply `successRule` to partial rolls (unrolled = failure) +
finalize; active-no-pending → no action (resume on next message; mid-flight LLM calls are
unrecoverable past the token TTL, accepted).
- **`character_status` Zod shape:** `StoryStatus = { label: string, setAt: number,
setter: 'dm'|'llm', expiresAt: number }`; LLM-settable labels constrained to the
whitelist enum at the `character_status` tool handler; DM free-form.
- **FR-45 lock-acquire timeout:** ~2s; on timeout → the in-world *"the moment slips away —
try again"* ephemeral (no backoff loop, single reject); same-player double-click = idempotent
no-op (return existing roll result); non-target spam = rate-limited (one ephemeral per ~5s
per user).
- **story-status TTL:** DM re-set resets the 24h TTL; LLM re-set resets too (within the
whitelist); a new encounter does NOT reset (statuses persist cross-encounter per FR-35);
expiry = silent drop (the LLM stops seeing it; no narrated fade).
**6. NFR-3/NFR-4 latency harness (closes Murat #1)**
Add `tests/perf/latency.test.ts` — a latency-injected LLM mock asserting the bot's
non-LLM overhead is bounded (single-roll narration path, group-resolution path). Plus a
**manual playtest latency checklist item** recording observed p95 (≤8s single, ≤15s group).
**7. Enumerated test cases (closes Murat #2)**
- FR-43 fail-open (`discordId` absent → allow the roll) + second-claimant rejection
(integration, documented gap).
- FR-45 three contention modes as enumerated cases in the interleaving harness: lose →
in-world ephemeral; double-click → idempotent no-op; non-target spam → rate-limited.
- `character_status` DM > LLM priority-rejection branch (LLM call is a no-op/logged
rejection on conflict with an active DM status).
**8. Manual pre-release multi-player playtest checklist (closes Murat #5)**
Seven steps — pass = no orphaned thread / lost roll / raw-JSON leak:
1. N players join lobby, start. 2. Group check, all roll (live scoreboard + ephemerals).
3. Timed group check (10s increments + GIF + final-sands). 4. Latecomer joins a running
encounter (Join button + `/encounter join`). 5. Non-joined message deleted + guided.
6. A player doesn't roll (no-show grace/timeout). 7. Bot restart mid-group-check (sweep
finalizes).
**9. Harness clarifications (closes Murat (c)/#4 + Winston #5)**
- **Interleaving harness** = a Vitest test driving the real `rollHandler`/`lobbyHandler`
with synthetic discord.js `Interaction` objects (constructed, not gateway) + `ioredis-mock`
with controlled `await` interleaving; **explicitly unit/integration, NOT labeled live**;
the `atomicMutate` race is its target.
- **Foundry non-idempotent retry test** targets Foundry *writes* (idempotent ops: set HP /
award XP), not the scoreboard (which is Redis via `atomicMutate`).
- **`systemStrings.voice`** is the **7th** net-new harness (rendered-output forbidden-word
grep).
- **GraphMCP closed-set clarification:** `loreResolver`/`vocabularyResolver` are modules
*invoked by* `encounter.ts`/`encounters.ts`, not direct GraphMCP read callers; the closed
set of direct read callers = `{encounter.ts, encounters.ts}` only. AC6 enforces.
### Architecture Completeness Checklist
- [x] Project context thoroughly analyzed · [x] Scale and complexity assessed ·
[x] Technical constraints identified · [x] Cross-cutting concerns mapped
- [x] Critical decisions documented with versions · [x] Technology stack fully specified ·
[x] Integration patterns defined · [x] Performance considerations addressed
- [x] Naming conventions established · [x] Structure patterns defined ·
[x] Communication patterns specified · [x] Process patterns documented
- [x] Complete directory structure defined · [x] Component boundaries established ·
[x] Integration points mapped · [x] Requirements to structure mapping complete
### Architecture Readiness Assessment
**Overall Status: READY FOR IMPLEMENTATION** (all 16 checklist items [x]; no Critical Gaps
remain after the resolutions above).
**Confidence Level:** high.
**Key Strengths:** decisions grounded in the real codebase (file:line); LLM-contract
integrity locked (`filterLLMResponse` routing, per-variant filter test, dice monopoly);
concurrency/restart/test gaps explicitly owned (in-process mutex, two-case boot sweep,
unit+integration+playtest); brownfield extension (no re-architecture); `ioredis-mock` +
`gpt-tokenizer` already cover the new test demands.
**Areas for Future Enhancement:** durable timers / multi-instance locking (if scale lands);
hidden-modifier flag (OQ-8); CAP-17 full corpus refactor beyond minimal; the
`character_spell_lookup` full-mechanics path.
### Implementation Handoff
**AI Agent Guidelines:** follow the architectural decisions exactly as documented · use the
implementation patterns consistently across all components · respect the project structure
and boundaries · route every new narrator path through `filterLLMResponse` · mutate
`SessionState` only via `atomicMutate` · refer to this document for all architectural
questions.
**First Implementation Priority:** **CAP-17 minimal schema additions** (`minPlayers`/
`maxPlayers`/`passiveReveals` in `EncounterSpecSchema` + `docs/spec-authoring-guide.md` +
`tests/unit/specsToolsConsistency.test.ts`) — unblocks Features AE. Schema MUST land
before wizard reconciliation.

View File

@@ -0,0 +1,33 @@
# Decision Log — PRD: Mardonar Encounter Engine (2026-06-20)
Canonical memory and audit trail for this PRD run. Every decision, change, and
override (including headless overrides) is recorded here as the conversation
unfolds.
## Decisions
| Date | Decision | Rationale | Status |
|------|----------|-----------|--------|
| 2026-06-20 | Opened new PRD run (Create intent) | User selected Create via /bmad-prd intent prompt | open |
| 2026-06-20 | PRD scope = 3 new encounter tools + group encounters | Brain dump: (1) timed skill checks (timer→fail), (2) passive skill checks (domain skill grants hidden scene detail to group), (3) group skill checks (central + ephemeral per-player embeds, live roll updates, group success/fail). Plus group encounters with an "encounter lobby" embed — join via button, start gated on min players; existing encounters stay 1-min (anyone can start), 2+ require min met; EncounterSpec gains a `minPlayers` field | open |
| 2026-06-20 | Group success rule = spec-defined per check, default majority | User chose spec-authored successRule (majority/all/n-of-m/sum) per group check; default majority (≥half succeed) when unspecified. Deterministic bot computation, not LLM narrative. | decided |
| 2026-06-20 | Passive reveals = spec-authored, auto at scene start, Foundry scores | New spec field `passiveReveals` {skill, threshold, revealText}; bot auto-applies at encounter start using each player's passive score from Foundry (existing characterRegistry). Not an LLM tool. Group-visible, attributed to qualifying player. | decided |
| 2026-06-20 | minPlayers schema default = 1 (group encounters opt in) | User chose default=1 over recommended default=2 — supersedes original "defaulted to 2". Existing specs unchanged (anyone can start). Group encounters set minPlayers ≥2 to trigger lobby. | decided |
| 2026-06-20 | Timed-check duration set by LLM (tool arg); in-memory timer, restart = fail | LLM passes durationSeconds at emit (sane default). In-memory setTimeout; bot restart cancels/fails the pending check. Durable Redis-TTL timers deferred (out of scope). | decided |
| 2026-06-20 | OQ-1 resolved — Foundry exposes passive scores | User confirmed Foundry exposes passive scores; passive reveals (Feature B) read them directly via characterRegistry/getActorDetails. No client-side computation needed. | decided |
| 2026-06-20 | Enriched per-character context — Foundry + engine-tracked story status | Foundry supplies passive abilities, equipment, and Foundry conditions (read at lookup via characterRegistry/getActorDetails). Engine ALSO tracks mutable story-driven status (sick/cursed/disguised) set/cleared via a DM command + an LLM tool, Redis-backed, surfaced to the LLM. Rejects Foundry-read-only and engine-tracked-only. | decided |
| 2026-06-20 | Lobby-aware opening prompt | Because lobbies define the roster up front, the opening narrative/system prompt at start must account for who has joined (not just implicit chatters). Lobby-joined players populate the roster before the first LLM turn. | decided |
| 2026-06-20 | Lobby + latecomer message regulation | During the lobby, non-joined messages are auto-deleted; only Join-button pressers enter the roster. Latecomers to a RUNNING group encounter must also Join (non-joined messages auto-deleted). The LLM (not the bot) decides when a joiner is narratively part of the story — suppress the bot's auto [SESSION] entered announcement for group encounters. Solo encounters (minPlayers=1) keep today's implicit-join. Bot needs Manage Messages permission. | decided |
| 2026-06-20 | OQ-2 tool factoring = Option C | Extend skill_check_emit with optional durationSeconds (timed single); dedicated skill_check_group_emit (group, optional timer + successRule); one character_status tool with action: set\|clear. Keeps existing single-player contract minimal; group gets its own embed/resolution path. | decided |
| 2026-06-20 | OQ-3 lobby cap + expiry = maxPlayers cap + idle auto-expiry | Optional spec maxPlayers field caps the lobby (joining disabled at cap). Lobbies idle-auto-expire (~30 min default) closing the thread without starting. | decided |
| 2026-06-20 | OQ-4 long group-check window = central-embed-only past 15 min | Per-player ephemeral roll views are best-effort within the Discord interaction-token window (~15 min); the central scoreboard is the durable record. No hard cap on group-check duration. | decided |
| 2026-06-20 | OQ-5 story-status lifecycle = TTL auto-clear | Story status persists per character across encounters AND auto-clears after a TTL (~24h default) to bound staleness (CM-6); DM/LLM may clear explicitly anytime. | decided |
| 2026-06-20 | OQ-6 latecomer Join = button + command | Latecomers join via BOTH a persistent Join button on the lobby embed (live after Start) AND a /encounter join slash command. | decided |
| 2026-06-20 | OQ-7 Manage Messages = invite-scope, ops re-invite | Repo has no Manage Messages config; bot today only deletes its own messages (queueCap.ts:56, messageRouter.ts:136). FR-28/FR-29 require Manage Messages granted at Discord invite scope. NFR-7: safe degradation (log + skip) if missing. | decided |
| 2026-06-20 | Single player-locked Roll button; retire Adv/Dis/Mod buttons (FR-43) | With player sync (Feature E) + true Foundry stats, the player no longer chooses advantage/disadvantage/modifier. Every skill-check surface (solo, timed, group) shows ONE Roll button locked to the targeted player(s); handler rejects non-targeted clickers via private ephemeral. Advantage/disadvantage decided upstream by story/DM (LLM emit) + Foundry stats granting it, shown as a Roll Mode field. Simplifies existing solo skill_check_emit (behavior change to shipped behavior) — FR-39/NFR-4 updated to call it out. Originated in the UX pass; reflected back into the PRD. | decided |
| 2026-06-20 | Reviewer gate run (rubric + adversarial + edge-case); findings resolved | 3 PRD reviewers. Resolved: stale Non-Goals (blocker); dead visibility:'private' schema (FR-6); undefined successRule semantics (FR-15 + addendum §4); restart-mid-group-check (FR-44); untimed no-show hang (FR-16 grace period); lazy restart vs SM-4 (NFR-2 sweep); latecomer-during-active-check excluded (FR-17); roster timing (FR-21); adv/dis precedence (FR-43); OQ-1 refined (Perception/Investigation direct, others computed); no-Foundry-character branch (FR-46); atomic roll (FR-45); addendum §1 customIds / §5 Redis / §6 passives. Review files: review-rubric.md, review-adversarial.md, review-edge-cases.md. | decided |
| 2026-06-20 | PRD finalized — status: final | All OQs resolved (OQ-1 refined, OQ-8 added). 46 FRs, 5 features. Downstream: bmad-create-architecture, bmad-create-epics-and-stories (UX done). | final |
## Changes & Overrides
_(Recorded here as they occur.)_

View File

@@ -0,0 +1,261 @@
# Addendum — Group Encounters & New Skill-Check Tools
Depth that belongs in downstream documents (architecture, solution design, UX
spec) or earned a place but does not fit the PRD's main narrative: platform
constraints, schema shapes, tool-factoring alternatives, rejected-alternative
rationale, and the file:line grounding extracted from the codebase.
---
## 1. Discord platform constraints (drive several FRs/NFRs)
- **Ephemerals are per-interaction, per-user.** Editable only via the owning
interaction's token (`interaction.editReply`/`update`/`deferReply`) for ~15
min. They cannot be edited by another user's interaction, and there is no way
to broadcast a live update to many users' ephemerals from a single actor.
**FR-13**: each player's per-player ephemeral roll view is spawned by that
player's own **Roll** click. The **central embed** (FR-14) is the only
live-updating, group-visible surface.
- **Live embed edits are rate-limited** (~5 channel edits/sec global, per-route
buckets). → **NFR-1 / FR-14**: central scoreboard edits must be
coalesced/debounced, not one-edit-per-roll.
- **Buttons route through the global `interactionCreate` handler**
(`src/bot/index.ts:55`) by customId prefix (`sc_*`, `encounters_search_btn`).
No `createMessageComponentCollector` is used anywhere today. → **FR-41**: new
button prefixes (`lobby_join`, `lobby_leave`, `lobby_start`, `lobby_cancel`,
`grp_roll`, `sc_roll`) add to that one router. The `sc_adv` / `sc_dis` /
`sc_mod` / `grp_adv` / `grp_dis` prefixes are **retired** by the single
player-locked Roll (FR-43). CustomId max 100 chars.
- **Components are removed by editing `components: []`** (the existing pattern,
`rollHandler.ts:168`, `messageRouter.ts:110,149`). `setDisabled(true)` is
available but unused — useful for the lobby **Start** button's
disabled-until-min state (FR-20).
- **Threads are the encounter surface; multiple members can message and click.**
Each click is its own `ButtonInteraction`; the actor can `update()` the
public central embed **and** `reply({ ephemeral: true })` to themselves.
## 2. Current skill-check / tool framework (the extension surface)
- **Tool plugin shape**: `ToolPlugin``name`, `description`,
`args: Record<string, ArgSchema>`, optional `contextDocs(spec)`, async
`handler(args, ctx) → DispatchResult`. Self-registers via `registerTool()`.
Must also be imported in `src/harness/tools/index.ts:3-8` for global
availability. Per-encounter filtering via `getActiveTools(spec.tools)`
(`src/harness/toolRegistry.ts:23-55`).
- **Dispatch**: `dispatchTool(block, ctx)``handler``DispatchResult`
(`{ systemMessage, error?, resolved? }`) (`src/harness/toolDispatcher.ts:17-117`).
LLM-facing contract = `buildToolManifest(spec)` (a `<tool_contract>` block).
- **Existing `skill_check_emit`** (`src/harness/tools/skillCheckEmit.ts:62-138`):
args `player, prompt, skill, dc, advantage, disadvantage`; resolves player
Discord ID from `session.players`, looks up a Foundry modifier via
`characterRegistry` + `getActorDetails` (30s cache), sends a suspense embed,
edits to the real embed + roll buttons after 1.5s, persists
`pendingSkillCheck` to `SessionState`.
- **Pending lifecycle**: `PendingSkillCheck` on `SessionState`
(`src/types/index.ts:38-65`), persisted in Redis at `session:{threadId}`
(`src/session/sessionManager.ts:7-22`). While pending, the LLM turn scheduler
bails (`messageRouter.ts:256-259`); incoming player messages increment
`pendingSkillCheckAttempts` and warn, auto-failing after
`PENDING_ROLL_LIMIT` skipped messages (`messageRouter.ts:141-173`).
- **Roll → LLM**: `rollHandler.ts:40-87` computes `total = roll + modifier`,
`success = total >= dc`, edits the embed green/red, clears pending, pushes a
`[SKILL CHECK RESULT]` **system** message, calls
`scheduleEncounterLLMTurn(…, immediate=true)`.
- **Player roster**: `session.players` is populated implicitly when a user
chats in the thread (`messageRouter.ts:189-205`); the starter is **not**
auto-added. → relevant to FR-12 default player set and to the lobby
(FR-21 adds joined players to this roster explicitly).
- **Roll result filter**: `responseFilter.ts` strips echoed system tags
(`SYSTEM_TAG_RE`), catches leaked `tool_call` JSON (`LEAKED_TOOL_CALL_RE`)
and fabricated rolls (`ROLL_CLAIM_RE`), and has a `detectMissedSkillCheck`
heuristic. → **NFR-3 / FR-27**: new result paths must route through this
filter so timed-out / group outcomes never leak raw JSON.
## 3. Tool-factoring alternatives (OQ-2 — for architecture)
**Option A — separate tools**: `skill_check_timed_emit` (single + duration) and
`skill_check_group_emit` (multi + optional duration + successRule). Keeps each
tool's contract narrow and the LLM's choice explicit. Cost: more tools to
register/document; the LLM must pick the right one.
**Option B — one unified `skill_check_emit`** with optional `durationSeconds`,
`players` (array|"all"), and `successRule` args. The existing single-player
untimed call is the degenerate case (no duration, one player). Cost: a heavier
tool contract and conditional embed/handler logic; risk of the LLM mixing
args.
**Option C — extend `skill_check_emit` with `durationSeconds` only, and add a
separate `skill_check_group_emit`**. Timed is just a flag on the existing
single-player tool; group is its own tool because its embed + resolution differ
structurally. Middle ground.
_Recommendation for architecture_: **Option C** — minimal change to the
existing tool's contract (one optional arg), and group checks get the dedicated
embed/resolution path they need. The PRD is intentionally tool-factoring-agnostic
(FR-1/FR-11 state capability, not tool names).
## 4. EncounterSpec schema additions (for FR-6, FR-18, FR-28)
Current schema: `src/spec/loader.ts:29-58`. `skillChecks` is a flat
`z.record(z.union([z.number(), z.string()]))` — DCs as numbers, `_skill`/`_note`
companions as strings. No timer/group/lobby/passive fields exist; Zod strips
unknown keys silently, so writing them today is a no-op (authoring guide
`docs/spec-authoring-guide.md:75-77` documents `minPlayers`/`campaignId` as
planned, not enforced — CAP-12/CAP-13).
Proposed additions (exact Zod shapes for architecture):
```ts
minPlayers: z.number().int().min(1).default(1), // FR-18
maxPlayers: z.number().int().min(1).optional(), // FR-24 cap
passiveReveals: z.array(z.object({
skill: z.string(), // e.g. "Perception"
threshold: z.number().int(),
revealText: z.string(),
// group-visible only — no private path (no interaction at start) — FR-6
})).optional(),
```
Group-check `successRule` is a **tool arg** (FR-11/FR-15), not a spec field —
it varies per check. Semantics: `majority` = ≥ ceil(N/2) succeed; `all` = every
roller succeeds; `n_of_m` = ≥ `n` of `m` targeted succeed; `sum-threshold` =
sum of d20-totals ≥ `t` (≥, not `>`); ties resolve via the ≥ rule. Timed
duration is also a tool arg (FR-1).
Authoring-guide pitfalls to honor (`docs/spec-authoring-guide.md:93-99`):
- No dice results in spec prose (bot owns dice).
- No system tags / `tool_call` / fenced JSON in `revealText` or `persona`
(responseFilter strips/suppresses).
- `id` fields are stable forever.
- `tools` entries must be registered plugins or the build fails
(`tests/unit/specsToolsConsistency.test.ts`) — new tools must be registered
before any spec lists them.
- Update `docs/spec-authoring-guide.md` and the reference spec
`specs/market-thief.yaml` (which must stay `xpReward`-free — the new fields
don't conflict).
## 5. State & persistence (for NFR-2, FR-21, FR-25)
- **Redis client**: `src/db/redis.ts:4` (ioredis singleton). Existing keys:
`session:{threadId}` (JSON `SessionState`, TTL `SESSION_TTL_HOURS`),
`guild_threads:{guildId}` (SET), `character:{guildId}` (HASH
`discordId → profile`).
- **Lobby state** (FR-21/FR-25): new Redis key `lobby:{threadId}` → JSON
`{ specName, minPlayers, joined: [discordId…], starterDiscordId, messageId }`,
TTL'd. On bot restart, rehydrate the lobby embed from this key. The lobby is
long-lived (minuteshours) so it **must not** be in-memory — unlike timed
checks.
- **Timed checks**: in-memory `setTimeout` only (FR-5). On restart, a **sweep**
finalizes any `session:{threadId}` whose `pendingSkillCheck` deadline has
passed as a timeout failure **immediately** (not lazily) — so no thread hangs
(SM-4, NFR-2).
- **Group-check state** (FR-44): new Redis key `groupcheck:{threadId}` → JSON
`{ skill, dc, targeted: [...], rolls: {discordId: total}, deadline?,
successRule }`, TTL'd. On restart, rehydrate the scoreboard; expired →
timeout-finalize; in-flight → remaining players finish. Roll registration is
atomic per player (FR-45) — compare-and-set on the rolls map.
## 6. Foundry retrieval — passives, equipment, conditions (OQ-1 — RESOLVED)
User confirmed Foundry exposes passive scores. Refinement (OQ-1): Foundry
**directly** exposes passive **Perception / Investigation**; other-skill
passives are **computed** `10 + modifier` (+ proficiency where applicable) —
there is no canonical "per-skill passive" in Foundry beyond those.
`characterRegistry` + `getActorDetails` (used in `skillCheckEmit.ts:12-20`,
30s cache) is **extended** to return equipment and active **Foundry
conditions/features** (which may grant advantage/disadvantage — FR-31/FR-43).
Passive reveals (FR-7/FR-34) consume passives from this enriched retrieval.
Architecture must confirm the exact Foundry fields/endpoints for
passives/equipment/conditions and the advantage-granting conditions (OQ-1
refined).
## 7. GraphMCP event logging (FR-30)
New event types to ingest via `src/graphmcp/ingest.ts` (existing
`setnx`-dedup + `xadd` stream + 7d TTL pattern):
- `lobby_join`, `lobby_leave`, `lobby_start`, `lobby_cancel`
- `latecomer_join` (join to a running group encounter)
- `msg_deleted` (non-joined sender, phase: lobby|running)
- `passive_reveal` (skill, threshold, player, hit)
- `skill_check_timed_emit`, `skill_check_timed_timeout`
- `skill_check_group_emit`, `skill_check_group_result` (rule, per-player
outcomes, group outcome)
- `status_set`, `status_clear` (character, status, actor: dm|llm)
## 8. Rejected alternatives
- **Default `minPlayers = 2`** (user's original dump). Rejected in favor of
default = 1 (user's review choice) so all existing solo specs run unchanged
without migration; group encounters opt in by setting ≥ 2. Recorded in
`.decision-log.md`.
- **Durable Redis-TTL timers.** Rejected for v1 (in-memory timers, restart =
fail) to keep the build small; the lobby is the only long-lived state that
justifies Redis backing. NFR-2 records the accepted trade-off.
- **LLM-decided group success** (no fixed rule). Rejected in favor of a
deterministic, spec-authored `successRule` so outcomes are reproducible and
auditable, not subject to LLM variance.
- **LLM-triggered passive reveals via a tool.** Rejected in favor of
spec-authored, bot-applied reveals at scene start so they fire reliably from
character data without depending on the LLM remembering to call a tool.
- **Foundry read-only for status.** Rejected — Foundry conditions alone don't
capture story-driven state like "sick" that isn't a Foundry condition. The
engine tracks mutable story status alongside Foundry-read
passives/equipment/conditions.
- **Engine-tracked only (no Foundry).** Rejected — duplicating passives and
equipment in engine state would drift from Foundry and add DM burden. Foundry
stays the source of truth for those; the engine owns only story status.
## 9. Key file:line entry points (for architecture/epics)
- Schema: `src/spec/loader.ts:29-58`
- Spec load call site: `src/bot/commands/encounter.ts:130-233` (start flow),
`:144` (loadSpec), `:220` (`players = {}`), `:231` (opening text send)
- Tool registry/dispatch: `src/harness/toolRegistry.ts:23-55`,
`src/harness/toolDispatcher.ts:17-117`, `src/harness/tools/index.ts:3-8`
- Existing skill-check tool: `src/harness/tools/skillCheckEmit.ts:62-138`
- Pending state: `src/types/index.ts:38-65`, `src/session/sessionManager.ts:7-22`
- Roll → LLM: `src/bot/handlers/rollHandler.ts:40-87, 97-181`
- Pending block + scheduler: `src/bot/handlers/messageRouter.ts:141-173, 189-205, 256-259`
- Embeds/buttons: `src/bot/embeds/skillCheck.ts:16-74`
- Filters: `src/bot/handlers/responseFilter.ts:8-77`
- Prompt builder: `src/harness/promptBuilder.ts:4-24, 137-142, 170-190`
- Redis: `src/db/redis.ts:4`; GraphMCP ingest: `src/graphmcp/ingest.ts:28-32`
- Global interaction router: `src/bot/index.ts:55`
## 10. Enriched character context (Feature E)
- **Foundry retrieval**: extend `getActorDetails` to return passives (per
skill), equipment, and active conditions per actor, alongside today's
per-skill modifier. Reuse the 30s in-memory cache pattern
(`skillCheckEmit.ts:12-20`).
- **Story status store**: new Redis structure keyed per character. Candidate:
extend the existing `character:{guildId}` HASH (`characterRegistry.ts:21-39`)
with a `status` field (JSON array of `{label, setAt, actor}`), or a dedicated
key `character_status:{guildId}:{discordId}`. Set/cleared by a DM slash
command and an LLM tool (FR-32). Persists across encounters (FR-35) until
cleared — OQ-5 pins the auto-clear rule.
- **Prompt rendering**: expand `buildPlayersBlock(players)`
(`src/harness/promptBuilder.ts:4-24`) to render each player's passives,
equipment, conditions, and story status. Watch prompt size — equipment lists
can be large; consider per-player summarization/truncation (architecture call).
## 11. Message regulation (Feature D — FR-28/FR-29/FR-42)
- Today players join **implicitly by chatting** and the bot pushes a
`[SESSION] … has entered the encounter` system message
(`src/bot/handlers/messageRouter.ts:189-205`). For group encounters this
changes:
- **Lobby phase**: messages from non-joined members are auto-deleted before
they reach the router's join logic. Requires **Manage Messages** (FR-42,
NFR-7); degrade safely (log + skip) if missing.
- **Running group encounter**: a latecomer must Join first; non-joined
messages are auto-deleted the same way. The implicit-join path
(`messageRouter.ts:189-205`) is **bypassed for group encounters** — joined
players are added to `session.players` by the lobby/latecomer Join handler
(FR-21/FR-29), not by chatting.
- **Solo encounters**: implicit-join + `[SESSION] entered` preserved
(FR-30); no deletion.
- **Latecomer Join affordance** (OQ-6): keep the lobby embed's Join button live
after Start, or add a `/encounter join` slash command. The `/encounter start`
flow (`src/bot/commands/encounter.ts:130-233`) is the integration point.
- **Suppressed announcement** (FR-27): for group encounters, skip the
`[SESSION] entered` system message; the roster is already in the prompt
(FR-26) and the LLM weaves joiners in narratively.

View File

@@ -0,0 +1,421 @@
---
title: "Group Encounters, Timed/Passive/Group Skill Checks & Enriched Character Context"
status: final
created: 2026-06-20
updated: 2026-06-20
---
# Product Requirements Document: Group Encounters, New Skill-Check Tools & Character Context
## 1. Overview
The Mardonar Encounter Engine runs D&D encounters inside Discord threads: an
LLM narrates, voices NPCs, emits skill checks, tracks hidden goals, and logs
everything to GraphMCP. Today the engine is **single-player by default**
players join a thread implicitly by chatting, the starter is not auto-added,
skill checks target one player at a time, no check has a deadline, there is no
minimum-players gate, and the LLM sees only a per-skill modifier per character.
This PRD defines five capabilities that bring the engine to multi-player,
time-pressured, character-aware play:
1. **Timed skill checks** — a single-player skill check with a wall-clock
deadline; expiry is a failure.
2. **Passive skill checks** — hidden scene details auto-revealed to the group
when a player's passive score in a relevant skill meets a spec-defined
threshold (e.g. *passive Perception 16* → the group learns that player sees
a button everyone else missed).
3. **Group skill checks** — a check spanning multiple players, with a live
central scoreboard embed and per-player ephemeral roll views, resolved by a
deterministic group success rule.
4. **Group encounters & encounter lobby** — an encounter that requires a
minimum number of players, gated by a lobby embed with Join/Start buttons,
with message regulation so only joined players participate.
5. **Enriched character context** — the engine tracks and surfaces more per
character: passive abilities, equipment, Foundry conditions, and mutable
story-driven status (e.g. "sick"), giving the LLM richer, accurate context.
Capabilities 14 ship together because the lobby produces the multi-player
roster that group checks operate on and that the opening prompt must account
for; capability 5 enables passive reveals (2) and improves all LLM narration.
## 2. Problem & Motivation
- **No time pressure.** A pending skill check today blocks the thread until the
player rolls or skips enough messages to auto-fail (`PENDING_ROLL_LIMIT`).
There is no wall-clock deadline, so tense scenes can't be staged mechanically.
- **No passive perception.** A high-passive-Perception character should notice
things others don't, automatically. Today this requires the DM to hand-narrate
it or the LLM to fabricate awareness — unreliable.
- **No group checks.** The engine checks one player at a time and cannot
aggregate several players' rolls into one outcome.
- **No minimum-players gate.** Any encounter can be started solo, even ones
designed for a party. Random posts auto-enter players into the scene whether
or not they intend to participate.
- **Thin character context.** The LLM sees only a per-skill modifier per
character. It has no visibility into passive abilities, equipment, active
conditions, or story-driven status (a character who is sick, cursed, or
disguised), so narration can't reliably reflect character state.
## 3. Goals & Non-Goals
### Goals
- Support multi-player group encounters with a minimum-players lobby gate and
message regulation (only joined players participate).
- Add timed skill checks that fail-closed on timeout.
- Add passive skill reveals that fire automatically from character data, with
no DM or LLM intervention.
- Add group skill checks with a live scoreboard and a deterministic,
spec-authored success rule.
- Enrich per-character context (passives, equipment, conditions, story status)
and surface it to the LLM; enable story-status to be set/cleared by the DM
and the LLM.
- Preserve 100% of existing single-player encounter and skill-check behavior.
### Non-Goals (out of scope)
- **Durable timers.** Timed checks use in-memory timers; a bot restart cancels
the pending check as a fail. Redis-TTL-backed timers are deferred (NFR-2).
- **Multi-scene passive reveals.** The engine has no scenes/stages today;
passive reveals apply at encounter start.
- **Lobby max-player cap or auto-expiry.** (Now IN scope per OQ-3 — FR-24 /
NFR-2.)
- **Campaign grouping (`campaignId`).** Remains planned, not enforced.
- **Foundry write-back** for equipment/conditions (Foundry is read-only for
these; only engine-tracked story status is mutable, locally).
- **Spectator / DM-only private views** of group checks.
## 4. User Journeys
### UJ-1 — DM runs a group heist (Mary, DM)
Mary runs `/encounter start velvet-auction` (`minPlayers: 3`). The bot posts a
**lobby embed**: "Velvet Auction — needs 3, 1 joined" with **Join** enabled and
**Start** disabled. A late visitor types "hi" in the thread without joining —
the bot deletes it. Three players press **Join**; **Start** enables. Mary
presses **Start**. The opening narrative posts and — because the roster is known
up front — the system prompt accounts for all three joined players; the LLM
weaves them in (no auto "entered" announcement). Passive reveals fire (one
player with passive Insight 15 is publicly told they notice the auctioneer's
tell). The LLM emits a **group Stealth check**, 60s, `successRule: majority`.
The central scoreboard fills in live as each player clicks **Roll**; at expiry
2 of 3 succeeded → group SUCCESS; the LLM narrates the outcome.
### UJ-2 — High-passive player auto-spots a hidden detail (Zara, player)
Zara joins a corridor encounter with passive Perception 16. The spec's
`passiveReveals` lists `{skill: Perception, threshold: 16, revealText: "Zara
notices a small button set into the wall behind the tapestry."}`. At encounter
start the bot reads Zara's passive Perception from her Foundry character (part
of the enriched character context), sees 16 ≥ 16, and posts the reveal
attributed to Zara. No one rolled dice.
### UJ-3 — Solo timed trap (Kay, solo player)
Kay runs a solo encounter (default `minPlayers`, today's implicit-join). The LLM
emits a **timed skill check** with `durationSeconds: 30` to disarm a trap. The
embed shows a 30s countdown. Kay rolls at 18s → success. In a second run Kay
hesitates; the timer expires → the bot marks the embed timed-out, pushes a
`[SKILL CHECK RESULT] … FAILURE (timer expired)` system message, and the LLM
narrates the trap triggering.
### UJ-4 — Story status colors narration (Zara, sick)
Before the encounter, the DM sets Zara's story status to `sick` (via a DM
command). When the encounter starts, the enriched character context surfaces
`status: sick` alongside her passives/equipment in the LLM prompt. The LLM
narrates Zara's movements as labored and applies disadvantage where appropriate,
without the DM reminding it each turn.
## 5. Features & Functional Requirements
### 5.1 Feature A — Timed Skill Checks
- **FR-1** The LLM can emit a timed single-player skill check by passing a
`durationSeconds` argument on the skill-check tool call.
- **FR-2** On emit, the bot posts the skill-check embed with a visible countdown
and starts an in-memory timer; the encounter thread enters the pending state
(blocked) until the check resolves.
- **FR-3** If the player rolls before expiry, the check resolves per the
existing single-player logic (d20 + modifier vs DC → success/failure).
- **FR-4** If the timer expires before a roll, the bot auto-resolves the check
as **FAILURE**, updates the embed to a timed-out state, pushes a
`[SKILL CHECK RESULT] … FAILURE (timer expired)` system message to the LLM,
clears the pending state, and schedules the next LLM turn.
- **FR-5** If the bot restarts while a timed check is pending, the check is
cancelled and treated as failed (accepted trade-off; NFR-2).
### 5.2 Feature B — Passive Skill Checks
- **FR-6** `EncounterSpec` gains an optional `passiveReveals` array. Each entry
defines `skill` (string), `threshold` (number), and `revealText` (string).
Reveals are **group-visible and attributed** to the qualifying player — there
is no private delivery path, because passive reveals fire at encounter start
when no interaction is in flight to carry an ephemeral (NFR-5).
- **FR-7** At encounter start, after the opening narrative is posted, the bot
reads each present player's passive score for `passiveReveals[].skill`.
Passive **Perception / Investigation** come from Foundry directly;
other-skill passives are **computed** as `10 + modifier` (+ proficiency where
applicable) via the enriched character context (Feature E). Exact Foundry
fields/endpoints are an architecture task (OQ-1 refined, Addendum §6).
- **FR-8** For each player whose passive score ≥ `threshold`, the bot posts
`revealText` to the thread, attributed to that player and visible to the
group.
- **FR-9** Players with no qualifying passive score (or no registered Foundry
character) receive nothing; no die roll is involved. `[ASSUMPTION]`
- **FR-10** Passive reveals are deterministic and bot-applied. The LLM does not
trigger them and does not decide thresholds.
### 5.3 Feature C — Group Skill Checks
- **FR-11** The LLM can emit a group skill check via a tool, specifying `skill`,
`dc`, target `players` (or "all in story so far"), optional
`advantage`/`disadvantage`, optional `durationSeconds` (timed variant), and
optional `successRule`.
- **FR-12** On emit, the bot posts a **central scoreboard embed** in the thread
with a **Roll** button (and Adv/Dis). The check targets the specified player
set; default is the current session roster. `[ASSUMPTION]` — one roll per
player.
- **FR-13** Each targeted player clicks **Roll** and receives an **ephemeral**
reply showing their d20 + modifier vs DC. Ephemerals are driven by each
player's own button interaction (platform constraint — ephemerals cannot be
live-broadcast; Addendum §1).
- **FR-14** The central embed updates with each player's result as rolls
arrive, with edits **coalesced/debounced** to respect Discord rate limits
(NFR-1).
- **FR-15** Group success is computed by the **bot** per the check's
`successRule` (`majority` | `all` | `n_of_m` | `sum-threshold`); if
unspecified, default is **majority** (≥ `ceil(N/2)` of rollers succeed, so
2-of-4 succeeds). Semantics — `all`: every roller succeeds; `n_of_m`: at
least `n` of the `m` targeted rollers succeed (args `n`, `m`); `sum-threshold`:
sum of rollers' d20-totals ≥ threshold `t` (≥, not `>`). Defined in
Addendum §4.
- **FR-16** The check finalizes when all targeted players have rolled, the
timer expires, **or** (untimed checks) a no-show grace period passes after
all others have rolled (default ~60s) — unrolled players count as failures.
The bot finalizes the central embed (group SUCCESS/FAILURE), pushes a
`[SKILL CHECK RESULT]` system message with the per-player breakdown and group
outcome to the LLM, and schedules the next LLM turn. A group check **always
terminates** (timed by timer; untimed by grace period) — no thread hangs.
- **FR-17** Unrolled targeted players count as failures at finalization (timer
expiry, or the untimed no-show grace period). A latecomer who **Joins during
an active group check is not added to that check's target set** — they enter
the roster for subsequent checks only, so an in-flight outcome can't swing.
### 5.4 Feature D — Group Encounters, Lobby & Message Regulation
- **FR-18** `EncounterSpec` gains a `minPlayers` integer field with schema
default **1**.
- **FR-19** `/encounter start` on a spec with `minPlayers ≤ 1` behaves exactly
as today: thread created, encounter starts immediately, any player may begin,
players join implicitly by chatting, and the bot's `[SESSION] entered`
announcement is preserved.
- **FR-20** `/encounter start` on a spec with `minPlayers ≥ 2` enters a
**lobby flow**: the thread is created and a **lobby embed** is posted showing
the required minimum and current joined count, with **Join**, **Leave**,
**Start** (disabled until joined count ≥ `minPlayers`), and **Cancel** buttons.
- **FR-21** Players press **Join** to join the lobby. Joined players are
recorded in **Redis-backed** lobby state (the lobby roster); **Leave**
withdraws a joined player. The lobby roster is promoted to `session.players`
at Start (FR-26), not at Join.
- **FR-22** Once joined count ≥ `minPlayers`, **Start** enables; any joined
player may press **Start** to begin the encounter (opening narrative posted,
passive reveals applied, first LLM turn scheduled). `[ASSUMPTION]`
- **FR-23** **Cancel** (available to the starter) aborts the lobby and closes
the thread without starting. `[ASSUMPTION]`
- **FR-24** An optional spec-defined `maxPlayers` caps the lobby; joining is
disabled once the cap is met. Lobbies also **idle-auto-expire** after a
configurable period (default ~30 min) with no Join/Start activity, cleaning up
abandoned lobbies; expiry closes the thread without starting.
- **FR-25** Lobby state is **Redis-backed** so a bot restart does not lose
joined players; the lobby embed is rehydrated from Redis on restart.
- **FR-26** When a group encounter starts from a lobby, all joined players are
added to `session.players` **before the first LLM turn**, so the opening
system prompt accounts for the full roster.
- **FR-27** The LLM narratively incorporates joined players; the bot
**suppresses** its auto `[SESSION] entered` announcement for group encounters.
`[ASSUMPTION]`
- **FR-28** During the lobby phase, any message from a non-joined member is
**auto-deleted** and the sender is not added to the roster.
- **FR-29** In a running group encounter, a latecomer must **Join** before their
messages are accepted; non-joined latecomer messages are auto-deleted. Join is
available via **both** a persistent Join button on the lobby embed (live after
Start) **and** a `/encounter join` slash command.
- **FR-30** Solo encounters (`minPlayers ≤ 1`) retain today's implicit-join and
`[SESSION] entered` behavior; message regulation applies only to group
encounters.
### 5.5 Feature E — Enriched Character Context
- **FR-31** For each player, the bot reads **passive abilities** (per skill),
**equipment**, and active **Foundry conditions** from Foundry via the existing
`characterRegistry` / `getActorDetails` (extending today's per-skill modifier
lookup) — including any Foundry conditions/features that grant **advantage or
disadvantage** on relevant checks (feeding FR-43).
- **FR-32** The engine tracks mutable **story-driven status effects** per
character (e.g. `sick`, `cursed`, `disguised`) in Redis, set and cleared via
a **DM command** and an **LLM tool**.
- **FR-33** The enriched character context — passives, equipment, Foundry
conditions, and story status — is rendered into the LLM system prompt (player
block) so the LLM has accurate, current character state each turn.
- **FR-34** Passive reveals (Feature B) consume the passive abilities from this
enriched context (no separate lookup path).
- **FR-35** Story status persists **per character across encounters** via the
character registry and **auto-clears after a TTL** (default ~24h) to bound
staleness (CM-6); a DM or the LLM may also clear it explicitly at any time.
### 5.6 Cross-Cutting
- **FR-36** New tools (timed/group skill checks, story-status set/clear) are
registered in the tool registry, exported via the tools index, included in
`VALID_TOOL_NAMES`, and rendered into the LLM tool contract manifest; specs
opt into them via `spec.tools` as today.
- **FR-37** All new tool results and auto-resolutions (timeout, group outcome,
status changes) flow back to the LLM as `[SKILL CHECK RESULT]` / `[SYSTEM]`
messages via the existing dispatch/session path. None leak raw `tool_call`
JSON to players (responseFilter guard).
- **FR-38** New `EncounterSpec` fields (`minPlayers`, `maxPlayers`,
`passiveReveals`, and any group/timer config) are added to
`EncounterSpecSchema` in `src/spec/loader.ts`; `docs/spec-authoring-guide.md`
and the reference spec are updated; the specs-tools consistency test passes.
- **FR-39** All existing specs run unchanged: omitting `minPlayers` yields solo
behavior; omitting `passiveReveals` yields no passive reveals. The existing
solo `skill_check_emit` is **simplified to a single player-locked Roll
button** (FR-43) — its resolution logic is preserved; only the
Adv/Dis/Custom-Modifier player buttons are retired.
- **FR-40** Lobby joins/leaves/starts/cancels, latecomer joins, auto-deleted
messages, group-check emits/results, passive reveals, timed-check timeouts,
and story-status set/clear are logged as events to GraphMCP.
- **FR-41** New button interactions use customId prefixes routed through the
existing global `interactionCreate` handler — no new collector framework.
- **FR-42** The bot holds the **Manage Messages** permission in encounter
threads so it can enforce non-joined message auto-deletion (FR-28/FR-29).
- **FR-43** Every skill-check surface (solo, timed, group) presents a single
**`Roll` button locked to the targeted player(s)**; the bot rejects other
clickers with a private ephemeral. Advantage/disadvantage is decided upstream
— by the story/DM (LLM emit args) and the character's Foundry stats
(FR-31) — and shown as a Roll Mode field, not chosen by the player. The
Adv/Dis/Custom-Modifier roll buttons are retired on all skill-check surfaces,
including the existing solo `skill_check_emit`. **Precedence**: an explicit
LLM emit arg (story/DM discretion) overrides; otherwise Foundry-derived
advantage/disadvantage applies; if neither, a straight roll.
- **FR-44** Group-check roll state (targeted players, who has rolled, each roll
total) is persisted in Redis (`groupcheck:{threadId}`) so a bot restart can
rehydrate the scoreboard; an expired-deadline check is finalized as a timeout
and an in-flight check is rehydrated for remaining players to finish — no
orphaned thread.
- **FR-45** Roll registration is **atomic per player per check** — the bot
defers the button interaction, takes an idempotency lock, and rejects a
second click by the same player; near-simultaneous clicks cannot drop or
duplicate a roll. Non-targeted clickers are rate-limited, not flooded with
ephemerals.
- **FR-46** A player with no registered Foundry character is skipped for
passive reveals (FR-9), rolls with a default `+0` modifier, and receives no
Foundry-derived advantage/disadvantage (FR-31) — the check still proceeds.
## 6. Success Metrics
- **SM-1** A DM can start a group encounter that will not begin until
`minPlayers` have joined (lobby **Start** disabled below the minimum).
- **SM-2** Group skill checks resolve deterministically per the spec's
`successRule`, with zero LLM ambiguity in the computed outcome.
- **SM-3** Passive reveals surface automatically at encounter start for
qualifying players, with no DM or LLM action.
- **SM-4** Timed checks fail-closed on timeout — no pending check hangs the
thread indefinitely.
- **SM-5** The LLM receives enriched character context (passives, equipment,
conditions, story status) each turn, and story status set in one encounter
persists into the next until cleared.
- **SM-6** In group encounters, only joined players' messages enter the scene;
non-joined messages (lobby and latecomer) are removed.
**Counter-metrics**
- **CM-1** Lobby abandonment (joined but never started).
- **CM-2** Timed checks lost to bot restart.
- **CM-3** Group-check central-embed edits throttled by Discord rate limits.
- **CM-4** Passive reveals firing for players who should not qualify.
- **CM-5** False-positive message deletions (joined players' messages wrongly
removed).
- **CM-6** Stale story status never cleared, producing wrong narration over
time.
## 7. Non-Functional Requirements
- **NFR-1 (Performance)** Central group-check embed edits are coalesced/debounced
(target ≤ 1 edit/sec) so an N-player group check does not hit Discord's
per-route edit rate limits.
- **NFR-2 (Reliability)** Lobby state, story status, and group-check roll state
are Redis-backed (survive restart), each with a TTL (lobby idle-auto-expiry
~30 min; story status ~24h). On restart a **sweep** finalizes any pending
check whose deadline has passed as a timeout failure — so no thread hangs in
pending indefinitely (SM-4); in-flight group checks rehydrate (FR-44). Timed
single checks use in-memory timers and fail-closed on restart.
- **NFR-3 (LLM-contract integrity)** New tools and result paths route through
`toolDispatcher` + `responseFilter`; no raw `tool_call` JSON or fabricated
rolls reach players.
- **NFR-4 (Backward compatibility)** 100% of existing specs load and run
unchanged; existing single-player implicit-join behavior is preserved. The
solo `skill_check_emit` button set is simplified to a single player-locked
Roll (FR-43) — an intentional behavior change to shipped behavior, called out
here rather than silently regression-risked.
- **NFR-5 (Platform conformance)** Per-player ephemeral views are driven by each
player's own button interaction; group checks complete within the Discord
interaction-token window (~15 min) or fall back to central-embed-only updates.
- **NFR-6 (Observability)** Every new mechanic emits a GraphMCP event and a
structured log line; timed-check timeout, restart-loss, message deletion, and
story-status change are explicitly logged.
- **NFR-7 (Permissions)** The bot must be granted **Manage Messages** in channels
where group encounters run, to enforce FR-28/FR-29. Missing permission must
degrade safely (log + skip deletion, do not crash).
## 8. Open Questions & Assumptions
**Assumptions (drafted, correct in review)**
- `[ASSUMPTION]` Passive reveals apply at **encounter start** only (no
scenes/stages today).
- `[ASSUMPTION]` Any joined player may Start; **Leave** button present.
- `[ASSUMPTION]` Timeout = bot auto-fail + `[SKILL CHECK RESULT]`, mirroring
today's `PENDING_ROLL_LIMIT`; a timed check blocks the thread (pending).
- `[ASSUMPTION]` Group-check player set defaults to the current roster; one
roll per player.
- `[ASSUMPTION]` Players without a registered Foundry character are skipped
for passive reveals.
**Open Questions**
- **OQ-1 ✅ RESOLVED (refined)** — Foundry exposes passive Perception /
Investigation directly; other-skill passives are computed `10 + modifier`
(+ proficiency). Foundry-granted advantage/disadvantage comes from
conditions/features; exact field/endpoint retrieval is an architecture task
(Addendum §6).
- **OQ-2 ✅ RESOLVED** — Option C: extend `skill_check_emit` with optional
`durationSeconds`; dedicated `skill_check_group_emit`; one `character_status`
tool with `action: set|clear`. (Addendum §3.)
- **OQ-3 ✅ RESOLVED** — Optional spec `maxPlayers` cap + lobby idle-auto-expiry
(~30 min default).
- **OQ-4 ✅ RESOLVED** — Central-embed-only is acceptable past the ~15-min
ephemeral window; no hard cap on group-check duration.
- **OQ-5 ✅ RESOLVED** — Story status persists cross-encounter and auto-clears
after a TTL (~24h default); explicit DM/LLM clear also supported.
- **OQ-6 ✅ RESOLVED** — Latecomer Join via both a persistent lobby Join button
and a `/encounter join` slash command.
- **OQ-7 ✅ RESOLVED** — Repo has no `Manage Messages` config; the bot today
only deletes its own messages. FR-28/FR-29 require `Manage Messages` granted
at the Discord invite scope (ops re-invite); NFR-7 mandates safe degradation
if missing.
- **OQ-8** The scoreboard shows each player's modifier (`rolled 16 +3`) to the
group. Open rolls (modifiers visible) is the assumed default; confirm or
require hidden modifiers. (UX note — see EXPERIENCE.md.)
## 9. Downstream Handoff
- **Architecture / solution design** — `bmad-create-architecture`: timer
durability trade-off, tool factoring (OQ-2), Redis lobby/story-status/timed
key shapes, central-embed edit coalescing, enriched Foundry retrieval + prompt
rendering, message-regulation interaction with `messageRouter` implicit-join,
Manage Messages permission plan (OQ-7). See `addendum.md`.
- **UX design** — `bmad-ux`: lobby embed layout, group-check scoreboard layout,
countdown rendering, ephemeral roll view, story-status command UX, the
latecomer Join affordance (OQ-6).
- **Epics & stories** — `bmad-create-epics-and-stories`: one epic per feature
(AE) plus a cross-cutting schema/tooling/permissions epic.
---
_Technical detail, platform constraints, schema shapes, and rejected
alternatives live in `addendum.md`. Decisions and overrides are recorded in
`.decision-log.md`._

View File

@@ -0,0 +1,382 @@
# Adversarial PRD Review — mardonar-encounter-engine-2026-06-20
Reviewer stance: cynical, default-to-finding-problems. Reviewed `prd.md` and
`addendum.md` in full, cross-checked key claims against the shipped code in
`src/harness/tools/skillCheckEmit.ts`, `src/bot/handlers/rollHandler.ts`,
`src/bot/embeds/skillCheck.ts`, `src/bot/handlers/messageRouter.ts`, and
`src/spec/loader.ts`.
Findings are grouped: (A) hard contradictions inside the PRD/addendum,
(B) claims that break against real platform constraints, (C) conflicts with
shipped behavior or unstated assumptions that will bite at build time,
(D) success metrics that are not measurable, (E) scope/over-reach concerns.
Severity legend: **[BLOCKER]** must resolve before build; **[HIGH]** will cause
rework or runtime bugs; **[MED]** latent risk; **[LOW]** nit/edge.
---
## A. Hard contradictions inside the PRD/addendum
### A1 — Non-Goals directly contradict FR-24 / OQ-3 / NFR-2 [BLOCKER]
`prd.md` §3 Non-Goals lists:
> "**Lobby max-player cap or auto-expiry.** Lobbies are open until Start or
> Cancel. `[ASSUMPTION]` — see OQ-3."
But FR-24 specifies **both** a spec-defined `maxPlayers` cap **and** lobby
idle-auto-expiry (~30 min default). OQ-3 is marked ✅ RESOLVED with the same
content, and NFR-2 mandates the ~30 min TTL. So the Non-Goals section asserts
both features are out of scope while the FRs, NFR, and resolved OQ all put them
*in* scope. The Non-Goals bullet is stale and the PRD contradicts itself on a
scope boundary. Either delete the Non-Goals bullet or strike FR-24's cap+expiry.
As written, an implementer reading top-down will hit §3 and stop.
### A2 — Addendum §1 lists `grp_adv` / `grp_dis` buttons that FR-43 retires [HIGH]
FR-43 (and FR-39 / NFR-4) explicitly retire the Adv/Dis/Custom-Modifier roll
buttons on **every** skill-check surface, including solo `skill_check_emit`, and
state that advantage/disadvantage is decided **upstream** and shown as a "Roll
Mode field, not chosen by the player." Yet Addendum §1 ("Buttons route through
the global `interactionCreate` handler") lists new customId prefixes including
`grp_adv` and `grp_dis`:
> "new buttons (`lobby_join`, `lobby_start`, `lobby_leave`, `lobby_cancel`,
> `grp_roll`, `grp_adv`, `grp_dis`) add prefixes to that one router."
`grp_adv`/`grp_dis` cannot exist if FR-43 is honored. One of two artifact
sections is wrong. The addendum reads like it was written before the FR-43
decision and not re-synchronized. Architecture will inherit a contradictory
button inventory.
### A3 — FR-6 `visibility: 'private'` has no delivery mechanism [HIGH]
FR-6 declares `visibility` with enum `['group', 'private']` (default `group`).
FR-7/FR-8 fire passive reveals **at encounter start**, after the opening
narrative, with no player interaction in flight. Discord ephemerals can only be
spawned by a user's own interaction (Addendum §1, NFR-5). At encounter start
there is no interaction to attach a private ephemeral to, and thread messages
are group-visible. So a `visibility: 'private'` reveal **cannot be delivered**
under the stated platform constraints. The field is dead schema that will
mislead spec authors into writing reveals no one will see. Either remove
`private` from the enum, or specify a delivery path (e.g. DM the player out of
band — which has its own rate-limit/consent implications, not mentioned).
---
## B. Claims that won't hold against real Discord / Foundry / LLM constraints
### B1 — Foundry "passive abilities per skill" is over-claimed [HIGH]
OQ-1 is marked ✅ RESOLVED with "Foundry exposes passive scores; passive
reveals read them directly," and FR-7 reads "each present player's passive score
for `passiveReveals[].skill`." But Foundry only auto-computes and surfaces
**passive Perception** (and optionally passive Investigation) on the default
actor sheet. There is no canonical "passive Stealth" / "passive Insight" /
"passive Athletics" value exposed the way passive Perception is. FR-6's `skill`
field is a free string, so a spec author can write `skill: Stealth`, and the
engine will ask Foundry for a number that does not exist in a standard form.
Addendum §6 even concedes "Architecture to confirm the exact Foundry
fields/endpoint for passives/equipment/conditions during design" — i.e. the
retrieval shape is **not** confirmed, yet OQ-1 is marked resolved and FR-7/FR-34
assume it is. At minimum the spec-authoring guide must constrain `skill` to the
set Foundry actually exposes, or the engine must fall back to `10 + modifier`
(contradicting Addendum §6's "no client-side 10 + modifier computation").
### B2 — FR-31 advantage/disadvantage from Foundry conditions is unresolved [HIGH]
FR-31 asserts the bot reads "any Foundry conditions/features that grant
advantage or disadvantage on relevant checks (feeding FR-43)," and FR-43 makes
that the *sole* source of advantage alongside LLM emit args (player can no longer
choose). Mapping Foundry active effects to "grants advantage on skill X" is
non-trivial: AE advantage is expressed via `flags.midi-qol.advantage.all` /
`flags.dnd5e.advantage.ability.check.all` and per-skill keys, varies by module
(Midi-QOL vs raw D&D5e), and condition → skill mapping is not 1:1. Addendum §6
punts the exact fields/endpoint to architecture. So the **one** mechanism FR-43
gives for non-LLM advantage is unvalidated, while simultaneously removing the
player's ability to choose Adv/Dis. If the Foundry-AE read is not feasible,
players will have lost the Adv/Dis buttons (current `sc_adv`/`sc_dis`/`sc_mod`
buttons confirmed in `src/bot/embeds/skillCheck.ts:53-72`) and gained nothing.
### B3 — Restart-recovery for timed checks can hang the thread, contradicting SM-4 [HIGH]
FR-5 / NFR-2 say timed checks fail-closed on restart. Addendum §5 then narrows
this: "On restart, any `session:{threadId}` with a `pendingSkillCheck` whose
deadline has passed is resolved as a timeout failure **when the session is next
touched**." If the player walks away after a restart and never sends another
message, the session is never "touched," the pending check is never resolved,
and the thread remains in pending state forever — the LLM scheduler bails on
pending (`messageRouter.ts:256-259`). SM-4 asserts "no pending check hangs the
thread indefinitely." That metric is **falsified** by the documented
restart-recovery behavior. Either add a restart sweep that resolves all
post-deadline pending checks on boot, or downgrade SM-4 to "no pending check
hangs the thread *except after a bot restart with no further activity*."
### B4 — `PENDING_ROLL_LIMIT` still runs during timed checks, racing the timer [MED]
Today, while a `pendingSkillCheck` is set, each non-roll player message
increments `pendingSkillCheckAttempts` and auto-fails after `PENDING_ROLL_LIMIT`
skipped messages (`messageRouter.ts:141-173`). FR-2 puts a timed check into the
same pending state. The PRD never disables the message-count auto-fail path for
timed checks. So a timed check with `durationSeconds: 60` could be auto-failed
by `PENDING_ROLL_LIMIT` after N stray messages *before* the 60s timer fires,
with a different `[SKILL CHECK RESULT]` shape than FR-4's "(timer expired)"
outcome. FR-4 claims timer expiry is the failure path; the legacy path is still
armed. The two failure modes will produce conflicting system messages and
GraphMCP events.
### B5 — NFR-5 conflates per-interaction 15-min window with a per-check window [MED]
NFR-5: "group checks complete within the Discord interaction-token window (~15
min) or fall back to central-embed-only updates." But each player's ephemeral is
spawned by **that player's own Roll click** (FR-13); each click starts its own
15-min token window. There is no single "interaction-token window" for the
check as a whole. A group check that runs 20 minutes is fine for any player who
clicks at minute 19 — their ephemeral is valid until minute 34. The "fall back"
clause is ambiguous: fall back to **what** for late ephemerals? The central
embed can be edited via `message.edit()` (not interaction-bound) indefinitely,
so the real constraint is only on ephemeral replies, which are per-click anyway.
Rewrite NFR-5 against the actual per-interaction model, or it will be
mis-implemented as a hard 15-min group-check cap.
### B6 — FR-14 debounce to ≤1 edit/sec creates multi-second scoreboard lag [MED]
NFR-1 targets ≤1 edit/sec to respect Discord's per-route edit limits. FR-14
coalesces edits. With N players rolling in a burst (the realistic case —
everyone clicks Roll within a second or two of the embed appearing), the
central scoreboard will show stale results for up to N seconds before settling.
CM-3 ("edits throttled by Discord rate limits") is addressed, but the UX lag is
not mentioned anywhere. Players will see "my ephemeral says 18, the board still
shows me as 'pending'" and re-click. No FR covers showing a per-player
"submitted, awaiting board update" state in the ephemeral itself.
### B7 — Finalizing the central embed on timeout has no interaction context [MED]
FR-16 finalizes the central embed "when all targeted players have rolled, or the
timer expires." On timer expiry there is no fresh ButtonInteraction to call
`interaction.update()` on (all prior tokens likely expired). The bot must use
`channel.messages.edit(messageId, …)` (non-interaction edit). Addendum §1 only
discusses `interaction.editReply`/`update` for ephemerals and doesn't state the
non-interaction edit path for the central embed on timeout. It is technically
fine, but unspecified — and it is the same path needed for restart-recovery
(B3). Worth making explicit so architecture doesn't assume an interaction is
always available.
---
## C. Conflicts with shipped behavior / hidden assumptions
### C1 — FR-43 retires player-chosen Adv/Dis/Custom-Modifier on the *existing* solo tool [HIGH, called out but underspecified]
Confirmed in code: `src/bot/embeds/skillCheck.ts:53-72` ships `sc_roll`,
`sc_adv`, `sc_dis`, `sc_mod` (Custom Modifier), and `sc_roll_m:*` /
`sc_adv_m:*` / `sc_dis_m:*` variants; `rollHandler.ts:97-181` handles the modal
modifier flow. FR-43 / FR-39 / NFR-4 retire all of these on every surface,
leaving a single player-locked `Roll` button. NFR-4 does call this out as
"intentional behavior change to shipped behavior" — good. **But** the PRD
doesn't specify what happens to the **modifier-flow modal** (`sc_mod_modal`) and
the custom-modifier roll buttons (`sc_*_m:*`). Are players expected to know
their modifier is auto-resolved? Today a player who doesn't trust the bot's
Foundry lookup can enter their own modifier via the modal; FR-43 removes that
escape hatch with no replacement. If the Foundry modifier lookup is wrong (and
`skillCheckEmit.ts` already logs "modifier lookup failed, continuing without"
as a real path), the player now has **no** way to correct it mid-roll. This is a
player-facing regression hidden behind a "simplification" frame.
### C2 — FR-32 gives the LLM an unconstrained persistent-state mutation tool [HIGH]
FR-32 lets the LLM **set** story-driven status via a tool, and FR-35 persists it
across encounters with a ~24h TTL. There is no vocabulary guard: the LLM can
invent any `label` ("dead", "flying", "king", "in love with the auctioneer"),
and it will stick for 24h across every subsequent encounter until cleared. The
PRD lists `sick`, `cursed`, `disguised` as examples but does not constrain the
space. responseFilter (NFR-3) only stops raw `tool_call` JSON leaking to
players; it does nothing to stop the LLM from setting absurd or harmful status.
At minimum: an enum/allowlist of status labels, a DM-approval gate for
LLM-set status, or a scope note that the LLM can only **suggest** status the DM
must confirm. None present.
### C3 — FR-35's 24h TTL will auto-clear permanent story state mid-arc [MED]
A "cursed" or "disguised" status is often a multi-session story arc. A 24h TTL
auto-clears it regardless of arc pacing. The PRD frames story status as
short-lived ("sick") but the examples include "cursed" which is not. The DM can
re-set, but if no one notices the auto-clear, narration will silently revert to
"healthy" mid-arc. CM-6 acknowledges "stale story status" as a counter-metric
but the inverse (premature clear of still-valid status) is not mentioned. Either
the TTL must be per-status configurable, or the auto-clear must require DM
confirmation, or the vocabulary must exclude long-arc statuses.
### C4 — LLM-emit advantage vs Foundry-granted advantage merge is unspecified [MED]
FR-43 says advantage is decided by "the story/DM (LLM emit args) **and** the
character's Foundry stats (FR-31)." What if the LLM emits `advantage: false` but
a Foundry condition grants advantage? What if the LLM emits `disadvantage: true`
and Foundry grants advantage (D&D: they cancel → normal)? The merge rule
(D&D 5e: simultaneous adv+dis = normal, no stacking) is not stated. The bot
computes the roll mode; an implementer will pick one rule silently. This is
load-bearing for roll correctness and should be an FR, not an architecture
detail.
### C5 — Starter's roster membership in group encounters is unspecified [MED]
Today the starter is **not** auto-added to `session.players` (Addendum §2;
`messageRouter.ts:189-205` only adds on first chat). For group encounters,
FR-21 says players press **Join** to be added to the roster; FR-26 says joined
players are added before the first LLM turn. The UJ-1 protagonist Mary is the
DM, not necessarily a player. Does the starter auto-join, or must they press
Join too? If the starter is a player and must Join, the lobby embed's Start
button (which they need to press) is disabled until `minPlayers` joined —
including themselves? The PRD never says. Hidden assumption that will surface as
a UX bug the first time a DM tries to play in their own group encounter.
### C6 — FR-8 posts `revealText` once per qualifying player → duplicate narration [MED]
FR-8: "For each player whose passive score ≥ threshold, the bot posts
`revealText` to the thread, attributed to that player." `revealText` is a single
spec-authored string. If three players all have passive Perception ≥ 16, the
**same** revealText posts three times, attributed to three different players:
"Zara notices a small button…", "Kay notices a small button…", "Len notices a
small button…". That is narratively broken. Either the spec needs a templated
`{player}` token, or the bot posts once with all qualifying players listed, or
only the first qualifier triggers. None specified.
### C7 — Latecomer Join via "lobby embed Join button live after Start" is UX-contradictory [LOW/MED]
FR-29 says latecomer Join is available via "a persistent Join button on the
lobby embed (live after Start) **and** a `/encounter join` slash command."
FR-22 says pressing Start "begins the encounter (opening narrative posted)."
So after Start, the encounter is running, the opening narrative is in the
thread, and the lobby embed — titled "needs 3, 3 joined" with a Join button —
is supposed to stay visible as a latecomer affordance. The embed's framing
("lobby", "joined count") no longer matches the running state. Is the embed
re-rendered post-Start with new copy? Not specified. The `/encounter join`
command is the cleaner path; the "live lobby embed after start" reads like a
leftover from OQ-6 resolution that conflicts with FR-22's "lobby phase ends on
Start."
### C8 — FR-11 LLM-chosen `players` set can target non-joined/latecomers [LOW/MED]
FR-11 lets the LLM specify `players` (or "all in story so far"). The LLM does
not know the live roster boundary reliably (tool-call reliability, NFR-3). It
can name a non-joined latecomer, or a player who left. FR-12 says default is the
roster, but the explicit `players` path has no guard that the named set ⊆
roster. Behavior on a bad name is unspecified (drop silently? error the tool
call? narrate failure?). This will produce "skill check targets player who
isn't in the scene" bugs.
### C9 — responseFilter is trusted as the sole leak guard for new result paths [MED]
NFR-3 / FR-37 lean on `responseFilter.ts` (confirmed: `LEAKED_TOOL_CALL_RE`,
`ROLL_CLAIM_RE`, `SYSTEM_TAG_RE`, `detectMissedSkillCheck`) as the last-line
defense so group/timed outcomes and `status_set`/`status_clear` tool results
never leak raw JSON. The current filter was built for the single `skill_check`
path. New result shapes (group breakdown with per-player numbers, successRule
name, status labels, timer-expired markers) are new content the existing regexes
were not designed against. A group `[SKILL CHECK RESULT]` containing
`successRule: "n_of_m"` or a JSON-ish per-player breakdown could slip past
`LEAKED_TOOL_CALL_RE` (which targets `tool_call`/`function` shapes, not arbitrary
structured result strings). NFR-3 asserts the guard; it does not mandate
**extending the filter** for the new payloads. Add a FR: "responseFilter is
extended and unit-tested against every new result payload shape before merge."
---
## D. Success metrics that aren't measurable
### D1 — SM-2 "zero LLM ambiguity in the computed outcome" [MED]
"Zero ambiguity" is not measurable as written — ambiguity vs what? The
underlying measurable claim is "the bot, not the LLM, computes the group outcome
from `successRule`, and the `[SKILL CHECK RESULT]` message contains the
bot-computed outcome." Rewrite as: "100% of group-check `[SKILL CHECK RESULT]`
system messages contain a group outcome matching the deterministic
`successRule` computation; the LLM never emits a conflicting outcome in the
following narration (auditable via GraphMCP)." As written, an implementer cannot
write a test against "zero ambiguity."
### D2 — SM-4 vs restart-recovery (see B3) [HIGH]
SM-4 "no pending check hangs the thread indefinitely" is falsified by the
documented "resolved when the session is next touched" restart-recovery path.
Not measurable as a hard invariant until B3 is fixed.
### D3 — SM-5 "the LLM receives enriched character context each turn" [LOW]
Measurable in principle (assert the system prompt contains passives/equipment/
conditions/status block), but the PRD gives no target volume or prompt-size
budget. Addendum §10 warns "equipment lists can be large; consider
per-player summarization/truncation (architecture call)." So "receives enriched
context" could silently degrade to "receives a truncated summary" and still pass
SM-5. Pin a measurable floor (e.g. "passive Perception, all active conditions,
all story status, and at least N equipment items per player appear in the
prompt") or accept that SM-5 is a capability claim, not a metric.
### D4 — CM-1 through CM-6 are listed but none have thresholds [LOW]
Counter-metrics are names without numbers. "Lobby abandonment (joined but never
started)" — at what rate is it a problem? "False-positive message deletions" —
what threshold triggers action? Without targets these are observations, not
metrics. Acceptable for a draft PRD, but flag them as "to be baselined" rather
than implying they're operational.
---
## E. Scope / over-reach
### E1 — Big-bang shipment of all 5 capabilities [MED]
The PRD states "Capabilities 14 ship together because the lobby produces the
multi-player roster that group checks operate on… capability 5 enables passive
reveals (2)." So **all five** ship as one. This is a large surface: new
EncounterSpec schema (FR-38), two new tools + one extended tool (OQ-2), Redis
lobby state + rehydration (FR-21/FR-25), Redis story-status store (FR-32),
Foundry retrieval extension for passives/equipment/conditions/advantage-AE
(FR-31), lobby embed + buttons, group scoreboard embed + buttons, timed-check
timer + restart fail-closed, message regulation + Manage Messages permission
ops change (OQ-7), GraphMCP event ingestion for ~10 new event types (FR-30),
and a player-facing regression on the solo skill-check buttons (FR-43). No
phasing, no behind-flag rollout, no v1-scope cut. For a "small build" (the
stated reason for rejecting durable timers), this is contradictory — the
durable-timer savings are a rounding error against this surface. Recommend
splitting: Phase 1 = Feature E + B (passive reveals, no group); Phase 2 =
Feature D (lobby) + A (timed solo); Phase 3 = Feature C (group checks). Each is
independently shippable and testable.
### E2 — `/encounter join` slash command is a new command, not listed in FR-38's schema work [LOW]
FR-29 introduces a `/encounter join` slash command. FR-38 lists schema/doc work
but not new slash-command registration. The Downstream Handoff mentions UX for
"the latecomer Join affordance" but no epic line for a new command registration
+ permission scope. Minor, but it is new shipped surface that needs its own
story.
### E3 — Manage Messages permission requires ops re-invite (OQ-7) — not a PRD-blockable, but a deployment gate [LOW]
OQ-7 ✅ RESOLVED concedes the bot today only deletes its own messages and must
be re-invited with Manage Messages scope. NFR-7 mandates safe degradation if
missing. This is a real-world deployment gate that can block the entire
message-regulation feature (FR-28/FR-29/FR-42) in production even after the code
ships. The PRD should list "ops re-invite with Manage Messages scope" as an
explicit rollout prerequisite in §9 Downstream Handoff, not just an OQ
resolution. Otherwise the feature ships "green" in CI and silently no-ops in
prod.
---
## F. Minor / nits
- **F1** FR-24 "configurable period (default ~30 min)" — where is the config
key defined? No FR specifies the config surface (env var? spec field? guild
setting?). Architecture will pick, but the PRD should at least say "config
key `LOBBY_IDLE_TTL_MIN`, env-sourced" or similar.
- **F2** FR-36 requires new tools be in `VALID_TOOL_NAMES` and the tool contract
manifest, and Addendum §4 notes the specs-tools consistency test will fail the
build if a spec lists an unregistered tool. Good — but no FR requires
**adding the new tools to the test's registered set before any spec opts in**.
Ordering hazard: a spec listing `skill_check_group_emit` merged before the
tool registration PR will break CI.
- **F3** Addendum §5 stores `starterDiscordId` in lobby Redis state, but no FR
uses it after FR-23 (Cancel "available to the starter"). Confirm Cancel is
starter-locked — `starterDiscordId` is otherwise dead state.
- **F4** FR-40 logs `msg_deleted` but NFR-7 says if Manage Messages is missing
the bot "log + skip deletion." So `msg_deleted` never fires in degraded mode.
Add a distinct `msg_delete_skipped` event so observability can distinguish
"no violations" from "no permission to detect violations."
- **F5** UJ-1 says "the system prompt accounts for all three joined players; the
LLM weaves them in (no auto 'entered' announcement)." This is a narrative
expectation placed on the LLM with no FR backing it — FR-27 only suppresses
the bot announcement; nothing guarantees the LLM actually weaves joiners in.
Add a prompt-builder requirement or downgrade UJ-1 to "expected, not
guaranteed."
---
## Verdict
The PRD is internally coherent on intent but carries at least one BLOCKER
contradiction (A1: Non-Goals vs FR-24/OQ-3/NFR-2), two HIGH contradictions
that will produce wrong code (A2: addendum button inventory vs FR-43; A3:
`private` visibility undeliverable), and a cluster of HIGH Foundry/LLM
over-claims (B1/B2, C2) that are marked "RESOLVED" in OQs but explicitly punted
to architecture in the addendum. SM-4 is falsified by the documented
restart-recovery path (B3). Recommend: resolve A1A3, downgrade or re-open
OQ-1 (B1) and the FR-31 Foundry-advantage claim (B2), add a FR for
responseFilter extension against new payloads (C9), constrain the LLM
story-status vocabulary (C2), and phase the build (E1) before handing to
architecture.

View File

@@ -0,0 +1,296 @@
# Edge-Case Review — Mardonar Encounter Engine PRD (2026-06-20)
Scope: walk every branching path and boundary in `prd.md` + `addendum.md` and
report **only edge cases the PRD does NOT address**. Cases the PRD already
handles are deliberately omitted.
Severity scale: HIGH (correctness/data-loss/hang), MEDIUM (ambiguous outcome,
likely needs a patch), LOW (polish/spec gap).
---
## E-1 — Bot restart mid **group** check (not timed, not lobby) [HIGH]
`FR-5` explicitly handles restart mid **timed** check (cancel → fail).
`FR-25` explicitly handles restart mid **lobby** (Redis rehydrate).
Nothing handles restart mid **group skill check**. `addendum §5` lists Redis
keys for `session:*`, `lobby:*`, `character:*` and the in-memory timed timer —
group-check roll state (who has rolled, the central scoreboard message id, the
targeted player set, the successRule) is **not** named as persisted. On restart:
- The in-memory group-check object is gone.
- The central scoreboard embed is orphaned (buttons still live, no handler state).
- Every player's ephemeral roll view is unrecoverable (per-interaction tokens
are dead after ~15 min regardless, `addendum §1`).
- `session:*` may still carry a `pendingSkillCheck`, but the group variant's
per-player roll map is not on `SessionState` per the cited shape
(`src/types/index.ts:38-65`).
There is no FR equivalent of "restart cancels the group check as X." The PRD
must state the fail-closed (or rehydrate) behavior for an in-flight group check
on restart, or explicitly defer it the way FR-5 defers timed-check durability.
Affects: FR-11, FR-12, FR-14, FR-16, NFR-2.
---
## E-2 — Untimed group check where a targeted player never rolls [HIGH]
`FR-16`: finalization fires when "all targeted players have rolled, **or the
timer expires**." `FR-17` only defines the no-roll → failure rule for the
**timed** variant.
If the LLM emits a group check with no `durationSeconds` (untimed variant is
clearly permitted — `durationSeconds` is optional in FR-11) and one targeted
player goes AFK / never clicks Roll, **the check never finalizes**: the thread
stays in pending state, the LLM turn never re-schedules, and the existing
`PENDING_ROLL_LIMIT` auto-fail (`messageRouter.ts:141-173`) is per-single-player
logic and does not obviously map to a multi-roller scoreboard.
The PRD needs either (a) a default timer on every group check, (b) an
untimed-group no-roll resolution rule (e.g. after N skipped messages or a
soft-cap), or (c) an explicit DM/`/check force-resolve` escape hatch.
Affects: FR-16, FR-17.
---
## E-3 — Latecomer joins during an in-flight timed group check [MEDIUM]
`FR-29` covers latecomer Join into a **running group encounter** (adds them to
`session.players`). `FR-12` says the group-check target set defaults to "the
current session roster." `FR-17` says any player who has not rolled by expiry is
counted a failure.
Combined, a latecomer who Joins 5s before a timed group check expires is
either (a) silently added to the targeted set and instantly scored as a failure
at expiry (swinging majority/n_of_m), or (b) not added because the target set
was snapshotted at emit. The PRD does not say which. There is no FR stating the
targeted set is frozen at emit time, and no FR excluding latecomers from an
in-flight check. Either reading changes the group outcome.
Affects: FR-12, FR-17, FR-29.
---
## E-4 — `successRule` boundary semantics undefined [MEDIUM]
`FR-15` enumerates `majority | all | n_of_m | sum-threshold` and defines only
`majority` ("≥ half of rollers succeed"). The other three are undefined at
their boundaries:
- **`n_of_m`**: Is `m` the number of **targeted** players or **actual** rollers?
What if `m` > targeted set (spec/LLM error)? What if `n` = 0? Is `n` ≥ or `>`
the count?
- **`sum-threshold`**: Is the sum **≥** or **>** the threshold? A sum exactly
equal to the threshold is a tie the PRD does not resolve.
- **`majority` with even N**: "≥ half" makes 2-of-4 a SUCCESS (2 ≥ 2). That is
a tie by plain-English "majority" and the PRD never confirms it is intended.
For odd N it is unambiguous; the even-N tie is the unhandled case.
`SM-2` promises "zero LLM ambiguity in the computed outcome," which requires
these boundaries to be pinned. None of them are.
Affects: FR-15, SM-2.
---
## E-5 — Two players click Roll near-simultaneously [MEDIUM]
`FR-14` requires central-embed edit **coalescing/debouncing** (rate-limit
concern only). It does not address the **state race**:
- Both interactions read `hasRolled[playerId]` false, both compute a roll, both
write. Discord interactions are async and arrive through the single
`interactionCreate` router (`addendum §1`), but nothing in the PRD says roll
registration is atomic or guarded by a per-player compare-and-set.
- Two edits to the central embed from two interactions in the same tick can
also lose one result depending on debounce implementation (last-write-wins
could drop an earlier player's row).
The PRD should require per-player idempotent roll registration (a player can
only roll once; second click is rejected) and a merge, not overwrite, into the
scoreboard state. Neither is stated.
Affects: FR-13, FR-14, FR-43.
---
## E-6 — Roll click after the ~15-min ephemeral token window [MEDIUM]
`OQ-4` / `NFR-5` resolve the **long** group check by falling back to
"central-embed-only updates" past 15 min. That resolves **display**. It does
not resolve the **interaction**:
- A player who has not yet rolled clicks Roll at minute 16. The button
interaction's token is already expired, so `interaction.reply({ephemeral})`
and `interaction.update()` both fail. Does the roll still register in the
central scoreboard (via a non-interaction path)? Or is the click a no-op?
- If the click is a no-op, that player is permanently locked out of the check
with no recourse, and for a timed check will be scored a failure at expiry
(`FR-17`) through no fault of their own.
The PRD needs a fallback roll path (e.g. `/encounter roll` slash command, or a
fresh non-ephemeral button) once the original interaction token is dead. Not
addressed.
Affects: FR-13, FR-16, FR-17, NFR-5, OQ-4.
---
## E-7 — Story-status TTL expires mid-encounter [MEDIUM]
`FR-35` / `OQ-5`: story status auto-clears after a ~24h TTL. The enriched
context is re-rendered into the system prompt **each turn** (`FR-33`). If the
TTL fires between turn N and turn N+1 of the same encounter, the LLM loses
`status: sick` mid-scene with no `[SYSTEM]` notification — narration will
abruptly stop reflecting the condition (CM-6 risk realized inside a single
encounter, not just across encounters).
The PRD defines no mid-encounter TTL grace, no "status cleared" event surfaced
to the LLM, and no requirement that a status set **during** the current
encounter be exempt from the wall-clock TTL until the encounter ends. The only
clearing paths named are explicit DM/LLM clear (`FR-35`) and the TTL; the
interaction between TTL and an active encounter is unhandled.
Affects: FR-33, FR-35, CM-6.
---
## E-8 — `maxPlayers` cap reached, then a player Leaves [LOW]
`FR-24`: "joining is disabled once the cap is met." `FR-21`: Leave withdraws a
joined player. The PRD never states that the **Join button re-enables** when a
Leave drops the count back below `maxPlayers` (does the disabled state track
the live count, or is it latched once fired?). With the lobby embed rehydrated
from Redis (`FR-25`), the cap enforcement point is the join handler, but the
button-enabled/disabled state on rehydration is unspecified.
Affects: FR-20, FR-21, FR-24.
---
## E-9 — Lobby at exactly `minPlayers`, then a Leave drops below min [LOW]
`FR-20`: Start is disabled until count ≥ `minPlayers`. `FR-22`: once ≥ min,
Start enables. `FR-21`: Leave withdraws. The PRD never states that **Start
re-disables** when a Leave drops count below min after it had enabled. The
natural implementation would, but the spec does not pin it, and a latched
"Start was enabled once, stays enabled" reading is also conformant to the
current text. Also: if Start is pressed in the same tick a Leave lands, which
wins?
Affects: FR-20, FR-21, FR-22.
---
## E-10 — `Manage Messages` missing degrades message regulation entirely, not just crash-safety [MEDIUM]
`NFR-7` mandates safe **crash** degradation (log + skip deletion). But
`SM-6` / `FR-28` / `FR-29` make message regulation a **core guarantee** of
group encounters ("only joined players' messages enter the scene"). With the
permission missing, the encounter runs with **no regulation**: non-joined
latecomer posts flow into the router's implicit-join path (`messageRouter.ts:
189-205`) — which for group encounters is supposed to be bypassed
(`addendum §11`) — and either get implicitly added to the roster (breaking
FR-26's "roster known up front") or narrated as if joined.
The PRD treats the missing-permission case as a logging concern only; it does
not state the deeper consequence (SM-6 is void, the group-encounter invariant
is broken) nor require a fail-closed behavior such as **refusing to start a
group encounter** when the permission is absent. "Degrade safely" currently
means "keep running with the guarantee off," which violates the goal.
Affects: FR-28, FR-29, FR-42, NFR-7, SM-6.
---
## E-11 — No Leave path during a **running** group encounter [MEDIUM]
`FR-21` Leave is described under the lobby flow. `FR-29` only adds a latecomer
**Join** affordance for running encounters (persistent Join button +
`/encounter join`). There is no FR for a joined player **leaving** a running
encounter.
Consequences the PRD does not address:
- A player who wants to drop out mid-encounter has no mechanism; their messages
continue to be accepted (they are on the roster) and they remain a target for
any group check the LLM emits (`FR-12` default = roster), so they become a
guaranteed-failure entry (E-2/E-3 style) for future group checks.
- If a player is kicked/banned from the guild mid-encounter, the roster still
contains them; group checks target a ghost.
Affects: FR-12, FR-21, FR-29.
---
## E-12 — Lobby idle-auto-expiry vs. the persistent post-Start Join button [LOW]
`FR-24`: lobbies idle-auto-expire after ~30 min of no Join/Start activity.
`FR-29`: the lobby Join button stays **live after Start** for latecomers. These
interact: does the 30-min idle timer keep running against the post-Start lobby
embed? If so, the latecomer Join affordance silently disappears ~30 min into
every running group encounter that has had no joins, with no FR stating that
behavior or any grace period. If the timer is cancelled at Start, that is also
unstated.
Affects: FR-24, FR-29.
---
## E-13 — Non-joined clicker on lobby Start / Join buttons [LOW]
`FR-43` explicitly says the **Roll** button rejects non-targeted clickers with a
private ephemeral. The lobby buttons (`lobby_join`, `lobby_start`,
`lobby_leave`, `lobby_cancel`, `addendum §1`) have no stated rejection rule.
Concrete gaps: a non-joined member pressing **Start** when it is enabled (FR-22
says "any joined player may press Start" — but what happens to a non-joined
press?); a non-starter pressing **Cancel** (FR-23 says "available to the
starter" — rejection behavior unstated); a player pressing **Leave** who had
not joined.
Affects: FR-20, FR-22, FR-23, FR-41.
---
## E-14 — Passive-reveal ordering vs. suppressed `[SESSION] entered` & LLM turn 1 [LOW]
`FR-7` fires passive reveals "after the opening narrative is posted." `FR-22` /
`FR-27` suppress the `[SESSION] entered` announcement and schedule the first
LLM turn on Start. The PRD does not pin the ordering between (a) opening
narrative, (b) passive reveal posts, (c) first LLM turn scheduling — in
particular whether the LLM sees the passive reveal posts in its turn-1 context
or whether turn 1 is already in flight before reveals land. If reveals post
**after** the turn-1 prompt is built, the LLM narrates turn 1 ignorant of the
reveal it just caused (`FR-10` says the LLM does not trigger them; it may still
need to **see** them). Minor but affects narration coherence.
Affects: FR-7, FR-10, FR-22, FR-27.
---
## Summary table
| ID | Edge case | Sev | Primary FRs |
|----|-----------|-----|-------------|
| E-1 | Restart mid group check (no persistence/recovery) | HIGH | FR-11/12/14/16, NFR-2 |
| E-2 | Untimed group check, a targeted player never rolls → hang | HIGH | FR-16/17 |
| E-3 | Latecomer joins during in-flight timed group check | MEDIUM | FR-12/17/29 |
| E-4 | `n_of_m` / `sum-threshold` / even-N majority boundaries | MEDIUM | FR-15, SM-2 |
| E-5 | Near-simultaneous Roll clicks (state race + edit merge) | MEDIUM | FR-13/14/43 |
| E-6 | Roll click after 15-min ephemeral token window | MEDIUM | FR-13/16/17, NFR-5, OQ-4 |
| E-7 | Story-status TTL expires mid-encounter | MEDIUM | FR-33/35, CM-6 |
| E-8 | `maxPlayers` cap reached then Leave (Join re-enable) | LOW | FR-20/21/24 |
| E-9 | Lobby at min then Leave below min (Start re-disable) | LOW | FR-20/21/22 |
| E-10 | Missing Manage Messages voids SM-6, not just crash-safety | MEDIUM | FR-28/29/42, NFR-7, SM-6 |
| E-11 | No Leave path during a running group encounter | MEDIUM | FR-12/21/29 |
| E-12 | Lobby idle-auto-expiry vs. persistent post-Start Join | LOW | FR-24/29 |
| E-13 | Non-joined / non-starter clicks on lobby buttons | LOW | FR-20/22/23/41 |
| E-14 | Passive-reveal vs. turn-1 prompt ordering | LOW | FR-7/10/22/27 |
Cases verified **handled** (excluded from the table): restart mid timed check
(FR-5); restart mid lobby (FR-25); passive reveal for a player with no
registered Foundry character (FR-9 + Assumption); group check past the 15-min
ephemeral **display** window (OQ-4/NFR-5 — display only; the interaction side
is E-6).

View File

@@ -0,0 +1,260 @@
# PRD Rubric Review — Group Encounters, New Skill-Check Tools & Character Context
Review date: 2026-06-20
Reviewer: rubric walker (automated)
Subject: `prd.md` + `addendum.md`
Verdict: **PASS with 9 findings** (no blockers; 2 high, 5 medium, 2 low). The PRD
is decision-ready for architecture/epics on the macro level — tool factoring is
pinned (OQ-2 → Option C), Redis key shapes are drafted, file:line entry points
are grounded, and all seven open questions are resolved. The findings below are
local ambiguities and one stale non-goal that should be patched before epics are
sharded, not before architecture begins.
---
## 1. Decision-readiness — can architecture/epics proceed without guessing?
**Mostly yes.** Strong points:
- OQ-2 resolves tool factoring (Option C: extend `skill_check_emit` with
`durationSeconds`, add `skill_check_group_emit`, one `character_status` tool
with `action: set|clear`). The PRD is tool-name-agnostic in FR-1/FR-11 but the
addendum pins the recommendation — architecture can proceed.
- Redis key shapes are drafted (`lobby:{threadId}`, `character_status:…`,
reuse of `session:{threadId}`) with TTLs and restart-recovery notes
(Addendum §5, §10).
- File:line entry points are enumerated for every extension surface
(Addendum §9), so the architect can ground each decision in real code.
- Restart behavior for the three state classes is explicit and divergent:
timed checks = in-memory + fail-closed (FR-5, NFR-2); lobby/story-status =
Redis-backed + rehydrated (FR-25, FR-35). This is the key durability decision
and it is made.
Gaps that force guessing at epic/story level:
- **successRule enum is half-defined.** FR-15 lists `majority | all | n_of_m |
sum-threshold`. `majority` and `all` are self-explanatory; `n_of_m` and
`sum-threshold` are not defined anywhere in PRD or addendum. What are `n` and
`m` — tool args? Spec fields? Is `sum-threshold` a sum of roll totals vs a
target number, or count of successes vs a threshold? Architecture will have to
invent the arg schema. → Finding H-1.
- **`visibility: private` is introduced but undefined.** FR-6 lists
`visibility` with default `group`; Addendum §4 schema allows
`z.enum(['group','private'])`. FR-8/FR-9 only describe group-visible behavior.
What "private" means (DM-only? ephemeral to that player? suppressed?) is
unspecified. → Finding H-2.
- **Advantage/disadvantage provenance is ambiguous.** FR-11 lists
`advantage`/`disadvantage` as optional tool args; FR-43 says adv/dis is
"decided upstream — by the story/DM (LLM emit args) and the character's
Foundry stats (FR-31) — and shown as a Roll Mode field, not chosen by the
player." Precedence when both sources disagree (LLM says advantage, Foundry
condition says disadvantage) is not stated. → Finding M-1.
- **Roster promotion timing.** FR-21 says joined players are "added to the
session roster" on Join; FR-26 says all joined players are added to
`session.players` "before the first LLM turn." Read literally, FR-21 double-
adds. The intent (lobby roster ≠ session roster; promote at Start) is
recoverable from the addendum but the FR wording should be tightened.
→ Finding M-2.
- **"all in story so far" (FR-12) is undefined.** Presumably the current
roster, but the phrase is loose given the lobby/latecomer split.
## 2. Substance — are FRs capabilities (not implementation), with stable global IDs?
**Mostly yes, with implementation leakage.** FR-1 through FR-43 are stable,
unique, and globally identifiable — good for traceability. The majority are
capability-shaped ("the LLM can emit…", "the bot posts…", "the engine
tracks…").
Implementation leakage (low severity — these are reasonable grounding, but
they belong in the addendum or architecture, not the FR body):
- **FR-36** names the tool registry, `VALID_TOOL_NAMES`, and the "tool contract
manifest." This is mechanism, not capability.
- **FR-38** names `src/spec/loader.ts`, `docs/spec-authoring-guide.md`, and the
specs-tools consistency test. File paths in an FR are a smell.
- **FR-41** names the `interactionCreate` handler and customId prefixes.
- **FR-42** names the Discord permission bit (`Manage Messages`).
These read as architecture constraints the PRD is pinning so they aren't
relitigated. That is defensible (NFR-4 precedent: PRDs may call out
non-negotiable implementation guards), but the framing should be "the system
must…" rather than naming files. Low priority; not blocking.
## 3. Strategic coherence — goals / non-goals / scope alignment; no scope creep.
**One real scope-creep finding.** Goals and non-goals are otherwise aligned
with the five capabilities and the four user journeys.
- **§3 Non-Goals vs FR-24 — stale non-goal, un-reconciled.** §3 lists as a
non-goal: "Lobby max-player cap or auto-expiry. Lobbies are open until Start
or Cancel. `[ASSUMPTION]` — see OQ-3." But OQ-3 is now **RESOLVED** in favor
of an optional `maxPlayers` cap **plus** lobby idle-auto-expiry (~30 min),
and FR-24 ships exactly both. So the non-goal text was left pointing at an
OQ that reversed it. The capability itself is fine (it resolved an open
question); the §3 prose is just stale and contradicts FR-24/NFR-2.
→ Finding M-3.
Otherwise: durable-timer non-goal is held (FR-5, NFR-2); multi-scene non-goal
is held (passive reveals at start only, FR-7); Foundry write-back non-goal is
held (only engine-tracked story status is mutable, FR-31/FR-32); spectator
views non-goal is held. No other scope creep.
## 4. Completeness — NFRs, success metrics + counter-metrics, open questions resolved.
**Strong.** Seven NFRs covering performance, reliability, LLM-contract
integrity, backward-compat, platform conformance, observability, and
permissions — each tied to FRs. Six success metrics, six counter-metrics, each
mapping to a failure mode of an SM. All seven open questions marked RESOLVED
with the resolution captured in the addendum.
Gaps:
- **Success metrics are binary capability checks, not measurable targets.**
SM-1 "A DM can start a group encounter that will not begin until minPlayers
have joined" is a yes/no acceptance test, not a metric. There is no target
value (e.g. "lobby abandonment < X%", "group-check edit-throttle events =
0"). The counter-metrics are named but none have thresholds. This is fine
for a v1 PRD whose success is "does it work," but it limits the ability to
detect degradation post-launch. → Finding M-4 (low-ish; acceptable for this
scope).
- **No rollout/adoption metric.** Nothing measures whether group encounters
are actually used (e.g. group encounters started / week, passive-reveal
hit rate). CM-1 (lobby abandonment) is the closest but has no threshold.
## 5. Backward-compat + risk handling — behavior changes called out, not silent.
**Strong — best dimension.** Every behavior change to shipped behavior is
explicitly flagged:
- NFR-4 and FR-39 call out the simplification of the solo `skill_check_emit`
button set to a single player-locked Roll, and explicitly label it "an
intentional behavior change to shipped behavior, called out here rather than
silently regression-risked." This is the textbook way to handle it.
- FR-43 retires Adv/Dis/Custom-Modifier buttons across all surfaces including
the existing solo path — also flagged via NFR-4.
- FR-19/FR-30 pin solo behavior preservation: `minPlayers ≤ 1` runs exactly as
today, implicit-join and `[SESSION] entered` retained.
- FR-39 pins spec backward-compat: omitting `minPlayers`/`passiveReveals` =
today's behavior; Zod strips unknown keys silently today (Addendum §4) so
old specs won't break on schema extension.
- Restart-fail of timed checks is labeled "an accepted trade-off, not a
defect" (NFR-2) — good risk framing.
- Addendum §8 records rejected alternatives with rationale, so the surviving
choices are auditable.
One minor gap:
- **`maxPlayers` introduction has no migration story for existing specs.**
FR-24 makes `maxPlayers` optional with no default stated; FR-18 gives
`minPlayers` a default of 1 but FR-24 doesn't give `maxPlayers` a default
(presumably "unset = uncapped", but that's implicit). Low severity.
→ Finding L-1.
## 6. Internal consistency — FRs vs NFRs vs decision-log vs addendum.
Cross-checks performed:
- **NFR-2 ↔ FR-5/FR-21/FR-25/FR-32/FR-35**: consistent on the three-class
durability model (timed = memory + fail; lobby + story-status = Redis +
TTL + rehydrate). ✓
- **NFR-3 ↔ FR-37 ↔ FR-27 ↔ Addendum §2 (responseFilter)**: consistent on
routing new result paths through `toolDispatcher` + `responseFilter`. ✓
- **NFR-4 ↔ FR-39 ↔ FR-43**: consistent on solo-button simplification. ✓
- **NFR-5 ↔ FR-13 ↔ FR-16 ↔ OQ-4 ↔ Addendum §1**: consistent on the
ephemeral-token-window constraint and central-embed-only fallback. ✓
- **NFR-7 ↔ FR-42 ↔ FR-28/FR-29 ↔ OQ-7 ↔ Addendum §11**: consistent on
Manage Messages requirement + safe degradation. ✓
- **FR-15 ↔ Addendum §4**: successRule is a tool arg, not a spec field —
consistent. But the enum values `n_of_m`/`sum-threshold` are undefined in
both. → H-1 again.
- **FR-6 ↔ Addendum §4**: `visibility` enum consistent, but `private`
behavior undefined in both. → H-2 again.
- **FR-11 ↔ FR-43**: tension on adv/dis provenance. → M-1 again.
- **FR-21 ↔ FR-26**: roster-add timing wording tension. → M-2 again.
- **§3 Non-Goals ↔ FR-24 ↔ OQ-3 ↔ NFR-2**: stale non-goal text. → M-3 again.
- **FR-40 ↔ Addendum §7**: GraphMCP event list consistent and complete
against the new mechanics. ✓
- **FR-38 ↔ Addendum §4**: schema additions and authoring-guide pitfalls
consistent; reference-spec `xpReward`-free invariant preserved. ✓
- **Decision log**: PRD references `.decision-log.md` for overrides; not
reviewed here (out of scope for this pass). The addendum §8 rejected-
alternatives list is internally consistent with the resolved OQs.
No contradictions beyond the findings already enumerated.
---
## Findings (consolidated)
### H-1 [high] — successRule `n_of_m` and `sum-threshold` undefined
FR-15. The enum lists four rules but only `majority` and `all` are defined.
Architecture must invent the arg schema for `n_of_m` (what are n and m? tool
args? where declared?) and `sum-threshold` (sum of roll totals vs target? or
count of successes vs threshold?). Patch FR-15 with one line per rule, or
drop the two under-specified rules to a follow-up OQ.
### H-2 [high] — `visibility: private` behavior undefined
FR-6 + Addendum §4 schema. The enum permits `private` but FR-8/FR-9 only
specify group-visible reveal behavior. "Private" (DM-only? ephemeral-to-
player? suppressed-entirely?) is unspecified and untested by any SM. Either
define private in FR-8 or restrict the v1 enum to `['group']` and defer
private.
### M-1 [medium] — Advantage/disadvantage provenance precedence unclear
FR-11 vs FR-43. FR-11 keeps `advantage`/`disadvantage` as optional tool args
(LLM emit); FR-43 says adv/dis is "decided upstream by the story/DM (LLM emit
args) and the character's Foundry stats (FR-31)." When the two sources
disagree, which wins? Add one sentence to FR-43 stating precedence (e.g.
"Foundry-derived adv/dis is applied unless the LLM emit arg explicitly
overrides").
### M-2 [medium] — Roster promotion timing ambiguous (FR-21 vs FR-26)
FR-21 says joined players are "added to the session roster" on Join; FR-26
says they are added to `session.players` "before the first LLM turn." Read
literally FR-21 double-adds. Clarify FR-21 to "added to the **lobby** roster"
and FR-26 to "promoted from lobby roster to `session.players` before the
first LLM turn."
### M-3 [medium] — Stale non-goal contradicts FR-24 (scope reconciliation)
§3 Non-Goals lists "Lobby max-player cap or auto-expiry" as out of scope and
points at OQ-3, but OQ-3 is RESOLVED to ship exactly that (FR-24, NFR-2).
Update §3 to remove the stale non-goal or reframe it as "hard lobby max-cap
enforced server-wide" vs the per-spec optional cap that did ship.
### M-4 [medium] — Success metrics are binary checks, no thresholds on counter-metrics
§6. SM-1SM-6 are acceptance tests, not metrics; CM-1CM-6 are named but
none have thresholds (e.g. lobby abandonment < X%, stale-status incidents =
0). Acceptable for v1 but limits post-launch degradation detection. Add
targets where measurable, or mark SMs as acceptance criteria explicitly.
### L-1 [low] — `maxPlayers` default unspecified
FR-24. `minPlayers` has default 1 (FR-18); `maxPlayers` is "optional" with
no default stated. Presumably unset = uncapped, but should be explicit in
FR-24 and in the Addendum §4 schema snippet (which currently omits
`maxPlayers` entirely — another small inconsistency).
### L-2 [low] — Implementation detail leaked into FRs
FR-36/FR-38/FR-41/FR-42 name file paths, registry constants, customId
prefixes, and permission bits. These are useful pins but read as
architecture, not capability. Reframe as "the system must register…"
without naming files, or move the file:line pins to the addendum (which
already has §9 for this).
### L-3 [low] — `maxPlayers` missing from Addendum §4 schema snippet
Addendum §4. The proposed Zod snippet includes `minPlayers` and
`passiveReveals` but not `maxPlayers`, even though FR-24 introduces it.
Add `maxPlayers: z.number().int().min(1).optional()` (or similar) to keep
the addendum the authoritative schema sketch.
---
## Summary
The PRD is in good shape: decision-ready on the macro questions, internally
consistent on the durability model and backward-compat handling (the
strongest dimension), and complete on NFRs/metrics/OQs. The two high findings
(H-1, H-2) are localized under-specification in two enums that will force
architecture to guess arg shapes — fix before sharding epics. The medium
findings (M-1..M-4) are wording/precedence/threshold gaps. The low findings
are cleanup. No findings block starting `bmad-create-architecture`.

View File

@@ -0,0 +1,22 @@
# Decision Log — UX: Mardonar Encounter Engine (2026-06-20)
Canonical memory and audit trail for this UX run. Every decision, change, and
override is recorded here as the conversation unfolds.
## Decisions
| Date | Decision | Rationale | Status |
|------|----------|-----------|--------|
| 2026-06-20 | Opened UX run (Create intent) from PRD prd-mardonar-encounter-engine-2026-06-20 | Downstream UX pass for the group-encounters + skill-check tools PRD | open |
| 2026-06-20 | Working mode = Fast path; visual language = extend existing skillCheck.ts family; sources = PRD + skillCheck.ts (start fresh) | User priorities were behavior/feel, not visual reinvention. Prior ux-designs artifact predates abandoned builder direction; not inherited unless user asks. | decided |
| 2026-06-20 | P1 — obvious in-flight join | Latecomer Join must be obvious/discoverable: lobby Join button stays pinned and live after Start; /encounter join always available. | decided |
| 2026-06-20 | P2 — stop and guide, never silent-delete | A non-joined message is deleted AND replaced with a private ephemeral that shows exactly how to join (button + command). No silent removal. | decided |
| 2026-06-20 | P3 — no tornado (thread calm) | Every live surface is ONE embed edited in place (lobby, timer, scoreboard — never re-posted). Per-player views are ephemeral and self-cleaning. Edits coalesced (≤1/s). | decided |
| 2026-06-20 | Timer display = 10s increments + ~10s GIF final stretch | Countdown text updates every ~10s (NOT per-second). Below ~10s the text stops ticking and an animated ~10s-loop hourglass/sand-timer GIF takes over as the urgency cue; embed color ramps pending → urgent → failure at expiry. GIF loop accuracy vs. the real deadline is not required. GIF asset to be sourced/hosted ( setImage/thumbnail). | decided |
| 2026-06-20 | Single player-locked Roll button; retire Adv/Dis/Mod buttons | With player sync (Feature E) + true Foundry stats, the player no longer chooses advantage/disadvantage/modifier. Every skill-check surface (solo, timed, group) shows ONE Roll button locked to the targeted player(s); the handler rejects non-targeted clickers with a private ephemeral. Advantage/disadvantage is decided upstream by story/DM discretion (LLM emit) + Foundry stats granting it, shown as a Roll Mode field. Also simplifies the existing solo skill_check_emit (behavior change to shipped behavior — to reflect back into the PRD). | decided |
| 2026-06-20 | Reviewer gate run (rubric + adversarial + edge-case + a11y); findings resolved | 4 UX reviewers. Resolved: timer final-stretch keeps a text urgency cue alongside the GIF (a11y backstop — embed images have no alt text); P2 graceful-degrades (no Manage Messages → skip delete + still guide; dropped stale 'above' pointer; best-effort for inactive clients); Roll defers + idempotency lock + non-target rate-limit; lobby Cancel any-joined / Join disabled at cap / pre-expiry warning / idle-reset / Begin revalidate; Flow 1 single-Roll fix; a11y example fix; neutral token signal; Flow 4 (status set) added; post-Start lobby layout resolved. Review files: review-rubric.md, review-adversarial.md, review-edge-cases.md, review-accessibility.md. | decided |
| 2026-06-20 | UX finalized — DESIGN.md + EXPERIENCE.md status: final | Spines distilled; key-screen mocks in-spine (ASCII, Discord embeds); open items triaged (post-Start resolved; stat-privacy OQ-8 + no-character noted). Downstream: bmad-create-architecture, bmad-create-epics-and-stories. | final |
## Changes & Overrides
_(Recorded here as they occur.)_

View File

@@ -0,0 +1,255 @@
---
title: "Mardonar Encounter Engine — Discord Embed Visual Identity"
status: final
created: 2026-06-20
updated: 2026-06-20
colors:
pending: 0x5865F2 # blurple — awaiting a player action
success: 0x2ECC71 # green — roll / group succeeded
failure: 0xE74C3C # red — roll / group failed
gathering: 0xF39C12 # warm orange — lobby assembling
urgent: 0xF1C40F # amber — timer running low
notice: 0x9B59B6 # purple — passive reveal / insight
neutral: 0x95A5A6 # gray — informational / disabled
typography:
system: Discord embed default (not author-controlled)
title: "emoji + double-space + label (e.g. ⚔️ Skill Check — {player})"
emphasis: bold for DC, modifier, roll totals, results
asides: italics for narrative prompt text and in-world flavor
rounded:
note: Discord-controlled; buttons follow platform radius
spacing:
fields: inline triples (DC · Modifier · Mode) per existing skill-check embed
rule: one embed per surface — never stacked embeds
timer: countdown lives in a single inline field, updated in place
components:
skillCheckEmbed: src/bot/embeds/skillCheck.ts (existing — the family root)
lobbyEmbed: gathering embed with Join/Leave/Start/Cancel
groupScoreboardEmbed: one edited-in-place scoreboard with Roll/Adv/Dis
timedCheckEmbed: skill-check embed + countdown field + timeout state
ephemeralRollView: per-player private roll result (self-cleaning)
passiveRevealPost: atmospheric notice attributed to a player
statusConfirmation: ephemeral DM/LLM status set/clear ack
---
# DESIGN.md — Mardonar Encounter Engine
## Brand & Style
Mardonar's encounter UI lives inside Discord threads as **embeds, buttons, and
slash commands** — there is no canvas. The visual identity is therefore an
**embed language**: a disciplined color palette, emoji-led titles, terse
in-world microcopy, and a one-embed-per-surface discipline that keeps threads
calm (P3 — no tornado).
Voice is atmospheric, second-person, fate-flavored — an unnamed narrator
addressing the table. Utility jargon ("session", "lobby", "ephemeral",
"registry") never appears in player-facing strings; the in-world register
holds everywhere a player can see it. Existing root language in
`src/bot/embeds/skillCheck.ts`: *"Fate will decide the outcome"*, *"Roll your
dice to determine your fate."* The new surfaces are siblings of that embed.
## Colors
Status is **never color alone** — every color pairs with an emoji + text label
so meaning survives monochrome and screen readers (see Accessibility in
EXPERIENCE.md).
| Token | Hex | Meaning | Paired signal |
|---|---|---|---|
| `pending` | `0x5865F2` | awaiting a player action | ⚔️ / 🎲 + "awaiting" |
| `success` | `0x2ECC71` | a roll or the group succeeded | ✅ + "Success" |
| `failure` | `0xE74C3C` | a roll or the group failed | ❌ + "Failure" |
| `gathering` | `0xF39C12` | lobby assembling, seats open | 🕯️ + "gathering" |
| `urgent` | `0xF1C40F` | timer running low (≤ ~10s) | ⏳ + "Final sands — roll now" |
| `notice` | `0x9B59B6` | a passive reveal / insight | 👁️ + "notices" |
| `neutral` | `0x95A5A6` | informational, disabled, closed | + "informational / closed" |
**Timer display**: the countdown field updates in **10-second increments**
(an edit every ~10s — even calmer than per-second, P3). At ≤ ~10s the number
stops ticking and the field shows a fixed text cue ("⏳ Final sands — roll
now") alongside an **animated ~10s-loop GIF** (hourglass / sand-timer); the
embed color ramps `pending``urgent`. The **text cue is the a11y backstop**
— discord.js embed images take no alt text, so urgency must not depend on the
GIF (which may also fail to load). At expiry the embed flips to `failure`
(`⏰ Time's up`). The embed is the same message throughout — only the countdown
field, color, and image change. The GIF's loop need not match the real
deadline exactly.
## Typography
Discord renders embed text in its platform default; we do not control font.
What we **do** control and hold consistent:
- **Titles**: `emoji + two spaces + Label`. Emoji carries glanceable meaning;
the double-space matches the existing root (`⚔️ Skill Check — {player}`).
- **Description**: the action prompt in *italics* (the narrator's voice).
- **Field values**: bold for mechanical numbers — `DC`, modifier, roll totals,
joined count, seats remaining.
- **Footer**: one short in-world imperative or status line.
## Layout & Spacing
- **One embed per surface.** A lobby, a timed check, and a group scoreboard are
each a single message edited in place through all their states (P3). No
re-posting, no stacked embeds.
- **Inline field triples** for mechanical readouts (DC · Modifier · Mode), per
the existing skill-check embed.
- **Scoreboard rows**: one field per player (`{player} — {result}`), filled in
as rolls arrive; the field set is rewritten on coalesced edits, not appended.
- **Buttons sit in one `ActionRow`** beneath the embed; a second row is used
only when a surface needs both action buttons and a persistent affordance
(e.g. the lobby's Join stays in a dedicated row so it never visually
disappears — P1).
## Elevation & Depth
Discord embeds offer no shadows/elevation. "Depth" is conveyed through:
- **Color** (status tokens above).
- **Footer + timestamp**: a footer line + `setTimestamp` anchors the "now" of a
live embed (lobby open time, check emit time).
- **Button state**: `setDisabled(true)` to rest a button that isn't yet
available (lobby **Start** below min) — a visible, non-color cue.
## Shapes
Discord-controlled (embed side bar = the status color; buttons follow platform
radius). The colored left bar **is** the primary shape signal — which is why
the color ramp and one-embed discipline matter.
## Components
### Skill-check embed (root pattern — simplified)
Reference for every skill-check surface. With player sync (Feature E) the
player no longer chooses advantage/disadvantage/modifier — those are decided
upstream by the story/DM (LLM emit) and the character's Foundry stats, shown
as a **Roll Mode** field. The player gets a **single `Roll` button, locked to
the targeted player**.
```
⚔️ Skill Check — Zara
*Disarm the trap before the mechanism resets.*
⚖️ DC 15
🎯 Modifier +3 (Perception)
🟢 Roll Mode Advantage
[ Roll ]
footer: 🎲 Roll your dice to determine your fate.
```
### Lobby embed (`gathering`)
Edited in place: `gathering` (seats open) → `gathering` (min met, Start
enabled) → `neutral`/closed on start. The **Join** button lives in its own row
and stays live after Start (P1).
```
🕯️ The Party Gathers — Velvet Auction
*The auction hall fills. Take your place while seats remain.*
Seats 3 of 3 minimum met (cap 5)
Joined Zara · Kay · Mary
Status Ready — any joined player may begin
[ Join ] [ Leave ]
[ ▶ Begin ] [ Cancel ]
footer: 🕯️ Press Join to take your seat — or /encounter join. • closes if idle ~30m
```
- **Begin** (`ButtonStyle.Success`) is `setDisabled(true)` until joined ≥ min;
the handler **revalidates** the count at click time (a Leave dropping below
min between render and click does not start the encounter).
- **Join** is `setDisabled(true)` once joined ≥ `maxPlayers` (cap reached).
- **Cancel** (`ButtonStyle.Secondary`); available to **any joined player** (not
starter-only — the starter may have left the server).
- **Pre-expiry warning**: ~5 min before idle-auto-expiry the embed edits to a
`neutral` warning line; idle resets on any Join/Leave/Start activity.
- After Start, the embed flips to a short `neutral` "The gathering has set out"
with **Join** still in its row for latecomers (P1).
### Group-check scoreboard (`pending` → `success`/`failure`)
One embed, edited in place as rolls arrive (P3). A single **`Roll`** button,
locked to targeted players — the clicker must be a targeted player who hasn't
rolled. Each row shows the player's roll total and modifier (`rolled 16 +3`);
**open rolls** is the default — hiding modifiers is an open question (OQ-8).
```
⚔️ Group Check — Stealth
*Slip the party past the sentries. Each must roll.*
⚖️ DC 13 ⏳ Time 47s
Rolled
Zara — ✅ 19 (rolled 16 +3)
Kay — …awaiting
Mary — …awaiting
[ Roll ]
footer: ⚔️ The party faces the trial together.
```
On resolve (group SUCCESS, `success`):
```
⚔️ Group Check — Stealth ✅ The party prevails
⚖️ DC 13 ⏳ Resolved
Zara ✅ 19 Kay ❌ 8 Mary ✅ 15
Rule: majority (2 of 3) — SUCCESS
```
### Timed-check embed (`pending` → `urgent` → `failure`/`success`)
Skill-check embed + a countdown field, edited in place (P3). The countdown
updates in **10-second increments**; below ~10s the number stops and an
animated ~10s-loop hourglass GIF takes over (embed image/thumbnail).
```
⚔️ Skill Check — Kay ⏳ Time ~0:20
*Disarm the trap before the mechanism resets.*
⚖️ DC 15 🎯 Modifier +2 (Thieves' Tools)
[ Roll ]
footer: ⏳ The sands run out. Roll before time slips away.
```
Final stretch (`urgent`, ≤ ~10s — text stops, GIF runs):
```
⚔️ Skill Check — Kay ⏳ Final sands — roll now [ ⏳ hourglass GIF ]
⚖️ DC 15 🎯 Modifier +2 (Thieves' Tools)
[ Roll ]
footer: ⏳ The sands run out. Roll before time slips away.
```
Timed out (`failure`, `⏰`):
```
⏰ Time's Up — Kay
*The mechanism springs before the lock gives way.*
⚖️ DC 15 Result ❌ FAILURE (timer expired)
```
### Ephemeral roll view (`neutral`, per-player, self-cleaning)
Private to the clicker. Short, no buttons, no thread noise (P3).
```
🎲 Your Roll — Stealth
d20 16 + mod 3 = 19 vs DC 13
✅ Success — the sentries do not see you.
```
### Passive-reveal post (`notice`, public, attributed)
Atmospheric, group-visible, attributed to the qualifying player (FR-8).
```
👁️ A Detail Revealed
Zara's keen Perception catches what others miss.
A small button is set into the wall behind the tapestry.
```
### Status confirmation (`neutral`, ephemeral)
DM command / LLM-tool ack. Private, terse.
```
📜 Status Set
Zara is now marked: *sick* (clears in ~24h, or when cleared)
```
## Do's and Don'ts
**Do**
- Edit one embed in place across all states (P3).
- Pair every color with an emoji + text label (a11y).
- Keep **Join** persistently visible on the lobby row (P1).
- Replace a blocked non-joined post with a guiding ephemeral (P2).
- Use in-world microcopy in every player-facing string.
- Use a single player-locked `Roll` button — advantage/disadvantage is decided
upstream (story/LLM/Foundry), not by the player.
**Don't**
- Re-post a lobby, timer, or scoreboard to "update" it.
- Rely on color alone to signal success/failure/time.
- Show utility terms ("lobby", "session", "ephemeral", "registry") to players.
- Stack embeds or post per-roll messages into the thread.
- Let the timer tick the embed every second — use 10s increments, then a ~10s
GIF for the final stretch (rate-limit risk + thread noise).
- Offer Advantage/Disadvantage/Custom-Modifier roll buttons — player sync makes
them redundant; adv/dis is decided upstream.

View File

@@ -0,0 +1,265 @@
---
title: "Mardonar Encounter Engine — Experience & Interaction"
status: final
created: 2026-06-20
updated: 2026-06-20
---
# EXPERIENCE.md — Mardonar Encounter Engine
## Foundation
- **Form-factor**: Discord threads — embeds, buttons, slash commands
(discord.js v14). No web/mobile canvas. The UI system is Discord's embed
component vocabulary.
- **Visual identity reference**: `DESIGN.md` (embed color tokens, emoji-led
titles, one-embed-per-surface layout). Behavioral specs live here; visual
specs live there.
- **Three governing principles** (user-stated, override any conflicting detail):
- **P1 — Obvious in-flight join**: a latecomer can always see how to join a
running encounter.
- **P2 — Stop & guide, never silent-delete**: a blocked non-joined post is
removed *and* replaced with a private "how to join" ephemeral.
- **P3 — No tornado**: live state is one embed edited in place; per-player
views are ephemeral and self-cleaning; edits coalesced.
## Information Architecture
Seven surfaces, all inside the encounter thread (plus private ephemerals):
| # | Surface | Visibility | Lifespan | See |
|---|---|---|---|---|
| 1 | Lobby embed | public thread | until start/cancel/expiry | DESIGN `components.lobbyEmbed` |
| 2 | Group-check scoreboard | public thread | one check, edited in place | DESIGN `components.groupScoreboardEmbed` |
| 3 | Timed-check embed | public thread | one check, edited in place | DESIGN `components.timedCheckEmbed` |
| 4 | Ephemeral roll view | private (per-player) | seconds, self-cleaning | DESIGN `components.ephemeralRollView` |
| 5 | Passive-reveal post | public thread | permanent ( encounter log) | DESIGN `components.passiveRevealPost` |
| 6 | Status confirmation | private (DM/actor) | seconds, ephemeral | DESIGN `components.statusConfirmation` |
| 7 | Latecomer Join affordance | public button + slash command | persistent | §Interaction Primitives |
**Closure**: every PRD need has a surface. Lobby → roster (FR-21/26);
scoreboard → group checks (FR-1117); timed embed → timed checks (FR-15);
ephemeral view → per-player rolls (FR-13); passive post → reveals (FR-8);
status confirmation → story-status set/clear (FR-32); Join affordance →
latecomers (FR-29). Non-joined guidance → P2.
## Voice and Tone
Atmospheric, second-person, fate-flavored — the unnamed narrator addressing the
table. Mechanical readouts stay terse and bold; flavor stays italic and brief.
Utility jargon is forbidden in player-facing strings (matches the existing
in-world-voice rule). Examples by surface are baked into the
`DESIGN.md.Components` mockups; the voice lives there.
## Component Patterns (behavioral)
### Lobby embed
- **Buttons**: `Join` (Success, own row, persistent; `setDisabled` at
`maxPlayers` cap), `Leave` (Secondary), `Begin` (Success, `setDisabled` until
joined ≥ min), `Cancel` (Secondary, **any joined player** — not starter-only).
- **Begin** enables the instant joined ≥ `minPlayers`; the handler **revalidates
the count at click time** (a Leave below min between render and click does not
start the encounter); disables again if a player Leaves below min.
- **Join** stays in a dedicated row and remains live after Start (P1) so
latecomers always see it.
- **Pre-expiry warning** ~5 min before idle-auto-expiry; **idle resets** on any
Join/Leave/Start activity.
- The embed is edited in place for every state change (P3).
### Group-check scoreboard
- **Buttons**: a single **`Roll`**, locked to targeted players — the clicker
must be a targeted player who hasn't rolled. Advantage/disadvantage is
decided upstream (story/LLM + Foundry stats), shown per row, not chosen by
the player.
- On a click: the bot **defers** the interaction, takes an idempotency lock
(FR-45), then the clicker gets an ephemeral roll view; the scoreboard's row
for that player is updated on the next coalesced edit (P3, ≤1 edit/s).
- One roll per player; a second click by the same player is rejected —
near-simultaneous clicks cannot drop or duplicate a roll (atomic registration).
- A latecomer who Joins during an active group check is **not** added to that
check's target set (FR-17) — they roll in subsequent checks only.
### Timed-check embed
- **Buttons**: a single **`Roll`**, locked to the targeted player.
Advantage/disadvantage is decided upstream (story/LLM + Foundry stats) and
shown as a Roll Mode field — not a player button choice.
- **Countdown field** updated in place in **10-second increments** (not
per-second — P3). Below ~10s the number stops and a fixed text cue
("⏳ Final sands — roll now") shows alongside the ~10s-loop hourglass GIF —
the text cue is the a11y backstop (embed images take no alt text); color ramps
`{colors.pending}``{colors.urgent}``{colors.failure}` at expiry. The
GIF loop need not match the real deadline.
### Ephemeral roll view
- Private to the clicker; no buttons; no thread message (P3). Shows
`d20 + mod = total vs DC` and the result.
### Passive-reveal post
- Posted once at encounter start per qualifying player; public, attributed,
atmospheric. Not edited.
### Status confirmation
- Ephemeral ack to the DM (command) or a `[SYSTEM]` ack to the LLM (tool); never
public unless the LLM chooses to narrate it.
## State Patterns
### Lobby states
`assembling` (joined < min, Begin disabled; Join disabled at `maxPlayers` cap)
`ready` (joined ≥ min, Begin enabled) → `started` (embed flips to a brief
"set out" line; Join persists) | `cancelled` (any joined player) | `expiring`
(~5 min warning) → `expired` (idle ~30m → thread closed; re-openable via
`/encounter start`).
### Check states (single + group + timed)
`pending` (awaiting roll(s)) → `filling` (group: some rolled) →
`resolved-success` | `resolved-failure` | `timed-out`. The single embed
transitions through these by edit, never re-post.
### Button enablement
- `Begin`: enabled iff joined ≥ min.
- `Roll`: enabled while the check is `pending`/`filling` and the clicker is a
targeted player who hasn't rolled yet.
- `Join`: enabled while the encounter is live and joined < `maxPlayers` (P1);
disabled at cap.
- `Leave`: enabled iff the clicker is currently joined.
### Ephemeral lifecycle
- Roll view: posted on click, self-cleaning (Discord ephemeral; no delete
needed). No follow-up messages into the thread.
- Guidance (P2): posted on a blocked non-joined message; ephemeral to the
sender.
## Interaction Primitives
- **Roll button → player-locked**: every skill-check surface (solo, timed,
group) has a single **`Roll`** button. The handler **defers the interaction
first** (Foundry/LLM lookups may exceed the 3s ack window), checks the clicker
is the targeted player (or, in a group check, a targeted player who hasn't
rolled), and takes an **idempotency lock** (FR-45) so a double-click can't
roll twice. Anyone else gets a private ephemeral *"This roll is not yours to
make"* — **rate-limited**, not flooded, on repeated non-target spam.
Advantage/disadvantage is decided upstream by the story/DM (LLM emit) and the
character's Foundry stats — never by a player button choice. On a valid click
the player gets a private ephemeral roll view and the public embed updates on
the next coalesced edit tick (P3). One button, one thread message per roll.
- **Slash command → ephemeral confirm**: `/encounter join`, `/encounter start`,
`/character status set|clear` reply ephemerally to the caller.
- **Non-joined message → stop & guide (P2)**: delete the message, then send the
sender a private ephemeral: *"This gathering is not yet yours to join. Press
**Join** on this encounter's gathering post, or use `/encounter join`."*
Never silent. **Graceful degradation**: if the bot lacks Manage Messages
(NFR-7), it skips the delete but still sends the guidance ephemeral (the
message stays; regulation is best-effort). Guidance is **best-effort for
inactive clients** — Discord does not queue ephemerals to offline users, so
the delete still regulates posting even if the ephemeral lands only when the
user is active.
- **Latecomer Join (P1)**: two always-available paths — the persistent **Join**
button on the lobby embed (its own row, live after Start) and
`/encounter join`. Both add the player to the roster; the LLM weaves them in
(no auto "entered" announcement — FR-27).
- **Timer → edit-in-place**: the countdown lives in one field on the existing
check embed, updated in **10-second increments**; below ~10s the text stops
and a ~10s-loop hourglass GIF takes over (color ramps to `{colors.urgent}`).
On expiry the bot finalizes the embed and pushes the result to the LLM
(FR-4/FR-16). No per-second edits.
- **customId prefixes** (routed via the global `interactionCreate` handler):
`lobby_join`, `lobby_leave`, `lobby_start`, `lobby_cancel`, `grp_roll`,
`sc_roll`. The `sc_adv` / `sc_dis` / `sc_mod` / `grp_adv` / `grp_dis`
prefixes are retired — single player-locked `Roll`.
## Accessibility Floor
- **Color is never the only signal**: every status pairs color with an emoji +
text label (`✅ Success`, `❌ Failure`, `⏰ Time's up`, `⏳ Final sands`). See
`DESIGN.md.Colors`.
- **Button labels are descriptive** ("Begin", "Join", "Roll") — not icon-only.
- **Guidance is text-clear**: the P2 ephemeral names both join paths in plain
text.
- **No motion-dependent meaning**: the timer's final stretch keeps a **text
urgency cue** ("⏳ Final sands — roll now") alongside the GIF — urgency does
not depend on the GIF animation (which has no alt text and may fail to load).
The countdown number conveys meaning while it ticks.
- **Ephemerals are private** — guidance and roll views never expose one
player's info to another.
## Key Flows
### Flow 1 — Mary starts a group heist (climax: the group check resolves)
1. Mary runs `/encounter start velvet-auction` (`minPlayers: 3`). The bot posts
the **lobby embed** (`{colors.gathering}`): seats 0/3, `Begin` disabled.
2. Zara, Kay press **Join**; the embed edits to 2/3. A visitor posts "hi"
without joining → **deleted + guided** (P2 ephemeral).
3. Mary presses **Join** → 3/3, `Begin` enables. Mary presses **Begin**.
4. The lobby embed flips to "set out"; **Join stays live** (P1). The opening
narrative posts; the roster is in the prompt (FR-26); passive reveals fire
(`{colors.notice}`).
5. The LLM emits a **group Stealth check** (60s, `majority`). The **scoreboard**
appears (`{colors.pending}`) with a single **`Roll`** button (FR-43).
6. Each player clicks **Roll** → private ephemeral roll view; the scoreboard
fills in via coalesced edits (P3).
7. **Climax**: at expiry, 2/3 succeeded → the scoreboard finalizes
(`{colors.success}`, "The party prevails"); the LLM narrates the outcome.
### Flow 2 — Zara joins an encounter in flight (climax: she takes her seat)
1. Zara opens a running encounter thread mid-scene and types "hi".
2. The bot **deletes** her message and sends a **private ephemeral**: *"This
gathering is not yet yours to join. Press **Join** on the gathering post, or
use `/encounter join`."* (P2)
3. Zara presses the persistent **Join** button (still visible in its row — P1)
→ added to the roster; a short ephemeral confirms her seat.
4. The LLM weaves Zara into the scene on its own timing (no bot announcement).
Her next message now enters the story.
### Flow 3 — Kay's solo timed trap (climax: the sands run out)
1. Kay's solo encounter (implicit join, today's behavior — FR-30).
2. The LLM emits a **timed check** (30s). The check embed shows `⏳ ~0:30`
(`{colors.pending}`), edited in place in **10s increments** (P3) — ~0:20,
~0:10.
3. Kay rolls at ~0:18 → ephemeral roll view; the embed turns
`{colors.success}`.
4. **Climax (alt run)**: Kay hesitates; below ~10s the text stops and the
**hourglass GIF** runs (`{colors.urgent}`); at 0 it flips to `⏰ Time's up`
(`{colors.failure}`) and the LLM narrates the trap triggering.
### Flow 4 — DM sets a story status (climax: the LLM narrates the sickness)
1. Before the encounter, the DM runs `/character status set @zara sick`.
2. The bot replies with a private `neutral` status confirmation ("Zara is now
marked: *sick* (clears in ~24h, or when cleared)").
3. The encounter starts; the enriched character context surfaces `status: sick`
to the LLM each turn.
4. **Climax**: the LLM narrates Zara's labored movements and applies
disadvantage where appropriate — without the DM reminding it each turn. The
status persists into the next encounter until the TTL clears it or the DM
runs `/character status clear @zara sick`.
## Open Items
- `[RESOLVED]` Post-Start lobby embed: a slim `neutral` "The gathering has set
out" line with the **Join** button remaining in its row (P1); `/encounter
join` is the always-available backstop if the embed scrolls away.
- `[NOTE FOR UX]` Scoreboard shows each player's modifier (open rolls default,
OQ-8) — confirm visibility before build.
- `[NOTE FOR UX]` A player with no registered Foundry character rolls at +0
with no Foundry-derived advantage (FR-46) — confirm the no-character UX (Roll
still enabled; passive reveals silently skipped).
- `[NOTE FOR UX]` Timer final-stretch GIF asset (a ~10s-looping
hourglass/sand-timer) needs sourcing + hosting; embedded via
`setImage`/thumbnail. GIF loop accuracy vs. the real deadline is not
required.
- `[ASSUMPTION]` Passive-reveal post uses `{colors.notice}` purple as its own
register; confirm it reads as a sibling, not a foreign element.
- `[NOTE FOR UX]` Status set/clear slash-command argument shape
(`/character status set @user <label> [duration]`) is a UX detail to finalize
with the command syntax in architecture.
- `[NOTE FOR UX]` The single player-locked `Roll` button simplifies the
**existing** solo `skill_check_emit` too (retires the Adv/Dis/Custom-Modifier
buttons) — a behavior change to shipped behavior. Reflect back into the PRD:
advantage/disadvantage is decided at emit (LLM/story/DM) + Foundry stats, not
by player button choice. Architecture must combine LLM-specified +
Foundry-derived advantage into the roll.
---
_Visual specs live in `DESIGN.md`; decisions in `.decision-log.md`. Spines win
on conflict with any mock._

View File

@@ -0,0 +1,276 @@
---
title: "Mardonar Encounter Engine — Accessibility Review"
status: review
created: 2026-06-20
reviewer: accessibility reviewer (Discord embed UX)
targets:
- DESIGN.md
- EXPERIENCE.md
---
# Accessibility Review — Mardonar Encounter Engine
Scope: the Accessibility Floor as stated in `EXPERIENCE.md §Accessibility Floor`
and as realized in the component mockups in `DESIGN.md`. The review checks each
floor claim against the actual mockup text, then flags gaps.
Severity scale: **Blocker** (a floor claim is broken for a whole user group) ·
**High** (a state is meaningfully less accessible, with a workable fallback) ·
**Medium** (a confusable cue, design-level risk) · **Low** (inconsistency /
stale text, no user harm on its own).
---
## Verdict
The floor is **mostly met** for resolved states (success / failure / timed-out
/ gathering / notice / pending) — every one of those pairs color + emoji + text.
The floor is **not met** for the **timer's `urgent` final-stretch state**, which
removes the numeric label and offloads urgency onto an unlabeled GIF + a color
ramp. That is the one finding worth fixing before build.
---
## Floor-claim audit
### 1. Color is never the only signal — PASS for 6 of 7 tokens, FAIL for the urgent phase
Token-by-token check of `DESIGN.md.Colors` table against the mockups:
| Token | Paired signal in table | Verified in mockup | Verdict |
|---|---|---|---|
| `pending` | ⚔️ / 🎲 + "awaiting" | `⚔️ Group Check` + rows "…awaiting"; `⏳ ~0:20` | PASS |
| `success` | ✅ + "Success" | `✅ The party prevails`; `✅ Success — the sentries…` | PASS |
| `failure` | ❌ + "Failure" | `⏰ Time's Up`; `❌ FAILURE (timer expired)` | PASS |
| `gathering` | 🕯️ + "gathering" | `🕯️ The Party Gathers` + "Seats…minimum met" | PASS |
| `urgent` | ⏳ + "{n}s remain" | **`⏳ Time —` + GIF only** (number removed) | **FAIL** |
| `notice` | 👁️ + "notices" | `👁️ A Detail Revealed` | PASS |
| `neutral` | — (none) | embeds still carry emoji+text titles | PASS-with-caveat (see finding 3) |
Every *resolved* state is triple-redundant (color + emoji + text). The single
break is the `urgent` final-stretch: the mockup at `DESIGN.md` lines 190195
shows `⏳ Time — [ ⏳ hourglass GIF ]` — the `{n}s` number is gone, the text
label is a dash, and the urgency cue is handed to the animated GIF + the
`pending → urgent` color ramp. See finding 1.
### 2. Button labels descriptive (not icon-only) — PASS
All buttons in the mockups are text-labeled: `Join`, `Leave`, `Begin`, `Cancel`,
`Roll`. No icon-only buttons. PASS. (Stale example in the floor text — see
finding 4.)
### 3. P2 guidance is text-clear — PASS
Both P2 ephemeral variants name both join paths in plain text:
- `EXPERIENCE.md §Interaction Primitives`: *"Press **Join** on the gathering
post above, or use `/encounter join`."*
- `EXPERIENCE.md §Flow 2`: *"Press **Join** on the gathering post, or use
`/encounter join`."*
A user who cannot see the button (screen reader, scrolled past it, embed
collapsed) still has the slash command in text. PASS.
Minor: the two phrasings disagree on "above" vs. no-"above"; pick one.
### 4. No motion-dependent meaning — FAIL in the urgent phase
The floor (`EXPERIENCE.md` line 162) states: *"No motion-dependent meaning:
countdown meaning is conveyed by the number + label, not by animation."*
The timer component contradicts this (`DESIGN.md` lines 7074, 190195;
`EXPERIENCE.md` lines 8083, 143146): below ~10 s *"the text stops and an
animated ~10s-loop hourglass GIF takes over as the urgency cue."* In that
window the number is removed and the GIF + color ramp become the primary
urgency signal. That is motion-dependent meaning by construction. See finding 1.
### 5. Ephemeral privacy — PASS (with one stat-exposure note)
- Ephemeral roll views are private to the clicker (`EXPERIENCE.md` line 86).
- P2 guidance is ephemeral to the sender (line 119).
- Status confirmation is ephemeral to the DM (line 94).
- No player's ephemeral content is routed to another player.
The scoreboard is public by design (a group check is a shared trial), so
showing each player's roll total there is not a leak. One nuance — the
scoreboard also exposes each player's **modifier** (`rolled 16 +3`) publicly
(`DESIGN.md` lines 166, 174). Some tables treat character modifiers as
private; the design should make public-modifier display an explicit choice
rather than a default. See finding 5.
---
## Findings
### Finding 1 — [High] Urgent timer phase removes the text label and relies on the GIF
**Where**: `DESIGN.md` lines 7074 (Timer display), 190195 (final-stretch
mockup); `EXPERIENCE.md` lines 8083 (Timed-check embed), 143146 (Timer
primitive), 162 (floor claim).
**Problem**: In the `urgent` final-stretch (≤ ~10 s) the countdown text stops
and is replaced by `⏳ Time —` plus an hourglass GIF. For:
- **Screen-reader users** — discord.js embed images have no alt-text channel,
so the GIF is invisible. The user hears only "hourglass Time dash" with no
quantity and no "running low" phrasing. They go from a clear `⏳ Time ~0:20`
to an ambiguous `⏳ Time —` that reads as "time unknown / expired."
- **Images-off users** — same: `⏳ Time —` with no number.
- **Sighted colorblind users** — the `pending → urgent` ramp (blurple →
amber) is the remaining cue, which is exactly the color-only signal the
floor forbids (see finding 2).
This simultaneously breaks floor claim #4 ("no motion-dependent meaning") and
weakens floor claim #1 ("color is never the only signal") for the one state
where urgency matters most.
**Fix**: Keep a text label in the urgent field — do not replace the number with
a dash. Options, any one is sufficient:
1. `⏳ Time <10s — roll now` (keeps a quantity band + an imperative).
2. `⏳ ~0:0{n}s — final sands` (continue the 10 s decrement down to ~0:00;
keep the GIF as a *secondary* cue, not the primary one).
3. `⏳ Time running out — roll now` (drop the exact number but keep a plain-text
urgency label).
The GIF can stay as enrichment; it must not be the *only* thing carrying
"urgent." The floor text should also be amended to say the number-or-label is
retained through the final stretch.
### Finding 2 — [Medium] `urgent` (amber) → `failure` (red) ramp is confusable for red-green colorblind users
**Where**: `DESIGN.md` line 71 (color ramp), `colors.urgent`/`colors.failure`
table entries.
**Problem**: The timer ramps `pending` (blurple) → `urgent` (amber `0xF1C40F`)
`failure` (red `0xE74C3C`). Amber and red are frequently confusable for
deuteranopia / protanopia. In the resolved `failure` state this is fine —
`⏰ Time's Up` + `❌ FAILURE (timer expired)` carries the meaning in text and
emoji. In the `urgent` state it is not fine, because finding 1 has stripped the
text cue, leaving the amber-vs-red distinction to do work it can't reliably do
for those users.
**Fix**: Resolving finding 1 (restore a text label to the urgent field) also
resolves this — the color ramp becomes decorative rather than load-bearing. No
separate palette change is needed, but if the team wants defense-in-depth, push
`urgent` toward a hue further from red (e.g. a more saturated yellow-orange or
even keep blurple and signal urgency via the ⏳ + text alone).
### Finding 3 — [Low] `neutral` token has no paired signal in the colors table
**Where**: `DESIGN.md` line 66 (`| neutral | 0x95A5A6 | … | — |`), and the
"Do" rule on line 230 ("Pair every color with an emoji + text label").
**Problem**: The table lists `—` (no emoji, no label) for `neutral`, which
contradicts the blanket rule two lines above it and the Do/Don't rule. In
practice every `neutral` mockup (`📜 Status Set`, `🎲 Your Roll`) does carry an
emoji + text title, so no user is harmed — but the table is internally
inconsistent and a future contributor reading only the table could ship a
label-less `neutral` embed.
**Fix**: Either give `neutral` a paired signal in the table (`📜 / 🎲 + title
text`), or scope the rule to "every *status* color" and note that `neutral` is
non-status (informational / disabled) and has no meaning to lose.
### Finding 4 — [Low] Stale "Advantage" example in the Accessibility Floor
**Where**: `EXPERIENCE.md` line 158 — *"Button labels are descriptive ('Begin',
'Join', 'Advantage') — not icon-only."*
**Problem**: The `Advantage` (and `Disadvantage` / `Custom-Modifier`) buttons
are retired everywhere else in the doc (`DESIGN.md` line 235, `EXPERIENCE.md`
lines 149150, 219224 — "single player-locked `Roll`"). Citing `Advantage` as
an exemplar of descriptive labeling is stale and could mislead a builder into
re-adding it.
**Fix**: Replace the example list with the actual current set: `Roll`, `Join`,
`Leave`, `Begin`, `Cancel`.
### Finding 5 — [Low / Info] Scoreboard publicly displays each player's modifier
**Where**: `DESIGN.md` lines 166 and 174 — `Zara — ✅ 19 (rolled 16 +3)` and
`Zara ✅ 19 Kay ❌ 8 Mary ✅ 15`.
**Problem**: This is not an Accessibility Floor violation (the scoreboard is
intentionally public). It is a privacy consideration adjacent to the floor's
"ephemerals are private" claim: character modifiers are stat information, and
some tables treat them as player-private. The current design shows them to the
whole thread by default.
**Fix**: Decide explicitly whether modifiers are public on the scoreboard. If
they should be private, show only the total (`Zara — ✅ 19`) on the public
embed and reserve the `rolled 16 +3` breakdown for the ephemeral roll view
(which is already private). If public-by-default is intended, note it as a
conscious decision in `.decision-log.md` so it isn't read later as a leak.
---
## Screen-reader walkthrough (embeds, fields, scoreboard rows)
Discord embeds are read by screen readers in order: title → description →
fields (name, then value) → footer → timestamp. Findings from a dry walkthrough
of each mockup:
- **Skill-check / group / timed titles** — `emoji + double-space + Label`
reads as e.g. "crossed swords Skill Check dash Zara." The double-space is
ignored by SRs; the label carries meaning. Fine.
- **Inline field triples (DC · Modifier · Mode)** — each is a named field, so
a SR reads "DC 15, Modifier +3 Perception, Roll Mode Advantage." Strong.
- **Scoreboard rows** — one field per player. SR reads "Zara dash check mark
19 rolled 16 +3." The `✅`/`❌` emoji read as "check mark button" / "cross
mark," which is verbose but not ambiguous, and the number + the
`…awaiting` / `SUCCESS` text carry the meaning. Fine.
- **`…awaiting` ellipsis** — decorative; some SRs read "ellipsis awaiting,"
some skip the glyph. The word "awaiting" survives either way. Fine.
- **Timer field, pending phase** — `⏳ Time ~0:20` reads as "hourglass Time
approximately 0 20." Clear.
- **Timer field, urgent phase** — `⏳ Time —` reads as "hourglass Time dash."
No quantity, no urgency phrasing. **Broken** — see finding 1.
- **Timer GIF** — embed images have no alt-text path in discord.js; the SR
skips it entirely. **Broken** — see finding 1.
- **Footer imperatives** — "The sands run out. Roll before time slips away."
reads cleanly and reinforces the timer's meaning in the pending phase. In
the urgent phase the footer is unchanged, so it is the *only* textual
urgency cue left — which is actually a partial mitigation for finding 1,
though it does not convey "how much time" or "this is the final stretch."
- **Passive-reveal post** — `👁️ A Detail Revealed` + body text. Reads well.
- **Status confirmation** — `📜 Status Set` + body. Reads well.
Net: the SR experience is good everywhere except the timer's urgent window.
---
## Motion + colorblind summary
| State | Color | Emoji | Text label | Motion (GIF) | Colorblind-safe | SR-safe |
|---|---|---|---|---|---|---|
| pending | blurple | ⚔️/⏳ | "awaiting" / `~0:20` | no | yes | yes |
| gathering | orange | 🕯️ | "Seats…minimum met" | no | yes | yes |
| success | green | ✅ | "Success" / "prevails" | no | yes | yes |
| failure | red | ❌/⏰ | "Failure" / "Time's up" | no | yes | yes |
| notice | purple | 👁️ | "Detail Revealed" | no | yes | yes |
| **urgent** | amber | ⏳ | **— (removed)** | **yes (primary)** | **no** | **no** |
The `urgent` row is the only red cell in the matrix. Fixing finding 1 turns all
four columns green for that row.
---
## Recommended changes (minimal, ordered)
1. **`DESIGN.md` Timer display + final-stretch mockup**: keep a text label in
the urgent field (`<10s — roll now` or `~0:0{n}s — final sands`). Demote the
GIF to a secondary cue. (Resolves findings 1 and 2.)
2. **`EXPERIENCE.md` Accessibility Floor**: amend "No motion-dependent meaning"
to explicitly require the number-or-label to persist through the final
stretch; correct the floor so it matches the component behavior.
3. **`EXPERIENCE.md` line 158**: swap "Advantage" for `Roll`/`Cancel` in the
descriptive-label example. (Resolves finding 4.)
4. **`DESIGN.md` Colors table**: give `neutral` a paired signal or scope the
"every color pairs with…" rule to status colors only. (Resolves finding 3.)
5. **`DESIGN.md` scoreboard mockups**: decide public-vs-private modifier
display; default to totals-only on the public embed and the breakdown in the
ephemeral. (Resolves finding 5.)
None of these require new components or new color tokens; they are edits to
existing mockup text and one floor-claim wording fix.

View File

@@ -0,0 +1,337 @@
---
title: "Adversarial UX Review — Mardonar Encounter Engine"
reviewer: cynical-ux-reviewer
date: 2026-06-20
target:
- DESIGN.md
- EXPERIENCE.md
status: adversarial
---
# Adversarial UX Review
Premise: assume every promise is a lie until the spec proves it survives contact
with Discord's actual primitives (no atomic button-lock, 15-min ephemeral
window, no embed-image alt text, Manage Messages as a privilege, not a given).
Find the dead-ends, the races, and the claims that contradict each other.
Severity scale: **Critical** (ships broken / breaks a governing principle) ·
**High** (real flow fails for real users) · **Medium** (edge state with no
exit) · **Low** (polish / drift).
---
## C1 — The ≤10s final stretch is undefined behavior, and it contradicts the a11y floor
**[Critical]**
DESIGN.md `Colors → Timer display` and EXPERIENCE.md `Timed-check embed` both
specify: above ~10s the countdown text ticks in 10s increments; **below ~10s
the text stops and a ~10s-loop hourglass GIF takes over** as the urgency cue,
with the color ramping `pending → urgent → failure`.
Two failures stack here:
1. **The GIF asset does not exist.** EXPERIENCE.md `Open Items` flags it as
`[NOTE FOR UX]` — "needs sourcing + hosting." Until it ships, the spec'd
behavior below ~10s is: number frozen, no GIF, only the color ramp. But the
a11y floor (EXPERIENCE.md `Accessibility Floor`) says **color is never the
only signal**. So in production today, the final stretch has *no valid
urgency signal at all*. And if the hosted asset 404s later (CDN down, path
moved), `setImage` fails silently in Discord — no error, no fallback, just a
missing image. No fallback path is specified.
2. **The design itself violates "no motion-dependent meaning."** Once the
number stops updating, the only things changing are the GIF (motion) and the
embed color. The a11y floor claims "countdown meaning is conveyed by the
number + label, not by animation." But the number has stopped. The label
`⏳ {n}s` is frozen. So meaning in the final stretch *is* conveyed by motion
+ color — the two channels the floor forbids as sole carriers. A
reduced-motion user, a screen-reader user, or a monochrome user gets a
frozen number and nothing else. Embed images in discord.js take **no alt
text**, so the hourglass is invisible to AT entirely.
Fix: keep the number ticking (1s increments in the final 10s, accept the
rate-limit risk with coalescing), or post a single `⏰ 10s remain` text
escalation at the threshold. Do not hand urgency to a GIF.
---
## C2 — P2 "stop & guide, never silent" is silently broken for offline posters and for no-Manage-Messages channels
**[Critical]**
EXPERIENCE.md `Interaction Primitives → Non-joined message` and Flow 2 promise:
delete the offending message, then send the sender a private ephemeral
guiding them to Join. "Never silent."
Three breakages:
1. **Ephemerals require an active client within ~15 minutes.** If the sender
posted and closed Discord (mobile push notification tap, posted, swiped
away), the ephemeral is never rendered — Discord does not queue ephemerals
for offline users the way it queues normal messages. Their post is deleted,
the guidance never lands. P2's "never silent" is false for the exact
population most likely to post-and-leave (drive-by "hi" in a thread they
got pinged into). The spec gives no fallback (no DM, no thread-visible
@mention).
2. **No Manage Messages → no delete → no guide.** The spec assumes the bot can
delete. If the bot lacks Manage Messages in the channel (a real config in
restricted channels), the delete silently fails. The spec never specifies
the fallback: does the bot still send the P2 ephemeral? Does it leave the
message and @warn? Undefined. As written, the most likely implementation
branches on `message.delete()` succeeding and skips the guide on failure —
the opposite of P2.
3. **"the gathering post above" is a lie once the thread moves.** Flow 2's
ephemeral copy: *"Press **Join** on the gathering post, or use
`/encounter join`."* In a busy thread the lobby embed can be 50+ messages
up by the time a latecomer posts. "Above" is not a Discord concept — there
is no anchor link to the embed. The slash command is the only honest path,
but the copy credits the button first. P1's "persistent Join button" has
the same burial problem (see H1).
Fix: P2 guidance must lead with `/encounter join` (always reachable), and the
bot must fall back to a thread-visible @mention if the ephemeral cannot be
delivered or the bot can't delete. Define the no-Manage-Messages branch
explicitly.
---
## H1 — P1 "in-flight join" discoverability rests on an unconfirmed assumption and gets buried anyway
**[High]**
P1 (EXPERIENCE.md `Foundation`) is "a latecomer can always see how to join a
running encounter." The claimed mechanism is the persistent Join button on the
lobby embed, which "stays in its own row and remains live after Start."
Problems:
1. **The post-Start lobby embed is an Open Item.** EXPERIENCE.md `Open Items`:
`[ASSUMPTION]` — "confirm the post-Start lobby embed stays vs. is replaced
by a slimmer 'joining still open' notice." So P1's entire button-based
guarantee is riding on an unconfirmed decision. If the lobby embed is
replaced by a "slimmer notice," does that notice carry the Join button?
Unspecified. P1 is currently unfalsifiable.
2. **Even if it stays, it gets buried.** After Start the opening narrative
posts, passive-reveal posts fire (one per qualifying player — see H3), and
the LLM starts narrating. The lobby embed with its Join button is now N
messages up the thread. A latecomer opening the thread lands at the bottom
(recent messages), not at the lobby embed. "Always see how to join" is
false the moment any narrative lands. The slash command is the real
backstop, but the design credits the button row as the P1 win.
Fix: P1's honest answer is `/encounter join` plus a *pinned* lobby/Join
message (Discord pins survive scroll). The spec never mentions pins. A
button that lives on a scrollable embed is not "always visible."
---
## H2 — The "player-locked" Roll button is bypassable by double-click and spam-click
**[High]**
EXPERIENCE.md `Interaction Primitives → Roll button → player-locked` and
`Button enablement`: "the clicker must be a targeted player who hasn't
rolled; anyone else gets a private ephemeral 'This roll is not yours to
make.'" "One roll per player; a player who clicked is locked from rolling
again."
Discord buttons are not atomically locked. The lock is a *handler-side*
check after the click fires. Two races:
1. **Double-click race.** A targeted player clicks Roll; the handler does
Foundry/LLM work (the roll mode is decided upstream by stats — that's a
lookup) and *then* marks the player as rolled. A fast double-click fires
two `interactionCreate` events before the lock is written. Two rolls land.
The spec says "one button, one thread message per roll" — it does not say
one roll per player is *enforced atomically*. No idempotency key on
`customId` is specified.
2. **Non-targeted spam.** Anyone can click the Roll button (Discord doesn't
restrict button clicks by user). Each click from a non-targeted user
generates an ephemeral "not yours to make." A troll in the thread can
spam-click and generate a flood of ephemerals (self-only, but still
handler load) and, worse, can click *before* the targeted player — the
spec doesn't say whether a non-targeted click counts toward any
throttle.
3. **3-second interaction ack window.** discord.js must ack the interaction
(defer or reply) within ~3s or Discord shows "Interaction failed." The
roll handler does a Foundry stat lookup + roll computation + ephemeral
reply + scoreboard coalesced edit. If Foundry is slow, the button shows
failure to the player even though the roll may still process. No defer
strategy is specified.
Fix: lock the player in the same tick as the ack (defer ephemeral → write
lock → compute → edit ephemeral), and use the interaction token as an
idempotency key. Specify the non-targeted click throttle.
---
## H3 — Lobby dead-ends: orphaned Cancel, no-restart expiry, cap overflow, Begin/Leave race
**[High]**
Four unresolved states in the lobby lifecycle:
1. **Cancel is starter-only; starter leaves the server → no abort.**
EXPERIENCE.md `Component Patterns → Lobby embed`: Cancel is "starter-only."
If the starter's account is deleted or they leave the server mid-lobby,
Cancel is orphaned. No other player can abort. The lobby will run until
the ~30m idle expiry, blocking the thread. No co-owner / majority-cancel
path is specified.
2. **Expiry "closes the thread" with no restart affordance.**
`Lobby states → expired (idle ~30m → thread closed without start)`. What
does "thread closed" mean — archived? locked? Can a player re-run
`/encounter start` in the same thread? Is the thread locked for the
encounter's lifetime? The spec says "idle ~30m" but never defines idle
(does joined-player chat reset the timer? does the starter's own messages
reset it?). If chat resets it, a stalled lobby can live forever. If it
doesn't, a lobby with active chat but no Join presses closes under the
players' feet.
3. **Cap overflow (5) is unspecified.** DESIGN.md lobby mockup shows "(cap
5)." A 6th player pressing Join: is the button disabled at cap? Does the
click return an ephemeral "seats full"? Does `/encounter join` refuse?
None of this is in the spec. As written the most likely implementation
silently accepts the 6th or silently no-ops — either is a dead-end.
Post-Start latecomer join vs. cap is also undefined (can a 6th join
after Start? does cap still apply?).
4. **Begin enable/Leave race.** `Button enablement`: "Begin: enabled iff
joined ≥ min"; `Component Patterns`: "disables again if a player Leaves
below min." Race: joined hits min → Begin enables → starter clicks Begin
→ in the same interaction window a player clicks Leave → roster is now
min1 but Begin has already fired. Is the start rolled back? Does Begin
re-check the roster at handler time? The spec implies Begin's
enablement is a *client-side* setDisabled state, but the handler must
re-validate. The doc never says it does. A 2-of-3 encounter could start
with 2 then drop to 1 mid-click.
Fix: re-validate roster at Begin handler time and on every Leave; define
cap-overflow ephemeral; define a co-lead or auto-cancel-on-starter-leave;
define what "idle" means and what "thread closed" lets players do next.
---
## M1 — Group-check "majority" is undefined for even player counts
**[Medium]**
DESIGN.md `Group-check scoreboard` resolve example: "Rule: majority (2 of 3)
— SUCCESS." EXPERIENCE.md Flow 1 uses `majority` with 3 players. For an
even-sized group (2 or 4 players), majority is ambiguous: 1-of-2 is a tie,
2-of-4 is a tie. The spec never defines the tie rule (tie = success? =
failure? = rerun?). A 2-player group check with one success and one failure
has no resolved state defined. Dead-end.
Fix: state the tie rule explicitly (default: tie = failure, since the check
wasn't *majority*-met).
---
## M2 — Status confirmation has no exit if the LLM never clears it
**[Medium]**
DESIGN.md `Status confirmation`: "Zara is now marked: *sick* (clears in
~24h, or when cleared)." Who clears it? The DM command, or an LLM tool emit.
If neither fires, the 24h timer is the only backstop. A status that should
be permanent (a curse, a permanent injury) can't be represented — it always
expires. And a status the LLM *intended* to clear but never emitted a clear
for persists for 24h of wrong state. No "permanent" status option, no
manual-clear-as-DM fallback specified beyond the slash command (which is
gated to "DM/actor" — fine, but the LLM-driven clear path has no
confirmation it ever fires).
---
## M3 — Passive-reveal burst at Start is in tension with P3
**[Medium]**
EXPERIENCE.md `Component Patterns → Passive-reveal post`: "Posted once at
encounter start per qualifying player; public, attributed." If 5 players
qualify, that's 5 permanent embeds posted at encounter start, on top of the
lobby "set out" flip and the opening narrative. That's a 7-message burst at
Start. P3 ("no tornado") is about live-state edits, and these are one-shot
posts — but the principle's spirit is "keep threads calm," and a 7-embed
burst at the climax of Flow 1's step 4 is not calm. The doc never coalesces
these (no "one passive-reveal embed listing all qualifying players" option
is considered).
---
## M4 — Lobby Join/Leave spam is not debounced; "coalesced edits" only covers timer ticks
**[Medium]**
P3's "edits coalesced (≤1 edit/s)" is specified for the timer and the group
scoreboard. The lobby embed edits on *every* Join/Leave event (state change
→ edit). A player trolling the Join/Leave buttons fires an edit per click,
user-driven, not on a coalesced tick. The spec says the lobby is "edited in
place for every state change (P3)" — that's the opposite of coalescing.
Discord's rate limit on embed edits is ~5/10s per channel; a determined
troll hits it and the lobby embed goes stale (edits drop silently). No
debounce or queue is specified for lobby state changes.
---
## L1 — `urgent` (0xF1C40F) and `gathering` (0xF39C12) are near-indistinguishable for colorblind users
**[Low]**
DESIGN.md color table. Amber `#F1C40F` vs warm orange `#F39C12` — adjacent
hues, near-identical luminance. For deuteranopia/protanopia they collapse to
the same percept. The a11y floor holds because the *paired emoji + label*
differs (`⏳ {n}s` vs `🕯️ gathering`) — so this is not a floor violation,
just a weakened glanceable signal. The "color ramp pending → urgent" in the
timer is meant to be perceptible escalation; for colorblind users it is not
a ramp, it is a single step from blurple to "warm." Reconsider: jump
`pending` straight to `failure`-adjacent red for the final stretch, or rely
on the emoji change alone.
---
## L2 — Accessibility "descriptive button labels" claim is undercut by the retired Adv/Dis buttons being listed as labels
**[Low]**
EXPERIENCE.md `Accessibility Floor`: "Button labels are descriptive ('Begin',
'Join', 'Advantage') — not icon-only." But `Interaction Primitives → customId
prefixes` retires `sc_adv` / `sc_dis` / `sc_mod` / `grp_adv` / `grp_dis`
the Advantage button no longer exists. The a11y floor cites "Advantage" as an
example of a descriptive label, which is stale. Minor, but it signals the
a11y section was not updated with the player-sync simplification.
---
## L3 — Resolved-check Roll button transition is unspecified; mid-resolve click race
**[Low]**
`Check states`: `pending`/`filling``resolved-success`/`resolved-failure`/
`timed-out`. `Button enablement` says Roll is enabled "while the check is
pending/filling." The resolved mockup (DESIGN.md) shows no buttons. But the
transition from filling → resolved happens on a coalesced edit tick; a
player whose click is in-flight when the tick fires lands a roll after
resolve. Is it counted? Dropped? The spec doesn't say. Same class of race as
H2 but lower severity (the check is already decided).
---
## L4 — `/encounter join` with no live encounter is undefined
**[Low]**
P2 and P1 both point users to `/encounter join` as the always-available path.
What does the command do when no encounter is live in the thread (or the
thread has no encounter at all)? The spec only describes the success path
("adds the player to the roster"). No "no gathering here" ephemeral is
defined. A user following the P2 guidance into the wrong thread gets a
silent or erroring command. Dead-end.
---
## Verdict
The spec is internally beautiful and externally fragile. The three governing
principles (P1/P2/P3) are each broken by at least one finding: P1 rests on an
unconfirmed Open Item and a button that scrolls away; P2 is defeated by
Discord's ephemeral delivery model and channel-permission reality; P3 holds
for ticks but not for the Start burst or user-driven lobby edits. The a11y
floor is directly contradicted by the timer's final-stretch design. Ship
blockers: C1 (timer final stretch), C2 (P2 delivery). Fix those before any
implementation.

View File

@@ -0,0 +1,429 @@
---
title: "Mardonar Encounter Engine — UX Edge-Case Review"
review_date: 2026-06-20
reviewer: edge-case-hunter
scope: DESIGN.md + EXPERIENCE.md (ux-mardonar-encounter-engine-2026-06-20)
method: Exhaustive path enumeration over the ten nominated boundary interactions.
Handled paths are recorded as such and discarded from the findings set;
only unhandled paths appear in the Findings section.
---
# Edge-Case Review — Mardonar Encounter Engine UX
## Method
Each of the ten nominated boundary interactions is walked against the two
source documents. A path is **handled** only when an explicit guard (a
behavior rule, a state transition, an ephemeral message, an a11y pairing)
exists in the spec text for that exact condition. Anything left implicit,
contradictory, or undefined is reported as a finding with a severity, a
location, a trigger condition, a minimal guard sketch, and a concrete
consequence.
Severity scale: **High** (breaks a governing principle P1/P2/P3, the a11y
floor, or a core flow's climax) · **Medium** (degrades a flow or leaves a
common state without defined behavior) · **Low** (cosmetic / unreliable
pointer with a working fallback).
---
## Boundary Walk
### 1. A non-targeted player clicks the player-locked Roll on a solo/group check
**Verdict: HANDLED.**
`EXPERIENCE.md` Interaction Primitives (lines 124131) and §Button
enablement (lines 109114) explicitly define this path: the handler checks
the clicker is the targeted player (solo) or a targeted player who hasn't
rolled (group); "anyone else gets a private ephemeral *'This roll is not
yours to make.'*" The a11y floor (line 159) confirms the guidance is
text-clear. No finding.
**Adjacent gap (not the listed case, but reachable from it):** a *targeted*
player who has **already rolled** in a group check clicking Roll again is
"locked from rolling again" (line 73), but the only ephemeral defined is
the non-targeted one — *"This roll is not yours to make"* — which is
semantically wrong for a player whose roll it *is*, just already made. No
distinct ephemeral is specified for the already-rolled re-click. Folded
into Finding F2 (double-roll), since the rapid double-click is the common
way to reach this state.
---
### 2. A player clicks Roll twice rapidly (double-roll)
**Verdict: UNHANDLED — Finding F2.**
The spec states the lock as a logical state — "One roll per player; a
player who clicked is locked from rolling again" (`EXPERIENCE.md` line 73)
and "Roll: enabled while the check is `pending`/`filling` and the clicker is
a targeted player who hasn't rolled yet" (line 111). But the button's
*disabled* state only catches up on the **next coalesced edit** ("the
scoreboard's row for that player is updated on the next coalesced edit (P3,
≤1 edit/s)" — line 71). The spec defines no idempotency guard on the
handler itself, and no button-disable-on-click before the coalesced edit
fires. There is therefore a window of up to ~1 second in which the button
is still rendered enabled and a second click will route to the handler.
The spec also doesn't define whether the handler deduplicates by
`customId`+clicker+check-id, nor what ephemeral the already-rolled clicker
receives (see §1 adjacent gap).
---
### 3. Lobby hits maxPlayers, then a joined player Leaves
**Verdict: The Leave itself is HANDLED; the cap-full Join attempt is UNHANDLED — Finding F3.**
The Leave is covered: "Begin enables the instant joined ≥ `minPlayers`;
disables again if a player Leaves below min" (`EXPERIENCE.md` line 60);
"Leave: enabled iff the clicker is currently joined" (line 114); the embed
is edited in place for every state change (line 64). Leaving from cap
therefore re-opens a seat and correctly re-disables Begin only if the
result drops below min.
The unhandled path is the **obverse** state the walk reaches: a player who
clicks **Join while the lobby is already at `maxPlayers`**. "Join: always
enabled while the encounter is live (P1)" (line 113) is unconditional —
Join stays enabled even at cap — but no behavior, no ephemeral, and no
roster rejection is defined for the cap-full click. P1 ("a latecomer can
always see how to join") is about *visibility of the affordance*, not about
accepting joins past cap. The spec never reconciles "Join always enabled"
with "cap 5." A click at cap has no defined outcome.
---
### 4. Timer final-stretch GIF when the asset isn't hosted / fails to load
**Verdict: UNHANDLED — Finding F4 (High).**
The final-stretch design (`DESIGN.md` lines 6874, 189195;
`EXPERIENCE.md` lines 7983, 142146, 210213) is: below ~10s the
countdown **text stops** ("the text stops and an animated ~10s-loop
hourglass GIF takes over"), the color ramps to `urgent`, and the GIF
becomes the urgency cue. The GIF asset is itself an open item —
"[NOTE FOR UX] Timer final-stretch GIF asset ... needs sourcing + hosting"
(`EXPERIENCE.md` lines 210213).
No fallback is defined for when the GIF URL is unhosted, 404s, or Discord
fails to render it. In that case the final stretch has: the number removed
(replaced with `—`), no motion cue, and only the `urgent` color remains.
That violates two explicit a11y-floor rules: "Color is never the only
signal" (`EXPERIENCE.md` line 154) and "No motion-dependent meaning"
(line 161). The final-stretch mockup (`DESIGN.md` line 191) shows `⏳ Time
— [ ⏳ hourglass GIF ]` — the label `Time —` conveys "time is running" but
neither the remaining seconds nor, on its own, the urgency level. With the
GIF missing, a user is left with a color-only urgency signal and no
countdown.
---
### 5. A latecomer tries to Join during the final-stretch timer of a running group check
**Verdict: UNHANDLED — Finding F5 (High).**
P1 is unambiguous that Join is always available while the encounter is
live: "Join: always enabled while the encounter is live (P1)" (line 113);
"two always-available paths — the persistent Join button on the lobby
embed (its own row, live after Start) and `/encounter join`. Both add the
player to the roster" (lines 138141). So a latecomer *can* join during a
running group check's final stretch.
But the spec never defines what happens to the **in-flight group check**
when a latecomer joins mid-check. The scoreboard is "one embed, edited in
place as rolls arrive" with "one field per player" pre-populated for the
targeted players (`DESIGN.md` lines 9496; `EXPERIENCE.md` lines 6673).
Open questions with no spec answer:
- Does the latecomer get a new scoreboard row?
- Are they added to the "targeted players who must roll" set, or just the
roster?
- Does the `majority`/rule threshold recompute against the new roster size?
- If they join at the final-stretch (≤10s, text stopped, GIF running), do
they get any time to roll, or is the check already about to expire on a
roster they weren't part of?
- Can a latecomer joining in the final seconds flip a group `failure` to
`success` (or vice-versa) by changing the denominator?
The "LLM weaves them in on its own timing (no auto 'entered' announcement
— FR-27)" rule (line 141) covers narrative integration, not mechanical
integration with an in-flight check. The climax of Flow 1 (line 181:
"at expiry, 2/3 succeeded → the scoreboard finalizes") assumes a fixed
targeted set; a late join during the final stretch breaks that assumption
with no defined behavior.
---
### 6. Two targeted players click Roll near-simultaneously (coalesced-edit race on the scoreboard)
**Verdict: UNHANDLED — Finding F6 (High).**
The spec addresses the *visual* race by mandating a rewrite, not an
append: "Scoreboard rows ... filled in as rolls arrive; the field set is
rewritten on coalesced edits, not appended" (`DESIGN.md` lines 9596);
"the scoreboard's row for that player is updated on the next coalesced
edit (P3, ≤1 edit/s)" (`EXPERIENCE.md` line 71). This prevents append-order
garbling but does **not** define an atomicity/serialization guard for the
underlying read-modify-write of the rolls state. Two click handlers that
both capture a roll and then both schedule a coalesced rewrite can race:
if each rewrite is computed from a snapshot of "current rolls," a
last-write-wins rewrite can drop one of the two rolls from the rendered
scoreboard. The spec states the lock ("a player who clicked is locked from
rolling again," line 73) but does not state that the roll-capture into the
shared state is atomic, serialized, or compare-and-set against a revision
token. No idempotency key per (check-id, player-id) is specified. The
"rewritten, not appended" rule is a rendering decision, not a concurrency
guard.
---
### 7. P2 guidance ephemeral when the lobby embed has scrolled out of view
**Verdict: PARTIALLY HANDLED — Finding F7 (Low).**
The P2 ephemeral names both join paths in plain text: *"This gathering is
not yet yours to join. Press **Join** on the gathering post above, or use
`/encounter join`."* (`EXPERIENCE.md` lines 134137); the a11y floor
confirms "Guidance is text-clear: the P2 ephemeral names both join paths in
plain text" (line 159). The `/encounter join` path is functional regardless
of scroll position, so the user is never fully stranded.
The unhandled aspect is that the **first** named path — "the gathering post
above" — is a location-relative pointer with no jump link and no scroll
guidance. In a long/busy thread the lobby embed may be far above, or may
have already been edited to the post-Start "The gathering has set out."
state (`DESIGN.md` line 152) whose title no longer matches "the gathering
post" the ephemeral names. The pointer can therefore be stale or
unfindable, leaving the user to rely on the slash-command fallback alone.
Discord ephemerals cannot deep-link to a message, so the spec has no way
to make "above" reliable — but it also doesn't acknowledge that limitation
or rephrase to foreground the slash command. Low severity because the
fallback exists and is named.
A second, smaller unhandled sub-case: in a solo/implicit-join encounter
(FR-30, Flow 3) **there is no lobby embed at all**, yet the same P2
ephemeral text references "the gathering post above." If a visitor posts
in a solo encounter thread, the guidance names a post that does not exist.
The spec uses one fixed ephemeral string regardless of surface.
---
### 8. A colorblind user reading the timer color ramp and the scoreboard success/failure
**Verdict: Scoreboard and normal-phase timer HANDLED; final-stretch timer UNHANDLED — Finding F8 (High, overlaps F4).**
The a11y floor is explicit and well-paired for the steady states:
"Color is never the only signal: every status pairs color with an emoji +
text label (`✅ Success`, `❌ Failure`, `⏰ Time's up`, `⏳ {n}s`)"
(`EXPERIENCE.md` lines 154156); the color table (`DESIGN.md` lines 5866)
pairs every token with an emoji + text. So:
- **Scoreboard success/failure**: HANDLED. `✅` + "Success" / `❌` +
"Failure" + "Rule: majority (2 of 3) — SUCCESS" (`DESIGN.md` lines
173175). A colorblind user reads outcome from emoji + text.
- **Timer color ramp, normal phase** (`pending``urgent` while the number
still ticks in 10s increments): HANDLED. `⏳ {n}s` + color.
- **Timer color ramp, final stretch** (≤ ~10s): **UNHANDLED.** The number
is removed and replaced with `—` (`DESIGN.md` line 191:
`⏳ Time — [ ⏳ hourglass GIF ]`), the GIF is motion (and
"No motion-dependent meaning: countdown meaning is conveyed by the
number + label, not by animation" — `EXPERIENCE.md` lines 161162
forbids motion from carrying meaning), and the color ramps to `urgent`.
For a colorblind user in the final stretch, the distinguishing signal
between "urgent" and "expired" collapses to: emoji `⏳` vs `⏰` + the
label `Time —` vs `Time's up`. The transition from "still time, number
removed" to "no time" is carried by the GIF + color, which is exactly
the combination the a11y floor forbids. There is no text/emoji-only
articulation of "final stretch, seconds remaining" once the number is
blanked.
This is the a11y-face of the same defect as F4: the final-stretch design
offloads urgency to GIF + color and removes the number, breaking
"color never alone" and "no motion-dependent meaning" for the very users
those rules protect.
---
### 9. A player with no Foundry character in a passive-reveal encounter
**Verdict: UNHANDLED — Finding F9 (Medium).**
Passive reveals are "Posted once at encounter start per qualifying player;
public, attributed, atmospheric" (`EXPERIENCE.md` lines 9091), and
attribution is stat-based — "Zara's keen Perception catches what others
miss" (`DESIGN.md` lines 213216). Qualification therefore depends on
Foundry character stats. The spec never defines the case of a joined
player with **no Foundry character linked**:
- For passive reveals: are they silently skipped, or is there guidance?
"Per qualifying player" implies skip, but no spec text says so, and no
ephemeral tells the unlinked player why they got no reveal while others
did.
- More seriously, the single player-locked `Roll` button's whole premise
is that "advantage/disadvantage is decided upstream by the story/DM (LLM
emit) and the character's Foundry stats" (`EXPERIENCE.md` lines 128129,
220223; `DESIGN.md` lines 122125). A targeted player with no Foundry
character has no stats from which to derive modifier, advantage, or
disadvantage. The spec defines no fallback: no default modifier, no
"roll as untrained," no ephemeral "your fate is unbound — link a
character," no refusal-to-emit. The `🎯 Modifier` and `🟢 Roll Mode`
fields in the skill-check mockup (`DESIGN.md` lines 130131) would
render blank or error.
The open item at `EXPERIENCE.md` lines 219223 notes that "Architecture
must combine LLM-specified + Foundry-derived advantage into the roll" —
but that note assumes a Foundry character exists. The no-character branch
of that combination is undefined.
---
### 10. The lobby idle-auto-expiry firing while players are still deciding
**Verdict: UNHANDLED — Finding F10 (Medium-High).**
The spec states the expiry tersely: "closes if idle ~30m" (`DESIGN.md`
line 148); "`expired` (idle ~30m → thread closed without start)"
(`EXPERIENCE.md` line 102). Nothing in either document defines:
- **What activity resets the idle timer.** Does a chat message in the
thread reset it? A `/encounter join`? A `Leave`? If chat resets it, one
chatty player prevents expiry indefinitely; if only Join/Leave/Begin
reset it, the lobby can expire while players are actively reading the
encounter prompt and discussing whether to join — exactly the "still
deciding" state. The signal is undefined.
- **Any pre-expiry warning.** There is no `urgent`-color "the gathering
disperses in ~5m unless someone joins" state, no ephemeral to already-
joined players, no final Begin prompt. The lobby goes straight from
`gathering` to `expired`/thread-closed.
- **Any recovery/extension path.** Once expired, "thread closed without
start" — there is no `/encounter start` replay, no re-open, no
"the gathering re-forms" transition. A starter whose party was still
deciding has no recourse in-spec.
The "idle ~30m" rule is also the *only* lobby terminal state besides
`cancelled` and `started`, so the undefined signal directly affects
whether the lobby reaches its intended climax or dies mid-decision.
---
## Findings (unhandled paths only)
| ID | Severity | Location | Trigger (≤15 words) |
|----|----------|----------|---------------------|
| F4 | High | DESIGN.md:68-74,189-195 / EXPERIENCE.md:79-83,210-213 | Final-stretch GIF asset missing or fails to load |
| F5 | High | EXPERIENCE.md:66-73,113,138-141 | Latecomer joins during in-flight group check's final stretch |
| F6 | High | DESIGN.md:95-96 / EXPERIENCE.md:71-73 | Two targeted players roll near-simultaneously; coalesced rewrite races |
| F8 | High | DESIGN.md:189-195 / EXPERIENCE.md:154-162 | Colorblind user in timer final stretch; number removed, GIF+color only |
| F10 | Med-High | DESIGN.md:148 / EXPERIENCE.md:102 | Lobby idle-auto-expiry fires while players still deciding |
| F9 | Medium | EXPERIENCE.md:90-91,128-129,219-223 / DESIGN.md:122-131,213-216 | Targeted/revealing player has no Foundry character linked |
| F2 | Medium | EXPERIENCE.md:71-73,109-114 | Player double-clicks Roll within the coalesced-edit window |
| F3 | Medium | EXPERIENCE.md:60,64,113-114 / DESIGN.md:136-153 | Join clicked while lobby already at maxPlayers cap |
| F7 | Low | EXPERIENCE.md:134-137,159 / DESIGN.md:152 | P2 ephemeral says "gathering post above" but it has scrolled away |
### F2 — Double-roll within the coalesced-edit window (Medium)
- **Location:** `EXPERIENCE.md` lines 7173, 109114.
- **Trigger:** Player clicks Roll twice within the ≤1 edit/s coalesced
window, before the button's disabled state catches up.
- **Guard sketch:** `if (rolled.has(clickerId)) return ephemeral("Your die is already cast."); button.setDisabled(true) on click before await; handler keyed idempotent on (checkId, clickerId).`
- **Consequence:** Two rolls registered for one player; double ephemeral;
scoreboard row flicker / inconsistent total.
### F3 — Join clicked while lobby already at maxPlayers cap (Medium)
- **Location:** `EXPERIENCE.md` lines 60, 64, 113114; `DESIGN.md` lines
136153.
- **Trigger:** A player clicks Join (or `/encounter join`) when joined
already equals `maxPlayers`.
- **Guard sketch:** `if (joined >= maxPlayers) return ephemeral("The seats are full — no place remains at this gathering.");`
- **Consequence:** Roster overflows cap; seat count renders "6 of 5";
Begin/min logic confused; or click silently no-ops with no feedback.
### F4 — Final-stretch GIF missing or fails to load (High)
- **Location:** `DESIGN.md` lines 6874, 189195; `EXPERIENCE.md` lines
7983, 142146, 210213.
- **Trigger:** Hourglass GIF URL unhosted, 404, or Discord fails to render
it during the ≤~10s final stretch.
- **Guard sketch:** `on setImage error / missing asset: keep numeric countdown text ticking per-second for final 10s instead of blanking to "—"; always render "⏳ {n}s remain" text.`
- **Consequence:** Final stretch has no countdown text, no motion cue, only
color — violates a11y floor; urgency signal lost for everyone, not just
low-vision users.
### F5 — Latecomer joins during in-flight group check's final stretch (High)
- **Location:** `EXPERIENCE.md` lines 6673, 113, 138141.
- **Trigger:** Latecomer clicks Join / `/encounter join` while a group
check scoreboard is in `filling`/final-stretch.
- **Guard sketch:** `define: late join during active check does NOT add a scoreboard row or recompute majority for the in-flight check; roster updated, but check's targeted set is frozen at emit; ephemeral to latecomer: "The trial is already underway — your turn comes with the next."`
- **Consequence:** Unknown whether latecomer must roll, whether majority
denominator changes, whether they can flip an in-flight outcome in the
final seconds; scoreboard/majority logic undefined.
### F6 — Coalesced-edit rewrite race on near-simultaneous rolls (High)
- **Location:** `DESIGN.md` lines 9596; `EXPERIENCE.md` lines 7173.
- **Trigger:** Two targeted players click Roll near-simultaneously; both
schedule coalesced rewrites of the scoreboard field set.
- **Guard sketch:** `roll capture into shared state is atomic (single mutex / DB upsert keyed by (checkId, playerId)); coalesced edit reads from that state, never from a per-click snapshot; last-write-wins forbidden.`
- **Consequence:** One player's roll silently dropped from the rendered
scoreboard; majority computed off missing roll; wrong group outcome.
### F7 — P2 ephemeral references "gathering post above" that has scrolled away (Low)
- **Location:** `EXPERIENCE.md` lines 134137, 159; `DESIGN.md` line 152.
- **Trigger:** Non-joined message deleted in a long thread where the lobby
embed is far above (or already edited to "set out").
- **Guard sketch:** `foreground the slash command in the ephemeral: "Use /encounter join to take your seat." then mention the button only as "or press Join on the gathering post up-thread."`
- **Consequence:** User sent to find a button they can't locate; relies on
the slash-command fallback; in solo encounters the named post doesn't
exist at all.
### F8 — Colorblind user in the timer final stretch (High, overlaps F4)
- **Location:** `DESIGN.md` lines 189195; `EXPERIENCE.md` lines 154162.
- **Trigger:** Final stretch blanks the number to `—` and offloads urgency
to GIF (motion) + `urgent` color.
- **Guard sketch:** `keep a text/emoji articulation of urgency in the final stretch independent of color and motion: render "⏳ Final seconds — roll now" as the field value; do not rely on GIF or color alone.`
- **Consequence:** Color-only urgency signal in the final stretch breaks
"color never alone" and "no motion-dependent meaning" for colorblind /
low-vision users; they cannot distinguish urgent-vs-expired from the
field alone.
### F9 — Player with no Foundry character in a passive-reveal / targeted roll (Medium)
- **Location:** `EXPERIENCE.md` lines 9091, 128129, 219223; `DESIGN.md`
lines 122131, 213216.
- **Trigger:** A joined/targeted player has no Foundry character linked
when a passive reveal fires or a roll is emitted against them.
- **Guard sketch:** `if (!hasFoundryChar(player)) { skip passive reveal silently OR emit guidance ephemeral; for roll: use modifier +0 / no-advantage defaults and mark Roll Mode "Unbound", or refuse emit with ephemeral "Link a character to take this roll." }`
- **Consequence:** Modifier/Roll-Mode fields blank or error on the skill-
check embed; passive reveal silently excludes the unlinked player with no
feedback; roll may fail at the Foundry-derivation step.
### F10 — Lobby idle-auto-expiry fires while players are still deciding (Med-High)
- **Location:** `DESIGN.md` line 148; `EXPERIENCE.md` line 102.
- **Trigger:** The ~30m idle timer elapses while players are reading /
discussing but not pressing Join/Begin.
- **Guard sketch:** `define idle-reset signal (e.g. any thread message resets); add an `urgent` pre-expiry warning at ~25m ("🕯️ The gathering fades in ~5m unless a seat is taken"); provide `/encounter start` replay or re-open from `expired`.`
- **Consequence:** Lobby dies mid-decision with no warning and no recovery;
thread closed; starter loses the roster with no in-spec recourse.
---
## Coverage note
Of the ten nominated boundary interactions, **one is fully handled** (#1,
non-targeted Roll click → ephemeral), **one is partially handled with a
working fallback** (#7, P2 guidance with `/encounter join` fallback), and
**eight produce nine findings** (#2, #3, #4, #5, #6, #8, #9, #10; #8
overlaps #4 as its a11y expression). The dominant defect cluster is the
**timer final stretch**: F4 and F8 together show that the design's
"blank the number, run the GIF, ramp the color" choice simultaneously
creates a missing-asset failure mode (F4) and an a11y-floor violation
(F8). The second cluster is **concurrency / undefined-state** at the
scoreboard and lobby state machine (F2, F5, F6, F10) — the coalesced-edit
discipline (P3) is specified as a rendering rule but not as a
concurrency/atomicity rule, and several state transitions (cap-full Join,
late-join mid-check, idle-expiry signal) have no defined behavior.

View File

@@ -0,0 +1,100 @@
# UX Spine Quality Review — ux-mardonar-encounter-engine-2026-06-20
Reviewed against: DESIGN.md, EXPERIENCE.md, prd.md (IA-closure cross-check).
Date: 2026-06-20.
## 1. DESIGN.md Spine Coverage
| Spine section | Present | Notes |
|---|---|---|
| Brand & Style | Yes | Defines the "embed language" identity; anchors P3 (no tornado) and in-world voice. |
| Colors | Yes | 7 tokens with hex, meaning, and paired non-color signal. Status-never-color-alone rule stated. |
| Typography | Yes | Titles (emoji + double-space), italic description, bold mechanical numbers, footer imperative. Notes Discord-controlled font. |
| Layout & Spacing | Yes | One-embed-per-surface, inline field triples, scoreboard rows, ActionRow discipline (Join in own row → P1). |
| Elevation & Depth | Yes | Acknowledges Discord has no shadows; substitutes color, footer+timestamp, button disable state. |
| Shapes | Yes | Discord-controlled; left bar = status color; one-embed discipline makes the bar read. |
| Components | Yes | 7 surface components + the skill-check root pattern, each with an ASCII mock. |
| Do's and Don'ts | Yes | 6 Do / 6 Don't; map back to P1/P2/P3 and a11y. |
| YAML frontmatter tokens | Yes | `colors`, `typography`, `rounded`, `spacing`, `components` blocks; tokens are path-addressable. |
**DESIGN spine: complete.** All eight sections plus frontmatter present and internally consistent.
## 2. EXPERIENCE.md Spine Coverage
| Spine section | Present | Notes |
|---|---|---|
| Foundation | Yes | Form-factor, DESIGN.md reference, and P1/P2/P3 stated up front as governing principles. |
| Information Architecture | Yes | 7-row surface table with visibility, lifespan, and a "See" pointer per row; explicit closure paragraph mapping surfaces → FRs. |
| Voice and Tone | Yes | Atmospheric second-person narrator; forbids utility jargon; defers mockup-level voice to DESIGN.md Components. |
| Component Patterns (behavioral) | Yes | One block per surface with button styles, enablement, and edit-in-place rules. |
| State Patterns | Yes | Lobby states, check states, button enablement, ephemeral lifecycle. |
| Interaction Primitives | Yes | Roll-button player-lock, slash-command ephemeral confirm, P2 stop-and-guide, P1 latecomer Join, timer edit-in-place, customId prefixes (with retired Adv/Dis prefixes listed). |
| Accessibility Floor | Yes | Color-not-only-signal, descriptive button labels, text-clear guidance, no motion-dependent meaning, private ephemerals. |
| Key Flows | Yes | 3 flows with explicit climaxes (group check resolve, in-flight join, sands run out). |
**EXPERIENCE spine: complete.** All eight sections present.
## 3. IA Closure — PRD Surfaces ↔ UX Surfaces ↔ Flows
PRD's 7 surfaces (per the review brief): lobby, group scoreboard, timed check, ephemeral roll view, passive reveal, status command, latecomer Join.
| PRD surface | UX surface (IA #) | DESIGN component | A Key Flow exercises it? |
|---|---|---|---|
| Lobby | #1 Lobby embed | `lobbyEmbed` | Yes — Flow 1 |
| Group scoreboard | #2 Group-check scoreboard | `groupScoreboardEmbed` | Yes — Flow 1 (climax) |
| Timed check | #3 Timed-check embed | `timedCheckEmbed` | Yes — Flow 3 (climax) |
| Ephemeral roll view | #4 Ephemeral roll view | `ephemeralRollView` | Yes — Flows 1 & 3 |
| Passive reveal | #5 Passive-reveal post | `passiveRevealPost` | Partial — fired as a step in Flow 1, no dedicated climax flow |
| Status command | #6 Status confirmation | `statusConfirmation` | **No — no Key Flow exercises `/character status set|clear`** |
| Latecomer Join | #7 Latecomer Join affordance | (Interaction Primitives) | Yes — Flow 2 (climax) |
**IA closure: 6 of 7 surfaces fully closed; 1 gap.** Every PRD surface has a UX surface and a DESIGN component. Every surface has *some* flow treatment except **status confirmation**, which appears in Component Patterns and State Patterns but never in a Key Flow. The "every surface has a flow" bar is met for 6/7 and missed for status command. Passive reveal is weakly closed (step inside Flow 1, not its own climax), which is acceptable but worth noting.
## 4. Cross-Reference Integrity (EXPERIENCE → DESIGN tokens)
EXPERIENCE.md uses `{path.to.token}` syntax for color tokens: `{colors.pending}`, `{colors.urgent}`, `{colors.failure}`, `{colors.gathering}`, `{colors.notice}`, `{colors.success}`. All six resolve to tokens present in DESIGN.md's YAML `colors:` block. **Color cross-refs: clean.**
Non-color cross-refs are **mixed syntax**: the IA "See" column uses `DESIGN \`components.lobbyEmbed\`` (backtick-path) rather than `{components.lobbyEmbed}`, and the Accessibility Floor refers to `DESIGN.md.Colors` and `DESIGN.md.Components` as section names. This is a consistency nit, not a broken reference — every pointer resolves to a real section/token — but the rubric's "{path.to.token} syntax" bar is only met for colors, not for components/typography/spacing. No broken references found; no `{colors.X}` token mismatches.
## 5. P1 / P2 / P3 Enforcement (repeated, not just stated once)
| Principle | Stated in Foundation | Re-enforced where |
|---|---|---|
| **P1 — obvious in-flight join** | Yes | IA table (surface 7 lifespan "persistent"); Component Patterns → Lobby ("Join stays in a dedicated row, remains live after Start"); State Patterns → Button enablement ("Join: always enabled while the encounter is live"); Interaction Primitives → Latecomer Join (two paths); DESIGN Layout & Spacing (Join own row); DESIGN Do's; Flow 2 climax. |
| **P2 — stop & guide** | Yes | IA closure line ("Non-joined guidance → P2"); State Patterns → Ephemeral lifecycle; Interaction Primitives → Non-joined message (full ephemeral text quoted); Accessibility Floor ("Guidance is text-clear"); DESIGN Do's; Flow 1 step 2; Flow 2 step 2. |
| **P3 — no tornado** | Yes | IA table (lifespan "edited in place"); Component Patterns → Lobby, Group-check, Ephemeral; State Patterns → Check states ("by edit, never re-post"); Interaction Primitives → Timer edit-in-place + coalesced edit tick; DESIGN Layout & Spacing + Colors (10s increments, GIF final stretch); DESIGN Do's/Don'ts; Flow 1 (coalesced edits); Flow 3 (edited in place). |
**All three principles are enforced throughout both documents**, not merely declared. P3 in particular is reinforced at every state-transition point. No principle is stated once and dropped.
## 6. Voice Consistency — In-World, No Utility Jargon
Forbidden utility terms in player-facing strings: "lobby", "session", "ephemeral", "registry".
Audit of every quoted player-facing string across both docs:
- Lobby embed title/footer: *"The Party Gathers — Velvet Auction"*, *"Press Join to take your seat — or /encounter join. • closes if idle ~30m"* — clean; uses "gathering"/"seats", never "lobby".
- P2 guidance ephemeral: *"This gathering is not yet yours to join. Press Join on the gathering post above, or use /encounter join."* — clean.
- Roll-not-yours ephemeral: *"This roll is not yours to make."* — clean.
- Timed footer: *"The sands run out. Roll before time slips away."* — clean, fate-flavored.
- Passive reveal: *"A Detail Revealed … Zara's keen Perception catches what others miss."* — clean.
- Status confirmation: *"Status Set — Zara is now marked: sick"* — clean.
- Group resolve: *"The party prevails"* — clean.
- Ephemeral roll view: *"✅ Success — the sentries do not see you."* — clean.
The terms "lobby", "session", "ephemeral", "registry" appear only in **descriptive/spec prose** (IA table, State Patterns, Interaction Primitives, Open Items) — never inside a quoted player-facing string. The one borderline case is Flow 1 step 5's parenthetical *"`{colors.gathering}`): seats 0/3"*, which is narration about the embed, not embed text. **Voice consistency: holds.** No utility jargon leaks into player-facing copy.
## 7. Findings (severity-ranked)
1. **[Medium] Status-confirmation surface has no Key Flow.** IA closure bar is "every surface has a flow"; surface #6 (status command / `statusConfirmation`) is covered in Component Patterns and State Patterns but absent from all three Key Flows. Add a short Flow 4 showing the DM setting `sick` and the LLM narrating it, or fold the set/clear ack into an existing flow as a step. *(EXPERIENCE.md §Key Flows)*
2. **[Medium] Stale "Roll/Adv/Dis" in Flow 1.** Flow 1 step 5 says the scoreboard appears "with `Roll/Adv/Dis`", contradicting FR-43, DESIGN.md's group-check mock (single `[ Roll ]`), Component Patterns ("a single Roll, locked to targeted players"), and the retired `grp_adv`/`grp_dis` customId prefixes listed in Interaction Primitives. Replace with "with a single player-locked `Roll`." *(EXPERIENCE.md Flow 1, step 5)*
3. **[Low] Stale "Advantage" button-label example in Accessibility Floor.** The a11y section lists `"Advantage"` as a descriptive button-label example, but the Advantage button is retired on every skill-check surface (FR-43). Use `"Begin"`, `"Join"`, `"Roll"` only. *(EXPERIENCE.md §Accessibility Floor)*
4. **[Low] Inconsistent cross-reference syntax for non-color tokens.** Color tokens use `{colors.X}`; component references use `` `components.lobbyEmbed` `` and section-name prose (`DESIGN.md.Colors`). No broken references, but the rubric's "{path.to.token} syntax" bar is applied unevenly. Harmonize to `{components.X}` / `{typography.X}` if the {token} convention is intended to be authoritative. *(EXPERIENCE.md §Information Architecture "See" column, §Accessibility Floor)*
5. **[Low] Passive-reveal surface is only weakly flowed.** It fires as a step inside Flow 1 (step 4) with no climax of its own. Acceptable given UJ-2 lives in the PRD, but a one-line acknowledgment in §Key Flows that passive reveals are exercised via Flow 1 step 4 would make closure explicit. *(EXPERIENCE.md §Key Flows)*
## 8. Verdict
**Pass with minor revisions.** Both spines are fully present, all three user principles are enforced throughout, and voice is clean. IA closure is 6/7 surfaces fully flowed (status command missing), and two stale Adv/Dis references contradict the unified single-Roll decision in FR-43. No broken token cross-references; non-color token syntax is inconsistent but not broken.