merge: PRD — multiplayer live E2E via a second player bot

2026-06-22 19:48:07 +00:00

@@ -0,0 +1,272 @@
---
title: "Multiplayer Live E2E via a Second Player Bot"
status: final
created: 2026-06-22
updated: 2026-06-22
---
# Product Requirements Document: Multiplayer Live E2E via a Second Player Bot
## 1. Overview
The Mardonar Encounter Engine's group-encounter feature set (Features A–E +
FR-43) shipped with **unit + integration coverage only — no live E2E tier** for
multiplayer surfaces. The documented blocker is the **one-token constraint**: a
single bot token can represent only one gateway identity, so true multi-player
fan-out (multiple distinct players joining a lobby, posting, and rolling in a
group check) could not be exercised live. The architecture explicitly named a
**second bot token** as the way to close this gap
(`_bmad-output/arch/arch-mardonar-encounter-engine-2026-06-20/architecture.md §8`,
"Manual pre-release multi-player playtest checklist" + the FU-9 optional
second-token live E2E in `follow-up-stories.md`).
This PRD defines a **second player-side bot** in the live E2E flow so that **two
distinct players** can participate in a real group encounter end-to-end: join a
lobby, post real chat turns that route through the live `messageRouter`, and
roll in a group skill check whose scoreboard finalizes with the real
`successRule`. It closes the four documented residual-risk gap cases
(simultaneous fan-out, `successRule` N>1, per-user ephemeral delivery,
second-claimant rejection) that today have no automated coverage above the
unit tier.
The human safety net, `docs/release-playtest-checklist.md` (FU-9), remains;
this live E2E **reduces reliance on it but does not replace it**.
## 2. Problem & Motivation
- **No real multi-player fan-out coverage.** Group checks are exercised only
with `ioredis-mock` + synthetic `Interaction` objects. Real Discord gateway
fan-out, gateway event ordering, and per-user ephemeral delivery are
untested automatically.
- **The driver bot can't act as a player.** The existing live E2E already runs
a second client (`E2E_DRIVER_TOKEN`), but the bot-under-test **ignores
bot-authored messages** (anti-loop guard, `messageRouter.ts:33`). So the
driver bot's messages never route as player turns; the harness works around
this with `fakeInteraction` + direct `runLLMTurn` calls. There is no way to
put two distinct player identities into a live group encounter.
- **The four gap cases are the residual release risk.** Per the arch, four
behaviors have unit + integration coverage only: simultaneous multi-player
fan-out; `successRule` N>1; per-user ephemeral delivery; and second-claimant
rejection. These are exactly the behaviors most likely to break on real
Discord and least covered today.
## 3. Goals & Non-Goals
### Goals
- Enable **two distinct player identities** in a live group encounter via a
second player-side bot token.
- Route both players' **real gateway messages** through the live
`messageRouter` as player turns (env-gated anti-loop allowlist).
- Drive button interactions (lobby **Join**, group-check **Roll**) for both
players via `fakeInteraction` using each player bot's real userId, against
real Redis + real scoreboard embed state.
- Cover the **MVP multiplayer run** (lobby gating → 2 join → start → 2 real
chat turns → group check N=2 → scoreboard finalizes) **and the four gap
cases** as automated live ACs.
- Zero production behavior change when E2E is not explicitly enabled.
### Non-Goals (out of scope)
- **3+ player bots.** Design the harness for N players, ship N=2.
- **Real gateway button clicks.** Bots cannot click another bot's buttons as a
user; button interactions stay `fakeInteraction` (with real userIds).
- **Real per-user ephemeral delivery to a bot client.** The ephemeral *content*
is asserted via the `fakeInteraction` reply; real gateway delivery of an
ephemeral to a bot client is not represented (documented limitation).
- **Any production use of the second token.** The second bot is an E2E-only
fixture.
- **Replacing the manual playtest checklist.** FU-9 stays the human gate.
- **A separate player-bot process / control channel.** Rejected as overkill
(see §9 — Approach C); in-process clients are sufficient.
## 4. User Journeys
### UJ-1 — Two players run a live group heist (the test)
A maintainer sets `RUN_FULL_E2E=1` plus the player-bot env vars and runs the
multiplayer suite. The harness connects three clients (bot under test, player1,
player2) to the test guild. The test starts a `minPlayers: 2` group fixture
spec; player1 and player2 **Join** the lobby (fakeInteraction, each bot's real
userId); **Start** becomes enabled, and once the encounter starts, the opening narrative + a passive reveal fire.
Player1 and player2 each post a **real gateway message** in the thread — both
route through `messageRouter` as player turns (anti-loop allowlist), and
session history grows with both authors. The LLM emits a group Stealth check;
both players **Roll** (fakeInteraction, distinct userIds); the central
scoreboard updates live and finalizes with the real `successRule` (majority); a
`[GROUP CHECK RESULT]` system message lands in history. The test asserts on
Redis state, the real embed, and history structure — never narrative text.
### UJ-2 — Below-minimum gating (the negative AC)
The test starts `velvet-auction` (`minPlayers: 3`); only player1 + player2
join (2 < 3); **Start** stays disabled. Asserts the lobby gate enforces the
minimum.
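
A minimal sketch of the gate this journey exercises, assuming the lobby tracks joined userIds and compares the distinct count against the spec's `minPlayers`. `LobbyState` and `isStartEnabled` are hypothetical names used for illustration, not the engine's actual API.

```typescript
// Hypothetical lobby shape; the engine's real lobby state is not shown in this PRD.
interface LobbyState {
  minPlayers: number;
  joinedUserIds: string[];
}

// Start is enabled only once enough distinct players have joined.
function isStartEnabled(lobby: LobbyState): boolean {
  const distinct = lobby.joinedUserIds.filter(
    (id, index, all) => all.indexOf(id) === index,
  );
  return distinct.length >= lobby.minPlayers;
}

// UJ-2: velvet-auction needs 3 players, but only 2 have joined.
const velvetAuction: LobbyState = {
  minPlayers: 3,
  joinedUserIds: ["player1", "player2"],
};

// UJ-1: the minPlayers: 2 fixture with the same two players.
const groupFixture: LobbyState = {
  minPlayers: 2,
  joinedUserIds: ["player1", "player2"],
};
```

Counting distinct ids rather than raw joins is an assumption here; it keeps a repeated Join by one player from satisfying the gate.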
### UJ-3 — A late click is rejected (the second-claimant AC)
During a group check, player1 clicks **Roll**, then the same userId "clicks"
again via a second fakeInteraction. The second is rejected by the FR-45
idempotency lock; the scoreboard shows one roll for player1. No duplicate roll,
no lost roll.
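
The claim-once behavior described above can be sketched in memory, assuming the FR-45 lock behaves like an atomic "record if absent" keyed by check and user. `RollLedger`, `claimRoll`, and `rollsFor` are hypothetical names; the real lock is Redis-backed and its implementation is not shown in this PRD.

```typescript
// Hypothetical in-memory stand-in for the FR-45 idempotency lock.
// It only illustrates the claim-once semantics the test asserts on.
type RollResult = { userId: string; total: number };

class RollLedger {
  private readonly rolls: Record<string, RollResult[]> = {};

  // Returns true if this call recorded the roll, false if the user already rolled.
  claimRoll(checkId: string, roll: RollResult): boolean {
    const recorded = this.rolls[checkId] || (this.rolls[checkId] = []);
    if (recorded.some((r) => r.userId === roll.userId)) {
      return false; // second claimant: rejected, the first roll is kept
    }
    recorded.push(roll);
    return true;
  }

  // Returns a copy of the rolls recorded for one group check.
  rollsFor(checkId: string): RollResult[] {
    return (this.rolls[checkId] || []).slice();
  }
}

const ledger = new RollLedger();
const first = ledger.claimRoll("check-1", { userId: "player1", total: 14 });
const second = ledger.claimRoll("check-1", { userId: "player1", total: 19 });
```

Here `first` is accepted and `second` is rejected, so the ledger holds exactly one roll for `player1`, which is the scoreboard state UJ-3 expects.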
## 5. Features & Functional Requirements
### 5.1 Harness — N-player live clients
- **FR-1** `tests/integration/graphmcp/support/liveBots.ts` is extended so
`connectLiveBots()` connects **N** player clients and returns
`{ botClient, players: Client[], guild, channel }` with `players.length ≥ 2`.
`player1` is the existing `E2E_DRIVER_TOKEN` bot (re-roled from "driver" to
"player"); `player2` is `E2E_PLAYER2_TOKEN`. A `driverBot` alias to
`players[0]` preserves the existing AC2–AC4 tests unchanged.
- **FR-2** New env vars (added to `src/config.ts` via Zod):
`E2E_PLAYER2_TOKEN` (required for the multiplayer suite),
`E2E_PLAYER2_USER_ID` (**optional**; auto-fetched from the client after
`login()` if unset; see OQ-1), `E2E_PLAYER_BOT_IDS` (comma-separated
player-bot userIds; auto-populated from the connected player clients if
unset), and `E2E_ALLOW_PLAYER_BOTS` (boolean, default `false`). Because
`z.coerce.boolean()` applies JavaScript `Boolean()` coercion, any non-empty
string, including `"false"`, parses as `true`; the schema should therefore
parse the flag explicitly so that only opt-in values such as `"1"` or
`"true"` enable it. `connectLiveBots` throws clearly if `E2E_PLAYER2_TOKEN`
is missing when `RUN_FULL_E2E=1`.
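
The FR-2 parsing rules can be sketched without dependencies. This is an illustration only: the PRD places the real schema in `src/config.ts` via Zod, and the names `parseFlag`, `parseIdList`, `parsePlayerBotConfig`, and `requirePlayer2Token` are hypothetical. The sketch assumes that only `1` or `true` should enable the flag.

```typescript
// Hypothetical, dependency-free sketch of the FR-2 parsing rules.
type Env = Record<string, string | undefined>;

interface PlayerBotConfig {
  allowPlayerBots: boolean;
  player2Token: string | undefined;
  playerBotIds: string[];
}

// Only explicit opt-in values enable the flag; unset or anything else is false.
function parseFlag(raw: string | undefined): boolean {
  if (raw === undefined) return false;
  const value = raw.trim().toLowerCase();
  return value === "1" || value === "true";
}

// Split a comma-separated id list, dropping blanks and surrounding whitespace.
function parseIdList(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

function parsePlayerBotConfig(env: Env): PlayerBotConfig {
  return {
    allowPlayerBots: parseFlag(env.E2E_ALLOW_PLAYER_BOTS),
    player2Token: env.E2E_PLAYER2_TOKEN,
    playerBotIds: parseIdList(env.E2E_PLAYER_BOT_IDS),
  };
}

// Fail fast when the full live suite is requested without the second token.
// FR-2 assigns this check to connectLiveBots; it is standalone here for clarity.
function requirePlayer2Token(env: Env): string {
  const token = env.E2E_PLAYER2_TOKEN;
  if (env.RUN_FULL_E2E === "1" && !token) {
    throw new Error("E2E_PLAYER2_TOKEN is required when RUN_FULL_E2E=1");
  }
  return token || "";
}
```

An empty `playerBotIds` result would be the signal to fall back to ids fetched from the connected player clients, as FR-2 describes.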
### 5.2 Anti-loop guard — env-gated player allowlist (production code)
- **FR-3** `src/bot/handlers/messageRouter.ts` anti-loop guard gains a branch:
when `config.E2E_ALLOW_PLAYER_BOTS === true` **and** `message.author.id` is in
`config.E2E_PLAYER_BOT_IDS`, the message is treated as a player turn (not
skipped). Otherwise the current behavior (skip `author.bot`) is unchanged.
- **FR-4** The allowlist is **strictly env-gated** — default `false`, no
hardcoded ids. A unit test (no live run) covers both branches: flag off → bot
message skipped; flag on + id allowlisted → routed as a player turn.
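
The two branches FR-3 and FR-4 describe can be expressed as a pure predicate, which is also what makes them unit-testable without a live run. The shapes `AuthorLike` and `GuardConfig` and the function `shouldSkipAsBot` are hypothetical; the actual guard in `messageRouter.ts` is not reproduced in this PRD.

```typescript
// Hypothetical shapes; the real router receives a Discord message and the app config.
interface AuthorLike {
  id: string;
  bot: boolean;
}

interface GuardConfig {
  E2E_ALLOW_PLAYER_BOTS: boolean;
  E2E_PLAYER_BOT_IDS: readonly string[];
}

// Returns true when the anti-loop guard should skip the message.
function shouldSkipAsBot(author: AuthorLike, config: GuardConfig): boolean {
  if (!author.bot) {
    return false; // human authors are never skipped by this guard
  }
  const allowlisted =
    config.E2E_ALLOW_PLAYER_BOTS === true &&
    config.E2E_PLAYER_BOT_IDS.indexOf(author.id) !== -1;
  // Bots are skipped unless the flag is on AND the id is explicitly listed.
  return !allowlisted;
}

const playerBot: AuthorLike = { id: "111", bot: true };
const flagOff: GuardConfig = { E2E_ALLOW_PLAYER_BOTS: false, E2E_PLAYER_BOT_IDS: ["111"] };
const flagOn: GuardConfig = { E2E_ALLOW_PLAYER_BOTS: true, E2E_PLAYER_BOT_IDS: ["111"] };
```

With `flagOff`, the player bot is skipped even though its id is listed; with `flagOn`, the same message routes as a player turn. Requiring both conditions is what keeps the default production path identical to today's behavior.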
### 5.3 E2E group fixture spec
- **FR-5** A dedicated fixture spec `tests/fixtures/spec-group-e2e.yaml`
(`minPlayers: 2`) with a group skill check and a passive reveal, so two
player bots satisfy the lobby gate. `velvet-auction` (`minPlayers: 3`) is
reused for the below-min gating AC (FR-9).
### 5.4 Live multiplayer test
- **FR-6** New `tests/integration/graphmcp/group-encounter-live.test.ts`,
gated by `RUN_FULL_E2E=1` and the new player-bot env (`describe.skipIf`),
skipped by default → CI-safe.
- **FR-7 (MVP — lobby gating + start)** Start the fixture spec via
`fakeInteraction` (starter userId); assert **Start** disabled while < 2
joined; player1 + player2 Join (fakeInteraction, each bot's real userId);
assert **Start** enables; starter starts; assert opening narrative + a
passive reveal fire for a qualifying player.
- **FR-8 (MVP — 2 real chat turns + regulation)** Player1 and player2 each
post a **real gateway message** in the thread; assert both route through
`messageRouter` as player turns (history grows with both authors). A message
from a non-joined, non-allowlisted author is **auto-deleted** (FR-28/29);
assert no false-positive deletion of joined players' messages.
- **FR-9 (below-min gating)** Start `velvet-auction` (`minPlayers: 3`); only 2
join; assert **Start** stays disabled.
- **FR-10 (MVP — group check N=2)** The LLM emits a group skill check; player1
+ player2 **Roll** (fakeInteraction, distinct userIds); assert the central
scoreboard updates live and finalizes with the real `successRule`; assert a
`[GROUP CHECK RESULT]` system message is appended to history.
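
The gate in FR-6 could be computed by a small predicate passed to `describe.skipIf`. This is a sketch under assumptions: the PRD does not say exactly which variables make up "the new player-bot env", so requiring both `E2E_DRIVER_TOKEN` and `E2E_PLAYER2_TOKEN` is a guess, and `shouldSkipMultiplayerSuite` is a hypothetical name.

```typescript
// Hypothetical predicate for the suite-level skip gate.
type EnvVars = Record<string, string | undefined>;

// Returns true when the multiplayer suite must be skipped (the default, CI-safe case).
function shouldSkipMultiplayerSuite(env: EnvVars): boolean {
  const fullE2E = env.RUN_FULL_E2E === "1";
  // Assumption: both player tokens must be present for the suite to run.
  const hasPlayerBots =
    Boolean(env.E2E_DRIVER_TOKEN) && Boolean(env.E2E_PLAYER2_TOKEN);
  return !(fullE2E && hasPlayerBots);
}

const ciEnv: EnvVars = {};
const partialEnv: EnvVars = { RUN_FULL_E2E: "1", E2E_DRIVER_TOKEN: "set" };
const fullEnv: EnvVars = {
  RUN_FULL_E2E: "1",
  E2E_DRIVER_TOKEN: "set",
  E2E_PLAYER2_TOKEN: "set",
};
```

In the test file, the result would feed the gate as `describe.skipIf(shouldSkipMultiplayerSuite(process.env))`, so an unconfigured environment skips every multiplayer test.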
### 5.5 The four gap cases
- **FR-11 (G1 — simultaneous fan-out)** Both players post near-simultaneously
(Promise.all); assert no lost or duplicated turn — history contains both
authors' turns, in a stable order.
- **FR-12 (G2 — successRule N>1)** Run the group check under `majority`,
`all`, and `n_of_m` and assert the correct group outcome for each on two
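rollers. Expected outcomes: `all` → 2/2 SUCCESS, 1/2 FAILURE; `majority` →
per the group-encounters PRD FR-15 (`≥ ceil(N/2)` succeed), the threshold for
N=2 is `ceil(2/2)` = 1, so 1/2 SUCCESS and 0/2 FAILURE; `n_of_m` with n=1 of
m=2 → 1/2 SUCCESS, 0/2 FAILURE. With two rollers, `majority` and `n_of_m`
(n=1) therefore produce identical outcomes, and only `all` differs.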
- **FR-13 (G3 — per-user ephemeral)** Each player's **Roll** fakeInteraction
yields an ephemeral reply containing that player's `d20 + modifier vs DC`;
assert the reply content is per-user (each sees only their own roll).
- **FR-14 (G4 — second-claimant rejection)** The same userId triggers **Roll**
twice (two fakeInteractions); assert the second is rejected (FR-45 idempotency
lock) and the scoreboard shows exactly one roll for that player.
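
The FR-12 outcome matrix can be checked against a small evaluator. This sketch assumes a discriminated-union shape for the rule; `SuccessRule` and `groupSucceeds` are hypothetical names, and the engine's real `successRule` type may differ. The `majority` threshold follows the `≥ ceil(N/2)` definition quoted in FR-12.

```typescript
// Hypothetical rule shape, used only to illustrate the FR-12 matrix.
type SuccessRule =
  | { kind: "majority" }
  | { kind: "all" }
  | { kind: "n_of_m"; n: number };

// Returns true when the group check succeeds given each roller's pass/fail result.
function groupSucceeds(rule: SuccessRule, passed: boolean[]): boolean {
  const rollers = passed.length;
  const successes = passed.filter((p) => p).length;
  if (rollers === 0) {
    return false; // assumption: a check with no rollers cannot succeed
  }
  switch (rule.kind) {
    case "all":
      return successes === rollers;
    case "majority":
      return successes >= Math.ceil(rollers / 2); // N=2 gives a threshold of 1
    case "n_of_m":
      return successes >= rule.n;
  }
}

// The two-roller matrix from FR-12.
const outcomes = {
  allBothPass: groupSucceeds({ kind: "all" }, [true, true]),
  allOnePass: groupSucceeds({ kind: "all" }, [true, false]),
  majorityOnePass: groupSucceeds({ kind: "majority" }, [true, false]),
  majorityNonePass: groupSucceeds({ kind: "majority" }, [false, false]),
  oneOfTwoOnePass: groupSucceeds({ kind: "n_of_m", n: 1 }, [false, true]),
};
```

Driving the live matrix with force-emitted checks (the OQ-2 recommendation) would let each row above be asserted deterministically against the finalized scoreboard.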
### 5.6 Cross-cutting
- **FR-15** `afterAll` flushes the test guild's Redis state, deletes created
threads, and destroys all three clients (`botClient`, `players[0]`,
`players[1]`).
- **FR-16** The multiplayer suite is **serialized** with the existing live-E2E
suite (shared guild/channel/Redis) — no parallel multiplayer runs.
- **FR-17** The live-E2E doc (`tests/integration/atdd-checklist-graphmcp-live-integration-tests.md`)
is updated with the new env vars and the multiplayer ACs.
## 6. Success Metrics
- **SM-1** A maintainer can run a live multiplayer encounter with two distinct
player identities and the suite asserts on real Redis/embed/history state.
- **SM-2** Both players' real gateway messages route as player turns (anti-loop
allowlist), and a message from a non-joined author is auto-deleted; the suite
asserts both behaviors automatically.
- **SM-3** A group check with two rollers finalizes with the correct
`successRule` outcome for `majority` / `all` / `n_of_m`.
- **SM-4** A second Roll click by the same user is rejected; no duplicate or
lost roll.
- **SM-5** Production behavior is byte-for-byte unchanged when
`E2E_ALLOW_PLAYER_BOTS` is unset (the unit test proves the default branch).
**Counter-metrics**
- **CM-1** The anti-loop allowlist accidentally activating in prod (mitigated:
default false + id allowlist + unit test).
- **CM-2** Flaky live tests from real Discord latency (mitigated: serialized
runs, generous timeouts, structural assertions).
- **CM-3** A token leaking into the repo (mitigated: env-only, gitignored
`.env`, PRD/tests reference var names only).
## 7. Non-Functional Requirements
- **NFR-1 (Prod safety)** The anti-loop allowlist is default-off, env-gated,
and id-allowlisted. Zero behavior change in production.
- **NFR-2 (CI-safe)** All multiplayer tests are skipped unless `RUN_FULL_E2E=1`
**and** the player-bot env is present. CI never runs them.
- **NFR-3 (Secret hygiene)** No token value in code, tests, logs, or this PRD.
References are env-var names only. `.env` stays gitignored.
- **NFR-4 (Backward compatibility)** Existing AC2–AC4 live tests are unchanged
(`driverBot` alias preserved).
- **NFR-5 (Serial execution)** No parallel multiplayer runs; the suite shares
the existing live-E2E serialization.
- **NFR-6 (Documentation)** The live-E2E doc is updated with the new env vars
and the multiplayer ACs (FR-17).
## 8. Open Questions & Assumptions
**Assumptions**
- `[ASSUMPTION]` The second bot token is invited to the test guild with the
same intents as the bot under test.
- `[ASSUMPTION]` Player-bot userIds can be **auto-fetched** from each client
after `login()` (`client.user.id`), so `E2E_PLAYER2_USER_ID` /
`E2E_PLAYER_BOT_IDS` may be derived rather than required (see OQ-1).
- `[ASSUMPTION]` Button interactions stay `fakeInteraction` (bots cannot click
another bot's buttons) — accepted limitation.
**Open Questions**
- **OQ-1** Auto-fetch player-bot userIds after login vs. requiring them as env
vars. Auto-fetch is cleaner (one less var to misconfigure); requiring them is
more explicit. **Recommend auto-fetch, keeping `E2E_PLAYER2_USER_ID` optional.**
- **OQ-2** Should the group-check emit be driven by the **real LLM** (matches
AC2) or **force-emitted** by the test for determinism? Real LLM is truer but
nondeterministic; force-emit is stable but bypasses the LLM contract.
**Recommend real LLM for the MVP AC, force-emit for the successRule N>1
matrix (FR-12) where determinism matters.**
- **OQ-3** Fixture spec vs. reusing velvet-auction. Confirm a `minPlayers: 2`
fixture is acceptable (we do **not** want to lower velvet-auction's
`minPlayers: 3`).
## 9. Downstream Handoff
- **Architecture** — No separate arch doc required; the change is small and
well-bounded: one env-gated branch in `messageRouter.ts`, Zod additions in
`config.ts`, an N-player extension of `support/liveBots.ts`, one fixture
spec, and one new live test file. If the team prefers, a short architecture
note can record the anti-loop-allowlist safety argument.
- **Epics & stories** — One epic, "multiplayer live E2E harness," with
stories: (1) config + anti-loop allowlist + unit test; (2) `liveBots` N-player
extension + `driverBot` alias; (3) fixture spec `spec-group-e2e.yaml`;
(4) MVP live test (FR-7/8/10); (5) the four gap-case ACs (FR-11–14);
(6) doc update (FR-17).
- **References**
- Group-encounters PRD: `_bmad-output/prds/prd-mardonar-encounter-engine-2026-06-20/prd.md`
(FR-28/29 message regulation, FR-43 single Roll, FR-44 group-check Redis,
FR-45 idempotency, `successRule`).
- Group-encounters arch §8: the one-token constraint + the four gap cases.
- `follow-up-stories.md` FU-9: this PRD closes its optional second-token
live-E2E item.
- `docs/release-playtest-checklist.md`: the human safety net this complements.
---
_Token management: the second bot token is referenced only as the env-var name
`E2E_PLAYER2_TOKEN` throughout this document. No token value appears here or in
any committed file._