Damascus Entry Points P6: E2E verification (merge gate for v1) #20

Merged
kaykayyali merged 1 commit from feat/entry-points-p6-e2e into main 2026-06-25 12:34:02 +00:00

P6 — End-to-End Verification (merge gate for v1)

Closes the P6 deliverable from wiki/concepts/entry-points-contract.md §8.

What's in this PR

1. tests/e2e/test_entry_points_e2e.py (668 lines) — single Playwright + MCP integration test exercising the full v1 entry-points surface against the live docker-compose stack:

  • Phase 1: mcp.ingest_story(project="e2e-test", story_id="S001", title=..., priority=100) over stdio subprocess → assert WorkItemResponse.phase == "spec"
  • Phase 2: Navigate the browser to /#/items, poll for the new row within 5s, open the drawer, assert the four P5 widgets (PhaseBar, OpenIssues, BlockedItems, CostSparkline) render with non-zero counts
  • Phase 3: Drive the cycle spec → build → review → merged via state.set_phase() helpers; reload the UI after each transition, assert the phase pill updates
  • Phase 4: Open a human_issue via state.open_human_issue(), answer via mcp.answer_question(), assert status == "answered", reload the drawer, assert the answer shows
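
The "poll for the new row within 5s" step in Phase 2 can be sketched as a deadline-based polling helper. This is a minimal illustration only, not the code in test_entry_points_e2e.py; `poll_until`, `find_row`, and their parameters are hypothetical names, and the real test may rely on Playwright's own waiting instead.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    probe: Callable[[], Optional[T]],
    timeout_s: float = 5.0,
    interval_s: float = 0.1,
) -> T:
    """Call probe() until it returns a non-None value or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s}s")
        time.sleep(interval_s)


# Hypothetical stand-in for "look up the new row": succeeds on the third attempt.
attempts = {"n": 0}


def find_row() -> Optional[str]:
    attempts["n"] += 1
    return "S001" if attempts["n"] >= 3 else None


row = poll_until(find_row, timeout_s=5.0, interval_s=0.01)
```

Using a monotonic clock for the deadline keeps the timeout stable even if the wall clock is adjusted while the test runs.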

The test runs its own cleanup, scoped to project="e2e-test" only. It does NOT use tests/conftest.py's clean_state (which TRUNCATEs), so it doesn't disturb other workers.
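
As a rough illustration of why a project-scoped delete is safer than a TRUNCATE on a shared database, the sketch below removes only the rows for one project. The table and column names (`work_items`, `project`, `story_id`) are assumptions, since the real schema is not shown in this PR, and sqlite3 is used only to keep the example self-contained; the live stack's database engine is not stated here.

```python
import sqlite3


def cleanup_project(conn: sqlite3.Connection, project: str) -> int:
    """Delete only the rows that belong to one project; return how many were removed."""
    # Hypothetical table and column names, chosen for illustration.
    cur = conn.execute("DELETE FROM work_items WHERE project = ?", (project,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_items (id INTEGER PRIMARY KEY, project TEXT, story_id TEXT)"
)
conn.executemany(
    "INSERT INTO work_items (project, story_id) VALUES (?, ?)",
    [("e2e-test", "S001"), ("other-project", "S100"), ("e2e-test", "S002")],
)

removed = cleanup_project(conn, "e2e-test")
remaining = conn.execute("SELECT project, story_id FROM work_items").fetchall()
```

Rows owned by other projects survive the cleanup, which is the property that lets the e2e test share a database with other workers.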

2. tests/e2e/conftest.py: state.open_human_issue / state.set_phase / state.get_item wrappers used by the e2e test to drive the cycle directly without the orchestrator loop.
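
To show the shape of "driving the cycle directly", here is a minimal sketch with an in-memory stand-in for the state layer. Only the phase order (spec, build, review, merged) comes from this PR; `FakeState`, `drive_cycle`, and the method signatures are hypothetical and may not match the real wrappers in tests/e2e/conftest.py.

```python
from dataclasses import dataclass, field

# Phase order taken from the PR description.
PHASES = ("spec", "build", "review", "merged")


@dataclass
class FakeState:
    """In-memory stand-in for the state layer, mapping item id to phase."""

    items: dict = field(default_factory=dict)

    def set_phase(self, item_id: str, phase: str) -> None:
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase!r}")
        self.items[item_id] = phase

    def get_item(self, item_id: str) -> str:
        return self.items[item_id]


def drive_cycle(state: FakeState, item_id: str) -> list:
    """Walk an item through every phase, recording what get_item reports each time."""
    seen = []
    for phase in PHASES:
        state.set_phase(item_id, phase)
        # In the real test, a UI reload and phase-pill assertion would sit here.
        seen.append(state.get_item(item_id))
    return seen


observed = drive_cycle(FakeState(), "S001")
```

Setting phases through helpers like these skips the orchestrator loop, so the test checks the UI and MCP surface rather than scheduling behavior.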

3. scripts/verify.sh — 30-second manual smoke. Hits /healthz, /v1/items, /v1/items?group_by=project (P5), /v1/stats, validates auth 401 path, smoke-ingests a row with token. Exits non-zero on any failure.
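
The check sequence can be sketched in Python as a table of expectations run against an injected fetch function. This is not the content of verify.sh: the paths come from the description above, but `run_checks`, `fake_fetch`, the choice of which requests carry a token, and the 200 status expected for the authenticated POST are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# fetch(method, path, token) -> (status_code, body).
# Injected so the sketch runs without a live stack.
Fetch = Callable[[str, str, Optional[str]], Tuple[int, str]]

# (method, path, send_token, expected_status)
# Assumption: the authenticated POST returns 200 (a real API might return 201).
CHECKS = [
    ("GET", "/healthz", False, 200),
    ("GET", "/v1/items", False, 200),
    ("GET", "/v1/items?group_by=project", False, 200),
    ("GET", "/v1/stats", False, 200),
    ("POST", "/v1/items", False, 401),
    ("POST", "/v1/items", True, 200),
]


def run_checks(fetch: Fetch, token: str) -> int:
    """Run every check and return the number of failures (0 means all green)."""
    failures = 0
    for method, path, send_token, expected in CHECKS:
        status, _body = fetch(method, path, token if send_token else None)
        ok = status == expected
        print(f"  {'ok' if ok else 'FAIL'} {method} {path} => {status}")
        failures += 0 if ok else 1
    return failures


def fake_fetch(method: str, path: str, token: Optional[str]) -> Tuple[int, str]:
    """Stand-in server: rejects unauthenticated POSTs, accepts everything else."""
    if method == "POST" and token is None:
        return 401, ""
    return 200, "{}"


failures = run_checks(fake_fetch, token="dummy-token")
```

A caller would turn the failure count into a process exit code, mirroring the "exits non-zero on any failure" behavior described for the script.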

4. docs/VERIFICATION.md — One-page recipe: 30-second check + full cycle walkthrough. Runnable by Kay without agent help.

5. .gitignore — adds .hermes/evidence/ (e2e screenshots/logs regenerated by the test on every run).

Live verification (post-PR-#19 merge, against main)

$ bash scripts/verify.sh
[verify] http://127.0.0.1:9110
  ok /healthz => {"status":"ok"}
  ok /v1/items => 2 items
  ok /v1/items?group_by=project => 200
  ok /v1/stats => 200
  ok POST /v1/items without token => 401
  ok POST /v1/items with token => id=..., phase=spec
  ok smoke ingest cycle complete
[verify] PASSED — all 7 checks green

$ python -m pytest tests/e2e/test_entry_points_e2e.py -q
1 passed in 32.24s

Process note (recovery from a worker death)

The P6 worker hit the 120-iter budget cap twice while finishing the e2e harness and verify.sh recipe. Both runs reported === P6 E2E — all 4 phases PASSED === before the budget ran out, but the worker died before git commit / git push. I recovered the work in-process by:

  1. Re-running verify.sh + pytest against merged main (PR #19 = 60ec5f6)
  2. Fixing one stale docstring in test_entry_points_e2e.py that still said "P5 NOT merged"; with the fix applied, the docstring is accurate
  3. Cleaning up smoketest / e2e-test rows from the live DB so the PR doesn't ship with test debris
  4. Adding .hermes/evidence/ to .gitignore

This is the standard "worker hit 120-iter cap, work is on disk + verified" recovery pattern from the kanban-worker skill.

Acceptance criteria status (per task body)

  • [x] pytest tests/e2e/test_entry_points_e2e.py -q exits 0 against live stack
  • [x] bash scripts/verify.sh exits 0 against live stack
  • [x] docs/VERIFICATION.md is < 1 page and runnable
  • [x] Screenshots produced by the test (regenerated on every run, gitignored)

Out of scope (deferred)

  • The deeper "feedback updates wiki_pins" loop from §7 — explicitly out of scope for v1
  • "Operator note" textarea on blocked items → events_outbox — tracked as a follow-up
kaykayyali added 1 commit 2026-06-25 12:33:57 +00:00
test(e2e): P6 entry-points end-to-end merge gate (in-process recovery)
Some checks failed
test / contract-and-unit (pull_request) Failing after 14s
98412abefc
P6 worker hit the 120-iter budget cap twice while finishing the e2e
harness and the verify.sh recipe. The artifacts on disk were correct
and passing — both runs reported 'all 4 phases PASSED' before the
budget ran out — but the worker died before commit/push. Recovered by
running the test suite against merged main (PR #19 landed as 60ec5f6)
and committing the verified artifacts.

What this PR ships:

1. tests/e2e/test_entry_points_e2e.py (668 lines)
   Single Playwright + MCP integration test exercising the full v1
   entry-points surface against the live docker-compose stack:
     Phase 1: ingest_story via MCP server (stdio subprocess) ->
              assert WorkItemResponse.phase == 'spec'
     Phase 2: navigate UI to /#/items, poll for the new row within 5s,
              open the drawer, assert the 4 P5 widgets render non-zero
     Phase 3: drive state.set_phase spec -> build -> review -> merged;
              reload UI after each transition, assert phase pill updates
     Phase 4: open a human_issue via state.open_human_issue; answer it
              via MCP.answer_question; assert status -> 'answered';
              reload drawer, assert the answer shows
   Own cleanup (project='e2e-test' only) so it doesn't collide with
   other tests against the same DB.

2. tests/e2e/conftest.py
   Helpers: state.open_human_issue, state.set_phase, state.get_item
   wrappers that the e2e test uses to drive the cycle directly without
   spinning the orchestrator loop.

3. scripts/verify.sh
   30-second manual smoke: /healthz, /v1/items read, /v1/items?group_by=project
   (P5 backend), /v1/stats, auth 401 path, smoke ingest with token.
   Exits non-zero on any failure.

4. docs/VERIFICATION.md
   One-page recipe: 30s check + full cycle walkthrough. Runnable by
   Kay without agent help.

5. .gitignore
   Add .hermes/evidence/ — e2e screenshots/logs are regenerated by
   the test on every run, no need to ship them.

Live verification (post-merge, against main):
  bash scripts/verify.sh           -> PASSED (7/7 checks green)
  pytest tests/e2e/test_entry_points_e2e.py -q -> 1 passed in 32.24s

Worker self-block reason (noted in t_556485a7): a 'review-required
handoff' style summary was written before the budget ran out; the work
is complete and verified.
kaykayyali merged commit cfcd571928 into main 2026-06-25 12:34:02 +00:00
Reference: kaykayyali/damascus-orchestrator#20