feat(verify): P6a manual verification recipe + verify.sh

scripts/verify.sh — bash E2E smoke that proves 'v1 works' without a browser.
8 sections (preflight, stack-up, mcp-stdio, ingest-via-mcp, ui-shows-it,
drive-cycle, cleanup, summary); exits non-zero on first failure. Drives
phase transitions via direct SQL to bypass the orchestrator worker's claim
loop. Cleans up its own rows so re-runs are idempotent.

scripts/_verify_mcp_helper.py — Python MCP stdio helper used by verify.sh.
Drives python -m damascus.mcp_server via the official mcp SDK client and
frames the JSON-RPC handshake + tools/list + ingest_story so bash does
not have to manage Content-Length headers or heredoc framing.

docs/VERIFICATION.md — <1 page runnable-by-hand recipe plus architecture
notes (token source, MCP upstream DNS, why direct SQL, failure modes).

Verified end-to-end: bash scripts/verify.sh exits 0 against the live stack
(7/7 sections green; log at .hermes/evidence/p6a/verify.log, gitignored).
tests/contract + tests/unit still 56/56 green.
hermes-kanban
2026-06-26 07:03:45 +00:00
parent 82b9758be6
commit 79e3e59ab5
3 changed files with 603 additions and 131 deletions


@@ -1,44 +1,72 @@
# Damascus Entry Points v1 — Verification
The merge gate for v1 of the entry points. This page is short on purpose so
an operator can run the smoke without agent help.
The P6a verification recipe for v1 of the entry points. Short on
purpose so an operator can run it without an agent.
## TL;DR (30-second check)
The script covers the full happy path — preflight, MCP handshake,
ingest, UI reflection, cycle drive, and cleanup — so a single run
takes ~10 seconds against a warm stack:
```sh
bash scripts/verify.sh
```
Exits non-zero on any failure. The script pings `/healthz`, lists `/v1/items`,
hits `/v1/items?group_by=project` (P5 endpoint, 200-checked only) and
`/v1/stats`, then confirms POST `/v1/items` returns 401 without auth and 200
with the bearer token.
Exit code is `0` on full success, non-zero on the first failed check.
Re-runs are safe (the script deletes its own rows).
## Full E2E (the merge gate)
## What it checks
| # | Section | Proves |
|---|---|---|
| 1 | preflight | `damascus-api` is `healthy`; `/healthz` and `/v1/items` respond 200 |
| 2 | stack-up | `docker compose up -d db damascus-api damascus-ui-build` succeeds; `/healthz` stays responsive (30s budget for cold starts) |
| 3 | mcp-stdio | `python -m damascus.mcp_server` answers `initialize` + `tools/list` over stdio; `server.name == "damascus-mcp"`; 7 tools visible |
| 4 | ingest-via-mcp | A story is ingested via `tools/call ingest_story`; the returned item has `phase=spec` |
| 5 | ui-shows-it | `GET /v1/items` returns the new row, `phase=spec` |
| 6 | drive-cycle | Direct SQL UPDATE walks the row `spec → build → review → merged`; `merged_at` is populated; `/v1/items/{id}` reflects each transition |
| 7 | cleanup | `DELETE FROM work_items WHERE project='verify-smoke'` removes the row(s) so re-runs stay tidy |
| 8 | summary | Green/red checklist of every section above |
Each section gates the next — the script exits on the first failure
and prints which section tripped.
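The gating behaviour described above can be sketched in a few lines of Python. This is an illustration only, not code from `verify.sh`: the `run_sections` function and the `Check` type are hypothetical names, and the `name|code|note` entry shape is borrowed from the `RESULTS` array the script keeps for its summary.

```python
from typing import Callable, List, Tuple

# Hypothetical check type: a callable returning (ok, note).
Check = Callable[[], Tuple[bool, str]]


def run_sections(sections: List[Tuple[str, Check]]) -> Tuple[int, List[str]]:
    """Run checks in order and stop at the first failure.

    Returns (exit_code, results). Each result is "name|code|note",
    mirroring the entry shape verify.sh records for its summary.
    """
    results: List[str] = []
    for name, check in sections:
        ok, note = check()
        results.append(f"{name}|{0 if ok else 1}|{note}")
        if not ok:
            # Later sections are never executed once one fails.
            return 1, results
    return 0, results


if __name__ == "__main__":
    demo = [
        ("1. preflight", lambda: (True, "stack healthy")),
        ("2. stack-up", lambda: (False, "/healthz not ok")),
        ("3. mcp-stdio", lambda: (True, "never reached")),
    ]
    code, results = run_sections(demo)
    for entry in results:
        print(entry)
    print("exit code:", code)
```

Because the runner returns as soon as a check fails, the last recorded entry always names the section that tripped.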
## Running the full recipe by hand
If `verify.sh` flags a regression and you want to walk the same path
yourself, here is the equivalent curl + psql sequence:
```sh
docker compose up -d db damascus-api damascus-ui-build
docker compose run --rm damascus-ui-build # populates the UI bundle volume
python3 -m pytest tests/e2e/test_entry_points_e2e.py -q -s
# Preflight
curl -fsS http://127.0.0.1:9110/healthz
curl -fsS -o /dev/null -w '%{http_code}\n' http://127.0.0.1:9110/v1/items # expect 200
# Ingest a story (token in /root/.hermes/.env)
TOKEN=$(awk -F= '/^DAMASCUS_API_TOKEN/ {print $2}' /root/.hermes/.env | tr -d '"' | tr -d "'")
INGEST=$(curl -fsS -X POST http://127.0.0.1:9110/v1/items \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d '{"project":"manual","story_id":"manual-1","title":"Manual recipe","priority":200}')
ITEM_ID=$(echo "$INGEST" | python3 -c "import sys, json; print(json.load(sys.stdin)['item']['id'])")
echo "phase:" $(curl -fsS http://127.0.0.1:9110/v1/items/$ITEM_ID | python3 -c "import sys, json; print(json.load(sys.stdin)['item']['phase'])")
# Drive the cycle via direct SQL (orchestrator worker is bypassed)
for PHASE in build review merged; do
if [ "$PHASE" = "merged" ]; then
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
-c "UPDATE work_items SET phase='$PHASE', claimed_by=NULL, claimed_at=NULL, merged_at=NOW(), updated_at=NOW() WHERE id='$ITEM_ID'"
else
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
-c "UPDATE work_items SET phase='$PHASE', claimed_by=NULL, claimed_at=NULL, updated_at=NOW() WHERE id='$ITEM_ID'"
fi
done
# Cleanup
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
-c "DELETE FROM work_items WHERE project='manual'"
```
Four phases, end-to-end:
| Phase | What it proves |
|---|---|
| 1 — Ingest via MCP | `mcp.ingest_story` over JSON-RPC stdio returns a `WorkItemResponse` with `phase=spec`. Re-ingest is idempotent (same id, no overwrite). |
| 2 — UI reflects ingest | The new row appears in the SPA's `/items` table. Drawer + dashboard widgets render. |
| 3 — Drive the cycle | `state.set_phase` moves the row `spec → build → review → merged`. The phase chip on each row updates on each fresh page load. Screenshots per phase. |
| 4 — Answer via MCP | A `human_issues` row is opened, then `mcp.answer_question` answers it. Status flips to `answered`; recent events show the answer. |
Evidence lives in `.hermes/evidence/p6/`:
- `screenshots/01_ingest.png`, `01_dashboard.png` — UI right after MCP ingest
- `screenshots/02_build.png`, `03_review.png`, `04_merged.png` — phase chip per transition
- `screenshots/05_awaiting_human_drawer.png`, `06_answered.png` — drawer in awaiting_human phase and after the MCP answer
- `logs/mcp_stdio.log` — full JSON-RPC transcript
- `logs/pytest.txt` — pytest output
- `logs/verify_run.txt` — last `bash scripts/verify.sh` output
## What success looks like at each phase
| Phase | UI signal | DB signal |
@@ -47,49 +75,85 @@ Evidence lives in `.hermes/evidence/p6/`:
| `build` | Phase chip = `build` | `work_items.phase='build'` |
| `review` | Phase chip = `review` | `work_items.phase='review'` |
| `merged` | Phase chip = `merged` | `work_items.phase='merged'`, `merged_at` set |
| `awaiting_human` | Drawer opens with `awaiting_human` badge | `work_items.phase='awaiting_human'`, one `human_issues` row with `status='open'` |
| After MCP answer | Recent events list shows `issue_answered` | `human_issues.status='answered'`, `answer=...` |
## Known UI issue (not blocking the gate)
For the human-issue flow (P6: `awaiting_human` + answer), see
`tests/e2e/test_entry_points_e2e.py::test_phase4_answer_question`.
That assertion lives in pytest, not in this bash recipe — `verify.sh`
covers the merge-gate happy path only.
The SPA's hash router wipes the URL hash on Items mount (Items.tsx writes
empty filter state to the hash via `writeHash("")` in a `useEffect`). This
makes second navigation back to `/items` via `page.goto(/#/items)` or
`page.goto(/#/items/:id)` unreliable — the route may collapse to dashboard.
The P6 E2E works around this by opening a fresh Playwright context per phase
screenshot; the verify.sh script does not exercise the UI. Filed as a P5/P6
follow-up; does not block v1.
## Why direct SQL for the cycle drive (not `state.set_phase`)
## Manual recipe (drive a phase transition by hand)
The orchestrator worker is alive and polling. A `state.set_phase` call
on a freshly-ingested `spec` row races the worker's claim loop — the
worker can grab the row mid-transition and start refining it. The
SQL UPDATE bypasses the claim filter (`SELECT ... FOR UPDATE SKIP
LOCKED`) entirely and stamps `claimed_by=NULL`, so the row matches
the shape of one the cycle produced and the API reflects the change
immediately.
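As a rough sketch of the statement the cycle drive issues, the helper below builds the same UPDATE in parameterized form. The function name `phase_update_sql` is hypothetical, and the `%s` placeholders assume a DB-API driver such as psycopg; `verify.sh` itself interpolates the values into a `psql -c` string instead.

```python
from typing import Tuple

# Phases named in this recipe; the real schema may allow more.
PHASES = ("spec", "build", "review", "merged")


def phase_update_sql(phase: str, item_id: str) -> Tuple[str, Tuple[str, str]]:
    """Return (sql, params) for a direct phase transition.

    The claim columns are cleared on every transition, and merged_at
    is stamped only when the target phase is 'merged'.
    """
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    assignments = ["phase = %s", "claimed_by = NULL", "claimed_at = NULL"]
    if phase == "merged":
        assignments.append("merged_at = NOW()")
    assignments.append("updated_at = NOW()")
    sql = f"UPDATE work_items SET {', '.join(assignments)} WHERE id = %s"
    return sql, (phase, item_id)


if __name__ == "__main__":
    for target in ("build", "review", "merged"):
        statement, params = phase_update_sql(target, "example-item-id")
        print(statement, params)
```

Passing the phase and id as parameters rather than interpolating them avoids quoting problems if an id ever contains unexpected characters.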
```sh
# 1. Ingest a story (use the token from /root/.hermes/.env).
INGEST=$(curl -sf -X POST http://127.0.0.1:9110/v1/items \
-H "Authorization: Bearer ***" \
-H 'Content-Type: application/json' \
-d '{"project":"manual","story_id":"manual-1","title":"Manual recipe","priority":200}')
ITEM_ID=$(echo "$INGEST" | jq -r .item.id)
If you want to drive transitions via `state.set_phase` for debugging,
stop the orchestrator first (`docker compose stop orchestrator`) and
restart after.
# 2. Open http://127.0.0.1:9110/#/items — find "Manual recipe".
# Click the row; drawer shows phase=spec.
## Architecture notes (relevant when verify.sh fails)
# 3. Move it to build via direct SQL (simulating a successful spec-refiner cycle).
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
-c "UPDATE work_items SET phase='build' WHERE id='${ITEM_ID}'"
# 4. Reload the items grid; phase chip should now show "build".
```
Repeat step 3 with `review` and `merged` to walk the full cycle.
- **Token source**: `DAMASCUS_API_TOKEN` is read from the shell env,
falling back to `/root/.hermes/.env` (the same source
`damascus-api` reads). The placeholder in the host `.env` is
ignored; the live value lives in the file. See
`damascus-orchestrator-operator` skill pitfall "DAMASCUS_API_TOKEN
in host .env is a placeholder."
- **MCP upstream**: the helper launches the MCP process via `docker
compose exec damascus-api python -m damascus.mcp_server` with
`DAMASCUS_API_BASE=http://damascus-api:9110`. Container DNS
resolves the upstream; do NOT change it to `localhost` from the
host perspective.
- **Idempotency**: `ingest_story` is idempotent on
`(project, story_id)`. `verify.sh` uses a unique timestamped
`story_id` per run so the helper's own re-ingest (during a
failure-recovery flow) won't collide.
- **`damascus-ui-build`**: a one-shot (`restart: "no"`) that copies
the Vite bundle into the named `damascus_ui` volume. `docker
compose up -d` on an exited one-shot re-runs it; the `cp` is
idempotent on a populated volume.
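To make the idempotency note concrete, here is a toy in-memory model of an ingest keyed on `(project, story_id)`, plus a generator for per-run story ids. Everything in it (`FakeStore`, `fresh_story_id`) is hypothetical and only illustrates the contract; it says nothing about how `damascus-api` stores rows.

```python
import itertools
import os
import time
from typing import Dict, Tuple


class FakeStore:
    """In-memory stand-in for an idempotent ingest (illustration only)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}
        self._ids = itertools.count(1)

    def ingest(self, project: str, story_id: str, title: str) -> dict:
        key = (project, story_id)
        # A repeated key returns the existing row unchanged: same id,
        # no overwrite of the stored title.
        if key not in self._rows:
            self._rows[key] = {"id": next(self._ids), "title": title, "phase": "spec"}
        return self._rows[key]


def fresh_story_id(prefix: str = "VERIFY") -> str:
    """Build a per-run story id from the epoch second and the process id."""
    return f"{prefix}-{int(time.time())}-{os.getpid()}"


if __name__ == "__main__":
    store = FakeStore()
    story = fresh_story_id()
    first = store.ingest("verify-smoke", story, "first title")
    again = store.ingest("verify-smoke", story, "second title")
    print(first["id"] == again["id"], again["title"])
```

With a fresh `story_id` on every run, a retry inside one run hits the existing row while separate runs never share a key.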
## Failure modes
- **Stale UI bundle**: `/` returns 404 if `damascus_ui` volume is empty.
Re-run `docker compose run --rm damascus-ui-build`.
- **MCP returns 401**: token wrong, or API container not restarted after
token change. Restart the container so MCP picks up the new env.
- **Drawer shows "loading…" forever**: React Query can't reach the API.
Confirm `/healthz` returns 200 and `/` returns the SPA shell.
- **Phase doesn't advance**: orchestrator cycle isn't polling. The E2E
bypasses the cycle by writing directly via SQL, so it works even with
the orchestrator stopped.
- **/healthz returns non-ok**: `damascus-api` failed to boot. Check
`docker logs damascus-orchestrator-damascus-api-1`. Usually means
`DAMASCUS_API_TOKEN` is empty (fail-closed at startup).
- **`/v1/items` returns 500**: the API container is up but cannot
reach Postgres. Verify the `db` container is `healthy` (`docker
compose ps db`).
- **MCP `initialize` fails with "no such service"**: the
`damascus-api` container is not running. Restart via
`docker compose up -d damascus-api`.
- **MCP tools/list returns fewer than 7**: MCP server failed to
build its catalog (likely a Python import error). Check
`docker compose logs damascus-api` for the traceback.
- **Cycle-drive UPDATE hangs**: the `db` container is unreachable
or out of disk. Check `docker compose ps db` and
`df -h $(docker volume inspect damascus-orchestrator_dbdata --format '{{ .Mountpoint }}')`.
- **Item not visible in /v1/items after MCP ingest**: the
orchestrator worker may have already moved the row past `spec`
before section 5 ran. Re-run the script — each run uses a fresh
`story_id`.
## Screenshots
UI screenshots are produced by the P6 Playwright spec
(`tests/e2e/test_entry_points_e2e.py`) and saved to
`.hermes/evidence/p6/screenshots/`. `verify.sh` is bash-only by
design — adding Playwright would expand it past the "manual recipe
in <1 minute" budget this page targets.
## Evidence log
Each run of `verify.sh` writes its full output to
`.hermes/evidence/p6a/verify.log` when piped via tee:
```sh
bash scripts/verify.sh 2>&1 | tee .hermes/evidence/p6a/verify.log
```
The script prints the absolute log path on success.

179
scripts/_verify_mcp_helper.py Executable file

@@ -0,0 +1,179 @@
"""Damascus MCP stdio helper for scripts/verify.sh.
Drives ``python -m damascus.mcp_server`` over stdio via the official
``mcp`` SDK client. The MCP server is a thin wrapper around
``damascus-api`` (loopback HTTP); this helper just frames the JSON-RPC
for the bash wrapper script so the bash doesn't have to manage
heredocs, Content-Length headers, or mcp SDK imports.
Subcommands
-----------
``initialize``
Send the MCP ``initialize`` handshake; print server name + version
as a single JSON line on stdout.
``list-tools``
Send ``tools/list`` after the handshake; print the sorted tool
name list + count as a single JSON line.
``ingest-story PROJECT STORY_ID TITLE PRIORITY``
Call ``tools/call ingest_story`` and print
``{"server_name": ..., "payload": <API response>}``.
Auth
----
The helper reads ``DAMASCUS_API_TOKEN`` from the shell env, falling back
to ``/root/.hermes/.env`` (the same source ``damascus-api`` itself
reads). The MCP process is launched via ``docker compose exec
damascus-api python -m damascus.mcp_server`` and inherits ``DAMASCUS_API_BASE=http://damascus-api:9110`` so the container DNS
resolves the upstream.
Exit codes
----------
``0`` on success, ``1`` on a runtime error, ``2`` on bad arguments.
"""
from __future__ import annotations
import asyncio
import json
import logging
import os
import sys
from pathlib import Path
from mcp import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client
# Silence the SDK's "Tool <name> not listed, no validation will be
# performed" warning emitted on every call_tool. The MCP server declares
# `ingest_story` in its catalog but the SDK's structured-output validator
# still complains because the server does not return a `structuredContent`
# block (it returns the API payload as TextContent). Validation is
# not actionable here — the bash wrapper asserts the JSON shape itself.
logging.getLogger("mcp.client.session").setLevel(logging.ERROR)
ENV_FILE = Path("/root/.hermes/.env")
COMPOSE_FILE = "/root/damascus-orchestrator/docker-compose.yml"
TOKEN_KEY = "DAMASCUS_API_TOKEN"
def _load_token() -> str:
    token = os.environ.get(TOKEN_KEY, "").strip()
    if token:
        return token
    if not ENV_FILE.exists():
        return ""
    for raw in ENV_FILE.read_text().splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        if not line.startswith(TOKEN_KEY + "="):
            continue
        val = line.split("=", 1)[1].strip()
        if (val.startswith("'") and val.endswith("'")) or (
            val.startswith('"') and val.endswith('"')
        ):
            val = val[1:-1]
        return val
    return ""
def _stdio_params() -> StdioServerParameters:
    token = _load_token()
    if not token:
        print(f"[verify-mcp] {TOKEN_KEY} not found in env or {ENV_FILE}", file=sys.stderr)
        sys.exit(2)
    # The MCP process runs inside damascus-api (via `docker compose exec`),
    # so it needs the container-DNS upstream URL, not localhost:9110.
    api_base = os.environ.get("DAMASCUS_API_BASE_FOR_MCP", "http://damascus-api:9110")
    return StdioServerParameters(
        command="docker",
        args=[
            "compose",
            "-f",
            COMPOSE_FILE,
            "exec",
            "-T",
            "damascus-api",
            "python",
            "-m",
            "damascus.mcp_server",
        ],
        env={
            **os.environ,
            "DAMASCUS_API_BASE": api_base,
            TOKEN_KEY: token,
        },
    )
async def _run(sub: str, rest: list[str]) -> int:
    params = _stdio_params()
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            init = await session.initialize()
            server_name = init.serverInfo.name
            if sub == "initialize":
                print(json.dumps({
                    "server_name": server_name,
                    "server_version": init.serverInfo.version,
                }))
                return 0
            if sub == "list-tools":
                tools = await session.list_tools()
                names = sorted(t.name for t in tools.tools)
                print(json.dumps({
                    "server_name": server_name,
                    "tool_names": names,
                    "tool_count": len(names),
                }))
                return 0
            if sub == "ingest-story":
                if len(rest) < 4:
                    print(
                        "[verify-mcp] ingest-story requires "
                        "PROJECT STORY_ID TITLE PRIORITY",
                        file=sys.stderr,
                    )
                    return 2
                project, story_id, title, priority = rest[:4]
                res = await session.call_tool(
                    "ingest_story",
                    arguments={
                        "project": project,
                        "story_id": story_id,
                        "title": title,
                        "priority": int(priority),
                    },
                )
                if not res.content:
                    print("[verify-mcp] empty content from ingest_story", file=sys.stderr)
                    return 1
                payload = json.loads(res.content[0].text)
                print(json.dumps({"server_name": server_name, "payload": payload}))
                return 0
            print(f"[verify-mcp] unknown subcommand: {sub!r}", file=sys.stderr)
            return 2
def main() -> int:
    if len(sys.argv) < 2:
        print(__doc__, file=sys.stderr)
        return 2
    sub = sys.argv[1]
    rest = sys.argv[2:]
    try:
        return asyncio.run(_run(sub, rest))
    except Exception as exc:
        print(f"[verify-mcp] {type(exc).__name__}: {exc}", file=sys.stderr)
        return 1
if __name__ == "__main__":
    sys.exit(main())


@@ -1,89 +1,318 @@
#!/usr/bin/env bash
# Damascus Entry Points v1 — manual smoke verification.
# Damascus Entry Points v1 — manual verification recipe (P6a).
#
# The 30-second check: hits /healthz and /v1/items so the operator
# sees "service up + DB seeded" without a browser. Exits non-zero on
# any failure so it can be wired into a deploy gate.
# End-to-end smoke that proves "v1 works" without a browser. Each
# section gates the next; the script exits non-zero on the first
# failure so it can be wired into a deploy gate later.
#
# Usage:
# bash scripts/verify.sh
#
# Sections (in order):
# 1. preflight — stack healthy + API reachable
# 2. stack-up — bring up db / damascus-api / damascus-ui-build (idempotent)
# 3. mcp-stdio — MCP server handshake + 7 tools visible
# 4. ingest-via-mcp — create one item via MCP ingest_story
# 5. ui-shows-it — GET /v1/items reflects the new item, phase=spec
# 6. drive-cycle — spec → build → review → merged via direct SQL
# 7. cleanup — DELETE the verify-smoke rows so re-runs stay tidy
# 8. summary — green/red checklist
#
# Assumes:
# - damascus-api on http://127.0.0.1:9110 (the P2 stack)
# - DAMASCUS_API_TOKEN set in env OR in /root/.hermes/.env
# - curl + jq available
# - /root/damascus-orchestrator is the project root
# - /root/.hermes/.env contains DAMASCUS_API_TOKEN
# - docker compose is on PATH and the damascus stack is registered
# - python3 (with `mcp` and `httpx` installed) is on PATH
set -euo pipefail
set -uo pipefail
# --- paths & config ---------------------------------------------------------
REPO_ROOT="${REPO_ROOT:-/root/damascus-orchestrator}"
COMPOSE_FILE="${REPO_ROOT}/docker-compose.yml"
API_BASE="${DAMASCUS_API_BASE:-http://127.0.0.1:9110}"
TOKEN="${DAMASCUS_API_TOKEN:-}"
MCP_HELPER="${REPO_ROOT}/scripts/_verify_mcp_helper.py"
EVIDENCE_DIR="${REPO_ROOT}/.hermes/evidence/p6a"
LOG_FILE="${EVIDENCE_DIR}/verify.log"
VERIFY_PROJECT="verify-smoke"
DB_CONTAINER="damascus-orchestrator-db-1"
API_CONTAINER="damascus-orchestrator-damascus-api-1"
# Pull token from /root/.hermes/.env if not in the shell already.
if [[ -z "${TOKEN}" && -r /root/.hermes/.env ]]; then
TOKEN=$(grep -E '^DAMASCUS_API_TOKEN=' /root/.hermes/.env | head -1 | cut -d= -f2- | tr -d '"' | tr -d "'")
fi
if [[ -z "${TOKEN}" ]]; then
echo "[verify] DAMASCUS_API_TOKEN not found in env or /root/.hermes/.env" >&2
exit 1
fi
# --- bash output helpers ----------------------------------------------------
bold() { printf "\033[1m%s\033[0m\n" "$*"; }
ok() { printf " \033[32mok\033[0m %s\n" "$*"; }
fail() { printf " \033[31mFAIL\033[0m %s\n" "$*"; exit 1; }
green() { printf " \033[32mok\033[0m %s\n" "$*"; }
red() { printf " \033[31mFAIL\033[0m %s\n" "$*"; }
bold "[verify] ${API_BASE}"
# Track per-section results for the summary checklist. Entries are
# "name|exit_code|note". Failures use the helper _fail.
declare -a RESULTS=()
CURRENT_SECTION=""
# ---- 1. /healthz -----------------------------------------------------------
HEALTHZ_BODY=$(curl -sf "${API_BASE}/healthz") || fail "/healthz request failed"
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || fail "/healthz body unexpected: ${HEALTHZ_BODY}"
ok "/healthz => {\"status\":\"ok\"}"
_section_start() {
CURRENT_SECTION="$1"
bold ""
bold "[${CURRENT_SECTION}]"
}
# ---- 2. /v1/items reads without auth ---------------------------------------
ITEMS_COUNT=$(curl -sf "${API_BASE}/v1/items" | jq '.items | length') || fail "/v1/items request failed"
ok "/v1/items => ${ITEMS_COUNT} items (auth not required for reads)"
_record() {
RESULTS+=("$1")
}
# ---- 3. /v1/items?group_by=project (P5 backend) ----------------------------
# Per the entry-points-contract §3, when group_by=project the response shape
# is GroupedItemsResponse (not ListItemsResponse). If the handler is
# missing (P5 backend not yet wired), we get the flat shape — surface
# that loudly without failing the smoke (P5 is the gap explicitly
# tracked in P6's verification report).
GROUPED=$(curl -sf "${API_BASE}/v1/items?group_by=project" -w "\n%{http_code}" | tail -1)
[[ "${GROUPED}" == "200" ]] || fail "/v1/items?group_by=project returned ${GROUPED}"
ok "/v1/items?group_by=project => 200 (P5 backend response shape NOT asserted here — see docs/VERIFICATION.md)"
# --- failure handler --------------------------------------------------------
# ---- 4. /v1/stats (system status) ------------------------------------------
STATS=$(curl -sf "${API_BASE}/v1/stats" -w "\n%{http_code}" | tail -1)
[[ "${STATS}" == "200" ]] || fail "/v1/stats returned ${STATS}"
ok "/v1/stats => 200 (phase counts + open issues + active claims + cost_today)"
_fail() {
local note="$*"
red "${CURRENT_SECTION}: ${note}"
_record "${CURRENT_SECTION}|1|${note}"
# No EXIT trap is installed, so exiting here skips the summary section.
exit 1
}
# ---- 5. write endpoints refuse missing token -------------------------------
NO_TOKEN_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST "${API_BASE}/v1/items" \
-H 'Content-Type: application/json' \
-d '{"project":"smoke","story_id":"no-auth","title":"no-auth smoke"}')
[[ "${NO_TOKEN_STATUS}" == "401" ]] || fail "POST /v1/items without token returned ${NO_TOKEN_STATUS} (expected 401)"
ok "POST /v1/items without token => 401 (auth gate honored)"
# --- prerequisites ----------------------------------------------------------
# ---- 6. write endpoints accept valid token ---------------------------------
INGEST_BODY=$(jq -n --arg t "$TOKEN" '{
project: "smoke",
story_id: "verify-" + (now | tostring),
title: "verify.sh smoke ingest",
priority: 50
}')
INGEST=$(curl -sf -X POST "${API_BASE}/v1/items" \
-H "Authorization: Bearer ${TOKEN}" \
-H 'Content-Type: application/json' \
-d "${INGEST_BODY}") || fail "POST /v1/items with token failed"
ITEM_ID=$(echo "${INGEST}" | jq -r '.item.id')
ITEM_PHASE=$(echo "${INGEST}" | jq -r '.item.phase')
[[ "${ITEM_PHASE}" == "spec" ]] || fail "ingest phase=${ITEM_PHASE} (expected spec)"
ok "POST /v1/items with token => id=${ITEM_ID}, phase=${ITEM_PHASE}"
mkdir -p "${EVIDENCE_DIR}"
# ---- 7. cleanup the smoke item ---------------------------------------------
# We don't bother — story_ids with random timestamps are naturally harmless
# and the operator can DELETE rows for project='smoke' if it piles up.
ok "smoke ingest cycle complete (no cleanup needed — story_id is timestamped)"
if ! command -v docker >/dev/null 2>&1; then
_fail "docker not on PATH"
fi
if ! command -v curl >/dev/null 2>&1; then
_fail "curl not on PATH"
fi
if ! command -v python3 >/dev/null 2>&1; then
_fail "python3 not on PATH"
fi
if [[ ! -r "${COMPOSE_FILE}" ]]; then
_fail "compose file not readable: ${COMPOSE_FILE}"
fi
if [[ ! -r "${MCP_HELPER}" ]]; then
_fail "MCP helper not readable: ${MCP_HELPER}"
fi
bold "[verify] PASSED — all 7 checks green"
# ===========================================================================
# 1. preflight
# ===========================================================================
_section_start "1. preflight"
API_LINE=$(docker compose -f "${COMPOSE_FILE}" ps damascus-api 2>/dev/null | tail -n +2 | head -1 || true)
if [[ -z "${API_LINE}" ]]; then
_fail "damascus-api not running; start it with 'docker compose up -d damascus-api' and re-run"
fi
if ! grep -q "healthy" <<<"${API_LINE}"; then
_fail "damascus-api is not healthy: ${API_LINE}"
fi
green "docker compose ps damascus-api -> healthy"
HEALTHZ_BODY=$(curl -fsS "${API_BASE}/healthz" 2>/dev/null) || _fail "/healthz request failed"
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || _fail "/healthz body unexpected: ${HEALTHZ_BODY}"
green "${API_BASE}/healthz -> {\"status\":\"ok\"}"
ITEMS_STATUS=$(curl -s -o /dev/null -w '%{http_code}' "${API_BASE}/v1/items")
[[ "${ITEMS_STATUS}" == "200" ]] || _fail "/v1/items returned ${ITEMS_STATUS}"
green "${API_BASE}/v1/items -> 200"
_record "1. preflight|0|stack healthy + API reachable"
# ===========================================================================
# 2. stack-up
# ===========================================================================
_section_start "2. stack-up"
# `up -d` is idempotent on running services. damascus-ui-build is a
# one-shot (restart: "no") that copies the Vite bundle into the named
# volume; if the bundle is already there from a previous build the
# one-shot just exits 0 again. Acceptable side effect on re-runs.
docker compose -f "${COMPOSE_FILE}" up -d db damascus-api damascus-ui-build >/dev/null 2>&1 \
|| _fail "docker compose up failed"
# Wait up to 30s for /healthz (covers the case where we just started a cold stack).
WAITED=0
HEALTHZ_BODY=""
while (( WAITED < 30 )); do
HEALTHZ_BODY=$(curl -fsS "${API_BASE}/healthz" 2>/dev/null || true)
if [[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]]; then
break
fi
sleep 1
WAITED=$((WAITED + 1))
done
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || _fail "/healthz not ok after ${WAITED}s"
green "stack up; /healthz ok (waited ${WAITED}s)"
_record "2. stack-up|0|db + api + ui-build up; healthz responsive"
# ===========================================================================
# 3. mcp-stdio
# ===========================================================================
_section_start "3. mcp-stdio"
INIT_JSON=$(python3 "${MCP_HELPER}" initialize 2>/dev/null) \
|| { INIT_ERR=$(python3 "${MCP_HELPER}" initialize 2>&1 >/dev/null); _fail "MCP initialize failed: ${INIT_ERR}"; }
SERVER_NAME=$(printf '%s' "${INIT_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['server_name'])")
[[ "${SERVER_NAME}" == "damascus-mcp" ]] || _fail "MCP server name='${SERVER_NAME}' (expected damascus-mcp)"
green "initialize -> server_name=${SERVER_NAME}"
TOOLS_JSON=$(python3 "${MCP_HELPER}" list-tools 2>/dev/null) \
|| { TOOLS_ERR=$(python3 "${MCP_HELPER}" list-tools 2>&1 >/dev/null); _fail "MCP list-tools failed: ${TOOLS_ERR}"; }
TOOL_COUNT=$(printf '%s' "${TOOLS_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['tool_count'])")
[[ "${TOOL_COUNT}" == "7" ]] || _fail "MCP tool_count=${TOOL_COUNT} (expected 7)"
TOOL_NAMES=$(printf '%s' "${TOOLS_JSON}" | python3 -c "import sys, json; print(', '.join(json.load(sys.stdin)['tool_names']))")
green "tools/list -> ${TOOL_COUNT} tools: ${TOOL_NAMES}"
_record "3. mcp-stdio|0|handshake + 7 tools visible"
# ===========================================================================
# 4. ingest-via-mcp
# ===========================================================================
_section_start "4. ingest-via-mcp"
STORY_ID="VERIFY-$(date +%s)-$$"
TITLE="P6a smoke (auto-generated)"
PRIORITY=100
# Capture only stdout. If the helper exits non-zero, re-run with stderr
# merged so the error message reaches _fail.
INGEST_JSON=$(python3 "${MCP_HELPER}" ingest-story "${VERIFY_PROJECT}" "${STORY_ID}" "${TITLE}" "${PRIORITY}" 2>/dev/null) \
|| { INGEST_ERR=$(python3 "${MCP_HELPER}" ingest-story "${VERIFY_PROJECT}" "${STORY_ID}" "${TITLE}" "${PRIORITY}" 2>&1 >/dev/null); _fail "MCP ingest_story failed: ${INGEST_ERR}"; }
INGEST_PHASE=$(printf '%s' "${INGEST_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['payload']['item']['phase'])")
INGEST_ID=$(printf '%s' "${INGEST_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['payload']['item']['id'])")
[[ "${INGEST_PHASE}" == "spec" ]] || _fail "ingest phase=${INGEST_PHASE} (expected spec)"
green "ingest_story -> id=${INGEST_ID}, phase=${INGEST_PHASE}, project=${VERIFY_PROJECT}, story_id=${STORY_ID}"
_record "4. ingest-via-mcp|0|story=${STORY_ID} phase=spec"
# ===========================================================================
# 5. ui-shows-it
# ===========================================================================
_section_start "5. ui-shows-it"
ITEMS_JSON=$(curl -fsS "${API_BASE}/v1/items" 2>/dev/null) || _fail "/v1/items failed"
# Inline Python matcher: find the item by id, print phase or exit non-zero.
MATCHED=$(ITEM_ID="${INGEST_ID}" ITEMS_JSON="${ITEMS_JSON}" python3 <<'PY'
import json, os
target = os.environ["ITEM_ID"]
data = json.loads(os.environ["ITEMS_JSON"])
for item in data.get("items", []):
    if item.get("id") == target:
        print(json.dumps({
            "id": item["id"],
            "phase": item["phase"],
            "project": item["project"],
            "story_id": item["story_id"],
        }))
        raise SystemExit(0)
raise SystemExit(2)
PY
) || _fail "item ${INGEST_ID} not found in /v1/items"
MATCH_PHASE=$(printf '%s' "${MATCHED}" | python3 -c "import sys, json; print(json.load(sys.stdin)['phase'])")
[[ "${MATCH_PHASE}" == "spec" ]] || _fail "matched item phase=${MATCH_PHASE} (expected spec)"
green "/v1/items -> row visible: ${MATCHED}"
_record "5. ui-shows-it|0|/v1/items reflects new row at phase=spec"
# ===========================================================================
# 6. drive-cycle
# ===========================================================================
_section_start "6. drive-cycle"
# We drive phase transitions via direct SQL on the db container (matches
# the pattern in tests/e2e/test_entry_points_e2e.py::phase3). Rationale:
# the orchestrator worker is running and could race a `state.set_phase`
# call, so the SQL UPDATE bypasses claim semantics entirely. We also
# null out claimed_* and stamp merged_at so the row matches the shape
# of one that the cycle actually produced.
#
# IMPORTANT: this test row races the live orchestrator cycle. The
# orchestrator may have already moved this item from `spec` to a
# different phase by the time we get here — e.g. it may already be
# `blocked` with a `spec_wrong` verdict. We assert the *transition*
# succeeds at the SQL level and the API reflects each new phase, but
# we tolerate the case where the row is already past spec.
drive_one() {
local target_phase="$1"
local item_id="$2"
if [[ "${target_phase}" == "merged" ]]; then
docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -v ON_ERROR_STOP=1 -q \
-c "UPDATE work_items SET phase='${target_phase}', claimed_by=NULL, claimed_at=NULL, merged_at=NOW(), updated_at=NOW() WHERE id='${item_id}'" \
>/dev/null 2>&1 \
|| _fail "psql UPDATE to phase=${target_phase} failed"
else
docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -v ON_ERROR_STOP=1 -q \
-c "UPDATE work_items SET phase='${target_phase}', claimed_by=NULL, claimed_at=NULL, updated_at=NOW() WHERE id='${item_id}'" \
>/dev/null 2>&1 \
|| _fail "psql UPDATE to phase=${target_phase} failed"
fi
local actual_phase
actual_phase=$(curl -fsS "${API_BASE}/v1/items/${item_id}" 2>/dev/null \
| python3 -c "import sys, json; print(json.load(sys.stdin)['item']['phase'])") \
|| _fail "/v1/items/${item_id} failed after UPDATE to ${target_phase}"
[[ "${actual_phase}" == "${target_phase}" ]] || _fail "phase after UPDATE = ${actual_phase} (expected ${target_phase})"
green " -> phase=${actual_phase} (via API)"
}
drive_one build "${INGEST_ID}"
sleep 1
drive_one review "${INGEST_ID}"
sleep 1
drive_one merged "${INGEST_ID}"
# Sanity: merged_at must be populated on the merged row.
MERGED_AT=$(docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -tA \
-c "SELECT merged_at IS NOT NULL FROM work_items WHERE id='${INGEST_ID}'")
[[ "${MERGED_AT}" == "t" ]] || _fail "merged_at not set on item ${INGEST_ID}"
green " -> merged_at populated"
_record "6. drive-cycle|0|spec->build->review->merged, merged_at set"
# ===========================================================================
# 7. cleanup
# ===========================================================================
_section_start "7. cleanup"
DELETED=$(docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -tA \
-c "DELETE FROM work_items WHERE project='${VERIFY_PROJECT}' RETURNING id")
DELETED_COUNT=$(printf '%s\n' "${DELETED}" | grep -cE '^[0-9a-f-]{36}$' || true)
[[ "${DELETED_COUNT}" -ge 1 ]] || _fail "cleanup DELETE removed ${DELETED_COUNT} rows (expected >=1)"
green "DELETE FROM work_items WHERE project='${VERIFY_PROJECT}' -> ${DELETED_COUNT} row(s) removed"
_record "7. cleanup|0|verify-smoke rows purged (${DELETED_COUNT})"
# ===========================================================================
# 8. summary
# ===========================================================================
bold ""
bold "[8. summary]"
GREEN_COUNT=0
RED_COUNT=0
for entry in "${RESULTS[@]}"; do
name="${entry%%|*}"
rest="${entry#*|}"
code="${rest%%|*}"
note="${rest#*|}"
if [[ "${code}" == "0" ]]; then
green "${name} ${note}"
GREEN_COUNT=$((GREEN_COUNT + 1))
else
red "${name} ${note}"
RED_COUNT=$((RED_COUNT + 1))
fi
done
bold ""
bold "verify.sh: ${GREEN_COUNT} passed, ${RED_COUNT} failed"
if [[ "${RED_COUNT}" -gt 0 ]]; then
exit 1
fi
echo "evidence: ${LOG_FILE}"
echo " (re-run with: bash scripts/verify.sh 2>&1 | tee ${LOG_FILE})"
exit 0