feat(verify): P6a manual verification recipe + verify.sh
scripts/verify.sh — bash E2E smoke that proves 'v1 works' without a browser. 8 sections (preflight, stack-up, mcp-stdio, ingest-via-mcp, ui-shows-it, drive-cycle, cleanup, summary); exits non-zero on first failure. Drives phase transitions via direct SQL to bypass the orchestrator worker's claim loop. Cleans up its own rows so re-runs are idempotent. scripts/_verify_mcp_helper.py — Python MCP stdio helper used by verify.sh. Drives python -m damascus.mcp_server via the official mcp SDK client and frames the JSON-RPC handshake + tools/list + ingest_story so bash does not have to manage Content-Length headers or heredoc framing. docs/VERIFICATION.md — <1 page runnable-by-hand recipe plus architecture notes (token source, MCP upstream DNS, why direct SQL, failure modes). Verified end-to-end: bash scripts/verify.sh exits 0 against the live stack (7/7 sections green; log at .hermes/evidence/p6a/verify.log, gitignored). tests/contract + tests/unit still 56/56 green.
This commit is contained in:
@@ -1,44 +1,72 @@
|
||||
# Damascus Entry Points v1 — Verification
|
||||
|
||||
The merge gate for v1 of the entry points. This page is short on purpose so
|
||||
an operator can run the smoke without agent help.
|
||||
The P6a verification recipe for v1 of the entry points. Short on
|
||||
purpose so an operator can run it without an agent.
|
||||
|
||||
## TL;DR (30-second check)
|
||||
|
||||
The script covers the full happy path — preflight, MCP handshake,
|
||||
ingest, UI reflection, cycle drive, and cleanup — so a single run
|
||||
takes ~10 seconds against a warm stack:
|
||||
|
||||
```sh
|
||||
bash scripts/verify.sh
|
||||
```
|
||||
|
||||
Exits non-zero on any failure. The script pings `/healthz`, lists `/v1/items`,
|
||||
hits `/v1/items?group_by=project` (P5 endpoint, 200-checked only) and
|
||||
`/v1/stats`, then confirms POST `/v1/items` returns 401 without auth and 200
|
||||
with the bearer token.
|
||||
Exit code is `0` on full success, non-zero on the first failed check.
|
||||
Re-runs are safe (the script deletes its own rows).
|
||||
|
||||
## Full E2E (the merge gate)
|
||||
## What it checks
|
||||
|
||||
| # | Section | Proves |
|
||||
|---|---|---|
|
||||
| 1 | preflight | `damascus-api` is `healthy`; `/healthz` and `/v1/items` respond 200 |
|
||||
| 2 | stack-up | `docker compose up -d db damascus-api damascus-ui-build` succeeds; `/healthz` stays responsive (30s budget for cold starts) |
|
||||
| 3 | mcp-stdio | `python -m damascus.mcp_server` answers `initialize` + `tools/list` over stdio; `server.name == "damascus-mcp"`; 7 tools visible |
|
||||
| 4 | ingest-via-mcp | A story is ingested via `tools/call ingest_story`; the returned item has `phase=spec` |
|
||||
| 5 | ui-shows-it | `GET /v1/items` returns the new row, `phase=spec` |
|
||||
| 6 | drive-cycle | Direct SQL UPDATE walks the row `spec → build → review → merged`; `merged_at` is populated; `/v1/items/{id}` reflects each transition |
|
||||
| 7 | cleanup | `DELETE FROM work_items WHERE project='verify-smoke'` removes the row(s) so re-runs stay tidy |
|
||||
| 8 | summary | Green/red checklist of every section above |
|
||||
|
||||
Each section gates the next — the script exits on the first failure
|
||||
and prints which section tripped.
|
||||
|
||||
## Running the full recipe by hand
|
||||
|
||||
If `verify.sh` flags a regression and you want to walk the same path
|
||||
yourself, here is the equivalent curl + psql sequence:
|
||||
|
||||
```sh
|
||||
docker compose up -d db damascus-api damascus-ui-build
|
||||
docker compose run --rm damascus-ui-build # populates the UI bundle volume
|
||||
python3 -m pytest tests/e2e/test_entry_points_e2e.py -q -s
|
||||
# Preflight
|
||||
curl -fsS http://127.0.0.1:9110/healthz
|
||||
curl -fsS -o /dev/null -w '%{http_code}\n' http://127.0.0.1:9110/v1/items # expect 200
|
||||
|
||||
# Ingest a story (token in /root/.hermes/.env)
|
||||
TOKEN=$(awk -F= '/^DAMASCUS_API_TOKEN/ {print $2}' /root/.hermes/.env | tr -d '"' | tr -d "'")
|
||||
INGEST=$(curl -fsS -X POST http://127.0.0.1:9110/v1/items \
|
||||
-H "Authorization: Bearer ${TOKEN}" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"project":"manual","story_id":"manual-1","title":"Manual recipe","priority":200}')
|
||||
ITEM_ID=$(echo "$INGEST" | python3 -c "import sys, json; print(json.load(sys.stdin)['item']['id'])")
|
||||
echo "phase:" $(curl -fsS http://127.0.0.1:9110/v1/items/$ITEM_ID | python3 -c "import sys, json; print(json.load(sys.stdin)['item']['phase'])")
|
||||
|
||||
# Drive the cycle via direct SQL (orchestrator worker is bypassed)
|
||||
for PHASE in build review merged; do
|
||||
if [ "$PHASE" = "merged" ]; then
|
||||
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
|
||||
-c "UPDATE work_items SET phase='$PHASE', claimed_by=NULL, claimed_at=NULL, merged_at=NOW(), updated_at=NOW() WHERE id='$ITEM_ID'"
|
||||
else
|
||||
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
|
||||
-c "UPDATE work_items SET phase='$PHASE', claimed_by=NULL, claimed_at=NULL, updated_at=NOW() WHERE id='$ITEM_ID'"
|
||||
fi
|
||||
done
|
||||
|
||||
# Cleanup
|
||||
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
|
||||
-c "DELETE FROM work_items WHERE project='manual'"
|
||||
```
|
||||
|
||||
Four phases, end-to-end:
|
||||
|
||||
| Phase | What it proves |
|
||||
|---|---|
|
||||
| 1 — Ingest via MCP | `mcp.ingest_story` over JSON-RPC stdio returns a `WorkItemResponse` with `phase=spec`. Re-ingest is idempotent (same id, no overwrite). |
|
||||
| 2 — UI reflects ingest | The new row appears in the SPA's `/items` table. Drawer + dashboard widgets render. |
|
||||
| 3 — Drive the cycle | `state.set_phase` moves the row `spec → build → review → merged`. The phase chip on each row updates on each fresh page load. Screenshots per phase. |
|
||||
| 4 — Answer via MCP | A `human_issues` row is opened, then `mcp.answer_question` answers it. Status flips to `answered`; recent events show the answer. |
|
||||
|
||||
Evidence lives in `.hermes/evidence/p6/`:
|
||||
- `screenshots/01_ingest.png`, `01_dashboard.png` — UI right after MCP ingest
|
||||
- `screenshots/02_build.png`, `03_review.png`, `04_merged.png` — phase chip per transition
|
||||
- `screenshots/05_awaiting_human_drawer.png`, `06_answered.png` — drawer in awaiting_human phase and after the MCP answer
|
||||
- `logs/mcp_stdio.log` — full JSON-RPC transcript
|
||||
- `logs/pytest.txt` — pytest output
|
||||
- `logs/verify_run.txt` — last `bash scripts/verify.sh` output
|
||||
|
||||
## What success looks like at each phase
|
||||
|
||||
| Phase | UI signal | DB signal |
|
||||
@@ -47,49 +75,85 @@ Evidence lives in `.hermes/evidence/p6/`:
|
||||
| `build` | Phase chip = `build` | `work_items.phase='build'` |
|
||||
| `review` | Phase chip = `review` | `work_items.phase='review'` |
|
||||
| `merged` | Phase chip = `merged` | `work_items.phase='merged'`, `merged_at` set |
|
||||
| `awaiting_human` | Drawer opens with `awaiting_human` badge | `work_items.phase='awaiting_human'`, one `human_issues` row with `status='open'` |
|
||||
| After MCP answer | Recent events list shows `issue_answered` | `human_issues.status='answered'`, `answer=...` |
|
||||
|
||||
## Known UI issue (not blocking the gate)
|
||||
For the human-issue flow (P6: `awaiting_human` + answer), see
|
||||
`tests/e2e/test_entry_points_e2e.py::test_phase4_answer_question`.
|
||||
That assertion lives in pytest, not in this bash recipe — `verify.sh`
|
||||
covers the merge-gate happy path only.
|
||||
|
||||
The SPA's hash router wipes the URL hash on Items mount (Items.tsx writes
|
||||
empty filter state to the hash via `writeHash("")` in a `useEffect`). This
|
||||
makes second navigation back to `/items` via `page.goto(/#/items)` or
|
||||
`page.goto(/#/items/:id)` unreliable — the route may collapse to dashboard.
|
||||
The P6 E2E works around this by opening a fresh Playwright context per phase
|
||||
screenshot; the verify.sh script does not exercise the UI. Filed as a P5/P6
|
||||
follow-up; does not block v1.
|
||||
## Why direct SQL for the cycle drive (not `state.set_phase`)
|
||||
|
||||
## Manual recipe (drive a phase transition by hand)
|
||||
The orchestrator worker is alive and polling. A `state.set_phase` call
|
||||
on a freshly-ingested `spec` row races the worker's claim loop — the
|
||||
worker can grab the row mid-transition and start refining it. The
|
||||
SQL UPDATE bypasses the claim filter (`SELECT ... FOR UPDATE SKIP
|
||||
LOCKED`) entirely and stamps `claimed_by=NULL`, so the row matches
|
||||
the shape of one the cycle produced and the API reflects the change
|
||||
immediately.
|
||||
|
||||
```sh
|
||||
# 1. Ingest a story (use the token from /root/.hermes/.env).
|
||||
INGEST=$(curl -sf -X POST http://127.0.0.1:9110/v1/items \
|
||||
-H "Authorization: Bearer *** \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"project":"manual","story_id":"manual-1","title":"Manual recipe","priority":200}')
|
||||
ITEM_ID=$(echo "$INGEST" | jq -r .item.id)
|
||||
If you want to drive transitions via `state.set_phase` for debugging,
|
||||
stop the orchestrator first (`docker compose stop orchestrator`) and
|
||||
restart after.
|
||||
|
||||
# 2. Open http://127.0.0.1:9110/#/items — find "Manual recipe".
|
||||
# Click the row; drawer shows phase=spec.
|
||||
## Architecture notes (relevant when verify.sh fails)
|
||||
|
||||
# 3. Move it to build via direct SQL (simulating a successful spec-refiner cycle).
|
||||
docker exec damascus-orchestrator-db-1 psql -U damascus -d damascus \
|
||||
-c "UPDATE work_items SET phase='build' WHERE id='${ITEM_ID}'"
|
||||
|
||||
# 4. Reload the items grid; phase chip should now show "build".
|
||||
```
|
||||
|
||||
Repeat step 3 with `review` and `merged` to walk the full cycle.
|
||||
- **Token source**: `DAMASCUS_API_TOKEN` is read from the shell env,
|
||||
falling back to `/root/.hermes/.env` (the same source
|
||||
`damascus-api` reads). The placeholder in the host `.env` is
|
||||
ignored; the live value lives in the file. See
|
||||
`damascus-orchestrator-operator` skill pitfall "DAMASCUS_API_TOKEN
|
||||
in host .env is a placeholder."
|
||||
- **MCP upstream**: the helper launches the MCP process via `docker
|
||||
compose exec damascus-api python -m damascus.mcp_server` with
|
||||
`DAMASCUS_API_BASE=http://damascus-api:9110`. Container DNS
|
||||
resolves the upstream; do NOT change it to `localhost` from the
|
||||
host perspective.
|
||||
- **Idempotency**: `ingest_story` is idempotent on
|
||||
`(project, story_id)`. `verify.sh` uses a unique timestamped
|
||||
`story_id` per run so the helper's own re-ingest (during a
|
||||
failure-recovery flow) won't collide.
|
||||
- **`damascus-ui-build`**: a one-shot (`restart: "no"`) that copies
|
||||
the Vite bundle into the named `damascus_ui` volume. `docker
|
||||
compose up -d` on an exited one-shot re-runs it; the `cp` is
|
||||
idempotent on a populated volume.
|
||||
|
||||
## Failure modes
|
||||
|
||||
- **Stale UI bundle**: `/` returns 404 if `damascus_ui` volume is empty.
|
||||
Re-run `docker compose run --rm damascus-ui-build`.
|
||||
- **MCP returns 401**: token wrong, or API container not restarted after
|
||||
token change. Restart the container so MCP picks up the new env.
|
||||
- **Drawer shows "loading…" forever**: React Query can't reach the API.
|
||||
Confirm `/healthz` returns 200 and `/` returns the SPA shell.
|
||||
- **Phase doesn't advance**: orchestrator cycle isn't polling. The E2E
|
||||
bypasses the cycle by writing directly via SQL, so it works even with
|
||||
the orchestrator stopped.
|
||||
- **/healthz returns non-ok**: `damascus-api` failed to boot. Check
|
||||
`docker logs damascus-orchestrator-damascus-api-1`. Usually means
|
||||
`DAMASCUS_API_TOKEN` is empty (fail-closed at startup).
|
||||
- **`/v1/items` returns 500**: the API container is up but cannot
|
||||
reach Postgres. Verify the `db` container is `healthy` (`docker
|
||||
compose ps db`).
|
||||
- **MCP `initialize` fails with "no such service"**: the
|
||||
`damascus-api` container is not running. Restart via
|
||||
`docker compose up -d damascus-api`.
|
||||
- **MCP tools/list returns fewer than 7**: MCP server failed to
|
||||
build its catalog (likely a Python import error). Re-run
|
||||
`docker compose logs damascus-api` for the traceback.
|
||||
- **Cycle-drive UPDATE hangs**: the `db` container is unreachable
|
||||
or out of disk. Check `docker compose ps db` and
|
||||
`df -h $(docker volume inspect damascus-orchestrator_dbdata --format '{{ .Mountpoint }}')`.
|
||||
- **Item not visible in /v1/items after MCP ingest**: the
|
||||
orchestrator worker may have already moved the row past `spec`
|
||||
before section 5 ran. Re-run the script — each run uses a fresh
|
||||
`story_id`.
|
||||
|
||||
## Screenshots
|
||||
|
||||
UI screenshots are produced by the P6 Playwright spec
|
||||
(`tests/e2e/test_entry_points_e2e.py`) and saved to
|
||||
`.hermes/evidence/p6/screenshots/`. `verify.sh` is bash-only by
|
||||
design — adding Playwright would expand it past the "manual recipe
|
||||
in <1 minute" budget this page targets.
|
||||
|
||||
## Evidence log
|
||||
|
||||
Each run of `verify.sh` writes its full output to
|
||||
`.hermes/evidence/p6a/verify.log` when piped via tee:
|
||||
|
||||
```sh
|
||||
bash scripts/verify.sh 2>&1 | tee .hermes/evidence/p6a/verify.log
|
||||
```
|
||||
|
||||
The script prints the absolute log path on success.
|
||||
179
scripts/_verify_mcp_helper.py
Executable file
179
scripts/_verify_mcp_helper.py
Executable file
@@ -0,0 +1,179 @@
|
||||
"""Damascus MCP stdio helper for scripts/verify.sh.
|
||||
|
||||
Drives ``python -m damascus.mcp_server`` over stdio via the official
|
||||
``mcp`` SDK client. The MCP server is a thin wrapper around
|
||||
``damascus-api`` (loopback HTTP); this helper just frames the JSON-RPC
|
||||
for the bash wrapper script so the bash doesn't have to manage
|
||||
heredocs, Content-Length headers, or mcp SDK imports.
|
||||
|
||||
Subcommands
|
||||
-----------
|
||||
|
||||
``initialize``
|
||||
Send the MCP ``initialize`` handshake; print server name + version
|
||||
as a single JSON line on stdout.
|
||||
|
||||
``list-tools``
|
||||
Send ``tools/list`` after the handshake; print the sorted tool
|
||||
name list + count as a single JSON line.
|
||||
|
||||
``ingest-story PROJECT STORY_ID TITLE PRIORITY``
|
||||
Call ``tools/call ingest_story`` and print
|
||||
``{"server_name": ..., "payload": <API response>}``.
|
||||
|
||||
Auth
|
||||
----
|
||||
The helper reads ``DAMASCUS_API_TOKEN`` from the shell env, falling back
|
||||
to ``/root/.hermes/.env`` (the same source ``damascus-api`` itself
|
||||
reads). The MCP process is launched via ``docker compose exec
|
||||
damascus-api python -m damascus.mcp_server`` and inherits ``DAMASCUS_API_BASE=http://damascus-api:9110`` so the container DNS
|
||||
resolves the upstream.
|
||||
|
||||
Exit codes
|
||||
----------
|
||||
``0`` on success, ``1`` on a runtime error, ``2`` on bad arguments.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
from mcp import ClientSession
|
||||
from mcp.client.stdio import StdioServerParameters, stdio_client
|
||||
|
||||
# Silence the SDK's "Tool <name> not listed, no validation will be
|
||||
# performed" warning emitted on every call_tool. The MCP server declares
|
||||
# `ingest_story` in its catalog but the SDK's structured-output validator
|
||||
# still complains because the server does not return a `structuredContent`
|
||||
# block (it returns the API payload as TextContent). Validation is
|
||||
# not actionable here — the bash wrapper asserts the JSON shape itself.
|
||||
logging.getLogger("mcp.client.session").setLevel(logging.ERROR)
|
||||
|
||||
|
||||
ENV_FILE = Path("/root/.hermes/.env")
|
||||
COMPOSE_FILE = "/root/damascus-orchestrator/docker-compose.yml"
|
||||
TOKEN_KEY = "DAMASCUS_API_TOKEN"
|
||||
|
||||
|
||||
def _load_token() -> str:
|
||||
token = os.environ.get(TOKEN_KEY, "").strip()
|
||||
if token:
|
||||
return token
|
||||
if not ENV_FILE.exists():
|
||||
return ""
|
||||
for raw in ENV_FILE.read_text().splitlines():
|
||||
line = raw.strip()
|
||||
if line.startswith("export "):
|
||||
line = line[len("export "):].lstrip()
|
||||
if not line.startswith(TOKEN_KEY + "="):
|
||||
continue
|
||||
val = line.split("=", 1)[1].strip()
|
||||
if (val.startswith("'") and val.endswith("'")) or (
|
||||
val.startswith('"') and val.endswith('"')
|
||||
):
|
||||
val = val[1:-1]
|
||||
return val
|
||||
return ""
|
||||
|
||||
|
||||
def _stdio_params() -> StdioServerParameters:
|
||||
token = _load_token()
|
||||
if not token:
|
||||
print(f"[verify-mcp] {TOKEN_KEY} not found in env or {ENV_FILE}", file=sys.stderr)
|
||||
sys.exit(2)
|
||||
# The MCP process runs inside damascus-api (via `docker compose exec`),
|
||||
# so it needs the container-DNS upstream URL — not localhost:9110.
|
||||
api_base = os.environ.get("DAMASCUS_API_BASE_FOR_MCP", "http://damascus-api:9110")
|
||||
return StdioServerParameters(
|
||||
command="docker",
|
||||
args=[
|
||||
"compose",
|
||||
"-f",
|
||||
COMPOSE_FILE,
|
||||
"exec",
|
||||
"-T",
|
||||
"damascus-api",
|
||||
"python",
|
||||
"-m",
|
||||
"damascus.mcp_server",
|
||||
],
|
||||
env={
|
||||
**os.environ,
|
||||
"DAMASCUS_API_BASE": api_base,
|
||||
TOKEN_KEY: token,
|
||||
},
|
||||
)
|
||||
|
||||
|
||||
async def _run(sub: str, rest: list[str]) -> int:
|
||||
params = _stdio_params()
|
||||
async with stdio_client(params) as (read, write):
|
||||
async with ClientSession(read, write) as session:
|
||||
init = await session.initialize()
|
||||
server_name = init.serverInfo.name
|
||||
|
||||
if sub == "initialize":
|
||||
print(json.dumps({
|
||||
"server_name": server_name,
|
||||
"server_version": init.serverInfo.version,
|
||||
}))
|
||||
return 0
|
||||
|
||||
if sub == "list-tools":
|
||||
tools = await session.list_tools()
|
||||
names = sorted(t.name for t in tools.tools)
|
||||
print(json.dumps({
|
||||
"server_name": server_name,
|
||||
"tool_names": names,
|
||||
"tool_count": len(names),
|
||||
}))
|
||||
return 0
|
||||
|
||||
if sub == "ingest-story":
|
||||
if len(rest) < 4:
|
||||
print(
|
||||
"[verify-mcp] ingest-story requires "
|
||||
"PROJECT STORY_ID TITLE PRIORITY",
|
||||
file=sys.stderr,
|
||||
)
|
||||
return 2
|
||||
project, story_id, title, priority = rest[:4]
|
||||
res = await session.call_tool(
|
||||
"ingest_story",
|
||||
arguments={
|
||||
"project": project,
|
||||
"story_id": story_id,
|
||||
"title": title,
|
||||
"priority": int(priority),
|
||||
},
|
||||
)
|
||||
if not res.content:
|
||||
print("[verify-mcp] empty content from ingest_story", file=sys.stderr)
|
||||
return 1
|
||||
payload = json.loads(res.content[0].text)
|
||||
print(json.dumps({"server_name": server_name, "payload": payload}))
|
||||
return 0
|
||||
|
||||
print(f"[verify-mcp] unknown subcommand: {sub!r}", file=sys.stderr)
|
||||
return 2
|
||||
|
||||
|
||||
def main() -> int:
|
||||
if len(sys.argv) < 2:
|
||||
print(__doc__, file=sys.stderr)
|
||||
return 2
|
||||
sub = sys.argv[1]
|
||||
rest = sys.argv[2:]
|
||||
try:
|
||||
return asyncio.run(_run(sub, rest))
|
||||
except Exception as exc:
|
||||
print(f"[verify-mcp] {type(exc).__name__}: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
@@ -1,89 +1,318 @@
|
||||
#!/usr/bin/env bash
|
||||
# Damascus Entry Points v1 — manual smoke verification.
|
||||
# Damascus Entry Points v1 — manual verification recipe (P6a).
|
||||
#
|
||||
# The 30-second check: hits /healthz and /v1/items so the operator
|
||||
# sees "service up + DB seeded" without a browser. Exits non-zero on
|
||||
# any failure so it can be wired into a deploy gate.
|
||||
# End-to-end smoke that proves "v1 works" without a browser. Each
|
||||
# section gates the next; the script exits non-zero on the first
|
||||
# failure so it can be wired into a deploy gate later.
|
||||
#
|
||||
# Usage:
|
||||
# bash scripts/verify.sh
|
||||
#
|
||||
# Sections (in order):
|
||||
# 1. preflight — stack healthy + API reachable
|
||||
# 2. stack-up — bring up db / damascus-api / damascus-ui-build (idempotent)
|
||||
# 3. mcp-stdio — MCP server handshake + 7 tools visible
|
||||
# 4. ingest-via-mcp — create one item via MCP ingest_story
|
||||
# 5. ui-shows-it — GET /v1/items reflects the new item, phase=spec
|
||||
# 6. drive-cycle — spec → build → review → merged via direct SQL
|
||||
# 7. cleanup — DELETE the verify-smoke rows so re-runs stay tidy
|
||||
# 8. summary — green/red checklist
|
||||
#
|
||||
# Assumes:
|
||||
# - damascus-api on http://127.0.0.1:9110 (the P2 stack)
|
||||
# - DAMASCUS_API_TOKEN set in env OR in /root/.hermes/.env
|
||||
# - curl + jq available
|
||||
# - /root/damascus-orchestrator is the project root
|
||||
# - /root/.hermes/.env contains DAMASCUS_API_TOKEN
|
||||
# - docker compose is on PATH and the damascus stack is registered
|
||||
# - python3 (with `mcp` and `httpx` installed) is on PATH
|
||||
|
||||
set -euo pipefail
|
||||
set -uo pipefail
|
||||
|
||||
# --- paths & config ---------------------------------------------------------
|
||||
|
||||
REPO_ROOT="${REPO_ROOT:-/root/damascus-orchestrator}"
|
||||
COMPOSE_FILE="${REPO_ROOT}/docker-compose.yml"
|
||||
API_BASE="${DAMASCUS_API_BASE:-http://127.0.0.1:9110}"
|
||||
TOKEN="${DAMASCUS_API_TOKEN:-}"
|
||||
MCP_HELPER="${REPO_ROOT}/scripts/_verify_mcp_helper.py"
|
||||
EVIDENCE_DIR="${REPO_ROOT}/.hermes/evidence/p6a"
|
||||
LOG_FILE="${EVIDENCE_DIR}/verify.log"
|
||||
VERIFY_PROJECT="verify-smoke"
|
||||
DB_CONTAINER="damascus-orchestrator-db-1"
|
||||
API_CONTAINER="damascus-orchestrator-damascus-api-1"
|
||||
|
||||
# Pull token from /root/.hermes/.env if not in the shell already.
|
||||
if [[ -z "${TOKEN}" && -r /root/.hermes/.env ]]; then
|
||||
TOKEN=$(grep -E '^DAMASCUS_API_TOKEN=' /root/.hermes/.env | head -1 | cut -d= -f2- | tr -d '"' | tr -d "'")
|
||||
fi
|
||||
|
||||
if [[ -z "${TOKEN}" ]]; then
|
||||
echo "[verify] DAMASCUS_API_TOKEN not found in env or /root/.hermes/.env" >&2
|
||||
exit 1
|
||||
fi
|
||||
# --- bash output helpers ----------------------------------------------------
|
||||
|
||||
bold() { printf "\033[1m%s\033[0m\n" "$*"; }
|
||||
ok() { printf " \033[32mok\033[0m %s\n" "$*"; }
|
||||
fail() { printf " \033[31mFAIL\033[0m %s\n" "$*"; exit 1; }
|
||||
green() { printf " \033[32mok\033[0m %s\n" "$*"; }
|
||||
red() { printf " \033[31mFAIL\033[0m %s\n" "$*"; }
|
||||
|
||||
bold "[verify] ${API_BASE}"
|
||||
# Track per-section results for the summary checklist. Entries are
|
||||
# "name|exit_code|note". Failures use the helper _fail.
|
||||
declare -a RESULTS=()
|
||||
CURRENT_SECTION=""
|
||||
|
||||
# ---- 1. /healthz -----------------------------------------------------------
|
||||
HEALTHZ_BODY=$(curl -sf "${API_BASE}/healthz") || fail "/healthz request failed"
|
||||
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || fail "/healthz body unexpected: ${HEALTHZ_BODY}"
|
||||
ok "/healthz => {\"status\":\"ok\"}"
|
||||
_section_start() {
|
||||
CURRENT_SECTION="$1"
|
||||
bold ""
|
||||
bold "[${CURRENT_SECTION}]"
|
||||
}
|
||||
|
||||
# ---- 2. /v1/items reads without auth ---------------------------------------
|
||||
ITEMS_COUNT=$(curl -sf "${API_BASE}/v1/items" | jq '.items | length') || fail "/v1/items request failed"
|
||||
ok "/v1/items => ${ITEMS_COUNT} items (auth not required for reads)"
|
||||
_record() {
|
||||
RESULTS+=("$1")
|
||||
}
|
||||
|
||||
# ---- 3. /v1/items?group_by=project (P5 backend) ----------------------------
|
||||
# Per the entry-points-contract §3, when group_by=project the response shape
|
||||
# is GroupedItemsResponse (not ListItemsResponse). If the handler is
|
||||
# missing (P5 backend not yet wired), we get the flat shape — surface
|
||||
# that loudly without failing the smoke (P5 is the gap explicitly
|
||||
# tracked in P6's verification report).
|
||||
GROUPED=$(curl -sf "${API_BASE}/v1/items?group_by=project" -w "\n%{http_code}" | tail -1)
|
||||
[[ "${GROUPED}" == "200" ]] || fail "/v1/items?group_by=project returned ${GROUPED}"
|
||||
ok "/v1/items?group_by=project => 200 (P5 backend response shape NOT asserted here — see docs/VERIFICATION.md)"
|
||||
# --- failure handler --------------------------------------------------------
|
||||
|
||||
# ---- 4. /v1/stats (system status) ------------------------------------------
|
||||
STATS=$(curl -sf "${API_BASE}/v1/stats" -w "\n%{http_code}" | tail -1)
|
||||
[[ "${STATS}" == "200" ]] || fail "/v1/stats returned ${STATS}"
|
||||
ok "/v1/stats => 200 (phase counts + open issues + active claims + cost_today)"
|
||||
_fail() {
|
||||
local note="$*"
|
||||
red "${CURRENT_SECTION}: ${note}"
|
||||
_record "${CURRENT_SECTION}|1|${note}"
|
||||
# Allow trap to write the summary if requested.
|
||||
exit 1
|
||||
}
|
||||
|
||||
# ---- 5. write endpoints refuse missing token -------------------------------
|
||||
NO_TOKEN_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST "${API_BASE}/v1/items" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d '{"project":"smoke","story_id":"no-auth","title":"no-auth smoke"}')
|
||||
[[ "${NO_TOKEN_STATUS}" == "401" ]] || fail "POST /v1/items without token returned ${NO_TOKEN_STATUS} (expected 401)"
|
||||
ok "POST /v1/items without token => 401 (auth gate honored)"
|
||||
# --- prerequisites ----------------------------------------------------------
|
||||
|
||||
# ---- 6. write endpoints accept valid token ---------------------------------
|
||||
INGEST_BODY=$(jq -n --arg t "$TOKEN" '{
|
||||
project: "smoke",
|
||||
story_id: "verify-" + (now | tostring),
|
||||
title: "verify.sh smoke ingest",
|
||||
priority: 50
|
||||
}')
|
||||
INGEST=$(curl -sf -X POST "${API_BASE}/v1/items" \
|
||||
-H "Authorization: Bearer ${TOKEN}" \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "${INGEST_BODY}") || fail "POST /v1/items with token failed"
|
||||
ITEM_ID=$(echo "${INGEST}" | jq -r '.item.id')
|
||||
ITEM_PHASE=$(echo "${INGEST}" | jq -r '.item.phase')
|
||||
[[ "${ITEM_PHASE}" == "spec" ]] || fail "ingest phase=${ITEM_PHASE} (expected spec)"
|
||||
ok "POST /v1/items with token => id=${ITEM_ID}, phase=${ITEM_PHASE}"
|
||||
mkdir -p "${EVIDENCE_DIR}"
|
||||
|
||||
# ---- 7. cleanup the smoke item ---------------------------------------------
|
||||
# We don't bother — story_ids with random timestamps are naturally harmless
|
||||
# and the operator can DELETE rows for project='smoke' if it piles up.
|
||||
ok "smoke ingest cycle complete (no cleanup needed — story_id is timestamped)"
|
||||
if ! command -v docker >/dev/null 2>&1; then
|
||||
_fail "docker not on PATH"
|
||||
fi
|
||||
if ! command -v curl >/dev/null 2>&1; then
|
||||
_fail "curl not on PATH"
|
||||
fi
|
||||
if ! command -v python3 >/dev/null 2>&1; then
|
||||
_fail "python3 not on PATH"
|
||||
fi
|
||||
if [[ ! -r "${COMPOSE_FILE}" ]]; then
|
||||
_fail "compose file not readable: ${COMPOSE_FILE}"
|
||||
fi
|
||||
if [[ ! -r "${MCP_HELPER}" ]]; then
|
||||
_fail "MCP helper not readable: ${MCP_HELPER}"
|
||||
fi
|
||||
|
||||
bold "[verify] PASSED — all 7 checks green"
|
||||
# ===========================================================================
|
||||
# 1. preflight
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "1. preflight"
|
||||
|
||||
API_LINE=$(docker compose -f "${COMPOSE_FILE}" ps damascus-api 2>/dev/null | tail -n +2 | head -1 || true)
|
||||
if [[ -z "${API_LINE}" ]]; then
|
||||
_fail "damascus-api not running; bring it up first (stack-up section will do that next)"
|
||||
fi
|
||||
if ! grep -q "healthy" <<<"${API_LINE}"; then
|
||||
_fail "damascus-api is not healthy: ${API_LINE}"
|
||||
fi
|
||||
green "docker compose ps damascus-api -> healthy"
|
||||
|
||||
HEALTHZ_BODY=$(curl -fsS "${API_BASE}/healthz" 2>/dev/null) || _fail "/healthz request failed"
|
||||
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || _fail "/healthz body unexpected: ${HEALTHZ_BODY}"
|
||||
green "${API_BASE}/healthz -> {\"status\":\"ok\"}"
|
||||
|
||||
ITEMS_STATUS=$(curl -s -o /dev/null -w '%{http_code}' "${API_BASE}/v1/items")
|
||||
[[ "${ITEMS_STATUS}" == "200" ]] || _fail "/v1/items returned ${ITEMS_STATUS}"
|
||||
green "${API_BASE}/v1/items -> 200"
|
||||
|
||||
_record "1. preflight|0|stack healthy + API reachable"
|
||||
|
||||
# ===========================================================================
|
||||
# 2. stack-up
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "2. stack-up"
|
||||
|
||||
# `up -d` is idempotent on running services. damascus-ui-build is a
|
||||
# one-shot (restart: "no") that copies the Vite bundle into the named
|
||||
# volume; if the bundle is already there from a previous build the
|
||||
# one-shot just exits 0 again. Acceptable side effect on re-runs.
|
||||
docker compose -f "${COMPOSE_FILE}" up -d db damascus-api damascus-ui-build >/dev/null 2>&1 \
|
||||
|| _fail "docker compose up failed"
|
||||
|
||||
# Wait up to 30s for /healthz (covers the case where we just started a cold stack).
|
||||
WAITED=0
|
||||
HEALTHZ_BODY=""
|
||||
while (( WAITED < 30 )); do
|
||||
HEALTHZ_BODY=$(curl -fsS "${API_BASE}/healthz" 2>/dev/null || true)
|
||||
if [[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]]; then
|
||||
break
|
||||
fi
|
||||
sleep 1
|
||||
WAITED=$((WAITED + 1))
|
||||
done
|
||||
[[ "${HEALTHZ_BODY}" == '{"status":"ok"}' ]] || _fail "/healthz not ok after ${WAITED}s"
|
||||
green "stack up; /healthz ok (waited ${WAITED}s)"
|
||||
|
||||
_record "2. stack-up|0|db + api + ui-build up; healthz responsive"
|
||||
|
||||
# ===========================================================================
|
||||
# 3. mcp-stdio
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "3. mcp-stdio"
|
||||
|
||||
INIT_JSON=$(python3 "${MCP_HELPER}" initialize 2>/dev/null) \
|
||||
|| { INIT_ERR=$(python3 "${MCP_HELPER}" initialize 2>&1 >/dev/null); _fail "MCP initialize failed: ${INIT_ERR}"; }
|
||||
SERVER_NAME=$(printf '%s' "${INIT_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['server_name'])")
|
||||
[[ "${SERVER_NAME}" == "damascus-mcp" ]] || _fail "MCP server name=${SERVER_NAME!r} (expected damascus-mcp)"
|
||||
green "initialize -> server_name=${SERVER_NAME}"
|
||||
|
||||
TOOLS_JSON=$(python3 "${MCP_HELPER}" list-tools 2>/dev/null) \
|
||||
|| { TOOLS_ERR=$(python3 "${MCP_HELPER}" list-tools 2>&1 >/dev/null); _fail "MCP list-tools failed: ${TOOLS_ERR}"; }
|
||||
TOOL_COUNT=$(printf '%s' "${TOOLS_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['tool_count'])")
|
||||
[[ "${TOOL_COUNT}" == "7" ]] || _fail "MCP tool_count=${TOOL_COUNT} (expected 7)"
|
||||
TOOL_NAMES=$(printf '%s' "${TOOLS_JSON}" | python3 -c "import sys, json; print(', '.join(json.load(sys.stdin)['tool_names']))")
|
||||
green "tools/list -> ${TOOL_COUNT} tools: ${TOOL_NAMES}"
|
||||
|
||||
_record "3. mcp-stdio|0|handshake + 7 tools visible"
|
||||
|
||||
# ===========================================================================
|
||||
# 4. ingest-via-mcp
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "4. ingest-via-mcp"
|
||||
|
||||
STORY_ID="VERIFY-$(date +%s)-$$"
|
||||
TITLE="P6a smoke (auto-generated)"
|
||||
PRIORITY=100
|
||||
|
||||
# Capture only stdout. If the helper exits non-zero, re-run with stderr
|
||||
# merged so the error message reaches _fail.
|
||||
INGEST_JSON=$(python3 "${MCP_HELPER}" ingest-story "${VERIFY_PROJECT}" "${STORY_ID}" "${TITLE}" "${PRIORITY}" 2>/dev/null) \
|
||||
|| { INGEST_ERR=$(python3 "${MCP_HELPER}" ingest-story "${VERIFY_PROJECT}" "${STORY_ID}" "${TITLE}" "${PRIORITY}" 2>&1 >/dev/null); _fail "MCP ingest_story failed: ${INGEST_ERR}"; }
|
||||
|
||||
INGEST_PHASE=$(printf '%s' "${INGEST_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['payload']['item']['phase'])")
|
||||
INGEST_ID=$(printf '%s' "${INGEST_JSON}" | python3 -c "import sys, json; print(json.load(sys.stdin)['payload']['item']['id'])")
|
||||
[[ "${INGEST_PHASE}" == "spec" ]] || _fail "ingest phase=${INGEST_PHASE} (expected spec)"
|
||||
green "ingest_story -> id=${INGEST_ID}, phase=${INGEST_PHASE}, project=${VERIFY_PROJECT}, story_id=${STORY_ID}"
|
||||
|
||||
_record "4. ingest-via-mcp|0|story=${STORY_ID} phase=spec"
|
||||
|
||||
# ===========================================================================
|
||||
# 5. ui-shows-it
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "5. ui-shows-it"
|
||||
|
||||
ITEMS_JSON=$(curl -fsS "${API_BASE}/v1/items" 2>/dev/null) || _fail "/v1/items failed"
|
||||
|
||||
# Inline Python matcher: find the item by id, print phase or exit non-zero.
|
||||
MATCHED=$(ITEM_ID="${INGEST_ID}" ITEMS_JSON="${ITEMS_JSON}" python3 <<'PY'
|
||||
import json, os
|
||||
target = os.environ["ITEM_ID"]
|
||||
data = json.loads(os.environ["ITEMS_JSON"])
|
||||
for item in data.get("items", []):
|
||||
if item.get("id") == target:
|
||||
print(json.dumps({
|
||||
"id": item["id"],
|
||||
"phase": item["phase"],
|
||||
"project": item["project"],
|
||||
"story_id": item["story_id"],
|
||||
}))
|
||||
raise SystemExit(0)
|
||||
raise SystemExit(2)
|
||||
PY
|
||||
) || _fail "item ${INGEST_ID} not found in /v1/items"
|
||||
MATCH_PHASE=$(printf '%s' "${MATCHED}" | python3 -c "import sys, json; print(json.load(sys.stdin)['phase'])")
|
||||
[[ "${MATCH_PHASE}" == "spec" ]] || _fail "matched item phase=${MATCH_PHASE} (expected spec)"
|
||||
green "/v1/items -> row visible: ${MATCHED}"
|
||||
|
||||
_record "5. ui-shows-it|0|/v1/items reflects new row at phase=spec"
|
||||
|
||||
# ===========================================================================
|
||||
# 6. drive-cycle
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "6. drive-cycle"
|
||||
|
||||
# We drive phase transitions via direct SQL on the db container (matches
|
||||
# the pattern in tests/e2e/test_entry_points_e2e.py::phase3). Rationale:
|
||||
# the orchestrator worker is running and could race a `state.set_phase`
|
||||
# call, so the SQL UPDATE bypasses claim semantics entirely. We also
|
||||
# null out claimed_* and stamp merged_at so the row matches the shape
|
||||
# of one that the cycle actually produced.
|
||||
#
|
||||
# IMPORTANT: this test rows race the live orchestrator cycle. The
|
||||
# orchestrator may have already moved this item from `spec` to a
|
||||
# different phase by the time we get here — e.g. it may already be
|
||||
# `blocked` with a `spec_wrong` verdict. We assert the *transition*
|
||||
# succeeds at the SQL level and the API reflects each new phase, but
|
||||
# we tolerate the case where the row is already past spec.
|
||||
drive_one() {
|
||||
local target_phase="$1"
|
||||
local item_id="$2"
|
||||
if [[ "${target_phase}" == "merged" ]]; then
|
||||
docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -v ON_ERROR_STOP=1 -q \
|
||||
-c "UPDATE work_items SET phase='${target_phase}', claimed_by=NULL, claimed_at=NULL, merged_at=NOW(), updated_at=NOW() WHERE id='${item_id}'" \
|
||||
>/dev/null 2>&1 \
|
||||
|| _fail "psql UPDATE to phase=${target_phase} failed"
|
||||
else
|
||||
docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -v ON_ERROR_STOP=1 -q \
|
||||
-c "UPDATE work_items SET phase='${target_phase}', claimed_by=NULL, claimed_at=NULL, updated_at=NOW() WHERE id='${item_id}'" \
|
||||
>/dev/null 2>&1 \
|
||||
|| _fail "psql UPDATE to phase=${target_phase} failed"
|
||||
fi
|
||||
local actual_phase
|
||||
actual_phase=$(curl -fsS "${API_BASE}/v1/items/${item_id}" 2>/dev/null \
|
||||
| python3 -c "import sys, json; print(json.load(sys.stdin)['item']['phase'])") \
|
||||
|| _fail "/v1/items/${item_id} failed after UPDATE to ${target_phase}"
|
||||
[[ "${actual_phase}" == "${target_phase}" ]] || _fail "phase after UPDATE = ${actual_phase} (expected ${target_phase})"
|
||||
green " -> phase=${actual_phase} (via API)"
|
||||
}
|
||||
|
||||
drive_one build "${INGEST_ID}"
|
||||
sleep 1
|
||||
drive_one review "${INGEST_ID}"
|
||||
sleep 1
|
||||
drive_one merged "${INGEST_ID}"
|
||||
|
||||
# Sanity: merged_at must be populated on the merged row.
|
||||
MERGED_AT=$(docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -tA \
|
||||
-c "SELECT merged_at IS NOT NULL FROM work_items WHERE id='${INGEST_ID}'")
|
||||
[[ "${MERGED_AT}" == "t" ]] || _fail "merged_at not set on item ${INGEST_ID}"
|
||||
green " -> merged_at populated"
|
||||
|
||||
_record "6. drive-cycle|0|spec->build->review->merged, merged_at set"
|
||||
|
||||
# ===========================================================================
|
||||
# 7. cleanup
|
||||
# ===========================================================================
|
||||
|
||||
_section_start "7. cleanup"
|
||||
|
||||
DELETED=$(docker exec "${DB_CONTAINER}" psql -U damascus -d damascus -tA \
|
||||
-c "DELETE FROM work_items WHERE project='${VERIFY_PROJECT}' RETURNING id")
|
||||
DELETED_COUNT=$(printf '%s\n' "${DELETED}" | grep -cE '^[0-9a-f-]{36}$' || true)
|
||||
[[ "${DELETED_COUNT}" -ge 1 ]] || _fail "cleanup DELETE removed ${DELETED_COUNT} rows (expected >=1)"
|
||||
green "DELETE FROM work_items WHERE project='${VERIFY_PROJECT}' -> ${DELETED_COUNT} row(s) removed"
|
||||
|
||||
_record "7. cleanup|0|verify-smoke rows purged (${DELETED_COUNT})"
|
||||
|
||||
# ===========================================================================
|
||||
# 8. summary
|
||||
# ===========================================================================
|
||||
|
||||
bold ""
|
||||
bold "[8. summary]"
|
||||
GREEN_COUNT=0
|
||||
RED_COUNT=0
|
||||
for entry in "${RESULTS[@]}"; do
|
||||
name="${entry%%|*}"
|
||||
rest="${entry#*|}"
|
||||
code="${rest%%|*}"
|
||||
note="${rest#*|}"
|
||||
if [[ "${code}" == "0" ]]; then
|
||||
green "${name} ${note}"
|
||||
GREEN_COUNT=$((GREEN_COUNT + 1))
|
||||
else
|
||||
red "${name} ${note}"
|
||||
RED_COUNT=$((RED_COUNT + 1))
|
||||
fi
|
||||
done
|
||||
|
||||
bold ""
|
||||
bold "verify.sh: ${GREEN_COUNT} passed, ${RED_COUNT} failed"
|
||||
if [[ "${RED_COUNT}" -gt 0 ]]; then
|
||||
exit 1
|
||||
fi
|
||||
echo "evidence: ${LOG_FILE}"
|
||||
echo " (re-run with: bash scripts/verify.sh 2>&1 | tee ${LOG_FILE})"
|
||||
exit 0
|
||||
Reference in New Issue
Block a user