Reviewer validate layer passes through when worktree is missing — test_reviewer_03 RED #13
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Reviewer validate layer passes through when worktree is missing — test_reviewer_03 RED
The contract
wiki/concepts/reviewer-contract.md§1 Layer 1 (validate) lists 6 mechanical checks as a hard gate:pr_url IS NOT NULL## Test Commandexits 0 in the worktreeThe contract treats these as a hard block: any failure →
tests_failed(orrebase_conflictfor #6), row goes back tobuildwithattempts++.The gap
src/damascus/phases.py:311-317has two early-returnpassbranches:When either is hit, the reviewer passes through to merge even though no actual test/build/lint was run. The "pass" verdict then routes to
mergedper_next_phase_on_verdict. The validate layer's hard gate is silently downgraded to advisory in the missing-worktree / no-test_cmd case.Why it matters
This is the "Surviving bugs revealed by revived-dead tests" pattern from the operator skill. The test was previously dead (MySQL connection refused), the Postgres migration revived it, and its first failure surfaces a real Loop-C bug. A build that ships a no-worktree state (e.g. the build phase ran into a network error and left the row in
reviewwith no worktree) currently passes review and reachesmerged, even though no actual test was run, no actual lint was checked, no actual build was verified. The whole point of the validate layer — defense in depth against build-phase lies — is bypassed.The E2E test that surfaces it
tests/e2e/test_reviewer.py::test_reviewer_03_validate_layer_runs_test_cmd— currently RED:The test inserts a row with
test_cmd="exit 1", sets phase='review' with a fake pr_url, runs a cycle, and asserts the row routes tobuildorblocked. The actual behavior: the worktree doesn't exist, the reviewer returnspass(worktree-gone early-return), the row is supposed to goreview → mergedbut stays inreviewbecause the actual merge call against a fake PR URL fails.The three options (from the gap note)
Option A — Fail-closed on missing worktree / test_cmd (5-line patch)
Replace both early-return
passbranches withtests_failed:Cost: 5 minutes. Risk: rows that legitimately have a PR but no worktree (worktree pruned, branch deleted on remote but PR still open) now go to
buildinstead ofmerged. Recommendation: ship Option B in the same PR; if Option B is too large for one PR, ship Option A as a follow-up fix and schedule Option B as a follow-up PR.Option B — Re-create worktree from row's
branchfield, then validate (recommended)The state-resume-protocol contract says
build()already implements idempotent resume (worktree list, branch check, PR existence check, PR #9 merged 2026-06-24 05:15). Apply the same pattern to the reviewer's validate layer: before failing-closed, try to re-create the worktree at the row'sbranchfield's tip. If re-create succeeds, run the test_cmd. If it fails, returntests_failedwith the recreate error.Cost: 30-60 minutes. Risk: worktree-recreate path could fail in a way that masks the original test failure. Mitigation: log the recreate error prominently in
last_feedback.recreate_error.Option C — Distinguish "validate passed" from "validate skipped" via typed verdict (largest change)
Add a new typed verdict
validate_skippedto theVERDICTSset. The reviewer returnspassonly when validate actually ran and passed. The state machine then enforces "merged requires validate_ran=true." Requires a contract update, the typed-verdict table, and a re-audit of every place that consumespass.Cost: 2-3 hours. Risk: big blast radius.
Recommendation
Option B (re-create worktree from
branchfield, then validate). Reuses the state-resume-protocol pattern frombuild(). The reviewer contract should follow the same shape as the build contract.What needs human/BMAD input
The options require a design call on how aggressive the worktree-recreate path should be:
branchtip unconditionallybranchis on a known set of projects (whitelist)last_feedback.pr_urlis a Gitea PR that's still open (cross-check via API)B-1 is the simplest and matches the build phase's pattern. B-2/B-3 add safety for production multi-project setups. For this homelab, B-1 is the right shape.
Minimum bar to call it done
src/damascus/phases.py:311-322to:branchfield (usinggit_ops.ensure_worktreeor equivalent)tests_failedwith the recreate error inlast_feedbacktests/eviewer.py::test_reviewer_03_validate_layer_runs_test_cmdgoes GREENtests/contract/test_contracts_match_source.pythat asserts the reviewer code does NOT contain the early-returnpassbranches (or contains a comment naming the gap)tests/e2e/test_reviewer_01_no_pr_defensive_guard(theno_prdefensive guard must still fire before the worktree check)tests/e2e/test_reviewer_04_subjective_comments_do_not_block_mergeOut of scope for this issue
no_prdefensive guard (PR #1 baseline, working)attempts >= budget_cycles(PR #3, working)Cross-references
wiki/queries/damascus-orchestrator/reviewer-validates-failing-test-cmd-still-merges-2026-06-24.mdwiki/concepts/reviewer-contract.mdwiki/concepts/state-resume-protocol.md, PR fix(build): idempotent resume for worktree/branch/PR (state-resume contract) (#9)