fix(cycle): 3-txn cycle + stale-claim filter + max_tokens=4000 #8
What
Fix three bugs in the orchestrator's cycle, spec-refiner, and claim pipeline. The fixes had been sitting in the working tree as anonymous WIP, and the bugs were verified in the running container at this tick.
Why (running container state, 2026-06-24)
`damascus cycle` errors with:
```
psycopg.errors.SyntaxError: syntax error at or near "$2"
LINE 5: ...aimed_at IS NULL OR claimed_at < NOW() - INTERVAL $2 MINUTE)
```
Cause: a half-finished migration to the literal-interval form left the executor passing `STALE_CLAIM_MINUTES` as a parameter to a query that no longer has a placeholder for it. The working tree had the correct f-string form but the running container image was built from an intermediate commit.
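For reference, a minimal sketch of the literal-interval form. PostgreSQL's `INTERVAL '...' MINUTE` syntax expects a string constant, so a bind parameter in that position is rejected, which is what the error above shows. The helper name `stale_claim_sql` and its signature are illustrative assumptions, not the code in `state.py`; `DAMASCUS_STALE_CLAIM_MINUTES` and the default of 30 are the values named in this PR.

```python
import os


def stale_claim_sql(minutes=None):
    """Build the stale-claim predicate with the window inlined as a literal.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if minutes is None:
        # Variable name and default (30) are taken from this PR's description.
        minutes = os.environ.get("DAMASCUS_STALE_CLAIM_MINUTES", "30")
    # int() rejects anything that is not a whole number, so the f-string
    # below can only ever inline digits into the SQL text.
    window = int(minutes)
    return (
        "(claimed_at IS NULL OR "
        f"claimed_at < NOW() - INTERVAL '{window}' MINUTE)"
    )


print(stale_claim_sql(30))
```

Because the window is part of the SQL text, no extra argument is passed to `cur.execute` for it. A non-numeric setting fails at `int()` before any query is built.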
Changes
3-transaction cycle (`src/damascus/cycle.py`): claim, phase function, and verdict write each open a fresh transaction. A phase crash no longer leaves the row locked; the claim is preserved and freed only by the stale-claim window. Verified pattern from wiki/concepts/state-resume-protocol.md.
Stale-claim filter (`src/damascus/state.py`): every `claim_for_*` query gates on `claimed_at IS NULL OR claimed_at < NOW() - INTERVAL '<N>' MINUTE`, where `<N>` is the window in minutes, inlined as a literal. The window is configurable via `DAMASCUS_STALE_CLAIM_MINUTES` (default 30 minutes). This is critical for the heartbeating-agent pattern: a live worker's claim must not be stolen by the next tick.
`max_tokens=4000` in `refine_spec` (`src/damascus/phases.py`): the previous cap of 1500 truncated a 6-section spec before `## Test Command` was emitted, producing `spec_wrong`. 4000 is the verified safe floor.
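The 3-transaction shape can be sketched roughly as follows. This is not the code in `cycle.py`: it swaps in the standard-library `sqlite3` module for PostgreSQL so it runs standalone, the `tasks` table and the `transaction` and `run_cycle` names are hypothetical, and the stale-claim filter is left out of the claim query for brevity.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit on success, roll back on error (stand-in for state.transaction())."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def run_cycle(conn, phase):
    """Claim, run the phase, and write the verdict in three separate transactions."""
    # Transaction 1: claim one unclaimed row and commit right away.
    with transaction(conn):
        row = conn.execute(
            "SELECT id FROM tasks WHERE claimed_at IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return "idle"
        task_id = row[0]
        conn.execute(
            "UPDATE tasks SET claimed_at = datetime('now') WHERE id = ?",
            (task_id,),
        )

    # Transaction 2: the phase function. A crash rolls back only the phase's
    # own writes; the claim from transaction 1 is already committed.
    try:
        with transaction(conn):
            verdict = phase(conn, task_id)
    except Exception:
        return "phase_crashed"

    # Transaction 3: persist the verdict.
    with transaction(conn):
        conn.execute(
            "UPDATE tasks SET verdict = ? WHERE id = ?", (verdict, task_id)
        )
    return "done"
```

With a table such as `tasks(id, claimed_at, verdict)`, a phase that raises leaves `claimed_at` set and `verdict` empty, so the row is not picked up again until the stale-claim window has passed.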
Contract tests (all in `tests/contract/test_contracts_match_source.py`)
Verified locally
```
pytest tests/contract/ tests/unit/ -q → 25/25 pass
```
(4 new contract tests on top of the 21 already on main.)
Self-review (heartbeat agent)
I am the heartbeat for damascus-orchestrator in YOLO mode. I authored this PR against anonymous WIP that was in the working tree of a prior tick; the running container was failing every cycle on the syntax error above. Per the heartbeating-agent pattern I cannot self-approve (Gitea rejects `tea pulls approve` on my own PR) — please review and merge when you return.
Out of scope
Heartbeat self-review (cycle tick at 2026-06-24 ~03:00)
Diff scope: 4 files, +144 / -30.
What this fixes:
`psycopg.errors.SyntaxError` because the prior intermediate commit left `STALE_CLAIM_MINUTES` parameterized after the SQL went literal. Working tree had the f-string form; container image was stale.
`max_tokens` 1500 → 4000 per the verified LLM-output-size pitfall.
What I checked:
`{int(STALE_CLAIM_MINUTES)}` inlines to a literal minute value (no parameter is passed to `cur.execute` for the interval).
`with state.transaction()` blocks.
`claim_for_*` paths inject `STALE_CLAIM_SQL`.
Risk:
Operational note: container image is still broken (cycle errors on every tick). Once this PR merges, rebuild via `docker compose build && docker compose up -d` and the cycle will go green. I am NOT rebuilding the container from the host on this heartbeat tick; that requires the merged code.
Blocking on: human review/merge.