damascus-orchestrator/README.md
Kaysser Kayyali e21a8c1f53
Migrate to Postgres + Taskiq (conform to orchestration plan)
Bring the code to the plan: MySQL→Postgres 16 and cron→Taskiq (Python
BullMQ-equivalent over a Redis broker), with Postgres FOR UPDATE SKIP LOCKED
retained as the atomic claim. The per-tick "claim one item, run one phase"
model is unchanged.

Approved fixes folded in:
- claim_for_merge: delete the call to the non-existent state.claim_for_merge;
  claim order is now review→build→spec (merge happens inside review on pass).
- loop-breaker: a non-pass verdict with attempts>=budget_cycles parks the row
  as `blocked` + opens a human_issue + emits work.blocked (design §5/§16).
- spec_wrong: added to phases.VERDICTS and emitted by refine_spec when the
  spec is missing required sections (routes to spec, not awaiting_human).

Driver: PyMySQL→psycopg3 sync (dict_row cursor, Jsonb() for JSONB). schema.sql
rewritten to PG16 (enums, JSONB, TIMESTAMPTZ, BIGSERIAL, BEFORE UPDATE trigger
replacing MySQL ON UPDATE). cli init guard-creates the DB and applies the whole
schema in one execute().

New src/damascus/tasks.py wires ListQueueBroker + TaskiqScheduler with a
run_cycle task (→ cycle.tick()) on a cron label. Dockerfile CMD runs the
worker; docker-compose adds redis:7 + an orchestrator-scheduler service.

Bugs found and fixed during verification:
- cycle.py/cli.py status file was hardcoded to /data; now uses settings.data_dir.
- redis-py 8.0.0 defaults socket_timeout=5s, which killed idle Taskiq workers
  (indefinite BRPOP + uncaught TimeoutError). Broker now sets socket_timeout=None.
- docker-compose scheduler command pointed at :broker; fixed to :scheduler.
- tasks.py docstring referenced non-existent --concurrency; corrected to
  --max-threadpool-threads.

Verified: schema idempotent against postgres:16; damascus init end-to-end;
19 contract+unit tests green; Taskiq worker kiq path advances a row; Taskiq
scheduler path (no damascus cycle call) drives spec→build→retry→blocked +
human_issue, proving the queue replaces cron and the loop-breaker via the queue.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-06-23 13:24:58 -04:00


Damascus Orchestrator

A self-hosted work-item state machine that autonomously advances stories through spec → build → review → merge. Designed per multi-project-orchestration-plan_1.md (the design doc this repo implements).

Quick start

# Start the stack (Postgres + Redis + Taskiq worker + scheduler)
docker compose up -d --build

# Apply schema (creates the DB + all tables/types/triggers)
docker compose exec orchestrator damascus init

# Manual one-shot cycle (operators / E2E). The Taskiq worker is the
# automatic trigger — you do not normally need to run this by hand.
docker compose exec orchestrator damascus cycle

# External concurrency view
docker compose exec orchestrator damascus status

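# Run the contract + unit suite (needs a Postgres instance listening on 127.0.0.1:5432)
# The extras spec is quoted so shells such as zsh do not glob-expand the brackets.
pip install -e . pytest "psycopg[binary]"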
pytest tests/contract/ tests/unit/ -v

# Run the E2E suite (needs the docker-compose stack up)
pip install pytest psycopg
pytest tests/ -v

What this repo contains

  • src/damascus/ — the Python package (cycle, phases, state, git_ops, llm, cli, relay, wiki, tasks, config)
  • tests/ — the suite (contract + unit + E2E; the executable form of the design doc)
  • schema.sql — Postgres 16 schema (work_items, coordination_gates, human_issues, cost_ledger, events_outbox)
  • docker-compose.yml — the stack (db + redis + orchestrator worker + scheduler + sidecar-status)
  • Dockerfile — the orchestrator image (Python 3.12 + git + claude-code + BMAD + LLM-wiki + ollama binary)
  • .gitea/workflows/ — CI
  • skills/SKILL.md — operator-facing skill

What this repo does NOT contain

  • The wiki (kaykayyali/damascus-wiki) — separate repo with the contract docs
  • The test project repos (wh40k-pc, restitution) — separate repos with their own BMAD stories
  • The lore entry about this project (kaykayyali/loreInfrastructure/Damascus-Orchestrator)

Architecture (one paragraph)

Postgres is the source of truth for work-item state (design §3). Each story row flows through three loops: the spec-refiner (an LLM via LiteLLM writes an implementable spec), the code-builder (Claude Code via LiteLLM writes the code in a git worktree and opens a real Gitea PR), and the reviewer (re-runs the spec's test command, gates on objective pass/fail, and merges via the Gitea API on pass). The atomic claim uses SELECT ... FOR UPDATE SKIP LOCKED. Taskiq (a BullMQ-equivalent Python queue, §13) with a Redis broker is the recurring trigger; the worker's --max-threadpool-threads N is the global concurrency cap (§10). Every phase transition emits a typed verdict and an events_outbox row in the same transaction. An attempt budget guarantees termination: a non-pass verdict that exhausts the budget parks the item as blocked and opens a human_issue (§5/§16). The human is async: open questions become human_issues rows, never synchronous blocks.
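
As a rough illustration of the atomic claim, the sketch below tries each phase in the review→build→spec order and claims at most one row with FOR UPDATE SKIP LOCKED. It is not the code in src/damascus/state.py. The column names (id, phase, status), the 'ready' and 'claimed' values, and the names CLAIM_ORDER, CLAIM_SQL and claim_one are assumptions made for this example; only the work_items table name and the claim order come from this repo. The cursor is assumed to be a psycopg3-style cursor inside a transaction that the caller commits.

```python
from typing import Any, Optional

# Claim order: review first, then build, then spec.
CLAIM_ORDER = ("review", "build", "spec")

# Hypothetical columns and status values; only the table name is from schema.sql.
# SKIP LOCKED makes concurrent workers skip rows another transaction has locked,
# so two workers never claim the same row and neither waits on the other.
CLAIM_SQL = """
UPDATE work_items
   SET status = 'claimed'
 WHERE id = (
        SELECT id
          FROM work_items
         WHERE phase = %s AND status = 'ready'
         ORDER BY id
         LIMIT 1
           FOR UPDATE SKIP LOCKED
       )
RETURNING id, phase
"""


def claim_one(cur: Any) -> Optional[Any]:
    """Try each phase in CLAIM_ORDER and return the first claimed row, or None."""
    for phase in CLAIM_ORDER:
        cur.execute(CLAIM_SQL, (phase,))
        row = cur.fetchone()
        if row is not None:
            return row
    return None
```

A tick would call claim_one once, run the single phase for the returned row, and commit; a None result means there is nothing to do this tick.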

Full design + contracts in the wiki: kaykayyali/damascus-wiki.
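
The attempt budget from the architecture paragraph can be sketched as a pure function. This is a minimal illustration, not the logic in src/damascus/phases.py. The names Outcome and apply_verdict and the "advance" and "retry" labels are hypothetical; attempts, budget_cycles, the blocked status and the work.blocked event are the terms used in this repo. The sketch assumes attempts already includes the cycle that produced the verdict.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Outcome:
    status: str                                  # "advance", "retry" (both hypothetical) or "blocked"
    events: List[str] = field(default_factory=list)
    open_human_issue: bool = False


def apply_verdict(verdict: str, attempts: int, budget_cycles: int) -> Outcome:
    """Decide what happens to a work item after one phase produced a verdict."""
    if verdict == "pass":
        return Outcome(status="advance")
    if attempts >= budget_cycles:
        # Loop-breaker: park the row, open a human_issue, emit work.blocked.
        return Outcome(status="blocked", events=["work.blocked"], open_human_issue=True)
    # Non-pass verdict with budget remaining: the item is tried again.
    return Outcome(status="retry")
```

If every non-pass cycle increments attempts, the blocked branch is reached after a bounded number of retries, which is the termination guarantee.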

Cross-references

  • Design doc: kaykayyali/damascus-wiki/raw/articles/multi-project-orchestration-plan-1.md
  • Contracts: kaykayyali/damascus-wiki/concepts/spec-refiner-contract.md, builder-contract.md, reviewer-contract.md, state-resume-protocol.md
  • Lore entry: kaykayyali/loreInfrastructure/Damascus-Orchestrator
  • Test session: kaykayyali/damascus-wiki/queries/damascus-orchestrator/2026-06-23-test-session.md
  • Spec-refiner gap: kaykayyali/damascus-wiki/queries/damascus-orchestrator/spec-refiner-gap-2026-06-23.md

License

Internal.