From 30fe48d2653de98a44e0fa36f53f053b1702cf66 Mon Sep 17 00:00:00 2001 From: Rodin <4+rodin@noreply.gitea.weiker.me> Date: Fri, 15 May 2026 08:12:02 +0000 Subject: [PATCH] docs(#148): add SKILL.md and dev-loop-spec.md for dispatch redesign (#149) --- SKILL.md | 129 ++++++++++++++++++++ docs/dev-loop-spec.md | 278 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 407 insertions(+) create mode 100644 SKILL.md create mode 100644 docs/dev-loop-spec.md diff --git a/SKILL.md b/SKILL.md new file mode 100644 index 0000000..815dfad --- /dev/null +++ b/SKILL.md @@ -0,0 +1,129 @@ +# Dev-Loop Skill: review-bot + +This file documents the dev-loop architecture for the `review-bot` project. +It lives in the repo so changes are version-controlled alongside the code. + +## Architecture + +Dispatch is a **pure shell script** — no model reasoning. + +``` +Cron (agentTurn, toolsAllow: [exec, sessions_spawn, read]) + → runs dispatch script + → reads output for SPAWN or HANDOFF lines + → spawns worker if instructed + +Dispatch script (~/.openclaw/workspace/scripts/dev-loop-dispatch.sh) + → pure bash, all decisions are curl API calls + branches + → exits after emitting one SPAWN line (at most one worker per run) + → emits HANDOFF for each qualifying PR (does not exit after HANDOFF) + +Workers (Opus, spawned by cron model) + → receive precise task description + → do one job: self-review, fix CI, address feedback, or implement + → remove wip label when done, reply NO_REPLY +``` + +The cron model's **only** job: run script, read output, spawn worker if told to. +The model **never** assesses project state or makes dispatch decisions. + +## Safety Invariants + +1. **NEVER MERGE** — no merge API call exists anywhere in the script or worker templates +2. **REQUEST_CHANGES always blocks** — checked first, before CI, before self-review, before handoff +3. **WIP mutex** — one active worker per repo; WIP label gates new issue pickup +4. **One SPAWN per run** — script emits at most one SPAWN line per execution +5. **set -euo pipefail** — any curl failure aborts immediately, no partial actions +6. **Workers reply NO_REPLY** — no dispatch-level side effects (workers may push changes and manage labels as part of their task) + +## Dispatch Rules (in order) + +| Rule | Condition | Action | +|------|-----------|--------| +| 0 | WIP label > 1hr old | Remove stale WIP, continue | +| 0b | WIP label ≤ 1hr old | Mark ACTIVE_WIP=1, continue (only gates Rule 10) | +| _(1)_ | _(reserved — intentionally unused)_ | — | +| 2 | Any reviewer has REQUEST_CHANGES | SPAWN:findings | +| 3 | PR not mergeable | SPAWN:rebase | +| 4 | CI failure, no fix plan | SPAWN:ci-fix | +| 4b | CI failure, fix plan exists | Skip (worker in progress) | +| 5 | Bot review missing | Wait | +| 6 | CI pending/unknown | Wait | +| 7 | No clean self-review, no fix plan | SPAWN:self-review | +| 7b | Self-review needs attention, no fix plan | SPAWN:sr-fix | +| 8 | Unacknowledged bot review findings | SPAWN:address-feedback | +| 9 | Unresolved inline diff comments | SPAWN:address-feedback | +| 10 | All checks pass | HANDOFF | +| 11 | No open PRs + no ACTIVE_WIP | SPAWN:impl (next issue) | + +## Files + +| File | Description | +|------|-------------| +| `~/.openclaw/workspace/scripts/dev-loop-dispatch.sh` | Dispatch script — pure bash | +| `~/.openclaw/workspace/scripts/worker-tasks/self-review.md` | Self-review worker template | +| `~/.openclaw/workspace/scripts/worker-tasks/sr-fix.md` | Fix findings from self-review | +| `~/.openclaw/workspace/scripts/worker-tasks/ci-fix.md` | CI fix worker template | +| `~/.openclaw/workspace/scripts/worker-tasks/address-feedback.md` | Address feedback worker template | +| `~/.openclaw/workspace/scripts/worker-tasks/findings.md` | Address REQUEST_CHANGES findings | +| `~/.openclaw/workspace/scripts/worker-tasks/rebase.md` | Rebase worker template | +| `~/.openclaw/workspace/scripts/worker-tasks/impl.md` | Issue implementation worker template | +| `~/.openclaw/workspace/scripts/test/dispatch.bats` | Unit tests (bats) | +| `~/.openclaw/workspace/scripts/test/check-invariants.sh` | Static invariant checks | +| `~/.openclaw/workspace/memory/projects/review-bot.yaml` | Project config | + +## Project Config + +Config is at `~/.openclaw/workspace/memory/projects/review-bot.yaml`. + +Key fields: +- `repo`: `rodin/review-bot` +- `api_base`: `https://gitea.weiker.me/api/v1` +- `user`: `rodin` (bot Gitea username) +- `labels.wip`: WIP label ID +- `labels.ready`: ready label ID +- `review_bots`: list of bot sentinel names + +## Cron Config + +```yaml +- label: review-bot-dev-loop + schedule: "*/15 * * * *" + prompt: | + Run: bash ~/.openclaw/workspace/scripts/dev-loop-dispatch.sh review-bot + + Read the output. If it contains a SPAWN line, load the matching template from + ~/.openclaw/workspace/scripts/worker-tasks/.md, substitute {{PROJECT}}, + {{PR_NUM}}, and {{HEAD_SHA}}, then spawn with sessions_spawn(mode: "run", + model: "hai-anthropic/anthropic--claude-4.6-opus", thinking: "high"). + + If no SPAWN line in output, reply NO_REPLY. + + See ~/.openclaw/workspace/skills/dev-loop/SKILL.md for full instructions. + (This repo's SKILL.md is deployed to that workspace path.) + model: hai-anthropic/anthropic--claude-4.5-haiku + toolsAllow: [exec, sessions_spawn, read] +``` + +## Tests + +```bash +# Unit tests (no real API calls): +bats ~/.openclaw/workspace/scripts/test/dispatch.bats + +# Invariant checks (static analysis): +bash ~/.openclaw/workspace/scripts/test/check-invariants.sh + +# Dry-run against real API: +DRY_RUN=1 bash ~/.openclaw/workspace/scripts/dev-loop-dispatch.sh review-bot +``` + +## Related Issues + +- **#144** — autonomous merge: eliminated by removing all merge API calls from dispatch +- **#145** — merged despite REQUEST_CHANGES: eliminated by checking REQUEST_CHANGES first, unconditionally +- **#148** — this redesign + +## Spec + +Full design spec: `docs/dev-loop-spec.md` diff --git a/docs/dev-loop-spec.md b/docs/dev-loop-spec.md new file mode 100644 index 0000000..226b511 --- /dev/null +++ b/docs/dev-loop-spec.md @@ -0,0 +1,278 @@ +# Dev-Loop Dispatch Spec + +**Version:** 1.0 +**Status:** Implemented +**Implements:** Issue #148 + +This document is the authoritative spec for the review-bot dev-loop dispatch architecture. +The dispatch script (`~/.openclaw/workspace/scripts/dev-loop-dispatch.sh`) and its tests +are validated against the rules and invariants in this document. + +--- + +## 1. Overview + +The dev-loop is a 15-minute cron that advances the state of open pull requests and picks up +new issues when there is nothing in review. It is designed for **zero human intervention** +in the normal flow and **hard stops at key safety boundaries**. + +### Architecture + +``` +Cron (15-min cadence) + → exec: bash dev-loop-dispatch.sh + → read stdout for SPAWN/HANDOFF lines + → if SPAWN: load worker template, spawn subagent + → if HANDOFF: log, do nothing else + → if neither: NO_REPLY +``` + +The cron model has **no ambient knowledge** of the project state. All state is derived +from the dispatch script's output, which in turn comes from live API calls. + +--- + +## 2. Inputs + +### Project Config + +```yaml +# memory/projects/.yaml +repo: rodin/review-bot # / +api_base: https://gitea.../v1 # API base URL +token_path: ~/.openclaw/... # path to bearer token +user: rodin # bot Gitea username +labels: + wip: + ready: +review_bots: # sentinel names in review bodies + - sonnet + - gpt + - security +``` + +### Script Arguments + +```bash +bash dev-loop-dispatch.sh # normal run +DRY_RUN=1 bash dev-loop-dispatch.sh # dry-run (no mutations) +``` + +--- + +## 3. State + +The dispatch script is **stateless per run**. All state lives in the Gitea API: + +| State | API location | +|-------|-------------| +| Open PRs | `GET /repos/:repo/pulls?state=open` | +| PR labels | `GET /repos/:repo/issues/:n/labels` | +| PR reviews | `GET /repos/:repo/pulls/:n/reviews` | +| CI status | `GET /repos/:repo/commits/:sha/status` | +| Issue comments | `GET /repos/:repo/issues/:n/comments` | +| Inline diff comments | `GET /repos/:repo/pulls/:n/comments` | +| Issue timeline | `GET /repos/:repo/issues/:n/timeline` | + +No file-based state. No cron-to-cron carry-over. + +--- + +## 4. Output Protocol + +The script emits structured lines to stdout. Stderr is diagnostic logging. + +### `SPAWN:::` + +A worker is needed. The cron model reads this and spawns a subagent using the +template at `worker-tasks/.md`. + +| Field | Description | +|-------|-------------| +| `type` | Worker type: `self-review`, `ci-fix`, `address-feedback`, `findings`, `rebase`, `impl` | +| `number` | PR number (or issue number for `impl`) | +| `sha` | HEAD SHA of the PR (empty for `impl`) | + +At most **one SPAWN** is emitted per script run. + +### `HANDOFF:` + +All checks passed for `pr_num`. The script applied the `ready` label and assigned +to the human reviewer. The cron model logs this and takes no further action. + +Multiple HANDOFFs may be emitted in one run (one per qualifying PR). + +--- + +## 5. Dispatch Rules + +Rules are evaluated **in order** for each open PR. The first matching condition wins. +Only one SPAWN is emitted per full pass. + +### Rule 0: WIP Cleanup + +For each open PR with a `wip` label: + +1. Find the timestamp when the label was most recently applied (via timeline events) +2. If age > 1hr: **remove the label** (stale lock — worker likely crashed) +3. If age ≤ 1hr: **set ACTIVE_WIP=1** (do not exit, only gates Rule 10) + +### Rule 2: REQUEST_CHANGES Blocks + +**ALWAYS evaluated before any other per-PR rule.** + +For each reviewer, take their **latest** review state. If any reviewer's latest +state is `REQUEST_CHANGES`: + +→ Acquire WIP label on this PR +→ Emit `SPAWN:findings::` +→ Continue to next PR (but only one SPAWN total) + +This rule cannot be bypassed by any condition. There is no waiver mechanism. + +### Rule 3: Merge Conflicts + +If `mergeable == false`: + +→ Acquire WIP +→ Emit `SPAWN:rebase::` + +### Rule 4: CI Failure + +If CI state is `failure` or `error`: + +- If a fix plan comment exists for this HEAD SHA: **skip** (worker in progress) +- Otherwise: + +→ Acquire WIP +→ Emit `SPAWN:ci-fix::` + +### Rule 5: Bot Reviews Missing + +For each configured `review_bot`, check whether a review body contains the +sentinel ``. + +If any sentinel is missing: **wait** (continue to next PR, no SPAWN). + +### Rule 6: CI Pending/Unknown + +If CI state is `pending` or `unknown`: **wait**. + +### Rule 7: Self-Review + +Check for a self-review comment from the bot user against the current HEAD SHA: +- Comment contains `Self-review against ` + +Sub-cases: +- **Missing**: No self-review comment → + → Acquire WIP, emit `SPAWN:self-review::` +- **Needs attention** (`Assessment: ⚠️`): Found, but has findings: + - Fix plan exists for HEAD SHA: skip + - No fix plan: → Acquire WIP, emit `SPAWN:sr-fix::` +- **Clean** (`Assessment: ✅ Clean`): Continue to Rule 8 + +### Rule 8: Unacknowledged Bot Review Findings + +For each **current** (contains `Evaluated against `) APPROVED bot review +that has a findings table: + +A finding is **unacknowledged** if it does not appear as `Finding #N` in a fix plan +comment from the bot user for this HEAD SHA. + +If any unacknowledged findings exist: +- Fix plan exists: skip +- No fix plan: → Acquire WIP, emit `SPAWN:address-feedback::` + +### Rule 9: Unresolved Inline Diff Comments + +An inline diff comment is **unresolved** if: +1. `in_reply_to_id` is null (top-level comment) +2. `resolver` is null (not formally resolved) +3. No other comment has `in_reply_to_id` pointing to this comment (no reply) + +If unresolved comments exist: +- Fix plan exists: skip +- No fix plan: → Acquire WIP, emit `SPAWN:address-feedback::` + +### Rule 10: Handoff + +All rules above passed. Verify all bot reviews are current (contain `Evaluated against `). + +If all current: +- Apply `ready` label +- Assign to `aweiker` +- Emit `HANDOFF:` +- Continue evaluating remaining PRs (do NOT exit) + +If already assigned to `aweiker`: skip (assume handoff was already performed; continue to next PR without emitting another HANDOFF). + +### Rule 11: New Issue Pickup + +Only runs if: no open PRs exist AND `ACTIVE_WIP == 0`. + +Fetch open, unassigned issues. Priority: bugs first, then by number ascending. + +Claim the issue (assign to bot user to prevent double-pick), then: +→ Emit `SPAWN:impl::` + +--- + +## 6. Safety Invariants + +These are statically checked by `~/.openclaw/workspace/scripts/test/check-invariants.sh` and enforced in all changes: + +| ID | Invariant | +|----|-----------| +| S1 | Zero merge API calls in dispatch script (`/merge` does not appear) | +| S2 | REQUEST_CHANGES check (Rule 2) appears before CI check (Rule 4) | +| S3 | REQUEST_CHANGES check (Rule 2) appears before ready label application (Rule 10) | +| S4 | No model/AI API references in dispatch script | +| S5 | `set -euo pipefail` present | +| S6 | Active WIP does not cause early exit (only sets ACTIVE_WIP flag) | +| S7 | SPAWN:impl guarded by `ACTIVE_WIP == 0` check | +| S8 | No merge calls in any worker template | + +--- + +## 7. Error Handling + +| Error | Behavior | +|-------|----------| +| `curl` returns error | `set -euo pipefail` aborts script — no partial actions | +| `jq` parse error | Script aborts | +| Worker crashes | WIP label left on PR; stale WIP cleanup (Rule 0) removes it after 1hr | +| Race: two crons fire | WIP mutex prevents double-dispatch for same PR | +| `sessions_spawn` fails | Worker not spawned; WIP label orphaned → cleaned in 1hr | +| Config file missing | Exit code 2 with error message | + +--- + +## 8. Worker Templates + +Each worker receives a precise task description with substituted values: + +| Template | Trigger | Key job | +|----------|---------|---------| +| `self-review.md` | No clean self-review | Post self-review comment, remove WIP | +| `sr-fix.md` | Self-review needs attention | Address self-review findings, push, remove WIP | +| `ci-fix.md` | CI failing | Diagnose, fix, push, remove WIP | +| `address-feedback.md` | Unacknowledged findings or inline comments | Address feedback, push, remove WIP | +| `findings.md` | REQUEST_CHANGES present | Address REQUEST_CHANGES, push, remove WIP | +| `rebase.md` | Merge conflicts | Rebase on main, push, remove WIP | +| `impl.md` | New issue | Implement feature/fix, open PR | + +Workers **always** remove the WIP label on completion and reply `NO_REPLY`. + +--- + +## 9. Fixes for Issues #144 and #145 + +**Issue #144** (autonomous merge): +The dispatch script contains no merge API calls anywhere. The `~/.openclaw/workspace/scripts/test/check-invariants.sh` +invariant `S1` verifies this. Workers do not receive merge instructions. + +**Issue #145** (merged despite REQUEST_CHANGES): +Rule 2 is the **first** rule evaluated per PR. It cannot be skipped, reasoned past, +or bypassed. It is checked before CI, before self-review, before handoff. The check +uses latest-per-reviewer state, so a reviewer who re-approved after REQUEST_CHANGES +is correctly handled.