review-bot/docs/dev-loop-spec.md
Rodin 6cefbb070e
fix(#157): add S9 invariant and never-close constraint to dev-loop spec
- Add S9 to §6 Safety Invariants: zero close-PR API calls in dispatch
- Document worker ABSOLUTE CONSTRAINTS in §8 Worker Templates
- Add §9 entry for Issue #157 explaining the fix

All worker templates already contain the NEVER-close constraint from
a prior session. This commit makes the spec authoritative.

Companion changes in rodin/workspace:
- check-invariants.sh: add S9 static check
- dispatch.bats: add Bug-157-regression test
2026-05-15 14:47:54 +00:00


Dev-Loop Dispatch Spec

Version: 1.0
Status: Implemented
Implements: Issue #148

This document is the authoritative spec for the review-bot dev-loop dispatch architecture. The dispatch script (~/.openclaw/workspace/scripts/dev-loop-dispatch.sh) and its tests are validated against the rules and invariants in this document.


1. Overview

The dev-loop is a cron job that runs every 15 minutes. It advances the state of open pull requests and picks up new issues when nothing is in review. It is designed to need no human intervention in the normal flow and to stop hard at key safety boundaries.

Architecture

Cron (15-min cadence)
  → exec: bash dev-loop-dispatch.sh <project>
  → read stdout for SPAWN/HANDOFF lines
  → if SPAWN: load worker template, spawn subagent
  → if HANDOFF: log, do nothing else
  → if neither: NO_REPLY

The cron model has no ambient knowledge of the project state. All state is derived from the dispatch script's output, which in turn comes from live API calls.
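The flow above can be sketched as a small filter over the dispatch script's stdout. This is an illustration only: the function name classify_dispatch_output and the action words it prints are hypothetical, and the real cron model loads templates and spawns subagents rather than printing.

```shell
# Hypothetical helper (not part of the dispatch script): classify each
# stdout line the way the architecture diagram describes.
# Reads dispatch output on stdin, prints one action per protocol line.
classify_dispatch_output() (
  found=0
  # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      SPAWN:*)   echo "spawn ${line#SPAWN:}"; found=1 ;;
      HANDOFF:*) echo "log-handoff ${line#HANDOFF:}"; found=1 ;;
    esac
  done
  # Neither SPAWN nor HANDOFF was seen: the cron model replies NO_REPLY.
  if [ "$found" -eq 0 ]; then echo "NO_REPLY"; fi
)

# Usage with canned output instead of a live dispatch run.
printf 'HANDOFF:41\nSPAWN:ci-fix:42:abc123\n' | classify_dispatch_output
```

Lines that match neither prefix are ignored, so stray output cannot trigger an action.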


2. Inputs

Project Config

# memory/projects/<project>.yaml
repo: rodin/review-bot         # <owner>/<repo>
api_base: https://gitea.../v1  # API base URL
token_path: ~/.openclaw/...    # path to bearer token
user: rodin                    # bot Gitea username
labels:
  wip: <id>
  ready: <id>
review_bots:                   # sentinel names in review bodies
  - sonnet
  - gpt
  - security

Script Arguments

bash dev-loop-dispatch.sh <project>   # normal run
DRY_RUN=1 bash dev-loop-dispatch.sh <project>   # dry-run (no mutations)
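
One way a dry-run mode like this could be honored is to route every mutating command through a single guard. The wrapper below is a minimal sketch under that assumption; the name mutate is hypothetical and the real script may implement DRY_RUN differently.

```shell
# Hypothetical guard: run a mutating command unless DRY_RUN=1.
# The notice goes to stderr because stdout is reserved for protocol lines.
mutate() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "DRY_RUN: would run: $*" >&2
    return 0
  fi
  "$@"
}

# Usage: in dry-run mode the command is reported, not executed.
( DRY_RUN=1; mutate echo "label applied" )
```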

3. State

The dispatch script is stateless per run. All state lives in the Gitea API:

| State | API location |
|---|---|
| Open PRs | GET /repos/:repo/pulls?state=open |
| PR labels | GET /repos/:repo/issues/:n/labels |
| PR reviews | GET /repos/:repo/pulls/:n/reviews |
| CI status | GET /repos/:repo/commits/:sha/status |
| Issue comments | GET /repos/:repo/issues/:n/comments |
| Inline diff comments | GET /repos/:repo/pulls/:n/comments |
| Issue timeline | GET /repos/:repo/issues/:n/timeline |

No file-based state. No cron-to-cron carry-over.


4. Output Protocol

The script emits structured lines to stdout. Stderr is diagnostic logging.

SPAWN:<type>:<number>:<sha>

A worker is needed. The cron model reads this and spawns a subagent using the template at worker-tasks/<type>.md.

| Field | Description |
|---|---|
| type | Worker type: self-review, sr-fix, ci-fix, address-feedback, findings, rebase, impl |
| number | PR number (or issue number for impl) |
| sha | HEAD SHA of the PR (empty for impl) |

At most one SPAWN is emitted per script run.
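
A consumer of this line could validate its fields before spawning anything. The parser below is a sketch, not the cron model's actual logic: the name parse_spawn and the key=value output format are hypothetical, and the list of accepted types is taken from the worker templates named in this spec.

```shell
# Hypothetical parser for "SPAWN:<type>:<number>:<sha>".
# Prints "type=… number=… sha=…" on success, returns 1 on malformed input.
parse_spawn() (
  IFS=: read -r tag type number sha <<EOF
$1
EOF
  [ "$tag" = "SPAWN" ] || exit 1
  # Accept only worker types that have a template in this spec.
  case "$type" in
    self-review|sr-fix|ci-fix|address-feedback|findings|rebase|impl) ;;
    *) exit 1 ;;
  esac
  # The number must be a non-empty run of digits.
  case "$number" in ''|*[!0-9]*) exit 1 ;; esac
  # The SHA, when present, must be lowercase hex.
  case "$sha" in *[!0-9a-f]*) exit 1 ;; esac
  # Only impl may omit the SHA.
  if [ -z "$sha" ] && [ "$type" != "impl" ]; then exit 1; fi
  printf 'type=%s number=%s sha=%s\n' "$type" "$number" "$sha"
)

# Usage
parse_spawn 'SPAWN:ci-fix:42:abc123'
```

Rejecting unknown types up front means a malformed line fails closed instead of selecting a nonexistent template.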

HANDOFF:<pr_num>

All checks passed for pr_num. The script applied the ready label and assigned the PR to the human reviewer. The cron model logs this and takes no further action.

Multiple HANDOFFs may be emitted in one run (one per qualifying PR).


5. Dispatch Rules

Rules are evaluated in order for each open PR. The first matching condition wins. Only one SPAWN is emitted per full pass.

Rule 0: WIP Cleanup

For each open PR with a wip label:

  1. Find the timestamp when the label was most recently applied (via timeline events)
  2. If age > 1hr: remove the label (stale lock — worker likely crashed)
  3. If age ≤ 1hr: set ACTIVE_WIP=1 (do not exit; the flag only gates Rule 11)
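
The age comparison in steps 2 and 3 can be sketched as follows. The helper assumes both timestamps are already available as epoch seconds; converting timeline event timestamps is left out, and the names wip_action and WIP_STALE_SECONDS are hypothetical.

```shell
# Hypothetical sketch of the Rule 0 age check, using epoch seconds.
WIP_STALE_SECONDS=3600   # the 1-hour threshold from the rule above

# wip_action APPLIED_AT NOW -> prints "remove-label" or "active"
wip_action() (
  applied_at=$1
  now=$2
  age=$(( now - applied_at ))
  if [ "$age" -gt "$WIP_STALE_SECONDS" ]; then
    echo "remove-label"   # stale lock: the worker likely crashed
  else
    echo "active"         # caller sets ACTIVE_WIP=1 and keeps going
  fi
)

# Usage: a label applied 4000 seconds ago is stale.
wip_action 1000 5000
```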

Rule 2: REQUEST_CHANGES Blocks

ALWAYS evaluated before any other per-PR rule.

For each reviewer, take their latest review state. If any reviewer's latest state is REQUEST_CHANGES:

→ Acquire WIP label on this PR
→ Emit SPAWN:findings:<pr_num>:<head_sha>
→ Continue to next PR (but only one SPAWN total)

This rule cannot be bypassed by any condition. There is no waiver mechanism.
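
The latest-per-reviewer check can be illustrated with a short awk filter. It assumes the reviews have already been flattened to one "reviewer state" pair per line, oldest first; that input format and the name has_blocking_review are hypothetical, and the sketch follows the rule literally by treating whatever state a reviewer submitted last as their latest.

```shell
# Hypothetical sketch: exit 0 if any reviewer's latest state is
# REQUEST_CHANGES, exit 1 otherwise.
# Input on stdin: "<reviewer> <state>" per line, oldest review first.
has_blocking_review() {
  awk '
    { latest[$1] = $2 }            # later lines overwrite earlier ones
    END {
      for (r in latest)
        if (latest[r] == "REQUEST_CHANGES") blocked = 1
      exit (blocked ? 0 : 1)
    }'
}

# Usage: sonnet re-approved, so nothing blocks.
if printf 'sonnet REQUEST_CHANGES\nsonnet APPROVED\ngpt APPROVED\n' | has_blocking_review; then
  echo "blocked"
else
  echo "not blocked"
fi
```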

Rule 3: Merge Conflicts

If mergeable == false:

→ Acquire WIP → Emit SPAWN:rebase:<pr_num>:<head_sha>

Rule 4: CI Failure

If CI state is failure or error:

  • If a fix plan comment exists for this HEAD SHA: skip (worker in progress)
  • Otherwise:

→ Acquire WIP → Emit SPAWN:ci-fix:<pr_num>:<head_sha>

Rule 5: Bot Reviews Missing

For each configured review_bot, check whether a review body contains the sentinel <!-- review-bot:<name> -->.

If any sentinel is missing: wait (continue to next PR, no SPAWN).
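
A fixed-string search is enough to test for the sentinels. The sketch below assumes the review bodies for one PR have been written to a file; the function name missing_sentinels and that file layout are hypothetical.

```shell
# Hypothetical sketch: print each configured bot whose sentinel is absent.
# Usage: missing_sentinels BODIES_FILE bot [bot...]
missing_sentinels() (
  bodies=$1
  shift
  for bot in "$@"; do
    # -F matches the sentinel literally; -q suppresses output.
    if ! grep -qF -- "<!-- review-bot:${bot} -->" "$bodies"; then
      echo "$bot"
    fi
  done
)

# Example call (empty output means all bots have reviewed):
#   missing_sentinels /tmp/review-bodies.txt sonnet gpt security
```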

Rule 6: CI Pending/Unknown

If CI state is pending or unknown: wait.

Rule 7: Self-Review

Check for a self-review comment from the bot user against the current HEAD SHA:

  • Comment contains Self-review against <head_sha>

Sub-cases:

  • Missing: No self-review comment → Acquire WIP, emit SPAWN:self-review:<pr_num>:<head_sha>
  • Needs attention (Assessment: ⚠️): Found, but has findings:
    • Fix plan exists for HEAD SHA: skip
    • No fix plan: → Acquire WIP, emit SPAWN:sr-fix:<pr_num>:<head_sha>
  • Clean (Assessment: ✅ Clean): Continue to Rule 8
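
The three sub-cases can be sketched as a classifier over the text of the bot's self-review comment. This is illustrative only: the name self_review_state is hypothetical, stdin is assumed to hold the relevant comment body, and a comment with no Clean assessment is conservatively treated as needing attention.

```shell
# Hypothetical sketch: classify the self-review state for a HEAD SHA.
# Usage: self_review_state HEAD_SHA < comment-body
# Prints "missing", "needs-attention", or "clean".
self_review_state() (
  head_sha=$1
  comment=$(cat)
  # No marker for this SHA means no self-review exists yet.
  case "$comment" in
    *"Self-review against ${head_sha}"*) ;;
    *) echo "missing"; exit 0 ;;
  esac
  case "$comment" in
    *"Assessment: ✅ Clean"*) echo "clean" ;;
    *) echo "needs-attention" ;;   # assumption: anything else has findings
  esac
)

# Usage
printf '%s\n' 'Self-review against abc123' 'Assessment: ✅ Clean' | self_review_state abc123
```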

Rule 8: Unacknowledged Bot Review Findings

Consider each APPROVED bot review that is current (its body contains Evaluated against <head_short>) and has a findings table.

A finding is unacknowledged if it does not appear as Finding #N in a fix plan comment from the bot user for this HEAD SHA.

If any unacknowledged findings exist:

  • Fix plan exists: skip
  • No fix plan: → Acquire WIP, emit SPAWN:address-feedback:<pr_num>:<head_sha>

Rule 9: Unresolved Inline Diff Comments

An inline diff comment is unresolved if:

  1. in_reply_to_id is null (top-level comment)
  2. resolver is null (not formally resolved)
  3. No other comment has in_reply_to_id pointing to this comment (no reply)

If unresolved comments exist:

  • Fix plan exists: skip
  • No fix plan: → Acquire WIP, emit SPAWN:address-feedback:<pr_num>:<head_sha>
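
The three unresolved conditions can be sketched over a flattened view of the inline comments. The tab-separated input (id, in_reply_to_id, resolver, with the literal string null for missing values) and the name unresolved_comments are assumptions made for illustration; the real script reads these fields from the API response.

```shell
# Hypothetical sketch: print the ids of unresolved inline comments.
# Input on stdin, tab-separated: <id> <in_reply_to_id> <resolver>
unresolved_comments() {
  awk -F '\t' '
    {
      ids[NR] = $1; parent[NR] = $2; resolver[NR] = $3
      if ($2 != "null") replied[$2] = 1     # some comment replies to $2
    }
    END {
      for (i = 1; i <= NR; i++)
        if (parent[i] == "null" && resolver[i] == "null" && !(ids[i] in replied))
          print ids[i]
    }'
}

# Usage: comment 1 has a reply, 4 is resolved, so only 2 is unresolved.
printf '1\tnull\tnull\n2\tnull\tnull\n3\t1\tnull\n4\tnull\taweiker\n' | unresolved_comments
```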

Rule 10: Handoff

All rules above passed. Verify all bot reviews are current (contain Evaluated against <head_short>).

If all current:

  • Apply ready label
  • Assign to aweiker
  • Emit HANDOFF:<pr_num>
  • Continue evaluating remaining PRs (do NOT exit)

If already assigned to aweiker: skip (assume handoff was already performed; continue to next PR without emitting another HANDOFF).

Rule 11: New Issue Pickup

Only runs if: no open PRs exist AND ACTIVE_WIP == 0.

Fetch open, unassigned issues. Priority: bugs first, then by number ascending.

Claim the issue (assign to bot user to prevent double-pick), then: → Emit SPAWN:impl:<issue_num>:
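
The priority order can be expressed as a two-key sort. The sketch assumes candidate issues have been reduced to "number is_bug" pairs, where is_bug is 1 for bugs and 0 otherwise; that format and the name pick_issue are hypothetical.

```shell
# Hypothetical sketch: choose the next issue, bugs first, then lowest number.
# Input on stdin: "<issue_number> <is_bug>" per line (is_bug is 1 or 0).
pick_issue() {
  # Key 2 descending puts bugs first; key 1 ascending breaks ties by number.
  sort -k2,2nr -k1,1n | awk 'NR == 1 { print $1 }'
}

# Usage: issues 30 and 15 are bugs, so 15 is picked.
printf '12 0\n9 0\n30 1\n15 1\n' | pick_issue
```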


6. Safety Invariants

These are statically checked by ~/.openclaw/workspace/scripts/test/check-invariants.sh and enforced in all changes:

| ID | Invariant |
|---|---|
| S1 | Zero merge API calls in dispatch script (/merge does not appear) |
| S2 | REQUEST_CHANGES check (Rule 2) appears before CI check (Rule 4) |
| S3 | REQUEST_CHANGES check (Rule 2) appears before ready label application (Rule 10) |
| S4 | No model/AI API references in dispatch script |
| S5 | set -euo pipefail present |
| S6 | Active WIP does not cause early exit (only sets ACTIVE_WIP flag) |
| S7 | SPAWN:impl guarded by ACTIVE_WIP == 0 check |
| S8 | No merge calls in any worker template |
| S9 | Zero close-PR API calls in dispatch script (state=closed does not appear) |
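
Static checks of this kind reduce to fixed-string searches over the script source. The sketch below covers S1, S5, and S9 only and is not the actual check-invariants.sh; the function name check_invariants and its messages are hypothetical.

```shell
# Hypothetical sketch of three static invariant checks (S1, S5, S9).
# Usage: check_invariants PATH_TO_DISPATCH_SCRIPT
# Prints one line per violation; exit status is 0 only if all checks pass.
check_invariants() (
  script=$1
  status=0
  if grep -qF -- '/merge' "$script"; then
    echo "S1 violated: merge call found"; status=1
  fi
  if ! grep -qF -- 'set -euo pipefail' "$script"; then
    echo "S5 violated: strict mode missing"; status=1
  fi
  if grep -qF -- 'state=closed' "$script"; then
    echo "S9 violated: close call found"; status=1
  fi
  exit "$status"
)

# Example call:
#   check_invariants ~/.openclaw/workspace/scripts/dev-loop-dispatch.sh
```

Because the checks match on text, they also flag a forbidden string that appears only in a comment, which errs on the side of caution.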

7. Error Handling

| Error | Behavior |
|---|---|
| curl returns error | set -euo pipefail aborts script; no partial actions |
| jq parse error | Script aborts |
| Worker crashes | WIP label left on PR; stale WIP cleanup (Rule 0) removes it after 1hr |
| Race: two crons fire | WIP mutex prevents double-dispatch for same PR |
| sessions_spawn fails | Worker not spawned; WIP label orphaned → cleaned in 1hr |
| Config file missing | Exit code 2 with error message |
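
The missing-config case can be sketched as an early guard. The helper assumes the config lives at memory/projects/<project>.yaml under some base directory; the name require_config, the base-directory argument, and the message wording are hypothetical.

```shell
# Hypothetical sketch: fail with exit code 2 when the project config is absent.
# Usage: require_config PROJECT [BASE_DIR]
# Prints the config path on success.
require_config() (
  project=$1
  base_dir=${2:-.}
  config="$base_dir/memory/projects/$project.yaml"
  if [ ! -f "$config" ]; then
    echo "error: config file not found: $config" >&2
    exit 2
  fi
  echo "$config"
)

# Example call:
#   config_path="$(require_config review-bot "$HOME/.openclaw/workspace")" || exit $?
```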

8. Worker Templates

Each worker receives a precise task description with substituted values:

| Template | Trigger | Key job |
|---|---|---|
| self-review.md | No clean self-review | Post self-review comment, remove WIP |
| sr-fix.md | Self-review needs attention | Address self-review findings, push, remove WIP |
| ci-fix.md | CI failing | Diagnose, fix, push, remove WIP |
| address-feedback.md | Unacknowledged findings or inline comments | Address feedback, push, remove WIP |
| findings.md | REQUEST_CHANGES present | Address REQUEST_CHANGES, push, remove WIP |
| rebase.md | Merge conflicts | Rebase on main, push, remove WIP |
| impl.md | New issue | Implement feature/fix, open PR |

Workers always remove the WIP label on completion and reply NO_REPLY.

Worker Absolute Constraints

Every worker template begins with an ⛔ ABSOLUTE CONSTRAINTS section containing these rules:

  • NEVER close a PR. Never call PATCH /pulls/{id} with state=closed. Closing a PR requires human action. "Duplicate", "superseded", or "already done" are never a worker's call.
  • NEVER merge a PR. Never call the merge API. Merging requires human approval.
  • NEVER use the gitea-aweiker token. All API calls use the gitea-rodin token only.
  • NEVER act on a PR with active REQUEST_CHANGES. Fix the findings first.

The merge and close constraints are statically checked by check-invariants.sh: S1 and S9 cover the dispatch script, and S8 covers merge calls in the worker templates. The remaining constraints are enforced by the template text itself.


9. Fixes for Issues #144, #145, and #157

Issue #144 (autonomous merge): The dispatch script contains no merge API calls anywhere. The ~/.openclaw/workspace/scripts/test/check-invariants.sh invariant S1 verifies this. Workers do not receive merge instructions.

Issue #145 (merged despite REQUEST_CHANGES): Rule 2 is the first rule evaluated per PR. It cannot be skipped, reasoned past, or bypassed. It is checked before CI, before self-review, and before handoff. The check uses latest-per-reviewer state, so a reviewer who re-approved after REQUEST_CHANGES no longer blocks the PR.

Issue #157 (autonomous PR close): Worker templates were missing an explicit constraint against closing PRs. The dispatch script never had a close call, but workers could reason their way into calling PATCH /pulls/{id} with state=closed. All worker templates now include NEVER close a PR in their ABSOLUTE CONSTRAINTS section. Invariant S9 verifies the dispatch script contains no close calls. The regression test in dispatch.bats verifies the same statically.