docs(#148): add SKILL.md and dev-loop-spec.md for dispatch redesign #149
Reference in New Issue
Block a user
Delete Branch "issue-148"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Adds documentation for the dispatch redesign implemented in issue #148.
Changes
SKILL.md— New file. Documents the dispatch architecture, safety invariants, dispatch rules, file locations, cron config, and test commands for the review-bot dev-loop.docs/dev-loop-spec.md— New file. Authoritative spec for the dispatch logic, referenced by the dispatch script. Defines all 11 rules, output protocol, error handling, safety invariants (S1-S8), and worker templates.Background
The dispatch script (
~/.openclaw/workspace/scripts/dev-loop-dispatch.sh) was redesigned as pure bash with no model reasoning to fix:The script lives in workspace, not in this repo, so it can be updated without a PR cycle. These docs live here so the spec is version-controlled alongside the code it governs.
Also fixed (in workspace)
resolverfield)Tests
All 36 bats tests pass and 8 static invariants verified:
Closes #148
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
76931dfe)Sonnet Review
Summary
This PR adds two documentation-only files (SKILL.md and docs/dev-loop-spec.md) describing the dispatch redesign architecture. The content is well-structured, internally consistent, and CI has passed. There are no code changes to evaluate against Go patterns.
Findings
SKILL.mddocs/dev-loop-spec.mdscripts/check-deps.shas the tool that verifies invariant S1, but the script is described astest/check-invariants.shin Section 6 and throughout SKILL.md. This appears to be a stale reference — should betest/check-invariants.sh.Recommendation
APPROVE — Approve. Both findings are minor documentation inconsistencies that don't affect correctness or safety. The rule numbering mismatch (Rule 10 vs Rule 11 in SKILL.md) is cosmetically confusing but the spec in docs/dev-loop-spec.md is internally correct. The check-deps.sh vs check-invariants.sh reference mismatch is a copy-paste error worth fixing in a follow-up. Neither warrants blocking the PR.
Review by sonnet
Evaluated against
76931dfe@@ -0,0 +48,4 @@| 4b | CI failure, fix plan exists | Skip (worker in progress) || 5 | Bot review missing | Wait || 6 | CI pending/unknown | Wait || 7 | No clean self-review, no fix plan | SPAWN:self-review |[NIT] The dispatch rules table has two rows labeled '10' — one for 'All checks pass → HANDOFF' and one for 'No open PRs + no ACTIVE_WIP → SPAWN:impl (next issue)'. The second should be labeled '11' to match the spec in docs/dev-loop-spec.md which correctly uses Rule 11 for new issue pickup.
@@ -0,0 +255,4 @@|----------|---------|---------|| `self-review.md` | No clean self-review | Post self-review comment, remove WIP || `ci-fix.md` | CI failing | Diagnose, fix, push, remove WIP || `address-feedback.md` | Unacknowledged findings or inline comments | Address feedback, push, remove WIP |[NIT] Section 9 references
scripts/check-deps.shas the tool that verifies invariant S1, but the script is described astest/check-invariants.shin Section 6 and throughout SKILL.md. This appears to be a stale reference — should betest/check-invariants.sh.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
76931dfe)Security Review
Summary
Documentation-only changes adding SKILL.md and docs/dev-loop-spec.md. No executable code or secrets were introduced, and CI passed.
Recommendation
APPROVE — These additions document safety-focused dispatch behavior and do not alter code paths. No security-relevant risks were introduced. Proceed with merge.
Review by security
Evaluated against
76931dfeOriginal reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
76931dfe)Gpt Review
Summary
Documentation-only PR adds a clear, detailed spec and skill guide for the dispatch redesign. CI passed and the content is coherent and consistent with the stated safety invariants and rules. A few minor documentation inconsistencies should be corrected for clarity.
Findings
SKILL.mddocs/dev-loop-spec.mdSKILL.mdRecommendation
APPROVE — Since this PR only introduces documentation and CI passed, it is safe to merge. Please address the minor documentation inconsistencies: (1) fix the duplicate rule numbering in SKILL.md so that the new issue pickup is labeled as rule 11 (matching docs/dev-loop-spec.md), (2) correct the script reference in docs/dev-loop-spec.md to point to the invariant checker (check-invariants.sh) rather than check-deps.sh, and (3) optionally align the toolsAllow lists between the Architecture overview and the Cron Config in SKILL.md for consistency.
Review by gpt
Evaluated against
76931dfe@@ -0,0 +8,4 @@Dispatch is a **pure shell script** — no model reasoning.```Cron (agentTurn, toolsAllow: [exec, sessions_spawn])[NIT] Architecture snippet shows toolsAllow: [exec, sessions_spawn] while the Cron Config later includes [exec, sessions_spawn, read]. Consider aligning these for consistency.
@@ -0,0 +53,4 @@| 8 | Unacknowledged bot review findings | SPAWN:address-feedback || 9 | Unresolved inline diff comments | SPAWN:address-feedback || 10 | All checks pass | HANDOFF || 10 | No open PRs + no ACTIVE_WIP | SPAWN:impl (next issue) |[MINOR] Dispatch rules table duplicates rule number 10 for two different actions; the second should be rule 11 (new issue pickup) to match the spec in docs/dev-loop-spec.md.
@@ -0,0 +257,4 @@| `ci-fix.md` | CI failing | Diagnose, fix, push, remove WIP || `address-feedback.md` | Unacknowledged findings or inline comments | Address feedback, push, remove WIP || `findings.md` | REQUEST_CHANGES present | Address REQUEST_CHANGES, push, remove WIP || `rebase.md` | Merge conflicts | Rebase on main, push, remove WIP |[MINOR] Reference to scripts/check-deps.sh verifying invariant S1 appears incorrect/inconsistent; earlier sections and SKILL.md refer to ~/.openclaw/workspace/scripts/test/check-invariants.sh for invariant checks. Update to the correct script path/name.
Self-review against
151199e436Assessment: ✅ Clean
Self-Review: issue-148 — 2026-05-15
Verdict: PASS
No code changes — documentation only. Reviewed both files (SKILL.md and docs/dev-loop-spec.md) for:
Ready for human review.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
151199e4)Sonnet Review
Summary
This PR adds two documentation files (SKILL.md and docs/dev-loop-spec.md) describing the dispatch architecture for a dev-loop automation system. CI has passed. The changes are documentation-only with no Go code, so language pattern rules don't apply. The docs are well-structured, internally consistent, and clearly address the stated issues (#144, #145).
Findings
SKILL.mddocs/dev-loop-spec.mdaweiker: skip.' — it's unclear whether this means skip the HANDOFF emission only, or skip applying the ready label too. The spec would benefit from a clarifying sentence, e.g. 'If the PR is already assigned toaweiker, assume handoff was already performed and continue to the next PR without emitting another HANDOFF.'SKILL.mdRecommendation
APPROVE — Approve. This is a documentation-only PR with passing CI. The docs are thorough, well-organized, and accurately capture the safety invariants and dispatch logic. The three nits are minor clarity issues that don't affect correctness — the most notable is the slight contradiction between SKILL.md's architecture diagram saying the script 'exits after one SPAWN action per pass' and the actual behavior where HANDOFF does not cause an exit. Consider clarifying this in a follow-up if the docs are used as the authoritative reference.
Review by sonnet
Evaluated against
151199e4@@ -0,0 +10,4 @@```Cron (agentTurn, toolsAllow: [exec, sessions_spawn])→ runs dispatch script→ reads output for SPAWN or HANDOFF lines[NIT] SKILL.md states the dispatch script 'exits after one SPAWN action per pass', but docs/dev-loop-spec.md (Rule 10) clarifies that multiple HANDOFFs can be emitted in one pass and the script does NOT exit after HANDOFF. The architecture diagram in SKILL.md is slightly misleading — it implies the script always exits after one action, when in reality it continues processing PRs after HANDOFF emissions.
@@ -0,0 +36,4 @@5. **set -euo pipefail** — any curl failure aborts immediately, no partial actions6. **Workers reply NO_REPLY** — no announcements, no handoffs, no side effects## Dispatch Rules (in order)[NIT] The dispatch rules table omits Rule 1 (jumps from Rule 0/0b directly to Rule 2). If Rule 1 was intentionally removed or never existed, a brief note would prevent reader confusion about the numbering gap.
@@ -0,0 +151,4 @@For each configured `review_bot`, check whether a review body contains thesentinel `<!-- review-bot:<name> -->`.[NIT] Rule 10 states 'If already assigned to
aweiker: skip.' — it's unclear whether this means skip the HANDOFF emission only, or skip applying the ready label too. The spec would benefit from a clarifying sentence, e.g. 'If the PR is already assigned toaweiker, assume handoff was already performed and continue to the next PR without emitting another HANDOFF.'Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
151199e4)Security Review
Summary
This PR adds documentation only (SKILL.md and docs/dev-loop-spec.md) describing the dispatch architecture and safety rules. There are no code changes or secrets exposed, and CI has passed.
Recommendation
APPROVE — Approve. The documentation clearly outlines safety invariants (no merge calls, REQUEST_CHANGES blocking, WIP mutex, set -euo pipefail) and controlled worker spawning. No security-relevant vulnerabilities are introduced by these docs, and CI passed. Ensure the actual dispatch script in the workspace continues to adhere to the documented invariants, especially token handling and rule ordering, but no changes are needed in this repository.
Review by security
Evaluated against
151199e4Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
151199e4)Gpt Review
Summary
Well-structured documentation addition for the dispatch redesign. The spec and skill guide are clear and comprehensive. A few minor inconsistencies and clarity nits can be addressed in follow-ups.
Findings
SKILL.mdSKILL.mdSKILL.mddocs/dev-loop-spec.mdtest/check-invariants.sh, while later (line 271) it referencesscripts/test/check-invariants.sh. Standardize these references (e.g.,~/.openclaw/workspace/scripts/test/check-invariants.sh) for consistency.docs/dev-loop-spec.mdRecommendation
APPROVE — The documentation is thorough and aligns with the stated goals of the dispatch redesign, with CI passing. Approve as-is. For polish, consider a quick follow-up to reconcile the SKILL.md path reference, align the listed tool permissions between the architecture snippet and Cron config, clarify the "no side effects" phrasing for workers, and standardize the invariants checker path in the spec.
Review by gpt
Evaluated against
151199e4@@ -0,0 +8,4 @@Dispatch is a **pure shell script** — no model reasoning.```Cron (agentTurn, toolsAllow: [exec, sessions_spawn])[NIT] Architecture snippet lists toolsAllow as [exec, sessions_spawn], while the Cron config farther below includes 'read' as well. Consider aligning the lists or adding a note that 'read' is also required.
@@ -0,0 +34,4 @@3. **WIP mutex** — one active worker per repo; WIP label gates new issue pickup4. **One SPAWN per run** — script emits at most one SPAWN line per execution5. **set -euo pipefail** — any curl failure aborts immediately, no partial actions6. **Workers reply NO_REPLY** — no announcements, no handoffs, no side effects[MINOR] Safety invariant says "Workers reply NO_REPLY — no announcements, no handoffs, no side effects". However, worker templates include pushing code and removing WIP labels, which are side effects. Suggest rephrasing to "no dispatch-level side effects (no announcements/handoffs); workers may push changes and manage labels as part of their task".
@@ -0,0 +98,4 @@If no SPAWN line in output, reply NO_REPLY.See ~/.openclaw/workspace/skills/dev-loop/SKILL.md for full instructions.[MINOR] Cron config references a workspace path to SKILL.md ("~/.openclaw/workspace/skills/dev-loop/SKILL.md") but this PR adds SKILL.md at the repo root. Clarify how the repo doc is synchronized into the workspace path or update the reference to avoid confusion.
@@ -0,0 +219,4 @@## 6. Safety InvariantsThese are statically checked by `test/check-invariants.sh` and enforced in all changes:[MINOR] Path to invariants checker is inconsistent across the doc. Here it references
test/check-invariants.sh, while later (line 271) it referencesscripts/test/check-invariants.sh. Standardize these references (e.g.,~/.openclaw/workspace/scripts/test/check-invariants.sh) for consistency.@@ -0,0 +268,4 @@## 9. Fixes for Issues #144 and #145**Issue #144** (autonomous merge):The dispatch script contains no merge API calls anywhere. The `scripts/test/check-invariants.sh`[NIT] Invariants checker path differs from earlier mention (line 222). Unify the path formatting to avoid ambiguity for readers.
Fix Plan against
151199e436:Current (non-stale) review findings to address:
Sonnet review (3 NITs):
may emit multiple HANDOFFs in one pass) but the architecture code block is misleading. Fix: rephrase architecture diagram to clarify SPAWN exits, HANDOFF does not.GPT review (3 MINORs + 2 NITs):
read.test/check-invariants.sh(relative to workspace scripts), line 271 saysscripts/test/check-invariants.sh. Fix: standardize to the full path~/.openclaw/workspace/scripts/test/check-invariants.shin both locations.Security Review
Summary
This PR adds documentation only (SKILL.md and docs/dev-loop-spec.md) describing a dispatch architecture and safety invariants. No executable code changes are introduced, and CI has passed.
Recommendation
APPROVE — Approve as submitted. The changes are documentation-only with no security-impacting code modifications or exposed secrets. The docs emphasize strong safety invariants and do not include sensitive credentials. No further action is required.
Review by security
Evaluated against
d573c149Sonnet Review
Summary
This PR adds two new documentation files (SKILL.md and docs/dev-loop-spec.md) with no Go code changes. CI passed. The documentation is well-structured, internally consistent, and accurately describes the dispatch architecture, safety invariants, and rules.
Findings
SKILL.mdSKILL.mddocs/dev-loop-spec.mdaweiker: skip (assume handoff was already performed; continue to next PR without emitting another HANDOFF).' This hardcodes the reviewer username. Consider referencing a config field (e.g.,reviewerin the project config) rather than a literal name, both for documentation accuracy and maintainability.Recommendation
APPROVE — Approve. This is a documentation-only PR with no Go code changes. CI passed. The two documents are consistent with each other, clearly describe the architecture and safety invariants, and serve a useful purpose for version-controlling the dispatch spec alongside the codebase. The minor nits above are cosmetic and do not block merging.
Review by sonnet
Evaluated against
d573c149@@ -0,0 +15,4 @@Dispatch script (~/.openclaw/workspace/scripts/dev-loop-dispatch.sh)→ pure bash, all decisions are curl API calls + branches→ exits after emitting one SPAWN line (at most one worker per run)[NIT] The architecture diagram in SKILL.md says the dispatch script 'exits after emitting one SPAWN line', but the spec (docs/dev-loop-spec.md §4) says HANDOFF lines continue without exiting, and the diagram itself also says 'emits HANDOFF for each qualifying PR (does not exit after HANDOFF)'. The 'exits' phrasing is slightly misleading — it only exits early for SPAWN, not HANDOFF. This is not wrong but could be clarified to 'exits after emitting one SPAWN line (but continues for HANDOFF)' or similar.
@@ -0,0 +37,4 @@6. **Workers reply NO_REPLY** — no dispatch-level side effects (workers may push changes and manage labels as part of their task)## Dispatch Rules (in order)[NIT] The dispatch rules table in SKILL.md says Rule 10 condition is 'All checks pass' but the corresponding rule in docs/dev-loop-spec.md §5 (Rule 10) also requires verifying all bot reviews are current. The SKILL.md table is a summary so this simplification is acceptable, but a note like 'All checks pass + bot reviews current' would be more precise.
@@ -0,0 +195,4 @@- No fix plan: → Acquire WIP, emit `SPAWN:address-feedback:<pr_num>:<head_sha>`### Rule 10: Handoff[NIT] Rule 10 says 'If already assigned to
aweiker: skip (assume handoff was already performed; continue to next PR without emitting another HANDOFF).' This hardcodes the reviewer username. Consider referencing a config field (e.g.,reviewerin the project config) rather than a literal name, both for documentation accuracy and maintainability.Feedback Addressed
All findings from reviews #3969 (Sonnet) and #3974 (GPT) have been addressed in commit
151199e:Review #3969 (Sonnet) — 3 NITs
NIT SKILL.md:13 — Architecture diagram clarified:
→ exits after emitting one SPAWN line (at most one worker per run)+→ emits HANDOFF for each qualifying PR (does not exit after HANDOFF). The previous ambiguous phrasing that implied HANDOFF causes exit is gone.NIT docs/dev-loop-spec.md:154 — Rule 10 skip clause clarified: "If already assigned to
aweiker: skip (assume handoff was already performed; continue to next PR without emitting another HANDOFF)."NIT SKILL.md:39 — Rule 1 numbering gap documented with a placeholder row:
| _(1)_ | _(reserved — intentionally unused)_ | — |Review #3974 (GPT) — 3 MINORs, 2 NITs
MINOR SKILL.md:101 — Cron config clarified with note: "(This repo's SKILL.md is deployed to that workspace path.)" to explain the sync relationship.
NIT SKILL.md:11 —
toolsAllowaligned: architecture block and Cron config both now show[exec, sessions_spawn, read].MINOR SKILL.md:37 — Safety Invariant #6 rephrased: "Workers reply NO_REPLY — no dispatch-level side effects (workers may push changes and manage labels as part of their task)".
MINOR + NIT docs/dev-loop-spec.md:222+271 —
check-invariants.shpath standardized to~/.openclaw/workspace/scripts/test/check-invariants.shin both Section 6 and Section 9.Gpt Review
Summary
The PR adds thorough documentation for the dispatch redesign and testing approach. However, there is a fundamental inconsistency between SKILL.md and the authoritative dev-loop-spec regarding whether the script exits after emitting a SPAWN, which affects whether HANDOFFs can be emitted in the same run. A few smaller clarity/config consistency issues should also be addressed.
Findings
SKILL.mddocs/dev-loop-spec.mddocs/dev-loop-spec.mdaweiker). Consider making the assignee configurable (e.g., add to project config) or documenting that this value is a configuration parameter, to avoid drift and improve portability.SKILL.mdtoken_path, while docs/dev-loop-spec.md includes it as an input (line 41). Addtoken_pathto SKILL.md’s key fields or clarify where authentication is sourced to keep the two docs consistent.docs/dev-loop-spec.mddocs/dev-loop-spec.mddocs/dev-loop-spec.mddocs/dev-loop-spec.mdGET /repos/:repo/issues/:n/timeline) and specific PR comments endpoints. Verify these paths match the target Gitea version and document any version constraints or deviations to prevent implementation surprises.Recommendation
REQUEST_CHANGES — The documentation is comprehensive and aligns well with the redesign’s safety goals, but it contains a critical inconsistency about control flow after emitting a SPAWN. Please reconcile SKILL.md and docs/dev-loop-spec.md so they both clearly define the same behavior: either exit immediately after the first SPAWN (and accept that no HANDOFFs will be emitted that run), or continue scanning after emitting one SPAWN to allow HANDOFFs (but ensure only one SPAWN is ever emitted). Once aligned, also address minor clarity and consistency improvements: make the
aweikerhandoff assignee configurable or explicitly documented as such, addtoken_pathto SKILL.md’s key fields to match the spec, clarify that Rule 2 is first among per-PR evaluation rules after WIP cleanup, definehead_short, and formalize the fix plan comment markers. Finally, confirm the listed Gitea API endpoints match the actual instance/version. With these changes, the docs will be internally consistent and easier to implement against.Review by gpt
Evaluated against
d573c149@@ -0,0 +15,4 @@Dispatch script (~/.openclaw/workspace/scripts/dev-loop-dispatch.sh)→ pure bash, all decisions are curl API calls + branches→ exits after emitting one SPAWN line (at most one worker per run)[MAJOR] SKILL.md states the dispatch script exits after emitting one SPAWN line, which conflicts with docs/dev-loop-spec.md that indicates the script continues evaluating remaining PRs and can emit HANDOFFs even after a SPAWN. Clarify whether the script should exit after the first SPAWN or continue scanning to emit HANDOFFs, and make both documents consistent.
@@ -0,0 +93,4 @@Run: bash ~/.openclaw/workspace/scripts/dev-loop-dispatch.sh review-botRead the output. If it contains a SPAWN line, load the matching template from~/.openclaw/workspace/scripts/worker-tasks/<type>.md, substitute {{PROJECT}},[MINOR] Project Config section omits
token_path, while docs/dev-loop-spec.md includes it as an input (line 41). Addtoken_pathto SKILL.md’s key fields or clarify where authentication is sourced to keep the two docs consistent.@@ -0,0 +64,4 @@The dispatch script is **stateless per run**. All state lives in the Gitea API:| State | API location |[NIT] API endpoints reference a timeline endpoint (
GET /repos/:repo/issues/:n/timeline) and specific PR comments endpoints. Verify these paths match the target Gitea version and document any version constraints or deviations to prevent implementation surprises.@@ -0,0 +115,4 @@1. Find the timestamp when the label was most recently applied (via timeline events)2. If age > 1hr: **remove the label** (stale lock — worker likely crashed)3. If age ≤ 1hr: **set ACTIVE_WIP=1** (do not exit, only gates Rule 10)[MINOR] The spec says Rule 2 is "ALWAYS evaluated before any other per-PR rule", but Rule 0 (WIP cleanup) is described first and also runs per PR. Consider clarifying that Rule 0 is a preliminary cleanup step and Rule 2 is first among the gating per-PR evaluation rules, to avoid confusion.
@@ -0,0 +170,4 @@- Fix plan exists for HEAD SHA: skip- No fix plan: → Acquire WIP, emit `SPAWN:sr-fix:<pr_num>:<head_sha>`- **Clean** (`Assessment: ✅ Clean`): Continue to Rule 8[NIT] Fix plan recognition relies on a comment format containing "Finding #N" and matching the HEAD SHA. Consider explicitly documenting the required fix plan comment template and markers to ensure reliable detection.
@@ -0,0 +190,4 @@2. `resolver` is null (not formally resolved)3. No other comment has `in_reply_to_id` pointing to this comment (no reply)If unresolved comments exist:[NIT] The phrase "Evaluated against <head_short>" is used without defining how HEAD shortening is computed. Briefly define what "head_short" means (e.g., first 7 chars of the SHA) for unambiguous implementation.
@@ -0,0 +194,4 @@- Fix plan exists: skip- No fix plan: → Acquire WIP, emit `SPAWN:address-feedback:<pr_num>:<head_sha>`### Rule 10: Handoff[MINOR] Handoff assigns to a hard-coded user (
aweiker). Consider making the assignee configurable (e.g., add to project config) or documenting that this value is a configuration parameter, to avoid drift and improve portability.@@ -0,0 +196,4 @@### Rule 10: HandoffAll rules above passed. Verify all bot reviews are current (contain `Evaluated against <head_short>`).[MAJOR] The spec explicitly says to "Continue evaluating remaining PRs (do NOT exit)" during Rule 10 (Handoff), which contradicts SKILL.md’s "exits after emitting one SPAWN line". Resolve this contradiction by aligning both documents on the intended behavior.