# Multi-Model Review Orchestration

When Rodin is asked to review a PR (e.g., "review PR 630", "look at PR #625"), use this
orchestration pattern instead of a single-pass review.

## Source of Truth

Specialized prompt files live at: `~/.openclaw/workspace/review-prompts/`
- `sonnet.md`: structural/pattern review (Sonnet's mandate)
- `gpt5.md`: semantic/domain/concurrency review (GPT-5's mandate)
- `opus.md`: design coherence/contradiction review (Opus's mandate)

CI uses these same files (review-bot via `system-prompt-file`), so a prompt updated in one place
improves both the CI and the on-demand review paths.

## Decision: How Many Models?

| PR touches... | Models to run |
|---------------|---------------|
| Tests only, config, deps | Sonnet only (structural) |
| Application code (non-core) | Sonnet + GPT-5 |
| Core domain (order_management, ledger, risk, decision_engine) | Sonnet + GPT-5 + Opus |
| Architecture docs or design docs | GPT-5 + Opus (skip Sonnet) |
| Kill switch, reconciliation, or financial calculations | ALL THREE + narrow deep-pass |

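The decision table can be read as a small routing function. The following is a minimal Python sketch, not the actual implementation: the `Category` names and the `select_models` helper are hypothetical, and classifying a PR's files into categories is assumed to happen elsewhere. The table does not say how to combine rows when a PR touches several categories, so the sketch assumes the union of the matching rows.

```python
from enum import Enum


class Category(Enum):
    """Hypothetical labels, one per row of the decision table."""
    TESTS_CONFIG_DEPS = "tests_config_deps"
    APP_CODE = "app_code"
    CORE_DOMAIN = "core_domain"
    DESIGN_DOCS = "design_docs"
    FINANCIAL_SAFETY = "financial_safety"


# Models to run for each row of the decision table.
MODELS_BY_CATEGORY = {
    Category.TESTS_CONFIG_DEPS: {"sonnet"},
    Category.APP_CODE: {"sonnet", "gpt5"},
    Category.CORE_DOMAIN: {"sonnet", "gpt5", "opus"},
    Category.DESIGN_DOCS: {"gpt5", "opus"},
    Category.FINANCIAL_SAFETY: {"sonnet", "gpt5", "opus"},
}


def select_models(categories):
    """Return (sorted model names, deep_pass flag) for the categories a PR touches."""
    models = set()
    for category in categories:
        # Assumption: rows combine by union when a PR spans several categories.
        models |= MODELS_BY_CATEGORY[category]
    deep_pass = Category.FINANCIAL_SAFETY in categories
    return sorted(models), deep_pass
```

With this sketch, a docs-only PR yields GPT-5 and Opus without a deep pass, while any PR touching the last row yields all three models with the deep-pass flag set.
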
## Orchestration Steps

### 1. Gather Context (do this yourself, don't delegate)
- Fetch PR metadata, diff, existing reviews (same as Phase 0-1 of pr-review skill)
- Identify what files are touched → determines which models to spawn
- Fetch linked issue/AC if present

### 2. Spawn Specialized Sub-Agents

Spawn sub-agents in parallel. Each gets:
- The full diff
- The relevant prompt file content (read from review-prompts/)
- Conventions file (CLAUDE.md)
- Patterns (from elixir-patterns/phoenix-conventions repos if applicable)
- Instruction: "Output structured findings as JSON. Do not post to Gitea."

```
# Spawn only the models selected by the decision table above.
sessions_spawn(model="sonnet", task="<sonnet prompt + diff + context>")
sessions_spawn(model="gpt5", task="<gpt5 prompt + diff + context>")
sessions_spawn(model="opus", task="<opus prompt + diff + context>")  # core-domain, design, or financial/safety PRs
```

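The parallel fan-out can be sketched in Python as below. This is an illustration under stated assumptions, not how `sessions_spawn` actually works: `spawn(model, task)` is a hypothetical stand-in that is assumed to block until the sub-agent finishes and to return its JSON findings as a string, and `build_task` and `run_reviews` are illustrative helper names.

```python
import json
from concurrent.futures import ThreadPoolExecutor

NO_POST = "Output structured findings as JSON. Do not post to Gitea."


def build_task(prompt_text, diff, conventions):
    """Assemble the task text for one sub-agent: prompt, conventions, diff, instruction."""
    return "\n\n".join([prompt_text, conventions, diff, NO_POST])


def run_reviews(models, prompts, diff, conventions, spawn):
    """Run one sub-agent per model in parallel and return parsed findings per model.

    `spawn(model, task)` is a hypothetical stand-in for sessions_spawn, assumed
    to return the sub-agent's JSON output as a string. `models` must be non-empty.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {
            model: pool.submit(spawn, model, build_task(prompts[model], diff, conventions))
            for model in models
        }
        # Collect results only after every sub-agent has been submitted.
        return {model: json.loads(future.result()) for model, future in futures.items()}
```

Passing `spawn` in as a parameter keeps the sketch independent of the real tool and makes it easy to exercise with a fake.
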
### 3. Synthesize Results

After all sub-agents complete:

1. **Deduplicate:** if Sonnet and GPT-5 found the same issue, keep GPT-5's version (deeper
   explanation) and note "(also caught by Sonnet)"
2. **Rank by severity:** BLOCKER > MAJOR > MINOR > NIT
3. **Group by category:**
   - 🏗️ Structural (from Sonnet)
   - 🧠 Semantic/Domain (from GPT-5)
   - ⚖️ Design Coherence (from Opus)
4. **Call out unique contributions:** "Only GPT-5 caught: ..." / "Only Opus caught: ..."
5. **Actionable fix list:** Real bugs → must fix. Theoretical → discuss. Style → fix if cheap.

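The deduplicate-and-rank steps can be sketched as follows. The finding schema (`model`, `severity`, `file`, `line`, `summary`) is hypothetical, since the prompt files define the real JSON shape, and treating two findings as the same issue when they share a file and line is a simplifying assumption.

```python
SEVERITY_ORDER = {"BLOCKER": 0, "MAJOR": 1, "MINOR": 2, "NIT": 3}


def synthesize(findings):
    """Deduplicate findings and rank them by severity.

    Each finding is a dict with hypothetical keys: model, severity, file,
    line, summary. Findings sharing (file, line) are treated as duplicates.
    """
    merged = {}
    for finding in findings:
        key = (finding["file"], finding["line"])
        kept = merged.get(key)
        if kept is None:
            merged[key] = {**finding, "also_caught_by": []}
        elif finding["model"] == "gpt5" and kept["model"] != "gpt5":
            # Keep GPT-5's version and credit whoever reported it earlier.
            others = kept["also_caught_by"] + [kept["model"]]
            merged[key] = {**finding, "also_caught_by": others}
        elif finding["model"] != kept["model"] and finding["model"] not in kept["also_caught_by"]:
            kept["also_caught_by"].append(finding["model"])
    # sorted() is stable, so findings of equal severity keep their arrival order.
    return sorted(merged.values(), key=lambda f: SEVERITY_ORDER[f["severity"]])
```

In this sketch, a finding whose `also_caught_by` list is empty is a unique contribution from its model, which is what step 4 calls out.
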
### 4. Present to Aaron

Format as a unified report with clear sections. Include:
- Overall verdict (APPROVE / REQUEST_CHANGES)
- Per-model findings (deduplicated, categorized)
- Recommended actions
- Any unresolved existing feedback from other reviewers

### 5. Post (if requested)

If Aaron says "post it" or "looks good, post":
- Use the pr-review skill's Phase 6 posting mechanics
- Post as a single unified review (not three separate ones)
- Use the rodin Gitea token for posting

## Narrow Deep-Pass (for financial/safety PRs)

After the main review, if the PR touches financial or safety logic (kill switch, reconciliation,
or financial calculations):

1. Extract ONLY the changed financial logic (strip test code, config, docs)
2. Ask GPT-5 a single focused question:
   - "Can this code produce a silently incorrect financial calculation? Show the specific input
     that produces a wrong number."
3. If findings emerge, add them to the report under a "🎯 Deep Analysis" section

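Steps 1 and 2 of the deep pass can be sketched as below. The path rules are hypothetical guesses at an Elixir-style repo layout and would need adjusting to the real one, and deciding which of the remaining files actually count as financial logic is left to the caller.

```python
DEEP_PASS_QUESTION = (
    "Can this code produce a silently incorrect financial calculation? "
    "Show the specific input that produces a wrong number."
)

# Hypothetical path rules for what to strip (tests, config, docs).
EXCLUDED_PREFIXES = ("test/", "config/", "docs/")
EXCLUDED_SUFFIXES = ("_test.exs", ".md")


def build_deep_pass_task(file_diffs):
    """Build the GPT-5 deep-pass task from a {path: diff_text} mapping.

    Returns None when nothing is left after stripping tests, config, and docs.
    """
    kept = {
        path: diff
        for path, diff in file_diffs.items()
        if not path.startswith(EXCLUDED_PREFIXES) and not path.endswith(EXCLUDED_SUFFIXES)
    }
    if not kept:
        return None
    # Label each diff with its path so the model can cite specific files.
    body = "\n\n".join(f"# {path}\n{diff}" for path, diff in sorted(kept.items()))
    return f"{DEEP_PASS_QUESTION}\n\n{body}"
```
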
## Timing Expectations

| Configuration | Expected time |
|--------------|---------------|
| Sonnet only | ~30s |
| Sonnet + GPT-5 | ~60s (parallel) |
| All three | ~90s (parallel, Opus may be faster) |
| + Deep pass | +45s (sequential after main review) |

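The table follows from two rules: parallel sub-agents cost as much as the slowest one, and the deep pass adds its time on top. A small sketch of that arithmetic, where the per-model latencies are inferred from the table rows (an assumption, with Opus taken at its ~90s upper bound):

```python
# Approximate per-model latencies in seconds, inferred from the table (assumption).
LATENCY_S = {"sonnet": 30, "gpt5": 60, "opus": 90}
DEEP_PASS_S = 45


def expected_seconds(models, deep_pass=False):
    """Estimate wall-clock time: slowest parallel model, plus the sequential deep pass."""
    total = max(LATENCY_S[m] for m in models)
    return total + (DEEP_PASS_S if deep_pass else 0)
```

Under these assumed figures, a full review with the deep pass comes to roughly 90 + 45 = 135 seconds.
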
## What This Replaces

This replaces the old single-pass pr-review for on-demand reviews. The pr-review skill is still
used for its Phase 0 (PR identification), Phase 1 (context gathering), Phase 4 (existing feedback),
Phase 6 (posting mechanics), and Phase 7 (walk-through). The REVIEW itself (Phase 3) is now
multi-model.

The CI twins (review-bot) continue running independently as the automated safety net.
On-demand reviews are the deep-dive when Aaron wants human-quality analysis.