# Multi-Model Review Orchestration

When Rodin is asked to review a PR (e.g., "review PR 630", "look at PR #625"), use this orchestration pattern instead of a single-pass review.

## Source of Truth

Specialized prompt files live at: `~/.openclaw/workspace/review-prompts/`

- `sonnet.md`: structural/pattern review (Sonnet's mandate)
- `gpt5.md`: semantic/domain/concurrency review (GPT-5's mandate)
- `opus.md`: design coherence/contradiction review (Opus's mandate)

These same files are used by CI (review-bot via `system-prompt-file`). Update one place → both paths improve.

## Decision: How Many Models?

| PR touches... | Models to run |
|---------------|---------------|
| Tests only, config, deps | Sonnet only (structural) |
| Application code (non-core) | Sonnet + GPT-5 |
| Core domain (order_management, ledger, risk, decision_engine) | Sonnet + GPT-5 + Opus |
| Architecture docs or design docs | GPT-5 + Opus (skip Sonnet) |
| Kill switch, reconciliation, or financial calculations | ALL THREE + narrow deep-pass |

## Orchestration Steps

### 1. Gather Context (do this yourself, don't delegate)

- Fetch PR metadata, diff, existing reviews (same as Phase 0-1 of pr-review skill)
- Identify what files are touched → determines which models to spawn
- Fetch linked issue/AC if present

### 2. Spawn Specialized Sub-Agents

Spawn sub-agents in parallel. Each gets:

- The full diff
- The relevant prompt file content (read from review-prompts/)
- Conventions file (CLAUDE.md)
- Patterns (from elixir-patterns/phoenix-conventions repos if applicable)
- Instruction: "Output structured findings as JSON. Do not post to Gitea."

```
sessions_spawn(model="sonnet", task="")
sessions_spawn(model="gpt5", task="")
sessions_spawn(model="opus", task="")  # if design PR
```

### 3. Synthesize Results

After all sub-agents complete:

1. **Deduplicate:** if Sonnet and GPT-5 found the same issue, keep GPT-5's version (deeper explanation) and note "(also caught by Sonnet)"
2. **Rank by severity:** BLOCKER > MAJOR > MINOR > NIT
3. **Group by category:**
   - 🏗️ Structural (from Sonnet)
   - 🧠 Semantic/Domain (from GPT-5)
   - ⚖️ Design Coherence (from Opus)
4. **Call out unique contributions:** "Only GPT-5 caught: ..." / "Only Opus caught: ..."
5. **Actionable fix list:** Real bugs → must fix. Theoretical → discuss. Style → fix if cheap.

### 4. Present to Aaron

Format as a unified report with clear sections. Include:

- Overall verdict (APPROVE / REQUEST_CHANGES)
- Per-model findings (deduplicated, categorized)
- Recommended actions
- Any unresolved existing feedback from other reviewers

### 5. Post (if requested)

If Aaron says "post it" or "looks good, post":

- Use the pr-review skill's Phase 6 posting mechanics
- Post as a single unified review (not three separate ones)
- Use the rodin Gitea token for posting

## Narrow Deep-Pass (for financial/safety PRs)

After the main review, if the PR touches financial logic:

1. Extract ONLY the changed financial logic (strip test code, config, docs)
2. Ask GPT-5 a single focused question:
   - "Can this code produce a silently incorrect financial calculation? Show the specific input that produces a wrong number."
3. If findings emerge, add them to the report under a "🎯 Deep Analysis" section

## Timing Expectations

| Configuration | Expected time |
|--------------|---------------|
| Sonnet only | ~30s |
| Sonnet + GPT-5 | ~60s (parallel) |
| All three | ~90s (parallel, Opus may be faster) |
| + Deep pass | +45s (sequential after main review) |

## What This Replaces

This replaces the old single-pass pr-review for on-demand reviews. The pr-review skill is still used for its Phase 0 (PR identification), Phase 1 (context gathering), Phase 4 (existing feedback), Phase 6 (posting mechanics), and Phase 7 (walk-through). The REVIEW itself (Phase 3) is now multi-model.

The CI twins (review-bot) continue running independently; they're the automated safety net.
On-demand reviews are the deep-dive when Aaron wants human-quality analysis.
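
## Example: Model Selection and Synthesis (sketch)

The decision table and the deduplicate/rank rules from the synthesis step can be sketched in Python as below. This is a minimal illustration, not the orchestrator's actual code. The path hints (`docs/`, `test`, `config`, `mix.exs`, `mix.lock`), the `Finding` fields, and the choice to treat two findings at the same file and line as "the same issue" are all assumptions made for the example. Whether a PR touches financial calculations usually cannot be read from file paths, so that call stays with the reviewer.

```python
from dataclasses import dataclass, replace

# Hypothetical path hints; adjust to the real repository layout.
CORE_DIRS = {"order_management", "ledger", "risk", "decision_engine"}
CRITICAL_HINTS = ("kill_switch", "reconciliation")
LIGHT_DIRS = {"test", "config"}
DEP_FILES = {"mix.exs", "mix.lock"}
SEVERITY_ORDER = {"BLOCKER": 0, "MAJOR": 1, "MINOR": 2, "NIT": 3}


def select_models(paths):
    """Return (models, deep_pass) for the touched paths; the most demanding row wins."""
    def is_critical(p):
        return any(hint in p for hint in CRITICAL_HINTS)

    def is_core(p):
        return any(segment in CORE_DIRS for segment in p.split("/"))

    def is_light(p):
        parts = p.split("/")
        return parts[0] in LIGHT_DIRS or parts[-1] in DEP_FILES

    if any(is_critical(p) for p in paths):
        return ["sonnet", "gpt5", "opus"], True
    if any(is_core(p) for p in paths):
        return ["sonnet", "gpt5", "opus"], False
    if paths and all(p.startswith("docs/") for p in paths):
        return ["gpt5", "opus"], False
    if paths and all(is_light(p) for p in paths):
        return ["sonnet"], False
    return ["sonnet", "gpt5"], False


@dataclass
class Finding:
    # Hypothetical shape of one structured finding returned by a sub-agent.
    model: str
    severity: str
    file: str
    line: int
    message: str


def synthesize(findings):
    """Drop Sonnet duplicates of GPT-5 findings, then rank by severity."""
    by_location = {}
    for f in findings:
        by_location.setdefault((f.file, f.line), []).append(f)

    kept = []
    for group in by_location.values():
        models = {f.model for f in group}
        for f in group:
            if f.model == "sonnet" and "gpt5" in models:
                continue  # GPT-5's version replaces Sonnet's duplicate
            if f.model == "gpt5" and "sonnet" in models:
                f = replace(f, message=f.message + " (also caught by Sonnet)")
            kept.append(f)
    # sorted() is stable, so findings of equal severity keep their input order.
    return sorted(kept, key=lambda f: SEVERITY_ORDER[f.severity])
```

For instance, `select_models(["lib/ledger/entry.ex"])` picks all three models without a deep pass, while a path containing `kill_switch` also sets the deep-pass flag. Grouping by category and the "Only GPT-5 caught" call-outs can then be derived from the `model` field of the ranked list.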