Generating Specialized Review Prompts for a Repository
Use this prompt to generate tailored review prompt files for any repository. Feed it to an AI model along with the repo's conventions file (CLAUDE.md, CONTRIBUTING.md, etc.) and a sample of the codebase.
The Prompt
I need you to generate three specialized code review prompt files for the following repository. Each prompt assigns a specific ROLE to one AI model, so that when all three run in parallel on the same PR, they produce complementary (non-overlapping) findings.
The three roles are:
1. **Structural/Pattern Reviewer** (for Claude Sonnet) — form over meaning
2. **Semantic/Domain Reviewer** (for GPT-5) — meaning over form
3. **Design Coherence Reviewer** (for Claude Opus) — system-level tensions
## Repository Context
- **Language/Framework:** [e.g., Go 1.22, Python/FastAPI, TypeScript/React, Elixir/Phoenix]
- **Domain:** [e.g., payment processing, e-commerce, trading system, infrastructure tooling]
- **Key patterns:** [e.g., hexagonal architecture, CQRS/ES, microservices with gRPC, monolith with modules]
- **Conventions file content:**
[paste CLAUDE.md / CONTRIBUTING.md / .editorconfig / relevant docs here]
- **Critical invariants:** [e.g., "financial calculations must never silently produce wrong numbers", "all API responses must include correlation IDs", "mutations must be idempotent"]
- **Safety mechanisms:** [e.g., "circuit breakers on all external calls", "rate limiting on user-facing endpoints", "kill switch for trading"]
## Instructions for Each Prompt File
For each role, generate a markdown file with:
1. **A one-line role statement** — what this reviewer does (form, meaning, or design)
2. **"Your Domain" section** — 5 specific focus areas tailored to THIS repo's language, framework, and domain. Be concrete (e.g., "correct use of context.Context propagation" not "correct patterns"). Reference the repo's actual conventions.
3. **"NOT Your Domain" section** — explicitly exclude what the other two reviewers handle. This prevents overlap.
4. **"Output Rules" section** — severity definitions specific to this repo's risk profile. What counts as MAJOR in a payment system differs from what counts as MAJOR in a blog.
5. **Optional "Context" section** — domain-specific priorities (e.g., "silent data corruption > crashes" for financial systems, "user data exposure > downtime" for auth systems)
6. **Optional "When to Engage" section** (Opus only) — path-based trigger guidance for when the design reviewer adds value vs when it should just approve.
## Quality Criteria for Generated Prompts
- Each prompt must be SPECIFIC to this repo — no generic advice that applies everywhere
- The three prompts must be COMPLEMENTARY — reading all three, every reasonable finding type is covered exactly once
- The "NOT Your Domain" sections must form a clean partition — nothing falls through the cracks
- Severity definitions must reflect the repo's actual risk profile (a NIT in a blog engine might be a MAJOR in a payment system)
- Focus areas must reference actual frameworks/libraries/patterns the repo uses (not hypothetical ones)
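The "covered exactly once" criterion can be checked mechanically once you list which finding types each generated prompt claims. The following is a minimal sketch, not part of the original workflow: the role keys and finding types in `OWNERSHIP` are hypothetical placeholders, and `check_partition` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical example data: which finding types each generated prompt
# claims in its "Your Domain" section. Replace with your repo's own list.
OWNERSHIP = {
    "structural": {"naming", "error-handling shape", "test layout"},
    "semantic": {"business-rule correctness", "unit/currency handling"},
    "design": {"module boundaries", "cross-service contracts"},
}


def check_partition(ownership, all_finding_types):
    """Return (unowned, duplicated) finding types, each as a sorted list.

    A clean partition leaves both lists empty: every finding type is
    owned by exactly one reviewer role.
    """
    # Count how many roles claim each finding type.
    counts = Counter(t for owned in ownership.values() for t in owned)
    unowned = sorted(t for t in all_finding_types if counts[t] == 0)
    duplicated = sorted(t for t, n in counts.items() if n > 1)
    return unowned, duplicated
```

An empty `unowned` list means nothing falls through the cracks; an empty `duplicated` list means no two reviewers will report the same class of finding.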
Example Usage
To generate prompts for a new repo, run something like:
# Gather context
cat CLAUDE.md CONTRIBUTING.md > /tmp/repo-context.md
find lib/ -name "*.ex" | head -20 | xargs head -30 >> /tmp/repo-context.md # sample code
tree -L 2 >> /tmp/repo-context.md # structure
# Feed to a model with the prompt above
Then review the output, test on a real PR (dry-run mode), and iterate.
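If you generate prompts for several repos, it may help to render the "Repository Context" block from structured data instead of editing the placeholders by hand. This is a sketch under assumptions: `RepoContext` and `render_repo_context` are hypothetical names, and the field labels simply mirror the prompt above.

```python
from dataclasses import dataclass, field


@dataclass
class RepoContext:
    """Values for the Repository Context block (hypothetical structure)."""
    language: str
    domain: str
    key_patterns: str
    conventions: str  # raw text of CLAUDE.md / CONTRIBUTING.md, etc.
    critical_invariants: list = field(default_factory=list)
    safety_mechanisms: list = field(default_factory=list)


def _quoted(items):
    """Join items as a comma-separated list of double-quoted strings."""
    return ", ".join(f'"{item}"' for item in items)


def render_repo_context(ctx):
    """Render the context as markdown using the same labels as the prompt."""
    lines = [
        "## Repository Context",
        f"- **Language/Framework:** {ctx.language}",
        f"- **Domain:** {ctx.domain}",
        f"- **Key patterns:** {ctx.key_patterns}",
        "- **Conventions file content:**",
        ctx.conventions.strip(),
        f"- **Critical invariants:** {_quoted(ctx.critical_invariants)}",
        f"- **Safety mechanisms:** {_quoted(ctx.safety_mechanisms)}",
    ]
    return "\n".join(lines)
```

Paste the rendered block into the prompt in place of the bracketed placeholders before feeding it to the model.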
What Makes Good Prompts
Based on 29 experiments from our model research:
- Specificity beats generality. "Check for correct `context.Context` propagation in gRPC handlers" catches more than "check for correct patterns."
- Explicit exclusions prevent overlap. Without "NOT Your Domain," models default to broad review and duplicate each other's work.
- Domain-calibrated severity prevents noise. A missing error check in a CLI tool is a NIT. The same missing check in a payment handler is a MAJOR.
- Models follow instructions. If you tell Sonnet not to look for race conditions, it won't. The specialization actually works (Finding #26 from our research: prompt framing dominates model personality).
- Short is better. Each prompt should be under 3 KB. Models don't need verbose instructions; they need clear boundaries.
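Two of these points, required sections and the size budget, are easy to lint before you test on a real PR. The sketch below makes some assumptions: it treats "3KB" as 3 × 1024 bytes of UTF-8 text, it expects the sections to appear as markdown headings such as `## Your Domain`, and `lint_prompt` is a hypothetical helper name.

```python
REQUIRED_SECTIONS = ("Your Domain", "NOT Your Domain", "Output Rules")
MAX_BYTES = 3 * 1024  # assumption: "3KB" read as 3 KiB of UTF-8 text


def lint_prompt(text):
    """Return a list of problems found in one generated prompt file."""
    problems = []

    # Size budget: measure encoded bytes, not characters.
    size = len(text.encode("utf-8"))
    if size >= MAX_BYTES:
        problems.append(f"prompt is {size} bytes; budget is under {MAX_BYTES}")

    # Assumption: each section is a markdown heading line ("#", "##", ...).
    headings = [
        line.lstrip("#").strip()
        for line in text.splitlines()
        if line.startswith("#")
    ]
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems
```

Run it over each of the three generated files; an empty list means the file has the mandatory sections and fits the budget. It does not judge whether the content is specific to your repo, so a manual read is still needed.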