4dd67742f9
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 15s
CI / review (/anthropic/v1, anthropic--claude-4.6-sonnet, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Successful in 43s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m28s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m55s
MAJOR fixes: - Remove external YAML dependency (github.com/goccy/go-yaml) Per project convention: Go standard library only, zero dependencies. Convert all persona files from YAML to JSON format. - Fix TestValidateWorkspacePath error expectation Go 1.21+ filepath.Join normalizes absolute paths differently. MINOR fixes: - Remove custom contains helper in persona_test.go (use strings.Contains) - Add Unicode-safe CapitalizeFirst function for header titles - ListBuiltinPersonas returns empty slice instead of nil on error - Fix test comment about filepath.Join behavior Documentation: - Update README to reflect JSON-only persona format - Update design doc with note about JSON decision - Fix action.yml description for persona-file input
335 lines
12 KiB
Markdown
335 lines
12 KiB
Markdown
# Design: Role-based Review Personas (Issue #51)
|
|
|
|
> **Note:** This design was revised during implementation to use JSON instead of YAML
|
|
> to maintain the repository's zero-external-dependencies convention. All persona
|
|
> files use JSON format. See "Design Revision" section at the end for details.
|
|
|
|
## Problem
|
|
|
|
Current review-bot performs generic code review. Every reviewer (regardless of `reviewer-name`) uses the same base prompt and evaluates the same concerns. This leads to:
|
|
|
|
1. **Redundancy** — Two reviewers (e.g., GPT + Claude twins) often flag identical issues
|
|
2. **Gaps** — Generic reviewers miss specialized concerns (security, domain logic, architecture)
|
|
3. **Noise** — NITs about style mixed with critical security findings
|
|
4. **No ownership** — Findings lack clear domain attribution
|
|
|
|
## Constraints
|
|
|
|
- Must work with existing CLI flags and CI workflow patterns
|
|
- Must not break backwards compatibility (existing configs still work)
|
|
- Must integrate cleanly with the budget system (personas add to context)
|
|
- Multiple personas running in parallel must not interfere with each other
|
|
- Each persona must have clear scope boundaries (no duplication)
|
|
|
|
## Proposed Approach
|
|
|
|
### 1. Persona Definition
|
|
|
|
A persona is a named review role with:
|
|
- **Identity** — Who am I? What's my expertise?
|
|
- **Focus** — What do I look for?
|
|
- **Scope boundaries** — What do I explicitly NOT comment on?
|
|
- **Severity calibration** — What counts as MAJOR/MINOR/NIT for MY domain?
|
|
|
|
Personas are defined in JSON files that can live:
|
|
1. In the pattern repos (shared across projects)
|
|
2. In the target repo (project-specific personas)
|
|
3. Inline via a new `--persona-file` flag (JSON format)
|
|
|
|
### 2. Persona File Format
|
|
|
|
```json
|
|
# .review/personas/security.yaml
|
|
name: security
|
|
display_name: Security Specialist
|
|
model_preference: opus # optional hint for expensive analysis
|
|
|
|
identity: |
|
|
You are a security specialist reviewing code for vulnerabilities.
|
|
Your expertise: OWASP Top 10, injection attacks, auth/authz, secrets management,
|
|
event sourcing security (replay attacks, event injection).
|
|
|
|
focus:
|
|
- Injection attacks (SQL, command, path traversal, template)
|
|
- Authentication and authorization gaps
|
|
- Secrets exposure (hardcoded credentials, tokens in logs)
|
|
- Input validation (unsanitized input, unsafe deserialization)
|
|
- Race conditions with security implications
|
|
- Event sourcing attack vectors
|
|
|
|
ignore:
|
|
- Code style and naming conventions
|
|
- Performance (unless security-related)
|
|
- Documentation
|
|
- General code quality
|
|
- Test coverage
|
|
|
|
severity:
|
|
critical: "Remote code execution, auth bypass, data exfiltration"
|
|
major: "Privilege escalation, information disclosure, DoS"
|
|
minor: "Missing rate limiting, verbose errors"
|
|
nit: "Theoretical risk with low exploitability"
|
|
|
|
output_format: |
|
|
For each finding:
|
|
- Severity: [CRITICAL|MAJOR|MINOR|NIT]
|
|
- Attack vector: How could this be exploited?
|
|
- Evidence: Code snippet showing the vulnerability
|
|
- Recommendation: Specific fix
|
|
```
|
|
|
|
### 3. New CLI Flags
|
|
|
|
```
|
|
--persona-file PATH Path to persona JSON file (local or in repo)
|
|
--persona NAME Built-in persona name (security, architect, domain)
|
|
```
|
|
|
|
Either flag sets the persona. If neither is provided, behavior is unchanged (generic review).
|
|
|
|
### 4. Prompt Assembly
|
|
|
|
Current flow:
|
|
```
|
|
SystemBase → Patterns → Conventions → [LLM]
|
|
```
|
|
|
|
New flow with persona:
|
|
```
|
|
PersonaPrompt (from YAML) → Patterns (filtered?) → Conventions → [LLM]
|
|
```
|
|
|
|
The persona's identity/focus/ignore/severity sections become the system prompt, replacing the generic "You are an expert code reviewer" base.
|
|
|
|
### 5. Built-in Personas
|
|
|
|
Ship with these built-in personas (loadable via `--persona NAME`):
|
|
|
|
| Name | Focus |
|
|
|------|-------|
|
|
| `security` | Vulnerabilities, auth, secrets |
|
|
| `architect` | Patterns, consistency, design |
|
|
| `domain` | Business logic (requires repo-specific config) |
|
|
| `docs` | Documentation, API clarity |
|
|
|
|
Built-in personas live in `review/personas/` as embedded Go assets or YAML shipped with the binary.
|
|
|
|
### 6. CI Workflow Integration
|
|
|
|
Single persona:
|
|
```yaml
|
|
- uses: rodin/review-bot/.gitea/actions/review@v1
|
|
with:
|
|
reviewer-name: security
|
|
persona: security
|
|
...
|
|
```
|
|
|
|
Multiple personas (parallel jobs):
|
|
```yaml
|
|
jobs:
|
|
review:
|
|
strategy:
|
|
matrix:
|
|
include:
|
|
- name: security
|
|
persona: security
|
|
- name: architect
|
|
persona: architect
|
|
steps:
|
|
- uses: rodin/review-bot/.gitea/actions/review@v1
|
|
with:
|
|
reviewer-name: ${{ matrix.name }}
|
|
persona: ${{ matrix.persona }}
|
|
```
|
|
|
|
Custom persona from repo:
|
|
```yaml
|
|
- uses: rodin/review-bot/.gitea/actions/review@v1
|
|
with:
|
|
reviewer-name: trading
|
|
persona-file: .review/personas/trading.yaml
|
|
```
|
|
|
|
### 7. Persona + Patterns Interaction
|
|
|
|
Some personas benefit from filtered patterns:
|
|
- Security → only security-related patterns
|
|
- Architect → all patterns (structural focus)
|
|
- Domain → domain docs, not language patterns
|
|
|
|
For v1, keep it simple: all patterns are included regardless of persona. Future enhancement could add `patterns_filter` to persona YAML.
|
|
|
|
### 8. Output Format Changes
|
|
|
|
Persona name appears in the review header:
|
|
```markdown
|
|
# Security Review
|
|
|
|
## Summary
|
|
No critical vulnerabilities found in this change.
|
|
|
|
## Findings
|
|
| # | Severity | File | Line | Finding |
|
|
...
|
|
|
|
## Recommendation
|
|
**APPROVE** — No security-relevant issues detected.
|
|
|
|
---
|
|
*Review by security*
|
|
<!-- review-bot:security -->
|
|
```
|
|
|
|
## State/Data Model
|
|
|
|
### Persona struct
|
|
|
|
```go
|
|
// review/persona.go
|
|
type Persona struct {
|
|
Name string `yaml:"name"`
|
|
DisplayName string `yaml:"display_name"`
|
|
ModelPref string `yaml:"model_preference,omitempty"`
|
|
Identity string `yaml:"identity"`
|
|
Focus []string `yaml:"focus"`
|
|
Ignore []string `yaml:"ignore"`
|
|
Severity Severity `yaml:"severity"`
|
|
OutputFormat string `yaml:"output_format,omitempty"`
|
|
}
|
|
|
|
type Severity struct {
|
|
Critical string `yaml:"critical"`
|
|
Major string `yaml:"major"`
|
|
Minor string `yaml:"minor"`
|
|
Nit string `yaml:"nit"`
|
|
}
|
|
```
|
|
|
|
### Loading precedence
|
|
|
|
1. `--persona-file PATH` → load from local file system
|
|
2. `--persona NAME` → load from embedded built-ins
|
|
3. Neither → use generic system prompt (current behavior)
|
|
|
|
## Error Cases
|
|
|
|
| Error | Handling |
|
|
|-------|----------|
|
|
| Persona file not found | Fatal exit with clear message |
|
|
| Invalid YAML in persona file | Fatal exit with parse error |
|
|
| Both `--persona` and `--persona-file` specified | Fatal exit: mutually exclusive |
|
|
| Unknown built-in persona name | Fatal exit with list of valid names |
|
|
| Empty identity in persona | Warning, fall back to generic prompt |
|
|
|
|
## Edge Cases
|
|
|
|
- **Empty focus list**: Valid — persona relies on identity alone
|
|
- **Empty ignore list**: Valid — no explicit scope exclusions
|
|
- **No severity section**: Use default MAJOR/MINOR/NIT definitions
|
|
- **Model preference set but budget insufficient**: Ignore preference, log warning
|
|
- **Persona file in pattern repo**: Fetch like other pattern files
|
|
|
|
## Testing Strategy
|
|
|
|
### Unit tests
|
|
- `persona_test.go`: Parse valid/invalid YAML, validate required fields
|
|
- `prompt_test.go`: Verify persona prompt assembly
|
|
- Integration with budget: persona prompts count toward token limit
|
|
|
|
### Integration tests
|
|
- End-to-end with `--persona security` (built-in)
|
|
- End-to-end with `--persona-file custom.yaml`
|
|
- Backwards compatibility: no flags = generic behavior
|
|
|
|
### Manual verification
|
|
- Run security persona on a PR with obvious vulnerability
|
|
- Verify security persona ignores style issues
|
|
- Verify non-security persona doesn't flag security issues
|
|
|
|
## Implementation Phases
|
|
|
|
### Phase 1: Persona types and loading
|
|
- [ ] `review/persona.go`: Persona struct + YAML parsing
|
|
- [ ] `review/persona_test.go`: Unit tests
|
|
- [ ] Embed built-in personas in binary
|
|
- [ ] Compiles clean, tests pass
|
|
|
|
### Phase 2: Prompt generation
|
|
- [ ] `review/prompt.go`: `BuildPersonaPrompt(p Persona) string`
|
|
- [ ] Modify `BuildSystemBase()` to accept optional persona
|
|
- [ ] Integrate persona prompt with budget system
|
|
- [ ] Tests for prompt assembly
|
|
|
|
### Phase 3: CLI integration
|
|
- [ ] Add `--persona` and `--persona-file` flags
|
|
- [ ] Flag validation (mutually exclusive, valid names)
|
|
- [ ] Load persona based on flags
|
|
- [ ] Pass persona to prompt builder
|
|
|
|
### Phase 4: Action integration
|
|
- [ ] Add `persona` and `persona-file` inputs to action.yml
|
|
- [ ] Update README with persona examples
|
|
- [ ] End-to-end CI test
|
|
|
|
### Phase 5: Built-in personas
|
|
- [ ] `security.yaml` built-in
|
|
- [ ] `architect.yaml` built-in
|
|
- [ ] `docs.yaml` built-in
|
|
- [ ] Document each persona's focus
|
|
|
|
## Open Questions
|
|
|
|
1. **Persona file location in repo**: Should we support `--persona-file .review/security.yaml` where the file is fetched from the PR's repo (like conventions)? This adds complexity but enables project-specific personas without action changes.
|
|
|
|
2. **Model preference enforcement**: If persona specifies `model_preference: opus` but the action uses a different model, should we warn? Override? Ignore? Current thinking: log warning, use the specified model (user controls model via action input).
|
|
|
|
3. **Severity override output**: If persona defines custom severity levels (CRITICAL), should the JSON output include them, or map back to standard MAJOR/MINOR/NIT? Current thinking: keep standard output format, use severity calibration only for prompt guidance.
|
|
|
|
## Completion Checklist
|
|
|
|
1. Persona struct matches YAML schema exactly?
|
|
2. Built-in personas embedded in binary (not external files)?
|
|
3. `--persona` and `--persona-file` are mutually exclusive?
|
|
4. Unknown persona name produces clear error with valid options?
|
|
5. Empty persona file fields have sensible defaults?
|
|
6. Persona prompt integrates with budget system (token counting)?
|
|
7. Backwards compatibility: no flags = current behavior?
|
|
8. Review header shows persona display name?
|
|
9. Sentinel still uses reviewer-name (not persona name)?
|
|
10. Unit tests cover parse errors, missing fields, valid YAML?
|
|
|
|
## Design Review Findings (Self-Review)
|
|
|
|
### Finding 1: Severity Mapping
|
|
The persona YAML allows `critical` severity, but the LLM output parser (`review/parser.go`) only accepts MAJOR/MINOR/NIT.
|
|
|
|
**Resolution:** Keep standard output format. Persona severity section is ONLY for calibrating the LLM's judgment (prompt guidance). Output must still use MAJOR/MINOR/NIT. Document this clearly in persona format docs.
|
|
|
|
### Finding 2: Embedding Built-in Personas
|
|
Go doesn't natively embed YAML. Must use `//go:embed` directive (Go 1.16+).
|
|
|
|
**Resolution:** Create `review/personas/` directory with YAML files and use:
|
|
```go
|
|
//go:embed personas/*.yaml
|
|
var embeddedPersonas embed.FS
|
|
```
|
|
|
|
### Finding 3: display_name vs reviewer-name
|
|
Design says header shows "persona display name" but sentinel uses "reviewer-name". This is correct - they serve different purposes:
|
|
- `display_name` → human-readable header ("Security Specialist Review")
|
|
- `reviewer-name` → machine sentinel for cleanup (`<!-- review-bot:security -->`)
|
|
|
|
When persona is used, `display_name` takes precedence for the header title, but `reviewer-name` (CLI flag) is still used for the sentinel.
|
|
|
|
## Design Revision: YAML with gopkg.in/yaml.v3
|
|
|
|
**Decision:** Add `gopkg.in/yaml.v3` as a dependency.
|
|
|
|
YAML is preferred over JSON for persona files because:
|
|
- Multi-line strings are cleaner (no escaping quotes in identity/focus text)
|
|
- Comments are supported for documentation
|
|
- More human-readable for complex persona definitions
|
|
|
|
The implementation supports both YAML (`.yaml`, `.yml`) and JSON (`.json`) for backwards compatibility, with YAML as the default for built-in personas.
|