feat(persona): add role-based review personas #55
@@ -74,6 +74,14 @@ inputs:
|
||||
description: 'Local file with additional system prompt instructions (e.g. security review focus)'
|
||||
|
|
||||
required: false
|
||||
|
gpt-review-bot
commented
[MAJOR] The **[MAJOR]** The `inputs` section is malformed: `system-prompt-file` is declared without any properties, and the subsequent `description/required/default` fields appear under `persona-file`, resulting in duplicate keys for `persona-file` and leaving `system-prompt-file` with a null value. In YAML this can cause the last duplicate key to win and may lead to GitHub/Gitea Actions rejecting or mis-parsing these inputs.
|
||||
default: ''
|
||||
persona:
|
||||
description: 'Built-in persona name (security, architect, docs)'
|
||||
required: false
|
||||
default: ''
|
||||
persona-file:
|
||||
description: 'Path to custom persona JSON file'
|
||||
required: false
|
||||
default: ''
|
||||
|
||||
runs:
|
||||
|
gpt-review-bot
commented
[MINOR] The persona-file input description states "Path to persona JSON file" but the implementation supports both YAML and JSON. Update the description to reflect YAML/JSON support for consistency. **[MINOR]** The persona-file input description states "Path to persona JSON file" but the implementation supports both YAML and JSON. Update the description to reflect YAML/JSON support for consistency.
gpt-review-bot
commented
[NIT] The 'persona-file' input description says 'Path to persona JSON file', but the implementation supports YAML and JSON. Update the description to avoid misleading users (e.g., 'YAML/JSON'). **[NIT]** The 'persona-file' input description says 'Path to persona JSON file', but the implementation supports YAML and JSON. Update the description to avoid misleading users (e.g., 'YAML/JSON').
|
||||
using: 'composite'
|
||||
@@ -155,6 +163,8 @@ runs:
|
||||
LLM_PROVIDER: ${{ inputs.llm-provider }}
|
||||
UPDATE_EXISTING: ${{ inputs.update-existing }}
|
||||
SYSTEM_PROMPT_FILE: ${{ inputs.system-prompt-file }}
|
||||
PERSONA: ${{ inputs.persona }}
|
||||
PERSONA_FILE: ${{ inputs.persona-file }}
|
||||
run: |
|
||||
ARGS=""
|
||||
if [ "${{ inputs.dry-run }}" = "true" ]; then
|
||||
|
||||
@@ -182,6 +182,8 @@ Prints the review to CI logs without posting to the PR. Useful for testing promp
|
||||
| `patterns-repo` | No | `""` | Comma-separated repos with language patterns (e.g. `rodin/go-patterns`) |
|
||||
| `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
|
||||
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
|
||||
| `persona` | No | `""` | Built-in persona name (security, architect, docs) |
|
||||
| `persona-file` | No | `""` | Path to persona JSON file with custom review focus |
|
||||
| `temperature` | No | `0` | LLM temperature (0 = server default) |
|
||||
| `timeout` | No | `300` | LLM request timeout in seconds |
|
||||
| `dry-run` | No | `false` | Print review to stdout instead of posting |
|
||||
@@ -329,3 +331,103 @@ budget/ Token estimation + context trimming
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
## Review Personas
|
||||
|
||||
Personas provide role-based review specialization. Instead of generic code review, each persona focuses on a specific domain (security, architecture, documentation) with tailored prompts and severity calibration.
|
||||
|
||||
### Built-in Personas
|
||||
|
||||
| Persona | Focus |
|
||||
|---------|-------|
|
||||
| `security` | Vulnerabilities, auth bypass, secrets exposure, injection attacks |
|
||||
| `architect` | Design patterns, code organization, API contracts, testability |
|
||||
| `docs` | Documentation quality, API clarity, error messages |
|
||||
|
gpt-review-bot
commented
[NIT] The 'Using Built-in Personas' example specifies an Anthropic model (claude-opus-4-20250514) without setting llm-provider to 'anthropic'. While covered elsewhere, adding it here would reduce potential user confusion. **[NIT]** The 'Using Built-in Personas' example specifies an Anthropic model (claude-opus-4-20250514) without setting llm-provider to 'anthropic'. While covered elsewhere, adding it here would reduce potential user confusion.
|
||||
|
||||
### Using Built-in Personas
|
||||
|
||||
```yaml
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: security
|
||||
persona: security
|
||||
llm-model: claude-opus-4-20250514 # Security benefits from strong reasoning
|
||||
...
|
||||
```
|
||||
|
||||
### Multiple Personas in Parallel
|
||||
|
||||
```yaml
|
||||
|
gpt-review-bot
commented
[NIT] JSON example snippet under 'Custom Personas' includes a comment line ('// .review/personas/trading.json') which is not valid JSON; users copy-pasting may encounter parse errors. Consider moving the comment outside the code block or noting that it's illustrative. **[NIT]** JSON example snippet under 'Custom Personas' includes a comment line ('// .review/personas/trading.json') which is not valid JSON; users copy-pasting may encounter parse errors. Consider moving the comment outside the code block or noting that it's illustrative.
|
||||
jobs:
|
||||
review:
|
||||
strategy:
|
||||
matrix:
|
||||
include:
|
||||
- name: security
|
||||
persona: security
|
||||
- name: architect
|
||||
persona: architect
|
||||
steps:
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: ${{ matrix.name }}
|
||||
persona: ${{ matrix.persona }}
|
||||
...
|
||||
```
|
||||
|
||||
Each persona posts independently with its own sentinel, so reviews don't interfere.
|
||||
|
||||
|
||||
### Custom Personas
|
||||
|
||||
Create a JSON file with your domain-specific review focus:
|
||||
|
||||
```json
|
||||
// .review/personas/trading.json
|
||||
{
|
||||
"name": "trading",
|
||||
"display_name": "Trading Domain Expert",
|
||||
"identity": "You are a trading systems expert reviewing code for correctness.\n\nYour expertise:\n- Order lifecycle and state machines\n- Fill handling and partial fills\n- Position tracking and P&L calculations\n- Event sourcing invariants",
|
||||
"focus": [
|
||||
"Order state machine correctness",
|
||||
"Fill handling edge cases (partial, overfill)",
|
||||
"Position and P&L calculation accuracy",
|
||||
"Event replay determinism",
|
||||
"Decimal precision for money"
|
||||
],
|
||||
"ignore": [
|
||||
"Code style",
|
||||
"General performance",
|
||||
"Documentation formatting"
|
||||
],
|
||||
"severity": {
|
||||
"major": "Bugs that cause incorrect positions, fills, or money calculations",
|
||||
"minor": "Edge cases that could cause issues under unusual conditions",
|
||||
"nit": "Clarity improvements for domain logic"
|
||||
}
|
||||
|
sonnet-review-bot
commented
[NIT] The README custom persona example shows YAML format and says 'JSON format is also supported for backwards compatibility' — but this is a new feature with no prior format, so there's no backwards compatibility concern. After removing the yaml.v3 dependency and going JSON-only per the design document's own resolution, this sentence should be removed or changed to accurately describe which format is canonical. **[NIT]** The README custom persona example shows YAML format and says 'JSON format is also supported for backwards compatibility' — but this is a new feature with no prior format, so there's no backwards compatibility concern. After removing the yaml.v3 dependency and going JSON-only per the design document's own resolution, this sentence should be removed or changed to accurately describe which format is canonical.
|
||||
}
|
||||
```
|
||||
|
||||
Use it in CI:
|
||||
|
||||
```yaml
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: trading
|
||||
persona-file: .review/personas/trading.json
|
||||
...
|
||||
```
|
||||
|
||||
|
||||
### Persona vs system-prompt-file
|
||||
|
||||
| Feature | `persona` / `persona-file` | `system-prompt-file` |
|
||||
|---------|---------------------------|----------------------|
|
||||
| Replaces base prompt | Yes | No (appends) |
|
||||
| Structured format | Yes (JSON) | No (freeform) |
|
||||
| Focus/ignore lists | Yes | Manual |
|
||||
| Severity calibration | Yes | Manual |
|
||||
| Header display name | Yes | No |
|
||||
| Built-in options | Yes | No |
|
||||
|
||||
Use personas for domain-specialized reviews. Use `system-prompt-file` for minor tweaks to the generic review.
|
||||
|
||||
@@ -70,6 +70,8 @@ func main() {
|
||||
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
|
||||
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
|
||||
llmProvider := flag.String("llm-provider", envOrDefault("LLM_PROVIDER", "openai"), "LLM API provider: openai or anthropic")
|
||||
personaName := flag.String("persona", envOrDefault("PERSONA", ""), "Built-in persona name (security, architect, docs)")
|
||||
personaFile := flag.String("persona-file", envOrDefault("PERSONA_FILE", ""), "Path to persona JSON file")
|
||||
|
||||
|
sonnet-review-bot
commented
[MINOR] The --persona-file flag description says 'Path to persona JSON file' but the implementation also accepts YAML. The description should say 'Path to persona JSON or YAML file' (or after the dependency fix, just 'JSON file'). **[MINOR]** The --persona-file flag description says 'Path to persona JSON file' but the implementation also accepts YAML. The description should say 'Path to persona JSON or YAML file' (or after the dependency fix, just 'JSON file').
|
||||
flag.Parse()
|
||||
|
||||
@@ -91,6 +93,36 @@ func main() {
|
||||
os.Exit(1)
|
||||
}
|
||||
|
sonnet-review-bot
commented
[MINOR] There is a blank line immediately after the **[MINOR]** There is a blank line immediately after the `os.Exit(1)` block (the required-fields validation) and before the persona-flags validation comment. This is a minor style issue (double blank line) that `gofmt` would not flag but deviates from the existing code style where single blank lines separate logical sections.
|
||||
|
||||
// Validate persona flags are mutually exclusive
|
||||
if *personaName != "" && *personaFile != "" {
|
||||
slog.Error("--persona and --persona-file are mutually exclusive")
|
||||
os.Exit(1)
|
||||
}
|
||||
|
[MAJOR] Unvalidated persona-file path is read without restricting to the workspace or validating symlinks. An attacker who can modify workflow inputs could point to arbitrary files (e.g., /dev/zero causing DoS, or JSON config files), risking resource exhaustion and potential data exfiltration to the LLM. **[MAJOR]** Unvalidated persona-file path is read without restricting to the workspace or validating symlinks. An attacker who can modify workflow inputs could point to arbitrary files (e.g., /dev/zero causing DoS, or JSON config files), risking resource exhaustion and potential data exfiltration to the LLM.
|
||||
|
||||
// Load persona if specified
|
||||
var persona *review.Persona
|
||||
if *personaName != "" {
|
||||
var err error
|
||||
|
[MINOR] The persona-file code path accepts and loads a file path directly into review.LoadPersona without the workspace-bound and symlink checks applied to system-prompt-file. While LoadPersona performs JSON parsing (reducing accidental leakage), adding the same path/symlink validation here would provide consistent defense-in-depth and reduce the risk of reading unintended files. **[MINOR]** The persona-file code path accepts and loads a file path directly into review.LoadPersona without the workspace-bound and symlink checks applied to system-prompt-file. While LoadPersona performs JSON parsing (reducing accidental leakage), adding the same path/symlink validation here would provide consistent defense-in-depth and reduce the risk of reading unintended files.
|
||||
persona, err = review.LoadBuiltinPersona(*personaName)
|
||||
if err != nil {
|
||||
slog.Error("failed to load persona", "persona", *personaName, "error", err)
|
||||
os.Exit(1)
|
||||
|
[MINOR] The --persona-file path is read directly from the filesystem without workspace or symlink checks, allowing arbitrary file reads on the runner if the input were ever user-controlled. Mirror the path traversal protections used for --system-prompt-file (resolve to absolute path within the workspace, EvalSymlinks, and reject paths escaping the workspace) to prevent unintended access. **[MINOR]** The --persona-file path is read directly from the filesystem without workspace or symlink checks, allowing arbitrary file reads on the runner if the input were ever user-controlled. Mirror the path traversal protections used for --system-prompt-file (resolve to absolute path within the workspace, EvalSymlinks, and reject paths escaping the workspace) to prevent unintended access.
|
||||
}
|
||||
|
gpt-review-bot
commented
[MINOR] Persona file path is read directly from --persona-file without the workspace/symlink traversal safeguards applied to system-prompt-file. For consistency and security, restrict persona-file to the workspace and guard against symlink traversal. **[MINOR]** Persona file path is read directly from --persona-file without the workspace/symlink traversal safeguards applied to system-prompt-file. For consistency and security, restrict persona-file to the workspace and guard against symlink traversal.
|
||||
slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
|
||||
} else if *personaFile != "" {
|
||||
|
gpt-review-bot
commented
[MAJOR] persona-file is read from the filesystem without validating it resides within the checkout workspace or guarding against symlink traversal. A malicious config could point to arbitrary files (e.g., /etc/*) and exfiltrate their contents to the LLM. Apply the same absolute path, prefix, and symlink-resolution checks used for system-prompt-file. **[MAJOR]** persona-file is read from the filesystem without validating it resides within the checkout workspace or guarding against symlink traversal. A malicious config could point to arbitrary files (e.g., /etc/*) and exfiltrate their contents to the LLM. Apply the same absolute path, prefix, and symlink-resolution checks used for system-prompt-file.
|
||||
resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
|
||||
if err != nil {
|
||||
|
gpt-review-bot
commented
[MINOR] Persona file loading ( **[MINOR]** Persona file loading (`--persona-file`) accepts an arbitrary path without the same workspace-bound path validation that is applied to `--system-prompt-file`. For consistency and safety in CI environments, consider applying similar workspace containment checks to avoid accidentally reading outside the repository workspace.
|
||||
slog.Error("invalid persona-file path", "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
persona, err = review.LoadPersona(resolvedPath)
|
||||
if err != nil {
|
||||
slog.Error("failed to load persona file", "file", *personaFile, "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
|
||||
}
|
||||
|
||||
// Validate reviewer-name: only safe characters allowed in sentinel
|
||||
if err := validateReviewerName(*reviewerName); err != nil {
|
||||
slog.Error("invalid reviewer name", "error", err)
|
||||
@@ -201,34 +233,14 @@ func main() {
|
||||
// Step 6b: Load additional system prompt if specified
|
||||
additionalPrompt := ""
|
||||
if *systemPromptFile != "" {
|
||||
workspace := os.Getenv("GITHUB_WORKSPACE")
|
||||
if workspace == "" {
|
||||
workspace, _ = os.Getwd()
|
||||
}
|
||||
absWorkspace, err := filepath.Abs(workspace)
|
||||
resolvedPath, err := validateWorkspacePath(*systemPromptFile, "system-prompt-file")
|
||||
if err != nil {
|
||||
slog.Error("failed to resolve workspace path", "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
promptPath := filepath.Join(absWorkspace, *systemPromptFile)
|
||||
promptPath = filepath.Clean(promptPath)
|
||||
if !strings.HasPrefix(promptPath, absWorkspace+string(filepath.Separator)) && promptPath != absWorkspace {
|
||||
slog.Error("system-prompt-file resolves outside workspace", "path", promptPath, "workspace", absWorkspace)
|
||||
os.Exit(1)
|
||||
}
|
||||
// Resolve symlinks and re-validate to prevent symlink traversal
|
||||
resolvedPath, err := filepath.EvalSymlinks(promptPath)
|
||||
if err != nil {
|
||||
slog.Error("failed to resolve system prompt file", "path", promptPath, "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
if !strings.HasPrefix(resolvedPath, absWorkspace+string(filepath.Separator)) && resolvedPath != absWorkspace {
|
||||
slog.Error("system-prompt-file symlink resolves outside workspace", "resolved", resolvedPath, "workspace", absWorkspace)
|
||||
slog.Error("invalid system-prompt-file path", "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
data, err := os.ReadFile(resolvedPath)
|
||||
if err != nil {
|
||||
slog.Error("failed to read system prompt file", "path", promptPath, "error", err)
|
||||
slog.Error("failed to read system prompt file", "path", *systemPromptFile, "error", err)
|
||||
os.Exit(1)
|
||||
}
|
||||
additionalPrompt = string(data)
|
||||
@@ -236,7 +248,13 @@ func main() {
|
||||
}
|
||||
|
||||
// Step 7: Budget-aware prompt assembly
|
||||
systemBase := review.BuildSystemBase()
|
||||
var systemBase string
|
||||
if persona != nil {
|
||||
systemBase = review.BuildPersonaSystemPrompt(persona)
|
||||
slog.Debug("using persona system prompt", "persona", persona.Name)
|
||||
} else {
|
||||
systemBase = review.BuildSystemBase()
|
||||
}
|
||||
if additionalPrompt != "" {
|
||||
systemBase += "\n\n## Additional Review Instructions\n\n" + additionalPrompt
|
||||
}
|
||||
@@ -293,7 +311,12 @@ func main() {
|
||||
slog.Info("review parsed", "verdict", result.Verdict, "findings", len(result.Findings))
|
||||
|
[MINOR] Untrusted persona display_name (from a repo-controlled persona-file) is passed directly to FormatMarkdownWithDisplay and rendered into review Markdown without escaping. If the hosting platform's sanitizer is bypassable, this could enable Markdown/HTML injection in the review header/footer. Consider sanitizing or escaping displayName (and restricting to a safe character set) before inclusion. **[MINOR]** Untrusted persona display_name (from a repo-controlled persona-file) is passed directly to FormatMarkdownWithDisplay and rendered into review Markdown without escaping. If the hosting platform's sanitizer is bypassable, this could enable Markdown/HTML injection in the review header/footer. Consider sanitizing or escaping displayName (and restricting to a safe character set) before inclusion.
|
||||
|
||||
// Step 10: Format and post review
|
||||
reviewBody := review.FormatMarkdown(result, *reviewerName)
|
||||
var reviewBody string
|
||||
if persona != nil && persona.DisplayName != "" {
|
||||
reviewBody = review.FormatMarkdownWithDisplay(result, persona.DisplayName, *reviewerName)
|
||||
} else {
|
||||
reviewBody = review.FormatMarkdown(result, *reviewerName)
|
||||
}
|
||||
|
||||
// Add commit footer so readers know which commit was evaluated
|
||||
if pr.Head.Sha != "" {
|
||||
@@ -587,6 +610,43 @@ func validateReviewerName(name string) error {
|
||||
return nil
|
||||
|
[MINOR] validateWorkspacePath relies on string prefix checks for containment. Although it also resolves symlinks and uses a path separator guard, on case-insensitive filesystems and certain UNC/volume edge cases, prefix checks can be brittle. Consider using filepath.Rel to compute the relative path and ensuring it does not start with ".." and that the volume/drive matches, for stronger cross-platform guarantees. **[MINOR]** validateWorkspacePath relies on string prefix checks for containment. Although it also resolves symlinks and uses a path separator guard, on case-insensitive filesystems and certain UNC/volume edge cases, prefix checks can be brittle. Consider using filepath.Rel to compute the relative path and ensuring it does not start with ".." and that the volume/drive matches, for stronger cross-platform guarantees.
gpt-review-bot
commented
[MINOR] validateWorkspacePath uses strings.HasPrefix for path containment checks, which can be brittle on case-insensitive filesystems. Prefer using filepath.Rel and ensuring the result does not start with ".." for more robust containment checks. **[MINOR]** validateWorkspacePath uses strings.HasPrefix for path containment checks, which can be brittle on case-insensitive filesystems. Prefer using filepath.Rel and ensuring the result does not start with ".." for more robust containment checks.
|
||||
}
|
||||
|
||||
// validateWorkspacePath ensures a file path is within the workspace and resolves
|
||||
// symlinks to prevent traversal attacks. Returns the resolved absolute path or
|
||||
// an error if the path is outside the workspace.
|
||||
func validateWorkspacePath(path, pathName string) (string, error) {
|
||||
workspace := os.Getenv("GITHUB_WORKSPACE")
|
||||
if workspace == "" {
|
||||
workspace, _ = os.Getwd()
|
||||
}
|
||||
absWorkspace, err := filepath.Abs(workspace)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("failed to resolve workspace path: %w", err)
|
||||
}
|
||||
|
||||
// Join and clean the path
|
||||
fullPath := filepath.Join(absWorkspace, path)
|
||||
fullPath = filepath.Clean(fullPath)
|
||||
|
[MINOR] Workspace boundary check relies on strings.HasPrefix for path containment. While mitigated by filepath.Clean and EvalSymlinks, using filepath.Rel (and verifying it does not start with "..") would be a more robust, cross-platform defense-in-depth against subtle path normalization edge cases. **[MINOR]** Workspace boundary check relies on strings.HasPrefix for path containment. While mitigated by filepath.Clean and EvalSymlinks, using filepath.Rel (and verifying it does not start with "..") would be a more robust, cross-platform defense-in-depth against subtle path normalization edge cases.
|
||||
|
||||
// Check path is within workspace using filepath.Rel (more robust than HasPrefix)
|
||||
|
gpt-review-bot
commented
[MINOR] validateWorkspacePath uses a string HasPrefix check to ensure paths are within the workspace; this can be brittle across platforms and case-insensitive filesystems. Consider using filepath.Rel and verifying the resulting path does not start with ".." to robustly confirm containment. **[MINOR]** validateWorkspacePath uses a string HasPrefix check to ensure paths are within the workspace; this can be brittle across platforms and case-insensitive filesystems. Consider using filepath.Rel and verifying the resulting path does not start with ".." to robustly confirm containment.
|
||||
rel, err := filepath.Rel(absWorkspace, fullPath)
|
||||
if err != nil || strings.HasPrefix(rel, "..") {
|
||||
return "", fmt.Errorf("%s resolves outside workspace: path=%s workspace=%s", pathName, fullPath, absWorkspace)
|
||||
}
|
||||
|
||||
// Resolve symlinks and re-validate to prevent symlink traversal
|
||||
resolvedPath, err := filepath.EvalSymlinks(fullPath)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("failed to resolve %s: %w", pathName, err)
|
||||
}
|
||||
|
||||
relResolved, err := filepath.Rel(absWorkspace, resolvedPath)
|
||||
if err != nil || strings.HasPrefix(relResolved, "..") {
|
||||
return "", fmt.Errorf("%s symlink resolves outside workspace: resolved=%s workspace=%s", pathName, resolvedPath, absWorkspace)
|
||||
}
|
||||
|
||||
return resolvedPath, nil
|
||||
}
|
||||
|
||||
// buildSupersededBody creates the body for a superseded review: struck-through banner
|
||||
// with collapsed original content and the commit it was evaluated against.
|
||||
func buildSupersededBody(originalBody, commitSHA, newReviewURL, sentinel string) string {
|
||||
|
||||
@@ -6,6 +6,7 @@ import (
|
||||
"log/slog"
|
||||
"os"
|
||||
|
sonnet-review-bot
commented
[NIT] The **[NIT]** The `path/filepath` import is not in alphabetical order relative to the other imports (`os`, `os/exec`, `strings`). `goimports` would place it before `os`. Minor style issue.
sonnet-review-bot
commented
[NIT] The **[NIT]** The `path/filepath` import was added but placed out of order — the Go convention (and `goimports`) groups stdlib imports alphabetically. It should appear between `"os/exec"` and `"strings"`, not after `"strings"`. Minor but `goimports` would reorder this.
|
||||
"os/exec"
|
||||
|
sonnet-review-bot
commented
[NIT] The **[NIT]** The `path/filepath` import is not in alphabetical order relative to the other imports in the block (`os`, `os/exec`, `strings`, `path/filepath`). `goimports` / `gofmt` would sort this. Minor but the project convention is canonical formatting.
|
||||
"path/filepath"
|
||||
|
sonnet-review-bot
commented
[NIT] Import **[NIT]** Import `"path/filepath"` is not in alphabetical order with the other imports — it appears after `"strings"` instead of between `"os/exec"` and `"strings"`. `goimports` would fix this automatically.
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
@@ -45,6 +46,114 @@ func TestValidateReviewerName(t *testing.T) {
|
||||
}
|
||||
|
gpt-review-bot
commented
[MAJOR] TestValidateWorkspacePath case "absolute path gets normalized to relative" expects an error containing "failed to resolve", but validateWorkspacePath returns an error "resolves outside workspace" for an absolute path like /etc/passwd. Update the test to match the actual error message (or adjust validateWorkspacePath to return a consistent error in this scenario). **[MAJOR]** TestValidateWorkspacePath case "absolute path gets normalized to relative" expects an error containing "failed to resolve", but validateWorkspacePath returns an error "resolves outside workspace" for an absolute path like /etc/passwd. Update the test to match the actual error message (or adjust validateWorkspacePath to return a consistent error in this scenario).
|
||||
}
|
||||
|
||||
func TestValidateWorkspacePath(t *testing.T) {
|
||||
// Create a temp directory as our workspace
|
||||
tmpDir := t.TempDir()
|
||||
|
||||
// Create a valid file inside the workspace
|
||||
validFile := filepath.Join(tmpDir, "valid.json")
|
||||
if err := os.WriteFile(validFile, []byte("{}"), 0644); err != nil {
|
||||
t.Fatalf("failed to create test file: %v", err)
|
||||
}
|
||||
|
||||
// Create a subdirectory with a file
|
||||
subDir := filepath.Join(tmpDir, "subdir")
|
||||
if err := os.MkdirAll(subDir, 0755); err != nil {
|
||||
t.Fatalf("failed to create subdir: %v", err)
|
||||
}
|
||||
nestedFile := filepath.Join(subDir, "nested.json")
|
||||
if err := os.WriteFile(nestedFile, []byte("{}"), 0644); err != nil {
|
||||
t.Fatalf("failed to create nested file: %v", err)
|
||||
}
|
||||
|
||||
// Create a symlink pointing outside the workspace
|
||||
symlinkPath := filepath.Join(tmpDir, "evil-symlink.json")
|
||||
if err := os.Symlink("/etc/passwd", symlinkPath); err != nil {
|
||||
t.Fatalf("failed to create symlink: %v", err)
|
||||
}
|
||||
|
||||
// Save and restore GITHUB_WORKSPACE
|
||||
origWorkspace := os.Getenv("GITHUB_WORKSPACE")
|
||||
defer os.Setenv("GITHUB_WORKSPACE", origWorkspace)
|
||||
|
||||
|
gpt-review-bot
commented
[NIT] The comment "filepath.Join strips leading / making it /etc/passwd" is inaccurate on Unix-like systems — filepath.Join with an absolute path ignores the base and yields the absolute path. Update the comment for clarity. **[NIT]** The comment "filepath.Join strips leading / making it <workspace>/etc/passwd" is inaccurate on Unix-like systems — filepath.Join with an absolute path ignores the base and yields the absolute path. Update the comment for clarity.
|
||||
tests := []struct {
|
||||
name string
|
||||
|
gpt-review-bot
commented
[MINOR] TestValidateWorkspacePath expects the error substring 'failed to resolve' for an absolute path ("/etc/passwd"), but validateWorkspacePath returns 'resolves outside workspace' for absolute paths. This mismatch can cause brittle or failing tests; align the expected substring with the actual error or adjust the function to match the test intent. **[MINOR]** TestValidateWorkspacePath expects the error substring 'failed to resolve' for an absolute path ("/etc/passwd"), but validateWorkspacePath returns 'resolves outside workspace' for absolute paths. This mismatch can cause brittle or failing tests; align the expected substring with the actual error or adjust the function to match the test intent.
|
||||
workspace string
|
||||
path string
|
||||
wantErr bool
|
||||
errMatch string
|
||||
}{
|
||||
{
|
||||
name: "valid relative path",
|
||||
workspace: tmpDir,
|
||||
path: "valid.json",
|
||||
wantErr: false,
|
||||
},
|
||||
{
|
||||
name: "valid nested path",
|
||||
workspace: tmpDir,
|
||||
path: "subdir/nested.json",
|
||||
|
gpt-review-bot
commented
[MINOR] The TestValidateWorkspacePath case labeled "absolute path gets normalized to relative" expects an error substring "failed to resolve", but validateWorkspacePath returns an earlier "resolves outside workspace" error for absolute paths. The expectation and test comment are inconsistent with actual behavior and may cause confusion; align the expected error substring to the actual branch. **[MINOR]** The TestValidateWorkspacePath case labeled "absolute path gets normalized to relative" expects an error substring "failed to resolve", but validateWorkspacePath returns an earlier "resolves outside workspace" error for absolute paths. The expectation and test comment are inconsistent with actual behavior and may cause confusion; align the expected error substring to the actual branch.
|
||||
wantErr: false,
|
||||
|
gpt-review-bot
commented
[MAJOR] TestValidateWorkspacePath case "absolute path gets normalized to relative" expects error text containing "failed to resolve", but validateWorkspacePath now returns an error like "resolves outside workspace" for absolute paths outside the workspace. This mismatch will cause the test to fail on Linux. **[MAJOR]** TestValidateWorkspacePath case "absolute path gets normalized to relative" expects error text containing "failed to resolve", but validateWorkspacePath now returns an error like "resolves outside workspace" for absolute paths outside the workspace. This mismatch will cause the test to fail on Linux.
sonnet-review-bot
commented
[NIT] The **[NIT]** The `TestValidateWorkspacePath` test creates a symlink to `/etc/passwd` and doesn't have OS-specific guards. On Windows, `os.Symlink` to `/etc/passwd` would fail or behave differently. The existing tests in the repo don't appear to target Windows, so this is likely acceptable, but it's worth noting for cross-platform CI.
|
||||
},
|
||||
{
|
||||
name: "path traversal attempt",
|
||||
workspace: tmpDir,
|
||||
path: "../../../etc/passwd",
|
||||
wantErr: true,
|
||||
errMatch: "resolves outside workspace",
|
||||
},
|
||||
{
|
||||
name: "absolute path normalized to workspace-relative",
|
||||
workspace: tmpDir,
|
||||
path: "/etc/passwd",
|
||||
wantErr: true,
|
||||
// Go 1.21+ filepath.Join normalizes absolute paths: Join("/tmp/x", "/etc/passwd")
|
||||
// becomes "/tmp/x/etc/passwd", which is within workspace but doesn't exist.
|
||||
errMatch: "failed to resolve",
|
||||
},
|
||||
{
|
||||
name: "nonexistent file",
|
||||
workspace: tmpDir,
|
||||
path: "nonexistent.json",
|
||||
wantErr: true,
|
||||
errMatch: "failed to resolve",
|
||||
},
|
||||
{
|
||||
name: "symlink escaping workspace",
|
||||
workspace: tmpDir,
|
||||
path: "evil-symlink.json",
|
||||
wantErr: true,
|
||||
errMatch: "symlink resolves outside workspace",
|
||||
},
|
||||
}
|
||||
|
||||
for _, tc := range tests {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
os.Setenv("GITHUB_WORKSPACE", tc.workspace)
|
||||
resolved, err := validateWorkspacePath(tc.path, "test-file")
|
||||
|
||||
if tc.wantErr {
|
||||
if err == nil {
|
||||
t.Errorf("expected error for %q, got nil", tc.path)
|
||||
} else if tc.errMatch != "" && !strings.Contains(err.Error(), tc.errMatch) {
|
||||
t.Errorf("error %q should contain %q", err.Error(), tc.errMatch)
|
||||
}
|
||||
} else {
|
||||
if err != nil {
|
||||
t.Errorf("expected no error for %q, got %v", tc.path, err)
|
||||
}
|
||||
if resolved == "" {
|
||||
t.Error("expected non-empty resolved path")
|
||||
}
|
||||
// Verify resolved path is within workspace
|
||||
if !strings.HasPrefix(resolved, tc.workspace) {
|
||||
t.Errorf("resolved path %q not within workspace %q", resolved, tc.workspace)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func makeReview(id int64, login, state string, stale bool, body string) gitea.Review {
|
||||
|
sonnet-review-bot
commented
[NIT] Double blank line after the closing brace of TestValidateWorkspacePath before the makeReview helper. Minor style inconsistency. **[NIT]** Double blank line after the closing brace of TestValidateWorkspacePath before the makeReview helper. Minor style inconsistency.
|
||||
r := gitea.Review{
|
||||
|
sonnet-review-bot
commented
[NIT] There is a double blank line between **[NIT]** There is a double blank line between `TestValidateWorkspacePath` and `makeReview`. Consistent single blank lines between test functions would match the rest of the file.
|
||||
ID: id,
|
||||
@@ -56,7 +165,6 @@ func makeReview(id int64, login, state string, stale bool, body string) gitea.Re
|
||||
return r
|
||||
}
|
||||
|
||||
|
||||
func TestBuildSupersededBody(t *testing.T) {
|
||||
original := "# Review\n\nLooks good.\n\n<!-- review-bot:sonnet -->"
|
||||
sentinel := "<!-- review-bot:sonnet -->"
|
||||
@@ -626,8 +734,8 @@ func TestExtractSentinelName_EdgeCases(t *testing.T) {
|
||||
{"<!-- review-bot:sonnet --> rest", "sonnet"},
|
||||
{"<!-- review-bot:gpt-review --> rest", "gpt-review"},
|
||||
{"no sentinel here", "unknown"},
|
||||
{"<!-- review-bot:", "unknown"}, // prefix but no suffix
|
||||
{"prefix <!-- review-bot:abc --> end", "abc"}, // embedded in text
|
||||
{"<!-- review-bot:", "unknown"}, // prefix but no suffix
|
||||
{"prefix <!-- review-bot:abc --> end", "abc"}, // embedded in text
|
||||
}
|
||||
|
||||
for _, tc := range tests {
|
||||
|
||||
@@ -0,0 +1,334 @@
|
||||
# Design: Role-based Review Personas (Issue #51)
|
||||
|
gpt-review-bot
commented
[NIT] The design doc initially describes YAML personas before noting a switch to JSON at the end. This could confuse readers; consider moving the JSON revision note to the top or removing YAML snippets. **[NIT]** The design doc initially describes YAML personas before noting a switch to JSON at the end. This could confuse readers; consider moving the JSON revision note to the top or removing YAML snippets.
sonnet-review-bot
commented
[NIT] The design doc still references YAML throughout (phase checklist, finding #2, struct tags use **[NIT]** The design doc still references YAML throughout (phase checklist, finding #2, struct tags use `yaml:` annotations) even after the revision section at the bottom switched to JSON. The checklist items reference `security.yaml`, `architect.yaml`, `docs.yaml`. These inconsistencies are confusing for future readers. The doc should be updated to consistently reflect the JSON decision.
sonnet-review-bot
commented
[NIT] The design doc still references YAML in several places (e.g. the Persona struct uses **[NIT]** The design doc still references YAML in several places (e.g. the Persona struct uses `yaml:` tags, the design says 'personas/*.yaml', the completion checklist asks 'Persona struct matches YAML schema exactly?') even though the design was revised to JSON. The doc is internally inconsistent. Since this is a docs file it doesn't affect correctness, but it will confuse future readers.
sonnet-review-bot
commented
[NIT] The design doc references YAML throughout (personas/*.yaml, gopkg.in/yaml.v3) before the revision section at the bottom switches to JSON. The doc would be cleaner if the YAML references in the early sections were updated to reflect the final JSON decision, rather than leaving the revision as an addendum. As a static doc this is low priority. **[NIT]** The design doc references YAML throughout (personas/*.yaml, gopkg.in/yaml.v3) before the revision section at the bottom switches to JSON. The doc would be cleaner if the YAML references in the early sections were updated to reflect the final JSON decision, rather than leaving the revision as an addendum. As a static doc this is low priority.
sonnet-review-bot
commented
[NIT] The design doc still references YAML in several places (Persona struct with **[NIT]** The design doc still references YAML in several places (Persona struct with `yaml:` tags, `.review/personas/trading.yaml`, `security.yaml` phase items) after the JSON revision was made. These are internal doc inconsistencies but worth cleaning up for accuracy.
sonnet-review-bot
commented
[NIT] The design document still references YAML as the persona format in multiple places (e.g., the **[NIT]** The design document still references YAML as the persona format in multiple places (e.g., the `Persona struct` Go code block uses `yaml:` struct tags, the persona file format sections show YAML examples). The design revision section at the bottom explains the switch to JSON, but the earlier sections weren't updated. This creates confusion for anyone reading the doc top-to-bottom. Consider updating or striking through the YAML sections, or noting "see Design Revision below" at the first YAML mention.
sonnet-review-bot
commented
[NIT] The design document mentions **[NIT]** The design document mentions `gopkg.in/yaml.v3` but the implementation uses `github.com/goccy/go-yaml`. The design doc's 'Design Revision' section should be updated to reflect the actual library chosen, or the implementation should use gopkg.in/yaml.v3 as designed.
|
||||
|
||||
> **Note:** This design was revised during implementation to use JSON instead of YAML
|
||||
> to maintain the repository's zero-external-dependencies convention. All persona
|
||||
> files use JSON format. See "Design Revision" section at the end for details.
|
||||
|
||||
## Problem
|
||||
|
||||
Current review-bot performs generic code review. Every reviewer (regardless of `reviewer-name`) uses the same base prompt and evaluates the same concerns. This leads to:
|
||||
|
||||
1. **Redundancy** — Two reviewers (e.g., GPT + Claude twins) often flag identical issues
|
||||
2. **Gaps** — Generic reviewers miss specialized concerns (security, domain logic, architecture)
|
||||
3. **Noise** — NITs about style mixed with critical security findings
|
||||
4. **No ownership** — Findings lack clear domain attribution
|
||||
|
||||
## Constraints
|
||||
|
||||
- Must work with existing CLI flags and CI workflow patterns
|
||||
- Must not break backwards compatibility (existing configs still work)
|
||||
- Must integrate cleanly with the budget system (personas add to context)
|
||||
- Multiple personas running in parallel must not interfere with each other
|
||||
- Each persona must have clear scope boundaries (no duplication)
|
||||
|
||||
## Proposed Approach
|
||||
|
||||
### 1. Persona Definition
|
||||
|
||||
A persona is a named review role with:
|
||||
- **Identity** — Who am I? What's my expertise?
|
||||
- **Focus** — What do I look for?
|
||||
- **Scope boundaries** — What do I explicitly NOT comment on?
|
||||
- **Severity calibration** — What counts as MAJOR/MINOR/NIT for MY domain?
|
||||
|
||||
Personas are defined in JSON files that can live:
|
||||
1. In the pattern repos (shared across projects)
|
||||
2. In the target repo (project-specific personas)
|
||||
3. Inline via a new `--persona-file` flag (JSON format)
|
||||
|
||||
### 2. Persona File Format
|
||||
|
||||
```json
|
||||
# .review/personas/security.yaml
|
||||
name: security
|
||||
display_name: Security Specialist
|
||||
model_preference: opus # optional hint for expensive analysis
|
||||
|
||||
identity: |
|
||||
You are a security specialist reviewing code for vulnerabilities.
|
||||
Your expertise: OWASP Top 10, injection attacks, auth/authz, secrets management,
|
||||
event sourcing security (replay attacks, event injection).
|
||||
|
||||
focus:
|
||||
- Injection attacks (SQL, command, path traversal, template)
|
||||
- Authentication and authorization gaps
|
||||
- Secrets exposure (hardcoded credentials, tokens in logs)
|
||||
- Input validation (unsanitized input, unsafe deserialization)
|
||||
- Race conditions with security implications
|
||||
- Event sourcing attack vectors
|
||||
|
||||
ignore:
|
||||
- Code style and naming conventions
|
||||
- Performance (unless security-related)
|
||||
- Documentation
|
||||
- General code quality
|
||||
- Test coverage
|
||||
|
||||
severity:
|
||||
critical: "Remote code execution, auth bypass, data exfiltration"
|
||||
major: "Privilege escalation, information disclosure, DoS"
|
||||
minor: "Missing rate limiting, verbose errors"
|
||||
nit: "Theoretical risk with low exploitability"
|
||||
|
||||
output_format: |
|
||||
For each finding:
|
||||
- Severity: [CRITICAL|MAJOR|MINOR|NIT]
|
||||
- Attack vector: How could this be exploited?
|
||||
|
sonnet-review-bot
commented
[MINOR] The design document shows a YAML persona file format as the primary example (with **[MINOR]** The design document shows a YAML persona file format as the primary example (with `# .review/personas/security.yaml` header and YAML syntax like `name: security`) but the implementation uses JSON. The design revision note at the bottom explains this, but the main body of the document is misleading — it still shows YAML syntax in the code block under '2. Persona File Format'. This will confuse future contributors reading the design doc.
|
||||
- Evidence: Code snippet showing the vulnerability
|
||||
- Recommendation: Specific fix
|
||||
```
|
||||
|
||||
### 3. New CLI Flags
|
||||
|
||||
```
|
||||
--persona-file PATH Path to persona JSON file (local or in repo)
|
||||
--persona NAME Built-in persona name (security, architect, domain)
|
||||
```
|
||||
|
||||
Either flag sets the persona. If neither is provided, behavior is unchanged (generic review).
|
||||
|
||||
### 4. Prompt Assembly
|
||||
|
||||
Current flow:
|
||||
```
|
||||
SystemBase → Patterns → Conventions → [LLM]
|
||||
```
|
||||
|
||||
New flow with persona:
|
||||
```
|
||||
PersonaPrompt (from YAML) → Patterns (filtered?) → Conventions → [LLM]
|
||||
```
|
||||
|
||||
The persona's identity/focus/ignore/severity sections become the system prompt, replacing the generic "You are an expert code reviewer" base.
|
||||
|
||||
### 5. Built-in Personas
|
||||
|
||||
Ship with these built-in personas (loadable via `--persona NAME`):
|
||||
|
||||
| Name | Focus |
|
||||
|------|-------|
|
||||
| `security` | Vulnerabilities, auth, secrets |
|
||||
| `architect` | Patterns, consistency, design |
|
||||
| `domain` | Business logic (requires repo-specific config) |
|
||||
| `docs` | Documentation, API clarity |
|
||||
|
||||
Built-in personas live in `review/personas/` as embedded Go assets or YAML shipped with the binary.
|
||||
|
||||
### 6. CI Workflow Integration
|
||||
|
||||
Single persona:
|
||||
```yaml
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: security
|
||||
persona: security
|
||||
...
|
||||
```
|
||||
|
||||
Multiple personas (parallel jobs):
|
||||
```yaml
|
||||
jobs:
|
||||
review:
|
||||
strategy:
|
||||
matrix:
|
||||
include:
|
||||
- name: security
|
||||
persona: security
|
||||
- name: architect
|
||||
persona: architect
|
||||
steps:
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: ${{ matrix.name }}
|
||||
persona: ${{ matrix.persona }}
|
||||
```
|
||||
|
||||
Custom persona from repo:
|
||||
```yaml
|
||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||
with:
|
||||
reviewer-name: trading
|
||||
persona-file: .review/personas/trading.yaml
|
||||
```
|
||||
|
||||
### 7. Persona + Patterns Interaction
|
||||
|
||||
Some personas benefit from filtered patterns:
|
||||
- Security → only security-related patterns
|
||||
- Architect → all patterns (structural focus)
|
||||
- Domain → domain docs, not language patterns
|
||||
|
||||
For v1, keep it simple: all patterns are included regardless of persona. Future enhancement could add `patterns_filter` to persona YAML.
|
||||
|
||||
### 8. Output Format Changes
|
||||
|
||||
Persona name appears in the review header:
|
||||
```markdown
|
||||
# Security Review
|
||||
|
||||
## Summary
|
||||
No critical vulnerabilities found in this change.
|
||||
|
||||
## Findings
|
||||
| # | Severity | File | Line | Finding |
|
||||
...
|
||||
|
||||
## Recommendation
|
||||
**APPROVE** — No security-relevant issues detected.
|
||||
|
||||
---
|
||||
*Review by security*
|
||||
<!-- review-bot:security -->
|
||||
```
|
||||
|
||||
## State/Data Model
|
||||
|
||||
### Persona struct
|
||||
|
||||
```go
|
||||
// review/persona.go
|
||||
type Persona struct {
|
||||
Name string `yaml:"name"`
|
||||
DisplayName string `yaml:"display_name"`
|
||||
ModelPref string `yaml:"model_preference,omitempty"`
|
||||
Identity string `yaml:"identity"`
|
||||
Focus []string `yaml:"focus"`
|
||||
Ignore []string `yaml:"ignore"`
|
||||
Severity Severity `yaml:"severity"`
|
||||
OutputFormat string `yaml:"output_format,omitempty"`
|
||||
}
|
||||
|
||||
type Severity struct {
|
||||
Critical string `yaml:"critical"`
|
||||
Major string `yaml:"major"`
|
||||
Minor string `yaml:"minor"`
|
||||
Nit string `yaml:"nit"`
|
||||
}
|
||||
```
|
||||
|
||||
### Loading precedence
|
||||
|
||||
1. `--persona-file PATH` → load from local file system
|
||||
2. `--persona NAME` → load from embedded built-ins
|
||||
3. Neither → use generic system prompt (current behavior)
|
||||
|
||||
## Error Cases
|
||||
|
||||
| Error | Handling |
|
||||
|-------|----------|
|
||||
| Persona file not found | Fatal exit with clear message |
|
||||
| Invalid YAML in persona file | Fatal exit with parse error |
|
||||
| Both `--persona` and `--persona-file` specified | Fatal exit: mutually exclusive |
|
||||
| Unknown built-in persona name | Fatal exit with list of valid names |
|
||||
| Empty identity in persona | Warning, fall back to generic prompt |
|
||||
|
||||
## Edge Cases
|
||||
|
||||
- **Empty focus list**: Valid — persona relies on identity alone
|
||||
- **Empty ignore list**: Valid — no explicit scope exclusions
|
||||
- **No severity section**: Use default MAJOR/MINOR/NIT definitions
|
||||
- **Model preference set but budget insufficient**: Ignore preference, log warning
|
||||
- **Persona file in pattern repo**: Fetch like other pattern files
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Unit tests
|
||||
- `persona_test.go`: Parse valid/invalid YAML, validate required fields
|
||||
- `prompt_test.go`: Verify persona prompt assembly
|
||||
- Integration with budget: persona prompts count toward token limit
|
||||
|
||||
### Integration tests
|
||||
- End-to-end with `--persona security` (built-in)
|
||||
- End-to-end with `--persona-file custom.yaml`
|
||||
- Backwards compatibility: no flags = generic behavior
|
||||
|
||||
### Manual verification
|
||||
- Run security persona on a PR with obvious vulnerability
|
||||
- Verify security persona ignores style issues
|
||||
- Verify non-security persona doesn't flag security issues
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1: Persona types and loading
|
||||
- [ ] `review/persona.go`: Persona struct + YAML parsing
|
||||
- [ ] `review/persona_test.go`: Unit tests
|
||||
- [ ] Embed built-in personas in binary
|
||||
- [ ] Compiles clean, tests pass
|
||||
|
||||
### Phase 2: Prompt generation
|
||||
- [ ] `review/prompt.go`: `BuildPersonaPrompt(p Persona) string`
|
||||
- [ ] Modify `BuildSystemBase()` to accept optional persona
|
||||
- [ ] Integrate persona prompt with budget system
|
||||
- [ ] Tests for prompt assembly
|
||||
|
||||
### Phase 3: CLI integration
|
||||
- [ ] Add `--persona` and `--persona-file` flags
|
||||
- [ ] Flag validation (mutually exclusive, valid names)
|
||||
- [ ] Load persona based on flags
|
||||
- [ ] Pass persona to prompt builder
|
||||
|
||||
### Phase 4: Action integration
|
||||
- [ ] Add `persona` and `persona-file` inputs to action.yml
|
||||
- [ ] Update README with persona examples
|
||||
- [ ] End-to-end CI test
|
||||
|
||||
### Phase 5: Built-in personas
|
||||
- [ ] `security.yaml` built-in
|
||||
- [ ] `architect.yaml` built-in
|
||||
- [ ] `docs.yaml` built-in
|
||||
- [ ] Document each persona's focus
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Persona file location in repo**: Should we support `--persona-file .review/security.yaml` where the file is fetched from the PR's repo (like conventions)? This adds complexity but enables project-specific personas without action changes.
|
||||
|
||||
2. **Model preference enforcement**: If persona specifies `model_preference: opus` but the action uses a different model, should we warn? Override? Ignore? Current thinking: log warning, use the specified model (user controls model via action input).
|
||||
|
||||
3. **Severity override output**: If persona defines custom severity levels (CRITICAL), should the JSON output include them, or map back to standard MAJOR/MINOR/NIT? Current thinking: keep standard output format, use severity calibration only for prompt guidance.
|
||||
|
||||
## Completion Checklist
|
||||
|
||||
1. Persona struct matches YAML schema exactly?
|
||||
2. Built-in personas embedded in binary (not external files)?
|
||||
3. `--persona` and `--persona-file` are mutually exclusive?
|
||||
4. Unknown persona name produces clear error with valid options?
|
||||
5. Empty persona file fields have sensible defaults?
|
||||
6. Persona prompt integrates with budget system (token counting)?
|
||||
7. Backwards compatibility: no flags = current behavior?
|
||||
8. Review header shows persona display name?
|
||||
9. Sentinel still uses reviewer-name (not persona name)?
|
||||
10. Unit tests cover parse errors, missing fields, valid YAML?
|
||||
|
gpt-review-bot
commented
[MINOR] The design document states a "Design Revision: JSON Instead of YAML" to preserve zero dependencies, but the implementation retains YAML parsing and adds a YAML dependency. Align docs and code (prefer JSON-only) to match project conventions. **[MINOR]** The design document states a "Design Revision: JSON Instead of YAML" to preserve zero dependencies, but the implementation retains YAML parsing and adds a YAML dependency. Align docs and code (prefer JSON-only) to match project conventions.
|
||||
|
||||
## Design Review Findings (Self-Review)
|
||||
|
||||
### Finding 1: Severity Mapping
|
||||
|
gpt-review-bot
commented
[MINOR] Design doc states 'Decision: Add gopkg.in/yaml.v3' but the implementation uses github.com/goccy/go-yaml. Update the design document to reflect the actual library choice or switch the implementation to match the doc. **[MINOR]** Design doc states 'Decision: Add gopkg.in/yaml.v3' but the implementation uses github.com/goccy/go-yaml. Update the design document to reflect the actual library choice or switch the implementation to match the doc.
|
||||
The persona YAML allows `critical` severity, but the LLM output parser (`review/parser.go`) only accepts MAJOR/MINOR/NIT.
|
||||
|
||||
**Resolution:** Keep standard output format. Persona severity section is ONLY for calibrating the LLM's judgment (prompt guidance). Output must still use MAJOR/MINOR/NIT. Document this clearly in persona format docs.
|
||||
|
||||
### Finding 2: Embedding Built-in Personas
|
||||
Go doesn't natively embed YAML. Must use `//go:embed` directive (Go 1.16+).
|
||||
|
||||
**Resolution:** Create `review/personas/` directory with YAML files and use:
|
||||
```go
|
||||
//go:embed personas/*.yaml
|
||||
var embeddedPersonas embed.FS
|
||||
```
|
||||
|
||||
### Finding 3: display_name vs reviewer-name
|
||||
Design says header shows "persona display name" but sentinel uses "reviewer-name". This is correct - they serve different purposes:
|
||||
- `display_name` → human-readable header ("Security Specialist Review")
|
||||
|
gpt-review-bot
commented
[MINOR] Design doc contradicts repository conventions and implementation: it states a 'Design Revision: YAML with gopkg.in/yaml.v3' (adding an external dependency), while the actual implementation uses JSON and maintains zero external dependencies. This may confuse contributors. **[MINOR]** Design doc contradicts repository conventions and implementation: it states a 'Design Revision: YAML with gopkg.in/yaml.v3' (adding an external dependency), while the actual implementation uses JSON and maintains zero external dependencies. This may confuse contributors.
|
||||
- `reviewer-name` → machine sentinel for cleanup (`<!-- review-bot:security -->`)
|
||||
|
||||
When persona is used, `display_name` takes precedence for the header title, but `reviewer-name` (CLI flag) is still used for the sentinel.
|
||||
|
||||
## Design Revision: YAML with gopkg.in/yaml.v3
|
||||
|
||||
**Decision:** Add `gopkg.in/yaml.v3` as a dependency.
|
||||
|
||||
YAML is preferred over JSON for persona files because:
|
||||
- Multi-line strings are cleaner (no escaping quotes in identity/focus text)
|
||||
- Comments are supported for documentation
|
||||
- More human-readable for complex persona definitions
|
||||
|
||||
The implementation supports both YAML (`.yaml`, `.yml`) and JSON (`.json`) for backwards compatibility, with YAML as the default for built-in personas.
|
||||
@@ -7,10 +7,37 @@ import (
|
||||
|
||||
// FormatMarkdown formats a ReviewResult into the markdown body for a Gitea review.
|
||||
func FormatMarkdown(result *ReviewResult, reviewerName string) string {
|
||||
return FormatMarkdownWithDisplay(result, reviewerName, reviewerName)
|
||||
}
|
||||
|
||||
// GiteaEvent converts the verdict to the Gitea API event string.
|
||||
func GiteaEvent(verdict string) string {
|
||||
switch verdict {
|
||||
case "APPROVE":
|
||||
return "APPROVED"
|
||||
case "REQUEST_CHANGES":
|
||||
return "REQUEST_CHANGES"
|
||||
default:
|
||||
return "COMMENT"
|
||||
|
[NIT] DisplayName is inserted into the markdown header/body without escaping. While Gitea typically sanitizes markdown, if this output is ever reused in a different renderer, unescaped content could cause formatting issues. Consider conservative sanitization or documenting the assumption that Gitea sanitization is always applied. **[NIT]** DisplayName is inserted into the markdown header/body without escaping. While Gitea typically sanitizes markdown, if this output is ever reused in a different renderer, unescaped content could cause formatting issues. Consider conservative sanitization or documenting the assumption that Gitea sanitization is always applied.
|
||||
}
|
||||
}
|
||||
|
||||
// FormatMarkdownWithDisplay formats a ReviewResult with separate display name and sentinel name.
|
||||
|
[MINOR] Display names are intentionally not HTML-escaped and are included in Markdown. If a persona is loaded from a repo file, a crafted display_name could embed Markdown constructs (e.g., images) that cause remote content loads when rendered. Consider restricting display_name to a conservative character set or escaping/sanitizing to reduce content-injection/tracking risks, even if Gitea sanitizes most HTML. **[MINOR]** Display names are intentionally not HTML-escaped and are included in Markdown. If a persona is loaded from a repo file, a crafted display_name could embed Markdown constructs (e.g., images) that cause remote content loads when rendered. Consider restricting display_name to a conservative character set or escaping/sanitizing to reduce content-injection/tracking risks, even if Gitea sanitizes most HTML.
|
||||
// Note: displayName is not HTML-escaped as Gitea sanitizes rendered Markdown.
|
||||
// Persona display names are controlled by repo owners (trusted input).
|
||||
|
sonnet-review-bot
commented
[MINOR] The comment on **[MINOR]** The comment on `FormatMarkdownWithDisplay` says 'displayName is not HTML-escaped as Gitea sanitizes rendered Markdown. Persona display names are controlled by repo owners (trusted input).' While the trust reasoning is sound for built-in personas, `--persona-file` allows the display name to come from any YAML file in the workspace. If a workspace file is somehow malicious (e.g., a compromised dependency's config), the unescaped display name could inject markdown. This is a low-risk but worth noting given the security context of this tool.
|
||||
// displayName is used for the header title, sentinelName is used for the cleanup sentinel.
|
||||
|
gpt-review-bot
commented
[NIT] Header capitalization uses byte indexing (strings.ToUpper(headerName[:1]) + headerName[1:]) which is not Unicode-safe for multi-byte runes. Consider using utf8.DecodeRuneInString to handle non-ASCII display names correctly. **[NIT]** Header capitalization uses byte indexing (strings.ToUpper(headerName[:1]) + headerName[1:]) which is not Unicode-safe for multi-byte runes. Consider using utf8.DecodeRuneInString to handle non-ASCII display names correctly.
[MINOR] Persona display_name is inserted into the Markdown header and footer without escaping. While Gitea sanitizes rendered Markdown, if a custom or differently configured renderer is used, a crafted display_name could inject Markdown (e.g., images/links) and cause external resource loads in the PR UI. Consider stripping newlines and restricting characters or explicitly escaping to reduce injection surface. **[MINOR]** Persona display_name is inserted into the Markdown header and footer without escaping. While Gitea sanitizes rendered Markdown, if a custom or differently configured renderer is used, a crafted display_name could inject Markdown (e.g., images/links) and cause external resource loads in the PR UI. Consider stripping newlines and restricting characters or explicitly escaping to reduce injection surface.
|
||||
// If displayName is empty, sentinelName is used for both.
|
||||
func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName string) string {
|
||||
var sb strings.Builder
|
||||
|
||||
if reviewerName != "" {
|
||||
title := strings.ToUpper(reviewerName[:1]) + reviewerName[1:]
|
||||
// Use display name for header, or fall back to sentinel name
|
||||
headerName := displayName
|
||||
|
sonnet-review-bot
commented
[MINOR] The comment says 'displayName is not HTML-escaped as Gitea sanitizes rendered Markdown' and 'Persona display names are controlled by repo owners (trusted input)'. However, **[MINOR]** The comment says 'displayName is not HTML-escaped as Gitea sanitizes rendered Markdown' and 'Persona display names are controlled by repo owners (trusted input)'. However, `displayName` can come from a `--persona-file` whose path is validated for workspace containment but whose *contents* (the `display_name` field) are not sanitized. A persona file checked into a repo could contain something like `display_name: "] Review\n\n<script>` in a pathological case. Since Gitea does sanitize Markdown-rendered HTML, this is low risk, but the trust assertion in the comment ('controlled by repo owners') is slightly inaccurate — it could also come from any file a developer adds to the repo. The comment should be more precise or dropped.
|
||||
if headerName == "" {
|
||||
headerName = sentinelName
|
||||
|
sonnet-review-bot
commented
[MINOR] The comment says **[MINOR]** The comment says `displayName is not HTML-escaped as Gitea sanitizes rendered Markdown. Persona display names are controlled by repo owners (trusted input).` While display names are set by repo owners (trusted), they come from YAML files that could be committed by contributors. If a malicious contributor adds a `display_name` containing `-->` or similar, it could break the sentinel comment or inject content into the review body. The sentinel is protected (uses `sentinelName` from `reviewerName` which is validated), but the header `# <displayName> Review` is unescaped. This is low severity given Gitea sanitizes Markdown, but the trust model statement in the comment is slightly overstated.
|
||||
}
|
||||
|
||||
if headerName != "" {
|
||||
title := CapitalizeFirst(headerName)
|
||||
sb.WriteString(fmt.Sprintf("# %s Review\n\n", title))
|
||||
}
|
||||
|
||||
@@ -33,23 +60,11 @@ func FormatMarkdown(result *ReviewResult, reviewerName string) string {
|
||||
sb.WriteString("## Recommendation\n\n")
|
||||
|
gpt-review-bot
commented
[NIT] The new **[NIT]** The new `FormatMarkdownWithDisplay` largely duplicates `FormatMarkdown`. Consider refactoring to reduce duplication (e.g., have one function take both display and sentinel names, or have one call the other with appropriate parameters).
gpt-review-bot
commented
[NIT] Header title capitalization uses byte slicing (headerName[:1]) which can mis-handle non-ASCII names (multi-byte runes). Consider using unicode/utf8 to upper-case the first rune safely. **[NIT]** Header title capitalization uses byte slicing (headerName[:1]) which can mis-handle non-ASCII names (multi-byte runes). Consider using unicode/utf8 to upper-case the first rune safely.
|
||||
sb.WriteString(fmt.Sprintf("**%s** — %s\n", result.Verdict, result.Recommendation))
|
||||
|
[NIT] Persona DisplayName is inserted into the review header and footer without escaping. Although Gitea typically sanitizes markdown, an attacker-controlled DisplayName from a repo persona file could inject undesirable markdown. Consider sanitizing or constraining DisplayName to a safe character set to reduce reliance on downstream sanitization. **[NIT]** Persona DisplayName is inserted into the review header and footer without escaping. Although Gitea typically sanitizes markdown, an attacker-controlled DisplayName from a repo persona file could inject undesirable markdown. Consider sanitizing or constraining DisplayName to a safe character set to reduce reliance on downstream sanitization.
|
||||
|
||||
|
gpt-review-bot
commented
[MINOR] FormatMarkdown and FormatMarkdownWithDisplay duplicate substantial formatting logic. Consider refactoring to a shared helper to reduce drift and maintenance overhead. **[MINOR]** FormatMarkdown and FormatMarkdownWithDisplay duplicate substantial formatting logic. Consider refactoring to a shared helper to reduce drift and maintenance overhead.
|
||||
if reviewerName != "" {
|
||||
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", reviewerName))
|
||||
if sentinelName != "" {
|
||||
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", headerName))
|
||||
// Hidden sentinel for identifying this bot's reviews during cleanup
|
||||
|
sonnet-review-bot
commented
[MINOR] The title capitalisation **[MINOR]** The title capitalisation `strings.ToUpper(headerName[:1]) + headerName[1:]` is duplicated in both `FormatMarkdown` (line 12) and `FormatMarkdownWithDisplay` (line 65). This byte-slice indexing also breaks on multi-byte (non-ASCII) first characters. Use `strings.ToUpper(string([]rune(headerName)[:1])) + string([]rune(headerName)[1:])` or `unicode.ToUpper` for correctness, and extract it into a helper.
|
||||
sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", reviewerName))
|
||||
sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", sentinelName))
|
||||
|
[MINOR] Persona display_name is used directly in the markdown header/footer without escaping. While the cleanup sentinel uses the validated reviewer-name, a custom persona file could inject markdown mentions or formatting into the posted review. Consider sanitizing or constraining display_name (e.g., limit characters or escape markdown) to reduce potential content/mention abuse. **[MINOR]** Persona display_name is used directly in the markdown header/footer without escaping. While the cleanup sentinel uses the validated reviewer-name, a custom persona file could inject markdown mentions or formatting into the posted review. Consider sanitizing or constraining display_name (e.g., limit characters or escape markdown) to reduce potential content/mention abuse.
|
||||
}
|
||||
|
||||
return sb.String()
|
||||
}
|
||||
|
||||
// GiteaEvent converts the verdict to the Gitea API event string.
|
||||
func GiteaEvent(verdict string) string {
|
||||
switch verdict {
|
||||
case "APPROVE":
|
||||
return "APPROVED"
|
||||
case "REQUEST_CHANGES":
|
||||
return "REQUEST_CHANGES"
|
||||
default:
|
||||
return "COMMENT"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -159,3 +159,58 @@ func TestFormatMarkdown_RoleTitle(t *testing.T) {
|
||||
t.Error("should not contain role title header when reviewer name is empty")
|
||||
}
|
||||
}
|
||||
|
||||
func TestFormatMarkdownWithDisplay(t *testing.T) {
|
||||
result := &ReviewResult{
|
||||
Verdict: "APPROVE",
|
||||
Summary: "Test summary",
|
||||
Findings: nil,
|
||||
Recommendation: "Test recommendation",
|
||||
}
|
||||
|
||||
t.Run("with display name", func(t *testing.T) {
|
||||
body := FormatMarkdownWithDisplay(result, "Security Specialist", "security")
|
||||
|
||||
// Header should use display name
|
||||
if !strings.Contains(body, "# Security Specialist Review") {
|
||||
t.Error("header should use display name")
|
||||
}
|
||||
|
||||
// Sentinel should use sentinel name
|
||||
if !strings.Contains(body, "<!-- review-bot:security -->") {
|
||||
t.Error("sentinel should use sentinel name")
|
||||
}
|
||||
|
||||
// Footer "Review by" should use display name
|
||||
if !strings.Contains(body, "*Review by Security Specialist*") {
|
||||
t.Error("footer should use display name")
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("without display name", func(t *testing.T) {
|
||||
body := FormatMarkdownWithDisplay(result, "", "reviewer")
|
||||
|
||||
// Should fall back to sentinel name for header
|
||||
if !strings.Contains(body, "# Reviewer Review") {
|
||||
t.Error("header should fall back to sentinel name")
|
||||
}
|
||||
|
||||
if !strings.Contains(body, "<!-- review-bot:reviewer -->") {
|
||||
t.Error("sentinel should use sentinel name")
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("empty both names", func(t *testing.T) {
|
||||
body := FormatMarkdownWithDisplay(result, "", "")
|
||||
|
||||
// Should not have header
|
||||
if strings.Contains(body, "# ") && strings.Contains(body, " Review") {
|
||||
t.Error("should not have header when both names empty")
|
||||
}
|
||||
|
||||
// Should not have sentinel
|
||||
if strings.Contains(body, "<!-- review-bot:") {
|
||||
t.Error("should not have sentinel when sentinel name empty")
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
@@ -0,0 +1,112 @@
|
||||
package review
|
||||
|
||||
import (
|
||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `filepath` is imported but used only for `filepath.Join("personas", filename)` in `LoadBuiltinPersona`. Since `embed.FS` uses forward-slash paths on all platforms (per Go spec), using `"personas/" + filename` (plain string concatenation) would be both simpler and more correct — `filepath.Join` uses the OS separator on Windows, which would break the embed.FS lookup.
|
||||
"embed"
|
||||
"encoding/json"
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `path/filepath` is imported but only used for `filepath.Join("personas", filename)` in `LoadBuiltinPersona`. Since the path separator for embedded FS is always `/` (regardless of OS), `embed.FS` paths must use forward slashes per the `io/fs` spec. Using `filepath.Join` here works on Linux/macOS but would produce `personas\security` on Windows if that ever becomes a build target. Using `"personas/" + filename` directly would be safer and removes the need for the `path/filepath` import.
|
||||
"fmt"
|
||||
"os"
|
||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `path/filepath` is imported and used in `LoadBuiltinPersona` to construct `filepath.Join("personas", filename)`. On Windows this would produce `personas\filename.json`, but `embed.FS` always uses forward slashes. This should use `"personas/" + filename` directly (or `path.Join` from the `path` package, not `filepath`). This is a correctness bug on non-Unix platforms, though it won't affect Linux CI runners.
|
||||
"strings"
|
||||
"unicode/utf8"
|
||||
)
|
||||
|
||||
//go:embed personas/*.json
|
||||
|
sonnet-review-bot
commented
[MAJOR] Imports gopkg.in/yaml.v3 in violation of the zero-external-dependencies convention. The parsePersona function should use only encoding/json. Built-in personas stored as .yaml files should either be converted to .json or parsed with a minimal hand-rolled YAML subset — though the simplest fix consistent with the design document's own resolution is to store built-in personas as .json and remove the yaml.v3 import entirely. **[MAJOR]** Imports gopkg.in/yaml.v3 in violation of the zero-external-dependencies convention. The parsePersona function should use only encoding/json. Built-in personas stored as .yaml files should either be converted to .json or parsed with a minimal hand-rolled YAML subset — though the simplest fix consistent with the design document's own resolution is to store built-in personas as .json and remove the yaml.v3 import entirely.
|
||||
var embeddedPersonas embed.FS
|
||||
|
||||
// Persona defines a specialized review role with focused expertise.
|
||||
type Persona struct {
|
||||
Name string `json:"name"`
|
||||
|
gpt-review-bot
commented
[MAJOR] Imports gopkg.in/yaml.v3 and parses YAML personas, conflicting with the project’s zero-dependency requirement. Either remove YAML support or replace with JSON-only parsing. **[MAJOR]** Imports gopkg.in/yaml.v3 and parses YAML personas, conflicting with the project’s zero-dependency requirement. Either remove YAML support or replace with JSON-only parsing.
|
||||
DisplayName string `json:"display_name"`
|
||||
|
gpt-review-bot
commented
[MINOR] Persona.ModelPref is parsed but not used anywhere. Consider documenting intended usage or removing until needed to reduce API surface. **[MINOR]** Persona.ModelPref is parsed but not used anywhere. Consider documenting intended usage or removing until needed to reduce API surface.
|
||||
ModelPref string `json:"model_preference,omitempty"`
|
||||
|
gpt-review-bot
commented
[NIT] Persona.ModelPref (model_preference) is defined but not used anywhere. Consider either wiring it into model selection or removing it to avoid confusion. **[NIT]** Persona.ModelPref (model_preference) is defined but not used anywhere. Consider either wiring it into model selection or removing it to avoid confusion.
|
||||
Identity string `json:"identity"`
|
||||
Focus []string `json:"focus"`
|
||||
Ignore []string `json:"ignore"`
|
||||
Severity Severity `json:"severity"`
|
||||
OutputFormat string `json:"output_format,omitempty"`
|
||||
}
|
||||
|
||||
// Severity defines what constitutes each severity level for this persona.
|
||||
// These are prompt guidance for the LLM, not output format changes.
|
||||
type Severity struct {
|
||||
Major string `json:"major"`
|
||||
Minor string `json:"minor"`
|
||||
Nit string `json:"nit"`
|
||||
}
|
||||
|
[MAJOR] LoadPersona reads an arbitrary file path from the local filesystem without any path normalization, workspace boundary checks, or symlink resolution. In CI, a misconfigured workflow or malicious change could cause the bot to read files outside the repository workspace and include their contents (via persona fields) in the LLM system prompt, risking data exfiltration. Mirror the strict workspace + symlink checks used for system-prompt-file. **[MAJOR]** LoadPersona reads an arbitrary file path from the local filesystem without any path normalization, workspace boundary checks, or symlink resolution. In CI, a misconfigured workflow or malicious change could cause the bot to read files outside the repository workspace and include their contents (via persona fields) in the LLM system prompt, risking data exfiltration. Mirror the strict workspace + symlink checks used for system-prompt-file.
[MINOR] Persona files from the repository are read and unmarshaled without an explicit size limit. A very large YAML/JSON persona file could cause elevated memory/CPU usage during os.ReadFile and yaml/json unmarshaling, enabling a potential CI resource exhaustion vector. Consider enforcing a maximum file size before reading/unmarshaling and rejecting oversized files. **[MINOR]** Persona files from the repository are read and unmarshaled without an explicit size limit. A very large YAML/JSON persona file could cause elevated memory/CPU usage during os.ReadFile and yaml/json unmarshaling, enabling a potential CI resource exhaustion vector. Consider enforcing a maximum file size before reading/unmarshaling and rejecting oversized files.
|
||||
|
||||
// LoadPersona loads a persona from a JSON file path.
|
||||
func LoadPersona(path string) (*Persona, error) {
|
||||
data, err := os.ReadFile(path)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("read persona file %s: %w", path, err)
|
||||
|
[MINOR] LoadPersona reads the entire persona file with os.ReadFile without a size cap. In CI contexts where persona-file points to a repository file, a very large JSON could cause excessive memory usage. Consider adding a size check (e.g., via os.Stat) or a reasonable read limit before loading. **[MINOR]** LoadPersona reads the entire persona file with os.ReadFile without a size cap. In CI contexts where persona-file points to a repository file, a very large JSON could cause excessive memory usage. Consider adding a size check (e.g., via os.Stat) or a reasonable read limit before loading.
|
||||
}
|
||||
return parsePersona(data, path)
|
||||
|
[MAJOR] LoadPersona uses os.ReadFile on the provided path without checks. Combined with the caller’s lack of path/symlink validation, this permits reading arbitrary paths on the runner (including device nodes or large files), enabling DoS and possible sensitive data leakage if the file parses as JSON. **[MAJOR]** LoadPersona uses os.ReadFile on the provided path without checks. Combined with the caller’s lack of path/symlink validation, this permits reading arbitrary paths on the runner (including device nodes or large files), enabling DoS and possible sensitive data leakage if the file parses as JSON.
|
||||
}
|
||||
|
||||
// LoadBuiltinPersona loads a built-in persona by name.
|
||||
// Returns an error if the persona doesn't exist.
|
||||
func LoadBuiltinPersona(name string) (*Persona, error) {
|
||||
filename := name + ".json"
|
||||
data, err := embeddedPersonas.ReadFile("personas/" + filename) // embed.FS paths use forward slashes per io/fs spec
|
||||
|
gpt-review-bot
commented
[NIT] ListBuiltinPersonas returns nil on error, which may require nil checks at call sites. Returning an empty slice is often more ergonomic and safer. **[NIT]** ListBuiltinPersonas returns nil on error, which may require nil checks at call sites. Returning an empty slice is often more ergonomic and safer.
|
||||
if err != nil {
|
||||
available := ListBuiltinPersonas()
|
||||
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
|
||||
}
|
||||
return parsePersona(data, "builtin:"+name)
|
||||
}
|
||||
|
||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `LoadBuiltinPersona` calls `ListBuiltinPersonas()` to build an error message only on failure. This means if `embeddedPersonas.ReadFile` fails AND `embeddedPersonas.ReadDir` (inside `ListBuiltinPersonas`) also fails, the error message will be `unknown built-in persona "x" (available: )` with an empty list — which is confusing. Consider hardcoding the known persona names in the error message or handling the ReadDir error in ListBuiltinPersonas more gracefully.
|
||||
// ListBuiltinPersonas returns the names of all built-in personas.
|
||||
// Returns an empty slice if the embedded directory cannot be read.
|
||||
func ListBuiltinPersonas() []string {
|
||||
entries, err := embeddedPersonas.ReadDir("personas")
|
||||
|
sonnet-review-bot
commented
[MINOR] The inline comment on **[MINOR]** The inline comment on `embeddedPersonas.ReadFile("personas/" + filename)` says 'embed.FS paths use forward slashes per io/fs spec'. While correct and useful, this comment is placed on the wrong line — it belongs on the `ReadFile` call line (line 59), which is where it appears. This is fine. However, the function silently falls through to listing available personas when the embedded file is not found, which means a typo in the name produces an error message that lists alternatives. This is good UX, but the `ListBuiltinPersonas()` call inside `LoadBuiltinPersona` adds a small O(n) directory scan on every failed lookup — not a real concern at this scale but worth noting.
|
||||
if err != nil {
|
||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `LoadBuiltinPersona` calls `ListBuiltinPersonas()` to build an error message when the persona isn't found. However, `ListBuiltinPersonas` reads the directory from the embedded FS — if that somehow fails, it returns `nil` and the error message becomes 'unknown built-in persona "foo" (available: )'. This is a minor usability issue, but it would be better to hardcode the list of known personas or at least handle the empty-list case.
|
||||
return []string{}
|
||||
}
|
||||
var names []string
|
||||
for _, e := range entries {
|
||||
if e.IsDir() {
|
||||
continue
|
||||
}
|
||||
name := e.Name()
|
||||
if strings.HasSuffix(name, ".json") {
|
||||
names = append(names, strings.TrimSuffix(name, ".json"))
|
||||
}
|
||||
}
|
||||
return names
|
||||
}
|
||||
|
||||
func parsePersona(data []byte, source string) (*Persona, error) {
|
||||
var p Persona
|
||||
if err := json.Unmarshal(data, &p); err != nil {
|
||||
|
sonnet-review-bot
commented
[MINOR] ListBuiltinPersonas silently returns nil when ReadDir fails. The caller (LoadBuiltinPersona error message) would then say 'available: ' with no names, which is confusing. Consider logging or returning an error, or at minimum ensuring the error path doesn't produce a misleading 'available: ' suffix in the error message. **[MINOR]** ListBuiltinPersonas silently returns nil when ReadDir fails. The caller (LoadBuiltinPersona error message) would then say 'available: ' with no names, which is confusing. Consider logging or returning an error, or at minimum ensuring the error path doesn't produce a misleading 'available: ' suffix in the error message.
|
||||
return nil, fmt.Errorf("parse persona %s: %w", source, err)
|
||||
|
sonnet-review-bot
commented
[NIT] Trailing whitespace after the **[NIT]** Trailing whitespace after the `var p Persona` declaration (blank line with spaces before the comment). Run `gofmt`.
|
||||
}
|
||||
if err := validatePersona(&p, source); err != nil {
|
||||
return nil, err
|
||||
|
sonnet-review-bot
commented
[NIT] There is a trailing whitespace after **[NIT]** There is a trailing whitespace after `var p Persona` before the blank line. Minor style issue.
|
||||
}
|
||||
return &p, nil
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `parsePersona` function uses `filepath.Ext(source)` to decide the format, but for built-in personas the source is `"builtin:security"` which has no file extension — it correctly falls through to the YAML path. However, for user-provided files without a `.json` extension (e.g., `.yml`), it also falls through to YAML, which is correct behavior but is undocumented. More importantly, the comment says 'YAML is a superset of JSON' as justification for the fallback, but this is only approximately true and the actual behavior depends on the library. The logic would be clearer as: if JSON extension → JSON unmarshal, else → YAML unmarshal (which currently is what happens, but the comment is misleading).
|
||||
}
|
||||
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `parsePersona` function uses `filepath.Ext(source)` to detect the format, but for built-in personas the `source` is `"builtin:security"` (no file extension). The comment says 'YAML (also handles .yaml, .yml, and builtin: prefix)' but this only works by coincidence — `filepath.Ext("builtin:security")` returns `""` which falls into the else-YAML branch. This is fragile. A more explicit check like `source == ".json"` or extracting the logic to handle the `builtin:` prefix explicitly would be clearer and more robust.
|
||||
func validatePersona(p *Persona, source string) error {
|
||||
if p.Name == "" {
|
||||
return fmt.Errorf("persona %s: name is required", source)
|
||||
|
sonnet-review-bot
commented
[NIT] Trailing whitespace on line 89 (blank line with spaces inside parsePersona before the ext detection block). Minor style issue. **[NIT]** Trailing whitespace on line 89 (blank line with spaces inside parsePersona before the ext detection block). Minor style issue.
|
||||
}
|
||||
if p.Identity == "" {
|
||||
return fmt.Errorf("persona %s: identity is required", source)
|
||||
}
|
||||
// DisplayName defaults to Name if not set
|
||||
|
sonnet-review-bot
commented
[NIT] Trailing whitespace after the closing brace of the else block. Run **[NIT]** Trailing whitespace after the closing brace of the else block. Run `gofmt`.
|
||||
if p.DisplayName == "" {
|
||||
p.DisplayName = p.Name
|
||||
|
sonnet-review-bot
commented
[MINOR] parsePersona uses filepath.Ext(source) to detect JSON vs YAML, but for built-in personas the source is 'builtin:security' (no file extension), so it falls through to the YAML path. This works today only because yaml.v3 is a superset of JSON, but it's fragile: if YAML is replaced with JSON-only, built-in persona sources would need to be renamed (e.g., 'builtin:security.yaml') or the detection logic changed. The dispatch logic is unclear about its contract. **[MINOR]** parsePersona uses filepath.Ext(source) to detect JSON vs YAML, but for built-in personas the source is 'builtin:security' (no file extension), so it falls through to the YAML path. This works today only because yaml.v3 is a superset of JSON, but it's fragile: if YAML is replaced with JSON-only, built-in persona sources would need to be renamed (e.g., 'builtin:security.yaml') or the detection logic changed. The dispatch logic is unclear about its contract.
|
||||
}
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `CapitalizeFirst` is placed in `persona.go` but is used by `formatter.go`. It's a general string utility that doesn't have a conceptual relationship with persona loading. It would be more discoverable in `formatter.go` or a shared `strings.go`/`util.go` file within the `review` package.
|
||||
return nil
|
||||
}
|
||||
|
||||
|
[MINOR] Persona YAML is parsed without limits using a third‑party YAML library. If a workflow references a persona file from the PR branch, a malicious PR could supply a very large or adversarial YAML (e.g., anchor expansion) to cause excessive CPU/memory consumption during unmarshalling. Consider enforcing a maximum file size before reading/parsing and/or preferring JSON (stdlib) for untrusted inputs. **[MINOR]** Persona YAML is parsed without limits using a third‑party YAML library. If a workflow references a persona file from the PR branch, a malicious PR could supply a very large or adversarial YAML (e.g., anchor expansion) to cause excessive CPU/memory consumption during unmarshalling. Consider enforcing a maximum file size before reading/parsing and/or preferring JSON (stdlib) for untrusted inputs.
|
||||
// CapitalizeFirst capitalizes the first rune of a string in a Unicode-safe way.
|
||||
// Returns the original string if it's empty.
|
||||
func CapitalizeFirst(s string) string {
|
||||
if s == "" {
|
||||
return s
|
||||
}
|
||||
r, size := utf8.DecodeRuneInString(s)
|
||||
if r == utf8.RuneError {
|
||||
return s
|
||||
}
|
||||
return strings.ToUpper(string(r)) + s[size:]
|
||||
}
|
||||
@@ -0,0 +1,104 @@
|
||||
package review
|
||||
|
gpt-review-bot
commented
[NIT] BuildPersonaSystemPrompt re-specifies the output schema that likely mirrors BuildSystemBase; consider refactoring to share a common formatter for the JSON output specification to avoid future drift. **[NIT]** BuildPersonaSystemPrompt re-specifies the output schema that likely mirrors BuildSystemBase; consider refactoring to share a common formatter for the JSON output specification to avoid future drift.
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// BuildPersonaSystemPrompt constructs a system prompt from a persona definition.
|
||||
// This replaces BuildSystemBase when a persona is provided.
|
||||
func BuildPersonaSystemPrompt(p *Persona) string {
|
||||
var sb strings.Builder
|
||||
|
||||
// Identity section
|
||||
sb.WriteString(p.Identity)
|
||||
sb.WriteString("\n\n")
|
||||
|
||||
// Focus section
|
||||
if len(p.Focus) > 0 {
|
||||
sb.WriteString("## Focus Areas\n\n")
|
||||
sb.WriteString("Concentrate your review on:\n")
|
||||
for _, f := range p.Focus {
|
||||
sb.WriteString(fmt.Sprintf("- %s\n", f))
|
||||
}
|
||||
sb.WriteString("\n")
|
||||
}
|
||||
|
||||
// Ignore section
|
||||
if len(p.Ignore) > 0 {
|
||||
sb.WriteString("## Explicitly Out of Scope\n\n")
|
||||
sb.WriteString("Do NOT comment on:\n")
|
||||
for _, i := range p.Ignore {
|
||||
sb.WriteString(fmt.Sprintf("- %s\n", i))
|
||||
}
|
||||
sb.WriteString("\n")
|
||||
}
|
||||
|
||||
// Severity calibration
|
||||
if p.Severity.Major != "" || p.Severity.Minor != "" || p.Severity.Nit != "" {
|
||||
sb.WriteString("## Severity Calibration\n\n")
|
||||
sb.WriteString("Use these severity definitions for YOUR domain:\n")
|
||||
if p.Severity.Major != "" {
|
||||
sb.WriteString(fmt.Sprintf("- **MAJOR**: %s\n", p.Severity.Major))
|
||||
}
|
||||
if p.Severity.Minor != "" {
|
||||
sb.WriteString(fmt.Sprintf("- **MINOR**: %s\n", p.Severity.Minor))
|
||||
}
|
||||
if p.Severity.Nit != "" {
|
||||
sb.WriteString(fmt.Sprintf("- **NIT**: %s\n", p.Severity.Nit))
|
||||
}
|
||||
sb.WriteString("\n")
|
||||
}
|
||||
|
||||
// Output format instructions (shared schema from prompt.go)
|
||||
sb.WriteString("## Review Instructions\n\n")
|
||||
sb.WriteString("CONTEXT:\n")
|
||||
sb.WriteString("- You will receive the full content of modified files for reference, followed by the diff showing what changed.\n")
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `BuildSystemPromptWithPersona` is exported and documented as a convenience wrapper, but it's not used anywhere in the codebase (main.go assembles the prompt manually). This creates a parallel code path that could diverge. Either use it in main.go or make it unexported/remove it to avoid maintenance burden.
|
||||
sb.WriteString("- The diff shows ONLY what was added/removed. The full file content provides complete context.\n")
|
||||
sb.WriteString("- Focus your review on the CHANGES (the diff), using the full files for context.\n\n")
|
||||
|
sonnet-review-bot
commented
[NIT] The **[NIT]** The `BuildPersonaSystemPrompt` function builds the JSON output format using many individual `sb.WriteString` calls for what is essentially a multi-line string constant. This is harder to read and maintain than a single `const` or backtick string. Consider using a `const` for the static instructions block.
|
||||
sb.WriteString("Your task:\n")
|
||||
sb.WriteString("1. Review the diff for issues within YOUR focus areas only.\n")
|
||||
sb.WriteString("2. Consider the CI status — if CI has failed, that is an automatic REQUEST_CHANGES regardless of code quality.\n")
|
||||
sb.WriteString("3. Output your review as structured JSON (and ONLY JSON, no markdown fences or other text).\n\n")
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `BuildSystemPromptWithPersona` is exported but not used in `main.go` — `main.go` calls `BuildPersonaSystemPrompt` directly and then appends patterns/conventions via the budget system. This exported function implements a different (non-budget-aware) path. It's only tested, not used in production. Either document clearly that it's an alternative API for callers who don't need the budget system, or make it unexported if it's just for testing convenience.
|
||||
sb.WriteString("Output format:\n")
|
||||
sb.WriteString(outputSchemaJSON)
|
||||
sb.WriteString("\n\n")
|
||||
sb.WriteString(verdictRules)
|
||||
sb.WriteString("\n- Only report findings within your focus areas. Ignore everything else.\n")
|
||||
sb.WriteString("- Line numbers should reference the new file line numbers from the diff headers.\n")
|
||||
|
sonnet-review-bot
commented
[NIT] The JSON output format is written as a series of sb.WriteString calls with manually formatted JSON. This is the same pattern used in BuildSystemBase and is consistent, but the inline JSON template is duplicated between BuildPersonaSystemPrompt and BuildSystemBase. A shared constant or helper would reduce drift risk if the output schema ever changes. **[NIT]** The JSON output format is written as a series of sb.WriteString calls with manually formatted JSON. This is the same pattern used in BuildSystemBase and is consistent, but the inline JSON template is duplicated between BuildPersonaSystemPrompt and BuildSystemBase. A shared constant or helper would reduce drift risk if the output schema ever changes.
|
||||
sb.WriteString("- If the diff has no changes relevant to your focus areas, APPROVE with no findings.\n")
|
||||
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `BuildPersonaSystemPrompt` function has 30+ consecutive `sb.WriteString(...)` calls for the hardcoded JSON output format schema. This duplicates the output format specification that already lives in the base `BuildSystemBase()` function. If the JSON schema changes (e.g., adding a new field), both functions must be updated. Consider extracting the shared output format section into a package-level constant or helper function to keep the schema definition in one place.
|
||||
// Custom output format if provided
|
||||
if p.OutputFormat != "" {
|
||||
sb.WriteString("\n\n## Additional Output Guidelines\n\n")
|
||||
sb.WriteString(p.OutputFormat)
|
||||
}
|
||||
|
||||
return sb.String()
|
||||
}
|
||||
|
||||
// BuildSystemPromptWithPersona constructs the full system prompt, using either
|
||||
// a persona or the default generic prompt. This is a convenience wrapper that
|
||||
// combines BuildPersonaSystemPrompt (or BuildSystemBase) with patterns and conventions.
|
||||
// It is exported for use by callers who want one-shot prompt assembly.
|
||||
func BuildSystemPromptWithPersona(persona *Persona, conventions, patterns string) string {
|
||||
var base string
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `BuildSystemPromptWithPersona` exists as a helper but is never called from `main.go` — the main function manually assembles `systemBase` then appends `additionalPrompt`, patterns, and conventions separately via the budget system. This exported function is unused dead code in practice. Either wire it into the main flow or unexport/remove it.
sonnet-review-bot
commented
[MINOR] **[MINOR]** `BuildPersonaSystemPrompt` builds the JSON output format spec using ~20 individual `sb.WriteString` calls on string literals. This duplicates logic that already exists in `BuildSystemBase()` (or the base prompt builder). A shared constant or helper for the JSON output spec would reduce the risk of the persona and generic prompts diverging silently.
|
||||
if persona != nil {
|
||||
|
sonnet-review-bot
commented
[NIT] The output format JSON is constructed via a long sequence of individual **[NIT]** The output format JSON is constructed via a long sequence of individual `sb.WriteString` calls with manual newlines. This is verbose and fragile. A single multi-line string literal (raw string or `const`) would be cleaner and easier to maintain, following the pattern already used in `BuildSystemBase`.
|
||||
base = BuildPersonaSystemPrompt(persona)
|
||||
} else {
|
||||
base = BuildSystemBase()
|
||||
}
|
||||
|
||||
var sb strings.Builder
|
||||
sb.WriteString(base)
|
||||
|
||||
|
sonnet-review-bot
commented
[MINOR] BuildSystemPromptWithPersona has the conventions and patterns arguments in the wrong order relative to BuildSystemBase's documented usage in the budget assembly (where patterns come before conventions in the prompt). More importantly, the argument order is (persona, conventions, patterns) but the caller in main.go assembles the prompt manually rather than using this function — making BuildSystemPromptWithPersona a dead export that duplicates logic without being used. Either use it in main.go or remove it. **[MINOR]** BuildSystemPromptWithPersona has the conventions and patterns arguments in the wrong order relative to BuildSystemBase's documented usage in the budget assembly (where patterns come before conventions in the prompt). More importantly, the argument order is (persona, conventions, patterns) but the caller in main.go assembles the prompt manually rather than using this function — making BuildSystemPromptWithPersona a dead export that duplicates logic without being used. Either use it in main.go or remove it.
|
||||
if patterns != "" {
|
||||
sb.WriteString(fmt.Sprintf("\n\n## Language Patterns & Idioms\n\nUse the following patterns as review criteria. Code that violates these established patterns is a finding:\n\n%s\n", patterns))
|
||||
}
|
||||
|
||||
if conventions != "" {
|
||||
sb.WriteString(fmt.Sprintf("\n\n## Repository Conventions\n\nThe repository has the following coding conventions that must be respected:\n\n%s\n", conventions))
|
||||
}
|
||||
|
||||
return sb.String()
|
||||
}
|
||||
@@ -0,0 +1,157 @@
|
||||
package review
|
||||
|
||||
import (
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestBuildPersonaSystemPrompt(t *testing.T) {
|
||||
p := &Persona{
|
||||
Name: "security",
|
||||
DisplayName: "Security Specialist",
|
||||
Identity: "You are a security specialist.",
|
||||
Focus: []string{"injection attacks", "auth bypass"},
|
||||
Ignore: []string{"code style", "performance"},
|
||||
Severity: Severity{
|
||||
Major: "exploitable vulnerabilities",
|
||||
Minor: "defense in depth",
|
||||
Nit: "theoretical risks",
|
||||
},
|
||||
}
|
||||
|
||||
prompt := BuildPersonaSystemPrompt(p)
|
||||
|
||||
// Check identity is included
|
||||
if !strings.Contains(prompt, "You are a security specialist.") {
|
||||
t.Error("prompt should contain identity")
|
||||
}
|
||||
|
||||
// Check focus areas
|
||||
if !strings.Contains(prompt, "Focus Areas") {
|
||||
t.Error("prompt should contain Focus Areas section")
|
||||
}
|
||||
if !strings.Contains(prompt, "injection attacks") {
|
||||
t.Error("prompt should contain focus item")
|
||||
}
|
||||
|
||||
// Check ignore section
|
||||
if !strings.Contains(prompt, "Out of Scope") {
|
||||
t.Error("prompt should contain Out of Scope section")
|
||||
}
|
||||
if !strings.Contains(prompt, "code style") {
|
||||
t.Error("prompt should contain ignore item")
|
||||
}
|
||||
|
||||
// Check severity calibration
|
||||
if !strings.Contains(prompt, "Severity Calibration") {
|
||||
t.Error("prompt should contain Severity Calibration section")
|
||||
}
|
||||
if !strings.Contains(prompt, "exploitable vulnerabilities") {
|
||||
t.Error("prompt should contain major severity definition")
|
||||
}
|
||||
|
||||
// Check JSON output format is included
|
||||
if !strings.Contains(prompt, `"verdict"`) {
|
||||
t.Error("prompt should contain JSON output format")
|
||||
}
|
||||
if !strings.Contains(prompt, "APPROVE") {
|
||||
t.Error("prompt should mention APPROVE verdict")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildPersonaSystemPromptMinimal(t *testing.T) {
|
||||
// Minimal persona with only required fields
|
||||
p := &Persona{
|
||||
Name: "minimal",
|
||||
Identity: "You are a minimal reviewer.",
|
||||
}
|
||||
|
||||
prompt := BuildPersonaSystemPrompt(p)
|
||||
|
||||
// Should still work without optional fields
|
||||
if !strings.Contains(prompt, "You are a minimal reviewer.") {
|
||||
t.Error("prompt should contain identity")
|
||||
}
|
||||
|
||||
// Should not have empty sections
|
||||
if strings.Contains(prompt, "Focus Areas") && !strings.Contains(prompt, "Concentrate your review on:") {
|
||||
t.Error("should not have Focus Areas header without content")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildSystemPromptWithPersona(t *testing.T) {
|
||||
t.Run("with persona", func(t *testing.T) {
|
||||
p := &Persona{
|
||||
Name: "test",
|
||||
Identity: "Test persona identity.",
|
||||
Focus: []string{"testing"},
|
||||
}
|
||||
|
||||
prompt := BuildSystemPromptWithPersona(p, "test conventions", "test patterns")
|
||||
|
||||
if !strings.Contains(prompt, "Test persona identity.") {
|
||||
t.Error("should contain persona identity")
|
||||
}
|
||||
if !strings.Contains(prompt, "test conventions") {
|
||||
t.Error("should contain conventions")
|
||||
}
|
||||
if !strings.Contains(prompt, "test patterns") {
|
||||
t.Error("should contain patterns")
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("without persona", func(t *testing.T) {
|
||||
prompt := BuildSystemPromptWithPersona(nil, "test conventions", "test patterns")
|
||||
|
||||
// Should use default system base
|
||||
if !strings.Contains(prompt, "expert code reviewer") {
|
||||
t.Error("should contain default system base when no persona")
|
||||
}
|
||||
if !strings.Contains(prompt, "test conventions") {
|
||||
t.Error("should contain conventions")
|
||||
}
|
||||
})
|
||||
|
||||
t.Run("empty conventions and patterns", func(t *testing.T) {
|
||||
p := &Persona{
|
||||
Name: "test",
|
||||
Identity: "Test identity.",
|
||||
}
|
||||
|
||||
prompt := BuildSystemPromptWithPersona(p, "", "")
|
||||
|
||||
if strings.Contains(prompt, "Language Patterns") {
|
||||
t.Error("should not contain patterns section when empty")
|
||||
}
|
||||
if strings.Contains(prompt, "Repository Conventions") {
|
||||
t.Error("should not contain conventions section when empty")
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func TestPersonaPromptContainsOutputRules(t *testing.T) {
|
||||
p := &Persona{
|
||||
Name: "test",
|
||||
Identity: "Test.",
|
||||
}
|
||||
|
||||
prompt := BuildPersonaSystemPrompt(p)
|
||||
|
||||
// Must contain the critical output rules
|
||||
requiredStrings := []string{
|
||||
"APPROVE",
|
||||
"REQUEST_CHANGES",
|
||||
"MAJOR",
|
||||
"MINOR",
|
||||
"NIT",
|
||||
"verdict",
|
||||
"findings",
|
||||
"CI",
|
||||
}
|
||||
|
||||
for _, s := range requiredStrings {
|
||||
if !strings.Contains(prompt, s) {
|
||||
t.Errorf("prompt should contain %q", s)
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,239 @@
|
||||
package review
|
||||
|
||||
import (
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestLoadBuiltinPersona(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
personaName string
|
||||
wantErr bool
|
||||
wantDisplay string
|
||||
}{
|
||||
{
|
||||
name: "security persona",
|
||||
personaName: "security",
|
||||
wantErr: false,
|
||||
wantDisplay: "Security Specialist",
|
||||
},
|
||||
{
|
||||
name: "architect persona",
|
||||
personaName: "architect",
|
||||
wantErr: false,
|
||||
wantDisplay: "Software Architect",
|
||||
},
|
||||
{
|
||||
name: "docs persona",
|
||||
personaName: "docs",
|
||||
wantErr: false,
|
||||
wantDisplay: "Documentation Reviewer",
|
||||
},
|
||||
{
|
||||
name: "unknown persona",
|
||||
personaName: "nonexistent",
|
||||
wantErr: true,
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
p, err := LoadBuiltinPersona(tt.personaName)
|
||||
if tt.wantErr {
|
||||
if err == nil {
|
||||
t.Error("expected error, got nil")
|
||||
}
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected error: %v", err)
|
||||
}
|
||||
if p.Name != tt.personaName {
|
||||
t.Errorf("Name = %q, want %q", p.Name, tt.personaName)
|
||||
}
|
||||
if p.DisplayName != tt.wantDisplay {
|
||||
t.Errorf("DisplayName = %q, want %q", p.DisplayName, tt.wantDisplay)
|
||||
}
|
||||
if p.Identity == "" {
|
||||
t.Error("Identity should not be empty")
|
||||
}
|
||||
if len(p.Focus) == 0 {
|
||||
t.Error("Focus should not be empty")
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestListBuiltinPersonas(t *testing.T) {
|
||||
names := ListBuiltinPersonas()
|
||||
if len(names) == 0 {
|
||||
t.Fatal("expected at least one built-in persona")
|
||||
}
|
||||
|
||||
// Check for expected personas
|
||||
expected := map[string]bool{"security": false, "architect": false, "docs": false}
|
||||
for _, name := range names {
|
||||
if _, ok := expected[name]; ok {
|
||||
expected[name] = true
|
||||
}
|
||||
}
|
||||
for name, found := range expected {
|
||||
if !found {
|
||||
t.Errorf("expected built-in persona %q not found", name)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestLoadPersonaFromJSONFile(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "test.json")
|
||||
|
||||
content := `{
|
||||
"name": "test",
|
||||
"display_name": "Test Persona",
|
||||
"identity": "You are a test persona.\nMulti-line identity works.",
|
||||
"focus": ["testing", "validation"],
|
||||
"ignore": ["nothing"],
|
||||
"severity": {
|
||||
"major": "Big problems",
|
||||
"minor": "Small problems",
|
||||
"nit": "Tiny problems"
|
||||
}
|
||||
}`
|
||||
|
||||
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||
t.Fatalf("failed to write test file: %v", err)
|
||||
}
|
||||
|
||||
p, err := LoadPersona(path)
|
||||
if err != nil {
|
||||
t.Fatalf("LoadPersona failed: %v", err)
|
||||
}
|
||||
|
||||
if p.Name != "test" {
|
||||
t.Errorf("Name = %q, want %q", p.Name, "test")
|
||||
}
|
||||
if p.DisplayName != "Test Persona" {
|
||||
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
|
||||
}
|
||||
if len(p.Focus) != 2 {
|
||||
t.Errorf("Focus len = %d, want 2", len(p.Focus))
|
||||
}
|
||||
if !strings.Contains(p.Identity, "Multi-line") {
|
||||
t.Error("Identity should contain multi-line content")
|
||||
}
|
||||
}
|
||||
|
||||
func TestLoadPersonaValidation(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
json string
|
||||
wantErr string
|
||||
}{
|
||||
{
|
||||
name: "missing name",
|
||||
json: `{"identity": "test"}`,
|
||||
wantErr: "name is required",
|
||||
},
|
||||
{
|
||||
name: "missing identity",
|
||||
json: `{"name": "test"}`,
|
||||
wantErr: "identity is required",
|
||||
},
|
||||
{
|
||||
name: "display_name defaults to name",
|
||||
json: `{"name": "test", "identity": "test identity"}`,
|
||||
// No error expected - should succeed
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "test.json")
|
||||
if err := os.WriteFile(path, []byte(tt.json), 0644); err != nil {
|
||||
t.Fatalf("failed to write test file: %v", err)
|
||||
}
|
||||
|
||||
p, err := LoadPersona(path)
|
||||
if tt.wantErr != "" {
|
||||
if err == nil {
|
||||
t.Errorf("expected error containing %q, got nil", tt.wantErr)
|
||||
return
|
||||
}
|
||||
if !strings.Contains(err.Error(), tt.wantErr) {
|
||||
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
|
||||
}
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected error: %v", err)
|
||||
}
|
||||
// Check display_name defaulting
|
||||
if p.DisplayName == "" {
|
||||
t.Error("DisplayName should default to Name")
|
||||
}
|
||||
if p.DisplayName != p.Name {
|
||||
t.Errorf("DisplayName should default to Name, got %q", p.DisplayName)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestLoadPersonaFileNotFound(t *testing.T) {
|
||||
_, err := LoadPersona("/nonexistent/path/persona.json")
|
||||
if err == nil {
|
||||
t.Error("expected error for nonexistent file")
|
||||
}
|
||||
}
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `contains` / `containsHelper` functions in persona_test.go are reinventing `strings.Contains`. The standard library already provides this. The custom implementation is error-prone (the `contains` wrapper has a subtle redundancy: it checks `len(s) >= len(substr)` and `s == substr` before calling `containsHelper`, which already handles the equality case in its loop). Just use `strings.Contains` directly.
|
||||
|
||||
func TestLoadPersonaInvalidJSON(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `contains` / `containsHelper` helpers duplicate functionality from `strings.Contains`. The test file is in the `review` package (not `review_test`), so it could just use `strings.Contains` directly — or at minimum import `strings`. Adding custom substring matching helpers in tests is unnecessary complexity and a maintenance burden.
|
||||
path := filepath.Join(dir, "invalid.json")
|
||||
if err := os.WriteFile(path, []byte("not valid json {"), 0644); err != nil {
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `contains` and `containsHelper` helpers re-implement `strings.Contains` from scratch. This is unnecessary — `strings.Contains` is already in the standard library and is used everywhere else in the test files. This also violates the principle of using standard library functions.
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `contains` and `containsHelper` helpers are reimplementing `strings.Contains`, which is already available in the standard library. The helpers add no value and the custom implementation has a subtle bug: `containsHelper` is only called when `len(s) > 0`, but the outer `contains` check `len(s) >= len(substr)` already handles the empty-substr case. Just use `strings.Contains` directly — it handles all edge cases correctly and is idiomatic Go.
|
||||
t.Fatalf("failed to write test file: %v", err)
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `contains` and `containsHelper` helpers are reinventing `strings.Contains`. The project already imports `strings` elsewhere and the stdlib `strings.Contains` is available in test files. Using `strings.Contains` directly would be clearer and eliminate ~10 lines of custom code.
|
||||
}
|
||||
|
||||
_, err := LoadPersona(path)
|
||||
if err == nil {
|
||||
t.Error("expected error for invalid JSON")
|
||||
}
|
||||
}
|
||||
|
||||
func TestCapitalizeFirst(t *testing.T) {
|
||||
tests := []struct {
|
||||
input string
|
||||
want string
|
||||
}{
|
||||
{"hello", "Hello"},
|
||||
{"Hello", "Hello"},
|
||||
{"HELLO", "HELLO"},
|
||||
{"a", "A"},
|
||||
{"", ""},
|
||||
{"日本語", "日本語"}, // Non-ASCII: Japanese doesn't have case
|
||||
{"über", "Über"}, // German umlaut
|
||||
{"élève", "Élève"}, // French accent
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.input, func(t *testing.T) {
|
||||
got := CapitalizeFirst(tt.input)
|
||||
if got != tt.want {
|
||||
t.Errorf("CapitalizeFirst(%q) = %q, want %q", tt.input, got, tt.want)
|
||||
}
|
||||
|
sonnet-review-bot
commented
[NIT] **[NIT]** `TestLoadPersonaInvalidYAML` uses `"not: valid: yaml: here"` as invalid YAML, but this is actually valid YAML (it's a key `not` with value `valid: yaml: here` as a string). The test may pass only because the YAML parses to a struct with empty required fields, triggering the validation error rather than a parse error. The test description says 'invalid YAML' but it's really testing 'YAML missing required fields'. The test still provides coverage but the intent is misleading.
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestListBuiltinPersonasReturnsEmptySlice(t *testing.T) {
|
||||
// ListBuiltinPersonas should return an empty slice (not nil) on error.
|
||||
// We can't easily test the error case, but we can verify the success case
|
||||
// returns a proper slice.
|
||||
names := ListBuiltinPersonas()
|
||||
if names == nil {
|
||||
t.Error("ListBuiltinPersonas should return empty slice, not nil")
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,26 @@
|
||||
{
|
||||
"name": "architect",
|
||||
"display_name": "Software Architect",
|
||||
"identity": "You are a software architect reviewing code for design quality.\n\nYour expertise:\n- Design patterns and anti-patterns\n- Code organization and module boundaries\n- API design and contracts\n- Testability and dependency injection\n- Consistency with existing architecture\n- Technical debt identification",
|
||||
"focus": [
|
||||
"Design pattern violations or misuse",
|
||||
"Module boundary violations (inappropriate coupling)",
|
||||
"API design issues (unclear contracts, leaky abstractions)",
|
||||
"Testability problems (hidden dependencies, god objects)",
|
||||
"Inconsistency with existing codebase patterns",
|
||||
"Unnecessary complexity or over-engineering",
|
||||
"Missing abstractions or premature abstraction"
|
||||
],
|
||||
"ignore": [
|
||||
"Security vulnerabilities (security persona handles these)",
|
||||
"Performance micro-optimizations",
|
||||
"Code style and formatting",
|
||||
"Documentation typos",
|
||||
"Test implementation details"
|
||||
],
|
||||
"severity": {
|
||||
"major": "Architectural violations that will cause maintenance problems or make the codebase harder to evolve",
|
||||
"minor": "Design issues that reduce clarity or testability but don't block progress",
|
||||
"nit": "Minor pattern deviations or style preferences"
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,26 @@
|
||||
{
|
||||
"name": "docs",
|
||||
"display_name": "Documentation Reviewer",
|
||||
"identity": "You are a documentation specialist reviewing code for clarity and documentation quality.\n\nYour expertise:\n- API documentation and examples\n- Code comments and their accuracy\n- Error message clarity\n- README and guide quality\n- Naming clarity and self-documenting code",
|
||||
"focus": [
|
||||
"Missing or outdated documentation",
|
||||
"Unclear or misleading comments",
|
||||
"Poor error messages (cryptic, unhelpful, missing context)",
|
||||
"Confusing naming (functions, variables, types)",
|
||||
"Missing examples for complex APIs",
|
||||
"Inconsistent terminology",
|
||||
"Documentation that contradicts the code"
|
||||
],
|
||||
"ignore": [
|
||||
"Security vulnerabilities",
|
||||
"Performance issues",
|
||||
"Design patterns",
|
||||
"Test coverage",
|
||||
"Code style (unless it affects readability)"
|
||||
],
|
||||
"severity": {
|
||||
"major": "Documentation that actively misleads or missing docs for critical functionality",
|
||||
"minor": "Unclear documentation or poor error messages that will confuse users",
|
||||
"nit": "Minor clarity improvements or typo fixes"
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,26 @@
|
||||
{
|
||||
"name": "security",
|
||||
"display_name": "Security Specialist",
|
||||
"identity": "You are a security specialist reviewing code for vulnerabilities.\n\nYour expertise:\n- OWASP Top 10 vulnerabilities\n- Injection attacks (SQL, command, path traversal, template)\n- Authentication and authorization patterns\n- Secrets management and exposure risks\n- Race conditions with security implications\n- Event sourcing attack vectors (replay attacks, event injection)",
|
||||
"focus": [
|
||||
"Injection attacks (SQL, command, path traversal, template injection)",
|
||||
"Authentication and authorization gaps or bypasses",
|
||||
"Secrets exposure (hardcoded credentials, tokens in logs, config leaks)",
|
||||
"Input validation failures (unsanitized input, unsafe deserialization)",
|
||||
"Race conditions that could be exploited",
|
||||
"Cryptographic weaknesses (weak algorithms, improper key handling)",
|
||||
"Information disclosure through error messages or logs"
|
||||
],
|
||||
"ignore": [
|
||||
"Code style and naming conventions",
|
||||
"Performance optimizations (unless security-related)",
|
||||
"Documentation quality",
|
||||
"General code quality or readability",
|
||||
"Test coverage"
|
||||
],
|
||||
"severity": {
|
||||
"major": "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE",
|
||||
"minor": "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation",
|
||||
"nit": "Theoretical risks with low exploitability or impact"
|
||||
}
|
||||
}
|
||||
@@ -7,6 +7,28 @@ import (
|
||||
"strings"
|
||||
)
|
||||
|
||||
// outputSchemaJSON is the shared JSON output format specification used by both
|
||||
// the generic reviewer and persona-based reviewers.
|
||||
const outputSchemaJSON = `{
|
||||
"verdict": "APPROVE" or "REQUEST_CHANGES",
|
||||
"summary": "Brief overall assessment (1-3 sentences)",
|
||||
"findings": [
|
||||
{
|
||||
"severity": "MAJOR" or "MINOR" or "NIT",
|
||||
"file": "path/to/file",
|
||||
"line": <line number from the diff>,
|
||||
"finding": "Description of the issue"
|
||||
}
|
||||
],
|
||||
"recommendation": "Full recommendation text explaining your verdict"
|
||||
}`
|
||||
|
||||
// verdictRules is the shared verdict determination rules.
|
||||
const verdictRules = `Rules:
|
||||
- If there are any MAJOR findings → verdict must be REQUEST_CHANGES
|
||||
- If there are no MAJOR findings → verdict should be APPROVE
|
||||
- If CI has failed → verdict must be REQUEST_CHANGES with a finding noting the CI failure`
|
||||
|
||||
// BuildSystemBase returns the core system prompt instructions without
|
||||
// patterns or conventions. Used by the budget package to separate
|
||||
// trimmable from non-trimmable content.
|
||||
@@ -23,24 +45,10 @@ func BuildSystemBase() string {
|
||||
sb.WriteString("2. Consider the CI status — if CI has failed, that is an automatic REQUEST_CHANGES regardless of code quality.\n")
|
||||
sb.WriteString("3. Output your review as structured JSON (and ONLY JSON, no markdown fences or other text).\n\n")
|
||||
sb.WriteString("Output format:\n")
|
||||
sb.WriteString("{\n")
|
||||
sb.WriteString(" \"verdict\": \"APPROVE\" or \"REQUEST_CHANGES\",\n")
|
||||
sb.WriteString(" \"summary\": \"Brief overall assessment (1-3 sentences)\",\n")
|
||||
sb.WriteString(" \"findings\": [\n")
|
||||
sb.WriteString(" {\n")
|
||||
sb.WriteString(" \"severity\": \"MAJOR\" or \"MINOR\" or \"NIT\",\n")
|
||||
sb.WriteString(" \"file\": \"path/to/file\",\n")
|
||||
sb.WriteString(" \"line\": <line number from the diff>,\n")
|
||||
sb.WriteString(" \"finding\": \"Description of the issue\"\n")
|
||||
sb.WriteString(" }\n")
|
||||
sb.WriteString(" ],\n")
|
||||
sb.WriteString(" \"recommendation\": \"Full recommendation text explaining your verdict\"\n")
|
||||
sb.WriteString("}\n\n")
|
||||
sb.WriteString("Rules:\n")
|
||||
sb.WriteString("- If there are any MAJOR findings → verdict must be REQUEST_CHANGES\n")
|
||||
sb.WriteString("- If there are no MAJOR findings → verdict should be APPROVE\n")
|
||||
sb.WriteString("- If CI has failed → verdict must be REQUEST_CHANGES with a finding noting the CI failure\n")
|
||||
sb.WriteString("- Be thorough but fair. Don't nitpick style unless it impacts readability significantly.\n")
|
||||
sb.WriteString(outputSchemaJSON)
|
||||
sb.WriteString("\n\n")
|
||||
sb.WriteString(verdictRules)
|
||||
sb.WriteString("\n- Be thorough but fair. Don't nitpick style unless it impacts readability significantly.\n")
|
||||
sb.WriteString("- Line numbers should reference the new file line numbers from the diff headers.\n")
|
||||
sb.WriteString("- If the diff is empty or trivial (only formatting/whitespace), APPROVE with no findings.\n")
|
||||
|
||||
|
||||
[MAJOR] Malformed inputs block: properties for system-prompt-file are misplaced under persona-file due to insertion. The keys 'description', 'required', and 'default' are duplicated under persona-file, and system-prompt-file lacks its properties. This can cause YAML parsing issues and incorrect input definitions.
[MINOR] The YAML for the
system-prompt-fileinput is malformed — thepersonaandpersona-fileinputs are inserted between thesystem-prompt-filekey and itsdescription/required/defaultfields. This meanssystem-prompt-fileloses its description/metadata in the action definition. The two new inputs should be placed after the completesystem-prompt-fileblock, not in the middle of it.