feat: add YAML support for persona files #58
@@ -9,7 +9,7 @@ AI-powered code review bot for Gitea pull requests. Fetches diff + context, send
|
|||||||
- **Smart budget**: Automatically trims context to fit model token limits
|
- **Smart budget**: Automatically trims context to fit model token limits
|
||||||
- **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
|
- **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
|
||||||
- **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
|
- **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
|
||||||
- **Zero dependencies**: Go stdlib only
|
- **Minimal dependencies**: Go stdlib + `gopkg.in/yaml.v3` only
|
||||||
|
|
||||||
## Quick Start: Composite Action
|
## Quick Start: Composite Action
|
||||||
|
|
||||||
@@ -208,7 +208,7 @@ AI Core handles OAuth token management and deployment discovery automatically. M
|
|||||||
| `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
|
| `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
|
||||||
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
|
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
|
||||||
| `persona` | No | `""` | Built-in persona name (security, architect, docs) |
|
| `persona` | No | `""` | Built-in persona name (security, architect, docs) |
|
||||||
| `persona-file` | No | `""` | Path to persona JSON file with custom review focus |
|
| `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
|
||||||
| `temperature` | No | `0` | LLM temperature (0 = server default) |
|
| `temperature` | No | `0` | LLM temperature (0 = server default) |
|
||||||
| `timeout` | No | `300` | LLM request timeout in seconds |
|
| `timeout` | No | `300` | LLM request timeout in seconds |
|
||||||
| `dry-run` | No | `false` | Print review to stdout instead of posting |
|
| `dry-run` | No | `false` | Print review to stdout instead of posting |
|
||||||
@@ -408,32 +408,38 @@ Each persona posts independently with its own sentinel, so reviews don't interfe
|
|||||||
|
|
||||||
### Custom Personas
|
### Custom Personas
|
||||||
|
|
||||||
Create a JSON file with your domain-specific review focus:
|
Create a YAML file with your domain-specific review focus:
|
||||||
|
|
||||||
```json
|
```yaml
|
||||||
// .review/personas/trading.json
|
# .review/personas/trading.yaml
|
||||||
{
|
name: trading
|
||||||
"name": "trading",
|
display_name: Trading Domain Expert
|
||||||
"display_name": "Trading Domain Expert",
|
|
||||||
"identity": "You are a trading systems expert reviewing code for correctness.\n\nYour expertise:\n- Order lifecycle and state machines\n- Fill handling and partial fills\n- Position tracking and P&L calculations\n- Event sourcing invariants",
|
identity: |
|
||||||
"focus": [
|
You are a trading systems expert reviewing code for correctness.
|
||||||
"Order state machine correctness",
|
|
||||||
"Fill handling edge cases (partial, overfill)",
|
Your expertise:
|
||||||
"Position and P&L calculation accuracy",
|
- Order lifecycle and state machines
|
||||||
"Event replay determinism",
|
- Fill handling and partial fills
|
||||||
"Decimal precision for money"
|
- Position tracking and P&L calculations
|
||||||
],
|
- Event sourcing invariants
|
||||||
"ignore": [
|
|
||||||
"Code style",
|
focus:
|
||||||
"General performance",
|
- Order state machine correctness
|
||||||
"Documentation formatting"
|
- Fill handling edge cases (partial, overfill)
|
||||||
],
|
- Position and P&L calculation accuracy
|
||||||
"severity": {
|
- Event replay determinism
|
||||||
"major": "Bugs that cause incorrect positions, fills, or money calculations",
|
- Decimal precision for money
|
||||||
"minor": "Edge cases that could cause issues under unusual conditions",
|
|
||||||
"nit": "Clarity improvements for domain logic"
|
ignore:
|
||||||
}
|
- Code style
|
||||||
}
|
- General performance
|
||||||
|
- Documentation formatting
|
||||||
|
|
||||||
|
severity:
|
||||||
|
major: "Bugs that cause incorrect positions, fills, or money calculations"
|
||||||
|
minor: "Edge cases that could cause issues under unusual conditions"
|
||||||
|
nit: "Clarity improvements for domain logic"
|
||||||
```
|
```
|
||||||
|
|
||||||
Use it in CI:
|
Use it in CI:
|
||||||
@@ -442,17 +448,24 @@ Use it in CI:
|
|||||||
- uses: rodin/review-bot/.gitea/actions/review@v1
|
- uses: rodin/review-bot/.gitea/actions/review@v1
|
||||||
with:
|
with:
|
||||||
reviewer-name: trading
|
reviewer-name: trading
|
||||||
persona-file: .review/personas/trading.json
|
persona-file: .review/personas/trading.yaml
|
||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
|
YAML is the recommended format for personas because it supports:
|
||||||
|
- Multi-line strings with `|` blocks (cleaner identity definitions)
|
||||||
|
- Comments for documentation
|
||||||
|
- More readable arrays and nested structures
|
||||||
|
|
||||||
|
JSON is also supported for backwards compatibility—just use `.json` extension.
|
||||||
|
|
||||||
|
|
||||||
### Persona vs system-prompt-file
|
### Persona vs system-prompt-file
|
||||||
|
|
||||||
| Feature | `persona` / `persona-file` | `system-prompt-file` |
|
| Feature | `persona` / `persona-file` | `system-prompt-file` |
|
||||||
|---------|---------------------------|----------------------|
|
|---------|---------------------------|----------------------|
|
||||||
| Replaces base prompt | Yes | No (appends) |
|
| Replaces base prompt | Yes | No (appends) |
|
||||||
| Structured format | Yes (JSON) | No (freeform) |
|
| Structured format | Yes (YAML/JSON) | No (freeform) |
|
||||||
| Focus/ignore lists | Yes | Manual |
|
| Focus/ignore lists | Yes | Manual |
|
||||||
| Severity calibration | Yes | Manual |
|
| Severity calibration | Yes | Manual |
|
||||||
| Header display name | Yes | No |
|
| Header display name | Yes | No |
|
||||||
|
|||||||
@@ -0,0 +1,108 @@
|
|||||||
|
# Design: YAML Support for Persona Files (#57)
|
||||||
|
|
|||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
|
||||||
|
|
||||||
|
## Constraints
|
||||||
|
|
||||||
|
- Backwards compatibility: existing JSON personas must continue to work
|
||||||
|
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
|
||||||
|
sonnet-review-bot
commented
[MINOR] The design doc says 'Consistency: use **[MINOR]** The design doc says 'Consistency: use `.yaml` extension (not `.yml`)' as a constraint, but the actual implementation supports both `.yaml` and `.yml` (and the open questions section documents this decision). The constraint section of the design doc is now inconsistent with the implemented behavior and should be updated to reflect the actual decision.
|
|||||||
|
- Consistency: use `.yaml` extension (not `.yml`)
|
||||||
|
gpt-review-bot
commented
[NIT] The design constraint says to use '.yaml' (not '.yml') for consistency, but the implementation and README support both. Clarify the recommendation vs. support to avoid confusion. **[NIT]** The design constraint says to use '.yaml' (not '.yml') for consistency, but the implementation and README support both. Clarify the recommendation vs. support to avoid confusion.
|
|||||||
|
- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
|
||||||
|
gpt-review-bot
commented
[NIT] The design doc checklist references adding a "go-yaml" dependency at v1.16.0+, which doesn't align with the actual **[NIT]** The design doc checklist references adding a "go-yaml" dependency at v1.16.0+, which doesn't align with the actual `gopkg.in/yaml.v3 v3.0.1` dependency used. Consider updating the doc to avoid confusion.
|
|||||||
|
|
||||||
|
## Proposed Approach
|
||||||
|
sonnet-review-bot
commented
[MINOR] The design document says to use **[MINOR]** The design document says to use `github.com/goccy/go-yaml` but the implementation uses `gopkg.in/yaml.v3`. The design document is now inaccurate/misleading. If the decision was made to use `gopkg.in/yaml.v3` instead, the design document should be updated to reflect this, and the security claim about depth protection should be addressed explicitly.
|
|||||||
|
|
||||||
|
1. **Update `parsePersona`** to detect format from file extension
|
||||||
|
2. **Add YAML parsing** with explicit depth limit (defense in depth)
|
||||||
|
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
|
||||||
|
4. **Convert built-in personas** to YAML format
|
||||||
|
5. **Update embed directive** to include both formats
|
||||||
|
|
||||||
|
### File Extension Detection
|
||||||
|
|
||||||
|
[NIT] Design doc references github.com/goccy/go-yaml depth protections, but the implementation uses gopkg.in/yaml.v3. This mismatch may give a false sense of protection; yaml.v3 has no explicit caller-configurable limits in this code. **[NIT]** Design doc references github.com/goccy/go-yaml depth protections, but the implementation uses gopkg.in/yaml.v3. This mismatch may give a false sense of protection; yaml.v3 has no explicit caller-configurable limits in this code.
|
|||||||
|
```go
|
||||||
|
func parsePersona(data []byte, source string) (*Persona, error) {
|
||||||
|
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
|
||||||
|
if isYAML {
|
||||||
|
return parseYAML(data, source)
|
||||||
|
}
|
||||||
|
return parseJSON(data, source)
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### YAML Parsing with Depth Protection
|
||||||
|
|
||||||
|
gpt-review-bot
commented
[NIT] The design doc mentions updating the embed directive to include both formats, but the code embeds only *.yaml (and built-in JSON files were removed). Consider updating the doc to reflect the final decision to embed YAML only. **[NIT]** The design doc mentions updating the embed directive to include both formats, but the code embeds only *.yaml (and built-in JSON files were removed). Consider updating the doc to reflect the final decision to embed YAML only.
|
|||||||
|
```go
|
||||||
|
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
|
||||||
|
var node yaml.Node
|
||||||
|
dec := yaml.NewDecoder(bytes.NewReader(data))
|
||||||
|
if err := dec.Decode(&node); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
return node.Decode(out)
|
||||||
|
sonnet-review-bot
commented
[MINOR] The design document's **[MINOR]** The design document's `parseYAML` code snippet in the 'YAML Parsing with Depth Protection' section shows a naive `yaml.Unmarshal` without depth checking, but the actual implementation correctly uses the two-pass Node approach. The doc snippet is stale/inconsistent with the implementation. The paragraph below it explains the correct approach, but the code block above it is misleading. This is documentation-only and doesn't affect behavior, but could confuse future contributors.
|
|||||||
|
}
|
||||||
|
sonnet-review-bot
commented
[NIT] The design doc's error cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation uses explicit depth checking via **[NIT]** The design doc's error cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation uses explicit depth checking via `checkYAMLDepth`, not a library-level fix. The implementation is correct (and better), but the design doc description is inaccurate.
|
|||||||
|
|
||||||
|
func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
|
||||||
|
if depth > maxDepth {
|
||||||
|
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
|
||||||
|
}
|
||||||
|
// Handle alias nodes by following the Alias pointer
|
||||||
|
if node.Kind == yaml.AliasNode && node.Alias != nil {
|
||||||
|
return checkYAMLDepth(node.Alias, depth, maxDepth)
|
||||||
|
sonnet-review-bot
commented
[NIT] The design doc's Error Cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' — but the actual implementation does not rely on library-level rejection; it implements its own depth check. This entry is misleading and should say 'Application-level depth check rejects' to accurately reflect the implementation. **[NIT]** The design doc's Error Cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' — but the actual implementation does *not* rely on library-level rejection; it implements its own depth check. This entry is misleading and should say 'Application-level depth check rejects' to accurately reflect the implementation.
sonnet-review-bot
commented
[NIT] The design doc's Error Cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation handles this via the custom **[NIT]** The design doc's Error Cases table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation handles this via the custom `checkYAMLDepth` walker, not the library itself. The description is misleading and may cause confusion during future maintenance.
|
|||||||
|
}
|
||||||
|
for _, child := range node.Content {
|
||||||
|
if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
|
||||||
|
|
||||||
|
## State/Data Model
|
||||||
|
|
||||||
|
No new state. Same `Persona` struct, just different parsing.
|
||||||
|
|
||||||
|
## Error Cases
|
||||||
|
|
||||||
|
| Error | Handling |
|
||||||
|
|-------|----------|
|
||||||
|
| Invalid YAML syntax | Return parse error with source file |
|
||||||
|
sonnet-review-bot
commented
[NIT] The design doc's error table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation uses explicit application-level depth checking (not library-level). The table is inaccurate. Minor doc inconsistency. **[NIT]** The design doc's error table says 'Deeply nested YAML | Library rejects (v1.16.0+ fix)' but the actual implementation uses explicit application-level depth checking (not library-level). The table is inaccurate. Minor doc inconsistency.
|
|||||||
|
| Deeply nested YAML | Library rejects (v1.16.0+ fix) |
|
||||||
|
| Unknown extension | Fall back to JSON parsing |
|
||||||
|
| Missing required fields | Validation rejects after parse |
|
||||||
|
|
||||||
|
## Edge Cases
|
||||||
|
|
||||||
|
- File with `.json` extension but YAML content → JSON parse fails, user sees error
|
||||||
|
- File with no extension → defaults to JSON
|
||||||
|
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
|
||||||
|
|
||||||
|
## Testing Strategy
|
||||||
|
|
||||||
|
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
|
||||||
|
2. Unit tests for extension detection
|
||||||
|
3. Integration test for built-in personas (now YAML)
|
||||||
|
4. Backwards compat test: verify JSON still works for external files
|
||||||
|
|
||||||
|
## Completion Checklist
|
||||||
|
|
||||||
|
1. [ ] `go-yaml` dependency added at v1.16.0+
|
||||||
|
2. [ ] Extension detection uses case-insensitive comparison
|
||||||
|
3. [ ] YAML parse errors include source file name
|
||||||
|
4. [ ] JSON parsing still works for `.json` files
|
||||||
|
5. [ ] Built-in personas converted to YAML with readable multi-line strings
|
||||||
|
6. [ ] Embed directive updated to include `*.yaml`
|
||||||
|
7. [ ] Test for deeply nested YAML rejection
|
||||||
|
8. [ ] All existing tests pass
|
||||||
|
|
||||||
|
## Open Questions
|
||||||
|
|
||||||
|
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
|
||||||
|
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
|
||||||
@@ -1,3 +1,5 @@
|
|||||||
module gitea.weiker.me/rodin/review-bot
|
module gitea.weiker.me/rodin/review-bot
|
||||||
|
|
||||||
go 1.26.2
|
go 1.26.2
|
||||||
|
|
||||||
|
gpt-review-bot
commented
[MAJOR] Adds third-party dependency github.com/goccy/go-yaml. This conflicts with the documented project rule of no external dependencies. **[MAJOR]** Adds third-party dependency github.com/goccy/go-yaml. This conflicts with the documented project rule of no external dependencies.
gpt-review-bot
commented
[MINOR] Dependency is marked as "// indirect" even though yaml is imported in non-test code (review/persona.go). Run **[MINOR]** Dependency is marked as "// indirect" even though yaml is imported in non-test code (review/persona.go). Run `go mod tidy` to classify it as a direct dependency or fix the annotation.
|
|||||||
|
require gopkg.in/yaml.v3 v3.0.1
|
||||||
|
sonnet-review-bot
commented
[MAJOR] Adds **[MAJOR]** Adds `github.com/goccy/go-yaml v1.19.2` as a dependency. The repository's CONVENTIONS.md explicitly states 'Go standard library only — no external dependencies.' This PR directly violates that constraint. A stdlib-only YAML parser would need to be implemented manually, or the feature scope should be reconsidered (e.g., keeping only JSON for external files and using a minimal hand-rolled YAML subset, or representing persona files differently). The dependency cannot be added regardless of how useful YAML support would be.
[MINOR] New third-party dependency (github.com/goccy/go-yaml v1.19.2) increases supply-chain/attack surface. While the version is pinned and includes prior DoS fixes, adding an external parser warrants routine dependency audit and vulnerability monitoring. **[MINOR]** New third-party dependency (github.com/goccy/go-yaml v1.19.2) increases supply-chain/attack surface. While the version is pinned and includes prior DoS fixes, adding an external parser warrants routine dependency audit and vulnerability monitoring.
|
|||||||
|
|||||||
@@ -0,0 +1,4 @@
|
|||||||
|
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
|
||||||
|
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
|
||||||
|
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||||
|
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||||
@@ -1,81 +1,153 @@
|
|||||||
package review
|
package review
|
||||||
|
gpt-review-bot
commented
[NIT] LoadPersona wraps os.Stat errors with message 'read persona file ...', which can be slightly misleading; consider using a 'stat persona file' prefix for the stat error path to improve specificity. **[NIT]** LoadPersona wraps os.Stat errors with message 'read persona file ...', which can be slightly misleading; consider using a 'stat persona file' prefix for the stat error path to improve specificity.
gpt-review-bot
commented
[NIT] unmarshalYAMLWithDepthLimit accepts multi-document YAML but silently ignores subsequent documents. Consider rejecting multi-document inputs to avoid ambiguity or add a comment in README clarifying this behavior. **[NIT]** unmarshalYAMLWithDepthLimit accepts multi-document YAML but silently ignores subsequent documents. Consider rejecting multi-document inputs to avoid ambiguity or add a comment in README clarifying this behavior.
gpt-review-bot
commented
[NIT] LoadBuiltinPersona falls back to JSON for embedded personas, but only *.yaml files are embedded now; the JSON fallback is effectively dead code. Optional: either embed both formats or remove the fallback to simplify. **[NIT]** LoadBuiltinPersona falls back to JSON for embedded personas, but only *.yaml files are embedded now; the JSON fallback is effectively dead code. Optional: either embed both formats or remove the fallback to simplify.
gpt-review-bot
commented
[NIT] ListBuiltinPersonas filters by case-sensitive suffixes; while embedded filenames are controlled and lowercase now, using a case-insensitive check would be slightly more robust if naming conventions change. **[NIT]** ListBuiltinPersonas filters by case-sensitive suffixes; while embedded filenames are controlled and lowercase now, using a case-insensitive check would be slightly more robust if naming conventions change.
|
|||||||
|
|
||||||
import (
|
import (
|
||||||
|
"bytes"
|
||||||
"embed"
|
"embed"
|
||||||
"encoding/json"
|
"encoding/json"
|
||||||
"fmt"
|
"fmt"
|
||||||
"os"
|
"os"
|
||||||
|
"sort"
|
||||||
"strings"
|
"strings"
|
||||||
"unicode/utf8"
|
"unicode/utf8"
|
||||||
|
gpt-review-bot
commented
[MAJOR] Introduces external dependency github.com/goccy/go-yaml, which violates the repository convention of using only the Go standard library ("Zero dependencies"). This is a policy-breaking change. **[MAJOR]** Introduces external dependency github.com/goccy/go-yaml, which violates the repository convention of using only the Go standard library ("Zero dependencies"). This is a policy-breaking change.
|
|||||||
|
|
||||||
|
sonnet-review-bot
commented
[MAJOR] The design document ( **[MAJOR]** The design document (`docs/DESIGN-57-yaml-persona.md`) explicitly specifies `github.com/goccy/go-yaml` v1.16.0+ as the required library, citing its built-in protection against deeply nested structures (AIKIDO-2024-10486 / DoS vulnerability fix). The implementation instead uses `gopkg.in/yaml.v3`, which does NOT have the same built-in depth limiting. The comment in `parsePersona` says "go-yaml v1.16.0+ has built-in protection against deeply nested structures" — this is factually incorrect for `gopkg.in/yaml.v3`. This is a security gap: a maliciously crafted deeply-nested YAML file could cause a stack overflow. Either switch to `github.com/goccy/go-yaml` (and add it to the allowlist), or add explicit depth limiting when using `gopkg.in/yaml.v3`.
|
|||||||
|
"gopkg.in/yaml.v3"
|
||||||
)
|
)
|
||||||
|
|
||||||
//go:embed personas/*.json
|
//go:embed personas/*.yaml
|
||||||
|
gpt-review-bot
commented
[NIT] Only YAML files are embedded (//go:embed personas/*.yaml) while the design doc mentions embedding both formats; consider updating the comment/design doc or embedding both if JSON files may ever return. **[NIT]** Only YAML files are embedded (//go:embed personas/*.yaml) while the design doc mentions embedding both formats; consider updating the comment/design doc or embedding both if JSON files may ever return.
|
|||||||
var embeddedPersonas embed.FS
|
var embeddedPersonas embed.FS
|
||||||
|
|
||||||
|
// MaxPersonaFileSize is the maximum size for persona files (64 KB).
|
||||||
|
// This prevents denial-of-service via excessively large files.
|
||||||
|
const MaxPersonaFileSize = 64 * 1024
|
||||||
|
|
||||||
|
// MaxYAMLDepth is the maximum nesting depth allowed in YAML persona files.
|
||||||
|
// This prevents stack exhaustion from deeply nested structures.
|
||||||
|
const MaxYAMLDepth = 20
|
||||||
|
|
||||||
|
// MaxYAMLNodes is the maximum number of YAML nodes allowed in persona files.
|
||||||
|
// This prevents DoS via wide-but-shallow structures that bypass depth limits.
|
||||||
|
const MaxYAMLNodes = 1000
|
||||||
|
|
||||||
// Persona defines a specialized review role with focused expertise.
|
// Persona defines a specialized review role with focused expertise.
|
||||||
type Persona struct {
|
type Persona struct {
|
||||||
Name string `json:"name"`
|
Name string `json:"name" yaml:"name"`
|
||||||
DisplayName string `json:"display_name"`
|
DisplayName string `json:"display_name" yaml:"display_name"`
|
||||||
ModelPref string `json:"model_preference,omitempty"`
|
ModelPref string `json:"model_preference,omitempty" yaml:"model_preference,omitempty"`
|
||||||
Identity string `json:"identity"`
|
Identity string `json:"identity" yaml:"identity"`
|
||||||
Focus []string `json:"focus"`
|
Focus []string `json:"focus" yaml:"focus"`
|
||||||
Ignore []string `json:"ignore"`
|
Ignore []string `json:"ignore" yaml:"ignore"`
|
||||||
Severity Severity `json:"severity"`
|
Severity Severity `json:"severity" yaml:"severity"`
|
||||||
OutputFormat string `json:"output_format,omitempty"`
|
OutputFormat string `json:"output_format,omitempty" yaml:"output_format,omitempty"`
|
||||||
|
[MINOR] LoadPersona reads the entire persona file with os.ReadFile without size checks. A maliciously large YAML/JSON persona file could cause memory/CPU pressure (DoS) in CI or when run manually. Consider enforcing a reasonable max file size before parsing. **[MINOR]** LoadPersona reads the entire persona file with os.ReadFile without size checks. A maliciously large YAML/JSON persona file could cause memory/CPU pressure (DoS) in CI or when run manually. Consider enforcing a reasonable max file size before parsing.
[MINOR] os.ReadFile reads the entire persona file into memory without a size cap. A maliciously large persona file could cause high memory usage on the CI runner (resource exhaustion). **[MINOR]** os.ReadFile reads the entire persona file into memory without a size cap. A maliciously large persona file could cause high memory usage on the CI runner (resource exhaustion).
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Severity defines what constitutes each severity level for this persona.
|
// Severity defines what constitutes each severity level for this persona.
|
||||||
// These are prompt guidance for the LLM, not output format changes.
|
// These are prompt guidance for the LLM, not output format changes.
|
||||||
type Severity struct {
|
type Severity struct {
|
||||||
|
gpt-review-bot
commented
[MINOR] LoadPersona enforces the size limit using os.Stat before reading, but does not re-check len(data) after os.ReadFile. A TOCTOU window could allow reading files larger than MaxPersonaFileSize if the file grows between stat and read. **[MINOR]** LoadPersona enforces the size limit using os.Stat before reading, but does not re-check len(data) after os.ReadFile. A TOCTOU window could allow reading files larger than MaxPersonaFileSize if the file grows between stat and read.
|
|||||||
Major string `json:"major"`
|
Major string `json:"major" yaml:"major"`
|
||||||
Minor string `json:"minor"`
|
Minor string `json:"minor" yaml:"minor"`
|
||||||
Nit string `json:"nit"`
|
Nit string `json:"nit" yaml:"nit"`
|
||||||
}
|
}
|
||||||
|
|
||||||
// LoadPersona loads a persona from a JSON file path.
|
// LoadPersona loads a persona from a JSON or YAML file path.
|
||||||
|
// Format is detected by file extension: .yaml/.yml for YAML, .json or other for JSON.
|
||||||
|
// Files larger than MaxPersonaFileSize are rejected.
|
||||||
|
//
|
||||||
|
// Symlinks are supported: os.Stat follows symlinks, so a symlink pointing to
|
||||||
|
sonnet-review-bot
commented
[NIT] The fallback to JSON in **[NIT]** The fallback to JSON in `LoadBuiltinPersona` has a comment 'Fall back to JSON for backwards compatibility', but the `embeddedPersonas` embed directive now only includes `*.yaml` files (`//go:embed personas/*.yaml`). The JSON files have been deleted. This means the JSON fallback path in `LoadBuiltinPersona` is dead code — it can never succeed because no `.json` files are embedded. The fallback should either be removed or the embed directive should include `*.json` if JSON built-in personas need to be supported.
|
|||||||
|
// a regular file will pass the IsRegular() check. Symlinks to non-regular files
|
||||||
|
// (directories, FIFOs, devices) are still rejected.
|
||||||
|
gpt-review-bot
commented
[MINOR] Built-in persona loader falls back to JSON if YAML not found, but the embedded FS only includes **[MINOR]** Built-in persona loader falls back to JSON if YAML not found, but the embedded FS only includes `*.yaml`. Since the JSON files were removed and not embedded, this fallback will never succeed for built-ins and can be simplified or the embed pattern expanded if JSON fallback is intended.
|
|||||||
func LoadPersona(path string) (*Persona, error) {
|
func LoadPersona(path string) (*Persona, error) {
|
||||||
|
// os.Stat follows symlinks, so symlinks to regular files are supported.
|
||||||
|
gpt-review-bot
commented
[MINOR] LoadPersona rejects non-regular files (e.g., symlinks) by checking Mode().IsRegular(). This may be overly strict for CI or repo setups that use symlinks. Consider allowing symlinks by resolving them (os.Stat vs. Lstat) or relaxing the check. **[MINOR]** LoadPersona rejects non-regular files (e.g., symlinks) by checking Mode().IsRegular(). This may be overly strict for CI or repo setups that use symlinks. Consider allowing symlinks by resolving them (os.Stat vs. Lstat) or relaxing the check.
|
|||||||
|
// The IsRegular() check operates on the target, not the symlink itself.
|
||||||
|
sonnet-review-bot
commented
[NIT] The comment on **[NIT]** The comment on `LoadBuiltinPersona` says 'Try YAML first (preferred format)' but then falls back to JSON. Since the embed directive is now `//go:embed personas/*.yaml` only (JSON files were deleted), the JSON fallback in `LoadBuiltinPersona` is dead code for built-in personas. It's harmless and provides forwards-compatibility for the unlikely case someone adds a `.json` builtin later, but could be simplified.
|
|||||||
|
info, err := os.Stat(path)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("read persona file %s: %w", path, err)
|
||||||
|
}
|
||||||
|
sonnet-review-bot
commented
[NIT] In **[NIT]** In `LoadBuiltinPersona`, the JSON fallback path reassigns `data, err = embeddedPersonas.ReadFile(...)` but the original `err` from the YAML attempt is already discarded by the `if err == nil { return ... }` pattern. This is fine and correct, but worth noting that the error variable from the YAML attempt is silently dropped. Since the YAML error is expected (file doesn't exist), this is intentional and acceptable.
|
|||||||
|
if !info.Mode().IsRegular() {
|
||||||
|
[MINOR] Missing regular file check: LoadPersona does not verify that the path refers to a regular file. Reading special files (FIFOs, devices) could block or cause unexpected behavior. Consider rejecting non-regular files (info.Mode().IsRegular()). **[MINOR]** Missing regular file check: LoadPersona does not verify that the path refers to a regular file. Reading special files (FIFOs, devices) could block or cause unexpected behavior. Consider rejecting non-regular files (info.Mode().IsRegular()).
|
|||||||
|
return nil, fmt.Errorf("persona file %s is not a regular file", path)
|
||||||
|
gpt-review-bot
commented
[NIT] ListBuiltinPersonas returns names in non-deterministic order (map iteration). Consider sorting the slice before returning for determinism, which can help with UX and potential downstream consumers. **[NIT]** ListBuiltinPersonas returns names in non-deterministic order (map iteration). Consider sorting the slice before returning for determinism, which can help with UX and potential downstream consumers.
|
|||||||
|
}
|
||||||
|
if info.Size() > MaxPersonaFileSize {
|
||||||
|
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
|
||||||
|
}
|
||||||
data, err := os.ReadFile(path)
|
data, err := os.ReadFile(path)
|
||||||
|
[MINOR] Potential TOCTOU/resource exhaustion risk: file size is checked via os.Stat before reading, but not re-validated after os.ReadFile. An attacker with control over the path could swap the file between Stat and Read, bypassing the size cap and causing large in-memory reads. **[MINOR]** Potential TOCTOU/resource exhaustion risk: file size is checked via os.Stat before reading, but not re-validated after os.ReadFile. An attacker with control over the path could swap the file between Stat and Read, bypassing the size cap and causing large in-memory reads.
|
|||||||
if err != nil {
|
if err != nil {
|
||||||
return nil, fmt.Errorf("read persona file %s: %w", path, err)
|
return nil, fmt.Errorf("read persona file %s: %w", path, err)
|
||||||
}
|
}
|
||||||
|
// Re-check size after read to defend against TOCTOU races where file
|
||||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `seen` map in `ListBuiltinPersonas` is used only to deduplicate persona names (it only ever stores `true`), but a `map[string]struct{}` would be more idiomatic. More importantly, the deduplication logic has a subtle inefficiency: it checks `if !seen[personaName]` before setting to `true`, but since the zero value of bool is `false`, the check is redundant — `seen[personaName] = true` is sufficient by itself regardless of the current value. The check adds no correctness but implies to readers it does something meaningful.
|
|||||||
|
// grows between stat and read (e.g., appending process, replaced file).
|
||||||
|
gpt-review-bot
commented
[NIT] LoadBuiltinPersona falls back to reading JSON from embedded FS, but JSON personas are no longer embedded; clarify the comment or remove the fallback if not needed. **[NIT]** LoadBuiltinPersona falls back to reading JSON from embedded FS, but JSON personas are no longer embedded; clarify the comment or remove the fallback if not needed.
|
|||||||
|
if len(data) > MaxPersonaFileSize {
|
||||||
|
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
|
||||||
|
}
|
||||||
return parsePersona(data, path)
|
return parsePersona(data, path)
|
||||||
}
|
}
|
||||||
|
|
||||||
// LoadBuiltinPersona loads a built-in persona by name.
|
// LoadBuiltinPersona loads a built-in persona by name.
|
||||||
// Returns an error if the persona doesn't exist.
|
// Returns an error if the persona doesn't exist.
|
||||||
|
// Built-in personas are stored in YAML format only (see embed directive).
|
||||||
func LoadBuiltinPersona(name string) (*Persona, error) {
|
func LoadBuiltinPersona(name string) (*Persona, error) {
|
||||||
filename := name + ".json"
|
yamlFile := name + ".yaml"
|
||||||
data, err := embeddedPersonas.ReadFile("personas/" + filename) // embed.FS paths use forward slashes per io/fs spec
|
data, err := embeddedPersonas.ReadFile("personas/" + yamlFile)
|
||||||
|
gpt-review-bot
commented
[NIT] The comment in parsePersona mentions 'go-yaml v1.16.0+ has built-in protection against deeply nested structures' which refers to a different library (goccy/go-yaml). Update the comment to reflect gopkg.in/yaml.v3 or remove the claim. **[NIT]** The comment in parsePersona mentions 'go-yaml v1.16.0+ has built-in protection against deeply nested structures' which refers to a different library (goccy/go-yaml). Update the comment to reflect gopkg.in/yaml.v3 or remove the claim.
|
|||||||
if err != nil {
|
if err != nil {
|
||||||
available := ListBuiltinPersonas()
|
available := ListBuiltinPersonas()
|
||||||
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
|
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
|
||||||
}
|
}
|
||||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `ListBuiltinPersonas` returns a non-deterministic (map-iteration) order of persona names. While the return value isn't documented as sorted, callers such as the error message in `LoadBuiltinPersona` join the names to display to users, and the non-deterministic order makes tests and error messages flaky. The slice should be sorted with `slices.Sort` (or `sort.Strings`) before returning.
|
|||||||
return parsePersona(data, "builtin:"+name)
|
return parsePersona(data, "builtin:"+yamlFile)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
sonnet-review-bot
commented
[NIT] In **[NIT]** In `ListBuiltinPersonas`, the `seen` map is `map[string]bool` and the code checks `if !seen[personaName]` before setting it to `true`. The `!seen[...]` check is redundant since setting it to `true` again would be a no-op. The code works correctly but the guard condition (`if !seen[personaName]`) adds noise — it could just be `seen[personaName] = true` unconditionally.
|
|||||||
// ListBuiltinPersonas returns the names of all built-in personas.
|
// ListBuiltinPersonas returns the names of all built-in personas in sorted order.
|
||||||
// Returns an empty slice if the embedded directory cannot be read.
|
// Returns an empty slice if the embedded directory cannot be read.
|
||||||
|
sonnet-review-bot
commented
[MINOR] **[MINOR]** `ListBuiltinPersonas` iterates over a `map[string]bool` to build the names slice, producing non-deterministic ordering. This can cause flaky tests or confusing error messages ("available: docs, security, architect" vs "available: architect, docs, security" depending on map iteration order). The slice should be sorted before returning: `sort.Strings(names)`.
|
|||||||
func ListBuiltinPersonas() []string {
|
func ListBuiltinPersonas() []string {
|
||||||
entries, err := embeddedPersonas.ReadDir("personas")
|
entries, err := embeddedPersonas.ReadDir("personas")
|
||||||
if err != nil {
|
if err != nil {
|
||||||
return []string{}
|
return []string{}
|
||||||
}
|
}
|
||||||
var names []string
|
seen := make(map[string]bool)
|
||||||
for _, e := range entries {
|
for _, e := range entries {
|
||||||
if e.IsDir() {
|
if e.IsDir() {
|
||||||
continue
|
continue
|
||||||
}
|
}
|
||||||
name := e.Name()
|
name := e.Name()
|
||||||
|
sonnet-review-bot
commented
[NIT] In **[NIT]** In `ListBuiltinPersonas`, `if !seen[personaName] { seen[personaName] = true }` can be simplified to `seen[personaName] = true` — the guard is unnecessary since setting an already-true bool to true is a no-op.
|
|||||||
if strings.HasSuffix(name, ".json") {
|
// Strip extension to get persona name
|
||||||
|
sonnet-review-bot
commented
[NIT] In **[NIT]** In `ListBuiltinPersonas`, the map update `if !seen[personaName] { seen[personaName] = true }` can be simplified to just `seen[personaName] = true` — setting to `true` when already `true` is a no-op. The guard is unnecessary.
sonnet-review-bot
commented
[NIT] In **[NIT]** In `ListBuiltinPersonas`, the map population pattern is slightly redundant: `if !seen[personaName] { seen[personaName] = true }`. Since we only ever set it to `true`, the guard `!seen[personaName]` is equivalent to just `seen[personaName] = true` (maps return zero-value false for missing keys, and overwriting with true is idempotent). The condition adds no value. Could simplify to `seen[personaName] = true`.
|
|||||||
names = append(names, strings.TrimSuffix(name, ".json"))
|
var personaName string
|
||||||
|
switch {
|
||||||
|
case strings.HasSuffix(name, ".yaml"):
|
||||||
|
gpt-review-bot
commented
[MINOR] Relying solely on library defaults for YAML depth/alias protections. Consider using an explicit decoder with defensive options (if supported by the library) to enforce a depth/alias limit for defense in depth, aligning with the design doc intent. **[MINOR]** Relying solely on library defaults for YAML depth/alias protections. Consider using an explicit decoder with defensive options (if supported by the library) to enforce a depth/alias limit for defense in depth, aligning with the design doc intent.
gpt-review-bot
commented
[NIT] ListBuiltinPersonas handles .yml and .json extensions even though only *.yaml files are embedded. This is harmless but could be simplified or clarified in comments to reflect the embed pattern. **[NIT]** ListBuiltinPersonas handles .yml and .json extensions even though only *.yaml files are embedded. This is harmless but could be simplified or clarified in comments to reflect the embed pattern.
|
|||||||
|
personaName = strings.TrimSuffix(name, ".yaml")
|
||||||
|
case strings.HasSuffix(name, ".yml"):
|
||||||
|
[MINOR] YAML parsing uses yaml.Unmarshal without explicit decoder limits. Although goccy/go-yaml >= v1.16.0 adds default depth protections, setting explicit limits (e.g., maximum nesting depth/collection sizes if supported) would add defense-in-depth against resource exhaustion in pathological inputs. **[MINOR]** YAML parsing uses yaml.Unmarshal without explicit decoder limits. Although goccy/go-yaml >= v1.16.0 adds default depth protections, setting explicit limits (e.g., maximum nesting depth/collection sizes if supported) would add defense-in-depth against resource exhaustion in pathological inputs.
[MAJOR] Unbounded YAML deserialization with yaml.Unmarshal on potentially untrusted persona files can allow resource exhaustion (e.g., deeply nested structures or alias expansion bombs). There is no explicit limit on nesting, aliases, or input size, contrary to the comment suggesting built-in protection. **[MAJOR]** Unbounded YAML deserialization with yaml.Unmarshal on potentially untrusted persona files can allow resource exhaustion (e.g., deeply nested structures or alias expansion bombs). There is no explicit limit on nesting, aliases, or input size, contrary to the comment suggesting built-in protection.
|
|||||||
|
personaName = strings.TrimSuffix(name, ".yml")
|
||||||
|
case strings.HasSuffix(name, ".json"):
|
||||||
|
personaName = strings.TrimSuffix(name, ".json")
|
||||||
|
default:
|
||||||
|
continue
|
||||||
|
sonnet-review-bot
commented
[NIT] The comment **[NIT]** The comment `// go-yaml v1.16.0+ has built-in protection against deeply nested structures` is misleading/incorrect for `gopkg.in/yaml.v3`. It appears to have been written for `goccy/go-yaml` and copied over without updating. This should either be removed or corrected to accurately describe `gopkg.in/yaml.v3`'s actual behavior.
|
|||||||
|
}
|
||||||
|
if !seen[personaName] {
|
||||||
|
seen[personaName] = true
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
names := make([]string, 0, len(seen))
|
||||||
|
for name := range seen {
|
||||||
|
names = append(names, name)
|
||||||
|
}
|
||||||
|
sort.Strings(names)
|
||||||
return names
|
return names
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// parsePersona parses persona data from JSON or YAML format.
|
||||||
|
// Format is detected by the source file extension.
|
||||||
func parsePersona(data []byte, source string) (*Persona, error) {
|
func parsePersona(data []byte, source string) (*Persona, error) {
|
||||||
|
lowerSource := strings.ToLower(source)
|
||||||
|
isYAML := strings.HasSuffix(lowerSource, ".yaml") || strings.HasSuffix(lowerSource, ".yml")
|
||||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `ListBuiltinPersonas` deduplication map (`seen[personaName] = true`) is correct but slightly redundant: the condition `if !seen[personaName]` is always true on the first encounter (the only time the body runs), so the inner check adds noise. A simpler pattern would be `if !seen[personaName] { seen[personaName] = true; names = append(names, name) }`, building the slice directly instead of a two-pass map → slice conversion. Not a bug, just unnecessary complexity.
|
|||||||
|
|
||||||
var p Persona
|
var p Persona
|
||||||
if err := json.Unmarshal(data, &p); err != nil {
|
var err error
|
||||||
|
if isYAML {
|
||||||
|
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
|
||||||
|
} else {
|
||||||
|
// Use json.Decoder with DisallowUnknownFields for consistency with
|
||||||
|
// YAML's KnownFields(true) - both reject unknown fields to catch typos.
|
||||||
|
dec := json.NewDecoder(bytes.NewReader(data))
|
||||||
|
gpt-review-bot
commented
[MINOR] YAML decoder does not enable KnownFields/strict mode, so unknown keys in persona YAML are silently ignored. Enabling strict field checking would help catch typos (e.g., dec.KnownFields(true)) before validation. **[MINOR]** YAML decoder does not enable KnownFields/strict mode, so unknown keys in persona YAML are silently ignored. Enabling strict field checking would help catch typos (e.g., dec.KnownFields(true)) before validation.
gpt-review-bot
commented
[MINOR] JSON parsing uses json.Unmarshal which accepts unknown fields, while YAML is parsed with KnownFields(true). For parity and to catch typos in JSON persona files, consider switching to a json.Decoder with DisallowUnknownFields(). **[MINOR]** JSON parsing uses json.Unmarshal which accepts unknown fields, while YAML is parsed with KnownFields(true). For parity and to catch typos in JSON persona files, consider switching to a json.Decoder with DisallowUnknownFields().
|
|||||||
|
dec.DisallowUnknownFields()
|
||||||
|
err = dec.Decode(&p)
|
||||||
|
gpt-review-bot
commented
[MINOR] JSON parsing does not enforce single-document input. After dec.Decode(&p), any trailing JSON values would be silently ignored. Consider verifying EOF by attempting a second decode and expecting io.EOF to ensure there's no extra data. **[MINOR]** JSON parsing does not enforce single-document input. After dec.Decode(&p), any trailing JSON values would be silently ignored. Consider verifying EOF by attempting a second decode and expecting io.EOF to ensure there's no extra data.
|
|||||||
|
}
|
||||||
|
if err != nil {
|
||||||
return nil, fmt.Errorf("parse persona %s: %w", source, err)
|
return nil, fmt.Errorf("parse persona %s: %w", source, err)
|
||||||
|
gpt-review-bot
commented
[MINOR] YAML depth check traverses node.Content but ignores Alias nodes (node.Alias). Malicious YAML could leverage anchors/aliases to create effective deep structures without increasing Content depth. Consider handling Alias nodes explicitly. **[MINOR]** YAML depth check traverses node.Content but ignores Alias nodes (node.Alias). Malicious YAML could leverage anchors/aliases to create effective deep structures without increasing Content depth. Consider handling Alias nodes explicitly.
|
|||||||
}
|
}
|
||||||
if err := validatePersona(&p, source); err != nil {
|
if err := validatePersona(&p, source); err != nil {
|
||||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `unmarshalYAMLWithDepthLimit` function uses `interface{}` as the parameter type for `out` instead of `any`. The project targets the latest stable Go release, and `any` is the idiomatic alias since Go 1.18. This is inconsistent with the rest of the codebase convention.
sonnet-review-bot
commented
[MINOR] The two-pass decode approach (first into yaml.Node for depth check, then strict decode from raw bytes) is necessary because KnownFields doesn't work on yaml.Node.Decode(), but this means the input bytes are parsed twice. The comment explains this correctly. A minor concern: if **[MINOR]** The two-pass decode approach (first into yaml.Node for depth check, then strict decode from raw bytes) is necessary because KnownFields doesn't work on yaml.Node.Decode(), but this means the input bytes are parsed twice. The comment explains this correctly. A minor concern: if `gopkg.in/yaml.v3` ever adds KnownFields support to node.Decode(), this could be simplified — but for now this is the correct workaround and is well-documented.
|
|||||||
@@ -84,6 +156,74 @@ func parsePersona(data []byte, source string) (*Persona, error) {
|
|||||||
return &p, nil
|
return &p, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
|
||||||
|
[MAJOR] The recursive YAML depth checker does not detect alias/anchor cycles. If a YAML document contains cyclic aliases (e.g., an alias ultimately pointing back to an anchored node that references the alias), checkYAMLDepth can recurse indefinitely, leading to stack exhaustion or a hang (DoS). Add cycle detection (visited set of *yaml.Node) or a total node visitation cap to prevent infinite recursion. **[MAJOR]** The recursive YAML depth checker does not detect alias/anchor cycles. If a YAML document contains cyclic aliases (e.g., an alias ultimately pointing back to an anchored node that references the alias), checkYAMLDepth can recurse indefinitely, leading to stack exhaustion or a hang (DoS). Add cycle detection (visited set of *yaml.Node) or a total node visitation cap to prevent infinite recursion.
|
|||||||
|
// and strict field checking. This protects against stack exhaustion from deeply
|
||||||
|
[MINOR] JSON parsing uses json.Unmarshal, which silently ignores unknown fields. For consistency with YAML's KnownFields(true) and to harden against typos or unintended keys, consider switching to json.Decoder with DisallowUnknownFields() to reject unknown JSON fields. **[MINOR]** JSON parsing uses json.Unmarshal, which silently ignores unknown fields. For consistency with YAML's KnownFields(true) and to harden against typos or unintended keys, consider switching to json.Decoder with DisallowUnknownFields() to reject unknown JSON fields.
|
|||||||
|
// nested structures and catches typos in field names.
|
||||||
|
sonnet-review-bot
commented
[MINOR] The decoder created with **[MINOR]** The decoder created with `yaml.NewDecoder` is used to decode a single document but does not check for a second call to `Decode` returning `io.EOF` to confirm there's only one document. For untrusted user-supplied YAML files, silently ignoring additional documents is acceptable (they won't affect the parsed struct), but it's worth noting that multi-document YAML files will have their additional documents silently ignored rather than surfacing an error. This is a minor UX issue, not a correctness bug.
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `unmarshalYAMLWithDepthLimit` function silently ignores multi-document YAML files (only parses the first document). While the comment acknowledges this, it could be a footgun: a user who accidentally writes `---` in their persona file will get no error, just silent truncation. Consider calling `dec.Decode` a second time and returning an error if a second document is found, to give users a clearer signal.
|
|||||||
|
// Multi-document YAML files are rejected to prevent silent data loss.
|
||||||
|
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
|
||||||
|
// First pass: decode into a yaml.Node to check depth limits and node counts.
|
||||||
|
[MINOR] When encountering an alias node, the depth check calls checkYAMLDepth on the alias target without incrementing depth, potentially undercounting nesting by one compared to non-alias children. While limited in practical bypass (one level), this slightly weakens the intended MaxYAMLDepth enforcement. Consider passing depth+1 when following an alias used as a value. **[MINOR]** When encountering an alias node, the depth check calls checkYAMLDepth on the alias target without incrementing depth, potentially undercounting nesting by one compared to non-alias children. While limited in practical bypass (one level), this slightly weakens the intended MaxYAMLDepth enforcement. Consider passing depth+1 when following an alias used as a value.
[MINOR] Depth enforcement occurs after decoding the full YAML into a yaml.Node. While the 64KB file-size cap mitigates risk, an attacker could still supply deeply nested but small YAML to stress the decoder before the depth check. Consider additional safeguards (e.g., limiting total nodes visited or using a visited set, or further reducing allowed size/depth) for defense in depth. **[MINOR]** Depth enforcement occurs after decoding the full YAML into a yaml.Node. While the 64KB file-size cap mitigates risk, an attacker could still supply deeply nested but small YAML to stress the decoder before the depth check. Consider additional safeguards (e.g., limiting total nodes visited or using a visited set, or further reducing allowed size/depth) for defense in depth.
|
|||||||
|
// This prevents stack exhaustion before we attempt to decode into structs.
|
||||||
|
gpt-review-bot
commented
[MINOR] checkYAMLDepth follows alias pointers without guarding against potential alias cycles; consider tracking visited nodes or limiting alias dereference to avoid possible infinite recursion on pathological inputs. **[MINOR]** checkYAMLDepth follows alias pointers without guarding against potential alias cycles; consider tracking visited nodes or limiting alias dereference to avoid possible infinite recursion on pathological inputs.
|
|||||||
|
var node yaml.Node
|
||||||
|
dec := yaml.NewDecoder(bytes.NewReader(data))
|
||||||
|
if err := dec.Decode(&node); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
sonnet-review-bot
commented
[MINOR] The multi-document rejection check ( **[MINOR]** The multi-document rejection check (`dec.Decode(&extra)`) re-uses the same decoder after the first `dec.Decode(&node)` call. If the first document exhausted the decoder but there's trailing non-document content (e.g., a trailing `...` end-of-document marker), the behavior depends on go-yaml internals. This is likely fine in practice, but the comment could clarify that this relies on go-yaml's decoder advancing past the first document.
|
|||||||
|
// Reject multi-document YAML files - silently ignoring additional documents
|
||||||
|
// could lead to confusing behavior where users think their changes take effect.
|
||||||
|
var extra yaml.Node
|
||||||
|
if dec.Decode(&extra) == nil {
|
||||||
|
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
|
||||||
|
sonnet-review-bot
commented
[MINOR] The alias-following logic in **[MINOR]** The alias-following logic in `checkYAMLDepth` does not count the alias node's own depth level before recursing into the alias target. If an alias node appears near the depth limit, the target's content is checked starting from the same `depth` value as the alias itself (not `depth+1`). This means a deeply aliased structure could potentially bypass the limit by 1 level. While the impact is minimal given the depth=20 limit, incrementing depth before recursing into the alias target would be more consistent.
|
|||||||
|
}
|
||||||
|
sonnet-review-bot
commented
[MINOR] The alias depth check uses **[MINOR]** The alias depth check uses `return checkYAMLDepth(node.Alias, depth, maxDepth)` — i.e., the alias itself doesn't add to the depth. This is intentional per the comment, but YAML aliases can be used to create large fan-out structures. An alias to a deeply nested structure effectively doubles the depth impact without incrementing depth for the alias node itself. For the persona use case with MaxYAMLDepth=20 this is acceptable, but worth noting the design tradeoff.
|
|||||||
|
|
||||||
|
nodeCount := 0
|
||||||
|
if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
|
||||||
|
sonnet-review-bot
commented
[MINOR] The two-pass decode approach (first into yaml.Node for depth checking, then a fresh decoder with KnownFields) re-parses the raw bytes a second time. This means a malicious YAML that causes the first-pass decoder to consume significant CPU (e.g., extremely long scalar strings) gets two chances to do so. The depth/node count limits reduce but don't fully eliminate this risk. A minor concern since file size is bounded at 64KB, but worth noting. **[MINOR]** The two-pass decode approach (first into yaml.Node for depth checking, then a fresh decoder with KnownFields) re-parses the raw bytes a second time. This means a malicious YAML that causes the first-pass decoder to consume significant CPU (e.g., extremely long scalar strings) gets two chances to do so. The depth/node count limits reduce but don't fully eliminate this risk. A minor concern since file size is bounded at 64KB, but worth noting.
|
|||||||
|
// Second pass: decode with strict field checking enabled.
|
||||||
|
// KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy".
|
||||||
|
// We must re-decode from the original data because yaml.Node.Decode() doesn't
|
||||||
|
// support the KnownFields option.
|
||||||
|
strictDec := yaml.NewDecoder(bytes.NewReader(data))
|
||||||
|
sonnet-review-bot
commented
[MINOR] The multi-document check **[MINOR]** The multi-document check `dec.Decode(&extra) == nil` consumes IO from the first-pass decoder, but the strict second-pass re-decodes from `bytes.NewReader(data)`, so correctness is fine. However, the document count check is somewhat brittle: it silently ignores any decode error on the second document (including genuine IO errors) by treating any non-nil error as 'only one document'. In practice this works since the reader is in-memory, but the intent would be clearer with `if err := dec.Decode(&extra); !errors.Is(err, io.EOF) && err == nil { return fmt.Errorf(...) }` or by explicitly checking `err == nil`.
|
|||||||
|
strictDec.KnownFields(true)
|
||||||
|
[MAJOR] checkYAMLDepth follows yaml aliases recursively without cycle detection. A crafted YAML with cyclical/self-referential aliases can cause infinite recursion and a panic, enabling a denial-of-service against the process loading persona files. **[MAJOR]** checkYAMLDepth follows yaml aliases recursively without cycle detection. A crafted YAML with cyclical/self-referential aliases can cause infinite recursion and a panic, enabling a denial-of-service against the process loading persona files.
|
|||||||
|
return strictDec.Decode(out)
|
||||||
|
}
|
||||||
|
|
||||||
|
// checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit
|
||||||
|
// or the total node count limit. It also detects alias cycles to prevent infinite
|
||||||
|
// recursion from crafted YAML with self-referential aliases.
|
||||||
|
func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error {
|
||||||
|
if depth > maxDepth {
|
||||||
|
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
|
||||||
|
*nodeCount++
|
||||||
|
if *nodeCount > maxNodes {
|
||||||
|
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Cycle detection: if we've seen this node before, we're in a cycle.
|
||||||
|
if _, ok := seen[node]; ok {
|
||||||
|
return nil // Already validated this subtree, skip to avoid infinite recursion.
|
||||||
|
}
|
||||||
|
seen[node] = struct{}{}
|
||||||
|
|
||||||
|
// Handle alias nodes: follow the alias to its anchor target.
|
||||||
|
// Increment depth when following aliases since they expand the effective structure.
|
||||||
|
if node.Kind == yaml.AliasNode && node.Alias != nil {
|
||||||
|
return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, child := range node.Content {
|
||||||
|
sonnet-review-bot
commented
[NIT] The cycle detection comment says 'Already validated this subtree, skip to avoid infinite recursion' but returning nil on a revisited node means that node's depth contribution is not re-checked at the new depth. This is intentional (and correct for cycle-breaking), but the comment could be clearer that depth checking for shared/aliased subtrees is only done once from the first encounter. **[NIT]** The cycle detection comment says 'Already validated this subtree, skip to avoid infinite recursion' but returning nil on a revisited node means that node's depth contribution is not re-checked at the new depth. This is intentional (and correct for cycle-breaking), but the comment could be clearer that depth checking for shared/aliased subtrees is only done once from the first encounter.
|
|||||||
|
if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil {
|
||||||
|
return err
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
func validatePersona(p *Persona, source string) error {
|
func validatePersona(p *Persona, source string) error {
|
||||||
if p.Name == "" {
|
if p.Name == "" {
|
||||||
return fmt.Errorf("persona %s: name is required", source)
|
return fmt.Errorf("persona %s: name is required", source)
|
||||||
|
|||||||
@@ -1,10 +1,13 @@
|
|||||||
package review
|
package review
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"fmt"
|
||||||
"os"
|
"os"
|
||||||
"path/filepath"
|
"path/filepath"
|
||||||
"strings"
|
"strings"
|
||||||
"testing"
|
"testing"
|
||||||
|
|
||||||
|
"gopkg.in/yaml.v3"
|
||||||
)
|
)
|
||||||
|
|
||||||
func TestLoadBuiltinPersona(t *testing.T) {
|
func TestLoadBuiltinPersona(t *testing.T) {
|
||||||
@@ -87,6 +90,83 @@ func TestListBuiltinPersonas(t *testing.T) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestLoadPersonaFromYAMLFile(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "test.yaml")
|
||||||
|
|
||||||
|
content := `# Test persona
|
||||||
|
name: test
|
||||||
|
display_name: Test Persona
|
||||||
|
identity: |
|
||||||
|
You are a test persona.
|
||||||
|
Multi-line identity works.
|
||||||
|
focus:
|
||||||
|
- testing
|
||||||
|
- validation
|
||||||
|
ignore:
|
||||||
|
- nothing
|
||||||
|
severity:
|
||||||
|
major: Big problems
|
||||||
|
minor: Small problems
|
||||||
|
nit: Tiny problems
|
||||||
|
`
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
p, err := LoadPersona(path)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona failed: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
if p.Name != "test" {
|
||||||
|
t.Errorf("Name = %q, want %q", p.Name, "test")
|
||||||
|
}
|
||||||
|
if p.DisplayName != "Test Persona" {
|
||||||
|
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
|
||||||
|
}
|
||||||
|
if len(p.Focus) != 2 {
|
||||||
|
t.Errorf("Focus len = %d, want 2", len(p.Focus))
|
||||||
|
}
|
||||||
|
if !strings.Contains(p.Identity, "Multi-line") {
|
||||||
|
t.Error("Identity should contain multi-line content")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoadPersonaFromYMLFile(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "test.yml")
|
||||||
|
|
||||||
|
content := `name: test
|
||||||
|
display_name: Test YML
|
||||||
|
identity: Test identity
|
||||||
|
focus:
|
||||||
|
- testing
|
||||||
|
ignore: []
|
||||||
|
severity:
|
||||||
|
major: Big
|
||||||
|
minor: Small
|
||||||
|
nit: Tiny
|
||||||
|
`
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
p, err := LoadPersona(path)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona failed: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
if p.Name != "test" {
|
||||||
|
t.Errorf("Name = %q, want %q", p.Name, "test")
|
||||||
|
}
|
||||||
|
if p.DisplayName != "Test YML" {
|
||||||
|
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test YML")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
func TestLoadPersonaFromJSONFile(t *testing.T) {
|
func TestLoadPersonaFromJSONFile(t *testing.T) {
|
||||||
dir := t.TempDir()
|
dir := t.TempDir()
|
||||||
path := filepath.Join(dir, "test.json")
|
path := filepath.Join(dir, "test.json")
|
||||||
@@ -96,6 +176,7 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
|
|||||||
"display_name": "Test Persona",
|
"display_name": "Test Persona",
|
||||||
|
sonnet-review-bot
commented
[NIT] The test **[NIT]** The test `TestLoadPersonaFromJSONFile` has a blank line added inside the JSON content string (`"focus": ["testing", "validation"],\n\n"ignore":...`). While valid JSON, the blank line is unnecessary and looks like an accidental edit.
|
|||||||
"identity": "You are a test persona.\nMulti-line identity works.",
|
"identity": "You are a test persona.\nMulti-line identity works.",
|
||||||
"focus": ["testing", "validation"],
|
"focus": ["testing", "validation"],
|
||||||
|
|
||||||
"ignore": ["nothing"],
|
"ignore": ["nothing"],
|
||||||
"severity": {
|
"severity": {
|
||||||
"major": "Big problems",
|
"major": "Big problems",
|
||||||
@@ -130,22 +211,38 @@ func TestLoadPersonaFromJSONFile(t *testing.T) {
|
|||||||
func TestLoadPersonaValidation(t *testing.T) {
|
func TestLoadPersonaValidation(t *testing.T) {
|
||||||
tests := []struct {
|
tests := []struct {
|
||||||
name string
|
name string
|
||||||
json string
|
content string
|
||||||
|
ext string
|
||||||
wantErr string
|
wantErr string
|
||||||
}{
|
}{
|
||||||
{
|
{
|
||||||
name: "missing name",
|
name: "missing name yaml",
|
||||||
json: `{"identity": "test"}`,
|
content: "identity: test\n",
|
||||||
|
ext: ".yaml",
|
||||||
wantErr: "name is required",
|
wantErr: "name is required",
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
name: "missing identity",
|
name: "missing identity yaml",
|
||||||
json: `{"name": "test"}`,
|
content: "name: test\n",
|
||||||
|
ext: ".yaml",
|
||||||
wantErr: "identity is required",
|
wantErr: "identity is required",
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
name: "display_name defaults to name",
|
name: "missing name json",
|
||||||
json: `{"name": "test", "identity": "test identity"}`,
|
content: `{"identity": "test"}`,
|
||||||
|
ext: ".json",
|
||||||
|
wantErr: "name is required",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "missing identity json",
|
||||||
|
content: `{"name": "test"}`,
|
||||||
|
ext: ".json",
|
||||||
|
wantErr: "identity is required",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "display_name defaults to name",
|
||||||
|
content: "name: test\nidentity: test identity\n",
|
||||||
|
ext: ".yaml",
|
||||||
// No error expected - should succeed
|
// No error expected - should succeed
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
@@ -153,8 +250,8 @@ func TestLoadPersonaValidation(t *testing.T) {
|
|||||||
for _, tt := range tests {
|
for _, tt := range tests {
|
||||||
t.Run(tt.name, func(t *testing.T) {
|
t.Run(tt.name, func(t *testing.T) {
|
||||||
dir := t.TempDir()
|
dir := t.TempDir()
|
||||||
path := filepath.Join(dir, "test.json")
|
path := filepath.Join(dir, "test"+tt.ext)
|
||||||
if err := os.WriteFile(path, []byte(tt.json), 0644); err != nil {
|
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
|
||||||
t.Fatalf("failed to write test file: %v", err)
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -184,12 +281,25 @@ func TestLoadPersonaValidation(t *testing.T) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
func TestLoadPersonaFileNotFound(t *testing.T) {
|
func TestLoadPersonaFileNotFound(t *testing.T) {
|
||||||
_, err := LoadPersona("/nonexistent/path/persona.json")
|
_, err := LoadPersona("/nonexistent/path/persona.yaml")
|
||||||
if err == nil {
|
if err == nil {
|
||||||
t.Error("expected error for nonexistent file")
|
t.Error("expected error for nonexistent file")
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestLoadPersonaInvalidYAML(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "invalid.yaml")
|
||||||
|
if err := os.WriteFile(path, []byte("not valid yaml:\n - [broken"), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Error("expected error for invalid YAML")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
func TestLoadPersonaInvalidJSON(t *testing.T) {
|
func TestLoadPersonaInvalidJSON(t *testing.T) {
|
||||||
dir := t.TempDir()
|
dir := t.TempDir()
|
||||||
path := filepath.Join(dir, "invalid.json")
|
path := filepath.Join(dir, "invalid.json")
|
||||||
@@ -203,6 +313,38 @@ func TestLoadPersonaInvalidJSON(t *testing.T) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestLoadPersonaCaseInsensitiveExtension(t *testing.T) {
|
||||||
|
tests := []struct {
|
||||||
|
name string
|
||||||
|
ext string
|
||||||
|
}{
|
||||||
|
{"lowercase yaml", ".yaml"},
|
||||||
|
{"uppercase YAML", ".YAML"},
|
||||||
|
{"mixed case Yaml", ".Yaml"},
|
||||||
|
{"lowercase yml", ".yml"},
|
||||||
|
{"uppercase YML", ".YML"},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, tt := range tests {
|
||||||
|
t.Run(tt.name, func(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "test"+tt.ext)
|
||||||
|
content := "name: test\nidentity: test identity\n"
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
p, err := LoadPersona(path)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona failed for extension %s: %v", tt.ext, err)
|
||||||
|
}
|
||||||
|
if p.Name != "test" {
|
||||||
|
t.Errorf("Name = %q, want %q", p.Name, "test")
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
func TestCapitalizeFirst(t *testing.T) {
|
func TestCapitalizeFirst(t *testing.T) {
|
||||||
tests := []struct {
|
tests := []struct {
|
||||||
input string
|
input string
|
||||||
@@ -237,3 +379,400 @@ func TestListBuiltinPersonasReturnsEmptySlice(t *testing.T) {
|
|||||||
t.Error("ListBuiltinPersonas should return empty slice, not nil")
|
t.Error("ListBuiltinPersonas should return empty slice, not nil")
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func TestYAMLMultilineStrings(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "multiline.yaml")
|
||||||
|
|
||||||
|
// Test literal block scalar (|) which preserves newlines
|
||||||
|
content := `name: multiline
|
||||||
|
display_name: Multiline Test
|
||||||
|
identity: |
|
||||||
|
First line.
|
||||||
|
Second line.
|
||||||
|
Third line.
|
||||||
|
focus:
|
||||||
|
- item one
|
||||||
|
ignore: []
|
||||||
|
severity:
|
||||||
|
major: Major issue
|
||||||
|
minor: Minor issue
|
||||||
|
nit: Nit
|
||||||
|
`
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
p, err := LoadPersona(path)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona failed: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Literal block scalar preserves newlines
|
||||||
|
if !strings.Contains(p.Identity, "\n") {
|
||||||
|
t.Error("Identity should contain newlines from literal block scalar")
|
||||||
|
}
|
||||||
|
if !strings.Contains(p.Identity, "Second line") {
|
||||||
|
t.Error("Identity should contain 'Second line'")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLComments(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "comments.yaml")
|
||||||
|
|
||||||
|
content := `# This is a comment
|
||||||
|
name: commented # inline comment
|
||||||
|
display_name: Commented Persona
|
||||||
|
# Another comment
|
||||||
|
identity: Test identity
|
||||||
|
focus:
|
||||||
|
- item # comment after item
|
||||||
|
ignore: []
|
||||||
|
severity:
|
||||||
|
major: Major
|
||||||
|
minor: Minor
|
||||||
|
nit: Nit
|
||||||
|
`
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
p, err := LoadPersona(path)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona failed: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Comments should be ignored
|
||||||
|
if p.Name != "commented" {
|
||||||
|
sonnet-review-bot
commented
[NIT] Both **[NIT]** Both `TestYAMLAliasCycleDetection` and `TestCheckYAMLDepthCycleDetectionDirect` test nearly identical scenarios (artificial AliasNode cycle). They could be merged into one test with subtly different configurations to reduce duplication.
|
|||||||
|
t.Errorf("Name = %q, want %q", p.Name, "commented")
|
||||||
|
}
|
||||||
|
if p.Focus[0] != "item" {
|
||||||
|
t.Errorf("Focus[0] = %q, want %q", p.Focus[0], "item")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLDeeplyNestedRejection(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "deeply-nested.yaml")
|
||||||
|
|
||||||
|
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
|
||||||
|
// Each level adds 2 to the depth count (key + value mapping).
|
||||||
|
var sb strings.Builder
|
||||||
|
sb.WriteString("name: test\nidentity: test\nnested:\n")
|
||||||
|
indent := " "
|
||||||
|
for i := 0; i < 25; i++ {
|
||||||
|
sb.WriteString(strings.Repeat(indent, i+1))
|
||||||
|
sb.WriteString(fmt.Sprintf("level%d:\n", i))
|
||||||
|
}
|
||||||
|
sb.WriteString(strings.Repeat(indent, 26))
|
||||||
|
sb.WriteString("value: too-deep\n")
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Error("expected error for deeply nested YAML, got nil")
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), "nesting depth exceeds") {
|
||||||
|
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLFileSizeLimit(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "huge.yaml")
|
||||||
|
|
||||||
|
// Create a file larger than MaxPersonaFileSize (64 KB)
|
||||||
|
content := "name: test\nidentity: " + strings.Repeat("x", MaxPersonaFileSize+1) + "\n"
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Error("expected error for oversized file, got nil")
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), "exceeds maximum size") {
|
||||||
|
t.Errorf("error = %q, want containing 'exceeds maximum size'", err.Error())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLAliasCycleDetection(t *testing.T) {
|
||||||
|
// Test that our checkYAMLDepth function handles alias cycles gracefully
|
||||||
|
// by using the seen map to prevent infinite recursion.
|
||||||
|
// We test this directly because go-yaml's parser handles most cycles
|
||||||
|
// at parse time, but we need to ensure our checker is robust.
|
||||||
|
|
||||||
|
// Create a node structure where an alias points to a parent node,
|
||||||
|
// simulating what could happen with malicious input that bypasses
|
||||||
|
// go-yaml's cycle detection.
|
||||||
|
parent := &yaml.Node{
|
||||||
|
Kind: yaml.MappingNode,
|
||||||
|
Content: []*yaml.Node{
|
||||||
|
{Kind: yaml.ScalarNode, Value: "name"},
|
||||||
|
{Kind: yaml.ScalarNode, Value: "test"},
|
||||||
|
{Kind: yaml.ScalarNode, Value: "nested"},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create a child that aliases back to the parent (artificial cycle)
|
||||||
|
aliasToParent := &yaml.Node{
|
||||||
|
Kind: yaml.AliasNode,
|
||||||
|
Alias: parent,
|
||||||
|
}
|
||||||
|
parent.Content = append(parent.Content, aliasToParent)
|
||||||
|
|
||||||
|
nodeCount := 0
|
||||||
|
seen := make(map[*yaml.Node]struct{})
|
||||||
|
|
||||||
|
// This should NOT hang or stack overflow - the seen map prevents infinite recursion
|
||||||
|
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
|
||||||
|
if err != nil {
|
||||||
|
t.Errorf("unexpected error traversing cyclic structure: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Verify we tracked the parent in the seen map
|
||||||
|
if _, ok := seen[parent]; !ok {
|
||||||
|
t.Error("parent node not tracked in seen map")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLMultiDocumentRejection(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "multi.yaml")
|
||||||
|
|
||||||
|
// Multi-document YAML (documents separated by ---)
|
||||||
|
content := `name: first
|
||||||
|
identity: first document
|
||||||
|
---
|
||||||
|
name: second
|
||||||
|
identity: second document
|
||||||
|
`
|
||||||
|
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Error("expected error for multi-document YAML, got nil")
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), "multi-document") {
|
||||||
|
t.Errorf("error = %q, want containing 'multi-document'", err.Error())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLNodeCountLimit(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "wide.yaml")
|
||||||
|
|
||||||
|
// Build a YAML structure that's shallow but wide - many keys at the same level
|
||||||
|
// to test the node count limit (should exceed MaxYAMLNodes = 1000)
|
||||||
|
var sb strings.Builder
|
||||||
|
sb.WriteString("name: test\nidentity: test\n")
|
||||||
|
for i := 0; i < 600; i++ {
|
||||||
|
sb.WriteString(fmt.Sprintf("key%d: value%d\n", i, i))
|
||||||
|
}
|
||||||
|
|
||||||
|
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Error("expected error for wide YAML exceeding node count, got nil")
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), "node count exceeds") {
|
||||||
|
t.Errorf("error = %q, want containing 'node count exceeds'", err.Error())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
|
||||||
|
// Direct test of cycle detection in checkYAMLDepth by creating
|
||||||
|
// a node structure with an artificial cycle.
|
||||||
|
// This tests the seen map logic independent of go-yaml's parsing.
|
||||||
|
node := &yaml.Node{
|
||||||
|
Kind: yaml.MappingNode,
|
||||||
|
Content: []*yaml.Node{
|
||||||
|
{Kind: yaml.ScalarNode, Value: "key"},
|
||||||
|
{Kind: yaml.ScalarNode, Value: "value"},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create a cycle by making a child reference the parent
|
||||||
|
cycleChild := &yaml.Node{
|
||||||
|
Kind: yaml.AliasNode,
|
||||||
|
Alias: node, // Points back to the parent
|
||||||
|
}
|
||||||
|
node.Content = append(node.Content,
|
||||||
|
&yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"},
|
||||||
|
cycleChild,
|
||||||
|
)
|
||||||
|
|
||||||
|
nodeCount := 0
|
||||||
|
seen := make(map[*yaml.Node]struct{})
|
||||||
|
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
|
||||||
|
|
||||||
|
// Should complete without infinite recursion due to cycle detection
|
||||||
|
if err != nil {
|
||||||
|
t.Errorf("unexpected error: %v", err)
|
||||||
|
}
|
||||||
|
// The seen map should contain multiple entries
|
||||||
|
if len(seen) < 2 {
|
||||||
|
t.Errorf("seen map has %d entries, expected at least 2", len(seen))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestListBuiltinPersonasSortedOrder(t *testing.T) {
|
||||||
|
names := ListBuiltinPersonas()
|
||||||
|
if len(names) < 2 {
|
||||||
|
t.Skip("need at least 2 personas to test ordering")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Verify the list is sorted
|
||||||
|
for i := 1; i < len(names); i++ {
|
||||||
|
if names[i-1] > names[i] {
|
||||||
|
t.Errorf("ListBuiltinPersonas not sorted: %q > %q", names[i-1], names[i])
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestYAMLUnknownFieldsRejected(t *testing.T) {
|
||||||
|
tests := []struct {
|
||||||
|
name string
|
||||||
|
content string
|
||||||
|
wantErr string
|
||||||
|
}{
|
||||||
|
{
|
||||||
|
name: "unknown top-level field",
|
||||||
|
content: `name: test
|
||||||
|
identity: test identity
|
||||||
|
unknown_field: should fail
|
||||||
|
`,
|
||||||
|
wantErr: "unknown_field",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "typo in field name",
|
||||||
|
content: `name: test
|
||||||
|
identiy: typo should fail
|
||||||
|
`,
|
||||||
|
wantErr: "identiy",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "unknown field in severity",
|
||||||
|
content: `name: test
|
||||||
|
identity: test
|
||||||
|
severity:
|
||||||
|
major: Major
|
||||||
|
minro: typo
|
||||||
|
`,
|
||||||
|
wantErr: "minro",
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, tt := range tests {
|
||||||
|
t.Run(tt.name, func(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "unknown.yaml")
|
||||||
|
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Errorf("expected error for unknown field %q, got nil", tt.wantErr)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), tt.wantErr) {
|
||||||
|
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestJSONUnknownFieldsRejected(t *testing.T) {
|
||||||
|
tests := []struct {
|
||||||
|
name string
|
||||||
|
content string
|
||||||
|
wantErr string
|
||||||
|
}{
|
||||||
|
{
|
||||||
|
name: "unknown top-level field",
|
||||||
|
content: `{
|
||||||
|
"name": "test",
|
||||||
|
"identity": "test identity",
|
||||||
|
"unknown_field": "should fail"
|
||||||
|
}`,
|
||||||
|
wantErr: "unknown_field",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "typo in field name",
|
||||||
|
content: `{
|
||||||
|
"name": "test",
|
||||||
|
"identiy": "typo should fail"
|
||||||
|
}`,
|
||||||
|
wantErr: "identiy",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: "unknown field in severity",
|
||||||
|
content: `{
|
||||||
|
"name": "test",
|
||||||
|
"identity": "test",
|
||||||
|
"severity": {
|
||||||
|
"major": "ok",
|
||||||
|
"miner": "typo"
|
||||||
|
}
|
||||||
|
}`,
|
||||||
|
wantErr: "miner",
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, tt := range tests {
|
||||||
|
t.Run(tt.name, func(t *testing.T) {
|
||||||
|
dir := t.TempDir()
|
||||||
|
path := filepath.Join(dir, "test.json")
|
||||||
|
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
_, err := LoadPersona(path)
|
||||||
|
if err == nil {
|
||||||
|
t.Fatal("expected error for unknown field, got nil")
|
||||||
|
}
|
||||||
|
if !strings.Contains(err.Error(), tt.wantErr) {
|
||||||
|
t.Errorf("error = %q, want to contain %q", err.Error(), tt.wantErr)
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoadPersonaSymlink(t *testing.T) {
|
||||||
|
// Create a regular persona file
|
||||||
|
dir := t.TempDir()
|
||||||
|
realFile := filepath.Join(dir, "real.yaml")
|
||||||
|
content := `name: test
|
||||||
|
identity: test identity
|
||||||
|
`
|
||||||
|
if err := os.WriteFile(realFile, []byte(content), 0644); err != nil {
|
||||||
|
t.Fatalf("failed to write test file: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create a symlink to it
|
||||||
|
symlink := filepath.Join(dir, "link.yaml")
|
||||||
|
if err := os.Symlink(realFile, symlink); err != nil {
|
||||||
|
t.Fatalf("failed to create symlink: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// LoadPersona should work via symlink
|
||||||
|
p, err := LoadPersona(symlink)
|
||||||
|
if err != nil {
|
||||||
|
t.Fatalf("LoadPersona via symlink failed: %v", err)
|
||||||
|
}
|
||||||
|
if p.Name != "test" {
|
||||||
|
t.Errorf("Name = %q, want %q", p.Name, "test")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|||||||
@@ -1,26 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "architect",
|
|
||||||
"display_name": "Software Architect",
|
|
||||||
"identity": "You are a software architect reviewing code for design quality.\n\nYour expertise:\n- Design patterns and anti-patterns\n- Code organization and module boundaries\n- API design and contracts\n- Testability and dependency injection\n- Consistency with existing architecture\n- Technical debt identification",
|
|
||||||
"focus": [
|
|
||||||
"Design pattern violations or misuse",
|
|
||||||
"Module boundary violations (inappropriate coupling)",
|
|
||||||
"API design issues (unclear contracts, leaky abstractions)",
|
|
||||||
"Testability problems (hidden dependencies, god objects)",
|
|
||||||
"Inconsistency with existing codebase patterns",
|
|
||||||
"Unnecessary complexity or over-engineering",
|
|
||||||
"Missing abstractions or premature abstraction"
|
|
||||||
],
|
|
||||||
"ignore": [
|
|
||||||
"Security vulnerabilities (security persona handles these)",
|
|
||||||
"Performance micro-optimizations",
|
|
||||||
"Code style and formatting",
|
|
||||||
"Documentation typos",
|
|
||||||
"Test implementation details"
|
|
||||||
],
|
|
||||||
"severity": {
|
|
||||||
"major": "Architectural violations that will cause maintenance problems or make the codebase harder to evolve",
|
|
||||||
"minor": "Design issues that reduce clarity or testability but don't block progress",
|
|
||||||
"nit": "Minor pattern deviations or style preferences"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
@@ -0,0 +1,37 @@
|
|||||||
|
# Software Architect Persona
|
||||||
|
# Focuses on design quality, patterns, and code organization
|
||||||
|
|
||||||
|
name: architect
|
||||||
|
display_name: Software Architect
|
||||||
|
|
||||||
|
identity: |
|
||||||
|
You are a software architect reviewing code for design quality.
|
||||||
|
|
||||||
|
Your expertise:
|
||||||
|
- Design patterns and anti-patterns
|
||||||
|
- Code organization and module boundaries
|
||||||
|
- API design and contracts
|
||||||
|
- Testability and dependency injection
|
||||||
|
- Consistency with existing architecture
|
||||||
|
- Technical debt identification
|
||||||
|
|
||||||
|
focus:
|
||||||
|
- Design pattern violations or misuse
|
||||||
|
- Module boundary violations (inappropriate coupling)
|
||||||
|
- API design issues (unclear contracts, leaky abstractions)
|
||||||
|
- Testability problems (hidden dependencies, god objects)
|
||||||
|
- Inconsistency with existing codebase patterns
|
||||||
|
- Unnecessary complexity or over-engineering
|
||||||
|
- Missing abstractions or premature abstraction
|
||||||
|
|
||||||
|
ignore:
|
||||||
|
- Security vulnerabilities (security persona handles these)
|
||||||
|
- Performance micro-optimizations
|
||||||
|
- Code style and formatting
|
||||||
|
- Documentation typos
|
||||||
|
- Test implementation details
|
||||||
|
|
||||||
|
severity:
|
||||||
|
major: "Architectural violations that will cause maintenance problems or make the codebase harder to evolve"
|
||||||
|
minor: "Design issues that reduce clarity or testability but don't block progress"
|
||||||
|
nit: "Minor pattern deviations or style preferences"
|
||||||
@@ -1,26 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "docs",
|
|
||||||
"display_name": "Documentation Reviewer",
|
|
||||||
"identity": "You are a documentation specialist reviewing code for clarity and documentation quality.\n\nYour expertise:\n- API documentation and examples\n- Code comments and their accuracy\n- Error message clarity\n- README and guide quality\n- Naming clarity and self-documenting code",
|
|
||||||
"focus": [
|
|
||||||
"Missing or outdated documentation",
|
|
||||||
"Unclear or misleading comments",
|
|
||||||
"Poor error messages (cryptic, unhelpful, missing context)",
|
|
||||||
"Confusing naming (functions, variables, types)",
|
|
||||||
"Missing examples for complex APIs",
|
|
||||||
"Inconsistent terminology",
|
|
||||||
"Documentation that contradicts the code"
|
|
||||||
],
|
|
||||||
"ignore": [
|
|
||||||
"Security vulnerabilities",
|
|
||||||
"Performance issues",
|
|
||||||
"Design patterns",
|
|
||||||
"Test coverage",
|
|
||||||
"Code style (unless it affects readability)"
|
|
||||||
],
|
|
||||||
"severity": {
|
|
||||||
"major": "Documentation that actively misleads or missing docs for critical functionality",
|
|
||||||
"minor": "Unclear documentation or poor error messages that will confuse users",
|
|
||||||
"nit": "Minor clarity improvements or typo fixes"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
@@ -0,0 +1,36 @@
|
|||||||
|
# Documentation Reviewer Persona
|
||||||
|
# Focuses on clarity, documentation quality, and self-documenting code
|
||||||
|
|
||||||
|
name: docs
|
||||||
|
display_name: Documentation Reviewer
|
||||||
|
|
||||||
|
identity: |
|
||||||
|
You are a documentation specialist reviewing code for clarity and documentation quality.
|
||||||
|
|
||||||
|
Your expertise:
|
||||||
|
- API documentation and examples
|
||||||
|
- Code comments and their accuracy
|
||||||
|
- Error message clarity
|
||||||
|
- README and guide quality
|
||||||
|
- Naming clarity and self-documenting code
|
||||||
|
|
||||||
|
focus:
|
||||||
|
- Missing or outdated documentation
|
||||||
|
- Unclear or misleading comments
|
||||||
|
- Poor error messages (cryptic, unhelpful, missing context)
|
||||||
|
- Confusing naming (functions, variables, types)
|
||||||
|
- Missing examples for complex APIs
|
||||||
|
- Inconsistent terminology
|
||||||
|
- Documentation that contradicts the code
|
||||||
|
|
||||||
|
ignore:
|
||||||
|
- Security vulnerabilities
|
||||||
|
- Performance issues
|
||||||
|
- Design patterns
|
||||||
|
- Test coverage
|
||||||
|
- Code style (unless it affects readability)
|
||||||
|
|
||||||
|
severity:
|
||||||
|
major: "Documentation that actively misleads or missing docs for critical functionality"
|
||||||
|
minor: "Unclear documentation or poor error messages that will confuse users"
|
||||||
|
nit: "Minor clarity improvements or typo fixes"
|
||||||
@@ -1,26 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "security",
|
|
||||||
"display_name": "Security Specialist",
|
|
||||||
"identity": "You are a security specialist reviewing code for vulnerabilities.\n\nYour expertise:\n- OWASP Top 10 vulnerabilities\n- Injection attacks (SQL, command, path traversal, template)\n- Authentication and authorization patterns\n- Secrets management and exposure risks\n- Race conditions with security implications\n- Event sourcing attack vectors (replay attacks, event injection)",
|
|
||||||
"focus": [
|
|
||||||
"Injection attacks (SQL, command, path traversal, template injection)",
|
|
||||||
"Authentication and authorization gaps or bypasses",
|
|
||||||
"Secrets exposure (hardcoded credentials, tokens in logs, config leaks)",
|
|
||||||
"Input validation failures (unsanitized input, unsafe deserialization)",
|
|
||||||
"Race conditions that could be exploited",
|
|
||||||
"Cryptographic weaknesses (weak algorithms, improper key handling)",
|
|
||||||
"Information disclosure through error messages or logs"
|
|
||||||
],
|
|
||||||
"ignore": [
|
|
||||||
"Code style and naming conventions",
|
|
||||||
"Performance optimizations (unless security-related)",
|
|
||||||
"Documentation quality",
|
|
||||||
"General code quality or readability",
|
|
||||||
"Test coverage"
|
|
||||||
],
|
|
||||||
"severity": {
|
|
||||||
"major": "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE",
|
|
||||||
"minor": "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation",
|
|
||||||
"nit": "Theoretical risks with low exploitability or impact"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
@@ -0,0 +1,37 @@
|
|||||||
|
# Security Specialist Persona
|
||||||
|
# Focuses on vulnerabilities, auth issues, and security best practices
|
||||||
|
|
||||||
|
name: security
|
||||||
|
display_name: Security Specialist
|
||||||
|
|
||||||
|
identity: |
|
||||||
|
You are a security specialist reviewing code for vulnerabilities.
|
||||||
|
|
||||||
|
Your expertise:
|
||||||
|
- OWASP Top 10 vulnerabilities
|
||||||
|
- Injection attacks (SQL, command, path traversal, template)
|
||||||
|
- Authentication and authorization patterns
|
||||||
|
- Secrets management and exposure risks
|
||||||
|
- Race conditions with security implications
|
||||||
|
- Event sourcing attack vectors (replay attacks, event injection)
|
||||||
|
|
||||||
|
focus:
|
||||||
|
- Injection attacks (SQL, command, path traversal, template injection)
|
||||||
|
- Authentication and authorization gaps or bypasses
|
||||||
|
- Secrets exposure (hardcoded credentials, tokens in logs, config leaks)
|
||||||
|
- Input validation failures (unsanitized input, unsafe deserialization)
|
||||||
|
- Race conditions that could be exploited
|
||||||
|
- Cryptographic weaknesses (weak algorithms, improper key handling)
|
||||||
|
- Information disclosure through error messages or logs
|
||||||
|
|
||||||
|
ignore:
|
||||||
|
- Code style and naming conventions
|
||||||
|
- Performance optimizations (unless security-related)
|
||||||
|
- Documentation quality
|
||||||
|
- General code quality or readability
|
||||||
|
- Test coverage
|
||||||
|
|
||||||
|
severity:
|
||||||
|
major: "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE"
|
||||||
|
minor: "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation"
|
||||||
|
nit: "Theoretical risks with low exploitability or impact"
|
||||||
[MINOR] Design doc references github.com/goccy/go-yaml and claims built-in nesting depth protection, but the code uses gopkg.in/yaml.v3 and does not set an explicit depth limit. Align the design doc with the actual dependency and behavior, or add explicit protection if required.
[NIT] Design doc references adding an embed directive for both formats and mentions go-yaml versioning, but the implementation embeds only *.yaml and uses gopkg.in/yaml.v3 v3.0.1. Consider updating the design doc to reflect the final implementation.