5cedeee9f4
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m12s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m33s
MINOR fixes:
- docs/DESIGN-57-yaml-persona.md: fix Error Cases table entry to reflect
custom AST walk (checkYAMLDepth) instead of stale library-level reference
- review/persona.go: add EOF check after JSON decode to reject trailing
garbage after a valid JSON object (prevents silent acceptance of malformed
input like '{"name":"x"}garbage')
- review/persona_test.go: add TestJSONTrailingContentRejected test
NIT fixes:
- review/persona.go: add default case to checkYAMLDepth switch with
explanatory comment about scalar leaf nodes
- review/persona.go: document AnchorNode depth+1 conservative asymmetry
- review/persona.go: simplify redundant if-guard in ListBuiltinPersonas
88 lines
3.5 KiB
Markdown
88 lines
3.5 KiB
Markdown
# Design: YAML Support for Persona Files (#57)
|
|
|
|
## Problem
|
|
|
|
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
|
|
|
|
## Constraints
|
|
|
|
- Backwards compatibility: existing JSON personas must continue to work
|
|
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
|
|
- Consistency: use `.yaml` extension (not `.yml`)
|
|
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
|
|
|
|
## Proposed Approach
|
|
|
|
1. **Update `parsePersona`** to detect format from file extension
|
|
2. **Add YAML parsing** with explicit depth limit (defense in depth)
|
|
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
|
|
4. **Convert built-in personas** to YAML format
|
|
5. **Update embed directive** to include both formats
|
|
|
|
### File Extension Detection
|
|
|
|
```go
|
|
func parsePersona(data []byte, source string) (*Persona, error) {
|
|
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
|
|
if isYAML {
|
|
return parseYAML(data, source)
|
|
}
|
|
return parseJSON(data, source)
|
|
}
|
|
```
|
|
|
|
### YAML Parsing with Depth Protection
|
|
|
|
We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
|
|
`review/persona.go`) rather than relying on library decoder options. Key design
|
|
decisions:
|
|
|
|
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
|
|
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
|
|
- **Node-count limit:** Conservative overcounting bounds total validation work
|
|
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
|
|
|
|
See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
|
|
|
|
## State/Data Model
|
|
|
|
No new state. Same `Persona` struct, just different parsing.
|
|
|
|
## Error Cases
|
|
|
|
| Error | Handling |
|
|
|-------|----------|
|
|
| Invalid YAML syntax | Return parse error with source file |
|
|
| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
|
|
| Unknown extension | Fall back to JSON parsing |
|
|
| Missing required fields | Validation rejects after parse |
|
|
|
|
## Edge Cases
|
|
|
|
- File with `.json` extension but YAML content → JSON parse fails, user sees error
|
|
- File with no extension → defaults to JSON
|
|
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
|
|
|
|
## Testing Strategy
|
|
|
|
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
|
|
2. Unit tests for extension detection
|
|
3. Integration test for built-in personas (now YAML)
|
|
4. Backwards compat test: verify JSON still works for external files
|
|
|
|
## Completion Checklist
|
|
|
|
1. [ ] `go-yaml` dependency added at v1.16.0+
|
|
2. [ ] Extension detection uses case-insensitive comparison
|
|
3. [ ] YAML parse errors include source file name
|
|
4. [ ] JSON parsing still works for `.json` files
|
|
5. [ ] Built-in personas converted to YAML with readable multi-line strings
|
|
6. [ ] Embed directive updated to include `*.yaml`
|
|
7. [ ] Test for deeply nested YAML rejection
|
|
8. [ ] All existing tests pass
|
|
|
|
## Open Questions
|
|
|
|
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
|
|
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
|