5cedeee9f4
MINOR fixes:
- docs/DESIGN-57-yaml-persona.md: fix Error Cases table entry to reflect
  custom AST walk (checkYAMLDepth) instead of stale library-level reference
- review/persona.go: add EOF check after JSON decode to reject trailing
  garbage after a valid JSON object (prevents silent acceptance of malformed
  input like '{"name":"x"}garbage')
- review/persona_test.go: add TestJSONTrailingContentRejected test
NIT fixes:
- review/persona.go: add default case to checkYAMLDepth switch with
  explanatory comment about scalar leaf nodes
- review/persona.go: document AnchorNode depth+1 conservative asymmetry
- review/persona.go: simplify redundant if-guard in ListBuiltinPersonas
3.5 KiB
Design: YAML Support for Persona Files (#57)
Problem
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
Constraints
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
Proposed Approach
- Update `parsePersona` to detect format from file extension
- Add YAML parsing with explicit depth limit (defense in depth)
- Keep JSON as fallback for files without `.yaml`/`.yml` extension
- Convert built-in personas to YAML format
- Update embed directive to include both formats
File Extension Detection
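func parsePersona(data []byte, source string) (*Persona, error) {
	// Compare case-insensitively so ".YAML" and ".Yml" are also treated as YAML.
	lower := strings.ToLower(source)
	isYAML := strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
	if isYAML {
		return parseYAML(data, source)
	}
	// Everything else, including files with no extension, falls back to JSON.
	return parseJSON(data, source)
}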
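parseJSON itself is not shown in this document. The sketch below illustrates one way a JSON path could also reject trailing content after the first object, as the commit message describes for review/persona.go. The `Persona` field and the error wording are assumptions for illustration, not the project's actual definitions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Persona is a hypothetical stand-in; the real struct has more fields.
type Persona struct {
	Name string `json:"name"`
}

// parseJSON decodes exactly one JSON value and rejects anything after it.
func parseJSON(data []byte, source string) (*Persona, error) {
	dec := json.NewDecoder(bytes.NewReader(data))

	var p Persona
	if err := dec.Decode(&p); err != nil {
		return nil, fmt.Errorf("%s: invalid JSON: %w", source, err)
	}

	// A second Decode must report io.EOF. A nil error means a second value
	// followed the first; any other error means non-JSON trailing bytes.
	var extra json.RawMessage
	if err := dec.Decode(&extra); !errors.Is(err, io.EOF) {
		return nil, fmt.Errorf("%s: unexpected content after JSON object", source)
	}
	return &p, nil
}

func main() {
	for _, in := range []string{`{"name":"x"}`, `{"name":"x"}garbage`} {
		p, err := parseJSON([]byte(in), "example.json")
		fmt.Println(p != nil, err)
	}
}
```

Trailing whitespace and newlines are still accepted, because the decoder skips whitespace before reporting end of input.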
YAML Parsing with Depth Protection
We implement a custom AST-based depth/node-count walk (checkYAMLDepth in
review/persona.go) rather than relying on library decoder options. Key design
decisions:
- Library: `github.com/goccy/go-yaml` with `ast.Node`-based traversal
- Dual-map tracking: `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
- Node-count limit: Conservative overcounting bounds total validation work
- Alias-aware depth: Aliases increment depth and are re-checked when encountered at greater depths
See review/persona.go:checkYAMLDepth for the authoritative implementation.
State/Data Model
No new state. Same Persona struct, just different parsing.
Error Cases
| Error | Handling |
|---|---|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Custom AST walk (checkYAMLDepth) rejects before decode |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
Edge Cases
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
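For the built-in case, one possible shape is to resolve the reference to its embed path first, so that the same extension-based detection applies. The helper name `builtinEmbedPath` is hypothetical, and the sketch assumes every built-in persona is stored as `personas/<name>.yaml`.

```go
package main

import (
	"fmt"
	"strings"
)

// builtinEmbedPath maps a reference such as "builtin:security" to the embed
// path whose extension then drives format detection. Hypothetical helper.
func builtinEmbedPath(ref string) (string, bool) {
	const prefix = "builtin:"
	if !strings.HasPrefix(ref, prefix) {
		return "", false
	}
	name := strings.TrimPrefix(ref, prefix)
	if name == "" {
		return "", false
	}
	return "personas/" + name + ".yaml", true
}

func main() {
	path, ok := builtinEmbedPath("builtin:security")
	fmt.Println(path, ok)
}
```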
Testing Strategy
- Unit tests for YAML parsing (valid, invalid, deeply nested)
- Unit tests for extension detection
- Integration test for built-in personas (now YAML)
- Backwards compat test: verify JSON still works for external files
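The extension-detection cases can be covered with a small table. The sketch below uses a hypothetical `isYAMLSource` helper that applies the case-insensitive comparison listed in the completion checklist. In the repository this would more naturally live in a `_test.go` file, but it is written as a standalone program here so it runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// isYAMLSource is a hypothetical helper isolating the detection rule:
// ".yaml" and ".yml" select YAML, compared case-insensitively.
func isYAMLSource(source string) bool {
	lower := strings.ToLower(source)
	return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
}

func main() {
	cases := []struct {
		source string
		want   bool
	}{
		{"personas/security.yaml", true},
		{"custom.yml", true},
		{"CUSTOM.YAML", true},
		{"legacy.json", false},
		{"no-extension", false},
		{"yaml", false}, // bare word, not an extension
	}
	for _, tc := range cases {
		status := "ok"
		if got := isYAMLSource(tc.source); got != tc.want {
			status = "FAIL"
		}
		fmt.Printf("%-24s want=%-5v %s\n", tc.source, tc.want, status)
	}
}
```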
Completion Checklist
- `go-yaml` dependency added at v1.16.0+
- Extension detection uses case-insensitive comparison
- YAML parse errors include source file name
- JSON parsing still works for `.json` files
- Built-in personas converted to YAML with readable multi-line strings
- Embed directive updated to include `*.yaml`
- Test for deeply nested YAML rejection
- All existing tests pass
Open Questions
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. Decision: Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? Decision: No, keep it simple. Extension determines format.