Design: YAML Support for Persona Files (#57)
Problem
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
Constraints
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
Proposed Approach
- Update `parsePersona` to detect format from file extension
- Add YAML parsing with explicit depth limit (defense in depth)
- Keep JSON as fallback for files without `.yaml`/`.yml` extension
- Convert built-in personas to YAML format
- Update embed directive to include both formats
File Extension Detection
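func parsePersona(data []byte, source string) (*Persona, error) {
	// Compare case-insensitively so that .YAML and .Yml are also treated as YAML.
	lower := strings.ToLower(source)
	isYAML := strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
	if isYAML {
		return parseYAML(data, source)
	}
	// Any other extension, or none, falls back to JSON.
	return parseJSON(data, source)
}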
YAML Parsing with Depth Protection
// unmarshalYAMLWithDepthLimit decodes data into a yaml.Node, rejects trees
// nested deeper than maxDepth, and only then decodes into out.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
	var node yaml.Node
	dec := yaml.NewDecoder(bytes.NewReader(data))
	if err := dec.Decode(&node); err != nil {
		return err
	}
	if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
		return err
	}
	return node.Decode(out)
}

// checkYAMLDepth walks the node tree and fails once depth exceeds maxDepth.
// The document node passed in by the caller counts as depth 0.
func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
	if depth > maxDepth {
		return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
	}
	// Handle alias nodes by following the Alias pointer
	if node.Kind == yaml.AliasNode && node.Alias != nil {
		return checkYAMLDepth(node.Alias, depth, maxDepth)
	}
	for _, child := range node.Content {
		if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
			return err
		}
	}
	return nil
}
The gopkg.in/yaml.v3 library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a yaml.Node, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
State/Data Model
No new state. Same Persona struct, just different parsing.
Error Cases
| Error | Handling |
|---|---|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Rejected by the explicit depth check (`checkYAMLDepth`) before decoding into the struct |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
Edge Cases
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
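The extension rules behind these edge cases can be checked in isolation. The sketch below assumes a hypothetical helper named `isYAMLSource`, which is not part of the design above, and assumes that `builtin:security` has already been resolved to an embed path such as `personas/security.yaml` before detection runs.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isYAMLSource reports whether source should go to the YAML parser.
// The extension is lowercased first, so ".YAML" and ".Yml" also match.
// Anything else, including no extension at all, means JSON.
func isYAMLSource(source string) bool {
	switch strings.ToLower(filepath.Ext(source)) {
	case ".yaml", ".yml":
		return true
	}
	return false
}

func main() {
	sources := []string{
		"reviewer.yaml",
		"reviewer.YML",
		"reviewer.json",
		"reviewer",               // no extension: JSON fallback
		"personas/security.yaml", // resolved embed path for a built-in
	}
	for _, src := range sources {
		fmt.Printf("%-24s yaml=%v\n", src, isYAMLSource(src))
	}
}
```

Because only the extension is consulted, a `.json` file that actually contains YAML still goes to the JSON parser and fails there, which matches the first edge case.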
Testing Strategy
- Unit tests for YAML parsing (valid, invalid, deeply nested)
- Unit tests for extension detection
- Integration test for built-in personas (now YAML)
- Backwards compat test: verify JSON still works for external files
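For the deeply nested case, test input can be generated rather than stored as a fixture. The helper below is a sketch only: `nestedYAML` is a hypothetical name, and it uses flow sequences because they are the most compact way to reach a given depth.

```go
package main

import (
	"fmt"
	"strings"
)

// nestedYAML returns a YAML document made of depth nested flow sequences,
// for example depth 3 yields [[[]]]. Callers should pass depth >= 1;
// depth 0 yields an empty document.
func nestedYAML(depth int) []byte {
	return []byte(strings.Repeat("[", depth) + strings.Repeat("]", depth))
}

func main() {
	fmt.Println(string(nestedYAML(3))) // prints [[[]]]
}
```

Since `checkYAMLDepth` starts at depth 0 on the document node, the innermost sequence of `nestedYAML(n)` sits at depth `n`. Under that reading, `nestedYAML(maxDepth+1)` should trigger the depth error; the exact boundary is worth pinning down in the unit test itself.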
Completion Checklist
- `go-yaml` dependency added at v1.16.0+
- Extension detection uses case-insensitive comparison
- YAML parse errors include source file name
- JSON parsing still works for `.json` files
- Built-in personas converted to YAML with readable multi-line strings
- Embed directive updated to include `*.yaml`
- Test for deeply nested YAML rejection
- All existing tests pass
Open Questions
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. Decision: Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? Decision: No, keep it simple. Extension determines format.