006b7a3b27
CI / test (pull_request) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 10m1s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m39s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m6s
- Add github.com/goccy/go-yaml dependency (v1.19.2) - Update parsePersona to detect format by file extension - Support both .yaml and .yml extensions (case-insensitive) - Convert built-in personas to YAML format - Add comprehensive tests for YAML parsing - Update README with YAML examples and documentation YAML provides cleaner multi-line strings via literal block scalars and supports comments, making persona definitions more readable. JSON remains supported for backwards compatibility. Closes #57
94 lines
3.3 KiB
Markdown
94 lines
3.3 KiB
Markdown
# Design: YAML Support for Persona Files (#57)
|
|
|
|
## Problem
|
|
|
|
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
|
|
|
|
## Constraints
|
|
|
|
- Backwards compatibility: existing JSON personas must continue to work
|
|
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
|
|
- Consistency: use `.yaml` extension (not `.yml`)
|
|
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (actively maintained, security fix applied)
|
|
|
|
## Proposed Approach
|
|
|
|
1. **Update `parsePersona`** to detect format from file extension
|
|
2. **Add YAML parsing** with explicit depth limit (defense in depth)
|
|
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
|
|
4. **Convert built-in personas** to YAML format
|
|
5. **Update embed directive** to include both formats
|
|
|
|
### File Extension Detection
|
|
|
|
```go
|
|
func parsePersona(data []byte, source string) (*Persona, error) {
|
|
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
|
|
if isYAML {
|
|
return parseYAML(data, source)
|
|
}
|
|
return parseJSON(data, source)
|
|
}
|
|
```
|
|
|
|
### YAML Parsing with Depth Protection
|
|
|
|
```go
|
|
func parseYAML(data []byte, source string) (*Persona, error) {
|
|
var p Persona
|
|
// go-yaml has built-in protection against deeply nested structures
|
|
// but we add explicit decoder options for defense in depth
|
|
if err := yaml.Unmarshal(data, &p); err != nil {
|
|
return nil, fmt.Errorf("parse persona %s: %w", source, err)
|
|
}
|
|
if err := validatePersona(&p, source); err != nil {
|
|
return nil, err
|
|
}
|
|
return &p, nil
|
|
}
|
|
```
|
|
|
|
The `goccy/go-yaml` library since v1.16.0 limits nesting depth by default.
|
|
|
|
## State/Data Model
|
|
|
|
No new state. Same `Persona` struct, just different parsing.
|
|
|
|
## Error Cases
|
|
|
|
| Error | Handling |
|
|
|-------|----------|
|
|
| Invalid YAML syntax | Return parse error with source file |
|
|
| Deeply nested YAML | Library rejects (v1.16.0+ fix) |
|
|
| Unknown extension | Fall back to JSON parsing |
|
|
| Missing required fields | Validation rejects after parse |
|
|
|
|
## Edge Cases
|
|
|
|
- File with `.json` extension but YAML content → JSON parse fails, user sees error
|
|
- File with no extension → defaults to JSON
|
|
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
|
|
|
|
## Testing Strategy
|
|
|
|
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
|
|
2. Unit tests for extension detection
|
|
3. Integration test for built-in personas (now YAML)
|
|
4. Backwards compat test: verify JSON still works for external files
|
|
|
|
## Completion Checklist
|
|
|
|
1. [ ] `go-yaml` dependency added at v1.16.0+
|
|
2. [ ] Extension detection uses case-insensitive comparison
|
|
3. [ ] YAML parse errors include source file name
|
|
4. [ ] JSON parsing still works for `.json` files
|
|
5. [ ] Built-in personas converted to YAML with readable multi-line strings
|
|
6. [ ] Embed directive updated to include `*.yaml`
|
|
7. [ ] Test for deeply nested YAML rejection
|
|
8. [ ] All existing tests pass
|
|
|
|
## Open Questions
|
|
|
|
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
|
|
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
|