Files
review-bot/docs/DESIGN-57-yaml-persona.md
T
Rodin 006b7a3b27
CI / test (pull_request) Successful in 9m33s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 10m1s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 10m39s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 11m6s
feat: add YAML support for persona files (#57)
- Add github.com/goccy/go-yaml dependency (v1.19.2)
- Update parsePersona to detect format by file extension
- Support both .yaml and .yml extensions (case-insensitive)
- Convert built-in personas to YAML format
- Add comprehensive tests for YAML parsing
- Update README with YAML examples and documentation

YAML provides cleaner multi-line strings via literal block scalars
and supports comments, making persona definitions more readable.
JSON remains supported for backwards compatibility.

Closes #57
2026-05-10 13:08:06 -07:00

3.3 KiB

Design: YAML Support for Persona Files (#57)

Problem

JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.

Constraints

  • Backwards compatibility: existing JSON personas must continue to work
  • Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
  • Consistency: use .yaml extension (not .yml)
  • Library: use github.com/goccy/go-yaml v1.16.0+ (actively maintained, security fix applied)

Proposed Approach

  1. Update parsePersona to detect format from file extension
  2. Add YAML parsing with explicit depth limit (defense in depth)
  3. Keep JSON as fallback for files without .yaml/.yml extension
  4. Convert built-in personas to YAML format
  5. Update embed directive to include both formats

File Extension Detection

func parsePersona(data []byte, source string) (*Persona, error) {
    isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
    if isYAML {
        return parseYAML(data, source)
    }
    return parseJSON(data, source)
}

YAML Parsing with Depth Protection

func parseYAML(data []byte, source string) (*Persona, error) {
    var p Persona
    // go-yaml has built-in protection against deeply nested structures
    // but we add explicit decoder options for defense in depth
    if err := yaml.Unmarshal(data, &p); err != nil {
        return nil, fmt.Errorf("parse persona %s: %w", source, err)
    }
    if err := validatePersona(&p, source); err != nil {
        return nil, err
    }
    return &p, nil
}

The goccy/go-yaml library since v1.16.0 limits nesting depth by default.

State/Data Model

No new state. Same Persona struct, just different parsing.

Error Cases

Error Handling
Invalid YAML syntax Return parse error with source file
Deeply nested YAML Library rejects (v1.16.0+ fix)
Unknown extension Fall back to JSON parsing
Missing required fields Validation rejects after parse

Edge Cases

  • File with .json extension but YAML content → JSON parse fails, user sees error
  • File with no extension → defaults to JSON
  • Embedded persona reference like builtin:security → detect by embed path (personas/X.yaml)

Testing Strategy

  1. Unit tests for YAML parsing (valid, invalid, deeply nested)
  2. Unit tests for extension detection
  3. Integration test for built-in personas (now YAML)
  4. Backwards compat test: verify JSON still works for external files

Completion Checklist

  1. go-yaml dependency added at v1.16.0+
  2. Extension detection uses case-insensitive comparison
  3. YAML parse errors include source file name
  4. JSON parsing still works for .json files
  5. Built-in personas converted to YAML with readable multi-line strings
  6. Embed directive updated to include *.yaml
  7. Test for deeply nested YAML rejection
  8. All existing tests pass

Open Questions

  • Should we support both .yaml AND .yml? Issue says .yaml only for consistency, but some users expect .yml. Decision: Support both for reading, recommend .yaml in docs.
  • Should we add a "format" field to detect mismatched extension/content? Decision: No, keep it simple. Extension determines format.