feat: add YAML support for persona files #58

Merged
aweiker merged 6 commits from issue-57 into main 2026-05-11 01:39:43 +00:00
2 changed files with 96 additions and 11 deletions
Showing only changes of commit 10cd6203d4 - Show all commits
+13 -11
View File
10
@@ -51,7 +51,13 @@ type Severity struct {
// LoadPersona loads a persona from a JSON or YAML file path.
// Format is detected by file extension: .yaml/.yml for YAML, .json or other for JSON.
// Files larger than MaxPersonaFileSize are rejected.
//
// Symlinks are supported: os.Stat follows symlinks, so a symlink pointing to
Outdated
Review

[NIT] The fallback to JSON in LoadBuiltinPersona has a comment 'Fall back to JSON for backwards compatibility', but the embeddedPersonas embed directive now only includes *.yaml files (//go:embed personas/*.yaml). The JSON files have been deleted. This means the JSON fallback path in LoadBuiltinPersona is dead code — it can never succeed because no .json files are embedded. The fallback should either be removed or the embed directive should include *.json if JSON built-in personas need to be supported.

**[NIT]** The fallback to JSON in `LoadBuiltinPersona` has a comment 'Fall back to JSON for backwards compatibility', but the `embeddedPersonas` embed directive now only includes `*.yaml` files (`//go:embed personas/*.yaml`). The JSON files have been deleted. This means the JSON fallback path in `LoadBuiltinPersona` is dead code — it can never succeed because no `.json` files are embedded. The fallback should either be removed or the embed directive should include `*.json` if JSON built-in personas need to be supported.
// a regular file will pass the IsRegular() check. Symlinks to non-regular files
// (directories, FIFOs, devices) are still rejected.
Outdated
Review

[MINOR] Built-in persona loader falls back to JSON if YAML not found, but the embedded FS only includes *.yaml. Since the JSON files were removed and not embedded, this fallback will never succeed for built-ins and can be simplified or the embed pattern expanded if JSON fallback is intended.

**[MINOR]** Built-in persona loader falls back to JSON if YAML not found, but the embedded FS only includes `*.yaml`. Since the JSON files were removed and not embedded, this fallback will never succeed for built-ins and can be simplified or the embed pattern expanded if JSON fallback is intended.
func LoadPersona(path string) (*Persona, error) {
// os.Stat follows symlinks, so symlinks to regular files are supported.
Outdated
Review

[MINOR] LoadPersona rejects non-regular files (e.g., symlinks) by checking Mode().IsRegular(). This may be overly strict for CI or repo setups that use symlinks. Consider allowing symlinks by resolving them (os.Stat vs. Lstat) or relaxing the check.

**[MINOR]** LoadPersona rejects non-regular files (e.g., symlinks) by checking Mode().IsRegular(). This may be overly strict for CI or repo setups that use symlinks. Consider allowing symlinks by resolving them (os.Stat vs. Lstat) or relaxing the check.
// The IsRegular() check operates on the target, not the symlink itself.
Outdated
Review

[NIT] The comment on LoadBuiltinPersona says 'Try YAML first (preferred format)' but then falls back to JSON. Since the embed directive is now //go:embed personas/*.yaml only (JSON files were deleted), the JSON fallback in LoadBuiltinPersona is dead code for built-in personas. It's harmless and provides forwards-compatibility for the unlikely case someone adds a .json builtin later, but could be simplified.

**[NIT]** The comment on `LoadBuiltinPersona` says 'Try YAML first (preferred format)' but then falls back to JSON. Since the embed directive is now `//go:embed personas/*.yaml` only (JSON files were deleted), the JSON fallback in `LoadBuiltinPersona` is dead code for built-in personas. It's harmless and provides forwards-compatibility for the unlikely case someone adds a `.json` builtin later, but could be simplified.
info, err := os.Stat(path)
if err != nil {
return nil, fmt.Errorf("read persona file %s: %w", path, err)
6
@@ -76,23 +82,15 @@ func LoadPersona(path string) (*Persona, error) {
// LoadBuiltinPersona loads a built-in persona by name.
// Returns an error if the persona doesn't exist.
// Built-in personas are stored in YAML format.
// Built-in personas are stored in YAML format only (see embed directive).
func LoadBuiltinPersona(name string) (*Persona, error) {
// Try YAML first (preferred format)
yamlFile := name + ".yaml"
data, err := embeddedPersonas.ReadFile("personas/" + yamlFile)
Outdated
Review

[NIT] The comment in parsePersona mentions 'go-yaml v1.16.0+ has built-in protection against deeply nested structures' which refers to a different library (goccy/go-yaml). Update the comment to reflect gopkg.in/yaml.v3 or remove the claim.

**[NIT]** The comment in parsePersona mentions 'go-yaml v1.16.0+ has built-in protection against deeply nested structures' which refers to a different library (goccy/go-yaml). Update the comment to reflect gopkg.in/yaml.v3 or remove the claim.
if err == nil {
return parsePersona(data, "builtin:"+yamlFile)
}
// Fall back to JSON for backwards compatibility
jsonFile := name + ".json"
data, err = embeddedPersonas.ReadFile("personas/" + jsonFile)
if err != nil {
available := ListBuiltinPersonas()
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
}
Outdated
Review

[MINOR] ListBuiltinPersonas returns a non-deterministic (map-iteration) order of persona names. While the return value isn't documented as sorted, callers such as the error message in LoadBuiltinPersona join the names to display to users, and the non-deterministic order makes tests and error messages flaky. The slice should be sorted with slices.Sort (or sort.Strings) before returning.

**[MINOR]** `ListBuiltinPersonas` returns a non-deterministic (map-iteration) order of persona names. While the return value isn't documented as sorted, callers such as the error message in `LoadBuiltinPersona` join the names to display to users, and the non-deterministic order makes tests and error messages flaky. The slice should be sorted with `slices.Sort` (or `sort.Strings`) before returning.
return parsePersona(data, "builtin:"+jsonFile)
return parsePersona(data, "builtin:"+yamlFile)
}
Outdated
Review

[NIT] In ListBuiltinPersonas, the seen map is map[string]bool and the code checks if !seen[personaName] before setting it to true. The !seen[...] check is redundant since setting it to true again would be a no-op. The code works correctly but the guard condition (if !seen[personaName]) adds noise — it could just be seen[personaName] = true unconditionally.

**[NIT]** In `ListBuiltinPersonas`, the `seen` map is `map[string]bool` and the code checks `if !seen[personaName]` before setting it to `true`. The `!seen[...]` check is redundant since setting it to `true` again would be a no-op. The code works correctly but the guard condition (`if !seen[personaName]`) adds noise — it could just be `seen[personaName] = true` unconditionally.
// ListBuiltinPersonas returns the names of all built-in personas in sorted order.
10
@@ -143,7 +141,11 @@ func parsePersona(data []byte, source string) (*Persona, error) {
if isYAML {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else {
err = json.Unmarshal(data, &p)
// Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's KnownFields(true) - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data))
Outdated
Review

[MINOR] YAML decoder does not enable KnownFields/strict mode, so unknown keys in persona YAML are silently ignored. Enabling strict field checking would help catch typos (e.g., dec.KnownFields(true)) before validation.

**[MINOR]** YAML decoder does not enable KnownFields/strict mode, so unknown keys in persona YAML are silently ignored. Enabling strict field checking would help catch typos (e.g., dec.KnownFields(true)) before validation.
Outdated
Review

[MINOR] JSON parsing uses json.Unmarshal which accepts unknown fields, while YAML is parsed with KnownFields(true). For parity and to catch typos in JSON persona files, consider switching to a json.Decoder with DisallowUnknownFields().

**[MINOR]** JSON parsing uses json.Unmarshal which accepts unknown fields, while YAML is parsed with KnownFields(true). For parity and to catch typos in JSON persona files, consider switching to a json.Decoder with DisallowUnknownFields().
dec.DisallowUnknownFields()
err = dec.Decode(&p)
Review

[MINOR] JSON parsing does not enforce single-document input. After dec.Decode(&p), any trailing JSON values would be silently ignored. Consider verifying EOF by attempting a second decode and expecting io.EOF to ensure there's no extra data.

**[MINOR]** JSON parsing does not enforce single-document input. After dec.Decode(&p), any trailing JSON values would be silently ignored. Consider verifying EOF by attempting a second decode and expecting io.EOF to ensure there's no extra data.
}
if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err)
Outdated
Review

[MINOR] YAML depth check traverses node.Content but ignores Alias nodes (node.Alias). Malicious YAML could leverage anchors/aliases to create effective deep structures without increasing Content depth. Consider handling Alias nodes explicitly.

**[MINOR]** YAML depth check traverses node.Content but ignores Alias nodes (node.Alias). Malicious YAML could leverage anchors/aliases to create effective deep structures without increasing Content depth. Consider handling Alias nodes explicitly.
16
+83
View File
4
@@ -693,3 +693,86 @@ severity:
})
}
}
func TestJSONUnknownFieldsRejected(t *testing.T) {
tests := []struct {
name string
content string
wantErr string
}{
{
name: "unknown top-level field",
content: `{
"name": "test",
"identity": "test identity",
"unknown_field": "should fail"
}`,
wantErr: "unknown_field",
},
{
name: "typo in field name",
content: `{
"name": "test",
"identiy": "typo should fail"
}`,
wantErr: "identiy",
},
{
name: "unknown field in severity",
content: `{
"name": "test",
"identity": "test",
"severity": {
"major": "ok",
"miner": "typo"
}
}`,
wantErr: "miner",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for unknown field, got nil")
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want to contain %q", err.Error(), tt.wantErr)
}
})
}
}
func TestLoadPersonaSymlink(t *testing.T) {
// Create a regular persona file
dir := t.TempDir()
realFile := filepath.Join(dir, "real.yaml")
content := `name: test
identity: test identity
`
if err := os.WriteFile(realFile, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
// Create a symlink to it
symlink := filepath.Join(dir, "link.yaml")
if err := os.Symlink(realFile, symlink); err != nil {
t.Fatalf("failed to create symlink: %v", err)
}
// LoadPersona should work via symlink
p, err := LoadPersona(symlink)
if err != nil {
t.Fatalf("LoadPersona via symlink failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}