fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml #89
@@ -171,16 +171,18 @@ func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
|
||||
return err
|
||||
}
|
||||
|
||||
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
|
||||
// The parser returns a single doc with nil body for these cases.
|
||||
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
|
||||
return fmt.Errorf("empty YAML document")
|
||||
}
|
||||
|
||||
// Reject multi-document YAML files - silently ignoring additional documents
|
||||
// could lead to confusing behavior where users think their changes take effect.
|
||||
if len(file.Docs) > 1 {
|
||||
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
|
||||
}
|
||||
|
||||
|
|
||||
if len(file.Docs) == 0 {
|
||||
return fmt.Errorf("empty YAML document")
|
||||
}
|
||||
|
||||
nodeCount := 0
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `go 1.26.2` in go.mod is a pre-release/future version (Go 1.26 has not been released). This may cause `go mod tidy` to behave unexpectedly on stable Go toolchains and suggests the module was initialized with an unstable toolchain. This is pre-existing and not introduced by this PR, but worth flagging.
|
||||
if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]struct{}), &nodeCount); err != nil {
|
||||
return err
|
||||
@@ -210,7 +212,10 @@ func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, seen map[ast.N
|
||||
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
|
||||
}
|
||||
|
sonnet-review-bot
commented
[MINOR] The nodeCount increment (line 213, **[MINOR]** The nodeCount increment (line 213, `*nodeCount++`) happens before the depth-aware short-circuit check (line 222, `if prevDepth, ok := validated[node]; ok && depth <= prevDepth`). This means if a node is revisited at a shallower depth (triggering the short-circuit), it still increments the count, causing overcounting of shared subtrees referenced via aliases at multiple depths. This could cause false positives on the MaxYAMLNodes limit for legitimately large but valid documents that share anchors extensively. The nodeCount increment should arguably be inside the path where we actually traverse, or this behavior should be explicitly documented as intentional (conservative overcounting).
|
||||
|
||||
// Cycle detection: if we've seen this node before, we're in a cycle.
|
||||
// Cycle detection: uses pointer identity (ast.Node is an interface, but all
|
||||
// concrete node types are pointers) to detect revisits. This intentionally
|
||||
// compares pointer identity, not structural equality, since we want to track
|
||||
// specific node instances in the parsed AST graph.
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `checkYAMLDepth` function uses `ast.Node` as a map key for cycle detection. `ast.Node` is an interface — map keys for interface types use the dynamic type and value for equality. For pointer types (which all the concrete ast node types appear to be), this works correctly (pointer identity). This is fine but worth a comment explaining that pointer identity is the intended comparison, not structural equality, to prevent future maintainers from being surprised.
|
||||
if _, ok := seen[node]; ok {
|
||||
|
[MAJOR] Depth check can be bypassed due to global 'seen' set: if an anchored/aliased subtree is first validated at a shallow depth and later referenced via an alias deeper in the structure, the early return on previously seen node skips re-checking at the deeper context. This can allow effective nesting depth to exceed MaxYAMLDepth, potentially leading to stack exhaustion/DoS during decode. **[MAJOR]** Depth check can be bypassed due to global 'seen' set: if an anchored/aliased subtree is first validated at a shallow depth and later referenced via an alias deeper in the structure, the early return on previously seen node skips re-checking at the deeper context. This can allow effective nesting depth to exceed MaxYAMLDepth, potentially leading to stack exhaustion/DoS during decode.
|
||||
return nil // Already validated this subtree, skip to avoid infinite recursion.
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `nodeCount` increment happens before the `visiting` and `validated` checks. This means nodes encountered in cycles (where `visiting[node]` is true and we return early) are still counted against the total, potentially causing false-positive node count limit errors for valid YAML with shared anchors referenced multiple times. The increment should ideally only count genuinely new work, or the behavior should be explicitly documented.
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `nodeCount` is incremented before the `visiting` cycle detection check. For cyclic structures, this means a node that triggers cycle detection (returns early) still increments `nodeCount`. This slightly over-counts nodes in cyclic test scenarios, but since real YAML parsed by `parser.ParseBytes` cannot have true reference cycles in the AST (only alias references that point forward), this is functionally harmless in production. However, in the unit tests that construct artificial cycles, the count will be inflated.
|
||||
}
|
||||
|
||||
@@ -459,7 +459,8 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
path := filepath.Join(dir, "deeply-nested.yaml")
|
||||
|
||||
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
|
||||
|
sonnet-review-bot
commented
[MINOR] TestYAMLDeeplyNestedRejection comment says 'With 25 levels, we exceed MaxYAMLDepth (20)' but the depth counting depends on both MappingNode and MappingValueNode traversal (each contributes depth+1 for key, depth+1 for value). The comment would benefit from a concrete trace of how the depth accumulates to make the test self-documenting and easier to maintain if MaxYAMLDepth changes. **[MINOR]** TestYAMLDeeplyNestedRejection comment says 'With 25 levels, we exceed MaxYAMLDepth (20)' but the depth counting depends on both MappingNode and MappingValueNode traversal (each contributes depth+1 for key, depth+1 for value). The comment would benefit from a concrete trace of how the depth accumulates to make the test self-documenting and easier to maintain if MaxYAMLDepth changes.
|
||||
// Each level adds to the depth count via mapping values.
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
// Each nested mapping key generates a MappingValueNode, incrementing depth
|
||||
|
sonnet-review-bot
commented
[NIT] The comment in **[NIT]** The comment in `TestYAMLDeeplyNestedRejection` was updated from 'Each level adds 2 to the depth count (key + value mapping)' to 'Each level adds to the depth count via mapping values.' The new comment is vague — it doesn't explain the actual counting semantics with the new AST walker, making it harder to reason about whether the test correctly triggers the depth limit. Consider specifying exactly how many depth units each level of nesting adds under the new implementation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
// by 1 per level in the AST walk. With 25 levels, we exceed MaxYAMLDepth (20).
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
var sb strings.Builder
|
||||
sb.WriteString("name: test\nidentity: test\nnested:\n")
|
||||
indent := " "
|
||||
@@ -483,6 +484,37 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
sonnet-review-bot
commented
[MINOR] Extra blank line between **[MINOR]** Extra blank line between `TestYAMLDeeplyNestedRejection` and `TestYAMLEmptyFileRejection` (two consecutive blank lines). Minor style issue per `gofmt` conventions (one blank line between top-level declarations).
sonnet-review-bot
commented
[NIT] Double blank line before **[NIT]** Double blank line before `TestYAMLEmptyFileRejection`. Minor formatting inconsistency — `gofmt` doesn't enforce blank-line counts between top-level declarations but it's slightly noisy.
sonnet-review-bot
commented
[NIT] Extra blank line between **[NIT]** Extra blank line between `TestYAMLDeeplyNestedRejection` closing brace and `TestYAMLEmptyFileRejection`. Minor style inconsistency — `gofmt` doesn't enforce single blank lines between top-level declarations, but two blank lines is unusual.
sonnet-review-bot
commented
[MINOR] There is a spurious blank line between **[MINOR]** There is a spurious blank line between `TestYAMLDeeplyNestedRejection` and `TestYAMLEmptyFileRejection` (two consecutive blank lines). Minor style issue per gofmt conventions, but the diff shows it was introduced in this PR.
|
||||
}
|
||||
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
func TestYAMLEmptyFileRejection(t *testing.T) {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
dir := t.TempDir()
|
||||
|
sonnet-review-bot
commented
[NIT] In **[NIT]** In `TestYAMLEmptyFileRejection`, the `dir` variable is declared in the outer test function but files are written using paths under `dir` from within subtests. Since `t.TempDir()` is called on the outer `t`, it will be cleaned up when the outer test completes, which is fine. However, each subtest creates a file with the test case name as part of the filename directly in the shared `dir`. This is fine for parallel safety (test cases have unique names) but it is slightly more idiomatic to call `t.TempDir()` inside the subtest.
sonnet-review-bot
commented
[NIT] The **[NIT]** The `TestYAMLEmptyFileRejection` subtests write files into a shared `dir` created outside the loop. If two subtests run with the same filename pattern, they'd overwrite each other. Currently the filenames are distinct (`completely_empty.yaml`, `whitespace_only.yaml`, `comment_only.yaml`) so there's no actual issue, but using `t.TempDir()` inside each subtest would be the idiomatic pattern.
sonnet-review-bot
commented
[NIT] The **[NIT]** The `TestYAMLEmptyFileRejection` test creates the temp dir outside the `tests` loop, reusing a single dir for all subtests. Since each subtest writes to `tc.name+".yaml"` (different file names), there's no collision. However, `t.TempDir()` is called once at the top of the test function rather than inside each subtest — this is fine since the file names don't overlap, but using `t.TempDir()` inside each `t.Run` would be slightly more idiomatic for isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
tests := []struct {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
name string
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `TestYAMLEmptyFileRejection` test uses `tc.name` (e.g., `"completely empty"`) as the filename fragment but writes all files to the same `dir`. The test case names contain spaces which are valid in filenames but could cause issues on some filesystems. More importantly, the test writes files with names like `completely empty.yaml` — using spaces in filenames is unusual and could cause subtle test issues. Prefer replacing spaces with underscores or hyphens.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
sonnet-review-bot
commented
[NIT] **[NIT]** `TestYAMLEmptyFileRejection` has a redundant `if err != nil` guard on the second condition: `if err != nil && !strings.Contains(...)`. Since the preceding `if err == nil { t.Error(...); return }` (implicit — actually the test doesn't return on the first error, it falls through) — actually the first check is `if err == nil { t.Error(...) }` without a return, so `err` could still be nil here. This means the `strings.Contains` check is guarded correctly with `err != nil`, but the test won't `t.Error` the wrong message if `err == nil` (it will have already errored on the first check). The pattern is slightly unusual compared to the rest of the file which uses early returns. Consistent style would be to add a `return` after the first `t.Error`, then drop the `err != nil` guard.
|
||||
content string
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}{
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
{"completely empty", ""},
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
{"whitespace only", " \n\n "},
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
{"comment only", "# just a comment\n"},
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
for _, tc := range tests {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
path := filepath.Join(dir, tc.name+".yaml")
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Fatalf("failed to write test file: %v", err)
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
_, err := LoadPersona(path)
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
if err == nil {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Error("expected error for empty YAML input, got nil")
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
if err != nil && !strings.Contains(err.Error(), "empty YAML document") {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
})
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
func TestYAMLFileSizeLimit(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "huge.yaml")
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
[MINOR] The empty document check (
len(file.Docs) == 0) comes AFTER the multi-document check (len(file.Docs) > 1). While not a bug (both conditions are checked), it would be slightly more natural to check for empty first. More importantly: the previousgopkg.in/yaml.v3implementation usingdec.Decode(&node)would return an error on truly empty input, whileparser.ParseBytesmay return an emptyDocsslice for whitespace-only or empty input. The new explicit empty-doc check is an improvement, but the error message 'empty YAML document' differs from what the old library would have returned — worth noting if callers match on error text.