fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml #89

Merged
aweiker merged 13 commits from review-bot-issue-87 into main 2026-05-13 03:47:02 +00:00
2 changed files with 43 additions and 6 deletions
Showing only changes of commit 45d009dd06 - Show all commits
+10 -5
View File
13
@@ -171,16 +171,18 @@ func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
return err
}
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
// The parser returns a single doc with nil body for these cases.
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
return fmt.Errorf("empty YAML document")
}
// Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect.
if len(file.Docs) > 1 {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
}
Outdated
Review

[MINOR] The empty document check (len(file.Docs) == 0) comes AFTER the multi-document check (len(file.Docs) > 1). While not a bug (both conditions are checked), it would be slightly more natural to check for empty first. More importantly: the previous gopkg.in/yaml.v3 implementation using dec.Decode(&node) would return an error on truly empty input, while parser.ParseBytes may return an empty Docs slice for whitespace-only or empty input. The new explicit empty-doc check is an improvement, but the error message 'empty YAML document' differs from what the old library would have returned — worth noting if callers match on error text.

**[MINOR]** The empty document check (`len(file.Docs) == 0`) comes AFTER the multi-document check (`len(file.Docs) > 1`). While not a bug (both conditions are checked), it would be slightly more natural to check for empty first. More importantly: the previous `gopkg.in/yaml.v3` implementation using `dec.Decode(&node)` would return an error on truly empty input, while `parser.ParseBytes` may return an empty `Docs` slice for whitespace-only or empty input. The new explicit empty-doc check is an improvement, but the error message 'empty YAML document' differs from what the old library would have returned — worth noting if callers match on error text.
if len(file.Docs) == 0 {
return fmt.Errorf("empty YAML document")
}
nodeCount := 0
Review

[MINOR] The go 1.26.2 in go.mod is a pre-release/future version (Go 1.26 has not been released). This may cause go mod tidy to behave unexpectedly on stable Go toolchains and suggests the module was initialized with an unstable toolchain. This is pre-existing and not introduced by this PR, but worth flagging.

**[MINOR]** The `go 1.26.2` in go.mod is a pre-release/future version (Go 1.26 has not been released). This may cause `go mod tidy` to behave unexpectedly on stable Go toolchains and suggests the module was initialized with an unstable toolchain. This is pre-existing and not introduced by this PR, but worth flagging.
if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]struct{}), &nodeCount); err != nil {
return err
14
@@ -210,7 +212,10 @@ func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, seen map[ast.N
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
}
Outdated
Review

[MINOR] The nodeCount increment (line 213, *nodeCount++) happens before the depth-aware short-circuit check (line 222, if prevDepth, ok := validated[node]; ok && depth <= prevDepth). This means if a node is revisited at a shallower depth (triggering the short-circuit), it still increments the count, causing overcounting of shared subtrees referenced via aliases at multiple depths. This could cause false positives on the MaxYAMLNodes limit for legitimately large but valid documents that share anchors extensively. The nodeCount increment should arguably be inside the path where we actually traverse, or this behavior should be explicitly documented as intentional (conservative overcounting).

**[MINOR]** The nodeCount increment (line 213, `*nodeCount++`) happens before the depth-aware short-circuit check (line 222, `if prevDepth, ok := validated[node]; ok && depth <= prevDepth`). This means if a node is revisited at a shallower depth (triggering the short-circuit), it still increments the count, causing overcounting of shared subtrees referenced via aliases at multiple depths. This could cause false positives on the MaxYAMLNodes limit for legitimately large but valid documents that share anchors extensively. The nodeCount increment should arguably be inside the path where we actually traverse, or this behavior should be explicitly documented as intentional (conservative overcounting).
// Cycle detection: if we've seen this node before, we're in a cycle.
// Cycle detection: uses pointer identity (ast.Node is an interface, but all
// concrete node types are pointers) to detect revisits. This intentionally
// compares pointer identity, not structural equality, since we want to track
// specific node instances in the parsed AST graph.
Outdated
Review

[MINOR] The checkYAMLDepth function uses ast.Node as a map key for cycle detection. ast.Node is an interface — map keys for interface types use the dynamic type and value for equality. For pointer types (which all the concrete ast node types appear to be), this works correctly (pointer identity). This is fine but worth a comment explaining that pointer identity is the intended comparison, not structural equality, to prevent future maintainers from being surprised.

**[MINOR]** The `checkYAMLDepth` function uses `ast.Node` as a map key for cycle detection. `ast.Node` is an interface — map keys for interface types use the dynamic type and value for equality. For pointer types (which all the concrete ast node types appear to be), this works correctly (pointer identity). This is fine but worth a comment explaining that pointer identity is the intended comparison, not structural equality, to prevent future maintainers from being surprised.
if _, ok := seen[node]; ok {
Outdated
Review

[MAJOR] Depth check can be bypassed due to global 'seen' set: if an anchored/aliased subtree is first validated at a shallow depth and later referenced via an alias deeper in the structure, the early return on previously seen node skips re-checking at the deeper context. This can allow effective nesting depth to exceed MaxYAMLDepth, potentially leading to stack exhaustion/DoS during decode.

**[MAJOR]** Depth check can be bypassed due to global 'seen' set: if an anchored/aliased subtree is first validated at a shallow depth and later referenced via an alias deeper in the structure, the early return on previously seen node skips re-checking at the deeper context. This can allow effective nesting depth to exceed MaxYAMLDepth, potentially leading to stack exhaustion/DoS during decode.
return nil // Already validated this subtree, skip to avoid infinite recursion.
Outdated
Review

[MINOR] The nodeCount increment happens before the visiting and validated checks. This means nodes encountered in cycles (where visiting[node] is true and we return early) are still counted against the total, potentially causing false-positive node count limit errors for valid YAML with shared anchors referenced multiple times. The increment should ideally only count genuinely new work, or the behavior should be explicitly documented.

**[MINOR]** The `nodeCount` increment happens before the `visiting` and `validated` checks. This means nodes encountered in cycles (where `visiting[node]` is true and we return early) are still counted against the total, potentially causing false-positive node count limit errors for valid YAML with shared anchors referenced multiple times. The increment should ideally only count genuinely new work, or the behavior should be explicitly documented.
Outdated
Review

[MINOR] The nodeCount is incremented before the visiting cycle detection check. For cyclic structures, this means a node that triggers cycle detection (returns early) still increments nodeCount. This slightly over-counts nodes in cyclic test scenarios, but since real YAML parsed by parser.ParseBytes cannot have true reference cycles in the AST (only alias references that point forward), this is functionally harmless in production. However, in the unit tests that construct artificial cycles, the count will be inflated.

**[MINOR]** The `nodeCount` is incremented before the `visiting` cycle detection check. For cyclic structures, this means a node that triggers cycle detection (returns early) still increments `nodeCount`. This slightly over-counts nodes in cyclic test scenarios, but since real YAML parsed by `parser.ParseBytes` cannot have true reference cycles in the AST (only alias references that point forward), this is functionally harmless in production. However, in the unit tests that construct artificial cycles, the count will be inflated.
}
13
+33 -1
View File
2
@@ -459,7 +459,8 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
Review

[MINOR] TestYAMLDeeplyNestedRejection comment says 'With 25 levels, we exceed MaxYAMLDepth (20)' but the depth counting depends on both MappingNode and MappingValueNode traversal (each contributes depth+1 for key, depth+1 for value). The comment would benefit from a concrete trace of how the depth accumulates to make the test self-documenting and easier to maintain if MaxYAMLDepth changes.

**[MINOR]** TestYAMLDeeplyNestedRejection comment says 'With 25 levels, we exceed MaxYAMLDepth (20)' but the depth counting depends on both MappingNode and MappingValueNode traversal (each contributes depth+1 for key, depth+1 for value). The comment would benefit from a concrete trace of how the depth accumulates to make the test self-documenting and easier to maintain if MaxYAMLDepth changes.
// Each level adds to the depth count via mapping values.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
// Each nested mapping key generates a MappingValueNode, incrementing depth
Outdated
Review

[NIT] The comment in TestYAMLDeeplyNestedRejection was updated from 'Each level adds 2 to the depth count (key + value mapping)' to 'Each level adds to the depth count via mapping values.' The new comment is vague — it doesn't explain the actual counting semantics with the new AST walker, making it harder to reason about whether the test correctly triggers the depth limit. Consider specifying exactly how many depth units each level of nesting adds under the new implementation.

**[NIT]** The comment in `TestYAMLDeeplyNestedRejection` was updated from 'Each level adds 2 to the depth count (key + value mapping)' to 'Each level adds to the depth count via mapping values.' The new comment is vague — it doesn't explain the actual counting semantics with the new AST walker, making it harder to reason about whether the test correctly triggers the depth limit. Consider specifying exactly how many depth units each level of nesting adds under the new implementation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
// by 1 per level in the AST walk. With 25 levels, we exceed MaxYAMLDepth (20).
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " "
2
@@ -483,6 +484,37 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

[MINOR] Extra blank line between TestYAMLDeeplyNestedRejection and TestYAMLEmptyFileRejection (two consecutive blank lines). Minor style issue per gofmt conventions (one blank line between top-level declarations).

**[MINOR]** Extra blank line between `TestYAMLDeeplyNestedRejection` and `TestYAMLEmptyFileRejection` (two consecutive blank lines). Minor style issue per `gofmt` conventions (one blank line between top-level declarations).
Review

[NIT] Double blank line before TestYAMLEmptyFileRejection. Minor formatting inconsistency — gofmt doesn't enforce blank-line counts between top-level declarations but it's slightly noisy.

**[NIT]** Double blank line before `TestYAMLEmptyFileRejection`. Minor formatting inconsistency — `gofmt` doesn't enforce blank-line counts between top-level declarations but it's slightly noisy.
Review

[NIT] Extra blank line between TestYAMLDeeplyNestedRejection closing brace and TestYAMLEmptyFileRejection. Minor style inconsistency — gofmt doesn't enforce single blank lines between top-level declarations, but two blank lines is unusual.

**[NIT]** Extra blank line between `TestYAMLDeeplyNestedRejection` closing brace and `TestYAMLEmptyFileRejection`. Minor style inconsistency — `gofmt` doesn't enforce single blank lines between top-level declarations, but two blank lines is unusual.
Review

[MINOR] There is a spurious blank line between TestYAMLDeeplyNestedRejection and TestYAMLEmptyFileRejection (two consecutive blank lines). Minor style issue per gofmt conventions, but the diff shows it was introduced in this PR.

**[MINOR]** There is a spurious blank line between `TestYAMLDeeplyNestedRejection` and `TestYAMLEmptyFileRejection` (two consecutive blank lines). Minor style issue per gofmt conventions, but the diff shows it was introduced in this PR.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
func TestYAMLEmptyFileRejection(t *testing.T) {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
dir := t.TempDir()
Review

[NIT] In TestYAMLEmptyFileRejection, the dir variable is declared in the outer test function but files are written using paths under dir from within subtests. Since t.TempDir() is called on the outer t, it will be cleaned up when the outer test completes, which is fine. However, each subtest creates a file with the test case name as part of the filename directly in the shared dir. This is fine for parallel safety (test cases have unique names) but it is slightly more idiomatic to call t.TempDir() inside the subtest.

**[NIT]** In `TestYAMLEmptyFileRejection`, the `dir` variable is declared in the outer test function but files are written using paths under `dir` from within subtests. Since `t.TempDir()` is called on the outer `t`, it will be cleaned up when the outer test completes, which is fine. However, each subtest creates a file with the test case name as part of the filename directly in the shared `dir`. This is fine for parallel safety (test cases have unique names) but it is slightly more idiomatic to call `t.TempDir()` inside the subtest.
Review

[NIT] The TestYAMLEmptyFileRejection subtests write files into a shared dir created outside the loop. If two subtests run with the same filename pattern, they'd overwrite each other. Currently the filenames are distinct (completely_empty.yaml, whitespace_only.yaml, comment_only.yaml) so there's no actual issue, but using t.TempDir() inside each subtest would be the idiomatic pattern.

**[NIT]** The `TestYAMLEmptyFileRejection` subtests write files into a shared `dir` created outside the loop. If two subtests run with the same filename pattern, they'd overwrite each other. Currently the filenames are distinct (`completely_empty.yaml`, `whitespace_only.yaml`, `comment_only.yaml`) so there's no actual issue, but using `t.TempDir()` inside each subtest would be the idiomatic pattern.
Review

[NIT] The TestYAMLEmptyFileRejection test creates the temp dir outside the tests loop, reusing a single dir for all subtests. Since each subtest writes to tc.name+".yaml" (different file names), there's no collision. However, t.TempDir() is called once at the top of the test function rather than inside each subtest — this is fine since the file names don't overlap, but using t.TempDir() inside each t.Run would be slightly more idiomatic for isolation.

**[NIT]** The `TestYAMLEmptyFileRejection` test creates the temp dir outside the `tests` loop, reusing a single dir for all subtests. Since each subtest writes to `tc.name+".yaml"` (different file names), there's no collision. However, `t.TempDir()` is called once at the top of the test function rather than inside each subtest — this is fine since the file names don't overlap, but using `t.TempDir()` inside each `t.Run` would be slightly more idiomatic for isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
tests := []struct {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
name string
Review

[MINOR] The TestYAMLEmptyFileRejection test uses tc.name (e.g., "completely empty") as the filename fragment but writes all files to the same dir. The test case names contain spaces which are valid in filenames but could cause issues on some filesystems. More importantly, the test writes files with names like completely empty.yaml — using spaces in filenames is unusual and could cause subtle test issues. Prefer replacing spaces with underscores or hyphens.

**[MINOR]** The `TestYAMLEmptyFileRejection` test uses `tc.name` (e.g., `"completely empty"`) as the filename fragment but writes all files to the same `dir`. The test case names contain spaces which are valid in filenames but could cause issues on some filesystems. More importantly, the test writes files with names like `completely empty.yaml` — using spaces in filenames is unusual and could cause subtle test issues. Prefer replacing spaces with underscores or hyphens.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

[NIT] TestYAMLEmptyFileRejection has a redundant if err != nil guard on the second condition: if err != nil && !strings.Contains(...). Since the preceding if err == nil { t.Error(...); return } (implicit — actually the test doesn't return on the first error, it falls through) — actually the first check is if err == nil { t.Error(...) } without a return, so err could still be nil here. This means the strings.Contains check is guarded correctly with err != nil, but the test won't t.Error the wrong message if err == nil (it will have already errored on the first check). The pattern is slightly unusual compared to the rest of the file which uses early returns. Consistent style would be to add a return after the first t.Error, then drop the err != nil guard.

**[NIT]** `TestYAMLEmptyFileRejection` has a redundant `if err != nil` guard on the second condition: `if err != nil && !strings.Contains(...)`. Since the preceding `if err == nil { t.Error(...); return }` (implicit — actually the test doesn't return on the first error, it falls through) — actually the first check is `if err == nil { t.Error(...) }` without a return, so `err` could still be nil here. This means the `strings.Contains` check is guarded correctly with `err != nil`, but the test won't `t.Error` the wrong message if `err == nil` (it will have already errored on the first check). The pattern is slightly unusual compared to the rest of the file which uses early returns. Consistent style would be to add a `return` after the first `t.Error`, then drop the `err != nil` guard.
content string
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}{
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
{"completely empty", ""},
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
{"whitespace only", " \n\n "},
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
{"comment only", "# just a comment\n"},
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
for _, tc := range tests {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
t.Run(tc.name, func(t *testing.T) {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
path := filepath.Join(dir, tc.name+".yaml")
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
t.Fatalf("failed to write test file: %v", err)
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
_, err := LoadPersona(path)
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
if err == nil {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
t.Error("expected error for empty YAML input, got nil")
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
if err != nil && !strings.Contains(err.Error(), "empty YAML document") {
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
})
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
}
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml")
1
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
Review

Fixed in 0b16c41: moved t.TempDir() inside each subtest for proper isolation.

Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.