fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml #89
@@ -166,11 +166,10 @@ func parsePersona(data []byte, source string) (*Persona, error) {
|
||||
return &p, nil
|
||||
}
|
||||
|
||||
// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
|
||||
// and strict field checking. This protects against stack exhaustion from deeply
|
||||
// nested structures and catches typos in field names.
|
||||
// Multi-document YAML files are rejected to prevent confusing behavior
|
||||
// where additional documents are silently ignored.
|
||||
// unmarshalYAMLWithDepthLimit unmarshals YAML data with three safety checks:
|
||||
// - Depth limiting: rejects AST trees exceeding maxDepth to prevent stack exhaustion.
|
||||
// - Multi-document rejection: prevents silent data loss from ignored extra documents.
|
||||
// - Strict field checking: rejects unknown YAML keys to catch typos early.
|
||||
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
|
||||
// First pass: parse into AST to check depth limits, node counts, and
|
||||
// multi-document rejection. This prevents stack exhaustion before we
|
||||
@@ -247,10 +246,13 @@ func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[
|
||||
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
|
||||
|
|
||||
}
|
||||
|
sonnet-review-bot
commented
[NIT] The **[NIT]** The `checkYAMLDepth` function receives `maxNodes int` as a parameter but `MaxYAMLNodes` is passed as the constant from the call site. The `maxNodes` parameter is never varied between call sites — it's always `MaxYAMLNodes`. Could simplify by using the constant directly in the function, but having it as a parameter makes testing easier (allows injecting lower limits in tests). Current approach is acceptable.
|
||||
|
||||
// Depth-aware short-circuit: only skip re-checking a node if we previously
|
||||
// validated it at the same or deeper effective depth. If this visit is at a
|
||||
// greater depth than before (e.g., alias referenced deeper in the tree),
|
||||
// we must re-traverse to catch depth limit violations.
|
||||
// Depth-aware short-circuit: skip re-validation only when the current visit
|
||||
// depth is the same or shallower than the depth at which this node was
|
||||
// previously validated. A shallower (or equal) current depth means the
|
||||
// prior, deeper validation already covered any subtree depth violations.
|
||||
// If the current depth exceeds the previous validation depth (e.g., an alias
|
||||
// references this node deeper in the tree), we must re-traverse to ensure
|
||||
// the combined effective depth doesn't exceed maxDepth.
|
||||
//
|
||||
// Note: using ast.Node (interface) as map key relies on pointer identity,
|
||||
// which is correct because all goccy/go-yaml AST node types are pointer
|
||||
|
sonnet-review-bot
commented
[MINOR] The **[MINOR]** The `MappingValueNode` case visits both Key and Value at `depth+1` relative to the MappingValueNode's own depth. Since `MappingValueNode` is itself visited at `depth+1` from its parent `MappingNode`, keys end up at `depth+2` from the mapping. This asymmetry between key depth and value depth means scalar keys consume an extra depth level relative to what might be intuitive, but it's consistent and the tests validate the behavior works. Worth a brief comment explaining that keys consume a depth level intentionally.
|
||||
@@ -273,6 +275,11 @@ func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[
|
||||
}
|
||||
}
|
||||
|
sonnet-review-bot
commented
[NIT] The **[NIT]** The `AnchorNode` case increments depth for the anchor definition. The comment explains this is intentional and asymmetric. However, this means an anchored value at the top level that nests 10 levels will consume 11 depth budget (1 for anchor + 10 for content), and when aliased at depth 5, the alias expands to depth 5+1+10=16. The comment says 'combined budget is halved' which isn't quite accurate — it's more nuanced. The comment could be clearer, but this doesn't affect correctness.
rodin
commented
Acknowledged. The AnchorNode comment already explains the asymmetry in detail — both the definition site and the reference site each consume a level, making deeply nested anchor/alias pairs hit the limit sooner. The reviewer's observation about the budget not being exactly "halved" is correct (the comment says "reduced" not "halved"), and the existing phrasing captures the design intent accurately. No change — doesn't affect correctness and the comment already explains the reasoning. Acknowledged. The AnchorNode comment already explains the asymmetry in detail — both the definition site and the reference site each consume a level, making deeply nested anchor/alias pairs hit the limit sooner. The reviewer's observation about the budget not being exactly "halved" is correct (the comment says "reduced" not "halved"), and the existing phrasing captures the design intent accurately. No change — doesn't affect correctness and the comment already explains the reasoning.
|
||||
case *ast.MappingValueNode:
|
||||
// Both Key and Value are visited at depth+1 relative to this
|
||||
// MappingValueNode. Since MappingNode visits its MappingValueNode
|
||||
// children at depth+1 as well, keys and values end up at depth+2
|
||||
// from the parent MappingNode. This is intentional: it mirrors the
|
||||
// actual nesting structure (mapping → key-value pair → key/value).
|
||||
if err := checkYAMLDepth(n.Key, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
@@ -510,9 +510,9 @@ func TestYAMLEmptyFileRejection(t *testing.T) {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
|
||||
_, err := LoadPersona(path)
|
||||
if err == nil {
|
||||
t.Error("expected error for empty YAML input, got nil")
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Fatal("expected error for empty YAML input, got nil")
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
}
|
||||
if err != nil && !strings.Contains(err.Error(), "empty YAML document") {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
if !strings.Contains(err.Error(), "empty YAML document") {
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
|
||||
}
|
||||
})
|
||||
|
||||
|
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
rodin
commented
Fixed in Fixed in 0b16c41: moved `t.TempDir()` inside each subtest for proper isolation.
|
||||
[MINOR] The
validatedmap usesast.Node(interface) as a key, which relies on pointer equality for interface comparison. This works correctly for pointer types like*ast.MappingNode, but if anyast.Nodeimplementation is a value type (non-pointer), two structurally identical nodes could hash to different entries or the same entry incorrectly. In practice, all goccy/go-yaml AST node types appear to be pointer types, so this is low risk, but it's an implicit assumption worth a comment.Already addressed. There's an explicit comment at the
validatedmap access (the depth-aware short-circuit block):This documents the assumption directly at the usage site. No change needed.