feat(#141): validate-docmap subcommand — CI hard-fail on missing docmap coverage #142
Reference in New Issue
Block a user
Delete Branch "issue-141"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Adds
review-bot validate-docmap— a pure static CI check that fails fast if:paths:glob in the docmapdocs:entry in the docmap doesn't exist on diskNo AI, no VCS API. Just YAML parsing + glob matching +
os.Stat.Closes #141.
Plan
Problem
CI has no way to verify that
doc-map.ymlis kept up to date. Missing coverage is silent — the AI reviewer gets no docs injected and nobody notices.Approach
Phase 1 — Export
FileCoveredByDocMapfromreview/docmap.goThin wrapper over the existing unexported
mappingMatchesthat letscmd/check per-file glob coverage without duplicating logic.Phase 2 —
cmd/review-bot/validatedocmap.goflag.NewFlagSet(not global flags) to avoid pollutingmain.go's flag stategit diff --name-onlyidiom)validateurl.gopattern exactly:outWriter/errWriterfor testabilityPhase 3 — Wire into
main.go+ testsPhase 4 — README Subcommands section
Exit codes
0— clean1— coverage or stale-doc failures2— usage/parse errorUsage
Progress
Phase 1: Export FileCoveredByDocMap
FileCoveredByDocMap(cfg *DocMapConfig, file string) booladded toreview/docmap.goreview/docmap_test.gogo test ./review/passesPhase 2: validatedocmap.go
runValidateDocmap(args []string) intimplementedflag.NewFlagSetusedbufio.Scannerdocs:entries,os.Stat)outWriter/errWriterused throughoutPhase 3: Wire + tests
case "validate-docmap":added tomain.godispatchvalidatedocmap_test.go— 9 tests covering all casesgo test ./...passesPhase 4: README
## Subcommandssection added covering bothvalidate-docmapandvalidate-urlChecklist
FileCoveredByDocMapexported and covers all-mappings/any-glob logicrunValidateDocmapfollowsrunValidateURLpatterndocs:entries deduplicated before stale checkoutWriter/errWriterused throughoutcase "validate-docmap":wired inmain.gogo test ./...passes cleanAdds 'review-bot validate-docmap' for CI hard-fail on docmap coverage gaps. Usage: git diff --name-only origin/main HEAD | \ review-bot validate-docmap --docmap .review-bot/doc-map.yml --repo-root . Flags: --docmap (required) path to doc-map YAML file --repo-root (optional, default '.') root for resolving docs: paths Two checks, both always run: 1. Coverage: every stdin file must match at least one paths: glob. 2. Stale docs: every docs: entry must exist on disk under --repo-root. Exit codes: 0=clean, 1=failures found, 2=usage/parse error. Tests cover: clean pass, uncovered file, stale doc, both failures, empty stdin, blank-line stdin, and duplicate docs: deduplication.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
7cdba141)Sonnet Review
Summary
Clean, well-structured implementation of the
validate-docmapsubcommand. The code follows established patterns in the codebase, is thoroughly tested with 9 test cases, and CI passes. Minor observations noted but none blocking.Findings
cmd/review-bot/validatedocmap_test.gocmd/review-bot/validatedocmap.golegacy/**with a stale doc path will fail even when no legacy files changed. A short doc comment on checkStaleDocs clarifying this global-check intent would help future maintainers.cmd/review-bot/validatedocmap.goRecommendation
APPROVE — Approve. The implementation is correct, idiomatic, and well-tested. It follows the existing validateurl.go pattern faithfully, uses flag.NewFlagSet for proper flag isolation, writes to the testable outWriter/errWriter, and handles all documented edge cases (empty stdin, blank lines, duplicate doc deduplication, both-failures). The three findings are all NITs that do not affect correctness or maintainability in any meaningful way. CI passes.
Review by sonnet
Evaluated against
7cdba141@@ -0,0 +93,4 @@}}if failed {[NIT] The stale-docs check always runs against ALL mappings in the docmap regardless of whether those mappings were triggered by any changed files. This is intentional per the PR description ('all docs: entries must exist on disk'), but the behaviour differs from what a user might expect: a mapping for
legacy/**with a stale doc path will fail even when no legacy files changed. A short doc comment on checkStaleDocs clarifying this global-check intent would help future maintainers.@@ -0,0 +125,4 @@return stale}// readLines reads all non-empty trimmed lines from r.[NIT] readLines uses bufio.Scanner with the default 64KB token buffer. Very long file paths (unlikely but possible with generated paths) would silently be truncated by scanner.Scan() returning false without error when a line exceeds the buffer. For a CI tool reading git diff output this is practically fine, but a comment noting the 64KB line limit or using scanner.Buffer to increase it would be defensive.
@@ -0,0 +70,4 @@// Provide stdin via the global os.Stdin would require process tricks.// The implementation reads os.Stdin directly; for tests we need a different approach.// See TestRunValidateDocmap_WithStdin_* below which use a helper that replaces os.Stdin.// This case: empty stdin (no changed files) → clean.[NIT] TestRunValidateDocmap_Clean has a long comment block explaining why it relies on process-level stdin instead of the stdinValidateDocmap helper, which is defined later in the same file. The test would be cleaner and more consistent if it simply used stdinValidateDocmap with an empty string — which is exactly what TestRunValidateDocmap_EmptyStdin does. These two tests appear to be testing the same case (empty stdin → clean). Consider removing TestRunValidateDocmap_Clean or consolidating it with TestRunValidateDocmap_EmptyStdin.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
7cdba141)Security Review
Summary
The new validate-docmap subcommand is well-scoped and CI passed, but there is a path traversal risk in the stale-docs check that can disclose filesystem information on the CI runner. This should be addressed before merging.
Findings
cmd/review-bot/validatedocmap.goRecommendation
REQUEST_CHANGES — Harden the stale-docs check to prevent path traversal and absolute path usage. Specifically: (1) Validate each docPath before stat by rejecting absolute paths and any ".." segments (reuse the review.validateDocPath logic or implement equivalent). (2) After joining with repoRoot, resolve and normalize the path (e.g., filepath.Clean + filepath.EvalSymlinks) and verify it remains within repoRoot via filepath.Rel; if it escapes, treat it as invalid and fail closed (report as stale/invalid rather than attempting os.Stat). (3) Optionally, avoid treating absolute doc paths as valid; require repo-relative paths only. These changes will prevent PR-controlled doc-map entries from probing the runner filesystem and leaking existence information in CI output.
Review by security
Evaluated against
7cdba141@@ -0,0 +116,4 @@}seen[docPath] = struct{}{}fullPath := filepath.Join(repoRoot, filepath.FromSlash(docPath))[MAJOR] Path traversal/information disclosure: stale-docs check joins repoRoot with unvalidated doc paths and calls os.Stat without enforcing that the resolved path stays within repoRoot. Absolute paths or ".." segments in doc-map docs: entries can cause os.Stat to probe arbitrary filesystem locations (e.g., /etc/passwd), creating an existence oracle visible in CI logs.
Self-review against
2ecbd86e24Assessment: ✅ Clean
Self-Review: issue-141 — 2026-05-15
Verdict: PASS
No issues found — ready for human review.
Gates:
go test ./...✅ |go vet ./...✅ |scripts/check-deps.sh✅Coherence: Three clean commits — export helper, add subcommand, update README. Each commit has one reason to exist.
Fit: Follows
validateurl.gopattern exactly (same function signature, same exit codes, sameoutWriter/errWritercapture). Usesflag.NewFlagSetcorrectly to avoid polluting global flag state. No new dependencies, only stdlib + existing review package.Solution: Minimal change — exports one helper from
review/docmap.go, adds one file incmd/, two-line wire-up inmain.go. Doesn't over-engineer.Completeness: 9 tests covering all specified cases. One self-review bug caught and fixed:
TestRunValidateDocmap_Cleanwas reading realos.Stdin(fragile in CI); fixed to use thestdinValidateDocmaphelper.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
2ecbd86e)Sonnet Review
Summary
Clean, well-structured implementation of the
validate-docmapsubcommand. The code follows established patterns in the codebase (validateurl.go), is well-tested with 9 tests covering all meaningful cases, and CI passes. No major issues found.Findings
cmd/review-bot/validatedocmap.gocheckStaleDocs) runs even when no changed files were provided on stdin (empty stdin case). This means a docmap with staledocs:entries will always fail, even on runs where no files changed — which is arguably the correct behavior, but worth a comment clarifying the intent. Currently the empty-stdin test passes because the test fixture has the doc file present, so it's fine, but the behavior asymmetry (coverage check is vacuously true on empty input, stale-docs check is not) is undocumented.cmd/review-bot/validatedocmap.go--docmap(no hyphen between doc and map) but the README, function name, and error messages consistently spell itdoc-map. Minor inconsistency —--doc-mapwould match the established naming convention of--doc-map-max-bytesand the config file namedoc-map.yml.cmd/review-bot/validatedocmap_test.gocaptureOutputhelper mutates package-leveloutWriter/errWritervariables without any synchronisation. This is fine for sequential tests but would race if tests run in parallel (t.Parallel()). No parallel calls exist currently, but a comment noting this limitation would be defensive documentation.Recommendation
APPROVE — Approve. The implementation is correct, idiomatic, and well-tested. The findings are minor: a small naming inconsistency (
--docmapvs--doc-map), an undocumented behavioral asymmetry between the two checks on empty stdin, and a non-threadsafe test helper that is safe in the current non-parallel context. None of these warrant blocking the PR.Review by sonnet
Evaluated against
2ecbd86e@@ -0,0 +77,4 @@if len(uncovered) > 0 {failed = truefmt.Fprintln(errWriter, "ERROR: changed files with no docmap coverage:")for _, f := range uncovered {[MINOR] The stale-docs check (
checkStaleDocs) runs even when no changed files were provided on stdin (empty stdin case). This means a docmap with staledocs:entries will always fail, even on runs where no files changed — which is arguably the correct behavior, but worth a comment clarifying the intent. Currently the empty-stdin test passes because the test fixture has the doc file present, so it's fine, but the behavior asymmetry (coverage check is vacuously true on empty input, stale-docs check is not) is undocumented.@@ -0,0 +93,4 @@}}if failed {[NIT] The flag is named
--docmap(no hyphen between doc and map) but the README, function name, and error messages consistently spell itdoc-map. Minor inconsistency —--doc-mapwould match the established naming convention of--doc-map-max-bytesand the config file namedoc-map.yml.@@ -0,0 +41,4 @@outWriter = &outBuferrWriter = &errBufdefer func() {outWriter = origOut[NIT] The
captureOutputhelper mutates package-leveloutWriter/errWritervariables without any synchronisation. This is fine for sequential tests but would race if tests run in parallel (t.Parallel()). No parallel calls exist currently, but a comment noting this limitation would be defensive documentation.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
2ecbd86e)Gpt Review
Summary
Solid addition of the validate-docmap subcommand with clear wiring, comprehensive tests, and README updates. Implementation is idiomatic, respects repo conventions, and CI passes.
Recommendation
APPROVE — The new validate-docmap subcommand is well-implemented: it cleanly parses flags with a dedicated FlagSet, reads stdin robustly, performs coverage and stale-doc checks as specified, and uses outWriter/errWriter for testability. The export of FileCoveredByDocMap reuses existing matching logic appropriately, and tests thoroughly cover edge cases (empty stdin, duplicate docs, both failure modes). Wiring into main and README updates are consistent. No changes requested.
Review by gpt
Evaluated against
2ecbd86eOriginal reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
2ecbd86e)Security Review
Summary
The new validate-docmap subcommand and exported coverage helper look correct functionally, but the stale-docs check uses unvalidated doc paths from the repo to stat arbitrary filesystem locations under the runner. This enables path traversal/existence probing of the CI host filesystem via PR-controlled YAML.
Findings
cmd/review-bot/validatedocmap.goRecommendation
REQUEST_CHANGES — Overall implementation of validate-docmap is solid, but you must harden the stale-docs check against path traversal. In checkStaleDocs, validate each docPath before using it:
This prevents PR-controlled doc-map entries from probing arbitrary files on the CI runner. Optionally, reuse a shared validator (export review.ValidateDocPath or implement equivalent locally) and add tests covering traversal attempts. After addressing this, the PR should be safe to approve.
Review by security
Evaluated against
2ecbd86e@@ -0,0 +115,4 @@continue}seen[docPath] = struct{}{}[MAJOR] Path traversal/existence probing: checkStaleDocs joins untrusted docmap paths with --repo-root and calls os.Stat without validating that the doc path is relative and confined to repoRoot. A malicious PR can include entries like "../../etc/passwd" or absolute paths to probe the CI runner filesystem. Validate doc paths (reject absolute and '..' segments) and enforce confinement with filepath.Rel/EvalSymlinks before stat.
Fix Plan against
2ecbd86e24:Problem
checkStaleDocsincmd/review-bot/validatedocmap.gojoinsrepoRootwith unvalidateddocPathvalues from PR-controlled YAML and callsos.Statwithout verifying the resolved path stays withinrepoRoot. A maliciousdocs:entry like../../etc/passwdor/etc/passwdcan probe arbitrary filesystem locations on the CI runner.Fix
Step 1 — Export
validateDocPathfrom thereviewpackage.The
review/docmap.gopackage already has a correctvalidateDocPaththat rejects absolute paths and..segments. I'll export it asreview.ValidateDocPath(uppercase V). This avoids duplicating logic and givescmd/review-botaccess to the same validation used in the review engine.Change in
review/docmap.go:validateDocPath→ValidateDocPath(export it)loadDocEntriesto use the new nameStep 2 — Harden
checkStaleDocsincmd/review-bot/validatedocmap.go.Before calling
os.Stat, validate eachdocPath:review.ValidateDocPath(docPath)— rejects absolute paths and..segments. If it fails, add docPath to stale list and continue (don't stat).fullPath := filepath.Join(repoRoot, filepath.FromSlash(docPath)).filepath.EvalSymlinks(fullPath)— if symlink resolution fails (path doesn't exist yet), fall through to the same "stale" treatment. If it succeeds, verify the resolved path is underrepoRootviafilepath.Rel.filepath.Relshows the path escapesrepoRoot(starts with..), treat as invalid: report it and continue without stat.os.Statas before.Note:
filepath.EvalSymlinksreturns an error for non-existent paths. When a valid path simply doesn't exist yet (stale doc), we want to report it as stale, not as a traversal. So: ifValidateDocPathpasses ANDfilepath.Relcheck passes, butEvalSymlinksfails (ENOENT), we can still stat (or skip stat and directly mark stale) — the traversal guard is satisfied byValidateDocPath+filepath.Cleannormalization first.Revised approach to avoid the chicken-and-egg with EvalSymlinks on missing paths:
review.ValidateDocPath(docPath)→ reject if failsfullPath := filepath.Clean(filepath.Join(repoRoot, filepath.FromSlash(docPath)))rel, err := filepath.Rel(repoRoot, fullPath)→ ifstrings.HasPrefix(rel, ".."), reject (escape)os.Stat(fullPath)— only reached for safe, confined pathsfilepath.Cleannormalizes..segments in the joined path, so even if a..slips through (it shouldn't after step 1),filepath.Relwill catch escape. The two checks are defense-in-depth.Step 3 — Tests in
cmd/review-bot/validatedocmap_test.go.Add
TestCheckStaleDocs_PathTraversalcovering:../../etc/passwd→ treated as invalid (not stat'd)/etc/passwd→ treated as invalid (absolute path)../outside→ treated as invaliddocs/valid.md(exists) → not staledocs/missing.md(doesn't exist) → stale, not a traversal errorWhat I'm NOT changing
reviewpackage tests (they already covervalidateDocPath)Completion Checklist
review.ValidateDocPathexported and internal call site updatedcheckStaleDocscallsreview.ValidateDocPathbefore touching filesystemfilepath.Clean+filepath.Relconfinement check in placeos.Stat../../etc/passwd,/etc/passwd,../outside, valid present path, valid missing pathgo test ./...passesgo vet ./...cleanOriginal reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f8da76b)Sonnet Review
Summary
Clean, well-structured implementation of the validate-docmap subcommand. The code follows established patterns, has thorough test coverage, and includes proper path-traversal hardening. CI passes.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap_test.goRecommendation
APPROVE — Approve. The implementation is correct, idiomatic, well-tested, and follows the established validateurl.go pattern. The two minor findings (os.Stdin coupling and a slightly awkward error message) are worth addressing in a follow-up but do not warrant blocking this PR. Path-traversal hardening is thorough and correctly layered (ValidateDocPath + filepath.Rel guard). The export of validateDocPath→ValidateDocPath and FileCoveredByDocMap is a minimal, clean API surface. All 9 tests pass and CI is green.
Review by sonnet
Evaluated against
3f8da76b@@ -0,0 +71,4 @@var uncovered []stringfor _, f := range changedFiles {if !review.FileCoveredByDocMap(cfg, f) {uncovered = append(uncovered, f)[MINOR] readLines reads from os.Stdin directly rather than accepting an io.Reader parameter. This forces tests to swap os.Stdin (which is fragile and not goroutine-safe), whereas accepting io.Reader would let callers pass any reader. The existing test helper works around this by redirecting os.Stdin, but the function signature itself is less testable than it could be. Compare with validateurl.go pattern — if that also uses os.Stdin directly this is consistent, but it's still worth noting.
@@ -0,0 +86,4 @@// checkStaleDocs validates each path before touching the filesystem; see// its documentation for the path-traversal hardening applied.repoRoot := filepath.Clean(*repoRootFlag)staleDocs := checkStaleDocs(cfg, repoRoot)[NIT] The error message 'ERROR: stale docmap docs: entries (paths do not exist):' has an awkward colon after 'docs:' — this reads as if 'docs:' is YAML syntax leaking into the message. Consider 'ERROR: stale docmap entries (docs: paths that do not exist on disk):' or simply 'ERROR: docs: entries not found on disk:'.
@@ -0,0 +155,4 @@})return}[NIT] stdinValidateDocmap leaks the temp file handle — it calls defer f.Close() but the file is passed to os.Stdin before the function returns. Because os.Stdin is restored in the deferred restore and runValidateDocmap reads synchronously, this works in practice, but a reader of the test may be confused. A comment explaining the ordering would help.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f8da76b)Gpt Review
Summary
The new validate-docmap subcommand is well-implemented, follows existing patterns (flag.NewFlagSet, out/err writer indirection), and includes comprehensive tests. The review package exports needed helpers cleanly with proper documentation, and README updates are clear.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goRecommendation
APPROVE — Overall this PR is solid: the validate-docmap subcommand is added cleanly, wired into main with isolated flag parsing, and thoroughly tested across success, failure, and edge cases. The review package exports FileCoveredByDocMap and ValidateDocPath appropriately, and all tests pass in CI.
As a small hardening improvement, consider resolving symlinks for doc paths in checkStaleDocs and verifying the resolved path remains under --repo-root before calling os.Stat. Also consider normalizing doc paths before deduplication to avoid duplicate reporting for equivalent paths. These are minor and can be addressed in a follow-up; approving as-is.
Review by gpt
Evaluated against
3f8da76b@@ -0,0 +103,4 @@return 0}// checkStaleDocs returns deduplicated docs: entries that do not exist under[NIT] Deduplication of docs: entries uses the raw docPath string; consider normalizing (e.g., filepath.Clean and using forward slashes) before inserting into the seen set to avoid duplicate reports for equivalent paths like "docs/./a.md" vs "docs/a.md".
@@ -0,0 +115,4 @@seen := make(map[string]struct{})var stale []stringfor _, mapping := range cfg.Mappings {[MINOR] checkStaleDocs confines paths via filepath.Clean + filepath.Rel but does not resolve symlinks. A symlink under repo-root could point outside and os.Stat would follow it, potentially leaking host file existence. Consider using filepath.EvalSymlinks on fullPath and re-checking it stays under repoRoot before statting.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
3f8da76b)Security Review
Summary
Overall the implementation is clean and follows safe patterns for path normalization and validation. However, the new validate-docmap subcommand can be abused via symlink traversal to probe file existence on the CI host, causing an information disclosure side-channel.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goreview/docmap.goRecommendation
REQUEST_CHANGES — Address the symlink traversal in validate-docmap's stale-docs check to prevent information disclosure on CI hosts. In checkStaleDocs, either: (a) replace os.Stat with os.Lstat and treat any symlink (file or directory) as invalid/stale, or (b) resolve the fullPath with filepath.EvalSymlinks and then verify the resolved path is still within an absolute, symlink-resolved repoRoot before performing any filesystem operations. Additionally, consider making repoRoot absolute and resolving its symlinks once at the start of the function to ensure robust containment, and optionally harden ValidateDocPath to normalize or reject backslashes for consistency on Windows. Add a unit test that creates a symlink under repoRoot pointing outside and asserts it's rejected.
Review by security
Evaluated against
3f8da76b@@ -0,0 +83,4 @@}// --- Check 2: Stale docs ---// checkStaleDocs validates each path before touching the filesystem; see[MINOR] repoRoot is only cleaned (filepath.Clean) but not resolved; resolving repoRoot with filepath.Abs + filepath.EvalSymlinks before Rel checks would harden against cases where the root itself is a symlink and reduce ambiguity across platforms.
@@ -0,0 +134,4 @@// Guard 2: verify the cleaned joined path does not escape repoRoot.// filepath.Clean resolves any remaining ".." after the join; the// filepath.Rel check confirms the path is still under repoRoot.[MAJOR] Symlink traversal allows probing existence of arbitrary host files. checkStaleDocs uses os.Stat on repo-controlled paths after Clean/Rel checks, but does not prevent symlinks under repoRoot from pointing outside the workspace. A malicious PR can add a symlink in the repo and reference it in doc-map to infer presence/absence of sensitive files on the CI runner via pass/fail or logs. Mitigate by using os.Lstat to reject symlinks, or resolve with filepath.EvalSymlinks and re-validate the resolved path remains within repoRoot before stat.
@@ -306,3 +317,4 @@func ValidateDocPath(p string) error {if filepath.IsAbs(p) {return fmt.Errorf("absolute paths not allowed")}[NIT] ValidateDocPath splits on '/' only. While later checks use filepath.FromSlash and Rel, consider normalizing or explicitly rejecting backslashes in paths to avoid platform-specific edge cases on Windows.
Fix Plan against
3f8da76b42:Finding #1 [MAJOR] — Symlink traversal in
checkStaleDocs(cmd/review-bot/validatedocmap.go~line 137)Problem:
os.Stat(fullPath)follows symlinks, so a malicious PR can create a symlink under repoRoot pointing at an arbitrary host path (e.g./etc/shadow) and the existence check will resolve through it, leaking file-existence information.Fix: Replace
os.Stat(fullPath)withos.Lstat(fullPath).Lstatdoes not follow symlinks — it stats the link itself. If the result is a symlink (fi.Mode()&os.ModeSymlink != 0), treat the entry as stale (append to stale list). This refuses to follow symlinks at all, which is the safest posture for a CI tool running against PR-controlled content.Implementation:
Finding #2 [MINOR] —
repoRootonly cleaned, not resolved (cmd/review-bot/validatedocmap.go~line 86)Problem:
filepath.Clean(*repoRootFlag)normalizes lexically but does not resolve symlinks. If--repo-rootitself is a symlink, thefilepath.Relescape check can be bypassed by a symlink chain.Fix: Resolve
repoRootonce at the top ofrunValidateDocmapusingfilepath.Abs+filepath.EvalSymlinksbefore passing it tocheckStaleDocs.Implementation (in
runValidateDocmap, before callingcheckStaleDocs):The
repoRoot := filepath.Clean(*repoRootFlag)line (currently used for stale check) is replaced byresolvedRoot. The coverage check does not use repoRoot so it is unaffected.Finding #3 [NIT] — Backslash not rejected in
ValidateDocPath(review/docmap.go~line 320)Problem:
ValidateDocPathsplits on/only. A path likedocs\foo.mdpasses validation. On Windows or if a backend normalizes backslashes, this could allow platform-specific edge cases.Fix: Add an explicit backslash check at the top of
ValidateDocPath:Test Plan
Add to
cmd/review-bot/validatedocmap_test.go:TestCheckStaleDocs_Symlink— create a symlink under a temp repoRoot that points outside (e.g. to/etc/passwd), add it to docmap docs. Expect the path to be treated as stale (exit 1).TestCheckStaleDocs_SymlinkInsideRepo— create a symlink within repoRoot pointing to another file inside repoRoot (benign case). Expect it to also be reported as stale (we reject all symlinks, not just escaping ones — simpler, safer for CI use).TestRunValidateDocmap_SymlinkRepoRoot— pass--repo-rootpointing at a symlink to a temp dir containing a valid doc. Ensure it resolves and handles correctly.Add to
review/docmap_test.go(orreview/docmap_validate_test.go):4.
TestValidateDocPath_Backslash— verifyValidateDocPath("docs\\foo.md")returns an error.5.
TestValidateDocPath_BackslashNotInPath— verifyValidateDocPath("docs/foo.md")still passes.Open Questions
repoRootbefore Rel checks.filepath.EvalSymlinkswill error if the path does not exist. If--repo-rootdoes not exist, should we return exit 2 (usage error) or 1 (validation failure)? Decision: exit 2 — an unresolvable repo root is a usage error, not a doc-coverage failure.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
b22de19a)Sonnet Review
Summary
Well-structured, idiomatic Go implementation of the validate-docmap subcommand. The code follows established patterns in the codebase, includes strong security hardening (path traversal, symlink rejection), comprehensive tests, and CI is passing. A few minor issues worth noting but none are blockers.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap_test.goreview/docmap.goRecommendation
APPROVE — Approve. The implementation is clean, well-tested, and follows the established patterns of the codebase (flag.NewFlagSet, outWriter/errWriter, return-int exit codes). The security hardening — Lstat instead of Stat, symlink rejection for doc files, path traversal guards at two layers, and EvalSymlinks on --repo-root — is thoughtful and well-documented. The test coverage is thorough, including adversarial symlink scenarios. CI is passing. The minor findings are all non-blocking: the empty-docmap behaviour is a documented design choice, and the other items are clarity/robustness notes that could be addressed in follow-up.
Review by sonnet
Evaluated against
b22de19a@@ -0,0 +40,4 @@if err := fs.Parse(args); err != nil {// flag.ContinueOnError already wrote the error to errWriter.return 2}[MINOR] The EvalSymlinks call on --repo-root (line ~91) will fail with an error if the directory does not yet exist, which gives a confusing error message at startup rather than a clear 'directory not found' message. The error message just says 'failed to resolve --repo-root' without distinguishing a non-existent directory from a permission error. Not a functional bug, but a UX rough edge.
@@ -0,0 +54,4 @@// Parse docmap YAML.cfg, err := review.ParseDocMapConfig(*docmapFlag)if err != nil {fmt.Fprintf(errWriter, "Error: failed to parse docmap %q: %v\n", *docmapFlag, err)[MINOR] The coverage check reports all uncovered files even when the docmap has zero mappings (empty config). This is technically correct behaviour, but an empty docmap + any changed files will fail with a coverage error. There is no documented special case for an empty docmap meaning 'coverage not required'. This is a design choice but could surprise users who have the --docmap flag set but an empty file. Consider documenting this in the usage text or treating an empty docmap as 'coverage not enforced'.
@@ -0,0 +144,4 @@}if _, err := f.Seek(0, 0); err != nil {t.Fatalf("Seek for stdin: %v", err)}[NIT] stdinValidateDocmap defers f.Close() but the file is opened read-write and os.Stdin is set to f. The defer will close the file descriptor while it may still be in use by runValidateDocmap if Go's test runner parallelises subtests. In practice this is safe because the function runs synchronously and the defer fires after the return, but it is slightly fragile. The existing validateurl pattern likely has the same structure so this is not a regression, just a note.
@@ -298,11 +310,15 @@ func readFileBytes(path string) ([]byte, error) {return os.ReadFile(path)[NIT] The comment on ValidateDocPath says 'Backslashes are rejected explicitly to prevent Windows platform edge cases.' — the sentence could be clearer that the tool itself may run on Windows and that backslashes in YAML doc paths could be misinterpreted by filepath.Join on that platform. Minor documentation clarity.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
b22de19a)Security Review
Summary
The new validate-docmap subcommand is implemented with solid path traversal defenses and safe filesystem checks. Exposed helpers in review/docmap.go strengthen validation and do not introduce vulnerabilities. CI has passed.
Recommendation
APPROVE — Approve this PR. The subcommand uses strict input validation (rejects absolute paths, '..' segments, and backslashes), confines doc paths to a resolved repo root, and avoids following symlinks by using Lstat, which mitigates path traversal and filesystem probing risks in CI. Exporting ValidateDocPath and FileCoveredByDocMap is appropriate and improves safety across the codebase. No security issues were identified in the changes.
Review by security
Evaluated against
b22de19aOriginal reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
b22de19a)Gpt Review
Summary
Well-scoped addition of the validate-docmap subcommand with careful path handling and comprehensive tests. The exported helpers in review/docmap provide a clean reuse point and README updates document usage clearly.
Recommendation
APPROVE — The implementation is correct, idiomatic, and security-conscious (rejects traversal and symlinks, resolves repo-root). Tests cover the intended behaviors and edge cases. CI passed. Proceed to merge.
Review by gpt
Evaluated against
b22de19aSecurity Review
Summary
The new validate-docmap subcommand and related exports are implemented with solid path traversal protections and safe filesystem checks. No exploitable security issues are introduced, and CI has passed.
Recommendation
APPROVE — The changes adhere to secure defaults for path handling, avoid following symlinks, and validate user-controlled YAML inputs appropriately. Proceed with merging.
Review by security
Evaluated against
24247a85Sonnet Review
Summary
Clean, well-structured implementation of the
validate-docmapsubcommand. The code follows established patterns in the codebase, has thorough test coverage including security-focused path traversal and symlink tests, and the security hardening (Lstat instead of Stat, backslash rejection, filepath.Rel confinement) is well-reasoned and well-documented. CI passes.Findings
cmd/review-bot/validatedocmap.gofilepath.EvalSymlinks(absRoot)on the--repo-rootpath, which will fail with an error if the directory does not yet exist (e.g. a new repo checkout where the root hasn't been created). In practice--repo-rootdefaults to.(cwd) which always exists, so this is low-risk, but an explicitos.Statcheck beforeEvalSymlinkswould give a clearer error message when the root is genuinely missing vs. a symlink resolution failure.cmd/review-bot/validatedocmap.go"ERROR: stale docmap docs: entries (paths do not exist):"— thedocs:YAML key embedded in the prose makes it slightly awkward to read ("stale docmap docs: entries"). Consider"ERROR: stale docs: entries in docmap do not exist on disk:"for clarity.cmd/review-bot/validatedocmap_test.goTestCheckStaleDocs_PathTraversalconstructs the YAML by concatenatingtc.docPathdirectly into the string literal. A path like/etc/passwdhappens to be valid YAML, but paths containing YAML-special characters (e.g.:,{,#) could produce a parse error (exit 2) rather than the expected stale-doc error (exit 1). Adding a YAML quoting step or restricting the test vectors would make the test more robust.review/docmap.goValidateDocPathsays "Defense-in-depth: the VCS API should already scope paths to the repo" — this is accurate for the VCS-fetch path, butValidateDocPathis now also used by the local-filesystem stale-docs check where the VCS API is not involved. The comment is still broadly correct but could be generalized: "Defense-in-depth: callers should also confine the joined path to the repo root via filepath.Rel before filesystem access."TODO.mdRecommendation
APPROVE — Approve. The implementation is correct, idiomatic, and well-tested. The security hardening (Lstat to avoid following symlinks, backslash rejection, filepath.Rel confinement, EvalSymlinks on repo-root) is thoughtful and properly documented. The code follows the established
runValidateURLpattern exactly as intended. The minor findings are cosmetic and do not affect correctness or safety. The YAML-injection risk in the path-traversal test (NIT) is worth fixing in a follow-up but does not block this PR since all test vectors used are valid YAML.Review by sonnet
Evaluated against
24247a85@@ -1,42 +1,47 @@## Dev Loop Status: 2026-05-15 04:35 UTC## Dev Loop Status: 2026-05-15 07:02 UTC[NIT] TODO.md is committed in its pre-PR-submission state ("Next Steps: create PR via gitea-rodin token"). This file appears to be a dev-loop artifact rather than project documentation. Committing intermediate dev-loop state into the main branch adds noise; consider either removing this file from version control or updating it post-merge.
@@ -0,0 +83,4 @@}// --- Check 2: Stale docs ---// Resolve repoRoot to an absolute, symlink-free path before any Rel checks[MINOR] The stale-docs check calls
filepath.EvalSymlinks(absRoot)on the--repo-rootpath, which will fail with an error if the directory does not yet exist (e.g. a new repo checkout where the root hasn't been created). In practice--repo-rootdefaults to.(cwd) which always exists, so this is low-risk, but an explicitos.Statcheck beforeEvalSymlinkswould give a clearer error message when the root is genuinely missing vs. a symlink resolution failure.@@ -0,0 +98,4 @@}// checkStaleDocs validates each path before touching the filesystem; see// its documentation for the path-traversal hardening applied.staleDocs := checkStaleDocs(cfg, resolvedRoot)[MINOR] The error message for stale docs reads
"ERROR: stale docmap docs: entries (paths do not exist):"— thedocs:YAML key embedded in the prose makes it slightly awkward to read ("stale docmap docs: entries"). Consider"ERROR: stale docs: entries in docmap do not exist on disk:"for clarity.@@ -0,0 +188,4 @@// docs/foo.md intentionally missingdocmap := makeDocmapYAML(t, `mappings:[NIT]
TestCheckStaleDocs_PathTraversalconstructs the YAML by concatenatingtc.docPathdirectly into the string literal. A path like/etc/passwdhappens to be valid YAML, but paths containing YAML-special characters (e.g.:,{,#) could produce a parse error (exit 2) rather than the expected stale-doc error (exit 1). Adding a YAML quoting step or restricting the test vectors would make the test more robust.@@ -306,0 +316,4 @@// to avoid any quirk in backend path handling. Backslashes are rejected// explicitly to prevent Windows platform edge cases.func ValidateDocPath(p string) error {if strings.Contains(p, "\\") {[NIT] The comment on
ValidateDocPathsays "Defense-in-depth: the VCS API should already scope paths to the repo" — this is accurate for the VCS-fetch path, butValidateDocPathis now also used by the local-filesystem stale-docs check where the VCS API is not involved. The comment is still broadly correct but could be generalized: "Defense-in-depth: callers should also confine the joined path to the repo root via filepath.Rel before filesystem access."Gpt Review
Summary
Solid addition of the validate-docmap subcommand with thorough tests, secure path handling, and minimal surface changes. CI passed and the implementation follows existing patterns and repository conventions.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goRecommendation
APPROVE — The subcommand is well-implemented: it cleanly parses its own flags, reads stdin deterministically, performs both coverage and stale-doc checks, and reports all failures before exiting with correct codes. Security-conscious handling (ValidateDocPath, symlink refusal via Lstat, and resolving --repo-root symlinks) is appropriate for CI. Tests cover normal, error, traversal, and symlink cases comprehensively. Consider tightening the repo-root escape check to use segment-aware comparisons and normalizing changed paths for cross-platform robustness. Otherwise, this is ready to merge.
Review by gpt
Evaluated against
24247a85@@ -0,0 +70,4 @@// --- Check 1: Coverage ---var uncovered []stringfor _, f := range changedFiles {if !review.FileCoveredByDocMap(cfg, f) {[NIT] When matching coverage, changed file paths from stdin are used as-is. For robustness on Windows inputs, normalize backslashes to forward slashes before matching (e.g., f = strings.ReplaceAll(f, "\", "/")).
@@ -0,0 +151,4 @@// filepath.Rel check confirms the path is still under repoRoot.fullPath := filepath.Clean(filepath.Join(repoRoot, filepath.FromSlash(docPath)))rel, err := filepath.Rel(repoRoot, fullPath)if err != nil || strings.HasPrefix(rel, "..") {[MINOR] Path confinement check uses strings.HasPrefix(rel, "..") which may over/under-match on some platforms. Consider tightening to rel == ".." or strings.HasPrefix(rel, ".."+string(os.PathSeparator)) to be precise with path segment boundaries.
Self-review against
24247a8550Assessment: ✅ Clean
Self-Review: issue-141 — 2026-05-15
Verdict: PASS
No issues found — ready for human review.
Gates:
go test ./...✅ |go vet ./...✅ |scripts/check-deps.sh✅Coherence: Five commits — export helper, add subcommand, update README, fix os.Stdin in test, security hardening (path traversal + symlink). Each has one reason to exist. History tells a clear story.
Fit: Mirrors
validateurl.goexactly — same flag parsing (flag.NewFlagSet/ContinueOnError), sameoutWriter/errWritercapture, same exit code contract (0/1/2).FileCoveredByDocMapis a thin named wrapper over the existing unexportedmappingMatches.ValidateDocPathexport avoids duplication. No new vocabulary, no new patterns.Solution: Minimal — exports one helper, adds one file, wires two lines in main.go. Both checks always run (no early return on first failure). Empty stdin → clean (vacuously correct). Deduplication of docs: entries before stale check. The security hardening (Lstat + EvalSymlinks + filepath.Rel guard) is appropriate for a CI tool consuming PR-controlled YAML.
Completeness: 11 test cases covering all 7 plan-specified cases plus 4 security cases (path traversal, symlink outside repo, symlink inside repo, symlinked --repo-root). README updated with flag table and usage example. main.go dispatch wired.
Multi-Model Review — Config A (GPT-5 investigates, Opus judges)
Verdict: ✅ APPROVE
Pipeline Summary
Stage 1 — Areas of Concern (8 items)
Stage 1 flagged: exported API surface (
ValidateDocPath,FileCoveredByDocMap), direct struct access fromcmd/, EvalSymlinks failure mode, bufio.Scanner limit, symlink documentation gap, cfg nil safety, and path traversal defense completeness.Stage 2 — Deep Dive Findings
F1 (DOWNGRADE → low):
EvalSymlinkshas no test for non-existent--repo-rootF2 (DOWNGRADE → low): Symlinks treated as stale but not explicitly called out in README
All others dismissed:
ValidateDocPath/FileCoveredByDocMap: standard Go intra-repo pattern ✅bufio.Scanner64KB limit: git paths max 4096 bytes, well under limit ✅cfgnil safety:ParseDocMapConfigerror guard prevents nil reachingcheckStaleDocs✅cfg.Mappingsaccess: normal Go struct access across package boundary ✅Stage 3 — Cross-Validation
F1 → DISMISS:
TestRunValidateDocmap_SymlinkRepoRoottests the happy path of EvalSymlinks. Non-existent--repo-rootis an invalid CI precondition; adding a test for OS-level failure is unnecessary boilerplate with no user value.F2 → DISMISS: Symlink-as-stale behavior is intentional, security-motivated, fully documented in the
checkStaleDocsdocstring, and validated by two explicit tests (TestCheckStaleDocs_SymlinkOutside,TestCheckStaleDocs_SymlinkInsideRepo). README lives at a different abstraction level — code docs are the right home for this.Final Assessment
Clean implementation. Highlights:
validateurl.gopattern exactly (flag.NewFlagSet, outWriter/errWriter, exit codes 0/1/2)✅ APPROVE — no changes requested.
Fix Plan against
24247a8550:Findings Summary
All three bots (sonnet, gpt, security) APPROVED at this SHA. No REQUEST_CHANGES. The following findings from approve-reviews are being addressed:
Finding #1 [MINOR] — Path escape check too broad (GPT review,
cmd/review-bot/validatedocmap.go~line 154)Finding:
strings.HasPrefix(rel, "..")may match paths like..fooon some platforms. Should berel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator))to match only actual..path segments.Fix: Tighten escape check in
checkStaleDocs:Finding #2 [MINOR] — EvalSymlinks error message not distinguishing causes (Sonnet review, ~line 86)
Finding:
filepath.EvalSymlinks(absRoot)failure gives "failed to resolve --repo-root" without indicating whether it is non-existent vs permission error. Theerris already printed, so the message is acceptable, but wrapping the error with more context helps.Fix: Keep err in message (already there), add note to distinguish non-existent from permission: use
os.IsNotExist(err)to give a clearer message:Finding #3 [MINOR] — Empty docmap behavior undocumented (Sonnet review, ~line 57)
Finding: An empty docmap with changed files will fail coverage check. This behavioral asymmetry (stale-docs check always runs globally; coverage check is vacuously true on empty stdin) should be documented.
Fix: Add a comment above the coverage check block explaining the behavior.
Finding #4 [NIT] — TODO.md is a dev-loop artifact (Sonnet review)
Finding:
TODO.mdcontains dev-loop status notes ("Next Steps: create PR via gitea-rodin token"). This is not project documentation.Fix: Remove
TODO.mdfrom the repo (or replace with an empty file if the dev-loop expects it to exist). Delete the file and commit.Finding #5 [NIT] — ValidateDocPath comment should generalize (Sonnet review,
review/docmap.go~line 319)Finding: Comment says "Defense-in-depth: the VCS API should already scope paths to the repo" — this is stale since ValidateDocPath is now also used by the local-filesystem stale-docs check.
Fix: Update comment to: "Defense-in-depth: callers must also confine the joined path to the repo root via filepath.Rel before filesystem access."
Finding #6 [NIT] — Normalize stdin paths for Windows (GPT review, ~line 73)
Fix: Add
strings.ReplaceAll(f, "\", "/")when reading paths from stdin (in coverage check loop).What I am NOT changing
stdinValidateDocmapdefer pattern — already safe for sequential tests; adding a comment is sufficient (Sonnet NIT #3) — adding inline commentCompletion Checklist
rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator))in checkStaleDocsos.IsNotExistcheck for EvalSymlinks errorgo test ./...passesgo vet ./...cleanSonnet Review
Summary
CI passes, the implementation is correct, well-tested, and follows established patterns in the codebase. The security hardening (symlink rejection, path-traversal guards, double-validation of repo-root) is thorough and well-documented. Minor issues noted below but none are blocking.
Findings
cmd/review-bot/validatedocmap.go--repo-rootresolution fails, but only coverage failures are accumulated before the repo-root resolution block. Iffilepath.Absorfilepath.EvalSymlinksfails the function returns exit code 2 immediately — meaning a coverage failure that was already accumulated (and stored infailed = true) is silently swallowed, and the user sees exit 2 with only the repo-root error rather than exit 1 with both failures. Consider either: (a) emitting the accumulated coverage failures before returning 2, or (b) resolving repo-root before the coverage check so all errors are consistently collected.cmd/review-bot/validatedocmap_test.gostdinValidateDocmapsays "Tests must not call t.Parallel() while sharing the global os.Stdin" but this constraint is not enforced (e.g. via a test-level mutex or a serial subtest runner). The tests currently do not callt.Parallel(), so this is safe today, but the constraint is easy to violate silently when new tests are added. Consider documenting this with at.Setenv-style workaround or a note that tests in this file must remain serial.cmd/review-bot/validatedocmap.gofmt.Fprintln(errWriter, "")calls (blank line) could use a singlefmt.Fprintfwith\n\nfor readability. Minor style issue only.review/docmap.goFileCoveredByDocMapdoc comment says 'It is used by static validation tooling (e.g. the validate-docmap subcommand)'. Per the documentation pattern, the doc comment should start with the function name:// FileCoveredByDocMap reports whether...— the current phrasing already does start correctly but the second sentence adds implementation context that per the patterns belongs in a separate paragraph or omitted from the primary sentence. Very minor.Recommendation
APPROVE — Approve. The implementation is solid: it follows the established
runValidateURLpattern faithfully, usesflag.NewFlagSetcorrectly, accumulates all failures before exiting, handles edge cases (empty stdin, blank lines, deduplication), and includes thorough security hardening for path traversal and symlink attacks with good test coverage. The MINOR finding about exit code 2 swallowing accumulated coverage failures is a correctness nuance worth fixing but is unlikely to matter in practice (repo-root resolution failures are infrastructure-level, not docmap-content-level). The NIT findings are cosmetic only.Review by sonnet
Evaluated against
83a18354@@ -0,0 +45,4 @@if *docmapFlag == "" {fmt.Fprintln(errWriter, "Error: --docmap is required")fmt.Fprintln(errWriter, "")fmt.Fprintln(errWriter, "usage: review-bot validate-docmap --docmap <path> [--repo-root <dir>]")[NIT] The usage message emitted via multiple
fmt.Fprintln(errWriter, "")calls (blank line) could use a singlefmt.Fprintfwith\n\nfor readability. Minor style issue only.@@ -0,0 +89,4 @@}}// --- Check 2: Stale docs ---[MINOR] The stale-docs check runs even when
--repo-rootresolution fails, but only coverage failures are accumulated before the repo-root resolution block. Iffilepath.Absorfilepath.EvalSymlinksfails the function returns exit code 2 immediately — meaning a coverage failure that was already accumulated (and stored infailed = true) is silently swallowed, and the user sees exit 2 with only the repo-root error rather than exit 1 with both failures. Consider either: (a) emitting the accumulated coverage failures before returning 2, or (b) resolving repo-root before the coverage check so all errors are consistently collected.@@ -0,0 +152,4 @@t.Fatalf("Seek for stdin: %v", err)}origStdin := os.Stdin[MINOR] The comment on
stdinValidateDocmapsays "Tests must not call t.Parallel() while sharing the global os.Stdin" but this constraint is not enforced (e.g. via a test-level mutex or a serial subtest runner). The tests currently do not callt.Parallel(), so this is safe today, but the constraint is easy to violate silently when new tests are added. Consider documenting this with at.Setenv-style workaround or a note that tests in this file must remain serial.@@ -66,6 +66,18 @@ func ParseDocMapConfig(localPath string) (*DocMapConfig, error) {return &cfg, nil}[NIT]
FileCoveredByDocMapdoc comment says 'It is used by static validation tooling (e.g. the validate-docmap subcommand)'. Per the documentation pattern, the doc comment should start with the function name:// FileCoveredByDocMap reports whether...— the current phrasing already does start correctly but the second sentence adds implementation context that per the patterns belongs in a separate paragraph or omitted from the primary sentence. Very minor.Gpt Review
Summary
The new validate-docmap subcommand is well-implemented, idiomatic, and thoroughly tested. It cleanly integrates into main dispatch, hardens path handling, and adds clear README documentation. CI passed.
Findings
review/docmap.gocmd/review-bot/validatedocmap.goREADME.mdRecommendation
APPROVE — The implementation meets the stated goals with clear separation of concerns, safe path handling (ValidateDocPath, symlink checks, repo-root confinement), and comprehensive tests covering coverage gaps, stale docs, symlink behavior, and usage errors. The CLI uses a dedicated FlagSet and testable writers, and the main program cleanly dispatches the new subcommand. Minor improvements could include normalizing path separators inside FileCoveredByDocMap and documenting Scanner limitations and symlink behavior. Given CI success and absence of major issues, this is ready to merge.
Review by gpt
Evaluated against
83a18354@@ -296,6 +296,40 @@ review-bot \--conventions-file CONVENTIONS.md```[NIT] The validate-docmap documentation is clear; consider explicitly noting that docs: symlinks are treated as stale (rejected) for safety, matching the implementation and tests, to prevent surprises.
@@ -0,0 +164,4 @@rel, err := filepath.Rel(repoRoot, fullPath)if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {stale = append(stale, docPath)continue[NIT] readLines uses bufio.Scanner with the default token limit (~64KB). While file paths are typically short, if extremely long lines were ever piped in, Scanner would error. Consider documenting this assumption or using bufio.Reader with ReadString('\n') for robustness.
@@ -66,6 +66,18 @@ func ParseDocMapConfig(localPath string) (*DocMapConfig, error) {return &cfg, nil[MINOR] FileCoveredByDocMap assumes the caller passes forward-slash paths. Since it's exported, consider normalizing backslashes to forward slashes internally (as done in the CLI) or documenting this requirement in the function comment.
Security Review
Summary
The new validate-docmap subcommand is generally well-hardened (rejects traversal and symlinks when checking docs), and CI has passed. However, the docmap file itself is read without any validation against repo-root or symlink checks, allowing a PR-controlled symlink to point outside the workspace and potentially cause denial-of-service or unintended host file reads.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goRecommendation
REQUEST_CHANGES — Harden the validate-docmap subcommand’s handling of the docmap file path. Before calling review.ParseDocMapConfig, validate that the provided --docmap path resolves under --repo-root and is not a symlink (e.g., use filepath.Clean + filepath.Rel checks, then os.Lstat to detect symlinks). Reject inputs that escape repoRoot or are symlinks. Optionally enforce a reasonable maximum file size to prevent oversized docmap YAML from causing resource exhaustion. The rest of the implementation (coverage checks, stale-doc path validation, and symlink handling in checkStaleDocs) appears secure and appropriate. After addressing these issues, the PR would be ready for approval.
Review by security
Evaluated against
83a18354@@ -0,0 +52,4 @@}// Parse docmap YAML.cfg, err := review.ParseDocMapConfig(*docmapFlag)[MAJOR] Docmap file path is read directly with review.ParseDocMapConfig without validating that the path is within --repo-root and not a symlink. A malicious PR can create .review-bot/doc-map.yml as a symlink to a large or special file (e.g., /dev/zero) causing resource exhaustion or reading arbitrary host files. Harden by resolving and confining the docmap path to repoRoot and rejecting symlinks before reading.
[MINOR] There is no upper bound on the size of the docmap YAML when reading it (os.ReadFile). Even without symlinks, a large committed doc-map file could cause excessive memory usage. Consider checking file size via os.Lstat before reading and enforcing a reasonable cap.
Sonnet Review
Summary
This PR adds a well-designed
validate-docmapsubcommand with thorough security hardening (path confinement, symlink rejection, size limits) and comprehensive test coverage. The implementation follows established patterns in the codebase and the code is clean, idiomatic Go.Findings
cmd/review-bot/validatedocmap.goreadLinesfunction is defined invalidatedocmap.gobut is a generic utility. If another subcommand file ever needs to read lines from stdin, there will be a naming conflict since both would be inpackage main. This is fine for now but worth noting — consider a brief comment that this is intentionally scoped to the validate-docmap use case, or move it to a shared helpers file if reuse is anticipated.cmd/review-bot/validatedocmap_test.gostdinValidateDocmapsays "Tests must not call t.Parallel() while sharing the global os.Stdin" — this is a valid concern, but none of the tests callt.Parallel()anyway, so the guard is informational only. The real risk is if a future contributor addst.Parallel()without reading this comment. Consider adding a// WARNING:prefix or an explicit// do not add t.Parallel() to callersnote to make it more prominent.cmd/review-bot/validatedocmap.govalidateDocmapPathfunction documents thatresolvedRootmust be an absolute, symlink-free path, but this is only enforced by convention (the caller must pass the right value). An internal assertion or a brief checkif !filepath.IsAbs(resolvedRoot)at the top of the function would make the contract self-enforcing. Low priority since the caller inrunValidateDocmapdoes this correctly.review/docmap.goValidateDocPathmentions "Finding #3" which appears to be a reference to an internal issue/finding number that won't be meaningful to future readers of this file. This should be reworded to be self-contained (e.g., remove the parenthetical or rephrase as 'to prevent OS-specific path separator normalization issues').cmd/review-bot/validatedocmap_test.goTestCheckStaleDocs_PathTraversal, the inner table test creates a docmap using string concatenation with the docPath directly embedded in YAML (-+tc.docPath+``). For paths like/etc/passwdand../../etc/passwd, this creates valid YAML. However, for paths containing special YAML characters (:,#,{), this could produce invalid YAML that would fail at parse time rather than at the stale-docs check, making the test misleading. The current test cases are safe, but usingfmt.Sprintfwith YAML quoting or writing the file programmatically would be more robust.Recommendation
APPROVE — Approve. The implementation is correct, follows codebase patterns (flag.NewFlagSet, outWriter/errWriter, follows validateurl.go pattern), and the security hardening (Lstat for symlink rejection, path confinement, size limits) is well-thought-out and properly tested. The findings are all minor/nit level and don't block merging. The
FileCoveredByDocMapexport is a clean thin wrapper over the existingmappingMatcheslogic. The test suite is comprehensive with 20+ tests covering happy path, both failure modes, edge cases, and all security properties. CI is passing.Review by sonnet
Evaluated against
af8b29fa@@ -0,0 +65,4 @@return nil}// runValidateDocmap implements the `review-bot validate-docmap` subcommand.[NIT] The
validateDocmapPathfunction documents thatresolvedRootmust be an absolute, symlink-free path, but this is only enforced by convention (the caller must pass the right value). An internal assertion or a brief checkif !filepath.IsAbs(resolvedRoot)at the top of the function would make the contract self-enforcing. Low priority since the caller inrunValidateDocmapdoes this correctly.@@ -0,0 +152,4 @@// --- Check 1: Coverage ---// Note: an empty docmap (no mappings) means every changed file is// uncovered — there are no patterns to match against. This is intentional:[MINOR] The
readLinesfunction is defined invalidatedocmap.gobut is a generic utility. If another subcommand file ever needs to read lines from stdin, there will be a naming conflict since both would be inpackage main. This is fine for now but worth noting — consider a brief comment that this is intentionally scoped to the validate-docmap use case, or move it to a shared helpers file if reuse is anticipated.@@ -0,0 +174,4 @@os.Stdin = fdefer func() { os.Stdin = origStdin }()stdout, stderr = captureOutput(func() {[MINOR] The comment on
stdinValidateDocmapsays "Tests must not call t.Parallel() while sharing the global os.Stdin" — this is a valid concern, but none of the tests callt.Parallel()anyway, so the guard is informational only. The real risk is if a future contributor addst.Parallel()without reading this comment. Consider adding a// WARNING:prefix or an explicit// do not add t.Parallel() to callersnote to make it more prominent.@@ -0,0 +320,4 @@// Baseline: a valid doc that exists.makeDocFile(t, dir, "docs/valid.md")[NIT] In
TestCheckStaleDocs_PathTraversal, the inner table test creates a docmap using string concatenation with the docPath directly embedded in YAML (-+tc.docPath+``). For paths like/etc/passwdand../../etc/passwd, this creates valid YAML. However, for paths containing special YAML characters (:,#,{), this could produce invalid YAML that would fail at parse time rather than at the stale-docs check, making the test misleading. The current test cases are safe, but usingfmt.Sprintfwith YAML quoting or writing the file programmatically would be more robust.@@ -305,1 +313,3 @@func validateDocPath(p string) error {// ValidateDocPath rejects doc paths that could cause path traversal// (absolute paths, any ".." segment, backslashes). Defense-in-depth: callers// must also confine the joined path to the repo root via filepath.Rel before[NIT] The comment on
ValidateDocPathmentions "Finding #3" which appears to be a reference to an internal issue/finding number that won't be meaningful to future readers of this file. This should be reworded to be self-contained (e.g., remove the parenthetical or rephrase as 'to prevent OS-specific path separator normalization issues').Security Review
Summary
Security posture of the new validate-docmap subcommand is solid: paths are confined to the repo root, symlinks are rejected, traversal is blocked, and a size cap is enforced. CI passed and no exploitable issues were found.
Findings
cmd/review-bot/validatedocmap.goRecommendation
APPROVE — Proceed with merge. The added subcommand applies strong path-traversal defenses (absolute/.. rejection, repo-root confinement, symlink rejection) and a generous but bounded size limit, and tests cover edge cases. As a hardening improvement, add a regular-file check (fi.Mode().IsRegular()) in validateDocmapPath to avoid potential hangs when encountering FIFOs or other special files. No other security-relevant issues were identified.
Review by security
Evaluated against
af8b29fa@@ -0,0 +30,4 @@//// resolvedRoot must already be an absolute, symlink-free path (obtained from// filepath.Abs + filepath.EvalSymlinks).func validateDocmapPath(localPath, resolvedRoot string) error {[MINOR] validateDocmapPath does not verify that the docmap input is a regular file. While symlinks are rejected and the path is confined to the repo root, allowing non-regular files (e.g., FIFOs, sockets, device nodes) could lead to denial-of-service (e.g., blocking reads) if such a special file exists in the workspace. Consider checking fi.Mode().IsRegular() and rejecting otherwise.
Sonnet Review
Summary
This PR adds a well-designed
validate-docmapsubcommand with thorough security hardening (symlink rejection, path confinement, size limits) and comprehensive test coverage. The implementation follows established patterns in the codebase and the Go standard library idioms. CI is passing.Findings
cmd/review-bot/validatedocmap.gomaxDocmapBytesconstant is typed asint64solely to compare againstfi.Size()(which returnsint64). Consider using an untyped constant or keeping it typed but adding a brief comment explaining whyint64was chosen, since the type is not otherwise obvious from context.cmd/review-bot/validatedocmap.gofs.Usage(). This is inconsistent with how flag.FlagSet normally surfaces help and could get out of sync with the registered flags. Minor since this is a short flag set, but worth noting.cmd/review-bot/validatedocmap_test.gostdinValidateDocmapsays 'Tests must not call t.Parallel() while sharing the global os.Stdin' but there is not.Skipor assertion to enforce this. Since os.Stdin mutation is inherently process-global, this is a latent footgun if someone later adds t.Parallel() to these tests. Consider adding a brief comment directing future maintainers to useTestMainto serialize tests instead, or document that this constraint is enforced by convention.review/docmap.goValidateDocPathsays 'Defense-in-depth: callers must also confine the joined path to the repo root via filepath.Rel before any filesystem access.' This is good documentation but the function's contract doesn't verify the caller satisfies that postcondition. This is fine (it's defense-in-depth by definition), just making sure the comment is understood as advisory rather than enforced.Recommendation
APPROVE — Approve. The implementation is correct, idiomatic, and well-tested. The security hardening (Lstat instead of Stat everywhere, path confinement via filepath.Rel, size cap before read, symlink rejection at both the docmap-file level and the docs-entry level) is thorough and well-documented. The code follows established patterns in the codebase (flag.NewFlagSet, outWriter/errWriter injection, table-driven tests with t.Run). The four findings are all nits with no correctness impact. CI is passing.
Review by sonnet
Evaluated against
af8b29fa@@ -0,0 +10,4 @@"strings""gitea.weiker.me/rodin/review-bot/review")[NIT] The
maxDocmapBytesconstant is typed asint64solely to compare againstfi.Size()(which returnsint64). Consider using an untyped constant or keeping it typed but adding a brief comment explaining whyint64was chosen, since the type is not otherwise obvious from context.@@ -0,0 +90,4 @@docmapFlag := fs.String("docmap", "", "Path to doc-map YAML file (required)")repoRootFlag := fs.String("repo-root", ".", "Repo root for resolving docs: paths (default: cwd)")if err := fs.Parse(args); err != nil {[NIT] The usage message prints flag help manually (fmt.Fprintln chains) rather than calling
fs.Usage(). This is inconsistent with how flag.FlagSet normally surfaces help and could get out of sync with the registered flags. Minor since this is a short flag set, but worth noting.@@ -0,0 +165,4 @@defer f.Close()if _, err := f.WriteString(stdinContent); err != nil {t.Fatalf("WriteString for stdin: %v", err)}[NIT] The comment on
stdinValidateDocmapsays 'Tests must not call t.Parallel() while sharing the global os.Stdin' but there is not.Skipor assertion to enforce this. Since os.Stdin mutation is inherently process-global, this is a latent footgun if someone later adds t.Parallel() to these tests. Consider adding a brief comment directing future maintainers to useTestMainto serialize tests instead, or document that this constraint is enforced by convention.@@ -306,0 +318,4 @@func ValidateDocPath(p string) error {if strings.Contains(p, "\\") {return fmt.Errorf("backslashes not allowed in doc paths")}[NIT] The comment on
ValidateDocPathsays 'Defense-in-depth: callers must also confine the joined path to the repo root via filepath.Rel before any filesystem access.' This is good documentation but the function's contract doesn't verify the caller satisfies that postcondition. This is fine (it's defense-in-depth by definition), just making sure the comment is understood as advisory rather than enforced.Gpt Review
Summary
Solid addition of the validate-docmap subcommand, with strong path hardening, clear error semantics, and comprehensive tests. The export of FileCoveredByDocMap and ValidateDocPath is clean and consistent with existing patterns.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goreview/docmap.goRecommendation
APPROVE — The implementation is correct, secure, and well-tested. The new subcommand follows the existing validate-url pattern, enforces robust confinement and symlink rules, and properly exports minimal helpers from the review package. Consider adding a regular-file check in validateDocmapPath and aligning slog output with the subcommand’s errWriter for consistency, but these are minor. Approve the PR.
Review by gpt
Evaluated against
af8b29fa@@ -0,0 +57,4 @@return fmt.Errorf("path must be within --repo-root")}// Enforce size cap before reading to prevent memory exhaustion.[MINOR] validateDocmapPath does not explicitly ensure the target is a regular file (fi.Mode().IsRegular()). A directory will be rejected later by ParseDocMapConfig via os.ReadFile, but an early explicit check would yield a clearer error and avoid attempting to read directories.
@@ -0,0 +221,4 @@if err := review.ValidateDocPath(docPath); err != nil {stale = append(stale, docPath)continue}[NIT] readLines uses bufio.Scanner with the default token size; while fine for filenames, consider documenting or bumping the buffer if you ever accept longer lines via stdin in the future.
Security Review
Summary
The new validate-docmap subcommand and related changes are thoughtfully hardened against path traversal and symlink abuses, and CI has passed. I found no exploitable vulnerabilities; a couple of defense-in-depth improvements are suggested.
Findings
cmd/review-bot/validatedocmap.gocmd/review-bot/validatedocmap.goRecommendation
APPROVE — Overall, the changes are secure and well-tested, with strong path validation and symlink handling for both the docmap file and referenced docs. To further harden the validate-docmap subcommand, consider (1) verifying the docmap path is within --repo-root before calling os.Lstat, and (2) ensuring the docmap is a regular file (fi.Mode().IsRegular()) in validateDocmapPath. These adjustments reduce even theoretical file metadata probing and guard against non-regular files in atypical environments. With CI passing and no exploitable issues identified, this PR is good to merge.
Review by security
Evaluated against
af8b29fa@@ -0,0 +38,4 @@}// Lstat: do NOT follow symlinks. We want to inspect the entry itself.fi, err := os.Lstat(absPath)[MINOR] validateDocmapPath performs os.Lstat on the provided --docmap path before verifying it is confined within --repo-root. While content is not read and the flag is typically controlled by CI, reordering the confinement check before any filesystem access would avoid even a metadata probe on arbitrary host paths if the path were ever user-controlled.
@@ -0,0 +45,4 @@// Reject symlinks outright — a symlink can point to /dev/zero or arbitrary// host paths, bypassing both the confinement check and the size check.if fi.Mode()&os.ModeSymlink != 0 {[MINOR] validateDocmapPath rejects symlinks and enforces a size cap but does not verify that the target is a regular file (fi.Mode().IsRegular()). Adding this check would defensively prevent interacting with special files (FIFOs/devices) in non-standard environments and align with the comment that it should be a "regular file."
Self-review against
af8b29fa5dAssessment: ⚠️ Needs attention
Note: PR was already merged before self-review completed (merged at 2026-05-15T07:39:22Z). Finding is a post-merge observation for tracking.
Self-Review: issue-141 — 2026-05-15
Verdict: NEEDS_WORK
✅ Passing
validateurl.gopattern exactly:flag.NewFlagSet,outWriter/errWriter, same exit-code conventionsFileCoveredByDocMapis a thin, correctly named wrapper over existingmappingMatcheslogicValidateDocPath— consistentgo test ./...✅go vet ./...✅go build ./...✅checkStaleDocsdefense-in-depth correct: EvalSymlinks-resolved repoRoot + ValidateDocPath guard + filepath.Rel guard + Lstat symlink check❌ Finding
[security]
validateDocmapPath: directory-symlink bypassvalidateDocmapPathcallsos.Lstat(absPath)whereabsPath = filepath.Abs(localPath).Lstatonly skips following the final path component — it follows intermediate directory symlinks. If a PR commits.review-bot/as a directory symlink to an outside directory, then--docmap .review-bot/doc-map.ymlwill:filepath.Abs→ textually<repo>/.review-bot/doc-map.ymlos.Lstat→ follows the dir symlink, finds the outside file as a regular file (no symlink bit on final component)filepath.Rel(resolvedRoot, absPath)→.review-bot/doc-map.yml— passes confinementVerified with a Go test: the bypass is real and reproducible.
Fix: after
filepath.Abs, callfilepath.EvalSymlinksand use the resolved path for both the confinement check andLstat:Also add
TestValidateDocmapPath_DirSymlinkBypass.Severity: MINOR in practice — attack requires committing
.review-bot/as a directory symlink (aggressive PR modification), and the--docmapvalue is CI-controlled. But it contradicts the stated security invariant #1 in the docstring.Recommend: open a follow-up issue to fix
validateDocmapPath+ add the missing test.