feat(#141): validate-docmap subcommand — CI hard-fail on missing docmap coverage #142

Merged
rodin merged 11 commits from issue-141 into main 2026-05-15 07:39:22 +00:00
3 changed files with 24 additions and 7 deletions
Showing only changes of commit 5c6758e990 - Show all commits
+12 -1
View File
10
@@ -68,8 +68,15 @@ func runValidateDocmap(args []string) int {
failed := false
Review

[NIT] The validateDocmapPath function documents that resolvedRoot must be an absolute, symlink-free path, but this is only enforced by convention (the caller must pass the right value). An internal assertion or a brief check if !filepath.IsAbs(resolvedRoot) at the top of the function would make the contract self-enforcing. Low priority since the caller in runValidateDocmap does this correctly.

**[NIT]** The `validateDocmapPath` function documents that `resolvedRoot` must be an absolute, symlink-free path, but this is only enforced by convention (the caller must pass the right value). An internal assertion or a brief check `if !filepath.IsAbs(resolvedRoot)` at the top of the function would make the contract self-enforcing. Low priority since the caller in `runValidateDocmap` does this correctly.
// --- Check 1: Coverage ---
// Note: an empty docmap (no mappings) means every changed file is
// uncovered — there are no patterns to match against. This is intentional:
// if you declare a doc-map, every changed file must be accounted for.
Outdated
Review

[NIT] When matching coverage, changed file paths from stdin are used as-is. For robustness on Windows inputs, normalize backslashes to forward slashes before matching (e.g., f = strings.ReplaceAll(f, "\", "/")).

**[NIT]** When matching coverage, changed file paths from stdin are used as-is. For robustness on Windows inputs, normalize backslashes to forward slashes before matching (e.g., f = strings.ReplaceAll(f, "\\", "/")).
// On empty stdin the check is vacuously true (no files to cover).
Outdated
Review

[MINOR] readLines reads from os.Stdin directly rather than accepting an io.Reader parameter. This forces tests to swap os.Stdin (which is fragile and not goroutine-safe), whereas accepting io.Reader would let callers pass any reader. The existing test helper works around this by redirecting os.Stdin, but the function signature itself is less testable than it could be. Compare with validateurl.go pattern — if that also uses os.Stdin directly this is consistent, but it's still worth noting.

**[MINOR]** readLines reads from os.Stdin directly rather than accepting an io.Reader parameter. This forces tests to swap os.Stdin (which is fragile and not goroutine-safe), whereas accepting io.Reader would let callers pass any reader. The existing test helper works around this by redirecting os.Stdin, but the function signature itself is less testable than it could be. Compare with validateurl.go pattern — if that also uses os.Stdin directly this is consistent, but it's still worth noting.
var uncovered []string
for _, f := range changedFiles {
// Normalize Windows-style backslashes to forward slashes so that
// changed-file paths from git on Windows match doc-map globs.
f = strings.ReplaceAll(f, "\\", "/")
if !review.FileCoveredByDocMap(cfg, f) {
Review

[MINOR] The stale-docs check (checkStaleDocs) runs even when no changed files were provided on stdin (empty stdin case). This means a docmap with stale docs: entries will always fail, even on runs where no files changed — which is arguably the correct behavior, but worth a comment clarifying the intent. Currently the empty-stdin test passes because the test fixture has the doc file present, so it's fine, but the behavior asymmetry (coverage check is vacuously true on empty input, stale-docs check is not) is undocumented.

**[MINOR]** The stale-docs check (`checkStaleDocs`) runs even when no changed files were provided on stdin (empty stdin case). This means a docmap with stale `docs:` entries will always fail, even on runs where no files changed — which is arguably the correct behavior, but worth a comment clarifying the intent. Currently the empty-stdin test passes because the test fixture has the doc file present, so it's fine, but the behavior asymmetry (coverage check is vacuously true on empty input, stale-docs check is not) is undocumented.
uncovered = append(uncovered, f)
}
7
@@ -93,7 +100,11 @@ func runValidateDocmap(args []string) int {
}
resolvedRoot, err := filepath.EvalSymlinks(absRoot)
Outdated
Review

[MINOR] The error message for stale docs reads "ERROR: stale docmap docs: entries (paths do not exist):" — the docs: YAML key embedded in the prose makes it slightly awkward to read ("stale docmap docs: entries"). Consider "ERROR: stale docs: entries in docmap do not exist on disk:" for clarity.

**[MINOR]** The error message for stale docs reads `"ERROR: stale docmap docs: entries (paths do not exist):"` — the `docs:` YAML key embedded in the prose makes it slightly awkward to read ("stale docmap docs: entries"). Consider `"ERROR: stale docs: entries in docmap do not exist on disk:"` for clarity.
if err != nil {
if os.IsNotExist(err) {
fmt.Fprintf(errWriter, "Error: --repo-root %q does not exist\n", *repoRootFlag)
} else {
fmt.Fprintf(errWriter, "Error: failed to resolve --repo-root %q: %v\n", *repoRootFlag, err)
Outdated
Review

[NIT] Deduplication of docs: entries uses the raw docPath string; consider normalizing (e.g., filepath.Clean and using forward slashes) before inserting into the seen set to avoid duplicate reports for equivalent paths like "docs/./a.md" vs "docs/a.md".

**[NIT]** Deduplication of docs: entries uses the raw docPath string; consider normalizing (e.g., filepath.Clean and using forward slashes) before inserting into the seen set to avoid duplicate reports for equivalent paths like "docs/./a.md" vs "docs/a.md".
}
return 2
}
// checkStaleDocs validates each path before touching the filesystem; see
7
@@ -151,7 +162,7 @@ func checkStaleDocs(cfg *review.DocMapConfig, repoRoot string) []string {
// filepath.Rel check confirms the path is still under repoRoot.
fullPath := filepath.Clean(filepath.Join(repoRoot, filepath.FromSlash(docPath)))
rel, err := filepath.Rel(repoRoot, fullPath)
if err != nil || strings.HasPrefix(rel, "..") {
if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {
stale = append(stale, docPath)
continue
Outdated
Review

[NIT] readLines uses bufio.Scanner with the default token limit (~64KB). While file paths are typically short, if extremely long lines were ever piped in, Scanner would error. Consider documenting this assumption or using bufio.Reader with ReadString('\n') for robustness.

**[NIT]** readLines uses bufio.Scanner with the default token limit (~64KB). While file paths are typically short, if extremely long lines were ever piped in, Scanner would error. Consider documenting this assumption or using bufio.Reader with ReadString('\n') for robustness.
}
+6
View File
2
@@ -131,6 +131,12 @@ mappings:
}
// stdinValidateDocmap runs runValidateDocmap with a synthetic stdin.
//
// Implementation note: we write stdinContent to a temp file and point
// os.Stdin at it. The defer f.Close() fires after stdinValidateDocmap
// returns, which is after runValidateDocmap has finished reading stdin
// synchronously — so the file is not closed while still in use.
// Tests must not call t.Parallel() while sharing the global os.Stdin.
func stdinValidateDocmap(t *testing.T, stdinContent string, args []string) (code int, stdout, stderr string) {
t.Helper()
// Write stdin content to a temp file and redirect os.Stdin.
7
+5 -5
View File
2
@@ -310,11 +310,11 @@ func readFileBytes(path string) ([]byte, error) {
return os.ReadFile(path)
Review

[NIT] The comment on ValidateDocPath says 'Backslashes are rejected explicitly to prevent Windows platform edge cases.' — the sentence could be clearer that the tool itself may run on Windows and that backslashes in YAML doc paths could be misinterpreted by filepath.Join on that platform. Minor documentation clarity.

**[NIT]** The comment on ValidateDocPath says 'Backslashes are rejected explicitly to prevent Windows platform edge cases.' — the sentence could be clearer that the tool itself may run on Windows and that backslashes in YAML doc paths could be misinterpreted by filepath.Join on that platform. Minor documentation clarity.
}
// ValidateDocPath rejects doc paths that could cause path traversal via the
// VCS API (absolute paths, any ".." segment, backslashes). Defense-in-depth:
// the VCS API should already scope paths to the repo, but we validate locally
// to avoid any quirk in backend path handling. Backslashes are rejected
// explicitly to prevent Windows platform edge cases.
// ValidateDocPath rejects doc paths that could cause path traversal
// (absolute paths, any ".." segment, backslashes). Defense-in-depth: callers
// must also confine the joined path to the repo root via filepath.Rel before
Review

[NIT] The comment on ValidateDocPath mentions "Finding #3" which appears to be a reference to an internal issue/finding number that won't be meaningful to future readers of this file. This should be reworded to be self-contained (e.g., remove the parenthetical or rephrase as 'to prevent OS-specific path separator normalization issues').

**[NIT]** The comment on `ValidateDocPath` mentions "Finding #3" which appears to be a reference to an internal issue/finding number that won't be meaningful to future readers of this file. This should be reworded to be self-contained (e.g., remove the parenthetical or rephrase as 'to prevent OS-specific path separator normalization issues').
// any filesystem access. Backslashes are rejected explicitly to prevent
// Windows platform edge cases.
func ValidateDocPath(p string) error {
if strings.Contains(p, "\\") {
Review

[NIT] The comment on ValidateDocPath says "Defense-in-depth: the VCS API should already scope paths to the repo" — this is accurate for the VCS-fetch path, but ValidateDocPath is now also used by the local-filesystem stale-docs check where the VCS API is not involved. The comment is still broadly correct but could be generalized: "Defense-in-depth: callers should also confine the joined path to the repo root via filepath.Rel before filesystem access."

**[NIT]** The comment on `ValidateDocPath` says "Defense-in-depth: the VCS API should already scope paths to the repo" — this is accurate for the VCS-fetch path, but `ValidateDocPath` is now also used by the local-filesystem stale-docs check where the VCS API is not involved. The comment is still broadly correct but could be generalized: "Defense-in-depth: callers should also confine the joined path to the repo root via filepath.Rel before filesystem access."
return fmt.Errorf("backslashes not allowed in doc paths")
security-review-bot marked this conversation as resolved Outdated
Outdated
Review

[NIT] ValidateDocPath splits on '/' only. While later checks use filepath.FromSlash and Rel, consider normalizing or explicitly rejecting backslashes in paths to avoid platform-specific edge cases on Windows.

**[NIT]** ValidateDocPath splits on '/' only. While later checks use filepath.FromSlash and Rel, consider normalizing or explicitly rejecting backslashes in paths to avoid platform-specific edge cases on Windows.
1