feat(#141): validate-docmap subcommand — CI hard-fail on missing docmap coverage #142
@@ -68,8 +68,15 @@ func runValidateDocmap(args []string) int {
|
||||
failed := false
|
||||
|
|
||||
|
||||
// --- Check 1: Coverage ---
|
||||
// Note: an empty docmap (no mappings) means every changed file is
|
||||
// uncovered — there are no patterns to match against. This is intentional:
|
||||
// if you declare a doc-map, every changed file must be accounted for.
|
||||
|
gpt-review-bot
commented
[NIT] When matching coverage, changed file paths from stdin are used as-is. For robustness on Windows inputs, normalize backslashes to forward slashes before matching (e.g., f = strings.ReplaceAll(f, "\", "/")). **[NIT]** When matching coverage, changed file paths from stdin are used as-is. For robustness on Windows inputs, normalize backslashes to forward slashes before matching (e.g., f = strings.ReplaceAll(f, "\\", "/")).
|
||||
// On empty stdin the check is vacuously true (no files to cover).
|
||||
|
sonnet-review-bot
commented
[MINOR] readLines reads from os.Stdin directly rather than accepting an io.Reader parameter. This forces tests to swap os.Stdin (which is fragile and not goroutine-safe), whereas accepting io.Reader would let callers pass any reader. The existing test helper works around this by redirecting os.Stdin, but the function signature itself is less testable than it could be. Compare with validateurl.go pattern — if that also uses os.Stdin directly this is consistent, but it's still worth noting. **[MINOR]** readLines reads from os.Stdin directly rather than accepting an io.Reader parameter. This forces tests to swap os.Stdin (which is fragile and not goroutine-safe), whereas accepting io.Reader would let callers pass any reader. The existing test helper works around this by redirecting os.Stdin, but the function signature itself is less testable than it could be. Compare with validateurl.go pattern — if that also uses os.Stdin directly this is consistent, but it's still worth noting.
|
||||
var uncovered []string
|
||||
for _, f := range changedFiles {
|
||||
// Normalize Windows-style backslashes to forward slashes so that
|
||||
// changed-file paths from git on Windows match doc-map globs.
|
||||
f = strings.ReplaceAll(f, "\\", "/")
|
||||
if !review.FileCoveredByDocMap(cfg, f) {
|
||||
|
sonnet-review-bot
commented
[MINOR] The stale-docs check ( **[MINOR]** The stale-docs check (`checkStaleDocs`) runs even when no changed files were provided on stdin (empty stdin case). This means a docmap with stale `docs:` entries will always fail, even on runs where no files changed — which is arguably the correct behavior, but worth a comment clarifying the intent. Currently the empty-stdin test passes because the test fixture has the doc file present, so it's fine, but the behavior asymmetry (coverage check is vacuously true on empty input, stale-docs check is not) is undocumented.
|
||||
uncovered = append(uncovered, f)
|
||||
}
|
||||
@@ -93,7 +100,11 @@ func runValidateDocmap(args []string) int {
|
||||
}
|
||||
resolvedRoot, err := filepath.EvalSymlinks(absRoot)
|
||||
|
sonnet-review-bot
commented
[MINOR] The error message for stale docs reads **[MINOR]** The error message for stale docs reads `"ERROR: stale docmap docs: entries (paths do not exist):"` — the `docs:` YAML key embedded in the prose makes it slightly awkward to read ("stale docmap docs: entries"). Consider `"ERROR: stale docs: entries in docmap do not exist on disk:"` for clarity.
|
||||
if err != nil {
|
||||
if os.IsNotExist(err) {
|
||||
fmt.Fprintf(errWriter, "Error: --repo-root %q does not exist\n", *repoRootFlag)
|
||||
} else {
|
||||
fmt.Fprintf(errWriter, "Error: failed to resolve --repo-root %q: %v\n", *repoRootFlag, err)
|
||||
|
gpt-review-bot
commented
[NIT] Deduplication of docs: entries uses the raw docPath string; consider normalizing (e.g., filepath.Clean and using forward slashes) before inserting into the seen set to avoid duplicate reports for equivalent paths like "docs/./a.md" vs "docs/a.md". **[NIT]** Deduplication of docs: entries uses the raw docPath string; consider normalizing (e.g., filepath.Clean and using forward slashes) before inserting into the seen set to avoid duplicate reports for equivalent paths like "docs/./a.md" vs "docs/a.md".
|
||||
}
|
||||
return 2
|
||||
}
|
||||
// checkStaleDocs validates each path before touching the filesystem; see
|
||||
@@ -151,7 +162,7 @@ func checkStaleDocs(cfg *review.DocMapConfig, repoRoot string) []string {
|
||||
// filepath.Rel check confirms the path is still under repoRoot.
|
||||
fullPath := filepath.Clean(filepath.Join(repoRoot, filepath.FromSlash(docPath)))
|
||||
rel, err := filepath.Rel(repoRoot, fullPath)
|
||||
if err != nil || strings.HasPrefix(rel, "..") {
|
||||
if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {
|
||||
stale = append(stale, docPath)
|
||||
continue
|
||||
|
gpt-review-bot
commented
[NIT] readLines uses bufio.Scanner with the default token limit (~64KB). While file paths are typically short, if extremely long lines were ever piped in, Scanner would error. Consider documenting this assumption or using bufio.Reader with ReadString('\n') for robustness. **[NIT]** readLines uses bufio.Scanner with the default token limit (~64KB). While file paths are typically short, if extremely long lines were ever piped in, Scanner would error. Consider documenting this assumption or using bufio.Reader with ReadString('\n') for robustness.
|
||||
}
|
||||
|
||||
@@ -131,6 +131,12 @@ mappings:
|
||||
}
|
||||
|
||||
// stdinValidateDocmap runs runValidateDocmap with a synthetic stdin.
|
||||
//
|
||||
// Implementation note: we write stdinContent to a temp file and point
|
||||
// os.Stdin at it. The defer f.Close() fires after stdinValidateDocmap
|
||||
// returns, which is after runValidateDocmap has finished reading stdin
|
||||
// synchronously — so the file is not closed while still in use.
|
||||
// Tests must not call t.Parallel() while sharing the global os.Stdin.
|
||||
func stdinValidateDocmap(t *testing.T, stdinContent string, args []string) (code int, stdout, stderr string) {
|
||||
t.Helper()
|
||||
// Write stdin content to a temp file and redirect os.Stdin.
|
||||
|
||||
@@ -310,11 +310,11 @@ func readFileBytes(path string) ([]byte, error) {
|
||||
return os.ReadFile(path)
|
||||
|
sonnet-review-bot
commented
[NIT] The comment on ValidateDocPath says 'Backslashes are rejected explicitly to prevent Windows platform edge cases.' — the sentence could be clearer that the tool itself may run on Windows and that backslashes in YAML doc paths could be misinterpreted by filepath.Join on that platform. Minor documentation clarity. **[NIT]** The comment on ValidateDocPath says 'Backslashes are rejected explicitly to prevent Windows platform edge cases.' — the sentence could be clearer that the tool itself may run on Windows and that backslashes in YAML doc paths could be misinterpreted by filepath.Join on that platform. Minor documentation clarity.
|
||||
}
|
||||
|
||||
// ValidateDocPath rejects doc paths that could cause path traversal via the
|
||||
// VCS API (absolute paths, any ".." segment, backslashes). Defense-in-depth:
|
||||
// the VCS API should already scope paths to the repo, but we validate locally
|
||||
// to avoid any quirk in backend path handling. Backslashes are rejected
|
||||
// explicitly to prevent Windows platform edge cases.
|
||||
// ValidateDocPath rejects doc paths that could cause path traversal
|
||||
// (absolute paths, any ".." segment, backslashes). Defense-in-depth: callers
|
||||
// must also confine the joined path to the repo root via filepath.Rel before
|
||||
|
sonnet-review-bot
commented
[NIT] The comment on **[NIT]** The comment on `ValidateDocPath` mentions "Finding #3" which appears to be a reference to an internal issue/finding number that won't be meaningful to future readers of this file. This should be reworded to be self-contained (e.g., remove the parenthetical or rephrase as 'to prevent OS-specific path separator normalization issues').
|
||||
// any filesystem access. Backslashes are rejected explicitly to prevent
|
||||
// Windows platform edge cases.
|
||||
func ValidateDocPath(p string) error {
|
||||
if strings.Contains(p, "\\") {
|
||||
|
sonnet-review-bot
commented
[NIT] The comment on **[NIT]** The comment on `ValidateDocPath` says "Defense-in-depth: the VCS API should already scope paths to the repo" — this is accurate for the VCS-fetch path, but `ValidateDocPath` is now also used by the local-filesystem stale-docs check where the VCS API is not involved. The comment is still broadly correct but could be generalized: "Defense-in-depth: callers should also confine the joined path to the repo root via filepath.Rel before filesystem access."
|
||||
return fmt.Errorf("backslashes not allowed in doc paths")
|
||||
|
security-review-bot marked this conversation as resolved
Outdated
[NIT] ValidateDocPath splits on '/' only. While later checks use filepath.FromSlash and Rel, consider normalizing or explicitly rejecting backslashes in paths to avoid platform-specific edge cases on Windows. **[NIT]** ValidateDocPath splits on '/' only. While later checks use filepath.FromSlash and Rel, consider normalizing or explicitly rejecting backslashes in paths to avoid platform-specific edge cases on Windows.
|
||||
|
||||
[NIT] The
validateDocmapPathfunction documents thatresolvedRootmust be an absolute, symlink-free path, but this is only enforced by convention (the caller must pass the right value). An internal assertion or a brief checkif !filepath.IsAbs(resolvedRoot)at the top of the function would make the contract self-enforcing. Low priority since the caller inrunValidateDocmapdoes this correctly.