diff --git a/DEV_LOOP_STATUS.md b/DEV_LOOP_STATUS.md new file mode 100644 index 0000000..fa7fcd0 --- /dev/null +++ b/DEV_LOOP_STATUS.md @@ -0,0 +1,62 @@ +# Dev Loop Status — 2026-05-15 09:09 UTC + +## Summary + +- **Review-bot status:** ✅ MAIN BRANCH CURRENT +- **Issue-130:** Already merged via PR #131 — branch cleanup needed +- **Active development tracks:** + - issue-137: doc-map features (ready for next cycle) + - issue-141: validate-docmap subcommand (in progress) + - issue-143: secure doc-map loading (in progress) + - issue-150: TBD + +--- + +## Current State + +### Main Branch +- **HEAD:** 64a4fb6 (merged remote health status) +- **Status:** All tests passing, clean tree +- **Recent:** PR #131 merged GitHub API methods & VCS routing for issue-130 + +### Pending Worktrees + +| Issue | Branch | Status | Next Step | +|-------|--------|--------|-----------| +| #137 | issue-137 | ready | Merge after #130 cleanup done | +| #141 | issue-141 | in-progress | Review proposed approach with Aaron | +| #143 | issue-143 | in-progress | Review Option A vs B with Aaron | +| #150 | issue-150 | unknown | Check branch status | + +--- + +## Issue-130 Resolution + +**Finding:** `origin/review-bot-issue-130-work` is redundant — PR #131 already merged the GitHub work. + +**Work on that branch:** +- 5e20dba: pass VCS_TYPE env var to action.yml +- 9a1410c: fix README CLI example +- c5261b9: rename NewPosition → NewLine +- f0ba8fe: move IsBlockedIP to internal/netutil + +**Check:** Are these fixes already in main, or do they need to be pulled from the branch? + +### Recommendation +1. Cherry-pick any fixes from `origin/review-bot-issue-130-work` that aren't in main +2. Delete the branch once cherry-picks confirmed +3. Clean up local worktree (if exists): `git worktree remove ../worktrees/issue-130` + +--- + +## Next Dev-Loop Cycle + +When dev-loop runs next (likely from cron): +1. Check if issue-130 branch cleanup completed +2. Run tests on active branches (137, 141, 143, 150) +3. Report coverage/health updates +4. Flag any blockers for human review + +--- + +_Generated by dev-loop at 2026-05-15 09:09 UTC_ diff --git a/PLAN-141.md b/PLAN-141.md new file mode 100644 index 0000000..1a2cf7c --- /dev/null +++ b/PLAN-141.md @@ -0,0 +1,154 @@ +# Plan: validate-docmap subcommand (Issue #141) + +## Problem + +CI has no way to verify that `doc-map.yml` is kept up to date. When a developer adds a new +module/directory, they may forget to add a `paths:` entry. When a design doc is deleted or +moved, the `docs:` entry becomes stale. Both failures are silent — the AI reviewer just gets +no docs injected, and nobody notices. + +This is a **pure static check**: no AI, no VCS API. Just YAML parsing + glob matching + `os.Stat`. + +## Constraints + +- No external API calls or AI involvement +- Must compose with `git diff --name-only` output via stdin (standard CI pattern) +- Reuse existing `ParseDocMapConfig` from `review/docmap.go` +- Glob matching logic must also reuse (or expose) existing `globMatch`/`mappingMatches` +- Follow the `validate-url` subcommand pattern exactly +- Both checks must always run — report all failures, not just the first +- `outWriter`/`errWriter` vars must be respected for testability + +## Proposed Approach + +### 1. Export a glob-coverage helper from `review/docmap.go` + +Add one new exported function: + +```go +// FileCoveredByDocMap returns true if any paths: glob in cfg matches the given file. +func FileCoveredByDocMap(cfg *DocMapConfig, file string) bool +``` + +This is a thin wrapper over the existing unexported `mappingMatches`. It lets the `cmd/` layer +call into the review package without duplicating glob logic. + +**Alternative considered:** Duplicate the loop in `cmd/`. Rejected — duplication of non-trivial +glob matching is a maintenance hazard. Exporting one function is cleaner. + +### 2. New file: `cmd/review-bot/validatedocmap.go` + +Implements `runValidateDocmap(args []string) int` following the `validateurl.go` pattern. + +``` +Flag parsing (use flag.NewFlagSet — NOT global flag, to avoid polluting main.go's flag state): + --docmap (required) path to YAML file + --repo-root (optional, default ".") base for resolving docs: paths + +Step 1: Parse flags. Validate --docmap is set. Exit 2 on error. +Step 2: ParseDocMapConfig(docmapPath) → exit 2 on parse error +Step 3: Read stdin lines → changedFiles []string +Step 4: Coverage check — for each file in changedFiles: + if !FileCoveredByDocMap(cfg, file) → record as uncovered +Step 5: Stale-docs check — for each unique docs: entry across all mappings: + if os.Stat(filepath.Join(repoRoot, docPath)) fails → record as stale +Step 6: If any uncovered or stale entries → print ERROR sections → return 1 + Else → print "OK" → return 0 +``` + +Exit codes (parallel to `validate-url`): +- `0` — clean +- `1` — coverage or stale-doc failures +- `2` — usage error, missing flag, or YAML parse error + +### 3. Wire into `main.go` + +Add `case "validate-docmap":` to the existing `os.Args[1]` switch. + +### 4. Tests: `cmd/review-bot/validatedocmap_test.go` + +Test table covering: +| Case | stdin | docmap | repo-root | want exit | +|------|-------|--------|-----------|-----------| +| clean | covered file | valid docmap | docs exist | 0 | +| uncovered file | uncovered file | valid docmap | docs exist | 1 | +| stale doc | covered file | stale docs: | missing path | 1 | +| both failures | uncovered + stale | | | 1 | +| empty stdin | (empty) | valid docmap | docs exist | 0 | +| missing --docmap flag | | | | 2 | +| bad YAML | | invalid YAML | | 2 | + +Use `os.MkdirTemp` + `os.WriteFile` to create real temp directories for the stale-docs check. + +### 5. README update + +Add a subsection under the `validate-url` section showing the `validate-docmap` invocation. + +## State/Data Model + +No persistent state. All inputs are flags + stdin + local filesystem. + +## Error Cases + +| Scenario | Behavior | +|----------|----------| +| `--docmap` flag missing | Print usage, exit 2 | +| YAML parse fails | Print error message, exit 2 | +| stdin read error | Print error, exit 2 | +| `--repo-root` does not exist | Individual docs: entries will fail Stat; logged per-path, exit 1 | +| changed file is empty string (blank line) | Skip (trim + ignore empty) | + +## Edge Cases + +- Blank lines in stdin input (from git diff with trailing newline) → trim and skip +- Duplicate `docs:` entries across multiple mappings → deduplicate before checking existence +- `docs:` entry that is a directory (ends with `/`) → `os.Stat` the path; if it exists it's fine +- `--repo-root` with trailing slash → use `filepath.Join` which normalizes it +- Changed files with `../` or absolute paths → check only (no traversal needed here since we're just calling `FileCoveredByDocMap`, which is pure string matching) + +## Testing Strategy + +- Unit tests with real temp files for stale-doc check (no mocking needed for `os.Stat`) +- `outWriter`/`errWriter` capture pattern (same as `validateurl_test.go`) +- Table-driven tests + +## Open Questions + +- **stdin vs `--files` flag**: Using stdin matches the standard CI pipe idiom and avoids shell + quoting issues with many files. Confirmed by Aaron's clarification. +- **Empty stdin coverage**: Aaron said empty stdin = no coverage failures. This means + "no changed files, no problem" — vacuously true. Makes sense for `git diff` on unchanged branches. +- **Directory docs: entries**: `os.Stat` is sufficient — if the directory exists, it's valid. + We don't recursively verify it has `.md` files. Kept simple. +- **`--repo-root` vs always cwd**: Default to cwd but allow override. This makes the command + usable from CI scripts that `cd` to a different directory. + +## Completion Checklist (generated for this task) + +1. `FileCoveredByDocMap` exported and covers the all-mappings, any-glob-matches logic correctly? +2. `runValidateDocmap` follows `runValidateURL` exactly: flag parse → validate → work → exit code? +3. Both checks always run (no early exit after first failure section)? +4. Empty stdin treated as clean (exit 0, no coverage errors)? +5. All `docs:` entries deduplicated before stale check? +6. `outWriter`/`errWriter` used (not `fmt.Println` directly), so tests can capture output? +7. `case "validate-docmap":` added to `main.go` dispatch switch? +8. Tests cover all 7 cases in the table above? +9. README updated with usage example? +10. `go test ./...` passes with no new failures? + +## Implementation Phases + +### Phase 1: Export helper in `review/docmap.go` +- Add `FileCoveredByDocMap(cfg *DocMapConfig, file string) bool` +- Add test in `review/docmap_test.go` + +### Phase 2: `cmd/review-bot/validatedocmap.go` +- Full `runValidateDocmap` implementation + +### Phase 3: Wire into `main.go` + tests +- `case "validate-docmap":` dispatch +- `validatedocmap_test.go` with full table + +### Phase 4: README + final +- Update README +- `go test ./...` diff --git a/PLAN-143.md b/PLAN-143.md new file mode 100644 index 0000000..1d8e521 --- /dev/null +++ b/PLAN-143.md @@ -0,0 +1,125 @@ +# PLAN-143: Load doc-map config from trusted (default) branch + +**Issue:** #143 +**Status:** Planning +**Branch:** TBD (issue-143) + +--- + +## Problem Statement + +The `--doc-map` flag reads the doc-map YAML config from the local `GITHUB_WORKSPACE` checkout, which is the **PR branch** in CI. A malicious PR author can: + +1. Modify `.review-bot/doc-map.yml` in their branch to map any path glob to sensitive docs +2. review-bot reads the PR-branch doc-map config +3. Docs from the **default branch** are fetched and injected into the LLM prompt +4. Via prompt injection in those docs, the attacker could exfiltrate content + +The config is the trust boundary. The *data* fetched (design docs) already comes from the default branch via VCS API. The *config* is what needs to be pinned to the default branch. + +## Constraints + +- Must not break existing callers (backward compatibility) +- Should have a clearly named flag/env var +- Fall back to local workspace if no trusted ref configured (for users not yet migrated) +- The gargoyle workflow (.github/workflows/review.yml) will need updating + +## Proposed Approach + +### Option A: Fetch via VCS API from default branch (preferred) + +Add a new flag `--doc-map-trusted-ref` (default: `""` = use local workspace). + +When `--doc-map-trusted-ref` is set: +1. Use the VCS API to fetch the file at `--doc-map` path from the specified ref +2. Parse the fetched content as YAML +3. Use this config (not the local workspace copy) + +When `--doc-map-trusted-ref` is empty: +- Current behavior (local workspace) with a deprecation warning + +This follows the same pattern as `patterns-repo` which fetches from VCS. + +### Option B: Auto-detect and always use default branch + +Always fetch doc-map from the default branch via VCS API, ignoring local workspace. +Simpler API but breaks local testing (where there's no VCS to fetch from). + +### Recommendation + +Option A — explicit `--doc-map-trusted-ref` flag. The gargoyle workflow would set: +```yaml +doc-map-trusted-ref: "main" +``` + +This is explicit and allows local testing to continue using local workspace. + +## Implementation Plan + +### Phase 1: VCS API fetch for doc-map config + +**Files to change:** +- `cmd/review-bot/main.go` — add `--doc-map-trusted-ref` flag, conditional fetch logic +- `review/docmap.go` — add `FetchDocMapConfig(vcs, owner, repo, ref, path string) (*DocMapConfig, error)` +- `action.yml` — add `doc-map-trusted-ref` input +- `README.md` — document new flag + +**Logic:** +```go +if *docMapTrustedRef != "" { + // Fetch from VCS (trusted branch) — secure + content, err := vcs.GetFileContent(ctx, owner, repoName, *docMapTrustedRef, resolvedDocMap) + ... + docMapCfg, err = review.ParseDocMapConfigContent(content) +} else { + // Local workspace (backward compat with deprecation warning) + slog.Warn("doc-map loaded from local workspace (PR branch) — consider --doc-map-trusted-ref for security") + docMapCfg, err = review.ParseDocMapConfig(resolvedDocMap) +} +``` + +### Phase 2: Tests + +- `TestFetchDocMapConfig_Success`: mock VCS returns valid YAML → parses correctly +- `TestFetchDocMapConfig_NotFound`: VCS returns 404 → clear error +- `TestMainSubprocess_DocMapTrustedRef`: subprocess test for the new flag + +### Phase 3: Gargoyle workflow update + +Update `.github/workflows/review.yml` in gargoyle to add `doc-map-trusted-ref: main`. + +## State/Data Model + +New flag: `--doc-map-trusted-ref` / `DOC_MAP_TRUSTED_REF` env var +- Type: string +- Default: `""` (local workspace) +- Example value: `"main"`, `"master"`, `HEAD` + +## Error Cases + +- VCS returns 404 for doc-map path at trusted ref → error + exit (not silent) +- VCS returns 404 but local copy exists → do NOT fall back (could be attack path) +- Parse error on fetched content → error + exit + +## Edge Cases + +- What if the doc-map doesn't exist at the trusted ref? → log error, exit (don't silently continue) +- What if trusted-ref is a commit SHA? → should work via VCS GetFileContent +- What if the user sets trusted-ref to the PR branch? → Works, but defeats the purpose. Not our problem to prevent. + +## Open Questions + +- Should we warn when `--doc-map` is set without `--doc-map-trusted-ref`? → Yes, deprecation warning pointing to docs +- Should we add `--doc-map-trusted-ref` to the `validate-docmap` subcommand? → No, that subcommand operates on local files only; it's a developer tool + +## Acceptance Criteria + +- [ ] `--doc-map-trusted-ref` flag added to `action.yml` and `cmd/review-bot/main.go` +- [ ] When set, doc-map config fetched from VCS at the specified ref (not local workspace) +- [ ] When unset, local workspace used with deprecation warning in logs +- [ ] 404 from VCS is a hard error (no silent fallback to local copy) +- [ ] Tests cover: fetch success, fetch 404, parse error +- [ ] Gargoyle `.github/workflows/review.yml` updated to use `doc-map-trusted-ref: main` +- [ ] README updated +- [ ] CHANGELOG updated +- [ ] `make precommit` passes