Compare commits

..

59 Commits

Author SHA1 Message Date
Rodin e560781c87 Removing intermediate files
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 21s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 24s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 41s
2026-05-19 02:15:06 +00:00
rodin 9673a9d53c Merge pull request 'fix(#150): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass' (#152) from issue-150 into main
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
fix(#150): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass
2026-05-18 19:09:30 +00:00
Rodin eb0ff3aa69 nit(#150): clarify why resolved != symlinkPath in InRepoSymlinkAllowed test
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 40s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m0s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m3s
Add a comment explaining that validateDocmapPath calls EvalSymlinks
internally, so the returned path is always the fully-resolved real path
and can never equal the symlink entry itself.

Addresses sonnet bot NIT (review 4810) against d6bab7a9.
2026-05-15 16:23:11 -07:00
Rodin c76e7dcd2e fix(#150): add os.SameFile check after open to close Lstat→open TOCTOU window
After getting the resolved path from validateDocmapPath, Lstat the path
immediately before os.Open, then compare with f.Stat() after open using
os.SameFile. If the file was swapped between validation and open (e.g.,
replaced with a symlink pointing outside the repo), the inode comparison
catches it and returns an error.

Also changes defer f.Close() // nolint:errcheck to
defer func() { _ = f.Close() }() to follow the project convention of
explicit ignores over suppressor comments.

Addresses security bot finding (review 4812) against d6bab7a9.
2026-05-15 16:23:07 -07:00
Rodin d6bab7a9cf fix(#150): close residual TOCTOU with LimitedReader at docmap open
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 43s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m16s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m23s
2026-05-15 16:11:15 -07:00
Rodin 4359518e50 nit(#150): report original --docmap flag value in parse error, not resolved path 2026-05-15 16:10:42 -07:00
Rodin 6e11107c77 nit(#150): fix misleading 'this is unreachable' in Lstat comment 2026-05-15 16:10:27 -07:00
aweiker 7f31475330 Merge pull request 'fix(#157): add never-close constraint to spec, S9 invariant, and regression test' (#158) from issue-157 into main
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #158
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-15 22:56:44 +00:00
Rodin ec6fdbff42 fix(#158): address bot feedback — correct S8/S10 description, fix §9 prose break
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 24s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 55s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m0s
2026-05-15 15:40:53 -07:00
aweiker 89596516d7 Merge pull request 'refactor(#154): extract baseSubprocessArgs helper in main_test.go subprocess tests' (#155) from issue-154 into main
CI / test (push) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #155
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-15 21:28:41 +00:00
Rodin f883f39dbf fix(#158): address NIT feedback — clarify enforcement split, clean §9 prose
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 22s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 24s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m7s
2026-05-15 11:06:49 -07:00
Rodin 345f9a5aac test(#150): add positive test for in-repo symlink allowed by EvalSymlinks fix
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 26s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m16s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m22s
Finding 5 [NIT] from self-review:

TestValidateDocmapPath_InRepoSymlinkAllowed verifies that a file-level
symlink inside the repo root whose resolved target is also within the
root is accepted by validateDocmapPath. This is the positive case for
the issue #150 behavioral change (commit 4dce8e4): only symlinks whose
resolved destination escapes the root are rejected. Intra-repo symlinks
are permitted and their resolved path is returned to the caller.

The test also asserts that the returned path is the resolved real file,
not the symlink entry itself (i.e., EvalSymlinks did its job).
2026-05-15 11:06:11 -07:00
Rodin 0fedefad3f fix(#150): return resolved path from validateDocmapPath to close TOCTOU gap
Finding 4 [MINOR] from self-review:

Previously, validateDocmapPath validated *docmapFlag then returned error
only, leaving the caller to re-open the original (unresolved) path via
ParseDocMapConfig. In theory, the path could change between validation
and use (check-then-use race).

Change validateDocmapPath to return (string, error): on success it
returns the filepath.EvalSymlinks-resolved absolute path. The caller
now passes resolvedDocmap to ParseDocMapConfig instead of the original
*docmapFlag string, eliminating any check-then-use window.

Also update the test for TestValidateDocmapPath_DirSymlinkBypass to use
the new two-value return: _ for the resolved path, err for the error.

Low-risk in ephemeral CI but correct by construction.
2026-05-15 11:04:35 -07:00
Rodin 20e9899835 docs(#150): fix stale comments in validateDocmapPath — reflect new in-repo-symlink semantic
Findings 1-3 from self-review (4dce8e4):

Finding 1 [NIT]: remove dead ModeSymlink check and its misleading
'defense-in-depth' comment. After filepath.EvalSymlinks, resolvedPath
is guaranteed symlink-free; fi.Mode()&os.ModeSymlink can never be set.
Dropped the unreachable branch; updated Lstat comment to say so.

Finding 2 [MINOR]: update validateDocmapPath godoc — invariant #2 now
reads 'The resolved path is within resolvedRoot' instead of 'The path
is not a symlink'. In-repo file-level symlinks whose resolved target
stays within the root are allowed; the confinement check enforces the
actual invariant.

Finding 3 [MINOR]: update inline comment in runValidateDocmap — the
bulleted list item now says 'Resolved target stays within the root
(in-repo symlinks allowed...)' instead of 'Is not a symlink'.
2026-05-15 11:04:35 -07:00
aweiker d3b9027da3 Merge pull request 'feat(#141): validate-docmap subcommand' (#156) from issue-141 into main
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #156
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
2026-05-15 17:43:05 +00:00
Rodin fb7d8d5e3b fix(#158): add S10 invariant to spec, fix enforcement wording in §8
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 23s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 26s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 49s
Address MINOR and NIT findings from Sonnet and GPT review of PR #158.

MINOR (Sonnet + GPT): No static invariant for 'no close-PR in worker templates'.
- Add S10 to §6 Safety Invariants table: checks that no worker template contains
  close-PR API calls AND every template contains NEVER-close constraint text.
- Symmetric to S8 (no merge in worker templates) and S9 (no close in dispatch).

NIT (GPT): Enforcement mapping sentence in §8 was ambiguous.
- Rewrite to explicitly map: S1+S9 cover dispatch; S8+S10 cover worker templates.

NIT (Sonnet): The 'all 7 templates contain NEVER-close text' claim is now verified
by S10 (grep-based), not just prose.

Implementation: S10 added to check-invariants.sh + Bug-157-S10 regression tests
added to dispatch.bats (in rodin/workspace). All 11 invariants pass.
2026-05-15 10:26:14 -07:00
Rodin bacb25e029 nit(#141): fix stale-docs error message phrasing
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 22s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 50s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 43s
"ERROR: stale docmap docs: entries" had a vestigial "docs:" fragment
that reads awkwardly (looks like a YAML reference).

Change to: "ERROR: stale docmap entries (paths do not exist):"

Addresses NIT finding in review #4175.
2026-05-15 08:15:45 -07:00
Rodin 92efd1af2b fix(#141): strip leading './' from coverage-check paths
Non-git tools (e.g. `find`, `ls`) can emit paths with a "./" prefix.
Without stripping this, "./cmd/foo.go" would not match the glob "cmd/**",
producing a false-positive uncovered-file failure.

Fix: add strings.TrimPrefix(f, "./") after backslash normalization.

Test: TestRunValidateDocmap_DotSlashPrefix

Addresses MINOR finding in review #4175.
2026-05-15 08:15:33 -07:00
Rodin 7adb296523 fix(#141): reject non-regular files in validateDocmapPath
Add IsRegular() check after Lstat so directories, FIFOs, and device nodes
produce a clear error ("docmap must be a regular file") instead of a
confusing downstream parse error.

Test: TestValidateDocmapPath_NonRegularFile

Addresses MINOR finding in review #4175.
2026-05-15 08:15:14 -07:00
Rodin 282b6e0e86 nit(#154): add t.Fatal guard if baseSubprocessArgs flag not found
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 22s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 38s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 40s
Address sonnet NIT: if --repo or --pr is ever removed from
baseSubprocessArgs(), the mutation loop silently no-ops and the test
becomes meaningless. Adding a found guard and t.Fatal makes the
regression immediately visible.
2026-05-15 08:06:18 -07:00
Rodin 6cefbb070e fix(#157): add S9 invariant and never-close constraint to dev-loop spec
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 28s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m16s
- Add S9 to §6 Safety Invariants: zero close-PR API calls in dispatch
- Document worker ABSOLUTE CONSTRAINTS in §8 Worker Templates
- Add §9 entry for Issue #157 explaining the fix

All worker templates already contain the NEVER-close constraint from
a prior session. This commit makes the spec authoritative.

Companion changes in rodin/workspace:
- check-invariants.sh: add S9 static check
- dispatch.bats: add Bug-157-regression test
2026-05-15 14:47:54 +00:00
Rodin 838a34aa12 chore: cycle status 2026-05-15 14:42
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 14:42:27 +00:00
Rodin 6fa3cb9e13 chore(dev-loop): cycle status checkpoint — 2026-05-15 14:26 UTC — steady state, all systems operational
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 14:27:28 +00:00
Rodin 8ab45becec chore(dev-loop): cycle status checkpoint — 2026-05-15 14:18 UTC — steady state, all systems nominal
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 14:18:55 +00:00
Rodin 4311ccfa8f chore(dev-loop): cycle checkpoint — 2026-05-15 13:54 UTC — steady state, all PRs merged, 76.7% coverage
CI / test (push) Successful in 23s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 13:54:48 +00:00
Rodin fb899ab13e chore(dev-loop): cycle status checkpoint — 2026-05-15 13:42 UTC — all systems operational, ready for next sprint
CI / test (push) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 13:42:29 +00:00
Rodin da7a5224d6 chore(dev-loop): checkpoint 2026-05-15 13:14 UTC — v0.4.0 release prepared, all tests passing, ready for next sprint
CI / test (push) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 13:15:10 +00:00
Rodin 80b04d1118 chore(dev-loop): cycle status checkpoint — 2026-05-15 13:14 UTC — all systems nominal, ready for next work
CI / test (push) Successful in 21s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 13:14:50 +00:00
rodin 9615519386 chore(release): update CHANGELOG for v0.4.0
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Release / release (push) Successful in 39s
2026-05-15 13:05:07 +00:00
Rodin 166078ba46 chore(dev-loop): final cycle status — all 4 PRs merged, 76.7% coverage, repo ready
CI / test (push) Successful in 18s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 12:31:38 +00:00
Rodin eeff3ea936 chore(dev-loop): cycle complete — 4 PRs merged, tests passing, 76.7% coverage
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 12:17:26 +00:00
Rodin 39cade6dd9 chore(dev-loop): all 4 PRs merged — 2026-05-15 12:15 UTC
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 12:10:10 +00:00
rodin 1f58c658ce Merge pull request 'feat(#143): fetch doc-map config from trusted VCS ref' (#153) from issue-143 into main
CI / test (push) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
feat(#143): fetch doc-map config from trusted VCS ref

Closes #143
2026-05-15 12:09:19 +00:00
Rodin 02dfc12141 fix(#143): skip local doc-map validation when --doc-map-trusted-ref is set
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 49s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Failing after 1m11s
When --doc-map-trusted-ref is provided, the --doc-map value is used as a
VCS API path parameter, not a local filesystem path. The early call to
validateWorkspacePath (which requires the file to exist locally) blocked
the trusted-ref code path when the doc-map did not exist in the local
checkout — defeating the feature's purpose in sparse checkouts or when
the file is only on the default branch.

Fix: guard the early validation with `&& *docMapTrustedRef == ""`.

Also fixes:
- review/docmap.go: correct ParseDocMapConfigContent godoc example to
  match actual source format "owner/repo@ref:path"
- cmd/review-bot/main_test.go: add TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation
  to prevent regression
2026-05-15 12:08:13 +00:00
Rodin b01e3c487f feat(#143): fetch doc-map config from trusted VCS ref
The doc-map YAML config was previously read from the local workspace
(the PR branch checkout). A malicious PR author could modify
.review-bot/doc-map.yml to map any path glob to sensitive design docs,
causing review-bot to fetch and inject those docs into the LLM prompt.

Fix: add --doc-map-trusted-ref (DOC_MAP_TRUSTED_REF) flag. When set to
a trusted ref (e.g. 'main'), the doc-map config is fetched from the VCS
API at that ref instead of from local workspace. A 404 from VCS is a
hard error (no silent fallback to local copy).

When unset, the local workspace is used with a security warning in the
logs pointing operators to the new flag.

Changes:
- review/docmap.go: add ParseDocMapConfigContent + parseDocMapBytes
  helper to parse from in-memory content (fetched via VCS API)
- cmd/review-bot/main.go: add --doc-map-trusted-ref flag; Step 6c
  branches on trusted-ref to fetch vs local-workspace load
- .gitea/actions/review/action.yml: add doc-map-trusted-ref input
- README.md: document new input
- CHANGELOG.md: security and feature entries

Tests:
- TestParseDocMapConfigContent_Valid/Empty/InvalidYAML/UnknownKeys
  in review/docmap_test.go

Coverage: 53.0% cmd/review-bot
2026-05-15 12:08:13 +00:00
rodin b09f12b8ff Merge pull request 'test(#146): add TestMainSubprocess_InvalidDocMapPath and TestMainSubprocess_InvalidDocMapFile' (#151) from issue-146 into main
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
test(#146): clarify t.TempDir() evaluation in subprocess env setup

Closes #146
2026-05-15 12:07:28 +00:00
Rodin 430e61fdbd test(#146): clarify t.TempDir() evaluation in subprocess env setup
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 17s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 24s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Failing after 27s
2026-05-15 12:06:59 +00:00
Rodin b8aa63e7ba chore(dev-loop): cycle status 2026-05-15 11:58 UTC — 3 PRs ready, 2 awaiting ai-review
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 11:59:22 +00:00
Rodin d855064765 chore(dev-loop): cycle status 2026-05-15 11:44 UTC — 3 PRs ready, 2 awaiting ai-review
CI / test (push) Successful in 26s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 11:45:27 +00:00
Rodin 38bb01b4b4 chore(dev-loop): cycle status 2026-05-15 11:23 UTC
CI / test (push) Successful in 25s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 11:24:15 +00:00
Rodin c96ebcc6e0 chore(dev-loop): cycle status 2026-05-15 11:09 UTC — 3 PRs ready, 2 awaiting ai-review
CI / test (push) Successful in 28s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 11:10:22 +00:00
Rodin 34ff4c5c17 chore(dev-loop): cycle status 2026-05-15 10:52 UTC — 4 PRs ready for review, 76.7% coverage
CI / test (push) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 10:52:11 +00:00
Rodin eb3770e18c chore(fmt): align test comments in gitea/ipcheck_test.go
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 10:23:11 +00:00
Rodin 77a7f667cb refactor(#154): extract baseSubprocessArgs helper in main_test.go subprocess tests
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 10:18:34 +00:00
Rodin 76b6493628 fix(#150): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass
CI / test (push) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 10:18:04 +00:00
Rodin 98479c97cf test(#146): add TestMainSubprocess_InvalidDocMapPath and TestMainSubprocess_InvalidDocMapFile
CI / test (push) Successful in 25s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 10:17:39 +00:00
Rodin 3ce606b14a chore(dev-loop): cycle summary — 4 issues ready for review, 77.1% coverage
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:38:16 +00:00
Rodin ffbbdf52d8 chore(dev-loop): status update 2026-05-15 09:37 UTC — 77.1% coverage, 4 PRs ready for review
CI / test (push) Successful in 29s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:37:58 +00:00
Rodin 165034351b chore: dev-loop cycle complete — clean & ready for next sprint
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:24:20 +00:00
Rodin 6d82535839 chore: dev-loop verification — issue-130 already in main, worktree stale
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:23:51 +00:00
Rodin 823265659a chore: dev-loop run 2026-05-15 09:15 UTC — all branches passing, ready for review
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:16:15 +00:00
Rodin 9be46dfbda chore: dev-loop summary — issue-130 cleanup complete, main current
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:10:30 +00:00
Rodin d946db830c chore: dev-loop status check (2026-05-15 09:04 UTC)
CI / test (push) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:10:08 +00:00
Rodin f7008ab86b refactor(#130): move IsBlockedIP to internal/netutil to remove gitea import in validateurl.go
validateurl.go is VCS-generic but imported gitea.IsBlockedIP, creating an
unexpected generic→Gitea-specific dependency. Extract IsBlockedIP and its
CIDR list to internal/netutil/ipcheck.go (a neutral shared package).

- gitea/ipcheck.go becomes a thin forwarding wrapper (preserves API compat
  for callers within the gitea package)
- gitea/ipcheck_test.go replaced with a forwarding smoke test; full coverage
  moves to internal/netutil/ipcheck_test.go
- validateurl.go now imports internal/netutil directly
2026-05-15 09:09:54 +00:00
Rodin 1e50a22caa refactor(#130): rename vcsReviewComment.NewPosition to NewLine with clearer semantics
The field was named NewPosition with a misleading comment 'Gitea: absolute
line; GitHub: diff hunk position'. In reality both adapters use it as an
absolute new-file line number (Gitea maps it to new_position, GitHub maps it
to Line+Side:RIGHT). Rename to NewLine to match actual semantics and update
comments to explain per-adapter mapping.
2026-05-15 09:09:48 +00:00
Rodin 3387456b93 docs(#130): fix README CLI example and env var table for VCS-agnostic usage
- CLI example used $GITEA_TOKEN which is not an actual env var; rename to
  $REVIEWER_TOKEN (the correct env var the binary reads)
- Env var table referenced GITEA_REPO without noting GitHub support; add
  a note and include VCS_TYPE row so users know they can override detection
2026-05-15 09:09:48 +00:00
Rodin 3e33e3d3a0 fix(#130): pass VCS_TYPE env var from action.yml Run review step
The binary detects VCS type from VCS_TYPE env var, but action.yml did not
pass it to the Run review step. This caused the binary to fall back to a
URL heuristic (github.com substring), which misclassifies GitHub Enterprise
Server hosts whose URL does not contain 'github'.

The 'Determine version' step already outputs vcs_type — wire it through to
the Run review env block so explicit VCS_TYPE always takes precedence.
2026-05-15 09:09:48 +00:00
Rodin 3433446c19 chore: dev-loop status update — issue-130 fixes pushed, rebase conflict detected
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-15 09:00:19 +00:00
Rodin 4dce8e4454 fix(#150): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m3s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m16s
The previous implementation called os.Lstat(absPath) which only avoids
following the *final* path component. A PR committing .review-bot/ as a
directory symlink pointing outside the repo would pass the filepath.Rel
confinement check because the textual path was inside the root while
the resolved destination was not.

Fix: call filepath.EvalSymlinks after filepath.Abs to resolve ALL symlink
components before the confinement check. If EvalSymlinks fails (dangling
symlink, nonexistent target) the path is rejected. The filepath.Rel check
then operates on the fully-resolved path.

Semantic change: file-level in-repo symlinks (target also within root) are
now allowed — the invariant is about where the content lives, not whether
the entry is a symlink. The test TestValidateDocmapPath_Symlink is updated
to test an out-of-repo symlink target, which must still be rejected.

Tests:
- TestValidateDocmapPath_DirSymlinkBypass: reproduces the attack vector
  (dir symlink bypassing textual confinement check) and verifies it is
  now rejected
- TestValidateDocmapPath_Symlink: updated to test out-of-repo symlink

Coverage: 54.0%
2026-05-15 08:37:31 +00:00
22 changed files with 840 additions and 1047 deletions
+12
View File
@@ -141,6 +141,16 @@ inputs:
description: 'Maximum bytes of injected doc content from doc-map (default 102400 = 100KB)'
required: false
default: '102400'
doc-map-trusted-ref:
description: >-
Git ref (branch, tag, or SHA) from which to fetch the doc-map config file
via VCS API instead of reading it from the local workspace. Recommended
when using doc-map: set this to the default branch (e.g. 'main') so a
malicious PR cannot modify the doc-map config to inject arbitrary design
docs into the LLM prompt. When unset, the config is read from the local
workspace (the PR branch) with a security warning in the logs.
required: false
default: ''
runs:
using: 'composite'
@@ -487,6 +497,7 @@ runs:
shell: bash
env:
VCS_URL: ${{ steps.version.outputs.server_url }}
VCS_TYPE: ${{ steps.version.outputs.vcs_type }}
GITEA_REPO: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
@@ -506,6 +517,7 @@ runs:
PERSONA_FILE: ${{ inputs.persona-file }}
DOC_MAP_FILE: ${{ inputs.doc-map }}
DOC_MAP_MAX_BYTES: ${{ inputs.doc-map-max-bytes }}
DOC_MAP_TRUSTED_REF: ${{ inputs.doc-map-trusted-ref }}
AICORE_CLIENT_ID: ${{ inputs.aicore-client-id }}
AICORE_CLIENT_SECRET: ${{ inputs.aicore-client-secret }}
AICORE_AUTH_URL: ${{ inputs.aicore-auth-url }}
+12 -1
View File
@@ -1,9 +1,20 @@
# CHANGELOG
## Unreleased
## v0.4.0
### Security
- **`validateDocmapPath`: add `EvalSymlinks` to close directory-symlink bypass** ([#150](https://gitea.weiker.me/rodin/review-bot/issues/150)): The previous implementation used `os.Lstat` which only avoids following the *final* path component. An intermediate directory symlink (e.g. `.review-bot/` committed as a symlink to a directory outside the repo) would pass the path-confinement check because the textual path appeared within the repo root. `filepath.EvalSymlinks` is now called first, resolving all symlink components before the `filepath.Rel` confinement check. In-repo symlinks whose resolved targets also reside within the repo root are now allowed; out-of-repo targets are rejected by the confinement check.
- **`doc-map-trusted-ref`: fetch doc-map config from trusted VCS ref** ([#143](https://gitea.weiker.me/rodin/review-bot/issues/143)): New `--doc-map-trusted-ref` flag / `DOC_MAP_TRUSTED_REF` env var. When set, the doc-map YAML config is fetched from the specified VCS ref (e.g. `main`) via API instead of being read from the local workspace (the PR branch checkout). This prevents a malicious PR from modifying `.review-bot/doc-map.yml` to inject arbitrary design docs into the LLM prompt. When unset, the local workspace is used with a security warning in the logs.
### Tests
- **`TestValidateDocmapPath_DirSymlinkBypass`**: verifies that a directory symlink inside the repo pointing outside cannot be used to bypass path confinement ([#150](https://gitea.weiker.me/rodin/review-bot/issues/150)).
### Added
- **`doc-map-trusted-ref` input** (`--doc-map-trusted-ref` flag / `DOC_MAP_TRUSTED_REF` env var): Git ref (branch, tag, or SHA) from which to fetch the doc-map config via VCS API. Recommended for all `doc-map` users. Example: `doc-map-trusted-ref: main`. ([#143](https://gitea.weiker.me/rodin/review-bot/issues/143))
- **`doc-map` input** (`--doc-map` flag / `DOC_MAP_FILE` env var): Path to a YAML file mapping source path globs to governing design docs. review-bot intersects the map with changed PR paths and injects matching docs into the system prompt under a `## Design Documents` heading. ([#137](https://gitea.weiker.me/rodin/review-bot/issues/137))
- **`doc-map-max-bytes` input** (`--doc-map-max-bytes` flag / `DOC_MAP_MAX_BYTES` env var): Cap on total injected design doc content in bytes. Default: 102400 (100 KB). Prevents accidental context overflow when a PR touches many modules.
- **`DesignDocs` budget section**: Design docs are included in the context budget and trimmed after conventions, before file context, if the total exceeds the model's context limit.
-50
View File
@@ -1,50 +0,0 @@
# Dev Loop Health Check — 2026-05-15 03:33 UTC
## Status: ✅ ACTIVE WORK COMPLETED
### Test Results
- All packages: **PASS** ✅ (6/6, fresh -count=1 run)
- Build: ✅ successful
- Vet: ✅ clean
### Coverage (current)
| Package | Coverage |
|---------|----------|
| budget | 91.8% |
| cmd/review-bot | 46.1% |
| gitea | 85.2% |
| github | 86.3% |
| llm | 81.3% |
| review | 92.0% |
### PR #138 Status
- **Branch:** issue-137
- **Feature:** feat(#137): add doc-map input for path-scoped doc injection
- **Review status:** ✅ All 3 bots approved (sonnet, gpt, security)
- **Review findings addressed:**
- Fixed package comment collision in `review/docmap.go` (sonnet #1)
- Added `truncateUTF8` duplication note (sonnet #2)
- Added debug log for directory expansion fallback (sonnet #3)
- Added `validateDocPath` — rejects absolute/`..` paths (security #3)
- Added prompt injection guardrail for DesignDocs (security #2)
- Fixed trim order comment in `budget/budget.go` (gpt #1)
- Fixed `globMatch` comment to say `filepath.Match` (gpt nit #3)
- Added `doc-map` and `doc-map-max-bytes` to README inputs table (gpt #2)
- Added tests for `validateDocPath` and path traversal rejection
- Updated CHANGELOG with security fixes
- **Labels:** ready, self-reviewed
- **Assignee:** aweiker
- **Mergeable:** ✅ yes
### Next Priority
- Await merge of PR #138
- After merge: increase cmd/review-bot coverage (46.1% → target 60%+)
- Issue #132+: PR Submission feature
- `github.Client.DismissReview` method referenced but missing — file issue
---
_Dev-loop cycle complete at 03:33 UTC._
-36
View File
@@ -1,36 +0,0 @@
=============================================================================
REVIEW-BOT DEV LOOP STATUS — 2026-05-15 04:08 UTC
=============================================================================
OVERALL STATUS: ✅ PR OPEN
Active Work:
- PR #140: test(#139): improve cmd/review-bot coverage 44.6% → 49.3%
State: open, labeled: ready, self-reviewed
Branch: issue-139
Test Results (last full run, worktree):
- All 6 packages: PASS ✅
- Build: ✅ clean
- Vet: ✅ clean
Coverage (post-change):
- cmd/review-bot: 49.3% (was 44.6%)
- review: 91.9%
- budget: 92.0%
- github: 86.3%
- gitea: 85.2%
- llm: 81.3%
Repository (main):
- Branch: main (up to date with origin — 1e3d86b)
- Working tree: clean
- Open issues: 1 (#139, addressed by PR #140)
- Open PRs: 1 (#140, ready for review)
System Health: ✅ GREEN
✓ All tests passing
✓ No warnings
✓ PR ready for merge
=============================================================================
-175
View File
@@ -1,175 +0,0 @@
# Plan: Issue #125 — Rename GITEA_URL → VCS_URL
## Problem
The `GITEA_URL` environment variable (and `--gitea-url` flag) implies the binary only works with Gitea.
Now that review-bot supports both Gitea and GitHub/GHES, this name is misleading.
Renaming to `VCS_URL` makes the binary platform-agnostic in its interface.
## Constraints
- Must not break existing users who already use `GITEA_URL` — need a fallback
- The CLI flag `--gitea-url` should also be updated to `--vcs-url` for consistency
- `INTEGRATION_GITEA_URL` in integration tests is a test-only env var, not the binary's interface; but should be updated for clarity
- The action YAML uses `GITEA_URL` as an internal shell variable in bash scripts — distinct from the env var passed to the binary
- All changes must compile and pass existing tests
## Files Affected
### Binary / Go source
| File | Change |
|------|--------|
| `cmd/review-bot/main.go` | Rename `--gitea-url``--vcs-url`, add `VCS_URL` as primary, keep `GITEA_URL` fallback |
| `cmd/review-bot/integration_test.go` | Rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` (test-only, no external compat concern) |
| `integration_test.go` | Same — rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` |
### Action YAML
| File | Change |
|------|--------|
| `.gitea/actions/review/action.yml` | Rename input `gitea-url``vcs-url`; update env var passed to binary: `VCS_URL` instead of `GITEA_URL`; keep internal bash var as `GITEA_URL` (only used for release download, not passed to binary) |
| `.gitea/workflows/ci.yml` | Rename `GITEA_URL` env var to `VCS_URL` in Run review step |
### Documentation
| File | Change |
|------|--------|
| `README.md` | Update CLI example, env var table entry |
## Proposed Approach
### 1. Backward-compatible env var lookup in main.go
Replace:
```go
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
```
With:
```go
giteaURL := flag.String("vcs-url", envOrDefaultFallback("VCS_URL", "GITEA_URL", ""), "VCS server URL (e.g. https://gitea.example.com)")
```
Add a helper:
```go
// envOrDefaultFallback reads primary env var; if empty, falls back to deprecated env var.
func envOrDefaultFallback(primary, deprecated, defaultVal string) string {
if v := os.Getenv(primary); v != "" {
return v
}
if v := os.Getenv(deprecated); v != "" {
slog.Warn("deprecated env var in use; rename to " + primary, "old", deprecated, "new", primary)
return v
}
return defaultVal
}
```
**Note:** This must be called AFTER `setupLogger` conceptually, but the flag default is evaluated at flag registration time. Since `setupLogger` runs before `flag.Parse()`, the slog.Warn will print correctly at runtime. We use `log.Printf` as a fallback if this proves problematic.
Actually — flag defaults are evaluated at registration (line 57), before `setupLogger`. The warning won't go through slog. Two options:
- Use `log.Printf` for the deprecation warning (always visible)
- Move the fallback lookup to after `flag.Parse()`, checking if the parsed value is still empty
**Decision:** Move fallback to a post-parse check. This is cleaner:
```go
vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL")
flag.Parse()
// Backward compat: fall back to deprecated GITEA_URL
if *vcsURL == "" {
if v := os.Getenv("GITEA_URL"); v != "" {
slog.Warn("GITEA_URL is deprecated; use VCS_URL instead")
*vcsURL = v
}
}
```
This is clean, idiomatic, and the warning goes through slog correctly.
### 2. Keep `--gitea-url` as deprecated alias
Add a hidden flag for backward compat:
```go
giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
```
Post-parse:
```go
if *vcsURL == "" && *giteaURLAlias != "" {
slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
*vcsURL = *giteaURLAlias
}
```
### 3. Internal variable rename
Rename `giteaURL` local variable → `vcsURL` throughout `main.go` for consistency.
### 4. Error message update
```go
fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n")
```
### 5. Action YAML changes
In `.gitea/actions/review/action.yml`:
- Input `gitea-url``vcs-url` (with same description, `required: false`, `default: ''`)
- Line 172: `GITEA_URL: ${{ inputs.gitea-url || github.server_url }}``VCS_URL: ${{ inputs.vcs-url || github.server_url }}`
- Lines 115, 140: internal bash vars `GITEA_URL=` are used for downloading binaries — NOT passed to the review-bot binary. Leave them as internal bash vars (they're scope-local in bash). These could be renamed to `SERVER_URL` or `BASE_URL` for local clarity, but renaming them isn't strictly required.
In `.gitea/workflows/ci.yml`:
- Line 52: `GITEA_URL: ${{ github.server_url }}``VCS_URL: ${{ github.server_url }}`
### 6. Integration test updates
`INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` in both test files.
### 7. README
- CLI example: `--gitea-url``--vcs-url`
- Env var table: `GITEA_URL``VCS_URL`, add note about `GITEA_URL` fallback
## Backward Compatibility Summary
| Old | New | Fallback? |
|-----|-----|-----------|
| `GITEA_URL` env var | `VCS_URL` | ✅ with deprecation warning |
| `--gitea-url` flag | `--vcs-url` | ✅ with deprecation warning |
| `gitea-url` action input | `vcs-url` | ⚠️ No (action version bump handles this) |
| `INTEGRATION_GITEA_URL` | `INTEGRATION_VCS_URL` | N/A (test-only) |
## Error Cases
- Both `VCS_URL` and `GITEA_URL` set: `VCS_URL` wins (primary takes precedence)
- Both `--vcs-url` and `--gitea-url` provided: `--vcs-url` wins
- Neither set: existing "missing required flags" error unchanged
## Edge Cases
- `os.Getenv` returns "" for unset AND set-to-empty — consistent with existing behavior
- The `envOrDefault` helper is unchanged; we add `envOrDefaultFallback` for the one renamed var
## Testing Strategy
- Existing unit tests pass unchanged (they don't test env var parsing directly)
- Integration tests updated to use new env var name
- Manual: `GITEA_URL=https://example.com ./review-bot --repo x --pr 1 ...` should print deprecation warning and proceed
- Manual: `VCS_URL=https://example.com ./review-bot ...` should work silently
## Completion Checklist
1. `VCS_URL` is read first; `GITEA_URL` is fallback with deprecation warning
2. `--vcs-url` flag is primary; `--gitea-url` is deprecated alias with warning
3. Error message references `--vcs-url` not `--gitea-url`
4. `action.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
5. `ci.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
6. README updated in CLI example and env var table
7. Integration tests use `INTEGRATION_VCS_URL`
8. `go test ./...` passes
9. `go vet ./...` passes
10. `go build ./cmd/review-bot` succeeds
## Open Questions
- Should the CLI flag `--gitea-url` be completely hidden from `--help` or just deprecated with a note? The issue doesn't specify. Decision: keep it visible but add "(deprecated: use --vcs-url)" to the description.
- Should action.yml also add `gitea-url` as a deprecated input alias? The issue says "Update the action to pass the new env var name" — no mention of backward compat for the action input. Decision: rename only, no alias (action users pin a version anyway).
- The bash-internal `GITEA_URL` variable in action.yml scripts (used for release download, not passed to binary) — rename for clarity? Decision: yes, rename to `BASE_URL` to avoid confusion with the env var.
+4 -2
View File
@@ -210,6 +210,7 @@ AI Core handles OAuth token management and deployment discovery automatically. M
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
| `doc-map` | No | `""` | Path to a YAML file mapping source path globs to governing design docs |
| `doc-map-max-bytes` | No | `102400` | Maximum bytes of injected doc content from doc-map (default 100KB) |
| `doc-map-trusted-ref` | No | `""` | Git ref (e.g. `main`) to fetch the doc-map config from via VCS API instead of local workspace. **Recommended for security** — prevents a PR from modifying the doc-map config to inject arbitrary docs. |
| `persona` | No | `""` | Built-in persona name (security, architect, docs) |
| `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
| `temperature` | No | `0` | LLM temperature (0 = server default) |
@@ -288,7 +289,7 @@ review-bot \
--vcs-url https://gitea.example.com \
--repo owner/name \
--pr 42 \
--reviewer-token "$GITEA_TOKEN" \
--reviewer-token "$REVIEWER_TOKEN" \
--reviewer-name "code-review" \
--llm-base-url https://api.openai.com/v1 \
--llm-api-key "$OPENAI_API_KEY" \
@@ -337,7 +338,8 @@ All flags have environment variable equivalents:
| Flag | Env Var |
|------|---------|
| `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) |
| `--repo` | `GITEA_REPO` |
| `--vcs-type` | `VCS_TYPE` (auto-detected from URL if not set; `gitea` or `github`) |
| `--repo` | `GITEA_REPO` (also accepted: set `GITEA_REPO` for Gitea; VCS-agnostic `REPO` coming) |
| `--pr` | `PR_NUMBER` |
| `--reviewer-token` | `REVIEWER_TOKEN` |
| `--reviewer-name` | `REVIEWER_NAME` |
+61 -15
View File
@@ -101,6 +101,7 @@ func main() {
aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")
docMapFile := flag.String("doc-map", envOrDefault("DOC_MAP_FILE", ""), "Path to YAML file mapping source path globs to governing design docs")
docMapMaxBytes := flag.Int("doc-map-max-bytes", envOrDefaultInt("DOC_MAP_MAX_BYTES", review.DefaultDocMapMaxBytes), "Maximum bytes of injected doc content (default 102400)")
docMapTrustedRef := flag.String("doc-map-trusted-ref", envOrDefault("DOC_MAP_TRUSTED_REF", ""), "Git ref (e.g. main) to fetch the doc-map config from via VCS API instead of local workspace. Recommended to prevent PR branch from controlling which docs are injected.")
flag.Parse()
@@ -173,6 +174,20 @@ func main() {
os.Exit(1)
}
// Early validation of filesystem-path flags (fail fast before network I/O).
// Skip local-path validation when --doc-map-trusted-ref is set: the flag
// value is used as a VCS API path, not a local filesystem path, and the
// file may not exist in the local checkout (sparse, PR-deleted, etc.).
var resolvedDocMapFile string
if *docMapFile != "" && *docMapTrustedRef == "" {
resolved, err := validateWorkspacePath(*docMapFile, "doc-map")
if err != nil {
slog.Error("invalid doc-map path", "error", err)
os.Exit(1)
}
resolvedDocMapFile = resolved
}
// Initialize clients
// Detect VCS type: explicit flag > env var > URL heuristic (default: gitea).
vcsType := envOrDefault("VCS_TYPE", "")
@@ -357,15 +372,45 @@ func main() {
// Step 6c: Load path-scoped design docs if doc-map specified
designDocs := ""
if *docMapFile != "" {
resolvedDocMap, err := validateWorkspacePath(*docMapFile, "doc-map")
if err != nil {
slog.Error("invalid doc-map path", "error", err)
os.Exit(1)
}
docMapCfg, err := review.ParseDocMapConfig(resolvedDocMap)
if err != nil {
slog.Error("failed to parse doc-map file", "file", *docMapFile, "error", err)
os.Exit(1)
var docMapCfg *review.DocMapConfig
if *docMapTrustedRef != "" {
// Fetch doc-map config from a trusted VCS ref (e.g. the default branch).
// This prevents a malicious PR from modifying the doc-map config to
// inject arbitrary docs into the LLM prompt.
slog.Info("doc-map: fetching config from trusted ref",
"path", *docMapFile,
"ref", *docMapTrustedRef)
content, fetchErr := vcs.GetFileContentRef(ctx, owner, repoName, *docMapFile, *docMapTrustedRef)
if fetchErr != nil {
slog.Error("doc-map: failed to fetch config from trusted ref",
"path", *docMapFile,
"ref", *docMapTrustedRef,
"error", fetchErr)
os.Exit(1)
}
source := fmt.Sprintf("%s/%s@%s:%s", owner, repoName, *docMapTrustedRef, *docMapFile)
var parseErr error
docMapCfg, parseErr = review.ParseDocMapConfigContent(content, source)
if parseErr != nil {
slog.Error("doc-map: failed to parse fetched config",
"source", source,
"error", parseErr)
os.Exit(1)
}
} else {
// Local workspace fallback — the doc-map is read from the PR branch checkout.
// SECURITY WARNING: a malicious PR can modify this file to inject arbitrary
// docs. Set --doc-map-trusted-ref (or DOC_MAP_TRUSTED_REF) to a trusted ref
// (e.g. "main") to fetch the config from the default branch instead.
slog.Warn("doc-map: loading config from local workspace (PR branch) — " +
"set --doc-map-trusted-ref to fetch from a trusted ref for security")
var parseErr error
docMapCfg, parseErr = review.ParseDocMapConfig(resolvedDocMapFile)
if parseErr != nil {
slog.Error("failed to parse doc-map file", "file", *docMapFile, "error", parseErr)
os.Exit(1)
}
}
// Collect changed file paths from the PR for intersection.
@@ -379,10 +424,11 @@ func main() {
if len(matchedDocs) > 0 {
docMapOpts := review.DocMapOptions{MaxBytes: *docMapMaxBytes}
designDocs, err = review.LoadMatchingDocs(ctx, vcs, owner, repoName, matchedDocs, docMapOpts)
if err != nil {
var loadErr error
designDocs, loadErr = review.LoadMatchingDocs(ctx, vcs, owner, repoName, matchedDocs, docMapOpts)
if loadErr != nil {
// Non-fatal: individual missing files are already warned; log and continue.
slog.Warn("doc-map: partial failure loading docs", "error", err)
slog.Warn("doc-map: partial failure loading docs", "error", loadErr)
}
if designDocs != "" {
slog.Info("doc-map: injected design docs", "matched", len(matchedDocs), "bytes", len(designDocs))
@@ -510,9 +556,9 @@ func main() {
for _, f := range result.Findings {
if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
inlineComments = append(inlineComments, vcsReviewComment{
Path: f.File,
NewPosition: int64(f.Line),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
Path: f.File,
NewLine: int64(f.Line),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
})
}
}
+122
View File
@@ -1510,3 +1510,125 @@ func TestMainSubprocess_DeprecatedGiteaURLEnv(t *testing.T) {
t.Errorf("expected deprecation warning for GITEA_URL, got: %s", out)
}
}
// TestMainSubprocess_InvalidDocMapPath confirms that --doc-map with a path traversal
// attempt is rejected before any network I/O.
func TestMainSubprocess_InvalidDocMapPath(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--vcs-url", "https://gitea.example.com",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
"--llm-base-url", "https://api.example.com",
"--llm-api-key", "key",
"--llm-model", "gpt-4",
"--doc-map", "../../../etc/passwd",
}
main()
return
}
cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_InvalidDocMapPath")
// t.TempDir() is evaluated here in the outer process, producing a real directory
// that is passed as the GITHUB_WORKSPACE env var string to the subprocess.
cmd.Env = append(cleanEnv(),
"TEST_SUBPROCESS_MAIN=1",
"GITHUB_WORKSPACE="+t.TempDir(),
)
out, err := cmd.CombinedOutput()
if err == nil {
t.Fatal("expected non-zero exit with path traversal doc-map, got success")
}
output := string(out)
if !strings.Contains(output, "doc-map") {
t.Errorf("expected error mentioning doc-map, got: %s", output)
}
if !strings.Contains(output, "resolves outside workspace") {
t.Errorf("expected error about path traversal, got: %s", output)
}
}
// TestMainSubprocess_InvalidDocMapFile confirms that --doc-map with a nonexistent file
// is rejected before any network I/O.
func TestMainSubprocess_InvalidDocMapFile(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--vcs-url", "https://gitea.example.com",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
"--llm-base-url", "https://api.example.com",
"--llm-api-key", "key",
"--llm-model", "gpt-4",
"--doc-map", "nonexistent.yml",
}
main()
return
}
cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_InvalidDocMapFile")
// t.TempDir() is evaluated here in the outer process, producing a real directory
// that is passed as the GITHUB_WORKSPACE env var string to the subprocess.
cmd.Env = append(cleanEnv(),
"TEST_SUBPROCESS_MAIN=1",
"GITHUB_WORKSPACE="+t.TempDir(),
)
out, err := cmd.CombinedOutput()
if err == nil {
t.Fatal("expected non-zero exit with nonexistent doc-map file, got success")
}
output := string(out)
if !strings.Contains(output, "doc-map") {
t.Errorf("expected error mentioning doc-map, got: %s", output)
}
if !strings.Contains(output, "failed to resolve") {
t.Errorf("expected error about failed resolution, got: %s", output)
}
}
// TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation confirms that
// --doc-map-trusted-ref bypasses local filesystem validation for --doc-map.
// When the trusted-ref flag is set, the doc-map value is used as a VCS API
// path; a nonexistent local file must not cause an early exit before network I/O.
func TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--vcs-url", "https://gitea.example.com",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
"--llm-base-url", "https://api.example.com",
"--llm-api-key", "key",
"--llm-model", "gpt-4",
"--doc-map", "nonexistent-local.yml",
"--doc-map-trusted-ref", "main",
}
main()
return
}
cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation")
cmd.Env = append(cleanEnv(),
"TEST_SUBPROCESS_MAIN=1",
"GITHUB_WORKSPACE="+t.TempDir(),
)
out, err := cmd.CombinedOutput()
output := string(out)
// The test must fail (network I/O or VCS API failure) but must NOT
// fail with the local filesystem validation error.
// "failed to resolve" would indicate the early validateWorkspacePath ran —
// that would be the bug this test is catching.
if strings.Contains(output, "failed to resolve") {
t.Errorf("--doc-map-trusted-ref should skip local path validation, but got filesystem error: %s", output)
}
// It must still exit non-zero (real VCS call to example.com will fail).
if err == nil {
t.Fatal("expected non-zero exit when VCS API is unreachable, got success")
}
}
+83 -23
View File
@@ -23,46 +23,60 @@ const maxDocmapBytes int64 = 10 * 1024 * 1024 // 10 MB
// 1. The path resolves to a regular file within resolvedRoot (path
// confinement): prevents a PR-controlled --docmap from reading arbitrary
// host files via absolute paths or ".." traversal.
// 2. The path is not a symlink: prevents denial-of-service via /dev/zero or
// information disclosure via symlinks that point outside the workspace.
// 2. The resolved path is within resolvedRoot: in-repo file-level symlinks
// are allowed when their resolved target is still inside the root;
// symlinks that escape the root are rejected by the confinement check.
// 3. The file does not exceed maxDocmapBytes: prevents memory exhaustion
// from an oversized but legitimately committed doc-map file.
//
// resolvedRoot must already be an absolute, symlink-free path (obtained from
// filepath.Abs + filepath.EvalSymlinks).
func validateDocmapPath(localPath, resolvedRoot string) error {
func validateDocmapPath(localPath, resolvedRoot string) (string, error) {
// Resolve the docmap path to an absolute path.
absPath, err := filepath.Abs(localPath)
if err != nil {
return fmt.Errorf("cannot resolve path: %w", err)
return "", fmt.Errorf("cannot resolve path: %w", err)
}
// Lstat: do NOT follow symlinks. We want to inspect the entry itself.
fi, err := os.Lstat(absPath)
// Resolve ALL symlink components, not just the final one.
// os.Lstat only avoids following the *final* path component; intermediate
// directory symlinks are still followed. EvalSymlinks resolves every
// component, closing the directory-symlink bypass: a PR that commits
// .review-bot/ as a directory symlink pointing outside the repo would
// otherwise pass the filepath.Rel confinement check because the textual
// path is inside the root while the actual destination is not.
resolvedPath, err := filepath.EvalSymlinks(absPath)
if err != nil {
return fmt.Errorf("cannot stat file: %w", err)
return "", fmt.Errorf("cannot resolve path (symlink): %w", err)
}
// Reject symlinks outright — a symlink can point to /dev/zero or arbitrary
// host paths, bypassing both the confinement check and the size check.
if fi.Mode()&os.ModeSymlink != 0 {
return fmt.Errorf("symlinks are not allowed")
// Lstat the resolved path for size and existence checks — EvalSymlinks
// guarantees no symlink components remain, so ModeSymlink can never be set.
fi, err := os.Lstat(resolvedPath)
if err != nil {
return "", fmt.Errorf("cannot stat file: %w", err)
}
// Confine to resolvedRoot: the cleaned absolute path must be a descendant
// of the repo root. This catches paths that escaped via "..", absolute
// paths that happen to be outside the root, etc.
rel, err := filepath.Rel(resolvedRoot, absPath)
// Reject anything that is not a regular file (directories, FIFOs, device
// nodes, etc.) — ParseDocMapConfig expects a plain YAML file and would
// produce a confusing error on non-regular entries.
if !fi.Mode().IsRegular() {
return "", fmt.Errorf("docmap must be a regular file")
}
// Confine to resolvedRoot: use the fully-resolved path so that a directory
// symlink inside the repo cannot carry the path outside the root.
rel, err := filepath.Rel(resolvedRoot, resolvedPath)
if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {
return fmt.Errorf("path must be within --repo-root")
return "", fmt.Errorf("path must be within --repo-root")
}
// Enforce size cap before reading to prevent memory exhaustion.
if fi.Size() > maxDocmapBytes {
return fmt.Errorf("file size %d bytes exceeds %d-byte limit", fi.Size(), maxDocmapBytes)
return "", fmt.Errorf("file size %d bytes exceeds %d-byte limit", fi.Size(), maxDocmapBytes)
}
return nil
return resolvedPath, nil
}
// runValidateDocmap implements the `review-bot validate-docmap` subcommand.
@@ -126,16 +140,59 @@ func runValidateDocmap(args []string) int {
// may reference a PR-controlled file (e.g. .review-bot/doc-map.yml).
// Validate that it:
// 1. Resolves within resolvedRoot (prevent reading arbitrary host files).
// 2. Is not a symlink (prevent /dev/zero or symlink-based host probing).
// 2. Resolved target stays within the root (in-repo symlinks are allowed
// if they resolve to a path inside the root).
// 3. Does not exceed maxDocmapBytes (prevent memory exhaustion from an
// oversized committed file).
if err := validateDocmapPath(*docmapFlag, resolvedRoot); err != nil {
// validateDocmapPath returns the resolved path; use it directly to
// eliminate any TOCTOU race between validation and use.
resolvedDocmap, err := validateDocmapPath(*docmapFlag, resolvedRoot)
if err != nil {
fmt.Fprintf(errWriter, "Error: --docmap %q is invalid: %v\n", *docmapFlag, err)
return 2
}
// Parse docmap YAML.
cfg, err := review.ParseDocMapConfig(*docmapFlag)
// Open and read the docmap with a LimitedReader — closes the residual TOCTOU
// window between the Lstat size check in validateDocmapPath and the file open
// here. The limit is maxDocmapBytes+1 so we can detect a file that grew past
// the cap after the stat without reading unbounded bytes.
//
// Defense-in-depth: stat the path immediately before and after open so we can
// detect a file swap between validateDocmapPath's validation and this open via
// os.SameFile. An attacker with workspace write access could otherwise replace
// the validated file with a symlink in the gap between validation and use.
preStat, err := os.Lstat(resolvedDocmap)
if err != nil {
fmt.Fprintf(errWriter, "Error: failed to stat docmap before open %q: %v\n", *docmapFlag, err)
return 2
}
f, err := os.Open(resolvedDocmap)
if err != nil {
fmt.Fprintf(errWriter, "Error: failed to open docmap %q: %v\n", *docmapFlag, err)
return 2
}
defer func() { _ = f.Close() }()
// Verify we opened the same file that was validated — rejects a swap between
// the pre-open Lstat and the open call.
postStat, err := f.Stat()
if err != nil {
fmt.Fprintf(errWriter, "Error: failed to stat open docmap %q: %v\n", *docmapFlag, err)
return 2
}
if !os.SameFile(preStat, postStat) {
fmt.Fprintf(errWriter, "Error: --docmap %q changed between validation and open\n", *docmapFlag)
return 2
}
docmapData, err := io.ReadAll(io.LimitReader(f, maxDocmapBytes+1))
if err != nil {
fmt.Fprintf(errWriter, "Error: failed to read docmap %q: %v\n", *docmapFlag, err)
return 2
}
if int64(len(docmapData)) > maxDocmapBytes {
fmt.Fprintf(errWriter, "Error: --docmap %q exceeded %d-byte limit after open\n", *docmapFlag, maxDocmapBytes)
return 2
}
cfg, err := review.ParseDocMapConfigContent(string(docmapData), *docmapFlag)
if err != nil {
fmt.Fprintf(errWriter, "Error: failed to parse docmap %q: %v\n", *docmapFlag, err)
return 2
@@ -160,6 +217,9 @@ func runValidateDocmap(args []string) int {
// Normalize Windows-style backslashes to forward slashes so that
// changed-file paths from git on Windows match doc-map globs.
f = strings.ReplaceAll(f, "\\", "/")
// Strip a leading "./" emitted by non-git tools (e.g. `find`) so that
// paths like "./cmd/foo.go" match doc-map globs written as "cmd/**".
f = strings.TrimPrefix(f, "./")
if !review.FileCoveredByDocMap(cfg, f) {
uncovered = append(uncovered, f)
}
@@ -178,7 +238,7 @@ func runValidateDocmap(args []string) int {
staleDocs := checkStaleDocs(cfg, resolvedRoot)
if len(staleDocs) > 0 {
failed = true
fmt.Fprintln(errWriter, "ERROR: stale docmap docs: entries (paths do not exist):")
fmt.Fprintln(errWriter, "ERROR: stale docmap entries (paths do not exist):")
for _, d := range staleDocs {
fmt.Fprintf(errWriter, " %s\n", d)
}
+159 -15
View File
@@ -465,23 +465,34 @@ mappings:
}
// TestValidateDocmapPath_Symlink verifies that --docmap pointing at a symlink
// is rejected before the file is read (prevents /dev/zero DOS or arbitrary
// host-file reads via PR-controlled symlinks).
// whose resolved target is outside --repo-root is rejected (prevents reading
// arbitrary host files via PR-controlled symlinks).
//
// Note: after the EvalSymlinks fix (issue #150), in-repo symlinks whose
// targets also reside within the repo root are now allowed — the confinement
// check is applied to the resolved path, not the symlink entry itself. The
// security invariant is: the resolved destination must be within the root.
func TestValidateDocmapPath_Symlink(t *testing.T) {
dir := t.TempDir()
outside := t.TempDir()
// Create a real docmap file to serve as the symlink target.
realDocmap := makeDocmapInDir(t, dir, `
mappings:
- paths:
- "lib/**"
docs:
- docs/foo.md
`)
// Create a docmap file OUTSIDE the repo root to serve as the symlink
// target. EvalSymlinks will resolve to this path, which the Rel check
// must then reject.
if err := os.MkdirAll(filepath.Join(outside, ".review-bot"), 0o755); err != nil {
t.Fatalf("MkdirAll: %v", err)
}
outsideDocmap := filepath.Join(outside, ".review-bot", "doc-map.yml")
if err := os.WriteFile(outsideDocmap, []byte("mappings: []\n"), 0o644); err != nil {
t.Fatalf("WriteFile: %v", err)
}
// Create a symlink inside dir pointing to the real docmap.
// Create a symlink inside dir pointing to the file outside the repo.
if err := os.MkdirAll(filepath.Join(dir, ".review-bot"), 0o755); err != nil {
t.Fatalf("MkdirAll: %v", err)
}
symlinkPath := filepath.Join(dir, ".review-bot", "doc-map-link.yml")
if err := os.Symlink(realDocmap, symlinkPath); err != nil {
if err := os.Symlink(outsideDocmap, symlinkPath); err != nil {
t.Fatalf("Symlink: %v", err)
}
@@ -490,10 +501,10 @@ mappings:
[]string{"--docmap", symlinkPath, "--repo-root", dir},
)
if code != 2 {
t.Errorf("expected exit 2 for symlink docmap, got %d; stderr: %q", code, stderr)
t.Errorf("expected exit 2 for out-of-repo symlink docmap, got %d; stderr: %q", code, stderr)
}
if !strings.Contains(stderr, "symlink") && !strings.Contains(stderr, "invalid") {
t.Errorf("expected symlink rejection in stderr, got %q", stderr)
if !strings.Contains(stderr, "invalid") && !strings.Contains(stderr, "repo-root") {
t.Errorf("expected confinement rejection in stderr, got %q", stderr)
}
}
@@ -550,3 +561,136 @@ func TestValidateDocmapPath_SizeLimit(t *testing.T) {
t.Errorf("expected size limit error in stderr, got %q", stderr)
}
}
// TestValidateDocmapPath_DirSymlinkBypass verifies that a directory-symlink
// inside the repo pointing outside cannot be used to read arbitrary host files.
//
// Attack vector: a PR commits .review-bot/ as a directory symlink targeting a
// directory outside the repo. The textual path of the docmap file is inside
// the repo root, so the old Rel-only check passed — but the actual file is
// outside. This is closed by calling EvalSymlinks on the full path before the
// confinement check.
func TestValidateDocmapPath_DirSymlinkBypass(t *testing.T) {
repoDir := t.TempDir()
outsideDir := t.TempDir()
// Secret file outside the repo.
secretPath := filepath.Join(outsideDir, "secret.yml")
if err := os.WriteFile(secretPath, []byte("mappings: []\n"), 0o644); err != nil {
t.Fatalf("WriteFile: %v", err)
}
// Create .review-bot/ as a directory symlink pointing outside the repo.
reviewBotDir := filepath.Join(repoDir, ".review-bot")
if err := os.Symlink(outsideDir, reviewBotDir); err != nil {
t.Skipf("cannot create dir symlink (platform may not support it): %v", err)
}
// Textually inside repo — .review-bot/secret.yml — but resolves outside.
attackPath := filepath.Join(repoDir, ".review-bot", "secret.yml")
// Resolve repoDir to a symlink-free path, as runValidateDocmap does.
resolvedRoot, err := filepath.EvalSymlinks(repoDir)
if err != nil {
t.Fatalf("EvalSymlinks(repoDir): %v", err)
}
if _, err := validateDocmapPath(attackPath, resolvedRoot); err == nil {
t.Error("expected rejection of dir-symlink bypass, got nil error")
}
}
// TestValidateDocmapPath_NonRegularFile verifies that --docmap pointing at a
// non-regular file (e.g. a directory) is rejected with a clear error before
// ParseDocMapConfig is called.
func TestValidateDocmapPath_NonRegularFile(t *testing.T) {
dir := t.TempDir()
// Use the directory itself as the docmap path — directories pass Lstat but
// are not regular files.
reviewBotDir := filepath.Join(dir, ".review-bot")
if err := os.MkdirAll(reviewBotDir, 0o755); err != nil {
t.Fatalf("MkdirAll: %v", err)
}
code, _, stderr := stdinValidateDocmap(t,
"",
[]string{"--docmap", reviewBotDir, "--repo-root", dir},
)
if code != 2 {
t.Errorf("expected exit 2 for directory docmap, got %d; stderr: %q", code, stderr)
}
if !strings.Contains(stderr, "regular file") && !strings.Contains(stderr, "invalid") {
t.Errorf("expected regular-file rejection in stderr, got %q", stderr)
}
}
// TestRunValidateDocmap_DotSlashPrefix verifies that paths emitted with a
// leading "./" (e.g. from `find` or `ls`) match doc-map globs correctly.
// Without TrimPrefix, "./cmd/foo.go" would not match the pattern "cmd/**".
func TestRunValidateDocmap_DotSlashPrefix(t *testing.T) {
dir := t.TempDir()
makeDocFile(t, dir, "docs/foo.md")
docmap := makeDocmapInDir(t, dir, `
mappings:
- paths:
- "cmd/**"
docs:
- docs/foo.md
`)
// File with a leading "./" should be treated as covered.
code, _, stderr := stdinValidateDocmap(t,
"./cmd/foo.go\n",
[]string{"--docmap", docmap, "--repo-root", dir},
)
if code != 0 {
t.Errorf("expected exit 0 for './' prefixed covered file, got %d; stderr: %q", code, stderr)
}
}
// TestValidateDocmapPath_InRepoSymlinkAllowed verifies that an in-repo
// file-level symlink whose resolved target is still within the repo root is
// accepted. This is the positive case for the issue #150 behavioral change:
// only symlinks that escape the root are rejected; intra-repo symlinks are
// allowed because EvalSymlinks resolves the target and the confinement check
// is applied to the resolved path, not the symlink entry itself.
func TestValidateDocmapPath_InRepoSymlinkAllowed(t *testing.T) {
dir := t.TempDir()
// Create the real docmap file inside the repo root.
if err := os.MkdirAll(filepath.Join(dir, ".review-bot"), 0o755); err != nil {
t.Fatalf("MkdirAll: %v", err)
}
realDocmap := filepath.Join(dir, ".review-bot", "doc-map-real.yml")
if err := os.WriteFile(realDocmap, []byte("mappings: []\n"), 0o644); err != nil {
t.Fatalf("WriteFile: %v", err)
}
// Create a symlink inside the repo root that points to the real file
// (also inside the root).
symlinkPath := filepath.Join(dir, ".review-bot", "doc-map-link.yml")
if err := os.Symlink(realDocmap, symlinkPath); err != nil {
t.Skipf("cannot create symlink (platform may not support it): %v", err)
}
// Resolve dir to a symlink-free root, as runValidateDocmap does.
resolvedRoot, err := filepath.EvalSymlinks(dir)
if err != nil {
t.Fatalf("EvalSymlinks(dir): %v", err)
}
// In-repo symlink whose target is within root: must be accepted.
resolved, err := validateDocmapPath(symlinkPath, resolvedRoot)
if err != nil {
t.Fatalf("expected in-repo symlink to be accepted, got error: %v", err)
}
// The returned resolved path must be the real file (not the symlink entry).
// validateDocmapPath calls filepath.EvalSymlinks internally, so the returned
// path is always the fully-resolved real path — it can never equal the
// symlink entry itself.
if resolved == symlinkPath {
t.Errorf("expected resolved path to differ from symlink path")
}
}
+2 -2
View File
@@ -9,7 +9,7 @@ import (
"strings"
"time"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// runValidateURL implements the `review-bot validate-url <url>` subcommand.
@@ -114,7 +114,7 @@ func validateURL(rawURL string) error {
}
for _, a := range addrs {
if gitea.IsBlockedIP(a.IP) {
if netutil.IsBlockedIP(a.IP) {
return &validateError{
code: 1,
message: fmt.Sprintf("blocked: %q resolves to private/reserved IP %s", host, a.IP),
+7 -9
View File
@@ -83,9 +83,9 @@ type vcsCommitStatus struct {
// vcsReviewComment is an inline review comment.
type vcsReviewComment struct {
Path string
NewPosition int64 // Gitea: absolute line; GitHub: diff hunk position
Body string
Path string
NewLine int64 // absolute line number on the new (right) side of the diff, used by both Gitea and GitHub adapters
Body string
}
// vcsReview is a submitted PR review.
@@ -176,7 +176,7 @@ func (a *giteaVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, pa
func (a *giteaVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]gitea.ReviewComment, len(comments))
for i, c := range comments {
gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewPosition, Body: c.Body}
gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewLine, Body: c.Body}
}
r, err := a.c.PostReview(ctx, owner, repo, number, event, body, commitID, gc)
if err != nil {
@@ -311,14 +311,12 @@ func (a *githubVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, p
func (a *githubVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]github.ReviewComment, len(comments))
for i, c := range comments {
// GitHub inline comments use diff hunk "position", not absolute line numbers.
// NewPosition from gitea diff parsing gives absolute line numbers, which
// will not match GitHub's position values. For initial GitHub support, we
// attach comments with Line+Side (absolute line on the RIGHT side) instead.
// GitHub inline comments use Line+Side (absolute line on the RIGHT side).
// NewLine from diff parsing gives absolute new-file line numbers.
// Comments that cannot be mapped will be omitted (GitHub rejects invalid positions).
gc[i] = github.ReviewComment{
Path: c.Path,
Line: c.NewPosition,
Line: c.NewLine,
Side: "RIGHT",
Body: c.Body,
}
-82
View File
@@ -1,82 +0,0 @@
# Design: doc-map input for path-scoped design doc injection (Issue #137)
## Problem
review-bot can inject context via `patterns-repo` (external VCS repos) and `conventions-file`
(a single file from the reviewed repo). There is no mechanism to inject local repo documentation
files scoped to the paths changed in a PR.
First consumer: `grgl/gargoyle#778` needs a "doc adherence" reviewer that checks code against the
module's governing design doc, without injecting every doc in the tree.
## Approach
### New: `doc-map` input
A `.review-bot/doc-map.yml` config file in the reviewed repo maps source path globs to governing
design docs. review-bot reads the map, intersects it with changed PR paths, and injects only the
relevant docs into the system prompt.
### Config format
```yaml
mappings:
- paths:
- "lib/gargoyle/engine/signal_risk/**"
docs:
- docs/domain/contexts/risk/risk-controls.md
- paths:
- "lib/gargoyle/trading/**"
docs:
- docs/domain/contexts/trading/
```
- `paths` — glob patterns (including `**`) matched against changed file paths in the PR
- `docs` — file paths or directory paths (all `.md` files under a directory) to inject
- Docs are deduplicated across mappings
### Architecture
| Component | Description |
|-----------|-------------|
| `review/docmap.go` | YAML parsing, glob matching with `**` support, doc loading via VCS |
| `cmd/review-bot/main.go` | Step 6c: parses config, intersects with changed files, calls LoadMatchingDocs |
| `budget/budget.go` | New `DesignDocs` section — injected after Conventions in system prompt |
| `action.yml` | `doc-map` and `doc-map-max-bytes` inputs, wired to `DOC_MAP_FILE`/`DOC_MAP_MAX_BYTES` |
### Doc file loading
- The `doc-map` YAML file is read from the local workspace (like `system-prompt-file`).
- Doc files listed in the config are fetched via VCS API (same as `conventions-file`),
enabling them to be loaded from any branch without a local checkout.
- `GetAllFilesInPath` is tried first; if it returns files, they are treated as a directory listing.
If it returns empty, `GetFileContent` is tried as a fallback (single file).
### Glob matching
`**` is implemented by splitting patterns and paths on `/`, then matching segment-by-segment.
A `**` segment consumes zero or more path segments (not just one level like `*`).
### Budget integration
`DesignDocs` is added to `budget.Sections` between `Conventions` and `FileContext`.
Trim order: Patterns → Conventions → DesignDocs → FileContext → Diff.
Design docs appear in the system prompt under `## Design Documents`.
### Context size guard
Default: 100 KB. Configurable via `--doc-map-max-bytes` / `DOC_MAP_MAX_BYTES`.
Truncation is noted inline with a `⚠️` message.
## Error handling
| Situation | Behavior |
|-----------|----------|
| `--doc-map` file not found | Fatal error (like `--system-prompt-file`) |
| `--doc-map` file invalid YAML | Fatal error with descriptive message |
| Unknown YAML keys | Log warning, continue |
| Doc file not found in VCS | Log warning, skip |
| Doc directory empty or no `.md` files | Log debug, skip |
| Total size exceeds limit | Truncate with notice, log warning |
| No changed paths match any mapping | No docs injected, review runs normally |
| `paths` or `docs` list empty in a mapping | Skip that mapping |
-334
View File
@@ -1,334 +0,0 @@
# Design: Role-based Review Personas (Issue #51)
> **Note:** This design was revised during implementation to use JSON instead of YAML
> to maintain the repository's zero-external-dependencies convention. All persona
> files use JSON format. See "Design Revision" section at the end for details.
## Problem
Current review-bot performs generic code review. Every reviewer (regardless of `reviewer-name`) uses the same base prompt and evaluates the same concerns. This leads to:
1. **Redundancy** — Two reviewers (e.g., GPT + Claude twins) often flag identical issues
2. **Gaps** — Generic reviewers miss specialized concerns (security, domain logic, architecture)
3. **Noise** — NITs about style mixed with critical security findings
4. **No ownership** — Findings lack clear domain attribution
## Constraints
- Must work with existing CLI flags and CI workflow patterns
- Must not break backwards compatibility (existing configs still work)
- Must integrate cleanly with the budget system (personas add to context)
- Multiple personas running in parallel must not interfere with each other
- Each persona must have clear scope boundaries (no duplication)
## Proposed Approach
### 1. Persona Definition
A persona is a named review role with:
- **Identity** — Who am I? What's my expertise?
- **Focus** — What do I look for?
- **Scope boundaries** — What do I explicitly NOT comment on?
- **Severity calibration** — What counts as MAJOR/MINOR/NIT for MY domain?
Personas are defined in JSON files that can live:
1. In the pattern repos (shared across projects)
2. In the target repo (project-specific personas)
3. Inline via a new `--persona-file` flag (JSON format)
### 2. Persona File Format
```json
# .review/personas/security.yaml
name: security
display_name: Security Specialist
model_preference: opus # optional hint for expensive analysis
identity: |
You are a security specialist reviewing code for vulnerabilities.
Your expertise: OWASP Top 10, injection attacks, auth/authz, secrets management,
event sourcing security (replay attacks, event injection).
focus:
- Injection attacks (SQL, command, path traversal, template)
- Authentication and authorization gaps
- Secrets exposure (hardcoded credentials, tokens in logs)
- Input validation (unsanitized input, unsafe deserialization)
- Race conditions with security implications
- Event sourcing attack vectors
ignore:
- Code style and naming conventions
- Performance (unless security-related)
- Documentation
- General code quality
- Test coverage
severity:
critical: "Remote code execution, auth bypass, data exfiltration"
major: "Privilege escalation, information disclosure, DoS"
minor: "Missing rate limiting, verbose errors"
nit: "Theoretical risk with low exploitability"
output_format: |
For each finding:
- Severity: [CRITICAL|MAJOR|MINOR|NIT]
- Attack vector: How could this be exploited?
- Evidence: Code snippet showing the vulnerability
- Recommendation: Specific fix
```
### 3. New CLI Flags
```
--persona-file PATH Path to persona JSON file (local or in repo)
--persona NAME Built-in persona name (security, architect, domain)
```
Either flag sets the persona. If neither is provided, behavior is unchanged (generic review).
### 4. Prompt Assembly
Current flow:
```
SystemBase → Patterns → Conventions → [LLM]
```
New flow with persona:
```
PersonaPrompt (from YAML) → Patterns (filtered?) → Conventions → [LLM]
```
The persona's identity/focus/ignore/severity sections become the system prompt, replacing the generic "You are an expert code reviewer" base.
### 5. Built-in Personas
Ship with these built-in personas (loadable via `--persona NAME`):
| Name | Focus |
|------|-------|
| `security` | Vulnerabilities, auth, secrets |
| `architect` | Patterns, consistency, design |
| `domain` | Business logic (requires repo-specific config) |
| `docs` | Documentation, API clarity |
Built-in personas live in `review/personas/` as embedded Go assets or YAML shipped with the binary.
### 6. CI Workflow Integration
Single persona:
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: security
persona: security
...
```
Multiple personas (parallel jobs):
```yaml
jobs:
review:
strategy:
matrix:
include:
- name: security
persona: security
- name: architect
persona: architect
steps:
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: ${{ matrix.name }}
persona: ${{ matrix.persona }}
```
Custom persona from repo:
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: trading
persona-file: .review/personas/trading.yaml
```
### 7. Persona + Patterns Interaction
Some personas benefit from filtered patterns:
- Security → only security-related patterns
- Architect → all patterns (structural focus)
- Domain → domain docs, not language patterns
For v1, keep it simple: all patterns are included regardless of persona. Future enhancement could add `patterns_filter` to persona YAML.
### 8. Output Format Changes
Persona name appears in the review header:
```markdown
# Security Review
## Summary
No critical vulnerabilities found in this change.
## Findings
| # | Severity | File | Line | Finding |
...
## Recommendation
**APPROVE** — No security-relevant issues detected.
---
*Review by security*
<!-- review-bot:security -->
```
## State/Data Model
### Persona struct
```go
// review/persona.go
type Persona struct {
Name string `yaml:"name"`
DisplayName string `yaml:"display_name"`
ModelPref string `yaml:"model_preference,omitempty"`
Identity string `yaml:"identity"`
Focus []string `yaml:"focus"`
Ignore []string `yaml:"ignore"`
Severity Severity `yaml:"severity"`
OutputFormat string `yaml:"output_format,omitempty"`
}
type Severity struct {
Critical string `yaml:"critical"`
Major string `yaml:"major"`
Minor string `yaml:"minor"`
Nit string `yaml:"nit"`
}
```
### Loading precedence
1. `--persona-file PATH` → load from local file system
2. `--persona NAME` → load from embedded built-ins
3. Neither → use generic system prompt (current behavior)
## Error Cases
| Error | Handling |
|-------|----------|
| Persona file not found | Fatal exit with clear message |
| Invalid YAML in persona file | Fatal exit with parse error |
| Both `--persona` and `--persona-file` specified | Fatal exit: mutually exclusive |
| Unknown built-in persona name | Fatal exit with list of valid names |
| Empty identity in persona | Warning, fall back to generic prompt |
## Edge Cases
- **Empty focus list**: Valid — persona relies on identity alone
- **Empty ignore list**: Valid — no explicit scope exclusions
- **No severity section**: Use default MAJOR/MINOR/NIT definitions
- **Model preference set but budget insufficient**: Ignore preference, log warning
- **Persona file in pattern repo**: Fetch like other pattern files
## Testing Strategy
### Unit tests
- `persona_test.go`: Parse valid/invalid YAML, validate required fields
- `prompt_test.go`: Verify persona prompt assembly
- Integration with budget: persona prompts count toward token limit
### Integration tests
- End-to-end with `--persona security` (built-in)
- End-to-end with `--persona-file custom.yaml`
- Backwards compatibility: no flags = generic behavior
### Manual verification
- Run security persona on a PR with obvious vulnerability
- Verify security persona ignores style issues
- Verify non-security persona doesn't flag security issues
## Implementation Phases
### Phase 1: Persona types and loading
- [ ] `review/persona.go`: Persona struct + YAML parsing
- [ ] `review/persona_test.go`: Unit tests
- [ ] Embed built-in personas in binary
- [ ] Compiles clean, tests pass
### Phase 2: Prompt generation
- [ ] `review/prompt.go`: `BuildPersonaPrompt(p Persona) string`
- [ ] Modify `BuildSystemBase()` to accept optional persona
- [ ] Integrate persona prompt with budget system
- [ ] Tests for prompt assembly
### Phase 3: CLI integration
- [ ] Add `--persona` and `--persona-file` flags
- [ ] Flag validation (mutually exclusive, valid names)
- [ ] Load persona based on flags
- [ ] Pass persona to prompt builder
### Phase 4: Action integration
- [ ] Add `persona` and `persona-file` inputs to action.yml
- [ ] Update README with persona examples
- [ ] End-to-end CI test
### Phase 5: Built-in personas
- [ ] `security.yaml` built-in
- [ ] `architect.yaml` built-in
- [ ] `docs.yaml` built-in
- [ ] Document each persona's focus
## Open Questions
1. **Persona file location in repo**: Should we support `--persona-file .review/security.yaml` where the file is fetched from the PR's repo (like conventions)? This adds complexity but enables project-specific personas without action changes.
2. **Model preference enforcement**: If persona specifies `model_preference: opus` but the action uses a different model, should we warn? Override? Ignore? Current thinking: log warning, use the specified model (user controls model via action input).
3. **Severity override output**: If persona defines custom severity levels (CRITICAL), should the JSON output include them, or map back to standard MAJOR/MINOR/NIT? Current thinking: keep standard output format, use severity calibration only for prompt guidance.
## Completion Checklist
1. Persona struct matches YAML schema exactly?
2. Built-in personas embedded in binary (not external files)?
3. `--persona` and `--persona-file` are mutually exclusive?
4. Unknown persona name produces clear error with valid options?
5. Empty persona file fields have sensible defaults?
6. Persona prompt integrates with budget system (token counting)?
7. Backwards compatibility: no flags = current behavior?
8. Review header shows persona display name?
9. Sentinel still uses reviewer-name (not persona name)?
10. Unit tests cover parse errors, missing fields, valid YAML?
## Design Review Findings (Self-Review)
### Finding 1: Severity Mapping
The persona YAML allows `critical` severity, but the LLM output parser (`review/parser.go`) only accepts MAJOR/MINOR/NIT.
**Resolution:** Keep standard output format. Persona severity section is ONLY for calibrating the LLM's judgment (prompt guidance). Output must still use MAJOR/MINOR/NIT. Document this clearly in persona format docs.
### Finding 2: Embedding Built-in Personas
Go doesn't natively embed YAML. Must use `//go:embed` directive (Go 1.16+).
**Resolution:** Create `review/personas/` directory with YAML files and use:
```go
//go:embed personas/*.yaml
var embeddedPersonas embed.FS
```
### Finding 3: display_name vs reviewer-name
Design says header shows "persona display name" but sentinel uses "reviewer-name". This is correct - they serve different purposes:
- `display_name` → human-readable header ("Security Specialist Review")
- `reviewer-name` → machine sentinel for cleanup (`<!-- review-bot:security -->`)
When persona is used, `display_name` takes precedence for the header title, but `reviewer-name` (CLI flag) is still used for the sentinel.
## Design Revision: YAML with gopkg.in/yaml.v3
**Decision:** Add `gopkg.in/yaml.v3` as a dependency.
YAML is preferred over JSON for persona files because:
- Multi-line strings are cleaner (no escaping quotes in identity/focus text)
- Comments are supported for documentation
- More human-readable for complex persona definitions
The implementation supports both YAML (`.yaml`, `.yml`) and JSON (`.json`) for backwards compatibility, with YAML as the default for built-in personas.
-87
View File
@@ -1,87 +0,0 @@
# Design: YAML Support for Persona Files (#57)
## Problem
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
## Constraints
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
## Proposed Approach
1. **Update `parsePersona`** to detect format from file extension
2. **Add YAML parsing** with explicit depth limit (defense in depth)
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
4. **Convert built-in personas** to YAML format
5. **Update embed directive** to include both formats
### File Extension Detection
```go
func parsePersona(data []byte, source string) (*Persona, error) {
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
if isYAML {
return parseYAML(data, source)
}
return parseJSON(data, source)
}
```
### YAML Parsing with Depth Protection
We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
`review/persona.go`) rather than relying on library decoder options. Key design
decisions:
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
- **Node-count limit:** Conservative overcounting bounds total validation work
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
## State/Data Model
No new state. Same `Persona` struct, just different parsing.
## Error Cases
| Error | Handling |
|-------|----------|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
## Edge Cases
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
## Testing Strategy
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
2. Unit tests for extension detection
3. Integration test for built-in personas (now YAML)
4. Backwards compat test: verify JSON still works for external files
## Completion Checklist
1. [ ] `go-yaml` dependency added at v1.16.0+
2. [ ] Extension detection uses case-insensitive comparison
3. [ ] YAML parse errors include source file name
4. [ ] JSON parsing still works for `.json` files
5. [ ] Built-in personas converted to YAML with readable multi-line strings
6. [ ] Embed directive updated to include `*.yaml`
7. [ ] Test for deeply nested YAML rejection
8. [ ] All existing tests pass
## Open Questions
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
+24 -1
View File
@@ -231,6 +231,8 @@ These are statically checked by `~/.openclaw/workspace/scripts/test/check-invari
| S6 | Active WIP does not cause early exit (only sets ACTIVE_WIP flag) |
| S7 | SPAWN:impl guarded by `ACTIVE_WIP == 0` check |
| S8 | No merge calls in any worker template |
| S9 | Zero close-PR API calls in dispatch script (`state=closed` does not appear) |
| S10 | No close-PR API calls in any worker template; every worker template contains `NEVER close a PR` |
---
@@ -263,9 +265,20 @@ Each worker receives a precise task description with substituted values:
Workers **always** remove the WIP label on completion and reply `NO_REPLY`.
### Worker Absolute Constraints
Every worker template begins with an `⛔ ABSOLUTE CONSTRAINTS` section containing these rules:
- **NEVER close a PR.** Never call `PATCH /pulls/{id}` with `state=closed`. Closing a PR requires human action. "Duplicate", "superseded", or "already done" are never a worker's call.
- **NEVER merge a PR.** Never call the merge API. Merging requires human approval.
- **NEVER use the gitea-aweiker token.** All API calls use the gitea-rodin token only.
- **NEVER act on a PR with active REQUEST_CHANGES.** Fix the findings first.
The first two constraints are statically enforced by `check-invariants.sh`: S1 and S9 cover the dispatch script (no merge, no close); S8 covers worker templates (no merge calls); S10 covers worker templates (no close calls, with NEVER-close text verified present in each). The remaining two constraints (token usage and REQUEST_CHANGES gate) are enforced by runtime logic.
---
## 9. Fixes for Issues #144 and #145
## 9. Fixes for Issues #144, #145, and #157
**Issue #144** (autonomous merge):
The dispatch script contains no merge API calls anywhere. The `~/.openclaw/workspace/scripts/test/check-invariants.sh`
@@ -276,3 +289,13 @@ Rule 2 is the **first** rule evaluated per PR. It cannot be skipped, reasoned pa
or bypassed. It is checked before CI, before self-review, before handoff. The check
uses latest-per-reviewer state, so a reviewer who re-approved after REQUEST_CHANGES
is correctly handled.
**Issue #157** (autonomous PR close):
Worker templates were missing an explicit constraint against closing PRs. The dispatch
script never had a close call, but workers could reason their way into calling
`PATCH /pulls/{id}` with `state=closed`. All worker templates now include
`NEVER close a PR` in their ABSOLUTE CONSTRAINTS section. Invariant S9 verifies
the dispatch script contains no close calls. Invariant S10 verifies
worker templates contain no close calls and each contains the NEVER-close text.
Regression tests in `dispatch.bats` statically verify all of these constraints.
+12 -81
View File
@@ -1,91 +1,22 @@
// Package gitea provides a client for the Gitea API.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
// ipcheck.go re-exports the IsBlockedIP function from internal/netutil for use
// by this package's safe dialer (client.go) and for backward compatibility with
// any callers that previously imported it from here.
//
// The implementation has moved to internal/netutil so it can be shared with the
// validate-url subcommand (cmd/review-bot/validateurl.go) without creating a
// dependency from VCS-generic code on the Gitea-specific package.
package gitea
import (
"fmt"
"net"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the validate-url subcommand and tests outside
// this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
// It delegates to internal/netutil.IsBlockedIP; see that function for the full
// list of blocked ranges and IPv6-mapped IPv4 normalization behavior.
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
return netutil.IsBlockedIP(ip)
}
+25 -132
View File
@@ -3,142 +3,35 @@ package gitea
import (
"net"
"testing"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
ip string
// TestIsBlockedIPForwarding verifies that gitea.IsBlockedIP correctly forwards
// to internal/netutil.IsBlockedIP. Full coverage of the blocking logic lives in
// internal/netutil/ipcheck_test.go.
func TestIsBlockedIPForwarding(t *testing.T) {
cases := []struct {
ip string
blocked bool
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
{"127.0.0.1", true}, // loopback — must be blocked
{"192.168.1.1", true}, // RFC1918 — must be blocked
{"8.8.8.8", false}, // public — must not be blocked
{"2001:4860:4860::8888", false}, // public IPv6 — must not be blocked
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
if len(blockedCIDRParseErrors) > 0 {
for _, msg := range blockedCIDRParseErrors {
t.Errorf("CIDR parse error: %s", msg)
for _, tc := range cases {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
got := IsBlockedIP(ip)
want := netutil.IsBlockedIP(ip)
if got != want {
t.Errorf("gitea.IsBlockedIP(%q) = %v, netutil.IsBlockedIP = %v: forwarding mismatch", tc.ip, got, want)
}
if got != tc.blocked {
t.Errorf("gitea.IsBlockedIP(%q) = %v, want %v", tc.ip, got, tc.blocked)
}
}
}
+97
View File
@@ -0,0 +1,97 @@
// Package netutil provides shared network utilities for review-bot.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
package netutil
import (
"fmt"
"net"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// BlockedCIDRParseErrors returns any errors encountered parsing the built-in
// CIDR list. In correct code this will always be empty; tests assert it is.
func BlockedCIDRParseErrors() []string {
return blockedCIDRParseErrors
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the gitea package's safe dialer, the validate-url
// subcommand, and tests outside this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
}
+142
View File
@@ -0,0 +1,142 @@
package netutil
import (
"net"
"testing"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
ip string
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
for _, msg := range BlockedCIDRParseErrors() {
t.Errorf("CIDR parse error: %s", msg)
}
}
+18 -2
View File
@@ -52,15 +52,31 @@ func ParseDocMapConfig(localPath string) (*DocMapConfig, error) {
if err != nil {
return nil, fmt.Errorf("read doc-map file %q: %w", localPath, err)
}
return parseDocMapBytes(data, localPath)
}
// ParseDocMapConfigContent parses a doc-map YAML config from an in-memory
// string. The source parameter is used only for error messages and log entries
// (e.g. "owner/repo@main:.review-bot/doc-map.yml").
//
// Use this when the config content has been fetched from a trusted VCS ref
// rather than read from the local workspace.
func ParseDocMapConfigContent(content, source string) (*DocMapConfig, error) {
data := []byte(content)
return parseDocMapBytes(data, source)
}
// parseDocMapBytes is the shared YAML parse implementation used by
// ParseDocMapConfig and ParseDocMapConfigContent.
func parseDocMapBytes(data []byte, source string) (*DocMapConfig, error) {
var cfg DocMapConfig
if err := yaml.UnmarshalWithOptions(data, &cfg, yaml.Strict()); err != nil {
// Re-parse without strict mode to log which keys are unknown.
var relaxed DocMapConfig
if err2 := yaml.Unmarshal(data, &relaxed); err2 != nil {
return nil, fmt.Errorf("parse doc-map YAML %q: %w", localPath, err)
return nil, fmt.Errorf("parse doc-map YAML %q: %w", source, err)
}
slog.Warn("doc-map YAML contains unknown keys (ignored)", "file", localPath, "error", err)
slog.Warn("doc-map YAML contains unknown keys (ignored)", "file", source, "error", err)
cfg = relaxed
}
return &cfg, nil
+60
View File
@@ -510,3 +510,63 @@ func TestFileCoveredByDocMap_EmptyConfig(t *testing.T) {
t.Error("expected false for empty config, got true")
}
}
// ============================================================
// ParseDocMapConfigContent
// ============================================================
func TestParseDocMapConfigContent_Valid(t *testing.T) {
content := `
mappings:
- paths:
- "lib/foo/**"
docs:
- docs/foo.md
`
cfg, err := ParseDocMapConfigContent(content, "owner/repo@main:.review-bot/doc-map.yml")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(cfg.Mappings) != 1 {
t.Fatalf("expected 1 mapping, got %d", len(cfg.Mappings))
}
if len(cfg.Mappings[0].Docs) != 1 || cfg.Mappings[0].Docs[0] != "docs/foo.md" {
t.Errorf("unexpected mapping: %+v", cfg.Mappings[0])
}
}
func TestParseDocMapConfigContent_EmptyContent(t *testing.T) {
cfg, err := ParseDocMapConfigContent("", "test-source")
if err != nil {
t.Fatalf("unexpected error for empty content: %v", err)
}
if len(cfg.Mappings) != 0 {
t.Errorf("expected 0 mappings for empty content, got %d", len(cfg.Mappings))
}
}
func TestParseDocMapConfigContent_InvalidYAML(t *testing.T) {
_, err := ParseDocMapConfigContent("mappings: [{{invalid", "test-source")
if err == nil {
t.Fatal("expected error for invalid YAML, got nil")
}
}
func TestParseDocMapConfigContent_UnknownKeys(t *testing.T) {
content := `
mappings:
- paths:
- "lib/**"
docs:
- docs/foo.md
unknown_top_level_key: "should be warned but not fatal"
`
// Unknown top-level keys produce a warning but not an error.
cfg, err := ParseDocMapConfigContent(content, "test-source")
if err != nil {
t.Fatalf("unexpected error for unknown keys: %v", err)
}
if len(cfg.Mappings) == 0 {
t.Error("expected mappings to be parsed despite unknown key")
}
}