Removing intermediate files

Merge pull request 'fix(#150 ): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass' (#152 ) from issue-150 into main
fix(#150): add EvalSymlinks to validateDocmapPath — close dir-symlink bypass
2026-05-19 02:15:06 +00:00 · 2026-05-18 19:09:30 +00:00 · 2026-05-15 16:23:11 -07:00 · 2026-05-15 16:23:07 -07:00 · 2026-05-15 16:11:15 -07:00 · 2026-05-15 16:10:42 -07:00
22 changed files with 840 additions and 1047 deletions
@@ -141,6 +141,16 @@ inputs:
    description: 'Maximum bytes of injected doc content from doc-map (default 102400 = 100KB)'
    required: false
    default: '102400'
+  doc-map-trusted-ref:
+    description: >-
+      Git ref (branch, tag, or SHA) from which to fetch the doc-map config file
+      via VCS API instead of reading it from the local workspace. Recommended
+      when using doc-map: set this to the default branch (e.g. 'main') so a
+      malicious PR cannot modify the doc-map config to inject arbitrary design
+      docs into the LLM prompt. When unset, the config is read from the local
+      workspace (the PR branch) with a security warning in the logs.
+    required: false
+    default: ''

 runs:
  using: 'composite'
@@ -487,6 +497,7 @@ runs:
      shell: bash
      env:
        VCS_URL: ${{ steps.version.outputs.server_url }}
+        VCS_TYPE: ${{ steps.version.outputs.vcs_type }}
        GITEA_REPO: ${{ inputs.repo || github.repository }}
        PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
        REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
@@ -506,6 +517,7 @@ runs:
        PERSONA_FILE: ${{ inputs.persona-file }}
        DOC_MAP_FILE: ${{ inputs.doc-map }}
        DOC_MAP_MAX_BYTES: ${{ inputs.doc-map-max-bytes }}
+        DOC_MAP_TRUSTED_REF: ${{ inputs.doc-map-trusted-ref }}
        AICORE_CLIENT_ID: ${{ inputs.aicore-client-id }}
        AICORE_CLIENT_SECRET: ${{ inputs.aicore-client-secret }}
        AICORE_AUTH_URL: ${{ inputs.aicore-auth-url }}
@@ -1,9 +1,20 @@
 # CHANGELOG

-## Unreleased
+## v0.4.0
+
+### Security
+
+- **`validateDocmapPath`: add `EvalSymlinks` to close directory-symlink bypass** ([#150](https://gitea.weiker.me/rodin/review-bot/issues/150)): The previous implementation used `os.Lstat` which only avoids following the *final* path component. An intermediate directory symlink (e.g. `.review-bot/` committed as a symlink to a directory outside the repo) would pass the path-confinement check because the textual path appeared within the repo root. `filepath.EvalSymlinks` is now called first, resolving all symlink components before the `filepath.Rel` confinement check. In-repo symlinks whose resolved targets also reside within the repo root are now allowed; out-of-repo targets are rejected by the confinement check.
+- **`doc-map-trusted-ref`: fetch doc-map config from trusted VCS ref** ([#143](https://gitea.weiker.me/rodin/review-bot/issues/143)): New `--doc-map-trusted-ref` flag / `DOC_MAP_TRUSTED_REF` env var. When set, the doc-map YAML config is fetched from the specified VCS ref (e.g. `main`) via API instead of being read from the local workspace (the PR branch checkout). This prevents a malicious PR from modifying `.review-bot/doc-map.yml` to inject arbitrary design docs into the LLM prompt. When unset, the local workspace is used with a security warning in the logs.
+
+### Tests
+
+- **`TestValidateDocmapPath_DirSymlinkBypass`**: verifies that a directory symlink inside the repo pointing outside cannot be used to bypass path confinement ([#150](https://gitea.weiker.me/rodin/review-bot/issues/150)).

 ### Added

+- **`doc-map-trusted-ref` input** (`--doc-map-trusted-ref` flag / `DOC_MAP_TRUSTED_REF` env var): Git ref (branch, tag, or SHA) from which to fetch the doc-map config via VCS API. Recommended for all `doc-map` users. Example: `doc-map-trusted-ref: main`. ([#143](https://gitea.weiker.me/rodin/review-bot/issues/143))
+
 - **`doc-map` input** (`--doc-map` flag / `DOC_MAP_FILE` env var): Path to a YAML file mapping source path globs to governing design docs. review-bot intersects the map with changed PR paths and injects matching docs into the system prompt under a `## Design Documents` heading. ([#137](https://gitea.weiker.me/rodin/review-bot/issues/137))
 - **`doc-map-max-bytes` input** (`--doc-map-max-bytes` flag / `DOC_MAP_MAX_BYTES` env var): Cap on total injected design doc content in bytes. Default: 102400 (100 KB). Prevents accidental context overflow when a PR touches many modules.
 - **`DesignDocs` budget section**: Design docs are included in the context budget and trimmed after conventions, before file context, if the total exceeds the model's context limit.
@@ -1,50 +0,0 @@
-# Dev Loop Health Check — 2026-05-15 03:33 UTC
-
-## Status: ✅ ACTIVE WORK COMPLETED
-
-### Test Results
- All packages: **PASS** ✅ (6/6, fresh -count=1 run)
- Build: ✅ successful
- Vet: ✅ clean
-
-### Coverage (current)
-
-| Package | Coverage |
-|---------|----------|
-| budget | 91.8% |
-| cmd/review-bot | 46.1% |
-| gitea | 85.2% |
-| github | 86.3% |
-| llm | 81.3% |
-| review | 92.0% |
-
-### PR #138 Status
-
- **Branch:** issue-137
- **Feature:** feat(#137): add doc-map input for path-scoped doc injection
- **Review status:** ✅ All 3 bots approved (sonnet, gpt, security)
- **Review findings addressed:**
-  - Fixed package comment collision in `review/docmap.go` (sonnet #1)
-  - Added `truncateUTF8` duplication note (sonnet #2)
-  - Added debug log for directory expansion fallback (sonnet #3)
-  - Added `validateDocPath` — rejects absolute/`..` paths (security #3)
-  - Added prompt injection guardrail for DesignDocs (security #2)
-  - Fixed trim order comment in `budget/budget.go` (gpt #1)
-  - Fixed `globMatch` comment to say `filepath.Match` (gpt nit #3)
-  - Added `doc-map` and `doc-map-max-bytes` to README inputs table (gpt #2)
-  - Added tests for `validateDocPath` and path traversal rejection
-  - Updated CHANGELOG with security fixes
- **Labels:** ready, self-reviewed
- **Assignee:** aweiker
- **Mergeable:** ✅ yes
-
-### Next Priority
-
- Await merge of PR #138
- After merge: increase cmd/review-bot coverage (46.1% → target 60%+)
- Issue #132+: PR Submission feature
- `github.Client.DismissReview` method referenced but missing — file issue
-
---
-
-_Dev-loop cycle complete at 03:33 UTC._
@@ -1,36 +0,0 @@
-=============================================================================
-REVIEW-BOT DEV LOOP STATUS — 2026-05-15 04:08 UTC
-=============================================================================
-
-OVERALL STATUS: ✅ PR OPEN
-
-Active Work:
-  - PR #140: test(#139): improve cmd/review-bot coverage 44.6% → 49.3%
-    State: open, labeled: ready, self-reviewed
-    Branch: issue-139
-
-Test Results (last full run, worktree):
-  - All 6 packages: PASS ✅
-  - Build: ✅ clean
-  - Vet: ✅ clean
-
-Coverage (post-change):
-  - cmd/review-bot: 49.3% (was 44.6%)
-  - review: 91.9%
-  - budget: 92.0%
-  - github: 86.3%
-  - gitea: 85.2%
-  - llm: 81.3%
-
-Repository (main):
-  - Branch: main (up to date with origin — 1e3d86b)
-  - Working tree: clean
-  - Open issues: 1 (#139, addressed by PR #140)
-  - Open PRs: 1 (#140, ready for review)
-
-System Health: ✅ GREEN
-  ✓ All tests passing
-  ✓ No warnings
-  ✓ PR ready for merge
-
-=============================================================================
@@ -1,175 +0,0 @@
-# Plan: Issue #125 — Rename GITEA_URL → VCS_URL
-
-## Problem
-
-The `GITEA_URL` environment variable (and `--gitea-url` flag) implies the binary only works with Gitea.
-Now that review-bot supports both Gitea and GitHub/GHES, this name is misleading.
-Renaming to `VCS_URL` makes the binary platform-agnostic in its interface.
-
-## Constraints
-
- Must not break existing users who already use `GITEA_URL` — need a fallback
- The CLI flag `--gitea-url` should also be updated to `--vcs-url` for consistency
- `INTEGRATION_GITEA_URL` in integration tests is a test-only env var, not the binary's interface; but should be updated for clarity
- The action YAML uses `GITEA_URL` as an internal shell variable in bash scripts — distinct from the env var passed to the binary
- All changes must compile and pass existing tests
-
-## Files Affected
-
-### Binary / Go source
-| File | Change |
-|------|--------|
-| `cmd/review-bot/main.go` | Rename `--gitea-url` → `--vcs-url`, add `VCS_URL` as primary, keep `GITEA_URL` fallback |
-| `cmd/review-bot/integration_test.go` | Rename `INTEGRATION_GITEA_URL` → `INTEGRATION_VCS_URL` (test-only, no external compat concern) |
-| `integration_test.go` | Same — rename `INTEGRATION_GITEA_URL` → `INTEGRATION_VCS_URL` |
-
-### Action YAML
-| File | Change |
-|------|--------|
-| `.gitea/actions/review/action.yml` | Rename input `gitea-url` → `vcs-url`; update env var passed to binary: `VCS_URL` instead of `GITEA_URL`; keep internal bash var as `GITEA_URL` (only used for release download, not passed to binary) |
-| `.gitea/workflows/ci.yml` | Rename `GITEA_URL` env var to `VCS_URL` in Run review step |
-
-### Documentation
-| File | Change |
-|------|--------|
-| `README.md` | Update CLI example, env var table entry |
-
-## Proposed Approach
-
-### 1. Backward-compatible env var lookup in main.go
-
-Replace:
-```go
-giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
-```
-
-With:
-```go
-giteaURL := flag.String("vcs-url", envOrDefaultFallback("VCS_URL", "GITEA_URL", ""), "VCS server URL (e.g. https://gitea.example.com)")
-```
-
-Add a helper:
-```go
-// envOrDefaultFallback reads primary env var; if empty, falls back to deprecated env var.
-func envOrDefaultFallback(primary, deprecated, defaultVal string) string {
-    if v := os.Getenv(primary); v != "" {
-        return v
-    }
-    if v := os.Getenv(deprecated); v != "" {
-        slog.Warn("deprecated env var in use; rename to " + primary, "old", deprecated, "new", primary)
-        return v
-    }
-    return defaultVal
-}
-```
-
-**Note:** This must be called AFTER `setupLogger` conceptually, but the flag default is evaluated at flag registration time. Since `setupLogger` runs before `flag.Parse()`, the slog.Warn will print correctly at runtime. We use `log.Printf` as a fallback if this proves problematic.
-
-Actually — flag defaults are evaluated at registration (line 57), before `setupLogger`. The warning won't go through slog. Two options:
- Use `log.Printf` for the deprecation warning (always visible)
- Move the fallback lookup to after `flag.Parse()`, checking if the parsed value is still empty
-
-**Decision:** Move fallback to a post-parse check. This is cleaner:
-```go
-vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL")
-flag.Parse()
-// Backward compat: fall back to deprecated GITEA_URL
-if *vcsURL == "" {
-    if v := os.Getenv("GITEA_URL"); v != "" {
-        slog.Warn("GITEA_URL is deprecated; use VCS_URL instead")
-        *vcsURL = v
-    }
-}
-```
-
-This is clean, idiomatic, and the warning goes through slog correctly.
-
-### 2. Keep `--gitea-url` as deprecated alias
-
-Add a hidden flag for backward compat:
-```go
-giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
-```
-
-Post-parse:
-```go
-if *vcsURL == "" && *giteaURLAlias != "" {
-    slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
-    *vcsURL = *giteaURLAlias
-}
-```
-
-### 3. Internal variable rename
-
-Rename `giteaURL` local variable → `vcsURL` throughout `main.go` for consistency.
-
-### 4. Error message update
-
-```go
-fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n")
-```
-
-### 5. Action YAML changes
-
-In `.gitea/actions/review/action.yml`:
- Input `gitea-url` → `vcs-url` (with same description, `required: false`, `default: ''`)
- Line 172: `GITEA_URL: ${{ inputs.gitea-url || github.server_url }}` → `VCS_URL: ${{ inputs.vcs-url || github.server_url }}`
- Lines 115, 140: internal bash vars `GITEA_URL=` are used for downloading binaries — NOT passed to the review-bot binary. Leave them as internal bash vars (they're scope-local in bash). These could be renamed to `SERVER_URL` or `BASE_URL` for local clarity, but renaming them isn't strictly required.
-
-In `.gitea/workflows/ci.yml`:
- Line 52: `GITEA_URL: ${{ github.server_url }}` → `VCS_URL: ${{ github.server_url }}`
-
-### 6. Integration test updates
-
-`INTEGRATION_GITEA_URL` → `INTEGRATION_VCS_URL` in both test files.
-
-### 7. README
-
- CLI example: `--gitea-url` → `--vcs-url`
- Env var table: `GITEA_URL` → `VCS_URL`, add note about `GITEA_URL` fallback
-
-## Backward Compatibility Summary
-
-| Old | New | Fallback? |
-|-----|-----|-----------|
-| `GITEA_URL` env var | `VCS_URL` | ✅ with deprecation warning |
-| `--gitea-url` flag | `--vcs-url` | ✅ with deprecation warning |
-| `gitea-url` action input | `vcs-url` | ⚠️ No (action version bump handles this) |
-| `INTEGRATION_GITEA_URL` | `INTEGRATION_VCS_URL` | N/A (test-only) |
-
-## Error Cases
-
- Both `VCS_URL` and `GITEA_URL` set: `VCS_URL` wins (primary takes precedence)
- Both `--vcs-url` and `--gitea-url` provided: `--vcs-url` wins
- Neither set: existing "missing required flags" error unchanged
-
-## Edge Cases
-
- `os.Getenv` returns "" for unset AND set-to-empty — consistent with existing behavior
- The `envOrDefault` helper is unchanged; we add `envOrDefaultFallback` for the one renamed var
-
-## Testing Strategy
-
- Existing unit tests pass unchanged (they don't test env var parsing directly)
- Integration tests updated to use new env var name
- Manual: `GITEA_URL=https://example.com ./review-bot --repo x --pr 1 ...` should print deprecation warning and proceed
- Manual: `VCS_URL=https://example.com ./review-bot ...` should work silently
-
-## Completion Checklist
-
-1. `VCS_URL` is read first; `GITEA_URL` is fallback with deprecation warning
-2. `--vcs-url` flag is primary; `--gitea-url` is deprecated alias with warning
-3. Error message references `--vcs-url` not `--gitea-url`
-4. `action.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
-5. `ci.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
-6. README updated in CLI example and env var table
-7. Integration tests use `INTEGRATION_VCS_URL`
-8. `go test ./...` passes
-9. `go vet ./...` passes
-10. `go build ./cmd/review-bot` succeeds
-
-## Open Questions
-
- Should the CLI flag `--gitea-url` be completely hidden from `--help` or just deprecated with a note? The issue doesn't specify. Decision: keep it visible but add "(deprecated: use --vcs-url)" to the description.
- Should action.yml also add `gitea-url` as a deprecated input alias? The issue says "Update the action to pass the new env var name" — no mention of backward compat for the action input. Decision: rename only, no alias (action users pin a version anyway).
- The bash-internal `GITEA_URL` variable in action.yml scripts (used for release download, not passed to binary) — rename for clarity? Decision: yes, rename to `BASE_URL` to avoid confusion with the env var.
@@ -210,6 +210,7 @@ AI Core handles OAuth token management and deployment discovery automatically. M
 | `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
 | `doc-map` | No | `""` | Path to a YAML file mapping source path globs to governing design docs |
 | `doc-map-max-bytes` | No | `102400` | Maximum bytes of injected doc content from doc-map (default 100KB) |
+| `doc-map-trusted-ref` | No | `""` | Git ref (e.g. `main`) to fetch the doc-map config from via VCS API instead of local workspace. **Recommended for security** — prevents a PR from modifying the doc-map config to inject arbitrary docs. |
 | `persona` | No | `""` | Built-in persona name (security, architect, docs) |
 | `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
 | `temperature` | No | `0` | LLM temperature (0 = server default) |
@@ -288,7 +289,7 @@ review-bot \
  --vcs-url https://gitea.example.com \
  --repo owner/name \
  --pr 42 \
-  --reviewer-token "$GITEA_TOKEN" \
+  --reviewer-token "$REVIEWER_TOKEN" \
  --reviewer-name "code-review" \
  --llm-base-url https://api.openai.com/v1 \
  --llm-api-key "$OPENAI_API_KEY" \
@@ -337,7 +338,8 @@ All flags have environment variable equivalents:
 | Flag | Env Var |
 |------|---------|
 | `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) |
-| `--repo` | `GITEA_REPO` |
+| `--vcs-type` | `VCS_TYPE` (auto-detected from URL if not set; `gitea` or `github`) |
+| `--repo` | `GITEA_REPO` (also accepted: set `GITEA_REPO` for Gitea; VCS-agnostic `REPO` coming) |
 | `--pr` | `PR_NUMBER` |
 | `--reviewer-token` | `REVIEWER_TOKEN` |
 | `--reviewer-name` | `REVIEWER_NAME` |
@@ -101,6 +101,7 @@ func main() {
 	aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")
 	docMapFile := flag.String("doc-map", envOrDefault("DOC_MAP_FILE", ""), "Path to YAML file mapping source path globs to governing design docs")
 	docMapMaxBytes := flag.Int("doc-map-max-bytes", envOrDefaultInt("DOC_MAP_MAX_BYTES", review.DefaultDocMapMaxBytes), "Maximum bytes of injected doc content (default 102400)")
+	docMapTrustedRef := flag.String("doc-map-trusted-ref", envOrDefault("DOC_MAP_TRUSTED_REF", ""), "Git ref (e.g. main) to fetch the doc-map config from via VCS API instead of local workspace. Recommended to prevent PR branch from controlling which docs are injected.")

 	flag.Parse()

@@ -173,6 +174,20 @@ func main() {
 		os.Exit(1)
 	}

+	// Early validation of filesystem-path flags (fail fast before network I/O).
+	// Skip local-path validation when --doc-map-trusted-ref is set: the flag
+	// value is used as a VCS API path, not a local filesystem path, and the
+	// file may not exist in the local checkout (sparse, PR-deleted, etc.).
+	var resolvedDocMapFile string
+	if *docMapFile != "" && *docMapTrustedRef == "" {
+		resolved, err := validateWorkspacePath(*docMapFile, "doc-map")
+		if err != nil {
+			slog.Error("invalid doc-map path", "error", err)
+			os.Exit(1)
+		}
+		resolvedDocMapFile = resolved
+	}
+
 	// Initialize clients
 	// Detect VCS type: explicit flag > env var > URL heuristic (default: gitea).
 	vcsType := envOrDefault("VCS_TYPE", "")
@@ -357,15 +372,45 @@ func main() {
 	// Step 6c: Load path-scoped design docs if doc-map specified
 	designDocs := ""
 	if *docMapFile != "" {
-		resolvedDocMap, err := validateWorkspacePath(*docMapFile, "doc-map")
-		if err != nil {
-			slog.Error("invalid doc-map path", "error", err)
-			os.Exit(1)
-		}
-		docMapCfg, err := review.ParseDocMapConfig(resolvedDocMap)
-		if err != nil {
-			slog.Error("failed to parse doc-map file", "file", *docMapFile, "error", err)
-			os.Exit(1)
+		var docMapCfg *review.DocMapConfig
+
+		if *docMapTrustedRef != "" {
+			// Fetch doc-map config from a trusted VCS ref (e.g. the default branch).
+			// This prevents a malicious PR from modifying the doc-map config to
+			// inject arbitrary docs into the LLM prompt.
+			slog.Info("doc-map: fetching config from trusted ref",
+				"path", *docMapFile,
+				"ref", *docMapTrustedRef)
+			content, fetchErr := vcs.GetFileContentRef(ctx, owner, repoName, *docMapFile, *docMapTrustedRef)
+			if fetchErr != nil {
+				slog.Error("doc-map: failed to fetch config from trusted ref",
+					"path", *docMapFile,
+					"ref", *docMapTrustedRef,
+					"error", fetchErr)
+				os.Exit(1)
+			}
+			source := fmt.Sprintf("%s/%s@%s:%s", owner, repoName, *docMapTrustedRef, *docMapFile)
+			var parseErr error
+			docMapCfg, parseErr = review.ParseDocMapConfigContent(content, source)
+			if parseErr != nil {
+				slog.Error("doc-map: failed to parse fetched config",
+					"source", source,
+					"error", parseErr)
+				os.Exit(1)
+			}
+		} else {
+			// Local workspace fallback — the doc-map is read from the PR branch checkout.
+			// SECURITY WARNING: a malicious PR can modify this file to inject arbitrary
+			// docs. Set --doc-map-trusted-ref (or DOC_MAP_TRUSTED_REF) to a trusted ref
+			// (e.g. "main") to fetch the config from the default branch instead.
+			slog.Warn("doc-map: loading config from local workspace (PR branch) — " +
+				"set --doc-map-trusted-ref to fetch from a trusted ref for security")
+			var parseErr error
+			docMapCfg, parseErr = review.ParseDocMapConfig(resolvedDocMapFile)
+			if parseErr != nil {
+				slog.Error("failed to parse doc-map file", "file", *docMapFile, "error", parseErr)
+				os.Exit(1)
+			}
 		}

 		// Collect changed file paths from the PR for intersection.
@@ -379,10 +424,11 @@ func main() {

 		if len(matchedDocs) > 0 {
 			docMapOpts := review.DocMapOptions{MaxBytes: *docMapMaxBytes}
-			designDocs, err = review.LoadMatchingDocs(ctx, vcs, owner, repoName, matchedDocs, docMapOpts)
-			if err != nil {
+			var loadErr error
+			designDocs, loadErr = review.LoadMatchingDocs(ctx, vcs, owner, repoName, matchedDocs, docMapOpts)
+			if loadErr != nil {
 				// Non-fatal: individual missing files are already warned; log and continue.
-				slog.Warn("doc-map: partial failure loading docs", "error", err)
+				slog.Warn("doc-map: partial failure loading docs", "error", loadErr)
 			}
 			if designDocs != "" {
 				slog.Info("doc-map: injected design docs", "matched", len(matchedDocs), "bytes", len(designDocs))
@@ -510,9 +556,9 @@ func main() {
 	for _, f := range result.Findings {
 		if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
 			inlineComments = append(inlineComments, vcsReviewComment{
-				Path:        f.File,
-				NewPosition: int64(f.Line),
-				Body:        fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
+				Path:    f.File,
+				NewLine: int64(f.Line),
+				Body:    fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
 			})
 		}
 	}
@@ -1510,3 +1510,125 @@ func TestMainSubprocess_DeprecatedGiteaURLEnv(t *testing.T) {
 		t.Errorf("expected deprecation warning for GITEA_URL, got: %s", out)
 	}
 }
+
+// TestMainSubprocess_InvalidDocMapPath confirms that --doc-map with a path traversal
+// attempt is rejected before any network I/O.
+func TestMainSubprocess_InvalidDocMapPath(t *testing.T) {
+	if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
+		flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
+		os.Args = []string{"review-bot",
+			"--vcs-url", "https://gitea.example.com",
+			"--repo", "owner/repo",
+			"--pr", "1",
+			"--reviewer-token", "tok",
+			"--llm-base-url", "https://api.example.com",
+			"--llm-api-key", "key",
+			"--llm-model", "gpt-4",
+			"--doc-map", "../../../etc/passwd",
+		}
+		main()
+		return
+	}
+
+	cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_InvalidDocMapPath")
+	// t.TempDir() is evaluated here in the outer process, producing a real directory
+	// that is passed as the GITHUB_WORKSPACE env var string to the subprocess.
+	cmd.Env = append(cleanEnv(),
+		"TEST_SUBPROCESS_MAIN=1",
+		"GITHUB_WORKSPACE="+t.TempDir(),
+	)
+	out, err := cmd.CombinedOutput()
+	if err == nil {
+		t.Fatal("expected non-zero exit with path traversal doc-map, got success")
+	}
+	output := string(out)
+	if !strings.Contains(output, "doc-map") {
+		t.Errorf("expected error mentioning doc-map, got: %s", output)
+	}
+	if !strings.Contains(output, "resolves outside workspace") {
+		t.Errorf("expected error about path traversal, got: %s", output)
+	}
+}
+
+// TestMainSubprocess_InvalidDocMapFile confirms that --doc-map with a nonexistent file
+// is rejected before any network I/O.
+func TestMainSubprocess_InvalidDocMapFile(t *testing.T) {
+	if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
+		flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
+		os.Args = []string{"review-bot",
+			"--vcs-url", "https://gitea.example.com",
+			"--repo", "owner/repo",
+			"--pr", "1",
+			"--reviewer-token", "tok",
+			"--llm-base-url", "https://api.example.com",
+			"--llm-api-key", "key",
+			"--llm-model", "gpt-4",
+			"--doc-map", "nonexistent.yml",
+		}
+		main()
+		return
+	}
+
+	cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_InvalidDocMapFile")
+	// t.TempDir() is evaluated here in the outer process, producing a real directory
+	// that is passed as the GITHUB_WORKSPACE env var string to the subprocess.
+	cmd.Env = append(cleanEnv(),
+		"TEST_SUBPROCESS_MAIN=1",
+		"GITHUB_WORKSPACE="+t.TempDir(),
+	)
+	out, err := cmd.CombinedOutput()
+	if err == nil {
+		t.Fatal("expected non-zero exit with nonexistent doc-map file, got success")
+	}
+	output := string(out)
+	if !strings.Contains(output, "doc-map") {
+		t.Errorf("expected error mentioning doc-map, got: %s", output)
+	}
+	if !strings.Contains(output, "failed to resolve") {
+		t.Errorf("expected error about failed resolution, got: %s", output)
+	}
+}
+
+// TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation confirms that
+// --doc-map-trusted-ref bypasses local filesystem validation for --doc-map.
+// When the trusted-ref flag is set, the doc-map value is used as a VCS API
+// path; a nonexistent local file must not cause an early exit before network I/O.
+func TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation(t *testing.T) {
+	if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
+		flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
+		os.Args = []string{"review-bot",
+			"--vcs-url", "https://gitea.example.com",
+			"--repo", "owner/repo",
+			"--pr", "1",
+			"--reviewer-token", "tok",
+			"--llm-base-url", "https://api.example.com",
+			"--llm-api-key", "key",
+			"--llm-model", "gpt-4",
+			"--doc-map", "nonexistent-local.yml",
+			"--doc-map-trusted-ref", "main",
+		}
+		main()
+		return
+	}
+
+	cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_DocMapTrustedRefSkipsLocalValidation")
+	cmd.Env = append(cleanEnv(),
+		"TEST_SUBPROCESS_MAIN=1",
+		"GITHUB_WORKSPACE="+t.TempDir(),
+	)
+	out, err := cmd.CombinedOutput()
+	output := string(out)
+
+	// The test must fail (network I/O or VCS API failure) but must NOT
+	// fail with the local filesystem validation error.
+	// "failed to resolve" would indicate the early validateWorkspacePath ran —
+	// that would be the bug this test is catching.
+	if strings.Contains(output, "failed to resolve") {
+		t.Errorf("--doc-map-trusted-ref should skip local path validation, but got filesystem error: %s", output)
+	}
+
+	// It must still exit non-zero (real VCS call to example.com will fail).
+	if err == nil {
+		t.Fatal("expected non-zero exit when VCS API is unreachable, got success")
+	}
+}
@@ -23,46 +23,60 @@ const maxDocmapBytes int64 = 10 * 1024 * 1024 // 10 MB
 //  1. The path resolves to a regular file within resolvedRoot (path
 //     confinement): prevents a PR-controlled --docmap from reading arbitrary
 //     host files via absolute paths or ".." traversal.
-//  2. The path is not a symlink: prevents denial-of-service via /dev/zero or
-//     information disclosure via symlinks that point outside the workspace.
+//  2. The resolved path is within resolvedRoot: in-repo file-level symlinks
+//     are allowed when their resolved target is still inside the root;
+//     symlinks that escape the root are rejected by the confinement check.
 //  3. The file does not exceed maxDocmapBytes: prevents memory exhaustion
 //     from an oversized but legitimately committed doc-map file.
 //
 // resolvedRoot must already be an absolute, symlink-free path (obtained from
 // filepath.Abs + filepath.EvalSymlinks).
-func validateDocmapPath(localPath, resolvedRoot string) error {
+func validateDocmapPath(localPath, resolvedRoot string) (string, error) {
 	// Resolve the docmap path to an absolute path.
 	absPath, err := filepath.Abs(localPath)
 	if err != nil {
-		return fmt.Errorf("cannot resolve path: %w", err)
+		return "", fmt.Errorf("cannot resolve path: %w", err)
 	}

-	// Lstat: do NOT follow symlinks. We want to inspect the entry itself.
-	fi, err := os.Lstat(absPath)
+	// Resolve ALL symlink components, not just the final one.
+	// os.Lstat only avoids following the *final* path component; intermediate
+	// directory symlinks are still followed. EvalSymlinks resolves every
+	// component, closing the directory-symlink bypass: a PR that commits
+	// .review-bot/ as a directory symlink pointing outside the repo would
+	// otherwise pass the filepath.Rel confinement check because the textual
+	// path is inside the root while the actual destination is not.
+	resolvedPath, err := filepath.EvalSymlinks(absPath)
 	if err != nil {
-		return fmt.Errorf("cannot stat file: %w", err)
+		return "", fmt.Errorf("cannot resolve path (symlink): %w", err)
 	}

-	// Reject symlinks outright — a symlink can point to /dev/zero or arbitrary
-	// host paths, bypassing both the confinement check and the size check.
-	if fi.Mode()&os.ModeSymlink != 0 {
-		return fmt.Errorf("symlinks are not allowed")
+	// Lstat the resolved path for size and existence checks — EvalSymlinks
+	// guarantees no symlink components remain, so ModeSymlink can never be set.
+	fi, err := os.Lstat(resolvedPath)
+	if err != nil {
+		return "", fmt.Errorf("cannot stat file: %w", err)
 	}

-	// Confine to resolvedRoot: the cleaned absolute path must be a descendant
-	// of the repo root. This catches paths that escaped via "..", absolute
-	// paths that happen to be outside the root, etc.
-	rel, err := filepath.Rel(resolvedRoot, absPath)
+	// Reject anything that is not a regular file (directories, FIFOs, device
+	// nodes, etc.) — ParseDocMapConfig expects a plain YAML file and would
+	// produce a confusing error on non-regular entries.
+	if !fi.Mode().IsRegular() {
+		return "", fmt.Errorf("docmap must be a regular file")
+	}
+
+	// Confine to resolvedRoot: use the fully-resolved path so that a directory
+	// symlink inside the repo cannot carry the path outside the root.
+	rel, err := filepath.Rel(resolvedRoot, resolvedPath)
 	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {
-		return fmt.Errorf("path must be within --repo-root")
+		return "", fmt.Errorf("path must be within --repo-root")
 	}

 	// Enforce size cap before reading to prevent memory exhaustion.
 	if fi.Size() > maxDocmapBytes {
-		return fmt.Errorf("file size %d bytes exceeds %d-byte limit", fi.Size(), maxDocmapBytes)
+		return "", fmt.Errorf("file size %d bytes exceeds %d-byte limit", fi.Size(), maxDocmapBytes)
 	}

-	return nil
+	return resolvedPath, nil
 }

 // runValidateDocmap implements the `review-bot validate-docmap` subcommand.
@@ -126,16 +140,59 @@ func runValidateDocmap(args []string) int {
 	// may reference a PR-controlled file (e.g. .review-bot/doc-map.yml).
 	// Validate that it:
 	//   1. Resolves within resolvedRoot (prevent reading arbitrary host files).
-	//   2. Is not a symlink (prevent /dev/zero or symlink-based host probing).
+	//   2. Resolved target stays within the root (in-repo symlinks are allowed
+	//      if they resolve to a path inside the root).
 	//   3. Does not exceed maxDocmapBytes (prevent memory exhaustion from an
 	//      oversized committed file).
-	if err := validateDocmapPath(*docmapFlag, resolvedRoot); err != nil {
+	// validateDocmapPath returns the resolved path; use it directly to
+	// eliminate any TOCTOU race between validation and use.
+	resolvedDocmap, err := validateDocmapPath(*docmapFlag, resolvedRoot)
+	if err != nil {
 		fmt.Fprintf(errWriter, "Error: --docmap %q is invalid: %v\n", *docmapFlag, err)
 		return 2
 	}

-	// Parse docmap YAML.
-	cfg, err := review.ParseDocMapConfig(*docmapFlag)
+	// Open and read the docmap with a LimitedReader — closes the residual TOCTOU
+	// window between the Lstat size check in validateDocmapPath and the file open
+	// here. The limit is maxDocmapBytes+1 so we can detect a file that grew past
+	// the cap after the stat without reading unbounded bytes.
+	//
+	// Defense-in-depth: stat the path immediately before and after open so we can
+	// detect a file swap between validateDocmapPath's validation and this open via
+	// os.SameFile. An attacker with workspace write access could otherwise replace
+	// the validated file with a symlink in the gap between validation and use.
+	preStat, err := os.Lstat(resolvedDocmap)
+	if err != nil {
+		fmt.Fprintf(errWriter, "Error: failed to stat docmap before open %q: %v\n", *docmapFlag, err)
+		return 2
+	}
+	f, err := os.Open(resolvedDocmap)
+	if err != nil {
+		fmt.Fprintf(errWriter, "Error: failed to open docmap %q: %v\n", *docmapFlag, err)
+		return 2
+	}
+	defer func() { _ = f.Close() }()
+	// Verify we opened the same file that was validated — rejects a swap between
+	// the pre-open Lstat and the open call.
+	postStat, err := f.Stat()
+	if err != nil {
+		fmt.Fprintf(errWriter, "Error: failed to stat open docmap %q: %v\n", *docmapFlag, err)
+		return 2
+	}
+	if !os.SameFile(preStat, postStat) {
+		fmt.Fprintf(errWriter, "Error: --docmap %q changed between validation and open\n", *docmapFlag)
+		return 2
+	}
+	docmapData, err := io.ReadAll(io.LimitReader(f, maxDocmapBytes+1))
+	if err != nil {
+		fmt.Fprintf(errWriter, "Error: failed to read docmap %q: %v\n", *docmapFlag, err)
+		return 2
+	}
+	if int64(len(docmapData)) > maxDocmapBytes {
+		fmt.Fprintf(errWriter, "Error: --docmap %q exceeded %d-byte limit after open\n", *docmapFlag, maxDocmapBytes)
+		return 2
+	}
+	cfg, err := review.ParseDocMapConfigContent(string(docmapData), *docmapFlag)
 	if err != nil {
 		fmt.Fprintf(errWriter, "Error: failed to parse docmap %q: %v\n", *docmapFlag, err)
 		return 2
@@ -160,6 +217,9 @@ func runValidateDocmap(args []string) int {
 		// Normalize Windows-style backslashes to forward slashes so that
 		// changed-file paths from git on Windows match doc-map globs.
 		f = strings.ReplaceAll(f, "\\", "/")
+		// Strip a leading "./" emitted by non-git tools (e.g. `find`) so that
+		// paths like "./cmd/foo.go" match doc-map globs written as "cmd/**".
+		f = strings.TrimPrefix(f, "./")
 		if !review.FileCoveredByDocMap(cfg, f) {
 			uncovered = append(uncovered, f)
 		}
@@ -178,7 +238,7 @@ func runValidateDocmap(args []string) int {
 	staleDocs := checkStaleDocs(cfg, resolvedRoot)
 	if len(staleDocs) > 0 {
 		failed = true
-		fmt.Fprintln(errWriter, "ERROR: stale docmap docs: entries (paths do not exist):")
+		fmt.Fprintln(errWriter, "ERROR: stale docmap entries (paths do not exist):")
 		for _, d := range staleDocs {
 			fmt.Fprintf(errWriter, "  %s\n", d)
 		}
@@ -465,23 +465,34 @@ mappings:
 }

 // TestValidateDocmapPath_Symlink verifies that --docmap pointing at a symlink
-// is rejected before the file is read (prevents /dev/zero DOS or arbitrary
-// host-file reads via PR-controlled symlinks).
+// whose resolved target is outside --repo-root is rejected (prevents reading
+// arbitrary host files via PR-controlled symlinks).
+//
+// Note: after the EvalSymlinks fix (issue #150), in-repo symlinks whose
+// targets also reside within the repo root are now allowed — the confinement
+// check is applied to the resolved path, not the symlink entry itself. The
+// security invariant is: the resolved destination must be within the root.
 func TestValidateDocmapPath_Symlink(t *testing.T) {
 	dir := t.TempDir()
+	outside := t.TempDir()

-	// Create a real docmap file to serve as the symlink target.
-	realDocmap := makeDocmapInDir(t, dir, `
-mappings:
-  - paths:
-      - "lib/**"
-    docs:
-      - docs/foo.md
-`)
+	// Create a docmap file OUTSIDE the repo root to serve as the symlink
+	// target. EvalSymlinks will resolve to this path, which the Rel check
+	// must then reject.
+	if err := os.MkdirAll(filepath.Join(outside, ".review-bot"), 0o755); err != nil {
+		t.Fatalf("MkdirAll: %v", err)
+	}
+	outsideDocmap := filepath.Join(outside, ".review-bot", "doc-map.yml")
+	if err := os.WriteFile(outsideDocmap, []byte("mappings: []\n"), 0o644); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}

-	// Create a symlink inside dir pointing to the real docmap.
+	// Create a symlink inside dir pointing to the file outside the repo.
+	if err := os.MkdirAll(filepath.Join(dir, ".review-bot"), 0o755); err != nil {
+		t.Fatalf("MkdirAll: %v", err)
+	}
 	symlinkPath := filepath.Join(dir, ".review-bot", "doc-map-link.yml")
-	if err := os.Symlink(realDocmap, symlinkPath); err != nil {
+	if err := os.Symlink(outsideDocmap, symlinkPath); err != nil {
 		t.Fatalf("Symlink: %v", err)
 	}

@@ -490,10 +501,10 @@ mappings:
 		[]string{"--docmap", symlinkPath, "--repo-root", dir},
 	)
 	if code != 2 {
-		t.Errorf("expected exit 2 for symlink docmap, got %d; stderr: %q", code, stderr)
+		t.Errorf("expected exit 2 for out-of-repo symlink docmap, got %d; stderr: %q", code, stderr)
 	}
-	if !strings.Contains(stderr, "symlink") && !strings.Contains(stderr, "invalid") {
-		t.Errorf("expected symlink rejection in stderr, got %q", stderr)
+	if !strings.Contains(stderr, "invalid") && !strings.Contains(stderr, "repo-root") {
+		t.Errorf("expected confinement rejection in stderr, got %q", stderr)
 	}
 }

@@ -550,3 +561,136 @@ func TestValidateDocmapPath_SizeLimit(t *testing.T) {
 		t.Errorf("expected size limit error in stderr, got %q", stderr)
 	}
 }
+
+// TestValidateDocmapPath_DirSymlinkBypass verifies that a directory-symlink
+// inside the repo pointing outside cannot be used to read arbitrary host files.
+//
+// Attack vector: a PR commits .review-bot/ as a directory symlink targeting a
+// directory outside the repo. The textual path of the docmap file is inside
+// the repo root, so the old Rel-only check passed — but the actual file is
+// outside. This is closed by calling EvalSymlinks on the full path before the
+// confinement check.
+func TestValidateDocmapPath_DirSymlinkBypass(t *testing.T) {
+	repoDir := t.TempDir()
+	outsideDir := t.TempDir()
+
+	// Secret file outside the repo.
+	secretPath := filepath.Join(outsideDir, "secret.yml")
+	if err := os.WriteFile(secretPath, []byte("mappings: []\n"), 0o644); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}
+
+	// Create .review-bot/ as a directory symlink pointing outside the repo.
+	reviewBotDir := filepath.Join(repoDir, ".review-bot")
+	if err := os.Symlink(outsideDir, reviewBotDir); err != nil {
+		t.Skipf("cannot create dir symlink (platform may not support it): %v", err)
+	}
+
+	// Textually inside repo — .review-bot/secret.yml — but resolves outside.
+	attackPath := filepath.Join(repoDir, ".review-bot", "secret.yml")
+
+	// Resolve repoDir to a symlink-free path, as runValidateDocmap does.
+	resolvedRoot, err := filepath.EvalSymlinks(repoDir)
+	if err != nil {
+		t.Fatalf("EvalSymlinks(repoDir): %v", err)
+	}
+
+	if _, err := validateDocmapPath(attackPath, resolvedRoot); err == nil {
+		t.Error("expected rejection of dir-symlink bypass, got nil error")
+	}
+}
+
+// TestValidateDocmapPath_NonRegularFile verifies that --docmap pointing at a
+// non-regular file (e.g. a directory) is rejected with a clear error before
+// ParseDocMapConfig is called.
+func TestValidateDocmapPath_NonRegularFile(t *testing.T) {
+	dir := t.TempDir()
+
+	// Use the directory itself as the docmap path — directories pass Lstat but
+	// are not regular files.
+	reviewBotDir := filepath.Join(dir, ".review-bot")
+	if err := os.MkdirAll(reviewBotDir, 0o755); err != nil {
+		t.Fatalf("MkdirAll: %v", err)
+	}
+
+	code, _, stderr := stdinValidateDocmap(t,
+		"",
+		[]string{"--docmap", reviewBotDir, "--repo-root", dir},
+	)
+	if code != 2 {
+		t.Errorf("expected exit 2 for directory docmap, got %d; stderr: %q", code, stderr)
+	}
+	if !strings.Contains(stderr, "regular file") && !strings.Contains(stderr, "invalid") {
+		t.Errorf("expected regular-file rejection in stderr, got %q", stderr)
+	}
+}
+
+// TestRunValidateDocmap_DotSlashPrefix verifies that paths emitted with a
+// leading "./" (e.g. from `find` or `ls`) match doc-map globs correctly.
+// Without TrimPrefix, "./cmd/foo.go" would not match the pattern "cmd/**".
+func TestRunValidateDocmap_DotSlashPrefix(t *testing.T) {
+	dir := t.TempDir()
+	makeDocFile(t, dir, "docs/foo.md")
+
+	docmap := makeDocmapInDir(t, dir, `
+mappings:
+  - paths:
+      - "cmd/**"
+    docs:
+      - docs/foo.md
+`)
+
+	// File with a leading "./" should be treated as covered.
+	code, _, stderr := stdinValidateDocmap(t,
+		"./cmd/foo.go\n",
+		[]string{"--docmap", docmap, "--repo-root", dir},
+	)
+	if code != 0 {
+		t.Errorf("expected exit 0 for './' prefixed covered file, got %d; stderr: %q", code, stderr)
+	}
+}
+
+// TestValidateDocmapPath_InRepoSymlinkAllowed verifies that an in-repo
+// file-level symlink whose resolved target is still within the repo root is
+// accepted. This is the positive case for the issue #150 behavioral change:
+// only symlinks that escape the root are rejected; intra-repo symlinks are
+// allowed because EvalSymlinks resolves the target and the confinement check
+// is applied to the resolved path, not the symlink entry itself.
+func TestValidateDocmapPath_InRepoSymlinkAllowed(t *testing.T) {
+	dir := t.TempDir()
+
+	// Create the real docmap file inside the repo root.
+	if err := os.MkdirAll(filepath.Join(dir, ".review-bot"), 0o755); err != nil {
+		t.Fatalf("MkdirAll: %v", err)
+	}
+	realDocmap := filepath.Join(dir, ".review-bot", "doc-map-real.yml")
+	if err := os.WriteFile(realDocmap, []byte("mappings: []\n"), 0o644); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}
+
+	// Create a symlink inside the repo root that points to the real file
+	// (also inside the root).
+	symlinkPath := filepath.Join(dir, ".review-bot", "doc-map-link.yml")
+	if err := os.Symlink(realDocmap, symlinkPath); err != nil {
+		t.Skipf("cannot create symlink (platform may not support it): %v", err)
+	}
+
+	// Resolve dir to a symlink-free root, as runValidateDocmap does.
+	resolvedRoot, err := filepath.EvalSymlinks(dir)
+	if err != nil {
+		t.Fatalf("EvalSymlinks(dir): %v", err)
+	}
+
+	// In-repo symlink whose target is within root: must be accepted.
+	resolved, err := validateDocmapPath(symlinkPath, resolvedRoot)
+	if err != nil {
+		t.Fatalf("expected in-repo symlink to be accepted, got error: %v", err)
+	}
+	// The returned resolved path must be the real file (not the symlink entry).
+	// validateDocmapPath calls filepath.EvalSymlinks internally, so the returned
+	// path is always the fully-resolved real path — it can never equal the
+	// symlink entry itself.
+	if resolved == symlinkPath {
+		t.Errorf("expected resolved path to differ from symlink path")
+	}
+}
@@ -9,7 +9,7 @@ import (
 	"strings"
 	"time"

-	"gitea.weiker.me/rodin/review-bot/gitea"
+	"gitea.weiker.me/rodin/review-bot/internal/netutil"
 )

 // runValidateURL implements the `review-bot validate-url <url>` subcommand.
@@ -114,7 +114,7 @@ func validateURL(rawURL string) error {
 	}

 	for _, a := range addrs {
-		if gitea.IsBlockedIP(a.IP) {
+		if netutil.IsBlockedIP(a.IP) {
 			return &validateError{
 				code:    1,
 				message: fmt.Sprintf("blocked: %q resolves to private/reserved IP %s", host, a.IP),
@@ -83,9 +83,9 @@ type vcsCommitStatus struct {

 // vcsReviewComment is an inline review comment.
 type vcsReviewComment struct {
-	Path        string
-	NewPosition int64 // Gitea: absolute line; GitHub: diff hunk position
-	Body        string
+	Path    string
+	NewLine int64 // absolute line number on the new (right) side of the diff, used by both Gitea and GitHub adapters
+	Body    string
 }

 // vcsReview is a submitted PR review.
@@ -176,7 +176,7 @@ func (a *giteaVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, pa
 func (a *giteaVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
 	gc := make([]gitea.ReviewComment, len(comments))
 	for i, c := range comments {
-		gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewPosition, Body: c.Body}
+		gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewLine, Body: c.Body}
 	}
 	r, err := a.c.PostReview(ctx, owner, repo, number, event, body, commitID, gc)
 	if err != nil {
@@ -311,14 +311,12 @@ func (a *githubVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, p
 func (a *githubVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
 	gc := make([]github.ReviewComment, len(comments))
 	for i, c := range comments {
-		// GitHub inline comments use diff hunk "position", not absolute line numbers.
-		// NewPosition from gitea diff parsing gives absolute line numbers, which
-		// will not match GitHub's position values. For initial GitHub support, we
-		// attach comments with Line+Side (absolute line on the RIGHT side) instead.
+		// GitHub inline comments use Line+Side (absolute line on the RIGHT side).
+		// NewLine from diff parsing gives absolute new-file line numbers.
 		// Comments that cannot be mapped will be omitted (GitHub rejects invalid positions).
 		gc[i] = github.ReviewComment{
 			Path: c.Path,
-			Line: c.NewPosition,
+			Line: c.NewLine,
 			Side: "RIGHT",
 			Body: c.Body,
 		}
@@ -1,82 +0,0 @@
-# Design: doc-map input for path-scoped design doc injection (Issue #137)
-
-## Problem
-
-review-bot can inject context via `patterns-repo` (external VCS repos) and `conventions-file`
-(a single file from the reviewed repo). There is no mechanism to inject local repo documentation
-files scoped to the paths changed in a PR.
-
-First consumer: `grgl/gargoyle#778` needs a "doc adherence" reviewer that checks code against the
-module's governing design doc, without injecting every doc in the tree.
-
-## Approach
-
-### New: `doc-map` input
-
-A `.review-bot/doc-map.yml` config file in the reviewed repo maps source path globs to governing
-design docs. review-bot reads the map, intersects it with changed PR paths, and injects only the
-relevant docs into the system prompt.
-
-### Config format
-
-```yaml
-mappings:
-  - paths:
-      - "lib/gargoyle/engine/signal_risk/**"
-    docs:
-      - docs/domain/contexts/risk/risk-controls.md
-  - paths:
-      - "lib/gargoyle/trading/**"
-    docs:
-      - docs/domain/contexts/trading/
-```
-
- `paths` — glob patterns (including `**`) matched against changed file paths in the PR
- `docs` — file paths or directory paths (all `.md` files under a directory) to inject
- Docs are deduplicated across mappings
-
-### Architecture
-
-| Component | Description |
-|-----------|-------------|
-| `review/docmap.go` | YAML parsing, glob matching with `**` support, doc loading via VCS |
-| `cmd/review-bot/main.go` | Step 6c: parses config, intersects with changed files, calls LoadMatchingDocs |
-| `budget/budget.go` | New `DesignDocs` section — injected after Conventions in system prompt |
-| `action.yml` | `doc-map` and `doc-map-max-bytes` inputs, wired to `DOC_MAP_FILE`/`DOC_MAP_MAX_BYTES` |
-
-### Doc file loading
-
- The `doc-map` YAML file is read from the local workspace (like `system-prompt-file`).
- Doc files listed in the config are fetched via VCS API (same as `conventions-file`),
-  enabling them to be loaded from any branch without a local checkout.
- `GetAllFilesInPath` is tried first; if it returns files, they are treated as a directory listing.
-  If it returns empty, `GetFileContent` is tried as a fallback (single file).
-
-### Glob matching
-
-`**` is implemented by splitting patterns and paths on `/`, then matching segment-by-segment.
-A `**` segment consumes zero or more path segments (not just one level like `*`).
-
-### Budget integration
-
-`DesignDocs` is added to `budget.Sections` between `Conventions` and `FileContext`.
-Trim order: Patterns → Conventions → DesignDocs → FileContext → Diff.
-Design docs appear in the system prompt under `## Design Documents`.
-
-### Context size guard
-
-Default: 100 KB. Configurable via `--doc-map-max-bytes` / `DOC_MAP_MAX_BYTES`.
-Truncation is noted inline with a `⚠️` message.
-
-## Error handling
-
-| Situation | Behavior |
-|-----------|----------|
-| `--doc-map` file not found | Fatal error (like `--system-prompt-file`) |
-| `--doc-map` file invalid YAML | Fatal error with descriptive message |
-| Unknown YAML keys | Log warning, continue |
-| Doc file not found in VCS | Log warning, skip |
-| Doc directory empty or no `.md` files | Log debug, skip |
-| Total size exceeds limit | Truncate with notice, log warning |
-| No changed paths match any mapping | No docs injected, review runs normally |
-| `paths` or `docs` list empty in a mapping | Skip that mapping |
@@ -1,334 +0,0 @@
-# Design: Role-based Review Personas (Issue #51)
-
-> **Note:** This design was revised during implementation to use JSON instead of YAML
-> to maintain the repository's zero-external-dependencies convention. All persona
-> files use JSON format. See "Design Revision" section at the end for details.
-
-## Problem
-
-Current review-bot performs generic code review. Every reviewer (regardless of `reviewer-name`) uses the same base prompt and evaluates the same concerns. This leads to:
-
-1. **Redundancy** — Two reviewers (e.g., GPT + Claude twins) often flag identical issues
-2. **Gaps** — Generic reviewers miss specialized concerns (security, domain logic, architecture)
-3. **Noise** — NITs about style mixed with critical security findings
-4. **No ownership** — Findings lack clear domain attribution
-
-## Constraints
-
- Must work with existing CLI flags and CI workflow patterns
- Must not break backwards compatibility (existing configs still work)
- Must integrate cleanly with the budget system (personas add to context)
- Multiple personas running in parallel must not interfere with each other
- Each persona must have clear scope boundaries (no duplication)
-
-## Proposed Approach
-
-### 1. Persona Definition
-
-A persona is a named review role with:
- **Identity** — Who am I? What's my expertise?
- **Focus** — What do I look for?
- **Scope boundaries** — What do I explicitly NOT comment on?
- **Severity calibration** — What counts as MAJOR/MINOR/NIT for MY domain?
-
-Personas are defined in JSON files that can live:
-1. In the pattern repos (shared across projects)
-2. In the target repo (project-specific personas)
-3. Inline via a new `--persona-file` flag (JSON format)
-
-### 2. Persona File Format
-
-```json
-# .review/personas/security.yaml
-name: security
-display_name: Security Specialist
-model_preference: opus  # optional hint for expensive analysis
-
-identity: |
-  You are a security specialist reviewing code for vulnerabilities.
-  Your expertise: OWASP Top 10, injection attacks, auth/authz, secrets management,
-  event sourcing security (replay attacks, event injection).
-
-focus:
-  - Injection attacks (SQL, command, path traversal, template)
-  - Authentication and authorization gaps
-  - Secrets exposure (hardcoded credentials, tokens in logs)
-  - Input validation (unsanitized input, unsafe deserialization)
-  - Race conditions with security implications
-  - Event sourcing attack vectors
-
-ignore:
-  - Code style and naming conventions
-  - Performance (unless security-related)
-  - Documentation
-  - General code quality
-  - Test coverage
-
-severity:
-  critical: "Remote code execution, auth bypass, data exfiltration"
-  major: "Privilege escalation, information disclosure, DoS"
-  minor: "Missing rate limiting, verbose errors"
-  nit: "Theoretical risk with low exploitability"
-
-output_format: |
-  For each finding:
-  - Severity: [CRITICAL|MAJOR|MINOR|NIT]
-  - Attack vector: How could this be exploited?
-  - Evidence: Code snippet showing the vulnerability
-  - Recommendation: Specific fix
-```
-
-### 3. New CLI Flags
-
-```
--persona-file PATH      Path to persona JSON file (local or in repo)
--persona NAME           Built-in persona name (security, architect, domain)
-```
-
-Either flag sets the persona. If neither is provided, behavior is unchanged (generic review).
-
-### 4. Prompt Assembly
-
-Current flow:
-```
-SystemBase → Patterns → Conventions → [LLM]
-```
-
-New flow with persona:
-```
-PersonaPrompt (from YAML) → Patterns (filtered?) → Conventions → [LLM]
-```
-
-The persona's identity/focus/ignore/severity sections become the system prompt, replacing the generic "You are an expert code reviewer" base.
-
-### 5. Built-in Personas
-
-Ship with these built-in personas (loadable via `--persona NAME`):
-
-| Name | Focus |
-|------|-------|
-| `security` | Vulnerabilities, auth, secrets |
-| `architect` | Patterns, consistency, design |
-| `domain` | Business logic (requires repo-specific config) |
-| `docs` | Documentation, API clarity |
-
-Built-in personas live in `review/personas/` as embedded Go assets or YAML shipped with the binary.
-
-### 6. CI Workflow Integration
-
-Single persona:
-```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
-  with:
-    reviewer-name: security
-    persona: security
-    ...
-```
-
-Multiple personas (parallel jobs):
-```yaml
-jobs:
-  review:
-    strategy:
-      matrix:
-        include:
-          - name: security
-            persona: security
-          - name: architect
-            persona: architect
-    steps:
-      - uses: rodin/review-bot/.gitea/actions/review@v1
-        with:
-          reviewer-name: ${{ matrix.name }}
-          persona: ${{ matrix.persona }}
-```
-
-Custom persona from repo:
-```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
-  with:
-    reviewer-name: trading
-    persona-file: .review/personas/trading.yaml
-```
-
-### 7. Persona + Patterns Interaction
-
-Some personas benefit from filtered patterns:
- Security → only security-related patterns
- Architect → all patterns (structural focus)
- Domain → domain docs, not language patterns
-
-For v1, keep it simple: all patterns are included regardless of persona. Future enhancement could add `patterns_filter` to persona YAML.
-
-### 8. Output Format Changes
-
-Persona name appears in the review header:
-```markdown
-# Security Review
-
-## Summary
-No critical vulnerabilities found in this change.
-
-## Findings
-| # | Severity | File | Line | Finding |
-...
-
-## Recommendation
-**APPROVE** — No security-relevant issues detected.
-
---
-*Review by security*
-<!-- review-bot:security -->
-```
-
-## State/Data Model
-
-### Persona struct
-
-```go
-// review/persona.go
-type Persona struct {
-    Name          string   `yaml:"name"`
-    DisplayName   string   `yaml:"display_name"`
-    ModelPref     string   `yaml:"model_preference,omitempty"`
-    Identity      string   `yaml:"identity"`
-    Focus         []string `yaml:"focus"`
-    Ignore        []string `yaml:"ignore"`
-    Severity      Severity `yaml:"severity"`
-    OutputFormat  string   `yaml:"output_format,omitempty"`
-}
-
-type Severity struct {
-    Critical string `yaml:"critical"`
-    Major    string `yaml:"major"`
-    Minor    string `yaml:"minor"`
-    Nit      string `yaml:"nit"`
-}
-```
-
-### Loading precedence
-
-1. `--persona-file PATH` → load from local file system
-2. `--persona NAME` → load from embedded built-ins
-3. Neither → use generic system prompt (current behavior)
-
-## Error Cases
-
-| Error | Handling |
-|-------|----------|
-| Persona file not found | Fatal exit with clear message |
-| Invalid YAML in persona file | Fatal exit with parse error |
-| Both `--persona` and `--persona-file` specified | Fatal exit: mutually exclusive |
-| Unknown built-in persona name | Fatal exit with list of valid names |
-| Empty identity in persona | Warning, fall back to generic prompt |
-
-## Edge Cases
-
- **Empty focus list**: Valid — persona relies on identity alone
- **Empty ignore list**: Valid — no explicit scope exclusions
- **No severity section**: Use default MAJOR/MINOR/NIT definitions
- **Model preference set but budget insufficient**: Ignore preference, log warning
- **Persona file in pattern repo**: Fetch like other pattern files
-
-## Testing Strategy
-
-### Unit tests
- `persona_test.go`: Parse valid/invalid YAML, validate required fields
- `prompt_test.go`: Verify persona prompt assembly
- Integration with budget: persona prompts count toward token limit
-
-### Integration tests
- End-to-end with `--persona security` (built-in)
- End-to-end with `--persona-file custom.yaml`
- Backwards compatibility: no flags = generic behavior
-
-### Manual verification
- Run security persona on a PR with obvious vulnerability
- Verify security persona ignores style issues
- Verify non-security persona doesn't flag security issues
-
-## Implementation Phases
-
-### Phase 1: Persona types and loading
- [ ] `review/persona.go`: Persona struct + YAML parsing
- [ ] `review/persona_test.go`: Unit tests
- [ ] Embed built-in personas in binary
- [ ] Compiles clean, tests pass
-
-### Phase 2: Prompt generation
- [ ] `review/prompt.go`: `BuildPersonaPrompt(p Persona) string`
- [ ] Modify `BuildSystemBase()` to accept optional persona
- [ ] Integrate persona prompt with budget system
- [ ] Tests for prompt assembly
-
-### Phase 3: CLI integration
- [ ] Add `--persona` and `--persona-file` flags
- [ ] Flag validation (mutually exclusive, valid names)
- [ ] Load persona based on flags
- [ ] Pass persona to prompt builder
-
-### Phase 4: Action integration
- [ ] Add `persona` and `persona-file` inputs to action.yml
- [ ] Update README with persona examples
- [ ] End-to-end CI test
-
-### Phase 5: Built-in personas
- [ ] `security.yaml` built-in
- [ ] `architect.yaml` built-in
- [ ] `docs.yaml` built-in
- [ ] Document each persona's focus
-
-## Open Questions
-
-1. **Persona file location in repo**: Should we support `--persona-file .review/security.yaml` where the file is fetched from the PR's repo (like conventions)? This adds complexity but enables project-specific personas without action changes.
-
-2. **Model preference enforcement**: If persona specifies `model_preference: opus` but the action uses a different model, should we warn? Override? Ignore? Current thinking: log warning, use the specified model (user controls model via action input).
-
-3. **Severity override output**: If persona defines custom severity levels (CRITICAL), should the JSON output include them, or map back to standard MAJOR/MINOR/NIT? Current thinking: keep standard output format, use severity calibration only for prompt guidance.
-
-## Completion Checklist
-
-1. Persona struct matches YAML schema exactly?
-2. Built-in personas embedded in binary (not external files)?
-3. `--persona` and `--persona-file` are mutually exclusive?
-4. Unknown persona name produces clear error with valid options?
-5. Empty persona file fields have sensible defaults?
-6. Persona prompt integrates with budget system (token counting)?
-7. Backwards compatibility: no flags = current behavior?
-8. Review header shows persona display name?
-9. Sentinel still uses reviewer-name (not persona name)?
-10. Unit tests cover parse errors, missing fields, valid YAML?
-
-## Design Review Findings (Self-Review)
-
-### Finding 1: Severity Mapping
-The persona YAML allows `critical` severity, but the LLM output parser (`review/parser.go`) only accepts MAJOR/MINOR/NIT. 
-
-**Resolution:** Keep standard output format. Persona severity section is ONLY for calibrating the LLM's judgment (prompt guidance). Output must still use MAJOR/MINOR/NIT. Document this clearly in persona format docs.
-
-### Finding 2: Embedding Built-in Personas
-Go doesn't natively embed YAML. Must use `//go:embed` directive (Go 1.16+).
-
-**Resolution:** Create `review/personas/` directory with YAML files and use:
-```go
-//go:embed personas/*.yaml
-var embeddedPersonas embed.FS
-```
-
-### Finding 3: display_name vs reviewer-name
-Design says header shows "persona display name" but sentinel uses "reviewer-name". This is correct - they serve different purposes:
- `display_name` → human-readable header ("Security Specialist Review")
- `reviewer-name` → machine sentinel for cleanup (`<!-- review-bot:security -->`)
-
-When persona is used, `display_name` takes precedence for the header title, but `reviewer-name` (CLI flag) is still used for the sentinel.
-
-## Design Revision: YAML with gopkg.in/yaml.v3
-
-**Decision:** Add `gopkg.in/yaml.v3` as a dependency.
-
-YAML is preferred over JSON for persona files because:
- Multi-line strings are cleaner (no escaping quotes in identity/focus text)
- Comments are supported for documentation
- More human-readable for complex persona definitions
-
-The implementation supports both YAML (`.yaml`, `.yml`) and JSON (`.json`) for backwards compatibility, with YAML as the default for built-in personas.
@@ -1,87 +0,0 @@
-# Design: YAML Support for Persona Files (#57)
-
-## Problem
-
-JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
-
-## Constraints
-
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
-
-## Proposed Approach
-
-1. **Update `parsePersona`** to detect format from file extension
-2. **Add YAML parsing** with explicit depth limit (defense in depth)
-3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
-4. **Convert built-in personas** to YAML format
-5. **Update embed directive** to include both formats
-
-### File Extension Detection
-
-```go
-func parsePersona(data []byte, source string) (*Persona, error) {
-    isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
-    if isYAML {
-        return parseYAML(data, source)
-    }
-    return parseJSON(data, source)
-}
-```
-
-### YAML Parsing with Depth Protection
-
-We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
-`review/persona.go`) rather than relying on library decoder options. Key design
-decisions:
-
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
- **Node-count limit:** Conservative overcounting bounds total validation work
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
-
-See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
-
-## State/Data Model
-
-No new state. Same `Persona` struct, just different parsing.
-
-## Error Cases
-
-| Error | Handling |
-|-------|----------|
-| Invalid YAML syntax | Return parse error with source file |
-| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
-| Unknown extension | Fall back to JSON parsing |
-| Missing required fields | Validation rejects after parse |
-
-## Edge Cases
-
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
-
-## Testing Strategy
-
-1. Unit tests for YAML parsing (valid, invalid, deeply nested)
-2. Unit tests for extension detection
-3. Integration test for built-in personas (now YAML)
-4. Backwards compat test: verify JSON still works for external files
-
-## Completion Checklist
-
-1. [ ] `go-yaml` dependency added at v1.16.0+
-2. [ ] Extension detection uses case-insensitive comparison
-3. [ ] YAML parse errors include source file name
-4. [ ] JSON parsing still works for `.json` files
-5. [ ] Built-in personas converted to YAML with readable multi-line strings
-6. [ ] Embed directive updated to include `*.yaml`
-7. [ ] Test for deeply nested YAML rejection
-8. [ ] All existing tests pass
-
-## Open Questions
-
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
@@ -231,6 +231,8 @@ These are statically checked by `~/.openclaw/workspace/scripts/test/check-invari
 | S6 | Active WIP does not cause early exit (only sets ACTIVE_WIP flag) |
 | S7 | SPAWN:impl guarded by `ACTIVE_WIP == 0` check |
 | S8 | No merge calls in any worker template |
+| S9 | Zero close-PR API calls in dispatch script (`state=closed` does not appear) |
+| S10 | No close-PR API calls in any worker template; every worker template contains `NEVER close a PR` |

 ---

@@ -263,9 +265,20 @@ Each worker receives a precise task description with substituted values:

 Workers **always** remove the WIP label on completion and reply `NO_REPLY`.

+### Worker Absolute Constraints
+
+Every worker template begins with an `⛔ ABSOLUTE CONSTRAINTS` section containing these rules:
+
+- **NEVER close a PR.** Never call `PATCH /pulls/{id}` with `state=closed`. Closing a PR requires human action. "Duplicate", "superseded", or "already done" are never a worker's call.
+- **NEVER merge a PR.** Never call the merge API. Merging requires human approval.
+- **NEVER use the gitea-aweiker token.** All API calls use the gitea-rodin token only.
+- **NEVER act on a PR with active REQUEST_CHANGES.** Fix the findings first.
+
+The first two constraints are statically enforced by `check-invariants.sh`: S1 and S9 cover the dispatch script (no merge, no close); S8 covers worker templates (no merge calls); S10 covers worker templates (no close calls, with NEVER-close text verified present in each). The remaining two constraints (token usage and REQUEST_CHANGES gate) are enforced by runtime logic.
+
 ---

-## 9. Fixes for Issues #144 and #145
+## 9. Fixes for Issues #144, #145, and #157

 **Issue #144** (autonomous merge):
 The dispatch script contains no merge API calls anywhere. The `~/.openclaw/workspace/scripts/test/check-invariants.sh`
@@ -276,3 +289,13 @@ Rule 2 is the **first** rule evaluated per PR. It cannot be skipped, reasoned pa
 or bypassed. It is checked before CI, before self-review, before handoff. The check
 uses latest-per-reviewer state, so a reviewer who re-approved after REQUEST_CHANGES
 is correctly handled.
+
+**Issue #157** (autonomous PR close):
+Worker templates were missing an explicit constraint against closing PRs. The dispatch
+script never had a close call, but workers could reason their way into calling
+`PATCH /pulls/{id}` with `state=closed`. All worker templates now include
+`NEVER close a PR` in their ABSOLUTE CONSTRAINTS section. Invariant S9 verifies
+the dispatch script contains no close calls. Invariant S10 verifies
+worker templates contain no close calls and each contains the NEVER-close text.
+
+Regression tests in `dispatch.bats` statically verify all of these constraints.
@@ -1,91 +1,22 @@
 // Package gitea provides a client for the Gitea API.
-// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
-// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
+// ipcheck.go re-exports the IsBlockedIP function from internal/netutil for use
+// by this package's safe dialer (client.go) and for backward compatibility with
+// any callers that previously imported it from here.
+//
+// The implementation has moved to internal/netutil so it can be shared with the
+// validate-url subcommand (cmd/review-bot/validateurl.go) without creating a
+// dependency from VCS-generic code on the Gitea-specific package.
 package gitea

 import (
-	"fmt"
 	"net"
+
+	"gitea.weiker.me/rodin/review-bot/internal/netutil"
 )

-// blockedCIDRStrings is the canonical list of CIDR strings that should never
-// be contacted by review-bot. See IsBlockedIP for the full list of covered
-// address families.
-//
-// These are hard-coded literals: any parse failure is a programming error.
-// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
-var blockedCIDRStrings = []string{
-	// IPv4 loopback
-	"127.0.0.0/8",
-	// IPv4 unspecified / "this network"
-	"0.0.0.0/8",
-	// RFC1918 private ranges
-	"10.0.0.0/8",
-	"172.16.0.0/12",
-	"192.168.0.0/16",
-	// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
-	"169.254.0.0/16",
-	// IPv4 shared address space (RFC6598, carrier-grade NAT)
-	"100.64.0.0/10",
-	// IPv4 multicast
-	"224.0.0.0/4",
-	// IPv4 reserved / broadcast
-	"240.0.0.0/4",
-	// IPv6 loopback
-	"::1/128",
-	// IPv6 unspecified
-	"::/128",
-	// IPv6 link-local
-	"fe80::/10",
-	// IPv6 unique local (ULA) — RFC4193
-	"fc00::/7",
-	// IPv6 multicast
-	"ff00::/8",
-}
-
-// blockedCIDRs is the parsed form of blockedCIDRStrings.
-// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
-// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
-var (
-	blockedCIDRs           []*net.IPNet
-	blockedCIDRParseErrors []string
-)
-
-func init() {
-	blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
-	for _, r := range blockedCIDRStrings {
-		_, cidr, err := net.ParseCIDR(r)
-		if err != nil {
-			// Record the error rather than panicking; TestBlockedCIDRsValid
-			// will catch this during tests, and the CI build will fail.
-			blockedCIDRParseErrors = append(blockedCIDRParseErrors,
-				fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
-			continue
-		}
-		blockedCIDRs = append(blockedCIDRs, cidr)
-	}
-}
-
 // IsBlockedIP reports whether ip is in a blocked address range.
-// It is exported for use by the validate-url subcommand and tests outside
-// this package.
-//
-// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
-// IPv4 form before checking so that IPv4 CIDRs catch them.
-//
-// Based on:
-//   - RFC1918 private ranges
-//   - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
-//   - RFC4291 IPv6 link-local / loopback
+// It delegates to internal/netutil.IsBlockedIP; see that function for the full
+// list of blocked ranges and IPv6-mapped IPv4 normalization behavior.
 func IsBlockedIP(ip net.IP) bool {
-	// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
-	if v4 := ip.To4(); v4 != nil {
-		ip = v4
-	}
-	for _, cidr := range blockedCIDRs {
-		if cidr.Contains(ip) {
-			return true
-		}
-	}
-	return false
+	return netutil.IsBlockedIP(ip)
 }
@@ -3,142 +3,35 @@ package gitea
 import (
 	"net"
 	"testing"
+
+	"gitea.weiker.me/rodin/review-bot/internal/netutil"
 )

-func TestIsBlockedIP(t *testing.T) {
-	blocked := []struct {
-		name string
-		ip   string
+// TestIsBlockedIPForwarding verifies that gitea.IsBlockedIP correctly forwards
+// to internal/netutil.IsBlockedIP. Full coverage of the blocking logic lives in
+// internal/netutil/ipcheck_test.go.
+func TestIsBlockedIPForwarding(t *testing.T) {
+	cases := []struct {
+		ip      string
+		blocked bool
 	}{
-		// IPv4 loopback
-		{"loopback 127.0.0.1", "127.0.0.1"},
-		{"loopback 127.0.0.2", "127.0.0.2"},
-		{"loopback 127.255.255.255", "127.255.255.255"},
-		// IPv4 unspecified
-		{"unspecified 0.0.0.0", "0.0.0.0"},
-		{"unspecified 0.1.2.3", "0.1.2.3"},
-		// RFC1918
-		{"RFC1918 10.0.0.1", "10.0.0.1"},
-		{"RFC1918 10.255.255.255", "10.255.255.255"},
-		{"RFC1918 172.16.0.1", "172.16.0.1"},
-		{"RFC1918 172.31.255.255", "172.31.255.255"},
-		{"RFC1918 192.168.0.1", "192.168.0.1"},
-		{"RFC1918 192.168.255.255", "192.168.255.255"},
-		// Link-local (APIPA / AWS metadata)
-		{"link-local 169.254.0.1", "169.254.0.1"},
-		{"link-local 169.254.169.254", "169.254.169.254"},
-		// Shared address space (carrier-grade NAT)
-		{"CGN 100.64.0.1", "100.64.0.1"},
-		{"CGN 100.127.255.255", "100.127.255.255"},
-		// Multicast
-		{"multicast 224.0.0.1", "224.0.0.1"},
-		{"multicast 239.255.255.255", "239.255.255.255"},
-		// Reserved
-		{"reserved 240.0.0.1", "240.0.0.1"},
-		{"broadcast 255.255.255.255", "255.255.255.255"},
-		// IPv6 loopback
-		{"IPv6 loopback ::1", "::1"},
-		// IPv6 unspecified
-		{"IPv6 unspecified ::", "::"},
-		// IPv6 link-local
-		{"IPv6 link-local fe80::1", "fe80::1"},
-		{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
-		// IPv6 ULA
-		{"IPv6 ULA fc00::1", "fc00::1"},
-		{"IPv6 ULA fd00::1", "fd00::1"},
-		// IPv6 multicast
-		{"IPv6 multicast ff02::1", "ff02::1"},
+		{"127.0.0.1", true},             // loopback — must be blocked
+		{"192.168.1.1", true},           // RFC1918 — must be blocked
+		{"8.8.8.8", false},              // public — must not be blocked
+		{"2001:4860:4860::8888", false}, // public IPv6 — must not be blocked
 	}
-
-	for _, tc := range blocked {
-		t.Run(tc.name, func(t *testing.T) {
-			ip := net.ParseIP(tc.ip)
-			if ip == nil {
-				t.Fatalf("failed to parse IP %q", tc.ip)
-			}
-			if !IsBlockedIP(ip) {
-				t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
-			}
-		})
-	}
-
-	allowed := []struct {
-		name string
-		ip   string
-	}{
-		{"public 8.8.8.8", "8.8.8.8"},
-		{"public 1.1.1.1", "1.1.1.1"},
-		{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
-		// not assigned to any real host, but intentionally left unblocked here because
-		// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
-		// blocking it would require tracking every RFC5737 range without meaningful
-		// security benefit (no server should ever listen on a TEST-NET address).
-		{"public 151.101.1.1", "151.101.1.1"},                        // Fastly
-		{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
-		{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
-	}
-
-	for _, tc := range allowed {
-		t.Run(tc.name, func(t *testing.T) {
-			ip := net.ParseIP(tc.ip)
-			if ip == nil {
-				t.Fatalf("failed to parse IP %q", tc.ip)
-			}
-			if IsBlockedIP(ip) {
-				t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
-			}
-		})
-	}
-}
-
-func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
-	// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
-	// Construct it manually as a 16-byte IP.
-	mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
-	if !IsBlockedIP(mapped) {
-		t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
-	}
-
-	// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
-	mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
-	if IsBlockedIP(mappedPublic) {
-		t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
-	}
-}
-
-func TestIsBlockedIPEdgeCases(t *testing.T) {
-	// The boundary between RFC1918 and public ranges.
-	// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
-	notPrivate := net.ParseIP("172.15.255.255")
-	if IsBlockedIP(notPrivate) {
-		t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
-	}
-	// 172.32.0.0 is NOT private (just above 172.31.255.255).
-	notPrivate2 := net.ParseIP("172.32.0.0")
-	if IsBlockedIP(notPrivate2) {
-		t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
-	}
-	// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
-	notCGN := net.ParseIP("100.63.255.255")
-	if IsBlockedIP(notCGN) {
-		t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
-	}
-	// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
-	notCGN2 := net.ParseIP("100.128.0.0")
-	if IsBlockedIP(notCGN2) {
-		t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
-	}
-}
-
-// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
-// successfully. This catches programming errors in the CIDR list without
-// requiring a startup panic. The init() function records parse failures in
-// blockedCIDRParseErrors rather than panicking; this test makes those failures
-// visible as test failures during CI.
-func TestBlockedCIDRsValid(t *testing.T) {
-	if len(blockedCIDRParseErrors) > 0 {
-		for _, msg := range blockedCIDRParseErrors {
-			t.Errorf("CIDR parse error: %s", msg)
+	for _, tc := range cases {
+		ip := net.ParseIP(tc.ip)
+		if ip == nil {
+			t.Fatalf("failed to parse IP %q", tc.ip)
+		}
+		got := IsBlockedIP(ip)
+		want := netutil.IsBlockedIP(ip)
+		if got != want {
+			t.Errorf("gitea.IsBlockedIP(%q) = %v, netutil.IsBlockedIP = %v: forwarding mismatch", tc.ip, got, want)
+		}
+		if got != tc.blocked {
+			t.Errorf("gitea.IsBlockedIP(%q) = %v, want %v", tc.ip, got, tc.blocked)
 		}
 	}
 }
@@ -0,0 +1,97 @@
+// Package netutil provides shared network utilities for review-bot.
+// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
+// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
+package netutil
+
+import (
+	"fmt"
+	"net"
+)
+
+// blockedCIDRStrings is the canonical list of CIDR strings that should never
+// be contacted by review-bot. See IsBlockedIP for the full list of covered
+// address families.
+//
+// These are hard-coded literals: any parse failure is a programming error.
+// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
+var blockedCIDRStrings = []string{
+	// IPv4 loopback
+	"127.0.0.0/8",
+	// IPv4 unspecified / "this network"
+	"0.0.0.0/8",
+	// RFC1918 private ranges
+	"10.0.0.0/8",
+	"172.16.0.0/12",
+	"192.168.0.0/16",
+	// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
+	"169.254.0.0/16",
+	// IPv4 shared address space (RFC6598, carrier-grade NAT)
+	"100.64.0.0/10",
+	// IPv4 multicast
+	"224.0.0.0/4",
+	// IPv4 reserved / broadcast
+	"240.0.0.0/4",
+	// IPv6 loopback
+	"::1/128",
+	// IPv6 unspecified
+	"::/128",
+	// IPv6 link-local
+	"fe80::/10",
+	// IPv6 unique local (ULA) — RFC4193
+	"fc00::/7",
+	// IPv6 multicast
+	"ff00::/8",
+}
+
+// blockedCIDRs is the parsed form of blockedCIDRStrings.
+// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
+// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
+var (
+	blockedCIDRs           []*net.IPNet
+	blockedCIDRParseErrors []string
+)
+
+func init() {
+	blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
+	for _, r := range blockedCIDRStrings {
+		_, cidr, err := net.ParseCIDR(r)
+		if err != nil {
+			// Record the error rather than panicking; TestBlockedCIDRsValid
+			// will catch this during tests, and the CI build will fail.
+			blockedCIDRParseErrors = append(blockedCIDRParseErrors,
+				fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
+			continue
+		}
+		blockedCIDRs = append(blockedCIDRs, cidr)
+	}
+}
+
+// BlockedCIDRParseErrors returns any errors encountered parsing the built-in
+// CIDR list. In correct code this will always be empty; tests assert it is.
+func BlockedCIDRParseErrors() []string {
+	return blockedCIDRParseErrors
+}
+
+// IsBlockedIP reports whether ip is in a blocked address range.
+// It is exported for use by the gitea package's safe dialer, the validate-url
+// subcommand, and tests outside this package.
+//
+// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
+// IPv4 form before checking so that IPv4 CIDRs catch them.
+//
+// Based on:
+//   - RFC1918 private ranges
+//   - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
+//   - RFC4291 IPv6 link-local / loopback
+func IsBlockedIP(ip net.IP) bool {
+	// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
+	if v4 := ip.To4(); v4 != nil {
+		ip = v4
+	}
+	for _, cidr := range blockedCIDRs {
+		if cidr.Contains(ip) {
+			return true
+		}
+	}
+	return false
+}
@@ -0,0 +1,142 @@
+package netutil
+
+import (
+	"net"
+	"testing"
+)
+
+func TestIsBlockedIP(t *testing.T) {
+	blocked := []struct {
+		name string
+		ip   string
+	}{
+		// IPv4 loopback
+		{"loopback 127.0.0.1", "127.0.0.1"},
+		{"loopback 127.0.0.2", "127.0.0.2"},
+		{"loopback 127.255.255.255", "127.255.255.255"},
+		// IPv4 unspecified
+		{"unspecified 0.0.0.0", "0.0.0.0"},
+		{"unspecified 0.1.2.3", "0.1.2.3"},
+		// RFC1918
+		{"RFC1918 10.0.0.1", "10.0.0.1"},
+		{"RFC1918 10.255.255.255", "10.255.255.255"},
+		{"RFC1918 172.16.0.1", "172.16.0.1"},
+		{"RFC1918 172.31.255.255", "172.31.255.255"},
+		{"RFC1918 192.168.0.1", "192.168.0.1"},
+		{"RFC1918 192.168.255.255", "192.168.255.255"},
+		// Link-local (APIPA / AWS metadata)
+		{"link-local 169.254.0.1", "169.254.0.1"},
+		{"link-local 169.254.169.254", "169.254.169.254"},
+		// Shared address space (carrier-grade NAT)
+		{"CGN 100.64.0.1", "100.64.0.1"},
+		{"CGN 100.127.255.255", "100.127.255.255"},
+		// Multicast
+		{"multicast 224.0.0.1", "224.0.0.1"},
+		{"multicast 239.255.255.255", "239.255.255.255"},
+		// Reserved
+		{"reserved 240.0.0.1", "240.0.0.1"},
+		{"broadcast 255.255.255.255", "255.255.255.255"},
+		// IPv6 loopback
+		{"IPv6 loopback ::1", "::1"},
+		// IPv6 unspecified
+		{"IPv6 unspecified ::", "::"},
+		// IPv6 link-local
+		{"IPv6 link-local fe80::1", "fe80::1"},
+		{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
+		// IPv6 ULA
+		{"IPv6 ULA fc00::1", "fc00::1"},
+		{"IPv6 ULA fd00::1", "fd00::1"},
+		// IPv6 multicast
+		{"IPv6 multicast ff02::1", "ff02::1"},
+	}
+
+	for _, tc := range blocked {
+		t.Run(tc.name, func(t *testing.T) {
+			ip := net.ParseIP(tc.ip)
+			if ip == nil {
+				t.Fatalf("failed to parse IP %q", tc.ip)
+			}
+			if !IsBlockedIP(ip) {
+				t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
+			}
+		})
+	}
+
+	allowed := []struct {
+		name string
+		ip   string
+	}{
+		{"public 8.8.8.8", "8.8.8.8"},
+		{"public 1.1.1.1", "1.1.1.1"},
+		{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
+		// not assigned to any real host, but intentionally left unblocked here because
+		// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
+		// blocking it would require tracking every RFC5737 range without meaningful
+		// security benefit (no server should ever listen on a TEST-NET address).
+		{"public 151.101.1.1", "151.101.1.1"},                        // Fastly
+		{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
+		{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
+	}
+
+	for _, tc := range allowed {
+		t.Run(tc.name, func(t *testing.T) {
+			ip := net.ParseIP(tc.ip)
+			if ip == nil {
+				t.Fatalf("failed to parse IP %q", tc.ip)
+			}
+			if IsBlockedIP(ip) {
+				t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
+			}
+		})
+	}
+}
+
+func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
+	// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
+	// Construct it manually as a 16-byte IP.
+	mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
+	if !IsBlockedIP(mapped) {
+		t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
+	}
+
+	// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
+	mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
+	if IsBlockedIP(mappedPublic) {
+		t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
+	}
+}
+
+func TestIsBlockedIPEdgeCases(t *testing.T) {
+	// The boundary between RFC1918 and public ranges.
+	// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
+	notPrivate := net.ParseIP("172.15.255.255")
+	if IsBlockedIP(notPrivate) {
+		t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
+	}
+	// 172.32.0.0 is NOT private (just above 172.31.255.255).
+	notPrivate2 := net.ParseIP("172.32.0.0")
+	if IsBlockedIP(notPrivate2) {
+		t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
+	}
+	// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
+	notCGN := net.ParseIP("100.63.255.255")
+	if IsBlockedIP(notCGN) {
+		t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
+	}
+	// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
+	notCGN2 := net.ParseIP("100.128.0.0")
+	if IsBlockedIP(notCGN2) {
+		t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
+	}
+}
+
+// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
+// successfully. This catches programming errors in the CIDR list without
+// requiring a startup panic. The init() function records parse failures in
+// blockedCIDRParseErrors rather than panicking; this test makes those failures
+// visible as test failures during CI.
+func TestBlockedCIDRsValid(t *testing.T) {
+	for _, msg := range BlockedCIDRParseErrors() {
+		t.Errorf("CIDR parse error: %s", msg)
+	}
+}
@@ -52,15 +52,31 @@ func ParseDocMapConfig(localPath string) (*DocMapConfig, error) {
 	if err != nil {
 		return nil, fmt.Errorf("read doc-map file %q: %w", localPath, err)
 	}
+	return parseDocMapBytes(data, localPath)
+}

+// ParseDocMapConfigContent parses a doc-map YAML config from an in-memory
+// string. The source parameter is used only for error messages and log entries
+// (e.g. "owner/repo@main:.review-bot/doc-map.yml").
+//
+// Use this when the config content has been fetched from a trusted VCS ref
+// rather than read from the local workspace.
+func ParseDocMapConfigContent(content, source string) (*DocMapConfig, error) {
+	data := []byte(content)
+	return parseDocMapBytes(data, source)
+}
+
+// parseDocMapBytes is the shared YAML parse implementation used by
+// ParseDocMapConfig and ParseDocMapConfigContent.
+func parseDocMapBytes(data []byte, source string) (*DocMapConfig, error) {
 	var cfg DocMapConfig
 	if err := yaml.UnmarshalWithOptions(data, &cfg, yaml.Strict()); err != nil {
 		// Re-parse without strict mode to log which keys are unknown.
 		var relaxed DocMapConfig
 		if err2 := yaml.Unmarshal(data, &relaxed); err2 != nil {
-			return nil, fmt.Errorf("parse doc-map YAML %q: %w", localPath, err)
+			return nil, fmt.Errorf("parse doc-map YAML %q: %w", source, err)
 		}
-		slog.Warn("doc-map YAML contains unknown keys (ignored)", "file", localPath, "error", err)
+		slog.Warn("doc-map YAML contains unknown keys (ignored)", "file", source, "error", err)
 		cfg = relaxed
 	}
 	return &cfg, nil
@@ -510,3 +510,63 @@ func TestFileCoveredByDocMap_EmptyConfig(t *testing.T) {
 		t.Error("expected false for empty config, got true")
 	}
 }
+
+// ============================================================
+// ParseDocMapConfigContent
+// ============================================================
+
+func TestParseDocMapConfigContent_Valid(t *testing.T) {
+	content := `
+mappings:
+  - paths:
+      - "lib/foo/**"
+    docs:
+      - docs/foo.md
+`
+	cfg, err := ParseDocMapConfigContent(content, "owner/repo@main:.review-bot/doc-map.yml")
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	if len(cfg.Mappings) != 1 {
+		t.Fatalf("expected 1 mapping, got %d", len(cfg.Mappings))
+	}
+	if len(cfg.Mappings[0].Docs) != 1 || cfg.Mappings[0].Docs[0] != "docs/foo.md" {
+		t.Errorf("unexpected mapping: %+v", cfg.Mappings[0])
+	}
+}
+
+func TestParseDocMapConfigContent_EmptyContent(t *testing.T) {
+	cfg, err := ParseDocMapConfigContent("", "test-source")
+	if err != nil {
+		t.Fatalf("unexpected error for empty content: %v", err)
+	}
+	if len(cfg.Mappings) != 0 {
+		t.Errorf("expected 0 mappings for empty content, got %d", len(cfg.Mappings))
+	}
+}
+
+func TestParseDocMapConfigContent_InvalidYAML(t *testing.T) {
+	_, err := ParseDocMapConfigContent("mappings: [{{invalid", "test-source")
+	if err == nil {
+		t.Fatal("expected error for invalid YAML, got nil")
+	}
+}
+
+func TestParseDocMapConfigContent_UnknownKeys(t *testing.T) {
+	content := `
+mappings:
+  - paths:
+      - "lib/**"
+    docs:
+      - docs/foo.md
+unknown_top_level_key: "should be warned but not fatal"
+`
+	// Unknown top-level keys produce a warning but not an error.
+	cfg, err := ParseDocMapConfigContent(content, "test-source")
+	if err != nil {
+		t.Fatalf("unexpected error for unknown keys: %v", err)
+	}
+	if len(cfg.Mappings) == 0 {
+		t.Error("expected mappings to be parsed despite unknown key")
+	}
+}