feat(#137): add doc-map input for path-scoped doc injection #138

Merged
rodin merged 7 commits from issue-137 into main 2026-05-15 03:39:37 +00:00
Owner

$PR_BODY

$PR_BODY
rodin added 1 commit 2026-05-15 03:26:16 +00:00
feat(#137): add doc-map input for path-scoped doc injection
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 40s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m26s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m27s
9670a5fda3
- New --doc-map flag (DOC_MAP_FILE env var): path to YAML config mapping
  source path globs to governing design docs
- New --doc-map-max-bytes flag (DOC_MAP_MAX_BYTES env var): cap on total
  injected doc content, default 100KB
- review/docmap.go: DocMapConfig parsing, glob matching with ** support,
  doc loading via VCS with directory expansion and size guard
- budget.Sections: new DesignDocs field, trimmed after conventions
- budget.buildResult: injects DesignDocs under ## Design Documents heading
- action.yml: doc-map and doc-map-max-bytes inputs wired to env vars
- CHANGELOG.md: created with unreleased entry
- Tests: ParseDocMapConfig, MatchDocs, globMatch, LoadMatchingDocs
sonnet-review-bot approved these changes 2026-05-15 03:27:16 +00:00
sonnet-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 9670a5fd)

Sonnet Review

Summary

Well-structured feature addition that follows all established patterns. The doc-map feature is cleanly integrated into the existing architecture with proper error handling, deduplication, size guards, and thorough test coverage.

Findings

# Severity File Line Finding
1 [MINOR] review/docmap.go 57 The package comment says 'Package review provides doc-map parsing and doc injection...' but the review package already exists and has a broader purpose. This package-level comment in docmap.go will either be ignored (not the first file) or will conflict with the package doc in other files. The file-level comment should just be a regular comment, not a package doc comment, or should be omitted since the package is already documented elsewhere.
2 [MINOR] review/docmap.go 201 The truncateUTF8 function is duplicated — an identical implementation already exists in budget/budget.go. Per the DRY principle and the project's own patterns, this should be extracted to a shared utility. Since review and budget are separate packages, the options are: (1) move it to a shared internal package, or (2) accept the duplication with a comment. As-is, future changes to the truncation logic must be made in two places.
3 [MINOR] review/docmap.go 245 In loadDocEntries, when GetAllFilesInPath returns an error (rather than empty results), the code silently falls through to try GetFileContent. The plan's decision explicitly states 'If GetAllFilesInPath returns an error, try GetFileContent', which this does — but the directory error is swallowed without logging. If both calls fail, only the file error is returned. Consider logging a debug message for the directory error before the fallback, which would help diagnose unexpected behavior.
4 [NIT] review/docmap_test.go 14 The fakeDocFetcher struct fields have inconsistent comment styles (inline comments on the struct fields are fine), but the dirs field type map[string]map[string]string could benefit from a brief comment explaining the structure, similar to the files field comment. Minor readability nit.
5 [NIT] review/docmap_test.go 338 The test TestLoadMatchingDocs_ContextSizeGuard uses MaxBytes: 350 with content of 200 bytes each and checks len(content) > 600. The content after metadata (headings, newlines) will be well above 350 bytes due to formatting overhead. The test is validating the right behavior (truncation occurs), but the boundary check (> 600) is loose. Not a bug, just slightly confusing as the actual formatted content with headings for doc a alone is ~220+ bytes.
6 [NIT] PLAN-137.md 1 Committing a PLAN-137.md file to the repository is unusual — planning/design documents are typically kept in issue trackers or wikis. This couples the implementation repository to an issue-tracking artifact. Consider removing it before merge or moving it to a docs/ or .design/ directory if architecture decision records are desired.

Recommendation

APPROVE — APPROVE. CI passes and the implementation is solid. The core feature — doc-map YAML parsing, ** glob matching, VCS-based doc fetching, budget integration, and size guards — is correctly implemented and thoroughly tested. The truncateUTF8 duplication is the most actionable finding but is not a blocker for merging. The package comment collision in docmap.go is a minor issue but won't cause runtime problems. All established patterns (error handling with %w wrapping, interface-based abstraction via DocFetcher, table-driven tests, slog logging, t.Helper() usage) are followed correctly.


Review by sonnet


Evaluated against 9670a5fd

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/138#pullrequestreview-3807) for up-to-date findings. <details><summary>Previous findings (commit 9670a5fd)</summary> # Sonnet Review ## Summary Well-structured feature addition that follows all established patterns. The doc-map feature is cleanly integrated into the existing architecture with proper error handling, deduplication, size guards, and thorough test coverage. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `review/docmap.go` | 57 | The package comment says 'Package review provides doc-map parsing and doc injection...' but the `review` package already exists and has a broader purpose. This package-level comment in docmap.go will either be ignored (not the first file) or will conflict with the package doc in other files. The file-level comment should just be a regular comment, not a `package` doc comment, or should be omitted since the package is already documented elsewhere. | | 2 | [MINOR] | `review/docmap.go` | 201 | The `truncateUTF8` function is duplicated — an identical implementation already exists in `budget/budget.go`. Per the DRY principle and the project's own patterns, this should be extracted to a shared utility. Since `review` and `budget` are separate packages, the options are: (1) move it to a shared internal package, or (2) accept the duplication with a comment. As-is, future changes to the truncation logic must be made in two places. | | 3 | [MINOR] | `review/docmap.go` | 245 | In `loadDocEntries`, when `GetAllFilesInPath` returns an error (rather than empty results), the code silently falls through to try `GetFileContent`. The plan's decision explicitly states 'If GetAllFilesInPath returns an error, try GetFileContent', which this does — but the directory error is swallowed without logging. If both calls fail, only the file error is returned. Consider logging a debug message for the directory error before the fallback, which would help diagnose unexpected behavior. | | 4 | [NIT] | `review/docmap_test.go` | 14 | The `fakeDocFetcher` struct fields have inconsistent comment styles (inline comments on the struct fields are fine), but the `dirs` field type `map[string]map[string]string` could benefit from a brief comment explaining the structure, similar to the `files` field comment. Minor readability nit. | | 5 | [NIT] | `review/docmap_test.go` | 338 | The test `TestLoadMatchingDocs_ContextSizeGuard` uses `MaxBytes: 350` with content of 200 bytes each and checks `len(content) > 600`. The content after metadata (headings, newlines) will be well above 350 bytes due to formatting overhead. The test is validating the right behavior (truncation occurs), but the boundary check (`> 600`) is loose. Not a bug, just slightly confusing as the actual formatted content with headings for doc a alone is ~220+ bytes. | | 6 | [NIT] | `PLAN-137.md` | 1 | Committing a PLAN-137.md file to the repository is unusual — planning/design documents are typically kept in issue trackers or wikis. This couples the implementation repository to an issue-tracking artifact. Consider removing it before merge or moving it to a `docs/` or `.design/` directory if architecture decision records are desired. | ## Recommendation **APPROVE** — APPROVE. CI passes and the implementation is solid. The core feature — doc-map YAML parsing, ** glob matching, VCS-based doc fetching, budget integration, and size guards — is correctly implemented and thoroughly tested. The `truncateUTF8` duplication is the most actionable finding but is not a blocker for merging. The package comment collision in docmap.go is a minor issue but won't cause runtime problems. All established patterns (error handling with `%w` wrapping, interface-based abstraction via `DocFetcher`, table-driven tests, `slog` logging, `t.Helper()` usage) are followed correctly. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 9670a5fd* </details> <!-- review-bot:sonnet -->
PLAN-137.md Outdated
@@ -0,0 +1,194 @@
# Plan: Issue #137 — doc-map input for path-scoped doc injection
First-time contributor

[NIT] Committing a PLAN-137.md file to the repository is unusual — planning/design documents are typically kept in issue trackers or wikis. This couples the implementation repository to an issue-tracking artifact. Consider removing it before merge or moving it to a docs/ or .design/ directory if architecture decision records are desired.

**[NIT]** Committing a PLAN-137.md file to the repository is unusual — planning/design documents are typically kept in issue trackers or wikis. This couples the implementation repository to an issue-tracking artifact. Consider removing it before merge or moving it to a `docs/` or `.design/` directory if architecture decision records are desired.
@@ -0,0 +54,4 @@
return nil, fmt.Errorf("read doc-map file %q: %w", localPath, err)
}
var cfg DocMapConfig
First-time contributor

[MINOR] The package comment says 'Package review provides doc-map parsing and doc injection...' but the review package already exists and has a broader purpose. This package-level comment in docmap.go will either be ignored (not the first file) or will conflict with the package doc in other files. The file-level comment should just be a regular comment, not a package doc comment, or should be omitted since the package is already documented elsewhere.

**[MINOR]** The package comment says 'Package review provides doc-map parsing and doc injection...' but the `review` package already exists and has a broader purpose. This package-level comment in docmap.go will either be ignored (not the first file) or will conflict with the package doc in other files. The file-level comment should just be a regular comment, not a `package` doc comment, or should be omitted since the package is already documented elsewhere.
@@ -0,0 +198,4 @@
continue
}
for _, entry := range entries {
First-time contributor

[MINOR] The truncateUTF8 function is duplicated — an identical implementation already exists in budget/budget.go. Per the DRY principle and the project's own patterns, this should be extracted to a shared utility. Since review and budget are separate packages, the options are: (1) move it to a shared internal package, or (2) accept the duplication with a comment. As-is, future changes to the truncation logic must be made in two places.

**[MINOR]** The `truncateUTF8` function is duplicated — an identical implementation already exists in `budget/budget.go`. Per the DRY principle and the project's own patterns, this should be extracted to a shared utility. Since `review` and `budget` are separate packages, the options are: (1) move it to a shared internal package, or (2) accept the duplication with a comment. As-is, future changes to the truncation logic must be made in two places.
@@ -0,0 +242,4 @@
content string
}
// loadDocEntries returns the doc content for a given path.
First-time contributor

[MINOR] In loadDocEntries, when GetAllFilesInPath returns an error (rather than empty results), the code silently falls through to try GetFileContent. The plan's decision explicitly states 'If GetAllFilesInPath returns an error, try GetFileContent', which this does — but the directory error is swallowed without logging. If both calls fail, only the file error is returned. Consider logging a debug message for the directory error before the fallback, which would help diagnose unexpected behavior.

**[MINOR]** In `loadDocEntries`, when `GetAllFilesInPath` returns an error (rather than empty results), the code silently falls through to try `GetFileContent`. The plan's decision explicitly states 'If GetAllFilesInPath returns an error, try GetFileContent', which this does — but the directory error is swallowed without logging. If both calls fail, only the file error is returned. Consider logging a debug message for the directory error before the fallback, which would help diagnose unexpected behavior.
@@ -0,0 +11,4 @@
// fakeDocFetcher is a mock DocFetcher for tests.
type fakeDocFetcher struct {
files map[string]string // path -> content
First-time contributor

[NIT] The fakeDocFetcher struct fields have inconsistent comment styles (inline comments on the struct fields are fine), but the dirs field type map[string]map[string]string could benefit from a brief comment explaining the structure, similar to the files field comment. Minor readability nit.

**[NIT]** The `fakeDocFetcher` struct fields have inconsistent comment styles (inline comments on the struct fields are fine), but the `dirs` field type `map[string]map[string]string` could benefit from a brief comment explaining the structure, similar to the `files` field comment. Minor readability nit.
@@ -0,0 +335,4 @@
}
}
func TestLoadMatchingDocs_ContextSizeGuard(t *testing.T) {
First-time contributor

[NIT] The test TestLoadMatchingDocs_ContextSizeGuard uses MaxBytes: 350 with content of 200 bytes each and checks len(content) > 600. The content after metadata (headings, newlines) will be well above 350 bytes due to formatting overhead. The test is validating the right behavior (truncation occurs), but the boundary check (> 600) is loose. Not a bug, just slightly confusing as the actual formatted content with headings for doc a alone is ~220+ bytes.

**[NIT]** The test `TestLoadMatchingDocs_ContextSizeGuard` uses `MaxBytes: 350` with content of 200 bytes each and checks `len(content) > 600`. The content after metadata (headings, newlines) will be well above 350 bytes due to formatting overhead. The test is validating the right behavior (truncation occurs), but the boundary check (`> 600`) is loose. Not a bug, just slightly confusing as the actual formatted content with headings for doc a alone is ~220+ bytes.
Author
Owner

Self-review against 9670a5fda3

Assessment: Clean

No issues found — ready for human review.

Checks:

  • Error handling: All errors wrapped/propagated. Nil files slice from Step 3 failure safely handled.
  • Coherence: Single purpose (doc-map injection). All changes necessary.
  • Fits codebase: VCS-fetched docs match conventions-file pattern. Interface in consumer package.
  • Solution: VCS API approach correct. ** glob without new dependency. Size guard prevents context overflow.
  • Completeness: All 10 acceptance criteria met. Full test coverage. CHANGELOG created.
Self-review against 9670a5fda358a4d46b8ccd72b61eaea601d6e020 Assessment: ✅ Clean No issues found — ready for human review. **Checks:** - Error handling: All errors wrapped/propagated. Nil `files` slice from Step 3 failure safely handled. ✅ - Coherence: Single purpose (doc-map injection). All changes necessary. ✅ - Fits codebase: VCS-fetched docs match `conventions-file` pattern. Interface in consumer package. ✅ - Solution: VCS API approach correct. `**` glob without new dependency. Size guard prevents context overflow. ✅ - Completeness: All 10 acceptance criteria met. Full test coverage. CHANGELOG created. ✅
rodin added the self-reviewedneeds-review labels 2026-05-15 03:27:40 +00:00
security-review-bot requested review from security-review-bot 2026-05-15 03:28:03 +00:00
security-review-bot approved these changes 2026-05-15 03:28:03 +00:00
security-review-bot left a comment
Collaborator

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 9670a5fd)

Security Review

Summary

The changes add a path-scoped design doc injection feature with careful workspace path validation and use of the approved YAML parser. CI passed and the implementation avoids obvious SSRF, injection, or auth issues. I identified a few defense-in-depth hardening opportunities around resource limits and LLM prompt injection handling.

Findings

# Severity File Line Finding
1 [MINOR] review/docmap.go 168 Potential denial-of-service via unbounded directory fetch: LoadMatchingDocs calls GetAllFilesInPath which returns full contents of all files under a path before applying the MaxBytes cap. A malicious or misconfigured doc-map could point to a very large directory, causing excessive memory/network usage prior to truncation.
2 [MINOR] budget/budget.go 187 Prompt-injection hardening: Design docs (which are ultimately repository-controlled data) are injected into the system prompt without explicit instruction separation. While content is fetched from the default branch (reducing attacker control via PR), best practice is to clearly treat docs as data and direct the model not to follow any instructions contained within them to mitigate prompt injection.
3 [MINOR] review/docmap.go 229 Path hardening: doc paths from the local YAML are passed directly to GetAllFilesInPath/GetFileContent. Although the VCS API should constrain paths to the repo, adding local validation (e.g., reject absolute paths, clean and deny any '..' segments) would provide defense-in-depth against potential backend path handling quirks.

Recommendation

APPROVE — Overall, the feature is implemented securely with good input validation for the local doc-map file and appropriate use of the approved YAML library. To strengthen the security posture: (1) Add resource-exhaustion guards to doc fetching by avoiding full directory content retrieval before enforcing limits—prefer a two-phase approach (list metadata first, then fetch files incrementally until the byte budget is reached), and consider caps on number of files and per-file size. (2) When embedding design docs, include explicit prompt-guardrails in the system message that instruct the model to treat design-doc content as reference data and to ignore any instructions found within that content, or move the docs to the user prompt with clear data boundaries to mitigate prompt injection risks. (3) Sanitize doc paths from the YAML (reject absolute paths and any path traversal components) before invoking VCS API calls for defense-in-depth. With these adjustments, the changes would further reduce DoS and LLM prompt manipulation risks while maintaining the intended functionality.


Review by security


Evaluated against 9670a5fd

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/138#pullrequestreview-3809) for up-to-date findings. <details><summary>Previous findings (commit 9670a5fd)</summary> # Security Review ## Summary The changes add a path-scoped design doc injection feature with careful workspace path validation and use of the approved YAML parser. CI passed and the implementation avoids obvious SSRF, injection, or auth issues. I identified a few defense-in-depth hardening opportunities around resource limits and LLM prompt injection handling. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `review/docmap.go` | 168 | Potential denial-of-service via unbounded directory fetch: LoadMatchingDocs calls GetAllFilesInPath which returns full contents of all files under a path before applying the MaxBytes cap. A malicious or misconfigured doc-map could point to a very large directory, causing excessive memory/network usage prior to truncation. | | 2 | [MINOR] | `budget/budget.go` | 187 | Prompt-injection hardening: Design docs (which are ultimately repository-controlled data) are injected into the system prompt without explicit instruction separation. While content is fetched from the default branch (reducing attacker control via PR), best practice is to clearly treat docs as data and direct the model not to follow any instructions contained within them to mitigate prompt injection. | | 3 | [MINOR] | `review/docmap.go` | 229 | Path hardening: doc paths from the local YAML are passed directly to GetAllFilesInPath/GetFileContent. Although the VCS API should constrain paths to the repo, adding local validation (e.g., reject absolute paths, clean and deny any '..' segments) would provide defense-in-depth against potential backend path handling quirks. | ## Recommendation **APPROVE** — Overall, the feature is implemented securely with good input validation for the local doc-map file and appropriate use of the approved YAML library. To strengthen the security posture: (1) Add resource-exhaustion guards to doc fetching by avoiding full directory content retrieval before enforcing limits—prefer a two-phase approach (list metadata first, then fetch files incrementally until the byte budget is reached), and consider caps on number of files and per-file size. (2) When embedding design docs, include explicit prompt-guardrails in the system message that instruct the model to treat design-doc content as reference data and to ignore any instructions found within that content, or move the docs to the user prompt with clear data boundaries to mitigate prompt injection risks. (3) Sanitize doc paths from the YAML (reject absolute paths and any path traversal components) before invoking VCS API calls for defense-in-depth. With these adjustments, the changes would further reduce DoS and LLM prompt manipulation risks while maintaining the intended functionality. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 9670a5fd* </details> <!-- review-bot:security -->
@@ -185,6 +187,10 @@ func buildResult(s Sections, trimmed []string, estTokens int) Result {
sys.WriteString("\n\n## Repository Conventions\n\nThe repository has the following coding conventions that must be respected:\n\n")
Collaborator

[MINOR] Prompt-injection hardening: Design docs (which are ultimately repository-controlled data) are injected into the system prompt without explicit instruction separation. While content is fetched from the default branch (reducing attacker control via PR), best practice is to clearly treat docs as data and direct the model not to follow any instructions contained within them to mitigate prompt injection.

**[MINOR]** Prompt-injection hardening: Design docs (which are ultimately repository-controlled data) are injected into the system prompt without explicit instruction separation. While content is fetched from the default branch (reducing attacker control via PR), best practice is to clearly treat docs as data and direct the model not to follow any instructions contained within them to mitigate prompt injection.
security-review-bot marked this conversation as resolved
@@ -0,0 +165,4 @@
// a formatted string suitable for injection into the system prompt.
//
// Behavior:
// - Paths that look like directories (end with /, or GetAllFilesInPath returns files)
Collaborator

[MINOR] Potential denial-of-service via unbounded directory fetch: LoadMatchingDocs calls GetAllFilesInPath which returns full contents of all files under a path before applying the MaxBytes cap. A malicious or misconfigured doc-map could point to a very large directory, causing excessive memory/network usage prior to truncation.

**[MINOR]** Potential denial-of-service via unbounded directory fetch: LoadMatchingDocs calls GetAllFilesInPath which returns full contents of all files under a path before applying the MaxBytes cap. A malicious or misconfigured doc-map could point to a very large directory, causing excessive memory/network usage prior to truncation.
security-review-bot marked this conversation as resolved
@@ -0,0 +226,4 @@
sb.WriteString("\n> ⚠️ (truncated — size limit reached)\n")
}
totalBytes += len(content)
slog.Debug("doc-map: injected doc", "path", entry.path, "bytes", len(content))
Collaborator

[MINOR] Path hardening: doc paths from the local YAML are passed directly to GetAllFilesInPath/GetFileContent. Although the VCS API should constrain paths to the repo, adding local validation (e.g., reject absolute paths, clean and deny any '..' segments) would provide defense-in-depth against potential backend path handling quirks.

**[MINOR]** Path hardening: doc paths from the local YAML are passed directly to GetAllFilesInPath/GetFileContent. Although the VCS API should constrain paths to the repo, adding local validation (e.g., reject absolute paths, clean and deny any '..' segments) would provide defense-in-depth against potential backend path handling quirks.
security-review-bot marked this conversation as resolved
gpt-review-bot approved these changes 2026-05-15 03:28:03 +00:00
gpt-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 9670a5fd)

Gpt Review

Summary

Solid implementation of path-scoped design doc injection with clear wiring through flags, action inputs, budgeting, and comprehensive tests. Minor documentation and comment consistency issues noted, but functionally and idiomatically sound.

Findings

# Severity File Line Finding
1 [MINOR] budget/budget.go 1 Top-of-file comment still states trim order as 'patterns → conventions → file context → diff truncation' but code now includes 'design docs' between conventions and file context. Update the comment to reflect the new order to avoid confusion.
2 [MINOR] README.md 1 The 'Action Inputs' table does not list the new 'doc-map' and 'doc-map-max-bytes' inputs, even though they're mentioned elsewhere. Consider adding them to keep the reference table complete.
3 [NIT] review/docmap.go 1 Comment for globMatch mentions 'Standard path.Match patterns' while code uses filepath.Match. Consider aligning the comment to say 'filepath.Match patterns' for accuracy.
4 [NIT] review/docmap.go 53 In ParseDocMapConfig fallback, the returned error on second unmarshal failure wraps the original strict-mode error (err), potentially obscuring the relaxed unmarshal error (err2). Consider returning or including the second error for clearer diagnostics.
5 [NIT] review/docmap.go 232 sortDocEntries uses a manual insertion sort; consider using sort.Slice for simplicity and readability since the standard library provides it.

Recommendation

APPROVE — The feature is well-implemented: action inputs are correctly added and passed via environment variables, CLI flags are introduced and validated, documents are matched and loaded via VCS with robust glob matching and a sensible size cap, and the budgeting layer integrates the new 'DesignDocs' section in the appropriate trim order. Tests are comprehensive and cover parsing, matching, directory expansion, size guards, and error handling, aligning with the repository's testing conventions. CI is green. Please address the minor documentation inconsistencies (trim order comment and README inputs table) and consider the small comment and error-reporting nits for clarity. Otherwise, this is good to merge.


Review by gpt


Evaluated against 9670a5fd

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/138#pullrequestreview-3810) for up-to-date findings. <details><summary>Previous findings (commit 9670a5fd)</summary> # Gpt Review ## Summary Solid implementation of path-scoped design doc injection with clear wiring through flags, action inputs, budgeting, and comprehensive tests. Minor documentation and comment consistency issues noted, but functionally and idiomatically sound. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `budget/budget.go` | 1 | Top-of-file comment still states trim order as 'patterns → conventions → file context → diff truncation' but code now includes 'design docs' between conventions and file context. Update the comment to reflect the new order to avoid confusion. | | 2 | [MINOR] | `README.md` | 1 | The 'Action Inputs' table does not list the new 'doc-map' and 'doc-map-max-bytes' inputs, even though they're mentioned elsewhere. Consider adding them to keep the reference table complete. | | 3 | [NIT] | `review/docmap.go` | 1 | Comment for globMatch mentions 'Standard path.Match patterns' while code uses filepath.Match. Consider aligning the comment to say 'filepath.Match patterns' for accuracy. | | 4 | [NIT] | `review/docmap.go` | 53 | In ParseDocMapConfig fallback, the returned error on second unmarshal failure wraps the original strict-mode error (err), potentially obscuring the relaxed unmarshal error (err2). Consider returning or including the second error for clearer diagnostics. | | 5 | [NIT] | `review/docmap.go` | 232 | sortDocEntries uses a manual insertion sort; consider using sort.Slice for simplicity and readability since the standard library provides it. | ## Recommendation **APPROVE** — The feature is well-implemented: action inputs are correctly added and passed via environment variables, CLI flags are introduced and validated, documents are matched and loaded via VCS with robust glob matching and a sensible size cap, and the budgeting layer integrates the new 'DesignDocs' section in the appropriate trim order. Tests are comprehensive and cover parsing, matching, directory expansion, size guards, and error handling, aligning with the repository's testing conventions. CI is green. Please address the minor documentation inconsistencies (trim order comment and README inputs table) and consider the small comment and error-reporting nits for clarity. Otherwise, this is good to merge. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 9670a5fd* </details> <!-- review-bot:gpt -->
review/docmap.go Outdated
@@ -0,0 +1,303 @@
// Package review provides doc-map parsing and doc injection for path-scoped
First-time contributor

[NIT] Comment for globMatch mentions 'Standard path.Match patterns' while code uses filepath.Match. Consider aligning the comment to say 'filepath.Match patterns' for accuracy.

**[NIT]** Comment for globMatch mentions 'Standard path.Match patterns' while code uses filepath.Match. Consider aligning the comment to say 'filepath.Match patterns' for accuracy.
@@ -0,0 +50,4 @@
// Unknown top-level keys produce a warning but are not fatal.
func ParseDocMapConfig(localPath string) (*DocMapConfig, error) {
data, err := readFileBytes(localPath)
if err != nil {
First-time contributor

[NIT] In ParseDocMapConfig fallback, the returned error on second unmarshal failure wraps the original strict-mode error (err), potentially obscuring the relaxed unmarshal error (err2). Consider returning or including the second error for clearer diagnostics.

**[NIT]** In ParseDocMapConfig fallback, the returned error on second unmarshal failure wraps the original strict-mode error (err), potentially obscuring the relaxed unmarshal error (err2). Consider returning or including the second error for clearer diagnostics.
@@ -0,0 +229,4 @@
slog.Debug("doc-map: injected doc", "path", entry.path, "bytes", len(content))
}
}
First-time contributor

[NIT] sortDocEntries uses a manual insertion sort; consider using sort.Slice for simplicity and readability since the standard library provides it.

**[NIT]** sortDocEntries uses a manual insertion sort; consider using sort.Slice for simplicity and readability since the standard library provides it.
rodin added 5 commits 2026-05-15 03:32:32 +00:00
- Fix package comment collision: convert to file comment (not package doc)
- Add debug log for directory expansion failure before single-file fallback
- Add validateDocPath: reject absolute paths and '..' segments (security #3)
- Update globMatch comment to say 'filepath.Match' not 'path.Match' (gpt nit #3)
- Add duplication note to truncateUTF8 explaining why it's kept separate (sonnet #2)
- Update package comment trim order to include design docs (gpt #1)
- Add prompt injection guardrail for DesignDocs section (security #2)
chore(#137): update CHANGELOG with security fixes from review
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 44s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m31s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m48s
f7815b8778
rodin removed the needs-reviewself-reviewed labels 2026-05-15 03:32:44 +00:00
aweiker was assigned by rodin 2026-05-15 03:32:48 +00:00
rodin added the readyself-reviewed labels 2026-05-15 03:32:56 +00:00
sonnet-review-bot approved these changes 2026-05-15 03:33:32 +00:00
sonnet-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit f7815b87)

Sonnet Review

Summary

A well-structured feature addition implementing path-scoped design document injection. The code follows established Go patterns, is well-tested, and handles security concerns (path traversal) and error cases correctly. A few minor issues worth noting but none are blockers.

Findings

# Severity File Line Finding
1 [MINOR] review/docmap.go 57 ParseDocMapConfig uses yaml.Strict() first, then re-parses with yaml.Unmarshal on failure to get the 'unknown keys' warning. This double-parse is slightly wasteful but more importantly, it means the error message logged is the strict-mode error (which describes the unknown key) rather than a custom message. The current approach works correctly but could be simplified: just always use non-strict mode and rely on the YAML library's warning mechanisms if available, or accept that users won't get per-key detail. The current approach is acceptable given the go-yaml library's API.
2 [MINOR] review/docmap.go 170 LoadMatchingDocs writes a truncation notice ("⚠️ Design document context truncated — size limit reached.") to the string builder when limitReached is set inside the outer loop over entries, but this notice can be written multiple times if the outer loop continues processing additional docPaths after limitReached is already set. The outer loop does break on limitReached, so the notice in the inner loop (line 229) is the concern. Looking more carefully: the inner loop only writes the notice once then breaks, but if the same doc triggers truncation, the outer notice at line 229 fires then the outer loop also has a break due to limitReached. The flow is actually correct — the inner notice fires at most once (truncated flag set, limitReached set, inner break), and the outer loop breaks at the top. No double-write occurs. However, this is subtle enough to warrant a clarifying comment.
3 [MINOR] review/docmap.go 302 sortDocEntries uses an insertion sort implemented manually. The standard library's sort.Slice would be idiomatic and more readable: sort.Slice(entries, func(i, j int) bool { return entries[i].path < entries[j].path }). The comment says "doc lists are small" which justifies the O(n²) complexity, but the standard library sort is both clearer and handles all sizes correctly. This is a NIT-level style issue per project conventions.
4 [NIT] review/docmap.go 1 The package doc comment is a single-line imperative description rather than the conventional // Package review ... format documented in the Go patterns (documentation.md pattern #3). The comment reads // doc-map parsing and doc injection for path-scoped design document context in AI code reviews. — but this is a file-level comment, not a package comment. The review package presumably has a package comment elsewhere; this file comment is fine as a file-level description, though it's atypical (most Go files don't have pre-package file comments).
5 [NIT] review/docmap_test.go 14 The fakeDocFetcher struct fields lack alignment/formatting: dirs map[string]map[string]string // dir path -> (file path -> content) — the comment is useful but gofmt may reformat the spacing. Not a correctness issue.
6 [NIT] PLAN-137.md 1 Committing a PLAN file to the repository is unusual and will persist in the repo history as a planning artifact. Consider whether this file should be in the PR description or a wiki instead. That said, some projects intentionally keep ADRs/plans in the repo, so this is a process note rather than a code issue.

Recommendation

APPROVE — Approve. The implementation is solid: the DocFetcher interface keeps the review package decoupled from cmd-level dependencies (correct application of the interface pattern), the glob matching correctly handles **, path traversal is defended at both the config-reading level and before VCS calls, the budget integration is clean with the right trimming priority, and test coverage is thorough (parse, match, load, dir expansion, truncation, path traversal rejection). The duplicated truncateUTF8 is explicitly acknowledged with a justification comment, which is the right approach. The minor findings are cosmetic or informational and do not affect correctness or security.


Review by sonnet


Evaluated against f7815b87

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/138#pullrequestreview-3811) for up-to-date findings. <details><summary>Previous findings (commit f7815b87)</summary> # Sonnet Review ## Summary A well-structured feature addition implementing path-scoped design document injection. The code follows established Go patterns, is well-tested, and handles security concerns (path traversal) and error cases correctly. A few minor issues worth noting but none are blockers. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `review/docmap.go` | 57 | ParseDocMapConfig uses yaml.Strict() first, then re-parses with yaml.Unmarshal on failure to get the 'unknown keys' warning. This double-parse is slightly wasteful but more importantly, it means the error message logged is the strict-mode error (which describes the unknown key) rather than a custom message. The current approach works correctly but could be simplified: just always use non-strict mode and rely on the YAML library's warning mechanisms if available, or accept that users won't get per-key detail. The current approach is acceptable given the go-yaml library's API. | | 2 | [MINOR] | `review/docmap.go` | 170 | LoadMatchingDocs writes a truncation notice ("⚠️ Design document context truncated — size limit reached.") to the string builder when limitReached is set inside the outer loop over entries, but this notice can be written multiple times if the outer loop continues processing additional docPaths after limitReached is already set. The outer loop does break on limitReached, so the notice in the inner loop (line 229) is the concern. Looking more carefully: the inner loop only writes the notice once then breaks, but if the same doc triggers truncation, the outer notice at line 229 fires then the outer loop also has a break due to limitReached. The flow is actually correct — the inner notice fires at most once (truncated flag set, limitReached set, inner break), and the outer loop breaks at the top. No double-write occurs. However, this is subtle enough to warrant a clarifying comment. | | 3 | [MINOR] | `review/docmap.go` | 302 | sortDocEntries uses an insertion sort implemented manually. The standard library's sort.Slice would be idiomatic and more readable: `sort.Slice(entries, func(i, j int) bool { return entries[i].path < entries[j].path })`. The comment says "doc lists are small" which justifies the O(n²) complexity, but the standard library sort is both clearer and handles all sizes correctly. This is a NIT-level style issue per project conventions. | | 4 | [NIT] | `review/docmap.go` | 1 | The package doc comment is a single-line imperative description rather than the conventional `// Package review ...` format documented in the Go patterns (documentation.md pattern #3). The comment reads `// doc-map parsing and doc injection for path-scoped design document context in AI code reviews.` — but this is a file-level comment, not a package comment. The review package presumably has a package comment elsewhere; this file comment is fine as a file-level description, though it's atypical (most Go files don't have pre-package file comments). | | 5 | [NIT] | `review/docmap_test.go` | 14 | The fakeDocFetcher struct fields lack alignment/formatting: `dirs map[string]map[string]string // dir path -> (file path -> content)` — the comment is useful but gofmt may reformat the spacing. Not a correctness issue. | | 6 | [NIT] | `PLAN-137.md` | 1 | Committing a PLAN file to the repository is unusual and will persist in the repo history as a planning artifact. Consider whether this file should be in the PR description or a wiki instead. That said, some projects intentionally keep ADRs/plans in the repo, so this is a process note rather than a code issue. | ## Recommendation **APPROVE** — Approve. The implementation is solid: the DocFetcher interface keeps the review package decoupled from cmd-level dependencies (correct application of the interface pattern), the glob matching correctly handles `**`, path traversal is defended at both the config-reading level and before VCS calls, the budget integration is clean with the right trimming priority, and test coverage is thorough (parse, match, load, dir expansion, truncation, path traversal rejection). The duplicated truncateUTF8 is explicitly acknowledged with a justification comment, which is the right approach. The minor findings are cosmetic or informational and do not affect correctness or security. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against f7815b87* </details> <!-- review-bot:sonnet -->
PLAN-137.md Outdated
@@ -0,0 +1,194 @@
# Plan: Issue #137 — doc-map input for path-scoped doc injection
First-time contributor

[NIT] Committing a PLAN file to the repository is unusual and will persist in the repo history as a planning artifact. Consider whether this file should be in the PR description or a wiki instead. That said, some projects intentionally keep ADRs/plans in the repo, so this is a process note rather than a code issue.

**[NIT]** Committing a PLAN file to the repository is unusual and will persist in the repo history as a planning artifact. Consider whether this file should be in the PR description or a wiki instead. That said, some projects intentionally keep ADRs/plans in the repo, so this is a process note rather than a code issue.
@@ -0,0 +1,332 @@
// doc-map parsing and doc injection for path-scoped design document context in AI code reviews.
First-time contributor

[NIT] The package doc comment is a single-line imperative description rather than the conventional // Package review ... format documented in the Go patterns (documentation.md pattern #3). The comment reads // doc-map parsing and doc injection for path-scoped design document context in AI code reviews. — but this is a file-level comment, not a package comment. The review package presumably has a package comment elsewhere; this file comment is fine as a file-level description, though it's atypical (most Go files don't have pre-package file comments).

**[NIT]** The package doc comment is a single-line imperative description rather than the conventional `// Package review ...` format documented in the Go patterns (documentation.md pattern #3). The comment reads `// doc-map parsing and doc injection for path-scoped design document context in AI code reviews.` — but this is a file-level comment, not a package comment. The review package presumably has a package comment elsewhere; this file comment is fine as a file-level description, though it's atypical (most Go files don't have pre-package file comments).
@@ -0,0 +54,4 @@
}
var cfg DocMapConfig
if err := yaml.UnmarshalWithOptions(data, &cfg, yaml.Strict()); err != nil {
First-time contributor

[MINOR] ParseDocMapConfig uses yaml.Strict() first, then re-parses with yaml.Unmarshal on failure to get the 'unknown keys' warning. This double-parse is slightly wasteful but more importantly, it means the error message logged is the strict-mode error (which describes the unknown key) rather than a custom message. The current approach works correctly but could be simplified: just always use non-strict mode and rely on the YAML library's warning mechanisms if available, or accept that users won't get per-key detail. The current approach is acceptable given the go-yaml library's API.

**[MINOR]** ParseDocMapConfig uses yaml.Strict() first, then re-parses with yaml.Unmarshal on failure to get the 'unknown keys' warning. This double-parse is slightly wasteful but more importantly, it means the error message logged is the strict-mode error (which describes the unknown key) rather than a custom message. The current approach works correctly but could be simplified: just always use non-strict mode and rely on the YAML library's warning mechanisms if available, or accept that users won't get per-key detail. The current approach is acceptable given the go-yaml library's API.
@@ -0,0 +167,4 @@
// - Paths that look like directories (end with /, or GetAllFilesInPath returns files)
// are expanded to all .md files under them.
// - Missing files are logged as warnings and skipped.
// - Total content is capped at opts.MaxBytes; truncation is noted inline.
First-time contributor

[MINOR] LoadMatchingDocs writes a truncation notice ("⚠️ Design document context truncated — size limit reached.") to the string builder when limitReached is set inside the outer loop over entries, but this notice can be written multiple times if the outer loop continues processing additional docPaths after limitReached is already set. The outer loop does break on limitReached, so the notice in the inner loop (line 229) is the concern. Looking more carefully: the inner loop only writes the notice once then breaks, but if the same doc triggers truncation, the outer notice at line 229 fires then the outer loop also has a break due to limitReached. The flow is actually correct — the inner notice fires at most once (truncated flag set, limitReached set, inner break), and the outer loop breaks at the top. No double-write occurs. However, this is subtle enough to warrant a clarifying comment.

**[MINOR]** LoadMatchingDocs writes a truncation notice ("⚠️ Design document context truncated — size limit reached.") to the string builder when limitReached is set inside the outer loop over entries, but this notice can be written multiple times if the outer loop continues processing additional docPaths after limitReached is already set. The outer loop does break on limitReached, so the notice in the inner loop (line 229) is the concern. Looking more carefully: the inner loop only writes the notice once then breaks, but if the same doc triggers truncation, the outer notice at line 229 fires then the outer loop also has a break due to limitReached. The flow is actually correct — the inner notice fires at most once (truncated flag set, limitReached set, inner break), and the outer loop breaks at the top. No double-write occurs. However, this is subtle enough to warrant a clarifying comment.
@@ -0,0 +299,4 @@
}
// validateDocPath rejects doc paths that could cause path traversal via the
// VCS API (absolute paths, any ".." segment). Defense-in-depth: the VCS API
First-time contributor

[MINOR] sortDocEntries uses an insertion sort implemented manually. The standard library's sort.Slice would be idiomatic and more readable: sort.Slice(entries, func(i, j int) bool { return entries[i].path < entries[j].path }). The comment says "doc lists are small" which justifies the O(n²) complexity, but the standard library sort is both clearer and handles all sizes correctly. This is a NIT-level style issue per project conventions.

**[MINOR]** sortDocEntries uses an insertion sort implemented manually. The standard library's sort.Slice would be idiomatic and more readable: `sort.Slice(entries, func(i, j int) bool { return entries[i].path < entries[j].path })`. The comment says "doc lists are small" which justifies the O(n²) complexity, but the standard library sort is both clearer and handles all sizes correctly. This is a NIT-level style issue per project conventions.
@@ -0,0 +11,4 @@
// fakeDocFetcher is a mock DocFetcher for tests.
type fakeDocFetcher struct {
files map[string]string // path -> content
First-time contributor

[NIT] The fakeDocFetcher struct fields lack alignment/formatting: dirs map[string]map[string]string // dir path -> (file path -> content) — the comment is useful but gofmt may reformat the spacing. Not a correctness issue.

**[NIT]** The fakeDocFetcher struct fields lack alignment/formatting: `dirs map[string]map[string]string // dir path -> (file path -> content)` — the comment is useful but gofmt may reformat the spacing. Not a correctness issue.
security-review-bot requested review from security-review-bot 2026-05-15 03:34:21 +00:00
security-review-bot approved these changes 2026-05-15 03:34:21 +00:00
security-review-bot left a comment
Collaborator

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit f7815b87)

Security Review

Summary

The PR adds path-scoped design doc injection with careful input validation, prompt-injection mitigation, and secure handling in the composite action. Overall, changes are security-conscious; the only concern is potential resource exhaustion when expanding directories of docs.

Findings

# Severity File Line Finding
1 [MINOR] review/docmap.go 235 Potential resource exhaustion: loadDocEntries uses GetAllFilesInPath to retrieve all files and their full contents for a directory, with no cap on number of files or total bytes fetched from the VCS before applying the post-fetch MaxBytes limit. A large directory of .md files could cause high memory/network usage before truncation.
2 [MINOR] review/docmap.go 296 Path validation defense-in-depth: validateDocPath rejects absolute paths and literal '..' segments but does not normalize or reject percent-encoded traversal (e.g., %2e%2e), backslashes on Windows, or control characters. While doc-map is typically trusted, adding stricter validation would harden against edge cases.

Recommendation

APPROVE — Overall, the implementation is robust: the composite action continues to guard against SSRF, the new doc-map is validated within the workspace, VCS fetches are restricted to repo paths with traversal checks, and design docs are injected with explicit prompt-injection guidance. To harden further against resource exhaustion, avoid fetching entire directories in one call: add a list-only API (e.g., GetFileListInPath) and then fetch file contents one-by-one, stopping once MaxBytes or a reasonable file count cap is reached (e.g., 50 files). Alternatively, enforce a maximum number of files and per-file size limits when expanding directories. For path validation, normalize inputs and reject percent-encoded traversal, backslashes, and non-printable characters. With these adjustments, the feature will be more resilient to accidental or malicious large inputs. Since CI passed and no exploitable vulnerabilities were found, approval is recommended.


Review by security


Evaluated against f7815b87

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/138#pullrequestreview-3813) for up-to-date findings. <details><summary>Previous findings (commit f7815b87)</summary> # Security Review ## Summary The PR adds path-scoped design doc injection with careful input validation, prompt-injection mitigation, and secure handling in the composite action. Overall, changes are security-conscious; the only concern is potential resource exhaustion when expanding directories of docs. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `review/docmap.go` | 235 | Potential resource exhaustion: loadDocEntries uses GetAllFilesInPath to retrieve all files and their full contents for a directory, with no cap on number of files or total bytes fetched from the VCS before applying the post-fetch MaxBytes limit. A large directory of .md files could cause high memory/network usage before truncation. | | 2 | [MINOR] | `review/docmap.go` | 296 | Path validation defense-in-depth: validateDocPath rejects absolute paths and literal '..' segments but does not normalize or reject percent-encoded traversal (e.g., %2e%2e), backslashes on Windows, or control characters. While doc-map is typically trusted, adding stricter validation would harden against edge cases. | ## Recommendation **APPROVE** — Overall, the implementation is robust: the composite action continues to guard against SSRF, the new doc-map is validated within the workspace, VCS fetches are restricted to repo paths with traversal checks, and design docs are injected with explicit prompt-injection guidance. To harden further against resource exhaustion, avoid fetching entire directories in one call: add a list-only API (e.g., GetFileListInPath) and then fetch file contents one-by-one, stopping once MaxBytes or a reasonable file count cap is reached (e.g., 50 files). Alternatively, enforce a maximum number of files and per-file size limits when expanding directories. For path validation, normalize inputs and reject percent-encoded traversal, backslashes, and non-printable characters. With these adjustments, the feature will be more resilient to accidental or malicious large inputs. Since CI passed and no exploitable vulnerabilities were found, approval is recommended. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against f7815b87* </details> <!-- review-bot:security -->
@@ -0,0 +232,4 @@
if sb.Len() == 0 {
return "", nil
}
return sb.String(), nil
Collaborator

[MINOR] Potential resource exhaustion: loadDocEntries uses GetAllFilesInPath to retrieve all files and their full contents for a directory, with no cap on number of files or total bytes fetched from the VCS before applying the post-fetch MaxBytes limit. A large directory of .md files could cause high memory/network usage before truncation.

**[MINOR]** Potential resource exhaustion: loadDocEntries uses GetAllFilesInPath to retrieve all files and their full contents for a directory, with no cap on number of files or total bytes fetched from the VCS before applying the post-fetch MaxBytes limit. A large directory of .md files could cause high memory/network usage before truncation.
security-review-bot marked this conversation as resolved
@@ -0,0 +293,4 @@
}
}
// readFileBytes reads the contents of a local file.
Collaborator

[MINOR] Path validation defense-in-depth: validateDocPath rejects absolute paths and literal '..' segments but does not normalize or reject percent-encoded traversal (e.g., %2e%2e), backslashes on Windows, or control characters. While doc-map is typically trusted, adding stricter validation would harden against edge cases.

**[MINOR]** Path validation defense-in-depth: validateDocPath rejects absolute paths and literal '..' segments but does not normalize or reject percent-encoded traversal (e.g., %2e%2e), backslashes on Windows, or control characters. While doc-map is typically trusted, adding stricter validation would harden against edge cases.
security-review-bot marked this conversation as resolved
gpt-review-bot approved these changes 2026-05-15 03:34:22 +00:00
gpt-review-bot left a comment
First-time contributor

Gpt Review

Summary

Solid addition of path-scoped design doc injection with clear budgeting, secure path validation, and thorough tests. CI passed and the implementation aligns with repo conventions and error-handling patterns.

Findings

# Severity File Line Finding
1 [MINOR] cmd/review-bot/main.go 375 LoadMatchingDocs is checked for a non-nil error, but LoadMatchingDocs currently never returns an error (it logs and skips per-file issues). This check is redundant and can be removed, or LoadMatchingDocs should propagate an aggregated error if intended.
2 [NIT] review/docmap.go 158 Comment in splitPath says "Clean and split on "/"" but the implementation does not clean the path, it only splits. Update the comment to avoid confusion or add cleaning if truly intended.

Recommendation

APPROVE — The feature is well-implemented: action inputs are wired correctly, budgeting gained a dedicated DesignDocs section with proper trim order, the doc-map parser handles unknown keys leniently, path traversal is guarded, glob matching supports **, and tests cover key cases including directory expansion, size limits, and bad paths. CI passed, so approval is appropriate. Consider removing the redundant error check after LoadMatchingDocs in main.go (or have LoadMatchingDocs propagate an aggregated error if that's desired), and adjust the misleading splitPath comment in review/docmap.go. Otherwise, this is ready to merge.


Review by gpt


Evaluated against f7815b87

# Gpt Review ## Summary Solid addition of path-scoped design doc injection with clear budgeting, secure path validation, and thorough tests. CI passed and the implementation aligns with repo conventions and error-handling patterns. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `cmd/review-bot/main.go` | 375 | LoadMatchingDocs is checked for a non-nil error, but LoadMatchingDocs currently never returns an error (it logs and skips per-file issues). This check is redundant and can be removed, or LoadMatchingDocs should propagate an aggregated error if intended. | | 2 | [NIT] | `review/docmap.go` | 158 | Comment in splitPath says "Clean and split on \"/\"" but the implementation does not clean the path, it only splits. Update the comment to avoid confusion or add cleaning if truly intended. | ## Recommendation **APPROVE** — The feature is well-implemented: action inputs are wired correctly, budgeting gained a dedicated DesignDocs section with proper trim order, the doc-map parser handles unknown keys leniently, path traversal is guarded, glob matching supports **, and tests cover key cases including directory expansion, size limits, and bad paths. CI passed, so approval is appropriate. Consider removing the redundant error check after LoadMatchingDocs in main.go (or have LoadMatchingDocs propagate an aggregated error if that's desired), and adjust the misleading splitPath comment in review/docmap.go. Otherwise, this is ready to merge. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against f7815b87*
@@ -353,0 +372,4 @@
changedPaths = append(changedPaths, f.Filename)
}
matchedDocs := review.MatchDocs(docMapCfg, changedPaths)
First-time contributor

[MINOR] LoadMatchingDocs is checked for a non-nil error, but LoadMatchingDocs currently never returns an error (it logs and skips per-file issues). This check is redundant and can be removed, or LoadMatchingDocs should propagate an aggregated error if intended.

**[MINOR]** LoadMatchingDocs is checked for a non-nil error, but LoadMatchingDocs currently never returns an error (it logs and skips per-file issues). This check is redundant and can be removed, or LoadMatchingDocs should propagate an aggregated error if intended.
@@ -0,0 +155,4 @@
}
patParts = patParts[1:]
pathParts = pathParts[1:]
}
First-time contributor

[NIT] Comment in splitPath says "Clean and split on "/"" but the implementation does not clean the path, it only splits. Update the comment to avoid confusion or add cleaning if truly intended.

**[NIT]** Comment in splitPath says "Clean and split on \"/\"" but the implementation does not clean the path, it only splits. Update the comment to avoid confusion or add cleaning if truly intended.
rodin added 1 commit 2026-05-15 03:37:20 +00:00
test(budget): add DesignDocs tests; replace PLAN-137 with clean design doc
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 48s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m52s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Failing after 2m8s
60c6bd9f49
- budget/budget_test.go: add TestFit_DesignDocsInSystemPrompt,
  TestFit_DesignDocsTrimmedBeforeFileContext, TestFit_DesignDocsEmptyNoHeading
  to cover the new DesignDocs section through Fit() and buildResult()
- Remove PLAN-137.md (contained raw thinking stream, not suitable as repo doc)
- Add docs/DESIGN-137-doc-map.md with clean architectural decision record
Author
Owner

Self-review against 60c6bd9f49

Assessment: Clean

No issues found — ready for human review.

Checks performed:

  • All exported functions tested: ParseDocMapConfig, MatchDocs, LoadMatchingDocs, globMatch (via TestGlobMatch)
  • budget.DesignDocs path through Fit() and buildResult() covered by 3 new tests
  • Security: validateDocPath (absolute paths, .. traversal); validateWorkspacePath for local config file
  • Prompt injection guardrail in buildResult() for DesignDocs section
  • PLAN-137.md removed (contained raw thinking stream); replaced with clean docs/DESIGN-137-doc-map.md
  • CHANGELOG.md and README.md updated
  • make precommit clean (go vet, go test ./...)
Self-review against 60c6bd9f498ab4425a910fcd946b00d920f0de4a Assessment: ✅ Clean No issues found — ready for human review. **Checks performed:** - All exported functions tested: `ParseDocMapConfig`, `MatchDocs`, `LoadMatchingDocs`, `globMatch` (via `TestGlobMatch`) - `budget.DesignDocs` path through `Fit()` and `buildResult()` covered by 3 new tests - Security: `validateDocPath` (absolute paths, `..` traversal); `validateWorkspacePath` for local config file - Prompt injection guardrail in `buildResult()` for `DesignDocs` section - `PLAN-137.md` removed (contained raw thinking stream); replaced with clean `docs/DESIGN-137-doc-map.md` - `CHANGELOG.md` and `README.md` updated - `make precommit` clean (`go vet`, `go test ./...`)
sonnet-review-bot approved these changes 2026-05-15 03:38:27 +00:00
sonnet-review-bot left a comment
First-time contributor

Sonnet Review

Summary

A well-implemented feature that adds path-scoped design doc injection. The implementation is clean, idiomatic Go with good test coverage, proper error handling, and thoughtful security measures (path traversal guards, prompt injection notice). A few minor observations around code structure and edge cases, but nothing blocking.

Findings

# Severity File Line Finding
1 [MINOR] review/docmap.go 199 The LoadMatchingDocs function has a subtle double-write bug for the truncation notice. When available <= 0 is detected at the start of the inner loop, it writes a truncation notice and breaks. But when len(content) > available causes a mid-content truncation, it writes a different truncation notice inline. Then, on the next outer-loop iteration, limitReached is true and we hit the if limitReached { break } before writing another notice. This works correctly, but there are two different truncation messages (one at the outer level `

⚠️ Design document context truncated — size limit reached.
and one inline
⚠️ (truncated — size limit reached)
). These are inconsistent and the outer-level one can appear in the middle of a document section if availablehappens to be exactly 0 at the start of an iteration. The logic is functionally correct but fragile — a future refactor could easily break the invariant that only one message appears. | | 2 | [MINOR] |review/docmap.go| 162 | TheloadDocEntriesfunction silently falls through from a successful-but-emptyGetAllFilesInPathcall to aGetFileContentcall. If a user specifies a directory path likedocs/domain/(with trailing slash) and the directory exists but has no files, the function will then try to fetchdocs/domain/as a single file, which will likely fail with an obscure error. The debug log message only fires whendirErr != nil, not when the directory exists but is empty. This means the caller's debug log "doc-map: no .md files found under path"inLoadMatchingDocswill never fire for this case — instead it falls to the file-fetch error path. Consider: ifGetAllFilesInPathreturns nil error with an empty map, and the path ends with/, skip the single-file fallback and return empty entries directly. | | 3 | [MINOR] | review/docmap.go| 89 | TheParseDocMapConfigstrict-mode fallback parses twice on any strict-mode failure, not just unknown-key failures. If the YAML is structurally valid but has a type mismatch (e.g.,paths: "not-a-list"), the strict parse fails, the relaxed parse also fails (returning an error), and the function returns the *strict-mode error* wrapping localPath. But for unknown keys, it succeeds on the relaxed parse and logs a warning. The comment says "Re-parse without strict mode to log which keys are unknown" — but the code doesn't actually log *which* keys are unknown; it just logs the original strict error. The behavior is correct (lenient for unknown keys, strict for structural issues) but the comment is misleading about what gets logged. | | 4 | [NIT] | review/docmap.go| 22 | The package doc comment is at the file level but not in the// Package review ...form. Per the documentation pattern, the package-level comment (if this is the first file users will encounter for new functionality) should be// Package review .... However, since this is a new file in an existing package and the primary package doc presumably exists elsewhere, a file-level comment is acceptable. The comment could be more explicit: // Package review provides ...or simply be a file comment. Minor style nit. | | 5 | [NIT] |review/docmap_test.go| 10 | ThefakeDocFetcherstruct has inconsistent alignment on the field comments —dirshas a trailing space before the comment to align withfiles. This would normally be fixed by gofmt automatically, but gofmt aligns struct field comments differently from map literal comments. Not a real issue, just a minor style note. | | 6 | [NIT] | review/docmap.go| 253 |sortDocEntriesuses an insertion sort described as "doc lists are small". This is fine, but Go'ssort.Sliceorslices.SortFunc` (Go 1.21+) would be more idiomatic and readable with no meaningful performance difference at these sizes. The comment justifying insertion sort is unnecessary ceremony for a standard sort. |

Recommendation

APPROVE — APPROVE. The implementation is solid: correct glob semantics with ** support, proper path traversal validation (both for the doc-map YAML file via validateWorkspacePath and for doc paths via validateDocPath), prompt injection guard in the budget section header, graceful degradation on missing files, context size limiting with inline truncation notices, and comprehensive test coverage. The CI passes.

The MINOR findings are worth noting for follow-up but do not block merging: (1) the double truncation-message paths are functionally correct but could confuse future maintainers; (2) the empty-directory fallthrough to single-file-fetch could produce confusing errors for trailing-slash directory paths that exist but are empty. These are edge cases that the test suite doesn't currently exercise. Consider adding a test for an empty directory path with trailing slash to catch the fallthrough behavior.


Review by sonnet


Evaluated against 60c6bd9f

# Sonnet Review ## Summary A well-implemented feature that adds path-scoped design doc injection. The implementation is clean, idiomatic Go with good test coverage, proper error handling, and thoughtful security measures (path traversal guards, prompt injection notice). A few minor observations around code structure and edge cases, but nothing blocking. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `review/docmap.go` | 199 | The `LoadMatchingDocs` function has a subtle double-write bug for the truncation notice. When `available <= 0` is detected at the start of the inner loop, it writes a truncation notice and breaks. But when `len(content) > available` causes a mid-content truncation, it writes a *different* truncation notice inline. Then, on the next outer-loop iteration, `limitReached` is true and we hit the `if limitReached { break }` before writing another notice. This works correctly, but there are two different truncation messages (one at the outer level ` > ⚠️ Design document context truncated — size limit reached. ` and one inline ` > ⚠️ (truncated — size limit reached) `). These are inconsistent and the outer-level one can appear in the middle of a document section if `available` happens to be exactly 0 at the start of an iteration. The logic is functionally correct but fragile — a future refactor could easily break the invariant that only one message appears. | | 2 | [MINOR] | `review/docmap.go` | 162 | The `loadDocEntries` function silently falls through from a successful-but-empty `GetAllFilesInPath` call to a `GetFileContent` call. If a user specifies a directory path like `docs/domain/` (with trailing slash) and the directory exists but has no files, the function will then try to fetch `docs/domain/` as a single file, which will likely fail with an obscure error. The debug log message only fires when `dirErr != nil`, not when the directory exists but is empty. This means the caller's debug log `"doc-map: no .md files found under path"` in `LoadMatchingDocs` will never fire for this case — instead it falls to the file-fetch error path. Consider: if `GetAllFilesInPath` returns nil error with an empty map, and the path ends with `/`, skip the single-file fallback and return empty entries directly. | | 3 | [MINOR] | `review/docmap.go` | 89 | The `ParseDocMapConfig` strict-mode fallback parses twice on any strict-mode failure, not just unknown-key failures. If the YAML is structurally valid but has a type mismatch (e.g., `paths: "not-a-list"`), the strict parse fails, the relaxed parse also fails (returning an error), and the function returns the *strict-mode error* wrapping `localPath`. But for unknown keys, it succeeds on the relaxed parse and logs a warning. The comment says "Re-parse without strict mode to log which keys are unknown" — but the code doesn't actually log *which* keys are unknown; it just logs the original strict error. The behavior is correct (lenient for unknown keys, strict for structural issues) but the comment is misleading about what gets logged. | | 4 | [NIT] | `review/docmap.go` | 22 | The package doc comment is at the file level but not in the `// Package review ...` form. Per the documentation pattern, the package-level comment (if this is the first file users will encounter for new functionality) should be `// Package review ...`. However, since this is a new file in an existing package and the primary package doc presumably exists elsewhere, a file-level comment is acceptable. The comment could be more explicit: `// Package review provides ...` or simply be a file comment. Minor style nit. | | 5 | [NIT] | `review/docmap_test.go` | 10 | The `fakeDocFetcher` struct has inconsistent alignment on the field comments — `dirs` has a trailing space before the comment to align with `files`. This would normally be fixed by gofmt automatically, but gofmt aligns struct field comments differently from map literal comments. Not a real issue, just a minor style note. | | 6 | [NIT] | `review/docmap.go` | 253 | `sortDocEntries` uses an insertion sort described as "doc lists are small". This is fine, but Go's `sort.Slice` or `slices.SortFunc` (Go 1.21+) would be more idiomatic and readable with no meaningful performance difference at these sizes. The comment justifying insertion sort is unnecessary ceremony for a standard sort. | ## Recommendation **APPROVE** — APPROVE. The implementation is solid: correct glob semantics with `**` support, proper path traversal validation (both for the doc-map YAML file via `validateWorkspacePath` and for doc paths via `validateDocPath`), prompt injection guard in the budget section header, graceful degradation on missing files, context size limiting with inline truncation notices, and comprehensive test coverage. The CI passes. The MINOR findings are worth noting for follow-up but do not block merging: (1) the double truncation-message paths are functionally correct but could confuse future maintainers; (2) the empty-directory fallthrough to single-file-fetch could produce confusing errors for trailing-slash directory paths that exist but are empty. These are edge cases that the test suite doesn't currently exercise. Consider adding a test for an empty directory path with trailing slash to catch the fallthrough behavior. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 60c6bd9f*
@@ -0,0 +19,4 @@
)
// DocMapping maps a set of path glob patterns to governing doc files/directories.
type DocMapping struct {
First-time contributor

[NIT] The package doc comment is at the file level but not in the // Package review ... form. Per the documentation pattern, the package-level comment (if this is the first file users will encounter for new functionality) should be // Package review .... However, since this is a new file in an existing package and the primary package doc presumably exists elsewhere, a file-level comment is acceptable. The comment could be more explicit: // Package review provides ... or simply be a file comment. Minor style nit.

**[NIT]** The package doc comment is at the file level but not in the `// Package review ...` form. Per the documentation pattern, the package-level comment (if this is the first file users will encounter for new functionality) should be `// Package review ...`. However, since this is a new file in an existing package and the primary package doc presumably exists elsewhere, a file-level comment is acceptable. The comment could be more explicit: `// Package review provides ...` or simply be a file comment. Minor style nit.
@@ -0,0 +86,4 @@
result = append(result, doc)
}
}
}
First-time contributor

[MINOR] The ParseDocMapConfig strict-mode fallback parses twice on any strict-mode failure, not just unknown-key failures. If the YAML is structurally valid but has a type mismatch (e.g., paths: "not-a-list"), the strict parse fails, the relaxed parse also fails (returning an error), and the function returns the strict-mode error wrapping localPath. But for unknown keys, it succeeds on the relaxed parse and logs a warning. The comment says "Re-parse without strict mode to log which keys are unknown" — but the code doesn't actually log which keys are unknown; it just logs the original strict error. The behavior is correct (lenient for unknown keys, strict for structural issues) but the comment is misleading about what gets logged.

**[MINOR]** The `ParseDocMapConfig` strict-mode fallback parses twice on any strict-mode failure, not just unknown-key failures. If the YAML is structurally valid but has a type mismatch (e.g., `paths: "not-a-list"`), the strict parse fails, the relaxed parse also fails (returning an error), and the function returns the *strict-mode error* wrapping `localPath`. But for unknown keys, it succeeds on the relaxed parse and logs a warning. The comment says "Re-parse without strict mode to log which keys are unknown" — but the code doesn't actually log *which* keys are unknown; it just logs the original strict error. The behavior is correct (lenient for unknown keys, strict for structural issues) but the comment is misleading about what gets logged.
@@ -0,0 +159,4 @@
// All pattern parts consumed; path must also be consumed.
return len(pathParts) == 0
}
First-time contributor

[MINOR] The loadDocEntries function silently falls through from a successful-but-empty GetAllFilesInPath call to a GetFileContent call. If a user specifies a directory path like docs/domain/ (with trailing slash) and the directory exists but has no files, the function will then try to fetch docs/domain/ as a single file, which will likely fail with an obscure error. The debug log message only fires when dirErr != nil, not when the directory exists but is empty. This means the caller's debug log "doc-map: no .md files found under path" in LoadMatchingDocs will never fire for this case — instead it falls to the file-fetch error path. Consider: if GetAllFilesInPath returns nil error with an empty map, and the path ends with /, skip the single-file fallback and return empty entries directly.

**[MINOR]** The `loadDocEntries` function silently falls through from a successful-but-empty `GetAllFilesInPath` call to a `GetFileContent` call. If a user specifies a directory path like `docs/domain/` (with trailing slash) and the directory exists but has no files, the function will then try to fetch `docs/domain/` as a single file, which will likely fail with an obscure error. The debug log message only fires when `dirErr != nil`, not when the directory exists but is empty. This means the caller's debug log `"doc-map: no .md files found under path"` in `LoadMatchingDocs` will never fire for this case — instead it falls to the file-fetch error path. Consider: if `GetAllFilesInPath` returns nil error with an empty map, and the path ends with `/`, skip the single-file fallback and return empty entries directly.
@@ -0,0 +196,4 @@
slog.Debug("doc-map: no .md files found under path", "path", docPath)
continue
}
First-time contributor

[MINOR] The LoadMatchingDocs function has a subtle double-write bug for the truncation notice. When available <= 0 is detected at the start of the inner loop, it writes a truncation notice and breaks. But when len(content) > available causes a mid-content truncation, it writes a different truncation notice inline. Then, on the next outer-loop iteration, limitReached is true and we hit the if limitReached { break } before writing another notice. This works correctly, but there are two different truncation messages (one at the outer level `

⚠️ Design document context truncated — size limit reached.
and one inline
⚠️ (truncated — size limit reached)
). These are inconsistent and the outer-level one can appear in the middle of a document section if available` happens to be exactly 0 at the start of an iteration. The logic is functionally correct but fragile — a future refactor could easily break the invariant that only one message appears.

**[MINOR]** The `LoadMatchingDocs` function has a subtle double-write bug for the truncation notice. When `available <= 0` is detected at the start of the inner loop, it writes a truncation notice and breaks. But when `len(content) > available` causes a mid-content truncation, it writes a *different* truncation notice inline. Then, on the next outer-loop iteration, `limitReached` is true and we hit the `if limitReached { break }` before writing another notice. This works correctly, but there are two different truncation messages (one at the outer level ` > ⚠️ Design document context truncated — size limit reached. ` and one inline ` > ⚠️ (truncated — size limit reached) `). These are inconsistent and the outer-level one can appear in the middle of a document section if `available` happens to be exactly 0 at the start of an iteration. The logic is functionally correct but fragile — a future refactor could easily break the invariant that only one message appears.
@@ -0,0 +250,4 @@
}
// Try directory expansion first.
files, dirErr := fetcher.GetAllFilesInPath(ctx, owner, repo, docPath)
First-time contributor

[NIT] sortDocEntries uses an insertion sort described as "doc lists are small". This is fine, but Go's sort.Slice or slices.SortFunc (Go 1.21+) would be more idiomatic and readable with no meaningful performance difference at these sizes. The comment justifying insertion sort is unnecessary ceremony for a standard sort.

**[NIT]** `sortDocEntries` uses an insertion sort described as "doc lists are small". This is fine, but Go's `sort.Slice` or `slices.SortFunc` (Go 1.21+) would be more idiomatic and readable with no meaningful performance difference at these sizes. The comment justifying insertion sort is unnecessary ceremony for a standard sort.
@@ -0,0 +7,4 @@
"path/filepath"
"strings"
"testing"
)
First-time contributor

[NIT] The fakeDocFetcher struct has inconsistent alignment on the field comments — dirs has a trailing space before the comment to align with files. This would normally be fixed by gofmt automatically, but gofmt aligns struct field comments differently from map literal comments. Not a real issue, just a minor style note.

**[NIT]** The `fakeDocFetcher` struct has inconsistent alignment on the field comments — `dirs` has a trailing space before the comment to align with `files`. This would normally be fixed by gofmt automatically, but gofmt aligns struct field comments differently from map literal comments. Not a real issue, just a minor style note.
security-review-bot requested review from security-review-bot 2026-05-15 03:39:31 +00:00
security-review-bot requested changes 2026-05-15 03:39:31 +00:00
security-review-bot left a comment
Collaborator

Security Review

Summary

The feature is implemented thoughtfully with budget limits and path traversal checks, but there is a critical data exposure risk. Loading the doc-map from the PR workspace allows an untrusted PR author to cause arbitrary repository documents (from the default branch) to be sent to external LLM providers.

Findings

# Severity File Line Finding
1 [MAJOR] cmd/review-bot/main.go 352 Doc-map configuration is read from the PR workspace and used to fetch arbitrary docs from the repository (default branch) via the VCS API, which are then sent to external LLMs. An untrusted PR author can modify the doc-map file in their PR to map any changed path to sensitive repository documents, exfiltrating content outside the organization and potentially into PR comments via the model's response. Mitigations: fetch the doc-map from a trusted reference (e.g., default/protected branch) rather than the PR branch, or disable doc-map on untrusted PRs; additionally allowlist permissible doc path prefixes (e.g., only under docs/) and/or require maintainer opt-in before using doc-map on a PR.

Recommendation

REQUEST_CHANGES — The new doc-map functionality is well-integrated and includes defenses against SSRF and path traversal, plus context-size limits and prompt-injection caution. However, because the doc-map file is read from the PR's workspace, a malicious PR author can alter the doc-map to cause the system to fetch and inject arbitrary documents from the repository's default branch into the LLM request, potentially exposing sensitive content to third-party providers or into PR comments. To resolve this, do one or more of the following: (1) Load the doc-map file from a trusted ref (e.g., default/protected branch) instead of the PR branch; (2) Disable or ignore doc-map on untrusted PRs (e.g., from forks) unless explicitly approved; (3) Enforce an allowlist of doc path prefixes (e.g., only under docs/), and reject patterns that match broad directories (like repo root) to limit exposure; (4) Optionally add a policy flag to require maintainer approval before doc-map-derived content is used. After addressing this data exfiltration vector, the feature appears secure and robust.


Review by security


Evaluated against 60c6bd9f

# Security Review ## Summary The feature is implemented thoughtfully with budget limits and path traversal checks, but there is a critical data exposure risk. Loading the doc-map from the PR workspace allows an untrusted PR author to cause arbitrary repository documents (from the default branch) to be sent to external LLM providers. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MAJOR] | `cmd/review-bot/main.go` | 352 | Doc-map configuration is read from the PR workspace and used to fetch arbitrary docs from the repository (default branch) via the VCS API, which are then sent to external LLMs. An untrusted PR author can modify the doc-map file in their PR to map any changed path to sensitive repository documents, exfiltrating content outside the organization and potentially into PR comments via the model's response. Mitigations: fetch the doc-map from a trusted reference (e.g., default/protected branch) rather than the PR branch, or disable doc-map on untrusted PRs; additionally allowlist permissible doc path prefixes (e.g., only under docs/) and/or require maintainer opt-in before using doc-map on a PR. | ## Recommendation **REQUEST_CHANGES** — The new doc-map functionality is well-integrated and includes defenses against SSRF and path traversal, plus context-size limits and prompt-injection caution. However, because the doc-map file is read from the PR's workspace, a malicious PR author can alter the doc-map to cause the system to fetch and inject arbitrary documents from the repository's default branch into the LLM request, potentially exposing sensitive content to third-party providers or into PR comments. To resolve this, do one or more of the following: (1) Load the doc-map file from a trusted ref (e.g., default/protected branch) instead of the PR branch; (2) Disable or ignore doc-map on untrusted PRs (e.g., from forks) unless explicitly approved; (3) Enforce an allowlist of doc path prefixes (e.g., only under docs/), and reject patterns that match broad directories (like repo root) to limit exposure; (4) Optionally add a policy flag to require maintainer approval before doc-map-derived content is used. After addressing this data exfiltration vector, the feature appears secure and robust. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 60c6bd9f*
@@ -350,6 +352,46 @@ func main() {
slog.Debug("loaded system prompt file", "file", *systemPromptFile, "bytes", len(additionalPrompt))
Collaborator

[MAJOR] Doc-map configuration is read from the PR workspace and used to fetch arbitrary docs from the repository (default branch) via the VCS API, which are then sent to external LLMs. An untrusted PR author can modify the doc-map file in their PR to map any changed path to sensitive repository documents, exfiltrating content outside the organization and potentially into PR comments via the model's response. Mitigations: fetch the doc-map from a trusted reference (e.g., default/protected branch) rather than the PR branch, or disable doc-map on untrusted PRs; additionally allowlist permissible doc path prefixes (e.g., only under docs/) and/or require maintainer opt-in before using doc-map on a PR.

**[MAJOR]** Doc-map configuration is read from the PR workspace and used to fetch arbitrary docs from the repository (default branch) via the VCS API, which are then sent to external LLMs. An untrusted PR author can modify the doc-map file in their PR to map any changed path to sensitive repository documents, exfiltrating content outside the organization and potentially into PR comments via the model's response. Mitigations: fetch the doc-map from a trusted reference (e.g., default/protected branch) rather than the PR branch, or disable doc-map on untrusted PRs; additionally allowlist permissible doc path prefixes (e.g., only under docs/) and/or require maintainer opt-in before using doc-map on a PR.
rodin merged commit 1e3d86b604 into main 2026-05-15 03:39:37 +00:00
Sign in to join this conversation.
4 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: rodin/review-bot#138