fix: address remote persona review findings #63

Closed
rodin wants to merge 2 commits from issue-60-remote-personas into main
Owner

Summary

Addresses review findings from the remote persona feature (PR #61).

Changes

MAJOR fixes:

  • Removed duplicate flag.Parse() call
  • Fixed nil map panic in LoadRemotePersonas error path — now assigns empty map when LoadRemotePersonas returns an error, preventing the subsequent map lookup from panicking

MINOR fixes:

  • Tightened isNotFoundError to only check for "HTTP 404" (removed broad "not found" substring check that could cause false positives)
  • Cleaned up personaErr variable scope — replaced with narrower-scoped err variables in each branch
  • Added proper doc comment to LoadRemotePersonasFromPath starting with the function name (Go convention)
  • Added file count cap (50 files) in LoadRemotePersonasFromPath to prevent resource exhaustion from repos with thousands of small YAML files

Testing

  • All tests pass locally
  • Updated isNotFoundError test expectation to match the tightened behavior
## Summary Addresses review findings from the remote persona feature (PR #61). ## Changes ### MAJOR fixes: - Removed duplicate `flag.Parse()` call - Fixed nil map panic in LoadRemotePersonas error path — now assigns empty map when `LoadRemotePersonas` returns an error, preventing the subsequent map lookup from panicking ### MINOR fixes: - Tightened `isNotFoundError` to only check for `"HTTP 404"` (removed broad `"not found"` substring check that could cause false positives) - Cleaned up `personaErr` variable scope — replaced with narrower-scoped `err` variables in each branch - Added proper doc comment to `LoadRemotePersonasFromPath` starting with the function name (Go convention) - Added file count cap (50 files) in `LoadRemotePersonasFromPath` to prevent resource exhaustion from repos with thousands of small YAML files ## Testing - All tests pass locally - Updated `isNotFoundError` test expectation to match the tightened behavior
rodin added 2 commits 2026-05-11 03:45:32 +00:00
feat: load personas from target repo .review-bot/personas/
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 8m12s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 8m15s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 42s
2f8d047ef2
Adds support for repository-specific personas. When --persona is
specified, review-bot now:

1. Checks the target repo's .review-bot/personas/<name>.yaml directory
2. Falls back to built-in persona if not found in repo

This allows repos to define domain-specific personas (trading, regulatory,
etc.) or override built-in personas with project-specific rules, without
requiring changes to CI configuration.

Implementation:
- New review.PersonaFetcher interface for abstracting Gitea API access
- review.LoadRemotePersonas() with graceful fallback on 404
- review.MergePersonas() for combining remote and built-in personas
- giteaFetcher adapter in main.go to bridge gitea.Client

The feature follows a partial-success model: invalid YAML files or
network errors for individual persona files are logged and skipped,
allowing other valid personas to load.

Closes #60
fix: address PR #62 review findings
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m5s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m40s
5fac8bc505
- Remove duplicate flag.Parse() call
- Fix nil map panic in LoadRemotePersonas error path by assigning
  empty map when LoadRemotePersonas returns an error
- Tighten isNotFoundError to only check HTTP 404 (remove broad
  'not found' substring check to avoid false positives)
- Clean up personaErr variable scope using narrower-scoped err variables
- Add proper doc comment to LoadRemotePersonasFromPath (Go convention)
- Add file count cap (50 files) in LoadRemotePersonasFromPath to
  prevent resource exhaustion from repos with thousands of small files
- Update test expectation for tightened isNotFoundError
sonnet-review-bot approved these changes 2026-05-11 03:46:15 +00:00
sonnet-review-bot left a comment
First-time contributor

Sonnet Review

Summary

This PR correctly addresses the review findings from PR #61. The duplicate flag.Parse() removal, nil map panic fix, tightened isNotFoundError, variable scope cleanup, doc comment addition, and file count cap are all implemented correctly. The tests are comprehensive and cover the key behaviors well.

Findings

# Severity File Line Finding
1 [MINOR] cmd/review-bot/main.go 197 The err2 variable name in the personaFile branch is a workaround for shadowing from the outer err in the personaName branch, but the two branches are mutually exclusive (else if). A cleaner approach would be to use a fresh err variable inside the else-if block with its own scope: if err := validateWorkspacePath(...); err != nil { ... } followed by persona, err = review.LoadPersona(resolvedPath) where err is declared at block scope. The current err2 naming is functional but slightly non-idiomatic.
2 [NIT] review/remote_persona.go 78 The processed counter increments only when a persona is successfully parsed, but the cap check if processed >= maxPersonaFiles happens at the top of the loop. This means the cap applies to successfully-loaded personas, not to total files iterated. If there are 100 entries where 49 are invalid YAML and 51 are valid, all 51 valid ones will be processed (the limit is checked against processed, which only increments on success). This is arguably the right behavior but might be surprising — consider a comment clarifying that the cap applies to successfully loaded personas, not files visited.

Recommendation

APPROVE — The PR is ready to merge. The two MAJOR bugs (duplicate flag.Parse() and nil map panic) are fixed correctly. The isNotFoundError tightening is correct and the tests reflect the new behavior. The LoadRemotePersonasFromPath doc comment follows Go conventions. The file cap at 50 is a reasonable defense-in-depth measure. The two findings above are minor style/clarity issues that don't affect correctness.


Review by sonnet


Evaluated against 5fac8bc5

# Sonnet Review ## Summary This PR correctly addresses the review findings from PR #61. The duplicate `flag.Parse()` removal, nil map panic fix, tightened `isNotFoundError`, variable scope cleanup, doc comment addition, and file count cap are all implemented correctly. The tests are comprehensive and cover the key behaviors well. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `cmd/review-bot/main.go` | 197 | The `err2` variable name in the `personaFile` branch is a workaround for shadowing from the outer `err` in the `personaName` branch, but the two branches are mutually exclusive (else if). A cleaner approach would be to use a fresh `err` variable inside the else-if block with its own scope: `if err := validateWorkspacePath(...); err != nil { ... }` followed by `persona, err = review.LoadPersona(resolvedPath)` where `err` is declared at block scope. The current `err2` naming is functional but slightly non-idiomatic. | | 2 | [NIT] | `review/remote_persona.go` | 78 | The `processed` counter increments only when a persona is successfully parsed, but the cap check `if processed >= maxPersonaFiles` happens at the top of the loop. This means the cap applies to successfully-loaded personas, not to total files iterated. If there are 100 entries where 49 are invalid YAML and 51 are valid, all 51 valid ones will be processed (the limit is checked against `processed`, which only increments on success). This is arguably the right behavior but might be surprising — consider a comment clarifying that the cap applies to successfully loaded personas, not files visited. | ## Recommendation **APPROVE** — The PR is ready to merge. The two MAJOR bugs (duplicate `flag.Parse()` and nil map panic) are fixed correctly. The `isNotFoundError` tightening is correct and the tests reflect the new behavior. The `LoadRemotePersonasFromPath` doc comment follows Go conventions. The file cap at 50 is a reasonable defense-in-depth measure. The two findings above are minor style/clarity issues that don't affect correctness. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 5fac8bc5*
@@ -199,0 +194,4 @@
var err error
persona, err = review.LoadBuiltinPersona(*personaName)
if err != nil {
slog.Error("failed to load persona", "persona", *personaName, "error", err)
First-time contributor

[MINOR] The err2 variable name in the personaFile branch is a workaround for shadowing from the outer err in the personaName branch, but the two branches are mutually exclusive (else if). A cleaner approach would be to use a fresh err variable inside the else-if block with its own scope: if err := validateWorkspacePath(...); err != nil { ... } followed by persona, err = review.LoadPersona(resolvedPath) where err is declared at block scope. The current err2 naming is functional but slightly non-idiomatic.

**[MINOR]** The `err2` variable name in the `personaFile` branch is a workaround for shadowing from the outer `err` in the `personaName` branch, but the two branches are mutually exclusive (else if). A cleaner approach would be to use a fresh `err` variable inside the else-if block with its own scope: `if err := validateWorkspacePath(...); err != nil { ... }` followed by `persona, err = review.LoadPersona(resolvedPath)` where `err` is declared at block scope. The current `err2` naming is functional but slightly non-idiomatic.
@@ -0,0 +75,4 @@
}
if !isYAMLFile(entry.Name) {
continue
}
First-time contributor

[NIT] The processed counter increments only when a persona is successfully parsed, but the cap check if processed >= maxPersonaFiles happens at the top of the loop. This means the cap applies to successfully-loaded personas, not to total files iterated. If there are 100 entries where 49 are invalid YAML and 51 are valid, all 51 valid ones will be processed (the limit is checked against processed, which only increments on success). This is arguably the right behavior but might be surprising — consider a comment clarifying that the cap applies to successfully loaded personas, not files visited.

**[NIT]** The `processed` counter increments only when a persona is successfully parsed, but the cap check `if processed >= maxPersonaFiles` happens at the top of the loop. This means the cap applies to successfully-loaded personas, not to total files iterated. If there are 100 entries where 49 are invalid YAML and 51 are valid, all 51 valid ones will be processed (the limit is checked against `processed`, which only increments on success). This is arguably the right behavior but might be surprising — consider a comment clarifying that the cap applies to successfully loaded personas, not files visited.
gpt-review-bot approved these changes 2026-05-11 03:46:53 +00:00
gpt-review-bot left a comment
First-time contributor

Gpt Review

Summary

Changes are correct and align with the stated fixes: duplicate flag.Parse removed, safer handling of remote personas on error, tightened 404 detection, and added a sensible cap on processed persona files. Tests are comprehensive and CI passed.

Findings

# Severity File Line Finding
1 [NIT] review/remote_persona.go 50 The max files cap uses a counter of successfully processed personas. If many files are invalid or fetch errors occur, the loop may still iterate over a large number of entries. Consider capping attempts rather than successes to fully bound work for repos with many invalid files.

Recommendation

APPROVE — Overall, the PR improves robustness and follows Go conventions: error handling is defensive, comments are clear, and tests reflect tightened behavior. The persona auto-discovery logic is well integrated and avoids circular dependencies by using a simple string check for 404. The only minor suggestion is to cap attempts rather than successfully processed files to further mitigate potential resource usage in pathological repos. With CI passing and no major issues found, this is good to merge.


Review by gpt


Evaluated against 5fac8bc5

# Gpt Review ## Summary Changes are correct and align with the stated fixes: duplicate flag.Parse removed, safer handling of remote personas on error, tightened 404 detection, and added a sensible cap on processed persona files. Tests are comprehensive and CI passed. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [NIT] | `review/remote_persona.go` | 50 | The max files cap uses a counter of successfully processed personas. If many files are invalid or fetch errors occur, the loop may still iterate over a large number of entries. Consider capping attempts rather than successes to fully bound work for repos with many invalid files. | ## Recommendation **APPROVE** — Overall, the PR improves robustness and follows Go conventions: error handling is defensive, comments are clear, and tests reflect tightened behavior. The persona auto-discovery logic is well integrated and avoids circular dependencies by using a simple string check for 404. The only minor suggestion is to cap attempts rather than successfully processed files to further mitigate potential resource usage in pathological repos. With CI passing and no major issues found, this is good to merge. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 5fac8bc5*
@@ -0,0 +47,4 @@
entries, err := fetcher.ListContents(ctx, owner, repo, path)
if err != nil {
// 404 is expected when repo doesn't have personas — return empty, not error
if isNotFoundError(err) {
First-time contributor

[NIT] The max files cap uses a counter of successfully processed personas. If many files are invalid or fetch errors occur, the loop may still iterate over a large number of entries. Consider capping attempts rather than successes to fully bound work for repos with many invalid files.

**[NIT]** The max files cap uses a counter of successfully processed personas. If many files are invalid or fetch errors occur, the loop may still iterate over a large number of entries. Consider capping attempts rather than successes to fully bound work for repos with many invalid files.
security-review-bot requested review from security-review-bot 2026-05-11 03:47:19 +00:00
security-review-bot approved these changes 2026-05-11 03:47:19 +00:00
security-review-bot left a comment
Collaborator

Security Review

Summary

Security-focused changes look sound: remote persona loading adds sensible caps, nil-map handling prevents panics, and error checks were tightened. No exploitable vulnerabilities were introduced by this PR.

Findings

# Severity File Line Finding
1 [MINOR] cmd/review-bot/main.go 188 Remote persona metadata (e.g., DisplayName) is now untrusted input and is later incorporated into the review body. If the Markdown renderer on the Gitea instance is misconfigured or insufficiently sanitizes HTML, this could enable content/HTML injection. Consider sanitizing or escaping persona.DisplayName before embedding it in Markdown.
2 [MINOR] review/remote_persona.go 109 While individual persona files are size-limited and the file count is capped (50), YAML parsers can be susceptible to CPU blow-ups via alias/anchor expansion ('billion laughs'). Consider adding additional safeguards such as limiting alias expansions or total decode time/complexity when parsing YAML to reduce DoS risk.
3 [NIT] review/remote_persona.go 158 isNotFoundError relies on substring matching for "HTTP 404". Although tightened compared to prior behavior, string matching is still fragile; if feasible without introducing dependencies, prefer structured status code checks from the underlying client to avoid misclassification.

Recommendation

APPROVE — Overall, the changes are security-positive: resource-exhaustion mitigations and tighter 404 detection were added, and potential nil-map panics removed. To further harden the system, sanitize or escape untrusted persona metadata (e.g., DisplayName) before embedding in Markdown output, and consider adding YAML parsing safeguards against alias/anchor expansion DoS beyond per-file size limits and file count caps. If practical, replace substring 404 checks with structured status handling to avoid misclassification. With these minor adjustments considered, the PR is approved.


Review by security


Evaluated against 5fac8bc5

# Security Review ## Summary Security-focused changes look sound: remote persona loading adds sensible caps, nil-map handling prevents panics, and error checks were tightened. No exploitable vulnerabilities were introduced by this PR. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `cmd/review-bot/main.go` | 188 | Remote persona metadata (e.g., DisplayName) is now untrusted input and is later incorporated into the review body. If the Markdown renderer on the Gitea instance is misconfigured or insufficiently sanitizes HTML, this could enable content/HTML injection. Consider sanitizing or escaping persona.DisplayName before embedding it in Markdown. | | 2 | [MINOR] | `review/remote_persona.go` | 109 | While individual persona files are size-limited and the file count is capped (50), YAML parsers can be susceptible to CPU blow-ups via alias/anchor expansion ('billion laughs'). Consider adding additional safeguards such as limiting alias expansions or total decode time/complexity when parsing YAML to reduce DoS risk. | | 3 | [NIT] | `review/remote_persona.go` | 158 | isNotFoundError relies on substring matching for "HTTP 404". Although tightened compared to prior behavior, string matching is still fragile; if feasible without introducing dependencies, prefer structured status code checks from the underlying client to avoid misclassification. | ## Recommendation **APPROVE** — Overall, the changes are security-positive: resource-exhaustion mitigations and tighter 404 detection were added, and potential nil-map panics removed. To further harden the system, sanitize or escape untrusted persona metadata (e.g., DisplayName) before embedding in Markdown output, and consider adding YAML parsing safeguards against alias/anchor expansion DoS beyond per-file size limits and file count caps. If practical, replace substring 404 checks with structured status handling to avoid misclassification. With these minor adjustments considered, the PR is approved. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 5fac8bc5*
@@ -199,0 +185,4 @@
// Assign empty map so the lookup below doesn't panic
remotePersonas = map[string]*review.Persona{}
}
Collaborator

[MINOR] Remote persona metadata (e.g., DisplayName) is now untrusted input and is later incorporated into the review body. If the Markdown renderer on the Gitea instance is misconfigured or insufficiently sanitizes HTML, this could enable content/HTML injection. Consider sanitizing or escaping persona.DisplayName before embedding it in Markdown.

**[MINOR]** Remote persona metadata (e.g., DisplayName) is now untrusted input and is later incorporated into the review body. If the Markdown renderer on the Gitea instance is misconfigured or insufficiently sanitizes HTML, this could enable content/HTML injection. Consider sanitizing or escaping persona.DisplayName before embedding it in Markdown.
@@ -0,0 +106,4 @@
// MergePersonas combines remote and built-in personas.
// Remote personas take precedence on name collision.
// Returns the merged map and a list of persona names in sorted order.
func MergePersonas(remote, builtin map[string]*Persona) (map[string]*Persona, []string) {
Collaborator

[MINOR] While individual persona files are size-limited and the file count is capped (50), YAML parsers can be susceptible to CPU blow-ups via alias/anchor expansion ('billion laughs'). Consider adding additional safeguards such as limiting alias expansions or total decode time/complexity when parsing YAML to reduce DoS risk.

**[MINOR]** While individual persona files are size-limited and the file count is capped (50), YAML parsers can be susceptible to CPU blow-ups via alias/anchor expansion ('billion laughs'). Consider adding additional safeguards such as limiting alias expansions or total decode time/complexity when parsing YAML to reduce DoS risk.
@@ -0,0 +155,4 @@
// isNotFoundError checks if an error indicates a 404 response.
// This is a simple string check to avoid importing the gitea package
// (which would create a circular dependency).
func isNotFoundError(err error) bool {
Collaborator

[NIT] isNotFoundError relies on substring matching for "HTTP 404". Although tightened compared to prior behavior, string matching is still fragile; if feasible without introducing dependencies, prefer structured status code checks from the underlying client to avoid misclassification.

**[NIT]** isNotFoundError relies on substring matching for "HTTP 404". Although tightened compared to prior behavior, string matching is still fragile; if feasible without introducing dependencies, prefer structured status code checks from the underlying client to avoid misclassification.
rodin closed this pull request 2026-05-11 03:52:46 +00:00
Author
Owner

Closing as duplicate — PR #61 already merged the feature with a different file structure (repo_persona.go vs remote_persona.go). The file count cap (50 files) from this PR is not in main; will open a follow-up if needed.

Closing as duplicate — PR #61 already merged the feature with a different file structure (`repo_persona.go` vs `remote_persona.go`). The file count cap (50 files) from this PR is not in main; will open a follow-up if needed.
Author
Owner

Addressed all three MINOR findings from review:

1. err2 refactor (sonnet-review-bot, main.go ~197)

  • Refactored to use loadedPersona, loadErr := review.LoadPersona(...) with scoped variables
  • The else-if branches are mutually exclusive, so no need for err2 visible outside the block

2. DisplayName sanitization (security-review-bot, main.go ~188)

  • Added sanitizeMarkdownText() in formatter.go that:
    • Removes control characters and null bytes
    • Escapes Markdown special characters: \*_#<>|~`
  • Applied to both the header title and footer attribution
  • Added comprehensive test suite for the sanitizer

3. YAML DoS documentation (security-review-bot, remote_persona.go ~109)

  • Added detailed comment documenting existing mitigations:
    • File size cap (MaxPersonaFileSize = 64KB)
    • File count cap (maxPersonaFiles = 50)
    • Depth limit (MaxYAMLDepth)
    • Node count limit (MaxYAMLNodes) against billion-laughs expansion
    • Alias cycle detection via seen-node tracking

All tests pass (go test ./... and go vet ./...).

Addressed all three MINOR findings from review: **1. err2 refactor** (sonnet-review-bot, main.go ~197) - Refactored to use `loadedPersona, loadErr := review.LoadPersona(...)` with scoped variables - The else-if branches are mutually exclusive, so no need for `err2` visible outside the block **2. DisplayName sanitization** (security-review-bot, main.go ~188) - Added `sanitizeMarkdownText()` in formatter.go that: - Removes control characters and null bytes - Escapes Markdown special characters: `\*_`[]()#<>|~` - Applied to both the header title and footer attribution - Added comprehensive test suite for the sanitizer **3. YAML DoS documentation** (security-review-bot, remote_persona.go ~109) - Added detailed comment documenting existing mitigations: - File size cap (MaxPersonaFileSize = 64KB) - File count cap (maxPersonaFiles = 50) - Depth limit (MaxYAMLDepth) - Node count limit (MaxYAMLNodes) against billion-laughs expansion - Alias cycle detection via seen-node tracking All tests pass (`go test ./...` and `go vet ./...`).
Some checks are pending
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m5s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m40s

Pull request closed

Sign in to join this conversation.
4 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: rodin/review-bot#63