feat(gitea): harden GetPullRequestDiff against unbounded diff size #109

Merged
aweiker merged 4 commits from review-bot-issue-92 into main 2026-05-13 14:01:22 +00:00
Owner

Summary

Hardens GetPullRequestDiff against unbounded diff size to prevent potential DoS from very large PRs.

Changes

  • Added DefaultMaxDiffSize constant (10 MB) and ErrDiffTooLarge sentinel error
  • Added configurable MaxDiffSize field to Client struct
    • Zero value uses the default (10 MB)
    • Negative value (-1) disables the limit entirely
  • Modified GetPullRequestDiff to enforce the size limit using io.LimitReader
  • Added doGetLimited internal method that reads maxBytes+1 to detect overflow without buffering the full response
  • Added comprehensive tests covering: exceeds limit, within limit, exact boundary, one byte over, disabled limit, and default limit

Testing

  • go test ./... passes
  • go vet ./... clean
  • 6 new test cases in gitea/diff_size_test.go

Closes #92

## Summary Hardens `GetPullRequestDiff` against unbounded diff size to prevent potential DoS from very large PRs. ## Changes - Added `DefaultMaxDiffSize` constant (10 MB) and `ErrDiffTooLarge` sentinel error - Added configurable `MaxDiffSize` field to `Client` struct - Zero value uses the default (10 MB) - Negative value (-1) disables the limit entirely - Modified `GetPullRequestDiff` to enforce the size limit using `io.LimitReader` - Added `doGetLimited` internal method that reads `maxBytes+1` to detect overflow without buffering the full response - Added comprehensive tests covering: exceeds limit, within limit, exact boundary, one byte over, disabled limit, and default limit ## Testing - `go test ./...` passes - `go vet ./...` clean - 6 new test cases in `gitea/diff_size_test.go` Closes #92
sonnet-review-bot approved these changes 2026-05-13 11:58:39 +00:00
sonnet-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 235828ec)

Sonnet Review

Summary

The PR correctly hardens GetPullRequestDiff against unbounded memory allocation using io.LimitReader with a read-one-more-byte overflow detection strategy. The implementation follows established patterns, CI passes, and the test coverage is thorough.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 440 doGetLimited duplicates roughly 80 lines of retry logic that already exists in doGet. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optional maxBytes int64 parameter to doGet (where 0 means unlimited). The current duplication is a maintenance hazard.
2 [MINOR] gitea/client.go 144 The maxSize < 0 check uses the negated condition to dispatch to doGet for the unlimited path. Since DefaultMaxDiffSize is already set when maxSize == 0, the check maxSize < 0 is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicit maxSize == -1 check (or const noLimit int64 = -1) would make the intent clearer and prevent confusion if someone passes e.g. -2.
3 [NIT] gitea/diff_size_test.go 22 The w.Write return values are silently discarded in all test handlers. While harmless in httptest servers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue.

Recommendation

APPROVE — Approve with the understanding that the duplication between doGet and doGetLimited should be addressed in a follow-up refactor. The core feature is correct: the overflow detection via io.LimitReader(resp.Body, maxBytes+1) followed by a length comparison is sound, the sentinel error wraps correctly for errors.Is, the zero/negative conventions are well-documented, and the test matrix (exact boundary, one-over, disabled, default) is exactly what this kind of guard needs. The MINOR findings are maintainability concerns, not correctness bugs, and CI passes.


Review by sonnet


Evaluated against 235828ec

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3183) for up-to-date findings. <details><summary>Previous findings (commit 235828ec)</summary> # Sonnet Review ## Summary The PR correctly hardens `GetPullRequestDiff` against unbounded memory allocation using `io.LimitReader` with a read-one-more-byte overflow detection strategy. The implementation follows established patterns, CI passes, and the test coverage is thorough. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 440 | `doGetLimited` duplicates roughly 80 lines of retry logic that already exists in `doGet`. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optional `maxBytes int64` parameter to `doGet` (where 0 means unlimited). The current duplication is a maintenance hazard. | | 2 | [MINOR] | `gitea/client.go` | 144 | The `maxSize < 0` check uses the negated condition to dispatch to `doGet` for the unlimited path. Since `DefaultMaxDiffSize` is already set when `maxSize == 0`, the check `maxSize < 0` is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicit `maxSize == -1` check (or `const noLimit int64 = -1`) would make the intent clearer and prevent confusion if someone passes e.g. -2. | | 3 | [NIT] | `gitea/diff_size_test.go` | 22 | The `w.Write` return values are silently discarded in all test handlers. While harmless in `httptest` servers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue. | ## Recommendation **APPROVE** — Approve with the understanding that the duplication between `doGet` and `doGetLimited` should be addressed in a follow-up refactor. The core feature is correct: the overflow detection via `io.LimitReader(resp.Body, maxBytes+1)` followed by a length comparison is sound, the sentinel error wraps correctly for `errors.Is`, the zero/negative conventions are well-documented, and the test matrix (exact boundary, one-over, disabled, default) is exactly what this kind of guard needs. The MINOR findings are maintainability concerns, not correctness bugs, and CI passes. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 235828ec* </details> <!-- review-bot:sonnet -->
gitea/client.go Outdated
@@ -130,1 +142,3 @@
body, err := c.doGet(ctx, reqURL)
maxSize := c.MaxDiffSize
if maxSize == 0 {
First-time contributor

[MINOR] The maxSize < 0 check uses the negated condition to dispatch to doGet for the unlimited path. Since DefaultMaxDiffSize is already set when maxSize == 0, the check maxSize < 0 is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicit maxSize == -1 check (or const noLimit int64 = -1) would make the intent clearer and prevent confusion if someone passes e.g. -2.

**[MINOR]** The `maxSize < 0` check uses the negated condition to dispatch to `doGet` for the unlimited path. Since `DefaultMaxDiffSize` is already set when `maxSize == 0`, the check `maxSize < 0` is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicit `maxSize == -1` check (or `const noLimit int64 = -1`) would make the intent clearer and prevent confusion if someone passes e.g. -2.
@@ -413,6 +440,86 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr
First-time contributor

[MINOR] doGetLimited duplicates roughly 80 lines of retry logic that already exists in doGet. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optional maxBytes int64 parameter to doGet (where 0 means unlimited). The current duplication is a maintenance hazard.

**[MINOR]** `doGetLimited` duplicates roughly 80 lines of retry logic that already exists in `doGet`. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optional `maxBytes int64` parameter to `doGet` (where 0 means unlimited). The current duplication is a maintenance hazard.
@@ -0,0 +19,4 @@
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
First-time contributor

[NIT] The w.Write return values are silently discarded in all test handlers. While harmless in httptest servers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue.

**[NIT]** The `w.Write` return values are silently discarded in all test handlers. While harmless in `httptest` servers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue.
security-review-bot requested review from security-review-bot 2026-05-13 11:59:24 +00:00
security-review-bot approved these changes 2026-05-13 11:59:24 +00:00
Dismissed
security-review-bot left a comment
Collaborator

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 235828ec)

Security Review

Summary

The changes introduce a bounded diff fetch with sensible defaults and robust retry/error handling, reducing DoS risk from very large PR diffs. Implementation uses io.LimitReader and returns a sentinel error for oversize responses, with tests covering key scenarios. CI passed.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 490 Adding 1 to maxBytes in io.LimitReader(resp.Body, maxBytes+1) can overflow if maxBytes is math.MaxInt64, resulting in a negative limit and an immediate EOF (empty body) instead of enforcing the intended bound. Consider clamping at MaxInt64 or handling the overflow case explicitly.
2 [NIT] gitea/client.go 71 Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.

Recommendation

APPROVE — Overall, the hardening is sound and reduces resource exhaustion risk when fetching diffs. Please consider guarding against int64 overflow when computing maxBytes+1 in doGetLimited (e.g., clamp to MaxInt64 or treat MaxInt64 as unlimited), and ensure that the configuration to disable the limit is not user-controlled. With these minor considerations, the PR is ready to merge.


Review by security


Evaluated against 235828ec

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3185) for up-to-date findings. <details><summary>Previous findings (commit 235828ec)</summary> # Security Review ## Summary The changes introduce a bounded diff fetch with sensible defaults and robust retry/error handling, reducing DoS risk from very large PR diffs. Implementation uses io.LimitReader and returns a sentinel error for oversize responses, with tests covering key scenarios. CI passed. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 490 | Adding 1 to maxBytes in io.LimitReader(resp.Body, maxBytes+1) can overflow if maxBytes is math.MaxInt64, resulting in a negative limit and an immediate EOF (empty body) instead of enforcing the intended bound. Consider clamping at MaxInt64 or handling the overflow case explicitly. | | 2 | [NIT] | `gitea/client.go` | 71 | Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources. | ## Recommendation **APPROVE** — Overall, the hardening is sound and reduces resource exhaustion risk when fetching diffs. Please consider guarding against int64 overflow when computing maxBytes+1 in doGetLimited (e.g., clamp to MaxInt64 or treat MaxInt64 as unlimited), and ensure that the configuration to disable the limit is not user-controlled. With these minor considerations, the PR is ready to merge. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 235828ec* </details> <!-- review-bot:security -->
gitea/client.go Outdated
@@ -62,2 +68,4 @@
// Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
Collaborator

[NIT] Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.

**[NIT]** Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.
security-review-bot marked this conversation as resolved
gitea/client.go Outdated
@@ -416,0 +487,4 @@
if attempt < maxAttempts-1 && isTemporaryNetError(err) {
slog.Warn("temporary network error, will retry",
"attempt", attempt+1,
"url", redactURL(reqURL),
Collaborator

[MINOR] Adding 1 to maxBytes in io.LimitReader(resp.Body, maxBytes+1) can overflow if maxBytes is math.MaxInt64, resulting in a negative limit and an immediate EOF (empty body) instead of enforcing the intended bound. Consider clamping at MaxInt64 or handling the overflow case explicitly.

**[MINOR]** Adding 1 to maxBytes in io.LimitReader(resp.Body, maxBytes+1) can overflow if maxBytes is math.MaxInt64, resulting in a negative limit and an immediate EOF (empty body) instead of enforcing the intended bound. Consider clamping at MaxInt64 or handling the overflow case explicitly.
security-review-bot marked this conversation as resolved
gpt-review-bot approved these changes 2026-05-13 11:59:35 +00:00
Dismissed
gpt-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 235828ec)

Gpt Review

Summary

Well-implemented size limit for PR diffs with sensible defaults and comprehensive tests. The approach uses io.LimitReader and a sentinel error with error wrapping, aligning with Go patterns. Minor opportunities for clarity and refactoring noted.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 440 doGetLimited largely duplicates retry and error-handling logic from doGet. Consider refactoring shared behavior into a helper to reduce maintenance overhead and risk of drift.
2 [NIT] gitea/client.go 67 MaxDiffSize is read at call time; like RetryBackoff, consider documenting that modifying it while requests are in flight is not safe to avoid potential data races given the Client is intended for concurrent use.

Recommendation

APPROVE — The changes correctly harden GetPullRequestDiff with a configurable default-limited read, use a sentinel error wrapped with context, and include thorough tests for edge cases. The implementation follows Go error handling patterns (errors.Is compatibility) and zero-value usability for configuration. I recommend approval. As follow-ups, consider factoring shared retry/GET logic between doGet and doGetLimited to avoid duplication, and clarify concurrency expectations for MaxDiffSize in the field's comment to match RetryBackoff's guidance.


Review by gpt


Evaluated against 235828ec

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3186) for up-to-date findings. <details><summary>Previous findings (commit 235828ec)</summary> # Gpt Review ## Summary Well-implemented size limit for PR diffs with sensible defaults and comprehensive tests. The approach uses io.LimitReader and a sentinel error with error wrapping, aligning with Go patterns. Minor opportunities for clarity and refactoring noted. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 440 | doGetLimited largely duplicates retry and error-handling logic from doGet. Consider refactoring shared behavior into a helper to reduce maintenance overhead and risk of drift. | | 2 | [NIT] | `gitea/client.go` | 67 | MaxDiffSize is read at call time; like RetryBackoff, consider documenting that modifying it while requests are in flight is not safe to avoid potential data races given the Client is intended for concurrent use. | ## Recommendation **APPROVE** — The changes correctly harden GetPullRequestDiff with a configurable default-limited read, use a sentinel error wrapped with context, and include thorough tests for edge cases. The implementation follows Go error handling patterns (errors.Is compatibility) and zero-value usability for configuration. I recommend approval. As follow-ups, consider factoring shared retry/GET logic between doGet and doGetLimited to avoid duplication, and clarify concurrency expectations for MaxDiffSize in the field's comment to match RetryBackoff's guidance. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 235828ec* </details> <!-- review-bot:gpt -->
@@ -61,6 +67,10 @@ type Client struct {
// This field must be configured before the first request is made.
First-time contributor

[NIT] MaxDiffSize is read at call time; like RetryBackoff, consider documenting that modifying it while requests are in flight is not safe to avoid potential data races given the Client is intended for concurrent use.

**[NIT]** MaxDiffSize is read at call time; like RetryBackoff, consider documenting that modifying it while requests are in flight is not safe to avoid potential data races given the Client is intended for concurrent use.
@@ -413,6 +440,86 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr
First-time contributor

[MINOR] doGetLimited largely duplicates retry and error-handling logic from doGet. Consider refactoring shared behavior into a helper to reduce maintenance overhead and risk of drift.

**[MINOR]** doGetLimited largely duplicates retry and error-handling logic from doGet. Consider refactoring shared behavior into a helper to reduce maintenance overhead and risk of drift.
rodin added the wip label 2026-05-13 12:07:14 +00:00
Author
Owner

Self-Review: PR #109

Self-review against 235828ec42

Phase 1: Independent Findings

  1. [MINOR] doGetLimited duplicates ~65 lines of retry logic from doGet (gitea/client.go ~line 440)
    The only difference between the two methods is the size-limited read on the success path. Any future change to retry logic (backoff strategy, log fields, error classification) must be applied twice. This is a maintenance hazard.

  2. [MINOR] Integer overflow on maxBytes+1 (gitea/client.go ~line 490)
    io.LimitReader(resp.Body, maxBytes+1) overflows to a negative value when maxBytes == math.MaxInt64. A negative LimitReader reads 0 bytes, making the subsequent length check always trigger ErrDiffTooLarge. In practice DefaultMaxDiffSize is 10 MB so this won't happen with the default, but users who pass a very large explicit limit could hit it. A clamp like min(maxBytes+1, math.MaxInt64) (or treating MaxInt64 as unlimited) would fix it.

  3. [MINOR] maxSize < 0 vs documented -1 to disable (gitea/client.go ~line 144)
    The field doc says "Set to -1 to disable the limit" but the code checks maxSize < 0, silently treating -2, -100, etc. identically. Either the check should be == -1 (strict contract) or the doc should say "any negative value disables the limit." The current asymmetry is a trap.

  4. [NIT] MaxDiffSize missing concurrency documentation
    RetryBackoff has an explicit doc comment: "must be configured before the first request is made; modifying it while requests are in flight is not safe." MaxDiffSize has no equivalent guidance, even though the same concurrency constraint applies. Should mirror RetryBackoff's wording.

Phase 2: Prior Review Verification

Finding Reviewer Status Notes
[MINOR] doGetLimited duplicates retry logic sonnet-review-bot (#1) STILL PRESENT Duplication remains in the diff. Accepted as follow-up by all reviewers; not blocking per their APPROVE verdicts.
[MINOR] maxSize < 0 vs -1 semantics sonnet-review-bot (#2) STILL PRESENT Check if maxSize < 0 remains; field doc still says "Set to -1 to disable." Ambiguity not resolved.
[NIT] w.Write return values in tests sonnet-review-bot (#3) STILL PRESENT Test handlers still discard w.Write returns. NIT-level, not blocking.
[MINOR] maxBytes+1 int64 overflow security-review-bot (#1) STILL PRESENT io.LimitReader(resp.Body, maxBytes+1) has no clamp. Real correctness bug for extreme maxBytes values.
[NIT] -1 disable from untrusted input security-review-bot (#2) INVALID Configuration concern outside the scope of this diff; not a code-level finding.
[MINOR] doGetLimited duplication gpt-review-bot (#1) STILL PRESENT Same as sonnet finding #1; accepted as follow-up.
[NIT] MaxDiffSize concurrency doc gpt-review-bot (#2) STILL PRESENT MaxDiffSize field still lacks the concurrency warning present on RetryBackoff.

Assessment: ⚠️ Needs attention

The core feature (bounded diff fetch with io.LimitReader, sentinel error, test coverage) is sound and all three bot reviews APPROVEd. However, two MINOR findings remain unaddressed: the maxBytes+1 integer overflow (security bot #1 — real correctness bug under extreme values) and the < 0 vs -1 semantic ambiguity (sonnet bot #2). The code duplication in doGetLimited was accepted as a follow-up by all reviewers, so that is not blocking. The overflow and -1 ambiguity should be fixed before merge, or explicitly waived by the PR author.

## Self-Review: PR #109 Self-review against 235828ec4233fb71a0558255383ca1d30917826a ### Phase 1: Independent Findings 1. **[MINOR] `doGetLimited` duplicates ~65 lines of retry logic from `doGet`** (`gitea/client.go` ~line 440) The only difference between the two methods is the size-limited read on the success path. Any future change to retry logic (backoff strategy, log fields, error classification) must be applied twice. This is a maintenance hazard. 2. **[MINOR] Integer overflow on `maxBytes+1`** (`gitea/client.go` ~line 490) `io.LimitReader(resp.Body, maxBytes+1)` overflows to a negative value when `maxBytes == math.MaxInt64`. A negative LimitReader reads 0 bytes, making the subsequent length check always trigger `ErrDiffTooLarge`. In practice `DefaultMaxDiffSize` is 10 MB so this won't happen with the default, but users who pass a very large explicit limit could hit it. A clamp like `min(maxBytes+1, math.MaxInt64)` (or treating `MaxInt64` as unlimited) would fix it. 3. **[MINOR] `maxSize < 0` vs documented `-1 to disable`** (`gitea/client.go` ~line 144) The field doc says "Set to -1 to disable the limit" but the code checks `maxSize < 0`, silently treating -2, -100, etc. identically. Either the check should be `== -1` (strict contract) or the doc should say "any negative value disables the limit." The current asymmetry is a trap. 4. **[NIT] `MaxDiffSize` missing concurrency documentation** `RetryBackoff` has an explicit doc comment: "must be configured before the first request is made; modifying it while requests are in flight is not safe." `MaxDiffSize` has no equivalent guidance, even though the same concurrency constraint applies. Should mirror `RetryBackoff`'s wording. ### Phase 2: Prior Review Verification | Finding | Reviewer | Status | Notes | |---------|----------|--------|-------| | [MINOR] `doGetLimited` duplicates retry logic | sonnet-review-bot (#1) | STILL PRESENT | Duplication remains in the diff. Accepted as follow-up by all reviewers; not blocking per their APPROVE verdicts. | | [MINOR] `maxSize < 0` vs `-1` semantics | sonnet-review-bot (#2) | STILL PRESENT | Check `if maxSize < 0` remains; field doc still says "Set to -1 to disable." Ambiguity not resolved. | | [NIT] `w.Write` return values in tests | sonnet-review-bot (#3) | STILL PRESENT | Test handlers still discard `w.Write` returns. NIT-level, not blocking. | | [MINOR] `maxBytes+1` int64 overflow | security-review-bot (#1) | STILL PRESENT | `io.LimitReader(resp.Body, maxBytes+1)` has no clamp. Real correctness bug for extreme `maxBytes` values. | | [NIT] `-1` disable from untrusted input | security-review-bot (#2) | INVALID | Configuration concern outside the scope of this diff; not a code-level finding. | | [MINOR] `doGetLimited` duplication | gpt-review-bot (#1) | STILL PRESENT | Same as sonnet finding #1; accepted as follow-up. | | [NIT] `MaxDiffSize` concurrency doc | gpt-review-bot (#2) | STILL PRESENT | `MaxDiffSize` field still lacks the concurrency warning present on `RetryBackoff`. | ### Assessment: ⚠️ Needs attention The core feature (bounded diff fetch with `io.LimitReader`, sentinel error, test coverage) is sound and all three bot reviews APPROVEd. However, two MINOR findings remain unaddressed: the `maxBytes+1` integer overflow (security bot #1 — real correctness bug under extreme values) and the `< 0` vs `-1` semantic ambiguity (sonnet bot #2). The code duplication in `doGetLimited` was accepted as a follow-up by all reviewers, so that is not blocking. The overflow and -1 ambiguity should be fixed before merge, or explicitly waived by the PR author.
rodin removed the wip label 2026-05-13 12:08:48 +00:00
rodin added the wip label 2026-05-13 12:14:46 +00:00
Author
Owner

Review findings addressed

1. maxBytes+1 integer overflow (security-review-bot)

Fixed. Added overflow clamp before passing to io.LimitReader:

limitBytes := maxBytes + 1
if limitBytes <= 0 {
    limitBytes = math.MaxInt64
}

When maxBytes == math.MaxInt64, maxBytes+1 wraps to math.MinInt64 (negative). The guard catches this and falls back to math.MaxInt64, which effectively means "read everything" — correct behavior since if someone configured the max at MaxInt64, they don't care about the limit.

2. maxSize < 0 vs documented -1 to disable (sonnet-review-bot)

Fixed by updating the doc to say "Set to any negative value to disable the limit", which matches the actual maxSize < 0 check. This is the simpler approach — tightening to == -1 would be a breaking change for no practical benefit, and "any negative disables" is a common Go convention (e.g., strings.Count, bytes.Split).

3. Code duplication in doGetLimited (all reviewers)

Acknowledged as follow-up — not addressed in this PR per reviewer consensus.

## Review findings addressed ### 1. `maxBytes+1` integer overflow (security-review-bot) Fixed. Added overflow clamp before passing to `io.LimitReader`: ```go limitBytes := maxBytes + 1 if limitBytes <= 0 { limitBytes = math.MaxInt64 } ``` When `maxBytes == math.MaxInt64`, `maxBytes+1` wraps to `math.MinInt64` (negative). The guard catches this and falls back to `math.MaxInt64`, which effectively means "read everything" — correct behavior since if someone configured the max at `MaxInt64`, they don't care about the limit. ### 2. `maxSize < 0` vs documented `-1 to disable` (sonnet-review-bot) Fixed by updating the doc to say "Set to any negative value to disable the limit", which matches the actual `maxSize < 0` check. This is the simpler approach — tightening to `== -1` would be a breaking change for no practical benefit, and "any negative disables" is a common Go convention (e.g., `strings.Count`, `bytes.Split`). ### 3. Code duplication in `doGetLimited` (all reviewers) Acknowledged as follow-up — not addressed in this PR per reviewer consensus.
rodin removed the wip label 2026-05-13 12:17:43 +00:00
sonnet-review-bot approved these changes 2026-05-13 12:18:04 +00:00
sonnet-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 1a305092)

Sonnet Review

Summary

The PR correctly hardens GetPullRequestDiff against unbounded diff size. The implementation is sound, follows established patterns in the codebase, and includes comprehensive boundary tests. CI passes.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 441 doGetLimited is ~90 lines and duplicates nearly all of doGet's retry/backoff logic. The two methods share: the attempt loop, backoff wiring, retry-delay select, request construction, header setting, temporary-network-error retry, and error-body reading. The only meaningful difference is the LimitReader + overflow check on the success path. Extracting a shared doGetCore(ctx, reqURL string, processBody func(r io.Reader) ([]byte, error)) ([]byte, error) or similar would eliminate the duplication without sacrificing clarity. As-is, any future change to the retry policy (new backoff defaults, added headers, logging changes) must be applied in both places.
2 [NIT] gitea/client.go 156 The disabled-limit branch (maxSize < 0) wraps the error with 'fetch diff: %w' but the limited branch also wraps with the same prefix inside the return. This is fine, but worth noting that the error string convention is consistent between the two paths.
3 [NIT] gitea/diff_size_test.go 25 The test server handler ignores write errors from w.Write([]byte(largeDiff)). This is idiomatic for test handlers and not a bug, but for completeness the existing codebase pattern is the same so this is consistent.

Recommendation

APPROVE — Approve with a suggestion to track the code-duplication between doGet and doGetLimited as a follow-up. The duplication is the only notable concern: the retry loop, backoff logic, request construction, and error-body reading are identical across both methods. A future bug fix or enhancement to retry behavior would need to be applied in two places. The correct fix would be to refactor the shared scaffolding into a private helper and let doGetLimited supply only the body-processing callback — but this is a non-trivial refactor that goes beyond the scope of this PR, which otherwise is correct, well-tested, and follows the project's conventions.


Review by sonnet


Evaluated against 1a305092

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3187) for up-to-date findings. <details><summary>Previous findings (commit 1a305092)</summary> # Sonnet Review ## Summary The PR correctly hardens GetPullRequestDiff against unbounded diff size. The implementation is sound, follows established patterns in the codebase, and includes comprehensive boundary tests. CI passes. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 441 | doGetLimited is ~90 lines and duplicates nearly all of doGet's retry/backoff logic. The two methods share: the attempt loop, backoff wiring, retry-delay select, request construction, header setting, temporary-network-error retry, and error-body reading. The only meaningful difference is the LimitReader + overflow check on the success path. Extracting a shared doGetCore(ctx, reqURL string, processBody func(r io.Reader) ([]byte, error)) ([]byte, error) or similar would eliminate the duplication without sacrificing clarity. As-is, any future change to the retry policy (new backoff defaults, added headers, logging changes) must be applied in both places. | | 2 | [NIT] | `gitea/client.go` | 156 | The disabled-limit branch (maxSize < 0) wraps the error with 'fetch diff: %w' but the limited branch also wraps with the same prefix inside the return. This is fine, but worth noting that the error string convention is consistent between the two paths. | | 3 | [NIT] | `gitea/diff_size_test.go` | 25 | The test server handler ignores write errors from w.Write([]byte(largeDiff)). This is idiomatic for test handlers and not a bug, but for completeness the existing codebase pattern is the same so this is consistent. | ## Recommendation **APPROVE** — Approve with a suggestion to track the code-duplication between doGet and doGetLimited as a follow-up. The duplication is the only notable concern: the retry loop, backoff logic, request construction, and error-body reading are identical across both methods. A future bug fix or enhancement to retry behavior would need to be applied in two places. The correct fix would be to refactor the shared scaffolding into a private helper and let doGetLimited supply only the body-processing callback — but this is a non-trivial refactor that goes beyond the scope of this PR, which otherwise is correct, well-tested, and follows the project's conventions. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 1a305092* </details> <!-- review-bot:sonnet -->
gitea/client.go Outdated
@@ -131,0 +153,4 @@
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
First-time contributor

[NIT] The disabled-limit branch (maxSize < 0) wraps the error with 'fetch diff: %w' but the limited branch also wraps with the same prefix inside the return. This is fine, but worth noting that the error string convention is consistent between the two paths.

**[NIT]** The disabled-limit branch (maxSize < 0) wraps the error with 'fetch diff: %w' but the limited branch also wraps with the same prefix inside the return. This is fine, but worth noting that the error string convention is consistent between the two paths.
@@ -413,6 +441,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr
First-time contributor

[MINOR] doGetLimited is ~90 lines and duplicates nearly all of doGet's retry/backoff logic. The two methods share: the attempt loop, backoff wiring, retry-delay select, request construction, header setting, temporary-network-error retry, and error-body reading. The only meaningful difference is the LimitReader + overflow check on the success path. Extracting a shared doGetCore(ctx, reqURL string, processBody func(r io.Reader) ([]byte, error)) ([]byte, error) or similar would eliminate the duplication without sacrificing clarity. As-is, any future change to the retry policy (new backoff defaults, added headers, logging changes) must be applied in both places.

**[MINOR]** doGetLimited is ~90 lines and duplicates nearly all of doGet's retry/backoff logic. The two methods share: the attempt loop, backoff wiring, retry-delay select, request construction, header setting, temporary-network-error retry, and error-body reading. The only meaningful difference is the LimitReader + overflow check on the success path. Extracting a shared doGetCore(ctx, reqURL string, processBody func(r io.Reader) ([]byte, error)) ([]byte, error) or similar would eliminate the duplication without sacrificing clarity. As-is, any future change to the retry policy (new backoff defaults, added headers, logging changes) must be applied in both places.
@@ -0,0 +22,4 @@
client := NewClient(server.URL, "test-token")
client.MaxDiffSize = 100 // 100 bytes limit
client.RetryBackoff = []time.Duration{} // no delay in tests
First-time contributor

[NIT] The test server handler ignores write errors from w.Write([]byte(largeDiff)). This is idiomatic for test handlers and not a bug, but for completeness the existing codebase pattern is the same so this is consistent.

**[NIT]** The test server handler ignores write errors from w.Write([]byte(largeDiff)). This is idiomatic for test handlers and not a bug, but for completeness the existing codebase pattern is the same so this is consistent.
security-review-bot requested review from security-review-bot 2026-05-13 12:18:45 +00:00
security-review-bot approved these changes 2026-05-13 12:18:45 +00:00
Dismissed
security-review-bot left a comment
Collaborator

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 1a305092)

Security Review

Summary

The changes add a sensible default limit for PR diffs, enforce it safely with streaming reads, and include thorough tests. This hardens against resource exhaustion without introducing new security risks.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 472 In doGetLimited, when maxBytes equals math.MaxInt64, limitBytes overflows and is clamped to math.MaxInt64, causing the reader to effectively allow an unbounded read (up to MaxInt64). While this only happens with an extreme configuration, it could permit large memory usage if someone sets MaxDiffSize to MaxInt64 instead of using -1 to disable the limit.

Recommendation

APPROVE — Overall, this PR improves security by bounding diff sizes with a safe default and streaming reads. Approve as-is. For defense in depth, consider handling pathological configurations: treat values >= math.MaxInt64-1 as a special case (e.g., require using -1 to disable, or clamp to a safe high bound and still perform overflow detection). This avoids potentially unbounded reads if a caller mistakenly sets MaxDiffSize to MaxInt64.


Review by security


Evaluated against 1a305092

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3190) for up-to-date findings. <details><summary>Previous findings (commit 1a305092)</summary> # Security Review ## Summary The changes add a sensible default limit for PR diffs, enforce it safely with streaming reads, and include thorough tests. This hardens against resource exhaustion without introducing new security risks. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 472 | In doGetLimited, when maxBytes equals math.MaxInt64, limitBytes overflows and is clamped to math.MaxInt64, causing the reader to effectively allow an unbounded read (up to MaxInt64). While this only happens with an extreme configuration, it could permit large memory usage if someone sets MaxDiffSize to MaxInt64 instead of using -1 to disable the limit. | ## Recommendation **APPROVE** — Overall, this PR improves security by bounding diff sizes with a safe default and streaming reads. Approve as-is. For defense in depth, consider handling pathological configurations: treat values >= math.MaxInt64-1 as a special case (e.g., require using -1 to disable, or clamp to a safe high bound and still perform overflow detection). This avoids potentially unbounded reads if a caller mistakenly sets MaxDiffSize to MaxInt64. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 1a305092* </details> <!-- review-bot:security -->
gitea/client.go Outdated
@@ -416,0 +469,4 @@
timer := time.NewTimer(delay)
select {
case <-timer.C:
Collaborator

[MINOR] In doGetLimited, when maxBytes equals math.MaxInt64, limitBytes overflows and is clamped to math.MaxInt64, causing the reader to effectively allow an unbounded read (up to MaxInt64). While this only happens with an extreme configuration, it could permit large memory usage if someone sets MaxDiffSize to MaxInt64 instead of using -1 to disable the limit.

**[MINOR]** In doGetLimited, when maxBytes equals math.MaxInt64, limitBytes overflows and is clamped to math.MaxInt64, causing the reader to effectively allow an unbounded read (up to MaxInt64). While this only happens with an extreme configuration, it could permit large memory usage if someone sets MaxDiffSize to MaxInt64 instead of using -1 to disable the limit.
security-review-bot marked this conversation as resolved
gpt-review-bot approved these changes 2026-05-13 12:19:15 +00:00
Dismissed
gpt-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 1a305092)

Gpt Review

Summary

Solid hardening of GetPullRequestDiff with a sensible default size limit and a clear sentinel error. The doGetLimited implementation correctly bounds memory using io.LimitReader and preserves existing retry semantics. Tests comprehensively cover boundary conditions.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 68 Adding an exported field (MaxDiffSize) to the Client struct may break external code that constructs Client using unkeyed composite literals. Consider calling this out in release notes or ensuring consumers use NewClient.
2 [MINOR] gitea/client.go 441 doGetLimited largely duplicates logic from doGet (retry/backoff, error handling). This increases maintenance burden and risk of drift. Consider refactoring to share common request/response handling with a size-limit strategy.
3 [NIT] gitea/client.go 68 MaxDiffSize lacks an explicit concurrency note like RetryBackoff. Since Client is used concurrently, documenting that this field should be set before use (and not mutated concurrently) would clarify safe usage.
4 [NIT] gitea/diff_size_test.go 1 Multiple similar test cases could be consolidated into a table-driven test to reduce repetition and follow established testing patterns.

Recommendation

APPROVE — The change is well-implemented and aligns with Go patterns for bounding reads (io.LimitReader) and sentinel errors (errors.Is). The default limit and the ability to disable it are sensible, and tests exercise key edge cases. Consider refactoring to reduce duplication between doGet and doGetLimited, and add a brief concurrency note to MaxDiffSize similar to RetryBackoff. Also note the potential for breaking external unkeyed composite literals in release notes. Otherwise, this is ready to merge.


Review by gpt


Evaluated against 1a305092

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3188) for up-to-date findings. <details><summary>Previous findings (commit 1a305092)</summary> # Gpt Review ## Summary Solid hardening of GetPullRequestDiff with a sensible default size limit and a clear sentinel error. The doGetLimited implementation correctly bounds memory using io.LimitReader and preserves existing retry semantics. Tests comprehensively cover boundary conditions. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 68 | Adding an exported field (MaxDiffSize) to the Client struct may break external code that constructs Client using unkeyed composite literals. Consider calling this out in release notes or ensuring consumers use NewClient. | | 2 | [MINOR] | `gitea/client.go` | 441 | doGetLimited largely duplicates logic from doGet (retry/backoff, error handling). This increases maintenance burden and risk of drift. Consider refactoring to share common request/response handling with a size-limit strategy. | | 3 | [NIT] | `gitea/client.go` | 68 | MaxDiffSize lacks an explicit concurrency note like RetryBackoff. Since Client is used concurrently, documenting that this field should be set before use (and not mutated concurrently) would clarify safe usage. | | 4 | [NIT] | `gitea/diff_size_test.go` | 1 | Multiple similar test cases could be consolidated into a table-driven test to reduce repetition and follow established testing patterns. | ## Recommendation **APPROVE** — The change is well-implemented and aligns with Go patterns for bounding reads (io.LimitReader) and sentinel errors (errors.Is). The default limit and the ability to disable it are sensible, and tests exercise key edge cases. Consider refactoring to reduce duplication between doGet and doGetLimited, and add a brief concurrency note to MaxDiffSize similar to RetryBackoff. Also note the potential for breaking external unkeyed composite literals in release notes. Otherwise, this is ready to merge. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 1a305092* </details> <!-- review-bot:gpt -->
@@ -61,6 +68,10 @@ type Client struct {
// This field must be configured before the first request is made.
First-time contributor

[MINOR] Adding an exported field (MaxDiffSize) to the Client struct may break external code that constructs Client using unkeyed composite literals. Consider calling this out in release notes or ensuring consumers use NewClient.

**[MINOR]** Adding an exported field (MaxDiffSize) to the Client struct may break external code that constructs Client using unkeyed composite literals. Consider calling this out in release notes or ensuring consumers use NewClient.
First-time contributor

[NIT] MaxDiffSize lacks an explicit concurrency note like RetryBackoff. Since Client is used concurrently, documenting that this field should be set before use (and not mutated concurrently) would clarify safe usage.

**[NIT]** MaxDiffSize lacks an explicit concurrency note like RetryBackoff. Since Client is used concurrently, documenting that this field should be set before use (and not mutated concurrently) would clarify safe usage.
@@ -413,6 +441,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr
First-time contributor

[MINOR] doGetLimited largely duplicates logic from doGet (retry/backoff, error handling). This increases maintenance burden and risk of drift. Consider refactoring to share common request/response handling with a size-limit strategy.

**[MINOR]** doGetLimited largely duplicates logic from doGet (retry/backoff, error handling). This increases maintenance burden and risk of drift. Consider refactoring to share common request/response handling with a size-limit strategy.
@@ -0,0 +1,143 @@
package gitea
First-time contributor

[NIT] Multiple similar test cases could be consolidated into a table-driven test to reduce repetition and follow established testing patterns.

**[NIT]** Multiple similar test cases could be consolidated into a table-driven test to reduce repetition and follow established testing patterns.
rodin added the wip label 2026-05-13 12:21:14 +00:00
Author
Owner

Acknowledged as a follow-up refactor (see prior comment). All three reviews approved with this accepted as future work.

Acknowledged as a follow-up refactor (see [prior comment](https://gitea.weiker.me/rodin/review-bot/pulls/109#issuecomment-19065)). All three reviews approved with this accepted as future work.
Author
Owner

Thanks for confirming the consistency — no change needed here.

Thanks for confirming the consistency — no change needed here.
Author
Owner

Added //nolint:errcheck annotation to the test handler w.Write call in the table-driven rewrite for clarity.

Added `//nolint:errcheck` annotation to the test handler `w.Write` call in the table-driven rewrite for clarity.
Author
Owner

Already fixed in the previous commit — see comment. The overflow clamp (if limitBytes <= 0 { limitBytes = math.MaxInt64 }) handles this case correctly.

Already fixed in the previous commit — see [comment](https://gitea.weiker.me/rodin/review-bot/pulls/109#issuecomment-19065). The overflow clamp (`if limitBytes <= 0 { limitBytes = math.MaxInt64 }`) handles this case correctly.
Author
Owner

Not applicable here. The Client struct has unexported fields (baseURL, token, http), so no external package can use unkeyed composite literals — the compiler already rejects it. Only NewClient() can construct a Client from outside the package.

Not applicable here. The `Client` struct has unexported fields (`baseURL`, `token`, `http`), so no external package can use unkeyed composite literals — the compiler already rejects it. Only `NewClient()` can construct a `Client` from outside the package.
Author
Owner

Same as sonnet finding — acknowledged as a follow-up refactor. See prior comment.

Same as sonnet finding — acknowledged as a follow-up refactor. See [prior comment](https://gitea.weiker.me/rodin/review-bot/pulls/109#issuecomment-19065).
Author
Owner

Fixed in 84ac50a — added the same concurrency safety note that RetryBackoff has: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe."

Fixed in 84ac50a — added the same concurrency safety note that `RetryBackoff` has: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe."
Author
Owner

Fixed in 84ac50a — consolidated all six individual test functions into a single table-driven TestGetPullRequestDiff_SizeLimits test with named subtests.

Fixed in 84ac50a — consolidated all six individual test functions into a single table-driven `TestGetPullRequestDiff_SizeLimits` test with named subtests.
rodin removed the wip label 2026-05-13 12:24:20 +00:00
sonnet-review-bot approved these changes 2026-05-13 12:24:33 +00:00
sonnet-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 84ac50a8)

Sonnet Review

Summary

The implementation correctly hardens GetPullRequestDiff against unbounded diff sizes using a well-designed sentinel error, configurable limit field, and proper overflow detection. The code follows established patterns and is well-tested with appropriate boundary cases.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 444 doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.
2 [NIT] gitea/client.go 161 The error message returned on size exceeded wraps ErrDiffTooLarge correctly with %w, but the message says 'response is larger than %d bytes' which is technically inaccurate — the response is at least maxBytes+1 bytes. A slightly more accurate phrasing would be 'response exceeds %d bytes'. Minor readability nit.

Recommendation

APPROVE — Approve with the understanding that the code is correct and well-tested. The only meaningful concern is the duplication between doGet and doGetLimited — this is a MINOR finding because it works correctly now but creates a maintenance burden. If the retry logic changes in the future, both copies must be updated. Consider refactoring in a follow-up to extract the shared retry scaffold. The sentinel error follows the patterns correctly (lowercase, no punctuation, package prefix implied by context, wrapped with %w for proper errors.Is propagation). The overflow detection logic (reading maxBytes+1, clamping at math.MaxInt64) is correct and well-commented. Tests cover all meaningful boundary cases including exact-limit, one-byte-over, disabled, and default.


Review by sonnet


Evaluated against 84ac50a8

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3191) for up-to-date findings. <details><summary>Previous findings (commit 84ac50a8)</summary> # Sonnet Review ## Summary The implementation correctly hardens GetPullRequestDiff against unbounded diff sizes using a well-designed sentinel error, configurable limit field, and proper overflow detection. The code follows established patterns and is well-tested with appropriate boundary cases. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 444 | doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places. | | 2 | [NIT] | `gitea/client.go` | 161 | The error message returned on size exceeded wraps ErrDiffTooLarge correctly with %w, but the message says 'response is larger than %d bytes' which is technically inaccurate — the response is at least maxBytes+1 bytes. A slightly more accurate phrasing would be 'response exceeds %d bytes'. Minor readability nit. | ## Recommendation **APPROVE** — Approve with the understanding that the code is correct and well-tested. The only meaningful concern is the duplication between doGet and doGetLimited — this is a MINOR finding because it works correctly now but creates a maintenance burden. If the retry logic changes in the future, both copies must be updated. Consider refactoring in a follow-up to extract the shared retry scaffold. The sentinel error follows the patterns correctly (lowercase, no punctuation, package prefix implied by context, wrapped with %w for proper errors.Is propagation). The overflow detection logic (reading maxBytes+1, clamping at math.MaxInt64) is correct and well-commented. Tests cover all meaningful boundary cases including exact-limit, one-byte-over, disabled, and default. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 84ac50a8* </details> <!-- review-bot:sonnet -->
gitea/client.go Outdated
@@ -131,0 +158,4 @@
return string(body), nil
}
body, err := c.doGetLimited(ctx, reqURL, maxSize)
First-time contributor

[NIT] The error message returned on size exceeded wraps ErrDiffTooLarge correctly with %w, but the message says 'response is larger than %d bytes' which is technically inaccurate — the response is at least maxBytes+1 bytes. A slightly more accurate phrasing would be 'response exceeds %d bytes'. Minor readability nit.

**[NIT]** The error message returned on size exceeded wraps ErrDiffTooLarge correctly with %w, but the message says 'response is larger than %d bytes' which is technically inaccurate — the response is at least maxBytes+1 bytes. A slightly more accurate phrasing would be 'response exceeds %d bytes'. Minor readability nit.
@@ -413,6 +444,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr
First-time contributor

[MINOR] doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.

**[MINOR]** doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.
gpt-review-bot approved these changes 2026-05-13 12:25:29 +00:00
gpt-review-bot left a comment
First-time contributor

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 84ac50a8)

Gpt Review

Summary

The changes add a sensible, configurable size limit for PR diffs with a safe default, implement bounded reads to avoid unbounded memory usage, and include thorough tests. Error handling, configuration semantics, and retry behavior remain consistent and idiomatic.

Recommendation

APPROVE — The implementation follows Go error-handling and configuration patterns well: a package-level sentinel error, zero-value default behavior on the Client, and bounded reads via io.LimitReader. The new doGetLimited mirrors existing retry logic and clamps overflow properly. Tests cover boundary conditions and disabled limits. Given CI is green and no major issues are identified, this is ready to merge.


Review by gpt


Evaluated against 84ac50a8

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3194) for up-to-date findings. <details><summary>Previous findings (commit 84ac50a8)</summary> # Gpt Review ## Summary The changes add a sensible, configurable size limit for PR diffs with a safe default, implement bounded reads to avoid unbounded memory usage, and include thorough tests. Error handling, configuration semantics, and retry behavior remain consistent and idiomatic. ## Recommendation **APPROVE** — The implementation follows Go error-handling and configuration patterns well: a package-level sentinel error, zero-value default behavior on the Client, and bounded reads via io.LimitReader. The new doGetLimited mirrors existing retry logic and clamps overflow properly. Tests cover boundary conditions and disabled limits. Given CI is green and no major issues are identified, this is ready to merge. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 84ac50a8* </details> <!-- review-bot:gpt -->
security-review-bot requested review from security-review-bot 2026-05-13 12:25:39 +00:00
security-review-bot approved these changes 2026-05-13 12:25:40 +00:00
Dismissed
security-review-bot left a comment
Collaborator

Original review

Supersededsee current review for up-to-date findings.

Previous findings (commit 84ac50a8)

Security Review

Summary

The changes introduce a bounded read for PR diffs with sensible defaults and clear override behavior, reducing DoS risk from oversized responses. The implementation uses limit readers, preserves existing retry logic, and includes comprehensive tests; CI passed.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 458 Logging uses redactURL(reqURL) which only strips query parameters and may leak userinfo credentials if present in baseURL (e.g., https://user:pass@host). Consider redacting userinfo in redactURL to avoid potential secret exposure in logs.
2 [MINOR] gitea/client.go 492 In doGetLimited, when maxBytes == math.MaxInt64, limitBytes is set to math.MaxInt64, effectively removing the practical limit and allowing io.ReadAll to attempt to read an unbounded response. While this requires explicit misconfiguration, it can still risk memory exhaustion; consider treating extremely large values as disabled in a controlled way or enforcing a reasonable upper cap.

Recommendation

APPROVE — Overall, the diff-size hardening is sound and improves resilience against large PR diffs without introducing new security vulnerabilities. Approve as is. As defense-in-depth, consider enhancing redactURL to strip userinfo components to prevent accidental credential leakage in logs, and add a guard for extreme MaxDiffSize values (e.g., clamp to a safe upper bound or explicitly treat MaxInt64 as 'disabled' with clear documentation) to avoid unbounded memory use under misconfiguration.


Review by security


Evaluated against 84ac50a8

~~Original review~~ **Superseded** — [see current review](https://gitea.weiker.me/rodin/review-bot/pulls/109#pullrequestreview-3232) for up-to-date findings. <details><summary>Previous findings (commit 84ac50a8)</summary> # Security Review ## Summary The changes introduce a bounded read for PR diffs with sensible defaults and clear override behavior, reducing DoS risk from oversized responses. The implementation uses limit readers, preserves existing retry logic, and includes comprehensive tests; CI passed. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 458 | Logging uses redactURL(reqURL) which only strips query parameters and may leak userinfo credentials if present in baseURL (e.g., https://user:pass@host). Consider redacting userinfo in redactURL to avoid potential secret exposure in logs. | | 2 | [MINOR] | `gitea/client.go` | 492 | In doGetLimited, when maxBytes == math.MaxInt64, limitBytes is set to math.MaxInt64, effectively removing the practical limit and allowing io.ReadAll to attempt to read an unbounded response. While this requires explicit misconfiguration, it can still risk memory exhaustion; consider treating extremely large values as disabled in a controlled way or enforcing a reasonable upper cap. | ## Recommendation **APPROVE** — Overall, the diff-size hardening is sound and improves resilience against large PR diffs without introducing new security vulnerabilities. Approve as is. As defense-in-depth, consider enhancing redactURL to strip userinfo components to prevent accidental credential leakage in logs, and add a guard for extreme MaxDiffSize values (e.g., clamp to a safe upper bound or explicitly treat MaxInt64 as 'disabled' with clear documentation) to avoid unbounded memory use under misconfiguration. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 84ac50a8* </details> <!-- review-bot:security -->
gitea/client.go Outdated
@@ -416,0 +455,4 @@
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
const maxErrorBodyBytes = 64 * 1024
Collaborator

[MINOR] Logging uses redactURL(reqURL) which only strips query parameters and may leak userinfo credentials if present in baseURL (e.g., https://user:pass@host). Consider redacting userinfo in redactURL to avoid potential secret exposure in logs.

**[MINOR]** Logging uses redactURL(reqURL) which only strips query parameters and may leak userinfo credentials if present in baseURL (e.g., https://user:pass@host). Consider redacting userinfo in redactURL to avoid potential secret exposure in logs.
security-review-bot marked this conversation as resolved
gitea/client.go Outdated
@@ -416,0 +489,4 @@
resp, err := c.http.Do(req)
if err != nil {
lastErr = err
if attempt < maxAttempts-1 && isTemporaryNetError(err) {
Collaborator

[MINOR] In doGetLimited, when maxBytes == math.MaxInt64, limitBytes is set to math.MaxInt64, effectively removing the practical limit and allowing io.ReadAll to attempt to read an unbounded response. While this requires explicit misconfiguration, it can still risk memory exhaustion; consider treating extremely large values as disabled in a controlled way or enforcing a reasonable upper cap.

**[MINOR]** In doGetLimited, when maxBytes == math.MaxInt64, limitBytes is set to math.MaxInt64, effectively removing the practical limit and allowing io.ReadAll to attempt to read an unbounded response. While this requires explicit misconfiguration, it can still risk memory exhaustion; consider treating extremely large values as disabled in a controlled way or enforcing a reasonable upper cap.
security-review-bot marked this conversation as resolved
rodin added the wip label 2026-05-13 12:29:19 +00:00
Author
Owner

Self-Review: PR #109

Self-review against 84ac50a8cf

Phase 1: Independent Findings

1. [MINOR] doGetLimited duplicates retry/backoff logic from doGet (gitea/client.go)
~65 lines of retry loop, backoff initialization, delay select, request construction, header setting, temporary-network-error handling, and error-body reading are duplicated between doGet and doGetLimited. The only meaningful difference is the bounded io.LimitReader read on the success path. Any future change to retry behavior (new backoff strategy, new retriable error conditions, log field changes) must be applied in both places. This is a known maintenance hazard — noted here as still present per design, pending a future refactor.

2. [NIT] redactURL does not strip userinfo (gitea/client.go, line 329)
redactURL strips query parameters but does not strip Userinfo from the URL (e.g., https://user:pass@host). In the current codebase baseURL is always set from a config value (no credential embedding), so this is low risk. But the function's name and intent suggest it should be comprehensive. The security-review-bot raised this in their latest pass.

3. [NIT] Error message slight inaccuracy (gitea/client.go, line 515)
"response is larger than %d bytes" — when triggered, we've read maxBytes+1 bytes which means the response is at minimum maxBytes+1 bytes, not just > maxBytes. "response exceeds %d bytes" would be marginally more accurate but both convey the meaning clearly. Very minor.

4. [OBSERVATION] No test for maxBytes == math.MaxInt64
The overflow clamp (limitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }) handles math.MaxInt64 correctly (sets limitBytes = math.MaxInt64). The overflow check int64(len(body)) > maxBytes can never be true when maxBytes = math.MaxInt64 since len() on a 64-bit platform is bounded by MaxInt64. The behavior is effectively "no limit", which is distinct from passing -1. This is well-handled but there's no test for it. Not a bug — more of a documentation gap for extreme configurations.


Phase 2: Prior Review Verification

Finding Reviewer Status Notes
doGetLimited duplicates doGet retry logic sonnet-review-bot (all 3 passes), gpt-review-bot ⚠️ Partial Acknowledged as a follow-up refactor; approved with understanding. Duplication is still present by design.
maxBytes+1 int64 overflow security-review-bot (pass 1, pass 2) Resolved Overflow clamp added: limitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }. Correct.
maxSize < 0 vs explicit -1 — use const noLimit for clarity sonnet-review-bot (pass 1) 🚫 Invalid The existing docs say "Set to any negative value to disable the limit" which is intentionally flexible (not just -1). The check maxSize < 0 is correct. A named constant would be a style preference, not a correctness issue.
w.Write return ignored in test handler sonnet-review-bot (pass 1), gpt-review-bot (pass 3) Resolved //nolint:errcheck annotation added; test consolidated to table-driven format.
redactURL does not strip userinfo security-review-bot (pass 3) Still present redactURL only strips query params, not userinfo. Low risk in practice (baseURL won't contain credentials) but technically incomplete.
MaxDiffSize missing concurrency safety note gpt-review-bot (pass 2) Resolved Field doc now matches RetryBackoff: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe."
Exported field breaks unkeyed composite literals gpt-review-bot (pass 2) 🚫 Invalid Client has unexported fields, so external packages cannot use unkeyed composite literals. The compiler prevents it.
Test cases not consolidated into table-driven test gpt-review-bot (pass 2) Resolved All 6 cases are now a single TestGetPullRequestDiff_SizeLimits table-driven test with named subtests.
maxBytes == math.MaxInt64 allows unbounded read security-review-bot (pass 2, pass 3) ⚠️ Partial The overflow clamp correctly handles the +1 overflow, but with maxBytes = MaxInt64, limitBytes = MaxInt64 and the len(body) > maxBytes check can never trigger. This means MaxInt64 is a misconfiguration that silently disables the limit rather than enforcing it. This is an edge case with no practical exploit path (requires deliberate misconfiguration), and behavior is consistent with "use -1 to disable". Accepted as-is.
doGet log says "disabled" but field doc says "-1 to disable" sonnet-review-bot (pass 1) 🚫 Invalid The comment at line 152 is "When the limit is disabled, use the standard doGet path." which is describing the code path, not the API surface. The maxSize < 0 check is correctly documented in the field's GoDoc. No inconsistency.
Error message "larger than" vs "exceeds" sonnet-review-bot (pass 3) Still present Minor wording nit. "response is larger than %d bytes" vs "response exceeds %d bytes". Not a correctness issue; either phrasing is accurate.

Assessment: Clean

The implementation is correct, well-tested, and ready to merge. The core change — bounding GetPullRequestDiff with a configurable limit using io.LimitReader and a sentinel error — is sound. The overflow clamp is correct, tests cover all meaningful boundary cases, and prior review findings are either resolved or explicitly accepted as follow-up work. The two "Still present" items are nits with no correctness impact: redactURL not stripping userinfo (low risk, no credentials in baseURL), and the error message wording preference. The doGetLimited duplication remains by design (agreed as future refactor). CI is green.

## Self-Review: PR #109 Self-review against 84ac50a8cffdb79df34f554e30a0079a40f465d6 ### Phase 1: Independent Findings **1. [MINOR] `doGetLimited` duplicates retry/backoff logic from `doGet`** (`gitea/client.go`) ~65 lines of retry loop, backoff initialization, delay select, request construction, header setting, temporary-network-error handling, and error-body reading are duplicated between `doGet` and `doGetLimited`. The only meaningful difference is the bounded `io.LimitReader` read on the success path. Any future change to retry behavior (new backoff strategy, new retriable error conditions, log field changes) must be applied in both places. This is a known maintenance hazard — noted here as still present per design, pending a future refactor. **2. [NIT] `redactURL` does not strip userinfo** (`gitea/client.go`, line 329) `redactURL` strips query parameters but does not strip `Userinfo` from the URL (e.g., `https://user:pass@host`). In the current codebase `baseURL` is always set from a config value (no credential embedding), so this is low risk. But the function's name and intent suggest it should be comprehensive. The security-review-bot raised this in their latest pass. **3. [NIT] Error message slight inaccuracy** (`gitea/client.go`, line 515) `"response is larger than %d bytes"` — when triggered, we've read `maxBytes+1` bytes which means the response is at minimum `maxBytes+1` bytes, not just `> maxBytes`. `"response exceeds %d bytes"` would be marginally more accurate but both convey the meaning clearly. Very minor. **4. [OBSERVATION] No test for `maxBytes == math.MaxInt64`** The overflow clamp (`limitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }`) handles `math.MaxInt64` correctly (sets `limitBytes = math.MaxInt64`). The overflow check `int64(len(body)) > maxBytes` can never be true when `maxBytes = math.MaxInt64` since `len()` on a 64-bit platform is bounded by `MaxInt64`. The behavior is effectively "no limit", which is distinct from passing `-1`. This is well-handled but there's no test for it. Not a bug — more of a documentation gap for extreme configurations. --- ### Phase 2: Prior Review Verification | Finding | Reviewer | Status | Notes | |---------|----------|--------|-------| | `doGetLimited` duplicates doGet retry logic | sonnet-review-bot (all 3 passes), gpt-review-bot | ⚠️ Partial | Acknowledged as a follow-up refactor; approved with understanding. Duplication is still present by design. | | `maxBytes+1` int64 overflow | security-review-bot (pass 1, pass 2) | ✅ Resolved | Overflow clamp added: `limitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }`. Correct. | | `maxSize < 0` vs explicit `-1` — use `const noLimit` for clarity | sonnet-review-bot (pass 1) | 🚫 Invalid | The existing docs say "Set to any negative value to disable the limit" which is intentionally flexible (not just -1). The check `maxSize < 0` is correct. A named constant would be a style preference, not a correctness issue. | | `w.Write` return ignored in test handler | sonnet-review-bot (pass 1), gpt-review-bot (pass 3) | ✅ Resolved | `//nolint:errcheck` annotation added; test consolidated to table-driven format. | | `redactURL` does not strip userinfo | security-review-bot (pass 3) | ❌ Still present | `redactURL` only strips query params, not userinfo. Low risk in practice (baseURL won't contain credentials) but technically incomplete. | | `MaxDiffSize` missing concurrency safety note | gpt-review-bot (pass 2) | ✅ Resolved | Field doc now matches `RetryBackoff`: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe." | | Exported field breaks unkeyed composite literals | gpt-review-bot (pass 2) | 🚫 Invalid | `Client` has unexported fields, so external packages cannot use unkeyed composite literals. The compiler prevents it. | | Test cases not consolidated into table-driven test | gpt-review-bot (pass 2) | ✅ Resolved | All 6 cases are now a single `TestGetPullRequestDiff_SizeLimits` table-driven test with named subtests. | | `maxBytes == math.MaxInt64` allows unbounded read | security-review-bot (pass 2, pass 3) | ⚠️ Partial | The overflow clamp correctly handles the `+1` overflow, but with `maxBytes = MaxInt64`, `limitBytes` = `MaxInt64` and the `len(body) > maxBytes` check can never trigger. This means `MaxInt64` is a misconfiguration that silently disables the limit rather than enforcing it. This is an edge case with no practical exploit path (requires deliberate misconfiguration), and behavior is consistent with "use -1 to disable". Accepted as-is. | | `doGet` log says "disabled" but field doc says "-1 to disable" | sonnet-review-bot (pass 1) | 🚫 Invalid | The comment at line 152 is "When the limit is disabled, use the standard doGet path." which is describing the code path, not the API surface. The `maxSize < 0` check is correctly documented in the field's GoDoc. No inconsistency. | | Error message "larger than" vs "exceeds" | sonnet-review-bot (pass 3) | ❌ Still present | Minor wording nit. "response is larger than %d bytes" vs "response exceeds %d bytes". Not a correctness issue; either phrasing is accurate. | --- ### Assessment: ✅ Clean The implementation is correct, well-tested, and ready to merge. The core change — bounding `GetPullRequestDiff` with a configurable limit using `io.LimitReader` and a sentinel error — is sound. The overflow clamp is correct, tests cover all meaningful boundary cases, and prior review findings are either resolved or explicitly accepted as follow-up work. The two "Still present" items are nits with no correctness impact: `redactURL` not stripping userinfo (low risk, no credentials in baseURL), and the error message wording preference. The `doGetLimited` duplication remains by design (agreed as future refactor). CI is green.
rodin removed the wip label 2026-05-13 12:31:43 +00:00
rodin added the wip label 2026-05-13 12:35:38 +00:00
Author
Owner

Review feedback addressed (1f0636e)

[MINOR] doGetLimited duplicates retry/backoff logic (sonnet, gpt, all reviewers)

Fixed. Extracted doGetWithReader as a shared core that accepts a readBody func(io.ReadCloser) ([]byte, error) callback. doGet and doGetLimited are now thin wrappers (~5 lines each) over the shared retry/backoff logic. Net reduction of 28 lines.

[MINOR] redactURL does not strip userinfo (security-review-bot)

Fixed. redactURL now strips userinfo credentials (user:pass@host) by setting parsed.User = url.User("REDACTED") before returning the sanitized URL. Added 3 test cases covering user+pass, user+pass+query, and username-only scenarios.

[MINOR] maxBytes == math.MaxInt64 silently disables limit (security-review-bot)

Fixed. GetPullRequestDiff now explicitly treats MaxDiffSize == math.MaxInt64 as disabled (falls through to the unlimited doGet path), preventing the silent bypass where the overflow clamp makes the size check unreachable. Added test case and updated field documentation.

[NIT] Error message wording (sonnet-review-bot, review 3187)

Fixed. Changed from "response is larger than" to "response exceeds" for accuracy.

[NIT] Unkeyed composite literal concern (gpt-review-bot)

Not applicable. The Client struct has unexported fields (baseURL, token, http), so external packages cannot use unkeyed composite literals — the compiler already rejects it.

[NIT] MaxDiffSize concurrency note

Already fixed in prior commit (84ac50a) — the field doc matches RetryBackoff.

[NIT] Table-driven tests

Already fixed in prior commit (84ac50a).

## Review feedback addressed (1f0636e) ### [MINOR] doGetLimited duplicates retry/backoff logic (sonnet, gpt, all reviewers) **Fixed.** Extracted `doGetWithReader` as a shared core that accepts a `readBody func(io.ReadCloser) ([]byte, error)` callback. `doGet` and `doGetLimited` are now thin wrappers (~5 lines each) over the shared retry/backoff logic. Net reduction of 28 lines. ### [MINOR] redactURL does not strip userinfo (security-review-bot) **Fixed.** `redactURL` now strips userinfo credentials (`user:pass@host`) by setting `parsed.User = url.User("REDACTED")` before returning the sanitized URL. Added 3 test cases covering user+pass, user+pass+query, and username-only scenarios. ### [MINOR] maxBytes == math.MaxInt64 silently disables limit (security-review-bot) **Fixed.** `GetPullRequestDiff` now explicitly treats `MaxDiffSize == math.MaxInt64` as disabled (falls through to the unlimited `doGet` path), preventing the silent bypass where the overflow clamp makes the size check unreachable. Added test case and updated field documentation. ### [NIT] Error message wording (sonnet-review-bot, review 3187) **Fixed.** Changed from "response is larger than" to "response exceeds" for accuracy. ### [NIT] Unkeyed composite literal concern (gpt-review-bot) **Not applicable.** The `Client` struct has unexported fields (`baseURL`, `token`, `http`), so external packages cannot use unkeyed composite literals — the compiler already rejects it. ### [NIT] MaxDiffSize concurrency note **Already fixed** in prior commit (84ac50a) — the field doc matches `RetryBackoff`. ### [NIT] Table-driven tests **Already fixed** in prior commit (84ac50a).
rodin removed the wip label 2026-05-13 12:41:20 +00:00
sonnet-review-bot approved these changes 2026-05-13 12:42:01 +00:00
sonnet-review-bot left a comment
First-time contributor

Sonnet Review

Summary

The PR correctly hardens GetPullRequestDiff against unbounded diff sizes using a clean doGetWithReader abstraction, sentinel error, and configurable limit. The implementation is correct, well-tested, and follows the established patterns in the codebase.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 149 The MaxDiffSize field documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment grouping math.MaxInt64 alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.'
2 [MINOR] gitea/client.go 470 In doGetLimited, the overflow guard if limitBytes <= 0 { limitBytes = math.MaxInt64 } is dead code given the caller's pre-check (maxSize == math.MaxInt64 routes to the unlimited path before this function is ever called). The comment on the clamp says it handles maxBytes == math.MaxInt64, but that case never reaches doGetLimited. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller.
3 [NIT] gitea/diff_size_test.go 72 The test uses defer server.Close() inside a table-driven t.Run loop. This is fine here since each subtest gets its own server, but it's worth noting that t.Cleanup(server.Close) would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern.

Recommendation

APPROVE — Approve. The implementation is correct and well-designed. The doGetWithReader refactor is a clean application of the layered API pattern, the +1 read-to-detect-overflow approach is correct, and the tests cover all the boundary cases (exactly at limit, one over, disabled). The two MINOR findings are about comment clarity and dead code, neither of which affects correctness or safety. CI passes.


Review by sonnet


Evaluated against 1f0636e1

# Sonnet Review ## Summary The PR correctly hardens `GetPullRequestDiff` against unbounded diff sizes using a clean `doGetWithReader` abstraction, sentinel error, and configurable limit. The implementation is correct, well-tested, and follows the established patterns in the codebase. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 149 | The `MaxDiffSize` field documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment grouping `math.MaxInt64` alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.' | | 2 | [MINOR] | `gitea/client.go` | 470 | In `doGetLimited`, the overflow guard `if limitBytes <= 0 { limitBytes = math.MaxInt64 }` is dead code given the caller's pre-check (`maxSize == math.MaxInt64` routes to the unlimited path before this function is ever called). The comment on the clamp says it handles `maxBytes == math.MaxInt64`, but that case never reaches `doGetLimited`. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller. | | 3 | [NIT] | `gitea/diff_size_test.go` | 72 | The test uses `defer server.Close()` inside a table-driven `t.Run` loop. This is fine here since each subtest gets its own `server`, but it's worth noting that `t.Cleanup(server.Close)` would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern. | ## Recommendation **APPROVE** — Approve. The implementation is correct and well-designed. The `doGetWithReader` refactor is a clean application of the layered API pattern, the `+1` read-to-detect-overflow approach is correct, and the tests cover all the boundary cases (exactly at limit, one over, disabled). The two MINOR findings are about comment clarity and dead code, neither of which affects correctness or safety. CI passes. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 1f0636e1*
gitea/client.go Outdated
@@ -130,1 +147,3 @@
body, err := c.doGet(ctx, reqURL)
maxSize := c.MaxDiffSize
if maxSize == 0 {
First-time contributor

[MINOR] The MaxDiffSize field documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment grouping math.MaxInt64 alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.'

**[MINOR]** The `MaxDiffSize` field documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment grouping `math.MaxInt64` alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.'
gitea/client.go Outdated
@@ -416,0 +467,4 @@
limitBytes := maxBytes + 1
if limitBytes <= 0 {
limitBytes = math.MaxInt64
}
First-time contributor

[MINOR] In doGetLimited, the overflow guard if limitBytes <= 0 { limitBytes = math.MaxInt64 } is dead code given the caller's pre-check (maxSize == math.MaxInt64 routes to the unlimited path before this function is ever called). The comment on the clamp says it handles maxBytes == math.MaxInt64, but that case never reaches doGetLimited. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller.

**[MINOR]** In `doGetLimited`, the overflow guard `if limitBytes <= 0 { limitBytes = math.MaxInt64 }` is dead code given the caller's pre-check (`maxSize == math.MaxInt64` routes to the unlimited path before this function is ever called). The comment on the clamp says it handles `maxBytes == math.MaxInt64`, but that case never reaches `doGetLimited`. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller.
@@ -0,0 +69,4 @@
w.Write([]byte(tt.diff)) //nolint:errcheck // test handler
}))
defer server.Close()
First-time contributor

[NIT] The test uses defer server.Close() inside a table-driven t.Run loop. This is fine here since each subtest gets its own server, but it's worth noting that t.Cleanup(server.Close) would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern.

**[NIT]** The test uses `defer server.Close()` inside a table-driven `t.Run` loop. This is fine here since each subtest gets its own `server`, but it's worth noting that `t.Cleanup(server.Close)` would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern.
security-review-bot requested review from security-review-bot 2026-05-13 12:42:23 +00:00
security-review-bot approved these changes 2026-05-13 12:42:23 +00:00
Dismissed
security-review-bot left a comment
Collaborator

Security Review

Summary

Solid security hardening that bounds diff fetch size to mitigate DoS risk and improves URL redaction in logs. Implementation is careful about overflow and preserves existing behavior when limits are disabled.

Recommendation

APPROVE — The changes correctly enforce a configurable maximum diff size using io.LimitReader with a +1 byte overflow check, avoiding unbounded memory consumption. Edge cases are handled (0 uses default, <0 and MaxInt64 disable the limit), and the refactoring via doGetWithReader keeps retry behavior while limiting error bodies. The URL redaction improvement avoids leaking userinfo and query parameters in logs. Tests comprehensively cover limits and boundary conditions. Consider extending similar size limiting to other large-response endpoints in the future for broader DoS resilience and ensuring non-2xx paths elsewhere also cap error body sizes, but this PR is good to merge as-is.


Review by security


Evaluated against 1f0636e1

# Security Review ## Summary Solid security hardening that bounds diff fetch size to mitigate DoS risk and improves URL redaction in logs. Implementation is careful about overflow and preserves existing behavior when limits are disabled. ## Recommendation **APPROVE** — The changes correctly enforce a configurable maximum diff size using io.LimitReader with a +1 byte overflow check, avoiding unbounded memory consumption. Edge cases are handled (0 uses default, <0 and MaxInt64 disable the limit), and the refactoring via doGetWithReader keeps retry behavior while limiting error bodies. The URL redaction improvement avoids leaking userinfo and query parameters in logs. Tests comprehensively cover limits and boundary conditions. Consider extending similar size limiting to other large-response endpoints in the future for broader DoS resilience and ensuring non-2xx paths elsewhere also cap error body sizes, but this PR is good to merge as-is. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 1f0636e1*
gpt-review-bot approved these changes 2026-05-13 12:43:16 +00:00
gpt-review-bot left a comment
First-time contributor

Gpt Review

Summary

Solid, idiomatic hardening against unbounded diff sizes with clear defaults, a sentinel error, and well-structured retries and body reading. Tests are comprehensive and CI passed.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 68 API design nit: using math.MaxInt64 as a special value to disable the diff size limit may be surprising. Since negative values already disable the limit and doGetLimited clamps overflow, consider simplifying the API by documenting only negative values as the disable mechanism and treating very large positive values (including MaxInt64) as enforced limits.
2 [NIT] gitea/client.go 140 The MaxDiffSize branching logic and limit calculation are repeated across GetPullRequestDiff and doGetLimited. Consider factoring an internal helper (e.g., effectiveMaxDiffSize() and a boolean for disabled) to centralize the policy and reduce repetition.

Recommendation

APPROVE — The changes are well-implemented and align with Go patterns: zero-value configuration defaults, a clear sentinel error with proper wrapping, and safe, bounded reads using io.LimitReader. The retry logic is cleanly factored via doGetWithReader, and redactURL now properly strips userinfo and query parameters. Tests cover boundary conditions thoroughly and CI has passed. I recommend approval. As minor polish, consider dropping the math.MaxInt64 special-case in favor of a single "negative disables" convention, and optionally refactor the limit computation into a small helper to avoid duplication.


Review by gpt


Evaluated against 1f0636e1

# Gpt Review ## Summary Solid, idiomatic hardening against unbounded diff sizes with clear defaults, a sentinel error, and well-structured retries and body reading. Tests are comprehensive and CI passed. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 68 | API design nit: using math.MaxInt64 as a special value to disable the diff size limit may be surprising. Since negative values already disable the limit and doGetLimited clamps overflow, consider simplifying the API by documenting only negative values as the disable mechanism and treating very large positive values (including MaxInt64) as enforced limits. | | 2 | [NIT] | `gitea/client.go` | 140 | The MaxDiffSize branching logic and limit calculation are repeated across GetPullRequestDiff and doGetLimited. Consider factoring an internal helper (e.g., effectiveMaxDiffSize() and a boolean for disabled) to centralize the policy and reduce repetition. | ## Recommendation **APPROVE** — The changes are well-implemented and align with Go patterns: zero-value configuration defaults, a clear sentinel error with proper wrapping, and safe, bounded reads using io.LimitReader. The retry logic is cleanly factored via doGetWithReader, and redactURL now properly strips userinfo and query parameters. Tests cover boundary conditions thoroughly and CI has passed. I recommend approval. As minor polish, consider dropping the math.MaxInt64 special-case in favor of a single "negative disables" convention, and optionally refactor the limit computation into a small helper to avoid duplication. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 1f0636e1*
@@ -61,6 +68,14 @@ type Client struct {
// This field must be configured before the first request is made.
First-time contributor

[MINOR] API design nit: using math.MaxInt64 as a special value to disable the diff size limit may be surprising. Since negative values already disable the limit and doGetLimited clamps overflow, consider simplifying the API by documenting only negative values as the disable mechanism and treating very large positive values (including MaxInt64) as enforced limits.

**[MINOR]** API design nit: using math.MaxInt64 as a special value to disable the diff size limit may be surprising. Since negative values already disable the limit and doGetLimited clamps overflow, consider simplifying the API by documenting only negative values as the disable mechanism and treating very large positive values (including MaxInt64) as enforced limits.
@@ -125,9 +140,28 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number
}
First-time contributor

[NIT] The MaxDiffSize branching logic and limit calculation are repeated across GetPullRequestDiff and doGetLimited. Consider factoring an internal helper (e.g., effectiveMaxDiffSize() and a boolean for disabled) to centralize the policy and reduce repetition.

**[NIT]** The MaxDiffSize branching logic and limit calculation are repeated across GetPullRequestDiff and doGetLimited. Consider factoring an internal helper (e.g., effectiveMaxDiffSize() and a boolean for disabled) to centralize the policy and reduce repetition.
rodin added the wip label 2026-05-13 12:46:09 +00:00
Author
Owner

Self-Review: PR #109

Self-review against 1f0636e140

Phase 1: Independent Findings

1. [NIT] doGetLimited overflow guard is now dead code (gitea/client.go, ~line 470)
The limitBytes <= 0 clamp was added to protect against maxBytes+1 overflowing when maxBytes == math.MaxInt64. However, GetPullRequestDiff now pre-checks maxSize == math.MaxInt64 and routes it directly to the unlimited doGet path before ever calling doGetLimited. So maxBytes == math.MaxInt64 can never reach doGetLimited. The guard is harmless defense-in-depth, but the comment — "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" — describes a case that is now excluded by the caller. A brief note clarifying this ("retained as defense-in-depth; caller pre-filters MaxInt64") would prevent future confusion when someone reads doGetLimited in isolation.

2. [NIT] Two separate unlimited-path guards in GetPullRequestDiff
if maxSize < 0 || maxSize == math.MaxInt64 routes to the unlimited doGet path. Clear and correct. A single effectivelyUnlimited helper could simplify it, but as-is it's readable and the comment explains the rationale. No action required.

Otherwise the diff is clean. doGetWithReader is a well-designed abstraction, redactURL now correctly strips userinfo, the error message wording is accurate, and tests cover all boundary conditions including math.MaxInt64.

Phase 2: Prior Review Verification

Finding Reviewer Status Notes
doGetLimited duplicates retry/backoff logic sonnet, gpt (multiple passes) RESOLVED Extracted doGetWithReader shared helper. doGet and doGetLimited are now ~5-line wrappers.
redactURL does not strip userinfo security-review-bot (pass 3) RESOLVED redactURL now sets parsed.User = url.User("REDACTED") when userinfo is present. 3 new test cases added.
maxBytes == math.MaxInt64 silently disables limit security-review-bot (pass 2, pass 3) RESOLVED GetPullRequestDiff explicitly routes MaxInt64 to unlimited doGet path. Documented. Test case added.
Error message "larger than" vs "exceeds" sonnet (pass 3, finding #2) RESOLVED Changed to "response exceeds %d bytes".
MaxDiffSize doc: "any negative value" vs "-1 to disable" sonnet (pass 1 finding #3, pass 2) RESOLVED Field doc now says "any negative value (or math.MaxInt64) to disable".
MaxDiffSize missing concurrency safety note gpt (pass 2 finding #3) RESOLVED Field doc now matches RetryBackoff concurrency warning.
w.Write return ignored in test handlers sonnet (pass 1 NIT), gpt RESOLVED //nolint:errcheck annotation added.
Tests not consolidated to table-driven gpt (pass 2 finding #4) RESOLVED Single TestGetPullRequestDiff_SizeLimits with named subtests.
Exported field breaks unkeyed literals gpt-review-bot (pass 2) INVALID (confirmed) Client has unexported fields; compiler rejects unkeyed external literals.
Code-duplication follow-up acknowledged sonnet, gpt (all passes) RESOLVED Now fixed via doGetWithReader (not deferred).
doGetLimited overflow guard comment misleading sonnet (pass 4 finding #2) ⚠️ PARTIAL Guard retained as defense-in-depth but comment still describes case that can no longer reach this code. Comment needs update.
Doc: grouping math.MaxInt64 with negative values is surprising sonnet (pass 4 finding #1) ⚠️ PARTIAL Field doc includes math.MaxInt64 alongside negative values. Sonnet's clarification suggestion is noted but not blocking.

Assessment: Clean

The implementation is correct, well-tested, and all blocking findings from prior reviews have been resolved. The doGetWithReader refactor cleanly eliminated the retry duplication, redactURL now properly sanitizes userinfo, the overflow is handled correctly, and all test boundaries are covered. The two PARTIAL items above are documentation nits — the dead-code comment in doGetLimited and the math.MaxInt64 grouping in the field doc — neither affects correctness or safety. CI passes (reviewed against latest commit 1f0636e).

## Self-Review: PR #109 Self-review against 1f0636e1403acc72bf902d6e1a52cb3416589142 ### Phase 1: Independent Findings **1. [NIT] `doGetLimited` overflow guard is now dead code** (`gitea/client.go`, ~line 470) The `limitBytes <= 0` clamp was added to protect against `maxBytes+1` overflowing when `maxBytes == math.MaxInt64`. However, `GetPullRequestDiff` now pre-checks `maxSize == math.MaxInt64` and routes it directly to the unlimited `doGet` path before ever calling `doGetLimited`. So `maxBytes == math.MaxInt64` can never reach `doGetLimited`. The guard is harmless defense-in-depth, but the comment — "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" — describes a case that is now excluded by the caller. A brief note clarifying this ("retained as defense-in-depth; caller pre-filters MaxInt64") would prevent future confusion when someone reads `doGetLimited` in isolation. **2. [NIT] Two separate unlimited-path guards in `GetPullRequestDiff`** `if maxSize < 0 || maxSize == math.MaxInt64` routes to the unlimited `doGet` path. Clear and correct. A single `effectivelyUnlimited` helper could simplify it, but as-is it's readable and the comment explains the rationale. No action required. Otherwise the diff is clean. `doGetWithReader` is a well-designed abstraction, `redactURL` now correctly strips userinfo, the error message wording is accurate, and tests cover all boundary conditions including `math.MaxInt64`. ### Phase 2: Prior Review Verification | Finding | Reviewer | Status | Notes | |---------|----------|--------|-------| | `doGetLimited` duplicates retry/backoff logic | sonnet, gpt (multiple passes) | ✅ RESOLVED | Extracted `doGetWithReader` shared helper. `doGet` and `doGetLimited` are now ~5-line wrappers. | | `redactURL` does not strip userinfo | security-review-bot (pass 3) | ✅ RESOLVED | `redactURL` now sets `parsed.User = url.User("REDACTED")` when userinfo is present. 3 new test cases added. | | `maxBytes == math.MaxInt64` silently disables limit | security-review-bot (pass 2, pass 3) | ✅ RESOLVED | `GetPullRequestDiff` explicitly routes `MaxInt64` to unlimited `doGet` path. Documented. Test case added. | | Error message "larger than" vs "exceeds" | sonnet (pass 3, finding #2) | ✅ RESOLVED | Changed to `"response exceeds %d bytes"`. | | `MaxDiffSize` doc: "any negative value" vs "-1 to disable" | sonnet (pass 1 finding #3, pass 2) | ✅ RESOLVED | Field doc now says "any negative value (or math.MaxInt64) to disable". | | `MaxDiffSize` missing concurrency safety note | gpt (pass 2 finding #3) | ✅ RESOLVED | Field doc now matches `RetryBackoff` concurrency warning. | | `w.Write` return ignored in test handlers | sonnet (pass 1 NIT), gpt | ✅ RESOLVED | `//nolint:errcheck` annotation added. | | Tests not consolidated to table-driven | gpt (pass 2 finding #4) | ✅ RESOLVED | Single `TestGetPullRequestDiff_SizeLimits` with named subtests. | | Exported field breaks unkeyed literals | gpt-review-bot (pass 2) | ✅ INVALID (confirmed) | `Client` has unexported fields; compiler rejects unkeyed external literals. | | Code-duplication follow-up acknowledged | sonnet, gpt (all passes) | ✅ RESOLVED | Now fixed via `doGetWithReader` (not deferred). | | `doGetLimited` overflow guard comment misleading | sonnet (pass 4 finding #2) | ⚠️ PARTIAL | Guard retained as defense-in-depth but comment still describes case that can no longer reach this code. Comment needs update. | | Doc: grouping `math.MaxInt64` with negative values is surprising | sonnet (pass 4 finding #1) | ⚠️ PARTIAL | Field doc includes `math.MaxInt64` alongside negative values. Sonnet's clarification suggestion is noted but not blocking. | ### Assessment: ✅ Clean The implementation is correct, well-tested, and all blocking findings from prior reviews have been resolved. The `doGetWithReader` refactor cleanly eliminated the retry duplication, `redactURL` now properly sanitizes userinfo, the overflow is handled correctly, and all test boundaries are covered. The two PARTIAL items above are documentation nits — the dead-code comment in `doGetLimited` and the `math.MaxInt64` grouping in the field doc — neither affects correctness or safety. CI passes (reviewed against latest commit 1f0636e).
rodin removed the wip label 2026-05-13 12:48:07 +00:00
aweiker was assigned by rodin 2026-05-13 12:56:40 +00:00
rodin added the ready label 2026-05-13 12:56:40 +00:00
aweiker added 4 commits 2026-05-13 13:39:39 +00:00
Add a configurable MaxDiffSize field to Client that limits how much
data GetPullRequestDiff will read into memory. The default is 10 MB
(DefaultMaxDiffSize). When the diff exceeds the limit, ErrDiffTooLarge
is returned, allowing callers to skip position translation gracefully.

Implementation uses io.LimitReader to read maxBytes+1, detecting
overflow without buffering the entire response. Setting MaxDiffSize
to -1 disables the limit entirely.

Closes #92
- Clamp maxBytes+1 to prevent integer overflow to negative when
  maxBytes == math.MaxInt64 (falls back to math.MaxInt64)
- Update MaxDiffSize doc: 'any negative value' disables the limit,
  matching actual behavior of 'maxSize < 0' check
- Add concurrency safety note to MaxDiffSize field documentation,
  mirroring the existing note on RetryBackoff
- Consolidate six individual test functions into a single table-driven
  test (TestGetPullRequestDiff_SizeLimits) reducing repetition
- Add //nolint:errcheck annotation to test handler w.Write calls
refactor(gitea): eliminate retry duplication, harden redactURL and MaxInt64 edge case
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m2s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m21s
49d6ca77a3
- Extract doGetWithReader to share retry/backoff logic between doGet and
  doGetLimited, eliminating ~60 lines of duplicated code (addresses MINOR
  finding from all reviewers).
- redactURL now strips userinfo credentials (user:pass@host) in addition
  to query parameters (addresses security-review-bot finding).
- GetPullRequestDiff treats MaxDiffSize == math.MaxInt64 as disabled,
  preventing the silent enforcement bypass where the overflow clamp makes
  the size check unreachable (addresses security-review-bot finding).
- Improved error message wording: 'response exceeds N bytes' (NIT fix).
aweiker force-pushed review-bot-issue-92 from 1f0636e140 to 49d6ca77a3 2026-05-13 13:39:39 +00:00 Compare
sonnet-review-bot approved these changes 2026-05-13 13:40:27 +00:00
sonnet-review-bot left a comment
First-time contributor

Sonnet Review

Summary

The PR correctly hardens GetPullRequestDiff against unbounded diff size using io.LimitReader with a +1 read to detect overflow. The implementation is well-structured, follows established patterns in the codebase, CI passes, and tests cover all meaningful boundary cases. A few minor observations are noted but none block approval.

Findings

# Severity File Line Finding
1 [NIT] gitea/client.go 81 The doc comment for MaxDiffSize says 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design — math.MaxInt64 is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency.
2 [NIT] gitea/client.go 467 In doGetLimited, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guard `if maxSize < 0
3 [NIT] gitea/diff_size_test.go 75 The test case 'math.MaxInt64 treated as disabled' verifies the bypass works, but the test name 'treated as disabled' implies intended behavior rather than an edge-case workaround. A name like 'math.MaxInt64 bypasses limit via overflow protection' would be more precise. Minor naming issue only.

Recommendation

APPROVE — Approve. The implementation is correct and safe. The +1 trick to detect overflow without buffering the full response is idiomatic. doGetWithReader is a clean higher-order function that follows the Go pattern of separating retry logic from body-reading logic. Sentinel error ErrDiffTooLarge is properly declared as a package-level var so errors.Is works through wrapping. Tests are comprehensive and cover the boundary cases (exact limit, one byte over, disabled, default). The three findings above are all NITs that don't affect correctness.


Review by sonnet


Evaluated against 49d6ca77

# Sonnet Review ## Summary The PR correctly hardens `GetPullRequestDiff` against unbounded diff size using `io.LimitReader` with a `+1` read to detect overflow. The implementation is well-structured, follows established patterns in the codebase, CI passes, and tests cover all meaningful boundary cases. A few minor observations are noted but none block approval. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [NIT] | `gitea/client.go` | 81 | The doc comment for `MaxDiffSize` says 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design — `math.MaxInt64` is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency. | | 2 | [NIT] | `gitea/client.go` | 467 | In `doGetLimited`, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guard `if maxSize < 0 || maxSize == math.MaxInt64` in `GetPullRequestDiff` already prevents `doGetLimited` from ever being called with `math.MaxInt64`. The overflow protection in `doGetLimited` is therefore defensive-only. This is fine, but worth noting as the comment is slightly misleading — it implies a reachable case that `GetPullRequestDiff` already prevents. Consider adding a note that the clamp is a belt-and-suspenders defense. | | 3 | [NIT] | `gitea/diff_size_test.go` | 75 | The test case 'math.MaxInt64 treated as disabled' verifies the bypass works, but the test name 'treated as disabled' implies intended behavior rather than an edge-case workaround. A name like 'math.MaxInt64 bypasses limit via overflow protection' would be more precise. Minor naming issue only. | ## Recommendation **APPROVE** — Approve. The implementation is correct and safe. The `+1` trick to detect overflow without buffering the full response is idiomatic. `doGetWithReader` is a clean higher-order function that follows the Go pattern of separating retry logic from body-reading logic. Sentinel error `ErrDiffTooLarge` is properly declared as a package-level var so `errors.Is` works through wrapping. Tests are comprehensive and cover the boundary cases (exact limit, one byte over, disabled, default). The three findings above are all NITs that don't affect correctness. --- *Review by sonnet* <!-- review-bot:sonnet --> --- *Evaluated against 49d6ca77*
@@ -64,3 +78,4 @@
MaxDiffSize int64
}
// NewClient creates a new Gitea API client.
First-time contributor

[NIT] The doc comment for MaxDiffSize says 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design — math.MaxInt64 is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency.

**[NIT]** The doc comment for `MaxDiffSize` says 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design — `math.MaxInt64` is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency.
@@ -416,0 +464,4 @@
defer body.Close()
// Read up to maxBytes+1 to detect overflow.
// Clamp to prevent integer overflow when maxBytes == math.MaxInt64.
limitBytes := maxBytes + 1
First-time contributor

[NIT] In doGetLimited, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guard if maxSize < 0 || maxSize == math.MaxInt64 in GetPullRequestDiff already prevents doGetLimited from ever being called with math.MaxInt64. The overflow protection in doGetLimited is therefore defensive-only. This is fine, but worth noting as the comment is slightly misleading — it implies a reachable case that GetPullRequestDiff already prevents. Consider adding a note that the clamp is a belt-and-suspenders defense.

**[NIT]** In `doGetLimited`, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guard `if maxSize < 0 || maxSize == math.MaxInt64` in `GetPullRequestDiff` already prevents `doGetLimited` from ever being called with `math.MaxInt64`. The overflow protection in `doGetLimited` is therefore defensive-only. This is fine, but worth noting as the comment is slightly misleading — it implies a reachable case that `GetPullRequestDiff` already prevents. Consider adding a note that the clamp is a belt-and-suspenders defense.
@@ -0,0 +72,4 @@
client := NewClient(server.URL, "test-token")
client.MaxDiffSize = tt.maxDiffSize
client.RetryBackoff = []time.Duration{}
First-time contributor

[NIT] The test case 'math.MaxInt64 treated as disabled' verifies the bypass works, but the test name 'treated as disabled' implies intended behavior rather than an edge-case workaround. A name like 'math.MaxInt64 bypasses limit via overflow protection' would be more precise. Minor naming issue only.

**[NIT]** The test case 'math.MaxInt64 treated as disabled' verifies the bypass works, but the test name 'treated as disabled' implies intended behavior rather than an edge-case workaround. A name like 'math.MaxInt64 bypasses limit via overflow protection' would be more precise. Minor naming issue only.
security-review-bot requested review from security-review-bot 2026-05-13 13:41:00 +00:00
security-review-bot approved these changes 2026-05-13 13:41:00 +00:00
security-review-bot left a comment
Collaborator

Security Review

Summary

The changes harden diff fetching against resource exhaustion by enforcing a sensible default size limit and safely detecting overflow without buffering the full response. Redaction of URL userinfo in logs is a security improvement. CI passed and no exploitable vulnerabilities are introduced.

Recommendation

APPROVE — Approve this PR. The introduced MaxDiffSize with a default 10 MB and the use of io.LimitReader in doGetLimited mitigate potential DoS from large diffs while retaining configurability to disable limits when necessary. The refactor to doGetWithReader maintains robust retry/error handling and limits error body reads. The redactURL enhancement reduces risk of credential leakage in logs. Consider future hardening to apply similar size limits to other endpoints that may return large bodies (e.g., raw file content), but this PR’s scope is appropriate and secure.


Review by security


Evaluated against 49d6ca77

# Security Review ## Summary The changes harden diff fetching against resource exhaustion by enforcing a sensible default size limit and safely detecting overflow without buffering the full response. Redaction of URL userinfo in logs is a security improvement. CI passed and no exploitable vulnerabilities are introduced. ## Recommendation **APPROVE** — Approve this PR. The introduced MaxDiffSize with a default 10 MB and the use of io.LimitReader in doGetLimited mitigate potential DoS from large diffs while retaining configurability to disable limits when necessary. The refactor to doGetWithReader maintains robust retry/error handling and limits error body reads. The redactURL enhancement reduces risk of credential leakage in logs. Consider future hardening to apply similar size limits to other endpoints that may return large bodies (e.g., raw file content), but this PR’s scope is appropriate and secure. --- *Review by security* <!-- review-bot:security --> --- *Evaluated against 49d6ca77*
rodin removed the ready label 2026-05-13 13:41:48 +00:00
aweiker was unassigned by rodin 2026-05-13 13:41:48 +00:00
rodin self-assigned this 2026-05-13 13:41:49 +00:00
gpt-review-bot approved these changes 2026-05-13 13:42:06 +00:00
gpt-review-bot left a comment
First-time contributor

Gpt Review

Summary

The changes cleanly add a configurable max diff size with sensible defaults and a sentinel error, refactor GET handling to support limited reads, and harden URL redaction. Tests are comprehensive and CI passed. Overall, the implementation is correct and aligns with Go patterns.

Findings

# Severity File Line Finding
1 [MINOR] gitea/client.go 150 Special-casing math.MaxInt64 to disable the size limit is unconventional and redundant given the clamp logic in doGetLimited; using any negative value to disable the limit is sufficient and simpler.
2 [NIT] gitea/client.go 72 The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.
3 [NIT] gitea/client.go 48 ErrDiffTooLarge follows standard practices, but consider prefixing the error message with the package context (e.g., "gitea: diff too large") per error string conventions to improve disambiguation across packages.

Recommendation

APPROVE — The PR implements a robust size limit for PR diffs using io.LimitReader and a new sentinel error, introduces a configurable MaxDiffSize with zero-value defaults, and refactors the GET logic to allow limited reads — all of which adhere to established Go configuration and error-handling patterns. URL redaction is improved to remove userinfo as well as queries. Tests comprehensively cover the new behavior. I recommend approval. For polish, consider removing the math.MaxInt64 special case in both code and comments and rely solely on negative values to disable the limit, and optionally adjust the sentinel error string to include a package prefix.


Review by gpt


Evaluated against 49d6ca77

# Gpt Review ## Summary The changes cleanly add a configurable max diff size with sensible defaults and a sentinel error, refactor GET handling to support limited reads, and harden URL redaction. Tests are comprehensive and CI passed. Overall, the implementation is correct and aligns with Go patterns. ## Findings | # | Severity | File | Line | Finding | |---|----------|------|------|--------| | 1 | [MINOR] | `gitea/client.go` | 150 | Special-casing math.MaxInt64 to disable the size limit is unconventional and redundant given the clamp logic in doGetLimited; using any negative value to disable the limit is sufficient and simpler. | | 2 | [NIT] | `gitea/client.go` | 72 | The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch. | | 3 | [NIT] | `gitea/client.go` | 48 | ErrDiffTooLarge follows standard practices, but consider prefixing the error message with the package context (e.g., "gitea: diff too large") per error string conventions to improve disambiguation across packages. | ## Recommendation **APPROVE** — The PR implements a robust size limit for PR diffs using io.LimitReader and a new sentinel error, introduces a configurable MaxDiffSize with zero-value defaults, and refactors the GET logic to allow limited reads — all of which adhere to established Go configuration and error-handling patterns. URL redaction is improved to remove userinfo as well as queries. Tests comprehensively cover the new behavior. I recommend approval. For polish, consider removing the math.MaxInt64 special case in both code and comments and rely solely on negative values to disable the limit, and optionally adjust the sentinel error string to include a package prefix. --- *Review by gpt* <!-- review-bot:gpt --> --- *Evaluated against 49d6ca77*
@@ -47,6 +48,12 @@ func IsServerError(err error) bool {
return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600
First-time contributor

[NIT] ErrDiffTooLarge follows standard practices, but consider prefixing the error message with the package context (e.g., "gitea: diff too large") per error string conventions to improve disambiguation across packages.

**[NIT]** ErrDiffTooLarge follows standard practices, but consider prefixing the error message with the package context (e.g., "gitea: diff too large") per error string conventions to improve disambiguation across packages.
@@ -62,2 +69,4 @@
// Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
First-time contributor

[NIT] The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.

**[NIT]** The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.
@@ -131,0 +147,4 @@
maxSize := c.MaxDiffSize
if maxSize == 0 {
maxSize = DefaultMaxDiffSize
First-time contributor

[MINOR] Special-casing math.MaxInt64 to disable the size limit is unconventional and redundant given the clamp logic in doGetLimited; using any negative value to disable the limit is sufficient and simpler.

**[MINOR]** Special-casing math.MaxInt64 to disable the size limit is unconventional and redundant given the clamp logic in doGetLimited; using any negative value to disable the limit is sufficient and simpler.
rodin added the wip label 2026-05-13 13:52:53 +00:00
Author
Owner

Self-Review: PR #109

Self-review against 49d6ca77a3

Phase 1: Independent Findings

1. [NIT] Overflow-guard comment in doGetLimited describes unreachable case (gitea/client.go, line ~467)
The comment says "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" but the caller (GetPullRequestDiff) now explicitly routes math.MaxInt64 to the unlimited doGet path before doGetLimited is ever called. The guard itself is correct defense-in-depth, but the comment implies a case that can no longer reach this code. A note like "defense-in-depth; caller pre-filters MaxInt64" would prevent confusion for future readers.

No other findings. The diff is clean: doGetWithReader eliminates the prior retry duplication, redactURL properly strips userinfo, the overflow clamp handles edge cases correctly, error message wording is accurate, field docs have concurrency safety notes, and tests comprehensively cover all boundary conditions.

Phase 2: Prior Review Verification

Finding Reviewer Status Notes
doGetLimited duplicates retry logic sonnet, gpt, security (all passes) Resolved Extracted doGetWithReader shared core; doGet and doGetLimited are thin wrappers.
redactURL does not strip userinfo security-review-bot Resolved Now sets parsed.User = url.User("REDACTED"). 3 new test cases cover user+pass, user+pass+query, username-only.
maxBytes == math.MaxInt64 silently disables limit security-review-bot Resolved GetPullRequestDiff explicitly routes MaxInt64 to unlimited path. Test case added.
maxBytes+1 integer overflow security-review-bot Resolved Overflow clamp added: if limitBytes <= 0 { limitBytes = math.MaxInt64 }. Correct.
Error message "larger than" vs "exceeds" sonnet-review-bot Resolved Now uses "response exceeds %d bytes".
MaxDiffSize doc: "-1" vs "any negative" sonnet-review-bot Resolved Field doc says "any negative value (or math.MaxInt64) to disable the limit".
MaxDiffSize missing concurrency safety note gpt-review-bot Resolved Field doc matches RetryBackoff wording.
w.Write return ignored in tests sonnet, gpt Resolved //nolint:errcheck annotation added.
Tests not table-driven gpt-review-bot Resolved Single TestGetPullRequestDiff_SizeLimits with named subtests.
Overflow-guard comment describes unreachable case prior self-review ⚠️ Partial Guard retained (defense-in-depth) but comment still describes the math.MaxInt64 case which caller now pre-filters. Documentation nit only.
math.MaxInt64 grouping with negative in doc is surprising sonnet-review-bot 🚫 Invalid The field doc correctly documents both disable mechanisms. math.MaxInt64 is documented alongside negative values because it also triggers the unlimited path. This is intentional transparency, not a bug.

Assessment: Clean

All blocking findings from all prior reviews (sonnet, security, gpt) are resolved. The implementation is correct, well-tested, and ready to merge. The single remaining NIT (overflow-guard comment describing an unreachable case) is pure documentation polish with zero correctness or safety impact. CI is green.

## Self-Review: PR #109 Self-review against 49d6ca77a386c31106676f112b6fdc23b88fb360 ### Phase 1: Independent Findings **1. [NIT] Overflow-guard comment in `doGetLimited` describes unreachable case** (`gitea/client.go`, line ~467) The comment says "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" but the caller (`GetPullRequestDiff`) now explicitly routes `math.MaxInt64` to the unlimited `doGet` path before `doGetLimited` is ever called. The guard itself is correct defense-in-depth, but the comment implies a case that can no longer reach this code. A note like "defense-in-depth; caller pre-filters MaxInt64" would prevent confusion for future readers. No other findings. The diff is clean: `doGetWithReader` eliminates the prior retry duplication, `redactURL` properly strips userinfo, the overflow clamp handles edge cases correctly, error message wording is accurate, field docs have concurrency safety notes, and tests comprehensively cover all boundary conditions. ### Phase 2: Prior Review Verification | Finding | Reviewer | Status | Notes | |---------|----------|--------|-------| | `doGetLimited` duplicates retry logic | sonnet, gpt, security (all passes) | ✅ Resolved | Extracted `doGetWithReader` shared core; `doGet` and `doGetLimited` are thin wrappers. | | `redactURL` does not strip userinfo | security-review-bot | ✅ Resolved | Now sets `parsed.User = url.User("REDACTED")`. 3 new test cases cover user+pass, user+pass+query, username-only. | | `maxBytes == math.MaxInt64` silently disables limit | security-review-bot | ✅ Resolved | `GetPullRequestDiff` explicitly routes `MaxInt64` to unlimited path. Test case added. | | `maxBytes+1` integer overflow | security-review-bot | ✅ Resolved | Overflow clamp added: `if limitBytes <= 0 { limitBytes = math.MaxInt64 }`. Correct. | | Error message "larger than" vs "exceeds" | sonnet-review-bot | ✅ Resolved | Now uses "response exceeds %d bytes". | | `MaxDiffSize` doc: "-1" vs "any negative" | sonnet-review-bot | ✅ Resolved | Field doc says "any negative value (or math.MaxInt64) to disable the limit". | | `MaxDiffSize` missing concurrency safety note | gpt-review-bot | ✅ Resolved | Field doc matches `RetryBackoff` wording. | | `w.Write` return ignored in tests | sonnet, gpt | ✅ Resolved | `//nolint:errcheck` annotation added. | | Tests not table-driven | gpt-review-bot | ✅ Resolved | Single `TestGetPullRequestDiff_SizeLimits` with named subtests. | | Overflow-guard comment describes unreachable case | prior self-review | ⚠️ Partial | Guard retained (defense-in-depth) but comment still describes the `math.MaxInt64` case which caller now pre-filters. Documentation nit only. | | `math.MaxInt64` grouping with negative in doc is surprising | sonnet-review-bot | 🚫 Invalid | The field doc correctly documents both disable mechanisms. `math.MaxInt64` is documented alongside negative values because it also triggers the unlimited path. This is intentional transparency, not a bug. | ### Assessment: ✅ Clean All blocking findings from all prior reviews (sonnet, security, gpt) are resolved. The implementation is correct, well-tested, and ready to merge. The single remaining NIT (overflow-guard comment describing an unreachable case) is pure documentation polish with zero correctness or safety impact. CI is green.
rodin removed the wip label 2026-05-13 13:54:21 +00:00
rodin removed their assignment 2026-05-13 13:57:54 +00:00
aweiker was assigned by rodin 2026-05-13 13:57:54 +00:00
rodin added the ready label 2026-05-13 13:57:54 +00:00
aweiker approved these changes 2026-05-13 14:01:20 +00:00
aweiker merged commit a845ce32eb into main 2026-05-13 14:01:22 +00:00
Sign in to join this conversation.
5 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: rodin/review-bot#109