feat(gitea): harden GetPullRequestDiff against unbounded diff size #109
Reference in New Issue
Block a user
Delete Branch "review-bot-issue-92"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Hardens
GetPullRequestDiffagainst unbounded diff size to prevent potential DoS from very large PRs.Changes
DefaultMaxDiffSizeconstant (10 MB) andErrDiffTooLargesentinel errorMaxDiffSizefield toClientstructGetPullRequestDiffto enforce the size limit usingio.LimitReaderdoGetLimitedinternal method that readsmaxBytes+1to detect overflow without buffering the full responseTesting
go test ./...passesgo vet ./...cleangitea/diff_size_test.goCloses #92
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
235828ec)Sonnet Review
Summary
The PR correctly hardens
GetPullRequestDiffagainst unbounded memory allocation usingio.LimitReaderwith a read-one-more-byte overflow detection strategy. The implementation follows established patterns, CI passes, and the test coverage is thorough.Findings
gitea/client.godoGetLimitedduplicates roughly 80 lines of retry logic that already exists indoGet. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optionalmaxBytes int64parameter todoGet(where 0 means unlimited). The current duplication is a maintenance hazard.gitea/client.gomaxSize < 0check uses the negated condition to dispatch todoGetfor the unlimited path. SinceDefaultMaxDiffSizeis already set whenmaxSize == 0, the checkmaxSize < 0is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicitmaxSize == -1check (orconst noLimit int64 = -1) would make the intent clearer and prevent confusion if someone passes e.g. -2.gitea/diff_size_test.gow.Writereturn values are silently discarded in all test handlers. While harmless inhttptestservers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue.Recommendation
APPROVE — Approve with the understanding that the duplication between
doGetanddoGetLimitedshould be addressed in a follow-up refactor. The core feature is correct: the overflow detection viaio.LimitReader(resp.Body, maxBytes+1)followed by a length comparison is sound, the sentinel error wraps correctly forerrors.Is, the zero/negative conventions are well-documented, and the test matrix (exact boundary, one-over, disabled, default) is exactly what this kind of guard needs. The MINOR findings are maintainability concerns, not correctness bugs, and CI passes.Review by sonnet
Evaluated against
235828ec@@ -130,1 +142,3 @@body, err := c.doGet(ctx, reqURL)maxSize := c.MaxDiffSizeif maxSize == 0 {[MINOR] The
maxSize < 0check uses the negated condition to dispatch todoGetfor the unlimited path. SinceDefaultMaxDiffSizeis already set whenmaxSize == 0, the checkmaxSize < 0is correct, but the comment says "disabled" while the field doc says "-1 to disable". An explicitmaxSize == -1check (orconst noLimit int64 = -1) would make the intent clearer and prevent confusion if someone passes e.g. -2.@@ -413,6 +440,86 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {return nil, lastErr[MINOR]
doGetLimitedduplicates roughly 80 lines of retry logic that already exists indoGet. The only difference is the size-limiting read on the success path. This violates DRY and means any future change to retry behavior (new backoff strategy, different log fields, new error classification) must be made in two places. A cleaner approach would be to extract the retry loop into a shared helper that accepts a response-body handler function, or to add an optionalmaxBytes int64parameter todoGet(where 0 means unlimited). The current duplication is a maintenance hazard.@@ -0,0 +19,4 @@}))defer server.Close()client := NewClient(server.URL, "test-token")[NIT] The
w.Writereturn values are silently discarded in all test handlers. While harmless inhttptestservers, the convention in this codebase (and in stdlib examples) is to at least acknowledge the return. Not a correctness issue.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
235828ec)Security Review
Summary
The changes introduce a bounded diff fetch with sensible defaults and robust retry/error handling, reducing DoS risk from very large PR diffs. Implementation uses io.LimitReader and returns a sentinel error for oversize responses, with tests covering key scenarios. CI passed.
Findings
gitea/client.gogitea/client.goRecommendation
APPROVE — Overall, the hardening is sound and reduces resource exhaustion risk when fetching diffs. Please consider guarding against int64 overflow when computing maxBytes+1 in doGetLimited (e.g., clamp to MaxInt64 or treat MaxInt64 as unlimited), and ensure that the configuration to disable the limit is not user-controlled. With these minor considerations, the PR is ready to merge.
Review by security
Evaluated against
235828ec@@ -62,2 +68,4 @@// Modifying it while requests are in flight is not safe.RetryBackoff []time.Duration// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.[NIT] Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.
@@ -416,0 +487,4 @@if attempt < maxAttempts-1 && isTemporaryNetError(err) {slog.Warn("temporary network error, will retry","attempt", attempt+1,"url", redactURL(reqURL),[MINOR] Adding 1 to maxBytes in io.LimitReader(resp.Body, maxBytes+1) can overflow if maxBytes is math.MaxInt64, resulting in a negative limit and an immediate EOF (empty body) instead of enforcing the intended bound. Consider clamping at MaxInt64 or handling the overflow case explicitly.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
235828ec)Gpt Review
Summary
Well-implemented size limit for PR diffs with sensible defaults and comprehensive tests. The approach uses io.LimitReader and a sentinel error with error wrapping, aligning with Go patterns. Minor opportunities for clarity and refactoring noted.
Findings
gitea/client.gogitea/client.goRecommendation
APPROVE — The changes correctly harden GetPullRequestDiff with a configurable default-limited read, use a sentinel error wrapped with context, and include thorough tests for edge cases. The implementation follows Go error handling patterns (errors.Is compatibility) and zero-value usability for configuration. I recommend approval. As follow-ups, consider factoring shared retry/GET logic between doGet and doGetLimited to avoid duplication, and clarify concurrency expectations for MaxDiffSize in the field's comment to match RetryBackoff's guidance.
Review by gpt
Evaluated against
235828ec@@ -61,6 +67,10 @@ type Client struct {// This field must be configured before the first request is made.[NIT] MaxDiffSize is read at call time; like RetryBackoff, consider documenting that modifying it while requests are in flight is not safe to avoid potential data races given the Client is intended for concurrent use.
@@ -413,6 +440,86 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {return nil, lastErr[MINOR] doGetLimited largely duplicates retry and error-handling logic from doGet. Consider refactoring shared behavior into a helper to reduce maintenance overhead and risk of drift.
Self-Review: PR #109
Self-review against
235828ec42Phase 1: Independent Findings
[MINOR]
doGetLimitedduplicates ~65 lines of retry logic fromdoGet(gitea/client.go~line 440)The only difference between the two methods is the size-limited read on the success path. Any future change to retry logic (backoff strategy, log fields, error classification) must be applied twice. This is a maintenance hazard.
[MINOR] Integer overflow on
maxBytes+1(gitea/client.go~line 490)io.LimitReader(resp.Body, maxBytes+1)overflows to a negative value whenmaxBytes == math.MaxInt64. A negative LimitReader reads 0 bytes, making the subsequent length check always triggerErrDiffTooLarge. In practiceDefaultMaxDiffSizeis 10 MB so this won't happen with the default, but users who pass a very large explicit limit could hit it. A clamp likemin(maxBytes+1, math.MaxInt64)(or treatingMaxInt64as unlimited) would fix it.[MINOR]
maxSize < 0vs documented-1 to disable(gitea/client.go~line 144)The field doc says "Set to -1 to disable the limit" but the code checks
maxSize < 0, silently treating -2, -100, etc. identically. Either the check should be== -1(strict contract) or the doc should say "any negative value disables the limit." The current asymmetry is a trap.[NIT]
MaxDiffSizemissing concurrency documentationRetryBackoffhas an explicit doc comment: "must be configured before the first request is made; modifying it while requests are in flight is not safe."MaxDiffSizehas no equivalent guidance, even though the same concurrency constraint applies. Should mirrorRetryBackoff's wording.Phase 2: Prior Review Verification
doGetLimitedduplicates retry logicmaxSize < 0vs-1semanticsif maxSize < 0remains; field doc still says "Set to -1 to disable." Ambiguity not resolved.w.Writereturn values in testsw.Writereturns. NIT-level, not blocking.maxBytes+1int64 overflowio.LimitReader(resp.Body, maxBytes+1)has no clamp. Real correctness bug for extrememaxBytesvalues.-1disable from untrusted inputdoGetLimitedduplicationMaxDiffSizeconcurrency docMaxDiffSizefield still lacks the concurrency warning present onRetryBackoff.Assessment: ⚠️ Needs attention
The core feature (bounded diff fetch with
io.LimitReader, sentinel error, test coverage) is sound and all three bot reviews APPROVEd. However, two MINOR findings remain unaddressed: themaxBytes+1integer overflow (security bot #1 — real correctness bug under extreme values) and the< 0vs-1semantic ambiguity (sonnet bot #2). The code duplication indoGetLimitedwas accepted as a follow-up by all reviewers, so that is not blocking. The overflow and -1 ambiguity should be fixed before merge, or explicitly waived by the PR author.Review findings addressed
1.
maxBytes+1integer overflow (security-review-bot)Fixed. Added overflow clamp before passing to
io.LimitReader:When
maxBytes == math.MaxInt64,maxBytes+1wraps tomath.MinInt64(negative). The guard catches this and falls back tomath.MaxInt64, which effectively means "read everything" — correct behavior since if someone configured the max atMaxInt64, they don't care about the limit.2.
maxSize < 0vs documented-1 to disable(sonnet-review-bot)Fixed by updating the doc to say "Set to any negative value to disable the limit", which matches the actual
maxSize < 0check. This is the simpler approach — tightening to== -1would be a breaking change for no practical benefit, and "any negative disables" is a common Go convention (e.g.,strings.Count,bytes.Split).3. Code duplication in
doGetLimited(all reviewers)Acknowledged as follow-up — not addressed in this PR per reviewer consensus.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
1a305092)Sonnet Review
Summary
The PR correctly hardens GetPullRequestDiff against unbounded diff size. The implementation is sound, follows established patterns in the codebase, and includes comprehensive boundary tests. CI passes.
Findings
gitea/client.gogitea/client.gogitea/diff_size_test.goRecommendation
APPROVE — Approve with a suggestion to track the code-duplication between doGet and doGetLimited as a follow-up. The duplication is the only notable concern: the retry loop, backoff logic, request construction, and error-body reading are identical across both methods. A future bug fix or enhancement to retry behavior would need to be applied in two places. The correct fix would be to refactor the shared scaffolding into a private helper and let doGetLimited supply only the body-processing callback — but this is a non-trivial refactor that goes beyond the scope of this PR, which otherwise is correct, well-tested, and follows the project's conventions.
Review by sonnet
Evaluated against
1a305092@@ -131,0 +153,4 @@return "", fmt.Errorf("fetch diff: %w", err)}return string(body), nil}[NIT] The disabled-limit branch (maxSize < 0) wraps the error with 'fetch diff: %w' but the limited branch also wraps with the same prefix inside the return. This is fine, but worth noting that the error string convention is consistent between the two paths.
@@ -413,6 +441,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {return nil, lastErr[MINOR] doGetLimited is ~90 lines and duplicates nearly all of doGet's retry/backoff logic. The two methods share: the attempt loop, backoff wiring, retry-delay select, request construction, header setting, temporary-network-error retry, and error-body reading. The only meaningful difference is the LimitReader + overflow check on the success path. Extracting a shared doGetCore(ctx, reqURL string, processBody func(r io.Reader) ([]byte, error)) ([]byte, error) or similar would eliminate the duplication without sacrificing clarity. As-is, any future change to the retry policy (new backoff defaults, added headers, logging changes) must be applied in both places.
@@ -0,0 +22,4 @@client := NewClient(server.URL, "test-token")client.MaxDiffSize = 100 // 100 bytes limitclient.RetryBackoff = []time.Duration{} // no delay in tests[NIT] The test server handler ignores write errors from w.Write([]byte(largeDiff)). This is idiomatic for test handlers and not a bug, but for completeness the existing codebase pattern is the same so this is consistent.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
1a305092)Security Review
Summary
The changes add a sensible default limit for PR diffs, enforce it safely with streaming reads, and include thorough tests. This hardens against resource exhaustion without introducing new security risks.
Findings
gitea/client.goRecommendation
APPROVE — Overall, this PR improves security by bounding diff sizes with a safe default and streaming reads. Approve as-is. For defense in depth, consider handling pathological configurations: treat values >= math.MaxInt64-1 as a special case (e.g., require using -1 to disable, or clamp to a safe high bound and still perform overflow detection). This avoids potentially unbounded reads if a caller mistakenly sets MaxDiffSize to MaxInt64.
Review by security
Evaluated against
1a305092@@ -416,0 +469,4 @@timer := time.NewTimer(delay)select {case <-timer.C:[MINOR] In doGetLimited, when maxBytes equals math.MaxInt64, limitBytes overflows and is clamped to math.MaxInt64, causing the reader to effectively allow an unbounded read (up to MaxInt64). While this only happens with an extreme configuration, it could permit large memory usage if someone sets MaxDiffSize to MaxInt64 instead of using -1 to disable the limit.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
1a305092)Gpt Review
Summary
Solid hardening of GetPullRequestDiff with a sensible default size limit and a clear sentinel error. The doGetLimited implementation correctly bounds memory using io.LimitReader and preserves existing retry semantics. Tests comprehensively cover boundary conditions.
Findings
gitea/client.gogitea/client.gogitea/client.gogitea/diff_size_test.goRecommendation
APPROVE — The change is well-implemented and aligns with Go patterns for bounding reads (io.LimitReader) and sentinel errors (errors.Is). The default limit and the ability to disable it are sensible, and tests exercise key edge cases. Consider refactoring to reduce duplication between doGet and doGetLimited, and add a brief concurrency note to MaxDiffSize similar to RetryBackoff. Also note the potential for breaking external unkeyed composite literals in release notes. Otherwise, this is ready to merge.
Review by gpt
Evaluated against
1a305092@@ -61,6 +68,10 @@ type Client struct {// This field must be configured before the first request is made.[MINOR] Adding an exported field (MaxDiffSize) to the Client struct may break external code that constructs Client using unkeyed composite literals. Consider calling this out in release notes or ensuring consumers use NewClient.
[NIT] MaxDiffSize lacks an explicit concurrency note like RetryBackoff. Since Client is used concurrently, documenting that this field should be set before use (and not mutated concurrently) would clarify safe usage.
@@ -413,6 +441,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {return nil, lastErr[MINOR] doGetLimited largely duplicates logic from doGet (retry/backoff, error handling). This increases maintenance burden and risk of drift. Consider refactoring to share common request/response handling with a size-limit strategy.
@@ -0,0 +1,143 @@package gitea[NIT] Multiple similar test cases could be consolidated into a table-driven test to reduce repetition and follow established testing patterns.
Acknowledged as a follow-up refactor (see prior comment). All three reviews approved with this accepted as future work.
Thanks for confirming the consistency — no change needed here.
Added
//nolint:errcheckannotation to the test handlerw.Writecall in the table-driven rewrite for clarity.Already fixed in the previous commit — see comment. The overflow clamp (
if limitBytes <= 0 { limitBytes = math.MaxInt64 }) handles this case correctly.Not applicable here. The
Clientstruct has unexported fields (baseURL,token,http), so no external package can use unkeyed composite literals — the compiler already rejects it. OnlyNewClient()can construct aClientfrom outside the package.Same as sonnet finding — acknowledged as a follow-up refactor. See prior comment.
Fixed in
84ac50a— added the same concurrency safety note thatRetryBackoffhas: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe."Fixed in
84ac50a— consolidated all six individual test functions into a single table-drivenTestGetPullRequestDiff_SizeLimitstest with named subtests.Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
84ac50a8)Sonnet Review
Summary
The implementation correctly hardens GetPullRequestDiff against unbounded diff sizes using a well-designed sentinel error, configurable limit field, and proper overflow detection. The code follows established patterns and is well-tested with appropriate boundary cases.
Findings
gitea/client.gogitea/client.goRecommendation
APPROVE — Approve with the understanding that the code is correct and well-tested. The only meaningful concern is the duplication between doGet and doGetLimited — this is a MINOR finding because it works correctly now but creates a maintenance burden. If the retry logic changes in the future, both copies must be updated. Consider refactoring in a follow-up to extract the shared retry scaffold. The sentinel error follows the patterns correctly (lowercase, no punctuation, package prefix implied by context, wrapped with %w for proper errors.Is propagation). The overflow detection logic (reading maxBytes+1, clamping at math.MaxInt64) is correct and well-commented. Tests cover all meaningful boundary cases including exact-limit, one-byte-over, disabled, and default.
Review by sonnet
Evaluated against
84ac50a8@@ -131,0 +158,4 @@return string(body), nil}body, err := c.doGetLimited(ctx, reqURL, maxSize)[NIT] The error message returned on size exceeded wraps ErrDiffTooLarge correctly with %w, but the message says 'response is larger than %d bytes' which is technically inaccurate — the response is at least maxBytes+1 bytes. A slightly more accurate phrasing would be 'response exceeds %d bytes'. Minor readability nit.
@@ -413,6 +444,92 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {return nil, lastErr[MINOR] doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.
Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
84ac50a8)Gpt Review
Summary
The changes add a sensible, configurable size limit for PR diffs with a safe default, implement bounded reads to avoid unbounded memory usage, and include thorough tests. Error handling, configuration semantics, and retry behavior remain consistent and idiomatic.
Recommendation
APPROVE — The implementation follows Go error-handling and configuration patterns well: a package-level sentinel error, zero-value default behavior on the Client, and bounded reads via io.LimitReader. The new doGetLimited mirrors existing retry logic and clamps overflow properly. Tests cover boundary conditions and disabled limits. Given CI is green and no major issues are identified, this is ready to merge.
Review by gpt
Evaluated against
84ac50a8Original reviewSuperseded — see current review for up-to-date findings.
Previous findings (commit
84ac50a8)Security Review
Summary
The changes introduce a bounded read for PR diffs with sensible defaults and clear override behavior, reducing DoS risk from oversized responses. The implementation uses limit readers, preserves existing retry logic, and includes comprehensive tests; CI passed.
Findings
gitea/client.gogitea/client.goRecommendation
APPROVE — Overall, the diff-size hardening is sound and improves resilience against large PR diffs without introducing new security vulnerabilities. Approve as is. As defense-in-depth, consider enhancing redactURL to strip userinfo components to prevent accidental credential leakage in logs, and add a guard for extreme MaxDiffSize values (e.g., clamp to a safe upper bound or explicitly treat MaxInt64 as 'disabled' with clear documentation) to avoid unbounded memory use under misconfiguration.
Review by security
Evaluated against
84ac50a8@@ -416,0 +455,4 @@backoff = []time.Duration{1 * time.Second, 2 * time.Second}}const maxErrorBodyBytes = 64 * 1024[MINOR] Logging uses redactURL(reqURL) which only strips query parameters and may leak userinfo credentials if present in baseURL (e.g., https://user:pass@host). Consider redacting userinfo in redactURL to avoid potential secret exposure in logs.
@@ -416,0 +489,4 @@resp, err := c.http.Do(req)if err != nil {lastErr = errif attempt < maxAttempts-1 && isTemporaryNetError(err) {[MINOR] In doGetLimited, when maxBytes == math.MaxInt64, limitBytes is set to math.MaxInt64, effectively removing the practical limit and allowing io.ReadAll to attempt to read an unbounded response. While this requires explicit misconfiguration, it can still risk memory exhaustion; consider treating extremely large values as disabled in a controlled way or enforcing a reasonable upper cap.
Self-Review: PR #109
Self-review against
84ac50a8cfPhase 1: Independent Findings
1. [MINOR]
doGetLimitedduplicates retry/backoff logic fromdoGet(gitea/client.go)~65 lines of retry loop, backoff initialization, delay select, request construction, header setting, temporary-network-error handling, and error-body reading are duplicated between
doGetanddoGetLimited. The only meaningful difference is the boundedio.LimitReaderread on the success path. Any future change to retry behavior (new backoff strategy, new retriable error conditions, log field changes) must be applied in both places. This is a known maintenance hazard — noted here as still present per design, pending a future refactor.2. [NIT]
redactURLdoes not strip userinfo (gitea/client.go, line 329)redactURLstrips query parameters but does not stripUserinfofrom the URL (e.g.,https://user:pass@host). In the current codebasebaseURLis always set from a config value (no credential embedding), so this is low risk. But the function's name and intent suggest it should be comprehensive. The security-review-bot raised this in their latest pass.3. [NIT] Error message slight inaccuracy (
gitea/client.go, line 515)"response is larger than %d bytes"— when triggered, we've readmaxBytes+1bytes which means the response is at minimummaxBytes+1bytes, not just> maxBytes."response exceeds %d bytes"would be marginally more accurate but both convey the meaning clearly. Very minor.4. [OBSERVATION] No test for
maxBytes == math.MaxInt64The overflow clamp (
limitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }) handlesmath.MaxInt64correctly (setslimitBytes = math.MaxInt64). The overflow checkint64(len(body)) > maxBytescan never be true whenmaxBytes = math.MaxInt64sincelen()on a 64-bit platform is bounded byMaxInt64. The behavior is effectively "no limit", which is distinct from passing-1. This is well-handled but there's no test for it. Not a bug — more of a documentation gap for extreme configurations.Phase 2: Prior Review Verification
doGetLimitedduplicates doGet retry logicmaxBytes+1int64 overflowlimitBytes = maxBytes + 1; if limitBytes <= 0 { limitBytes = math.MaxInt64 }. Correct.maxSize < 0vs explicit-1— useconst noLimitfor claritymaxSize < 0is correct. A named constant would be a style preference, not a correctness issue.w.Writereturn ignored in test handler//nolint:errcheckannotation added; test consolidated to table-driven format.redactURLdoes not strip userinforedactURLonly strips query params, not userinfo. Low risk in practice (baseURL won't contain credentials) but technically incomplete.MaxDiffSizemissing concurrency safety noteRetryBackoff: "This field must be configured before the first request is made. Modifying it while requests are in flight is not safe."Clienthas unexported fields, so external packages cannot use unkeyed composite literals. The compiler prevents it.TestGetPullRequestDiff_SizeLimitstable-driven test with named subtests.maxBytes == math.MaxInt64allows unbounded read+1overflow, but withmaxBytes = MaxInt64,limitBytes=MaxInt64and thelen(body) > maxBytescheck can never trigger. This meansMaxInt64is a misconfiguration that silently disables the limit rather than enforcing it. This is an edge case with no practical exploit path (requires deliberate misconfiguration), and behavior is consistent with "use -1 to disable". Accepted as-is.doGetlog says "disabled" but field doc says "-1 to disable"maxSize < 0check is correctly documented in the field's GoDoc. No inconsistency.Assessment: ✅ Clean
The implementation is correct, well-tested, and ready to merge. The core change — bounding
GetPullRequestDiffwith a configurable limit usingio.LimitReaderand a sentinel error — is sound. The overflow clamp is correct, tests cover all meaningful boundary cases, and prior review findings are either resolved or explicitly accepted as follow-up work. The two "Still present" items are nits with no correctness impact:redactURLnot stripping userinfo (low risk, no credentials in baseURL), and the error message wording preference. ThedoGetLimitedduplication remains by design (agreed as future refactor). CI is green.Review feedback addressed (
1f0636e)[MINOR] doGetLimited duplicates retry/backoff logic (sonnet, gpt, all reviewers)
Fixed. Extracted
doGetWithReaderas a shared core that accepts areadBody func(io.ReadCloser) ([]byte, error)callback.doGetanddoGetLimitedare now thin wrappers (~5 lines each) over the shared retry/backoff logic. Net reduction of 28 lines.[MINOR] redactURL does not strip userinfo (security-review-bot)
Fixed.
redactURLnow strips userinfo credentials (user:pass@host) by settingparsed.User = url.User("REDACTED")before returning the sanitized URL. Added 3 test cases covering user+pass, user+pass+query, and username-only scenarios.[MINOR] maxBytes == math.MaxInt64 silently disables limit (security-review-bot)
Fixed.
GetPullRequestDiffnow explicitly treatsMaxDiffSize == math.MaxInt64as disabled (falls through to the unlimiteddoGetpath), preventing the silent bypass where the overflow clamp makes the size check unreachable. Added test case and updated field documentation.[NIT] Error message wording (sonnet-review-bot, review 3187)
Fixed. Changed from "response is larger than" to "response exceeds" for accuracy.
[NIT] Unkeyed composite literal concern (gpt-review-bot)
Not applicable. The
Clientstruct has unexported fields (baseURL,token,http), so external packages cannot use unkeyed composite literals — the compiler already rejects it.[NIT] MaxDiffSize concurrency note
Already fixed in prior commit (
84ac50a) — the field doc matchesRetryBackoff.[NIT] Table-driven tests
Already fixed in prior commit (
84ac50a).Sonnet Review
Summary
The PR correctly hardens
GetPullRequestDiffagainst unbounded diff sizes using a cleandoGetWithReaderabstraction, sentinel error, and configurable limit. The implementation is correct, well-tested, and follows the established patterns in the codebase.Findings
gitea/client.goMaxDiffSizefield documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment groupingmath.MaxInt64alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.'gitea/client.godoGetLimited, the overflow guardif limitBytes <= 0 { limitBytes = math.MaxInt64 }is dead code given the caller's pre-check (maxSize == math.MaxInt64routes to the unlimited path before this function is ever called). The comment on the clamp says it handlesmaxBytes == math.MaxInt64, but that case never reachesdoGetLimited. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller.gitea/diff_size_test.godefer server.Close()inside a table-drivent.Runloop. This is fine here since each subtest gets its ownserver, but it's worth noting thatt.Cleanup(server.Close)would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern.Recommendation
APPROVE — Approve. The implementation is correct and well-designed. The
doGetWithReaderrefactor is a clean application of the layered API pattern, the+1read-to-detect-overflow approach is correct, and the tests cover all the boundary cases (exactly at limit, one over, disabled). The two MINOR findings are about comment clarity and dead code, neither of which affects correctness or safety. CI passes.Review by sonnet
Evaluated against
1f0636e1@@ -130,1 +147,3 @@body, err := c.doGet(ctx, reqURL)maxSize := c.MaxDiffSizeif maxSize == 0 {[MINOR] The
MaxDiffSizefield documentation says 'Set to any negative value (or math.MaxInt64) to disable the limit' but the convention for disabling is inconsistent. Negative values and math.MaxInt64 both map to the unlimited path, but the doc comment groupingmath.MaxInt64alongside negative values is slightly surprising since math.MaxInt64 is positive. Consider clarifying the doc: 'Set to a negative value to disable the limit. math.MaxInt64 is also treated as disabled to avoid overflow edge cases.'@@ -416,0 +467,4 @@limitBytes := maxBytes + 1if limitBytes <= 0 {limitBytes = math.MaxInt64}[MINOR] In
doGetLimited, the overflow guardif limitBytes <= 0 { limitBytes = math.MaxInt64 }is dead code given the caller's pre-check (maxSize == math.MaxInt64routes to the unlimited path before this function is ever called). The comment on the clamp says it handlesmaxBytes == math.MaxInt64, but that case never reachesdoGetLimited. The guard is harmless and provides defense-in-depth, but the comment is misleading since it describes a case that is already excluded by the caller.@@ -0,0 +69,4 @@w.Write([]byte(tt.diff)) //nolint:errcheck // test handler}))defer server.Close()[NIT] The test uses
defer server.Close()inside a table-drivent.Runloop. This is fine here since each subtest gets its ownserver, but it's worth noting thatt.Cleanup(server.Close)would be more idiomatic per the testing patterns (ensures cleanup after the subtest completes, not just when the closure returns). Not a bug since both are equivalent in this pattern.Security Review
Summary
Solid security hardening that bounds diff fetch size to mitigate DoS risk and improves URL redaction in logs. Implementation is careful about overflow and preserves existing behavior when limits are disabled.
Recommendation
APPROVE — The changes correctly enforce a configurable maximum diff size using io.LimitReader with a +1 byte overflow check, avoiding unbounded memory consumption. Edge cases are handled (0 uses default, <0 and MaxInt64 disable the limit), and the refactoring via doGetWithReader keeps retry behavior while limiting error bodies. The URL redaction improvement avoids leaking userinfo and query parameters in logs. Tests comprehensively cover limits and boundary conditions. Consider extending similar size limiting to other large-response endpoints in the future for broader DoS resilience and ensuring non-2xx paths elsewhere also cap error body sizes, but this PR is good to merge as-is.
Review by security
Evaluated against
1f0636e1Gpt Review
Summary
Solid, idiomatic hardening against unbounded diff sizes with clear defaults, a sentinel error, and well-structured retries and body reading. Tests are comprehensive and CI passed.
Findings
gitea/client.gogitea/client.goRecommendation
APPROVE — The changes are well-implemented and align with Go patterns: zero-value configuration defaults, a clear sentinel error with proper wrapping, and safe, bounded reads using io.LimitReader. The retry logic is cleanly factored via doGetWithReader, and redactURL now properly strips userinfo and query parameters. Tests cover boundary conditions thoroughly and CI has passed. I recommend approval. As minor polish, consider dropping the math.MaxInt64 special-case in favor of a single "negative disables" convention, and optionally refactor the limit computation into a small helper to avoid duplication.
Review by gpt
Evaluated against
1f0636e1@@ -61,6 +68,14 @@ type Client struct {// This field must be configured before the first request is made.[MINOR] API design nit: using math.MaxInt64 as a special value to disable the diff size limit may be surprising. Since negative values already disable the limit and doGetLimited clamps overflow, consider simplifying the API by documenting only negative values as the disable mechanism and treating very large positive values (including MaxInt64) as enforced limits.
@@ -125,9 +140,28 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number}[NIT] The MaxDiffSize branching logic and limit calculation are repeated across GetPullRequestDiff and doGetLimited. Consider factoring an internal helper (e.g., effectiveMaxDiffSize() and a boolean for disabled) to centralize the policy and reduce repetition.
Self-Review: PR #109
Self-review against
1f0636e140Phase 1: Independent Findings
1. [NIT]
doGetLimitedoverflow guard is now dead code (gitea/client.go, ~line 470)The
limitBytes <= 0clamp was added to protect againstmaxBytes+1overflowing whenmaxBytes == math.MaxInt64. However,GetPullRequestDiffnow pre-checksmaxSize == math.MaxInt64and routes it directly to the unlimiteddoGetpath before ever callingdoGetLimited. SomaxBytes == math.MaxInt64can never reachdoGetLimited. The guard is harmless defense-in-depth, but the comment — "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" — describes a case that is now excluded by the caller. A brief note clarifying this ("retained as defense-in-depth; caller pre-filters MaxInt64") would prevent future confusion when someone readsdoGetLimitedin isolation.2. [NIT] Two separate unlimited-path guards in
GetPullRequestDiffif maxSize < 0 || maxSize == math.MaxInt64routes to the unlimiteddoGetpath. Clear and correct. A singleeffectivelyUnlimitedhelper could simplify it, but as-is it's readable and the comment explains the rationale. No action required.Otherwise the diff is clean.
doGetWithReaderis a well-designed abstraction,redactURLnow correctly strips userinfo, the error message wording is accurate, and tests cover all boundary conditions includingmath.MaxInt64.Phase 2: Prior Review Verification
doGetLimitedduplicates retry/backoff logicdoGetWithReadershared helper.doGetanddoGetLimitedare now ~5-line wrappers.redactURLdoes not strip userinforedactURLnow setsparsed.User = url.User("REDACTED")when userinfo is present. 3 new test cases added.maxBytes == math.MaxInt64silently disables limitGetPullRequestDiffexplicitly routesMaxInt64to unlimiteddoGetpath. Documented. Test case added."response exceeds %d bytes".MaxDiffSizedoc: "any negative value" vs "-1 to disable"MaxDiffSizemissing concurrency safety noteRetryBackoffconcurrency warning.w.Writereturn ignored in test handlers//nolint:errcheckannotation added.TestGetPullRequestDiff_SizeLimitswith named subtests.Clienthas unexported fields; compiler rejects unkeyed external literals.doGetWithReader(not deferred).doGetLimitedoverflow guard comment misleadingmath.MaxInt64with negative values is surprisingmath.MaxInt64alongside negative values. Sonnet's clarification suggestion is noted but not blocking.Assessment: ✅ Clean
The implementation is correct, well-tested, and all blocking findings from prior reviews have been resolved. The
doGetWithReaderrefactor cleanly eliminated the retry duplication,redactURLnow properly sanitizes userinfo, the overflow is handled correctly, and all test boundaries are covered. The two PARTIAL items above are documentation nits — the dead-code comment indoGetLimitedand themath.MaxInt64grouping in the field doc — neither affects correctness or safety. CI passes (reviewed against latest commit1f0636e).1f0636e140to49d6ca77a3Sonnet Review
Summary
The PR correctly hardens
GetPullRequestDiffagainst unbounded diff size usingio.LimitReaderwith a+1read to detect overflow. The implementation is well-structured, follows established patterns in the codebase, CI passes, and tests cover all meaningful boundary cases. A few minor observations are noted but none block approval.Findings
gitea/client.goMaxDiffSizesays 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design —math.MaxInt64is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency.gitea/client.godoGetLimited, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guard `if maxSize < 0gitea/diff_size_test.goRecommendation
APPROVE — Approve. The implementation is correct and safe. The
+1trick to detect overflow without buffering the full response is idiomatic.doGetWithReaderis a clean higher-order function that follows the Go pattern of separating retry logic from body-reading logic. Sentinel errorErrDiffTooLargeis properly declared as a package-level var soerrors.Isworks through wrapping. Tests are comprehensive and cover the boundary cases (exact limit, one byte over, disabled, default). The three findings above are all NITs that don't affect correctness.Review by sonnet
Evaluated against
49d6ca77@@ -64,3 +78,4 @@MaxDiffSize int64}// NewClient creates a new Gitea API client.[NIT] The doc comment for
MaxDiffSizesays 'Set to any negative value (or math.MaxInt64) to disable the limit.' This is slightly awkward API design —math.MaxInt64is a magic value that happens to bypass the overflow protection, not an intentional disable signal. The docstring documents it correctly but it would be cleaner if the disable path were only a negative value. However, since the math.MaxInt64 bypass exists to defend against a subtle overflow in doGetLimited (where maxBytes+1 would wrap), documenting both is the right call for transparency.@@ -416,0 +464,4 @@defer body.Close()// Read up to maxBytes+1 to detect overflow.// Clamp to prevent integer overflow when maxBytes == math.MaxInt64.limitBytes := maxBytes + 1[NIT] In
doGetLimited, the clamp comment says 'Clamp to prevent integer overflow when maxBytes == math.MaxInt64' but the guardif maxSize < 0 || maxSize == math.MaxInt64inGetPullRequestDiffalready preventsdoGetLimitedfrom ever being called withmath.MaxInt64. The overflow protection indoGetLimitedis therefore defensive-only. This is fine, but worth noting as the comment is slightly misleading — it implies a reachable case thatGetPullRequestDiffalready prevents. Consider adding a note that the clamp is a belt-and-suspenders defense.@@ -0,0 +72,4 @@client := NewClient(server.URL, "test-token")client.MaxDiffSize = tt.maxDiffSizeclient.RetryBackoff = []time.Duration{}[NIT] The test case 'math.MaxInt64 treated as disabled' verifies the bypass works, but the test name 'treated as disabled' implies intended behavior rather than an edge-case workaround. A name like 'math.MaxInt64 bypasses limit via overflow protection' would be more precise. Minor naming issue only.
Security Review
Summary
The changes harden diff fetching against resource exhaustion by enforcing a sensible default size limit and safely detecting overflow without buffering the full response. Redaction of URL userinfo in logs is a security improvement. CI passed and no exploitable vulnerabilities are introduced.
Recommendation
APPROVE — Approve this PR. The introduced MaxDiffSize with a default 10 MB and the use of io.LimitReader in doGetLimited mitigate potential DoS from large diffs while retaining configurability to disable limits when necessary. The refactor to doGetWithReader maintains robust retry/error handling and limits error body reads. The redactURL enhancement reduces risk of credential leakage in logs. Consider future hardening to apply similar size limits to other endpoints that may return large bodies (e.g., raw file content), but this PR’s scope is appropriate and secure.
Review by security
Evaluated against
49d6ca77Gpt Review
Summary
The changes cleanly add a configurable max diff size with sensible defaults and a sentinel error, refactor GET handling to support limited reads, and harden URL redaction. Tests are comprehensive and CI passed. Overall, the implementation is correct and aligns with Go patterns.
Findings
gitea/client.gogitea/client.gogitea/client.goRecommendation
APPROVE — The PR implements a robust size limit for PR diffs using io.LimitReader and a new sentinel error, introduces a configurable MaxDiffSize with zero-value defaults, and refactors the GET logic to allow limited reads — all of which adhere to established Go configuration and error-handling patterns. URL redaction is improved to remove userinfo as well as queries. Tests comprehensively cover the new behavior. I recommend approval. For polish, consider removing the math.MaxInt64 special case in both code and comments and rely solely on negative values to disable the limit, and optionally adjust the sentinel error string to include a package prefix.
Review by gpt
Evaluated against
49d6ca77@@ -47,6 +48,12 @@ func IsServerError(err error) bool {return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600[NIT] ErrDiffTooLarge follows standard practices, but consider prefixing the error message with the package context (e.g., "gitea: diff too large") per error string conventions to improve disambiguation across packages.
@@ -62,2 +69,4 @@// Modifying it while requests are in flight is not safe.RetryBackoff []time.Duration// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.[NIT] The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.
@@ -131,0 +147,4 @@maxSize := c.MaxDiffSizeif maxSize == 0 {maxSize = DefaultMaxDiffSize[MINOR] Special-casing math.MaxInt64 to disable the size limit is unconventional and redundant given the clamp logic in doGetLimited; using any negative value to disable the limit is sufficient and simpler.
Self-Review: PR #109
Self-review against
49d6ca77a3Phase 1: Independent Findings
1. [NIT] Overflow-guard comment in
doGetLimiteddescribes unreachable case (gitea/client.go, line ~467)The comment says "Clamp to prevent integer overflow when maxBytes == math.MaxInt64" but the caller (
GetPullRequestDiff) now explicitly routesmath.MaxInt64to the unlimiteddoGetpath beforedoGetLimitedis ever called. The guard itself is correct defense-in-depth, but the comment implies a case that can no longer reach this code. A note like "defense-in-depth; caller pre-filters MaxInt64" would prevent confusion for future readers.No other findings. The diff is clean:
doGetWithReadereliminates the prior retry duplication,redactURLproperly strips userinfo, the overflow clamp handles edge cases correctly, error message wording is accurate, field docs have concurrency safety notes, and tests comprehensively cover all boundary conditions.Phase 2: Prior Review Verification
doGetLimitedduplicates retry logicdoGetWithReadershared core;doGetanddoGetLimitedare thin wrappers.redactURLdoes not strip userinfoparsed.User = url.User("REDACTED"). 3 new test cases cover user+pass, user+pass+query, username-only.maxBytes == math.MaxInt64silently disables limitGetPullRequestDiffexplicitly routesMaxInt64to unlimited path. Test case added.maxBytes+1integer overflowif limitBytes <= 0 { limitBytes = math.MaxInt64 }. Correct.MaxDiffSizedoc: "-1" vs "any negative"MaxDiffSizemissing concurrency safety noteRetryBackoffwording.w.Writereturn ignored in tests//nolint:errcheckannotation added.TestGetPullRequestDiff_SizeLimitswith named subtests.math.MaxInt64case which caller now pre-filters. Documentation nit only.math.MaxInt64grouping with negative in doc is surprisingmath.MaxInt64is documented alongside negative values because it also triggers the unlimited path. This is intentional transparency, not a bug.Assessment: ✅ Clean
All blocking findings from all prior reviews (sonnet, security, gpt) are resolved. The implementation is correct, well-tested, and ready to merge. The single remaining NIT (overflow-guard comment describing an unreachable case) is pure documentation polish with zero correctness or safety impact. CI is green.