feat(gitea): harden GetPullRequestDiff against unbounded diff size #109

Merged
aweiker merged 4 commits from review-bot-issue-92 into main 2026-05-13 14:01:22 +00:00
Showing only changes of commit 004343d05f - Show all commits
+10 -3
View File
@@ -11,6 +11,7 @@ import (
"fmt"
"io"
"log/slog"
"math"
"net"
"net/http"
"net/url"
5
@@ -69,7 +70,7 @@ type Client struct {
RetryBackoff []time.Duration
security-review-bot marked this conversation as resolved Outdated
Outdated
Review

[NIT] Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.

**[NIT]** Disabling the diff size limit (MaxDiffSize = -1) removes protection against large responses and could allow memory exhaustion if configured from untrusted input. Ensure this setting cannot be influenced by untrusted sources.
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
Review

[NIT] The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.

**[NIT]** The MaxDiffSize field comment suggests using math.MaxInt64 to disable the limit. This dual mechanism (negative and MaxInt64) may confuse users; consider documenting only the negative value as the disable switch.
// If zero, defaults to DefaultMaxDiffSize (10 MB). Set to -1 to disable the limit.
// If zero, defaults to DefaultMaxDiffSize (10 MB). Set to any negative value to disable the limit.
MaxDiffSize int64
}
11
@@ -442,7 +443,8 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
// doGetLimited performs an HTTP GET request with retry (like doGet) but enforces
Review

[MINOR] doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.

**[MINOR]** doGetLimited duplicates the entire retry loop from doGet (backoff initialization, delay logic, request construction, error handling, retry conditions). This is ~60 lines of duplicated logic. A cleaner design would accept an optional response-body transformer/limiter function in a single internal method, or factor the retry machinery into a shared helper that accepts a per-attempt callback. As-is, any future fix to the retry logic (e.g., adding a new retriable error condition, changing log fields) must be applied in two places.
// a maximum response body size. Returns ErrDiffTooLarge if the response exceeds
// maxBytes. It reads maxBytes+1 to detect overflow without buffering the entire body.
// maxBytes. It reads maxBytes+1 (clamped to avoid overflow) to detect truncation
// without buffering the entire body.
func (c *Client) doGetLimited(ctx context.Context, reqURL string, maxBytes int64) ([]byte, error) {
const maxAttempts = 3
backoff := c.RetryBackoff
6
@@ -495,7 +497,12 @@ func (c *Client) doGetLimited(ctx context.Context, reqURL string, maxBytes int64
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
// Read up to maxBytes+1 to detect overflow.
limited := io.LimitReader(resp.Body, maxBytes+1)
// Clamp to prevent integer overflow when maxBytes == math.MaxInt64.
limitBytes := maxBytes + 1
if limitBytes <= 0 {
limitBytes = math.MaxInt64
}
limited := io.LimitReader(resp.Body, limitBytes)
body, err := io.ReadAll(limited)
resp.Body.Close()
if err != nil {