feat(github): implement PRReader + FileReader client (#80) #93

Closed
rodin wants to merge 16 commits from review-bot-issue-80 into feature/github-support
3 changed files with 35 additions and 32 deletions
Showing only changes of commit 491df7cb1f - Show all commits
+24 -24
View File
2
@@ -26,6 +26,10 @@ const (
// APIError represents an HTTP error response from the GitHub API.
// It carries the status code so callers can distinguish between
// different failure modes (e.g. 404 vs 500).
//
// Note: Error() includes up to 200 bytes of the response body for debugging.
// Callers should avoid logging raw error messages in production if the upstream
Outdated
Review

[NIT] APIError.Error includes up to 200 bytes of the response body. If callers log errors verbatim, this could leak server-provided details (e.g., repository names). Consider further redaction or requiring callers to log status codes without bodies in production.

**[NIT]** APIError.Error includes up to 200 bytes of the response body. If callers log errors verbatim, this could leak server-provided details (e.g., repository names). Consider further redaction or requiring callers to log status codes without bodies in production.
// server may return sensitive details in error responses.
type APIError struct {
Outdated
Review

[NIT] APIError.Error includes up to 200 bytes of the response body. If callers log errors, this could surface sensitive server details. Consider reducing or masking returned body content, or clearly documenting that callers should avoid logging raw error messages in production.

**[NIT]** APIError.Error includes up to 200 bytes of the response body. If callers log errors, this could surface sensitive server details. Consider reducing or masking returned body content, or clearly documenting that callers should avoid logging raw error messages in production.
StatusCode int
Body string
27
@@ -97,6 +101,21 @@ type Client struct {
retryBackoff []time.Duration
Outdated
Review

[NIT] The parameter name 'url' in doRequest shadows the conceptual URL type used elsewhere (e.g., net/url in other files). Renaming to 'reqURL' could improve clarity, though this is purely stylistic.

**[NIT]** The parameter name 'url' in doRequest shadows the conceptual URL type used elsewhere (e.g., net/url in other files). Renaming to 'reqURL' could improve clarity, though this is purely stylistic.
}
// defaultCheckRedirect is the redirect policy used by NewClient and SetHTTPClient(nil).
Outdated
Review

[NIT] The doc comment for Client says SetHTTPClient and SetRetryBackoff must not be called concurrently with requests, but these are public methods and there's no enforcement or noCopy guard. Given the concurrent-use note, a brief comment in SetHTTPClient and SetRetryBackoff reiterating the constraint would help (though this matches the stdlib's tls.Config immutable-after-use convention, so it's acceptable as-is).

**[NIT]** The doc comment for `Client` says `SetHTTPClient` and `SetRetryBackoff` must not be called concurrently with requests, but these are public methods and there's no enforcement or `noCopy` guard. Given the concurrent-use note, a brief comment in `SetHTTPClient` and `SetRetryBackoff` reiterating the constraint would help (though this matches the stdlib's `tls.Config` immutable-after-use convention, so it's acceptable as-is).
// It strips the Authorization header on cross-host redirects or protocol downgrades
// (HTTPS→HTTP) to prevent credential leakage, while still following the redirect.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
Outdated
Review

[MINOR] The CheckRedirect handler allows following cross-host and HTTPS→HTTP redirects (while stripping Authorization). Although the token isn’t leaked, following cross-host redirects may contact untrusted hosts. Consider restricting redirects to same-host and HTTPS-only or fail on cross-host redirects to reduce SSRF-style risks.

**[MINOR]** The CheckRedirect handler allows following cross-host and HTTPS→HTTP redirects (while stripping Authorization). Although the token isn’t leaked, following cross-host redirects may contact untrusted hosts. Consider restricting redirects to same-host and HTTPS-only or fail on cross-host redirects to reduce SSRF-style risks.
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
Review

[MINOR] defaultCheckRedirect allows cross-host redirects (with Authorization stripped). Although token leakage is mitigated, following cross-host redirects can facilitate SSRF-like behavior if baseURL is misconfigured or points to a compromised server. Consider rejecting cross-host redirects by default or enforcing an allowlist of trusted hosts.

**[MINOR]** defaultCheckRedirect allows cross-host redirects (with Authorization stripped). Although token leakage is mitigated, following cross-host redirects can facilitate SSRF-like behavior if baseURL is misconfigured or points to a compromised server. Consider rejecting cross-host redirects by default or enforcing an allowlist of trusted hosts.
}
Outdated
Review

[MINOR] Consider setting a User-Agent header on all requests. GitHub recommends identifying clients, and some enterprise installations enforce it. Add req.Header.Set("User-Agent", "review-bot/1.0") or similar.

**[MINOR]** Consider setting a User-Agent header on all requests. GitHub recommends identifying clients, and some enterprise installations enforce it. Add req.Header.Set("User-Agent", "review-bot/1.0") or similar.
// Strip Authorization on cross-host redirect or protocol downgrade (https→http).
prev := via[len(via)-1]
Outdated
Review

[MINOR] SetHTTPClient allows setting a nil *http.Client; subsequent use (c.httpClient.Do) would panic. Either guard against nil (return error or restore default client) or document that nil is invalid.

**[MINOR]** SetHTTPClient allows setting a nil *http.Client; subsequent use (c.httpClient.Do) would panic. Either guard against nil (return error or restore default client) or document that nil is invalid.
if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
req.Header.Del("Authorization")
}
Review

[MINOR] defaultCheckRedirect follows HTTPS→HTTP redirects after stripping Authorization. While credentials are protected, this still permits plaintext requests to proceed, which can leak metadata and expands attack surface if a misconfigured or compromised server issues such redirects. Prefer failing closed on protocol downgrades.

**[MINOR]** defaultCheckRedirect follows HTTPS→HTTP redirects after stripping Authorization. While credentials are protected, this still permits plaintext requests to proceed, which can leak metadata and expands attack surface if a misconfigured or compromised server issues such redirects. Prefer failing closed on protocol downgrades.
return nil
Outdated
Review

[MINOR] Authorization header is always set to "Bearer "+token even when token is empty. Consider only setting the header when token is non-empty to avoid sending an empty bearer token on unauthenticated requests.

**[MINOR]** Authorization header is always set to "Bearer "+token even when token is empty. Consider only setting the header when token is non-empty to avoid sending an empty bearer token on unauthenticated requests.
Outdated
Review

[NIT] GitHub classic PATs typically use the "token" scheme while fine-grained tokens use "Bearer". If supporting both is desired, consider documenting or adapting the auth scheme based on token type.

**[NIT]** GitHub classic PATs typically use the "token" scheme while fine-grained tokens use "Bearer". If supporting both is desired, consider documenting or adapting the auth scheme based on token type.
Review

[MINOR] The doc comment on defaultCheckRedirect says it "strips the Authorization header on cross-host redirects or protocol downgrades (HTTPS→HTTP) to prevent credential leakage, while still following the redirect." However, a protocol downgrade from HTTPS to HTTP is a genuine security issue — stripping the header and still following is debatable. Consider returning an error on HTTPS→HTTP downgrade rather than silently following. This is a design choice that has security implications, not a bug per se, but worth flagging.

**[MINOR]** The doc comment on `defaultCheckRedirect` says it "strips the Authorization header on cross-host redirects or protocol downgrades (HTTPS→HTTP) to prevent credential leakage, while still following the redirect." However, a protocol downgrade from HTTPS to HTTP is a genuine security issue — stripping the header and still following is debatable. Consider returning an error on HTTPS→HTTP downgrade rather than silently following. This is a design choice that has security implications, not a bug per se, but worth flagging.
Review

[MINOR] defaultCheckRedirect indexes via[len(via)-1] without guarding for len(via) == 0. net/http currently guarantees at least one prior request in via, but adding a len(via) check would make this more robust against misuse.

**[MINOR]** defaultCheckRedirect indexes via[len(via)-1] without guarding for len(via) == 0. net/http currently guarantees at least one prior request in via, but adding a len(via) check would make this more robust against misuse.
}
Outdated
Review

[NIT] The concurrency safety doc comment says "SetHTTPClient and SetRetryBackoff must not be called concurrently with requests" but these are exported methods with no enforcement (no mutex, no atomic). This is an acceptable design for a test-support method, but the comment could be stronger: e.g., "These methods are intended for test setup only and must be called before any goroutines issue requests."

**[NIT]** The concurrency safety doc comment says "SetHTTPClient and SetRetryBackoff must not be called concurrently with requests" but these are exported methods with no enforcement (no mutex, no atomic). This is an acceptable design for a test-support method, but the comment could be stronger: e.g., "These methods are intended for test setup only and must be called before any goroutines issue requests."
// NewClient creates a new GitHub API client.
// If baseURL is empty, it defaults to https://api.github.com.
// For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3).
Outdated
Review

[NIT] Parameter name "url" in doRequest shadows the common concept of URLs and can reduce readability given other files import net/url. Consider renaming the parameter to reqURL for clarity.

**[NIT]** Parameter name "url" in doRequest shadows the common concept of URLs and can reduce readability given other files import net/url. Consider renaming the parameter to reqURL for clarity.
Outdated
Review

[MINOR] Redirects to different hosts or to HTTP are allowed (Authorization is stripped), which can lead to consuming responses from untrusted or downgraded endpoints. While token leakage is prevented, consider rejecting cross-host redirects and HTTPS→HTTP downgrades entirely to avoid integrity/confidentiality risks.

**[MINOR]** Redirects to different hosts or to HTTP are allowed (Authorization is stripped), which can lead to consuming responses from untrusted or downgraded endpoints. While token leakage is prevented, consider rejecting cross-host redirects and HTTPS→HTTP downgrades entirely to avoid integrity/confidentiality risks.
6
@@ -115,18 +134,8 @@ func NewClient(token, baseURL string, opts ...ClientOption) *Client {
allowInsecureHTTP: cfg.allowInsecureHTTP,
token: token,
httpClient: &http.Client{
Outdated
Review

[MINOR] Retry-After is parsed only as delta-seconds; per RFC 7231 it may also be an HTTP-date. Consider falling back to parsing an HTTP-date when Atoi fails to honor server guidance more robustly.

**[MINOR]** Retry-After is parsed only as delta-seconds; per RFC 7231 it may also be an HTTP-date. Consider falling back to parsing an HTTP-date when Atoi fails to honor server guidance more robustly.
Timeout: 30 * time.Second,
CheckRedirect: func(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Strip Authorization on cross-host redirect or protocol downgrade (https→http).
prev := via[len(via)-1]
if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
req.Header.Del("Authorization")
}
return nil
},
Timeout: 30 * time.Second,
Outdated
Review

[MINOR] Non-HTTPS rejection checks strings.HasPrefix(url, "https://"). This is case-sensitive and string-based; safer to parse with net/url and check URL.Scheme case-insensitively (e.g., u.Scheme == "https").

**[MINOR]** Non-HTTPS rejection checks strings.HasPrefix(url, "https://"). This is case-sensitive and string-based; safer to parse with net/url and check URL.Scheme case-insensitively (e.g., u.Scheme == "https").
CheckRedirect: defaultCheckRedirect,
},
}
}
3
@@ -138,17 +147,8 @@ func NewClient(token, baseURL string, opts ...ClientOption) *Client {
func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
Outdated
Review

[MINOR] Authorization header always uses the "Bearer" scheme. Some GitHub token types (classic PAT) historically use the "token" scheme. Consider making the auth scheme configurable or auto-detectable to maximize compatibility.

**[MINOR]** Authorization header always uses the "Bearer" scheme. Some GitHub token types (classic PAT) historically use the "token" scheme. Consider making the auth scheme configurable or auto-detectable to maximize compatibility.
hc = &http.Client{
Outdated
Review

[MINOR] Authorization header is sent to whatever baseURL is configured. If baseURL can be influenced by untrusted input, this could leak tokens to an attacker-controlled host (SSRF/token exfiltration). Ensure baseURL is treated as trusted configuration and consider allowlisting expected hosts at a higher layer.

**[MINOR]** Authorization header is sent to whatever baseURL is configured. If baseURL can be influenced by untrusted input, this could leak tokens to an attacker-controlled host (SSRF/token exfiltration). Ensure baseURL is treated as trusted configuration and consider allowlisting expected hosts at a higher layer.
Timeout: 30 * time.Second,
CheckRedirect: func(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
prev := via[len(via)-1]
if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
req.Header.Del("Authorization")
}
return nil
},
Timeout: 30 * time.Second,
Outdated
Review

[MINOR] The CheckRedirect lambda is duplicated verbatim in both NewClient and SetHTTPClient(nil). Extract it to a package-level function (e.g., defaultCheckRedirect) to eliminate the duplication and ensure both code paths stay in sync when the policy changes.

**[MINOR]** The CheckRedirect lambda is duplicated verbatim in both NewClient and SetHTTPClient(nil). Extract it to a package-level function (e.g., `defaultCheckRedirect`) to eliminate the duplication and ensure both code paths stay in sync when the policy changes.
CheckRedirect: defaultCheckRedirect,
Outdated
Review

[NIT] Retry-After parsing only handles delta-seconds via Atoi. RFC 7231 allows an HTTP-date format; optionally support parsing HTTP-date to fully respect server guidance.

**[NIT]** Retry-After parsing only handles delta-seconds via Atoi. RFC 7231 allows an HTTP-date format; optionally support parsing HTTP-date to fully respect server guidance.
}
}
Outdated
Review

[MINOR] When c.http.Do(req) returns an error (network failure, context cancellation), the function returns immediately without retrying. For transient network errors, a retry could be valuable. The current behavior is reasonable for the stated scope (only retry on 429), but the comment says 'It respects the Retry-After header when present' without mentioning the no-retry-on-transport-error behavior. This is a documentation gap rather than a bug.

**[MINOR]** When `c.http.Do(req)` returns an error (network failure, context cancellation), the function returns immediately without retrying. For transient network errors, a retry could be valuable. The current behavior is reasonable for the stated scope (only retry on 429), but the comment says 'It respects the Retry-After header when present' without mentioning the no-retry-on-transport-error behavior. This is a documentation gap rather than a bug.
c.httpClient = hc
29
@@ -235,7 +235,7 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
if err != nil {
Outdated
Review

[MINOR] The shadow variable t on } else if t, err := http.ParseTime(ra); err == nil { shadows the outer t parameter name from the function signature — but this function has no t parameter, so there is no actual shadowing issue here. However, using t as a local variable for a time.Time is confusing since t conventionally means *testing.T in Go. Consider renaming to parsedTime or retryAt for clarity.

**[MINOR]** The shadow variable `t` on `} else if t, err := http.ParseTime(ra); err == nil {` shadows the outer `t` parameter name from the function signature — but this function has no `t` parameter, so there is no actual shadowing issue here. However, using `t` as a local variable for a `time.Time` is confusing since `t` conventionally means `*testing.T` in Go. Consider renaming to `parsedTime` or `retryAt` for clarity.
return nil, fmt.Errorf("read response body: %w", err)
Review

[MINOR] After c.httpClient.Do(req) returns an error, the function returns immediately with a wrapped transport error. However, the body is not closed because resp is nil on transport error — this is correct, but deserves a comment for future readers since handleResponse uses defer for body close, it's not obvious why there's no defer here.

**[MINOR]** After `c.httpClient.Do(req)` returns an error, the function returns immediately with a wrapped transport error. However, the body is not closed because `resp` is nil on transport error — this is correct, but deserves a comment for future readers since `handleResponse` uses defer for body close, it's not obvious why there's no defer here.
}
if int64(len(body)) >= maxResponseBytes {
if len(body) >= maxResponseBytes {
return nil, fmt.Errorf("response body exceeded %d bytes (truncated)", maxResponseBytes)
}
return body, nil
5
+3
View File
12
@@ -48,6 +48,9 @@ func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: %w", err2)
}
security-review-bot marked this conversation as resolved Outdated
Outdated
Review

[MINOR] escapePath preserves dot-segments ("." and "..") in path components. Some servers normalize dot-segments, which could let an untrusted path escape the intended "contents" endpoint (e.g., "/contents/../../pulls"). While this stays on the same host, it could lead to unexpected behavior. Consider rejecting or encoding dot-segments to prevent path traversal in URL construction.

**[MINOR]** escapePath preserves dot-segments ("." and "..") in path components. Some servers normalize dot-segments, which could let an untrusted path escape the intended "contents" endpoint (e.g., "/contents/../../pulls"). While this stays on the same host, it could lead to unexpected behavior. Consider rejecting or encoding dot-segments to prevent path traversal in URL construction.
if single.Name == "" && single.Path == "" && single.Type == "" {
Outdated
Review

[MINOR] Silent empty-array handling: when the GitHub Contents API returns an empty JSON array [], json.Unmarshal into []entry succeeds with len(entries) == 0 and the function returns an empty slice without error. This is probably the correct behavior for an empty directory, but it's undocumented and could silently mask an unexpected server response. A comment clarifying this intent would be valuable.

**[MINOR]** Silent empty-array handling: when the GitHub Contents API returns an empty JSON array `[]`, `json.Unmarshal` into `[]entry` succeeds with `len(entries) == 0` and the function returns an empty slice without error. This is probably the correct behavior for an empty directory, but it's undocumented and could silently mask an unexpected server response. A comment clarifying this intent would be valuable.
return nil, fmt.Errorf("parse contents JSON: unexpected response format")
}
entries = []entry{single}
}
Outdated
Review

[MINOR] The ListContents fallback from array→object parse uses err2 for the object parse error but discards the original array parse error err. If both fail (e.g., the response is valid JSON but neither an array nor an object matching entry), only the object-unmarshal error is returned, which may be less informative. This is a minor diagnostics issue.

**[MINOR]** The `ListContents` fallback from array→object parse uses `err2` for the object parse error but discards the original array parse error `err`. If both fail (e.g., the response is valid JSON but neither an array nor an object matching `entry`), only the object-unmarshal error is returned, which may be less informative. This is a minor diagnostics issue.
11
+8 -8
View File
2
@@ -528,13 +528,13 @@ func TestGetCommitStatuses_CheckRunConclusions(t *testing.T) {
status string
want string
}{
{strPtr("success"), "completed", "success"},
{strPtr("failure"), "completed", "failure"},
{strPtr("action_required"), "completed", "failure"},
{strPtr("timed_out"), "completed", "failure"},
{strPtr("cancelled"), "completed", "success"},
{strPtr("skipped"), "completed", "success"},
{strPtr("neutral"), "completed", "success"},
{stringPtr("success"), "completed", "success"},
{stringPtr("failure"), "completed", "failure"},
{stringPtr("action_required"), "completed", "failure"},
{stringPtr("timed_out"), "completed", "failure"},
{stringPtr("cancelled"), "completed", "success"},
{stringPtr("skipped"), "completed", "success"},
{stringPtr("neutral"), "completed", "success"},
{nil, "in_progress", "pending"},
{nil, "queued", "pending"},
}
3
@@ -632,6 +632,6 @@ func TestGetCommitStatuses_MalformedJSON(t *testing.T) {
}
Review

[NIT] strPtr helper is defined in pr_test.go but files_test.go is in the same package and could theoretically need it. Both are in package github (white-box tests), so there's no duplication issue here. However, there is a duplicate strPtr in neither file — only pr_test.go has it. Fine as-is.

**[NIT]** strPtr helper is defined in pr_test.go but files_test.go is in the same package and could theoretically need it. Both are in `package github` (white-box tests), so there's no duplication issue here. However, there is a duplicate strPtr in neither file — only pr_test.go has it. Fine as-is.
Review

[NIT] strPtr is defined in pr_test.go but files_test.go and client_test.go are in the same package. If any other test file ever needs this helper, it will collide. It's already fine since both are in package github, but naming it something more descriptive (e.g., stringPtr) would match the codebase's convention of meaningful names over abbreviations per the style conventions.

**[NIT]** `strPtr` is defined in `pr_test.go` but `files_test.go` and `client_test.go` are in the same package. If any other test file ever needs this helper, it will collide. It's already fine since both are in `package github`, but naming it something more descriptive (e.g., `stringPtr`) would match the codebase's convention of meaningful names over abbreviations per the style conventions.
Review

[NIT] stringPtr helper is defined in pr_test.go. If future tests in files_test.go or client_test.go need similar helpers, there could be duplication. Consider moving to a shared test helper file, though for now it's fine since it's only used in pr_test.go.

**[NIT]** `stringPtr` helper is defined in `pr_test.go`. If future tests in `files_test.go` or `client_test.go` need similar helpers, there could be duplication. Consider moving to a shared test helper file, though for now it's fine since it's only used in `pr_test.go`.
}
Review

[NIT] strPtr helper is defined in pr_test.go but files_test.go is in the same package (package github). If files_test.go ever needs this helper, there would be a duplicate definition. Consider putting shared test helpers in a helpers_test.go file. Not a current problem since files_test.go doesn't use it, but worth considering for consistency.

**[NIT]** `strPtr` helper is defined in `pr_test.go` but `files_test.go` is in the same package (`package github`). If `files_test.go` ever needs this helper, there would be a duplicate definition. Consider putting shared test helpers in a `helpers_test.go` file. Not a current problem since `files_test.go` doesn't use it, but worth considering for consistency.
func strPtr(s string) *string {
func stringPtr(s string) *string {
return &s
}