feat(github): implement PRReader + FileReader client (#80) #93
@@ -26,6 +26,10 @@ const (
 // APIError represents an HTTP error response from the GitHub API.
 // It carries the status code so callers can distinguish between
 // different failure modes (e.g. 404 vs 500).
+//
+// Note: Error() includes up to 200 bytes of the response body for debugging.
+// Callers should avoid logging raw error messages in production if the upstream
+// server may return sensitive details in error responses.
 type APIError struct {
**[NIT]** APIError.Error includes up to 200 bytes of the response body. If callers log errors, this could surface sensitive server details. Consider reducing or masking returned body content, or clearly documenting that callers should avoid logging raw error messages in production.
 	StatusCode int
 	Body string
@@ -97,6 +101,21 @@ type Client struct {
 	retryBackoff []time.Duration
gpt-review-bot
commented
**[NIT]** The parameter name 'url' in doRequest shadows the conceptual URL type used elsewhere (e.g., net/url in other files). Renaming to 'reqURL' could improve clarity, though this is purely stylistic.
 }

+// defaultCheckRedirect is the redirect policy used by NewClient and SetHTTPClient(nil).
sonnet-review-bot
commented
**[NIT]** The doc comment for `Client` says `SetHTTPClient` and `SetRetryBackoff` must not be called concurrently with requests, but these are public methods and there's no enforcement or `noCopy` guard. Given the concurrent-use note, a brief comment in `SetHTTPClient` and `SetRetryBackoff` reiterating the constraint would help (though this matches the stdlib's `tls.Config` immutable-after-use convention, so it's acceptable as-is).
+// It strips the Authorization header on cross-host redirects or protocol downgrades
+// (HTTPS→HTTP) to prevent credential leakage, while still following the redirect.
+func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
**[MINOR]** The CheckRedirect handler allows following cross-host and HTTPS→HTTP redirects (while stripping Authorization). Although the token isn’t leaked, following cross-host redirects may contact untrusted hosts. Consider restricting redirects to same-host and HTTPS-only or fail on cross-host redirects to reduce SSRF-style risks.
+	if len(via) >= 10 {
+		return fmt.Errorf("stopped after 10 redirects")
**[MINOR]** defaultCheckRedirect allows cross-host redirects (with Authorization stripped). Although token leakage is mitigated, following cross-host redirects can facilitate SSRF-like behavior if baseURL is misconfigured or points to a compromised server. Consider rejecting cross-host redirects by default or enforcing an allowlist of trusted hosts.
+	}
gpt-review-bot
commented
**[MINOR]** Consider setting a User-Agent header on all requests. GitHub recommends identifying clients, and some enterprise installations enforce it. Add req.Header.Set("User-Agent", "review-bot/1.0") or similar.
+	// Strip Authorization on cross-host redirect or protocol downgrade (https→http).
+	prev := via[len(via)-1]
gpt-review-bot
commented
**[MINOR]** SetHTTPClient allows setting a nil *http.Client; subsequent use (c.httpClient.Do) would panic. Either guard against nil (return error or restore default client) or document that nil is invalid.
+	if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
+		req.Header.Del("Authorization")
+	}
**[MINOR]** defaultCheckRedirect follows HTTPS→HTTP redirects after stripping Authorization. While credentials are protected, this still permits plaintext requests to proceed, which can leak metadata and expands attack surface if a misconfigured or compromised server issues such redirects. Prefer failing closed on protocol downgrades.
+	return nil
gpt-review-bot
commented
**[MINOR]** Authorization header is always set to "Bearer "+token even when token is empty. Consider only setting the header when token is non-empty to avoid sending an empty bearer token on unauthenticated requests.
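A minimal sketch of the conditional header suggested above. The names `applyAuth` and `authHeaderFor` are hypothetical and do not appear in the PR; the `Bearer` scheme is taken from the comment, and the request URL is only a placeholder.

```go
package main

import (
	"net/http"
)

// applyAuth sets the Authorization header only when a token is configured.
// Hypothetical helper for illustration; not the PR's doRequest code.
func applyAuth(req *http.Request, token string) {
	if token == "" {
		return // unauthenticated request: send no Authorization header at all
	}
	req.Header.Set("Authorization", "Bearer "+token)
}

// authHeaderFor builds a throwaway request and reports the header value
// applyAuth would send. It exists only to make the behavior easy to inspect.
func authHeaderFor(token string) string {
	req, err := http.NewRequest(http.MethodGet, "https://api.github.com/", nil)
	if err != nil {
		panic(err)
	}
	applyAuth(req, token)
	return req.Header.Get("Authorization")
}
```

With this shape, an empty token yields a request with no Authorization header rather than a bare `Bearer ` value.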
gpt-review-bot
commented
**[NIT]** GitHub classic PATs typically use the "token" scheme while fine-grained tokens use "Bearer". If supporting both is desired, consider documenting or adapting the auth scheme based on token type.
sonnet-review-bot
commented
**[MINOR]** The doc comment on `defaultCheckRedirect` says it "strips the Authorization header on cross-host redirects or protocol downgrades (HTTPS→HTTP) to prevent credential leakage, while still following the redirect." However, a protocol downgrade from HTTPS to HTTP is a genuine security issue; stripping the header and still following is debatable. Consider returning an error on HTTPS→HTTP downgrade rather than silently following. This is a design choice that has security implications, not a bug per se, but worth flagging.
gpt-review-bot
commented
**[MINOR]** defaultCheckRedirect indexes via[len(via)-1] without guarding for len(via) == 0. net/http currently guarantees at least one prior request in via, but adding a len(via) check would make this more robust against misuse.
+}
+
sonnet-review-bot
commented
**[NIT]** The concurrency safety doc comment says "SetHTTPClient and SetRetryBackoff must not be called concurrently with requests" but these are exported methods with no enforcement (no mutex, no atomic). This is an acceptable design for a test-support method, but the comment could be stronger: e.g., "These methods are intended for test setup only and must be called before any goroutines issue requests."
 // NewClient creates a new GitHub API client.
 // If baseURL is empty, it defaults to https://api.github.com.
 // For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3).
gpt-review-bot
commented
**[NIT]** Parameter name "url" in doRequest shadows the common concept of URLs and can reduce readability given other files import net/url. Consider renaming the parameter to reqURL for clarity.
**[MINOR]** Redirects to different hosts or to HTTP are allowed (Authorization is stripped), which can lead to consuming responses from untrusted or downgraded endpoints. While token leakage is prevented, consider rejecting cross-host redirects and HTTPS→HTTP downgrades entirely to avoid integrity/confidentiality risks.
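For comparison, a fail-closed policy along the lines suggested in the redirect comments could look roughly like the sketch below. `redirectAllowed`, `strictCheckRedirect`, and `mustParseURL` are hypothetical names, not part of the PR, and whether rejecting every cross-host redirect suits GitHub Enterprise deployments is an open assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// redirectAllowed returns an error when following next from prev would
// leave HTTPS or change host. Hypothetical helper; not the PR's policy.
func redirectAllowed(prev, next *url.URL) error {
	if prev.Scheme == "https" && next.Scheme != "https" {
		return fmt.Errorf("refusing redirect from https to %q", next.Scheme)
	}
	if !strings.EqualFold(prev.Host, next.Host) {
		return fmt.Errorf("refusing cross-host redirect to %q", next.Host)
	}
	return nil
}

// strictCheckRedirect has the signature expected by http.Client.CheckRedirect.
func strictCheckRedirect(req *http.Request, via []*http.Request) error {
	if len(via) == 0 {
		return nil // defensive guard: nothing to compare against
	}
	if len(via) >= 10 {
		return errors.New("stopped after 10 redirects")
	}
	return redirectAllowed(via[len(via)-1].URL, req.URL)
}

// mustParseURL is a small convenience for trying redirectAllowed by hand.
func mustParseURL(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}
```

Returning a non-nil error from CheckRedirect makes the client stop instead of issuing the next request, so no plaintext or cross-host request is sent. The trade-off is that legitimate redirects to another host would fail unless an allowlist is added.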
@@ -115,18 +134,8 @@ func NewClient(token, baseURL string, opts ...ClientOption) *Client {
 		allowInsecureHTTP: cfg.allowInsecureHTTP,
 		token: token,
 		httpClient: &http.Client{
gpt-review-bot
commented
**[MINOR]** Retry-After is parsed only as delta-seconds; per RFC 7231 it may also be an HTTP-date. Consider falling back to parsing an HTTP-date when Atoi fails to honor server guidance more robustly.
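The fallback described above can be sketched as follows. `parseRetryAfter` and `exampleNow` are hypothetical names; passing the current time as a parameter is an assumption made here so the function is deterministic to test, and the PR's retry loop may be structured differently.

```go
package main

import (
	"net/http"
	"strconv"
	"strings"
	"time"
)

// parseRetryAfter interprets a Retry-After header value as either
// delta-seconds or an HTTP-date, relative to now. The bool result is false
// when the value is missing or unparseable. Hypothetical helper.
func parseRetryAfter(value string, now time.Time) (time.Duration, bool) {
	value = strings.TrimSpace(value)
	if value == "" {
		return 0, false
	}
	// Form 1: a non-negative number of seconds.
	if secs, err := strconv.Atoi(value); err == nil {
		if secs < 0 {
			return 0, false
		}
		return time.Duration(secs) * time.Second, true
	}
	// Form 2: an HTTP-date, parsed with the standard library helper.
	if retryAt, err := http.ParseTime(value); err == nil {
		wait := retryAt.Sub(now)
		if wait < 0 {
			wait = 0 // the date is already in the past: retry immediately
		}
		return wait, true
	}
	return 0, false
}

// exampleNow is a fixed reference time used only for illustration.
var exampleNow = time.Date(2015, time.October, 21, 7, 28, 0, 0, time.UTC)
```

A caller would typically cap the returned duration before sleeping, since a server-supplied date can be arbitrarily far in the future.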
 			Timeout: 30 * time.Second,
gpt-review-bot
commented
**[MINOR]** Non-HTTPS rejection checks strings.HasPrefix(url, "https://"). This is case-sensitive and string-based; safer to parse with net/url and check URL.Scheme case-insensitively (e.g., u.Scheme == "https").
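A small sketch of the parsed-scheme check suggested above. `isHTTPS` is a hypothetical name, and the extra requirement of a non-empty host is an assumption added here. Because `url.Parse` lowercases the scheme, a plain comparison against `"https"` is already case-insensitive.

```go
package main

import (
	"net/url"
)

// isHTTPS reports whether raw parses as an absolute URL with the https
// scheme and a host. Hypothetical helper; not the PR's check.
func isHTTPS(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	// url.Parse normalizes the scheme to lower case, so "HTTPS://..." passes.
	return u.Scheme == "https" && u.Host != ""
}
```

Compared with a prefix test, this accepts `HTTPS://api.github.com` and rejects strings such as `https://` that have the right prefix but no host.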
-			CheckRedirect: func(req *http.Request, via []*http.Request) error {
-				if len(via) >= 10 {
-					return fmt.Errorf("stopped after 10 redirects")
-				}
-				// Strip Authorization on cross-host redirect or protocol downgrade (https→http).
-				prev := via[len(via)-1]
-				if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
-					req.Header.Del("Authorization")
-				}
-				return nil
-			},
+			CheckRedirect: defaultCheckRedirect,
 		},
 	}
 }
@@ -138,17 +147,8 @@ func NewClient(token, baseURL string, opts ...ClientOption) *Client {
 func (c *Client) SetHTTPClient(hc *http.Client) {
 	if hc == nil {
gpt-review-bot
commented
**[MINOR]** Authorization header always uses the "Bearer" scheme. Some GitHub token types (classic PAT) historically use the "token" scheme. Consider making the auth scheme configurable or auto-detectable to maximize compatibility.
 		hc = &http.Client{
**[MINOR]** Authorization header is sent to whatever baseURL is configured. If baseURL can be influenced by untrusted input, this could leak tokens to an attacker-controlled host (SSRF/token exfiltration). Ensure baseURL is treated as trusted configuration and consider allowlisting expected hosts at a higher layer.
 			Timeout: 30 * time.Second,
sonnet-review-bot
commented
**[MINOR]** The CheckRedirect lambda is duplicated verbatim in both NewClient and SetHTTPClient(nil). Extract it to a package-level function (e.g., `defaultCheckRedirect`) to eliminate the duplication and ensure both code paths stay in sync when the policy changes.
-			CheckRedirect: func(req *http.Request, via []*http.Request) error {
+			CheckRedirect: defaultCheckRedirect,
gpt-review-bot
commented
**[NIT]** Retry-After parsing only handles delta-seconds via Atoi. RFC 7231 allows an HTTP-date format; optionally support parsing HTTP-date to fully respect server guidance.
-				if len(via) >= 10 {
-					return fmt.Errorf("stopped after 10 redirects")
-				}
-				prev := via[len(via)-1]
-				if req.URL.Host != prev.URL.Host || (prev.URL.Scheme == "https" && req.URL.Scheme == "http") {
-					req.Header.Del("Authorization")
-				}
-				return nil
-			},
 		}
 	}
sonnet-review-bot
commented
**[MINOR]** When `c.http.Do(req)` returns an error (network failure, context cancellation), the function returns immediately without retrying. For transient network errors, a retry could be valuable. The current behavior is reasonable for the stated scope (only retry on 429), but the comment says 'It respects the Retry-After header when present' without mentioning the no-retry-on-transport-error behavior. This is a documentation gap rather than a bug.
 	c.httpClient = hc
@@ -235,7 +235,7 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
 	if err != nil {
sonnet-review-bot
commented
**[MINOR]** The local variable `t` in `} else if t, err := http.ParseTime(ra); err == nil {` does not shadow anything, since this function has no `t` parameter. However, using `t` as a local variable for a `time.Time` is confusing since `t` conventionally means `*testing.T` in Go. Consider renaming to `parsedTime` or `retryAt` for clarity.
 		return nil, fmt.Errorf("read response body: %w", err)
sonnet-review-bot
commented
**[MINOR]** After `c.httpClient.Do(req)` returns an error, the function returns immediately with a wrapped transport error. The body is not closed because `resp` is nil on transport error. This is correct, but it deserves a comment for future readers: since `handleResponse` uses defer for body close, it's not obvious why there's no defer here.
 	}
-	if int64(len(body)) >= maxResponseBytes {
+	if len(body) >= maxResponseBytes {
 		return nil, fmt.Errorf("response body exceeded %d bytes (truncated)", maxResponseBytes)
 	}
 	return body, nil
@@ -48,6 +48,9 @@ func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]
 		if err2 := json.Unmarshal(body, &single); err2 != nil {
 			return nil, fmt.Errorf("parse contents JSON: %w", err2)
 		}
security-review-bot marked this conversation as resolved
Outdated
**[MINOR]** escapePath preserves dot-segments ("." and "..") in path components. Some servers normalize dot-segments, which could let an untrusted path escape the intended "contents" endpoint (e.g., "/contents/../../pulls"). While this stays on the same host, it could lead to unexpected behavior. Consider rejecting or encoding dot-segments to prevent path traversal in URL construction.
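One way to reject dot-segments before the path is escaped, as a rough sketch. `validateContentsPath` is a hypothetical name and is not the PR's `escapePath`; it only inspects literal `.` and `..` segments and does not consider percent-encoded forms.

```go
package main

import (
	"fmt"
	"strings"
)

// validateContentsPath returns an error if any slash-separated segment of p
// is "." or "..". Hypothetical helper for illustration.
func validateContentsPath(p string) error {
	for _, seg := range strings.Split(p, "/") {
		if seg == "." || seg == ".." {
			return fmt.Errorf("path %q contains dot-segment %q", p, seg)
		}
	}
	return nil
}
```

Names that merely contain dots, such as `..b` or `.github`, are not dot-segments and pass the check.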
+		if single.Name == "" && single.Path == "" && single.Type == "" {
sonnet-review-bot
commented
**[MINOR]** Silent empty-array handling: when the GitHub Contents API returns an empty JSON array `[]`, `json.Unmarshal` into `[]entry` succeeds with `len(entries) == 0` and the function returns an empty slice without error. This is probably the correct behavior for an empty directory, but it's undocumented and could silently mask an unexpected server response. A comment clarifying this intent would be valuable.
+			return nil, fmt.Errorf("parse contents JSON: unexpected response format")
+		}
 		entries = []entry{single}
 	}

sonnet-review-bot
commented
**[MINOR]** The `ListContents` fallback from array→object parse uses `err2` for the object parse error but discards the original array parse error `err`. If both fail (e.g., the response is valid JSON but neither an array nor an object matching `entry`), only the object-unmarshal error is returned, which may be less informative. This is a minor diagnostics issue.
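A sketch of how both parse errors could be preserved, assuming Go 1.20+ for `errors.Join`. The `entry` fields shown are only the three referenced in the diff, their JSON tags are assumptions, and `parseContents` is a hypothetical standalone function rather than the PR's `ListContents`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// entry holds only the fields referenced in the diff; the tags are assumed.
type entry struct {
	Name string `json:"name"`
	Path string `json:"path"`
	Type string `json:"type"`
}

// parseContents tries the array form first, then the single-object form.
// If both fail, the returned error wraps both causes.
func parseContents(body []byte) ([]entry, error) {
	var entries []entry
	errArray := json.Unmarshal(body, &entries)
	if errArray == nil {
		// An empty array is treated as an empty directory, not an error.
		return entries, nil
	}
	var single entry
	if errObject := json.Unmarshal(body, &single); errObject != nil {
		return nil, fmt.Errorf("parse contents JSON: %w", errors.Join(errArray, errObject))
	}
	if single.Name == "" && single.Path == "" && single.Type == "" {
		return nil, errors.New("parse contents JSON: unexpected response format")
	}
	return []entry{single}, nil
}
```

The joined error still works with `errors.Is` and `errors.As` against either cause, at the cost of a longer message.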
@@ -528,13 +528,13 @@ func TestGetCommitStatuses_CheckRunConclusions(t *testing.T) {
 		status string
 		want string
 	}{
-		{strPtr("success"), "completed", "success"},
-		{strPtr("failure"), "completed", "failure"},
-		{strPtr("action_required"), "completed", "failure"},
-		{strPtr("timed_out"), "completed", "failure"},
-		{strPtr("cancelled"), "completed", "success"},
-		{strPtr("skipped"), "completed", "success"},
-		{strPtr("neutral"), "completed", "success"},
+		{stringPtr("success"), "completed", "success"},
+		{stringPtr("failure"), "completed", "failure"},
+		{stringPtr("action_required"), "completed", "failure"},
+		{stringPtr("timed_out"), "completed", "failure"},
+		{stringPtr("cancelled"), "completed", "success"},
+		{stringPtr("skipped"), "completed", "success"},
+		{stringPtr("neutral"), "completed", "success"},
 		{nil, "in_progress", "pending"},
 		{nil, "queued", "pending"},
 	}
@@ -632,6 +632,6 @@ func TestGetCommitStatuses_MalformedJSON(t *testing.T) {
 	}
sonnet-review-bot
commented
**[NIT]** strPtr helper is defined in pr_test.go but files_test.go is in the same package and could theoretically need it. Both are in `package github` (white-box tests), so there's no duplication issue here: only pr_test.go defines strPtr. Fine as-is.
sonnet-review-bot
commented
**[NIT]** `strPtr` is defined in `pr_test.go` but `files_test.go` and `client_test.go` are in the same package. If any other test file ever needs this helper, it will collide. It's already fine since both are in `package github`, but naming it something more descriptive (e.g., `stringPtr`) would match the codebase's convention of meaningful names over abbreviations per the style conventions.
sonnet-review-bot
commented
**[NIT]** `stringPtr` helper is defined in `pr_test.go`. If future tests in `files_test.go` or `client_test.go` need similar helpers, there could be duplication. Consider moving to a shared test helper file, though for now it's fine since it's only used in `pr_test.go`.
 }

sonnet-review-bot
commented
**[NIT]** `strPtr` helper is defined in `pr_test.go` but `files_test.go` is in the same package (`package github`). If `files_test.go` ever needs this helper, there would be a duplicate definition. Consider putting shared test helpers in a `helpers_test.go` file. Not a current problem since `files_test.go` doesn't use it, but worth considering for consistency.
-func strPtr(s string) *string {
+func stringPtr(s string) *string {
 	return &s
 }
[NIT] APIError.Error includes up to 200 bytes of the response body. If callers log errors verbatim, this could leak server-provided details (e.g., repository names). Consider further redaction or requiring callers to log status codes without bodies in production.
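A sketch of how a caller might log only the status code, as the comment above suggests. The `APIError` fields match the diff, but the `Error` message format, the `maxErrorBodyBytes` constant, and the `logSafeMessage` helper are assumptions made for illustration; only the 200-byte limit comes from the doc comment.

```go
package main

import (
	"errors"
	"fmt"
)

// APIError mirrors the two fields visible in the diff. The Error format
// below is assumed; the PR's actual message may differ.
type APIError struct {
	StatusCode int
	Body       string
}

// maxErrorBodyBytes is the limit stated in the APIError doc comment.
const maxErrorBodyBytes = 200

func (e *APIError) Error() string {
	body := e.Body
	if len(body) > maxErrorBodyBytes {
		body = body[:maxErrorBodyBytes] // byte-based cut; may split a multi-byte rune
	}
	return fmt.Sprintf("github API error: status %d: %s", e.StatusCode, body)
}

// logSafeMessage returns a message suitable for production logs: for an
// APIError anywhere in the error chain it reports the status code only and
// omits the response body. Hypothetical helper.
func logSafeMessage(err error) string {
	var apiErr *APIError
	if errors.As(err, &apiErr) {
		return fmt.Sprintf("github API error: status %d", apiErr.StatusCode)
	}
	return err.Error()
}
```

Callers that need the body for debugging can still reach it through `errors.As` and the `Body` field, while default log lines stay free of server-provided text.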