Compare commits

..

42 Commits

Author SHA1 Message Date
aweiker f8b9d7d282 fix: portable checksum on darwin, anchor grep pattern
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m4s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m13s
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
- sha256sum is not available on macOS; use shasum -a 256 on darwin.
  Select based on steps.version.outputs.os which is already computed.
  Fixes MAJOR finding from gpt-review-bot (PR #127 review).

- Anchor checksum grep with ^ to avoid matching on partial lines.
  Fixes MINOR finding from gpt-review-bot (PR #127 review).
2026-05-14 05:48:21 +00:00
aweiker 7a8fc166ec feat(action): derive binary name from uname for multi-arch support (#124)
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 26s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m12s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m33s
Previously the Install step hard-coded 'review-bot-linux-amd64'. This
fails on arm64 runners (Graviton, Apple Silicon) where uname -m returns
'aarch64' or 'arm64'.

Changes:
- Add OS/arch detection in 'Determine version' step using uname -s/-m
- Map uname output to asset name format: x86_64→amd64, aarch64/arm64→arm64,
  linux→linux, darwin→darwin
- Emit 'os' and 'arch' as step outputs alongside 'version'
- Update cache key: review-bot-{os}-{arch}-{version}
- Update Install step: BINARY derived from step outputs
- Anchor checksum grep to exact filename (not substring match)
- Unsupported OS or arch exits with a clear error message

Supported platforms: linux-amd64, linux-arm64, darwin-amd64, darwin-arm64
(matches what the release workflow builds)
2026-05-14 05:45:20 +00:00
aweiker 5e351b85f0 Merge pull request 'feat(github): add safeguards against accidental AllowInsecureHTTP use (#96)' (#113) from review-bot-issue-96 into main
CI / test (push) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #113
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
2026-05-13 20:21:42 +00:00
claw ab2a6c8aef Address review feedback on PR #113
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 22s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m33s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m47s
- Fix AllowInsecureHTTP doc comment: say '_test.go file in the same
  package' instead of 'export_test.go' (MINOR #1)
- Remove dead u.Fragment = "" from redactURL: HTTP requests never
  carry fragments over the wire per RFC 7230 §5.1 (MINOR #2)
- Use 127.0.0.1:1 in scheme-rejection tests to make intent clearer
  that no network call should occur (NIT #3)
2026-05-13 13:04:23 -07:00
claw 6b7f3f6924 fix: address NIT findings from sonnet review (#113)
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m7s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m32s
- Fix AllowInsecureHTTP doc comment: 'silently ignored' -> 'ignored' (it logs a warning)
- Remove unnecessary nil guard in redactURL (assigning nil to nil is a no-op)
- Add clarifying comment about acceptable double URL parse in doRequest
2026-05-13 12:48:02 -07:00
claw 4c032a3b53 fix: move AllowInsecureHTTPForTest to export_test.go, fix gofmt alignment
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m14s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m45s
2026-05-13 11:58:25 -07:00
claw 64c9d551ba fix: address review feedback — restore timer.Stop() and fix test spacing
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 24s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m14s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m8s
- Restore timer.Stop() no-op in case <-timer.C for symmetry with ctx.Done
- Add missing blank line between TestNoInsecureOption_RejectsHTTP and
  TestNoInsecureOption_RejectsUppercaseHTTP
- Remove double blank line before TestAllowInsecureHTTP_WithoutEnvVar_Rejected

Resolves review comments from sonnet-review-bot on PR #113.
2026-05-13 11:44:28 -07:00
claw db7b7e66bf fix: use case-insensitive HTTP scheme check and redact userinfo
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 56s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m53s
Address review feedback on PR #113:

- MAJOR (both reviews): Replace strings.HasPrefix(reqURL, "http://")
  with url.Parse + strings.EqualFold for case-insensitive scheme
  comparison per RFC 3986. Prevents bypass via HTTP:// or Http://.

- MINOR (security): Enhance redactURL to strip userinfo component
  (user:pass@host) in addition to query params, preventing credential
  leakage in error messages and logs.

- NIT (gpt): Remove redundant timer.Stop() after timer.C fires —
  it's a no-op and the comment was misleading.

- Add tests for uppercase/mixed-case HTTP scheme rejection and
  userinfo redaction.
2026-05-13 11:35:10 -07:00
claw 0232343126 feat(github): add safeguards against accidental AllowInsecureHTTP use in production
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 13s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 38s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m45s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m9s
Three-layer defense for the AllowInsecureHTTP client option:

1. Environment gate: AllowInsecureHTTP() requires REVIEW_BOT_ALLOW_INSECURE=1
   env var. Without it, the option is silently ignored with a slog.Warn.

2. Warning log on activation: When the env gate IS satisfied, a slog.Warn
   fires at client construction time so operators notice in production logs.

3. Test bypass: AllowInsecureHTTPForTest() skips the env gate entirely,
   keeping test code clean (no t.Setenv needed in every test).

Additionally, doRequest now rejects HTTP URLs unless allowInsecureHTTP is set
on the client, providing defense-in-depth against credential leakage.

Closes #96
2026-05-13 11:24:07 -07:00
aweiker b26514714f Merge pull request 'feat(gitea): pass commit_id explicitly in PostReview (#107)' (#112) from review-bot-issue-107 into main
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #112
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 18:07:38 +00:00
claw 028d46942a fix(gitea): update PostReview doc comment to include COMMENT event value
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 25s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 53s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m4s
2026-05-13 10:41:52 -07:00
claw e59c2bc831 feat(gitea): pass commit_id explicitly in PostReview (#107)
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 25s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 42s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 57s
Add commitID parameter to gitea.Client.PostReview so the review is
anchored to the specific commit that was evaluated. The caller
(cmd/review-bot) already computes evaluatedSHA from pr.Head.Sha;
this wires it through to the Gitea API payload.

When commitID is empty, omitempty drops it from the JSON and Gitea
defaults to the current PR head (backward-compatible).

Closes #107
2026-05-13 10:29:05 -07:00
aweiker dc2e1ca5de Merge pull request 'feat: reject cross-host redirects and HTTPS→HTTP downgrades (#95)' (#111) from review-bot-issue-95 into main
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #111
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 16:58:49 +00:00
claw 7de6fdd9ec fix: address review feedback on redirect policy
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 31s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 52s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m24s
- Replace Unicode arrows (→) with ASCII (->) in error messages and
  comments for log compatibility (gpt-review NITs #19626, #19628)
- Improve guard comment to clarify it exists for testability, not
  runtime safety (sonnet-review NIT #19619)
- Add cross-reference comments noting intentional duplication between
  gitea/client.go and github/client.go (sonnet-review #19618,
  gpt-review #19625, #19627)

Pushed back on:
- internal/ package for dedup: structural overhead not warranted for
  a single ~25-line function
- strings.EqualFold for scheme: Go's url.Parse normalizes schemes to
  lowercase, making case-insensitive comparison unnecessary
2026-05-13 09:28:46 -07:00
claw 1e0959b077 feat: reject cross-host redirects and HTTPS→HTTP downgrades (#95)
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 37s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m19s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m47s
Add defaultCheckRedirect to both GitHub and Gitea clients that rejects:
- HTTPS→HTTP protocol downgrades (prevents plaintext leakage)
- Cross-host redirects entirely (prevents consuming untrusted responses)

Same-host, same-or-upgraded-scheme redirects remain allowed.

Both NewClient constructors wire the policy, and SetHTTPClient(nil)
restores it. Callers providing a non-nil client are responsible for
configuring their own safe redirect policy.

Closes #95
2026-05-13 08:51:36 -07:00
aweiker 67c3db70cb Merge pull request 'feat(github): support HTTP-date format in Retry-After header' (#110) from review-bot-issue-94 into main
CI / test (push) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #110
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 15:34:25 +00:00
aweiker a845ce32eb Merge pull request 'feat(gitea): harden GetPullRequestDiff against unbounded diff size' (#109) from review-bot-issue-92 into main
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #109
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 14:01:22 +00:00
claw 49d6ca77a3 refactor(gitea): eliminate retry duplication, harden redactURL and MaxInt64 edge case
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m2s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m21s
- Extract doGetWithReader to share retry/backoff logic between doGet and
  doGetLimited, eliminating ~60 lines of duplicated code (addresses MINOR
  finding from all reviewers).
- redactURL now strips userinfo credentials (user:pass@host) in addition
  to query parameters (addresses security-review-bot finding).
- GetPullRequestDiff treats MaxDiffSize == math.MaxInt64 as disabled,
  preventing the silent enforcement bypass where the overflow clamp makes
  the size check unreachable (addresses security-review-bot finding).
- Improved error message wording: 'response exceeds N bytes' (NIT fix).
2026-05-13 13:39:37 +00:00
claw 6ebf66aefb fix(gitea): address review feedback on diff size limiting
- Add concurrency safety note to MaxDiffSize field documentation,
  mirroring the existing note on RetryBackoff
- Consolidate six individual test functions into a single table-driven
  test (TestGetPullRequestDiff_SizeLimits) reducing repetition
- Add //nolint:errcheck annotation to test handler w.Write calls
2026-05-13 13:39:37 +00:00
claw 004343d05f fix(gitea): address review findings — clamp overflow, clarify maxSize doc
- Clamp maxBytes+1 to prevent integer overflow to negative when
  maxBytes == math.MaxInt64 (falls back to math.MaxInt64)
- Update MaxDiffSize doc: 'any negative value' disables the limit,
  matching actual behavior of 'maxSize < 0' check
2026-05-13 13:39:37 +00:00
claw 92b84976cf feat(gitea): harden GetPullRequestDiff against unbounded diff size
Add a configurable MaxDiffSize field to Client that limits how much
data GetPullRequestDiff will read into memory. The default is 10 MB
(DefaultMaxDiffSize). When the diff exceeds the limit, ErrDiffTooLarge
is returned, allowing callers to skip position translation gracefully.

Implementation uses io.LimitReader to read maxBytes+1, detecting
overflow without buffering the entire response. Setting MaxDiffSize
to -1 disables the limit entirely.

Closes #92
2026-05-13 13:39:37 +00:00
claw 9f8e9aa8d3 fix: timer leak and http field shadowing in github client
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 46s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m57s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m17s
- Add timer.Stop() to the timer.C branch to prevent timer leak on the
  normal path (previously only called in ctx.Done branch)
- Rename struct field 'http' to 'httpClient' to avoid shadowing the
  net/http import

Addresses self-review findings on PR #110.
2026-05-13 06:22:30 -07:00
aweiker 881ce232eb Merge pull request 'docs(deps): update CONVENTIONS.md allowlist for go-yaml' (#108) from review-bot-issue-91 into main
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #108
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 13:16:40 +00:00
claw 31a28b1dd5 address review feedback: baseURL TODO, timer-cancel test, fast retry test, doc note
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 21s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 49s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m47s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m14s
- Add TODO comment on unused baseURL field explaining its intended
  future use by higher-level exported methods
- Rewrite TestDoRequest_ContextCanceled to actually exercise the
  timer-cancel path in the retry select (goroutine cancels context
  while client is blocked in timer.C select)
- Change Retry-After: 1 to Retry-After: 0 in IntegerSeconds test
  to avoid unnecessary 1s sleep during test runs
- Add doc note on SetRetryBackoff explaining that an empty non-nil
  slice silently drops Retry-After delays
2026-05-13 06:08:51 -07:00
claw e414471a16 fix(github): address review feedback on Retry-After implementation
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 23s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 43s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m21s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m22s
- Fix data race: copy retryBackoff slice per-request to prevent
  concurrent doRequest calls from racing on shared state
- Fix parseRetryAfter: trim whitespace before parsing for robustness
- Fix parseRetryAfter: treat delta-seconds of 0 as valid per RFC 7231
- Add bounded read on success path (10 MB limit) for defense-in-depth
- Clean up TestDoRequest_429_RetryAfter_IntegerSeconds: remove dead
  code block and commented-out reasoning
- Fix import ordering: context before errors (goimports compliance)
2026-05-13 05:54:06 -07:00
claw 41e1d48b54 feat(github): support HTTP-date format in Retry-After header
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m35s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m18s
Implement the github package client with Retry-After header parsing that
supports both integer seconds (e.g. "Retry-After: 120") and HTTP-date
format (e.g. "Retry-After: Thu, 01 Dec 2025 16:00:00 GMT") per RFC 7231
§7.1.3.

Key design decisions:
- Use http.ParseTime which handles RFC 1123, RFC 850, and ASCTIME formats
- Cap maximum retry delay at 60s (maxRetryAfter) to prevent stalling
- If HTTP-date is in the past, use delay of 0 (retry immediately)
- Inject time.Now via c.now field for deterministic testing

Closes #94
2026-05-13 05:10:52 -07:00
claw bf52fceea0 docs(deps): update CONVENTIONS.md allowlist for go-yaml
CI / test (pull_request) Successful in 20s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 20s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 48s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 51s
Update the approved dependency table to document go-yaml subpackage
usage (ast, parser) and remove the deviation comment now that the
proper allowlist process is being followed.

Closes #91
2026-05-13 02:56:06 -07:00
aweiker d722035629 Merge pull request 'fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml' (#89) from review-bot-issue-87 into main
CI / test (push) Successful in 20s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
Reviewed-on: #89
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 03:47:01 +00:00
claw b9b7be3b4e fix: address review #2888 findings (comment clarity, test cleanup)
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 37s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m0s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m11s
- Clarify depth-aware short-circuit comment to unambiguously describe
  the relationship between current depth and previous validation depth
- Add comment to MappingValueNode case explaining intentional depth+2
  behavior from parent MappingNode perspective
- Restructure unmarshalYAMLWithDepthLimit doc comment as bullet list
  covering all three safety checks (depth, multi-doc, strict fields)
- Replace t.Error with t.Fatal in TestYAMLEmptyFileRejection to remove
  redundant nil guard on subsequent err.Error() call
2026-05-12 19:06:52 -07:00
claw baa917f228 fix: handle MergeKeyNode explicitly in depth check, add size limit to ParsePersonaBytes
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 58s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m16s
- Add explicit case for *ast.MergeKeyNode in checkYAMLDepth switch to
  make it clear this is an intentional leaf (no children to recurse)
  rather than relying on the default case. Prevents future library
  changes from silently bypassing depth checks.

- Add MaxPersonaFileSize bound check at the top of ParsePersonaBytes.
  While callers already check size, the public API should defend itself
  (defense in depth) against arbitrarily large inputs that could cause
  excessive memory/CPU before AST validation runs.

- Add tests for both behaviors.

Addresses review #2879 findings.
2026-05-12 18:45:48 -07:00
claw b0352ba1c9 docs: address review findings on YAML depth validation
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m20s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m49s
- Add safety note on Strict() decoder not expanding aliases recursively,
  since alias resolution uses the pre-validated AST (finding #1)
- Document that ast.Node map keys rely on pointer identity, which holds
  because all goccy/go-yaml AST types are pointer receivers (finding #2)
- Clarify AnchorNode comment: effective depth budget is reduced for
  anchor+alias pairs, not literally halved (finding #3)
- Improve test depth trace comment for accuracy (finding #4)
- Add HTML comment in CONVENTIONS.md referencing #91 for the two-step
  process deviation (finding #5)
2026-05-12 17:39:38 -07:00
claw 0b16c4143a test: use per-subtest TempDir in TestYAMLEmptyFileRejection
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 44s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m42s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m9s
Move t.TempDir() inside each subtest for idiomatic test isolation,
as suggested by reviewers.
2026-05-12 15:22:27 -07:00
claw 493349e11a fix: correct comment accuracy and improve trailing-content check clarity
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m10s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m47s
- Fix validated map comment: says 'minimum depth' but stores the maximum
  depth at which a node was validated (overwritten on deeper visits).
- Replace dec.More() with explicit dec.Decode check for trailing JSON
  content. More() is documented for use inside arrays/objects; the
  explicit EOF check is clearer at the top-level stream.
2026-05-12 14:51:49 -07:00
claw 5cedeee9f4 address self-review findings on PR #89
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m12s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m33s
MINOR fixes:
- docs/DESIGN-57-yaml-persona.md: fix Error Cases table entry to reflect
  custom AST walk (checkYAMLDepth) instead of stale library-level reference
- review/persona.go: add EOF check after JSON decode to reject trailing
  garbage after a valid JSON object (prevents silent acceptance of malformed
  input like '{"name":"x"}garbage')
- review/persona_test.go: add TestJSONTrailingContentRejected test

NIT fixes:
- review/persona.go: add default case to checkYAMLDepth switch with
  explanatory comment about scalar leaf nodes
- review/persona.go: document AnchorNode depth+1 conservative asymmetry
- review/persona.go: simplify redundant if-guard in ListBuiltinPersonas
2026-05-12 14:42:22 -07:00
claw 01b6af03a8 fix(review): address review 2792 feedback
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 33s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m11s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m53s
- Document nodeCount overcounting as intentional conservative behavior
  (bounds total validation work, not unique nodes)
- Improve TestYAMLDeeplyNestedRejection comment with concrete depth trace
- Replace outdated gopkg.in/yaml.v3 pseudocode in design doc with
  reference to authoritative implementation
- Update PR description to clarify pre-approval via issue #57
2026-05-12 14:24:06 -07:00
claw 80091fb080 fix(review): address feedback from reviews 2788, 2789, 2791
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 23s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 39s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m45s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m7s
- Move nodeCount increment after cycle detection to avoid over-counting
  cyclic references (sonnet #2)
- Use underscores in test case names used as filenames (sonnet #3)
- Fix function comment: 'prevent silent data loss' → 'prevent confusing
  behavior where additional documents are silently ignored' (sonnet #4)
- Mark design doc pseudocode as historical since implementation uses
  goccy/go-yaml ast.Node, not gopkg.in/yaml.v3 yaml.Node (sonnet #5)
2026-05-12 14:13:59 -07:00
claw b5f17ddfc4 fix(security): prevent alias depth bypass in YAML validator
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 38s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m18s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m20s
The global 'seen' set allowed anchored subtrees validated at a shallow
depth to be skipped when later referenced via alias at a greater depth.
This could let effective nesting exceed MaxYAMLDepth, enabling DoS.

Fix: replace the single 'seen' set with two tracking maps:
- validated (node -> min depth): only short-circuits when current depth
  <= previously validated depth; re-checks at deeper contexts.
- visiting (node -> bool): per-path recursion stack for true cycle
  detection (breaks alias loops without suppressing depth checks).

Add TestYAMLAliasDepthBypass that constructs a document with an
anchored 15-level subtree referenced via alias under 6 levels of
nesting, verifying the combined effective depth (22) is rejected.

Addresses security-review-bot findings on review #2774.
2026-05-12 14:07:05 -07:00
rodin 144a36a2a7 docs: update DESIGN-57 to reflect goccy/go-yaml as the supported YAML library
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 31s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m19s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m57s
2026-05-12 20:52:37 +00:00
rodin 12f5f5a5e4 docs: update YAML library to github.com/goccy/go-yaml in CONVENTIONS.md
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 28s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m4s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m16s
2026-05-12 20:52:31 +00:00
claw 45d009dd06 fix(review): address review feedback on persona YAML handling
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 30s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m5s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m8s
- Reorder empty doc check before multi-doc check for natural flow
- Detect nil-body docs (whitespace-only, comment-only input)
- Add explanatory comment on pointer identity for cycle detection map
- Improve depth-counting test comment with AST walker specifics
- Add TestYAMLEmptyFileRejection covering empty/whitespace/comment inputs

Addresses MINOR and NIT findings from sonnet, gpt, and security reviews.
MAJOR (allowlist violation) tracked in issue #91.
2026-05-12 13:38:48 -07:00
claw 8991260333 fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 46s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m38s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m49s
Fixes #87.

PR #58 incorrectly added gopkg.in/yaml.v3 (abandoned library) instead of
github.com/goccy/go-yaml as required by issue #57.

Changes:
- Replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml v1.19.2
- Update review/persona.go to use goccy/go-yaml API:
  - parser.ParseBytes for AST-based depth/node count checking
  - yaml.Strict() decoder option instead of KnownFields(true)
  - ast.Node types instead of yaml.Node for tree walking
- Update review/persona_test.go to use ast types for cycle tests
- Remove gopkg.in/yaml.v3 from go.mod and go.sum

All existing YAML tests pass with the new library.
2026-05-12 13:27:30 -07:00
rodin 6f86e66943 fix(patterns): default patterns-files to empty (fetch all) (#77)
CI / test (push) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (push) Has been skipped
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (push) Has been skipped
2026-05-11 17:45:19 +00:00
35 changed files with 1617 additions and 3496 deletions
+34 -5
View File
@@ -124,14 +124,38 @@ runs:
else else
VERSION="${{ inputs.version }}" VERSION="${{ inputs.version }}"
fi fi
# Detect OS and architecture for platform-specific binary download
OS_RAW=$(uname -s | tr '[:upper:]' '[:lower:]')
case "$OS_RAW" in
linux) OS="linux" ;;
darwin) OS="darwin" ;;
*)
echo "Error: unsupported OS: $(uname -s)" >&2
exit 1
;;
esac
RAW_ARCH=$(uname -m)
case "$RAW_ARCH" in
x86_64) ARCH="amd64" ;;
aarch64 | arm64) ARCH="arm64" ;;
*)
echo "Error: unsupported architecture: $RAW_ARCH" >&2
exit 1
;;
esac
echo "version=${VERSION}" >> "$GITHUB_OUTPUT" echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
echo "os=${OS}" >> "$GITHUB_OUTPUT"
echo "arch=${ARCH}" >> "$GITHUB_OUTPUT"
- name: Cache review-bot binary - name: Cache review-bot binary
id: cache id: cache
uses: actions/cache@v4 uses: actions/cache@v4
with: with:
path: ${{ runner.temp }}/review-bot path: ${{ runner.temp }}/review-bot
key: review-bot-linux-amd64-${{ steps.version.outputs.version }} key: review-bot-${{ steps.version.outputs.os }}-${{ steps.version.outputs.arch }}-${{ steps.version.outputs.version }}
- name: Install review-bot - name: Install review-bot
if: steps.cache.outputs.cache-hit != 'true' if: steps.cache.outputs.cache-hit != 'true'
@@ -140,7 +164,7 @@ runs:
GITEA_URL="${{ inputs.gitea-url || github.server_url }}" GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}" REPO="${{ inputs.repo || 'rodin/review-bot' }}"
VERSION="${{ steps.version.outputs.version }}" VERSION="${{ steps.version.outputs.version }}"
BINARY="review-bot-linux-amd64" BINARY="review-bot-${{ steps.version.outputs.os }}-${{ steps.version.outputs.arch }}"
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \ curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-o "${{ runner.temp }}/review-bot" -o "${{ runner.temp }}/review-bot"
@@ -149,8 +173,13 @@ runs:
# Verify SHA-256 checksum # Verify SHA-256 checksum
cd "${{ runner.temp }}" cd "${{ runner.temp }}"
EXPECTED=$(grep "${BINARY}" checksums.txt | awk '{print $1}') EXPECTED=$(grep -E "^[[:xdigit:]]+[[:space:]]+\*?${BINARY}$" checksums.txt | awk '{print $1}')
ACTUAL=$(sha256sum review-bot | awk '{print $1}') # sha256sum (GNU) is not available on macOS; use shasum -a 256 on darwin.
if [ "${{ steps.version.outputs.os }}" = "darwin" ]; then
ACTUAL=$(shasum -a 256 review-bot | awk '{print $1}')
else
ACTUAL=$(sha256sum review-bot | awk '{print $1}')
fi
if [ -z "$EXPECTED" ]; then if [ -z "$EXPECTED" ]; then
echo "Error: no checksum found for ${BINARY}" >&2 echo "Error: no checksum found for ${BINARY}" >&2
@@ -164,7 +193,7 @@ runs:
fi fi
chmod +x "${{ runner.temp }}/review-bot" chmod +x "${{ runner.temp }}/review-bot"
echo "Installed review-bot ${VERSION} (checksum verified)" echo "Installed review-bot-${{ steps.version.outputs.os }}-${{ steps.version.outputs.arch }} ${VERSION} (checksum verified)"
- name: Run review - name: Run review
shell: bash shell: bash
-200
View File
@@ -1,200 +0,0 @@
# This composite action is designed for Gitea Actions runners.
# Gitea Actions supports GitHub Actions syntax including $GITHUB_OUTPUT,
# actions/cache, and actions/checkout.
# Requirements: python3, sha256sum, curl (all present on ubuntu-* runners).
name: 'AI Code Review'
description: 'Run AI-powered code review on a pull request using review-bot'
inputs:
gitea-url:
description: 'Gitea instance URL (defaults to server_url)'
required: false
default: ''
repo:
description: 'Repository (owner/name, defaults to current)'
required: false
default: ''
pr-number:
description: 'Pull request number (defaults to current PR)'
required: false
default: ''
reviewer-token:
description: 'Gitea token for posting the review'
required: true
reviewer-name:
description: 'Display name for the reviewer'
required: false
default: ''
llm-base-url:
description: 'OpenAI-compatible LLM API base URL (not required for aicore provider)'
required: false
default: ''
llm-api-key:
description: 'LLM API key (not required for aicore provider)'
required: false
default: ''
llm-model:
description: 'LLM model name'
required: true
llm-provider:
description: 'LLM API provider: openai, anthropic, or aicore (default openai)'
required: false
default: 'openai'
aicore-client-id:
description: 'SAP AI Core client ID (required for aicore provider)'
required: false
default: ''
aicore-client-secret:
description: 'SAP AI Core client secret (required for aicore provider)'
required: false
default: ''
aicore-auth-url:
description: 'SAP AI Core authentication URL (required for aicore provider)'
required: false
default: ''
aicore-api-url:
description: 'SAP AI Core API URL (required for aicore provider)'
required: false
default: ''
aicore-resource-group:
description: 'SAP AI Core resource group (default: default)'
required: false
default: 'default'
conventions-file:
description: 'Path to conventions file in the repo (e.g. CLAUDE.md)'
required: false
default: ''
patterns-repo:
description: 'Comma-separated repos with language patterns (e.g. rodin/elixir-patterns,rodin/phoenix-conventions)'
required: false
default: ''
patterns-files:
description: 'Comma-separated file paths or directories to fetch from patterns repos'
required: false
default: 'README.md'
temperature:
description: 'LLM temperature (0 = server default)'
required: false
default: '0'
timeout:
description: 'LLM request timeout in seconds (default 300)'
required: false
default: '300'
version:
description: 'review-bot version to install (e.g. v0.1.0, defaults to latest)'
required: false
default: 'latest'
dry-run:
description: 'Print review to stdout instead of posting'
required: false
default: 'false'
update-existing:
description: 'Delete previous review from same bot after posting new one. Accepts: true/1/yes or false/0/no (default true)'
required: false
default: 'true'
system-prompt-file:
description: 'Local file with additional system prompt instructions (e.g. security review focus)'
required: false
default: ''
persona:
description: 'Built-in persona name (security, architect, docs)'
required: false
default: ''
persona-file:
description: 'Path to custom persona JSON file'
required: false
default: ''
runs:
using: 'composite'
steps:
- name: Determine version
id: version
shell: bash
run: |
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
if [ "${{ inputs.version }}" = "latest" ]; then
VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
if [ -z "$VERSION" ]; then
echo "Failed to determine latest version" >&2
exit 1
fi
else
VERSION="${{ inputs.version }}"
fi
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
- name: Cache review-bot binary
id: cache
uses: actions/cache@v4
with:
path: ${{ runner.temp }}/review-bot
key: review-bot-linux-amd64-${{ steps.version.outputs.version }}
- name: Install review-bot
if: steps.cache.outputs.cache-hit != 'true'
shell: bash
run: |
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
VERSION="${{ steps.version.outputs.version }}"
BINARY="review-bot-linux-amd64"
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
-o "${{ runner.temp }}/checksums.txt"
# Verify SHA-256 checksum
cd "${{ runner.temp }}"
EXPECTED=$(grep "${BINARY}" checksums.txt | awk '{print $1}')
ACTUAL=$(sha256sum review-bot | awk '{print $1}')
if [ -z "$EXPECTED" ]; then
echo "Error: no checksum found for ${BINARY}" >&2
exit 1
fi
if [ "$EXPECTED" != "$ACTUAL" ]; then
echo "Error: checksum mismatch!" >&2
echo " Expected: $EXPECTED" >&2
echo " Actual: $ACTUAL" >&2
exit 1
fi
chmod +x "${{ runner.temp }}/review-bot"
echo "Installed review-bot ${VERSION} (checksum verified)"
- name: Run review
shell: bash
env:
GITHUB_SERVER_URL: ${{ inputs.gitea-url || github.server_url }}
GITHUB_REPOSITORY: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
REVIEWER_NAME: ${{ inputs.reviewer-name }}
LLM_BASE_URL: ${{ inputs.llm-base-url }}
LLM_API_KEY: ${{ inputs.llm-api-key }}
LLM_MODEL: ${{ inputs.llm-model }}
CONVENTIONS_FILE: ${{ inputs.conventions-file }}
PATTERNS_REPO: ${{ inputs.patterns-repo }}
PATTERNS_FILES: ${{ inputs.patterns-files }}
LLM_TEMPERATURE: ${{ inputs.temperature }}
LLM_TIMEOUT: ${{ inputs.timeout }}
LLM_PROVIDER: ${{ inputs.llm-provider }}
UPDATE_EXISTING: ${{ inputs.update-existing }}
SYSTEM_PROMPT_FILE: ${{ inputs.system-prompt-file }}
PERSONA: ${{ inputs.persona }}
PERSONA_FILE: ${{ inputs.persona-file }}
AICORE_CLIENT_ID: ${{ inputs.aicore-client-id }}
AICORE_CLIENT_SECRET: ${{ inputs.aicore-client-secret }}
AICORE_AUTH_URL: ${{ inputs.aicore-auth-url }}
AICORE_API_URL: ${{ inputs.aicore-api-url }}
AICORE_RESOURCE_GROUP: ${{ inputs.aicore-resource-group }}
run: |
ARGS=""
if [ "${{ inputs.dry-run }}" = "true" ]; then
ARGS="--dry-run"
fi
${{ runner.temp }}/review-bot $ARGS
-69
View File
@@ -1,69 +0,0 @@
name: CI
on:
push:
branches: [main]
pull_request:
types: [opened, synchronize]
jobs:
test:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- run: go test ./...
- run: go vet ./...
- run: go build -o review-bot ./cmd/review-bot
# Self-review using native SAP AI Core provider
# Models must match SAP AI Core deployments
# Available models: gpt-5, anthropic--claude-4.6-sonnet, anthropic--claude-4.6-opus
# Removed gpt-4.1, gpt-5-mini, gpt-4.1-mini - not deployed on AI Core
review:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request'
needs: test
strategy:
matrix:
include:
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
model: anthropic--claude-4.6-sonnet
- name: gpt
token_secret: GPT_REVIEW_TOKEN
model: gpt-5
- name: security
token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: "."
system_prompt_file: SECURITY_REVIEW.md
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- run: go build -o review-bot ./cmd/review-bot
- name: Run ${{ matrix.name }} review
env:
GITHUB_SERVER_URL: ${{ github.server_url }}
GITHUB_REPOSITORY: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }}
LLM_PROVIDER: aicore
LLM_MODEL: ${{ matrix.model }}
AICORE_CLIENT_ID: ${{ secrets.AICORE_CLIENT_ID }}
AICORE_CLIENT_SECRET: ${{ secrets.AICORE_CLIENT_SECRET }}
AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }}
AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: "CONVENTIONS.md"
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
LLM_TIMEOUT: "600"
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot
-38
View File
@@ -1,38 +0,0 @@
name: PR Ready Gate
on:
pull_request:
types: [synchronize]
jobs:
clear-labels:
runs-on: ubuntu-24.04
# Always run - curl commands are safe if labels don't exist
steps:
- name: Remove ready and self-reviewed labels, reassign to author
env:
GITEA_TOKEN: ${{ secrets.RODIN_TOKEN }}
run: |
PR_NUMBER=${{ github.event.pull_request.number }}
AUTHOR=${{ github.event.pull_request.user.login }}
READY_LABEL_ID=38
SELF_REVIEWED_LABEL_ID=37
# Remove ready label if present
curl -sS -X DELETE \
-H "Authorization: token $GITEA_TOKEN" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/issues/${PR_NUMBER}/labels/${READY_LABEL_ID}" || true
# Remove self-reviewed label if present
curl -sS -X DELETE \
-H "Authorization: token $GITEA_TOKEN" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/issues/${PR_NUMBER}/labels/${SELF_REVIEWED_LABEL_ID}" || true
# Reassign to author
curl -sS -X PATCH \
-H "Authorization: token $GITEA_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"assignees\": [\"${AUTHOR}\"]}" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/pulls/${PR_NUMBER}"
echo "Cleared ready/self-reviewed labels and reassigned PR #${PR_NUMBER} to ${AUTHOR}"
-97
View File
@@ -1,97 +0,0 @@
name: Release
on:
push:
tags:
- 'v*'
jobs:
release:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- name: Run tests
run: |
go vet ./...
go test ./...
- name: Build binaries
run: |
VERSION=${GITHUB_REF_NAME}
mkdir -p dist
GOOS=linux GOARCH=amd64 go build -ldflags "-s -w -X main.version=${VERSION}" -o dist/review-bot-linux-amd64 ./cmd/review-bot
GOOS=linux GOARCH=arm64 go build -ldflags "-s -w -X main.version=${VERSION}" -o dist/review-bot-linux-arm64 ./cmd/review-bot
GOOS=darwin GOARCH=amd64 go build -ldflags "-s -w -X main.version=${VERSION}" -o dist/review-bot-darwin-amd64 ./cmd/review-bot
GOOS=darwin GOARCH=arm64 go build -ldflags "-s -w -X main.version=${VERSION}" -o dist/review-bot-darwin-arm64 ./cmd/review-bot
cd dist && sha256sum * > checksums.txt
- name: Create release and upload assets
env:
GITEA_TOKEN: ${{ secrets.RELEASE_TOKEN }}
run: |
VERSION=${GITHUB_REF_NAME}
GITEA_URL="${{ github.server_url }}"
REPO="${{ github.repository }}"
# Create release (or find existing one for this tag)
HTTP_CODE=$(curl -s -o /tmp/release_response.json -w "%{http_code}" -X POST \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
"${GITEA_URL}/api/v1/repos/${REPO}/releases" \
-d "{\"tag_name\": \"${VERSION}\", \"name\": \"${VERSION}\", \"body\": \"Release ${VERSION}\", \"draft\": false, \"prerelease\": false}")
if [ "$HTTP_CODE" = "409" ]; then
echo "Release for ${VERSION} already exists, fetching existing..."
curl -sSf -o /tmp/release_response.json \
-H "Authorization: token ${GITEA_TOKEN}" \
"${GITEA_URL}/api/v1/repos/${REPO}/releases/tags/${VERSION}"
elif [ "$HTTP_CODE" != "201" ]; then
echo "Failed to create release (HTTP ${HTTP_CODE})" >&2
cat /tmp/release_response.json >&2
exit 1
fi
# Parse release ID (python3 available on ubuntu-24.04 runners)
RELEASE_ID=$(python3 -c "import json; print(json.load(open('/tmp/release_response.json'))['id'])")
if [ -z "$RELEASE_ID" ]; then
echo "Failed to parse release ID" >&2
cat /tmp/release_response.json >&2
exit 1
fi
echo "Release ID: ${RELEASE_ID}"
# Upload each asset (idempotent: delete existing asset with same name first)
for file in dist/*; do
filename=$(basename "$file")
echo "Uploading ${filename}..."
# Check if asset already exists and delete it
EXISTING_ID=$(export ASSET_NAME="${filename}"; curl -sS \
-H "Authorization: token ${GITEA_TOKEN}" \
"${GITEA_URL}/api/v1/repos/${REPO}/releases/${RELEASE_ID}/assets" \
| python3 -c "import json,sys,os; name=os.environ['ASSET_NAME']; assets=json.load(sys.stdin); print(next((str(a['id']) for a in assets if a['name']==name),''))" 2>/dev/null)
if [ -n "$EXISTING_ID" ]; then
echo " Asset ${filename} already exists (id=${EXISTING_ID}), deleting..."
curl -sSf -X DELETE \
-H "Authorization: token ${GITEA_TOKEN}" \
"${GITEA_URL}/api/v1/repos/${REPO}/releases/${RELEASE_ID}/assets/${EXISTING_ID}"
fi
curl -sSf -X POST \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/octet-stream" \
"${GITEA_URL}/api/v1/repos/${REPO}/releases/${RELEASE_ID}/assets?name=$(printf '%s' "${filename}" | jq -sRr @uri)" \
--data-binary "@${file}"
done
echo "Release ${VERSION} created with assets"
+1 -1
View File
@@ -9,7 +9,7 @@
| Package | Use Case | Scope | | Package | Use Case | Scope |
|---------|----------|-------| |---------|----------|-------|
| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production | | `github.com/goccy/go-yaml` | YAML parsing and AST inspection (subpkgs: `ast`, `parser`) | production |
| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only | | `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |
**Any import not in this table or the Go standard library is forbidden.** **Any import not in this table or the Go standard library is forbidden.**
+1 -1
View File
@@ -130,7 +130,7 @@ func TestIntegration_PostAndCleanup(t *testing.T) {
// Post a test review // Post a test review
sentinel := "<!-- review-bot:integration-test -->" sentinel := "<!-- review-bot:integration-test -->"
testBody := "# Integration Test Review\n\nThis is a test review.\n\n" + sentinel testBody := "# Integration Test Review\n\nThis is a test review.\n\n" + sentinel
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, nil) posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, "", nil)
if err != nil { if err != nil {
t.Fatalf("PostReview: %v", err) t.Fatalf("PostReview: %v", err)
} }
+25 -20
View File
@@ -15,7 +15,6 @@ import (
"gitea.weiker.me/rodin/review-bot/gitea" "gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/llm" "gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review" "gitea.weiker.me/rodin/review-bot/review"
"gitea.weiker.me/rodin/review-bot/vcs"
) )
var version = "dev" var version = "dev"
@@ -55,8 +54,8 @@ func main() {
logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json") logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json")
verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error") verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error")
// CLI flags // CLI flags
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", envOrDefault("GITHUB_SERVER_URL", "")), "Gitea instance URL") giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
repo := flag.String("repo", envOrDefault("GITEA_REPO", envOrDefault("GITHUB_REPOSITORY", "")), "Repository (owner/name)") repo := flag.String("repo", envOrDefault("GITEA_REPO", ""), "Repository (owner/name)")
prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number") prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number")
reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name") reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name")
reviewerToken := flag.String("reviewer-token", envOrDefault("REVIEWER_TOKEN", ""), "Gitea token for posting review") reviewerToken := flag.String("reviewer-token", envOrDefault("REVIEWER_TOKEN", ""), "Gitea token for posting review")
@@ -66,7 +65,7 @@ func main() {
conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)") conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)")
systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions") systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions")
patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)") patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", "README.md"), "Comma-separated file paths to fetch from patterns repo") patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", ""), "Comma-separated file paths to fetch from patterns repo (empty = all files)")
dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting") dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting")
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)") llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)") llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
@@ -445,7 +444,7 @@ func main() {
// POST new review // POST new review
slog.Info("posting review", "event", event, "pr", prNumber) slog.Info("posting review", "event", event, "pr", prNumber)
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, inlineComments) posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, evaluatedSHA, inlineComments)
if err != nil { if err != nil {
slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err) slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err)
os.Exit(1) os.Exit(1)
@@ -524,11 +523,25 @@ func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, re
// patternsRepo is comma-separated list of owner/name repos. // patternsRepo is comma-separated list of owner/name repos.
// patternsFiles is comma-separated list of file paths or directories. // patternsFiles is comma-separated list of file paths or directories.
// If a path ends with / or is a directory, all files within it are fetched recursively. // If a path ends with / or is a directory, all files within it are fetched recursively.
// If patternsFiles is empty, all files from the repo root are fetched.
func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patternsFiles string) string { func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patternsFiles string) string {
var sb strings.Builder var sb strings.Builder
repos := strings.Split(patternsRepo, ",") repos := strings.Split(patternsRepo, ",")
paths := strings.Split(patternsFiles, ",")
// Build the list of paths to fetch
var paths []string
if patternsFiles == "" {
// Empty patternsFiles means "fetch all files from repo root"
paths = []string{""}
} else {
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
}
for _, repoRef := range repos { for _, repoRef := range repos {
if ctx.Err() != nil { if ctx.Err() != nil {
@@ -549,11 +562,6 @@ func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patt
var repoSkippedFiles []string var repoSkippedFiles []string
for _, path := range paths { for _, path := range paths {
path = strings.TrimSpace(path)
if path == "" {
continue
}
files, err := client.GetAllFilesInPath(ctx, owner, repo, path) files, err := client.GetAllFilesInPath(ctx, owner, repo, path)
if err != nil { if err != nil {
slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err) slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err)
@@ -813,7 +821,7 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
return evaluatedSHA != currentSHA return evaluatedSHA != currentSHA
} }
// giteaClientAdapter adapts gitea.Client to vcs.FileReader interface. // giteaClientAdapter adapts gitea.Client to review.GiteaClient interface.
type giteaClientAdapter struct { type giteaClientAdapter struct {
client *gitea.Client client *gitea.Client
} }
@@ -822,14 +830,14 @@ func newGiteaClientAdapter(c *gitea.Client) *giteaClientAdapter {
return &giteaClientAdapter{client: c} return &giteaClientAdapter{client: c}
} }
func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]vcs.ContentEntry, error) { func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path) entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil { if err != nil {
return nil, err return nil, err
} }
result := make([]vcs.ContentEntry, len(entries)) result := make([]review.ContentEntry, len(entries))
for i, e := range entries { for i, e := range entries {
result[i] = vcs.ContentEntry{ result[i] = review.ContentEntry{
Name: e.Name, Name: e.Name,
Path: e.Path, Path: e.Path,
Type: e.Type, Type: e.Type,
@@ -838,9 +846,6 @@ func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path
return result, nil return result, nil
} }
func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filePath, ref string) (string, error) { func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if ref != "" { return a.client.GetFileContent(ctx, owner, repo, filepath)
return a.client.GetFileContentRef(ctx, owner, repo, filePath, ref)
}
return a.client.GetFileContent(ctx, owner, repo, filePath)
} }
+46
View File
@@ -504,6 +504,52 @@ func TestIsPatternFile(t *testing.T) {
} }
} }
// TestBuildPatternPaths verifies the path-building logic for fetchPatterns.
// Empty patternsFiles means "fetch all from root" (represented as [""]).
func TestBuildPatternPaths(t *testing.T) {
buildPaths := func(patternsFiles string) []string {
if patternsFiles == "" {
return []string{""}
}
var paths []string
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
return paths
}
tests := []struct {
name string
input string
want []string
}{
{"empty fetches root", "", []string{""}},
{"single file", "README.md", []string{"README.md"}},
{"multiple files", "README.md,PATTERNS.md", []string{"README.md", "PATTERNS.md"}},
{"trims whitespace", " foo.md , bar.md ", []string{"foo.md", "bar.md"}},
{"skips empty between commas", "foo.md,,bar.md", []string{"foo.md", "bar.md"}},
{"directory path", "patterns/", []string{"patterns/"}},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := buildPaths(tc.input)
if len(got) != len(tc.want) {
t.Errorf("buildPaths(%q) = %v, want %v", tc.input, got, tc.want)
return
}
for i := range got {
if got[i] != tc.want[i] {
t.Errorf("buildPaths(%q)[%d] = %q, want %q", tc.input, i, got[i], tc.want[i])
}
}
})
}
}
func TestEvaluateCIStatus(t *testing.T) { func TestEvaluateCIStatus(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
+10 -31
View File
@@ -9,7 +9,7 @@ JSON is awkward for persona files that contain multi-line text (identity, severi
- Backwards compatibility: existing JSON personas must continue to work - Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486) - Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`) - Consistency: use `.yaml` extension (not `.yml`)
- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting - Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
## Proposed Approach ## Proposed Approach
@@ -33,37 +33,16 @@ func parsePersona(data []byte, source string) (*Persona, error) {
### YAML Parsing with Depth Protection ### YAML Parsing with Depth Protection
```go We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error { `review/persona.go`) rather than relying on library decoder options. Key design
var node yaml.Node decisions:
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
return err
}
if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
return err
}
return node.Decode(out)
}
func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error { - **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
if depth > maxDepth { - **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth) - **Node-count limit:** Conservative overcounting bounds total validation work
} - **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
// Handle alias nodes by following the Alias pointer
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth, maxDepth)
}
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
return err
}
}
return nil
}
```
The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct. See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
## State/Data Model ## State/Data Model
@@ -74,7 +53,7 @@ No new state. Same `Persona` struct, just different parsing.
| Error | Handling | | Error | Handling |
|-------|----------| |-------|----------|
| Invalid YAML syntax | Return parse error with source file | | Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Library rejects (v1.16.0+ fix) | | Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
| Unknown extension | Fall back to JSON parsing | | Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse | | Missing required fields | Validation rejects after parse |
-268
View File
@@ -1,268 +0,0 @@
# GitHub Support for review-bot
## Goal
AI code reviews on GitHub PRs using SAP AI Core as the LLM provider.
## Non-Goals
- Auto-detection of platform (explicit `--provider` flag is fine)
- Unifying into one abstraction layer for its own sake
## Constraints
1. **Same features on both platforms** — anything review-bot does on Gitea should work on GitHub
2. **Testable** — small interfaces, dependency injection, no global state
3. **Interface from working code** — extract from gitea/, don't invent in vacuum
---
## Part 1: Feature Inventory
What does review-bot actually do?
### Core Review Flow
| Feature | Description |
|---------|-------------|
| Get PR metadata | Title, body, head SHA, base ref |
| Get PR diff | Unified diff format |
| Get PR files | List of changed files with status |
| Get file content | Raw file at ref |
| List directory | Enumerate files in path |
| Post review | Body + inline comments + verdict |
### Review Management
| Feature | Description |
|---------|-------------|
| List reviews | Get existing reviews on PR |
| Delete review | Remove old review before re-posting |
| Get authenticated user | Who am I? |
### Platform-Specific (not in shared interface)
| Feature | Gitea | GitHub |
|---------|-------|--------|
| Resolve comment | Yes | No equivalent |
| Timeline API | Yes | No equivalent |
These stay on gitea.Client directly. Callers that need them type-assert.
---
## Part 2: GitHub API Mapping
| Feature | Gitea API | GitHub API |
|---------|-----------|------------|
| Get PR | `GET /api/v1/repos/.../pulls/{n}` | `GET /repos/.../pulls/{n}` |
| Get diff | `.diff` suffix | `Accept: application/vnd.github.diff` header |
| Get files | `GET .../pulls/{n}/files` | Same |
| Get file content | `GET .../raw/{path}?ref=` | `GET .../contents/{path}?ref=` + base64 decode |
| List directory | `GET .../contents/{path}` | Same |
| Post review | `POST .../pulls/{n}/reviews` | Same (adapter handles comment schema) |
| List reviews | `GET .../pulls/{n}/reviews` | Same |
| Delete review | `DELETE .../pulls/{n}/reviews/{id}` | Same |
| Get user | `GET /api/v1/user` | `GET /user` |
---
## Part 3: Interface Design
**Principle:** Extract from working gitea/ code. The interface is discovered, not invented.
### Small, role-based interfaces
```go
// vcs/interfaces.go
type PRReader interface {
GetPullRequest(ctx context.Context, owner, repo string, number int) (*PullRequest, error)
GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error)
GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]ChangedFile, error)
}
type FileReader interface {
GetFileContent(ctx context.Context, owner, repo, path, ref string) (string, error)
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
}
type Reviewer interface {
PostReview(ctx context.Context, owner, repo string, number int, req ReviewRequest) (*Review, error)
ListReviews(ctx context.Context, owner, repo string, number int) ([]Review, error)
DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error
}
type Identity interface {
GetAuthenticatedUser(ctx context.Context) (string, error)
}
// Client combines all for callers that need everything
type Client interface {
PRReader
FileReader
Reviewer
Identity
}
```
### Types
Use what gitea/ already has. Move to vcs/types.go or re-export.
```go
type PullRequest struct { ... } // from gitea.PullRequest
type ChangedFile struct { ... } // from gitea.ChangedFile
type ContentEntry struct { ... } // from gitea.ContentEntry
type Review struct { ... } // from gitea.Review
type ReviewRequest struct { ... } // new, for PostReview input
type ReviewComment struct { ... } // from gitea.ReviewComment
```
### Adapter responsibilities
Each adapter (gitea, github) handles:
- API URL construction
- Auth header format (`token` vs `Bearer`)
- Request/response mapping
- Comment schema translation (line numbers, commit IDs, etc.)
---
## Part 4: Test Plan
### Unit Tests (mock HTTP)
```
github/
pr_test.go # TestGetPullRequest, TestGetDiff, TestGetFiles
files_test.go # TestGetFileContent, TestListContents
review_test.go # TestPostReview, TestListReviews, TestDeleteReview
identity_test.go # TestGetAuthenticatedUser
```
Per method: happy path, 404, 401, 429, malformed response.
### Integration Tests
Against github.com/aweiker/ai-core-review-bot:
- Fetch real PR
- Fetch real file
- Post + delete review (clean up)
### End-to-End
Open PR on test repo, run full review-bot, verify review appears.
---
## Part 5: Implementation Phases
### Phase 1: Extract interfaces from gitea/
**Work:**
- Create `vcs/interfaces.go` with interfaces extracted from gitea/client.go signatures
- Create `vcs/types.go` — move or alias types from gitea/
- Verify gitea.Client satisfies vcs.Client (compile-time check)
**Exit criteria:** `var _ vcs.Client = (*gitea.Client)(nil)` compiles.
---
### Phase 2: Gitea adapter (if needed)
**Work:**
- If gitea.Client method signatures don't match exactly, create wrapper
- Keep gitea/ working exactly as before
**Exit criteria:** Existing tests pass. No behavior change.
---
### Phase 3: GitHub client — PRReader
**Work:**
- `github/client.go` — struct, constructor, HTTP helpers
- `github/pr.go` — GetPullRequest, GetPullRequestDiff, GetPullRequestFiles
- Unit tests
**Exit criteria:** `go test ./github/...` passes for PR methods.
---
### Phase 4: GitHub client — FileReader
**Work:**
- `github/files.go` — GetFileContent, ListContents
- Unit tests
**Exit criteria:** Unit tests pass.
---
### Phase 5: GitHub client — Reviewer + Identity
**Work:**
- `github/review.go` — PostReview, ListReviews, DeleteReview
- `github/identity.go` — GetAuthenticatedUser
- Unit tests
**Exit criteria:** Unit tests pass.
---
### Phase 6: Integration tests
**Work:**
- `integration/github_test.go`
- Test against real GitHub
**Exit criteria:** All integration tests pass.
---
### Phase 7: Wire into cmd/review-bot
**Work:**
- Add `--provider github|gitea` flag (default: gitea for backward compat)
- Select client based on flag
- Update to use vcs interfaces where it makes sense
**Exit criteria:**
- `./review-bot --provider github ...` works
- `./review-bot --provider gitea ...` works (same as before)
- Existing Gitea workflows unchanged
---
### Phase 8: GitHub Actions workflow + releases
**Work:**
- `.github/workflows/ci.yml` — test on PR
- `.github/workflows/release.yml` — publish binary to GitHub releases
- `.github/actions/review/action.yml` — composite action
- Action downloads binary from github.com/aweiker/ai-core-review-bot releases
**Exit criteria:**
- CI runs on github.com/aweiker/ai-core-review-bot
- Release creates downloadable binary
- Review action posts review successfully
---
## Part 6: Decisions
| Question | Decision |
|----------|----------|
| Auth token | Workflow `GITHUB_TOKEN` (automatic) |
| Binary distribution | GitHub releases on aweiker/ai-core-review-bot |
| Comment schema | Adapter's job — translate ReviewComment to platform format |
| Default provider | `gitea` for backward compatibility |
| Shared types | vcs/types.go (extracted from gitea/) |
| Platform-specific features | Stay on concrete client, not interface |
---
## Summary
8 phases. Start by extracting interfaces from working gitea/ code, not inventing them. GitHub implements the same interfaces. Each phase has clear exit criteria.
+134 -30
View File
@@ -11,6 +11,7 @@ import (
"fmt" "fmt"
"io" "io"
"log/slog" "log/slog"
"math"
"net" "net"
"net/http" "net/http"
"net/url" "net/url"
@@ -47,6 +48,12 @@ func IsServerError(err error) bool {
return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600 return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600
} }
// DefaultMaxDiffSize is the default maximum diff size in bytes (10 MB).
const DefaultMaxDiffSize = 10 * 1024 * 1024
// ErrDiffTooLarge is returned when a PR diff exceeds the configured MaxDiffSize.
var ErrDiffTooLarge = errors.New("diff size exceeds maximum allowed size")
// Client interacts with the Gitea API. // Client interacts with the Gitea API.
// A Client is safe for concurrent use by multiple goroutines. // A Client is safe for concurrent use by multiple goroutines.
type Client struct { type Client struct {
@@ -61,6 +68,42 @@ type Client struct {
// This field must be configured before the first request is made. // This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe. // Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration RetryBackoff []time.Duration
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
// If zero, defaults to DefaultMaxDiffSize (10 MB). Set to any negative value
// (or math.MaxInt64) to disable the limit.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
MaxDiffSize int64
}
// defaultCheckRedirect is the redirect policy used by NewClient.
// NOTE: This function is intentionally duplicated in github/client.go (and vice versa)
// because the packages are separate. Changes here must be mirrored there.
// It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Guard for direct invocation in tests and any future callers;
// net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 {
return nil
}
prev := via[len(via)-1]
// Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
}
// Reject cross-host redirect entirely to avoid consuming responses
// from untrusted endpoints.
if req.URL.Host != prev.URL.Host {
return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
}
return nil
} }
// NewClient creates a new Gitea API client. // NewClient creates a new Gitea API client.
@@ -68,13 +111,30 @@ func NewClient(baseURL, token string) *Client {
return &Client{ return &Client{
baseURL: strings.TrimRight(baseURL, "/"), baseURL: strings.TrimRight(baseURL, "/"),
token: token, token: token,
http: &http.Client{Timeout: 30 * time.Second}, http: &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
},
} }
} }
// SetHTTPClient sets the underlying HTTP client used for requests. // SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for testing to inject mock transports. // This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests.
//
// Passing nil restores the default client (30s timeout + redirect-rejecting
// CheckRedirect policy matching NewClient).
//
// Callers providing a non-nil client are responsible for configuring a safe
// CheckRedirect policy. Without one, the default net/http behavior will follow
// redirects and may forward the Authorization header to untrusted hosts.
func (c *Client) SetHTTPClient(hc *http.Client) { func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
hc = &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
}
}
c.http = hc c.http = hc
} }
@@ -125,9 +185,28 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number
} }
// GetPullRequestDiff fetches the unified diff for a PR. // GetPullRequestDiff fetches the unified diff for a PR.
// It enforces MaxDiffSize to prevent unbounded memory allocation.
// Returns ErrDiffTooLarge if the diff exceeds the configured limit.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) { func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number) reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
maxSize := c.MaxDiffSize
if maxSize == 0 {
maxSize = DefaultMaxDiffSize
}
// When the limit is disabled (negative) or set to math.MaxInt64 (which
// would overflow the +1 detection and silently disable enforcement),
// use the standard unlimited doGet path.
if maxSize < 0 || maxSize == math.MaxInt64 {
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
body, err := c.doGetLimited(ctx, reqURL, maxSize)
if err != nil { if err != nil {
return "", fmt.Errorf("fetch diff: %w", err) return "", fmt.Errorf("fetch diff: %w", err)
} }
@@ -183,18 +262,22 @@ func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, r
} }
// PostReview submits a review to a PR and returns the created review. // PostReview submits a review to a PR and returns the created review.
// event should be "APPROVED" or "REQUEST_CHANGES". // event should be one of "APPROVED", "REQUEST_CHANGES", or "COMMENT".
// commitID anchors the review to a specific commit SHA. If empty, Gitea
// defaults to the current PR head.
// comments are optional inline comments attached to specific lines. // comments are optional inline comments attached to specific lines.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body string, comments []ReviewComment) (*Review, error) { func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []ReviewComment) (*Review, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number) reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := struct { payload := struct {
Body string `json:"body"` Body string `json:"body"`
Event string `json:"event"` Event string `json:"event"`
CommitID string `json:"commit_id,omitempty"`
Comments []ReviewComment `json:"comments,omitempty"` Comments []ReviewComment `json:"comments,omitempty"`
}{ }{
Body: body, Body: body,
Event: event, Event: event,
CommitID: commitID,
Comments: comments, Comments: comments,
} }
@@ -292,9 +375,9 @@ func isRetriableSyscallError(err error) bool {
return true return true
} }
// redactURL strips query parameters from a URL for safe logging. // redactURL strips query parameters and userinfo credentials from a URL for
// This prevents accidental exposure of sensitive data that future callers // safe logging. This prevents accidental exposure of sensitive data (tokens in
// might pass via query strings. // query strings, or user:pass in the authority) in log output.
func redactURL(rawURL string) string { func redactURL(rawURL string) string {
parsed, err := url.Parse(rawURL) parsed, err := url.Parse(rawURL)
if err != nil { if err != nil {
@@ -302,6 +385,9 @@ func redactURL(rawURL string) string {
// potentially logging something sensitive. // potentially logging something sensitive.
return "[invalid URL]" return "[invalid URL]"
} }
if parsed.User != nil {
parsed.User = url.User("REDACTED")
}
if parsed.RawQuery != "" { if parsed.RawQuery != "" {
parsed.RawQuery = "[redacted]" parsed.RawQuery = "[redacted]"
} }
@@ -322,10 +408,12 @@ func sanitizeErrorForLog(err error) string {
return err.Error() return err.Error()
} }
// doGet performs an HTTP GET request with retry on 5xx errors and temporary // doGetWithReader performs an HTTP GET request with retry on 5xx errors and
// network errors. Retries up to 3 times with exponential backoff (1s, 2s delays // temporary network errors. Retries up to 3 times with exponential backoff
// by default; configurable via Client.RetryBackoff for testing). // (1s, 2s delays by default; configurable via Client.RetryBackoff for testing).
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) { // The readBody function is called with the response body on success (2xx) and
// is responsible for reading and closing it.
func (c *Client) doGetWithReader(ctx context.Context, reqURL string, readBody func(io.ReadCloser) ([]byte, error)) ([]byte, error) {
const maxAttempts = 3 const maxAttempts = 3
// backoff[i] is the delay before attempt i+1 (i.e., after attempt i fails). // backoff[i] is the delay before attempt i+1 (i.e., after attempt i fails).
// First attempt (i=0) has no delay; retries wait 1s then 2s by default. // First attempt (i=0) has no delay; retries wait 1s then 2s by default.
@@ -390,12 +478,7 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr return nil, lastErr
} }
if resp.StatusCode >= 200 && resp.StatusCode < 300 { if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(resp.Body) return readBody(resp.Body)
resp.Body.Close()
if err != nil {
return nil, err
}
return body, nil
} }
// Error path: limit how much we read from potentially malicious server // Error path: limit how much we read from potentially malicious server
@@ -413,6 +496,39 @@ func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return nil, lastErr return nil, lastErr
} }
// doGet performs an HTTP GET request with retry, reading the full response body.
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) {
defer body.Close()
return io.ReadAll(body)
})
}
// doGetLimited performs an HTTP GET request with retry but enforces a maximum
// response body size. Returns ErrDiffTooLarge if the response exceeds maxBytes.
// It reads maxBytes+1 (clamped to avoid overflow) to detect truncation without
// buffering the entire body.
func (c *Client) doGetLimited(ctx context.Context, reqURL string, maxBytes int64) ([]byte, error) {
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) {
defer body.Close()
// Read up to maxBytes+1 to detect overflow.
// Clamp to prevent integer overflow when maxBytes == math.MaxInt64.
limitBytes := maxBytes + 1
if limitBytes <= 0 {
limitBytes = math.MaxInt64
}
limited := io.LimitReader(body, limitBytes)
data, err := io.ReadAll(limited)
if err != nil {
return nil, err
}
if int64(len(data)) > maxBytes {
return nil, fmt.Errorf("%w: response exceeds %d bytes", ErrDiffTooLarge, maxBytes)
}
return data, nil
})
}
// escapePath escapes each segment of a relative file path for use in URLs. // escapePath escapes each segment of a relative file path for use in URLs.
// Slashes are preserved as path separators; other special characters are escaped. // Slashes are preserved as path separators; other special characters are escaped.
// Input should be a relative path (no leading slash). Already-encoded segments // Input should be a relative path (no leading slash). Already-encoded segments
@@ -831,15 +947,3 @@ func (c *Client) ResolveComment(ctx context.Context, owner, repo string, comment
} }
return nil return nil
} }
// DismissReview dismisses a review on a pull request.
// This is a stub for the vcs.Reviewer interface; full implementation is Phase 2.
func (c *Client) DismissReview(ctx context.Context, owner, repo string, number int, reviewID int64, message string) error {
return fmt.Errorf("dismiss review %d on %s/%s#%d: %w", reviewID, owner, repo, number, errors.ErrUnsupported)
}
// GetFileContentAtRef fetches a file at a specific ref from a repo.
// This delegates to GetFileContentRef for the Gitea implementation.
func (c *Client) GetFileContentAtRef(ctx context.Context, owner, repo, path, ref string) (string, error) {
return c.GetFileContentRef(ctx, owner, repo, path, ref)
}
+148 -7
View File
@@ -9,6 +9,7 @@ import (
"net" "net"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"net/url"
"strings" "strings"
"sync/atomic" "sync/atomic"
"syscall" "syscall"
@@ -116,8 +117,9 @@ func TestPostReview(t *testing.T) {
} }
var payload struct { var payload struct {
Body string `json:"body"` Body string `json:"body"`
Event string `json:"event"` Event string `json:"event"`
CommitID string `json:"commit_id"`
} }
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil { if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Fatalf("failed to decode payload: %v", err) t.Fatalf("failed to decode payload: %v", err)
@@ -128,14 +130,16 @@ func TestPostReview(t *testing.T) {
if payload.Event != "APPROVED" { if payload.Event != "APPROVED" {
t.Errorf("expected event %q, got %q", "APPROVED", payload.Event) t.Errorf("expected event %q, got %q", "APPROVED", payload.Event)
} }
if payload.CommitID != "abc123def" {
t.Errorf("expected commit_id %q, got %q", "abc123def", payload.CommitID)
}
w.WriteHeader(http.StatusOK) w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":100,"user":{"login":"review-bot"},"state":"APPROVED","stale":false}`)) w.Write([]byte(`{"id":100,"user":{"login":"review-bot"},"state":"APPROVED","stale":false}`))
})) }))
defer server.Close() defer server.Close()
client := NewClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", nil) review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", "abc123def", nil)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
@@ -182,12 +186,35 @@ func TestPostReview_Non200(t *testing.T) {
defer server.Close() defer server.Close()
client := NewClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", nil) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", "", nil)
if err == nil { if err == nil {
t.Fatal("expected error for 403, got nil") t.Fatal("expected error for 403, got nil")
} }
} }
func TestPostReview_EmptyCommitID_OmittedFromPayload(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var raw map[string]interface{}
if err := json.Unmarshal(body, &raw); err != nil {
t.Fatalf("failed to decode payload: %v", err)
}
if _, exists := raw["commit_id"]; exists {
t.Errorf("expected commit_id to be omitted from payload when empty, but it was present")
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":200,"user":{"login":"bot"},"state":"APPROVED","stale":false}`))
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "ok", "", nil)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestGetFileContent(t *testing.T) { func TestGetFileContent(t *testing.T) {
expected := "# Conventions\n- Be nice\n" expected := "# Conventions\n- Be nice\n"
@@ -944,8 +971,6 @@ func TestDoGet_RespectsContextCancellation(t *testing.T) {
t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts) t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts)
} }
} }
// mockTransport is a test helper that returns errors for the first N calls, // mockTransport is a test helper that returns errors for the first N calls,
// then delegates to a real server. // then delegates to a real server.
type mockTransport struct { type mockTransport struct {
@@ -1092,6 +1117,21 @@ func TestRedactURL(t *testing.T) {
input: "", input: "",
want: "", want: "",
}, },
{
name: "with userinfo - redacts credentials",
input: "https://admin:secret@gitea.example.com/api/v1/repos",
want: "https://REDACTED@gitea.example.com/api/v1/repos",
},
{
name: "with userinfo and query params",
input: "https://user:pass@example.com/path?token=abc",
want: "https://REDACTED@example.com/path?[redacted]",
},
{
name: "username only - no password",
input: "https://user@example.com/path",
want: "https://REDACTED@example.com/path",
},
} }
for _, tt := range tests { for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) { t.Run(tt.name, func(t *testing.T) {
@@ -1144,3 +1184,104 @@ func TestSanitizeErrorForLog(t *testing.T) {
}) })
} }
} }
func TestNewClient_HasCheckRedirect(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect to be set")
}
}
func TestDefaultCheckRedirect_RejectsHTTPSToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "gitea.example.com", Path: "/foo"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on HTTPS->HTTP redirect")
}
if !strings.Contains(err.Error(), "HTTPS to HTTP downgrade") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsCrossHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "cdn.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on cross-host redirect")
}
if !strings.Contains(err.Error(), "cross-host") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_AllowsSameHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if auth := req.Header.Get("Authorization"); auth != "token abc" {
t.Errorf("expected Authorization to be preserved, got %q", auth)
}
}
func TestDefaultCheckRedirect_AllowsSameHostHTTPToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/bar"},
Header: http.Header{},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsTooManyRedirects(t *testing.T) {
via := make([]*http.Request, 10)
for i := range via {
via[i] = &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/"}}
}
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/final"}}
err := defaultCheckRedirect(req, via)
if err == nil {
t.Fatal("expected error after 10 redirects")
}
if !strings.Contains(err.Error(), "10 redirects") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_EmptyViaAllowed(t *testing.T) {
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
err := defaultCheckRedirect(req, nil)
if err != nil {
t.Fatalf("unexpected error with empty via: %v", err)
}
}
func TestSetHTTPClient_NilRestoresDefault(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
c.SetHTTPClient(nil)
if c.http == nil {
t.Fatal("expected non-nil http client after SetHTTPClient(nil)")
}
if c.http.Timeout != 30*time.Second {
t.Errorf("expected 30s timeout, got %v", c.http.Timeout)
}
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect policy after SetHTTPClient(nil)")
}
}
-25
View File
@@ -1,25 +0,0 @@
//go:build phase2
package gitea_test
import (
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// Compile-time interface conformance assertions.
// These will verify gitea.Client satisfies vcs interfaces once the Phase 2
// adapter bridges the method signature gaps:
//
// - PRReader: GetPullRequest returns *gitea.PullRequest (needs *vcs.PullRequest)
// - PRReader: GetPullRequestFiles returns []gitea.ChangedFile (needs []vcs.ChangedFile)
// - FileReader: GetFileContent lacks ref parameter
// - Reviewer: PostReview uses (event, body, comments) instead of vcs.ReviewRequest
//
// Remove the phase2 build tag once the adapter is complete.
var (
_ vcs.PRReader = (*gitea.Client)(nil)
_ vcs.FileReader = (*gitea.Client)(nil)
_ vcs.Reviewer = (*gitea.Client)(nil)
_ vcs.Identity = (*gitea.Client)(nil)
)
+97
View File
@@ -0,0 +1,97 @@
package gitea
import (
"context"
"errors"
"math"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestGetPullRequestDiff_SizeLimits(t *testing.T) {
tests := []struct {
name string
diff string
maxDiffSize int64
wantErr error
wantDiff string
}{
{
name: "exceeds max size",
diff: strings.Repeat("+ added line\n", 1000), // ~13 KB
maxDiffSize: 100,
wantErr: ErrDiffTooLarge,
},
{
name: "within max size",
diff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
maxDiffSize: 1024,
wantDiff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
},
{
name: "exactly at limit",
diff: strings.Repeat("x", 50),
maxDiffSize: 50,
wantDiff: strings.Repeat("x", 50),
},
{
name: "one byte over limit",
diff: strings.Repeat("x", 51),
maxDiffSize: 50,
wantErr: ErrDiffTooLarge,
},
{
name: "disabled limit",
diff: strings.Repeat("x", 10000),
maxDiffSize: -1,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "math.MaxInt64 treated as disabled",
diff: strings.Repeat("x", 10000),
maxDiffSize: math.MaxInt64,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "default limit",
diff: "diff content",
maxDiffSize: 0, // zero means use DefaultMaxDiffSize
wantDiff: "diff content",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte(tt.diff)) //nolint:errcheck // test handler
}))
defer server.Close()
client := NewClient(server.URL, "test-token")
client.MaxDiffSize = tt.maxDiffSize
client.RetryBackoff = []time.Duration{}
got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if tt.wantErr != nil {
if err == nil {
t.Fatal("expected error, got nil")
}
if !errors.Is(err, tt.wantErr) {
t.Errorf("expected %v, got: %v", tt.wantErr, err)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got != tt.wantDiff {
t.Errorf("diff mismatch: got length %d, want length %d", len(got), len(tt.wantDiff))
}
})
}
}
+2 -2
View File
@@ -37,7 +37,7 @@ func TestPostReview_WithComments(t *testing.T) {
{Path: "util.go", NewPosition: 10, Body: "[MINOR] Style issue"}, {Path: "util.go", NewPosition: 10, Body: "[MINOR] Style issue"},
} }
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", comments) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", "", comments)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
@@ -72,7 +72,7 @@ func TestPostReview_NilComments(t *testing.T) {
defer server.Close() defer server.Close()
client := NewClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", nil) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", "", nil)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
+174 -137
View File
@@ -1,6 +1,6 @@
// Package github provides a client for the GitHub API. // Package github provides a client for the GitHub API.
// It supports pull request operations, file content retrieval, CI status checks, // It supports pull request operations, file content retrieval,
// and directory listing for both github.com and GitHub Enterprise. // and review submission for both github.com and GitHub Enterprise.
package github package github
import ( import (
@@ -8,8 +8,10 @@ import (
"errors" "errors"
"fmt" "fmt"
"io" "io"
"log/slog"
"net/http" "net/http"
"net/url" "net/url"
"os"
"strconv" "strconv"
"strings" "strings"
"time" "time"
@@ -17,17 +19,28 @@ import (
const ( const (
defaultBaseURL = "https://api.github.com" defaultBaseURL = "https://api.github.com"
userAgent = "review-bot/1.0"
// maxResponseBytes limits successful response body reads to 10 MiB. // maxRetryAttempts is the number of times doRequest will attempt a request.
maxResponseBytes = 10 * 1024 * 1024 maxRetryAttempts = 3
// maxRetryAfter caps the maximum delay from a Retry-After header to prevent
// a server from stalling the client indefinitely.
maxRetryAfter = 60 * time.Second
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
maxErrorBodyBytes = 64 * 1024 // 64 KB
// maxResponseBodyBytes limits how much of a successful response body we read
// for defense-in-depth against servers returning excessively large payloads.
maxResponseBodyBytes = 10 * 1024 * 1024 // 10 MB
) )
// APIError represents an HTTP error response from the GitHub API. // APIError represents an HTTP error response from the GitHub API.
// It carries the status code so callers can distinguish between // It carries the status code so callers can distinguish between
// different failure modes (e.g. 404 vs 500). // different failure modes (e.g. 404 vs 500).
// //
// The Body field stores up to 4 KiB of the raw response for programmatic // The Body field stores up to 64 KiB of the raw response for programmatic
// inspection. Error() truncates to 200 bytes for safe logging, but callers // inspection. Error() truncates to 200 bytes for safe logging, but callers
// should avoid logging or propagating Body directly in production since it may // should avoid logging or propagating Body directly in production since it may
// contain sensitive details from the upstream server. // contain sensitive details from the upstream server.
@@ -47,13 +60,6 @@ func (e *APIError) Error() string {
return fmt.Sprintf("HTTP %d: %s", e.StatusCode, body) return fmt.Sprintf("HTTP %d: %s", e.StatusCode, body)
} }
// SafeError returns the error string without response body content,
// suitable for logging in contexts where upstream response data should
// not be exposed.
func (e *APIError) SafeError() string {
return fmt.Sprintf("HTTP %d", e.StatusCode)
}
// IsNotFound reports whether an error is an API 404 response. // IsNotFound reports whether an error is an API 404 response.
func IsNotFound(err error) bool { func IsNotFound(err error) bool {
if apiErr, ok := asAPIError(err); ok { if apiErr, ok := asAPIError(err); ok {
@@ -81,85 +87,113 @@ func asAPIError(err error) (*APIError, bool) {
return nil, false return nil, false
} }
// clientConfig holds optional configuration for NewClient.
type clientConfig struct {
allowInsecureHTTP bool
}
// ClientOption configures optional behavior of NewClient.
type ClientOption func(*clientConfig)
// AllowInsecureHTTP permits the client to use HTTP (non-TLS) base URLs.
// This should only be used for trusted internal deployments or testing.
func AllowInsecureHTTP() ClientOption {
return func(c *clientConfig) {
c.allowInsecureHTTP = true
}
}
// Client interacts with the GitHub API. // Client interacts with the GitHub API.
// A Client is safe for concurrent use by multiple goroutines. // A Client is safe for concurrent use by multiple goroutines.
// SetHTTPClient and SetRetryBackoff are intended for test setup only and must // SetHTTPClient and SetRetryBackoff are intended for test setup only and must
// be called before any goroutines issue requests; they have no synchronization. // be called before any goroutines issue requests; they have no synchronization.
type Client struct { type Client struct {
baseURL string // TODO: baseURL is populated by NewClient but not yet consumed by doRequest/doGet.
token string // Higher-level exported methods (GetPullRequest, etc.) will use it to
// construct request URLs; remove this field if those methods end up
// accepting full URLs instead.
baseURL string
token string
httpClient *http.Client
// allowInsecureHTTP permits requests to HTTP (non-TLS) endpoints.
// When false, doRequest rejects URLs with an http:// scheme.
allowInsecureHTTP bool allowInsecureHTTP bool
httpClient *http.Client
// retryBackoff defines the delays between retry attempts for 429 responses. // retryBackoff defines the delays between retry attempts for 429 responses.
// retryBackoff[i] is the delay before attempt i+1 (after attempt i fails). // retryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}. Set to shorter durations in tests via SetRetryBackoff. // If nil, defaults to {1s, 2s}.
retryBackoff []time.Duration retryBackoff []time.Duration
// now returns the current time. Defaults to time.Now.
// Override in tests to control HTTP-date Retry-After calculations.
now func() time.Time
} }
// defaultCheckRedirect is the redirect policy used by NewClient and SetHTTPClient(nil). // defaultCheckRedirect is the redirect policy used by NewClient.
// It rejects HTTPS→HTTP protocol downgrades (to prevent plaintext leakage) and strips // NOTE: This function is intentionally duplicated in gitea/client.go (and vice versa)
// the Authorization header on cross-host redirects to prevent credential leakage to // because the packages are separate. Changes here must be mirrored there.
// third-party hosts (e.g. CDN redirects from GitHub). // It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error { func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 { if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects") return fmt.Errorf("stopped after 10 redirects")
} }
// Guard: net/http guarantees len(via) >= 1 but this is undocumented; // Guard for direct invocation in tests and any future callers;
// defend against zero-length to avoid panic on index out of range. // net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 { if len(via) == 0 {
return nil return nil
} }
prev := via[len(via)-1] prev := via[len(via)-1]
// Reject protocol downgrade: HTTPSHTTP leaks request metadata over plaintext. // Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" { if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect from HTTPS to HTTP (%s %s)", prev.URL.Host, req.URL.Host) return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
} }
// Strip Authorization on cross-host redirect to avoid leaking credentials // Reject cross-host redirect entirely to avoid consuming responses
// to third-party hosts (GitHub legitimately redirects to CDN hosts). // from untrusted endpoints.
if req.URL.Host != prev.URL.Host { if req.URL.Host != prev.URL.Host {
req.Header.Del("Authorization") return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
} }
return nil return nil
} }
// ClientOption configures optional behavior of a Client.
type ClientOption func(*clientConfig)
type clientConfig struct {
allowInsecureHTTP bool
insecureIsTestBypass bool
}
// AllowInsecureHTTP permits sending credentials over plaintext HTTP connections.
// In production, this option is gated by the REVIEW_BOT_ALLOW_INSECURE=1
// environment variable. Without the env var set, the option is ignored
// and a warning is logged.
//
// For tests, use AllowInsecureHTTPForTest (defined in a _test.go file in the same package) which bypasses the env gate.
func AllowInsecureHTTP() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
}
}
// NewClient creates a new GitHub API client. // NewClient creates a new GitHub API client.
// If baseURL is empty, it defaults to https://api.github.com. // If baseURL is empty, it defaults to https://api.github.com.
// For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3). // For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3).
// The baseURL must use HTTPS; pass AllowInsecureHTTP() as an option to permit HTTP
// for trusted internal deployments (e.g. local testing).
func NewClient(token, baseURL string, opts ...ClientOption) *Client { func NewClient(token, baseURL string, opts ...ClientOption) *Client {
if baseURL == "" { if baseURL == "" {
baseURL = defaultBaseURL baseURL = defaultBaseURL
} }
cfg := clientConfig{}
for _, o := range opts { var cfg clientConfig
o(&cfg) for _, opt := range opts {
opt(&cfg)
} }
if cfg.allowInsecureHTTP && !cfg.insecureIsTestBypass {
if os.Getenv("REVIEW_BOT_ALLOW_INSECURE") != "1" {
slog.Warn("AllowInsecureHTTP ignored: set REVIEW_BOT_ALLOW_INSECURE=1 to enable")
cfg.allowInsecureHTTP = false
} else {
slog.Warn("AllowInsecureHTTP enabled — credentials may be sent over plaintext",
"env", "REVIEW_BOT_ALLOW_INSECURE=1")
}
}
return &Client{ return &Client{
baseURL: strings.TrimRight(baseURL, "/"), baseURL: strings.TrimRight(baseURL, "/"),
allowInsecureHTTP: cfg.allowInsecureHTTP,
token: token, token: token,
allowInsecureHTTP: cfg.allowInsecureHTTP,
httpClient: &http.Client{ httpClient: &http.Client{
Timeout: 30 * time.Second, Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect, CheckRedirect: defaultCheckRedirect,
}, },
now: time.Now,
} }
} }
@@ -167,7 +201,7 @@ func NewClient(token, baseURL string, opts ...ClientOption) *Client {
// This is intended for test setup only to inject mock transports; it must be // This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests. // called before any goroutines issue requests.
// //
// Passing nil restores the default client (30s timeout + auth-stripping // Passing nil restores the default client (30s timeout + redirect-rejecting
// CheckRedirect policy matching NewClient). // CheckRedirect policy matching NewClient).
// //
// Callers providing a non-nil client are responsible for configuring a safe // Callers providing a non-nil client are responsible for configuring a safe
@@ -179,56 +213,94 @@ func (c *Client) SetHTTPClient(hc *http.Client) {
Timeout: 30 * time.Second, Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect, CheckRedirect: defaultCheckRedirect,
} }
} else if hc.CheckRedirect == nil {
// Enforce safe redirect policy when caller provides a client without one.
// The default net/http behavior follows up to 10 redirects and forwards
// all headers (including Authorization) to any host, which can leak
// credentials on cross-host redirects.
hc.CheckRedirect = defaultCheckRedirect
} }
c.httpClient = hc c.httpClient = hc
} }
// SetRetryBackoff configures the retry backoff durations for testing. // SetRetryBackoff sets the delays between retry attempts.
// It must be called before any goroutines issue requests. // This is intended for testing to speed up retry tests.
// In production the default {1s, 2s} applies. //
func (c *Client) SetRetryBackoff(d []time.Duration) { // Note: if an empty non-nil slice is provided, Retry-After delays parsed from
c.retryBackoff = d // server responses will be computed and capped but not applied (because
// attempt < len(backoff) is always false). This is acceptable for the
// test-only use case but callers should be aware of this edge case.
func (c *Client) SetRetryBackoff(backoff []time.Duration) {
c.retryBackoff = backoff
}
// parseRetryAfter parses a Retry-After header value, supporting both integer
// seconds (e.g. "120") and HTTP-date format (e.g. "Thu, 01 Dec 2025 16:00:00 GMT")
// as specified in RFC 7231 §7.1.3.
//
// For integer values, it returns the duration directly.
// For HTTP-date values, it computes the delay as the difference between the
// parsed time and now. If the date is in the past, it returns 0.
//
// Returns (0, false) if the value cannot be parsed as either format.
func (c *Client) parseRetryAfter(value string) (time.Duration, bool) {
value = strings.TrimSpace(value)
// Try integer seconds first (most common from GitHub).
// RFC 7231 allows delta-seconds of 0 to indicate immediate retry.
if seconds, err := strconv.Atoi(value); err == nil && seconds >= 0 {
return time.Duration(seconds) * time.Second, true
}
// Try HTTP-date format (RFC 7231 §7.1.3).
// http.ParseTime handles RFC 1123, RFC 850, and ASCTIME formats.
if retryAt, err := http.ParseTime(value); err == nil {
delay := retryAt.Sub(c.now())
if delay < 0 {
delay = 0
}
return delay, true
}
return 0, false
}
// redactURL redacts sensitive components from a URL for safe inclusion in error
// messages and log output. It removes userinfo (e.g., user:pass@) and replaces
// query parameters with a placeholder.
func redactURL(rawURL string) string {
u, err := url.Parse(rawURL)
if err != nil {
return "<unparseable URL>"
}
u.User = nil
if u.RawQuery != "" {
u.RawQuery = "<redacted>"
}
return u.String()
} }
// doRequest performs an HTTP request with retry on 429 rate limit responses. // doRequest performs an HTTP request with retry on 429 rate limit responses.
// It respects the Retry-After header when present (capped at maxRetryAfter). // It respects the Retry-After header when present, supporting both integer
// Transport errors (network failures, context cancellation) are not retried. // seconds and HTTP-date formats (capped at maxRetryAfter).
func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) { func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) {
const maxAttempts = 3 // NOTE: This parses reqURL a second time (http.NewRequestWithContext parses it
const maxRetryAfter = 120 * time.Second // again internally). Acceptable cost: URL parsing is cheap and threading the
// parsed *url.URL through would complicate the interface for negligible gain.
var backoff []time.Duration if !c.allowInsecureHTTP {
if c.retryBackoff != nil {
backoff = make([]time.Duration, len(c.retryBackoff))
copy(backoff, c.retryBackoff)
} else {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
// maxErrorBodyBytes limits how much of an error response body is stored.
// Kept small (4 KiB) to reduce the risk of sensitive data leakage if callers
// log APIError.Body directly. Error() further truncates to 200 bytes.
const maxErrorBodyBytes = 4 * 1024
// Reject non-HTTPS URLs early since the URL is immutable across retries.
if c.token != "" && !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL) parsed, err := url.Parse(reqURL)
if err != nil { if err != nil {
return nil, fmt.Errorf("parse request URL: %w", err) return nil, fmt.Errorf("parse request URL: %w", err)
} }
if !strings.EqualFold(parsed.Scheme, "https") { if strings.EqualFold(parsed.Scheme, "http") {
return nil, fmt.Errorf("refusing to send credentials over non-HTTPS URL %q (use AllowInsecureHTTP option for trusted networks)", reqURL) return nil, fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
} }
} }
var backoff []time.Duration
if c.retryBackoff != nil {
backoff = append([]time.Duration(nil), c.retryBackoff...)
} else {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
var lastErr error var lastErr error
for attempt := 0; attempt < maxAttempts; attempt++ { for attempt := 0; attempt < maxRetryAttempts; attempt++ {
if attempt > 0 { if attempt > 0 {
var delay time.Duration var delay time.Duration
if attempt-1 < len(backoff) { if attempt-1 < len(backoff) {
@@ -250,13 +322,7 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
if err != nil { if err != nil {
return nil, fmt.Errorf("create request: %w", err) return nil, fmt.Errorf("create request: %w", err)
} }
if c.token != "" { req.Header.Set("Authorization", "Bearer "+c.token)
// Bearer is the OAuth2 standard and is accepted by GitHub for both
// classic PATs and fine-grained tokens. The alternative "token" scheme
// is GitHub-specific and offers no additional compatibility.
req.Header.Set("Authorization", "Bearer "+c.token)
}
req.Header.Set("User-Agent", userAgent)
if accept != "" { if accept != "" {
req.Header.Set("Accept", accept) req.Header.Set("Accept", accept)
} else { } else {
@@ -265,34 +331,29 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
resp, err := c.httpClient.Do(req) resp, err := c.httpClient.Do(req)
if err != nil { if err != nil {
// Transport errors (DNS, TLS, timeout) yield nil resp; no body to close.
return nil, fmt.Errorf("do request: %w", err) return nil, fmt.Errorf("do request: %w", err)
} }
body, done, err := handleResponse(resp, maxResponseBytes, maxErrorBodyBytes) if resp.StatusCode >= 200 && resp.StatusCode < 300 {
if done { body, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBodyBytes))
return body, err resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read response body: %w", err)
}
return body, nil
} }
lastErr = err
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Retry on 429 rate limit // Retry on 429 rate limit
if resp.StatusCode == http.StatusTooManyRequests && attempt < maxAttempts-1 { if resp.StatusCode == http.StatusTooManyRequests && attempt < maxRetryAttempts-1 {
// Check for Retry-After header and override backoff if present. // Check for Retry-After header and override backoff if present.
// Supports both integer seconds (common) and HTTP-date format (RFC 7231). // Supports both integer seconds (common) and HTTP-date format (RFC 7231).
if ra := resp.Header.Get("Retry-After"); ra != "" { if ra := resp.Header.Get("Retry-After"); ra != "" {
if seconds, err := strconv.Atoi(ra); err == nil && seconds > 0 { if delay, ok := c.parseRetryAfter(ra); ok {
delay := time.Duration(seconds) * time.Second
if delay > maxRetryAfter {
delay = maxRetryAfter
}
if attempt < len(backoff) {
backoff[attempt] = delay
}
} else if retryAt, err := http.ParseTime(ra); err == nil {
delay := time.Until(retryAt)
if delay < 0 {
delay = 0
}
if delay > maxRetryAfter { if delay > maxRetryAfter {
delay = maxRetryAfter delay = maxRetryAfter
} }
@@ -311,31 +372,7 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
return nil, lastErr return nil, lastErr
} }
// handleResponse reads and closes the response body, returning the result.
// It uses defer to ensure the body is always closed regardless of code path.
// Returns (body, done, err) where done=true means the caller should return immediately.
func handleResponse(resp *http.Response, maxRespBytes int, maxErrBytes int) ([]byte, bool, error) {
defer resp.Body.Close()
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(io.LimitReader(resp.Body, int64(maxRespBytes)+1))
if err != nil {
return nil, true, fmt.Errorf("read response body: %w", err)
}
if len(body) > maxRespBytes {
return nil, true, fmt.Errorf("response body exceeded %d bytes (truncated)", maxRespBytes)
}
return body, true, nil
}
errBody, readErr := io.ReadAll(io.LimitReader(resp.Body, int64(maxErrBytes)))
if readErr != nil && len(errBody) == 0 {
errBody = []byte(fmt.Sprintf("[error reading response body: %v]", readErr))
}
return nil, false, &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
}
// doGet is a convenience wrapper for GET requests with the default Accept header. // doGet is a convenience wrapper for GET requests with the default Accept header.
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) { func (c *Client) doGet(ctx context.Context, url string) ([]byte, error) {
return c.doRequest(ctx, http.MethodGet, reqURL, "") return c.doRequest(ctx, http.MethodGet, url, "")
} }
+482 -423
View File
File diff suppressed because it is too large Load Diff
-13
View File
@@ -1,13 +0,0 @@
package github_test
import (
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// Compile-time interface conformance assertions.
// These verify github.Client satisfies vcs.PRReader and vcs.FileReader.
var (
_ vcs.PRReader = (*github.Client)(nil)
_ vcs.FileReader = (*github.Client)(nil)
)
+13
View File
@@ -0,0 +1,13 @@
package github
// AllowInsecureHTTPForTest permits sending credentials over plaintext HTTP
// without requiring the REVIEW_BOT_ALLOW_INSECURE environment variable.
// This is intended exclusively for test code using httptest.Server.
//
// Defined in a _test.go file so it is only available to test binaries.
func AllowInsecureHTTPForTest() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
cfg.insecureIsTestBypass = true
}
}
-135
View File
@@ -1,135 +0,0 @@
package github
import (
"context"
"encoding/base64"
"encoding/json"
"fmt"
"net/url"
"strings"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// GetFileContent fetches a file from a repo at the given ref.
// Delegates to GetFileContentAtRef with the provided ref.
func (c *Client) GetFileContent(ctx context.Context, owner, repo, path, ref string) (string, error) {
return c.GetFileContentAtRef(ctx, owner, repo, path, ref)
}
// GetFileContentAtRef fetches a file at a specific ref from a repo.
// If ref is empty, the query parameter is omitted (uses default branch).
//
// Note: dot-segments ("." and "..") in the path are silently removed to
// prevent path traversal. This means a path like "foo/../bar" resolves
// to "foo/bar" rather than "bar".
func (c *Client) GetFileContentAtRef(ctx context.Context, owner, repo, path, ref string) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(path))
if ref != "" {
reqURL += "?ref=" + url.QueryEscape(ref)
}
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch file %s: %w", path, err)
}
var resp struct {
Content string `json:"content"`
Encoding string `json:"encoding"`
}
if err := json.Unmarshal(body, &resp); err != nil {
return "", fmt.Errorf("parse file content JSON: %w", err)
}
if resp.Encoding != "base64" {
return "", fmt.Errorf("unexpected encoding %q for file %s", resp.Encoding, path)
}
decoded, err := decodeBase64Content(resp.Content)
if err != nil {
return "", fmt.Errorf("decode base64 content for %s: %w", path, err)
}
return decoded, nil
}
// ListContents lists files and directories at a given path in a repo.
// Returns the directory listing from the GitHub contents API.
// If the path points to a single file (not a directory), the API returns
// a JSON object instead of an array; this is handled by returning a
// single-element slice.
//
// Note: dot-segments ("." and "..") in the path are silently removed to
// prevent path traversal. This means a path like "foo/../bar" resolves
// to "foo/bar" rather than "bar".
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]vcs.ContentEntry, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(path))
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list contents %s: %w", path, err)
}
type entry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"`
}
// The GitHub contents API returns an array for directories and an object
// for single files. Try array first (common case), then fall back to object.
// An empty array ([]) is valid — it represents an empty directory — and
// results in a zero-length slice returned without error.
var entries []entry
if err := json.Unmarshal(body, &entries); err != nil {
var single entry
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: as array: %w; as object: %w", err, err2)
}
// Guard against empty objects ({}) or unexpected shapes that
// unmarshal successfully but carry no useful data.
if single.Name == "" && single.Path == "" && single.Type == "" {
return nil, fmt.Errorf("parse contents JSON: unexpected response format")
}
entries = []entry{single}
}
result := make([]vcs.ContentEntry, len(entries))
for i, e := range entries {
result[i] = vcs.ContentEntry{
Name: e.Name,
Path: e.Path,
Type: e.Type,
}
}
return result, nil
}
// escapePath escapes each segment of a relative file path for use in URLs.
// Slashes are preserved as path separators; other special characters are escaped.
// Dot-segments ("." and "..") and empty segments (from consecutive slashes like
// "a//b") are silently removed to prevent path traversal and produce canonical
// paths. This is intentional: callers may receive a different path than requested
// without error. The function is package-private, and all callers
// (GetFileContentAtRef, ListContents) already handle missing-file errors from the
// API if the cleaned path doesn't match what the caller intended.
func escapePath(p string) string {
parts := strings.Split(p, "/")
var clean []string
for _, part := range parts {
if part == "." || part == ".." || part == "" {
continue
}
clean = append(clean, url.PathEscape(part))
}
return strings.Join(clean, "/")
}
// decodeBase64Content decodes base64-encoded content from the GitHub contents API.
// GitHub returns base64 content with line breaks for formatting; we strip \r and \n before decoding.
func decodeBase64Content(encoded string) (string, error) {
// GitHub inserts newlines in base64 content
cleaned := strings.NewReplacer("\n", "", "\r", "").Replace(encoded)
decoded, err := base64.StdEncoding.DecodeString(cleaned)
if err != nil {
return "", err
}
return string(decoded), nil
}
-334
View File
@@ -1,334 +0,0 @@
package github
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
)
func TestGetFileContent_DelegatesToGetFileContentAtRef(t *testing.T) {
var gotRef string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotRef = r.URL.Query().Get("ref")
json.NewEncoder(w).Encode(map[string]string{
"content": "dGVzdA==", // "test" in base64
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
// Call with empty ref — should not include ref param
content, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "test" {
t.Errorf("expected 'test', got %q", content)
}
if gotRef != "" {
t.Errorf("expected empty ref, got %q", gotRef)
}
}
func TestGetFileContent_WithRef(t *testing.T) {
var gotRef string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotRef = r.URL.Query().Get("ref")
json.NewEncoder(w).Encode(map[string]string{
"content": "dGVzdA==",
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "abc123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if gotRef != "abc123" {
t.Errorf("expected ref 'abc123', got %q", gotRef)
}
}
func TestGetFileContent_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "missing.go", "")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetFileContent_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestGetFileContent_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode(map[string]string{
"content": "b2s=",
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
c.SetRetryBackoff([]time.Duration{1 * time.Millisecond})
content, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "ok" {
t.Errorf("expected 'ok', got %q", content)
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestGetFileContent_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestListContents_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/src" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
json.NewEncoder(w).Encode([]map[string]string{
{"name": "main.go", "path": "src/main.go", "type": "file"},
{"name": "lib", "path": "src/lib", "type": "dir"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
entries, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 2 {
t.Fatalf("expected 2 entries, got %d", len(entries))
}
if entries[0].Name != "main.go" {
t.Errorf("expected name 'main.go', got %q", entries[0].Name)
}
if entries[0].Path != "src/main.go" {
t.Errorf("expected path 'src/main.go', got %q", entries[0].Path)
}
if entries[0].Type != "file" {
t.Errorf("expected type 'file', got %q", entries[0].Type)
}
if entries[1].Name != "lib" {
t.Errorf("expected name 'lib', got %q", entries[1].Name)
}
if entries[1].Type != "dir" {
t.Errorf("expected type 'dir', got %q", entries[1].Type)
}
}
func TestListContents_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "missing")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestListContents_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestListContents_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode([]map[string]string{
{"name": "file.go", "path": "file.go", "type": "file"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
c.SetRetryBackoff([]time.Duration{1 * time.Millisecond})
entries, err := c.ListContents(context.Background(), "owner", "repo", ".")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestListContents_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestDecodeBase64Content(t *testing.T) {
// Test with newlines (GitHub's format)
encoded := "cGFja2FnZSBt\nYWlu"
decoded, err := decodeBase64Content(encoded)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if decoded != "package main" {
t.Errorf("expected 'package main', got %q", decoded)
}
}
func TestDecodeBase64Content_Invalid(t *testing.T) {
_, err := decodeBase64Content("not!!!valid!!!base64")
if err == nil {
t.Fatal("expected error for invalid base64")
}
}
func TestEscapePath_RejectsDotSegments(t *testing.T) {
tests := []struct {
input string
want string
}{
{"src/main.go", "src/main.go"},
{"../etc/passwd", "etc/passwd"},
{"./src/../main.go", "src/main.go"},
{"a/b/c", "a/b/c"},
{"file with spaces.go", "file%20with%20spaces.go"},
{"a/./b/../c", "a/b/c"},
}
for _, tt := range tests {
got := escapePath(tt.input)
if got != tt.want {
t.Errorf("escapePath(%q) = %q, want %q", tt.input, got, tt.want)
}
}
}
func TestDecodeBase64Content_CRLF(t *testing.T) {
// Base64 of "hello world" with CRLF line breaks inserted
encoded := "aGVs\r\nbG8g\r\nd29y\r\nbGQ="
decoded, err := decodeBase64Content(encoded)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if decoded != "hello world" {
t.Errorf("expected 'hello world', got %q", decoded)
}
}
func TestListContents_SingleFile(t *testing.T) {
// GitHub Contents API returns a JSON object (not array) for single-file paths
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`{"name":"README.md","path":"README.md","type":"file"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
entries, err := c.ListContents(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected name 'README.md', got %q", entries[0].Name)
}
if entries[0].Type != "file" {
t.Errorf("expected type 'file', got %q", entries[0].Type)
}
}
-212
View File
@@ -1,212 +0,0 @@
package github
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/url"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// pullRequestResponse is the GitHub API response for a pull request.
type pullRequestResponse struct {
Number int `json:"number"`
Title string `json:"title"`
Body string `json:"body"`
Head struct {
SHA string `json:"sha"`
Ref string `json:"ref"`
} `json:"head"`
Base struct {
Ref string `json:"ref"`
} `json:"base"`
}
// changedFileResponse is the GitHub API response for a changed file in a PR.
type changedFileResponse struct {
Filename string `json:"filename"`
Status string `json:"status"`
Patch string `json:"patch"`
}
// commitStatusResponse is the GitHub combined status API response.
type commitStatusResponse struct {
Statuses []struct {
Context string `json:"context"`
State string `json:"state"`
Description string `json:"description"`
TargetURL string `json:"target_url"`
} `json:"statuses"`
}
// checkRunsResponse is the GitHub check runs API response.
type checkRunsResponse struct {
CheckRuns []struct {
Name string `json:"name"`
Conclusion *string `json:"conclusion"`
Status string `json:"status"`
HTMLURL string `json:"html_url"`
} `json:"check_runs"`
}
// GetPullRequest fetches PR metadata.
func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number int) (*vcs.PullRequest, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR: %w", err)
}
var resp pullRequestResponse
if err := json.Unmarshal(body, &resp); err != nil {
return nil, fmt.Errorf("parse PR JSON: %w", err)
}
return &vcs.PullRequest{
Number: resp.Number,
Title: resp.Title,
Body: resp.Body,
Head: vcs.HeadRef{SHA: resp.Head.SHA, Ref: resp.Head.Ref},
Base: vcs.BaseRef{Ref: resp.Base.Ref},
}, nil
}
// GetPullRequestDiff fetches the unified diff for a PR.
// Uses Accept: application/vnd.github.diff to get raw diff text.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doRequest(ctx, http.MethodGet, reqURL, "application/vnd.github.diff")
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
// maxPages is the upper bound on pagination loops to prevent unbounded iteration
// in case the server returns a full page indefinitely.
const maxPages = 100
// GetPullRequestFiles fetches the list of files changed in a PR.
// Paginates through all pages (100 per page) to collect all files.
// Returns nil (not an empty slice) when the PR has no changed files.
// Callers can safely range over or check len() on a nil slice.
func (c *Client) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]vcs.ChangedFile, error) {
var allFiles []vcs.ChangedFile
for page := 1; page <= maxPages; page++ {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/files?per_page=100&page=%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, page)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR files page %d: %w", page, err)
}
var files []changedFileResponse
if err := json.Unmarshal(body, &files); err != nil {
return nil, fmt.Errorf("parse PR files JSON: %w", err)
}
if len(files) == 0 {
break
}
for _, f := range files {
allFiles = append(allFiles, vcs.ChangedFile{
Filename: f.Filename,
Status: f.Status,
Patch: f.Patch,
})
}
if len(files) < 100 {
break
}
}
return allFiles, nil
}
// GetCommitStatuses fetches both commit statuses and check runs for a SHA,
// merging them into a unified []vcs.CommitStatus slice.
// Returns nil (not an empty slice) when there are no statuses or check runs.
// If the commit statuses endpoint fails (e.g. 404 for an unknown SHA), the
// function returns immediately without attempting the check-runs endpoint.
// If the check-runs endpoint fails after statuses were fetched successfully,
// the function returns an error (not a partial result) so callers always get
// either a complete view or a clear error signal.
func (c *Client) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]vcs.CommitStatus, error) {
var result []vcs.CommitStatus
// Fetch commit statuses
statusURL := fmt.Sprintf("%s/repos/%s/%s/commits/%s/status",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), url.PathEscape(sha))
statusBody, err := c.doGet(ctx, statusURL)
if err != nil {
return nil, fmt.Errorf("fetch commit statuses: %w", err)
}
var statusResp commitStatusResponse
if err := json.Unmarshal(statusBody, &statusResp); err != nil {
return nil, fmt.Errorf("parse commit statuses JSON: %w", err)
}
for _, s := range statusResp.Statuses {
result = append(result, vcs.CommitStatus{
Context: s.Context,
Status: s.State,
Description: s.Description,
TargetURL: s.TargetURL,
})
}
// Fetch check runs (paginated)
for checkPage := 1; checkPage <= maxPages; checkPage++ {
checkURL := fmt.Sprintf("%s/repos/%s/%s/commits/%s/check-runs?per_page=100&page=%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), url.PathEscape(sha), checkPage)
checkBody, err := c.doGet(ctx, checkURL)
if err != nil {
return nil, fmt.Errorf("fetch check runs page %d: %w", checkPage, err)
}
var checkResp checkRunsResponse
if err := json.Unmarshal(checkBody, &checkResp); err != nil {
return nil, fmt.Errorf("parse check runs JSON: %w", err)
}
for _, cr := range checkResp.CheckRuns {
result = append(result, vcs.CommitStatus{
Context: cr.Name,
Status: mapCheckRunStatus(cr.Conclusion),
Description: derefString(cr.Conclusion),
TargetURL: cr.HTMLURL,
})
}
if len(checkResp.CheckRuns) < 100 {
break
}
}
return result, nil
}
// mapCheckRunStatus maps a check run conclusion to a vcs.CommitStatus status string.
// Conclusion alone determines the mapped state: nil conclusion means the run is
// still in progress (pending), regardless of the status field value.
func mapCheckRunStatus(conclusion *string) string {
if conclusion == nil {
// Still running or queued
return "pending"
}
switch *conclusion {
case "success":
return "success"
case "failure", "action_required", "timed_out":
return "failure"
case "cancelled", "skipped", "neutral":
return "success" // non-blocking: these do not indicate a blocking failure per GitHub check suite semantics
case "stale", "waiting":
return "pending"
default:
return "pending"
}
}
// derefString safely dereferences a string pointer, returning empty string if nil.
func derefString(s *string) string {
if s == nil {
return ""
}
return *s
}
-637
View File
@@ -1,637 +0,0 @@
package github
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestGetPullRequest_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/pulls/42" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
json.NewEncoder(w).Encode(map[string]interface{}{
"number": 42,
"title": "Test PR",
"body": "Description",
"head": map[string]string{"sha": "abc123", "ref": "feature-branch"},
"base": map[string]string{"ref": "main"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
pr, err := c.GetPullRequest(context.Background(), "owner", "repo", 42)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if pr.Number != 42 {
t.Errorf("expected number 42, got %d", pr.Number)
}
if pr.Title != "Test PR" {
t.Errorf("expected title 'Test PR', got %q", pr.Title)
}
if pr.Body != "Description" {
t.Errorf("expected body 'Description', got %q", pr.Body)
}
if pr.Head.SHA != "abc123" {
t.Errorf("expected head SHA 'abc123', got %q", pr.Head.SHA)
}
if pr.Head.Ref != "feature-branch" {
t.Errorf("expected head ref 'feature-branch', got %q", pr.Head.Ref)
}
if pr.Base.Ref != "main" {
t.Errorf("expected base ref 'main', got %q", pr.Base.Ref)
}
}
func TestGetPullRequest_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 999)
if err == nil {
t.Fatal("expected error for 404")
}
if !IsNotFound(err) {
t.Errorf("expected IsNotFound=true, got error: %v", err)
}
}
func TestGetPullRequest_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error for 401")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized=true, got error: %v", err)
}
}
func TestGetPullRequest_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode(map[string]interface{}{
"number": 1,
"title": "PR",
"body": "",
"head": map[string]string{"sha": "abc", "ref": "br"},
"base": map[string]string{"ref": "main"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
c.SetRetryBackoff([]time.Duration{1 * time.Millisecond})
pr, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if pr.Number != 1 {
t.Errorf("expected number 1, got %d", pr.Number)
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestGetPullRequest_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`{invalid json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error for malformed JSON")
}
if !strings.Contains(err.Error(), "parse PR JSON") {
t.Errorf("expected parse error, got: %v", err)
}
}
func TestGetPullRequestDiff_HappyPath(t *testing.T) {
expectedDiff := "diff --git a/file.go b/file.go\n--- a/file.go\n+++ b/file.go\n@@ -1,3 +1,4 @@\n+// new line\n"
var gotAccept string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotAccept = r.Header.Get("Accept")
w.WriteHeader(200)
w.Write([]byte(expectedDiff))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
diff, err := c.GetPullRequestDiff(context.Background(), "owner", "repo", 42)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if diff != expectedDiff {
t.Errorf("unexpected diff: %q", diff)
}
if gotAccept != "application/vnd.github.diff" {
t.Errorf("expected diff Accept header, got %q", gotAccept)
}
}
func TestGetPullRequestDiff_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequestDiff(context.Background(), "owner", "repo", 999)
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetPullRequestDiff_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestGetPullRequestFiles_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
json.NewEncoder(w).Encode([]map[string]interface{}{
{"filename": "main.go", "status": "modified", "patch": "@@ -1,3 +1,4 @@\n+line"},
{"filename": "test.go", "status": "added", "patch": "@@ -0,0 +1,5 @@\n+new file"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
files, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(files) != 2 {
t.Fatalf("expected 2 files, got %d", len(files))
}
if files[0].Filename != "main.go" {
t.Errorf("expected filename 'main.go', got %q", files[0].Filename)
}
if files[0].Status != "modified" {
t.Errorf("expected status 'modified', got %q", files[0].Status)
}
if files[0].Patch != "@@ -1,3 +1,4 @@\n+line" {
t.Errorf("unexpected patch: %q", files[0].Patch)
}
}
func TestGetPullRequestFiles_Pagination(t *testing.T) {
// Simulate > 100 files requiring pagination
page1Files := make([]map[string]string, 100)
for i := 0; i < 100; i++ {
page1Files[i] = map[string]string{
"filename": fmt.Sprintf("file%d.go", i),
"status": "modified",
"patch": fmt.Sprintf("patch%d", i),
}
}
page2Files := []map[string]string{
{"filename": "file100.go", "status": "added", "patch": "patch100"},
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
page := r.URL.Query().Get("page")
if page == "" || page == "1" {
json.NewEncoder(w).Encode(page1Files)
} else {
json.NewEncoder(w).Encode(page2Files)
}
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
files, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(files) != 101 {
t.Errorf("expected 101 files (paginated), got %d", len(files))
}
if files[100].Filename != "file100.go" {
t.Errorf("expected last file 'file100.go', got %q", files[100].Filename)
}
if files[100].Patch != "patch100" {
t.Errorf("expected last patch 'patch100', got %q", files[100].Patch)
}
}
func TestGetPullRequestFiles_BinaryFile_NoPatch(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Binary files have no patch field in GitHub response
json.NewEncoder(w).Encode([]map[string]interface{}{
{"filename": "image.png", "status": "added"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
files, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(files) != 1 {
t.Fatalf("expected 1 file, got %d", len(files))
}
if files[0].Patch != "" {
t.Errorf("expected empty patch for binary file, got %q", files[0].Patch)
}
}
func TestGetPullRequestFiles_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 999)
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetPullRequestFiles_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestGetFileContentAtRef_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/path/to/file.go" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
if r.URL.Query().Get("ref") != "abc123" {
t.Errorf("unexpected ref: %s", r.URL.Query().Get("ref"))
}
json.NewEncoder(w).Encode(map[string]string{
"content": "cGFja2FnZSBtYWlu", // "package main" in base64
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
content, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "path/to/file.go", "abc123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "package main" {
t.Errorf("expected 'package main', got %q", content)
}
}
func TestGetFileContentAtRef_EmptyRef(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Query().Get("ref") != "" {
t.Errorf("expected no ref param, got %q", r.URL.Query().Get("ref"))
}
json.NewEncoder(w).Encode(map[string]string{
"content": "aGVsbG8=", // "hello" in base64
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
content, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "file.txt", "")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "hello" {
t.Errorf("expected 'hello', got %q", content)
}
}
func TestGetFileContentAtRef_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "missing.go", "main")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetFileContentAtRef_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "file.go", "main")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestGetFileContentAtRef_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not valid json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "file.go", "main")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestGetFileContentAtRef_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode(map[string]string{
"content": "b2s=", // "ok" in base64
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
c.SetRetryBackoff([]time.Duration{1 * time.Millisecond})
content, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "file.go", "main")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "ok" {
t.Errorf("expected 'ok', got %q", content)
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestGetCommitStatuses_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch {
case strings.Contains(r.URL.Path, "/status"):
json.NewEncoder(w).Encode(map[string]interface{}{
"state": "success",
"statuses": []map[string]string{
{
"context": "ci/build",
"state": "success",
"description": "Build passed",
"target_url": "https://ci.example.com/1",
},
},
})
case strings.Contains(r.URL.Path, "/check-runs"):
conclusion := "success"
json.NewEncoder(w).Encode(map[string]interface{}{
"total_count": 1,
"check_runs": []map[string]interface{}{
{
"name": "lint",
"conclusion": &conclusion,
"status": "completed",
"html_url": "https://github.com/check/1",
},
},
})
default:
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(404)
}
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
statuses, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "abc123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(statuses) != 2 {
t.Fatalf("expected 2 statuses, got %d", len(statuses))
}
// First should be from commit statuses
if statuses[0].Context != "ci/build" {
t.Errorf("expected context 'ci/build', got %q", statuses[0].Context)
}
if statuses[0].Status != "success" {
t.Errorf("expected status 'success', got %q", statuses[0].Status)
}
// Second should be from check runs
if statuses[1].Context != "lint" {
t.Errorf("expected context 'lint', got %q", statuses[1].Context)
}
if statuses[1].Status != "success" {
t.Errorf("expected status 'success', got %q", statuses[1].Status)
}
}
func TestGetCommitStatuses_CheckRunConclusions(t *testing.T) {
tests := []struct {
conclusion *string
status string
want string
}{
{stringPtr("success"), "completed", "success"},
{stringPtr("failure"), "completed", "failure"},
{stringPtr("action_required"), "completed", "failure"},
{stringPtr("timed_out"), "completed", "failure"},
{stringPtr("cancelled"), "completed", "success"},
{stringPtr("skipped"), "completed", "success"},
{stringPtr("neutral"), "completed", "success"},
{nil, "in_progress", "pending"},
{nil, "queued", "pending"},
}
for _, tt := range tests {
name := "nil"
if tt.conclusion != nil {
name = *tt.conclusion
}
t.Run(name, func(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if strings.Contains(r.URL.Path, "/status") {
json.NewEncoder(w).Encode(map[string]interface{}{
"state": "success",
"statuses": []interface{}{},
})
return
}
json.NewEncoder(w).Encode(map[string]interface{}{
"total_count": 1,
"check_runs": []map[string]interface{}{
{
"name": "check",
"conclusion": tt.conclusion,
"status": tt.status,
"html_url": "https://github.com/check/1",
},
},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
statuses, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "sha1")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(statuses) != 1 {
t.Fatalf("expected 1 status, got %d", len(statuses))
}
if statuses[0].Status != tt.want {
t.Errorf("expected status %q, got %q", tt.want, statuses[0].Status)
}
})
}
}
func TestGetCommitStatuses_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "badsha")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetCommitStatuses_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "sha")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestGetCommitStatuses_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "sha")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func stringPtr(s string) *string {
return &s
}
+1 -1
View File
@@ -2,4 +2,4 @@ module gitea.weiker.me/rodin/review-bot
go 1.26.2 go 1.26.2
require gopkg.in/yaml.v3 v3.0.1 require github.com/goccy/go-yaml v1.19.2
+2 -4
View File
@@ -1,4 +1,2 @@
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM= github.com/goccy/go-yaml v1.19.2 h1:PmFC1S6h8ljIz6gMRBopkjP1TVT7xuwrButHID66PoM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= github.com/goccy/go-yaml v1.19.2/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+146 -38
View File
@@ -5,12 +5,15 @@ import (
"embed" "embed"
"encoding/json" "encoding/json"
"fmt" "fmt"
"io"
"os" "os"
"sort" "sort"
"strings" "strings"
"unicode/utf8" "unicode/utf8"
"gopkg.in/yaml.v3" "github.com/goccy/go-yaml"
"github.com/goccy/go-yaml/ast"
"github.com/goccy/go-yaml/parser"
) )
//go:embed personas/*.yaml //go:embed personas/*.yaml
@@ -118,9 +121,7 @@ func ListBuiltinPersonas() []string {
default: default:
continue continue
} }
if !seen[personaName] { seen[personaName] = true
seen[personaName] = true
}
} }
names := make([]string, 0, len(seen)) names := make([]string, 0, len(seen))
for name := range seen { for name := range seen {
@@ -142,10 +143,19 @@ func parsePersona(data []byte, source string) (*Persona, error) {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth) err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else { } else {
// Use json.Decoder with DisallowUnknownFields for consistency with // Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's KnownFields(true) - both reject unknown fields to catch typos. // YAML's Strict() - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data)) dec := json.NewDecoder(bytes.NewReader(data))
dec.DisallowUnknownFields() dec.DisallowUnknownFields()
err = dec.Decode(&p) err = dec.Decode(&p)
if err == nil {
// Reject trailing content after the first valid JSON object.
// Without this check, input like `{"name":"x"}garbage` would
// silently succeed because Decoder stops after one object.
var dummy json.RawMessage
if err2 := dec.Decode(&dummy); err2 != io.EOF {
err = fmt.Errorf("unexpected trailing content after JSON object")
}
}
} }
if err != nil { if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err) return nil, fmt.Errorf("parse persona %s: %w", source, err)
@@ -156,70 +166,164 @@ func parsePersona(data []byte, source string) (*Persona, error) {
return &p, nil return &p, nil
} }
// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting // unmarshalYAMLWithDepthLimit unmarshals YAML data with three safety checks:
// and strict field checking. This protects against stack exhaustion from deeply // - Depth limiting: rejects AST trees exceeding maxDepth to prevent stack exhaustion.
// nested structures and catches typos in field names. // - Multi-document rejection: prevents silent data loss from ignored extra documents.
// Multi-document YAML files are rejected to prevent silent data loss. // - Strict field checking: rejects unknown YAML keys to catch typos early.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error { func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
// First pass: decode into a yaml.Node to check depth limits and node counts. // First pass: parse into AST to check depth limits, node counts, and
// This prevents stack exhaustion before we attempt to decode into structs. // multi-document rejection. This prevents stack exhaustion before we
var node yaml.Node // attempt to decode into structs.
dec := yaml.NewDecoder(bytes.NewReader(data)) file, err := parser.ParseBytes(data, 0)
if err := dec.Decode(&node); err != nil { if err != nil {
return err return err
} }
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
// The parser returns a single doc with nil body for these cases.
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
return fmt.Errorf("empty YAML document")
}
// Reject multi-document YAML files - silently ignoring additional documents // Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect. // could lead to confusing behavior where users think their changes take effect.
var extra yaml.Node if len(file.Docs) > 1 {
if dec.Decode(&extra) == nil {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed") return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
} }
nodeCount := 0 nodeCount := 0
if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil { if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]int), make(map[ast.Node]bool), &nodeCount); err != nil {
return err return err
} }
// Second pass: decode with strict field checking enabled. // Second pass: decode with strict field checking enabled.
// KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy". // Strict() rejects unknown keys, catching typos like "focuss" or "identiy".
// We must re-decode from the original data because yaml.Node.Decode() doesn't //
// support the KnownFields option. // Safety note: goccy/go-yaml's decoder does not expand YAML aliases
strictDec := yaml.NewDecoder(bytes.NewReader(data)) // recursively — it resolves them via the pre-built AST, which our first
strictDec.KnownFields(true) // pass already depth-checked. Alias chains that would exceed depth limits
return strictDec.Decode(out) // are caught above; the decoder merely reads the resolved scalar values.
dec := yaml.NewDecoder(bytes.NewReader(data), yaml.Strict())
return dec.Decode(out)
} }
// checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit // checkYAMLDepth recursively checks that YAML AST nodes don't exceed the depth
// or the total node count limit. It also detects alias cycles to prevent infinite // limit or the total node count limit. It uses two tracking maps:
// recursion from crafted YAML with self-referential aliases. // - validated: maps each node to the maximum depth at which it was previously
func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error { // checked. If a node is revisited at a deeper depth (e.g., via an alias),
// we re-check it to ensure the combined effective depth doesn't exceed limits.
// - visiting: per-path recursion stack for true cycle detection. A node on the
// current path is a cycle (alias loop); we return nil to avoid infinite recursion.
//
// This design prevents the alias depth bypass where an anchored subtree validated
// at a shallow depth could be referenced via alias at a greater depth, effectively
// exceeding MaxYAMLDepth.
func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[ast.Node]int, visiting map[ast.Node]bool, nodeCount *int) error {
if node == nil {
return nil
}
if depth > maxDepth { if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth) return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
} }
// Cycle detection: if we're currently visiting this node on the current
// recursion path, it's a cycle (e.g., alias pointing to an ancestor).
// Return nil to break the cycle without error — cycles are a structural
// property, not a depth violation.
if visiting[node] {
return nil
}
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks. // Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
// Placed after cycle detection but before the depth-aware short-circuit. This means
// nodes revisited at shallower depths (via aliases) are counted each time they are
// encountered — intentional conservative overcounting. This bounds the total work
// performed during validation rather than tracking unique nodes, which is the safer
// security posture for untrusted YAML input.
*nodeCount++ *nodeCount++
if *nodeCount > maxNodes { if *nodeCount > maxNodes {
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes) return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
} }
// Cycle detection: if we've seen this node before, we're in a cycle. // Depth-aware short-circuit: skip re-validation only when the current visit
if _, ok := seen[node]; ok { // depth is the same or shallower than the depth at which this node was
return nil // Already validated this subtree, skip to avoid infinite recursion. // previously validated. A shallower (or equal) current depth means the
// prior, deeper validation already covered any subtree depth violations.
// If the current depth exceeds the previous validation depth (e.g., an alias
// references this node deeper in the tree), we must re-traverse to ensure
// the combined effective depth doesn't exceed maxDepth.
//
// Note: using ast.Node (interface) as map key relies on pointer identity,
// which is correct because all goccy/go-yaml AST node types are pointer
// receivers (*MappingNode, *SequenceNode, etc.), never value types.
if prevDepth, ok := validated[node]; ok && depth <= prevDepth {
return nil
} }
seen[node] = struct{}{} validated[node] = depth
// Handle alias nodes: follow the alias to its anchor target. // Mark as visiting (on the current recursion path) for cycle detection.
// Increment depth when following aliases since they expand the effective structure. visiting[node] = true
if node.Kind == yaml.AliasNode && node.Alias != nil { defer func() { visiting[node] = false }()
return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
}
for _, child := range node.Content { // Walk children based on node type.
if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil { switch n := node.(type) {
case *ast.MappingNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.MappingValueNode:
// Both Key and Value are visited at depth+1 relative to this
// MappingValueNode. Since MappingNode visits its MappingValueNode
// children at depth+1 as well, keys and values end up at depth+2
// from the parent MappingNode. This is intentional: it mirrors the
// actual nesting structure (mapping → key-value pair → key/value).
if err := checkYAMLDepth(n.Key, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err return err
} }
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.SequenceNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.AliasNode:
// Follow alias to its target, incrementing depth since aliases expand
// the effective structure.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.AnchorNode:
// Increment depth for anchor values as a conservative measure: the
// anchor definition itself is structural, and treating it as a depth
// level ensures that deeply nested anchors are caught at definition
// time rather than only when referenced via alias. This +1 is
// asymmetric with alias (which also increments) — by design, the
// effective depth budget for anchored-then-aliased content is reduced
// because both the definition site and the reference site each consume
// a level, making deeply nested anchor/alias pairs hit the limit sooner.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.TagNode:
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.MergeKeyNode:
// MergeKeyNode represents the literal "<<" merge key token. It has no
// child nodes — the value side of a merge (e.g., *alias) lives in the
// parent MappingValueNode.Value, which is already recursed into above.
// Explicitly listed here (rather than in the default case) to prevent
// future library changes from silently bypassing depth checks.
default:
// Scalar leaf nodes (StringNode, IntegerNode, FloatNode, BoolNode,
// NullNode, InfinityNode, NanNode, LiteralNode) have no children to
// recurse into.
} }
return nil return nil
} }
@@ -227,7 +331,11 @@ func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*ya
// ParsePersonaBytes parses persona data from bytes with a source label for errors. // ParsePersonaBytes parses persona data from bytes with a source label for errors.
// This is useful for parsing personas fetched from external sources (e.g., Gitea API) // This is useful for parsing personas fetched from external sources (e.g., Gitea API)
// without requiring filesystem access. Format is detected by source extension. // without requiring filesystem access. Format is detected by source extension.
// Input is bounded by MaxPersonaFileSize to prevent resource exhaustion.
func ParsePersonaBytes(data []byte, source string) (*Persona, error) { func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona data from %s exceeds maximum size (%d bytes, limit %d)", source, len(data), MaxPersonaFileSize)
}
return parsePersona(data, source) return parsePersona(data, source)
} }
+222 -41
View File
@@ -7,7 +7,7 @@ import (
"strings" "strings"
"testing" "testing"
"gopkg.in/yaml.v3" "github.com/goccy/go-yaml/ast"
) )
func TestLoadBuiltinPersona(t *testing.T) { func TestLoadBuiltinPersona(t *testing.T) {
@@ -459,7 +459,14 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
path := filepath.Join(dir, "deeply-nested.yaml") path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20). // Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
// Each level adds 2 to the depth count (key + value mapping). // Depth accumulation trace for "nested: \n level0: \n level1: ...":
// - Document root parsed at depth 0
// - Root MappingNode children (MappingValueNodes) visited at depth 1
// - "nested" MappingValueNode: key at depth 2, value at depth 2
// - Each levelN adds depth via MappingValueNode traversal (key + value)
// - Exact depth per level depends on AST structure (MappingNode wrapping),
// but 25 levels reliably exceeds MaxYAMLDepth (20) with comfortable margin.
// The test uses 25 levels rather than exactly 21 to avoid brittleness.
var sb strings.Builder var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n") sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " " indent := " "
@@ -483,6 +490,35 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
} }
} }
func TestYAMLEmptyFileRejection(t *testing.T) {
tests := []struct {
name string
content string
}{
{"completely_empty", ""},
{"whitespace_only", " \n\n "},
{"comment_only", "# just a comment\n"},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, tc.name+".yaml")
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for empty YAML input, got nil")
}
if !strings.Contains(err.Error(), "empty YAML document") {
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
}
})
}
}
func TestYAMLFileSizeLimit(t *testing.T) { func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir() dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml") path := filepath.Join(dir, "huge.yaml")
@@ -504,41 +540,41 @@ func TestYAMLFileSizeLimit(t *testing.T) {
func TestYAMLAliasCycleDetection(t *testing.T) { func TestYAMLAliasCycleDetection(t *testing.T) {
// Test that our checkYAMLDepth function handles alias cycles gracefully // Test that our checkYAMLDepth function handles alias cycles gracefully
// by using the seen map to prevent infinite recursion. // by using the visiting map to prevent infinite recursion.
// We test this directly because go-yaml's parser handles most cycles
// at parse time, but we need to ensure our checker is robust.
// Create a node structure where an alias points to a parent node, // Create a node structure where an alias points to a parent node,
// simulating what could happen with malicious input that bypasses // simulating what could happen with crafted input.
// go-yaml's cycle detection. parent := &ast.MappingNode{
parent := &yaml.Node{ Values: []*ast.MappingValueNode{
Kind: yaml.MappingNode, {
Content: []*yaml.Node{ Key: &ast.StringNode{Value: "name"},
{Kind: yaml.ScalarNode, Value: "name"}, Value: &ast.StringNode{Value: "test"},
{Kind: yaml.ScalarNode, Value: "test"}, },
{Kind: yaml.ScalarNode, Value: "nested"},
}, },
} }
// Create a child that aliases back to the parent (artificial cycle) // Create a child that aliases back to the parent (artificial cycle)
aliasToParent := &yaml.Node{ aliasToParent := &ast.AliasNode{
Kind: yaml.AliasNode, Value: parent,
Alias: parent,
} }
parent.Content = append(parent.Content, aliasToParent) parent.Values = append(parent.Values, &ast.MappingValueNode{
Key: &ast.StringNode{Value: "nested"},
Value: aliasToParent,
})
nodeCount := 0 nodeCount := 0
seen := make(map[*yaml.Node]struct{}) validated := make(map[ast.Node]int)
visiting := make(map[ast.Node]bool)
// This should NOT hang or stack overflow - the seen map prevents infinite recursion // This should NOT hang or stack overflow - cycle detection prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount) err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
if err != nil { if err != nil {
t.Errorf("unexpected error traversing cyclic structure: %v", err) t.Errorf("unexpected error traversing cyclic structure: %v", err)
} }
// Verify we tracked the parent in the seen map // Verify we tracked the parent in the validated map
if _, ok := seen[parent]; !ok { if _, ok := validated[parent]; !ok {
t.Error("parent node not tracked in seen map") t.Error("parent node not tracked in validated map")
} }
} }
@@ -594,36 +630,82 @@ func TestYAMLNodeCountLimit(t *testing.T) {
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) { func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
// Direct test of cycle detection in checkYAMLDepth by creating // Direct test of cycle detection in checkYAMLDepth by creating
// a node structure with an artificial cycle. // a node structure with an artificial cycle.
// This tests the seen map logic independent of go-yaml's parsing. node := &ast.MappingNode{
node := &yaml.Node{ Values: []*ast.MappingValueNode{
Kind: yaml.MappingNode, {
Content: []*yaml.Node{ Key: &ast.StringNode{Value: "key"},
{Kind: yaml.ScalarNode, Value: "key"}, Value: &ast.StringNode{Value: "value"},
{Kind: yaml.ScalarNode, Value: "value"}, },
}, },
} }
// Create a cycle by making a child reference the parent // Create a cycle by making a child reference the parent
cycleChild := &yaml.Node{ cycleChild := &ast.AliasNode{
Kind: yaml.AliasNode, Value: node, // Points back to the parent
Alias: node, // Points back to the parent
} }
node.Content = append(node.Content, node.Values = append(node.Values, &ast.MappingValueNode{
&yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"}, Key: &ast.StringNode{Value: "cyclic"},
cycleChild, Value: cycleChild,
) })
nodeCount := 0 nodeCount := 0
seen := make(map[*yaml.Node]struct{}) validated := make(map[ast.Node]int)
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount) visiting := make(map[ast.Node]bool)
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
// Should complete without infinite recursion due to cycle detection // Should complete without infinite recursion due to cycle detection
if err != nil { if err != nil {
t.Errorf("unexpected error: %v", err) t.Errorf("unexpected error: %v", err)
} }
// The seen map should contain multiple entries // The validated map should contain multiple entries
if len(seen) < 2 { if len(validated) < 2 {
t.Errorf("seen map has %d entries, expected at least 2", len(seen)) t.Errorf("validated map has %d entries, expected at least 2", len(validated))
}
}
func TestYAMLAliasDepthBypass(t *testing.T) {
// Test that an anchored subtree first validated at a shallow depth is
// re-checked when referenced via alias at a deeper position. Without the
// depth-aware validated map, the alias reference would skip re-checking
// and allow the effective nesting to exceed MaxYAMLDepth.
dir := t.TempDir()
path := filepath.Join(dir, "alias-depth-bypass.yaml")
// Build YAML with an anchor at shallow depth containing a subtree near the limit,
// then reference it via alias deep enough that effective depth exceeds MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
// Create the anchored subtree at depth 1 (key level) that nests 15 levels deep.
sb.WriteString("anchor_key: &deep_anchor\n")
for i := 0; i < 15; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 16))
sb.WriteString("leaf: value\n")
// Create a wrapper that nests 6 levels deep, then references the anchor.
// Effective depth at alias target = 6 (wrapper nesting) + 1 (alias) + 15 (subtree) = 22 > 20
sb.WriteString("wrapper:\n")
for i := 0; i < 6; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("n%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 7))
sb.WriteString("alias_ref: *deep_anchor\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for alias depth bypass, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
} }
} }
@@ -776,3 +858,102 @@ identity: test identity
t.Errorf("Name = %q, want %q", p.Name, "test") t.Errorf("Name = %q, want %q", p.Name, "test")
} }
} }
func TestJSONTrailingContentRejected(t *testing.T) {
tests := []struct {
name string
content string
}{
{
name: "trailing garbage after object",
content: `{"name":"test","identity":"test identity"}garbage`,
},
{
name: "two JSON objects",
content: `{"name":"test","identity":"test identity"}{"name":"other"}`,
},
{
name: "trailing array",
content: `{"name":"test","identity":"test identity"}[]`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for trailing content, got nil")
}
if !strings.Contains(err.Error(), "trailing content") {
t.Errorf("error = %q, want to contain 'trailing content'", err.Error())
}
})
}
}
func TestParsePersonaBytesSizeLimit(t *testing.T) {
// ParsePersonaBytes should reject input exceeding MaxPersonaFileSize
oversized := make([]byte, MaxPersonaFileSize+1)
for i := range oversized {
oversized[i] = 'x'
}
_, err := ParsePersonaBytes(oversized, "oversized.yaml")
if err == nil {
t.Fatal("expected error for oversized input, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want to contain 'exceeds maximum size'", err.Error())
}
// Just under the limit should not trigger size error (may fail parse, but not size)
underLimit := []byte("name: test\nidentity: test persona\n")
p, err := ParsePersonaBytes(underLimit, "valid.yaml")
if err != nil {
t.Fatalf("unexpected error for valid input: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestYAMLMergeKeyDepthCheck(t *testing.T) {
// Verify that YAML merge keys (<<: *alias) are properly handled by the
// depth checker. The merge key content is in the MappingValueNode.Value
// (an AliasNode), not in the MergeKeyNode itself.
p, err := ParsePersonaBytes([]byte("name: merge-test\nidentity: test\n"), "merge.yaml")
if err != nil {
t.Fatalf("basic parse failed: %v", err)
}
if p.Name != "merge-test" {
t.Errorf("Name = %q, want %q", p.Name, "merge-test")
}
// Test that deeply nested merge keys still hit depth limit.
// Build YAML with merge key content nested beyond MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: deep-merge\nidentity: deep merge persona\n")
sb.WriteString("anchor: &deep\n")
indent := " "
for i := 0; i < MaxYAMLDepth+5; i++ {
sb.WriteString(indent)
sb.WriteString(fmt.Sprintf("level%d:\n", i))
indent += " "
}
sb.WriteString(indent + "leaf: value\n")
sb.WriteString("target:\n <<: *deep\n")
_, err = ParsePersonaBytes([]byte(sb.String()), "deep-merge.yaml")
if err == nil {
t.Fatal("expected error for deeply nested merge key content, got nil")
}
if !strings.Contains(err.Error(), "depth") {
t.Errorf("error = %q, want to contain 'depth'", err.Error())
}
}
+17 -4
View File
@@ -4,19 +4,32 @@ import (
"context" "context"
"log/slog" "log/slog"
"strings" "strings"
"gitea.weiker.me/rodin/review-bot/vcs"
) )
// RepoPersonaPath is the directory path where repo-specific personas are stored. // RepoPersonaPath is the directory path where repo-specific personas are stored.
const RepoPersonaPath = ".review-bot/personas" const RepoPersonaPath = ".review-bot/personas"
// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.
// This interface allows for easier testing and decouples the review package from gitea.
type GiteaClient interface {
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
}
// ContentEntry represents a file or directory entry from the contents API.
// This mirrors gitea.ContentEntry to avoid import cycles.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory. // LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory.
// Returns an empty map (not nil) if the directory doesn't exist or is empty. // Returns an empty map (not nil) if the directory doesn't exist or is empty.
// Individual parse failures are logged and skipped; the remaining personas are still returned. // Individual parse failures are logged and skipped; the remaining personas are still returned.
// Auth errors and other non-404 errors are propagated. // Auth errors and other non-404 errors are propagated.
// Files exceeding MaxPersonaFileSize are rejected to prevent resource exhaustion. // Files exceeding MaxPersonaFileSize are rejected to prevent resource exhaustion.
func LoadRepoPersonas(ctx context.Context, client vcs.FileReader, owner, repo string) (map[string]*Persona, error) { func LoadRepoPersonas(ctx context.Context, client GiteaClient, owner, repo string) (map[string]*Persona, error) {
result := make(map[string]*Persona) result := make(map[string]*Persona)
entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath) entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath)
@@ -44,7 +57,7 @@ func LoadRepoPersonas(ctx context.Context, client vcs.FileReader, owner, repo st
continue continue
} }
content, err := client.GetFileContent(ctx, owner, repo, entry.Path, "") content, err := client.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil { if err != nil {
slog.Warn("could not fetch repo persona file", slog.Warn("could not fetch repo persona file",
"file", entry.Path, "file", entry.Path,
+62 -31
View File
@@ -5,21 +5,23 @@ import (
"errors" "errors"
"strings" "strings"
"testing" "testing"
"gitea.weiker.me/rodin/review-bot/vcs"
) )
func TestParsePersonaBytes(t *testing.T) { func TestParsePersonaBytes(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
data string data string
source string source string
wantName string wantName string
wantErr string wantErr string
}{ }{
{ {
name: "valid yaml", name: "valid yaml",
data: "name: test\nidentity: test identity\nfocus:\n - testing\n", data: `name: test
identity: test identity
focus:
- testing
`,
source: "test.yaml", source: "test.yaml",
wantName: "test", wantName: "test",
}, },
@@ -36,8 +38,8 @@ func TestParsePersonaBytes(t *testing.T) {
wantErr: "parse", wantErr: "parse",
}, },
{ {
name: "json format by extension", name: "json format by extension",
data: `{"name": "jsontest", "identity": "json identity"}`, data: `{"name": "jsontest", "identity": "json identity"}`,
source: "test.json", source: "test.json",
wantName: "jsontest", wantName: "jsontest",
}, },
@@ -65,15 +67,15 @@ func TestParsePersonaBytes(t *testing.T) {
} }
} }
// mockGiteaClient implements vcs.FileReader for testing. // mockGiteaClient implements GiteaClient for testing.
type mockGiteaClient struct { type mockGiteaClient struct {
contents map[string][]vcs.ContentEntry // path -> entries contents map[string][]ContentEntry // path -> entries
files map[string]string // path -> content files map[string]string // path -> content
listErr error listErr error
fileErr map[string]error // path -> error fileErr map[string]error // path -> error
} }
func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]vcs.ContentEntry, error) { func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
if m.listErr != nil { if m.listErr != nil {
return nil, m.listErr return nil, m.listErr
} }
@@ -84,7 +86,7 @@ func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path st
return entries, nil return entries, nil
} }
func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath, ref string) (string, error) { func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if m.fileErr != nil { if m.fileErr != nil {
if err, ok := m.fileErr[filepath]; ok { if err, ok := m.fileErr[filepath]; ok {
return "", err return "", err
@@ -116,7 +118,7 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("empty directory returns empty map", func(t *testing.T) { t.Run("empty directory returns empty map", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: {}, RepoPersonaPath: {},
}, },
} }
@@ -131,15 +133,27 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("loads valid personas", func(t *testing.T) { t.Run("loads valid personas", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"}, {Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
{Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"}, {Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"},
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/trading.yaml": "name: trading\ndisplay_name: Trading Expert\nidentity: You are a trading expert.\nfocus:\n - order handling\n - risk management\n", ".review-bot/personas/trading.yaml": `name: trading
".review-bot/personas/crypto.yaml": "name: crypto\ndisplay_name: Crypto Expert\nidentity: You are a cryptography expert.\nfocus:\n - key management\n - encryption\n", display_name: Trading Expert
identity: You are a trading expert.
focus:
- order handling
- risk management
`,
".review-bot/personas/crypto.yaml": `name: crypto
display_name: Crypto Expert
identity: You are a cryptography expert.
focus:
- key management
- encryption
`,
}, },
} }
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo") personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
@@ -162,14 +176,16 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("skips invalid persona files", func(t *testing.T) { t.Run("skips invalid persona files", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"}, {Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"}, {Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/valid.yaml": "name: valid\nidentity: Valid persona\n", ".review-bot/personas/valid.yaml": `name: valid
identity: Valid persona
`,
".review-bot/personas/invalid.yaml": "not valid yaml: [broken", ".review-bot/personas/invalid.yaml": "not valid yaml: [broken",
}, },
} }
@@ -177,6 +193,7 @@ func TestLoadRepoPersonas(t *testing.T) {
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
// Should have the valid one, skip the invalid
if len(personas) != 1 { if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas)) t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas))
} }
@@ -187,7 +204,7 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("skips non-yaml files", func(t *testing.T) { t.Run("skips non-yaml files", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"}, {Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"}, {Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"},
@@ -195,8 +212,10 @@ func TestLoadRepoPersonas(t *testing.T) {
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/persona.yaml": "name: test\nidentity: Test persona\n", ".review-bot/personas/persona.yaml": `name: test
".review-bot/personas/README.md": "# Personas\n\nPut your personas here.", identity: Test persona
`,
".review-bot/personas/README.md": "# Personas\n\nPut your personas here.",
}, },
} }
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo") personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
@@ -210,14 +229,16 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("skips subdirectories", func(t *testing.T) { t.Run("skips subdirectories", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"}, {Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"}, {Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/persona.yaml": "name: test\nidentity: Test persona\n", ".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
}, },
} }
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo") personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
@@ -244,14 +265,16 @@ func TestLoadRepoPersonas(t *testing.T) {
t.Run("skips files that fail to fetch", func(t *testing.T) { t.Run("skips files that fail to fetch", func(t *testing.T) {
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"}, {Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"},
{Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"}, {Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"},
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/good.yaml": "name: good\nidentity: Good persona\n", ".review-bot/personas/good.yaml": `name: good
identity: Good persona
`,
}, },
fileErr: map[string]error{ fileErr: map[string]error{
".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"), ".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"),
@@ -267,23 +290,27 @@ func TestLoadRepoPersonas(t *testing.T) {
}) })
t.Run("skips oversized files", func(t *testing.T) { t.Run("skips oversized files", func(t *testing.T) {
// Create a content string that exceeds MaxPersonaFileSize (64KB)
oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1) oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1)
client := &mockGiteaClient{ client := &mockGiteaClient{
contents: map[string][]vcs.ContentEntry{ contents: map[string][]ContentEntry{
RepoPersonaPath: { RepoPersonaPath: {
{Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"}, {Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"},
{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"}, {Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
}, },
}, },
files: map[string]string{ files: map[string]string{
".review-bot/personas/normal.yaml": "name: normal\nidentity: Normal sized persona\n", ".review-bot/personas/normal.yaml": `name: normal
".review-bot/personas/huge.yaml": oversizedContent, identity: Normal sized persona
`,
".review-bot/personas/huge.yaml": oversizedContent,
}, },
} }
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo") personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
// Should have the normal one, skip the oversized
if len(personas) != 1 { if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas)) t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas))
} }
@@ -343,6 +370,7 @@ func TestGetBuiltinPersonasMap(t *testing.T) {
t.Fatal("expected at least one built-in persona") t.Fatal("expected at least one built-in persona")
} }
// Verify expected personas exist
expected := []string{"security", "architect", "docs"} expected := []string{"security", "architect", "docs"}
for _, name := range expected { for _, name := range expected {
if personas[name] == nil { if personas[name] == nil {
@@ -350,6 +378,7 @@ func TestGetBuiltinPersonasMap(t *testing.T) {
} }
} }
// Verify personas are valid
for name, p := range personas { for name, p := range personas {
if p.Name != name { if p.Name != name {
t.Errorf("persona %q has mismatched name %q", name, p.Name) t.Errorf("persona %q has mismatched name %q", name, p.Name)
@@ -393,6 +422,8 @@ func TestIsNotFoundError(t *testing.T) {
{nil, false}, {nil, false},
{errors.New("HTTP 404: not found"), true}, {errors.New("HTTP 404: not found"), true},
{errors.New("HTTP 404"), true}, {errors.New("HTTP 404"), true},
// Intentionally false: generic "not found" could mask auth/transport errors.
// Only explicit HTTP 404 responses should be treated as "directory doesn't exist".
{errors.New("something not found"), false}, {errors.New("something not found"), false},
{errors.New("HTTP 401: unauthorized"), false}, {errors.New("HTTP 401: unauthorized"), false},
{errors.New("connection refused"), false}, {errors.New("connection refused"), false},
-27
View File
@@ -1,27 +0,0 @@
//go:build phase2
package vcs_test
import (
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// Compile-time assertion: documents the gap between gitea.Client and vcs.Client.
// Guarded by the "phase2" build tag — enable once the Gitea adapter bridges these gaps:
//
// 1. PostReview signature mismatch:
// gitea.Client: PostReview(ctx, owner, repo, number, event, body string, comments []gitea.ReviewComment)
// vcs.Reviewer: PostReview(ctx, owner, repo, number, req vcs.ReviewRequest)
//
// 2. GetFileContent signature mismatch:
// gitea.Client: GetFileContent(ctx, owner, repo, filepath string) [no ref; uses default branch]
// vcs.FileReader: GetFileContent(ctx, owner, repo, path, ref string)
// (gitea.Client has GetFileContentRef for the ref variant)
//
// 3. ReviewComment type mismatch:
// gitea.ReviewComment uses NewPosition int64 (Gitea line-number convention)
// vcs.ReviewComment uses Position int (GitHub diff-position convention)
//
// The Gitea adapter (Phase 2) will wrap gitea.Client to bridge these gaps.
var _ vcs.Client = (*gitea.Client)(nil)
-43
View File
@@ -1,43 +0,0 @@
// Package vcs defines the shared VCS client interface and supporting types.
// Platform adapters (gitea, github) implement these interfaces so the core
// review logic can work with any VCS platform without platform-specific code.
package vcs
import "context"
// PRReader can fetch pull request metadata, diffs, and changed files.
type PRReader interface {
GetPullRequest(ctx context.Context, owner, repo string, number int) (*PullRequest, error)
GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error)
GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]ChangedFile, error)
GetFileContentAtRef(ctx context.Context, owner, repo, path, ref string) (string, error)
GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]CommitStatus, error)
}
// FileReader can fetch file contents and list directory entries.
type FileReader interface {
GetFileContent(ctx context.Context, owner, repo, path, ref string) (string, error)
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
}
// Reviewer can post, list, and delete pull request reviews.
type Reviewer interface {
PostReview(ctx context.Context, owner, repo string, number int, req ReviewRequest) (*Review, error)
ListReviews(ctx context.Context, owner, repo string, number int) ([]Review, error)
DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error
DismissReview(ctx context.Context, owner, repo string, number int, reviewID int64, message string) error
}
// Identity can report who the authenticated user is.
type Identity interface {
GetAuthenticatedUser(ctx context.Context) (string, error)
}
// Client is the full VCS interface: PR reads, file reads, review management, and identity.
// Platform adapters (gitea, github) implement this interface.
type Client interface {
PRReader
FileReader
Reviewer
Identity
}
-98
View File
@@ -1,98 +0,0 @@
package vcs
// ReviewEvent is the event type for a pull request review action.
// Adapters must translate these action constants to/from platform-native values.
// For example, Gitea uses "APPROVED" as both action and state, while GitHub
// uses "APPROVE" for the action and returns "approved" as the state.
type ReviewEvent string
const (
// ReviewEventApprove approves the pull request.
ReviewEventApprove ReviewEvent = "APPROVE"
// ReviewEventRequestChanges requests changes to the pull request.
ReviewEventRequestChanges ReviewEvent = "REQUEST_CHANGES"
// ReviewEventComment posts a review comment without approval or rejection.
ReviewEventComment ReviewEvent = "COMMENT"
)
// BaseRef identifies the target branch of a pull request.
type BaseRef struct {
Ref string `json:"ref"`
}
// HeadRef identifies the source branch and latest commit of a pull request.
type HeadRef struct {
SHA string `json:"sha"`
Ref string `json:"ref"`
}
// UserInfo identifies a user by login name.
type UserInfo struct {
Login string `json:"login"`
}
// PullRequest holds relevant PR metadata.
type PullRequest struct {
Number int `json:"number"`
Title string `json:"title"`
Body string `json:"body"`
Head HeadRef `json:"head"`
Base BaseRef `json:"base"`
}
// ChangedFile represents a file modified in a PR.
type ChangedFile struct {
Filename string `json:"filename"`
Status string `json:"status"`
Patch string `json:"patch"`
}
// ContentEntry represents a file or directory entry from the contents API.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// CommitStatus represents a single CI status entry for a commit.
type CommitStatus struct {
Status string `json:"status"`
Context string `json:"context"`
Description string `json:"description"`
TargetURL string `json:"target_url"`
}
// Review represents a pull request review.
type Review struct {
ID int64 `json:"id"`
Body string `json:"body"`
User UserInfo `json:"user"`
State string `json:"state"`
Stale bool `json:"stale"`
CommitID string `json:"commit_id"`
}
// ReviewComment represents an inline comment in a review.
// All adapters use GitHub diff-position convention:
// - Position is a 1-indexed offset from the @@ hunk line in the unified diff.
// - CommitID identifies the commit the comment is anchored to.
// It is optional; omit (empty string) for review-level comments that are
// not attached to a specific commit.
//
// Adapters are responsible for translating to/from platform-native formats
// (e.g. Gitea uses line numbers; GitHub uses diff positions natively).
type ReviewComment struct {
Path string `json:"path"`
Position int `json:"position"` // diff-position: 1-indexed offset from @@ hunk line
CommitID string `json:"commit_id"`
Body string `json:"body"`
}
// ReviewRequest is the payload for posting a review.
type ReviewRequest struct {
// Body is the top-level review comment.
Body string `json:"body"`
// Event is the review action (approve, request changes, or comment).
Event ReviewEvent `json:"event"`
Comments []ReviewComment `json:"comments,omitempty"`
}
-193
View File
@@ -1,193 +0,0 @@
package vcs
import (
"context"
"fmt"
"strconv"
"strings"
)
const (
// maxFilesInPath is the maximum number of files GetAllFilesInPath will fetch.
// Prevents unbounded resource consumption on very large directory trees.
maxFilesInPath = 10000
// maxTotalBytesInPath is the maximum total bytes GetAllFilesInPath will accumulate.
// Prevents memory exhaustion when fetching large repositories.
maxTotalBytesInPath = 100 * 1024 * 1024 // 100 MB
)
// GetAllFilesInPath recursively fetches all file contents under a path using the
// provided FileReader. Returns a map of filepath -> content for all files found.
// If the path points to an empty directory, returns an empty map.
//
// This function uses fail-fast error handling: any error from ListContents or
// GetFileContent aborts the entire traversal and returns the error immediately.
// This differs from gitea.Client.GetAllFilesInPath, which logs errors and continues.
// The fail-fast contract ensures callers can trust that a nil error means all files
// were successfully fetched.
//
// Resource limits: the traversal is bounded by maxFilesInPath (file count) and
// maxTotalBytesInPath (total accumulated bytes). The context is checked before each
// recursive call and file fetch to respect cancellation.
func GetAllFilesInPath(ctx context.Context, client FileReader, owner, repo, path string) (map[string]string, error) {
results := make(map[string]string)
totalBytes := 0
var walk func(string) error
walk = func(dir string) error {
if err := ctx.Err(); err != nil {
return fmt.Errorf("context canceled during traversal: %w", err)
}
entries, err := client.ListContents(ctx, owner, repo, dir)
if err != nil {
return fmt.Errorf("list contents %q: %w", dir, err)
}
for _, entry := range entries {
if err := ctx.Err(); err != nil {
return fmt.Errorf("context canceled during traversal: %w", err)
}
switch entry.Type {
case "file":
if len(results) >= maxFilesInPath {
return fmt.Errorf("exceeded max file count (%d) in path %q", maxFilesInPath, path)
}
content, err := client.GetFileContent(ctx, owner, repo, entry.Path, "")
if err != nil {
return fmt.Errorf("get file %q: %w", entry.Path, err)
}
totalBytes += len(content)
if totalBytes > maxTotalBytesInPath {
return fmt.Errorf("exceeded max total bytes (%d) in path %q", maxTotalBytesInPath, path)
}
results[entry.Path] = content
case "dir":
if err := walk(entry.Path); err != nil {
return err
}
}
}
return nil
}
if err := walk(path); err != nil {
return nil, err
}
return results, nil
}
// BuildLineToPositionMap parses a unified diff and returns a map of
// filename -> (new line number -> diff position). The diff position is a
// 1-indexed offset from the @@ hunk header line for each file.
// Only lines that appear in the new file (context lines and additions) are mapped.
// Deletion-only lines are not included.
func BuildLineToPositionMap(diff string) map[string]map[int]int {
result := make(map[string]map[int]int)
lines := strings.Split(diff, "\n")
var currentFile string
var position int
var newLine int
for _, line := range lines {
// Detect new file in diff
if strings.HasPrefix(line, "+++ b/") {
currentFile = strings.TrimPrefix(line, "+++ b/")
position = 0
newLine = 0
if result[currentFile] == nil {
result[currentFile] = make(map[int]int)
}
continue
}
// Skip --- lines (old file header)
if strings.HasPrefix(line, "--- ") {
continue
}
// Skip diff --git lines
if strings.HasPrefix(line, "diff --git") {
continue
}
// Skip index lines
if strings.HasPrefix(line, "index ") {
continue
}
// Parse hunk headers
if strings.HasPrefix(line, "@@") {
position++
// Extract new file start line from @@ -a,b +c,d @@
newLine = parseHunkNewStart(line)
continue
}
// We need a current file to map lines
if currentFile == "" {
continue
}
// Skip "\ No newline at end of file" markers — these are git diff
// metadata and not part of the file content.
if strings.HasPrefix(line, `\`) {
continue
}
// Process diff content lines
if strings.HasPrefix(line, "+") {
position++
result[currentFile][newLine] = position
newLine++
} else if strings.HasPrefix(line, "-") {
position++
// Deletion lines don't map to new line numbers
} else if strings.HasPrefix(line, " ") {
// Context line (space-prefixed).
// Only map if position > 0, which means we've seen a hunk header.
// Lines before the first hunk header (position == 0) are not part
// of any diff hunk and should be skipped.
if position > 0 {
position++
result[currentFile][newLine] = position
newLine++
}
}
}
return result
}
// parseHunkNewStart extracts the new-file starting line number from a hunk header.
// Format: @@ -old_start[,old_count] +new_start[,new_count] @@
func parseHunkNewStart(hunkLine string) int {
// Find the +N part
plusIdx := strings.Index(hunkLine, "+")
if plusIdx < 0 {
return 1
}
rest := hunkLine[plusIdx+1:]
// Find the end of the number (first non-digit after +)
endIdx := 0
for endIdx < len(rest) && rest[endIdx] >= '0' && rest[endIdx] <= '9' {
endIdx++
}
if endIdx == 0 {
return 1
}
n, err := strconv.Atoi(rest[:endIdx])
if err != nil {
return 1
}
return n
}
-331
View File
@@ -1,331 +0,0 @@
package vcs_test
import (
"context"
"fmt"
"strings"
"testing"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// mockFileReader implements vcs.FileReader for testing.
type mockFileReader struct {
contents map[string][]vcs.ContentEntry // path -> entries
files map[string]string // path -> content
}
func (m *mockFileReader) GetFileContent(ctx context.Context, owner, repo, path, ref string) (string, error) {
content, ok := m.files[path]
if !ok {
return "", fmt.Errorf("HTTP 404: file not found: %s", path)
}
return content, nil
}
func (m *mockFileReader) ListContents(ctx context.Context, owner, repo, path string) ([]vcs.ContentEntry, error) {
entries, ok := m.contents[path]
if !ok {
return nil, fmt.Errorf("HTTP 404: path not found: %s", path)
}
return entries, nil
}
func TestGetAllFilesInPath(t *testing.T) {
ctx := context.Background()
t.Run("empty directory", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {},
},
}
result, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 0 {
t.Errorf("expected empty map, got %d entries", len(result))
}
})
t.Run("flat directory", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {
{Name: "main.go", Path: "src/main.go", Type: "file"},
{Name: "util.go", Path: "src/util.go", Type: "file"},
},
},
files: map[string]string{
"src/main.go": "package main",
"src/util.go": "package main\n// util",
},
}
result, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 2 {
t.Fatalf("expected 2 files, got %d", len(result))
}
if result["src/main.go"] != "package main" {
t.Errorf("main.go content = %q", result["src/main.go"])
}
if result["src/util.go"] != "package main\n// util" {
t.Errorf("util.go content = %q", result["src/util.go"])
}
})
t.Run("nested directories", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {
{Name: "main.go", Path: "src/main.go", Type: "file"},
{Name: "pkg", Path: "src/pkg", Type: "dir"},
},
"src/pkg": {
{Name: "lib.go", Path: "src/pkg/lib.go", Type: "file"},
{Name: "sub", Path: "src/pkg/sub", Type: "dir"},
},
"src/pkg/sub": {
{Name: "deep.go", Path: "src/pkg/sub/deep.go", Type: "file"},
},
},
files: map[string]string{
"src/main.go": "package main",
"src/pkg/lib.go": "package pkg",
"src/pkg/sub/deep.go": "package sub",
},
}
result, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 3 {
t.Fatalf("expected 3 files, got %d", len(result))
}
if result["src/main.go"] != "package main" {
t.Errorf("main.go content = %q", result["src/main.go"])
}
if result["src/pkg/lib.go"] != "package pkg" {
t.Errorf("lib.go content = %q", result["src/pkg/lib.go"])
}
if result["src/pkg/sub/deep.go"] != "package sub" {
t.Errorf("deep.go content = %q", result["src/pkg/sub/deep.go"])
}
})
t.Run("mixed files and dirs", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"root": {
{Name: "README.md", Path: "root/README.md", Type: "file"},
{Name: "docs", Path: "root/docs", Type: "dir"},
{Name: "config.yaml", Path: "root/config.yaml", Type: "file"},
},
"root/docs": {
{Name: "guide.md", Path: "root/docs/guide.md", Type: "file"},
},
},
files: map[string]string{
"root/README.md": "# Hello",
"root/config.yaml": "key: value",
"root/docs/guide.md": "## Guide",
},
}
result, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "root")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 3 {
t.Fatalf("expected 3 files, got %d", len(result))
}
if result["root/README.md"] != "# Hello" {
t.Errorf("README content = %q", result["root/README.md"])
}
if result["root/docs/guide.md"] != "## Guide" {
t.Errorf("guide content = %q", result["root/docs/guide.md"])
}
})
}
func TestBuildLineToPositionMap(t *testing.T) {
t.Run("single hunk", func(t *testing.T) {
diff := "diff --git a/file.go b/file.go\nindex abc..def 100644\n--- a/file.go\n+++ b/file.go\n@@ -1,3 +1,4 @@\n package main\n \n+// new comment\n func main() {}\n"
result := vcs.BuildLineToPositionMap(diff)
fileMap, ok := result["file.go"]
if !ok {
t.Fatal("expected file.go in result")
}
// Hunk header @@ is position 1
// Line 1: " package main" -> position 2
if fileMap[1] != 2 {
t.Errorf("line 1 position = %d, want 2", fileMap[1])
}
// Line 2: " " (context) -> position 3
if fileMap[2] != 3 {
t.Errorf("line 2 position = %d, want 3", fileMap[2])
}
// Line 3: "+// new comment" -> position 4
if fileMap[3] != 4 {
t.Errorf("line 3 position = %d, want 4", fileMap[3])
}
// Line 4: " func main() {}" -> position 5
if fileMap[4] != 5 {
t.Errorf("line 4 position = %d, want 5", fileMap[4])
}
})
t.Run("multi hunk", func(t *testing.T) {
diff := "diff --git a/file.go b/file.go\n--- a/file.go\n+++ b/file.go\n@@ -1,3 +1,3 @@\n package main\n \n-// old\n+// new\n@@ -10,3 +10,4 @@\n func foo() {\n+\t// added\n \treturn\n }\n"
result := vcs.BuildLineToPositionMap(diff)
fileMap, ok := result["file.go"]
if !ok {
t.Fatal("expected file.go in result")
}
// First hunk: @@ is position 1
// Line 1: " package main" -> position 2
if fileMap[1] != 2 {
t.Errorf("line 1 position = %d, want 2", fileMap[1])
}
// Line 3: "+// new" -> position 5 (after " ", "-// old" at pos 3,4)
if fileMap[3] != 5 {
t.Errorf("line 3 position = %d, want 5", fileMap[3])
}
// Second hunk: @@ is position 6
// Line 10: " func foo() {" -> position 7
if fileMap[10] != 7 {
t.Errorf("line 10 position = %d, want 7", fileMap[10])
}
// Line 11: "+\t// added" -> position 8
if fileMap[11] != 8 {
t.Errorf("line 11 position = %d, want 8", fileMap[11])
}
})
t.Run("deletion lines not in map", func(t *testing.T) {
diff := "diff --git a/file.go b/file.go\n--- a/file.go\n+++ b/file.go\n@@ -1,4 +1,3 @@\n package main\n \n-// deleted line\n func main() {}\n"
result := vcs.BuildLineToPositionMap(diff)
fileMap, ok := result["file.go"]
if !ok {
t.Fatal("expected file.go in result")
}
// Line 1: " package main" -> position 2
if fileMap[1] != 2 {
t.Errorf("line 1 position = %d, want 2", fileMap[1])
}
// Line 3 in new file: " func main() {}" -> position 5 (after deletion at pos 4)
if fileMap[3] != 5 {
t.Errorf("line 3 position = %d, want 5", fileMap[3])
}
// Should only have 3 entries (lines 1, 2, 3 of new file)
if len(fileMap) != 3 {
t.Errorf("expected 3 mapped lines, got %d: %v", len(fileMap), fileMap)
}
})
t.Run("multiple files", func(t *testing.T) {
diff := "diff --git a/a.go b/a.go\n--- a/a.go\n+++ b/a.go\n@@ -1,2 +1,3 @@\n package a\n \n+// file a\ndiff --git a/b.go b/b.go\n--- a/b.go\n+++ b/b.go\n@@ -1,2 +1,3 @@\n package b\n \n+// file b\n"
result := vcs.BuildLineToPositionMap(diff)
if len(result) != 2 {
t.Fatalf("expected 2 files, got %d", len(result))
}
aMap, ok := result["a.go"]
if !ok {
t.Fatal("expected a.go in result")
}
bMap, ok := result["b.go"]
if !ok {
t.Fatal("expected b.go in result")
}
// a.go line 3: "+// file a" -> position 4
if aMap[3] != 4 {
t.Errorf("a.go line 3 position = %d, want 4", aMap[3])
}
// b.go line 3: "+// file b" -> position 4
if bMap[3] != 4 {
t.Errorf("b.go line 3 position = %d, want 4", bMap[3])
}
})
}
func TestGetAllFilesInPath_ErrorPropagation(t *testing.T) {
ctx := context.Background()
t.Run("ListContents error propagates", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
// "src" not in map, so ListContents will fail
},
}
_, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err == nil {
t.Fatal("expected error, got nil")
}
if !strings.Contains(err.Error(), "list contents") {
t.Errorf("expected error about list contents, got: %v", err)
}
})
t.Run("GetFileContent error propagates", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {
{Name: "main.go", Path: "src/main.go", Type: "file"},
},
},
files: map[string]string{
// "src/main.go" not in files map, so GetFileContent will fail
},
}
_, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err == nil {
t.Fatal("expected error, got nil")
}
if !strings.Contains(err.Error(), "get file") {
t.Errorf("expected error about get file, got: %v", err)
}
})
t.Run("nested ListContents error propagates", func(t *testing.T) {
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {
{Name: "pkg", Path: "src/pkg", Type: "dir"},
},
// "src/pkg" not in map, so recursive ListContents will fail
},
}
_, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err == nil {
t.Fatal("expected error, got nil")
}
if !strings.Contains(err.Error(), "list contents") {
t.Errorf("expected error about list contents, got: %v", err)
}
})
t.Run("canceled context propagates", func(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel() // Cancel immediately
client := &mockFileReader{
contents: map[string][]vcs.ContentEntry{
"src": {
{Name: "main.go", Path: "src/main.go", Type: "file"},
},
},
files: map[string]string{
"src/main.go": "package main",
},
}
_, err := vcs.GetAllFilesInPath(ctx, client, "owner", "repo", "src")
if err == nil {
t.Fatal("expected error from canceled context, got nil")
}
if !strings.Contains(err.Error(), "context canceled") {
t.Errorf("expected context cancellation error, got: %v", err)
}
})
}