Compare commits

..

43 Commits

Author SHA1 Message Date
claw e7efbe2204 fix: address review feedback on PR #106
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 20s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 51s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m43s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m42s
- Remove unused envOrDefaultBool function and its test (Sonnet #3266 NIT)
- Replace Unicode em dashes with ASCII in slog messages (GPT #3267 NIT)
- Add scheme validation for vcsURL before embedding in Markdown link
  (Security #3269 MINOR — defense-in-depth against unsafe schemes)
- Extract ReviewerSelfRequester interface to remove concrete gitea.Adapter
  dependency from main's self-reviewer path (Sonnet #3266 NIT)
- Add compile-time conformance assertion and test for Adapter.RequestReviewerSelf
2026-05-13 08:01:15 -07:00
claw d9179c27ea fix(github): remove double blank line in client_test.go (gofmt)
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 21s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 49s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m3s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m4s
2026-05-13 07:45:19 -07:00
claw 8e3c31cce7 style: remove stray blank line in doRequestWithBody
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 45s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m20s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m18s
2026-05-13 07:25:26 -07:00
claw e72bda0110 fix: address review feedback — gofmt NITs and remove unreachable default
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 26s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 47s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m37s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m38s
- github/client.go: add missing blank line between doRequestWithBody and doJSONRequest
- cmd/review-bot/main.go: remove double blank line before findAllOwnReviews
- cmd/review-bot/main.go: remove unreachable default case in VCS client init switch
  (provider is already validated at startup)
- cmd/review-bot/main_test.go: remove double blank line before TestHasSharedToken
- cmd/review-bot/main_test.go: fix comment alignment (gofmt)
- review/persona_test.go: fix comment alignment in table literal (gofmt)
2026-05-13 07:13:12 -07:00
claw 8eeab96364 fix(cmd): remove duplicate doc comment and double blank line
CI / test (pull_request) Successful in 18s
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 1m2s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m31s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 3m13s
2026-05-13 06:47:34 -07:00
claw b7f108faf6 docs(cmd,github): clarify type assertion and parameter usage in review superseding
Address sonnet-review feedback on PR #106:

- Document that the type assertion in supersedeOldReviews is guaranteed to
  succeed given the caller's provider switch, with the !ok branch guarding
  against future refactors (comment 18889).
- Clarify that vcsURL is only used in the Gitea path for constructing
  review permalink URLs (comment 18890).
- Add note explaining why the page-limit warning in ListReviews only fires
  when the final page is full, confirming the logic is intentional
  (comment 18891).
2026-05-13 06:47:34 -07:00
claw 528d29b63d fix: address self-review findings
- Remove dead code: findOwnReview (replaced by findAllOwnReviews)
- Check SetRetryBackoff return value in doJSONRequest tests
- Extract doWithRetry shared helper to eliminate ~100 lines of
  duplicated 429-retry/backoff/Retry-After logic between doRequest
  and doJSONRequest
- Fix import order: context before encoding/json (goimports)
- Add slog.Warn when ListReviews hits maxReviewPages limit
2026-05-13 06:47:28 -07:00
claw 0e503357ed fix: address review feedback on PR #106
- Add 429 rate-limit retry logic to doJSONRequest (matching doRequest
  behavior) so write operations (PostReview, DismissReview) properly
  retry when rate-limited by GitHub
- Remove redundant explicit case for ReviewEventComment in
  translateReviewEvent (default already handles it)
- Add ordering comment on --gitea-url alias registration explaining
  the dependency on registration-before-parse evaluation order
- Add tests for doJSONRequest retry/exhaust behavior
2026-05-13 06:46:17 -07:00
claw f50920333f fix(review): address bot review feedback on PR #106
- Document --gitea-url/--vcs-url last-one-wins behavior when both flags
  are passed simultaneously (sonnet MINOR #1)
- Move doJSONRequest from github/reviews.go to github/client.go where
  other HTTP helpers live (sonnet MINOR #2)
- Return joined error from supersedeOldReviews GitHub case instead of
  silently swallowing DismissReview failures (sonnet MINOR #3)
- Fix evaluateCIStatus to distinguish 'all checks passed' from 'no
  failures (N pending)' to avoid misleading status (gpt MINOR #2)
- Extract reviewsPerPage and maxReviewPages named constants for
  ListReviews pagination (gpt NIT #3)
2026-05-13 06:45:55 -07:00
claw b47bab08d6 fix(review): address inline review feedback on PR #106
- Reword misleading 'Fall through' comment to 'Continue to' in
  supersedeOldReviews (comment #18704)
- Add shared-pointer explanation comment for --gitea-url alias
  registration (comment #18703)
- Add comment clarifying CommitID same-commit expectation in
  PostReview (comment #18705)
- Rename 'hidden alias' to 'backward-compatible alias' in flag
  comment (comment #18708)
2026-05-13 06:44:32 -07:00
claw 550ba44354 fix(cmd): clarify empty gitea case control flow in supersedeOldReviews
The empty case "gitea": body exits the switch and continues to the
Gitea-specific logic below. Replace the vague comment with an explicit
note about the fall-through intent, per self-review feedback.
2026-05-13 06:44:32 -07:00
claw 6a3f63a726 fix(cmd,github): address review feedback on PR #106
- Replace panic() with fmt.Fprintf+os.Exit(1) in provider switch default
  (repo convention: never panic)
- Remove spurious 'event' field from DismissReview payload (GitHub dismiss
  endpoint only documents 'message')
- Change translateReviewEvent default to return 'COMMENT' as canonical
  fallback instead of passing unknown events through to GitHub API
- Refactor supersedeOldReviews to use explicit switch/case with default
  error for exhaustiveness
2026-05-13 06:44:32 -07:00
claw 0ea9c93af6 fix: address review feedback on PR #106
- Replace interface{} with any in github/reviews.go (Go 1.18+ idiom)
- Add default panic case to VCS client init switch
- Refactor supersedeOldReviews to return error instead of os.Exit(1)
- Remove spurious blank lines in formatter.go and formatter_test.go
- Add doc comment to DeleteReview explaining when to use vs DismissReview
- Sanitize extractSentinelName output to prevent log injection
2026-05-13 06:44:32 -07:00
claw bee4d98b8c feat(cmd): wire --provider and --base-url flags into CLI
- Add --provider flag (gitea|github) for VCS backend selection
- Add --base-url flag for GitHub API endpoint configuration
- Rename --gitea-url to --vcs-url with backward-compatible alias
- Replace direct gitea.Client usage with vcs.Client interface
- Create vcs.Client via factory switch based on --provider value
- Implement Reviewer + Identity interfaces on github.Client
- Add verdictToEvent() using canonical vcs.ReviewEvent types
- Remove review.GiteaEvent() (replaced by verdictToEvent)
- GitHub supersede uses DismissReview; Gitea keeps EditComment flow
- Add VCS_PROVIDER, VCS_BASE_URL, VCS_URL env var support

Closes #82
2026-05-13 06:44:32 -07:00
claw 691788833e docs(deps): update CONVENTIONS.md allowlist for go-yaml
Update the approved dependency table to document go-yaml subpackage
usage (ast, parser) and remove the deviation comment now that the
proper allowlist process is being followed.

Closes #91
2026-05-13 06:44:13 -07:00
claw d8c8a743ed fix: address review #2888 findings (comment clarity, test cleanup)
- Clarify depth-aware short-circuit comment to unambiguously describe
  the relationship between current depth and previous validation depth
- Add comment to MappingValueNode case explaining intentional depth+2
  behavior from parent MappingNode perspective
- Restructure unmarshalYAMLWithDepthLimit doc comment as bullet list
  covering all three safety checks (depth, multi-doc, strict fields)
- Replace t.Error with t.Fatal in TestYAMLEmptyFileRejection to remove
  redundant nil guard on subsequent err.Error() call
2026-05-13 06:44:13 -07:00
claw 629e29806c fix: handle MergeKeyNode explicitly in depth check, add size limit to ParsePersonaBytes
- Add explicit case for *ast.MergeKeyNode in checkYAMLDepth switch to
  make it clear this is an intentional leaf (no children to recurse)
  rather than relying on the default case. Prevents future library
  changes from silently bypassing depth checks.

- Add MaxPersonaFileSize bound check at the top of ParsePersonaBytes.
  While callers already check size, the public API should defend itself
  (defense in depth) against arbitrarily large inputs that could cause
  excessive memory/CPU before AST validation runs.

- Add tests for both behaviors.

Addresses review #2879 findings.
2026-05-13 06:44:13 -07:00
claw 787ac3b736 docs: address review findings on YAML depth validation
- Add safety note on Strict() decoder not expanding aliases recursively,
  since alias resolution uses the pre-validated AST (finding #1)
- Document that ast.Node map keys rely on pointer identity, which holds
  because all goccy/go-yaml AST types are pointer receivers (finding #2)
- Clarify AnchorNode comment: effective depth budget is reduced for
  anchor+alias pairs, not literally halved (finding #3)
- Improve test depth trace comment for accuracy (finding #4)
- Add HTML comment in CONVENTIONS.md referencing #91 for the two-step
  process deviation (finding #5)
2026-05-13 06:44:13 -07:00
claw 1200ef700d test: use per-subtest TempDir in TestYAMLEmptyFileRejection
Move t.TempDir() inside each subtest for idiomatic test isolation,
as suggested by reviewers.
2026-05-13 06:44:13 -07:00
claw 5e36547a1c fix: correct comment accuracy and improve trailing-content check clarity
- Fix validated map comment: says 'minimum depth' but stores the maximum
  depth at which a node was validated (overwritten on deeper visits).
- Replace dec.More() with explicit dec.Decode check for trailing JSON
  content. More() is documented for use inside arrays/objects; the
  explicit EOF check is clearer at the top-level stream.
2026-05-13 06:44:13 -07:00
claw 0522cfe3cb address self-review findings on PR #89
MINOR fixes:
- docs/DESIGN-57-yaml-persona.md: fix Error Cases table entry to reflect
  custom AST walk (checkYAMLDepth) instead of stale library-level reference
- review/persona.go: add EOF check after JSON decode to reject trailing
  garbage after a valid JSON object (prevents silent acceptance of malformed
  input like '{"name":"x"}garbage')
- review/persona_test.go: add TestJSONTrailingContentRejected test

NIT fixes:
- review/persona.go: add default case to checkYAMLDepth switch with
  explanatory comment about scalar leaf nodes
- review/persona.go: document AnchorNode depth+1 conservative asymmetry
- review/persona.go: simplify redundant if-guard in ListBuiltinPersonas
2026-05-13 06:44:13 -07:00
claw ee84c64151 fix(review): address review 2792 feedback
- Document nodeCount overcounting as intentional conservative behavior
  (bounds total validation work, not unique nodes)
- Improve TestYAMLDeeplyNestedRejection comment with concrete depth trace
- Replace outdated gopkg.in/yaml.v3 pseudocode in design doc with
  reference to authoritative implementation
- Update PR description to clarify pre-approval via issue #57
2026-05-13 06:44:13 -07:00
claw 3a1e5e443e fix(review): address feedback from reviews 2788, 2789, 2791
- Move nodeCount increment after cycle detection to avoid over-counting
  cyclic references (sonnet #2)
- Use underscores in test case names used as filenames (sonnet #3)
- Fix function comment: 'prevent silent data loss' → 'prevent confusing
  behavior where additional documents are silently ignored' (sonnet #4)
- Mark design doc pseudocode as historical since implementation uses
  goccy/go-yaml ast.Node, not gopkg.in/yaml.v3 yaml.Node (sonnet #5)
2026-05-13 06:44:13 -07:00
claw 9b1e93bfde fix(security): prevent alias depth bypass in YAML validator
The global 'seen' set allowed anchored subtrees validated at a shallow
depth to be skipped when later referenced via alias at a greater depth.
This could let effective nesting exceed MaxYAMLDepth, enabling DoS.

Fix: replace the single 'seen' set with two tracking maps:
- validated (node -> min depth): only short-circuits when current depth
  <= previously validated depth; re-checks at deeper contexts.
- visiting (node -> bool): per-path recursion stack for true cycle
  detection (breaks alias loops without suppressing depth checks).

Add TestYAMLAliasDepthBypass that constructs a document with an
anchored 15-level subtree referenced via alias under 6 levels of
nesting, verifying the combined effective depth (22) is rejected.

Addresses security-review-bot findings on review #2774.
2026-05-13 06:44:13 -07:00
rodin dbbc7bda6a docs: update DESIGN-57 to reflect goccy/go-yaml as the supported YAML library 2026-05-13 06:44:13 -07:00
rodin 93e2fbd58d docs: update YAML library to github.com/goccy/go-yaml in CONVENTIONS.md 2026-05-13 06:44:13 -07:00
claw 6f4461b0f7 fix(review): address review feedback on persona YAML handling
- Reorder empty doc check before multi-doc check for natural flow
- Detect nil-body docs (whitespace-only, comment-only input)
- Add explanatory comment on pointer identity for cycle detection map
- Improve depth-counting test comment with AST walker specifics
- Add TestYAMLEmptyFileRejection covering empty/whitespace/comment inputs

Addresses MINOR and NIT findings from sonnet, gpt, and security reviews.
MAJOR (allowlist violation) tracked in issue #91.
2026-05-13 06:44:13 -07:00
claw 32c89330aa fix(deps): replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml
Fixes #87.

PR #58 incorrectly added gopkg.in/yaml.v3 (abandoned library) instead of
github.com/goccy/go-yaml as required by issue #57.

Changes:
- Replace gopkg.in/yaml.v3 with github.com/goccy/go-yaml v1.19.2
- Update review/persona.go to use goccy/go-yaml API:
  - parser.ParseBytes for AST-based depth/node count checking
  - yaml.Strict() decoder option instead of KnownFields(true)
  - ast.Node types instead of yaml.Node for tree walking
- Update review/persona_test.go to use ast types for cycle tests
- Remove gopkg.in/yaml.v3 from go.mod and go.sum

All existing YAML tests pass with the new library.
2026-05-13 06:44:13 -07:00
rodin 2036f57011 fix(patterns): default patterns-files to empty (fetch all) (#77) 2026-05-13 06:44:13 -07:00
aweiker 4e8c676515 Merge pull request 'feat(github): implement Reviewer and Identity interfaces (#81)' (#105) from review-bot-issue-81 into feature/github-support
Reviewed-on: #105
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 13:39:13 +00:00
claw 027bad2f7c fix(github): add DismissReview Event comment; use t.Fatalf for routing assertions
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 33s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 50s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m48s
- Add comment in DismissReview explaining why the Event field is required
  by the GitHub API even though DISMISS is the only valid value (#18652).
- Change t.Errorf to t.Fatalf for method/path routing assertions in test
  handlers so failures are immediately fatal instead of silently
  continuing handler execution (#18653).
2026-05-13 13:18:46 +00:00
claw cd8a1becb3 test(github): use t.Run subtests in TestTranslateGitHubReviewState; doc: note nil body in DeleteReview 2026-05-13 13:18:46 +00:00
claw 9dd5e8dbac fix(github): validate conflicting commit IDs and extract test helper
Address review findings from sonnet-review-bot (review 3086):

- PostReview now returns ErrConflictingCommitIDs when comments specify
  different non-empty CommitIDs, since the GitHub API accepts only a
  single commit_id per review. Previously the discrepancy was silently
  ignored, using only the first commit's ID.

- Extract newTestClient into helpers_test.go to make cross-file sharing
  between review_test.go and identity_test.go explicit.

Refs: #81
2026-05-13 13:18:46 +00:00
claw 8b256360bf fix(github): clarify PostReview doc comment, rename test field to 'want'
Address review feedback from round-3 sonnet review:
- PostReview doc comment now accurately describes vcs.ReviewEvent → GitHub
  wire-format string cast and notes nil-Comments omitempty behavior.
- Rename 'expected' field to 'want' in TestTranslateGitHubReviewState to
  match the project's established naming convention.
2026-05-13 13:18:46 +00:00
claw 293296b50c address review feedback: wrap ErrCannotDeleteSubmittedReview, fix nits
- Wrap ErrCannotDeleteSubmittedReview with operation context via fmt.Errorf
  so callers get both sentinel identity and context (MINOR fix)
- Combine double iteration in PostReview into single loop (NIT)
- Remove extra trailing blank line in review_test.go (NIT)
- Clarify translateGitHubReviewState comment re: PENDING state (NIT)
- Update requestOptions.bodyFn comment to mention DELETE-with-body (NIT)
2026-05-13 13:18:46 +00:00
claw eba97321ad refactor(github): extract doRequestCore, address review feedback
- MAJOR: Extract doRequestCore to eliminate doRequest/doRequestWithBody
  duplication. Both now delegate to a shared implementation with the
  retry/backoff logic in a single place.

- MINOR: Replace custom containsStr/containsSubstring helpers with
  strings.Contains in review_test.go.

- MINOR: Use http.Method* constants (MethodPost, MethodDelete, MethodPut)
  in review.go for consistency with doGet.

- MINOR: Remove redundant APPROVED/DISMISSED cases from
  translateGitHubReviewState that were identical to the default passthrough.

- NIT: Clarify DeleteReview comment about COMMENTED being a GitHub API
  state name.

- DismissReview Event field verified as required by GitHub API docs;
  kept as-is.
2026-05-13 13:18:46 +00:00
claw be3f696a70 feat(github): implement Reviewer and Identity interfaces (#81)
Implement the remaining vcs.Client interface methods for github.Client:

Reviewer:
- PostReview: POST /repos/{owner}/{repo}/pulls/{number}/reviews
- ListReviews: GET /repos/{owner}/{repo}/pulls/{number}/reviews
  with state translation (CHANGES_REQUESTED → REQUEST_CHANGES, etc.)
- DeleteReview: DELETE /repos/{owner}/{repo}/pulls/{number}/reviews/{id}
  Returns ErrCannotDeleteSubmittedReview on 422
- DismissReview: PUT /repos/{owner}/{repo}/pulls/{number}/reviews/{id}/dismissals

Identity:
- GetAuthenticatedUser: GET /user

Infrastructure:
- Add doRequestWithBody helper for POST/PUT/DELETE with JSON bodies
- Update conformance_test.go: var _ vcs.Client = (*github.Client)(nil)

All unit tests pass including error cases (401, 404, 422, malformed).
2026-05-13 13:18:46 +00:00
aweiker 65ba8af244 Merge pull request 'fix(gitea): map hunk-header positions in BuildPositionToLineMap' (#104) from review-bot-issue-97 into feature/github-support
Reviewed-on: #104
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
Reviewed-by: Aaron Weiker <aaron@weiker.org>
2026-05-13 13:15:30 +00:00
claw 02bdd701a5 test(gitea): add hunk-header-at-end error path test
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 20s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 23s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m39s
Adds TestTranslate_HunkHeaderAtEnd covering the edge case where a
hunk-header is the last position in the file with no subsequent
new-file line. Mirrors TestBuildPositionToLineMap_DeletionAtEnd for
the hunk-header code path.

Addresses NIT from sonnet-review-bot on PR #104 (comment 18412).
2026-05-12 23:32:22 -07:00
claw 23dc781908 fix(gitea): map hunk-header positions in BuildPositionToLineMap
CI / test (pull_request) Successful in 28s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 44s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m49s
BuildPositionToLineMap incremented position and updated maxPositions for
@@ hunk-header lines but did not store a map entry, causing Translate()
to return a hard error for any comment positioned at a hunk header.

Store sentinel value 0 for hunk-header positions (analogous to -1 for
deletions) and extend Translate() to fall through to the nearest
context/addition line below, matching the existing deletion-line
behavior.

Fixes #97
2026-05-12 23:13:28 -07:00
aweiker 1960d987ed Merge pull request 'feat(github): implement FileReader interface' (#103) from issue-80-c-file-reader into feature/github-support
Reviewed-on: #103
Reviewed-by: Aaron Weiker <aaron@weiker.org>
Reviewed-by: security-review-bot <10+security-review-bot@noreply.gitea.weiker.me>
2026-05-13 06:05:58 +00:00
claw dca260f582 fix(test): SetRetryBackoff with correct slice length
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 19s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m56s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m33s
Pass 2 elements to SetRetryBackoff (matching maxRetryAttempts-1 = 2)
and check the error return. Previously passing 1 element silently
failed, causing tests to fall back to default {1s, 2s} backoffs.

Fixes self-review finding: 429Retry tests now run in <10ms instead
of ~1s.
2026-05-12 22:47:31 -07:00
aweiker 921599542d feat(github): implement FileReader interface (#80)
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 21s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m6s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m53s
Implement FileReader conformance on the GitHub client: GetFileContent,
ListContents, path helpers, base64 decode. Includes compile-time
conformance checks for both PRReader and FileReader.

Requires PR B (#102). Part 3 of 3 for #80.
2026-05-13 05:33:30 +00:00
28 changed files with 2145 additions and 608 deletions
+9 -26
View File
@@ -104,10 +104,6 @@ inputs:
description: 'Path to custom persona JSON file'
required: false
default: ''
action-repo:
description: 'Repository hosting the review-bot binary (owner/name). Defaults to rodin/review-bot on Gitea, or strat/review-bot on GitHub.'
required: false
default: ''
runs:
using: 'composite'
@@ -116,21 +112,10 @@ runs:
id: version
shell: bash
run: |
SERVER_URL="${{ inputs.gitea-url || github.server_url }}"
# Detect VCS type: Gitea uses /api/v1/, GitHub uses /api/v3/
if echo "$SERVER_URL" | grep -qi 'gitea'; then
API_BASE="${SERVER_URL}/api/v1"
DEFAULT_ACTION_REPO="rodin/review-bot"
else
API_BASE="${SERVER_URL}/api/v3"
DEFAULT_ACTION_REPO="strat/review-bot"
fi
ACTION_REPO="${{ inputs.action-repo || '' }}"
if [ -z "$ACTION_REPO" ]; then
ACTION_REPO="$DEFAULT_ACTION_REPO"
fi
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
if [ "${{ inputs.version }}" = "latest" ]; then
VERSION=$(curl -sSf "${API_BASE}/repos/${ACTION_REPO}/releases?limit=1" \
VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
if [ -z "$VERSION" ]; then
echo "Failed to determine latest version" >&2
@@ -140,8 +125,6 @@ runs:
VERSION="${{ inputs.version }}"
fi
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
echo "action-repo=${ACTION_REPO}" >> "$GITHUB_OUTPUT"
echo "server-url=${SERVER_URL}" >> "$GITHUB_OUTPUT"
- name: Cache review-bot binary
id: cache
@@ -154,14 +137,14 @@ runs:
if: steps.cache.outputs.cache-hit != 'true'
shell: bash
run: |
SERVER_URL="${{ steps.version.outputs.server-url }}"
ACTION_REPO="${{ steps.version.outputs.action-repo }}"
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
VERSION="${{ steps.version.outputs.version }}"
BINARY="review-bot-linux-amd64"
curl -sSfL "${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}/${BINARY}" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL "${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}/checksums.txt" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
-o "${{ runner.temp }}/checksums.txt"
# Verify SHA-256 checksum
@@ -186,8 +169,8 @@ runs:
- name: Run review
shell: bash
env:
GITHUB_SERVER_URL: ${{ inputs.gitea-url || github.server_url }}
GITHUB_REPOSITORY: ${{ inputs.repo || github.repository }}
GITEA_URL: ${{ inputs.gitea-url || github.server_url }}
GITEA_REPO: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
REVIEWER_NAME: ${{ inputs.reviewer-name }}
+7 -24
View File
@@ -104,10 +104,6 @@ inputs:
description: 'Path to custom persona JSON file'
required: false
default: ''
action-repo:
description: 'Repository hosting the review-bot binary (owner/name). Defaults to rodin/review-bot on Gitea, or strat/review-bot on GitHub.'
required: false
default: ''
runs:
using: 'composite'
@@ -116,21 +112,10 @@ runs:
id: version
shell: bash
run: |
SERVER_URL="${{ inputs.gitea-url || github.server_url }}"
# Detect VCS type: Gitea uses /api/v1/, GitHub uses /api/v3/
if echo "$SERVER_URL" | grep -qi 'gitea'; then
API_BASE="${SERVER_URL}/api/v1"
DEFAULT_ACTION_REPO="rodin/review-bot"
else
API_BASE="${SERVER_URL}/api/v3"
DEFAULT_ACTION_REPO="strat/review-bot"
fi
ACTION_REPO="${{ inputs.action-repo || '' }}"
if [ -z "$ACTION_REPO" ]; then
ACTION_REPO="$DEFAULT_ACTION_REPO"
fi
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
if [ "${{ inputs.version }}" = "latest" ]; then
VERSION=$(curl -sSf "${API_BASE}/repos/${ACTION_REPO}/releases?limit=1" \
VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
if [ -z "$VERSION" ]; then
echo "Failed to determine latest version" >&2
@@ -140,8 +125,6 @@ runs:
VERSION="${{ inputs.version }}"
fi
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
echo "action-repo=${ACTION_REPO}" >> "$GITHUB_OUTPUT"
echo "server-url=${SERVER_URL}" >> "$GITHUB_OUTPUT"
- name: Cache review-bot binary
id: cache
@@ -154,14 +137,14 @@ runs:
if: steps.cache.outputs.cache-hit != 'true'
shell: bash
run: |
SERVER_URL="${{ steps.version.outputs.server-url }}"
ACTION_REPO="${{ steps.version.outputs.action-repo }}"
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
VERSION="${{ steps.version.outputs.version }}"
BINARY="review-bot-linux-amd64"
curl -sSfL "${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}/${BINARY}" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL "${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}/checksums.txt" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
-o "${{ runner.temp }}/checksums.txt"
# Verify SHA-256 checksum
-47
View File
@@ -1,47 +0,0 @@
# Self-review workflow for strat/review-bot on GitHub Enterprise Server.
# Tests that the composite action runs correctly on GitHub runners:
# - GITHUB_SERVER_URL and GITHUB_REPOSITORY env vars are set correctly
# - Binary is downloaded from gitea.weiker.me (where releases live)
# - Review is posted to the corresponding Gitea PR
name: Review
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request'
strategy:
matrix:
include:
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
model: anthropic--claude-4.6-sonnet
- name: gpt
token_secret: GPT_REVIEW_TOKEN
model: gpt-5
steps:
- uses: actions/checkout@v4
- name: Run ${{ matrix.name }} review
uses: ./.gitea/actions/review
with:
# Download binary from Gitea (releases live there, not on GHE)
gitea-url: https://gitea.weiker.me
# Post review to the corresponding Gitea repo
repo: rodin/review-bot
reviewer-token: ${{ secrets[matrix.token_secret] }}
reviewer-name: ${{ matrix.name }}
llm-model: ${{ matrix.model }}
llm-provider: aicore
aicore-client-id: ${{ secrets.AICORE_CLIENT_ID }}
aicore-client-secret: ${{ secrets.AICORE_CLIENT_SECRET }}
aicore-auth-url: ${{ secrets.AICORE_AUTH_URL }}
aicore-api-url: ${{ secrets.AICORE_API_URL }}
aicore-resource-group: ${{ secrets.AICORE_RESOURCE_GROUP }}
conventions-file: CONVENTIONS.md
patterns-repo: rodin/go-patterns
patterns-files: 'README.md,patterns/'
dry-run: 'true'
timeout: '600'
+1 -1
View File
@@ -9,7 +9,7 @@
| Package | Use Case | Scope |
|---------|----------|-------|
| `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production |
| `github.com/goccy/go-yaml` | YAML parsing and AST inspection (subpkgs: `ast`, `parser`) | production |
| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |
**Any import not in this table or the Go standard library is forbidden.**
+252 -169
View File
@@ -2,6 +2,7 @@ package main
import (
"context"
"errors"
"flag"
"fmt"
"log/slog"
@@ -13,6 +14,7 @@ import (
"gitea.weiker.me/rodin/review-bot/budget"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review"
"gitea.weiker.me/rodin/review-bot/vcs"
@@ -54,19 +56,22 @@ func main() {
// Logging flags
logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json")
verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error")
// CLI flags
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", envOrDefault("GITHUB_SERVER_URL", "")), "Gitea instance URL")
repo := flag.String("repo", envOrDefault("GITEA_REPO", envOrDefault("GITHUB_REPOSITORY", "")), "Repository (owner/name)")
// VCS flags
provider := flag.String("provider", envOrDefault("VCS_PROVIDER", "gitea"), "VCS provider: gitea or github")
baseURL := flag.String("base-url", envOrDefault("VCS_BASE_URL", ""), "VCS API base URL (for github provider; defaults to https://api.github.com)")
vcsURL := flag.String("vcs-url", envOrDefault("VCS_URL", envOrDefault("GITEA_URL", envOrDefault("GITHUB_SERVER_URL", ""))), "VCS instance URL (Gitea) [deprecated alias: --gitea-url]")
// Keep --gitea-url as backward-compatible alias (flag package doesn't support aliases natively, handle below)
repo := flag.String("repo", envOrDefault("VCS_REPO", envOrDefault("GITEA_REPO", envOrDefault("GITHUB_REPOSITORY", ""))), "Repository (owner/name)")
prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number")
reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name")
reviewerToken := flag.String("reviewer-token", envOrDefault("REVIEWER_TOKEN", ""), "Gitea token for posting review")
reviewerToken := flag.String("reviewer-token", envOrDefault("REVIEWER_TOKEN", ""), "VCS token for posting review")
llmBaseURL := flag.String("llm-base-url", envOrDefault("LLM_BASE_URL", ""), "LLM API base URL")
llmAPIKey := flag.String("llm-api-key", envOrDefault("LLM_API_KEY", ""), "LLM API key")
llmModel := flag.String("llm-model", envOrDefault("LLM_MODEL", ""), "LLM model name")
conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)")
systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions")
patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", "README.md"), "Comma-separated file paths to fetch from patterns repo")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", ""), "Comma-separated file paths to fetch from patterns repo (empty = all files)")
dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting")
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
@@ -80,6 +85,18 @@ func main() {
aicoreAPIURL := flag.String("aicore-api-url", envOrDefault("AICORE_API_URL", ""), "SAP AI Core API URL (for provider=aicore)")
aicoreResourceGroup := flag.String("aicore-resource-group", envOrDefault("AICORE_RESOURCE_GROUP", "default"), "SAP AI Core resource group (for provider=aicore)")
// Register --gitea-url as a backward-compatible alias for --vcs-url.
// StringVar shares the *string pointer with vcsURL, so whichever flag is
// set last by flag.Parse wins — both point to the same underlying value.
// NOTE: If a user passes both --vcs-url and --gitea-url, the last one on
// the command line takes effect (standard flag package behavior). This is
// acceptable since --gitea-url is deprecated and both serve the same purpose.
//
// ORDERING: This must remain AFTER vcsURL's flag.String declaration and BEFORE
// flag.Parse(). The *vcsURL dereference captures the env-var-resolved default
// at registration time; moving flag.Parse() above this line would break it.
flag.StringVar(vcsURL, "gitea-url", *vcsURL, "Deprecated: use --vcs-url instead")
flag.Parse()
if *versionFlag {
@@ -92,12 +109,25 @@ func main() {
slog.Info("review-bot starting", "version", version)
// Validate VCS provider
switch *provider {
case "gitea", "github":
// valid
default:
fmt.Fprintf(os.Stderr, "Error: invalid --provider %q (valid: gitea, github)\n", *provider)
os.Exit(1)
}
// Validate required fields
// For aicore provider, llm-base-url and llm-api-key are not required
isAICore := llm.Provider(*llmProvider) == llm.ProviderAICore
if *giteaURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
if *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n")
fmt.Fprintf(os.Stderr, "Required: --gitea-url, --repo, --pr, --reviewer-token, --llm-model\n")
fmt.Fprintf(os.Stderr, "Required: --repo, --pr, --reviewer-token, --llm-model\n")
os.Exit(1)
}
// --vcs-url is required only for gitea provider
if *provider == "gitea" && *vcsURL == "" {
fmt.Fprintf(os.Stderr, "Error: --vcs-url (or --gitea-url) is required for provider=gitea\n")
os.Exit(1)
}
if !isAICore && (*llmBaseURL == "" || *llmAPIKey == "") {
@@ -116,8 +146,6 @@ func main() {
os.Exit(1)
}
// NOTE: Persona loading deferred until after Gitea client init to support repo personas
// Validate reviewer-name: only safe characters allowed in sentinel
if err := validateReviewerName(*reviewerName); err != nil {
slog.Error("invalid reviewer name", "error", err)
@@ -139,8 +167,22 @@ func main() {
os.Exit(1)
}
// Initialize clients
giteaClient := gitea.NewClient(*giteaURL, *reviewerToken)
// Initialize VCS client
var client vcs.Client
switch *provider {
case "gitea":
giteaClient := gitea.NewClient(*vcsURL, *reviewerToken)
client = gitea.NewAdapter(giteaClient)
case "github":
ghBaseURL := *baseURL
if ghBaseURL == "" {
ghBaseURL = "https://api.github.com"
}
client = github.NewClient(*reviewerToken, ghBaseURL)
}
slog.Info("VCS client initialized", "provider", *provider)
// Initialize LLM client
llmClient := llm.NewClient(*llmBaseURL, *llmAPIKey, *llmModel)
if *llmTemp < 0 || *llmTemp > 2 {
slog.Error("invalid LLM temperature", "temperature", *llmTemp, "range", "0-2")
@@ -174,16 +216,13 @@ func main() {
ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)
defer cancel()
// Load persona if specified (after Gitea client init to support repo personas)
// Load persona if specified
var persona *review.Persona
if *personaName != "" {
// Try loading from repo first, then fall back to built-in
repoPersonas, err := review.LoadRepoPersonas(ctx, newGiteaClientAdapter(giteaClient), owner, repoName)
repoPersonas, err := review.LoadRepoPersonas(ctx, client, owner, repoName)
if err != nil {
slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)
// Continue with built-in personas only.
// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go
// (returns the zero value), so the fallback to built-in below works correctly.
}
if p, ok := repoPersonas[*personaName]; ok {
persona = p
@@ -214,7 +253,7 @@ func main() {
slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName))
// Step 1: Fetch PR metadata
pr, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
pr, err := client.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
slog.Error("failed to fetch PR", "pr", prNumber, "error", err)
os.Exit(1)
@@ -222,7 +261,7 @@ func main() {
slog.Info("fetched PR metadata", "pr", prNumber, "title", pr.Title)
// Step 2: Fetch diff
diff, err := giteaClient.GetPullRequestDiff(ctx, owner, repoName, prNumber)
diff, err := client.GetPullRequestDiff(ctx, owner, repoName, prNumber)
if err != nil {
slog.Error("failed to fetch diff", "pr", prNumber, "error", err)
os.Exit(1)
@@ -231,21 +270,21 @@ func main() {
// Step 3: Fetch full file content for modified files
fileContext := ""
files, err := giteaClient.GetPullRequestFiles(ctx, owner, repoName, prNumber)
files, err := client.GetPullRequestFiles(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not fetch PR files list", "pr", prNumber, "error", err)
} else {
fileContext = fetchFileContext(ctx, giteaClient, owner, repoName, pr.Head.Ref, files)
fileContext = fetchFileContext(ctx, client, owner, repoName, pr.Head.Ref, files)
slog.Debug("fetched file context", "files", len(files))
}
// Step 4: Check CI status
ciPassed := true
ciDetails := ""
if pr.Head.Sha != "" {
statuses, err := giteaClient.GetCommitStatuses(ctx, owner, repoName, pr.Head.Sha)
if pr.Head.SHA != "" {
statuses, err := client.GetCommitStatuses(ctx, owner, repoName, pr.Head.SHA)
if err != nil {
slog.Warn("could not fetch CI status", "sha", pr.Head.Sha, "error", err)
slog.Warn("could not fetch CI status", "sha", pr.Head.SHA, "error", err)
} else {
ciPassed, ciDetails = evaluateCIStatus(statuses)
slog.Info("CI status checked", "passed", ciPassed)
@@ -255,7 +294,7 @@ func main() {
// Step 5: Load conventions file if specified
conventions := ""
if *conventionsFile != "" {
content, err := giteaClient.GetFileContent(ctx, owner, repoName, *conventionsFile)
content, err := client.GetFileContent(ctx, owner, repoName, *conventionsFile, "")
if err != nil {
slog.Warn("could not load conventions file", "file", *conventionsFile, "error", err)
} else {
@@ -267,7 +306,7 @@ func main() {
// Step 6: Load patterns from external repo if specified
patterns := ""
if *patternsRepo != "" {
patterns = fetchPatterns(ctx, giteaClient, *patternsRepo, *patternsFiles)
patterns = fetchPatterns(ctx, client, *patternsRepo, *patternsFiles)
slog.Debug("loaded patterns", "repo", *patternsRepo, "bytes", len(patterns))
}
@@ -360,15 +399,16 @@ func main() {
}
// Add commit footer so readers know which commit was evaluated
if pr.Head.Sha != "" {
shortSHA := pr.Head.Sha
if pr.Head.SHA != "" {
shortSHA := pr.Head.SHA
if len(shortSHA) > 8 {
shortSHA = shortSHA[:8]
}
reviewBody += fmt.Sprintf("\n\n---\n*Evaluated against %s*", shortSHA)
}
event := review.GiteaEvent(result.Verdict)
// Map verdict to canonical review event
event := verdictToEvent(result.Verdict)
if *dryRun {
fmt.Println("--- DRY RUN ---")
@@ -380,34 +420,40 @@ func main() {
sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName)
// Stale check: verify HEAD hasn't moved since we started
evaluatedSHA := pr.Head.Sha
evaluatedSHA := pr.Head.SHA
var currentSHA string
currentPR, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
currentPR, err := client.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not re-fetch PR for stale check", "pr", prNumber, "error", err)
// currentSHA stays empty — shouldSkipStaleReview will return false
} else {
currentSHA = currentPR.Head.Sha
currentSHA = currentPR.Head.SHA
}
if shouldSkipStaleReview(evaluatedSHA, currentSHA) {
slog.Warn("HEAD moved during review skipping stale review",
slog.Warn("HEAD moved during review -- skipping stale review",
"evaluated", evaluatedSHA,
"current", currentSHA,
"pr", prNumber)
return
}
// Map findings to inline comments for lines present in the diff
diffRanges := gitea.ParseDiffNewLines(diff)
var inlineComments []gitea.ReviewComment
// Build line→position map for inline comments
lineToPosition := vcs.BuildLineToPositionMap(diff)
var inlineComments []vcs.ReviewComment
for _, f := range result.Findings {
if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
inlineComments = append(inlineComments, gitea.ReviewComment{
Path: f.File,
NewPosition: int64(f.Line),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
})
if f.File == "" || f.Line <= 0 {
continue
}
pos, ok := lineToPosition[f.File][f.Line]
if !ok {
slog.Warn("line not in diff, skipping comment", "file", f.File, "line", f.Line)
continue
}
inlineComments = append(inlineComments, vcs.ReviewComment{
Path: f.File,
Position: pos,
CommitID: pr.Head.SHA,
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
})
}
if len(inlineComments) > 0 {
slog.Debug("attaching inline comments", "count", len(inlineComments))
@@ -416,10 +462,9 @@ func main() {
// --- Review update strategy ---
// 1. POST new review first (gets non-stale approval badge on HEAD)
// 2. Then supersede old review with link to the new one
// Order matters: post first so we have the new review's URL for the supersede message.
var oldReviews []gitea.Review
var oldReviews []vcs.Review
if *reviewerName != "" {
existingReviews, err := giteaClient.ListReviews(ctx, owner, repoName, prNumber)
existingReviews, err := client.ListReviews(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not list existing reviews", "pr", prNumber, "error", err)
} else {
@@ -431,74 +476,141 @@ func main() {
}
}
// Self-request as reviewer (ensures we appear in required-reviewer checks)
authUser, err := giteaClient.GetAuthenticatedUser(ctx)
if err != nil {
slog.Warn("could not determine authenticated user for reviewer self-request", "error", err)
} else if authUser != "" {
if err := giteaClient.RequestReviewer(ctx, owner, repoName, prNumber, authUser); err != nil {
slog.Warn("could not self-request as reviewer", "user", authUser, "error", err)
} else {
slog.Debug("self-requested as reviewer", "user", authUser, "pr", prNumber)
// Self-request as reviewer (Gitea-specific; ensures we appear in required-reviewer checks)
if selfReq, ok := client.(vcs.ReviewerSelfRequester); ok {
authUser, err := client.GetAuthenticatedUser(ctx)
if err != nil {
slog.Warn("could not determine authenticated user for reviewer self-request", "error", err)
} else if authUser != "" {
if err := selfReq.RequestReviewerSelf(ctx, owner, repoName, prNumber, authUser); err != nil {
slog.Warn("could not self-request as reviewer", "user", authUser, "error", err)
} else {
slog.Debug("self-requested as reviewer", "user", authUser, "pr", prNumber)
}
}
} else {
slog.Debug("RequestReviewer not supported for provider, skipping")
}
// POST new review
slog.Info("posting review", "event", event, "pr", prNumber)
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, inlineComments)
reviewReq := vcs.ReviewRequest{
Body: reviewBody,
Event: event,
Comments: inlineComments,
}
posted, err := client.PostReview(ctx, owner, repoName, prNumber, reviewReq)
if err != nil {
slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err)
os.Exit(1)
}
slog.Info("review posted", "review_id", posted.ID, "user", posted.User.Login, "pr", prNumber)
// Supersede all old reviews with link to the new one
// Supersede all old reviews
if len(oldReviews) > 0 {
newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*giteaURL, "/"), owner, repoName, prNumber, posted.ID)
for _, oldReview := range oldReviews {
cid, err := giteaClient.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err)
continue
}
supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel)
if err := giteaClient.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err)
continue
}
slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", posted.ID, "pr", prNumber)
// Resolve old review's inline comments
oldComments, err := giteaClient.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err)
continue
}
resolved, failed := 0, 0
for _, c := range oldComments {
if c.ID == 0 {
continue
}
if err := giteaClient.ResolveComment(ctx, owner, repoName, c.ID); err != nil {
slog.Debug("could not resolve inline comment", "comment_id", c.ID, "error", err)
failed++
} else {
resolved++
}
}
if resolved > 0 {
slog.Info("resolved old inline comments", "review_id", oldReview.ID, "count", resolved, "pr", prNumber)
}
if failed > 0 {
slog.Warn("some inline comments could not be resolved", "review_id", oldReview.ID, "failed", failed, "pr", prNumber)
}
if err := supersedeOldReviews(ctx, client, *provider, *vcsURL, owner, repoName, prNumber, oldReviews, posted.ID, sentinel); err != nil {
slog.Error("failed to supersede old reviews", "error", err)
os.Exit(1)
}
}
}
// verdictToEvent maps a verdict string from the LLM response to a canonical vcs.ReviewEvent.
func verdictToEvent(verdict string) vcs.ReviewEvent {
switch verdict {
case "APPROVE":
return vcs.ReviewEventApprove
case "REQUEST_CHANGES":
return vcs.ReviewEventRequestChanges
default:
return vcs.ReviewEventComment
}
}
// supersedeOldReviews marks prior reviews as superseded so only the latest review is visible.
// For GitHub: dismisses old reviews (vcsURL is unused in this path).
// For Gitea: edits the review body with a link to the new review and resolves inline comments.
//
// The vcsURL parameter is only used in the Gitea path to construct review permalink URLs;
// it is accepted unconditionally to keep the function signature uniform across providers.
func supersedeOldReviews(ctx context.Context, client vcs.Client, provider, vcsURL, owner, repoName string, prNumber int, oldReviews []vcs.Review, newReviewID int64, sentinel string) error {
switch provider {
case "github":
// Best-effort dismissal: attempt all reviews, join any errors.
var errs []error
for _, old := range oldReviews {
if err := client.DismissReview(ctx, owner, repoName, prNumber, old.ID, "Superseded by new review"); err != nil {
slog.Warn("failed to dismiss review", "id", old.ID, "error", err)
errs = append(errs, fmt.Errorf("dismiss review %d: %w", old.ID, err))
} else {
slog.Info("dismissed old review", "review_id", old.ID, "new_review_id", newReviewID, "pr", prNumber)
}
}
return errors.Join(errs...)
case "gitea":
// Continue to Gitea-specific logic below the switch.
default:
return fmt.Errorf("supersedeOldReviews: unsupported provider %q", provider)
}
// The type assertion below is guaranteed to succeed: the caller's provider switch
// ensures we only reach this point when provider == "gitea", and the gitea provider
// always constructs a *gitea.Adapter. The !ok branch guards against future refactors
// (e.g. wrapping the adapter in a decorator) that would silently break this path.
giteaAdapter, ok := client.(*gitea.Adapter)
if !ok {
return fmt.Errorf("expected gitea.Adapter for gitea provider, got %T", client)
}
underlying := giteaAdapter.Underlying()
// Validate vcsURL scheme before embedding in Markdown link (defense-in-depth).
if !strings.HasPrefix(vcsURL, "http://") && !strings.HasPrefix(vcsURL, "https://") {
return fmt.Errorf("supersedeOldReviews: vcsURL must have http or https scheme, got %q", vcsURL)
}
newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(vcsURL, "/"), owner, repoName, prNumber, newReviewID)
for _, oldReview := range oldReviews {
cid, err := underlying.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err)
continue
}
supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel)
if err := underlying.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err)
continue
}
slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", newReviewID, "pr", prNumber)
// Resolve old review's inline comments
oldComments, err := underlying.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err)
continue
}
resolved, failed := 0, 0
for _, c := range oldComments {
if c.ID == 0 {
continue
}
if err := underlying.ResolveComment(ctx, owner, repoName, c.ID); err != nil {
slog.Debug("could not resolve inline comment", "comment_id", c.ID, "error", err)
failed++
} else {
resolved++
}
}
if resolved > 0 {
slog.Info("resolved old inline comments", "review_id", oldReview.ID, "count", resolved, "pr", prNumber)
}
if failed > 0 {
slog.Warn("some inline comments could not be resolved", "review_id", oldReview.ID, "failed", failed, "pr", prNumber)
}
}
return nil
}
// fetchFileContext fetches the full content of modified files from the PR branch.
func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, ref string, files []gitea.ChangedFile) string {
func fetchFileContext(ctx context.Context, client vcs.PRReader, owner, repo, ref string, files []vcs.ChangedFile) string {
var sb strings.Builder
for _, f := range files {
if ctx.Err() != nil {
@@ -507,7 +619,7 @@ func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, re
if f.Status == "removed" {
continue // Skip deleted files
}
content, err := client.GetFileContentRef(ctx, owner, repo, f.Filename, ref)
content, err := client.GetFileContentAtRef(ctx, owner, repo, f.Filename, ref)
if err != nil {
slog.Warn("could not fetch file content", "file", f.Filename, "error", err)
continue
@@ -524,11 +636,25 @@ func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, re
// patternsRepo is comma-separated list of owner/name repos.
// patternsFiles is comma-separated list of file paths or directories.
// If a path ends with / or is a directory, all files within it are fetched recursively.
func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patternsFiles string) string {
// If patternsFiles is empty, all files from the repo root are fetched.
func fetchPatterns(ctx context.Context, client vcs.FileReader, patternsRepo, patternsFiles string) string {
var sb strings.Builder
repos := strings.Split(patternsRepo, ",")
paths := strings.Split(patternsFiles, ",")
// Build the list of paths to fetch
var paths []string
if patternsFiles == "" {
// Empty patternsFiles means "fetch all files from repo root"
paths = []string{""}
} else {
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
}
for _, repoRef := range repos {
if ctx.Err() != nil {
@@ -549,12 +675,7 @@ func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patt
var repoSkippedFiles []string
for _, path := range paths {
path = strings.TrimSpace(path)
if path == "" {
continue
}
files, err := client.GetAllFilesInPath(ctx, owner, repo, path)
files, err := vcs.GetAllFilesInPath(ctx, client, owner, repo, path)
if err != nil {
slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err)
continue
@@ -593,18 +714,20 @@ func isPatternFile(path string) bool {
}
// evaluateCIStatus checks if all CI statuses indicate success.
func evaluateCIStatus(statuses []gitea.CommitStatus) (passed bool, details string) {
// Returns passed=true if no checks have failed (pending checks are not treated as failures).
func evaluateCIStatus(statuses []vcs.CommitStatus) (passed bool, details string) {
if len(statuses) == 0 {
return true, "no CI statuses found"
}
var failed []string
var pending int
for _, s := range statuses {
switch s.Status {
case "success":
// good
case "pending":
// treat pending as not-failed
pending++
case "failure", "error":
failed = append(failed, fmt.Sprintf("%s: %s", s.Context, s.Description))
}
@@ -613,6 +736,9 @@ func evaluateCIStatus(statuses []gitea.CommitStatus) (passed bool, details strin
if len(failed) > 0 {
return false, strings.Join(failed, "; ")
}
if pending > 0 {
return true, fmt.Sprintf("no failures (%d pending)", pending)
}
return true, "all checks passed"
}
@@ -643,14 +769,6 @@ func envOrDefaultInt(key string, defaultVal int) int {
return defaultVal
}
func envOrDefaultBool(key string, defaultVal bool) bool {
v := strings.TrimSpace(strings.ToLower(os.Getenv(key)))
if v == "" {
return defaultVal
}
return v == "true" || v == "1" || v == "yes"
}
// validateReviewerName checks that the name contains only safe characters
// for embedding in an HTML comment sentinel ([a-zA-Z0-9_-]).
func validateReviewerName(name string) error {
@@ -728,10 +846,10 @@ func buildSupersededBody(originalBody, commitSHA, newReviewURL, sentinel string)
}
// hasSharedToken detects if another review-bot role posted under the same
// Gitea user. This indicates misconfiguration where two roles share a token
// instead of having separate Gitea accounts. Returns true if shared token
// VCS user. This indicates misconfiguration where two roles share a token
// instead of having separate accounts. Returns true if shared token
// detected (caller should skip update-in-place logic to avoid clobbering).
func hasSharedToken(reviews []gitea.Review, ownSentinel string) bool {
func hasSharedToken(reviews []vcs.Review, ownSentinel string) bool {
ownLogin := ""
for _, r := range reviews {
if strings.Contains(r.Body, ownSentinel) {
@@ -744,7 +862,7 @@ func hasSharedToken(reviews []gitea.Review, ownSentinel string) bool {
}
for _, r := range reviews {
if r.User.Login == ownLogin && strings.Contains(r.Body, "<!-- review-bot:") && !strings.Contains(r.Body, ownSentinel) {
slog.Warn("shared token detected another review-bot role is using the same Gitea user",
slog.Warn("shared token detected -- another review-bot role is using the same VCS user",
"sibling_role", extractSentinelName(r.Body), "user", ownLogin)
return true
}
@@ -765,29 +883,26 @@ func extractSentinelName(body string) string {
if end < 0 {
return "unknown"
}
return rest[:end]
}
// findOwnReview locates the most recent non-superseded review matching the sentinel.
func findOwnReview(reviews []gitea.Review, sentinel string) *gitea.Review {
var best *gitea.Review
for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) {
continue
}
if strings.Contains(reviews[i].Body, "~~Original review~~") {
continue
}
if best == nil || reviews[i].ID > best.ID {
best = &reviews[i]
name := rest[:end]
// Sanitize: strip control characters to prevent log injection.
name = strings.Map(func(r rune) rune {
if r < 0x20 || r == 0x7f {
return -1
}
return r
}, name)
if len(name) > 64 {
name = name[:64]
}
return best
if name == "" {
return "unknown"
}
return name
}
// findAllOwnReviews returns all non-superseded reviews matching the sentinel.
func findAllOwnReviews(reviews []gitea.Review, sentinel string) []gitea.Review {
var result []gitea.Review
func findAllOwnReviews(reviews []vcs.Review, sentinel string) []vcs.Review {
var result []vcs.Review
for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) {
continue
@@ -812,35 +927,3 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
}
return evaluatedSHA != currentSHA
}
// giteaClientAdapter adapts gitea.Client to vcs.FileReader interface.
type giteaClientAdapter struct {
client *gitea.Client
}
func newGiteaClientAdapter(c *gitea.Client) *giteaClientAdapter {
return &giteaClientAdapter{client: c}
}
func (a *giteaClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]vcs.ContentEntry, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
result := make([]vcs.ContentEntry, len(entries))
for i, e := range entries {
result[i] = vcs.ContentEntry{
Name: e.Name,
Path: e.Path,
Type: e.Type,
}
}
return result, nil
}
func (a *giteaClientAdapter) GetFileContent(ctx context.Context, owner, repo, filePath, ref string) (string, error) {
if ref != "" {
return a.client.GetFileContentRef(ctx, owner, repo, filePath, ref)
}
return a.client.GetFileContent(ctx, owner, repo, filePath)
}
+135 -170
View File
@@ -10,7 +10,7 @@ import (
"strings"
"testing"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/vcs"
)
func TestValidateReviewerName(t *testing.T) {
@@ -107,9 +107,7 @@ func TestValidateWorkspacePath(t *testing.T) {
workspace: tmpDir,
path: "/etc/passwd",
wantErr: true,
// Go 1.21+ filepath.Join normalizes absolute paths: Join("/tmp/x", "/etc/passwd")
// becomes "/tmp/x/etc/passwd", which is within workspace but doesn't exist.
errMatch: "failed to resolve",
errMatch: "failed to resolve",
},
{
name: "nonexistent file",
@@ -154,15 +152,14 @@ func TestValidateWorkspacePath(t *testing.T) {
}
}
func makeReview(id int64, login, state string, stale bool, body string) gitea.Review {
r := gitea.Review{
func makeReview(id int64, login, state string, stale bool, body string) vcs.Review {
return vcs.Review{
ID: id,
Body: body,
User: vcs.UserInfo{Login: login},
State: state,
Stale: stale,
}
r.User.Login = login
return r
}
func TestBuildSupersededBody(t *testing.T) {
@@ -213,96 +210,10 @@ func TestBuildSupersededBodyShortSHA(t *testing.T) {
}
}
func TestFindOwnReview(t *testing.T) {
tests := []struct {
name string
reviews []gitea.Review
sentinel string
wantID int64
wantNil bool
}{
{
name: "no reviews",
reviews: nil,
sentinel: "<!-- review-bot:sonnet -->",
wantNil: true,
},
{
name: "found by sentinel",
reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "review body\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantID: 42,
},
{
name: "wrong sentinel",
reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "body\n<!-- review-bot:gpt -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantNil: true,
},
{
name: "multiple reviews, returns first match",
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "old\n<!-- review-bot:gpt -->"),
makeReview(20, "bot", "APPROVED", false, "new\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantID: 20,
},
{
name: "skips superseded review",
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n**Superseded**\n<!-- review-bot:sonnet -->"),
makeReview(20, "bot", "APPROVED", false, "fresh review\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantID: 20,
},
{
name: "only superseded reviews exist",
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantNil: true,
},
{
name: "picks highest ID among matches",
reviews: []gitea.Review{
makeReview(50, "bot", "APPROVED", false, "v1\n<!-- review-bot:sonnet -->"),
makeReview(30, "bot", "APPROVED", false, "v0\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
wantID: 50,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := findOwnReview(tc.reviews, tc.sentinel)
if tc.wantNil {
if got != nil {
t.Errorf("findOwnReview() = %v, want nil", got)
}
} else {
if got == nil {
t.Fatal("findOwnReview() = nil, want non-nil")
}
if got.ID != tc.wantID {
t.Errorf("findOwnReview().ID = %d, want %d", got.ID, tc.wantID)
}
}
})
}
}
func TestHasSharedToken(t *testing.T) {
tests := []struct {
name string
reviews []gitea.Review
reviews []vcs.Review
sentinel string
want bool
}{
@@ -314,36 +225,36 @@ func TestHasSharedToken(t *testing.T) {
},
{
name: "no own review yet - cannot detect",
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "other"}, Body: "<!-- review-bot:gpt --> body"},
reviews: []vcs.Review{
makeReview(1, "other", "APPROVED", false, "<!-- review-bot:gpt --> body"),
},
sentinel: "<!-- review-bot:sonnet -->",
want: false,
},
{
name: "separate users - no shared token",
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "security-review-bot"}, Body: "<!-- review-bot:security --> body"},
reviews: []vcs.Review{
makeReview(1, "sonnet-review-bot", "APPROVED", false, "<!-- review-bot:sonnet --> body"),
makeReview(2, "security-review-bot", "APPROVED", false, "<!-- review-bot:security --> body"),
},
sentinel: "<!-- review-bot:sonnet -->",
want: false,
},
{
name: "shared token detected - same user different sentinels",
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:security --> body"},
reviews: []vcs.Review{
makeReview(1, "sonnet-review-bot", "APPROVED", false, "<!-- review-bot:sonnet --> body"),
makeReview(2, "sonnet-review-bot", "APPROVED", false, "<!-- review-bot:security --> body"),
},
sentinel: "<!-- review-bot:sonnet -->",
want: true,
},
{
name: "three roles same user",
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:security --> body"},
{ID: 3, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:gpt --> body"},
reviews: []vcs.Review{
makeReview(1, "bot", "APPROVED", false, "<!-- review-bot:sonnet --> body"),
makeReview(2, "bot", "APPROVED", false, "<!-- review-bot:security --> body"),
makeReview(3, "bot", "APPROVED", false, "<!-- review-bot:gpt --> body"),
},
sentinel: "<!-- review-bot:sonnet -->",
want: true,
@@ -504,10 +415,56 @@ func TestIsPatternFile(t *testing.T) {
}
}
// TestBuildPatternPaths verifies the path-building logic for fetchPatterns.
// Empty patternsFiles means "fetch all from root" (represented as [""]).
func TestBuildPatternPaths(t *testing.T) {
buildPaths := func(patternsFiles string) []string {
if patternsFiles == "" {
return []string{""}
}
var paths []string
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
return paths
}
tests := []struct {
name string
input string
want []string
}{
{"empty fetches root", "", []string{""}},
{"single file", "README.md", []string{"README.md"}},
{"multiple files", "README.md,PATTERNS.md", []string{"README.md", "PATTERNS.md"}},
{"trims whitespace", " foo.md , bar.md ", []string{"foo.md", "bar.md"}},
{"skips empty between commas", "foo.md,,bar.md", []string{"foo.md", "bar.md"}},
{"directory path", "patterns/", []string{"patterns/"}},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := buildPaths(tc.input)
if len(got) != len(tc.want) {
t.Errorf("buildPaths(%q) = %v, want %v", tc.input, got, tc.want)
return
}
for i := range got {
if got[i] != tc.want[i] {
t.Errorf("buildPaths(%q)[%d] = %q, want %q", tc.input, i, got[i], tc.want[i])
}
}
})
}
}
func TestEvaluateCIStatus(t *testing.T) {
tests := []struct {
name string
statuses []gitea.CommitStatus
statuses []vcs.CommitStatus
wantPassed bool
wantSubstr string
}{
@@ -519,7 +476,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "all success",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "success", Context: "ci/test", Description: "Tests passed"},
},
@@ -528,7 +485,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "one failure",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
},
@@ -537,7 +494,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "error status",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "error", Context: "ci/lint", Description: "Lint error"},
},
wantPassed: false,
@@ -545,16 +502,16 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "pending treated as not-failed",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "pending", Context: "ci/build", Description: "In progress"},
{Status: "success", Context: "ci/test", Description: "Tests passed"},
},
wantPassed: true,
wantSubstr: "all checks passed",
wantSubstr: "no failures",
},
{
name: "multiple failures",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "failure", Context: "ci/build", Description: "Build failed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
},
@@ -563,7 +520,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "mixed with pending and failure",
statuses: []gitea.CommitStatus{
statuses: []vcs.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "pending", Context: "ci/deploy", Description: "Deploying"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
@@ -685,47 +642,6 @@ func TestEnvOrDefaultInt(t *testing.T) {
}
}
func TestEnvOrDefaultBool(t *testing.T) {
tests := []struct {
name string
envVal string
setEnv bool
defaultVal bool
want bool
}{
{"unset returns default true", "", false, true, true},
{"unset returns default false", "", false, false, false},
{"true", "true", true, false, true},
{"TRUE", "TRUE", true, false, true},
{"True", "True", true, false, true},
{"1", "1", true, false, true},
{"yes", "yes", true, false, true},
{"YES", "YES", true, false, true},
{"false", "false", true, true, false},
{"0", "0", true, true, false},
{"no", "no", true, true, false},
{"random string", "random", true, true, false},
{"empty string returns default", "", true, true, true},
{"whitespace true", " true ", true, false, true},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
envKey := "TEST_ENV_BOOL_" + strings.ReplaceAll(tc.name, " ", "_")
if tc.setEnv {
os.Setenv(envKey, tc.envVal)
defer os.Unsetenv(envKey)
} else {
os.Unsetenv(envKey)
}
got := envOrDefaultBool(envKey, tc.defaultVal)
if got != tc.want {
t.Errorf("envOrDefaultBool(%q, %v) = %v, want %v", tc.envVal, tc.defaultVal, got, tc.want)
}
})
}
}
func TestExtractSentinelName_EdgeCases(t *testing.T) {
tests := []struct {
body string
@@ -734,8 +650,8 @@ func TestExtractSentinelName_EdgeCases(t *testing.T) {
{"<!-- review-bot:sonnet --> rest", "sonnet"},
{"<!-- review-bot:gpt-review --> rest", "gpt-review"},
{"no sentinel here", "unknown"},
{"<!-- review-bot:", "unknown"}, // prefix but no suffix
{"prefix <!-- review-bot:abc --> end", "abc"}, // embedded in text
{"<!-- review-bot:", "unknown"}, // prefix but no suffix
{"prefix <!-- review-bot:abc --> end", "abc"}, // embedded in text
}
for _, tc := range tests {
@@ -792,7 +708,7 @@ func TestMainSubprocess_InvalidReviewerName(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--gitea-url", "http://localhost",
"--vcs-url", "http://localhost",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-name", "invalid name",
@@ -820,7 +736,7 @@ func TestMainSubprocess_InvalidRepo(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--gitea-url", "http://localhost",
"--vcs-url", "http://localhost",
"--repo", "invalidrepo",
"--pr", "1",
"--reviewer-token", "tok",
@@ -847,7 +763,7 @@ func TestMainSubprocess_InvalidPRNumber(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--gitea-url", "http://localhost",
"--vcs-url", "http://localhost",
"--repo", "owner/repo",
"--pr", "notanumber",
"--reviewer-token", "tok",
@@ -874,7 +790,7 @@ func TestMainSubprocess_InvalidTemperature(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--gitea-url", "http://localhost",
"--vcs-url", "http://localhost",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
@@ -902,7 +818,7 @@ func TestMainSubprocess_InvalidProvider(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--gitea-url", "http://localhost",
"--vcs-url", "http://localhost",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
@@ -926,7 +842,35 @@ func TestMainSubprocess_InvalidProvider(t *testing.T) {
}
}
// cleanEnv returns environ without any GITEA/LLM/REVIEWER env vars that would
func TestMainSubprocess_InvalidVCSProvider(t *testing.T) {
if os.Getenv("TEST_SUBPROCESS_MAIN") == "1" {
flag.CommandLine = flag.NewFlagSet(os.Args[0], flag.ExitOnError)
os.Args = []string{"review-bot",
"--provider", "invalid",
"--vcs-url", "http://localhost",
"--repo", "owner/repo",
"--pr", "1",
"--reviewer-token", "tok",
"--llm-base-url", "http://localhost",
"--llm-api-key", "key",
"--llm-model", "model",
}
main()
return
}
cmd := exec.Command(os.Args[0], "-test.run=TestMainSubprocess_InvalidVCSProvider")
cmd.Env = append(cleanEnv(), "TEST_SUBPROCESS_MAIN=1")
out, err := cmd.CombinedOutput()
if err == nil {
t.Fatal("expected non-zero exit with invalid VCS provider")
}
if !strings.Contains(string(out), "invalid --provider") {
t.Errorf("expected error about invalid --provider, got: %s", out)
}
}
// cleanEnv returns environ without any GITEA/LLM/REVIEWER/VCS env vars that would
// interfere with testing missing-flag scenarios.
func cleanEnv() []string {
var env []string
@@ -934,6 +878,7 @@ func cleanEnv() []string {
key := strings.SplitN(e, "=", 2)[0]
switch {
case strings.HasPrefix(key, "GITEA_"),
strings.HasPrefix(key, "VCS_"),
strings.HasPrefix(key, "LLM_"),
strings.HasPrefix(key, "REVIEWER_"),
strings.HasPrefix(key, "PR_"),
@@ -951,12 +896,12 @@ func cleanEnv() []string {
}
func TestFindAllOwnReviews(t *testing.T) {
reviews := []gitea.Review{
{ID: 1, Body: "<!-- review-bot:sonnet -->\nfirst review"},
{ID: 2, Body: "<!-- review-bot:gpt -->\nother bot"},
{ID: 3, Body: "<!-- review-bot:sonnet -->\nsecond review"},
{ID: 4, Body: "~~Original review~~\n<!-- review-bot:sonnet -->\nsuperseded"},
{ID: 5, Body: "<!-- review-bot:sonnet -->\nthird review"},
reviews := []vcs.Review{
makeReview(1, "bot", "APPROVED", false, "<!-- review-bot:sonnet -->\nfirst review"),
makeReview(2, "bot", "APPROVED", false, "<!-- review-bot:gpt -->\nother bot"),
makeReview(3, "bot", "APPROVED", false, "<!-- review-bot:sonnet -->\nsecond review"),
makeReview(4, "bot", "APPROVED", false, "~~Original review~~\n<!-- review-bot:sonnet -->\nsuperseded"),
makeReview(5, "bot", "APPROVED", false, "<!-- review-bot:sonnet -->\nthird review"),
}
got := findAllOwnReviews(reviews, "<!-- review-bot:sonnet -->")
@@ -1020,3 +965,23 @@ func TestShouldSkipStaleReview(t *testing.T) {
})
}
}
func TestVerdictToEvent(t *testing.T) {
tests := []struct {
verdict string
want vcs.ReviewEvent
}{
{"APPROVE", vcs.ReviewEventApprove},
{"REQUEST_CHANGES", vcs.ReviewEventRequestChanges},
{"COMMENT", vcs.ReviewEventComment},
{"other", vcs.ReviewEventComment},
{"", vcs.ReviewEventComment},
}
for _, tc := range tests {
got := verdictToEvent(tc.verdict)
if got != tc.want {
t.Errorf("verdictToEvent(%q) = %q, want %q", tc.verdict, got, tc.want)
}
}
}
+10 -31
View File
@@ -9,7 +9,7 @@ JSON is awkward for persona files that contain multi-line text (identity, severi
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
## Proposed Approach
@@ -33,37 +33,16 @@ func parsePersona(data []byte, source string) (*Persona, error) {
### YAML Parsing with Depth Protection
```go
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
var node yaml.Node
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
return err
}
if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
return err
}
return node.Decode(out)
}
We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
`review/persona.go`) rather than relying on library decoder options. Key design
decisions:
func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
}
// Handle alias nodes by following the Alias pointer
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth, maxDepth)
}
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
return err
}
}
return nil
}
```
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
- **Node-count limit:** Conservative overcounting bounds total validation work
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
## State/Data Model
@@ -74,7 +53,7 @@ No new state. Same `Persona` struct, just different parsing.
| Error | Handling |
|-------|----------|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Library rejects (v1.16.0+ fix) |
| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
+7
View File
@@ -16,6 +16,7 @@ type Adapter struct {
// Compile-time interface conformance assertion.
var _ vcs.Client = (*Adapter)(nil)
var _ vcs.ReviewerSelfRequester = (*Adapter)(nil)
// NewAdapter creates a new Adapter wrapping the given gitea Client.
func NewAdapter(client *Client) *Adapter {
@@ -230,3 +231,9 @@ func (a *Adapter) DismissReview(ctx context.Context, owner, repo string, number
func (a *Adapter) GetAuthenticatedUser(ctx context.Context) (string, error) {
return a.client.GetAuthenticatedUser(ctx)
}
// RequestReviewerSelf adds the given user as a requested reviewer on a pull request.
// This implements vcs.ReviewerSelfRequester for the Gitea adapter.
func (a *Adapter) RequestReviewerSelf(ctx context.Context, owner, repo string, number int, user string) error {
return a.client.RequestReviewer(ctx, owner, repo, number, user)
}
+22
View File
@@ -386,3 +386,25 @@ func TestAdapter_GetFileContent_RefRouting(t *testing.T) {
t.Errorf("GetFileContent(ref=\"abc123\") = %q, want %q", got, "content-at-ref")
}
}
func TestAdapter_RequestReviewerSelf(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
t.Errorf("expected POST, got %s", r.Method)
}
expected := "/api/v1/repos/owner/repo/pulls/5/requested_reviewers"
if r.URL.Path != expected {
t.Errorf("path = %q, want %q", r.URL.Path, expected)
}
w.WriteHeader(http.StatusCreated)
}))
defer server.Close()
client := gitea.NewClient(server.URL, "token")
adapter := gitea.NewAdapter(client)
err := adapter.RequestReviewerSelf(context.Background(), "owner", "repo", 5, "bot-user")
if err != nil {
t.Fatalf("RequestReviewerSelf() error = %v", err)
}
}
+11 -4
View File
@@ -11,6 +11,7 @@ import (
type PositionMap struct {
// files maps filename → (position → new-file line number).
// Deletion lines are mapped to -1 (no new-file line).
// Hunk-header lines are mapped to 0 (no new-file line).
files map[string]map[int]int
// maxPositions caches the highest position number per file,
// tracked during construction to avoid O(n) scans at translate time.
@@ -19,8 +20,8 @@ type PositionMap struct {
// Translate converts a GitHub diff-position to a new-file line number for a given file.
// Returns an error if the file is not in the diff or the position is out of range.
// If the position targets a deletion line, it maps to the nearest non-deletion line below;
// if no such line exists, returns an error.
// If the position targets a deletion or hunk-header line, it maps to the nearest
// context/addition line below; if no such line exists, returns an error.
func (pm *PositionMap) Translate(file string, position int) (int, error) {
if pm == nil || pm.files == nil {
return 0, fmt.Errorf("empty position map")
@@ -41,14 +42,18 @@ func (pm *PositionMap) Translate(file string, position int) (int, error) {
}
// lineNum == -1 means this position is a deletion line.
// Map to the nearest non-deletion line below.
if lineNum == -1 {
// lineNum == 0 means this position is a hunk-header line.
// Both map to the nearest context/addition line below.
if lineNum <= 0 {
maxPos := pm.maxPosition(file)
for p := position + 1; p <= maxPos; p++ {
if ln, exists := fileMap[p]; exists && ln > 0 {
return ln, nil
}
}
if lineNum == 0 {
return 0, fmt.Errorf("position %d targets a hunk-header line with no subsequent new-file line in %q", position, file)
}
return 0, fmt.Errorf("position %d targets a deletion line with no subsequent new-file line in %q", position, file)
}
@@ -70,6 +75,7 @@ func (pm *PositionMap) maxPosition(file string) int {
// - A new @@ hunk within the same file continues incrementing (does not reset)
// - Position maps to the new file line number for additions and context lines
// - Deletion lines have a position but no new-file line number (stored as -1)
// - Hunk-header lines have a position but no new-file line number (stored as 0)
func BuildPositionToLineMap(diff string) *PositionMap {
pm := &PositionMap{
files: make(map[string]map[int]int),
@@ -126,6 +132,7 @@ func BuildPositionToLineMap(diff string) *PositionMap {
// Parse hunk headers
if strings.HasPrefix(line, "@@") && currentFile != "" {
position++
pm.files[currentFile][position] = 0 // sentinel: hunk-header has no new-file line
pm.maxPositions[currentFile] = position
newLine = parseHunkStart(line)
continue
+109
View File
@@ -272,3 +272,112 @@ diff --git a/b.go b/b.go
t.Errorf("Translate(b.go, 3) = %d, want 2", got)
}
}
func TestTranslate_HunkHeaderPosition_SingleHunk(t *testing.T) {
// Position 1 is the @@ hunk-header line.
// It should resolve to the first context/addition line below (new line 16).
diff := `diff --git a/file.go b/file.go
index abc..def 100644
--- a/file.go
+++ b/file.go
@@ -16,4 +16,5 @@ func example() {
context line
-deleted line
+added line
context after
`
pm := BuildPositionToLineMap(diff)
got, err := pm.Translate("file.go", 1)
if err != nil {
t.Fatalf("Translate(file.go, 1): unexpected error: %v", err)
}
if got != 16 {
t.Errorf("Translate(file.go, 1) = %d, want 16 (first context/addition line in hunk)", got)
}
}
func TestTranslate_HunkHeaderPosition_MultiHunk(t *testing.T) {
// First hunk: @@ is pos 1, then " line1" (pos 2), "-old" (pos 3), "+new" (pos 4)
// Second hunk: @@ is pos 5, then " func foo() {" (pos 6), "+// added" (pos 7), etc.
// Translating position 5 (second @@) should resolve to new line 10.
diff := `diff --git a/file.go b/file.go
--- a/file.go
+++ b/file.go
@@ -1,3 +1,3 @@ package main
line1
-old
+new
@@ -10,3 +10,4 @@ func foo() {
func foo() {
+ // added
return
}
`
pm := BuildPositionToLineMap(diff)
// Position 5 is the second @@ hunk-header — should resolve to new line 10
got, err := pm.Translate("file.go", 5)
if err != nil {
t.Fatalf("Translate(file.go, 5): unexpected error: %v", err)
}
if got != 10 {
t.Errorf("Translate(file.go, 5) = %d, want 10 (first context/addition line in second hunk)", got)
}
// Also verify first hunk header at position 1 resolves to new line 1
got, err = pm.Translate("file.go", 1)
if err != nil {
t.Fatalf("Translate(file.go, 1): unexpected error: %v", err)
}
if got != 1 {
t.Errorf("Translate(file.go, 1) = %d, want 1 (first context/addition line in first hunk)", got)
}
}
func TestTranslate_HunkHeaderPosition_NewFile(t *testing.T) {
// New file: @@ -0,0 +1,3 @@ is position 1.
// Should resolve to new line 1 (the first addition).
diff := `diff --git a/new.go b/new.go
new file mode 100644
--- /dev/null
+++ b/new.go
@@ -0,0 +1,3 @@
+package main
+
+func init() {}
`
pm := BuildPositionToLineMap(diff)
got, err := pm.Translate("new.go", 1)
if err != nil {
t.Fatalf("Translate(new.go, 1): unexpected error: %v", err)
}
if got != 1 {
t.Errorf("Translate(new.go, 1) = %d, want 1 (first addition line)", got)
}
}
func TestTranslate_HunkHeaderAtEnd(t *testing.T) {
// A hunk-header at the last position with no subsequent new-file line should error.
// This is the hunk-header equivalent of TestBuildPositionToLineMap_DeletionAtEnd.
diff := `diff --git a/file.go b/file.go
--- a/file.go
+++ b/file.go
@@ -1,2 +1,2 @@ package main
line1
-old
+new
@@ -10,2 +10,1 @@ func foo() {
-removed
`
pm := BuildPositionToLineMap(diff)
// Position 5 is the second @@ hunk-header; the only line after it (pos 6) is a
// deletion (lineNum == -1), so there's no positive new-file line to resolve to.
// The hunk-header lookup should fail.
_, err := pm.Translate("file.go", 5)
if err == nil {
t.Error("expected error for hunk-header at end with no subsequent new-file line")
}
}
+70 -15
View File
@@ -4,7 +4,9 @@
package github
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"io"
@@ -192,12 +194,32 @@ func (c *Client) SetRetryBackoff(d []time.Duration) error {
return nil
}
// doRequest performs an HTTP request with retry on 429 rate limit responses.
// It respects the Retry-After header when present (capped at maxRetryAfter).
// Transport errors (network failures, context cancellation) are not retried.
func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) {
// requestOptions holds per-request configuration for doRequestCore.
type requestOptions struct {
// bodyFn returns a fresh io.Reader for the request body on each attempt.
// Must be non-nil for any request that carries a body (POST, PUT, PATCH,
// or DELETE when a body is required by the API).
// Returning a fresh reader on each call allows retries to re-send the body.
bodyFn func() io.Reader
// accept overrides the default Accept header. Empty means "application/vnd.github+json".
accept string
// extraHeaders are additional headers to set on each request attempt.
extraHeaders map[string]string
}
// doRequestCore is the shared implementation for all HTTP requests with retry
// on 429 rate limit responses. It respects the Retry-After header when present
// (capped at maxRetryAfter). Transport errors are not retried.
func (c *Client) doRequestCore(ctx context.Context, method, reqURL string, opts requestOptions) ([]byte, error) {
const maxRetryAfter = 120 * time.Second
// maxErrorBodyBytes limits how much of an error response body is stored.
// Kept small (4 KiB) to reduce the risk of sensitive data leakage if callers
// log APIError.Body directly. Error() further truncates to 200 bytes.
const maxErrorBodyBytes = 4 * 1024
// backoff holds per-attempt delays: backoff[i] is the delay before attempt i+1.
// Length must be maxRetryAttempts-1 (one entry per retry gap).
// SetRetryBackoff validates at configuration time; the default is always valid.
@@ -211,11 +233,6 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
copy(backoff, defaultBackoff)
}
// maxErrorBodyBytes limits how much of an error response body is stored.
// Kept small (4 KiB) to reduce the risk of sensitive data leakage if callers
// log APIError.Body directly. Error() further truncates to 200 bytes.
const maxErrorBodyBytes = 4 * 1024
// Reject non-HTTPS URLs early since the URL is immutable across retries.
if c.token != "" && !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
@@ -246,7 +263,11 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
}
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, nil)
var body io.Reader
if opts.bodyFn != nil {
body = opts.bodyFn()
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, body)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
@@ -257,11 +278,14 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
req.Header.Set("Authorization", "Bearer "+c.token)
}
req.Header.Set("User-Agent", userAgent)
if accept != "" {
req.Header.Set("Accept", accept)
if opts.accept != "" {
req.Header.Set("Accept", opts.accept)
} else {
req.Header.Set("Accept", "application/vnd.github+json")
}
for k, v := range opts.extraHeaders {
req.Header.Set(k, v)
}
resp, err := c.httpClient.Do(req)
if err != nil {
@@ -272,11 +296,11 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
respStatus := resp.StatusCode
retryAfterHeader := resp.Header.Get("Retry-After")
body, done, err := c.handleResponse(resp, maxResponseBytes, maxErrorBodyBytes)
respBody, done, handleErr := c.handleResponse(resp, maxResponseBytes, maxErrorBodyBytes)
if done {
return body, err
return respBody, handleErr
}
lastErr = err
lastErr = handleErr
// Retry on 429 rate limit
if respStatus == http.StatusTooManyRequests && attempt < maxRetryAttempts-1 {
@@ -314,6 +338,13 @@ func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept st
return nil, lastErr
}
// doRequest performs an HTTP request with retry on 429 rate limit responses.
// It respects the Retry-After header when present (capped at maxRetryAfter).
// Transport errors (network failures, context cancellation) are not retried.
func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) {
return c.doRequestCore(ctx, method, reqURL, requestOptions{accept: accept})
}
// handleResponse reads and closes the response body, returning the result.
// It uses defer to ensure the body is always closed regardless of code path.
// Returns (body, done, err) where done=true means the caller should return immediately.
@@ -342,3 +373,27 @@ func (c *Client) handleResponse(resp *http.Response, maxRespBytes int, maxErrByt
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return c.doRequest(ctx, http.MethodGet, reqURL, "")
}
// doRequestWithBody is like doRequest but sends a request body.
// It accepts the raw body bytes and sets Content-Type to application/json.
// Retry semantics match doRequest (retries on 429 with Retry-After support).
func (c *Client) doRequestWithBody(ctx context.Context, method, reqURL string, reqBody []byte) ([]byte, error) {
var opts requestOptions
if reqBody != nil {
opts.bodyFn = func() io.Reader { return bytes.NewReader(reqBody) }
opts.extraHeaders = map[string]string{"Content-Type": "application/json"}
}
return c.doRequestCore(ctx, method, reqURL, opts)
}
// doJSONRequest performs an HTTP request with a JSON body and returns the response body.
// It delegates retry/backoff/429 handling to doRequestWithBody.
// This is a general-purpose helper used by any method that needs to send JSON payloads
// (e.g. PostReview, DismissReview).
func (c *Client) doJSONRequest(ctx context.Context, method, reqURL string, payload any) ([]byte, error) {
jsonBody, err := json.Marshal(payload)
if err != nil {
return nil, fmt.Errorf("marshal request body: %w", err)
}
return c.doRequestWithBody(ctx, method, reqURL, jsonBody)
}
+57 -1
View File
@@ -2,6 +2,7 @@ package github
import (
"context"
"errors"
"net/http"
"net/http/httptest"
"net/url"
@@ -567,7 +568,6 @@ func TestSetHTTPClient_NilRestoresDefault(t *testing.T) {
}
}
func TestSetRetryBackoff_RejectsInvalidLength(t *testing.T) {
c := NewClient("token", "https://api.github.com")
@@ -592,3 +592,59 @@ func TestSetRetryBackoff_RejectsInvalidLength(t *testing.T) {
t.Fatalf("unexpected error for valid backoff: %v", err)
}
}
func TestDoJSONRequest_429Retry(t *testing.T) {
attempts := 0
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts < 3 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit exceeded"}`))
return
}
w.WriteHeader(200)
w.Write([]byte(`{"id":1}`))
}))
defer ts.Close()
c := NewClient("token", ts.URL, AllowInsecureHTTP())
if err := c.SetRetryBackoff([]time.Duration{1 * time.Millisecond, 1 * time.Millisecond}); err != nil {
t.Fatalf("SetRetryBackoff: %v", err)
}
body, err := c.doJSONRequest(context.Background(), http.MethodPost, ts.URL+"/test", map[string]string{"key": "val"})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if attempts != 3 {
t.Errorf("expected 3 attempts, got %d", attempts)
}
if string(body) != `{"id":1}` {
t.Errorf("unexpected body: %s", body)
}
}
func TestDoJSONRequest_429ExhaustsRetries(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
}))
defer ts.Close()
c := NewClient("token", ts.URL, AllowInsecureHTTP())
if err := c.SetRetryBackoff([]time.Duration{1 * time.Millisecond, 1 * time.Millisecond}); err != nil {
t.Fatalf("SetRetryBackoff: %v", err)
}
_, err := c.doJSONRequest(context.Background(), http.MethodPost, ts.URL+"/test", map[string]string{"key": "val"})
if err == nil {
t.Fatal("expected error after exhausting retries")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got %T: %v", err, err)
}
if apiErr.StatusCode != 429 {
t.Errorf("expected 429, got %d", apiErr.StatusCode)
}
}
+3 -2
View File
@@ -6,5 +6,6 @@ import (
)
// Compile-time interface conformance assertion.
// Verifies github.Client satisfies vcs.PRReader.
var _ vcs.PRReader = (*github.Client)(nil)
// This verifies github.Client satisfies the full vcs.Client interface
// (PRReader, FileReader, Reviewer, Identity).
var _ vcs.Client = (*github.Client)(nil)
+60
View File
@@ -8,8 +8,16 @@ import (
"net/url"
"path"
"strings"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// GetFileContent fetches a file from a repo at the given ref.
// Delegates to GetFileContentAtRef with the provided ref.
func (c *Client) GetFileContent(ctx context.Context, owner, repo, filePath, ref string) (string, error) {
return c.GetFileContentAtRef(ctx, owner, repo, filePath, ref)
}
// GetFileContentAtRef fetches a file at a specific ref from a repo.
// If ref is empty, the query parameter is omitted (uses default branch).
//
@@ -46,6 +54,58 @@ func (c *Client) GetFileContentAtRef(ctx context.Context, owner, repo, filePath,
return decoded, nil
}
// ListContents lists files and directories at a given path in a repo.
// Returns the directory listing from the GitHub contents API.
// If the path points to a single file (not a directory), the API returns
// a JSON object instead of an array; this is handled by returning a
// single-element slice.
func (c *Client) ListContents(ctx context.Context, owner, repo, filePath string) ([]vcs.ContentEntry, error) {
escaped, err := escapePath(filePath)
if err != nil {
return nil, fmt.Errorf("invalid file path: %w", err)
}
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escaped)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list contents %s: %w", filePath, err)
}
type entry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"`
}
// The GitHub contents API returns an array for directories and an object
// for single files. Try array first (common case), then fall back to object.
// An empty array ([]) is valid — it represents an empty directory — and
// results in a zero-length slice returned without error.
var entries []entry
if err := json.Unmarshal(body, &entries); err != nil {
var single entry
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: as array: %v; as object: %w", err, err2)
}
// Guard against empty objects ({}) or unexpected shapes that
// unmarshal successfully but carry no useful data.
if single.Name == "" && single.Path == "" && single.Type == "" {
return nil, fmt.Errorf("parse contents JSON: unexpected response format")
}
entries = []entry{single}
}
result := make([]vcs.ContentEntry, len(entries))
for i, e := range entries {
result[i] = vcs.ContentEntry{
Name: e.Name,
Path: e.Path,
Type: e.Type,
}
}
return result, nil
}
// escapePath validates and encodes a slash-separated file path for use in
// GitHub API URLs. Returns an error if the path contains dot-segments ("."
// or "..") or resolves to a path outside the repository root.
+311 -2
View File
@@ -2,12 +2,291 @@ package github
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestGetFileContent_DelegatesToGetFileContentAtRef(t *testing.T) {
var gotRef string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotRef = r.URL.Query().Get("ref")
json.NewEncoder(w).Encode(map[string]string{
"content": "dGVzdA==", // "test" in base64
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
// Call with empty ref — should not include ref param
content, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "test" {
t.Errorf("expected 'test', got %q", content)
}
if gotRef != "" {
t.Errorf("expected empty ref, got %q", gotRef)
}
}
func TestGetFileContent_WithRef(t *testing.T) {
var gotRef string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotRef = r.URL.Query().Get("ref")
json.NewEncoder(w).Encode(map[string]string{
"content": "dGVzdA==",
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "abc123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if gotRef != "abc123" {
t.Errorf("expected ref 'abc123', got %q", gotRef)
}
}
func TestGetFileContent_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "missing.go", "")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestGetFileContent_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestGetFileContent_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode(map[string]string{
"content": "b2s=",
"encoding": "base64",
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
if err := c.SetRetryBackoff([]time.Duration{1 * time.Millisecond, 1 * time.Millisecond}); err != nil {
t.Fatalf("SetRetryBackoff: %v", err)
}
content, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if content != "ok" {
t.Errorf("expected 'ok', got %q", content)
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestGetFileContent_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.GetFileContent(context.Background(), "owner", "repo", "file.go", "")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestListContents_HappyPath(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/src" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
json.NewEncoder(w).Encode([]map[string]string{
{"name": "main.go", "path": "src/main.go", "type": "file"},
{"name": "lib", "path": "src/lib", "type": "dir"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
entries, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 2 {
t.Fatalf("expected 2 entries, got %d", len(entries))
}
if entries[0].Name != "main.go" {
t.Errorf("expected name 'main.go', got %q", entries[0].Name)
}
if entries[0].Path != "src/main.go" {
t.Errorf("expected path 'src/main.go', got %q", entries[0].Path)
}
if entries[0].Type != "file" {
t.Errorf("expected type 'file', got %q", entries[0].Type)
}
if entries[1].Name != "lib" {
t.Errorf("expected name 'lib', got %q", entries[1].Name)
}
if entries[1].Type != "dir" {
t.Errorf("expected type 'dir', got %q", entries[1].Type)
}
}
func TestListContents_404(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "missing")
if err == nil {
t.Fatal("expected error for 404")
}
}
func TestListContents_401(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err == nil {
t.Fatal("expected error for 401")
}
}
func TestListContents_429Retry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(429)
w.Write([]byte(`{"message":"rate limit"}`))
return
}
json.NewEncoder(w).Encode([]map[string]string{
{"name": "file.go", "path": "file.go", "type": "file"},
})
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
if err := c.SetRetryBackoff([]time.Duration{1 * time.Millisecond, 1 * time.Millisecond}); err != nil {
t.Fatalf("SetRetryBackoff: %v", err)
}
entries, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if attempts != 2 {
t.Errorf("expected 2 attempts, got %d", attempts)
}
}
func TestListContents_MalformedJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`not json`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
_, err := c.ListContents(context.Background(), "owner", "repo", "src")
if err == nil {
t.Fatal("expected error for malformed JSON")
}
}
func TestListContents_SingleFile(t *testing.T) {
// GitHub Contents API returns a JSON object (not array) for single-file paths
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
w.Write([]byte(`{"name":"README.md","path":"README.md","type":"file"}`))
}))
defer srv.Close()
c := NewClient("token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
entries, err := c.ListContents(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected name 'README.md', got %q", entries[0].Name)
}
if entries[0].Type != "file" {
t.Errorf("expected type 'file', got %q", entries[0].Type)
}
}
func TestEscapePath_ValidPaths(t *testing.T) {
t.Parallel()
tests := []struct {
@@ -68,7 +347,7 @@ func TestGetFileContentAtRef_DotSegmentError(t *testing.T) {
}))
defer srv.Close()
c := NewClient("token", srv.URL)
c := NewClient("token", srv.URL, AllowInsecureHTTP())
_, err := c.GetFileContentAtRef(context.Background(), "owner", "repo", "foo/../bar.go", "main")
if err == nil {
t.Fatal("expected error for path with dot-segments")
@@ -78,6 +357,37 @@ func TestGetFileContentAtRef_DotSegmentError(t *testing.T) {
}
}
func TestDecodeBase64Content(t *testing.T) {
// Test with newlines (GitHub's format)
encoded := "cGFja2FnZSBt\nYWlu"
decoded, err := decodeBase64Content(encoded)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if decoded != "package main" {
t.Errorf("expected 'package main', got %q", decoded)
}
}
func TestDecodeBase64Content_Invalid(t *testing.T) {
_, err := decodeBase64Content("not!!!valid!!!base64")
if err == nil {
t.Fatal("expected error for invalid base64")
}
}
func TestDecodeBase64Content_CRLF(t *testing.T) {
// Base64 of "hello world" with CRLF line breaks inserted
encoded := "aGVs\r\nbG8g\r\nd29y\r\nbGQ="
decoded, err := decodeBase64Content(encoded)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if decoded != "hello world" {
t.Errorf("expected 'hello world', got %q", decoded)
}
}
func TestDecodeBase64Content_SizeLimit(t *testing.T) {
t.Parallel()
// Create base64 content that would decode to > maxFileContentSize.
@@ -93,4 +403,3 @@ func TestDecodeBase64Content_SizeLimit(t *testing.T) {
t.Errorf("expected 'too large' error, got: %v", err)
}
}
+23
View File
@@ -0,0 +1,23 @@
package github
import (
"net/http"
"net/http/httptest"
"testing"
"time"
)
// newTestClient creates a *Client backed by an httptest.Server running the
// given handler. The server is automatically closed when the test finishes.
// Shared across test files in package github.
func newTestClient(t *testing.T, handler http.HandlerFunc) *Client {
t.Helper()
srv := httptest.NewServer(handler)
t.Cleanup(srv.Close)
c := NewClient("test-token", srv.URL, AllowInsecureHTTP())
c.SetHTTPClient(srv.Client())
if err := c.SetRetryBackoff([]time.Duration{1 * time.Millisecond, 1 * time.Millisecond}); err != nil {
t.Fatalf("SetRetryBackoff: %v", err)
}
return c
}
+29
View File
@@ -0,0 +1,29 @@
package github
import (
"context"
"encoding/json"
"fmt"
)
// userResponse is the GitHub API response for the authenticated user.
type userResponse struct {
Login string `json:"login"`
}
// GetAuthenticatedUser returns the login of the currently authenticated user.
func (c *Client) GetAuthenticatedUser(ctx context.Context) (string, error) {
reqURL := fmt.Sprintf("%s/user", c.baseURL)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("get authenticated user: %w", err)
}
var resp userResponse
if err := json.Unmarshal(body, &resp); err != nil {
return "", fmt.Errorf("parse user response: %w", err)
}
return resp.Login, nil
}
+46
View File
@@ -0,0 +1,46 @@
package github
import (
"context"
"encoding/json"
"net/http"
"testing"
)
func TestGetAuthenticatedUser_HappyPath(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "GET" {
t.Errorf("expected GET, got %s", r.Method)
}
if r.URL.Path != "/user" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
if r.Header.Get("Authorization") != "Bearer test-token" {
t.Errorf("unexpected auth header: %s", r.Header.Get("Authorization"))
}
json.NewEncoder(w).Encode(map[string]string{"login": "review-bot"})
})
login, err := c.GetAuthenticatedUser(context.Background())
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if login != "review-bot" {
t.Errorf("expected login 'review-bot', got %q", login)
}
}
func TestGetAuthenticatedUser_401(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
})
_, err := c.GetAuthenticatedUser(context.Background())
if err == nil {
t.Fatal("expected error for 401")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized=true, got error: %v", err)
}
}
+212
View File
@@ -0,0 +1,212 @@
package github
import (
"context"
"encoding/json"
"errors"
"fmt"
"net/http"
"net/url"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// ErrCannotDeleteSubmittedReview is returned when DeleteReview is called on
// a review that has already been submitted (APPROVED, REQUEST_CHANGES, COMMENT).
// GitHub only allows deletion of PENDING reviews. Callers that need to replace
// a submitted review should use DismissReview instead.
var ErrCannotDeleteSubmittedReview = errors.New("cannot delete submitted review: use DismissReview instead")
// ErrConflictingCommitIDs is returned when PostReview receives comments with
// differing non-empty CommitIDs. The GitHub API accepts only a single commit_id
// per review submission; callers must ensure all comments target the same commit.
var ErrConflictingCommitIDs = errors.New("comments contain conflicting commit IDs: all must target the same commit")
// postReviewRequest is the GitHub API request body for creating a review.
type postReviewRequest struct {
CommitID string `json:"commit_id,omitempty"`
Body string `json:"body"`
Event string `json:"event"`
Comments []reviewCommentEntry `json:"comments,omitempty"`
}
// reviewCommentEntry is a single inline comment in a review creation request.
type reviewCommentEntry struct {
Path string `json:"path"`
Position int `json:"position"`
Body string `json:"body"`
}
// reviewResponse is the GitHub API response for a review.
type reviewResponse struct {
ID int64 `json:"id"`
Body string `json:"body"`
State string `json:"state"`
CommitID string `json:"commit_id"`
User struct {
Login string `json:"login"`
} `json:"user"`
}
// dismissReviewRequest is the GitHub API request body for dismissing a review.
type dismissReviewRequest struct {
Message string `json:"message"`
Event string `json:"event"`
}
// translateGitHubReviewState translates a GitHub API review state to the
// canonical vcs.Review.State value.
func translateGitHubReviewState(state string) string {
switch state {
case "CHANGES_REQUESTED":
return "REQUEST_CHANGES"
case "COMMENTED":
return "COMMENT"
default:
// States like APPROVED, DISMISSED, and PENDING pass through unchanged
// as they already match the canonical vcs representation. PENDING appears
// on draft reviews that have not yet been submitted via the GitHub UI or API.
return state
}
}
// PostReview submits a review on a pull request.
//
// The vcs.ReviewEvent constants (ReviewEventApprove, ReviewEventRequestChanges,
// ReviewEventComment) have string values that match GitHub's wire-format event
// strings (APPROVE, REQUEST_CHANGES, COMMENT), so Event is cast directly to
// string without translation.
//
// ReviewComment.Position maps directly to the GitHub API position field.
// When req.Comments is empty, the payload omits the comments field entirely
// (via the omitempty tag on postReviewRequest.Comments).
//
// The GitHub API accepts a single commit_id per review submission. PostReview
// extracts it from the first comment with a non-empty CommitID. If any subsequent
// comment specifies a different CommitID, PostReview returns ErrConflictingCommitIDs.
// Comments with an empty CommitID are allowed and inherit the review-level value.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, req vcs.ReviewRequest) (*vcs.Review, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := postReviewRequest{
Body: req.Body,
Event: string(req.Event),
}
// Build the payload in one pass. The GitHub API accepts a single commit_id
// per review; we extract it from the first comment that supplies one and
// reject the request if any other comment disagrees.
for _, comment := range req.Comments {
if comment.CommitID != "" {
if payload.CommitID == "" {
payload.CommitID = comment.CommitID
} else if payload.CommitID != comment.CommitID {
return nil, ErrConflictingCommitIDs
}
}
payload.Comments = append(payload.Comments, reviewCommentEntry{
Path: comment.Path,
Position: comment.Position,
Body: comment.Body,
})
}
data, err := json.Marshal(payload)
if err != nil {
return nil, fmt.Errorf("marshal review request: %w", err)
}
body, err := c.doRequestWithBody(ctx, http.MethodPost, reqURL, data)
if err != nil {
return nil, fmt.Errorf("post review: %w", err)
}
var resp reviewResponse
if err := json.Unmarshal(body, &resp); err != nil {
return nil, fmt.Errorf("parse review response: %w", err)
}
return &vcs.Review{
ID: resp.ID,
Body: resp.Body,
User: vcs.UserInfo{Login: resp.User.Login},
State: translateGitHubReviewState(resp.State),
CommitID: resp.CommitID,
}, nil
}
// ListReviews retrieves all reviews for a pull request.
// GitHub review states are translated to canonical vcs values.
func (c *Client) ListReviews(ctx context.Context, owner, repo string, number int) ([]vcs.Review, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list reviews: %w", err)
}
var responses []reviewResponse
if err := json.Unmarshal(body, &responses); err != nil {
return nil, fmt.Errorf("parse reviews response: %w", err)
}
reviews := make([]vcs.Review, len(responses))
for i, r := range responses {
reviews[i] = vcs.Review{
ID: r.ID,
Body: r.Body,
User: vcs.UserInfo{Login: r.User.Login},
State: translateGitHubReviewState(r.State),
CommitID: r.CommitID,
}
}
return reviews, nil
}
// DeleteReview deletes a pull request review.
// Only PENDING reviews can be deleted; attempting to delete a submitted review
// (APPROVED, CHANGES_REQUESTED, or COMMENTED per GitHub API naming) returns
// ErrCannotDeleteSubmittedReview.
func (c *Client) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, reviewID)
// nil body: the GitHub DELETE endpoint for reviews requires no request body.
_, err := c.doRequestWithBody(ctx, http.MethodDelete, reqURL, nil)
if err != nil {
var apiErr *APIError
if errors.As(err, &apiErr) && apiErr.StatusCode == 422 {
return fmt.Errorf("delete review: %w", ErrCannotDeleteSubmittedReview)
}
return fmt.Errorf("delete review: %w", err)
}
return nil
}
// DismissReview dismisses a submitted review on a pull request.
// This is the correct way to "remove" a submitted review (APPROVED, REQUEST_CHANGES).
// GitHub does not allow deleting submitted reviews — they must be dismissed.
func (c *Client) DismissReview(ctx context.Context, owner, repo string, number int, reviewID int64, message string) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews/%d/dismissals",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, reviewID)
payload := dismissReviewRequest{
Message: message,
// Event is required by the GitHub API for dismissal requests, even though
// "DISMISS" is the only valid value for this endpoint.
Event: "DISMISS",
}
data, err := json.Marshal(payload)
if err != nil {
return fmt.Errorf("marshal dismiss request: %w", err)
}
_, err = c.doRequestWithBody(ctx, http.MethodPut, reqURL, data)
if err != nil {
return fmt.Errorf("dismiss review: %w", err)
}
return nil
}
+391
View File
@@ -0,0 +1,391 @@
package github
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"strings"
"testing"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// --- PostReview tests ---
func TestPostReview_HappyPath(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "POST" {
t.Fatalf("expected POST, got %s", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/5/reviews" {
t.Fatalf("unexpected path: %s", r.URL.Path)
}
if r.Header.Get("Content-Type") != "application/json" {
t.Errorf("expected Content-Type application/json, got %q", r.Header.Get("Content-Type"))
}
// Verify request body
body, _ := io.ReadAll(r.Body)
var req postReviewRequest
if err := json.Unmarshal(body, &req); err != nil {
t.Fatalf("unmarshal request: %v", err)
}
if req.Event != "APPROVE" {
t.Errorf("expected event APPROVE, got %q", req.Event)
}
if req.Body != "LGTM" {
t.Errorf("expected body 'LGTM', got %q", req.Body)
}
if req.CommitID != "abc123" {
t.Errorf("expected commit_id 'abc123', got %q", req.CommitID)
}
if len(req.Comments) != 1 {
t.Fatalf("expected 1 comment, got %d", len(req.Comments))
}
if req.Comments[0].Path != "main.go" {
t.Errorf("expected comment path 'main.go', got %q", req.Comments[0].Path)
}
if req.Comments[0].Position != 4 {
t.Errorf("expected comment position 4, got %d", req.Comments[0].Position)
}
json.NewEncoder(w).Encode(map[string]interface{}{
"id": 100,
"body": "LGTM",
"state": "APPROVED",
"commit_id": "abc123",
"user": map[string]string{"login": "reviewer"},
})
})
review, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "LGTM",
Event: vcs.ReviewEventApprove,
Comments: []vcs.ReviewComment{
{Path: "main.go", Position: 4, CommitID: "abc123", Body: "nit: rename"},
},
})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if review.ID != 100 {
t.Errorf("expected ID 100, got %d", review.ID)
}
if review.Body != "LGTM" {
t.Errorf("expected body 'LGTM', got %q", review.Body)
}
if review.State != "APPROVED" {
t.Errorf("expected state 'APPROVED', got %q", review.State)
}
if review.User.Login != "reviewer" {
t.Errorf("expected user 'reviewer', got %q", review.User.Login)
}
if review.CommitID != "abc123" {
t.Errorf("expected commit_id 'abc123', got %q", review.CommitID)
}
}
func TestPostReview_401(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
})
_, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "LGTM",
Event: vcs.ReviewEventApprove,
})
if err == nil {
t.Fatal("expected error for 401")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized=true, got error: %v", err)
}
}
func TestPostReview_422(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(422)
w.Write([]byte(`{"message":"Unprocessable Entity"}`))
})
_, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "LGTM",
Event: vcs.ReviewEventApprove,
})
if err == nil {
t.Fatal("expected error for 422")
}
// 422 should surface as a wrapped APIError
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected *APIError, got %T: %v", err, err)
}
if apiErr.StatusCode != 422 {
t.Errorf("expected status 422, got %d", apiErr.StatusCode)
}
}
func TestPostReview_MalformedResponse(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte(`not json`))
})
_, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "LGTM",
Event: vcs.ReviewEventApprove,
})
if err == nil {
t.Fatal("expected error for malformed response")
}
if !strings.Contains(err.Error(), "parse review response") {
t.Errorf("expected parse error, got: %v", err)
}
}
// --- ListReviews tests ---
func TestListReviews_HappyPath(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "GET" {
t.Fatalf("expected GET, got %s", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/3/reviews" {
t.Fatalf("unexpected path: %s", r.URL.Path)
}
json.NewEncoder(w).Encode([]map[string]interface{}{
{
"id": 1,
"body": "Approved",
"state": "APPROVED",
"commit_id": "sha1",
"user": map[string]string{"login": "user1"},
},
{
"id": 2,
"body": "Needs work",
"state": "CHANGES_REQUESTED",
"commit_id": "sha2",
"user": map[string]string{"login": "user2"},
},
{
"id": 3,
"body": "Comment only",
"state": "COMMENTED",
"commit_id": "sha3",
"user": map[string]string{"login": "user3"},
},
{
"id": 4,
"body": "Old review",
"state": "DISMISSED",
"commit_id": "sha4",
"user": map[string]string{"login": "user4"},
},
})
})
reviews, err := c.ListReviews(context.Background(), "owner", "repo", 3)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(reviews) != 4 {
t.Fatalf("expected 4 reviews, got %d", len(reviews))
}
// Check state translation
expected := []struct {
id int64
state string
}{
{1, "APPROVED"},
{2, "REQUEST_CHANGES"},
{3, "COMMENT"},
{4, "DISMISSED"},
}
for i, e := range expected {
if reviews[i].ID != e.id {
t.Errorf("review[%d]: expected ID %d, got %d", i, e.id, reviews[i].ID)
}
if reviews[i].State != e.state {
t.Errorf("review[%d]: expected state %q, got %q", i, e.state, reviews[i].State)
}
}
}
func TestListReviews_404(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
})
_, err := c.ListReviews(context.Background(), "owner", "repo", 999)
if err == nil {
t.Fatal("expected error for 404")
}
if !IsNotFound(err) {
t.Errorf("expected IsNotFound=true, got error: %v", err)
}
}
func TestListReviews_401(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
})
_, err := c.ListReviews(context.Background(), "owner", "repo", 3)
if err == nil {
t.Fatal("expected error for 401")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized=true, got error: %v", err)
}
}
// --- DeleteReview tests ---
func TestDeleteReview_HappyPath(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "DELETE" {
t.Fatalf("expected DELETE, got %s", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/5/reviews/42" {
t.Fatalf("unexpected path: %s", r.URL.Path)
}
w.WriteHeader(204)
})
err := c.DeleteReview(context.Background(), "owner", "repo", 5, 42)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDeleteReview_422_SubmittedReview(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(422)
w.Write([]byte(`{"message":"Can not delete a non pending review"}`))
})
err := c.DeleteReview(context.Background(), "owner", "repo", 5, 42)
if err == nil {
t.Fatal("expected error for 422")
}
if !errors.Is(err, ErrCannotDeleteSubmittedReview) {
t.Errorf("expected ErrCannotDeleteSubmittedReview, got: %v", err)
}
}
// --- DismissReview tests ---
func TestDismissReview_HappyPath(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
if r.Method != "PUT" {
t.Fatalf("expected PUT, got %s", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/5/reviews/10/dismissals" {
t.Fatalf("unexpected path: %s", r.URL.Path)
}
body, _ := io.ReadAll(r.Body)
var req dismissReviewRequest
if err := json.Unmarshal(body, &req); err != nil {
t.Fatalf("unmarshal request: %v", err)
}
if req.Message != "Superseded by new review" {
t.Errorf("expected message 'Superseded by new review', got %q", req.Message)
}
if req.Event != "DISMISS" {
t.Errorf("expected event 'DISMISS', got %q", req.Event)
}
json.NewEncoder(w).Encode(map[string]interface{}{
"id": 10,
"state": "DISMISSED",
})
})
err := c.DismissReview(context.Background(), "owner", "repo", 5, 10, "Superseded by new review")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDismissReview_404(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(404)
w.Write([]byte(`{"message":"Not Found"}`))
})
err := c.DismissReview(context.Background(), "owner", "repo", 5, 999, "dismiss")
if err == nil {
t.Fatal("expected error for 404")
}
if !IsNotFound(err) {
t.Errorf("expected IsNotFound=true, got error: %v", err)
}
}
func TestDismissReview_401(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(401)
w.Write([]byte(`{"message":"Bad credentials"}`))
})
err := c.DismissReview(context.Background(), "owner", "repo", 5, 10, "dismiss")
if err == nil {
t.Fatal("expected error for 401")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized=true, got error: %v", err)
}
}
// --- State translation tests ---
func TestTranslateGitHubReviewState(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{"approved passes through", "APPROVED", "APPROVED"},
{"changes_requested maps to REQUEST_CHANGES", "CHANGES_REQUESTED", "REQUEST_CHANGES"},
{"commented maps to COMMENT", "COMMENTED", "COMMENT"},
{"dismissed passes through", "DISMISSED", "DISMISSED"},
{"unknown state passes through", "UNKNOWN_STATE", "UNKNOWN_STATE"},
{"empty string passes through", "", ""},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := translateGitHubReviewState(tt.input)
if got != tt.want {
t.Errorf("translateGitHubReviewState(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestPostReview_ConflictingCommitIDs(t *testing.T) {
c := newTestClient(t, func(w http.ResponseWriter, r *http.Request) {
t.Fatal("request should not be sent when commit IDs conflict")
})
_, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "Review",
Event: vcs.ReviewEventComment,
Comments: []vcs.ReviewComment{
{Path: "a.go", Position: 1, CommitID: "sha-1", Body: "first"},
{Path: "b.go", Position: 2, CommitID: "sha-2", Body: "second"},
},
})
if err == nil {
t.Fatal("expected error for conflicting commit IDs")
}
if !errors.Is(err, ErrConflictingCommitIDs) {
t.Errorf("expected ErrConflictingCommitIDs, got: %v", err)
}
}
+1 -1
View File
@@ -2,4 +2,4 @@ module gitea.weiker.me/rodin/review-bot
go 1.26.2
require gopkg.in/yaml.v3 v3.0.1
require github.com/goccy/go-yaml v1.19.2
+2 -4
View File
@@ -1,4 +1,2 @@
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
github.com/goccy/go-yaml v1.19.2 h1:PmFC1S6h8ljIz6gMRBopkjP1TVT7xuwrButHID66PoM=
github.com/goccy/go-yaml v1.19.2/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
-12
View File
@@ -10,18 +10,6 @@ func FormatMarkdown(result *ReviewResult, reviewerName string) string {
return FormatMarkdownWithDisplay(result, reviewerName, reviewerName)
}
// GiteaEvent converts the verdict to the Gitea API event string.
func GiteaEvent(verdict string) string {
switch verdict {
case "APPROVE":
return "APPROVED"
case "REQUEST_CHANGES":
return "REQUEST_CHANGES"
default:
return "COMMENT"
}
}
// FormatMarkdownWithDisplay formats a ReviewResult with separate display name and sentinel name.
// Note: displayName is not HTML-escaped as Gitea sanitizes rendered Markdown.
// Persona display names are controlled by repo owners (trusted input).
-19
View File
@@ -98,25 +98,6 @@ func TestFormatMarkdown_SpecialChars(t *testing.T) {
}
}
func TestGiteaEvent(t *testing.T) {
tests := []struct {
verdict string
expected string
}{
{"APPROVE", "APPROVED"},
{"REQUEST_CHANGES", "REQUEST_CHANGES"},
{"UNKNOWN", "COMMENT"},
{"", "COMMENT"},
}
for _, tc := range tests {
got := GiteaEvent(tc.verdict)
if got != tc.expected {
t.Errorf("GiteaEvent(%q) = %q, want %q", tc.verdict, got, tc.expected)
}
}
}
func TestFormatMarkdown_Sentinel(t *testing.T) {
result := &ReviewResult{
Verdict: "APPROVE",
+146 -38
View File
@@ -5,12 +5,15 @@ import (
"embed"
"encoding/json"
"fmt"
"io"
"os"
"sort"
"strings"
"unicode/utf8"
"gopkg.in/yaml.v3"
"github.com/goccy/go-yaml"
"github.com/goccy/go-yaml/ast"
"github.com/goccy/go-yaml/parser"
)
//go:embed personas/*.yaml
@@ -118,9 +121,7 @@ func ListBuiltinPersonas() []string {
default:
continue
}
if !seen[personaName] {
seen[personaName] = true
}
seen[personaName] = true
}
names := make([]string, 0, len(seen))
for name := range seen {
@@ -142,10 +143,19 @@ func parsePersona(data []byte, source string) (*Persona, error) {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else {
// Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's KnownFields(true) - both reject unknown fields to catch typos.
// YAML's Strict() - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data))
dec.DisallowUnknownFields()
err = dec.Decode(&p)
if err == nil {
// Reject trailing content after the first valid JSON object.
// Without this check, input like `{"name":"x"}garbage` would
// silently succeed because Decoder stops after one object.
var dummy json.RawMessage
if err2 := dec.Decode(&dummy); err2 != io.EOF {
err = fmt.Errorf("unexpected trailing content after JSON object")
}
}
}
if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err)
@@ -156,70 +166,164 @@ func parsePersona(data []byte, source string) (*Persona, error) {
return &p, nil
}
// unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
// and strict field checking. This protects against stack exhaustion from deeply
// nested structures and catches typos in field names.
// Multi-document YAML files are rejected to prevent silent data loss.
// unmarshalYAMLWithDepthLimit unmarshals YAML data with three safety checks:
// - Depth limiting: rejects AST trees exceeding maxDepth to prevent stack exhaustion.
// - Multi-document rejection: prevents silent data loss from ignored extra documents.
// - Strict field checking: rejects unknown YAML keys to catch typos early.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
// First pass: decode into a yaml.Node to check depth limits and node counts.
// This prevents stack exhaustion before we attempt to decode into structs.
var node yaml.Node
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
// First pass: parse into AST to check depth limits, node counts, and
// multi-document rejection. This prevents stack exhaustion before we
// attempt to decode into structs.
file, err := parser.ParseBytes(data, 0)
if err != nil {
return err
}
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
// The parser returns a single doc with nil body for these cases.
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
return fmt.Errorf("empty YAML document")
}
// Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect.
var extra yaml.Node
if dec.Decode(&extra) == nil {
if len(file.Docs) > 1 {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
}
nodeCount := 0
if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil {
if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]int), make(map[ast.Node]bool), &nodeCount); err != nil {
return err
}
// Second pass: decode with strict field checking enabled.
// KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy".
// We must re-decode from the original data because yaml.Node.Decode() doesn't
// support the KnownFields option.
strictDec := yaml.NewDecoder(bytes.NewReader(data))
strictDec.KnownFields(true)
return strictDec.Decode(out)
// Strict() rejects unknown keys, catching typos like "focuss" or "identiy".
//
// Safety note: goccy/go-yaml's decoder does not expand YAML aliases
// recursively — it resolves them via the pre-built AST, which our first
// pass already depth-checked. Alias chains that would exceed depth limits
// are caught above; the decoder merely reads the resolved scalar values.
dec := yaml.NewDecoder(bytes.NewReader(data), yaml.Strict())
return dec.Decode(out)
}
// checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit
// or the total node count limit. It also detects alias cycles to prevent infinite
// recursion from crafted YAML with self-referential aliases.
func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error {
// checkYAMLDepth recursively checks that YAML AST nodes don't exceed the depth
// limit or the total node count limit. It uses two tracking maps:
// - validated: maps each node to the maximum depth at which it was previously
// checked. If a node is revisited at a deeper depth (e.g., via an alias),
// we re-check it to ensure the combined effective depth doesn't exceed limits.
// - visiting: per-path recursion stack for true cycle detection. A node on the
// current path is a cycle (alias loop); we return nil to avoid infinite recursion.
//
// This design prevents the alias depth bypass where an anchored subtree validated
// at a shallow depth could be referenced via alias at a greater depth, effectively
// exceeding MaxYAMLDepth.
func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[ast.Node]int, visiting map[ast.Node]bool, nodeCount *int) error {
if node == nil {
return nil
}
if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
}
// Cycle detection: if we're currently visiting this node on the current
// recursion path, it's a cycle (e.g., alias pointing to an ancestor).
// Return nil to break the cycle without error — cycles are a structural
// property, not a depth violation.
if visiting[node] {
return nil
}
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
// Placed after cycle detection but before the depth-aware short-circuit. This means
// nodes revisited at shallower depths (via aliases) are counted each time they are
// encountered — intentional conservative overcounting. This bounds the total work
// performed during validation rather than tracking unique nodes, which is the safer
// security posture for untrusted YAML input.
*nodeCount++
if *nodeCount > maxNodes {
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
}
// Cycle detection: if we've seen this node before, we're in a cycle.
if _, ok := seen[node]; ok {
return nil // Already validated this subtree, skip to avoid infinite recursion.
// Depth-aware short-circuit: skip re-validation only when the current visit
// depth is the same or shallower than the depth at which this node was
// previously validated. A shallower (or equal) current depth means the
// prior, deeper validation already covered any subtree depth violations.
// If the current depth exceeds the previous validation depth (e.g., an alias
// references this node deeper in the tree), we must re-traverse to ensure
// the combined effective depth doesn't exceed maxDepth.
//
// Note: using ast.Node (interface) as map key relies on pointer identity,
// which is correct because all goccy/go-yaml AST node types are pointer
// receivers (*MappingNode, *SequenceNode, etc.), never value types.
if prevDepth, ok := validated[node]; ok && depth <= prevDepth {
return nil
}
seen[node] = struct{}{}
validated[node] = depth
// Handle alias nodes: follow the alias to its anchor target.
// Increment depth when following aliases since they expand the effective structure.
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
}
// Mark as visiting (on the current recursion path) for cycle detection.
visiting[node] = true
defer func() { visiting[node] = false }()
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil {
// Walk children based on node type.
switch n := node.(type) {
case *ast.MappingNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.MappingValueNode:
// Both Key and Value are visited at depth+1 relative to this
// MappingValueNode. Since MappingNode visits its MappingValueNode
// children at depth+1 as well, keys and values end up at depth+2
// from the parent MappingNode. This is intentional: it mirrors the
// actual nesting structure (mapping → key-value pair → key/value).
if err := checkYAMLDepth(n.Key, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.SequenceNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.AliasNode:
// Follow alias to its target, incrementing depth since aliases expand
// the effective structure.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.AnchorNode:
// Increment depth for anchor values as a conservative measure: the
// anchor definition itself is structural, and treating it as a depth
// level ensures that deeply nested anchors are caught at definition
// time rather than only when referenced via alias. This +1 is
// asymmetric with alias (which also increments) — by design, the
// effective depth budget for anchored-then-aliased content is reduced
// because both the definition site and the reference site each consume
// a level, making deeply nested anchor/alias pairs hit the limit sooner.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.TagNode:
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.MergeKeyNode:
// MergeKeyNode represents the literal "<<" merge key token. It has no
// child nodes — the value side of a merge (e.g., *alias) lives in the
// parent MappingValueNode.Value, which is already recursed into above.
// Explicitly listed here (rather than in the default case) to prevent
// future library changes from silently bypassing depth checks.
default:
// Scalar leaf nodes (StringNode, IntegerNode, FloatNode, BoolNode,
// NullNode, InfinityNode, NanNode, LiteralNode) have no children to
// recurse into.
}
return nil
}
@@ -227,7 +331,11 @@ func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*ya
// ParsePersonaBytes parses persona data from bytes with a source label for errors.
// This is useful for parsing personas fetched from external sources (e.g., Gitea API)
// without requiring filesystem access. Format is detected by source extension.
// Input is bounded by MaxPersonaFileSize to prevent resource exhaustion.
func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona data from %s exceeds maximum size (%d bytes, limit %d)", source, len(data), MaxPersonaFileSize)
}
return parsePersona(data, source)
}
+223 -42
View File
@@ -7,7 +7,7 @@ import (
"strings"
"testing"
"gopkg.in/yaml.v3"
"github.com/goccy/go-yaml/ast"
)
func TestLoadBuiltinPersona(t *testing.T) {
@@ -355,7 +355,7 @@ func TestCapitalizeFirst(t *testing.T) {
{"HELLO", "HELLO"},
{"a", "A"},
{"", ""},
{"日本語", "日本語"}, // Non-ASCII: Japanese doesn't have case
{"日本語", "日本語"}, // Non-ASCII: Japanese doesn't have case
{"über", "Über"}, // German umlaut
{"élève", "Élève"}, // French accent
}
@@ -459,7 +459,14 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
// Each level adds 2 to the depth count (key + value mapping).
// Depth accumulation trace for "nested: \n level0: \n level1: ...":
// - Document root parsed at depth 0
// - Root MappingNode children (MappingValueNodes) visited at depth 1
// - "nested" MappingValueNode: key at depth 2, value at depth 2
// - Each levelN adds depth via MappingValueNode traversal (key + value)
// - Exact depth per level depends on AST structure (MappingNode wrapping),
// but 25 levels reliably exceeds MaxYAMLDepth (20) with comfortable margin.
// The test uses 25 levels rather than exactly 21 to avoid brittleness.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " "
@@ -483,6 +490,35 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
}
}
func TestYAMLEmptyFileRejection(t *testing.T) {
tests := []struct {
name string
content string
}{
{"completely_empty", ""},
{"whitespace_only", " \n\n "},
{"comment_only", "# just a comment\n"},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, tc.name+".yaml")
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for empty YAML input, got nil")
}
if !strings.Contains(err.Error(), "empty YAML document") {
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
}
})
}
}
func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml")
@@ -504,41 +540,41 @@ func TestYAMLFileSizeLimit(t *testing.T) {
func TestYAMLAliasCycleDetection(t *testing.T) {
// Test that our checkYAMLDepth function handles alias cycles gracefully
// by using the seen map to prevent infinite recursion.
// We test this directly because go-yaml's parser handles most cycles
// at parse time, but we need to ensure our checker is robust.
// by using the visiting map to prevent infinite recursion.
// Create a node structure where an alias points to a parent node,
// simulating what could happen with malicious input that bypasses
// go-yaml's cycle detection.
parent := &yaml.Node{
Kind: yaml.MappingNode,
Content: []*yaml.Node{
{Kind: yaml.ScalarNode, Value: "name"},
{Kind: yaml.ScalarNode, Value: "test"},
{Kind: yaml.ScalarNode, Value: "nested"},
// simulating what could happen with crafted input.
parent := &ast.MappingNode{
Values: []*ast.MappingValueNode{
{
Key: &ast.StringNode{Value: "name"},
Value: &ast.StringNode{Value: "test"},
},
},
}
// Create a child that aliases back to the parent (artificial cycle)
aliasToParent := &yaml.Node{
Kind: yaml.AliasNode,
Alias: parent,
aliasToParent := &ast.AliasNode{
Value: parent,
}
parent.Content = append(parent.Content, aliasToParent)
parent.Values = append(parent.Values, &ast.MappingValueNode{
Key: &ast.StringNode{Value: "nested"},
Value: aliasToParent,
})
nodeCount := 0
seen := make(map[*yaml.Node]struct{})
validated := make(map[ast.Node]int)
visiting := make(map[ast.Node]bool)
// This should NOT hang or stack overflow - the seen map prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
// This should NOT hang or stack overflow - cycle detection prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
if err != nil {
t.Errorf("unexpected error traversing cyclic structure: %v", err)
}
// Verify we tracked the parent in the seen map
if _, ok := seen[parent]; !ok {
t.Error("parent node not tracked in seen map")
// Verify we tracked the parent in the validated map
if _, ok := validated[parent]; !ok {
t.Error("parent node not tracked in validated map")
}
}
@@ -594,36 +630,82 @@ func TestYAMLNodeCountLimit(t *testing.T) {
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
// Direct test of cycle detection in checkYAMLDepth by creating
// a node structure with an artificial cycle.
// This tests the seen map logic independent of go-yaml's parsing.
node := &yaml.Node{
Kind: yaml.MappingNode,
Content: []*yaml.Node{
{Kind: yaml.ScalarNode, Value: "key"},
{Kind: yaml.ScalarNode, Value: "value"},
node := &ast.MappingNode{
Values: []*ast.MappingValueNode{
{
Key: &ast.StringNode{Value: "key"},
Value: &ast.StringNode{Value: "value"},
},
},
}
// Create a cycle by making a child reference the parent
cycleChild := &yaml.Node{
Kind: yaml.AliasNode,
Alias: node, // Points back to the parent
cycleChild := &ast.AliasNode{
Value: node, // Points back to the parent
}
node.Content = append(node.Content,
&yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"},
cycleChild,
)
node.Values = append(node.Values, &ast.MappingValueNode{
Key: &ast.StringNode{Value: "cyclic"},
Value: cycleChild,
})
nodeCount := 0
seen := make(map[*yaml.Node]struct{})
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
validated := make(map[ast.Node]int)
visiting := make(map[ast.Node]bool)
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
// Should complete without infinite recursion due to cycle detection
if err != nil {
t.Errorf("unexpected error: %v", err)
}
// The seen map should contain multiple entries
if len(seen) < 2 {
t.Errorf("seen map has %d entries, expected at least 2", len(seen))
// The validated map should contain multiple entries
if len(validated) < 2 {
t.Errorf("validated map has %d entries, expected at least 2", len(validated))
}
}
func TestYAMLAliasDepthBypass(t *testing.T) {
// Test that an anchored subtree first validated at a shallow depth is
// re-checked when referenced via alias at a deeper position. Without the
// depth-aware validated map, the alias reference would skip re-checking
// and allow the effective nesting to exceed MaxYAMLDepth.
dir := t.TempDir()
path := filepath.Join(dir, "alias-depth-bypass.yaml")
// Build YAML with an anchor at shallow depth containing a subtree near the limit,
// then reference it via alias deep enough that effective depth exceeds MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
// Create the anchored subtree at depth 1 (key level) that nests 15 levels deep.
sb.WriteString("anchor_key: &deep_anchor\n")
for i := 0; i < 15; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 16))
sb.WriteString("leaf: value\n")
// Create a wrapper that nests 6 levels deep, then references the anchor.
// Effective depth at alias target = 6 (wrapper nesting) + 1 (alias) + 15 (subtree) = 22 > 20
sb.WriteString("wrapper:\n")
for i := 0; i < 6; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("n%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 7))
sb.WriteString("alias_ref: *deep_anchor\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for alias depth bypass, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
}
}
@@ -776,3 +858,102 @@ identity: test identity
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestJSONTrailingContentRejected(t *testing.T) {
tests := []struct {
name string
content string
}{
{
name: "trailing garbage after object",
content: `{"name":"test","identity":"test identity"}garbage`,
},
{
name: "two JSON objects",
content: `{"name":"test","identity":"test identity"}{"name":"other"}`,
},
{
name: "trailing array",
content: `{"name":"test","identity":"test identity"}[]`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for trailing content, got nil")
}
if !strings.Contains(err.Error(), "trailing content") {
t.Errorf("error = %q, want to contain 'trailing content'", err.Error())
}
})
}
}
func TestParsePersonaBytesSizeLimit(t *testing.T) {
// ParsePersonaBytes should reject input exceeding MaxPersonaFileSize
oversized := make([]byte, MaxPersonaFileSize+1)
for i := range oversized {
oversized[i] = 'x'
}
_, err := ParsePersonaBytes(oversized, "oversized.yaml")
if err == nil {
t.Fatal("expected error for oversized input, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want to contain 'exceeds maximum size'", err.Error())
}
// Just under the limit should not trigger size error (may fail parse, but not size)
underLimit := []byte("name: test\nidentity: test persona\n")
p, err := ParsePersonaBytes(underLimit, "valid.yaml")
if err != nil {
t.Fatalf("unexpected error for valid input: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestYAMLMergeKeyDepthCheck(t *testing.T) {
// Verify that YAML merge keys (<<: *alias) are properly handled by the
// depth checker. The merge key content is in the MappingValueNode.Value
// (an AliasNode), not in the MergeKeyNode itself.
p, err := ParsePersonaBytes([]byte("name: merge-test\nidentity: test\n"), "merge.yaml")
if err != nil {
t.Fatalf("basic parse failed: %v", err)
}
if p.Name != "merge-test" {
t.Errorf("Name = %q, want %q", p.Name, "merge-test")
}
// Test that deeply nested merge keys still hit depth limit.
// Build YAML with merge key content nested beyond MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: deep-merge\nidentity: deep merge persona\n")
sb.WriteString("anchor: &deep\n")
indent := " "
for i := 0; i < MaxYAMLDepth+5; i++ {
sb.WriteString(indent)
sb.WriteString(fmt.Sprintf("level%d:\n", i))
indent += " "
}
sb.WriteString(indent + "leaf: value\n")
sb.WriteString("target:\n <<: *deep\n")
_, err = ParsePersonaBytes([]byte(sb.String()), "deep-merge.yaml")
if err == nil {
t.Fatal("expected error for deeply nested merge key content, got nil")
}
if !strings.Contains(err.Error(), "depth") {
t.Errorf("error = %q, want to contain 'depth'", err.Error())
}
}
+8
View File
@@ -41,3 +41,11 @@ type Client interface {
Reviewer
Identity
}
// ReviewerSelfRequester is an optional interface implemented by adapters that support
// requesting the authenticated user as a reviewer on a pull request. This is used for
// Gitea-specific behavior (ensuring the bot appears in required-reviewer checks).
// Consumers should use interface assertion: if sr, ok := client.(ReviewerSelfRequester); ok { ... }
type ReviewerSelfRequester interface {
RequestReviewerSelf(ctx context.Context, owner, repo string, number int, user string) error
}