Compare commits

..

8 Commits

Author SHA1 Message Date
Rodin 24d4dcb751 chore(#130): mark self-review findings as addressed in TODO.md 2026-05-15 08:39:06 +00:00
Rodin f0ba8fe004 refactor(#130): move IsBlockedIP to internal/netutil to remove gitea import in validateurl.go
validateurl.go is VCS-generic but imported gitea.IsBlockedIP, creating an
unexpected generic→Gitea-specific dependency. Extract IsBlockedIP and its
CIDR list to internal/netutil/ipcheck.go (a neutral shared package).

- gitea/ipcheck.go becomes a thin forwarding wrapper (preserves API compat
  for callers within the gitea package)
- gitea/ipcheck_test.go replaced with a forwarding smoke test; full coverage
  moves to internal/netutil/ipcheck_test.go
- validateurl.go now imports internal/netutil directly
2026-05-15 08:38:48 +00:00
Rodin c5261b9b7f refactor(#130): rename vcsReviewComment.NewPosition to NewLine with clearer semantics
The field was named NewPosition with a misleading comment 'Gitea: absolute
line; GitHub: diff hunk position'. In reality both adapters use it as an
absolute new-file line number (Gitea maps it to new_position, GitHub maps it
to Line+Side:RIGHT). Rename to NewLine to match actual semantics and update
comments to explain per-adapter mapping.
2026-05-15 08:36:55 +00:00
Rodin 9a1410cc1a docs(#130): fix README CLI example and env var table for VCS-agnostic usage
- CLI example used $GITEA_TOKEN which is not an actual env var; rename to
  $REVIEWER_TOKEN (the correct env var the binary reads)
- Env var table referenced GITEA_REPO without noting GitHub support; add
  a note and include VCS_TYPE row so users know they can override detection
2026-05-15 08:36:16 +00:00
Rodin 5e20dba3a4 fix(#130): pass VCS_TYPE env var from action.yml Run review step
The binary detects VCS type from VCS_TYPE env var, but action.yml did not
pass it to the Run review step. This caused the binary to fall back to a
URL heuristic (github.com substring), which misclassifies GitHub Enterprise
Server hosts whose URL does not contain 'github'.

The 'Determine version' step already outputs vcs_type — wire it through to
the Run review env block so explicit VCS_TYPE always takes precedence.
2026-05-15 08:35:52 +00:00
claw d545abe392 fix(#130): enforce HTTPS scheme consistently in GitHub client write methods
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 40s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m38s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m49s
PostReview, DeleteReview, and RequestReviewer were calling c.httpClient.Do
directly, bypassing the scheme check in doRequest that rejects http:// URLs
unless AllowInsecureHTTP is explicitly enabled.

Introduce doRequestWithBody(ctx, method, url, body) with the same HTTPS guard,
and refactor all three write methods to use it. This ensures tokens are never
sent over plaintext regardless of which API path is exercised.

Add scheme validation tests for each method.
2026-05-14 14:11:14 -07:00
Rodin 10ef451c20 feat(cmd): add VCS routing for GitHub PR reviews
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 42s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m23s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m38s
Wire up the new GitHub API methods to the review-bot CLI via VCS
type detection. review-bot can now review PRs on both Gitea and
GitHub (including GitHub Enterprise Server).

Changes:
- vcs.go: define vcsClient interface with all PR operations
  - giteaVCSAdapter: wraps gitea.Client, satisfies vcsClient + giteaExtClient
  - githubVCSAdapter: wraps github.Client, satisfies vcsClient
  - giteaExtClient: Gitea-specific extension (supersede, comment resolution)
- main.go: detect VCS type via VCS_TYPE env var (auto-detects github.com URLs)
  - Creates appropriate client (gitea or github) based on vcs_type
  - GitHub API URL derived from server URL (github.com → api.github.com,
    GHES → /api/v3)
  - All main flow uses vcsClient interface
  - Gitea-specific supersede operations gated via giteaExtClient type assertion
  - GitHub: logs info when skipping supersede (not supported)
- Removes old giteaClientAdapter (replaced by giteaVCSAdapter in vcs.go)
- giteaVCSAdapter satisfies review.GiteaClient for persona loading

GitHub limitations handled gracefully:
- Review supersede skipped (GitHub doesn't allow editing submitted reviews)
- DeleteReview returns error for non-pending reviews (documented in adapter)
- Inline comments use absolute line + side='RIGHT' instead of diff position

Closes #130.

Co-authored-by: Rodin <rodin@forgedthought.ai>
2026-05-14 20:43:21 +00:00
Rodin 39f3326674 feat(github): add PR review API methods
Implement the higher-level GitHub API methods that were TODO since
issue #120. The github package now provides:

- GetPullRequest: PR metadata (title, body, head SHA/ref, draft)
- GetPullRequestDiff: unified diff via Accept: application/vnd.github.diff
- GetPullRequestFiles: changed files list (paginated, 100/page)
- GetCommitStatuses: CI statuses (GitHub uses 'state' field, normalized)
- GetFileContent: file content with base64 decode (strips embedded newlines)
- GetFileContentRef: file at a specific ref
- ListContents: directory listing or single-file normalization
- GetAllFilesInPath: recursive file fetching
- PostReview: submit review with event/body/commit/inline comments
- ListReviews: list PR reviews (paginated)
- DeleteReview: delete review (GitHub only allows PENDING deletion)
- GetAuthenticatedUser: returns login of the authed user
- RequestReviewer: add a user as requested reviewer

API types added: PullRequest, CommitStatus, ChangedFile, ReviewComment,
Review, ContentEntry.

Notable edge cases handled:
- GitHub embeds newlines in base64 content; stripped before decode
- GetFileContent returns error for non-file paths (type=dir)
- ListContents normalizes single-file response to a slice
- DeleteReview documents GitHub's PENDING-only constraint

Removes TODO comment from baseURL field (now consumed by all methods).

Closes part of #130.

Co-authored-by: Rodin <rodin@forgedthought.ai>
2026-05-14 20:43:09 +00:00
22 changed files with 377 additions and 1032 deletions
+1
View File
@@ -476,6 +476,7 @@ runs:
shell: bash
env:
VCS_URL: ${{ steps.version.outputs.server_url }}
VCS_TYPE: ${{ steps.version.outputs.vcs_type }}
GITEA_REPO: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
-37
View File
@@ -1,37 +0,0 @@
# Dev Loop Health Check — 2026-05-14 23:33 UTC
## Status: ✅ OPTIMAL
### Test Results
- All packages: **PASS**
- Test count: 150+ tests across:
- `./cmd/review-bot`: review, persona, schema, LLM routing
- `./internal/gitea`: Gitea client, URL validation, SSRF defense
- `./internal/github`: GitHub client, API routing
- `./review`: persona loading, file content fetching, review logic
### Recent Activity
1. **Tests added:** GetTimelineReviewCommentIDForReview, GetAllFilesInPath, fetchFileContext, fetchPatterns, persona edge cases
2. **Code quality:** `go fmt`, `go mod tidy` clean
3. **Branches:** All worktrees cleaned up, working on `review-bot-fixes` (in sync with origin/main)
### Current HEAD
- SHA: `b534247`
- Message: `[dev-loop] Update TODO.md with current cycle status and coverage metrics`
- Time: Clean, no uncommitted changes
### Next Phase Priorities
1. **PR Submission (#132+)** — Enable review-bot to create PRs (35 days)
2. **GitHub Enterprise Support** — Enterprise URL patterns, token scopes (23 days)
3. **Performance & Observability** — Metrics, load testing, audit logging (57 days)
### System Health
- ✅ All tests passing
- ✅ No warnings or lint issues
- ✅ Code is clean and organized
- ✅ Ready for next development cycle
---
**Next check:** ~6:10 AM UTC (May 15)
**Action:** Ready for issue creation or new feature work
+3 -2
View File
@@ -285,7 +285,7 @@ review-bot \
--vcs-url https://gitea.example.com \
--repo owner/name \
--pr 42 \
--reviewer-token "$GITEA_TOKEN" \
--reviewer-token "$REVIEWER_TOKEN" \
--reviewer-name "code-review" \
--llm-base-url https://api.openai.com/v1 \
--llm-api-key "$OPENAI_API_KEY" \
@@ -300,7 +300,8 @@ All flags have environment variable equivalents:
| Flag | Env Var |
|------|---------|
| `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) |
| `--repo` | `GITEA_REPO` |
| `--vcs-type` | `VCS_TYPE` (auto-detected from URL if not set; `gitea` or `github`) |
| `--repo` | `GITEA_REPO` (also accepted: set `GITEA_REPO` for Gitea; VCS-agnostic `REPO` coming) |
| `--pr` | `PR_NUMBER` |
| `--reviewer-token` | `REVIEWER_TOKEN` |
| `--reviewer-name` | `REVIEWER_NAME` |
+60 -123
View File
@@ -1,151 +1,88 @@
## Dev Loop: review-bot — Continuous Health Monitor
## Dev Loop: review-bot — 2026-05-14 20:10 UTC
### Current Cycle: 2026-05-14 23:11 UTC ✅
**Repository Status:** OPTIMAL
- Main: `6f02cef` (clean, all tests pass)
- Working tree: clean
- Build: ✅ successful
- Vet: ✅ clean
- Test suite: ALL PASS
### Latest: ✅ STABLE STATE — REPO HEALTH COMPLETE
- **Last action:** health check; verified tests pass, repo clean, no action needed
- **Repository:** Clean, all merges complete, no open issues/PRs
- **Main branch:** Up to date with origin/main
- **Test suite:** All passing (cached)
---
## Latest Delivered: Test Coverage Sprint 2026-05-14 ✅
## Repository Status
### Coverage Improvements
### ✅ Merged to main (recent):
- issue-123 (IP-level SSRF defense) — 6 commits, main at 4440823
- issue-125 (VCS_URL rename + deprecation) — merged
- issue-124 (multi-arch binary support) — merged
- issue-120 (GitHub Actions + VCS abstraction) — merged
- issue-121 (VCS host type detection for binary download) — merged
22 new tests added across 4 packages:
| Package | Before | After | Delta |
|---------|--------|-------|-------|
| cmd/review-bot | 37.6% | 46.1% | +8.5% |
| gitea | 80.0% | 85.2% | +5.2% |
| github | 79.9% | 86.3% | +6.4% |
| review | 91.5% | 92.0% | +0.5% |
**What was tested:**
- `fetchFileContext`: empty, removed files, content fetching, error recovery, context cancellation
- `fetchPatterns`: empty repo, all files, specific files, invalid format, errors, multiple repos
- `LoadPersona`: nonexistent file, non-regular file (directory), oversized file
- `CapitalizeFirst`: RuneError (invalid UTF-8)
- `GetTimelineReviewCommentIDForReview` (gitea): 4 cases including user+body matching
- `GetAllFilesInPath` (github): directory listing, 404 fallback, recursive subdirectory
**Commits:** `fccfdd2`, `6f02cef`
### 🧹 Cleanup COMPLETE:
- ✅ Removed old worktrees (issue-123, review-bot-issue-125)
- ✅ Test suite passes (all packages)
- ✅ No TODO/FIXME in code except expected GitHub client notes
- ✅ No open issues or pull requests
- ✅ Dependencies up to date
---
## Repository Status Post-Merge
## Current Feature Completeness
### Main Branch
- Commit: `9f3f321`
- Status: ✅ All systems healthy
### Recent Merged PRs
| PR | Issue | Title | Status |
|---|---|---|---|
| #131 | #130 | GitHub API methods & VCS routing | ✅ MERGED |
| #129 | #123 | IP-level SSRF defense | ✅ MERGED |
| #128 | #125 | VCS_URL deprecation & renaming | ✅ MERGED |
| #127 | #124 | Multi-arch binary support | ✅ MERGED |
| #126 | #120 | GitHub Actions composite action | ✅ MERGED |
### Recent Direct Commits
| SHA | Description | Date |
|-----|-------------|------|
| `fccfdd2` | [dev-loop] fetchFileContext/fetchPatterns/persona tests | 2026-05-14 |
| `6f02cef` | [dev-loop] GetTimelineReviewCommentIDForReview/GetAllFilesInPath tests | 2026-05-14 |
### Closed Issues
- #130, #123, #125, #124, #120
### Open Issues
- None blocking; backlog tracked in Gitea project board
### Worktrees
- All cleaned up; no stale branches
---
## Feature Completeness Summary
### ✅ Core Functionality
**Core Capabilities:**
- Multi-provider LLM support (OpenAI, Anthropic, SAP AI Core)
- Gitea PR review (mature, proven)
- **NEW: GitHub PR review (fully implemented)**
- VCS abstraction (Gitea/GitHub transparent routing)
- Gitea PR integration with structured reviews
- SSRF defense with IP-level validation
- Multi-architecture binary deployment
- VCS abstraction (Gitea/GitHub support)
- Multi-architecture binary support
- GitHub Actions composite action
### ✅ Review Quality
- Structured reviews with code snippets
- LLM-driven analysis
- Persona-based customization
- Context awareness
**Recent Security Work:**
- RFC6598 CGN range detection
- IP fallback dialing for local endpoint rejection
- URL validation for SSRF prevention
### ✅ Security
- RFC6598 CGN detection
- HTTPS enforcement
- Redirect safety
- Credential handling (no logs, no reflection leaks)
- URL validation for VCS API access
**Code Quality:**
- Comprehensive test coverage (all packages tested)
- Consistent error handling with context propagation
- Secure credential handling (unexported fields)
- Concurrency-safe designs
---
## Next Phase: Backlog Priorities
## Next Priority Actions
### Priority 1: PR Submission
**Issue:** #132+ (create)
**Goal:** Enable review-bot to create PRs (not just post reviews)
**Scope:** PR creation flow, commit logic, test coverage
**Est. Time:** 35 days
**Impact:** Enable automated improvements, fix suggestions with diff context
### Phase 2: Feature Exploration (NEXT SESSION)
- Scan code for potential improvements per REVIEW.md findings
- Assess performance under load
- Review REVIEW.md findings for targeted fixes
- Consider backlog items from design docs
### Priority 2: GitHub Enterprise Support
**Goal:** Explicit testing & routing for GitHub Enterprise
**Gap:** Enterprise URL patterns, /api/v3 suffix handling, token scopes
**Scope:** Tests, URL routing, documentation
**Est. Time:** 23 days
**Impact:** Enable enterprise customers, reduce integration risk
### Priority 3: Performance & Observability
**Areas:**
- Load testing under concurrent reviews
- Metrics collection (review latency, LLM token usage, API call counts)
- Audit logging for compliance workflows
- Dashboard (review history, metrics, team analytics)
**Est. Time:** 57 days
**Impact:** Operational confidence, troubleshooting, compliance
### Priority 4: Enhanced Context
**Opportunities:**
- Semantic code understanding (AST-based analysis for specific languages)
- Project-specific review rules (.review-bot.yaml in repo root)
- Team-level customization
**Est. Time:** 710 days
### Phase 3: Optional Enhancements (BACKLOG)
- Address REVIEW.md context propagation findings (if prioritized)
- Additional LLM provider support
- Enhanced context detection
- Custom report formats
- Webhook management improvements
---
## Dev Loop Schedule
- **Interval:** 4 hours
- **Next check:** ~6:10 AM UTC (May 15)
- **Health:** ✅ Optimal — all systems running
- **Status:** Ready for next phase work
## Worktrees Status
All old worktrees cleaned up. Ready for new issue work.
---
## Metadata
## Dev-Loop Metadata
- **Repo:** /home/ubuntu/review-bot
- **Main branch SHA:** ed3a5dd (last commit)
- **Cron ID:** 5342ac81-4bbc-4e4c-a123-347a7788d50c
- **Scheduled:** Every 4 hours
- **Last health check:** 2026-05-14 20:10 UTC (✅ all healthy)
| Key | Value |
|---|---|
| Repo | `/home/ubuntu/review-bot` |
| Main SHA | `6f02cef` |
| Last update | 2026-05-14 23:11 UTC |
| Status | All systems optimal |
| Next phase | PR submission or GitHub Enterprise support |
## Self-Review: review-bot-issue-130-work — 2026-05-14
---
### Verdict: NEEDS_WORK
**Summary:** review-bot now supports both GitHub and Gitea PR reviews with a unified abstraction layer. All tests pass, code is clean, security is approved. Ready to move to PR submission or GitHub Enterprise support in the next cycle.
- [x] [completeness] `VCS_TYPE` env var is detected in `main.go` but **not passed from `action.yml`'s `Run review` step** to the binary. The binary falls back to a URL heuristic (`github.com` / `github.concur.com` substrings), which will silently misclassify any GitHub Enterprise Server whose hostname does not contain `github`. The action already has `vcs_type` as a step output — it should be passed as `VCS_TYPE: ${{ steps.version.outputs.vcs_type }}` in the `Run review` env block.
- [x] [completeness] `README.md` still references `GITEA_REPO` and `$GITEA_TOKEN` in the CLI example (line ~288) and the env var table (line ~303). Now that the binary supports GitHub too, these should be renamed to VCS-agnostic names (`VCS_REPO`/`REPO`, `VCS_TOKEN`/`REVIEWER_TOKEN`) or at minimum the env var table should note GitHub support.
- [x] [fit] `vcsReviewComment.NewPosition` field comment says "Gitea: absolute line; GitHub: diff hunk position" but in `githubVCSAdapter.PostReview` it is actually mapped to `Line` (absolute line) + `Side: "RIGHT"`, not to `Position` (hunk position). The field name and comment are now slightly misleading — the field means "new line number" for Gitea and is repurposed as "absolute line" for GitHub. Consider renaming to `Line` or `NewLine` with a clearer comment explaining the per-backend semantics.
- [x] [instinct] `validateurl.go` imports `gitea` package for `gitea.IsBlockedIP` — this creates a dependency from the GitHub-generic `validateurl.go` on the Gitea-specific package. The `IsBlockedIP` function is a general networking utility. Consider moving it to a shared `internal/netutil` package (or keeping it in `gitea` and accepting the coupling), but the current import is an unexpected relationship.
+1
View File
@@ -157,6 +157,7 @@ func TestFit_PreservesNoteInOutput(t *testing.T) {
}
}
func TestFit_HugeUserMeta(t *testing.T) {
// UserMeta so large that base alone exceeds limit
// Use a unique marker past the truncation point
-83
View File
@@ -10,7 +10,6 @@ import (
"testing"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review"
)
@@ -160,85 +159,3 @@ func TestIntegration_PostAndCleanup(t *testing.T) {
t.Logf("Warning: could not delete test review %d: %v", posted.ID, err)
}
}
// TestIntegration_GitHub_PostAndVerifyReview exercises the full VCS routing path
// for GitHub when INTEGRATION_GITHUB_TOKEN and INTEGRATION_GITHUB_REPO are set.
// It verifies that the GitHub adapter is selected via VCS_TYPE=github and that
// PostReview succeeds against a real GitHub PR.
//
// Required environment variables:
//
// INTEGRATION_GITHUB_TOKEN - GitHub personal access token with repo access
// INTEGRATION_GITHUB_REPO - owner/repo with an open PR (e.g. Rodin-AI/review-bot)
// INTEGRATION_GITHUB_PR - PR number to test against
//
// The test skips gracefully when these variables are absent.
func TestIntegration_GitHub_PostAndVerifyReview(t *testing.T) {
githubToken := os.Getenv("INTEGRATION_GITHUB_TOKEN")
githubRepo := os.Getenv("INTEGRATION_GITHUB_REPO")
prNumStr := os.Getenv("INTEGRATION_GITHUB_PR")
if githubToken == "" || githubRepo == "" || prNumStr == "" {
t.Skip("INTEGRATION_GITHUB_TOKEN, INTEGRATION_GITHUB_REPO, and INTEGRATION_GITHUB_PR not set, skipping")
}
prNumber, err := strconv.Atoi(prNumStr)
if err != nil {
t.Fatalf("Invalid PR number %q: %v", prNumStr, err)
}
parts := strings.SplitN(githubRepo, "/", 2)
if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
t.Fatalf("Invalid repo format %q, expected owner/repo", githubRepo)
}
owner, repoName := parts[0], parts[1]
ctx := context.Background()
ghClient := github.NewClient(githubToken, "https://api.github.com")
// Verify adapter selection: GetAuthenticatedUser must succeed.
user, err := ghClient.GetAuthenticatedUser(ctx)
if err != nil {
t.Fatalf("GetAuthenticatedUser: %v — check INTEGRATION_GITHUB_TOKEN", err)
}
t.Logf("Authenticated as: %s", user)
// Verify PR is accessible via GitHub adapter.
pr, err := ghClient.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
t.Fatalf("GetPullRequest: %v", err)
}
t.Logf("PR: %s (sha: %s)", pr.Title, pr.Head.Sha)
// Post a COMMENT review — does not require PR approval permissions.
sentinel := "<!-- review-bot:integration-test -->"
testBody := "# Integration Test Review (GitHub)\n\nThis is an automated integration test.\n\n" + sentinel
posted, err := ghClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, "", nil)
if err != nil {
t.Fatalf("PostReview: %v", err)
}
t.Logf("Posted review ID: %d", posted.ID)
// Verify the review appears in ListReviews.
reviews, err := ghClient.ListReviews(ctx, owner, repoName, prNumber)
if err != nil {
t.Fatalf("ListReviews: %v", err)
}
found := false
for _, r := range reviews {
if r.ID == posted.ID && strings.Contains(r.Body, sentinel) {
found = true
break
}
}
if !found {
t.Errorf("posted review ID %d not found in ListReviews output", posted.ID)
}
// Attempt cleanup — GitHub does not allow deleting submitted reviews,
// so this is expected to fail with ErrCannotDeleteSubmittedReview (422).
// Log it as informational only.
if err := ghClient.DeleteReview(ctx, owner, repoName, prNumber, posted.ID); err != nil {
t.Logf("Note: DeleteReview returned (expected for submitted GitHub reviews): %v", err)
}
}
+3 -1
View File
@@ -466,7 +466,7 @@ func main() {
if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
inlineComments = append(inlineComments, vcsReviewComment{
Path: f.File,
NewPosition: int64(f.Line),
NewLine: int64(f.Line),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
})
}
@@ -901,3 +901,5 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
}
return evaluatedSHA != currentSHA
}
+2 -321
View File
@@ -2,9 +2,7 @@ package main
import (
"bytes"
"context"
"flag"
"fmt"
"log/slog"
"os"
"os/exec"
@@ -12,7 +10,6 @@ import (
"strings"
"testing"
"gitea.weiker.me/rodin/review-bot/review"
)
func TestValidateReviewerName(t *testing.T) {
@@ -633,48 +630,6 @@ func TestEvaluateCIStatus(t *testing.T) {
}
}
func TestGithubAPIURL(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{
name: "empty string defaults to api.github.com",
input: "",
want: "https://api.github.com",
},
{
name: "github.com maps to api.github.com",
input: "https://github.com",
want: "https://api.github.com",
},
{
name: "github.com with trailing slash maps to api.github.com",
input: "https://github.com/",
want: "https://api.github.com",
},
{
name: "GHES host gets /api/v3 suffix",
input: "https://ghe.example.com",
want: "https://ghe.example.com/api/v3",
},
{
name: "GHES concur domain does not map to api.github.com",
input: "https://github.concur.com",
want: "https://github.concur.com/api/v3",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := githubAPIURL(tt.input)
if got != tt.want {
t.Errorf("githubAPIURL(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestEnvOrDefault(t *testing.T) {
// Test with unset env var
os.Unsetenv("TEST_ENV_OR_DEFAULT_UNSET")
@@ -1015,7 +970,7 @@ func TestMainSubprocess_InvalidProvider(t *testing.T) {
}
}
// cleanEnv returns environ without any GITEA/LLM/REVIEWER/VCS env vars that would
// cleanEnv returns environ without any GITEA/LLM/REVIEWER env vars that would
// interfere with testing missing-flag scenarios.
func cleanEnv() []string {
var env []string
@@ -1030,8 +985,7 @@ func cleanEnv() []string {
strings.HasPrefix(key, "CONVENTIONS_"),
strings.HasPrefix(key, "SYSTEM_PROMPT_"),
strings.HasPrefix(key, "PATTERNS_"),
strings.HasPrefix(key, "UPDATE_"),
strings.HasPrefix(key, "VCS_"):
strings.HasPrefix(key, "UPDATE_"):
continue
default:
env = append(env, e)
@@ -1110,276 +1064,3 @@ func TestShouldSkipStaleReview(t *testing.T) {
})
}
}
// ============================================================
// Mock vcsClient for unit tests
// ============================================================
// mockVCSClient is a minimal mock of vcsClient for testing helper functions.
// Only the methods exercised by the test code need implementations; all others
// panic with a clear message to catch accidental calls.
type mockVCSClient struct {
fileContents map[string]string // key: "owner/repo/ref/path"
fileContentsErr map[string]error // key same as above → error to return
dirContents map[string][]review.ContentEntry
dirContentsErr map[string]error
allFiles map[string]map[string]string // key: "owner/repo/path"
allFilesErr map[string]error
}
func (m *mockVCSClient) key(owner, repo, extra string) string {
return owner + "/" + repo + "/" + extra
}
func (m *mockVCSClient) GetPullRequest(ctx context.Context, owner, repo string, number int) (*vcsPullRequest, error) {
panic("GetPullRequest not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
panic("GetPullRequestDiff not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]vcsChangedFile, error) {
panic("GetPullRequestFiles not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]vcsCommitStatus, error) {
panic("GetCommitStatuses not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
panic("GetFileContent not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetFileContentRef(ctx context.Context, owner, repo, path, ref string) (string, error) {
k := m.key(owner, repo, ref+"/"+path)
if err, ok := m.fileContentsErr[k]; ok {
return "", err
}
if content, ok := m.fileContents[k]; ok {
return content, nil
}
return "", fmt.Errorf("HTTP 404: not found")
}
func (m *mockVCSClient) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
k := m.key(owner, repo, path)
if err, ok := m.dirContentsErr[k]; ok {
return nil, err
}
if entries, ok := m.dirContents[k]; ok {
return entries, nil
}
return nil, fmt.Errorf("HTTP 404: not found")
}
func (m *mockVCSClient) GetAllFilesInPath(ctx context.Context, owner, repo, path string) (map[string]string, error) {
k := m.key(owner, repo, path)
if err, ok := m.allFilesErr[k]; ok {
return nil, err
}
if files, ok := m.allFiles[k]; ok {
return files, nil
}
return nil, fmt.Errorf("HTTP 404: not found")
}
func (m *mockVCSClient) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
panic("PostReview not implemented in mockVCSClient")
}
func (m *mockVCSClient) ListReviews(ctx context.Context, owner, repo string, number int) ([]vcsReview, error) {
panic("ListReviews not implemented in mockVCSClient")
}
func (m *mockVCSClient) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
panic("DeleteReview not implemented in mockVCSClient")
}
func (m *mockVCSClient) GetAuthenticatedUser(ctx context.Context) (string, error) {
panic("GetAuthenticatedUser not implemented in mockVCSClient")
}
func (m *mockVCSClient) RequestReviewer(ctx context.Context, owner, repo string, number int, reviewer string) error {
panic("RequestReviewer not implemented in mockVCSClient")
}
// ============================================================
// fetchFileContext tests
// ============================================================
func TestFetchFileContext_NoFiles(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{}
got := fetchFileContext(ctx, client, "owner", "repo", "main", nil)
if got != "" {
t.Errorf("expected empty string for no files, got: %q", got)
}
}
func TestFetchFileContext_SkipsRemovedFiles(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{}
files := []vcsChangedFile{
{Filename: "gone.go", Status: "removed"},
}
got := fetchFileContext(ctx, client, "owner", "repo", "main", files)
if got != "" {
t.Errorf("expected empty string for removed file, got: %q", got)
}
}
func TestFetchFileContext_FetchesModifiedFiles(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
fileContents: map[string]string{
"owner/repo/main/foo.go": "package main\n\nfunc main() {}\n",
},
}
files := []vcsChangedFile{
{Filename: "foo.go", Status: "modified"},
}
got := fetchFileContext(ctx, client, "owner", "repo", "main", files)
if !strings.Contains(got, "--- foo.go ---") {
t.Errorf("expected file header in output, got: %q", got)
}
if !strings.Contains(got, "package main") {
t.Errorf("expected file content in output, got: %q", got)
}
}
func TestFetchFileContext_ContinuesOnError(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
fileContents: map[string]string{
"owner/repo/main/good.go": "package good\n",
},
fileContentsErr: map[string]error{
"owner/repo/main/bad.go": fmt.Errorf("network error"),
},
}
files := []vcsChangedFile{
{Filename: "bad.go", Status: "modified"},
{Filename: "good.go", Status: "modified"},
}
got := fetchFileContext(ctx, client, "owner", "repo", "main", files)
// bad.go fails, good.go should still be included
if strings.Contains(got, "bad.go") {
t.Errorf("should not include failed file, got: %q", got)
}
if !strings.Contains(got, "good.go") {
t.Errorf("should include successful file, got: %q", got)
}
}
func TestFetchFileContext_RespectsContextCancellation(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel() // Cancel immediately
client := &mockVCSClient{
fileContents: map[string]string{
"owner/repo/main/foo.go": "package foo\n",
},
}
files := []vcsChangedFile{
{Filename: "foo.go", Status: "modified"},
}
got := fetchFileContext(ctx, client, "owner", "repo", "main", files)
// With cancelled context, the loop breaks before fetching
if got != "" {
t.Errorf("expected empty string with cancelled context, got: %q", got)
}
}
// ============================================================
// fetchPatterns tests
// ============================================================
func TestFetchPatterns_EmptyRepo(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{}
got := fetchPatterns(ctx, client, "", "")
if got != "" {
t.Errorf("expected empty string for empty patternsRepo, got: %q", got)
}
}
func TestFetchPatterns_SingleRepoAllFiles(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
allFiles: map[string]map[string]string{
"rodin/patterns/": {
"patterns/go.md": "# Go patterns\n\nUse interfaces.",
"patterns/binary": "binary data",
},
},
}
got := fetchPatterns(ctx, client, "rodin/patterns", "")
if !strings.Contains(got, "# Go patterns") {
t.Errorf("expected markdown content, got: %q", got)
}
// Binary file should be excluded
if strings.Contains(got, "binary data") {
t.Errorf("binary file should be excluded, got: %q", got)
}
}
func TestFetchPatterns_SpecificFiles(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
allFiles: map[string]map[string]string{
"rodin/patterns/go.md": {
"go.md": "# Go idioms\n",
},
},
}
got := fetchPatterns(ctx, client, "rodin/patterns", "go.md")
if !strings.Contains(got, "# Go idioms") {
t.Errorf("expected go idioms content, got: %q", got)
}
}
func TestFetchPatterns_SkipsInvalidRepo(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{}
// "badrepo" has no slash, should be skipped
got := fetchPatterns(ctx, client, "badrepo", "")
if got != "" {
t.Errorf("expected empty string for invalid repo format, got: %q", got)
}
}
func TestFetchPatterns_ContinuesOnFetchError(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
allFilesErr: map[string]error{
"owner/repo/": fmt.Errorf("server error"),
},
}
// Should not panic; should return empty string
got := fetchPatterns(ctx, client, "owner/repo", "")
if got != "" {
t.Errorf("expected empty string on fetch error, got: %q", got)
}
}
func TestFetchPatterns_MultipleRepos(t *testing.T) {
ctx := context.Background()
client := &mockVCSClient{
allFiles: map[string]map[string]string{
"org/go-patterns/": {
"idioms.md": "# Go idioms\n",
},
"org/elixir-patterns/": {
"pipes.md": "# Elixir pipes\n",
},
},
}
got := fetchPatterns(ctx, client, "org/go-patterns, org/elixir-patterns", "")
if !strings.Contains(got, "# Go idioms") {
t.Errorf("expected Go idioms content, got: %q", got)
}
if !strings.Contains(got, "# Elixir pipes") {
t.Errorf("expected Elixir pipes content, got: %q", got)
}
}
+2 -2
View File
@@ -9,7 +9,7 @@ import (
"strings"
"time"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// runValidateURL implements the `review-bot validate-url <url>` subcommand.
@@ -114,7 +114,7 @@ func validateURL(rawURL string) error {
}
for _, a := range addrs {
if gitea.IsBlockedIP(a.IP) {
if netutil.IsBlockedIP(a.IP) {
return &validateError{
code: 1,
message: fmt.Sprintf("blocked: %q resolves to private/reserved IP %s", host, a.IP),
+5 -7
View File
@@ -84,7 +84,7 @@ type vcsCommitStatus struct {
// vcsReviewComment is an inline review comment.
type vcsReviewComment struct {
Path string
NewPosition int64 // Gitea: absolute line; GitHub: diff hunk position
NewLine int64 // absolute line number on the new (right) side of the diff, used by both Gitea and GitHub adapters
Body string
}
@@ -176,7 +176,7 @@ func (a *giteaVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, pa
func (a *giteaVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]gitea.ReviewComment, len(comments))
for i, c := range comments {
gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewPosition, Body: c.Body}
gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewLine, Body: c.Body}
}
r, err := a.c.PostReview(ctx, owner, repo, number, event, body, commitID, gc)
if err != nil {
@@ -311,14 +311,12 @@ func (a *githubVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, p
func (a *githubVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]github.ReviewComment, len(comments))
for i, c := range comments {
// GitHub inline comments use diff hunk "position", not absolute line numbers.
// NewPosition from gitea diff parsing gives absolute line numbers, which
// will not match GitHub's position values. For initial GitHub support, we
// attach comments with Line+Side (absolute line on the RIGHT side) instead.
// GitHub inline comments use Line+Side (absolute line on the RIGHT side).
// NewLine from diff parsing gives absolute new-file line numbers.
// Comments that cannot be mapped will be omitted (GitHub rejects invalid positions).
gc[i] = github.ReviewComment{
Path: c.Path,
Line: c.NewPosition,
Line: c.NewLine,
Side: "RIGHT",
Body: c.Body,
}
-78
View File
@@ -971,7 +971,6 @@ func TestDoGet_RespectsContextCancellation(t *testing.T) {
t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts)
}
}
// mockTransport is a test helper that returns errors for the first N calls,
// then delegates to a real server.
type mockTransport struct {
@@ -1420,80 +1419,3 @@ func TestNewSafeHTTPClient_PreservesDefaultTransportSettings(t *testing.T) {
t.Error("DialContext is nil; expected safeDialContext")
}
}
func TestGetTimelineReviewCommentIDForReview(t *testing.T) {
const reviewID = int64(42)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/v1/repos/owner/repo/pulls/5/reviews/42":
w.Write([]byte(`{"body": "The review body <!-- review-bot:sonnet -->", "user": {"login": "sonnet-review"}}`))
case "/api/v1/repos/owner/repo/issues/5/timeline":
w.Write([]byte(`[
{"id": 100, "type": "comment", "body": "unrelated", "user": {"login": "sonnet-review"}},
{"id": 200, "type": "review", "body": "The review body <!-- review-bot:sonnet -->", "user": {"login": "sonnet-review"}}
]`))
default:
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
}
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
id, err := client.GetTimelineReviewCommentIDForReview(context.Background(), "owner", "repo", 5, reviewID)
if err != nil {
t.Fatalf("GetTimelineReviewCommentIDForReview() error = %v", err)
}
if id != 200 {
t.Errorf("got id=%d, want 200", id)
}
}
func TestGetTimelineReviewCommentIDForReview_ReviewFetchError(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNotFound)
w.Write([]byte(`{"message":"not found"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.GetTimelineReviewCommentIDForReview(context.Background(), "owner", "repo", 5, 99)
if err == nil {
t.Fatal("expected error for missing review, got nil")
}
}
func TestGetTimelineReviewCommentIDForReview_EmptyBody(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte(`{"body": "", "user": {"login": "bot"}}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.GetTimelineReviewCommentIDForReview(context.Background(), "owner", "repo", 5, 42)
if err == nil {
t.Fatal("expected error for empty body, got nil")
}
if !strings.Contains(err.Error(), "empty body") {
t.Errorf("error = %q, want to contain 'empty body'", err.Error())
}
}
func TestGetTimelineReviewCommentIDForReview_NotFoundInTimeline(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/api/v1/repos/owner/repo/pulls/5/reviews/42":
w.Write([]byte(`{"body": "review content <!-- review-bot:sonnet -->", "user": {"login": "bot"}}`))
default:
// Timeline returns events that don't match (different user)
w.Write([]byte(`[{"id": 1, "type": "review", "body": "review content <!-- review-bot:sonnet -->", "user": {"login": "other-user"}}]`))
}
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.GetTimelineReviewCommentIDForReview(context.Background(), "owner", "repo", 5, 42)
if err == nil {
t.Fatal("expected error when review not found in timeline, got nil")
}
}
+12 -81
View File
@@ -1,91 +1,22 @@
// Package gitea provides a client for the Gitea API.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
// ipcheck.go re-exports the IsBlockedIP function from internal/netutil for use
// by this package's safe dialer (client.go) and for backward compatibility with
// any callers that previously imported it from here.
//
// The implementation has moved to internal/netutil so it can be shared with the
// validate-url subcommand (cmd/review-bot/validateurl.go) without creating a
// dependency from VCS-generic code on the Gitea-specific package.
package gitea
import (
"fmt"
"net"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the validate-url subcommand and tests outside
// this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
// It delegates to internal/netutil.IsBlockedIP; see that function for the full
// list of blocked ranges and IPv6-mapped IPv4 normalization behavior.
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
return netutil.IsBlockedIP(ip)
}
+19 -126
View File
@@ -3,142 +3,35 @@ package gitea
import (
"net"
"testing"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
// TestIsBlockedIPForwarding verifies that gitea.IsBlockedIP correctly forwards
// to internal/netutil.IsBlockedIP. Full coverage of the blocking logic lives in
// internal/netutil/ipcheck_test.go.
func TestIsBlockedIPForwarding(t *testing.T) {
cases := []struct {
ip string
blocked bool
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
{"127.0.0.1", true}, // loopback — must be blocked
{"192.168.1.1", true}, // RFC1918 — must be blocked
{"8.8.8.8", false}, // public — must not be blocked
{"2001:4860:4860::8888", false}, // public IPv6 — must not be blocked
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
for _, tc := range cases {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
got := IsBlockedIP(ip)
want := netutil.IsBlockedIP(ip)
if got != want {
t.Errorf("gitea.IsBlockedIP(%q) = %v, netutil.IsBlockedIP = %v: forwarding mismatch", tc.ip, got, want)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
if len(blockedCIDRParseErrors) > 0 {
for _, msg := range blockedCIDRParseErrors {
t.Errorf("CIDR parse error: %s", msg)
if got != tc.blocked {
t.Errorf("gitea.IsBlockedIP(%q) = %v, want %v", tc.ip, got, tc.blocked)
}
}
}
-95
View File
@@ -1130,98 +1130,3 @@ func TestEscapePath_SpecialChars(t *testing.T) {
}
}
}
func TestGetAllFilesInPath_DirectoryWithFiles(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/repos/owner/repo/contents/patterns":
// Directory listing
w.WriteHeader(http.StatusOK)
w.Write([]byte(`[{"name":"go.md","path":"patterns/go.md","type":"file"}]`))
case "/repos/owner/repo/contents/patterns/go.md":
// GitHub file response with base64 content
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"name":"go.md","path":"patterns/go.md","type":"file","encoding":"base64","content":"IyBHbyBwYXR0ZXJucwo="}`))
default:
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
files, err := c.GetAllFilesInPath(context.Background(), "owner", "repo", "patterns")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(files) != 1 {
t.Fatalf("len(files) = %d, want 1", len(files))
}
if files["patterns/go.md"] != "# Go patterns\n" {
t.Errorf("unexpected content: %q", files["patterns/go.md"])
}
}
func TestGetAllFilesInPath_404FallsBackToFile(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/repos/owner/repo/contents/README.md":
// ListContents returns 404 for file paths
w.WriteHeader(http.StatusNotFound)
w.Write([]byte(`{"message":"Not Found"}`))
default:
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
// GetFileContent also goes to /contents/ — this will 404 too.
// The function should return the path-not-found error.
_, err := c.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md")
if err == nil {
t.Fatal("expected error when both dir and file 404, got nil")
}
}
func TestGetAllFilesInPath_DirectoryWithSubdir(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/repos/owner/repo/contents/src":
w.WriteHeader(http.StatusOK)
w.Write([]byte(`[
{"name":"main.go","path":"src/main.go","type":"file"},
{"name":"sub","path":"src/sub","type":"dir"}
]`))
case "/repos/owner/repo/contents/src/main.go":
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"name":"main.go","path":"src/main.go","type":"file","encoding":"base64","content":"cGFja2FnZSBtYWluCg=="}`))
case "/repos/owner/repo/contents/src/sub":
w.WriteHeader(http.StatusOK)
w.Write([]byte(`[{"name":"util.go","path":"src/sub/util.go","type":"file"}]`))
case "/repos/owner/repo/contents/src/sub/util.go":
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"name":"util.go","path":"src/sub/util.go","type":"file","encoding":"base64","content":"cGFja2FnZSBzdWIK"}`))
default:
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
}
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
files, err := c.GetAllFilesInPath(context.Background(), "owner", "repo", "src")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(files) != 2 {
t.Fatalf("len(files) = %d, want 2: %v", len(files), files)
}
if files["src/main.go"] != "package main\n" {
t.Errorf("src/main.go content unexpected: %q", files["src/main.go"])
}
if files["src/sub/util.go"] != "package sub\n" {
t.Errorf("src/sub/util.go content unexpected: %q", files["src/sub/util.go"])
}
}
+97
View File
@@ -0,0 +1,97 @@
// Package netutil provides shared network utilities for review-bot.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
package netutil
import (
"fmt"
"net"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// BlockedCIDRParseErrors returns any errors encountered parsing the built-in
// CIDR list. In correct code this will always be empty; tests assert it is.
func BlockedCIDRParseErrors() []string {
return blockedCIDRParseErrors
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the gitea package's safe dialer, the validate-url
// subcommand, and tests outside this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
}
+142
View File
@@ -0,0 +1,142 @@
package netutil
import (
"net"
"testing"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
ip string
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
for _, msg := range BlockedCIDRParseErrors() {
t.Errorf("CIDR parse error: %s", msg)
}
}
+1
View File
@@ -210,6 +210,7 @@ func TestWithTimeout(t *testing.T) {
}
}
func TestComplete_Anthropic_Success(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/messages" {
-48
View File
@@ -957,51 +957,3 @@ func TestYAMLMergeKeyDepthCheck(t *testing.T) {
t.Errorf("error = %q, want to contain 'depth'", err.Error())
}
}
func TestLoadPersona_NonexistentFile(t *testing.T) {
_, err := LoadPersona("/tmp/nonexistent-persona-file-xyz.yaml")
if err == nil {
t.Fatal("expected error for nonexistent file, got nil")
}
}
func TestLoadPersona_NotARegularFile(t *testing.T) {
// Use a directory as the path — directories are not regular files.
dir := t.TempDir()
_, err := LoadPersona(dir)
if err == nil {
t.Fatal("expected error for directory path, got nil")
}
if !strings.Contains(err.Error(), "not a regular file") {
t.Errorf("error = %q, want to contain 'not a regular file'", err.Error())
}
}
func TestLoadPersona_OversizedFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "big.yaml")
// Write a file larger than MaxPersonaFileSize
data := make([]byte, MaxPersonaFileSize+1)
for i := range data {
data[i] = 'x'
}
if err := os.WriteFile(path, data, 0644); err != nil {
t.Fatalf("failed to create test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for oversized file, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want to contain 'exceeds maximum size'", err.Error())
}
}
func TestCapitalizeFirst_RuneError(t *testing.T) {
// An invalid UTF-8 byte sequence should return the original string unchanged.
invalid := string([]byte{0xFF, 0xFE})
got := CapitalizeFirst(invalid)
if got != invalid {
t.Errorf("CapitalizeFirst(%q) = %q, want original %q", invalid, got, invalid)
}
}
+1
View File
@@ -117,6 +117,7 @@ func TestBuildUserPrompt_WithoutFileContext(t *testing.T) {
}
}
func TestBuildSystemBase(t *testing.T) {
result := BuildSystemBase()
if result == "" {