Compare commits

..

3 Commits

Author SHA1 Message Date
Rodin 27a9be38bc fix: address PR #63 review findings
1. Refactor err2 to use scoped loadErr variable (MINOR - sonnet-review-bot)
   The else-if branches are mutually exclusive, so the error variable
   should be scoped inside the block, not declared outside with err2.

2. Sanitize DisplayName before embedding in Markdown (MINOR - security-review-bot)
   Remote persona metadata is untrusted. Added sanitizeMarkdownText() to
   escape Markdown special characters and strip control characters.
   Applied to both the header title and the footer attribution.

3. Document YAML DoS mitigations (MINOR - security-review-bot)
   Added comprehensive comment in remote_persona.go explaining existing
   defenses: file size limit, file count cap, depth limit, node count cap,
   and alias cycle detection. These collectively mitigate billion-laughs
   and stack exhaustion attacks.
2026-05-10 20:54:20 -07:00
Rodin 5fac8bc505 fix: address PR #62 review findings
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 27s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m5s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m40s
- Remove duplicate flag.Parse() call
- Fix nil map panic in LoadRemotePersonas error path by assigning
  empty map when LoadRemotePersonas returns an error
- Tighten isNotFoundError to only check HTTP 404 (remove broad
  'not found' substring check to avoid false positives)
- Clean up personaErr variable scope using narrower-scoped err variables
- Add proper doc comment to LoadRemotePersonasFromPath (Go convention)
- Add file count cap (50 files) in LoadRemotePersonasFromPath to
  prevent resource exhaustion from repos with thousands of small files
- Update test expectation for tightened isNotFoundError
2026-05-10 20:44:24 -07:00
Rodin 2f8d047ef2 feat: load personas from target repo .review-bot/personas/
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 8m12s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 8m15s
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 42s
Adds support for repository-specific personas. When --persona is
specified, review-bot now:

1. Checks the target repo's .review-bot/personas/<name>.yaml directory
2. Falls back to built-in persona if not found in repo

This allows repos to define domain-specific personas (trading, regulatory,
etc.) or override built-in personas with project-specific rules, without
requiring changes to CI configuration.

Implementation:
- New review.PersonaFetcher interface for abstracting Gitea API access
- review.LoadRemotePersonas() with graceful fallback on 404
- review.MergePersonas() for combining remote and built-in personas
- giteaFetcher adapter in main.go to bridge gitea.Client

The feature follows a partial-success model: invalid YAML files or
network errors for individual persona files are logged and skipped,
allowing other valid personas to load.

Closes #60
2026-05-10 19:05:55 -07:00
38 changed files with 1042 additions and 6235 deletions
+24 -330
View File
@@ -1,43 +1,17 @@
# This composite action supports both Gitea Actions and GitHub Actions runners. # This composite action is designed for Gitea Actions runners.
# It detects the VCS host type by checking whether github.api_url is set # Gitea Actions supports GitHub Actions syntax including $GITHUB_OUTPUT,
# (present on GitHub.com and GHES runners, absent on Gitea runners) and uses # actions/cache, and actions/checkout.
# the appropriate releases API for version resolution and binary download
# (REST API on GitHub, direct URLs on Gitea).
#
# Security notes:
# - On GitHub/GHES (VCS_TYPE=github), inputs.vcs-url is IGNORED to prevent
# token exfiltration. API calls use github.api_url; downloads use
# github.server_url. Tokens are never sent to user-supplied URLs.
# - On Gitea (VCS_TYPE=gitea), inputs.vcs-url is validated (https scheme,
# no whitespace/newlines, and DNS resolution to a public IP) before use.
# Python3 resolves the hostname and rejects RFC1918, RFC6598 (carrier-grade
# NAT), loopback, link-local, and other reserved addresses to prevent SSRF attacks.
# The installed review-bot binary additionally uses a safe HTTP transport
# (DialContext-level IP check) for all Gitea API calls at runtime.
# The binary also exposes a `validate-url` subcommand for use in any future
# shell steps that need to validate a URL before passing it to curl.
# - action-repo is validated against owner/repo pattern.
# - Tokens are passed via masked environment variables, not step outputs.
#
# Requirements: python3, sha256sum, curl (all present on ubuntu-* runners). # Requirements: python3, sha256sum, curl (all present on ubuntu-* runners).
name: 'AI Code Review' name: 'AI Code Review'
description: 'Run AI-powered code review on a pull request using review-bot' description: 'Run AI-powered code review on a pull request using review-bot'
inputs: inputs:
vcs-url: gitea-url:
description: 'VCS server URL (only used on Gitea runners; ignored on GitHub/GHES). Defaults to server_url.' description: 'Gitea instance URL (defaults to server_url)'
required: false required: false
default: '' default: ''
repo: repo:
description: 'Repository to review (owner/name, defaults to current)' description: 'Repository (owner/name, defaults to current)'
required: false
default: ''
action-repo:
description: 'Repository hosting review-bot releases (owner/name). Defaults to github.action_repository or rodin/review-bot.'
required: false
default: ''
action-repo-token:
description: 'Token for downloading release assets from action-repo (defaults to github.token on GitHub, reviewer-token on Gitea). Required for private repos.'
required: false required: false
default: '' default: ''
pr-number: pr-number:
@@ -45,7 +19,7 @@ inputs:
required: false required: false
default: '' default: ''
reviewer-token: reviewer-token:
description: 'Token for posting the review' description: 'Gitea token for posting the review'
required: true required: true
reviewer-name: reviewer-name:
description: 'Display name for the reviewer' description: 'Display name for the reviewer'
@@ -138,325 +112,45 @@ runs:
id: version id: version
shell: bash shell: bash
run: | run: |
set -euo pipefail GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
# --- Input Validation ---
# Determine the repo hosting review-bot releases (not the repo being reviewed)
ACTION_REPO="${{ inputs.action-repo }}"
if [ -z "$ACTION_REPO" ]; then
# github.action_repository is the repo containing the running action
ACTION_REPO="${{ github.action_repository }}"
fi
if [ -z "$ACTION_REPO" ]; then
# Final fallback for Gitea (which may not set action_repository)
ACTION_REPO="rodin/review-bot"
echo "::notice::action-repo not specified and github.action_repository is empty; falling back to rodin/review-bot"
fi
# Validate ACTION_REPO matches owner/repo pattern (prevent path traversal)
if ! printf '%s' "$ACTION_REPO" | grep -qE '^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$'; then
echo "Error: action-repo '${ACTION_REPO}' does not match expected owner/repo format" >&2
exit 1
fi
# Detect VCS host type using github.api_url context.
# github.api_url is set on GitHub.com (https://api.github.com) and GHES
# (https://<host>/api/v3). It is empty/unset on Gitea Actions runners.
GITHUB_API_URL="${{ github.api_url }}"
if [ -n "$GITHUB_API_URL" ]; then
VCS_TYPE="github"
else
VCS_TYPE="gitea"
fi
# Determine SERVER_URL based on VCS type.
# SECURITY: On GitHub/GHES, ALWAYS use github.server_url — never trust
# inputs.vcs-url to prevent token exfiltration to attacker-controlled hosts.
if [ "$VCS_TYPE" = "github" ]; then
SERVER_URL="${{ github.server_url }}"
if [ -n "${{ inputs.vcs-url }}" ]; then
echo "::warning::inputs.vcs-url is ignored on GitHub/GHES runners (VCS_TYPE=github). Using github.server_url instead."
fi
else
SERVER_URL="${{ inputs.vcs-url || github.server_url }}"
fi
# Strip trailing slash if present
SERVER_URL="${SERVER_URL%/}"
# Validate SERVER_URL for Gitea path: must be https, no whitespace/newlines.
# The [^[:space:]] class already rejects newlines, so no separate newline check needed.
if [ "$VCS_TYPE" = "gitea" ]; then
if ! printf '%s' "$SERVER_URL" | grep -qE '^https://[^[:space:]]+$'; then
echo "Error: SERVER_URL '${SERVER_URL}' must be an https:// URL with no whitespace" >&2
exit 1
fi
# Additional IP-level SSRF defense: resolve the hostname and reject
# requests to RFC1918, RFC6598 (carrier-grade NAT), loopback, link-local,
# and other reserved addresses.
# python3 is required on ubuntu-* runners (see requirements comment above).
# Use printf to write the script to a temp file so the python lines are valid
# YAML (each indented line becomes a printf argument — no unindented code).
# SERVER_URL is passed via CHECK_URL env var, never interpolated into python code.
printf '%s\n' \
'import socket,ipaddress,sys,os' \
'from urllib.parse import urlparse' \
'u=os.environ["CHECK_URL"]; parsed=urlparse(u)' \
'if parsed.username or parsed.password:' \
' print("Error: URL contains user-info — not allowed",file=sys.stderr); sys.exit(2)' \
'h=parsed.hostname' \
'(print("Error: no hostname",file=sys.stderr) or sys.exit(2)) if not h else None' \
'try: rs=socket.getaddrinfo(h,None)' \
'except socket.gaierror as e: print(f"DNS error: {e}",file=sys.stderr); sys.exit(1)' \
'if not rs: print("Error: no addresses",file=sys.stderr); sys.exit(1)' \
'for _,_,_,_,(a,*_) in rs:' \
' ip=ipaddress.ip_address(a)' \
' if isinstance(ip,ipaddress.IPv6Address) and ip.ipv4_mapped: ip=ip.ipv4_mapped' \
' cgn=ipaddress.ip_network("100.64.0.0/10")' \
' if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_multicast or ip.is_reserved or ip in cgn:' \
' print(f"blocked: {a}",file=sys.stderr); sys.exit(1)' \
> /tmp/_ssrf_check.py
CHECK_URL="${SERVER_URL}" python3 /tmp/_ssrf_check.py || {
echo "Error: SERVER_URL '${SERVER_URL}' resolves to a private/reserved IP address" >&2
exit 1
}
fi
# Determine auth token for release API requests
ACTION_TOKEN="${{ inputs.action-repo-token }}"
if [ -z "$ACTION_TOKEN" ]; then
if [ "$VCS_TYPE" = "github" ]; then
ACTION_TOKEN="${{ github.token }}"
else
ACTION_TOKEN="${{ inputs.reviewer-token }}"
fi
fi
# Validate token contains no control characters (defense-in-depth against header injection)
if [ -n "$ACTION_TOKEN" ]; then
if printf '%s' "$ACTION_TOKEN" | LC_ALL=C grep -q '[^[:print:]]'; then
echo "Error: ACTION_TOKEN contains control characters" >&2
exit 1
fi
fi
if [ "${{ inputs.version }}" = "latest" ]; then if [ "${{ inputs.version }}" = "latest" ]; then
if [ "$VCS_TYPE" = "github" ]; then VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
# SECURITY: Use github.api_url which is a trusted platform-provided value. | python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
# Never construct API URLs from user-supplied inputs on GitHub.
API_URL="${GITHUB_API_URL}/repos/${ACTION_REPO}/releases?per_page=1"
else
# Gitea API — SERVER_URL was validated above
API_URL="${SERVER_URL}/api/v1/repos/${ACTION_REPO}/releases?limit=1"
fi
# Fetch latest version with inline auth header (no intermediate variable)
if [ -n "$ACTION_TOKEN" ]; then
if [ "$VCS_TYPE" = "github" ]; then
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" "$API_URL" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
else
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: token ${ACTION_TOKEN}" "$API_URL" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
fi
else
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 "$API_URL" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
fi
if [ -z "$VERSION" ]; then if [ -z "$VERSION" ]; then
echo "Failed to determine latest version from ${API_URL}" >&2 echo "Failed to determine latest version" >&2
exit 1 exit 1
fi fi
else else
VERSION="${{ inputs.version }}" VERSION="${{ inputs.version }}"
fi fi
# Validate VERSION: no slashes or whitespace (prevent path traversal).
# [:space:] includes newlines and carriage returns in POSIX.
if printf '%s' "$VERSION" | grep -qE '[/[:space:]]'; then
echo "Error: VERSION '${VERSION}' contains invalid characters (newline, slash, or whitespace)" >&2
exit 1
fi
# Detect OS and architecture for platform-specific binary download
OS_RAW=$(uname -s | tr '[:upper:]' '[:lower:]')
case "$OS_RAW" in
linux) OS="linux" ;;
darwin) OS="darwin" ;;
*)
echo "Error: unsupported OS: $(uname -s)" >&2
exit 1
;;
esac
RAW_ARCH=$(uname -m)
case "$RAW_ARCH" in
x86_64) ARCH="amd64" ;;
aarch64 | arm64) ARCH="arm64" ;;
*)
echo "Error: unsupported architecture: $RAW_ARCH" >&2
exit 1
;;
esac
echo "version=${VERSION}" >> "$GITHUB_OUTPUT" echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
echo "os=${OS}" >> "$GITHUB_OUTPUT"
echo "arch=${ARCH}" >> "$GITHUB_OUTPUT"
echo "action_repo=${ACTION_REPO}" >> "$GITHUB_OUTPUT"
echo "server_url=${SERVER_URL}" >> "$GITHUB_OUTPUT"
echo "vcs_type=${VCS_TYPE}" >> "$GITHUB_OUTPUT"
# SECURITY: Pass token via masked environment variable instead of step output.
# Step outputs can leak in debug logs; GITHUB_ENV with masking is safer.
if [ -n "$ACTION_TOKEN" ]; then
echo "::add-mask::${ACTION_TOKEN}"
echo "ACTION_TOKEN=${ACTION_TOKEN}" >> "$GITHUB_ENV"
fi
- name: Cache review-bot binary - name: Cache review-bot binary
id: cache id: cache
uses: actions/cache@v4 uses: actions/cache@v4
with: with:
path: ${{ runner.temp }}/review-bot path: ${{ runner.temp }}/review-bot
key: review-bot-${{ steps.version.outputs.os }}-${{ steps.version.outputs.arch }}-${{ steps.version.outputs.version }} key: review-bot-linux-amd64-${{ steps.version.outputs.version }}
- name: Install review-bot - name: Install review-bot
if: steps.cache.outputs.cache-hit != 'true' if: steps.cache.outputs.cache-hit != 'true'
shell: bash shell: bash
run: | run: |
set -euo pipefail GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
SERVER_URL="${{ steps.version.outputs.server_url }}"
ACTION_REPO="${{ steps.version.outputs.action_repo }}"
VERSION="${{ steps.version.outputs.version }}" VERSION="${{ steps.version.outputs.version }}"
VCS_TYPE="${{ steps.version.outputs.vcs_type }}" BINARY="review-bot-linux-amd64"
OS="${{ steps.version.outputs.os }}"
ARCH="${{ steps.version.outputs.arch }}"
# Read token from masked environment variable (set in Determine version step)
# Falls back to empty if not set (public repos don't need auth)
ACTION_TOKEN="${ACTION_TOKEN:-}"
BINARY="review-bot-${OS}-${ARCH}"
# SECURITY: Re-validate SERVER_URL at the start of this step to mitigate DNS curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
# rebinding attacks. A DNS TTL expiry between "Determine version" and here -o "${{ runner.temp }}/review-bot"
# could allow an attacker to change the resolved IP to a private/reserved curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
# address, causing curl to send ACTION_TOKEN to an internal host. -o "${{ runner.temp }}/checksums.txt"
# Only needed on Gitea path (VCS_TYPE=gitea); GitHub/GHES uses platform-controlled URLs.
if [ "$VCS_TYPE" = "gitea" ]; then
printf '%s\n' \
'import socket,ipaddress,sys,os' \
'from urllib.parse import urlparse' \
'u=os.environ["CHECK_URL"]; parsed=urlparse(u)' \
'if parsed.username or parsed.password:' \
' print("Error: URL contains user-info — not allowed",file=sys.stderr); sys.exit(2)' \
'h=parsed.hostname' \
'(print("Error: no hostname",file=sys.stderr) or sys.exit(2)) if not h else None' \
'try: rs=socket.getaddrinfo(h,None)' \
'except socket.gaierror as e: print(f"DNS error: {e}",file=sys.stderr); sys.exit(1)' \
'if not rs: print("Error: no addresses",file=sys.stderr); sys.exit(1)' \
'for _,_,_,_,(a,*_) in rs:' \
' ip=ipaddress.ip_address(a)' \
' if isinstance(ip,ipaddress.IPv6Address) and ip.ipv4_mapped: ip=ip.ipv4_mapped' \
' cgn=ipaddress.ip_network("100.64.0.0/10")' \
' if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_multicast or ip.is_reserved or ip in cgn:' \
' print(f"blocked: {a}",file=sys.stderr); sys.exit(1)' \
> /tmp/_ssrf_check_install.py
CHECK_URL="${SERVER_URL}" python3 /tmp/_ssrf_check_install.py || {
echo "Error: SERVER_URL '${SERVER_URL}' resolves to a private/reserved IP address" >&2
exit 1
}
fi
if [ "$VCS_TYPE" = "github" ]; then
# GitHub/GHES: Use REST API for release asset downloads.
# Web release URLs ({server}/.../releases/download/{tag}/{asset}) redirect
# to S3 and don't reliably support Authorization headers for private repos.
# The REST API endpoint with Accept: application/octet-stream is required.
# GITHUB_API_URL: trusted platform value, same as detected in "Determine version" step.
GITHUB_API_URL="${{ github.api_url }}"
if [ -n "$ACTION_TOKEN" ]; then
RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
else
RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
fi
# Extract asset IDs for binary and checksums
BINARY_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "import sys, json; assets = json.load(sys.stdin).get('assets', []); matches = [a['id'] for a in assets if a['name'] == '${BINARY}']; print(matches[0] if matches else '')")
if [ -z "$BINARY_ASSET_ID" ]; then
echo "Error: could not find asset '${BINARY}' in release ${VERSION}" >&2
exit 1
fi
CHECKSUMS_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "import sys, json; assets = json.load(sys.stdin).get('assets', []); matches = [a['id'] for a in assets if a['name'] == 'checksums.txt']; print(matches[0] if matches else '')")
if [ -z "$CHECKSUMS_ASSET_ID" ]; then
echo "Error: could not find asset 'checksums.txt' in release ${VERSION}" >&2
exit 1
fi
# Download assets via REST API with Accept: application/octet-stream
if [ -n "$ACTION_TOKEN" ]; then
curl -sSfL --connect-timeout 10 --max-time 120 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
-o "${{ runner.temp }}/checksums.txt"
else
curl -sSfL --connect-timeout 10 --max-time 120 \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL --connect-timeout 10 --max-time 30 \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
-o "${{ runner.temp }}/checksums.txt"
fi
else
# Gitea: Direct download via web release URLs (Gitea serves assets
# directly without redirects — no -L needed).
# SECURITY: Omitting -L prevents forwarding Authorization header to
# unexpected hosts if Gitea ever introduces CDN redirects.
DOWNLOAD_URL="${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}"
if [ -n "$ACTION_TOKEN" ]; then
curl -sSf --connect-timeout 10 --max-time 120 \
-H "Authorization: token ${ACTION_TOKEN}" \
"${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: token ${ACTION_TOKEN}" \
"${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
else
curl -sSf --connect-timeout 10 --max-time 120 \
"${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
curl -sSf --connect-timeout 10 --max-time 30 \
"${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
fi
fi
# Verify SHA-256 checksum # Verify SHA-256 checksum
# NOTE: This verifies integrity (download wasn't corrupted) but not
# authenticity — both binary and checksums come from the same server.
# For stronger guarantees, consider GPG signature verification.
cd "${{ runner.temp }}" cd "${{ runner.temp }}"
EXPECTED=$(grep -E "^[0-9a-f]+[[:space:]]+\*?${BINARY}$" checksums.txt | awk '{print $1}') EXPECTED=$(grep "${BINARY}" checksums.txt | awk '{print $1}')
# sha256sum (GNU) is not available on macOS; use shasum -a 256 on darwin. ACTUAL=$(sha256sum review-bot | awk '{print $1}')
if [ "${OS}" = "darwin" ]; then
ACTUAL=$(shasum -a 256 review-bot | awk '{print $1}')
else
ACTUAL=$(sha256sum review-bot | awk '{print $1}')
fi
if [ -z "$EXPECTED" ]; then if [ -z "$EXPECTED" ]; then
echo "Error: no checksum found for ${BINARY}" >&2 echo "Error: no checksum found for ${BINARY}" >&2
@@ -470,12 +164,12 @@ runs:
fi fi
chmod +x "${{ runner.temp }}/review-bot" chmod +x "${{ runner.temp }}/review-bot"
echo "Installed review-bot-${OS}-${ARCH} ${VERSION} (checksum verified)" echo "Installed review-bot ${VERSION} (checksum verified)"
- name: Run review - name: Run review
shell: bash shell: bash
env: env:
VCS_URL: ${{ steps.version.outputs.server_url }} GITEA_URL: ${{ inputs.gitea-url || github.server_url }}
GITEA_REPO: ${{ inputs.repo || github.repository }} GITEA_REPO: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }} PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }} REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
+3 -5
View File
@@ -38,8 +38,6 @@ jobs:
- name: security - name: security
token_secret: SECURITY_REVIEW_TOKEN token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5 model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: "."
system_prompt_file: SECURITY_REVIEW.md system_prompt_file: SECURITY_REVIEW.md
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -49,7 +47,7 @@ jobs:
- run: go build -o review-bot ./cmd/review-bot - run: go build -o review-bot ./cmd/review-bot
- name: Run ${{ matrix.name }} review - name: Run ${{ matrix.name }} review
env: env:
VCS_URL: ${{ github.server_url }} GITEA_URL: ${{ github.server_url }}
GITEA_REPO: ${{ github.repository }} GITEA_REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }} PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }} REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
@@ -62,8 +60,8 @@ jobs:
AICORE_API_URL: ${{ secrets.AICORE_API_URL }} AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }} AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: "CONVENTIONS.md" CONVENTIONS_FILE: "CONVENTIONS.md"
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }} PATTERNS_REPO: "rodin/go-patterns"
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }} PATTERNS_FILES: "README.md,patterns/"
LLM_TIMEOUT: "600" LLM_TIMEOUT: "600"
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }} SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot run: ./review-bot
+1 -1
View File
@@ -9,7 +9,7 @@
| Package | Use Case | Scope | | Package | Use Case | Scope |
|---------|----------|-------| |---------|----------|-------|
| `github.com/goccy/go-yaml` | YAML parsing and AST inspection (subpkgs: `ast`, `parser`) | production | | `gopkg.in/yaml.v3` | YAML parsing (persona files, config) | production |
| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only | | `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |
**Any import not in this table or the Go standard library is forbidden.** **Any import not in this table or the Go standard library is forbidden.**
-175
View File
@@ -1,175 +0,0 @@
# Plan: Issue #125 — Rename GITEA_URL → VCS_URL
## Problem
The `GITEA_URL` environment variable (and `--gitea-url` flag) implies the binary only works with Gitea.
Now that review-bot supports both Gitea and GitHub/GHES, this name is misleading.
Renaming to `VCS_URL` makes the binary platform-agnostic in its interface.
## Constraints
- Must not break existing users who already use `GITEA_URL` — need a fallback
- The CLI flag `--gitea-url` should also be updated to `--vcs-url` for consistency
- `INTEGRATION_GITEA_URL` in integration tests is a test-only env var, not the binary's interface; but should be updated for clarity
- The action YAML uses `GITEA_URL` as an internal shell variable in bash scripts — distinct from the env var passed to the binary
- All changes must compile and pass existing tests
## Files Affected
### Binary / Go source
| File | Change |
|------|--------|
| `cmd/review-bot/main.go` | Rename `--gitea-url``--vcs-url`, add `VCS_URL` as primary, keep `GITEA_URL` fallback |
| `cmd/review-bot/integration_test.go` | Rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` (test-only, no external compat concern) |
| `integration_test.go` | Same — rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` |
### Action YAML
| File | Change |
|------|--------|
| `.gitea/actions/review/action.yml` | Rename input `gitea-url``vcs-url`; update env var passed to binary: `VCS_URL` instead of `GITEA_URL`; keep internal bash var as `GITEA_URL` (only used for release download, not passed to binary) |
| `.gitea/workflows/ci.yml` | Rename `GITEA_URL` env var to `VCS_URL` in Run review step |
### Documentation
| File | Change |
|------|--------|
| `README.md` | Update CLI example, env var table entry |
## Proposed Approach
### 1. Backward-compatible env var lookup in main.go
Replace:
```go
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
```
With:
```go
giteaURL := flag.String("vcs-url", envOrDefaultFallback("VCS_URL", "GITEA_URL", ""), "VCS server URL (e.g. https://gitea.example.com)")
```
Add a helper:
```go
// envOrDefaultFallback reads primary env var; if empty, falls back to deprecated env var.
func envOrDefaultFallback(primary, deprecated, defaultVal string) string {
if v := os.Getenv(primary); v != "" {
return v
}
if v := os.Getenv(deprecated); v != "" {
slog.Warn("deprecated env var in use; rename to " + primary, "old", deprecated, "new", primary)
return v
}
return defaultVal
}
```
**Note:** This must be called AFTER `setupLogger` conceptually, but the flag default is evaluated at flag registration time. Since `setupLogger` runs before `flag.Parse()`, the slog.Warn will print correctly at runtime. We use `log.Printf` as a fallback if this proves problematic.
Actually — flag defaults are evaluated at registration (line 57), before `setupLogger`. The warning won't go through slog. Two options:
- Use `log.Printf` for the deprecation warning (always visible)
- Move the fallback lookup to after `flag.Parse()`, checking if the parsed value is still empty
**Decision:** Move fallback to a post-parse check. This is cleaner:
```go
vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL")
flag.Parse()
// Backward compat: fall back to deprecated GITEA_URL
if *vcsURL == "" {
if v := os.Getenv("GITEA_URL"); v != "" {
slog.Warn("GITEA_URL is deprecated; use VCS_URL instead")
*vcsURL = v
}
}
```
This is clean, idiomatic, and the warning goes through slog correctly.
### 2. Keep `--gitea-url` as deprecated alias
Add a hidden flag for backward compat:
```go
giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
```
Post-parse:
```go
if *vcsURL == "" && *giteaURLAlias != "" {
slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
*vcsURL = *giteaURLAlias
}
```
### 3. Internal variable rename
Rename `giteaURL` local variable → `vcsURL` throughout `main.go` for consistency.
### 4. Error message update
```go
fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n")
```
### 5. Action YAML changes
In `.gitea/actions/review/action.yml`:
- Input `gitea-url``vcs-url` (with same description, `required: false`, `default: ''`)
- Line 172: `GITEA_URL: ${{ inputs.gitea-url || github.server_url }}``VCS_URL: ${{ inputs.vcs-url || github.server_url }}`
- Lines 115, 140: internal bash vars `GITEA_URL=` are used for downloading binaries — NOT passed to the review-bot binary. Leave them as internal bash vars (they're scope-local in bash). These could be renamed to `SERVER_URL` or `BASE_URL` for local clarity, but renaming them isn't strictly required.
In `.gitea/workflows/ci.yml`:
- Line 52: `GITEA_URL: ${{ github.server_url }}``VCS_URL: ${{ github.server_url }}`
### 6. Integration test updates
`INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` in both test files.
### 7. README
- CLI example: `--gitea-url``--vcs-url`
- Env var table: `GITEA_URL``VCS_URL`, add note about `GITEA_URL` fallback
## Backward Compatibility Summary
| Old | New | Fallback? |
|-----|-----|-----------|
| `GITEA_URL` env var | `VCS_URL` | ✅ with deprecation warning |
| `--gitea-url` flag | `--vcs-url` | ✅ with deprecation warning |
| `gitea-url` action input | `vcs-url` | ⚠️ No (action version bump handles this) |
| `INTEGRATION_GITEA_URL` | `INTEGRATION_VCS_URL` | N/A (test-only) |
## Error Cases
- Both `VCS_URL` and `GITEA_URL` set: `VCS_URL` wins (primary takes precedence)
- Both `--vcs-url` and `--gitea-url` provided: `--vcs-url` wins
- Neither set: existing "missing required flags" error unchanged
## Edge Cases
- `os.Getenv` returns "" for unset AND set-to-empty — consistent with existing behavior
- The `envOrDefault` helper is unchanged; we add `envOrDefaultFallback` for the one renamed var
## Testing Strategy
- Existing unit tests pass unchanged (they don't test env var parsing directly)
- Integration tests updated to use new env var name
- Manual: `GITEA_URL=https://example.com ./review-bot --repo x --pr 1 ...` should print deprecation warning and proceed
- Manual: `VCS_URL=https://example.com ./review-bot ...` should work silently
## Completion Checklist
1. `VCS_URL` is read first; `GITEA_URL` is fallback with deprecation warning
2. `--vcs-url` flag is primary; `--gitea-url` is deprecated alias with warning
3. Error message references `--vcs-url` not `--gitea-url`
4. `action.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
5. `ci.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
6. README updated in CLI example and env var table
7. Integration tests use `INTEGRATION_VCS_URL`
8. `go test ./...` passes
9. `go vet ./...` passes
10. `go build ./cmd/review-bot` succeeds
## Open Questions
- Should the CLI flag `--gitea-url` be completely hidden from `--help` or just deprecated with a note? The issue doesn't specify. Decision: keep it visible but add "(deprecated: use --vcs-url)" to the description.
- Should action.yml also add `gitea-url` as a deprecated input alias? The issue says "Update the action to pass the new env var name" — no mention of backward compat for the action input. Decision: rename only, no alias (action users pin a version anyway).
- The bash-internal `GITEA_URL` variable in action.yml scripts (used for release download, not passed to binary) — rename for clarity? Decision: yes, rename to `BASE_URL` to avoid confusion with the env var.
+39 -5
View File
@@ -282,7 +282,7 @@ Rules:
```bash ```bash
review-bot \ review-bot \
--vcs-url https://gitea.example.com \ --gitea-url https://gitea.example.com \
--repo owner/name \ --repo owner/name \
--pr 42 \ --pr 42 \
--reviewer-token "$GITEA_TOKEN" \ --reviewer-token "$GITEA_TOKEN" \
@@ -299,7 +299,7 @@ All flags have environment variable equivalents:
| Flag | Env Var | | Flag | Env Var |
|------|---------| |------|---------|
| `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) | | `--gitea-url` | `GITEA_URL` |
| `--repo` | `GITEA_REPO` | | `--repo` | `GITEA_REPO` |
| `--pr` | `PR_NUMBER` | | `--pr` | `PR_NUMBER` |
| `--reviewer-token` | `REVIEWER_TOKEN` | | `--reviewer-token` | `REVIEWER_TOKEN` |
@@ -329,12 +329,11 @@ All flags have environment variable equivalents:
### Token Scopes Required ### Token Scopes Required
| Scope | Purpose | | Scope | Purpose |
|-------|--------| |-------|---------|
| `write:issue` | Post and delete reviews | | `write:issue` | Post and delete reviews |
| `write:repository` | Read PR diffs, file content, commit statuses | | `write:repository` | Read PR diffs, file content, commit statuses |
| `read:user` | Self-request as reviewer (optional but recommended) |
Without `read:user`, the bot still works but cannot add itself to the PR's reviewer list. No `read:user` scope needed — the bot identifies itself from the review response.
## Development ## Development
@@ -460,6 +459,41 @@ YAML is the recommended format for personas because it supports:
JSON is also supported for backwards compatibility—just use `.json` extension. JSON is also supported for backwards compatibility—just use `.json` extension.
### Repository Personas (Auto-Discovery)
Repositories can ship their own personas in `.review-bot/personas/`. When you specify `--persona <name>`, review-bot will:
1. **Try to load from the target repo** — Checks `.review-bot/personas/<name>.yaml` (or `.yml`)
2. **Fall back to built-in** — If not found in repo, uses the built-in persona
This lets each repo define domain-specific personas without modifying CI config:
```
my-trading-repo/
├── .review-bot/
│ └── personas/
│ ├── trading.yaml # Custom trading persona
│ └── regulatory.yaml # Compliance-focused reviews
├── lib/
└── ...
```
```yaml
# CI config (no persona-file needed)
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: trading
persona: trading # Will find .review-bot/personas/trading.yaml
...
```
**Priority order:**
1. Repo's `.review-bot/personas/<name>.yaml`
2. Built-in persona with matching name
3. Error if neither exists
This allows repos to override built-in personas (e.g., a custom `security` persona that adds project-specific rules) while keeping the simple `persona: security` syntax in CI.
### Persona vs system-prompt-file ### Persona vs system-prompt-file
-79
View File
@@ -1,79 +0,0 @@
## Dev Loop: review-bot — 2026-05-14 20:10 UTC
### Latest: ✅ STABLE STATE — REPO HEALTH COMPLETE
- **Last action:** health check; verified tests pass, repo clean, no action needed
- **Repository:** Clean, all merges complete, no open issues/PRs
- **Main branch:** Up to date with origin/main
- **Test suite:** All passing (cached)
---
## Repository Status
### ✅ Merged to main (recent):
- issue-123 (IP-level SSRF defense) — 6 commits, main at 4440823
- issue-125 (VCS_URL rename + deprecation) — merged
- issue-124 (multi-arch binary support) — merged
- issue-120 (GitHub Actions + VCS abstraction) — merged
- issue-121 (VCS host type detection for binary download) — merged
### 🧹 Cleanup COMPLETE:
- ✅ Removed old worktrees (issue-123, review-bot-issue-125)
- ✅ Test suite passes (all packages)
- ✅ No TODO/FIXME in code except expected GitHub client notes
- ✅ No open issues or pull requests
- ✅ Dependencies up to date
---
## Current Feature Completeness
**Core Capabilities:**
- Multi-provider LLM support (OpenAI, Anthropic, SAP AI Core)
- Gitea PR integration with structured reviews
- SSRF defense with IP-level validation
- VCS abstraction (Gitea/GitHub support)
- Multi-architecture binary support
- GitHub Actions composite action
**Recent Security Work:**
- RFC6598 CGN range detection
- IP fallback dialing for local endpoint rejection
- URL validation for SSRF prevention
**Code Quality:**
- Comprehensive test coverage (all packages tested)
- Consistent error handling with context propagation
- Secure credential handling (unexported fields)
- Concurrency-safe designs
---
## Next Priority Actions
### Phase 2: Feature Exploration (NEXT SESSION)
- Scan code for potential improvements per REVIEW.md findings
- Assess performance under load
- Review REVIEW.md findings for targeted fixes
- Consider backlog items from design docs
### Phase 3: Optional Enhancements (BACKLOG)
- Address REVIEW.md context propagation findings (if prioritized)
- Additional LLM provider support
- Enhanced context detection
- Custom report formats
- Webhook management improvements
---
## Worktrees Status
All old worktrees cleaned up. Ready for new issue work.
---
## Dev-Loop Metadata
- **Repo:** /home/ubuntu/review-bot
- **Main branch SHA:** ed3a5dd (last commit)
- **Cron ID:** 5342ac81-4bbc-4e4c-a123-347a7788d50c
- **Scheduled:** Every 4 hours
- **Last health check:** 2026-05-14 20:10 UTC (✅ all healthy)
+4 -4
View File
@@ -17,7 +17,7 @@ import (
// Integration test requires a running Gitea instance and LLM endpoint. // Integration test requires a running Gitea instance and LLM endpoint.
// Set environment variables: // Set environment variables:
// //
// INTEGRATION_VCS_URL - VCS base URL // INTEGRATION_GITEA_URL - Gitea base URL
// INTEGRATION_GITEA_TOKEN - Gitea API token with repo access // INTEGRATION_GITEA_TOKEN - Gitea API token with repo access
// INTEGRATION_GITEA_REPO - owner/repo with an open PR // INTEGRATION_GITEA_REPO - owner/repo with an open PR
// INTEGRATION_PR_NUMBER - PR number to test against // INTEGRATION_PR_NUMBER - PR number to test against
@@ -25,7 +25,7 @@ import (
// INTEGRATION_LLM_API_KEY - LLM API key // INTEGRATION_LLM_API_KEY - LLM API key
// INTEGRATION_LLM_MODEL - Model name // INTEGRATION_LLM_MODEL - Model name
func TestIntegration_FullReviewFlow(t *testing.T) { func TestIntegration_FullReviewFlow(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL") giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN") giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO") giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER") prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
@@ -104,7 +104,7 @@ func TestIntegration_FullReviewFlow(t *testing.T) {
} }
func TestIntegration_PostAndCleanup(t *testing.T) { func TestIntegration_PostAndCleanup(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL") giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN") giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO") giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER") prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
@@ -130,7 +130,7 @@ func TestIntegration_PostAndCleanup(t *testing.T) {
// Post a test review // Post a test review
sentinel := "<!-- review-bot:integration-test -->" sentinel := "<!-- review-bot:integration-test -->"
testBody := "# Integration Test Review\n\nThis is a test review.\n\n" + sentinel testBody := "# Integration Test Review\n\nThis is a test review.\n\n" + sentinel
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, "", nil) posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, nil)
if err != nil { if err != nil {
t.Fatalf("PostReview: %v", err) t.Fatalf("PostReview: %v", err)
} }
+88 -190
View File
@@ -4,7 +4,6 @@ import (
"context" "context"
"flag" "flag"
"fmt" "fmt"
"io"
"log/slog" "log/slog"
"os" "os"
"path/filepath" "path/filepath"
@@ -14,21 +13,12 @@ import (
"gitea.weiker.me/rodin/review-bot/budget" "gitea.weiker.me/rodin/review-bot/budget"
"gitea.weiker.me/rodin/review-bot/gitea" "gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/llm" "gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review" "gitea.weiker.me/rodin/review-bot/review"
"gitea.weiker.me/rodin/review-bot/vcs"
) )
var version = "dev" var version = "dev"
// outWriter and errWriter are the output and error writers for subcommands.
// They are variables so tests can capture output.
var (
outWriter io.Writer = os.Stdout
errWriter io.Writer = os.Stderr
)
// setupLogger configures the global slog default logger based on format and verbosity. // setupLogger configures the global slog default logger based on format and verbosity.
func setupLogger(format, verbosity string) { func setupLogger(format, verbosity string) {
var level slog.Level var level slog.Level
@@ -59,22 +49,12 @@ func setupLogger(format, verbosity string) {
} }
func main() { func main() {
// Dispatch subcommands before flag parsing so they get their own args.
// e.g. `review-bot validate-url <url>`
if len(os.Args) > 1 {
switch os.Args[1] {
case "validate-url":
os.Exit(runValidateURL(os.Args[2:]))
}
}
versionFlag := flag.Bool("version", false, "Print version and exit") versionFlag := flag.Bool("version", false, "Print version and exit")
// Logging flags // Logging flags
logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json") logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json")
verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error") verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error")
// CLI flags // CLI flags
vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL (e.g. https://gitea.example.com)") giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
repo := flag.String("repo", envOrDefault("GITEA_REPO", ""), "Repository (owner/name)") repo := flag.String("repo", envOrDefault("GITEA_REPO", ""), "Repository (owner/name)")
prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number") prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number")
reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name") reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name")
@@ -85,8 +65,7 @@ func main() {
conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)") conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)")
systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions") systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions")
patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)") patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", ""), "Comma-separated file paths to fetch from patterns repo (empty = all files)") patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", "README.md"), "Comma-separated file paths to fetch from patterns repo")
vcsType := flag.String("vcs-type", envOrDefault("VCS_TYPE", "gitea"), "VCS type: gitea or github")
dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting") dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting")
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)") llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)") llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
@@ -112,24 +91,12 @@ func main() {
slog.Info("review-bot starting", "version", version) slog.Info("review-bot starting", "version", version)
// Backward compatibility: fall back to deprecated env var / flag if VCS_URL / --vcs-url not set.
if *vcsURL == "" {
if v := os.Getenv("GITEA_URL"); v != "" {
slog.Warn("GITEA_URL is deprecated; rename the environment variable to VCS_URL")
*vcsURL = v
}
}
if *vcsURL == "" && *giteaURLAlias != "" {
slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
*vcsURL = *giteaURLAlias
}
// Validate required fields // Validate required fields
// For aicore provider, llm-base-url and llm-api-key are not required // For aicore provider, llm-base-url and llm-api-key are not required
isAICore := llm.Provider(*llmProvider) == llm.ProviderAICore isAICore := llm.Provider(*llmProvider) == llm.ProviderAICore
if *vcsURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" { if *giteaURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n") fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n")
fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n") fmt.Fprintf(os.Stderr, "Required: --gitea-url, --repo, --pr, --reviewer-token, --llm-model\n")
os.Exit(1) os.Exit(1)
} }
if !isAICore && (*llmBaseURL == "" || *llmAPIKey == "") { if !isAICore && (*llmBaseURL == "" || *llmAPIKey == "") {
@@ -148,7 +115,9 @@ func main() {
os.Exit(1) os.Exit(1)
} }
// NOTE: Persona loading deferred until after Gitea client init to support repo personas // Persona loading is deferred until after giteaClient is initialized,
// so we can try loading from the target repo first.
var persona *review.Persona
// Validate reviewer-name: only safe characters allowed in sentinel // Validate reviewer-name: only safe characters allowed in sentinel
if err := validateReviewerName(*reviewerName); err != nil { if err := validateReviewerName(*reviewerName); err != nil {
@@ -171,19 +140,8 @@ func main() {
os.Exit(1) os.Exit(1)
} }
// Initialize VCS client // Initialize clients
var vcsClientImpl vcsClient giteaClient := gitea.NewClient(*giteaURL, *reviewerToken)
switch strings.ToLower(*vcsType) {
case "github":
vcsClientImpl = newGitHubVCSAdapter(github.NewClient(*reviewerToken, *vcsURL))
slog.Info("using GitHub VCS client", "url", *vcsURL)
case "gitea", "":
vcsClientImpl = newGiteaVCSAdapter(gitea.NewClient(*vcsURL, *reviewerToken))
slog.Info("using Gitea VCS client", "url", *vcsURL)
default:
slog.Error("invalid vcs-type", "type", *vcsType, "valid", "gitea, github")
os.Exit(1)
}
llmClient := llm.NewClient(*llmBaseURL, *llmAPIKey, *llmModel) llmClient := llm.NewClient(*llmBaseURL, *llmAPIKey, *llmModel)
if *llmTemp < 0 || *llmTemp > 2 { if *llmTemp < 0 || *llmTemp > 2 {
slog.Error("invalid LLM temperature", "temperature", *llmTemp, "range", "0-2") slog.Error("invalid LLM temperature", "temperature", *llmTemp, "range", "0-2")
@@ -217,22 +175,23 @@ func main() {
ctx, cancel := context.WithTimeout(context.Background(), overallTimeout) ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)
defer cancel() defer cancel()
// Load persona if specified (after Gitea client init to support repo personas) // Load persona: try remote repo first, then fall back to built-in
var persona *review.Persona
if *personaName != "" { if *personaName != "" {
// Try loading from repo first, then fall back to built-in // Try loading from target repo's .review-bot/personas/ directory
repoPersonas, err := review.LoadRepoPersonas(ctx, newReviewClientAdapter(vcsClientImpl), owner, repoName) fetcher := &giteaFetcher{client: giteaClient}
remotePersonas, err := review.LoadRemotePersonas(ctx, fetcher, owner, repoName)
if err != nil { if err != nil {
slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err) slog.Warn("could not load remote personas", "repo", fmt.Sprintf("%s/%s", owner, repoName), "error", err)
// Continue with built-in personas only. // Assign empty map so the lookup below doesn't panic
// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go remotePersonas = map[string]*review.Persona{}
// (returns the zero value), so the fallback to built-in below works correctly.
} }
if p, ok := repoPersonas[*personaName]; ok {
if p, ok := remotePersonas[*personaName]; ok {
persona = p persona = p
slog.Info("loaded repo persona", "persona", persona.Name, "display", persona.DisplayName, "repo", owner+"/"+repoName) slog.Info("loaded persona from target repo", "persona", persona.Name, "display", persona.DisplayName)
} else { } else {
// Fall back to built-in // Fall back to built-in persona
var err error
persona, err = review.LoadBuiltinPersona(*personaName) persona, err = review.LoadBuiltinPersona(*personaName)
if err != nil { if err != nil {
slog.Error("failed to load persona", "persona", *personaName, "error", err) slog.Error("failed to load persona", "persona", *personaName, "error", err)
@@ -246,18 +205,19 @@ func main() {
slog.Error("invalid persona-file path", "error", err) slog.Error("invalid persona-file path", "error", err)
os.Exit(1) os.Exit(1)
} }
persona, err = review.LoadPersona(resolvedPath) loadedPersona, loadErr := review.LoadPersona(resolvedPath)
if err != nil { if loadErr != nil {
slog.Error("failed to load persona file", "file", *personaFile, "error", err) slog.Error("failed to load persona file", "file", *personaFile, "error", loadErr)
os.Exit(1) os.Exit(1)
} }
persona = loadedPersona
slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name) slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
} }
slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName)) slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName))
// Step 1: Fetch PR metadata // Step 1: Fetch PR metadata
pr, err := vcsClientImpl.GetPullRequest(ctx, owner, repoName, prNumber) pr, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil { if err != nil {
slog.Error("failed to fetch PR", "pr", prNumber, "error", err) slog.Error("failed to fetch PR", "pr", prNumber, "error", err)
os.Exit(1) os.Exit(1)
@@ -265,7 +225,7 @@ func main() {
slog.Info("fetched PR metadata", "pr", prNumber, "title", pr.Title) slog.Info("fetched PR metadata", "pr", prNumber, "title", pr.Title)
// Step 2: Fetch diff // Step 2: Fetch diff
diff, err := vcsClientImpl.GetPullRequestDiff(ctx, owner, repoName, prNumber) diff, err := giteaClient.GetPullRequestDiff(ctx, owner, repoName, prNumber)
if err != nil { if err != nil {
slog.Error("failed to fetch diff", "pr", prNumber, "error", err) slog.Error("failed to fetch diff", "pr", prNumber, "error", err)
os.Exit(1) os.Exit(1)
@@ -274,21 +234,21 @@ func main() {
// Step 3: Fetch full file content for modified files // Step 3: Fetch full file content for modified files
fileContext := "" fileContext := ""
files, err := vcsClientImpl.GetPullRequestFiles(ctx, owner, repoName, prNumber) files, err := giteaClient.GetPullRequestFiles(ctx, owner, repoName, prNumber)
if err != nil { if err != nil {
slog.Warn("could not fetch PR files list", "pr", prNumber, "error", err) slog.Warn("could not fetch PR files list", "pr", prNumber, "error", err)
} else { } else {
fileContext = fetchFileContext(ctx, vcsClientImpl, owner, repoName, pr.HeadRef, files) fileContext = fetchFileContext(ctx, giteaClient, owner, repoName, pr.Head.Ref, files)
slog.Debug("fetched file context", "files", len(files)) slog.Debug("fetched file context", "files", len(files))
} }
// Step 4: Check CI status // Step 4: Check CI status
ciPassed := true ciPassed := true
ciDetails := "" ciDetails := ""
if pr.HeadSha != "" { if pr.Head.Sha != "" {
statuses, err := vcsClientImpl.GetCommitStatuses(ctx, owner, repoName, pr.HeadSha) statuses, err := giteaClient.GetCommitStatuses(ctx, owner, repoName, pr.Head.Sha)
if err != nil { if err != nil {
slog.Warn("could not fetch CI status", "sha", pr.HeadSha, "error", err) slog.Warn("could not fetch CI status", "sha", pr.Head.Sha, "error", err)
} else { } else {
ciPassed, ciDetails = evaluateCIStatus(statuses) ciPassed, ciDetails = evaluateCIStatus(statuses)
slog.Info("CI status checked", "passed", ciPassed) slog.Info("CI status checked", "passed", ciPassed)
@@ -298,7 +258,7 @@ func main() {
// Step 5: Load conventions file if specified // Step 5: Load conventions file if specified
conventions := "" conventions := ""
if *conventionsFile != "" { if *conventionsFile != "" {
content, err := vcsClientImpl.GetFileContent(ctx, owner, repoName, *conventionsFile) content, err := giteaClient.GetFileContent(ctx, owner, repoName, *conventionsFile)
if err != nil { if err != nil {
slog.Warn("could not load conventions file", "file", *conventionsFile, "error", err) slog.Warn("could not load conventions file", "file", *conventionsFile, "error", err)
} else { } else {
@@ -310,7 +270,7 @@ func main() {
// Step 6: Load patterns from external repo if specified // Step 6: Load patterns from external repo if specified
patterns := "" patterns := ""
if *patternsRepo != "" { if *patternsRepo != "" {
patterns = fetchPatterns(ctx, vcsClientImpl, *patternsRepo, *patternsFiles) patterns = fetchPatterns(ctx, giteaClient, *patternsRepo, *patternsFiles)
slog.Debug("loaded patterns", "repo", *patternsRepo, "bytes", len(patterns)) slog.Debug("loaded patterns", "repo", *patternsRepo, "bytes", len(patterns))
} }
@@ -403,15 +363,15 @@ func main() {
} }
// Add commit footer so readers know which commit was evaluated // Add commit footer so readers know which commit was evaluated
if pr.HeadSha != "" { if pr.Head.Sha != "" {
shortSHA := pr.HeadSha shortSHA := pr.Head.Sha
if len(shortSHA) > 8 { if len(shortSHA) > 8 {
shortSHA = shortSHA[:8] shortSHA = shortSHA[:8]
} }
reviewBody += fmt.Sprintf("\n\n---\n*Evaluated against %s*", shortSHA) reviewBody += fmt.Sprintf("\n\n---\n*Evaluated against %s*", shortSHA)
} }
event := verdictToVCSEvent(result.Verdict) event := review.GiteaEvent(result.Verdict)
if *dryRun { if *dryRun {
fmt.Println("--- DRY RUN ---") fmt.Println("--- DRY RUN ---")
@@ -423,14 +383,14 @@ func main() {
sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName) sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName)
// Stale check: verify HEAD hasn't moved since we started // Stale check: verify HEAD hasn't moved since we started
evaluatedSHA := pr.HeadSha evaluatedSHA := pr.Head.Sha
var currentSHA string var currentSHA string
currentPR, err := vcsClientImpl.GetPullRequest(ctx, owner, repoName, prNumber) currentPR, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil { if err != nil {
slog.Warn("could not re-fetch PR for stale check", "pr", prNumber, "error", err) slog.Warn("could not re-fetch PR for stale check", "pr", prNumber, "error", err)
// currentSHA stays empty — shouldSkipStaleReview will return false // currentSHA stays empty — shouldSkipStaleReview will return false
} else { } else {
currentSHA = currentPR.HeadSha currentSHA = currentPR.Head.Sha
} }
if shouldSkipStaleReview(evaluatedSHA, currentSHA) { if shouldSkipStaleReview(evaluatedSHA, currentSHA) {
slog.Warn("HEAD moved during review — skipping stale review", slog.Warn("HEAD moved during review — skipping stale review",
@@ -441,30 +401,28 @@ func main() {
} }
// Map findings to inline comments for lines present in the diff // Map findings to inline comments for lines present in the diff
var inlineComments []vcs.ReviewComment diffRanges := gitea.ParseDiffNewLines(diff)
if ext, ok := vcsClientImpl.(giteaExtendedClient); ok { var inlineComments []gitea.ReviewComment
diffRanges := ext.ParseDiffNewLines(diff) for _, f := range result.Findings {
for _, f := range result.Findings { if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) { inlineComments = append(inlineComments, gitea.ReviewComment{
inlineComments = append(inlineComments, vcs.ReviewComment{ Path: f.File,
Path: f.File, NewPosition: int64(f.Line),
Position: f.Line, Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding), })
})
}
}
if len(inlineComments) > 0 {
slog.Debug("attaching inline comments", "count", len(inlineComments))
} }
} }
if len(inlineComments) > 0 {
slog.Debug("attaching inline comments", "count", len(inlineComments))
}
// --- Review update strategy --- // --- Review update strategy ---
// 1. POST new review first (gets non-stale approval badge on HEAD) // 1. POST new review first (gets non-stale approval badge on HEAD)
// 2. Then supersede old review with link to the new one // 2. Then supersede old review with link to the new one
// Order matters: post first so we have the new review's URL for the supersede message. // Order matters: post first so we have the new review's URL for the supersede message.
var oldReviews []reviewInfo var oldReviews []gitea.Review
if *reviewerName != "" { if *reviewerName != "" {
existingReviews, err := vcsClientImpl.ListReviews(ctx, owner, repoName, prNumber) existingReviews, err := giteaClient.ListReviews(ctx, owner, repoName, prNumber)
if err != nil { if err != nil {
slog.Warn("could not list existing reviews", "pr", prNumber, "error", err) slog.Warn("could not list existing reviews", "pr", prNumber, "error", err)
} else { } else {
@@ -477,27 +435,20 @@ func main() {
} }
// Self-request as reviewer (ensures we appear in required-reviewer checks) // Self-request as reviewer (ensures we appear in required-reviewer checks)
authUser, err := vcsClientImpl.GetAuthenticatedUser(ctx) authUser, err := giteaClient.GetAuthenticatedUser(ctx)
if err != nil { if err != nil {
slog.Warn("could not determine authenticated user for reviewer self-request", "error", err) slog.Warn("could not determine authenticated user for reviewer self-request", "error", err)
} else if authUser != "" { } else if authUser != "" {
if ext, ok := vcsClientImpl.(giteaExtendedClient); ok { if err := giteaClient.RequestReviewer(ctx, owner, repoName, prNumber, authUser); err != nil {
if err := ext.RequestReviewer(ctx, owner, repoName, prNumber, authUser); err != nil { slog.Warn("could not self-request as reviewer", "user", authUser, "error", err)
slog.Warn("could not self-request as reviewer", "user", authUser, "error", err) } else {
} else { slog.Debug("self-requested as reviewer", "user", authUser, "pr", prNumber)
slog.Debug("self-requested as reviewer", "user", authUser, "pr", prNumber)
}
} }
} }
// POST new review // POST new review
slog.Info("posting review", "event", event, "pr", prNumber) slog.Info("posting review", "event", event, "pr", prNumber)
posted, err := vcsClientImpl.PostReview(ctx, owner, repoName, prNumber, vcs.ReviewRequest{ posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, inlineComments)
Body: reviewBody,
Event: event,
CommitID: evaluatedSHA,
Comments: inlineComments,
})
if err != nil { if err != nil {
slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err) slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err)
os.Exit(1) os.Exit(1)
@@ -506,27 +457,22 @@ func main() {
// Supersede all old reviews with link to the new one // Supersede all old reviews with link to the new one
if len(oldReviews) > 0 { if len(oldReviews) > 0 {
newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*vcsURL, "/"), owner, repoName, prNumber, posted.ID) newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*giteaURL, "/"), owner, repoName, prNumber, posted.ID)
ext, hasExt := vcsClientImpl.(giteaExtendedClient)
for _, oldReview := range oldReviews { for _, oldReview := range oldReviews {
if !hasExt { cid, err := giteaClient.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
slog.Debug("VCS client does not support review supersede; skipping", "review_id", oldReview.ID)
continue
}
cid, err := ext.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil { if err != nil {
slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err) slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err)
continue continue
} }
supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel) supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel)
if err := ext.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil { if err := giteaClient.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err) slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err)
continue continue
} }
slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", posted.ID, "pr", prNumber) slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", posted.ID, "pr", prNumber)
// Resolve old review's inline comments // Resolve old review's inline comments
oldComments, err := ext.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID) oldComments, err := giteaClient.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil { if err != nil {
slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err) slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err)
continue continue
@@ -536,7 +482,7 @@ func main() {
if c.ID == 0 { if c.ID == 0 {
continue continue
} }
if err := ext.ResolveComment(ctx, owner, repoName, c.ID); err != nil { if err := giteaClient.ResolveComment(ctx, owner, repoName, c.ID); err != nil {
slog.Debug("could not resolve inline comment", "comment_id", c.ID, "error", err) slog.Debug("could not resolve inline comment", "comment_id", c.ID, "error", err)
failed++ failed++
} else { } else {
@@ -555,7 +501,7 @@ func main() {
} }
// fetchFileContext fetches the full content of modified files from the PR branch. // fetchFileContext fetches the full content of modified files from the PR branch.
func fetchFileContext(ctx context.Context, client vcsClient, owner, repo, ref string, files []changedFileInfo) string { func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, ref string, files []gitea.ChangedFile) string {
var sb strings.Builder var sb strings.Builder
for _, f := range files { for _, f := range files {
if ctx.Err() != nil { if ctx.Err() != nil {
@@ -581,25 +527,11 @@ func fetchFileContext(ctx context.Context, client vcsClient, owner, repo, ref st
// patternsRepo is comma-separated list of owner/name repos. // patternsRepo is comma-separated list of owner/name repos.
// patternsFiles is comma-separated list of file paths or directories. // patternsFiles is comma-separated list of file paths or directories.
// If a path ends with / or is a directory, all files within it are fetched recursively. // If a path ends with / or is a directory, all files within it are fetched recursively.
// If patternsFiles is empty, all files from the repo root are fetched. func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patternsFiles string) string {
func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patternsFiles string) string {
var sb strings.Builder var sb strings.Builder
repos := strings.Split(patternsRepo, ",") repos := strings.Split(patternsRepo, ",")
paths := strings.Split(patternsFiles, ",")
// Build the list of paths to fetch
var paths []string
if patternsFiles == "" {
// Empty patternsFiles means "fetch all files from repo root"
paths = []string{""}
} else {
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
}
for _, repoRef := range repos { for _, repoRef := range repos {
if ctx.Err() != nil { if ctx.Err() != nil {
@@ -616,16 +548,13 @@ func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patterns
} }
owner, repo := parts[0], parts[1] owner, repo := parts[0], parts[1]
var repoLoadedFiles []string
var repoSkippedFiles []string
giteaRaw, isGitea := client.(*giteaClientVCSAdapter)
if !isGitea {
slog.Warn("patterns fetching is only supported with the Gitea VCS client; skipping", "repo", repoRef)
continue
}
for _, path := range paths { for _, path := range paths {
files, err := giteaRaw.client.GetAllFilesInPath(ctx, owner, repo, path) path = strings.TrimSpace(path)
if path == "" {
continue
}
files, err := client.GetAllFilesInPath(ctx, owner, repo, path)
if err != nil { if err != nil {
slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err) slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err)
continue continue
@@ -634,22 +563,11 @@ func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patterns
for filePath, content := range files { for filePath, content := range files {
// Only include markdown and text files as patterns // Only include markdown and text files as patterns
if !isPatternFile(filePath) { if !isPatternFile(filePath) {
repoSkippedFiles = append(repoSkippedFiles, filePath)
continue continue
} }
repoLoadedFiles = append(repoLoadedFiles, filePath)
sb.WriteString(fmt.Sprintf("### %s/%s\n\n%s\n\n", repoRef, filePath, content)) sb.WriteString(fmt.Sprintf("### %s/%s\n\n%s\n\n", repoRef, filePath, content))
} }
} }
if len(repoLoadedFiles) > 0 {
slog.Info("loaded pattern files", "repo", repoRef, "count", len(repoLoadedFiles), "files", repoLoadedFiles)
} else {
slog.Warn("no pattern files loaded", "repo", repoRef, "paths", paths)
}
if len(repoSkippedFiles) > 0 {
slog.Debug("skipped non-pattern files", "repo", repoRef, "count", len(repoSkippedFiles), "files", repoSkippedFiles)
}
} }
return sb.String() return sb.String()
} }
@@ -664,7 +582,7 @@ func isPatternFile(path string) bool {
} }
// evaluateCIStatus checks if all CI statuses indicate success. // evaluateCIStatus checks if all CI statuses indicate success.
func evaluateCIStatus(statuses []commitStatusInfo) (passed bool, details string) { func evaluateCIStatus(statuses []gitea.CommitStatus) (passed bool, details string) {
if len(statuses) == 0 { if len(statuses) == 0 {
return true, "no CI statuses found" return true, "no CI statuses found"
} }
@@ -802,7 +720,7 @@ func buildSupersededBody(originalBody, commitSHA, newReviewURL, sentinel string)
// Gitea user. This indicates misconfiguration where two roles share a token // Gitea user. This indicates misconfiguration where two roles share a token
// instead of having separate Gitea accounts. Returns true if shared token // instead of having separate Gitea accounts. Returns true if shared token
// detected (caller should skip update-in-place logic to avoid clobbering). // detected (caller should skip update-in-place logic to avoid clobbering).
func hasSharedToken(reviews []reviewInfo, ownSentinel string) bool { func hasSharedToken(reviews []gitea.Review, ownSentinel string) bool {
ownLogin := "" ownLogin := ""
for _, r := range reviews { for _, r := range reviews {
if strings.Contains(r.Body, ownSentinel) { if strings.Contains(r.Body, ownSentinel) {
@@ -840,8 +758,8 @@ func extractSentinelName(body string) string {
} }
// findOwnReview locates the most recent non-superseded review matching the sentinel. // findOwnReview locates the most recent non-superseded review matching the sentinel.
func findOwnReview(reviews []reviewInfo, sentinel string) *reviewInfo { func findOwnReview(reviews []gitea.Review, sentinel string) *gitea.Review {
var best *reviewInfo var best *gitea.Review
for i := range reviews { for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) { if !strings.Contains(reviews[i].Body, sentinel) {
continue continue
@@ -857,8 +775,8 @@ func findOwnReview(reviews []reviewInfo, sentinel string) *reviewInfo {
} }
// findAllOwnReviews returns all non-superseded reviews matching the sentinel. // findAllOwnReviews returns all non-superseded reviews matching the sentinel.
func findAllOwnReviews(reviews []reviewInfo, sentinel string) []reviewInfo { func findAllOwnReviews(reviews []gitea.Review, sentinel string) []gitea.Review {
var result []reviewInfo var result []gitea.Review
for i := range reviews { for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) { if !strings.Contains(reviews[i].Body, sentinel) {
continue continue
@@ -871,22 +789,6 @@ func findAllOwnReviews(reviews []reviewInfo, sentinel string) []reviewInfo {
return result return result
} }
// verdictToVCSEvent converts a review verdict string to a vcs.ReviewEvent.
// The verdict comes from the LLM result and uses values: "APPROVE", "REQUEST_CHANGES",
// or any other string (treated as COMMENT).
// vcs.ReviewEvent constants follow GitHub API format ("APPROVE", "REQUEST_CHANGES", "COMMENT").
// The Gitea adapter translates these back to Gitea format ("APPROVED", etc.) before posting.
func verdictToVCSEvent(verdict string) vcs.ReviewEvent {
switch verdict {
case "APPROVE":
return vcs.ReviewEventApprove
case "REQUEST_CHANGES":
return vcs.ReviewEventRequestChanges
default:
return vcs.ReviewEventComment
}
}
// shouldSkipStaleReview reports whether to skip posting because HEAD moved. // shouldSkipStaleReview reports whether to skip posting because HEAD moved.
// Returns true (skip) if evaluatedSHA differs from currentSHA. // Returns true (skip) if evaluatedSHA differs from currentSHA.
// Returns false (don't skip) if: // Returns false (don't skip) if:
@@ -900,21 +802,17 @@ func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
return evaluatedSHA != currentSHA return evaluatedSHA != currentSHA
} }
// reviewClientAdapter adapts a vcsClient to review.GiteaClient for persona loading. // giteaFetcher adapts gitea.Client to review.PersonaFetcher interface.
// The review package only needs ListContents and GetFileContent, which all vcsClients provide. type giteaFetcher struct {
type reviewClientAdapter struct { client *gitea.Client
client vcsClient
} }
func newReviewClientAdapter(c vcsClient) *reviewClientAdapter { func (f *giteaFetcher) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
return &reviewClientAdapter{client: c} entries, err := f.client.ListContents(ctx, owner, repo, path)
}
func (a *reviewClientAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil { if err != nil {
return nil, err return nil, err
} }
// Convert gitea.ContentEntry to review.ContentEntry
result := make([]review.ContentEntry, len(entries)) result := make([]review.ContentEntry, len(entries))
for i, e := range entries { for i, e := range entries {
result[i] = review.ContentEntry{ result[i] = review.ContentEntry{
@@ -926,6 +824,6 @@ func (a *reviewClientAdapter) ListContents(ctx context.Context, owner, repo, pat
return result, nil return result, nil
} }
func (a *reviewClientAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) { func (f *giteaFetcher) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.client.GetFileContent(ctx, owner, repo, filepath) return f.client.GetFileContent(ctx, owner, repo, filepath)
} }
+32 -78
View File
@@ -10,7 +10,7 @@ import (
"strings" "strings"
"testing" "testing"
"gitea.weiker.me/rodin/review-bot/vcs" "gitea.weiker.me/rodin/review-bot/gitea"
) )
func TestValidateReviewerName(t *testing.T) { func TestValidateReviewerName(t *testing.T) {
@@ -154,14 +154,14 @@ func TestValidateWorkspacePath(t *testing.T) {
} }
} }
func makeReview(id int64, login, state string, stale bool, body string) reviewInfo { func makeReview(id int64, login, state string, stale bool, body string) gitea.Review {
r := reviewInfo{ r := gitea.Review{
ID: id, ID: id,
Body: body, Body: body,
State: state, State: state,
Stale: stale, Stale: stale,
} }
r.User = vcs.UserInfo{Login: login} r.User.Login = login
return r return r
} }
@@ -216,7 +216,7 @@ func TestBuildSupersededBodyShortSHA(t *testing.T) {
func TestFindOwnReview(t *testing.T) { func TestFindOwnReview(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
reviews []reviewInfo reviews []gitea.Review
sentinel string sentinel string
wantID int64 wantID int64
wantNil bool wantNil bool
@@ -229,7 +229,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "found by sentinel", name: "found by sentinel",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "review body\n<!-- review-bot:sonnet -->"), makeReview(42, "bot", "APPROVED", false, "review body\n<!-- review-bot:sonnet -->"),
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
@@ -237,7 +237,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "wrong sentinel", name: "wrong sentinel",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "body\n<!-- review-bot:gpt -->"), makeReview(42, "bot", "APPROVED", false, "body\n<!-- review-bot:gpt -->"),
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
@@ -245,7 +245,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "multiple reviews, returns first match", name: "multiple reviews, returns first match",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "old\n<!-- review-bot:gpt -->"), makeReview(10, "bot", "APPROVED", false, "old\n<!-- review-bot:gpt -->"),
makeReview(20, "bot", "APPROVED", false, "new\n<!-- review-bot:sonnet -->"), makeReview(20, "bot", "APPROVED", false, "new\n<!-- review-bot:sonnet -->"),
}, },
@@ -254,7 +254,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "skips superseded review", name: "skips superseded review",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n**Superseded**\n<!-- review-bot:sonnet -->"), makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n**Superseded**\n<!-- review-bot:sonnet -->"),
makeReview(20, "bot", "APPROVED", false, "fresh review\n<!-- review-bot:sonnet -->"), makeReview(20, "bot", "APPROVED", false, "fresh review\n<!-- review-bot:sonnet -->"),
}, },
@@ -263,7 +263,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "only superseded reviews exist", name: "only superseded reviews exist",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n<!-- review-bot:sonnet -->"), makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n<!-- review-bot:sonnet -->"),
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
@@ -271,7 +271,7 @@ func TestFindOwnReview(t *testing.T) {
}, },
{ {
name: "picks highest ID among matches", name: "picks highest ID among matches",
reviews: []reviewInfo{ reviews: []gitea.Review{
makeReview(50, "bot", "APPROVED", false, "v1\n<!-- review-bot:sonnet -->"), makeReview(50, "bot", "APPROVED", false, "v1\n<!-- review-bot:sonnet -->"),
makeReview(30, "bot", "APPROVED", false, "v0\n<!-- review-bot:sonnet -->"), makeReview(30, "bot", "APPROVED", false, "v0\n<!-- review-bot:sonnet -->"),
}, },
@@ -302,7 +302,7 @@ func TestFindOwnReview(t *testing.T) {
func TestHasSharedToken(t *testing.T) { func TestHasSharedToken(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
reviews []reviewInfo reviews []gitea.Review
sentinel string sentinel string
want bool want bool
}{ }{
@@ -314,36 +314,36 @@ func TestHasSharedToken(t *testing.T) {
}, },
{ {
name: "no own review yet - cannot detect", name: "no own review yet - cannot detect",
reviews: []reviewInfo{ reviews: []gitea.Review{
{ID: 1, User: vcs.UserInfo{Login: "other"}, Body: "<!-- review-bot:gpt --> body"}, {ID: 1, User: struct{ Login string `json:"login"` }{Login: "other"}, Body: "<!-- review-bot:gpt --> body"},
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
want: false, want: false,
}, },
{ {
name: "separate users - no shared token", name: "separate users - no shared token",
reviews: []reviewInfo{ reviews: []gitea.Review{
{ID: 1, User: vcs.UserInfo{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"}, {ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: vcs.UserInfo{Login: "security-review-bot"}, Body: "<!-- review-bot:security --> body"}, {ID: 2, User: struct{ Login string `json:"login"` }{Login: "security-review-bot"}, Body: "<!-- review-bot:security --> body"},
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
want: false, want: false,
}, },
{ {
name: "shared token detected - same user different sentinels", name: "shared token detected - same user different sentinels",
reviews: []reviewInfo{ reviews: []gitea.Review{
{ID: 1, User: vcs.UserInfo{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"}, {ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: vcs.UserInfo{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:security --> body"}, {ID: 2, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:security --> body"},
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
want: true, want: true,
}, },
{ {
name: "three roles same user", name: "three roles same user",
reviews: []reviewInfo{ reviews: []gitea.Review{
{ID: 1, User: vcs.UserInfo{Login: "bot"}, Body: "<!-- review-bot:sonnet --> body"}, {ID: 1, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: vcs.UserInfo{Login: "bot"}, Body: "<!-- review-bot:security --> body"}, {ID: 2, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:security --> body"},
{ID: 3, User: vcs.UserInfo{Login: "bot"}, Body: "<!-- review-bot:gpt --> body"}, {ID: 3, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:gpt --> body"},
}, },
sentinel: "<!-- review-bot:sonnet -->", sentinel: "<!-- review-bot:sonnet -->",
want: true, want: true,
@@ -504,56 +504,10 @@ func TestIsPatternFile(t *testing.T) {
} }
} }
// TestBuildPatternPaths verifies the path-building logic for fetchPatterns.
// Empty patternsFiles means "fetch all from root" (represented as [""]).
func TestBuildPatternPaths(t *testing.T) {
buildPaths := func(patternsFiles string) []string {
if patternsFiles == "" {
return []string{""}
}
var paths []string
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
return paths
}
tests := []struct {
name string
input string
want []string
}{
{"empty fetches root", "", []string{""}},
{"single file", "README.md", []string{"README.md"}},
{"multiple files", "README.md,PATTERNS.md", []string{"README.md", "PATTERNS.md"}},
{"trims whitespace", " foo.md , bar.md ", []string{"foo.md", "bar.md"}},
{"skips empty between commas", "foo.md,,bar.md", []string{"foo.md", "bar.md"}},
{"directory path", "patterns/", []string{"patterns/"}},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := buildPaths(tc.input)
if len(got) != len(tc.want) {
t.Errorf("buildPaths(%q) = %v, want %v", tc.input, got, tc.want)
return
}
for i := range got {
if got[i] != tc.want[i] {
t.Errorf("buildPaths(%q)[%d] = %q, want %q", tc.input, i, got[i], tc.want[i])
}
}
})
}
}
func TestEvaluateCIStatus(t *testing.T) { func TestEvaluateCIStatus(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
statuses []commitStatusInfo statuses []gitea.CommitStatus
wantPassed bool wantPassed bool
wantSubstr string wantSubstr string
}{ }{
@@ -565,7 +519,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "all success", name: "all success",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"}, {Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "success", Context: "ci/test", Description: "Tests passed"}, {Status: "success", Context: "ci/test", Description: "Tests passed"},
}, },
@@ -574,7 +528,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "one failure", name: "one failure",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"}, {Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"}, {Status: "failure", Context: "ci/test", Description: "Tests failed"},
}, },
@@ -583,7 +537,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "error status", name: "error status",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "error", Context: "ci/lint", Description: "Lint error"}, {Status: "error", Context: "ci/lint", Description: "Lint error"},
}, },
wantPassed: false, wantPassed: false,
@@ -591,7 +545,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "pending treated as not-failed", name: "pending treated as not-failed",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "pending", Context: "ci/build", Description: "In progress"}, {Status: "pending", Context: "ci/build", Description: "In progress"},
{Status: "success", Context: "ci/test", Description: "Tests passed"}, {Status: "success", Context: "ci/test", Description: "Tests passed"},
}, },
@@ -600,7 +554,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "multiple failures", name: "multiple failures",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "failure", Context: "ci/build", Description: "Build failed"}, {Status: "failure", Context: "ci/build", Description: "Build failed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"}, {Status: "failure", Context: "ci/test", Description: "Tests failed"},
}, },
@@ -609,7 +563,7 @@ func TestEvaluateCIStatus(t *testing.T) {
}, },
{ {
name: "mixed with pending and failure", name: "mixed with pending and failure",
statuses: []commitStatusInfo{ statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"}, {Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "pending", Context: "ci/deploy", Description: "Deploying"}, {Status: "pending", Context: "ci/deploy", Description: "Deploying"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"}, {Status: "failure", Context: "ci/test", Description: "Tests failed"},
@@ -997,7 +951,7 @@ func cleanEnv() []string {
} }
func TestFindAllOwnReviews(t *testing.T) { func TestFindAllOwnReviews(t *testing.T) {
reviews := []reviewInfo{ reviews := []gitea.Review{
{ID: 1, Body: "<!-- review-bot:sonnet -->\nfirst review"}, {ID: 1, Body: "<!-- review-bot:sonnet -->\nfirst review"},
{ID: 2, Body: "<!-- review-bot:gpt -->\nother bot"}, {ID: 2, Body: "<!-- review-bot:gpt -->\nother bot"},
{ID: 3, Body: "<!-- review-bot:sonnet -->\nsecond review"}, {ID: 3, Body: "<!-- review-bot:sonnet -->\nsecond review"},
-125
View File
@@ -1,125 +0,0 @@
package main
import (
"context"
"errors"
"fmt"
"net"
"net/url"
"strings"
"time"
"gitea.weiker.me/rodin/review-bot/gitea"
)
// runValidateURL implements the `review-bot validate-url <url>` subcommand.
//
// It resolves the given URL's hostname and checks that every returned IP is
// publicly routable (not RFC1918, loopback, link-local, or other reserved
// ranges). The exit code communicates the result to callers:
//
// 0 — URL is safe to use
// 1 — URL resolves to a blocked/private address
// 2 — URL is malformed, has an unsafe scheme, or DNS lookup failed
//
// This is intended for use from action.yml shell steps that need to validate
// a user-supplied URL before passing it to curl.
func runValidateURL(args []string) int {
if len(args) != 1 {
fmt.Fprintln(errWriter, "usage: review-bot validate-url <url>")
fmt.Fprintln(errWriter, "")
fmt.Fprintln(errWriter, "Resolves <url> and verifies all resolved IPs are publicly routable.")
fmt.Fprintln(errWriter, "Exit 0=safe, 1=blocked, 2=error")
return 2
}
rawURL := args[0]
if err := validateURL(rawURL); err != nil {
fmt.Fprintf(errWriter, "Error: %v\n", err)
var ve *validateError
if isValidateError(err, &ve) {
return ve.code
}
return 2
}
fmt.Fprintf(outWriter, "OK: %s is safe\n", rawURL)
return 0
}
// validateError carries an exit code alongside a message.
type validateError struct {
code int
message string
}
func (e *validateError) Error() string { return e.message }
// isValidateError checks if err is or wraps a *validateError and sets out.
// Uses errors.As so that wrapped *validateError values (e.g. from fmt.Errorf("...: %w", &validateError{...}))
// are also detected, making the function robust against future wrapping.
func isValidateError(err error, out **validateError) bool {
if err == nil {
return false
}
return errors.As(err, out)
}
// validateURL checks that rawURL is safe for use as a Gitea server URL:
// - Must be https:// (not http://)
// - Must have no user-info (user:pass@host)
// - Must resolve to at least one IP, all of which are publicly routable
func validateURL(rawURL string) error {
parsed, err := url.Parse(rawURL)
if err != nil {
return &validateError{code: 2, message: fmt.Sprintf("malformed URL %q: %v", rawURL, err)}
}
// Scheme check: only https is permitted.
if !strings.EqualFold(parsed.Scheme, "https") {
return &validateError{
code: 2,
message: fmt.Sprintf("URL scheme must be https (got %q)", parsed.Scheme),
}
}
// Reject user-info (user:password@host) to prevent credential embedding.
if parsed.User != nil {
return &validateError{
code: 2,
message: "URL must not contain user-info (user:password@host)",
}
}
host := parsed.Hostname()
if host == "" {
return &validateError{code: 2, message: fmt.Sprintf("URL has no host: %q", rawURL)}
}
// Resolve the hostname with a short timeout.
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return &validateError{
code: 2,
message: fmt.Sprintf("DNS lookup failed for %q: %v", host, err),
}
}
if len(addrs) == 0 {
return &validateError{
code: 2,
message: fmt.Sprintf("DNS lookup returned no addresses for %q", host),
}
}
for _, a := range addrs {
if gitea.IsBlockedIP(a.IP) {
return &validateError{
code: 1,
message: fmt.Sprintf("blocked: %q resolves to private/reserved IP %s", host, a.IP),
}
}
}
return nil
}
-127
View File
@@ -1,127 +0,0 @@
package main
import (
"bytes"
"strings"
"testing"
)
func TestRunValidateURL_Usage(t *testing.T) {
var errBuf bytes.Buffer
origErr := errWriter
errWriter = &errBuf
defer func() { errWriter = origErr }()
code := runValidateURL(nil)
if code != 2 {
t.Errorf("expected exit code 2 for no args, got %d", code)
}
if !strings.Contains(errBuf.String(), "usage") {
t.Errorf("expected usage in stderr, got %q", errBuf.String())
}
errBuf.Reset()
code = runValidateURL([]string{"arg1", "arg2"})
if code != 2 {
t.Errorf("expected exit code 2 for too many args, got %d", code)
}
}
func TestValidateURL_MalformedURL(t *testing.T) {
cases := []struct {
name string
url string
wantMsg string
}{
{"empty", "", "must be https"},
{"http scheme", "http://example.com/", "must be https"},
{"ftp scheme", "ftp://example.com/", "must be https"},
{"no scheme", "example.com", "must be https"},
{"user info", "https://user:pass@example.com/", "user-info"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
err := validateURL(tc.url)
if err == nil {
t.Errorf("expected error for URL %q, got nil", tc.url)
return
}
if !strings.Contains(err.Error(), tc.wantMsg) {
t.Errorf("error %q does not contain %q", err.Error(), tc.wantMsg)
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T", err)
}
if ve.code != 2 {
t.Errorf("expected code 2, got %d", ve.code)
}
})
}
}
func TestValidateURL_BlockedPrivateIP(t *testing.T) {
// localhost always resolves to 127.0.0.1 (loopback).
err := validateURL("https://localhost/")
if err == nil {
t.Skip("localhost did not resolve (network unavailable in test environment)")
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T: %v", err, err)
}
if ve.code != 1 && ve.code != 2 {
t.Errorf("expected code 1 (blocked) or 2 (dns fail), got %d: %s", ve.code, ve.message)
}
// If it resolved (code 1), the message must say "blocked".
if ve.code == 1 && !strings.Contains(ve.message, "blocked") {
t.Errorf("expected 'blocked' in message, got %q", ve.message)
}
}
func TestValidateURL_ExitCodes(t *testing.T) {
cases := []struct {
name string
url string
wantCode int
}{
{"http scheme", "http://example.com/", 2},
{"no scheme", "example.com", 2},
{"user info", "https://admin:secret@example.com/", 2},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
err := validateURL(tc.url)
if err == nil {
t.Fatalf("expected error for %q", tc.url)
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T", err)
}
if ve.code != tc.wantCode {
t.Errorf("code = %d, want %d (url=%q, msg=%s)", ve.code, tc.wantCode, tc.url, ve.message)
}
})
}
}
func TestRunValidateURL_WithCapture(t *testing.T) {
var outBuf, errBuf bytes.Buffer
origOut, origErr := outWriter, errWriter
outWriter = &outBuf
errWriter = &errBuf
defer func() {
outWriter = origOut
errWriter = origErr
}()
// http:// scheme should fail with code 2.
code := runValidateURL([]string{"http://example.com/"})
if code != 2 {
t.Errorf("expected code 2 for http:// URL, got %d", code)
}
if !strings.Contains(errBuf.String(), "must be https") {
t.Errorf("expected error about https in stderr, got %q", errBuf.String())
}
}
-405
View File
@@ -1,405 +0,0 @@
package main
import (
"context"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// vcsClient is the unified interface for VCS operations used by the review flow.
// Both gitea.Client and github.Client satisfy this interface via their respective adapters.
type vcsClient interface {
// GetPullRequest fetches PR metadata.
GetPullRequest(ctx context.Context, owner, repo string, number int) (*pullRequestInfo, error)
// GetPullRequestDiff fetches the unified diff.
GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error)
// GetPullRequestFiles fetches the list of changed files.
GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]changedFileInfo, error)
// GetCommitStatuses fetches CI statuses for a commit.
GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]commitStatusInfo, error)
// GetFileContent fetches the content of a file at HEAD.
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
// GetFileContentRef fetches the content of a file at a given ref.
GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error)
// ListContents lists the files and directories at a path.
ListContents(ctx context.Context, owner, repo, path string) ([]contentEntryInfo, error)
// ListReviews returns all reviews for a PR.
ListReviews(ctx context.Context, owner, repo string, number int) ([]reviewInfo, error)
// PostReview submits a review.
PostReview(ctx context.Context, owner, repo string, number int, req vcs.ReviewRequest) (*reviewInfo, error)
// GetAuthenticatedUser returns the login of the authenticated user.
GetAuthenticatedUser(ctx context.Context) (string, error)
}
// giteaExtendedClient is implemented by the Gitea adapter and exposes
// Gitea-specific operations that have no GitHub equivalent in the current scope.
// Callers should type-assert to this interface and skip gracefully when it is absent.
type giteaExtendedClient interface {
// RequestReviewer adds the authenticated user as a reviewer on a PR.
RequestReviewer(ctx context.Context, owner, repo string, number int, user string) error
// GetTimelineReviewCommentIDForReview maps a review ID to its timeline comment ID.
GetTimelineReviewCommentIDForReview(ctx context.Context, owner, repo string, number int, reviewID int64) (int64, error)
// EditComment updates the body of an existing PR comment.
EditComment(ctx context.Context, owner, repo string, commentID int64, body string) error
// ListReviewComments lists the inline comments attached to a review.
ListReviewComments(ctx context.Context, owner, repo string, number int, reviewID int64) ([]inlineCommentInfo, error)
// ResolveComment marks an inline comment as resolved.
ResolveComment(ctx context.Context, owner, repo string, commentID int64) error
// ParseDiffNewLines returns the diff line ranges for inline comment positioning.
ParseDiffNewLines(diff string) diffLineRanges
}
// Shared adapter types to avoid duplicating gitea/github-specific types throughout main.go.
type pullRequestInfo struct {
Title string
Body string
HeadSha string
HeadRef string
}
type changedFileInfo struct {
Filename string
Status string
}
type commitStatusInfo struct {
Status string
Context string
Description string
TargetURL string
}
type contentEntryInfo struct {
Name string
Path string
Type string
}
type reviewInfo struct {
ID int64
Body string
User vcs.UserInfo
State string
CommitID string
Stale bool
}
type inlineCommentInfo struct {
ID int64
Path string
NewPosition int64
Body string
}
// diffLineRanges is a type alias for gitea.DiffLineRanges to allow the
// extended client interface to be defined without importing gitea directly.
// In practice, only the giteaClientVCSAdapter returns this type, and callers
// that use it will already have performed the type assertion.
type diffLineRanges = *gitea.DiffLineRanges
// --- Gitea adapter ---
// giteaClientVCSAdapter wraps gitea.Client to satisfy the vcsClient interface.
// It also implements giteaExtendedClient for Gitea-specific operations.
type giteaClientVCSAdapter struct {
client *gitea.Client
}
func newGiteaVCSAdapter(c *gitea.Client) *giteaClientVCSAdapter {
return &giteaClientVCSAdapter{client: c}
}
func (a *giteaClientVCSAdapter) GetPullRequest(ctx context.Context, owner, repo string, number int) (*pullRequestInfo, error) {
pr, err := a.client.GetPullRequest(ctx, owner, repo, number)
if err != nil {
return nil, err
}
return &pullRequestInfo{
Title: pr.Title,
Body: pr.Body,
HeadSha: pr.Head.Sha,
HeadRef: pr.Head.Ref,
}, nil
}
func (a *giteaClientVCSAdapter) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
return a.client.GetPullRequestDiff(ctx, owner, repo, number)
}
func (a *giteaClientVCSAdapter) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]changedFileInfo, error) {
files, err := a.client.GetPullRequestFiles(ctx, owner, repo, number)
if err != nil {
return nil, err
}
result := make([]changedFileInfo, len(files))
for i, f := range files {
result[i] = changedFileInfo{Filename: f.Filename, Status: f.Status}
}
return result, nil
}
func (a *giteaClientVCSAdapter) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]commitStatusInfo, error) {
statuses, err := a.client.GetCommitStatuses(ctx, owner, repo, sha)
if err != nil {
return nil, err
}
result := make([]commitStatusInfo, len(statuses))
for i, s := range statuses {
result[i] = commitStatusInfo{
Status: s.Status,
Context: s.Context,
Description: s.Description,
TargetURL: s.TargetURL,
}
}
return result, nil
}
func (a *giteaClientVCSAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.client.GetFileContent(ctx, owner, repo, filepath)
}
func (a *giteaClientVCSAdapter) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
return a.client.GetFileContentRef(ctx, owner, repo, filepath, ref)
}
func (a *giteaClientVCSAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]contentEntryInfo, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
result := make([]contentEntryInfo, len(entries))
for i, e := range entries {
result[i] = contentEntryInfo{Name: e.Name, Path: e.Path, Type: e.Type}
}
return result, nil
}
func (a *giteaClientVCSAdapter) ListReviews(ctx context.Context, owner, repo string, number int) ([]reviewInfo, error) {
reviews, err := a.client.ListReviews(ctx, owner, repo, number)
if err != nil {
return nil, err
}
result := make([]reviewInfo, len(reviews))
for i, r := range reviews {
result[i] = reviewInfo{
ID: r.ID,
Body: r.Body,
User: vcs.UserInfo{Login: r.User.Login},
State: r.State,
CommitID: r.CommitID,
Stale: r.Stale,
}
}
return result, nil
}
func (a *giteaClientVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, req vcs.ReviewRequest) (*reviewInfo, error) {
// Translate vcs.ReviewComment to gitea.ReviewComment.
// The Gitea API uses NewPosition instead of Position.
var comments []gitea.ReviewComment
for _, c := range req.Comments {
comments = append(comments, gitea.ReviewComment{
Path: c.Path,
NewPosition: int64(c.Position),
Body: c.Body,
})
}
// Translate vcs.ReviewEvent (GitHub format) to Gitea API event string.
// vcs uses "APPROVE"; Gitea API expects "APPROVED".
var event string
switch req.Event {
case vcs.ReviewEventApprove:
event = "APPROVED"
case vcs.ReviewEventRequestChanges:
event = "REQUEST_CHANGES"
default:
event = "COMMENT"
}
posted, err := a.client.PostReview(ctx, owner, repo, number, event, req.Body, req.CommitID, comments)
if err != nil {
return nil, err
}
return &reviewInfo{
ID: posted.ID,
Body: posted.Body,
User: vcs.UserInfo{Login: posted.User.Login},
State: posted.State,
CommitID: posted.CommitID,
}, nil
}
func (a *giteaClientVCSAdapter) GetAuthenticatedUser(ctx context.Context) (string, error) {
return a.client.GetAuthenticatedUser(ctx)
}
// giteaExtendedClient implementation:
func (a *giteaClientVCSAdapter) RequestReviewer(ctx context.Context, owner, repo string, number int, user string) error {
return a.client.RequestReviewer(ctx, owner, repo, number, user)
}
func (a *giteaClientVCSAdapter) GetTimelineReviewCommentIDForReview(ctx context.Context, owner, repo string, number int, reviewID int64) (int64, error) {
return a.client.GetTimelineReviewCommentIDForReview(ctx, owner, repo, number, reviewID)
}
func (a *giteaClientVCSAdapter) EditComment(ctx context.Context, owner, repo string, commentID int64, body string) error {
return a.client.EditComment(ctx, owner, repo, commentID, body)
}
func (a *giteaClientVCSAdapter) ListReviewComments(ctx context.Context, owner, repo string, number int, reviewID int64) ([]inlineCommentInfo, error) {
comments, err := a.client.ListReviewComments(ctx, owner, repo, number, reviewID)
if err != nil {
return nil, err
}
result := make([]inlineCommentInfo, len(comments))
for i, c := range comments {
result[i] = inlineCommentInfo{
ID: c.ID,
Path: c.Path,
NewPosition: c.NewPosition,
Body: c.Body,
}
}
return result, nil
}
func (a *giteaClientVCSAdapter) ResolveComment(ctx context.Context, owner, repo string, commentID int64) error {
return a.client.ResolveComment(ctx, owner, repo, commentID)
}
func (a *giteaClientVCSAdapter) ParseDiffNewLines(diff string) diffLineRanges {
return gitea.ParseDiffNewLines(diff)
}
// --- GitHub adapter ---
// githubClientVCSAdapter wraps github.Client to satisfy the vcsClient interface.
type githubClientVCSAdapter struct {
client *github.Client
}
func newGitHubVCSAdapter(c *github.Client) *githubClientVCSAdapter {
return &githubClientVCSAdapter{client: c}
}
func (a *githubClientVCSAdapter) GetPullRequest(ctx context.Context, owner, repo string, number int) (*pullRequestInfo, error) {
pr, err := a.client.GetPullRequest(ctx, owner, repo, number)
if err != nil {
return nil, err
}
return &pullRequestInfo{
Title: pr.Title,
Body: pr.Body,
HeadSha: pr.Head.Sha,
HeadRef: pr.Head.Ref,
}, nil
}
func (a *githubClientVCSAdapter) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
return a.client.GetPullRequestDiff(ctx, owner, repo, number)
}
func (a *githubClientVCSAdapter) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]changedFileInfo, error) {
files, err := a.client.GetPullRequestFiles(ctx, owner, repo, number)
if err != nil {
return nil, err
}
result := make([]changedFileInfo, len(files))
for i, f := range files {
result[i] = changedFileInfo{Filename: f.Filename, Status: f.Status}
}
return result, nil
}
func (a *githubClientVCSAdapter) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]commitStatusInfo, error) {
statuses, err := a.client.GetCommitStatuses(ctx, owner, repo, sha)
if err != nil {
return nil, err
}
result := make([]commitStatusInfo, len(statuses))
for i, s := range statuses {
result[i] = commitStatusInfo{
Status: s.Status,
Context: s.Context,
Description: s.Description,
TargetURL: s.TargetURL,
}
}
return result, nil
}
func (a *githubClientVCSAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.client.GetFileContent(ctx, owner, repo, filepath)
}
func (a *githubClientVCSAdapter) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
return a.client.GetFileContentRef(ctx, owner, repo, filepath, ref)
}
func (a *githubClientVCSAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]contentEntryInfo, error) {
entries, err := a.client.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
result := make([]contentEntryInfo, len(entries))
for i, e := range entries {
result[i] = contentEntryInfo{Name: e.Name, Path: e.Path, Type: e.Type}
}
return result, nil
}
func (a *githubClientVCSAdapter) ListReviews(ctx context.Context, owner, repo string, number int) ([]reviewInfo, error) {
reviews, err := a.client.ListReviews(ctx, owner, repo, number)
if err != nil {
return nil, err
}
result := make([]reviewInfo, len(reviews))
for i, r := range reviews {
result[i] = reviewInfo{
ID: r.ID,
Body: r.Body,
User: r.User,
State: r.State,
CommitID: r.CommitID,
}
}
return result, nil
}
func (a *githubClientVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, req vcs.ReviewRequest) (*reviewInfo, error) {
posted, err := a.client.PostReview(ctx, owner, repo, number, req)
if err != nil {
return nil, err
}
return &reviewInfo{
ID: posted.ID,
Body: posted.Body,
User: posted.User,
State: posted.State,
CommitID: posted.CommitID,
}, nil
}
func (a *githubClientVCSAdapter) GetAuthenticatedUser(ctx context.Context) (string, error) {
return a.client.GetAuthenticatedUser(ctx)
}
+31 -10
View File
@@ -9,7 +9,7 @@ JSON is awkward for persona files that contain multi-line text (identity, severi
- Backwards compatibility: existing JSON personas must continue to work - Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486) - Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`) - Consistency: use `.yaml` extension (not `.yml`)
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation - Library: use `gopkg.in/yaml.v3` (approved in CONVENTIONS.md) with explicit depth limiting
## Proposed Approach ## Proposed Approach
@@ -33,16 +33,37 @@ func parsePersona(data []byte, source string) (*Persona, error) {
### YAML Parsing with Depth Protection ### YAML Parsing with Depth Protection
We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in ```go
`review/persona.go`) rather than relying on library decoder options. Key design func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
decisions: var node yaml.Node
dec := yaml.NewDecoder(bytes.NewReader(data))
if err := dec.Decode(&node); err != nil {
return err
}
if err := checkYAMLDepth(&node, 0, maxDepth); err != nil {
return err
}
return node.Decode(out)
}
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal func checkYAMLDepth(node *yaml.Node, depth, maxDepth int) error {
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection) if depth > maxDepth {
- **Node-count limit:** Conservative overcounting bounds total validation work return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths }
// Handle alias nodes by following the Alias pointer
if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth, maxDepth)
}
for _, child := range node.Content {
if err := checkYAMLDepth(child, depth+1, maxDepth); err != nil {
return err
}
}
return nil
}
```
See `review/persona.go:checkYAMLDepth` for the authoritative implementation. The `gopkg.in/yaml.v3` library does not have built-in depth protection, so we implement explicit depth checking by first decoding into a `yaml.Node`, walking the tree to verify depth (including alias resolution), then decoding into the target struct.
## State/Data Model ## State/Data Model
@@ -53,7 +74,7 @@ No new state. Same `Persona` struct, just different parsing.
| Error | Handling | | Error | Handling |
|-------|----------| |-------|----------|
| Invalid YAML syntax | Return parse error with source file | | Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode | | Deeply nested YAML | Library rejects (v1.16.0+ fix) |
| Unknown extension | Fall back to JSON parsing | | Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse | | Missing required fields | Validation rejects after parse |
+24 -417
View File
@@ -11,12 +11,9 @@ import (
"fmt" "fmt"
"io" "io"
"log/slog" "log/slog"
"math"
"net"
"net/http" "net/http"
"net/url" "net/url"
"strings" "strings"
"syscall"
"time" "time"
) )
@@ -42,181 +39,23 @@ func IsNotFound(err error) bool {
return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound
} }
// IsServerError reports whether an error is an API 5xx response.
func IsServerError(err error) bool {
var apiErr *APIError
return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600
}
// DefaultMaxDiffSize is the default maximum diff size in bytes (10 MB).
const DefaultMaxDiffSize = 10 * 1024 * 1024
// ErrDiffTooLarge is returned when a PR diff exceeds the configured MaxDiffSize.
var ErrDiffTooLarge = errors.New("diff size exceeds maximum allowed size")
// Client interacts with the Gitea API. // Client interacts with the Gitea API.
// A Client is safe for concurrent use by multiple goroutines. // A Client is safe for concurrent use by multiple goroutines.
type Client struct { type Client struct {
baseURL string baseURL string
token string token string
http *http.Client http *http.Client
// RetryBackoff defines the delays between retry attempts.
// RetryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}. Set to shorter durations in tests.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
// If zero, defaults to DefaultMaxDiffSize (10 MB). Set to any negative value
// (or math.MaxInt64) to disable the limit.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
MaxDiffSize int64
}
// defaultCheckRedirect is the redirect policy used by NewClient.
// NOTE: This function is intentionally duplicated in github/client.go (and vice versa)
// because the packages are separate. Changes here must be mirrored there.
// It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Guard for direct invocation in tests and any future callers;
// net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 {
return nil
}
prev := via[len(via)-1]
// Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
}
// Reject cross-host redirect entirely to avoid consuming responses
// from untrusted endpoints.
if req.URL.Host != prev.URL.Host {
return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
}
return nil
}
// safeDialContext is the default DialContext for NewClient.
// It resolves the hostname and checks every returned IP against the blocked
// CIDR list before establishing a connection. This prevents SSRF attacks
// where user-supplied URLs resolve to internal/private addresses.
//
// After validating all IPs, we dial the first resolved IP directly to avoid
// a second DNS lookup (which could return a different IP in a DNS rebinding
// attack). This narrows — but does not fully eliminate — the DNS rebinding
// window to the time between LookupIPAddr and DialContext.
//
// If the host is already an IP literal, LookupIPAddr returns it directly
// (no DNS query issued), so IP literals like https://127.0.0.1/ are blocked.
func safeDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
return nil, fmt.Errorf("safeDialContext: invalid address %q: %w", addr, err)
}
addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return nil, fmt.Errorf("safeDialContext: DNS lookup %q: %w", host, err)
}
if len(addrs) == 0 {
return nil, fmt.Errorf("safeDialContext: no addresses returned for %q", host)
}
for _, a := range addrs {
if IsBlockedIP(a.IP) {
return nil, fmt.Errorf("safeDialContext: blocked: %q resolves to private/reserved IP %s", host, a.IP)
}
}
// Try each resolved IP in order, returning the first successful connection.
// Fallback is important when a hostname resolves to multiple IPs and the first
// is temporarily unreachable. All IPs were already validated above, so dialing
// any of them is safe.
//
// Timeout: 10s per the design (PLAN.md); the outer http.Client has a 30s
// total timeout, but the per-dial timeout ensures a slow TCP connect on one IP
// doesn't consume the budget needed to try others.
d := &net.Dialer{Timeout: 10 * time.Second}
var lastErr error
for _, a := range addrs {
conn, err := d.DialContext(ctx, network, net.JoinHostPort(a.IP.String(), port))
if err == nil {
return conn, nil
}
lastErr = err
}
return nil, fmt.Errorf("safeDialContext: all %d addresses for %q failed, last error: %w", len(addrs), host, lastErr)
}
// newSafeHTTPClient returns an *http.Client with the SSRF-blocking safeDialContext
// transport and the cross-host redirect rejection policy.
//
// We clone http.DefaultTransport to preserve its production-ready defaults
// (ProxyFromEnvironment, TLSHandshakeTimeout, IdleConnTimeout, connection
// pooling, HTTP/2 support) and override only DialContext with safeDialContext.
func newSafeHTTPClient() *http.Client {
transport := http.DefaultTransport.(*http.Transport).Clone()
transport.DialContext = safeDialContext
return &http.Client{
Timeout: 30 * time.Second,
Transport: transport,
CheckRedirect: defaultCheckRedirect,
}
} }
// NewClient creates a new Gitea API client. // NewClient creates a new Gitea API client.
//
// The client uses a safe HTTP transport by default: DNS resolution is performed
// before connecting and any IP in a private/reserved range is rejected
// (RFC1918, loopback, link-local, ULA, etc.). Cross-host and HTTPS→HTTP
// redirects are also rejected.
//
// For tests that use httptest.NewServer (which listens on 127.0.0.1), call
// WithUnsafeDialer() to bypass the IP check.
func NewClient(baseURL, token string) *Client { func NewClient(baseURL, token string) *Client {
return &Client{ return &Client{
baseURL: strings.TrimRight(baseURL, "/"), baseURL: strings.TrimRight(baseURL, "/"),
token: token, token: token,
http: newSafeHTTPClient(), http: &http.Client{Timeout: 30 * time.Second},
} }
} }
// WithUnsafeDialer returns the client configured with a plain HTTP client that
// has no IP-level SSRF protection. It preserves the redirect-rejection policy.
//
// This MUST only be used in tests. Production code must never call this method.
func (c *Client) WithUnsafeDialer() *Client {
c.http = &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
}
return c
}
// SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests.
//
// Passing nil restores the default safe client (30s timeout, IP-blocking
// safeDialContext, and redirect-rejecting CheckRedirect policy matching NewClient).
//
// Callers providing a non-nil client are responsible for configuring a safe
// CheckRedirect policy. Without one, the default net/http behavior will follow
// redirects and may forward the Authorization header to untrusted hosts.
func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
hc = newSafeHTTPClient()
}
c.http = hc
}
// PullRequest holds relevant PR metadata. // PullRequest holds relevant PR metadata.
type PullRequest struct { type PullRequest struct {
Title string `json:"title"` Title string `json:"title"`
@@ -264,28 +103,9 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number
} }
// GetPullRequestDiff fetches the unified diff for a PR. // GetPullRequestDiff fetches the unified diff for a PR.
// It enforces MaxDiffSize to prevent unbounded memory allocation.
// Returns ErrDiffTooLarge if the diff exceeds the configured limit.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) { func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number) reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
maxSize := c.MaxDiffSize
if maxSize == 0 {
maxSize = DefaultMaxDiffSize
}
// When the limit is disabled (negative) or set to math.MaxInt64 (which
// would overflow the +1 detection and silently disable enforcement),
// use the standard unlimited doGet path.
if maxSize < 0 || maxSize == math.MaxInt64 {
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
body, err := c.doGetLimited(ctx, reqURL, maxSize)
if err != nil { if err != nil {
return "", fmt.Errorf("fetch diff: %w", err) return "", fmt.Errorf("fetch diff: %w", err)
} }
@@ -341,22 +161,18 @@ func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, r
} }
// PostReview submits a review to a PR and returns the created review. // PostReview submits a review to a PR and returns the created review.
// event should be one of "APPROVED", "REQUEST_CHANGES", or "COMMENT". // event should be "APPROVED" or "REQUEST_CHANGES".
// commitID anchors the review to a specific commit SHA. If empty, Gitea
// defaults to the current PR head.
// comments are optional inline comments attached to specific lines. // comments are optional inline comments attached to specific lines.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []ReviewComment) (*Review, error) { func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body string, comments []ReviewComment) (*Review, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number) reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := struct { payload := struct {
Body string `json:"body"` Body string `json:"body"`
Event string `json:"event"` Event string `json:"event"`
CommitID string `json:"commit_id,omitempty"`
Comments []ReviewComment `json:"comments,omitempty"` Comments []ReviewComment `json:"comments,omitempty"`
}{ }{
Body: body, Body: body,
Event: event, Event: event,
CommitID: commitID,
Comments: comments, Comments: comments,
} }
@@ -394,218 +210,24 @@ func (c *Client) PostReview(ctx context.Context, owner, repo string, number int,
return &review, nil return &review, nil
} }
// isTemporaryNetError reports whether err is a temporary network error worth retrying.
// This includes connection refused, network unreachable, connection reset, and DNS
// timeouts. It explicitly excludes permanent errors like permission denied or
// "no such host" DNS failures.
func isTemporaryNetError(err error) bool {
if err == nil {
return false
}
// Check for OpError and inspect the underlying syscall error.
// Not all OpErrors are transient — permission denied, for example, is permanent.
var opErr *net.OpError
if errors.As(err, &opErr) {
return isRetriableSyscallError(opErr.Err)
}
// DNS errors: only retry on timeout, not on "no such host" which is permanent.
var dnsErr *net.DNSError
if errors.As(err, &dnsErr) {
return dnsErr.IsTimeout
}
// Check for net.Error with Timeout() (Temporary is deprecated)
var netErr net.Error
if errors.As(err, &netErr) {
return netErr.Timeout()
}
return false
}
// isRetriableSyscallError reports whether the underlying error from a net.OpError
// is a transient syscall error worth retrying.
func isRetriableSyscallError(err error) bool {
if err == nil {
return false
}
// Check for syscall.Errno directly or wrapped
var errno syscall.Errno
if errors.As(err, &errno) {
switch errno {
case syscall.ECONNREFUSED, // connection refused — server not listening
syscall.ECONNRESET, // connection reset by peer
syscall.ENETUNREACH, // network unreachable
syscall.EHOSTUNREACH, // host unreachable
syscall.ETIMEDOUT: // connection timed out
return true
default:
// EACCES, EPERM, etc. are permanent — don't retry
return false
}
}
// If we can't identify the specific syscall error, be conservative and retry.
// This handles wrapped errors or platform-specific error types.
// The retry count is limited, so erring on the side of retrying is safe.
return true
}
// redactURL strips query parameters and userinfo credentials from a URL for
// safe logging. This prevents accidental exposure of sensitive data (tokens in
// query strings, or user:pass in the authority) in log output.
func redactURL(rawURL string) string {
parsed, err := url.Parse(rawURL)
if err != nil {
// If we cannot parse it, return a safe placeholder rather than
// potentially logging something sensitive.
return "[invalid URL]"
}
if parsed.User != nil {
parsed.User = url.User("REDACTED")
}
if parsed.RawQuery != "" {
parsed.RawQuery = "[redacted]"
}
return parsed.String()
}
// sanitizeErrorForLog returns a loggable version of an error that omits
// potentially sensitive content like response bodies. For APIError, only
// the status code is included; for other errors, the type is preserved.
func sanitizeErrorForLog(err error) string {
if err == nil {
return "<nil>"
}
var apiErr *APIError
if errors.As(err, &apiErr) {
return fmt.Sprintf("HTTP %d", apiErr.StatusCode)
}
return err.Error()
}
// doGetWithReader performs an HTTP GET request with retry on 5xx errors and
// temporary network errors. Retries up to 3 times with exponential backoff
// (1s, 2s delays by default; configurable via Client.RetryBackoff for testing).
// The readBody function is called with the response body on success (2xx) and
// is responsible for reading and closing it.
func (c *Client) doGetWithReader(ctx context.Context, reqURL string, readBody func(io.ReadCloser) ([]byte, error)) ([]byte, error) {
const maxAttempts = 3
// backoff[i] is the delay before attempt i+1 (i.e., after attempt i fails).
// First attempt (i=0) has no delay; retries wait 1s then 2s by default.
backoff := c.RetryBackoff
if backoff == nil {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
const maxErrorBodyBytes = 64 * 1024 // 64 KB
var lastErr error
for attempt := 0; attempt < maxAttempts; attempt++ {
if attempt > 0 {
// Determine delay: use backoff slice if available, otherwise retry immediately.
// An empty RetryBackoff slice means "retry without delay" — this is intentional
// as the caller explicitly configured no delays.
var delay time.Duration
if attempt-1 < len(backoff) {
delay = backoff[attempt-1]
}
if delay > 0 {
slog.Warn("retrying request after error",
"attempt", attempt+1,
"url", redactURL(reqURL),
"delay", delay.String(),
"lastError", sanitizeErrorForLog(lastErr))
timer := time.NewTimer(delay)
select {
case <-timer.C:
case <-ctx.Done():
timer.Stop()
return nil, ctx.Err()
}
}
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
if err != nil {
return nil, err
}
req.Header.Set("Authorization", "token "+c.token)
resp, err := c.http.Do(req)
if err != nil {
// Always capture the error for consistent return at loop end.
// This ensures both network errors and HTTP 5xx return lastErr.
lastErr = err
// Only retry temporary network errors when attempts remain.
if attempt < maxAttempts-1 && isTemporaryNetError(err) {
slog.Warn("temporary network error, will retry",
"attempt", attempt+1,
"url", redactURL(reqURL),
"error", err)
continue
}
// Non-retryable network error or final attempt exhausted.
return nil, lastErr
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
return readBody(resp.Body)
}
// Error path: limit how much we read from potentially malicious server
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Only retry on 5xx server errors
if resp.StatusCode < 500 || resp.StatusCode >= 600 {
return nil, lastErr
}
}
return nil, lastErr
}
// doGet performs an HTTP GET request with retry, reading the full response body.
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) { func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) { req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
defer body.Close() if err != nil {
return io.ReadAll(body) return nil, err
}) }
} req.Header.Set("Authorization", "token "+c.token)
// doGetLimited performs an HTTP GET request with retry but enforces a maximum resp, err := c.http.Do(req)
// response body size. Returns ErrDiffTooLarge if the response exceeds maxBytes. if err != nil {
// It reads maxBytes+1 (clamped to avoid overflow) to detect truncation without return nil, err
// buffering the entire body. }
func (c *Client) doGetLimited(ctx context.Context, reqURL string, maxBytes int64) ([]byte, error) { defer resp.Body.Close()
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) {
defer body.Close() if resp.StatusCode < 200 || resp.StatusCode >= 300 {
// Read up to maxBytes+1 to detect overflow. body, _ := io.ReadAll(resp.Body)
// Clamp to prevent integer overflow when maxBytes == math.MaxInt64. return nil, &APIError{StatusCode: resp.StatusCode, Body: string(body)}
limitBytes := maxBytes + 1 }
if limitBytes <= 0 { return io.ReadAll(resp.Body)
limitBytes = math.MaxInt64
}
limited := io.LimitReader(body, limitBytes)
data, err := io.ReadAll(limited)
if err != nil {
return nil, err
}
if int64(len(data)) > maxBytes {
return nil, fmt.Errorf("%w: response exceeds %d bytes", ErrDiffTooLarge, maxBytes)
}
return data, nil
})
} }
// escapePath escapes each segment of a relative file path for use in URLs. // escapePath escapes each segment of a relative file path for use in URLs.
@@ -629,13 +251,7 @@ type ContentEntry struct {
// ListContents lists files and directories at a given path in a repo. // ListContents lists files and directories at a given path in a repo.
// Pass an empty path to list the repository root. // Pass an empty path to list the repository root.
// If the path points to a file (not a directory), Gitea returns a single
// object instead of an array; this method normalizes both cases to a slice.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) { func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
// Normalize "." to empty string — Gitea API rejects "." with 500
if path == "." {
path = ""
}
var reqURL string var reqURL string
if path == "" { if path == "" {
reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, url.PathEscape(owner), url.PathEscape(repo)) reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, url.PathEscape(owner), url.PathEscape(repo))
@@ -648,16 +264,7 @@ func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]
} }
var entries []ContentEntry var entries []ContentEntry
if err := json.Unmarshal(body, &entries); err != nil { if err := json.Unmarshal(body, &entries); err != nil {
// Gitea returns a single object (not an array) when path is a file return nil, fmt.Errorf("parse contents JSON: %w", err)
var single ContentEntry
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: %w", err)
}
// Guard against empty/malformed responses
if single.Name == "" && single.Path == "" {
return nil, fmt.Errorf("parse contents JSON: empty response for path %q", path)
}
entries = []ContentEntry{single}
} }
return entries, nil return entries, nil
} }
@@ -710,9 +317,9 @@ func (c *Client) GetAllFilesInPath(ctx context.Context, owner, repo, path string
// Review represents a pull request review from the Gitea API. // Review represents a pull request review from the Gitea API.
type Review struct { type Review struct {
ID int64 `json:"id"` ID int64 `json:"id"`
Body string `json:"body"` Body string `json:"body"`
User struct { User struct {
Login string `json:"login"` Login string `json:"login"`
} `json:"user"` } `json:"user"`
State string `json:"state"` State string `json:"state"`
+40 -716
View File
@@ -6,15 +6,10 @@ import (
"errors" "errors"
"fmt" "fmt"
"io" "io"
"net"
"net/http" "net/http"
"net/http/httptest" "net/http/httptest"
"net/url"
"strings" "strings"
"sync/atomic"
"syscall"
"testing" "testing"
"time"
) )
func TestGetPullRequest(t *testing.T) { func TestGetPullRequest(t *testing.T) {
@@ -36,7 +31,7 @@ func TestGetPullRequest(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
got, err := client.GetPullRequest(context.Background(), "owner", "repo", 1) got, err := client.GetPullRequest(context.Background(), "owner", "repo", 1)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -63,7 +58,7 @@ func TestGetPullRequestDiff(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 5) got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 5)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -88,7 +83,7 @@ func TestGetCommitStatuses(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
got, err := client.GetCommitStatuses(context.Background(), "owner", "repo", "abc123") got, err := client.GetCommitStatuses(context.Background(), "owner", "repo", "abc123")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -117,9 +112,8 @@ func TestPostReview(t *testing.T) {
} }
var payload struct { var payload struct {
Body string `json:"body"` Body string `json:"body"`
Event string `json:"event"` Event string `json:"event"`
CommitID string `json:"commit_id"`
} }
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil { if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Fatalf("failed to decode payload: %v", err) t.Fatalf("failed to decode payload: %v", err)
@@ -130,16 +124,14 @@ func TestPostReview(t *testing.T) {
if payload.Event != "APPROVED" { if payload.Event != "APPROVED" {
t.Errorf("expected event %q, got %q", "APPROVED", payload.Event) t.Errorf("expected event %q, got %q", "APPROVED", payload.Event)
} }
if payload.CommitID != "abc123def" {
t.Errorf("expected commit_id %q, got %q", "abc123def", payload.CommitID)
}
w.WriteHeader(http.StatusOK) w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":100,"user":{"login":"review-bot"},"state":"APPROVED","stale":false}`)) w.Write([]byte(`{"id":100,"user":{"login":"review-bot"},"state":"APPROVED","stale":false}`))
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", "abc123def", nil) review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", nil)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
@@ -158,7 +150,7 @@ func TestGetPullRequest_Non200(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.GetPullRequest(context.Background(), "owner", "repo", 999) _, err := client.GetPullRequest(context.Background(), "owner", "repo", 999)
if err == nil { if err == nil {
t.Fatal("expected error for 404, got nil") t.Fatal("expected error for 404, got nil")
@@ -171,7 +163,7 @@ func TestGetPullRequest_BadJSON(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.GetPullRequest(context.Background(), "owner", "repo", 1) _, err := client.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil { if err == nil {
t.Fatal("expected error for bad JSON, got nil") t.Fatal("expected error for bad JSON, got nil")
@@ -185,36 +177,13 @@ func TestPostReview_Non200(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", "", nil) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", nil)
if err == nil { if err == nil {
t.Fatal("expected error for 403, got nil") t.Fatal("expected error for 403, got nil")
} }
} }
func TestPostReview_EmptyCommitID_OmittedFromPayload(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var raw map[string]interface{}
if err := json.Unmarshal(body, &raw); err != nil {
t.Fatalf("failed to decode payload: %v", err)
}
if _, exists := raw["commit_id"]; exists {
t.Errorf("expected commit_id to be omitted from payload when empty, but it was present")
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":200,"user":{"login":"bot"},"state":"APPROVED","stale":false}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "ok", "", nil)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestGetFileContent(t *testing.T) { func TestGetFileContent(t *testing.T) {
expected := "# Conventions\n- Be nice\n" expected := "# Conventions\n- Be nice\n"
@@ -226,7 +195,7 @@ func TestGetFileContent(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
got, err := client.GetFileContent(context.Background(), "owner", "repo", "CONVENTIONS.md") got, err := client.GetFileContent(context.Background(), "owner", "repo", "CONVENTIONS.md")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -246,7 +215,7 @@ func TestGetPullRequestFiles(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
files, err := client.GetPullRequestFiles(context.Background(), "owner", "repo", 1) files, err := client.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -271,7 +240,7 @@ func TestGetFileContentRef(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
content, err := client.GetFileContentRef(context.Background(), "owner", "repo", "main.go", "feature-branch") content, err := client.GetFileContentRef(context.Background(), "owner", "repo", "main.go", "feature-branch")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -291,7 +260,7 @@ func TestListContents(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", "docs") entries, err := client.ListContents(context.Background(), "owner", "repo", "docs")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -307,64 +276,11 @@ func TestListContents(t *testing.T) {
} }
} }
func TestListContents_DotPath(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// "." should be normalized to empty path, which hits the root contents endpoint
if r.URL.Path != "/api/v1/repos/owner/repo/contents" {
t.Errorf("expected root contents path, got: %s", r.URL.Path)
}
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `[{"name":"README.md","path":"README.md","type":"file"}]`)
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", ".")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected README.md, got %s", entries[0].Name)
}
}
func TestListContents_FilePath(t *testing.T) {
// Gitea returns a single object (not an array) when path is a file
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/repos/owner/repo/contents/README.md" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
w.Header().Set("Content-Type", "application/json")
// Single object, not an array
fmt.Fprintf(w, `{"name":"README.md","path":"README.md","type":"file"}`)
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected README.md, got %s", entries[0].Name)
}
if entries[0].Type != "file" {
t.Errorf("expected type file, got %s", entries[0].Type)
}
}
func TestGetAllFilesInPath_File(t *testing.T) { func TestGetAllFilesInPath_File(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/api/v1/repos/owner/repo/contents/README.md" { if r.URL.Path == "/api/v1/repos/owner/repo/contents/README.md" {
// Gitea returns a single object (not array) when path is a file // Gitea returns 404 for contents API on files (it's not a dir)
w.Header().Set("Content-Type", "application/json") http.NotFound(w, r)
fmt.Fprintf(w, `{"name":"README.md","path":"README.md","type":"file"}`)
return return
} }
if r.URL.Path == "/api/v1/repos/owner/repo/raw/README.md" { if r.URL.Path == "/api/v1/repos/owner/repo/raw/README.md" {
@@ -375,7 +291,7 @@ func TestGetAllFilesInPath_File(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md") files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md")
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -428,7 +344,7 @@ func TestListReviews(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5) reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -468,7 +384,7 @@ func TestListReviews_Pagination(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5) reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -493,7 +409,7 @@ func TestDeleteReview(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10) err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
@@ -507,7 +423,7 @@ func TestDeleteReview_Forbidden(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10) err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10)
if err == nil { if err == nil {
t.Fatal("expected error for 403, got nil") t.Fatal("expected error for 403, got nil")
@@ -536,7 +452,7 @@ func TestEditComment(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.EditComment(context.Background(), "owner", "repo", 42, "updated body") err := client.EditComment(context.Background(), "owner", "repo", 42, "updated body")
if err != nil { if err != nil {
t.Fatalf("EditComment() error = %v", err) t.Fatalf("EditComment() error = %v", err)
@@ -550,7 +466,7 @@ func TestEditComment_Forbidden(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.EditComment(context.Background(), "owner", "repo", 42, "new body") err := client.EditComment(context.Background(), "owner", "repo", 42, "new body")
if err == nil { if err == nil {
t.Fatal("expected error for 403 response") t.Fatal("expected error for 403 response")
@@ -570,7 +486,7 @@ func TestGetTimelineReviewCommentID(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
id, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->") id, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->")
if err != nil { if err != nil {
t.Fatalf("GetTimelineReviewCommentID() error = %v", err) t.Fatalf("GetTimelineReviewCommentID() error = %v", err)
@@ -586,7 +502,7 @@ func TestGetTimelineReviewCommentID_NotFound(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->") _, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->")
if err == nil { if err == nil {
t.Fatal("expected error when sentinel not found") t.Fatal("expected error when sentinel not found")
@@ -609,7 +525,7 @@ func TestGetAllFilesInPath_404FallsBackToFile(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md") files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md")
if err != nil { if err != nil {
t.Fatalf("expected fallback to file on 404, got error: %v", err) t.Fatalf("expected fallback to file on 404, got error: %v", err)
@@ -630,7 +546,7 @@ func TestGetAllFilesInPath_500Propagates(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "somepath") _, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "somepath")
if err == nil { if err == nil {
t.Fatal("expected error to propagate for 500, got nil") t.Fatal("expected error to propagate for 500, got nil")
@@ -652,7 +568,7 @@ func TestGetAllFilesInPath_403Propagates(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "private/stuff") _, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "private/stuff")
if err == nil { if err == nil {
t.Fatal("expected error to propagate for 403, got nil") t.Fatal("expected error to propagate for 403, got nil")
@@ -668,9 +584,9 @@ func TestGetAllFilesInPath_403Propagates(t *testing.T) {
func TestIsNotFound(t *testing.T) { func TestIsNotFound(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
err error err error
want bool want bool
}{ }{
{"nil error", nil, false}, {"nil error", nil, false},
{"non-API error", fmt.Errorf("network timeout"), false}, {"non-API error", fmt.Errorf("network timeout"), false},
@@ -704,7 +620,7 @@ func TestGetAuthenticatedUser(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
login, err := client.GetAuthenticatedUser(context.Background()) login, err := client.GetAuthenticatedUser(context.Background())
if err != nil { if err != nil {
t.Fatalf("GetAuthenticatedUser() error = %v", err) t.Fatalf("GetAuthenticatedUser() error = %v", err)
@@ -729,7 +645,7 @@ func TestRequestReviewer(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 7, "bot-user") err := client.RequestReviewer(context.Background(), "owner", "repo", 7, "bot-user")
if err != nil { if err != nil {
t.Fatalf("RequestReviewer() error = %v", err) t.Fatalf("RequestReviewer() error = %v", err)
@@ -745,7 +661,7 @@ func TestRequestReviewer_204(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user") err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user")
if err != nil { if err != nil {
t.Fatalf("RequestReviewer() should accept 204, got error = %v", err) t.Fatalf("RequestReviewer() should accept 204, got error = %v", err)
@@ -759,7 +675,7 @@ func TestRequestReviewer_Error(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user") err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user")
if err == nil { if err == nil {
t.Fatal("expected error for 403 response") t.Fatal("expected error for 403 response")
@@ -779,7 +695,7 @@ func TestListReviewComments(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
comments, err := client.ListReviewComments(context.Background(), "owner", "repo", 1, 42) comments, err := client.ListReviewComments(context.Background(), "owner", "repo", 1, 42)
if err != nil { if err != nil {
t.Fatalf("ListReviewComments() error = %v", err) t.Fatalf("ListReviewComments() error = %v", err)
@@ -807,7 +723,7 @@ func TestResolveComment(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.ResolveComment(context.Background(), "owner", "repo", 99) err := client.ResolveComment(context.Background(), "owner", "repo", 99)
if err != nil { if err != nil {
t.Fatalf("ResolveComment() error = %v", err) t.Fatalf("ResolveComment() error = %v", err)
@@ -821,601 +737,9 @@ func TestResolveComment_Error(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
err := client.ResolveComment(context.Background(), "owner", "repo", 99) err := client.ResolveComment(context.Background(), "owner", "repo", 99)
if err == nil { if err == nil {
t.Fatal("expected error for 404 response") t.Fatal("expected error for 404 response")
} }
} }
func TestIsServerError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"non-API error", fmt.Errorf("network timeout"), false},
{"404 APIError", &APIError{StatusCode: 404, Body: "not found"}, false},
{"500 APIError", &APIError{StatusCode: 500, Body: "server error"}, true},
{"502 APIError", &APIError{StatusCode: 502, Body: "bad gateway"}, true},
{"503 APIError", &APIError{StatusCode: 503, Body: "unavailable"}, true},
{"599 APIError", &APIError{StatusCode: 599, Body: "edge case"}, true},
{"600 not server error", &APIError{StatusCode: 600, Body: "edge"}, false},
{"400 not server error", &APIError{StatusCode: 400, Body: "bad request"}, false},
{"wrapped 500", fmt.Errorf("fetch: %w", &APIError{StatusCode: 500, Body: "err"}), true},
{"wrapped 404", fmt.Errorf("fetch: %w", &APIError{StatusCode: 404, Body: "err"}), false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := IsServerError(tt.err)
if got != tt.want {
t.Errorf("IsServerError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestDoGet_RetriesOn500(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts < 3 {
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"transient error"}`))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"success"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), server.URL+"/test")
if err != nil {
t.Fatalf("expected success after retry, got error: %v", err)
}
if string(body) != `{"data":"success"}` {
t.Errorf("body = %q, want %q", string(body), `{"data":"success"}`)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3", attempts)
}
}
func TestDoGet_FailsAfterMaxRetries(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"persistent error"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error after max retries")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusInternalServerError {
t.Errorf("status = %d, want 500", apiErr.StatusCode)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3 (max retries)", attempts)
}
}
func TestDoGet_NoRetryOn4xx(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusForbidden)
w.Write([]byte(`{"message":"forbidden"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error for 403")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusForbidden {
t.Errorf("status = %d, want 403", apiErr.StatusCode)
}
if attempts != 1 {
t.Errorf("attempts = %d, want 1 (no retry on 4xx)", attempts)
}
}
func TestDoGet_RespectsContextCancellation(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"error"}`))
}))
defer server.Close()
ctx, cancel := context.WithCancel(context.Background())
client := NewTestClient(server.URL, "test-token")
// Use longer backoff to give us time to cancel during the wait
client.RetryBackoff = []time.Duration{100 * time.Millisecond, 100 * time.Millisecond}
// Cancel after first attempt returns and retry begins
go func() {
time.Sleep(20 * time.Millisecond)
cancel()
}()
_, err := client.doGet(ctx, server.URL+"/test")
if err == nil {
t.Fatal("expected error on context cancellation")
}
// Should have made 1 attempt, then context cancelled during backoff
if attempts != 1 {
t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts)
}
}
// mockTransport is a test helper that returns errors for the first N calls,
// then delegates to a real server.
type mockTransport struct {
failCount int32 // number of failures remaining (atomic)
failErr error // error to return on failure
realServer *httptest.Server
attemptsMade atomic.Int32 // tracks total attempts
}
func (m *mockTransport) RoundTrip(req *http.Request) (*http.Response, error) {
m.attemptsMade.Add(1)
remaining := atomic.AddInt32(&m.failCount, -1)
if remaining >= 0 {
// Still have failures to return
return nil, m.failErr
}
// Redirect to real server
req.URL.Host = m.realServer.Listener.Addr().String()
req.URL.Scheme = "http"
return http.DefaultTransport.RoundTrip(req)
}
func TestDoGet_RetriesOnTemporaryNetError(t *testing.T) {
// Real server that will handle successful requests
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
}))
defer server.Close()
// Mock transport: fail twice with ECONNREFUSED, then succeed
mt := &mockTransport{
failCount: 2,
failErr: &net.OpError{Op: "dial", Net: "tcp", Err: syscall.ECONNREFUSED},
realServer: server,
}
client := NewClient("http://fake-host/", "test-token")
client.SetHTTPClient(&http.Client{Transport: mt})
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), "http://fake-host/test")
if err != nil {
t.Fatalf("expected success after retries, got error: %v", err)
}
if string(body) != `{"status":"ok"}` {
t.Errorf("body = %q, want %q", string(body), `{"status":"ok"}`)
}
// Should have made exactly 3 attempts: 2 failures + 1 success
if got := mt.attemptsMade.Load(); got != 3 {
t.Errorf("attempts = %d, want 3 (2 failures + 1 success)", got)
}
}
func TestIsTemporaryNetError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"plain error", fmt.Errorf("some error"), false},
// OpError with retriable syscall errors
{"OpError ECONNREFUSED", &net.OpError{Op: "dial", Err: syscall.ECONNREFUSED}, true},
{"OpError ECONNRESET", &net.OpError{Op: "read", Err: syscall.ECONNRESET}, true},
{"OpError ENETUNREACH", &net.OpError{Op: "dial", Err: syscall.ENETUNREACH}, true},
{"OpError EHOSTUNREACH", &net.OpError{Op: "dial", Err: syscall.EHOSTUNREACH}, true},
{"OpError ETIMEDOUT", &net.OpError{Op: "dial", Err: syscall.ETIMEDOUT}, true},
// OpError with permanent syscall errors — should NOT retry
{"OpError EACCES", &net.OpError{Op: "dial", Err: syscall.EACCES}, false},
{"OpError EPERM", &net.OpError{Op: "dial", Err: syscall.EPERM}, false},
// OpError with unknown inner error — conservative retry
{"OpError unknown inner", &net.OpError{Op: "dial", Err: fmt.Errorf("unknown")}, true},
// DNS errors
{"DNS timeout", &net.DNSError{IsTimeout: true}, true},
{"DNS no such host", &net.DNSError{IsTimeout: false, Name: "bad.host"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isTemporaryNetError(tt.err)
if got != tt.want {
t.Errorf("isTemporaryNetError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestIsRetriableSyscallError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil", nil, false},
{"ECONNREFUSED", syscall.ECONNREFUSED, true},
{"ECONNRESET", syscall.ECONNRESET, true},
{"ENETUNREACH", syscall.ENETUNREACH, true},
{"EHOSTUNREACH", syscall.EHOSTUNREACH, true},
{"ETIMEDOUT", syscall.ETIMEDOUT, true},
{"EACCES (permanent)", syscall.EACCES, false},
{"EPERM (permanent)", syscall.EPERM, false},
{"ENOENT (permanent)", syscall.ENOENT, false},
{"unknown error", fmt.Errorf("something"), true}, // conservative retry
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isRetriableSyscallError(tt.err)
if got != tt.want {
t.Errorf("isRetriableSyscallError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestRedactURL(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{
name: "no query params",
input: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
want: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
},
{
name: "with query params - redacts",
input: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?ref=main",
want: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?[redacted]",
},
{
name: "multiple query params",
input: "https://example.com/path?token=secret&page=1",
want: "https://example.com/path?[redacted]",
},
{
name: "invalid URL",
input: "://invalid",
want: "[invalid URL]",
},
{
name: "empty string",
input: "",
want: "",
},
{
name: "with userinfo - redacts credentials",
input: "https://admin:secret@gitea.example.com/api/v1/repos",
want: "https://REDACTED@gitea.example.com/api/v1/repos",
},
{
name: "with userinfo and query params",
input: "https://user:pass@example.com/path?token=abc",
want: "https://REDACTED@example.com/path?[redacted]",
},
{
name: "username only - no password",
input: "https://user@example.com/path",
want: "https://REDACTED@example.com/path",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := redactURL(tt.input)
if got != tt.want {
t.Errorf("redactURL(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestSanitizeErrorForLog(t *testing.T) {
tests := []struct {
name string
err error
want string
}{
{
name: "nil error",
err: nil,
want: "<nil>",
},
{
name: "APIError omits body",
err: &APIError{StatusCode: 500, Body: "internal error: database connection failed"},
want: "HTTP 500",
},
{
name: "APIError with large body still only shows status",
err: &APIError{StatusCode: 502, Body: strings.Repeat("x", 1000)},
want: "HTTP 502",
},
{
name: "non-API error preserved",
err: fmt.Errorf("connection refused"),
want: "connection refused",
},
{
name: "wrapped APIError",
err: fmt.Errorf("request failed: %w", &APIError{StatusCode: 503, Body: "service unavailable"}),
want: "HTTP 503",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := sanitizeErrorForLog(tt.err)
if got != tt.want {
t.Errorf("sanitizeErrorForLog() = %q, want %q", got, tt.want)
}
})
}
}
func TestNewClient_HasCheckRedirect(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect to be set")
}
}
func TestDefaultCheckRedirect_RejectsHTTPSToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "gitea.example.com", Path: "/foo"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on HTTPS->HTTP redirect")
}
if !strings.Contains(err.Error(), "HTTPS to HTTP downgrade") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsCrossHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "cdn.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on cross-host redirect")
}
if !strings.Contains(err.Error(), "cross-host") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_AllowsSameHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if auth := req.Header.Get("Authorization"); auth != "token abc" {
t.Errorf("expected Authorization to be preserved, got %q", auth)
}
}
func TestDefaultCheckRedirect_AllowsSameHostHTTPToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/bar"},
Header: http.Header{},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsTooManyRedirects(t *testing.T) {
via := make([]*http.Request, 10)
for i := range via {
via[i] = &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/"}}
}
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/final"}}
err := defaultCheckRedirect(req, via)
if err == nil {
t.Fatal("expected error after 10 redirects")
}
if !strings.Contains(err.Error(), "10 redirects") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_EmptyViaAllowed(t *testing.T) {
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
err := defaultCheckRedirect(req, nil)
if err != nil {
t.Fatalf("unexpected error with empty via: %v", err)
}
}
func TestSetHTTPClient_NilRestoresDefault(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
c.SetHTTPClient(nil)
if c.http == nil {
t.Fatal("expected non-nil http client after SetHTTPClient(nil)")
}
if c.http.Timeout != 30*time.Second {
t.Errorf("expected 30s timeout, got %v", c.http.Timeout)
}
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect policy after SetHTTPClient(nil)")
}
}
// TestSafeDialContextBlocksPrivateIPs verifies that NewClient (which uses
// safeDialContext by default) refuses to connect to private/reserved IPs.
func TestSafeDialContextBlocksPrivateIPs(t *testing.T) {
// These servers listen on 127.0.0.1, so the safe dialer will block them.
// We use NewClient (NOT NewTestClient) to exercise the real safe dialer.
privateURLs := []struct {
name string
url string
}{
{"loopback localhost", "http://localhost/"},
{"loopback 127.0.0.1", "http://127.0.0.1/"},
}
for _, tc := range privateURLs {
t.Run(tc.name, func(t *testing.T) {
c := NewClient(tc.url, "token")
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Errorf("expected error connecting to %s, got nil", tc.url)
}
// Error must mention SSRF/blocked, not a random network error.
if !strings.Contains(err.Error(), "blocked") &&
!strings.Contains(err.Error(), "private") &&
!strings.Contains(err.Error(), "loopback") &&
!strings.Contains(err.Error(), "reserved") {
t.Logf("error: %v", err)
// Allow other errors (connection refused, DNS) since the point
// is that we don't silently succeed — but prefer the explicit block message.
}
})
}
}
// TestWithUnsafeDialerAllowsLocalhost verifies that WithUnsafeDialer bypasses
// the IP check, allowing tests to connect to httptest.Server (127.0.0.1).
func TestWithUnsafeDialerAllowsLocalhost(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(`{"title":"test","body":"","head":{"sha":"abc","ref":"main"}}`))
}))
defer server.Close()
// WithUnsafeDialer should allow connecting to 127.0.0.1.
c := NewClient(server.URL, "token").WithUnsafeDialer()
pr, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error with unsafe dialer: %v", err)
}
if pr.Title != "test" {
t.Errorf("expected title 'test', got %q", pr.Title)
}
}
// TestNewClient_HasSafeTransport verifies that NewClient installs the
// SSRF-blocking transport (i.e. Transport is not nil and DialContext is set).
func TestNewClient_HasSafeTransport(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
if c.http.Transport == nil {
t.Fatal("expected Transport to be set on NewClient (safe dialer)")
}
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport, got %T", c.http.Transport)
}
if transport.DialContext == nil {
t.Fatal("expected DialContext to be set on transport (safe dialer)")
}
}
// TestSetHTTPClient_NilRestoresSafeTransport verifies that SetHTTPClient(nil)
// restores the safe transport (not just any client).
func TestSetHTTPClient_NilRestoresSafeTransport(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
c.SetHTTPClient(&http.Client{}) // replace with plain client
c.SetHTTPClient(nil) // restore
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport after SetHTTPClient(nil), got %T", c.http.Transport)
}
if transport.DialContext == nil {
t.Fatal("expected DialContext to be restored after SetHTTPClient(nil)")
}
}
// TestNewSafeHTTPClient_PreservesDefaultTransportSettings verifies that
// newSafeHTTPClient clones http.DefaultTransport to retain proxy support,
// TLS handshake timeout, idle connection limits, and HTTP/2.
func TestNewSafeHTTPClient_PreservesDefaultTransportSettings(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport, got %T", c.http.Transport)
}
defaults := http.DefaultTransport.(*http.Transport)
// TLSHandshakeTimeout must be inherited (non-zero), not the zero value
// that a bare &http.Transport{} would have.
if transport.TLSHandshakeTimeout == 0 {
t.Error("TLSHandshakeTimeout is 0; expected inherited value from DefaultTransport")
}
if transport.TLSHandshakeTimeout != defaults.TLSHandshakeTimeout {
t.Errorf("TLSHandshakeTimeout = %v, want %v", transport.TLSHandshakeTimeout, defaults.TLSHandshakeTimeout)
}
// IdleConnTimeout must be inherited.
if transport.IdleConnTimeout == 0 {
t.Error("IdleConnTimeout is 0; expected inherited value from DefaultTransport")
}
if transport.IdleConnTimeout != defaults.IdleConnTimeout {
t.Errorf("IdleConnTimeout = %v, want %v", transport.IdleConnTimeout, defaults.IdleConnTimeout)
}
// MaxIdleConns must be inherited.
if transport.MaxIdleConns == 0 {
t.Error("MaxIdleConns is 0; expected inherited value from DefaultTransport")
}
// ForceAttemptHTTP2 must be inherited.
if !transport.ForceAttemptHTTP2 {
t.Error("ForceAttemptHTTP2 is false; expected true from DefaultTransport")
}
// Proxy must be set (ProxyFromEnvironment).
if transport.Proxy == nil {
t.Error("Proxy is nil; expected ProxyFromEnvironment from DefaultTransport")
}
// DialContext must be our safe dialer, not the default.
if transport.DialContext == nil {
t.Error("DialContext is nil; expected safeDialContext")
}
}
-97
View File
@@ -1,97 +0,0 @@
package gitea
import (
"context"
"errors"
"math"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestGetPullRequestDiff_SizeLimits(t *testing.T) {
tests := []struct {
name string
diff string
maxDiffSize int64
wantErr error
wantDiff string
}{
{
name: "exceeds max size",
diff: strings.Repeat("+ added line\n", 1000), // ~13 KB
maxDiffSize: 100,
wantErr: ErrDiffTooLarge,
},
{
name: "within max size",
diff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
maxDiffSize: 1024,
wantDiff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
},
{
name: "exactly at limit",
diff: strings.Repeat("x", 50),
maxDiffSize: 50,
wantDiff: strings.Repeat("x", 50),
},
{
name: "one byte over limit",
diff: strings.Repeat("x", 51),
maxDiffSize: 50,
wantErr: ErrDiffTooLarge,
},
{
name: "disabled limit",
diff: strings.Repeat("x", 10000),
maxDiffSize: -1,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "math.MaxInt64 treated as disabled",
diff: strings.Repeat("x", 10000),
maxDiffSize: math.MaxInt64,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "default limit",
diff: "diff content",
maxDiffSize: 0, // zero means use DefaultMaxDiffSize
wantDiff: "diff content",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte(tt.diff)) //nolint:errcheck // test handler
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client.MaxDiffSize = tt.maxDiffSize
client.RetryBackoff = []time.Duration{}
got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if tt.wantErr != nil {
if err == nil {
t.Fatal("expected error, got nil")
}
if !errors.Is(err, tt.wantErr) {
t.Errorf("expected %v, got: %v", tt.wantErr, err)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got != tt.wantDiff {
t.Errorf("diff mismatch: got length %d, want length %d", len(got), len(tt.wantDiff))
}
})
}
}
-18
View File
@@ -1,18 +0,0 @@
// Package gitea — export_test.go exposes test helpers to test files in this
// package. It uses `package gitea` (not `package gitea_test`) so it can access
// unexported identifiers; Go only compiles it into the test binary, never into
// the production binary. This is the idiomatic pattern for white-box testing
// in Go (see net/http/export_test.go in the stdlib for the same approach).
package gitea
// NewTestClient creates a Gitea client configured for use in unit tests.
// It bypasses the IP-level SSRF protection so that tests can connect to
// httptest.Server instances (which listen on 127.0.0.1).
//
// Using the internal package gitea declaration (not gitea_test) means this
// symbol is available to all _test.go files in this package. It is ONLY
// compiled into the test binary; production binaries never include it.
// Production code must use NewClient, which enables the safe dialer.
func NewTestClient(baseURL, token string) *Client {
return NewClient(baseURL, token).WithUnsafeDialer()
}
-91
View File
@@ -1,91 +0,0 @@
// Package gitea provides a client for the Gitea API.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
package gitea
import (
"fmt"
"net"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the validate-url subcommand and tests outside
// this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
}
-144
View File
@@ -1,144 +0,0 @@
package gitea
import (
"net"
"testing"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
ip string
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
if len(blockedCIDRParseErrors) > 0 {
for _, msg := range blockedCIDRParseErrors {
t.Errorf("CIDR parse error: %s", msg)
}
}
}
+4 -4
View File
@@ -31,13 +31,13 @@ func TestPostReview_WithComments(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
comments := []ReviewComment{ comments := []ReviewComment{
{Path: "main.go", NewPosition: 42, Body: "[MAJOR] Something bad"}, {Path: "main.go", NewPosition: 42, Body: "[MAJOR] Something bad"},
{Path: "util.go", NewPosition: 10, Body: "[MINOR] Style issue"}, {Path: "util.go", NewPosition: 10, Body: "[MINOR] Style issue"},
} }
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", "", comments) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", comments)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
@@ -71,8 +71,8 @@ func TestPostReview_NilComments(t *testing.T) {
})) }))
defer server.Close() defer server.Close()
client := NewTestClient(server.URL, "test-token") client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", "", nil) _, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", nil)
if err != nil { if err != nil {
t.Fatalf("unexpected error: %v", err) t.Fatalf("unexpected error: %v", err)
} }
-582
View File
@@ -1,582 +0,0 @@
package github
import (
"context"
"encoding/base64"
"encoding/json"
"errors"
"fmt"
"net/http"
"net/http/httptest"
"testing"
"gitea.weiker.me/rodin/review-bot/vcs"
)
func TestGetPullRequest(t *testing.T) {
want := PullRequest{
Title: "My PR",
Body: "Description",
}
want.Head.Sha = "abc123"
want.Head.Ref = "feature/foo"
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/pulls/42" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(want)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
pr, err := c.GetPullRequest(context.Background(), "owner", "repo", 42)
if err != nil {
t.Fatalf("GetPullRequest: %v", err)
}
if pr.Title != want.Title {
t.Errorf("Title = %q, want %q", pr.Title, want.Title)
}
if pr.Head.Sha != want.Head.Sha {
t.Errorf("Head.Sha = %q, want %q", pr.Head.Sha, want.Head.Sha)
}
if pr.Head.Ref != want.Head.Ref {
t.Errorf("Head.Ref = %q, want %q", pr.Head.Ref, want.Head.Ref)
}
}
func TestGetPullRequest_NotFound(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNotFound)
w.Write([]byte("not found"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error, got nil")
}
if !IsNotFound(err) {
t.Errorf("expected IsNotFound, got %v", err)
}
}
func TestGetPullRequestDiff(t *testing.T) {
const wantDiff = "diff --git a/foo.go b/foo.go\n+added line\n"
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("Accept") != "application/vnd.github.diff" {
t.Errorf("Accept = %q, want application/vnd.github.diff", r.Header.Get("Accept"))
}
w.Header().Set("Content-Type", "text/plain")
fmt.Fprint(w, wantDiff)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
diff, err := c.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("GetPullRequestDiff: %v", err)
}
if diff != wantDiff {
t.Errorf("diff = %q, want %q", diff, wantDiff)
}
}
func TestGetPullRequestDiff_TooLarge(t *testing.T) {
// Return a diff larger than DefaultMaxDiffSize.
hugeDiff := make([]byte, DefaultMaxDiffSize+1)
for i := range hugeDiff {
hugeDiff[i] = 'x'
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/plain")
w.Write(hugeDiff)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
_, err := c.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected ErrDiffTooLarge, got nil")
}
if err != ErrDiffTooLarge {
t.Errorf("err = %v, want ErrDiffTooLarge", err)
}
}
func TestGetPullRequestFiles(t *testing.T) {
files := []ChangedFile{
{Filename: "foo.go", Status: "modified"},
{Filename: "bar.go", Status: "added"},
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/pulls/7/files" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(files)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.GetPullRequestFiles(context.Background(), "owner", "repo", 7)
if err != nil {
t.Fatalf("GetPullRequestFiles: %v", err)
}
if len(got) != len(files) {
t.Fatalf("len = %d, want %d", len(got), len(files))
}
for i, f := range files {
if got[i].Filename != f.Filename || got[i].Status != f.Status {
t.Errorf("file[%d] = %+v, want %+v", i, got[i], f)
}
}
}
func TestGetCommitStatuses(t *testing.T) {
statuses := []CommitStatus{
{Status: "success", Context: "ci/tests", Description: "All tests passed", TargetURL: "https://ci.example.com"},
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/commits/deadbeef/statuses" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(statuses)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.GetCommitStatuses(context.Background(), "owner", "repo", "deadbeef")
if err != nil {
t.Fatalf("GetCommitStatuses: %v", err)
}
if len(got) != 1 {
t.Fatalf("len = %d, want 1", len(got))
}
if got[0].Status != "success" || got[0].Context != "ci/tests" {
t.Errorf("status = %+v, want %+v", got[0], statuses[0])
}
}
func TestGetFileContent(t *testing.T) {
const wantContent = "package main\n\nfunc main() {}\n"
encoded := base64.StdEncoding.EncodeToString([]byte(wantContent))
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/main.go" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(contentResponse{
Type: "file",
Name: "main.go",
Path: "main.go",
Encoding: "base64",
Content: encoded,
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.GetFileContent(context.Background(), "owner", "repo", "main.go")
if err != nil {
t.Fatalf("GetFileContent: %v", err)
}
if got != wantContent {
t.Errorf("content = %q, want %q", got, wantContent)
}
}
func TestGetFileContentRef(t *testing.T) {
const wantContent = "version: 2\n"
encoded := base64.StdEncoding.EncodeToString([]byte(wantContent))
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/config.yml" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
if r.URL.Query().Get("ref") != "feature/x" {
t.Errorf("ref = %q, want feature/x", r.URL.Query().Get("ref"))
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(contentResponse{
Type: "file",
Name: "config.yml",
Path: "config.yml",
Encoding: "base64",
Content: encoded,
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.GetFileContentRef(context.Background(), "owner", "repo", "config.yml", "feature/x")
if err != nil {
t.Fatalf("GetFileContentRef: %v", err)
}
if got != wantContent {
t.Errorf("content = %q, want %q", got, wantContent)
}
}
func TestGetFileContent_NewlinesInBase64(t *testing.T) {
// GitHub inserts newlines every 60 chars in the base64-encoded content.
// Verify we strip them before decoding.
const wantContent = "hello world this is a long string that gets split by github with newlines in the base64 encoding"
rawEncoded := base64.StdEncoding.EncodeToString([]byte(wantContent))
// Insert newlines to simulate GitHub's format.
var chunked string
for i := 0; i < len(rawEncoded); i += 60 {
end := i + 60
if end > len(rawEncoded) {
end = len(rawEncoded)
}
chunked += rawEncoded[i:end] + "\n"
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(contentResponse{
Type: "file",
Encoding: "base64",
Content: chunked,
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.GetFileContent(context.Background(), "owner", "repo", "file.txt")
if err != nil {
t.Fatalf("GetFileContent: %v", err)
}
if got != wantContent {
t.Errorf("content = %q, want %q", got, wantContent)
}
}
func TestListContents_Directory(t *testing.T) {
entries := []ContentEntry{
{Name: "main.go", Path: "main.go", Type: "file"},
{Name: "lib", Path: "lib", Type: "dir"},
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/contents/" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(entries)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.ListContents(context.Background(), "owner", "repo", "")
if err != nil {
t.Fatalf("ListContents: %v", err)
}
if len(got) != len(entries) {
t.Fatalf("len = %d, want %d", len(got), len(entries))
}
for i, e := range entries {
if got[i].Name != e.Name || got[i].Type != e.Type {
t.Errorf("entry[%d] = %+v, want %+v", i, got[i], e)
}
}
}
func TestListContents_File(t *testing.T) {
// When path points to a single file, GitHub returns an object, not array.
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Return a single file object (not an array).
json.NewEncoder(w).Encode(contentResponse{
Type: "file",
Name: "README.md",
Path: "README.md",
Encoding: "base64",
Content: base64.StdEncoding.EncodeToString([]byte("# Hello")),
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.ListContents(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("ListContents (file): %v", err)
}
if len(got) != 1 {
t.Fatalf("len = %d, want 1", len(got))
}
if got[0].Name != "README.md" || got[0].Type != "file" {
t.Errorf("entry = %+v", got[0])
}
}
func TestGetAuthenticatedUser(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/user" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(userResponse{Login: "rodin"})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
login, err := c.GetAuthenticatedUser(context.Background())
if err != nil {
t.Fatalf("GetAuthenticatedUser: %v", err)
}
if login != "rodin" {
t.Errorf("login = %q, want %q", login, "rodin")
}
}
func TestPostReview(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
t.Errorf("method = %s, want POST", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/5/reviews" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
var payload postReviewRequest
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Errorf("decode body: %v", err)
w.WriteHeader(http.StatusBadRequest)
return
}
if payload.Event != "REQUEST_CHANGES" {
t.Errorf("event = %q, want REQUEST_CHANGES", payload.Event)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(reviewResponse{
ID: 99,
Body: payload.Body,
State: "CHANGES_REQUESTED",
User: struct{ Login string `json:"login"` }{Login: "rodin"},
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
review, err := c.PostReview(context.Background(), "owner", "repo", 5, vcs.ReviewRequest{
Body: "needs work",
Event: vcs.ReviewEventRequestChanges,
})
if err != nil {
t.Fatalf("PostReview: %v", err)
}
if review.ID != 99 {
t.Errorf("review.ID = %d, want 99", review.ID)
}
// Verify state translation: CHANGES_REQUESTED -> REQUEST_CHANGES
if review.State != "REQUEST_CHANGES" {
t.Errorf("review.State = %q, want REQUEST_CHANGES", review.State)
}
if review.User.Login != "rodin" {
t.Errorf("review.User.Login = %q, want rodin", review.User.Login)
}
}
func TestPostReview_ConflictingCommitIDs(t *testing.T) {
c := NewClient("tok", "https://api.github.com")
_, err := c.PostReview(context.Background(), "owner", "repo", 1, vcs.ReviewRequest{
Body: "test",
Event: vcs.ReviewEventComment,
Comments: []vcs.ReviewComment{
{Path: "a.go", Position: 1, Body: "comment 1", CommitID: "sha1"},
{Path: "b.go", Position: 2, Body: "comment 2", CommitID: "sha2"},
},
})
if err == nil {
t.Fatal("expected ErrConflictingCommitIDs, got nil")
}
if err != ErrConflictingCommitIDs {
t.Errorf("err = %v, want ErrConflictingCommitIDs", err)
}
}
func TestListReviews(t *testing.T) {
reviews := []reviewResponse{
{ID: 1, Body: "lgtm", State: "APPROVED", User: struct{ Login string `json:"login"` }{Login: "alice"}},
{ID: 2, Body: "needs work", State: "CHANGES_REQUESTED", User: struct{ Login string `json:"login"` }{Login: "bob"}},
}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/repos/owner/repo/pulls/3/reviews" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(reviews)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
got, err := c.ListReviews(context.Background(), "owner", "repo", 3)
if err != nil {
t.Fatalf("ListReviews: %v", err)
}
if len(got) != 2 {
t.Fatalf("len = %d, want 2", len(got))
}
if got[0].State != "APPROVED" {
t.Errorf("got[0].State = %q, want APPROVED", got[0].State)
}
if got[1].State != "REQUEST_CHANGES" {
t.Errorf("got[1].State = %q, want REQUEST_CHANGES (translated from CHANGES_REQUESTED)", got[1].State)
}
}
func TestDeleteReview_Pending(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodDelete {
t.Errorf("method = %s, want DELETE", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/1/reviews/42" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
w.WriteHeader(http.StatusNoContent)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
err := c.DeleteReview(context.Background(), "owner", "repo", 1, 42)
if err != nil {
t.Fatalf("DeleteReview: %v", err)
}
}
func TestDeleteReview_Submitted_Returns422(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// GitHub returns 422 when trying to delete a submitted review.
w.WriteHeader(http.StatusUnprocessableEntity)
w.Write([]byte(`{"message":"Cannot delete a submitted review"}`))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
err := c.DeleteReview(context.Background(), "owner", "repo", 1, 42)
if err == nil {
t.Fatal("expected error for submitted review deletion")
}
// Should be wrapped as ErrCannotDeleteSubmittedReview.
if !isErrCannotDeleteSubmittedReview(err) {
t.Errorf("err = %v, want ErrCannotDeleteSubmittedReview", err)
}
}
// isErrCannotDeleteSubmittedReview checks if err wraps ErrCannotDeleteSubmittedReview.
func isErrCannotDeleteSubmittedReview(err error) bool {
return err != nil && errors.Is(err, ErrCannotDeleteSubmittedReview)
}
func TestDismissReview(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPut {
t.Errorf("method = %s, want PUT", r.Method)
}
if r.URL.Path != "/repos/owner/repo/pulls/2/reviews/10/dismissals" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
var payload dismissReviewRequest
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Errorf("decode body: %v", err)
w.WriteHeader(http.StatusBadRequest)
return
}
if payload.Event != "DISMISS" {
t.Errorf("event = %q, want DISMISS", payload.Event)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(reviewResponse{ID: 10, State: "DISMISSED"})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
err := c.DismissReview(context.Background(), "owner", "repo", 2, 10, "outdated review")
if err != nil {
t.Fatalf("DismissReview: %v", err)
}
}
func TestDoRequestWithBody_RejectsHTTP(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Fatal("request should not have been sent")
}))
defer srv.Close()
// Without AllowInsecureHTTPForTest, HTTP should be rejected.
c := NewClient("tok", srv.URL)
_, err := c.doRequestWithBody(context.Background(), http.MethodPost, srv.URL+"/test", []byte(`{}`))
if err == nil {
t.Fatal("expected error for HTTP request")
}
}
func TestPostReview_CommitIDFromRequest(t *testing.T) {
const wantCommitID = "abc123def456"
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var payload postReviewRequest
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Errorf("decode body: %v", err)
w.WriteHeader(http.StatusBadRequest)
return
}
if payload.CommitID != wantCommitID {
t.Errorf("commit_id = %q, want %q", payload.CommitID, wantCommitID)
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(reviewResponse{
ID: 1,
Body: payload.Body,
State: "COMMENTED",
CommitID: payload.CommitID,
User: struct{ Login string `json:"login"` }{Login: "rodin"},
})
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
review, err := c.PostReview(context.Background(), "owner", "repo", 1, vcs.ReviewRequest{
Body: "looks good",
Event: vcs.ReviewEventApprove,
CommitID: wantCommitID,
})
if err != nil {
t.Fatalf("PostReview: %v", err)
}
if review.CommitID != wantCommitID {
t.Errorf("review.CommitID = %q, want %q", review.CommitID, wantCommitID)
}
}
-657
View File
@@ -1,657 +0,0 @@
// Package github provides a client for the GitHub API.
// It supports pull request operations, file content retrieval,
// and review submission for both github.com and GitHub Enterprise.
package github
import (
"bytes"
"context"
"encoding/base64"
"encoding/json"
"errors"
"fmt"
"io"
"log/slog"
"net/http"
"net/url"
"os"
"strconv"
"strings"
"time"
)
const (
defaultBaseURL = "https://api.github.com"
// maxRetryAttempts is the number of times doRequest will attempt a request.
maxRetryAttempts = 3
// maxRetryAfter caps the maximum delay from a Retry-After header to prevent
// a server from stalling the client indefinitely.
maxRetryAfter = 60 * time.Second
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
maxErrorBodyBytes = 64 * 1024 // 64 KB
// maxResponseBodyBytes limits how much of a successful response body we read
// for defense-in-depth against servers returning excessively large payloads.
maxResponseBodyBytes = 10 * 1024 * 1024 // 10 MB
)
// APIError represents an HTTP error response from the GitHub API.
// It carries the status code so callers can distinguish between
// different failure modes (e.g. 404 vs 500).
//
// The Body field stores up to 64 KiB of the raw response for programmatic
// inspection. Error() truncates to 200 bytes for safe logging, but callers
// should avoid logging or propagating Body directly in production since it may
// contain sensitive details from the upstream server.
type APIError struct {
StatusCode int
Body string
}
func (e *APIError) Error() string {
body := e.Body
if len(body) > 200 {
body = body[:200] + "...(truncated)"
}
// Sanitize newlines to prevent log injection from upstream response bodies.
body = strings.ReplaceAll(body, "\n", " ")
body = strings.ReplaceAll(body, "\r", " ")
return fmt.Sprintf("HTTP %d: %s", e.StatusCode, body)
}
// IsNotFound reports whether an error is an API 404 response.
func IsNotFound(err error) bool {
if apiErr, ok := asAPIError(err); ok {
return apiErr.StatusCode == http.StatusNotFound
}
return false
}
// IsUnauthorized reports whether an error is an API 401 response.
func IsUnauthorized(err error) bool {
if apiErr, ok := asAPIError(err); ok {
return apiErr.StatusCode == http.StatusUnauthorized
}
return false
}
func asAPIError(err error) (*APIError, bool) {
if err == nil {
return nil, false
}
var target *APIError
if errors.As(err, &target) {
return target, true
}
return nil, false
}
// Client interacts with the GitHub API.
// A Client is safe for concurrent use by multiple goroutines.
// SetHTTPClient and SetRetryBackoff are intended for test setup only and must
// be called before any goroutines issue requests; they have no synchronization.
type Client struct {
baseURL string
token string
httpClient *http.Client
// allowInsecureHTTP permits requests to HTTP (non-TLS) endpoints.
// When false, doRequest rejects URLs with an http:// scheme.
allowInsecureHTTP bool
// retryBackoff defines the delays between retry attempts for 429 responses.
// retryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}.
retryBackoff []time.Duration
// now returns the current time. Defaults to time.Now.
// Override in tests to control HTTP-date Retry-After calculations.
now func() time.Time
}
// defaultCheckRedirect is the redirect policy used by NewClient.
// NOTE: This function is intentionally duplicated in gitea/client.go (and vice versa)
// because the packages are separate. Changes here must be mirrored there.
// It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Guard for direct invocation in tests and any future callers;
// net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 {
return nil
}
prev := via[len(via)-1]
// Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
}
// Reject cross-host redirect entirely to avoid consuming responses
// from untrusted endpoints.
if req.URL.Host != prev.URL.Host {
return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
}
return nil
}
// ClientOption configures optional behavior of a Client.
type ClientOption func(*clientConfig)
type clientConfig struct {
allowInsecureHTTP bool
insecureIsTestBypass bool
}
// AllowInsecureHTTP permits sending credentials over plaintext HTTP connections.
// In production, this option is gated by the REVIEW_BOT_ALLOW_INSECURE=1
// environment variable. Without the env var set, the option is ignored
// and a warning is logged.
//
// For tests, use AllowInsecureHTTPForTest (defined in a _test.go file in the same package) which bypasses the env gate.
func AllowInsecureHTTP() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
}
}
// NewClient creates a new GitHub API client.
// If baseURL is empty, it defaults to https://api.github.com.
// For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3).
func NewClient(token, baseURL string, opts ...ClientOption) *Client {
if baseURL == "" {
baseURL = defaultBaseURL
}
var cfg clientConfig
for _, opt := range opts {
opt(&cfg)
}
if cfg.allowInsecureHTTP && !cfg.insecureIsTestBypass {
if os.Getenv("REVIEW_BOT_ALLOW_INSECURE") != "1" {
slog.Warn("AllowInsecureHTTP ignored: set REVIEW_BOT_ALLOW_INSECURE=1 to enable")
cfg.allowInsecureHTTP = false
} else {
slog.Warn("AllowInsecureHTTP enabled — credentials may be sent over plaintext",
"env", "REVIEW_BOT_ALLOW_INSECURE=1")
}
}
return &Client{
baseURL: strings.TrimRight(baseURL, "/"),
token: token,
allowInsecureHTTP: cfg.allowInsecureHTTP,
httpClient: &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
},
now: time.Now,
}
}
// SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests.
//
// Passing nil restores the default client (30s timeout + redirect-rejecting
// CheckRedirect policy matching NewClient).
//
// Callers providing a non-nil client are responsible for configuring a safe
// CheckRedirect policy. Without one, the default net/http behavior will follow
// redirects and may forward the Authorization header to untrusted hosts.
func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
hc = &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
}
}
c.httpClient = hc
}
// SetRetryBackoff sets the delays between retry attempts.
// This is intended for testing to speed up retry tests.
//
// Note: if an empty non-nil slice is provided, Retry-After delays parsed from
// server responses will be computed and capped but not applied (because
// attempt < len(backoff) is always false). This is acceptable for the
// test-only use case but callers should be aware of this edge case.
func (c *Client) SetRetryBackoff(backoff []time.Duration) {
c.retryBackoff = backoff
}
// parseRetryAfter parses a Retry-After header value, supporting both integer
// seconds (e.g. "120") and HTTP-date format (e.g. "Thu, 01 Dec 2025 16:00:00 GMT")
// as specified in RFC 7231 §7.1.3.
//
// For integer values, it returns the duration directly.
// For HTTP-date values, it computes the delay as the difference between the
// parsed time and now. If the date is in the past, it returns 0.
//
// Returns (0, false) if the value cannot be parsed as either format.
func (c *Client) parseRetryAfter(value string) (time.Duration, bool) {
value = strings.TrimSpace(value)
// Try integer seconds first (most common from GitHub).
// RFC 7231 allows delta-seconds of 0 to indicate immediate retry.
if seconds, err := strconv.Atoi(value); err == nil && seconds >= 0 {
return time.Duration(seconds) * time.Second, true
}
// Try HTTP-date format (RFC 7231 §7.1.3).
// http.ParseTime handles RFC 1123, RFC 850, and ASCTIME formats.
if retryAt, err := http.ParseTime(value); err == nil {
delay := retryAt.Sub(c.now())
if delay < 0 {
delay = 0
}
return delay, true
}
return 0, false
}
// redactURL redacts sensitive components from a URL for safe inclusion in error
// messages and log output. It removes userinfo (e.g., user:pass@) and replaces
// query parameters with a placeholder.
func redactURL(rawURL string) string {
u, err := url.Parse(rawURL)
if err != nil {
return "<unparseable URL>"
}
u.User = nil
if u.RawQuery != "" {
u.RawQuery = "<redacted>"
}
return u.String()
}
// doRequest performs an HTTP request with retry on 429 rate limit responses.
// It respects the Retry-After header when present, supporting both integer
// seconds and HTTP-date formats (capped at maxRetryAfter).
func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) {
// NOTE: This parses reqURL a second time (http.NewRequestWithContext parses it
// again internally). Acceptable cost: URL parsing is cheap and threading the
// parsed *url.URL through would complicate the interface for negligible gain.
if !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
if err != nil {
return nil, fmt.Errorf("parse request URL: %w", err)
}
if strings.EqualFold(parsed.Scheme, "http") {
return nil, fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
}
}
var backoff []time.Duration
if c.retryBackoff != nil {
backoff = append([]time.Duration(nil), c.retryBackoff...)
} else {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
var lastErr error
for attempt := 0; attempt < maxRetryAttempts; attempt++ {
if attempt > 0 {
var delay time.Duration
if attempt-1 < len(backoff) {
delay = backoff[attempt-1]
}
if delay > 0 {
timer := time.NewTimer(delay)
select {
case <-timer.C:
timer.Stop() // no-op after fire; kept for symmetry with the ctx.Done case
case <-ctx.Done():
timer.Stop()
return nil, ctx.Err()
}
}
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, nil)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.token)
if accept != "" {
req.Header.Set("Accept", accept)
} else {
req.Header.Set("Accept", "application/vnd.github+json")
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("do request: %w", err)
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBodyBytes))
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read response body: %w", err)
}
return body, nil
}
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Retry on 429 rate limit
if resp.StatusCode == http.StatusTooManyRequests && attempt < maxRetryAttempts-1 {
// Check for Retry-After header and override backoff if present.
// Supports both integer seconds (common) and HTTP-date format (RFC 7231).
if ra := resp.Header.Get("Retry-After"); ra != "" {
if delay, ok := c.parseRetryAfter(ra); ok {
if delay > maxRetryAfter {
delay = maxRetryAfter
}
if attempt < len(backoff) {
backoff[attempt] = delay
}
}
}
continue
}
// Don't retry other errors
return nil, lastErr
}
return nil, lastErr
}
// doGet is a convenience wrapper for GET requests with the default Accept header.
func (c *Client) doGet(ctx context.Context, url string) ([]byte, error) {
return c.doRequest(ctx, http.MethodGet, url, "")
}
// DefaultMaxDiffSize is the default maximum diff size in bytes (10 MB).
const DefaultMaxDiffSize = 10 * 1024 * 1024
// ErrDiffTooLarge is returned when a PR diff exceeds the configured MaxDiffSize.
var ErrDiffTooLarge = errors.New("diff size exceeds maximum allowed size")
// PullRequest holds relevant PR metadata.
type PullRequest struct {
Title string `json:"title"`
Body string `json:"body"`
Head struct {
Sha string `json:"sha"`
Ref string `json:"ref"`
} `json:"head"`
}
// CommitStatus represents a single CI status entry.
type CommitStatus struct {
Status string `json:"state"`
Context string `json:"context"`
Description string `json:"description"`
TargetURL string `json:"target_url"`
}
// ChangedFile represents a file modified in a PR.
type ChangedFile struct {
Filename string `json:"filename"`
Status string `json:"status"`
}
// ContentEntry represents a file or directory in a repository.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"`
}
// contentResponse is the GitHub API response for a single file's content.
type contentResponse struct {
Content string `json:"content"`
Encoding string `json:"encoding"`
Type string `json:"type"`
Name string `json:"name"`
Path string `json:"path"`
}
// GetPullRequest fetches PR metadata.
func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number int) (*PullRequest, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR: %w", err)
}
var pr PullRequest
if err := json.Unmarshal(body, &pr); err != nil {
return nil, fmt.Errorf("parse PR response: %w", err)
}
return &pr, nil
}
// GetPullRequestFiles fetches the list of files changed in a PR.
func (c *Client) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]ChangedFile, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/files",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR files: %w", err)
}
var files []ChangedFile
if err := json.Unmarshal(body, &files); err != nil {
return nil, fmt.Errorf("parse PR files response: %w", err)
}
return files, nil
}
// GetCommitStatuses fetches CI statuses for a commit SHA.
func (c *Client) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]CommitStatus, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/commits/%s/statuses",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), url.PathEscape(sha))
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch commit statuses: %w", err)
}
var statuses []CommitStatus
if err := json.Unmarshal(body, &statuses); err != nil {
return nil, fmt.Errorf("parse commit statuses response: %w", err)
}
return statuses, nil
}
// decodeFileContent decodes the content from a GitHub contents API response.
// GitHub returns content as base64-encoded string with newlines inserted;
// this function strips the newlines before decoding.
func decodeFileContent(resp contentResponse) (string, error) {
if resp.Encoding != "base64" {
return "", fmt.Errorf("unsupported content encoding: %q", resp.Encoding)
}
// GitHub inserts newlines every 60 characters; strip them before decoding.
stripped := strings.ReplaceAll(resp.Content, "\n", "")
decoded, err := base64.StdEncoding.DecodeString(stripped)
if err != nil {
return "", fmt.Errorf("decode base64 content: %w", err)
}
return string(decoded), nil
}
// GetFileContent fetches the content of a file at the default branch HEAD.
func (c *Client) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return c.GetFileContentRef(ctx, owner, repo, filepath, "")
}
// GetFileContentRef fetches the content of a file at the given ref (branch, tag, or SHA).
// If ref is empty, the default branch is used.
func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(filepath))
if ref != "" {
reqURL += "?ref=" + url.QueryEscape(ref)
}
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch file content %q: %w", filepath, err)
}
var resp contentResponse
if err := json.Unmarshal(body, &resp); err != nil {
return "", fmt.Errorf("parse file content response: %w", err)
}
content, err := decodeFileContent(resp)
if err != nil {
return "", fmt.Errorf("decode file content %q: %w", filepath, err)
}
return content, nil
}
// ListContents lists the files and directories at the given path.
// If path points to a file instead of a directory, it returns a single-entry slice.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(path))
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list contents %q: %w", path, err)
}
// The GitHub API returns an array for directories and an object for files.
// Try array first; if it fails, try object and wrap in a slice.
var entries []ContentEntry
if err := json.Unmarshal(body, &entries); err == nil {
return entries, nil
}
// Try as a single ContentEntry (file path).
var single contentResponse
if err := json.Unmarshal(body, &single); err != nil {
return nil, fmt.Errorf("parse contents response for %q: %w", path, err)
}
return []ContentEntry{{
Name: single.Name,
Path: single.Path,
Type: single.Type,
}}, nil
}
// doRequestWithBody performs an HTTP request with a JSON request body.
// Unlike doRequest, it does NOT retry — write operations should not be
// retried automatically.
func (c *Client) doRequestWithBody(ctx context.Context, method, reqURL string, data []byte) ([]byte, error) {
if !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
if err != nil {
return nil, fmt.Errorf("parse request URL: %w", err)
}
if strings.EqualFold(parsed.Scheme, "http") {
return nil, fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
}
}
var bodyReader *bytes.Reader
if data != nil {
bodyReader = bytes.NewReader(data)
} else {
bodyReader = bytes.NewReader(nil)
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, bodyReader)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.token)
req.Header.Set("Accept", "application/vnd.github+json")
if data != nil {
req.Header.Set("Content-Type", "application/json")
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("do request: %w", err)
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBodyBytes))
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read response body: %w", err)
}
return body, nil
}
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
return nil, &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
}
// escapePath escapes each segment of a file path for use in URL path components.
// Slashes are preserved as separators; individual segments are PathEscaped.
func escapePath(p string) string {
parts := strings.Split(p, "/")
for i, part := range parts {
parts[i] = url.PathEscape(part)
}
return strings.Join(parts, "/")
}
// GetPullRequestDiff fetches the unified diff for a PR.
// Returns ErrDiffTooLarge if the response exceeds DefaultMaxDiffSize bytes.
//
// It reads up to DefaultMaxDiffSize+1 bytes and checks for truncation:
// if exactly DefaultMaxDiffSize+1 bytes are available, the diff is too large.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
if !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
if err != nil {
return "", fmt.Errorf("parse request URL: %w", err)
}
if strings.EqualFold(parsed.Scheme, "http") {
return "", fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
}
}
req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
if err != nil {
return "", fmt.Errorf("create PR diff request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.token)
req.Header.Set("Accept", "application/vnd.github.diff")
resp, err := c.httpClient.Do(req)
if err != nil {
return "", fmt.Errorf("fetch PR diff: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
return "", &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
}
// Read up to DefaultMaxDiffSize+1 bytes. If we get exactly that many,
// the diff exceeds the limit.
limit := int64(DefaultMaxDiffSize) + 1
raw, err := io.ReadAll(io.LimitReader(resp.Body, limit))
if err != nil {
return "", fmt.Errorf("read PR diff: %w", err)
}
if int64(len(raw)) == limit {
return "", ErrDiffTooLarge
}
return string(raw), nil
}
-658
View File
@@ -1,658 +0,0 @@
package github
import (
"context"
"errors"
"net/http"
"net/http/httptest"
"net/url"
"strings"
"testing"
"time"
)
func TestNewClient_DefaultBaseURL(t *testing.T) {
c := NewClient("tok", "")
if c.baseURL != defaultBaseURL {
t.Errorf("baseURL = %q, want %q", c.baseURL, defaultBaseURL)
}
}
func TestNewClient_CustomBaseURL(t *testing.T) {
c := NewClient("tok", "https://github.concur.com/api/v3/")
if c.baseURL != "https://github.concur.com/api/v3" {
t.Errorf("baseURL = %q, want trailing slash stripped", c.baseURL)
}
}
func TestDoRequest_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if got := r.Header.Get("Authorization"); got != "Bearer test-token" {
t.Errorf("Authorization = %q, want Bearer test-token", got)
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"ok":true}`))
}))
defer srv.Close()
c := NewClient("test-token", srv.URL, AllowInsecureHTTPForTest())
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != `{"ok":true}` {
t.Errorf("body = %q, want %q", body, `{"ok":true}`)
}
}
func TestDoRequest_429_RetryAfter_IntegerSeconds(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.Header().Set("Retry-After", "0")
w.WriteHeader(http.StatusTooManyRequests)
w.Write([]byte("rate limited"))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.SetRetryBackoff([]time.Duration{0, 0})
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "success" {
t.Errorf("body = %q, want %q", body, "success")
}
if attempts != 2 {
t.Errorf("attempts = %d, want 2", attempts)
}
}
func TestDoRequest_429_RetryAfter_HTTPDate(t *testing.T) {
// Fix "now" to a known time for deterministic testing.
fixedNow := time.Date(2025, 12, 1, 15, 59, 59, 0, time.UTC)
retryAt := "Mon, 01 Dec 2025 16:00:00 GMT" // 1 second in the future
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.Header().Set("Retry-After", retryAt)
w.WriteHeader(http.StatusTooManyRequests)
w.Write([]byte("rate limited"))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.now = func() time.Time { return fixedNow }
// Initial backoff is 0; the HTTP-date parser will compute 1s and override.
c.SetRetryBackoff([]time.Duration{0, 0})
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "success" {
t.Errorf("body = %q, want %q", body, "success")
}
if attempts != 2 {
t.Errorf("attempts = %d, want 2", attempts)
}
}
func TestDoRequest_429_RetryAfter_HTTPDate_InPast(t *testing.T) {
// If the HTTP-date is in the past, delay should be 0 (retry immediately).
fixedNow := time.Date(2025, 12, 1, 17, 0, 0, 0, time.UTC)
retryAt := "Mon, 01 Dec 2025 16:00:00 GMT" // 1 hour in the past
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.Header().Set("Retry-After", retryAt)
w.WriteHeader(http.StatusTooManyRequests)
w.Write([]byte("rate limited"))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.now = func() time.Time { return fixedNow }
c.SetRetryBackoff([]time.Duration{0, 0})
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "success" {
t.Errorf("body = %q, want %q", body, "success")
}
}
func TestDoRequest_429_NoRetryAfter_UsesDefaultBackoff(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.WriteHeader(http.StatusTooManyRequests)
w.Write([]byte("rate limited"))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.SetRetryBackoff([]time.Duration{0, 0})
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "success" {
t.Errorf("body = %q, want %q", body, "success")
}
if attempts != 2 {
t.Errorf("attempts = %d, want 2", attempts)
}
}
func TestDoRequest_429_InvalidRetryAfter_UsesDefaultBackoff(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts == 1 {
w.Header().Set("Retry-After", "not-a-number-or-date")
w.WriteHeader(http.StatusTooManyRequests)
w.Write([]byte("rate limited"))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.SetRetryBackoff([]time.Duration{0, 0})
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "success" {
t.Errorf("body = %q, want %q", body, "success")
}
}
func TestDoRequest_404_NoRetry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusNotFound)
w.Write([]byte("not found"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
_, err := c.doGet(context.Background(), srv.URL+"/test")
if err == nil {
t.Fatal("expected error, got nil")
}
if !IsNotFound(err) {
t.Errorf("expected IsNotFound, got %v", err)
}
if attempts != 1 {
t.Errorf("attempts = %d, want 1 (no retry on 404)", attempts)
}
}
func TestDoRequest_401_NoRetry(t *testing.T) {
attempts := 0
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusUnauthorized)
w.Write([]byte("unauthorized"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
_, err := c.doGet(context.Background(), srv.URL+"/test")
if err == nil {
t.Fatal("expected error, got nil")
}
if !IsUnauthorized(err) {
t.Errorf("expected IsUnauthorized, got %v", err)
}
if attempts != 1 {
t.Errorf("attempts = %d, want 1 (no retry on 401)", attempts)
}
}
func TestDoRequest_ContextCanceled(t *testing.T) {
// This test exercises the timer-cancel path in the retry select:
// select { case <-timer.C; case <-ctx.Done() }
// The server returns 429 with a long Retry-After, and we cancel the
// context shortly after the first response so that cancellation races
// against the timer rather than preventing the initial HTTP round-trip.
requestReceived := make(chan struct{}, 1)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
select {
case requestReceived <- struct{}{}:
default:
}
w.Header().Set("Retry-After", "10")
w.WriteHeader(http.StatusTooManyRequests)
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
c.SetRetryBackoff([]time.Duration{10 * time.Second, 10 * time.Second})
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Cancel the context after the first request completes, while the
// client is blocked in the retry timer select.
go func() {
<-requestReceived
// Small delay to ensure we're inside the timer select.
time.Sleep(50 * time.Millisecond)
cancel()
}()
_, err := c.doGet(ctx, srv.URL+"/test")
if err == nil {
t.Fatal("expected error, got nil")
}
if !errors.Is(err, context.Canceled) {
t.Errorf("err = %v, want context.Canceled", err)
}
}
func TestParseRetryAfter_IntegerSeconds(t *testing.T) {
c := NewClient("tok", "")
delay, ok := c.parseRetryAfter("42")
if !ok {
t.Fatal("expected ok=true")
}
if delay != 42*time.Second {
t.Errorf("delay = %v, want 42s", delay)
}
}
func TestParseRetryAfter_ZeroSeconds(t *testing.T) {
c := NewClient("tok", "")
delay, ok := c.parseRetryAfter("0")
if !ok {
t.Fatal("expected ok=true for zero seconds (RFC 7231 allows immediate retry)")
}
if delay != 0 {
t.Errorf("delay = %v, want 0", delay)
}
}
func TestParseRetryAfter_NegativeSeconds(t *testing.T) {
c := NewClient("tok", "")
_, ok := c.parseRetryAfter("-5")
if ok {
t.Error("expected ok=false for negative seconds")
}
}
func TestParseRetryAfter_HTTPDate_Future(t *testing.T) {
fixedNow := time.Date(2025, 12, 1, 15, 59, 50, 0, time.UTC)
c := NewClient("tok", "")
c.now = func() time.Time { return fixedNow }
delay, ok := c.parseRetryAfter("Mon, 01 Dec 2025 16:00:00 GMT")
if !ok {
t.Fatal("expected ok=true")
}
// Should be 10 seconds in the future.
if delay != 10*time.Second {
t.Errorf("delay = %v, want 10s", delay)
}
}
func TestParseRetryAfter_HTTPDate_Past(t *testing.T) {
fixedNow := time.Date(2025, 12, 1, 17, 0, 0, 0, time.UTC)
c := NewClient("tok", "")
c.now = func() time.Time { return fixedNow }
delay, ok := c.parseRetryAfter("Mon, 01 Dec 2025 16:00:00 GMT")
if !ok {
t.Fatal("expected ok=true")
}
if delay != 0 {
t.Errorf("delay = %v, want 0 (past date)", delay)
}
}
func TestParseRetryAfter_RFC850_Format(t *testing.T) {
fixedNow := time.Date(2025, 12, 1, 15, 59, 50, 0, time.UTC)
c := NewClient("tok", "")
c.now = func() time.Time { return fixedNow }
// RFC 850 format
delay, ok := c.parseRetryAfter("Monday, 01-Dec-25 16:00:00 GMT")
if !ok {
t.Fatal("expected ok=true for RFC 850 format")
}
if delay != 10*time.Second {
t.Errorf("delay = %v, want 10s", delay)
}
}
func TestParseRetryAfter_Invalid(t *testing.T) {
c := NewClient("tok", "")
_, ok := c.parseRetryAfter("not-valid")
if ok {
t.Error("expected ok=false for invalid value")
}
}
func TestParseRetryAfter_EmptyString(t *testing.T) {
c := NewClient("tok", "")
_, ok := c.parseRetryAfter("")
if ok {
t.Error("expected ok=false for empty string")
}
}
func TestParseRetryAfter_MaxCap(t *testing.T) {
// Verify that parseRetryAfter returns the raw value (capping is done by caller).
c := NewClient("tok", "")
delay, ok := c.parseRetryAfter("3600")
if !ok {
t.Fatal("expected ok=true")
}
if delay != 3600*time.Second {
t.Errorf("delay = %v, want 3600s (caller is responsible for capping)", delay)
}
}
func TestAPIError_Error_Truncation(t *testing.T) {
longBody := make([]byte, 300)
for i := range longBody {
longBody[i] = 'x'
}
apiErr := &APIError{StatusCode: 500, Body: string(longBody)}
msg := apiErr.Error()
if len(msg) > 250 {
// "HTTP 500: " (10) + 200 + "...(truncated)" (14) = 224
t.Errorf("error message too long: %d chars", len(msg))
}
}
func TestAPIError_Error_NewlineSanitized(t *testing.T) {
apiErr := &APIError{StatusCode: 400, Body: "line1\nline2\rline3"}
msg := apiErr.Error()
for _, c := range msg {
if c == '\n' || c == '\r' {
t.Errorf("error message contains unsanitized newline: %q", msg)
break
}
}
}
func TestNewClient_HasCheckRedirect(t *testing.T) {
c := NewClient("secret-token", "https://api.github.com")
if c.httpClient.CheckRedirect == nil {
t.Fatal("expected CheckRedirect to be set")
}
}
func TestDefaultCheckRedirect_RejectsHTTPSToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "api.github.com", Path: "/foo"},
Header: http.Header{"Authorization": []string{"Bearer token"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on HTTPS->HTTP redirect")
}
if !strings.Contains(err.Error(), "HTTPS to HTTP downgrade") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsCrossHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "objects.githubusercontent.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"Bearer token"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on cross-host redirect")
}
if !strings.Contains(err.Error(), "cross-host") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_AllowsSameHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"Bearer token"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Auth should be preserved on same-host redirect
if auth := req.Header.Get("Authorization"); auth != "Bearer token" {
t.Errorf("expected Authorization to be preserved, got %q", auth)
}
}
func TestDefaultCheckRedirect_AllowsSameHostHTTPToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "http", Host: "localhost:8080", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "localhost:8080", Path: "/bar"},
Header: http.Header{},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsTooManyRedirects(t *testing.T) {
via := make([]*http.Request, 10)
for i := range via {
via[i] = &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/"}}
}
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/final"}}
err := defaultCheckRedirect(req, via)
if err == nil {
t.Fatal("expected error after 10 redirects")
}
if !strings.Contains(err.Error(), "10 redirects") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_EmptyViaAllowed(t *testing.T) {
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "api.github.com", Path: "/foo"}}
err := defaultCheckRedirect(req, nil)
if err != nil {
t.Fatalf("unexpected error with empty via: %v", err)
}
}
func TestSetHTTPClient_NilRestoresDefault(t *testing.T) {
c := NewClient("token", "https://api.github.com")
c.SetHTTPClient(nil)
if c.httpClient == nil {
t.Fatal("expected non-nil httpClient after SetHTTPClient(nil)")
}
if c.httpClient.Timeout != 30*time.Second {
t.Errorf("expected 30s timeout, got %v", c.httpClient.Timeout)
}
if c.httpClient.CheckRedirect == nil {
t.Fatal("expected CheckRedirect policy after SetHTTPClient(nil)")
}
}
func TestAllowInsecureHTTPForTest_PermitsHTTP(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("ok"))
}))
defer srv.Close()
c := NewClient("tok", srv.URL, AllowInsecureHTTPForTest())
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "ok" {
t.Errorf("body = %q, want %q", body, "ok")
}
}
func TestNoInsecureOption_RejectsHTTP(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Fatal("request should not have been sent")
}))
defer srv.Close()
c := NewClient("tok", srv.URL)
_, err := c.doGet(context.Background(), srv.URL+"/test")
if err == nil {
t.Fatal("expected error for HTTP request without AllowInsecureHTTP")
}
if !strings.Contains(err.Error(), "refusing HTTP request") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestNoInsecureOption_RejectsUppercaseHTTP(t *testing.T) {
// Verify case-insensitive scheme check (RFC 3986).
c := NewClient("tok", "HTTP://127.0.0.1:1")
_, err := c.doGet(context.Background(), "HTTP://127.0.0.1:1/test")
if err == nil {
t.Fatal("expected error for uppercase HTTP scheme")
}
if !strings.Contains(err.Error(), "refusing HTTP request") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestNoInsecureOption_RejectsMixedCaseHTTP(t *testing.T) {
// Verify mixed case like "Http://" is also rejected.
c := NewClient("tok", "Http://127.0.0.1:1")
_, err := c.doGet(context.Background(), "Http://127.0.0.1:1/test")
if err == nil {
t.Fatal("expected error for mixed-case HTTP scheme")
}
if !strings.Contains(err.Error(), "refusing HTTP request") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestAllowInsecureHTTP_WithoutEnvVar_Rejected(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Fatal("request should not have been sent")
}))
defer srv.Close()
t.Setenv("REVIEW_BOT_ALLOW_INSECURE", "")
c := NewClient("tok", srv.URL, AllowInsecureHTTP())
_, err := c.doGet(context.Background(), srv.URL+"/test")
if err == nil {
t.Fatal("expected error: AllowInsecureHTTP without env var should be rejected")
}
if !strings.Contains(err.Error(), "refusing HTTP request") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestAllowInsecureHTTP_WithEnvVar_Permitted(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("insecure-ok"))
}))
defer srv.Close()
t.Setenv("REVIEW_BOT_ALLOW_INSECURE", "1")
c := NewClient("tok", srv.URL, AllowInsecureHTTP())
body, err := c.doGet(context.Background(), srv.URL+"/test")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if string(body) != "insecure-ok" {
t.Errorf("body = %q, want %q", body, "insecure-ok")
}
}
func TestAllowInsecureHTTP_EnvVarNotOne_Rejected(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Fatal("request should not have been sent")
}))
defer srv.Close()
// "true" is not "1" — strict check
t.Setenv("REVIEW_BOT_ALLOW_INSECURE", "true")
c := NewClient("tok", srv.URL, AllowInsecureHTTP())
_, err := c.doGet(context.Background(), srv.URL+"/test")
if err == nil {
t.Fatal("expected error: env var 'true' is not '1'")
}
if !strings.Contains(err.Error(), "refusing HTTP request") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestRedactURL_WithQuery(t *testing.T) {
got := redactURL("http://localhost:1234/path?secret=token&foo=bar")
want := "http://localhost:1234/path?<redacted>"
if got != want {
t.Errorf("redactURL = %q, want %q", got, want)
}
}
func TestRedactURL_NoQuery(t *testing.T) {
got := redactURL("http://localhost:1234/path")
want := "http://localhost:1234/path"
if got != want {
t.Errorf("redactURL = %q, want %q", got, want)
}
}
func TestRedactURL_Userinfo(t *testing.T) {
got := redactURL("http://user:pass@localhost:1234/path")
want := "http://localhost:1234/path"
if got != want {
t.Errorf("redactURL = %q, want %q", got, want)
}
}
func TestRedactURL_UserinfoWithQuery(t *testing.T) {
got := redactURL("http://user:pass@localhost:1234/path?secret=token")
want := "http://localhost:1234/path?<redacted>"
if got != want {
t.Errorf("redactURL = %q, want %q", got, want)
}
}
-13
View File
@@ -1,13 +0,0 @@
package github
// AllowInsecureHTTPForTest permits sending credentials over plaintext HTTP
// without requiring the REVIEW_BOT_ALLOW_INSECURE environment variable.
// This is intended exclusively for test code using httptest.Server.
//
// Defined in a _test.go file so it is only available to test binaries.
func AllowInsecureHTTPForTest() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
cfg.insecureIsTestBypass = true
}
}
-29
View File
@@ -1,29 +0,0 @@
package github
import (
"context"
"encoding/json"
"fmt"
)
// userResponse is the GitHub API response for the authenticated user.
type userResponse struct {
Login string `json:"login"`
}
// GetAuthenticatedUser returns the login of the currently authenticated user.
func (c *Client) GetAuthenticatedUser(ctx context.Context) (string, error) {
reqURL := fmt.Sprintf("%s/user", c.baseURL)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("get authenticated user: %w", err)
}
var resp userResponse
if err := json.Unmarshal(body, &resp); err != nil {
return "", fmt.Errorf("parse user response: %w", err)
}
return resp.Login, nil
}
-212
View File
@@ -1,212 +0,0 @@
package github
import (
"context"
"encoding/json"
"errors"
"fmt"
"net/http"
"net/url"
"gitea.weiker.me/rodin/review-bot/vcs"
)
// ErrCannotDeleteSubmittedReview is returned when DeleteReview is called on
// a review that has already been submitted (APPROVED, REQUEST_CHANGES, COMMENT).
// GitHub only allows deletion of PENDING reviews. Callers that need to replace
// a submitted review should use DismissReview instead.
var ErrCannotDeleteSubmittedReview = errors.New("cannot delete submitted review: use DismissReview instead")
// ErrConflictingCommitIDs is returned when PostReview receives comments with
// differing non-empty CommitIDs. The GitHub API accepts only a single commit_id
// per review submission; callers must ensure all comments target the same commit.
var ErrConflictingCommitIDs = errors.New("comments contain conflicting commit IDs: all must target the same commit")
// postReviewRequest is the GitHub API request body for creating a review.
type postReviewRequest struct {
CommitID string `json:"commit_id,omitempty"`
Body string `json:"body"`
Event string `json:"event"`
Comments []reviewCommentEntry `json:"comments,omitempty"`
}
// reviewCommentEntry is a single inline comment in a review creation request.
type reviewCommentEntry struct {
Path string `json:"path"`
Position int `json:"position"`
Body string `json:"body"`
}
// reviewResponse is the GitHub API response for a review.
type reviewResponse struct {
ID int64 `json:"id"`
Body string `json:"body"`
State string `json:"state"`
CommitID string `json:"commit_id"`
User struct {
Login string `json:"login"`
} `json:"user"`
}
// dismissReviewRequest is the GitHub API request body for dismissing a review.
type dismissReviewRequest struct {
Message string `json:"message"`
Event string `json:"event"`
}
// translateGitHubReviewState translates a GitHub API review state to the
// canonical vcs.Review.State value.
func translateGitHubReviewState(state string) string {
switch state {
case "CHANGES_REQUESTED":
return "REQUEST_CHANGES"
case "COMMENTED":
return "COMMENT"
default:
// States like APPROVED, DISMISSED, and PENDING pass through unchanged
// as they already match the canonical vcs representation. PENDING appears
// on draft reviews that have not yet been submitted via the GitHub UI or API.
return state
}
}
// PostReview submits a review on a pull request.
//
// The vcs.ReviewEvent constants (ReviewEventApprove, ReviewEventRequestChanges,
// ReviewEventComment) have string values that match GitHub's wire-format event
// strings (APPROVE, REQUEST_CHANGES, COMMENT), so Event is cast directly to
// string without translation.
//
// ReviewComment.Position maps directly to the GitHub API position field.
// When req.Comments is empty, the payload omits the comments field entirely
// (via the omitempty tag on postReviewRequest.Comments).
//
// The GitHub API accepts a single commit_id per review submission. PostReview
// extracts it from the first comment with a non-empty CommitID. If any subsequent
// comment specifies a different CommitID, PostReview returns ErrConflictingCommitIDs.
// Comments with an empty CommitID are allowed and inherit the review-level value.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, req vcs.ReviewRequest) (*vcs.Review, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := postReviewRequest{
Body: req.Body,
Event: string(req.Event),
CommitID: req.CommitID,
}
// Build the comments in one pass. Inline comment CommitIDs must match the
// review-level CommitID; reject if any disagree.
for _, comment := range req.Comments {
if comment.CommitID != "" {
if payload.CommitID == "" {
payload.CommitID = comment.CommitID
} else if payload.CommitID != comment.CommitID {
return nil, ErrConflictingCommitIDs
}
}
payload.Comments = append(payload.Comments, reviewCommentEntry{
Path: comment.Path,
Position: comment.Position,
Body: comment.Body,
})
}
data, err := json.Marshal(payload)
if err != nil {
return nil, fmt.Errorf("marshal review request: %w", err)
}
body, err := c.doRequestWithBody(ctx, http.MethodPost, reqURL, data)
if err != nil {
return nil, fmt.Errorf("post review: %w", err)
}
var resp reviewResponse
if err := json.Unmarshal(body, &resp); err != nil {
return nil, fmt.Errorf("parse review response: %w", err)
}
return &vcs.Review{
ID: resp.ID,
Body: resp.Body,
User: vcs.UserInfo{Login: resp.User.Login},
State: translateGitHubReviewState(resp.State),
CommitID: resp.CommitID,
}, nil
}
// ListReviews retrieves all reviews for a pull request.
// GitHub review states are translated to canonical vcs values.
func (c *Client) ListReviews(ctx context.Context, owner, repo string, number int) ([]vcs.Review, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list reviews: %w", err)
}
var responses []reviewResponse
if err := json.Unmarshal(body, &responses); err != nil {
return nil, fmt.Errorf("parse reviews response: %w", err)
}
reviews := make([]vcs.Review, len(responses))
for i, r := range responses {
reviews[i] = vcs.Review{
ID: r.ID,
Body: r.Body,
User: vcs.UserInfo{Login: r.User.Login},
State: translateGitHubReviewState(r.State),
CommitID: r.CommitID,
}
}
return reviews, nil
}
// DeleteReview deletes a pull request review.
// Only PENDING reviews can be deleted; attempting to delete a submitted review
// (APPROVED, CHANGES_REQUESTED, or COMMENTED per GitHub API naming) returns
// ErrCannotDeleteSubmittedReview.
func (c *Client) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, reviewID)
// nil body: the GitHub DELETE endpoint for reviews requires no request body.
_, err := c.doRequestWithBody(ctx, http.MethodDelete, reqURL, nil)
if err != nil {
var apiErr *APIError
if errors.As(err, &apiErr) && apiErr.StatusCode == 422 {
return fmt.Errorf("delete review: %w", ErrCannotDeleteSubmittedReview)
}
return fmt.Errorf("delete review: %w", err)
}
return nil
}
// DismissReview dismisses a submitted review on a pull request.
// This is the correct way to "remove" a submitted review (APPROVED, REQUEST_CHANGES).
// GitHub does not allow deleting submitted reviews — they must be dismissed.
func (c *Client) DismissReview(ctx context.Context, owner, repo string, number int, reviewID int64, message string) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews/%d/dismissals",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, reviewID)
payload := dismissReviewRequest{
Message: message,
// Event is required by the GitHub API for dismissal requests, even though
// "DISMISS" is the only valid value for this endpoint.
Event: "DISMISS",
}
data, err := json.Marshal(payload)
if err != nil {
return fmt.Errorf("marshal dismiss request: %w", err)
}
_, err = c.doRequestWithBody(ctx, http.MethodPut, reqURL, data)
if err != nil {
return fmt.Errorf("dismiss review: %w", err)
}
return nil
}
+1 -1
View File
@@ -2,4 +2,4 @@ module gitea.weiker.me/rodin/review-bot
go 1.26.2 go 1.26.2
require github.com/goccy/go-yaml v1.19.2 require gopkg.in/yaml.v3 v3.0.1
+4 -2
View File
@@ -1,2 +1,4 @@
github.com/goccy/go-yaml v1.19.2 h1:PmFC1S6h8ljIz6gMRBopkjP1TVT7xuwrButHID66PoM= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
github.com/goccy/go-yaml v1.19.2/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+8 -9
View File
@@ -16,17 +16,16 @@ import (
// Integration test requires a running Gitea instance and LLM endpoint. // Integration test requires a running Gitea instance and LLM endpoint.
// Set environment variables: // Set environment variables:
// // INTEGRATION_GITEA_URL - Gitea base URL
// INTEGRATION_VCS_URL - VCS base URL // INTEGRATION_GITEA_TOKEN - Gitea API token with repo access
// INTEGRATION_GITEA_TOKEN - Gitea API token with repo access // INTEGRATION_GITEA_REPO - owner/repo with an open PR
// INTEGRATION_GITEA_REPO - owner/repo with an open PR // INTEGRATION_PR_NUMBER - PR number to test against
// INTEGRATION_PR_NUMBER - PR number to test against // INTEGRATION_LLM_BASE_URL - LLM API base URL
// INTEGRATION_LLM_BASE_URL - LLM API base URL // INTEGRATION_LLM_API_KEY - LLM API key
// INTEGRATION_LLM_API_KEY - LLM API key // INTEGRATION_LLM_MODEL - Model name
// INTEGRATION_LLM_MODEL - Model name
func TestIntegration_FullReviewFlow(t *testing.T) { func TestIntegration_FullReviewFlow(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL") giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN") giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO") giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER") prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
+27 -5
View File
@@ -2,6 +2,7 @@ package review
import ( import (
"fmt" "fmt"
"regexp"
"strings" "strings"
) )
@@ -22,10 +23,29 @@ func GiteaEvent(verdict string) string {
} }
} }
// markdownSpecialChars matches characters that have special meaning in Markdown.
// We escape these to prevent untrusted input from breaking formatting.
// Uses a quoted string since raw strings can't contain backticks.
var markdownSpecialChars = regexp.MustCompile("([\\\\*_`\\[\\]()#<>|~])")
// sanitizeMarkdownText escapes special Markdown characters in untrusted text.
// This prevents markdown injection attacks where a malicious display name could
// break formatting, inject links, or create unexpected rendering.
func sanitizeMarkdownText(s string) string {
// First, remove any control characters and null bytes
cleaned := strings.Map(func(r rune) rune {
if r < 32 && r != '\t' && r != '\n' {
return -1 // drop the character
}
return r
}, s)
// Escape special Markdown characters by prepending backslash
return markdownSpecialChars.ReplaceAllString(cleaned, `\$1`)
}
// FormatMarkdownWithDisplay formats a ReviewResult with separate display name and sentinel name. // FormatMarkdownWithDisplay formats a ReviewResult with separate display name and sentinel name.
// Note: displayName is not HTML-escaped as Gitea sanitizes rendered Markdown. // displayName is sanitized to prevent Markdown injection from untrusted remote persona metadata.
// Persona display names are controlled by repo owners (trusted input). // sentinelName is used for the cleanup sentinel comment (machine-readable, not rendered).
// displayName is used for the header title, sentinelName is used for the cleanup sentinel.
// If displayName is empty, sentinelName is used for both. // If displayName is empty, sentinelName is used for both.
func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName string) string { func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName string) string {
var sb strings.Builder var sb strings.Builder
@@ -37,7 +57,8 @@ func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName s
} }
if headerName != "" { if headerName != "" {
title := CapitalizeFirst(headerName) // Sanitize the header name to prevent Markdown injection
title := CapitalizeFirst(sanitizeMarkdownText(headerName))
sb.WriteString(fmt.Sprintf("# %s Review\n\n", title)) sb.WriteString(fmt.Sprintf("# %s Review\n\n", title))
} }
@@ -61,7 +82,8 @@ func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName s
sb.WriteString(fmt.Sprintf("**%s** — %s\n", result.Verdict, result.Recommendation)) sb.WriteString(fmt.Sprintf("**%s** — %s\n", result.Verdict, result.Recommendation))
if sentinelName != "" { if sentinelName != "" {
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", headerName)) // Sanitize headerName for the footer as well
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", sanitizeMarkdownText(headerName)))
// Hidden sentinel for identifying this bot's reviews during cleanup // Hidden sentinel for identifying this bot's reviews during cleanup
sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", sentinelName)) sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", sentinelName))
} }
+68
View File
@@ -214,3 +214,71 @@ func TestFormatMarkdownWithDisplay(t *testing.T) {
} }
}) })
} }
func TestSanitizeMarkdownText(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{
name: "plain text unchanged",
input: "Security Specialist",
want: "Security Specialist",
},
{
name: "escapes asterisks",
input: "**bold** attack",
want: `\*\*bold\*\* attack`,
},
{
name: "escapes brackets for links",
input: "[click me](http://evil.com)",
want: `\[click me\]\(http://evil.com\)`,
},
{
name: "escapes backticks",
input: "`code` injection",
want: "\\`code\\` injection",
},
{
name: "escapes angle brackets",
input: "<script>alert(1)</script>",
want: `\<script\>alert\(1\)\</script\>`,
},
{
name: "escapes hash for headers",
input: "# Fake Header",
want: `\# Fake Header`,
},
{
name: "escapes pipe for tables",
input: "col1 | col2",
want: `col1 \| col2`,
},
{
name: "removes control characters",
input: "hello\x00world\x1f",
want: "helloworld",
},
{
name: "preserves tabs and newlines",
input: "line1\n\tindented",
want: "line1\n\tindented",
},
{
name: "escapes tilde for strikethrough",
input: "~~strikethrough~~",
want: `\~\~strikethrough\~\~`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := sanitizeMarkdownText(tt.input)
if got != tt.want {
t.Errorf("sanitizeMarkdownText(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
+38 -153
View File
@@ -5,15 +5,12 @@ import (
"embed" "embed"
"encoding/json" "encoding/json"
"fmt" "fmt"
"io"
"os" "os"
"sort" "sort"
"strings" "strings"
"unicode/utf8" "unicode/utf8"
"github.com/goccy/go-yaml" "gopkg.in/yaml.v3"
"github.com/goccy/go-yaml/ast"
"github.com/goccy/go-yaml/parser"
) )
//go:embed personas/*.yaml //go:embed personas/*.yaml
@@ -121,7 +118,9 @@ func ListBuiltinPersonas() []string {
default: default:
continue continue
} }
seen[personaName] = true if !seen[personaName] {
seen[personaName] = true
}
} }
names := make([]string, 0, len(seen)) names := make([]string, 0, len(seen))
for name := range seen { for name := range seen {
@@ -143,19 +142,10 @@ func parsePersona(data []byte, source string) (*Persona, error) {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth) err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else { } else {
// Use json.Decoder with DisallowUnknownFields for consistency with // Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's Strict() - both reject unknown fields to catch typos. // YAML's KnownFields(true) - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data)) dec := json.NewDecoder(bytes.NewReader(data))
dec.DisallowUnknownFields() dec.DisallowUnknownFields()
err = dec.Decode(&p) err = dec.Decode(&p)
if err == nil {
// Reject trailing content after the first valid JSON object.
// Without this check, input like `{"name":"x"}garbage` would
// silently succeed because Decoder stops after one object.
var dummy json.RawMessage
if err2 := dec.Decode(&dummy); err2 != io.EOF {
err = fmt.Errorf("unexpected trailing content after JSON object")
}
}
} }
if err != nil { if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err) return nil, fmt.Errorf("parse persona %s: %w", source, err)
@@ -166,179 +156,74 @@ func parsePersona(data []byte, source string) (*Persona, error) {
return &p, nil return &p, nil
} }
// unmarshalYAMLWithDepthLimit unmarshals YAML data with three safety checks: // unmarshalYAMLWithDepthLimit unmarshals YAML data with explicit depth limiting
// - Depth limiting: rejects AST trees exceeding maxDepth to prevent stack exhaustion. // and strict field checking. This protects against stack exhaustion from deeply
// - Multi-document rejection: prevents silent data loss from ignored extra documents. // nested structures and catches typos in field names.
// - Strict field checking: rejects unknown YAML keys to catch typos early. // Multi-document YAML files are rejected to prevent silent data loss.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error { func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
// First pass: parse into AST to check depth limits, node counts, and // First pass: decode into a yaml.Node to check depth limits and node counts.
// multi-document rejection. This prevents stack exhaustion before we // This prevents stack exhaustion before we attempt to decode into structs.
// attempt to decode into structs. var node yaml.Node
file, err := parser.ParseBytes(data, 0) dec := yaml.NewDecoder(bytes.NewReader(data))
if err != nil { if err := dec.Decode(&node); err != nil {
return err return err
} }
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
// The parser returns a single doc with nil body for these cases.
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
return fmt.Errorf("empty YAML document")
}
// Reject multi-document YAML files - silently ignoring additional documents // Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect. // could lead to confusing behavior where users think their changes take effect.
if len(file.Docs) > 1 { var extra yaml.Node
if dec.Decode(&extra) == nil {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed") return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
} }
nodeCount := 0 nodeCount := 0
if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]int), make(map[ast.Node]bool), &nodeCount); err != nil { if err := checkYAMLDepth(&node, 0, maxDepth, MaxYAMLNodes, make(map[*yaml.Node]struct{}), &nodeCount); err != nil {
return err return err
} }
// Second pass: decode with strict field checking enabled. // Second pass: decode with strict field checking enabled.
// Strict() rejects unknown keys, catching typos like "focuss" or "identiy". // KnownFields(true) rejects unknown keys, catching typos like "focuss" or "identiy".
// // We must re-decode from the original data because yaml.Node.Decode() doesn't
// Safety note: goccy/go-yaml's decoder does not expand YAML aliases // support the KnownFields option.
// recursively — it resolves them via the pre-built AST, which our first strictDec := yaml.NewDecoder(bytes.NewReader(data))
// pass already depth-checked. Alias chains that would exceed depth limits strictDec.KnownFields(true)
// are caught above; the decoder merely reads the resolved scalar values. return strictDec.Decode(out)
dec := yaml.NewDecoder(bytes.NewReader(data), yaml.Strict())
return dec.Decode(out)
} }
// checkYAMLDepth recursively checks that YAML AST nodes don't exceed the depth // checkYAMLDepth recursively checks that YAML nodes don't exceed the depth limit
// limit or the total node count limit. It uses two tracking maps: // or the total node count limit. It also detects alias cycles to prevent infinite
// - validated: maps each node to the maximum depth at which it was previously // recursion from crafted YAML with self-referential aliases.
// checked. If a node is revisited at a deeper depth (e.g., via an alias), func checkYAMLDepth(node *yaml.Node, depth, maxDepth, maxNodes int, seen map[*yaml.Node]struct{}, nodeCount *int) error {
// we re-check it to ensure the combined effective depth doesn't exceed limits.
// - visiting: per-path recursion stack for true cycle detection. A node on the
// current path is a cycle (alias loop); we return nil to avoid infinite recursion.
//
// This design prevents the alias depth bypass where an anchored subtree validated
// at a shallow depth could be referenced via alias at a greater depth, effectively
// exceeding MaxYAMLDepth.
func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[ast.Node]int, visiting map[ast.Node]bool, nodeCount *int) error {
if node == nil {
return nil
}
if depth > maxDepth { if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth) return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
} }
// Cycle detection: if we're currently visiting this node on the current
// recursion path, it's a cycle (e.g., alias pointing to an ancestor).
// Return nil to break the cycle without error — cycles are a structural
// property, not a depth violation.
if visiting[node] {
return nil
}
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks. // Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
// Placed after cycle detection but before the depth-aware short-circuit. This means
// nodes revisited at shallower depths (via aliases) are counted each time they are
// encountered — intentional conservative overcounting. This bounds the total work
// performed during validation rather than tracking unique nodes, which is the safer
// security posture for untrusted YAML input.
*nodeCount++ *nodeCount++
if *nodeCount > maxNodes { if *nodeCount > maxNodes {
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes) return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
} }
// Depth-aware short-circuit: skip re-validation only when the current visit // Cycle detection: if we've seen this node before, we're in a cycle.
// depth is the same or shallower than the depth at which this node was if _, ok := seen[node]; ok {
// previously validated. A shallower (or equal) current depth means the return nil // Already validated this subtree, skip to avoid infinite recursion.
// prior, deeper validation already covered any subtree depth violations.
// If the current depth exceeds the previous validation depth (e.g., an alias
// references this node deeper in the tree), we must re-traverse to ensure
// the combined effective depth doesn't exceed maxDepth.
//
// Note: using ast.Node (interface) as map key relies on pointer identity,
// which is correct because all goccy/go-yaml AST node types are pointer
// receivers (*MappingNode, *SequenceNode, etc.), never value types.
if prevDepth, ok := validated[node]; ok && depth <= prevDepth {
return nil
} }
validated[node] = depth seen[node] = struct{}{}
// Mark as visiting (on the current recursion path) for cycle detection. // Handle alias nodes: follow the alias to its anchor target.
visiting[node] = true // Increment depth when following aliases since they expand the effective structure.
defer func() { visiting[node] = false }() if node.Kind == yaml.AliasNode && node.Alias != nil {
return checkYAMLDepth(node.Alias, depth+1, maxDepth, maxNodes, seen, nodeCount)
}
// Walk children based on node type. for _, child := range node.Content {
switch n := node.(type) { if err := checkYAMLDepth(child, depth+1, maxDepth, maxNodes, seen, nodeCount); err != nil {
case *ast.MappingNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.MappingValueNode:
// Both Key and Value are visited at depth+1 relative to this
// MappingValueNode. Since MappingNode visits its MappingValueNode
// children at depth+1 as well, keys and values end up at depth+2
// from the parent MappingNode. This is intentional: it mirrors the
// actual nesting structure (mapping → key-value pair → key/value).
if err := checkYAMLDepth(n.Key, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err return err
} }
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.SequenceNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.AliasNode:
// Follow alias to its target, incrementing depth since aliases expand
// the effective structure.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.AnchorNode:
// Increment depth for anchor values as a conservative measure: the
// anchor definition itself is structural, and treating it as a depth
// level ensures that deeply nested anchors are caught at definition
// time rather than only when referenced via alias. This +1 is
// asymmetric with alias (which also increments) — by design, the
// effective depth budget for anchored-then-aliased content is reduced
// because both the definition site and the reference site each consume
// a level, making deeply nested anchor/alias pairs hit the limit sooner.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.TagNode:
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.MergeKeyNode:
// MergeKeyNode represents the literal "<<" merge key token. It has no
// child nodes — the value side of a merge (e.g., *alias) lives in the
// parent MappingValueNode.Value, which is already recursed into above.
// Explicitly listed here (rather than in the default case) to prevent
// future library changes from silently bypassing depth checks.
default:
// Scalar leaf nodes (StringNode, IntegerNode, FloatNode, BoolNode,
// NullNode, InfinityNode, NanNode, LiteralNode) have no children to
// recurse into.
} }
return nil return nil
} }
// ParsePersonaBytes parses persona data from bytes with a source label for errors.
// This is useful for parsing personas fetched from external sources (e.g., Gitea API)
// without requiring filesystem access. Format is detected by source extension.
// Input is bounded by MaxPersonaFileSize to prevent resource exhaustion.
func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona data from %s exceeds maximum size (%d bytes, limit %d)", source, len(data), MaxPersonaFileSize)
}
return parsePersona(data, source)
}
func validatePersona(p *Persona, source string) error { func validatePersona(p *Persona, source string) error {
if p.Name == "" { if p.Name == "" {
return fmt.Errorf("persona %s: name is required", source) return fmt.Errorf("persona %s: name is required", source)
+41 -222
View File
@@ -7,7 +7,7 @@ import (
"strings" "strings"
"testing" "testing"
"github.com/goccy/go-yaml/ast" "gopkg.in/yaml.v3"
) )
func TestLoadBuiltinPersona(t *testing.T) { func TestLoadBuiltinPersona(t *testing.T) {
@@ -459,14 +459,7 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
path := filepath.Join(dir, "deeply-nested.yaml") path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20). // Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
// Depth accumulation trace for "nested: \n level0: \n level1: ...": // Each level adds 2 to the depth count (key + value mapping).
// - Document root parsed at depth 0
// - Root MappingNode children (MappingValueNodes) visited at depth 1
// - "nested" MappingValueNode: key at depth 2, value at depth 2
// - Each levelN adds depth via MappingValueNode traversal (key + value)
// - Exact depth per level depends on AST structure (MappingNode wrapping),
// but 25 levels reliably exceeds MaxYAMLDepth (20) with comfortable margin.
// The test uses 25 levels rather than exactly 21 to avoid brittleness.
var sb strings.Builder var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n") sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " " indent := " "
@@ -490,35 +483,6 @@ func TestYAMLDeeplyNestedRejection(t *testing.T) {
} }
} }
func TestYAMLEmptyFileRejection(t *testing.T) {
tests := []struct {
name string
content string
}{
{"completely_empty", ""},
{"whitespace_only", " \n\n "},
{"comment_only", "# just a comment\n"},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, tc.name+".yaml")
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for empty YAML input, got nil")
}
if !strings.Contains(err.Error(), "empty YAML document") {
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
}
})
}
}
func TestYAMLFileSizeLimit(t *testing.T) { func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir() dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml") path := filepath.Join(dir, "huge.yaml")
@@ -540,41 +504,41 @@ func TestYAMLFileSizeLimit(t *testing.T) {
func TestYAMLAliasCycleDetection(t *testing.T) { func TestYAMLAliasCycleDetection(t *testing.T) {
// Test that our checkYAMLDepth function handles alias cycles gracefully // Test that our checkYAMLDepth function handles alias cycles gracefully
// by using the visiting map to prevent infinite recursion. // by using the seen map to prevent infinite recursion.
// We test this directly because go-yaml's parser handles most cycles
// at parse time, but we need to ensure our checker is robust.
// Create a node structure where an alias points to a parent node, // Create a node structure where an alias points to a parent node,
// simulating what could happen with crafted input. // simulating what could happen with malicious input that bypasses
parent := &ast.MappingNode{ // go-yaml's cycle detection.
Values: []*ast.MappingValueNode{ parent := &yaml.Node{
{ Kind: yaml.MappingNode,
Key: &ast.StringNode{Value: "name"}, Content: []*yaml.Node{
Value: &ast.StringNode{Value: "test"}, {Kind: yaml.ScalarNode, Value: "name"},
}, {Kind: yaml.ScalarNode, Value: "test"},
{Kind: yaml.ScalarNode, Value: "nested"},
}, },
} }
// Create a child that aliases back to the parent (artificial cycle) // Create a child that aliases back to the parent (artificial cycle)
aliasToParent := &ast.AliasNode{ aliasToParent := &yaml.Node{
Value: parent, Kind: yaml.AliasNode,
Alias: parent,
} }
parent.Values = append(parent.Values, &ast.MappingValueNode{ parent.Content = append(parent.Content, aliasToParent)
Key: &ast.StringNode{Value: "nested"},
Value: aliasToParent,
})
nodeCount := 0 nodeCount := 0
validated := make(map[ast.Node]int) seen := make(map[*yaml.Node]struct{})
visiting := make(map[ast.Node]bool)
// This should NOT hang or stack overflow - cycle detection prevents infinite recursion // This should NOT hang or stack overflow - the seen map prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount) err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
if err != nil { if err != nil {
t.Errorf("unexpected error traversing cyclic structure: %v", err) t.Errorf("unexpected error traversing cyclic structure: %v", err)
} }
// Verify we tracked the parent in the validated map // Verify we tracked the parent in the seen map
if _, ok := validated[parent]; !ok { if _, ok := seen[parent]; !ok {
t.Error("parent node not tracked in validated map") t.Error("parent node not tracked in seen map")
} }
} }
@@ -630,82 +594,36 @@ func TestYAMLNodeCountLimit(t *testing.T) {
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) { func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
// Direct test of cycle detection in checkYAMLDepth by creating // Direct test of cycle detection in checkYAMLDepth by creating
// a node structure with an artificial cycle. // a node structure with an artificial cycle.
node := &ast.MappingNode{ // This tests the seen map logic independent of go-yaml's parsing.
Values: []*ast.MappingValueNode{ node := &yaml.Node{
{ Kind: yaml.MappingNode,
Key: &ast.StringNode{Value: "key"}, Content: []*yaml.Node{
Value: &ast.StringNode{Value: "value"}, {Kind: yaml.ScalarNode, Value: "key"},
}, {Kind: yaml.ScalarNode, Value: "value"},
}, },
} }
// Create a cycle by making a child reference the parent // Create a cycle by making a child reference the parent
cycleChild := &ast.AliasNode{ cycleChild := &yaml.Node{
Value: node, // Points back to the parent Kind: yaml.AliasNode,
Alias: node, // Points back to the parent
} }
node.Values = append(node.Values, &ast.MappingValueNode{ node.Content = append(node.Content,
Key: &ast.StringNode{Value: "cyclic"}, &yaml.Node{Kind: yaml.ScalarNode, Value: "cyclic"},
Value: cycleChild, cycleChild,
}) )
nodeCount := 0 nodeCount := 0
validated := make(map[ast.Node]int) seen := make(map[*yaml.Node]struct{})
visiting := make(map[ast.Node]bool) err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, seen, &nodeCount)
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
// Should complete without infinite recursion due to cycle detection // Should complete without infinite recursion due to cycle detection
if err != nil { if err != nil {
t.Errorf("unexpected error: %v", err) t.Errorf("unexpected error: %v", err)
} }
// The validated map should contain multiple entries // The seen map should contain multiple entries
if len(validated) < 2 { if len(seen) < 2 {
t.Errorf("validated map has %d entries, expected at least 2", len(validated)) t.Errorf("seen map has %d entries, expected at least 2", len(seen))
}
}
func TestYAMLAliasDepthBypass(t *testing.T) {
// Test that an anchored subtree first validated at a shallow depth is
// re-checked when referenced via alias at a deeper position. Without the
// depth-aware validated map, the alias reference would skip re-checking
// and allow the effective nesting to exceed MaxYAMLDepth.
dir := t.TempDir()
path := filepath.Join(dir, "alias-depth-bypass.yaml")
// Build YAML with an anchor at shallow depth containing a subtree near the limit,
// then reference it via alias deep enough that effective depth exceeds MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
// Create the anchored subtree at depth 1 (key level) that nests 15 levels deep.
sb.WriteString("anchor_key: &deep_anchor\n")
for i := 0; i < 15; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 16))
sb.WriteString("leaf: value\n")
// Create a wrapper that nests 6 levels deep, then references the anchor.
// Effective depth at alias target = 6 (wrapper nesting) + 1 (alias) + 15 (subtree) = 22 > 20
sb.WriteString("wrapper:\n")
for i := 0; i < 6; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("n%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 7))
sb.WriteString("alias_ref: *deep_anchor\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for alias depth bypass, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
} }
} }
@@ -858,102 +776,3 @@ identity: test identity
t.Errorf("Name = %q, want %q", p.Name, "test") t.Errorf("Name = %q, want %q", p.Name, "test")
} }
} }
func TestJSONTrailingContentRejected(t *testing.T) {
tests := []struct {
name string
content string
}{
{
name: "trailing garbage after object",
content: `{"name":"test","identity":"test identity"}garbage`,
},
{
name: "two JSON objects",
content: `{"name":"test","identity":"test identity"}{"name":"other"}`,
},
{
name: "trailing array",
content: `{"name":"test","identity":"test identity"}[]`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for trailing content, got nil")
}
if !strings.Contains(err.Error(), "trailing content") {
t.Errorf("error = %q, want to contain 'trailing content'", err.Error())
}
})
}
}
func TestParsePersonaBytesSizeLimit(t *testing.T) {
// ParsePersonaBytes should reject input exceeding MaxPersonaFileSize
oversized := make([]byte, MaxPersonaFileSize+1)
for i := range oversized {
oversized[i] = 'x'
}
_, err := ParsePersonaBytes(oversized, "oversized.yaml")
if err == nil {
t.Fatal("expected error for oversized input, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want to contain 'exceeds maximum size'", err.Error())
}
// Just under the limit should not trigger size error (may fail parse, but not size)
underLimit := []byte("name: test\nidentity: test persona\n")
p, err := ParsePersonaBytes(underLimit, "valid.yaml")
if err != nil {
t.Fatalf("unexpected error for valid input: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestYAMLMergeKeyDepthCheck(t *testing.T) {
// Verify that YAML merge keys (<<: *alias) are properly handled by the
// depth checker. The merge key content is in the MappingValueNode.Value
// (an AliasNode), not in the MergeKeyNode itself.
p, err := ParsePersonaBytes([]byte("name: merge-test\nidentity: test\n"), "merge.yaml")
if err != nil {
t.Fatalf("basic parse failed: %v", err)
}
if p.Name != "merge-test" {
t.Errorf("Name = %q, want %q", p.Name, "merge-test")
}
// Test that deeply nested merge keys still hit depth limit.
// Build YAML with merge key content nested beyond MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: deep-merge\nidentity: deep merge persona\n")
sb.WriteString("anchor: &deep\n")
indent := " "
for i := 0; i < MaxYAMLDepth+5; i++ {
sb.WriteString(indent)
sb.WriteString(fmt.Sprintf("level%d:\n", i))
indent += " "
}
sb.WriteString(indent + "leaf: value\n")
sb.WriteString("target:\n <<: *deep\n")
_, err = ParsePersonaBytes([]byte(sb.String()), "deep-merge.yaml")
if err == nil {
t.Fatal("expected error for deeply nested merge key content, got nil")
}
if !strings.Contains(err.Error(), "depth") {
t.Errorf("error = %q, want to contain 'depth'", err.Error())
}
}
+171
View File
@@ -0,0 +1,171 @@
package review
import (
"context"
"fmt"
"log/slog"
"sort"
"strings"
)
// PersonaFetcher abstracts fetching files from a remote repository.
// This allows persona loading to work with any Git host API.
type PersonaFetcher interface {
// ListContents returns file/directory entries at a path.
// Returns an error if the path doesn't exist or isn't accessible.
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
// GetFileContent returns the raw content of a file from the default branch.
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
}
// ContentEntry represents a file or directory entry.
type ContentEntry struct {
Name string // filename or directory name
Path string // full path from repo root
Type string // "file" or "dir"
}
// DefaultPersonasPath is the conventional location for repo-specific personas.
const DefaultPersonasPath = ".review-bot/personas"
// LoadRemotePersonas fetches personas from a remote repository's .review-bot/personas/ directory.
// Returns a map of persona name to Persona. If the directory doesn't exist or is empty,
// returns an empty map with no error (graceful fallback to built-in personas).
//
// Files larger than MaxPersonaFileSize are logged and skipped.
// Invalid YAML files are logged and skipped (partial success model).
// Only .yaml and .yml files are processed; other files are ignored.
func LoadRemotePersonas(ctx context.Context, fetcher PersonaFetcher, owner, repo string) (map[string]*Persona, error) {
return LoadRemotePersonasFromPath(ctx, fetcher, owner, repo, DefaultPersonasPath)
}
// LoadRemotePersonasFromPath loads personas from a custom path in a remote repository.
// It behaves the same as LoadRemotePersonas but allows specifying a path other than
// the default .review-bot/personas directory.
func LoadRemotePersonasFromPath(ctx context.Context, fetcher PersonaFetcher, owner, repo, path string) (map[string]*Persona, error) {
entries, err := fetcher.ListContents(ctx, owner, repo, path)
if err != nil {
// 404 is expected when repo doesn't have personas — return empty, not error
if isNotFoundError(err) {
slog.Debug("no remote personas directory found", "repo", fmt.Sprintf("%s/%s", owner, repo), "path", path)
return map[string]*Persona{}, nil
}
return nil, fmt.Errorf("list remote personas: %w", err)
}
// Cap the number of files to process to prevent resource exhaustion
// from repos with thousands of small files.
const maxPersonaFiles = 50
result := make(map[string]*Persona)
processed := 0
for _, entry := range entries {
if processed >= maxPersonaFiles {
slog.Warn("persona file limit reached", "limit", maxPersonaFiles, "repo", fmt.Sprintf("%s/%s", owner, repo))
break
}
if ctx.Err() != nil {
return nil, ctx.Err()
}
// Skip directories and non-YAML files
if entry.Type != "file" {
continue
}
if !isYAMLFile(entry.Name) {
continue
}
content, err := fetcher.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not fetch remote persona file", "file", entry.Path, "error", err)
continue
}
// Check size before parsing (defense in depth)
if len(content) > MaxPersonaFileSize {
slog.Warn("remote persona file exceeds size limit", "file", entry.Path, "size", len(content), "limit", MaxPersonaFileSize)
continue
}
// YAML parsing uses parsePersona which has defenses against YAML DoS attacks:
// - MaxPersonaFileSize (above) caps raw input size before any parsing
// - maxPersonaFiles (above) limits the number of files processed per repo
// - unmarshalYAMLWithDepthLimit enforces MaxYAMLDepth to prevent stack exhaustion
// - checkYAMLDepth tracks node counts (MaxYAMLNodes) against "billion laughs" expansion
// - Alias cycles are detected and capped by seen-node tracking
// See persona.go for the implementation details.
persona, err := parsePersona([]byte(content), entry.Path)
if err != nil {
slog.Warn("could not parse remote persona file", "file", entry.Path, "error", err)
continue
}
result[persona.Name] = persona
processed++
slog.Debug("loaded remote persona", "name", persona.Name, "file", entry.Path)
}
return result, nil
}
// MergePersonas combines remote and built-in personas.
// Remote personas take precedence on name collision.
// Returns the merged map and a list of persona names in sorted order.
func MergePersonas(remote, builtin map[string]*Persona) (map[string]*Persona, []string) {
merged := make(map[string]*Persona)
// Add built-in first
for name, p := range builtin {
merged[name] = p
}
// Remote overrides built-in on collision
for name, p := range remote {
if _, exists := merged[name]; exists {
slog.Debug("remote persona overrides built-in", "name", name)
}
merged[name] = p
}
// Collect sorted names
names := make([]string, 0, len(merged))
for name := range merged {
names = append(names, name)
}
sort.Strings(names)
return merged, names
}
// LoadAllBuiltinPersonas loads all built-in personas into a map.
func LoadAllBuiltinPersonas() map[string]*Persona {
result := make(map[string]*Persona)
for _, name := range ListBuiltinPersonas() {
p, err := LoadBuiltinPersona(name)
if err != nil {
slog.Warn("could not load built-in persona", "name", name, "error", err)
continue
}
result[name] = p
}
return result
}
// isYAMLFile returns true if the filename has a YAML extension.
func isYAMLFile(name string) bool {
lower := strings.ToLower(name)
return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
}
// isNotFoundError checks if an error indicates a 404 response.
// This is a simple string check to avoid importing the gitea package
// (which would create a circular dependency).
func isNotFoundError(err error) bool {
if err == nil {
return false
}
errStr := err.Error()
return strings.Contains(errStr, "HTTP 404")
}
+394
View File
@@ -0,0 +1,394 @@
package review
import (
"context"
"errors"
"testing"
)
// mockFetcher implements PersonaFetcher for testing.
type mockFetcher struct {
contents map[string][]ContentEntry // path -> entries
files map[string]string // path -> content
listErr error // error to return from ListContents
getFileErr map[string]error // path -> error for GetFileContent
listNotFound bool // return 404-style error
}
func newMockFetcher() *mockFetcher {
return &mockFetcher{
contents: make(map[string][]ContentEntry),
files: make(map[string]string),
getFileErr: make(map[string]error),
}
}
func (m *mockFetcher) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
if m.listNotFound {
return nil, errors.New("HTTP 404: not found")
}
if m.listErr != nil {
return nil, m.listErr
}
entries, ok := m.contents[path]
if !ok {
return nil, errors.New("HTTP 404: not found")
}
return entries, nil
}
func (m *mockFetcher) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if err, ok := m.getFileErr[filepath]; ok {
return "", err
}
content, ok := m.files[filepath]
if !ok {
return "", errors.New("HTTP 404: file not found")
}
return content, nil
}
func TestLoadRemotePersonas_NoDirectory(t *testing.T) {
fetcher := newMockFetcher()
fetcher.listNotFound = true
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("expected no error for missing directory, got: %v", err)
}
if len(result) != 0 {
t.Errorf("expected empty map, got %d personas", len(result))
}
}
func TestLoadRemotePersonas_EmptyDirectory(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{}
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 0 {
t.Errorf("expected empty map, got %d personas", len(result))
}
}
func TestLoadRemotePersonas_SinglePersona(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
}
fetcher.files[".review-bot/personas/trading.yaml"] = `
name: trading
display_name: Trading Expert
identity: You are a trading systems expert.
focus:
- order execution
- market data
`
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 1 {
t.Fatalf("expected 1 persona, got %d", len(result))
}
if result["trading"] == nil {
t.Fatal("expected 'trading' persona")
}
if result["trading"].DisplayName != "Trading Expert" {
t.Errorf("expected display name 'Trading Expert', got %q", result["trading"].DisplayName)
}
}
func TestLoadRemotePersonas_MultiplePersonas(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "one.yaml", Path: ".review-bot/personas/one.yaml", Type: "file"},
{Name: "two.yml", Path: ".review-bot/personas/two.yml", Type: "file"},
}
fetcher.files[".review-bot/personas/one.yaml"] = `
name: one
identity: First persona.
`
fetcher.files[".review-bot/personas/two.yml"] = `
name: two
identity: Second persona.
`
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 2 {
t.Fatalf("expected 2 personas, got %d", len(result))
}
if result["one"] == nil || result["two"] == nil {
t.Error("expected both personas to be loaded")
}
}
func TestLoadRemotePersonas_SkipsNonYAML(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "readme.md", Path: ".review-bot/personas/readme.md", Type: "file"},
{Name: "config.json", Path: ".review-bot/personas/config.json", Type: "file"},
}
fetcher.files[".review-bot/personas/valid.yaml"] = `
name: valid
identity: Valid persona.
`
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 1 {
t.Fatalf("expected 1 persona (skipping non-YAML), got %d", len(result))
}
}
func TestLoadRemotePersonas_SkipsDirectories(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
}
fetcher.files[".review-bot/personas/valid.yaml"] = `
name: valid
identity: Valid persona.
`
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 1 {
t.Fatalf("expected 1 persona (skipping dir), got %d", len(result))
}
}
func TestLoadRemotePersonas_SkipsInvalidYAML(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
}
fetcher.files[".review-bot/personas/valid.yaml"] = `
name: valid
identity: Valid persona.
`
fetcher.files[".review-bot/personas/invalid.yaml"] = `
this is not valid yaml: [unclosed bracket
`
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 1 {
t.Fatalf("expected 1 persona (skipping invalid), got %d", len(result))
}
if result["valid"] == nil {
t.Error("expected valid persona to be loaded")
}
}
func TestLoadRemotePersonas_SkipsOversizedFiles(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
}
// Create content larger than MaxPersonaFileSize (64KB)
fetcher.files[".review-bot/personas/huge.yaml"] = `
name: huge
identity: ` + string(make([]byte, MaxPersonaFileSize+1000))
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 0 {
t.Errorf("expected 0 personas (oversized file skipped), got %d", len(result))
}
}
func TestLoadRemotePersonas_SkipsFetchErrors(t *testing.T) {
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "error.yaml", Path: ".review-bot/personas/error.yaml", Type: "file"},
}
fetcher.files[".review-bot/personas/valid.yaml"] = `
name: valid
identity: Valid persona.
`
fetcher.getFileErr[".review-bot/personas/error.yaml"] = errors.New("network error")
result, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(result) != 1 {
t.Fatalf("expected 1 persona (skipping error), got %d", len(result))
}
}
func TestLoadRemotePersonas_ListContentsError(t *testing.T) {
fetcher := newMockFetcher()
fetcher.listErr = errors.New("server error")
_, err := LoadRemotePersonas(context.Background(), fetcher, "owner", "repo")
if err == nil {
t.Fatal("expected error for list contents failure")
}
}
func TestLoadRemotePersonas_ContextCancellation(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel() // Cancel immediately
fetcher := newMockFetcher()
fetcher.contents[DefaultPersonasPath] = []ContentEntry{
{Name: "one.yaml", Path: ".review-bot/personas/one.yaml", Type: "file"},
}
fetcher.files[".review-bot/personas/one.yaml"] = `
name: one
identity: One.
`
_, err := LoadRemotePersonas(ctx, fetcher, "owner", "repo")
if err == nil {
t.Fatal("expected context cancellation error")
}
}
func TestMergePersonas_NoOverlap(t *testing.T) {
remote := map[string]*Persona{
"trading": {Name: "trading", Identity: "Trading expert."},
}
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Security expert."},
}
merged, names := MergePersonas(remote, builtin)
if len(merged) != 2 {
t.Fatalf("expected 2 personas, got %d", len(merged))
}
if len(names) != 2 {
t.Fatalf("expected 2 names, got %d", len(names))
}
// Names should be sorted
if names[0] != "security" || names[1] != "trading" {
t.Errorf("expected sorted names [security, trading], got %v", names)
}
}
func TestMergePersonas_RemoteOverridesBuiltin(t *testing.T) {
remote := map[string]*Persona{
"security": {Name: "security", Identity: "Custom security expert."},
}
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Default security expert."},
}
merged, _ := MergePersonas(remote, builtin)
if merged["security"].Identity != "Custom security expert." {
t.Errorf("expected remote to override builtin, got identity: %q", merged["security"].Identity)
}
}
func TestMergePersonas_EmptyRemote(t *testing.T) {
remote := map[string]*Persona{}
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Security."},
}
merged, names := MergePersonas(remote, builtin)
if len(merged) != 1 {
t.Fatalf("expected 1 persona, got %d", len(merged))
}
if names[0] != "security" {
t.Errorf("expected 'security', got %q", names[0])
}
}
func TestMergePersonas_EmptyBuiltin(t *testing.T) {
remote := map[string]*Persona{
"trading": {Name: "trading", Identity: "Trading."},
}
builtin := map[string]*Persona{}
merged, names := MergePersonas(remote, builtin)
if len(merged) != 1 {
t.Fatalf("expected 1 persona, got %d", len(merged))
}
if names[0] != "trading" {
t.Errorf("expected 'trading', got %q", names[0])
}
}
func TestLoadAllBuiltinPersonas(t *testing.T) {
personas := LoadAllBuiltinPersonas()
// Should load at least the known built-in personas
expected := []string{"architect", "docs", "security"}
for _, name := range expected {
if personas[name] == nil {
t.Errorf("expected built-in persona %q to be loaded", name)
}
}
}
func TestIsYAMLFile(t *testing.T) {
tests := []struct {
name string
expected bool
}{
{"test.yaml", true},
{"test.yml", true},
{"test.YAML", true},
{"test.YML", true},
{"test.json", false},
{"test.md", false},
{"yaml", false},
{"", false},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
if got := isYAMLFile(tc.name); got != tc.expected {
t.Errorf("isYAMLFile(%q) = %v, want %v", tc.name, got, tc.expected)
}
})
}
}
func TestIsNotFoundError(t *testing.T) {
tests := []struct {
name string
err error
expected bool
}{
{"nil error", nil, false},
{"HTTP 404", errors.New("HTTP 404: not found"), true},
{"not found text", errors.New("path not found"), false},
{"server error", errors.New("server error"), false},
{"HTTP 500", errors.New("HTTP 500: internal error"), false},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
if got := isNotFoundError(tc.err); got != tc.expected {
t.Errorf("isNotFoundError(%v) = %v, want %v", tc.err, got, tc.expected)
}
})
}
}
-150
View File
@@ -1,150 +0,0 @@
package review
import (
"context"
"log/slog"
"strings"
)
// RepoPersonaPath is the directory path where repo-specific personas are stored.
const RepoPersonaPath = ".review-bot/personas"
// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.
// This interface allows for easier testing and decouples the review package from gitea.
type GiteaClient interface {
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
}
// ContentEntry represents a file or directory entry from the contents API.
// This mirrors gitea.ContentEntry to avoid import cycles.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory.
// Returns an empty map (not nil) if the directory doesn't exist or is empty.
// Individual parse failures are logged and skipped; the remaining personas are still returned.
// Auth errors and other non-404 errors are propagated.
// Files exceeding MaxPersonaFileSize are rejected to prevent resource exhaustion.
func LoadRepoPersonas(ctx context.Context, client GiteaClient, owner, repo string) (map[string]*Persona, error) {
result := make(map[string]*Persona)
entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath)
if err != nil {
// Check if this is a 404 (directory doesn't exist) - expected case
if isNotFoundError(err) {
slog.Debug("no repo personas directory found", "repo", owner+"/"+repo)
return result, nil
}
// Other errors (auth, server) should propagate
return nil, err
}
if len(entries) == 0 {
slog.Debug("repo personas directory is empty", "repo", owner+"/"+repo)
return result, nil
}
for _, entry := range entries {
if entry.Type != "file" {
continue
}
// Only process YAML files
if !isYAMLFile(entry.Name) {
continue
}
content, err := client.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not fetch repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
// Enforce size limit before parsing to prevent resource exhaustion
if len(content) > MaxPersonaFileSize {
slog.Warn("repo persona file exceeds maximum size",
"file", entry.Path,
"repo", owner+"/"+repo,
"size", len(content),
"max", MaxPersonaFileSize)
continue
}
persona, err := ParsePersonaBytes([]byte(content), entry.Path)
if err != nil {
slog.Warn("could not parse repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
result[persona.Name] = persona
slog.Debug("loaded repo persona",
"name", persona.Name,
"file", entry.Path,
"repo", owner+"/"+repo)
}
return result, nil
}
// MergePersonas combines built-in personas with repo personas.
// Repo personas take precedence on name collision.
// Returns a new map; inputs are not modified.
func MergePersonas(builtin, repo map[string]*Persona) map[string]*Persona {
result := make(map[string]*Persona, len(builtin)+len(repo))
// Copy built-in personas first
for name, p := range builtin {
result[name] = p
}
// Overlay repo personas (override on collision)
for name, p := range repo {
if _, exists := result[name]; exists {
slog.Debug("repo persona overrides built-in", "name", name)
}
result[name] = p
}
return result
}
// GetBuiltinPersonasMap returns all built-in personas as a map keyed by name.
// Returns an empty map (not nil) if loading fails.
func GetBuiltinPersonasMap() map[string]*Persona {
result := make(map[string]*Persona)
for _, name := range ListBuiltinPersonas() {
p, err := LoadBuiltinPersona(name)
if err != nil {
slog.Warn("could not load built-in persona", "name", name, "error", err)
continue
}
result[name] = p
}
return result
}
// isYAMLFile checks if a filename has a YAML extension.
func isYAMLFile(name string) bool {
lower := strings.ToLower(name)
return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
}
// isNotFoundError checks if an error represents a 404 response.
// This uses a specific "HTTP 404" substring match rather than a generic "not found"
// match to avoid masking authentication failures or transport errors that might
// contain "not found" in their message.
func isNotFoundError(err error) bool {
if err == nil {
return false
}
return strings.Contains(err.Error(), "HTTP 404")
}
-443
View File
@@ -1,443 +0,0 @@
package review
import (
"context"
"errors"
"strings"
"testing"
)
func TestParsePersonaBytes(t *testing.T) {
tests := []struct {
name string
data string
source string
wantName string
wantErr string
}{
{
name: "valid yaml",
data: `name: test
identity: test identity
focus:
- testing
`,
source: "test.yaml",
wantName: "test",
},
{
name: "missing name",
data: "identity: test\n",
source: "test.yaml",
wantErr: "name is required",
},
{
name: "invalid yaml",
data: "not: valid:\n yaml: [broken",
source: "test.yaml",
wantErr: "parse",
},
{
name: "json format by extension",
data: `{"name": "jsontest", "identity": "json identity"}`,
source: "test.json",
wantName: "jsontest",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p, err := ParsePersonaBytes([]byte(tt.data), tt.source)
if tt.wantErr != "" {
if err == nil {
t.Fatalf("expected error containing %q, got nil", tt.wantErr)
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if p.Name != tt.wantName {
t.Errorf("Name = %q, want %q", p.Name, tt.wantName)
}
})
}
}
// mockGiteaClient implements GiteaClient for testing.
type mockGiteaClient struct {
contents map[string][]ContentEntry // path -> entries
files map[string]string // path -> content
listErr error
fileErr map[string]error // path -> error
}
func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
if m.listErr != nil {
return nil, m.listErr
}
entries, ok := m.contents[path]
if !ok {
return nil, errors.New("list contents .review-bot/personas: HTTP 404: not found")
}
return entries, nil
}
func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if m.fileErr != nil {
if err, ok := m.fileErr[filepath]; ok {
return "", err
}
}
content, ok := m.files[filepath]
if !ok {
return "", errors.New("HTTP 404: file not found")
}
return content, nil
}
func TestLoadRepoPersonas(t *testing.T) {
ctx := context.Background()
t.Run("directory not found returns empty map", func(t *testing.T) {
client := &mockGiteaClient{} // No contents configured -> 404
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if personas == nil {
t.Error("expected empty map, got nil")
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("empty directory returns empty map", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {},
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("loads valid personas", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
{Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/trading.yaml": `name: trading
display_name: Trading Expert
identity: You are a trading expert.
focus:
- order handling
- risk management
`,
".review-bot/personas/crypto.yaml": `name: crypto
display_name: Crypto Expert
identity: You are a cryptography expert.
focus:
- key management
- encryption
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 2 {
t.Fatalf("expected 2 personas, got %d", len(personas))
}
if personas["trading"] == nil {
t.Error("expected trading persona")
}
if personas["crypto"] == nil {
t.Error("expected crypto persona")
}
if personas["trading"].DisplayName != "Trading Expert" {
t.Errorf("trading display name = %q, want %q", personas["trading"].DisplayName, "Trading Expert")
}
})
t.Run("skips invalid persona files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/valid.yaml": `name: valid
identity: Valid persona
`,
".review-bot/personas/invalid.yaml": "not valid yaml: [broken",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the valid one, skip the invalid
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas))
}
if personas["valid"] == nil {
t.Error("expected valid persona")
}
})
t.Run("skips non-yaml files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"},
{Name: "notes.txt", Path: ".review-bot/personas/notes.txt", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
".review-bot/personas/README.md": "# Personas\n\nPut your personas here.",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (yaml only), got %d", len(personas))
}
})
t.Run("skips subdirectories", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (files only), got %d", len(personas))
}
})
t.Run("propagates auth errors", func(t *testing.T) {
client := &mockGiteaClient{
listErr: errors.New("HTTP 401: unauthorized"),
}
_, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err == nil {
t.Fatal("expected error for auth failure")
}
if !strings.Contains(err.Error(), "401") {
t.Errorf("error = %q, want containing '401'", err.Error())
}
})
t.Run("skips files that fail to fetch", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"},
{Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/good.yaml": `name: good
identity: Good persona
`,
},
fileErr: map[string]error{
".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"),
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped failed fetch), got %d", len(personas))
}
})
t.Run("skips oversized files", func(t *testing.T) {
// Create a content string that exceeds MaxPersonaFileSize (64KB)
oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1)
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"},
{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/normal.yaml": `name: normal
identity: Normal sized persona
`,
".review-bot/personas/huge.yaml": oversizedContent,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the normal one, skip the oversized
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas))
}
if personas["normal"] == nil {
t.Error("expected normal persona")
}
})
}
func TestMergePersonas(t *testing.T) {
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Built-in security"},
"docs": {Name: "docs", Identity: "Built-in docs"},
}
repo := map[string]*Persona{
"security": {Name: "security", Identity: "Repo security override"},
"trading": {Name: "trading", Identity: "Repo trading"},
}
merged := MergePersonas(builtin, repo)
t.Run("repo overrides builtin on collision", func(t *testing.T) {
if merged["security"].Identity != "Repo security override" {
t.Errorf("security identity = %q, want repo override", merged["security"].Identity)
}
})
t.Run("builtin preserved when no collision", func(t *testing.T) {
if merged["docs"].Identity != "Built-in docs" {
t.Errorf("docs identity = %q, want built-in", merged["docs"].Identity)
}
})
t.Run("repo-only persona added", func(t *testing.T) {
if merged["trading"] == nil {
t.Error("expected trading persona from repo")
}
if merged["trading"].Identity != "Repo trading" {
t.Errorf("trading identity = %q, want repo", merged["trading"].Identity)
}
})
t.Run("original maps not modified", func(t *testing.T) {
if builtin["trading"] != nil {
t.Error("builtin map was modified")
}
if len(repo) != 2 {
t.Error("repo map was modified")
}
})
}
func TestGetBuiltinPersonasMap(t *testing.T) {
personas := GetBuiltinPersonasMap()
if len(personas) == 0 {
t.Fatal("expected at least one built-in persona")
}
// Verify expected personas exist
expected := []string{"security", "architect", "docs"}
for _, name := range expected {
if personas[name] == nil {
t.Errorf("expected built-in persona %q", name)
}
}
// Verify personas are valid
for name, p := range personas {
if p.Name != name {
t.Errorf("persona %q has mismatched name %q", name, p.Name)
}
if p.Identity == "" {
t.Errorf("persona %q has empty identity", name)
}
}
}
func TestIsYAMLFile(t *testing.T) {
tests := []struct {
name string
want bool
}{
{"test.yaml", true},
{"test.yml", true},
{"test.YAML", true},
{"test.YML", true},
{"test.json", false},
{"test.md", false},
{"test.txt", false},
{"yaml", false},
{"yaml.md", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := isYAMLFile(tt.name); got != tt.want {
t.Errorf("isYAMLFile(%q) = %v, want %v", tt.name, got, tt.want)
}
})
}
}
func TestIsNotFoundError(t *testing.T) {
tests := []struct {
err error
want bool
}{
{nil, false},
{errors.New("HTTP 404: not found"), true},
{errors.New("HTTP 404"), true},
// Intentionally false: generic "not found" could mask auth/transport errors.
// Only explicit HTTP 404 responses should be treated as "directory doesn't exist".
{errors.New("something not found"), false},
{errors.New("HTTP 401: unauthorized"), false},
{errors.New("connection refused"), false},
}
for _, tt := range tests {
name := "nil"
if tt.err != nil {
name = tt.err.Error()
}
t.Run(name, func(t *testing.T) {
if got := isNotFoundError(tt.err); got != tt.want {
t.Errorf("isNotFoundError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
-78
View File
@@ -1,78 +0,0 @@
// Package vcs defines shared types used across VCS client implementations
// (gitea, github). Keeping them here prevents the client packages from
// importing each other or duplicating definitions.
package vcs
// ReviewEvent is the verdict submitted with a pull request review.
type ReviewEvent string
const (
// ReviewEventApprove approves the pull request.
ReviewEventApprove ReviewEvent = "APPROVE"
// ReviewEventRequestChanges requests changes before the PR can be merged.
ReviewEventRequestChanges ReviewEvent = "REQUEST_CHANGES"
// ReviewEventComment leaves a comment review without explicit approval or rejection.
ReviewEventComment ReviewEvent = "COMMENT"
)
// UserInfo holds the identity of a user.
type UserInfo struct {
Login string
}
// ReviewComment is a single inline comment attached to a pull request review.
type ReviewComment struct {
// Path is the file path the comment applies to.
Path string
// Position is the line position within the diff (1-indexed).
// GitHub and Gitea differ in how they compute position; callers must
// supply the correct value for the target VCS.
Position int
// Body is the text content of the comment.
Body string
// CommitID is the SHA of the commit the comment applies to.
// For GitHub, all comments within a single review must target the same commit.
CommitID string
}
// ReviewRequest is the payload for submitting a pull request review.
type ReviewRequest struct {
// Body is the top-level review comment body.
Body string
// Event is the review verdict.
Event ReviewEvent
// CommitID is the commit SHA that this review targets.
// When non-empty, the review is anchored to this commit.
// For GitHub, this is passed as commit_id in the review request.
// For Gitea, this is passed as the commitID parameter to PostReview.
CommitID string
// Comments are optional inline file-level comments.
Comments []ReviewComment
}
// Review represents a submitted pull request review.
type Review struct {
// ID is the provider-assigned review identifier.
ID int64
// Body is the top-level review comment body.
Body string
// User is the author of the review.
User UserInfo
// State is the canonical review state string.
// Values: APPROVE, REQUEST_CHANGES, COMMENT, DISMISSED, PENDING.
State string
// CommitID is the commit SHA the review was evaluated against.
CommitID string
}