Compare commits

..

12 Commits

Author SHA1 Message Date
Rodin 44c80c36cf fix: use bedrock-2023-05-31 for AI Core Anthropic version
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 38s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m11s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m52s
SAP AI Core's Anthropic endpoint uses AWS Bedrock API format, not native
Anthropic API. Verified via integration testing:
- 2023-06-01: Invalid API version
- bedrock-2023-05-31: Works ✓
2026-05-09 23:48:21 -07:00
Rodin f71f26fcff fix: remove anthropic_version from body - AI Core rejects it
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 17s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m37s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m2s
2026-05-09 23:45:11 -07:00
Rodin 8da8fca19d fix: add omitempty to model field so it's not sent when empty
CI / test (pull_request) Successful in 15s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 16s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m25s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m39s
2026-05-09 23:39:22 -07:00
Rodin b12df1a636 test: update Anthropic test to check anthropic_version instead of model
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 16s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m33s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m50s
2026-05-09 23:36:42 -07:00
Rodin d13e062866 fix: omit model field from AI Core Anthropic request
CI / test (pull_request) Failing after 10s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Has been skipped
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Has been skipped
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Has been skipped
AI Core deployment URL already specifies the model. Sending model in the body
causes 'Extra inputs are not permitted' error.
2026-05-09 23:28:22 -07:00
Rodin b76270c21b fix: put anthropic_version in request body, not header
CI / test (pull_request) Successful in 12s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 17s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m50s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m43s
SAP AI Core expects anthropic_version in the JSON body, not as a header.
2026-05-09 23:23:42 -07:00
Rodin b92a968d93 fix: add anthropic-version header for AI Core Anthropic endpoint
CI / test (pull_request) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 17s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m4s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m9s
2026-05-09 23:19:00 -07:00
Rodin d02c75486e ci: switch to native aicore provider, remove HAI proxy dependency
CI / test (pull_request) Successful in 14s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Failing after 18s
CI / review (gpt-5, security, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m10s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m44s
2026-05-09 23:14:46 -07:00
Rodin 34507dd9ff fix: propagate LLM timeout to AI Core client
CI / test (pull_request) Successful in 15s
CI / review (/anthropic/v1, anthropic--claude-4.6-sonnet, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Successful in 32s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m58s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m16s
Address review feedback:

MAJOR:
- AICoreClient now defaults to 5min timeout (matching Client)
- Add AICoreClient.WithTimeout() for explicit timeout control
- Client.WithAICore() inherits parent client's timeout
- Client.WithTimeout() propagates to aicore client if set

MINOR:
- Extract AICoreOpenAIAPIVersion constant for the hardcoded api-version
- Tighten IsAnthropicModel to only match 'anthropic--' prefix
  (SAP AI Core always uses this prefix for Anthropic models)

NIT:
- Use fmt.Sprintf for token generation in tests (robust for >9 calls)
- Add error checking in TestAICoreClient_TokenExpiry
- Add tests for WithTimeout propagation
2026-05-09 22:29:19 -07:00
Rodin a62b791b9e ci: use SAP AI Core model names and remove unavailable models
CI / test (pull_request) Successful in 15s
CI / review (/anthropic/v1, anthropic--claude-4.6-sonnet, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m10s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m21s
- Use 'anthropic--claude-4.6-sonnet' instead of 'claude-sonnet-4-6'
  (hai-aicore proxy requires the 'anthropic--' prefix)
- Remove gpt-4.1, gpt-4.1-mini, gpt-5-mini reviewers since those models
  are not deployed in SAP AI Core
- Keep sonnet, gpt-5, and security reviews
2026-05-09 21:58:31 -07:00
Rodin c3ec44a87b chore: retrigger CI after LLM_BASE_URL fix
CI / test (pull_request) Successful in 14s
CI / review (/openai/v1, gpt-4.1-mini, gpt41-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 16s
CI / review (/anthropic/v1, claude-sonnet-4-6, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Failing after 17s
CI / review (/openai/v1, gpt-4.1, gpt41, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 16s
CI / review (/openai/v1, gpt-5-mini, gpt5-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 13s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m17s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 2m2s
2026-05-09 21:53:51 -07:00
claw cf453504cb feat: add native SAP AI Core support
CI / test (pull_request) Successful in 13s
CI / review (/anthropic/v1, claude-sonnet-4-6, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Failing after 12s
CI / review (/openai/v1, gpt-4.1, gpt41, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 12s
CI / review (/openai/v1, gpt-4.1-mini, gpt41-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 14s
CI / review (/openai/v1, gpt-5-mini, gpt5-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Failing after 11s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m12s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m28s
Adds a new 'aicore' LLM provider that authenticates directly with SAP AI Core
using OAuth client credentials. This eliminates the need for an external proxy
(hai-aicore or hai) when running review-bot in SAP environments.

Key changes:
- New llm/aicore.go with AICoreClient for OAuth token management and
  deployment discovery
- Thread-safe token caching with automatic refresh before expiry
- Automatic routing to /invoke (Anthropic) or /chat/completions (OpenAI)
  based on model name
- New CLI flags: --aicore-client-id, --aicore-client-secret, --aicore-auth-url,
  --aicore-api-url, --aicore-resource-group
- Updated action.yml with corresponding inputs
- Comprehensive test coverage for AI Core client

Example usage in CI:
  llm-provider: aicore
  llm-model: anthropic--claude-4.6-sonnet
  aicore-client-id: ${{ secrets.AICORE_CLIENT_ID }}
  aicore-client-secret: ${{ secrets.AICORE_CLIENT_SECRET }}
  aicore-auth-url: ${{ secrets.AICORE_AUTH_URL }}
  aicore-api-url: ${{ secrets.AICORE_API_URL }}

Closes #49
2026-05-08 17:49:26 -07:00
46 changed files with 313 additions and 8471 deletions
+20 -337
View File
@@ -1,43 +1,17 @@
# This composite action supports both Gitea Actions and GitHub Actions runners.
# It detects the VCS host type by checking whether github.api_url is set
# (present on GitHub.com and GHES runners, absent on Gitea runners) and uses
# the appropriate releases API for version resolution and binary download
# (REST API on GitHub, direct URLs on Gitea).
#
# Security notes:
# - On GitHub/GHES (VCS_TYPE=github), inputs.vcs-url is IGNORED to prevent
# token exfiltration. API calls use github.api_url; downloads use
# github.server_url. Tokens are never sent to user-supplied URLs.
# - On Gitea (VCS_TYPE=gitea), inputs.vcs-url is validated (https scheme,
# no whitespace/newlines, and DNS resolution to a public IP) before use.
# Python3 resolves the hostname and rejects RFC1918, RFC6598 (carrier-grade
# NAT), loopback, link-local, and other reserved addresses to prevent SSRF attacks.
# The installed review-bot binary additionally uses a safe HTTP transport
# (DialContext-level IP check) for all Gitea API calls at runtime.
# The binary also exposes a `validate-url` subcommand for use in any future
# shell steps that need to validate a URL before passing it to curl.
# - action-repo is validated against owner/repo pattern.
# - Tokens are passed via masked environment variables, not step outputs.
#
# This composite action is designed for Gitea Actions runners.
# Gitea Actions supports GitHub Actions syntax including $GITHUB_OUTPUT,
# actions/cache, and actions/checkout.
# Requirements: python3, sha256sum, curl (all present on ubuntu-* runners).
name: 'AI Code Review'
description: 'Run AI-powered code review on a pull request using review-bot'
inputs:
vcs-url:
description: 'VCS server URL (only used on Gitea runners; ignored on GitHub/GHES). Defaults to server_url.'
gitea-url:
description: 'Gitea instance URL (defaults to server_url)'
required: false
default: ''
repo:
description: 'Repository to review (owner/name, defaults to current)'
required: false
default: ''
action-repo:
description: 'Repository hosting review-bot releases (owner/name). Defaults to github.action_repository or rodin/review-bot.'
required: false
default: ''
action-repo-token:
description: 'Token for downloading release assets from action-repo (defaults to github.token on GitHub, reviewer-token on Gitea). Required for private repos.'
description: 'Repository (owner/name, defaults to current)'
required: false
default: ''
pr-number:
@@ -45,7 +19,7 @@ inputs:
required: false
default: ''
reviewer-token:
description: 'Token for posting the review'
description: 'Gitea token for posting the review'
required: true
reviewer-name:
description: 'Display name for the reviewer'
@@ -122,14 +96,6 @@ inputs:
description: 'Local file with additional system prompt instructions (e.g. security review focus)'
required: false
default: ''
persona:
description: 'Built-in persona name (security, architect, docs)'
required: false
default: ''
persona-file:
description: 'Path to custom persona JSON file'
required: false
default: ''
runs:
using: 'composite'
@@ -138,325 +104,45 @@ runs:
id: version
shell: bash
run: |
set -euo pipefail
# --- Input Validation ---
# Determine the repo hosting review-bot releases (not the repo being reviewed)
ACTION_REPO="${{ inputs.action-repo }}"
if [ -z "$ACTION_REPO" ]; then
# github.action_repository is the repo containing the running action
ACTION_REPO="${{ github.action_repository }}"
fi
if [ -z "$ACTION_REPO" ]; then
# Final fallback for Gitea (which may not set action_repository)
ACTION_REPO="rodin/review-bot"
echo "::notice::action-repo not specified and github.action_repository is empty; falling back to rodin/review-bot"
fi
# Validate ACTION_REPO matches owner/repo pattern (prevent path traversal)
if ! printf '%s' "$ACTION_REPO" | grep -qE '^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$'; then
echo "Error: action-repo '${ACTION_REPO}' does not match expected owner/repo format" >&2
exit 1
fi
# Detect VCS host type using github.api_url context.
# github.api_url is set on GitHub.com (https://api.github.com) and GHES
# (https://<host>/api/v3). It is empty/unset on Gitea Actions runners.
GITHUB_API_URL="${{ github.api_url }}"
if [ -n "$GITHUB_API_URL" ]; then
VCS_TYPE="github"
else
VCS_TYPE="gitea"
fi
# Determine SERVER_URL based on VCS type.
# SECURITY: On GitHub/GHES, ALWAYS use github.server_url — never trust
# inputs.vcs-url to prevent token exfiltration to attacker-controlled hosts.
if [ "$VCS_TYPE" = "github" ]; then
SERVER_URL="${{ github.server_url }}"
if [ -n "${{ inputs.vcs-url }}" ]; then
echo "::warning::inputs.vcs-url is ignored on GitHub/GHES runners (VCS_TYPE=github). Using github.server_url instead."
fi
else
SERVER_URL="${{ inputs.vcs-url || github.server_url }}"
fi
# Strip trailing slash if present
SERVER_URL="${SERVER_URL%/}"
# Validate SERVER_URL for Gitea path: must be https, no whitespace/newlines.
# The [^[:space:]] class already rejects newlines, so no separate newline check needed.
if [ "$VCS_TYPE" = "gitea" ]; then
if ! printf '%s' "$SERVER_URL" | grep -qE '^https://[^[:space:]]+$'; then
echo "Error: SERVER_URL '${SERVER_URL}' must be an https:// URL with no whitespace" >&2
exit 1
fi
# Additional IP-level SSRF defense: resolve the hostname and reject
# requests to RFC1918, RFC6598 (carrier-grade NAT), loopback, link-local,
# and other reserved addresses.
# python3 is required on ubuntu-* runners (see requirements comment above).
# Use printf to write the script to a temp file so the python lines are valid
# YAML (each indented line becomes a printf argument — no unindented code).
# SERVER_URL is passed via CHECK_URL env var, never interpolated into python code.
printf '%s\n' \
'import socket,ipaddress,sys,os' \
'from urllib.parse import urlparse' \
'u=os.environ["CHECK_URL"]; parsed=urlparse(u)' \
'if parsed.username or parsed.password:' \
' print("Error: URL contains user-info — not allowed",file=sys.stderr); sys.exit(2)' \
'h=parsed.hostname' \
'(print("Error: no hostname",file=sys.stderr) or sys.exit(2)) if not h else None' \
'try: rs=socket.getaddrinfo(h,None)' \
'except socket.gaierror as e: print(f"DNS error: {e}",file=sys.stderr); sys.exit(1)' \
'if not rs: print("Error: no addresses",file=sys.stderr); sys.exit(1)' \
'for _,_,_,_,(a,*_) in rs:' \
' ip=ipaddress.ip_address(a)' \
' if isinstance(ip,ipaddress.IPv6Address) and ip.ipv4_mapped: ip=ip.ipv4_mapped' \
' cgn=ipaddress.ip_network("100.64.0.0/10")' \
' if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_multicast or ip.is_reserved or ip in cgn:' \
' print(f"blocked: {a}",file=sys.stderr); sys.exit(1)' \
> /tmp/_ssrf_check.py
CHECK_URL="${SERVER_URL}" python3 /tmp/_ssrf_check.py || {
echo "Error: SERVER_URL '${SERVER_URL}' resolves to a private/reserved IP address" >&2
exit 1
}
fi
# Determine auth token for release API requests
ACTION_TOKEN="${{ inputs.action-repo-token }}"
if [ -z "$ACTION_TOKEN" ]; then
if [ "$VCS_TYPE" = "github" ]; then
ACTION_TOKEN="${{ github.token }}"
else
ACTION_TOKEN="${{ inputs.reviewer-token }}"
fi
fi
# Validate token contains no control characters (defense-in-depth against header injection)
if [ -n "$ACTION_TOKEN" ]; then
if printf '%s' "$ACTION_TOKEN" | LC_ALL=C grep -q '[^[:print:]]'; then
echo "Error: ACTION_TOKEN contains control characters" >&2
exit 1
fi
fi
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
if [ "${{ inputs.version }}" = "latest" ]; then
if [ "$VCS_TYPE" = "github" ]; then
# SECURITY: Use github.api_url which is a trusted platform-provided value.
# Never construct API URLs from user-supplied inputs on GitHub.
API_URL="${GITHUB_API_URL}/repos/${ACTION_REPO}/releases?per_page=1"
else
# Gitea API — SERVER_URL was validated above
API_URL="${SERVER_URL}/api/v1/repos/${ACTION_REPO}/releases?limit=1"
fi
# Fetch latest version with inline auth header (no intermediate variable)
if [ -n "$ACTION_TOKEN" ]; then
if [ "$VCS_TYPE" = "github" ]; then
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" "$API_URL" \
VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
else
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: token ${ACTION_TOKEN}" "$API_URL" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
fi
else
VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 "$API_URL" \
| python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
fi
if [ -z "$VERSION" ]; then
echo "Failed to determine latest version from ${API_URL}" >&2
echo "Failed to determine latest version" >&2
exit 1
fi
else
VERSION="${{ inputs.version }}"
fi
# Validate VERSION: no slashes or whitespace (prevent path traversal).
# [:space:] includes newlines and carriage returns in POSIX.
if printf '%s' "$VERSION" | grep -qE '[/[:space:]]'; then
echo "Error: VERSION '${VERSION}' contains invalid characters (newline, slash, or whitespace)" >&2
exit 1
fi
# Detect OS and architecture for platform-specific binary download
OS_RAW=$(uname -s | tr '[:upper:]' '[:lower:]')
case "$OS_RAW" in
linux) OS="linux" ;;
darwin) OS="darwin" ;;
*)
echo "Error: unsupported OS: $(uname -s)" >&2
exit 1
;;
esac
RAW_ARCH=$(uname -m)
case "$RAW_ARCH" in
x86_64) ARCH="amd64" ;;
aarch64 | arm64) ARCH="arm64" ;;
*)
echo "Error: unsupported architecture: $RAW_ARCH" >&2
exit 1
;;
esac
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
echo "os=${OS}" >> "$GITHUB_OUTPUT"
echo "arch=${ARCH}" >> "$GITHUB_OUTPUT"
echo "action_repo=${ACTION_REPO}" >> "$GITHUB_OUTPUT"
echo "server_url=${SERVER_URL}" >> "$GITHUB_OUTPUT"
echo "vcs_type=${VCS_TYPE}" >> "$GITHUB_OUTPUT"
# SECURITY: Pass token via masked environment variable instead of step output.
# Step outputs can leak in debug logs; GITHUB_ENV with masking is safer.
if [ -n "$ACTION_TOKEN" ]; then
echo "::add-mask::${ACTION_TOKEN}"
echo "ACTION_TOKEN=${ACTION_TOKEN}" >> "$GITHUB_ENV"
fi
- name: Cache review-bot binary
id: cache
uses: actions/cache@v4
with:
path: ${{ runner.temp }}/review-bot
key: review-bot-${{ steps.version.outputs.os }}-${{ steps.version.outputs.arch }}-${{ steps.version.outputs.version }}
key: review-bot-linux-amd64-${{ steps.version.outputs.version }}
- name: Install review-bot
if: steps.cache.outputs.cache-hit != 'true'
shell: bash
run: |
set -euo pipefail
SERVER_URL="${{ steps.version.outputs.server_url }}"
ACTION_REPO="${{ steps.version.outputs.action_repo }}"
GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
REPO="${{ inputs.repo || 'rodin/review-bot' }}"
VERSION="${{ steps.version.outputs.version }}"
VCS_TYPE="${{ steps.version.outputs.vcs_type }}"
OS="${{ steps.version.outputs.os }}"
ARCH="${{ steps.version.outputs.arch }}"
# Read token from masked environment variable (set in Determine version step)
# Falls back to empty if not set (public repos don't need auth)
ACTION_TOKEN="${ACTION_TOKEN:-}"
BINARY="review-bot-${OS}-${ARCH}"
BINARY="review-bot-linux-amd64"
# SECURITY: Re-validate SERVER_URL at the start of this step to mitigate DNS
# rebinding attacks. A DNS TTL expiry between "Determine version" and here
# could allow an attacker to change the resolved IP to a private/reserved
# address, causing curl to send ACTION_TOKEN to an internal host.
# Only needed on Gitea path (VCS_TYPE=gitea); GitHub/GHES uses platform-controlled URLs.
if [ "$VCS_TYPE" = "gitea" ]; then
printf '%s\n' \
'import socket,ipaddress,sys,os' \
'from urllib.parse import urlparse' \
'u=os.environ["CHECK_URL"]; parsed=urlparse(u)' \
'if parsed.username or parsed.password:' \
' print("Error: URL contains user-info — not allowed",file=sys.stderr); sys.exit(2)' \
'h=parsed.hostname' \
'(print("Error: no hostname",file=sys.stderr) or sys.exit(2)) if not h else None' \
'try: rs=socket.getaddrinfo(h,None)' \
'except socket.gaierror as e: print(f"DNS error: {e}",file=sys.stderr); sys.exit(1)' \
'if not rs: print("Error: no addresses",file=sys.stderr); sys.exit(1)' \
'for _,_,_,_,(a,*_) in rs:' \
' ip=ipaddress.ip_address(a)' \
' if isinstance(ip,ipaddress.IPv6Address) and ip.ipv4_mapped: ip=ip.ipv4_mapped' \
' cgn=ipaddress.ip_network("100.64.0.0/10")' \
' if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_multicast or ip.is_reserved or ip in cgn:' \
' print(f"blocked: {a}",file=sys.stderr); sys.exit(1)' \
> /tmp/_ssrf_check_install.py
CHECK_URL="${SERVER_URL}" python3 /tmp/_ssrf_check_install.py || {
echo "Error: SERVER_URL '${SERVER_URL}' resolves to a private/reserved IP address" >&2
exit 1
}
fi
if [ "$VCS_TYPE" = "github" ]; then
# GitHub/GHES: Use REST API for release asset downloads.
# Web release URLs ({server}/.../releases/download/{tag}/{asset}) redirect
# to S3 and don't reliably support Authorization headers for private repos.
# The REST API endpoint with Accept: application/octet-stream is required.
# GITHUB_API_URL: trusted platform value, same as detected in "Determine version" step.
GITHUB_API_URL="${{ github.api_url }}"
if [ -n "$ACTION_TOKEN" ]; then
RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
else
RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
fi
# Extract asset IDs for binary and checksums
BINARY_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "import sys, json; assets = json.load(sys.stdin).get('assets', []); matches = [a['id'] for a in assets if a['name'] == '${BINARY}']; print(matches[0] if matches else '')")
if [ -z "$BINARY_ASSET_ID" ]; then
echo "Error: could not find asset '${BINARY}' in release ${VERSION}" >&2
exit 1
fi
CHECKSUMS_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "import sys, json; assets = json.load(sys.stdin).get('assets', []); matches = [a['id'] for a in assets if a['name'] == 'checksums.txt']; print(matches[0] if matches else '')")
if [ -z "$CHECKSUMS_ASSET_ID" ]; then
echo "Error: could not find asset 'checksums.txt' in release ${VERSION}" >&2
exit 1
fi
# Download assets via REST API with Accept: application/octet-stream
if [ -n "$ACTION_TOKEN" ]; then
curl -sSfL --connect-timeout 10 --max-time 120 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL --connect-timeout 10 --max-time 30 \
-H "Authorization: Bearer ${ACTION_TOKEN}" \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
-o "${{ runner.temp }}/checksums.txt"
else
curl -sSfL --connect-timeout 10 --max-time 120 \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
-o "${{ runner.temp }}/review-bot"
curl -sSfL --connect-timeout 10 --max-time 30 \
-H "Accept: application/octet-stream" \
"${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
-o "${{ runner.temp }}/checksums.txt"
fi
else
# Gitea: Direct download via web release URLs (Gitea serves assets
# directly without redirects — no -L needed).
# SECURITY: Omitting -L prevents forwarding Authorization header to
# unexpected hosts if Gitea ever introduces CDN redirects.
DOWNLOAD_URL="${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}"
if [ -n "$ACTION_TOKEN" ]; then
curl -sSf --connect-timeout 10 --max-time 120 \
-H "Authorization: token ${ACTION_TOKEN}" \
"${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
curl -sSf --connect-timeout 10 --max-time 30 \
-H "Authorization: token ${ACTION_TOKEN}" \
"${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
else
curl -sSf --connect-timeout 10 --max-time 120 \
"${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
curl -sSf --connect-timeout 10 --max-time 30 \
"${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
fi
fi
# Verify SHA-256 checksum
# NOTE: This verifies integrity (download wasn't corrupted) but not
# authenticity — both binary and checksums come from the same server.
# For stronger guarantees, consider GPG signature verification.
cd "${{ runner.temp }}"
EXPECTED=$(grep -E "^[0-9a-f]+[[:space:]]+\*?${BINARY}$" checksums.txt | awk '{print $1}')
# sha256sum (GNU) is not available on macOS; use shasum -a 256 on darwin.
if [ "${OS}" = "darwin" ]; then
ACTUAL=$(shasum -a 256 review-bot | awk '{print $1}')
else
EXPECTED=$(grep "${BINARY}" checksums.txt | awk '{print $1}')
ACTUAL=$(sha256sum review-bot | awk '{print $1}')
fi
if [ -z "$EXPECTED" ]; then
echo "Error: no checksum found for ${BINARY}" >&2
@@ -470,13 +156,12 @@ runs:
fi
chmod +x "${{ runner.temp }}/review-bot"
echo "Installed review-bot-${OS}-${ARCH} ${VERSION} (checksum verified)"
echo "Installed review-bot ${VERSION} (checksum verified)"
- name: Run review
shell: bash
env:
VCS_URL: ${{ steps.version.outputs.server_url }}
VCS_TYPE: ${{ steps.version.outputs.vcs_type }}
GITEA_URL: ${{ inputs.gitea-url || github.server_url }}
GITEA_REPO: ${{ inputs.repo || github.repository }}
PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ inputs.reviewer-token }}
@@ -492,8 +177,6 @@ runs:
LLM_PROVIDER: ${{ inputs.llm-provider }}
UPDATE_EXISTING: ${{ inputs.update-existing }}
SYSTEM_PROMPT_FILE: ${{ inputs.system-prompt-file }}
PERSONA: ${{ inputs.persona }}
PERSONA_FILE: ${{ inputs.persona-file }}
AICORE_CLIENT_ID: ${{ inputs.aicore-client-id }}
AICORE_CLIENT_SECRET: ${{ inputs.aicore-client-secret }}
AICORE_AUTH_URL: ${{ inputs.aicore-auth-url }}
+4 -8
View File
@@ -19,9 +19,7 @@ jobs:
- run: go build -o review-bot ./cmd/review-bot
# Self-review using native SAP AI Core provider
# Models must match SAP AI Core deployments
# Available models: gpt-5, anthropic--claude-4.6-sonnet, anthropic--claude-4.6-opus
# Removed gpt-4.1, gpt-5-mini, gpt-4.1-mini - not deployed on AI Core
# Models must match SAP AI Core deployments (use 'anthropic--' prefix for Claude)
review:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request'
@@ -38,8 +36,6 @@ jobs:
- name: security
token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: "."
system_prompt_file: SECURITY_REVIEW.md
steps:
- uses: actions/checkout@v4
@@ -49,7 +45,7 @@ jobs:
- run: go build -o review-bot ./cmd/review-bot
- name: Run ${{ matrix.name }} review
env:
VCS_URL: ${{ github.server_url }}
GITEA_URL: ${{ github.server_url }}
GITEA_REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
@@ -62,8 +58,8 @@ jobs:
AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: "CONVENTIONS.md"
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
PATTERNS_REPO: "rodin/go-patterns"
PATTERNS_FILES: "README.md,patterns/"
LLM_TIMEOUT: "600"
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot
-38
View File
@@ -1,38 +0,0 @@
name: PR Ready Gate
on:
pull_request:
types: [synchronize]
jobs:
clear-labels:
runs-on: ubuntu-24.04
# Always run - curl commands are safe if labels don't exist
steps:
- name: Remove ready and self-reviewed labels, reassign to author
env:
GITEA_TOKEN: ${{ secrets.RODIN_TOKEN }}
run: |
PR_NUMBER=${{ github.event.pull_request.number }}
AUTHOR=${{ github.event.pull_request.user.login }}
READY_LABEL_ID=38
SELF_REVIEWED_LABEL_ID=37
# Remove ready label if present
curl -sS -X DELETE \
-H "Authorization: token $GITEA_TOKEN" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/issues/${PR_NUMBER}/labels/${READY_LABEL_ID}" || true
# Remove self-reviewed label if present
curl -sS -X DELETE \
-H "Authorization: token $GITEA_TOKEN" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/issues/${PR_NUMBER}/labels/${SELF_REVIEWED_LABEL_ID}" || true
# Reassign to author
curl -sS -X PATCH \
-H "Authorization: token $GITEA_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"assignees\": [\"${AUTHOR}\"]}" \
"https://gitea.weiker.me/api/v1/repos/${{ github.repository }}/pulls/${PR_NUMBER}"
echo "Cleared ready/self-reviewed labels and reassigned PR #${PR_NUMBER} to ${AUTHOR}"
+1 -19
View File
@@ -2,26 +2,8 @@
## Language & Dependencies
- Go standard library only — no external dependencies.
- Target the latest stable Go release.
- **STRICT ALLOWLIST:** Only packages listed below may be imported. No exceptions.
### Approved Third-Party Packages
| Package | Use Case | Scope |
|---------|----------|-------|
| `github.com/goccy/go-yaml` | YAML parsing and AST inspection (subpkgs: `ast`, `parser`) | production |
| `github.com/google/go-cmp` | Test comparisons (`cmp.Diff`) | test only |
**Any import not in this table or the Go standard library is forbidden.**
Transitive dependencies of approved packages are automatically allowed.
To request a new dependency:
1. Open a PR that ONLY updates this table
2. Requires explicit approval from Aaron
3. After merge, a separate PR may use the package
*Enforcement: `scripts/check-deps.sh` parses this table — update only here.*
## Error Handling
+1 -7
View File
@@ -1,4 +1,4 @@
.PHONY: build test test-integration lint clean coverage check-deps precommit
.PHONY: build test test-integration lint clean coverage
build:
go build -o review-bot ./cmd/review-bot/
@@ -12,15 +12,9 @@ test-integration:
lint:
go vet ./...
check-deps:
@./scripts/check-deps.sh
clean:
rm -f review-bot
coverage:
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out
# Precommit runs all checks required before pushing
precommit: check-deps lint test
-175
View File
@@ -1,175 +0,0 @@
# Plan: Issue #125 — Rename GITEA_URL → VCS_URL
## Problem
The `GITEA_URL` environment variable (and `--gitea-url` flag) implies the binary only works with Gitea.
Now that review-bot supports both Gitea and GitHub/GHES, this name is misleading.
Renaming to `VCS_URL` makes the binary platform-agnostic in its interface.
## Constraints
- Must not break existing users who already use `GITEA_URL` — need a fallback
- The CLI flag `--gitea-url` should also be updated to `--vcs-url` for consistency
- `INTEGRATION_GITEA_URL` in integration tests is a test-only env var, not the binary's interface; but should be updated for clarity
- The action YAML uses `GITEA_URL` as an internal shell variable in bash scripts — distinct from the env var passed to the binary
- All changes must compile and pass existing tests
## Files Affected
### Binary / Go source
| File | Change |
|------|--------|
| `cmd/review-bot/main.go` | Rename `--gitea-url``--vcs-url`, add `VCS_URL` as primary, keep `GITEA_URL` fallback |
| `cmd/review-bot/integration_test.go` | Rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` (test-only, no external compat concern) |
| `integration_test.go` | Same — rename `INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` |
### Action YAML
| File | Change |
|------|--------|
| `.gitea/actions/review/action.yml` | Rename input `gitea-url``vcs-url`; update env var passed to binary: `VCS_URL` instead of `GITEA_URL`; keep internal bash var as `GITEA_URL` (only used for release download, not passed to binary) |
| `.gitea/workflows/ci.yml` | Rename `GITEA_URL` env var to `VCS_URL` in Run review step |
### Documentation
| File | Change |
|------|--------|
| `README.md` | Update CLI example, env var table entry |
## Proposed Approach
### 1. Backward-compatible env var lookup in main.go
Replace:
```go
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
```
With:
```go
giteaURL := flag.String("vcs-url", envOrDefaultFallback("VCS_URL", "GITEA_URL", ""), "VCS server URL (e.g. https://gitea.example.com)")
```
Add a helper:
```go
// envOrDefaultFallback reads primary env var; if empty, falls back to deprecated env var.
func envOrDefaultFallback(primary, deprecated, defaultVal string) string {
if v := os.Getenv(primary); v != "" {
return v
}
if v := os.Getenv(deprecated); v != "" {
slog.Warn("deprecated env var in use; rename to " + primary, "old", deprecated, "new", primary)
return v
}
return defaultVal
}
```
**Note:** This must be called AFTER `setupLogger` conceptually, but the flag default is evaluated at flag registration time. Since `setupLogger` runs before `flag.Parse()`, the slog.Warn will print correctly at runtime. We use `log.Printf` as a fallback if this proves problematic.
Actually — flag defaults are evaluated at registration (line 57), before `setupLogger`. The warning won't go through slog. Two options:
- Use `log.Printf` for the deprecation warning (always visible)
- Move the fallback lookup to after `flag.Parse()`, checking if the parsed value is still empty
**Decision:** Move fallback to a post-parse check. This is cleaner:
```go
vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL")
flag.Parse()
// Backward compat: fall back to deprecated GITEA_URL
if *vcsURL == "" {
if v := os.Getenv("GITEA_URL"); v != "" {
slog.Warn("GITEA_URL is deprecated; use VCS_URL instead")
*vcsURL = v
}
}
```
This is clean, idiomatic, and the warning goes through slog correctly.
### 2. Keep `--gitea-url` as deprecated alias
Add a hidden flag for backward compat:
```go
giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
```
Post-parse:
```go
if *vcsURL == "" && *giteaURLAlias != "" {
slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
*vcsURL = *giteaURLAlias
}
```
### 3. Internal variable rename
Rename `giteaURL` local variable → `vcsURL` throughout `main.go` for consistency.
### 4. Error message update
```go
fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n")
```
### 5. Action YAML changes
In `.gitea/actions/review/action.yml`:
- Input `gitea-url``vcs-url` (with same description, `required: false`, `default: ''`)
- Line 172: `GITEA_URL: ${{ inputs.gitea-url || github.server_url }}``VCS_URL: ${{ inputs.vcs-url || github.server_url }}`
- Lines 115, 140: internal bash vars `GITEA_URL=` are used for downloading binaries — NOT passed to the review-bot binary. Leave them as internal bash vars (they're scope-local in bash). These could be renamed to `SERVER_URL` or `BASE_URL` for local clarity, but renaming them isn't strictly required.
In `.gitea/workflows/ci.yml`:
- Line 52: `GITEA_URL: ${{ github.server_url }}``VCS_URL: ${{ github.server_url }}`
### 6. Integration test updates
`INTEGRATION_GITEA_URL``INTEGRATION_VCS_URL` in both test files.
### 7. README
- CLI example: `--gitea-url``--vcs-url`
- Env var table: `GITEA_URL``VCS_URL`, add note about `GITEA_URL` fallback
## Backward Compatibility Summary
| Old | New | Fallback? |
|-----|-----|-----------|
| `GITEA_URL` env var | `VCS_URL` | ✅ with deprecation warning |
| `--gitea-url` flag | `--vcs-url` | ✅ with deprecation warning |
| `gitea-url` action input | `vcs-url` | ⚠️ No (action version bump handles this) |
| `INTEGRATION_GITEA_URL` | `INTEGRATION_VCS_URL` | N/A (test-only) |
## Error Cases
- Both `VCS_URL` and `GITEA_URL` set: `VCS_URL` wins (primary takes precedence)
- Both `--vcs-url` and `--gitea-url` provided: `--vcs-url` wins
- Neither set: existing "missing required flags" error unchanged
## Edge Cases
- `os.Getenv` returns "" for unset AND set-to-empty — consistent with existing behavior
- The `envOrDefault` helper is unchanged; we add `envOrDefaultFallback` for the one renamed var
## Testing Strategy
- Existing unit tests pass unchanged (they don't test env var parsing directly)
- Integration tests updated to use new env var name
- Manual: `GITEA_URL=https://example.com ./review-bot --repo x --pr 1 ...` should print deprecation warning and proceed
- Manual: `VCS_URL=https://example.com ./review-bot ...` should work silently
## Completion Checklist
1. `VCS_URL` is read first; `GITEA_URL` is fallback with deprecation warning
2. `--vcs-url` flag is primary; `--gitea-url` is deprecated alias with warning
3. Error message references `--vcs-url` not `--gitea-url`
4. `action.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
5. `ci.yml` passes `VCS_URL` (not `GITEA_URL`) to the binary
6. README updated in CLI example and env var table
7. Integration tests use `INTEGRATION_VCS_URL`
8. `go test ./...` passes
9. `go vet ./...` passes
10. `go build ./cmd/review-bot` succeeds
## Open Questions
- Should the CLI flag `--gitea-url` be completely hidden from `--help` or just deprecated with a note? The issue doesn't specify. Decision: keep it visible but add "(deprecated: use --vcs-url)" to the description.
- Should action.yml also add `gitea-url` as a deprecated input alias? The issue says "Update the action to pass the new env var name" — no mention of backward compat for the action input. Decision: rename only, no alias (action users pin a version anyway).
- The bash-internal `GITEA_URL` variable in action.yml scripts (used for release download, not passed to binary) — rename for clarity? Decision: yes, rename to `BASE_URL` to avoid confusion with the env var.
+7 -124
View File
@@ -9,7 +9,7 @@ AI-powered code review bot for Gitea pull requests. Fetches diff + context, send
- **Smart budget**: Automatically trims context to fit model token limits
- **Idempotent reviews**: Posts new review, then cleans up stale ones (one review per bot)
- **Custom prompts**: Load additional instructions from a file (e.g. security-focused review)
- **Minimal dependencies**: Go stdlib + `gopkg.in/yaml.v3` only
- **Zero dependencies**: Go stdlib only
## Quick Start: Composite Action
@@ -207,8 +207,6 @@ AI Core handles OAuth token management and deployment discovery automatically. M
| `patterns-repo` | No | `""` | Comma-separated repos with language patterns (e.g. `rodin/go-patterns`) |
| `patterns-files` | No | `README.md` | Files/directories to fetch from pattern repos |
| `system-prompt-file` | No | `""` | Local file with additional system prompt instructions |
| `persona` | No | `""` | Built-in persona name (security, architect, docs) |
| `persona-file` | No | `""` | Path to persona file (YAML or JSON) with custom review focus |
| `temperature` | No | `0` | LLM temperature (0 = server default) |
| `timeout` | No | `300` | LLM request timeout in seconds |
| `dry-run` | No | `false` | Print review to stdout instead of posting |
@@ -282,10 +280,10 @@ Rules:
```bash
review-bot \
--vcs-url https://gitea.example.com \
--gitea-url https://gitea.example.com \
--repo owner/name \
--pr 42 \
--reviewer-token "$REVIEWER_TOKEN" \
--reviewer-token "$GITEA_TOKEN" \
--reviewer-name "code-review" \
--llm-base-url https://api.openai.com/v1 \
--llm-api-key "$OPENAI_API_KEY" \
@@ -299,9 +297,8 @@ All flags have environment variable equivalents:
| Flag | Env Var |
|------|---------|
| `--vcs-url` | `VCS_URL` (fallback: `GITEA_URL`) |
| `--vcs-type` | `VCS_TYPE` (auto-detected from URL if not set; `gitea` or `github`) |
| `--repo` | `GITEA_REPO` (also accepted: set `GITEA_REPO` for Gitea; VCS-agnostic `REPO` coming) |
| `--gitea-url` | `GITEA_URL` |
| `--repo` | `GITEA_REPO` |
| `--pr` | `PR_NUMBER` |
| `--reviewer-token` | `REVIEWER_TOKEN` |
| `--reviewer-name` | `REVIEWER_NAME` |
@@ -330,12 +327,11 @@ All flags have environment variable equivalents:
### Token Scopes Required
| Scope | Purpose |
|-------|--------|
|-------|---------|
| `write:issue` | Post and delete reviews |
| `write:repository` | Read PR diffs, file content, commit statuses |
| `read:user` | Self-request as reviewer (optional but recommended) |
Without `read:user`, the bot still works but cannot add itself to the PR's reviewer list.
No `read:user` scope needed — the bot identifies itself from the review response.
## Development
@@ -361,116 +357,3 @@ budget/ Token estimation + context trimming
## License
MIT
## Review Personas
Personas provide role-based review specialization. Instead of generic code review, each persona focuses on a specific domain (security, architecture, documentation) with tailored prompts and severity calibration.
### Built-in Personas
| Persona | Focus |
|---------|-------|
| `security` | Vulnerabilities, auth bypass, secrets exposure, injection attacks |
| `architect` | Design patterns, code organization, API contracts, testability |
| `docs` | Documentation quality, API clarity, error messages |
### Using Built-in Personas
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: security
persona: security
llm-model: claude-opus-4-20250514 # Security benefits from strong reasoning
...
```
### Multiple Personas in Parallel
```yaml
jobs:
review:
strategy:
matrix:
include:
- name: security
persona: security
- name: architect
persona: architect
steps:
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: ${{ matrix.name }}
persona: ${{ matrix.persona }}
...
```
Each persona posts independently with its own sentinel, so reviews don't interfere.
### Custom Personas
Create a YAML file with your domain-specific review focus:
```yaml
# .review/personas/trading.yaml
name: trading
display_name: Trading Domain Expert
identity: |
You are a trading systems expert reviewing code for correctness.
Your expertise:
- Order lifecycle and state machines
- Fill handling and partial fills
- Position tracking and P&L calculations
- Event sourcing invariants
focus:
- Order state machine correctness
- Fill handling edge cases (partial, overfill)
- Position and P&L calculation accuracy
- Event replay determinism
- Decimal precision for money
ignore:
- Code style
- General performance
- Documentation formatting
severity:
major: "Bugs that cause incorrect positions, fills, or money calculations"
minor: "Edge cases that could cause issues under unusual conditions"
nit: "Clarity improvements for domain logic"
```
Use it in CI:
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: trading
persona-file: .review/personas/trading.yaml
...
```
YAML is the recommended format for personas because it supports:
- Multi-line strings with `|` blocks (cleaner identity definitions)
- Comments for documentation
- More readable arrays and nested structures
JSON is also supported for backwards compatibility—just use `.json` extension.
### Persona vs system-prompt-file
| Feature | `persona` / `persona-file` | `system-prompt-file` |
|---------|---------------------------|----------------------|
| Replaces base prompt | Yes | No (appends) |
| Structured format | Yes (YAML/JSON) | No (freeform) |
| Focus/ignore lists | Yes | Manual |
| Severity calibration | Yes | Manual |
| Header display name | Yes | No |
| Built-in options | Yes | No |
Use personas for domain-specialized reviews. Use `system-prompt-file` for minor tweaks to the generic review.
+13 -82
View File
@@ -1,88 +1,19 @@
## Dev Loop: review-bot — 2026-05-14 20:10 UTC
## Self-Review: feat/aicore-provider — 2026-05-09
### Latest: ✅ STABLE STATE — REPO HEALTH COMPLETE
- **Last action:** health check; verified tests pass, repo clean, no action needed
- **Repository:** Clean, all merges complete, no open issues/PRs
- **Main branch:** Up to date with origin/main
- **Test suite:** All passing (cached)
### Verdict: PASS
---
No blocking issues found — ready for human review.
## Repository Status
#### Notes (informational, not blocking)
### ✅ Merged to main (recent):
- issue-123 (IP-level SSRF defense) — 6 commits, main at 4440823
- issue-125 (VCS_URL rename + deprecation) — merged
- issue-124 (multi-arch binary support) — merged
- issue-120 (GitHub Actions + VCS abstraction) — merged
- issue-121 (VCS host type detection for binary download) — merged
**[fit]** `staticcheck` reports:
- `llm/aicore.go:237` and `llm/client.go:231`: struct literal conversion style (S1016) — minor style nit, existing in both old and new code
- `gitea/diff.go:78`: HasPrefix return ignored (SA4017) — pre-existing, not introduced by this PR
- `cmd/review-bot/main_test.go:347`: nil Context (SA1012) — pre-existing, not introduced by this PR
### 🧹 Cleanup COMPLETE:
- ✅ Removed old worktrees (issue-123, review-bot-issue-125)
- ✅ Test suite passes (all packages)
- ✅ No TODO/FIXME in code except expected GitHub client notes
- ✅ No open issues or pull requests
- ✅ Dependencies up to date
**[fit]** Body length validation: `aicore.go` does not include the Content-Length vs body length validation that `doRequest` has in `client.go`. This is acceptable because:
1. AI Core uses OAuth tokens which are short-lived, so truncation is less likely
2. The retry logic still applies via "read response" error pattern
3. Adding complexity to aicore.go for an edge case that hasn't manifested is premature
---
## Current Feature Completeness
**Core Capabilities:**
- Multi-provider LLM support (OpenAI, Anthropic, SAP AI Core)
- Gitea PR integration with structured reviews
- SSRF defense with IP-level validation
- VCS abstraction (Gitea/GitHub support)
- Multi-architecture binary support
- GitHub Actions composite action
**Recent Security Work:**
- RFC6598 CGN range detection
- IP fallback dialing for local endpoint rejection
- URL validation for SSRF prevention
**Code Quality:**
- Comprehensive test coverage (all packages tested)
- Consistent error handling with context propagation
- Secure credential handling (unexported fields)
- Concurrency-safe designs
---
## Next Priority Actions
### Phase 2: Feature Exploration (NEXT SESSION)
- Scan code for potential improvements per REVIEW.md findings
- Assess performance under load
- Review REVIEW.md findings for targeted fixes
- Consider backlog items from design docs
### Phase 3: Optional Enhancements (BACKLOG)
- Address REVIEW.md context propagation findings (if prioritized)
- Additional LLM provider support
- Enhanced context detection
- Custom report formats
- Webhook management improvements
---
## Worktrees Status
All old worktrees cleaned up. Ready for new issue work.
---
## Dev-Loop Metadata
- **Repo:** /home/ubuntu/review-bot
- **Main branch SHA:** ed3a5dd (last commit)
- **Cron ID:** 5342ac81-4bbc-4e4c-a123-347a7788d50c
- **Scheduled:** Every 4 hours
- **Last health check:** 2026-05-14 20:10 UTC (✅ all healthy)
## Self-Review: review-bot-issue-130-work — 2026-05-14
### Verdict: NEEDS_WORK
- [x] [completeness] `VCS_TYPE` env var is detected in `main.go` but **not passed from `action.yml`'s `Run review` step** to the binary. The binary falls back to a URL heuristic (`github.com` / `github.concur.com` substrings), which will silently misclassify any GitHub Enterprise Server whose hostname does not contain `github`. The action already has `vcs_type` as a step output — it should be passed as `VCS_TYPE: ${{ steps.version.outputs.vcs_type }}` in the `Run review` env block.
- [x] [completeness] `README.md` still references `GITEA_REPO` and `$GITEA_TOKEN` in the CLI example (line ~288) and the env var table (line ~303). Now that the binary supports GitHub too, these should be renamed to VCS-agnostic names (`VCS_REPO`/`REPO`, `VCS_TOKEN`/`REVIEWER_TOKEN`) or at minimum the env var table should note GitHub support.
- [x] [fit] `vcsReviewComment.NewPosition` field comment says "Gitea: absolute line; GitHub: diff hunk position" but in `githubVCSAdapter.PostReview` it is actually mapped to `Line` (absolute line) + `Side: "RIGHT"`, not to `Position` (hunk position). The field name and comment are now slightly misleading — the field means "new line number" for Gitea and is repurposed as "absolute line" for GitHub. Consider renaming to `Line` or `NewLine` with a clearer comment explaining the per-backend semantics.
- [x] [instinct] `validateurl.go` imports `gitea` package for `gitea.IsBlockedIP` — this creates a dependency from the GitHub-generic `validateurl.go` on the Gitea-specific package. The `IsBlockedIP` function is a general networking utility. Consider moving it to a shared `internal/netutil` package (or keeping it in `gitea` and accepting the coupling), but the current import is an unexpected relationship.
**[completeness]** Tests pass (go test ./...), go vet clean, no uncommitted changes.
+4 -4
View File
@@ -17,7 +17,7 @@ import (
// Integration test requires a running Gitea instance and LLM endpoint.
// Set environment variables:
//
// INTEGRATION_VCS_URL - VCS base URL
// INTEGRATION_GITEA_URL - Gitea base URL
// INTEGRATION_GITEA_TOKEN - Gitea API token with repo access
// INTEGRATION_GITEA_REPO - owner/repo with an open PR
// INTEGRATION_PR_NUMBER - PR number to test against
@@ -25,7 +25,7 @@ import (
// INTEGRATION_LLM_API_KEY - LLM API key
// INTEGRATION_LLM_MODEL - Model name
func TestIntegration_FullReviewFlow(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL")
giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
@@ -104,7 +104,7 @@ func TestIntegration_FullReviewFlow(t *testing.T) {
}
func TestIntegration_PostAndCleanup(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL")
giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
@@ -130,7 +130,7 @@ func TestIntegration_PostAndCleanup(t *testing.T) {
// Post a test review
sentinel := "<!-- review-bot:integration-test -->"
testBody := "# Integration Test Review\n\nThis is a test review.\n\n" + sentinel
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, "", nil)
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, "COMMENT", testBody, nil)
if err != nil {
t.Fatalf("PostReview: %v", err)
}
+66 -278
View File
@@ -4,7 +4,6 @@ import (
"context"
"flag"
"fmt"
"io"
"log/slog"
"os"
"path/filepath"
@@ -14,20 +13,12 @@ import (
"gitea.weiker.me/rodin/review-bot/budget"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/llm"
"gitea.weiker.me/rodin/review-bot/review"
)
var version = "dev"
// outWriter and errWriter are the output and error writers for subcommands.
// They are variables so tests can capture output.
var (
outWriter io.Writer = os.Stdout
errWriter io.Writer = os.Stderr
)
// setupLogger configures the global slog default logger based on format and verbosity.
func setupLogger(format, verbosity string) {
var level slog.Level
@@ -58,22 +49,12 @@ func setupLogger(format, verbosity string) {
}
func main() {
// Dispatch subcommands before flag parsing so they get their own args.
// e.g. `review-bot validate-url <url>`
if len(os.Args) > 1 {
switch os.Args[1] {
case "validate-url":
os.Exit(runValidateURL(os.Args[2:]))
}
}
versionFlag := flag.Bool("version", false, "Print version and exit")
// Logging flags
logFormat := flag.String("log-format", envOrDefault("LOG_FORMAT", "text"), "Log output format: text or json")
verbosity := flag.String("verbosity", envOrDefault("LOG_VERBOSITY", "info"), "Log verbosity: debug, info, warn, error")
// CLI flags
vcsURL := flag.String("vcs-url", os.Getenv("VCS_URL"), "VCS server URL (e.g. https://gitea.example.com)")
giteaURLAlias := flag.String("gitea-url", "", "Deprecated: use --vcs-url")
giteaURL := flag.String("gitea-url", envOrDefault("GITEA_URL", ""), "Gitea instance URL")
repo := flag.String("repo", envOrDefault("GITEA_REPO", ""), "Repository (owner/name)")
prNum := flag.String("pr", envOrDefault("PR_NUMBER", ""), "Pull request number")
reviewerName := flag.String("reviewer-name", envOrDefault("REVIEWER_NAME", ""), "Reviewer display name")
@@ -84,13 +65,11 @@ func main() {
conventionsFile := flag.String("conventions-file", envOrDefault("CONVENTIONS_FILE", ""), "Conventions file path in repo (e.g. CLAUDE.md)")
systemPromptFile := flag.String("system-prompt-file", envOrDefault("SYSTEM_PROMPT_FILE", ""), "Local file with additional system prompt instructions")
patternsRepo := flag.String("patterns-repo", envOrDefault("PATTERNS_REPO", ""), "Repo with language patterns (e.g. rodin/elixir-patterns)")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", ""), "Comma-separated file paths to fetch from patterns repo (empty = all files)")
patternsFiles := flag.String("patterns-files", envOrDefault("PATTERNS_FILES", "README.md"), "Comma-separated file paths to fetch from patterns repo")
dryRun := flag.Bool("dry-run", false, "Print review to stdout instead of posting")
llmTemp := flag.Float64("llm-temperature", envOrDefaultFloat("LLM_TEMPERATURE", 0), "LLM temperature (0 = server default)")
llmTimeout := flag.Int("llm-timeout", envOrDefaultInt("LLM_TIMEOUT", 300), "LLM request timeout in seconds (default 300)")
llmProvider := flag.String("llm-provider", envOrDefault("LLM_PROVIDER", "openai"), "LLM API provider: openai, anthropic, or aicore")
personaName := flag.String("persona", envOrDefault("PERSONA", ""), "Built-in persona name (security, architect, docs)")
personaFile := flag.String("persona-file", envOrDefault("PERSONA_FILE", ""), "Path to persona JSON file")
// AI Core specific flags (only used when provider=aicore)
aicoreClientID := flag.String("aicore-client-id", envOrDefault("AICORE_CLIENT_ID", ""), "SAP AI Core client ID (for provider=aicore)")
aicoreClientSecret := flag.String("aicore-client-secret", envOrDefault("AICORE_CLIENT_SECRET", ""), "SAP AI Core client secret (for provider=aicore)")
@@ -110,24 +89,12 @@ func main() {
slog.Info("review-bot starting", "version", version)
// Backward compatibility: fall back to deprecated env var / flag if VCS_URL / --vcs-url not set.
if *vcsURL == "" {
if v := os.Getenv("GITEA_URL"); v != "" {
slog.Warn("GITEA_URL is deprecated; rename the environment variable to VCS_URL")
*vcsURL = v
}
}
if *vcsURL == "" && *giteaURLAlias != "" {
slog.Warn("--gitea-url is deprecated; use --vcs-url instead")
*vcsURL = *giteaURLAlias
}
// Validate required fields
// For aicore provider, llm-base-url and llm-api-key are not required
isAICore := llm.Provider(*llmProvider) == llm.ProviderAICore
if *vcsURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
if *giteaURL == "" || *repo == "" || *prNum == "" || *reviewerToken == "" || *llmModel == "" {
fmt.Fprintf(os.Stderr, "Error: missing required flags or environment variables\n\n")
fmt.Fprintf(os.Stderr, "Required: --vcs-url, --repo, --pr, --reviewer-token, --llm-model\n")
fmt.Fprintf(os.Stderr, "Required: --gitea-url, --repo, --pr, --reviewer-token, --llm-model\n")
os.Exit(1)
}
if !isAICore && (*llmBaseURL == "" || *llmAPIKey == "") {
@@ -140,14 +107,6 @@ func main() {
os.Exit(1)
}
// Validate persona flags are mutually exclusive
if *personaName != "" && *personaFile != "" {
slog.Error("--persona and --persona-file are mutually exclusive")
os.Exit(1)
}
// NOTE: Persona loading deferred until after Gitea client init to support repo personas
// Validate reviewer-name: only safe characters allowed in sentinel
if err := validateReviewerName(*reviewerName); err != nil {
slog.Error("invalid reviewer name", "error", err)
@@ -170,39 +129,7 @@ func main() {
}
// Initialize clients
// Detect VCS type: explicit flag > env var > URL heuristic (default: gitea).
vcsType := envOrDefault("VCS_TYPE", "")
if vcsType == "" {
// Heuristic: if the URL looks like github.com or a GitHub Enterprise host,
// default to GitHub. The composite action sets VCS_TYPE explicitly, so this
// is a fallback for manual invocations.
if strings.Contains(*vcsURL, "github.com") || strings.Contains(*vcsURL, "github.concur.com") {
vcsType = "github"
} else {
vcsType = "gitea"
}
}
slog.Info("VCS type detected", "vcs_type", vcsType, "vcs_url", *vcsURL)
var vcs vcsClient
switch vcsType {
case "github":
// GitHub: baseURL is the API URL, derived from server URL.
// github.com → https://api.github.com
// GHES (e.g. https://ghe.example.com) → https://ghe.example.com/api/v3
apiURL := githubAPIURL(*vcsURL)
ghClient := github.NewClient(*reviewerToken, apiURL)
vcs = newGithubVCSAdapter(ghClient)
slog.Info("using GitHub VCS client", "api_url", apiURL)
case "gitea":
giteaClient := gitea.NewClient(*vcsURL, *reviewerToken)
vcs = newGiteaVCSAdapter(giteaClient)
slog.Info("using Gitea VCS client", "url", *vcsURL)
default:
slog.Error("unsupported VCS type", "vcs_type", vcsType, "valid", "gitea, github")
os.Exit(1)
}
giteaClient := gitea.NewClient(*giteaURL, *reviewerToken)
llmClient := llm.NewClient(*llmBaseURL, *llmAPIKey, *llmModel)
if *llmTemp < 0 || *llmTemp > 2 {
slog.Error("invalid LLM temperature", "temperature", *llmTemp, "range", "0-2")
@@ -236,47 +163,10 @@ func main() {
ctx, cancel := context.WithTimeout(context.Background(), overallTimeout)
defer cancel()
// Load persona if specified (after Gitea client init to support repo personas)
var persona *review.Persona
if *personaName != "" {
// Try loading from repo first, then fall back to built-in
repoPersonas, err := review.LoadRepoPersonas(ctx, vcs, owner, repoName)
if err != nil {
slog.Warn("could not load repo personas", "repo", owner+"/"+repoName, "error", err)
// Continue with built-in personas only.
// NOTE: repoPersonas is nil here, but map indexing on a nil map is safe in Go
// (returns the zero value), so the fallback to built-in below works correctly.
}
if p, ok := repoPersonas[*personaName]; ok {
persona = p
slog.Info("loaded repo persona", "persona", persona.Name, "display", persona.DisplayName, "repo", owner+"/"+repoName)
} else {
// Fall back to built-in
persona, err = review.LoadBuiltinPersona(*personaName)
if err != nil {
slog.Error("failed to load persona", "persona", *personaName, "error", err)
os.Exit(1)
}
slog.Info("loaded built-in persona", "persona", persona.Name, "display", persona.DisplayName)
}
} else if *personaFile != "" {
resolvedPath, err := validateWorkspacePath(*personaFile, "persona-file")
if err != nil {
slog.Error("invalid persona-file path", "error", err)
os.Exit(1)
}
persona, err = review.LoadPersona(resolvedPath)
if err != nil {
slog.Error("failed to load persona file", "file", *personaFile, "error", err)
os.Exit(1)
}
slog.Info("loaded persona from file", "file", *personaFile, "persona", persona.Name)
}
slog.Info("reviewing pull request", "pr", prNumber, "repo", fmt.Sprintf("%s/%s", owner, repoName))
// Step 1: Fetch PR metadata
pr, err := vcs.GetPullRequest(ctx, owner, repoName, prNumber)
pr, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
slog.Error("failed to fetch PR", "pr", prNumber, "error", err)
os.Exit(1)
@@ -284,7 +174,7 @@ func main() {
slog.Info("fetched PR metadata", "pr", prNumber, "title", pr.Title)
// Step 2: Fetch diff
diff, err := vcs.GetPullRequestDiff(ctx, owner, repoName, prNumber)
diff, err := giteaClient.GetPullRequestDiff(ctx, owner, repoName, prNumber)
if err != nil {
slog.Error("failed to fetch diff", "pr", prNumber, "error", err)
os.Exit(1)
@@ -293,11 +183,11 @@ func main() {
// Step 3: Fetch full file content for modified files
fileContext := ""
files, err := vcs.GetPullRequestFiles(ctx, owner, repoName, prNumber)
files, err := giteaClient.GetPullRequestFiles(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not fetch PR files list", "pr", prNumber, "error", err)
} else {
fileContext = fetchFileContext(ctx, vcs, owner, repoName, pr.Head.Ref, files)
fileContext = fetchFileContext(ctx, giteaClient, owner, repoName, pr.Head.Ref, files)
slog.Debug("fetched file context", "files", len(files))
}
@@ -305,7 +195,7 @@ func main() {
ciPassed := true
ciDetails := ""
if pr.Head.Sha != "" {
statuses, err := vcs.GetCommitStatuses(ctx, owner, repoName, pr.Head.Sha)
statuses, err := giteaClient.GetCommitStatuses(ctx, owner, repoName, pr.Head.Sha)
if err != nil {
slog.Warn("could not fetch CI status", "sha", pr.Head.Sha, "error", err)
} else {
@@ -317,7 +207,7 @@ func main() {
// Step 5: Load conventions file if specified
conventions := ""
if *conventionsFile != "" {
content, err := vcs.GetFileContent(ctx, owner, repoName, *conventionsFile)
content, err := giteaClient.GetFileContent(ctx, owner, repoName, *conventionsFile)
if err != nil {
slog.Warn("could not load conventions file", "file", *conventionsFile, "error", err)
} else {
@@ -329,21 +219,41 @@ func main() {
// Step 6: Load patterns from external repo if specified
patterns := ""
if *patternsRepo != "" {
patterns = fetchPatterns(ctx, vcs, *patternsRepo, *patternsFiles)
patterns = fetchPatterns(ctx, giteaClient, *patternsRepo, *patternsFiles)
slog.Debug("loaded patterns", "repo", *patternsRepo, "bytes", len(patterns))
}
// Step 6b: Load additional system prompt if specified
additionalPrompt := ""
if *systemPromptFile != "" {
resolvedPath, err := validateWorkspacePath(*systemPromptFile, "system-prompt-file")
workspace := os.Getenv("GITHUB_WORKSPACE")
if workspace == "" {
workspace, _ = os.Getwd()
}
absWorkspace, err := filepath.Abs(workspace)
if err != nil {
slog.Error("invalid system-prompt-file path", "error", err)
slog.Error("failed to resolve workspace path", "error", err)
os.Exit(1)
}
promptPath := filepath.Join(absWorkspace, *systemPromptFile)
promptPath = filepath.Clean(promptPath)
if !strings.HasPrefix(promptPath, absWorkspace+string(filepath.Separator)) && promptPath != absWorkspace {
slog.Error("system-prompt-file resolves outside workspace", "path", promptPath, "workspace", absWorkspace)
os.Exit(1)
}
// Resolve symlinks and re-validate to prevent symlink traversal
resolvedPath, err := filepath.EvalSymlinks(promptPath)
if err != nil {
slog.Error("failed to resolve system prompt file", "path", promptPath, "error", err)
os.Exit(1)
}
if !strings.HasPrefix(resolvedPath, absWorkspace+string(filepath.Separator)) && resolvedPath != absWorkspace {
slog.Error("system-prompt-file symlink resolves outside workspace", "resolved", resolvedPath, "workspace", absWorkspace)
os.Exit(1)
}
data, err := os.ReadFile(resolvedPath)
if err != nil {
slog.Error("failed to read system prompt file", "path", *systemPromptFile, "error", err)
slog.Error("failed to read system prompt file", "path", promptPath, "error", err)
os.Exit(1)
}
additionalPrompt = string(data)
@@ -351,13 +261,7 @@ func main() {
}
// Step 7: Budget-aware prompt assembly
var systemBase string
if persona != nil {
systemBase = review.BuildPersonaSystemPrompt(persona)
slog.Debug("using persona system prompt", "persona", persona.Name)
} else {
systemBase = review.BuildSystemBase()
}
systemBase := review.BuildSystemBase()
if additionalPrompt != "" {
systemBase += "\n\n## Additional Review Instructions\n\n" + additionalPrompt
}
@@ -414,12 +318,7 @@ func main() {
slog.Info("review parsed", "verdict", result.Verdict, "findings", len(result.Findings))
// Step 10: Format and post review
var reviewBody string
if persona != nil && persona.DisplayName != "" {
reviewBody = review.FormatMarkdownWithDisplay(result, persona.DisplayName, *reviewerName)
} else {
reviewBody = review.FormatMarkdown(result, *reviewerName)
}
reviewBody := review.FormatMarkdown(result, *reviewerName)
// Add commit footer so readers know which commit was evaluated
if pr.Head.Sha != "" {
@@ -441,32 +340,14 @@ func main() {
sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName)
// Stale check: verify HEAD hasn't moved since we started
evaluatedSHA := pr.Head.Sha
var currentSHA string
currentPR, err := vcs.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not re-fetch PR for stale check", "pr", prNumber, "error", err)
// currentSHA stays empty — shouldSkipStaleReview will return false
} else {
currentSHA = currentPR.Head.Sha
}
if shouldSkipStaleReview(evaluatedSHA, currentSHA) {
slog.Warn("HEAD moved during review — skipping stale review",
"evaluated", evaluatedSHA,
"current", currentSHA,
"pr", prNumber)
return
}
// Map findings to inline comments for lines present in the diff
diffRanges := gitea.ParseDiffNewLines(diff)
var inlineComments []vcsReviewComment
var inlineComments []gitea.ReviewComment
for _, f := range result.Findings {
if f.File != "" && f.Line > 0 && diffRanges.Contains(f.File, f.Line) {
inlineComments = append(inlineComments, vcsReviewComment{
inlineComments = append(inlineComments, gitea.ReviewComment{
Path: f.File,
NewLine: int64(f.Line),
NewPosition: int64(f.Line),
Body: fmt.Sprintf("**[%s]** %s", f.Severity, f.Finding),
})
}
@@ -479,9 +360,9 @@ func main() {
// 1. POST new review first (gets non-stale approval badge on HEAD)
// 2. Then supersede old review with link to the new one
// Order matters: post first so we have the new review's URL for the supersede message.
var oldReviews []vcsReview
var oldReviews []gitea.Review
if *reviewerName != "" {
existingReviews, err := vcs.ListReviews(ctx, owner, repoName, prNumber)
existingReviews, err := giteaClient.ListReviews(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not list existing reviews", "pr", prNumber, "error", err)
} else {
@@ -494,11 +375,11 @@ func main() {
}
// Self-request as reviewer (ensures we appear in required-reviewer checks)
authUser, err := vcs.GetAuthenticatedUser(ctx)
authUser, err := giteaClient.GetAuthenticatedUser(ctx)
if err != nil {
slog.Warn("could not determine authenticated user for reviewer self-request", "error", err)
} else if authUser != "" {
if err := vcs.RequestReviewer(ctx, owner, repoName, prNumber, authUser); err != nil {
if err := giteaClient.RequestReviewer(ctx, owner, repoName, prNumber, authUser); err != nil {
slog.Warn("could not self-request as reviewer", "user", authUser, "error", err)
} else {
slog.Debug("self-requested as reviewer", "user", authUser, "pr", prNumber)
@@ -507,34 +388,31 @@ func main() {
// POST new review
slog.Info("posting review", "event", event, "pr", prNumber)
posted, err := vcs.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, evaluatedSHA, inlineComments)
posted, err := giteaClient.PostReview(ctx, owner, repoName, prNumber, event, reviewBody, inlineComments)
if err != nil {
slog.Error("failed to post review", "pr", prNumber, "event", event, "error", err)
os.Exit(1)
}
slog.Info("review posted", "review_id", posted.ID, "user", posted.User.Login, "pr", prNumber)
// Supersede all old reviews with link to the new one.
// This is only supported on Gitea (requires timeline API); GitHub reviews cannot
// be edited after submission, so we skip the supersede step there.
extVCS, isGiteaExt := vcs.(giteaExtClient)
if len(oldReviews) > 0 && isGiteaExt {
newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*vcsURL, "/"), owner, repoName, prNumber, posted.ID)
// Supersede all old reviews with link to the new one
if len(oldReviews) > 0 {
newReviewURL := fmt.Sprintf("%s/%s/%s/pulls/%d#pullrequestreview-%d", strings.TrimRight(*giteaURL, "/"), owner, repoName, prNumber, posted.ID)
for _, oldReview := range oldReviews {
cid, err := extVCS.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, int64(prNumber), oldReview.ID)
cid, err := giteaClient.GetTimelineReviewCommentIDForReview(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not find comment ID for old review", "review_id", oldReview.ID, "error", err)
continue
}
supersededBody := buildSupersededBody(oldReview.Body, oldReview.CommitID, newReviewURL, sentinel)
if err := extVCS.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
if err := giteaClient.EditComment(ctx, owner, repoName, cid, supersededBody); err != nil {
slog.Warn("could not mark old review as superseded", "review_id", oldReview.ID, "comment_id", cid, "error", err)
continue
}
slog.Info("marked old review as superseded", "review_id", oldReview.ID, "new_review_id", posted.ID, "pr", prNumber)
// Resolve old review's inline comments
oldComments, err := extVCS.ListReviewComments(ctx, owner, repoName, int64(prNumber), oldReview.ID)
oldComments, err := giteaClient.ListReviewComments(ctx, owner, repoName, prNumber, oldReview.ID)
if err != nil {
slog.Warn("could not list old review comments for resolution", "review_id", oldReview.ID, "error", err)
continue
@@ -544,7 +422,7 @@ func main() {
if c.ID == 0 {
continue
}
if err := extVCS.ResolveComment(ctx, owner, repoName, c.ID); err != nil {
if err := giteaClient.ResolveComment(ctx, owner, repoName, c.ID); err != nil {
slog.Debug("could not resolve inline comment", "comment_id", c.ID, "error", err)
failed++
} else {
@@ -558,14 +436,12 @@ func main() {
slog.Warn("some inline comments could not be resolved", "review_id", oldReview.ID, "failed", failed, "pr", prNumber)
}
}
} else if len(oldReviews) > 0 {
slog.Info("skipping supersede of old reviews (not supported on this VCS)", "old_count", len(oldReviews), "pr", prNumber)
}
}
// fetchFileContext fetches the full content of modified files from the PR branch.
func fetchFileContext(ctx context.Context, client vcsClient, owner, repo, ref string, files []vcsChangedFile) string {
func fetchFileContext(ctx context.Context, client *gitea.Client, owner, repo, ref string, files []gitea.ChangedFile) string {
var sb strings.Builder
for _, f := range files {
if ctx.Err() != nil {
@@ -591,25 +467,11 @@ func fetchFileContext(ctx context.Context, client vcsClient, owner, repo, ref st
// patternsRepo is comma-separated list of owner/name repos.
// patternsFiles is comma-separated list of file paths or directories.
// If a path ends with / or is a directory, all files within it are fetched recursively.
// If patternsFiles is empty, all files from the repo root are fetched.
func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patternsFiles string) string {
func fetchPatterns(ctx context.Context, client *gitea.Client, patternsRepo, patternsFiles string) string {
var sb strings.Builder
repos := strings.Split(patternsRepo, ",")
// Build the list of paths to fetch
var paths []string
if patternsFiles == "" {
// Empty patternsFiles means "fetch all files from repo root"
paths = []string{""}
} else {
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
}
paths := strings.Split(patternsFiles, ",")
for _, repoRef := range repos {
if ctx.Err() != nil {
@@ -626,10 +488,12 @@ func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patterns
}
owner, repo := parts[0], parts[1]
var repoLoadedFiles []string
var repoSkippedFiles []string
for _, path := range paths {
path = strings.TrimSpace(path)
if path == "" {
continue
}
files, err := client.GetAllFilesInPath(ctx, owner, repo, path)
if err != nil {
slog.Warn("could not fetch patterns", "path", path, "repo", repoRef, "error", err)
@@ -639,22 +503,11 @@ func fetchPatterns(ctx context.Context, client vcsClient, patternsRepo, patterns
for filePath, content := range files {
// Only include markdown and text files as patterns
if !isPatternFile(filePath) {
repoSkippedFiles = append(repoSkippedFiles, filePath)
continue
}
repoLoadedFiles = append(repoLoadedFiles, filePath)
sb.WriteString(fmt.Sprintf("### %s/%s\n\n%s\n\n", repoRef, filePath, content))
}
}
if len(repoLoadedFiles) > 0 {
slog.Info("loaded pattern files", "repo", repoRef, "count", len(repoLoadedFiles), "files", repoLoadedFiles)
} else {
slog.Warn("no pattern files loaded", "repo", repoRef, "paths", paths)
}
if len(repoSkippedFiles) > 0 {
slog.Debug("skipped non-pattern files", "repo", repoRef, "count", len(repoSkippedFiles), "files", repoSkippedFiles)
}
}
return sb.String()
}
@@ -669,7 +522,7 @@ func isPatternFile(path string) bool {
}
// evaluateCIStatus checks if all CI statuses indicate success.
func evaluateCIStatus(statuses []vcsCommitStatus) (passed bool, details string) {
func evaluateCIStatus(statuses []gitea.CommitStatus) (passed bool, details string) {
if len(statuses) == 0 {
return true, "no CI statuses found"
}
@@ -692,19 +545,6 @@ func evaluateCIStatus(statuses []vcsCommitStatus) (passed bool, details string)
return true, "all checks passed"
}
// githubAPIURL converts a GitHub server URL to its API base URL.
// github.com → https://api.github.com
// GHES (e.g. https://ghe.example.com) → https://ghe.example.com/api/v3
func githubAPIURL(serverURL string) string {
const canonicalGitHub = "https://github.com"
const githubAPIBase = "https://api.github.com"
if serverURL == "" || strings.TrimRight(serverURL, "/") == canonicalGitHub {
return githubAPIBase
}
// GitHub Enterprise Server: /api/v3 suffix
return strings.TrimRight(serverURL, "/") + "/api/v3"
}
func envOrDefault(key, defaultVal string) string {
if v := os.Getenv(key); v != "" {
return v
@@ -754,43 +594,6 @@ func validateReviewerName(name string) error {
return nil
}
// validateWorkspacePath ensures a file path is within the workspace and resolves
// symlinks to prevent traversal attacks. Returns the resolved absolute path or
// an error if the path is outside the workspace.
func validateWorkspacePath(path, pathName string) (string, error) {
workspace := os.Getenv("GITHUB_WORKSPACE")
if workspace == "" {
workspace, _ = os.Getwd()
}
absWorkspace, err := filepath.Abs(workspace)
if err != nil {
return "", fmt.Errorf("failed to resolve workspace path: %w", err)
}
// Join and clean the path
fullPath := filepath.Join(absWorkspace, path)
fullPath = filepath.Clean(fullPath)
// Check path is within workspace using filepath.Rel (more robust than HasPrefix)
rel, err := filepath.Rel(absWorkspace, fullPath)
if err != nil || strings.HasPrefix(rel, "..") {
return "", fmt.Errorf("%s resolves outside workspace: path=%s workspace=%s", pathName, fullPath, absWorkspace)
}
// Resolve symlinks and re-validate to prevent symlink traversal
resolvedPath, err := filepath.EvalSymlinks(fullPath)
if err != nil {
return "", fmt.Errorf("failed to resolve %s: %w", pathName, err)
}
relResolved, err := filepath.Rel(absWorkspace, resolvedPath)
if err != nil || strings.HasPrefix(relResolved, "..") {
return "", fmt.Errorf("%s symlink resolves outside workspace: resolved=%s workspace=%s", pathName, resolvedPath, absWorkspace)
}
return resolvedPath, nil
}
// buildSupersededBody creates the body for a superseded review: struck-through banner
// with collapsed original content and the commit it was evaluated against.
func buildSupersededBody(originalBody, commitSHA, newReviewURL, sentinel string) string {
@@ -820,7 +623,7 @@ func buildSupersededBody(originalBody, commitSHA, newReviewURL, sentinel string)
// Gitea user. This indicates misconfiguration where two roles share a token
// instead of having separate Gitea accounts. Returns true if shared token
// detected (caller should skip update-in-place logic to avoid clobbering).
func hasSharedToken(reviews []vcsReview, ownSentinel string) bool {
func hasSharedToken(reviews []gitea.Review, ownSentinel string) bool {
ownLogin := ""
for _, r := range reviews {
if strings.Contains(r.Body, ownSentinel) {
@@ -858,8 +661,8 @@ func extractSentinelName(body string) string {
}
// findOwnReview locates the most recent non-superseded review matching the sentinel.
func findOwnReview(reviews []vcsReview, sentinel string) *vcsReview {
var best *vcsReview
func findOwnReview(reviews []gitea.Review, sentinel string) *gitea.Review {
var best *gitea.Review
for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) {
continue
@@ -875,8 +678,8 @@ func findOwnReview(reviews []vcsReview, sentinel string) *vcsReview {
}
// findAllOwnReviews returns all non-superseded reviews matching the sentinel.
func findAllOwnReviews(reviews []vcsReview, sentinel string) []vcsReview {
var result []vcsReview
func findAllOwnReviews(reviews []gitea.Review, sentinel string) []gitea.Review {
var result []gitea.Review
for i := range reviews {
if !strings.Contains(reviews[i].Body, sentinel) {
continue
@@ -888,18 +691,3 @@ func findAllOwnReviews(reviews []vcsReview, sentinel string) []vcsReview {
}
return result
}
// shouldSkipStaleReview reports whether to skip posting because HEAD moved.
// Returns true (skip) if evaluatedSHA differs from currentSHA.
// Returns false (don't skip) if:
// - SHAs match (no movement)
// - currentSHA is empty (re-fetch failed; prefer posting stale over failing)
func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
if currentSHA == "" {
// Re-fetch failed; better to post potentially stale than fail
return false
}
return evaluatedSHA != currentSHA
}
+33 -235
View File
@@ -6,10 +6,10 @@ import (
"log/slog"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"gitea.weiker.me/rodin/review-bot/gitea"
)
func TestValidateReviewerName(t *testing.T) {
@@ -45,124 +45,18 @@ func TestValidateReviewerName(t *testing.T) {
}
}
func TestValidateWorkspacePath(t *testing.T) {
// Create a temp directory as our workspace
tmpDir := t.TempDir()
// Create a valid file inside the workspace
validFile := filepath.Join(tmpDir, "valid.json")
if err := os.WriteFile(validFile, []byte("{}"), 0644); err != nil {
t.Fatalf("failed to create test file: %v", err)
}
// Create a subdirectory with a file
subDir := filepath.Join(tmpDir, "subdir")
if err := os.MkdirAll(subDir, 0755); err != nil {
t.Fatalf("failed to create subdir: %v", err)
}
nestedFile := filepath.Join(subDir, "nested.json")
if err := os.WriteFile(nestedFile, []byte("{}"), 0644); err != nil {
t.Fatalf("failed to create nested file: %v", err)
}
// Create a symlink pointing outside the workspace
symlinkPath := filepath.Join(tmpDir, "evil-symlink.json")
if err := os.Symlink("/etc/passwd", symlinkPath); err != nil {
t.Fatalf("failed to create symlink: %v", err)
}
// Save and restore GITHUB_WORKSPACE
origWorkspace := os.Getenv("GITHUB_WORKSPACE")
defer os.Setenv("GITHUB_WORKSPACE", origWorkspace)
tests := []struct {
name string
workspace string
path string
wantErr bool
errMatch string
}{
{
name: "valid relative path",
workspace: tmpDir,
path: "valid.json",
wantErr: false,
},
{
name: "valid nested path",
workspace: tmpDir,
path: "subdir/nested.json",
wantErr: false,
},
{
name: "path traversal attempt",
workspace: tmpDir,
path: "../../../etc/passwd",
wantErr: true,
errMatch: "resolves outside workspace",
},
{
name: "absolute path normalized to workspace-relative",
workspace: tmpDir,
path: "/etc/passwd",
wantErr: true,
// Go 1.21+ filepath.Join normalizes absolute paths: Join("/tmp/x", "/etc/passwd")
// becomes "/tmp/x/etc/passwd", which is within workspace but doesn't exist.
errMatch: "failed to resolve",
},
{
name: "nonexistent file",
workspace: tmpDir,
path: "nonexistent.json",
wantErr: true,
errMatch: "failed to resolve",
},
{
name: "symlink escaping workspace",
workspace: tmpDir,
path: "evil-symlink.json",
wantErr: true,
errMatch: "symlink resolves outside workspace",
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
os.Setenv("GITHUB_WORKSPACE", tc.workspace)
resolved, err := validateWorkspacePath(tc.path, "test-file")
if tc.wantErr {
if err == nil {
t.Errorf("expected error for %q, got nil", tc.path)
} else if tc.errMatch != "" && !strings.Contains(err.Error(), tc.errMatch) {
t.Errorf("error %q should contain %q", err.Error(), tc.errMatch)
}
} else {
if err != nil {
t.Errorf("expected no error for %q, got %v", tc.path, err)
}
if resolved == "" {
t.Error("expected non-empty resolved path")
}
// Verify resolved path is within workspace
if !strings.HasPrefix(resolved, tc.workspace) {
t.Errorf("resolved path %q not within workspace %q", resolved, tc.workspace)
}
}
})
}
}
func makeReview(id int64, login, state string, _ bool, body string) vcsReview {
r := vcsReview{
func makeReview(id int64, login, state string, stale bool, body string) gitea.Review {
r := gitea.Review{
ID: id,
Body: body,
State: state,
Stale: stale,
}
r.User.Login = login
return r
}
func TestBuildSupersededBody(t *testing.T) {
original := "# Review\n\nLooks good.\n\n<!-- review-bot:sonnet -->"
sentinel := "<!-- review-bot:sonnet -->"
@@ -214,7 +108,7 @@ func TestBuildSupersededBodyShortSHA(t *testing.T) {
func TestFindOwnReview(t *testing.T) {
tests := []struct {
name string
reviews []vcsReview
reviews []gitea.Review
sentinel string
wantID int64
wantNil bool
@@ -227,7 +121,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "found by sentinel",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "review body\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
@@ -235,7 +129,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "wrong sentinel",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(42, "bot", "APPROVED", false, "body\n<!-- review-bot:gpt -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
@@ -243,7 +137,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "multiple reviews, returns first match",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "old\n<!-- review-bot:gpt -->"),
makeReview(20, "bot", "APPROVED", false, "new\n<!-- review-bot:sonnet -->"),
},
@@ -252,7 +146,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "skips superseded review",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n**Superseded**\n<!-- review-bot:sonnet -->"),
makeReview(20, "bot", "APPROVED", false, "fresh review\n<!-- review-bot:sonnet -->"),
},
@@ -261,7 +155,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "only superseded reviews exist",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(10, "bot", "APPROVED", false, "~~Original review~~\n\n<!-- review-bot:sonnet -->"),
},
sentinel: "<!-- review-bot:sonnet -->",
@@ -269,7 +163,7 @@ func TestFindOwnReview(t *testing.T) {
},
{
name: "picks highest ID among matches",
reviews: []vcsReview{
reviews: []gitea.Review{
makeReview(50, "bot", "APPROVED", false, "v1\n<!-- review-bot:sonnet -->"),
makeReview(30, "bot", "APPROVED", false, "v0\n<!-- review-bot:sonnet -->"),
},
@@ -300,7 +194,7 @@ func TestFindOwnReview(t *testing.T) {
func TestHasSharedToken(t *testing.T) {
tests := []struct {
name string
reviews []vcsReview
reviews []gitea.Review
sentinel string
want bool
}{
@@ -312,36 +206,36 @@ func TestHasSharedToken(t *testing.T) {
},
{
name: "no own review yet - cannot detect",
reviews: []vcsReview{
{ID: 1, User: struct{ Login string }{Login: "other"}, Body: "<!-- review-bot:gpt --> body"},
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "other"}, Body: "<!-- review-bot:gpt --> body"},
},
sentinel: "<!-- review-bot:sonnet -->",
want: false,
},
{
name: "separate users - no shared token",
reviews: []vcsReview{
{ID: 1, User: struct{ Login string }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string }{Login: "security-review-bot"}, Body: "<!-- review-bot:security --> body"},
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "security-review-bot"}, Body: "<!-- review-bot:security --> body"},
},
sentinel: "<!-- review-bot:sonnet -->",
want: false,
},
{
name: "shared token detected - same user different sentinels",
reviews: []vcsReview{
{ID: 1, User: struct{ Login string }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:security --> body"},
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "sonnet-review-bot"}, Body: "<!-- review-bot:security --> body"},
},
sentinel: "<!-- review-bot:sonnet -->",
want: true,
},
{
name: "three roles same user",
reviews: []vcsReview{
{ID: 1, User: struct{ Login string }{Login: "bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string }{Login: "bot"}, Body: "<!-- review-bot:security --> body"},
{ID: 3, User: struct{ Login string }{Login: "bot"}, Body: "<!-- review-bot:gpt --> body"},
reviews: []gitea.Review{
{ID: 1, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:sonnet --> body"},
{ID: 2, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:security --> body"},
{ID: 3, User: struct{ Login string `json:"login"` }{Login: "bot"}, Body: "<!-- review-bot:gpt --> body"},
},
sentinel: "<!-- review-bot:sonnet -->",
want: true,
@@ -502,56 +396,10 @@ func TestIsPatternFile(t *testing.T) {
}
}
// TestBuildPatternPaths verifies the path-building logic for fetchPatterns.
// Empty patternsFiles means "fetch all from root" (represented as [""]).
func TestBuildPatternPaths(t *testing.T) {
buildPaths := func(patternsFiles string) []string {
if patternsFiles == "" {
return []string{""}
}
var paths []string
for _, p := range strings.Split(patternsFiles, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths = append(paths, p)
}
}
return paths
}
tests := []struct {
name string
input string
want []string
}{
{"empty fetches root", "", []string{""}},
{"single file", "README.md", []string{"README.md"}},
{"multiple files", "README.md,PATTERNS.md", []string{"README.md", "PATTERNS.md"}},
{"trims whitespace", " foo.md , bar.md ", []string{"foo.md", "bar.md"}},
{"skips empty between commas", "foo.md,,bar.md", []string{"foo.md", "bar.md"}},
{"directory path", "patterns/", []string{"patterns/"}},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := buildPaths(tc.input)
if len(got) != len(tc.want) {
t.Errorf("buildPaths(%q) = %v, want %v", tc.input, got, tc.want)
return
}
for i := range got {
if got[i] != tc.want[i] {
t.Errorf("buildPaths(%q)[%d] = %q, want %q", tc.input, i, got[i], tc.want[i])
}
}
})
}
}
func TestEvaluateCIStatus(t *testing.T) {
tests := []struct {
name string
statuses []vcsCommitStatus
statuses []gitea.CommitStatus
wantPassed bool
wantSubstr string
}{
@@ -563,7 +411,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "all success",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "success", Context: "ci/test", Description: "Tests passed"},
},
@@ -572,7 +420,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "one failure",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
},
@@ -581,7 +429,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "error status",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "error", Context: "ci/lint", Description: "Lint error"},
},
wantPassed: false,
@@ -589,7 +437,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "pending treated as not-failed",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "pending", Context: "ci/build", Description: "In progress"},
{Status: "success", Context: "ci/test", Description: "Tests passed"},
},
@@ -598,7 +446,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "multiple failures",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "failure", Context: "ci/build", Description: "Build failed"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
},
@@ -607,7 +455,7 @@ func TestEvaluateCIStatus(t *testing.T) {
},
{
name: "mixed with pending and failure",
statuses: []vcsCommitStatus{
statuses: []gitea.CommitStatus{
{Status: "success", Context: "ci/build", Description: "Build passed"},
{Status: "pending", Context: "ci/deploy", Description: "Deploying"},
{Status: "failure", Context: "ci/test", Description: "Tests failed"},
@@ -995,7 +843,7 @@ func cleanEnv() []string {
}
func TestFindAllOwnReviews(t *testing.T) {
reviews := []vcsReview{
reviews := []gitea.Review{
{ID: 1, Body: "<!-- review-bot:sonnet -->\nfirst review"},
{ID: 2, Body: "<!-- review-bot:gpt -->\nother bot"},
{ID: 3, Body: "<!-- review-bot:sonnet -->\nsecond review"},
@@ -1014,53 +862,3 @@ func TestFindAllOwnReviews(t *testing.T) {
}
}
}
func TestShouldSkipStaleReview(t *testing.T) {
tests := []struct {
name string
evaluatedSHA string
currentSHA string
wantSkip bool
}{
{
name: "matching SHAs",
evaluatedSHA: "abc123def456",
currentSHA: "abc123def456",
wantSkip: false,
},
{
name: "different SHAs",
evaluatedSHA: "abc123def456",
currentSHA: "xyz789abc123",
wantSkip: true,
},
{
name: "empty current SHA (re-fetch failed)",
evaluatedSHA: "abc123def456",
currentSHA: "",
wantSkip: false,
},
{
name: "both empty (edge case)",
evaluatedSHA: "",
currentSHA: "",
wantSkip: false,
},
{
name: "only current empty",
evaluatedSHA: "abc123",
currentSHA: "",
wantSkip: false,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := shouldSkipStaleReview(tc.evaluatedSHA, tc.currentSHA)
if got != tc.wantSkip {
t.Errorf("shouldSkipStaleReview(%q, %q) = %v, want %v",
tc.evaluatedSHA, tc.currentSHA, got, tc.wantSkip)
}
})
}
}
-125
View File
@@ -1,125 +0,0 @@
package main
import (
"context"
"errors"
"fmt"
"net"
"net/url"
"strings"
"time"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// runValidateURL implements the `review-bot validate-url <url>` subcommand.
//
// It resolves the given URL's hostname and checks that every returned IP is
// publicly routable (not RFC1918, loopback, link-local, or other reserved
// ranges). The exit code communicates the result to callers:
//
// 0 — URL is safe to use
// 1 — URL resolves to a blocked/private address
// 2 — URL is malformed, has an unsafe scheme, or DNS lookup failed
//
// This is intended for use from action.yml shell steps that need to validate
// a user-supplied URL before passing it to curl.
func runValidateURL(args []string) int {
if len(args) != 1 {
fmt.Fprintln(errWriter, "usage: review-bot validate-url <url>")
fmt.Fprintln(errWriter, "")
fmt.Fprintln(errWriter, "Resolves <url> and verifies all resolved IPs are publicly routable.")
fmt.Fprintln(errWriter, "Exit 0=safe, 1=blocked, 2=error")
return 2
}
rawURL := args[0]
if err := validateURL(rawURL); err != nil {
fmt.Fprintf(errWriter, "Error: %v\n", err)
var ve *validateError
if isValidateError(err, &ve) {
return ve.code
}
return 2
}
fmt.Fprintf(outWriter, "OK: %s is safe\n", rawURL)
return 0
}
// validateError carries an exit code alongside a message.
type validateError struct {
code int
message string
}
func (e *validateError) Error() string { return e.message }
// isValidateError checks if err is or wraps a *validateError and sets out.
// Uses errors.As so that wrapped *validateError values (e.g. from fmt.Errorf("...: %w", &validateError{...}))
// are also detected, making the function robust against future wrapping.
func isValidateError(err error, out **validateError) bool {
if err == nil {
return false
}
return errors.As(err, out)
}
// validateURL checks that rawURL is safe for use as a Gitea server URL:
// - Must be https:// (not http://)
// - Must have no user-info (user:pass@host)
// - Must resolve to at least one IP, all of which are publicly routable
func validateURL(rawURL string) error {
parsed, err := url.Parse(rawURL)
if err != nil {
return &validateError{code: 2, message: fmt.Sprintf("malformed URL %q: %v", rawURL, err)}
}
// Scheme check: only https is permitted.
if !strings.EqualFold(parsed.Scheme, "https") {
return &validateError{
code: 2,
message: fmt.Sprintf("URL scheme must be https (got %q)", parsed.Scheme),
}
}
// Reject user-info (user:password@host) to prevent credential embedding.
if parsed.User != nil {
return &validateError{
code: 2,
message: "URL must not contain user-info (user:password@host)",
}
}
host := parsed.Hostname()
if host == "" {
return &validateError{code: 2, message: fmt.Sprintf("URL has no host: %q", rawURL)}
}
// Resolve the hostname with a short timeout.
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return &validateError{
code: 2,
message: fmt.Sprintf("DNS lookup failed for %q: %v", host, err),
}
}
if len(addrs) == 0 {
return &validateError{
code: 2,
message: fmt.Sprintf("DNS lookup returned no addresses for %q", host),
}
}
for _, a := range addrs {
if netutil.IsBlockedIP(a.IP) {
return &validateError{
code: 1,
message: fmt.Sprintf("blocked: %q resolves to private/reserved IP %s", host, a.IP),
}
}
}
return nil
}
-127
View File
@@ -1,127 +0,0 @@
package main
import (
"bytes"
"strings"
"testing"
)
func TestRunValidateURL_Usage(t *testing.T) {
var errBuf bytes.Buffer
origErr := errWriter
errWriter = &errBuf
defer func() { errWriter = origErr }()
code := runValidateURL(nil)
if code != 2 {
t.Errorf("expected exit code 2 for no args, got %d", code)
}
if !strings.Contains(errBuf.String(), "usage") {
t.Errorf("expected usage in stderr, got %q", errBuf.String())
}
errBuf.Reset()
code = runValidateURL([]string{"arg1", "arg2"})
if code != 2 {
t.Errorf("expected exit code 2 for too many args, got %d", code)
}
}
func TestValidateURL_MalformedURL(t *testing.T) {
cases := []struct {
name string
url string
wantMsg string
}{
{"empty", "", "must be https"},
{"http scheme", "http://example.com/", "must be https"},
{"ftp scheme", "ftp://example.com/", "must be https"},
{"no scheme", "example.com", "must be https"},
{"user info", "https://user:pass@example.com/", "user-info"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
err := validateURL(tc.url)
if err == nil {
t.Errorf("expected error for URL %q, got nil", tc.url)
return
}
if !strings.Contains(err.Error(), tc.wantMsg) {
t.Errorf("error %q does not contain %q", err.Error(), tc.wantMsg)
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T", err)
}
if ve.code != 2 {
t.Errorf("expected code 2, got %d", ve.code)
}
})
}
}
func TestValidateURL_BlockedPrivateIP(t *testing.T) {
// localhost always resolves to 127.0.0.1 (loopback).
err := validateURL("https://localhost/")
if err == nil {
t.Skip("localhost did not resolve (network unavailable in test environment)")
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T: %v", err, err)
}
if ve.code != 1 && ve.code != 2 {
t.Errorf("expected code 1 (blocked) or 2 (dns fail), got %d: %s", ve.code, ve.message)
}
// If it resolved (code 1), the message must say "blocked".
if ve.code == 1 && !strings.Contains(ve.message, "blocked") {
t.Errorf("expected 'blocked' in message, got %q", ve.message)
}
}
func TestValidateURL_ExitCodes(t *testing.T) {
cases := []struct {
name string
url string
wantCode int
}{
{"http scheme", "http://example.com/", 2},
{"no scheme", "example.com", 2},
{"user info", "https://admin:secret@example.com/", 2},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
err := validateURL(tc.url)
if err == nil {
t.Fatalf("expected error for %q", tc.url)
}
var ve *validateError
if !isValidateError(err, &ve) {
t.Fatalf("expected *validateError, got %T", err)
}
if ve.code != tc.wantCode {
t.Errorf("code = %d, want %d (url=%q, msg=%s)", ve.code, tc.wantCode, tc.url, ve.message)
}
})
}
}
func TestRunValidateURL_WithCapture(t *testing.T) {
var outBuf, errBuf bytes.Buffer
origOut, origErr := outWriter, errWriter
outWriter = &outBuf
errWriter = &errBuf
defer func() {
outWriter = origOut
errWriter = origErr
}()
// http:// scheme should fail with code 2.
code := runValidateURL([]string{"http://example.com/"})
if code != 2 {
t.Errorf("expected code 2 for http:// URL, got %d", code)
}
if !strings.Contains(errBuf.String(), "must be https") {
t.Errorf("expected error about https in stderr, got %q", errBuf.String())
}
}
-359
View File
@@ -1,359 +0,0 @@
package main
// vcs.go defines the vcsClient interface that both gitea.Client (via giteaVCSAdapter)
// and github.Client (via githubVCSAdapter) satisfy, enabling VCS-type routing in main.go.
//
// Interface design:
// - Methods cover all PR review operations used by main.go.
// - Gitea-specific operations (supersede, comment resolution) are in the separate
// giteaExtClient interface. GitHub implementations return ErrNotSupported for those.
// - Types are defined here as package-local VCS types; each adapter converts from
// its respective client package's types.
import (
"context"
"errors"
"gitea.weiker.me/rodin/review-bot/gitea"
"gitea.weiker.me/rodin/review-bot/github"
"gitea.weiker.me/rodin/review-bot/review"
)
// ErrNotSupported is returned by VCS methods that have no implementation for
// a particular VCS backend (e.g., Gitea-specific timeline APIs on GitHub).
var ErrNotSupported = errors.New("operation not supported on this VCS backend")
// vcsClient is the interface for all PR operations used by main.go.
// It is implemented by both giteaVCSAdapter and githubVCSAdapter.
// Interface defined here (in the consumer package) per Go idiom.
type vcsClient interface {
// PR metadata and content
GetPullRequest(ctx context.Context, owner, repo string, number int) (*vcsPullRequest, error)
GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error)
GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]vcsChangedFile, error)
GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]vcsCommitStatus, error)
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error)
ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error)
GetAllFilesInPath(ctx context.Context, owner, repo, path string) (map[string]string, error)
// Review operations
PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error)
ListReviews(ctx context.Context, owner, repo string, number int) ([]vcsReview, error)
DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error
GetAuthenticatedUser(ctx context.Context) (string, error)
RequestReviewer(ctx context.Context, owner, repo string, number int, reviewer string) error
}
// giteaExtClient extends vcsClient with Gitea-specific operations that have no
// GitHub equivalent. Code that uses these methods should first do a type assertion.
type giteaExtClient interface {
vcsClient
GetTimelineReviewCommentIDForReview(ctx context.Context, owner, repo string, prNum, reviewID int64) (int64, error)
EditComment(ctx context.Context, owner, repo string, commentID int64, body string) error
ListReviewComments(ctx context.Context, owner, repo string, prNum, reviewID int64) ([]gitea.ReviewComment, error)
ResolveComment(ctx context.Context, owner, repo string, commentID int64) error
}
// --- shared VCS types ---
// vcsPullRequest is VCS-agnostic PR metadata.
type vcsPullRequest struct {
Title string
Body string
Head struct {
Sha string
Ref string
}
}
// vcsChangedFile is a file changed in a PR.
type vcsChangedFile struct {
Filename string
Status string
}
// vcsCommitStatus is a CI status entry.
type vcsCommitStatus struct {
Status string
Context string
Description string
TargetURL string
}
// vcsReviewComment is an inline review comment.
type vcsReviewComment struct {
Path string
NewLine int64 // absolute line number on the new (right) side of the diff, used by both Gitea and GitHub adapters
Body string
}
// vcsReview is a submitted PR review.
type vcsReview struct {
ID int64
Body string
CommitID string
User struct {
Login string
}
State string
}
// ============================================================
// giteaVCSAdapter
// ============================================================
// giteaVCSAdapter wraps gitea.Client to implement vcsClient + giteaExtClient.
type giteaVCSAdapter struct {
c *gitea.Client
}
func newGiteaVCSAdapter(c *gitea.Client) *giteaVCSAdapter { return &giteaVCSAdapter{c: c} }
func (a *giteaVCSAdapter) GetPullRequest(ctx context.Context, owner, repo string, number int) (*vcsPullRequest, error) {
pr, err := a.c.GetPullRequest(ctx, owner, repo, number)
if err != nil {
return nil, err
}
r := &vcsPullRequest{Title: pr.Title, Body: pr.Body}
r.Head.Sha = pr.Head.Sha
r.Head.Ref = pr.Head.Ref
return r, nil
}
func (a *giteaVCSAdapter) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
return a.c.GetPullRequestDiff(ctx, owner, repo, number)
}
func (a *giteaVCSAdapter) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]vcsChangedFile, error) {
files, err := a.c.GetPullRequestFiles(ctx, owner, repo, number)
if err != nil {
return nil, err
}
out := make([]vcsChangedFile, len(files))
for i, f := range files {
out[i] = vcsChangedFile{Filename: f.Filename, Status: f.Status}
}
return out, nil
}
func (a *giteaVCSAdapter) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]vcsCommitStatus, error) {
statuses, err := a.c.GetCommitStatuses(ctx, owner, repo, sha)
if err != nil {
return nil, err
}
out := make([]vcsCommitStatus, len(statuses))
for i, s := range statuses {
out[i] = vcsCommitStatus{Status: s.Status, Context: s.Context, Description: s.Description, TargetURL: s.TargetURL}
}
return out, nil
}
func (a *giteaVCSAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.c.GetFileContent(ctx, owner, repo, filepath)
}
func (a *giteaVCSAdapter) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
return a.c.GetFileContentRef(ctx, owner, repo, filepath, ref)
}
func (a *giteaVCSAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
entries, err := a.c.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
out := make([]review.ContentEntry, len(entries))
for i, e := range entries {
out[i] = review.ContentEntry{Name: e.Name, Path: e.Path, Type: e.Type}
}
return out, nil
}
func (a *giteaVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, path string) (map[string]string, error) {
return a.c.GetAllFilesInPath(ctx, owner, repo, path)
}
func (a *giteaVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]gitea.ReviewComment, len(comments))
for i, c := range comments {
gc[i] = gitea.ReviewComment{Path: c.Path, NewPosition: c.NewLine, Body: c.Body}
}
r, err := a.c.PostReview(ctx, owner, repo, number, event, body, commitID, gc)
if err != nil {
return nil, err
}
out := &vcsReview{ID: r.ID, Body: r.Body, CommitID: r.CommitID, State: r.State}
out.User.Login = r.User.Login
return out, nil
}
func (a *giteaVCSAdapter) ListReviews(ctx context.Context, owner, repo string, number int) ([]vcsReview, error) {
reviews, err := a.c.ListReviews(ctx, owner, repo, number)
if err != nil {
return nil, err
}
out := make([]vcsReview, len(reviews))
for i, r := range reviews {
out[i] = vcsReview{ID: r.ID, Body: r.Body, CommitID: r.CommitID, State: r.State}
out[i].User.Login = r.User.Login
}
return out, nil
}
func (a *giteaVCSAdapter) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
return a.c.DeleteReview(ctx, owner, repo, number, reviewID)
}
func (a *giteaVCSAdapter) GetAuthenticatedUser(ctx context.Context) (string, error) {
return a.c.GetAuthenticatedUser(ctx)
}
func (a *giteaVCSAdapter) RequestReviewer(ctx context.Context, owner, repo string, number int, reviewer string) error {
return a.c.RequestReviewer(ctx, owner, repo, number, reviewer)
}
// Gitea-specific extension methods.
func (a *giteaVCSAdapter) GetTimelineReviewCommentIDForReview(ctx context.Context, owner, repo string, prNum, reviewID int64) (int64, error) {
return a.c.GetTimelineReviewCommentIDForReview(ctx, owner, repo, int(prNum), reviewID)
}
func (a *giteaVCSAdapter) EditComment(ctx context.Context, owner, repo string, commentID int64, body string) error {
return a.c.EditComment(ctx, owner, repo, commentID, body)
}
func (a *giteaVCSAdapter) ListReviewComments(ctx context.Context, owner, repo string, prNum, reviewID int64) ([]gitea.ReviewComment, error) {
return a.c.ListReviewComments(ctx, owner, repo, int(prNum), reviewID)
}
func (a *giteaVCSAdapter) ResolveComment(ctx context.Context, owner, repo string, commentID int64) error {
return a.c.ResolveComment(ctx, owner, repo, commentID)
}
// ============================================================
// githubVCSAdapter
// ============================================================
// githubVCSAdapter wraps github.Client to implement vcsClient.
// Gitea-specific extension methods (GetTimelineReviewCommentIDForReview, EditComment,
// ListReviewComments, ResolveComment) are not available on GitHub and will not be called
// because main.go gates them with a type assertion to giteaExtClient.
type githubVCSAdapter struct {
c *github.Client
}
func newGithubVCSAdapter(c *github.Client) *githubVCSAdapter { return &githubVCSAdapter{c: c} }
func (a *githubVCSAdapter) GetPullRequest(ctx context.Context, owner, repo string, number int) (*vcsPullRequest, error) {
pr, err := a.c.GetPullRequest(ctx, owner, repo, number)
if err != nil {
return nil, err
}
r := &vcsPullRequest{Title: pr.Title, Body: pr.Body}
r.Head.Sha = pr.Head.Sha
r.Head.Ref = pr.Head.Ref
return r, nil
}
func (a *githubVCSAdapter) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
return a.c.GetPullRequestDiff(ctx, owner, repo, number)
}
func (a *githubVCSAdapter) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]vcsChangedFile, error) {
files, err := a.c.GetPullRequestFiles(ctx, owner, repo, number)
if err != nil {
return nil, err
}
out := make([]vcsChangedFile, len(files))
for i, f := range files {
out[i] = vcsChangedFile{Filename: f.Filename, Status: f.Status}
}
return out, nil
}
func (a *githubVCSAdapter) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]vcsCommitStatus, error) {
statuses, err := a.c.GetCommitStatuses(ctx, owner, repo, sha)
if err != nil {
return nil, err
}
out := make([]vcsCommitStatus, len(statuses))
for i, s := range statuses {
// CommitStatus.Status is tagged as json:"state" — already the normalized "state" value
out[i] = vcsCommitStatus{Status: s.Status, Context: s.Context, Description: s.Description, TargetURL: s.TargetURL}
}
return out, nil
}
func (a *githubVCSAdapter) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return a.c.GetFileContent(ctx, owner, repo, filepath)
}
func (a *githubVCSAdapter) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
return a.c.GetFileContentRef(ctx, owner, repo, filepath, ref)
}
func (a *githubVCSAdapter) ListContents(ctx context.Context, owner, repo, path string) ([]review.ContentEntry, error) {
entries, err := a.c.ListContents(ctx, owner, repo, path)
if err != nil {
return nil, err
}
out := make([]review.ContentEntry, len(entries))
for i, e := range entries {
out[i] = review.ContentEntry{Name: e.Name, Path: e.Path, Type: e.Type}
}
return out, nil
}
func (a *githubVCSAdapter) GetAllFilesInPath(ctx context.Context, owner, repo, path string) (map[string]string, error) {
return a.c.GetAllFilesInPath(ctx, owner, repo, path)
}
func (a *githubVCSAdapter) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []vcsReviewComment) (*vcsReview, error) {
gc := make([]github.ReviewComment, len(comments))
for i, c := range comments {
// GitHub inline comments use Line+Side (absolute line on the RIGHT side).
// NewLine from diff parsing gives absolute new-file line numbers.
// Comments that cannot be mapped will be omitted (GitHub rejects invalid positions).
gc[i] = github.ReviewComment{
Path: c.Path,
Line: c.NewLine,
Side: "RIGHT",
Body: c.Body,
}
}
r, err := a.c.PostReview(ctx, owner, repo, number, event, body, commitID, gc)
if err != nil {
return nil, err
}
out := &vcsReview{ID: r.ID, Body: r.Body, State: r.State}
out.User.Login = r.User.Login
return out, nil
}
func (a *githubVCSAdapter) ListReviews(ctx context.Context, owner, repo string, number int) ([]vcsReview, error) {
reviews, err := a.c.ListReviews(ctx, owner, repo, number)
if err != nil {
return nil, err
}
out := make([]vcsReview, len(reviews))
for i, r := range reviews {
out[i] = vcsReview{ID: r.ID, Body: r.Body, State: r.State}
out[i].User.Login = r.User.Login
}
return out, nil
}
func (a *githubVCSAdapter) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
// GitHub only allows deleting PENDING (draft) reviews. review-bot posts submitted
// reviews, so this will return an error for any review we actually posted.
// Callers should treat 422 errors here gracefully.
return a.c.DeleteReview(ctx, owner, repo, number, reviewID)
}
func (a *githubVCSAdapter) GetAuthenticatedUser(ctx context.Context) (string, error) {
return a.c.GetAuthenticatedUser(ctx)
}
func (a *githubVCSAdapter) RequestReviewer(ctx context.Context, owner, repo string, number int, reviewer string) error {
return a.c.RequestReviewer(ctx, owner, repo, number, reviewer)
}
-334
View File
@@ -1,334 +0,0 @@
# Design: Role-based Review Personas (Issue #51)
> **Note:** This design was revised during implementation to use JSON instead of YAML
> to maintain the repository's zero-external-dependencies convention. All persona
> files use JSON format. See "Design Revision" section at the end for details.
## Problem
Current review-bot performs generic code review. Every reviewer (regardless of `reviewer-name`) uses the same base prompt and evaluates the same concerns. This leads to:
1. **Redundancy** — Two reviewers (e.g., GPT + Claude twins) often flag identical issues
2. **Gaps** — Generic reviewers miss specialized concerns (security, domain logic, architecture)
3. **Noise** — NITs about style mixed with critical security findings
4. **No ownership** — Findings lack clear domain attribution
## Constraints
- Must work with existing CLI flags and CI workflow patterns
- Must not break backwards compatibility (existing configs still work)
- Must integrate cleanly with the budget system (personas add to context)
- Multiple personas running in parallel must not interfere with each other
- Each persona must have clear scope boundaries (no duplication)
## Proposed Approach
### 1. Persona Definition
A persona is a named review role with:
- **Identity** — Who am I? What's my expertise?
- **Focus** — What do I look for?
- **Scope boundaries** — What do I explicitly NOT comment on?
- **Severity calibration** — What counts as MAJOR/MINOR/NIT for MY domain?
Personas are defined in JSON files that can live:
1. In the pattern repos (shared across projects)
2. In the target repo (project-specific personas)
3. Inline via a new `--persona-file` flag (JSON format)
### 2. Persona File Format
```json
# .review/personas/security.yaml
name: security
display_name: Security Specialist
model_preference: opus # optional hint for expensive analysis
identity: |
You are a security specialist reviewing code for vulnerabilities.
Your expertise: OWASP Top 10, injection attacks, auth/authz, secrets management,
event sourcing security (replay attacks, event injection).
focus:
- Injection attacks (SQL, command, path traversal, template)
- Authentication and authorization gaps
- Secrets exposure (hardcoded credentials, tokens in logs)
- Input validation (unsanitized input, unsafe deserialization)
- Race conditions with security implications
- Event sourcing attack vectors
ignore:
- Code style and naming conventions
- Performance (unless security-related)
- Documentation
- General code quality
- Test coverage
severity:
critical: "Remote code execution, auth bypass, data exfiltration"
major: "Privilege escalation, information disclosure, DoS"
minor: "Missing rate limiting, verbose errors"
nit: "Theoretical risk with low exploitability"
output_format: |
For each finding:
- Severity: [CRITICAL|MAJOR|MINOR|NIT]
- Attack vector: How could this be exploited?
- Evidence: Code snippet showing the vulnerability
- Recommendation: Specific fix
```
### 3. New CLI Flags
```
--persona-file PATH Path to persona JSON file (local or in repo)
--persona NAME Built-in persona name (security, architect, domain)
```
Either flag sets the persona. If neither is provided, behavior is unchanged (generic review).
### 4. Prompt Assembly
Current flow:
```
SystemBase → Patterns → Conventions → [LLM]
```
New flow with persona:
```
PersonaPrompt (from YAML) → Patterns (filtered?) → Conventions → [LLM]
```
The persona's identity/focus/ignore/severity sections become the system prompt, replacing the generic "You are an expert code reviewer" base.
### 5. Built-in Personas
Ship with these built-in personas (loadable via `--persona NAME`):
| Name | Focus |
|------|-------|
| `security` | Vulnerabilities, auth, secrets |
| `architect` | Patterns, consistency, design |
| `domain` | Business logic (requires repo-specific config) |
| `docs` | Documentation, API clarity |
Built-in personas live in `review/personas/` as embedded Go assets or YAML shipped with the binary.
### 6. CI Workflow Integration
Single persona:
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: security
persona: security
...
```
Multiple personas (parallel jobs):
```yaml
jobs:
review:
strategy:
matrix:
include:
- name: security
persona: security
- name: architect
persona: architect
steps:
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: ${{ matrix.name }}
persona: ${{ matrix.persona }}
```
Custom persona from repo:
```yaml
- uses: rodin/review-bot/.gitea/actions/review@v1
with:
reviewer-name: trading
persona-file: .review/personas/trading.yaml
```
### 7. Persona + Patterns Interaction
Some personas benefit from filtered patterns:
- Security → only security-related patterns
- Architect → all patterns (structural focus)
- Domain → domain docs, not language patterns
For v1, keep it simple: all patterns are included regardless of persona. Future enhancement could add `patterns_filter` to persona YAML.
### 8. Output Format Changes
Persona name appears in the review header:
```markdown
# Security Review
## Summary
No critical vulnerabilities found in this change.
## Findings
| # | Severity | File | Line | Finding |
...
## Recommendation
**APPROVE** — No security-relevant issues detected.
---
*Review by security*
<!-- review-bot:security -->
```
## State/Data Model
### Persona struct
```go
// review/persona.go
type Persona struct {
Name string `yaml:"name"`
DisplayName string `yaml:"display_name"`
ModelPref string `yaml:"model_preference,omitempty"`
Identity string `yaml:"identity"`
Focus []string `yaml:"focus"`
Ignore []string `yaml:"ignore"`
Severity Severity `yaml:"severity"`
OutputFormat string `yaml:"output_format,omitempty"`
}
type Severity struct {
Critical string `yaml:"critical"`
Major string `yaml:"major"`
Minor string `yaml:"minor"`
Nit string `yaml:"nit"`
}
```
### Loading precedence
1. `--persona-file PATH` → load from local file system
2. `--persona NAME` → load from embedded built-ins
3. Neither → use generic system prompt (current behavior)
## Error Cases
| Error | Handling |
|-------|----------|
| Persona file not found | Fatal exit with clear message |
| Invalid YAML in persona file | Fatal exit with parse error |
| Both `--persona` and `--persona-file` specified | Fatal exit: mutually exclusive |
| Unknown built-in persona name | Fatal exit with list of valid names |
| Empty identity in persona | Warning, fall back to generic prompt |
## Edge Cases
- **Empty focus list**: Valid — persona relies on identity alone
- **Empty ignore list**: Valid — no explicit scope exclusions
- **No severity section**: Use default MAJOR/MINOR/NIT definitions
- **Model preference set but budget insufficient**: Ignore preference, log warning
- **Persona file in pattern repo**: Fetch like other pattern files
## Testing Strategy
### Unit tests
- `persona_test.go`: Parse valid/invalid YAML, validate required fields
- `prompt_test.go`: Verify persona prompt assembly
- Integration with budget: persona prompts count toward token limit
### Integration tests
- End-to-end with `--persona security` (built-in)
- End-to-end with `--persona-file custom.yaml`
- Backwards compatibility: no flags = generic behavior
### Manual verification
- Run security persona on a PR with obvious vulnerability
- Verify security persona ignores style issues
- Verify non-security persona doesn't flag security issues
## Implementation Phases
### Phase 1: Persona types and loading
- [ ] `review/persona.go`: Persona struct + YAML parsing
- [ ] `review/persona_test.go`: Unit tests
- [ ] Embed built-in personas in binary
- [ ] Compiles clean, tests pass
### Phase 2: Prompt generation
- [ ] `review/prompt.go`: `BuildPersonaPrompt(p Persona) string`
- [ ] Modify `BuildSystemBase()` to accept optional persona
- [ ] Integrate persona prompt with budget system
- [ ] Tests for prompt assembly
### Phase 3: CLI integration
- [ ] Add `--persona` and `--persona-file` flags
- [ ] Flag validation (mutually exclusive, valid names)
- [ ] Load persona based on flags
- [ ] Pass persona to prompt builder
### Phase 4: Action integration
- [ ] Add `persona` and `persona-file` inputs to action.yml
- [ ] Update README with persona examples
- [ ] End-to-end CI test
### Phase 5: Built-in personas
- [ ] `security.yaml` built-in
- [ ] `architect.yaml` built-in
- [ ] `docs.yaml` built-in
- [ ] Document each persona's focus
## Open Questions
1. **Persona file location in repo**: Should we support `--persona-file .review/security.yaml` where the file is fetched from the PR's repo (like conventions)? This adds complexity but enables project-specific personas without action changes.
2. **Model preference enforcement**: If persona specifies `model_preference: opus` but the action uses a different model, should we warn? Override? Ignore? Current thinking: log warning, use the specified model (user controls model via action input).
3. **Severity override output**: If persona defines custom severity levels (CRITICAL), should the JSON output include them, or map back to standard MAJOR/MINOR/NIT? Current thinking: keep standard output format, use severity calibration only for prompt guidance.
## Completion Checklist
1. Persona struct matches YAML schema exactly?
2. Built-in personas embedded in binary (not external files)?
3. `--persona` and `--persona-file` are mutually exclusive?
4. Unknown persona name produces clear error with valid options?
5. Empty persona file fields have sensible defaults?
6. Persona prompt integrates with budget system (token counting)?
7. Backwards compatibility: no flags = current behavior?
8. Review header shows persona display name?
9. Sentinel still uses reviewer-name (not persona name)?
10. Unit tests cover parse errors, missing fields, valid YAML?
## Design Review Findings (Self-Review)
### Finding 1: Severity Mapping
The persona YAML allows `critical` severity, but the LLM output parser (`review/parser.go`) only accepts MAJOR/MINOR/NIT.
**Resolution:** Keep standard output format. Persona severity section is ONLY for calibrating the LLM's judgment (prompt guidance). Output must still use MAJOR/MINOR/NIT. Document this clearly in persona format docs.
### Finding 2: Embedding Built-in Personas
Go doesn't natively embed YAML. Must use `//go:embed` directive (Go 1.16+).
**Resolution:** Create `review/personas/` directory with YAML files and use:
```go
//go:embed personas/*.yaml
var embeddedPersonas embed.FS
```
### Finding 3: display_name vs reviewer-name
Design says header shows "persona display name" but sentinel uses "reviewer-name". This is correct - they serve different purposes:
- `display_name` → human-readable header ("Security Specialist Review")
- `reviewer-name` → machine sentinel for cleanup (`<!-- review-bot:security -->`)
When persona is used, `display_name` takes precedence for the header title, but `reviewer-name` (CLI flag) is still used for the sentinel.
## Design Revision: YAML with gopkg.in/yaml.v3
**Decision:** Add `gopkg.in/yaml.v3` as a dependency.
YAML is preferred over JSON for persona files because:
- Multi-line strings are cleaner (no escaping quotes in identity/focus text)
- Comments are supported for documentation
- More human-readable for complex persona definitions
The implementation supports both YAML (`.yaml`, `.yml`) and JSON (`.json`) for backwards compatibility, with YAML as the default for built-in personas.
-87
View File
@@ -1,87 +0,0 @@
# Design: YAML Support for Persona Files (#57)
## Problem
JSON is awkward for persona files that contain multi-line text (identity, severity descriptions). YAML supports cleaner multi-line strings and comments, improving readability and maintainability.
## Constraints
- Backwards compatibility: existing JSON personas must continue to work
- Security: protect against DoS via deeply nested YAML (AIKIDO-2024-10486)
- Consistency: use `.yaml` extension (not `.yml`)
- Library: use `github.com/goccy/go-yaml` v1.16.0+ (approved in CONVENTIONS.md); we implement custom AST-based depth/node-count checks for precise alias-aware validation
## Proposed Approach
1. **Update `parsePersona`** to detect format from file extension
2. **Add YAML parsing** with explicit depth limit (defense in depth)
3. **Keep JSON as fallback** for files without `.yaml`/`.yml` extension
4. **Convert built-in personas** to YAML format
5. **Update embed directive** to include both formats
### File Extension Detection
```go
func parsePersona(data []byte, source string) (*Persona, error) {
isYAML := strings.HasSuffix(source, ".yaml") || strings.HasSuffix(source, ".yml")
if isYAML {
return parseYAML(data, source)
}
return parseJSON(data, source)
}
```
### YAML Parsing with Depth Protection
We implement a custom AST-based depth/node-count walk (`checkYAMLDepth` in
`review/persona.go`) rather than relying on library decoder options. Key design
decisions:
- **Library:** `github.com/goccy/go-yaml` with `ast.Node`-based traversal
- **Dual-map tracking:** `validated` (depth-aware short-circuit) + `visiting` (cycle detection)
- **Node-count limit:** Conservative overcounting bounds total validation work
- **Alias-aware depth:** Aliases increment depth and are re-checked when encountered at greater depths
See `review/persona.go:checkYAMLDepth` for the authoritative implementation.
## State/Data Model
No new state. Same `Persona` struct, just different parsing.
## Error Cases
| Error | Handling |
|-------|----------|
| Invalid YAML syntax | Return parse error with source file |
| Deeply nested YAML | Custom AST walk (`checkYAMLDepth`) rejects before decode |
| Unknown extension | Fall back to JSON parsing |
| Missing required fields | Validation rejects after parse |
## Edge Cases
- File with `.json` extension but YAML content → JSON parse fails, user sees error
- File with no extension → defaults to JSON
- Embedded persona reference like `builtin:security` → detect by embed path (`personas/X.yaml`)
## Testing Strategy
1. Unit tests for YAML parsing (valid, invalid, deeply nested)
2. Unit tests for extension detection
3. Integration test for built-in personas (now YAML)
4. Backwards compat test: verify JSON still works for external files
## Completion Checklist
1. [ ] `go-yaml` dependency added at v1.16.0+
2. [ ] Extension detection uses case-insensitive comparison
3. [ ] YAML parse errors include source file name
4. [ ] JSON parsing still works for `.json` files
5. [ ] Built-in personas converted to YAML with readable multi-line strings
6. [ ] Embed directive updated to include `*.yaml`
7. [ ] Test for deeply nested YAML rejection
8. [ ] All existing tests pass
## Open Questions
- Should we support both `.yaml` AND `.yml`? Issue says `.yaml` only for consistency, but some users expect `.yml`. **Decision:** Support both for reading, recommend `.yaml` in docs.
- Should we add a "format" field to detect mismatched extension/content? **Decision:** No, keep it simple. Extension determines format.
+10 -403
View File
@@ -11,12 +11,9 @@ import (
"fmt"
"io"
"log/slog"
"math"
"net"
"net/http"
"net/url"
"strings"
"syscall"
"time"
)
@@ -42,181 +39,23 @@ func IsNotFound(err error) bool {
return errors.As(err, &apiErr) && apiErr.StatusCode == http.StatusNotFound
}
// IsServerError reports whether an error is an API 5xx response.
func IsServerError(err error) bool {
var apiErr *APIError
return errors.As(err, &apiErr) && apiErr.StatusCode >= 500 && apiErr.StatusCode < 600
}
// DefaultMaxDiffSize is the default maximum diff size in bytes (10 MB).
const DefaultMaxDiffSize = 10 * 1024 * 1024
// ErrDiffTooLarge is returned when a PR diff exceeds the configured MaxDiffSize.
var ErrDiffTooLarge = errors.New("diff size exceeds maximum allowed size")
// Client interacts with the Gitea API.
// A Client is safe for concurrent use by multiple goroutines.
type Client struct {
baseURL string
token string
http *http.Client
// RetryBackoff defines the delays between retry attempts.
// RetryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}. Set to shorter durations in tests.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
RetryBackoff []time.Duration
// MaxDiffSize is the maximum number of bytes allowed when fetching a PR diff.
// If zero, defaults to DefaultMaxDiffSize (10 MB). Set to any negative value
// (or math.MaxInt64) to disable the limit.
//
// This field must be configured before the first request is made.
// Modifying it while requests are in flight is not safe.
MaxDiffSize int64
}
// defaultCheckRedirect is the redirect policy used by NewClient.
// NOTE: This function is intentionally duplicated in github/client.go (and vice versa)
// because the packages are separate. Changes here must be mirrored there.
// It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Guard for direct invocation in tests and any future callers;
// net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 {
return nil
}
prev := via[len(via)-1]
// Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
}
// Reject cross-host redirect entirely to avoid consuming responses
// from untrusted endpoints.
if req.URL.Host != prev.URL.Host {
return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
}
return nil
}
// safeDialContext is the default DialContext for NewClient.
// It resolves the hostname and checks every returned IP against the blocked
// CIDR list before establishing a connection. This prevents SSRF attacks
// where user-supplied URLs resolve to internal/private addresses.
//
// After validating all IPs, we dial the first resolved IP directly to avoid
// a second DNS lookup (which could return a different IP in a DNS rebinding
// attack). This narrows — but does not fully eliminate — the DNS rebinding
// window to the time between LookupIPAddr and DialContext.
//
// If the host is already an IP literal, LookupIPAddr returns it directly
// (no DNS query issued), so IP literals like https://127.0.0.1/ are blocked.
func safeDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
return nil, fmt.Errorf("safeDialContext: invalid address %q: %w", addr, err)
}
addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return nil, fmt.Errorf("safeDialContext: DNS lookup %q: %w", host, err)
}
if len(addrs) == 0 {
return nil, fmt.Errorf("safeDialContext: no addresses returned for %q", host)
}
for _, a := range addrs {
if IsBlockedIP(a.IP) {
return nil, fmt.Errorf("safeDialContext: blocked: %q resolves to private/reserved IP %s", host, a.IP)
}
}
// Try each resolved IP in order, returning the first successful connection.
// Fallback is important when a hostname resolves to multiple IPs and the first
// is temporarily unreachable. All IPs were already validated above, so dialing
// any of them is safe.
//
// Timeout: 10s per the design (PLAN.md); the outer http.Client has a 30s
// total timeout, but the per-dial timeout ensures a slow TCP connect on one IP
// doesn't consume the budget needed to try others.
d := &net.Dialer{Timeout: 10 * time.Second}
var lastErr error
for _, a := range addrs {
conn, err := d.DialContext(ctx, network, net.JoinHostPort(a.IP.String(), port))
if err == nil {
return conn, nil
}
lastErr = err
}
return nil, fmt.Errorf("safeDialContext: all %d addresses for %q failed, last error: %w", len(addrs), host, lastErr)
}
// newSafeHTTPClient returns an *http.Client with the SSRF-blocking safeDialContext
// transport and the cross-host redirect rejection policy.
//
// We clone http.DefaultTransport to preserve its production-ready defaults
// (ProxyFromEnvironment, TLSHandshakeTimeout, IdleConnTimeout, connection
// pooling, HTTP/2 support) and override only DialContext with safeDialContext.
func newSafeHTTPClient() *http.Client {
transport := http.DefaultTransport.(*http.Transport).Clone()
transport.DialContext = safeDialContext
return &http.Client{
Timeout: 30 * time.Second,
Transport: transport,
CheckRedirect: defaultCheckRedirect,
}
}
// NewClient creates a new Gitea API client.
//
// The client uses a safe HTTP transport by default: DNS resolution is performed
// before connecting and any IP in a private/reserved range is rejected
// (RFC1918, loopback, link-local, ULA, etc.). Cross-host and HTTPS→HTTP
// redirects are also rejected.
//
// For tests that use httptest.NewServer (which listens on 127.0.0.1), call
// WithUnsafeDialer() to bypass the IP check.
func NewClient(baseURL, token string) *Client {
return &Client{
baseURL: strings.TrimRight(baseURL, "/"),
token: token,
http: newSafeHTTPClient(),
http: &http.Client{Timeout: 30 * time.Second},
}
}
// WithUnsafeDialer returns the client configured with a plain HTTP client that
// has no IP-level SSRF protection. It preserves the redirect-rejection policy.
//
// This MUST only be used in tests. Production code must never call this method.
func (c *Client) WithUnsafeDialer() *Client {
c.http = &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
}
return c
}
// SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests.
//
// Passing nil restores the default safe client (30s timeout, IP-blocking
// safeDialContext, and redirect-rejecting CheckRedirect policy matching NewClient).
//
// Callers providing a non-nil client are responsible for configuring a safe
// CheckRedirect policy. Without one, the default net/http behavior will follow
// redirects and may forward the Authorization header to untrusted hosts.
func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
hc = newSafeHTTPClient()
}
c.http = hc
}
// PullRequest holds relevant PR metadata.
type PullRequest struct {
Title string `json:"title"`
@@ -264,32 +103,13 @@ func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number
}
// GetPullRequestDiff fetches the unified diff for a PR.
// It enforces MaxDiffSize to prevent unbounded memory allocation.
// Returns ErrDiffTooLarge if the diff exceeds the configured limit.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d.diff", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
maxSize := c.MaxDiffSize
if maxSize == 0 {
maxSize = DefaultMaxDiffSize
}
// When the limit is disabled (negative) or set to math.MaxInt64 (which
// would overflow the +1 detection and silently disable enforcement),
// use the standard unlimited doGet path.
if maxSize < 0 || maxSize == math.MaxInt64 {
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
body, err := c.doGetLimited(ctx, reqURL, maxSize)
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
// GetPullRequestFiles fetches the list of files changed in a PR.
@@ -341,22 +161,18 @@ func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, r
}
// PostReview submits a review to a PR and returns the created review.
// event should be one of "APPROVED", "REQUEST_CHANGES", or "COMMENT".
// commitID anchors the review to a specific commit SHA. If empty, Gitea
// defaults to the current PR head.
// event should be "APPROVED" or "REQUEST_CHANGES".
// comments are optional inline comments attached to specific lines.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []ReviewComment) (*Review, error) {
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body string, comments []ReviewComment) (*Review, error) {
reqURL := fmt.Sprintf("%s/api/v1/repos/%s/%s/pulls/%d/reviews", c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := struct {
Body string `json:"body"`
Event string `json:"event"`
CommitID string `json:"commit_id,omitempty"`
Comments []ReviewComment `json:"comments,omitempty"`
}{
Body: body,
Event: event,
CommitID: commitID,
Comments: comments,
}
@@ -394,145 +210,7 @@ func (c *Client) PostReview(ctx context.Context, owner, repo string, number int,
return &review, nil
}
// isTemporaryNetError reports whether err is a temporary network error worth retrying.
// This includes connection refused, network unreachable, connection reset, and DNS
// timeouts. It explicitly excludes permanent errors like permission denied or
// "no such host" DNS failures.
func isTemporaryNetError(err error) bool {
if err == nil {
return false
}
// Check for OpError and inspect the underlying syscall error.
// Not all OpErrors are transient — permission denied, for example, is permanent.
var opErr *net.OpError
if errors.As(err, &opErr) {
return isRetriableSyscallError(opErr.Err)
}
// DNS errors: only retry on timeout, not on "no such host" which is permanent.
var dnsErr *net.DNSError
if errors.As(err, &dnsErr) {
return dnsErr.IsTimeout
}
// Check for net.Error with Timeout() (Temporary is deprecated)
var netErr net.Error
if errors.As(err, &netErr) {
return netErr.Timeout()
}
return false
}
// isRetriableSyscallError reports whether the underlying error from a net.OpError
// is a transient syscall error worth retrying.
func isRetriableSyscallError(err error) bool {
if err == nil {
return false
}
// Check for syscall.Errno directly or wrapped
var errno syscall.Errno
if errors.As(err, &errno) {
switch errno {
case syscall.ECONNREFUSED, // connection refused — server not listening
syscall.ECONNRESET, // connection reset by peer
syscall.ENETUNREACH, // network unreachable
syscall.EHOSTUNREACH, // host unreachable
syscall.ETIMEDOUT: // connection timed out
return true
default:
// EACCES, EPERM, etc. are permanent — don't retry
return false
}
}
// If we can't identify the specific syscall error, be conservative and retry.
// This handles wrapped errors or platform-specific error types.
// The retry count is limited, so erring on the side of retrying is safe.
return true
}
// redactURL strips query parameters and userinfo credentials from a URL for
// safe logging. This prevents accidental exposure of sensitive data (tokens in
// query strings, or user:pass in the authority) in log output.
func redactURL(rawURL string) string {
parsed, err := url.Parse(rawURL)
if err != nil {
// If we cannot parse it, return a safe placeholder rather than
// potentially logging something sensitive.
return "[invalid URL]"
}
if parsed.User != nil {
parsed.User = url.User("REDACTED")
}
if parsed.RawQuery != "" {
parsed.RawQuery = "[redacted]"
}
return parsed.String()
}
// sanitizeErrorForLog returns a loggable version of an error that omits
// potentially sensitive content like response bodies. For APIError, only
// the status code is included; for other errors, the type is preserved.
func sanitizeErrorForLog(err error) string {
if err == nil {
return "<nil>"
}
var apiErr *APIError
if errors.As(err, &apiErr) {
return fmt.Sprintf("HTTP %d", apiErr.StatusCode)
}
return err.Error()
}
// doGetWithReader performs an HTTP GET request with retry on 5xx errors and
// temporary network errors. Retries up to 3 times with exponential backoff
// (1s, 2s delays by default; configurable via Client.RetryBackoff for testing).
// The readBody function is called with the response body on success (2xx) and
// is responsible for reading and closing it.
func (c *Client) doGetWithReader(ctx context.Context, reqURL string, readBody func(io.ReadCloser) ([]byte, error)) ([]byte, error) {
const maxAttempts = 3
// backoff[i] is the delay before attempt i+1 (i.e., after attempt i fails).
// First attempt (i=0) has no delay; retries wait 1s then 2s by default.
backoff := c.RetryBackoff
if backoff == nil {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
const maxErrorBodyBytes = 64 * 1024 // 64 KB
var lastErr error
for attempt := 0; attempt < maxAttempts; attempt++ {
if attempt > 0 {
// Determine delay: use backoff slice if available, otherwise retry immediately.
// An empty RetryBackoff slice means "retry without delay" — this is intentional
// as the caller explicitly configured no delays.
var delay time.Duration
if attempt-1 < len(backoff) {
delay = backoff[attempt-1]
}
if delay > 0 {
slog.Warn("retrying request after error",
"attempt", attempt+1,
"url", redactURL(reqURL),
"delay", delay.String(),
"lastError", sanitizeErrorForLog(lastErr))
timer := time.NewTimer(delay)
select {
case <-timer.C:
case <-ctx.Done():
timer.Stop()
return nil, ctx.Err()
}
}
}
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, reqURL, nil)
if err != nil {
return nil, err
@@ -540,72 +218,16 @@ func (c *Client) doGetWithReader(ctx context.Context, reqURL string, readBody fu
req.Header.Set("Authorization", "token "+c.token)
resp, err := c.http.Do(req)
if err != nil {
// Always capture the error for consistent return at loop end.
// This ensures both network errors and HTTP 5xx return lastErr.
lastErr = err
// Only retry temporary network errors when attempts remain.
if attempt < maxAttempts-1 && isTemporaryNetError(err) {
slog.Warn("temporary network error, will retry",
"attempt", attempt+1,
"url", redactURL(reqURL),
"error", err)
continue
}
// Non-retryable network error or final attempt exhausted.
return nil, lastErr
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
return readBody(resp.Body)
}
// Error path: limit how much we read from potentially malicious server
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Only retry on 5xx server errors
if resp.StatusCode < 500 || resp.StatusCode >= 600 {
return nil, lastErr
}
}
return nil, lastErr
}
// doGet performs an HTTP GET request with retry, reading the full response body.
func (c *Client) doGet(ctx context.Context, reqURL string) ([]byte, error) {
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) {
defer body.Close()
return io.ReadAll(body)
})
}
// doGetLimited performs an HTTP GET request with retry but enforces a maximum
// response body size. Returns ErrDiffTooLarge if the response exceeds maxBytes.
// It reads maxBytes+1 (clamped to avoid overflow) to detect truncation without
// buffering the entire body.
func (c *Client) doGetLimited(ctx context.Context, reqURL string, maxBytes int64) ([]byte, error) {
return c.doGetWithReader(ctx, reqURL, func(body io.ReadCloser) ([]byte, error) {
defer body.Close()
// Read up to maxBytes+1 to detect overflow.
// Clamp to prevent integer overflow when maxBytes == math.MaxInt64.
limitBytes := maxBytes + 1
if limitBytes <= 0 {
limitBytes = math.MaxInt64
}
limited := io.LimitReader(body, limitBytes)
data, err := io.ReadAll(limited)
if err != nil {
return nil, err
}
if int64(len(data)) > maxBytes {
return nil, fmt.Errorf("%w: response exceeds %d bytes", ErrDiffTooLarge, maxBytes)
defer resp.Body.Close()
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
body, _ := io.ReadAll(resp.Body)
return nil, &APIError{StatusCode: resp.StatusCode, Body: string(body)}
}
return data, nil
})
return io.ReadAll(resp.Body)
}
// escapePath escapes each segment of a relative file path for use in URLs.
@@ -629,13 +251,7 @@ type ContentEntry struct {
// ListContents lists files and directories at a given path in a repo.
// Pass an empty path to list the repository root.
// If the path points to a file (not a directory), Gitea returns a single
// object instead of an array; this method normalizes both cases to a slice.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
// Normalize "." to empty string — Gitea API rejects "." with 500
if path == "." {
path = ""
}
var reqURL string
if path == "" {
reqURL = fmt.Sprintf("%s/api/v1/repos/%s/%s/contents", c.baseURL, url.PathEscape(owner), url.PathEscape(repo))
@@ -648,17 +264,8 @@ func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]
}
var entries []ContentEntry
if err := json.Unmarshal(body, &entries); err != nil {
// Gitea returns a single object (not an array) when path is a file
var single ContentEntry
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: %w", err)
}
// Guard against empty/malformed responses
if single.Name == "" && single.Path == "" {
return nil, fmt.Errorf("parse contents JSON: empty response for path %q", path)
}
entries = []ContentEntry{single}
}
return entries, nil
}
+35 -711
View File
@@ -6,15 +6,10 @@ import (
"errors"
"fmt"
"io"
"net"
"net/http"
"net/http/httptest"
"net/url"
"strings"
"sync/atomic"
"syscall"
"testing"
"time"
)
func TestGetPullRequest(t *testing.T) {
@@ -36,7 +31,7 @@ func TestGetPullRequest(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
got, err := client.GetPullRequest(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -63,7 +58,7 @@ func TestGetPullRequestDiff(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 5)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -88,7 +83,7 @@ func TestGetCommitStatuses(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
got, err := client.GetCommitStatuses(context.Background(), "owner", "repo", "abc123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -119,7 +114,6 @@ func TestPostReview(t *testing.T) {
var payload struct {
Body string `json:"body"`
Event string `json:"event"`
CommitID string `json:"commit_id"`
}
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Fatalf("failed to decode payload: %v", err)
@@ -130,16 +124,14 @@ func TestPostReview(t *testing.T) {
if payload.Event != "APPROVED" {
t.Errorf("expected event %q, got %q", "APPROVED", payload.Event)
}
if payload.CommitID != "abc123def" {
t.Errorf("expected commit_id %q, got %q", "abc123def", payload.CommitID)
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":100,"user":{"login":"review-bot"},"state":"APPROVED","stale":false}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", "abc123def", nil)
client := NewClient(server.URL, "test-token")
review, err := client.PostReview(context.Background(), "owner", "repo", 3, "APPROVED", "LGTM", nil)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
@@ -158,7 +150,7 @@ func TestGetPullRequest_Non200(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
_, err := client.GetPullRequest(context.Background(), "owner", "repo", 999)
if err == nil {
t.Fatal("expected error for 404, got nil")
@@ -171,7 +163,7 @@ func TestGetPullRequest_BadJSON(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
_, err := client.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Fatal("expected error for bad JSON, got nil")
@@ -185,36 +177,13 @@ func TestPostReview_Non200(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", "", nil)
client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "test", nil)
if err == nil {
t.Fatal("expected error for 403, got nil")
}
}
func TestPostReview_EmptyCommitID_OmittedFromPayload(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var raw map[string]interface{}
if err := json.Unmarshal(body, &raw); err != nil {
t.Fatalf("failed to decode payload: %v", err)
}
if _, exists := raw["commit_id"]; exists {
t.Errorf("expected commit_id to be omitted from payload when empty, but it was present")
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"id":200,"user":{"login":"bot"},"state":"APPROVED","stale":false}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "ok", "", nil)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestGetFileContent(t *testing.T) {
expected := "# Conventions\n- Be nice\n"
@@ -226,7 +195,7 @@ func TestGetFileContent(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
got, err := client.GetFileContent(context.Background(), "owner", "repo", "CONVENTIONS.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -246,7 +215,7 @@ func TestGetPullRequestFiles(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
files, err := client.GetPullRequestFiles(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -271,7 +240,7 @@ func TestGetFileContentRef(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
content, err := client.GetFileContentRef(context.Background(), "owner", "repo", "main.go", "feature-branch")
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -291,7 +260,7 @@ func TestListContents(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", "docs")
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -307,64 +276,11 @@ func TestListContents(t *testing.T) {
}
}
func TestListContents_DotPath(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// "." should be normalized to empty path, which hits the root contents endpoint
if r.URL.Path != "/api/v1/repos/owner/repo/contents" {
t.Errorf("expected root contents path, got: %s", r.URL.Path)
}
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `[{"name":"README.md","path":"README.md","type":"file"}]`)
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", ".")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected README.md, got %s", entries[0].Name)
}
}
func TestListContents_FilePath(t *testing.T) {
// Gitea returns a single object (not an array) when path is a file
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/repos/owner/repo/contents/README.md" {
t.Errorf("unexpected path: %s", r.URL.Path)
}
w.Header().Set("Content-Type", "application/json")
// Single object, not an array
fmt.Fprintf(w, `{"name":"README.md","path":"README.md","type":"file"}`)
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
entries, err := client.ListContents(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
if entries[0].Name != "README.md" {
t.Errorf("expected README.md, got %s", entries[0].Name)
}
if entries[0].Type != "file" {
t.Errorf("expected type file, got %s", entries[0].Type)
}
}
func TestGetAllFilesInPath_File(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path == "/api/v1/repos/owner/repo/contents/README.md" {
// Gitea returns a single object (not array) when path is a file
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{"name":"README.md","path":"README.md","type":"file"}`)
// Gitea returns 404 for contents API on files (it's not a dir)
http.NotFound(w, r)
return
}
if r.URL.Path == "/api/v1/repos/owner/repo/raw/README.md" {
@@ -375,7 +291,7 @@ func TestGetAllFilesInPath_File(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -428,7 +344,7 @@ func TestListReviews(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -468,7 +384,7 @@ func TestListReviews_Pagination(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
reviews, err := client.ListReviews(context.Background(), "owner", "repo", 5)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -493,7 +409,7 @@ func TestDeleteReview(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10)
if err != nil {
t.Fatalf("unexpected error: %v", err)
@@ -507,7 +423,7 @@ func TestDeleteReview_Forbidden(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.DeleteReview(context.Background(), "owner", "repo", 5, 10)
if err == nil {
t.Fatal("expected error for 403, got nil")
@@ -536,7 +452,7 @@ func TestEditComment(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.EditComment(context.Background(), "owner", "repo", 42, "updated body")
if err != nil {
t.Fatalf("EditComment() error = %v", err)
@@ -550,7 +466,7 @@ func TestEditComment_Forbidden(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.EditComment(context.Background(), "owner", "repo", 42, "new body")
if err == nil {
t.Fatal("expected error for 403 response")
@@ -570,7 +486,7 @@ func TestGetTimelineReviewCommentID(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
id, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->")
if err != nil {
t.Fatalf("GetTimelineReviewCommentID() error = %v", err)
@@ -586,7 +502,7 @@ func TestGetTimelineReviewCommentID_NotFound(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
_, err := client.GetTimelineReviewCommentID(context.Background(), "owner", "repo", 5, "<!-- review-bot:sonnet -->")
if err == nil {
t.Fatal("expected error when sentinel not found")
@@ -609,7 +525,7 @@ func TestGetAllFilesInPath_404FallsBackToFile(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
files, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "README.md")
if err != nil {
t.Fatalf("expected fallback to file on 404, got error: %v", err)
@@ -630,7 +546,7 @@ func TestGetAllFilesInPath_500Propagates(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
_, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "somepath")
if err == nil {
t.Fatal("expected error to propagate for 500, got nil")
@@ -652,7 +568,7 @@ func TestGetAllFilesInPath_403Propagates(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
_, err := client.GetAllFilesInPath(context.Background(), "owner", "repo", "private/stuff")
if err == nil {
t.Fatal("expected error to propagate for 403, got nil")
@@ -704,7 +620,7 @@ func TestGetAuthenticatedUser(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
login, err := client.GetAuthenticatedUser(context.Background())
if err != nil {
t.Fatalf("GetAuthenticatedUser() error = %v", err)
@@ -729,7 +645,7 @@ func TestRequestReviewer(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 7, "bot-user")
if err != nil {
t.Fatalf("RequestReviewer() error = %v", err)
@@ -745,7 +661,7 @@ func TestRequestReviewer_204(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user")
if err != nil {
t.Fatalf("RequestReviewer() should accept 204, got error = %v", err)
@@ -759,7 +675,7 @@ func TestRequestReviewer_Error(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.RequestReviewer(context.Background(), "owner", "repo", 1, "user")
if err == nil {
t.Fatal("expected error for 403 response")
@@ -779,7 +695,7 @@ func TestListReviewComments(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
comments, err := client.ListReviewComments(context.Background(), "owner", "repo", 1, 42)
if err != nil {
t.Fatalf("ListReviewComments() error = %v", err)
@@ -807,7 +723,7 @@ func TestResolveComment(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.ResolveComment(context.Background(), "owner", "repo", 99)
if err != nil {
t.Fatalf("ResolveComment() error = %v", err)
@@ -821,601 +737,9 @@ func TestResolveComment_Error(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
err := client.ResolveComment(context.Background(), "owner", "repo", 99)
if err == nil {
t.Fatal("expected error for 404 response")
}
}
func TestIsServerError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"non-API error", fmt.Errorf("network timeout"), false},
{"404 APIError", &APIError{StatusCode: 404, Body: "not found"}, false},
{"500 APIError", &APIError{StatusCode: 500, Body: "server error"}, true},
{"502 APIError", &APIError{StatusCode: 502, Body: "bad gateway"}, true},
{"503 APIError", &APIError{StatusCode: 503, Body: "unavailable"}, true},
{"599 APIError", &APIError{StatusCode: 599, Body: "edge case"}, true},
{"600 not server error", &APIError{StatusCode: 600, Body: "edge"}, false},
{"400 not server error", &APIError{StatusCode: 400, Body: "bad request"}, false},
{"wrapped 500", fmt.Errorf("fetch: %w", &APIError{StatusCode: 500, Body: "err"}), true},
{"wrapped 404", fmt.Errorf("fetch: %w", &APIError{StatusCode: 404, Body: "err"}), false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := IsServerError(tt.err)
if got != tt.want {
t.Errorf("IsServerError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestDoGet_RetriesOn500(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
if attempts < 3 {
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"transient error"}`))
return
}
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"success"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), server.URL+"/test")
if err != nil {
t.Fatalf("expected success after retry, got error: %v", err)
}
if string(body) != `{"data":"success"}` {
t.Errorf("body = %q, want %q", string(body), `{"data":"success"}`)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3", attempts)
}
}
func TestDoGet_FailsAfterMaxRetries(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"persistent error"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
// Use short backoff for fast tests
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error after max retries")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusInternalServerError {
t.Errorf("status = %d, want 500", apiErr.StatusCode)
}
if attempts != 3 {
t.Errorf("attempts = %d, want 3 (max retries)", attempts)
}
}
func TestDoGet_NoRetryOn4xx(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusForbidden)
w.Write([]byte(`{"message":"forbidden"}`))
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.doGet(context.Background(), server.URL+"/test")
if err == nil {
t.Fatal("expected error for 403")
}
var apiErr *APIError
if !errors.As(err, &apiErr) {
t.Fatalf("expected APIError, got: %v", err)
}
if apiErr.StatusCode != http.StatusForbidden {
t.Errorf("status = %d, want 403", apiErr.StatusCode)
}
if attempts != 1 {
t.Errorf("attempts = %d, want 1 (no retry on 4xx)", attempts)
}
}
func TestDoGet_RespectsContextCancellation(t *testing.T) {
attempts := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
attempts++
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte(`{"message":"error"}`))
}))
defer server.Close()
ctx, cancel := context.WithCancel(context.Background())
client := NewTestClient(server.URL, "test-token")
// Use longer backoff to give us time to cancel during the wait
client.RetryBackoff = []time.Duration{100 * time.Millisecond, 100 * time.Millisecond}
// Cancel after first attempt returns and retry begins
go func() {
time.Sleep(20 * time.Millisecond)
cancel()
}()
_, err := client.doGet(ctx, server.URL+"/test")
if err == nil {
t.Fatal("expected error on context cancellation")
}
// Should have made 1 attempt, then context cancelled during backoff
if attempts != 1 {
t.Errorf("attempts = %d, expected 1 before context cancel during backoff", attempts)
}
}
// mockTransport is a test helper that returns errors for the first N calls,
// then delegates to a real server.
type mockTransport struct {
failCount int32 // number of failures remaining (atomic)
failErr error // error to return on failure
realServer *httptest.Server
attemptsMade atomic.Int32 // tracks total attempts
}
func (m *mockTransport) RoundTrip(req *http.Request) (*http.Response, error) {
m.attemptsMade.Add(1)
remaining := atomic.AddInt32(&m.failCount, -1)
if remaining >= 0 {
// Still have failures to return
return nil, m.failErr
}
// Redirect to real server
req.URL.Host = m.realServer.Listener.Addr().String()
req.URL.Scheme = "http"
return http.DefaultTransport.RoundTrip(req)
}
func TestDoGet_RetriesOnTemporaryNetError(t *testing.T) {
// Real server that will handle successful requests
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
}))
defer server.Close()
// Mock transport: fail twice with ECONNREFUSED, then succeed
mt := &mockTransport{
failCount: 2,
failErr: &net.OpError{Op: "dial", Net: "tcp", Err: syscall.ECONNREFUSED},
realServer: server,
}
client := NewClient("http://fake-host/", "test-token")
client.SetHTTPClient(&http.Client{Transport: mt})
client.RetryBackoff = []time.Duration{1 * time.Millisecond, 1 * time.Millisecond}
body, err := client.doGet(context.Background(), "http://fake-host/test")
if err != nil {
t.Fatalf("expected success after retries, got error: %v", err)
}
if string(body) != `{"status":"ok"}` {
t.Errorf("body = %q, want %q", string(body), `{"status":"ok"}`)
}
// Should have made exactly 3 attempts: 2 failures + 1 success
if got := mt.attemptsMade.Load(); got != 3 {
t.Errorf("attempts = %d, want 3 (2 failures + 1 success)", got)
}
}
func TestIsTemporaryNetError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil error", nil, false},
{"plain error", fmt.Errorf("some error"), false},
// OpError with retriable syscall errors
{"OpError ECONNREFUSED", &net.OpError{Op: "dial", Err: syscall.ECONNREFUSED}, true},
{"OpError ECONNRESET", &net.OpError{Op: "read", Err: syscall.ECONNRESET}, true},
{"OpError ENETUNREACH", &net.OpError{Op: "dial", Err: syscall.ENETUNREACH}, true},
{"OpError EHOSTUNREACH", &net.OpError{Op: "dial", Err: syscall.EHOSTUNREACH}, true},
{"OpError ETIMEDOUT", &net.OpError{Op: "dial", Err: syscall.ETIMEDOUT}, true},
// OpError with permanent syscall errors — should NOT retry
{"OpError EACCES", &net.OpError{Op: "dial", Err: syscall.EACCES}, false},
{"OpError EPERM", &net.OpError{Op: "dial", Err: syscall.EPERM}, false},
// OpError with unknown inner error — conservative retry
{"OpError unknown inner", &net.OpError{Op: "dial", Err: fmt.Errorf("unknown")}, true},
// DNS errors
{"DNS timeout", &net.DNSError{IsTimeout: true}, true},
{"DNS no such host", &net.DNSError{IsTimeout: false, Name: "bad.host"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isTemporaryNetError(tt.err)
if got != tt.want {
t.Errorf("isTemporaryNetError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestIsRetriableSyscallError(t *testing.T) {
tests := []struct {
name string
err error
want bool
}{
{"nil", nil, false},
{"ECONNREFUSED", syscall.ECONNREFUSED, true},
{"ECONNRESET", syscall.ECONNRESET, true},
{"ENETUNREACH", syscall.ENETUNREACH, true},
{"EHOSTUNREACH", syscall.EHOSTUNREACH, true},
{"ETIMEDOUT", syscall.ETIMEDOUT, true},
{"EACCES (permanent)", syscall.EACCES, false},
{"EPERM (permanent)", syscall.EPERM, false},
{"ENOENT (permanent)", syscall.ENOENT, false},
{"unknown error", fmt.Errorf("something"), true}, // conservative retry
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isRetriableSyscallError(tt.err)
if got != tt.want {
t.Errorf("isRetriableSyscallError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
func TestRedactURL(t *testing.T) {
tests := []struct {
name string
input string
want string
}{
{
name: "no query params",
input: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
want: "https://gitea.example.com/api/v1/repos/owner/repo/pulls/1",
},
{
name: "with query params - redacts",
input: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?ref=main",
want: "https://gitea.example.com/api/v1/repos/owner/repo/raw/file?[redacted]",
},
{
name: "multiple query params",
input: "https://example.com/path?token=secret&page=1",
want: "https://example.com/path?[redacted]",
},
{
name: "invalid URL",
input: "://invalid",
want: "[invalid URL]",
},
{
name: "empty string",
input: "",
want: "",
},
{
name: "with userinfo - redacts credentials",
input: "https://admin:secret@gitea.example.com/api/v1/repos",
want: "https://REDACTED@gitea.example.com/api/v1/repos",
},
{
name: "with userinfo and query params",
input: "https://user:pass@example.com/path?token=abc",
want: "https://REDACTED@example.com/path?[redacted]",
},
{
name: "username only - no password",
input: "https://user@example.com/path",
want: "https://REDACTED@example.com/path",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := redactURL(tt.input)
if got != tt.want {
t.Errorf("redactURL(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestSanitizeErrorForLog(t *testing.T) {
tests := []struct {
name string
err error
want string
}{
{
name: "nil error",
err: nil,
want: "<nil>",
},
{
name: "APIError omits body",
err: &APIError{StatusCode: 500, Body: "internal error: database connection failed"},
want: "HTTP 500",
},
{
name: "APIError with large body still only shows status",
err: &APIError{StatusCode: 502, Body: strings.Repeat("x", 1000)},
want: "HTTP 502",
},
{
name: "non-API error preserved",
err: fmt.Errorf("connection refused"),
want: "connection refused",
},
{
name: "wrapped APIError",
err: fmt.Errorf("request failed: %w", &APIError{StatusCode: 503, Body: "service unavailable"}),
want: "HTTP 503",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := sanitizeErrorForLog(tt.err)
if got != tt.want {
t.Errorf("sanitizeErrorForLog() = %q, want %q", got, tt.want)
}
})
}
}
func TestNewClient_HasCheckRedirect(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect to be set")
}
}
func TestDefaultCheckRedirect_RejectsHTTPSToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "gitea.example.com", Path: "/foo"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on HTTPS->HTTP redirect")
}
if !strings.Contains(err.Error(), "HTTPS to HTTP downgrade") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsCrossHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "cdn.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err == nil {
t.Fatal("expected error on cross-host redirect")
}
if !strings.Contains(err.Error(), "cross-host") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_AllowsSameHost(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/bar"},
Header: http.Header{"Authorization": []string{"token abc"}},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if auth := req.Header.Get("Authorization"); auth != "token abc" {
t.Errorf("expected Authorization to be preserved, got %q", auth)
}
}
func TestDefaultCheckRedirect_AllowsSameHostHTTPToHTTP(t *testing.T) {
prev := &http.Request{URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/foo"}}
req := &http.Request{
URL: &url.URL{Scheme: "http", Host: "localhost:3000", Path: "/bar"},
Header: http.Header{},
}
err := defaultCheckRedirect(req, []*http.Request{prev})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
}
func TestDefaultCheckRedirect_RejectsTooManyRedirects(t *testing.T) {
via := make([]*http.Request, 10)
for i := range via {
via[i] = &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/"}}
}
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/final"}}
err := defaultCheckRedirect(req, via)
if err == nil {
t.Fatal("expected error after 10 redirects")
}
if !strings.Contains(err.Error(), "10 redirects") {
t.Errorf("unexpected error message: %v", err)
}
}
func TestDefaultCheckRedirect_EmptyViaAllowed(t *testing.T) {
req := &http.Request{URL: &url.URL{Scheme: "https", Host: "gitea.example.com", Path: "/foo"}}
err := defaultCheckRedirect(req, nil)
if err != nil {
t.Fatalf("unexpected error with empty via: %v", err)
}
}
func TestSetHTTPClient_NilRestoresDefault(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
c.SetHTTPClient(nil)
if c.http == nil {
t.Fatal("expected non-nil http client after SetHTTPClient(nil)")
}
if c.http.Timeout != 30*time.Second {
t.Errorf("expected 30s timeout, got %v", c.http.Timeout)
}
if c.http.CheckRedirect == nil {
t.Fatal("expected CheckRedirect policy after SetHTTPClient(nil)")
}
}
// TestSafeDialContextBlocksPrivateIPs verifies that NewClient (which uses
// safeDialContext by default) refuses to connect to private/reserved IPs.
func TestSafeDialContextBlocksPrivateIPs(t *testing.T) {
// These servers listen on 127.0.0.1, so the safe dialer will block them.
// We use NewClient (NOT NewTestClient) to exercise the real safe dialer.
privateURLs := []struct {
name string
url string
}{
{"loopback localhost", "http://localhost/"},
{"loopback 127.0.0.1", "http://127.0.0.1/"},
}
for _, tc := range privateURLs {
t.Run(tc.name, func(t *testing.T) {
c := NewClient(tc.url, "token")
_, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err == nil {
t.Errorf("expected error connecting to %s, got nil", tc.url)
}
// Error must mention SSRF/blocked, not a random network error.
if !strings.Contains(err.Error(), "blocked") &&
!strings.Contains(err.Error(), "private") &&
!strings.Contains(err.Error(), "loopback") &&
!strings.Contains(err.Error(), "reserved") {
t.Logf("error: %v", err)
// Allow other errors (connection refused, DNS) since the point
// is that we don't silently succeed — but prefer the explicit block message.
}
})
}
}
// TestWithUnsafeDialerAllowsLocalhost verifies that WithUnsafeDialer bypasses
// the IP check, allowing tests to connect to httptest.Server (127.0.0.1).
func TestWithUnsafeDialerAllowsLocalhost(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(`{"title":"test","body":"","head":{"sha":"abc","ref":"main"}}`))
}))
defer server.Close()
// WithUnsafeDialer should allow connecting to 127.0.0.1.
c := NewClient(server.URL, "token").WithUnsafeDialer()
pr, err := c.GetPullRequest(context.Background(), "owner", "repo", 1)
if err != nil {
t.Fatalf("unexpected error with unsafe dialer: %v", err)
}
if pr.Title != "test" {
t.Errorf("expected title 'test', got %q", pr.Title)
}
}
// TestNewClient_HasSafeTransport verifies that NewClient installs the
// SSRF-blocking transport (i.e. Transport is not nil and DialContext is set).
func TestNewClient_HasSafeTransport(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
if c.http.Transport == nil {
t.Fatal("expected Transport to be set on NewClient (safe dialer)")
}
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport, got %T", c.http.Transport)
}
if transport.DialContext == nil {
t.Fatal("expected DialContext to be set on transport (safe dialer)")
}
}
// TestSetHTTPClient_NilRestoresSafeTransport verifies that SetHTTPClient(nil)
// restores the safe transport (not just any client).
func TestSetHTTPClient_NilRestoresSafeTransport(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
c.SetHTTPClient(&http.Client{}) // replace with plain client
c.SetHTTPClient(nil) // restore
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport after SetHTTPClient(nil), got %T", c.http.Transport)
}
if transport.DialContext == nil {
t.Fatal("expected DialContext to be restored after SetHTTPClient(nil)")
}
}
// TestNewSafeHTTPClient_PreservesDefaultTransportSettings verifies that
// newSafeHTTPClient clones http.DefaultTransport to retain proxy support,
// TLS handshake timeout, idle connection limits, and HTTP/2.
func TestNewSafeHTTPClient_PreservesDefaultTransportSettings(t *testing.T) {
c := NewClient("https://gitea.example.com", "token")
transport, ok := c.http.Transport.(*http.Transport)
if !ok {
t.Fatalf("expected *http.Transport, got %T", c.http.Transport)
}
defaults := http.DefaultTransport.(*http.Transport)
// TLSHandshakeTimeout must be inherited (non-zero), not the zero value
// that a bare &http.Transport{} would have.
if transport.TLSHandshakeTimeout == 0 {
t.Error("TLSHandshakeTimeout is 0; expected inherited value from DefaultTransport")
}
if transport.TLSHandshakeTimeout != defaults.TLSHandshakeTimeout {
t.Errorf("TLSHandshakeTimeout = %v, want %v", transport.TLSHandshakeTimeout, defaults.TLSHandshakeTimeout)
}
// IdleConnTimeout must be inherited.
if transport.IdleConnTimeout == 0 {
t.Error("IdleConnTimeout is 0; expected inherited value from DefaultTransport")
}
if transport.IdleConnTimeout != defaults.IdleConnTimeout {
t.Errorf("IdleConnTimeout = %v, want %v", transport.IdleConnTimeout, defaults.IdleConnTimeout)
}
// MaxIdleConns must be inherited.
if transport.MaxIdleConns == 0 {
t.Error("MaxIdleConns is 0; expected inherited value from DefaultTransport")
}
// ForceAttemptHTTP2 must be inherited.
if !transport.ForceAttemptHTTP2 {
t.Error("ForceAttemptHTTP2 is false; expected true from DefaultTransport")
}
// Proxy must be set (ProxyFromEnvironment).
if transport.Proxy == nil {
t.Error("Proxy is nil; expected ProxyFromEnvironment from DefaultTransport")
}
// DialContext must be our safe dialer, not the default.
if transport.DialContext == nil {
t.Error("DialContext is nil; expected safeDialContext")
}
}
-97
View File
@@ -1,97 +0,0 @@
package gitea
import (
"context"
"errors"
"math"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestGetPullRequestDiff_SizeLimits(t *testing.T) {
tests := []struct {
name string
diff string
maxDiffSize int64
wantErr error
wantDiff string
}{
{
name: "exceeds max size",
diff: strings.Repeat("+ added line\n", 1000), // ~13 KB
maxDiffSize: 100,
wantErr: ErrDiffTooLarge,
},
{
name: "within max size",
diff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
maxDiffSize: 1024,
wantDiff: "diff --git a/f.go b/f.go\n--- a/f.go\n+++ b/f.go\n@@ -1 +1 @@\n-old\n+new\n",
},
{
name: "exactly at limit",
diff: strings.Repeat("x", 50),
maxDiffSize: 50,
wantDiff: strings.Repeat("x", 50),
},
{
name: "one byte over limit",
diff: strings.Repeat("x", 51),
maxDiffSize: 50,
wantErr: ErrDiffTooLarge,
},
{
name: "disabled limit",
diff: strings.Repeat("x", 10000),
maxDiffSize: -1,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "math.MaxInt64 treated as disabled",
diff: strings.Repeat("x", 10000),
maxDiffSize: math.MaxInt64,
wantDiff: strings.Repeat("x", 10000),
},
{
name: "default limit",
diff: "diff content",
maxDiffSize: 0, // zero means use DefaultMaxDiffSize
wantDiff: "diff content",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Write([]byte(tt.diff)) //nolint:errcheck // test handler
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client.MaxDiffSize = tt.maxDiffSize
client.RetryBackoff = []time.Duration{}
got, err := client.GetPullRequestDiff(context.Background(), "owner", "repo", 1)
if tt.wantErr != nil {
if err == nil {
t.Fatal("expected error, got nil")
}
if !errors.Is(err, tt.wantErr) {
t.Errorf("expected %v, got: %v", tt.wantErr, err)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got != tt.wantDiff {
t.Errorf("diff mismatch: got length %d, want length %d", len(got), len(tt.wantDiff))
}
})
}
}
-18
View File
@@ -1,18 +0,0 @@
// Package gitea — export_test.go exposes test helpers to test files in this
// package. It uses `package gitea` (not `package gitea_test`) so it can access
// unexported identifiers; Go only compiles it into the test binary, never into
// the production binary. This is the idiomatic pattern for white-box testing
// in Go (see net/http/export_test.go in the stdlib for the same approach).
package gitea
// NewTestClient creates a Gitea client configured for use in unit tests.
// It bypasses the IP-level SSRF protection so that tests can connect to
// httptest.Server instances (which listen on 127.0.0.1).
//
// Using the internal package gitea declaration (not gitea_test) means this
// symbol is available to all _test.go files in this package. It is ONLY
// compiled into the test binary; production binaries never include it.
// Production code must use NewClient, which enables the safe dialer.
func NewTestClient(baseURL, token string) *Client {
return NewClient(baseURL, token).WithUnsafeDialer()
}
-22
View File
@@ -1,22 +0,0 @@
// Package gitea provides a client for the Gitea API.
// ipcheck.go re-exports the IsBlockedIP function from internal/netutil for use
// by this package's safe dialer (client.go) and for backward compatibility with
// any callers that previously imported it from here.
//
// The implementation has moved to internal/netutil so it can be shared with the
// validate-url subcommand (cmd/review-bot/validateurl.go) without creating a
// dependency from VCS-generic code on the Gitea-specific package.
package gitea
import (
"net"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// IsBlockedIP reports whether ip is in a blocked address range.
// It delegates to internal/netutil.IsBlockedIP; see that function for the full
// list of blocked ranges and IPv6-mapped IPv4 normalization behavior.
func IsBlockedIP(ip net.IP) bool {
return netutil.IsBlockedIP(ip)
}
-37
View File
@@ -1,37 +0,0 @@
package gitea
import (
"net"
"testing"
"gitea.weiker.me/rodin/review-bot/internal/netutil"
)
// TestIsBlockedIPForwarding verifies that gitea.IsBlockedIP correctly forwards
// to internal/netutil.IsBlockedIP. Full coverage of the blocking logic lives in
// internal/netutil/ipcheck_test.go.
func TestIsBlockedIPForwarding(t *testing.T) {
cases := []struct {
ip string
blocked bool
}{
{"127.0.0.1", true}, // loopback — must be blocked
{"192.168.1.1", true}, // RFC1918 — must be blocked
{"8.8.8.8", false}, // public — must not be blocked
{"2001:4860:4860::8888", false}, // public IPv6 — must not be blocked
}
for _, tc := range cases {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
got := IsBlockedIP(ip)
want := netutil.IsBlockedIP(ip)
if got != want {
t.Errorf("gitea.IsBlockedIP(%q) = %v, netutil.IsBlockedIP = %v: forwarding mismatch", tc.ip, got, want)
}
if got != tc.blocked {
t.Errorf("gitea.IsBlockedIP(%q) = %v, want %v", tc.ip, got, tc.blocked)
}
}
}
+4 -4
View File
@@ -31,13 +31,13 @@ func TestPostReview_WithComments(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
client := NewClient(server.URL, "test-token")
comments := []ReviewComment{
{Path: "main.go", NewPosition: 42, Body: "[MAJOR] Something bad"},
{Path: "util.go", NewPosition: 10, Body: "[MINOR] Style issue"},
}
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", "", comments)
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "REQUEST_CHANGES", "summary", comments)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
@@ -71,8 +71,8 @@ func TestPostReview_NilComments(t *testing.T) {
}))
defer server.Close()
client := NewTestClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", "", nil)
client := NewClient(server.URL, "test-token")
_, err := client.PostReview(context.Background(), "owner", "repo", 1, "APPROVED", "all good", nil)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
-831
View File
@@ -1,831 +0,0 @@
// Package github provides a client for the GitHub API.
// It supports pull request operations, file content retrieval,
// and review submission for both github.com and GitHub Enterprise.
package github
import (
"bytes"
"context"
"encoding/base64"
"encoding/json"
"errors"
"fmt"
"io"
"log/slog"
"net/http"
"net/url"
"os"
"strconv"
"strings"
"time"
)
const (
defaultBaseURL = "https://api.github.com"
// maxRetryAttempts is the number of times doRequest will attempt a request.
maxRetryAttempts = 3
// maxRetryAfter caps the maximum delay from a Retry-After header to prevent
// a server from stalling the client indefinitely.
maxRetryAfter = 60 * time.Second
// maxErrorBodyBytes limits how much of an error response body we read
// to protect against malicious servers sending unbounded data.
maxErrorBodyBytes = 64 * 1024 // 64 KB
// maxResponseBodyBytes limits how much of a successful response body we read
// for defense-in-depth against servers returning excessively large payloads.
maxResponseBodyBytes = 10 * 1024 * 1024 // 10 MB
)
// APIError represents an HTTP error response from the GitHub API.
// It carries the status code so callers can distinguish between
// different failure modes (e.g. 404 vs 500).
//
// The Body field stores up to 64 KiB of the raw response for programmatic
// inspection. Error() truncates to 200 bytes for safe logging, but callers
// should avoid logging or propagating Body directly in production since it may
// contain sensitive details from the upstream server.
type APIError struct {
StatusCode int
Body string
}
func (e *APIError) Error() string {
body := e.Body
if len(body) > 200 {
body = body[:200] + "...(truncated)"
}
// Sanitize newlines to prevent log injection from upstream response bodies.
body = strings.ReplaceAll(body, "\n", " ")
body = strings.ReplaceAll(body, "\r", " ")
return fmt.Sprintf("HTTP %d: %s", e.StatusCode, body)
}
// IsNotFound reports whether an error is an API 404 response.
func IsNotFound(err error) bool {
if apiErr, ok := asAPIError(err); ok {
return apiErr.StatusCode == http.StatusNotFound
}
return false
}
// IsUnauthorized reports whether an error is an API 401 response.
func IsUnauthorized(err error) bool {
if apiErr, ok := asAPIError(err); ok {
return apiErr.StatusCode == http.StatusUnauthorized
}
return false
}
func asAPIError(err error) (*APIError, bool) {
if err == nil {
return nil, false
}
var target *APIError
if errors.As(err, &target) {
return target, true
}
return nil, false
}
// Client interacts with the GitHub API.
// A Client is safe for concurrent use by multiple goroutines.
// SetHTTPClient and SetRetryBackoff are intended for test setup only and must
// be called before any goroutines issue requests; they have no synchronization.
type Client struct {
baseURL string
token string
httpClient *http.Client
// allowInsecureHTTP permits requests to HTTP (non-TLS) endpoints.
// When false, doRequest rejects URLs with an http:// scheme.
allowInsecureHTTP bool
// retryBackoff defines the delays between retry attempts for 429 responses.
// retryBackoff[i] is the delay before attempt i+1 (after attempt i fails).
// If nil, defaults to {1s, 2s}.
retryBackoff []time.Duration
// now returns the current time. Defaults to time.Now.
// Override in tests to control HTTP-date Retry-After calculations.
now func() time.Time
}
// defaultCheckRedirect is the redirect policy used by NewClient.
// NOTE: This function is intentionally duplicated in gitea/client.go (and vice versa)
// because the packages are separate. Changes here must be mirrored there.
// It rejects HTTPS->HTTP protocol downgrades (to prevent plaintext leakage)
// and cross-host redirects (to prevent following responses from untrusted
// endpoints). Same-host, same-or-upgraded-scheme redirects are allowed.
func defaultCheckRedirect(req *http.Request, via []*http.Request) error {
if len(via) >= 10 {
return fmt.Errorf("stopped after 10 redirects")
}
// Guard for direct invocation in tests and any future callers;
// net/http guarantees len(via) >= 1 during actual redirects.
if len(via) == 0 {
return nil
}
prev := via[len(via)-1]
// Reject protocol downgrade: HTTPS->HTTP leaks request metadata over plaintext.
if prev.URL.Scheme == "https" && req.URL.Scheme == "http" {
return fmt.Errorf("refusing redirect: HTTPS to HTTP downgrade (%s -> %s)", prev.URL.Host, req.URL.Host)
}
// Reject cross-host redirect entirely to avoid consuming responses
// from untrusted endpoints.
if req.URL.Host != prev.URL.Host {
return fmt.Errorf("refusing redirect: cross-host (%s -> %s)", prev.URL.Host, req.URL.Host)
}
return nil
}
// ClientOption configures optional behavior of a Client.
type ClientOption func(*clientConfig)
type clientConfig struct {
allowInsecureHTTP bool
insecureIsTestBypass bool
}
// AllowInsecureHTTP permits sending credentials over plaintext HTTP connections.
// In production, this option is gated by the REVIEW_BOT_ALLOW_INSECURE=1
// environment variable. Without the env var set, the option is ignored
// and a warning is logged.
//
// For tests, use AllowInsecureHTTPForTest (defined in a _test.go file in the same package) which bypasses the env gate.
func AllowInsecureHTTP() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
}
}
// NewClient creates a new GitHub API client.
// If baseURL is empty, it defaults to https://api.github.com.
// For GitHub Enterprise, pass the API base URL (e.g. https://github.concur.com/api/v3).
func NewClient(token, baseURL string, opts ...ClientOption) *Client {
if baseURL == "" {
baseURL = defaultBaseURL
}
var cfg clientConfig
for _, opt := range opts {
opt(&cfg)
}
if cfg.allowInsecureHTTP && !cfg.insecureIsTestBypass {
if os.Getenv("REVIEW_BOT_ALLOW_INSECURE") != "1" {
slog.Warn("AllowInsecureHTTP ignored: set REVIEW_BOT_ALLOW_INSECURE=1 to enable")
cfg.allowInsecureHTTP = false
} else {
slog.Warn("AllowInsecureHTTP enabled — credentials may be sent over plaintext",
"env", "REVIEW_BOT_ALLOW_INSECURE=1")
}
}
return &Client{
baseURL: strings.TrimRight(baseURL, "/"),
token: token,
allowInsecureHTTP: cfg.allowInsecureHTTP,
httpClient: &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
},
now: time.Now,
}
}
// SetHTTPClient sets the underlying HTTP client used for requests.
// This is intended for test setup only to inject mock transports; it must be
// called before any goroutines issue requests.
//
// Passing nil restores the default client (30s timeout + redirect-rejecting
// CheckRedirect policy matching NewClient).
//
// Callers providing a non-nil client are responsible for configuring a safe
// CheckRedirect policy. Without one, the default net/http behavior will follow
// redirects and may forward the Authorization header to untrusted hosts.
func (c *Client) SetHTTPClient(hc *http.Client) {
if hc == nil {
hc = &http.Client{
Timeout: 30 * time.Second,
CheckRedirect: defaultCheckRedirect,
}
}
c.httpClient = hc
}
// SetRetryBackoff sets the delays between retry attempts.
// This is intended for testing to speed up retry tests.
//
// Note: if an empty non-nil slice is provided, Retry-After delays parsed from
// server responses will be computed and capped but not applied (because
// attempt < len(backoff) is always false). This is acceptable for the
// test-only use case but callers should be aware of this edge case.
func (c *Client) SetRetryBackoff(backoff []time.Duration) {
c.retryBackoff = backoff
}
// parseRetryAfter parses a Retry-After header value, supporting both integer
// seconds (e.g. "120") and HTTP-date format (e.g. "Thu, 01 Dec 2025 16:00:00 GMT")
// as specified in RFC 7231 §7.1.3.
//
// For integer values, it returns the duration directly.
// For HTTP-date values, it computes the delay as the difference between the
// parsed time and now. If the date is in the past, it returns 0.
//
// Returns (0, false) if the value cannot be parsed as either format.
func (c *Client) parseRetryAfter(value string) (time.Duration, bool) {
value = strings.TrimSpace(value)
// Try integer seconds first (most common from GitHub).
// RFC 7231 allows delta-seconds of 0 to indicate immediate retry.
if seconds, err := strconv.Atoi(value); err == nil && seconds >= 0 {
return time.Duration(seconds) * time.Second, true
}
// Try HTTP-date format (RFC 7231 §7.1.3).
// http.ParseTime handles RFC 1123, RFC 850, and ASCTIME formats.
if retryAt, err := http.ParseTime(value); err == nil {
delay := retryAt.Sub(c.now())
if delay < 0 {
delay = 0
}
return delay, true
}
return 0, false
}
// redactURL redacts sensitive components from a URL for safe inclusion in error
// messages and log output. It removes userinfo (e.g., user:pass@) and replaces
// query parameters with a placeholder.
func redactURL(rawURL string) string {
u, err := url.Parse(rawURL)
if err != nil {
return "<unparseable URL>"
}
u.User = nil
if u.RawQuery != "" {
u.RawQuery = "<redacted>"
}
return u.String()
}
// doRequest performs an HTTP request with retry on 429 rate limit responses.
// It respects the Retry-After header when present, supporting both integer
// seconds and HTTP-date formats (capped at maxRetryAfter).
func (c *Client) doRequest(ctx context.Context, method, reqURL string, accept string) ([]byte, error) {
// NOTE: This parses reqURL a second time (http.NewRequestWithContext parses it
// again internally). Acceptable cost: URL parsing is cheap and threading the
// parsed *url.URL through would complicate the interface for negligible gain.
if !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
if err != nil {
return nil, fmt.Errorf("parse request URL: %w", err)
}
if strings.EqualFold(parsed.Scheme, "http") {
return nil, fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
}
}
var backoff []time.Duration
if c.retryBackoff != nil {
backoff = append([]time.Duration(nil), c.retryBackoff...)
} else {
backoff = []time.Duration{1 * time.Second, 2 * time.Second}
}
var lastErr error
for attempt := 0; attempt < maxRetryAttempts; attempt++ {
if attempt > 0 {
var delay time.Duration
if attempt-1 < len(backoff) {
delay = backoff[attempt-1]
}
if delay > 0 {
timer := time.NewTimer(delay)
select {
case <-timer.C:
timer.Stop() // no-op after fire; kept for symmetry with the ctx.Done case
case <-ctx.Done():
timer.Stop()
return nil, ctx.Err()
}
}
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, nil)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.token)
if accept != "" {
req.Header.Set("Accept", accept)
} else {
req.Header.Set("Accept", "application/vnd.github+json")
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("do request: %w", err)
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
body, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBodyBytes))
resp.Body.Close()
if err != nil {
return nil, fmt.Errorf("read response body: %w", err)
}
return body, nil
}
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
resp.Body.Close()
lastErr = &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
// Retry on 429 rate limit
if resp.StatusCode == http.StatusTooManyRequests && attempt < maxRetryAttempts-1 {
// Check for Retry-After header and override backoff if present.
// Supports both integer seconds (common) and HTTP-date format (RFC 7231).
if ra := resp.Header.Get("Retry-After"); ra != "" {
if delay, ok := c.parseRetryAfter(ra); ok {
if delay > maxRetryAfter {
delay = maxRetryAfter
}
if attempt < len(backoff) {
backoff[attempt] = delay
}
}
}
continue
}
// Don't retry other errors
return nil, lastErr
}
return nil, lastErr
}
// doGet is a convenience wrapper for GET requests with the default Accept header.
func (c *Client) doGet(ctx context.Context, url string) ([]byte, error) {
return c.doRequest(ctx, http.MethodGet, url, "")
}
// doRequestWithBody performs an HTTP request with an optional body, applying the
// same HTTPS enforcement as doRequest. It is used by write methods (POST, PUT,
// DELETE) that bypass the retry loop in doRequest because write operations are
// not idempotent.
//
// body may be nil for requests that carry no payload (e.g. DELETE).
// When body is non-nil, Content-Type is set to application/json.
func (c *Client) doRequestWithBody(ctx context.Context, method, reqURL string, body []byte) ([]byte, error) {
if !c.allowInsecureHTTP {
parsed, err := url.Parse(reqURL)
if err != nil {
return nil, fmt.Errorf("parse request URL: %w", err)
}
if strings.EqualFold(parsed.Scheme, "http") {
return nil, fmt.Errorf("refusing HTTP request to %s: use HTTPS or set AllowInsecureHTTP option", redactURL(reqURL))
}
}
var reqBody io.Reader
if body != nil {
reqBody = bytes.NewReader(body)
}
req, err := http.NewRequestWithContext(ctx, method, reqURL, reqBody)
if err != nil {
return nil, fmt.Errorf("create request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+c.token)
req.Header.Set("Accept", "application/vnd.github+json")
if body != nil {
req.Header.Set("Content-Type", "application/json")
}
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("do request: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
respBody, err := io.ReadAll(io.LimitReader(resp.Body, maxResponseBodyBytes))
if err != nil {
return nil, fmt.Errorf("read response body: %w", err)
}
return respBody, nil
}
errBody, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrorBodyBytes))
return nil, &APIError{StatusCode: resp.StatusCode, Body: string(errBody)}
}
// --- API types ---
// PullRequest holds relevant PR metadata.
type PullRequest struct {
Title string `json:"title"`
Body string `json:"body"`
Head struct {
Sha string `json:"sha"`
Ref string `json:"ref"`
} `json:"head"`
Draft bool `json:"draft"`
}
// CommitStatus represents a single CI status entry.
// GitHub returns "state" not "status"; this type uses Status for consistency
// with the gitea package (both are normalized before use).
type CommitStatus struct {
Status string `json:"state"` // GitHub field is "state"
Context string `json:"context"`
Description string `json:"description"`
TargetURL string `json:"target_url"`
}
// ChangedFile represents a file modified in a PR.
type ChangedFile struct {
Filename string `json:"filename"`
Status string `json:"status"`
}
// ReviewComment represents an inline comment to attach to a review.
// GitHub uses "position" (diff hunk position), whereas Gitea uses "new_position" (line number).
// When posting inline comments on GitHub, position is required; line numbers
// from the diff cannot be used directly.
type ReviewComment struct {
ID int64 `json:"id,omitempty"`
Path string `json:"path"`
Position int64 `json:"position,omitempty"` // GitHub diff hunk position
Line int64 `json:"line,omitempty"` // GitHub absolute line number (alternative to position)
Side string `json:"side,omitempty"` // "RIGHT" or "LEFT"
Body string `json:"body"`
}
// Review represents a pull request review from the GitHub API.
type Review struct {
ID int64 `json:"id"`
Body string `json:"body"`
User struct {
Login string `json:"login"`
} `json:"user"`
State string `json:"state"`
}
// contentResponse is the GitHub contents API response for a single file.
type contentResponse struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir" or "symlink" or "submodule"
Content string `json:"content"` // Base64-encoded file content (with embedded newlines)
Encoding string `json:"encoding"` // "base64" or ""
}
// ContentEntry represents a file or directory entry from the contents API.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// --- PR methods ---
// GetPullRequest fetches PR metadata.
func (c *Client) GetPullRequest(ctx context.Context, owner, repo string, number int) (*PullRequest, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR: %w", err)
}
var pr PullRequest
if err := json.Unmarshal(body, &pr); err != nil {
return nil, fmt.Errorf("parse PR JSON: %w", err)
}
return &pr, nil
}
// GetPullRequestDiff fetches the unified diff for a PR.
func (c *Client) GetPullRequestDiff(ctx context.Context, owner, repo string, number int) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
body, err := c.doRequest(ctx, http.MethodGet, reqURL, "application/vnd.github.diff")
if err != nil {
return "", fmt.Errorf("fetch diff: %w", err)
}
return string(body), nil
}
// GetPullRequestFiles fetches the list of files changed in a PR.
// GitHub paginates this endpoint (100 per page max).
func (c *Client) GetPullRequestFiles(ctx context.Context, owner, repo string, number int) ([]ChangedFile, error) {
const perPage = 100
var all []ChangedFile
for page := 1; ; page++ {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/files?per_page=%d&page=%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, perPage, page)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch PR files (page %d): %w", page, err)
}
var batch []ChangedFile
if err := json.Unmarshal(body, &batch); err != nil {
return nil, fmt.Errorf("parse PR files JSON (page %d): %w", page, err)
}
all = append(all, batch...)
if len(batch) < perPage {
break
}
}
return all, nil
}
// GetCommitStatuses fetches CI statuses for a commit SHA.
// GitHub has two status systems: legacy "commit statuses" and newer "check runs".
// This method returns commit statuses only; check runs are a separate API.
// Note: GitHub returns "state" in the JSON; CommitStatus.Status is tagged accordingly.
func (c *Client) GetCommitStatuses(ctx context.Context, owner, repo, sha string) ([]CommitStatus, error) {
const perPage = 100
var all []CommitStatus
for page := 1; ; page++ {
reqURL := fmt.Sprintf("%s/repos/%s/%s/commits/%s/statuses?per_page=%d&page=%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), url.PathEscape(sha), perPage, page)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("fetch commit statuses (page %d): %w", page, err)
}
var batch []CommitStatus
if err := json.Unmarshal(body, &batch); err != nil {
return nil, fmt.Errorf("parse statuses JSON (page %d): %w", page, err)
}
all = append(all, batch...)
if len(batch) < perPage {
break
}
}
return all, nil
}
// --- File content methods ---
// GetFileContent fetches a file from the default branch of a repo.
// GitHub returns base64-encoded content; this method decodes it.
func (c *Client) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
return c.getFileContentAtRef(ctx, owner, repo, filepath, "")
}
// GetFileContentRef fetches a file from a specific ref (branch/tag/sha).
func (c *Client) GetFileContentRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
return c.getFileContentAtRef(ctx, owner, repo, filepath, ref)
}
// getFileContentAtRef fetches a file at the given ref (empty = default branch).
// GitHub's contents API returns base64-encoded file content.
func (c *Client) getFileContentAtRef(ctx context.Context, owner, repo, filepath, ref string) (string, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(filepath))
if ref != "" {
reqURL += "?ref=" + url.QueryEscape(ref)
}
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("fetch file %s: %w", filepath, err)
}
var resp contentResponse
if err := json.Unmarshal(body, &resp); err != nil {
return "", fmt.Errorf("parse file content JSON for %s: %w", filepath, err)
}
if resp.Type != "file" {
return "", fmt.Errorf("path %s is a %s, not a file", filepath, resp.Type)
}
if resp.Encoding == "base64" {
// GitHub embeds newlines in the base64 content for readability.
// Strip them before decoding.
cleaned := strings.ReplaceAll(resp.Content, "\n", "")
decoded, err := base64.StdEncoding.DecodeString(cleaned)
if err != nil {
return "", fmt.Errorf("decode base64 content for %s: %w", filepath, err)
}
return string(decoded), nil
}
// Non-base64 encoding (shouldn't happen normally, but handle gracefully).
return resp.Content, nil
}
// ListContents lists files and directories at a given path.
// Pass an empty path to list the repository root.
// GitHub returns a single object (not array) when path is a file — this
// method normalizes both cases to a slice, matching Gitea's behavior.
func (c *Client) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
var reqURL string
if path == "" || path == "." {
reqURL = fmt.Sprintf("%s/repos/%s/%s/contents",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo))
} else {
reqURL = fmt.Sprintf("%s/repos/%s/%s/contents/%s",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), escapePath(path))
}
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list contents %s: %w", path, err)
}
var entries []ContentEntry
if err := json.Unmarshal(body, &entries); err != nil {
// GitHub returns a single object when path is a file (not an array).
var single contentResponse
if err2 := json.Unmarshal(body, &single); err2 != nil {
return nil, fmt.Errorf("parse contents JSON: %w", err)
}
if single.Name == "" && single.Path == "" {
return nil, fmt.Errorf("parse contents JSON: empty response for path %q", path)
}
entries = []ContentEntry{{
Name: single.Name,
Path: single.Path,
Type: single.Type,
}}
}
return entries, nil
}
// GetAllFilesInPath recursively fetches all file contents under a path.
// If the path is a file, returns just that file's content.
// If the path is a directory, recursively fetches all files within it.
func (c *Client) GetAllFilesInPath(ctx context.Context, owner, repo, path string) (map[string]string, error) {
results := make(map[string]string)
entries, err := c.ListContents(ctx, owner, repo, path)
if err != nil {
if !IsNotFound(err) {
return nil, fmt.Errorf("list contents %q: %w", path, err)
}
// 404 means path may be a file — try fetching directly.
content, fileErr := c.GetFileContent(ctx, owner, repo, path)
if fileErr != nil {
return nil, fmt.Errorf("path %q is neither a file nor directory: %w", path, fileErr)
}
results[path] = content
return results, nil
}
for _, entry := range entries {
switch entry.Type {
case "file":
content, err := c.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not fetch file from patterns repo", "file", entry.Path, "error", err)
continue
}
results[entry.Path] = content
case "dir":
subResults, err := c.GetAllFilesInPath(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not recurse into directory", "dir", entry.Path, "error", err)
continue
}
for k, v := range subResults {
results[k] = v
}
}
}
return results, nil
}
// --- Review methods ---
// PostReview submits a review to a PR.
// event should be one of "APPROVE", "REQUEST_CHANGES", or "COMMENT".
// commitID anchors the review to a specific commit SHA. If empty, defaults to current HEAD.
// comments are optional inline comments; GitHub uses diff hunk position (not line numbers).
// Note: unlike Gitea, GitHub does not support deleting submitted reviews.
// Use COMMENT event to supersede old reviews.
func (c *Client) PostReview(ctx context.Context, owner, repo string, number int, event, body, commitID string, comments []ReviewComment) (*Review, error) {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := struct {
Body string `json:"body"`
Event string `json:"event"`
CommitID string `json:"commit_id,omitempty"`
Comments []ReviewComment `json:"comments,omitempty"`
}{
Body: body,
Event: event,
CommitID: commitID,
Comments: comments,
}
data, err := json.Marshal(payload)
if err != nil {
return nil, fmt.Errorf("marshal review payload: %w", err)
}
respBody, err := c.doRequestWithBody(ctx, http.MethodPost, reqURL, data)
if err != nil {
return nil, fmt.Errorf("post review: %w", err)
}
var review Review
if err := json.Unmarshal(respBody, &review); err != nil {
return nil, fmt.Errorf("parse review response: %w", err)
}
return &review, nil
}
// ListReviews returns all reviews on a pull request.
// GitHub paginates via Link header; this method uses per_page=100.
func (c *Client) ListReviews(ctx context.Context, owner, repo string, number int) ([]Review, error) {
const perPage = 100
var all []Review
for page := 1; ; page++ {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews?per_page=%d&page=%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, perPage, page)
body, err := c.doGet(ctx, reqURL)
if err != nil {
return nil, fmt.Errorf("list reviews (page %d): %w", page, err)
}
var batch []Review
if err := json.Unmarshal(body, &batch); err != nil {
return nil, fmt.Errorf("parse reviews (page %d): %w", page, err)
}
all = append(all, batch...)
if len(batch) < perPage {
break
}
}
return all, nil
}
// DeleteReview attempts to delete a pull request review.
// GitHub only allows deleting PENDING (draft) reviews. Submitted reviews cannot
// be deleted via the API; this method returns a descriptive error in that case.
// review-bot callers should handle this error gracefully (e.g., by not attempting
// supersede and instead posting a new review alongside the old one).
func (c *Client) DeleteReview(ctx context.Context, owner, repo string, number int, reviewID int64) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/reviews/%d",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number, reviewID)
// nil body: the GitHub DELETE endpoint for reviews requires no request body.
_, err := c.doRequestWithBody(ctx, http.MethodDelete, reqURL, nil)
if err != nil {
return fmt.Errorf("delete review: %w", err)
}
return nil
}
// GetAuthenticatedUser returns the login of the authenticated user.
func (c *Client) GetAuthenticatedUser(ctx context.Context) (string, error) {
reqURL := c.baseURL + "/user"
body, err := c.doGet(ctx, reqURL)
if err != nil {
return "", fmt.Errorf("get authenticated user: %w", err)
}
var result struct {
Login string `json:"login"`
}
if err := json.Unmarshal(body, &result); err != nil {
return "", fmt.Errorf("parse user response: %w", err)
}
return result.Login, nil
}
// RequestReviewer adds a user as a requested reviewer on a pull request.
// This is idempotent — requesting an already-requested reviewer is a no-op.
func (c *Client) RequestReviewer(ctx context.Context, owner, repo string, number int, reviewer string) error {
reqURL := fmt.Sprintf("%s/repos/%s/%s/pulls/%d/requested_reviewers",
c.baseURL, url.PathEscape(owner), url.PathEscape(repo), number)
payload := struct {
Reviewers []string `json:"reviewers"`
}{Reviewers: []string{reviewer}}
data, err := json.Marshal(payload)
if err != nil {
return fmt.Errorf("marshal reviewer request: %w", err)
}
_, err = c.doRequestWithBody(ctx, http.MethodPost, reqURL, data)
if err != nil {
return fmt.Errorf("request reviewer: %w", err)
}
return nil
}
// --- helpers ---
// escapePath escapes each segment of a relative file path for use in URLs.
// Slashes are preserved as path separators; other special characters are escaped.
func escapePath(p string) string {
parts := strings.Split(p, "/")
for i, part := range parts {
parts[i] = url.PathEscape(part)
}
return strings.Join(parts, "/")
}
File diff suppressed because it is too large Load Diff
-13
View File
@@ -1,13 +0,0 @@
package github
// AllowInsecureHTTPForTest permits sending credentials over plaintext HTTP
// without requiring the REVIEW_BOT_ALLOW_INSECURE environment variable.
// This is intended exclusively for test code using httptest.Server.
//
// Defined in a _test.go file so it is only available to test binaries.
func AllowInsecureHTTPForTest() ClientOption {
return func(cfg *clientConfig) {
cfg.allowInsecureHTTP = true
cfg.insecureIsTestBypass = true
}
}
-2
View File
@@ -1,5 +1,3 @@
module gitea.weiker.me/rodin/review-bot
go 1.26.2
require github.com/goccy/go-yaml v1.19.2
-2
View File
@@ -1,2 +0,0 @@
github.com/goccy/go-yaml v1.19.2 h1:PmFC1S6h8ljIz6gMRBopkjP1TVT7xuwrButHID66PoM=
github.com/goccy/go-yaml v1.19.2/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
+2 -3
View File
@@ -16,8 +16,7 @@ import (
// Integration test requires a running Gitea instance and LLM endpoint.
// Set environment variables:
//
// INTEGRATION_VCS_URL - VCS base URL
// INTEGRATION_GITEA_URL - Gitea base URL
// INTEGRATION_GITEA_TOKEN - Gitea API token with repo access
// INTEGRATION_GITEA_REPO - owner/repo with an open PR
// INTEGRATION_PR_NUMBER - PR number to test against
@@ -26,7 +25,7 @@ import (
// INTEGRATION_LLM_MODEL - Model name
func TestIntegration_FullReviewFlow(t *testing.T) {
giteaURL := os.Getenv("INTEGRATION_VCS_URL")
giteaURL := os.Getenv("INTEGRATION_GITEA_URL")
giteaToken := os.Getenv("INTEGRATION_GITEA_TOKEN")
giteaRepo := os.Getenv("INTEGRATION_GITEA_REPO")
prNumStr := os.Getenv("INTEGRATION_PR_NUMBER")
-97
View File
@@ -1,97 +0,0 @@
// Package netutil provides shared network utilities for review-bot.
// ipcheck.go implements IP-level SSRF protection by checking resolved addresses
// against known blocked CIDR ranges (RFC1918, loopback, link-local, etc.).
package netutil
import (
"fmt"
"net"
)
// blockedCIDRStrings is the canonical list of CIDR strings that should never
// be contacted by review-bot. See IsBlockedIP for the full list of covered
// address families.
//
// These are hard-coded literals: any parse failure is a programming error.
// Validity is verified by TestBlockedCIDRsValid in ipcheck_test.go.
var blockedCIDRStrings = []string{
// IPv4 loopback
"127.0.0.0/8",
// IPv4 unspecified / "this network"
"0.0.0.0/8",
// RFC1918 private ranges
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
// IPv4 link-local (APIPA, also used by AWS instance metadata 169.254.169.254)
"169.254.0.0/16",
// IPv4 shared address space (RFC6598, carrier-grade NAT)
"100.64.0.0/10",
// IPv4 multicast
"224.0.0.0/4",
// IPv4 reserved / broadcast
"240.0.0.0/4",
// IPv6 loopback
"::1/128",
// IPv6 unspecified
"::/128",
// IPv6 link-local
"fe80::/10",
// IPv6 unique local (ULA) — RFC4193
"fc00::/7",
// IPv6 multicast
"ff00::/8",
}
// blockedCIDRs is the parsed form of blockedCIDRStrings.
// Any entry that fails to parse is recorded in blockedCIDRParseErrors instead
// of panicking; tests verify this slice is always empty via TestBlockedCIDRsValid.
var (
blockedCIDRs []*net.IPNet
blockedCIDRParseErrors []string
)
func init() {
blockedCIDRs = make([]*net.IPNet, 0, len(blockedCIDRStrings))
for _, r := range blockedCIDRStrings {
_, cidr, err := net.ParseCIDR(r)
if err != nil {
// Record the error rather than panicking; TestBlockedCIDRsValid
// will catch this during tests, and the CI build will fail.
blockedCIDRParseErrors = append(blockedCIDRParseErrors,
fmt.Sprintf("ipcheck: invalid built-in CIDR %q: %v", r, err))
continue
}
blockedCIDRs = append(blockedCIDRs, cidr)
}
}
// BlockedCIDRParseErrors returns any errors encountered parsing the built-in
// CIDR list. In correct code this will always be empty; tests assert it is.
func BlockedCIDRParseErrors() []string {
return blockedCIDRParseErrors
}
// IsBlockedIP reports whether ip is in a blocked address range.
// It is exported for use by the gitea package's safe dialer, the validate-url
// subcommand, and tests outside this package.
//
// IPv6-mapped IPv4 addresses (e.g. ::ffff:192.168.1.1) are normalized to their
// IPv4 form before checking so that IPv4 CIDRs catch them.
//
// Based on:
// - RFC1918 private ranges
// - RFC5735 / RFC4193 special-use IPv4/IPv6 ranges
// - RFC4291 IPv6 link-local / loopback
func IsBlockedIP(ip net.IP) bool {
// Normalize IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) to plain IPv4.
if v4 := ip.To4(); v4 != nil {
ip = v4
}
for _, cidr := range blockedCIDRs {
if cidr.Contains(ip) {
return true
}
}
return false
}
-142
View File
@@ -1,142 +0,0 @@
package netutil
import (
"net"
"testing"
)
func TestIsBlockedIP(t *testing.T) {
blocked := []struct {
name string
ip string
}{
// IPv4 loopback
{"loopback 127.0.0.1", "127.0.0.1"},
{"loopback 127.0.0.2", "127.0.0.2"},
{"loopback 127.255.255.255", "127.255.255.255"},
// IPv4 unspecified
{"unspecified 0.0.0.0", "0.0.0.0"},
{"unspecified 0.1.2.3", "0.1.2.3"},
// RFC1918
{"RFC1918 10.0.0.1", "10.0.0.1"},
{"RFC1918 10.255.255.255", "10.255.255.255"},
{"RFC1918 172.16.0.1", "172.16.0.1"},
{"RFC1918 172.31.255.255", "172.31.255.255"},
{"RFC1918 192.168.0.1", "192.168.0.1"},
{"RFC1918 192.168.255.255", "192.168.255.255"},
// Link-local (APIPA / AWS metadata)
{"link-local 169.254.0.1", "169.254.0.1"},
{"link-local 169.254.169.254", "169.254.169.254"},
// Shared address space (carrier-grade NAT)
{"CGN 100.64.0.1", "100.64.0.1"},
{"CGN 100.127.255.255", "100.127.255.255"},
// Multicast
{"multicast 224.0.0.1", "224.0.0.1"},
{"multicast 239.255.255.255", "239.255.255.255"},
// Reserved
{"reserved 240.0.0.1", "240.0.0.1"},
{"broadcast 255.255.255.255", "255.255.255.255"},
// IPv6 loopback
{"IPv6 loopback ::1", "::1"},
// IPv6 unspecified
{"IPv6 unspecified ::", "::"},
// IPv6 link-local
{"IPv6 link-local fe80::1", "fe80::1"},
{"IPv6 link-local fe80::dead:beef", "fe80::dead:beef"},
// IPv6 ULA
{"IPv6 ULA fc00::1", "fc00::1"},
{"IPv6 ULA fd00::1", "fd00::1"},
// IPv6 multicast
{"IPv6 multicast ff02::1", "ff02::1"},
}
for _, tc := range blocked {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if !IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = false, want true", tc.ip)
}
})
}
allowed := []struct {
name string
ip string
}{
{"public 8.8.8.8", "8.8.8.8"},
{"public 1.1.1.1", "1.1.1.1"},
{"public 198.51.100.1", "198.51.100.1"}, // RFC5737 TEST-NET-2 — a documentation-only range;
// not assigned to any real host, but intentionally left unblocked here because
// it has no special routing treatment (unlike RFC1918/loopback/link-local) and
// blocking it would require tracking every RFC5737 range without meaningful
// security benefit (no server should ever listen on a TEST-NET address).
{"public 151.101.1.1", "151.101.1.1"}, // Fastly
{"public IPv6 2001:4860:4860::8888", "2001:4860:4860::8888"}, // Google DNS
{"public IPv6 2606:4700:4700::1111", "2606:4700:4700::1111"}, // Cloudflare DNS
}
for _, tc := range allowed {
t.Run(tc.name, func(t *testing.T) {
ip := net.ParseIP(tc.ip)
if ip == nil {
t.Fatalf("failed to parse IP %q", tc.ip)
}
if IsBlockedIP(ip) {
t.Errorf("IsBlockedIP(%q) = true, want false", tc.ip)
}
})
}
}
func TestIsBlockedIPv6MappedIPv4(t *testing.T) {
// ::ffff:192.168.1.1 is an IPv6-mapped IPv4 address — should be blocked as RFC1918.
// Construct it manually as a 16-byte IP.
mapped := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 192, 168, 1, 1}
if !IsBlockedIP(mapped) {
t.Errorf("IsBlockedIP(::ffff:192.168.1.1) = false, want true (IPv6-mapped IPv4 must be normalized)")
}
// ::ffff:8.8.8.8 — IPv6-mapped public IP — should be allowed.
mappedPublic := net.IP{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, 8, 8, 8, 8}
if IsBlockedIP(mappedPublic) {
t.Errorf("IsBlockedIP(::ffff:8.8.8.8) = true, want false")
}
}
func TestIsBlockedIPEdgeCases(t *testing.T) {
// The boundary between RFC1918 and public ranges.
// 172.15.255.255 is NOT private (just below 172.16.0.0/12).
notPrivate := net.ParseIP("172.15.255.255")
if IsBlockedIP(notPrivate) {
t.Errorf("IsBlockedIP(172.15.255.255) = true, want false (outside 172.16.0.0/12)")
}
// 172.32.0.0 is NOT private (just above 172.31.255.255).
notPrivate2 := net.ParseIP("172.32.0.0")
if IsBlockedIP(notPrivate2) {
t.Errorf("IsBlockedIP(172.32.0.0) = true, want false (outside 172.16.0.0/12)")
}
// CGN: 100.63.255.255 is NOT in 100.64.0.0/10.
notCGN := net.ParseIP("100.63.255.255")
if IsBlockedIP(notCGN) {
t.Errorf("IsBlockedIP(100.63.255.255) = true, want false (outside 100.64.0.0/10)")
}
// CGN: 100.128.0.0 is NOT in 100.64.0.0/10.
notCGN2 := net.ParseIP("100.128.0.0")
if IsBlockedIP(notCGN2) {
t.Errorf("IsBlockedIP(100.128.0.0) = true, want false (outside 100.64.0.0/10)")
}
}
// TestBlockedCIDRsValid verifies that all entries in blockedCIDRStrings parse
// successfully. This catches programming errors in the CIDR list without
// requiring a startup panic. The init() function records parse failures in
// blockedCIDRParseErrors rather than panicking; this test makes those failures
// visible as test failures during CI.
func TestBlockedCIDRsValid(t *testing.T) {
for _, msg := range BlockedCIDRParseErrors() {
t.Errorf("CIDR parse error: %s", msg)
}
}
+40 -50
View File
@@ -17,10 +17,6 @@ import (
// Update this when SAP AI Core releases a new stable version.
const AICoreOpenAIAPIVersion = "2024-12-01-preview"
// maxErrorBodyLen limits the length of response bodies included in error messages
// to prevent leaking potentially sensitive upstream details in logs.
const maxErrorBodyLen = 200
// AICoreConfig holds SAP AI Core authentication and connection settings.
type AICoreConfig struct {
ClientID string
@@ -32,10 +28,6 @@ type AICoreConfig struct {
// AICoreClient wraps AI Core authentication and deployment discovery.
// Thread-safe for concurrent use after construction.
//
// Design: The deployment cache is populated once and never invalidated. This is
// acceptable for short-lived CI runner processes, but longer-lived deployments
// may want to add a TTL or re-fetch on errors.
type AICoreClient struct {
config AICoreConfig
http *http.Client
@@ -43,7 +35,12 @@ type AICoreClient struct {
mu sync.RWMutex
token string
tokenExpiry time.Time
deployments map[string]string // model name -> deployment URL
deployments map[string]deployment // model name -> deployment info
}
type deployment struct {
ID string
URL string
}
// NewAICoreClient creates a new AI Core client with the given configuration.
@@ -52,7 +49,7 @@ func NewAICoreClient(cfg AICoreConfig) *AICoreClient {
return &AICoreClient{
config: cfg,
http: &http.Client{Timeout: 5 * time.Minute},
deployments: make(map[string]string),
deployments: make(map[string]deployment),
}
}
@@ -63,15 +60,6 @@ func (c *AICoreClient) WithTimeout(d time.Duration) *AICoreClient {
return c
}
// truncateBody truncates a response body for inclusion in error messages.
// This prevents leaking potentially sensitive upstream response details in logs.
func truncateBody(body []byte) string {
if len(body) <= maxErrorBodyLen {
return string(body)
}
return string(body[:maxErrorBodyLen]) + "..."
}
// getToken returns a valid OAuth token, refreshing if necessary.
func (c *AICoreClient) getToken(ctx context.Context) (string, error) {
c.mu.RLock()
@@ -125,7 +113,7 @@ func (c *AICoreClient) fetchToken(ctx context.Context) (string, time.Time, error
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", time.Time{}, fmt.Errorf("token request failed (status %d): %s", resp.StatusCode, truncateBody(body))
return "", time.Time{}, fmt.Errorf("token request failed (status %d): %s", resp.StatusCode, string(body))
}
var tokenResp struct {
@@ -145,48 +133,36 @@ func (c *AICoreClient) fetchToken(ctx context.Context) (string, time.Time, error
}
// getDeploymentURL returns the deployment URL for a model, fetching deployments if needed.
// getDeploymentURL returns the deployment URL for a model, fetching deployments if needed.
// Also returns a valid token for use by the caller, avoiding redundant getToken calls.
//
// Note: The token is fetched before acquiring the write lock to avoid holding the lock
// during network I/O. In rare cases where multiple goroutines race and one waits a long
// time for the write lock, the token could theoretically expire. The 5-minute refresh
// buffer in getToken makes this extremely unlikely in practice.
func (c *AICoreClient) getDeploymentURL(ctx context.Context, model string) (deployURL, token string, err error) {
func (c *AICoreClient) getDeploymentURL(ctx context.Context, model string) (string, error) {
c.mu.RLock()
if u, ok := c.deployments[model]; ok {
if d, ok := c.deployments[model]; ok {
c.mu.RUnlock()
// Still need a token for the caller
token, err = c.getToken(ctx)
if err != nil {
return "", "", fmt.Errorf("get token: %w", err)
}
return u, token, nil
return d.URL, nil
}
c.mu.RUnlock()
// Fetch token first (before acquiring write lock to avoid holding lock during I/O)
token, err = c.getToken(ctx)
// Fetch token first (before acquiring write lock to avoid deadlock)
token, err := c.getToken(ctx)
if err != nil {
return "", "", fmt.Errorf("get token for deployments: %w", err)
return "", fmt.Errorf("get token for deployments: %w", err)
}
c.mu.Lock()
defer c.mu.Unlock()
// Double-check after acquiring write lock
if u, ok := c.deployments[model]; ok {
return u, token, nil
if d, ok := c.deployments[model]; ok {
return d.URL, nil
}
if err := c.fetchDeployments(ctx, token); err != nil {
return "", "", err
return "", err
}
if u, ok := c.deployments[model]; ok {
return u, token, nil
if d, ok := c.deployments[model]; ok {
return d.URL, nil
}
return "", "", fmt.Errorf("no deployment found for model %q", model)
return "", fmt.Errorf("no deployment found for model %q", model)
}
func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error {
@@ -210,11 +186,12 @@ func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return fmt.Errorf("deployments request failed (status %d): %s", resp.StatusCode, truncateBody(body))
return fmt.Errorf("deployments request failed (status %d): %s", resp.StatusCode, string(body))
}
var deployResp struct {
Resources []struct {
ID string `json:"id"`
DeploymentURL string `json:"deploymentUrl"`
Status string `json:"status"`
Details struct {
@@ -240,7 +217,10 @@ func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error
if modelName == "" {
continue
}
c.deployments[modelName] = r.DeploymentURL
c.deployments[modelName] = deployment{
ID: r.ID,
URL: r.DeploymentURL,
}
}
return nil
@@ -248,7 +228,12 @@ func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error
// CompleteAnthropic sends a request to an Anthropic model via AI Core.
func (c *AICoreClient) CompleteAnthropic(ctx context.Context, model string, messages []Message, maxTokens int, temperature float64) (string, error) {
deployURL, token, err := c.getDeploymentURL(ctx, model)
deployURL, err := c.getDeploymentURL(ctx, model)
if err != nil {
return "", err
}
token, err := c.getToken(ctx)
if err != nil {
return "", err
}
@@ -305,7 +290,7 @@ func (c *AICoreClient) CompleteAnthropic(ctx context.Context, model string, mess
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, truncateBody(body))
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, string(body))
}
var anthropicResp anthropicResponse
@@ -332,7 +317,12 @@ func (c *AICoreClient) CompleteAnthropic(ctx context.Context, model string, mess
// CompleteOpenAI sends a request to an OpenAI model via AI Core.
func (c *AICoreClient) CompleteOpenAI(ctx context.Context, model string, messages []Message, temperature float64) (string, error) {
deployURL, token, err := c.getDeploymentURL(ctx, model)
deployURL, err := c.getDeploymentURL(ctx, model)
if err != nil {
return "", err
}
token, err := c.getToken(ctx)
if err != nil {
return "", err
}
@@ -370,7 +360,7 @@ func (c *AICoreClient) CompleteOpenAI(ctx context.Context, model string, message
}
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, truncateBody(body))
return "", fmt.Errorf("AI Core API error (status %d): %s", resp.StatusCode, string(body))
}
var openaiResp ChatResponse
+4 -4
View File
@@ -142,7 +142,7 @@ func TestAICoreClient_DeploymentFetch(t *testing.T) {
})
// Should find running deployment
url, _, err := client.getDeploymentURL(context.Background(), "anthropic--claude-4.6-sonnet")
url, err := client.getDeploymentURL(context.Background(), "anthropic--claude-4.6-sonnet")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
@@ -151,7 +151,7 @@ func TestAICoreClient_DeploymentFetch(t *testing.T) {
}
// Should find running gpt-5, not stopped one
url, _, err = client.getDeploymentURL(context.Background(), "gpt-5")
url, err = client.getDeploymentURL(context.Background(), "gpt-5")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
@@ -160,14 +160,14 @@ func TestAICoreClient_DeploymentFetch(t *testing.T) {
}
// Should error on unknown model
_, _, err = client.getDeploymentURL(context.Background(), "unknown-model")
_, err = client.getDeploymentURL(context.Background(), "unknown-model")
if err == nil {
t.Error("expected error for unknown model")
}
}
func TestAICoreClient_CompleteAnthropic(t *testing.T) {
// baseURL is set after server creation; captured by closure in handlers
// Use a pointer to capture the server URL for use in the handler
var baseURL string
mux := http.NewServeMux()
mux.HandleFunc("/oauth/token", func(w http.ResponseWriter, r *http.Request) {
+17 -32
View File
@@ -7,37 +7,10 @@ import (
// FormatMarkdown formats a ReviewResult into the markdown body for a Gitea review.
func FormatMarkdown(result *ReviewResult, reviewerName string) string {
return FormatMarkdownWithDisplay(result, reviewerName, reviewerName)
}
// GiteaEvent converts the verdict to the Gitea API event string.
func GiteaEvent(verdict string) string {
switch verdict {
case "APPROVE":
return "APPROVED"
case "REQUEST_CHANGES":
return "REQUEST_CHANGES"
default:
return "COMMENT"
}
}
// FormatMarkdownWithDisplay formats a ReviewResult with separate display name and sentinel name.
// Note: displayName is not HTML-escaped as Gitea sanitizes rendered Markdown.
// Persona display names are controlled by repo owners (trusted input).
// displayName is used for the header title, sentinelName is used for the cleanup sentinel.
// If displayName is empty, sentinelName is used for both.
func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName string) string {
var sb strings.Builder
// Use display name for header, or fall back to sentinel name
headerName := displayName
if headerName == "" {
headerName = sentinelName
}
if headerName != "" {
title := CapitalizeFirst(headerName)
if reviewerName != "" {
title := strings.ToUpper(reviewerName[:1]) + reviewerName[1:]
sb.WriteString(fmt.Sprintf("# %s Review\n\n", title))
}
@@ -60,11 +33,23 @@ func FormatMarkdownWithDisplay(result *ReviewResult, displayName, sentinelName s
sb.WriteString("## Recommendation\n\n")
sb.WriteString(fmt.Sprintf("**%s** — %s\n", result.Verdict, result.Recommendation))
if sentinelName != "" {
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", headerName))
if reviewerName != "" {
sb.WriteString(fmt.Sprintf("\n---\n*Review by %s*\n", reviewerName))
// Hidden sentinel for identifying this bot's reviews during cleanup
sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", sentinelName))
sb.WriteString(fmt.Sprintf("\n<!-- review-bot:%s -->\n", reviewerName))
}
return sb.String()
}
// GiteaEvent converts the verdict to the Gitea API event string.
func GiteaEvent(verdict string) string {
switch verdict {
case "APPROVE":
return "APPROVED"
case "REQUEST_CHANGES":
return "REQUEST_CHANGES"
default:
return "COMMENT"
}
}
-55
View File
@@ -159,58 +159,3 @@ func TestFormatMarkdown_RoleTitle(t *testing.T) {
t.Error("should not contain role title header when reviewer name is empty")
}
}
func TestFormatMarkdownWithDisplay(t *testing.T) {
result := &ReviewResult{
Verdict: "APPROVE",
Summary: "Test summary",
Findings: nil,
Recommendation: "Test recommendation",
}
t.Run("with display name", func(t *testing.T) {
body := FormatMarkdownWithDisplay(result, "Security Specialist", "security")
// Header should use display name
if !strings.Contains(body, "# Security Specialist Review") {
t.Error("header should use display name")
}
// Sentinel should use sentinel name
if !strings.Contains(body, "<!-- review-bot:security -->") {
t.Error("sentinel should use sentinel name")
}
// Footer "Review by" should use display name
if !strings.Contains(body, "*Review by Security Specialist*") {
t.Error("footer should use display name")
}
})
t.Run("without display name", func(t *testing.T) {
body := FormatMarkdownWithDisplay(result, "", "reviewer")
// Should fall back to sentinel name for header
if !strings.Contains(body, "# Reviewer Review") {
t.Error("header should fall back to sentinel name")
}
if !strings.Contains(body, "<!-- review-bot:reviewer -->") {
t.Error("sentinel should use sentinel name")
}
})
t.Run("empty both names", func(t *testing.T) {
body := FormatMarkdownWithDisplay(result, "", "")
// Should not have header
if strings.Contains(body, "# ") && strings.Contains(body, " Review") {
t.Error("should not have header when both names empty")
}
// Should not have sentinel
if strings.Contains(body, "<!-- review-bot:") {
t.Error("should not have sentinel when sentinel name empty")
}
})
}
-367
View File
@@ -1,367 +0,0 @@
package review
import (
"bytes"
"embed"
"encoding/json"
"fmt"
"io"
"os"
"sort"
"strings"
"unicode/utf8"
"github.com/goccy/go-yaml"
"github.com/goccy/go-yaml/ast"
"github.com/goccy/go-yaml/parser"
)
//go:embed personas/*.yaml
var embeddedPersonas embed.FS
// MaxPersonaFileSize is the maximum size for persona files (64 KB).
// This prevents denial-of-service via excessively large files.
const MaxPersonaFileSize = 64 * 1024
// MaxYAMLDepth is the maximum nesting depth allowed in YAML persona files.
// This prevents stack exhaustion from deeply nested structures.
const MaxYAMLDepth = 20
// MaxYAMLNodes is the maximum number of YAML nodes allowed in persona files.
// This prevents DoS via wide-but-shallow structures that bypass depth limits.
const MaxYAMLNodes = 1000
// Persona defines a specialized review role with focused expertise.
type Persona struct {
Name string `json:"name" yaml:"name"`
DisplayName string `json:"display_name" yaml:"display_name"`
ModelPref string `json:"model_preference,omitempty" yaml:"model_preference,omitempty"`
Identity string `json:"identity" yaml:"identity"`
Focus []string `json:"focus" yaml:"focus"`
Ignore []string `json:"ignore" yaml:"ignore"`
Severity Severity `json:"severity" yaml:"severity"`
OutputFormat string `json:"output_format,omitempty" yaml:"output_format,omitempty"`
}
// Severity defines what constitutes each severity level for this persona.
// These are prompt guidance for the LLM, not output format changes.
type Severity struct {
Major string `json:"major" yaml:"major"`
Minor string `json:"minor" yaml:"minor"`
Nit string `json:"nit" yaml:"nit"`
}
// LoadPersona loads a persona from a JSON or YAML file path.
// Format is detected by file extension: .yaml/.yml for YAML, .json or other for JSON.
// Files larger than MaxPersonaFileSize are rejected.
//
// Symlinks are supported: os.Stat follows symlinks, so a symlink pointing to
// a regular file will pass the IsRegular() check. Symlinks to non-regular files
// (directories, FIFOs, devices) are still rejected.
func LoadPersona(path string) (*Persona, error) {
// os.Stat follows symlinks, so symlinks to regular files are supported.
// The IsRegular() check operates on the target, not the symlink itself.
info, err := os.Stat(path)
if err != nil {
return nil, fmt.Errorf("read persona file %s: %w", path, err)
}
if !info.Mode().IsRegular() {
return nil, fmt.Errorf("persona file %s is not a regular file", path)
}
if info.Size() > MaxPersonaFileSize {
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
}
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("read persona file %s: %w", path, err)
}
// Re-check size after read to defend against TOCTOU races where file
// grows between stat and read (e.g., appending process, replaced file).
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona file %s exceeds maximum size (%d bytes)", path, MaxPersonaFileSize)
}
return parsePersona(data, path)
}
// LoadBuiltinPersona loads a built-in persona by name.
// Returns an error if the persona doesn't exist.
// Built-in personas are stored in YAML format only (see embed directive).
func LoadBuiltinPersona(name string) (*Persona, error) {
yamlFile := name + ".yaml"
data, err := embeddedPersonas.ReadFile("personas/" + yamlFile)
if err != nil {
available := ListBuiltinPersonas()
return nil, fmt.Errorf("unknown built-in persona %q (available: %s)", name, strings.Join(available, ", "))
}
return parsePersona(data, "builtin:"+yamlFile)
}
// ListBuiltinPersonas returns the names of all built-in personas in sorted order.
// Returns an empty slice if the embedded directory cannot be read.
func ListBuiltinPersonas() []string {
entries, err := embeddedPersonas.ReadDir("personas")
if err != nil {
return []string{}
}
seen := make(map[string]bool)
for _, e := range entries {
if e.IsDir() {
continue
}
name := e.Name()
// Strip extension to get persona name
var personaName string
switch {
case strings.HasSuffix(name, ".yaml"):
personaName = strings.TrimSuffix(name, ".yaml")
case strings.HasSuffix(name, ".yml"):
personaName = strings.TrimSuffix(name, ".yml")
case strings.HasSuffix(name, ".json"):
personaName = strings.TrimSuffix(name, ".json")
default:
continue
}
seen[personaName] = true
}
names := make([]string, 0, len(seen))
for name := range seen {
names = append(names, name)
}
sort.Strings(names)
return names
}
// parsePersona parses persona data from JSON or YAML format.
// Format is detected by the source file extension.
func parsePersona(data []byte, source string) (*Persona, error) {
lowerSource := strings.ToLower(source)
isYAML := strings.HasSuffix(lowerSource, ".yaml") || strings.HasSuffix(lowerSource, ".yml")
var p Persona
var err error
if isYAML {
err = unmarshalYAMLWithDepthLimit(data, &p, MaxYAMLDepth)
} else {
// Use json.Decoder with DisallowUnknownFields for consistency with
// YAML's Strict() - both reject unknown fields to catch typos.
dec := json.NewDecoder(bytes.NewReader(data))
dec.DisallowUnknownFields()
err = dec.Decode(&p)
if err == nil {
// Reject trailing content after the first valid JSON object.
// Without this check, input like `{"name":"x"}garbage` would
// silently succeed because Decoder stops after one object.
var dummy json.RawMessage
if err2 := dec.Decode(&dummy); err2 != io.EOF {
err = fmt.Errorf("unexpected trailing content after JSON object")
}
}
}
if err != nil {
return nil, fmt.Errorf("parse persona %s: %w", source, err)
}
if err := validatePersona(&p, source); err != nil {
return nil, err
}
return &p, nil
}
// unmarshalYAMLWithDepthLimit unmarshals YAML data with three safety checks:
// - Depth limiting: rejects AST trees exceeding maxDepth to prevent stack exhaustion.
// - Multi-document rejection: prevents silent data loss from ignored extra documents.
// - Strict field checking: rejects unknown YAML keys to catch typos early.
func unmarshalYAMLWithDepthLimit(data []byte, out any, maxDepth int) error {
// First pass: parse into AST to check depth limits, node counts, and
// multi-document rejection. This prevents stack exhaustion before we
// attempt to decode into structs.
file, err := parser.ParseBytes(data, 0)
if err != nil {
return err
}
// Reject empty YAML input (whitespace-only, comment-only, or truly empty files).
// The parser returns a single doc with nil body for these cases.
if len(file.Docs) == 0 || file.Docs[0].Body == nil {
return fmt.Errorf("empty YAML document")
}
// Reject multi-document YAML files - silently ignoring additional documents
// could lead to confusing behavior where users think their changes take effect.
if len(file.Docs) > 1 {
return fmt.Errorf("multi-document YAML is not supported; only single-document files are allowed")
}
nodeCount := 0
if err := checkYAMLDepth(file.Docs[0].Body, 0, maxDepth, MaxYAMLNodes, make(map[ast.Node]int), make(map[ast.Node]bool), &nodeCount); err != nil {
return err
}
// Second pass: decode with strict field checking enabled.
// Strict() rejects unknown keys, catching typos like "focuss" or "identiy".
//
// Safety note: goccy/go-yaml's decoder does not expand YAML aliases
// recursively — it resolves them via the pre-built AST, which our first
// pass already depth-checked. Alias chains that would exceed depth limits
// are caught above; the decoder merely reads the resolved scalar values.
dec := yaml.NewDecoder(bytes.NewReader(data), yaml.Strict())
return dec.Decode(out)
}
// checkYAMLDepth recursively checks that YAML AST nodes don't exceed the depth
// limit or the total node count limit. It uses two tracking maps:
// - validated: maps each node to the maximum depth at which it was previously
// checked. If a node is revisited at a deeper depth (e.g., via an alias),
// we re-check it to ensure the combined effective depth doesn't exceed limits.
// - visiting: per-path recursion stack for true cycle detection. A node on the
// current path is a cycle (alias loop); we return nil to avoid infinite recursion.
//
// This design prevents the alias depth bypass where an anchored subtree validated
// at a shallow depth could be referenced via alias at a greater depth, effectively
// exceeding MaxYAMLDepth.
func checkYAMLDepth(node ast.Node, depth, maxDepth, maxNodes int, validated map[ast.Node]int, visiting map[ast.Node]bool, nodeCount *int) error {
if node == nil {
return nil
}
if depth > maxDepth {
return fmt.Errorf("YAML nesting depth exceeds maximum (%d)", maxDepth)
}
// Cycle detection: if we're currently visiting this node on the current
// recursion path, it's a cycle (e.g., alias pointing to an ancestor).
// Return nil to break the cycle without error — cycles are a structural
// property, not a depth violation.
if visiting[node] {
return nil
}
// Track total nodes visited as defense-in-depth against wide-but-shallow attacks.
// Placed after cycle detection but before the depth-aware short-circuit. This means
// nodes revisited at shallower depths (via aliases) are counted each time they are
// encountered — intentional conservative overcounting. This bounds the total work
// performed during validation rather than tracking unique nodes, which is the safer
// security posture for untrusted YAML input.
*nodeCount++
if *nodeCount > maxNodes {
return fmt.Errorf("YAML node count exceeds maximum (%d)", maxNodes)
}
// Depth-aware short-circuit: skip re-validation only when the current visit
// depth is the same or shallower than the depth at which this node was
// previously validated. A shallower (or equal) current depth means the
// prior, deeper validation already covered any subtree depth violations.
// If the current depth exceeds the previous validation depth (e.g., an alias
// references this node deeper in the tree), we must re-traverse to ensure
// the combined effective depth doesn't exceed maxDepth.
//
// Note: using ast.Node (interface) as map key relies on pointer identity,
// which is correct because all goccy/go-yaml AST node types are pointer
// receivers (*MappingNode, *SequenceNode, etc.), never value types.
if prevDepth, ok := validated[node]; ok && depth <= prevDepth {
return nil
}
validated[node] = depth
// Mark as visiting (on the current recursion path) for cycle detection.
visiting[node] = true
defer func() { visiting[node] = false }()
// Walk children based on node type.
switch n := node.(type) {
case *ast.MappingNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.MappingValueNode:
// Both Key and Value are visited at depth+1 relative to this
// MappingValueNode. Since MappingNode visits its MappingValueNode
// children at depth+1 as well, keys and values end up at depth+2
// from the parent MappingNode. This is intentional: it mirrors the
// actual nesting structure (mapping → key-value pair → key/value).
if err := checkYAMLDepth(n.Key, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.SequenceNode:
for _, value := range n.Values {
if err := checkYAMLDepth(value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
}
case *ast.AliasNode:
// Follow alias to its target, incrementing depth since aliases expand
// the effective structure.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.AnchorNode:
// Increment depth for anchor values as a conservative measure: the
// anchor definition itself is structural, and treating it as a depth
// level ensures that deeply nested anchors are caught at definition
// time rather than only when referenced via alias. This +1 is
// asymmetric with alias (which also increments) — by design, the
// effective depth budget for anchored-then-aliased content is reduced
// because both the definition site and the reference site each consume
// a level, making deeply nested anchor/alias pairs hit the limit sooner.
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.TagNode:
if err := checkYAMLDepth(n.Value, depth+1, maxDepth, maxNodes, validated, visiting, nodeCount); err != nil {
return err
}
case *ast.MergeKeyNode:
// MergeKeyNode represents the literal "<<" merge key token. It has no
// child nodes — the value side of a merge (e.g., *alias) lives in the
// parent MappingValueNode.Value, which is already recursed into above.
// Explicitly listed here (rather than in the default case) to prevent
// future library changes from silently bypassing depth checks.
default:
// Scalar leaf nodes (StringNode, IntegerNode, FloatNode, BoolNode,
// NullNode, InfinityNode, NanNode, LiteralNode) have no children to
// recurse into.
}
return nil
}
// ParsePersonaBytes parses persona data from bytes with a source label for errors.
// This is useful for parsing personas fetched from external sources (e.g., Gitea API)
// without requiring filesystem access. Format is detected by source extension.
// Input is bounded by MaxPersonaFileSize to prevent resource exhaustion.
func ParsePersonaBytes(data []byte, source string) (*Persona, error) {
if len(data) > MaxPersonaFileSize {
return nil, fmt.Errorf("persona data from %s exceeds maximum size (%d bytes, limit %d)", source, len(data), MaxPersonaFileSize)
}
return parsePersona(data, source)
}
func validatePersona(p *Persona, source string) error {
if p.Name == "" {
return fmt.Errorf("persona %s: name is required", source)
}
if p.Identity == "" {
return fmt.Errorf("persona %s: identity is required", source)
}
// DisplayName defaults to Name if not set
if p.DisplayName == "" {
p.DisplayName = p.Name
}
return nil
}
// CapitalizeFirst capitalizes the first rune of a string in a Unicode-safe way.
// Returns the original string if it's empty.
func CapitalizeFirst(s string) string {
if s == "" {
return s
}
r, size := utf8.DecodeRuneInString(s)
if r == utf8.RuneError {
return s
}
return strings.ToUpper(string(r)) + s[size:]
}
-104
View File
@@ -1,104 +0,0 @@
package review
import (
"fmt"
"strings"
)
// BuildPersonaSystemPrompt constructs a system prompt from a persona definition.
// This replaces BuildSystemBase when a persona is provided.
func BuildPersonaSystemPrompt(p *Persona) string {
var sb strings.Builder
// Identity section
sb.WriteString(p.Identity)
sb.WriteString("\n\n")
// Focus section
if len(p.Focus) > 0 {
sb.WriteString("## Focus Areas\n\n")
sb.WriteString("Concentrate your review on:\n")
for _, f := range p.Focus {
sb.WriteString(fmt.Sprintf("- %s\n", f))
}
sb.WriteString("\n")
}
// Ignore section
if len(p.Ignore) > 0 {
sb.WriteString("## Explicitly Out of Scope\n\n")
sb.WriteString("Do NOT comment on:\n")
for _, i := range p.Ignore {
sb.WriteString(fmt.Sprintf("- %s\n", i))
}
sb.WriteString("\n")
}
// Severity calibration
if p.Severity.Major != "" || p.Severity.Minor != "" || p.Severity.Nit != "" {
sb.WriteString("## Severity Calibration\n\n")
sb.WriteString("Use these severity definitions for YOUR domain:\n")
if p.Severity.Major != "" {
sb.WriteString(fmt.Sprintf("- **MAJOR**: %s\n", p.Severity.Major))
}
if p.Severity.Minor != "" {
sb.WriteString(fmt.Sprintf("- **MINOR**: %s\n", p.Severity.Minor))
}
if p.Severity.Nit != "" {
sb.WriteString(fmt.Sprintf("- **NIT**: %s\n", p.Severity.Nit))
}
sb.WriteString("\n")
}
// Output format instructions (shared schema from prompt.go)
sb.WriteString("## Review Instructions\n\n")
sb.WriteString("CONTEXT:\n")
sb.WriteString("- You will receive the full content of modified files for reference, followed by the diff showing what changed.\n")
sb.WriteString("- The diff shows ONLY what was added/removed. The full file content provides complete context.\n")
sb.WriteString("- Focus your review on the CHANGES (the diff), using the full files for context.\n\n")
sb.WriteString("Your task:\n")
sb.WriteString("1. Review the diff for issues within YOUR focus areas only.\n")
sb.WriteString("2. Consider the CI status — if CI has failed, that is an automatic REQUEST_CHANGES regardless of code quality.\n")
sb.WriteString("3. Output your review as structured JSON (and ONLY JSON, no markdown fences or other text).\n\n")
sb.WriteString("Output format:\n")
sb.WriteString(outputSchemaJSON)
sb.WriteString("\n\n")
sb.WriteString(verdictRules)
sb.WriteString("\n- Only report findings within your focus areas. Ignore everything else.\n")
sb.WriteString("- Line numbers should reference the new file line numbers from the diff headers.\n")
sb.WriteString("- If the diff has no changes relevant to your focus areas, APPROVE with no findings.\n")
// Custom output format if provided
if p.OutputFormat != "" {
sb.WriteString("\n\n## Additional Output Guidelines\n\n")
sb.WriteString(p.OutputFormat)
}
return sb.String()
}
// BuildSystemPromptWithPersona constructs the full system prompt, using either
// a persona or the default generic prompt. This is a convenience wrapper that
// combines BuildPersonaSystemPrompt (or BuildSystemBase) with patterns and conventions.
// It is exported for use by callers who want one-shot prompt assembly.
func BuildSystemPromptWithPersona(persona *Persona, conventions, patterns string) string {
var base string
if persona != nil {
base = BuildPersonaSystemPrompt(persona)
} else {
base = BuildSystemBase()
}
var sb strings.Builder
sb.WriteString(base)
if patterns != "" {
sb.WriteString(fmt.Sprintf("\n\n## Language Patterns & Idioms\n\nUse the following patterns as review criteria. Code that violates these established patterns is a finding:\n\n%s\n", patterns))
}
if conventions != "" {
sb.WriteString(fmt.Sprintf("\n\n## Repository Conventions\n\nThe repository has the following coding conventions that must be respected:\n\n%s\n", conventions))
}
return sb.String()
}
-157
View File
@@ -1,157 +0,0 @@
package review
import (
"strings"
"testing"
)
func TestBuildPersonaSystemPrompt(t *testing.T) {
p := &Persona{
Name: "security",
DisplayName: "Security Specialist",
Identity: "You are a security specialist.",
Focus: []string{"injection attacks", "auth bypass"},
Ignore: []string{"code style", "performance"},
Severity: Severity{
Major: "exploitable vulnerabilities",
Minor: "defense in depth",
Nit: "theoretical risks",
},
}
prompt := BuildPersonaSystemPrompt(p)
// Check identity is included
if !strings.Contains(prompt, "You are a security specialist.") {
t.Error("prompt should contain identity")
}
// Check focus areas
if !strings.Contains(prompt, "Focus Areas") {
t.Error("prompt should contain Focus Areas section")
}
if !strings.Contains(prompt, "injection attacks") {
t.Error("prompt should contain focus item")
}
// Check ignore section
if !strings.Contains(prompt, "Out of Scope") {
t.Error("prompt should contain Out of Scope section")
}
if !strings.Contains(prompt, "code style") {
t.Error("prompt should contain ignore item")
}
// Check severity calibration
if !strings.Contains(prompt, "Severity Calibration") {
t.Error("prompt should contain Severity Calibration section")
}
if !strings.Contains(prompt, "exploitable vulnerabilities") {
t.Error("prompt should contain major severity definition")
}
// Check JSON output format is included
if !strings.Contains(prompt, `"verdict"`) {
t.Error("prompt should contain JSON output format")
}
if !strings.Contains(prompt, "APPROVE") {
t.Error("prompt should mention APPROVE verdict")
}
}
func TestBuildPersonaSystemPromptMinimal(t *testing.T) {
// Minimal persona with only required fields
p := &Persona{
Name: "minimal",
Identity: "You are a minimal reviewer.",
}
prompt := BuildPersonaSystemPrompt(p)
// Should still work without optional fields
if !strings.Contains(prompt, "You are a minimal reviewer.") {
t.Error("prompt should contain identity")
}
// Should not have empty sections
if strings.Contains(prompt, "Focus Areas") && !strings.Contains(prompt, "Concentrate your review on:") {
t.Error("should not have Focus Areas header without content")
}
}
func TestBuildSystemPromptWithPersona(t *testing.T) {
t.Run("with persona", func(t *testing.T) {
p := &Persona{
Name: "test",
Identity: "Test persona identity.",
Focus: []string{"testing"},
}
prompt := BuildSystemPromptWithPersona(p, "test conventions", "test patterns")
if !strings.Contains(prompt, "Test persona identity.") {
t.Error("should contain persona identity")
}
if !strings.Contains(prompt, "test conventions") {
t.Error("should contain conventions")
}
if !strings.Contains(prompt, "test patterns") {
t.Error("should contain patterns")
}
})
t.Run("without persona", func(t *testing.T) {
prompt := BuildSystemPromptWithPersona(nil, "test conventions", "test patterns")
// Should use default system base
if !strings.Contains(prompt, "expert code reviewer") {
t.Error("should contain default system base when no persona")
}
if !strings.Contains(prompt, "test conventions") {
t.Error("should contain conventions")
}
})
t.Run("empty conventions and patterns", func(t *testing.T) {
p := &Persona{
Name: "test",
Identity: "Test identity.",
}
prompt := BuildSystemPromptWithPersona(p, "", "")
if strings.Contains(prompt, "Language Patterns") {
t.Error("should not contain patterns section when empty")
}
if strings.Contains(prompt, "Repository Conventions") {
t.Error("should not contain conventions section when empty")
}
})
}
func TestPersonaPromptContainsOutputRules(t *testing.T) {
p := &Persona{
Name: "test",
Identity: "Test.",
}
prompt := BuildPersonaSystemPrompt(p)
// Must contain the critical output rules
requiredStrings := []string{
"APPROVE",
"REQUEST_CHANGES",
"MAJOR",
"MINOR",
"NIT",
"verdict",
"findings",
"CI",
}
for _, s := range requiredStrings {
if !strings.Contains(prompt, s) {
t.Errorf("prompt should contain %q", s)
}
}
}
-959
View File
@@ -1,959 +0,0 @@
package review
import (
"fmt"
"os"
"path/filepath"
"strings"
"testing"
"github.com/goccy/go-yaml/ast"
)
func TestLoadBuiltinPersona(t *testing.T) {
tests := []struct {
name string
personaName string
wantErr bool
wantDisplay string
}{
{
name: "security persona",
personaName: "security",
wantErr: false,
wantDisplay: "Security Specialist",
},
{
name: "architect persona",
personaName: "architect",
wantErr: false,
wantDisplay: "Software Architect",
},
{
name: "docs persona",
personaName: "docs",
wantErr: false,
wantDisplay: "Documentation Reviewer",
},
{
name: "unknown persona",
personaName: "nonexistent",
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p, err := LoadBuiltinPersona(tt.personaName)
if tt.wantErr {
if err == nil {
t.Error("expected error, got nil")
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if p.Name != tt.personaName {
t.Errorf("Name = %q, want %q", p.Name, tt.personaName)
}
if p.DisplayName != tt.wantDisplay {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, tt.wantDisplay)
}
if p.Identity == "" {
t.Error("Identity should not be empty")
}
if len(p.Focus) == 0 {
t.Error("Focus should not be empty")
}
})
}
}
func TestListBuiltinPersonas(t *testing.T) {
names := ListBuiltinPersonas()
if len(names) == 0 {
t.Fatal("expected at least one built-in persona")
}
// Check for expected personas
expected := map[string]bool{"security": false, "architect": false, "docs": false}
for _, name := range names {
if _, ok := expected[name]; ok {
expected[name] = true
}
}
for name, found := range expected {
if !found {
t.Errorf("expected built-in persona %q not found", name)
}
}
}
func TestLoadPersonaFromYAMLFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.yaml")
content := `# Test persona
name: test
display_name: Test Persona
identity: |
You are a test persona.
Multi-line identity works.
focus:
- testing
- validation
ignore:
- nothing
severity:
major: Big problems
minor: Small problems
nit: Tiny problems
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
if p.DisplayName != "Test Persona" {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
}
if len(p.Focus) != 2 {
t.Errorf("Focus len = %d, want 2", len(p.Focus))
}
if !strings.Contains(p.Identity, "Multi-line") {
t.Error("Identity should contain multi-line content")
}
}
func TestLoadPersonaFromYMLFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.yml")
content := `name: test
display_name: Test YML
identity: Test identity
focus:
- testing
ignore: []
severity:
major: Big
minor: Small
nit: Tiny
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
if p.DisplayName != "Test YML" {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test YML")
}
}
func TestLoadPersonaFromJSONFile(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
content := `{
"name": "test",
"display_name": "Test Persona",
"identity": "You are a test persona.\nMulti-line identity works.",
"focus": ["testing", "validation"],
"ignore": ["nothing"],
"severity": {
"major": "Big problems",
"minor": "Small problems",
"nit": "Tiny problems"
}
}`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
if p.DisplayName != "Test Persona" {
t.Errorf("DisplayName = %q, want %q", p.DisplayName, "Test Persona")
}
if len(p.Focus) != 2 {
t.Errorf("Focus len = %d, want 2", len(p.Focus))
}
if !strings.Contains(p.Identity, "Multi-line") {
t.Error("Identity should contain multi-line content")
}
}
func TestLoadPersonaValidation(t *testing.T) {
tests := []struct {
name string
content string
ext string
wantErr string
}{
{
name: "missing name yaml",
content: "identity: test\n",
ext: ".yaml",
wantErr: "name is required",
},
{
name: "missing identity yaml",
content: "name: test\n",
ext: ".yaml",
wantErr: "identity is required",
},
{
name: "missing name json",
content: `{"identity": "test"}`,
ext: ".json",
wantErr: "name is required",
},
{
name: "missing identity json",
content: `{"name": "test"}`,
ext: ".json",
wantErr: "identity is required",
},
{
name: "display_name defaults to name",
content: "name: test\nidentity: test identity\n",
ext: ".yaml",
// No error expected - should succeed
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test"+tt.ext)
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if tt.wantErr != "" {
if err == nil {
t.Errorf("expected error containing %q, got nil", tt.wantErr)
return
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Check display_name defaulting
if p.DisplayName == "" {
t.Error("DisplayName should default to Name")
}
if p.DisplayName != p.Name {
t.Errorf("DisplayName should default to Name, got %q", p.DisplayName)
}
})
}
}
func TestLoadPersonaFileNotFound(t *testing.T) {
_, err := LoadPersona("/nonexistent/path/persona.yaml")
if err == nil {
t.Error("expected error for nonexistent file")
}
}
func TestLoadPersonaInvalidYAML(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "invalid.yaml")
if err := os.WriteFile(path, []byte("not valid yaml:\n - [broken"), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for invalid YAML")
}
}
func TestLoadPersonaInvalidJSON(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "invalid.json")
if err := os.WriteFile(path, []byte("not valid json {"), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for invalid JSON")
}
}
func TestLoadPersonaCaseInsensitiveExtension(t *testing.T) {
tests := []struct {
name string
ext string
}{
{"lowercase yaml", ".yaml"},
{"uppercase YAML", ".YAML"},
{"mixed case Yaml", ".Yaml"},
{"lowercase yml", ".yml"},
{"uppercase YML", ".YML"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test"+tt.ext)
content := "name: test\nidentity: test identity\n"
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed for extension %s: %v", tt.ext, err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
})
}
}
func TestCapitalizeFirst(t *testing.T) {
tests := []struct {
input string
want string
}{
{"hello", "Hello"},
{"Hello", "Hello"},
{"HELLO", "HELLO"},
{"a", "A"},
{"", ""},
{"日本語", "日本語"}, // Non-ASCII: Japanese doesn't have case
{"über", "Über"}, // German umlaut
{"élève", "Élève"}, // French accent
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
got := CapitalizeFirst(tt.input)
if got != tt.want {
t.Errorf("CapitalizeFirst(%q) = %q, want %q", tt.input, got, tt.want)
}
})
}
}
func TestListBuiltinPersonasReturnsEmptySlice(t *testing.T) {
// ListBuiltinPersonas should return an empty slice (not nil) on error.
// We can't easily test the error case, but we can verify the success case
// returns a proper slice.
names := ListBuiltinPersonas()
if names == nil {
t.Error("ListBuiltinPersonas should return empty slice, not nil")
}
}
func TestYAMLMultilineStrings(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "multiline.yaml")
// Test literal block scalar (|) which preserves newlines
content := `name: multiline
display_name: Multiline Test
identity: |
First line.
Second line.
Third line.
focus:
- item one
ignore: []
severity:
major: Major issue
minor: Minor issue
nit: Nit
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
// Literal block scalar preserves newlines
if !strings.Contains(p.Identity, "\n") {
t.Error("Identity should contain newlines from literal block scalar")
}
if !strings.Contains(p.Identity, "Second line") {
t.Error("Identity should contain 'Second line'")
}
}
func TestYAMLComments(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "comments.yaml")
content := `# This is a comment
name: commented # inline comment
display_name: Commented Persona
# Another comment
identity: Test identity
focus:
- item # comment after item
ignore: []
severity:
major: Major
minor: Minor
nit: Nit
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
p, err := LoadPersona(path)
if err != nil {
t.Fatalf("LoadPersona failed: %v", err)
}
// Comments should be ignored
if p.Name != "commented" {
t.Errorf("Name = %q, want %q", p.Name, "commented")
}
if p.Focus[0] != "item" {
t.Errorf("Focus[0] = %q, want %q", p.Focus[0], "item")
}
}
func TestYAMLDeeplyNestedRejection(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "deeply-nested.yaml")
// Build a deeply nested YAML structure that exceeds MaxYAMLDepth (20).
// Depth accumulation trace for "nested: \n level0: \n level1: ...":
// - Document root parsed at depth 0
// - Root MappingNode children (MappingValueNodes) visited at depth 1
// - "nested" MappingValueNode: key at depth 2, value at depth 2
// - Each levelN adds depth via MappingValueNode traversal (key + value)
// - Exact depth per level depends on AST structure (MappingNode wrapping),
// but 25 levels reliably exceeds MaxYAMLDepth (20) with comfortable margin.
// The test uses 25 levels rather than exactly 21 to avoid brittleness.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\nnested:\n")
indent := " "
for i := 0; i < 25; i++ {
sb.WriteString(strings.Repeat(indent, i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(indent, 26))
sb.WriteString("value: too-deep\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for deeply nested YAML, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
}
}
func TestYAMLEmptyFileRejection(t *testing.T) {
tests := []struct {
name string
content string
}{
{"completely_empty", ""},
{"whitespace_only", " \n\n "},
{"comment_only", "# just a comment\n"},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, tc.name+".yaml")
if err := os.WriteFile(path, []byte(tc.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for empty YAML input, got nil")
}
if !strings.Contains(err.Error(), "empty YAML document") {
t.Errorf("expected error containing %q, got: %v", "empty YAML document", err)
}
})
}
}
func TestYAMLFileSizeLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "huge.yaml")
// Create a file larger than MaxPersonaFileSize (64 KB)
content := "name: test\nidentity: " + strings.Repeat("x", MaxPersonaFileSize+1) + "\n"
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for oversized file, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want containing 'exceeds maximum size'", err.Error())
}
}
func TestYAMLAliasCycleDetection(t *testing.T) {
// Test that our checkYAMLDepth function handles alias cycles gracefully
// by using the visiting map to prevent infinite recursion.
// Create a node structure where an alias points to a parent node,
// simulating what could happen with crafted input.
parent := &ast.MappingNode{
Values: []*ast.MappingValueNode{
{
Key: &ast.StringNode{Value: "name"},
Value: &ast.StringNode{Value: "test"},
},
},
}
// Create a child that aliases back to the parent (artificial cycle)
aliasToParent := &ast.AliasNode{
Value: parent,
}
parent.Values = append(parent.Values, &ast.MappingValueNode{
Key: &ast.StringNode{Value: "nested"},
Value: aliasToParent,
})
nodeCount := 0
validated := make(map[ast.Node]int)
visiting := make(map[ast.Node]bool)
// This should NOT hang or stack overflow - cycle detection prevents infinite recursion
err := checkYAMLDepth(parent, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
if err != nil {
t.Errorf("unexpected error traversing cyclic structure: %v", err)
}
// Verify we tracked the parent in the validated map
if _, ok := validated[parent]; !ok {
t.Error("parent node not tracked in validated map")
}
}
func TestYAMLMultiDocumentRejection(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "multi.yaml")
// Multi-document YAML (documents separated by ---)
content := `name: first
identity: first document
---
name: second
identity: second document
`
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for multi-document YAML, got nil")
}
if !strings.Contains(err.Error(), "multi-document") {
t.Errorf("error = %q, want containing 'multi-document'", err.Error())
}
}
func TestYAMLNodeCountLimit(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "wide.yaml")
// Build a YAML structure that's shallow but wide - many keys at the same level
// to test the node count limit (should exceed MaxYAMLNodes = 1000)
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
for i := 0; i < 600; i++ {
sb.WriteString(fmt.Sprintf("key%d: value%d\n", i, i))
}
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Error("expected error for wide YAML exceeding node count, got nil")
}
if !strings.Contains(err.Error(), "node count exceeds") {
t.Errorf("error = %q, want containing 'node count exceeds'", err.Error())
}
}
func TestCheckYAMLDepthCycleDetectionDirect(t *testing.T) {
// Direct test of cycle detection in checkYAMLDepth by creating
// a node structure with an artificial cycle.
node := &ast.MappingNode{
Values: []*ast.MappingValueNode{
{
Key: &ast.StringNode{Value: "key"},
Value: &ast.StringNode{Value: "value"},
},
},
}
// Create a cycle by making a child reference the parent
cycleChild := &ast.AliasNode{
Value: node, // Points back to the parent
}
node.Values = append(node.Values, &ast.MappingValueNode{
Key: &ast.StringNode{Value: "cyclic"},
Value: cycleChild,
})
nodeCount := 0
validated := make(map[ast.Node]int)
visiting := make(map[ast.Node]bool)
err := checkYAMLDepth(node, 0, MaxYAMLDepth, MaxYAMLNodes, validated, visiting, &nodeCount)
// Should complete without infinite recursion due to cycle detection
if err != nil {
t.Errorf("unexpected error: %v", err)
}
// The validated map should contain multiple entries
if len(validated) < 2 {
t.Errorf("validated map has %d entries, expected at least 2", len(validated))
}
}
func TestYAMLAliasDepthBypass(t *testing.T) {
// Test that an anchored subtree first validated at a shallow depth is
// re-checked when referenced via alias at a deeper position. Without the
// depth-aware validated map, the alias reference would skip re-checking
// and allow the effective nesting to exceed MaxYAMLDepth.
dir := t.TempDir()
path := filepath.Join(dir, "alias-depth-bypass.yaml")
// Build YAML with an anchor at shallow depth containing a subtree near the limit,
// then reference it via alias deep enough that effective depth exceeds MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: test\nidentity: test\n")
// Create the anchored subtree at depth 1 (key level) that nests 15 levels deep.
sb.WriteString("anchor_key: &deep_anchor\n")
for i := 0; i < 15; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("level%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 16))
sb.WriteString("leaf: value\n")
// Create a wrapper that nests 6 levels deep, then references the anchor.
// Effective depth at alias target = 6 (wrapper nesting) + 1 (alias) + 15 (subtree) = 22 > 20
sb.WriteString("wrapper:\n")
for i := 0; i < 6; i++ {
sb.WriteString(strings.Repeat(" ", i+1))
sb.WriteString(fmt.Sprintf("n%d:\n", i))
}
sb.WriteString(strings.Repeat(" ", 7))
sb.WriteString("alias_ref: *deep_anchor\n")
if err := os.WriteFile(path, []byte(sb.String()), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for alias depth bypass, got nil")
}
if !strings.Contains(err.Error(), "nesting depth exceeds") {
t.Errorf("error = %q, want containing 'nesting depth exceeds'", err.Error())
}
}
func TestListBuiltinPersonasSortedOrder(t *testing.T) {
names := ListBuiltinPersonas()
if len(names) < 2 {
t.Skip("need at least 2 personas to test ordering")
}
// Verify the list is sorted
for i := 1; i < len(names); i++ {
if names[i-1] > names[i] {
t.Errorf("ListBuiltinPersonas not sorted: %q > %q", names[i-1], names[i])
}
}
}
func TestYAMLUnknownFieldsRejected(t *testing.T) {
tests := []struct {
name string
content string
wantErr string
}{
{
name: "unknown top-level field",
content: `name: test
identity: test identity
unknown_field: should fail
`,
wantErr: "unknown_field",
},
{
name: "typo in field name",
content: `name: test
identiy: typo should fail
`,
wantErr: "identiy",
},
{
name: "unknown field in severity",
content: `name: test
identity: test
severity:
major: Major
minro: typo
`,
wantErr: "minro",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "unknown.yaml")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Errorf("expected error for unknown field %q, got nil", tt.wantErr)
return
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
})
}
}
func TestJSONUnknownFieldsRejected(t *testing.T) {
tests := []struct {
name string
content string
wantErr string
}{
{
name: "unknown top-level field",
content: `{
"name": "test",
"identity": "test identity",
"unknown_field": "should fail"
}`,
wantErr: "unknown_field",
},
{
name: "typo in field name",
content: `{
"name": "test",
"identiy": "typo should fail"
}`,
wantErr: "identiy",
},
{
name: "unknown field in severity",
content: `{
"name": "test",
"identity": "test",
"severity": {
"major": "ok",
"miner": "typo"
}
}`,
wantErr: "miner",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for unknown field, got nil")
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want to contain %q", err.Error(), tt.wantErr)
}
})
}
}
func TestLoadPersonaSymlink(t *testing.T) {
// Create a regular persona file
dir := t.TempDir()
realFile := filepath.Join(dir, "real.yaml")
content := `name: test
identity: test identity
`
if err := os.WriteFile(realFile, []byte(content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
// Create a symlink to it
symlink := filepath.Join(dir, "link.yaml")
if err := os.Symlink(realFile, symlink); err != nil {
t.Fatalf("failed to create symlink: %v", err)
}
// LoadPersona should work via symlink
p, err := LoadPersona(symlink)
if err != nil {
t.Fatalf("LoadPersona via symlink failed: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestJSONTrailingContentRejected(t *testing.T) {
tests := []struct {
name string
content string
}{
{
name: "trailing garbage after object",
content: `{"name":"test","identity":"test identity"}garbage`,
},
{
name: "two JSON objects",
content: `{"name":"test","identity":"test identity"}{"name":"other"}`,
},
{
name: "trailing array",
content: `{"name":"test","identity":"test identity"}[]`,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.json")
if err := os.WriteFile(path, []byte(tt.content), 0644); err != nil {
t.Fatalf("failed to write test file: %v", err)
}
_, err := LoadPersona(path)
if err == nil {
t.Fatal("expected error for trailing content, got nil")
}
if !strings.Contains(err.Error(), "trailing content") {
t.Errorf("error = %q, want to contain 'trailing content'", err.Error())
}
})
}
}
func TestParsePersonaBytesSizeLimit(t *testing.T) {
// ParsePersonaBytes should reject input exceeding MaxPersonaFileSize
oversized := make([]byte, MaxPersonaFileSize+1)
for i := range oversized {
oversized[i] = 'x'
}
_, err := ParsePersonaBytes(oversized, "oversized.yaml")
if err == nil {
t.Fatal("expected error for oversized input, got nil")
}
if !strings.Contains(err.Error(), "exceeds maximum size") {
t.Errorf("error = %q, want to contain 'exceeds maximum size'", err.Error())
}
// Just under the limit should not trigger size error (may fail parse, but not size)
underLimit := []byte("name: test\nidentity: test persona\n")
p, err := ParsePersonaBytes(underLimit, "valid.yaml")
if err != nil {
t.Fatalf("unexpected error for valid input: %v", err)
}
if p.Name != "test" {
t.Errorf("Name = %q, want %q", p.Name, "test")
}
}
func TestYAMLMergeKeyDepthCheck(t *testing.T) {
// Verify that YAML merge keys (<<: *alias) are properly handled by the
// depth checker. The merge key content is in the MappingValueNode.Value
// (an AliasNode), not in the MergeKeyNode itself.
p, err := ParsePersonaBytes([]byte("name: merge-test\nidentity: test\n"), "merge.yaml")
if err != nil {
t.Fatalf("basic parse failed: %v", err)
}
if p.Name != "merge-test" {
t.Errorf("Name = %q, want %q", p.Name, "merge-test")
}
// Test that deeply nested merge keys still hit depth limit.
// Build YAML with merge key content nested beyond MaxYAMLDepth.
var sb strings.Builder
sb.WriteString("name: deep-merge\nidentity: deep merge persona\n")
sb.WriteString("anchor: &deep\n")
indent := " "
for i := 0; i < MaxYAMLDepth+5; i++ {
sb.WriteString(indent)
sb.WriteString(fmt.Sprintf("level%d:\n", i))
indent += " "
}
sb.WriteString(indent + "leaf: value\n")
sb.WriteString("target:\n <<: *deep\n")
_, err = ParsePersonaBytes([]byte(sb.String()), "deep-merge.yaml")
if err == nil {
t.Fatal("expected error for deeply nested merge key content, got nil")
}
if !strings.Contains(err.Error(), "depth") {
t.Errorf("error = %q, want to contain 'depth'", err.Error())
}
}
-37
View File
@@ -1,37 +0,0 @@
# Software Architect Persona
# Focuses on design quality, patterns, and code organization
name: architect
display_name: Software Architect
identity: |
You are a software architect reviewing code for design quality.
Your expertise:
- Design patterns and anti-patterns
- Code organization and module boundaries
- API design and contracts
- Testability and dependency injection
- Consistency with existing architecture
- Technical debt identification
focus:
- Design pattern violations or misuse
- Module boundary violations (inappropriate coupling)
- API design issues (unclear contracts, leaky abstractions)
- Testability problems (hidden dependencies, god objects)
- Inconsistency with existing codebase patterns
- Unnecessary complexity or over-engineering
- Missing abstractions or premature abstraction
ignore:
- Security vulnerabilities (security persona handles these)
- Performance micro-optimizations
- Code style and formatting
- Documentation typos
- Test implementation details
severity:
major: "Architectural violations that will cause maintenance problems or make the codebase harder to evolve"
minor: "Design issues that reduce clarity or testability but don't block progress"
nit: "Minor pattern deviations or style preferences"
-36
View File
@@ -1,36 +0,0 @@
# Documentation Reviewer Persona
# Focuses on clarity, documentation quality, and self-documenting code
name: docs
display_name: Documentation Reviewer
identity: |
You are a documentation specialist reviewing code for clarity and documentation quality.
Your expertise:
- API documentation and examples
- Code comments and their accuracy
- Error message clarity
- README and guide quality
- Naming clarity and self-documenting code
focus:
- Missing or outdated documentation
- Unclear or misleading comments
- Poor error messages (cryptic, unhelpful, missing context)
- Confusing naming (functions, variables, types)
- Missing examples for complex APIs
- Inconsistent terminology
- Documentation that contradicts the code
ignore:
- Security vulnerabilities
- Performance issues
- Design patterns
- Test coverage
- Code style (unless it affects readability)
severity:
major: "Documentation that actively misleads or missing docs for critical functionality"
minor: "Unclear documentation or poor error messages that will confuse users"
nit: "Minor clarity improvements or typo fixes"
-37
View File
@@ -1,37 +0,0 @@
# Security Specialist Persona
# Focuses on vulnerabilities, auth issues, and security best practices
name: security
display_name: Security Specialist
identity: |
You are a security specialist reviewing code for vulnerabilities.
Your expertise:
- OWASP Top 10 vulnerabilities
- Injection attacks (SQL, command, path traversal, template)
- Authentication and authorization patterns
- Secrets management and exposure risks
- Race conditions with security implications
- Event sourcing attack vectors (replay attacks, event injection)
focus:
- Injection attacks (SQL, command, path traversal, template injection)
- Authentication and authorization gaps or bypasses
- Secrets exposure (hardcoded credentials, tokens in logs, config leaks)
- Input validation failures (unsanitized input, unsafe deserialization)
- Race conditions that could be exploited
- Cryptographic weaknesses (weak algorithms, improper key handling)
- Information disclosure through error messages or logs
ignore:
- Code style and naming conventions
- Performance optimizations (unless security-related)
- Documentation quality
- General code quality or readability
- Test coverage
severity:
major: "Exploitable vulnerabilities: auth bypass, injection, data exfiltration, privilege escalation, RCE"
minor: "Defense-in-depth issues: missing rate limiting, verbose errors, weak input validation"
nit: "Theoretical risks with low exploitability or impact"
+18 -26
View File
@@ -7,28 +7,6 @@ import (
"strings"
)
// outputSchemaJSON is the shared JSON output format specification used by both
// the generic reviewer and persona-based reviewers.
const outputSchemaJSON = `{
"verdict": "APPROVE" or "REQUEST_CHANGES",
"summary": "Brief overall assessment (1-3 sentences)",
"findings": [
{
"severity": "MAJOR" or "MINOR" or "NIT",
"file": "path/to/file",
"line": <line number from the diff>,
"finding": "Description of the issue"
}
],
"recommendation": "Full recommendation text explaining your verdict"
}`
// verdictRules is the shared verdict determination rules.
const verdictRules = `Rules:
- If there are any MAJOR findings → verdict must be REQUEST_CHANGES
- If there are no MAJOR findings → verdict should be APPROVE
- If CI has failed → verdict must be REQUEST_CHANGES with a finding noting the CI failure`
// BuildSystemBase returns the core system prompt instructions without
// patterns or conventions. Used by the budget package to separate
// trimmable from non-trimmable content.
@@ -45,10 +23,24 @@ func BuildSystemBase() string {
sb.WriteString("2. Consider the CI status — if CI has failed, that is an automatic REQUEST_CHANGES regardless of code quality.\n")
sb.WriteString("3. Output your review as structured JSON (and ONLY JSON, no markdown fences or other text).\n\n")
sb.WriteString("Output format:\n")
sb.WriteString(outputSchemaJSON)
sb.WriteString("\n\n")
sb.WriteString(verdictRules)
sb.WriteString("\n- Be thorough but fair. Don't nitpick style unless it impacts readability significantly.\n")
sb.WriteString("{\n")
sb.WriteString(" \"verdict\": \"APPROVE\" or \"REQUEST_CHANGES\",\n")
sb.WriteString(" \"summary\": \"Brief overall assessment (1-3 sentences)\",\n")
sb.WriteString(" \"findings\": [\n")
sb.WriteString(" {\n")
sb.WriteString(" \"severity\": \"MAJOR\" or \"MINOR\" or \"NIT\",\n")
sb.WriteString(" \"file\": \"path/to/file\",\n")
sb.WriteString(" \"line\": <line number from the diff>,\n")
sb.WriteString(" \"finding\": \"Description of the issue\"\n")
sb.WriteString(" }\n")
sb.WriteString(" ],\n")
sb.WriteString(" \"recommendation\": \"Full recommendation text explaining your verdict\"\n")
sb.WriteString("}\n\n")
sb.WriteString("Rules:\n")
sb.WriteString("- If there are any MAJOR findings → verdict must be REQUEST_CHANGES\n")
sb.WriteString("- If there are no MAJOR findings → verdict should be APPROVE\n")
sb.WriteString("- If CI has failed → verdict must be REQUEST_CHANGES with a finding noting the CI failure\n")
sb.WriteString("- Be thorough but fair. Don't nitpick style unless it impacts readability significantly.\n")
sb.WriteString("- Line numbers should reference the new file line numbers from the diff headers.\n")
sb.WriteString("- If the diff is empty or trivial (only formatting/whitespace), APPROVE with no findings.\n")
-150
View File
@@ -1,150 +0,0 @@
package review
import (
"context"
"log/slog"
"strings"
)
// RepoPersonaPath is the directory path where repo-specific personas are stored.
const RepoPersonaPath = ".review-bot/personas"
// GiteaClient defines the subset of gitea.Client methods needed for loading repo personas.
// This interface allows for easier testing and decouples the review package from gitea.
type GiteaClient interface {
ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error)
GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error)
}
// ContentEntry represents a file or directory entry from the contents API.
// This mirrors gitea.ContentEntry to avoid import cycles.
type ContentEntry struct {
Name string `json:"name"`
Path string `json:"path"`
Type string `json:"type"` // "file" or "dir"
}
// LoadRepoPersonas fetches personas from a repository's .review-bot/personas/ directory.
// Returns an empty map (not nil) if the directory doesn't exist or is empty.
// Individual parse failures are logged and skipped; the remaining personas are still returned.
// Auth errors and other non-404 errors are propagated.
// Files exceeding MaxPersonaFileSize are rejected to prevent resource exhaustion.
func LoadRepoPersonas(ctx context.Context, client GiteaClient, owner, repo string) (map[string]*Persona, error) {
result := make(map[string]*Persona)
entries, err := client.ListContents(ctx, owner, repo, RepoPersonaPath)
if err != nil {
// Check if this is a 404 (directory doesn't exist) - expected case
if isNotFoundError(err) {
slog.Debug("no repo personas directory found", "repo", owner+"/"+repo)
return result, nil
}
// Other errors (auth, server) should propagate
return nil, err
}
if len(entries) == 0 {
slog.Debug("repo personas directory is empty", "repo", owner+"/"+repo)
return result, nil
}
for _, entry := range entries {
if entry.Type != "file" {
continue
}
// Only process YAML files
if !isYAMLFile(entry.Name) {
continue
}
content, err := client.GetFileContent(ctx, owner, repo, entry.Path)
if err != nil {
slog.Warn("could not fetch repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
// Enforce size limit before parsing to prevent resource exhaustion
if len(content) > MaxPersonaFileSize {
slog.Warn("repo persona file exceeds maximum size",
"file", entry.Path,
"repo", owner+"/"+repo,
"size", len(content),
"max", MaxPersonaFileSize)
continue
}
persona, err := ParsePersonaBytes([]byte(content), entry.Path)
if err != nil {
slog.Warn("could not parse repo persona file",
"file", entry.Path,
"repo", owner+"/"+repo,
"error", err)
continue
}
result[persona.Name] = persona
slog.Debug("loaded repo persona",
"name", persona.Name,
"file", entry.Path,
"repo", owner+"/"+repo)
}
return result, nil
}
// MergePersonas combines built-in personas with repo personas.
// Repo personas take precedence on name collision.
// Returns a new map; inputs are not modified.
func MergePersonas(builtin, repo map[string]*Persona) map[string]*Persona {
result := make(map[string]*Persona, len(builtin)+len(repo))
// Copy built-in personas first
for name, p := range builtin {
result[name] = p
}
// Overlay repo personas (override on collision)
for name, p := range repo {
if _, exists := result[name]; exists {
slog.Debug("repo persona overrides built-in", "name", name)
}
result[name] = p
}
return result
}
// GetBuiltinPersonasMap returns all built-in personas as a map keyed by name.
// Returns an empty map (not nil) if loading fails.
func GetBuiltinPersonasMap() map[string]*Persona {
result := make(map[string]*Persona)
for _, name := range ListBuiltinPersonas() {
p, err := LoadBuiltinPersona(name)
if err != nil {
slog.Warn("could not load built-in persona", "name", name, "error", err)
continue
}
result[name] = p
}
return result
}
// isYAMLFile checks if a filename has a YAML extension.
func isYAMLFile(name string) bool {
lower := strings.ToLower(name)
return strings.HasSuffix(lower, ".yaml") || strings.HasSuffix(lower, ".yml")
}
// isNotFoundError checks if an error represents a 404 response.
// This uses a specific "HTTP 404" substring match rather than a generic "not found"
// match to avoid masking authentication failures or transport errors that might
// contain "not found" in their message.
func isNotFoundError(err error) bool {
if err == nil {
return false
}
return strings.Contains(err.Error(), "HTTP 404")
}
-443
View File
@@ -1,443 +0,0 @@
package review
import (
"context"
"errors"
"strings"
"testing"
)
func TestParsePersonaBytes(t *testing.T) {
tests := []struct {
name string
data string
source string
wantName string
wantErr string
}{
{
name: "valid yaml",
data: `name: test
identity: test identity
focus:
- testing
`,
source: "test.yaml",
wantName: "test",
},
{
name: "missing name",
data: "identity: test\n",
source: "test.yaml",
wantErr: "name is required",
},
{
name: "invalid yaml",
data: "not: valid:\n yaml: [broken",
source: "test.yaml",
wantErr: "parse",
},
{
name: "json format by extension",
data: `{"name": "jsontest", "identity": "json identity"}`,
source: "test.json",
wantName: "jsontest",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
p, err := ParsePersonaBytes([]byte(tt.data), tt.source)
if tt.wantErr != "" {
if err == nil {
t.Fatalf("expected error containing %q, got nil", tt.wantErr)
}
if !strings.Contains(err.Error(), tt.wantErr) {
t.Errorf("error = %q, want containing %q", err.Error(), tt.wantErr)
}
return
}
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if p.Name != tt.wantName {
t.Errorf("Name = %q, want %q", p.Name, tt.wantName)
}
})
}
}
// mockGiteaClient implements GiteaClient for testing.
type mockGiteaClient struct {
contents map[string][]ContentEntry // path -> entries
files map[string]string // path -> content
listErr error
fileErr map[string]error // path -> error
}
func (m *mockGiteaClient) ListContents(ctx context.Context, owner, repo, path string) ([]ContentEntry, error) {
if m.listErr != nil {
return nil, m.listErr
}
entries, ok := m.contents[path]
if !ok {
return nil, errors.New("list contents .review-bot/personas: HTTP 404: not found")
}
return entries, nil
}
func (m *mockGiteaClient) GetFileContent(ctx context.Context, owner, repo, filepath string) (string, error) {
if m.fileErr != nil {
if err, ok := m.fileErr[filepath]; ok {
return "", err
}
}
content, ok := m.files[filepath]
if !ok {
return "", errors.New("HTTP 404: file not found")
}
return content, nil
}
func TestLoadRepoPersonas(t *testing.T) {
ctx := context.Background()
t.Run("directory not found returns empty map", func(t *testing.T) {
client := &mockGiteaClient{} // No contents configured -> 404
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if personas == nil {
t.Error("expected empty map, got nil")
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("empty directory returns empty map", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {},
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 0 {
t.Errorf("expected 0 personas, got %d", len(personas))
}
})
t.Run("loads valid personas", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "trading.yaml", Path: ".review-bot/personas/trading.yaml", Type: "file"},
{Name: "crypto.yaml", Path: ".review-bot/personas/crypto.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/trading.yaml": `name: trading
display_name: Trading Expert
identity: You are a trading expert.
focus:
- order handling
- risk management
`,
".review-bot/personas/crypto.yaml": `name: crypto
display_name: Crypto Expert
identity: You are a cryptography expert.
focus:
- key management
- encryption
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 2 {
t.Fatalf("expected 2 personas, got %d", len(personas))
}
if personas["trading"] == nil {
t.Error("expected trading persona")
}
if personas["crypto"] == nil {
t.Error("expected crypto persona")
}
if personas["trading"].DisplayName != "Trading Expert" {
t.Errorf("trading display name = %q, want %q", personas["trading"].DisplayName, "Trading Expert")
}
})
t.Run("skips invalid persona files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "valid.yaml", Path: ".review-bot/personas/valid.yaml", Type: "file"},
{Name: "invalid.yaml", Path: ".review-bot/personas/invalid.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/valid.yaml": `name: valid
identity: Valid persona
`,
".review-bot/personas/invalid.yaml": "not valid yaml: [broken",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the valid one, skip the invalid
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped invalid), got %d", len(personas))
}
if personas["valid"] == nil {
t.Error("expected valid persona")
}
})
t.Run("skips non-yaml files", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "README.md", Path: ".review-bot/personas/README.md", Type: "file"},
{Name: "notes.txt", Path: ".review-bot/personas/notes.txt", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
".review-bot/personas/README.md": "# Personas\n\nPut your personas here.",
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (yaml only), got %d", len(personas))
}
})
t.Run("skips subdirectories", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "persona.yaml", Path: ".review-bot/personas/persona.yaml", Type: "file"},
{Name: "subdir", Path: ".review-bot/personas/subdir", Type: "dir"},
},
},
files: map[string]string{
".review-bot/personas/persona.yaml": `name: test
identity: Test persona
`,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (files only), got %d", len(personas))
}
})
t.Run("propagates auth errors", func(t *testing.T) {
client := &mockGiteaClient{
listErr: errors.New("HTTP 401: unauthorized"),
}
_, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err == nil {
t.Fatal("expected error for auth failure")
}
if !strings.Contains(err.Error(), "401") {
t.Errorf("error = %q, want containing '401'", err.Error())
}
})
t.Run("skips files that fail to fetch", func(t *testing.T) {
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "good.yaml", Path: ".review-bot/personas/good.yaml", Type: "file"},
{Name: "bad.yaml", Path: ".review-bot/personas/bad.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/good.yaml": `name: good
identity: Good persona
`,
},
fileErr: map[string]error{
".review-bot/personas/bad.yaml": errors.New("HTTP 500: internal server error"),
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped failed fetch), got %d", len(personas))
}
})
t.Run("skips oversized files", func(t *testing.T) {
// Create a content string that exceeds MaxPersonaFileSize (64KB)
oversizedContent := strings.Repeat("a", MaxPersonaFileSize+1)
client := &mockGiteaClient{
contents: map[string][]ContentEntry{
RepoPersonaPath: {
{Name: "normal.yaml", Path: ".review-bot/personas/normal.yaml", Type: "file"},
{Name: "huge.yaml", Path: ".review-bot/personas/huge.yaml", Type: "file"},
},
},
files: map[string]string{
".review-bot/personas/normal.yaml": `name: normal
identity: Normal sized persona
`,
".review-bot/personas/huge.yaml": oversizedContent,
},
}
personas, err := LoadRepoPersonas(ctx, client, "owner", "repo")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Should have the normal one, skip the oversized
if len(personas) != 1 {
t.Fatalf("expected 1 persona (skipped oversized), got %d", len(personas))
}
if personas["normal"] == nil {
t.Error("expected normal persona")
}
})
}
func TestMergePersonas(t *testing.T) {
builtin := map[string]*Persona{
"security": {Name: "security", Identity: "Built-in security"},
"docs": {Name: "docs", Identity: "Built-in docs"},
}
repo := map[string]*Persona{
"security": {Name: "security", Identity: "Repo security override"},
"trading": {Name: "trading", Identity: "Repo trading"},
}
merged := MergePersonas(builtin, repo)
t.Run("repo overrides builtin on collision", func(t *testing.T) {
if merged["security"].Identity != "Repo security override" {
t.Errorf("security identity = %q, want repo override", merged["security"].Identity)
}
})
t.Run("builtin preserved when no collision", func(t *testing.T) {
if merged["docs"].Identity != "Built-in docs" {
t.Errorf("docs identity = %q, want built-in", merged["docs"].Identity)
}
})
t.Run("repo-only persona added", func(t *testing.T) {
if merged["trading"] == nil {
t.Error("expected trading persona from repo")
}
if merged["trading"].Identity != "Repo trading" {
t.Errorf("trading identity = %q, want repo", merged["trading"].Identity)
}
})
t.Run("original maps not modified", func(t *testing.T) {
if builtin["trading"] != nil {
t.Error("builtin map was modified")
}
if len(repo) != 2 {
t.Error("repo map was modified")
}
})
}
func TestGetBuiltinPersonasMap(t *testing.T) {
personas := GetBuiltinPersonasMap()
if len(personas) == 0 {
t.Fatal("expected at least one built-in persona")
}
// Verify expected personas exist
expected := []string{"security", "architect", "docs"}
for _, name := range expected {
if personas[name] == nil {
t.Errorf("expected built-in persona %q", name)
}
}
// Verify personas are valid
for name, p := range personas {
if p.Name != name {
t.Errorf("persona %q has mismatched name %q", name, p.Name)
}
if p.Identity == "" {
t.Errorf("persona %q has empty identity", name)
}
}
}
func TestIsYAMLFile(t *testing.T) {
tests := []struct {
name string
want bool
}{
{"test.yaml", true},
{"test.yml", true},
{"test.YAML", true},
{"test.YML", true},
{"test.json", false},
{"test.md", false},
{"test.txt", false},
{"yaml", false},
{"yaml.md", false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := isYAMLFile(tt.name); got != tt.want {
t.Errorf("isYAMLFile(%q) = %v, want %v", tt.name, got, tt.want)
}
})
}
}
func TestIsNotFoundError(t *testing.T) {
tests := []struct {
err error
want bool
}{
{nil, false},
{errors.New("HTTP 404: not found"), true},
{errors.New("HTTP 404"), true},
// Intentionally false: generic "not found" could mask auth/transport errors.
// Only explicit HTTP 404 responses should be treated as "directory doesn't exist".
{errors.New("something not found"), false},
{errors.New("HTTP 401: unauthorized"), false},
{errors.New("connection refused"), false},
}
for _, tt := range tests {
name := "nil"
if tt.err != nil {
name = tt.err.Error()
}
t.Run(name, func(t *testing.T) {
if got := isNotFoundError(tt.err); got != tt.want {
t.Errorf("isNotFoundError(%v) = %v, want %v", tt.err, got, tt.want)
}
})
}
}
-127
View File
@@ -1,127 +0,0 @@
#!/usr/bin/env bash
# check-deps.sh - Enforces the strict dependency allowlist from CONVENTIONS.md
# Exit 1 if any unapproved import is found.
#
# Requires: Bash 4+ (for associative arrays), Go toolchain
#
# The allowlist is parsed from CONVENTIONS.md to maintain a single source of truth.
# Enforces Scope column: "test only" packages cannot appear in non-test code.
set -euo pipefail
# Check bash version
if ((BASH_VERSINFO[0] < 4)); then
echo "❌ Bash 4+ required (found ${BASH_VERSION})"
echo " On macOS: brew install bash"
exit 1
fi
CONVENTIONS_FILE="${1:-CONVENTIONS.md}"
if [ ! -f "$CONVENTIONS_FILE" ]; then
echo "❌ CONVENTIONS.md not found"
exit 1
fi
# Parse approved packages from CONVENTIONS.md table using awk (POSIX-compatible)
# Format: | `package` | use case | scope |
declare -A ALLOWED_PROD=()
declare -A ALLOWED_TEST=()
while IFS= read -r line; do
# Use awk to extract package and scope from table row
pkg=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]*`|`[[:space:]]*$/, "", $2); print $2}')
scope=$(echo "$line" | awk -F'|' '{gsub(/^[[:space:]]+|[[:space:]]+$/, "", $4); print tolower($4)}')
if [ -n "$pkg" ] && [ "$pkg" != "Package" ] && [[ "$pkg" =~ ^[a-zA-Z] ]]; then
if [[ "$scope" == *"test"* ]]; then
ALLOWED_TEST["$pkg"]=1
else
ALLOWED_PROD["$pkg"]=1
fi
fi
done < <(grep '| `' "$CONVENTIONS_FILE" 2>/dev/null || true)
ALL_ALLOWED=("${!ALLOWED_PROD[@]}" "${!ALLOWED_TEST[@]}")
if [ ${#ALL_ALLOWED[@]} -eq 0 ]; then
echo "⚠️ No approved packages found in $CONVENTIONS_FILE"
echo " (This is fine if you want stdlib-only)"
fi
# Helper: check if import matches any package in an associative array (literal prefix, no glob)
matches_allowlist() {
local import="$1"
shift
local -n allowlist=$1
for allowed in "${!allowlist[@]}"; do
# Exact match
if [ "$import" = "$allowed" ]; then
return 0
fi
# Literal prefix match for subpackages: must match "pkg/" exactly
if [ "${import#"$allowed/"}" != "$import" ]; then
return 0
fi
done
return 1
}
# Get direct module dependencies from go.mod
DIRECT_IMPORTS=$(go list -m -f '{{if and (not .Indirect) (not .Main)}}{{.Path}}{{end}}' all 2>&1) || {
echo "❌ Failed to list dependencies: $DIRECT_IMPORTS"
exit 1
}
DIRECT_IMPORTS=$(echo "$DIRECT_IMPORTS" | grep -v '^$' || true)
if [ -z "$DIRECT_IMPORTS" ]; then
echo "✅ No external dependencies"
exit 0
fi
# Check ALL direct dependencies are in some allowlist
VIOLATIONS=""
while IFS= read -r import; do
[ -z "$import" ] && continue
if ! matches_allowlist "$import" ALLOWED_PROD && ! matches_allowlist "$import" ALLOWED_TEST; then
VIOLATIONS="${VIOLATIONS} - ${import} (not in allowlist)"$'\n'
fi
done <<< "$DIRECT_IMPORTS"
if [ -n "$VIOLATIONS" ]; then
echo "❌ UNAPPROVED DEPENDENCIES DETECTED"
echo ""
echo "The following imports are not in the allowlist:"
printf "%s" "$VIOLATIONS"
echo ""
echo "To add a dependency, update CONVENTIONS.md (requires Aaron's approval)"
exit 1
fi
# Enforce Scope: test-only packages must not appear in non-test code
# Get imports used by non-test code only (go list -deps without -test excludes test deps)
PROD_IMPORTS=$(go list -deps -f '{{if not .Standard}}{{.ImportPath}}{{end}}' ./... 2>/dev/null || true)
TEST_ONLY_IN_PROD=""
for test_pkg in "${!ALLOWED_TEST[@]}"; do
# Use word-boundary matching: exact match or followed by /
if echo "$PROD_IMPORTS" | grep -qE "^${test_pkg}(/|\$|$)"; then
TEST_ONLY_IN_PROD="${TEST_ONLY_IN_PROD} - ${test_pkg} (marked 'test only' but used in production code)"$'\n'
fi
done
if [ -n "$TEST_ONLY_IN_PROD" ]; then
echo "❌ TEST-ONLY DEPENDENCIES IN PRODUCTION CODE"
echo ""
printf "%s" "$TEST_ONLY_IN_PROD"
echo ""
echo "These packages are marked 'test only' in CONVENTIONS.md"
echo "and must only be imported from *_test.go files."
exit 1
fi
echo "✅ All dependencies are approved"
echo " Direct module deps: $(echo "$DIRECT_IMPORTS" | wc -l | tr -d ' ')"
echo " Production allowlist: ${#ALLOWED_PROD[@]}, Test-only allowlist: ${#ALLOWED_TEST[@]}"