Compare commits

6 Commits

Author SHA1 Message Date
aweiker 27d7fd3a93 address review feedback: portability, docs, and security hardening
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 35s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m6s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m30s
- Replace grep -qP with POSIX-compatible LC_ALL=C grep -q '[^[:print:]]'
- Inline auth headers directly in curl calls (eliminate AUTH_HEADER variable)
- Remove redundant sys.exit(1) from Python for-else (shell empty-check suffices)
- Update top-of-file comment to match actual detection mechanism
- Remove -L from Gitea download curl calls to prevent auth header forwarding
  on potential redirects (defense-in-depth)
2026-05-14 05:04:14 +00:00
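The control-character check from the first bullet can be approximated outside the shell. This is a minimal sketch, not code from the action: `has_non_printable` is a hypothetical helper, and it assumes the C-locale meaning of `[:print:]` (bytes 0x20 through 0x7E). One difference is worth noting: grep is line-oriented, so an embedded newline acts as a line separator there and is never matched by the bracket expression, whereas this sketch flags it.

```python
def has_non_printable(token: str) -> bool:
    """Return True if any byte of the token is outside printable ASCII.

    Mirrors the intent of `LC_ALL=C grep -q '[^[:print:]]'`, assuming the
    C locale where [:print:] covers bytes 0x20 through 0x7E. Unlike grep,
    this also treats a newline as a non-printable byte.
    """
    return any(b < 0x20 or b > 0x7E for b in token.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    for sample in ("abcDEF123_token", "bad\rtoken", "bad\ntoken"):
        print(repr(sample), has_non_printable(sample))
```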
claw 220f6e7369 fix(action): address review findings - validation hardening and cleanup
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 16s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 38s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 59s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m14s
Addresses findings from reviews #3655 (sonnet), #3657 (security), #3658 (gpt):

- Add set -euo pipefail to both script steps for fail-fast behavior
- Remove redundant newline check ([:space:] already covers it)
- Simplify VERSION regex: drop the \n\r escapes, which are not portable in POSIX ERE
- Add ACTION_TOKEN control character validation (defense-in-depth)
- Anchor checksum grep to exact filename match (prevent substring collision)
- Add ::notice:: when falling back to default ACTION_REPO
- Translate Chinese comments to English for consistency
- Add comment linking GITHUB_API_URL usage back to VCS detection
2026-05-13 21:54:08 -07:00
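The "anchor checksum grep to exact filename match" item can be illustrated as follows. This is a sketch under assumptions: `expected_checksum` is a hypothetical helper, and the checksums file is assumed to use the usual `sha256sum` layout (hex digest, whitespace, optional `*`, filename). It shows why an unanchored substring match could pick up a neighbouring entry such as a `.sig` file.

```python
import re
from typing import Optional


def expected_checksum(checksums_text: str, filename: str) -> Optional[str]:
    """Return the digest for exactly `filename`, or None if it is absent.

    The pattern is anchored at both ends, so an entry whose name merely
    contains `filename` (for example `<filename>.sig`) does not match.
    """
    pattern = re.compile(r"^[0-9a-f]+[ \t]+\*?" + re.escape(filename) + r"$")
    for line in checksums_text.splitlines():
        if pattern.match(line):
            return line.split()[0]
    return None


if __name__ == "__main__":
    # Hypothetical checksums.txt content with shortened digests.
    sample = "aa11  review-bot-linux-amd64.sig\nbb22  review-bot-linux-amd64\n"
    print(expected_checksum(sample, "review-bot-linux-amd64"))
```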
claw e709956d0b fix(action): use REST API for GitHub asset downloads, enforce trusted GITEA_URL
PR Ready Gate / clear-labels (pull_request) Successful in 1s
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 34s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m21s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m42s
Addresses gpt-review-bot findings on PR #121:

MAJOR #1: The 'Run review' step set GITEA_URL from inputs.gitea-url which
could exfiltrate the reviewer token to an attacker-controlled host on
GitHub/GHES. Now uses steps.version.outputs.server_url which enforces
VCS-type-aware trust (github.server_url on GitHub, validated input on Gitea).

MAJOR #2: Private release asset downloads on GitHub/GHES used web URLs
({server}/.../releases/download/{tag}/{asset}) which redirect to S3 and
don't support Authorization headers for private repos. Now uses the GitHub
REST API: fetches release metadata by tag, extracts asset IDs, and downloads
via /repos/{owner}/{repo}/releases/assets/{id} with Accept: octet-stream.
Gitea path retains direct URL downloads (which work correctly there).
2026-05-13 21:38:49 -07:00
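The asset lookup described under MAJOR #2 can be sketched like this. The helper names and the sample payload are hypothetical; the only details taken from the commit are the `assets` array with `name` and `id` fields and the `/repos/{owner}/{repo}/releases/assets/{id}` path. The download itself would additionally need the `Accept: application/octet-stream` header, which this sketch does not send.

```python
import json
from typing import Optional


def find_asset_id(release_json: str, name: str) -> Optional[int]:
    """Return the id of the release asset called `name`, or None."""
    assets = json.loads(release_json).get("assets", [])
    for asset in assets:
        if asset.get("name") == name:
            return asset["id"]
    return None


def asset_download_url(api_url: str, repo: str, asset_id: int) -> str:
    """Build the REST API URL for one asset (no request is made here)."""
    return f"{api_url.rstrip('/')}/repos/{repo}/releases/assets/{asset_id}"


if __name__ == "__main__":
    # Hypothetical release metadata; real responses carry many more fields.
    payload = json.dumps({"assets": [
        {"id": 101, "name": "checksums.txt"},
        {"id": 102, "name": "review-bot-linux-amd64"},
    ]})
    asset_id = find_asset_id(payload, "review-bot-linux-amd64")
    print(asset_download_url("https://api.github.com", "rodin/review-bot", asset_id))
```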
claw 93d89ba662 fix(action): address security review - prevent token exfiltration and add input validation
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 22s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 33s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m14s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m58s
Security fixes:
- On GitHub/GHES (VCS_TYPE=github), inputs.gitea-url is now completely
  ignored. API calls use github.api_url; downloads use github.server_url.
  Tokens are never sent to user-supplied URLs.
- Replace action_token step output with masked GITHUB_ENV variable to
  prevent token leakage in debug logs.
- Validate action-repo against owner/repo pattern to prevent path traversal.
- Validate SERVER_URL in Gitea path: require https:// scheme, reject
  whitespace and newlines.
- Strengthen VERSION validation: block slashes and whitespace in addition
  to newlines.
- Add integrity check in Install step: verify SERVER_URL matches
  github.server_url on GitHub runners.

Addresses findings from security-review-bot on PR #121.
Deferred: full IP-level SSRF defense (see #123).
2026-05-13 21:29:15 -07:00
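The three input validations listed above (action-repo, SERVER_URL, VERSION) can be expressed compactly. This is an illustrative sketch with hypothetical function names, not the action's code; the patterns are assumed to correspond to the shell checks in the diff further down.

```python
import re

# owner/repo: letters, digits, dot, underscore and hyphen on each side.
_OWNER_REPO = re.compile(r"[A-Za-z0-9._-]+/[A-Za-z0-9._-]+")
# https scheme followed by at least one non-whitespace character.
_HTTPS_URL = re.compile(r"https://\S+")
# Characters that must not appear in a version tag.
_BAD_VERSION_CHARS = re.compile(r"[/\s]")


def valid_action_repo(value: str) -> bool:
    return _OWNER_REPO.fullmatch(value) is not None


def valid_server_url(value: str) -> bool:
    return _HTTPS_URL.fullmatch(value) is not None


def valid_version(value: str) -> bool:
    return _BAD_VERSION_CHARS.search(value) is None


if __name__ == "__main__":
    print(valid_action_repo("rodin/review-bot"))
    print(valid_server_url("http://gitea.example.com"))
    print(valid_version("v1.2.3"))
```

Because the character class includes `.`, a value such as `../..` still satisfies the owner/repo pattern, so a stricter rule may be worth considering if dot-only segments are a concern.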
claw 646497de68 fix(action): address review feedback on VCS host detection and API URL
PR Ready Gate / clear-labels (pull_request) Successful in 2s
CI / test (pull_request) Successful in 17s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 31s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m20s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 1m23s
- Use github.api_url context for VCS type detection instead of hostname
  grep (fixes brittle detection that could misclassify GHES instances)
- Use github.api_url directly for GitHub API calls (correct for both
  github.com and GHES, fixes incorrect /api/v3/ assumption)
- Add action-repo-token input with smart defaults (github.token on
  GitHub, reviewer-token on Gitea) for private repo support
- Add curl --connect-timeout and --max-time to all HTTP requests
- Add checksum verification caveat noting same-server limitation
- Add newline validation on VERSION before writing to GITHUB_OUTPUT
- Remove incorrect comment about github.com using /api/v3/
2026-05-13 21:12:33 -07:00
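The detection rule in the first two bullets reduces to a single branch on whether `github.api_url` is present. The sketch below uses a hypothetical function name and example hosts; the query parameters (`per_page=1` on GitHub, `limit=1` on Gitea) and the `/api/v1` prefix are taken from the diff further down.

```python
from typing import Tuple


def latest_release_url(github_api_url: str, server_url: str, repo: str) -> Tuple[str, str]:
    """Return (vcs_type, releases URL) for resolving the latest release.

    A non-empty github_api_url means a GitHub.com or GHES runner, and the
    value is used as-is, so no /api/v3/ suffix is assumed. An empty value
    is treated as Gitea, where the API lives under <server>/api/v1.
    """
    if github_api_url:
        base = github_api_url.rstrip("/")
        return "github", f"{base}/repos/{repo}/releases?per_page=1"
    base = server_url.rstrip("/")
    return "gitea", f"{base}/api/v1/repos/{repo}/releases?limit=1"


if __name__ == "__main__":
    # Hypothetical hosts, for illustration only.
    print(latest_release_url("https://ghes.example.com/api/v3", "https://ghes.example.com", "rodin/review-bot"))
    print(latest_release_url("", "https://gitea.example.com/", "rodin/review-bot"))
```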
claw d4d34aa029 fix(action): detect VCS host type for version resolution and binary download
CI / test (pull_request) Successful in 18s
CI / review (anthropic--claude-4.6-sonnet, sonnet, SONNET_REVIEW_TOKEN) (pull_request) Successful in 28s
CI / review (gpt-5, gpt, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m54s
CI / review (gpt-5, security, ., rodin/security-patterns, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 2m11s
The composite action hardcoded Gitea-specific API paths and repo references
that broke when running on GitHub Enterprise Server.

Changes:
- Add 'action-repo' input to specify the repo hosting review-bot releases,
  separate from the 'repo' input (which is the target repo being reviewed)
- Auto-detect action repo from github.action_repository context variable,
  falling back to 'rodin/review-bot' for backward compatibility
- Detect VCS host type (Gitea vs GitHub) from server URL using hostname
  heuristic (URLs containing 'github' use /api/v3/, others use /api/v1/)
- Pass computed server_url and action_repo between steps via GITHUB_OUTPUT
  to avoid redundant computation
- Update descriptions to be host-agnostic

Fixes #120
2026-05-13 21:02:05 -07:00
+243 -19
@@ -1,17 +1,37 @@
-# This composite action is designed for Gitea Actions runners.
-# Gitea Actions supports GitHub Actions syntax including $GITHUB_OUTPUT,
-# actions/cache, and actions/checkout.
+# This composite action supports both Gitea Actions and GitHub Actions runners.
+# It detects the VCS host type by checking whether github.api_url is set
+# (present on GitHub.com and GHES runners, absent on Gitea runners) and uses
+# the appropriate releases API for version resolution and binary download
+# (REST API on GitHub, direct URLs on Gitea).
+#
+# Security notes:
+# - On GitHub/GHES (VCS_TYPE=github), inputs.gitea-url is IGNORED to prevent
+# token exfiltration. API calls use github.api_url; downloads use
+# github.server_url. Tokens are never sent to user-supplied URLs.
+# - On Gitea (VCS_TYPE=gitea), inputs.gitea-url is validated (https scheme,
+# no whitespace/newlines) before use.
+# - action-repo is validated against owner/repo pattern.
+# - Tokens are passed via masked environment variables, not step outputs.
+#
 # Requirements: python3, sha256sum, curl (all present on ubuntu-* runners).
 name: 'AI Code Review'
 description: 'Run AI-powered code review on a pull request using review-bot'
 inputs:
   gitea-url:
-    description: 'Gitea instance URL (defaults to server_url)'
+    description: 'Gitea instance URL (only used on Gitea runners; ignored on GitHub/GHES). Defaults to server_url.'
     required: false
     default: ''
   repo:
-    description: 'Repository (owner/name, defaults to current)'
+    description: 'Repository to review (owner/name, defaults to current)'
+    required: false
+    default: ''
+  action-repo:
+    description: 'Repository hosting review-bot releases (owner/name). Defaults to github.action_repository or rodin/review-bot.'
+    required: false
+    default: ''
+  action-repo-token:
+    description: 'Token for downloading release assets from action-repo (defaults to github.token on GitHub, reviewer-token on Gitea). Required for private repos.'
     required: false
     default: ''
   pr-number:
@@ -19,7 +39,7 @@ inputs:
     required: false
     default: ''
   reviewer-token:
-    description: 'Gitea token for posting the review'
+    description: 'Token for posting the review'
     required: true
   reviewer-name:
     description: 'Display name for the reviewer'
@@ -112,19 +132,131 @@ runs:
       id: version
       shell: bash
       run: |
-        GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
-        REPO="${{ inputs.repo || 'rodin/review-bot' }}"
+        set -euo pipefail
+        # --- Input Validation ---
+        # Determine the repo hosting review-bot releases (not the repo being reviewed)
+        ACTION_REPO="${{ inputs.action-repo }}"
+        if [ -z "$ACTION_REPO" ]; then
+          # github.action_repository is the repo containing the running action
+          ACTION_REPO="${{ github.action_repository }}"
+        fi
+        if [ -z "$ACTION_REPO" ]; then
+          # Final fallback for Gitea (which may not set action_repository)
+          ACTION_REPO="rodin/review-bot"
+          echo "::notice::action-repo not specified and github.action_repository is empty; falling back to rodin/review-bot"
+        fi
+        # Validate ACTION_REPO matches owner/repo pattern (prevent path traversal)
+        if ! printf '%s' "$ACTION_REPO" | grep -qE '^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$'; then
+          echo "Error: action-repo '${ACTION_REPO}' does not match expected owner/repo format" >&2
+          exit 1
+        fi
+        # Detect VCS host type using github.api_url context.
+        # github.api_url is set on GitHub.com (https://api.github.com) and GHES
+        # (https://<host>/api/v3). It is empty/unset on Gitea Actions runners.
+        GITHUB_API_URL="${{ github.api_url }}"
+        if [ -n "$GITHUB_API_URL" ]; then
+          VCS_TYPE="github"
+        else
+          VCS_TYPE="gitea"
+        fi
+        # Determine SERVER_URL based on VCS type.
+        # SECURITY: On GitHub/GHES, ALWAYS use github.server_url — never trust
+        # inputs.gitea-url to prevent token exfiltration to attacker-controlled hosts.
+        if [ "$VCS_TYPE" = "github" ]; then
+          SERVER_URL="${{ github.server_url }}"
+          if [ -n "${{ inputs.gitea-url }}" ]; then
+            echo "::warning::inputs.gitea-url is ignored on GitHub/GHES runners (VCS_TYPE=github). Using github.server_url instead."
+          fi
+        else
+          SERVER_URL="${{ inputs.gitea-url || github.server_url }}"
+        fi
+        # Strip trailing slash if present
+        SERVER_URL="${SERVER_URL%/}"
+        # Validate SERVER_URL for Gitea path: must be https, no whitespace/newlines.
+        # The [^[:space:]] class already rejects newlines, so no separate newline check needed.
+        if [ "$VCS_TYPE" = "gitea" ]; then
+          if ! printf '%s' "$SERVER_URL" | grep -qE '^https://[^[:space:]]+$'; then
+            echo "Error: SERVER_URL '${SERVER_URL}' must be an https:// URL with no whitespace" >&2
+            exit 1
+          fi
+        fi
+        # Determine auth token for release API requests
+        ACTION_TOKEN="${{ inputs.action-repo-token }}"
+        if [ -z "$ACTION_TOKEN" ]; then
+          if [ "$VCS_TYPE" = "github" ]; then
+            ACTION_TOKEN="${{ github.token }}"
+          else
+            ACTION_TOKEN="${{ inputs.reviewer-token }}"
+          fi
+        fi
+        # Validate token contains no control characters (defense-in-depth against header injection)
+        if [ -n "$ACTION_TOKEN" ]; then
+          if printf '%s' "$ACTION_TOKEN" | LC_ALL=C grep -q '[^[:print:]]'; then
+            echo "Error: ACTION_TOKEN contains control characters" >&2
+            exit 1
+          fi
+        fi
         if [ "${{ inputs.version }}" = "latest" ]; then
-          VERSION=$(curl -sSf "${GITEA_URL}/api/v1/repos/${REPO}/releases?limit=1" \
-            | python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
+          if [ "$VCS_TYPE" = "github" ]; then
+            # SECURITY: Use github.api_url which is a trusted platform-provided value.
+            # Never construct API URLs from user-supplied inputs on GitHub.
+            API_URL="${GITHUB_API_URL}/repos/${ACTION_REPO}/releases?per_page=1"
+          else
+            # Gitea API — SERVER_URL was validated above
+            API_URL="${SERVER_URL}/api/v1/repos/${ACTION_REPO}/releases?limit=1"
+          fi
+          # Fetch latest version with inline auth header (no intermediate variable)
+          if [ -n "$ACTION_TOKEN" ]; then
+            if [ "$VCS_TYPE" = "github" ]; then
+              VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
+                -H "Authorization: Bearer ${ACTION_TOKEN}" "$API_URL" \
+                | python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
+            else
+              VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 \
+                -H "Authorization: token ${ACTION_TOKEN}" "$API_URL" \
+                | python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
+            fi
+          else
+            VERSION=$(curl -sSf --connect-timeout 10 --max-time 30 "$API_URL" \
+              | python3 -c "import sys, json; releases = json.load(sys.stdin); print(releases[0]['tag_name'] if releases else '')")
+          fi
           if [ -z "$VERSION" ]; then
-            echo "Failed to determine latest version" >&2
+            echo "Failed to determine latest version from ${API_URL}" >&2
             exit 1
           fi
         else
           VERSION="${{ inputs.version }}"
         fi
+        # Validate VERSION: no slashes or whitespace (prevent path traversal).
+        # [:space:] includes newlines and carriage returns in POSIX.
+        if printf '%s' "$VERSION" | grep -qE '[/[:space:]]'; then
+          echo "Error: VERSION '${VERSION}' contains invalid characters (newline, slash, or whitespace)" >&2
+          exit 1
+        fi
         echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
+        echo "action_repo=${ACTION_REPO}" >> "$GITHUB_OUTPUT"
+        echo "server_url=${SERVER_URL}" >> "$GITHUB_OUTPUT"
+        echo "vcs_type=${VCS_TYPE}" >> "$GITHUB_OUTPUT"
+        # SECURITY: Pass token via masked environment variable instead of step output.
+        # Step outputs can leak in debug logs; GITHUB_ENV with masking is safer.
+        if [ -n "$ACTION_TOKEN" ]; then
+          echo "::add-mask::${ACTION_TOKEN}"
+          echo "ACTION_TOKEN=${ACTION_TOKEN}" >> "$GITHUB_ENV"
+        fi
     - name: Cache review-bot binary
       id: cache
@@ -137,19 +269,111 @@ runs:
       if: steps.cache.outputs.cache-hit != 'true'
       shell: bash
       run: |
-        GITEA_URL="${{ inputs.gitea-url || github.server_url }}"
-        REPO="${{ inputs.repo || 'rodin/review-bot' }}"
+        set -euo pipefail
+        SERVER_URL="${{ steps.version.outputs.server_url }}"
+        ACTION_REPO="${{ steps.version.outputs.action_repo }}"
         VERSION="${{ steps.version.outputs.version }}"
+        VCS_TYPE="${{ steps.version.outputs.vcs_type }}"
+        # Read token from masked environment variable (set in Determine version step)
+        # Falls back to empty if not set (public repos don't need auth)
+        ACTION_TOKEN="${ACTION_TOKEN:-}"
         BINARY="review-bot-linux-amd64"
-        curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/${BINARY}" \
-          -o "${{ runner.temp }}/review-bot"
-        curl -sSfL "${GITEA_URL}/${REPO}/releases/download/${VERSION}/checksums.txt" \
-          -o "${{ runner.temp }}/checksums.txt"
+        if [ "$VCS_TYPE" = "github" ]; then
+          # GitHub/GHES: Use REST API for release asset downloads.
+          # Web release URLs ({server}/.../releases/download/{tag}/{asset}) redirect
+          # to S3 and don't reliably support Authorization headers for private repos.
+          # The REST API endpoint with Accept: application/octet-stream is required.
+          # GITHUB_API_URL: trusted platform value, same as detected in "Determine version" step.
+          GITHUB_API_URL="${{ github.api_url }}"
+          if [ -n "$ACTION_TOKEN" ]; then
+            RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
+              -H "Authorization: Bearer ${ACTION_TOKEN}" \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
+          else
+            RELEASE_JSON=$(curl -sSf --connect-timeout 10 --max-time 30 \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/tags/${VERSION}")
+          fi
+          # Extract asset IDs for binary and checksums
+          BINARY_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "
+        import sys, json
+        assets = json.load(sys.stdin).get('assets', [])
+        for a in assets:
+            if a['name'] == '${BINARY}':
+                print(a['id'])
+                break
+        ")
+          if [ -z "$BINARY_ASSET_ID" ]; then
+            echo "Error: could not find asset '${BINARY}' in release ${VERSION}" >&2
+            exit 1
+          fi
+          CHECKSUMS_ASSET_ID=$(printf '%s' "$RELEASE_JSON" | python3 -c "
+        import sys, json
+        assets = json.load(sys.stdin).get('assets', [])
+        for a in assets:
+            if a['name'] == 'checksums.txt':
+                print(a['id'])
+                break
+        ")
+          if [ -z "$CHECKSUMS_ASSET_ID" ]; then
+            echo "Error: could not find asset 'checksums.txt' in release ${VERSION}" >&2
+            exit 1
+          fi
+          # Download assets via REST API with Accept: application/octet-stream
+          if [ -n "$ACTION_TOKEN" ]; then
+            curl -sSfL --connect-timeout 10 --max-time 120 \
+              -H "Authorization: Bearer ${ACTION_TOKEN}" \
+              -H "Accept: application/octet-stream" \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
+              -o "${{ runner.temp }}/review-bot"
+            curl -sSfL --connect-timeout 10 --max-time 30 \
+              -H "Authorization: Bearer ${ACTION_TOKEN}" \
+              -H "Accept: application/octet-stream" \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
+              -o "${{ runner.temp }}/checksums.txt"
+          else
+            curl -sSfL --connect-timeout 10 --max-time 120 \
+              -H "Accept: application/octet-stream" \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${BINARY_ASSET_ID}" \
+              -o "${{ runner.temp }}/review-bot"
+            curl -sSfL --connect-timeout 10 --max-time 30 \
+              -H "Accept: application/octet-stream" \
+              "${GITHUB_API_URL}/repos/${ACTION_REPO}/releases/assets/${CHECKSUMS_ASSET_ID}" \
+              -o "${{ runner.temp }}/checksums.txt"
+          fi
+        else
+          # Gitea: Direct download via web release URLs (Gitea serves assets
+          # directly without redirects — no -L needed).
+          # SECURITY: Omitting -L prevents forwarding Authorization header to
+          # unexpected hosts if Gitea ever introduces CDN redirects.
+          DOWNLOAD_URL="${SERVER_URL}/${ACTION_REPO}/releases/download/${VERSION}"
+          if [ -n "$ACTION_TOKEN" ]; then
+            curl -sSf --connect-timeout 10 --max-time 120 \
+              -H "Authorization: token ${ACTION_TOKEN}" \
+              "${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
+            curl -sSf --connect-timeout 10 --max-time 30 \
+              -H "Authorization: token ${ACTION_TOKEN}" \
+              "${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
+          else
+            curl -sSf --connect-timeout 10 --max-time 120 \
+              "${DOWNLOAD_URL}/${BINARY}" -o "${{ runner.temp }}/review-bot"
+            curl -sSf --connect-timeout 10 --max-time 30 \
+              "${DOWNLOAD_URL}/checksums.txt" -o "${{ runner.temp }}/checksums.txt"
+          fi
+        fi
         # Verify SHA-256 checksum
+        # NOTE: This verifies integrity (download wasn't corrupted) but not
+        # authenticity — both binary and checksums come from the same server.
+        # For stronger guarantees, consider GPG signature verification.
         cd "${{ runner.temp }}"
-        EXPECTED=$(grep "${BINARY}" checksums.txt | awk '{print $1}')
+        EXPECTED=$(grep -E "^[0-9a-f]+[[:space:]]+\*?${BINARY}$" checksums.txt | awk '{print $1}')
         ACTUAL=$(sha256sum review-bot | awk '{print $1}')
         if [ -z "$EXPECTED" ]; then
@@ -169,7 +393,7 @@ runs:
     - name: Run review
       shell: bash
       env:
-        GITEA_URL: ${{ inputs.gitea-url || github.server_url }}
+        GITEA_URL: ${{ steps.version.outputs.server_url }}
         GITEA_REPO: ${{ inputs.repo || github.repository }}
         PR_NUMBER: ${{ inputs.pr-number || github.event.pull_request.number }}
         REVIEWER_TOKEN: ${{ inputs.reviewer-token }}