CI: gate heavy reviews on self-review (Doc consistency); comment-trigger; disable TTL heavy reviews #159

Open
rodin wants to merge 8 commits from ci-selfreview-gate into main
Showing only changes of commit 3d0c84fa6e - Show all commits
+91 -103
View File
@@ -1,71 +1,26 @@
name: CI
true:
on:
push:
branches:
- main
branches: [main]
pull_request:
types:
- opened
- synchronize
types: [opened, synchronize]
env:
SELF_REVIEW_TTL_MIN: '45'
jobs:
test:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- run: go test ./...
- run: go vet ./...
- run: go build -o review-bot ./cmd/review-bot
review:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request' && needs.review-gate.outputs.allow_review
== 'true'
needs:
- test
- review-gate
strategy:
matrix:
include:
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
model: anthropic--claude-4.6-sonnet
- name: gpt
token_secret: GPT_REVIEW_TOKEN
model: gpt-5
- name: security
token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: .
system_prompt_file: SECURITY_REVIEW.md
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- run: go build -o review-bot ./cmd/review-bot
- name: Run ${{ matrix.name }} review
env:
VCS_URL: ${{ github.server_url }}
GITEA_REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }}
LLM_PROVIDER: aicore
LLM_MODEL: ${{ matrix.model }}
AICORE_CLIENT_ID: ${{ secrets.AICORE_CLIENT_ID }}
AICORE_CLIENT_SECRET: ${{ secrets.AICORE_CLIENT_SECRET }}
AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }}
AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: CONVENTIONS.md
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
LLM_TIMEOUT: '600'
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot
- uses: actions/checkout@v4
Review

[MAJOR] The test job is restricted to only pull_request events via if: github.event_name == 'pull_request', so tests no longer run on push to main even though the workflow still triggers on push. This likely degrades CI coverage and branch protection; prefer skipping only issue_comment runs or include push explicitly (e.g., if: github.event_name != 'issue_comment' or if: github.event_name == 'pull_request' || github.event_name == 'push').

**[MAJOR]** The test job is restricted to only pull_request events via `if: github.event_name == 'pull_request'`, so tests no longer run on push to main even though the workflow still triggers on push. This likely degrades CI coverage and branch protection; prefer skipping only `issue_comment` runs or include push explicitly (e.g., `if: github.event_name != 'issue_comment'` or `if: github.event_name == 'pull_request' || github.event_name == 'push'`).
- uses: actions/setup-go@v5
Review

[MINOR] The test job is restricted to run only on pull_request events (if: github.event_name == 'pull_request'), so tests no longer run on push to main despite the workflow being triggered for pushes. If this was not intended, adjust the condition (e.g., exclude only issue_comment) to keep tests on push events.

**[MINOR]** The test job is restricted to run only on pull_request events (if: github.event_name == 'pull_request'), so tests no longer run on push to main despite the workflow being triggered for pushes. If this was not intended, adjust the condition (e.g., exclude only issue_comment) to keep tests on push events.
with:
Review

[NIT] The test job installs jq even though it is unused in that job (jq is only needed in review-gate and review). Removing this step will speed up runs slightly.

**[NIT]** The test job installs jq even though it is unused in that job (jq is only needed in review-gate and review). Removing this step will speed up runs slightly.
go-version: '1.26'
Review

[MINOR] The test job installs jq but does not use it in any subsequent step, adding unnecessary time to test runs. Remove if not needed.

**[MINOR]** The test job installs jq but does not use it in any subsequent step, adding unnecessary time to test runs. Remove if not needed.
- run: go test ./...
- run: go vet ./...
- run: go build -o review-bot ./cmd/review-bot
review-gate:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request'
@@ -73,50 +28,83 @@ jobs:
allow_review: ${{ steps.gate.outputs.allow_review }}
Review

[MINOR] issue_comment events will trigger the review-gate job for any PR comment, allowing an attacker to spam comments to repeatedly invoke the job and consume runner resources (DoS). Although heavy reviewers won't run without a valid self-review, the gate still performs package installs and API calls on each comment.

**[MINOR]** issue_comment events will trigger the review-gate job for any PR comment, allowing an attacker to spam comments to repeatedly invoke the job and consume runner resources (DoS). Although heavy reviewers won't run without a valid self-review, the gate still performs package installs and API calls on each comment.
reason: ${{ steps.gate.outputs.reason }}
steps:
Review

[MINOR] The test job now installs jq but never uses it. jq is only needed in the review-gate and review jobs. This adds ~5s to every test run for no reason.

**[MINOR]** The `test` job now installs `jq` but never uses it. `jq` is only needed in the `review-gate` and `review` jobs. This adds ~5s to every test run for no reason.
- name: Check self-review gate
id: gate
env:
GITEA_TOKEN: ${{ secrets.RODIN_TOKEN }}
run: 'REPO=${{ github.repository }}
- name: Check self-review gate
security-review-bot marked this conversation as resolved
Review

[MINOR] The review-gate job is triggered on issue_comment events for PRs, which is intended. For defense-in-depth, consider explicitly setting permissions: in the workflow to least-privilege (e.g., read-only) to reduce blast radius for the default token.

**[MINOR]** The review-gate job is triggered on `issue_comment` events for PRs, which is intended. For defense-in-depth, consider explicitly setting `permissions:` in the workflow to least-privilege (e.g., read-only) to reduce blast radius for the default token.
Review

[MINOR] Condition uses (github.event.issue.pull_request) directly. For clarity and future-proofing, prefer an explicit null check (e.g., github.event.issue.pull_request != null) to avoid truthiness ambiguity.

**[MINOR]** Condition uses (github.event.issue.pull_request) directly. For clarity and future-proofing, prefer an explicit null check (e.g., github.event.issue.pull_request != null) to avoid truthiness ambiguity.
id: gate
env:
GITEA_TOKEN: ${{ secrets.RODIN_TOKEN }}
run: |
set -e
Review

[NIT] The REPO=${{ github.repository }} and API=... lines use unquoted expression substitution without ${} shell quoting. If the repository name ever contains spaces (unlikely but possible), this would break. Prefer REPO="${{ github.repository }}" style consistently.

**[NIT]** The `REPO=${{ github.repository }}` and `API=...` lines use unquoted expression substitution without `${}` shell quoting. If the repository name ever contains spaces (unlikely but possible), this would break. Prefer `REPO="${{ github.repository }}"` style consistently.
REPO=${{ github.repository }}
PR=${{ github.event.pull_request.number }}
SHA=${{ github.event.pull_request.head.sha }}
TTL_MIN=${SELF_REVIEW_TTL_MIN}
API="${{ github.server_url }}/api/v1"
PR=${{ github.event.pull_request.number }}
PR_JSON=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/pulls/$PR")
Review

[MINOR] The AGE_MIN variable is computed and ALLOW=true/REASON=ttl is set when TTL is exceeded, but allow_review=true with reason=ttl does NOT trigger the review job (the review job gates on reason == 'self-review'). This is correct per the stated intent, but the allow_review output is never actually consumed by any job — only reason is checked. The allow_review output is dead output and may cause future confusion. Consider removing it or documenting its lack of use.

**[MINOR]** The AGE_MIN variable is computed and ALLOW=true/REASON=ttl is set when TTL is exceeded, but `allow_review=true` with `reason=ttl` does NOT trigger the review job (the review job gates on `reason == 'self-review'`). This is correct per the stated intent, but the `allow_review` output is never actually consumed by any job — only `reason` is checked. The `allow_review` output is dead output and may cause future confusion. Consider removing it or documenting its lack of use.
Review

[MINOR] The review-gate script uses set -e but not -u or -o pipefail. Adding set -euo pipefail would harden error handling and ensure failures in pipelines or unset variables don’t lead to unintended behavior (fail closed).

**[MINOR]** The review-gate script uses `set -e` but not `-u` or `-o pipefail`. Adding `set -euo pipefail` would harden error handling and ensure failures in pipelines or unset variables don’t lead to unintended behavior (fail closed).
Review

[MINOR] Shell robustness: consider using 'set -euo pipefail' and 'curl -fsS' to fail fast on unset vars and HTTP errors. Currently 'set -e' won't catch all error modes, and curl without -f treats HTTP 4xx/5xx as success.

**[MINOR]** Shell robustness: consider using 'set -euo pipefail' and 'curl -fsS' to fail fast on unset vars and HTTP errors. Currently 'set -e' won't catch all error modes, and curl without -f treats HTTP 4xx/5xx as success.
UPDATED_AT=$(echo "$PR_JSON" | jq -r .updated_at)
NOW=$(date -u +%s)
PR_TS=$(date -u -d "$UPDATED_AT" +%s)
AGE_MIN=$(( (NOW - PR_TS) / 60 ))
Review

[MINOR] The review-gate condition (github.event_name == 'issue_comment' && github.event.issue.pull_request) assumes object truthiness. For robustness, explicitly check existence (e.g., github.event.issue.pull_request != null) to avoid edge-case evaluation issues.

**[MINOR]** The review-gate condition `(github.event_name == 'issue_comment' && github.event.issue.pull_request)` assumes object truthiness. For robustness, explicitly check existence (e.g., `github.event.issue.pull_request != null`) to avoid edge-case evaluation issues.
SHA=${{ github.event.pull_request.head.sha }}
COMMENTS=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/issues/$PR/comments?limit=200")
HAS_SR=$(echo "$COMMENTS" | jq -r --arg sha "$SHA" '[.[] | select(.user.login=="rodin") | select(.body|contains("Self-review against "+$sha)) | select(.body|test("(?im)^###\\s+Doc consistency\\b"))] | length')
Review

[MINOR] Shell variables from GitHub expressions are interpolated directly without quoting: REPO=${{ github.repository }}. If the repository name contained spaces or special characters this would break. Prefer REPO='${{ github.repository }}' or export with quotes for robustness.

**[MINOR]** Shell variables from GitHub expressions are interpolated directly without quoting: `REPO=${{ github.repository }}`. If the repository name contained spaces or special characters this would break. Prefer `REPO='${{ github.repository }}'` or export with quotes for robustness.
TTL_MIN=${SELF_REVIEW_TTL_MIN}
if [ "$HAS_SR" -gt 0 ]; then
security-review-bot marked this conversation as resolved
Review

[MINOR] Curl invocations lack explicit failure and timeout settings. Consider adding --fail (or --fail-with-body), --max-time, and/or --retry to prevent hanging jobs and reduce DOS potential against CI if the API stalls.

**[MINOR]** Curl invocations lack explicit failure and timeout settings. Consider adding `--fail` (or `--fail-with-body`), `--max-time`, and/or `--retry` to prevent hanging jobs and reduce DOS potential against CI if the API stalls.
ALLOW=true
REASON=self-review
elif [ "$AGE_MIN" -ge "$TTL_MIN" ]; then
Review

[MINOR] The jq regex (?im)^###\s+Doc consistency\b is passed through multiple layers of shell quoting and JSON encoding before reaching jq. The \b word-boundary assertion and the multiline mode should work in jq's ONIG engine, but this is non-obvious and fragile. A simpler test("(?m)^### Doc consistency") without the \b would be more readable and equally correct for this use case.

**[MINOR]** The jq regex `(?im)^###\s+Doc consistency\b` is passed through multiple layers of shell quoting and JSON encoding before reaching jq. The `\b` word-boundary assertion and the multiline mode should work in jq's ONIG engine, but this is non-obvious and fragile. A simpler `test("(?m)^### Doc consistency")` without the `\b` would be more readable and equally correct for this use case.
ALLOW=true
REASON=ttl
else
ALLOW=false
REASON=missing
security-review-bot marked this conversation as resolved
Review

[MINOR] Same as above: the comments API request should include curl timeouts and --fail to avoid indefinite hangs and fail closed on HTTP errors.

**[MINOR]** Same as above: the comments API request should include curl timeouts and `--fail` to avoid indefinite hangs and fail closed on HTTP errors.
Review

[NIT] Comments API call limits results to 200. Very large PRs with more than 200 comments could miss earlier valid self-reviews. Consider pagination or filtering by most recent comments if this is a concern.

**[NIT]** Comments API call limits results to 200. Very large PRs with more than 200 comments could miss earlier valid self-reviews. Consider pagination or filtering by most recent comments if this is a concern.
fi
Review

[MAJOR] Self-review gate only considers comments from user.login == "rodin". This contradicts the description that "Self‑review comments" should trigger reviews and will ignore valid self-reviews authored by PR authors, preventing heavy reviews from starting (TTL does not trigger reviewers). Remove or broaden this filter (e.g., allow the PR author and/or a configurable allowlist) to match intent.

**[MAJOR]** Self-review gate only considers comments from user.login == "rodin". This contradicts the description that "Self‑review comments" should trigger reviews and will ignore valid self-reviews authored by PR authors, preventing heavy reviews from starting (TTL does not trigger reviewers). Remove or broaden this filter (e.g., allow the PR author and/or a configurable allowlist) to match intent.
Review

[MINOR] The review job if: condition uses string comparison needs.review-gate.outputs.reason == 'self-review' combined with && against event_name checks. In Gitea Actions, if review-gate is skipped (e.g. for push events), needs.review-gate.outputs.reason will be empty and the comparison will be false — meaning review is silently skipped for push-to-main without any indication. This is probably intentional but worth a comment.

**[MINOR]** The review job `if:` condition uses string comparison `needs.review-gate.outputs.reason == 'self-review'` combined with `&&` against event_name checks. In Gitea Actions, if `review-gate` is skipped (e.g. for push events), `needs.review-gate.outputs.reason` will be empty and the comparison will be false — meaning review is silently skipped for push-to-main without any indication. This is probably intentional but worth a comment.
API="${{ github.server_url }}/api/v1"
echo "allow_review=$ALLOW" >> $GITHUB_OUTPUT
echo "reason=$REASON" >> $GITHUB_OUTPUT
PR_JSON=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/pulls/$PR")
UPDATED_AT=$(echo "$PR_JSON" | jq -r .updated_at)
NOW=$(date -u +%s)
PR_TS=$(date -u -d "$UPDATED_AT" +%s)
AGE_MIN=$(( (NOW - PR_TS) / 60 ))
COMMENTS=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/issues/$PR/comments?limit=200")
HAS_SR=$(echo "$COMMENTS" | jq -r --arg sha "$SHA" "[.[] | select(.user.login=="rodin")
| select(.body|contains("Self-review against "+\$sha)) | select(.body|test("(?im)^###
\\s+Doc consistency\\b"))] | length")
if [ "$HAS_SR" -gt 0 ]; then ALLOW=true; REASON=self-review; elif [ "$AGE_MIN"
-ge "$TTL_MIN" ]; then ALLOW=true; REASON=ttl; else ALLOW=false; REASON=missing;
fi
echo "allow_review=$ALLOW" >> $GITHUB_OUTPUT
echo "reason=$REASON" >> $GITHUB_OUTPUT'
env:
SELF_REVIEW_TTL_MIN: '45'
'on':
pull_request:
types:
- opened
- synchronize
push:
branches:
- main
review:
Review

[MINOR] date -u -d "$UPDATED_AT" +%s uses GNU date syntax. This is fine on ubuntu-24.04, but the -d flag for parsing a date string is a GNU extension. Worth a comment so future maintainers know this is intentionally Linux-only.

**[MINOR]** `date -u -d "$UPDATED_AT" +%s` uses GNU date syntax. This is fine on ubuntu-24.04, but the `-d` flag for parsing a date string is a GNU extension. Worth a comment so future maintainers know this is intentionally Linux-only.
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request' && needs.review-gate.outputs.allow_review == 'true'
needs: [test, review-gate]
strategy:
matrix:
include:
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
Review

[MINOR] The TTL is calculated using PR_JSON.updated_at, which changes on every push/comment. A PR with a new comment 5 minutes ago will never pass the TTL gate regardless of how old the PR is. If the intent is 'PR was opened and nobody reviewed it for 45 minutes', created_at would be more appropriate. If the intent is 'last activity was 45 minutes ago', the current behavior is correct but differs from the PR description which says 'TTL keeps the gate green'.

**[MINOR]** The TTL is calculated using `PR_JSON.updated_at`, which changes on every push/comment. A PR with a new comment 5 minutes ago will never pass the TTL gate regardless of how old the PR is. If the intent is 'PR was opened and nobody reviewed it for 45 minutes', `created_at` would be more appropriate. If the intent is 'last activity was 45 minutes ago', the current behavior is correct but differs from the PR description which says 'TTL keeps the gate green'.
model: anthropic--claude-4.6-sonnet
- name: gpt
Review

[MINOR] The allow_review output is set but never consumed — review.if only checks reason, not allow_review. The output could be removed, or the review.if condition could be simplified to just needs.review-gate.outputs.reason == 'self-review' (which it already is). The unused output is harmless but misleading.

**[MINOR]** The `allow_review` output is set but never consumed — `review.if` only checks `reason`, not `allow_review`. The output could be removed, or the `review.if` condition could be simplified to just `needs.review-gate.outputs.reason == 'self-review'` (which it already is). The unused output is harmless but misleading.
token_secret: GPT_REVIEW_TOKEN
model: gpt-5
- name: security
token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5
patterns_repo: rodin/security-patterns
patterns_files: '.'
Review

[NIT] The review job dropped its dependency on test (changed from needs: test to needs: [review-gate]). This means the heavy review can now run even if the Go build/tests are failing. The previous design required passing tests before reviews ran. This may be intentional but is worth confirming.

**[NIT]** The `review` job dropped its dependency on `test` (changed from `needs: test` to `needs: [review-gate]`). This means the heavy review can now run even if the Go build/tests are failing. The previous design required passing tests before reviews ran. This may be intentional but is worth confirming.
system_prompt_file: SECURITY_REVIEW.md
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
with:
go-version: '1.26'
- run: go build -o review-bot ./cmd/review-bot
- name: Run ${{ matrix.name }} review
env:
VCS_URL: ${{ github.server_url }}
GITEA_REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }}
LLM_PROVIDER: aicore
LLM_MODEL: ${{ matrix.model }}
AICORE_CLIENT_ID: ${{ secrets.AICORE_CLIENT_ID }}
AICORE_CLIENT_SECRET: ${{ secrets.AICORE_CLIENT_SECRET }}
AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }}
AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
CONVENTIONS_FILE: 'CONVENTIONS.md'
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
LLM_TIMEOUT: '600'
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot