CI: gate heavy reviews on self-review (Doc consistency); comment-trigger; disable TTL heavy reviews #159

Closed
rodin wants to merge 8 commits from ci-selfreview-gate into main
Showing only changes of commit 0c6f46d279 - Show all commits
+60 -15
View File
@@ -1,11 +1,12 @@
name: CI name: CI
true:
on:
push: push:
branches: [main] branches:
- main
pull_request: pull_request:
types: [opened, synchronize] types:
- opened
- synchronize
jobs: jobs:
test: test:
runs-on: ubuntu-24.04 runs-on: ubuntu-24.04
2
@@ -17,15 +18,13 @@ jobs:
- run: go test ./... - run: go test ./...
Review

[NIT] The test job installs jq even though it is unused in that job (jq is only needed in review-gate and review). Removing this step will speed up runs slightly.

**[NIT]** The test job installs jq even though it is unused in that job (jq is only needed in review-gate and review). Removing this step will speed up runs slightly.
- run: go vet ./... - run: go vet ./...
Review

[MINOR] The test job installs jq but does not use it in any subsequent step, adding unnecessary time to test runs. Remove if not needed.

**[MINOR]** The test job installs jq but does not use it in any subsequent step, adding unnecessary time to test runs. Remove if not needed.
- run: go build -o review-bot ./cmd/review-bot - run: go build -o review-bot ./cmd/review-bot
# Self-review using native SAP AI Core provider
# Models must match SAP AI Core deployments
# Available models: gpt-5, anthropic--claude-4.6-sonnet, anthropic--claude-4.6-opus
# Removed gpt-4.1, gpt-5-mini, gpt-4.1-mini - not deployed on AI Core
review: review:
runs-on: ubuntu-24.04 runs-on: ubuntu-24.04
if: github.event_name == 'pull_request' if: github.event_name == 'pull_request' && needs.review-gate.outputs.allow_review
needs: test == 'true'
needs:
- test
- review-gate
strategy: strategy:
Review

[MINOR] issue_comment events will trigger the review-gate job for any PR comment, allowing an attacker to spam comments to repeatedly invoke the job and consume runner resources (DoS). Although heavy reviewers won't run without a valid self-review, the gate still performs package installs and API calls on each comment.

**[MINOR]** issue_comment events will trigger the review-gate job for any PR comment, allowing an attacker to spam comments to repeatedly invoke the job and consume runner resources (DoS). Although heavy reviewers won't run without a valid self-review, the gate still performs package installs and API calls on each comment.
matrix: matrix:
include: include:
Review

[MINOR] The test job now installs jq but never uses it. jq is only needed in the review-gate and review jobs. This adds ~5s to every test run for no reason.

**[MINOR]** The `test` job now installs `jq` but never uses it. `jq` is only needed in the `review-gate` and `review` jobs. This adds ~5s to every test run for no reason.
3
@@ -39,7 +38,7 @@ jobs:
token_secret: SECURITY_REVIEW_TOKEN token_secret: SECURITY_REVIEW_TOKEN
model: gpt-5 model: gpt-5
patterns_repo: rodin/security-patterns patterns_repo: rodin/security-patterns
patterns_files: "." patterns_files: .
system_prompt_file: SECURITY_REVIEW.md system_prompt_file: SECURITY_REVIEW.md
steps: steps:
Review

[MINOR] The AGE_MIN variable is computed and ALLOW=true/REASON=ttl is set when TTL is exceeded, but allow_review=true with reason=ttl does NOT trigger the review job (the review job gates on reason == 'self-review'). This is correct per the stated intent, but the allow_review output is never actually consumed by any job — only reason is checked. The allow_review output is dead output and may cause future confusion. Consider removing it or documenting its lack of use.

**[MINOR]** The AGE_MIN variable is computed and ALLOW=true/REASON=ttl is set when TTL is exceeded, but `allow_review=true` with `reason=ttl` does NOT trigger the review job (the review job gates on `reason == 'self-review'`). This is correct per the stated intent, but the `allow_review` output is never actually consumed by any job — only `reason` is checked. The `allow_review` output is dead output and may cause future confusion. Consider removing it or documenting its lack of use.
Review

[MINOR] The review-gate script uses set -e but not -u or -o pipefail. Adding set -euo pipefail would harden error handling and ensure failures in pipelines or unset variables don’t lead to unintended behavior (fail closed).

**[MINOR]** The review-gate script uses `set -e` but not `-u` or `-o pipefail`. Adding `set -euo pipefail` would harden error handling and ensure failures in pipelines or unset variables don’t lead to unintended behavior (fail closed).
Review

[MINOR] Shell robustness: consider using 'set -euo pipefail' and 'curl -fsS' to fail fast on unset vars and HTTP errors. Currently 'set -e' won't catch all error modes, and curl without -f treats HTTP 4xx/5xx as success.

**[MINOR]** Shell robustness: consider using 'set -euo pipefail' and 'curl -fsS' to fail fast on unset vars and HTTP errors. Currently 'set -e' won't catch all error modes, and curl without -f treats HTTP 4xx/5xx as success.
- uses: actions/checkout@v4 - uses: actions/checkout@v4
4
@@ -61,9 +60,55 @@ jobs:
AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }} AICORE_AUTH_URL: ${{ secrets.AICORE_AUTH_URL }}
security-review-bot marked this conversation as resolved
Review

[MINOR] Same as above: the comments API request should include curl timeouts and --fail to avoid indefinite hangs and fail closed on HTTP errors.

**[MINOR]** Same as above: the comments API request should include curl timeouts and `--fail` to avoid indefinite hangs and fail closed on HTTP errors.
Review

[NIT] Comments API call limits results to 200. Very large PRs with more than 200 comments could miss earlier valid self-reviews. Consider pagination or filtering by most recent comments if this is a concern.

**[NIT]** Comments API call limits results to 200. Very large PRs with more than 200 comments could miss earlier valid self-reviews. Consider pagination or filtering by most recent comments if this is a concern.
AICORE_API_URL: ${{ secrets.AICORE_API_URL }} AICORE_API_URL: ${{ secrets.AICORE_API_URL }}
Review

[MAJOR] Self-review gate only considers comments from user.login == "rodin". This contradicts the description that "Self‑review comments" should trigger reviews and will ignore valid self-reviews authored by PR authors, preventing heavy reviews from starting (TTL does not trigger reviewers). Remove or broaden this filter (e.g., allow the PR author and/or a configurable allowlist) to match intent.

**[MAJOR]** Self-review gate only considers comments from user.login == "rodin". This contradicts the description that "Self‑review comments" should trigger reviews and will ignore valid self-reviews authored by PR authors, preventing heavy reviews from starting (TTL does not trigger reviewers). Remove or broaden this filter (e.g., allow the PR author and/or a configurable allowlist) to match intent.
AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }} AICORE_RESOURCE_GROUP: ${{ secrets.AICORE_RESOURCE_GROUP }}
Review

[MINOR] The review job if: condition uses string comparison needs.review-gate.outputs.reason == 'self-review' combined with && against event_name checks. In Gitea Actions, if review-gate is skipped (e.g. for push events), needs.review-gate.outputs.reason will be empty and the comparison will be false — meaning review is silently skipped for push-to-main without any indication. This is probably intentional but worth a comment.

**[MINOR]** The review job `if:` condition uses string comparison `needs.review-gate.outputs.reason == 'self-review'` combined with `&&` against event_name checks. In Gitea Actions, if `review-gate` is skipped (e.g. for push events), `needs.review-gate.outputs.reason` will be empty and the comparison will be false — meaning review is silently skipped for push-to-main without any indication. This is probably intentional but worth a comment.
CONVENTIONS_FILE: "CONVENTIONS.md" CONVENTIONS_FILE: CONVENTIONS.md
PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }} PATTERNS_REPO: ${{ matrix.patterns_repo || 'rodin/go-patterns' }}
PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }} PATTERNS_FILES: ${{ matrix.patterns_files || 'README.md,patterns/' }}
LLM_TIMEOUT: "600" LLM_TIMEOUT: '600'
Review

[MINOR] date -u -d "$UPDATED_AT" +%s uses GNU date syntax. This is fine on ubuntu-24.04, but the -d flag for parsing a date string is a GNU extension. Worth a comment so future maintainers know this is intentionally Linux-only.

**[MINOR]** `date -u -d "$UPDATED_AT" +%s` uses GNU date syntax. This is fine on ubuntu-24.04, but the `-d` flag for parsing a date string is a GNU extension. Worth a comment so future maintainers know this is intentionally Linux-only.
SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }} SYSTEM_PROMPT_FILE: ${{ matrix.system_prompt_file }}
run: ./review-bot run: ./review-bot
review-gate:
runs-on: ubuntu-24.04
if: github.event_name == 'pull_request'
outputs:
allow_review: ${{ steps.gate.outputs.allow_review }}
reason: ${{ steps.gate.outputs.reason }}
Review

[MINOR] The TTL is calculated using PR_JSON.updated_at, which changes on every push/comment. A PR with a new comment 5 minutes ago will never pass the TTL gate regardless of how old the PR is. If the intent is 'PR was opened and nobody reviewed it for 45 minutes', created_at would be more appropriate. If the intent is 'last activity was 45 minutes ago', the current behavior is correct but differs from the PR description which says 'TTL keeps the gate green'.

**[MINOR]** The TTL is calculated using `PR_JSON.updated_at`, which changes on every push/comment. A PR with a new comment 5 minutes ago will never pass the TTL gate regardless of how old the PR is. If the intent is 'PR was opened and nobody reviewed it for 45 minutes', `created_at` would be more appropriate. If the intent is 'last activity was 45 minutes ago', the current behavior is correct but differs from the PR description which says 'TTL keeps the gate green'.
steps:
- name: Check self-review gate
Review

[MINOR] The allow_review output is set but never consumed — review.if only checks reason, not allow_review. The output could be removed, or the review.if condition could be simplified to just needs.review-gate.outputs.reason == 'self-review' (which it already is). The unused output is harmless but misleading.

**[MINOR]** The `allow_review` output is set but never consumed — `review.if` only checks `reason`, not `allow_review`. The output could be removed, or the `review.if` condition could be simplified to just `needs.review-gate.outputs.reason == 'self-review'` (which it already is). The unused output is harmless but misleading.
id: gate
env:
GITEA_TOKEN: ${{ secrets.RODIN_TOKEN }}
run: 'REPO=${{ github.repository }}
PR=${{ github.event.pull_request.number }}
Review

[NIT] The review job dropped its dependency on test (changed from needs: test to needs: [review-gate]). This means the heavy review can now run even if the Go build/tests are failing. The previous design required passing tests before reviews ran. This may be intentional but is worth confirming.

**[NIT]** The `review` job dropped its dependency on `test` (changed from `needs: test` to `needs: [review-gate]`). This means the heavy review can now run even if the Go build/tests are failing. The previous design required passing tests before reviews ran. This may be intentional but is worth confirming.
SHA=${{ github.event.pull_request.head.sha }}
TTL_MIN=${SELF_REVIEW_TTL_MIN}
API="${{ github.server_url }}/api/v1"
PR_JSON=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/pulls/$PR")
UPDATED_AT=$(echo "$PR_JSON" | jq -r .updated_at)
NOW=$(date -u +%s)
PR_TS=$(date -u -d "$UPDATED_AT" +%s)
AGE_MIN=$(( (NOW - PR_TS) / 60 ))
COMMENTS=$(curl -sS -H "Authorization: token $GITEA_TOKEN" "$API/repos/$REPO/issues/$PR/comments?limit=200")
HAS_SR=$(echo "$COMMENTS" | jq -r --arg sha "$SHA" "[.[] | select(.user.login=="rodin")
| select(.body|contains("Self-review against "+\$sha)) | select(.body|test("(?im)^###
\\s+Doc consistency\\b"))] | length")
if [ "$HAS_SR" -gt 0 ]; then ALLOW=true; REASON=self-review; elif [ "$AGE_MIN"
-ge "$TTL_MIN" ]; then ALLOW=true; REASON=ttl; else ALLOW=false; REASON=missing;
fi
echo "allow_review=$ALLOW" >> $GITHUB_OUTPUT
echo "reason=$REASON" >> $GITHUB_OUTPUT'
env:
SELF_REVIEW_TTL_MIN: '45'