fix: skip posting review when HEAD moves during evaluation #53

Merged
aweiker merged 3 commits from fix/stale-commit-check into main 2026-05-10 15:26:12 +00:00
3 changed files with 83 additions and 16 deletions
+2 -16
View File
@@ -19,6 +19,7 @@ jobs:
- run: go build -o review-bot ./cmd/review-bot - run: go build -o review-bot ./cmd/review-bot
# Self-review: builds from source since we're pre-release # Self-review: builds from source since we're pre-release
# Models configured to match SAP AI Core deployments
review: review:
runs-on: ubuntu-24.04 runs-on: ubuntu-24.04
if: github.event_name == 'pull_request' if: github.event_name == 'pull_request'
@@ -30,27 +31,12 @@ jobs:
token_secret: SONNET_REVIEW_TOKEN token_secret: SONNET_REVIEW_TOKEN
provider: anthropic provider: anthropic
llm_path: /anthropic/v1 llm_path: /anthropic/v1
model: claude-sonnet-4-6 model: anthropic--claude-4.6-sonnet
- name: gpt - name: gpt
token_secret: GPT_REVIEW_TOKEN token_secret: GPT_REVIEW_TOKEN
provider: openai provider: openai
llm_path: /openai/v1 llm_path: /openai/v1
model: gpt-5 model: gpt-5
- name: gpt41
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-4.1
- name: gpt5-mini
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-5-mini
- name: gpt41-mini
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-4.1-mini
- name: security - name: security
token_secret: SECURITY_REVIEW_TOKEN token_secret: SECURITY_REVIEW_TOKEN
provider: openai provider: openai
+31
View File
@@ -315,6 +315,24 @@ func main() {
Review

[MAJOR] CI is failing for the review workflows; per process policy, merges must be blocked until CI passes.

**[MAJOR]** CI is failing for the review workflows; per process policy, merges must be blocked until CI passes.
Review

[NIT] To reduce wasted LLM budget when HEAD moves early, consider performing an additional early re-fetch and SHA check before invoking the LLM, in addition to the final check before posting.

**[NIT]** To reduce wasted LLM budget when HEAD moves early, consider performing an additional early re-fetch and SHA check before invoking the LLM, in addition to the final check before posting.
Review

[MINOR] Consider performing a pre-flight stale HEAD check earlier (e.g., right after fetching initial PR metadata and before invoking the LLM) to avoid unnecessary compute when the PR has already moved. You can still keep this final pre-post check for correctness.

**[MINOR]** Consider performing a pre-flight stale HEAD check earlier (e.g., right after fetching initial PR metadata and before invoking the LLM) to avoid unnecessary compute when the PR has already moved. You can still keep this final pre-post check for correctness.
Review

[MINOR] The stale-HEAD check runs after the expensive LLM call and review formatting. If HEAD already moved by the time the LLM request starts, the run will still consume time/tokens and then exit. Consider adding a preflight HEAD check before assembling prompts or sending the LLM request to reduce wasted compute; keep the current post-LLM check to handle races during evaluation.

**[MINOR]** The stale-HEAD check runs after the expensive LLM call and review formatting. If HEAD already moved by the time the LLM request starts, the run will still consume time/tokens and then exit. Consider adding a preflight HEAD check before assembling prompts or sending the LLM request to reduce wasted compute; keep the current post-LLM check to handle races during evaluation.
Review

[MINOR] There is a small race window between re-fetching the current HEAD and posting the review (TOCTOU). This is acceptable in practice but worth noting; if the API supports it, posting conditionally against a specific commit could eliminate the window.

**[MINOR]** There is a small race window between re-fetching the current HEAD and posting the review (TOCTOU). This is acceptable in practice but worth noting; if the API supports it, posting conditionally against a specific commit could eliminate the window.
Review

[MINOR] There is a race window between the stale HEAD check and the subsequent PostReview call. HEAD could change again after the check but before posting, potentially causing a stale review to be posted. Consider re-fetching the PR and re-checking the SHA immediately before PostReview to minimize this window.

**[MINOR]** There is a race window between the stale HEAD check and the subsequent PostReview call. HEAD could change again after the check but before posting, potentially causing a stale review to be posted. Consider re-fetching the PR and re-checking the SHA immediately before PostReview to minimize this window.
Review

[MINOR] Consider performing an early HEAD pre-check (re-fetch PR and compare SHA) before invoking the LLM to avoid spending tokens when the PR head has already moved, then keep the current final check before posting.

**[MINOR]** Consider performing an early HEAD pre-check (re-fetch PR and compare SHA) before invoking the LLM to avoid spending tokens when the PR head has already moved, then keep the current final check before posting.
sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName) sentinel := fmt.Sprintf("<!-- review-bot:%s -->", *reviewerName)
Review

[NIT] The variable evaluatedSHA is assigned from pr.Head.Sha here, but pr.Head.Sha was already captured earlier in the function (around line 277) for the CI status check and again for the commit footer. This is fine functionally, but a brief comment or colocation with the earlier SHA usage might make the relationship clearer to future readers. Very minor readability nit.

**[NIT]** The variable `evaluatedSHA` is assigned from `pr.Head.Sha` here, but `pr.Head.Sha` was already captured earlier in the function (around line 277) for the CI status check and again for the commit footer. This is fine functionally, but a brief comment or colocation with the earlier SHA usage might make the relationship clearer to future readers. Very minor readability nit.
// Stale check: verify HEAD hasn't moved since we started
Outdated
Review

[MINOR] The stale-check compares currentPR.Head.Sha against evaluatedSHA even if evaluatedSHA might be empty. Consider guarding the comparison (e.g., only skip when both SHAs are non-empty and differ) to avoid skipping reviews due to transient missing SHA.

**[MINOR]** The stale-check compares currentPR.Head.Sha against evaluatedSHA even if evaluatedSHA might be empty. Consider guarding the comparison (e.g., only skip when both SHAs are non-empty and differ) to avoid skipping reviews due to transient missing SHA.
evaluatedSHA := pr.Head.Sha
Review

[MINOR] The stale check re-fetches the PR using the same err variable that was used for the first fetch. This shadows the outer err implicitly since currentPR, err := is a short variable declaration that reuses err. This is correct Go, but it means that after this block, err holds the result of the second GetPullRequest call (or nil on success). Any subsequent code that checks err without reassigning could be confused. In this case the variable is not read afterwards in an ambiguous way, but it's worth noting for maintainability.

**[MINOR]** The stale check re-fetches the PR using the same `err` variable that was used for the first fetch. This shadows the outer `err` implicitly since `currentPR, err :=` is a short variable declaration that reuses `err`. This is correct Go, but it means that after this block, `err` holds the result of the second GetPullRequest call (or nil on success). Any subsequent code that checks `err` without reassigning could be confused. In this case the variable is not read afterwards in an ambiguous way, but it's worth noting for maintainability.
var currentSHA string
currentPR, err := giteaClient.GetPullRequest(ctx, owner, repoName, prNumber)
if err != nil {
slog.Warn("could not re-fetch PR for stale check", "pr", prNumber, "error", err)
// currentSHA stays empty — shouldSkipStaleReview will return false
} else {
Review

[NIT] The log level for skipping a stale review uses Warn. Since this is expected behavior in normal operation, consider using Info to reduce warning noise in logs.

**[NIT]** The log level for skipping a stale review uses Warn. Since this is expected behavior in normal operation, consider using Info to reduce warning noise in logs.
currentSHA = currentPR.Head.Sha
}
if shouldSkipStaleReview(evaluatedSHA, currentSHA) {
slog.Warn("HEAD moved during review — skipping stale review",
"evaluated", evaluatedSHA,
"current", currentSHA,
"pr", prNumber)
return
}
// Map findings to inline comments for lines present in the diff // Map findings to inline comments for lines present in the diff
diffRanges := gitea.ParseDiffNewLines(diff) diffRanges := gitea.ParseDiffNewLines(diff)
var inlineComments []gitea.ReviewComment var inlineComments []gitea.ReviewComment
@@ -666,3 +684,16 @@ func findAllOwnReviews(reviews []gitea.Review, sentinel string) []gitea.Review {
} }
Review

[NIT] shouldSkipStaleReview will skip posting when evaluatedSHA is empty and currentSHA is non-empty. While unlikely for real PRs, consider documenting this edge case or adding a guard if evaluatedSHA can ever be empty.

**[NIT]** shouldSkipStaleReview will skip posting when evaluatedSHA is empty and currentSHA is non-empty. While unlikely for real PRs, consider documenting this edge case or adding a guard if evaluatedSHA can ever be empty.
Review

[MINOR] The shouldSkipStaleReview comment does not mention the behavior when evaluatedSHA is empty and currentSHA is non-empty; currently this results in skipping. Clarify this in the doc comment to reflect intended behavior.

**[MINOR]** The shouldSkipStaleReview comment does not mention the behavior when evaluatedSHA is empty and currentSHA is non-empty; currently this results in skipping. Clarify this in the doc comment to reflect intended behavior.
Review

[MINOR] shouldSkipStaleReview treats an empty evaluated SHA differently from an empty current SHA. If evaluatedSHA is empty and currentSHA is non-empty, it will skip; consider documenting or testing this scenario explicitly to clarify intent.

**[MINOR]** shouldSkipStaleReview treats an empty evaluated SHA differently from an empty current SHA. If evaluatedSHA is empty and currentSHA is non-empty, it will skip; consider documenting or testing this scenario explicitly to clarify intent.
return result return result
} }
// shouldSkipStaleReview reports whether to skip posting because HEAD moved.
// Returns true (skip) if evaluatedSHA differs from currentSHA.
// Returns false (don't skip) if:
// - SHAs match (no movement)
// - currentSHA is empty (re-fetch failed; prefer posting stale over failing)
func shouldSkipStaleReview(evaluatedSHA, currentSHA string) bool {
if currentSHA == "" {
// Re-fetch failed; better to post potentially stale than fail
return false
}
return evaluatedSHA != currentSHA
}
+50
View File
@@ -862,3 +862,53 @@ func TestFindAllOwnReviews(t *testing.T) {
} }
Review

[MINOR] Add a test case for shouldSkipStaleReview when evaluatedSHA is empty and currentSHA is non-empty to document and lock in the intended skip behavior for that edge case.

**[MINOR]** Add a test case for shouldSkipStaleReview when evaluatedSHA is empty and currentSHA is non-empty to document and lock in the intended skip behavior for that edge case.
} }
} }
func TestShouldSkipStaleReview(t *testing.T) {
tests := []struct {
name string
evaluatedSHA string
currentSHA string
wantSkip bool
}{
{
name: "matching SHAs",
evaluatedSHA: "abc123def456",
currentSHA: "abc123def456",
Review

[MINOR] The test case "both empty (edge case)" where both evaluatedSHA and currentSHA are empty results in shouldSkipStaleReview("", "") returning false (don't skip). This means if somehow both SHAs are empty the review would be posted. This edge case is arguably fine (empty SHA means re-fetch failed and we prefer posting), but a brief comment in the test explaining why wantSkip: false is correct here would improve readability.

**[MINOR]** The test case `"both empty (edge case)"` where both evaluatedSHA and currentSHA are empty results in `shouldSkipStaleReview("", "")` returning `false` (don't skip). This means if somehow both SHAs are empty the review would be posted. This edge case is arguably fine (empty SHA means re-fetch failed and we prefer posting), but a brief comment in the test explaining why `wantSkip: false` is correct here would improve readability.
wantSkip: false,
},
{
name: "different SHAs",
evaluatedSHA: "abc123def456",
currentSHA: "xyz789abc123",
wantSkip: true,
},
{
name: "empty current SHA (re-fetch failed)",
evaluatedSHA: "abc123def456",
currentSHA: "",
wantSkip: false,
},
{
name: "both empty (edge case)",
evaluatedSHA: "",
currentSHA: "",
wantSkip: false,
},
{
name: "only current empty",
evaluatedSHA: "abc123",
currentSHA: "",
wantSkip: false,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
got := shouldSkipStaleReview(tc.evaluatedSHA, tc.currentSHA)
if got != tc.wantSkip {
t.Errorf("shouldSkipStaleReview(%q, %q) = %v, want %v",
tc.evaluatedSHA, tc.currentSHA, got, tc.wantSkip)
}
})
}
}