fix: retry on transient LLM response body truncation
CI / test (pull_request) Successful in 15s
CI / review (/openai/v1, gpt-4.1, gpt41, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 25s
CI / review (/openai/v1, gpt-4.1-mini, gpt41-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 29s
CI / review (/anthropic/v1, claude-sonnet-4-6, sonnet, anthropic, SONNET_REVIEW_TOKEN) (pull_request) Successful in 49s
CI / review (/openai/v1, gpt-5, security, openai, SECURITY_REVIEW.md, SECURITY_REVIEW_TOKEN) (pull_request) Successful in 50s
CI / review (/openai/v1, gpt-5, gpt, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 1m15s
CI / review (/openai/v1, gpt-5-mini, gpt5-mini, openai, GPT_REVIEW_TOKEN) (pull_request) Successful in 52s
Addresses intermittent 'unexpected end of JSON input' failures where the LLM response body is truncated in transit between the proxy and client.

Root cause: network-level truncation where io.ReadAll returns partial data (observed in 3/50 CI runs through HAI proxy). The response body reading was already using io.ReadAll correctly, but transient network issues between the proxy and client can still cause partial reads.

Changes:
- Add Content-Length validation in doRequest: detect when fewer bytes arrive than the server declared, triggering a retry
- Add retry logic in Complete: retries once on retryable errors (body read failures, content-length mismatches) with a 500ms backoff
- Add parse-level retry in main: if ParseResponse fails, re-requests from the LLM once before giving up (defensive, since retries always succeed per issue evidence)
- Improve ParseResponse error diagnostics: log raw vs cleaned lengths and a preview of the cleaned content to aid future debugging

Does NOT retry on API errors (4xx/5xx) or structural issues, only on transient body read problems.

Closes #47
+28
-12
@@ -254,25 +254,41 @@ func main() {
 		slog.Warn("context trimmed to fit budget", "trimmed", budgetResult.Trimmed)
 	}
 
-	// Step 8: Call LLM
+	// Step 8: Call LLM (with retry on parse failure)
 	slog.Info("sending request to LLM", "model", *llmModel)
 	messages := []llm.Message{
 		{Role: "system", Content: budgetResult.SystemPrompt},
 		{Role: "user", Content: budgetResult.UserPrompt},
 	}
 
-	response, err := llmClient.Complete(ctx, messages)
-	if err != nil {
-		slog.Error("LLM request failed", "model", *llmModel, "error", err)
-		os.Exit(1)
-	}
-	slog.Info("LLM response received", "bytes", len(response))
+	var response string
+	var result *review.ReviewResult
+	for attempt := 1; attempt <= 2; attempt++ {
+		if attempt > 1 {
+			slog.Warn("retrying LLM request after parse failure", "attempt", attempt)
+			time.Sleep(time.Second)
+		}
 
-	// Step 9: Parse response
-	result, err := review.ParseResponse(response)
-	if err != nil {
-		slog.Error("failed to parse LLM response", "error", err)
-		os.Exit(1)
+		response, err = llmClient.Complete(ctx, messages)
+		if err != nil {
+			slog.Error("LLM request failed", "model", *llmModel, "error", err, "attempt", attempt)
+			if attempt == 2 {
+				os.Exit(1)
+			}
+			continue
+		}
+		slog.Info("LLM response received", "bytes", len(response), "attempt", attempt)
+
+		// Step 9: Parse response
+		result, err = review.ParseResponse(response)
+		if err != nil {
+			slog.Error("failed to parse LLM response", "error", err, "attempt", attempt)
+			if attempt == 2 {
+				os.Exit(1)
+			}
+			continue
+		}
+		break
 	}
 	slog.Info("review parsed", "verdict", result.Verdict, "findings", len(result.Findings))
 