Implements native AI Core support with:
- OAuth2 token refresh
- Deployment discovery via /v2/lm/deployments
- Anthropic Messages API via /invoke endpoint
- Uses bedrock-2023-05-31 API version (AI Core uses Bedrock format)
- Model field omitted from body (deployment URL specifies model)
- Retry logic with exponential backoff
Tested via integration tests against live AI Core endpoint.
Addresses intermittent 'unexpected end of JSON input' failures where the
LLM response body is truncated in transit between the proxy and client.
Root cause: network-level truncation where io.ReadAll returns partial data
(observed in 3/50 CI runs through HAI proxy). The response body reading
was already using io.ReadAll correctly, but transient network issues
between the proxy and client can still cause partial reads.
Changes:
- Add Content-Length validation in doRequest: detect when fewer bytes
arrive than the server declared, triggering a retry
- Add retry logic in Complete: retries once on retryable errors (body
read failures, content-length mismatches) with a 500ms backoff
- Add parse-level retry in main: if ParseResponse fails, re-requests
from the LLM once before giving up (defensive, since retries always
succeed per issue evidence)
- Improve ParseResponse error diagnostics: log raw vs cleaned lengths
and a preview of the cleaned content to aid future debugging
Does NOT retry on API errors (4xx/5xx) or structural issues — only
transient body read problems.
Closes#47
- Overall context timeout now derived from LLM timeout + 1 minute
(no longer hardcoded 3min that could conflict with longer LLM timeouts)
- Clarify concurrency docs: With* methods are setup-only, not concurrent
- Add ctx.Err() checks in fetchFileContext and fetchPatterns loops
(break early on cancellation instead of making unnecessary requests)
- Fix doc comments: WithTimeout and WithTemperature each get their own
- Add TestWithTimeout (verifies short timeout causes request failure)
- Log warning on directory recursion failure in GetAllFilesInPath
- Note: unexported fields is a breaking change, will document in release notes
New flag: --llm-timeout / LLM_TIMEOUT (seconds, default 300)
New builder: llmClient.WithTimeout(duration)
Composite action: new timeout input
Keeps 5 minutes as the sensible default but allows tuning for
larger repos or slower models.
REVIEW.md findings 1-4, 14:
- All Gitea client methods now accept context.Context as first param
- All LLM client methods now accept context.Context as first param
- Use http.NewRequestWithContext for cancellation/timeout support
- Main uses 3-minute timeout context for all operations
- Unexport Client struct fields (baseURL, token, apiKey, etc.)
- Use bytes.NewReader instead of strings.NewReader(string(...))
- Composite action: cache to runner.temp instead of /usr/local/bin
(avoids permission issues on runners)
- Document that temperature=0 means server default (omitted from request)
- Note: strconv import already exists (false positive from GPT-5)
- install.sh: verify SHA-256 checksum before installing binary
- install.sh: fallback to ~/.local/bin if /usr/local/bin not writable
- install.sh: use sed instead of grep for POSIX-safe JSON parsing
- release.yml: remove jq dependency, parse release ID with sed
- llm: make temperature configurable via --llm-temperature / LLM_TEMPERATURE
- llm: add WithTemperature builder method on Client
- llm: omit temperature from request when zero (uses server default)
- CLI binary with flag/env var configuration
- Gitea API client (PR metadata, diff, CI status, post review)
- OpenAI-compatible LLM client
- Structured review prompt with conventions support
- JSON response parser with validation
- Markdown review formatter for Gitea
- CI failure auto-detection (REQUEST_CHANGES)
- Dry-run mode for testing