[NIT] The AICoreClient struct doc comment references 'issue #54 review discussion' which is an internal reference that may not be accessible to all readers of the code. This is fine as a development note, but could be confusing in a public codebase.
[MINOR] Same double getToken pattern in CompleteOpenAI — getDeploymentURL already fetches a token internally, then another explicit getToken call follows. Same concern as above.
[MINOR] The stale check re-fetches the PR using the same err variable that was used for the first fetch. This shadows the outer err implicitly since currentPR, err := is a short variable declaration that reuses err. This is correct Go, but it means that after this block, err holds the result of the second GetPullRequest call (or nil on success). Any subsequent code that checks err without reassigning could be confused. In this case the variable is not read afterwards in an ambiguous way, but it's worth noting for maintainability.
[MINOR] The test case "both empty (edge case)" where both evaluatedSHA and currentSHA are empty results in shouldSkipStaleReview("", "") returning false (don't skip). This means if somehow both SHAs are empty the review would be posted. This edge case is arguably fine (empty SHA means re-fetch failed and we prefer posting), but a brief comment in the test explaining why wantSkip: false is correct here would improve readability.
[MINOR] If two concurrent goroutines both miss the read-lock cache check for getDeploymentURL, they will both call getToken before acquiring the write lock. Since getToken itself acquires c.mu internally, and then getDeploymentURL acquires c.mu.Lock(), there's a potential for the two goroutines to acquire c.mu.Lock() sequentially, both calling fetchDeployments. The double-check after acquiring the write lock prevents duplicate work, but the second goroutine still called getToken unnecessarily. This is a minor efficiency concern, not a correctness issue, since getToken is also protected by the same mutex and has its own double-check.
[MINOR] The client_secret is sent as a URL-encoded form parameter (data.Set("client_secret", c.config.ClientSecret)). While this is valid per OAuth2 spec (Resource Owner Password flow uses form body), many OAuth2 servers also accept/prefer HTTP Basic auth (Authorization: Basic base64(clientID:secret)). The current approach is spec-compliant but worth noting as some IdPs may reject it. Not a bug, but the comment could document why form body is used over HTTP Basic auth for this particular IdP.
[MINOR] When tokenResp.ExpiresIn is 0 (either because the field is missing or the server returns 0), expiry will equal time.Now(), causing every subsequent call to immediately re-fetch the token. A reasonable fallback (e.g., if tokenResp.ExpiresIn <= 0 { tokenResp.ExpiresIn = 3600 }) would be more robust.
[NIT] The AnthropicVersion field is hardcoded as a string literal "bedrock-2023-05-31" inside CompleteAnthropic. Consider extracting this as a named constant (e.g. AICoreAnthropicVersion) similar to AICoreOpenAIAPIVersion, for consistency and to make it easy to update.
[NIT] The variable url (used in if url, ok := c.deployments[model]; ok) shadows the built-in net/url package import, but net/url is imported and only used in fetchToken. While it works, renaming the local variable (e.g., deployURL) would avoid the shadow and make the code clearer.
[NIT] The test uses json.NewEncoder(w).Encode(...) without checking its error return. While errors from writing to an httptest.ResponseRecorder are extremely rare, this is technically ignoring an error. The established project convention is to check all error returns.
[MINOR] CompleteAnthropic calls getDeploymentURL and then getToken as separate operations. Between these two calls, the token could theoretically expire (extremely unlikely in practice, but the token fetched for deployment discovery in getDeploymentURL is not reused here). This is a minor efficiency issue rather than a correctness bug since getToken handles caching and the window is tiny, but it would be slightly cleaner to fetch the token once and pass it through. Low priority given this only matters on token boundary.
[NIT] The deployment struct has an ID field that is stored but never read anywhere in the codebase. If it's not used for anything, it adds unused memory per deployment entry. Either use it (e.g., for logging) or remove it.
[NIT] json.NewEncoder(w).Encode(...) errors are silently ignored throughout the test handlers. While this is common in test code and a failure here would manifest as a test failure, adding t.Helper() and checking the error would make test failures easier to diagnose.
[MINOR] The deployments cache is populated once and never invalidated. If a deployment becomes unavailable (status changes from RUNNING to STOPPED) after initial discovery, the client will continue trying to use the old URL until the process restarts. Consider adding a TTL or a re-fetch on 4xx/5xx responses from the invoke endpoint. This is acceptable for a short-lived CI runner process but could be an issue for longer-running deployments.
[MINOR] fetchDeployments is called while holding the write lock (c.mu.Lock() in getDeploymentURL). This means HTTP I/O (the deployments API call) happens under the mutex, which blocks all concurrent readers for the duration of the network round-trip. For a single-use CLI tool this is fine, but for a library it would be better to do the fetch outside the lock and then re-acquire to write. This is an accepted tradeoff for the current use case but worth noting.
[NIT] The field name http shadows the net/http package name within the struct's methods. While Go resolves this correctly (the package is accessed via the import path, the field via the receiver), this is a common source of confusion. The existing llm/client.go uses the same pattern, so this is consistent with the codebase, but naming it httpClient would be clearer.
[MINOR] When multiple models exist with the same name and both are RUNNING, the last one encountered wins (the loop overwrites c.deployments[modelName]). There is no tie-breaking and the outcome is non-deterministic depending on API response order. The test for this scenario (deploy-456 is STOPPED, deploy-789 is RUNNING for gpt-5) works correctly, but if two RUNNING deployments for the same model name exist the behaviour is undefined. A comment explaining the intent would help.