feat: add native SAP AI Core support #50

Closed
rodin wants to merge 12 commits from feat/aicore-provider into main
3 changed files with 96 additions and 21 deletions
Showing only changes of commit 34507dd9ff - Show all commits
+21 -9
View File
8
@@ -13,6 +13,10 @@ import (
"time"
)
// AICoreOpenAIAPIVersion is the API version used for OpenAI models through AI Core.
// Update this when SAP AI Core releases a new stable version.
const AICoreOpenAIAPIVersion = "2024-12-01-preview"
// AICoreConfig holds SAP AI Core authentication and connection settings.
type AICoreConfig struct {
ClientID string
2
@@ -40,14 +44,22 @@ type deployment struct {
}
Review

[NIT] AICoreClient uses an internal http.Client with a fixed 30s Timeout irrespective of the overall LLM timeout. While requests also honor context deadlines, the shorter client timeout may cause premature cancellations and inconsistent behavior. Consider aligning with the caller-provided context/timeout or making this configurable.

**[NIT]** AICoreClient uses an internal http.Client with a fixed 30s Timeout irrespective of the overall LLM timeout. While requests also honor context deadlines, the shorter client timeout may cause premature cancellations and inconsistent behavior. Consider aligning with the caller-provided context/timeout or making this configurable.
Review

[MAJOR] AICoreClient constructs an internal http.Client with a hard-coded 30s timeout and does not share or inherit the timeout set on llm.Client via WithTimeout. This creates inconsistent behavior: the top-level LLM timeout may be minutes, but AI Core calls will fail after 30s. Propagate the parent client's timeout or expose a way to configure AICoreClient's http timeout and set it from llm.Client.

**[MAJOR]** AICoreClient constructs an internal http.Client with a hard-coded 30s timeout and does not share or inherit the timeout set on llm.Client via WithTimeout. This creates inconsistent behavior: the top-level LLM timeout may be minutes, but AI Core calls will fail after 30s. Propagate the parent client's timeout or expose a way to configure AICoreClient's http timeout and set it from llm.Client.
// NewAICoreClient creates a new AI Core client with the given configuration.
Review

[MAJOR] AICoreClient initializes its own http.Client with a hard-coded 30s timeout, ignoring the user-configured LLM timeout. This causes AI Core requests to time out at 30s even when --llm-timeout is larger, diverging from documented behavior.

**[MAJOR]** AICoreClient initializes its own http.Client with a hard-coded 30s timeout, ignoring the user-configured LLM timeout. This causes AI Core requests to time out at 30s even when --llm-timeout is larger, diverging from documented behavior.
// The client uses a default 5-minute timeout; use WithTimeout to customize.
func NewAICoreClient(cfg AICoreConfig) *AICoreClient {
return &AICoreClient{
config: cfg,
http: &http.Client{Timeout: 30 * time.Second},
http: &http.Client{Timeout: 5 * time.Minute},
deployments: make(map[string]deployment),
}
Review

[NIT] The deployment struct has an exported ID and URL field but the type is unexported. The ID field is stored in the map but never read back — only URL is used. The ID field could be removed unless planned for future use (e.g., logging, metrics).

**[NIT]** The `deployment` struct has an exported `ID` and `URL` field but the type is unexported. The `ID` field is stored in the map but never read back — only `URL` is used. The `ID` field could be removed unless planned for future use (e.g., logging, metrics).
}
// WithTimeout sets the HTTP request timeout for AI Core calls.
// This should be called during construction, before concurrent use.
Review

[NIT] The field name http shadows the imported net/http package name within the struct. While Go allows this (the package is still accessible by its full import path), it's a subtle naming collision. The existing llm/client.go already uses the same pattern so this is consistent with the codebase, but worth noting as a readability concern.

**[NIT]** The field name `http` shadows the imported `net/http` package name within the struct. While Go allows this (the package is still accessible by its full import path), it's a subtle naming collision. The existing `llm/client.go` already uses the same pattern so this is consistent with the codebase, but worth noting as a readability concern.
func (c *AICoreClient) WithTimeout(d time.Duration) *AICoreClient {
c.http.Timeout = d
return c
}
// getToken returns a valid OAuth token, refreshing if necessary.
func (c *AICoreClient) getToken(ctx context.Context) (string, error) {
c.mu.RLock()
22
@@ -154,7 +166,6 @@ func (c *AICoreClient) getDeploymentURL(ctx context.Context, model string) (stri
}
func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error {
Review

[MINOR] Deployments are fetched once and cached forever with no TTL or invalidation mechanism. If a deployment goes from RUNNING to STOPPED (or a new deployment is added) between reviews, the cached URL will remain stale until the process restarts. Given this tool runs as a short-lived CI process this is acceptable, but worth a comment explaining the intentional lack of cache expiry.

**[MINOR]** Deployments are fetched once and cached forever with no TTL or invalidation mechanism. If a deployment goes from RUNNING to STOPPED (or a new deployment is added) between reviews, the cached URL will remain stale until the process restarts. Given this tool runs as a short-lived CI process this is acceptable, but worth a comment explaining the intentional lack of cache expiry.
deployURL := strings.TrimRight(c.config.APIURL, "/") + "/v2/lm/deployments"
req, err := http.NewRequestWithContext(ctx, http.MethodGet, deployURL, nil)
if err != nil {
Review

[MINOR] Header value for AI-Resource-Group is taken from unvalidated input and inserted into an HTTP header. While Go's net/http prevents CRLF injection in headers, consider validating this value (e.g., allow only [A-Za-z0-9-_]) to reduce risk and fail fast on invalid configuration.

**[MINOR]** Header value for AI-Resource-Group is taken from unvalidated input and inserted into an HTTP header. While Go's net/http prevents CRLF injection in headers, consider validating this value (e.g., allow only [A-Za-z0-9-_]) to reduce risk and fail fast on invalid configuration.
2
@@ -180,10 +191,10 @@ func (c *AICoreClient) fetchDeployments(ctx context.Context, token string) error
var deployResp struct {
Resources []struct {
ID string `json:"id"`
DeploymentURL string `json:"deploymentUrl"`
Status string `json:"status"`
Details struct {
ID string `json:"id"`
DeploymentURL string `json:"deploymentUrl"`
Outdated
Review

[MINOR] fetchDeployments has an unnecessary blank line after the function signature (func (c *AICoreClient) fetchDeployments(...) error {\n\n). Cosmetic but inconsistent with the rest of the file style.

**[MINOR]** fetchDeployments has an unnecessary blank line after the function signature (`func (c *AICoreClient) fetchDeployments(...) error {\n\n`). Cosmetic but inconsistent with the rest of the file style.
Status string `json:"status"`
Details struct {
Review

[MINOR] Deployments are cached indefinitely after first fetch. If a deployment is restarted, updated, or a new one is added for a model, the client will keep using the stale cached URL until the process restarts. Consider adding a TTL or a mechanism to refresh on 4xx/5xx errors from the deployment URL, similar to how tokenExpiry handles token refresh.

**[MINOR]** Deployments are cached indefinitely after first fetch. If a deployment is restarted, updated, or a new one is added for a model, the client will keep using the stale cached URL until the process restarts. Consider adding a TTL or a mechanism to refresh on 4xx/5xx errors from the deployment URL, similar to how `tokenExpiry` handles token refresh.
Resources struct {
BackendDetails struct {
Review

[MINOR] Error returns include the full response body on deployments request failure (fmt.Errorf("deployments request failed ...: %s", body)). This may leak internal details to logs. Prefer limiting or redacting response bodies in error messages.

**[MINOR]** Error returns include the full response body on deployments request failure (fmt.Errorf("deployments request failed ...: %s", body)). This may leak internal details to logs. Prefer limiting or redacting response bodies in error messages.
Model struct {
18
@@ -326,8 +337,8 @@ func (c *AICoreClient) CompleteOpenAI(ctx context.Context, model string, message
return "", fmt.Errorf("marshal request: %w", err)
}
Review

[MINOR] Same SSRF concern for OpenAI-style models: the chat/completions URL is derived directly from deploymentUrl without scheme/host validation. Add URL parsing and enforce https and allowed host/domain.

**[MINOR]** Same SSRF concern for OpenAI-style models: the chat/completions URL is derived directly from deploymentUrl without scheme/host validation. Add URL parsing and enforce https and allowed host/domain.
// AI Core uses /chat/completions?api-version=2024-12-01-preview for OpenAI models
chatURL := strings.TrimRight(deployURL, "/") + "/chat/completions?api-version=2024-12-01-preview"
// AI Core uses /chat/completions?api-version=<version> for OpenAI models
chatURL := strings.TrimRight(deployURL, "/") + "/chat/completions?api-version=" + AICoreOpenAIAPIVersion
req, err := http.NewRequestWithContext(ctx, http.MethodPost, chatURL, bytes.NewReader(data))
if err != nil {
Review

[MINOR] Response handling in CompleteOpenAI does not validate body length against Content-Length. Align with the OpenAI/Anthropic client path by adding a Content-Length vs received length check to detect truncation.

**[MINOR]** Response handling in CompleteOpenAI does not validate body length against Content-Length. Align with the OpenAI/Anthropic client path by adding a Content-Length vs received length check to detect truncation.
return "", fmt.Errorf("create request: %w", err)
9
@@ -363,6 +374,7 @@ func (c *AICoreClient) CompleteOpenAI(ctx context.Context, model string, message
}
// IsAnthropicModel returns true if the model name indicates an Anthropic model.
// SAP AI Core uses "anthropic--" prefix for Anthropic models (e.g., "anthropic--claude-3-5-sonnet").
func IsAnthropicModel(model string) bool {
return strings.HasPrefix(model, "anthropic--") || strings.Contains(model, "claude")
return strings.HasPrefix(model, "anthropic--")
}
+64 -6
View File
3
@@ -3,6 +3,7 @@ package llm
import (
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
1
@@ -275,8 +276,8 @@ func TestAICoreClient_CompleteOpenAI(t *testing.T) {
})
})
mux.HandleFunc("/deployments/openai/chat/completions", func(w http.ResponseWriter, r *http.Request) {
if r.URL.Query().Get("api-version") != "2024-12-01-preview" {
t.Errorf("expected api-version query param")
if r.URL.Query().Get("api-version") != AICoreOpenAIAPIVersion {
t.Errorf("expected api-version %s, got %s", AICoreOpenAIAPIVersion, r.URL.Query().Get("api-version"))
}
var req ChatRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
@@ -327,12 +328,17 @@ func TestIsAnthropicModel(t *testing.T) {
model string
expected bool
}{
// SAP AI Core uses "anthropic--" prefix for Anthropic models
{"anthropic--claude-4.6-sonnet", true},
{"anthropic--claude-4.6-opus", true},
{"claude-sonnet-4", true},
{"anthropic--claude-3-5-sonnet", true},
// Non-prefixed model names are not detected as Anthropic
// (SAP AI Core always uses the prefix for Anthropic models)
{"claude-sonnet-4", false},
{"gpt-5", false},
{"gpt-4.1", false},
{"llama-3", false},
{"my-claude-model", false}, // Avoid false positives on "claude" substring
}
for _, tt := range tests {
@@ -350,7 +356,7 @@ func TestAICoreClient_TokenExpiry(t *testing.T) {
call := atomic.AddInt32(&tokenCalls, 1)
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]interface{}{
"access_token": "token-" + string(rune('0'+call)),
"access_token": fmt.Sprintf("token-%d", call),
"expires_in": 1, // 1 second expiry
})
return
@@ -367,7 +373,10 @@ func TestAICoreClient_TokenExpiry(t *testing.T) {
})
// First call
token1, _ := client.getToken(context.Background())
token1, err := client.getToken(context.Background())
if err != nil {
t.Fatalf("first getToken: %v", err)
}
// Force token expiry by manipulating expiry time
client.mu.Lock()
@@ -375,7 +384,10 @@ func TestAICoreClient_TokenExpiry(t *testing.T) {
client.mu.Unlock()
// Should fetch new token
token2, _ := client.getToken(context.Background())
token2, err := client.getToken(context.Background())
if err != nil {
Outdated
Review

[NIT] TestAICoreClient_TokenExpiry generates token strings with string(rune('0'+call)) which works for single-digit call counts (1, 2) but would produce incorrect results for call >= 10 (non-digit runes). Since the test only makes 2 calls this is safe, but fmt.Sprintf("token-%d", call) would be clearer and more robust.

**[NIT]** TestAICoreClient_TokenExpiry generates token strings with `string(rune('0'+call))` which works for single-digit call counts (1, 2) but would produce incorrect results for call >= 10 (non-digit runes). Since the test only makes 2 calls this is safe, but `fmt.Sprintf("token-%d", call)` would be clearer and more robust.
t.Fatalf("second getToken: %v", err)
}
Outdated
Review

[NIT] TestAICoreClient_TokenExpiry ignores the error return from getToken on lines 390 and 396 (token1, _ := and token2, _ :=). Tests should check errors or use t.Fatal to avoid masking failures.

**[NIT]** TestAICoreClient_TokenExpiry ignores the error return from `getToken` on lines 390 and 396 (`token1, _ :=` and `token2, _ :=`). Tests should check errors or use t.Fatal to avoid masking failures.
if token1 == token2 {
t.Error("expected different tokens after expiry")
@@ -385,6 +397,27 @@ func TestAICoreClient_TokenExpiry(t *testing.T) {
}
}
func TestAICoreClient_WithTimeout(t *testing.T) {
client := NewAICoreClient(AICoreConfig{
ClientID: "test-id",
ClientSecret: "test-secret",
AuthURL: "https://auth.example.com",
APIURL: "https://api.example.com",
ResourceGroup: "default",
})
// Default timeout is 5 minutes
if client.http.Timeout != 5*time.Minute {
t.Errorf("expected default timeout 5m, got %v", client.http.Timeout)
}
Outdated
Review

[NIT] Token string generation uses rune arithmetic ("token-" + string(rune('0'+call))). This is fragile beyond single-digit calls. Prefer fmt.Sprintf("token-%d", call) for clarity and robustness.

**[NIT]** Token string generation uses rune arithmetic ("token-" + string(rune('0'+call))). This is fragile beyond single-digit calls. Prefer fmt.Sprintf("token-%d", call) for clarity and robustness.
// WithTimeout should update the timeout
client.WithTimeout(10 * time.Minute)
if client.http.Timeout != 10*time.Minute {
t.Errorf("expected timeout 10m, got %v", client.http.Timeout)
}
}
func TestClient_WithAICore(t *testing.T) {
client := NewClient("http://example.com", "key", "model")
if client.provider != ProviderOpenAI {
@@ -407,6 +440,31 @@ func TestClient_WithAICore(t *testing.T) {
}
}
func TestClient_WithTimeout_PropagatestoAICore(t *testing.T) {
client := NewClient("http://example.com", "key", "model").
WithAICore(AICoreConfig{
ClientID: "id",
ClientSecret: "secret",
AuthURL: "https://auth.example.com",
APIURL: "https://api.example.com",
ResourceGroup: "default",
})
// Default should be 5 minutes (inherited from parent client)
if client.aicore.http.Timeout != 5*time.Minute {
t.Errorf("expected aicore default timeout 5m, got %v", client.aicore.http.Timeout)
}
// WithTimeout should propagate to AI Core client
client.WithTimeout(15 * time.Minute)
if client.http.Timeout != 15*time.Minute {
t.Errorf("expected parent timeout 15m, got %v", client.http.Timeout)
}
if client.aicore.http.Timeout != 15*time.Minute {
t.Errorf("expected aicore timeout 15m, got %v", client.aicore.http.Timeout)
}
}
func TestClient_CompleteAICore(t *testing.T) {
var baseURL string
mux := http.NewServeMux()
+11 -6
View File
@@ -52,8 +52,12 @@ func NewClient(baseURL, apiKey, model string) *Client {
}
// WithTimeout sets the HTTP request timeout for LLM calls (default 5 minutes).
// When using AI Core, this also sets the timeout on the AI Core client.
func (c *Client) WithTimeout(d time.Duration) *Client {
c.http.Timeout = d
if c.aicore != nil {
c.aicore.WithTimeout(d)
}
return c
}
@@ -71,9 +75,10 @@ func (c *Client) WithProvider(p Provider) *Client {
// WithAICore configures the client to use SAP AI Core for authentication.
// This sets the provider to aicore automatically.
// The AI Core client inherits the current HTTP timeout from this client.
func (c *Client) WithAICore(cfg AICoreConfig) *Client {
c.provider = ProviderAICore
c.aicore = NewAICoreClient(cfg)
c.aicore = NewAICoreClient(cfg).WithTimeout(c.http.Timeout)
return c
}
1
@@ -201,11 +206,11 @@ func (c *Client) completeOpenAI(ctx context.Context, messages []Message) (string
// --- Anthropic Messages API implementation ---
type anthropicRequest struct {
Model string `json:"model"`
MaxTokens int `json:"max_tokens"`
System string `json:"system,omitempty"`
Messages []anthropicMsg `json:"messages"`
Temperature float64 `json:"temperature,omitempty"`
Model string `json:"model"`
MaxTokens int `json:"max_tokens"`
System string `json:"system,omitempty"`
Messages []anthropicMsg `json:"messages"`
Temperature float64 `json:"temperature,omitempty"`
}
type anthropicMsg struct {