ci: fix reviewer models — sonnet uses Anthropic, gpt uses GPT-5 #44
The matrix was wrong: "sonnet" was running GPT-5 and "gpt" was running GPT-4.1.
Now:
Each matrix entry specifies its own provider and base_url. This exercises the Anthropic code path in CI for the first time.
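A check along the following lines could catch this class of mismatch before it reaches CI. It is only a sketch: `lint_matrix` and the prefix-to-provider mapping are hypothetical and not part of this PR, and the `broken` entry is an illustrative mismatch rather than a copy of the old workflow.

```python
# Hypothetical lint for a reviewer matrix. Field names follow the PR
# description; the prefix-to-provider convention below is an assumption.
REQUIRED_FIELDS = ("name", "provider", "base_url", "model")

# Assumed convention: the model name prefix implies the provider.
MODEL_PREFIX_TO_PROVIDER = {"claude-": "anthropic", "gpt-": "openai"}


def lint_matrix(entries):
    """Return a list of human-readable problems found in the matrix."""
    problems = []
    for entry in entries:
        name = entry.get("name", "<unnamed>")
        # Every entry must carry its own provider, base_url, and model.
        for field in REQUIRED_FIELDS:
            if not entry.get(field):
                problems.append(f"{name}: missing {field}")
        model = entry.get("model", "")
        provider = entry.get("provider", "")
        # Flag a model that is paired with the wrong provider.
        for prefix, expected in MODEL_PREFIX_TO_PROVIDER.items():
            if model.startswith(prefix) and provider and provider != expected:
                problems.append(
                    f"{name}: model {model} expects provider {expected}, got {provider}"
                )
    return problems


fixed = [
    {"name": "sonnet", "provider": "anthropic",
     "base_url": "http://100.86.77.84:6655/anthropic/v1",
     "model": "claude-sonnet-4-6"},
    {"name": "gpt", "provider": "openai",
     "base_url": "http://100.86.77.84:6655/openai/v1",
     "model": "gpt-5"},
]

# Illustrative mismatch: an Anthropic entry that still names a GPT model.
broken = [
    {"name": "sonnet", "provider": "anthropic",
     "base_url": "http://100.86.77.84:6655/anthropic/v1",
     "model": "gpt-5"},
]
```

Calling `lint_matrix(fixed)` returns an empty list, while `lint_matrix(broken)` reports the provider mismatch for the sonnet entry.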
Sonnet Review
Summary
This CI configuration fix correctly reassigns models to their proper providers: sonnet now uses Claude Sonnet 4.6 via the Anthropic endpoint, and gpt/security use GPT-5 via the OpenAI endpoint. The change also moves base_url from a shared secret to per-matrix-entry values, which is the right approach for supporting multiple providers. CI passes.
Recommendation
APPROVE — The change is straightforward and correct. The previous configuration had a clear bug (sonnet running GPT-5, gpt running GPT-4.1), and this fix resolves it by properly associating each matrix entry with its provider, base URL, and model. Moving LLM_BASE_URL from a shared secret to per-matrix configuration is the right architectural decision since different providers require different endpoints. No issues found.
Review by sonnet
Evaluated against
50cefd78
Original review
Superseded; see current review for up-to-date findings.
Previous findings (commit 50cefd78)
Gpt Review
Summary
The workflow changes correctly realign the reviewer matrix so 'sonnet' uses Anthropic and 'gpt'/'security' use OpenAI, and add per-matrix provider/base_url wiring. The environment variable updates look consistent with the matrix fields, and CI has passed.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — Overall the changes are sound: the matrix entries now specify their own provider and base_url, and the env values correctly pull from the matrix (including LLM_PROVIDER). This should exercise both Anthropic and OpenAI code paths as intended, and CI confirms the workflow executes. Consider moving the base URLs out of the workflow into repository/environment secrets or variables to avoid hardcoding internal IPs and to simplify updates across environments, and remove or update any no-longer-used secrets or docs related to the old LLM_BASE_URL source.
Review by gpt
Evaluated against
50cefd78
@@ -30,2 +30,3 @@
token_secret: SONNET_REVIEW_TOKEN
model: gpt-5
provider: anthropic
base_url: http://100.86.77.84:6655/anthropic/v1
[MINOR] Hardcoded base_url includes an internal IP address. Consider moving base URLs to repository/environment secrets or variables to avoid exposing infrastructure details and ease environment changes.
@@ -50,3 +56,3 @@
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_BASE_URL: ${{ matrix.base_url }}
[NIT] LLM_BASE_URL now comes from matrix.base_url; if a secrets-based LLM_BASE_URL was previously used, ensure any obsolete secret or documentation references are cleaned up to prevent confusion.
Security Review
Summary
The workflow fix correctly assigns models/providers but introduces significant security risks. The LLM endpoints are hardcoded to plain HTTP and moved out of secrets, risking exposure of API keys and internal infrastructure details.
Findings
.gitea/workflows/ci.yml
Recommendation
REQUEST_CHANGES — Switch all LLM endpoints to HTTPS and ensure TLS validation is enforced by the review-bot. Avoid hardcoding service endpoints in the workflow; instead, restore LLM_BASE_URL to a repository or environment secret so PRs cannot redirect calls to arbitrary hosts. If multiple providers are needed, store provider-specific base URLs and API keys in distinct secrets (e.g., LLM_BASE_URL_OPENAI, LLM_BASE_URL_ANTHROPIC) and reference them securely. Also consider replacing the hardcoded IP with a DNS name managed via secrets or repository/environment settings to prevent internal information disclosure and improve portability.
Review by security
Evaluated against
50cefd78
@@ -31,1 +31,3 @@
model: gpt-5
provider: anthropic
base_url: http://100.86.77.84:6655/anthropic/v1
model: claude-sonnet-4-6
[MAJOR] LLM base_url uses plain HTTP (http://100.86.77.84:6655/anthropic/v1) which will transmit the LLM_API_KEY and request contents in cleartext, risking secret and data exposure. All three base_url entries (Anthropic and OpenAI) are HTTP.
[NIT] Hardcoded internal IP (100.86.77.84) in the repository may disclose internal network details if the repo is public and reduces flexibility across environments.
@@ -49,9 +55,10 @@ jobs:
PR_NUMBER: ${{ github.event.pull_request.number }}
[MINOR] LLM_BASE_URL now derives from the workflow matrix instead of a secret, increasing the risk that a PR modifying this workflow could redirect requests (and exfiltrate secrets) to an attacker-controlled host. Keeping the endpoint in a secret or protected environment reduces this risk.
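The per-provider secrets suggested in this review (LLM_BASE_URL_OPENAI, LLM_BASE_URL_ANTHROPIC) could be resolved on the bot side roughly as follows. This is a sketch under assumptions: `resolve_base_url` is a hypothetical helper, and nothing in the PR confirms that review-bot reads its configuration this way.

```python
import os


def resolve_base_url(provider, env=None):
    """Look up the base URL for a provider from LLM_BASE_URL_<PROVIDER>.

    The variable naming follows the reviewer's suggestion; the helper
    itself is hypothetical and not part of review-bot.
    """
    env = os.environ if env is None else env
    key = f"LLM_BASE_URL_{provider.upper()}"
    url = env.get(key, "").strip()
    if not url:
        # Fail loudly rather than fall back to an endpoint set elsewhere.
        raise KeyError(f"{key} is not set")
    return url
```

Because the endpoint comes only from the environment, a change to the workflow matrix cannot point the job at a different host.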
50cefd7840 to b4c994d0fa
Gpt41-mini Review
Summary
The PR correctly fixes the reviewer model matrix in the CI workflow, properly assigning each matrix entry its explicit provider, base_url, and model. The change aligns with the stated intent, enabling the use of Anthropic for sonnet and GPT-5 for gpt and security reviewers, and the CI has passed.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — Approve the PR as it fixes configuration inconsistencies in the CI workflow matrix correctly, enabling proper testing of Anthropic and OpenAI paths. The tests passed without error. The minor environment variable fixes enhance clarity and correctness of the CI environment setup.
Review by gpt41-mini
Evaluated against
b4c994d0
@@ -35,0 +46,4 @@
provider: openai
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-5-mini
- name: gpt41-mini
[MINOR] In the 'Run ${{ matrix.name }} review' step environment, the LLM_BASE_URL was corrected from referring to a secret to referencing the matrix.base_url, ensuring each job uses the appropriate base URL.
[MINOR] Added LLM_PROVIDER environment variable to the review step to reflect the provider set in the matrix, ensuring consistency and correct provider usage at runtime.
Gpt41 Review
Summary
This PR updates the CI review matrix to explicitly specify the correct model-provider mappings and aligns matrix properties with expected runtime environment variables. The changes are configuration-only with no code or logic changes.
Recommendation
APPROVE — The changes correct the reviewer matrix, making each job explicit in its provider, base_url, and model. This increases clarity and makes Anthropic Sonnet reviews appear in the matrix for the first time. The PR respects all conventions, does not introduce any technical risks, and the tests pass. No further action needed.
Review by gpt41
Evaluated against
b4c994d0
Sonnet Review
Summary
This CI configuration fix correctly routes each matrix entry to the right provider and model, replacing the previous broken setup where 'sonnet' was using GPT-5. The changes are straightforward and accurate.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — The fix is correct and CI passes. The main change — giving each matrix entry its own provider, base_url, and model, and moving LLM_BASE_URL from a secret to a per-matrix value — is clean and well-structured. The only minor question is whether SONNET_REVIEW_TOKEN is intentionally reused for the gpt41/gpt5-mini/gpt41-mini entries (likely fine if the HAI proxy accepts any valid token regardless of provider), but this is a policy/ops question rather than a code correctness issue. Approve.
Review by sonnet
Evaluated against
b4c994d0
@@ -34,1 +41,4 @@
provider: openai
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-4.1
- name: gpt5-mini
[MINOR] The gpt41, gpt5-mini, and gpt41-mini matrix entries all use token_secret: SONNET_REVIEW_TOKEN rather than their own dedicated secrets or GPT_REVIEW_TOKEN. This appears intentional (sharing a token for auxiliary reviewers) but is worth confirming: if the Anthropic token is used to call an OpenAI endpoint it may fail depending on how the HAI proxy validates tokens.
Gpt Review
Summary
The workflow changes correctly map reviewers to their intended models and providers, add per-matrix base URLs and providers, and adjust environment variables accordingly. CI has passed, and the YAML appears valid and consistent.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — The changes align reviewers to the correct models/providers and ensure the Anthropic code path is exercised in CI. Environment variables are updated to source provider and base_url from the matrix, which improves flexibility. CI passing indicates functional correctness. As minor improvements, consider using distinct reviewer tokens for the added OpenAI-based reviewers to avoid potential identity confusion, and move hardcoded base URLs to secrets or repository variables (and use HTTPS where possible). Otherwise, the PR looks good to merge.
Review by gpt
Evaluated against
b4c994d0
@@ -30,2 +30,3 @@
token_secret: SONNET_REVIEW_TOKEN
model: gpt-5
provider: anthropic
base_url: http://100.86.77.84:6655/anthropic/v1
[MINOR] Hardcoded base_url values use a plaintext HTTP internal IP (e.g., http://100.86.77.84:6655/...). Consider moving these to secrets or repository variables and preferring HTTPS to avoid exposing internal infra details and to improve transport security.
@@ -34,0 +37,4 @@
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-5
- name: gpt41
token_secret: SONNET_REVIEW_TOKEN
[MINOR] Matrix entries gpt41, gpt5-mini, and gpt41-mini use token_secret SONNET_REVIEW_TOKEN despite being OpenAI-based reviewers. Consider using a distinct reviewer token (e.g., GPT_REVIEW_TOKEN or dedicated tokens) to avoid confusing identity/permission scopes.
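The token reuse described above could be detected mechanically. The sketch below is hypothetical (`tokens_shared_across_providers` is not part of the PR) and uses only the entry names, providers, and token secrets quoted in these findings.

```python
from collections import defaultdict


def tokens_shared_across_providers(entries):
    """Map each token_secret used by more than one provider to those providers."""
    providers_by_token = defaultdict(set)
    for entry in entries:
        providers_by_token[entry["token_secret"]].add(entry["provider"])
    # Keep only tokens that span several providers.
    return {
        token: sorted(providers)
        for token, providers in providers_by_token.items()
        if len(providers) > 1
    }


# Entries as described in the findings for commit b4c994d0.
entries_b4c994d0 = [
    {"name": "sonnet", "provider": "anthropic", "token_secret": "SONNET_REVIEW_TOKEN"},
    {"name": "gpt41", "provider": "openai", "token_secret": "SONNET_REVIEW_TOKEN"},
    {"name": "gpt5-mini", "provider": "openai", "token_secret": "SONNET_REVIEW_TOKEN"},
    {"name": "gpt41-mini", "provider": "openai", "token_secret": "SONNET_REVIEW_TOKEN"},
]
```

For these entries the function reports SONNET_REVIEW_TOKEN as shared between the anthropic and openai providers. Whether that sharing is a bug or an intentional choice remains a question for the PR author.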
Security Review
Summary
While CI passed and the intent to exercise different providers is clear, the changes introduce a significant secret exfiltration risk by allowing the PR-controlled workflow to set the LLM endpoint. Additionally, the endpoints use plain HTTP and expose internal IPs, which are security hardening concerns.
Findings
.gitea/workflows/ci.yml
Recommendation
REQUEST_CHANGES — Address the major secret exfiltration risk by ensuring untrusted PRs cannot control destinations that receive secrets. Concrete options: (1) Revert LLM_BASE_URL to come from a protected repository/organization secret (e.g., secrets.LLM_BASE_URL) or an Actions environment variable set outside the PR; do not allow it to be overridden by matrix values from the workflow file. (2) Restrict the review job so it does not run with secrets on pull_request from forks; consider running the review on a trusted context (e.g., after merge, or via a protected manual dispatch) or require environment approval before secrets are injected. (3) Implement host allowlisting in review-bot (validate LLM_BASE_URL against an expected set) so even if the workflow is altered, outbound requests are only made to approved endpoints.
Additionally, switch the base_url endpoints to HTTPS to prevent plaintext credential transport, and avoid committing internal IPs to the repository by moving these endpoints to secrets or runner-level configuration. These changes will mitigate both the immediate exfiltration vector and harden transport/security posture.
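Option (3), host allowlisting, could look roughly like this inside the bot. It is a minimal sketch: `validate_base_url` and the allowlisted host name are hypothetical, and the PR does not show how review-bot reads LLM_BASE_URL.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; real hosts would come from trusted configuration
# that a pull request cannot modify.
ALLOWED_HOSTS = frozenset({"llm.example.internal"})


def validate_base_url(url, allowed_hosts=ALLOWED_HOSTS, require_https=True):
    """Return url unchanged if it is acceptable, otherwise raise ValueError."""
    parts = urlsplit(url)
    # Reject plaintext transport so API keys are never sent in the clear.
    if require_https and parts.scheme != "https":
        raise ValueError(f"refusing non-HTTPS endpoint: {url}")
    # Reject any host that is not explicitly approved.
    if parts.hostname not in allowed_hosts:
        raise ValueError(f"host not in allowlist: {parts.hostname}")
    return url
```

With such a check in place, a workflow edit that redirects LLM_BASE_URL would make the job fail before any request carrying LLM_API_KEY is sent.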
Review by security
Evaluated against
b4c994d0
@@ -29,3 +29,3 @@
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
model: gpt-5
provider: anthropic
[MINOR] The configured LLM endpoints use HTTP (plaintext) rather than HTTPS (e.g., base_url: http://100.86.77.84:6655/...), risking interception of API keys and PR content in transit, even on internal networks.
[MINOR] The workflow hardcodes internal IP addresses and ports in a repo file, which can disclose internal network topology if the repository is public. Prefer using secrets or environment variables to avoid exposing infrastructure details.
@@ -51,3 +72,3 @@
REVIEWER_NAME: ${{ matrix.name }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_BASE_URL: ${{ matrix.base_url }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
[MAJOR] LLM_BASE_URL is now sourced from the PR-controlled matrix (matrix.base_url) instead of a repository secret, enabling an attacker to modify the workflow in a PR to redirect outbound requests (including secrets like LLM_API_KEY and REVIEWER_TOKEN) to an arbitrary endpoint. This is a classic supply-chain/secret exfiltration vector for CI workflows that run on pull_request events with secrets available.
Gpt5-mini Review
Summary
The workflow changes correctly add per-matrix provider and base_url settings and wire LLM_PROVIDER through the run step, but several matrix entries use the SONNET_REVIEW_TOKEN where it looks like the GPT token (or a dedicated token) should be used — this is likely a copy-paste error that will cause runs to authenticate with the wrong reviewer token.
Findings
.gitea/workflows/ci.yml
Recommendation
REQUEST_CHANGES — Fix the token_secret values for the gpt-related matrix entries so each job uses the intended reviewer secret (e.g. change the token_secret for gpt41, gpt5-mini, and gpt41-mini from SONNET_REVIEW_TOKEN to GPT_REVIEW_TOKEN or to the appropriate distinct secrets). Also verify that the review-bot supports Anthropic via LLM_PROVIDER=anthropic and that the LLM_API_KEY secret is appropriate for the Anthropic/openai proxy in use; if different keys are required per provider, add a per-matrix secret or otherwise handle that mapping. After making those adjustments, re-run CI to ensure the matrix jobs authenticate and execute as expected.
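The provider-specific authentication mentioned here can be illustrated with a small sketch. `build_auth_headers` is hypothetical. The header names are the ones the public Anthropic and OpenAI HTTP APIs document, but the proxy used in this workflow may expect something different, so treat the values as assumptions.

```python
def build_auth_headers(provider, api_key):
    """Return HTTP auth headers for the given provider.

    Assumes the upstream conventions: Anthropic expects `x-api-key` plus an
    `anthropic-version` header, OpenAI expects a Bearer token. A proxy in
    front of either API may use a different scheme.
    """
    if provider == "anthropic":
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    if provider == "openai":
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unsupported provider: {provider}")
```

If a single LLM_API_KEY is valid for both paths, only the header shape differs per provider; if not, the key would also need to be selected per matrix entry.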
Review by gpt5-mini
Evaluated against
b4c994d0
@@ -29,3 +29,3 @@
- name: sonnet
token_secret: SONNET_REVIEW_TOKEN
model: gpt-5
provider: anthropic
[MINOR] You've added provider: anthropic and base_url for the 'sonnet' entry which is good. Make sure the review-bot supports Anthropic-style auth/headers when LLM_PROVIDER=anthropic and that a single LLM_API_KEY secret is valid for both Anthropic and OpenAI endpoints (or provide separate secrets if needed).
@@ -34,0 +37,4 @@
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-5
- name: gpt41
token_secret: SONNET_REVIEW_TOKEN
[MAJOR] The matrix entry 'gpt41' sets token_secret: SONNET_REVIEW_TOKEN. This is likely incorrect; gpt-related entries should probably use GPT_REVIEW_TOKEN (or their own appropriate secret). Using the wrong token will cause that matrix job to authenticate as the wrong reviewer and may fail or operate under unexpected permissions.
@@ -34,1 +42,4 @@
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-4.1
- name: gpt5-mini
token_secret: SONNET_REVIEW_TOKEN
[MAJOR] The matrix entry 'gpt5-mini' sets token_secret: SONNET_REVIEW_TOKEN. This duplicates the same token as 'sonnet' and likely should use GPT_REVIEW_TOKEN (or a dedicated token). Confirm intended reviewer tokens per matrix entry.
@@ -35,0 +47,4 @@
base_url: http://100.86.77.84:6655/openai/v1
model: gpt-5-mini
- name: gpt41-mini
token_secret: SONNET_REVIEW_TOKEN
[MAJOR] The matrix entry 'gpt41-mini' sets token_secret: SONNET_REVIEW_TOKEN. As above, this seems inconsistent with the 'gpt' naming and should likely reference GPT_REVIEW_TOKEN or another correct secret.
@@ -53,2 +73,4 @@
LLM_BASE_URL: ${{ matrix.base_url }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_MODEL: ${{ matrix.model }}
LLM_PROVIDER: ${{ matrix.provider }}
[NIT] SYSTEM_PROMPT_FILE is set from matrix.system_prompt_file; for entries that don't define it the variable will be empty. That's probably fine, but consider documenting that behavior in the PR body or adding an explicit null/empty entry for clarity.
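Tolerating an empty SYSTEM_PROMPT_FILE on the consuming side could be as simple as the sketch below. `load_system_prompt` is a hypothetical helper; the PR does not show how review-bot actually reads this variable.

```python
import os


def load_system_prompt(env=None, default=""):
    """Read the prompt file named by SYSTEM_PROMPT_FILE, if any.

    An unset or empty variable (as produced by a matrix entry without
    system_prompt_file) yields `default` instead of an error.
    """
    env = os.environ if env is None else env
    path = env.get("SYSTEM_PROMPT_FILE", "").strip()
    if not path:
        return default
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

Treating unset and empty the same way means matrix entries do not need an explicit null value.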
Sonnet Review
Summary
This is a CI configuration fix that correctly assigns providers, endpoints, and models to each matrix entry. The changes are straightforward, CI passes, and the configuration looks correct.
Recommendation
APPROVE — The change is correct and well-scoped. It fixes the mislabeled matrix entries (sonnet was using GPT-5, gpt was using GPT-4.1), adds the missing
provider and llm_path fields to all matrix entries, expands the matrix with additional model variants, and threads LLM_PROVIDER through the environment. CI passes. Approve.
Review by sonnet
Evaluated against
f48288bf
Gpt41-mini Review
Summary
The changes correct the reviewer matrix configuration for the CI workflows by properly assigning providers, paths, and models, ensuring the correct LLM service endpoints and versions are tested. The matrix now explicitly specifies all required fields for each job, improving clarity and correctness without introducing regressions.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — The PR cleanly fixes the configuration by clearly specifying provider, model, and endpoint path for each matrix test job. It adheres to idiomatic CI configuration and is consistent with the rest of the workflow file. No changes are required.
Review by gpt41-mini
Evaluated against
f48288bf
@@ -30,2 +30,3 @@
token_secret: SONNET_REVIEW_TOKEN
model: gpt-5
provider: anthropic
llm_path: /anthropic/v1
[MINOR] Original 'sonnet' matrix entry was missing provider and llm_path fields; they are now added to correctly exercise the Anthropic codepath.
@@ -32,1 +31,4 @@
provider: anthropic
llm_path: /anthropic/v1
model: claude-sonnet-4-6
- name: gpt
[MINOR] The model for 'sonnet' was corrected from 'gpt-5' to 'claude-sonnet-4-6' consistent with the Anthropic provider.
@@ -33,1 +35,4 @@
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-5
[MINOR] The previously missing provider and llm_path fields for 'gpt41' and subsequent jobs are added, ensuring full specification of matrix entries.
@@ -35,0 +45,4 @@
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
model: gpt-5-mini
[MINOR] LLM_BASE_URL is now constructed from secret plus matrix.llm_path, improving modularity and correctness in endpoint invocation.
@@ -35,0 +49,4 @@
- name: gpt41-mini
token_secret: GPT_REVIEW_TOKEN
provider: openai
llm_path: /openai/v1
[MINOR] LLM_PROVIDER environment variable added to the run step to allow correct provider logic downstream.
Gpt41 Review
Summary
The changes are correct, align with the description, and respect all conventions. Each matrix entry is now explicit about its provider and endpoint, which improves clarity and correctness of model routing.
Recommendation
APPROVE — No issues found. The review workflow now unambiguously selects the correct LLM provider and endpoint for each matrix entry, enabling accurate testing of both OpenAI and Anthropic integration in CI. CI has passed and the diff does not introduce any errors or anti-patterns. Merge as-is.
Review by gpt41
Evaluated against
f48288bf
Security Review
Summary
The workflow changes correctly assign providers/models and append per-provider base paths without exposing secrets or introducing injection risks. CI passed and the modifications are limited to CI configuration with no evident security impact.
Recommendation
APPROVE — Approve as submitted. The updated matrix explicitly sets provider and base path per entry and passes secrets only via environment variables to the built binary. No sensitive values are printed or interpolated into command lines, and no untrusted input is executed. As a future hardening measure (outside the scope of this change), consider pinning Actions to commit SHAs and ensuring the review-bot never logs environment variables containing secrets.
Review by security
Evaluated against
f48288bf
Gpt Review
Summary
The workflow matrix now correctly maps models to their intended providers and endpoints, and the environment variables are set up to exercise both OpenAI and Anthropic code paths. CI passed, and the changes look consistent and well-scoped.
Recommendation
APPROVE — The updates appropriately add per-matrix provider and base path configuration and correct the model-provider mapping (sonnet → Anthropic, gpt/security → OpenAI). Concatenating LLM_BASE_URL with llm_path is a sensible approach, and adding LLM_PROVIDER enables provider-specific handling in the bot. Since CI passed, this appears to work as intended. As a minor consideration for future improvements, ensure the shared LLM_API_KEY secret covers both providers (if applicable) or introduce provider-specific keys if needed, and consider documenting the additional gpt41 and mini variants in the PR description for clarity on the expanded test matrix.
Review by gpt
Evaluated against
f48288bf
Gpt5-mini Review
Summary
The changes correctly adjust the CI review matrix to run Sonnet on Anthropic and GPT-5 on the OpenAI path, add per-entry provider and llm_path fields, and wire those into the job environment. CI is green and the YAML appears syntactically valid.
Findings
.gitea/workflows/ci.yml
Recommendation
APPROVE — Approve: the diff fixes the reviewer-model mapping and introduces provider and path metadata so the Anthropic code path is exercised. CI has passed. Apply small improvements: (1) make LLM base URL handling explicit per-provider or normalize the secret to avoid string-concatenation pitfalls; (2) ensure review-bot tolerates an empty SYSTEM_PROMPT_FILE (or only set it when defined); and (3) consider clearer matrix job naming (e.g. 'gpt5') to reduce ambiguity. None of these are blocking given the green CI.
Review by gpt5-mini
Evaluated against
f48288bf
@@ -32,1 +31,4 @@
provider: anthropic
llm_path: /anthropic/v1
model: claude-sonnet-4-6
- name: gpt
[NIT] The matrix entry named 'gpt' now runs GPT-5 while there are other entries like 'gpt41' and 'gpt5-mini'. Consider renaming 'gpt' to 'gpt5' (or similar) for clarity and to avoid confusion about which model each job runs.
@@ -50,3 +71,3 @@
REVIEWER_TOKEN: ${{ secrets[matrix.token_secret] }}
REVIEWER_NAME: ${{ matrix.name }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}${{ matrix.llm_path }}
[MINOR] LLM_BASE_URL is constructed by concatenating the secret and matrix.llm_path: ${{ secrets.LLM_BASE_URL }}${{ matrix.llm_path }}. This works but is fragile: it assumes the secret does not already include the path or a trailing slash and that a single secret can be used for multiple providers. Consider storing provider-specific base URL secrets (e.g. LLM_BASE_URL_ANTHROPIC, LLM_BASE_URL_OPENAI) or normalizing/trimming slashes in the secret to avoid accidental double slashes or broken endpoints.
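The slash normalization suggested in this finding might be sketched as follows. `join_base_url` is a hypothetical helper that would live in the bot rather than in the workflow expression, and the host in the usage note is a placeholder.

```python
def join_base_url(base, path):
    """Join a base URL and a provider path without doubling slashes.

    Also tolerates a base that already ends with the provider path, so a
    secret that includes the path is not extended a second time.
    """
    base = base.rstrip("/")
    path = path.strip("/")
    if not path or base.endswith("/" + path):
        return base
    return f"{base}/{path}"
```

For example, `join_base_url("https://llm.example.test/", "/openai/v1")` and `join_base_url("https://llm.example.test/openai/v1", "/openai/v1")` both produce `https://llm.example.test/openai/v1`.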