Files
model-research/findings/2026-05-14-github-client-https-bypass-multimodel-catch.md
Rodin 8e64f8f012 finding(79): multi-model security review catches HTTPS bypass in GitHub client (PR #131)
Security-review-bot persona caught HTTPS enforcement inconsistency in write-path
methods (PostReview, DeleteReview, RequestReviewer) that generalist reviewers missed.
Issue fixed within 30 minutes, all reviewers re-approved. Validates specialized
security persona value in multi-model pipeline.
2026-05-14 21:56:58 +00:00

4.6 KiB
Raw Permalink Blame History

Finding #79: Multi-Model Security Review Catches HTTPS Bypass in GitHub Client (PR #131)

Date: 2026-05-14 Task: Security review of GitHub API client implementation in review-bot PR #131 (feat: implement GitHub API methods and VCS routing). Task type: Security code review — HTTPS enforcement consistency in Go HTTP client. Config: Config B (odd PR# 131 → Opus investigates, GPT-5 judges)

Summary

The multi-model security reviewer (security-review-bot) identified a critical HTTPS enforcement bypass in the GitHub API client implementation. Three methods (PostReview, DeleteReview, RequestReviewer) constructed and executed HTTP requests directly via c.httpClient.Do without using the doRequest helper that enforces HTTPS-only connections. This meant that if the base URL was misconfigured with http:// (e.g., GitHub Enterprise Server on HTTP), the Authorization header containing API tokens would be transmitted in plaintext.

The finding was flagged as REQUEST_CHANGES at review ID 3745 (2026-05-14T20:46:02Z). The author fixed the issue within 30 minutes (fix applied ~21:00 UTC), and all reviewers re-approved at 21:1221:13 UTC.

Reviewer Behavior

Reviewer Initial State (20:4520:46) Final State (21:1221:13)
sonnet-review-bot APPROVED APPROVED
gpt-review-bot APPROVED APPROVED
security-review-bot REQUEST_CHANGES APPROVED

Key observation: The standard reviewers (sonnet, gpt) both missed the HTTPS enforcement inconsistency. Only the dedicated security-review-bot persona caught it. This is a direct validation of the value of specialized security reviewer personas in a multi-model pipeline.

The Finding (Verbatim)

From review ID 3745 (security-review-bot):

PostReview, DeleteReview, and RequestReviewer construct and execute HTTP requests directly (c.httpClient.Do) without the HTTPS-only guard used in doRequest. If baseURL is accidentally configured with an http:// scheme (e.g., GHES on HTTP), these methods will send the Authorization header over plaintext, exposing tokens. Enforce scheme checks consistently or reuse a common request helper for non-GET methods.

Severity: [MAJOR] — Token exposure via HTTPS enforcement inconsistency.

Recommended fix: Refactor write-path methods to use a generalized version of doRequest that accepts a request body and validates the URL scheme before executing.

Evidence of Impact

The fix was applied and verified. Security review passed after fix. This demonstrates:

  1. The security persona provides unique coverage that generalist reviewers do not
  2. Request/response method paths are higher risk for HTTPS bypass than read paths (different code paths)
  3. Even well-structured code with security intent (existing doRequest HTTPS guard) can have inconsistent enforcement on new methods

Comparison: Config B (Odd PR) Performance

This PR was Config B: Opus investigates, GPT-5 judges. However, the security finding came from the dedicated security-review-bot persona, not the configuration-selected models. The Config A/B test primarily tests model selection for investigator/judge roles, while the security reviewer runs independently of that framework.

Insight: Security findings are best captured by a domain-specialized persona, not by generalist model improvement. The Config A/B experiment measures general code quality signal; specialized security review is a complementary, separate signal.

  • Finding #77: CGN proxy SSRF catch (2026-05-14-cgn-proxy-ssrf-multimodel-catch.md) — similar pattern: security persona catches issue generalists missed
  • Finding #78: Dev loop effectiveness analysis — broader pipeline quality metrics

Lessons

  1. Specialized security reviewers outperform generalist "security-aware" reviewers — the explicit persona framing directs attention effectively
  2. HTTPS enforcement must be audited at every call site, not just the initial implementation
  3. Write-path methods deserve extra security scrutiny — read-only helpers often get security treatment that write helpers miss
  4. 30-minute fix cycle — multi-model pipeline enabled same-session detection, author fix, and re-verification without human involvement

Metadata

  • Repo: rodin/review-bot
  • PR: #131
  • Security finding review: ID 3745 (initial REQUEST_CHANGES) → ID 3758 (final APPROVED)
  • Fix duration: ~1530 minutes
  • Config: B (odd PR#)
  • Reviewers: sonnet-review-bot, gpt-review-bot, security-review-bot
  • Outcome: Security issue caught and fixed before merge