Files
model-research/findings/2026-05-14-github-client-https-bypass-multimodel-catch.md
T
Rodin 8e64f8f012 finding(79): multi-model security review catches HTTPS bypass in GitHub client (PR #131)
Security-review-bot persona caught HTTPS enforcement inconsistency in write-path
methods (PostReview, DeleteReview, RequestReviewer) that generalist reviewers missed.
Issue fixed within 30 minutes, all reviewers re-approved. Validates specialized
security persona value in multi-model pipeline.
2026-05-14 21:56:58 +00:00

68 lines
4.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Finding #79: Multi-Model Security Review Catches HTTPS Bypass in GitHub Client (PR #131)
**Date:** 2026-05-14
**Task:** Security review of GitHub API client implementation in review-bot PR #131 (feat: implement GitHub API methods and VCS routing).
**Task type:** Security code review — HTTPS enforcement consistency in Go HTTP client.
**Config:** Config B (odd PR# 131 → Opus investigates, GPT-5 judges)
## Summary
The multi-model security reviewer (`security-review-bot`) identified a critical HTTPS enforcement bypass in the GitHub API client implementation. Three methods (`PostReview`, `DeleteReview`, `RequestReviewer`) constructed and executed HTTP requests directly via `c.httpClient.Do` without using the `doRequest` helper that enforces HTTPS-only connections. This meant that if the base URL was misconfigured with `http://` (e.g., GitHub Enterprise Server on HTTP), the `Authorization` header containing API tokens would be transmitted in plaintext.
The finding was flagged as **REQUEST_CHANGES** at review ID 3745 (2026-05-14T20:46:02Z). The author fixed the issue within 30 minutes (fix applied ~21:00 UTC), and all reviewers re-approved at 21:1221:13 UTC.
## Reviewer Behavior
| Reviewer | Initial State (20:4520:46) | Final State (21:1221:13) |
|----------|-----------------------------|---------------------------|
| sonnet-review-bot | APPROVED | APPROVED |
| gpt-review-bot | APPROVED | APPROVED |
| security-review-bot | **REQUEST_CHANGES** | APPROVED |
**Key observation:** The standard reviewers (sonnet, gpt) both missed the HTTPS enforcement inconsistency. Only the dedicated `security-review-bot` persona caught it. This is a direct validation of the value of specialized security reviewer personas in a multi-model pipeline.
## The Finding (Verbatim)
From review ID 3745 (`security-review-bot`):
> **PostReview, DeleteReview, and RequestReviewer construct and execute HTTP requests directly (`c.httpClient.Do`) without the HTTPS-only guard used in `doRequest`. If `baseURL` is accidentally configured with an `http://` scheme (e.g., GHES on HTTP), these methods will send the Authorization header over plaintext, exposing tokens. Enforce scheme checks consistently or reuse a common request helper for non-GET methods.**
Severity: **[MAJOR]** — Token exposure via HTTPS enforcement inconsistency.
Recommended fix: Refactor write-path methods to use a generalized version of `doRequest` that accepts a request body and validates the URL scheme before executing.
## Evidence of Impact
The fix was applied and verified. Security review passed after fix. This demonstrates:
1. The security persona provides unique coverage that generalist reviewers do not
2. Request/response method paths are higher risk for HTTPS bypass than read paths (different code paths)
3. Even well-structured code with security intent (existing `doRequest` HTTPS guard) can have inconsistent enforcement on new methods
## Comparison: Config B (Odd PR) Performance
This PR was Config B: Opus investigates, GPT-5 judges. However, the security finding came from the dedicated `security-review-bot` persona, not the configuration-selected models. The Config A/B test primarily tests *model selection for investigator/judge roles*, while the security reviewer runs independently of that framework.
**Insight:** Security findings are best captured by a domain-specialized persona, not by generalist model improvement. The Config A/B experiment measures general code quality signal; specialized security review is a complementary, separate signal.
## Related Findings
- **Finding #77:** CGN proxy SSRF catch (`2026-05-14-cgn-proxy-ssrf-multimodel-catch.md`) — similar pattern: security persona catches issue generalists missed
- **Finding #78:** Dev loop effectiveness analysis — broader pipeline quality metrics
## Lessons
1. **Specialized security reviewers outperform generalist "security-aware" reviewers** — the explicit persona framing directs attention effectively
2. **HTTPS enforcement must be audited at every call site**, not just the initial implementation
3. **Write-path methods deserve extra security scrutiny** — read-only helpers often get security treatment that write helpers miss
4. **30-minute fix cycle** — multi-model pipeline enabled same-session detection, author fix, and re-verification without human involvement
## Metadata
- **Repo:** rodin/review-bot
- **PR:** #131
- **Security finding review:** ID 3745 (initial REQUEST_CHANGES) → ID 3758 (final APPROVED)
- **Fix duration:** ~1530 minutes
- **Config:** B (odd PR#)
- **Reviewers:** sonnet-review-bot, gpt-review-bot, security-review-bot
- **Outcome:** Security issue caught and fixed before merge