finding(79): multi-model security review catches HTTPS bypass in GitHub client (PR #131)

Security-review-bot persona caught HTTPS enforcement inconsistency in write-path methods (PostReview, DeleteReview, RequestReviewer) that generalist reviewers missed. Issue fixed within 30 minutes, all reviewers re-approved. Validates specialized security persona value in multi-model pipeline.
2026-05-14 21:56:58 +00:00
parent 643a804bdf
commit 8e64f8f012
1 changed files with 67 additions and 0 deletions
@@ -0,0 +1,67 @@
+# Finding #79: Multi-Model Security Review Catches HTTPS Bypass in GitHub Client (PR #131)
+
+**Date:** 2026-05-14
+**Task:** Security review of GitHub API client implementation in review-bot PR #131 (feat: implement GitHub API methods and VCS routing).
+**Task type:** Security code review — HTTPS enforcement consistency in Go HTTP client.
+**Config:** Config B (odd PR# 131 → Opus investigates, GPT-5 judges)
+
+## Summary
+
+The multi-model security reviewer (`security-review-bot`) identified a critical HTTPS enforcement bypass in the GitHub API client implementation. Three methods (`PostReview`, `DeleteReview`, `RequestReviewer`) constructed and executed HTTP requests directly via `c.httpClient.Do` without using the `doRequest` helper that enforces HTTPS-only connections. This meant that if the base URL was misconfigured with `http://` (e.g., GitHub Enterprise Server on HTTP), the `Authorization` header containing API tokens would be transmitted in plaintext.
+
+The finding was flagged as **REQUEST_CHANGES** at review ID 3745 (2026-05-14T20:46:02Z). The author fixed the issue within 30 minutes (fix applied ~21:00 UTC), and all reviewers re-approved at 21:12–21:13 UTC.
+
+## Reviewer Behavior
+
+| Reviewer | Initial State (20:45–20:46) | Final State (21:12–21:13) |
+|----------|-----------------------------|---------------------------|
+| sonnet-review-bot | APPROVED | APPROVED |
+| gpt-review-bot | APPROVED | APPROVED |
+| security-review-bot | **REQUEST_CHANGES** | APPROVED |
+
+**Key observation:** The standard reviewers (sonnet, gpt) both missed the HTTPS enforcement inconsistency. Only the dedicated `security-review-bot` persona caught it. This is a direct validation of the value of specialized security reviewer personas in a multi-model pipeline.
+
+## The Finding (Verbatim)
+
+From review ID 3745 (`security-review-bot`):
+
+> **PostReview, DeleteReview, and RequestReviewer construct and execute HTTP requests directly (`c.httpClient.Do`) without the HTTPS-only guard used in `doRequest`. If `baseURL` is accidentally configured with an `http://` scheme (e.g., GHES on HTTP), these methods will send the Authorization header over plaintext, exposing tokens. Enforce scheme checks consistently or reuse a common request helper for non-GET methods.**
+
+Severity: **[MAJOR]** — Token exposure via HTTPS enforcement inconsistency.
+
+Recommended fix: Refactor write-path methods to use a generalized version of `doRequest` that accepts a request body and validates the URL scheme before executing.
+
+## Evidence of Impact
+
+The fix was applied and verified. Security review passed after fix. This demonstrates:
+1. The security persona provides unique coverage that generalist reviewers do not
+2. Request/response method paths are higher risk for HTTPS bypass than read paths (different code paths)
+3. Even well-structured code with security intent (existing `doRequest` HTTPS guard) can have inconsistent enforcement on new methods
+
+## Comparison: Config B (Odd PR) Performance
+
+This PR was Config B: Opus investigates, GPT-5 judges. However, the security finding came from the dedicated `security-review-bot` persona, not the configuration-selected models. The Config A/B test primarily tests *model selection for investigator/judge roles*, while the security reviewer runs independently of that framework.
+
+**Insight:** Security findings are best captured by a domain-specialized persona, not by generalist model improvement. The Config A/B experiment measures general code quality signal; specialized security review is a complementary, separate signal.
+
+## Related Findings
+
+- **Finding #77:** CGN proxy SSRF catch (`2026-05-14-cgn-proxy-ssrf-multimodel-catch.md`) — similar pattern: security persona catches issue generalists missed
+- **Finding #78:** Dev loop effectiveness analysis — broader pipeline quality metrics
+
+## Lessons
+
+1. **Specialized security reviewers outperform generalist "security-aware" reviewers** — the explicit persona framing directs attention effectively
+2. **HTTPS enforcement must be audited at every call site**, not just the initial implementation
+3. **Write-path methods deserve extra security scrutiny** — read-only helpers often get security treatment that write helpers miss
+4. **30-minute fix cycle** — multi-model pipeline enabled same-session detection, author fix, and re-verification without human involvement
+
+## Metadata
+
+- **Repo:** rodin/review-bot
+- **PR:** #131
+- **Security finding review:** ID 3745 (initial REQUEST_CHANGES) → ID 3758 (final APPROVED)
+- **Fix duration:** ~15–30 minutes
+- **Config:** B (odd PR#)
+- **Reviewers:** sonnet-review-bot, gpt-review-bot, security-review-bot
+- **Outcome:** Security issue caught and fixed before merge