## Changes ### Go: IP-level SSRF protection in gitea.Client (primary defense) - Add gitea/ipcheck.go with IsBlockedIP() covering all blocked CIDR ranges: loopback (127.0.0.0/8, ::1), RFC1918 (10/8, 172.16/12, 192.168/16), link-local (169.254/16, fe80::/10), ULA (fc00::/7), CGN (100.64/10), multicast, reserved, and unspecified ranges. - IPv6-mapped IPv4 addresses (::ffff:x.x.x.x) are normalized before checking. - Add safeDialContext to gitea.Client: resolves DNS, rejects any IP in a blocked CIDR, then dials the resolved IP directly to narrow the DNS rebinding window. NewClient now uses this safe transport by default. - Add WithUnsafeDialer() for test code using httptest.Server (127.0.0.1). - Update NewTestClient helper in export_test.go for all gitea unit tests. - Update SetHTTPClient(nil) to restore the safe transport (not the plain one). ### Go: validate-url subcommand (defense-in-depth for future bash callers) - Add 'review-bot validate-url <url>' subcommand: validates https scheme, no user-info, resolves hostname, rejects any blocked IP. - Exit 0=safe, 1=blocked, 2=validation error/dns failure. - Add outWriter/errWriter vars to main.go for testable output capture. ### action.yml: Python3 IP check in 'Determine version' step - After the https scheme validation, resolve SERVER_URL hostname with socket.getaddrinfo and reject any result where ipaddress.ip_address(ip).is_private/is_loopback/is_link_local/etc. is true. - python3 is required on ubuntu-* runners (noted in existing comments). - Covers the version-check curl that sends ACTION_TOKEN to SERVER_URL. - SERVER_URL for install-step curls is covered by the same pre-check. ### Tests - gitea/ipcheck_test.go: 30+ cases covering all blocked families + public IPs - gitea/client_test.go: safe transport presence, WithUnsafeDialer, SSRF blocking - cmd/review-bot/validateurl_test.go: scheme validation, user-info, exit codes Closes #123
14 KiB
Plan: Issue #123 — SSRF Defense: IP-Level URL Validation
Problem
When running on a self-hosted Gitea runner, action.yml accepts a user-supplied vcs-url input and uses it to make HTTP requests (version lookup, binary download) from within the runner environment. An attacker who controls workflow inputs could point this URL at internal services (10.x.x.x, 169.254.x.x, 127.0.0.1, etc.) and exfiltrate data.
Current mitigations (scheme check + whitespace rejection in bash) are insufficient because they operate on the URL string, not on the resolved IP addresses. DNS rebinding and creative hostnames can bypass string-level checks.
The existing Go binary (review-bot) also makes requests to vcs-url directly via gitea.NewClient. That client has redirect protections but no IP-level SSRF defense.
What This Issue Does NOT Cover
- LLM base URLs / AI Core auth/API URLs — those are future work, out of scope for this issue.
- GitHub Actions path (
VCS_TYPE=github) — already usesgithub.api_url(trusted, not user-controlled), so SSRF is not applicable there.
Constraints
- No new dependencies.
CONVENTIONS.mdhas a strict allowlist;netis stdlib. - Must not break existing functionality for legitimate Gitea instances.
- Must handle DNS resolution correctly (some CIDRs block connection but resolve fine — we must check before connecting).
- Must work on both linux/amd64 and linux/arm64 runner environments.
- Go stdlib
netpackage provides everything needed.
Acceptance Criteria (from issue)
- Decide on approach → Go custom
DialContextingitea.Client - Implement URL validation that blocks requests to RFC1918/loopback/link-local IPs
- Block HTTP redirects to non-https or different hosts (already done in
defaultCheckRedirect) - Add tests
Approach Decision: Custom DialContext in gitea.Client
Three options were considered:
| Option | Pros | Cons |
|---|---|---|
Bash dig/getent + IP checks |
No Go change | Fragile, platform-dependent, bypass-prone, hard to test |
review-bot validate-url subcommand |
Reusable from bash | Two phases (validate-then-dial) has TOCTOU; adding a subcommand complicates main.go |
Custom DialContext in gitea.Client |
Atomic: resolves and validates IP in one step; no TOCTOU; testable in Go; no new deps | Defense-in-depth only in Go path, not bash curl calls |
Decision: Custom DialContext + bash-side pre-check for the install step.
Rationale:
- The
review-botbinary already makes the sensitive Gitea API calls. Protecting them at theDialContextlevel is the most robust defense — it's atomic and cannot be bypassed. - The bash
curlin the "Determine version" and "Install review-bot" steps makes requests to user-supplied URLs. The Go binary isn't involved there. Options:- Accept residual risk for the install-step curl calls (the token used is
reviewer-token, not a secrets-level token). - Add a
review-bot validate-urlsubcommand that the bash calls before each curl. - We add
review-bot validate-urlas a lightweight subcommand to protect the install-step curl calls too.
- Accept residual risk for the install-step curl calls (the token used is
Both defenses together (DialContext + validate-url subcommand) give full coverage. The validate-url path only needs to resolve the domain and check the resulting IPs — no additional state.
Design: validate-url Subcommand
review-bot validate-url <url> exits 0 if safe, non-zero + error message if blocked.
usage: review-bot validate-url <url>
Resolves <url> and checks that all resolved IP addresses are
routable (not RFC1918, loopback, link-local, or special-use).
Exit codes:
0 — URL is safe to use
1 — URL resolves to a blocked address
2 — URL validation error (bad scheme, malformed URL, DNS failure)
This command is tiny — no LLM logic, no Gitea auth. The bash in action.yml can call it as:
"${{ runner.temp }}/review-bot" validate-url "$SERVER_URL" || {
echo "Error: SERVER_URL resolves to a blocked/private IP" >&2
exit 1
}
Design: IP Validation Logic (Shared)
A single internal/ipcheck package (or unexported function in gitea) will:
- Parse the URL: extract host + port.
- Call
net.DefaultResolver.LookupIPAddr(ctx, host)to resolve all IPs. - For each IP, call
isBlockedIP(ip):- Loopback:
127.0.0.0/8,::1/128 - RFC1918:
10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 - Link-local:
169.254.0.0/16,fe80::/10 - Unique local (IPv6 ULA):
fc00::/7 - Multicast:
224.0.0.0/4,ff00::/8 - Loopback/unspecified:
0.0.0.0/8,::
- Loopback:
- If any resolved IP is blocked → reject.
- If zero IPs are returned → reject (NXDOMAIN shouldn't be allowed through).
- Also validate: scheme must be
https, no user info in URL.
This logic lives in a standalone file (gitea/ipcheck.go or cmd/review-bot/ipcheck.go) so it can be used from both the DialContext interceptor and the validate-url subcommand.
Why reject if ANY IP is blocked: An attacker using split-horizon DNS can return both a public IP and a private IP. If we only block "all IPs private", one real IP clears the check. We must block if any resolved IP is private.
Design: Custom DialContext in gitea.Client
// safeDialer wraps net.Dialer.DialContext with IP-level SSRF protection.
// It resolves the hostname and checks all IPs before connecting.
func safeDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
host, port, err := net.SplitHostPort(addr)
if err != nil {
return nil, fmt.Errorf("invalid address %q: %w", addr, err)
}
addrs, err := net.DefaultResolver.LookupIPAddr(ctx, host)
if err != nil {
return nil, fmt.Errorf("DNS lookup %q: %w", host, err)
}
for _, a := range addrs {
if isBlockedIP(a.IP) {
return nil, fmt.Errorf("blocked: %q resolves to private/reserved IP %s", host, a.IP)
}
}
// Re-dial using the first resolved IP directly to avoid double-resolve
d := &net.Dialer{Timeout: 10 * time.Second}
return d.DialContext(ctx, network, net.JoinHostPort(addrs[0].IP.String(), port))
}
NewClient applies this as the transport's DialContext:
transport := &http.Transport{
DialContext: safeDialContext,
}
return &Client{
...
http: &http.Client{
Timeout: 30 * time.Second,
Transport: transport,
CheckRedirect: defaultCheckRedirect,
},
}
Note: This only protects Gitea client calls. The llm and github clients are not changed in this PR (out of scope per issue).
Files to Change
| File | Change |
|---|---|
gitea/ipcheck.go (new) |
isBlockedIP(ip net.IP) bool + blocked CIDR list |
gitea/ipcheck_test.go (new) |
Tests for all blocked/allowed CIDRs |
gitea/client.go |
Add safeDialContext, apply to NewClient transport; add WithUnsafeDialer for tests |
gitea/client_test.go |
Tests for SSRF blocking in NewClient |
cmd/review-bot/main.go |
Add validate-url subcommand routing |
cmd/review-bot/validateurl.go (new) |
runValidateURL(args []string) error |
cmd/review-bot/validateurl_test.go (new) |
Unit tests for validate-url subcommand |
.gitea/actions/review/action.yml |
Call validate-url before the version-check and download curl calls |
State / Data Model
No persistent state. This is stateless request-time validation.
CIDR list is a compile-time constant slice — no runtime config.
Error Cases
| Case | Behavior |
|---|---|
| DNS resolution fails (NXDOMAIN) | Reject — do not connect |
| DNS returns empty IP list | Reject — do not connect |
| Any IP in blocked CIDR | Reject with log message showing hostname + IP |
| URL has http:// scheme | Reject at string level (validate-url) / irrelevant for DialContext (scheme is checked at bash level already) |
| User info in URL (user:pass@host) | Reject at string level in validate-url |
| IPv6 mapped IPv4 (::ffff:10.0.0.1) | Blocked via ip.To4() normalization before CIDR check |
| Unresolvable port | Handled by net.SplitHostPort error |
Edge Cases
- IPv6 mapped IPv4 addresses (
::ffff:192.168.1.1):net.IP.To4()returns the IPv4 form, which must be checked against IPv4 CIDRs. We normalize before checking. - Multiple A/AAAA records: One public + one private = blocked. Zero records = blocked.
- Default port (443 for https): URL may not include
:443. Thevalidate-urlsubcommand must use the URL's scheme to infer port for resolution (the hostname is enough for DNS check; port isn't needed for IP validation). - DialContext addr format:
net/httppasses"host:port"(already split), sonet.SplitHostPortworks without parsing the original URL. - Test isolation: Tests that hit local httptest servers need to bypass the safe dialer. Existing
SetHTTPClientmethod allows injecting a plain HTTP client in tests. For the DialContext specifically, we addWithUnsafeDialer() *Clientfor test-only use. - DNS rebinding: The safeDialContext checks IPs after resolution, immediately before connection. This is not 100% immune to DNS rebinding (an attacker can change DNS response between LookupIPAddr and DialContext), but combined with the validate-url pre-check in bash (which also resolves), the window is extremely narrow. Full immunity requires pinning the IP for the duration of the request, which is not practical here without a larger refactor.
Testing Strategy
Unit tests (gitea/ipcheck_test.go)
isBlockedIP: test each CIDR (10.0.0.1, 172.16.0.1, 192.168.0.1, 169.254.1.1, 127.0.0.1, ::1, fe80::1, fc00::1)- Public IPs: 8.8.8.8, 1.1.1.1, 2001:4860:4860::8888 → not blocked
- IPv6-mapped IPv4:
net.IP{0,0,0,0,0,0,0,0,0,0,0xff,0xff,192,168,1,1}→ blocked
Unit tests (gitea/client_test.go)
TestSafeDialerBlocksPrivateIP: mock resolver returns 127.0.0.1 → client returns error, no connection madeTestSafeDialerBlocksRFC1918: mock resolver returns 10.0.0.1TestSafeDialerAllowsPublicIP: public IP passes (connect to httptest server via 127.0.0.1 bypassed by WithUnsafeDialer)
Unit tests (cmd/review-bot/validateurl_test.go)
- Table-driven: various URLs (http://, https with private hostname, malformed) → expected exit/error
- DNS resolution mocked via dependency injection or by testing against known-bad/good URLs
Integration tests
- Existing tests use
SetHTTPClientwithhttptest— they pass a plain HTTP client that bypasses safe dialing. No change needed there.
Backward Compatibility
-
No flag changes, no env var changes, no action input changes.
-
NewClientnow uses a safe HTTP transport by default — this is a breaking change in behavior for callers that were previously able to point the client at localhost. But callers that do so are tests — they already useSetHTTPClientto inject a test transport. TheWithUnsafeDialerescape hatch for tests makes this clean. -
The
validate-urlsubcommand inaction.ymlruns before the binary is installed (in the "Determine version" step). We call it after the binary is installed (cache step) or at the start of "Install review-bot". Actually — the binary is installed in the "Install review-bot" step, but "Determine version" runs first and that's where the sensitive version-check curl happens. Therefore:- "Determine version" keeps existing bash scheme validation (pre-install, no binary available)
- "Install review-bot" calls
validate-urlbefore any curl - "Run review" is protected by the Go binary's DialContext
- This means version-check curl is not fully IP-protected — it uses the existing scheme-only validation. This is an acceptable residual risk: the attacker would only leak the Gitea API version response, not any secrets (no token is sent in the unauthenticated path; authenticated path sends reviewer-token which is the user's own token for their own server).
Actually, re-reading action.yml: the version check step DOES send
ACTION_TOKENtoAPI_URL. So we should protect it. Options:- Embed a lightweight IP check in bash using
python3(available on ubuntu runners per comments in action.yml) - Or restructure to do validate-url after install, skipping the cache step for the first call
Decision: Add a python3 IP check helper for the version-check curl. Python3's
socket.getaddrinfocan resolve and check IPs without any external dep. Keep it minimal (just loopback + RFC1918 + link-local check). This is one more line in an already-bash-heavy step, documented clearly.
Open Questions
-
The DNS rebinding window in
safeDialContextcan't be fully closed without IP pinning (binding the connection to the resolved IP). Is pinning worth the added complexity? Currently we dialaddrs[0]directly after resolution which is a close approximation. This is our approach — worth noting in code comments. -
Should we also protect the
llmclient? The issue says "action.yml Gitea path" so explicitly not in scope, butllm-base-urlis also user-supplied. Suggest filing a follow-up. Not addressed here. -
The python3 bash helper for the version-check step: should it also check IPv6? Yes — add
ipaddress.ip_address(ip).is_privatewhich handles both in Python 3.3+.
Completion Checklist
Generated for this specific task:
isBlockedIPcovers all CIDR families: loopback, RFC1918, link-local, ULA, multicast, unspecified- IPv6-mapped IPv4 addresses are normalized before CIDR check
- ANY blocked IP in the resolved set rejects (not just all-blocked)
- Zero resolved IPs rejects
safeDialContextdials the resolved IP directly (avoids second DNS lookup that could be different)- Test isolation: existing tests using httptest still pass (via
WithUnsafeDialerorSetHTTPClient) validate-urlsubcommand exits non-zero for private IPs, malformed URLs, http:// schemeaction.ymlcallsvalidate-urlbefore each curl that uses user-supplied URLgo test ./...passesgo vet ./...passesscripts/check-deps.shpasses (no new deps)