security-patterns/input-validation.md
Rodin 647928a0a1 Initial commit: 9 security patterns for code review
Fundamentals: secure-defaults, input-validation, credential-handling, audit-logging
Identity: authentication, authorization
Attack Prevention: injection-prevention, dos-prevention, prompt-injection
2026-05-10 22:45:03 -07:00

Input Validation

Rule

Validate all input. Allowlist > blocklist.

Source: OWASP Input Validation Cheat Sheet

Correct Pattern

import re
from typing import Optional

# Allowlist: only permit known-good patterns
VALID_USERNAME = re.compile(r'^[a-zA-Z0-9_]{3,20}$')
VALID_EMAIL = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def validate_username(username: str) -> Optional[str]:
    """Return sanitized username or None if invalid."""
    if not username:
        return None
    username = username.strip()
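    # fullmatch requires the whole string to match; '$' alone also matches before a trailing newline
    if VALID_USERNAME.fullmatch(username):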
        return username
    return None

def validate_positive_int(value: str, max_value: int = 10000) -> Optional[int]:
    """Parse and validate positive integer with upper bound."""
    try:
        n = int(value)
        if 0 < n <= max_value:
            return n
    except (ValueError, TypeError):
        pass
    return None
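
One caveat on `validate_positive_int`: Python's `int()` also accepts surrounding whitespace, a leading sign, underscores between digits, and non-ASCII decimal digits. Where only plain ASCII digits should pass, a stricter variant might look like the sketch below. The name `parse_ascii_positive_int` and the 10-digit cap are illustrative assumptions, not part of the original pattern.

```python
import re
from typing import Optional

# Assumption: IDs are plain ASCII decimal digits, at most 10 characters long.
ASCII_DIGITS = re.compile(r'[0-9]{1,10}')

def parse_ascii_positive_int(value: str, max_value: int = 10000) -> Optional[int]:
    """Hypothetical stricter parser: check the text form first, then the range."""
    if not isinstance(value, str) or not ASCII_DIGITS.fullmatch(value):
        return None
    n = int(value)
    return n if 0 < n <= max_value else None
```

With this variant, inputs such as `" 42"`, `"+42"`, and `"1_000"` are rejected before `int()` is ever called, while `"42"` still parses.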

Incorrect Pattern

# Wrong: blocklist approach (attackers find bypasses)
def sanitize(s):
    bad = ["<script>", "DROP TABLE", "../"]
    for b in bad:
        s = s.replace(b, "")
    return s

# Wrong: trusting input without validation
def get_user(user_id):
    return db.query(f"SELECT * FROM users WHERE id = {user_id}")

# Wrong: regex that allows too much
VALID_PATH = re.compile(r'.*')  # Matches anything!

# Wrong: validation after use
def process(data):
    result = expensive_operation(data)  # Already used!
    if not is_valid(data):
        raise ValueError("Invalid")
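
To make the first two failures concrete, the sketch below shows nested payloads surviving the blocklist `sanitize` above, then a parameterized query as the usual alternative to string-built SQL. It assumes the standard-library `sqlite3` module with an in-memory database; the `users` table, its columns, and the name `get_user_safe` are made up for illustration.

```python
import sqlite3

def sanitize(s):
    """The blocklist from the incorrect pattern, repeated here to show a bypass."""
    bad = ["<script>", "DROP TABLE", "../"]
    for b in bad:
        s = s.replace(b, "")
    return s

# Removing the inner match reassembles the very string being blocked.
nested_tag = sanitize("<scr<script>ipt>")
nested_path = sanitize("....//")

# Hypothetical schema, in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))

def get_user_safe(conn, user_id):
    """Bind user_id as a parameter so it is treated as data, never as SQL text."""
    cur = conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,))
    return cur.fetchone()
```

Here `nested_tag` ends up as `"<script>"` and `nested_path` as `"../"`, which is why a single-pass blocklist gives little protection. Parameter binding does not replace validation; it only keeps input from being interpreted as SQL.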

Validation at Boundaries

Validate at every trust boundary:

# API endpoint — first line of defense
@app.route("/users/<user_id>")
def get_user(user_id: str):
    validated_id = validate_positive_int(user_id)
    if validated_id is None:
        return {"error": "invalid_user_id"}, 400
    
    return user_service.get(validated_id)

# Service layer — defense in depth
class UserService:
    def get(self, user_id: int) -> User:
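        # Explicit check: assert statements are stripped when Python runs with -O
        if not isinstance(user_id, int) or user_id <= 0:
            raise ValueError("user_id must be a positive integer")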
        return self.repo.find(user_id)

Type Coercion Attacks

# Wrong: loose equality / type confusion
if user_input == 0:  # "0" == 0 is true in loosely typed languages (e.g. PHP, JavaScript)
    grant_admin()

# Correct: strict type checking
# (type(...) is int also rejects bool, which isinstance would accept)
if type(user_input) is int and user_input == 0:
    ...
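
A Python-specific wrinkle is worth spelling out: `bool` is a subclass of `int`, so `False` passes an `isinstance(..., int)` check and compares equal to `0`. The sketch below shows the difference; the helper name `is_strict_int` is hypothetical.

```python
def is_strict_int(value) -> bool:
    """True only for plain int values; bool and other int subclasses are rejected."""
    return type(value) is int

# Python does not coerce strings to numbers on comparison.
string_equals_zero = ("0" == 0)

# bool passes isinstance() and compares equal to 0.
bool_passes_isinstance = isinstance(False, int)
bool_equals_zero = (False == 0)
```

A JSON body containing `false` where a number is expected would therefore pass the looser check, while `is_strict_int(False)` returns `False`.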

Edge Cases

  • Unicode normalization attacks (homoglyphs)
  • Null byte injection (file.txt\x00.jpg)
  • Integer overflow on length checks
  • Locale-dependent parsing (1,000 vs 1.000)
  • JSON vs form encoding differences
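
Three of these cases can be sketched in a few lines of Python: normalizing before validating, rejecting null bytes, and refusing locale-style number formats. This is a rough illustration under stated assumptions, not a complete defense. All function names are hypothetical, and the username rule simply reuses the allowlist from the correct pattern.

```python
import re
import unicodedata
from decimal import Decimal
from typing import Optional

USERNAME = re.compile(r'[a-zA-Z0-9_]{3,20}')
PLAIN_DECIMAL = re.compile(r'[0-9]+(\.[0-9]+)?')

def normalize_then_validate(raw: str) -> Optional[str]:
    """Apply NFKC first, then the ASCII allowlist, so the stored form is canonical."""
    normalized = unicodedata.normalize("NFKC", raw)
    return normalized if USERNAME.fullmatch(normalized) else None

def contains_null_byte(name: str) -> bool:
    """Flag names like 'file.txt\\x00.jpg' that some lower layers would truncate."""
    return "\x00" in name

def parse_plain_decimal(raw: str) -> Optional[Decimal]:
    """Accept only digits with an optional '.' fraction; no thousands separators."""
    return Decimal(raw) if PLAIN_DECIMAL.fullmatch(raw) else None
```

Under these assumptions, fullwidth `"\uff41\uff44\uff4d\uff49\uff4e"` normalizes to `"admin"` before it is checked, so a uniqueness check compares the same form that gets stored. A Cyrillic lookalike such as `"\u0430dmin"` is unchanged by NFKC and is rejected by the ASCII allowlist. `parse_plain_decimal("1,000")` returns `None` instead of guessing the locale.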