Add supply-chain, deserialization, cryptography, error-handling patterns

Now covers all OWASP Top 10:2025 categories:
- A03: supply-chain.md (SolarWinds, Bybit, npm worm examples)
- A04: cryptography.md (algorithm recommendations, key management)
- A08: deserialization.md (pickle, yaml, language-specific risks)
- A10: error-handling.md (fail closed, error messages)
2026-05-10 22:48:39 -07:00
parent 647928a0a1
commit 8a94a08511
5 changed files with 641 additions and 16 deletions
@@ -0,0 +1,151 @@
+# Insecure Deserialization
+
+## Rule
+
+Never deserialize untrusted data without validation. Prefer data-only formats.
+
+**Source:** [OWASP Top 10 2025 - A08 Software or Data Integrity Failures](https://owasp.org/Top10/2025/A08_2025-Software_or_Data_Integrity_Failures/)
+
+## Why It's Dangerous
+
+Deserializing attacker-controlled data can:
+- Execute arbitrary code
+- Instantiate arbitrary objects
+- Bypass authentication
+- Cause denial of service
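
To make the first bullet concrete, here is a minimal sketch of how Python's `pickle` ends up calling a function chosen by whoever produced the bytes. The `Payload` class is hypothetical, and the expression passed to `eval` is deliberately harmless; a real attacker would substitute something like a shell command.

```python
import pickle

class Payload:
    """Hypothetical attacker-defined class."""
    def __reduce__(self):
        # pickle records (callable, args); the loader later calls callable(*args).
        return (eval, ("6 * 7",))

malicious_bytes = pickle.dumps(Payload())

# Loading does not rebuild a Payload instance. It calls eval("6 * 7")
# and returns whatever that call produced.
result = pickle.loads(malicious_bytes)
print(result)
```

The receiver never needs to define `Payload`: the pickle stream only names the callable and its arguments, which is why loading untrusted pickle data amounts to running untrusted code.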
+
+## Correct Pattern
+
+```python
+import json
+from dataclasses import dataclass
+from typing import Any
+
+# Prefer data-only formats (JSON, not pickle)
+def safe_deserialize(data: str) -> dict:
+    """Deserialize JSON (data-only, no code execution)."""
+    return json.loads(data)
+
+# Validate structure after deserialization
+@dataclass
+class UserInput:
+    name: str
+    email: str
+    age: int
+
+def parse_user_input(raw: str) -> UserInput:
+    data = json.loads(raw)
+    
+    # Validate required fields
+    if not isinstance(data.get("name"), str):
+        raise ValueError("Invalid name")
+    if not isinstance(data.get("email"), str):
+        raise ValueError("Invalid email")
+    if not isinstance(data.get("age"), int):
+        raise ValueError("Invalid age")
+    
+    return UserInput(
+        name=data["name"],
+        email=data["email"],
+        age=data["age"]
+    )
+
+# If you must use object serialization, allowlist classes by (module, name).
+# Matching on the class name alone would accept a same-named attribute from any module.
+# "myapp.models" is a placeholder module path.
+ALLOWED_CLASSES = {
+    ("myapp.models", "User"),
+    ("myapp.models", "Order"),
+    ("myapp.models", "Product"),
+}
+
+def safe_unpickle(data: bytes, allowed: set[tuple[str, str]]) -> Any:
+    """Restricted unpickler that only allows specific classes."""
+    import pickle
+    import io
+    
+    class RestrictedUnpickler(pickle.Unpickler):
+        def find_class(self, module, name):
+            if (module, name) not in allowed:
+                raise pickle.UnpicklingError(f"Class {module}.{name} not allowed")
+            return super().find_class(module, name)
+    
+    return RestrictedUnpickler(io.BytesIO(data)).load()
+```
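
One way to sanity-check an allowlisting unpickler is to feed it one permitted object and one payload that references a disallowed callable. The sketch below is self-contained and assumes an allowlist keyed on `(module, name)` pairs. `datetime.date` stands in for an application class, and `Sneaky`, `AllowlistUnpickler`, and `load_restricted` are hypothetical names.

```python
import datetime
import io
import pickle

ALLOWED = {("datetime", "date")}  # stand-in for application classes

class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"{module}.{name} not allowed")
        return super().find_class(module, name)

def load_restricted(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()

class Sneaky:
    """Hypothetical payload that asks the loader to call eval."""
    def __reduce__(self):
        return (eval, ("6 * 7",))

# An allowlisted class round-trips normally.
ok = load_restricted(pickle.dumps(datetime.date(2025, 1, 1)))

# The payload references builtins.eval, which is not on the allowlist.
try:
    load_restricted(pickle.dumps(Sneaky()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(ok, blocked)
```

An allowlist narrows what the stream can reference, but every class on it still has to be safe to construct from attacker-chosen arguments, so a data-only format remains the simpler option.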
+
+## Incorrect Pattern
+
+```python
+import pickle
+import yaml
+
+# Wrong: pickle from untrusted source
+def load_session(cookie_value: bytes):
+    return pickle.loads(cookie_value)  # RCE!
+
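+# Wrong: yaml.load with an unsafe loader (can execute code)
+# PyYAML 6+ requires an explicit Loader argument; a bare yaml.load(s) raises TypeError there
+def load_config(yaml_string: str):
+    return yaml.load(yaml_string, Loader=yaml.UnsafeLoader)  # Should be yaml.safe_load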
+
+# Wrong: eval/exec on user data
+def parse_expression(expr: str):
+    return eval(expr)  # Arbitrary code execution
+
+# Wrong: deserializing without validation
+def process_request(data: bytes):
+    obj = pickle.loads(data)
+    obj.execute()  # No type checking!
+```
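
For the `eval` case, the standard library offers `ast.literal_eval`, which parses Python literals without executing code. The sketch below assumes the input only ever needs to be a literal such as a number, string, list, or dict. It does not evaluate general expressions, so it is not a drop-in replacement for every use of `eval`. The function name `parse_literal` is hypothetical.

```python
import ast

def parse_literal(expr: str):
    """Parse a Python literal without executing any code."""
    try:
        return ast.literal_eval(expr)
    except (ValueError, SyntaxError) as exc:
        # Function calls, attribute access, names, etc. are rejected.
        raise ValueError("Input is not a plain literal") from exc

print(parse_literal("[1, 2, {'a': 3}]"))

try:
    parse_literal("__import__('os').getcwd()")
except ValueError as err:
    print("rejected:", err)
```

Very large or deeply nested input can still exhaust memory or the interpreter stack, so it is worth capping the input size before parsing.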
+
+## Language-Specific Risks
+
+| Language | Dangerous | Safe Alternative |
+|----------|-----------|------------------|
+| Python | `pickle.loads()` | JSON, restricted unpickler |
+| Java | `ObjectInputStream` | JSON, allowlisted classes |
+| PHP | `unserialize()` | `json_decode()` |
+| Ruby | `Marshal.load()` | JSON, `YAML.safe_load` |
+| JavaScript | `eval()` on JSON text | `JSON.parse()` |
+| .NET | `BinaryFormatter` | `JsonSerializer` |
+
+## YAML Specific
+
+```python
+import yaml
+
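+# Wrong: an unsafe loader can construct arbitrary Python objects
+# (PyYAML 6+ requires an explicit Loader; before 5.1, a bare yaml.load(s) defaulted to the unsafe one)
+data = yaml.load(untrusted_yaml, Loader=yaml.UnsafeLoader)  # Can execute code!
+# Attack: "!!python/object/apply:os.system ['rm -rf /']"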
+
+# Correct: yaml.safe_load only allows basic types
+data = yaml.safe_load(untrusted_yaml)
+```
+
+## Signature Verification
+
+If you must accept serialized objects, authenticate them with an HMAC and verify it before deserializing:
+
+```python
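+import hmac
+import hashlib
+from typing import Any
+
+# get_secret() is an application-specific helper (not defined here); it must return bytes
+SECRET_KEY = get_secret("serialization_key")
+
+class SecurityError(Exception):
+    """Raised when signed data fails verification."""
+
+def sign_data(data: bytes) -> bytes:
+    """Prepend an HMAC-SHA256 signature to serialized data."""
+    signature = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
+    return signature + data
+
+def verify_and_load(signed_data: bytes) -> Any:
+    """Verify signature before deserializing."""
+    signature = signed_data[:32]  # SHA-256 digests are 32 bytes
+    data = signed_data[32:]
+    
+    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
+    if not hmac.compare_digest(signature, expected):
+        raise SecurityError("Invalid signature")
+    
+    # restricted_deserialize() stands for a safe loader, e.g. json.loads or a restricted unpickler
+    return restricted_deserialize(data)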
+```
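
The sign-then-verify flow can be exercised end to end with a small self-contained sketch. It assumes a JSON payload and a key generated on the spot with `secrets.token_bytes`, standing in for a key loaded from a secret store. `SignatureError`, `sign`, and `verify` are hypothetical names.

```python
import hashlib
import hmac
import json
import secrets

KEY = secrets.token_bytes(32)  # stand-in for a key from a secret store
SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256

class SignatureError(Exception):
    """Raised when the HMAC does not match the payload."""

def sign(payload: dict) -> bytes:
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(KEY, body, hashlib.sha256).digest() + body

def verify(blob: bytes) -> dict:
    sig, body = blob[:SIG_LEN], blob[SIG_LEN:]
    expected = hmac.new(KEY, body, hashlib.sha256).digest()
    # Constant-time comparison; parse only after the check passes.
    if not hmac.compare_digest(sig, expected):
        raise SignatureError("signature mismatch")
    return json.loads(body)

blob = sign({"user_id": 7, "role": "viewer"})
print(verify(blob))

# Changing the body without the key invalidates the signature.
tampered = blob[:SIG_LEN] + blob[SIG_LEN:].replace(b"viewer", b"admin")
try:
    verify(tampered)
except SignatureError:
    print("tampered payload rejected")
```

A valid signature only shows that the bytes came from a holder of the key. It does not make an unsafe format safe, so the verified body should still go through a data-only or restricted loader.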
+
+## Edge Cases
+
+Serialized data often arrives through channels that are easy to overlook:
+
+- Base64-encoded serialized data in cookies
+- Serialized objects in database fields
+- Message queues with serialized payloads
+- Session data in Redis/Memcached
+- Java RMI (Remote Method Invocation)
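
For the cookie case, a minimal sketch of a defensive decoder might look like the following. It assumes the cookie is meant to carry a base64-encoded JSON object. `MAX_COOKIE_BYTES` and `decode_cookie` are hypothetical names, and the size limit is an arbitrary illustrative value.

```python
import base64
import binascii
import json

MAX_COOKIE_BYTES = 4096  # assumed limit; tune for your application

def decode_cookie(value: str) -> dict:
    """Decode a base64 cookie into a JSON object, rejecting anything else."""
    if len(value) > MAX_COOKIE_BYTES:
        raise ValueError("cookie too large")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(value, validate=True)
        data = json.loads(raw)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("malformed cookie") from exc
    if not isinstance(data, dict):
        raise ValueError("cookie must contain a JSON object")
    return data

good = base64.b64encode(b'{"theme": "dark"}').decode("ascii")
print(decode_cookie(good))
```

Because the decoder only ever calls `json.loads`, a base64-wrapped pickle payload fails to parse instead of being executed. This sketch does not authenticate the cookie, so tamper protection still needs a signature check.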