Insecure Deserialization
Rule
Never deserialize untrusted data without validation. Prefer data-only formats.
Source: OWASP Top 10 2025 - A08 Software or Data Integrity Failures
Why It's Dangerous
Deserialization can:
- Execute arbitrary code
- Instantiate arbitrary objects
- Bypass authentication
- Cause denial of service
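To make the first bullet concrete, here is a minimal and deliberately harmless sketch of how `pickle` runs a callable at load time. The `Payload` class and the `6 * 7` expression are illustrative assumptions; a real attacker would substitute something like `os.system`.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled object."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        # The callable and its arguments are chosen by whoever built the bytes.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The receiver only calls loads(), yet eval() runs inside it.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Note that the loaded value is not a `Payload` at all: it is whatever the embedded call returned, so type checks performed after `pickle.loads()` come too late.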
Correct Pattern
import json
from dataclasses import dataclass
from typing import Any

# Prefer data-only formats (JSON, not pickle)
def safe_deserialize(data: str) -> dict:
    """Deserialize JSON (data-only, no code execution)."""
    return json.loads(data)

# Validate structure after deserialization
@dataclass
class UserInput:
    name: str
    email: str
    age: int

def parse_user_input(raw: str) -> UserInput:
    data = json.loads(raw)
    # json.loads can also return a list, string, or number; require an object
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    # Validate required fields
    if not isinstance(data.get("name"), str):
        raise ValueError("Invalid name")
    if not isinstance(data.get("email"), str):
        raise ValueError("Invalid email")
    # bool is a subclass of int, so reject True/False explicitly
    age = data.get("age")
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("Invalid age")
    return UserInput(
        name=data["name"],
        email=data["email"],
        age=age,
    )

# If you must use object serialization, allowlist classes by (module, name).
# Matching on the name alone would accept a same-named class from any module.
# "myapp.models" is a placeholder module path.
ALLOWED_CLASSES = {
    ("myapp.models", "User"),
    ("myapp.models", "Order"),
    ("myapp.models", "Product"),
}

def safe_unpickle(data: bytes, allowed: set[tuple[str, str]]) -> Any:
    """Restricted unpickler that only allows specific classes."""
    import pickle
    import io

    class RestrictedUnpickler(pickle.Unpickler):
        def find_class(self, module, name):
            if (module, name) not in allowed:
                raise pickle.UnpicklingError(f"Class {module}.{name} not allowed")
            return super().find_class(module, name)

    return RestrictedUnpickler(io.BytesIO(data)).load()
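As a rough check of the allowlist idea, the sketch below feeds a restricted unpickler one benign and one hostile payload. The names `AllowlistUnpickler`, `restricted_loads`, and `Hostile` are hypothetical, and allowing only `collections.OrderedDict` is an assumption made so the demo runs with the standard library alone.

```python
import io
import pickle
from collections import OrderedDict

# Assumption for this demo: exactly one class is permitted.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"{module}.{name} not allowed")
        return super().find_class(module, name)


def restricted_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Hostile:
    def __reduce__(self):
        # Tries to smuggle a call to builtins.eval into the stream.
        return (eval, ("6 * 7",))


# Benign payload: references only collections.OrderedDict.
good = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# Hostile payload: rejected before eval is ever looked up.
try:
    restricted_loads(pickle.dumps(Hostile()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(good, blocked)
```

An allowlist narrows what can be instantiated but does not make pickle a safe format; every allowed class becomes part of the attack surface, so a data-only format remains the better choice where possible.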
Incorrect Pattern
import pickle
import yaml

# Wrong: pickle from untrusted source
def load_session(cookie_value: bytes):
    return pickle.loads(cookie_value)  # RCE!

# Wrong: yaml.load with an unsafe loader (can execute code)
# PyYAML 6.0+ requires the Loader argument to be passed explicitly
def load_config(yaml_string: str):
    return yaml.load(yaml_string, Loader=yaml.Loader)  # Should be yaml.safe_load

# Wrong: eval/exec on user data
def parse_expression(expr: str):
    return eval(expr)  # Arbitrary code execution

# Wrong: deserializing without validation
def process_request(data: bytes):
    obj = pickle.loads(data)
    obj.execute()  # No type checking!
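The `eval` case above has no counterpart in the correct patterns, so here is one possible replacement, assuming the input is only ever meant to be a Python literal. `parse_literal` is a hypothetical helper name; `ast.literal_eval` is the standard-library call doing the work.

```python
import ast


def parse_literal(expr: str):
    """Parse numbers, strings, lists, dicts, and similar literals without executing code."""
    return ast.literal_eval(expr)


# Plain literals are parsed into Python values.
value = parse_literal("[1, 2, {'k': 3.5}]")

# Anything containing a call, attribute access, or name lookup is refused.
try:
    parse_literal("__import__('os').system('id')")
    rejected = False
except ValueError:
    rejected = True

print(value, rejected)
```

`ast.literal_eval` does not run code, but very large or deeply nested input can still exhaust memory or the interpreter stack, so a length limit on untrusted input is still advisable.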
Language-Specific Risks
| Language | Dangerous | Safe Alternative |
|---|---|---|
| Python | pickle.loads() | JSON, restricted unpickler |
| Java | ObjectInputStream | JSON, allowlisted classes |
| PHP | unserialize() | json_decode() |
| Ruby | Marshal.load() | JSON, YAML.safe_load |
| JavaScript | eval(JSON) | JSON.parse() |
| .NET | BinaryFormatter | JsonSerializer |
YAML Specific
import yaml

# Wrong: yaml.load with the full Loader allows arbitrary Python objects
# (PyYAML 6.0+ requires the Loader argument to be passed explicitly)
data = yaml.load(untrusted_yaml, Loader=yaml.Loader)  # Can execute code!
# Attack: "!!python/object/apply:os.system ['rm -rf /']"

# Correct: yaml.safe_load only allows basic types
data = yaml.safe_load(untrusted_yaml)
Signature Verification
If you must accept serialized objects:
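import hmac
import hashlib
from typing import Any

# get_secret is a placeholder for your secret store; the key must be bytes
SECRET_KEY = get_secret("serialization_key")
SIGNATURE_LENGTH = hashlib.sha256().digest_size  # 32 bytes

class SecurityError(Exception):
    """Raised when signed data fails verification."""

def sign_data(data: bytes) -> bytes:
    """Sign serialized data."""
    signature = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return signature + data

def verify_and_load(signed_data: bytes) -> Any:
    """Verify signature before deserializing."""
    signature = signed_data[:SIGNATURE_LENGTH]
    data = signed_data[SIGNATURE_LENGTH:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise SecurityError("Invalid signature")
    # restricted_deserialize is a placeholder, e.g. a restricted unpickler
    return restricted_deserialize(data)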
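A self-contained round trip may help show what the signature does and does not buy. This sketch swaps in JSON as the payload format and uses a hard-coded `DEMO_KEY`; both are assumptions for illustration, and the helper names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: a fixed demo key. Real keys should come from a secret store.
DEMO_KEY = b"demo-key-do-not-use-in-production"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(payload: dict) -> bytes:
    """Serialize to JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    tag = hmac.new(DEMO_KEY, body, hashlib.sha256).digest()
    return tag + body


def verify(blob: bytes) -> dict:
    """Check the tag in constant time, then parse the body."""
    tag, body = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(DEMO_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Invalid signature")
    return json.loads(body)


signed = sign({"user_id": 7, "role": "viewer"})
restored = verify(signed)

# Change the body but keep the old tag, as an attacker without the key would.
tampered = signed[:TAG_LEN] + signed[TAG_LEN:].replace(b"viewer", b"admin")
try:
    verify(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True

print(restored, tamper_detected)
```

The signature only proves the bytes came from a holder of the key. If the key leaks, or if a trusted producer can be tricked into signing attacker-influenced objects, the deserializer is exposed again, which is why the verified data should still go through a restricted or data-only parser.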
Edge Cases
- Base64-encoded serialized data in cookies
- Serialized objects in database fields
- Message queues with serialized payloads
- Session data in Redis/Memcached
- Java RMI (Remote Method Invocation)
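When auditing the locations listed above, one possible starting point is to look for the well-known headers of native serialization formats inside Base64 values. The sketch below is a heuristic only, and `sniff_serialized` is a hypothetical helper; it recognizes the Java serialization stream header and the pickle `PROTO` opcode, and nothing else.

```python
import base64
import binascii
import pickle
from typing import Optional

# Java serialization streams start with these bytes ("rO0AB" once Base64-encoded).
JAVA_MAGIC = b"\xac\xed\x00\x05"


def sniff_serialized(value: str) -> Optional[str]:
    """Return 'java' or 'pickle' if a Base64 value looks like a serialized object."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None  # not valid Base64
    if raw.startswith(JAVA_MAGIC):
        return "java"
    # Pickle protocol 2 and later begins with opcode 0x80 plus the protocol number.
    if len(raw) >= 2 and raw[0] == 0x80 and 2 <= raw[1] <= pickle.HIGHEST_PROTOCOL:
        return "pickle"
    return None


pickled_cookie = base64.b64encode(pickle.dumps({"a": 1})).decode("ascii")
java_cookie = base64.b64encode(JAVA_MAGIC + b"sr").decode("ascii")
json_cookie = base64.b64encode(b'{"a": 1}').decode("ascii")

print(sniff_serialized(pickled_cookie))
print(sniff_serialized(java_cookie))
print(sniff_serialized(json_cookie))
```

A match is a reason to inspect the code path that reads the value, not proof of a vulnerability, and a miss proves nothing: older pickle protocols and other formats carry no such header.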