# Audit Logging

## Rule

Log security-relevant events. Never log secrets.

**Source:** [OWASP Logging Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Logging_Cheat_Sheet.html)

## What to Log

| Event | Log Level | Required Fields |
|-------|-----------|-----------------|
| Authentication success/failure | INFO/WARN | user_id, ip, timestamp, method |
| Authorization failure | WARN | user_id, resource, action, ip |
| Input validation failure | WARN | endpoint, validation_error, ip |
| Privilege escalation | WARN | user_id, old_role, new_role, by_whom |
| Data access (sensitive) | INFO | user_id, resource_type, resource_id |
| Configuration change | INFO | user_id, setting, old_value, new_value |
| Security control disabled | ALERT | user_id, control, reason |
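
The required fields in the table can be checked before an event is written. The following is a minimal sketch, not a prescribed implementation: the event-type keys and the `missing_fields` helper are hypothetical names chosen for illustration, and only the field lists come from the table.

```python
# Hypothetical event-type keys; the field sets mirror the table above.
REQUIRED_FIELDS = {
    "authentication": {"user_id", "ip", "timestamp", "method"},
    "authorization_failure": {"user_id", "resource", "action", "ip"},
    "input_validation_failure": {"endpoint", "validation_error", "ip"},
    "privilege_escalation": {"user_id", "old_role", "new_role", "by_whom"},
    "sensitive_data_access": {"user_id", "resource_type", "resource_id"},
    "configuration_change": {"user_id", "setting", "old_value", "new_value"},
    "security_control_disabled": {"user_id", "control", "reason"},
}


def missing_fields(event_type: str, fields: dict) -> set:
    """Return the required field names that are absent from `fields`."""
    return REQUIRED_FIELDS[event_type] - set(fields)


# Example: an authorization failure logged without resource and action.
gaps = missing_fields("authorization_failure", {"user_id": "u1", "ip": "10.0.0.1"})
```

A caller could refuse to emit the event, or emit it with an extra warning, when the returned set is not empty.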

## Correct Pattern

```python
import logging
import hashlib
from datetime import datetime, timezone

# Structured logging
security_logger = logging.getLogger("security")

def log_auth_attempt(user_id: str, success: bool, ip: str, method: str):
    # Failed attempts are logged at WARNING so they stand out from successes.
    level = logging.INFO if success else logging.WARNING
    security_logger.log(
        level,
        "authentication_attempt",
        extra={
            "event_type": "auth",
            "user_id": user_id,
            "success": success,
            "ip_address": ip,
            "auth_method": method,
            # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time.
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )

def log_access(user_id: str, resource: str, action: str, allowed: bool):
    # Denied access is an authorization failure and is logged at WARNING.
    level = logging.INFO if allowed else logging.WARNING
    security_logger.log(
        level,
        "access_attempt",
        extra={
            "event_type": "access",
            "user_id": user_id,
            "resource": resource,
            "action": action,
            "allowed": allowed,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )

# Mask sensitive data in logs
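def mask_sensitive(data: dict) -> dict:
    """Return a copy of `data` with sensitive fields replaced by "[REDACTED]"."""
    sensitive_keys = {"password", "token", "secret", "api_key", "ssn", "credit_card"}
    masked = {}
    for key, value in data.items():
        # Substring match, so keys such as "user_password" are redacted too.
        if any(s in key.lower() for s in sensitive_keys):
            masked[key] = "[REDACTED]"
        elif isinstance(value, dict):
            masked[key] = mask_sensitive(value)
        elif isinstance(value, list):
            # Recurse into dicts nested inside lists as well.
            masked[key] = [mask_sensitive(v) if isinstance(v, dict) else v for v in value]
        else:
            masked[key] = value
    return masked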
```
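
Fields passed through `extra` are attached to the `LogRecord`, but the standard library's default formatter does not print them, so they are silently dropped unless a formatter emits them. The sketch below shows one way to surface them as JSON using only the standard library. `JsonFormatter` and `_STANDARD_ATTRS` are hypothetical names, and a maintained JSON logging library may be a better fit in production.

```python
import json
import logging

# Attribute names every LogRecord carries; anything else was supplied via `extra`.
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}


class JsonFormatter(logging.Formatter):
    """Render a record, including its `extra` fields, as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        # default=str keeps non-JSON types (e.g. datetime) from raising.
        return json.dumps(payload, default=str)


# Usage: attach the formatter to the handler that receives security events.
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
security_logger = logging.getLogger("security")
security_logger.addHandler(handler)
security_logger.setLevel(logging.INFO)
security_logger.info("authentication_attempt", extra={"user_id": "u1", "success": True})
```

Because `json.dumps` escapes control characters, each event stays on a single line even when a field contains newlines.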

## Incorrect Pattern

```python
# Wrong: logging secrets
logging.info(f"User login with password: {password}")
logging.debug(f"API call with key: {api_key}")

# Wrong: no context
logging.warning("Invalid input")  # Which input? Where? Who?

# Wrong: user-controlled data used directly as the log message
logging.info(user_input)  # Log injection possible (forged lines via embedded newlines)

# Wrong: logging PII without purpose
logging.info(f"User {name} with SSN {ssn} logged in")
```

## Log Injection Prevention

```python
# Wrong: allows log injection
def log_user_action(action: str):
    logging.info(f"User action: {action}")
    # Input: "action\n2024-01-01 INFO: Admin granted"

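# Correct: escape or use structured logging (pick one; both shown for comparison)
def log_user_action(action: str):
    # Option 1: escape newlines so the input cannot start a new log line
    safe_action = action.replace("\n", "\\n").replace("\r", "\\r")
    logging.info(f"User action: {safe_action}")

    # Option 2: structured logging (preferred). This is safe only when the
    # handler's formatter serializes `extra` fields with escaping, e.g. as JSON.
    logging.info("user_action", extra={"action": action})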
```
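
Escaping only `\n` and `\r` leaves other non-printable characters, such as the Unicode line separator U+2028, untouched, and some log viewers treat those as line breaks too. A broader sanitizer for plain-text logs might look like the sketch below. `sanitize_for_log` and its `max_len` default are assumptions for illustration, not part of the OWASP guidance.

```python
def sanitize_for_log(value: str, max_len: int = 200) -> str:
    """Escape backslashes and non-printable characters; truncate long input."""
    out = []
    for ch in value[:max_len]:
        if ch == "\\":
            # Escape the escape character so forged "\n" text stays distinguishable.
            out.append("\\\\")
        elif ch.isprintable():
            out.append(ch)
        else:
            # Covers \n, \r, \t, other control characters, and Unicode separators.
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)


# The forged second line stays on the same physical line as the real entry.
safe = sanitize_for_log("action\n2024-01-01 INFO: Admin granted")
```

Truncating the input also limits how much log volume a single request field can generate.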

## Retention and Protection

```python
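import hashlib
import json

# Log retention policy (days)
RETENTION_DAYS = {
    "security": 365,      # Keep security logs 1 year
    "access": 90,         # Access logs 90 days
    "debug": 7,           # Debug logs 7 days
}

# Tamper detection
def log_with_hash(event: dict):
    """Log `event` as JSON with a SHA-256 digest of its contents appended."""
    # Hash a canonical serialization; copy so the caller's dict is not mutated.
    digest = hashlib.sha256(
        json.dumps(event, sort_keys=True).encode()
    ).hexdigest()
    record = {**event, "_hash": digest}
    # An unkeyed hash only catches accidental corruption: anyone who can edit
    # the log can recompute it. Use a keyed HMAC to resist deliberate tampering.
    security_logger.info(json.dumps(record, sort_keys=True))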
```
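
A plain SHA-256 digest can be recomputed by anyone who is able to edit the log, so it does not by itself prove integrity. One common hardening is a keyed HMAC chained to the previous entry, so that modifying or removing an entry invalidates the chain from that point on. The sketch below assumes a secret `key` held outside the log store; `sign_event` and `verify_chain` are hypothetical helper names.

```python
import hashlib
import hmac
import json


def sign_event(event: dict, prev_mac: str, key: bytes) -> dict:
    """Wrap `event` with an HMAC that also covers the previous entry's MAC."""
    body = json.dumps(event, sort_keys=True)
    mac = hmac.new(key, (prev_mac + body).encode(), hashlib.sha256).hexdigest()
    return {"event": event, "prev": prev_mac, "mac": mac}


def verify_chain(entries: list, key: bytes, genesis: str = "") -> bool:
    """Recompute every MAC in order; return False at the first mismatch."""
    prev = genesis
    for entry in entries:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hmac.new(key, (prev + body).encode(), hashlib.sha256).hexdigest()
        if entry["prev"] != prev or not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev = entry["mac"]
    return True


# Usage with a placeholder key; a real key would come from a secret store.
key = b"example-key-not-for-production"
first = sign_event({"event_type": "auth", "user_id": "u1"}, "", key)
second = sign_event({"event_type": "access", "user_id": "u1"}, first["mac"], key)
chain_ok = verify_chain([first, second], key)
```

Verification is only meaningful if the key is not readable by the same principals who can write to the log.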

## Edge Cases

- Logs themselves become an attack surface (e.g. Log4Shell, CVE-2021-44228)
- PII in logs may violate GDPR/CCPA
- High-volume logging can be abused for denial of service (DoS)
- Stack traces may leak sensitive information
- Correlation IDs are needed to trace a request across services
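
For the correlation ID point, one possible approach in Python is to keep the ID in a `contextvars.ContextVar` and copy it onto every record with a `logging.Filter`. This is a sketch under assumptions: `correlation_id`, `CorrelationIdFilter`, and `start_request` are hypothetical names, and how the incoming ID arrives (for example, in a request header) depends on the framework in use.

```python
import contextvars
import logging
import uuid
from typing import Optional

# Holds the ID for the current request; "-" marks records logged outside a request.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every record passing through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True  # never drop the record, only annotate it


def start_request(incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's ID if one was supplied, otherwise generate a new one."""
    cid = incoming_id or uuid.uuid4().hex
    correlation_id.set(cid)
    return cid


# Usage: add the filter once, then set the ID at the start of each request.
security_logger = logging.getLogger("security")
security_logger.addFilter(CorrelationIdFilter())
current = start_request("req-12345")
```

An ID received from a client is untrusted input, so it should be validated or sanitized before it is reused in log output.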