# XML External Entities (XXE)

## Rule

Disable external entity processing. Disable DTDs. Use safe parser defaults.

**Source:** [OWASP XXE Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html)

## What XXE Can Do

- **File disclosure**: Read `/etc/passwd`, config files, source code
- **SSRF**: Make requests to internal services
- **DoS**: Billion laughs attack (exponential entity expansion)
- **Port scanning**: Error-based probing of internal ports
- **RCE**: In some configurations (for example, PHP with the `expect://` wrapper enabled)

## Attack Payloads

```xml
<!-- File disclosure -->
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>

<!-- SSRF to cloud metadata -->
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<data>&xxe;</data>

<!-- Billion laughs DoS -->
<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!-- ... continues exponentially -->
]>
<lolz>&lol9;</lolz>
```

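
To see how a given parser reacts to the file-disclosure payload above, a small probe can help. The sketch below uses only Python's standard-library `xml.etree.ElementTree`; the names `FILE_DISCLOSURE_PAYLOAD` and `external_entity_is_rejected` are illustrative and not part of any library. It assumes the behavior described in the Python documentation, where ElementTree does not expand external entities and raises `ParseError` when it meets one.

```python
import xml.etree.ElementTree as ET

# Same shape as the file-disclosure payload shown above.
FILE_DISCLOSURE_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""


def external_entity_is_rejected(payload: str) -> bool:
    """Return True if ElementTree refuses the document instead of expanding it."""
    try:
        ET.fromstring(payload)
    except ET.ParseError:
        # ElementTree reports the unresolved external entity as a parse error.
        return True
    return False


if __name__ == "__main__":
    print("payload rejected:", external_entity_is_rejected(FILE_DISCLOSURE_PAYLOAD))
    print("plain XML rejected:", external_entity_is_rejected("<data>ok</data>"))
```

A probe like this only says something about the parser it calls. Other libraries (lxml, minidom, SAX) need their own check.
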
## Correct Pattern

```python
# Python - defusedxml (recommended)
# Distinct aliases keep the stdlib import below from rebinding this module.
import defusedxml.ElementTree as DefusedET

def parse_xml_safe(xml_string: str):
    """Parse XML with XXE protection."""
    return DefusedET.fromstring(xml_string)

# Python - standard library with safe settings
import xml.etree.ElementTree as ET

def parse_xml_manual(xml_string: str):
    """Manual safe configuration."""
    parser = ET.XMLParser()
    # Python's ElementTree doesn't resolve external entities by default.
    # Entity-expansion DoS still depends on the underlying Expat version,
    # so always verify your specific library!
    return ET.fromstring(xml_string, parser=parser)

# lxml with safe settings
from lxml import etree

def parse_xml_lxml(xml_string: str):
    """lxml with XXE disabled."""
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False,
    )
    return etree.fromstring(xml_string.encode(), parser=parser)
```

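
When defusedxml is not available, the "disable DTDs" part of the rule can be approximated with the standard library. This is a minimal sketch, not a replacement for a hardened parser: `DoctypeForbidden`, `assert_no_doctype`, and `parse_without_dtd` are hypothetical names, and the approach (a pre-pass with `xml.parsers.expat` that aborts on any DOCTYPE) parses the input twice.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration (hypothetical name)."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this handler as soon as it sees <!DOCTYPE ...>.
    raise DoctypeForbidden(f"DOCTYPE declarations are not allowed: {name!r}")


def assert_no_doctype(xml_bytes: bytes) -> None:
    """Scan the document once and abort on the first DOCTYPE."""
    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = _forbid_doctype
    parser.Parse(xml_bytes, True)


def parse_without_dtd(xml_bytes: bytes) -> ET.Element:
    """Parse only documents that carry no DTD at all."""
    assert_no_doctype(xml_bytes)
    return ET.fromstring(xml_bytes)


if __name__ == "__main__":
    print(parse_without_dtd(b"<data>ok</data>").text)
    try:
        parse_without_dtd(b'<!DOCTYPE foo [<!ENTITY x "y">]><data>&x;</data>')
    except DoctypeForbidden as exc:
        print("rejected:", exc)
```

Rejecting every DOCTYPE is stricter than disabling external entities alone, and it also blocks internal-entity expansion attacks such as billion laughs.
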
## Incorrect Pattern

```python
from lxml import etree

# Wrong: relies on lxml defaults, which resolve entities
# (including external ones in releases before lxml 5.0)
def bad_parse(xml_string: str):
    return etree.fromstring(xml_string)

# Wrong: explicitly enabling dangerous features
def bad_parse_2(xml_string: str):
    parser = etree.XMLParser(resolve_entities=True)
    return etree.fromstring(xml_string, parser=parser)

# Wrong: using xml.dom.minidom without protection
from xml.dom.minidom import parseString

def bad_parse_3(xml_string: str):
    return parseString(xml_string)  # May be vulnerable, depending on Python/Expat version

# Wrong: SAX parser without disabling features
import xml.sax

def bad_parse_4(xml_string: str):
    handler = xml.sax.ContentHandler()  # stand-in for your own handler class
    xml.sax.parseString(xml_string, handler)
```

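
For contrast with the SAX example above, here is a hedged sketch of a SAX parse with the external-entity features switched off explicitly. It uses the standard `xml.sax` API (`make_parser`, `setFeature`, `feature_external_ges`, `feature_external_pes`). `TagCollector` and `parse_sax_hardened` are illustrative names. Recent Python versions already default external general entities to off, so the explicit calls mainly document intent and protect older runtimes.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TagCollector(ContentHandler):
    """Illustrative handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_sax_hardened(xml_bytes: bytes) -> list:
    """Parse with external general and parameter entities disabled."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = TagCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.tags


if __name__ == "__main__":
    print(parse_sax_hardened(b"<a><b/><c/></a>"))
```

This does not forbid DTDs or internal entities, so it addresses external-entity resolution only.
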
## Language-Specific Fixes

### Java

```java
// DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

// SAXParserFactory
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
```

### .NET

```csharp
// XmlReader (safe by default in .NET 4.5.2+)
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
XmlReader reader = XmlReader.Create(stream, settings);

// XmlDocument
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // Disable external resources
doc.LoadXml(xmlString);
```

### PHP

```php
// Disable entity loading globally.
// Only needed on PHP < 8.0: the function is deprecated as of PHP 8.0,
// where libxml 2.9+ already disables external entity loading by default.
libxml_disable_entity_loader(true);

$doc = new DOMDocument();

// Wrong: these flags turn entity substitution and DTD loading back on
$doc->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR);

// Correct: leave those flags out and block network access
$doc->loadXML($xml, LIBXML_NONET);
```

## When You Need DTDs

```python
# If you absolutely need DTD validation (rare):
# 1. Allowlist specific DTDs
# 2. Fetch DTDs from local filesystem only
# 3. Never allow user-controlled DTD URLs
from lxml import etree

ALLOWED_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "/path/to/local/xhtml1-strict.dtd"
}

class SafeResolver(etree.Resolver):
    def resolve(self, system_url, public_id, context):
        # Serve allowlisted public IDs from local files; refuse everything else
        if public_id in ALLOWED_DTDS:
            return self.resolve_filename(ALLOWED_DTDS[public_id], context)
        raise ValueError(f"DTD not allowed: {public_id}")

# Register the resolver on a parser that loads DTDs but stays off the network
parser = etree.XMLParser(load_dtd=True, no_network=True)
parser.resolvers.add(SafeResolver())
```

## Edge Cases

- SVG files are XML: validate uploads!
- SOAP/XML-RPC endpoints are XXE targets
- Office documents (DOCX, XLSX) contain XML
- Configuration files (Maven `pom.xml`, Spring `beans.xml`)
- RSS/Atom feeds
- SAML assertions
- Blind XXE (out-of-band data exfiltration via DNS/HTTP)

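
The Office-document case is easy to overlook because the XML is hidden inside a ZIP container. The following sketch shows one way to screen an upload before handing it to a document library. It assumes the upload fits in memory, and the names `declared_entities` and `scan_container` are hypothetical. It uses `zipfile` plus the `EntityDeclHandler` hook of `xml.parsers.expat` to list every entity a member declares, without resolving any of them.

```python
import io
import zipfile
from xml.parsers import expat


def declared_entities(xml_bytes: bytes) -> list:
    """Return (entity_name, system_id) for each entity declared in the document."""
    found = []

    def on_entity(name, is_param, value, base, system_id, public_id, notation):
        # system_id is set for external entities and None for internal ones.
        found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError:
        found.append((None, None))  # malformed XML: report it as suspicious too
    return found


def scan_container(container: bytes) -> list:
    """Return (member, entity_name, system_id) findings for XML parts of a ZIP."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        for member in archive.namelist():
            if not member.lower().endswith((".xml", ".rels")):
                continue
            for name, system_id in declared_entities(archive.read(member)):
                findings.append((member, name, system_id))
    return findings
```

A legitimate DOCX or XLSX part normally declares no entities, so any finding is a reasonable signal to reject the upload. The sketch does not limit decompressed size, so a production check would also need to guard against ZIP bombs.
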