XML External Entities (XXE)
Rule
Disable external entity processing. Disable DTDs. Use safe parser defaults.
Source: OWASP XXE Prevention Cheat Sheet
What XXE Can Do
- File disclosure: Read /etc/passwd, config files, source code
- SSRF: Make requests to internal services
- DoS: Billion laughs attack (exponential entity expansion)
- Port scanning: Error-based probing of internal ports
- RCE: Possible in some configurations (e.g. PHP with the expect:// wrapper enabled)
Attack Payloads
<!-- File disclosure -->
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>
<!-- SSRF to cloud metadata -->
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<data>&xxe;</data>
<!-- Billion laughs DoS -->
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!-- ... continues exponentially -->
]>
<lolz>&lol9;</lolz>
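To get a feel for how fast the billion laughs payload grows, the expansion size can be computed directly. The sketch below assumes the elided definitions keep the same ten-fold pattern up to lol9; the helper name `expanded_size` is illustrative and not part of any library.

```python
def expanded_size(level: int, fanout: int = 10, base: str = "lol") -> int:
    """Characters produced by fully expanding the entity at `level`.

    Level 1 is the literal base string ("lol"); each further level is
    assumed to reference the previous level `fanout` times.
    """
    if level < 1:
        raise ValueError("level must be >= 1")
    return len(base) * fanout ** (level - 1)


for level in (1, 2, 3, 9):
    name = "lol" if level == 1 else f"lol{level}"
    print(f"{name}: {expanded_size(level):,} characters")
```

Under that assumption, a single `&lol9;` reference expands to 300,000,000 characters, which is why a parser that expands entities without limits can exhaust memory on a very small input.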
Correct Pattern
# Python - defusedxml (recommended)
import defusedxml.ElementTree as ET

def parse_xml_safe(xml_string: str):
    """Parse XML with XXE protection."""
    return ET.fromstring(xml_string)

# Python - standard library (does not fetch external entities by default)
import xml.etree.ElementTree as ET

def parse_xml_manual(xml_string: str):
    """Parse with the stdlib parser and its default settings."""
    # ElementTree does not resolve external entities by default, but it
    # still expands internal entities; protection against billion laughs
    # depends on the underlying Expat version (2.4.1+ mitigates it).
    # Always verify your specific library and version!
    parser = ET.XMLParser()
    return ET.fromstring(xml_string, parser=parser)

# lxml with safe settings
from lxml import etree

def parse_xml_lxml(xml_string: str):
    """lxml with XXE disabled."""
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        dtd_validation=False,
        load_dtd=False,
    )
    return etree.fromstring(xml_string.encode(), parser=parser)
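The rule says to disable DTDs, but the standard-library parser has no switch for that. One way to approximate it, sketched here with only the standard library, is a first pass with `xml.parsers.expat` that aborts as soon as a DOCTYPE declaration starts. The names `DoctypeForbidden` and `parse_without_doctype` are hypothetical, and this is a rough stand-in for what defusedxml does more thoroughly, not a replacement for it.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _reject_doctype(name, sysid, pubid, has_internal_subset):
    # Called by expat when a DOCTYPE starts, before any entity is expanded.
    raise DoctypeForbidden(f"DOCTYPE not allowed: {name!r}")


def parse_without_doctype(xml_text: str) -> ET.Element:
    """Parse XML, refusing any document that declares a DOCTYPE."""
    # First pass: scan only, to detect a DOCTYPE declaration.
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_text, True)  # malformed XML raises expat.ExpatError
    # Second pass: build the tree, now known to be free of DTDs.
    return ET.fromstring(xml_text)
```

Because every payload in the Attack Payloads section depends on a DOCTYPE, rejecting the declaration outright blocks them without having to reason about individual entity types. The cost is parsing each document twice.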
Incorrect Pattern
from lxml import etree

# Wrong: relying on lxml defaults (versions before 5.0 resolve external entities by default)
def bad_parse(xml_string: str):
    return etree.fromstring(xml_string)

# Wrong: explicitly enabling dangerous features
def bad_parse_2(xml_string: str):
    parser = etree.XMLParser(resolve_entities=True)
    return etree.fromstring(xml_string, parser=parser)

# Wrong: using xml.dom.minidom without protection
from xml.dom.minidom import parseString

def bad_parse_3(xml_string: str):
    return parseString(xml_string)  # May be vulnerable (e.g. entity expansion DoS)

# Wrong: SAX parser without disabling features
import xml.sax

class MyHandler(xml.sax.ContentHandler):
    """Minimal placeholder handler (hypothetical); ignores all events."""

def bad_parse_4(xml_string: str):
    handler = MyHandler()
    xml.sax.parseString(xml_string, handler)
Language-Specific Fixes
Java
// DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
// SAXParserFactory
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
.NET
// XmlReader (prohibits DTDs by default; set explicitly so the intent is visible)
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
XmlReader reader = XmlReader.Create(stream, settings);
// XmlDocument (XmlResolver defaults to null only in .NET Framework 4.5.2+)
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // Disable external resources
doc.LoadXml(xmlString);
PHP
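// PHP < 8.0: disable external entity loading globally.
// The function is deprecated in PHP 8.0, where libxml 2.9.0+ disables it by default.
if (\PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}
$doc = new DOMDocument();
// Wrong: LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external DTDs,
// which re-enables XXE:
// $doc->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR);
// Right: omit those flags and block network access:
$doc->loadXML($xml, LIBXML_NONET);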
When You Need DTDs
# If you absolutely need DTD validation (rare):
# 1. Allowlist specific DTDs
# 2. Fetch DTDs from local filesystem only
# 3. Never allow user-controlled DTD URLs
from lxml import etree

ALLOWED_DTDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "/path/to/local/xhtml1-strict.dtd"
}

class SafeResolver(etree.Resolver):
    def resolve(self, system_url, public_id, context):
        # Serve allowlisted DTDs from local disk; refuse everything else
        if public_id in ALLOWED_DTDS:
            return self.resolve_filename(ALLOWED_DTDS[public_id], context)
        raise ValueError(f"DTD not allowed: {public_id}")

# Register it on the parser that loads DTDs: parser.resolvers.add(SafeResolver())
Edge Cases
- SVG files are XML — validate uploads!
- SOAP/XML-RPC endpoints are XXE targets
- Office documents (DOCX, XLSX) contain XML
- Configuration files (Maven pom.xml, Spring beans.xml)
- RSS/Atom feeds
- SAML assertions
- Blind XXE (out-of-band data exfiltration via DNS/HTTP)
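Several of these edge cases involve XML that arrives inside another container. As a rough sketch of pre-screening an Office upload, the code below opens a DOCX/XLSX as a zip archive and flags any XML part that declares a DOCTYPE. The function names, the file-suffix filter, and the `MAX_PART_BYTES` limit are all assumptions made for illustration; a real upload pipeline would need more checks than this.

```python
import io
import zipfile
from typing import List
from xml.parsers import expat

MAX_PART_BYTES = 10 * 1024 * 1024  # hypothetical per-part size cap


class _DoctypeFound(Exception):
    """Internal signal used to stop parsing at the first DOCTYPE."""


def _on_doctype(name, sysid, pubid, has_internal_subset):
    raise _DoctypeFound(name)


def has_doctype(xml_bytes: bytes) -> bool:
    """Return True if the document declares a DOCTYPE."""
    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = _on_doctype
    try:
        parser.Parse(xml_bytes, True)  # malformed XML raises expat.ExpatError
    except _DoctypeFound:
        return True
    return False


def parts_with_doctype(container: bytes) -> List[str]:
    """List the XML parts of a zip-based document that contain a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        for info in archive.infolist():
            if not info.filename.endswith((".xml", ".rels")):
                continue
            # Check the declared uncompressed size before reading the part.
            if info.file_size > MAX_PART_BYTES:
                raise ValueError(f"part too large: {info.filename}")
            if has_doctype(archive.read(info)):
                flagged.append(info.filename)
    return flagged
```

An upload would be rejected when `parts_with_doctype` returns a non-empty list. The same `has_doctype` check applies directly to SVG uploads, which are plain XML rather than zip containers.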