These vulnerabilities are only an issue if you allow untrusted users to supply XML documents with DTDs.

If your system must allow users to submit XML documents with DTDs, then you probably want to pre-parse them before supplying them to BaseX, e.g., using a Java parser or Python with lxml or similar, where the entity-related vulnerabilities can be prevented or isolated. That is, your site can provide an upload target that preprocesses XML documents to sanitize them before submitting them to BaseX.
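
As a rough illustration of that kind of pre-check, here is a minimal Java sketch using the parser that ships with the JDK. The class and method names (`XmlUploadGuard`, `isAcceptable`, `billionLaughs`) are made up for this example and are not part of BaseX. It assumes the JDK's built-in Xerces-based parser is the one on the classpath; a different JAXP implementation may not recognize the Apache feature URI, in which case this sketch would reject everything.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-check for uploaded XML; not part of BaseX.
public class XmlUploadGuard {

    /** Returns true if the document parses under hardened parser settings. */
    public static boolean isAcceptable(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Enforce the JDK's processing limits (entity expansion and so on).
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Never fetch external entities or external DTD subsets.
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            factory.setXIncludeAware(false);

            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Malformed input, a tripped limit, or an unsupported feature.
            return false;
        }
    }

    /** Builds a "billion laughs" document: nine levels of tenfold entity nesting. */
    public static String billionLaughs() {
        StringBuilder sb = new StringBuilder("<!DOCTYPE r [<!ENTITY e0 \"lol\">");
        for (int level = 1; level <= 9; level++) {
            sb.append("<!ENTITY e").append(level).append(" \"");
            for (int i = 0; i < 10; i++) {
                sb.append("&e").append(level - 1).append(';');
            }
            sb.append("\">");
        }
        return sb.append("]><r>&e9;</r>").toString();
    }

    public static void main(String[] args) {
        String benign = "<!DOCTYPE r [<!ENTITY a \"hi\">]><r>&a;</r>";
        System.out.println("benign: " + isAcceptable(benign));
        System.out.println("billion laughs: " + isAcceptable(billionLaughs()));
    }
}
```

With these settings, a document with a small internal entity should pass, while the nested-entity document should be rejected once the JDK's entity expansion limit is hit. Only documents that pass would then be forwarded to BaseX.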

One limitation I’ve run into with BaseX’s built-in parser is that it does not use Apache’s grammar cache feature, which makes it very inefficient for documents with large DTDs, such as DITA documents.

My solution is simply not to use DTD-aware parsing. That works for DITA because we know the default attribute values for every tag name and do not depend on any other DTD-specific feature (e.g., DITA doesn’t use external general entities for any defined purpose, such as references to images).

Cheers,

_____________________________________________

Eliot Kimber

Sr. Staff Content Engineer


From: Nico Verwer (Rakensi) <nverwer@rakensi.com>
Date: Thursday, March 13, 2025 at 5:26 PM
To: basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
Subject: [basex-talk] Protecting against XML vulnerabilities

I am trying to protect my BaseX application from XML vulnerabilities, like the ones described in [https://gist.github.com/mgeeky/4f726d3b374f0a34267d4f19c9004870] and [https://learn.microsoft.com/en-us/archive/msdn-magazine/2009/november/xml-denial-of-service-attacks-and-defenses].

My application runs as `basexhttp` inside a docker container, and I set the options in web.xml:

<context-param>
  <param-name>org.basex.dtd</param-name>
  <param-value>false</param-value>
</context-param>
<context-param>
  <param-name>org.basex.xinclude</param-name>
  <param-value>false</param-value>
</context-param>

I have not found other options, for example to let the parser limit expansion of internal entities.
Is there a way to set parser properties like `jdk.xml.entityExpansionLimit` in BaseX?