Hi Daniel,
Yes, I assume we’ll need to call it a bug, although we already know that BaseX’s current behavior deviates from the spec. The function fn:parse-xml-fragment is based on our internal XML parser, which is much faster than the standard XML parser (in particular for small input) and tolerates input that’s not perfectly well-formed. In addition, it accepts HTML entities without a linked DTD:
parse-xml-fragment('&amp;auml;')
We should at least document the behavior or (better) introduce a custom BaseX function for it.
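Until then, a rough workaround sketch for anyone who needs strict checking: wrap the input in a dummy root element and pass it to fn:parse-xml, which according to the spec must raise err:FODC0006 for input that is not well-formed. The function name local:parse-fragment-strict and the wrapper element are made up for illustration, and I'm assuming that fn:parse-xml is strict in the processor you use. The trick will not work for fragments that start with a text declaration.

```xquery
(: Hypothetical helper, not part of BaseX: parses a fragment strictly by
   wrapping it in a dummy root element and delegating to fn:parse-xml. :)
declare function local:parse-fragment-strict(
  $input as xs:string
) as node()* {
  parse-xml('<wrapper>' || $input || '</wrapper>')/wrapper/node()
};

try {
  (: '&amp;' in the XQuery literal yields the string "Tom & Jerry" :)
  local:parse-fragment-strict('Tom &amp; Jerry')
} catch err:FODC0006 {
  (: a strict parser is expected to reject the bare ampersand :)
  'Not well-formed: ' || $err:description
}
```

Note that the returned nodes are children of the wrapper element rather than of a document node, so this is not a drop-in replacement for fn:parse-xml-fragment.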
Hope this helps (for now), Christian
On Tue, Nov 21, 2023 at 3:17 PM Zimmel, Daniel <D.Zimmel@esvmedien.de> wrote:
Hi,
is this a bug?
Query: parse-xml-fragment('Tom &amp; Jerry')
Result: Tom ? Jerry
Same result with: parse-xml-fragment('Tom &amp;DUMMY; Jerry')
BaseX 10.7
Saxon complains correctly that the resulting document node is not well-formed. BaseX should also return an error, shouldn't it?
Best, Daniel