Hi BaseX Team,

this may be a bug. I have an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<Test>foo <Inner/></Test>
Note the whitespace between the text foo and the node <Inner/>. Importing this into BaseX and running a query yields
<Test>foo<Inner/></Test>
without whitespace. Is there some kind of whitespace normalization going on during import? Can I set options that influence this behavior or is this a bug?

Best regards
Stefan
Stefan Sechelmann
DFG-Forschungszentrum Matheon, Mathematik für Schlüsseltechnologien
Technische Universität Berlin
Sekretariat MA 8-3, Tel. 030/314 29 486
Straße des 17. Juni 136, Fax 030/314 79 282
10623 Berlin
sechel@math.tu-berlin.de
http://www.math.tu-berlin.de/~sechel
Hi Stefan,
On 26.03.2014 at 17:51, Stefan Sechelmann wrote:
> Is there some kind of whitespace normalization going on during import? Can I set options that influence this behavior or is this a bug?
this is the `CHOP` option [1] at work:
Signature: CHOP [boolean]
Default: true
Summary: Chops all leading and trailing whitespaces from text nodes while building a database, and discards empty text nodes. By default, this option is set to true, as it often reduces the database size by up to 50%. It can also be turned off on command line via -w.
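As a minimal sketch of how the option could be switched off before the import, assuming the standard BaseX commands SET, CREATE DB and XQUERY: the database name "test" and the file name "test.xml" below are placeholders, not names from your setup. Because chopping happens while the database is built, the option has to be set before the database is created; an existing database would need to be re-created.

```
# Turn off whitespace chopping (must happen before the database is built)
SET CHOP false
# "test" and "test.xml" are placeholder names for illustration only
CREATE DB test test.xml
# Query the imported document; the space after "foo" should now be kept
XQUERY /Test
```

Starting BaseX with the -w flag mentioned in the summary should have the same effect as the SET command for that session.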
Hope that helps, Leo