Hi Cerstin,
If I want to get whitespaces back, do I have to re-create the collection?
Yes, sorry about that. The database does not store any information about chopped whitespace, so you will indeed have to reimport the documents.
Would this result in any change concerning the node-ids? We already have some data depending on node-ids. Is there some other way to get the original whitespaces back?
The node ids will change if the documents include whitespace-only text nodes. The following example represents such a document; it contains three text nodes ("X" and two text nodes that each consist of a single newline character):
<hello>
<world>X</world>
</hello>
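To illustrate why the ids shift, here is a minimal Python sketch. It is not BaseX code: it parses the example with the standard library's minidom and assumes, purely for illustration, that an id is simply a node's position in document order. The function name and the `chop` flag are made up for this sketch, and the actual numbers in a database will differ.

```python
from xml.dom import minidom

# The example document, with one newline between the elements.
DOC = "<hello>\n<world>X</world>\n</hello>"


def nodes_in_document_order(xml_text, chop):
    """List element and text nodes in document order.

    With chop=True, whitespace-only text nodes are skipped. This only
    mimics whitespace chopping; it is not the database's implementation.
    """
    result = []

    def walk(node):
        if node.nodeType == node.TEXT_NODE:
            if chop and not node.data.strip():
                return  # whitespace-only text node: dropped
            result.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            result.append(("element", node.tagName))
            for child in node.childNodes:
                walk(child)

    walk(minidom.parseString(xml_text).documentElement)
    return result


kept = nodes_in_document_order(DOC, chop=False)
chopped = nodes_in_document_order(DOC, chop=True)

# Hypothetical ids: the position of each node in document order.
for label, nodes in (("whitespace kept", kept), ("whitespace chopped", chopped)):
    print(label)
    for node_id, (kind, value) in enumerate(nodes):
        print(f"  {node_id}: {kind} {value!r}")
```

In this sketch the text node "X" sits at position 3 when whitespace is kept and at position 2 when it is chopped, so any stored reference to one numbering would point at a different node in the other.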
How would I display the selected text snippet to the user, when I store the node-id and the text (as mixed content)? ft:mark will not work, I think.
I'm not quite sure what you are referring to here; could you attach a small example?
Christian
PS @Michael and Gerrit: thanks for your opinions. One of the reasons for chopping whitespace by default is that whitespace-only text nodes in structured documents consume a lot of space in a database, although they will never need to be processed. However, I see that this solution may cause more confusion than it prevents, which is why we'll think about switching the default behavior.