(1) 'contains text "Kopf Sand Stecken" all words using stemming using language "de"'
This seems to make no difference compared to:
(2) 'contains text ("Nase" ftand "Sand" ftand "stecken") using stemming using language "de"'
Both queries return 4 nodes.
If I want to restrict the query terms to a certain distance by adding
'distance at most 10 words'
to (1), I get 2 nodes (a subset of the 4 from the first run), but for (2) I still get all 4 nodes. The distance constraint doesn't seem to be applied. For my application this is not a problem, since I have to use the "ftand" variant to get proper marking, but in general this looks strange.
This may well be correct, as (unfortunately) the internal data model of XQuery Full Text is pretty complex and leads to frequent misunderstandings. To give more information, I'll have to look at the actual data; do you think you can provide me with a little document that exemplifies your observation?
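A minimal, self-contained test case of the kind requested might look like the sketch below. The sample paragraphs are invented for illustration (the original documents are not shown in this thread), and the search terms are taken from query (1). It runs both variants with the distance filter against the same inline data, so the two counts can be compared directly:

```xquery
(: Hypothetical sample data, not the data from the original report. :)
let $doc :=
  <doc>
    <p>Man sollte den Kopf nicht in den Sand stecken.</p>
    <p>Der Kopf tut weh. Gestern haben wir am Strand lange gespielt,
       und heute finde ich im ganzen Haus noch Sand, der wohl in den
       Schuhen stecken geblieben ist.</p>
  </doc>
return (
  (: (1) "all words" variant with the distance filter :)
  count($doc/p[text() contains text "Kopf Sand stecken" all words
    using stemming using language "de"
    distance at most 10 words]),
  (: (2) "ftand" variant with the same distance filter :)
  count($doc/p[text() contains text ("Kopf" ftand "Sand" ftand "stecken")
    using stemming using language "de"
    distance at most 10 words])
)
```

If the two counts differ on data like this, the snippet is small enough to attach to a reply as a reproducible example.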
The second question is about "ftand" and "ftor". If I try these queries: ...
Once more, it might be helpful to have the actual data at hand.
The third question is about the full-text index itself. When applying fuzzy search or using wildcards, the full-text index is not applied, which results in a timeout on my website; I need 341859.09 ms in the GUI for applying
Currently, a choice has to be made between efficient fuzzy matching and efficient wildcard matching (the latter being based on a trie index structure). Some information on that can be found in our Wiki [1] (btw, feel free to edit the Wiki if you feel it's incomplete!). We are working on a new index structure that will unify both index structures, improve performance, and support incremental updates. We may even eliminate the explicit choice of some other full-text options, such that those options can be chosen dynamically without the need to reindex the database.
More feature requests regarding the full-text index are welcome.
Christian