Hi,
I have three questions concerning working with the full-text index:
The first question is about "distance" information. Given this query:
(1) 'contains text "Kopf Sand Stecken" all words using stemming using language "de"'
I see no difference in the results compared to:
(2) 'contains text ("Nase" ftand "Sand" ftand "stecken") using stemming using language "de"'
Both queries return 4 nodes.
If I want to find the query terms within a certain distance of each other, I add
'distance at most 10 words'
For (1) I then get 2 nodes (a subset of the 4 from the first run), but for (2) I still get all 4 nodes. The distance constraint doesn't seem to be considered for (2). For my application this is not a problem, since I have to use the "ftand" variant anyway to get proper marking, but in general this looks strange.
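In case it helps to reproduce this, here is a reduced test case along the lines of (1) and (2). The sample sentences are invented for illustration and are not taken from my collection; the query runs on an in-memory fragment, so no full-text index is involved. If I read the XQuery Full Text specification correctly, "all words" and "ftand" should behave the same under a distance filter, so I would expect both counts to be equal.

```xquery
(: Invented sample data: in the second paragraph, "Kopf" and "Sand"
   are separated by more than 10 words. :)
let $doc :=
  <doc>
    <p>Er will den Kopf in den Sand stecken.</p>
    <p>Den Kopf hat er sich gestern beim Spielen mit den anderen Kindern am Strand gestoßen, danach wollte er die Schaufel in den Sand stecken.</p>
  </doc>
return (
  (: variant (1): single string with "all words" :)
  count($doc/p[text() contains text "Kopf Sand stecken" all words
    using stemming using language "de" distance at most 10 words]),
  (: variant (2): explicit "ftand" :)
  count($doc/p[text() contains text ("Kopf" ftand "Sand" ftand "stecken")
    using stemming using language "de" distance at most 10 words])
)
```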
The second question is about "ftand" and "ftor". If I try these queries:
(3) 'contains text ("Nase" ftand "Sand" ftand "stecken") using stemming using language "de" distance at most 10 words'
(4) 'contains text ("Kopf" ftand "Sand" ftand "stecken") using stemming using language "de" distance at most 10 words'
I get 2 hits for (3) and 11 for (4). So I assumed I would get 13 hits (the ones from (3) plus the ones from (4)) when changing the query to:
(5) 'contains text (("Nase" ftor "Kopf") ftand "Sand" ftand "stecken") using stemming using language "de" distance at most 10 words'
However, I get 6 hits, and none of them contains "Nase" (it makes no difference whether the query starts with '"Nase" ftor "Kopf"' or with '"Kopf" ftor "Nase"').
Did I mess something up?
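To make the expectation concrete, this is roughly how I would compare the combined query against the union of the two separate ones. Again, the sample data is invented and only meant as a sketch; match options are left out to keep it short. My assumption is that the "ftor" query should return the same nodes as the union of (3) and (4), so both counts should be equal.

```xquery
(: Invented sample data, not from the original collection :)
let $doc :=
  <doc>
    <p>die Nase in den Sand stecken</p>
    <p>den Kopf in den Sand stecken</p>
    <p>Kopf und Nase</p>
  </doc>
(: queries (3) and (4), evaluated separately :)
let $nase := $doc/p[text() contains text
  ("Nase" ftand "Sand" ftand "stecken") distance at most 10 words]
let $kopf := $doc/p[text() contains text
  ("Kopf" ftand "Sand" ftand "stecken") distance at most 10 words]
(: query (5), combining both terms with "ftor" :)
let $both := $doc/p[text() contains text
  (("Nase" ftor "Kopf") ftand "Sand" ftand "stecken") distance at most 10 words]
return (
  count($nase | $kopf),  (: union of the separate results :)
  count($both)           (: result of the combined query :)
)
```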
The third question is about the full-text index itself. When I use fuzzy search or wildcards, the full-text index is not applied, which results in a timeout on my website. In the GUI, it takes 341859.09 ms to apply
'ft:mark (//*[text() contains text ("Korb" ftand "geben") using fuzzy][self::*:p or self::*:l])'
to my 3 GB collection. The information at the "Full-Text" tab says:
- Structure: Trie
- Stemming: ON
- Case Sensitivity: ON
- Diacritics: ON
- Language: German
- Size: 1 GB
- Entries: 1743744
I also created the full-text index with the option "Support Wildcards", but this is not shown in the database properties. When the index is created, "SET WILDCARDS true" is shown. I used stemming, case sensitivity, diacritics, and wildcards. Is this combination not recommended?
Thank you very much in advance
Cerstin