Hi Chris,
> DIACRITICS: true
It looks as if the index was built with the DIACRITICS option set to
true. That is equivalent to "diacritics sensitive" (read it as
"consider diacritics: yes, please!"), whereas your query asks for
"diacritics insensitive", so the index options and the query do not
match. Could you try to rebuild the index with the diacritics option
disabled?
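
In case it helps, here is a minimal sketch of the rebuild as BaseX
commands. It assumes the database is still named "edil", as in your
output, and that none of the other index options should change:

```
# Assumption: the database name "edil" is taken from the output below.
# Switch off diacritics sensitivity for indexes created from now on.
SET DIACRITICS false
OPEN edil
# Rebuild the full-text index with the options currently set.
CREATE INDEX FULLTEXT
# Check the result in the "Indexes" section of the output.
INFO DB
```

If the rebuild went through, INFO DB should now list DIACRITICS: false.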
Christian
On Tue, Aug 19, 2014 at 2:19 PM, Christopher Yocum <cyocum@gmail.com> wrote:
> Hi Christian,
>
> I hope you had a good weekend!
>
> Otherwise, no, this doesn't help as it doesn't choose to use the full text
> index on my content :(. This is what I am getting at the moment:
>
> Compiling:
> - pre-evaluating fn:collection("edil")
> - simplifying descendant-or-self step(s)
> - converting descendant::*:entry to child steps
> - simplifying descendant-or-self step(s)
> - removing context expression (.)
> - rewriting where clause(s)
> - simplifying flwor expression
>
> Query:
> declare variable $term as xs:string external := 'athgabāi.*'; declare
> variable $col as xs:string external := 'edil';
> <results>{subsequence(ft:mark(for $x in collection($col)//entry where
> $x//text() contains text {$term} using diacritics insensitive using
> wildcards return $x), 1, 5000)}</results>
>
> Optimized Query:
> element results { (fn:subsequence(ft:mark((db:open-pre("edil",0),
> db:open-pre("edil",155748), ...)/*:sample/*:entry[descendant::text()
> contains text "athgabāi.*" using wildcards using language 'English']), 1,
> 5000)) }
>
> I tried this as well with the same results:
>
> Compiling:
> - pre-evaluating fn:collection("edil")
> - simplifying descendant-or-self step(s)
> - converting descendant::*:entry to child steps
> - removing context expression (.)
> - rewriting where clause(s)
> - simplifying flwor expression
>
> Query:
> declare variable $term as xs:string external := 'athgabāi.*'; declare
> variable $col as xs:string external := 'edil';
> <results>{subsequence(ft:mark(for $x in collection($col)//entry where
> $x/descendant::*[text() contains text 'athgabāi.*' using diacritics
> insensitive using wildcards] return $x), 1, 5000)}</results>
>
> Optimized Query:
> element results { (fn:subsequence(ft:mark((db:open-pre("edil",0),
> db:open-pre("edil",155748), ...)/*:sample/*:entry[descendant::*[text()
> contains text "athgabāi.*" using wildcards using language 'English']]), 1,
> 5000)) }
>
> These are the options set on the database:
>
> Database Properties
> Name: edil
> Size: 194 MB
> Nodes: 7951662
> Documents: 19
> Binaries: 0
> Timestamp: 2014-08-15-17-00-29
>
> Resource Properties
> Input Path: /home/cyocum/temp/edil_src/xml_src
> Input Size: 87 MB
> Timestamp: 2014-08-15-16-46-31
> Encoding: UTF-8
> CHOP: true
>
> Indexes
> Up-to-date: true
> TEXTINDEX: true
> ATTRINDEX: true
> FTINDEX: true
> LANGUAGE:
> STEMMING: false
> CASESENS: false
> DIACRITICS: true
> STOPWORDS:
> UPDINDEX: false
> MAXCATS: 100
> MAXLEN: 96
>
> I hope this helps.
>
> All the best,
> Chris