On Thu, Jan 24, 2019 at 09:41:18PM +0100, Xavier-Laurent SALVADOR scripsit:
Hi List,
I'm seeing a little problem I can't understand with a small 27M thesaurus database. I created all indexes. When using the '!=' operator to compare two lists, I get a quick and wrong result:
*Case 1:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[. = $b]
---> First result: "raccourcir"
The = and != operators compare the sequences.
= returns true if _any_ member of the left-hand sequence can be found in the right-hand sequence.
('1','2','3') = ('1','asparagus','guillotine')
returns true().
Similarly, != returns true if _any_ member of the left-hand sequence is not equal to _any_ member of the right-hand sequence; a single unequal pair is enough.
('1','2','3') != ('1','asparagus','guillotine')
returns true().
!= is an exceedingly tricksy operator;
(1,2,3) != (1,2,3)
is true, because 1 on the left is not equal to 2 on the right.
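A small sketch of the above, using made-up sequences rather than the thesaurus data, in case it helps to see the existential behaviour side by side. It should run as-is in any XQuery 3.x processor:

```xquery
let $a := (1, 2, 3)
let $b := (1, 2)
return (
  (: true: 1 on the left equals 1 on the right :)
  $a = $b,
  (: true: 1 on the left differs from 2 on the right :)
  $a != $a,
  (: true: the long-hand form of $a != $a :)
  some $x in $a, $y in $a satisfies $x ne $y,
  (: false: at least one pair is equal, so = is true :)
  not($a = $a),
  (: 3: the only member of $a with no equal in $b :)
  $a[not(. = $b)]
)
```

The last line is the one to reach for when the question is "which members of $a are missing from $b"; != never answers that.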
*Case 4:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[not(. = $b)]
It dies out of memory.
That's odd. That pattern generally works pretty well for me on much larger datasets than 27 MB.
If you're trying to check that all your synonyms are terms, I'd try
let $a as xs:string+ := /thesaurus/entry/synonym/term/string()
let $b as xs:string+ := /thesaurus/entry/term/string()
return $a[not(. = $b)]
so you do the "I want the string value of this element" step only once. Or, if for some reason you need the elements later, maybe create maps and then compare the map keys?
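For the map idea, something along these lines might do; it is only a sketch, not tested against your data. The inline $doc is a tiny made-up stand-in for the thesaurus (only "raccourcir" comes from your mail), and $terms is just a name I picked. map:merge and map:entry are the standard XQuery 3.1 functions:

```xquery
(: hypothetical stand-in for the real database :)
let $doc := document {
  <thesaurus>
    <entry>
      <term>couper</term>
      <synonym><term>raccourcir</term><term>trancher</term></synonym>
    </entry>
    <entry><term>raccourcir</term></entry>
  </thesaurus>
}
let $a as xs:string* := $doc/thesaurus/entry/synonym/term/string()
let $b as xs:string* := $doc/thesaurus/entry/term/string()
(: one key per term; the values are unused placeholders :)
let $terms := map:merge($b ! map:entry(., true()))
(: synonyms that never occur as a term :)
return distinct-values($a)[not(map:contains($terms, .))]
```

With the sample data this returns "trancher". Against the real database you would replace $doc with the database root; each synonym then costs one key lookup instead of a comparison against the whole term list.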
-- Graydon