Awesome, thanks!
On Thu, Jan 24, 2019 at 21:52, Graydon graydonish@gmail.com wrote:
On Thu, Jan 24, 2019 at 09:41:18PM +0100, Xavier-Laurent SALVADOR scripsit:
Hi List,
I'm seeing a small problem I can't understand with a small 27 MB thesaurus database. I created all indexes. When using the '!=' operator to compare two lists, I get a quick and wrong result:
*Case 1:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[. = $b]
---> First result: "raccourcir"
The = and != operators compare the sequences.
= returns true if _any_ member of the left-hand sequence can be found in the right-hand sequence.
('1','2','3') = ('1','asparagus','guillotine')
returns true().
Similarly, != returns true if _any_ member of the left-hand sequence differs from _any_ member of the right-hand sequence.
('1','2','3') != ('1','asparagus','guillotine')
returns true().
!= is an exceedingly tricksy operator;
(1,2,3) != (1,2,3)
is true, because at least one pair of members (1 and 2, for instance) differs.
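To make the difference concrete, here is a small sketch that should run in any XQuery 3.1 processor. The variable names $left and $right are only illustrative; they are not from the original query.

```xquery
(: General comparisons are existential: they test pairs of items. :)
let $left  := (1, 2, 3)
let $right := (1, 2, 3)
return (
  $left = $right,          (: true: at least one pair is equal :)
  $left != $right,         (: true: at least one pair differs, e.g. 1 and 2 :)
  not($left = $right),     (: false: it is not the case that no pair is equal :)
  $left[not(. = $right)]   (: empty: every item of $left occurs in $right :)
)
```

The last line is the pattern to reach for when the question is "which items on the left are missing on the right"; $left[. != $right] would instead keep every item that differs from at least one item on the right.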
*Case 4:*
let $a := /thesaurus/entry/synonym/term
let $b := /thesaurus/entry/term
return $a[not(. = $b)]
It dies out of memory.
That's odd. That pattern generally works pretty well for me on much larger datasets than 27 MB.
If you're trying to check that all your synonyms are terms, I'd try
let $a as xs:string+ := /thesaurus/entry/synonym/term/string()
let $b as xs:string+ := /thesaurus/entry/term/string()
return $a[not(. = $b)]
so you do the "I want the string value of this element" only once, or if for some reason you need the elements later, maybe create maps and then compare the map keys?
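The map idea could look roughly like the sketch below. It assumes an XQuery 3.1 processor (for the map: functions), and the inline $thesaurus document is invented sample data that only mimics the /thesaurus/entry structure from the question; in the real database you would replace it with the document root.

```xquery
(: Invented sample data standing in for the real thesaurus. :)
let $thesaurus :=
  <thesaurus>
    <entry>
      <term>couper</term>
      <synonym><term>raccourcir</term></synonym>
    </entry>
    <entry>
      <term>raccourcir</term>
      <synonym><term>abréger</term></synonym>
    </entry>
  </thesaurus>

(: Build the lookup once: one key per entry term. :)
let $terms := map:merge(
  for $t in $thesaurus/entry/term/string()
  return map:entry($t, true())
)

(: Keep the synonyms that are not entry terms themselves. :)
return $thesaurus/entry/synonym/term/string()[not(map:contains($terms, .))]
```

With this sample data only "abréger" is returned, since "raccourcir" also appears as an entry term. Whether a map is actually faster than the plain comparison depends on the processor and its indexes, so it is worth measuring both.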
-- Graydon
--
Xavier-Laurent Salvador
Professeur Agrégé, Maître de Conférence HDR
ECC TTN 2018 22868H - équipe "Humanités Numériques"
Coordinateur du réseau international HiBHidEM
http://ttn.univ-paris13.fr
Université Paris 13 Sorbonne Paris Cité
99 avenue Jean-Baptiste Clément, 93430 Villetaneuse
tel.: (+33) 06 51.65.84.38
email: xavier-laurent.salvador@univ-paris13.fr
website: http://www.biblehistoriale.fr
website: http://www.humanitesnumeriques.fr