Hi Ron,
Did you a) create a full-text index for your data and b) ensure that your query is rewritten for index access?
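A minimal sketch of both checks, offered under some assumptions. It assumes the CTGov database can be re-optimized, and that BaseX applies the full-text index only when the options of the query (case, diacritics) match the options the index was built with. The search term 'interferon' is a hypothetical placeholder, not taken from your data. First, rebuild the index with diacritics enabled:

```xquery
(: Rebuild all index structures of the CTGov database and create a
   full-text index that distinguishes diacritics. Case sensitivity is
   left at its default (off), matching "using case insensitive". :)
db:optimize('CTGov', true(), map {
  'ftindex': true(),
  'diacritics': true()
})
```

Then run a reduced version of the inner loop as a separate query and look at the Query Info panel in the GUI; if the rewriting succeeded, the optimized query should mention full-text index access rather than a scan over all clinical_study elements:

```xquery
(: Hypothetical test term; replace it with a real synonym. :)
let $synonym_name := 'interferon'
for $study in db:open('CTGov')/clinical_study[
  intervention/intervention_name contains text { $synonym_name }
    using case insensitive using diacritics sensitive
]
return $study
```

If the index is not applied here, the predicate is evaluated against every study for every synonym, which would be consistent with the slowdown you describe.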
Best, Christian
On Fri, Aug 3, 2018 at 2:39 PM Ron Katriel rkatriel@mdsol.com wrote:
Christian,
Adding the "diacritics sensitive" option slows execution by a factor of 3. My script (fragment below), which joins two large databases, CT.gov and DrugBank, takes 2 hours without the option but 6 hours with it. Given the combinatorics involved, I am wondering if there is a better way to do this in BaseX.
Thanks, Ron
for $drug in db:open('DrugBank')/drugbank/drug
let $drug_name := $drug/name/text()
let $drug_synonyms := functx:value-union(normalize-space(lower-case($drug/name)), local:drug-synonyms($drug_name))
for $synonym_name in $drug_synonyms
...
for $study in db:open('CTGov')/clinical_study[intervention/intervention_name contains text { $synonym_name } using case insensitive using diacritics sensitive]
...
Ron Katriel, Ph.D. | Principal Data Scientist | Medidata Solutions 350 Hudson Street, 7th Floor, New York, NY 10014 rkatriel@mdsol.com | direct: +1 201 337 3622 | mobile: +1 201 675 5598 | main: +1 212 918 1800
On August 1, 2018 at 12:41:26 PM, Ron Katriel (rkatriel@mdsol.com) wrote:
Thanks, Christian. Strange, prior to contacting you and on a hunch, I tried adding the missing “using” keyword but still got the syntax error. Anyway, everything is good now!
Best, Ron
On August 1, 2018 at 3:57:51 AM, Christian Grün (christian.gruen@gmail.com) wrote:
I have fixed the example in the doc. Best, Christian
On Wed, Aug 1, 2018 at 5:08 AM Ron Katriel rkatriel@mdsol.com wrote:
Hi,
The following example from your website (docs.basex.org/wiki/Full-Text) appears to be syntactically incorrect:
"'Äpfel' will not be found..." contains text "Apfel" diacritics sensitive
In the BaseX GUI, the keyword "diacritics" is underlined in red and the following error is reported:
Unexpected end of query: 'diacritic sens...'.
This happens in version 8.6.4 and also the latest (9.0.2).
Thanks, Ron