Hello everyone
I’ve recently started to use and stumbled upon a problem with the indices in basex that I do not fully understand and was hoping for some clarification.
So basically I’ve got a bunch of xml files in my database with medical ICD-10 codes that all look like this:
<classification name=„some-name“ year=„2017“ >
<code>
<value>03R80JZ</value>
<final>true</final>
</code>
<code>
<value>2W3JX1Z</value>
<final>false</final>
</code>
…
...
</classification>
The db can contain multiple of these files, one for each name and year, and each classification can contain up to 100k of those codes (all values are unique). What I’d like to do now is, given a list of codes, check if those codes actually exist in the classification for a given name and source. So I came up with this straight-forward piece of code:
let $proc := ('F0723EZ','0JWS37Z','0W054JZ','0NQ00ZZ','03R80JZ','0PW537Z','0RW940Z','0PPD0KZ','2W3JX1Z','0D9B8ZZ','0SWC30Z','0PRJ4KZ','025R0ZZ','0T1347C','0S950ZX','008J4ZZ','0D110K4','0XUV4JZ','0GTN4ZZ','4A133J1’)
for $p in $proc
return /classification[@name=‚some-name‘ and @year=2017]/code[value = $p]
This does work fine, however it takes roughly one second to evaluate the query, and in the Query Info Window I can see that the text index of the db is not being used. However I do see that the attribute index is being queried. Now I noticed that, if I drop the attribute index, the query is evaluated basically instantaneous and can see that the query is rewritten to use the text index as well.
So my question is:
Is this normal, i.e. is it not possible to use multiple indices in one query at once?
Or is this a problem where the „sub-query“ …./code[value = $p] is only evaluated on the result of the first query, thus the text index is not being considered? If so, is there a way (without rewriting the whole query) to make basex use both indices when compiling the query?
Thanks a lot already and all the best
Clemens Müthing