Hi Sebastian,
I couldn’t reproduce this out of the box. A technical guess: global full-text options may have been overwritten by database-specific properties in the second switch branch at compile time, which led to wrong/restricted results in the first branch at runtime.
Could you check once again if this is fixed with the new snapshot [1]?
Some more comments on your query: If you write repeated paths only once, you might get even better performance:
OLD: for $a in A where $a/B/C/D/E contains text { $text } return $a/B/C/D/E
NEW: for $e in A/B/C/D/E where $e contains text { $text } return $e
In a future version of BaseX, such patterns will automatically be rewritten. Currently, basic patterns are already simplified [2]:
for $e in A/B where $e/C/D return $e/C/D/E → A/B[C/D]/C/D/E → A/B/C/D/E
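Applied to the query from your mail, the single-path formulation could look roughly as follows. This is only a sketch: it reuses the database names and element paths from your message and has not been run against your data.

```xquery
(: Sketch: the full path is written once and bound directly to the text nodes. :)
(# db:enforceindex #) {
  for $db in ('US00', 'US01', 'US02')
  for $mark in db:open($db)/trademark-applications-daily/application-information
    /file-segments/action-keys/case-file/case-file-header/mark-identification/text()
  where $mark contains text { 'apple' }
  return $mark
}
```

The results should be the same text nodes as before, but the path is evaluated only once per database.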
The enforceindex pragma is still in a somewhat experimental stage (hence, thanks for your feedback), and its behavior can be surprising if your expression contains several competing candidates for index rewrites. If you want more control over how your queries are executed, you can call ft:search directly:
for $db in ('US00','US01','US02') return ft:search($db, $text)[parent::mark-identification]
If all 'mark-identification' elements occur on the same level in your document, you can omit the remaining parent steps (this will further speed up query evaluation). A look at the optimized query in the InfoView panel will give you some more hints.
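For the switch-based query from your mail, a version built on ft:search might look like the sketch below. It assumes that a full-text index exists for each of the databases, and that checking only the parent element is precise enough for your documents. The database names, element names and the gb namespace are taken from your message. The match semantics may differ slightly from "contains text", so please compare the result counts.

```xquery
declare namespace gb = "http://www.ipo.gov.uk/schemas/tm";

let $text := "apple"
for $registry in ('US')
return switch ($registry)
  (: Each branch accesses the full-text index explicitly, without the pragma. :)
  case "US" return
    for $db in ('US00', 'US01', 'US02')
    return ft:search($db, $text)[parent::mark-identification]
  case "GB" return
    ft:search('GB', $text)[parent::gb:MarkVerbalElementText]
  default return "Unknown registry code"
```

As the index access is spelled out in each branch, no index rewriting by the optimizer is needed for these branches.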
Cheers, Christian
[1] http://files.basex.org/releases/latest/
[2] https://github.com/BaseXdb/basex/issues/1864
On Wed, May 13, 2020 at 11:23 PM Sebastian Guerrero chapeti@gmail.com wrote:
Hi everyone! It's me again.
Here is my question:
If I execute this query:
(# db:enforceindex #) {
  for $db in ('US00','US01','US02')
  for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
  where $tmUS/case-file-header/mark-identification/text() contains text { 'apple' }
  return $tmUS/case-file-header/mark-identification/text()
}
I get 4k results in 139ms from three databases with 90GB and 13M records. It works like a charm. [01]
But if I wrap that query in a for and then in a switch (I tried if-then-else too), the same query returns only 11 results in 107ms [02]:
declare namespace gb="http://www.ipo.gov.uk/schemas/tm";
let $text := "apple"
let $registries := ('US')
for $registry in $registries
return switch ($registry)
  case "US" return (# db:enforceindex #) {
    for $db in ('US00','US01','US02')
    for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
    where $tmUS/case-file-header/mark-identification/text() contains text { $text }
    return $tmUS/case-file-header/mark-identification/text()
  }
  case "GB" return (# db:enforceindex #) {
    for $tmGB in db:open('GB')/gb:MarkLicenceeExportList/gb:TradeMark
    where $tmGB/gb:WordMarkSpecification/gb:MarkVerbalElementText/text() contains text { $text }
    return $tmGB/gb:WordMarkSpecification/gb:MarkVerbalElementText/text()
  }
  default return "Unknown registry code"
I noticed that after removing the "GB" case (even though it is never evaluated), it works fine and returns the 4k records [03]:
declare namespace gb="http://www.ipo.gov.uk/schemas/tm";
let $text := "apple"
let $registries := ('US')
for $registry in $registries
return switch ($registry)
  case "US" return (# db:enforceindex #) {
    for $db in ('US00','US01','US02')
    for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
    where $tmUS/case-file-header/mark-identification/text() contains text { $text }
    return $tmUS/case-file-header/mark-identification/text()
  }
  default return "Unknown registry code"
What am I missing here? Is this the right behaviour?
Best regards, Sebastian
[01] https://imgur.com/o4RUUyO
[02] https://imgur.com/533c0rI
[03] https://imgur.com/mCb3qEe