Hi Christian, 

Thank you very much for your detailed answer; your comments are very useful to me.

- Could you check once again if this is fixed with the new snapshot? I can confirm it: with your new snapshot, the problem is fixed [1].

Thank you also for your comments about duplicate paths. You're right: the query performs better when each path is written only once. I've changed it [2].

About "ft:search" (and the full-text index in general): I've noticed some "strange" behaviour when performing a full-text search.

I'll write about it in a separate thread, though, to keep everything consistent.

Best regards,
Sebastian

[1] https://imgur.com/XnRdxyD
[2] https://imgur.com/U1wo4y3





On Thu, May 14, 2020 at 11:25 AM Christian Grün <christian.gruen@gmail.com> wrote:
Hi Sebastian,

I couldn’t get this reproduced out of the box. A technical guess:
Global full-text options may have been overwritten by
database-specific properties in the second switch branch at compile time,
which yielded wrong/restricted results in the first branch at runtime.

Could you check once again if this is fixed with the new snapshot [1]?

Some more comments on your query: If you formulate duplicate paths
only once, you might get even better performance:

OLD:
  for $a in A
  where $a/B/C/D/E contains text { $text }
  return $a/B/C/D/E

NEW:
  for $e in A/B/C/D/E
  where $e contains text { $text }
  return $e
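
Applied to the query from the original message, the NEW form might
look like the sketch below. It assumes the database names and the
element path quoted further down, uses a hard-coded search term, and
has not been run against those databases:

  let $text := 'apple'
  return (# db:enforceindex #) {
    for $db in ('US00', 'US01', 'US02')
    (: the full path is written only once and bound to $mark :)
    for $mark in db:open($db)/trademark-applications-daily
      /application-information/file-segments/action-keys
      /case-file/case-file-header/mark-identification/text()
    where $mark contains text { $text }
    return $mark
  }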

In a future version of BaseX, such patterns will automatically be
rewritten. Currently, basic patterns are already simplified [2]:

  for $e in A/B where $e/C/D return $e/C/D/E
  → A/B[C/D]/C/D/E
  → A/B/C/D/E
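
To see that the three forms select the same nodes, they can be
compared on a small inline document. This is only an illustration:
the document bound to $doc is made up, and the paths are rooted at
$doc instead of the context item:

  let $doc := document {
    <A>
      <B><C><D><E>x</E></D></C></B>
      <B><C/></B>
    </A>
  }
  return (
    (: original FLWOR form :)
    for $e in $doc/A/B where $e/C/D return $e/C/D/E,
    (: where clause turned into a predicate :)
    $doc/A/B[C/D]/C/D/E,
    (: predicate dropped, as the path already implies it :)
    $doc/A/B/C/D/E
  )

Each of the three expressions yields the single E element of the
first B; the second B has no C/D and contributes nothing.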

The enforceindex option is still in a somewhat experimental stage
(hence, thanks for your feedback), and its behavior is sometimes
surprising if there are various competing candidates for index
rewrites in your expression. If you want to have more control over how
your queries are executed, you can directly call ft:search:

    for $db in ('US00','US01','US02')
    return ft:search($db, $text)[parent::mark-identification]

If all 'mark-identification' elements occur on the same level in your
document, you can omit the remaining parent steps (this will further
speed up query evaluation). A look at the optimized query in the
InfoView panel will give you some more hints.
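
If 'mark-identification' elements can occur on different levels, the
stricter variant would spell out the parent steps. A minimal sketch,
assuming the database names and the element path from the original
query and a hard-coded search term:

    let $text := 'apple'
    for $db in ('US00', 'US01', 'US02')
    return ft:search($db, $text)[
      (: keep only text nodes below the expected ancestors :)
      parent::mark-identification
        /parent::case-file-header
        /parent::case-file
    ]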

Cheers,
Christian

[1] http://files.basex.org/releases/latest/
[2] https://github.com/BaseXdb/basex/issues/1864



On Wed, May 13, 2020 at 11:23 PM Sebastian Guerrero <chapeti@gmail.com> wrote:
>
> Hi everyone! It's me again.
>
> Here is my question:
>
> If I execute this query:
>
>              (# db:enforceindex #) {
>                   for $db in ('US00','US01','US02')
>                   for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
>                   where $tmUS/case-file-header/mark-identification/text() contains text { 'apple' }
>                   return $tmUS/case-file-header/mark-identification/text()
>                 }
>
> I get 4k results in 139ms from three databases of 90GB and 13M records. It works like a charm. [01]
>
> But if I put that query into a for and then into a switch (I tried with if-then-else too), the same query returns only 11 results in 107ms [02]:
>
> declare namespace gb="http://www.ipo.gov.uk/schemas/tm";
> let $text := "apple"
> let $registries := ('US')
>
> for $registry in $registries
> return
>   switch ($registry)
>
>     case "US"
>     return
>       (# db:enforceindex #) {
>         for $db in ('US00','US01','US02')
>         for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
>         where $tmUS/case-file-header/mark-identification/text() contains text { $text }
>         return $tmUS/case-file-header/mark-identification/text()
>       }
>
>     case "GB"
>     return
>       (# db:enforceindex #) {
>         for $tmGB in db:open('GB')/gb:MarkLicenceeExportList/gb:TradeMark
>         where $tmGB/gb:WordMarkSpecification/gb:MarkVerbalElementText/text() contains text { $text }
>         return $tmGB/gb:WordMarkSpecification/gb:MarkVerbalElementText/text()
>       }
>
>     default return "Unknown registry code"
>
>
>
> I noticed that if I remove the "GB" case (even though it's never evaluated), the query works fine and returns the 4k records [03]:
>
> declare namespace gb="http://www.ipo.gov.uk/schemas/tm";
> let $text := "apple"
> let $registries := ('US')
>
> for $registry in $registries
> return
>   switch ($registry)
>
>     case "US"
>     return
>       (# db:enforceindex #) {
>         for $db in ('US00','US01','US02')
>         for $tmUS in db:open($db)/trademark-applications-daily/application-information/file-segments/action-keys/case-file
>         where $tmUS/case-file-header/mark-identification/text() contains text { $text }
>         return $tmUS/case-file-header/mark-identification/text()
>       }
>
>     default return "Unknown registry code"
>
>
>
> What am I missing here? Is this the right behaviour?
>
> Best regards,
> Sebastian
>
> [01] https://imgur.com/o4RUUyO
> [02] https://imgur.com/533c0rI
> [03] https://imgur.com/mCb3qEe