Hi,

First of all, thank you for the excellent software you produce and maintain! Keep up the good work.

I've been using BaseX for some academic experiments on XQuery processing, and I ran into a behavior that you can probably explain.

Here is some context:
- I am using version 8.2.3. 
- The database 'expdb' was created with the default options of that version, from the 'auction.xml' document generated by the XMark benchmark [1].
- BaseX is running with default options too. 
- What the query computes is not relevant to the question.

When I execute this query:

for $pe in doc('expdb/auction.xml')/site/people/person
for $cat in doc('expdb/auction.xml')/site/categories/category[position() >= 1 and position() < 101]
where count($pe/profile/interest) > 3 and $pe/profile/interest/@category = $cat/@id
return
    <match>
        <person>{$pe/name}</person>
        <category>{$cat/name}</category>
    </match>

the resulting optimized query (shown in the 'Query Info' window of the GUI) is this:

for $pe_0 in db:open-pre("expdb",0)/*:site/*:people/*:person[3.0 < count(*:profile/*:interest)] 
for $cat_1 in db:open-pre("expdb",0)/*:site/*:categories/*:category[position() = 1 to 100][(@*:id = $pe_0/*:profile/*:interest/@*:category)] 
return element match { (element person { ($pe_0/*:name) }, element category { ($cat_1/*:name) }) } 

If I change the original query as follows (note that I am only swapping the order of the two 'for' clauses):

for $cat in doc('expdb/auction.xml')/site/categories/category[position() >= 1 and position() < 101]
for $pe in doc('expdb/auction.xml')/site/people/person
where count($pe/profile/interest) > 3 and $pe/profile/interest/@category = $cat/@id
return
    <match>
        <person>{$pe/name}</person>
        <category>{$cat/name}</category>
    </match>

the optimized query changes to:

for $cat_0 in db:open-pre("expdb",0)/*:site/*:categories/*:category[position() = 1 to 100] 
for $pe_1 in db:attribute("expdb", $cat_0/@*:id)/self::*:category/parent::*:interest/parent::*:profile/parent::*:person[3.0 < count(*:profile/*:interest)] 
return element match { (element person { ($pe_1/*:name) }, element category { ($cat_0/*:name) }) }

You can see that BaseX was able to use the attribute index (the db:attribute call) to optimize the second query, which reduced its execution time by a lot.

My question is: why does the order of the 'for' clauses determine whether the attribute index is used?

Thank you in advance!

Gabriel Tessarolli

---
[1] http://www.xml-benchmark.org/