Just as an example:
On our database with attribtue index the following query
db:open("lookup")//entry[@zip = "53040" and @city ="BETTOLLE"]
returns the following output in the info view:
Compiling:
- pre-evaluate db:open("lookup") to document-node()
- atomic evaluation of (@*:zip = "53040")
- atomic evaluation of (@*:city = "BETTOLLE")
- rewrite ((@*:zip = "53040") and (@*:city = "BETTOLLE"))
to predicate(s)
- rewrite boolean((@*:zip = "53040")) to '=' operator
- rewrite boolean((@*:city = "BETTOLLE")) to '=' operator
- rewrite descendant-or-self step(s)
- apply attribute index for "BETTOLLE"
Optimized Query:
db:attribute("lookup",
"BETTOLLE")/self::*:city/parent::*:entry[(@*:zip = "53040")]
It's clearly visible the application of the nost selective
attribute @city. Time 2.6 msec.
the following query db:open("lookup")//entry[@zip = "53040" and
matches(@city,"BETTOLLE")] returns the following output in the
info view:
Compiling:
- pre-evaluate db:open("lookup") to document-node()
- atomic evaluation of (@*:zip = "53040")
- rewrite ((@*:zip = "53040") and matches(@*:city,
"BETTOLLE")) to predicate(s)
- rewrite boolean((@*:zip = "53040")) to '=' operator
- rewrite boolean(matches(@*:city, "BETTOLLE")) to
fn:matches(string,pattern[,mod])
- rewrite descendant-or-self step(s)
- apply attribute index for "53040"
Optimized Query:
db:attribute("lookup",
"53040")/self::*:zip/parent::*:entry[matches(@*:city,
"BETTOLLE")]
Here the @zip attribute is used for indexing 3.6 msec but anyway
time is not significative it's comparable..
the following query db:open("lookup")//entry[matches(@zip,"53040")
and matches(@city,"BETTOLLE")] returns the following output in the
info view:
Compiling:
- pre-evaluate db:open("lookup") to document-node()
- rewrite (matches(@*:zip, "53040") and matches(@*:city,
"BETTOLLE")) to predicate(s)
- rewrite boolean(matches(@*:zip, "53040")) to
fn:matches(string,pattern[,mod])
- rewrite boolean(matches(@*:city, "BETTOLLE")) to
fn:matches(string,pattern[,mod])
- rewrite descendant-or-self step(s)
- convert to child steps:
descendant::*:entry[matches(@*:zip, "53040")][matches(@*:city,
"BETTOLLE")]
Optimized Query:
db:open-pre("lookup",0)/*:lookup/*:entry[matches(@*:zip,
"53040")][matches(@*:city, "BETTOLLE")]
Here no indexing is exploited and return time is 76msec.
Anyway how hard I try (with basex 8.6.6) I'm not able to fool the
optimizer and even these attempts ar eable to perfectly exploit
the index.
declare function local:q($db) {
db:open($db)//entry[@zip = "53040" and @city
="BETTOLLE"]
};
declare variable $database as xs:string external;
local:q($database)
or directly local:q("lookup")
So before coming to a conclusion take a look at what happens
and is logged in your info view.
Hope this is useful,
Marco.
On 20/10/2017 14:35, Marco Lettere wrote:
Hi France,
check out the info window of the GUI in a test-run it will tell.
Anyway I think that if the database name is in a declared
variable it should defintely be able to grasp the index.
If $dbname is passed into another function as a function
parameter then it could be that the info for accessing a proper
index is lost. Need to check with inlining though....
On way around this that I've found is to insulate the usually
small part of the predicate that applies to indexing and compose
an xquery:eval for executing it with the $dbname sculpted into
it. Usually the overhead is largely covered by the gain in DB
access time.
Something like:
xquery:eval("declare variable $input as xs:string external;
db:open('" || $lang || "')/*[id=$input]", map{ "input" :
"searchedid"})
Another case that I could experien is that if you use fn:matches
instead of = in a predicate, the optimizer also gets puzzled and
is not able to exploit the index.
Wonder if for these cases there could be an improvement too.
Ciao,
M.
On 20/10/2017 14:21, France Baril wrote:
Hi,
We are working hard
right now on performance issues. I just read this about an
upcoming release:
Enforce
Rewritings
In
various cases, existing index structures will not be
utilized by the query optimizer. This is usually the case
if the name of the database is not a static string (e.g.,
because it is bound to a variable or passed on as argument
of a function call). Furthermore, several candidates for
index rewritings may exist, and the query optimizer may
decide for a rewriting that turns out to be suboptimal.
I'm
not sure I read it properly so I would like to confirm
my understanding:
In
our code, all our db accesses are in a format like this:
db:open($lang)/*[@id='...']. Does the fact that the db
name is $lang, a variable, mean that we have been
working without the optimizer all this time and
therefore that this new feature is there for people like
us?
--