Thanks for your quick answer. Query execution improves its timing a lot.
One more question, in a similar query but with attributes, it looks like the index doesn´t get applied (more similar to our real situation):
XML is similar to:
<businessEvent> <inputs> <input name="data"> <input element="229388252" name="id" type="String"/> <input element="2751 XTE" name="matriculation" type="String"/> <input element="XTETVTTJKC" name="clientId" type="String"/> <input name="driver"> <input element="driver name" name="name" type="String"/> <input element="driver surname" name="surname" type="String"/> <input element="21973887X" name="nif" type="String"/> <input element="line 1" name="line1" type="String"/> <input element="Line 2" name="line2" type="String"/> <input element="City name" name="city" type="String"/> <input element="28197" name="cp" type="String"/> <input element="Region7" name="region" type="String"/> <input element="Country" name="country" type="String"/> </input> <input name="accident"> <input name="date"> <input element="2012" name="year" type="String"/> <input element="13" name="month" type="String"/> <input element="22" name="day" type="String"/> <input element="2" name="hour" type="String"/> <input element="53" name="minute" type="String"/> </input> <input name="address"> <input element="Avenue 5" name="line1" type="String"/> <input element="Line2" name="line2" type="String"/> <input element="Madrid" name="city" type="String"/> <input element="28224" name="cp" type="String"/> <input element="Region3" name="region" type="String"/> <input element="Spain" name="country" type="String"/> </input> <input element="Lorem ipsum dolor …." name="declaration" type="String"/> </input> </input> </inputs> </businessEvent>
Execution:
Query: <result>{ for $col1 in (db:open('bbdd_1')//businessEvent ), $col2 in (db:open('bbdd2')//businessEvent) for $income in distinct-values($col2//inputs/input[@name='data']/input[@name='accident']/input[@name='address']/input[@name='region']/data(@element)) where $col1//inputs/input[@name='data']/input[@name='id']/data(@element)=$col2//inputs/input[@name='data']/input[@name='id']/data(@element) group by $income return <line><label>{ $income }</label><data>{ count($col2) }</data></line>}</result> Compiling: - rewriting where clause to predicate(s) - pre-evaluating db:open("bbdd_1") - simplifying descendant-or-self step(s) - converting descendant::*:businessEvent to child steps - pre-evaluating db:open("bbdd_2") - simplifying descendant-or-self step(s) - simplifying descendant-or-self step(s) - swapping operands: descendant::*:inputs/*:input[@*:name = "data"]/*:input[@*:name = "id"]/fn:data(@*:element) = $col1/descendant::inputs/input[@name = "data"]/input[@name = "id"]/fn:data(@element) - simplifying descendant-or-self step(s) - converting descendant::*:businessEvent[descendant::*:inputs/*:input[@*:name = "data"]/*:input[@*:name = "id"]/fn:data(@*:element) = $col1/descendant::inputs/input[@name = "data"]/input[@name = "id"]/fn:data(@element)] to child steps - simplifying descendant-or-self step(s) Result: element result { for $col1 in (document-node { "1361200546116" }, ...)/*:businessEvent for $col2 in (document-node { "1361200600857" }, ...)/*:businessEvent[descendant::*:inputs/*:input[@*:name = "data"]/*:input[@*:name = "id"]/fn:data(@*:element) = $col1/descendant::inputs/input[@name = "data"]/input[@name = "id"]/fn:data(@element)] for $income in fn:distinct-values($col2/descendant::inputs/input[@name = "data"]/input[@name = "accident"]/input[@name = "address"]/input[@name = "region"]/fn:data(@element)) group by $income := $income return element line { element label { $income }, element data { fn:count($col2) } } }
Timing: - Parsing: 1.24 ms - Compiling: 7.44 ms - Evaluating: 36090.05 ms - Printing: 0.64 ms - Total Time: 36099.37 ms
Is it possible that the index doesn’t get applied?, how could I improve this query?
Date: Mon, 18 Feb 2013 13:35:56 +0100 From: lw@basex.org To: narayola@hotmail.com CC: basex-talk@mailman.uni-konstanz.de Subject: Re: [basex-talk] Performance problem
Hi Nuria,
Am 18.02.2013 12:10, schrieb Nuria Garcia:
Compiling:
- rewriting where clause to predicate(s)
- pre-evaluating db:open("Authors")
- simplifying descendant-or-self step(s)
- pre-evaluating db:open("Books")
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
- swapping operands: descendant::*:book/*:idauthor = $Author/descendant::author/idauthor
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)
It seems that the predicate into which the `where` clause is rewritten is not converted into an index access. Nothing in the query should cause this, so the problem seems to be (one of) the databases.
Are you sure that the text index of the "Books" database is up-to-date? You can check that in the GUI under "Database -> Open & Manage..." or with the command `INFO INDEX TEXT` [1]. If it is outdated, you have to update it via "Database -> Properties... -> Optimize..." in the GUI or the `OPTIMIZE` command [2]. You can also recreate the database with incremental updates activated via the `OPDINDEX` option [3], but this only affects the text and attribute indices.
Hope this helps, cheers, Leo
[1] http://docs.basex.org/wiki/Commands#INFO_INDEX [2] http://docs.basex.org/wiki/Commands#OPTIMIZE [3] http://docs.basex.org/wiki/Options#UPDINDEX