Hello,
I have upgraded BaseX from 7.2 to 7.3, but I now see a severe performance drop on my query.
My query has two parts:
- The first part extracts values from an XPath expression and stores them in a map (the map contains 40000 entries).
- The second part extracts values from an XPath expression and tests whether they are contained in the map (the XPath returns around 30000 entries).
On 7.2 a typical query runs in 2 s; on 7.3 I have no result after 10 minutes.
Is there a change in BaseX 7.3 that could explain this performance drop?
Thanks for your help,
Regards,
Nicolas
I have analysed the cause of the performance drop.
My first statement was wrong: the performance drop is not from 7.2 to 7.3 but from 7.2 to 7.2.1.
BaseX v7.2 executes the query in sequence: first the FLWOR that extracts the data and stores it in a map, then the FLWOR that tests whether the values are contained in the map.
BaseX v7.2.1 optimises the query, and the two sequential FLWOR expressions are merged into one main FLWOR with an embedded FLWOR. The map seems to be constructed again and again for each value of the second FLWOR.
Hope it will help,
Regards,
Nicolas
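To make the two shapes described above concrete, here is a schematic with small made-up sequences in place of the real paths. It only illustrates the described behaviour and is not the actual plan produced by BaseX 7.2.1; the values 'a', 'b', 'c' and 'x' are placeholders.

```xquery
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(
  (: Shape 1: the map is built once, then probed for each value. :)
  let $res := map:new(for $k in ('a', 'b', 'c') return map:entry($k, true()))
  for $v in ('a', 'x')
  return if (not(map:contains($res, $v))) then <ko>{$v}</ko> else (),

  (: Shape 2: the map construction sits inside the loop, so it may be
     repeated for every value unless the processor hoists it out again. :)
  for $v in ('a', 'x')
  let $res := map:new(for $k in ('a', 'b', 'c') return map:entry($k, true()))
  return if (not(map:contains($res, $v))) then <ko>{$v}</ko> else ()
)
```

Both shapes return the same result; they differ only in how often the map may be built. With 40000 entries and around 30000 lookups, repeating the construction per value would be consistent with the slowdown described.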
Hi Nicolas,
thanks for your analysis. Due to the complexity of XQuery, and the wide variety of possible execution plans, it frequently happens that some queries get slower than others, and vice versa. 2 seconds vs. 10 minutes is striking, though, so feel free to send us a little query that demonstrates the behavior.
Christian
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
Here is the query, hope it will help:
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";
<result> {
  let $res := map:new(
    for $dmCode in /dmodule/identAndStatusSection/dmAddress/dmIdent/dmCode
    return map:entry(string(concat(
      $dmCode/@modelIdentCode, $dmCode/@systemDiffCode, $dmCode/@systemCode,
      $dmCode/@subSystemCode, $dmCode/@subSubSystemCode, $dmCode/@assyCode,
      $dmCode/@disassyCode, $dmCode/@disassyCodeVariant, $dmCode/@infoCode,
      $dmCode/@infoCodeVariant, $dmCode/@itemLocationCode)), true()))
  for $dml in /dml/dmlContent/dmlEntry/dmRef/dmRefIdent/dmCode
  let $ident := string(concat(
    $dml/@modelIdentCode, $dml/@systemDiffCode, $dml/@systemCode,
    $dml/@subSystemCode, $dml/@subSubSystemCode, $dml/@assyCode,
    $dml/@disassyCode, $dml/@disassyCodeVariant, $dml/@infoCode,
    $dml/@infoCodeVariant, $dml/@itemLocationCode))
  return if (not(map:contains($res, $ident))) then <ko>{$ident}</ko> else ()
} </result>
Thanks; do you also have a document for testing (which you may send directly to me)?
Christian
Hi Nicolas,
finally some feedback: as you already figured out (thanks for the hint), BaseX 7.2 and 7.2.1 apply different rewritings before evaluating your query. I have found the "optimization" that changes the behavior [1]. It was introduced to pre-evaluate a number of other queries starting with a root node [2]. As this rewriting is important to speed up a bunch of other queries that have been too slow in the past, it will probably stay as is. Instead, we might implement some other rewritings that could again speed up queries like yours.
The core problem is that it's difficult to decide a) when/if lazy evaluation of an expression will be faster than a pre-evaluation, and b) which subexpressions will always yield the same result and can thus be cached. As an example, the following query…
let $doc := //millionsOfNodes return $doc[1]
…will be evaluated much faster if it's rewritten to…
(//millionsOfNodes)[1]
…because querying can be stopped after the first node has been returned.
In a nutshell (I hope I didn't swamp you with too many details): your existing query will continue to be rewritten differently than before, BUT… you can move the expression of the first "let" clause to a global variable. This way, you can enforce that the expression will always be pre-evaluated, and not moved into the loop:

declare namespace map = "http://www.w3.org/2005/xpath-functions/map";
declare variable $res := map:new( ... );

for $dml in /dml/dmlContent/dmlEntry/dmRef/dmRefIdent/dmCode
let $ident := string(concat($dml/@modelIdentCode , $dml/@systemDiffCode , ...
Hope this helps; feel free to ask for more, Christian
[1] https://github.com/BaseXdb/basex/commit/68a4490ff8e6d7f6a75d1b182a9091b76bd1...
[2] https://github.com/BaseXdb/basex/issues/474
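Putting the suggestion above together with the query posted earlier in the thread, the full workaround could look roughly like the following. This is only a sketch assembled from the two messages; it was not posted in this form, and it assumes the same database context as the original query.

```xquery
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

(: Global variable: per the advice above, this should be pre-evaluated
   rather than moved into the loop below. :)
declare variable $res := map:new(
  for $dmCode in /dmodule/identAndStatusSection/dmAddress/dmIdent/dmCode
  return map:entry(string(concat(
    $dmCode/@modelIdentCode, $dmCode/@systemDiffCode, $dmCode/@systemCode,
    $dmCode/@subSystemCode, $dmCode/@subSubSystemCode, $dmCode/@assyCode,
    $dmCode/@disassyCode, $dmCode/@disassyCodeVariant, $dmCode/@infoCode,
    $dmCode/@infoCodeVariant, $dmCode/@itemLocationCode)), true())
);

<result> {
  (: Report every referenced code that is missing from the map. :)
  for $dml in /dml/dmlContent/dmlEntry/dmRef/dmRefIdent/dmCode
  let $ident := string(concat(
    $dml/@modelIdentCode, $dml/@systemDiffCode, $dml/@systemCode,
    $dml/@subSystemCode, $dml/@subSubSystemCode, $dml/@assyCode,
    $dml/@disassyCode, $dml/@disassyCodeVariant, $dml/@infoCode,
    $dml/@infoCodeVariant, $dml/@itemLocationCode))
  return if (not(map:contains($res, $ident))) then <ko>{$ident}</ko> else ()
} </result>
```

The only change from the original query is that the map construction has moved from the first "let" clause into the prolog; the loop body is unchanged.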
Hi Nicolas,
the latest snapshot of BaseX [1] contains some new optimizations, thanks to which the let clause in your original query should no longer be moved. I assume that these optimizations will not cause any more regressions in future versions of BaseX.
Feedback is welcome, Christian
[1] http://files.basex.org/releases/latest/