Re: [basex-talk] BaseX optimizer performance on REx-generated parser

29 Mar 2016

Hi Gunther,
Do I understand correctly from
https://twitter.com/__Gunther__/status/709744679361912832
that, given the way the EBNF for XQuery comments is defined, long comments
will use excessive stack if (REx) p:transition is not optimized via
tail recursion?
/Andy
On 29 March 2016 at 18:05, Christian Grün christian.gruen@gmail.com wrote:
...
So we may probably need to find out if the query can be optimized at all…
On Tue, Mar 29, 2016 at 7:01 PM, Andy Bunce bunce.andy@gmail.com wrote:
...
Looking again at Florent's message
http://markmail.org/message/gxi26da4crk2v5ge...
After testing, it seems it is the length of the XQuery comments that triggers
the stack overflow in BaseX as well.
/Andy
On 29 March 2016 at 16:54, Andy Bunce bunce.andy@gmail.com wrote:
...
...
Did you hear sth. about the performance of MarkLogic
My understanding is that on ML the presence of typing blocks some
tail-recursion optimizations, leading to stack overflow for large texts.
Stack overflow is also an issue for my use of REx with BaseX. I would
like to be able to parse sources like the FunctX library without having to
set a large Java stack size. [1] (My report that the issue was fixed was
incorrect.) I would like to investigate if BaseX is using tail-recursion
optimizations here and/or where/why this overflow happens, but it all looks
very complicated :-).
I suspect the users of XQuery REx are capable of hand-tweaking the REx output
either to add or remove type declarations once the issues are well
documented.
/Andy
[1]
http://www.mail-archive.com/basex-talk%40mailman.uni-konstanz.de/msg07212.ht...
...
...
On 29 March 2016 at 16:24, Christian Grün christian.gruen@gmail.com
wrote:
...
Hi Gunther,
Thanks for your mail. After some research, I don’t see a quick way to
statically infer the type in the function you mentioned, mostly
because it’s called recursively (while statically inferring the type
of the function, the return type of the map2 function is requested, and it
will be the type of the function signature, because the inferred type
is not available yet…).
Personally, I would love to see the return types re-added, not only
because it speeds up BaseX, but also because I think it’s a general
advantage of XQuery to have typed functions. Do you think there
is any chance to revert the change? Did you hear sth. about the
performance of MarkLogic – is the version without argument types
comparable to BaseX in terms of execution time, or much faster?
All the best,
Christian
...
In a recent update of the REx parser generator (v5.37), some type
specifications were removed from generated XQuery and XSLT functions. This
affects tail-recursive functions, and it was done because it turned out that
MarkLogic fails to optimize tail calls, presumably due to an explicit type
check caused by the type specification (see
http://markmail.org/message/gxi26da4crk2v5ge ).
Now it was reported that this change causes performance problems with BaseX
(see https://twitter.com/apb1704/status/714219874441146368 ).
I reproduced the problem as follows:

  - download the XQuery 3.1 grammar from
    http://bottlecaps.de/rex/CR-xquery-31-20151217.ebnf

  - generate a parser from it on http://bottlecaps.de/rex/ using
    command line options:
       -xquery -tree -main

  - run the generated parser, using this command line:
       basex -binput={42} CR-xquery-31-20151217.xquery

This works OK, but it takes about 80 seconds. Some analysis showed that the
time can be influenced by putting back the type specification to function
p:map2, i.e. declaring 'as xs:integer' as the result type:
   declare function p:map2($c as xs:integer, $lo as xs:integer,
                           $hi as xs:integer) as xs:integer
This variant completes in less than 3 seconds. But even when declaring a
return type of
      as xs:integer?

which might be the type that can be inferred statically, it completes
fast.
Is this possibly a problem with the optimizer?
Which variant of generated code would be preferable for BaseX - typed or
untyped?
Thanks,
Gunther
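
For readers who want to experiment with the typed/untyped difference without generating a full parser, here is a minimal sketch of a tail-recursive lookup function in the same spirit. It is not the REx-generated p:map2: the names local:search and $local:TABLE and the table contents are hypothetical, made up for illustration. The only point is that the declared result type ('as xs:integer') can be removed or put back to compare how a processor treats the two variants.

```xquery
xquery version "3.0";

(: Hypothetical sorted lookup table; NOT taken from REx-generated code. :)
declare variable $local:TABLE as xs:integer* := (2, 3, 5, 7, 11, 13);

(: Tail-recursive binary search over $local:TABLE.
   Returns the 1-based position of $c, or 0 if $c is not present.
   Deleting the trailing 'as xs:integer' gives the untyped variant. :)
declare function local:search(
  $c  as xs:integer,
  $lo as xs:integer,
  $hi as xs:integer
) as xs:integer
{
  if ($lo > $hi) then
    0
  else
    let $m := ($lo + $hi) idiv 2
    let $v := $local:TABLE[$m]
    return
      if ($v > $c) then
        (: recursive call in tail position: search the lower half :)
        local:search($c, $lo, $m - 1)
      else if ($v < $c) then
        (: recursive call in tail position: search the upper half :)
        local:search($c, $m + 1, $hi)
      else
        $m
};

local:search(7, 1, count($local:TABLE))
```

Both recursive calls are the last thing the function does, so a processor that supports tail-call optimization can in principle run them without growing the stack. Whether a declared result type helps or hinders that is processor-specific, which is what this thread is about.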
