Hi Andy,
that's correct. p:transition ought to be tail-call optimized to be prepared for tokens of arbitrary size.
Best regards Gunther
On 29.03.2016 19:17, Andy Bunce wrote:
Hi Gunther, Do I under understand correctly from: https://twitter.com/__Gunther__/status/709744679361912832
Given the way the EBNF for XQuery comments is defined - If (REx) p:transition is not optimized via tail-recursion then long comments will use excessive stack? /Andy
On 29 March 2016 at 18:05, Christian Grün <christian.gruen@gmail.com mailto:christian.gruen@gmail.com> wrote:
So we may probably need to find out if the query can be optimized at all… On Tue, Mar 29, 2016 at 7:01 PM, Andy Bunce <bunce.andy@gmail.com <mailto:bunce.andy@gmail.com>> wrote: > Looking again at Florent's message > http://markmail.org/message/gxi26da4crk2v5ge... > After testing it seems is the length of the XQuery comments that trigger the > stack overflow in BaseX as well. > /Andy > > > On 29 March 2016 at 16:54, Andy Bunce <bunce.andy@gmail.com <mailto:bunce.andy@gmail.com>> wrote: >> >> >Did you hear sth. about the performance of MarkLogic >> My understanding is that on ML the presence typing blocks some >> tail-recursion optimizations leading to stack overflow for large texts. >> >> Stack overflow is also an issue for my use of REx with BaseX, I would >> like to be able to parse sources like the FunctX library without having to >> set a large Java stack size. [1] (My report that the issue was fixed was >> incorrect). I would like to investigate if BaseX is using tail-recursion >> optimizations here and/or where/why this overflow happens - but it all looks >> very complicated :-). >> >> I suspect the users of XQuery REx are capable of hand tweaking the REx o/p >> either to add or remove type declarations once the issues are well >> documented. >> >> /Andy >> >> >> [1] >> http://www.mail-archive.com/basex-talk%40mailman.uni-konstanz.de/msg07212.html >> >> On 29 March 2016 at 16:24, Christian Grün <christian.gruen@gmail.com <mailto:christian.gruen@gmail.com>> >> wrote: >>> >>> Hi Gunther, >>> >>> Thanks for your mail. After some research, I don’t see a quick way to >>> statically infer the type in the function you mentioned, mostly >>> because it’s called recursively (while statically inferring the type >>> of the function, the return type of map2 function is requested, and it >>> will be the type of the function signature, because the inferred type >>> is not available yet…). >>> >>> Personally, I would love to see the return types readded, not only >>> because it speeds up BaseX, but also because I think it’s a general >>> advantage of XQuery to have typed functions, and . Do you think there >>> is any chance to revert the change? Did you hear sth. about the >>> performance of MarkLogic – is the version without argument types >>> comparable to BaseX in terms of execution time, or much faster? >>> >>> All the best, >>> Christian >>> >>> >>> >>> > in a recent update of REx parser generator (v5.37), some type >>> > specifications >>> > were removed from generated XQuery and XSLT functions. This affects >>> > tail-recursive functions, and it was done because it turned out that >>> > MarkLogic fails to optimize tail calls, presumably due to an explicit >>> > type >>> > check caused by the type specification (see >>> > http://markmail.org/message/gxi26da4crk2v5ge ). >>> > >>> > Now it was reported that this change causes performance problems with >>> > BaseX >>> > (see https://twitter.com/apb1704/status/714219874441146368 ). >>> > >>> > I reproduced the problem as follows: >>> > >>> > - download the XQuery 3.1 grammar from >>> > http://bottlecaps.de/rex/CR-xquery-31-20151217.ebnf >>> > >>> > - generate a parser from it on http://bottlecaps.de/rex/ using >>> > command >>> > line options: >>> > >>> > -xquery -tree -main >>> > >>> > - run the generated parser, using this command line >>> > >>> > basex -binput={42} CR-xquery-31-20151217.xquery >>> > >>> > This works OK, but it takes about 80 seconds. Some analysis showed that >>> > the >>> > time can be influenced by putting back the type specification to >>> > function >>> > p:map2, i.e. declaring 'as xs:integer' as the result type: >>> > >>> > declare function p:map2($c as xs:integer, $lo as xs:integer, $hi >>> > as >>> > xs:integer) as xs:integer >>> > >>> > This variant completes in less than 3 seconds. But even when declaring >>> > a >>> > return type of >>> > >>> > as xs:integer? >>> > >>> > which might be the type that can be inferred statically, it completes >>> > fast. >>> > >>> > Is this possibly a problem with the optimizer? >>> > >>> > Which variant of generated code would be preferable for BaseX - typed >>> > or >>> > untyped? >>> > >>> > Thanks, >>> > Gunther >> >> >