[basex-talk] BaseX optimizer performance on REx-generated parser

28 Mar 2016


Hi,
in a recent update of the REx parser generator (v5.37), some type 
specifications were removed from the generated XQuery and XSLT functions. 
This affects tail-recursive functions. The change was made because it 
turned out that MarkLogic fails to optimize tail calls, presumably due to 
an explicit type check caused by the type specification (see 
http://markmail.org/message/gxi26da4crk2v5ge ).
Now it was reported that this change causes performance problems with 
BaseX (see https://twitter.com/apb1704/status/714219874441146368 ).
I reproduced the problem as follows:
- download the XQuery 3.1 grammar from 
  http://bottlecaps.de/rex/CR-xquery-31-20151217.ebnf
- generate a parser from it on http://bottlecaps.de/rex/ using these 
  command line options:
    -xquery -tree -main
- run the generated parser, using this command line:
    basex -binput={42} CR-xquery-31-20151217.xquery
This works OK, but it takes about 80 seconds. Some analysis showed that 
the run time changes considerably when the type specification is put back 
on function p:map2, i.e. when 'as xs:integer' is declared as the result 
type:
    declare function p:map2($c as xs:integer, $lo as xs:integer,
                            $hi as xs:integer) as xs:integer
This variant completes in less than 3 seconds. It also completes quickly 
with a declared return type of
    as xs:integer?
which might be the type that can be inferred statically.
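To make the difference between the two variants concrete, here is a minimal sketch. It is not the generated code: the table $local:TABLE, the two local:search-* functions, and their bodies are hypothetical stand-ins for a tail-recursive lookup such as p:map2. The only difference between the two functions is the declared result type.

```xquery
xquery version "3.0";

(: Hypothetical sorted lookup table; not taken from REx output. :)
declare variable $local:TABLE as xs:integer+ := (10, 20, 30, 40, 50, 60, 70, 80);

(: Untyped variant: parameters are typed, the result type is left out. :)
declare function local:search-untyped($c as xs:integer, $lo as xs:integer,
                                      $hi as xs:integer)
{
  if ($lo > $hi) then
    0  (: not found :)
  else
    let $m := ($lo + $hi) idiv 2
    return
      if ($local:TABLE[$m] > $c) then
        local:search-untyped($c, $lo, $m - 1)   (: tail call :)
      else if ($local:TABLE[$m] < $c) then
        local:search-untyped($c, $m + 1, $hi)   (: tail call :)
      else
        $m
};

(: Typed variant: same body, with 'as xs:integer' declared as result type. :)
declare function local:search-typed($c as xs:integer, $lo as xs:integer,
                                    $hi as xs:integer) as xs:integer
{
  if ($lo > $hi) then
    0  (: not found :)
  else
    let $m := ($lo + $hi) idiv 2
    return
      if ($local:TABLE[$m] > $c) then
        local:search-typed($c, $lo, $m - 1)     (: tail call :)
      else if ($local:TABLE[$m] < $c) then
        local:search-typed($c, $m + 1, $hi)     (: tail call :)
      else
        $m
};

(: Both calls search for the same value and should give the same position. :)
(
  local:search-untyped(30, 1, count($local:TABLE)),
  local:search-typed(30, 1, count($local:TABLE))
)
```

A table this small will not show the timing difference; the sketch only shows which part of the declaration changes between the slow and the fast variant.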
Is this possibly a problem with the optimizer?
Which variant of generated code would be preferable for BaseX - typed or 
untyped?
Thanks,
Gunther
