Hello dear basex list,
Is there any known solution for function memoization in basex ? Or maybe a cache implementation ?
M. Desfrênes
Hi Mickael,
BaseX provides no dedicated memoization features. I remember that some users have successfully utilized Memcached in the past to cache BaseX query results.
Salutations, Christian
On Mon, Mar 23, 2020 at 11:05 AM Mickael Desfrenes mickael.desfrenes@unicaen.fr wrote:
Hello dear basex list,
Is there any known solution for function memoization in basex ? Or maybe a cache implementation ?
M. Desfrênes
On Mon, 2020-03-30 at 15:39 +0200, Christian Grün wrote:
I remember that some users have successfully utilized Memcached in the past to cache BaseX query results.
I did this for a while on fromoldbooks.org and it worked fine (you have to know when to invalidate the cache of course!). But for www.fromolbooks.org/Search/ i have a cacheing framework i wrote years ago (which unfortunately grew into a monster, as these things do) that also lets me switch between different XQuery engines. It's been ages, i think more than 10 years, since i've had to switch, but it's sometimes useful for debugging problems.
The first version was in production in under a day.
Most of the queries i use in BaseX for the Web site run faster than the time to start the cahe framework these days, though, so i should redo it. You have to get the basic HTML result sent to the Google bot in under a second if you want to be on the first page of results.
Liam
Not sure, whether this went through to the mailing list or only to Mickael, so please, forgive me, if posting twice.
On Mon, Mar 23, 2020 at 11:05 AM Mickael Desfrenes < mickael.desfrenes@unicaen.fr> wrote:
Is there any known solution for function memoization in basex ? Or maybe a cache implementation ?
Not sure how much this helps, but there was a Balisage talk by James Fuller, titled "A catalog of Functional Programming idioms in XQuery 3.1" https://www.balisage.net/Proceedings/vol17/author-pkg/Fuller01/BalisageVol17-Fuller01.html, which produces (amongst others) a memoize function. It is written in the Marklogic dialect of XQuery. but porting it was a no-brainer, since, AFAIR, only the syntax for maps had to be changed. The Github repo is here https://github.com/xquery/xquery_functional_catalog/.
That's great, just what I was looking for. And the rest of the presentation is interesting too.
Thank you !
De: "Andreas Mixich" mixich.andreas@gmail.com Cc: "basex-talk" basex-talk@mailman.uni-konstanz.de Envoyé: Samedi 4 Avril 2020 02:55:45 Objet: Re: [basex-talk] Memoize
Not sure, whether this went through to the mailing list or only to Mickael, so please, forgive me, if posting twice. On Mon, Mar 23, 2020 at 11:05 AM Mickael Desfrenes < [ mailto:mickael.desfrenes@unicaen.fr | mickael.desfrenes@unicaen.fr ] > wrote:
Is there any known solution for function memoization in basex ? Or maybe a cache implementation ?
Not sure how much this helps, but there was a Balisage talk by James Fuller, titled [ https://www.balisage.net/Proceedings/vol17/author-pkg/Fuller01/BalisageVol17... | "A catalog of Functional Programming idioms in XQuery 3.1" ] , which produces (amongst others) a memoize function. It is written in the Marklogic dialect of XQuery. but porting it was a no-brainer, since, AFAIR, only the syntax for maps had to be changed. The [ https://github.com/xquery/xquery_functional_catalog/ | Github repo is here ] .
Glad I could help!
I just found this in the Saxon documentation (I am adding it for the sake of completeness, even though you may be interested in BaseX):
saxon:memo-function https://www.saxonica.com/html/documentation/extensions/attributes/memo-function.html
On Sun, Apr 5, 2020 at 10:34 AM Mickael Desfrenes < mickael.desfrenes@unicaen.fr> wrote:
That's great, just what I was looking for. And the rest of the presentation is interesting too.
Thank you !
*De: *"Andreas Mixich" mixich.andreas@gmail.com *Cc: *"basex-talk" basex-talk@mailman.uni-konstanz.de *Envoyé: *Samedi 4 Avril 2020 02:55:45 *Objet: *Re: [basex-talk] Memoize
Not sure, whether this went through to the mailing list or only to Mickael, so please, forgive me, if posting twice. On Mon, Mar 23, 2020 at 11:05 AM Mickael Desfrenes < mickael.desfrenes@unicaen.fr> wrote:
Is there any known solution for function memoization in basex ? Or maybe a cache implementation ?
Not sure how much this helps, but there was a Balisage talk by James Fuller, titled "A catalog of Functional Programming idioms in XQuery 3.1" https://www.balisage.net/Proceedings/vol17/author-pkg/Fuller01/BalisageVol17-Fuller01.html, which produces (amongst others) a memoize function. It is written in the Marklogic dialect of XQuery. but porting it was a no-brainer, since, AFAIR, only the syntax for maps had to be changed. The Github repo is here https://github.com/xquery/xquery_functional_catalog/.
-- Minden jót, all the best, Alles Gute, Andreas Mixich
Hi Mickael, hi Andreas,
It is written in the Marklogic dialect of XQuery. but porting it was a no-brainer, since, AFAIR, only the syntax for maps had to be changed. The Github repo is here.
I see that the code uses MarkLogic’s map module functions, which are based on an implementation of a classical side-effecting and mutable hash map. As real XQuery maps are immutable (i.e., once defined, their contents will never change, see [1]), you may need to use Java bindings and instantiate a Java HashMap [2]. I have rewritten one of the proposed memoize functions, it’s attached.
I forgot to mention that BaseX itself uses runtime optimizations to memoize data as well. Just two examples:
1. If a path expression is requested multiple times, its result will automatically be cached. In the query below, it’s //name that will only be evaluated once:
//city[. = //name]
2. If large sequences are compared, the items to be compared will incrementally be stored in a hash map. In the query below, it’s the items of $lines2 that will be put to a hash map:
let $lines1 := file:read-text-lines('file1.txt') let $lines2 := file:read-text-lines('file2.txt') return $lines1[. = $lines2]
In both cases, evaluation time can be reduced from a quadratical to a linear runtime (and the result may be available within milliseconds instead of seconds or minutes).
@Mickael:
1. Do you try to reduce the runtime of specific queries, or 2. do you want to get faster results if you run the same query multiple times?
Do you possibly have specific use cases or queries which you’d like to speed up via memoization?
Best, Christian
[1] http://docs.basex.org/wiki/XQuery_3.1#Maps [2] http://docs.basex.org/wiki/Java_Bindings
Hello,
Thank you for the rewrote of the memoize function, that's very interesting. (and the part about the java hashmap will actually find a use for another problem I have).
My goal was to get faster results when a query is run multiple times. Yes, that's probably premature optimization, but since I do require these things in other application stacks I thought I'd ask.
Mickaël
----- Mail original ----- De: "Christian Grün" christian.gruen@gmail.com À: "Andreas Mixich" mixich.andreas@gmail.com, "Mickael Desfrenes" mickael.desfrenes@unicaen.fr Cc: "basex-talk" basex-talk@mailman.uni-konstanz.de Envoyé: Mercredi 8 Avril 2020 11:05:46 Objet: Re: [basex-talk] Memoize
Hi Mickael, hi Andreas,
It is written in the Marklogic dialect of XQuery. but porting it was a no-brainer, since, AFAIR, only the syntax for maps had to be changed. The Github repo is here.
I see that the code uses MarkLogic’s map module functions, which are based on an implementation of a classical side-effecting and mutable hash map. As real XQuery maps are immutable (i.e., once defined, their contents will never change, see [1]), you may need to use Java bindings and instantiate a Java HashMap [2]. I have rewritten one of the proposed memoize functions, it’s attached.
I forgot to mention that BaseX itself uses runtime optimizations to memoize data as well. Just two examples:
1. If a path expression is requested multiple times, its result will automatically be cached. In the query below, it’s //name that will only be evaluated once:
//city[. = //name]
2. If large sequences are compared, the items to be compared will incrementally be stored in a hash map. In the query below, it’s the items of $lines2 that will be put to a hash map:
let $lines1 := file:read-text-lines('file1.txt') let $lines2 := file:read-text-lines('file2.txt') return $lines1[. = $lines2]
In both cases, evaluation time can be reduced from a quadratical to a linear runtime (and the result may be available within milliseconds instead of seconds or minutes).
@Mickael:
1. Do you try to reduce the runtime of specific queries, or 2. do you want to get faster results if you run the same query multiple times?
Do you possibly have specific use cases or queries which you’d like to speed up via memoization?
Best, Christian
[1] http://docs.basex.org/wiki/XQuery_3.1#Maps [2] http://docs.basex.org/wiki/Java_Bindings
On Thu, 2020-04-09 at 10:08 +0200, Mickael Desfrenes wrote:
My goal was to get faster results when a query is run multiple times. Yes, that's probably premature optimization, but since I do require these things in other application stacks I thought I'd ask.
I have a Perl-based framework that kept a cache of results). But, the time taken to open a cache file and read it is often longer than it takes BaseX to run the query. The main value is that there are a couple of queries that are much slower.
I've also used memcached via php in a front end, and that's faster.
One of that hardest things about cache management, though, is invalidating pages when the data changes. For https://www.fromoldbooks.org/Search/ i just blow away the whole cache and then pre-fetch the 100 or so most common queries, one per second.
But the front page on fromoldbooks.org is not cached and is just about as fast as a search. The sweet spot for Web back ends is still that you need a Web page to finish loading in under two seconds to stop Google from hating you :)
Liam
Am 08.04.2020 um 11:05 schrieb Christian Grün:
I see that the code uses MarkLogic’s map module functions, which are based on an implementation of a classical side-effecting and mutable hash map. As real XQuery maps are immutable (i.e., once defined, their contents will never change, see [1]), you may need to use Java bindings and instantiate a Java HashMap [2]. I have rewritten one of the proposed memoize functions, it’s attached.
Sorry for the late reply, I don't find much time to pursue my hobby these days, just wanted to give a head's up, since I am always very happy, when you post some XQuery code to the list :-) Want to say: It hasn't gone unseen and thanks so much!
basex-talk@mailman.uni-konstanz.de