Hi there,
BaseX 10 will come with a new Caching Module [1], which will allow you to store XQuery values (atomic items, nodes, sequences, maps, arrays, anything except function items) in a main-memory cache. Caches are persistent: their contents will be written to disk at shutdown time and retrieved from disk in a new or restarted BaseX instance when accessed for the first time.
The cache size is only limited by the available RAM. We started off with an XML format for representing caches in files, but we switched to a binary format to speed up the processing of large caches. It’s also possible to put large XML documents in the cache, but the classic database representation will give you better results in most cases.
A first snapshot is available [2].
In addition, we’ll soon introduce new database functions, which will enable you to store XQuery values (…including maps) in a database.
Have fun, Christian
[1] https://docs.basex.org/wiki/Caching_Module
[2] https://files.basex.org/releases/latest-10/
A very interesting feature for me.
I have to admit that after I posted my last explanation, I found out that I already cache data heavily for some search requests in my TEI/XML snippet API. But I hit a floor of an almost constant 2 s for retrieving data using that cache.
I probably will do a writeup of that part and try the new caching module. Maybe that is faster.
Is there a cache for XQuery code? I work with small snippets and can most of the time choose if something is a literal or passed as an input variable.
Best regards
Is there a cache for XQuery code? I work with small snippets and can most of the time choose if something is a literal or passed as an input variable.
If you want to resort to the string representation of a query, you could store it in the cache as well, and evaluate it later on, e.g. as follows:
'name-of-query' => cache:get() => xquery:eval()
Query strings could also be organized in an XQuery map:
(: run cached query :)
let $map := cache:get('queries')
let $query := $map?name-of-query
return xquery:eval($query)

(: register query :)
let $query := '123 + 456'
return cache:get('queries')
  => map:put('name-of-query', $query)
  => cache:put()
On Thu, 2022-05-05 at 12:10 +0200, Christian Grün wrote:
contents will be written to disk at shutdown time
What happens on a crash (e.g. power failure)?
E.g. for the little test/experiment site I have at www.fromoldbooks.org (and www.fromoldbooks.org/Search/) there's a framework I wrote that calls out to BaseX and keeps a cached result in a separate file, one per query. The cache is deleted entirely on any update, which is not optimal but was easy :) and a background process repopulates it based on popular queries. The cache can get up to a few gigabytes sometimes.
These days for most of the queries, BaseX is faster than the framework, so the site would speed up without it :) but there are a few that can be slow, and the cache reduces server load when someone's bot goes crazy.
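The setup described above (one file per query, everything dropped on update, popular queries refilled afterwards) could be sketched roughly as follows. This is only an illustration in Python, not the actual framework: the class `QueryFileCache`, the hashed file names, and the `fake_backend` function standing in for the call to BaseX are all assumptions made for the example.

```python
import hashlib
import os
import tempfile
from collections import Counter


class QueryFileCache:
    """Hypothetical file-per-query result cache (illustration only)."""

    def __init__(self, directory, run_query):
        self.directory = directory
        self.run_query = run_query   # callable: query string -> result string
        self.hits = Counter()        # popularity counts survive invalidation
        os.makedirs(directory, exist_ok=True)

    def _path(self, query):
        # One file per query, named after a hash of the query string.
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".txt")

    def get(self, query):
        self.hits[query] += 1
        path = self._path(query)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                return f.read()
        result = self.run_query(query)
        with open(path, "w", encoding="utf-8") as f:
            f.write(result)
        return result

    def invalidate_all(self):
        # Called on any update: simple, but discards every cached result.
        for name in os.listdir(self.directory):
            os.remove(os.path.join(self.directory, name))

    def repopulate(self, top=10):
        # Background step: recompute results for the most popular queries.
        for query, _count in self.hits.most_common(top):
            with open(self._path(query), "w", encoding="utf-8") as f:
                f.write(self.run_query(query))


calls = []


def fake_backend(query):
    # Stand-in for the real call to the database.
    calls.append(query)
    return "result of " + query


cache = QueryFileCache(tempfile.mkdtemp(), fake_backend)
cache.get("//title")
cache.get("//title")      # served from the file, backend not called again
cache.invalidate_all()    # an update happened
cache.repopulate(top=1)   # backend called once more for the popular query
```

Keeping the popularity counts outside the cached files is what lets the refill step know which queries to recompute after a full wipe.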
Philosophy question: Is a cache so different from a view in SQL, a constructed dynamic table?
Hi Liam,
What happens on a crash (e.g. power failure)?
If BaseX is shut down gracefully, the data will be stored; otherwise, it may indeed be lost. If the cached data is important, it’s advisable to call cache:write after each update.
In the documentation, I mentioned that the cache will automatically be written to disk at shutdown time. Based on some more feedback I got, I imagine there can be cases in which you simply want to create a temporary cache without making it persistent. I think I’ll change this, and I will only serialize the cache if a cache file already exists on disk (as a result of a previous explicit cache:write call).
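The proposed rule (write at shutdown only if an earlier explicit write already created a file) can be illustrated with a small language-neutral sketch. This is Python, not the BaseX implementation; the `Store` class, the JSON file format, and the method names are assumptions chosen for the example, and the sketch loads the file eagerly rather than on first access.

```python
import json
import os
import tempfile


class Store:
    """Hypothetical in-memory key-value store with opt-in persistence."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Pick up earlier contents if a store file was written before.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)

    def put(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)

    def write(self):
        # Write to a temporary file and rename it, so that a crash in the
        # middle of a write cannot leave a half-written store file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)

    def close(self):
        # Shutdown rule: serialize only if a file already exists as the
        # result of a previous explicit write().
        if os.path.exists(self.path):
            self.write()


workdir = tempfile.mkdtemp()

temp = Store(os.path.join(workdir, "temp.json"))
temp.put("a", 1)
temp.close()   # never written explicitly, so nothing reaches the disk

kept = Store(os.path.join(workdir, "kept.json"))
kept.put("a", 1)
kept.write()   # the explicit write creates the file
kept.put("b", 2)
kept.close()   # the file exists, so shutdown persists "b" as well
```

With this rule, a store that is never written stays purely temporary, while a store that was written once keeps being saved on every graceful shutdown. Data added after the last write is still lost on a crash, which is why an explicit write after each important update remains advisable.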
E.g. for the little test/experiment site I have at www.fromoldbooks.org (and www.fromoldbooks.org/Search/) there's a framework I wrote that calls out to BaseX and keeps a cached result in a separate file, one per query. […]
Reading your reply and the one from Omar, I wonder if »Cache« is really the best term to describe what the module offers. It’s basically a main-memory key-value store that can be made persistent, similar to e.g. Redis. Suggestions for a better name are welcome.
All the best, Christian
On Fri, May 06, 2022 at 04:25:44PM +0200, Christian Grün scripsit:
Reading your reply and the one from Omar, I wonder if »Cache« is really the best term to describe what the module offers. It’s basically a main-memory key-value store that can be made persistent, similar to e.g. Redis. Suggestions for a better name are welcome.
The temptation to call it the dynamic static context is great, but perhaps "session-persistent context"? SPC for short?
"There are only two hard things in Computer Science: cache invalidation and naming things." [1]
This topic combines the two ;) As for alternative names I offer: the exotic: entrepot [2], the industrial: kvstore, the bohemian: stash. Personally I think cache is fine.
I have sometimes thought of using BaseX and Redis together, managed by docker-compose. I like the data structures [3] and the concept of 'variables' that self-destruct after some time [4].
/Andy
[1] https://www.martinfowler.com/bliki/TwoHardThings.html
[2] https://en.wikipedia.org/wiki/Entrep%C3%B4t
[3] https://redis.io/docs/about/
[4] https://redis.io/commands/expire/
;·) About time for BohemiaX.
Similar to Redis, we could either work with expiry dates or limit the cache to a maximum number of entries (and drop the ones with the oldest access time).
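Both options mentioned here, expiry dates and a maximum number of entries with eviction by oldest access time, can be sketched in a few lines. This is a Python illustration of the general technique, not a planned BaseX API; `BoundedStore`, `max_entries`, `ttl`, and the injectable `clock` are names assumed for the example.

```python
import time
from collections import OrderedDict


class BoundedStore:
    """Hypothetical key-value store with optional size limit and expiry."""

    def __init__(self, max_entries=None, ttl=None, clock=time.monotonic):
        self.max_entries = max_entries   # None means unlimited
        self.ttl = ttl                   # lifetime in seconds, None means forever
        self.clock = clock
        # Ordered from least to most recently accessed: key -> (value, expiry).
        self.entries = OrderedDict()

    def put(self, key, value):
        expires = self.clock() + self.ttl if self.ttl is not None else None
        self.entries[key] = (value, expires)
        self.entries.move_to_end(key)
        # Drop the entries with the oldest access time once the limit is hit.
        while self.max_entries is not None and len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)

    def get(self, key, default=None):
        item = self.entries.get(key)
        if item is None:
            return default
        value, expires = item
        if expires is not None and self.clock() >= expires:
            del self.entries[key]        # expired entries are removed lazily
            return default
        self.entries.move_to_end(key)    # reading counts as an access
        return value


lru = BoundedStore(max_entries=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")      # "a" is now the most recently accessed entry
lru.put("c", 3)   # the limit is exceeded, so "b" is dropped
```

Expired entries are only removed when they are next requested, which keeps the sketch short; a real store would probably also need a periodic sweep so that unused expired values do not occupy memory.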
Maybe keep it simple and focused on the minimum required for XQuery use :)
From a quick test, Jedis seems to work fine from custom/lib [1].
/Andy
[1] https://gist.github.com/apb2006/9563707df4d8f7dd536d9cd3ea70046f
The naming of the upcoming module caused confusion more than once in our beta tests, so we decided to go for a light version of the industrial proposal and name it »Store Module«! BaseX 9.7.3 will be released this month. It will contain a preview version.
Everyone’s feedback is welcome. Christian
basex-talk@mailman.uni-konstanz.de