Hi,
I am running some queries against file-based and main-memory dbs for comparison.
Creating the main-memory db via XQuery (declare option or pragma) did not work:
  declare option db:mainmem 'true';
or
  (# db:mainmem true #)
Via the command line it works:
  set mainmem true
However, the db is not listed and seems to be dropped once I open another db on the command line. I had expected the main-memory db to be treated like other dbs (e.g. to exist as long as I don't drop it or stop the server).
Is this behaviour on purpose?
Regards,
Max
Hi Max,
finally some feedback...
Creating the main-memory db via xquery (declare option or pragma) did not work: declare option db:mainmem 'true'; or (# db:mainmem true #) Via command line it works: set mainmem true
Yes, this is on purpose, and for two reasons:
* All databases that are opened within a query will eventually be closed again.
* If a main-memory database is created, it has no disk representation, and as soon as the query is evaluated, it gets lost.
A long time ago, BaseX started as a main-memory database, which is actually the main reason for the MAINMEM flag. Step by step, we reduced internal main-memory dependencies, as we would otherwise have had to implement many things twice. If I remember right, the most important mainmem scenarios are as follows:
* In the standalone context, a main-memory database can be created (using CREATE DB), which can then be accessed by subsequent commands.
* If a BaseX server instance is started, and if a database is created in its context (using CREATE DB), other BaseX client instances can access (and update) this database (using OPEN, db:open, etc.) as long as no other database is opened/created by the server.
* By the way: main-memory database instances are also created by the invocation of doc(...) or collection(...), if the argument is not a database (no matter which value is set for MAINMEM). In other words: the same internal representation is used for main-memory databases and documents/collections generated via XQuery.
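To illustrate the first scenario, here is a minimal command-script sketch for the standalone console. The database name "mem" and the inline XML are made up for the example, and the comment lines assume the script is run as a command script; treat it as a sketch rather than a reference.

```text
# Assumption: standalone BaseX console or a .bxs command script.
# Turn on the MAINMEM option before the database is created.
SET MAINMEM true

# Create a main-memory database; "mem" and the XML input are hypothetical.
CREATE DB mem <root><item>one</item><item>two</item></root>

# Subsequent commands in the same context can access the database.
XQUERY count(//item)

# As reported in this thread, the database does not show up in the listing.
LIST
```

As described above, the database is only around while it stays open in this context; opening or creating another database replaces it.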
It may be worth adding such information to our Wiki articles on Databases [1]; is anyone willing to do that? ;)
Now back to the motivation of your question: We could think about extending the MAINMEM option and also open existing databases in main-memory (via OPEN or db:open) if the option is activated. This would imply that the full-text index would be ignored (currently, no main-memory representation of this index structure exists). But more importantly, updates would have to be propagated back to disk if the database is closed (or after every update). Currently, no code exists to convert main-memory instances to disk (but the data structures are at least similar), so it would be easier to simply ignore updates - but that's probably not what one would expect?
Christian
Hi
2014-05-07 11:27 GMT+02:00 Christian Grün christian.gruen@gmail.com:
Creating the main-memory db via xquery (declare option or pragma) did not work: declare option db:mainmem 'true'; or (# db:mainmem true #) Via command line it works: set mainmem true
Yes, this is on purpose, and for two reasons:
- All databases that are opened within a query will eventually be closed again.
- If a main-memory database is created, it has no disk representation,
and as soon as the query is evaluated, it gets lost.
A long time ago, BaseX started as a main-memory database, which is actually the main reason for the MAINMEM flag. Step by step, we reduced internal main-memory dependencies, as we would otherwise have had to implement many things twice. If I remember right, the most important mainmem scenarios are as follows:
- In the standalone context, a main-memory database can be created
(using CREATE DB), which can then be accessed by subsequent commands.
- If a BaseX server instance is started, and if a database is created
in its context (using CREATE DB), other BaseX client instances can access (and update) this database (using OPEN, db:open, etc.) as long as no other database is opened/created by the server.
- By the way: main-memory database instances are also created by the
invocation of doc(...) or collection(...), if the argument is not a database (no matter which value is set for MAINMEM). In other words: the same internal representation is used for main-memory databases and documents/collections generated via XQuery.
I guess that is the confusing part: when using CREATE DB, I'd expect the database to last until I drop it (or, in this case, until the server is stopped).
It may be worth adding such information to our Wiki articles on Databases [1]; is anyone willing to do that? ;)
I would give it a try (account request sent).
Now back to the motivation of your question: We could think about extending the MAINMEM option and also open existing databases in main-memory (via OPEN or db:open) if the option is activated. This would imply that the full-text index would be ignored (currently, no main-memory representation of this index structure exists).
I am not sure what you mean by ignoring full-text; queries with "contains text ..." seem to work (at least in my tests).
But more importantly, updates would have to be propagated back to disk if the database is closed (or after every update). Currently, no code exists to convert main-memory instances to disk (but the data structures are at least similar),
I could imagine supporting the development of such functionality if you consider it worth the effort.
Considering the current penalty of calling optimize() after every change, and that usage otherwise differs little (at least for me), this would make an interesting use case,
something like an xml version of http://prevayler.org/
Christian
I guess that is the confusing part: when using CREATE DB, I'd expect the database to last until I drop it (or, in this case, until the server is stopped).
If we extend MAINMEM to write main-memory databases to disk, this would indeed mean that all databases created via db:create will become persistent.
I would give it a try (account request sent).
Confirmed!
I am not sure what you mean by ignoring full-text; queries with "contains text ..." seem to work (at least in my tests).
Full-text expressions will continue to work, but they won't be sped up by the full-text index anymore. In many cases, this may not be a real bottleneck, however, because main-memory access is faster anyway.
I could imagine supporting the development of such functionality if you consider it worth the effort.
Thanks for the offer. Anyone else interested in supporting this feature?
Considering the current penalty of calling optimize() after every change, and that usage otherwise differs little (at least for me), this would make an interesting use case.
When it comes to optimize(), you could play around with the UPDINDEX option.
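A minimal sketch of what that could look like, assuming the option is set before the database is created so that it applies to the new database; the database name "docs" and the input path are made up:

```text
# Assumption: UPDINDEX has to be set before the database is created.
SET UPDINDEX true

# "docs" and the input path are hypothetical placeholders.
CREATE DB docs /path/to/input.xml
```

Whether this removes the need for optimize() depends on which index structures your queries rely on.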
Christian
Hi
2014-05-07 12:12 GMT+02:00 Christian Grün christian.gruen@gmail.com:
Considering the current penalty of calling optimize() after every change, and that usage otherwise differs little (at least for me), this would make an interesting use case.
When it comes to optimize(), you could play around with the UPDINDEX option.
I did so. Two issues:
1) Path indexes are (according to the docs) dropped even with UPDINDEX.
2) Full-text is excluded, so if you split your data over several databases (and have one dedicated to the full-text data), you have to join this data again, which is already quite noticeable for around 50-100 nodes that you want to "join".
Christian