Thank you, Christian!
One last question: does the 7.8 code work with both database formats?
Best, Fabrice
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: Monday, January 13, 2014 22:30
To: Fabrice Etanchaud
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] adding too many documents is slowing down collection operations
Hi Fabrice,
4 minutes sounds extremely slow. The database format has changed a little with BaseX 7.8, and very large databases will be opened faster than before. To benefit from this optimization, a new database has to be created. But I'm not sure if this is also true in your case, as the speedup affects databases that are usually opened within a second.
One thing you might know anyway: adding documents is always faster than replacing them. There is a bug tracker issue to speed up the document index in BaseX [1]. I can't tell yet when this will happen, though.
Best, Christian
[1] https://github.com/BaseXdb/basex/issues/804
On Mon, Jan 13, 2014 at 11:56 AM, Fabrice Etanchaud <fetanchaud@questel.com> wrote:
Dear all,
I'm working on a collection containing tens of millions of documents, updated weekly.
My first approach to storing these documents was simply to db:add/replace/remove them.
This approach slows things down as the number of documents increases. For example, opening one collection takes up to 4 minutes.
I believe that the document list is an in-memory structure.
Is there a way to speed things up, or do I have to switch to the following approach in order to reduce the size of the 'physical' document list:
1. Group documents into 'logical' documents on insertion (fewer documents, each containing new or updated documents under a root XML element).
2. Remove the old versions of these documents from the previous 'logical' documents with XQuery Update.
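
To make the two steps concrete, here is a minimal in-memory sketch in Python. It only models the idea and does not use any BaseX API: the `store` dictionary, the `add_batch` function, and the `batch`/`doc` element names with an `id` attribute are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def add_batch(store, batch_name, docs):
    """Add one 'logical' document holding every document of a batch.

    store      -- dict mapping a logical document name to its root element
                  (hypothetical stand-in for the database's document list)
    batch_name -- name of the new logical document, e.g. one per weekly update
    docs       -- dict mapping a document id to its XML string
    """
    # Step 2 of the proposal: drop older versions of the incoming documents
    # from the logical documents stored earlier.
    for root in store.values():
        for old in list(root.findall("doc")):
            if old.get("id") in docs:
                root.remove(old)

    # Step 1 of the proposal: wrap all new or updated documents under a
    # single root element, so the store grows by one entry per batch.
    root = ET.Element("batch", {"name": batch_name})
    for doc_id, xml in docs.items():
        wrapper = ET.SubElement(root, "doc", {"id": doc_id})
        wrapper.append(ET.fromstring(xml))
    store[batch_name] = root
    return root


if __name__ == "__main__":
    store = {}
    add_batch(store, "week1", {"a": "<p>1</p>", "b": "<p>2</p>"})
    add_batch(store, "week2", {"a": "<p>3</p>"})
    for name, root in store.items():
        print(name, [d.get("id") for d in root.findall("doc")])
```

The point of the sketch is that the number of entries in `store` follows the number of batches rather than the number of documents. The removal step here scans every earlier logical document, so in a real database it would presumably need an index on the document id to stay fast.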
Has anybody already encountered this problem and found a workaround? BaseX is just fantastic!
Best regards, Fabrice Etanchaud Questel/Orbit