Hi Fabrice,

indeed, the problem with large updates is the size of the pending update list.

1) If you want to stick with XQuery you could spread the complete add/replace
over several queries and consequently reduce the size of the pending update
list. I'm aware that this may not be a completely self-maintaining solution if
the size of your update varies considerably.
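
To illustrate the idea, here is a minimal sketch of one such batch query. The
database name 'mydb' and the external variables $dir, $start and $size are
made up for this example, and it assumes that your BaseX version accepts a
file path as the input argument of db:add:

```xquery
(: Sketch only: 'mydb', $dir, $start and $size are hypothetical names. :)
declare variable $dir   as xs:string  external;  (: input directory, ending with a path separator :)
declare variable $start as xs:integer external;  (: 1-based position of the first file in this batch :)
declare variable $size  as xs:integer external;  (: number of files handled per query :)

(: Sort the file names so that every run sees the same order. :)
let $files :=
  for $name in file:list($dir, false(), '*.xml')
  order by $name
  return $name
(: Only this slice ends up on the pending update list. :)
for $name in subsequence($files, $start, $size)
return db:add('mydb', concat($dir, $name), $name)
```

You would run it repeatedly with $start = 1, 1 + $size, 1 + 2 * $size, ...
until all files are processed. For replaces, the return clause would call
db:replace('mydb', $name, concat($dir, $name)) instead.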

At some point in the future (I hope within the next few months) we plan to
integrate another XQUF optimization that caches the expensive parts of the
pending update list on disk. Most likely this will solve your problem, but until
then ...

2) ... switching to a sequence of BaseX commands is an option. Each BaseX
command is executed separately (at least update-wise), which keeps the
PUL at bay. On the other hand, dividing the bulk update into its atomic
parts comes with a price tag ... Performance should be OK if you limit yourself
to db:add, because new documents are simply appended to the end of the BaseX
table. In contrast, the table access pattern for db:replace is more random.
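
If you go for commands, you don't have to write the script by hand. As a
rough sketch (again with the hypothetical names 'mydb' and $dir, and assuming
file names without spaces), a non-updating query can generate the command
sequence, which you can then save as a .bxs file:

```xquery
(: Sketch only: 'mydb' and $dir are hypothetical names. :)
declare variable $dir as xs:string external;  (: input directory, ending with a path separator :)

string-join((
  'SET AUTOFLUSH false',  (: don't flush buffers after every single command :)
  'OPEN mydb',
  (
    (: One ADD command per file; each one is executed as a separate update. :)
    for $name in file:list($dir, false(), '*.xml')
    order by $name
    return concat('ADD TO ', $name, ' ', $dir, $name)
  ),
  'FLUSH',                (: write the buffered changes, if your version has this command :)
  'CLOSE'
), '&#10;')
```

With autoflush turned off, a crash in the middle of the script may leave you
with unflushed data, so keep a backup of the database around.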

In conclusion: if you are able to spread out your update over several XQueries,
that's the way to go. The more adds/replaces you combine in a query, the faster
it will be, as long as the pending update list still fits into memory. If
dividing the update into smaller parts is a problem, a sequence of commands may
be your only choice. You could also consider a mixed solution: commands for all
'adds' (as they're fast anyway) and XQueries for all replaces ...

Hope this helps!

Cheers,
Lukas



On Sun, Nov 11, 2012 at 8:27 PM, Fabrice ETANCHAUD <fabrice.etanchaud@orange.fr> wrote:

Dear all at basex,

I am wondering what could be the most efficient way to add/replace a lot of documents (several thousands) at the same time.

I tried db:add/db:replace but this gives heap memory overflow because of the pending list...

Will I have better results with a .bxs with add/replace commands (and autoflush deactivated) ?

Did any other user find his way in this use case ?

Best regards,

Fabrice


_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk