Hello Andreas,

thank you very much for this information! Indeed, the use cases are similar.

I am trying to understand exactly how you stored the messages. The Wiki says: "the initial database just contained a root node <tweets/>". So my understanding is that the messages are inserted as child elements of this root element, and the end result is one document with one root element and millions of child elements representing the individual messages, yes? You therefore do not have to come up with URIs, as there is only a single document. A monster document, but I conclude from your approach that this is no problem, and no worse (or even better) than having a million small individual documents. Is that correct? Would you recommend storing the messages in one single document?

If the loading process cannot run concurrently with queries, would there be any way to periodically "shift" batches of messages into a "read only" database? Or perhaps better the other way around: let the server periodically interrupt its loading activity, close the database, rename it, open and initialize a new database, and then continue loading? Or is there simply no solution available at present?
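
To make the second variant more concrete, here is a minimal in-memory sketch in Python. It is not BaseX code: the class RotatingStore, its methods and the batch_size parameter are hypothetical names chosen for illustration, and plain lists stand in for databases. It only shows the intended flow, where the writer fills one "active" store, periodically seals it, and queries see sealed stores only.

```python
class RotatingStore:
    """Toy model of the rotation idea; lists stand in for databases."""

    def __init__(self, batch_size):
        self.batch_size = batch_size  # hypothetical rotation threshold
        self.active = []              # store currently being loaded (writer only)
        self.sealed = []              # closed stores, treated as read-only

    def write(self, message):
        """Append a message; rotate once the active store is full."""
        self.active.append(message)
        if len(self.active) >= self.batch_size:
            self.rotate()

    def rotate(self):
        """Seal the active store and start a fresh, empty one."""
        if self.active:
            self.sealed.append(tuple(self.active))
            self.active = []

    def query(self, predicate):
        """Readers only ever look at sealed stores."""
        return [m for store in self.sealed for m in store if predicate(m)]


store = RotatingStore(batch_size=2)
for msg in ["a", "b", "c"]:
    store.write(msg)
visible = store.query(lambda m: True)
```

In this model a message becomes queryable only after the next rotation, so the rotation interval would bound how stale the query results are.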

Kind regards,
Hans-Juergen


From: Andreas Weiler <andreas.weiler@uni-konstanz.de>
To: Hans-Juergen Rennau <hrennau@yahoo.de>
CC: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
Sent: 15:51 Tuesday, 3 July 2012
Subject: Re: [basex-talk] BaseX as a log msg store?

Hello Hans-Juergen,

here are some details about my use case, which is similar to yours.
I'm using BaseX to insert the live public Twitter Stream into databases (see Wiki Entry [1]).

One Twitter message is around 4 KB in size, and I'm able to insert about 2000 of them per second
using single XQuery Update inserts. So that would probably work for you, too.
If you use bulk inserts, i.e. cache the items in a list and run one XQuery Update for all of them, the insert rate should increase further.
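
As a rough illustration of the bulk variant, the following Python sketch builds one XQuery Update expression for a whole list of cached items. The function name build_bulk_insert, the element name tweet and the plain-text message content are assumptions made for this example. It only constructs the query string; sending it to the server is left to whatever client is in use.

```python
from xml.sax.saxutils import escape


def build_bulk_insert(messages, root="tweets", element="tweet"):
    """Return one XQuery Update expression inserting all messages at once.

    Hypothetical helper: element names and text-only content are assumed.
    """
    def as_element(text):
        # Escape XML special characters, then double the curly braces,
        # which XQuery would otherwise read as enclosed expressions.
        safe = escape(text).replace("{", "{{").replace("}", "}}")
        return f"<{element}>{safe}</{element}>"

    nodes = ", ".join(as_element(m) for m in messages)
    return f"insert nodes ({nodes}) into /{root}"


query = build_bulk_insert(["a & b", "x {y}"])
```

The resulting string would be run as a single updating query against the opened database, so the per-query overhead is paid once per batch rather than once per message.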

> thus made available for querying

This could be a bigger problem, because as long as you are writing items into the database (which will never stop in your use case), the readers are blocked.
And while one of your readers is running, the writers are blocked.

Hope this helps,
Andreas