Hello all,
We are running into a performance issue with BaseX.
We use BaseX to manage user-generated content that is updated
frequently. Each user has their own "home" within one large XML
document, and they can update content within that home.
Our problem is that as the number of users has grown, we have started to
see major slow-downs on our server. A small update to a user's content
(one or two XML nodes) sometimes results in our [flush-...] process
writing about 30 MB of data.
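To make this concrete, here is a simplified sketch of the kind of update we run (element names and IDs are illustrative, not our real schema):

```xquery
(: replace a single value inside one user's home;
   document, element and attribute names are made up for illustration :)
replace value of node
  doc('content')/site/home[@user = 'u123']/profile/status
with 'new status text'
```

Even a single-node replacement along these lines can trigger the large flush described above.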
I haven't actually looked at the BaseX source code, so I don't know what
happens during these updates, but it appears that the amount of data
written to disk per update is proportional to the size of our database
(document), even though the amount of data being updated stays constant.
Perhaps an index update is being triggered? Apologies if this is off the
mark; I don't fully understand how BaseX works internally.
So basically, I have several questions:
1) Is it a bad idea to keep everything in one single large document that
receives frequent updates?
2) Would it be better to have one document per user? This would mean
roughly 1000 new documents per day. Does BaseX index documents within a
database for quick retrieval, and is there a limit on the number of
documents a single database can hold?
3) If we switch to one document per user, would updating these documents
still result in large amounts of data being written per update, even
though each user's document is very small? Or would this be a possible
solution to our problem?
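For question 3, what I have in mind is roughly this layout, using the db module's functions for path-addressed documents (database and path names are hypothetical):

```xquery
(: store or overwrite one small document per user under its own path :)
db:replace('content', 'users/u123.xml',
  <home user="u123">
    <profile><status>new status text</status></profile>
  </home>)
```

```xquery
(: retrieve a single user's document by its path :)
db:open('content', 'users/u123.xml')
```

My hope is that an update would then only touch that one small document rather than the whole database.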
Thank you and I hope this made sense,
- Adrian