Hi,
I am trying to achieve the following:
Have a number of BaseX servers, each working on a replica of an indexed document. One BaseX server will handle concurrent client read queries. The rest of the servers will handle concurrent client write queries on their copy of the index. In the end I would like to merge all the indexes. Is there a way to achieve that?
Best Regards, Marios
Hi Marios,
> The rest of the servers will handle concurrent client write queries on their copy of the index. In the end I would like to merge all the indexes.
Hm, I'm not quite sure what you want to achieve: Would you really like to join the internal index structures, or are you talking about the databases, which serve as an index in your use case?
If write operations are performed on your replica servers, this probably means that you want to transfer new documents to all other servers, including your master server?
Best, Christian
Hi Christian,
Thank you for your prompt response. Yes, what I meant is transferring new documents (the delta, to be precise) to all other servers. So the problem, I think, comes down to how to merge two databases into one without having to reprocess the whole content.
Best, Marios
On Thu, Mar 12, 2015 at 2:55 PM, Christian Grün christian.gruen@gmail.com wrote:
> If write operations are performed on your replica servers, this probably means that you want to transfer new documents to all other servers, including your master server?
Hi Marios,
Thanks for your response.
> So the problem I think comes down to how to merge two databases into one without having to reprocess the whole content.
One more challenge, I think, will be to find out which documents have changed on a particular server.
Maybe it's easier to have one dedicated server that receives new documents, which can then be distributed to the remaining servers. One more approach is to store new documents in a separate database, which can then be merged with the main database. As databases are very lightweight constructs in BaseX, you can, for example, address more than one database from a single XQuery expression [1].
Hope this helps, Christian
Hi Christian,
Thank you for the ideas.
> One more approach is to store new documents in a separate database, which can then be merged with the main database.
That is the key, then. Is that possible?
Thanks, Marios
On Thu, Mar 12, 2015 at 3:30 PM, Christian Grün christian.gruen@gmail.com wrote:
> Maybe it's easier to have one dedicated server that receives new documents, which can then be distributed to the remaining server.
>> One more approach is to store new documents in a separate database, which can then be merged with the main database.
> That is the key, then. Is that possible?
Absolutely ;) It can be realized, for example, with the functions in the Database Module [1] (but this requires some experience with XQuery).
Best, Christian
[1] http://docs.basex.org/wiki/Database_Module
Won't using a function like db:add be the same as reprocessing that piece of XML (e.g. gathering statistics, rebuilding the database indexes, etc.)? But we have already processed that piece of XML during the update on the other copy of the database.
Maybe this example will show better what I mean. Suppose we have a huge XML document and creating the BaseX database takes a lot of time. We would like to split that document, pass it to multiple BaseX instances and finally merge the results, hoping to increase performance. But if that merge takes place by adding the change in one database to the other with db:add, won't it be like reprocessing that change?
Best, Marios
On Thu, Mar 12, 2015 at 3:48 PM, Christian Grün christian.gruen@gmail.com wrote:
> Absolutely ;) It can e.g. be realized with the functions in the database module [1] (but this requires some experience with XQuery).
Hi Marios,
> Won't using a function like db:add be the same as reprocessing that piece of XML (e.g. gathering statistics, rebuilding the database indexes, etc.)?
Yes, it will. But if you do bulk updates, and if the UPDINDEX option is not turned on, the index structures will be invalidated anyway, and you can simply call db:optimize after all new documents have been inserted.
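A minimal sketch of this bulk approach, assuming two hypothetical databases named 'main' and 'delta' (neither name appears in this thread). It uses db:open, db:path, db:add and db:optimize from the Database Module; in recent BaseX versions db:open has been renamed to db:get, so adjust as needed. The two expressions are meant to be run as two separate queries, one after the other:

```xquery
(: Query 1: copy every document of the hypothetical 'delta' database
   into the hypothetical 'main' database, keeping each document's path. :)
for $doc in db:open('delta')
return db:add('main', $doc, db:path($doc))
```

```xquery
(: Query 2: rebuild the index structures of 'main' once,
   after all new documents have been added. :)
db:optimize('main')
```

Each added document is still parsed and stored again, as noted above; the sketch only shows that the index structures are rebuilt once at the end rather than after every single db:add call.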
> Suppose we have a huge XML document and creating the BaseX database takes a lot of time. We would like to split that document, pass it to multiple BaseX instances and finally merge the results, hoping to increase performance.
It's probably difficult to recommend one way to do it, because in practice, there are several options. One thing you could try is to store your data in several databases and query all of them with a single XQuery expression (in some cases, there may even be no need to use multiple BaseX instances).
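As an illustration of querying several databases with a single expression, here is a small sketch. The database names 'part1' to 'part3' and the element name 'record' are hypothetical; db:open is the Database Module function for opening a database (db:get in recent BaseX versions):

```xquery
(: Count all hypothetical <record> elements across three
   hypothetical databases in one expression. :)
count(
  for $name in ('part1', 'part2', 'part3')
  return db:open($name)//record
)
```

With this layout the split parts never have to be merged physically; each part keeps its own index structures and the query simply iterates over the databases.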
Best, Christian
basex-talk@mailman.uni-konstanz.de