Hi Eric,
Your use case is very clear. It seems you need a tool that can filter information from a steadily growing set of JMeter result "logs".
I am not sure whether BaseX is the best tool for this. It has a nice GUI that lets you query large amounts of XML data, which is really useful, but I have never used it with data that changes this often. Be prepared for the following limitations (someone please correct me if I mention a limitation that has already been overcome):
- Data added through the command-line tool is not synchronised with what you see in the GUI. Reopening the GUI resolves this, but do not expect the GUI to show the latest changes made by another process in the background. I am not sure whether closing and reopening the collection would help, but it is quite likely.
- Modifying a collection by adding documents does not automatically update the indexes; you have to call the optimize command explicitly.
On the other hand, if you analyse the data through a BaseX client (not the GUI), this should work quite well.
Some hints:
If there is a large amount of data and you need to optimise performance, I would consider using multiple collections, keeping older, unchanging data separate from the latest data. That way the old data stays optimised without reindexing, and only the latest database has to be optimised (possibly rebuilding the latest database would be even faster). I am not sure how XQuery performs when it queries multiple databases; I have not tested it, but it is likely not a problem. It is also possible that your queries are simple and run fast enough even without updated indexes. That would simplify the situation, as you would not have to care about the optimize command at all.
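The split between an archive database and a small "live" database could be sketched as follows. This is only an illustration in Python, not something tested against a BaseX installation: the database name `jmeter_live`, the file paths, and the helper `live_update_script` are made up, the paths are assumed to contain no spaces, and the `OPEN`, `ADD` and `OPTIMIZE` command syntax should be checked against the documentation of the BaseX version in use.

```python
def live_update_script(live_db, new_files):
    """Build the BaseX commands that touch only the live database.

    live_db   -- hypothetical name of the small, frequently changing database
    new_files -- paths of result files to add (assumed to contain no spaces)
    """
    commands = [f"OPEN {live_db}"]
    # Add each new result file to the live database only;
    # the archive database is never opened, so its indexes stay valid.
    commands += [f"ADD {path}" for path in new_files]
    # Refresh the indexes of the live database after the additions.
    commands.append("OPTIMIZE")
    return commands


if __name__ == "__main__":
    script = live_update_script("jmeter_live", ["/results/run42.xml"])
    # One possible way to run it: pass the joined commands to the
    # standalone client, e.g. basex -c "<commands separated by ;>"
    print("; ".join(script))
```

Because only the live database is opened, the cost of the optimize step depends on the size of the recent data, not on the whole archive.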
Jan
*Ing. Jan Vlčinský* CAD programy Slunečnicová 338/3, 734 01 Karviná Ráj, Czech Republic tel: +420-597 602 024; mob: +420-608 979 040 skype: janvlcinsky; GoogleTalk: jan.vlcinsky@gmail.com http://cz.linkedin.com/in/vlcinsky
On 3 July 2011 01:14, Eric Jones ecjones2040@gmail.com wrote:
Hi All,
A load testing tool (JMeter) can write out its test results as XML. This is great, as I search and perform metrics calculations on this data using XQuery. However, the number of XML files can grow to some volume, and it is hard to tell when a test is actually over, which would mean its XML result file is no longer being written to. The solution is to use inotify to monitor the directory where these files are stored, add a file to the database if it is not there yet, and delete and re-add it to BaseX if it has already been added and just needs an update (due to a running test). So that is the solution I will implement. If anyone has anything cleaner, I am all ears.
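The add-or-replace logic described above might look roughly like this. Note that this sketch swaps inotify for plain polling (comparing modification time and size between two directory snapshots), because that needs only the Python standard library; an inotify-based watcher would feed the same decision step. The helper names, the `/results` directory, and the exact `ADD TO` and `DELETE` command syntax are assumptions to be checked against the BaseX version in use.

```python
import os


def snapshot(directory):
    """Map each .xml file name in `directory` to its (mtime_ns, size)."""
    snap = {}
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.endswith(".xml"):
            st = entry.stat()
            snap[entry.name] = (st.st_mtime_ns, st.st_size)
    return snap


def plan_sync(previous, current, directory):
    """Return BaseX commands for files that are new or changed.

    previous, current -- snapshots as returned by snapshot()
    directory         -- directory holding the result files (no spaces assumed)
    """
    commands = []
    for name in sorted(current):
        path = f"{directory}/{name}"
        if name not in previous:
            # File not yet in the database: just add it.
            commands.append(f"ADD TO {name} {path}")
        elif previous[name] != current[name]:
            # File still being written by a running test: drop and re-add.
            commands.append(f"DELETE {name}")
            commands.append(f"ADD TO {name} {path}")
    return commands
```

A caller would take a snapshot at a fixed interval, pass the previous and current snapshots to `plan_sync`, and send the resulting commands to an opened database.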
I thank the BaseX mailing list for helping me with this.
Regards, Eric
On Sat, Jul 2, 2011 at 3:29 AM, Huib Verwey Huib.Verwey@mpi.nl wrote:
Hi Eric, could you give an example of a use case? For monitoring changes to files and acting upon them, Google tells me I could use inotify or incron. From your setup I suspect you'll use BaseX as a read-only store, maybe just for searching? So you won't use the XQuery Update Facility for modifying XML in the database?
Kind regards, Huib Verweij.
-- Drs. Huib Verweij Senior software developer - The Language Archive Max Planck Institute for Psycholinguistics P.O. Box 310 6500 AH Nijmegen The Netherlands t +31-24-3521911 e huib.verwey@mpi.nl w http://www.mpi.nl/
On 1 Jul 2011, at 06:53, Eric Jones wrote:
Hello all,
I am new to BaseX. I want to know how one handles frequently updated xml files. Do you have to monitor the file with a third party tool and upon change drop the file from BaseX and re-add it?
Regards, Eric _______________________________________________ BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk