Hi,
for a couple of years now we have been using XML files to organize the
schedule of lectures at our institute. The data for one semester amounts
to about one megabyte. To present the schedule on the web we currently
use Python in combination with the C-based lxml XPath implementation.
Only basic node selection happens at the lxml layer; most of the
intelligence is implemented by filtering Python lists and lists of
dictionaries. We read the XML data from disk and parse it into lxml
objects.
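To make the setup concrete, here is a minimal sketch of that pipeline; it uses the stdlib ElementTree as a stand-in for lxml (our real setup uses lxml's full XPath), and the element and attribute names are hypothetical:

```python
# Sketch of the described approach: select nodes via XPath-like queries,
# then do the actual filtering in plain Python lists/dicts.
# ElementTree stands in for lxml here; element names are made up.
import xml.etree.ElementTree as ET

SCHEDULE = """
<schedule semester="2024S">
  <lecture id="101"><title>Algebra</title><room>A1</room></lecture>
  <lecture id="102"><title>Analysis</title><room>B2</room></lecture>
</schedule>
"""

root = ET.fromstring(SCHEDULE)

# Basic node selection happens at the XML layer ...
lectures = root.findall(".//lecture")

# ... while the filtering "intelligence" lives in Python data structures.
records = [
    {"id": lec.get("id"),
     "title": lec.findtext("title"),
     "room": lec.findtext("room")}
    for lec in lectures
]
in_room_a1 = [r for r in records if r["room"] == "A1"]
print(in_room_a1)
```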
From time to time we have thought about storing the data in an XML
database instead, as this seems the natural fit. One main point is that
we need some sort of transaction management, because different
applications may manipulate the files simultaneously.
Now I have given BaseX a try and implemented one basic view of our
online schedule in XQuery. I soon noticed that all of the intelligence
would then have to live inside XQuery, since the query should return the
fully prepared HTML for online presentation.
Unfortunately, we now have the impression that we did not gain speed -
on the contrary, the query alone takes more execution time than the
whole existing interface (including the user interface etc.).
I'm interested in your general opinion on this: Is it surprising that
the XQuery implementation is slower than the lxml/Python one? (To me it
is, as I thought the indexes created when importing the data should
reduce the computational effort of searching the tree.) Is there some
catch in my approach? Could the reason simply be a badly designed query?
I would appreciate hearing from you,
Ronny