Hi,
I'm new to BaseX and currently evaluating it, so a few questions have started to build up.
1. Modules seem to be compiled on every request in the GUI. Adding something like "functx" to the imported modules increases the query overhead by ~50 ms. Is there a way to precompile modules other than writing a native Java plugin? Or is this a deliberate GUI feature that lets you edit module files between requests?
2. What are collections good for, as opposed to storing documents individually? I'm looking for a way to generate sequential IDs, and I currently use a method described on the list, where the count is held in a containing node, like <somecollection nextid="1234"><item id="1"/> ... <item id="1233"/></somecollection>. I'm not sure, but I think I've seen other XML databases use document collections to keep automatic track of document IDs. Is there something similar in BaseX?
3. How do you make sure that you are not overwriting documents with stale data? Do you use locking attributes (optimistic locking)?
4. I like the idea of being able to reshape the data with greater ease in the future, but at the same time it's a little scary to have data stored in a form that does not constrain what values a field may take and so on. I know that XML Schemas or DTDs might help here, but what kind of impact can be expected when introducing such validation mechanisms (in particular on inserts and updates)? I'm sure some of it depends on the complexity of the documents, but is a schema, for example, cached so that it does not need to be read and parsed on each request?
I'm generally impressed with the speed and feature set of BaseX, and I'm happy to see that you have also included session variables to help in building web applications that run directly off the server. Unfortunately, I will also need binary storage (load/save files over HTTP) for the particular application that we have under development/design, and I'm not sure whether this is possible in the current release (I've read something about XQuery only being able to handle UTF-8).
It's a little hard to search the mailing list (have you considered a forum :) for questions regarding specific areas of interest, so I hope that my questions have not already been answered 300 times in the past. Please forgive me if that's the case.
It would be of great help if there was more information on how to get started with designing an application for BaseX. When you still think in RDBMS terms, it's not obvious how to navigate and find solutions within the BaseX paradigms. It might help to see a real-world BaseX application that addresses a classic use case, preferably including handling of errors, transactions and locking.
Best regards
Kenneth
Dear Kenneth,
- Modules seem to be compiled on every request in the GUI.
Currently, this applies not only to the GUI but to all execution modes of BaseX. Contrary to our fears, it has not turned out to be a show stopper for users, because the time for parsing and compiling queries has been optimized a lot. Of course, this will not hold for all use cases, so you may need to restructure your modules if compilation time becomes a problem.
That said, future versions of BaseX will indeed cache already compiled modules, but it’s too early to give you an ETA.
- What are collections good for as opposed to storing documents individually?
Collections are a must if you plan to store thousands or millions of documents in databases. In addition, the query optimizer yields better results for queries that are focused on single databases.
- I’m not sure, but I think I’ve seen other XML databases use document collections for keeping automated track of document IDs. Is there something similar for BaseX?
BaseX provides no such auto-ID feature. If you have more information on how this is realized in other XML databases, feel free to drop us a note. One potential alternative to incremental, dense document IDs could be the use of UUIDs or ms/ns timestamps.
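To make the trade-offs between these ID schemes concrete, here is a minimal sketch in Python (used purely for illustration; it is not BaseX code). The function names are hypothetical, and the counter variant simply mirrors the `nextid` attribute from the question.

```python
import time
import uuid
import xml.etree.ElementTree as ET


def uuid_id() -> str:
    """Random UUID: unique without coordination, but not sequential."""
    return str(uuid.uuid4())


def timestamp_id() -> int:
    """Nanoseconds since the epoch: roughly ordered, but not dense, and
    two calls within the same clock tick may return the same value."""
    return time.time_ns()


def next_sequential_id(container: ET.Element) -> int:
    """Counter kept on the containing node, as in the question's example.
    In a database, the read and the increment would have to happen inside
    one write transaction to avoid handing out the same ID twice."""
    current = int(container.get("nextid", "1"))
    container.set("nextid", str(current + 1))
    return current


# Usage: hand out the next dense ID and attach a new item with it.
collection = ET.fromstring('<somecollection nextid="1234"/>')
new_id = next_sequential_id(collection)
ET.SubElement(collection, "item", id=str(new_id))
```

UUIDs and timestamps need no shared counter, which avoids a write to the container node on every insert; the counter is the only variant here that yields dense, sequential IDs.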
- How do you make sure that you are not overwriting documents with stale data? Do you use locking attributes (optimistic locking)?
I’m not sure what you mean by that. BaseX uses readers-writer locks [1]; maybe this helps?
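For readers unfamiliar with the term, optimistic locking with a version attribute can be sketched as follows. This is a generic Python illustration with a hypothetical in-memory store, not a BaseX feature: every document carries a version number, and a write is rejected if the version the writer originally read is no longer current.

```python
class StaleWriteError(Exception):
    """Raised when a write is based on an outdated version of a document."""


class VersionedStore:
    """Hypothetical in-memory store mapping a path to (version, content)."""

    def __init__(self):
        self._docs = {}

    def read(self, path):
        # Version 0 means "document does not exist yet".
        return self._docs.get(path, (0, None))

    def write(self, path, content, expected_version):
        current_version, _ = self._docs.get(path, (0, None))
        if current_version != expected_version:
            raise StaleWriteError(
                f"{path}: expected version {expected_version}, "
                f"found {current_version}"
            )
        # In a database, this check and the update must run atomically,
        # for example inside a single write transaction.
        self._docs[path] = (current_version + 1, content)
        return current_version + 1


# Usage: two writers read version 1; only the first update succeeds.
store = VersionedStore()
store.write("doc.xml", "<a/>", expected_version=0)
version, _ = store.read("doc.xml")
store.write("doc.xml", "<b/>", expected_version=version)
try:
    store.write("doc.xml", "<c/>", expected_version=version)
    stale_rejected = False
except StaleWriteError:
    stale_rejected = True
```

With readers-writer locking, the version check and the update could presumably be combined in one updating query so that no other writer interleaves between them; that is an assumption to verify against the transaction documentation [1].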
- I like the idea of being able to reshape the data with greater ease
in the future, but at the same time it’s also a little scary to have data stored in a form that is not structured in terms of what values a field may take and so on.
This may be a matter of getting used to it, but I know what you mean. In most larger applications I’m aware of, validation is part of the business logic and e.g. ensured by the validation of input forms via XForms and other technologies. If you want to do schema validation before adding resources, the Validation Module is helpful [2].
- Unfortunately I will also need binary storage (load/save files over HTTP) for the particular application that we have under development/design and I’m not sure whether this is possible in the current release (I’ve read something about XQuery only being able to handle UTF-8).
Yep, that’s possible; see e.g. [3] in our documentation, or browse for the keywords "binary" and "raw".
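As a rough sketch of what saving a file over HTTP might look like from a client, the following Python snippet builds a PUT request for a binary payload. It assumes that the BaseX REST interface is enabled and that a PUT with a non-XML content type stores the body as a raw resource; the base URL, database name, resource path and credentials are all placeholders, so please check the details against the documentation for your version.

```python
import base64
import urllib.request


def build_binary_put(base_url, db, path, payload, user, password):
    """Build (but do not send) a PUT request that uploads raw bytes."""
    url = f"{base_url}/{db}/{path}"
    request = urllib.request.Request(url, data=payload, method="PUT")
    # A non-XML content type signals that the body is binary data.
    request.add_header("Content-Type", "application/octet-stream")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


# All values below are assumptions for illustration only.
request = build_binary_put(
    "http://localhost:8984/rest", "files", "images/logo.png",
    b"\x89PNG\r\n", "admin", "admin",
)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Loading the file back would then be a plain GET on the same URL, under the same assumptions.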
- It’s a little hard to search the mailing list (have you considered a forum :) for questions regarding specific areas of interest, so I hope that my questions are not already answered 300 times in the past – and please forgive me if that’s the case…
Maybe the search function of mail-archive.com helps a little [4]. We thought about introducing a forum, but we wanted to keep a single place for requests. You may also want to turn to StackOverflow, which provides a "basex" tag [5], or, of course, to search engines.
- It would be of great help if there was more information on how to get started with designing an application for BaseX. I mean, when you still think in RDBMS terms it’s not obvious how to navigate and find solutions within the BaseX paradigms. It might help to see a real-world BaseX application, addressing a classic application preferably including handling of errors, transactions and locking.
Absolutely true. Any volunteers please stand up ;)
Christian
[1] http://docs.basex.org/wiki/Transaction_Management
[2] http://docs.basex.org/wiki/Validation_Module
[3] http://docs.basex.org/wiki/Binary_Data
[4] http://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/
[5] http://stackoverflow.com/questions/tagged/basex
basex-talk@mailman.uni-konstanz.de