Hi Christian,
That would be terrific and I think what you suggest is already sufficient. Maybe have another look at the Qizx manual or API to see what it offered. I found it a quite well-designed feature.
Why do you hesitate about adding API access to such data? Is that a technical/complexity concern or more of a design concern? If it's the latter, I would argue that such metadata would be there precisely to handle cases (such as the SCM sync) where one system (not an XQuery app) needs to manipulate information on nodes, as well as the nodes themselves, that can then be used in an XQuery app. Such an SCM sync tool, written, say, in Java, would then probably have to use the Java API to update the information on these nodes.
I was thinking of other use cases:
- ACL info
- Sync info (e.g. last-modified date, content-type, etc., so one instance can cache nodes from another without touching the content itself)
- Properties that are computationally expensive to derive: calculate them in advance and then quickly search for and retrieve them (this case was also mentioned in the Qizx manual)
Of course, when one has the option of expressing data either as XML or as metadata properties, you might ask yourself what to store where. But having this option is a good thing, I believe.
Cheers, --Marc
On Fri, Aug 29, 2014 at 12:32 PM, Christian Grün christian.gruen@gmail.com wrote:
@Marc:
For BaseX 8.0, we are planning to speed up our document index, and we could possibly enrich it with some more (possibly user-specific) metadata. I have added a reference to this mailing-list thread in the corresponding GitHub issue [1].
However, I am not sure if we should extend our existing APIs. Maybe it would be more consistent to provide an additional XQuery Module for that, or extend the Database Module. Additional metadata could be returned via db:list-details(), and we could add an updating function, sth. like db:store-details(). What do you think? Any more suggestions are welcome.
@Vincent:
I've started to implement along these lines by creating a second database to hold metadata about documents in the actual database. If there is a better option I'll switch to it.
I would be interested to know which metadata properties you are currently storing in this auxiliary database.
Thanks, Christian
[1] https://github.com/BaseXdb/basex/issues/804
I would find this feature useful for several similar scenarios. I want to use BaseX for querying XML documents and keep BaseX synchronized with external archives/repositories where the XML files are maintained.
Vincent
From: basex-talk-bounces@mailman.uni-konstanz.de basex-talk-bounces@mailman.uni-konstanz.de on behalf of Marc van Grootel marc.van.grootel@gmail.com Sent: Thursday, August 28, 2014 5:38 PM To: BaseX Subject: [basex-talk] db documents metadata
Hi,
I was looking through the feature list in the issue tracker to see what's in the pipeline. I suddenly remembered a feature from an XML database I used a couple of years ago called Qizx. This had a very neat feature where every database document and collection could have a special map with metadata properties. These do not affect the XML content in any way, but they can be accessed via special API calls or a Qizx-specific extension module.
A better explanation of this feature can be read in the Qizx manual (for example here http://kiwi.emse.fr/DN/qizx-manual.pdf on page 18 and 57).
I have used such metadata properties on nodes to implement syncing XML documents with an SCM (Subversion). I stored revision IDs and other SCM control data in those properties. Authors would work in Subversion, and certain directories were kept synced to a Qizx database so we could easily create PDF publications of the latest XML with zero impact on the XML itself.
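To make the idea a bit more concrete, here is a rough sketch of what such a sync could look like. It is a toy in-memory model written in Python, not the Qizx or BaseX API: the classes `Document` and `MetadataStore`, the function `sync_from_scm`, and the property name `scm-revision` are all hypothetical names chosen for illustration. The point is only that the revision ID lives in a property map next to the document, so the XML content is never modified to carry it.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """XML content plus a side map of metadata properties (hypothetical model)."""
    content: str
    properties: dict = field(default_factory=dict)


class MetadataStore:
    """Toy in-memory stand-in for a database; not the Qizx or BaseX API."""

    def __init__(self):
        self._docs = {}

    def has(self, path):
        return path in self._docs

    def put(self, path, content):
        # Replace the content but keep any properties already attached.
        doc = self._docs.get(path)
        if doc is None:
            self._docs[path] = Document(content)
        else:
            doc.content = content

    def get_content(self, path):
        return self._docs[path].content

    def set_property(self, path, name, value):
        self._docs[path].properties[name] = value

    def get_property(self, path, name, default=None):
        return self._docs[path].properties.get(name, default)

    def find_by_property(self, name, value):
        # Look documents up by metadata alone, without parsing their XML.
        return sorted(
            path for path, doc in self._docs.items()
            if doc.properties.get(name) == value
        )


def sync_from_scm(store, path, scm_revision, scm_content):
    """Update a document only if the SCM revision is newer than the stored one.

    Returns True if the document was written, False if it was already current.
    """
    if store.has(path) and store.get_property(path, "scm-revision", -1) >= scm_revision:
        return False
    store.put(path, scm_content)
    store.set_property(path, "scm-revision", scm_revision)
    return True
```

A sync tool would call `sync_from_scm` for each file in a working copy and skip everything whose stored revision is already current. In a real database the property map would of course be persisted and indexed by the database itself.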
Maybe BaseX already uses something like that under the hood, I don't know. If so extending it or opening it for use would be useful I think, and generally cool :-)
-- --Marc