Referential Queries - BaseX-Talk - mailman.uni-konstanz.de

14 May 2013


      Hello Again,
If this is the wrong forum for this type of question, let me know. By the way, Liam, I picked up your book last night; I like the flavor, as it differs from my other reads such as those from Kay. Although I have been using XML for years and understand the core concepts, it should be a great refresher. If you have time to read this and respond, I appreciate it... If not, I understand. :)
The Organizational Overall Problem:
There aren't many people in my industry who use XQuery and XML in the way they were intended (IMHO). In fact, most developers in my organization are rather uneducated in them, and as you know there is some irrational backlash, as many equate XML with the DOM and XPath/XSLT 1.0 and see it as a competitor to JSON, which is ludicrous. The DOM has scalability issues, which our products are currently running into. This isn't really XML's or the DOM's problem but simply poor implementation. As you know, though, all that matters is perception.
I am having to work with a large number of schema-less, basically hack-job XML documents and workflows. A lot of our product uses XSLT for reporting and transformation; however, only a few on the team understand the concepts, and due to MS/managerial BS we are stuck with a .NET XSLT 1.0 processor. Also, almost all of our XSLT scripts use a pull pattern, which I find overly verbose and inefficient, but that is a personal opinion.
I have been researching alternatives to the DOM and .NET's standard processor (I have used Saxon too) because I personally find XML useful and the query semantics of XQuery 3.0 awesome. 
Proposed Solutions:
We have two primary use cases:
1) As a local DB to replace the context DOM for our 'documents', which in our case are utility GIS designs of circuits, subdivisions, fiber, etc. I am thinking BaseX coupled with RestXQ could replace our DOM for local installs, let us decouple from the geodatabase, and provide a browser-based UI. (Currently our XML documents are stored in a blob in the GIS database.) The GIS we are using is ESRI; however, we are also interested in supporting OpenStreetMap. (I noticed the Geo Module.)
2) As a service for hosting and uploading, allowing users/delivery and support to view, query, and modify complex sets of interrelated XML configuration files. Some of our applications have hundreds. Again, all these documents follow similar semantics; however, there is no defined schema for any of them. Currently the client has to read a 50-page manual and edit the files one at a time. Often there are nodes in several files which must match exactly or the entire application fails... It's a nightmare, and there is no concept of generalization in my organization when it comes to development. Every tool we do have to 'configure' is handcrafted and unique... It's abysmal! Because of this we have tools that cover probably only 20% of the configuration.
I think we can accomplish both of the above tasks using a single codebase and RestXQ.
I have written an XQuery expression which, using our 'common' XML semantics, can ascertain entities/properties/relationships and distill them into metadata, which RestXQ then uses to drive a metadata-driven API for manipulating data-centric XML documents. This is similar to how you can derive an XSD from a well-formed XML document; however, our metadata format complements XSD but serves broader purposes, for example annotating triggers/mappings etc.
This metadata is then consumed by each RestXQ operation to allow for a generalized API, and all markup is applied on the client to allow for more efficient caching, since users can share resources between different representations. We currently use Knockout.js on the client for this.
Here is an example of a generalized API endpoint for retrieving a unique entity by type. 
    (: Removed error handling and some other stuff for brevity. It's not fully
       functional/vindicated but is more a representation of the concept. :)
    declare
      %restxq:path("api/{$entityListName}/{$entityId}")
      %restxq:GET
    function page:GetEntity($entityListName as xs:string, $entityId as xs:string) {
      let $entityMetadata := $page:database/metadata/entity[@type = $entityListName]
      let $entity :=
        if ($entityMetadata/property[@key = 'true']) then
          (: Use the properties marked in the metadata as the key :)
          $page:database//*[name() = $entityListName
            and data((*|@*)[name() = $entityMetadata/property[@key = 'true']/@name]) = $entityId]
        else ()  (: the no-key case is among the parts removed for brevity :)
      return
        <entity>
          {
            (: Construct the representation based on the metadata :)
            for $propMeta in $entityMetadata/property
            return attribute { $propMeta/@name } { data($entity/(*|@*)[name() = $propMeta/@name]) }
          }
          {
            (: Include a link to all related entities for further discovery by the client. :)
            for $relMeta in $entityMetadata/relationship
            return
              if ($relMeta/@multiplicity = '*') then
                (: creates an href like api/worklocation/23/cost, which would return
                   the sequence of cost items associated with work location 23 :)
                element link {
                  attribute href { concat('api/', $entityListName, '/', $entityId, '/', $relMeta/@with) },
                  attribute rel { $relMeta/@with }
                }
              else
                (: creates an href to a single related entity :)
                let $relatedEntityMeta := $page:database/metadata/entity[@type = $relMeta/@with]
                return element link {
                  attribute href {
                    concat('api/', $relMeta/@with, '/',
                      data($entity/(@*|*)[name() = $relMeta/@type]
                        /(@*|*)[name() = $relatedEntityMeta/property[@key = 'true']/@name]))
                  },
                  attribute rel { $relMeta/@with }
                }
          }
        </entity>
    };
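For reference, the function above reads metadata of roughly the following shape. This is only an illustrative sketch inferred from the paths used in the function: apart from 'worklocation' and 'cost', which come from the comment above, every name and value here ('id', 'description', 'crew') is a made-up placeholder rather than our actual metadata.

```xml
<metadata>
  <entity type="worklocation">
    <!-- the property flagged key="true" is matched against {$entityId} -->
    <property name="id" key="true"/>
    <property name="description"/>
    <!-- many related entities: the link becomes api/worklocation/{id}/cost -->
    <relationship with="cost" multiplicity="*"/>
    <!-- single related entity, held in a child or attribute named by @type -->
    <relationship with="crew" type="crew" multiplicity="1"/>
  </entity>
</metadata>
```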
With a metadata mapping of the document we can convert the 'tree' structure into a virtual 'relational table' structure, allowing for granular node modification/addition/removal in a generalized way. Once the metadata is distilled, using the same API with metadata describing the metadata, we can then allow the client to further manipulate the view/aliasing/permissions/triggers/mappings of the entity archetypes' metadata, allowing for quick UI/workflow generation. At least that is the idea... This obviously requires a client-side library which understands this metadata orchestration.
The XML/BaseX Question:
We need to be able to query effectively across relationships. Are there any facilities in XML/XQuery/third-party tools that do this? I was hoping ID and IDREF could accomplish this, but as you stated, that is not the case...
Basically, I won't know beforehand whether a child node is inline or a reference, but I would like to be able to query both as though they were the same.
Say I had two nodes (I will call them parents), both with the same child. One has the actual inline child node and the other has just a reference to it. Let's say the child's name attribute is 'Tom'. I would like to be able to query like this and return both parent nodes:
//parents[child/@name = 'Tom']
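To make the scenario concrete, here is a minimal sketch in XQuery 3.0. The sample data, the `id`/`ref` attribute convention, and the `local:resolve` helper are all assumptions made up for illustration; this is a hand-written lookup showing the desired result, not a built-in facility.

```xquery
(: Hypothetical sample data: <child ref="..."/> points at a <child id="..."> elsewhere. :)
declare variable $doc :=
  <root>
    <parents id="p1">
      <child id="c1" name="Tom"/>
    </parents>
    <parents id="p2">
      <child ref="c1"/>
    </parents>
    <parents id="p3">
      <child id="c2" name="Ann"/>
    </parents>
  </root>;

(: Return the node itself when it is inline, or its target when it is a reference.
   Only one level of indirection is followed, so a reference loop cannot recurse. :)
declare function local:resolve($node as element()) as element()* {
  if ($node/@ref)
  then root($node)//*[@id = $node/@ref]
  else $node
};

(: Both p1 (inline child) and p2 (referenced child) match; p3 does not. :)
$doc//parents[child ! local:resolve(.)/@name = 'Tom']/@id ! string()
```

The plain path `//parents[child/@name = 'Tom']` would find only p1 in this data, because the child under p2 carries no name attribute of its own.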
Anything? If not, is this a weird use case? I wouldn't think so. I could imagine, however, that this use case may be hard to support 'generally' in XML given referential loops etc., especially in a document without a schema or validation...
- James