Liam,

Thanks again. Luckily, we are not talking terabytes. The larger documents I have seen are in the 1-5 GB range, with most under 15 MB. I would think each 'document' would be a resource in the database. It's the 1+ GB DOMs that are our problem. Also, having to load the entire blob/DOM at once is horrible for load performance. The RESTXQ/hypermedia approach would allow lazy loading via hypermedia-driven discoverability while still supporting the full XPath/XQuery-level querying that was previously possible only against the DOM.

The nice thing about our metadata approach is that we actually have several levels of metadata, which are then 'combined' into user/group-specific metadata to allow user/group-level configuration without undermining core system functions and caching mechanisms.

The RESTXQ endpoints simply consume the 'system-wide' metadata for building entity representations, and each client consumes their own 'personal' metadata for the client-side representation. This personal metadata is a mashup of the system metadata, the metadata of any groups they are a part of, and their personal metadata alterations. For example, Bob could make the 'NeedsRepair' field invisible on Transformer Inspection records.
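
A minimal sketch of that layering, in TypeScript since the client side is JavaScript. The type names, the entity and field names other than 'NeedsRepair', and the "later layer wins" merge rule are all assumptions for illustration, not our actual schema:

```typescript
// Hypothetical shape: settings per field, keyed by entity name and then field name.
type FieldSettings = { visible?: boolean; label?: string };
type Metadata = Record<string, Record<string, FieldSettings>>;

// Merge layers left to right; a later layer overrides an earlier one
// setting by setting, and no input layer is modified.
function mergeMetadata(...layers: Metadata[]): Metadata {
  const result: Metadata = {};
  for (const layer of layers) {
    for (const [entity, fields] of Object.entries(layer)) {
      const mergedFields: Record<string, FieldSettings> = result[entity] ?? {};
      result[entity] = mergedFields;
      for (const [field, settings] of Object.entries(fields)) {
        mergedFields[field] = { ...mergedFields[field], ...settings };
      }
    }
  }
  return result;
}

// System-wide metadata (what the endpoints would consume).
const systemMeta: Metadata = {
  TransformerInspection: {
    NeedsRepair: { visible: true, label: "Needs Repair" },
    InspectedOn: { visible: true, label: "Inspected On" },
  },
};

// A group-level alteration (hypothetical).
const groupMeta: Metadata = {
  TransformerInspection: { InspectedOn: { label: "Inspection Date" } },
};

// Bob's personal alteration: hide 'NeedsRepair'.
const bobMeta: Metadata = {
  TransformerInspection: { NeedsRepair: { visible: false } },
};

// Bob's client-side view is the mashup of all three layers.
const bobView = mergeMetadata(systemMeta, groupMeta, bobMeta);
```

Because the merge never touches `systemMeta`, the system-wide layer stays identical for every user, which is what lets the data representation itself be shared.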

This 'custom' metadata is consumed by the client, and the HTML markup is generated from it. Again, we use client-side JS via knockout.js for this. The key benefit here is that the data representation does not change and can be consumed by all clients regardless of their metadata alterations. This single representation can then be cached efficiently throughout the network using caching headers, which are configurable via the system-wide entity metadata. For example, maybe Transformers have a freshness of 1 minute while Inspection Records are fresh for 2 days.
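
Roughly, the header could be derived from the metadata like this. This is only a sketch: the table name, the entity type names, and the "no-cache" fallback for unknown types are assumptions, and the real endpoints would be RESTXQ rather than TypeScript:

```typescript
// Hypothetical system-wide entity metadata: freshness lifetime in seconds.
const freshnessSeconds: Record<string, number> = {
  Transformer: 60,                    // fresh for 1 minute
  InspectionRecord: 2 * 24 * 60 * 60, // fresh for 2 days
};

// Build the Cache-Control header value for an entity type.
// "public" marks the response as storable by shared caches, which fits a
// single representation that every client can consume.
// Unknown entity types fall back to "no-cache" (an assumption of this sketch),
// which makes caches revalidate before reusing a stored response.
function cacheControlFor(entityType: string): string {
  const maxAge = freshnessSeconds[entityType];
  return maxAge === undefined ? "no-cache" : `public, max-age=${maxAge}`;
}
```

With this table, `cacheControlFor("Transformer")` gives `public, max-age=60` and `cacheControlFor("InspectionRecord")` gives `public, max-age=172800`.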

Although there is only 'one' representation of each resource, private/sensitive data is always present but encrypted with a secret key per resource, and that key is only available to those with the right privileges.

So for example:

api/employees/2

may return

{ name: 'Bob',
  salary: 'encryptedstringhere',
  isActive: 'true' }

This resource can be 'shared' by everyone. If someone has permissions to see 'salary' they can request:

api/employees/2/keys

This would return the secret keys for those encrypted properties of the resource that the user has permission to see.
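
The selection logic behind such a keys endpoint might look like the sketch below. Everything here is hypothetical: the key store shape, the 'ssn' property, and the placeholder key strings are made up, and no actual encryption is shown:

```typescript
// Hypothetical server-side store: one secret key per encrypted property.
type PropertyKeys = Record<string, string>;

const employee2Keys: PropertyKeys = {
  salary: "key-for-salary", // placeholder strings, not real key material
  ssn: "key-for-ssn",       // 'ssn' is an invented second encrypted property
};

// Return only the keys for properties the caller is permitted to read.
function keysFor(
  resourceKeys: PropertyKeys,
  permitted: ReadonlySet<string>,
): PropertyKeys {
  const visible: PropertyKeys = {};
  for (const [property, key] of Object.entries(resourceKeys)) {
    if (permitted.has(property)) {
      visible[property] = key;
    }
  }
  return visible;
}

// Someone allowed to see 'salary' only gets that one key.
const salaryOnly = keysFor(employee2Keys, new Set(["salary"]));
```

The shared resource itself never varies per user; only the response from the keys endpoint does, so only that response would need to be excluded from shared caches.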

Anyway, I think I got carried away... I am just excited!

I really just wanted to say thanks again. There is a ton of documentation on XML and it's hard to wade through it all efficiently. Your input was invaluable. The approach you outlined is in line with the way I was thinking I would implement queryable relationships should XML not have those facilities built in. Also, thanks for the SML reference. It looks promising.

I'll leave you alone and check out that query-talk group.

Once again, thanks, and have a great rest of your week!

Hopefully in the coming weeks I'll know whether this will all work as we envision or fail miserably. :)

- James

> Subject: Re: [basex-talk] Referential Queries
> From: liam@w3.org
> To: james.jw@hotmail.com
> CC: basex-talk@mailman.uni-konstanz.de
> Date: Tue, 14 May 2013 15:13:59 -0400
>
> [I think this thread is getting further away from BaseX, and might
> belong on query-talk instead, but on the other hand the use of XQuery as
> a back-end for Web Apps is definitely on the increase]
>
> On Tue, 2013-05-14 at 11:14 -0600, James Wright wrote:
> > Hello Again,
> > If this is the wrong forum for this type of question, let me know. By
> > the way Liam, I picked up your book last night. I like the flavor as it
> > differs from my other reads such as those from Kay. Although I have
> > been using XML for years and understand the core concepts, it should be
> > a great refresher.
>
> Thanks, I wrote the boring chapters :-)
>
> > The Overall Organizational Problem:
> > There aren't many people in my industry that use XQuery and XML in
> > the way they were intended (IMHO). In fact, most developers in my
> > organization are rather uneducated in them and, as you know, there is
> > some irrational backlash, as many equate XML with the DOM and XPath/XSLT
> > 1.0 and see it as a competitor to JSON, which is ludicrous.
>
> You're right, it's crazy and unfortunate.
>
> XML was originally designed as an interoperable way to put SGML
> technical documentation on the Web in Netscape plugins!
>
> > The DOM has its scalability issues, which our products are
> > currently running into. This isn't really XML's or the DOM's problem
> > but simply poor implementation. As you know though, all that matters
> > is perception.
>
> If it helps, XQuery, XPath 2 and later, XSLT 2 and later, are not
> DOM-based, but have an abstract data model, and are designed with
> performance very much in mind.
> > [...]
>
>
> > we are stuck with a .NET XSLT 1.0 processor.
>
> There are at least two .NET-based XSLT 2 processors, and another in
> development. But I think that's maybe off-topic for this list ;)
>
> > We have two primary use cases:
> > 1) as a local db to replace the context DOM for our 'documents', which
> > in our case relate to utility GIS designs of circuits, subdivisions,
> > fiber, etc. I am thinking BaseX coupled with RESTXQ could replace
> > our DOM for local installs and allow us to decouple from the
> > Geodatabase and provide a browser-based UI.
>
> Yes, that will likely make sense.
>
> > 2) as a service for hosting and uploading, allowing users/delivery and
> > support to view, query and modify complex sets of interrelated XML
> > configuration files. Some of our applications have hundreds. Again, all
> > these documents follow similar semantics; however, there is no defined
> > schema for any of them.
>
> You might want to look at W3C SML as a way of orchestrating validation
> for configuration management.
>
> > I think we can accomplish both of the above tasks using a single
> > codebase and RESTXQ.
>
> It's likely, although obviously you'll want separate database instances.
> Note also that there are size/performance issues with BaseX today if you
> have a lot of data - "a lot" is subjective but if it's multiple
> terabytes you'll probably need multiple database instances. The good
> news is that it's relatively easy to move to different XQuery engines if
> needed, and also that BaseX keeps improving so you might well not need
> to move :-) I do know of people with petabyte XQuery databases.
>
> > I have written an XQuery expression which, using our 'common' XML
> > semantics, can ascertain entities/properties/relationships and distill
> > them into metadata, which RESTXQ then turns into a metadata-driven
> > API for manipulating data-centric XML documents.
>
> This pattern is rather like creating a persistent "view" in SQL.
>
> > [...]
>
> > The XML/BaseX Question: We need to be able to query effectively across
> > relationships. Are there any facilities in XML/XQuery/3rd party that
> > do this? I was hoping ID and IDREF could accomplish this but, as you
> > stated, that is not the case...
>
> ID and IDREF support is almost certainly irrelevant here.
>
> given $doc1 with <student sn="3016"><name>Simon</name></student> and
> $doc2 with <course><enrolled>3016</enrolled>....
> you can easily do
> for $student in $doc1//student
> return $doc2//course[enrolled = $student/@sn]
> to get a list of courses with students from $doc1.
>
> > Basically I won't know beforehand whether a child node is inline or a
> > reference, but would like to be able to query both as though they were
> > the same.
> > Say I have two nodes (I will call them parents), both with the same
> > child. One has the actual inline child node and the other has just a
> > reference to it. Let's say the child's name attribute is 'Tom'. I would
> > like to be able to query like this and return both parent nodes:
> > //parents[child/@name = 'Tom']
>
> Write a function to do it.
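> declare function local:get-children($input as element(*)) as element(*)*
> {
>   (: child elements only; node() would also select text nodes :)
>   for $child in $input/*
>   return
>     (: local:is-reference() is a helper you still need to write :)
>     if (local:is-reference($child))
>     then local:get-reference($child)
>     else $child
> };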
>
> and maybe
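> declare function local:get-reference($input as element(*))
> as element(*)*
> {
>   (: a function body has no context item, so the leading "/" needs an
>      explicit root in front of it, e.g. a variable holding the documents :)
>   /documents[@id eq $input/@doc]//*[@id eq $input/ref]
> };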
>
> > Anything? If not, is this a weird use case? I wouldn't think so.
>
> I haven't encountered it, but RDF people often want something similar.
>
> After this you can write e.g.
> /documents[@id = "36"]//parents[local:get-children(.)/@name = 'Tom']
>
> > I could imagine, however, that this use case may be hard to 'generally'
> > support in XML given referential loops etc., especially in a document
> > with no schema or validation.
>
> The nature of XML is that you don't necessarily know what's a reference
> when you create a document.
>
> Liam
>
> --
> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
> Pictures from old books: http://fromoldbooks.org/
> Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
>