Fine, Christian, I shall see what I can achieve. My interest is keen, as the validation of expressions embedded in configurations against XSDs has potentially enormous value in the domain where I work. In fact, I had earlier begun to write a little XQuery parser myself for just this purpose (and you can imagine that I have not yet got very far down that road), but using the BaseX query plans is a shining alternative - provided I can rely on the stability of the format (minor, documented changes from time to time are acceptable, too). My first impression of the query plan format is that it is indeed well suited for such analysis - conveying the query structure in as concise and readable a way as one could wish for. When questions arise, I shall bug you offline, as such questions are probably too specific to be of general interest.

And thank you for the explanations - yes, schema awareness is indeed something rather different, though overlapping.

Finally: I think you are right in saying that the value of an idea is not related to the effort of its implementation. Perhaps there is something like a threshold: once it is passed, once the conceived value has slipped beyond that threshold, it acquires a peculiar independence. Many years ago, looking at distant church towers in Brugge, I suddenly had a sensation that there was something infinite about them: the effort of piling up and fitting together those stones was huge, but still finite - and now there they stand, and again and again and again human eyes look at them and drink in the beauty, which turns into joy in our minds. Inexhaustible.

Good night.
Hans-Jürgen


Christian Grün <christian.gruen@gmail.com> wrote at 14:39 on Saturday, 20 February 2016:


Hi Hans-Jürgen,

I’ll start from the end of your mail:

> I would be prepared to embark upon an XQuery implementation of a data path
> extractor, provided that you do not come to the conclusion that it would be
> of very little.

Awesome. The result will surely be interesting for others in the
community as well.

> The approach is not equivalent, but related to the concept of a
> "schema-aware XQuery processor", or am I wrong?

What a schema-aware processor mostly does is add type information
to the processed nodes, and this info needs to be handled at runtime.
However, schema information can indeed be helpful at parse or compile
time as well, as it allows for more optimizations. In BaseX, we use
our database statistics and the name and path indexes for similar
optimizations.

> My feeling is that all
> XQuery implementors have turned away from that possibility due to a
> disproportion of effort and benefit.

I can’t speak for other implementations, but it would surely have cost
us too much time to make BaseX schema-aware. Saxon does an excellent
job at evaluating schema information. It might be worth checking out
its query plans to get a feeling of what’s possible if schema info is
available.

> So - is the first idea, at second thought, worthless because leading towards
> sheerly unlimited amounts of effort?

Absolutely not ;) I would say that the value/merit of an idea has
generally nothing to do with the effort related to making it happen.

> let $a := /x
> return $a/y
> =>
> root()/child::x/child::y
>
> But the task would be open-ended, perhaps even exceeding the complexity of
> an XQuery processor - the task of resolving XQuery expressions to a set of
> inferences, rather than to the expression value.

Some optimizations like this are already taking place in BaseX. If you
run the query above for a document that does not contain x or y
elements, the resulting query plan will be an empty sequence. However,
what we currently don’t do in BaseX is to pass on path information to
variables. For example, look at the following input and queries:

* Input: <x><y/></x>
* Query 1: xquery:parse('/x/x', map { 'compile': true(), 'plan': true() })
* Query 2: xquery:parse('let $x := /x return $x/x', map { 'compile': true(), 'plan': true() })

Query 1 will currently be rewritten to an empty sequence, but Query 2
won’t. The good thing is that a compiled query plan in BaseX will
already have dropped out those paths that can be statically detected
as being useless.

> But the task would be open-ended, perhaps even exceeding the complexity of
> an XQuery processor - the task of resolving XQuery expressions to a set of
> inferences, rather than to the expression value.

From an algorithmic point of view, you can do everything with XQuery
that you can do with Java. Creating the data paths with XQuery should
even be more elegant, because you can directly work down the XML
query plan. But I agree it can be a challenge, because XQuery is
probably not one of the easiest languages (however, you usually don’t
regret the time you have spent to get to know it better ;).
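
To illustrate what "working down the XML query plan" could look like, here
is a minimal sketch. The plan fragment is hypothetical and simplified: the
names Path, Root, Step, axis and test are assumptions chosen for
illustration and may not match the actual BaseX query plan format. The
function local:data-paths is likewise just an illustrative helper.

```xquery
xquery version "3.1";

(: Hypothetical, simplified plan fragment. The element and attribute names
   are assumptions, not the real BaseX query plan vocabulary. :)
declare variable $plan :=
  <Path>
    <Root/>
    <Step axis="child" test="x"/>
    <Step axis="child" test="y"/>
  </Path>;

(: Returns one data path string per Path element found in the plan. :)
declare function local:data-paths($plan as element()) as xs:string* {
  for $path in $plan/descendant-or-self::Path
  (: A Root child marks a path that starts at the document root. :)
  let $root  := if ($path/Root) then 'root()' else ()
  (: Each step is rendered as axis::test, e.g. child::x. :)
  let $steps := $path/Step ! (@axis || '::' || @test)
  return string-join(($root, $steps), '/')
};

(: For the fragment above this returns: root()/child::x/child::y :)
local:data-paths($plan)
```

A real extractor would have to handle many more expression types
(variables, predicates, FLWOR clauses), but the recursive descent over the
plan elements would probably follow the same pattern.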


Christian