On Sun, Apr 17, 2022 at 09:00:10PM +0200, Markus Elfring scripsit:
There are neither join conditions nor record sets in XQuery.
I suggest to compare this view to the situation before the key word “JOIN” was added to the SQL standard. https://en.wikipedia.org/wiki/Join_(SQL)
You can do join _operations_, but you aren't doing them on tables (unless you did extra work to represent the tables hierarchically) and there's absolutely no need for the keywords because the existing more general mechanisms work fine.
How do you think about the following XQuery script sketch?
let $interesting_stuff1 as item()* := my_fn:get_data("some expression"),
    $interesting_stuff2 as item()* := my_fn:determine_further_data(),
    $interesting_stuff3 as item()* := my_fn:evaluate_another_expression()
for $this1 in $interesting_stuff1,
    $this2 in $interesting_stuff2,
    $this3 in $interesting_stuff3
where $this1/id = $this2/id and $this2/id = $this3/id
return do_something($this1/id, $this2/description, $this3/comment)
You're asking the optimizer to turn something that is naively O(N^3), a three-way cross product filtered by the where clause, into something efficient, and you don't have to. All of this keys on an id element's string value and you already know that.
Use your functions to create maps where the keys come from that id element's string value.
(: usual caveat; this is typing, it hasn't been run :)
(: this is a sequence of unique id values :)
let $interesting_stuff1 as xs:string+ := my_fun:get_data("some expression")
(: this maps id values to description elements :)
let $interesting_stuff2 as map(xs:string, element(description)) := my_fun:determine_further_data()
(: this maps id values to comment elements :)
let $interesting_stuff3 as map(xs:string, element(comment)) := my_fun:evaluate_another_expression()

(: bind to the sequence of id values :)
for $id in $interesting_stuff1
(: run the function per-id :)
return my_fun:do_something($id, $interesting_stuff2($id), $interesting_stuff3($id))
You could decide to skip the for clause and use
return $interesting_stuff1 ! my_fun:do_something(.,$interesting_stuff2(.),$interesting_stuff3(.))
instead.
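For the map-building side of this, here is a minimal sketch of what such functions might do. Everything in it is made up for illustration: the sample record and note elements, the local:by-id helper, and the joined wrapper element are assumptions, not part of the original script. It also assumes XQuery 3.1 and that every item has exactly one id child and exactly one child of the requested name.

```xquery
xquery version "3.1";

(: hypothetical sample data standing in for whatever the real functions return :)
declare variable $records := (
  <record><id>a1</id><description>first thing</description></record>,
  <record><id>b2</id><description>second thing</description></record>
);
declare variable $notes := (
  <note><id>a1</id><comment>looks fine</comment></note>,
  <note><id>b2</id><comment>needs review</comment></note>
);

(: hypothetical helper: map each id string value to one named child element :)
declare function local:by-id(
  $items as element()*,
  $child as xs:string
) as map(xs:string, element()) {
  map:merge(
    for $item in $items
    return map:entry(string($item/id), $item/*[local-name() eq $child])
  )
};

let $descriptions := local:by-id($records, 'description')
let $comments := local:by-id($notes, 'comment')
(: one map lookup per id instead of comparing every pair of items :)
for $id in map:keys($descriptions)
order by $id
return <joined id="{$id}">{ $descriptions($id), $comments($id) }</joined>
```

Each map is built in one pass over its input, so the per-id work is a lookup rather than a scan. A lookup for an id that is missing from a map returns the empty sequence, which is worth keeping in mind if the real data is not as tidy as this sample.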
You could do something similar (and conceptually simpler, but maybe not better in practical terms) as:
(: all of the expressions need to return elements with id element
   descendants where the element has a meaningful string value :)
for $data in (db:open('one')/expression_one,
              db:open('two')/expression_two,
              db:open('three')/expression_three)
(: get the first descendant id element and take its string value :)
let $id as xs:string? := $data/descendant::id[1]/string()
(: if there's no $id value, stop processing this member of the
   binding sequence :)
where $id
(: re-order the entire tuple stream into one tuple per distinct $id
   value, with however many $data variables have that id value
   associated with it :)
group by $id
(: after the group by clause, a reference to $data is a reference to a
   sequence of $data bindings: every element from the binding sequence
   of the for clause whose first descendant id element has this
   specific id as its string value :)
return my_fun:do_something($data)
This approach assumes my_fun:do_something() knows what it's looking for, how to filter that out of the elements it gets passed, and how to order it in what it returns. Because it has the actual nodes, it can tell where they came from if it needs to. This approach can work better with messy data where 'two' might have comments and 'three' might have descriptions and you'll need to either use both or add logic about which one is preferred. (Or your functions that create the maps can be a bit smarter. Depends on the data.)
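To see what the group by clause does to $data, here is a small self-contained sketch, again untested typing. The inline item and note elements and the group wrapper element are hypothetical stand-ins for the three db:open() expressions and for my_fun:do_something(); it assumes XQuery 3.1.

```xquery
xquery version "3.1";

(: hypothetical inline data in place of the three db:open() expressions :)
let $sources := (
  <item><id>a1</id><description>first thing</description></item>,
  <item><id>b2</id><description>second thing</description></item>,
  <note><id>a1</id><comment>looks fine</comment></note>,
  <note><comment>no id here, so this one is dropped</comment></note>
)
for $data in $sources
(: string value of the first descendant id element, if there is one :)
let $id as xs:string? := $data/descendant::id[1]/string()
(: the empty sequence is false here, so elements without an id drop out :)
where $id
(: from here on there is one tuple per distinct $id value :)
group by $id
order by $id
return
  (: $data is now the sequence of every element that shared this id :)
  <group id="{$id}" count="{count($data)}">{
    $data ! name(.)
  }</group>
```

With this sample data the a1 group should hold two elements (an item and a note) and the b2 group one. The element names are still available after grouping, which is the "it has the actual nodes" point above: the receiving function can sort out for itself which node came from where.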
XQuery and SQL are not similar languages; they're both query languages, but SQL is built on set theory while XQuery is built on graph theory (XPath) and the idea of a tuple stream processor (FLWOR expressions). The underlying math for XQuery is younger. For example, the data structure under maps (finger trees) was first published _as math_ in 2006. You can't use either to understand the other one.
Will any further comparisons evolve for the provided functionality?
Don't think so. I find the trick with XQuery is to not fight with it about being some other language.
Internalizing the sequence concept takes work; internalizing the "this one, and all of them, at the same time" tuple stream processor concept takes more work. Once you've done that, you've got an extremely powerful and general tool.
(Rather like using git, I might put this as not trying to outsmart the capable people who designed XQuery. I'm going to lose if I do that.)