On 11.04.2019 at 00:09, Chuck Bearden wrote:
BaseX is a great tool for analyzing & characterizing large amounts of XML data. I have used it both at work and on personal projects. I hope the following observation is useful.
When I define a function that recurs over a sequence of elements in order to build a map of element name counts, I find that when I specify the type of the element sequence as 'element()*', the function runs so slowly that I give up after 5 minutes or so. But when I specify the type as 'item()*', it finishes in 40 seconds or less. Here's an example:
-----begin code snippet-----
declare namespace local="w00fw00f";
declare function local:count($elems as element()*, $elem_counts as map(*)) as map(*) {
  let $elem := head($elems),
      $elem_name := $elem/name(),
      $elems_new := tail($elems),
      $elem_name_count :=
        if (map:contains($elem_counts, $elem_name))
        then map:get($elem_counts, $elem_name) + 1
        else 1,
      $elem_counts_new := map:put($elem_counts, $elem_name, $elem_name_count)
  return
    if (count($elems_new) = 0)
    then $elem_counts_new
    else local:count($elems_new, $elem_counts_new)
};
let $coll := collection('pure_20190402'),
    $elems := $coll/result/items/*,
    $elem_names_map := local:count($elems, map {})
It seems that the task of building the map can also be solved with grouping:
let $elem_names_map := map:merge(
  for $item in $coll/result/items/*
  group by $name := name($item)
  return map { $name : count($item) }
)
Not sure whether that improves performance.
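
To try the grouping approach without access to the original collection, here is a minimal self-contained sketch. The inline <result> element and its child names (a, b, c) are made-up stand-ins for collection('pure_20190402'), so it only illustrates the shape of the resulting map and says nothing about timing on real data:

```xquery
xquery version "3.1";

(: Hypothetical sample data; stands in for collection('pure_20190402'). :)
let $result :=
  <result>
    <items>
      <a/><b/><a/><c/><a/>
    </items>
  </result>

(: One single-entry map per distinct element name, merged into one map.
   After "group by", $item is bound to all elements of the current group. :)
let $elem_names_map := map:merge(
  for $item in $result/items/*
  group by $name := name($item)
  return map { $name : count($item) }
)

(: Look up the counts for the three sample names. :)
return (
  $elem_names_map?a,
  $elem_names_map?b,
  $elem_names_map?c
)
```

With the sample data above, the query should return the sequence 3, 1, 1. Because grouping needs no user-defined recursive function, the question of declaring the parameter as element()* or item()* does not come up in this variant.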