One more solution that should evaluate faster, because the rows to be
looked up are stored directly in a map keyed by lemma_id:
declare variable $hib_parses := db:open('hib_parses');
declare variable $hib_lemmas := db:open('hib_lemmas');

let $lemmas := map:merge(
  for $row in $hib_lemmas//row
  where $row/field[@name = 'lemma_lang_id'] = '3'
  return map:entry($row/field[@name = 'lemma_id'], $row),
  map { 'duplicates': 'combine' }
)
for $parse in $hib_parses//row
for $lemma in $lemmas($parse/field[@name = 'lemma_id'])
return (# db:copynode false #) {
  element wf {
    <f>{ $parse/* }</f>,
    <l>{ $lemma/* }</l>
  }
}
On 7/11/20, Giuseppe G. A. Celano <celano@informatik.uni-leipzig.de> wrote:
Hi,
I am trying to perform a join operation between two large XML files (~490 MB
and ~40 MB), which are the result of automatically converting old SQL
dumps into XML. I created a database for each file. The query I wrote to
join them is correct, because it works when I limit the join to just a few
items, but it never finishes when I apply it to all items:
Here is the XQuery:
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/join_files.xq
Here is the first file:
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_parses.xml
Here is the second file:
https://git.informatik.uni-leipzig.de/celano/perseus_morpheus/-/blob/master/hib_lemmas.xml
I have also tried to use the database module functions, but without success.
Am I missing anything here? Thanks.
Ciao,
Giuseppe