I have two databases in BaseX, source_db and target_db, and would like to merge them by matching on the id attribute of each element and upserting the element with a `replace` or an `insert` depending on whether the element was found in the `target_db`. `source_db` has about 100,000 elements, and `target_db` has about 1,000,000 elements. The databases look like this:
<!-- source_db contents -->
<root>
<element id="1">
<element id="2">
</root>
<!-- target_db contents -->
<root>
<element id="1">
</root>
And my query to merge the two looks like this:
for $e in (db:open("source_db")/root/element)
return (
if (exists(db:open("target_db")/root/element[@id = data($e/@id)]))
then replace node db:open("target_db")/root/element[@id = data($e/@id)] with $e
else insert node $e into db:open("target_db")/root
)
When running the query, however, I keep getting memory constraint errors. Using a POST request to BaseX's REST interface I get "Out of Main Memory" and using the BaseX java client (
https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/api/BaseXClient.java) I get "java.io.IOException: GC overhead limit exceeded".
Ideally I would like to just process one element from source_db at a time to avoid memory issues, but it seems like my query isn't doing this. I've tried using the `db:copynode false` pragma but it did not make a difference.
Is there any way to accomplish this? Thanks in advance!