Hi all,
I'm working on a query that reads from two databases -- OriginalDB and
UpdateDB. The structure of each database is the same:
<data>
<item id="1">...</item>
<item id="2">...</item>
...
</data>
OriginalDB has 2,305,570 items and UpdateDB has 307,019.
The query gets all distinct item ids from the two databases, then gets the
item with a matching id (if there is one) from each database, and returns
the id and both nodes in a JSON object. Here's the query:
let $originalDB := db:get('OriginalDB')/data/item
let $updateDB := db:get('UpdateDB')/data/item
for $id in distinct-values(($originalDB/@id, $updateDB/@id))
let $original := $originalDB[@id = $id]
let $update := $updateDB[@id = $id]
return json:serialize(map {
'id': xs:long($id),
'original': $original,
'update': $update
}, map { 'indent': 'no' })
In its current state this query is very slow -- it's returned only 35 JSON
objects in ~30 minutes. How can I go about optimizing it to best take
advantage of each database's indices?
Thanks in advance,
Matt