Hi Christian,
Thanks for your help.
Running your version of the query does not exhaust memory as mine did; however, CPU usage barely exceeds a single processor. Apart from an initial one-second spike, it runs at around 101% on a two-CPU machine, so it is not parallelized. If you change the node we are extracting in extract_hierarchy-forked.xq (which will extract 17,000 nodes), you should be able to see this:
let $node := <organization id="urn:uuid:a0c7c9cb-cdc4-4d24-b644-04dfcd45f9ea"/>
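As a sanity check, I'd expect a trivial fork-join with CPU-bound branches to drive both cores. A minimal sketch (the workload below is an arbitrary placeholder, not part of the actual extraction):

(: arbitrary CPU-bound work per branch; if fork-join parallelizes,
   CPU usage should approach 200% on this two-CPU machine :)
xquery:fork-join(
  for $i in 1 to 2
  return function() {
    count((1 to 10000000)[. mod 7 = 0])
  }
)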
Any ideas?
Cheers, -carl
On Jul 15, 2016, at 4:43 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi Carl,
I finally had a look at your query. Your parallelized variant was not 100% equivalent to the first one. The following version should do the job:
declare function extractor:get_child_orgs-forked($orgs, $org) {
  for $org_id in $org/@id
  for $c_orgs in $orgs[parent/@id = $org_id]
  return xquery:fork-join(
    for $c_org in $c_orgs
    return function() {
      $c_org,
      extractor:get_child_orgs-forked($orgs, $c_org)
    }
  )
};
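For illustration, it might be invoked along these lines (a sketch only; the file name and the root node are assumptions based on this thread):

(: assumed entry point: collect all organizations and start from one root :)
let $orgs := doc('organizations.xml')//organization
let $root := <organization id="urn:uuid:a0c7c9cb-cdc4-4d24-b644-04dfcd45f9ea"/>
return extractor:get_child_orgs-forked($orgs, $root)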
If I first load organizations.xml into the database, it takes 25 seconds to run (both before and after I run optimize). If I run the extraction directly against the organizations.xml file on disk, it only takes 7 seconds.
Is that to be expected?
Yes, it is. The reason is that access to a database will always be a bit slower than main-memory access. You can explicitly convert database nodes to main-memory fragments by using the update keyword:
db:open('organization') update {}
…but that's only advisable for smaller fragments, and for those that are frequently accessed.
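For example, a sketch along these lines (the database name and element names are assumptions) converts the fragment once and then queries the main-memory copy:

(: copy the database contents to main memory once, then reuse the copy :)
let $orgs := (db:open('organization') update {})//organization
return extractor:get_child_orgs-forked(
  $orgs,
  $orgs[@id = 'urn:uuid:a0c7c9cb-cdc4-4d24-b644-04dfcd45f9ea']
)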
Cheers, Christian