I have loaded all the PubMed baseline XML records into a series of 20 BaseX databases, with 55 or 56 of the baseline files per database. Each database is about 12GB in size and has between 520 and 530 million nodes in 55 or 56 documents. Text, Token, and Attribute indices are enabled, but with 6GB RAM allocated to the Java VM, BaseX would not create a full-text index. Each of the 55 or 56 documents has 30000 article records in it under a root PubmedArticleSet element.
I typically use basexgui for interactive work and basex for scripted loads & queries, and I allocate 6G to the Java VM in each case:
BASEX_JVM="-Xmx6g $BASEX_JVM"
I'm exploring ways to search the data in a moderately performant way, starting with the relatively simple lookup by PubMed ID:
/PubmedArticleSet/PubmedArticle[MedlineCitation/PMID[text()=$pmid]]
I generated a series of five index XML files that pair PubMed IDs with database names, like so:
<index>
  <entry>
    <dbname>pmed_baseline_a</dbname>
    <pmid>579614</pmid>
  </entry>
  …
</index>
Each file contains entries for four of the 20 BaseX databases. I loaded the five files into a single database.
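The thread does not say how the index files were produced. One way to build such an index, shown only as a minimal sketch, is to walk the PMID elements of each database with XQuery. The second database name in the list is an assumption made for illustration; only pmed_baseline_a and pmed_baseline_s are named in the thread.

```xquery
(: Sketch only: builds one <index> document for a list of databases.
   The database list is hypothetical; adjust it to the real names. :)
<index>{
  for $dbname in ('pmed_baseline_a', 'pmed_baseline_b')
  (: Only the record's own PMID, directly under MedlineCitation. :)
  for $pmid in collection($dbname)/PubmedArticleSet/PubmedArticle/MedlineCitation/PMID
  return
    <entry>
      <dbname>{ $dbname }</dbname>
      <pmid>{ string($pmid) }</pmid>
    </entry>
}</index>
```

With well over a million records per database the result is large, so it may be more practical to write one file per database than to hold several in memory at once.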
My hope was that I could quickly look up the name of the database containing a record by that record's PMID, then open that collection and quickly obtain the record, but it isn't working the way I had hoped.
If I query the index database by PMID, I get the answer in 156ms:
let $pmid := '22345065'
let $icoll := collection('idx_pmed_baseline')
let $pmid_lookup := $icoll/index/entry/pmid
let $entry := $pmid_lookup[text()=$pmid]
let $dbname := $entry/parent::entry/dbname/text()
return $dbname
(: returns 'pmed_baseline_s' :)
If I open and query that collection by XPath for that PMID, I also get the answer quickly, in about 420ms:
let $coll := collection('pmed_baseline_s')
let $pmid := '22345065'
let $wanted := $coll/PubmedArticleSet/PubmedArticle[MedlineCitation/PMID/text()=$pmid]
return $wanted
(: returns desired XML record :)
But if I combine the code to run in a single execution, it takes about 40s:
let $pmid := '22345065'
let $icoll := collection('idx_pmed_baseline')
let $pmid_lookup := $icoll/index/entry/pmid
let $entry := $pmid_lookup[text()=$pmid]
let $dbname := $entry/parent::entry/dbname/text()
let $coll := collection($dbname)
let $wanted := $coll/PubmedArticleSet/PubmedArticle[MedlineCitation/PMID/text()=$pmid]
return $wanted
I feel like I must be doing some simple thing wrong, but the only difference I see in my code between the two separate steps and the single execution version is that I'm passing the db name in a variable instead of as a string literal to the collection() function, and I'm running the whole thing in a single execution.
Note that before each execution of an XQuery, I exited basexgui and restarted it to avoid any caching effect in memory at least.
The VM I'm running on is modest (spinning drives in RAID 1, four modest AMD CPU cores, dynamic memory growth up to 32GB). But these factors would not explain the difference in speed between the two steps in separate executions and both steps in a single execution.
Can anyone point out what I'm doing wrong? And is there a better way to go about this?
Many thanks & all the best, Chuck
Hi Charles, If I remember correctly, it is because of the dynamic call to collection(): when you call collection('my static db name'), the parser can rewrite the query to use an index, but not when you call collection($my_dynamic_db_name).
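For readers who want to keep the path expression rather than call the index functions directly, BaseX documents a db:enforceindex pragma that requests index rewriting even when the database name is only known at runtime. The following is a sketch under that assumption and was not tried in this thread. It uses db:open, the BaseX 9 function name (BaseX 10 and later call it db:get), and it presumes an up-to-date text index in every database that might be addressed.

```xquery
(: Sketch, not from the thread: enforce index rewriting for a dynamic database name. :)
let $pmid := '22345065'
(: Look up the database name in the index database, as in the original query. :)
let $dbname := string(
  collection('idx_pmed_baseline')/index/entry[pmid/text() = $pmid]/dbname
)
return (# db:enforceindex #) {
  (: db:open is the BaseX 9 name; replace with db:get on BaseX 10+. :)
  db:open($dbname)/PubmedArticleSet/PubmedArticle[MedlineCitation/PMID/text() = $pmid]
}
```

Whether this matches the speed of a direct index call on databases of this size would need to be measured.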
I wonder if you could get better results using the db:text() function. Could you give something like this a try?
let $pmid := '22345065'
let $dbname := db:text('idx_pmed_baseline', $pmid)/parent::pmid/../dbname
return db:text($dbname, $pmid)/parent::PMID/ancestor::PubmedArticle
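The same two-step lookup can be wrapped in a function, sketched below. The function name local:article-by-pmid is hypothetical. The sketch also assumes that PMID elements may occur deeper inside a record (for example in cross-references to other articles), so it restricts the match to a PMID directly under MedlineCitation, mirroring the path used in the original question.

```xquery
(: Sketch only: database and element names come from the thread;
   the function name is made up for illustration. :)
declare function local:article-by-pmid(
  $pmid as xs:string
) as element(PubmedArticle)* {
  (: Step 1: text-index lookup in the small index database. :)
  for $dbname in db:text('idx_pmed_baseline', $pmid)/parent::pmid/../dbname
  (: Step 2: text-index lookup in the database named by the matching entry,
     keeping only the record whose own PMID matches. :)
  return db:text(string($dbname), $pmid)
           /parent::PMID/parent::MedlineCitation/parent::PubmedArticle
};

local:article-by-pmid('22345065')
```

If the PMID is not present in the index database, the function returns the empty sequence rather than raising an error.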
Best regards, Fabrice
Thank you Fabrice! That is most helpful.
It's a good day when you learn something new, and I did not know about db:text() and db:attribute(). I'll need to think more on how I can use these functions. Much appreciated!
All the best, Chuck Bearden
Hi Charles, Give it a try first! My XQuery is patchy...
Best regards, Fabrice
I should have mentioned: your code returned the result in 10 milliseconds.
BaseX is really a wonderful tool for processing unstructured data!
basex-talk@mailman.uni-konstanz.de