Hi all,
I was wondering whether command scripts are optimized before they are
executed. I would think so, because when I had made an error in the,
say, sixth command in my script, it was caught before the first
command was executed.
But more importantly for my application: it seems that combinations of
"CLOSE; OPEN <same DB as just closed>" are skipped. I created an
XQuery to copy 680000 records from my XML DB to MySQL. These records
are MODS, found inside record elements at XPath /file/record. The
database is about 2.5 GB and eats even more of my main memory when I
try to copy all records in one run.
I found out that processing subsequences of 50000 to 10000 records at
a time speeds up the process significantly, and the server uses a lot
less memory.
For instance, when I start processing records from 400001, BaseX
apparently doesn't keep the first 400000 in memory.
But just processing 50000 records at a time, shifting the begin
parameter of the subsequence and going on, eventually slows the
process down from somewhere between tens and more than 100 records
per second to less than one per second, probably because memory
consumption goes up to all available memory (I assigned 3 GB to the
JVM at startup of the server). Therefore I tried to free some memory
by closing the database, re-opening it and then resuming processing.
But I don't see the memory usage reduction that I hoped for.
From my XQuery mods2sql5.xq:
<namespace declarations>
declare variable $begin as xs:integer external := 1;
declare variable $length as xs:integer external := 50000;
<function declarations:
prepare statement
return sql:execute-prepared(...)>
(: main process :)
for $record at $rodb in subsequence(/file/record[mods:mods][@id != ""], $begin, $length)
<call functions to process records>
return <functions' output as sequence>
From my command script mods2sql.bxs:
SET QUERYINFO true
OPEN monly2
SET BINDINGS $begin=1,$length=50000
RUN /Users/ben/Documents/xquery/metadata_only/mods2sql5.xq
SET BINDINGS $begin=50001
RUN /Users/ben/Documents/xquery/metadata_only/mods2sql5.xq
CLOSE
OPEN monly2
SET BINDINGS $begin=100001
RUN /Users/ben/Documents/xquery/metadata_only/mods2sql5.xq
SET BINDINGS $begin=150001
RUN /Users/ben/Documents/xquery/metadata_only/mods2sql5.xq
SET BINDINGS $begin=200001
RUN /Users/ben/Documents/xquery/metadata_only/mods2sql5.xq
CLOSE
OPEN monly2
...
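For reference, the $begin values in the script simply step through the
data in windows of $length. A minimal sketch that lists them; it is not
part of mods2sql5.xq, and $total is a hypothetical variable that I set
to the approximate record count mentioned above:

```xquery
(: Sketch only: list the SET BINDINGS lines for every window. :)
(: $total is an assumed name; 680000 is the record count from this post. :)
declare variable $total as xs:integer := 680000;
declare variable $length as xs:integer := 50000;

string-join(
  (: window index $i is 0-based; begin positions are 1-based :)
  for $i in 0 to ($total - 1) idiv $length
  return 'SET BINDINGS $begin=' || ($i * $length + 1) || ',$length=' || $length,
  '&#10;'
)
```

With these numbers that gives 14 windows, the last one starting at
650001.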
Are my observations about optimizations of command scripts correct? Is
there any way to speed up my process?
Thanks for any advice!
Regards,
Ben