Alex,
Here are the contents of the script (.bxs file) in its partitioned form
(broken out into 6 separate scripts rather than one script):
SET STRIPNS true
SET ADDCACHE true
SET TEXTINDEX false
SET ATTRINDEX false
OPEN Release-Canonicals-Comparative
XQUERY db:output('
 -- ' || current-time() || ' --

')
XQUERY db:output("
#12")
SET BINDINGS $db=Release-Canonicals-Comparative,$containerSetStart=110001,$containerSetCount=10000
RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
XQUERY db:output("
#13")
SET BINDINGS $db=Release-Canonicals-Comparative,$containerSetStart=120001,$containerSetCount=10000
RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
XQUERY db:output('
 -- ' || current-time() || ' --

')
If I run it in this partitioned form, it is quite fast, roughly 5
minutes per "RUN" command. If I concatenate them all together, it
progressively slows down, consumes all the available memory,
starts to swap memory to disk, CPU climbs to 100%, and it eventually
fails with a memory error.
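
As a rough sketch of the partitioned approach described above, the per-batch scripts could be generated rather than written by hand. The helper names (batch_start, batch_script, write_batches) and the batch-NNN.bxs file naming are hypothetical, and the start offsets assume the pattern visible in the script (#12 starts at 110001, #13 at 120001, with 10000 containers per set). It only writes the files; how each one is then run as its own process is left open.

```python
from pathlib import Path

# Values copied from the script above.
DB = "Release-Canonicals-Comparative"
QUERY = r"..\webapp\release-identification\xquery\generate-comparison-db.xq"
SET_COUNT = 10000  # $containerSetCount


def batch_start(batch_no, count=SET_COUNT):
    """First container number of a 1-based batch (assumed pattern: #12 -> 110001)."""
    return (batch_no - 1) * count + 1


def batch_script(batch_no, count=SET_COUNT):
    """Return the text of one standalone command script for a single batch."""
    lines = [
        "SET STRIPNS true",
        "SET ADDCACHE true",
        "SET TEXTINDEX false",
        "SET ATTRINDEX false",
        f"OPEN {DB}",
        # Simplified progress marker (the original prints a leading newline too).
        f'XQUERY db:output("#{batch_no}")',
        "SET BINDINGS $db=%s,$containerSetStart=%d,$containerSetCount=%d"
        % (DB, batch_start(batch_no, count), count),
        f"RUN {QUERY}",
    ]
    return "\n".join(lines) + "\n"


def write_batches(first, last, directory="."):
    """Write one .bxs file per batch (hypothetical naming) and return the paths."""
    out_dir = Path(directory)
    paths = []
    for batch_no in range(first, last + 1):
        path = out_dir / f"batch-{batch_no:03d}.bxs"
        path.write_text(batch_script(batch_no), encoding="utf-8")
        paths.append(path)
    return paths
```

For example, write_batches(12, 13) would produce batch-012.bxs and batch-013.bxs, each opening the database and running exactly one set.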
Christopher
On 6/12/2013 2:45 AM, Alexander Holupirek wrote:
Christopher,
it may be sufficient if you can pass the script (.bxs) file that you use to process the data.
Would that be possible?
Alex
On 12.06.2013, at 02:46, "Christopher.Ball" <christopher.ball@metaheuristica.com> wrote:
Christian,
So I have finally upgraded to BaseX 7.7 and found I am still having the out of memory issue.
Given the size and nature of the data I am working with, I am at a loss as to how to provide you with a simple example that replicates the problem.
On the flip side, one behavior I am noticing is that breaking the work into discrete chunks in separate batch scripts gives dramatically faster performance and avoids the memory error. This strongly suggests that something is preventing garbage collection between unrelated tasks in a batch script.
Is there any way I can force garbage collection in a batch script? I tried closing and reopening databases, but that had no effect (I was actually shocked that it did not).
Let me know,
Christopher
On 5/20/2013 6:24 AM, Christian Grün wrote:
Hi Christopher, hi Ben,
yes, this sounds like unwanted behavior, and I believe it should be
fixable, as the command scripts I’ve been working with didn’t cause
memory leaks. I’ll be glad to track down the possible issues. Could
one or both of you pass on a script that causes the problems?
Christian
PS: I would be grateful if you could additionally check if the problem
persists in the latest stable snapshot.
___________________________
On Mon, May 20, 2013 at 10:33 AM, Ben Companjen <bencompanjen@gmail.com> wrote:
I recognise your problem, and reported it, but never got back to it
with more details. I used BaseX client/server 7.5 beta. My first
database contained 2.7 million documents, but I created a new one from
an exported subset of 700k documents. That helped lower the memory use
directly after loading the DB.
Any chance you use the SQL module in your processing?
My guess was that it had been a design choice to keep previously
opened documents from a database in use in memory. But running out of
memory probably wasn't ;)
Ben
On 20 May 2013 04:32, Christopher.R.Ball <christopher.r.ball@gmail.com> wrote:
I have a BaseX script (.bxs) I am running that does queries in batches (sets
of 5k documents), but as it progresses it bogs down in speed, does not
release memory between sets even if I force it to close and reopen the db
between queries, and eventually runs out of memory.
But if I break the same BaseX script into separate files, still doing
exactly the same batches, it is extremely fast and memory efficient.
Very suggestive of a memory leak . . .
I am running on BaseX 7.6.1 Beta.
Any thoughts?
Is there a way to force the script to do garbage collection?
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Dr. Alexander Holupirek
|-- Room E 221, 0049 7531 88 2188 (phone) 3577 (fax)
|-- Database & Information Systems Group, U Konstanz
`-- https://scikon.uni-konstanz.de/personen/alexander.holupirek/