Hello Simon,
I would send one query per document, moving the loop out into Java.
A question: could your process be written in XQuery? That way you might avoid the memory overflow.
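The one-query-per-document idea could be sketched roughly as follows. This is only an illustration under stated assumptions: QueryRunner is a hypothetical stand-in for "run this XQuery and return the serialized result" (with a BaseX Session it could probably be implemented with Session#query(String) and Query#execute()), the database name "docs" is invented, and db:open is assumed to be the lookup function available in the 8.6.x versions mentioned in this thread.

```java
import java.util.function.Consumer;

class PerDocumentFetch {
    /**
     * Hypothetical abstraction, not part of BaseX. With a BaseX Session it
     * might look like:
     *   q -> { try (Query query = session.query(q)) { return query.execute(); }
     *          catch (IOException e) { throw new UncheckedIOException(e); } }
     */
    interface QueryRunner {
        String run(String xquery);
    }

    /** Wraps a value as an XQuery string literal, escaping '&' and the apostrophe. */
    static String quote(String value) {
        return "'" + value.replace("&", "&amp;").replace("'", "''") + "'";
    }

    /**
     * Fetches the documents of one database one at a time, so that only the
     * list of paths and a single document are held in memory at once.
     * Returns the number of documents handed to the handler.
     */
    static int fetchEach(QueryRunner runner, String db, Consumer<String> handler) {
        // First query: only the document paths, joined with newlines.
        String paths = runner.run("string-join(db:list(" + quote(db) + "), '&#10;')");
        int count = 0;
        for (String path : paths.split("\n")) {
            if (path.isEmpty()) continue;
            // One small query per document; the result is a single serialized document.
            String xml = runner.run("db:open(" + quote(db) + ", " + quote(path) + ")");
            handler.accept(xml);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stub runner so the sketch runs without a BaseX installation.
        QueryRunner stub = q -> q.startsWith("string-join") ? "a.xml\nb.xml" : "<doc/>";
        int n = fetchEach(stub, "docs", xml -> { /* process one document here */ });
        System.out.println("fetched " + n + " documents");
    }
}
```

The trade-off is one round trip per document, in exchange for a memory footprint that should no longer grow with the size of the overall result.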
Best regards,
Fabrice Etanchaud
CERFrance Poitou-Charentes
From: basex-talk-bounces@mailman.uni-konstanz.de [mailto:basex-talk-bounces@mailman.uni-konstanz.de]
On behalf of Simon Chatelain
Sent: Friday, 22 September 2017 09:34
To: BaseX
Subject: [basex-talk] OutOfMemoryError at Query#more()
Hello,
I am facing an issue while retrieving a large number of XML documents from a BaseX collection.
Each document (as an XML file) is around 10 KB, and in the problematic case I must retrieve around 70000 of them.
I am using Session#query(String query), then Query#more() and Query#next(), to iterate through the result of my query:
try (final Query query = l_Session.query("query")) {
    while (query.more()) {
        String xml = query.next();
    }
}
If the result of my query contains more than a certain number of XML documents, I get an OutOfMemoryError (full stack trace in the attached file) when executing query.more().
I ran the test with BaseX 8.6.6 and 8.6.7, Java 8, VM argument -Xmx1024m.
Increasing the Xmx value is not a solution, as I don't know the maximum amount of data I will have to retrieve in the future. So what I need is a reliable way of executing such queries and iterating through the results without exhausting the heap.
I also tried using QueryProcessor and QueryProcessor#iter() instead of Session#query(String query). But is it safe to use them, knowing that my application is multithreaded and that each thread has its own session to query or add elements from/to multiple collections?
Moreover, for now all access to BaseX is done through a session, so my application can run with either an embedded BaseX or a BaseX server. If I start using QueryProcessor, then it will be embedded BaseX only, right?
I also attached a simple example showing the problem.
Any advice would be much appreciated.
Thanks,
Simon