Hello,
we're getting an apparent deadlock (followed by a "GC overhead limit
exceeded" error) on one machine when starting processing of a
collection of over 800 000 records. Going after it with YourKit yields
the following thread dump:
application-akka.actor.default-dispatcher-110 <--- Frozen for at least 3m 7s
org.basex.server.Query.cache(InputStream)
org.basex.server.ClientQuery.cache()
org.basex.server.Query.more()
eu.delving.basex.client.Implicits$RichClientQuery.hasNext()
scala.collection.Iterator$$anon$19.hasNext()
scala.collection.Iterator$$anon$29.hasNext()
scala.collection.Iterator$class.foreach(Iterator, Function1)
scala.collection.Iterator$$anon$29.foreach(Function1)
core.processing.CollectionProcessor$$anonfun$process$2.apply(ClientSession)
core.processing.CollectionProcessor$$anonfun$process$2.apply(Object)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(ClientSession)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(ClientSession)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX.withSession(Function1)
eu.delving.basex.client.BaseX.withSession(String, Function1)
core.storage.BaseXStorage$.withSession(Collection, Function1)
core.processing.CollectionProcessor.process(Function0, Function1, Function1, Function3)
core.processing.DataSetCollectionProcessor$.process(DataSet)
actors.Processor$$anonfun$receive$1.apply(Object)<2 recursive calls>
akka.actor.Actor$class.apply(Actor, Object)
actors.Processor.apply(Object)
akka.actor.ActorCell.invoke(Envelope)
akka.dispatch.Mailbox.processMailbox(int, long)
akka.dispatch.Mailbox.run()
akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec()
akka.jsr166y.ForkJoinTask.doExec()
akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinTask)
akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool$WorkQueue)
akka.jsr166y.ForkJoinWorkerThread.run()
In the server logs I can observe:
09:10:45.129 [192.168.1.214:47530]: dimcon____geheugen-van-nederland QUERY(3) for $i in /record[@version = 0] order by $i/system/index return $i OK 0.06 ms
09:10:45.129 [192.168.1.214:47530]: dimcon____geheugen-van-nederland QUERY(3) OK 0.03 ms
09:13:23.155 [192.168.1.214:47530]: dimcon____geheugen-van-nederland ITER(3) Error: Connection reset
09:13:23.155 [192.168.1.214:47530]: dimcon____geheugen-van-nederland LOGOUT admin OK
I looked up the code, and it looks as though the whole query result is
cached in memory upon retrieval. Given that the DB is over 1.2 GB in
size, our client server has a hard time (it only has 1.5 GB of Xmx).
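For context, our iteration boils down to something like the sketch
below; host, port and credentials are placeholders, and the real code
goes through the eu.delving.basex.client wrapper, but it ends up in the
same more()/next() calls that show up in the stack trace:

import org.basex.server.ClientSession

object CacheRepro {
  def main(args: Array[String]): Unit = {
    // placeholder connection details, not our actual config
    val session = new ClientSession("localhost", 1984, "admin", "admin")
    try {
      val query = session.query(
        "for $i in /record[@version = 0] order by $i/system/index return $i")
      try {
        // Judging from the stack trace, the first more() call goes through
        // ClientQuery.cache(), which reads the entire result set into
        // memory before handing back a single item.
        while (query.more()) {
          val record = query.next()
          // ... process one record ...
        }
      } finally query.close()
    } finally session.close()
  }
}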
Is there any preferred way of dealing with this?
What I am going to do for the moment, I think, is to override or
intercept the creation of the ClientQuery and provide an implementation
with a different caching strategy. Another approach might be to limit
the query output and implement some custom iteration behavior, but if
that can be handled directly at the query level, I think it would make
things easier.
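To make the second idea concrete, here is a rough sketch of what I have
in mind: fetch the records in fixed-size chunks via subsequence(), so
the client never caches more than one chunk at a time. The chunk size,
variable names and the helper itself are made up for illustration:

import org.basex.server.ClientSession

def processInChunks(session: ClientSession, chunkSize: Int)
                   (handle: String => Unit): Unit = {
  var offset = 1
  var fetched = 0
  do {
    fetched = 0
    val query = session.query(
      """declare variable $start external;
        |declare variable $size external;
        |subsequence(
        |  for $i in /record[@version = 0]
        |  order by $i/system/index
        |  return $i,
        |  $start, $size)""".stripMargin)
    try {
      // bind the window boundaries as typed external variables
      query.bind("start", offset.toString, "xs:integer")
      query.bind("size", chunkSize.toString, "xs:integer")
      while (query.more()) {
        handle(query.next())
        fetched += 1
      }
    } finally query.close()
    offset += chunkSize
  } while (fetched == chunkSize)
}

The obvious downside is that the for/order by is re-evaluated for every
chunk, so with 800 000 records the server would do repeated work;
whether that is affordable probably depends on how well BaseX can
optimize the positional filter.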
Manuel