Hello,
we're getting an apparent deadlock (followed by a "GC overhead limit exceeded" error) on one machine when starting some processing on a collection of over 800,000 records. Going after it with YourKit yields the following:
application-akka.actor.default-dispatcher-110 <--- Frozen for at least 3m 7s
org.basex.server.Query.cache(InputStream)
org.basex.server.ClientQuery.cache()
org.basex.server.Query.more()
eu.delving.basex.client.Implicits$RichClientQuery.hasNext()
scala.collection.Iterator$$anon$19.hasNext()
scala.collection.Iterator$$anon$29.hasNext()
scala.collection.Iterator$class.foreach(Iterator, Function1)
scala.collection.Iterator$$anon$29.foreach(Function1)
core.processing.CollectionProcessor$$anonfun$process$2.apply(ClientSession)
core.processing.CollectionProcessor$$anonfun$process$2.apply(Object)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(ClientSession)
core.storage.BaseXStorage$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(ClientSession)
eu.delving.basex.client.BaseX$$anonfun$withSession$1.apply(Object)
eu.delving.basex.client.BaseX.withSession(Function1)
eu.delving.basex.client.BaseX.withSession(String, Function1)
core.storage.BaseXStorage$.withSession(Collection, Function1)
core.processing.CollectionProcessor.process(Function0, Function1, Function1, Function3)
core.processing.DataSetCollectionProcessor$.process(DataSet)
actors.Processor$$anonfun$receive$1.apply(Object) <2 recursive calls>
akka.actor.Actor$class.apply(Actor, Object)
actors.Processor.apply(Object)
akka.actor.ActorCell.invoke(Envelope)
akka.dispatch.Mailbox.processMailbox(int, long)
akka.dispatch.Mailbox.run()
akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec()
akka.jsr166y.ForkJoinTask.doExec()
akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinTask)
akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool$WorkQueue)
akka.jsr166y.ForkJoinWorkerThread.run()
In the server logs I can observe:
09:10:45.129 [192.168.1.214:47530]: dimcon____geheugen-van-nederland QUERY(3) for $i in /record[@version = 0] order by $i/system/index return $i OK 0.06 ms
09:10:45.129 [192.168.1.214:47530]: dimcon____geheugen-van-nederland QUERY(3) OK 0.03 ms
09:13:23.155 [192.168.1.214:47530]: dimcon____geheugen-van-nederland ITER(3) Error: Connection reset
09:13:23.155 [192.168.1.214:47530]: dimcon____geheugen-van-nederland LOGOUT admin OK
I looked at the code, and it looks as though the whole query result is cached in memory upon retrieval. Given that the database is over 1.2 GB in size, our client server has a hard time (it only has 1.5 GB of Xmx).
Is there any preferred way of dealing with this?
What I am going to do for the moment, I think, is to override or intercept the creation of the ClientQuery and provide an implementation with a different caching strategy. Another approach would be to limit the query output and implement some custom iteration behavior - but if this can be handled directly at the query level, I think it would make things easier.
Manuel
Hi Manuel,
dimcon____geheugen-van-nederland QUERY(3) for $i in /record[@version = 0] order by $i/system/index return $i OK 0.06 ms
as you have already seen, all results are first cached by the client if they are requested via the iterative query protocol. In earlier versions of BaseX, results were returned in a purely iterative manner -- which was more convenient and flexible from a user's point of view, but led to numerous deadlocks if reading and writing queries were mixed.
If you only need parts of the requested results, I would recommend limiting the number of results via XQuery, e.g. as follows:
(for $i in /record[@version = 0] order by $i/system/index return $i)[position() = 1 to 1000]
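A minimal Java sketch of how a client could walk through such position() windows page by page (the page() helper and the batch size are illustrative, not part of the BaseX API; each generated query would be submitted to the server as its own query):

```java
public class PagingSketch {
    // Hypothetical helper: wraps an ordered query in a position() window.
    static String page(String innerQuery, int offset, int pageSize) {
        return "(" + innerQuery + ")[position() = "
                + (offset + 1) + " to " + (offset + pageSize) + "]";
    }

    public static void main(String[] args) {
        String inner = "for $i in /record[@version = 0] "
                + "order by $i/system/index return $i";
        // Each page would be run as a separate query against the server,
        // so the client never holds more than one page in memory.
        System.out.println(page(inner, 0, 1000));
        System.out.println(page(inner, 1000, 1000));
    }
}
```

Note that each window re-evaluates the ordered sequence on the server, so this trades client memory for repeated server work.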
Next, it is important to note that the "order by" clause can get very expensive, as all results have to be cached anyway before they can be returned. Our top-k functions will probably give you better results if it's possible in your use case to limit the number of results [1].
A popular alternative to client-side caching (well, you mentioned that already) is to override the code of the query client and directly process the returned results. Note, however, that you will need to loop through all results, even if you only need some of them.
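The streaming idea can be illustrated independently of the BaseX wire protocol; in this self-contained sketch, the StringReader merely stands in for the socket's result stream that an overridden client would read:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class StreamingSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for the server's result stream; a real overridden client
        // would read from the query's socket InputStream instead of caching it.
        BufferedReader results = new BufferedReader(new StringReader(
                "<record>1</record>\n<record>2</record>\n<record>3</record>"));

        int processed = 0;
        String record;
        while ((record = results.readLine()) != null) {
            // Handle one record at a time; nothing accumulates in memory.
            processed++;
        }
        System.out.println("processed records: " + processed);
    }
}
```

The memory footprint stays constant regardless of the result size, which is the point of processing results as they arrive.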
Hope this helps, Christian
[1] http://docs.basex.org/wiki/Higher-Order_Functions_Module#hof:top-k-by
Hi Christian,
as you have already seen, all results are first cached by the client if they are requested via the iterative query protocol. In earlier versions of BaseX, results were returned in a purely iterative manner -- which was more convenient and flexible from a user's point of view, but led to numerous deadlocks if reading and writing queries were mixed.
If you only need parts of the requested results, I would recommend limiting the number of results via XQuery, e.g. as follows:
(for $i in /record[@version = 0] order by $i/system/index return $i)[position() = 1 to 1000]
I had considered this, but haven't used that approach - yet - mainly because I wanted to try the streaming approach first. So far our system has only used MongoDB, and we are used to working with cursors as query results, so I'm trying to keep things aligned with that if possible.
Next, it is important to note that the "order by" clause can get very expensive, as all results have to be cached anyway before they can be returned. Our top-k functions will probably give you better results if it's possible in your use case to limit the number of results [1].
Ok, thanks. If this becomes a problem, I'll consider using this. Is the query time of 0.06 ms otherwise the actual time the query takes to run? If so, I'm not too worried about query performance :) In general, the bottleneck in our system is not so much the querying but rather the processing of the records - I started rewriting that part to run concurrently using Akka, but am now stuck with a classloader deadlock (no pun intended). It will likely take quite some effort for the processing to be faster than the query iteration.
A popular alternative to client-side caching (well, you mentioned that already) is to override the code of the query client and directly process the returned results. Note, however, that you will need to loop through all results, even if you only need some of them.
I implemented this and it looks like it works nicely (to be confirmed soon - I started a run on a 600k records collection).
Thanks for your time!
Manuel
Hello again,
I implemented this and it looks like it works nicely (to be confirmed soon - I started a run on a 600k records collection).
This runs nicely, in that the machine doesn't run out of memory anymore. There is one thing I noticed, however, and that I had noticed earlier on as well when a big collection was being processed: any attempt to talk to the server fails, i.e. even when I try to connect via the command-line basexadmin and run a command such as "list" or "open db foo", I do not get a reply. I can see the commands in the log, though:
17:28:06.532 [127.0.0.1:33112] LOGIN admin OK
17:28:08.158 [127.0.0.1:33112] LIST
17:28:21.288 [127.0.0.1:33114] LOGIN admin OK
17:28:25.602 [127.0.0.1:33114] LIST
17:28:52.676 [127.0.0.1:33116] LOGIN admin OK
Could it be that the long session is blocking the output stream coming from the server?
Thanks,
Manuel
On Mon, May 21, 2012 at 4:40 PM, Manuel Bernhardt bernhardt.manuel@gmail.com wrote:
[....]
[....] There is one thing I noticed however, and that I had noticed earlier on as well when a big collection was being processed: any attempt to talk with the server seems not to be working, i.e. even when I try to connect via the command-line basexadmin and run a command such as "list" or "open db foo", I do not get a reply. [....]
I'm not quite sure what the problem is. Some questions that come to mind:
-- does the problem occur with a single client?
-- does "no reply" mean that your client request is being blocked, or that the returned result is empty?
-- can you access your database via the standalone interfaces?
Just in case: feel free to send a small Java example that demonstrates the issue. Christian
Hi Christian,
I just witnessed this again now. There was one processing run resulting in a "streaming query" (though I don't think it would have made a big difference had it been a cached one) over 90,000 records, and we uploaded a few small collections after starting that one. Additionally, I issued a "list" statement from another client.
What happened next is that:
- the process with the long query went on
- the uploads were blocked (in queue)
- my call on the console was blocked (in queue)
- once the long query was done, all other operations proceeded
So it looks as though there is some kind of read lock on the server level...? Am I perhaps doing something wrong when starting the long query - e.g. should it be started within some kind of transaction or special context?
Thanks,
Manuel
On Tue, May 22, 2012 at 12:18 AM, Christian Grün christian.gruen@gmail.com wrote:
[....]
Hi Manuel,
incoming read operations will be blocked as soon as a write operation is queued, or executed, by any client. Our Wiki entry on transaction management may give you some helpful information [1]; if you feel the information is incomplete, you are invited to edit our Wiki!
We are currently working on pushing down the locking concept to databases (instead of processes); as soon as this works, you will be able to write to one database and access another one in parallel (provided that the XQuery optimizer will find out which databases will be touched by a given query).
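As a rough JDK analogy for this behaviour (a sketch only - the server's transaction queue is of course not literally a ReentrantReadWriteLock), a fair read/write lock shows the same pattern: while a write operation is active, new read attempts wait until it finishes:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDemo {
    public static void main(String[] args) throws InterruptedException {
        // Fair mode: readers arriving after a queued writer are also held back,
        // matching the "queued, or executed" wording above.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        AtomicBoolean granted = new AtomicBoolean();

        lock.writeLock().lock();  // simulate a running write transaction
        Thread reader = new Thread(() -> {
            try {
                // A "list" command from another client: it waits, then times out.
                granted.set(lock.readLock().tryLock(100, TimeUnit.MILLISECONDS));
            } catch (InterruptedException ignored) { }
        });
        reader.start();
        reader.join();
        System.out.println("read granted while writer active: " + granted.get());

        lock.writeLock().unlock();  // the write transaction finishes
        System.out.println("read granted afterwards: " + lock.readLock().tryLock());
    }
}
```

This is why the queued operations all proceeded as soon as the long-running operation completed.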
Hope this helps, Christian
[1] http://docs.basex.org/wiki/Transaction_Management
On Tue, May 22, 2012 at 2:57 PM, Manuel Bernhardt bernhardt.manuel@gmail.com wrote:
[....]