Oups, forgot to put the mailing list in copy.
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how this is possible. It would mean that the lock was removed by another thread but for that the usage value must be wrong in the lockUsage map, and I cannot find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test ( BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
On Tue, Apr 14, 2015 at 3:31 PM, Simon Chatelain schatela@gmail.com wrote:
That was fast :-) . Thank you very much. I will test it right away.
Regards
Simon
On Tue, Apr 14, 2015 at 3:28 PM, Christian Grün <christian.gruen@gmail.com
wrote:
Simon,
I am glad to report the concurrency bug in the user access code has been fixed [1,2]. BaseX 8.1.1 may already be released this week. Of course, please report back to me if you should encounter additional bugs.
Thanks for testing, Christian
[1] https://github.com/BaseXdb/basex/commit/b5d1716e662a300d484cbbb19d1bf23cbef5... [2] http://files.basex.org/releases/latest
On Tue, Apr 14, 2015 at 2:44 PM, Christian Grün christian.gruen@gmail.com wrote:
I have minimized your example a bit (attached), and I agree it seems to be the user iterator that causes surprises.
Looking at it.. Christian
On Tue, Apr 14, 2015 at 1:48 PM, Christian Grün christian.gruen@gmail.com wrote:
Hi Simon,
Thanks for the updated code. It's still quite a lot code.. Is all of it required to reproduce the bug? E.g., does the bug only occur with the specified options etc? Could you possibly reduce it even more? Idealle, we can later rewrite it to a JUnit test once the problem is fixed (and we try to keep our tests as concise as possible).
But I'll have some thoughts about your helpful assumptions in parallel.
Thanks in advance, Christian
On Tue, Apr 14, 2015 at 1:10 PM, Simon Chatelain schatela@gmail.com
wrote:
Hello Christian,
I checked again my code, but I am quite sure that each LocalSession is created and used by only one thread. I attach the test application code.
Each LocalSession is used to execute several commands, but that
shouldn't be
a problem, right ?
I also had a look in BaseX code, especially in the
org.basex.core.users.User
class.
What happens in my test application is that I create several Writer
that
will add XML document each in a specific collection, but using the
same
user. Each time I create a new Writer it will first create the collection
then
grant read and write permission to this collection to the user
"USER", and
then start adding XML document in this collection.
My suspicion is that the problem occurs when a new Writer is created,
and it
is granting permission to the newly created collection while another
Writer
(in another thread) is adding a new XML document in its own
collection.
If, during the execution of the Grant command, the method
User#perm(final
Perm prm, final String pattern) is called while the execution of a concurrent Write command is at the same time executing the
User#find(final
String db) method, then we have a ConcurrentModificationException as
perm
method will put a new entry in locals field while the find method is iterating on the locals entrySet of the same User, the same User
instance
being shared by all subcontext created when opening a new
LocalSession.
For test purposes, I synchronized the access to locals in the perm(),
find()
and remove() methods of the class User, in that case I do not have the ConcurrentModificationException any more.
What do you think, is this possible ?
Regards
Simon
On Mon, Apr 13, 2015 at 11:04 AM, Christian Grün <
christian.gruen@gmail.com>
wrote:
> But the java.util.ConcurrentModificationException > at org.basex.core.users.User.find > is still happening.
I am pretty sure that any of the concurrent command calls are not moved to their own thread yet.
But feel free to pass me on your code (ideally, a version that's as small as possible).
Best, Christian
> > What I did is basically, in the test application, replace all
session
> creations > > try (final Session session = new LocalSession(rootContext, ...)) { > ... > } > catch {...} > > by > > Thread t = new Thread(new Runnable() { public void run() { > try (final Session session = new LocalSession(rootContext, ...)) { > ... > } > catch {...} > > } > }); > > Not very elegant but should be enough to make sure every session is > executed > in its own Java thread. > > > Am I still missing something ? > > Thank you very much for your help, it is very appreciated. > > Regards > > Simon > > > On Tue, Apr 7, 2015 at 12:16 PM, Christian Grün > christian.gruen@gmail.com > wrote: >> >> Hi Simon, >> >> Thanks for the detailed description of your setup and your test
code.
>> >> In BaseX, every session needs to be executed in a separate Java >> thread. The exceptions will e.g. disappear if you pass on your
admin
>> session to the store() and updateMetaStore() functions (or if you
run
>> these operations in a separate thread). >> >> Hope this helps, >> Christian >> >> >> On Thu, Apr 2, 2015 at 1:58 PM, Simon Chatelain <
schatela@gmail.com>
>> wrote: >> > Hello all, >> > >> > I have the following problem. >> > >> > Using: >> > >> > BaseX 8.1.1 beta cf713e2 (20150328.133759) >> > Java 1.8.0_31 >> > Windows 7 >> > >> > The context: I have several sources of data from which I
receive XML
>> > documents. My application will create a collection for each of
these
>> > sources >> > and store each received document into the collection
corresponding to
>> > the >> > source, and then update some kind of metadata about the
collection.
>> > The >> > metadata of all collections are stored into a common
collection. Each
>> > source >> > is handled by a different thread. >> > I am using an embedded BaseX, and access to it through
LocalSession.
>> > >> > Now the problem is that after a certain amount of time (not
constant,
>> > could >> > be seconds or minutes) there is some exceptions. >> > Either: java.lang.IllegalMonitorStateException >> > at org.basex.core.locks.DBLocking.acquire >> > and org.basex.core.locks.DBLocking.release >> > >> > Or: java.util.ConcurrentModificationException >> > at org.basex.core.users.User.find >> > >> > I put the full stack trace in attachment. >> > >> > I build a test application that mimic what happens in the real >> > application >> > and it seems to exhibit the same problem. The source of this
test
>> > application is attached >> > >> > My question: Am I doing something wrong, or is there a bug in
BaseX?
>> > >> > Thanks a lot >> > >> > Simon > >
Hello,
One more detail: if in the unsetLockIfUnused() method in the DBLocking class, I put the locks.remove(object) call in a synchronized(locks){} block, the problem does not appear any more, but as I don't understand exactly what the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
Regards
Simon
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how this is possible. It would mean that the lock was removed by another thread but for that the usage value must be wrong in the lockUsage map, and I cannot find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test ( BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
Thanks for the example. I'm busy right now, but I will give you feedback soon.
On Mon, Apr 20, 2015 at 1:44 PM, Simon Chatelain schatela@gmail.com wrote:
Hello,
One more detail: if in the unsetLockIfUnused() method in the DBLocking class, I put the locks.remove(object) call in a synchronized(locks){} block, the problem does not appear any more, but as I don't understand exactly what the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
Regards
Simon
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how this is possible. It would mean that the lock was removed by another thread but for that the usage value must be wrong in the lockUsage map, and I cannot find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test (BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
Hi Simon,
I finally had time to look at your examples, and...
One more detail: [...]
...seemed to fix it! The original version of this class was written by Jens (in the cc), but I also believe that the basic problem was that the locks instance was not synchronized. In my fix, I used a ConcurrentHashMap instance and changed other minor things in the code (see [1], 8.2 branch).
A new snapshot is available, too..
Thanks for the helpful feedback! Christian
[1] https://github.com/BaseXdb/basex/commit/d3503d36325cb0fea58a13ee9f54feb5ce88... [2] http://files.basex.org/releases/latest
if in the unsetLockIfUnused() method in the DBLocking
class, I put the locks.remove(object) call in a synchronized(locks){} block, the problem does not appear any more, but as I don't understand exactly what the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
Regards
Simon
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how this is possible. It would mean that the lock was removed by another thread but for that the usage value must be wrong in the lockUsage map, and I cannot find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test (BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
Hello,
Excellent. Glad to be of use.
I'll try the new snapshot right away.
Cheers
Simon
On Wed, Apr 22, 2015 at 10:06 AM, Christian Grün christian.gruen@gmail.com wrote:
Hi Simon,
I finally had time to look at your examples, and...
One more detail: [...]
...seemed to fix it! The original version of this class was written by Jens (in the cc), but I also believe that the basic problem was that the locks instance was not synchronized. In my fix, I used a ConcurrentHashMap instance and changed other minor things in the code (see [1], 8.2 branch).
A new snapshot is available, too..
Thanks for the helpful feedback! Christian
[1] https://github.com/BaseXdb/basex/commit/d3503d36325cb0fea58a13ee9f54feb5ce88... [2] http://files.basex.org/releases/latest
if in the unsetLockIfUnused() method in the DBLocking
class, I put the locks.remove(object) call in a synchronized(locks){}
block,
the problem does not appear any more, but as I don't understand exactly
what
the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
Regards
Simon
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at
java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260)
at
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131)
at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how
this
is possible. It would mean that the lock was removed by another thread
but
for that the usage value must be wrong in the lockUsage map, and I
cannot
find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test (BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
Hi all,
I'd say Simon's proposal to fix the issue pretty much nailed the issue.
One more detail: if in the unsetLockIfUnused() method in the DBLocking class, I put the locks.remove(object) call in a synchronized(locks){} block, the problem does not appear any more, but as I don't understand exactly what the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
The access to the lock hashmap was synchronized by _two different objects_, which is a pretty bad thing to do. I wonder why nobody observed (or reported this) until now!
I put together a patch that fixes the synchronization [1] and also reverts Christian's change which I fear is not sufficient, as the `synchronized` block also guards access to the lock counters (but should not provide a major improvement in parallelization, that's probably also why this didn't get an issue for so long). As all access to the `locks` object is now fully synchronized, we don't need the overhead (I guess there will be some) of the concurrent implementation any more.
I already opened a pull request [2].
@Simon: thank you for tracing this to the very deep inner core of BaseX!
Regards, Jens
[1]: https://github.com/JensErat/basex/commit/6ec2d78dc6ce6d5c7837aa80af9338bdb9a... [2]: https://github.com/BaseXdb/basex/pull/1133
On 22.04.2015 10:06, Christian Grün wrote:
Hi Simon,
I finally had time to look at your examples, and...
One more detail: [...]
...seemed to fix it! The original version of this class was written by Jens (in the cc), but I also believe that the basic problem was that the locks instance was not synchronized. In my fix, I used a ConcurrentHashMap instance and changed other minor things in the code (see [1], 8.2 branch).
A new snapshot is available, too..
Thanks for the helpful feedback! Christian
[1] https://github.com/BaseXdb/basex/commit/d3503d36325cb0fea58a13ee9f54feb5ce88... [2] http://files.basex.org/releases/latest
if in the unsetLockIfUnused() method in the DBLocking
class, I put the locks.remove(object) call in a synchronized(locks){} block, the problem does not appear any more, but as I don't understand exactly what the problem is, I am not sure if it really solves it or if it just change the timing a little bit making the problem less likely to happen.
Regards
Simon
Hello Christian,
After some testing on my side, I didn't see the ConcurrentModificationException any more. That's the good news.
However, when running a slightly modified version of the small test case you wrote from my sample application, I faced another problem. The test can run for a few seconds to almost an hour but eventually, the following exception is thrown.
java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:374) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131) at org.basex.core.locks.DBLocking.release(DBLocking.java:215) at org.basex.core.Context.unregister(Context.java:287) at org.basex.core.Command.execute(Command.java:105) at org.basex.api.client.LocalSession.execute(LocalSession.java:132) at org.basex.api.client.Session.execute(Session.java:36) at basextest.BaseXTestDBLocking$1.run(BaseXTestDBLocking.java:43)
I attach my new test case (BaseXTestAdd.java), but the main modification is that in each created thread I also add documents to the collections instead of only opening it.
I was also able to see that in the call to getOrCreateLock() in DBLocking#release(final Proc pr) (line 212) the lock is created while it should already be in the locks Map, but I really cannot understand how this is possible. It would mean that the lock was removed by another thread but for that the usage value must be wrong in the lockUsage map, and I cannot find any sequence of operation that could lead to such a situation.
Trying to pin-point more precisely the problem, I wrote another test (BaseXTestDBLocking.java) that calls directly the acquire and release methods of the DBLocking class. The problem seems to happen more quickly with this test.
Any thoughts ?
Regards
Simon
basex-talk@mailman.uni-konstanz.de