Hi,
in one of our projects we are using BaseX. So far we are very happy with our choice, but recently we got an error that we could not fix. The only solution that worked was to create a new, empty database.
When trying to add a document to the database, we get an exception and the message "lock file does not exist":
$ open cap10_cap_store_prod
Database 'cap10_cap_store_prod' was opened in 219.99 ms.
$ add /home/thomas/tmp/test.xml
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.0.1
Java: Debian, 11.0.9
OS: Linux, amd64
Stack Trace:
java.lang.RuntimeException: cap10_cap_store_prod: lock file does not exist.
  at org.basex.util.Util.notExpected(Util.java:61)
  at org.basex.data.DiskData.finishUpdate(DiskData.java:246)
  at org.basex.core.cmd.ACreate.update(ACreate.java:97)
  at org.basex.core.cmd.Add.run(Add.java:56)
  at org.basex.core.Command.run(Command.java:257)
  at org.basex.core.Command.execute(Command.java:93)
  at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
  at org.basex.api.client.Session.execute(Session.java:36)
  at org.basex.core.CLI.execute(CLI.java:92)
  at org.basex.core.CLI.execute(CLI.java:76)
  at org.basex.BaseX.console(BaseX.java:196)
  at org.basex.BaseX.<init>(BaseX.java:171)
  at org.basex.BaseX.main(BaseX.java:42)
Which lock file is the error referring to? Can we create it manually to make it work again?
The error only happens in the database that we have been using so far (cap10_cap_store_prod). After creating and using a new one (cap10_cap_store_prod2) the problem does not occur. Here is the output of running list on the database:
$ list
Name                   Resources  Size        Input Path
cap10_cap_store_prod   233387     1583797639
cap10_cap_store_prod2  241        1912414
testdb                 1          4611
3 database(s).
Did we hit a size limit of BaseX? I checked the server and it has enough free disk space and free inodes available.
Some more background info:
The system runs on a cluster of servers. BaseX runs on one of them, bound to port 1984. The other nodes in the cluster connect to the BaseX-Server via the network. In my understanding, this type of concurrency is supported by BaseX.
The software on the nodes is implemented in Elixir. We wrote our own Elixir client library that is a thin wrapper around https://github.com/zadean/basexerl, an existing open source Erlang client library for BaseX.
Our Elixir client library keeps a pool of connections to the BaseX server open. Until we encountered the above error, everything worked perfectly. Also, after creating a new database, everything works fine again.
This error also occurs after shutting down all the nodes that connect to our BaseX instance. We checked using the lsof command, and there are no connections open to the database other than the client we use to reproduce the error.
Any help would be greatly appreciated!
Kind regards, Thomas
Hi Thomas,
Thanks for your mail.
As you have already reported back to us, an inconsistency in your database seems to trigger the exception. It’s true that the INSPECT command cannot catch all possible inconsistencies in a database. You can try to…
1. export all files from your database and import them into the new instance,
2. optimize your database,
3. fully optimize your database (with the ALL option).
Just to be sure, it may be wise to create a backup before the optimization.
If you have some vague idea of when the bug happened first, you could scan the database logs from that period of time and the days before.
I haven’t encountered a similar error in the recent past, so maybe it was caused by a previous version of BaseX. It’s always a good idea to switch to the latest version. Do you remember which version you started with?
Best, Christian
On Wed, Dec 30, 2020 at 2:26 PM Thomas Spitaler tommy.spitaler@gmail.com wrote:
Hi Christian,
thanks a lot for your advice.
Following it brought some further insights.
I backed up the database and then tried to optimize it, both without and with the ALL option.
However, this led to the following error message (prompt shown as $):
```
$ xquery db:optimize("cap10_cap_store_prod")
[db:lock] Database 'cap10_cap_store_prod' is currently opened by another process.
```
However, this database shouldn’t be used by any process any longer. We created a new one with a different name.
`SHOW SESSIONS` did not reveal any sessions.
Reading the BaseX documentation (https://docs.basex.org/wiki/Transaction_Management#File-System_Locks), we found out that there are two kinds of file-system locks:
1. Update Operations
2. Database Locks
I did not find any upd.basex file in the data directory, so I assume it is a database lock.
Is there any way to remove this lock, like a force option of the optimize command or a dedicated command?
Is it possible to force close/kill the dangling session/process? As I said I’m pretty sure there isn’t any process accessing this database, but still the database is in the locked state. Probably the process and/or db was shut down in an ungraceful way. How can such a situation be handled? We have to assume that this could possibly happen again and need a strategy to deal with it.
I could also pin down the moment when it happened in the log file. There is a strange character sequence in the log file, and after that all the attempted operations fail:
```
103537 15:46:02.721 127.0.0.1:33919 admin OK Query executed in 3.36 ms. 3.42 ms
103538 15:46:02.724 127.0.0.1:40847 admin OK Query executed in 9.32 ms. 9.4 ms
103539 15:46:02.956 127.0.0.1:40847 admin REQUEST XQUERY db:open('cap10_cap_store_prod', 'dfade903-4365-44ce-a0dc-abbe474a7daa.cap') 0.08 ms
103540 15:46:02.962 127.0.0.1:40847 admin OK Query executed in 6.07 ms. 6.15 msadmin REQUEST ADD TO 003843e7-47ef-4a58-be8b-b7d6e665bbb1.cap [...] 0.11 ms
103542 16:13:26.583 127.0.0.1:40847 admin ERROR Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.0.1
Java: Debian, 11.0.9
OS: Linux, amd64
Stack Trace:
java.lang.RuntimeException: cap10_cap_store_prod: lock file does not exist.
  at org.basex.util.Util.notExpected(Util.java:61)
  at org.basex.data.DiskData.finishUpdate(DiskData.java:246)
  at org.basex.core.cmd.ACreate.update(ACreate.java:97)
  at org.basex.core.cmd.Add.run(Add.java:56)
  at org.basex.core.Command.run(Command.java:257)
  at org.basex.core.Command.execute(Command.java:93)
  at org.basex.core.Command.execute(Command.java:116)
  at org.basex.server.ClientListener.execute(ClientListener.java:343)
  at org.basex.server.ClientListener.add(ClientListener.java:314)
  at org.basex.server.ClientListener.run(ClientListener.java:96) 15.75 ms
```
The error `admin ERROR Improper use? Potential bug?` keeps repeating from then on. Do you see anything unusual, or can any insights be gained from these logs?
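To pin down the first failing entry without reading the log by hand, a small script can help. This is only a minimal sketch: it assumes the whitespace-separated layout visible in the excerpt above (an optional leading counter, then time, client address, user, entry type, and message), and the names `LogEntry`, `parse_line`, and `first_error` are hypothetical helpers, not part of BaseX.

```python
import re
from typing import Iterable, NamedTuple, Optional


class LogEntry(NamedTuple):
    time: str
    address: str
    user: str
    kind: str      # OK, REQUEST or ERROR
    message: str


# Assumed layout: [counter] time address user TYPE message...
LINE_RE = re.compile(
    r"^\s*(?:\d+\s+)?(\d{2}:\d{2}:\d{2}\.\d{3})\s+(\S+)\s+(\S+)\s+"
    r"(OK|REQUEST|ERROR)\s+(.*)$"
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None for lines that do not start a record
    (for example stack-trace continuation lines)."""
    match = LINE_RE.match(line)
    return LogEntry(*match.groups()) if match else None


def first_error(lines: Iterable[str]) -> Optional[LogEntry]:
    """Return the first record whose type is ERROR, if any."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.kind == "ERROR":
            return entry
    return None
```

Calling `first_error(open(path))` on a log file (with `path` pointing at the file in question) returns the earliest ERROR record; the records just before its timestamp are the ones worth inspecting.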
Any help in solving this issue is greatly appreciated.
Kind regards, Thomas
On 4 Jan 2021, at 9:20, Christian Grün wrote:
Hi Thomas,
> $ xquery db:optimize("cap10_cap_store_prod")
> [db:lock] Database 'cap10_cap_store_prod' is currently opened by another process.
>
> I did not find any upd.basex file in the data directory, so I assume it is a database lock.
Exactly. Usually, this error occurs only if you have multiple independent BaseX instances accessing the data at the same time (independent means that the instances are not clients communicating with a single database server). If the error does not disappear after a restart of your system, and if you have ensured that no other BaseX/Java process is running simultaneously, you could…
1. download a fresh copy of the most recent version of BaseX,
2. copy the problematic database into that instance,
3. start BaseX in debugging mode in the console (basex -d),
4. call OPTIMIZE and, if it fails,
5. share the command-line output with us.
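Steps 3 to 5 can be scripted so the debug output is captured in one go. The following is a minimal sketch, assuming the `basex` start script of the fresh installation is on the PATH and that the standalone client's `-d` (debugging) and `-c` (execute commands) options are used. The helper names and the conservative name check are assumptions for illustration, not BaseX's own rules.

```python
import re
import subprocess
from typing import List

# Conservative pattern for database names; an assumption for this sketch,
# not BaseX's exact naming rule.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def optimize_command(db: str, full: bool = False, basex: str = "basex") -> List[str]:
    """Build the argument list for a debug-mode OPTIMIZE run."""
    if not _SAFE_NAME.fullmatch(db):
        raise ValueError(f"unexpected database name: {db!r}")
    optimize = "OPTIMIZE ALL" if full else "OPTIMIZE"
    # -d toggles debugging output; -c executes the given command string.
    return [basex, "-d", "-c", f"OPEN {db}; {optimize}"]


def run_optimize(db: str, full: bool = False) -> str:
    """Run the command and return stdout and stderr together for sharing."""
    result = subprocess.run(
        optimize_command(db, full), capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr
```

For example, `print(run_optimize("cap10_cap_store_prod"))` would print whatever the console reports, including any stack trace, ready to paste into a reply.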
> We have to assume that this could possibly happen again and need a strategy to deal with it.

I hope it doesn’t (I haven’t encountered similar problems in our production environments so far). If a server is shut down ungracefully, we cannot promise that your data will remain intact, though. Do you have some more information on what happened that particular day?
It’s generally advisable to back up your data at regular intervals. You can e.g. register Job Services for that [1].
Hope this helps, Christian
Hi Christian,
we were able to solve the issue: a machine had been cloned for test purposes without me knowing about it. We’re using NFS to mount the storage that hosts the BaseX data directly, so there were two BaseX server processes accessing the same files.
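One way to guard against a second server silently starting on the same shared storage is an owner marker that a start-up wrapper creates atomically before launching the server. This is a minimal sketch and not a BaseX feature: the marker name `server.owner` and the function `exclusive_owner` are hypothetical, and it relies on `O_CREAT | O_EXCL`, which the Linux open(2) manual describes as supported on NFS only with NFSv3 or later. A marker left behind by a crash has to be removed by hand.

```python
import os
import socket
from contextlib import contextmanager


@contextmanager
def exclusive_owner(data_dir: str, name: str = "server.owner"):
    """Hold an owner marker in data_dir for the duration of the block."""
    path = os.path.join(data_dir, name)
    try:
        # O_CREAT | O_EXCL fails atomically if the marker already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        with open(path) as marker:
            owner = marker.read().strip()
        raise RuntimeError(
            f"{data_dir} is already claimed by {owner or 'unknown'}"
        )
    try:
        # Record who owns the directory so a second host can report it.
        with os.fdopen(fd, "w") as marker:
            marker.write(f"{socket.gethostname()}:{os.getpid()}\n")
        yield path
    finally:
        os.remove(path)
```

A wrapper would start the server inside `with exclusive_owner(data_dir):`, so that a cloned machine pointing at the same mount fails fast with the name of the host that already owns the directory.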
Thanks for your help!
Kind regards, Thomas
On 8 Jan 2021, at 9:24, Christian Grün wrote: