Hi Christian,

thanks a lot for your advice.

Following it brought some further insights.

I backed up the database and then tried to optimize it, both without and with the ALL option.

However, this led to the following error message (prompt shown as $):

$ xquery db:optimize("cap10_cap_store_prod")
[db:lock] Database 'cap10_cap_store_prod' is currently opened by another process.

However, this database shouldn’t be used by any process any longer. We created a new one with a different name.

SHOW SESSIONS did not reveal any sessions.

Reading the BaseX documentation (https://docs.basex.org/wiki/Transaction_Management#File-System_Locks), we found out that there are two kinds of file-system locks:

  1. Update Operations
  2. Database Locks

I did not find any upd.basex file in the data directory, so I assume it is a database lock.
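
For anyone who wants to repeat this check, here is a minimal Python sketch of how the lookup could be scripted. It assumes that each database lives in its own subdirectory of the data directory. The function name and the example paths are hypothetical and not part of BaseX; only the file name upd.basex comes from the documentation.

```python
from pathlib import Path
from typing import Optional


def find_update_marker(data_dir: str, db_name: str,
                       marker: str = "upd.basex") -> Optional[Path]:
    """Return the path of the update marker file if it exists, else None.

    Assumption: the database files are stored in <data_dir>/<db_name>/.
    """
    candidate = Path(data_dir) / db_name / marker
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Hypothetical data directory; adjust to the local installation.
    hit = find_update_marker("/srv/basex/data", "cap10_cap_store_prod")
    print(hit if hit else "no upd.basex found")
```

If the function returns None, no update marker is present in that directory, which matches what I saw by hand.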

Is there any way to remove this lock, like a force option of the optimize command or a dedicated command?

Is it possible to force close/kill the dangling session/process? As I said I’m pretty sure there isn’t any process accessing this database, but still the database is in the locked state. Probably the process and/or db was shut down in an ungraceful way. How can such a situation be handled? We have to assume that this could possibly happen again and need a strategy to deal with it.

I could also pin down the moment when it happened in the log file. There is a strange character sequence in the log file, and after that all the attempted operations fail:

103537 15:46:02.721    127.0.0.1:33919 admin   OK      Query executed in 3.36 ms.      3.42 ms
 103538 15:46:02.724    127.0.0.1:40847 admin   OK      Query executed in 9.32 ms.      9.4 ms
 103539 15:46:02.956    127.0.0.1:40847 admin   REQUEST XQUERY db:open('cap10_cap_store_prod', 'dfade903-4365-44ce-a0dc-abbe474a7daa.cap')      0.08 ms
 103540 15:46:02.962    127.0.0.1:40847 admin   OK      Query executed in 6.07 ms.      6.15 ms
 103541 ^@^@^@^@^@^@^@^@[...]^@^@^@^@16:13:26.567    127.0.0.1:40847 admin   REQUEST ADD TO 003843e7-47ef-4a58-be8b-b7d6e665bbb1.cap [...]   0.11 ms
 103542 16:13:26.583    127.0.0.1:40847 admin   ERROR   Improper use? Potential bug? Your feedback is welcome: Contact: basex-talk@mailman.uni-konstanz.de Version: BaseX 9.0.1 Java: Debian, 11.0.9 OS: Linux, amd64 Stack Trace: java.lang.RuntimeException: cap10_cap_store_prod: lock file does not exist. at org.basex.util.Util.notExpected(Util.java:61) at org.basex.data.DiskData.finishUpdate(DiskData.java:246) at org.basex.core.cmd.ACreate.update(ACreate.java:97) at org.basex.core.cmd.Add.run(Add.java:56) at org.basex.core.Command.run(Command.java:257) at org.basex.core.Command.execute(Command.java:93) at org.basex.core.Command.execute(Command.java:116) at org.basex.server.ClientListener.execute(ClientListener.java:343) at org.basex.server.ClientListener.add(ClientListener.java:314) at org.basex.server.ClientListener.run(ClientListener.java:96)   15.75 ms

The error “admin ERROR Improper use? Potential bug?” keeps repeating from then on. Do you see anything unusual, or can any insights be gained from these logs?
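
To find such spots in other log files, a small script can help. The following is a minimal Python sketch that assumes the ^@ characters shown above are NUL bytes (that is how many viewers render them). The function name and the log path are hypothetical.

```python
from typing import List, Tuple


def find_nul_lines(log_path: str) -> List[Tuple[int, int]]:
    """Return (line_number, nul_count) for every line containing NUL bytes.

    The file is read in binary mode so that the NUL bytes are not
    altered or rejected by text decoding.
    """
    hits = []
    with open(log_path, "rb") as fh:
        for number, raw in enumerate(fh, start=1):
            count = raw.count(b"\x00")
            if count:
                hits.append((number, count))
    return hits


if __name__ == "__main__":
    # Hypothetical log file name; adjust to the local log directory.
    for number, count in find_nul_lines("2020-12-29.log"):
        print(f"line {number}: {count} NUL byte(s)")
```

The reported line numbers count physical lines in the file, so they can be compared with the timestamps of the neighbouring entries to narrow down when the damage happened.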

Any help in solving this issue is greatly appreciated.

Kind regards,
Thomas

On 4 Jan 2021, at 9:20, Christian Grün wrote:

Hi Thomas,

Thanks for your mail.

As you have already reported back to us, an inconsistency in your
database seems to trigger the exception. It’s true that the INSPECT
command cannot catch all possible inconsistencies in a database. You
can try to…

1. export all files from your database and import them into the new instance,
2. optimize your database,
3. fully optimize your database (with the ALL option). Just to be
sure, it may be wise to create a backup before the optimization.

If you have some vague idea of when the bug happened first, you could
scan the database logs from that period of time and the days before.

I didn’t encounter a similar error in the recent past, so maybe it was
caused by a previous version of BaseX. It’s always a good idea to
switch to the latest version. Do you remember which version you
started with?

Best,
Christian


On Wed, Dec 30, 2020 at 2:26 PM Thomas Spitaler
<tommy.spitaler@gmail.com> wrote:

Hi,

in one of our projects we are using BaseX. So far we are very happy with
our choice, but recently we got an error that we could not fix. The only
solution that worked was to create a new, empty database.

When trying to add a document to the database, we get an exception with
the message “lock file does not exist”:

$ open cap10_cap_store_prod
Database 'cap10_cap_store_prod' was opened in 219.99 ms.

$ add /home/thomas/tmp/test.xml
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.0.1
Java: Debian, 11.0.9
OS: Linux, amd64
Stack Trace:
java.lang.RuntimeException: cap10_cap_store_prod: lock file does not exist.
at org.basex.util.Util.notExpected(Util.java:61)
at org.basex.data.DiskData.finishUpdate(DiskData.java:246)
at org.basex.core.cmd.ACreate.update(ACreate.java:97)
at org.basex.core.cmd.Add.run(Add.java:56)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:196)
at org.basex.BaseX.<init>(BaseX.java:171)
at org.basex.BaseX.main(BaseX.java:42)


Which lock file is the error referring to? Can we create it manually to
make it work again?

The error only happens in the database that we have been using so far
(cap10_cap_store_prod). After creating and using a new one
(cap10_cap_store_prod2) the problem does not occur. Here is the output
of running list on the database:

$ list

Name                   Resources  Size        Input Path

cap10_cap_store_prod   233387     1583797639
cap10_cap_store_prod2  241        1912414
testdb                 1          4611

3 database(s).

Did we hit a size limit of BaseX? I checked the server and it has enough
free disk space and free inodes available.

Some more background info:

The system runs on a cluster of servers. BaseX runs on one of them,
bound to port 1984. The other nodes in the cluster connect to the
BaseX-Server via the network. In my understanding, this type of
concurrency is supported by BaseX.

The software on the nodes is implemented in Elixir. We wrote our own
Elixir client library that is a thin wrapper around
https://github.com/zadean/basexerl, an existing open source Erlang
client library for BaseX.

Our Elixir client library keeps open a pool of connections to the BaseX
server. Until we encountered the above error, everything worked
perfectly. After creating a new database, everything works fine again.

This error also occurs after shutting down all the nodes that connect to
our BaseX instance. We checked using the lsof command and there are no
connections open to the database other than the client we use to
reproduce the error.

Any help would be greatly appreciated!

Kind regards,
Thomas
