Hi Peter,

I recall seeing something similar when copying database folders from an MS Windows BaseX install to a Linux install.
In some cases I needed to run db:optimize [1] once before any other use of the database; otherwise a Java error would occur.

Now, I would always use db backup and restore to transfer databases between systems.
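
A minimal sketch of that transfer, assuming the standalone `basex` client is on the PATH of both systems and that the backup zip is written to (and read from) the BaseX data directory. The helper names are hypothetical; the only BaseX commands used are CREATE BACKUP and RESTORE.

```python
import subprocess


def basex_argv(command: str, executable: str = "basex") -> list[str]:
    """Build the argument list that runs one BaseX command string via -c."""
    return [executable, "-c", command]


def backup_database(name: str) -> None:
    """Source system: write a zip backup of the named database."""
    subprocess.run(basex_argv(f"CREATE BACKUP {name}"), check=True)


def restore_database(name: str) -> None:
    """Target system: restore the named database from its backup zip.

    Assumes the zip has already been copied into the target data directory.
    """
    subprocess.run(basex_argv(f"RESTORE {name}"), check=True)
```

Calling backup_database("Rainier05042023") on the Windows side, copying the resulting zip across, and then calling restore_database("Rainier05042023") on the Linux side would replace the raw folder copy.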

/Andy

[1] https://old.docs.basex.org/wiki/Database_Module#db:optimize

On Fri, 7 Mar 2025 at 08:10, Christian Grün <christian.gruen@gmail.com> wrote:
Hi Peter,

Can you reproduce the problem with the latest version (11.7) and the zip distribution of BaseX (e.g., without Docker)?

Best,
Christian



Peter Villadsen <Peter.Villadsen@microsoft.com> wrote on Fri, 7 Mar 2025, 05:00:

I did some more work to capture the relevant information for the two crashes.

 

As you recall, I build a container image on top of the official (but old) basex one. It copies the database into the right place in the container.

 

I added the -c switch to the basexhttp command when the container starts. Even so, the container does not have a data context set when the first operation involving the database happens. If I run a query that involves the open database:

 

<query>
   <text>count(/ada)</text>
</query>

 

I get:

 

Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
    at org.basex.data.Data.defaultNs(Data.java:270)
    at org.basex.query.expr.path.NameTest.noMatches(NameTest.java:60)
    at org.basex.query.expr.path.Step.optimize(Step.java:162)
    at org.basex.query.expr.path.Step.optimize(Step.java:134)
    at org.basex.query.expr.Preds.compile(Preds.java:59)
    at org.basex.query.expr.path.Path.lambda$compile$0(Path.java:139)
    at org.basex.query.CompileContext.get(CompileContext.java:165)
    at org.basex.query.expr.path.Path.compile(Path.java:134)
    at org.basex.query.expr.Arr.compile(Arr.java:47)
    at org.basex.query.scope.MainModule.comp(MainModule.java:81)
    at org.basex.query.QueryCompiler.compile(QueryCompiler.java:119)
    at org.basex.query.QueryCompiler.compile(QueryCompiler.java:106)
    at org.basex.query.QueryContext.compile(QueryContext.java:306)
    at org.basex.query.QueryProcessor.compile(QueryProcessor.java:79)
    at org.basex.core.cmd.AQuery.query(AQuery.java:91)
    at org.basex.core.cmd.XQuery.run(XQuery.java:22)
    at org.basex.core.Command.run(Command.java:257)
    at org.basex.http.rest.RESTCmd.run(RESTCmd.java:105)
    at org.basex.http.rest.RESTQuery.query(RESTQuery.java:69)
    at org.basex.http.rest.RESTQuery.run0(RESTQuery.java:37)
    at org.basex.http.rest.RESTCmd.run(RESTCmd.java:70)
    at org.basex.core.Command.run(Command.java:257)
    at org.basex.core.Command.execute(Command.java:93)
    at org.basex.core.Command.execute(Command.java:116)
    at org.basex.http.rest.RESTServlet.run(RESTServlet.java:32)
    at org.basex.http.BaseXServlet.service(BaseXServlet.java:65)

 

If I remove the -c flag and just let the container start (still with the database copied into place in the container), I get this trace when I try to do anything related to the database:

 

Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
        at org.basex.data.DiskData.write(DiskData.java:146)
        at org.basex.data.DiskData.close(DiskData.java:160)
        at org.basex.core.Datas.unpin(Datas.java:52)
        at org.basex.core.cmd.Close.close(Close.java:45)
        at org.basex.query.QueryResources.close(QueryResources.java:92)
        at org.basex.query.QueryContext.close(QueryContext.java:515)
        at org.basex.query.QueryProcessor.close(QueryProcessor.java:251)
        at org.basex.core.cmd.AQuery.query(AQuery.java:132)
        at org.basex.core.cmd.XQuery.run(XQuery.java:22)
        at org.basex.core.Command.run(Command.java:257)
        at org.basex.core.Command.execute(Command.java:93)
        at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
        at org.basex.api.client.Session.execute(Session.java:36)
        at org.basex.core.CLI.execute(CLI.java:92)
        at org.basex.core.CLI.execute(CLI.java:76)
        at org.basex.BaseX.console(BaseX.java:177)
        at org.basex.BaseX.<init>(BaseX.java:152)
        at org.basex.BaseX.main(BaseX.java:43)

 

I hope this is useful. Right now I am blocked.

 

Best Regards

 

Peter Villadsen.

 

 

From: Peter Villadsen
Sent: Tuesday, March 4, 2025 12:43 PM
To: Christian Grün <christian.gruen@gmail.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: RE: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...

 

Christian,

 

Yes, I have. Thank you for following up - I should have come back earlier.

 

I have been experimenting with this for a while now, and both container images (the official one and the newer quodatum 10.3 one) crash when I try to use them, through both HTTP and TCP. I am still looking into it. If I do not manage to find out what the issue is, I will upload the stack traces.

 

In both cases, I built my own container to include the database, so I can avoid the volumes and have the container be completely self-contained. The database is 19GB, so the container gets pretty big.

 

Here is how I start the container:

 

docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023

 

In my humble opinion it is unfortunate that the official container image has not been updated for at least 3 years. It would be nice to have the newest bits there, supported by BaseX.

 

Here is the dockerfile I use:

 

# escape=`

# Use the BaseX 10.3 image as the base image
FROM basex/basexhttp

# Copy the Windows database directory into the container so it is available
# when the container starts, without providing a --volume parameter.
# This is fine since the database is essentially read-only.

WORKDIR /srv/basex/data
COPY --chown=basex:basex Rainier05042023 "Rainier05042023/"

# The older versions of BaseX just use admin/admin.
# RUN echo "admin" | /srv/basex/bin/basex -cPASSWORD

# Modify the CMD command so that the Rainier05042023 database is opened
CMD /usr/local/bin/basexhttp -c "open Rainier05042023"

LABEL description="Legacy BaseX with Rainier05042023 database"

# Here is a build command that builds the image with the name rainier05042023:
#
# cd to the directory containing this Dockerfile and run the command:
# docker build -t rainier05042023 .
#
# When the image has been built it can be run with the name
# provided in the build command, i.e. rainier05042023. It can be saved
# to a file with the command:
#
# docker save -o Rainier05042023.tar rainier05042023
#
# and loaded with the command:
#
# docker load -i Rainier05042023.tar
#
# The container can be run with the command:
# docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
# The database can be accessed at http://localhost:8080/dba/

 

Best Regards

 

Peter Villadsen

 

From: Christian Grün <christian.gruen@gmail.com>
Sent: Tuesday, March 4, 2025 6:05 AM
To: Peter Villadsen <Peter.Villadsen@microsoft.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...

 

Hi Peter,

 

To be sure, could you confirm that you have received my mails?

 

Best regards,

Christian

 

 

On Sat, Feb 22, 2025 at 12:41PM Christian Grün <christian.gruen@gmail.com> wrote:

Hi Peter,

 

> This leads me to believe that a lot of the time (>7 seconds) may be spent opening the database each time a POST is done? Is there a way to tweak the HTTP server to “remember” the connection with the current database for a little while? This may be against the REST principles, of course. The database is guaranteed to be read-only in my case.

 

One option is to open the database with the initial basexhttp call. It will be kept open until the server is shut down:

 

  basexhttp -c"open name-of-db"

 

Best,

Christian

 

 

On Sat, Feb 15, 2025 at 8:53PM Peter Villadsen via BaseX-Talk <basex-talk@mailman.uni-konstanz.de> wrote:

All,

 

I have been using BaseX for a while, connecting to the TCP endpoint. I know the performance I typically get, and it is impressive! However, now I wanted to use the HTTP endpoint, and it seems the performance is at least 2 orders of magnitude worse!

 

Here is the query that I am POSTing to http://localhost:8984/rest/RainFnd_6.0.10.0

 

<query xmlns="http://basex.org/rest">
  <text>/Class[@Package='ApplicationPlatform']/@Name</text>
</query>

 

This simple query will generate around 1500 results from the 13GB database (RainFnd_6.0.10.0). It takes just over 7 seconds to do this. If I do this in the self-contained BaseX GUI, it takes around 20ms.

 

However, it seems that the time spent executing the query against the database is negligible. Please consider this query:

 

<query xmlns="http://basex.org/rest">
  <text>1 + 2</text>
</query>

 

Here there is obviously no database access, yet it takes almost the same amount of time as the query that accesses the database. 7 seconds to calculate 1 + 2 is too long.

 

If I post the 1 + 2 query to the endpoint without specifying the database on the URL:

 

http://localhost:8984/rest

 

it takes around 7 milliseconds, close to what I expected, certainly within expectations for the time spent sending the query over the wire and serializing etc.
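
A rough sketch of how these timings could be collected, assuming Python 3 with only the standard library, Basic authentication, and admin/admin credentials. The helper names rest_query_body and timed_post are hypothetical; the URLs and the query are the ones above.

```python
import base64
import time
import urllib.request
from xml.sax.saxutils import escape


def rest_query_body(xquery: str) -> bytes:
    """Wrap an XQuery string in the BaseX REST <query> envelope (XML-escaped)."""
    return (
        '<query xmlns="http://basex.org/rest"><text>'
        + escape(xquery)
        + "</text></query>"
    ).encode("utf-8")


def timed_post(url: str, xquery: str, user: str = "admin", password: str = "admin") -> float:
    """POST the query to the REST endpoint and return the elapsed seconds."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=rest_query_body(xquery),
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        response.read()  # include the time needed to receive the full result
    return time.perf_counter() - start
```

Comparing timed_post("http://localhost:8984/rest/RainFnd_6.0.10.0", "1 + 2") with timed_post("http://localhost:8984/rest", "1 + 2") isolates the cost of addressing the database in the URL from the cost of the query itself.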

 

This leads me to believe that a lot of the time (>7 seconds) may be spent opening the database each time a POST is done? Is there a way to tweak the HTTP server to “remember” the connection with the current database for a little while? This may be against the REST principles, of course. The database is guaranteed to be read-only in my case.

 

The problem is that this makes the HTTP server inappropriate for interactive applications. I can still use the TCP server, where I get the results I need, but using the HTTP server would be simpler and would need less client code to communicate with the server.

 

Please let me know if there is a way to accomplish acceptable performance with the HTTP server.

 

 

Best Regards

 

Peter Villadsen

Principal Technical Program Manager

Microsoft Business Applications Group