All,
I have been using BaseX for a while, connecting to the TCP endpoint. I know the performance I typically get, and it is impressive! However, now I wanted to use the HTTP endpoint, and it seems the performance is at least 2 orders of magnitude worse!
Here is the query that I am POSTing to http://localhost:8984/rest/RainFnd_6.0.10.0
<query xmlns="http://basex.org/rest">
  <text>/Class[@Package='ApplicationPlatform']/@Name</text>
</query>
This simple query generates around 1500 results from the 13GB database (RainFnd_6.0.10.0). It takes just over 7 seconds. If I run it in the self-contained BaseX GUI, it takes around 20ms.
However, it seems that the time spent executing the query against the database is negligible. Please consider this query:
<query xmlns="http://basex.org/rest">
  <text>1 + 2</text>
</query>
Here there is obviously no database access, yet it takes almost the same amount of time as the query that accesses the database. 7 seconds to calculate 1 + 2 is too long.
If I post the 1 + 2 query to the endpoint without specifying the database in the URL, it takes around 7 milliseconds, close to what I expected and certainly within expectations for the time spent sending the query over the wire, serializing, etc.
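The comparison described above can be reproduced with a small timing script. This is only a sketch under assumptions: the database-less URL http://localhost:8984/rest, the admin/admin credentials, and the helper names build_query_body and time_post are illustrative and not taken from the original setup.

```python
import base64
import time
import urllib.request
from xml.sax.saxutils import escape

REST_NS = "http://basex.org/rest"


def build_query_body(xquery):
    """Wrap an XQuery expression in the REST <query/> envelope used above."""
    body = f'<query xmlns="{REST_NS}"><text>{escape(xquery)}</text></query>'
    return body.encode("utf-8")


def time_post(url, xquery, user="admin", password="admin"):
    """POST the query once and return (elapsed seconds, response text).

    The credentials are an assumption; adjust them to the server's setup.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=build_query_body(xquery),
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        text = response.read().decode("utf-8")
    return time.perf_counter() - start, text


if __name__ == "__main__":
    base = "http://localhost:8984/rest"  # assumed URL without a database
    for url in (base, base + "/RainFnd_6.0.10.0"):
        seconds, result = time_post(url, "1 + 2")
        print(f"{url}: {seconds * 1000:.1f} ms -> {result!r}")
```

If the second URL is consistently slower for the same trivial query, the difference is the per-request cost of addressing the database rather than query evaluation.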
This leads me to believe that a lot of the time (>7 seconds) may be spent opening the database each time a POST is done? Is there a way to tweak the HTTP server to "remember" the connection with the current database for a little while? This may be against the REST principles, of course. The database is guaranteed to be read-only in my case.
The problem is that this makes the HTTP server inappropriate for interactive applications. I can still use the TCP server, where I get the results I need, but using HTTP would be simpler and would need less code to communicate with the server.
Please let me know if there is a way to accomplish acceptable performance with the HTTP server.
Best Regards
Peter Villadsen
Principal Technical Program Manager
Microsoft Business Applications Group
Hi Peter,
> This leads me to believe that a lot of the time (>7 seconds) may be spent opening the database each time a POST is done? Is there a way to tweak the HTTP server to “remember” the connection with the current database for a little while? This may be against the REST principles, of course. The database is guaranteed to be read-only in my case.
One option is to open the database with the initial basexhttp call. It will be kept open until the server is shut down:
basexhttp -c"open name-of-db"
Best, Christian
On Sat, Feb 15, 2025 at 8:53 PM Peter Villadsen via BaseX-Talk <basex-talk@mailman.uni-konstanz.de> wrote:
[...]
Hi Peter,
To be sure, could you confirm that you have received my mails?
Best regards, Christian
On Sat, Feb 22, 2025 at 12:41 PM Christian Grün <christian.gruen@gmail.com> wrote:
[...]
Christian,
Yes, I have. Thank you for following up - I should have come back earlier.
I have been experimenting with this for a while now, and both container images (the official one and the newer 10.3 one from quodatum) crash when I try to use them, through HTTP as well as TCP. I am still looking into it. If I do not manage to find out what the issue is, I will upload the stack traces.
In both cases, I built my own container to include the database, so I can avoid the volumes and have the container be completely self-contained. The database is 19GB, so the container gets pretty big.
Here is how I start the container:
docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
In my humble opinion it is unfortunate that the official container image has not been updated for at least 3 years. It would be nice to have the newest bits there, supported by BaseX.
Here is the Dockerfile I use:
# escape=`
# Use the BaseX 10.3 image as the base image
FROM basex/basexhttp

# Copy the Windows database directory into the container so it is available
# when the container starts, without providing a --volume parameter.
# This is fine since the database is essentially read-only.
WORKDIR /srv/basex/data
COPY --chown=basex:basex Rainier05042023 "Rainier05042023/"

# The older versions of BaseX just use admin/admin.
# RUN echo "admin" | /srv/basex/bin/basex -cPASSWORD

# Modify the CMD command so that the Rainier05042023 database is opened
CMD /usr/local/bin/basexhttp -c "open Rainier05042023"

LABEL description="Legacy BaseX with Rainier05042023 database"

# Here is a build command that builds the container with the name Rainier05042023:
#
# cd to the directory containing this Dockerfile and run the command:
# docker build -t rainier05042023 .
#
# When the docker container has been built it can be run with the name
# provided in the build command i.e. rainier05042023. It can be saved
# to a file with the command:
#
# docker save -o Rainier05042023.tar rainier05042023
#
# and loaded with the command:
#
# docker load -i rainier05042023.tar
#
# The container can be run with the command:
# docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
# The database can be accessed at http://localhost:8080/dba/
Best Regards
Peter Villadsen
From: Christian Grün <christian.gruen@gmail.com>
Sent: Tuesday, March 4, 2025 6:05 AM
To: Peter Villadsen <Peter.Villadsen@microsoft.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...
[...]
I did some more work to capture the relevant information for the two crashes.
As you recall, I build a container image on top of the official (but old) basex one. It copies the database into the right place in the container.
I added the -c switch to the basexhttp command that runs when the container starts. Even so, the container does not have a data context set when the first operation involving the database happens. If I run a query that involves the open database:
<query>
  <text>count(/ada)</text>
</query>
I get:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
  at org.basex.data.Data.defaultNs(Data.java: 270)
  at org.basex.query.expr.path.NameTest.noMatches(NameTest.java: 60)
  at org.basex.query.expr.path.Step.optimize(Step.java: 162)
  at org.basex.query.expr.path.Step.optimize(Step.java: 134)
  at org.basex.query.expr.Preds.compile(Preds.java: 59)
  at org.basex.query.expr.path.Path.lambda$compile$0(Path.java: 139)
  at org.basex.query.CompileContext.get(CompileContext.java: 165)
  at org.basex.query.expr.path.Path.compile(Path.java: 134)
  at org.basex.query.expr.Arr.compile(Arr.java: 47)
  at org.basex.query.scope.MainModule.comp(MainModule.java: 81)
  at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 119)
  at org.basex.query.QueryCompiler.compile(QueryCompiler.java: 106)
  at org.basex.query.QueryContext.compile(QueryContext.java: 306)
  at org.basex.query.QueryProcessor.compile(QueryProcessor.java: 79)
  at org.basex.core.cmd.AQuery.query(AQuery.java: 91)
  at org.basex.core.cmd.XQuery.run(XQuery.java: 22)
  at org.basex.core.Command.run(Command.java: 257)
  at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 105)
  at org.basex.http.rest.RESTQuery.query(RESTQuery.java: 69)
  at org.basex.http.rest.RESTQuery.run0(RESTQuery.java: 37)
  at org.basex.http.rest.RESTCmd.run(RESTCmd.java: 70)
  at org.basex.core.Command.run(Command.java: 257)
  at org.basex.core.Command.execute(Command.java: 93)
  at org.basex.core.Command.execute(Command.java: 116)
  at org.basex.http.rest.RESTServlet.run(RESTServlet.java: 32)
  at org.basex.http.BaseXServlet.service(BaseXServlet.java: 65)
If I remove the -c flag and just let the container start (still with the database copied into place in the container), I get this trace when I try to do anything related to the database:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
  at org.basex.data.DiskData.write(DiskData.java:146)
  at org.basex.data.DiskData.close(DiskData.java:160)
  at org.basex.core.Datas.unpin(Datas.java:52)
  at org.basex.core.cmd.Close.close(Close.java:45)
  at org.basex.query.QueryResources.close(QueryResources.java:92)
  at org.basex.query.QueryContext.close(QueryContext.java:515)
  at org.basex.query.QueryProcessor.close(QueryProcessor.java:251)
  at org.basex.core.cmd.AQuery.query(AQuery.java:132)
  at org.basex.core.cmd.XQuery.run(XQuery.java:22)
  at org.basex.core.Command.run(Command.java:257)
  at org.basex.core.Command.execute(Command.java:93)
  at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
  at org.basex.api.client.Session.execute(Session.java:36)
  at org.basex.core.CLI.execute(CLI.java:92)
  at org.basex.core.CLI.execute(CLI.java:76)
  at org.basex.BaseX.console(BaseX.java:177)
  at org.basex.BaseX.<init>(BaseX.java:152)
  at org.basex.BaseX.main(BaseX.java:43)
I hope this is useful. Right now I am blocked.
Best Regards
Peter Villadsen.
From: Peter Villadsen
Sent: Tuesday, March 4, 2025 12:43 PM
To: Christian Grün <christian.gruen@gmail.com>
Cc: basex-talk@mailman.uni-konstanz.de
Subject: RE: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...
[...]
Hi Peter,
Can you reproduce the problem with the latest version (11.7) and the zip distribution of BaseX (e.g., without Docker)?
Best, Christian
Peter Villadsen <Peter.Villadsen@microsoft.com> wrote on Fri, Mar 7, 2025, 05:00:
[...]
On Fri, 7 Mar 2025, Andy Bunce bunce.andy@gmail.com wrote:
Hi Peter,
I recall seeing something similar when copying database folders from an MS Windows BaseX install to a Linux install. In some cases I needed to run db:optimize [1] once before any other use of the database; otherwise a Java error would occur.
Nowadays I always use database backup and restore to transfer databases between systems.
/Andy
[1] https://old.docs.basex.org/wiki/Database_Module#db:optimize
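The backup-and-restore route Andy describes could be scripted roughly as below. This is a minimal sketch, not a tested procedure: it assumes the `basex` launcher is on the PATH, that the BaseX commands CREATE BACKUP and RESTORE and the XQuery function db:optimize behave as documented for your version, and that the backup archive is copied by hand from the source data directory to the target data directory between the two steps. The helper names are hypothetical.

```python
import subprocess


def basex_args(command, executable="basex"):
    """Build the argument list for one `basex -c "<command>"` call.

    `executable` is an assumption: adjust it if the launcher is not on PATH.
    """
    return [executable, "-c", command]


def backup_args(database):
    """Arguments that create a backup of `database` on the source system."""
    return basex_args(f"CREATE BACKUP {database}")


def restore_args(database):
    """Arguments that restore the most recent backup of `database`."""
    return basex_args(f"RESTORE {database}")


def optimize_args(database):
    """Arguments that run db:optimize once, as suggested for copied folders."""
    return basex_args(f"XQUERY db:optimize('{database}')")


def run(args):
    """Execute one prepared command and fail loudly on a non-zero exit code."""
    subprocess.run(args, check=True)
```

On the Windows side one would call `run(backup_args("Rainier05042023"))`, copy the resulting archive into the data directory of the Linux install, and then call `run(restore_args("Rainier05042023"))` there. `optimize_args` covers the alternative of keeping the copied folder and optimizing it once before first use.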
On Fri, 7 Mar 2025 at 08:10, Christian Grün christian.gruen@gmail.com wrote:
Hi Peter,
Can you reproduce the problem with the latest version (11.7) and the zip distribution of BaseX (e.g., without Docker)?
Best, Christian
On Fri, 7 Mar 2025 at 05:00, Peter Villadsen Peter.Villadsen@microsoft.com wrote:
I did some more work to capture the relevant information for the two crashes.
As you recall, I build a container image on top of the official (but old) basex one. It copies the database into the right place in the container.
I added the -c switch to the basexhttp command that runs when the container starts. Even then, the container does not have the data context set when the first operation involving the database happens. If I run a query that involves the open database:
<query>
<text>count(/ada)</text>
</query>
I get:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
at org.basex.data.Data.defaultNs(Data.java:270)
at org.basex.query.expr.path.NameTest.noMatches(NameTest.java:60)
at org.basex.query.expr.path.Step.optimize(Step.java:162)
at org.basex.query.expr.path.Step.optimize(Step.java:134)
at org.basex.query.expr.Preds.compile(Preds.java:59)
at org.basex.query.expr.path.Path.lambda$compile$0(Path.java:139)
at org.basex.query.CompileContext.get(CompileContext.java:165)
at org.basex.query.expr.path.Path.compile(Path.java:134)
at org.basex.query.expr.Arr.compile(Arr.java:47)
at org.basex.query.scope.MainModule.comp(MainModule.java:81)
at org.basex.query.QueryCompiler.compile(QueryCompiler.java:119)
at org.basex.query.QueryCompiler.compile(QueryCompiler.java:106)
at org.basex.query.QueryContext.compile(QueryContext.java:306)
at org.basex.query.QueryProcessor.compile(QueryProcessor.java:79)
at org.basex.core.cmd.AQuery.query(AQuery.java:91)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.http.rest.RESTCmd.run(RESTCmd.java:105)
at org.basex.http.rest.RESTQuery.query(RESTQuery.java:69)
at org.basex.http.rest.RESTQuery.run0(RESTQuery.java:37)
at org.basex.http.rest.RESTCmd.run(RESTCmd.java:70)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.core.Command.execute(Command.java:116)
at org.basex.http.rest.RESTServlet.run(RESTServlet.java:32)
at org.basex.http.BaseXServlet.service(BaseXServlet.java:65)
If I remove the -c flag and just let the container start (still with the database copied into place in the container), I get this trace when I try to do anything related to the database:
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 9.6 RC1
Java: IcedTea, 1.8.0_212
OS: Linux, amd64
Stack Trace:
java.lang.NullPointerException
at org.basex.data.DiskData.write(DiskData.java:146)
at org.basex.data.DiskData.close(DiskData.java:160)
at org.basex.core.Datas.unpin(Datas.java:52)
at org.basex.core.cmd.Close.close(Close.java:45)
at org.basex.query.QueryResources.close(QueryResources.java:92)
at org.basex.query.QueryContext.close(QueryContext.java:515)
at org.basex.query.QueryProcessor.close(QueryProcessor.java:251)
at org.basex.core.cmd.AQuery.query(AQuery.java:132)
at org.basex.core.cmd.XQuery.run(XQuery.java:22)
at org.basex.core.Command.run(Command.java:257)
at org.basex.core.Command.execute(Command.java:93)
at org.basex.api.client.LocalSession.execute(LocalSession.java:132)
at org.basex.api.client.Session.execute(Session.java:36)
at org.basex.core.CLI.execute(CLI.java:92)
at org.basex.core.CLI.execute(CLI.java:76)
at org.basex.BaseX.console(BaseX.java:177)
at org.basex.BaseX.<init>(BaseX.java:152)
at org.basex.BaseX.main(BaseX.java:43)
I hope this is useful. Right now I am blocked.
Best Regards
Peter Villadsen.
*From:* Peter Villadsen *Sent:* Tuesday, March 4, 2025 12:43 PM *To:* Christian Grün christian.gruen@gmail.com *Cc:* basex-talk@mailman.uni-konstanz.de *Subject:* RE: [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...
Christian,
Yes, I have. Thank you for following up - I should have come back earlier.
I have been experimenting with this for a while now, and both container images (the official one and the newer quodatum 10.3 one) crash when I try to use them, through HTTP as well as TCP. I am still looking into it. If I do not manage to find out what the issue is, I will upload the stack traces.
In both cases, I built my own container to include the database, so I can avoid the volumes and have the container be completely self-contained. The database is 19GB, so the container gets pretty big.
Here is how I start the container:
docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
In my humble opinion it is unfortunate that the official container image has not been updated for at least 3 years. It would be nice to have the newest bits there, supported by BaseX.
Here is the dockerfile I use:
# escape=`
# Use the BaseX 10.3 image as the base image
FROM basex/basexhttp
# Copy the Windows database directory into the container so it is available
# when the container starts, without providing a --volume parameter.
# This is fine since the database is essentially read-only.
WORKDIR /srv/basex/data
COPY --chown=basex:basex Rainier05042023 "Rainier05042023/"
# The older versions of BaseX just use admin/admin.
# RUN echo "admin" | /srv/basex/bin/basex -cPASSWORD
# Modify the CMD command so that the Rainier05042023 database is opened
CMD /usr/local/bin/basexhttp -c "open Rainier05042023"
LABEL description="Legacy BaseX with Rainier05042023 database"
# Here is a build command that builds the container with the name rainier05042023:
#
# cd to the directory containing this Dockerfile and run the command:
# docker build -t rainier05042023 .
#
# When the docker container has been built it can be run with the name
# provided in the build command i.e. rainier05042023. It can be saved
# to a file with the command:
#
# docker save -o Rainier05042023.tar rainier05042023
#
# and loaded with the command:
#
# docker load -i Rainier05042023.tar
#
# The container can be run with the command:
# docker run -d -e BASEX_JVM=-Xmx19G -p 8080:8080 -p 1984:1984 -p 8984:8984 rainier05042023
# The database can be accessed at http://localhost:8080/dba/
Best Regards
Peter Villadsen
*From:* Christian Grün christian.gruen@gmail.com *Sent:* Tuesday, March 4, 2025 6:05 AM *To:* Peter Villadsen Peter.Villadsen@microsoft.com *Cc:* basex-talk@mailman.uni-konstanz.de *Subject:* [EXTERNAL] Re: [basex-talk] HTTP server performance seems very slow...
Hi Peter,
To be sure, could you confirm that you have received my mails?
Best regards,
Christian
On Sat, Feb 22, 2025 at 12:41 PM Christian Grün < christian.gruen@gmail.com> wrote:
Hi Peter,
> This leads me to believe that a lot of the time (>7 seconds) may be spent opening the database each time a POST is done? Is there a way to tweak the HTTP server to “remember” the connection with the current database for a little while? This may be against the REST principles, of course. The database is guaranteed to be read-only in my case.
One option is to open the database with the initial basexhttp call. It will be kept open until the server is shut down:
basexhttp -c"open name-of-db"
Best,
Christian
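To check whether opening the database at startup actually removes the per-request delay, the REST calls from the original report can be timed from a small client. The following is a minimal sketch using only the Python standard library. It assumes the REST service listens at http://localhost:8984/rest as in the original post, that HTTP basic authentication with the admin/admin credentials mentioned elsewhere in this thread is accepted, and that the server takes a `<query>` envelope in the http://basex.org/rest namespace. The function names are hypothetical.

```python
import base64
import time
import urllib.request
from xml.sax.saxutils import escape

REST_NS = "http://basex.org/rest"


def build_query_body(xquery):
    """Wrap an XQuery string in the <query><text>...</text></query> envelope.

    The query text is XML-escaped so that characters such as < and & stay valid.
    """
    xml = f'<query xmlns="{REST_NS}"><text>{escape(xquery)}</text></query>'
    return xml.encode("utf-8")


def timed_post(url, xquery, user="admin", password="admin", timeout=60.0):
    """POST the query to `url` and return (response text, elapsed seconds).

    The credentials are an assumption; change them for your installation.
    """
    request = urllib.request.Request(
        url,
        data=build_query_body(xquery),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return text, time.perf_counter() - start
```

Calling `timed_post("http://localhost:8984/rest", "1 + 2")` and `timed_post("http://localhost:8984/rest/RainFnd_6.0.10.0", "1 + 2")` one after the other, before and after starting the server with the -c option, should show whether the difference between the two URLs goes away.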
All,
Thanks all for your help. I managed to get things working, see below.
I have never seen the errors I mentioned below outside of the docker container. Desktop versions have always worked correctly.
The failing version copied the binary database that was created on the desktop into the container. This did not produce reliable results, for reasons unknown. I did not experiment with the db:optimize() feature, because I decided to change my approach to one that better matches how the data is generated: that process produces a directory structure of XML data that is then added to a database, so I decided to let the dockerfile do that work. I ended up with this dockerfile:
# escape=`
# Use the legacy basex image as the base image
FROM basex/basexhttp
# Copy the XML root of the documents to the tmp directory,
# so they can be imported as part of a later step.
WORKDIR /tmp/xmls
COPY --chown=basex:basex ExportedRainier05042023 "./"
ENV BASEX_JVM=-Xmx30g
# Now add the content of this directory to a database called db.
RUN basex -c "CREATE DB db"
RUN basex -c "OPEN db; ADD TO db /tmp/xmls"
RUN basex -c "CLOSE"
# Get rid of the directory with the XMLs
RUN rm -rf /tmp/xmls
# Modify the CMD command so that the newly created database is opened
CMD /usr/local/bin/basexhttp -c "OPEN db"
LABEL description="Legacy BaseX with open database"
This has solved the problem for me for now. It took around an hour to build the container, which is quite big, around 40GB.
Initially I did a sequence of many
RUN basex -c "CREATE DB db"
RUN basex -c "OPEN db; ADD TO db /tmp/xmls/MyDirectory"
RUN basex -c "CLOSE"
for each of the directories in the xmls directory. I thought this would be an advantage, but it was not, since the system did not release memory after the database was closed and the next import was due to start. The problem disappeared when I changed the build to process everything at once, with a large memory allowance.
I have learned a lot about docker containers.
Thanks again.
Best Regards
Peter Villadsen