Hi Marco,
The PARALLEL feature in BaseX mostly exists to ensure that a single operation does not block other users from accessing our server. There are some cases in which parallel operations speed up overall processing (such as running HTTP requests, or doing simple computations, as you did in your last example), but in most real-life cases, you won't benefit much from splitting up your operations into multiple threads.
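To illustrate the HTTP case with a minimal sketch (the URLs are placeholders, and it assumes the xquery:fork-join function of the XQuery Module), independent requests can be sent in parallel like this:

(: Sketch: the two URLs are placeholders. xquery:fork-join evaluates the
   supplied zero-argument functions in parallel and returns their results. :)
xquery:fork-join(
  for $uri in ('http://example.com/a', 'http://example.com/b')
  return function() {
    http:send-request(<http:request method='get'/>, $uri)
  }
)

Each request spends most of its time waiting for the remote server, so running such calls concurrently can actually pay off.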
BaseX is built upon Java, and even in pure Java, it can be difficult to write code that is considerably faster when run in parallel. Indeed, many users were disappointed that the (at that time) new streaming features of Java 8 didn't lead to a considerable performance boost (see e.g. [1]). Even if your machine has multiple cores, the CPU is a critical bottleneck, as all the code and data (from caches, main memory, external sources) need to be pushed through it. The simpler your code is, the easier it will be for the compilers to rewrite it into simple and compact machine code; if you have a Java and an XQuery processor in between, simple code that could be written in 10 bytes of assembler code will require 100, 1,000 or even more bytes.
Apart from that, the Java runtime is getting smarter and smarter. Even if you don't run multiple transactions in parallel, the JRE will utilize more than one of your cores: in the beginning, it will be the just-in-time compiler; later on, it will be the garbage collector. As a consequence, your own (non-trivial) parallel algorithms will always compete with the built-in optimizations of Java.
However, I’ll give you some questions in return that might help us to optimize your code:
1. Is most of the time spent on parsing, compiling, evaluating or printing (see the detailed info view for this information)? If it's printing, the size of your result might be the bottleneck. If it's parsing or compiling, you might want to think about rewriting your code (and e.g. removing code from your imported modules that will never be executed).
2. Do you process any external data in your code? If yes, which external resources are affected? Does the data come from local files, or is it generated and dropped within your code?
3. What would you guess: how much main memory is required for running your algorithms (>10 MB? >100 MB?)? If it's a lot, it might help to shrink down your data structures.
4. How much time is spent by a single one of the complex operations that you want to optimize (see the sketch below)? Various milliseconds, seconds, or even minutes?
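To get numbers for the last question, here is a minimal sketch; it assumes the prof:time function of the Profiling Module and reuses the computation from your test function [2] as a stand-in for one of your own operations. prof:time evaluates its argument and reports the elapsed time in the query info:

(: Sketch: wrap a single expensive call in prof:time to measure its evaluation time.
   The fold-left computation is taken from the test function in [2]. :)
prof:time(
  fold-left(1 to 10000000, 0, function($acc, $next) { $acc + math:sqrt($next) })
)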
Hope this helps,
Christian
[1] https://stackoverflow.com/questions/20375176/should-i-always-use-a-parallel-...
On Tue, Sep 17, 2019 at 5:51 PM Marco Lettere m.lettere@gmail.com wrote:
Dear all,
On one of our applications, based on RESTXQ services running on BaseX 9.0.2, we experience a strange behaviour related to concurrency.
It is a stateless service running on a server with http-local set to true. So no database access. We removed all accesses to sessions and the code doesn't even write to files or other external resources. It's really only computation.
When I run multiple requests, I can clearly see that they are accepted simultaneously (up to 8, which matches PARALLEL). They even complete out of order, according to the session id reported in the log [1].
The strange thing is: performance seems to be as if the calls were executed serially. If T is the time to complete one call, then the pattern is T + T + T + T + T.
I have to add that the algorithm is rather complex. Lots of functions (possibly inlined, private and the main one even recursive) which filter and transform a large XML request into a possibly large XML output.
If I simplify the called RESTXQ function to something like [2], then I get a log like [3], which shows the pattern I'm expecting, T + d + d + d + d, where d is just a slight overhead (orders of magnitude smaller than T).
My question is: what do I have to check in order to understand where the hard synchronization of the calls occurs? Might this be related to non-deterministic functions? What else?
I've tried to isolate parts of the code but I'm unable to get predictable results.
Thank you very much for all the hints you will be able to provide!
Regards,
Marco.
[1]
17:10:19.204 127.0.0.1:58038 admin REQUEST [POST] http://localhost:8081/rate-selector/select-matching-rates
17:10:19.204 127.0.0.1:58040 admin REQUEST [POST] http://localhost:8081/rate-selector/select-matching-rates
17:10:19.205 127.0.0.1:58042 admin REQUEST [POST] http://localhost:8081/rate-selector/select-matching-rates
17:10:19.205 127.0.0.1:58044 admin REQUEST [POST] http://localhost:8081/rate-selector/select-matching-rates
17:10:19.208 127.0.0.1:58046 admin REQUEST [POST] http://localhost:8081/rate-selector/select-matching-rates
17:10:21.325 127.0.0.1:58040 admin 200 2121.01 ms
17:10:23.174 127.0.0.1:58046 admin 200 3965.74 ms
17:10:25.035 127.0.0.1:58038 admin 200 5830.63 ms
17:10:26.889 127.0.0.1:58044 admin 200 7683.76 ms
17:10:28.739 127.0.0.1:58042 admin 200 9534.46 ms
[2]
(:~
 : Test concurrency
 :)
declare
  %rest:path("/test")
  %rest:POST("{$body}")
function rs:test($body as node()) {
  <a>{$body}{
    fold-left(1 to 10000000, 0, function($acc, $next) { $acc + math:sqrt($next) })
  }</a>
};
[3]
17:48:02.782 127.0.0.1:48262 admin REQUEST [POST] http://localhost:8081/rate-selector/test
17:48:02.783 127.0.0.1:48252 admin REQUEST [POST] http://localhost:8081/rate-selector/test
17:48:02.783 127.0.0.1:48258 admin REQUEST [POST] http://localhost:8081/rate-selector/test
17:48:02.784 127.0.0.1:48260 admin REQUEST [POST] http://localhost:8081/rate-selector/test
17:48:02.784 127.0.0.1:48254 admin REQUEST [POST] http://localhost:8081/rate-selector/test
17:48:04.428 127.0.0.1:48262 admin 200 1646.03 ms
17:48:04.509 127.0.0.1:48260 admin 200 1725.45 ms
17:48:04.710 127.0.0.1:48254 admin 200 1926.27 ms
17:48:04.772 127.0.0.1:48258 admin 200 1988.41 ms
17:48:04.904 127.0.0.1:48252 admin 200 2120.33 ms