Is there a way to specify how many parallel processes BaseX 11 is allowed to spawn (I am not using the database)? In combination with GNU parallel, it becomes difficult to control the overall degree of parallelism.
Best, Giuseppe
Hi Giuseppe - Hope you're well. Did you try adding the `parallel` option to xquery:fork-join? See https://docs.basex.org/main/XQuery_Functions#xquery:fork-join - if not specified, then BaseX will be greedy :)
Best,
Bridger
On Fri, Dec 6, 2024, 7:36 AM celano@informatik.uni-leipzig.de wrote:
Is there a way to specify how many parallel processes BaseX 11 is allowed to spawn (I am not using the database)? In combination with GNU parallel, it becomes difficult to control the overall degree of parallelism.
Best, Giuseppe
Hi Bridger,
I am not using xquery:fork-join. If I understand correctly, BaseX automatically parallelizes some operations on nodes. If the same script is then run in parallel on different files, all of my cores end up busy.
Best, Giuseppe
Giuseppe -
Interesting! I'm not sure about the parallelized node operations, but I also confess that I'm not sure if you're wanting to keep all of your cores busy, or if you're wanting to restrain BaseX from using too many cores. If you're leveraging GNU parallel to control a BaseX instance, I assume you're using `-j|--jobs` to set a limit on the number of BaseXes.. BaseXen? BaseXeses... :) launched.
Would you be able to share an example of how you're approaching your problem? Someone may be able to give suggestions based on it.
Best, Bridger
Hi Bridger,
GNU parallel has --jobs, but each BaseX process is then free to spawn as many processes as it likes. The code is simple:
find . -name '*tok01.xml' | parallel --progress --jobs 10 ../basex114/bin/basex -bfile={} myscript.xq
Best, Giuseppe
-- Universität Leipzig Institute of Computer Science Augustusplatz 10 04109 Leipzig Deutschland celano@informatik.uni-leipzig.de
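One way to read the `find ... | parallel` command is as a core budget: `--jobs` caps the number of BaseX JVMs, while each JVM sizes its own internal threads. The sketch below is only an illustration under assumptions that nobody in this thread has confirmed: that the extra load comes from JVM-internal threads that scale with the number of visible processors, that the `bin/basex` start script passes the `BASEX_JVM` environment variable on to `java`, and that the JVM in use understands `-XX:ActiveProcessorCount` (Java 10 or later). The names `jobs_for_budget`, `TOTAL_CORES`, `CORES_PER_JVM` and `RUN` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: how many BaseX JVMs fit into a core budget
# if each JVM is told that it may see only `per_jvm` processors.
jobs_for_budget() {
  total=$1
  per_jvm=$2
  jobs=$((total / per_jvm))
  # Always allow at least one job, even if the budget is too small.
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Defaults; both values can be overridden from the environment.
TOTAL_CORES=${TOTAL_CORES:-$(getconf _NPROCESSORS_ONLN)}
CORES_PER_JVM=${CORES_PER_JVM:-2}
JOBS=$(jobs_for_budget "$TOTAL_CORES" "$CORES_PER_JVM")

# Assumption: bin/basex adds $BASEX_JVM to the java command line.
BASEX_JVM="-XX:ActiveProcessorCount=$CORES_PER_JVM"
export BASEX_JVM

# Nothing is launched unless RUN=1 is set, so the script can be
# sourced or dry-run without BaseX or GNU parallel being installed.
if [ "${RUN:-0}" = 1 ]; then
  find . -name '*tok01.xml' |
    parallel --progress --jobs "$JOBS" \
      ../basex114/bin/basex -bfile={} myscript.xq
fi
```

With these assumptions, `RUN=1 TOTAL_CORES=20 CORES_PER_JVM=2 sh run.sh` would start at most 10 BaseX processes, each told that it has 2 processors. Whether this actually reduces the observed load depends on where BaseX spends its threads, so it is worth checking with a tool such as `top` before relying on it.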
Hi Giuseppe,
If you use the client/server architecture, or any other approach that ensures that a single JVM instance is used, you can decrease the number of parallel transactions [1] and, additionally, enable fair locking to also consider non-blocking transactions [2].
Best, Christian
[1] https://docs.basex.org/main/Options#parallel
[2] https://docs.basex.org/main/Options#fairlock
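A minimal sketch of the single-JVM setup described above, under several assumptions that are not stated in the thread: that the distribution ships `bin/basexserver` and `bin/basexclient` next to `bin/basex`, that the global options `PARALLEL` and `FAIRLOCK` can be supplied as `-Dorg.basex.NAME=value` system properties through `BASEX_JVM`, and that `basexclient` accepts the same `-b` variable binding as `basex`. The helper `server_jvm_opts`, the limit of 4, the `RUN` guard and the `BASEX_PASSWORD` variable are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: JVM system properties for the server process.
# PARALLEL limits the number of parallel transactions [1];
# FAIRLOCK enables fair locking [2].
server_jvm_opts() {
  max_parallel=$1
  echo "-Dorg.basex.PARALLEL=$max_parallel -Dorg.basex.FAIRLOCK=true"
}

BASEX_BIN=${BASEX_BIN:-../basex114/bin}
MAX_PARALLEL=${MAX_PARALLEL:-4}

# Nothing is launched unless RUN=1 is set.
if [ "${RUN:-0}" = 1 ]; then
  # One server JVM handles every query; -S starts it in the background.
  BASEX_JVM=$(server_jvm_opts "$MAX_PARALLEL") "$BASEX_BIN/basexserver" -S

  # Queries are evaluated by the server, so absolute paths are passed
  # to avoid depending on the server's working directory.
  find "$PWD" -name '*tok01.xml' |
    parallel --progress --jobs 10 \
      "$BASEX_BIN/basexclient" -Uadmin -P"$BASEX_PASSWORD" \
      -bfile={} myscript.xq

  "$BASEX_BIN/basexserver" stop
fi
```

In this arrangement GNU parallel may still start 10 clients at once, but the server would evaluate at most `MAX_PARALLEL` transactions concurrently and queue the rest. Each client is still a small JVM of its own, so the saving lies in the query evaluation, not in process startup.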
basex-talk@mailman.uni-konstanz.de