Re: [basex-talk] Large memory basex instances

25 Apr 2015


Hi Christian, Christophe and all.
Since 2013 we have been developing middleware for parallel XQuery
processing over huge XML data. We are currently evaluating it with BaseX
on a cluster. For example, in standalone mode we have queries that do not
complete on a desktop machine (4 GB RAM, -Xmx 2 GB). The same queries took
approximately 20 hours on a single cluster node (16 GB RAM, -Xmx 10 GB);
the final result is ~2 GB.
In our preliminary experiments, the query processing time was reduced by
almost 80% with our middleware (scenario with 8 nodes, -Xmx 2 GB). We used
a 1.0 GB XMark benchmark database, and next we will try real databases of
5 GB or more. In all cases we focus on ad-hoc, high-cost queries (with
joins, aggregate functions etc.), and we did not concern ourselves with
the JVM behavior.
In short, I think you need to adopt a partitioning strategy (we recommend
virtual rather than physical partitioning) and distribute the processing
overhead. Of course, this applies only if you have a distributed
environment available and can treat the JVM and DBMSX as a black box.
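
To make "virtual partitioning" a bit more concrete, here is a minimal
sketch, not our middleware's actual code. It assumes the items selected by
a path expression can be split by position into contiguous ranges, one per
node, without physically splitting the database. The function names, the
example path and the item count are hypothetical.

```python
def virtual_partitions(total_items, workers):
    """Split the 1-based positions 1..total_items into contiguous ranges."""
    base, extra = divmod(total_items, workers)
    ranges = []
    start = 1
    for i in range(workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more workers than items: skip empty ranges
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def sub_queries(path, total_items, workers):
    """Build one XQuery string per range, using a positional predicate."""
    return [
        f"for $x in ({path})[position() = {lo} to {hi}] return $x"
        for lo, hi in virtual_partitions(total_items, workers)
    ]


if __name__ == "__main__":
    # Hypothetical XMark-style path and item count, split across 3 nodes.
    for query in sub_queries("/site/people/person", 10, 3):
        print(query)
```

Each sub-query would be sent to a different node holding the same
database, and the partial results combined afterwards. Joins and
aggregates need an extra merge step that this sketch does not cover.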
Kind regards,
-- 
Luiz Matos
Federal Fluminense University, Brazil

> Hi Christophe,
>
> Just a short reply (maybe someone else can give you some more profound
> feedback). I can't tell really much about J2EE servers in productive
> use. However, I would be interested to hear what is the main reason in
> your setup for the large memory consumption. Do you think there is
> some chance to speed up or optimize the queries you are evaluating?
>
> Best,
> Christian
