Gary,

A remark on this topic, since you are worried about performance with 20 GB of data in BaseX and might be interested in a user's experience.
Once you have succeeded in inserting the data, simple XPath queries over such a collection should run in sub-millisecond time.


2013/12/6 jean-marc Mercier <jeanmarc.mercier@gmail.com>
Hi Gary,

Maybe you should wait for advice from the BaseX team members on this topic.

Speaking from my own experience, I remember also running into problems when trying to insert 20 GB of data into BaseX through the GUI.

I had to do this in smaller steps, using batch commands (see http://docs.basex.org/wiki/Commands), in order to free memory: I first inserted the XML in 4 or 5 passes, then created the indexes, etc. That way it was possible to run the insertion even on machines with 4 GB of memory.
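
To illustrate the multi-pass approach, here is a minimal sketch in Python that generates one BaseX command script per pass plus a final indexing script. The helper names (split_into_passes, build_scripts), the database name and the file names are hypothetical. Running each script in a separate BaseX session to release memory between passes is an assumption, not something confirmed for every setup.

```python
from typing import List


def split_into_passes(files: List[str], passes: int) -> List[List[str]]:
    """Split the file list into at most `passes` contiguous chunks."""
    if passes < 1:
        raise ValueError("passes must be >= 1")
    if not files:
        return []
    size = -(-len(files) // passes)  # ceiling division
    return [files[i:i + size] for i in range(0, len(files), size)]


def build_scripts(db: str, files: List[str], passes: int) -> List[str]:
    """Return one BaseX command script per pass, then one indexing script.

    Assumption: paths contain no spaces or other characters that would
    need quoting in a BaseX command.
    """
    scripts = []
    for n, chunk in enumerate(split_into_passes(files, passes)):
        lines = []
        if n == 0:
            # Disable index creation so the inserts themselves stay cheap.
            lines += ["SET TEXTINDEX false", "SET ATTRINDEX false",
                      f"CREATE DB {db}"]
        else:
            lines.append(f"OPEN {db}")
        lines += [f"ADD {path}" for path in chunk]
        lines.append("CLOSE")
        scripts.append("\n".join(lines))
    # Indexes are built only after all data has been inserted.
    scripts.append("\n".join([f"OPEN {db}", "CREATE INDEX TEXT",
                              "CREATE INDEX ATTRIBUTE", "CLOSE"]))
    return scripts


if __name__ == "__main__":
    demo = build_scripts("bigdata", [f"part{i}.xml" for i in range(1, 6)], 2)
    for script in demo:
        print(script, end="\n\n")
```

Each generated script can be saved to its own file and executed one after the other. If the indexing step still runs out of memory, the two CREATE INDEX commands could be moved into separate scripts as well.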

Hope this helps

Jean-Marc

2013/12/6 Huband, Gary W *HS <GWH2SJ@hscmail.mcc.virginia.edu>
I still get an out-of-main-memory error when I try to create attribute indexes.

I originally added the XML data with no indexes.

Gary

________________________________
From: jean-marc Mercier [jeanmarc.mercier@gmail.com]
Sent: Friday, December 06, 2013 9:37 AM
To: Huband, Gary W *HS
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] BaseX Evaluation for Big Data

I meant I am NOT a BaseX team member! Sorry for the typo!


2013/12/6 jean-marc Mercier <jeanmarc.mercier@gmail.com>
Hi Gary

I am a BaseX team member, but have you increased your JVM heap size to run "Big Data"?

See the BaseX/basex*.bat start scripts. I am currently using set VM=-Xmx6g (a maximum Java heap of 6 GB).

Hope this helps

Jean-Marc


2013/12/6 Huband, Gary W *HS <GWH2SJ@hscmail.mcc.virginia.edu>

I'm evaluating BaseX for use in a project that involves tens of TB of XML data. Our goal is to be able to return a small amount of data from a query within a few seconds. Is BaseX capable of handling this much data?

My initial evaluation with 1 GB of data met our requirements. But when I evaluated BaseX with about 20 GB of data, I got an out-of-memory message when running a query.

I'm running Windows 7 Pro on a Core i7 with 8 GB of memory, and using the BaseX GUI.

Thanks for your time

Gary

_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk