Hello again!

Creating a database without indices from 28 GB of XML files using the BaseX GUI worked on my computer and took only three and a half hours. To be safe, I gave it 8 GB of RAM. Right now BaseX consumes 6 GB with the GUI, so I suppose it would never have worked on my V-Server (only 4 GB RAM) anyway.

Thanks for your kind help!

 ~ Mathias

2011/6/28 Christian Grün <christian.gruen@gmail.com>
Hi Mathias,

> As you suggested, I tried using the "new" command. I wasn't successful so
> far, because I encountered a number of other problems during the process.
> Since the db creation process takes several hours overall with this amount
> of data, it took equally long for some of the errors/problems to surface
> (invalid filenames or contents of some files).

True; you need to ensure that all XML documents are well-formed. You
could use xmllint or similar tools to find and remove those files in
advance.
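
A minimal sketch of such a pre-check, written in Python instead of
xmllint. The function name find_malformed and the "*.xml" file pattern
are assumptions for illustration; it only tests well-formedness and does
not validate against a DTD.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(folder):
    """Return (path, error message) pairs for XML files that fail to parse."""
    bad = []
    # Assumption: all documents use the .xml extension.
    for path in sorted(Path(folder).rglob("*.xml")):
        try:
            # iterparse reads the file incrementally; clearing each element
            # after it has been parsed keeps memory use low for large files.
            for _event, elem in ET.iterparse(path):
                elem.clear()
        except ET.ParseError as err:
            bad.append((path, str(err)))
    return bad
```

Running find_malformed on the import folder before "create db" lists
the files to fix or move away, so the errors show up without waiting
for a multi-hour import. Note that this parser does not load external
DTDs, so documents relying on entities declared there will probably be
reported as errors as well.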

> Nevertheless today I got it running without OutOfMemoryExceptions or other
> printed errors. Unfortunately though, when I executed the "create db OAI
> [folder]" command in the BaseXClient (over ssh on my server) it obviously
> never finished.

That's quite unusual behavior; I guess that too many URLs are
resolved again and again, which might take lots and lots of time. I'd
advise setting the intparse flag to true (set intparse on; create db
..., or Database -> New -> Parsing -> Use Internal Parser), or
deactivating DTD parsing. If you need DTD handling, e.g. to
resolve entities, you could also specify a Catalog Resolver
(http://docs.basex.org/wiki/Catalog_Resolver). Once more, I recommend
using the latest snapshot; this will simplify tracking down the cause
of the problem.

Feel free to ask for more,
Christian