Hello again!
Hi Mathias,
> As you suggested I tried using the "new" command. I wasn't successful so
> far, because I encountered a number of other problems during the process.
> Since overall the db creation process lasts several hours with these amounts
> of data, the time till some of the errors/problems surfaced was equally
> long. (Invalid filenames or contents of some files)

True; you need to ensure that all XML documents are well-formed. You
might as well use xmllint or similar tools to remove those files in
advance.
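
If xmllint is not at hand, a pre-check along these lines might do. This
is only a minimal sketch: it swaps in Python's standard expat-based
parser for xmllint, and the function name find_malformed as well as the
*.xml file filter are assumptions made for illustration, not anything
BaseX prescribes.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(folder):
    """Return (path, error message) pairs for *.xml files that fail to parse."""
    malformed = []
    # Assumption: the input documents carry an .xml suffix.
    for path in sorted(Path(folder).rglob("*.xml")):
        try:
            # Stream through the document and clear each finished element
            # to keep memory use low on large files.
            for _, elem in ET.iterparse(path):
                elem.clear()
        except ET.ParseError as err:
            malformed.append((path, str(err)))
    return malformed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one line per rejected file so it can be moved or deleted.
    for path, message in find_malformed(sys.argv[1]):
        print(f"{path}: {message}")
```

The script only reports files; it does not delete anything. Note that
documents relying on entities declared in an external DTD may be
reported as well, since this parser does not fetch external DTDs.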

> Nevertheless today I got it running without OutOfMemoryExceptions or other
> printed errors. Unfortunately though, when I executed the "create db OAI
> [folder]" command in the BaseXClient (over ssh on my server) it obviously
> never finished.

That's quite unusual behavior; I guess that too many URLs are
resolved again and again, which might take lots and lots of time. I'd
advise setting the intparse flag to true (set intparse on; create db
..., or Database -> New -> Parsing -> Use Internal Parser), or
deactivating DTD parsing. If you need DTD handling, e.g. to
resolve entities, you could also specify a Catalog Resolver
(http://docs.basex.org/wiki/Catalog_Resolver). Once more, I recommend
using the latest snapshot; this will simplify tracking down the cause
of the problem.

Feel free to ask for more,
Christian