Hi Julia,
The XML files I'm using are also in different directories, one for each day.
I create one database for each of these directories, each containing between
several hundred and up to 6000 documents. Usually, queries need to take into
account only one day, so the search space is considerably smaller than for
one large database. If access to documents of more than one day is required,
you can create the range of database identifiers (supposing there is an
algorithm to do so) and then open each database in a for loop, thus getting
a sequence of top nodes.
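
A minimal sketch of the "range of database identifiers" step, written in
Python purely for illustration. The naming scheme (a "docs-" prefix followed
by the ISO date) and both function names are assumptions, not my actual
setup. The generated query string assumes the XQuery processor resolves each
database name through the standard collection() function; your processor may
need a different function to open a database.

```python
from datetime import date, timedelta


def database_names(first_day, last_day, prefix="docs-"):
    """Return one database name per day, first_day to last_day inclusive.

    The prefix and the ISO date suffix are a hypothetical naming scheme.
    """
    if last_day < first_day:
        return []
    day_count = (last_day - first_day).days + 1
    return [
        f"{prefix}{(first_day + timedelta(days=i)).isoformat()}"
        for i in range(day_count)
    ]


def build_query(names):
    """Build an XQuery string that opens each database in a for loop.

    Assumes collection() accepts a database name; adjust for your processor.
    """
    quoted = ", ".join(f'"{name}"' for name in names)
    return f"for $name in ({quoted}) return collection($name)"


names = database_names(date(2013, 2, 27), date(2013, 3, 1))
print(names)
print(build_query(names))
```

The query returns the document nodes of all listed databases as one
sequence, so the rest of the query can stay the same as for a single day.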
Also, for more complicated queries, I can choose to (re)create some of the
databases in memory in order to improve performance. Creating a database of
this size takes just a couple of seconds, so this is cheap to do. It would
probably not be feasible if all the documents were in one huge database.
Kind regards,
Goetz