Hi Julia,
The XML files I'm using are also spread across different directories, one for each day. I create one database for each of these directories, each containing between several hundred and 6000 documents. Usually, queries need to take into account only one day, so the search space is considerably smaller than for one large database. If access to documents of more than one day is required, you can create the range of database identifiers (supposing there is an algorithm to do so) and then open each database in a for loop, thus getting a sequence of top nodes. Also, for more complicated queries, I can choose to (re)create some of the databases in memory in order to improve performance. Since creating a database of this size takes just a couple of seconds, this is feasible, whereas it would possibly not be if all the documents were in one huge database.
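
A minimal sketch of the "range of database identifiers" step, written in Python purely for illustration. It assumes one database per day named with a hypothetical `docs-YYYY-MM-DD` scheme; the prefix, the date format and the function name are made up for this example and are not taken from my actual setup. Opening each database is left to whatever XML database you use.

```python
from datetime import date, timedelta


def database_names(first_day, last_day, prefix="docs-"):
    """Return one database name per day, from first_day to last_day inclusive.

    The "docs-YYYY-MM-DD" naming scheme is an assumption for illustration only.
    """
    if last_day < first_day:
        return []
    day_count = (last_day - first_day).days + 1
    return [
        prefix + (first_day + timedelta(days=offset)).strftime("%Y-%m-%d")
        for offset in range(day_count)
    ]


# Names of the per-day databases a multi-day query would have to open in a loop.
for name in database_names(date(2013, 2, 27), date(2013, 3, 2)):
    print(name)
```

The resulting list is what the for loop would iterate over, opening one database per name and collecting the top nodes into a single sequence.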
Kind regards, Goetz