Hi Simon,
I haven't gotten to optimizing queries yet. Doing some simple aggregate queries like count(//), it looks like the query time scales linearly with the number of documents in the database.
This is true, and it depends a lot on the types of queries you are performing:
• If your database is fully optimized (by running the 'optimize' command or the 'db:optimize' function on it), many count() calls can be evaluated in constant time, because the database statistics are used for the evaluation.
• If you enable the 'updindex' option for a database, its value indexes are kept up-to-date after updates, and queries such as //*[text() = 'abc'] can be evaluated via the index [1]. You can check the InfoView panel [2] to see how a query was compiled.
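As a minimal sketch of the first point, assuming a database with the hypothetical name 'docs' already exists, a full optimization can be triggered from XQuery like this:

```xquery
(: Sketch only: 'docs' is a hypothetical database name. :)
(: Rebuilds the index structures and statistics of the database;
   the second argument requests a full optimization. :)
db:optimize('docs', true())
```

Afterwards, you could run something like count(collection('docs')//*) and look at the query info to see whether the count was actually rewritten. Whether that happens depends on the query, so it is worth checking rather than assuming.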
Please note that databases in BaseX are pretty light-weight. For example, you can access more than one database in a single XQuery expression. If you have free time slots for maintenance, you could work with one or more static databases that are completely indexed and optimized, and one incremental database for new documents. During maintenance, you can move the new documents into the static database pool and re-optimize it, so that they benefit from all query optimizations as well.
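A query against such a setup might look roughly as follows. The database names 'static' and 'incoming' are made up for illustration, and the two paths are written separately so that each one can be optimized on its own:

```xquery
(: Sketch only: 'static' and 'incoming' are hypothetical database names. :)
(
  (: optimized, fully indexed pool of older documents :)
  collection('static')//*[text() = 'abc'],
  (: small incremental database that receives new documents :)
  collection('incoming')//*[text() = 'abc']
)
```

The lookup on the incremental database may be evaluated without an index, but as long as that database stays small, this should not matter much.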
Hope this helps, Christian
[1] http://docs.basex.org/wiki/Indexes
[2] http://docs.basex.org/wiki/Graphical_User_Interface#Visualizations