Hi Lizzi,
Thanks for the information!
And thanks back for the details.
When using OPTIMIZE, it is not clear what caused the out-of-memory error. With individual CREATE INDEX statements, I ran into the out-of-memory error on the FULLTEXT index.
I see; in both cases, it must be the full-text index. As you have discovered in the documentation, we write partial index structures to disk once main memory is exhausted. This works very well for the text and attribute indexes (you can usually index 10 GB of data with 100 MB of main memory), but it is becoming increasingly clear that the corresponding merge algorithms need to be improved and better adapted to the full-text index.
Did you try selective indexing (i.e., limiting the indexed full-text nodes to the ones that will eventually be queried)?
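For illustration, a minimal sketch of what that could look like, assuming the FTINCLUDE option from the Selective Indexing part of the documentation ('abstract' is just a placeholder for the elements you actually query):

    SET FTINCLUDE abstract
    CREATE INDEX FULLTEXT

That way, only text nodes inside the listed elements end up in the full-text index, which can shrink it considerably.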
I have not tried incremental indexing with UPDINDEX or AUTOINDEX. My understanding from the documentation is that UPDINDEX does not update the full-text index, and that incremental indexing should be turned off to improve the speed of bulk imports.
Completely true; in your case, it doesn’t really help.
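(For reference, the bulk-import pattern you describe would be a sketch along these lines, with database name and path as placeholders:

    SET UPDINDEX false
    CREATE DB mydb /path/to/documents
    CREATE INDEX FULLTEXT

i.e., import everything first and build the indexes in one go at the end.)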
Today I benchmarked ADD vs. REPLACE and did not see much difference in speed.
Once REPLACE is called, additional metadata structures will be created that need to be maintained, so it could be that you will need to start from scratch with a new database.
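To make the difference concrete, a minimal sketch (both paths are placeholders): ADD simply appends a new document, while REPLACE first looks up an existing document under the given database path and swaps it:

    ADD TO docs/a.xml /local/a.xml
    REPLACE docs/a.xml /local/a.xml

It is the bookkeeping for that lookup that introduces the extra structures mentioned above.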
Today I found the section on Index Performance in the documentation (http://docs.basex.org/wiki/Indexes#Performance). This section mentions “If main memory runs out while creating a value index, the current index structures will be partially written to disk and eventually merged.” Does this mean that if running OPTIMIZE ALL ends with an out-of-memory error, running OPTIMIZE as many times as needed will eventually update all of the indexes?
Once you run out of memory, indexing will be interrupted and needs to be started again from scratch, so repeated OPTIMIZE runs will not incrementally complete the indexes.
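As a practical workaround, you can give the JVM more heap before optimizing; a sketch assuming the standard startup scripts (which pass the BASEX_JVM environment variable on to Java) and a placeholder database name:

    export BASEX_JVM="-Xmx6g"
    basex -c "OPEN mydb; OPTIMIZE ALL"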
Hope this helps,
Christian