Hello,
I have a largish (5.4G) file with a full-text index that I am using to reconcile names in a local dataset. I've been experimenting with splitting the file into many smaller index files to improve performance. I group the entries by initial character and create a new index file for each distinct initial character. Each smaller file then gets its own full-text index.
I've been following the approach outlined in the documentation for custom index structures (https://docs.basex.org/wiki/Indexes#Custom_Index_Structures). Using prof:track, I've measured the following timings for different uses of ft:search.
(Here, $db refers to the 5.4G file, and $index refers to a smaller 159MB subindex. Times are averaged across 10 runs of 1000 iterations for each expression.)
1. Direct lookup against the large index
   Time: 23 ms
   Expression: ft:search($db, $text)/../..
2. Direct lookup against the subindex
   Time: 3.3 ms
   Expression: ft:search($index, $text)/../..
3. Lookup against the subindex file with reference to the large index
   Time: 2.9 ms
   Expression: let $s := ft:search($index, $text)/../.. return db:open-id($db, $s/id)/../..
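For context, a measurement of this kind could be sketched roughly as follows. This is only an illustration under assumptions: the database names 'names' and 'names-a' and the search term 'smith' are placeholders rather than the real values, and the loop shows expression 2 only.

```xquery
(: Sketch only: 'names', 'names-a' and 'smith' are placeholder values. :)
let $db    := 'names'     (: the large 5.4G database; used by expressions 1 and 3 :)
let $index := 'names-a'   (: one of the smaller subindex databases :)
let $text  := 'smith'     (: search term :)
let $times :=
  for $run in 1 to 10
  (: prof:track returns a map; its 'time' entry is in milliseconds :)
  return prof:track(
    for $i in 1 to 1000
    return ft:search($index, $text)/../..
  )?time
(: mean time per iteration, averaged across the 10 runs :)
return avg($times) div 1000
```

Swapping the inner expression for one of the other two gives the corresponding timing.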
My question is: why would the third expression be slightly faster (or at least no slower) than the second one, given that it performs an additional lookup with db:open-id?
Thanks in advance,
Tim