In the latest snapshot, could you tell us how to use the index on document names?
The index should be created automatically after having run your first path-based query; subsequent queries should give you better results.
Given 10,000,000 documents named $i.xml, each containing <xml>{$i}</xml>, we found that a lookup via the text index is roughly 470x faster than one via the document index:
Compiling:
- pre-evaluating (7000001 to 7001000)
Query:
for $i in 7000001 to 7001000 return db:open('docs', xs:string($i) || '.xml')
Optimized Query:
for $i_0 in (7000001 to 7001000) return db:open("docs", fn:concat($i_0 cast as xs:string, ".xml"))
Result:
- Hit(s): 1000 Items
- Updated: 0 Items
- Printed: 19500 Bytes
- Read Locking: local [docs]
- Write Locking: none
Timing:
- Parsing: 0.91 ms
- Compiling: 0.24 ms
- Evaluating: 68514.39 ms
- Printing: 1.61 ms
- Total Time: 68517.16 ms
Compiling:
- pre-evaluating (7000001 to 7001000)
Query:
for $i in 7000001 to 7001000 return db:text('docs', xs:string($i))/root()
Optimized Query:
for $i_0 in (7000001 to 7001000) return db:text("docs", $i_0 cast as xs:string)/fn:root()
Result:
- Hit(s): 1000 Items
- Updated: 0 Items
- Printed: 19500 Bytes
- Read Locking: local [docs]
- Write Locking: none
Timing:
- Parsing: 2.62 ms
- Compiling: 0.23 ms
- Evaluating: 143.72 ms
- Printing: 1.59 ms
- Total Time: 148.16 ms
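
For anyone who wants to try the comparison locally, here is a minimal sketch that builds a scaled-down version of the data set. The size of 1000 documents is an assumption chosen for a quick run, not the 10,000,000 used for the timings above, and it relies on the text index being enabled at creation time (which I believe is the default):

```xquery
(: Sketch only: creates a small 'docs' database in which every document
   is named <n>.xml and contains <xml><n></xml>.
   The range 1 to 1000 is an assumed size for a quick local test. :)
let $ids := 1 to 1000
return db:create(
  'docs',
  $ids ! <xml>{ . }</xml>,   (: one input node per document :)
  $ids ! (. || '.xml')       (: matching document paths :)
)
```

Since db:create is an updating function, the two lookup queries shown above have to be run afterwards as separate queries, with the range adjusted to the smaller database (for example `1 to 1000`).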
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: Tuesday, 23 September 2014 16:34
To: Fabrice Etanchaud
Cc: Marco Lettere; basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Adding documents slows over time
Hi Fabrice,
If you update your collection one document at a time, you can use the REPLACE command instead of XQuery Update and avoid the limitations of the pending update list.
I would be interested to hear which limitations you have observed so far.
Christian, from what I read in the last exchanges, the document index is now a persistent data structure?
Exactly. After it has been requested for the first time, it will additionally be stored on disk and updated incrementally. I would be interested in your feedback on the latest snapshot.
Christian