[basex-talk] TR: Adding documents slows over time

23 Sep 2014


-----Original Message-----
From: Fabrice Etanchaud
Sent: Tuesday, 23 September 2014 18:00
To: 'Christian Grün'
Subject: RE: [basex-talk] Adding documents slows over time
Dear Christian,
In our old tests, we found that in a collection with several million documents, opening the collection or replacing a document took a very long time.
In the latest snapshot, could you tell us how to use the index on document names?
Given 10,000,000 documents named $i.xml, each containing <xml>{$i}</xml>, we found that a lookup through the text index is about 470x faster than one through the document index:
Compiling:
- pre-evaluating (7000001 to 7001000)
Query:
for $i in 7000001 to 7001000 return db:open('docs', xs:string($i) || '.xml')
Optimized Query:
for $i_0 in (7000001 to 7001000) return db:open("docs", fn:concat($i_0 cast as xs:string, ".xml"))
Result:
- Hit(s): 1000 Items
- Updated: 0 Items
- Printed: 19500 Bytes
- Read Locking: local [docs]
- Write Locking: none
Timing:
- Parsing: 0.91 ms
- Compiling: 0.24 ms
- Evaluating: 68514.39 ms
- Printing: 1.61 ms
- Total Time: 68517.16 ms
Compiling:
- pre-evaluating (7000001 to 7001000)
Query:
for $i in 7000001 to 7001000 return db:text('docs', xs:string($i))/root()
Optimized Query:
for $i_0 in (7000001 to 7001000) return db:text("docs", $i_0 cast as xs:string)/fn:root()
Result:
- Hit(s): 1000 Items
- Updated: 0 Items
- Printed: 19500 Bytes
- Read Locking: local [docs]
- Write Locking: none
Timing:
- Parsing: 2.62 ms
- Compiling: 0.23 ms
- Evaluating: 143.72 ms
- Printing: 1.59 ms
- Total Time: 148.16 ms
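
As a quick check of the ratio quoted above, the two "Evaluating" timings can simply be divided. This is only a sketch: the variable names are illustrative and the numbers are copied from the two reports, not measured again.

```xquery
(: Evaluating times from the two query-info reports above, in milliseconds. :)
(: Variable names are illustrative only. :)
let $doc-lookup  := 68514.39  (: 1000 lookups via db:open and the document path :)
let $text-lookup := 143.72    (: 1000 lookups via db:text and the text index :)
return round($doc-lookup div $text-lookup)
```

The result is in line with the roughly 470x difference reported; using the total times instead of the evaluation times gives a slightly smaller ratio.
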
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: Tuesday, 23 September 2014 16:34
To: Fabrice Etanchaud
Cc: Marco Lettere; basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Adding documents slows over time
Hi Fabrice,
...
> If you update your collection per document, you can use the replace
> command instead of XQuery Update and avoid the limitations of the pending update list.
I would be interested to hear what limitations you have observed so far.
...
> Christian, from what I read in the last exchanges, the document index
> is now a persistent data structure?
Exactly. After it has been requested for the first time, it will additionally be stored on disk and updated incrementally. I would be interested to have your feedback on the latest snapshot.
Christian
