index usage while processing large xml file(s) - BaseX-Talk - mailman.uni-konstanz.de

13 Mar 2013


      Hi,
    I'm currently processing a 3GB xml file. Below is the Query that is
being executed and the corresponding output
count(db:open("ASEPXML",
"A_SEPXML")/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text())
Output: 100000
The total time taken to evaluate the query is 210 secs. Below is
the query plan.
Compiling:
- pre-evaluating db:open("ASEPXML", "A_SEPXML")
Optimized Query:
fn:count(document-node { "A_SEPXML"
}/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text())
Result:
- Hit(s): 1 Item
- Updated: 0 Items
- Printed: 6 Bytes
- Locking: local [ASEPXML]
Timing:
- Parsing: 0.0 ms
- Compiling: 0.28 ms
- Evaluating: 210818.96 ms
- Printing: 0.08 ms
- Total Time: 210819.34 ms
Query plan:
<QueryPlan>
  <FNAggr name="count(item)">
    <CachedPath>
      <DBNode name="ASEPXML" pre="0"/>
      <IterPosStep axis="descendant" test="*:PmtInf">
        <Pos min="15" max="15"/>
      </IterPosStep>
      <IterStep axis="child" test="*:CdtTrfTxInf"/>
      <IterStep axis="child" test="*:CdtrAgt"/>
      <IterStep axis="child" test="*:FinInstnId"/>
      <IterStep axis="child" test="*:BIC"/>
      <IterStep axis="child" test="text()"/>
    </CachedPath>
  </FNAggr>
</QueryPlan>
      The database ASEPXML has text index created. The structure of the xml
is as follows:
<root>
       <PmtInf>
              <CdtTrfTxInf>
                   <CdtrAgt>
                             <FinInstnId>
                                     <BIC>XXXXXX1A</BIC>
                             </FinInstnId>
                    </CdtrAgt>
               </CdtTrfTxInf>
               <CdtTrfTxInf>
                   <CdtrAgt>
                             <FinInstnId>
                                     <BIC>XXXXXX2A</BIC>
                             </FinInstnId>
                    </CdtrAgt>
               </CdtTrfTxInf>
              <CdtTrfTxInf>
                   <CdtrAgt>
                             <FinInstnId>
                                     <BIC>XXXXXX3A</BIC>
                             </FinInstnId>
                    </CdtrAgt>
               </CdtTrfTxInf>
               <CdtTrfTxInf> ... repeating node (100000)
       </PmtInf>
       <PmtInf>
</PmtInf>
        ... repeating node (40)
</root>
Are there possibilities of reducing the response times and how do I detect
if an index is being used or not ?
Regards
Nesh