Hi, I'm currently processing a 3GB xml file. Below is the Query that is being executed and the corresponding output
count(db:open("ASEPXML", "A_SEPXML")/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text()) Output: 100000
The total time taken to evaluate the query is 210 secs. Below is the query plan.
Compiling: - pre-evaluating db:open("ASEPXML", "A_SEPXML") Optimized Query: fn:count(document-node { "A_SEPXML" }/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text()) Result: - Hit(s): 1 Item - Updated: 0 Items - Printed: 6 Bytes - Locking: local [ASEPXML] Timing: - Parsing: 0.0 ms - Compiling: 0.28 ms - Evaluating: 210818.96 ms - Printing: 0.08 ms - Total Time: 210819.34 ms Query plan: <QueryPlan> <FNAggr name="count(item)"> <CachedPath> <DBNode name="ASEPXML" pre="0"/> <IterPosStep axis="descendant" test="*:PmtInf"> <Pos min="15" max="15"/> </IterPosStep> <IterStep axis="child" test="*:CdtTrfTxInf"/> <IterStep axis="child" test="*:CdtrAgt"/> <IterStep axis="child" test="*:FinInstnId"/> <IterStep axis="child" test="*:BIC"/> <IterStep axis="child" test="text()"/> </CachedPath> </FNAggr> </QueryPlan> The database ASEPXML has text index created. The structure of the xml is as follows:
<root> <PmtInf> <CdtTrfTxInf> <CdtrAgt> <FinInstnId> <BIC>XXXXXX1A</BIC> </FinInstnId> </CdtrAgt> </CdtTrfTxInf> <CdtTrfTxInf> <CdtrAgt> <FinInstnId> <BIC>XXXXXX2A</BIC> </FinInstnId> </CdtrAgt> </CdtTrfTxInf> <CdtTrfTxInf> <CdtrAgt> <FinInstnId> <BIC>XXXXXX3A</BIC> </FinInstnId> </CdtrAgt> </CdtTrfTxInf> <CdtTrfTxInf> ... repeating node (100000) </PmtInf> <PmtInf>
</PmtInf> ... repeating node (40) </root>
Are there possibilities of reducing the response times and how do I detect if an index is being used or not ?
Regards Nesh
On 03/13/2013 05:04 PM, EV Narasimham wrote:
Hi, I'm currently processing a 3GB xml file. Below is the Query that is being executed and the corresponding output
count(db:open("ASEPXML", "A_SEPXML")/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text()) Output: 100000
And without specifying the wildcard prefix? The query plan suggests that it's not using the text-index. Furthermore I'd probably use:
count(db:open("ASEPXML", "A_SEPXML")/root/PmtInf[15]/CdtTrfTxInf/CdtrAgt/FinInstnId/BIC/text())
without the descendant-axis step. Without the prefixes however, the descendant-axis step should be rewritten to the explicit child steps (I guess).
kind regards Johannes
Hi Nesh,
as Johannes indicated, it could help to explicitly specify all child steps to the PmtInf node.
Christian ___________________________
On Wed, Mar 13, 2013 at 7:38 PM, Johannes.Lichtenberger Johannes.Lichtenberger@uni-konstanz.de wrote:
On 03/13/2013 05:04 PM, EV Narasimham wrote:
Hi, I'm currently processing a 3GB xml file. Below is the Query that is being executed and the corresponding output
count(db:open("ASEPXML",
"A_SEPXML")/descendant::*:PmtInf[15]/*:CdtTrfTxInf/*:CdtrAgt/*:FinInstnId/*:BIC/text()) Output: 100000
And without specifying the wildcard prefix? The query plan suggests that it's not using the text-index. Furthermore I'd probably use:
count(db:open("ASEPXML", "A_SEPXML")/root/PmtInf[15]/CdtTrfTxInf/CdtrAgt/FinInstnId/BIC/text())
without the descendant-axis step. Without the prefixes however, the descendant-axis step should be rewritten to the explicit child steps (I guess).
kind regards Johannes
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
basex-talk@mailman.uni-konstanz.de