Result:
- Hit(s): 250000 Items
- Updated: 0 Items
- Printed: 2048 KB
- Read Locking: local [CDI]
- Write Locking: none
Timing:
- Parsing: 2.0 ms
- Compiling: 107.74 ms
- Evaluating: 8085.55 ms
- Printing: 106.4 ms
- Total Time: 8301.69 ms
With kind regards, Menashè
Hi Menashè,

QUERY[0] xquery version "3.0"; declare namespace queryName ='GetIDS'; declare namespace gco = "http://www.isotc211.org/2005/gco"; declare [...]

It would be great if you could help us and simplify the query, such that we can have a look at the core issue.

Is there an undocumented way to log the full xquery in BaseX server logs?

The maximum size of log entries can be adjusted via the option LOGMSGMAXLEN [1].

Cheers, Christian

[1] http://docs.basex.org/wiki/Options#LOGMSGMAXLEN

I've seen the -V option, but I don't use the standalone version, and
java -cp /usr/share/java/basex.jar org.basex.BaseXServer -d
doesn't give me extra query info.

With kind regards, Menashè

On 02/03/2015 01:13 PM, Menashè Eliezer wrote:

Hi Christian,

Thank you! The query time is now down to 0.5 sec! The biggest improvement comes from the query rephrasing you suggested, and the latest snapshot also helps a lot.

You may want to know that the log of the latest snapshot shows
applying attribute index for "7827"
which is not clear to the user, whereas BaseX80-20150130.124009, which also used indexing, showed
applying attribute index for ("ALKY", "AYMD")

I'm attaching the first and the second launch of the query using BaseXGUI. Relaunching the same query reduces the time from over 1 second to 0.5 second.

Some data:

BaseX80-20150130.124009:
- Total Time: 30676.02 ms
- After using "for $x in collection("ALL-CDIS")/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification": Total Time: 5456.74 ms, with applying attribute index for ("ALKY", "AYMD") in the log
- Second launch: 1333.71 ms

Latest snapshot (BaseX80-20150202.121033):
- 1st: Total Time: 1873.02 ms
- 2nd: Total Time: 548.62 ms

With kind regards, Menashè

On 02/02/2015 02:02 PM, Menashè Eliezer wrote:

Hi Christian,

Thank you very much! Unfortunately I won't be back at the office until tomorrow.
Menashè

On Sat, 31 Jan 2015 16:42:32 +0100, Christian Grün <christian.gruen@gmail.com> wrote:

Hi Menashè,

With the latest snapshot [1], your original query should now be rewritten for index access as well.

Looking forward to your tests, Christian

PS: In terms of performance, it may still be worthwhile to move redundant paths to the for clause; but just try and see.

[1] http://files.basex.org/releases/latest/

On Fri, Jan 30, 2015 at 9:49 PM, Christian Grün <christian.gruen@gmail.com> wrote:

Hi Menashè,

Should I expect to see the usage of an index for each of the where clauses?

Usually, only one predicate will be rewritten for index access, and the remaining conditions will be answered sequentially.

Have a nice weekend!

Enjoy, Christian

Menashè

On Fri, 30 Jan 2015 18:11:59 +0100, Christian Grün <christian.gruen@gmail.com> wrote:

Hi Menashè,

Thanks for the XML samples you sent me in private. I noticed that the index rewritings will only be triggered if you formulate your query as follows:

OLD:
for $x in collection("ALL-CDIS")
where $x/gmd:MD_Metadata/gmd:identificationInfo/...
return ...

NEW:
for $x in collection("ALL-CDIS")/gmd:MD_Metadata
where $x/gmd:identificationInfo/...
return ...

It's difficult to explain in short sentences why Variant 1 cannot be optimized that straightforwardly (basically, it's quite a different pattern to look for), but I'll check whether we can extend our matcher to also support these kinds of queries. So, if possible, I would recommend that for now (and at least for testing) you move the root element test to directly after the collection() function.

I noticed that the first three child steps are the same in all of your conditions:

gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification

If that will always be the case, it surely makes sense to move all of them to the "for" clause.
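The restructuring described above could look roughly like the following sketch. The full original query is not shown in this thread, so the where condition, the sdn namespace URI, and the return clause are assumptions for illustration only; the shared path, the collection name, and the values "ALKY" and "AYMD" are taken from the messages in this thread.

```xquery
xquery version "3.0";
declare namespace gmd = "http://www.isotc211.org/2005/gmd";
(: Assumption: the sdn namespace URI does not appear in the thread :)
declare namespace sdn = "http://www.seadatanet.org";

(: The three child steps shared by all conditions are moved into the
   "for" clause, so each $x is already an SDN_DataIdentification element :)
for $x in collection("ALL-CDIS")
            /gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification
(: Hypothetical condition: positional predicates are left out, and the
   attribute comparison is the kind of predicate that may be rewritten
   for index access :)
where $x/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:keyword
        /sdn:SDN_ParameterDiscoveryCode/@codeListValue = ("ALKY", "AYMD")
(: Hypothetical result: the URI of each matching document :)
return base-uri($x)
```

Whether the index is actually applied can be checked in the query info, which should then contain a line starting with "applying attribute index".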
Looking forward to your updated performance tests, Christian

_______________________________

On Fri, Jan 30, 2015 at 5:55 PM, Christian Grün <christian.gruen@gmail.com> wrote:

Could you possibly provide me with a small snapshot of your data sources (one or two documents might be sufficient)?

On Fri, Jan 30, 2015 at 5:52 PM, Menashè Eliezer <meliezer@ogs.trieste.it> wrote:

Almost the same speed with version 8.0. No indexing (no "applying" in the query info). As I've attached before, indexes are active for this DB.

With kind regards, Menashè

On 01/30/2015 05:31 PM, Christian Grün wrote:

It's indeed interesting that your query does not use any of the existing index structures (if it did, you would find strings like "applying text index" or "applying attribute index" in the query info). Maybe/hopefully things look different with Version 8.0.

On Fri, Jan 30, 2015 at 5:26 PM, Menashè Eliezer <meliezer@ogs.trieste.it> wrote:

On 01/30/2015 05:18 PM, Christian Grün wrote:

/gmd:MD_Metadata/gmd:identificationInfo/sdn:SDN_DataIdentification/gmd:descriptiveKeywords[1]/gmd:MD_Keywords/gmd:keyword[2]/sdn:SDN_ParameterDiscoveryCode/@codeListValue

How can I remove *?

Simply remove the predicate; a[*]/b is the same as a/b.

Maybe I wasn't clear. The actual number appears in the xml file, e.g., gmd:descriptiveKeywords[1]. Anyway, I've removed all [*] and I get the same correct result; however, the processing time has doubled...

* In some cases, if you know that an element name is distinct, you can get rid of all the explicit child steps and directly address the node via the descendant axis.

Thanks, but it's not relevant in my case.

Is it because the element names are not distinct? Or is it because your input form allows users to choose arbitrary paths for arbitrary documents?

The element names are not distinct.

Sure, I'll also try BaseX 8.0 and compare.
Should I recreate the db by importing the xml files for testing the improved indexing?

We have actually improved support for collections, but the database format itself has not changed, so it shouldn't make a difference in your case.

Christian

[1] http://files.basex.org/releases/latest

On Fri, Jan 30, 2015 at 3:55 PM, Menashè Eliezer <meliezer@ogs.trieste.it> wrote:

Hello,

I wonder if the attached query can be optimised. I'm attaching all relevant information. BaseX 7.9, Debian, powerful server. This is just an example. The queries will be built from the values entered in a search form. Any help would be appreciated. 40 seconds is not acceptable.

--
With kind regards, Menashè