- That's not my query: I was using 'all words', and my query still runs
without an index.
Correct; first of all, the following sub-expressions are equivalent:
– "romeo juliet" all words
– "romeo" ftand "juliet"
..but in fact "text()" is not equivalent to the context item ".". The following expression shows another alternative..
a) //SPEECH[ .//text() contains text "romeo" ftand "juliet"]
..but to really get equivalent results, you should go along with:
b) //SPEECH[ .//text() contains text "romeo"][ .//text() contains text "juliet"]
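The difference between a) and b) can be modelled outside XQuery. The following Python sketch is only an illustration, not BaseX code: it assumes that "contains text" tests each text node on its own, it uses a simplified lowercase word tokenizer, and the helper names and the sample speech are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

def tokens(text):
    """Simplified tokenizer (an assumption): set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def query_a(speech, words):
    """Model of a): one single text node must contain all words."""
    return any(set(words) <= tokens(t) for t in speech.itertext())

def query_b(speech, words):
    """Model of b): each word may be found in a different text node."""
    return all(any(w in tokens(t) for t in speech.itertext()) for w in words)

# Hypothetical speech in which the two words occur in different LINE elements.
speech = ET.fromstring(
    "<SPEECH><LINE>O Romeo, Romeo!</LINE><LINE>Fair Juliet sleeps.</LINE></SPEECH>"
)

print(query_a(speech, ["romeo", "juliet"]))  # False
print(query_b(speech, ["romeo", "juliet"]))  # True
```

In this model, a) only matches when both words share a text node, while b) also matches speeches in which the words are spread over several lines.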
- I think that 'text()' is not equivalent to '.'.
For example, I could not use text() here:
//SPEECH[ . contains text "romeo juliet" all words]
By the way, this query just doesn't work (while it should return 42 hits).
This might be due to the phenomenon of node atomization, which is handled differently across implementations. The following query..
<xml>A<x>B</x></xml> contains text 'A'
..returns "false" in BaseX, whereas other implementations might return "true" if "A" and "B" are handled as two independent tokens. If you apply query b) above, you might get the results you are expecting.
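The atomization effect can be reproduced outside XQuery. This Python sketch is an illustration only: it assumes a simplified word tokenizer and a hypothetical helper name, and it models the two behaviours described above rather than any engine's actual implementation.

```python
import re
import xml.etree.ElementTree as ET

def tokens(text):
    """Simplified tokenizer (an assumption): list of lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

node = ET.fromstring("<xml>A<x>B</x></xml>")

# Atomizing the element concatenates all descendant text nodes into "AB",
# which the tokenizer sees as the single token "ab".
string_value = "".join(node.itertext())
atomized_tokens = tokens(string_value)

# Treating every text node as an independent token source gives "a" and "b".
per_node_tokens = [t for text in node.itertext() for t in tokens(text)]

print("a" in atomized_tokens)   # False
print("a" in per_node_tokens)   # True
```

The first result corresponds to the "false" reported for BaseX, the second to implementations that tokenize "A" and "B" separately.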
Best, Christian
___________________________
Christian Gruen Universitaet Konstanz Department of Computer & Information Science D-78457 Konstanz, Germany Tel: +49 (0)7531/88-4449, Fax: +49 (0)7531/88-3577 http://www.inf.uni-konstanz.de/~gruen
On Thu, Feb 11, 2010 at 1:14 AM, Thomas Goossens thomgooss@gmail.com wrote:
---------- Forwarded message ----------
From: Christian Grün christian.gruen@gmail.com
Date: Thu, Feb 11, 2010 at 12:42 AM
Subject: Re: [basex-talk] Full-text speed
To: Thomas Goossens thomgooss@gmail.com
Cc: basex-talk@mailman.uni-konstanz.de
Hi Thomas,
your query will be evaluated much faster if you rewrite it to..
//LINE[ text() contains text "romeo juliet"]
This query should take ~3-5 ms on the 7.5 MB Shakespeare instance.
You can have a look into our XQuery documentation (http://basex.org/xquery, Section »Query Evaluation«) to get more insight into query compilation and how to utilize the index structures.
Hope this helps,
Christian
On Thu, Feb 11, 2010 at 12:24 AM, Thomas Goossens thomgooss@gmail.com wrote:
Hello,
I am trying XQuery Full-text on BaseX and I am a bit surprised by the full-text query speed: I have loaded the Shakespeare plays into a BaseX database, and created a full-text index. So far so good.
Then I tried a query like:
//LINE[ . contains text "romeo juliet" all words] (4 hits)
It takes about 1200 ms. I expected less than 100 ms. For example I tried Qizx and it takes less than 20 ms. Even eXist (old version, with a different syntax) was taking around 200 ms.
I tried dropping the full-text index: that makes no difference! So clearly the FT index is not used. What should I do?
Thanks
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk