Hi Dimitar,
I have compared the "contains" function (without a full-text index) with "contains text", using an 800 MB XML database. You can download the test data here: http://dl.dropbox.com/u/22427941/RootSearch.rar
1) Using a simple query:
+ declare default element namespace "http://iso.org/OTX"; for $pro in collection()//specification[text() contains text "Specification"] return $pro
+ declare default element namespace "http://iso.org/OTX"; for $pro in collection()//specification[contains(text() ,"Specification")] return $pro
The query with "contains text" ran slightly faster than the query that does not use the full-text index.
2) Using a complex query:
declare default element namespace "http://iso.org/OTX";
for $pro in collection()/otx/procedures/procedure
return
  for $hd in $pro/realisation/flow//handler
  where exists($hd/@*[contains(data(.),"Variable1")])
     or exists($hd/realisation/catch/exception//@*[contains(data(.),"Variable1")])
     or $hd/specification contains text "Specification"
     (: or exists ($hd/specification[contains(data(.),"Specification")] ):)
  return concat(data($pro/../../@package),":",data($pro/../../@name),":",data($pro/@name),":","handler",":",$hd/@id)
The variant with "contains text" ran much slower than the variant with "contains".
The following indexes are used: path, text, attribute, and full-text (created without any options).
Thanks for helping me.
Cheers, An
On Fri, Nov 25, 2011 at 3:31 PM, Dimitar Popov < Dimitar.Popov@uni-konstanz.de> wrote:
Hi An,
On Friday, 25 November 2011, 11:30:00, Truong An Nguyen wrote:
Hi,
I've two questions about the fulltext search.
- I've created fulltext i for my BaseX doesn't understand ftcontains.
When I try a CONTAINS query in a 1 gigabyte database, using the text index is slower than not using it. How can I use the full-text index correctly?
You can do several things:
1. Make sure that the full-text index is used, i.e. check the "query info"
in the GUI; it should contain "FTIndexAccess" similar to:
<FTIndexAccess data="factbook">
  <FTWords>
    <Item value="norway" type="xs:string"/>
  </FTWords>
</FTIndexAccess>
2a. If the full-text index is used, please send more details about your query and data (e.g. what full-text options are used); it would be interesting to see why the index query is slower.
2b. If the full-text index is NOT used, please check that the full-text options you use in your query correspond to the options with which the full-text index was created. For more information, check our wiki page [1].
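As a rough illustration of what "full-text options in the query" means, the sketch below runs the same search once with the default options and once with an explicit match option. The inline <specs> data is made up for the example, so no index is involved here; the assumption is only that, against a real database, the second form would need an index created with a matching case-sensitivity setting before it could be answered from the index.

```xquery
(: Hypothetical inline sample data, used only to show the option syntax. :)
let $doc :=
  <specs>
    <specification>Specification of the handler</specification>
    <specification>a short specification</specification>
  </specs>
return (
  (: Default options: matching is case-insensitive. :)
  count($doc/specification[text() contains text "Specification"]),
  (: Explicit match option: only the capitalised token matches. :)
  count($doc/specification[text() contains text "Specification" using case sensitive])
)
```

If the "query info" shows no FTIndexAccess for a query that uses such options, comparing them with the options of the index is a sensible first check.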
- Does BaseX support regular expressions in the full-text extension? If
yes, could you please give me an example?
No, full-blown regular expressions are not supported by XQuery Full-Text. However, wildcards are supported. For the corresponding syntax and examples, you can check the XQuery Full-Text specification [2].
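A minimal sketch of the wildcard option, following the syntax in [2]: inside a search token, "." stands for one arbitrary character and ".*" for zero or more. The inline <specs> data is invented for the example and is not taken from the test database.

```xquery
(: Hypothetical inline sample data for trying out wildcards. :)
let $doc :=
  <specs>
    <specification>Specification of the brake test</specification>
    <specification>Special handling</specification>
    <specification>Overview</specification>
  </specs>
(: "spec.*" matches any token that starts with "spec";
   matching is case-insensitive by default. :)
return $doc/specification[text() contains text "spec.*" using wildcards]
```

With the default options this is expected to select the first two elements ("Specification ..." and "Special ...") and skip the third.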
Thanks
Cheers, An
Greetings, Dimitar
[1] http://docs.basex.org/wiki/Full-Text
[2] http://www.w3.org/TR/xpath-full-text-10/#ftwildcardoption