Hello all,
We are using BaseX 7.0.2. When using wildcards in full-text search, we ran into some problems that appear to be related to tokenization. Our database contains these entries:
bb, (aa)bb, bb(cc), (aa)bb(cc)
We ran the following tests; the results are shown after each query:
1- .//value[text() contains text {'.*(bb)'} using wildcards]
returned (aa)bb and (aa)bb(cc)
2- .//value[text() contains text {'.(bb).*'} using wildcards]
returned bb(cc) and (aa)bb(cc)
3- .//value[text() contains text {'(bb)'} using wildcards]
returned (aa)bb, (aa)bb(cc), bb(cc) and bb
So far so good, but the following case is the odd one:
4- .//value[text() contains text {'.*(bb).*'} using wildcards]
returned only (aa)bb(cc)
Can anyone explain why the last case behaves differently? It should be the most general pattern, yet it turns out to be the most restrictive one. Are we missing something, or is this a bug?
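In case it helps with reproducing this, here is a minimal sketch that runs the four patterns against the same four entries. It assumes the entries are stored as value elements; the wrapper element name "entries" and the "result" element are made up for this example. It also runs against an in-memory fragment rather than our actual database, so the results may differ from the ones above if a full-text index is involved.

```xquery
(: Hypothetical in-memory copy of the four entries; the real data lives in a database. :)
let $entries :=
  <entries>
    <value>bb</value>
    <value>(aa)bb</value>
    <value>bb(cc)</value>
    <value>(aa)bb(cc)</value>
  </entries>
(: The four wildcard patterns from cases 1 to 4, in the same order. :)
for $pattern in ('.*(bb)', '.(bb).*', '(bb)', '.*(bb).*')
return
  <result pattern="{ $pattern }">{
    $entries//value[text() contains text { $pattern } using wildcards]
  }</result>
```

Each result element lists the value elements matched by the pattern given in its pattern attribute, so the four cases can be compared side by side.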