Many thanks for checking, Christian! I'll study the spec and get back to you should I come to a different conclusion.

Kind regards,
Hans-Jürgen

On Monday, 25 April 2022, 12:00:41 CEST, Christian Grün <christian.gruen@gmail.com> wrote:


Hi Hans-Jürgen,

PS: I think there is a bug concerning "different sentence":

basex "'base.x' contains text 'base x' same sentence"
false

basex "'base.x' contains text 'base x' different sentence"
false

After some attempts, I decided to stick with the current solution, as I believe it’s formally correct (albeit counter-intuitive). The specification merely indicates that:

“A scope selection selects matches which satisfy the operand full-text selection and for which the matched tokens and phrases are contained in the same scope or in different scopes.” [1]

It does not elaborate on what should happen when a phrase spans multiple scopes (sentences, paragraphs), and I didn’t manage to define concise rules that provide consistent results without considering various edge cases.

If your use case allows you to ignore the difference between token and phrase matches, it’s advisable to use the following syntax:

let $input := 'base.x'
let $tokens := ft:tokenize('base x')
return $input contains text { $tokens } all different sentence

Hope this helps,
Christian