Hi All,
the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary'; "anschließend" contains text "anschliessend"
returns *false*
but the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary'; "anschließend" = "anschliessend"
returns *true*.
and the following query
declare default collation 'http://basex.org/collation?lang=de;strength=secondary'; "anschließend" eq "anschliessend"
returns *true*.
I wonder why 'contains text' does not return true as well? None of the full-text related functions support ß = ss for the German collation.
Alex
Hi Alex,
Yes, that’s true.
The collation feature was introduced in a later version of XQuery. XQuery Full Text was implemented quite some time before that. Its tokenization process reduces each non-ASCII character to a single alternative character, so ß becomes s rather than ss. As a result, the following returns true…
'anschließend' contains text 'anschliesend'
…whereas one would rather expect 'anschliessend' to be accepted.
You may have spotted Günter’s similar observation regarding the German long s (ſ) in [1]. It may be reasonable to bring all the normalization steps together (even if some of them are language-specific).
Best, Christian
[1] https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg11927.htm...
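Until the normalization steps are unified, one possible workaround is to fold the German-specific characters yourself before the full-text comparison. The following is only a minimal sketch: local:fold-de is a hypothetical helper (not part of BaseX), and it assumes that ß and the long s (ſ) are the only characters that need special treatment.

```xquery
(: Hypothetical helper, not a BaseX function: expands ß to ss and
   maps the long s (ſ) to s before the full-text tokenizer sees the string. :)
declare function local:fold-de($s as xs:string) as xs:string {
  replace(replace($s, 'ß', 'ss'), 'ſ', 's')
};

(: Both operands are folded the same way, so they tokenize identically. :)
local:fold-de('anschließend') contains text { local:fold-de('anschliessend') }
```

Since both sides are identical after folding, this query should return true. The drawback is that the folding has to be applied to the search input and the search terms alike, and it operates independently of the declared collation.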
On Wed, Jul 31, 2019 at 1:42 PM Alexander Witzigmann alexander.witzigmann@tanner.de wrote:
[...]