Hi Graydon,
> So I would expect that, with a full text search that ignores diacritics, I'd get four hits.
By adding some collation hints to one of the standard string functions, the comparison will succeed:
fn:compare('≮','<','?lang=en;strength=primary')
In the example, I used the BaseX notation for collations (it is similar to the notation in Saxon or eXist; in the future, more and more people will probably switch to the newly introduced UCA collation).
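For comparison, here is a minimal sketch of the same call with the UCA collation URI as defined in XPath and XQuery Functions and Operators 3.1. It assumes a processor that supports this collation, and that the lang and strength parameters carry over unchanged from the BaseX notation; I have not listed a result, as it depends on the implementation:

```xquery
(: Sketch only. Assumption: the processor supports the UCA collation URI :)
(: from F&O 3.1 and accepts the "lang" and "strength" parameters.        :)
fn:compare(
  '≮',
  '<',
  'http://www.w3.org/2013/collation/UCA?lang=en;strength=primary'
)
```

The reason primary strength matters here: '≮' (U+226E) canonically decomposes to '<' followed by the combining long solidus overlay (U+0338), so a comparison that looks only at base characters has nothing left to distinguish the two strings.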
> I don't think it's clear that "text" in "full text" means "groups of letters".
I agree. Once again, the XQFT spec does not dictate what a "token" in a full text is. Currently, we only have two tokenizers: one for Western languages and another one for Japanese (which gets along without whitespace). When we initially implemented the XQFT features some years ago, our major use case was search in a library catalog (comprising metadata on approx. 2 million titles).
Best, Christian