Hi Alex,
> If i understood correctly i used the unicode codes from
> http://unicode.org/charts/PDF/U0370.pdf
> to produce the following mapping:
> [...]
Thanks; I have added your mappings and uploaded a new stable snapshot
[1,2]. The following query should now return true:
"ά" contains text "α"
Next, I've added the Greek stemmer to our internal implementations. It
can be invoked by setting "stemming" and "language"; e.g.:
"..." contains text "..." using stemming using language "el"
As I don't speak any Greek, I'm afraid I couldn't test it properly
myself, so your feedback is very welcome!
> I am concerned though because this is not always the desired behavior.
> Sometimes (ie in an academic context) I could see the need for
> accent-sensitive searches.
In this case, you can switch off the removal of diacritics with the
"diacritics sensitive" option; the following query returns false:
"ά" contains text "α" using diacritics sensitive
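To illustrate the difference between the two modes, here is a small
Python sketch. It is not how BaseX does it (we rely on the mapping
tables mentioned above); it uses Unicode canonical decomposition
instead, and it does a plain substring check rather than real token
matching. The function names and the "diacritics_sensitive" parameter
are made up for this example:

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop all combining marks (assumed approach)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_text(text: str, query: str, diacritics_sensitive: bool = False) -> bool:
    """Hypothetical helper: substring match, optionally ignoring diacritics."""
    if diacritics_sensitive:
        # Keep accents, but compose both sides so equivalent encodings match.
        text = unicodedata.normalize("NFC", text)
        query = unicodedata.normalize("NFC", query)
    else:
        # Default mode: remove accents on both sides before comparing.
        text = strip_diacritics(text)
        query = strip_diacritics(query)
    return query in text


print(contains_text("ά", "α"))
print(contains_text("ά", "α", diacritics_sensitive=True))
```

With the default setting the accented and the plain alpha compare as
equal; in the sensitive mode they do not.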
> OTOH this would be reinventing the collation wheel (an oversimplified
> version of it)
That's true. I'll write some more on that as a reply to Michael’s mail.
Christian
[1] http://docs.basex.org/wiki/Releases
[2] http://files.basex.org/releases/latest/