--
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library
On Wed, Apr 27, 2022 at 3:43 PM Christian Grün <christian.gruen@gmail.com> wrote:
> > "men" is stemmed to both "man" and "cocksman"--shaking my head on that one!
>
> Reads like a subversive Easter egg :·)
>
> Thanks for the efforts, Tim. The results (memorandum → memo, others)
> indicate to me that a dictionary-based solution has been chosen by
> Bitext.
>
> I have decided to enhance our Porter algorithm with a simple
> dictionary that contains the most common irregular plural forms,
> including the ones you analyzed and various others. A new snapshot is
> available [1,2].
>
> Best,
> Christian
>
> [1] https://github.com/BaseXdb/basex/issues/2097
> [2] https://files.basex.org/releases/latest/
>
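
For anyone following the thread, here is a minimal sketch of the general idea of putting an exception dictionary in front of a rule-based stemmer. It is not the BaseX code: the table entries, the function names, and the toy suffix rules standing in for the Porter algorithm are all assumptions made for illustration.

```python
# Hypothetical table of irregular plurals; the entries shipped with BaseX
# may differ.
IRREGULAR_PLURALS = {
    "men": "man",
    "women": "woman",
    "children": "child",
    "feet": "foot",
    "mice": "mouse",
}


def fallback_stem(token: str) -> str:
    """Toy stand-in for a rule-based stemmer (NOT the Porter algorithm)."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def stem(token: str) -> str:
    """Consult the exception dictionary first, then fall back to the rules."""
    lower = token.lower()
    if lower in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[lower]
    return fallback_stem(lower)
```

With this ordering, stem("men") returns "man" because the dictionary matches before any suffix rule is tried, while a regular plural such as "books" still goes through the rule-based path.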