Excellent--thanks for this enhancement!

Tim


--
Tim A. Thompson (he, him)
Librarian for Applied Metadata Research
Yale University Library



On Wed, Apr 27, 2022 at 3:43 PM Christian Grün <christian.gruen@gmail.com> wrote:
> "men" is stemmed to both "man" and "cocksman"--shaking my head on that one!

Reads like a subversive Easter egg :·)

Thanks for the efforts, Tim. The results (memorandum → memo, among
others) suggest to me that Bitext has chosen a dictionary-based
solution.

I have decided to enhance our Porter algorithm with a simple
dictionary that contains the most common irregular plural forms,
including the ones you analyzed and various others. A new snapshot is
available [1,2].
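
As a rough illustration of the idea (a minimal sketch, not the actual
BaseX code): an exception dictionary is consulted first, and only
tokens that are not found in it go on to the rule-based stemmer. The
names IRREGULAR_PLURALS, fallback_stem and stem are hypothetical, the
dictionary entries are examples only, and the fallback is a trivial
plural-stripping stand-in rather than the Porter algorithm.

```python
# Hypothetical exception table; the entries in the real dictionary may differ.
IRREGULAR_PLURALS = {
    "men": "man",
    "women": "woman",
    "children": "child",
    "feet": "foot",
    "teeth": "tooth",
    "mice": "mouse",
    "geese": "goose",
}


def fallback_stem(token: str) -> str:
    """Stand-in for the rule-based stemmer (NOT the Porter algorithm).

    It only strips a plain plural "s" from tokens longer than three
    characters that do not end in "ss".
    """
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def stem(token: str) -> str:
    """Check the exception table first, then fall back to the rules."""
    lower = token.lower()
    irregular = IRREGULAR_PLURALS.get(lower)
    if irregular is not None:
        return irregular
    return fallback_stem(lower)


if __name__ == "__main__":
    for word in ["men", "Children", "books", "glass"]:
        print(word, "->", stem(word))
```

Checking the dictionary before applying any rules keeps the irregular
forms from being mangled by suffix stripping, at the cost of having to
maintain the list by hand.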

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/2097
[2] https://files.basex.org/releases/latest/