Hi,
When trying to use the Portuguese stemmer (from the GUI), I get the following error:
--8<---------------cut here---------------start------------->8---
Version: BaseX 7.0.2
Java: Apple Inc., 1.6.0_29
OS: Mac OS X, x86_64
Stack Trace:
java.lang.NullPointerException
org.basex.util.Token.token(Token.java:154)
org.basex.util.ft.LuceneStemmer.stem(LuceneStemmer.java:133)
org.basex.util.ft.Stemmer.nextToken(Stemmer.java:96)
org.basex.util.ft.FTLexer.nextToken(FTLexer.java:119)
org.basex.index.ft.FTBuilder.index(FTBuilder.java:122)
org.basex.index.ft.FTTrieBuilder.build(FTTrieBuilder.java:48)
org.basex.index.ft.FTTrieBuilder.build(FTTrieBuilder.java:1)
org.basex.core.cmd.ACreate.index(ACreate.java:154)
org.basex.core.cmd.ACreate.index(ACreate.java:131)
org.basex.core.cmd.CreateIndex.run(CreateIndex.java:63)
org.basex.core.Command.run(Command.java:328)
org.basex.core.Command.run(Command.java:116)
org.basex.gui.dialog.DialogProgress$1.run(DialogProgress.java:135)
--8<---------------cut here---------------end--------------->8---
I had a look in lucene-stemmers-3.4.0.jar; could the problem be that the Portuguese stemmer is actually a Snowball, not a Lucene stemmer?
Thanks and best regards
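
[Editorial note] The trace above shows the NullPointerException being raised in Token.token, called from LuceneStemmer.stem. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that the stemmer produced a null result for some input and that null was then converted to a token. The following minimal Java sketch illustrates that failure mode and a defensive guard. All names in it (StemGuardSketch, Stemmer, FRAGILE, toToken, stemUnsafe, stemSafe) are hypothetical and are not the BaseX or Lucene API; it does not describe the fix that was actually applied.

```java
import java.nio.charset.StandardCharsets;

public class StemGuardSketch {
    /** Hypothetical stemmer contract; NOT the BaseX or Lucene API. */
    interface Stemmer {
        String stem(String word);
    }

    /** Hypothetical stemmer that gives up (returns null) on empty input. */
    static final Stemmer FRAGILE = new Stemmer() {
        @Override
        public String stem(String word) {
            if (word.isEmpty()) return null;
            // Toy rule for illustration only: drop a trailing plural "s".
            return word.endsWith("s") ? word.substring(0, word.length() - 1) : word;
        }
    };

    /** Converts a string to UTF-8 bytes; throws NullPointerException on null. */
    static byte[] toToken(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unguarded variant: a null stem propagates into toToken and fails there. */
    static byte[] stemUnsafe(Stemmer stemmer, String word) {
        return toToken(stemmer.stem(word));
    }

    /** Guarded variant: falls back to the unstemmed word if the stem is null. */
    static byte[] stemSafe(Stemmer stemmer, String word) {
        String stem = stemmer.stem(word);
        return toToken(stem == null ? word : stem);
    }

    public static void main(String[] args) {
        System.out.println(new String(stemSafe(FRAGILE, "cartas"), StandardCharsets.UTF_8));
        try {
            stemUnsafe(FRAGILE, "");
        } catch (NullPointerException e) {
            System.out.println("unguarded call failed: " + e);
        }
        System.out.println(stemSafe(FRAGILE, "").length);
    }
}
```

Falling back to the unstemmed word is only one possible design choice; skipping the token or logging the offending input would be equally reasonable in a sketch like this.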
Hi Michael,
thanks for the report. Could you please provide us with a small, more concrete example so that we can reproduce the error?
Regards, Dimitar
Hi,
On 2011-12-09, Dimitar Popov dimitar.popov@uni-konstanz.de wrote:
thanks for the report. Could you please provide us with a small, more concrete example so that we can reproduce the error?
Thanks for your quick response. The error should be easy to reproduce. Here's what I did in the GUI:
- Open the Database Properties
- In the Full-Text tab: Activate Full-Text Index, set Language to "Portuguese (Lucene)", and activate Stemming
- Press OK
- Voilà, you get the error
My database is CARDS from the University of Lisbon [1], a TEI-encoded collection of historical letters.
Thanks and best regards
Footnotes: [1] http://alfclul.clul.ul.pt/cards-fly/download.php?file=CardsXML.zip
Hi Michael,
many thanks for the data; it seems that at least one of the files (CARDS0184.xml) causes the exception when the Portuguese stemmer is enabled. We'll investigate further; it may turn out that we have a more general problem with full-text stemmers. I'll give you more feedback soon.
Regards, Dimitar
Hi Dimitar,
On 2011-12-09, Dimitar Popov dimitar.popov@uni-konstanz.de wrote:
many thanks for the data; it seems that at least one of the files (CARDS0184.xml) causes the exception when the Portuguese stemmer is enabled. We'll investigate further; it may turn out that we have a more general problem with full-text stemmers. I'll give you more feedback soon.
Great, thanks!
Best regards
Hi Michael,
the problem should be fixed in the next development snapshot (not available yet, but you can check out the sources from GitHub).
Regards, Dimitar
Hi Dimitar,
On 2011-12-10, Dimitar Popov dimitar.popov@uni-konstanz.de wrote:
the problem should be fixed in the next development snapshot (not available yet, but you can check out the sources from GitHub).
Great, thanks for the super-quick response! I'll do so as soon as possible.
Thanks and greetings
On 10.12.2011, at 01:20, Michael Piotrowski wrote:
Great, thanks for the super-quick response! I'll do so as soon as possible.
I've just deployed a snapshot, you may download it from:
http://files.basex.org/releases/latest/
Cheers, Alex
On 2011-12-10, Alexander Holupirek alexander.holupirek@uni-konstanz.de wrote:
I've just deployed a snapshot, you may download it from:
Thanks a lot for the quick fix and for providing the snapshot. It works great now.
Best regards
basex-talk@mailman.uni-konstanz.de