Thanks for that. New version works nicely with full text indexing - and very fast too! 

Noticed that the text index seems to work differently between RESTXQ (not utilized?) and REST - judging from the response time. 

Thanks again for the efforts

Lars

2015-05-19 13:29 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
> I'll check out how this can be fixed.

So I checked out how to fix it, and I fixed it [1]. Feel free to try
the latest snapshot [2]!
Christian

[1] https://github.com/BaseXdb/basex/issues/1144
[2] http://files.basex.org/releases/latest


> On Mon, May 18, 2015 at 6:46 PM, Lars Johnsen <yoonsen@gmail.com> wrote:
>> A last update, which may illuminate a little. After reindexing the database
>> using Norwegian (snowball), stemming, and keeping diacritics, RESTXQ
>> processes neither the special characters (it treats them as the closest ASCII
>> character) nor inflected forms.
>>
>> The words "mannen" (= the man, definite) and "spaserer" (= walks, present
>> tense) produce no output, while the bare stems "mann" and "spaser" return
>> the full result. REST, in contrast, behaves as expected.
>>
>>
>> Cheers
>> Lars
>>
>> 2015-05-18 15:28 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
>>>
>>> As an update, after rebuilding database with
>>>
>>> text index,
>>> full text index (no language, no stemming, keep diacritics)
>>>
>>> restarting server:
>>> BaseX 8.1.1 [Server]
>>> Server was started (port: 29084)
>>> [main] INFO org.eclipse.jetty.server.AbstractConnector - Started
>>> SelectChannelConnector@0.0.0.0:8984
>>> HTTP Server was started (port: 8984)
>>>
>>> RESTXQ: Norwegian characters are converted when the full-text index is used;
>>> switching to the text index takes forever.
>>> REST: Full-text works as expected, and the text index works as expected (same
>>> as running in the GUI for both).
>>>
>>> It looks as if the index structure is treated differently.
>>>
>>>
>>> 2015-05-18 15:07 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
>>>>
>>>> The full text query is blisteringly fast for both; the text index query
>>>> is fast only for REST queries and seems not to be used for queries in
>>>> RESTXQ. I am rebuilding the whole database now to see how it goes, and will
>>>> restart everything for a new assessment.
>>>>
>>>>
>>>>
>>>> 2015-05-18 15:00 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
>>>>>
>>>>> > However, when using text index instead of full text the results are
>>>>> > the same
>>>>> > for both, except that RESTXQ takes almost forever
>>>>>
>>>>> What about the original query: Has it been slow as well, or do you
>>>>> think this is a new problem?
>>>>>
>>>>>
>>>>> > 2015-05-18 14:28 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
>>>>> >>
>>>>> >> It could be that your URL is decoded in a wrong way. What happens if
>>>>> >> you run the following function with REST and RESTXQ and "føre" as
>>>>> >> word?
>>>>> >>
>>>>> >>   declare
>>>>> >>     %rest:path("/test/encoding/{$word}")
>>>>> >>   function page:test-encoding($word) {
>>>>> >>     string-to-codepoints($word)
>>>>> >>   };
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Christian
>>>>> >>
>>>>> >>
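A rough way to see what a wrongly decoded URL would look like, outside BaseX: the Python sketch below is not part of the original exchange, and the helper name `codepoints` is made up for illustration. It percent-encodes "føre" the way a browser typically would (UTF-8), then decodes the path segment once as UTF-8 and once as Latin-1 and lists the code points. If the URL were being decoded incorrectly, `string-to-codepoints($word)` would presumably return something like the last list rather than the middle one.

```python
from urllib.parse import quote, unquote


def codepoints(text):
    """Return the Unicode code points of text, similar to XQuery's string-to-codepoints()."""
    return [ord(ch) for ch in text]


word = "føre"

# Percent-encode the UTF-8 bytes of the word, as it would appear in a URL path.
encoded = quote(word)

# Decode the same path segment with the right and with a wrong charset.
right = unquote(encoded, encoding="utf-8")
wrong = unquote(encoded, encoding="latin-1")

print(encoded)
print(codepoints(right))
print(codepoints(wrong))
```

With a correct decode the second code point is 248 (ø); a Latin-1 decode splits it into two code points, which is an easy pattern to spot in the function's output.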
>>>>> >> string-to-codepoints()
>>>>> >> > REST output (2 first lines):
>>>>> >> >    føre
>>>>> >> >    fø - re 219
>>>>> >> >
>>>>> >> > RESTXQ
>>>>> >> >    føre
>>>>> >> >    fo - re 123
>>>>> >> >
>>>>> >> > The first word quoted is "føre" in both cases and is what the scripts
>>>>> >> > see, so the full text is given the same in both cases. Could it be that
>>>>> >> > within RESTXQ the full text index is treated differently?
>>>>> >> >
>>>>> >> > I will work on a self-contained example, but thought this might
>>>>> >> > point to something.
>>>>> >> >
>>>>> >> > Cheers
>>>>> >> > Lars
>>>>> >> >
>>>>> >> >
>>>>> >> > 2015-05-18 13:44 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
>>>>> >> >>
>>>>> >> >> Hi Christian - and thanks for the fast response. The latest version, 8.1.1,
>>>>> >> >> is in use (same behaviour as the previous one). Let me see if I can make a
>>>>> >> >> self-contained example.
>>>>> >> >>
>>>>> >> >> best,
>>>>> >> >> Lars
>>>>> >> >>
>>>>> >> >> 2015-05-18 13:40 GMT+02:00 Christian Grün
>>>>> >> >> <christian.gruen@gmail.com>:
>>>>> >> >>>
>>>>> >> >>> Hi Lars,
>>>>> >> >>>
>>>>> >> >>> hm, that's difficult to tell. All I can say is that this sounds
>>>>> >> >>> unusual, so I'm coming up with my standard questions: Do you
>>>>> >> >>> think you
>>>>> >> >>> could build us a little example that allows us to reproduce the
>>>>> >> >>> problem? Have you tried the latest version of BaseX?
>>>>> >> >>>
>>>>> >> >>> Best,
>>>>> >> >>> Christian
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> On Mon, May 18, 2015 at 1:35 PM, Lars Johnsen <yoonsen@gmail.com>
>>>>> >> >>> wrote:
>>>>> >> >>> >
>>>>> >> >>> > I am running a web script in two identical versions (identical as in
>>>>> >> >>> > "cut and paste"), one via RESTXQ and one via REST. The responses
>>>>> >> >>> > differ, and I wonder what the trouble may be.
>>>>> >> >>> >
>>>>> >> >>> > For example the output (the URLs only work locally) for
>>>>> >> >>> >     http://ljohnsen:8984/hyphens/mellom
>>>>> >> >>> > is the same as
>>>>> >> >>> >      http://ljohnsen:8984/rest?run=hyphen-show.xq&word=mellom
>>>>> >> >>> >
>>>>> >> >>> > which is a set of hyphenation data:
>>>>> >> >>> >     mellom
>>>>> >> >>> >     mel - lom 17005
>>>>> >> >>> >     Mel - lom 144
>>>>> >> >>> >     mel - lom. 50
>>>>> >> >>> >
>>>>> >> >>> > but if "mellom" is exchanged with "nasjonalbiblioteket", only the REST
>>>>> >> >>> > version shows any result, which is then the same as I get when
>>>>> >> >>> > experimenting in the GUI.
>>>>> >> >>> >
>>>>> >> >>> > The actual script, which runs in both versions (identical apart from
>>>>> >> >>> > the REST and RESTXQ interfaces), is added below. It uses full text
>>>>> >> >>> > search, but results differ when run under the REST-regime.
>>>>> >> >>> >
>>>>> >> >>> > All the best
>>>>> >> >>> > Lars G Johnsen
>>>>> >> >>> > National Library of Norway
>>>>> >> >>> >
>>>>> >> >>> > module namespace page = 'http://basex.org/modules/web-page';
>>>>> >> >>> >
>>>>> >> >>> > declare
>>>>> >> >>> >   %rest:path("/hyphens/{$word}")
>>>>> >> >>> >   %output:method("html")
>>>>> >> >>> >
>>>>> >> >>> > function page:show-hyphens($word) {
>>>>> >> >>> >    let $db := db:open('hyphen-data')
>>>>> >> >>> >      let $hyphens :=  for $hyp in $db/hyphens/hyphens[full contains text {$word}]
>>>>> >> >>> >       group by $first := $hyp/first, $second := $hyp/second
>>>>> >> >>> >       let $count := count($hyp)
>>>>> >> >>> >       order by xs:int($count) descending
>>>>> >> >>> >       return element p {
>>>>> >> >>> >         attribute freq {$count},
>>>>> >> >>> >         $first, " - ", $second, $count
>>>>> >> >>> >       }
>>>>> >> >>> >
>>>>> >> >>> >      let $total := sum($hyphens//@freq)
>>>>> >> >>> >      let $div := element div {
>>>>> >> >>> >        element p {$word},
>>>>> >> >>> >        for $hyp in $hyphens
>>>>> >> >>> >        return element div {
>>>>> >> >>> >           attribute class {"hyph"},
>>>>> >> >>> >           attribute style {"font-size:", 1 + round(xs:int($hyp//@freq/data()) div $total, 1) || "em"},
>>>>> >> >>> >           $hyp
>>>>> >> >>> >
>>>>> >> >>> >          }
>>>>> >> >>> >      }
>>>>> >> >>> >      return
>>>>> >> >>> >      <html encoding="UTF-8">
>>>>> >> >>> >     <head>
>>>>> >> >>> >         <meta http-equiv="Content-Type" content="text/html" charset="UTF-8" />
>>>>> >> >>> >         <title>Orddelinger</title>
>>>>> >> >>> >     </head>
>>>>> >> >>> >     <body>{$div}
>>>>> >> >>> >     </body>
>>>>> >> >>> >     </html>
>>>>> >> >>> >
>>>>> >> >>> > };
>>>>> >> >>
>>>>> >> >>
>>>>> >> >
>>>>> >
>>>>> >
>>>>
>>>>
>>>
>>