One last update, which may shed a little light. After reindexing the database with Norwegian (Snowball) stemming and keeping diacritics, RESTXQ handles neither the special characters (it treats them as the closest ASCII character) nor inflected forms.

The words "mannen" (= the man, definite) and "spaserer" (= walks, present tense) produce no output, whereas the bare stems "mann" and "spaser" display the full result. REST, in contrast, behaves as expected.
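
To separate the stemming behaviour from the HTTP layer, a check along these lines might help. It is only a sketch: it assumes the 'hyphen-data' database and the hyphens/hyphens/full path from the script quoted further down, and the test words are the four mentioned above. Running it unchanged in the GUI, through REST, and inside a RESTXQ function should show whether the inflected forms are matched the same way in each.

```xquery
(: Sketch: count full-text hits for inflected forms and bare stems.
   Assumes the 'hyphen-data' database and the <full> child element
   used in the script quoted further down in this thread. :)
let $db := db:open('hyphen-data')
for $word in ('mannen', 'mann', 'spaserer', 'spaser')
return element hits {
  attribute word { $word },
  count($db/hyphens/hyphens[full contains text { $word }])
}
```

If the counts for "mannen" and "mann" agree in the GUI and REST but not in RESTXQ, that would point at how the query is evaluated rather than at the URL handling.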


Cheers
Lars

2015-05-18 15:28 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
As an update, after rebuilding the database with

text index, 
full text index (no language, no stemming, keep diacritics)

restarting server:
BaseX 8.1.1 [Server]
Server was started (port: 29084)
[main] INFO org.eclipse.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:8984
HTTP Server was started (port: 8984)

RESTXQ: Norwegian characters are converted when the full-text index is used; switching to the text index takes forever.
REST: The full-text index and the text index both work as expected (same as running in the GUI for both).

It looks as if the index structure is treated differently.
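
One way to compare what each interface sees could be to print the database properties from all three entry points. This is a sketch under the assumption that the database is named 'hyphen-data', as in the script quoted further down; db:info is a BaseX function, and its output should include the index settings.

```xquery
(: Sketch: return the database properties so the index settings
   reported via the GUI, REST and RESTXQ can be compared.
   Assumes the database name 'hyphen-data' from the quoted script. :)
db:info('hyphen-data')
```

If all three report the same settings, the difference is more likely in how the query is compiled than in the index itself.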


2015-05-18 15:07 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
The full-text query is blisteringly fast for both; the text index query is fast only for REST queries and does not seem to be used for queries in RESTXQ. I am rebuilding the whole database now to see how it goes, and will restart everything for a new assessment.



2015-05-18 15:00 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
> However, when using text index instead of full text the results are the same
> for both, except that RESTXQ takes almost forever

What about the original query: Has it been slow as well, or do you
think this is a new problem?


> 2015-05-18 14:28 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
>>
>> It could be that your URL is decoded in a wrong way. What happens if
>> you run the following function with REST and RESTXQ and "føre" as
>> word?
>>
>>   declare
>>     %rest:path("/test/encoding/{$word}")
>>   function page:test-encoding($word) {
>>     string-to-codepoints($word)
>>   };
>>
>> Thanks,
>> Christian
>>
>>
>> string-to-codepoints()
>> > REST output (2 first lines):
>> >    føre
>> >    fø - re 219
>> >
>> > RESTXQ
>> >    føre
>> >    fo - re 123
>> >
>> > The first word quoted is "føre" in both cases and is what the scripts see,
>> > so the full text is given the same in both cases. Could it be that within
>> > RESTXQ the full-text index is treated differently?
>> >
>> > I will work more closely on a self-contained example, but thought this
>> > might point to something.
>> >
>> > Cheers
>> > Lars
>> >
>> >
>> > 2015-05-18 13:44 GMT+02:00 Lars Johnsen <yoonsen@gmail.com>:
>> >>
>> >> Hi Christian - and thanks for the fast response. The latest version, 8.1.1,
>> >> is in use (same behaviour as previous). Let me see if I can make a
>> >> self-contained example.
>> >>
>> >> best,
>> >> Lars
>> >>
>> >> 2015-05-18 13:40 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
>> >>>
>> >>> Hi Lars,
>> >>>
>> >>> hm, that's difficult to tell. All I can say is that this sounds
>> >>> unusual, so I'm coming up with my standard questions: Do you think you
>> >>> could build us a little example that allows us to reproduce the
>> >>> problem? Have you tried the latest version of BaseX?
>> >>>
>> >>> Best,
>> >>> Christian
>> >>>
>> >>>
>> >>> On Mon, May 18, 2015 at 1:35 PM, Lars Johnsen <yoonsen@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> > I am running a web script in two identical versions (identical as in
>> >>> > "cut and paste"), one via RESTXQ and one via REST. The response is
>> >>> > different, and I wondered what may be the trouble.
>> >>> >
>> >>> > For example the output (the URLs only work locally) for
>> >>> >     http://ljohnsen:8984/hyphens/mellom
>> >>> > is the same as
>> >>> >      http://ljohnsen:8984/rest?run=hyphen-show.xq&word=mellom
>> >>> >
>> >>> > which is a set of hyphenation data:
>> >>> >     mellom
>> >>> >     mel - lom 17005
>> >>> >     Mel - lom 144
>> >>> >     mel - lom. 50
>> >>> >
>> >>> > but if "mellom" is exchanged with "nasjonalbiblioteket", only the REST
>> >>> > version shows any result, which is then the same as I get when
>> >>> > experimenting in the GUI.
>> >>> >
>> >>> > The actual script is added below. It runs in both versions
>> >>> > (identical apart from the REST and RESTXQ interfaces) and uses full-text
>> >>> > search, but results differ when run under the REST-regime.
>> >>> >
>> >>> > All the best
>> >>> > Lars G Johnsen
>> >>> > National Library of Norway
>> >>> >
>> >>> > module namespace page = 'http://basex.org/modules/web-page';
>> >>> >
>> >>> > declare
>> >>> >   %rest:path("/hyphens/{$word}")
>> >>> >   %output:method("html")
>> >>> >
>> >>> > function page:show-hyphens($word) {
>> >>> >    let $db := db:open('hyphen-data')
>> >>> >      let $hyphens :=  for $hyp in $db/hyphens/hyphens[full contains text {$word}]
>> >>> >       group by $first := $hyp/first, $second := $hyp/second
>> >>> >       let $count := count($hyp)
>> >>> >       order by xs:int($count) descending
>> >>> >       return element p {
>> >>> >         attribute freq {$count},
>> >>> >         $first, " - ", $second, $count
>> >>> >       }
>> >>> >
>> >>> >      let $total := sum($hyphens//@freq)
>> >>> >      let $div := element div {
>> >>> >        element p {$word},
>> >>> >        for $hyp in $hyphens
>> >>> >        return element div {
>> >>> >           attribute class {"hyph"},
>> >>> >           attribute style {"font-size:", 1 + round(xs:int($hyp//@freq/data()) div $total,1) || "em"},
>> >>> >           $hyp
>> >>> >
>> >>> >          }
>> >>> >      }
>> >>> >      return
>> >>> >      <html encoding="UTF-8">
>> >>> >     <head>
>> >>> >         <meta http-equiv="Content-Type" content="text/html" charset="UTF-8" />
>> >>> >         <title>Orddelinger</title>
>> >>> >     </head>
>> >>> >     <body>{$div}
>> >>> >     </body>
>> >>> >     </html>
>> >>> >
>> >>> > };
>> >>
>> >>
>> >
>
>