As an update, after rebuilding the database with a text index and a full-text index (no language, no stemming, keep diacritics) and restarting the server:

BaseX 8.1.1 [Server]
Server was started (port: 29084)
[main] INFO org.eclipse.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:8984
HTTP Server was started (port: 8984)

RESTXQ: Norwegian characters are converted when the full-text index is used; switching to the text index makes the query take forever.
REST: Both the full-text index and the text index work as expected (same as running in the GUI for both).
It looks as if the index structure is treated differently.
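One way to narrow this down without the database is to compare the diacritics settings on an inline element. The following is a minimal sketch, assuming XQuery Full Text support as in BaseX; the inline hyphens element mirrors the structure used in the script quoted further down, and its content is made up for illustration. Without an explicit option, "contains text" matches diacritics insensitively, so whether "fore" matches "føre" depends on whether the processor folds "ø" to "o" (the RESTXQ output in this thread suggests that it does in this setup).

```xquery
(: Hypothetical inline data; element names follow the hyphen-data script in this thread. :)
let $entry :=
  <hyphens>
    <full>føre</full>
    <first>fø</first>
    <second>re</second>
  </hyphens>
return (
  (: default match options: diacritics insensitive :)
  exists($entry[full contains text "fore"]),
  (: explicit option: "fore" and "føre" are now different tokens :)
  exists($entry[full contains text "fore" using diacritics sensitive]),
  (: the exact spelling still matches under the sensitive option :)
  exists($entry[full contains text "føre" using diacritics sensitive])
)
```

If the same three expressions return different results under REST and RESTXQ, the difference lies in query evaluation rather than in the index that was built.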
2015-05-18 15:07 GMT+02:00 Lars Johnsen yoonsen@gmail.com:
The full text query is blisteringly fast for both, the text index query is fast only for REST queries and seems not to be used with queries in RESTXQ. I am rebuilding the whole database now to see how it goes, and will restart everything for a new assessment.
2015-05-18 15:00 GMT+02:00 Christian Grün christian.gruen@gmail.com:
> However, when using text index instead of full text the results are the same for both, except that RESTXQ takes almost forever
What about the original query: Has it been slow as well, or do you think this is a new problem?
2015-05-18 14:28 GMT+02:00 Christian Grün christian.gruen@gmail.com:
It could be that your URL is decoded incorrectly. What happens if you run the following function with REST and RESTXQ and "føre" as word?
declare
  %rest:path("/test/encoding/{$word}")
function page:test-encoding($word) {
  string-to-codepoints($word)
};
Thanks, Christian
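For reference when reading the result of that function: a correctly decoded "føre" has the code points 102, 248, 114, 101. If the UTF-8 bytes of "ø" (C3 B8) were read as Latin-1 instead, the result would be 102, 195, 184, 114, 101. A minimal sketch of that comparison follows; the helper name local:decoded-ok is chosen only for illustration and is not part of the scripts in this thread.

```xquery
(: Hypothetical helper: true if $word has exactly the code points of "føre". :)
declare function local:decoded-ok($word as xs:string) as xs:boolean {
  deep-equal(string-to-codepoints($word), (102, 248, 114, 101))
};

(: Strings are built from code points so the check does not depend on
   the encoding of the query file itself. :)
local:decoded-ok(codepoints-to-string((102, 248, 114, 101))),      (: correctly decoded :)
local:decoded-ok(codepoints-to-string((102, 195, 184, 114, 101)))  (: UTF-8 read as Latin-1 :)
```

The first call returns true and the second false, so the same check applied to the RESTXQ path parameter would show whether the word arrives intact.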
string-to-codepoints()
REST output (first two lines):
føre
fø - re 219

RESTXQ output (first two lines):
føre
fo - re 123

The first word quoted is "føre" in both cases and is what the scripts see, so the full-text query receives the same word in both cases. Could it be that within RESTXQ the full-text index is treated differently?
I will work further on a self-contained example, but thought this might point to something.
Cheers Lars
2015-05-18 13:44 GMT+02:00 Lars Johnsen yoonsen@gmail.com:
Hi Christian, and thanks for the fast response. The latest version, 8.1.1, is in use (same behaviour as the previous one). Let me see if I can make a self-contained example.
best, Lars
2015-05-18 13:40 GMT+02:00 Christian Grün <christian.gruen@gmail.com>:
> Hi Lars,
>
> hm, that's difficult to tell. All I can say is that this sounds
> unusual, so I'm coming up with my standard questions: Do you think you
> could build us a little example that allows us to reproduce the
> problem? Have you tried the latest version of BaseX?
>
> Best,
> Christian
>
>
> On Mon, May 18, 2015 at 1:35 PM, Lars Johnsen yoonsen@gmail.com wrote:
> >
> > I am running a web script in two identical versions (identical as in
> > "cut and paste"), one via RESTXQ and one via REST. The response is
> > different, and I wondered what may be the trouble.
> >
> > For example the output (the URLs only work locally) for
> > http://ljohnsen:8984/hyphens/mellom
> > is the same as
> > http://ljohnsen:8984/rest?run=hyphen-show.xq&word=mellom
> >
> > which is a set of hyphenation data:
> > mellom
> > mel - lom 17005
> > Mel - lom 144
> > mel - lom. 50
> >
> > but if "mellom" is exchanged with "nasjonalbiblioteket" only the REST
> > version shows any result, which then is the same as I get
> > experimenting in the GUI.
> >
> > The actual script is added below. It runs in both versions
> > (identical apart from the rest and restxq interfaces) and uses full
> > text search, but results differ when run under the REST-regime.
> >
> > All the best
> > Lars G Johnsen
> > National Library of Norway
> >
> > module namespace page = 'http://basex.org/modules/web-page';
> >
> > declare
> >   %rest:path("/hyphens/{$word}")
> >   %output:method("html")
> >
> > function page:show-hyphens($word) {
> >   let $db := db:open('hyphen-data')
> >   let $hyphens := for $hyp in $db/hyphens/hyphens[full contains text {$word}]
> >     group by $first := $hyp/first, $second := $hyp/second
> >     let $count := count($hyp)
> >     order by xs:int($count) descending
> >     return element p {
> >       attribute freq {$count},
> >       $first, " - ", $second, $count
> >     }
> >
> >   let $total := sum($hyphens//@freq)
> >   let $div := element div {
> >     element p {$word},
> >     for $hyp in $hyphens
> >     return element div {
> >       attribute class {"hyph"},
> >       attribute style {"font-size:", 1 + round(xs:int($hyp//@freq/data()) div $total, 1) || "em"},
> >       $hyp
> >     }
> >   }
> >   return
> >     <html encoding="UTF-8">
> >       <head>
> >         <meta http-equiv="Content-Type" content="text/html" charset="UTF-8" />
> >         <title>Orddelinger</title>
> >       </head>
> >       <body>{$div}
> >       </body>
> >     </html>
> >
> > };