Hi Lukas,
Yes, I have a query that pre-selects the letters from Arles.
I do this by specifying the numbers of the letters that were written in Arles: every letter whose number falls in the range 577 to 771 is from Arles.
let $range := 577 to 771
for $doc in collection('brievenvangogh')
let $uri := base-uri($doc),
    $num := substring($uri, string-length($uri) - 6, 3)
where $num castable as xs:integer and xs:integer($num) = $range
return <document uri='{$uri}'>{
  for $n score $s in $doc//*[text() contains text 'gauguin']
  return <hit score='{$s}'>{ $n }</hit>
}</document>
Do you see a way to extend the following line of the query so that it searches for both terms?
for $n score $s in $doc//*[text() contains text 'gauguin']
Something like:
for $n score $s in $doc//*[text() contains text 'gauguin' AND contains text 'pissarro']
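As far as I can tell from the XQuery Full Text specification, the conjunction operator is ftand rather than AND, so the combined condition might be sketched as follows. This is untested against the brievenvangogh database, and it assumes both names have to occur in the same text node; if they may sit in different text nodes of one element, `. contains text` would probably be the better test.

```xquery
(: Sketch only: 'ftand' requires both terms to match in the same
   search-context item, which here is a single text node. :)
for $doc in collection('brievenvangogh')
for $n score $s in $doc//*[text() contains text 'gauguin' ftand 'pissarro']
return <hit score='{$s}'>{ $n }</hit>
```

The same predicate should drop into the inner loop of the range-restricted query above without further changes.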
Regards,
Wiard
2011/5/20 Christian Grün christian.gruen@gmail.com
Wiard,
Given the letters from Arles, how many letters have the term 'Gauguin'? Or: Given the letters from Arles, how many letters have the term 'Gauguin AND Pissarro'?
It depends on the input data. Did you manage to write a query that pre-selects the letters from Arles?
Can you see from the query, printed in blue, whether it gives the tf/idf score? While creating the database I checked the tf/idf score in the full-text search option.
You'll have to check the query info to see if the tf/idf scoring is used. If the compilation steps include something like "Applying full-text index", you can be sure that tf/idf is used for scoring. Otherwise, the default scoring (which often yields better results for XML documents) is used.
Hope this helps, Christian