..and a last one: I'm glad to tell you that we've just added the ft:count() function to our full-text module:
http://docs.basex.org/wiki/Full-Text_Functions#ft:count
Now you can either count the total number of occurrences of search terms in your documents:
let $terms := ('gauguin', 'pissarro')
return ft:count(//*[text() contains text { $terms }])
...or return the count for each individual hit, e.g. as follows:
for $doc in doc('test')
let $terms := ('gauguin', 'pissarro')
for $hit score $s in $doc//*[text() contains text { $terms }]
let $c := ft:count($hit[text() contains text { $terms }])
return <hit score="{ $s }" count="{ $c }">{ $hit }</hit>
Please download the latest snapshot of BaseX to get the new feature working:
http://files.basex.org/releases/latest/
Christian
___________________________
On Mon, May 23, 2011 at 12:47 AM, Christian Grün christian.gruen@gmail.com wrote:
Wiard,
Leo just gave me a hint that our ft:mark() function can be used as well to count the number of occurrences of terms in a full-text query. I hope that the following example gives you some clue:
for $doc in doc('test')
let $terms := ('gauguin', 'pissarro')
for $hit score $s in $doc//*[text() contains text { $terms }]
let $c := count(ft:mark($hit[text() contains text { $terms }])/mark)
return <hit score="{ $s }" count="{ $c }">{ $hit }</hit>
Note that, in this case, the contains text expression must be specified twice.
Christian
___________________________
On Mon, May 23, 2011 at 12:32 AM, Christian Grün christian.gruen@gmail.com wrote:
Dear Wiard,
> for $n score $s in $doc//*[text() contains text {'gauguin','pissarro'} all]
> return <hit score='{ $s }'>{ $n, count($n) }</hit> }
> </document>
> In my count($n) I get only the number 1.
True, that doesn't make sense. I'm sorry, but it's not possible (at least for now) to count the number of occurrences of search terms in a single hit. It could make sense, however, to extend our own set of full-text functions..
http://docs.basex.org/wiki/Full-Text_Functions
..with a new function that counts the number of full-text matches; something like:
ft:count ( $node[ . contains text { "a", "b" } ] )
Suggestions from everyone are welcome.
Christian
_____________________
On Sun, May 22, 2011 at 5:40 PM, Wiard Vasen wiard.vasen@gmail.com wrote:
Hi Christian,
This weekend I tried to write a query that counts the number of times a certain combination of terms occurs in my repository of XML files. I didn't succeed in finding a way to count the number of occurrences. Could you help me with this? I have attached the result file from the query to this mail.
As you said in your last mail, the letters that were found have to be put in a sequence. After that, the items in the sequence have to be counted. I was thinking of a method like:
let $sequence := ( a method to get the items from the result)
let $count := count($sequence)
return <results>
  <count>{$count}</count>
  <items>{
    for $item in $sequence
    return <item>{$item}</item>
  }</items>
</results>
let $range := 1 to 800
for $doc in collection('brievenvangogh')
let $uri := base-uri($doc),
    $num := substring($uri, string-length($uri) - 6, 3)
where $num castable as xs:integer and xs:integer($num) = $range
return <document uri='{$uri}'>{
  for $n score $s in $doc//*[text() contains text {'gauguin','pissarro'} all]
  return <hit score='{ $s }'>{ $n, count($n) }</hit>
}</document>
In my count($n) I get only the number 1.