Hi Steve,
We have XQuery scripts that create custom indexes as BaseX databases called facet-subject, facet-geogname, etc. that look like:
<terms>
  <term text="{subject/geogname/etc. here}">
    <ark>80444/xv12345</ark>
    <ark>80444/xv67890</ark>
  </term>
</terms>
Then, when a user submits a search, we pass the ARKs of the results to another XQuery that pulls the matching facet terms from those indexes, ordered by count descending. In the query below, $a is the ARKs separated by bars (|), $n is the facet names (subject, geogname, etc.) separated by bars, and $m is the maximum number of terms to return per facet. The query builds each database name from the facet name as facet-{name}-prod.
(: Get facet terms for ARKs from the production indexes :)
declare variable $a as xs:string external;
declare variable $n as xs:string external;
declare variable $m as xs:integer external;

<facets>
{
  let $arks := tokenize($a, '\|')
  let $names := tokenize($n, '\|')
  for $name in $names
  let $facet_db := 'facet-' || $name || '-prod'
  (: Count the matching ARKs per term, most frequent first :)
  let $sorted_terms := <terms>{
    for $term in db:open($facet_db)/terms/term[ark/text()=$arks]
    group by $text := $term/@text
    let $count := count($term/ark[text()=$arks])
    order by $count descending
    return <term text="{$text}" count="{$count}"/>
  }</terms>
  (: Keep only the top $m terms for this facet :)
  return <facet type="{$name}">{
    subsequence($sorted_terms/term, 1, $m)
  }</facet>
}
</facets>
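If you want to try the counting logic without our databases, here is a minimal standalone sketch of the same idea. It is not one of our production scripts: the inline $index and its term texts (Logging, Railroads, Mining) are made-up sample data standing in for one facet database, and $a and $m are hard-coded instead of external.

```xquery
xquery version "3.1";

(: Hypothetical sample data standing in for one facet database :)
declare variable $index :=
  <terms>
    <term text="Logging">
      <ark>80444/xv12345</ark>
      <ark>80444/xv67890</ark>
    </term>
    <term text="Railroads">
      <ark>80444/xv12345</ark>
    </term>
    <term text="Mining">
      <ark>80444/xv00000</ark>
    </term>
  </terms>;

(: Same bar-separated format as $a and same role as $m, hard-coded here :)
declare variable $a as xs:string := '80444/xv12345|80444/xv67890';
declare variable $m as xs:integer := 2;

let $arks := tokenize($a, '\|')
let $sorted_terms :=
  (: Only terms that list at least one of the result ARKs :)
  for $term in $index/term[ark = $arks]
  group by $text := $term/@text
  (: Number of result ARKs that carry this term :)
  let $count := count($term/ark[. = $arks])
  order by $count descending
  return <term text="{$text}" count="{$count}"/>
(: Keep only the top $m terms :)
return <facet type="subject">{ subsequence($sorted_terms, 1, $m) }</facet>
```

With this sample data the result should be a single facet element containing Logging with count 2, followed by Railroads with count 1. Mining is left out because neither of the result ARKs appears under it.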
Let me know if you'd like more examples, such as the XQuery scripts that create the facets from our repository databases in bulk and from individual EADs.
-Tamara