Short question: Is it possible to write an XQuery FLWOR statement that can return a set of unique values present across multiple databases?

Long question: Our new website in development displays EAD finding aids stored across 45 databases in BaseX. I've built "facet" databases that index terms in the EADs from controlled vocabularies like subjects, places, personal names, etc. The indexes follow this structure, where each EAD node contains a unique identifier:

<terms type="subject">
  <term text="Literature" db="1">
    <ead>12345</ead>
    <ead>67890</ead>
  </term>
  <term text="Poetry" db="1">
    <ead>abcde</ead>
  </term>
  {etc.}
</terms>

In the search interface, users can select multiple facets to apply to one search. For example, they could browse database 12 for EADs with the subject "Literature" and the place "Oregon," etc.

I currently use the REST server to run an XQuery file that loops through each selected facet and prints all EAD IDs for each submitted term and database. After the results are returned, I use PHP to count occurrences of each EAD ID and print only those whose count matches the number of facets selected.

declare variable $d as xs:string external;
declare variable $f as xs:string external;
let $db_ids := tokenize($d, '\|')
return <facets>{
for $facet in tokenize($f, '\|')
  let $split := tokenize($facet, ':')
  let $facet_type := $split[1]
  let $facet_term := $split[2]
  let $facet_db := 'facet-' || $facet_type
  return <facet type="{$facet_type}" term="{$facet_term}">{
    for $ead in db:open($facet_db)/terms/term[@text=$facet_term and @db=$db_ids]/ead
      return $ead
  }</facet>
}</facets>

So in the hypothetical example above, I'd pass "12" as d (or multiple selected databases separated by bars) and "subject:Literature|geogname:Oregon" as f, and I'd get back a document like:

<facets>
  <facet type="subject" term="Literature">
    <ead>12345</ead>
    <ead>67890</ead>
  </facet>
  <facet type="geogname" term="Oregon">
    <ead>12345</ead>
  </facet>
</facets>

The count of "12345" will equal the count of the user's selected facets, so that result will be printed, but 67890 will not.

Is there a more efficient way to do this? I'd prefer the XQuery to return only the EADs that meet all criteria, so only 12345 would be returned because it's in facet-subject under Literature and in facet-geogname under "Oregon," and then I don't have to do any post-processing.
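
To make the goal concrete, here is a rough, untested sketch of the kind of query I have in mind. It moves the counting step from PHP into XQuery using the standard "group by" clause. The variable names $facets, $ids, and $key and the <eads> wrapper element are placeholders I made up for illustration. It assumes the same external variables and "facet-" database naming as my current query, and I don't know whether it would perform any better across all the databases.

```xquery
declare variable $d as xs:string external;
declare variable $f as xs:string external;

let $db_ids := tokenize($d, '\|')
let $facets := tokenize($f, '\|')
(: Collect EAD IDs per selected facet; distinct-values() ensures an ID
   counts at most once per facet, even if it appears under several
   matching <term> elements. :)
let $ids :=
  for $facet in $facets
  let $facet_type := substring-before($facet, ':')
  let $facet_term := substring-after($facet, ':')
  return distinct-values(
    db:open('facet-' || $facet_type)
      /terms/term[@text = $facet_term and @db = $db_ids]/ead
  )
return <eads>{
  (: After grouping, $id holds every occurrence of the ID in $key.
     Keep only the IDs that occurred once for every selected facet. :)
  for $id in $ids
  let $key := $id
  group by $key
  where count($id) = count($facets)
  return <ead>{ $key }</ead>
}</eads>
```

With d = "12" and f = "subject:Literature|geogname:Oregon" against the example data above, the intent is that only 12345 comes back, since 67890 appears under just one of the two facets.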

-Tamara

--

Tamara Marnell
IT Manager
Orbis Cascade Alliance (orbiscascade.org)
Pronouns: she/her/hers