Thank you, Martin! I'm new to XQuery and didn't know about higher-order functions like fold-left. With a little tweaking, this is perfect for my application:

declare variable $d as xs:string external;
declare variable $f as xs:string external;
declare function local:get_eads($facet as xs:string, $db_ids as item()+) as item()* {
  let $split := tokenize($facet, ':')
  return db:open('facet-' || $split[1])/terms/term[@text=$split[2] and @db=$db_ids]/ead
};
let $db_ids := tokenize($d, '\|')
let $facets := tokenize($f, '\|')
let $eads :=
  fold-left(
    $facets,
    local:get_eads(head($facets), $db_ids),
    function($all_eads, $facet) {
      let $facet_eads := local:get_eads($facet, $db_ids)
      let $eads_in_both := distinct-values($all_eads[.=$facet_eads])
      return $eads_in_both
    }
  )
return <eads>{
  for $ead in $eads
    return <ead>{$ead}</ead>
}</eads>
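
For anyone who wants to try the same intersection without the facet databases, here is a minimal sketch against inline sample data. The $terms variable and local:sample_eads are hypothetical stand-ins for the db:open calls above, and the sample IDs are the ones from my original question. It is also a slight variation on my query rather than a copy of it: it starts the fold from the first facet and walks only the remaining ones.

```xquery
(: Sketch only: $terms and local:sample_eads are hypothetical stand-ins
   for the real facet databases opened with db:open. :)
declare variable $terms := (
  <terms type="subject">
    <term text="Literature" db="12"><ead>12345</ead><ead>67890</ead></term>
  </terms>,
  <terms type="geogname">
    <term text="Oregon" db="12"><ead>12345</ead></term>
  </terms>
);

(: Return the EAD IDs (as strings) listed under one "type:term" facet. :)
declare function local:sample_eads($facet as xs:string, $db_ids as xs:string*) as xs:string* {
  let $split := tokenize($facet, ':')
  return $terms[@type = $split[1]]/term[@text = $split[2] and @db = $db_ids]/ead/string()
};

let $db_ids := ('12')
let $facets := ('subject:Literature', 'geogname:Oregon')
return fold-left(
  tail($facets),                                                (: remaining facets :)
  distinct-values(local:sample_eads(head($facets), $db_ids)),   (: start from the first facet :)
  function($all_eads, $facet) {
    (: keep only the IDs that also appear under the next facet :)
    $all_eads[. = local:sample_eads($facet, $db_ids)]
  }
)
```

With this sample data the result should be the single string "12345", since 67890 appears under "Literature" but not under "Oregon".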

On Thu, Aug 12, 2021 at 11:27 PM Martin Honnen <martin.honnen@gmx.de> wrote:


On 13.08.2021 at 00:12, Tamara Marnell wrote:
Short question: Is it possible to write an XQuery FLWOR statement that can return a set of unique values present across multiple databases?

Long question: Our new website in development displays EAD finding aids stored across 45 databases in BaseX. I've built "facet" databases that index terms in the EADs from controlled vocabularies like subjects, places, personal names, etc. The indexes follow this structure, where each EAD node contains a unique identifier:

<terms type="subject">
  <term text="Literature" db="1">
    <ead>12345</ead>
    <ead>67890</ead>
  </term>
  <term text="Poetry" db="1">
    <ead>abcde</ead>
  </term>
  {etc.}
</terms>

In the search interface, users can select multiple facets to apply to one search. For example, they could browse database 12 for EADs with the subject "Literature" and the place "Oregon," etc.

I currently use the REST server to run an XQuery file that loops through each selected facet and prints all EAD IDs for each submitted term and database. Then, after the results are returned, I use PHP to count occurrences of each EAD and print an EAD only if its count matches the number of facets used.

declare variable $d as xs:string external;
declare variable $f as xs:string external;
let $db_ids := tokenize($d, '\|')
return <facets>{
for $facet in tokenize($f, '\|')
  let $split := tokenize($facet, ':')
  let $facet_type := $split[1]
  let $facet_term := $split[2]
  let $facet_db := 'facet-' || $facet_type
  return <facet type="{$facet_type}" term="{$facet_term}">{
    for $ead in db:open($facet_db)/terms/term[@text=$facet_term and @db=$db_ids]/ead
      return $ead
  }</facet>
}</facets>

So in the hypothetical example above, I'd pass "12" as d (or multiple selected databases separated by bars) and "subject:Literature|geogname:Oregon" as f, and I'd get back a document like:

<facets>
  <facet type="subject" term="Literature">
    <ead>12345</ead>
    <ead>67890</ead>
  </facet>
  <facet type="geogname" term="Oregon">
    <ead>12345</ead>
  </facet>
</facets>

The count of "12345" will equal the count of the user's selected facets, so that result will be printed, but 67890 will not.

Is there a more efficient way to do this? I'd prefer the XQuery to return only the EADs that meet all criteria, so only 12345 would be returned because it's in facet-subject under Literature and in facet-geogname under "Oregon," and then I don't have to do any post-processing.


I think you can use fold-left to reduce the found eads while selecting them:



let $db_ids := tokenize($d, '\|')
return
    <facets>{
        let $facet-maps :=
          fold-left(
            for $facet in tokenize($f, '\|')
              let $split := tokenize($facet, ':')
              let $facet_type := $split[1]
              let $facet_term := $split[2]
              let $facet_db := 'facet-' || $facet_type
            return 
              map:merge(
                  for $ead in db:open($facet_db)/terms/term[@text=$facet_term and @db=$db_ids]/ead
                  return map:entry(string($ead), map { 'node' : $ead, 'type' : $facet_type, 'term' : $facet_term })
                  ,
                  map { 'duplicates' : 'combine' }
              )
           ,
           map{},
           function($ams, $m) {
               for $m1 in $ams
               return map:remove($m1, map:keys($m1)[not(. = map:keys($m))]),
               $m
           }
          )
        return
            for $m in $facet-maps[exists(map:keys(.))]
            let $ead1 := $m?*[1]
            return
                <facet type="{$ead1?type}" term="{$ead1?term}">
                {
                    $m?*?node
                }
                </facet>
    }</facets>
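
To see the pruning step from the fold function in isolation, here is a minimal sketch with two made-up maps. $subject and $place are hypothetical sample data, not taken from the facet databases, and the map values are only placeholders.

```xquery
(: Sketch only: $subject and $place are made-up sample maps keyed by EAD ID. :)
let $subject := map { '12345' : 'Literature', '67890' : 'Literature' }
let $place   := map { '12345' : 'Oregon' }
return map:keys(
  (: drop every key of $subject that is not also a key of $place :)
  map:remove($subject, map:keys($subject)[not(. = map:keys($place))])
)
```

For these two maps only the key "12345" should remain, because "67890" has no entry in $place.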



--

Tamara Marnell
IT Manager
Orbis Cascade Alliance (orbiscascade.org)
Pronouns: she/her/hers