Jonathan,
I had exactly the same question not too long ago. What Christian helped me understand is that, for persistent data, you need to use XML to represent your map.
You can do that in a generic way (i.e., serialize the map to XML in a generic way) or you can construct specific markup for the data you’re indexing and then use normal XQuery features to query on it.
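As a rough illustration of the generic approach, here is a minimal sketch. The function name local:map-to-xml and the <map>/<entry>/<value> markup are made up for this example, and it assumes the map values are atomic items or nodes (nested maps and arrays would need extra handling):

```xquery
(: Hypothetical helper: serializes a flat map into generic XML markup.
   The element names are illustrative, not a BaseX convention. :)
declare function local:map-to-xml($map as map(*)) as element(map) {
  <map>{
    map:for-each($map, function($key, $value) {
      <entry key="{ $key }">{
        (: one <value> per item, so multi-item values survive the round trip :)
        for $item in $value
        return <value>{ string($item) }</value>
      }</entry>
    })
  }</map>
};

local:map-to-xml(map { 'one': 1, 'two': (2, 'deux') })
```

The resulting element could then be stored with db:create, as in Christian's example further down. The order of the <entry> elements is implementation-dependent, because map entries are unordered.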
What I found is that I tend to use maps to construct the data for the index (for example, taking advantage of the duplicates: combine option of map:merge) and then serialize the result into my own index markup.
For retrieval, there’s no particular advantage at the XQuery level to having maps as opposed to XML whose structure is designed for the index lookups you need.
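To make that concrete, here is a small sketch of the build-with-maps, query-as-XML pattern. The word list and the <index>/<entry>/<item> markup are invented for the example; only map:merge and its duplicates option are standard XQuery 3.1:

```xquery
let $words := ('apple', 'avocado', 'banana', 'blueberry', 'cherry')

(: Build the index data as a map; 'combine' keeps all values per key :)
let $byInitial := map:merge(
  for $w in $words
  return map:entry(substring($w, 1, 1), $w),
  map { 'duplicates': 'combine' }
)

(: Serialize the map into index-specific markup (illustrative names) :)
let $index :=
  <index>{
    for $key in map:keys($byInitial)
    order by $key
    return
      <entry key="{ $key }">{
        $byInitial($key) ! <item>{ . }</item>
      }</entry>
  }</index>

(: Retrieval is a plain path expression; no map needed :)
return $index/entry[@key = 'b']/item ! string()
```

Here the lookup returns the two words filed under "b". In a real setup the <index> element would be stored in a database, so that lookups on @key can use the attribute index.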
One thing I did have to fully understand was how best to translate map values that were node lists into the equivalent XML structures. My solution, which I think is the intended best BaseX practice, was to store the node-id values for the nodes. I implemented a little utility function to dereference my node-id-capturing elements to their nodes. The process seems to be very fast.
Here’s my resolveNodeRefs function:
(:~
 : Resolves BaseX node IDs back into nodes.
 : @param elems Elements that either contain <noderef> elements or are themselves <noderef> elements.
 :        Each <noderef> names its source database in @database and the node ID in @node-id.
 : @return Sequence of resolved nodes, grouped by source database
 :)
declare function linkrk:resolveNodeRefs($elems as element()*) as node()* {
  (: let $debug := (prof:dump('linkrk:resolveNodeRefs(): Input elems: '), prof:dump($elems)) :)
  let $nodeRefs as element(noderef)* := ($elems/self::noderef | $elems//noderef)
  let $databaseNames as xs:string* := distinct-values($nodeRefs/@database ! string(.))
  let $nodes as node()* :=
    (: Resolve all node IDs for one database with a single db:open-id() call :)
    for $database in $databaseNames
    let $nodeRefsForDb as element(noderef)* := $nodeRefs[@database eq $database]
    let $node-ids as xs:integer* := $nodeRefsForDb/@node-id ! xs:integer(.)
    return
      try {
        (: prof:dump('linkrk:resolveNodeRefs(): Calling db:open-id() with node IDs: ' || $node-ids => string-join(', ')), :)
        db:open-id($database, $node-ids)
      } catch * {
        (: Log the failure; the nodes for this database are skipped :)
        prof:dump('linkrk:resolveNodeRefs(): Failed to resolve node references: ' || $err:description)
      }
  return $nodes
};
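For completeness, the other direction (capturing nodes as <noderef> elements) might look roughly like the sketch below. The helper name local:makeNodeRefs is hypothetical and not part of my actual code; db:name and db:node-id are the BaseX functions that report a node's database and its node ID, and 'words' is the example database from the quoted messages below:

```xquery
(: Hypothetical helper: records each database node as a <noderef> element
   carrying the two attributes that resolveNodeRefs() reads. :)
declare function local:makeNodeRefs($nodes as node()*) as element(noderef)* {
  for $node in $nodes
  return <noderef database="{ db:name($node) }" node-id="{ db:node-id($node) }"/>
};

(: Assumes the 'words' database exists; db:open is the BaseX 9 name of the function :)
<entry key="three">{
  local:makeNodeRefs(db:open('words')//word[@n = '3'])
}</entry>
```

The nodes passed in have to come from a database, since in-memory nodes have no database name. Node IDs, unlike pre values, stay the same when the database is updated, which is why they are the better choice for stored references.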
Cheers,
E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
From: BaseX-Talk <basex-talk-bounces@mailman.uni-konstanz.de> on behalf of Jonathan Robie <jonathan.robie@gmail.com>
Date: Thursday, March 24, 2022 at 1:10 PM
To: Christian Grün <christian.gruen@gmail.com>
Cc: BaseX <basex-talk@mailman.uni-konstanz.de>
Subject: Re: [basex-talk] Storing maps in BaseX
Hi Christian,
I think the database you describe looks like this:
<words>
<word n="1">one</word>
<word n="2">two</word>
<word n="3">three</word>
<word n="4">four</word>
<word n="5">five</word>
<word n="6">six</word>
<word n="7">seven</word>
<word n="8">eight</word>
<word n="9">nine</word>
<word n="10">ten</word>
</words>
So I always use XML for this? There's no persistent representation of the map data structure?
Jonathan
On Thu, Mar 24, 2022 at 2:02 PM Christian Grün <christian.gruen@gmail.com> wrote:
Hi Jonathan,
You can create databases that contain key/value pairs:
let $words := <words>{
for $i in 1 to 10
return <word n='{ $i }'>{ format-integer($i, 'w') }</word>
}</words>
return db:create('words', $words, 'words.xml')
If you look up values in that database …
for $n in (1 to 10) ! string()
return db:open('words')//word[@n = $n] ! data()
… the attribute index will be utilized, and your query will be rewritten as follows:
(1 to 10) ! data(db:attribute("words", string(.))/
self::attribute(n)/parent::word)
If you don’t want to rely on the rewritings of the query optimizer,
you can directly use db:attribute.
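A direct call might look like the following sketch, which assumes the 'words' database created above and uses the three-argument form of db:attribute (database, strings to look up, attribute name):

```xquery
(: Look up each number via the attribute index, restricted to @n attributes,
   then step up to the parent <word> and return its text :)
for $n in (1 to 10) ! string()
return db:attribute('words', $n, 'n')/parent::word ! data()
```

Passing the attribute name as the third argument takes the place of the self::attribute(n) step in the rewritten query.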
Best,
Christian