Hi all,
BaseX continues to impress. I had been under the (false) impression that a single database should contain all the data/files relevant to one particular application.
I have discovered that this kind of query, which crosses/joins data across databases, works:
for $c in basex:db("chaingang")//person[@key="ccc86809"]
let $links := basex:db("fasdb")//linkGrp[link=$c/@target]
return ($c, $links, basex:db("fasdb")//person[@key=($links//link/text())])
which gives me (as desired):
<person isVdl="yes" target="ai14506" type="ccc" key="ccc86809">
  <sources>
    <ccc id="ccc86809" n="86809">
      <isVDL>yes</isVDL>
      <aiRef>14506</aiRef>
      <drupalNodeId>86809</drupalNodeId>
    </ccc>
  </sources>
</person>
<linkGrp id="L016612" size="3" c31a="1" dlm="1" ai="1">
  <link type="dlm">dlm18216038</link>
  <link type="ai">ai14506</link>
  <link type="c31a">c31a31070390</link>
</linkGrp>
<person key= ... the 3 person records specified in the linkGrp.
Wow! I had no idea this kind of joining across BaseX databases was doable. Performance is great, about 64 ms. It means I can think about refactoring my rather large single db with 30 documents into far smaller, more manageable chunks which become updatable (i.e. the overhead of optimizing becomes tolerable).
Does anyone have any comments about this, and the pros and cons? Perhaps I should be thinking of a database for each record type/major document instead of "one big database". I can see a downside in that my queries get locked into an implementation-specific syntax, but I am so pleased with what BaseX full-text querying is giving me, and the general performance and clean design, that with it being open source, I'm happy to wear this risk.
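To make that concrete (a sketch only; the database names here are invented), the same query might then become:

for $c in basex:db("persons")//person[@key="ccc86809"]
let $links := basex:db("linkgrps")//linkGrp[link=$c/@target]
return ($c, $links, basex:db("persons")//person[@key=($links//link/text())])

i.e. one database per record type, with the join crossing databases exactly as before.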
Thoughts anyone?
Hoping this helps someone else discover what this marvellous piece of software is capable of.
Thanks again to the developers -- bravo!!
Cheers,
Sandra
Hi Sandra,
for $c in basex:db("chaingang")//person[@key="ccc86809"]
let $links := basex:db("fasdb")//linkGrp[link=$c/@target]
return ($c, $links, basex:db("fasdb")//person[@key=($links//link/text())])
True – it's no problem to access several databases within one query. And it should be no problem to split up your db into several smaller ones. It might even become mandatory if you reach the database limits; see the documentation for the current limits.
The only drawback you might come across is that the filesystem will cause trouble if too many open files have to be managed at the same time. 30 databases should be completely OK, though.
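As a rough sketch of how splitting up need not complicate your queries (the database names below are only placeholders), a query can simply iterate over a sequence of database names:

for $db in ("persons", "places", "events")
return basex:db($db)//person[@key = "ccc86809"]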
Some questions, just out of curiosity:
– how much XML data (MB/GB) do you currently work with?
– how much time does BaseX need in your context for update operations and the optimize command?
– have you already encountered performance limits in everyday use, or are you rather trying to prevent potential bottlenecks in the future?
In another real-life scenario that might be similar to yours, BaseX is used as the backend for a library database with 2 million titles (~1 GB of XML data). The process of updating the data and recreating the indexes is applied once a day/night and takes approx. 2 minutes.
Thanks again to the developers -- bravo!!
…and thanks for always giving instructive feedback! Christian
Hi Sandra,
[snip]
Some questions, just out of curiosity:
– how much XML data (MB/GB) do you currently work with?
Currently my largest db looks like this:

Size: 4012 MB
Nodes: 61552395
Height: 8
Input Size: 913 MB
Encoding: UTF-8
Documents: 30
Whitespace Chopping: ON
Entity Parsing: OFF

plus all indexes.
It's kind of a "pigpen" experimental database -- no design at all. I am currently putting together a smaller, properly designed "production" db intended for online public queries:
Size: 1067 MB
Nodes: 36282594
Documents: 23
I would anticipate the volume of data for researchers growing to 2-4 times that size over the next few years, with less growth on the public database.
– how much time does BaseX need in your context for update operations and the optimize command?
I've barely touched XQuery Update. I load up as needed from files, and add/delete documents as I need to. Optimising indexes takes a few minutes. The big problem with this is that online queries stop while this happens. We are transitioning from dev to production, so I have to solve this problem. I am thinking of doing nightly updates/rebuilds/reindexing on a separate VM (for big updates), then just pushing the BaseXData/db dir via rsync and stopping/restarting the query BaseX server with the new database in place of the old. That should not cause any disruption to users. I will use XQuery Update for the small volume of daily updates and have these update a separate small database which won't need indexes. Hence my excitement at being able to integrate these little changes into the larger online query database results.
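For what it's worth, here is roughly the shape of it (a sketch only; "daily" and its <additions> root element are invented names for the small delta database). The daily change is applied with XQuery Update:

insert node
  <person key="ccc99999" type="ccc"/>
into basex:db("daily")/additions

and the online query then unions results from the big read-only db and the delta db:

(basex:db("fasdb")//person[@key = "ccc99999"],
 basex:db("daily")//person[@key = "ccc99999"])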
– have you already encountered performance limits in everyday use,
Performance is very, very good with well-written online queries. Within a few weeks our search interface will go public and you can see for yourself -- I will let you know.
However, I have encountered rather bad problems with long and complex queries with a lot of output. With BaseX these jobs would just kind of "hang". I think there was an issue, as I recall, with its serialiser -- no output at all until the query completed, so I was running out of RAM. Saxon serialisation proved much better for this kind of work because I could track the progress of these jobs by tailing the output as they ran. My apologies for not reporting this at the time (2-3 months back); I should have done so. Since then I have noticed some BaseX changes in serialiser options, but I have not tried again to see if this problem is solved. It was very unfortunate, because I could not use the fuzzy/ft querying in these large person-name matching jobs, but I do make this facility available to our researchers in online queries.
I am struggling with a very strange memory bug right now when loading, but I suspect it could be an issue with my Perl client -- let me report that to you separately from this reply.
or are you rather trying to prevent potential bottlenecks in the future?
I like to prevent!
In another real-life scenario that might be similar to yours, BaseX is used as the backend for a library database with 2 million titles (~1 GB of XML data). The process of updating the data and recreating the indexes is applied once a day/night and takes approx. 2 minutes.
This also aligns with my experience. A complete reload/index build is about 3 minutes for us on an 8 GB VM on a Dell server. I give the JVM ~4 GB via -Xms and -Xmx. Not terribly painful, but as I mention above, I will need to pull a few tricks when we are running a public online search -- a 3-minute outage for updating is unacceptable.
…and thanks for always giving instructive feedback!
Hope this helps.
Sandra
Sandra,
sorry for the late feedback, and thanks for all details on your setup.
I am thinking of doing nightly updates/rebuilds/reindexing on a separate VM (for big updates), then just pushing the BaseXData/db dir via rsync and stopping/restarting the query BaseX server with the new database in place of the old.
Yes, that's a realistic scenario – and there are enough other BaseX use cases with similar challenges that we've already developed (still vague) plans to do something similar within BaseX. Snapshots of the database could be used for read-only queries while the updates are performed on the main database – and the snapshot is refreshed after the updates have been finalized. Still work in progress, though.
Performance is very, very good with well-written online queries. Within a few weeks our search interface will go public and you can see for yourself -- I will let you know.
…nice.
However, I have encountered rather bad problems with long and complex queries with a lot of output. With BaseX these jobs would just kind of "hang".
If you come across these again, just tell us.
This also aligns with my experience. A complete reload/index build is about 3 minutes for us on an 8 GB VM on a Dell server. I give the JVM ~4 GB via -Xms and -Xmx. Not terribly painful, but as I mention above, I will need to pull a few tricks when we are running a public online search -- a 3-minute outage for updating is unacceptable.
No doubt. A mirror of the database could indeed be the best solution, and as soon as our transactions are based on databases – and not processes – one BaseX instance will suffice to update one db while serving read-only queries on the other. We'll keep you informed (but it might take some time; the todo list is still long enough to keep us busy).
Best, Christian