Hi Mansi,
> 1. Most of my xqueries are of below nature
>
> '/Archives/descendant::apiCalls[contains(@name,"com.sun")]/@name', where
> apiCalls could be 3-4 level under 'Archives'. Xqueries are accessed via REST
The existing index structures won’t allow you to look for arbitrary
substrings; see [1] for more information.
You are right, the full-text index may be a possible way out. Prefix
searches can be realized via the "using wildcards" option [2]:
//*[text() contains text "abc.*" using wildcards]
Please note that the query string will always be "tokenized": if you
are looking for "com.sun", you will also get results like "COM SUN!".
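Applied to your attribute query, a tokenized phrase search could look roughly like the sketch below. It assumes that "contains text" is evaluated directly on the @name attribute; whether the full-text index is actually used for attribute values is something you would need to verify in the query info. Note that it matches the tokens "com" and "sun" anywhere in the value, not only at the start.

```xquery
(: phrase search on the tokenized attribute value; also matches "COM SUN!" :)
/Archives/descendant::apiCalls[@name contains text "com sun"]/@name
```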
> 2. I have 1000s of documents, spanning over 100 XML DB, with total space
> around 400 GB currently. Each query is taking roughly 30 mins, to run.
>
> My concern is, at each DB update, I am using attribute indexing, but info
> command on basex prompt tells me otherwise. Am I misreading something ? Is
> there a way to fix this once DB is created ? Its takes me 48 hours, to
> create DBs from scratch... :)
If UPDINDEX and AUTOOPTIMIZE are false, you will need to call
"OPTIMIZE" after your updates.
If you create a new database, you can set UPDINDEX and AUTOOPTIMIZE to
true. However, AUTOOPTIMIZE will get incredibly slow if you are
working with gigabytes of XML data.
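As a rough sketch, the options can also be passed when a database is created via XQuery. The database name 'db1' and the input path '/path/to/xml' are placeholders, and I am assuming the lower-case option names are accepted by db:create in your version:

```xquery
(: sketch: 'db1' and '/path/to/xml' are placeholders :)
db:create(
  'db1',
  '/path/to/xml',
  (),
  map { 'attrindex': true(), 'updindex': true() }
)
```

AUTOOPTIMIZE is left out here on purpose because of the slowdown mentioned above.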
> Reading thru UPDINDEX and AUTOOPTIMIZE ALL commands, tells me to open each
> DB and run these commands. Is that my option ? Do we have a xquery script
> somewhere which I can use to do this ?
If your databases are called "db1" ... "db100", the following XQuery
script will optimize all those databases:
  for $i in 1 to 100
  return db:optimize('db' || $i)
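If the names do not follow such a pattern, a variant that iterates over all existing databases might look like this. It assumes that every database returned by db:list() should be optimized:

```xquery
(: assumes every database on this server should be optimized :)
for $name in db:list()
(: true() requests a full optimization, comparable to OPTIMIZE ALL :)
return db:optimize($name, true())
```

The full optimization rebuilds the whole database, so it may take a long time with your data sizes; omit the second argument if you only want the index structures to be refreshed.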
You can also create a command script [3] with XQuery:
  <commands>{
    for $i in 1 to 100
    return (
      <open>{ 'db' || $i }</open>,
      <optimize/>
    )
  }</commands>
You can store the result as a .bxs file and run it afterwards.
Before you create all index structures, you should probably run your
queries on some smaller database instances and check out the "Query
Info" panel in the GUI. It will tell you if an index is used or not.
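To see which databases have an attribute index at all, something like the following read-only query may help. It is a sketch only: the element names "indexes", "attrindex" and "uptodate" are what I would expect in the db:info output, so please compare them with what your version returns:

```xquery
(: lists the attribute index status per database; element names are assumptions :)
for $name in db:list()
let $indexes := db:info($name)//indexes
return $name || ': attrindex=' || $indexes/attrindex
  || ', uptodate=' || $indexes/uptodate
```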
Best,
Christian
[1] http://docs.basex.org/wiki/Indexes#Value_Indexes
[2] http://docs.basex.org/wiki/Full-Text#Match_Options
[3] http://docs.basex.org/wiki/Commands#Command_Scripts