Hi Mansi,
- Most of my xqueries are of below nature
'/Archives/descendant::apiCalls[contains(@name,"com.sun")]/@name', where apiCalls could be 3-4 level under 'Archives'. Xqueries are accessed via REST
The existing index structures won't allow you to look for arbitrary substrings; see [1] for more information.
You are right, the full-text index may be a possible way out. Prefix searches can be realized via the "using wildcards" option [2]:
//*[text() contains text "abc.*" using wildcards]
Please note that the query string will always be "tokenized": if you are looking for "com.sun", you will also get results like "COM SUN!".
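To see the tokenization effect in isolation, the following sketch applies "contains text" to two string literals instead of database nodes. The sample strings are made up for illustration. Since the search string "com.sun" is split into the tokens "com" and "sun", both comparisons are expected to match:

```xquery
(: Illustrative strings only; no database is needed for this check. :)
(: The search string is tokenized, so punctuation and case are ignored. :)
'com.sun.tools' contains text 'com.sun',
'COM SUN!' contains text 'com.sun'
```

If such matches are too loose for your use case, you may need an additional filter (e.g. contains()) on the results returned by the full-text search.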
- I have 1000s of documents, spanning over 100 XML DB, with total space
around 400 GB currently. Each query is taking roughly 30 mins, to run.
My concern is that I am enabling attribute indexing at each DB update, but the INFO command at the BaseX prompt tells me otherwise. Am I misreading something? Is there a way to fix this once the DB is created? It takes me 48 hours to create the DBs from scratch... :)
If UPDINDEX and AUTOOPTIMIZE are false, you will need to call "OPTIMIZE" after your updates.
If you create a new database, you can set UPDINDEX and AUTOOPTIMIZE to true. However, AUTOOPTIMIZE will get incredibly slow if you are working with gigabytes of XML data.
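For an existing database, the index structures can also be rebuilt from XQuery. This is a minimal sketch, assuming a recent BaseX version in which db:optimize accepts an options map. 'db1' is a placeholder name, and the chosen options are only an illustration:

```xquery
(: Assumption: recent BaseX version; 'db1' is a placeholder database name. :)
(: The second argument (true) requests a full optimization (OPTIMIZE ALL). :)
db:optimize('db1', true(), map { 'attrindex': true(), 'ftindex': true() })
```

A full optimization rewrites the database, so it is worth trying this on a small instance first.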
Reading through the UPDINDEX and AUTOOPTIMIZE ALL commands tells me that I have to open each DB and run these commands. Is that my only option? Is there an XQuery script somewhere that I can use to do this?
If your databases are called "db1" ... "db100", the following XQuery script will optimize all those databases:
for $i in 1 to 100 return db:optimize('db' || $i)
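If your databases do not follow a fixed naming scheme, a variant of this script can iterate over db:list(), which returns the names of all databases. Note that this sketch assumes that every database of your BaseX instance should be optimized:

```xquery
(: Assumption: all databases of this BaseX instance are to be optimized. :)
for $name in db:list()
return db:optimize($name)
```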
You can also create a command script [3] with XQuery:
<commands>{ for $i in 1 to 100 return ( <open name="{ 'db' || $i }"/>, <optimize/> ) }</commands>
You can store the result as a .bxs file and run it afterwards.
Before you create all index structures, you should probably run your queries on some smaller database instances and check out the "Query Info" panel in the GUI. It will tell you if an index is used or not.
Best, Christian
[1] http://docs.basex.org/wiki/Indexes#Value_Indexes
[2] http://docs.basex.org/wiki/Full-Text#Match_Options
[3] http://docs.basex.org/wiki/Commands#Command_Scripts