Me again, sorry :)

Getting your hands on the optimization just became easier:
simply download the latest [snapshot]. No need to
hassle with GitHub.

[snapshot] http://files.basex.org/releases/latest/



On Wed, Oct 10, 2012 at 10:56 AM, Lukas Kircher <lukaskircher1@gmail.com> wrote:
Hi again - small addition:

1) to speed up the update process you just have to check
out the latest version of BaseX in our GitHub repository

2) to speed up the look-up of nodes you could experiment
with incremental index updates by setting the [UPDINDEX]
flag to true (please visit our [OPTIONS] documentation for
assistance)
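
A minimal sketch of point 2, assuming the database is recreated from
scratch. As far as I know, the UPDINDEX value is assigned to a database
when it is created, so setting it does not change an existing database
until it is rebuilt or fully optimized. These are BaseX commands (not
XQuery); the database name db_enum_test is taken from the example
further down in this thread:

```text
SET UPDINDEX true
CREATE DB db_enum_test
```

The first command turns on incremental index updates for databases
created afterwards. The second creates an empty database and replaces
an existing one of the same name, so the entries have to be added again
afterwards.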

If you have further questions just drop a note ...

Looking forward to your report!

Cheers,
Lukas


[UPDINDEX] http://docs.basex.org/wiki/Options#UPDINDEX
[OPTIONS] http://docs.basex.org/wiki/Options


On Wed, Oct 10, 2012 at 10:40 AM, Lukas Kircher <lukaskircher1@gmail.com> wrote:
Hi Tim,

By a lucky coincidence, we integrated a major optimization
for updates yesterday evening that should speed up
your scenario as well.

It would be nice if you could take a look at it and report your
experience. Just check it out at our [GitHub] repository.

Cheers,
Lukas





On Tue, Oct 9, 2012 at 3:05 PM, Tim Belschner <tim.belschner@ils.uni-stuttgart.de> wrote:

[BaseX 7.3 with the internal editor]

 

Hello,

 

As part of a query, I need to update more than ten thousand database entries with an additional attribute. For this, I thought about getting the node IDs with db:node-id(), storing them as attributes in an element sequence, and using them afterwards as identifiers for the “insert node” update. Unfortunately, it takes around 22 seconds to update all database entries. After some testing, I found two bottlenecks:

1. the “insert node …” command and

2. the “db:open-id()” command

Each of them takes around 10 seconds of processing time.

 

Is there a way to increase the update performance significantly?

Maybe by setting some options of BaseX or using a different approach?

 

Best regards

Tim

 

 

Here is some example code (it has no practical purpose and only demonstrates the current code for the updates):

----------------------------------CODE---------------------------------------------
(: create a list with the node IDs :)
declare function local:GetDBIDs ($seq as element()+) as element()+
{
  for $element in $seq
  let $DBID := db:node-id($element)
  return ( element {'Doublet'} { attribute{'DBNodeID'}{$DBID} } )
};

(: ##### MAIN PART ##### :)
declare variable $dbname := 'db_enum_test';

let $elements := DBSection
let $doubletIDs := element{'Doublets'}{local:GetDBIDs($elements/DatabaseEntry)}

for $element at $pos in $doubletIDs/child::*
return (
  insert node (attribute {'DB_ID'}{xs:string(xs:integer('100000') + $pos)} ) into db:open-id( $dbname, xs:integer(data($element/@DBNodeID)) )
)
----------------------------------CODE---------------------------------------------

 

and a query to create an example database:

----------------------------------DB---------------------------------------------
let $entries := for $i in (1 to 10000)
                return (
                  element {'DatabaseEntry'}{attribute {'id'}{xs:string($i)},
                           attribute {'word'}{concat('Scheduler_',xs:string($i*$i))},
                           attribute{'type'}{'INT32'}}
                )

let $entries2 := element  {'DBSection'}{$entries}
return ( db:add("db_enum_test", document { $entries2 }, "DB") )
----------------------------------DB---------------------------------------------

 


_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk