There may be various solutions ;) A minimal code example that
demonstrates your procedure would be helpful here.
On Tue, Feb 16, 2021 at 3:21 PM Tim Thompson <timathom(a)gmail.com> wrote:
>
> Yes, the insert operations also involve lookups against a couple of full-text indexes (using ft:search()). The entries in the full-text indexes include corresponding unique identifiers (URIs). The query:
>
> - iterates through a set of library catalog records,
> - filters on certain elements,
> - runs full-text lookups on the text (name or subject strings),
> - inserts a new element containing the URI from a match in the index, and
> - logs the result of each lookup (using file:append-text-lines()).
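>
> Roughly, it looks something like this (a simplified sketch: 'catalog'
> and 'authorities' are placeholder database names, and the element
> structure is made up):
>
> for $name in db:open('catalog')//record/name
> (: look up each name string in the full-text index database :)
> let $hit := ft:search('authorities', string($name))[1]
> let $uri := $hit/ancestor::entry/@uri
> where $uri
> return (
>   insert node <uri>{ string($uri) }</uri> into $name,
>   (: mixing logging with updates may need the MIXUPDATES option :)
>   file:append-text-lines('lookups.log', string($name) || ' => ' || $uri)
> )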
>
>
> I'm guessing that the index access is what slows things down the most. What would be the approach for combining updates?
>
> Thank you,
> Tim
>
>
> --
> Tim A. Thompson
> Discovery Metadata Librarian
> Yale University Library
>
>
>
> On Tue, Feb 16, 2021 at 4:53 AM Christian Grün <christian.gruen(a)gmail.com> wrote:
>>
>> Hi Tim,
>>
>> > Is it possible to put updating expressions in a library module function (with the name of the database hard-coded) and then call the function from within jobs:eval() in a main module?
>>
>> Yes, it should be. Here’s a little example (create database 'x' first):
>>
>> lib.xqm:
>> module namespace lib = 'lib';
>>
>> (: sleeps for a second, then deletes all x elements from database 'x' :)
>> declare %updating function lib:delete() {
>>   prof:sleep(1000),
>>   delete node db:open('x')//x
>> };
>>
>> query.xq:
>> (: run the updating function in a separate job and inspect its status :)
>> let $id := jobs:eval("
>>   import module namespace lib = 'lib' at 'lib.xqm';
>>   lib:delete()
>> ")
>> return (prof:sleep(500), jobs:list-details($id))
>>
>> The query result looks something like this:
>>
>> <job id="job29" type="QueryJob" state="running" user="admin"
>> duration="PT0.498S" reads="(none)" writes="x"
>> time="2021-02-16T10:40:57.605+01:00">import module namespace lib='lib'
>> at 'lib.xqm'; lib:delete()</job>
>>
>> If the name of the database is found in the "writes" attribute, you’ll
>> know that only this database will be write-locked.
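>>
>> For example, a quick check along these lines (a sketch, reusing $id
>> from the query above):
>>
>> jobs:list-details($id)/@writes = 'x'
>>
>> …which yields true() if nothing but database 'x' is write-locked.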
>>
>> However, please note that, due to random file access patterns,
>> parallel writes are often slower than consecutive ones (even with
>> SSDs), so my guess is that you won’t save a lot.
>>
>> 48 hours sounds like a lot indeed. It’s usually much, much faster to
>> run a single XQuery expression that performs 1000 updates than to run
>> 1000 independent queries. Could you give us more information on the
>> insert operations you need to perform? Is there any chance to combine
>> updates?
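>>
>> To illustrate: instead of evaluating something like this once per
>> record (one transaction per query, with $id and $uri passed in each
>> time)…
>>
>> insert node <uri>{ $uri }</uri> into db:open('x')//record[@id = $id]
>>
>> …a single updating query collects all inserts in its pending update
>> list and applies them in one transaction. A sketch, with lib:lookup
>> standing in for your index lookup:
>>
>> for $rec in db:open('x')//record[name]
>> return insert node <uri>{ lib:lookup($rec/name) }</uri> into $rec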
>>
>> Best,
>> Christian