There may be various solutions ;) A minimized code example that demonstrates your procedure could be helpful here.
On Tue, Feb 16, 2021 at 3:21 PM Tim Thompson timathom@gmail.com wrote:
Yes, the insert operations also involve lookups against a couple of full-text indexes (using ft:search()). The entries in the full-text indexes include corresponding unique identifiers (URIs). The query:
- iterates through a set of library catalog records
- filters on certain elements
- runs full-text lookups on the text (which are name or subject strings)
- inserts a new element containing the URI from a match in the index
- logs the result of each lookup (using file:append-text-lines())
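A minimal sketch of that per-record flow might look like the following. All element names, database names, and the @uri attribute are hypothetical placeholders, not from the thread; mixing updates with file:append-text-lines() also assumes the MIXUPDATES option (or equivalent handling) is enabled:

```xquery
(: sketch: annotate catalog records with URIs found in a full-text index :)
for $rec in db:open('catalog')//record
let $name := string($rec/name)
(: ft:search returns matching text nodes from the 'names' database :)
let $hit  := ft:search('names', $name)[1]
where exists($hit)
return (
  (: copy the identifier from the matched index entry into the record :)
  insert node <uri>{ string($hit/ancestor::entry/@uri) }</uri> into $rec,
  (: log each successful lookup :)
  file:append-text-lines('matches.log', $name || ' -> ' || $hit/ancestor::entry/@uri)
)
```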
I'm guessing that the index access is what slows things down the most. What would be the approach for combining updates?
Thank you, Tim
-- Tim A. Thompson Discovery Metadata Librarian Yale University Library
On Tue, Feb 16, 2021 at 4:53 AM Christian Grün christian.gruen@gmail.com wrote:
Hi Tim,
Is it possible to put updating expressions in a library module function (with the name of the database hard coded) and then call the function from within jobs:eval() in a main module?
Yes, it should be. Here’s a little example (create database 'x' first):
lib.xqm:

module namespace lib = 'lib';

declare %updating function lib:delete() {
  prof:sleep(1000),
  delete node db:open('x')//x
};
query.xq:

let $id := jobs:eval("
  import module namespace lib='lib' at 'lib.xqm';
  lib:delete()
")
return (prof:sleep(500), jobs:list-details($id))
The query result looks something like this:
<job id="job29" type="QueryJob" state="running" user="admin" duration="PT0.498S" reads="(none)" writes="x" time="2021-02-16T10:40:57.605+01:00">import module namespace lib='lib' at 'lib.xqm'; lib:delete()</job>
If the name of the database is found in the "writes" attribute, you’ll know that only this database will be write-locked.
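That check can also be done programmatically, building on the example above (a sketch; it assumes the job is still running when jobs:list-details() is called):

```xquery
(: true if the job write-locks only database 'x' :)
let $id := jobs:eval("
  import module namespace lib='lib' at 'lib.xqm';
  lib:delete()
")
return (prof:sleep(500), jobs:list-details($id)/@writes = 'x')
```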
However, please note that, due to random file access patterns, parallel writes are often slower than consecutive ones (even with SSDs), so my guess is that you won’t save a lot.
48 hours sounds like a lot indeed. It’s usually much, much faster to run a single XQuery expression that performs 1000 updates than to run 1000 independent queries. Could you give us more information on the insert operations you need to perform? Is there any chance to combine updates?
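As an illustration of what "combining updates" could mean here (database and element names are made up): instead of issuing one query or jobs:eval() call per record, a single updating expression collects all inserts into one pending update list, so they are applied in one write transaction:

```xquery
(: one query, one transaction: all inserts are applied together :)
for $rec at $i in db:open('x')//record
return insert node <seq>{ $i }</seq> into $rec
```

The same pattern applies to inserts driven by full-text lookups: gather all matches first, then let the single query's pending update list perform every insert at once.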
Best, Christian
basex-talk@mailman.uni-konstanz.de