Dear all,
We provide you with a new and fresh version of BaseX, our open source
XML framework, database system and XQuery 3.1 processor:
https://basex.org/
Apart from our main focus (query rewritings and optimizations), we
have added the following enhancements:
XQUERY: MODULES, FEATURES
- Archive Module, archive:write: stream large archives to file
- SQL Module: support for more SQL types
- Full-Text Module, ft:thesaurus: perform Thesaurus queries
- Full-Text, fuzzy search: specify Levenshtein limit
- UNROLLLIMIT option: control limit for unrolling loops
XQUERY: JAVA BINDINGS
- Java objects of unknown type are wrapped into function items
- results of constructor calls are returned as function items
- the standard package "java.lang." has become optional
- array arguments can be specified with the middle dot notation
- conversion can be controlled with the WRAPJAVA option
- better support for XQuery arrays and maps
WEB APPLICATIONS
- RESTXQ: Server-Timing HTTP headers are attached to the response
For a more comprehensive list of added and updated features, look into
our documentation (docs.basex.org) and check out the GitHub issues
(github.com/BaseXdb/basex/issues).
Have fun,
Your BaseX Team
Hi,
The code [1] below, also sent as an attachment, generates an error message: “Static variable depends on itself: $Q{http://www.w3.org/2005/xquery-local-functions}test”.
I use these variables to refer to the private functions in my modules, so I can easily reference them in an inheritance situation.
It’s not a big problem for me, but I was wondering whether raising the error is justified or whether this should work.
[1]===========================================
declare variable $local:test := local:test#1;
declare %private function local:test($i) { if ($i > 0) then $local:test($i - 1) };
$local:test(10)
===========================================
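For what it’s worth, one way to break the cycle (an untested sketch) is to resolve the recursive call dynamically with fn:function-lookup, so that the function body no longer refers to the static variable; the trailing else () is only added for completeness:

declare variable $local:test := local:test#1;
declare %private function local:test($i) {
  (: resolving the call at run time removes the static dependency on $local:test :)
  if ($i > 0)
  then function-lookup(xs:QName('local:test'), 1)($i - 1)
  else ()
};
$local:test(10)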
Kind regards,
Rob Stapper
Fellow BaseX Users!
You might have heard that XML Prague 2022 will take place in June.
(Unless the then-prevalent Greek letter makes it impossible even in that
time of year, of course.)
I asked Christian whether the BaseX team would organize a user group
meeting, since one hasn’t taken place for years now. Christian didn’t seem
to be very fond of organizing such a meeting. I asked him whether he
would be available to present new features and the roadmap, and to hold a
Q&A session, if the users themselves organized such a meeting. He agreed, and
therefore I hereby ask the list members whether anyone will join me in
organizing this.
The plan looks as follows: we will apply for one or two 90-minute slots
via the CFP process (https://www.xmlprague.cz/cfp/). We don’t need a
fixed schedule by Dec. 20 (the currently announced end of the CFP; it
will be extended anyway).
Christian was so kind as to create a new repo, user-group, on GitHub. We
will use one or more of its Wiki pages [1] in order to plan the event.
The page will eventually evolve into an agenda if you agree.
Looking forward to meeting many of you in Prague in June.
If anyone else would like to volunteer as an organizer, please come
forward. Maybe we can also coordinate this in the Wiki [2].
Gerrit
[1] https://github.com/BaseXdb/user-group/wiki/2022-06-XML-Prague
[2] https://github.com/BaseXdb/user-group/wiki/Members
I’m setting up unit tests for my code that creates various custom indexes. I have content on the file system that serves as my known input. With that data I then need to create the content database and run the processes that create the various indexes over the content database.
Thus I need to create the databases, populate them, then verify the populated result.
As you can’t create a database and query it in the same XQuery, I don’t see a way to use %unit:before-module to initialize my databases before running unit tests in the same module.
The solution seems to be to use a BaseX script to do the database initialization, which seems easy enough:
# Run unit tests with fresh database
# Make sure the databases exist so we can then drop them
check pce-test-01
check _lrc_pce-test-01_link_records
# Drop them
drop db pce-test-01
drop db _lrc_pce-test-01_link_records
# Now create them fresh
check pce-test-01
check _lrc_pce-test-01_link_records
#
# Now run the tests that use these databases in the
# required order
#
test ./test-db-from-git.xqy
test ./test-link-record-keeping.xqy
However, when running this from the BaseX GUI, it appears that the test commands try to find files relative to the location of the basexgui command rather than relative to the script being run:
Resource "/Users/eliot.kimber/apps/basex/bin/test-db-from-git.xqy" not found.
I don’t see anything in the commands documentation that suggests a way to parameterize the values passed to commands.
Am I missing a way to have this kind of setup script be portable from within the GUI or is there a better/different way to initialize the databases for unit tests?
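For illustration, here is an untested sketch of doing the initialization in plain XQuery and running it as a separate step before the TEST commands (updates only become visible to later queries, and db:create replaces an existing database of the same name). The input directory is an assumption:

(: setup.xq – recreate the test databases from the known input on disk :)
let $content-source := '/path/to/known-input'  (: assumption: adjust to your checkout :)
return (
  db:create('pce-test-01', $content-source),
  db:create('_lrc_pce-test-01_link_records')
)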
Thanks,
E.
Dear BaseX people,
is this a bug?

basex "<foo>{namespace {''}{'bar'}}</foo>"
=> [XQDY0102] Duplicate namespace declaration: ''.

This works as expected:

basex "<foo>{namespace {'xxx'}{'bar'}}</foo>"
=> <foo xmlns:xxx="bar"/>

With kind regards,
Hans-Jürgen
I’ve worked out how to optimize my process that indexes DITA topics based on which top-level maps they are ultimately used from. It turned out I needed to first index the maps in reference-count order, from least to most, so that I could then just look up the top-level maps used by any direct-reference map that references a given topic; with that in place, each topic only requires a single index lookup.
However, on my laptop these lookups still take about 0.1 seconds per topic, so for thousands of topics it adds up to a long time (relatively speaking).
But the topic index process is 100% parallelizable, so I would be able to have at least 2 or 3 ingestion threads going on my 4-CPU server machine.
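As a rough illustration, here is an untested sketch that evaluates the per-topic work in parallel with xquery:fork-join; the database name and the local:index-topic helper are assumptions standing in for the real indexing code:

declare function local:index-topic($topic as element()) as map(*) {
  (: hypothetical stand-in for the real per-topic index computation :)
  map { base-uri($topic): string($topic/@id) }
};

let $topics := db:open('content-db')//*[contains-token(@class, 'topic/topic')]
let $partial-maps := xquery:fork-join(
  for $topic in $topics
  return function() { local:index-topic($topic) }
)
return map:merge($partial-maps)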
Note that my ingestion process is two-phased:
Phase 1: Construct an XQuery map with the index details for the input topics (the topics already exist in the database, only the index is new).
Phase 2: Persist the map to the database as XML elements.
I do the map construction both to take advantage of map:merge() and because it’s the only way I can index the DITA maps and topics in one transaction: build the doc-to-root-map for the DITA maps, then use that data to build the doc-to-root-map entries for all the topics, then persist the lot to the database for future use. This is in the context of a one-time mass load of content from a new git work tree. Subsequent changes to the content database will be to individual files, and the index can easily be updated incrementally.
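And a corresponding untested sketch of phase 2, persisting the in-memory map as XML elements for later retrieval; the database name, map shape, and target path are assumptions:

declare variable $doc-to-root-map as map(*) external :=
  map { 'topics/a.dita': ('maps/root-1.ditamap', 'maps/root-2.ditamap') };

db:add('content-db',
  <doc-to-root-map>{
    map:for-each($doc-to-root-map, function($doc, $root-maps) {
      <entry doc="{$doc}">{
        for $root-map in $root-maps return <root-map uri="{$root-map}"/>
      }</entry>
    })
  }</doc-to-root-map>,
  'index/doc-to-root-map.xml')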
So I’m just trying to optimize the startup time so that it doesn’t take two hours to load and index our typical content set.
I can also try to optimize the low-level operations, although they’re pretty simple so I don’t see much opportunity for significant improvement, but I also haven’t had time to try different options and measure them.
I must also say how useful the built-in unit testing framework is—that’s really made this work easier.
Cheers,
Eliot
I’m making good progress on our BaseX-based validation dashboard application.
The basic process here is we use Oxygen’s scripting support to do DITA map validation and then ingest the result into a database (along with the content documents that were validated) and provide reports on the validation result.
The practical challenge here seems to be running the Oxygen process successfully from BaseX: because our content is so large, the Oxygen process can take tens of minutes to run.
I set the command timeout to be much longer than the process should need, but when running it from the HTTP app’s query panel it eventually failed with an error that wasn’t a timeout (my code had earlier reported legitimate errors, so I know errors are properly reported).
As soon as the Oxygen process ends I want to ingest the resulting XML file, which is why I started with doing it from within BaseX.
But I’m wondering whether this is a bad idea and whether I should really be doing it externally, e.g. with a shell script run via cron or some such.
I was trying to keep everything in BaseX as much as possible just to keep it simple.
Any general tips or cautions for this kind of integration of BaseX with the outside world?
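For reference, an untested sketch of one way to drive this from inside BaseX: invoke the Oxygen script through the Process Module and ingest the report afterwards. All paths, the database name, and the 'timeout' option (in seconds) are assumptions to check against your setup and BaseX version:

let $result := proc:execute(
  '/opt/oxygen/scripts/validate.sh',
  ('/data/dita/main.ditamap', '/data/reports/validation.xml'),
  map { 'timeout': 3600 }
)
return
  (: ingest the generated report only if the external process succeeded :)
  if ($result/code = '0')
  then db:add('validation-dashboard', doc('/data/reports/validation.xml'), 'reports/latest.xml')
  else error(xs:QName('local:oxygen-failed'), string($result/error))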
Thanks,
E.
I guess you’re right.
We haven’t revised EXPath packaging for a long time now. Actually, I’m
not sure how many people use it at all ;) Anyone out there?
On Mon, Jan 24, 2022 at 4:08 PM Eliot Kimber
<eliot.kimber(a)servicenow.com> wrote:
>
> I did confirm that if the package @name URI matches the URI of a module, then the module is resolved, i.e.:
>
> <package xmlns="http://expath.org/ns/pkg"
> name="http://servicenow.com/xquery/module/now-dita-utils"
> abbrev="now-xquery"
> version="0.1" spec="1.0">
> <title>XQuery modules for ServiceNow Product Content processing and support</title>
> <xquery>
> <namespace>http://servicenow.com/xquery/module/database-from-git</namespace>
> <file>database-from-git.xqm</file>
> </xquery>
>
> …
>
> </package>
>
> That implies that each module needs to be in a separate XAR file in order to also be in a separate namespace.
>
> I don’t think that is consistent with the EXPath packaging spec.
>
> In the description of the XQuery entry it says:
>
> An XQuery library module is referenced by its namespace URI. Thus the xquery element associates a namespace URI to an XQuery file. An importing module just need to use an import statement of the form import module namespace xx = "<namespace-uri>";.
> An XQuery main module is associated a public URI. Usually an XQuery package will provide functions through library modules, but in some cases one can want to provide main modules as well.
> This implies to me that the value of the <namespace> element in the <xquery> is what should be used to resolve the package reference, not the package’s @name value, which simply serves to identify the package.
>
> Is my analysis correct or have I misunderstood the package mechanism?
>
> Cheers,
> E.
I have large maps that include nodes as entry values. I want to persist these maps in the DB for subsequent retrieval.
I believe the best/only strategy is:
1. Construct a new map where each node in a value is replaced by its node-id() value.
2. Serialize the map to JSON and store as a blob
To reconstitute it, do the reverse (parse JSON into map, replace node-ids() with nodes).
Is my analysis correct? Have I missed any detail or better approach?
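For reference, an untested sketch of the node-id swap, assuming the nodes all live in one known database and each map value is a single node; db:node-id/db:open-id do the node-to-id mapping, and the id-based map can then be serialized with method 'json' and stored as a resource. It may also be worth checking how stable the ids remain across optimize or re-import operations:

declare variable $db external := 'content-db';  (: assumption: database holding the nodes :)

declare function local:to-storable($map as map(*)) as map(*) {
  map:merge(map:for-each($map, function($key, $value) {
    map:entry($key,
      if ($value instance of node()) then db:node-id($value) else $value)
  }))
};

declare function local:from-storable($map as map(*)) as map(*) {
  map:merge(map:for-each($map, function($key, $value) {
    map:entry($key,
      if ($value instance of xs:integer) then db:open-id($db, $value) else $value)
  }))
};

(: round trip on a trivial map, using the database root element as the node value :)
local:from-storable(local:to-storable(map { 'doc': db:open($db)[1]/* }))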
Thanks,
Eliot
I’m packaging some XQuery modules into an EXPath XAR archive and then installing them using the repo install command.
The command succeeds and my package is listed by repo list, but I’m unable to get the modules to import, and I can’t see where I’ve gone wrong. I must have made some non-obvious (to me) error, but so far I have not found it.
I’ve checked my expath-pkg.xml against the working examples I have (functx and Schematron BaseX) and I don’t see any difference. I’ve also carefully checked all my namespace URIs to make sure they match.
Here is my expath-pkg.xml:
<package xmlns="http://expath.org/ns/pkg"
name="http://servicenow.com/xquery"
abbrev="now-xquery"
version="0.1"
spec="1.0">
<title>XQuery modules for ServiceNow Product Content processing and support</title>
<xquery>
<namespace>http://servicenow.com/xquery/module/database-from-git</namespace>
<file>database-from-git.xqm</file>
</xquery>
<xquery>
<namespace>http://servicenow.com/xquery/module/now-dita-utils</namespace>
<file>now-dita-utils.xqm</file>
</xquery>
<xquery>
<namespace>http://servicenow.com/xquery/module/now-relpath-utils</namespace>
<file>now-relpath-utils.xqm</file>
</xquery>
</package>
And here is the structure of the resulting module directory after installation:
main % ls ~/apps/basex/repo/http-servicenow.com-xquery-0.1
content expath-pkg.xml
main % ls ~/apps/basex/repo/http-servicenow.com-xquery-0.1/content
database-from-git.xqm now-dita-utils.xqm now-relpath-utils.xqm
main %
And the prolog for now-relpath-utils.xqm:
module namespace relpath="http://servicenow.com/xquery/module/now-relpath-utils";
Trying to import in a trivial XQuery script, e.g.:
import module namespace relpath="http://servicenow.com/xquery/module/now-relpath-utils";
count(/*)
Produces this error:
Error:
Stopped at /Users/eliot.kimber/git/dita-build-tools/src/main/xquery/file, 1/88:
[XQST0059] Module not found: http://servicenow.com/xquery/module/now-relpath-utils.
So clearly something isn’t hooked up correctly but I don’t see any obvious breakage or violation of a required convention.
Any idea where I’ve gone wrong or tips on debugging the resolution failure?
This is 9.6.4 on macOS.
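For what it’s worth, one untested way to narrow the problem down is to import the installed module with an explicit location hint, bypassing the repository resolution; the path below is built from the repo listing above (adjust it if your repo directory differs):

import module namespace relpath="http://servicenow.com/xquery/module/now-relpath-utils"
  at "/Users/eliot.kimber/apps/basex/repo/http-servicenow.com-xquery-0.1/content/now-relpath-utils.xqm";
count(/*)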
Thanks,
Eliot