Hello Christian, Gerrit, Liam, Graydon,

Is it possible to use a different XML Catalog Resolver with BaseX? I’m referring specifically to the new XML resolver that Norm Tovey-Walsh presented today at Declarative Amsterdam. The presentation recording is at https://www.youtube.com/watch?v=LBuqQG8io8k&ab_channel=DeclarativeAmsterdam and the resolver is available at https://xmlresolver.org/ and https://github.com/xmlresolver/xmlresolver/.

I haven’t yet had a chance to try Norm’s new XML resolver or the BaseX 10 snapshot.

However, I have also run into the limitation Gerrit mentioned about xslt:transform() not using an XML Catalog, and have used workarounds to preprocess the XML before calling xslt:transform().

Regarding useful options, the two things that I usually want to configure (apart from the contents of catalog.xml) are the location of the catalog.xml file(s) and the logging verbosity. Being able to configure the catalog in a map parameter or a startup parameter seems like a useful addition to the existing methods (pragma, option, .basex, etc.).

Kind regards,

Vincent

_____________________________________________

Vincent M. Lizzi

Head of Information Standards | Taylor & Francis Group

vincent.lizzi@taylorandfrancis.com

From: BaseX-Talk <basex-talk-bounces@mailman.uni-konstanz.de> On Behalf Of Christian Grün
Sent: Friday, November 5, 2021 8:28 AM
To: Imsieke, Gerrit, le-tex <gerrit.imsieke@le-tex.de>
Cc: BaseX <basex-talk@mailman.uni-konstanz.de>
Subject: Re: [basex-talk] specifying the processor for xslt:transform()

With BaseX 10, which will be based on JDK 11, we’ll switch to the
built-in JDK Catalog Resolver [1], which tends to get good reviews,
and which allows for a much cleaner and more consistent integration.
Debugging should be easier as well, as errors will always be reported
back if the catalog resolution fails.

We are thinking about replacing the CATFILE option…

1. Option:
CATFILE: path/to/catalog.xml

2. or XQuery:
fetch:xml('file.xml', map { 'catfile': 'path/to/catalog.xml' })

…with a new CATALOG option that takes multiple keys and values:

1. Option:
CATALOG: files=path/to/catalog.xml,resolve=strict,prefer=public,defer=false

2. or XQuery:
fetch:xml('file.xml', map { 'catalog': map {
'files': 'path/to/catalog.xml',
'resolve': 'strict',
'prefer': 'public',
'defer': false()
}})

An alternative would be to completely drop the catalog options and
assign all catalog options via system properties at startup:

java -Djavax.xml.catalog.files=path/to/catalog.xml .... BaseX
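For reference, the keys above (files, resolve, prefer, defer) match the feature names of the JDK's javax.xml.catalog API. The following is only a minimal sketch of how such settings can be passed to the JDK resolver programmatically; it does not show how BaseX 10 wires them up internally. The class name CatalogSketch, the helper methods, and the example.org identifiers are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogException;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogFeatures.Feature;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of BaseX.
public class CatalogSketch {

  // Builds a JDK catalog resolver with explicit settings:
  // prefer=public, defer=false, resolve=strict.
  static CatalogResolver strictResolver(URI... catalogFiles) {
    CatalogFeatures features = CatalogFeatures.builder()
        .with(Feature.PREFER, "public")
        .with(Feature.DEFER, "false")
        .with(Feature.RESOLVE, "strict")
        .build();
    return CatalogManager.catalogResolver(features, catalogFiles);
  }

  // Writes a one-entry sample catalog to a temporary file (assumed content).
  static Path writeSampleCatalog() throws IOException {
    Path catalog = Files.createTempFile("catalog", ".xml");
    Files.writeString(catalog,
        "<?xml version=\"1.0\"?>\n"
        + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
        + "  <system systemId=\"http://example.org/acme.dtd\""
        + " uri=\"file:///tmp/acme.dtd\"/>\n"
        + "</catalog>\n");
    return catalog;
  }

  public static void main(String[] args) throws IOException {
    CatalogResolver resolver = strictResolver(writeSampleCatalog().toUri());

    // A system identifier listed in the catalog is mapped to the local URI.
    InputSource source = resolver.resolveEntity(null, "http://example.org/acme.dtd");
    System.out.println("resolved to: " + source.getSystemId());

    // With resolve=strict, an unmatched identifier raises a CatalogException
    // instead of silently falling through.
    try {
      resolver.resolveEntity(null, "http://example.org/unknown.dtd");
    } catch (CatalogException e) {
      System.out.println("not resolved: " + e.getMessage());
    }
  }
}
```

With resolve=strict, a failed lookup surfaces as an exception, which is the behavior that makes catalog problems easier to debug; the values continue and ignore are the more lenient alternatives.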

I’d love to get your feedback on these ideas, and your experiences
with an early BaseX 10 snapshot [2]!
Christian

[1] https://docs.oracle.com/en/java/javase/11/core/xml-catalog-api1.html#GUID-96D2C9AC-641A-4BDB-BB08-9FA04358A6F4
[2] https://files.basex.org/releases/latest-10/


On Fri, Nov 5, 2021 at 9:03 AM Imsieke, Gerrit, le-tex
<gerrit.imsieke@le-tex.de> wrote:
>
>
>
> On 05.11.2021 03:03, Liam R. E. Quin wrote:
> > On Thu, 2021-11-04 at 18:43 -0400, Graydon Saunders wrote:
>
> >> Related to this, setting the catalog for use by xslt:transform() is
> >> defeating me.
> >
> > The only ways I have found to debug these are
> > (1) with strace -f, to make sure the file is being read
> > (2) with a CatalogManager.properties file [[
> > verbosity=65535
> > # relative-catalogs=false
> > prefer = public
> > catalogs=mycataloguefile.xml
> > ]]
> >
> > Likely you need entries in the catalog file starting with file:///
> >
> > If you are uploading queries to a BaseX server, remember it's the
> > server that needs to have had CLASSPATH set when starting, and that
> > relative URIs like "catalog.xml" might be sought for in the server's
> > directory.
> >
> > Liam
>
> Liam and Christian thankfully added support for resolving
> include/import URIs and doc(…) URIs about two years ago [1]. A thing that
> I recently found was lacking is resolution of system identifiers that
> occur in documents. That is, if there is a reference to a DTD in a
> document that is read during the transformation, the catalog resolution
> does not apply to the public or system identifiers.
>
> Is this the issue that you are encountering, Graydon?
>
> Your first argument to xslt:transform is db:open('acme_content')[1].
> Does this document have a DOCTYPE declaration? I’d have guessed that the
> DOCTYPE declaration was stripped away when the documents were loaded
> into the DB, that is, parsing with the DTD only happened during import.
> But maybe this is different if you use the internal parser.
>
> Gerrit
>
> [1] https://github.com/BaseXdb/basex/issues/1719