Hi all,
I have a query that makes use of the UCA collation with the "numeric" keyword [1] (e.g., for ensuring that "d2" sorts before "d12"), and I'm seeking a way to confirm that the icu4j library is available on the BaseX classpath before I run queries that depend on it. [2]
I normally install the icu4j library manually into BaseX's lib/custom directory as specified in [3], but I have sometimes forgotten to do so after, for example, Homebrew upgraded BaseX, and as a result my query relying on this collation began producing unexpected results. I've tried to use the XQuery 4 fn:collation-available function [4], but so far I've been unsuccessful.
Specifically, using BaseX 11.9 or 12 with icu4j-77.1.jar on macOS 15.5 (on Apple Silicon), the following expression returns the expected result:
```
("d2", "d12") => sort("http://www.w3.org/2013/collation/UCA?numeric=yes")
```
The result, as expected, is ("d12", "d2"), when the library is present and the opposite when it is absent.
But the following version, which introduces a conditional and the fn:collation-available function, always takes the "else" path:
```
if (collation-available("http://www.w3.org/2013/collation/UCA?numeric=yes", "compare")) then
  ("d2", "d12") => sort("http://www.w3.org/2013/collation/UCA?numeric=yes")
else
  "UCA collation not present"
```
I expected this expression to return ("d12", "d2") but instead received "UCA collation not present".
Am I misusing/misunderstanding how to use fn:collation-available?
Thanks! Joe
[1] https://www.w3.org/TR/xpath-functions-31/#uca-collations
[2] https://docs.basex.org/main/Full-Text#collations
[3] https://docs.basex.org/main/Startup#full_distributions
[4] https://docs.basex.org/12/Standard_Functions#fn:collation-available
Hi Joe,
Choosing collation URIs can be confusing. The spec says [1]:
“If the fallback parameter is omitted or takes the value yes [in the URI], and if the collation URI is well-formed according to the rules in this section, then the implementation must accept the collation URI, and should use the available collation that most closely reflects the user’s intentions.”
This means that BaseX falls back to a standard collation implementation if ICU is not present and the URI does not contain 'fallback=no'.
However, this does not fully explain the surprising behavior that you observed. It was caused by an incorrect interpretation of the 'compare' argument. I have improved this in the latest snapshot [2]. To be honest, this is still work in progress, as the spec offers little information on the details. The safest choice is to add the fallback behavior to the URI, and omit the second parameter:
let $uri := `http://www.w3.org/2013/collation/UCA?numeric=yes&fallback=no`
return if (collation-available($uri)) then (
  ("d2", "d12") => sort($uri)
) else (
  "UCA collation not present"
)
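If the query should stop rather than return a placeholder string, the same check can raise an error instead. This is only a minimal sketch, assuming the snapshot behavior described above; the error name and its namespace are made up for illustration and are not part of BaseX or the spec:

```xquery
(: Sketch only: the error QName ex:no-icu and its namespace are hypothetical. :)
let $uri := `http://www.w3.org/2013/collation/UCA?numeric=yes&fallback=no`
return
  if (collation-available($uri)) then
    (: the collation is reported as available without fallback :)
    ("d2", "d12") => sort($uri)
  else
    (: fail fast so that a missing icu4j jar cannot silently change the sort order :)
    error(
      QName("http://example.org/errors", "ex:no-icu"),
      "UCA collation not available; is icu4j on the BaseX classpath?"
    )
```

Raising an error makes a forgotten library visible immediately, at the cost of aborting the whole query.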
Hope this helps, Christian
[1] https://qt4cg.org/specifications/xpath-functions-40/Overview.html#uca-collat...
[2] https://files.basex.org/releases/latest/
Hi Christian,
Thank you very much! I had overlooked the information about the fallback parameter. Using the latest snapshot, I can confirm that your suggested code works.
Best, Joe
p.s. Apologies for the errors in my note. In my rush to hit send I incorrectly wrote in a couple of instances that the expected sort order with "numeric=yes" and the library present is ("d12", "d2"), whereas the correct sequence is actually ("d2", "d12").
p.p.s. I just noticed that the spec says to use a semicolon as the delimiter between parameters in the UCA collation URI. [1] It seems BaseX is forgiving and allows the use of ampersands.
[1] https://qt4cg.org/specifications/xpath-functions-40/Overview.html#uca-collat... (unchanged in this respect from 3.1, I see: https://www.w3.org/TR/xpath-functions-31/#uca-collations)
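For reference, the semicolon-delimited form of the same check might look like the sketch below. It is untested and assumes that BaseX treats semicolons the same way it treats ampersands. A side effect of this form is that the URI contains no "&", so an ordinary double-quoted string literal can be used:

```xquery
(: Sketch only: parameters separated by ";" as the spec describes. :)
let $uri := "http://www.w3.org/2013/collation/UCA?numeric=yes;fallback=no"
return
  if (collation-available($uri)) then
    ("d2", "d12") => sort($uri)
  else
    "UCA collation not present"
```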
On Mon, Jun 16, 2025 at 2:47 AM Christian Grün cg@basex.org wrote: