I am struggling to use the "serialize" XPath 3.1 function as specified by the W3C with BaseX 9.1.2:
1) Is there any way to supply the "use-character-maps" parameter the XPath 3.1 way, i.e. as a map inside the second argument to "serialize"? For example:
serialize("a,b", map { 'method' : 'text', 'use-character-maps' : map { "'" : "&apos;" }})
I get an error saying [SEPM0017] Invalid 'use-character-maps' value 'map { "'": "'" }'; must be a string.
2) When I try the XPath 3.0 way of providing the serialization parameters as an XML element I also get an error:
When I run
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "text";
serialize(
  ( <text>'</text>, "'", map { "Ä":"'C4", "Ö":"'D6", "Ü":"'DC", "ß":"'DF", "ä":"'E4", "ö":"'F6", "ü":"'FC" } ),
  <output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
    <output:method value="adaptive"/>
    <output:use-character-maps>
      <output:character-map character="'" map-string="&apos;"/>
    </output:use-character-maps>
  </output:serialization-parameters>
)
in BaseX it tells me [SEPM0017] Character map 'character-map=,' is not defined.
Saxon 9.9 and Altova 2018 run that example without giving an error (although not with the same serialization result, but that is a different issue).
Hi Martin,
> - Is there any way to provide the "use-character-maps" parameter
> supplied the XPath 3.1 way with a second argument to "serialize" as a map: serialize("a,b", map { 'method' : 'text', 'use-character-maps' : map { "'" : "&apos;" }})
In BaseX, values of the fn:serialize map argument are restricted to strings (this still differs from the requirements of the official specification). As character map handling is a special case, I have extended our code, and your query should now be evaluated as expected.
With the current official release of BaseX, it is actually possible to supply character maps as a comma-separated list:
serialize('ab', map { 'use-character-maps': 'a=A,b=B' })
We haven’t documented this feature, as it’s not compliant with the specification, and we haven’t considered all corner cases (supplying delimiter characters as keys or values, etc.).
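For comparison, here is a minimal sketch of the two forms discussed above. The first call uses the nested map described in the XPath 3.1 specification, which according to this message needs the fixed snapshot in BaseX. The second uses the undocumented comma-separated string, which is BaseX-specific and should not be expected to work in other processors. The sample input and the mappings are made up for illustration.

```xquery
(: Spec-style form: 'use-character-maps' is a map from single
   characters to their replacement strings. :)
serialize('ab', map {
  'use-character-maps': map { 'a': 'A', 'b': 'B' }
}),
(: BaseX-only, undocumented form: a comma-separated list of
   character=replacement pairs, as in the example above. :)
serialize('ab', map {
  'use-character-maps': 'a=A,b=B'
})
```

Because the string form gives "," and "=" a special meaning, the nested map is the safer choice whenever one of those characters has to be a key or a replacement.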
> - When I try the XPath 3.0 way of providing the serialization
> parameters as an XML element I also get an error:
Until now, character maps were only correctly parsed when specified in parameter documents in the query prolog. I have fixed this as well.
> Saxon 9.9 and Altova 2018 run that example without giving an error (although not with the same serialization result, but that is a different issue).
This might be due to the somewhat fuzzy rule in the spec, which states that the "use-character-maps parameter is directly applicable to the Adaptive output method only as elsewhere specified." [1]
Feel free to try the latest snapshot [2].
Best Christian
[1] https://www.w3.org/TR/xslt-xquery-serialization-31/#ADAPTIVE_USE-CHARACTER-M...
[2] http://files.basex.org/releases/latest/
Hi Christian,
On 02.04.2019 14:36, Christian Grün wrote:
>> - Is there any way to provide the "use-character-maps" parameter
>> supplied the XPath 3.1 way with a second argument to "serialize" as a map: serialize("a,b", map { 'method' : 'text', 'use-character-maps' : map { "'" : "&apos;" }})
> In BaseX, values of the fn:serialize map argument are restricted to strings (this still differs from the requirements of the official specification).
Are there plans to remove the restriction to string values? It is hard to write standard XQuery 3.1 when one implementation (BaseX) wants map { 'indent' : 'yes' } and another, like Saxon, wants map { 'indent' : true() }.
> As character map handling is a special case, I have extended our code, and your query should now be evaluated as expected.
Thanks, works fine in the snapshot.
>> - When I try the XPath 3.0 way of providing the serialization
>> parameters as an XML element I also get an error:
> Until now, character maps were only correctly parsed when specified in parameter documents in the query prolog. I have fixed this as well.
Great.
>> Saxon 9.9 and Altova 2018 run that example without giving an error (although not with the same serialization result, but that is a different issue).
> This might be due to the somewhat fuzzy rule in the spec, which states that the "use-character-maps parameter is directly applicable to the Adaptive output method only as elsewhere specified." [1]
What's your view of adaptive serialization of map(xs:string,xs:string) and a character map, should the character map be applied to the map's key and value string values?
It seems BaseX doesn't do that.
The spec gives the sentence you cited but also "Character maps are applied (a) when nodes are serialized using the XML output method, and (b) to any value represented as a string enclosed in quotation marks".
I am not sure whether that is meant to be applied to string values inside maps.
> Are there plans to remove the restrictions on string arguments? It is hard writing standard XQuery 3.1 when one implementation (BaseX) wants map { 'indent' : 'yes' } and another like Saxon map { 'indent' : true() }
Yes, I guess so. The choices that have been made for the data types of the map arguments are pretty specific (for example, 'version' must be of type xs:string, whereas 'html-version' must be a decimal). As a result, correct parsing of the serialization parameters would take many lines of code that would, in addition, be incompatible with our existing solution for parsing map arguments of built-in XQuery functions. I’m still hesitant to give it a try.
But I could possibly make parsing more liberal and e.g. convert booleans to strings instead of rejecting them. Something like map { 'indent' : true() } or map { 'html-version' : '5' } could then be successfully parsed.
> What's your view of adaptive serialization of map(xs:string,xs:string) and a character map, should the character map be applied to the map's key and value string values?
My general view on the adaptive serialization method is that I would have loved it to become what the name implies (adaptive: best fit for the type that’s to be serialized). Instead, it has more or less become a debugging method, and many details have eventually been left to the implementation. As a consequence, we don’t have real use cases for this method, but if it’s used for debugging, it might be better if character maps were ignored, because they could easily create output that’s difficult to reuse.
On 02.04.2019 22:37, Christian Grün wrote:
>> What's your view of adaptive serialization of map(xs:string,xs:string) and a character map, should the character map be applied to the map's key and value string values?
> My general view on the adaptive serialization method is that I would have loved it to become what the name implies (adaptive: best fit for the type that’s to be serialized). Instead, it has more or less become a debugging method, and many details have eventually been left to the implementation. As a consequence, we don’t have real use cases for this method, but if it’s used for debugging, it might be better if character maps were ignored, because they could easily create output that’s difficult to reuse.
For XSLT 3 (which shares serialization methods with XQuery) Michael Kay has committed https://github.com/w3c/xslt30-test/commit/21685399a560de67391707cbf5cbff536b... to the XSLT test suite where the defined test results suggest that character maps are to be taken into account for strings with the adaptive output method. And Saxon 9.9's AdaptiveEmitter https://saxonica.plan.io/issues/4187#note-7 has been "adapted" to use a character map "when outputting strings as atomic values". Guess that means that Saxon 9.9 in the next maintenance release will use that code also for XQuery serialization.
Thanks for the pointer, I’ll consider that.
Out of interest: What are your use cases for the adaptive serialization method?
On 05.04.2019 17:50, Christian Grün wrote:
> Out of interest: What are your use cases for the adaptive serialization method?
The question about the character map support arose in the context of trying to create an XPath 3.1 character map itself in XSLT and serialize it in a way that it could be inserted into an XSLT attribute with XPath code.
Hi Martin,
I have relaxed the type checking of map arguments. Among other things, booleans will be treated like the string values 'yes' and 'no', and function calls like the following one will now be evaluated as expected:
serialize(1, map { 'indent': true() })
A new snapshot is online. Adaptive serialization is still on my mental queue (I’ll have to decide if we should apply the same rules here for the 'adaptive' and the 'basex' serialization method).
Have fun, Christian
basex-talk@mailman.uni-konstanz.de