Dear BaseX team,
there is something I don't understand about the parsing/serializing of JSON.
When I serialize a document without options, slashes are escaped. So for example ...
    <url>https://help.openconnectors.ext.hana.ondemand.com/home/google-analytics</url>
becomes
    "url":"https:\/\/help.openconnectors.ext.hana.ondemand.com\/home\/google-analytics"
Only using serialization options - map{"escape": "no"} - do I get the output as expected, corresponding to the input.
Why are slashes by default escaped when serializing JSON?
Kind regards, Hans-Jürgen
PS: When parsing, the slashes are not escaped, so it is "asymmetrical" - I need to parse without options, yet to serialize with options.
Hi Hans-Jürgen,
Maybe this thread and the linked sources can shed some light on the annoying slash escaping:
https://www.mhonarc.org/archive/html/xsl-list/2019-10/msg00078.html
Gerrit
Thank you very much, Gerrit, interesting! But I am unwilling to get tangled up (in blue, like Bob Dylan, or not), I want clarity. It could not be simpler: I want to preserve information content, character for character: (a) when parsing JSON syntax into an XML node tree, (b) when serializing an XML node tree representation of JSON into JSON syntax. At least I want clean roundtripping, where output information is equal to input information. (Which is a lesser requirement than a+b.)
Currently, it is not clear to me how to do it, and I want to learn. Besides, I think it is a bug if roundtripping involves a change of information. If I oversimplify things, I would like to learn why.
Have a cozy Sunday - Hans
PS: To demonstrate the issue:
    let $j1 := '{"key":"a/b"}'
    let $x := json:parse($j1)
    let $j2 := json:serialize($x)
    return ($j1, $x, $j2)
yields:
    {"key":"a/b"}
    <json type="object"> <key>a/b</key> </json>
    { "key":"a\/b" }
PPS: Please do not even dream of thinking about considering or not excluding changing the parser's behaviour, which leaves the slash as it is - otherwise you cannot use JSON information in XML representation without wiggling around - e.g. follow links, imagine. It's the serializer that gets it wrong.
On Sun, 2021-03-21 at 09:42 +0000, Hans-Juergen Rennau wrote:
> PPS: Please do not even dream of thinking about considering or not excluding changing the parser's behaviour, which leaves the slash as it is - otherwise you cannot use JSON information in XML representation without wiggling around - e.g. follow links, imagine. It's the serializer that gets it wrong.
The JSON spec requires / to be escaped.
string = quotation-mark *char quotation-mark
char = unescaped /
    escape (
        %x22 /          ; "    quotation mark  U+0022
        %x5C /          ; \    reverse solidus U+005C
        %x2F /          ; /    solidus         U+002F
        %x62 /          ; b    backspace       U+0008
        %x66 /          ; f    form feed       U+000C
        %x6E /          ; n    line feed       U+000A
        %x72 /          ; r    carriage return U+000D
        %x74 /          ; t    tab             U+0009
        %x75 4HEXDIG )  ; uXXXX                U+XXXX
escape = %x5C ; \
quotation-mark = %x22 ; "
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
Liam
No, Liam, that is a misunderstanding - it *may* be escaped by a preceding \, like a handful of other characters, but there is no obligation. However, you are right in pointing out (implicitly) that my assertion was wrong that the information content is changed by the escaping: it is not, they are just alternative representations. Thank you! My experience with JSON is mainly JSON Schema and OpenAPI documents. There I have NEVER seen slashes escaped. They use URLs and JSON Pointer a lot, which would look very awkward with \/ everywhere.
I still think that the choice of representation should be the same when moving from JSON to XML nodes, or back. There is a strong argument to stick with the unescaped variants in the XML representation: the information content retrieved from those elements treats \/ as two characters, not as an escaped /. In other words: json:doc('foo.json')//bar would potentially yield a different sequence of characters than encoded by the source JSON. This seems to mean that any use of JSON data fetched from the XML representation would have to be preceded by a transformation performed by the application (replacing \/ with /, etc.).
Kind regards, Hans-Jürgen
On Sun, 2021-03-21 at 18:06 +0000, Hans-Juergen Rennau wrote:
> No, Liam, that is a misunderstanding - it *may* be escaped by a preceding \,
Oops, thanks! Although I should note that it has sort of changed over time from recommended to "may". For the rationale, see e.g. the following video (Doug Crockford explains it in the first few seconds after the six-minute mark, so that's where I've linked to):
https://www.youtube.com/watch?v=-C-JoyNuQJs&t=386s
It's so that </script> doesn't occur.
Good Morning, Liam, thank you very much for this precise link into the video, showing the "why" behind escaping slashes.
In spite of the incorrect "must escape", your previous post was key to me for understanding my error. It is absolutely crucial to understand that within the JSON model, "\/" and "/" denote the same Unicode code point. Therefore my talking about expected roundtripping behavior was simply wrong: you cannot expect, or achieve, roundtripping when there are alternative representations, as the choice of representation cannot be remembered. (Comparable with the choice of single or double quotes to delimit attribute values in XML.)
>>> NEW VERSION OF THE ORIGINAL "COMPLAINT" <<<
So now it seems to me that everything boils down to one REAL ISSUE, and one might look at it either as a bug or as a missing feature:
Currently, it is not possible to serialize JSON in unescaped form which is guaranteed to be *valid* JSON. The problem is that serialization option "escape" turns off ALL backslash escaping - not only the optional ones (like \/) but also the mandatory ones: double quote and backslash. This means that the serialization option escape="no" may produce invalid JSON, and I regard this as a bug: only the optional escaping should be switched off.
If one disagrees and thinks it makes sense to offer a serialization which is potentially invalid JSON (difficult to imagine, though), then I regard the current situation as having a gap: we then need a further option switching off optional escaping, but retaining mandatory escaping, in order to preserve valid JSON and safeguard unchanged information content.
SUMMING UP: can we have a possibility to serialize JSON with guarantee of correctness and without optional backslash escapes? Proposal: change the behavior of serialization option "escape=no", so that mandatory escaping is retained; alternative: add a further serialization option restricting backslash escaping to the mandatory cases.
Kind regards, Hans-Jürgen
PS: Default parser behaviour is perfect, as it is: every JSON character is represented by the same Unicode character, e.g. \" becomes ", \/ becomes /, \n becomes a line feed (U+000A).
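The point that "\/" and "/" are alternative spellings of the same character can be checked with a minimal sketch. It assumes only the standard XQuery 3.1 function fn:parse-json (not the BaseX json:parse function used elsewhere in this thread); the sample JSON texts are illustrative.

```xquery
(: Both JSON texts denote the same three-character string a/b.
   XQuery string literals do not interpret the backslash, so the
   first literal really contains the two characters \ and /. :)
let $escaped   := parse-json('{"key":"a\/b"}')
let $unescaped := parse-json('{"key":"a/b"}')
return (
  $escaped?key eq $unescaped?key,   (: the parsed values compare equal :)
  string-length($escaped?key)       (: 3, the backslash is not part of the value :)
)
```

Because both inputs yield the same value, a serializer has no way of knowing which spelling the source used, which is why character-for-character roundtripping cannot be expected.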
What’s wrong with adding 'use-character-maps': map{'/': '/'} to the JSON serialization each time?
You are right, Gerrit - the desired output can be achieved using fn:serialize and the use-character-maps option:
    serialize($doc, map{'method': 'json', 'use-character-maps': map{'/': '/'}})
I had overlooked that, and it's good to know. I was speaking about function json:serialize, which does not support the option "use-character-maps". I still think option "escape=no" must not leave double quotes or backslashes without escaping, as it produces invalid JSON and a serialization function producing invalid output does not make sense to me. But as the use of fn:serialize is a workaround, the matter is not urgent. Thanks, Gerrit!
Kind regards, Hans
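The character-map workaround can be sketched as follows. The sketch assumes standard XQuery 3.1 (fn:parse-json and fn:serialize with the JSON output method) rather than the BaseX XML representation; the sample data, including the example.org URL, is made up for illustration.

```xquery
(: The character map passes "/" through unchanged, while the JSON
   output method still escapes the double quote and the backslash. :)
let $data := parse-json('{"url":"https://example.org/a", "note":"say \"hi\""}')
return serialize($data, map {
  'method': 'json',
  'use-character-maps': map { '/': '/' }
})
```

The result should contain the URL with plain slashes and the note with its quotes still escaped as \", so the output remains valid JSON. The order of the two keys in the output is not guaranteed.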
Dear Hans-Jürgen,
> I was speaking about function json:serialize, which does not support the option "use-character-maps". I still think option "escape=no" must not leave double quotes or backslashes without escaping, as it produces invalid JSON and a serialization function producing invalid output does not make sense to me.
The JSON options have been designed to improve bidirectional conversions: The option can be supplied to preserve the original escape strings (which can e.g. be helpful if the input data is invalid). The same options should be used for parsing and serializing data.
But I completely agree, we shouldn’t allow users to create invalid JSON. It may be better to raise an exception if the serialization of data with the supplied options would lead to corrupt output?
Best, Christian
Oh, thank you, Christian - only now I understand the option "escape" of json:serialize (do I?)! I thought it was about how to *represent* the input characters (apply optional escaping or not), but in fact it is (also) about how to *interpret* them - interpret them in the XDM way (escape=yes) or interpret them in the JSON way (escape=no). An experiment made it clear: with "escape=yes", input text "a\nb" represents 4 characters, serialized into "a\\nb"; whereas with escape=no, "a\nb" represents 3 characters, serialized into "a\nb". Understood, this makes it possible to construct JSON text with JSON escaping, and to retain this representation when serializing.
So serialization via json:serialize currently offers no way to control whether or not to use optional escaping in the output - / is escaped, relentlessly. If I'm not mistaken, neither does fn:serialize offer an option for that. But as Gerrit pointed out, escaping can be controlled via fn:serialize with serialization parameter "use-character-maps".
Alright - "escape=no" means that *all* responsibility for escaping is taken over by the caller, which explains the present behavior and its "refusal" to escape quote and backslash! I agree, in case of invalid JSON an exception would be better than the silent construction of invalid JSON.
Thanks a lot for explaining! Hans-Jürgen
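The experiment described above might look roughly like this. It is a sketch that assumes the BaseX JSON Module (json:serialize) and the option spelling map{"escape": "no"} used earlier in this thread; the element name key is illustrative, and the exact output may differ between BaseX versions.

```xquery
(: The text node holds the four characters a, \, n, b. :)
let $x := <json type="object"><key>a\nb</key></json>
return (
  (: default escape=yes: the backslash is treated as data and escaped on output :)
  json:serialize($x),
  (: escape=no: the text is emitted as is, so the caller owns all escaping :)
  json:serialize($x, map { 'escape': 'no' })
)
```

With escape=no the caller also has to escape double quotes and backslashes in the data, otherwise the output may not be valid JSON.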
…and thanks for your reply.
Out of interest: What’s your main reason for trying to avoid escaped slashes in the JSON result? Is it “only” about better readability, or about a technical problem, e.g. a JSON processor that cannot handle backslashed slashes?
I’m asking because we already had a client who stumbled upon the latter case. Back then, we also proposed the use of a character map (very similar to Gerrit’s suggestion):
serialize(<json>/</json>, map { 'method': 'json', 'use-character-maps': map { '/': '/' } })
In my case, it's readability. I've started work on a BaseX-based tool for analyzing and reporting OpenAPI documents. In that domain, according to my experience, slashes are never escaped. Tool output cluttered up with backslashes would be unacceptable - just consider how URLs and JSON Pointers look when every slash is preceded by a backslash: http:\/\/:...., #\/definitions\/fooRequest ...
Kind regards, Hans-Jürgen
Am Montag, 22. März 2021, 14:34:53 MEZ hat Christian Grün christian.gruen@gmail.com Folgendes geschrieben:
…and thanks for your reply.
Out of interest: What’s your main reason for trying to avoid escaped slashes in the JSON result? Is it “only” about better readability, or about a technical problem, e.g. a JSON processor that cannot handle backslashed slashes?
I’m asking because we already had a client who stumbled upon the latter case. Back then, we also proposed the use of a character map (very similar to Gerrit’s suggestion):
serialize(<json>/</json>, map { 'method': 'json', 'use-character-maps': map { '/': '/' } })
On Mon, Mar 22, 2021 at 2:19 PM Hans-Juergen Rennau hrennau@yahoo.de wrote:
Oh, thank you, Christian - only now do I understand the option "escape" of json:serialize (do I?)! I thought it was about how to *represent* the input characters (apply optional escaping or not), but in fact it is (also) about how to *interpret* them: in the XDM way (escape=yes) or in the JSON way (escape=no). An experiment made it clear: with escape=yes, the input text "a\nb" represents 4 characters and is serialized as "a\\nb"; whereas with escape=no, "a\nb" represents 3 characters and is serialized as "a\nb". Understood: this makes it possible to construct JSON text with JSON escaping and to retain this representation when serializing.
So serialization via json:serialize currently offers no way to control whether optional escaping is used in the output - / is escaped, relentlessly. If I'm not mistaken, fn:serialize does not offer an option for that either. But as Gerrit pointed out, escaping can be controlled via fn:serialize with the serialization parameter "use-character-maps".
Alright - "escape=no" means that *all* responsibility for escaping is taken over by the caller, which explains the present behavior and its "refusal" to escape quote and backslash! I agree, in case of invalid JSON an exception would be better than the silent construction of invalid JSON.
Thanks a lot for explaining! Hans-Jürgen
On 3/22/21, Hans-Juergen Rennau hrennau@yahoo.de wrote:
Good morning, Liam, thank you very much for this precise link into the video, showing the "why" behind escaping slashes. In spite of the incorrect "must escape", your previous post was key to me for understanding my error. It is absolutely crucial to understand that within the JSON model, "/" and "\/" denote the same Unicode code point. Therefore my talk about expected roundtripping behavior was simply wrong: you cannot expect, or achieve, roundtripping when there are alternative representations, as the choice of representation cannot be remembered. (Comparable with the choice of single or double quotes to delimit attribute values in XML.)
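The point about alternative representations can be checked with any conforming JSON parser. The following is a minimal sketch using Python's standard `json` module, chosen here only as a neutral illustration; it is not BaseX and says nothing about BaseX internals.

```python
import json

# Two different JSON texts for the same one-character string.
escaped = r'"\/"'   # solidus written with the optional backslash escape
plain = '"/"'       # solidus written literally

# Both parse to the same single character U+002F.
assert json.loads(escaped) == json.loads(plain) == "/"

# After parsing, nothing records which spelling the input used, so the
# serializer has to pick one. Python's json.dumps picks the literal form.
roundtrip = json.dumps(json.loads(escaped))
print(roundtrip)  # "/"
```

A serializer that always picks the escaped form (as discussed in this thread) is equally valid JSON; the two outputs differ only in spelling, not in information content.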
NEW VERSION OF THE ORIGINAL "COMPLAINT" <<<
So now it seems to me that everything boils down to one REAL ISSUE, and one might look at it either as a bug or as a missing feature:
Currently, it is not possible to serialize JSON in unescaped form which is guaranteed to be *valid* JSON. The problem is that the serialization option "escape" turns off ALL backslash escaping - not only the optional escapes (like /) but also the mandatory ones: double quote and backslash. This means that the serialization option escape="no" may produce invalid JSON, and I regard this as a bug: only the optional escaping should be switched off.
If one disagrees and thinks it makes sense to offer a serialization which is potentially invalid JSON (difficult to imagine, though), then I regard the current situation as having a gap: we then need a further option that switches off optional escaping but retains mandatory escaping, in order to preserve valid JSON and safeguard unchanged information content.
SUMMING UP: can we have a possibility to serialize JSON with a guarantee of correctness and without optional backslash escapes? Proposal: change the behavior of the serialization option "escape=no" so that mandatory escaping is retained; alternative: add a further serialization option restricting backslash escaping to the mandatory cases.
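What "mandatory escaping only" could look like is sketched below in Python, purely as an illustration of the requested behavior and not as a description of BaseX. The function name `escape_mandatory` is hypothetical. The set of characters treated as mandatory follows RFC 8259 (quotation mark, backslash, and control characters below U+0020); the slash is deliberately left alone.

```python
import json

def escape_mandatory(text: str) -> str:
    """Hypothetical helper: build a JSON string literal, escaping only
    the characters that RFC 8259 requires to be escaped."""
    short = {'"': '\\"', '\\': '\\\\', '\b': '\\b', '\f': '\\f',
             '\n': '\\n', '\r': '\\r', '\t': '\\t'}
    out = []
    for ch in text:
        if ch in short:
            out.append(short[ch])              # two-character escape
        elif ord(ch) < 0x20:
            out.append(f'\\u{ord(ch):04x}')    # other control characters
        else:
            out.append(ch)                     # includes "/", left as is
    return '"' + ''.join(out) + '"'

sample = 'say "hi" \\ #/definitions/fooRequest\n'
literal = escape_mandatory(sample)
print(literal)

# The result is valid JSON and parses back to the original string ...
assert json.loads(literal) == sample
# ... and contains no escaped slash.
assert '\\/' not in literal
```

The design choice is simply to separate the two kinds of escapes: those needed for the output to be valid JSON are always applied, and the optional ones are never applied.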
Kind regards, Hans-Jürgen
PS: Default parser behaviour is perfect as it is: every JSON character is represented by the same Unicode character, e.g. \" becomes ", \/ becomes /, \n becomes a newline character.
basex-talk@mailman.uni-konstanz.de