Good Morning, Liam, thank you very much for this precise link into the video, showing the "why" behind escaping slashes.

In spite of the incorrect "must escape", your previous post was key to understanding my error. It is absolutely crucial to understand that within the JSON model, "/" and "\/" denote the same Unicode code point. Therefore my talk of expected roundtripping behavior was simply wrong: you cannot expect, or achieve, roundtripping when there are alternative representations, as the choice of representation is not remembered after parsing. (Comparable to the choice of single or double quotes to delimit attribute values in XML.)
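
To make the point about alternative representations concrete, here is a minimal sketch in Python. Python's standard json module is used purely as an illustration; it is not the processor under discussion, and the variable names are made up for the example.

```python
import json

plain = '"a/b"'       # JSON text with a literal solidus
escaped = '"a\\/b"'   # JSON text "a\/b", solidus written as \/

# Both JSON texts decode to the same three-character string.
assert json.loads(plain) == json.loads(escaped) == "a/b"

# After parsing, nothing records which form the input used,
# so the serializer has to pick one representation on its own.
roundtrip = json.dumps(json.loads(escaped))
print(roundtrip)
```

Here the re-serialized text uses the unescaped form, whichever form the input had.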

>>> NEW VERSION OF THE ORIGINAL "COMPLAINT" <<<

So now it seems to me that everything boils down to one REAL ISSUE, and one might look at it either as a bug or as a missing feature:

Currently, it is not possible to serialize JSON in unescaped form which is guaranteed to be *valid* JSON. The problem is that setting the serialization option "escape" to "no" turns off ALL backslash escaping - not only the optional escapes (like \/) but also the mandatory ones, such as double quote and backslash. This means that escape="no" may produce invalid JSON, and I regard this as a bug: only the optional escaping should be switched off.
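
A small Python sketch of why this matters. The function below is a hypothetical stand-in for "no escaping at all"; it is not the actual serializer's code, only an illustration of the effect described above.

```python
import json

def serialize_without_escaping(s: str) -> str:
    # Hypothetical stand-in: wrap the string in quotes, escape nothing.
    return '"' + s + '"'

text = serialize_without_escaping('He said "no"')

# The embedded double quotes end the JSON string early,
# so a conforming parser rejects the result.
try:
    json.loads(text)
    is_valid = True
except json.JSONDecodeError:
    is_valid = False

print(text, is_valid)
```

A string containing only a solidus, by contrast, survives unescaped output without harm, which is why only the optional escapes are safe to drop.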

If one disagrees and thinks it makes sense to offer a serialization which is potentially invalid JSON (difficult to imagine, though), then I regard the current situation as having a gap: we then need a further option switching off optional escaping, but retaining mandatory escaping, in order to preserve valid JSON and safeguard unchanged information content.

SUMMING UP: could we have a way to serialize JSON with a guarantee of correctness and without optional backslash escapes? Proposal: change the behavior of serialization option "escape=no", so that mandatory escaping is retained; alternative: add a further serialization option restricting backslash escaping to the mandatory cases.
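
As a rough illustration of the requested behavior, here is a Python sketch of a string serializer that escapes only what the JSON grammar (RFC 8259) requires: the double quote, the backslash, and control characters below U+0020. The function name minimal_escaped is hypothetical and says nothing about how the option would actually be implemented.

```python
import json

def minimal_escaped(s: str) -> str:
    # Hypothetical helper: apply only the mandatory JSON escapes.
    out = []
    for ch in s:
        if ch == '"':
            out.append('\\"')
        elif ch == '\\':
            out.append('\\\\')
        elif ord(ch) < 0x20:
            out.append('\\u%04x' % ord(ch))  # control characters
        else:
            out.append(ch)                   # "/" and the rest stay as-is
    return '"' + ''.join(out) + '"'

value = 'say "hi" at </script> in C:\\temp'
text = minimal_escaped(value)

# The output is valid JSON and carries the same information content.
assert json.loads(text) == value
print(text)
```

The solidus is left untouched, so the output contains no optional escapes while still parsing back to the original string.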

Kind regards,
Hans-Jürgen

PS: Default parser behaviour is perfect, as it is: every JSON character is represented by the same Unicode character, e.g. \" becomes ", \/ becomes /, \n becomes &#xA;.

On Sunday, 21 March 2021, 19:25:25 CET, Liam R. E. Quin <liam@fromoldbooks.org> wrote:


On Sun, 2021-03-21 at 18:06 +0000, Hans-Juergen Rennau wrote:
>  No, Liam, that is a misunderstanding - it *may* be escaped by a
> preceding \,

oops, thanks! Although I should note that it's sort of changed over time
from recommended to "may" - but for the rationale see e.g. this video
(Doug Crockford explains it in the first few seconds after the 6 minute
point, so that's where I've linked to):

https://www.youtube.com/watch?v=-C-JoyNuQJs&t=386s

It's so that </script> doesn't occur.




--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org