Hi all!
I have a few XML documents whose content I would also like to provide in JSON. The XML files contain many strings encoded within comments, such as <!--incipit-->, and I would like to transfer such string contents unchanged into JSON. However, I am having a hard time dealing with “-” (using data(<!--&#45;-->)), which is rendered as “&#45;”, and “>”, which is translated into “&gt;”. I tried a lot, but nothing seems to work: is there a way to keep “-” and “>” unchanged while serializing to JSON?
Another issue I am experiencing is that, after JSON serialization, the order of the key-value pairs in a map does not follow the order specified in my code. I know that the order is not computationally meaningful, but for reading purposes it would make a huge difference in my case: one key contains a long array of objects, so I would like the keys with shorter contents to be serialized first.
Ciao, Giuseppe
Hi Giuseppe,
> However, I am having a hard time dealing with “-” (using data(<!--&#45;-->)), which is rendered as “&#45;”, and “>”, which is translated into “&gt;”: I tried a lot, but nothing seems to work: is there a way to keep “-” and “>” unchanged while serializing to JSON?
Could you provide us with a little self-contained example?
> Another issue I am experiencing is that, after JSON serialization, the order of the key-values in a map does not follow that specified in my code
That's correct. If you want the order to be preserved, you’ll need to choose an XML representation for your JSON data.
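As a sketch of what such an XML representation could look like (one possible reading, not necessarily the only one): XQuery 3.1 defines an XML vocabulary for JSON in the namespace http://www.w3.org/2005/xpath-functions, and fn:xml-to-json writes the members in document order. The key names and values below are invented for illustration.

```xquery
xquery version "3.1";

(: Hypothetical data: the short entries come first, the long array last. :)
let $doc :=
  <map xmlns="http://www.w3.org/2005/xpath-functions">
    <string key="title">incipit</string>
    <number key="count">2</number>
    <array key="tokens">
      <map><string key="form">a</string></map>
      <map><string key="form">b</string></map>
    </array>
  </map>
(: The members are serialized in the order of the child elements. :)
return xml-to-json($doc, map { 'indent': true() })
```

Since the order is carried by the element sequence rather than by a map, "title" and "count" should appear before "tokens" in the resulting JSON string.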
Ciao, Christian
Hi Christian,
I have the comment node <!—-—>. If I extract its content with data() in an element, I get the following: <g>&#45;</g>, and it seems there is no way to force &#45; to become - within the element: I found a workaround to replace &#45; with a dash '-', and this seems the best solution at the moment. My problem is that I have to refer to the positions of these characters (as character offsets), and therefore any change when I move a character from a comment node to an element node could break my reference system (indeed, &#45; is not equivalent to - in BaseX: is the rendering of - as &#45; correct?).
Ciao, Giuseppe
Hi Giuseppe,
I’m sorry, I fail to understand how to simulate your use case. Could you please provide us with a minimized code snippet for testing?
> I have the comment node <!—-—>.
I assume it’s -- instead of —?
> If I extract its content with data() in an element, I get the following: <g>&#45;</g>, and it seems there is no way to force &#45; to become - within the element.
I tried this:
let $comment := <!-- - --> return element g { $comment }
It gives me <g><!-- - --></g>. I assume it differs from your approach?
Thanks in advance, Christian
Hi Christian,
This is the code:
<h>{data(<!--&#45;-->)}</h>
Ciao, Giuseppe
Hi Giuseppe,
> <h>{ data(<!--&#45;-->) }</h>
The literal string value of the comment is "&#45;" (the codepoints are 38 35 52 53 59). You can use parse-xml-fragment() to decode character and entity references:
parse-xml-fragment(<!--&#45;-->)
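A minimal sketch of how this could be combined with JSON output, assuming the comment text is always a well-formed XML fragment (a bare "&" or "<" inside the comment would make parse-xml-fragment() raise an error). The variable names and the "text" key are made up for illustration.

```xquery
xquery version "3.1";

let $comment := <!--&#45;-->
(: data() yields the literal characters of the comment: & # 4 5 ; :)
let $raw := data($comment)
(: parse-xml-fragment() resolves the reference; string() takes the text value :)
let $decoded := string(parse-xml-fragment($raw))
return (
  string-to-codepoints($raw),      (: 38 35 52 53 59 :)
  string-to-codepoints($decoded),  (: 45 :)
  (: the decoded string can then be placed in a map and serialized as JSON :)
  serialize(map { 'text': $decoded }, map { 'method': 'json' })
)
```

Note that decoding changes the string length (five characters become one), so any character offsets have to be computed on the decoded string, not on the raw comment text.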
Hope this helps, Christian