As always, thank you Christian for helping clarify.
Tschüß! Bridger
On Thu, Oct 21, 2021 at 11:04 AM Christian Grün christian.gruen@gmail.com wrote:
Hi Bridger,
Thanks for your valuable observation; you are completely right. I have opened a new issue for tracking down this bug [1].
Best, Christian
[1] https://github.com/BaseXdb/basex/issues/2044
On Thu, Oct 21, 2021 at 4:49 PM Bridger Dyson-Smith bdysonsmith@gmail.com wrote:
Hi Maud - with apologies, I thought I had added a namespace value to my example (I clearly did not).
Hi Christian - I'm a bit confused and I wonder if this is just the typical user/operator error on my part, or if this is a bug? The BaseX documentation for the JSON module[1] states that "... namespaces, comments, and processing instructions will be discarded in the transformation process.", but the output of the following still has namespaces:
declare variable $test := <m:me xmlns:m="http://canofbees.org/ns/"><m:mine test="false"/></m:me>;
json:serialize($test, map { "format": "jsonml", "indent": false() })
returns ["me", {"xmlns:m":"http://canofbees.org/ns/"},["mine", {"test":"false"}]]
I checked the JsonML documentation[2] and it seemed to imply that only XHTML-namespaced nodes would have their namespaces dropped, but that doesn't seem to happen either; e.g.
declare variable $xhtml :=
<dl xmlns="http://www.w3.org/1999/xhtml"> <dt> <span>element</span> <dd>test</dd> </dt> </dl>;
json:serialize($xhtml, map { "format": "jsonml", "indent": false() })
returns ["dl", {"xmlns":"http://www.w3.org/1999/xhtml"},["dt",["span","element"],["dd","test"]]]
Thanks for any thoughts you can share on this. Best, Bridger
[1] https://docs.basex.org/wiki/JSON_Module#JsonML
[2] http://www.jsonml.org/xml/
On Thu, Oct 21, 2021 at 4:31 AM Christian Grün christian.gruen@gmail.com wrote:
Hi Maud,
In the following lines of code, your input is first serialized as XML and then parsed back to XML, with whitespace chopped and namespaces stripped. The result is then serialized to JSON:
let $xml := fetch:xml(
  serialize($docs),
  map { 'stripns': true(), 'chop': true() }
)
return json:serialize($xml, map { 'format': 'jsonml' })
Here’s one more solution to remove whitespace-only text nodes before serializing the data:
let $xml := $docs update { delete nodes .//text()[normalize-space() = ''] } return json:serialize...
It’s harder that way to get rid of namespaces. You’d probably need to rebuild the XML node, as e.g. demonstrated in [1].
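One way to read "rebuild the XML node" is a recursive function that recreates every element and attribute under its local name only. The following is a minimal sketch under that assumption, not necessarily the approach shown in [1]. The function name local:strip-ns and the small sample element are hypothetical, made up for illustration; attributes that differ only by prefix would collide after stripping.

```xquery
(: Hypothetical helper: rebuilds a node tree without namespaces
   and drops whitespace-only text nodes along the way. :)
declare function local:strip-ns($node as node()) as node()? {
  typeswitch ($node)
    case element() return
      (: recreate the element in no namespace, using its local name :)
      element { local-name($node) } {
        for $attr in $node/@*
        return attribute { local-name($attr) } { string($attr) },
        for $child in $node/node()
        return local:strip-ns($child)
      }
    case text() return
      (: skip text nodes that contain only whitespace :)
      if (normalize-space($node) = '') then () else $node
    default return $node
};

(: Sample input, shortened from the data in this thread (assumption). :)
let $docs :=
  <c xmlns="http://ead3.archivists.org/schema/" id="Rey17470124">
    <unitid>Bre 3-2</unitid>
  </c>
return json:serialize(local:strip-ns($docs), map { 'format': 'jsonml' })
```

Because the rebuilt elements are in no namespace, there should be no namespace declarations left for the JsonML serializer to emit, and the whitespace-only strings are removed in the same pass.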
Hope this helps – and I hope you’re doing fine, Christian
[1] https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg13678.htm...
On Wed, Oct 20, 2021 at 5:57 PM Maud Ingarao maud.ingarao@ens-lyon.fr wrote:
Dear all
We need to convert some XML data to JSON to create a nice JS visualisation in our webapp.
json:serialize works fine, except that it creates whitespace-only strings due to \n, \s and \t.
We can't find the option(s) to ignore them. We would also like to ignore the namespace attributes.
Below is an example of the input and output we get with
let $docs := <ead>{
  db:open('rey', 'inventaire-ead.xml')//*:c[.//*:scopecontent[@localtype='Lettre']]
}</ead>
return json:serialize($docs, map { 'format': 'jsonml' })
Thanks a lot !
Maud
Input
<c audience="external" id="Rey17470124" level="item">
  <did>
    <unitdate certainty="high" label="composition" normal="1747-01-24"/>
    <unitid>Bre 3-2</unitid>
    <repository>
      <corpname>
        <part>Amsterdam, UB, Bibliotheek der Vereeniging tot Bevordering van de Belangen des Boekhandels</part>
      </corpname>
    </repository>
    <didnote/>
    <physdescstructured coverage="whole" physdescstructuredtype="materialtype">
      <quantity>1</quantity>
      <unittype>feuillet</unittype>
    </physdescstructured>
  </did>
  <relations>
    <relation href="Polier de Bottens, Georges Nicolas" linktitle="expediteur" relationtype="functionrelation"/>
    <relation href="Bousquet, Marc Michel" linktitle="destinataire" relationtype="functionrelation"/>
    <relation href="Lausanne" linktitle="expedition" relationtype="functionrelation"/>
    <relation href="Genève" linktitle="reception" relationtype="functionrelation"/>
  </relations>
  <relatedmaterial>
    <relatedmaterial localtype="imprime">
      <bibref/>
    </relatedmaterial>
    <relatedmaterial localtype="edition">
      <bibref/>
    </relatedmaterial>
    <relatedmaterial localtype="bibliographie">
      <bibref/>
    </relatedmaterial>
  </relatedmaterial>
  <scopecontent localtype="Lettre">
    <scopecontent localtype="adresse">
      <p>oui</p>
    </scopecontent>
    <scopecontent localtype="incipit">
      <blockquote>
        <p>Si vous ecrivés aujourdhui ou vendredi</p>
      </blockquote>
    </scopecontent>
  </scopecontent>
  <odd>
    <odd localtype="autographe">
      <p>oui</p>
    </odd>
    <odd localtype="signature">
      <p>oui</p>
    </odd>
    <odd localtype="contient">
      <p>cette lettre a certainement transitée par Bousquet</p>
    </odd>
  </odd>
</c>
Output
["c", {"xmlns":"http://ead3.archivists.org/schema/", "xmlns:xsi":"http://www.w3.org/2001/XMLSchema-instance", "audience":"external", "id":"Rey17470124", "level":"item"}, "\n ", ["did", "\n ", ["unitdate", {"certainty":"high", "label":"composition", "normal":"1747-01-24"}], "\n ", ["unitid", "Bre 3-2"], "\n ", ["repository", "\n ", ["corpname", "\n ", ["part", "Amsterdam, UB, Bibliotheek der Vereeniging tot Bevordering van de Belangen des Boekhandels"], "\n "], "\n "], "\n ", ["didnote"], "\n ", ["physdescstructured", {"coverage":"whole", "physdescstructuredtype":"materialtype"}, "\n ", ["quantity", "1"], "\n ", ["unittype", "feuillet"], "\n "], "\n "], "\n ", ["relations", "\n ", ["relation", {"href":"Polier de Bottens, Georges Nicolas", "linktitle":"expediteur", "relationtype":"functionrelation"}], "\n ", ["relation", {"href":"Bousquet, Marc Michel", "linktitle":"destinataire", "relationtype":"functionrelation"}], "\n ", ["relation", {"href":"Lausanne", "linktitle":"expedition", "relationtype":"functionrelation"}], "\n ", ["relation", {"href":"Genève", "linktitle":"reception", "relationtype":"functionrelation"}], "\n "], "\n ", ["relatedmaterial", "\n ", ["relatedmaterial", {"localtype":"imprime"}, "\n ", ["bibref"], "\n "], "\n ", ["relatedmaterial", {"localtype":"edition"}, "\n ", ["bibref"], "\n "], "\n ", ["relatedmaterial", {"localtype":"bibliographie"}, "\n ", ["bibref"], "\n "], "\n "], "\n ", ["scopecontent", {"localtype":"Lettre"}, "\n ", ["scopecontent", {"localtype":"adresse"}, "\n ", ["p", "oui"], "\n "], "\n ", ["scopecontent", {"localtype":"incipit"}, "\n ", ["blockquote", "\n ", ["p", "Si vous ecrivés aujourdhui ou vendredi"], "\n "], "\n "], "\n "], "\n ", ["odd", "\n ", ["odd", {"localtype":"autographe"}, "\n ", ["p", "oui"], "\n "], "\n ", ["odd", {"localtype":"signature"}, "\n ", ["p", "oui"], "\n "], "\n ", ["odd", {"localtype":"contient"}, "\n ", ["p", "cette lettre a certainement transitée par Bousquet"], "\n "], "\n "], "\n "],
--
Maud Ingarao IHRIM - UMR 5317 Institut d’histoire des représentations et des idées dans les modernités École Normale Supérieure de Lyon 15 Parvis René Descartes - BP7000 - 69342 Lyon CEDEX 07 +33 4 37 37 65 79 - maud.ingarao@ens-lyon.fr
*Présente les lundi - mardi - mercredi* *At the office on mondays - tuesdays - wednesdays*
http://ihrim.ens-lyon.fr/ http://ahn.ens-lyon.fr/ https://cahier.hypotheses.org/