Hi Lars,
When converting files from XML to HTML, a serialization error appeared, saying something to the effect that x84 was an illegal HTML character. The files were written using file:write with parameter $params defined as:
Do you have any idea how the x84 byte ended up in the database?
Is there a way to fix the text before submitting it to file:write()?
There are probably several ways to do this, but one standard XQuery solution that comes to mind looks as follows:

let $invalid := 132
let $valid := string-to-codepoints("?")
for $text in db:open('db')//text()
let $cps := string-to-codepoints($text) ! (if (. eq $invalid) then $valid else .)
let $new := codepoints-to-string($cps)
return $new

All strings are converted to their codepoints, and the invalid codepoint (132, i.e. x84) is replaced with an alternative (here: ?). The resulting texts are then returned as the result.
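
To try the replacement logic on its own, without touching the database, a minimal sketch on an in-memory string might look like this. The variable $sample is made up for illustration; everything else is taken from the query above:

```xquery
(: codepoint 132 is the x84 character the HTML serializer complains about :)
let $invalid := 132
let $valid := string-to-codepoints("?")
(: hypothetical sample: "a", the x84 character, "b" :)
let $sample := codepoints-to-string((97, 132, 98))
(: map every codepoint, swapping the invalid one for "?" :)
let $cps := string-to-codepoints($sample) ! (if (. eq $invalid) then $valid else .)
return codepoints-to-string($cps)
```

This should return a?b.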
The following query will replace all texts in the database..

let $invalid := 132
let $valid := string-to-codepoints("?")
for $text in db:open('db')//text()
let $cps := string-to-codepoints($text) ! (if (. eq $invalid) then $valid else .)
let $new := codepoints-to-string($cps)
return replace value of node $text with $new

..and the last one replaces the texts in the main memory representation of the document:

copy $db := db:open('db')
modify (
  let $invalid := 132
  let $valid := string-to-codepoints("?")
  for $text in $db//text()
  let $cps := string-to-codepoints($text) ! (if (. eq $invalid) then $valid else .)
  let $new := codepoints-to-string($cps)
  return replace value of node $text with $new
)
return $db

Hope this helps,
Christian