Definitely looks like a bug. I’m currently on the road, but I’ll get to the bottom of this once I’m back.



Bridger Dyson-Smith <bdysonsmith@gmail.com> wrote on Fri, May 27, 2022, 19:27:
Marco -
I'm sorry, but I can only corroborate your findings. Trying to force UTF-8 by adding the encoding parameter to the functions doesn't seem to help either, e.g.:

) ./bin/basex
BaseX 9.7.1 [Standalone]
Try 'help' to get more information.
> xquery file:current-dir()
/usr/home/bridger/bin/basex/
Query executed in 886.62 ms.
> xquery file:write-text("a1.txt", "°" || out:nl(), "UTF-8")

Query executed in 4.32 ms.
> xquery file:read-text("a1.txt")
°

Query executed in 1.99 ms.
> xquery file:write-text("a2.txt", file:read-text("a1.txt", "UTF-8"), "UTF-8")

Query executed in 1.83 ms.
> xquery file:read-text("a2.txt")
[file:io-error] Decoding error: xb0
> xquery file:read-text("a2.txt", "UTF-8")
[file:io-error] Decoding error: xb0
> xquery file:read-text("a2.txt", "ISO-8859-1")
°

Query executed in 2.01 ms.
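
The error message and the successful ISO-8859-1 read are consistent with a2.txt holding the single byte 0xB0 for "°" instead of the two-byte UTF-8 sequence. The following is a minimal sketch in plain Python (not BaseX) of that byte-level difference. The byte contents it assigns to a1.txt and a2.txt are an assumption inferred from the session above, not something read from the actual files, and it assumes out:nl() produced a single line feed.

```python
# Illustration only: plain Python, not BaseX. The byte contents assumed for
# a1.txt and a2.txt are inferred from the session above, not read from disk.
text = "°\n"  # U+00B0 DEGREE SIGN plus a line feed (assumed newline)

utf8_bytes = text.encode("utf-8")         # what a1.txt presumably holds
latin1_bytes = text.encode("iso-8859-1")  # what a2.txt presumably holds

print(utf8_bytes.hex(" "))    # c2 b0 0a
print(latin1_bytes.hex(" "))  # b0 0a

# A lone 0xB0 is a UTF-8 continuation byte without a lead byte, so strict
# UTF-8 decoding rejects it.
try:
    latin1_bytes.decode("utf-8")
    utf8_decode_failed = False
except UnicodeDecodeError as err:
    utf8_decode_failed = True
    print("UTF-8 decode failed:", err.reason)

# Decoding the same bytes as ISO-8859-1 recovers the original string.
roundtrip = latin1_bytes.decode("iso-8859-1")
print(roundtrip == text)
```

If the assumption holds, the string is being re-encoded as ISO-8859-1 somewhere between the read and the second write, which would explain why passing "UTF-8" on the read side alone changes nothing.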

On Fri, May 27, 2022 at 1:00 PM Marco Lettere <m.lettere@gmail.com> wrote:
Dear all,

after wrapping our heads around this for hours today, we don't know how
to get rid of this inconsistency. Thus I ask for help ...

SSCCE:

BaseX 9.6.4 [Standalone]
Try 'help' to get more information.
 > xquery file:write-text("a1.txt", "°" || out:nl()) (: Same with codepoints-to-string(176) instead of "°" :)

Query executed in 183.94 ms.
 > xquery file:read-text("a1.txt")
°

Query executed in 1.49 ms.
 > xquery file:write-text("a2.txt", file:read-text("a1.txt"))
Query executed in 3.4 ms.

 > xquery file:read-text("a2.txt")
[file:io-error] Decoding error: xb0

Testing the files with the Linux command-line tool "file" gives this output:

 > file a1.txt
a1.txt: Unicode text, UTF-8 text

 > file a2.txt
a2.txt: ISO-8859 text

"Copying" the file by reading it and writing its contents back seems to
change the encoding. How is this supposed to be handled?

Regards,

Marco.