Hello --

I have some mainframe files that start off in no known encoding.  Using BaseX 10.6, I'm trying to use the bin module to make some character substitutions so that the content of these files can be decoded as UTF-8.

let $charMap as map(*) := map {
  33: 93, (: exclamation point ! to close bracket ] :)
  162: 91, (: cent-sign ¢ to open bracket [ :)
  124: 33, (: pipe character | to exclamation point ! :)
  160: 32, (: non-breaking space to plain space :)
  26: 32 (: U+001A SUBSTITUTE at the end of the file; do not want :)
}
let $fromList as xs:integer+ := map:keys($charMap)

let $fileList as xs:string+ := file:children($localPath)

for $x in $fileList
return (
  file:read-binary($x)
  ! bin:to-octets(.)
  ! (if (. = $fromList) then $charMap(.) else .)
  ! bin:from-octets(.)
  ! bin:decode-string(., 'UTF-8')
) => string-join('')

Four of the five sample files work; the fifth fails with "Decoding error: xff".

If I restrict the process to the problematic file and instead use:
return (file:read-binary($x) ! bin:to-octets(.) ! (if (. = $fromList) then $charMap(.) else .)) => distinct-values() => sort()

I don't find a 255 value, and I'm pretty sure all the code points I do have are simple: each is less than 255 and, as far as I can tell, a single-octet UTF-8 character.

Any suggestions for what I ought to be looking at?

Thanks!
Graydon

-- 
Graydon Saunders  | graydonish@fastmail.com
Þæs oferéode, ðisses swá mæg.
-- Deor  ("That passed, so may this.")