Converting to UTF-8 in SQL module - BaseX-Talk - mailman.uni-konstanz.de

17 Aug 2015


Hello,
I am using the BaseX SQL module to query the Oracle database of a library
catalog. Its character data is encoded as ISO 2709 (MARC-8)[1] and is
stored in Oracle as US7ASCII.
The data contains diacritics represented as combining characters, such as
U+0301 (combining acute accent):
http://www.fileformat.info/info/unicode/char/0301/index.htm
When I run the query in BaseX, all characters with combining accents are
output as something like Ì&#x81;
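
That output looks consistent with the two UTF-8 bytes of U+0301 (0xCC 0x81) being decoded one byte at a time as a single-byte encoding such as ISO-8859-1, which yields Ì (U+00CC) followed by the control character U+0081. The Python sketch below only illustrates that assumption; it is not BaseX or SQL module code, the function name repair_mojibake is hypothetical, and whether the same mix-up is really happening between Oracle, the JDBC driver, and BaseX has not been confirmed.

```python
def repair_mojibake(text: str) -> str:
    """Undo a 'UTF-8 bytes read as ISO-8859-1' mix-up (assumed cause).

    Returns the input unchanged when the round trip does not apply,
    e.g. when the text contains characters outside ISO-8859-1 or the
    resulting bytes are not valid UTF-8.
    """
    try:
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# "e" followed by U+0301 COMBINING ACUTE ACCENT
original = "e\u0301"

# Reproduce the symptom: UTF-8 bytes misread as ISO-8859-1
garbled = original.encode("utf-8").decode("iso-8859-1")
assert garbled == "e\u00cc\u0081"  # "e", then Ì, then U+0081

# Reverse it
assert repair_mojibake(garbled) == original
```

If the assumption holds, the same round trip (re-encode as ISO-8859-1, decode as UTF-8) could in principle be applied to each string returned by the query, but text that was already decoded correctly should be left alone, which is why the sketch falls back to returning the input when the conversion fails.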
Has anyone had experience handling this kind of encoding issue in BaseX, or
does anyone have any solutions/approaches to recommend as a way to convert
this data to UTF-8?
Thanks in advance,
Tim
[1] https://en.wikipedia.org/wiki/ISO_2709
[2] http://www.fileformat.info/info/unicode/char/00e8/index.htm
--
Tim A. Thompson
Metadata Librarian (Spanish/Portuguese Specialty)
Princeton University Library
tat2@princeton.edu