Hi,

I am trying to run a parser via proc:execute/system. It works when I run it from the command line, but not when I run it via BaseX. More precisely, I get the following error:

Traceback (most recent call last):
  File "main.py", line 325, in <module>
    test_data = loader.load(params.test)
  File "/Users/mycomputer/Desktop/Basex9.1beta 2/utils.py", line 57, in load
    for line in f:
  File "/Users/mycomputer/anaconda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5390: ordinal not in range(128)

Is there a way to avoid that? Thanks.

Best,
Giuseppe
On Tue, 2019-02-12 at 01:42 +0100, Giuseppe G. A. Celano wrote:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5390: ordinal not in range(128)
A guess - sounds like an encoding error - 0xE2 is â in Unicode, and 128 suggests US ASCII was expected - check the encoding declaration on the XML, or maybe it's a locale difference?

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Web slave for vintage clipart http://www.fromoldbooks.org/
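To illustrate the guess above: U+00E2 is indeed â, but in a UTF-8 file the byte 0xE2 more often opens a three-byte sequence such as a curly quote or a dash. The thread does not say which character sits at position 5390, so the sample string below is purely hypothetical, as is the helper name try_decode. A minimal sketch of how the ASCII codec fails on such a byte while UTF-8 decodes it:

```python
# Hypothetical sample: a right single quotation mark (U+2019) encoded as UTF-8.
# Its UTF-8 form is the three bytes E2 80 99, so the byte 0xE2 appears in the data.
data = "it\u2019s".encode("utf-8")


def try_decode(raw: bytes, codec: str) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


ascii_result = try_decode(data, "ascii")
utf8_result = try_decode(data, "utf-8")

print(ascii_result)
# ascii() keeps the output printable even if stdout itself is ASCII-only.
print(ascii(utf8_result))
```

The first line printed has the same shape as the error in the traceback, which suggests that the file itself is fine and only the codec chosen for reading it is wrong.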
Hi Liam,

Thanks. My locale is actually "en_US.UTF-8", so I do not know why the error is raised.
I notice that if I run "locale" from my Mac Terminal I get the correct one (utf-8), but if I run proc:system("locale") I get:

LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

Is there a way to force BaseX to start with utf-8? Thanks.

Ciao,
Giuseppe
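The locale matters here because Python's open(), when called without an encoding argument, falls back to locale.getpreferredencoding(False). A rough sketch for checking what a child Python process would pick under a given environment follows; the helper name child_encoding is made up for this example, and it assumes the en_US.UTF-8 locale is installed on the machine. Results depend on platform and Python version: Python 3.6, as in the traceback, typically reports an ASCII variant under the "C" locale, while Python 3.7 and later may report UTF-8 even there.

```python
import os
import subprocess
import sys


def child_encoding(extra_env: dict) -> str:
    """Start a fresh Python with a modified environment and report the
    encoding it would use for open() calls without an explicit encoding."""
    env = dict(os.environ)
    # Drop inherited locale variables so that only extra_env decides the locale.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        env.pop(name, None)
    env.update(extra_env)
    probe = "import locale; print(locale.getpreferredencoding(False))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


c_locale = child_encoding({"LC_ALL": "C"})
utf8_locale = child_encoding({"LC_ALL": "en_US.UTF-8"})
print("C locale:", c_locale)
print("en_US.UTF-8:", utf8_locale)
```

If the two lines differ, a process started with the empty environment shown above will read files with a different codec than the same script started from the Terminal.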
Hi Giuseppe,

Have you tried to set your locale variable on your system? If it’s Ubuntu, you could have a look here:

https://ubuntuforums.org/showthread.php?t=2212353

Hope this helps,
Christian
Hi Christian,

The problem with Mac is that it is not that easy to change root settings. However, I have found a workaround, which may be useful to others: instead of invoking something like proc:system("python", "main.py"), I create a bash file like:

#!/bin/bash
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
absolute-path/python absolute-path/main.py

and then I run in BaseX proc:system("bash", "path-to-the-bash-file"). This works!

Ciao,
Giuseppe

Dr. Giuseppe G. A. Celano
DFG-project leader <http://gepris.dfg.de/gepris/projekt/408121292>
Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
Tel: +4934132223
E-mail: celano@informatik.uni-leipzig.de
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
Web site 2: https://sites.google.com/site/giuseppegacelano/
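Where the Python script itself can be edited, another option is to name the encoding explicitly when opening the file, so the result no longer depends on the locale of the calling process. The load function in utils.py is not shown in this thread, so load_lines below is a hypothetical stand-in, and the sketch assumes the input file is UTF-8:

```python
import os
import tempfile


def load_lines(path: str) -> list:
    """Read a text file as UTF-8 regardless of the process locale."""
    # Hypothetical stand-in for the loader in utils.py, which is not shown.
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# Self-contained demo: write a small UTF-8 file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("na\u00efve \u2019quoted\u2019\n")
    lines = load_lines(sample)

# Print only the count so the demo also runs on an ASCII-only stdout.
print(len(lines))
```

The bash wrapper remains the better fit when the script cannot be changed, or when it also prints non-ASCII text, since an explicit encoding on open() does not affect standard output.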
participants (3)
- Christian Grün
- Giuseppe G. A. Celano
- Liam R. E. Quin