[file:invalid-path] Invalid file path: 'Temp/pdfs/e5-轻奢版-80-global-en-us-dh9.fo
Any chance this limit can be overcome?
Hi France, what filesystem are you on? I know it's not very helpful to say "works on my machine", but on macOS I could simply use your example as a path:
[inline screenshot not preserved]
Can you try creating a file with that name from your Terminal or outside of BaseX? You might then try file:list() to see whether or not the filesystem changed something.
Hope this helps. In my opinion and experience, it's safest to stick to ASCII for portability.
Best
Michael
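The check suggested above (create the file outside BaseX, then list the directory) can be sketched in Python. This is only an illustration and not something posted in the thread: the helper name `name_survives_roundtrip` is hypothetical, and it deliberately works on raw UTF-8 bytes so that the result does not depend on the interpreter's own locale handling.

```python
import os
import tempfile


def name_survives_roundtrip(directory: str, name: str) -> bool:
    """Create an empty file called `name` in `directory` and report whether
    a directory listing returns exactly the same UTF-8 bytes."""
    raw_dir = os.fsencode(directory)
    raw_name = name.encode("utf-8")
    # On POSIX systems, bytes paths are handed to the OS unchanged,
    # so no locale-dependent conversion of the name takes place here.
    with open(os.path.join(raw_dir, raw_name), "wb"):
        pass
    return raw_name in os.listdir(raw_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ok = name_survives_roundtrip(tmp, "e5-轻奢版-80-global-en-us-dh9.fo")
        print("file name preserved:", ok)
```

If this reports that the name is preserved while a tool still shows it garbled, the filesystem is probably storing the bytes correctly and the display or locale layer is the more likely suspect.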
From: BaseX-Talk basex-talk-bounces@mailman.uni-konstanz.de on behalf of France Baril france.baril@architextus.com
Date: Tuesday, 31 May 2022 at 17:58
To: BaseX basex-talk@mailman.uni-konstanz.de
Cc: Eric Schieler eschieler@nutshellcommunication.com
Subject: [basex-talk] Cannot file:write when path has chinese characters...
[file:invalid-path] Invalid file path: 'Temp/pdfs/e5-轻奢版-80-global-en-us-dh9.fo
Any chance this limit can be overcome?
-- France Baril Architecte documentaire / Documentation architect france.baril@architextus.com
Hi France,
just a guess:
The leading string part
'Temp/pdfs/
already looks strange to me; but it may be correct nonetheless.
However, because of the Hanzi characters inside, you might try enclosing the complete path string (including the leading single quote) in double quotes (" "), or simply adding a trailing single quote ( ' ).
Then you could check whether the character encoding is correct (probably UTF-8) and that there's no mix-up with other Chinese/CJK encodings.
Further, your file path appears to lead to a file, and not a path.
Regards
Hi,
Some more details:
- I'm on Ubuntu and my client is on Windows, we both get the same behavior: correct output in english, error with chinese characters.
- I assume the characters are utf-8 since my basex setup is all utf-8. I query a db for the product name and then use that to build the file name to use in the file:write.
- When I file:list a folder that has files with Chinese characters, the characters get replaced by ���.
On Tue, May 31, 2022 at 9:29 PM Michael Topp info@mito-space.com wrote:
Well, drop that line, it's just too late in the evening:
On 31.05.22 at 22:19, mr_t wrote:
> Further, your file path appears to lead to a file, and not a path.
Well,
> When I file:list a folder that has files with Chinese characters, the characters get replaced by ���.
This one in particular looks to me like you're missing a properly configured Chinese font on either the host or the client system; otherwise there would be garbled ASCII character sequences. In other words, the Chinese characters currently have no glyphs available to render them.
Very likely Ubuntu has excellent wikis or howtos about CJK font installation. The Noto-fonts are quite popular. I am on Arch (and have no Windows), so here's another font list: https://wiki.archlinux.org/title/Localization/Chinese#Fonts
If this is not sufficient, try to configure the BaseX back-end to find your font. Or try to declare your fonts in a style sheet like so: https://www.w3.org/Style/styling-XML.en.html
Regards
Edit:
> otherwise there would be garbled ASCII character sequences
(in case of encoding problems).
I finally found the issue. It was with the docker instance.
docker > touch 轻奢版.txt
My system's volume: ls 轻奢版.txt
Me: This works!
docker > ls ''$'\350\275\273\345\245\242\347\211\210''.txt'
Me: Ah!
I rebuilt the docker image with different locale settings and voilà!
Thanks!
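The quoted name that `ls` printed inside the container is the same file name, shown byte by byte as octal escapes. A small Python sketch makes this visible (an illustration only, not from the thread; `decode_octal_escapes` is a hypothetical helper, and it assumes the escaped bytes are UTF-8). It also prints the locale-related environment variables, which may be worth inspecting inside a container image; the variable list is just the usual POSIX set, not something the thread specifies.

```python
import os
import re


def decode_octal_escapes(text: str) -> str:
    r"""Turn backslash-octal escapes such as \350\275\273 into the
    characters they encode, assuming the bytes are UTF-8."""
    raw = re.sub(
        rb"\\([0-3][0-7]{2})",                  # one escaped byte, \000 to \377
        lambda m: bytes([int(m.group(1), 8)]),  # octal digits -> single byte
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")


if __name__ == "__main__":
    escaped = r"\350\275\273\345\245\242\347\211\210.txt"
    print(decode_octal_escapes(escaped))
    # Show which locale settings the current environment actually carries.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        print(var, "=", os.environ.get(var))
```

The nine escaped bytes decode to the three characters 轻奢版, which suggests the name was stored intact and only its interpretation depended on the locale, consistent with the fix described above.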
Yeah, these strange little things ...
Congrats, glad it works now! ^^