Dear BaseX creators,
If the input is an XML file, everything works. If it is an XML string, however, BaseX only handles UTF-8 encoded XML. The reason is:

//---------------------------------------------------//
public static IO get(final String s) {
  if(s == null) return new IOFile("");
  if(s.startsWith("<")) return new IOContent(Token.token(s));
  if(s.startsWith("http://")) return new IOUrl(s);
  return new IOFile(s);
}
//---------------------------------------------------//
public static byte[] token(final String s) {
  final int l = s.length();
  if(l == 0) return EMPTY;
  final byte[] bytes = new byte[l];
  for(int i = 0; i < l; i++) {
    final char c = s.charAt(i);
    // any non-ASCII character: encode the whole string as UTF-8
    if(c > 0x7F) return utf8(s);
    bytes[i] = (byte) c;
  }
  return bytes;
}
//---------------------------------------------------//

token() does not look at the XML declaration at the head of the document and always produces UTF-8 bytes. The program, however, builds the XML structure with SAX, which parses the input according to the encoding information in that declaration. For this reason, errors occur as soon as the XML string declares an encoding other than UTF-8.
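
To make the mismatch concrete, here is a small standalone sketch that does not use any BaseX code. The class and method names (EncodingDemo, viaUtf8Bytes, viaReader) are made up for this mail. It parses the same string once as UTF-8 bytes, which is roughly what token() produces, and once through a character stream. As far as I understand the SAX InputSource documentation, a parser ignores the encoding declaration when it is given a character stream, so the second variant should not depend on the declaration at all.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingDemo {
  /** Parses the source with SAX and returns all character data it contains. */
  static String text(final InputSource src) {
    final StringBuilder sb = new StringBuilder();
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(src, new DefaultHandler() {
        @Override
        public void characters(final char[] ch, final int start, final int len) {
          sb.append(ch, start, len);
        }
      });
    } catch(final Exception ex) {
      throw new IllegalStateException(ex);
    }
    return sb.toString();
  }

  /** Byte stream: the parser decodes the bytes with the DECLARED encoding. */
  static String viaUtf8Bytes(final String xml) {
    try {
      return text(new InputSource(new ByteArrayInputStream(xml.getBytes("UTF-8"))));
    } catch(final UnsupportedEncodingException ex) {
      throw new IllegalStateException(ex);
    }
  }

  /** Character stream: the characters are already decoded. */
  static String viaReader(final String xml) {
    return text(new InputSource(new StringReader(xml)));
  }

  public static void main(final String[] args) {
    // declaration says ISO-8859-1, but the bytes handed to the parser are UTF-8
    final String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>";
    System.out.println("bytes  intact: " + viaUtf8Bytes(xml).equals("\u00e9"));
    System.out.println("reader intact: " + viaReader(xml).equals("\u00e9"));
  }
}
```

If that reading is right, passing a StringReader to SAX might be an alternative to re-encoding the string, but I have not checked how well it fits into the IO classes.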
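My fix:

  if(s.startsWith("<")) return new IOContent(Token.encoding(s));

//---------------------------------------------------//
public static byte[] encoding(final String s) {
  final int l = s.length();
  if(l == 0) return EMPTY;
  int i = 0;
  int j = 0;
  // collects the declared encoding, i.e. the text between the first pair
  // of double quotes that follows a 'c' and a '=' (as in encoding="...")
  final StringBuilder ss = new StringBuilder();
  final char[] hope = { 'c', '=', '"', '"' };
  // stop at the end of the string if no declaration is found
  while(i < l) {
    final char c = s.charAt(i);
    if(hope[j] == c) j++;
    if(j == 4) break;
    if(j > 2 && c != hope[j - 1]) {
      ss.append(c);
    }
    i++;
  }

  final byte[] bytes = new byte[l];
  for(int k = 0; k < l; k++) {
    final char c = s.charAt(k);
    // any non-ASCII character: encode with the declared encoding
    // (others() is not shown in this mail)
    if(c > 0x7F) return others(s, ss.toString());
    bytes[k] = (byte) c;
  }
  return bytes;
}
//---------------------------------------------------//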
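
The same idea can probably be written more compactly with a regular expression. The following is only a sketch under my own assumptions: XmlDeclEncoding, declaredEncoding and encode are hypothetical names, not existing BaseX methods, and the code falls back to UTF-8 when no encoding is declared. It uses String.getBytes(Charset), which needs Java 6 or later; on JDK 1.5, getBytes(String) would have to be used instead.

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlDeclEncoding {
  // Matches encoding="..." or encoding='...' inside a leading XML declaration.
  private static final Pattern ENC = Pattern.compile(
      "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

  /** Returns the encoding named in the XML declaration, or UTF-8 if none is given. */
  static String declaredEncoding(final String xml) {
    final Matcher m = ENC.matcher(xml);
    return m.find() ? m.group(1) : "UTF-8";
  }

  /** Encodes the string so that its bytes agree with its own declaration. */
  static byte[] encode(final String xml) {
    final String enc = declaredEncoding(xml);
    // Assumption: an encoding unknown to the JVM falls back to UTF-8,
    // which would bring the original mismatch back for such documents.
    final Charset cs = Charset.isSupported(enc)
        ? Charset.forName(enc) : Charset.forName("UTF-8");
    return xml.getBytes(cs);
  }

  public static void main(final String[] args) {
    final String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>";
    System.out.println(declaredEncoding(xml));
    System.out.println(encode(xml).length + " bytes for " + xml.length() + " characters");
  }
}
```

Compared with the loop above, this only looks inside a declaration at the very start of the string, so a document without a declaration is not searched for a stray 'c', '=' and quote.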
I wonder if you could fix this bug in the next version.

Also, BaseX does not seem to support building a database from an XML string that uses namespaces, and I have no idea why.

PS: I am using BaseX57 with JDK 1.5.
Best wishes,
Ruby
basex-talk@mailman.uni-konstanz.de