I am trying to index a CSV file through basexclient. I issued this command:
SET INTPARSE false; CREATE DB Properties Properties.csv
and got the error message below on the BaseX server. What should I do to index a CSV file through a basexclient command?
[kangadm@kangapp12 javaProcess]$ basexserver -d
BaseX 7.9 [Server]
Server was started (port: 1984)
Creating Database...
org.xml.sax.SAXParseException; systemId: file:///home/kangadm/basex/migros/production/javaProcess/Brands.sql.csv; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1436)
I solved this one with the following settings:
SET PARSER csv
SET CSVPARSER encoding=utf-8, header=true, separator=comma
SET CREATEFILTER *.csv
But now, with a CSV file of 16565274 rows (690M), I am getting the exception below during indexing, even though I have set the memory to 30G:
Creating Database...
java.lang.NegativeArraySizeException
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at org.basex.util.TokenBuilder.addByte(TokenBuilder.java:242)
    at org.basex.util.TokenBuilder.add(TokenBuilder.java:151)
    at org.basex.io.parse.XmlTokenBuilder.add(XmlTokenBuilder.java:96)
    at org.basex.io.parse.XmlTokenBuilder.addText(XmlTokenBuilder.java:51)
    at org.basex.io.parse.csv.CsvStringConverter.entry(CsvStringConverter.java:76)
    at org.basex.io.parse.csv.CsvParser.record(CsvParser.java:105)
    at org.basex.io.parse.csv.CsvParser.parse(CsvParser.java:80)
    at org.basex.io.parse.csv.CsvParser.parse(CsvParser.java:54)
    at org.basex.io.parse.csv.CsvConverter.convert(CsvConverter.java:37)
    at org.basex.build.CsvParser.toXML(CsvParser.java:40)
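A note on the trace: the exception is thrown from Arrays.copyOf while a TokenBuilder grows its buffer. One plausible reading, which the thread itself does not confirm, is that the buffer's int capacity overflowed while a very large document was being accumulated in a single array; if so, a larger heap would not help, because a Java array cannot exceed Integer.MAX_VALUE elements. The sketch below only illustrates that mechanism. The class and the method names naiveNewSize, safeNewSize and copyThrows are hypothetical and are not BaseX code.

```java
import java.util.Arrays;

/** Shows how doubling an int capacity can wrap around to a negative array size. */
final class GrowthOverflowDemo {

    /** Naive growth policy (hypothetical): double the current capacity. */
    static int naiveNewSize(int capacity) {
        // int arithmetic wraps silently once the result exceeds Integer.MAX_VALUE
        return capacity * 2;
    }

    /** Overflow-aware variant: compute in long and clamp to a positive int. */
    static int safeNewSize(int capacity) {
        long doubled = (long) capacity * 2;
        return (int) Math.min(doubled, (long) Integer.MAX_VALUE - 8);
    }

    /** Returns true if Arrays.copyOf rejects the requested length as negative. */
    static boolean copyThrows(int requestedLength) {
        try {
            Arrays.copyOf(new byte[1], requestedLength);
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int capacity = 1 << 30;              // 1 GiB worth of bytes
        int next = naiveNewSize(capacity);   // wraps to a negative value
        System.out.println("naive new size: " + next);
        System.out.println("copyOf throws:  " + copyThrows(next));
        System.out.println("safe new size:  " + safeNewSize(capacity));
    }
}
```

Clamping the size only postpones the limit; a document larger than about 2 GiB still cannot live in one array, so the input has to be processed in pieces.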
On Wed, Jan 7, 2015 at 9:30 PM, Erol Akarsu eakarsu@gmail.com wrote:
[...]
Hi Erol,
I am glad to tell you that this bottleneck should be removed as well: CSV parsing now requires constant memory. A new snapshot is online [1].
Looking forward to your feedback,
Christian
[1] http://files.basex.org/releases/latest
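To illustrate what parsing CSV in constant memory can mean, here is a minimal sketch that converts the input record by record and writes each record out immediately, so that only the current line is held in memory. It is not the BaseX implementation. The class name StreamingCsvSketch, the csv/record element names and the simplifications (comma separator, a header line whose names are valid XML element names, no quoted fields) are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Converts CSV to XML one record at a time instead of buffering the whole document. */
final class StreamingCsvSketch {

    /** Escapes the characters that are special in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Reads the header, then streams every following line to the writer as one record. */
    static void convert(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String headerLine = reader.readLine();
        out.write("<csv>");
        if (headerLine != null) {
            // Assumption: header names are valid XML element names.
            String[] names = headerLine.split(",", -1);
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) continue;            // skip blank lines
                String[] fields = line.split(",", -1);   // no support for quoted fields
                out.write("<record>");
                for (int i = 0; i < fields.length && i < names.length; i++) {
                    out.write("<" + names[i] + ">" + escape(fields[i]) + "</" + names[i] + ">");
                }
                out.write("</record>");
            }
        }
        out.write("</csv>");
        out.flush();
    }

    /** Convenience wrapper for small inputs; real use would pass file readers and writers. */
    static String convert(String csv) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(csv), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("Name,Price\nTea,3\nA&B,5\n"));
    }
}
```

With a file-backed Reader and Writer, memory use depends on the longest line rather than on the size of the file, which is the property that matters for a 690M input.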
On Thu, Jan 8, 2015 at 4:27 AM, Erol Akarsu eakarsu@gmail.com wrote:
[...]
basex-talk@mailman.uni-konstanz.de