Hi Christian,
Please find below the code I use. As you can see, I call the BaseXClient from the GitHub repository, which I have left unchanged: https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/or.... I have spent some time breaking down the "send" function in the BaseXClient and have tracked the delay down to the "bos.write" call, which writes to the raw socket created in the constructor. I also tested the function with a byte array rather than a single byte, and the problem persists, which suggests a socket buffering problem on the server side. Increasing the buffer size of the BaseXClient socket made no difference either. The XML data files that I'm sending vary, so the delay is not tied to a particular structure within one of them. I also tried the internal parser, but that made no difference; if the cause is the lower-level server buffers, I wouldn't expect it to.
-----------------------------------
// Document builder initialisation
DocumentBuilderFactory fDocumentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder fDocumentBuilder = fDocumentBuilderFactory.newDocumentBuilder();
DOMImplementation fDOMImplementation = fDocumentBuilder.getDOMImplementation();

// Generate XML from select object fields.
Document fDocument = fDOMImplementation.createDocument(null, this.getClass().getName(), null);
Element fElement = fDocument.getDocumentElement();
fMyObject.toXML(fDocument, fElement);

// Do transformation
DOMSource fDomSource = new DOMSource(fDocument);
StringWriter fStringWriter = new StringWriter();
StreamResult fStringStreamResult = new StreamResult(fStringWriter);
fTransformer = fTransformerFactory.newTransformer();
fTransformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
fTransformer.transform(fDomSource, fStringStreamResult);
String fXMLSource = fStringWriter.toString();
// BaseXClient Start
final BaseXClient fBaseXClient = new BaseXClient(fHost, fPort, fUserName, fPassword);
try {
  fBaseXClient.execute("open " + fIdentity);
  doSignal(fBaseXClient.info());
  // replace() in the example client takes an InputStream, so pass the stream
  InputStream fInputStream = new ByteArrayInputStream(fXMLSource.getBytes());
  fBaseXClient.replace(fPathName, fInputStream);
  doSignal(fBaseXClient.info());
} finally {
  fBaseXClient.close();
}
// BaseXClient Finish.
-----------------------------------
Jonathan.
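
A note on the serialisation step in the code above: the transformer is asked for ISO-8859-1 output but writes into a StringWriter, and the string is then converted with getBytes(), which uses the platform default charset. The XML declaration and the actual bytes may therefore disagree for non-ASCII content. The following is a minimal sketch, not a fix for the reported delay, of serialising straight to a byte stream so that the transformer applies the declared encoding itself. The class name XmlBytes, the method name toUtf8Bytes and the choice of UTF-8 are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, named for illustration only.
class XmlBytes {

  /** Serialises a DOM document to bytes whose encoding matches the XML declaration. */
  static byte[] toUtf8Bytes(Document doc) throws Exception {
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // A byte-stream result lets the transformer encode the characters itself,
    // so no separate String.getBytes() call with a default charset is needed.
    transformer.transform(new DOMSource(doc), new StreamResult(out));
    return out.toByteArray();
  }

  public static void main(String[] args) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .getDOMImplementation().createDocument(null, "example", null);
    doc.getDocumentElement().setTextContent("caf\u00e9");

    byte[] bytes = toUtf8Bytes(doc);
    // This stream is what a stream-based call such as replace(path, in) would receive.
    InputStream in = new ByteArrayInputStream(bytes);
    System.out.println(in.available() + " bytes ready to send");
  }
}
```

This also avoids holding the document as both a String and a byte array at the same time, which may matter for files of several megabytes.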
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: 13 March 2015 22:30
To: Jonathan Clarke
Cc: BaseX
Subject: Re: [basex-talk] Large Document Upload Performance
Hi Jonathan,
> I wouldn't be able to provide you with the data itself, but I'm not using a query; I'm simply using the BaseXClient that's provided on your site. It's just a connection opened to the server, followed by a call to the replace function.
Could you please post the lines of code you have been using so far to replace documents?
Besides that, you could check out the e-mail from Simon Chatelain [1]: in many cases you can, for example, speed up the import of documents by using our internal parser (INTPARSE = true).
Best, Christian
[1] https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg05911.htm...
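
For readers following the INTPARSE suggestion, here is a minimal sketch of how the option could be switched on from client code by sending a SET command before opening the database. CommandSession is a hypothetical stand-in interface so the example runs without a server; with the example client, BaseXClient.execute(String) would play that role. The database name "mydb" is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch class, named for illustration only.
class IntParseSketch {

  /** Stand-in for a client session; BaseXClient.execute(String) would play this role. */
  interface CommandSession {
    String execute(String command) throws Exception;
  }

  /** Enables the internal parser for this session, then opens the database. */
  static void openWithInternalParser(CommandSession session, String database) throws Exception {
    session.execute("SET INTPARSE true");
    session.execute("OPEN " + database);
  }

  public static void main(String[] args) throws Exception {
    // Record the commands instead of sending them to a real server.
    List<String> sent = new ArrayList<>();
    CommandSession recorder = command -> { sent.add(command); return ""; };

    openWithInternalParser(recorder, "mydb");
    sent.forEach(System.out::println);
  }
}
```

Options set this way apply to the current session, so the command would need to be sent on the same connection that later performs the replace.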
What's the typical time you would expect to see for a file of that size? Some research online suggests that the delay is caused by the document indexing that starts at the point of update. In the meantime, I'll try to construct a nondescript file of similar size that we can use. Are there any other performance-enhancing settings that you've recommended to others for similar reports? As with flushing, am I able to postpone or turn off the document indexing until I'm ready to call the function explicitly?
Jonathan.
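
On the flushing question, one pattern that could be tried is to disable AUTOFLUSH for the duration of an update and issue an explicit FLUSH afterwards. The sketch below only shows the order of commands; whether it helps with the delay described in this thread is untested. CommandSession and Update are hypothetical stand-ins so the example runs without a server; with the example client, BaseXClient.execute(String) would send the commands and the update would be the replace call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch class, named for illustration only.
class DeferredFlushSketch {

  /** Stand-in for a client session; BaseXClient.execute(String) would play this role. */
  interface CommandSession {
    String execute(String command) throws Exception;
  }

  /** Stand-in for the update itself, e.g. a replace call on the client. */
  interface Update {
    void run() throws Exception;
  }

  /** Runs the update with AUTOFLUSH off, then flushes once and restores the option. */
  static void withDeferredFlush(CommandSession session, Update update) throws Exception {
    session.execute("SET AUTOFLUSH false");
    try {
      update.run();
    } finally {
      // FLUSH writes buffered changes of the opened database to disk.
      session.execute("FLUSH");
      session.execute("SET AUTOFLUSH true");
    }
  }

  public static void main(String[] args) throws Exception {
    // Record the sequence of operations instead of talking to a real server.
    List<String> log = new ArrayList<>();
    CommandSession recorder = command -> { log.add(command); return ""; };

    withDeferredFlush(recorder, () -> log.add("(replace document here)"));
    log.forEach(System.out::println);
  }
}
```

The try/finally ensures the flush and the option reset still happen if the update throws, so a failed upload does not leave the session with flushing disabled.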
-----Original Message-----
From: Christian Grün [mailto:christian.gruen@gmail.com]
Sent: 13 March 2015 19:12
To: Jonathan Clarke
Cc: BaseX
Subject: Re: [basex-talk] Large Document Upload Performance
Hi Jonathan,
> I hope you can help me. I am using BaseX to underpin a complex distributed system, which also requires storage of XML documents in soft real time. At the moment, I’m getting storage times of about 500 ms for a 4 MB XML file. Can you advise how I might bring that down by at least 75%, please?
We'll probably need more information on your queries etc.
> I also tried to use AddCache, and that just crashed the latest production release of the server.
If you find out how we can reproduce this, your feedback is welcome.
Best, Christian