Hi,
we're using BaseX to store multiple collections of documents (we call
them records).
These records are produced programmatically, by parsing an incoming
stream in a server application and turning each entry into a document of
the form
<record id="123" version="1">
...
</record>
So far I have taken the following approach:
- each collection of records is its own database in BaseX, for easier management
- on insertion (sketched in code below):
  - set the session's autoflush to false
  - iterate over the records
  - add each one via add(id, document)
  - every 10,000 records, flush
  - finally, flush once more
  - create the attribute index
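For reference, that loop boils down to something like the following
simplified Java sketch against the client session API (host, port,
credentials and the (id, xml) pairing of records are placeholders, not
our actual code):

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import org.basex.api.client.ClientSession;

public class BulkInsert {
  private static final int FLUSH_INTERVAL = 10_000; // flush every 10,000 records

  // records arrive as (id, xml) pairs from the parsed incoming stream
  static void insertInto(String db, List<String[]> records) throws Exception {
    try (ClientSession session =
        new ClientSession("localhost", 1984, "admin", "admin")) {
      session.execute("CHECK " + db);         // open the collection's database
      session.execute("SET AUTOFLUSH false"); // defer writes to disk

      int n = 0;
      for (String[] rec : records) {
        // rec[0] = record id (used as the resource path), rec[1] = the document
        session.add(rec[0],
            new ByteArrayInputStream(rec[1].getBytes(StandardCharsets.UTF_8)));
        if (++n % FLUSH_INTERVAL == 0) session.execute("FLUSH");
      }
      session.execute("FLUSH");                  // final flush
      session.execute("CREATE INDEX attribute"); // rebuild the attribute index
    }
  }
}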
So, for example, we now have:

Name    Resources  Size        Input Path
------------------------------------------
col1    14141      19815190
col2    14750      16697081
col3    84450      253593687
col4    1012477    2107593252
col5    126058     186315175
col6    13767      14640701
col7    815991     730536864
col8    31189      39598405
col9    24733      91277637
col10   171906     202392553
...
and there'll be quite a bit more coming in.
This kind of bulk insertion can also happen concurrently (I've set up
an actor pool of five for the moment); in outline it amounts to the
sketch below.
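Sketched with a plain ExecutorService standing in for the actual actor
pool (the grouping of records by collection is illustrative), the
concurrent setup is essentially:

import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentLoad {
  // records grouped by target collection, as produced by the parser
  static void load(Map<String, List<String[]>> recordsByCollection)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(5); // pool of five
    recordsByCollection.forEach((db, records) ->
        pool.submit(() -> {
          try {
            BulkInsert.insertInto(db, records); // the loop sketched above
          } catch (Exception e) {
            e.printStackTrace();
          }
        }));
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS); // wait for all loads to finish
  }
}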
My questions are:
- is this the most performant approach, or would it make sense to e.g.
build one stream on the fly and somehow turn it into an InputStream to
be sent via a single add?
- is there a performance cost in adding with an ID? We don't really
need the IDs, since we retrieve records via a query (e.g. on the id
attribute, something like /record[@id = '123']) - and those resources
aren't really files on the file system anyway
- is there a performance penalty in doing this kind of parsing concurrently?
- are there any JVM parameters that would help speed this up? I haven't
quite found out how to pass JVM parameters when starting basexserver
from the command line. It looks like BaseX gave itself an -Xmx of
1866006528 bytes (roughly 1.7 GB), but that machine has 8 GB, so it
could in theory get more.
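(For what it's worth, I would have guessed something along the lines of

  BASEX_JVM="-Xmx6g" basexserver

assuming the launch script picks up a variable like BASEX_JVM, or else
editing the -Xmx value in the basexserver script directly, but I haven't
verified either.)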
Thanks!
Manuel