Hi Christian,
Let's move this discussion back to the mailing list, as I think it might be interesting for everybody.
Summary: I had stability problems. In my tests, once the BaseX database reached a certain size, it would start throwing exceptions with stack traces reaching deep into the low-level classes that shuffle the bits and bytes on disk.
I now started a test run of our system with PARALLEL=1 and UPDINDEX=false. The system has been running without any exceptions for 17 hours. Starting with an empty database, there have been almost 200K queries and commands. This is a record.
Now, what exactly are the consequences of UPDINDEX being false?
Can I do queries? What kind of queries can't I do? What kind of queries will be slower? Will the database rebuild the index automatically? When? What is the command to trigger an index rebuild manually? Is the index rebuild a kind of "stop-the-world" operation, so that BaseX becomes unresponsive, or will it be done in parallel? What are the benefits of the indices, and can I live without them?
-- Antoni Myłka Software Engineer
basis06 AG, Birkenweg 61, CH-3013 Bern - Fon +41 31 311 32 22 http://www.basis06.ch - source of smart business
----- Original Message -----
From: "Christian Grün" christian.gruen@gmail.com
To: "Antoni Mylka" amy@basis06.ch
Sent: Tuesday, 20 November 2012 16:46:30
Subject: Re: [basex-talk] Database corruption
Hi Antoni,
> I'll also try to run an experiment without UPDINDEX.
Yes, this would be interesting. I did some more testing with the given queries, but haven't encountered anything noteworthy so far.
Thanks, Christian
Hi Antoni,
thanks for your feedback.
> I now started a test run of our system with PARALLEL=1 and UPDINDEX=false. The system has been running without any exceptions for 17 hours. Starting with an empty database, there have been almost 200K queries and commands. This is a record.
That's good to hear. My assumption is that errors occurring with PARALLEL!=1 and UPDINDEX=false would point to a concurrency issue, while errors occurring with PARALLEL=1 and UPDINDEX=true would point to a bug in the updatable data structures.
> Now, what exactly are the consequences of UPDINDEX being false? Can I do queries?
Absolutely; all queries are still possible.
> What kind of queries will be slower?
All those queries that depend on the value indexes (e.g., text and attribute value indexes). See [1] for some examples.
> Will the database rebuild the index automatically?
No; that is exactly what happens if UPDINDEX is set to true. In many cases, however, it will be much faster to perform updates with UPDINDEX=false and rebuild the indexes manually afterwards, using OPTIMIZE or db:optimize().
> What is the command to trigger an index rebuild manually?
OPTIMIZE will recreate your outdated index structures. You can also use CREATE INDEX to explicitly create or rebuild a specific index.
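For readers following along, the manual rebuild could be sketched as the BaseX command script below. It uses only the OPTIMIZE and CREATE INDEX commands named in this reply; the database name `mydb` is a placeholder that does not appear in the thread, and the `#` comment lines assume a BaseX version whose command scripts accept them.

```
# 'mydb' is a hypothetical database name
OPEN mydb

# Recreate all outdated index structures in one go
OPTIMIZE

# Alternatively, create or rebuild one specific value index
CREATE INDEX TEXT
CREATE INDEX ATTRIBUTE
```

Running OPTIMIZE alone should normally be enough; the CREATE INDEX lines are only needed when a single index is to be rebuilt.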
> Is the index rebuild a kind of "stop-the-world" operation, so that BaseX becomes unresponsive, or will it be done in parallel? What are the benefits of the indices, and can I live without them?
As you guessed, indexing is flagged as a write operation and will thus queue other operations. In many use cases, this works fine, because indexing is a rather fast operation in BaseX.
But, quite naturally, the notion of what “fast” means depends very much on your context, and the choice between implicit and manual indexing depends on your envisaged read/write patterns.
Hope this helps, Christian
PS: I would be interested to hear if you have already thought about benchmarking 200K update operations with UPDINDEX=false, AUTOFLUSH=false and a final OPTIMIZE call.
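A rough sketch of such a benchmark run, written as a BaseX command script, might look as follows. The database name `bench`, the document path `doc.xml`, and the generated `<entry/>` elements are placeholders rather than anything from Antoni's system, and the single bulk insert merely stands in for the 200K separate update operations of a real workload. The `#` comment lines assume a BaseX version whose command scripts accept them.

```
# Options must be set before the database is created
SET UPDINDEX false
SET AUTOFLUSH false

# 'bench' and 'doc.xml' are hypothetical names
CREATE DB bench
ADD TO doc.xml <root/>

# Stand-in for the real update workload
XQUERY for $i in 1 to 200000 return insert node <entry id="{ $i }"/> into /root

# With AUTOFLUSH=false, write the buffers to disk explicitly
FLUSH

# Rebuild the index structures once, after all updates
OPTIMIZE
```

Comparing the total run time of this script against a run with UPDINDEX=true would give a first impression of the cost of incremental index maintenance.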
Hi Christian,
Just to let you know: I am running another experiment and seeing other issues. This is with BaseX75-20121114.004957.war configured with PARALLEL=1 and UPDINDEX=false. Indeed, the stack trace doesn't mention the Updates or AtomicUpdateList classes, but the result is still not optimal. The good result from last week may have been a coincidence. In my log there are about 72K queries at the moment, and catalina.out contains 34 exceptions. They correspond to 34 HTTP 500 responses in the access log.
Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk@mailman.uni-konstanz.de
Version: BaseX 7.5 beta
Java: Sun Microsystems Inc., 1.6.0_27
OS: Linux, amd64
Stack Trace:
java.lang.RuntimeException: Data Access out of bounds:
- pre value: 1716983
- #used blocks: 6757
- #total locks: 6771
- access: 6756 (6757 > 6756]
org.basex.util.Util.notexpected(Util.java:53)
org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:489)
org.basex.io.random.TableDiskAccess.read1(TableDiskAccess.java:189)
org.basex.data.Data.kind(Data.java:281)
org.basex.query.value.node.DBNode.<init>(DBNode.java:49)
org.basex.query.value.seq.DBNodeSeq.itemAt(DBNodeSeq.java:81)
org.basex.query.value.seq.DBNodeSeq.itemAt(DBNodeSeq.java:1)
org.basex.query.value.seq.Seq$1.next(Seq.java:94)
org.basex.query.path.MixedPath.iter(MixedPath.java:83)
org.basex.query.QueryContext.iter(QueryContext.java:289)
org.basex.query.flwor.For$1.init(For.java:121)
org.basex.query.flwor.For$1.next(For.java:85)
org.basex.query.flwor.FLWR$1.next(FLWR.java:63)
org.basex.query.expr.Constr.add(Constr.java:66)
org.basex.query.expr.CElem.item(CElem.java:84)
org.basex.query.expr.CElem.item(CElem.java:1)
org.basex.query.expr.CFrag.item(CFrag.java:1)
org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
org.basex.query.QueryContext.iter(QueryContext.java:289)
org.basex.query.QueryContext.iter(QueryContext.java:243)
org.basex.query.QueryProcessor.iter(QueryProcessor.java:76)
org.basex.server.QueryListener.execute(QueryListener.java:127)
org.basex.server.LocalQuery.execute(LocalQuery.java:53)
org.basex.http.rest.RESTQuery.query(RESTQuery.java:93)
org.basex.http.rest.RESTQuery.run(RESTQuery.java:47)
org.basex.http.rest.RESTPost.run(RESTPost.java:107)
org.basex.http.rest.RESTServlet.run(RESTServlet.java:14)
org.basex.http.BaseXServlet.service(BaseXServlet.java:39)
javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
-- Antoni Myłka Software Engineer
basis06 AG, Birkenweg 61, CH-3013 Bern - Fon +41 31 311 32 22 http://www.basis06.ch - source of smart business
Antoni,
sorry that the bug hunt is still ongoing. We recently had another look at our basex-tests repository, which includes various stress tests [1], and at our internal test cases, but it seems to be pretty difficult for us to reproduce the reported problems. If we manage to get closer to the core problem, we’ll be glad to dig deeper into this.
Christian
[1] https://github.com/BaseXdb/basex-tests