Hello,
I asked this question on Stack Overflow about some performance problems I experienced when inserting nodes into a BaseX database: https://stackoverflow.com/questions/51595210/basex-inserting-nodes-performan...
I have already made some progress, especially when it comes to querying all the data I need for the updates. I work a lot with the indexes now.
But I still have problems with inserting (and also deleting) nodes. It doesn't matter whether I insert/delete nodes via a Java program or in the editor of the BaseX GUI: both are quite slow. Inserting just one node in the GUI with an XQuery like this one takes up to 3 seconds:
insert node <related_record><title>Test title</title><author>Joe Lastname</author></related_record> into db:open-id('Database_Name', 7947561)
Deleting a node with the following command takes up to 7 seconds: delete node db:open-id('Database_Name', 88085737)
The problem is that in my case, I have to do about 150000 inserts and deletes, so it would take too much time.
Maybe my database is just too big to be performant? Or some settings are wrong? I'm very new to BaseX (and XML databases in general) so maybe there are just some errors I don't see. I also give you some information on my database that I copied from the info screen of the BaseX GUI:
Database Properties
  NAME: Database_Name
  SIZE: 2568 MB
  NODES: 135607105
  DOCUMENTS: 1
  BINARIES: 0
  TIMESTAMP: 2018-08-07T07:05:56.000Z
  UPTODATE: true

Resource Properties
  INPUTPATH: /path/to/file.xml
  INPUTSIZE: 1774 MB
  INPUTDATE: 2018-07-24T14:32:58.000Z

Indexes
  TEXTINDEX: true
  ATTRINDEX: true
  TOKENINDEX: false
  FTINDEX: false
  TEXTINCLUDE:
  ATTRINCLUDE:
  TOKENINCLUDE:
  FTINCLUDE:
  LANGUAGE: English
  STEMMING: false
  CASESENS: false
  DIACRITICS: false
  STOPWORDS:
  UPDINDEX: true
  AUTOOPTIMIZE: false
  MAXCATS: 100
  MAXLEN: 96
  SPLITSIZE: 0
Best regards, Michael
Please note that, effective immediately, you can reach us at a new telephone number. Please save your AK Wien contact right away under 501 65 1, followed by the usual extension. This email is intended solely for use by the addressee(s) named in it and may contain confidential or legally protected information whose use without the sender's permission may be unlawful. If you have received this email in error, please inform us and delete the message. UID: ATU 16209706 | https://wien.arbeiterkammer.at/Datenschutz_(DSGVO).html
Michael,
Welcome to the list.
One thing you could try immediately is to call OPTIMIZE (possibly followed by the ALL flag) or db:optimize(..., true()) and see if performance improves. Obviously, this doesn't make sense after every single update operation, but it could be called before a larger number of updates is performed.
> The problem is that in my case, I have to do about 150000 inserts and deletes, so it would take too much time.
If you define all the insert expressions (or at least a larger number than just 1 or 10) in a single XQuery expression (via a FLWOR expression), you will benefit from various bulk optimizations. Did you try that already?
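The single-expression approach could be sketched roughly as follows. This is only an illustration: the Java helper just builds the query string, and the class names `BulkInsertQuery` and `Target`, the wrapper element `t`, and the overall shape of the generated XQuery are assumptions, not code from this thread. It reuses the `db:open-id` call and the `related_record` element from the original example, assumes the titles and authors need no escaping, and the generated query has not been run against BaseX here.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds ONE FLWOR update expression for many inserts. */
final class BulkInsertQuery {

  /** Hypothetical value type: node ID of the insert target plus the record fields. */
  static final class Target {
    final long nodeId;
    final String title;
    final String author;

    Target(long nodeId, String title, String author) {
      this.nodeId = nodeId;
      this.title = title;
      this.author = author;
    }
  }

  /** Returns a single query that iterates over all targets and inserts one record each. */
  static String build(String db, List<Target> targets) {
    // Each target becomes a wrapper element <t id="..."> carrying the record to insert.
    String items = targets.stream()
        .map(t -> "<t id=\"" + t.nodeId + "\"><related_record><title>" + t.title
            + "</title><author>" + t.author + "</author></related_record></t>")
        .collect(Collectors.joining(",\n  "));
    // The FLWOR loop looks up every target node by ID and inserts the wrapped record.
    return "for $t in (\n  " + items + "\n)\n"
        + "return insert node $t/related_record into db:open-id('" + db
        + "', xs:integer($t/@id))";
  }
}
```

The resulting string would then be executed once, in the same way as the single-insert queries, instead of issuing one query per node.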
Best, Christian
Hi,
I have run a lot of tests now and the inserts are much faster. I found out that the database was not optimized. When I optimize the DB before calling the inserts or deletes, they execute very fast.
But I now have another problem when I optimize the DB from Java code. I try to insert the new nodes in chunks of 8000 nodes per command. As I already said, I have to insert a lot of nodes (more than 100000), and I suspect that would be too much for a single XQuery command, though I am not sure. So what I basically do now is this:
1. Gathering all the information I need for the insert commands
2. Creating a loop in Java that executes the insert commands every 8000th iteration
3. After each iteration, I try to optimize the database with this Java command: new XQuery("db:optimize('DB_Name', false())").execute(context);
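The batching in step 2 can be sketched as below. This is a minimal illustration under assumptions: `InsertBatches` and `toQueries` are hypothetical names, and the helper only groups already-built insert expressions into comma-separated query strings; it does not talk to BaseX.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups insert expressions into batches, one query string per batch. */
final class InsertBatches {

  /** Splits the expressions into batches of at most batchSize and joins each batch with commas. */
  static List<String> toQueries(List<String> insertExprs, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<String> queries = new ArrayList<>();
    for (int from = 0; from < insertExprs.size(); from += batchSize) {
      int to = Math.min(from + batchSize, insertExprs.size());
      // A comma-separated sequence of updating expressions forms a single query.
      queries.add(String.join(",\n", insertExprs.subList(from, to)));
    }
    return queries;
  }
}
```

Each returned string would be passed to `new XQuery(query).execute(context)` as described in this message, with a batch size of 8000 in this case.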
The problem is that when the second iteration of the loop tries to insert the next 8000 nodes, I get this exception: "TM: no context value bound". I have added the complete stack trace below. I already tried to create a new context, but that didn't work. What could be the problem?
Thank you very much for your help in advance! Michael
Here is the complete stack trace:
org.basex.core.BaseXException: Stopped at ., 1/5847730: [XPDY0002] TM: no context value bound.
  at org.basex.core.Command.execute(Command.java:94)
  at org.basex.core.Command.execute(Command.java:116)
  at actions.basex.Test.run(Test.java:256) --> This line contains the insert command: new XQuery([INSERT_QUERY]).execute(context);
  at actions.basex.Test.<init>(Test.java:50)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at main.Main.main(Main.java:44)
Caused by: org.basex.query.QueryException: TM: no context value bound.
  at org.basex.query.QueryError.get(QueryError.java:1392)
  at org.basex.query.expr.path.Step.checkNode(Step.java:227)
  at org.basex.query.expr.path.IterStep$1.next(IterStep.java:37)
  at org.basex.query.expr.path.CachedPath.iter(CachedPath.java:72)
  at org.basex.query.expr.path.CachedPath.nodeIter(CachedPath.java:51)
  at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:69)
  at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:40)
  at org.basex.query.expr.constr.Constr.add(Constr.java:70)
  at org.basex.query.expr.constr.CElem.item(CElem.java:93)
  at org.basex.query.expr.constr.CElem.item(CElem.java:1)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
  at org.basex.query.expr.constr.Constr.add(Constr.java:70)
  at org.basex.query.expr.constr.CElem.item(CElem.java:93)
  at org.basex.query.expr.constr.CElem.item(CElem.java:1)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
  at org.basex.query.expr.List$1.iter(List.java:151)
  at org.basex.query.expr.List$1.next(List.java:119)
  at org.basex.query.QueryContext.next(QueryContext.java:397)
  at org.basex.query.expr.constr.Constr.add(Constr.java:71)
  at org.basex.query.up.expr.Insert.item(Insert.java:55)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
  at org.basex.query.expr.List$1.iter(List.java:151)
  at org.basex.query.expr.List$1.size(List.java:141)
  at org.basex.query.scope.MainModule.cache(MainModule.java:99)
  at org.basex.query.QueryContext.iter(QueryContext.java:333)
  at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
  at org.basex.core.cmd.AQuery.query(AQuery.java:100)
  at org.basex.core.cmd.XQuery.run(XQuery.java:22)
  at org.basex.core.Command.run(Command.java:257)
  at org.basex.core.Command.execute(Command.java:93)
  ... 8 more

https://www.wie-soll-arbeit.at
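One possible reading of this trace, which the thread itself does not confirm: the error code XPDY0002 together with the `CElem` and `AxisPath` frames would fit inserted text that contains curly braces, for example `{TM}`. Inside an XQuery element constructor, `{...}` is an enclosed expression, so `TM` would be evaluated as a path step without a context item. If the queries are built by string concatenation, a helper along the following lines could escape such text first. `XQueryText` and `escape` are hypothetical names, and this is a sketch under that assumption rather than a confirmed fix.

```java
/** Hypothetical helper: escapes text for use inside an XQuery direct element constructor. */
final class XQueryText {

  /** Doubles curly braces and replaces the characters that are special in element content. */
  static String escape(String text) {
    StringBuilder sb = new StringBuilder(text.length() + 16);
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      switch (c) {
        case '{': sb.append("{{"); break;    // a single "{" would open an enclosed expression
        case '}': sb.append("}}"); break;    // a single "}" would close one
        case '&': sb.append("&amp;"); break; // "&" starts an entity or character reference
        case '<': sb.append("&lt;"); break;  // "<" would start a nested element
        default:  sb.append(c);
      }
    }
    return sb.toString();
  }
}
```

With this helper, a value such as `Brand {TM}` would be written into the query as `Brand {{TM}}`, which XQuery reads as literal braces.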
Hi Michael,
that's good to hear.
In order to understand the error message, I’ll have to look at your XQuery code. Could you please attach it to your next reply?
Thanks, Christian
Hi Christian,
Sure, here it is. First, I insert the nodes with this command:

    let $parent := db:open('DB_Name')/path/to/parent/node
    return insert node
      <childNode><subNode>Example 1</subNode><subNode>Example 2</subNode></childNode>
      into $parent

I execute 8000 of these commands at once. They are comma-separated and, as I already said, with an optimized database they execute very fast. Then I flush the changes with this command: db:flush('DB_Name')
The last command before executing the next 8000 inserts is the optimize command: db:optimize('DB_Name', false())
Best regards, Michael https://www.wie-soll-arbeit.at
Could you possibly attach the resulting query as well (the one that caused the error message)?
BIRKNER Michael Michael.BIRKNER@akwien.at schrieb am Do., 30. Aug. 2018, 08:39:
Hi Christian,
sure, here it is. At first, I insert the nodes with this command:
let $parent := db:open('DB_Name')/path/to/parent/node return insert node <childNode><subNode>Example 1</subNode><subNode>Example 2</subNode></childNode> into $parent
I execute 8000 of these commands at once. They are comma separated and, as I already said, with an optimized database, they execute very fast. Then I flush the changes with this command: db:flush('DB_Name')
And the last command before executing the next 8000 inserts is the optimize command: db:optimize('DB_Name', false())
Best regards, Michael https://www.wie-soll-arbeit.at
*Von:* Christian Grün christian.gruen@gmail.com *Gesendet:* Donnerstag, 30. August 2018 08:07 *An:* BIRKNER Michael *Cc:* BaseX *Betreff:* Re: [basex-talk] BaseX insert/delete node performance
Hi Michael,
that's good to hear.
In order to understand the error message, I’ll have to look at your XQuery code. Could you please attach it to your next reply?
Thanks, Christian
BIRKNER Michael Michael.BIRKNER@akwien.at wrote on Thu, 30 Aug 2018, 07:59:
Hi,
I have now run a lot of tests, and the inserts are much faster. I found out that the database was not optimized. When I optimize the DB before running the inserts or deletes, they execute very fast.
But I now have another problem when I optimize the DB from Java code. I try to insert the new nodes in chunks of 8000 nodes per command. As I already said, I have to insert a lot of nodes (more than 100000), and I suspect that would be too much for a single XQuery command. So what I basically do now is this:
1. Gathering all the information I need for the insert commands
2. Creating a loop in Java that executes the insert commands every 8000th iteration
3. After each iteration, I try to optimize the database with this Java command: new XQuery("db:optimize('DB_Name', false())").execute(context);
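The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (InsertBatches, partition, toQuery) are hypothetical and not part of BaseX, and the actual BaseX calls are left as comments, since the thread only shows new XQuery(...).execute(context) and db:optimize.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of BaseX): splits prepared insert
// expressions into fixed-size batches, one XQuery string per batch.
final class InsertBatches {

    // Splits items into consecutive batches of at most `size` elements.
    static <T> List<List<T>> partition(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            int end = Math.min(i + size, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Joins the insert expressions of one batch with commas, which is
    // how the thread describes sending many inserts in a single query.
    static String toQuery(List<String> insertExpressions) {
        return String.join(",\n", insertExpressions);
    }

    public static void main(String[] args) {
        List<String> inserts = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            inserts.add("insert node <childNode><subNode>Example " + i
                + "</subNode></childNode> into db:open('DB_Name')/path/to/parent/node");
        }
        // The thread uses batches of 8000; 2 is used here to keep the demo short.
        for (List<String> batch : partition(inserts, 2)) {
            String query = toQuery(batch);
            // In the thread, this is where the query would be executed:
            //   new XQuery(query).execute(context);
            //   new XQuery("db:optimize('DB_Name', false())").execute(context);
            System.out.println(query);
            System.out.println("-- end of batch --");
        }
    }
}
```

Keeping the batching separate from query execution makes it possible to print or log the exact query string of a failing batch, which is what the error position (line/column) in a BaseX exception refers to.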
The problem is that - when the second iteration of the loop tries to insert the next 8000 nodes - I get this exception "TM: no context value bound". I add the complete stack trace below. I already tried to create a new context, but that didn't work. What could be the problem?
Thank you very much for your help in advance! Michael
Here is the complete stack trace:
org.basex.core.BaseXException: Stopped at ., 1/5847730: [XPDY0002] TM: no context value bound.
    at org.basex.core.Command.execute(Command.java:94)
    at org.basex.core.Command.execute(Command.java:116)
    at actions.basex.Test.run(Test.java:256) --> This line contains the insert command: new XQuery([INSERT_QUERY]).execute(context);
    at actions.basex.Test.<init>(Test.java:50)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at main.Main.main(Main.java:44)
Caused by: org.basex.query.QueryException: TM: no context value bound.
    at org.basex.query.QueryError.get(QueryError.java:1392)
    at org.basex.query.expr.path.Step.checkNode(Step.java:227)
    at org.basex.query.expr.path.IterStep$1.next(IterStep.java:37)
    at org.basex.query.expr.path.CachedPath.iter(CachedPath.java:72)
    at org.basex.query.expr.path.CachedPath.nodeIter(CachedPath.java:51)
    at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:69)
    at org.basex.query.expr.path.AxisPath.iter(AxisPath.java:40)
    at org.basex.query.expr.constr.Constr.add(Constr.java:70)
    at org.basex.query.expr.constr.CElem.item(CElem.java:93)
    at org.basex.query.expr.constr.CElem.item(CElem.java:1)
    at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
    at org.basex.query.expr.constr.Constr.add(Constr.java:70)
    at org.basex.query.expr.constr.CElem.item(CElem.java:93)
    at org.basex.query.expr.constr.CElem.item(CElem.java:1)
    at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
    at org.basex.query.expr.List$1.iter(List.java:151)
    at org.basex.query.expr.List$1.next(List.java:119)
    at org.basex.query.QueryContext.next(QueryContext.java:397)
    at org.basex.query.expr.constr.Constr.add(Constr.java:71)
    at org.basex.query.up.expr.Insert.item(Insert.java:55)
    at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:48)
    at org.basex.query.expr.List$1.iter(List.java:151)
    at org.basex.query.expr.List$1.size(List.java:141)
    at org.basex.query.scope.MainModule.cache(MainModule.java:99)
    at org.basex.query.QueryContext.iter(QueryContext.java:333)
    at org.basex.query.QueryProcessor.iter(QueryProcessor.java:90)
    at org.basex.core.cmd.AQuery.query(AQuery.java:100)
    at org.basex.core.cmd.XQuery.run(XQuery.java:22)
    at org.basex.core.Command.run(Command.java:257)
    at org.basex.core.Command.execute(Command.java:93)
    ... 8 more
From: Christian Grün christian.gruen@gmail.com Sent: Wednesday, 8 August 2018 19:16 To: BIRKNER Michael Cc: BaseX Subject: Re: [basex-talk] BaseX insert/delete node performance
Michael,
Welcome to the list.
One thing you could try immediately is to call OPTIMIZE (possibly followed by the ALL flag) or db:optimize(..., true()) and see if performance improves. Obviously, this doesn't make sense after each single update operation, but it could be called before a larger number of updates is performed.
> The problem is that in my case, I have to do about 150000 inserts and deletes, so it would take too much time.
If you define all the insert expressions (or a larger number than just 1 or 10) in a single XQuery expression (via a FLWOR expression), you will benefit from various bulk optimizations. Did you try that already?
Best, Christian
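As a rough illustration of the single-FLWOR suggestion, the following Java sketch builds one XQuery string that inserts many nodes in a single expression. The class and method names (BulkInsertQuery, literal, build) are hypothetical, the element names and the path /path/to/parent/node are the placeholders used elsewhere in this thread, and the resulting string would still have to be executed with the BaseX API, which is not shown here.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of BaseX): builds a single FLWOR
// expression that performs one insert per value.
final class BulkInsertQuery {

    // Quotes a Java string as an XQuery string literal:
    // '&' must be written as an entity and the quote character is doubled.
    static String literal(String s) {
        return "'" + s.replace("&", "&amp;").replace("'", "''") + "'";
    }

    // Builds the query; the values are passed as string literals and
    // placed into the new element via an enclosed expression.
    static String build(String db, List<String> values) {
        String seq = values.stream()
            .map(BulkInsertQuery::literal)
            .collect(Collectors.joining(", "));
        return "let $parent := db:open(" + literal(db) + ")/path/to/parent/node\n"
             + "for $value in (" + seq + ")\n"
             + "return insert node <childNode><subNode>{ $value }</subNode></childNode>"
             + " into $parent";
    }

    public static void main(String[] args) {
        System.out.println(build("DB_Name", List.of("Example 1", "Example 2")));
    }
}
```

Because the values enter the query as string literals rather than as raw XML text, characters such as curly braces inside a value are not parsed as XQuery code in this variant.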
Hi,
the error is thrown after the first iteration of the aforementioned commands. As soon as the insert command for the next 8000 inserts is executed, the error occurs. So when this XQuery is executed the second time:
let $parent := db:open('DB_Name')/path/to/parent/node return insert node <childNode><subNode>Example 1</subNode><subNode>Example 2</subNode></childNode> into $parent
... the error is thrown. For your information: I put the above-mentioned command into a Java String and execute it like this: new XQuery(INSERT_COMMAND_STRING).execute(context);
Best regards, Michael
Mag. Michael Birkner AK Wien - Bibliothek 1040, Prinz Eugen Straße 20-22 T: +43 1 501 65 12455 F: +43 1 501 65 142455
michael.birkner@akwien.at http://wien.arbeiterkammer.at/
Hi Michael,
> So when this XQuery is executed the second time:
> let $parent := db:open('DB_Name')/path/to/parent/node return insert node <childNode><subNode>Example 1</subNode><subNode>Example 2</subNode></childNode> into $parent
> ... the error is thrown.
In order to understand what might go wrong, I would once again be thankful if you could provide us with reproducible steps, or a self-contained example.
I tried to simulate the repeated insertion in Java, but the attached Java code does not raise any errors. In the error message you attached,
org.basex.core.BaseXException: Stopped at ., 1/5847730: [XPDY0002] TM: no context value bound.
the query parser indicates that you want to address a TM step in your query – which does not exist if no database is opened, or if no item is bound to the current context. This is why I asked for the query that triggered the error (it’s difficult to judge what goes wrong without having more insight). If your query contains confidential information, you could open it in the BaseX GUI and jump to the referenced error position (line 1, column 5847730).
Cheers Christian
Hi Christian,
thank you again for your reply. I will try to create a complete example and come back to you with it in a few days (I am about to travel, so I have to find some time for it). Maybe I'll find the error myself while creating the example. In that case, I will report that as well.
Best regards,
Michael
Hi again,
I finally found the problem: the error "TM: no context value bound" occurred because the XML text of my insert command contained a string like "...{TM}..." (with curly braces). Inside an XQuery element constructor, curly braces enclose an expression, so TM was evaluated as a path step instead of being treated as text. Christian already addressed this problem in a StackOverflow post (https://stackoverflow.com/a/48887497/792962). After removing the braces, everything went fine.
Best regards,
Michael
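For readers who need to keep the braces instead of removing them, one alternative is to escape them before building the query: in XQuery element constructors, a literal brace is written by doubling it. The sketch below is an assumption-laden illustration, not the fix used in this thread. The class and method names (XQueryBraces, escape, insertQuery) are hypothetical, and the text is assumed to be well-formed XML content already (so '<' and '&' are not handled here).

```java
// Hypothetical helper (not part of BaseX): escapes curly braces so that
// text pasted into an XQuery element constructor is not parsed as code.
final class XQueryBraces {

    // Doubles '{' and '}', the XQuery escape for literal braces inside
    // direct element constructors. Input that is already escaped would
    // be escaped a second time, so apply this exactly once.
    static String escape(String text) {
        return text.replace("{", "{{").replace("}", "}}");
    }

    // Builds an insert expression in the shape used in this thread;
    // element names, database name and path are placeholders.
    static String insertQuery(String subNodeText) {
        return "insert node <childNode><subNode>" + escape(subNodeText)
             + "</subNode></childNode> into db:open('DB_Name')/path/to/parent/node";
    }

    public static void main(String[] args) {
        System.out.println(insertQuery("...{TM}..."));
    }
}
```

Binding the text as an external variable or passing it as a string literal would avoid the problem as well, because the value is then never parsed as part of the query text.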