Greetings,
First of all, thank you very much for BaseX. It has made many of my assignments this semester doable, more enjoyable, and better. Using it, I could demonstrate database features at scale where only toy minimal examples were required.
The source of most of my data has been NIST's National Vulnerability Database https://nvd.nist.gov/vuln/data-feeds and the MITRE-curated Common Weakness Enumeration lists https://cwe.mitre.org/data/downloads.html. Using BaseX, I found some errors in the CVE database and reported them to NIST; they have since been fixed. I'm sure there are many more fixes and enhancements possible with the CVE database.
I almost exclusively use BaseX/GUI (version 9.0.2, then 9.1, and now 9.1.1) on Fedora Linux. I do have some issues using BaseX/GUI and I am hoping that some improvements can be made.
Eventually, BaseX/GUI uses all of its allocated memory. Even after increasing the max memory to 3.5GB, eventually it is all used and BaseX/GUI essentially freezes. Any operation that freezes executes quickly and completely after quitting/killing BaseX/GUI and then restarting. It seems that some memory just never gets freed as I develop different XQuery routines in the Editor, run them, save them to files, click in various places on the Map Visualization, run XQuery on the Input Bar, etc. Sometimes closing the database frees the memory; mostly it doesn't. Once, I think, memory was freed when I saved a file in the Editor. The Java error messages all seem to relate to running out of memory. Hitting the "GC" button never seems to help. I don't have a specific sequence of actions that eventually consumes all the memory.
An example database that will demonstrate this memory consumption is composed of NVD-CVE-1.0-2018 https://nvd.nist.gov/feeds/json/cve/1.0/nvdcve-1.0-2018.json.zip and CWE Comprehensive View https://cwe.mitre.org/data/xml/views/2000.xml.zip. The database takes about 109 MB and has about 5.25 million nodes. Just viewing the Visualization Map, clicking around, and running/editing queries like this will eventually use all the memory.
let $cwe := distinct-values(
  for $c in //cve//problemtype__data//value
  return tokenize($c, "-")[last()]
)
for $c in $cwe
where empty(//Weakness_Catalog[contains(@Name, "CWE-2000")]//Weakness[@ID = $c])
  and empty(//Weakness_Catalog[contains(@Name, "CWE-2000")]//Related_Weakness[@CWE_ID = $c])
order by number($c)
return $c
`java -version` reports:
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
I will gladly provide any additional info that may help to diagnose these symptoms, etc.
Thanks and best regards. RG
Hi Rick,
Thanks for your observations.
I restricted my main memory to 2 GB and I played around with your sample data (on Windows). My memory consumption never exceeded 200 MB (and after closing everything, it went back to approx. 30 MB). Maybe there is a single operation that I missed?
If the limits for query results have been increased in the GUI Preferences (in the "Result" tab), memory consumption might rise as well. If you have not changed the defaults, you could help us by…
• opening the "Used Memory" dialog (I think you have done this already, right?) and
• clicking the "GC" after each single action you perform.
If the "Used Memory" value rises a lot after a specific action even after garbage collection, and if it doesn’t decrease after closing the database, your editor tabs, visualizations, etc., then you may have isolated the operation that leads to the observed memory leak. If you believe that the visualizations might affect memory consumption, you can close them while a database is open and then restart BaseX (visualizations won’t be computed if they are not displayed). Feel free to provide us with a list of the actions and the values for the observed memory consumption.
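The measure-after-GC procedure described above can be approximated outside the GUI with a minimal Java sketch. It is an illustration only: the class name UsedMemoryProbe and the 32 MB test allocation are made up for this example, and it assumes (without confirmation from the BaseX source) that a "used memory" figure roughly corresponds to total heap minus free heap. System.gc() is only a request to the JVM, so the numbers can vary between runs.

```java
import java.util.ArrayList;
import java.util.List;

public class UsedMemoryProbe {

    /** Converts a byte count to whole megabytes. */
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Requests a garbage collection, then returns the used heap in bytes. */
    static long usedAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint only; the JVM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long baseline = usedAfterGc();

        // Hypothetical "action": retain 32 blocks of 1 MB each.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 32; i++) {
            retained.add(new byte[1 << 20]);
        }
        long whileRetained = usedAfterGc();

        // Drop the references; a later GC is then free to reclaim the blocks.
        retained.clear();
        long afterRelease = usedAfterGc();

        System.out.println("baseline:       " + toMb(baseline) + " MB");
        System.out.println("while retained: " + toMb(whileRetained) + " MB");
        System.out.println("after release:  " + toMb(afterRelease) + " MB");
    }
}
```

If the last value stays close to the middle one in a real application, something is still holding a reference to the data, which is the pattern to look for when recording values after each GUI action.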
Best, Christian
Hi Rick,
While investing some more time in profiling, we found one memory leak that is triggered by a) creating a database and b) adding additional documents via the Database Manage dialog in a second step. In Java, a strange decision was taken that top-level Swing containers (such as our progress bar) won’t be garbage collected, even after they have been disposed [1].
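The general effect (an object that is logically discarded but still referenced from somewhere long-lived cannot be collected) can be shown with a minimal Java sketch. This is not the Swing or BaseX code: REGISTRY is a hypothetical stand-in for whatever internal structure keeps the reference, and a plain byte array stands in for the dialog so that the example also runs without a display.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class RetentionCheck {

    // Hypothetical long-lived list that still references a "disposed" object.
    static final List<Object> REGISTRY = new ArrayList<>();

    /** Requests a GC and reports whether the referent is still reachable. */
    static boolean stillReachable(WeakReference<?> ref) {
        System.gc(); // a hint only; the JVM may ignore it
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object dialog = new byte[8 << 20]; // stand-in for a heavy window
        REGISTRY.add(dialog);
        WeakReference<Object> ref = new WeakReference<>(dialog);
        dialog = null; // the application has "forgotten" the object

        // Still strongly referenced by REGISTRY, so it cannot be collected.
        System.out.println("registered, reachable:   " + stillReachable(ref));

        // Once the last strong reference is gone, the GC may reclaim it.
        // This usually prints false, but that is not guaranteed.
        REGISTRY.clear();
        System.out.println("unregistered, reachable: " + stillReachable(ref));
    }
}
```

A weak reference does not keep its referent alive, which makes it a convenient probe: as long as it still returns the object after a GC request, some strong reference exists elsewhere.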
I guess this is not a very serious leak in BaseX (it has never been reported in the past), but I have added a quick fix to tackle the most obvious case, and I’ll be interested in hearing whether this already reduces memory usage in your experiments. A new snapshot is available [2].
Best, Christian
[1] https://github.com/BaseXdb/basex/issues/1650
[2] http://files.basex.org/releases/latest/
On Mon, Dec 17, 2018 at 12:49 PM Christian Grün christian.gruen@gmail.com wrote:
[...]
Hi Christian,
I tried the snapshot release and it seems to be much better about releasing memory.
I still needed to increase "-Xmx" so that nvdcve-1.0-2018.json.zip could be loaded (all default settings, I think), but after I did, everything seemed to work fine.
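Whether a changed "-Xmx" value actually reached the JVM can be checked with a minimal Java sketch like the following. HeapLimit is a hypothetical name, and how the BaseX start script passes the option is not shown in this thread, so the command line in the note below is only an example.

```java
public class HeapLimit {

    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: "
                + maxHeapMegabytes() + " MB");
    }
}
```

After compiling with javac HeapLimit.java, running java -Xmx3500m HeapLimit should report a value near 3500 MB; the reported figure may differ somewhat from the requested one, depending on the JVM and garbage collector.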
I'll continue to exercise it.
Thanks!
Best regards, RG
On Mon, Dec 17, 2018 at 6:39 PM Christian Grün christian.gruen@gmail.com wrote:
[...]
basex-talk@mailman.uni-konstanz.de