Hello,

First, some context: I have a never-ending stream of XML elements coming in that I want to store and then make available through a REST interface.
BaseX therefore seems to be a well-suited candidate. To be on the safe side, I must be able to sustain an insertion rate of about 200 elements per second.

The XML elements I have to store are of the type:

<notification ts="2015-03-13T10.44.25.123" nid="type-of-data">
    <name-1>value1</name-1>
    <name-2>value2</name-2>
    <name-3>value3</name-3>
    <name-4>value4</name-4>
    ...
</notification>


So the elements are quite simple and small.
I will mainly retrieve data by selecting notifications with a specific @nid between two @ts values, so I need an attribute index.
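
To make the retrieval pattern concrete, here is a minimal sketch of the kind of XQuery string such a lookup implies. It assumes the @ts values are compared as plain strings, which only orders correctly because the timestamp format is fixed-width. NotificationQuery and its methods are hypothetical names for illustration, not part of BaseX, and the class only builds the query text.

```java
// Hypothetical helper (not a BaseX class): builds the XQuery text only.
class NotificationQuery {

    // Wraps a value in a single-quoted XQuery string literal,
    // escaping '&' and doubling embedded apostrophes.
    static String quote(String value) {
        return "'" + value.replace("&", "&amp;").replace("'", "''") + "'";
    }

    // Selects notifications of one @nid whose @ts lies in [fromTs, toTs].
    // Assumption: timestamps are compared as strings.
    static String between(String nid, String fromTs, String toTs) {
        return "//notification[@nid = " + quote(nid) + "]"
             + "[@ts >= " + quote(fromTs) + " and @ts <= " + quote(toTs) + "]";
    }

    public static void main(String[] args) {
        System.out.println(between("type-of-data",
                "2015-03-13T10.00.00.000", "2015-03-13T11.00.00.000"));
    }
}
```

The resulting string could then be passed to whatever query mechanism is in use (embedded or REST).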

For now I am using an embedded BaseX DB to test the insertion of elements.

Here is how I configure my DB:

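// Embedded context; the options set here apply to the database created below.
Context m_Context = new Context();
// Do not flush to disk after every update; Flush is called manually.
new Set(MainOptions.AUTOFLUSH, false).execute(m_Context);
new Set(MainOptions.ADDCACHE, false).execute(m_Context);
// Use the internal XML parser and strip namespaces.
new Set(MainOptions.INTPARSE, true).execute(m_Context);
new Set(MainOptions.STRIPNS, true).execute(m_Context);
// Update the value indexes incrementally on each insertion.
new Set(MainOptions.UPDINDEX, true).execute(m_Context);
// Only the attribute index is needed (queries on @nid and @ts).
new Set(MainOptions.TEXTINDEX, false).execute(m_Context);
new Set(MainOptions.ATTRINDEX, true).execute(m_Context);
new CreateDB(_SourceId).execute(m_Context);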


And this is how I insert the elements:

try {
    String l_XmlRepresentation = _Notification.getXmlRepresentation();
    if (l_XmlRepresentation.isEmpty()) {
        return;
    }
    ByteArrayInputStream l_InputStream = new ByteArrayInputStream(l_XmlRepresentation.getBytes(m_Charset));
    Add add = new Add(_Notification.getSourceId());
    add.setInput(l_InputStream);
    add.execute(m_Context);
    if (_CurrentNotification % 10000 == 0) { // flush every 10000 notifications
        new Flush().execute(m_Context);
    }
}
catch (BaseXException ex) {
    s_Logger.log(Level.SEVERE, null, ex);
}



The performance I get is as follows:

Size 10'000, Speed: 1'292
Size 20'000, Speed: 625
Size 30'000, Speed: 361
Size 40'000, Speed: 248
Size 50'000, Speed: 184
Size 60'000, Speed: 148
Size 70'000, Speed: 123
Size 80'000, Speed: 104
Size 90'000, Speed: 91
Size 100'000, Speed: 77
Size 110'000, Speed: 69
Size 120'000, Speed: 61
Size 130'000, Speed: 56
Size 140'000, Speed: 46

Where “Size” is the number of elements in the collection and “Speed” is the average insertion speed (in elements per second) over the last 10000 elements.
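
As an illustration of how such a per-batch average can be computed, here is a small sketch. RateMeter is a hypothetical name, and this shows the measurement idea only; it is not necessarily the exact code behind the numbers above.

```java
// Hypothetical helper: reports the average insertion rate per batch.
class RateMeter {
    private final int batchSize;
    private long batchStartNanos;
    private long count;

    RateMeter(int batchSize, long startNanos) {
        this.batchSize = batchSize;
        this.batchStartNanos = startNanos;
    }

    // Call once per inserted element with the current time in nanoseconds.
    // Returns elements per second when a batch completes, otherwise -1.
    double record(long nowNanos) {
        count++;
        if (count % batchSize != 0) {
            return -1;
        }
        double seconds = (nowNanos - batchStartNanos) / 1e9;
        batchStartNanos = nowNanos; // the next batch starts now
        return batchSize / seconds;
    }

    public static void main(String[] args) {
        RateMeter meter = new RateMeter(10000, System.nanoTime());
        for (int i = 0; i < 30000; i++) {
            // The insertion itself would happen here.
            double rate = meter.record(System.nanoTime());
            if (rate >= 0) {
                System.out.println("Size " + (i + 1) + ", Speed: " + rate);
            }
        }
    }
}
```

Read this way, a speed of 1'292 corresponds to roughly 7.7 seconds for a batch of 10000 elements, and a speed of 46 to roughly 217 seconds for the same batch size.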

My question is: does this performance seem normal, or am I doing something wrong? For comparison, with UPDINDEX = false I get a steady insertion rate of 10000 elements per second.

Thanks a lot

Simon