Hi Zach,
you may be disappointed, but I would also have proposed the use of the
SAXSerializer, and nothing else. To get even better performance, I
would suggest doing some profiling (e.g. using -Xrunhprof:cpu=samples on
the command line) to see which component is the current bottleneck. I
invite you to send the results to the list.
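
If the profiler output is hard to interpret, a coarser alternative is to time each stage by hand. The following is only a minimal sketch: StageTimer is a hypothetical helper, not part of BaseX or JAXB, and the two lambdas in main are placeholders for the real SAXSerializer and um.unmarshal calls.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper: accumulates wall-clock time per named stage. */
public class StageTimer {
    private final Map<String, Long> nanos = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the stage total, returns its result. */
    public <T> T time(String stage, Callable<T> work) {
        long start = System.nanoTime();
        try {
            return work.call();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Wrap checked exceptions (e.g. from unmarshalling) so callers stay simple.
            throw new IllegalStateException("stage '" + stage + "' failed", e);
        } finally {
            nanos.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a stage, or 0 if it never ran. */
    public long totalNanos(String stage) {
        return nanos.getOrDefault(stage, 0L);
    }

    /** Read-only view of all stage totals, in first-use order. */
    public Map<String, Long> totals() {
        return Collections.unmodifiableMap(nanos);
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        long checksum = 0;
        for (int i = 0; i < 1000; i++) {
            final int n = i;
            // Placeholder for the serialization step of one item.
            String xml = timer.time("serialize", () -> "<Element1 id=\"" + n + "\"/>");
            // Placeholder for the unmarshalling step of the same item.
            int length = timer.time("unmarshal", () -> xml.length());
            checksum += length;
        }
        System.out.println("checksum: " + checksum);
        timer.totals().forEach((stage, total) ->
            System.out.println(stage + ": " + (total / 1_000_000.0) + " ms"));
    }
}
```

Wrapping the query iteration, the serializer setup and the unmarshal call in separate time(...) calls should show which of the three dominates. A sampling profiler gives a finer picture, so treat these numbers as a first hint only.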
Thanks,
Christian
On Sat, Jun 7, 2014 at 5:41 AM, Zachary DeLuca <zadeluca@gmail.com> wrote:
> Okay, I found another approach which is about 30% faster than what was
> written at the end of my last email:
>
> QueryProcessor proc = new QueryProcessor(query, context);
> Iter iter = proc.iter();
> proc.close();
>
> if (iter != null) {
>     ArrayList<Element1> elements = new ArrayList<>();
>     SAXSerializer ser = null;
>     SAXSource source = new SAXSource(ser, null);
>     for(Item item; (item = iter.next()) != null;) {
>         ser = new SAXSerializer(item);
>         source.setXMLReader(ser);
>         elements.add(((Element1) um.unmarshal(source)));
>     }
>     // do something with the elements
> }
>
>
> Can it be done even better than this?
>
> Thanks,
> Zach
>
>
> On Fri, Jun 6, 2014 at 1:38 PM, Zachary DeLuca <zadeluca@gmail.com> wrote:
>>
>> Hello, I have a question about unmarshaling a query result into JAXB
>> object(s). My database collection contains documents that all contain the
>> same type of root element, let's call it Element1.
>>
>> Originally I was doing something like this (I know it's silly but it
>> worked just fine for small data sets):
>>
>>
>> String CLOSING_TAG = "</Element1>";
>> Context context = ...
>> Unmarshaller um = ...
>>
>> String predicate1 = ...
>> String predicate2 = ...
>> String query = "//Element1[" + predicate1 + " and " + predicate2 + "]";
>>
>> String result = new XQuery(query).execute(context);
>> if (result != null && !result.isEmpty()) {
>>     ArrayList<Element1> elements = new ArrayList<>();
>>     int index = -1;
>>     int beginIndex = 0;
>>     while ((index = result.indexOf(CLOSING_TAG, beginIndex)) != -1) {
>>         int endIndex = index + CLOSING_TAG.length();
>>         String element1 = result.substring(beginIndex, endIndex);
>>         beginIndex = endIndex + 1;
>>         elements.add(((Element1) um.unmarshal(new
>>             ByteArrayInputStream(element1.getBytes()))));
>>     }
>>     // do something with the elements
>> }
>>
>>
>> But when I got into larger data sets (my DB collection is currently
>> approx. 1GB total size and has just over 20k documents) this started to fail,
>> apparently due to trying to convert the entire result to a string at one
>> time (out of memory error, despite setting -Xmx16384m). So I dug through the
>> examples to find a better way and changed my code to this:
>>
>>
>> ByteArrayOutputStream baos = new ByteArrayOutputStream();
>> Iter iter = null;
>> Serializer ser = null;
>> QueryProcessor proc = new QueryProcessor(query, context);
>> iter = proc.iter();
>> ser = proc.getSerializer(baos);
>> proc.close();
>>
>> if (iter != null && ser != null) {
>>     ArrayList<Element1> elements = new ArrayList<>();
>>     for(Item item; (item = iter.next()) != null;) {
>>         baos.reset();
>>         ser.serialize(item);
>>         elements.add(((Element1) um.unmarshal(new
>>             ByteArrayInputStream(baos.toByteArray()))));
>>     }
>>     ser.close();
>>     // do something with the elements
>> }
>>
>>
>> This appears to work fine but I am just wondering if there is a
>> better/faster way to do it? At first glance, serializing to a
>> ByteArrayOutputStream only to then turn around and use a ByteArrayInputStream
>> to unmarshal with JAXB seems wasteful.
>>
>>
>> Thanks for taking the time to read,
>> Zach
>
>