Hi,
I have a problem with ft:mark() that seems to be a bug. Given the following query:
for $sentence in //sentence
where $sentence[text() contains text "biological"]
return ft:mark($sentence[text() contains text "biological"], 'b')
And the following document structure:
<article>
  <articleInfo>
    <date>27-06-14</date>
    <author>Author</author>
  </articleInfo>
  <section id="1">
    <title id="1.1">Title Section here</title>
    <paragraph id="1.1.1">
      <sentence id="1.1.1.1.1">Induction of NF-KB during monocyte differentiation by HIV type 1 infection.</sentence>
      <sentence id="1.1.1.1.2">Electrophoretic mobility shift assays and Southwestern blotting experiments were used to detect the binding of cellular transactivation factor NF-KB to the double repeat-KB enhancer sequence located in the long terminal repeat.</sentence>
    </paragraph>
    ...
In Java, if I use the methods query(String query), process(String query) or serialize(String query) from https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/or..., I get the desired results (i.e. the term “biological” is highlighted):
<sentence id="445">A novel fluorescent silica tracer for <b>biological</b> silicification studies</sentence>
<sentence id="319125">Cell <b>biological</b> basis of biomineralization</sentence>
...
But when I try to access the Java object, using the method iterate(String query), the highlight is missing:
QueryProcessor proc = new QueryProcessor(query, context);
Iter iter = proc.iter();
ArrayList<BXElem> results = new ArrayList<BXElem>();
for(Item item; (item = iter.next()) != null;)
  results.add((BXElem) item.toJava());
proc.close();

// print results
for(BXElem elem: results)
  System.out.println(elem.getTextContent());
Output:
A novel fluorescent silica tracer for biological silicification studies
Cell biological basis of biomineralization
...
Is this a known bug, or am I doing something wrong?
Thank you in advance.
Javier
Hi Javier,
> System.out.println(elem.getTextContent());
This method [1] only returns the text content (or, in XQuery terminology, the "string value") of an element, so all newly created marker elements are stripped. To keep them, you will have to serialize your element, e.g. with a serializer obtained from DOMImplementationLS.createLSSerializer().
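To illustrate the difference, here is a minimal sketch that uses only the JDK's built-in DOM parser, not BaseX. The class name MarkupDemo and its helper methods are made up for this example, and whether the DOM objects returned by BaseX support the "LS" feature in the same way is an assumption that has not been checked here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical demo class: compares getTextContent() with DOM serialization.
class MarkupDemo {

  /** Parses an XML string with the JDK parser and returns its root element. */
  static Element parse(String xml) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    return doc.getDocumentElement();
  }

  /** Serializes an element including its child elements (e.g. <b> markers). */
  static String serialize(Element elem) {
    // Ask the DOM implementation for its Load/Save (LS) feature.
    DOMImplementationLS ls = (DOMImplementationLS)
        elem.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer serializer = ls.createLSSerializer();
    // Suppress the leading <?xml ...?> declaration.
    serializer.getDomConfig().setParameter("xml-declaration", false);
    return serializer.writeToString(elem);
  }

  public static void main(String[] args) throws Exception {
    Element elem = parse("<sentence id=\"319125\">Cell <b>biological</b> "
        + "basis of biomineralization</sentence>");
    // String value only: the <b> element is dropped, its text is kept.
    System.out.println(elem.getTextContent());
    // Full serialization: the <b> element is preserved.
    System.out.println(serialize(elem));
  }
}
```

The first println shows the plain text without markers, the second one the serialized element with the <b> element intact.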
Avoiding the DOM conversion gives you much better performance. This is one way to do it:
QueryProcessor proc = new QueryProcessor(query, context);
// loop through all results
for(Item item : proc.value()) {
  // check if result is a node
  if(item instanceof ANode) {
    // print children of node (elements, texts, etc.)
    for(final ANode child : ((ANode) item).children()) {
      System.out.print(child);
    }
    System.out.println();
  }
}
proc.close();
Hope this helps, Christian
[1] http://docs.oracle.com/javase/6/docs/api/org/w3c/dom/Node.html#getTextConten...
basex-talk@mailman.uni-konstanz.de