Hi Dave,
On 25.08.2010 17:43, Dave Glick wrote:
> Is it possible to return the absolute text offset in characters (either from the start of the document or the start of the result node) for each match? Along with that is it possible to return every match, even if it results in returning the same node more than once (if it has more than one occurrence for example)?
Well, I don't think there's an *official* way to get the full-text positions out of BaseX so far. They are only used during query processing and in the GUI, for highlighting the matches.
But if I got your previous mails right, you know your way around the BaseX codebase pretty well, so here's the Cheater's Guide:
The GUI gets the positions by setting the hidden static property org.basex.core.Prop.gui to true. That lets the FTPosData object be propagated to the resulting Nodes. It's accessible via Nodes.ftpos. The interface isn't as nice as it could be, but you get everything you asked for.
Please note that serializing the result will yield control characters used for highlighting in the GUI. This can be avoided by discarding the full-text position as in
    final Nodes res = qp.queryNodes();
    final Nodes copy = new Nodes(res.nodes, res.data);
and serializing the copy after that.
As these are internals of BaseX, the solution described above may stop working at any time in the future. When we find the time we will implement a simpler interface for this, but we're not really short of ToDos...
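If the internals do change, one library-independent fallback is to compute the offsets yourself from the string value of each result node. The following is only a minimal sketch in plain Java and does not use any BaseX API: the class MatchOffsets and its method find are made-up names for illustration, and it does a simple case-insensitive substring scan, so it ignores the tokenization, stemming and other match options that XQuery Full Text applies. It returns one offset per occurrence, so a node with several matches yields several entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of BaseX: plain substring scan over a node's text.
public class MatchOffsets {

    /**
     * Returns the start offset (in chars, relative to the start of text)
     * of every non-overlapping, case-insensitive occurrence of term.
     */
    public static List<Integer> find(String text, String term) {
        List<Integer> offsets = new ArrayList<>();
        if (term.isEmpty()) {
            return offsets; // an empty term would match everywhere; report nothing
        }
        int i = 0;
        while (i + term.length() <= text.length()) {
            // regionMatches compares in place, so offsets refer to the original text
            if (text.regionMatches(true, i, term, 0, term.length())) {
                offsets.add(i);
                i += term.length(); // continue after this occurrence
            } else {
                i++;
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Assumed input: the string value of one result node.
        String nodeText = "XQuery Full Text is cool; full text search rocks.";
        System.out.println(find(nodeText, "full text"));
    }
}
```

To get offsets from the start of the document instead, you would add the node's own start offset to each value; how to obtain that depends on how you serialize the document, so it is left out here.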
I hope this helps in some way.
> Even if XQuery Full Text doesn't work for this particular need, it's still very cool and I really like the implementation and look forward to finding other uses.
That's nice to hear!
Cheers,
Leo