I had a similar requirement a while back and ended up solving it in two ways:

For the first attempt, I wrote a routine that takes a node and scans all text nodes along the preceding axis, adding the length of each text node to a counter. I wanted a character offset, but counting newlines shouldn't be much different. I used the API, but I don't think it would be too tough to write an XQuery expression or function that does the same thing. Alternatively, you could use BaseX's ability to call Java code from XQuery and write the routine in Java.
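
A minimal sketch of that first approach, assuming Python's standard-library xml.dom.minidom as a stand-in for the BaseX API (the function names are hypothetical). It walks the document from the start and stops at the target, which visits the same text nodes as the preceding axis:

```python
from xml.dom import minidom


def iter_document_order(node):
    """Yield node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from iter_document_order(child)


def text_offset(target):
    """Sum the lengths of all text nodes that come before target."""
    offset = 0
    for node in iter_document_order(target.ownerDocument):
        if node is target:
            return offset
        if node.nodeType == node.TEXT_NODE:
            # For line counts, add node.data.count("\n") here instead.
            offset += len(node.data)
    raise ValueError("target is not part of its owner document")


doc = minidom.parseString("<p>Hello <b>big</b> world</p>")
bold = doc.getElementsByTagName("b")[0]
print(text_offset(bold))  # 6, the length of "Hello "
```

Note that this measures position within the text content only; markup, attributes and CDATA sections are not counted.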

This worked okay in isolated cases, but I needed the info very frequently (I was writing a text editor backed by XML, so I was constantly getting offsets for the caret), and scanning the preceding axis each time took too long. I ended up walking every node of a new document in sequence and caching the information. The idea was the same, though: every time I hit a text node, I incremented a counter and used its value for the following nodes (until I hit another text node).
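
The caching pass could look roughly like this, again sketched in Python with xml.dom.minidom rather than the BaseX API; build_offset_cache is a made-up name, and the cache would presumably need rebuilding whenever the document is edited:

```python
from xml.dom import minidom


def build_offset_cache(document):
    """Visit every node once and record the text offset at which it starts."""
    cache = {}
    counter = 0
    stack = [document]
    while stack:
        node = stack.pop()
        cache[node] = counter  # value of the counter when this node is reached
        if node.nodeType == node.TEXT_NODE:
            counter += len(node.data)  # following nodes start after this text
        # Push children in reverse so they are popped in document order.
        stack.extend(reversed(node.childNodes))
    return cache


doc = minidom.parseString("<p>Hello <b>big</b> world</p>")
cache = build_offset_cache(doc)
bold = doc.getElementsByTagName("b")[0]
print(cache[bold])  # 6, looked up without rescanning the document
```

After the single pass, each lookup is a dictionary access instead of a scan over all preceding text nodes.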

Not sure if that helps - hopefully it does...

Dave


-----Original message-----
From: "Christian Grün" <christian.gruen@gmail.com>
To: Anders Hessellund <anders.hessellund@gmail.com>
Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
Sent: Sat, Jan 14, 2012 11:38:06 GMT+00:00
Subject: Re: [basex-talk] line number and column

Dear Anders,

maybe there's another way to achieve what you want; what exactly do
you need line and column information for?

Christian
___________________________

On Sat, Jan 14, 2012 at 12:21 PM, Anders Hessellund
<anders.hessellund@gmail.com> wrote:
> Thanks for your replies,
>
> the line number and column info is pretty important for us. Is there perhaps
> some part of the parser code that I could hack/fork?
>
> -- Anders
>
>
> On Thu, Jan 12, 2012 at 11:59 AM, Christian Grün <christian.gruen@gmail.com>
> wrote:
>>
>> Dear Anders,
>>
>> I'm sorry: information on the original line and columns gets lost as
>> soon as a database, or a main memory instance of a document, is
>> created. If you look for ways to address the same XML nodes multiple
>> times, you may want to have a look at fn:path(); it returns an
>> XPath expression to the addressed nodes [1]. A simple example:
>>
>>  for $item in doc('http://files.basex.org/xml/xmark.xml')//item
>>  return path($item)
>>
>> If you have created a database from your files, you can also use the
>> BaseX-specific db:node-pre() and db:open-pre() functions to directly
>> address nodes [2].
>>
>> Hope this helps,
>> Christian
>>
>> [1] http://www.w3.org/TR/xpath-functions-30/#func-path
>> [2] http://docs.basex.org/wiki/Database_Module
>> ___________________________
>>
>> On Wed, Jan 11, 2012 at 11:28 PM, Anders Hessellund
>> <anders.hessellund@gmail.com> wrote:
>> > Hi,
>> >
>> > we are experimenting with BaseX as a database for a large set of local XML
>> > files (3000-4000). When we query for individual elements and attributes in
>> > this portfolio of XML documents, is there any way to get location
>> > information about concrete elements and attributes? Specifically, we need to
>> > know filename, line number and column of element (start tags) and
>> > attributes. Is this possible? And if so, how?
>> >
>> > Thanks,
>> > Anders
>
>
>
> _______________________________________________
> BaseX-Talk mailing list
> BaseX-Talk@mailman.uni-konstanz.de
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>