regex for xquery

Rob Stapper

8 Oct 2014

7:37 a.m.

Hi,

Basically I want to break down the path-expression of a node in my XML into its elementary steps.

QName: "Q{http://www.w3.org/2005/xpath-functions}root()/Q{}A[1]/Q{}B[1]/Q{}C[1]/Q{}D[1]" must become QName-sequence: ("Q{http://www.w3.org/2005/xpath-functions}root()", "Q{}A[1]", "Q{}B[1]", "Q{}C[1]", "Q{}D[1]")

So I need to split the path-expression on the "/" characters that aren't part of the namespace URIs.

I'm using the "fn:tokenize" function for this. The function uses a regular expression as its separator.

Unfortunately my regex experience is zero (well, two days by now). I came up with the following brilliant ;-) solution: fn:tokenize( path( $node), "/(?=Q)").

Again, unfortunately, this doesn't work because the XQuery flavor of regex doesn't support lookaround.

Sure, I can use "/Q" as pattern and subsequently stick a "Q" in front of every substring (except the first one) but I'm looking for a solid solution.
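
For reference, a minimal sketch of that "/Q" workaround. The function name local:split-steps is made up for illustration, and the sample path is the one from this message. It is not a solid solution: it assumes no namespace URI contains "/Q" and that the path does not start with "/".

```xquery
(: local:split-steps is a hypothetical helper name, not a BaseX function. :)
declare function local:split-steps($path as xs:string) as xs:string* {
  for $token at $pos in tokenize($path, "/Q")
  (: the separator "/Q" swallows the "Q" of every step but the first,
     so it is put back in front of all later tokens :)
  return if ($pos = 1) then $token else concat("Q", $token)
};

local:split-steps("Q{http://www.w3.org/2005/xpath-functions}root()/Q{}A[1]/Q{}B[1]/Q{}C[1]/Q{}D[1]")
```

In practice the argument would be path($node) rather than a literal string.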

Is there anyone out there with enough "regex-for-xquery" experience to help me with this?

Big thanx for that in advance.

@Christian: On the other hand I can imagine that within BaseX the path-expression is built from these elementary steps.

If so, is it in any way possible to retrieve this sequence of elementary steps instead of the path-expression?

Thanx,

Rob Stapper

--- This email is free of viruses and malware because avast! Antivirus protection is active. http://www.avast.com

Attachments:

attachment.html (text/html — 4.8 KB)
demo.xq (application/octet-stream — 216 bytes)

Marco Lettere

8 Oct 2014

8:10 a.m.

Hi Rob, I don't know whether your prefix is constant.

If it is, then

    let $prefix := "Q{http://www.w3.org/2005/xpath-functions}root()"
    let $inpstr := "Q{http://www.w3.org/2005/xpath-functions}root()/Q{}A[1]/Q{}B[1]/Q{}C[1]/Q{}D[1]"
    (: strip the prefix together with its trailing "/" so tokenize() yields no empty first token :)
    return ($prefix, tokenize(substring-after($inpstr, concat($prefix, "/")), "/"))

should do the job. M.

Christian Grün

8:17 a.m.

Hi Rob,

here's yet another solution:

    let $path := "Q{http://www.w3.org/2005/xpath-functions}root()/Q{}A[1]/Q{}B[1]/Q{}C[1]/Q{}D[1]"
    return analyze-string($path, '(Q{.*?}.*?)(/|$)')/fn:match/fn:group[1]/string()
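
The same idea can be sketched as a reusable function. The name local:steps is hypothetical. The braces are escaped as \{ and \} because they are regex metacharacters and some processors may be stricter about unescaped braces than BaseX is. The sketch assumes a path made up of element steps only, since steps such as text()[1] carry no "Q{" and would not be matched.

```xquery
(: local:steps is a hypothetical helper name, not part of BaseX. :)
declare function local:steps($path as xs:string) as xs:string* {
  (: group 1: "Q{namespace-uri}" plus everything up to the next "/" or the
     end of the string; group 2: the "/" separator itself (discarded) :)
  analyze-string($path, "(Q\{.*?\}.*?)(/|$)")
    /fn:match/fn:group[@nr = 1]/string()
};

local:steps("Q{http://www.w3.org/2005/xpath-functions}root()/Q{}A[1]/Q{}B[1]/Q{}C[1]/Q{}D[1]")
```

The reluctant quantifier .*? stops at the first closing brace, so the slashes inside the namespace URI are never treated as step separators.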

> @Christian: On the other hand I can imagine that within BaseX the path-expression is built from these elementary steps. If so, is it in any way possible to retrieve this elementary steps sequence instead of this path-expression?

Currently, there is no other way to retrieve paths, so you'll have to manually split the path as indicated above.

Christian
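
As a side note to the manual splitting, the steps can also be assembled in plain XQuery by walking the ancestor axis, so that no string has to be parsed. This is only a sketch under assumptions: local:element-steps is a made-up name, it handles element nodes only, and it emits no root() step.

```xquery
(: local:element-steps is a hypothetical helper; element nodes only. :)
declare function local:element-steps($node as element()) as xs:string* {
  for $e in $node/ancestor-or-self::*
  let $name := node-name($e)
  (: position among like-named siblings, as in the [n] predicate of fn:path :)
  let $pos := count($e/preceding-sibling::*[node-name(.) eq $name]) + 1
  return concat("Q{", namespace-uri($e), "}", local-name($e), "[", $pos, "]")
};

let $doc := document { <A><B><C><D/></C></B></A> }
return local:element-steps($doc//D)
```

The ancestor-or-self axis delivers nodes in document order, so the steps come out from the outermost element down to the node itself.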

Rob Stapper

9:08 a.m.

Hi guys,

Thanks for the suggestions.

@Marco: Sorry, I'll use Christian's suggestion ;-) But thank you anyway.

@Christian: Too bad this is the only way. But hey, it works.

Rob

basex-talk@mailman.uni-konstanz.de
