Hello,
I have a question that is more about XQuery than BaseX specifically, but I thought I would pose it here to see if anyone knows.
The basic problem statement is: given a list of tags that have an id attribute and a list of target attributes which should correspond to those ids, are there any target attributes which reference ids that do not exist?
Basically, I am trying to link up the tags biblFull and biblStruct, which have an id attribute, with a tag named title, whose target attribute should match the id of one of those two tags. I have formulated the query below, which I believe *should* give me all the title tags whose target does not exist in either the biblFull or the biblStruct id list. I am not getting the results that I am expecting.
let $biblFull := distinct-values(collection('edil_target/Prologue Merged 2013.xml')//biblFull/@id)
let $biblStruct := distinct-values(collection('edil_target/Prologue Merged 2013.xml')//biblStruct/@id)
for $title in collection('edil_target/eDIL-A.xml')//entry//title
where $title/@target != $biblStruct
   or $title/@target != $biblFull
return $title
I am basically getting all the title tags back which is not what I am expecting at all. Can anyone shed any light on this?
Thank you very much in advance!
All the best, Chris
Hi Chris,
try
for $title in collection('edil_target/eDIL-A.xml')//entry//title
where not($title/@target = $biblStruct)
  and not($title/@target = $biblFull)
return $title
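A self-contained sketch of this suggestion, for trying it outside the real database. The inline documents, the wrapper element names listBibl and entries, and the id values are made up for illustration; they stand in for the two collections and are not taken from the actual files.

```xquery
(: Hypothetical stand-ins for the bibliography and dictionary documents. :)
let $biblio :=
  <listBibl>
    <biblFull id="B1"/>
    <biblStruct id="B2"/>
  </listBibl>
let $dict :=
  <entries>
    <entry><title target="B1">known biblFull id</title></entry>
    <entry><title target="B2">known biblStruct id</title></entry>
    <entry><title target="B9">dangling reference</title></entry>
  </entries>
let $biblFull   := distinct-values($biblio//biblFull/@id)
let $biblStruct := distinct-values($biblio//biblStruct/@id)
for $title in $dict//entry//title
(: not(... = ...) is true only if NO id in the list equals the target :)
where not($title/@target = $biblStruct)
  and not($title/@target = $biblFull)
return $title
```

On this sample data, only the title with target="B9" is returned, because it is the only one whose target appears in neither id list.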
Best regards, Markus
Hi Markus,
> try
> for $title in collection('edil_target/eDIL-A.xml')//entry//title
> where not($title/@target = $biblStruct)
>   and not($title/@target = $biblFull)
> return $title
Thank you for the quick reply at such a late hour! Sadly, however, I am getting the same results. When I open the files, I can find the targets that are being returned in one or the other of the two lists.
All the best, Chris
Are you sure that the @target attributes are supposed to be identical to the IDs? Don’t you prepend a pound sign to @target attributes when they point to IDs within the same document? So you probably need to say
where not(substring($title/@target,2) = $biblStruct) and not(substring($title/@target,2) = $biblFull)
And maybe you need to restrict the titles that you search to those with a @target attribute, like so:
for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]
Otherwise, titles without a @target attribute will also match the where clause, which may be unintended.
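Both suggestions can be combined in one small sketch. The inline data and id values are invented for illustration, and replace() with the pattern '^#' is used here instead of substring(..., 2) so that targets without a leading pound sign are left untouched; that substitution is an assumption about what is wanted, not part of the original suggestion.

```xquery
(: Hypothetical id list and dictionary fragment. :)
let $ids := ('B1', 'B2')
let $dict :=
  <entries>
    <entry><title target="#B1">resolves after stripping the pound sign</title></entry>
    <entry><title target="B2">resolves as written</title></entry>
    <entry><title target="#B9">dangling reference</title></entry>
    <entry><title>no target attribute at all</title></entry>
  </entries>
(: [@target] skips titles that carry no target attribute :)
for $title in $dict//entry//title[@target]
(: strip one leading '#', if there is one :)
let $ref := replace($title/@target, '^#', '')
where not($ref = $ids)
return $title
```

On this sample, only the title with target="#B9" comes back. Without the [@target] predicate, the title that has no target attribute would be reported as well.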
Gerrit
Hi Gerrit,
> Are you sure that the @target attributes are supposed to be identical to the IDs?
Yes, they should be. If they are not, I need to find them so I can fix them to be identical.
> Don’t you prepend a pound sign to @target attributes when they point to IDs within the same document?
They are not in the same document. The @target attributes are spread out across the other documents, while the IDs all live in the same document.
> So you probably need to say
> where not(substring($title/@target,2) = $biblStruct) and not(substring($title/@target,2) = $biblFull)
I will give this a shot tomorrow when I am not as tired.
> And maybe you need to restrict the titles that you search to those with a @target attribute, like so:
> for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]
This is the other half of the problem, which I did not state here. I am to find all titles that do not have target attributes and then give them a target attribute based on some rules. I have done so in a few files (and I am explicitly testing one of them in the query in my previous email), and I will roll out the fix in all other files once I have everything else tested and working.
I will give your suggestion a try tomorrow. Thanks!
All the best, Chris
If you are allowed to share some snippets of the actual documents, it will be easier to see how the query needs to be phrased.
Have you verified that $biblFull and $biblStruct actually contain strings? If not, do you need to declare a default namespace? The vocabulary looks like TEI, so
declare default element namespace "http://www.tei-c.org/ns/1.0";
may be necessary. And if it is TEI, the ID attributes are probably called @xml:id rather than @id.
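A quick way to check these points is to count what the paths actually select. The sketch below uses a made-up, namespaced TEI fragment rather than the real files, so the element content and the id value are assumptions; against the real data, $doc would be bound to the collection instead.

```xquery
(: Hypothetical TEI-namespaced fragment with an xml:id attribute. :)
let $doc :=
  <TEI xmlns="http://www.tei-c.org/ns/1.0">
    <biblFull xml:id="B1"/>
  </TEI>
return (
  (: 0: the unprefixed name test looks for biblFull in no namespace :)
  count($doc//biblFull/@id),
  (: 0: the element is found via the wildcard prefix, but it has no plain @id :)
  count($doc//*:biblFull/@id),
  (: 1: element matched in any namespace, attribute matched as xml:id :)
  count($doc//*:biblFull/@xml:id)
)
```

If the first count is 0 for the real documents, the id lists are empty and every title will be reported as unmatched, whatever the where clause looks like.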
Gerrit
Hi,
> If you are allowed to share some snippets of the actual documents, it will be easier to see how the query needs to be phrased.
Sadly, probably not. I can probably do something off list but even then I would hesitate.
I tried your proposal from your other email and I am getting the same results as with all other attempts. I feel like this should be very straightforward and I am just missing something small.
> Have you verified that $biblFull and $biblStruct actually contain strings? If not, do you need to declare a default namespace? The vocabulary looks like TEI, so
> declare default element namespace "http://www.tei-c.org/ns/1.0";
It is TEI-like. It is not exactly TEI and I very much doubt it would validate.
> may be necessary. And if it is TEI, the ID attributes are probably called @xml:id rather than @id.
No, I checked: the XML documents explicitly use @id.
Thank you for your help.
All the best, Chris
Hi Everyone,
I just tried using functx:is-value-in-sequence: where not(functx:is-value-in-sequence($title/@id, $biblFull)). I am still getting erroneous results. At this point, I am very near to offering money for someone to help me fix this. It looks like a bug to me, though. I should not be getting the results that I am.
All the best, Chris
Hi,
Of course the instant you say that, you fix it. Thanks for everyone's support.
All the best, Chris
Chris,
please don't leave us hanging :). What was your solution?
Best, Bridger
Hi Bridger,
The solution was indeed using is-value-in-sequence:
import module namespace functx = 'http://www.functx.com';
let $biblFull := distinct-values(collection('edil/Prologue Merged 2013.xml')//biblFull/@id)
let $biblStruct := distinct-values(collection('edil/Prologue Merged 2013.xml')//biblStruct/@id)
let $bibl := distinct-values(collection('edil/Prologue Merged 2013.xml')//bibl/@id)
for $title in collection('edil/eDIL-A.xml')//entry//title
where not(functx:is-value-in-sequence($title/@target, $biblFull))
  and not(functx:is-value-in-sequence($title/@target, $biblStruct))
  and not(functx:is-value-in-sequence($title/@target, $bibl))
return $title
It is also pretty speedy.
All the best, Chris
Hi Chris,
I’m glad to hear you found a solution.
I was surprised to hear that functx:is-value-in-sequence did the job, though. The body of this function is nothing more than a general comparison: $value = $seq [1] (and I would even have guessed that your query would eventually be rewritten to a representation in which the function body is completely inlined).
The query should be even faster if the import of the FunctX library can be avoided. If the data is confidential, feel free to send us the output of the Info View panel of the GUI.
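As a rough sketch of what dropping the import could look like: the three id lists are merged into one sequence and compared with a plain general comparison. The inline documents, the wrapper element names and the id values are invented for illustration and stand in for the real collections.

```xquery
(: Hypothetical stand-ins for the bibliography and dictionary documents. :)
let $biblio :=
  <listBibl>
    <biblFull id="B1"/>
    <biblStruct id="B2"/>
    <bibl id="B3"/>
  </listBibl>
let $dict :=
  <entries>
    <entry><title target="B1">known</title></entry>
    <entry><title target="B3">known</title></entry>
    <entry><title target="B9">dangling reference</title></entry>
  </entries>
(: one id sequence instead of three separate ones :)
let $ids := distinct-values($biblio//(biblFull | biblStruct | bibl)/@id)
for $title in $dict//entry//title
(: the same test that the FunctX function body performs: $value = $seq :)
where not($title/@target = $ids)
return $title
```

On this sample, only the title with target="B9" is returned. Whether this is measurably faster on the real data would have to be checked in the Info View.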
Cheers, Christian
[1] http://www.xqueryfunctions.com/xq/functx_is-value-in-sequence.html
Chris - thanks for sharing your solution (IMO it will help future searchers/readers of the list, or others with similar problems searching the list archive)!
Note that `!=` is existentially quantified in XPath; i.e. it returns `true` if *any* value on the left is not equal to *any* value on the right*. So you might have had an expression problem earlier.
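A minimal illustration of that difference, using a made-up two-item id list in place of the real data:

```xquery
(: Hypothetical id list; 'B1' is a known id, 'B9' is not. :)
let $ids := ('B1', 'B2')
return (
  'B1' != $ids,      (: true:  'B1' differs from 'B2', and one differing pair is enough :)
  not('B1' = $ids),  (: false: 'B1' equals one of the ids :)
  'B9' != $ids,      (: true:  'B9' differs from both ids :)
  not('B9' = $ids)   (: true:  'B9' equals none of the ids :)
)
```

With `!=`, any list holding two or more distinct ids makes the test true for every target, which would be consistent with a query returning all title tags. Only `not(... = ...)` singles out the targets that are missing from the list.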
Cheers, Bridger
* I totally and completely cribbed that note from Mr. Liam Quin, who is often giving away advice like that on the xml freenode IRC channel. Credit to him :).
Just my guess ;) Thanks, Bridger. Here are some general comparisons with their correct results:
() = ()                 : false
() != ()                : false
(1,2) = (2,3)           : true
(1,2) != (2,3)          : true
(1) > (10,20,30,40,0)   : true
(5) < (1,2,3,4)         : false
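One way to read these results is to spell each general comparison out as a quantified expression over all pairs of items. The sketch below is only an illustration of that reading; it assumes a processor without strict static typing (such as BaseX), since the third line iterates over empty sequences.

```xquery
(
  (: same as (1,2) = (2,3): some pair is equal, here 2 and 2 :)
  (some $a in (1, 2), $b in (2, 3) satisfies $a eq $b),
  (: same as (1,2) != (2,3): some pair is unequal, here 1 and 2 :)
  (some $a in (1, 2), $b in (2, 3) satisfies $a ne $b),
  (: same as () = (): there are no pairs at all, so the result is false :)
  (some $a in (), $b in () satisfies $a eq $b),
  (: "no item on the left matches anything on the right": false, since 2 matches :)
  (every $a in (1, 2) satisfies not($a = (2, 3)))
)
```

The four expressions evaluate to true, true, false and false. The last form is the one that corresponds to a "not in the list" check.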
basex-talk@mailman.uni-konstanz.de