Consider the following XML document:
<article>
  <front>
    <article-meta>
      <aff id="aff1">Tropical and Infectious Disease Hospital, Kathmandu, Nepal</aff>
      <aff id="aff2">Nagasaki University, Nagasaki, Japan</aff>
      <aff id="aff3">Department of Radiology, Kyorin University Faculty of Medicine, Tokyo, Japan</aff>
      <aff id="aff5">Pentax Company Limited, Tokyo, Japan</aff>
      <aff id="aff6">National Research Laboratory of Molecular Complex Control, Yonsei University, Seoul, Korea</aff>
      <!--* ... *-->
      <article-id pub-id-type="pmc">2570825</article-id>
      <article-id pub-id-type="pmid">18325280</article-id>
      <article-id pub-id-type="publisher-id">07-0473</article-id>
      <article-id pub-id-type="doi">10.3201/eid1403.070473</article-id>
    </article-meta>
  </front>
  <!--* ... *-->
</article>
For convenience in trying to understand this problem, a copy of this document has been placed at [1].
When I issue the following search against this document, I get unexpected results.
let $doc := doc('http://blackmesatech.com/2014/LIS590DML/data/testdata.xml')
let $path1 := $doc/child::article/child::front
                  /child::article-meta
                  /child::aff[contains(.,"Japan")]
                  /parent::article-meta/child::article-id,
    $path2 := $doc/descendant::aff[contains(.,"Japan")]
                  /parent::article-meta/child::article-id,
    $path3 := $doc/article/front/article-meta/aff[contains(.,'Japan')]
                  /../article-id,
    $path4 := $doc//aff[contains(.,'Japan')]/../article-id,
    $path5 := $doc//aff/../article-id
return (count($path1), count($path2), count($path3), count($path4), count($path5))
What I expect is that path1, path2, path3, path4, and path5 should all return the same results, namely the set of four article-id elements in the document. So the sequence of counts returned should be 4 4 4 4 4.
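To make the expectation concrete, here is a minimal sketch using a made-up toy document (the element names a, b, and c are hypothetical and not taken from the test data). As I read the XPath rules, the / operator returns its result in document order with duplicate nodes removed by identity, so stepping from two sibling b elements up to their shared parent and down to c should give one c, not two.

```xquery
(: Hypothetical toy document: two b siblings and one c under one parent. :)
let $toy := document { <a><b/><b/><c/></a> }
(: Both b elements have the same parent, so ../c reaches the same c node
   twice; the path operator should collapse the two hits into one node. :)
return count($toy/a/b/../c)
```

Under that reading the count should be 1. A count of 2 would be the same symptom as the 12 above, on a much smaller input.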
What I am finding, in BaseX 7.9, is that path1 and path3 return 12 results, with each article-id present three times in the result (once, apparently, for every aff element containing the string 'Japan'). Paths 2, 4, and 5 return 4 results each, as I had expected. So the sequence of counts actually returned is 12 4 12 4 4.
In BaseX 7.6, for what it's worth, this query returns the sequence 12 12 12 12 20, which seems suggestive.
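My guess is that 12 and 20 are the number of article-id elements multiplied by the number of aff elements selected in each path, that is, that the duplicates are not being removed. The sketch below is one way to check that guess against the same document; the variable names $ids, $japan, and $all are mine, introduced only for this check.

```xquery
let $doc := doc('http://blackmesatech.com/2014/LIS590DML/data/testdata.xml')
(: Count each node set separately, without any parent step in the path. :)
let $ids   := count($doc//article-id)
let $japan := count($doc//aff[contains(., 'Japan')])
let $all   := count($doc//aff)
(: Products to compare with the observed counts for paths 1-4 and path 5. :)
return ($ids * $japan, $ids * $all)
```

If the two products match the 12 and the 20 reported by 7.6, that would support the idea that each article-id is being returned once per matching aff element.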
Interestingly, if I initialize the variable $doc with a document node constructor wrapping a direct element constructor, along the lines of
let $doc := document { <article>...</article> }
then all counts come out as expected in 7.6, but in 7.9 the result continues to be 12 4 12 4 4.
Is this an error in the handling of the / operator, or am I missing some subtle point?
Many thanks.
[1] http://blackmesatech.com/2014/LIS590DML/data/testdata.xml