Consider the following XML document:
<article>
<front>
<article-meta>
<aff id="aff1">Tropical and Infectious Disease Hospital, Kathmandu, Nepal</aff>
<aff id="aff2">Nagasaki University, Nagasaki, Japan</aff>
<aff id="aff3">Department of Radiology, Kyorin University Faculty of Medicine, Tokyo, Japan</aff>
<aff id="aff5">Pentax Company Limited, Tokyo, Japan</aff>
<aff id="aff6">National Research Laboratory of Molecular Complex Control, Yonsei University, Seoul, Korea</aff>
<!--* ... *-->
<article-id pub-id-type="pmc">2570825</article-id>
<article-id pub-id-type="pmid">18325280</article-id>
<article-id pub-id-type="publisher-id">07-0473</article-id>
<article-id pub-id-type="doi">10.3201/eid1403.070473</article-id>
</article-meta>
</front>
<!--* ... *-->
</article>
For convenience in trying to understand this problem, a copy of this
document has been placed at [1].
When I run the following query against this document, I get
unexpected results.
let $doc := doc('http://blackmesatech.com/2014/LIS590DML/data/testdata.xml')
let $path1 := $doc/child::article/child::front
                  /child::article-meta
                  /child::aff[contains(.,"Japan")]
                  /parent::article-meta/child::article-id,
    $path2 := $doc/descendant::aff[contains(.,"Japan")]
                  /parent::article-meta/child::article-id,
    $path3 := $doc/article/front/article-meta/aff[contains(.,'Japan')]
                  /../article-id,
    $path4 := $doc//aff[contains(.,'Japan')]/../article-id,
    $path5 := $doc//aff/../article-id
return (count($path1), count($path2), count($path3), count($path4), count($path5))
What I expect is that path1, path2, path3, path4, and path5 should
all return the same results, namely the set of four article-id elements
in the document. So the sequence of counts returned should be
4 4 4 4 4.
What I actually find, in BaseX 7.9, is that path1 and path3 return 12
results, with each article-id present three times in the result (once,
apparently, for every aff element containing the string 'Japan').
Paths 2, 4, and 5 return 4 results each, as I had expected. So
the sequence of counts actually returned is 12 4 12 4 4.
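In BaseX 7.6, for what it's worth, this query returns the sequence
12 12 12 12 20, which seems suggestive: 12 is 3 x 4 (each article-id
once for each of the three aff elements containing 'Japan'), and 20
is 5 x 4 (each article-id once for each of the five aff elements).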
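The arithmetic can be checked independently of any XQuery processor.
What follows is a minimal sketch in Python (standard library only), not
BaseX code. It walks from each aff to its parent and on to the
article-id children, once without and once with duplicate removal by
node identity, which is what the / operator is expected to perform. The
inline XML is only the fragment quoted above with the elided parts left
out, and the helper name article_ids_via_aff is hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment quoted in this message; elided portions are simply omitted.
XML = """<article>
<front>
<article-meta>
<aff id="aff1">Tropical and Infectious Disease Hospital, Kathmandu, Nepal</aff>
<aff id="aff2">Nagasaki University, Nagasaki, Japan</aff>
<aff id="aff3">Department of Radiology, Kyorin University Faculty of Medicine, Tokyo, Japan</aff>
<aff id="aff5">Pentax Company Limited, Tokyo, Japan</aff>
<aff id="aff6">National Research Laboratory of Molecular Complex Control, Yonsei University, Seoul, Korea</aff>
<article-id pub-id-type="pmc">2570825</article-id>
<article-id pub-id-type="pmid">18325280</article-id>
<article-id pub-id-type="publisher-id">07-0473</article-id>
<article-id pub-id-type="doi">10.3201/eid1403.070473</article-id>
</article-meta>
</front>
</article>"""

root = ET.fromstring(XML)

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def article_ids_via_aff(only_japan):
    """Model aff[...]/../article-id; return (raw, deduplicated) node lists."""
    affs = [
        aff for aff in root.iter("aff")
        if not only_japan or "Japan" in "".join(aff.itertext())
    ]
    # One copy of every article-id per selected aff: no duplicate removal.
    raw = [aid for aff in affs for aid in parent_of[aff].findall("article-id")]
    # Duplicate removal by node identity, keeping first occurrences in order.
    seen, unique = set(), []
    for node in raw:
        if id(node) not in seen:
            seen.add(id(node))
            unique.append(node)
    return raw, unique


raw_japan, unique_japan = article_ids_via_aff(only_japan=True)
raw_all, unique_all = article_ids_via_aff(only_japan=False)
print(len(raw_japan), len(unique_japan), len(raw_all), len(unique_all))
# prints: 12 4 20 4
```

On this model, the counts 12 and 20 are exactly what results when
duplicates are not removed after the parent step, and 4 is what results
when they are. This says nothing about where in the processor the
difference arises; it only confirms the pattern in the numbers.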
Interestingly, if I initialize the variable $doc with a direct element
constructor, along the lines of
let $doc := document { <article>...</article> }
then all counts come out as expected in 7.6, but in 7.9 the result
continues to be 12 4 12 4 4.
Is this an error in the handling of the / operator, or am I missing some
subtle point?
Many thanks.
[1] http://blackmesatech.com/2014/LIS590DML/data/testdata.xml
--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************