Hi Christian,

Thank you for your help! To summarize (also for the benefit of other users): in XQuery, data($d/aspect-values/@sign) = "yes" and $d/aspect-values/@sign = "yes" are equivalent because of atomization, but wrapping the path in data() lets the user prevent BaseX from rewriting that comparison for index access (so this is a BaseX-specific feature).

Paying attention to how BaseX uses indexes (which can be seen in the GUI Info panel) seems to be particularly important for join operations between documents. As far as I understand, one cannot predict in advance which index BaseX will choose automatically or how it will use it, so when a query evaluates slowly, one can try adding data() to test which index use turns out to be the best.
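
For other users, here is a minimal sketch of the two forms. The inline elements are made-up stand-ins for my documents (only the element and attribute names come from my query), and since they are in-memory fragments rather than a database, no index is involved; the sketch only illustrates that both comparisons select the same nodes.

```xquery
(: Hypothetical inline data standing in for doc1 and doc2. :)
let $doc1 :=
  <doc><s><t l="go" f="went"/><t l="see" f="saw"/></s></doc>
let $doc2 :=
  <cases>
    <case>
      <verb_lemma>go</verb_lemma>
      <verb_form value="went"/>
      <aspect-values sign="yes"/>
    </case>
    <case>
      <verb_lemma>see</verb_lemma>
      <verb_form value="saw"/>
      <aspect-values sign="no"/>
    </case>
  </cases>

(: Variant A: the general comparison atomizes @sign implicitly. :)
let $a :=
  for $s in $doc1//s//t
  for $d in $doc2//case
  where $d/verb_lemma = $s/@l
    and $d//verb_form/@value = $s/@f
    and $d/aspect-values/@sign = "yes"
  return $s

(: Variant B: explicit data() around the last condition only. :)
let $b :=
  for $s in $doc1//s//t
  for $d in $doc2//case
  where $d/verb_lemma = $s/@l
    and $d//verb_form/@value = $s/@f
    and data($d/aspect-values/@sign) = "yes"
  return $s

(: Compare the two result sequences. :)
return (count($a), count($b), deep-equal($a, $b))
```

With this sample data the query should return 1, 1 and true. On a real database, the two variants can be run one after the other and the optimized query and timings compared in the Info panel.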

Is this correct?

Thank you again!
Giuseppe 


Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: celano@informatik.uni-leipzig.de
E-mail: giuseppegacelano@gmail.com
Web site 1: http://www.dh.uni-leipzig.de/wo/team/
Web site 2: https://sites.google.com/site/giuseppegacelano/

On May 23, 2018, at 2:44 PM, Christian Grün <christian.gruen@gmail.com> wrote:

Hi Giuseppe,

I think your observation was related to another issue that has already
been fixed recently. Did you try the latest snapshot [1]?

Btw, in your specific query I noticed that data() may indeed be
helpful to suppress the index rewriting for the last condition. As
it's the only one that has a static comparison string, it is the
one that will be chosen for index access, but for your data, it would
actually be better if one of the other two conditions were
evaluated by the index.

Thanks for the sample documents,
Christian

PS: 9.0.2 will be available by the end of May.

[1] http://files.basex.org/releases/latest/



On Tue, May 22, 2018 at 5:22 PM, Giuseppe Celano
<celano@informatik.uni-leipzig.de> wrote:
I think I have identified a problem with atomization of attribute content
(no database involved). I have a simple query:

for $s in doc("doc1")//s//t
for $d in doc("doc2")//case
where $d/verb_lemma = $s/@l and $d//verb_form/@value = $s/@f and
$d/aspect-values/@sign = "yes"
return
$s

To get a result, I have to use the data() function, as in
data($d/aspect-values/@sign) = "yes"; otherwise the query never returns a
result. Is this a bug?
I would expect the value of @sign to be atomized automatically and
compared to "yes", but this does not seem to be the case.
Thanks.

Ciao,
Giuseppe
