Hi,
I'm working on a query in which the full-text index is not applied, even though it could be (see attached test.xql).
I'm sorry that the example looks somewhat complicated, with data-specific XPaths and so on. I have tried to simplify it as much as I could, but when I simplify it any further, either the full-text index is applied after all or the query is optimized to () due to cached evaluations.
But if you take the expression bound to the variable $number and inline it at the place where the variable is referenced, you may see what I mean. The result (see attached test-ft.xql) is equivalent to the original query, but now the full-text index *is* applied.
I'm using the BaseX 9.1 snapshot from 2018-11-05.
Best, Sebastian
Hi Sebastian,
It’s difficult to guess what the query optimizer does by just looking at the query. Could you send us a database document that allows us to run the query?
Any attempt to simplify your query is appreciated. Here are some more things that you could try:
• Does it make a difference if you work with namespaces, or could all prefixes in the location steps be replaced with *?
• Does the issue only occur if your code is defined in the local:search function?
• What happens if you replace the functx function call with its body ($value = $seq)?
Best, Christian
Hi Christian,
thanks for trying to help. I understand that the query is still not really simple and that it's hard to see what's going on.
Attached is a database document on which you can run the query.
To your questions:
* It makes no difference if the namespace prefixes are replaced with *. Thus you can simplify the query in that manner and the issue persists.
* Yes, the issue only occurs if the code is within the local:search function. If the code is in the global scope of the file, the ft-index is applied. Thus, I cannot simplify the query that way.
* If I replace the functx function call with its body, the ft-index is applied. Thus, I cannot simplify the query that way either.
I'm happy to help further if I can.
Best, Sebastian
Hi Sebastian,
I agree, it is kind of impossible to simplify your query. The reason is function inlining:
Original query:
  declare function local:greet($name) { 'hi ' || $name };
  local:greet('john')
Optimized query (intermediate):
  let $name_1 := 'john' return 'hi ' || $name_1
Optimized query (final):
  'hi john'
If a function body is too large, it won’t be inlined (otherwise, recursive functions could easily blow up main memory). The inline limit in BaseX is a static value; it can be controlled via the INLINELIMIT option [1].
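As a rough sketch of how the inline limit might be adjusted from within a query: the option declaration below assumes that BaseX options are bound to the db prefix in the query prolog, and the %basex:inline annotation assumes the BaseX-specific annotation for requesting inlining of a single function. The value 1000 is an arbitrary illustration, not a recommendation, and normally only one of the two alternatives would be needed.

```xquery
(: Alternative 1 (assumption): raise the inline limit for the whole query.
   The value '1000' is arbitrary and only serves as an illustration. :)
declare option db:inlinelimit '1000';

(: Alternative 2 (assumption): request inlining for this function only,
   regardless of the size of its body. :)
declare %basex:inline function local:greet($name as xs:string) as xs:string {
  'hi ' || $name
};

local:greet('john')
```

The result of the query stays the same in either case; only the optimized query plan, as shown in the query info, should differ.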
Coming back to your query, test.xql won’t be optimized for index access because…
1. The function body is too large to be inlined.
2. The JSON string that is passed on cannot be statically converted to a map.
3. As a consequence, it cannot be statically determined whether the search will be fuzzy or not.
4. As a consequence, the predicate with the full-text expression contains multiple options, and none of them will be rewritten for index access.
You can, e.g., tackle this by defining a larger inline limit for this function in your function declaration (see the link below [1]). As stated before, you can also call ft:search directly and pass on the fuzzy option as a parameter.
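A minimal sketch of the second suggestion, calling ft:search directly with the fuzzy option decided at runtime. The database name 'mydb', the search term and the JSON string are hypothetical placeholders (the attached test.xql is not reproduced here), and the database is assumed to exist with a full-text index.

```xquery
(: Hypothetical database name; the database must exist and have a full-text index. :)
declare variable $db := 'mydb';
(: Hypothetical JSON string, standing in for the options passed to the query. :)
declare variable $json := '{ "fuzzy": true }';

(: The JSON is converted at runtime; no static knowledge of its content is needed. :)
let $options := parse-json($json)
let $fuzzy := boolean($options?fuzzy)
(: ft:search accesses the full-text index directly and returns text nodes. :)
for $text in ft:search($db, 'searchterm', map { 'fuzzy': $fuzzy })
return $text/..
```

Because ft:search always goes through the full-text index, this variant does not depend on the optimizer being able to rewrite a predicate.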
A general hint: With BaseX 9.1, your map lookup can be simplified to…
$map?key2?key3 ?: ("1", "2")
…or…
util:or($map?key2?key3, ("1", "2"))
If you want to stick to standard XQuery 3.1, you can write:
let $key := $map?key2?key3
return if(exists($key)) then $key else ("1", "2")
This will ensure that your map will only be looked up once.
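For illustration, here is the standard XQuery 3.1 variant applied to a small map. The map content is made up for this example; only the key names key2 and key3 are taken from the snippets above.

```xquery
(: Made-up sample map: key2 exists, but it contains no entry for key3. :)
let $map := map { 'key2': map { 'other': 'x' } }
(: The nested lookup is evaluated once and yields the empty sequence here. :)
let $key := $map?key2?key3
return if (exists($key)) then $key else ("1", "2")
```

Since key3 is missing, the query falls back to the default sequence ("1", "2").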
Best, Christian
[1] http://docs.basex.org/wiki/Options#INLINELIMIT
Hi Christian,
wow, thank you for explaining! All this is very helpful to me, especially the INLINELIMIT option, which I was not aware of. I think I can now rewrite the original query so that the index is applied again.
Best, Sebastian
basex-talk@mailman.uni-konstanz.de