Hi Christian,

Thanks for the quick response, and sure thing. It does look like the text index is applied in both cases. While I was writing up an example of the slow query, I realized that the way I'll actually be querying it is by wrapping the condition in not(...). After doing so, it now only takes 6 seconds to run -- still slower but better. The slow query looks like this:

for $x in db:open('MyDatabase')/data/element
where not($x/child1/child2/valid = 'false')
return $x

And the fast query looks like this:

for $x in db:open('MyDatabase')/data/element
where $x/child3/child4/id = '123'
return $x

Let me know if this is helpful -- if not, I could share more info in a direct email about the actual structure of the database and the Info output.

Thanks,
Matt

On Mon, Oct 30, 2023 at 4:40 PM Christian Grün <christian.gruen@gmail.com> wrote:
Hi Matt,

In general, all nodes are treated identically, no matter what the hierarchy is or regular the target path is.

Could you share some more information with us? How do the queries look like (the slow and the fast one)? Is the text index applied in both cases?

Thanks in advance,
Christian



Matthew Dziuban <mrdziuban@gmail.com> schrieb am Mo., 30. Okt. 2023, 21:33:
Hi all,

I'm working with a database structured like so:

<data>
  <element>...</element>
  <element>...</element>
  <element>...</element>
</data>

There are a total of about 1.5 million <element> nodes in the database. Each <element> has many child nodes, one of which is uncommon -- it only appears in 727 <element>s.

I'm writing a query that has a condition on this uncommon field, but the query takes about 20 seconds to run, whereas another with a condition on a child node that appears in every <element> only takes about 20 milliseconds to run.

Based on the Info in the GUI, it does appear that the text index is being used -- I see 'apply text index for "..."'. Is it expected that the query time would be this much longer? Is the text index somehow built differently for nodes that don't appear often in the database?

Thanks in advance,
Matt


--
Matthew R. Dziuban
mattdziuban.com
703-973-6717
mrdziuban@gmail.com