Thanks for the confirmation that I didn’t miss a feature.
Cheers,
E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
LinkedIn | Twitter | YouTube | Facebook
From: Christian Grün <christian.gruen@gmail.com>
Date: Sunday, August 7, 2022 at 8:09 AM
To: Eliot Kimber <eliot.kimber@servicenow.com>
Cc: Bridger Dyson-Smith <bdysonsmith@gmail.com>, basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
Subject: Re: [basex-talk] Get All Tokens for Attribute Name?
Hi Eliot,
index:facets is a good pointer: It collects statistical information on
all distinct paths and values of the database. The number of distinct
stored values is limited, but it can be increased with the MAXCATS
option [1].
The current BaseX value indexes don’t store any location/path
metadata, only the values themselves, so you’ll indeed need to collect
the values with XPath, as you already did.
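
To make the two routes concrete, here is a minimal sketch that returns both results side by side. The database name 'docs' is a hypothetical placeholder; the element and attribute names are taken from the example entry quoted below. The facets route returns whole attribute values rather than tokens, and it is only complete if the number of distinct @bundles values stayed within MAXCATS when the statistics were built.

```xquery
(: Hypothetical database name; replace with your own. :)
let $db := 'docs'
return map {
  (: Route 1: values recorded in the database statistics.
     Whole attribute values, not tokens; capped by MAXCATS. :)
  'from-facets':
    index:facets($db)//attribute[@name = 'bundles']//entry/text()
    ! string() => distinct-values(),

  (: Route 2: plain XPath over the attributes, split into tokens. :)
  'from-xpath':
    collection($db)//doc-to-bundle-index-entry/@bundles
    ! tokenize(., '\s+') => distinct-values()
}
```

If the two sequences differ, the likely causes are multi-token attribute values or a truncated statistics list, so the XPath result is the one to trust.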
Best,
Christian
[1] https://docs.basex.org/wiki/Options#MAXCATS
On Sun, Aug 7, 2022 at 2:59 PM Eliot Kimber <eliot.kimber@servicenow.com> wrote:
>
> I had not explored the index:facets() function—interesting. However, since my goal is to get values straight from the indexes, I think using the facets doesn’t help, since the alternative is to just do an XPath over the attributes in my index XML.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
> From: Bridger Dyson-Smith <bdysonsmith@gmail.com>
> Date: Saturday, August 6, 2022 at 12:39 PM
> To: Eliot Kimber <eliot.kimber@servicenow.com>
> Cc: basex-talk@mailman.uni-konstanz.de <basex-talk@mailman.uni-konstanz.de>
> Subject: Re: [basex-talk] Get All Tokens for Attribute Name?
>
>
>
> Hi Eliot -
>
> If I'm following correctly, could you use `index:facets()`? E.g.
>
> index:facets($db)//attribute[@name='bundles']//entry/text() => distinct-values()
>
> Can the bundles attribute have multiple tokens?
>
> Best,
>
> Bridger
>
>
>
>
>
> On Sat, Aug 6, 2022 at 12:01 PM Eliot Kimber <eliot.kimber@servicenow.com> wrote:
>
> Using the attribute index I can get all attributes of a specific name that have a specific value or token (db:attribute()) and I can get all the values for all attributes that start with a specific prefix (index:attributes()) but I don’t see a way to get, from the index, all the values for all the attributes of a specific name.
>
>
>
> Have I missed something?
>
>
>
> My use case is I have an index over my docs where each index entry is of the form:
>
>
>
> <doc-to-bundle-index-entry key="10503999" filename="c_DataCertification.dita" dbpath="product/data-certification/concept/c_DataCertification.dita" bundles="bundle-platcap-platform-capabilities"/>
>
>
>
> And I would like the distinct-values of all the @bundles attributes. There are about 40K entry elements.
>
>
>
> Of course I can do:
>
> doc-to-bundle-index/doc-to-bundle-index-entry/@bundles ! tokenize(., '\s+')
>   => distinct-values()
>
>
>
> And that seems plenty fast, but it seemed like there should be a way to do it with the indexes alone. But it may be that the query on the @bundles attribute actually uses the index anyway…
>
>
>
> Cheers,
>
>
>
> E.
>
>
>