I am loading page view data exported from Adobe Analytics as CSV and converted to XML with a structure like:

<csv timestamp="2024-01-24">
  <record language="en-US" family="vancouver" bundlename="bundle-platcap-platform-capabilities" topicpath="administer/general/concept/capabilities-bundle-landingpage.dita" notviewed="viewed" viewcount="120587">
    <entry name="URL">docs.servicenow.com/bundle/vancouver-servicenow-platform/page/administer/general/concept/capabilities-bundle-landingpage.html</entry>
    <entry name="Page Views">120587</entry>
    <entry name="Unique Visitors">50863</entry>
    <entry name="Bounce Rate">0.255523003</entry>
    <entry name="Content Velocity">3.289644475</entry>
    <entry name="Docs Engagement Rate">0.289549233</entry>
  </record>
  … 50K more records …
</csv>

For this database I have a token index, and the MAXLEN value is set to 255 to ensure that the @topicpath values are tokenized (none should be longer than 255 characters).
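
For reference, rebuilding the index with those options would look something like this (a sketch using db:optimize; 'analytics' stands in for the actual database name):

db:optimize('analytics', true(), map {
  'tokenindex': true(),  (: build the token index :)
  'maxlen': 255          (: include strings up to 255 characters in the index :)
})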

I’m measuring a consistent 0.3 seconds for this query:

let $docRecords as element()* := db:token($analyticsmgmt:analyticsDatabase, $docPath, 'topicpath')/..

where $docPath is a value that will match tokens in @topicpath attributes.
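
A self-contained version of the measurement might look like this (a sketch using prof:time from the Profiling Module; the database name and path value here are placeholders):

(: Time the token lookup in isolation; names are placeholders. :)
let $docPath := 'administer/general/concept/capabilities-bundle-landingpage.dita'
return prof:time(
  db:token('analytics', $docPath, 'topicpath')/..
)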

Based on other token-based optimizations I’ve done, I would expect results at least 10 times faster.

My questions:

  1. What can I look for that might be making this lookup slower than expected? Conversely, how can I prove that this is as fast as the query can run?
  2. Is there a better way to manage this kind of tabular data with many thousands of records within BaseX? For example, does it matter if the records are part of a single document, or would it be better to have each record be a separate document (see the sketch after this list)?
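
To make question 2 concrete, splitting the records into one document each at load time might look something like this (a hypothetical sketch using db:create; the database name and path scheme are made up):

(: Store each record as its own document rather than one large file.
   Database name and path scheme are hypothetical. :)
let $records := doc('pageviews.xml')/csv/record
return db:create(
  'analytics-split',
  $records,
  $records ! ('records/' || replace(@topicpath, '/', '_') || '.xml'),
  map { 'tokenindex': true(), 'maxlen': 255 }
)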

Thanks,

Eliot

_____________________________________________

Eliot Kimber

Sr Staff Content Engineer

Digital Content & Design

O: 512 554 9368

M: 512 554 9368

servicenow.com
