I am loading page view data exported from Adobe as CSV as XML with a structure like:
<csv timestamp="2024-01-24"> <record language="en-US" family="vancouver" bundlename="bundle-platcap-platform-capabilities" topicpath="administer/general/concept/capabilities-bundle-landingpage.dita" notviewed="viewed" viewcount="120587"> <entry name="URL">docs.servicenow.com/bundle/vancouver-servicenow-platform/page/administer/general/concept/capabilities-bundle-landingpage.html</entry> <entry name="Page Views">120587</entry> <entry name="Unique Visitors">50863</entry> <entry name="Bounce Rate">0.255523003</entry> <entry name="Content Velocity">3.289644475</entry> <entry name="Docs Engagement Rate">0.289549233</entry> </record> … 50K more records … </csv>
For this database I have a token index and the MAXLEN value is set to 255 to ensure that the @topicpath values will be tokenized (none should be longer than 255).
I’m measuring a consistent 0.3 seconds for this query:
let $docRecords as element()* := db:token($analyticsmgmt:analyticsDatabase, $docPath, 'topicpath')/..
Where $docPath is a value that will match tokens in the @topicpath attribute.
Based on other token-based optimizations I’ve done, I would expect faster results, at least 10-times faster.
My questions:
1. What can I look for that might be making this lookup slower than expected or, conversely, how can I prove that this is the fastest the query will return? 2. Is there a better way to manage this kind of tabular data with many 1000s of records within BaseX? For example, does it matter if the records are part of a single document or would it be better to have each record be a separate document?
Thanks,
Eliot _____________________________________________ Eliot Kimber Sr Staff Content Engineer Digital Content & Design O: 512 554 9368 M: 512 554 9368 servicenow.comhttps://www.servicenow.com LinkedInhttps://www.linkedin.com/company/servicenow | Twitterhttps://twitter.com/servicenow | YouTubehttps://www.youtube.com/user/servicenowinc | Facebookhttps://www.facebook.com/servicenow