Hi everyone,
We are experimenting a small issue with the count() function, and I wanted to know if you have any idea to reduce the response time of the server :
We have a database containing almost 5.000 descriptions of expert reports from the 18th century. As a report can take place over several years, we want to list all the years mentioned in the reports and then count for each year how many reports we have. Our goal with this query is to produce a filter by year for our web application.
But, if the first part of the query is quite fast, it is not the case for the second part… here is a simplified sample of our code : xquery version "3.1"; declare default element namespace "xpr" ;
(: to create the xprDB ;) db:create('xpr', 'https://raw.githubusercontent.com/anrExperts/data/master/db/xpr.xml') :)
let $years := fn:distinct-values(db:open('xpr')/xpr/expertises/expertise/description/sessions/date[@when castable as xs:date]/fn:year-from-date(@when)) for $year in $years return $year || ' : ' (:this first part of the query is quite fast 0.09sec on my old computer:) || fn:count(db:open('xpr')/xpr/expertises/expertise[description/sessions/date[fn:matches(@when, xs:string($year))]]) (:it takes 5sec to execute the second part of the query:)
Do you see anything we are doing wrong or we can improve to reduce the server response time ?
We thank you in advance! Best, Josselin
Dear Josselin,
The group by clause can be used to speed up repeated value lookups:
declare default element namespace 'xpr' ;
for $group in db:open('xpr')/xpr/expertises/expertise for $year in $group/description/sessions/ date[@when castable as xs:date]/year-from-date(@when) group by $year return $year || ' : ' || count($group/self::node())
The self::node() step is used in the last line to remove duplicate nodes in the resulting $group value.
Salutations, Christian
On Tue, Mar 29, 2022 at 5:13 PM Josselin Morvan morvan.josselin@gmail.com wrote:
Hi everyone,
We are experimenting a small issue with the count() function, and I wanted to know if you have any idea to reduce the response time of the server :
We have a database containing almost 5.000 descriptions of expert reports from the 18th century. As a report can take place over several years, we want to list all the years mentioned in the reports and then count for each year how many reports we have. Our goal with this query is to produce a filter by year for our web application.
But, if the first part of the query is quite fast, it is not the case for the second part… here is a simplified sample of our code :
xquery version "3.1"; declare default element namespace "xpr" ;
(: to create the xprDB ;) db:create('xpr', 'https://raw.githubusercontent.com/anrExperts/data/master/db/xpr.xml') :)
let $years := fn:distinct-values(db:open('xpr')/xpr/expertises/expertise/description/sessions/date[@when castable as xs:date]/fn:year-from-date(@when)) for $year in $years return $year || ' : ' (:this first part of the query is quite fast 0.09sec on my old computer:) || fn:count(db:open('xpr')/xpr/expertises/expertise[description/sessions/date[fn:matches(@when, xs:string($year))]]) (:it takes 5sec to execute the second part of the query:)
Do you see anything we are doing wrong or we can improve to reduce the server response time ?
We thank you in advance! Best, Josselin
Hi Christian,
Thank you very much!
This group by approach is very efficient, and really (really) faster… and I completely missed it!
Thanks a lot again.
Best, Josselin
Le 29 mars 2022 à 18:33, Christian Grün christian.gruen@gmail.com a écrit :
Dear Josselin,
The group by clause can be used to speed up repeated value lookups:
declare default element namespace 'xpr' ;
for $group in db:open('xpr')/xpr/expertises/expertise for $year in $group/description/sessions/ date[@when castable as xs:date]/year-from-date(@when) group by $year return $year || ' : ' || count($group/self::node())
The self::node() step is used in the last line to remove duplicate nodes in the resulting $group value.
Salutations, Christian
On Tue, Mar 29, 2022 at 5:13 PM Josselin Morvan morvan.josselin@gmail.com wrote:
Hi everyone,
We are experimenting a small issue with the count() function, and I wanted to know if you have any idea to reduce the response time of the server :
We have a database containing almost 5.000 descriptions of expert reports from the 18th century. As a report can take place over several years, we want to list all the years mentioned in the reports and then count for each year how many reports we have. Our goal with this query is to produce a filter by year for our web application.
But, if the first part of the query is quite fast, it is not the case for the second part… here is a simplified sample of our code :
xquery version "3.1"; declare default element namespace "xpr" ;
(: to create the xprDB ;) db:create('xpr', 'https://raw.githubusercontent.com/anrExperts/data/master/db/xpr.xml') :)
let $years := fn:distinct-values(db:open('xpr')/xpr/expertises/expertise/description/sessions/date[@when castable as xs:date]/fn:year-from-date(@when)) for $year in $years return $year || ' : ' (:this first part of the query is quite fast 0.09sec on my old computer:) || fn:count(db:open('xpr')/xpr/expertises/expertise[description/sessions/date[fn:matches(@when, xs:string($year))]]) (:it takes 5sec to execute the second part of the query:)
Do you see anything we are doing wrong or we can improve to reduce the server response time ?
We thank you in advance! Best, Josselin
basex-talk@mailman.uni-konstanz.de