On Thu, 2023-08-10 at 16:00 +0000, Eliot Kimber wrote:
> This REST endpoint is called from server-side code that first checks for a cached preview and just returns it if there is one (avoiding the overhead of the REST call); otherwise it calls the endpoint.
I do something similar for fromoldbooks.org (using memcached for the front page, as the site sometimes gets.. a little busy :) )
A few things to watch for...
* write the new cache file to a temp file and then rename it; that way, another process can't start reading an incomplete cache file (a small sketch of this follows the list)
* I check the load average (by opening /proc/loadavg on a Linux server; it's a text file maintained by the kernel) and, if it's too high, sleep for a while to slow down crawlers, then return failure (see the second sketch after the list)
* I handle updating the cache in the front-end code, and I return the result before updating the cache, to shave a few ms off “time to first paint”. Response time affects your position in Google search results, if that matters to you.
* if your pages are public, crawler bots will pre-populate the cache, possibly with nonsensical parameters, so it can make sense to reject those early on. E.g. an incoming search at fromoldbooks.org with 30 keywords isn't from a human, as the UI doesn't support more than 3, so I don't need to store 2^30 cached pages when the bot tries every combination of them (the last sketch after the list shows this kind of early rejection)
* you can use the Google Search Console (I think that's the right place) to tell Googlebot about parameters that don't affect the result, so it shouldn't try every possible value.
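
For the temp-file-and-rename point, roughly, in Python (just a sketch, not the code I actually run; the names are illustrative). The details that matter are creating the temp file in the same directory as the final file and using an atomic rename:

import os
import tempfile

def write_cache_atomically(cache_path, content):
    # Create the temp file in the same directory as the final file,
    # because the rename is only atomic within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(cache_path) or ".")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
        # Readers see either the old file or the new one, never a partial one.
        os.replace(tmp_path, cache_path)
    except Exception:
        os.unlink(tmp_path)  # don't leave stray temp files around on failure
        raise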
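And the load-average check, again as a rough Python sketch (the threshold and the sleep time here are placeholder values, not what I actually use):

import time

def load_too_high(threshold=8.0):
    # /proc/loadavg starts with the 1-, 5- and 15-minute load averages.
    with open("/proc/loadavg") as f:
        one_minute = float(f.read().split()[0])
    return one_minute > threshold

def throttle_if_busy():
    if load_too_high():
        time.sleep(5)   # slow the crawler down a little
        return False    # caller should report failure (e.g. a 503) instead of rendering
    return True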
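The early rejection of nonsense queries can be as simple as a bounds check before you go anywhere near the cache (again only a sketch; 3 is the fromoldbooks.org UI limit mentioned above):

MAX_KEYWORDS = 3  # the search UI never sends more than this

def plausible_query(keywords):
    # Anything outside what the UI can produce is a bot probing the
    # parameter space; reject it before creating a cache entry for it.
    return 0 < len(keywords) <= MAX_KEYWORDS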
liam