Memoize AJAX requests without data inconsistency

A common problem when working with AJAX is firing duplicate requests. A common solution is to first check whether a request is already ongoing. I solved this problem differently, by memoizing the GET request for the lifetime of the request.

Memoization is storing the results of a function call and returning the cached result when the same inputs occur again. This is often used to optimize expensive function calls, but I see value in using it here.
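As a quick illustration of underscore's `_.memoize` (the `slowDouble` function is hypothetical, purely for demonstration):

var slowDouble = function(n) {
    console.log('computing...');
    return n * 2;
};
var fastDouble = _.memoize(slowDouble);

fastDouble(21); // logs "computing..." and returns 42
fastDouble(21); // returns the cached 42 without recomputing

In the code below we use Restangular, but the concept works just as well for any promise-based request library.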

function memoizedService(route) {
    var service = Restangular.service(route);
    var hasher = JSON.stringify;
    // Bind getList so it keeps the service as its context once memoized.
    var memoizedGetList = _.memoize(_.bind(service.getList, service), hasher);

    /**
     * Get list, but if there is already an ongoing request with the same
     * params then return the existing request.
     * @param {Object} params
     * @return {promise}
     */
    service.getList = function(params) {
        return memoizedGetList(params).finally(function() {
            // The request has settled: drop the cached promise so the
            // next call triggers a fresh request.
            delete memoizedGetList.cache[hasher(params)];
        });
    };

    return service;
}

var questionEndpoint = memoizedService('question');

Make sure you install underscore with a bit of

bower install underscore --save

Then, to use the feature we would do:

questionEndpoint.getList({page: 1});

If we request page 1 while there is already a request for page 1 in flight, the second `getList` call shares the first request's underlying promise, and you will only see one request for page 1 in the network panel. Importantly, we also remove the cached promise when the request completes. This prevents data consistency problems: a GET request can return different data over time (e.g., more records), and we want to make sure the user receives the most up-to-date data.
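To make the behaviour concrete, a small sketch (assuming the `questionEndpoint` defined above):

questionEndpoint.getList({page: 1}); // fires a GET for page 1
questionEndpoint.getList({page: 1}); // shares the in-flight request, no new GET

// Once the first request settles its cache entry is deleted, so calling
// getList again later fires a fresh request and picks up any new records.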


Tell the browser when files are updated (per-file cache busting)

This post will explain a simple method to tell the browser to re-download a file when the file has changed.

The problem

A common approach to cache busting when files are updated is to automatically add a “build label” to the URL, for example using RequireJS's `urlArgs`:

require.config({
    urlArgs: "build={{build_id}}",  //adds ?build=... to each request
    ...
});

When `my_file_1.js` is requested, the build label is added to the URL:

/static/my_file_1.js?build=19

This is fine if all the code is in a single JavaScript file, but a problem arises when the code is split into smaller files. We have 30 files used on our website:

/static/my_file_1.js?build=19
/static/my_file_2.js?build=19
...
/static/my_file_30.js?build=19

Later we update `my_file_1.js`, and therefore the build label is also updated, so this happens:

/static/my_file_1.js?build=20
/static/my_file_2.js?build=20
...
/static/my_file_30.js?build=20

Now one file has been updated, but the browser re-downloads all 30 files because the build label changed on every URL: 29 of the 30 files (roughly 97%) have been needlessly re-downloaded. This slows down page loads compared to only re-downloading changed files, and the problem grows if continuous integration is used with many pushes to production each day. Moreover, we also force a re-download of any third-party libraries in use. Performance will surely suffer.

A solution

Instead of using a build label for cache busting, we will extend RequireJS to use the hash of the last commit that changed each individual file (note: this bit is quite hacky; if anyone has a smarter way to do it then I will update):

require.config({...});

// Keep a reference to the default context's nameToUrl, then wrap it so
// a per-file hash is appended to each module URL when we know one.
requirejs.s.contexts._.realNameToUrl = requirejs.s.contexts._.nameToUrl;
requirejs.s.contexts._.nameToUrl = function() {
    var url = requirejs.s.contexts._.realNameToUrl.apply(this, arguments);
    if (hashes[url]) {
        return url + '?hash=' + hashes[url];
    }
    return url;
};

Before we can use this code we need to create a file mapping file paths to commit hashes, a file that looks like this:

var hashes = {
    "/static/my_file_1.js": "28c72d56",
    "/static/my_file_2.js": "8e4e7740",
    ...
    "/static/my_file_30.js": "28c72d56" 
};

I made a Django app that does this (here). If you're not using Django then fork the repo and make it work in your environment.
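If you just need the core idea, here is a rough Node sketch of generating such a file (the `static/` directory and the `static/hashes.js` output path are assumptions; it relies on the files being tracked in git):

// Rough sketch: map each static JS file to the abbreviated hash of the
// last commit that touched it, and write the result out as hashes.js.
var fs = require('fs');
var execSync = require('child_process').execSync;

var hashes = {};
fs.readdirSync('static').forEach(function(name) {
    if (!/\.js$/.test(name)) return;
    var cmd = 'git log -n 1 --pretty=format:%h -- static/' + name;
    hashes['/static/' + name] = execSync(cmd).toString().trim();
});

fs.writeFileSync(
    'static/hashes.js',
    'var hashes = ' + JSON.stringify(hashes, null, 4) + ';'
);

Make sure the resulting hashes file is loaded before the RequireJS config, so that `hashes` is defined by the time `nameToUrl` runs.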

The outcome

The hashes are now added to the static file requests.
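For example, reusing the hash values from the mapping above:

/static/my_file_1.js?hash=28c72d56
/static/my_file_2.js?hash=8e4e7740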

Usage in production

We can manually call the script to update the file containing commit hashes and add the resulting file to version control. That's fine. However, I'm quite forgetful, and forgetting to run the script will result in the browser using old files. As I use Ansible for provisioning my servers, I just add the following to my Ansible script:

- name: getting cache bust hashes
  django_manage: >
      app_path="my/repo/"
      command=collect_static_hashes
      settings=settings.live
      virtualenv="my/virtualenv"

And it just works.

Caution

  • If you update your static files using `git commit --amend` then the commit hash will not change.
  • The hack to RequireJS will not work if you have multiple contexts in play.