Tell browser when files are updated (per-file cache busting)

This post will explain a simple method to tell the browser to re-download a file when the file has changed.

The problem

A common approach to cache busting when files are updated is to automatically add a “build label” to the url – maybe using requireJs‘s `urlArgs`

require.config({
    urlArgs: "build={{build_id}}",  //adds ?build=... to each request
    ...
});

When my_file_1.js is requested, the build will be added to the url:

/static/my_file.js?build=19

This is fine if all the code is in a single javascript file but a problem arises when the code is split into smaller files. We have 30 files used in our web site:

/static/my_file_1.js?build=19
/static/my_file_2.js?build=19
...
/static/my_file_30.js?build=19

Later we update “my_file_1.js”, and therefore the build label is also updated, so this happens:

/static/my_file_1.js?build=20
/static/my_file_2.js?build=20
...
/static/my_file_30.js?build=20

Now 1 file updated but the browser re-downloads all 30 files because the build label changed. 97% of the files have now been needlessly re-downloaded. This will slow down page loads compared to if we only re-download changed files. This problem gets larger if continuous integration is used with many pushes to production each day. Moreover, we also force re-download of any third party libs in use. Performance will surely suffer.

A solution

Instead of using build label for cache busting we will extend requireJs to use the commit hash of the last times each individual file was changed (note: this bit is quite hacky. If anyone has a smarter way to do it then I will update):

require.config({...});
requirejs.s.contexts._.realNameToUrl = requirejs.s.contexts._.nameToUrl;
 requirejs.s.contexts._.nameToUrl = function() {
    var url = requirejs.s.contexts._.realNameToUrl.apply(this, arguments);
    if (hashes[url]) {
        return url + '?hash=' +hashes[url]; 
    }
    return url
 }

Before we an use this code we need to create a file containing file paths and the commit hashes, a file that looks like this:

var hashes = {
    "/static/my_file_1.js": "28c72d56",
    "/static/my_file_2.js": "8e4e7740",
    ...
    "/static/my_file_30.js": "28c72d56" 
};

I made a django app that does this (here) . If you’re not using Django then fork the repo and make it work in your environment.

The outcome

See the hashes are added to the static file requests.

Usage in production

We can manually call the script to update the file containing commit hashes and add the resulting file to version control. Thats fine. However, I’m quite forgetful  and forgetting to run the script will result in the browser using old files. As I use ansible for provisioning my servers I just add the following to my ansible script:

- name: getting cache bust hashes
 django_manage: >
     app_path="my/repo/"
     command=collect_static_hashes
     settings=settings.live
     virtualenv="my/virutalenv"

And it just works.

Caution

  • If you update your static files using git commit –amend then the commit hash will not change.
  • The hack to requirejs will not work if you have multiple contexts in play.