Tell browser when files are updated (per-file cache busting)

This post will explain a simple method to tell the browser to re-download a file when the file has changed.

The problem

A common approach to cache busting when files are updated is to automatically add a “build label” to the url – maybe using requireJs‘s `urlArgs`

    urlArgs: "build={{build_id}}",  //adds ?build=... to each request

When my_file_1.js is requested, the build will be added to the url:


This is fine if all the code is in a single javascript file but a problem arises when the code is split into smaller files. We have 30 files used in our web site:


Later we update “my_file_1.js”, and therefore the build label is also updated, so this happens:


Now 1 file updated but the browser re-downloads all 30 files because the build label changed. 97% of the files have now been needlessly re-downloaded. This will slow down page loads compared to if we only re-download changed files. This problem gets larger if continuous integration is used with many pushes to production each day. Moreover, we also force re-download of any third party libs in use. Performance will surely suffer.

A solution

Instead of using build label for cache busting we will extend requireJs to use the commit hash of the last times each individual file was changed (note: this bit is quite hacky. If anyone has a smarter way to do it then I will update):

requirejs.s.contexts._.realNameToUrl = requirejs.s.contexts._.nameToUrl;
 requirejs.s.contexts._.nameToUrl = function() {
    var url = requirejs.s.contexts._.realNameToUrl.apply(this, arguments);
    if (hashes[url]) {
        return url + '?hash=' +hashes[url]; 
    return url

Before we an use this code we need to create a file containing file paths and the commit hashes, a file that looks like this:

var hashes = {
    "/static/my_file_1.js": "28c72d56",
    "/static/my_file_2.js": "8e4e7740",
    "/static/my_file_30.js": "28c72d56" 

I made a django app that does this (here) . If you’re not using Django then fork the repo and make it work in your environment.

The outcome

See the hashes are added to the static file requests.

Usage in production

We can manually call the script to update the file containing commit hashes and add the resulting file to version control. Thats fine. However, I’m quite forgetful  and forgetting to run the script will result in the browser using old files. As I use ansible for provisioning my servers I just add the following to my ansible script:

- name: getting cache bust hashes
 django_manage: >

And it just works.


  • If you update your static files using git commit –amend then the commit hash will not change.
  • The hack to requirejs will not work if you have multiple contexts in play.

JSON schema validation with Django Rest Framework

Django Rest Framework integrates well with models to provide out of the box validation, and ModelSerializers allow futher fine-grained custom validation. However, if you’re not using a model as the resource of your endpoint then the code required for custom validation of complex data structure can get hairy.

If there is a heavily nested data structure then a serializer can be used that has a nested serializer, which also has a nested serializer, and so on – or a JSON schema and custom JSON parser can be employed.

Using a JSON schema generation tool also provides a quick win: generate the canonical “pattern” of valid JSON structure using data known to be of the correct structure. This post will go through doing this JSON schema validation using Django Rest Framework.

Note my folder structure is like so:



If you find yourself in the following situations then this approach should come in useful:

  • Storing data from external service when you don’t have control over schema and don’t want to replicate it in a database.
  • Data not related to a specific resource.
  • Endpoint for saving data serialized from taffy.
  • Need to store data in flat file instead of database due to a technical constraint.

JSON Schema

We will validate the JSON posted to out endpoint against a JSON schema we define. The JSON schema standard is not yet finalized, but in a mature enough for our usecase. This example uses the jsonschema python package, and the following data:

Note for sake of briefity the example data structure below is simple, but just pretend its complex. If the data structure was as simple as bellow a serializer should be used.

json = {
    "name": "Product",
    "properties": {
        "name": {
            "type": "string",
            "required": True
        "price": {
            "type": "number",
            "minimum": 0,
            "required": True
        "tags": {
            "type": "array",
            "items": {"type": "string"}
        "stock": {
            "type": "object",
            "properties": {
                "warehouse": {"type": "number"},
                "retail": {"type": "number"}

# The JSON Schema above can be used to test the validity of the JSON code below:
example_data = {
    "name": "Foo",
    "price": 123,
    "tags": ["Bar", "Eek"],
    "stock": {
        "warehouse": 300,
        "retail": 20

For an easy look at validation in practice take a look here


Now to the impliment the DRF endpoint that uses JSON schema validation:

from rest_framework.exceptions import ParseError
from rest_framework.response import Response
from rest_framework import views

from . import negotiators, parsers, utils

class ProductView(views.APIView):

    parser_classes = (parsers.JSONSchemaParser,)
    content_negotiation_class = negotiators.IgnoreClientContentNegotiation

    def post(self, request, *args, **kwargs):
            # implicitly calls parser_classes
        except ParseError as error:
            return Response(
                'Invalid JSON - {0}'.format(error.message),
        return Response()
import jsonschema
from rest_framework.exceptions import ParseError
from rest_framework.parsers import JSONParser

from . import schemas

class JSONSchemaParser(JSONParser):

    def parse(self, stream, media_type=None, parser_context=None):
        data = super(JSONSchemaParser, self).parse(stream, media_type,
            jsonschema.validate(data, schemas.json)
        except ValueError as error:
            raise ParseError(detail=error.message)
            return data
from django.conf.urls.defaults import url
from django.conf.urls import patterns

from . import views

urlpatterns = patterns(
    url(r'^/api/product/$', views.ProductView.as_view(), name='product_view'),

Content negotation

The `parse` method on each parser in `parser_classes` will get called only if the request’s “Content-Type” header has a value that matches the ‘media_type’ attribute on the parser, which means the JSON schema validation will not go ahead if no “Content-Type” header is set. If the schema validation must go ahead, I see a few options:

    • Assign `parser_classes = (PlainSchemaParser, JSONSchemaParser, XMLSchemaParser, YAMLSchemaParser, etc)` on ProductView and define the YAML, XML, etc schemas.
    • Force the view to use JSONSchemaParser parser regardless of if the client requests JSON, XML, etc.

To keep it simple this example will choose the second option by using custom content negotiation (which is pulled directly from DRF docs:


from rest_framework.negotiation import BaseContentNegotiation

class IgnoreClientContentNegotiation(BaseContentNegotiation):
    def select_parser(self, request, parsers):
        Select the first parser in the `.parser_classes` list.
        return parsers[0]

    def select_renderer(self, request, renderers, format_suffix):
        Select the first renderer in the `.renderer_classes` list.
        return (renderers[0], renderers[0].media_type)

Posting to the endpoint

lets define a simple webservice:

function createProduct(data){
    $.post('/api/product' data).
            var error = JSON.parse(xhr.responseText);
            console.log('error - ' + error.detail);

    "name": "Foo",
    "price": 123,
    "tags": ["Bar", "Eek"],
    "stock": {
        "warehouse": 300,
        "retail": 20
// OK

    "price": 123,
    "tags": ["Bar", "Eek"],
    "stock": {
        "warehouse": 300,
        "retail": 20
// error - name property is required

    "price": 123,
    "tags": ["Bar", "Eek"],
    "stock": {
        "warehouse": 300,
        "retail": 20
// error - price Number is less then the required minimum value


Word of caution: I find serializers much more maintainable than JSON schemas, so if you are 50/50 of whether to use JSON schema validation or a serializer then I suggest going for the serializer.

Authenticate using Django Rest Framework and Angular

In a previous post we went through how to authenticated using a DRF endpoint.

This post will expand on how to achieve persistent login with a one page app using Angular. Note as there are 2 frameworks in play there will be some need to workaround integrations problems. It doesn’t get messy, which is a testiment to the quality of the respective frameworks!

For a demo check out this a silly microblog tool I made where you can create an account and log into it: github

Note my folder strucure is like so:

                ...[all third party js files]...

Create the Angular module

In app.js we create a module that will call the DRF endpoints in order to authenticate:

angular.module('authApp', ['ngResource']).
    config(['$httpProvider', function($httpProvider){
        // django and angular both support csrf tokens. This tells
        // angular which cookie to add to what header.
        $httpProvider.defaults.xsrfCookieName = 'csrftoken';
        $httpProvider.defaults.xsrfHeaderName = 'X-CSRFToken';
    factory('api', function($resource){
        function add_auth_header(data, headersGetter){
            // as per HTTP authentication spec [1], credentials must be
            // encoded in base64. Lets use window.btoa [2]
            var headers = headersGetter();
            headers['Authorization'] = ('Basic ' + btoa(data.username +
                                        ':' + data.password));
        // defining the endpoints. Note we escape url trailing dashes: Angular
        // strips unescaped trailing slashes. Problem as Django redirects urls
        // not ending in slashes to url that ends in slash for SEO reasons, unless
        // we tell Django not to [3]. This is a problem as the POST data cannot
        // be sent with the redirect. So we want Angular to not strip the slashes!
        return {
            auth: $resource('/api/auth\\/', {}, {
                login: {method: 'POST', transformRequest: add_auth_header},
                logout: {method: 'DELETE'}
            users: $resource('/api/users\\/', {}, {
                create: {method: 'POST'}
    controller('authController', function($scope, api) {
        // Angular does not detect auto-fill or auto-complete. If the browser
        // autofills "username", Angular will be unaware of this and think
        // the $scope.username is blank. To workaround this we use the
        // autofill-event polyfill [4][5]
        $('#id_auth_form input').checkAndTriggerAutoFillEvent();

        $scope.getCredentials = function(){
            return {username: $scope.username, password: $scope.password};

        $scope.login = function(){
                        // on good username and password
                        $scope.user = data.username;
                        // on incorrect username and password

        $scope.logout = function(){
                $scope.user = undefined;
        $scope.register = function($event){
            // prevent login form from firing
            // create user and immediatly login on success

// [1]
// [2]
// [3]
// [4]
// [5]

Create the Angular view

Django and Angular both use curly braces to denote variables. Below we want Django to take care of defining the user (‘{{user.username}}’) based on the value passed in the Context to the template, while Angular should take care of everything else. We use verbatim template tag to prevent Django from stealing Angular’s thunder. In one_page_app.html we do:

<html ng-cloak ng-app>
	<link rel="stylesheet" href="/static/js/angular/angular-csp.css"/>
	<link rel="stylesheet" href="/static/js/bootstrap/dist/css/bootstrap.min.css"/>
<div class="navbar navbar-fixed-bottom" ng-controller="authController"
{% verbatim %}
<div ng-show="user" class="navbar-form navbar-right">
<input type="submit" class="btn btn-default" ng-click="logout()"
value="logout {{user}}"/>
<div ng-hide="user">
<form id="id_auth_form" class="navbar-form navbar-right"
<div class="form-group">
<input ng-model="username" required name="username"
type="text" placeholder="username" class="form-control">
<div class="form-group">
<input ng-model="password" required name="password" type="password"
placeholder="password" class="form-control">
<div class="btn-group">
<input type="submit" class="btn btn-default" value="login">
<input type="submit" class="btn btn-default" value="register"
<script src="/static/js/jquery/jquery.js"></script>
<script src="/static/js/bootstrap/dist/js/bootstrap.min.js"></script>
<script src="/static/js/angular/angular.min.js"></script>
<script src="/static/js/angular-resource/angular-resource.min.js"></script>
<script src="/static/js/autofill-event/src/autofill-event.js"></script>
<script src="/static/js/app.js"></script>
{% endverbatim %}


Bower is fantastic tool to manage front-end requirements. Lets define bower.json like so:

  "dependencies": {
    "angular": "1.3.0-build.2419+sha.fe0e434",
    "angular-resource": "1.3.0-build.2419+sha.fe0e434",
    "autofill-event": "1.0.0",
    "jquery": "1.8.2",
    "bootstrap": "3.0.0",

DRF Endpoint

We also need to update the apps.accounts.api.AuthView created in previous post: we need to call django.contrib.auth.login in order to attach the session id the request object, and cause the creation a Set-Cookie header on the respons, which causes the browser to create the defined cookie and so we will achieve persistent login upon reloading the page:

from django.contrib.auth import login, logout

class AuthView(APIView):
    authentication_classes = (QuietBasicAuthentication,)

    def post(self, request, *args, **kwargs):
        login(request, request.user)
        return Response(UserSerializer(request.user).data)

    def delete(self, request, *args, **kwargs):
        return Response({})

django.contrib.auth.logout stops the session, and sends instructions to browser to make the cookie expire.

Serving the view

We also of course need to serve the one_page_app.html, so in we do

from django.views.generic.base import TemplateView

class OnePageAppView(TemplateView):
    template_name = 'one_page_app.html'

And in core.urls we route a url to the Django view:

from django.conf.urls import patterns, include, url

from . import views

urlpatterns = patterns('',
    url(r'^$', views.OnePageAppView.as_view(), name='home'),

With this all plugged in we have a one page app that:

– Shows login form if user isn’t authenticated with their Django session cookie.
– Shows logout if user is authenticated by either session cookie or username and password.
– Allows logging in in a RESTful way.
– Allows creating new user and automatically logs after creation.
– Alerts login and registration errors.
– Took only about 150 lines of code to achieve.

Check credentials using Django Rest Framework

This post will cover how to authenticate a user’s username and password using a Django Rest Framework endpoint. This functionality allows checking credentials without the need for refreshing the browser.

This is a follow up to my previous post, which covered how to create a User model DRF endpoint (allowing creating, listing, deleting users).

Create the endpoint

note my folder strucure is like so:


In we need to create the AuthView which we will call to check credentials:

from . import authentication, serializers  # see previous post[1] for user serializer.

class AuthView(APIView):
    authentication_classes = (authentication.QuietBasicAuthentication,)
    serializer_class = serializers.UserSerializer

    def post(self, request, *args, **kwargs):
        return Response(self.serializer_class(request.user).data)


Then in we define our authenticator: We inherit DRF’s BasticAuthentication to check the HTTP_AUTHORIZATION header for correct username and password.

from rest_framework.authentication import BasicAuthentication

class QuietBasicAuthentication(BasicAuthentication):
    # disclaimer: once the user is logged in, this should NOT be used as a
    # substitute for SessionAuthentication, which uses the django session cookie,
    # rather it can check credentials before a session cookie has been granted.
    def authenticate_header(self, request):
        return 'xBasic realm="%s"' % self.www_authenticate_realm

Notice we’re also overriding BasicAuthentication’s authenticate_header method to prevent undesirable behaviour: by default with Basic authentication if user provides wrong credentials the browser prompts the user for their credentials again using a native dialogue box. Rather ugly and bad user experience. To avoid this we ensure the schema returns a custom value other than ‘Basic‘.

We of course need to define the url in

from django.conf.urls import patterns, url

from api import views as api_views

urlpatterns = patterns(

Using the endpoint

Now we can do some fun stuff – attempt to authenticate using the endpoint. We need to set the header of the ajax request. We can do this by hooking in with $.ajax’s beforeSend:

function checkCredentials(username, password){
    function setHeader(xhr) {
        // as per HTTP authentication spec [2], credentials must be
        // encoded in base64. Lets use window.btoa [3]
        xhr.setRequestHeader ("Authorization", "Basic " +
                               btoa(username + ':' password));

    $.ajax({type: "POST",  url: "/api/auth/",  beforeSend: setHeader}).
            console.log('bad credentials.')
            console.log('welcome ' +


and to finally use it:

// pass in bad credentials...
checkCredentials('AzureDiamond', 'password');
// ...prints 'bad credentials.'

// pass in good credentials...
checkCredentials('AzureDiamond', 'hunter2');
// ...prints 'welcome'

Security considerations

Only serve this endpoint over HTTPS, in the same way you should serve a conventional login page using HTTPS. We dont wan’t a man in the middle attack getting our auth header!

In a future post we cover how to use this functionality to have persistent login.

Django Rest Framework User Endpoint

DRF is an awesome tool for building web APIs the RESTful way: allowing you to interact with your database like so:

POST request to /api/articles with hdata payload to create new Article instance. PATCH request to /api/articles/1/ to partially update Article with pk of PUT request to /api/articles/1/ to completely replace Article with pk of 1. DELETE request to /api/articles/1/ to delete Article with pk of 1. GET request to /api/articles/1/ to retrieve Article with pk of 1. HEAD request to /api/articles/1/ to see if Article with pk of exists.

Gone are the days when we POST data like

POST request to /api/create_article with data payload POST request to /api/update_article with data payload POST request to /api/delete_article with data payload

The main advantage I see in RESTful is it gives us sane restrictions we must develop under and in doing so so helps organize our web apps – thereby avoid accidentally being “clever” and implementing a hard to maintain codebase. Since using DRF I found adding new features to my apps quicker, DRYer and with more readable code.

If you are a django user and interested in DRF, do pip install djangorestframework. For existing DRF users note make sure you have DRF >= 2.3.11, which now supports write_only_fields.

Defining the endpoint

Below we go through how to expose the User model to the web using a DRF endpoint to allow creating, updating, listing, deleting User objects. Note my folder structure is:


We need to create a view that will serve list and detail view of users:

from django.contrib.auth.models import User
from rest_framework viewsets
from rest_framework.permissions import AllowAny

from .permissions import IsStaffOrTargetUser

class UserView(viewsets.ModelViewSet):
    serializer_class = UserSerializer
    model = User

    def get_permissions(self):
        # allow non-authenticated user to create via POST
        return (AllowAny() if self.request.method == 'POST'
                else IsStaffOrTargetUser()),

We need to be careful with permissions – we dont want users to be able to view other user objects if they are not staff members.

from rest_framework import permissions

class IsStaffOrTargetUser(permissions.BasePermission):
    def has_permission(self, request, view):
        # allow user to list all users if logged in user is staff
        return view.action == 'retrieve' or request.user.is_staff

    def has_object_permission(self, request, view, obj):
        # allow logged in user to view own details, allows staff to view all records
        return request.user.is_staff or obj == request.user

Next we define the serializer that will serialize Querysets and objects to JSON. We need to be careful on create of User object to handle passwords correctly, and on read not to serialize and return the password to the client.

from django.contrib.auth.models import User
from rest_framework import serializers

class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('password', 'first_name', 'last_name', 'email',)
        write_only_fields = ('password',)
        read_only_fields = ('is_staff', 'is_superuser', 'is_active', 'date_joined',)

    def restore_object(self, attrs, instance=None):
        # call set_password on user object. Without this
        # the password will be stored in plain text.
        user = super(UserSerializer, self).restore_object(attrs, instance)
        return user

Now register the endpoint in the app’s

from django.conf.urls import patterns, url, include
from rest_framework import routers

from . import api

router = routers.DefaultRouter()
router.register(r'accounts', api.views.UserView, 'list')

urlpatterns = patterns(
    url(r'^api/', include(router.urls)),


Using the endpoint

Below are examples of calling the endpoint using jQuery, showing the request and the data returned.

Create new user

var data = {username: '', password: '****', ...};
$.post('/api/accounts/', data).done(function(data){
{first_name: "New"
 last_name: "User",
 email: "",

Update user details

var data = {email: ''};
$.ajax({url: '/api/accounts/4', type: 'patch', data: data}).done(function(data){
{first_name: "New"
 last_name: "User",
 email: "",

List all users when logged in as staff

[{first_name: "Richard"
  last_name: "Tier"
  email: "",
  id: 1},
{first_name: "John"
 last_name: "Doe",
 email: "",
 id: 2},
{first_name: "Jane"
 last_name: "Doe",
 email: "",
 id: 3}];

Retrieve own record when logged in as

{first_name: "John"
 last_name: "Doe",
 email: "",
 id: 2}

Retrieve’s record when NOT logged in as staff member and NOT user

{detail: "You do not have permission to perform this action."}

In a follow up post we cover checking username and password using DRF.

Selenium in a hurry

I love tools that save me time. Then I start using them. Then using them turns into a chore. A tool I love is a tool I haven’t used enough yet.

I started using Selenium for end to end testing my web apps, and Selenium went from awesome to  chore quite quickly. My gripe was verbosity:

browser.find_element_by_id('user-' + str(user_id)).click()
(browser.find_element_by_class_name('user-info-' + str(user_id)

Writing it took longer than I wanted, and it will take longer than it should to read, and future-me will be negatively affected when maintaining it. Over-verbosity is not my friend – I think a developer is not here to write code: I think I’m here to create features and I prefer doing it with less code where appropriate. Maybe jQuery has fooled me into thinking clicking on a DOM element can be as simple as:


Sure, I appreciate the Selenium project attempts to maintain similar syntax across it’s many implementations. Pick up a Java, C, or Python implementation and if you can write in one implementation then you can read it in another. jQuery doesn’t have that issue as its only implemented in javascript. However, I’m not in the business of reading Selenium scripts across multiple languages.

I work with Python in a high pressure agile environment where I’m happy to make my Selenium scripts more Pythonic in order improve my workflow. I dont stand alone in this opinion: the official ElasticSearch library has been implemented across many different languages – and each one focuses on aligning with the language it was implemented on rather than attempting to maintain a similar syntax across several languages. I prefer this approach.

My wrapper makes selenium what I consider more developer-friendly, to give me sugar that allows:

browser.find('#user-', user_id).click()
browser.find('#user-info-', user_id).find('.edit').click()

Compared to the vanilla Selenium its 40% less code and I think 40% more awesome. With that defense laid down, below we go through how it was implemented.

from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.firefox.webdriver import WebDriver as Firefox
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

class CustomFirefoxDriver(Firefox):
    web driver that returns CustomWebElement
    when get_element_by_*, etc are called.


    def __init__(self)
        super(CustomFirefoxDriver, self).__init__()
        self.find = ElementFinder(self)

    def create_web_element(self, element_id):
        return CustomWebElement(self, element_id)

class CustomWebElement(WebElement):
    """Attach ElementFinder instance to web elements."""

    def __init__(self, *args, **kwargs):
        self.find = ElementFinder(self)
        super(CustomWebElement, self).__init__(*args, **kwargs)

class ElementFinder(object):
    non-verbosely find elements. routes to find_element_by_id,
    find_element_by_class_name, etc.


    def __init__(self, element):
        self.element = element

    def __call__(self, many=False, *args, **kwargs):
        if many is True then find_elements_by_*, else find_element_by_*

        note args will be concatenated to create the selector


        selector = ''.join(map(str, args))
        prepend = 'find_element{0}_by_'.format('s' if many else '')
        # rough approximate check if selector is complex css selector, if not then
        # select element by id, class, or tag, and fall back to css_selector
        punctuation = """!"#$%&'()*+,./:;<=>?@[\]^`{|}~"""
        if sum(map(selector.count, punctuation)):
            append = 'css_selector'
        elif selector[0] == '#':
            append = 'id'
            selector = selector[1:]
        elif selector[0] == '.':
            append = 'class_name'
            selector = selector[1:]
            append = 'tag_name'
            selector = selector

        # now we know the method name to call, lets do it
        return getattr(self.element, prepend + append)(selector)