Why CustomUser subclasses are not such a good idea

Background

The system I work on has People who may or may not be Users, and very infrequently Users who may not be a Person. In fact, an extension to the system means there will be more of the latter: a User who needs to generate reports (say, a Franchisor who should only be able to access aggregate data from franchises that might belong to multiple companies) but who is never rostered on for shifts, which is what the Person class is all about.

Anyway, the long and the short of this was that I thought it might be a good idea to look at sub-classing User for ManagementUser.

I guess I should have listened to those smarter than me who shouted that sub-classing User is not cool. They never gave any concrete reasons, but now I have one.

You cannot easily convert a superclass object to a specialised sub-class. Once a user is a User, it’s hard to make them into a ManagementUser.

It can be done: the following code takes a subclass of User (or of whatever the parent class is), an existing User (or other parent class) object, and any other keyword arguments that should be passed into the constructor. It saves the newly upgraded object, and returns it.

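def create_subclass(SubClass, old_instance, **kwargs):
    # Extra keyword arguments are passed through to the subclass constructor.
    new_instance = SubClass(**kwargs)
    # Copy every concrete field value across from the parent-class instance.
    for field in old_instance._meta.local_fields:
        setattr(new_instance, field.name, getattr(old_instance, field.name))
    new_instance.save()
    return new_instance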

However, it really should check that there isn’t an existing instance, and maybe some other checks.

What advantages does sub-classing have?

The biggest advantage, or so I thought, was to have it so you can automatically downcast your models on user login, and then get access to the extended user details. For instance, if your authentication backend automatically converts User to Person, then you can get access to the Person’s attributes (like the company they work for, their shifts, etc) without an extra level of attribute access:

# request.user is always an auth.User instance:
request.user.person.company
# request.user might be a person, etc.
request.user.company

But it turns out that even this is bad. Now, in guard decorators on view functions, you cannot just test the value of an attribute, as not all users will have that attribute. Instead, you need to test to see if the attribute exists, and then test the attribute itself.
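
To make the two-step check concrete, here is a minimal plain-Python sketch. The classes PlainUser and PersonUser and the helper has_company are hypothetical stand-ins (not Django's own classes); they only illustrate why a guard has to test for the attribute before testing its value.

```python
class PlainUser(object):
    """Stand-in for a user with no Person details (hypothetical)."""
    def __init__(self, username):
        self.username = username


class PersonUser(PlainUser):
    """Stand-in for a downcast user that carries extra attributes."""
    def __init__(self, username, company=None):
        super(PersonUser, self).__init__(username)
        self.company = company


def has_company(user):
    # A bare `user.company` raises AttributeError for PlainUser, so the
    # guard has to cover both "attribute missing" and "attribute empty".
    return getattr(user, "company", None) is not None
```

Every guard that touches a subclass-only attribute ends up needing this sort of defensive lookup, which is exactly the extra work the downcasting was meant to save.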

So, what do you do instead?

The preferred method in Django for extending User is to use a UserProfile class. This is just a model that has a OneToOneField linked back to User. I would look at doing a very small amount of duck-punching just to make getting hold of the profile object easier:

import logging
from django.contrib.auth.models import User
from django.db import models

class Person(models.Model):
    user = models.OneToOneField(User, related_name="_person")
    date_of_birth = models.DateField(null=True, blank=True)

def get_person(user):
    try:
        return user._person
    except Person.DoesNotExist:
        pass

def set_person(user, person):
    user._person = person

if hasattr(User, 'person'):
    logging.error('Model User already has an attribute "person".')
else:
    User.person = property(get_person, set_person)

By having the person’s related name attribute as _person, we can wrap read access to it in an exception handler, and then use a view decorator like:

from django.contrib.auth.decorators import user_passes_test

@user_passes_test(lambda u: u.person)
def person_only_view(request, **kwargs):
    pass

We know this view will only be available to logged in users who have a related Person object.

I will point out that I am duck-punching/monkey-patching here. However, I feel that this particular method of doing it is relatively safe. I check before adding the property, and in reality I probably would raise an exception rather than just log an error.
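
As a rough illustration of the "raise rather than log" variant, here is a small sketch in plain Python. The helper add_property_safely and the Account class are names I have made up for the example; Account just stands in for a class you do not own, such as User.

```python
def add_property_safely(cls, name, getter, setter=None):
    """Attach a property to cls, refusing to overwrite an existing attribute."""
    if hasattr(cls, name):
        raise AttributeError(
            "%s already has an attribute %r" % (cls.__name__, name))
    setattr(cls, name, property(getter, setter))


class Account(object):
    """Hypothetical stand-in for a third-party class such as User."""
    def __init__(self):
        self._person = None


def get_person(account):
    return account._person


def set_person(account, person):
    account._person = person


# The first call succeeds; a second call with the same name would raise.
add_property_safely(Account, "person", get_person, set_person)
```

Failing loudly at import time means a clash with a future attribute of the patched class is noticed immediately, rather than being buried in a log file.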

BBEdit - Strip Outer HTML tags

So, I monitor the BBEdit Google group, now that I’m a paid-up BBEdit user. One question piqued my interest today, and here is my solution:

tell application "BBEdit"
	tell front window
		set cursorPos to characterOffset of selection
		balance tags
		set startPos to characterOffset of selection
		set endPos to startPos + (length of selection)
		select (characters (startPos - 6) thru (endPos + 6))
		set selectedText to selection as text
		if characters 1 thru 6 of selectedText as text is equal to "<span>" then
			set replaceText to characters startPos thru (endPos - 1) as text
			set selection to replaceText
			select insertion point before character (cursorPos - 6)
		else
			select insertion point before character (cursorPos)
		end if
	end tell
end tell

In summary, it uses the BBEdit builtin command to select the contents of the current tag, and then extends that selection to grab the span tags that surround it. If it was indeed a span block, then it removes those tags.

This is just a simple one-off, but it might be useful as a basis for generating a script that has more features: like arbitrary tag types (rather than just span), or some other thing I haven’t thought of.

Note that it will only strip the outer tags. BBEdit has a Remove Markup feature, but that does not seem to be accessible using AppleScript.

Dreamweaver Password Decoding

For future reference:

def decode_dreamweaver_password(encoded):
    output = ""
    for i in range(0, len(encoded), 2):
        # Each pair of hex digits is one character, offset by its position.
        val = int(encoded[i:i+2], 16) - i // 2
        output += chr(val)
    return output
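
A quick way to sanity-check a decoder like this is a round trip. The sketch below assumes the encoding is simply the inverse of the decoding step (each character's code plus its position, written as two hex digits), which should hold for short ASCII passwords. The encode_dreamweaver_password helper is hypothetical, written only for this check, and the decoder is repeated so the snippet runs on its own.

```python
def encode_dreamweaver_password(plain):
    # Assumed inverse of the decoder: character code plus position,
    # written as two upper-case hex digits.
    return "".join("%02X" % (ord(ch) + pos) for pos, ch in enumerate(plain))


def decode_dreamweaver_password(encoded):
    output = ""
    for i in range(0, len(encoded), 2):
        # Each pair of hex digits is one character, offset by its position.
        output += chr(int(encoded[i:i + 2], 16) - i // 2)
    return output


round_trip = decode_dreamweaver_password(encode_dreamweaver_password("abc"))
```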

Knockout Collection

I am loving KnockoutJS. It makes it super easy to bind data values to UI elements in a declarative manner. You no longer have to worry about callbacks updating your data model and/or your view widgets.

The addition to KnockoutJS that I have been working on is a ‘collection’, that can be used to contain a set of objects, which can be fetched from a server, and each of which has its own resource URI that will be used to update or delete it.

For instance, we may have a collection URI:

GET "http://example.com/people/"

When we access this using a GET request, we might see something like:

[
  {
    "first_name": "Adam",
    "last_name": "Smith",
    "links": [
      {"rel":"self", "uri": "http://example.com/people/552/"}
    ]
  },
  {
    "first_name": "John",
    "last_name": "Citizen",
    "links": [
      {"rel":"self", "uri": "http://example.com/people/32/"}
    ]
  }  
]

Each linked resource contains the full (or as much as the logged-in user is able to see) representation. Example:

GET "http://example.com/people/552/"
{
  "first_name": "Adam",
  "last_name": "Smith",
  "date_of_birth": "1910-02-11",
  "email": "adam.smith@example.com",
  "links": [
    {"rel":"self", "uri": "http://example.com/people/552/"}
  ]
}

Now, this is just the beginning. Obviously, we want to turn all of these fields into observables. I also wanted to know when any data had changed (so the “Save” button can be disabled when the object is not dirty). Clearly, we also need to be able to write the data back to the server, as well as create new objects and delete them. Further, I needed to be able to do conditional reads and writes (only allow the object to be saved if no-one else has touched it since we last fetched it).
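
The conditional reads and writes are the usual ETag dance. As a rough sketch of what the server side of that looks like, here is a toy in-memory resource in Python. The Resource class and the way it derives its tag from a hash of the content are my own assumptions for illustration, not a description of the real server.

```python
import hashlib
import json


class Resource(object):
    """In-memory stand-in for one server-side resource (hypothetical)."""

    def __init__(self, data):
        self.data = dict(data)

    @property
    def etag(self):
        # Derive the tag from the content, so any change gives a new tag.
        body = json.dumps(self.data, sort_keys=True).encode("utf-8")
        return hashlib.sha1(body).hexdigest()

    def get(self, if_none_match=None):
        # Conditional read: nothing to send if the client copy is current.
        if if_none_match == self.etag:
            return 304, None, self.etag
        return 200, dict(self.data), self.etag

    def post(self, changes, if_match=None):
        # Conditional write: refuse if the client's copy is stale.
        if if_match != self.etag:
            return 412, dict(self.data), self.etag
        self.data.update(changes)
        return 200, dict(self.data), self.etag
```

A client that remembers the tag from its last fetch gets a 304 on an unchanged re-read, and a 412 (with the current data) if it tries to save over somebody else's change.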

The place where the ko.mapping plugin broke down for me was that updating the resource from the full representation didn’t add the new fields that came back from the server. It may be that indeed this is possible (I think it is), but at the time, I could not see how to do this. It may be that I will rewrite this to use the ko.mapping stuff, but I’m not so sure right now.

Anyway, after a couple of revisions, I have a working framework.

To use it, you can just do:

// Add a dependentObservable called 'name'.
var processPerson = function(item) {
  item.name = ko.dependentObservable(function(){
    return item.first_name() + ' ' + item.last_name();
  });
};

var people = ko.collection({
  url: "http://example.com/people/",
  processItem: processPerson
});

There is one main caveat at this stage:

  • It is expected that each object will have a ‘name’ property. If your server does not return one, you’ll need to set up a dependentObservable as shown in processPerson above.

First, the ko.collection object:

ko.collection = function(options) {
  // Let jQuery know we always want JSON
  $.ajaxSetup({
    contentType: 'application/json',
    dataType: 'json',
    cache: false // This is browser cache! Needs to be set for Firefox.
  });
  
  options = options || {};
  var url = options.url;                  // Allow passing in a url.
  var processItem = options.processItem;  // Allow passing in a function to process each item after it is fetched.
  var etag;
  
  
  // Initial setup. We need to set these early so we can access them, even
  // if we have no data for them.
  var self = {
    items: ko.observableArray([]),
    selectedItem: ko.observable(null),
    selectedIndexes: ko.observableArray([]),
    filters: ko.observable({})
  };
  
  /*
  Message handling.
  
  We have a messages observableArray, but we use this dependent observable
  to access it. This allows us to have messages that expire.
  
  self.messages() => provide access to the array of messages.
  self.messages({
    type: "error|notice|warning|whatever",    => This will usually be used to apply a class
    message: "Text of message",               => This text will be displayed
    timeout: 1500                             => If this is non-zero, message expires (and 
                                                 is automatically removed after this many 
                                                 milliseconds)
  });
  
  Every message object gets given a callback function (.remove()), that,
  when executed, will immediately remove that message, and get rid of the
  timer that normally removes that message after timeout.
  
  The messages object is also given a flush() function, that will remove
  all of the messages within it.
  
  Not sure if I should move this to a separate plugin?
  */
  var messages = ko.observableArray([]);
  self.messages = ko.dependentObservable({
    read: function() {
      return messages();
    },
    write: function(message) {
      var timeout;
      message.remove = function() {
        messages.remove(message);
        clearTimeout(timeout);
      };
      messages.remove(function(item) {
        return item.message === message.message;
      });
      messages.push(message);
      if (message.timeout) {
        timeout = setTimeout(function(){
          messages.remove(message);
        }, message.timeout);
      }
    }
  });
  self.messages.flush = function() {
    // Iterate over a copy, as remove() mutates the underlying array.
    $.each(messages().slice(), function(i, message){
      message.remove();
    });
  };
    
  /*
  filteredItems : a subset of self.items() that has been passed through
                  all of the self.filters(), and selects only those that
                  match. A filter must be an object of the form:
                  {
                    value: ko.observable(""),
                    attr: "name",
                    test: function(test_value, obj_value) {}
                  }
                  
                  The filtering code handles getting the correct values to
                  pass to the test function, the attr is the name of the 
                  attribute on each member of self.items() that will be
                  tested.
                  Having 'value' passed in means we can have a default
                  value when app starts.
  */
  self.filteredItems = ko.dependentObservable(function() {
    var filteredItems = self.items();
    $.each(self.filters(), function(name, filt){
      filteredItems = ko.utils.arrayFilter(filteredItems, function(item){
        if (!filt.attr || !item[filt.attr]) {
          return true;
        }
        return filt.test(filt.value(), item[filt.attr]());
      });
    });
    return filteredItems;
  });
  
  /*
    This is really only used by a select[multiple] object, and is used in
    conjunction with selectedIndexes.
    
    TODO: make this a writeable dependentObservable.
  */
  self.selectedItems = ko.dependentObservable(function() {
    return self.items().filter(function(el){
      return $.inArray(self.items().indexOf(el), self.selectedIndexes()) >= 0;
    });
  });
  
  /*
    Filter self.items() finding only those that have at least one attribute
    that is marked as dirty.
  */
  self.dirtyItems = ko.dependentObservable(function() {
    return self.items().filter(function(el){
      return el.isDirty();
    });
  });
  
  /*
    Filter self.items(), finding only those that have at least one attribute
    marked as conflicted.
  */
  self.conflictedItems = ko.dependentObservable(function() {
    return self.items().filter(function(el){
      return el.hasConflicts();
    });
  });
  
  self.setSource = function(newUrl) {
    url = newUrl;
  };
  
  /*
    Fetch all items from the url we have for the index.
    
    It is allowable that the index does not return the full body of each
    item, but instead only contains perhaps a name, and links for that
    item. Then, we can use self.selectedItem().fetch() to get the full
    data for the item.
  */
  self.fetchItems = function() {
    if (!url) {
      return;
    }
    var headers = {};
    if (etag) {
      headers['If-None-Match'] = etag;
    }
    $.ajax({
      url: url,
      type: "get",
      headers: headers,
      statusCode: {
        200: function(data, textStatus, jqXHR) {
          // Successful. If we already had objects, then
          // we need to update that list.
          $.each(self.items().slice(), function(i, item){
            // Iterate over a copy, as we may remove items from
            // self.items() along the way.
            // Is there an item in the new data items list that matches
            // the item we are now looking at?
            var matchingItem = data.filter(function(el){
              var links = el.links.filter(function(link){
                return link.rel==="self";
              });
              return links[0] && links[0].uri === item._url();
            })[0];
            if (matchingItem) {
              // Update the item that matched.
              item.updateData(matchingItem);
              if (processItem) {
                processItem(item);
              }
              // Remove from data.
              data.splice(data.indexOf(matchingItem), 1);
              // Not sure if this should be here.
              // item.isDirty(false);
            } else {
              // Not found in incoming data: remove from our local store.
              self.items.remove(item);
            }
          });
          
          // Any items that we have left in data (which will be all if we
          // haven't loaded this up before) now need to be added to items().
          // On a clean fetch, this will be the first code that is run.
          $.each(data, function(i, el){
            var item = ko.collectionItem(el, self);
            if (processItem) {
              processItem(item, el);
            }
            self.items.push(item);
          });
          
          // Finally, update the etag.
          etag = jqXHR.getResponseHeader('Etag');
        }
      }
    });
  };
  
  /*
    A shortcut method that allows us to bind an action to fetch the
    data from the server for the currently selected item.
  */
  self.fetchSelectedItemDetail = function(evt) {
    if (self.selectedItem && self.selectedItem()) {
      self.selectedItem().fetch();
    }
  };
  
  /*
    Create an item. I haven't implemented this yet, because I haven't 
    figured out a way to see what fields are needed to be created when
    there are no currently loaded items. I'm thinking about using a
    Wizard in my application, so this might be overridden by the app.
  */
  self.createItem = function(evt) {
    console.log("ADDING ITEM (NOT FINISHED YET)");
    // The trick here is knowing what fields need to be created.
    // self.items.push(ko.collectionItem({}));
  };
  
  /*
    Permanently remove the selectedItem, and delete it on the server.
  */
  self.removeSelectedItem = function(evt) {
    if (self.selectedItem && self.selectedItem()) {
      var sure = confirm("This will permanently remove " + self.selectedItem().name());
      if (sure){
        self.selectedItem().destroy();        
      }
    }
  };
  
  /*
    Iterate through self.items(), finding those that match all of the data
    we pass in.
    
    For instance, you can do things like: 
    
      viewModel.findMatchingItems({date_of_birth: "1995-01-01"})
    
    This is used internally to find matches for objects when updating. Not
    sure why it is exposed as a public member function though.
  */
  self.findMatchingItems = function(options) {
    return self.items().filter(function(el){
      var match = true;
      $.each(options, function(opt, val) {
        if (el[opt]() !== val) {
          // Returning false causes $.each to stop, too.
          return match = false;
        }
      });
      return match;
    });
  };
    
  if (url) {
    self.fetchItems();
  }
  
  return ko.observable(self);
};
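
The filteredItems pass above is easier to see without the observables in the way. Here is a rough Python sketch of the same idea; apply_filters and the dict-based items are stand-ins I made up for illustration, not part of the collection code.

```python
def apply_filters(items, filters):
    """Keep only the items that pass every filter.

    Each filter is a dict with 'value', 'attr' and 'test' keys. An item
    that lacks the attribute is let through, mirroring the JavaScript.
    """
    for filt in filters.values():
        attr = filt.get("attr")
        items = [
            item for item in items
            if not attr or attr not in item
            or filt["test"](filt["value"], item[attr])
        ]
    return items


people = [{"name": "Adam Smith"}, {"name": "John Citizen"}, {"age": 3}]
name_filter = {
    "value": "adam",
    "attr": "name",
    # Case-insensitive "contains" test: (filter value, item value).
    "test": lambda wanted, actual: wanted.lower() in actual.lower(),
}
matches = apply_filters(people, {"name": name_filter})
```

Each filter narrows the list produced by the previous one, so the order of the filters does not change the final result.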

Second, the ko.collectionItem object. This may eventually be hidden in the collection object, as it isn’t really intended to be used separately.

ko.collectionItem = function(initialData, parentCollection) {
  var self = {
    isFetched: ko.observable(false)
  };
  var links = [];
  var etag = null;
  var url = null;
  var attributes = ko.observableArray([]);
  var collection = parentCollection;
  var dirtyFlag = ko.observable(false);
  
  /* Private methods */
  
  /*
    Given the incoming 'data' for this object, look through the fields for
    things that differ between the server representation and the client
    representation. Store both values for any differences in an attribute
    of the observable called conflicts().
    
    For each conflict, create a member function on the observable that
    allows you to resolve the conflict. When the last conflict is resolved,
    our etag is updated to the value the server gave us.
    
    This method returns true if all conflicts could be resolved (ie, the
    data in all fields was the same, just the etag had changed).
  */
  var parseConflicts = function(data, newEtag) {
    $.each(data, function(attr, value){
      if (attr !== "links") {
        if ($.compare(value, self[attr]() === undefined ? "" : self[attr]())) {
          // Server and client values match.
          // We need to do some funky stuff with undefined values, and treat
          // them as "". I don't really like this, but it works for now.
          self[attr].conflicts([]);
          self[attr].resolveConflict = function(){};
        } else {
          self[attr].conflicts([value, self[attr]() === undefined ? "" : self[attr]()]);
          self[attr].resolveConflict = function(chosenValue) {
            // Mark the entire object as dirty, so we can allow it to be
            // saved, even if we set it to the original value we had (which
            // differed from the server's value).
            self.isDirty(true);
            self[attr](chosenValue);
            self[attr].conflicts([]);
            if (!self.hasConflicts()) {
              // If this was the last conflict, we can use the new etag from
              // the server.
              etag = newEtag;
            }
          };
        }        
      }
    });
    var conflicts = self.hasConflicts();
    if (!conflicts) {
      etag = newEtag;
    }
    return !conflicts;
  };
  
  /*
  Given an object containing errors, we want to apply each of these
  errors onto the relevant field. We want to remove any errors that are
  already on any field.
  
  If we have any errors leftover, we need to notify globally, using the
  parentCollection's messages object.
  */
  var markErrors = function(errors) {
    $.each(attributes(), function(i,attr){
      if (!self[attr].errors) {
        self[attr].errors = ko.observableArray([]);
      }
      if (errors[attr]) {
        self[attr].errors(errors[attr]);
        delete errors[attr];
      } else {
        self[attr].errors([]);
      }
    });
    
    $.each(errors, function(field){
      parentCollection.messages({type:"error", message: field + ": " + errors[field].join("<br>"), timeout: 3000});
    });
  };
  
  /*
    Get the attributes ready for sending to the server.
    
    We can't just iterate through properties, as some will not apply. We
    use the convention that we will only send back properties that the
    server sent to us.
  */
  var prepareAttributes = function() {
    var data = {};
    $.each(attributes(), function(i,attr){
      data[attr] = self[attr]();
    });
    return data;
  };
  /* Public methods */
  
  /*
    Update the data fields associated with this object from the provided
    data.
    
    This may create new attributes, which need to be noted so we can send
    those values back to the server.
    
    We can mark all updated attributes as not dirty, not conflicted, and
    not having errors.
  */
  self.updateData = function(data) {
    if (data.links) {
      // We want to store the links, but not attach them to the object.
      links = data.links;
      delete data.links;
      $.each(links, function(i, obj){
        if (obj.rel === "self") {
          url = obj.uri;
        }
      });
    }
    
    $.each(data, function(attr, value){
      if (attributes().indexOf(attr) < 0) {
        self[attr] = ko.observable(value);
        self[attr].errors = ko.observableArray([]);
        self[attr].conflicts = ko.observableArray([]);
        ko.dirtyFlag(self[attr], false);
        // Need to add this last to cause the dirtyFields dependentObservable
        // to work correctly when editing the last field.
        attributes.push(attr);
      } else {
        self[attr](value);
        self[attr].errors([]);
        self[attr].conflicts([]);
        self[attr].isDirty.reset();
      }
    });
    // Put the links back in case a post-processor needs them.
    data.links = links;
  };
  
  self.serialize = function(evt) {
    return JSON.stringify(prepareAttributes());
  };
  
  /*
    Discard any local changes, and pull the data from the server.
  */
  self.revert = function(evt) {
    etag = null;
    self.fetch();
    parentCollection.messages({type:'warning', message:'The object "' + self.name() + '" was reverted to the version stored on the server.', timeout: 5000});
  };
  
  /*
    Attempt to save the data to the server.
    
    Only permitted to do this if we have successfully fetched the data
    at some point.
    
    Notes: We use POST instead of PUT, in case we do not have access to
           all of the fields of the object. PUT implies the complete resource
           is being updated.
           Errors may come back in {'field-errors': []}, or {'detail':[]}.
           Currently, this makes assumptions about server type, which are
           bad. I need to refactor the error handling code. (400,409)
           Precondition Failed (412) needs to be handled differently, as
           we need to fetch the data from the server if none was provided
           as to the current state of the resource.
           
  */
  self.save = function(evt) {
    if (self.isFetched()) {
      $.ajax({
        url: url,
        type: 'post', // We can't PUT in case we don't know about all fields.
        headers: {'If-Match': etag},
        data: self.serialize(),
        statusCode: {
          200: function(data, textStatus, jqXHR) {
            // Object saved.
            // In case some fields were reformatted by the server, redo our data.
            self.updateData(data);
            etag = jqXHR.getResponseHeader('Etag');
            parentCollection.messages({type:'success', message:'The object "' + self.name() + '" was saved.', timeout: 2500});
            self.isDirty(false);
          },
          201: function(data, textStatus, jqXHR) {
            // Object saved for the first time (created)
            // In case some fields were reformatted by the server, redo our data.
            self.updateData(data);
            etag = jqXHR.getResponseHeader('Etag');
            url = jqXHR.getResponseHeader('Location');
            parentCollection.messages({type:'success', message:'The object "' + self.name() + '" was created.', timeout: 2500});
            self.isDirty(false);
          },
          400: function(jqXHR, textStatus, errorThrown) {
            var data = JSON.parse(jqXHR.responseText);
            if (data['field-errors']) {
              markErrors(data['field-errors']);
            }
            parentCollection.messages({type:'error', message:'The object "' + self.name() + '" could not be saved. Please check the highlighted field(s).', timeout: 10000});
          },
          409: function(jqXHR, textStatus, errorThrown) {
            // Errors saving the data. Likely to be validation errors.
            // We should have a detail object with info to display.
            var data = JSON.parse(jqXHR.responseText);
            if (data.detail) {
              markErrors(data.detail);
            }
            parentCollection.messages({type:'error', message:'The object "' + self.name() + '" could not be saved. Please check the highlighted field(s).', timeout: 10000});
          },
          412: function(jqXHR, textStatus, errorThrown) {
            // Data was changed on server since we last fetched it.
            // There may be conflicts to deal with.
            // See if the server gave us a current version back...
            var data, serverEtag;
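            if (jqXHR.responseText) {
              data = JSON.parse(jqXHR.responseText);
              // Assumes the 412 response carries the current Etag header.
              serverEtag = jqXHR.getResponseHeader('Etag');
            } else {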
              $.ajax({
                url: url,
                async: false,
                success: function(newData, textStatus, jqXHR) {
                  data = newData;
                  serverEtag = jqXHR.getResponseHeader('Etag');
                }
              });
            }
            if (parseConflicts(data, serverEtag)) {
              // We were able to resolve all of the conflicts, now we can
              // try to re-save; but only if it was the first time we saved,
              // to prevent infinite recursion.
              if (evt) {
                self.save();
              }
            } else {
              parentCollection.messages({type:'error', message:'The object "' + self.name() + '" has been modified on the server. Please check the changed field(s) and select the appropriate value(s).', timeout: 10000});
            }
          }
        }
      });
    }
  };
  
  /*
    Permanently delete the object from the server.
  */
  self.destroy = function(evt) {
    if (self.isFetched() && etag) {
      console.log("DELETING ITEM");
      $.ajax({
        url: url,
        type: 'delete',
        headers: {'If-Match': etag},
        success: function(data, textStatus, jqXHR) {
          if (collection) {
            collection.items.remove(self);
          }
          parentCollection.messages({type:'success', message:'The object "' + self.name() + '" was deleted.', timeout: 2500});
        },
        error: function(jqXHR, textStatus, errorThrown) {
          // Display error message about not being able to delete?
          parentCollection.messages({type:'error', message:'The object "' + self.name() + '" could not be deleted.', timeout: 10000});
        }
      });
    }
  };
  
  /*
    (Re)Fetch the resource from the server.
    
    Handle conflicts if they arise (when the object has already been fetched).
  */
  self.fetch = function(evt) {
    var headers = {};
    if (etag) {
      headers['If-None-Match'] = etag;
    }
    $.ajax({
      type: 'get',
      url: url,
      headers: headers,
      statusCode: {
        200: function(data, textStatus, jqXHR) {
          // If we have an etag already, this means the object has been
          // updated on the server, and we need to look for conflicts.
          if (etag) {
            var serverEtag = jqXHR.getResponseHeader('Etag');
            // If we were unable to handle all conflicts, we need to exit.
            if (!parseConflicts(data, serverEtag)) {
              parentCollection.messages({type:'error', message:'The object "' + self.name() + '" has been modified on the server. Please check the changed field(s) and select the appropriate value(s).', timeout: 10000});
              return;
            }
          }
          
          // Otherwise, we can update the data and the etag.
          self.updateData(data);
          etag = jqXHR.getResponseHeader('Etag');
          self.isFetched(true);
        },
        304: function() {
        }
      },
      error: function(jqXHR, textStatus, errorThrown) {
        parentCollection.messages({type:"error", message:"There was an error fetching the data from the server"});
      }
    });
  };
  
  
  
  /* Dependent Observables */
  self.dirtyFields = ko.dependentObservable(function(){
    return ko.utils.arrayFilter(attributes(), function(attr){
      return self[attr] && self[attr].isDirty && self[attr].isDirty();
    });
  });
  
  self.conflictedFields = ko.dependentObservable(function() {
    return ko.utils.arrayFilter(attributes(), function(attr){
      return self[attr] && self[attr].conflicts && self[attr].conflicts().length > 0;
    });
  });
  
  var filterAttributes = function(property) {
    return function() {
      return ko.utils.arrayFilter(attributes(), function(attr){
        return self[attr] && self[attr][property] && self[attr][property]().length > 0;
      }).length > 0;
    };      
  };
  
  self.hasErrors = ko.dependentObservable(filterAttributes('errors'));
  self.hasConflicts = ko.dependentObservable(filterAttributes('conflicts'));
  
  /*
    An object is dirty when:
      - any of its fields/attributes are dirty. (we ask them), OR
      - we have explicitly marked it as dirty.
      
    We need to do the latter for when we have merged a conflict, by choosing
    our value, which differed from the server. The local model would
    normally think it wasn't dirty, but it differs from the server, and
    does need to be saved.
  */
  self.isDirty = ko.dependentObservable({
    read: function() {
      return self.dirtyFields().length > 0 || dirtyFlag();
    },
    write: function(value) {
      dirtyFlag(value);
      if (!value) {
        // $.each passes (index, value), so the attribute name is the second argument.
        $.each(attributes(), function(i, attr){
          if (self[attr] && self[attr].isDirty) {
            self[attr].isDirty.reset();
          }
        });
      }
    }
  });
  
  /*
    Can this object be saved back to the server?
    Only when it is dirty, and has been fetched.
  */
  self.canSave = ko.dependentObservable(function() {
    return self.isDirty() && self.isFetched();
  });
  
  self._etag = function(){return etag;};
  self._attributes = function(){ return attributes();};
  self._url = function() {return url;};
  
  if (initialData) {
    self.updateData(initialData);
  }
  
  return self;
};
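
Stripped of the observables, the comparison at the heart of parseConflicts is quite small. Here is a rough Python sketch of it; find_conflicts is a name I have made up, and plain dicts stand in for the server and client representations.

```python
def find_conflicts(server_data, client_data):
    """Return {field: (server_value, client_value)} for fields that differ.

    'links' is skipped, and a missing client value is treated as "",
    following the convention used in parseConflicts.
    """
    conflicts = {}
    for field, server_value in server_data.items():
        if field == "links":
            continue
        client_value = client_data.get(field, "")
        if server_value != client_value:
            conflicts[field] = (server_value, client_value)
    return conflicts


server = {"first_name": "Adam", "email": "a.smith@example.com", "links": []}
client = {"first_name": "Adam", "email": "adam.smith@example.com"}
conflicts = find_conflicts(server, client)
```

An empty result corresponds to the case where only the etag changed, so the new etag can be adopted straight away.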

Filtering querysets in django.contrib.admin forms

I make extensive use of the django admin interface. It is the primary tool for our support team to look at user data for our product, and I have stretched it in many ways to suit my needs.

One problem I often come back to is a need to filter querysets in forms and formsets. Specifically, the objects that should be presented to the admin user in a relationship to the currently viewed object should be filtered. In most cases, this is something as simple as making sure the Person and the Units they work at are within the same company.

There is a simple bit of boilerplate that can do this. You need to create a custom form, and attach this to the ModelAdmin for the parent object:

from django.contrib import admin
from django import forms
from models import Person, Unit

class PersonAdminForm(forms.ModelForm):
    class Meta:
        model = Person
    
    def __init__(self, *args, **kwargs):
        super(PersonAdminForm, self).__init__(*args, **kwargs)
        # This is the bit that matters:
        self.fields['units'].queryset = self.instance.company.units.all()

class PersonAdmin(admin.ModelAdmin):
    form = PersonAdminForm

In actuality, it is a little more complicated than this: you need to test if the selected object has a company, and really, if the user has changed the company (or selected it on a new person), you should use that instead. So the code looks a bit more like:

company = None
if self.data.get('company', None):
    try:
        company = Company.objects.get(pk=self.data['company'])
    except Company.DoesNotExist:
        pass
else:
    try:
        company = self.instance.company
    except Company.DoesNotExist:
        pass
if company:
    self.fields['units'].queryset = company.units.all()

Now, having to write all of that every time you have to filter the choices available wears rather thin. And wait until you need to do it to a formset instead: you need to also do stuff to the empty_form, so that when you dynamically add an inline form, it has the correct choices.

Enter FilteringForm, and her niece FilteringFormSet:

from django import forms
from django.core.exceptions import ObjectDoesNotExist

class FilterMixin(object):
    filters = {}
    instance_filters = {}
    def apply_filters(self, forms=None):
        # If we didn't get a forms argument, we apply to ourself.
        if forms is None:
            forms = [self]
        # We need to apply instance filters first, as they allow us to
        # select an attribute on our instance to be the queryset, and
        # then apply a filter onto that with filters.
        for field, attr in self.instance_filters.iteritems():
            # It may be using a related attribute. person.company.units
            tokens = attr.split('.')
            
            source = None
            # See if there is any incoming data first.
            if self.data.get(tokens[0], ''):
                try:
                    source = self.instance._meta.get_field_by_name(tokens[0])[0].rel.to.objects.get(pk=self.data[tokens[0]])
                except ObjectDoesNotExist:
                    pass
            # Else, look for a match on the object we already have stored
            if not source:
                try:
                    source = getattr(self.instance, tokens[0])
                except ObjectDoesNotExist:
                    pass
            
            # Now, look for child attributes.
            if source:
                for segment in tokens[1:]:
                    source = getattr(source, segment)
                if forms:
                    for form in forms:
                        form.fields[field].queryset = source
        
        # We can now apply any simple filters to the queryset.
        for field, q_filter in self.filters.iteritems():
            for form in forms:
                form.fields[field].queryset = form.fields[field].queryset.filter(q_filter)
    

class FilteringForm(forms.ModelForm, FilterMixin):
    def __init__(self, *args, **kwargs):
        super(FilteringForm, self).__init__(*args, **kwargs)
        self.apply_filters()

class FilteringFormSet(forms.models.BaseInlineFormSet, FilterMixin):
    filters = {}
    instance_filters = {}
    
    def __init__(self, *args, **kwargs):
        super(FilteringFormSet, self).__init__(*args, **kwargs)
        self.apply_filters(self.forms)
    
    def _get_empty_form(self, **kwargs):
        form = super(FilteringFormSet, self)._get_empty_form(**kwargs)
        self.apply_filters([form])
        return form
    empty_form = property(_get_empty_form)

Now, to use all of this, you still need to subclass, but you can declare the filters:

class PersonAdminForm(FilteringForm):
    class Meta:
        model = Person
    
    instance_filters = {
        'units': 'company.units'
    }

You can also have non-instance filters, and they will be applied after the instance_filters:

from django.db import models

class PersonAdminForm(FilteringForm):
    class Meta:
        model = Person
    
    instance_filters = {
        'units': 'company.units'
    }
    filters = {
        'units': models.Q(is_active=True)
    }

I think it might be nice to be able to add an extra set of filtering for the empty form in a formset, so you could make it that only choices that hadn’t already been selected, for instance, were the only ones available. But that isn’t an issue for me right now.

Displaying only objects without subclasses

Sometimes, the django.contrib.auth User model just doesn’t cut it.

I have bounced around between ways of handling this sorry fact: from a nasty system of Person-User relationships (which my production system still uses, and where, due to old legacy code, I need to keep the primary keys in sync), to monkey-patching User, using UserProfiles, and subclassing User.

First, a little on the nasty hack I have in place (and how that will affect my choices later on).

My project at work is a rostering system, where not everyone who is a Person in the system needs to be a User. For instance, most managers (who are Users) do not need their staff to be able to log in. However, the managers themselves must be a Person as well as a User if they are to be able to log in and also be rostered on.

Thus, there are many people in the system who are not Users. They don’t have a username, and may not even have an email address. Not that having an email address is that useful in the django User model, as there is no unique constraint upon that.

So, I am currently kind-of using Person as a UserProfile object, but there are Person instances that do not have an associated User, and some of these are required to have an email address, and have first and last names. So, there is lots of duplication across these two tables. Which need to be kept in sync.

The solution I am looking at now moves in the other direction.

A Person is a subclass of User. It has the extra data that we need for our business logic (mobile phone number, company they work for), but I have also monkey-patched User to not require a username. We are moving towards using email addresses for login names anyway, so that isn’t a problem. That has its own concerns (not everyone has a unique email address, but there are workarounds for that).

But not every User will have a Person attached. The admin team’s logins will not (and this will be used to allow them to masquerade as another user for testing and bug-hunting purposes). So, we can’t just ignore the User class altogether and do everything with the Person class.

This is all well and good. I have an authentication backend that will return a Person object instead of a User object (if one matches the credentials). Things are looking good.

Except then I look in the admin interface. And there we have all of the Person objects’ related User objects, in the User table. It would be nice if we only had the ‘pure’ Users in here, and all Person objects were just in their category.

So, I needed a way to filter this list.

Luckily, django’s admin has this capability. In my person/admin.py file, I had the following code:

from django.contrib import admin
from django.contrib import auth

class UserAdmin(auth.admin.UserAdmin):
    def queryset(self, request):
        return super(UserAdmin, self).queryset(request).filter(person=None)

admin.site.unregister(auth.models.User)
admin.site.register(auth.models.User, UserAdmin)

And, indeed, this works.

But then I found I needed another User subclass: a type of user that is distinct from Person (they are never rostered, are not associated with a given company, but do log into the system).

I wanted the changes to the admin to be isolated within the different apps, so I needed to be able to get the currently installed UserAdmin class, and subclass that to filter the queryset. So the code becomes (in both admin.py files):

from django.contrib import admin
from django.contrib import auth

BaseUserAdmin = type(admin.site._registry[auth.models.User])

class UserAdmin(BaseUserAdmin):
    def queryset(self, request):
        return super(UserAdmin, self).queryset(request).filter(foo=None)

admin.site.unregister(auth.models.User)
admin.site.register(auth.models.User, UserAdmin)

The only difference in the two files is the foo. This becomes whatever this sub-class’s name is. Thus, it is person in the person/admin.py file, and orguser in the orguser/admin.py file.
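Since the two files differ only in that one name, the boilerplate could be pulled into a shared helper. This is a minimal sketch rather than anything tested against a real project: hide_subclass_rows is a hypothetical name, and the site and model are passed in as arguments (you would pass admin.site and auth.models.User). It leans on the same private _registry attribute and the same queryset method as the code above, either of which may differ between Django versions.

```python
def hide_subclass_rows(site, model, child_name):
    """Re-register ``model`` so rows that have a ``child_name`` row are hidden.

    ``site`` is assumed to behave like django.contrib.admin.site, and
    ``child_name`` is the lowercase name of the subclass ('person', 'orguser').
    """
    # Subclass whatever admin class is currently installed, so that
    # filters added by other apps are kept and stack up.
    base_admin = type(site._registry[model])

    class FilteredAdmin(base_admin):
        def queryset(self, request):
            qs = super(FilteredAdmin, self).queryset(request)
            # Equivalent to .filter(person=None) for child_name='person'.
            return qs.filter(**{child_name: None})

    site.unregister(model)
    site.register(model, FilteredAdmin)
    return FilteredAdmin
```

Each admin.py would then only need a single call, such as hide_subclass_rows(admin.site, auth.models.User, 'person').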

The next step is to change the backend so that it will automatically downcast the logged in user to their child class. Other people have detailed this in the past: mostly the performance issue vanishes here because we are only looking at a single database query for a single object.
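A rough sketch of that downcasting step, kept free of Django imports so it can be read on its own. The downcast name, the child_names list and the missing parameter are all assumptions for illustration. In a Django backend you would probably pass ObjectDoesNotExist as missing, and call this on the user object before returning it from authenticate() and get_user().

```python
def downcast(user, child_names, missing=(AttributeError,)):
    """Return the first child-class instance reachable from ``user``.

    ``child_names`` are the accessor names that multi-table inheritance
    adds to the parent object (for example 'person' and 'orguser').
    ``missing`` is the exception type (or tuple of types) raised when no
    child row exists. If no child is found, the plain user is returned.
    """
    for name in child_names:
        try:
            # For a real model this is the single extra query per login.
            return getattr(user, name)
        except missing:
            continue
    return user
```

A 'pure' User, such as one of the admin team's logins, falls straight through and comes back unchanged.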

Adobe PDF Workflow under Snow Leopard

My partner is a mad keen Macromedia Freehand user. This is one of the reasons she has been able (and prepared) to stick with her trusty old G4 iMac until now. It is also the reason our brand new iMac won’t be running Lion anytime soon.

So, when we got the new iMac, I had to set up Freehand so it worked. The next thing was to bring across all of her thousands of fonts. Tip: if fonts look jaggy, force the font cache to rebuild and restart.

Finally, we got to the stage where she was trying to create some PDFs. And since Adobe is not always the best OS citizen, we found the old way she used to create them no longer worked under Snow Leopard. Using the system PDF generator resulted in far inferior PDF quality: jaggy fonts, lines within curves.

Now, I get the feeling that the system isn’t at fault here, as it is very capable of creating PDF files of correct quality. Indeed, I was able to get high quality PDFs generated from other programs (and as we’ll see in a minute, even from files generated from Freehand). So, it seems that Freehand ‘knows’ this is a ‘preview’ version, and cuts the quality of data it sends.

Eventually, after much work, I found that creating a PostScript file worked okay, but the page size was incorrect. At this stage I had installed the printer driver for our old Epson Stylus PHOTO EX, which resulted in the print dialog box no longer showing all of the Freehand MX settings.

The final solution was to create an IPP printer, to localhost, that is called Adobe PDF. This is set to use the generic PostScript driver. All of a sudden, we are able to access the Freehand MX advanced settings in the print dialog, and create PostScript files that are the right size, and of suitable quality. She then either uses Preview or Acrobat Distiller to turn these into PDFs.

Installing django (or any python framework, really)

TL;DR

$ pip install virtualenv
$ virtualenv /path/to/django_project/
$ . /path/to/django_project/bin/activate
$ pip install django

I hang around a fair bit in #django now on IRC. It’s open most of the time I am at work: if I am waiting for something to deploy, I’ll keep an eye out for someone that needs a hand, or whatever. Yesterday, I attempted to help someone out with an issue with django and apache: I ended up having to go home before it got sorted out.

One of the things that came up was how to actually install django. The person was following instructions on how to do so under Ubuntu, but they weren’t exactly ‘best practice’.

One of the things I wish had been around when I first started developing using python is virtualenv. This tool allows you to isolate a python environment, and install stuff into it that will not affect other virtual environments, or the system python installation.

Unfortunately, it does not come standard with python. If it were part of the standard library, it may reduce the likelihood of someone not using it. The upside of it not being in the standard library is that it gets updated more frequently.

Installing virtualenv

First, see if virtualenv is installed:

$ virtualenv --version

If not, you’ll need to install it. You can install it using pip or easy_install, if you have either of those installed. If you are a super-user on your machine (ie, it is your computer), then you may want to use sudo. You can have it installed just in your user account, which you might need to do on a shared computer.

You’ll probably also want to install pip at the system level. I do this first, and use it to install virtualenv, fabric and other packages that I need to use outside of a virtualenv (mercurial springs to mind). Do note that a virtualenv contains an install of pip by default, so this is up to you: once you have virtualenv installed, you can use pip in every virtualenv to install packages.

Setting up a virtual environment

I recommend using virtualenv for both development and deployment.

I think I use virtualenv slightly differently to most other people. My project structure tends to look like:

/home/user/development/<project-name>/
    bin/
    fabfile.py
    include/
    lib/python2.6/site-packages/...
    project/
        # Project-specific stuff goes here
    src/
        # pip install -e stuff goes here
    tmp/

Thus, my $VIRTUAL_ENV is actually also my $PROJECT_ROOT. This means that everything is self-contained. It has the negative side-effect of meaning that if I clone my project, I need to install everything again. This is not such a bad thing, as I use Fabric to automate the setup and deployment processes. It takes a bit of time, but using a local pypi mirror makes it fairly painless.

Obviously, I ignore bin/, lib/ and the other virtualenv created directories in my source control.

However, since we are starting from scratch, we won’t have a fabfile.py to begin with, and we’ll just do stuff manually.

$ cd /location/to/develop
$ virtualenv my_django_project

That’s it. You now have a virtual environment.

Installing django/other python packages

You’ll want to activate your new virtualenv to install the stuff you will need:

$ cd my_django_project
$ . bin/activate
(my_django_project)$

Notice the prompt changes to show you are in a virtual environment.

Install the packages you need (from now on, I’ll assume your virtualenv is active):

$ pip install django

There has been some discussion about having packages like psycopg2 installed at the system level: I tend to install everything into the virtualenv.

So that’s it. You now have django installed in a virtual environment. I plan to write some more later about my deployment process, as well as how I structure my django projects.

iPad Thumbscanner

Well, my iPad 2 arrived yesterday. Loving it.

But that’s not what this post is about.

At work, one of our products is a thumbscanner system. The other day I was discussing with our CEO the best way to have this in a unit, and ideally it would be an iPad or other tablet.

The issue with the iPad is that you cannot just connect any hardware to it.

But then I remembered how some people had connected all types of USB hardware, including in one case a USB-ADB adaptor and an old, old keyboard. All you need is one of the iPad Camera Connection Kit thingos.

Guess I’ll have to get one of those, and see what I can manage to get it to do…

…The trigger for thinking about this: Oscium’s iMSO-104 turns your iPad into a mixed signal oscilloscope

Baked Blog

So, it looks like I’ve taken the route that quite a few others are taking lately. Although not really based on the post by Brent Simmons: A plea for baked weblogs, it probably got me thinking. Marco Arment also talked about this concept in a couple of episodes of Build and Analyze, one of which I haven’t even listened to yet.

More though, it was wanting to really be in control of what goes on the site. I moved the data from Blogger to Blogsome ages ago, and then onto a self-hosted WordPress installation. I never managed to keep the installation updated, and was forced to use PHP if I wanted to change the look.

So, this re-publish, using Jekyll, was a way to simplify the design, and generally clean everything up. I managed to pull all of the posts down using the Jekyll-WordPress tool, although since I had a very old WordPress installation, I needed to tweak the script somewhat. Mainly that was to get the tags and categories. The database structure they had to associate them was royally fucked. Eventually I got there, though.

I designed the layout myself, with a real focus on simplicity. It’s probably not quite finished, but tweaking will happen. I have a few custom Jekyll plugins (based on others’ work): notably I have generators that create the yearly/monthly/daily archives, and the tag pages. I override the highlight template tag to cache the pygments output, and add a filename if it is included.

I probably would have not had the permalinks the way I do if I had started from scratch, but didn’t really want to break incoming links. I also need to find any more missing images, and incorrectly indented code blocks.

What I haven’t done yet is get my workflow complete. I would like to be able to edit a post on my iPad (when it arrives), or from any DropBox linked computer, and when I mark it as publish, have it automatically moved to the _posts/ folder in the Jekyll directory, run jekyll, and then deploy the data to the site. This script can take care of the file naming, from the date and the title in the yaml header, or the file modification date/time if none is provided in the header. It would be nice to come up with some workflow for creating new files, but I think I’d have to write my own app for that to work on iPad.
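The file naming part of that script might look something like this. It is only a sketch: post_date and post_filename are made-up names, the slug rule is a guess at something reasonable, and the default extension is arbitrary. The only thing taken as given is Jekyll's YEAR-MONTH-DAY-title.MARKUP naming convention for files in _posts/.

```python
import datetime
import os
import re


def post_date(header_date, path):
    """Use the date from the yaml header, else the file's modification date."""
    if header_date is not None:
        return header_date
    return datetime.date.fromtimestamp(os.path.getmtime(path))


def post_filename(title, date, ext="markdown"):
    """Build a Jekyll-style filename: YEAR-MONTH-DAY-title.MARKUP."""
    # Lowercase the title and collapse anything that is not a letter
    # or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return "%s-%s.%s" % (date.strftime("%Y-%m-%d"), slug, ext)
```

Parsing the yaml header itself is left out here. The header_date argument is assumed to be an already-parsed date, or None.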

I do have a deployment script, which just uses rsync to copy changed files:

#! /bin/bash
rsync -rcvuzm --delete /path/to/site/_site/ user@host:

I think I may create a new TextMate jekyll bundle, as the one that exists (a) cannot be installed properly from GetBundles, and (b) is missing some features, such as creating a new post, publishing, and deploying.

Finally, I am a bit peeved at how long it takes to build the jekyll site. With around 1500 posts, it takes about 4 minutes to generate the files. This is with the pygments output cached, but the markdown files need to be re-read each time, as several pages will change if a new post is added. That is, the index, any paginated root pages, plus all archives for the current year, and any tag pages with tags the same as the new post has. Jekyll currently recreates all pages.
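To make that list concrete, here is a sketch of working out which output pages a single new post would touch. The function name and every URL pattern in it (/YYYY/, /tags/NAME/, /pageN/) are assumptions for illustration rather than the real layout of this site, and the per_page default is arbitrary.

```python
import datetime


def pages_to_rebuild(post_date, tags, total_posts, per_page=10):
    """Return the set of (hypothetical) page paths affected by one new post."""
    pages = set(["/index.html"])
    # Every paginated root page shifts along by one post.
    page_count = -(-total_posts // per_page)  # ceiling division
    for number in range(2, page_count + 1):
        pages.add("/page%d/" % number)
    # The yearly, monthly and daily archives that contain the post.
    pages.add(post_date.strftime("/%Y/"))
    pages.add(post_date.strftime("/%Y/%m/"))
    pages.add(post_date.strftime("/%Y/%m/%d/"))
    # Only the tag pages for tags the new post actually has.
    for tag in tags:
        pages.add("/tags/%s/" % tag)
    return pages
```

Anything not in the returned set could, in principle, be left alone on the next build.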