I Hate Generic Foreign Keys, but this works anyway

I’m really not a fan of the concept of Generic Foreign Keys. They do have their place, and the app I’ve just started is a reasonable example.

It’s django-activity-streams, and I’m using it essentially as an audit stream. It stores the user who performed the change, the object that was changed, when it was changed, and a serialised version of the fields that have changed, in the format of:

    "field": "date_of_birth",
    "old": "1955-01-10",
    "new": "1955-10-01"

Now, the complication comes when trying to generate reports based on this stuff, and that is all down to the use of GFKs.

Essentially, what I want to be able to do is:

Action.objects.between(start, finish).verb('updated').filter(target__in=queryset)

But this will not work, as there is no real target field: it’s a GFK field. But we can query on the two fields that make it up: target_content_type and target_object_id.

So, you might think we can do something like:

ctype = ContentType.objects.get_for_model(queryset.model)

Alas, this will not work either, as target_object_id is a “character varying”, and a queryset kind-of looks like a set of integers (or whatever the primary key for that table is).

So, we need a list of characters, instead of integers.

pks = map(str, queryset.values_list('id', flat=True))

Indeed, that works, but (a) it requires two queries (one to get the PKs, and the other to get the actions), and (b) the second query will get very long if there are lots of objects in the queryset.

So, we want a query that we can use as a subquery. Enter postgres:

pks = queryset.extra(select={'_id': 'SELECT CAST("id" AS text)'}).values('_id')


    WHERE "actstream_action"."verb" = created
    AND "actstream_action"."timestamp" <= 2013-09-11 00:00:00
    AND "actstream_action"."timestamp" >= 2001-01-01 00:00:00
    AND "actstream_action"."target_object_id" IN (
          SELECT (SELECT CAST("id" AS text)) AS "_id"
          FROM "people" U0
          WHERE U0."comp_id" = 1
    )
    AND "actstream_action"."target_content_type_id" = 17
    ORDER BY "actstream_action"."timestamp" DESC;

You can see the subquery SELECT (SELECT CAST(...)) after the IN, which in the previous version was a list of string versions of the ids.

arp -a | vendor

I have lots of things on my local network. Most of them behave nicely with the DHCP server, and provide their machine name as part of their DHCP request (Client ID), which means I can see them in the list in Airport Utility.

However, some of them don’t, which means I have some blank rows.

It would be nice to be able to figure out which devices these are, especially for those that don’t provide any services (particularly a web interface).

Enter MAC Vendor Lookup.

You can register, and get an API key that will return values in the format you desire.

Then, it’s possible to do:

$ curl --silent http://www.macvendorlookup.com/api/:API_KEY/:MAC_ADDRESS | cut -f 1 -d \|

(I use the pipe delimited version).

This is all well and good, but who wants to have to type them in? Not this guy.

Let’s look at how we can get them from arp -a.

$ arp -a | cut -f 4 -d ' '

Okay, that’s promising: it gives me a list of MAC addresses. Almost. It leaves out leading zeros, which the API rejects. And it includes entries where the address is missing.

Cue about an hour mucking about with the (limited) sed regex docs:

$ arp -a | 
    cut -f 4 -d ' ' | 
    sed -E 's/:([[:xdigit:]]):/:0\1:/g' | 
    sed -E 's/^.:/0&/' | 
    sed -E 's/:(.)$/:0\1/'

Ah, that’s better. Now we have the proper MAC addresses.
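For comparison, the same padding can be sketched in Python. This is only an illustration, not part of the shell pipeline: it splits on colons and pads each octet to two hex digits. The function name mirrors the shell helper defined later in this post, and the sample address is made up. Because it pads each octet separately, it also copes with two single-digit octets in a row, which a single global regex pass can miss since matches cannot overlap.

```python
def normalise_mac_address(mac):
    """Pad every octet of a colon-separated MAC address to two hex digits."""
    return ":".join(octet.zfill(2) for octet in mac.strip().split(":"))


# Made-up address in the shortened form that arp prints.
normalised = normalise_mac_address("0:1e:c2:a:b:3")
```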

Now, we can pipe this information through the API call.

This is where we need to start to get a bit tricky. We need to create a function that will allow us to call the API with a new value each time. You’ll want to stick this in your .bashrc.

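function mac_vendor() {
  API_KEY="<your api key>"
  if [[ $1 ]]; then
    # Called with an argument: look up that single address.
    curl --silent "http://www.macvendorlookup.com/api/$API_KEY/$1" | cut -f 1 -d \|
  else
    # No argument: look up each address read from stdin.
    while read DATA; do
      curl --silent "http://www.macvendorlookup.com/api/$API_KEY/$DATA" | cut -f 1 -d \|
    done
  fi
}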

The if statement means we can use it by passing an argument on the command line:

$ mac_vendor 00:00:00:00:00:00
Xerox Corporation

Or by passing through data from stdin:

$ arp -a | 
    cut -f 4 -d ' ' | 
    sed -E 's/:([[:xdigit:]]):/:0\1:/g' | 
    sed -E 's/^.:/0&/' | 
    sed -E 's/:(.)$/:0\1/' |
    mac_vendor

Okay, that’s nice, but we now can’t see which IP address is associated with which vendor.

Let’s move that ugly chained sed call into its own function, called normalise_mac_address, which we will also wrap in a while read DATA; do ... done clause, so we can pipe data through it:

function normalise_mac_address() {
  while read DATA; do
    echo $DATA |
      sed -E 's/:([[:xdigit:]]):/:0\1:/g' |
      sed -E 's/^.:/0&/' |
      sed -E 's/:(.)$/:0\1/'
  done
}

Nearly there!

We now need to be able to grab out the IP address and the MAC address from arp, and pass only the MAC address through our conversion functions. By default the bash for … in … construct will iterate through words, so we need to tell it to deal with a line at a time:

function get_all_local_vendors() {
  # Split the command output on newlines only, so each LINE is a whole line
  local IFS=$'\n'
  for LINE in `arp -a | cut -f 2,4 -d ' '`; do
    # We have LINE="(<ip.address.here>) <mac:address:here>"
    MAC=`echo $LINE | cut -f 2 -d ' ' | normalise_mac_address`
    IP=`echo $LINE | cut -f 1 -d ' '`
    # We only want ones that were still active
    if [ "$MAC" != '(incomplete)' ]; then
      VENDOR=`echo $MAC | mac_vendor`
      echo $VENDOR $IP
    fi
  done
}

I’m hardly a bash expert, so there may be a better way of doing things rather than the repeated VARIABLE=`foo thing` construct I keep using.
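If the shell quoting gets tiresome, the parsing half could be done in Python instead. The following is a minimal sketch, not a replacement for the function above. The helper name `parse_arp_line` and the sample lines are made up; the lines follow the layout the cut fields imply, with the IP address in parentheses as the second field and the MAC address as the fourth.

```python
import re

# "(<ip>) at <mac>" as it appears in each line of arp output.
ARP_LINE = re.compile(r"\((?P<ip>[0-9.]+)\) at (?P<mac>\S+)")


def parse_arp_line(line):
    """Return (ip, mac) for one line of arp output, or None if no MAC is known."""
    match = ARP_LINE.search(line)
    if match is None or match.group("mac") == "(incomplete)":
        return None
    # Pad each octet to two hex digits, as the lookup API expects.
    mac = ":".join(octet.zfill(2) for octet in match.group("mac").split(":"))
    return match.group("ip"), mac


# Hypothetical sample lines, not real output.
sample = [
    "? (10.0.1.1) at 0:1e:c2:a:b:3 on en0 ifscope [ethernet]",
    "? (10.0.1.9) at (incomplete) on en0 ifscope [ethernet]",
]
pairs = [parse_arp_line(line) for line in sample]
```

Each non-None pair could then be handed to whatever does the vendor lookup.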

So, the outcome I get when I run this looks something like:

$ get_all_local_vendors 
Apple, Inc. (
Sparklan Communications, Inc. (
Devicescape Software, Inc. (
Mitrastar Technology (
Apple, Inc. (
Silicondust Engineering Ltd (
Apple Computer (
none (

Getting rid of that last one is left as an exercise to the reader: the MAC address is FF:FF:FF:FF:FF:FF.

Django Single Table Inheritance on the cheap.

There was a recent question on Stack Overflow about Django Single Table Inheritance (STI). It got me thinking about how to use my FSM-proxy stuff to just be about STI.

Note: this only works when all sub-classes have the same fields: the example we are going to use here is different to a state machine, in that an object may not change state after it has been created.

class Sheep(models.Model):
  type = models.CharField(max_length=4)
  tag_number = models.CharField(max_length=64)

class Ram(Sheep):
  class Meta:
    proxy = True
class Ewe(Sheep):
  class Meta:
    proxy = True

In this case, we can fetch all sheep as Sheep.objects.all(). However, this gives us the objects as Sheep instances, when we want those with type='ram' to return Ram instances, and those with type='ewe' to return Ewe instances.

We can do this, by the magic of type().__subclasses__().

class Sheep(models.Model):
  # fields as above
  def __init__(self, *args, **kwargs):
    super(Sheep, self).__init__(*args, **kwargs)
    # If we don't have a subclass at all, then we need the type attribute to match
    # our current class. 
    if not self.__class__.__subclasses__():
      self.type = self.__class__.__name__.lower()
    else:
      subclass = [x for x in self.__class__.__subclasses__() if x.__name__.lower() == self.type]
      if subclass:
        self.__class__ = subclass[0]
      else:
        self.type = self.__class__.__name__.lower()

This will automatically downcast Sheep objects to the correct subclass, based upon the type field.
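The class-swapping trick itself does not depend on Django. Here is a minimal plain-Python sketch of the same idea, with ordinary classes standing in for the models. The constructor signature is an assumption for illustration only, as a real model takes its fields through Django's own `__init__`.

```python
class Sheep(object):
    """Plain-Python stand-in for the model: only the `type` attribute matters."""

    def __init__(self, type=None):
        self.type = type
        matches = [
            sub for sub in self.__class__.__subclasses__()
            if sub.__name__.lower() == self.type
        ]
        if matches:
            # Downcast: swap the instance's class for the matching subclass.
            self.__class__ = matches[0]
        else:
            # No matching subclass: force the type to match the current class.
            self.type = self.__class__.__name__.lower()


class Ram(Sheep):
    pass


class Ewe(Sheep):
    pass


downcast = Sheep(type='ram')
```

Assigning to `self.__class__` works here because all three classes share the same instance layout, which is also why the approach only suits subclasses that add behaviour rather than fields.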

It also sets the type field on objects that are instantiated without one (based on the current instance class). This enables us to do things like:

# Fetch all Sheep, downcast to correct subclass.
>>> Sheep.objects.all()
[<Ram: Ram object>, <Ram: Ram object>, <Ewe: Ewe object>]

# Automatically set the type on a class.
>>> Ram()
<Ram: Ram object>
>>> Ram().type
'ram'
>>> Sheep()
<Sheep: Sheep object>
>>> Sheep().type
'sheep'

# Automatically set the class on a valid subclass/type
>>> Sheep(type='ram')
<Ram: Ram object>
# Force the type field on an invalid type argument. [see below]
>>> Ram(type='ewe')
<Ram: Ram object>
>>> Sheep(type='foo')
<Sheep: Sheep object>
>>> Sheep(type='foo').type
'sheep'

The assumption I have made here is that when instantiating a class, and the type value is not a valid value (our class, or one of our subclasses), then it changes the type field to the current class.

The other assumption is that the parent class is also valid. In this case, it wouldn’t be, as sheep must be either ewes or rams (or wethers, but that’s another story).

We also need to be able to fetch Ewe and Ram objects using their manager. This is just as simple as filtering on the type.

class ProxyManager(models.Manager):
  def get_query_set(self): # Note: get_queryset in Django 1.6+
    return super(ProxyManager, self).get_query_set().filter(type=self.model.__name__.lower())

class Ram(Sheep):
  objects = ProxyManager()
  class Meta:
    proxy = True

class Ewe(Sheep):
  objects = ProxyManager()
  class Meta:
    proxy = True

Now, we can do:

>>> Ram.objects.all()
[<Ram: Ram object>, <Ram: Ram object>]

Clearly, the models have been simplified: I have not shown any model methods that would be the different behaviours that the subclasses have.

Who is having a birthday?

The latest thing I have been working on is notifications for our project. One of the required notification types is upcoming (and today’s) birthdays (and expiring work visas, but that’s a much easier problem).

This actually turns out to be quite a hard problem. There are some simple solutions, but none of them meet our requirements:

  1. Store only the month and day of a person’s birthday. This is unsatisfactory as we use their age to calculate their wage, if applicable.
  2. Create a pseudo-column that contains their upcoming birthday. This gets hard when you take leap-day birthdays into account.

We need to fetch all people who have a birthday coming up in the next X days. This is a requirement because if we just matched people who had a birthday in X days, (a) leap days are easy to miss, and (b) changes to either the query period or a person’s birthday could mean some events were missed.

Instead, we will query for birthdays in a range, and see if we have already sent a notification for this instance of their birthday. If not, we will send a notification.

One solution is to look at all of the dates in the given range, in the format -%m-%d, and query using contains against this list:

dates = [start+datetime.timedelta(i) for i in range((finish-start).days)]
filters = [Q(dob__contains=x.strftime('-%m-%d')) for x in dates]
Person.objects.filter(reduce(operator.or_, filters))

But, this too fails when a birthday on a leap day exists, and this year is not a leap year.

(We use the -%m-%d format instead of %m-%d so we don’t get false matches from the year part of the date).

Then I came across a post by Zoltán Böszörményi, that contains the following useful function:

CREATE OR REPLACE FUNCTION indexable_month_day(date) RETURNS TEXT AS $BODY$
  SELECT to_char($1, 'MM-DD');
$BODY$ language 'sql' IMMUTABLE STRICT;

There are a couple of things to notice: it does MM-DD, not the other way around. This allows us to sort lexically. Also, declaring it as IMMUTABLE means we will be able to create an index using it. And since we are querying against it, having an index may be useful:

CREATE INDEX person_birthday_idx ON people (indexable_month_day(dob));

Now, we can also query against this. I like to use django queryset methods (see building a higher-level query API), so my stuff looks like:

class BirthdayQuerySetMixin(object):
    def birthday_between(self, start, finish):
        assert start <= finish, "Start must be less than or equal to finish"
        start = start - datetime.timedelta(1)
        finish = finish + datetime.timedelta(1)
        return self.extra(where=["""
            indexable_month_day(dob) < '%(finish)s'
            %(andor)s
            indexable_month_day(dob) > '%(start)s'
            """ % {
                'start': start.strftime('%m-%d'),
                'finish': finish.strftime('%m-%d'),
                'andor': 'and' if start.year == finish.year else 'or'
            }
        ])

    def birthday_on(self, date):
        return self.birthday_between(date, date)

This has a caveat: it returns two matches for leap-day birthdays during non-leap-years. This is intentional, as other logic will prevent duplicate notifications, and we don’t know which offsetting method people will prefer.

The logic behind it is that it offsets the start and the finish by one day each, and then filters using less_than and greater_than. This is what allows us to find leap-day birthdays. The other tricky part is using AND when the years of the start and finish are the same, and OR if the finish is in the next year. This allows it to match over year boundaries.
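To make that logic easier to check, here is the same comparison as a plain-Python sketch that works on one date of birth at a time rather than in SQL. The function names are made up for illustration. It is meant to mirror the where clause above, not replace it.

```python
import datetime


def month_day(date):
    """Same idea as indexable_month_day: a lexically sortable 'MM-DD' string."""
    return date.strftime('%m-%d')


def has_birthday_between(dob, start, finish):
    assert start <= finish, "Start must be less than or equal to finish"
    # Widen the range by a day on each side, then compare strictly.
    start = start - datetime.timedelta(1)
    finish = finish + datetime.timedelta(1)
    after_start = month_day(dob) > month_day(start)
    before_finish = month_day(dob) < month_day(finish)
    if start.year == finish.year:
        return after_start and before_finish
    # The range wraps over New Year, so a match on either side counts.
    return after_start or before_finish


leap_day = datetime.date(1980, 2, 29)
new_year = datetime.date(1990, 1, 2)
```

With these definitions, a 29 February birthday matches both 28 February and 1 March of a non-leap year, which is the double match described earlier.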

Oh, there should also be a check to ensure that we have less than a full year between start and finish: if it’s a year or more, we can just return everyone!

Otherwise, it’s all good, and we can use it to filter a queryset. I’ve put those methods in my PersonManager and PersonQuerySet (via PassThroughManager), so I can do things like:

>>> today = datetime.date.today()
>>> Person.objects.birthday_between(today, today + datetime.timedelta(7))

… which provides me with a list of people who have a birthday within the next seven days.

Django sessions and security

We have an interesting set of requirements regarding session timeouts.

Our application is currently split into two parts: the newer stuff runs in a browser, but the older parts consume a JSON api, and are part of a native app. We recently stopped using HTTP Basic authentication, and instead use session-based authentication in both places. This was handy, as it allows us to:

  1. Not store the user’s password, even in memory on the local machine.
  2. Automatically have the user logged in when the native client links to an HTML page (by passing the session id through).

This is all well and good, but we have discovered a slight possible issue.

  1. User logs in to native client.
  2. User clicks on a button that loads a page in the browser (logging them in automatically).
  3. User closes browser window, but does not quit browser.
  4. Native client does not cleanly exit, and logout code is not called.

This means that the browser session is still logged in, even though the user would have no idea of this. This is a very bad thing™, as the next person to use the computer could have access to all of the previous user’s data.

So, we need the following to happen:

  • Logging out of the client logs out all of the linked (same session id) browser instances.
  • Closing a given browser window does not log out the session (the client may still be open, or there may be other linked browser windows).
  • When no requests are received within a given time period, the session expires.

So, we need a short session expiry time, but this should refresh every time a request occurs. The browser pages fetch notifications every 30 seconds, but the native client will also need to ping the server with some frequency for this to work.

This is somewhat different to the way django-session-security works. However, it does add a feature that may also be useful: if no user input is received on a given page within a timeout period, the session should expire. This may be hard to manage, though, as no activity may occur on one page while another page is getting lots of activity. For now, we might leave this out as a requirement.

It turns out Django can do everything that is required, out of the box. All you need to do is configure it correctly:

# settings.py


The key is understanding that the session expiry time is only refreshed if the session is saved. Most requests will not save it (my fetch of unread notifications doesn’t, for instance), so after the expiry time, the session would expire, even if requests had been made in the meantime.
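A settings sketch along these lines would give that behaviour. Both setting names are standard Django settings, but the timeout value here is an assumption: it only needs to be comfortably longer than the 30-second notification polling interval.

```python
# settings.py (sketch)

# Lifetime of a session, in seconds. The value is an assumption; pick
# something longer than the interval between the client's requests.
SESSION_COOKIE_AGE = 120

# Save the session on every request, which pushes the expiry time forward
# even when the request did not modify any session data.
SESSION_SAVE_EVERY_REQUEST = True
```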

Django Fieldsets

HTML forms contain a construct called a fieldset. These are generally used to segment a form: splitting a form into groups of fields that are logically grouped. Each fieldset may also have a legend.

Django’s forms have no concept of a fieldset natively, but with a bit of patching, we can make every django form capable of rendering itself using fieldsets, yet still be backwards compatible with non-fieldset-aware templates.

Ideally, we would like to be able to render a form in a way similar to:

<form>
  {% for fieldset in form.fieldsets %}
    <fieldset>
      <legend>{{ fieldset.title }}</legend>
      {% for field in fieldset %}
          {{ field.label_tag }}
          {{ field }}
          {{ field.help_text }}
          {{ field.errors }}
      {% endfor %}
    </fieldset>
  {% endfor %}
  <!-- submit button -->
</form>

And, it would make sense to be able to declare a form’s fieldsets in a manner such as:

class MyForm(forms.Form):
  field1 = forms.BooleanField(required=False)
  field2 = forms.CharField()
  class Meta:
    fieldsets = (
      ('Fieldset title', {
        'fields': ('field1', 'field2')
      }),
    )

This is similar to how fieldsets are declared in the django admin.

We can’t just simply create a subclass of forms.Form, and do everything there, as the metaclass stuff doesn’t work correctly. Instead, we need to duck-punch.

First, we want to redefine the metaclass __init__ method, so it will accept the fieldsets attribute.

from django import forms
from django.forms.models import ModelFormOptions

_old_init = ModelFormOptions.__init__

def _new_init(self, options=None):
  _old_init(self, options)
  self.fieldsets = getattr(options, 'fieldsets', None)

ModelFormOptions.__init__ = _new_init
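
The same wrap-and-replace pattern can be tried out without Django. In this sketch, `Options` is a made-up stand-in for a library class we cannot edit, and `Meta` is a made-up declaration that gets passed to it.

```python
class Options(object):
    """Stand-in for a library class that we cannot edit directly."""

    def __init__(self, options=None):
        self.fields = getattr(options, 'fields', None)


# Keep a reference to the original, then install a wrapper in its place.
_old_init = Options.__init__


def _new_init(self, options=None):
    _old_init(self, options)
    # Pick up the extra attribute the original __init__ knows nothing about.
    self.fieldsets = getattr(options, 'fieldsets', None)


Options.__init__ = _new_init


class Meta(object):
    fields = ('field1', 'field2')
    fieldsets = (('Fieldset title', {'fields': ('field1', 'field2')}),)


opts = Options(Meta)
```

Every instance created after the patch carries a `fieldsets` attribute, and it is simply `None` when nothing was declared.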

Next, we will need a Fieldset class:

class Fieldset(object):
  def __init__(self, form, title, fields, classes):
    self.form = form
    self.title = title
    self.fields = fields
    self.classes = classes
  def __iter__(self):
    # Similar to how a form can iterate through its fields...
    for field in self.fields:
      yield field

And finally, we need to give every form a fieldsets method, which will yield each fieldset, as a Fieldset defined above:

def fieldsets(self):
  meta = getattr(self, '_meta', None)
  if not meta:
    meta = getattr(self, 'Meta', None)
  if not meta or not meta.fieldsets:
    return
  for name, data in meta.fieldsets:
    yield Fieldset(
      form=self,
      title=name,
      fields=(self[f] for f in data.get('fields', ())),
      classes=data.get('classes', '')
    )

forms.BaseForm.fieldsets = fieldsets

I am using this code (or something very similar to it), in projects. It works for me, but your mileage may vary…

Django Proxy Model State Machine

Finite State Machines (FSMs) are a great way to model something that has, well, a finite number of known states. You can easily specify the different states, and the transitions between them.

Some time ago, I came across a great way of doing this in python: Dynamic State Machines. This maps well onto an idea I have been toying with lately, replacing a series of linked models representing different phases in a process with one model type. Initially, I had thought to just use a type flag, but actually changing the class seems like a better idea.

One aspect of django’s models that makes it easy to do this is the concept of a Proxy Model. These are models that share the database table, but have different class definitions. However, usually a model instance will be of the type that was used to fetch it:

class ModelOne(models.Model):
  field = models.CharField()
class ModelOneProxy(ModelOne):
  class Meta:
    proxy = True

ModelOneProxy.objects.get(pk=1) # Returns a ModelOneProxy object.
ModelOne.objects.all() # Returns all ModelOne objects.

However, by using a type field, we can, at the time it is fetched from the database, turn it into the correct type.

class StateMachineModel(models.Model):
  status = models.CharField(max_length=64)
  def __init__(self, *args, **kwargs):
    super(StateMachineModel, self).__init__(*args, **kwargs)
    self.__class__ = class_mapping[self.status]

However, having to store a registry of status : <ProxyModelClass> objects is not much fun.

Enter __subclasses__.

  @property
  def _get_states(self):
    """
    Get a mapping of {status: SubClass, ...}
    The status key will be the name of the SubClass, with the
    name of the superclass stripped out.
    It is intended that you prefix your subclasses with a meaningful
    name, that will be used as the status value.
    """
    return dict([
      (
        # Strip the superclass name before lower-casing, so it still matches.
        sub.__name__.replace(self.__class__.__name__, '').lower(),
        sub
      ) for sub in self.__class__.__subclasses__()
    ])
  # in __init__, above, replace the last line with:
    self.__class__ = self._get_states[self.status]

Now, we need to change the underlying class when the status gets changed:

  def __setattr__(self, attr, value):
    if attr == 'status':
      states = self._get_states
      if value in states:
        self.__class__ = states[value]
    return super(StateMachineModel, self).__setattr__(attr, value)

As the docstring on _get_states indicates, it looks at the subclass name, and compares it to the superclass name to work out the values that will be stored as the status (and used to dynamically change the class).
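Here is a minimal plain-Python sketch of that naming convention and the class swap, without any Django. The `Order` classes and the `next_status` method are made up for illustration. The sketch looks up subclasses on the base class explicitly, because once the swap has happened `self.__class__` is already one of the subclasses.

```python
class Order(object):
    """Base class; state subclasses are named <State>Order."""

    def __init__(self, status):
        self.status = status  # goes through __setattr__ below

    @staticmethod
    def _states():
        # {'pending': PendingOrder, 'shipped': ShippedOrder}
        return dict(
            (sub.__name__.replace('Order', '').lower(), sub)
            for sub in Order.__subclasses__()
        )

    def __setattr__(self, attr, value):
        if attr == 'status':
            states = self._states()
            if value in states:
                # Swap the class so state-specific behaviour takes effect.
                self.__class__ = states[value]
        super(Order, self).__setattr__(attr, value)


class PendingOrder(Order):
    def next_status(self):
        return 'shipped'


class ShippedOrder(Order):
    def next_status(self):
        return None


order = Order('pending')
first_class = type(order)
order.status = order.next_status()
```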

This has a fairly large implication: you cannot fetch database objects of any of the subclass types directly. Instead, you would need to query through the parent model and filter on the status field.


Of course, you could use queryset methods to do this: that’s what I have been doing.

This is still a bit of a work in progress: it’s not well tested, but is an interesting idea.

The full version of this model class, which is slightly different to above:

from django.db import models

class StateMachineModel(models.Model):
    status = models.CharField(max_length=64)
    class Meta:
        abstract = True
    def __init__(self, *args, **kwargs):
        self._states = dict([
            (sub.__name__.replace(self.__class__.__name__, '').lower(), sub)
            for sub in self.__class__.__subclasses__()
        ])
        super(StateMachineModel, self).__init__(*args, **kwargs)
        self._meta.get_field_by_name('status')[0]._choices = [(x, x) for x in self._states.keys()]
    def _set_state(self):
        if getattr(self, 'status', None) in self._states:
            self.__class__ = self._states[self.status]
    def __setattr__(self, attr, value):
        result = super(StateMachineModel, self).__setattr__(attr, value)
        if attr == 'status':
            # Swap to the matching subclass once the new status is stored.
            self._set_state()
        return result

Neat and tidy read-only fields

I have a recurring pattern I’m seeing, where I have a field in a model that needs to be read-only. It usually is a Company to which an object belongs, but it also occurs in the case where an object belongs to some collection, and isn’t permitted to be moved to a different collection.

Whilst there are some workarounds that apply the field’s value to the instance after creating, it’s nicer to be able to apply the read-only nature declaratively, and not have to remember to do something in the form itself.

Unfortunately, in django, normal field subclasses don’t have access to the initial argument that was used to construct them. But forms.FileField objects do. So we can abuse that a little.

We also need a widget, that will always return False for questions about if the value has been changed, and re-render with the initial value at all times.

from django import forms

class ReadOnlyWidget(forms.HiddenInput):
  def render(self, name, value, attrs=None):
    # Always re-render with the initial value, ignoring submitted data.
    value = getattr(self, 'initial', value)
    return super(ReadOnlyWidget, self).render(name, value, attrs)
  def _has_changed(self, initial, data):
    return False

class ReadOnlyField(forms.FileField):
  widget = ReadOnlyWidget
  def __init__(self, *args, **kwargs):
    # Skip FileField.__init__: we only want its clean(value, initial) signature.
    forms.Field.__init__(self, *args, **kwargs)
  def clean(self, value, initial):
    self.widget.initial = initial
    return initial

So, that’s all well and good. But a common use for me was for this field to be a related field: a Company as described above, or a User.

Enter ReadOnlyModelField, and ReadOnlyUserField.

Now, ReadOnlyModelField is a bit tricky: it’s not actually a class, but a factory function, so we will look at ReadOnlyUserField first:

class ReadOnlyUserField(ReadOnlyField):
  def clean(self, value, initial):
    initial = super(ReadOnlyUserField, self).clean(value, initial)
    return User.objects.get(pk=initial)

Note that this incurs a database query.

Now, we are ready to look at a more general case:

def ReadOnlyModelField(ModelClass, *args, **kwargs):
  class ReadOnlyModelField(ReadOnlyField):
    def clean(self, value, initial):
      initial = super(ReadOnlyModelField, self).clean(value, initial)
      return ModelClass.objects.get(pk=initial)
  return ReadOnlyModelField(*args, **kwargs)

This is a bit tricky. We create a function that looks like a class, but actually creates a new class (and returns an instance of it) when it is called. This is so we can use it in a form definition:

class MyForm(forms.ModelForm):
  company = ReadOnlyModelField(Company)
  class Meta:
    model = MyModel

Django AJAX Forms

I think the more Django code I write, the more I like one particular feature.


Forms.

Simple as that. Forms are the reason I keep coming back to django, and discard other web frameworks in other languages, even though I really want to try them.

One pattern I have been using a fair bit, which was touched on in another post, is using AJAX to handle form submission, and displaying the response.

Before we continue, a quick recap on what Django’s forms offer us.

  • A declarative approach to defining the fields a form has, including validation functions.
  • Will render themselves to HTML input elements, as appropriate.
  • Handle validation of incoming form-encoded (or otherwise provided) data.
  • Fields can validate themselves, and can include validation error messages as part of the HTML output.
  • (Model forms) handle instantiation of and updating of model instances.

A normal form-submission cycle contains a POST or GET request to the server, which responds with a fresh HTML page, which the browser renders. The normal pattern for successful POST requests is to redirect to a GET afterwards, to prevent duplicate submission of forms.

Doing an AJAX request instead of a full-page request means we can:

  • reduce the amount of data that is sent back from the server
  • improve apparent performance by only re-rendering the relevant data
  • reduce the amount of time spent rendering parts of the page that have not changed, such as menu, etc.

The way I have been doing it, in broad terms, is to have a template just for the form. If the request is an ajax request, then this will be rendered and returned. If it’s not an ajax request, then the full page will be returned.
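The "is this an ajax request" test comes down to a header check: jQuery sends `X-Requested-With: XMLHttpRequest` with its ajax calls, and that header is what `request.is_ajax()` looks for. Note that `is_ajax()` has since been deprecated and removed in newer Django releases, so the check may need to be done by hand. A minimal sketch, using a plain dict in place of `request.META` and a made-up `choose_template` helper:

```python
def is_ajax(meta):
    """Mimic the header check; `meta` is a dict shaped like request.META."""
    return meta.get('HTTP_X_REQUESTED_WITH') == 'XMLHttpRequest'


def choose_template(meta):
    # Form-only template for ajax requests, full page otherwise.
    return 'form.html' if is_ajax(meta) else 'page.html'


ajax_template = choose_template({'HTTP_X_REQUESTED_WITH': 'XMLHttpRequest'})
page_template = choose_template({})
```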

Some example code, for one way to do this:

from django.shortcuts import redirect, render

def view(request, pk):
  instance = MyModel.objects.get(pk=pk)
  if request.is_ajax():
    template = 'form.html'
  else:
    template = 'page.html'
  if request.method == 'POST':
    form = MyForm(request.POST, instance=instance)
    if form.is_valid():
      form.save()
      if not request.is_ajax():
        return redirect('redirect-name-here')
  else:
    form = MyForm(instance=instance)
  # Invalid (and ajax) submissions fall through and re-render the bound form.
  return render(request, template, {'form': form})

Our template files. page.html:

{% extends 'base.html' %}

{% block main %}
  {% include 'form.html' %}
{% endblock %}

{% block script %}
{# Assumes jQuery is loaded... #}
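{# This should be in a separate script file #}
$(document).on('submit', 'form.dynamic-form', function(evt) {
  var form = this;
  var $form = $(form);
  // Stop the browser doing a normal full-page submission.
  evt.preventDefault();
  $.ajax({
    type: form.method,
    url: form.action,
    data: $form.serialize(),
    success: function(data) {
      // Swap the old form for the freshly rendered one.
      $form.replaceWith(data);
    }
  });
});
{% endblock %}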

And form.html:

<form action="/path/to/url/" method="POST" class="dynamic-form">
  {% csrf_token %}
  {{ form }}
  <button type="submit">Submit</button>
</form>

Obviously, this is a fairly cut-down example, but it gets the message across.

One thing I dislike in general about django is that failed form submissions are returned with a status code of 200: personally I think a 409 is more appropriate in most cases, but returning a 200 actually means this code is simpler.

Capture and test sys.stdout/sys.stderr in unittest.TestCase

Testing in Django is usually done using the unittest framework, which comes with Python. You can also test using doctest, with a little bit of work.

One advantage of doctest is that it’s super-easy to test for an exception: you just expect the traceback (which can be trimmed using \n ... \n).

In a unittest.TestCase, you can do a similar thing, but it’s a little more work.

Basically, you want to temporarily replace sys.stdout (or sys.stderr) with a StringIO instance, and set it back after the block you care about has finished.

Python has had a nice feature for some time called Context Managers. These enable you to ensure that cleanup code will be run, regardless of what happens in the block.

The syntax for running code within a context manager is:

with context_manager(thing) as other:
  # Code we want to run
  # Can use 'other' in here.

One place you can see this syntax, in the context of testing with unittest, is when checking that a specific exception is raised by a function that takes keyword arguments, or by a statement that is not a callable:

class FooTest(TestCase):
  def test_one_way(self):
    self.assertRaises(ExceptionType, callable, arg1, arg2)

  def test_another_way(self):
    with self.assertRaises(ExceptionType):
      callable(arg1, arg2)
      # Could also be:
      #     callable(arg1, arg2=arg2)
      # Or even:
      #     foo = bar + baz
      # Which are not possible in the test_one_way call.

So, we could come up with a similar way of calling our code that we want to capture the sys.stdout from:

class BarTest(TestCase):
  def test_and_capture(self):
    with capture(callable, *args, **kwargs) as output:
      self.assertEqual("Expected output", output)

And the context manager:

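import sys
from cStringIO import StringIO  # Python 2; use io.StringIO on Python 3
from contextlib import contextmanager

@contextmanager
def capture(command, *args, **kwargs):
  out, sys.stdout = sys.stdout, StringIO()
  try:
    command(*args, **kwargs)
    # Rewind, otherwise read() starts at the end and returns nothing.
    sys.stdout.seek(0)
    yield sys.stdout.read()
  finally:
    # Always put the real stdout back, even if the command raised.
    sys.stdout = out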

It’s simple enough to do the same with sys.stderr.
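
On Python 3.4 and later, the standard library has `contextlib.redirect_stdout`, which does the swap-and-restore for you (there is a matching `redirect_stderr` from 3.5). A minimal sketch of the same kind of test using it follows. The `greet` function and the test class are made up for illustration.

```python
import io
import unittest
from contextlib import redirect_stdout


def greet(name):
    """Hypothetical function under test: it only prints."""
    print("Hello, %s" % name)


class GreetTest(unittest.TestCase):
    def test_and_capture(self):
        buffer = io.StringIO()
        # Everything printed inside the block ends up in the buffer.
        with redirect_stdout(buffer):
            greet("world")
        self.assertEqual("Hello, world\n", buffer.getvalue())
```

Unlike the hand-rolled version, this runs the code inside the with block, so the assertions can sit after it and still see the real stdout.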