Django Proxy Model Relations

I’ve got lots of code I’d do a different way if I were to start over, but often, we have to live with what we have.

One situation I would seriously reconsider is the structure I use for storing data related to how I interact with external systems. I have an Application object, and I create instances of this for each external system I interact with. Each new Application gets a UUID, and is created as part of a migration. Code in the system uses this UUID to determine if something is for that system.

But that’s not the worst of it.

I also have an AppConfig object, and other related objects that store a relation to an Application. This was fine initially, but as my code got more complex, I hit upon the idea of using Django’s Proxy models, and using the related Application to determine the subclass. So, I have AppConfig subclasses for a range of systems. This is nice: we can even ensure that we only get the right instances (using a lookup to the application to get the discriminator, which I’d probably do a different way next time).

However, we also have other bits of information that we need to store, that has a relation to this AppConfig object.

And here is where we run into problems. Eventually, I had the need to subclass these other objects, and deal with them. That gives a similar benefit to above for fetching filtered lists of objects, however when we try to follow the relations between these, something annoying happens.

Instead of getting the subclass of AppConfig, that we probably want to use because the business logic hangs off that, we instead get the actual AppConfig instances. So, in order to get the subclass, we have to fetch the object again, or swizzle the __class__. And, going back the other way would have the same problem.

Python is a dynamic language, so we should be able to do better.

In theory, all we have to do is replace the attributes on the two classes with ones that will do what we want them to do. In practice, we need to muck around a little bit more to make sure it all works out right.

It would be nice to be able to decorate the declaration of the overridden field, but that’s not valid python syntax:

>>> class Foo(object):
...   @override
...   bar = object()

So, we’ll have to do one of two things: alter the class after it has been defined, or leverage the metaclass magic Django already does.

class Foo(models.Model):
    bar = models.ForeignKey('bar.Bar')


class FooProxy(models.Model):
    bar = ProxyForeignKey('bar.BarProxy')  # Note the proxy class

    class Meta:
      proxy = True

However, we can’t just use the contribute_to_class(cls, name) method as-is, as the Proxy model attributes get dealt with before the parent model. So, we need to register a signal, and get the framework to run our code after the class has been prepared:

class ProxyField(object):
    def __init__(self, field):
        self.field = field

    def contribute_to_class(self, model, name):
        @receiver(models.signals.class_prepared, sender=model, weak=False)
        def late_bind(sender, *args, **kwargs):
          override_model_field(model, name, self.field)


class ProxyForeignKey(ProxyField):
    def __init__(self, *args, **kwargs):
        super(ProxyForeignKey, self).__init__(ForeignKey(*args, **kwargs))

Then, it’s a matter of working out what needs to happen to override_model_field.

It turns out: not much. Until we start thinking about edge cases, anyway:

def override_model_field(model, name, field):
    original_field = model._meta.get_field(name)

    model.add_to_class(name, field)
    if field.rel:
      field.rel.to.add_to_class(
        field.related_name,
        ForeignRelatedObjectsDescriptor(field.related)
      )

There is a little more to it than that:

  • We need to use the passed-in related_name if one was provided in the new field definition, else we want to use what the original field’s related name was. However, if not explicitly set, then neither field will actually have a related_name attribute.
  • We cannot allow an override of a foreign key to a non-proxy model: that would hijack the original model’s related queryset.
  • Similarly, we can only allow a single proxy to override-relate to another proxy: any subsequent override-relations would likewise hijack the related object queryset.
  • For non-related fields, we can only allow an override if the field is compatible. What that means I’m not completely sure just yet, but for now, we will only allow the same field class (or a subclass). Things that would require a db change would be verboten.

So, we need to guard our override_model_field somewhat:

def override_model_field(model, name, field):
    original_field = model._meta.get_field(name)

    if not isinstance(field, original_field.__class__):
        raise TypeError('...')

    # Must do these checks before the `add_to_class`, otherwise it breaks tests.
    if field.rel:
        if not field.rel.to._meta.proxy:
            raise TypeError('...')

        related_name = getattr(field, 'related_name', original_field.related.get_accessor_name())
        related_model = getattr(field.rel.to, related_name).related.model

        # Do we already have an overridden relation to this model?
        if related_model._meta.proxy:
            raise TypeError('...')

    model.add_to_class(name, field)

    if field.rel:
        field.rel.to.add_to_class(
          related_name,
          ForeignRelatedObjectsDescriptor(field.related)
        )

There is an installable app that includes tests: django-proxy-overrides.