Why CustomUser subclasses are not such a good idea

Background

The system I work on has People who may or may not be Users, and very infrequently Users who may not be a Person. In fact, an extension to the system has meant that there will be more of these: a User who needs to be able to generate reports (say, a Franchisor who needs to only be able to access aggregate data from franchises, that might belong to multiple companies) who is never rostered on for shifts, which is what the Person class is all about.

Anyway, the long and the short of this was that I thought it might be a good idea to look at sub-classing User for ManagementUser.

I guess I should have listened to those smarter than me who shouted that sub-classing User is not cool. Although they never gave any concrete reasons, but now I have one.

You cannot easily convert a superclass object to a specialised sub-class. Once a user is a User, it’s hard to make them into a ManagementUser.

It can be done: the following code will take a User (or any parent class) object, a User (or whatever) subclass, and any other keyword arguments that should be passed into the constructor. It saves the newly upgraded object, and returns it.

1 def create_subclass(SubClass, old_instance, **kwargs):
2     new_instance = SubClass()
3     for field in old_instance._meta.local_fields:
4         setattr(new_instance, field.name, getattr(old_instance, field.name))
5     new_instance.save()
6     return new_instance()

However, it really should check that there isn’t an existing instance, and maybe some other checks.

What advantages does sub-classing have?

The biggest advantage, or so I thought, was to have it so you can automatically downcast your models on user login, and then get access to the extended user details. For instance, if your authentication backend automatically converts User to Person, then you can get access to the Person’s attributes (like the company they work for, their shifts, etc) without an extra level of attribute access:

1 # request.user is always an auth.User instance:
2 request.user.person.company
3 # request.user might be a person, etc.
4 request.user.company

But it turns out that even this is bad. Now, in guard decorators on view functions, you cannot just test the value of an attribute, as not all users will have that attribute. Instead, you need to test to see if the attribute exists, and then test the attribute itself.

So, what do you do instead?

The preferred method in django for extending User is to use a UserProfile class. This is just a model that has a OneToOneField linked back to User. I would look at doing a very small amount of duck-punching just to make getting a hold of the profile class:

 1 import logging
 2 from django.contrib.auth.models import User
 3 from django.db import models
 4 
 5 class Person(models.Model):
 6     user = models.OneToOneField(User, related_name="_person")
 7     date_of_birth = models.DateField(null=True, blank=True)
 8 
 9 def get_person(user):
10     try:
11         return user._person
12     except Person.DoesNotExist:
13         pass
14 
15 def set_person(user, person):
16     user._person = person
17 
18 if hasattr(User, 'person'):
19     logging.error('Model User already has an attribute "person".')
20 else:
21     User.person = property(get_person, set_person)

By having the person’s related name attribute as _person, we can wrap read access to it in an exception handler, and then use a view decorator like:

1 @user_passes_test(lambda u:u.person)
2 def person_only_view(request, **kwargs):
3     pass

We know this view will only be available to logged in users who have a related Person object.

I will point out that I am duck-punching/monkey-patching here. However, I feel that this particular method of doing it is relatively safe. I check before adding the property, and in reality I probably would raise an exception rather than just log an error.

Displaying only objects without subclasses

Sometimes, the django.contrib.auth User model just doesn’t cut it.

I have bounced around between ways of handling this sorry fact. My production system uses a nasty system of PersonUser relationships (where, due to old legacy code, I need to keep the primary keys in sync), to monkey-patching User, using UserProfiles, and subclassing User.

First, a little on the nasty hack I have in place (and how that will affect my choices later on).

My project in work is a rostering system, where not everyone who is a Person in the system needs to be a User. For instance, most managers (who are Users) do not need their staff to be able to log in. However, they themselves must be a Person as well as a User, if they are to be able to log in, but also be rostered on.

Thus, there are many people in the system who are not Users. They don’t have a username, and may not even have an email address. Not that having an email address is that useful in the django User model, as there is no unique constraint upon that.

So, I am currently kind-of using Person as a UserProfile object, but there are Person instances that do not have an associated User, and some of these are required to have an email address, and have first and last names. So, there is lots of duplication across these two tables. Which need to be kept in sync.

The solution I am looking at now moves in the other direction.

A Person is a subclass of User. It has the extra data that we need for our business logic (mobile phone number, company they work for), but I have also monkey-patched User to not require a username. We are moving towards using email addresses for login names anyway, so that isn’t a problem. That has its own concerns (not everyone has a unique email address, but there are workarounds for that).

But not every User will have a Person attached. The admin team’s logins will not (and this will be used to allow them to masquerade as another user for testing and bug-hunting purposes). So, we can’t just ignore the User class altogether and do everything with the Person class.

This is all well and good. I have an authentication backend that will return a Person object instead of a User object (if one matches the credentials). Things are looking good.

Except then I look in the admin interface. And there we have all of the Person objects' related User objects, in the User table. It would be nice if we only had the ‘pure’ Users in here, and all Person objects were just in their category.

So, I needed a way to filter this list.

Luckily, django’s admin has this capability. In my person/admin.py file, I had the following code:

1 from django.contrib import admin
2 from django.contrib import auth
3 
4 class UserAdmin(auth.admin.UserAdmin):
5     def queryset(self, request):
6         return super(UserAdmin, self).queryset(request).filter(person=None)
7 
8 admin.site.unregister(auth.models.User)
9 admin.site.register(auth.models.User, UserAdmin)

And, indeed, this works.

But then I found another User subclass. Now we needed a type of user that is distinct from Person (they are never rostered, are not associated with a given company, but do log into the system).

I wanted the changes to the admin to be isolated within the different apps, so I needed to be able to get the currently installed UserAdmin class, and subclass that to filter the queryset. So the code becomes (in both admin.py files):

 1 from django.contrib import admin
 2 from django.contrib import auth
 3 
 4 BaseUserAdmin = type(admin.site._registry[auth.models.User])
 5 
 6 class UserAdmin(BaseUserAdmin):
 7     def queryset(self, request):
 8         return super(UserAdmin, self).queryset(request).filter(foo=None)
 9 
10 admin.site.unregister(auth.models.User)
11 admin.site.register(auth.models.User, UserAdmin)

The only difference in the two files is the foo. This becomes whatever this sub-class’s name is. Thus, it is person in the person/admin.py file, and orguser in the orguser/admin.py file.

The next step is to change the backend so that it will automatically downcast the logged in user to their child class. Other people have detailed this in the past: mostly the performance issue vanishes here because we are only looking at a single database query for a single object.