Consider the following situation:
We have a set of items that we want to show in a random order. The order, however, should remain fixed for a period. The display of items will need to be paginated, and we have over 200,000 items.
If the ordering is not truly random, then we could have an expression index, and allow ordering based on that. However, that doesn’t really help out with the pagination, and issues around LIMIT/OFFSET ordering of very large sets.
Instead, I came up with a solution this afternoon that uses Postgres Materialised Views.
Let’s start with a Django model:
class Post(models.Model): title = models.TextField() content = models.TextField() posted_at = models.DateTimeField()
We can build up a materialised view that associates each post with a position:
CREATE MATERIALISED VIEW post_ordering AS SELECT post.id AS post_id, row_number() OVER () AS position FROM ( SELECT id FROM blog_post ORDER BY random() ) post; CREATE INDEX post_ordering_id ON post_ordering(post_id); CREATE INDEX post_ordering_position ON post_ordering(position);
Because a materialised view stores a copy of the data, we need to index it if we want to get performance benefits.
This materialised view is interesting from the pagination perspective because there are no gaps in the position values, even if there are missing post_id values. This means we can use filtering to paginate, instead of having to use the regular slicing notation.
Do note that the ordering needs to be done inside a subquery, otherwise it will not work correctly.
We do need to stick a model in front of this to access it from Django:
class PostPosition(models.Model): post = models.OneToOneField( primary_key=True, related_name='ordering', on_delete=models.DO_NOTHING, ) position = models.IntegerField() class Meta: managed = False db_table = 'post_ordering'
Now, because we have a relationship defined, we may filter using this:
page_2 = Post.objects.filter( ordering__position__gte=100, ordering__position__lt=200 ).order_by('ordering__position')
From here, we just need to create a custom paginator to use this instead of the normal slicing.
class PositionPaginator(django.core.paginators.Paginator): def page(self, number): number = self.validate_number(number) bottom = (number - 1) * self.per_page top = bottom + self.per_page if top + self.orphans >= self.count: top = self.count return self._get_page(self.object_list.filter( ordering__position__gte=bottom, ordering__position__lt=top ).order_by('ordering__position'), self)
The last piece of the puzzle is getting the refresh of the ordering. In postgres, you just need:
REFRESH MATERIALISED VIEW post_ordering;