Keyset Pagination in Django

Pagination is great. Nothing worse than having an HTML page that renders 25000 rows in a table.

Django Pagination is also great. It makes it super easy to declare that a view (that inherits from MultipleObjectMixin) should paginate its results:

class List(ListView):
    queryset = Foo.objects.order_by('bar', '-baz')
    paginate_by = 10
    template_name = 'foo.html'

Django pagination uses the LIMIT/OFFSET method. This is fine for smaller offsets, but once you start getting beyond a few pages, it can perform really badly. This is because the database needs to fetch all of the previous rows, even though it discards them.

Using Keyset Pagination allows for better performing “next page” fetches, at the cost of not being able to randomly fetch a page. That is, if you know the last element from page N-1, then you may fetch page N, but otherwise you really can’t.

Keyset Pagination, sometimes called the Seek Method, has been documented by Markus Winand and Joe Nelson. If you are not familiar with the concept, I strongly suggest you read the articles above.

Django’s pagination is somewhat pluggable: you may switch out the paginator in a Django ListView, for instance, allowing you to do things like switch the Page class, or how the various parts are computed. I used it recently to allow for a different query to be used when calculating the total number of objects in a queryset, to vastly improve performance of a particular paginated queryset.

However, there are limits. Both the view and the paginator expect (nay, demand) an integer page number, which, as we shall see shortly, will not work in this case. I also feel like the view is over-reaching it’s remit by casting the page number to an integer, as I’ll discuss below.

In order to get consistent results with any type of pagination, you must ensure that the ordering on the queryset is stable: that is, there are no rows that will be ‘tied’. Doing otherwise will mean that the database will “break the tie”, and not always in the same order. I’ve seen a bug that was extremely hard to track down that was caused by exactly this problem (and that was just with OFFSET pagination).

Often, to ensure stable ordering, the primary key is used as the “last” sort column. This is perfectly valid, but is not always necessary.

Because in many cases we will need to sort by multiple columns, we’ll need some mechanism for passing through to the paginator the “last value” in a given page for each of these columns. For instance, if we are ordering by timestamp and then group, we would need to pass through both the timestamp and the group of the last object. Because I like to use GET forms to allow me to paginate filtered results, I’ll want to have all of the values combined into one query parameter. If you were constructing links instead, you could look at having these as different parameters. However, you’d still need to be careful, because you aren’t filtering all results on these parameters. Having them serialised into a single parameter (using JSON) means that they are all in the one place, and you can just use that for the filtering to get the page results.

I’ve built a working implementation of keyset pagination, at least for forwards traversal, at django-keyset-pagination.

We can see from this that there really is not that much that we needed to do. We use a different Page object, which enables us to change what the next_page_number will generate. When I figure out how, it will also allow us to work out the previous_page_number

Likewise, we needed to change how we validate a page number, and how we fetch results for a page. That method, _get_page(number) is the one that does most of the work.

Ultimately, we wind up with a filter that looks like:

  WHERE (A < ?) OR (A = ? AND B > ?) OR (A = ? AND B = ? AND C < ?)

The direction of the test (< vs >) depends upon the sorting of that column, but hopefully you get the idea.

In order to enable the query planner to be able to use an index effectively (if one exists), we need to adjust this to (thanks Markus):

WHERE A <= ? AND (
  (A < ?) OR (A = ? AND B > ?) ...
)

It’s also possible, in Postgres at least, to use a ROW() constructor comparison to order rows. However, this only works if the direction of each column ordering is the same, which in my use case it was not. I have a proof of concept of using ROW() constructors, but I need to figure out how to detect if they are available to the database in use.

In order to use the new paginator, we need to work around some issues in the Django class based views: namely that they force an integer value (or use the special string last, neither of which are acceptable in this case):

class PaginateMixin(object):
    "Make pagination work for non integer page numbers"
    def paginate_queryset(self, queryset, page_size):
        # This is very similar to how django currently (2.1) does it: I may submit a PR to use this
        # mechanism instead, as it is more flexible.
        paginator = self.get_paginator(
            queryset, page_size, orphans=self.get_paginate_orphans(),
            allow_empty_first_page=self.get_allow_empty()
        )
        page_kwarg = self.page_kwarg
        page = self.kwargs.get(page_kwarg) or self.request.GET.get(page_kwarg) or 1

        try:
            page_number = paginator.validate_number(page_number)
        except ValueError:
            raise Http404(_('Page could not be parsed.'))

        try:
            page = paginator.page(page_number)
            return (paginator, page, page.object_list, page.has_other_pages())
        except InvalidPage as e:
            raise Http404(
                _('Invalid page (%(page_number)s): %(message)s') % {
                    'page_number': page_number,
                    'message': str(e)
                }
            )

There’s really only one change: instead of just casting the page number to an integer, we let the paginator handle that.

Okay, once all that is done, we can use our paginator:

class List(PaginateMixin, ListView):
    paginator_class = KeysetPaginator
    paginate_by = 10

    def get_queryset(self):
        return Foo.objects.order_by('timestamp', 'bar', 'baz')

We’ll need to change our template rendering to only render a next page link or button, rather than trying to render them for each page. We also don’t have any way to return to the previous page: I’m still working through a mechanism for that.


This post was originally written using the ROW() constructor, and this part of the post discussed the shortcomings. Now that has been resolved, the main shortcoming is that it is not yet possible to traverse to the previous page of results. In many cases that may not be necessary (we could use a browser’s back button, or rely on the fact that if it’s infinite scrolling the data is already in the document), however I would like to investigate how hard it is to actually get the previous page.

Leveraging HTML and Django Forms: Pagination of Filtered Results

Django’s forms are fantastic for parsing user input, but I’ve come up with a nice way to use them, in conjunction with HTML forms, for pagination, using the inbuilt Django pagination features.

It all stems from the fact that I’ve begun using forms quite heavily for GET purposes, rather than just for POST. Basically, anytime you have a URL that may have some parts of the query string that may need to be built, it’s simpler to use a form element, than to manually build up the url in your template.

Thus, where you may have something like:

<a href="{% url 'foo' %}?page={{ page }}">

It may be better do do something more like:

<form action="{% url 'foo' %}">
  <input type="hidden" name="page" value="{{ page }}">
</form>

Indeed, you can even use named buttons for submission, which will refer to the page. That is the key to the process outlined below.


Django comes with lots of “batteries”, including form handling and pagination. The Class Based Views (CBV) that deal with collections of objects will include pagination, although it is possible to use this pagination in your own views. For simplicity, we’ll stick with a simple ListView.

Let’s begin with that simple view: in our views.py:

from django.views.generic import ListView, DetailView

from .models import Person

person_list = ListView.as_view(
    queryset=Person.objects.all(),
    template_name='person/list.html',
    paginate_by=10,
)

person_detail = DetailView.as_view(
    queryset=Person.objects.all(),
    template_name='person/detail.html',
)

Essentially, that’s all you need to do. You could use implied template names, but I almost never do this. The takeaway from this block is that we are stating the queryset that our ListView will use as the base, the template it should render, and the number of items per page.

I’ve stubbed out the person_detail view, just so we can refer to it in our urlconf, and then in turn in the template. Because of the simplicity of it, we could have just done all of this in our urls.py.

Speaking of our urls.py, we have something like:

from django.conf.urls import url

import views

urlpatterns = [
    url(r'^people/$', views.person_list, name='person_list'),
    url(r'^people/(?P<pk>\d+)/', views.person_detail, name='person_detail'),
]

Then, in our template, we can render it as (ignoring the majority of the page):

<ul class="people">
  {% for object in object_list %}
    <li>
      <a href="{% url 'person_detail' pk=object.pk %}">
        {{ object }}
      </a>
    </li>
  {% endfor %}
</ul>

But this doesn’t give us our pagination. It will only show the first ten results, with no way to access the others. All we need to do to access the others is to append ?page=X, but, as we will see, there is another way.

Typically, your pagination block might look something like:

<ul class="pagination">
  <li>
    {% if page_obj.has_previous %}
      <a href="?page={{ page_obj.previous_page_number }}">
        prev
      </a>
    {% else %}
      <span>prev</span>
    {% endif %}
  </li>

  {% for page_number in paginator.page_range %}
    {% if page_number = page_obj.number %}
      <li class="active">
        <span>{{ page_number }}</span>
      </li>
    {% else %}
      <li>
        <a href="?page={{ page_number }}">
          {{ page_number }}
        </a>
      </li>
    {% endif %}
  {% endfor %}


  <li>
    {% if page_obj.has_next %}
      <a href="?page={{ page_obj.next_page_number }}">
        next
      </a>
    {% else %}
      <span>next</span>
    {% endif %}
  </li>
</ul>

Depending upon your CSS framework, if you use one, there may already be some pre-prepared styles to help you out with this.

This is all well and good, until you want paginated search results. Then, you can no longer rely on being able to rely on using ?page=N, as this would remove any search terms you were already using. Also, if you were using ajax to fetch and display stuff, you may need to use the whole URL, rather than just the query string.

Instead, we can use a Django form for searching, and just add in the pagination bits.

We will build a page that displays an optionally filtered list of people.

Our Person model will be deliberately simple:

from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=256)

Likewise, our form will be simple. All we need to do is have the form able to filter our queryset.

from django import forms

class PersonSearchForm(forms.Form):
    query = forms.CharField(label=_('Filter'), required=False)

    def filter_queryset(self, request, queryset):
        if self.cleaned_data['name']:
            return queryset.filter(name__icontains=self.cleaned_data['query'])
        return queryset

Finally, we will need to subclass a ListView. We’ll mixin from FormMixin, so we get the form-handling capabilities:

from django.views.generic.edit import FormMixin
from django.views.generic import ListView

class FilteredListView(FormMixin, ListView):
    def get_form_kwargs(self):
        return {
          'initial': self.get_initial(),
          'prefix': self.get_prefix(),
          'data': self.request.GET or None
        }

    def get(self, request, *args, **kwargs):
        self.object_list = self.get_queryset()

        form = self.get_form(self.get_form_class())

        if form.is_valid():
            self.object_list = form.filter_queryset(request, self.object_list)

        context = self.get_context_data(form=form, object_list=self.object_list)
        return self.render_to_response(context)

There’s a little bit to comment on there: we override the get_form_kwargs so we pull our form’s data from request.GET, instead of the default.

We also override get, so we filter results if the form validates (which it will if there was data provided). We delegate responsibility for the actual filtering to the form class.

Everything else is just standard.

We will want to actually use this view:

people_list = FilteredListView.as_view(
    form_class=PersonSearchForm,
    template_name='person/list.html',
    queryset=Person.objects.all(),
    paginate_by=10
)

Now we need to render this.

<form id="person-list-filter" action="{% url 'person_list' %}">
  <input name="{{ form.query.html_name }}" value="{{ form.query.value }}" type="search">
  <button type="submit" name="page" value="1">{% trans 'Search' %}</button>
</form>

<div class="results">
  {% include 'person/list-results.html' %}
</div>

You may notice that the search button will result in page=1 being used. This is deliberate.

Our person/list-results.html is just the same as what our person/list.html looked like before, with the addition of the pagination template inclusion.

{% include 'pagination.html' with form_target='person-list-filter' %}

<ul class="people">
  {% for object in object_list %}
    <li>
      <a href="{% url 'person_detail' pk=object.pk %}">
        {{ object }}
      </a>
    </li>
  {% endfor %}
</ul>

Our pagination.html is very similar to how our other template above looked too, but using <button> elements instead of <a>, and we will disable those that should not be clickable. Also, the buttons contain an attribute indicating which form they should be bound to.

<ul class="pagination">
  <li>
    <button
      form="{{ form_target }}"
      {% if page_obj.has_previous %}
        name="page"
        value="{{ page_obj.previous_page_number }}"
        type="submit"
      {% else %}
        disabled="disabled"
      {% endif %}>
      prev
    </button>
  </li>

  {% for page_number in paginator.page_range %}
    <li class="{% if page_number = page_obj.number %}active{% endif %}">
      <button
        name="page"
        value="{{ page_number }}"
        type="submit"
        form="{{ form_target }}"
        {% if page_number = page_obj.number %}
          disabled="disabled"
        {% endif %}>
        {{ page_number }}
      </button>
    </li>
  {% endfor %}

  <li>
    <button
      form="{{ form_target }}"
      {% if page_obj.has_next %}
        name="page"
        value="{{ page_obj.next_page_number }}"
        type="submit"
      {% else %}
        disabled="disabled"
      {% endif %}>
      next
    </button>
  </li>
</ul>

We are getting close now. This will be enough to have clicking on the next/previous or page number buttons resubmitting our search form, resulting in the page reloading with the correct results.

But we can do a bit better. We can easily load the results using AJAX, and just insert them into the page.

We just need one additional method on our View class:

class FilteredListView(FormMixin, ListView):
    # ...

    def get_template_names(self):
        if self.request.is_ajax():
            return [self.ajax_template_name]
        return [self.template_name]

    # ...

and one addition to our view declaration:

people_list = FilteredListView.as_view(
    form_class=PersonSearchForm,
    template_name='person/list.html',
    ajax_template_name='person/list-results.html',
    queryset=Person.objects.all(),
    paginate_by=10,
)

I’ll use jQuery, is it makes for easier to follow code:

// Submit handler for our form: submit it using AJAX instead.
$('#person-list-filter').on('submit', function(evt) {
  evt.preventDefault();

  var form = evt.target;

  $.ajax({
    url: form.action,
    data: $(form).serialize(),
    success: function(data) {
      $('#results').html(data)
    }
  });
});

// Because we are using buttons, which ajax submit will not send,
// we need to add a hidden field with the relevant page number
// when we send our request.
$('#person-list-filter').on('click', '[name=page]', function(evt) {
  var $button = $(evt.target).closest('button');
  var $form = $button[0].form;

  if (!$form.find('[type=hidden][name=page]')) {
    $form.append('<input type="hidden" name="page">');
  }

  $form.find('[type=hidden][name=page]').val($button.val());

  $form.submit();
});

That should do nicely.


There is another thing that we need to think about. If we leave the next/prev buttons, then we need to handle multiple clicks on those buttons, which fetch the subsequent page, and possibly cancel the existing AJAX request.

I do have a solution for this, too, although it complicates things a fair bit. First, we need to add some attributes to the next/prev buttons:

<ul class="pagination">
  <li>
    <button
      form="{{ form_target }}"
      {% if page_obj.has_previous %}
        name="page"
        value="{{ page_obj.previous_page_number }}"
        type="submit"
        data-increment="-1"
        data-stop-at="1"
      {% else %}
        disabled="disabled"
      {% endif %}>
      prev
    </button>
  </li>

  {% for page_number in paginator.page_range %}
    <li class="{% if page_number = page_obj.number %}active{% endif %}">
      <button
        name="page"
        value="{{ page_number }}"
        type="submit"
        form="{{ form_target }}"
        {% if page_number = page_obj.number %}
          disabled="disabled"
        {% endif %}>
        {{ page_number }}
      </button>
    </li>
  {% endfor %}

  <li>
    <button
      form="{{ form_target }}"
      {% if page_obj.has_next %}
        name="page"
        value="{{ page_obj.next_page_number }}"
        type="submit"
        data-increment="1"
        data-stop-at="{{ paginator.num_pages }}"
      {% else %}
        disabled="disabled"
      {% endif %}>
      next
    </button>
  </li>
</ul>

And our click handler changes a bit too:

$('#person-list-filter').on('click', 'button[name=page]', function() {
  var page = parseInt(this.value, 10);
  var $form = $(this.form);
  // Only update the value of the hidden form.
  if (!$form.find('[name=page][type=hidden]')) {
    $form.insert('<input name=page type=hidden>');
  }
  $form.find('[name=page][type=hidden]').val(page);
  // Increment any prev/next buttons values by their increment amount,
  // and set the disabled flag on any that have reached their stop-at
  $form.find('[data-increment]').each(function() {
    this.value = parseInt(this.dataset.increment, 10) + page;
    // We want to disable the button if we get to the 'stop-at' value,
    // but this needs to happen after any submit events have occurred.
    if (this.dataset.stopAt) {
      setTimeout(function() {
        this.disabled = (this.value == this.dataset.stopAt);
      }.bind(this), 0);
    }
  });

  $form.submit();
});

Since this was posted, I have written a number of pages that use this pattern. Some of the improvements that could be made are listed below:

It’s possible to have these results automatically update as the user types. Obviously, this only makes sense if we have AJAX submission happening!

$('#person-list-filter').on('keyup', function() {
  this.submit();
})

If you have lots and lots of results, you probably won’t want to show every button. Often you will see the first few, and a couple either side of the current page (and sometimes the last few). This is almost possible to do with pure CSS, but not quite. I do have a solution for this, but it’s probably worthy of a complete post of its own.

Another situation that is likely to happen is this:

  • User clicks on a page other than page 1 of results. Let’s say page N.
  • User enters text in search field which results in fewer than N pages of results being available.
  • User gets error message.

We can fix this with an overridden method:

class FilteredListView(FormMixin, ListView):
    # ...

    def paginate_queryset(self, queryset, page_size):
        try:
            return super(FilteredListView, self).paginate_queryset(queryset, page_size)
        except Http404:
            self.kwargs['page'] = 'last'
            return super(FilteredListView, self).paginate_queryset(queryset, page_size)

    # ...

You’ll also need to add in a get_prefix() method if you are using an old Django, but really you should just upgrade.


Updated: I’ve added in some more error checking into the templates, to prevent exceptions when attempting to render previous and next page links (thanks inoks).

Updated: I’ve changed to use the preferred urlpattern syntax. (thanks knbk).

Updated: Delegate to the form for filtering. Add discussion of other extensions. Add button[form] attributes.