ESPHome and HomeKit

HomeKit is Apple’s home automation software, and it’s pretty nice. It is local by default, and although it can be configured to allow access from outside your home network, it does so in a way that does not store all of your data on an external server. Instead, the hub (an Apple TV, iPad or HomePod that lives in your house) can be contacted by your iOS or macOS device, and they interact that way.

I’ve been playing around with the ESP8266 microcontroller, and whilst you could put a complete HomeKit Accessory Protocol client on there, it’s probably going to run too slowly to be useful. Instead, I’ve written an MQTT to HomeKit bridge, and just have simple devices that send data to the local MQTT broker.

I’ve even set up a different wireless network, just for those devices. They cannot connect to the outside internet, only to the MQTT broker (which is in turn able to connect to the HomeKit hub). The broker and bridge can comfortably run on a Raspberry Pi Zero W.

The story for writing custom firmware for the IoT devices is a bit more complex. I started out with MicroPython, and then (after running into problems with devices stopping and requiring a reset) moved to an Arduino approach. Following that (and after meeting the author of MicroPython), I went back to MicroPython. But as I thought more about it, there were a bunch of features I wanted to implement, and my deployment process was somewhat awkward.

I’d looked at the various packaged firmware options: Espurna and Tasmota in particular. Both of these have a bunch of people using them (which is good, because then I don’t have to reimplement everything). But neither was quite able to do what I wanted, which was just to send messages to a specific MQTT topic (and perhaps listen on that one too).

Then I came across ESPHome. This enables you to use YAML to define a device and its services, and is far more flexible than the prepackaged firmware. You can run the build tool in Docker, which compartmentalises it and means you don’t have to worry about maintaining an installation of the tool. And it can do OTA updates, as well as provide access to logs over WiFi!

So, now, I can have a really nice workflow for building and deploying firmware. For example, here’s the configuration for one of my temperature sensors:

esphome:
  name: temperature_01
  platform: ESP8266
  board: nodemcuv2

wifi:
  ssid: "SSID"
  password: "PASSWORD"

# Enable logging
logger:

i2c:
  - id: bus_a
    sda: D2
    scl: D1
    scan: True

sensor:
  - platform: bmp280
    i2c_id: bus_a
    address: 0x76
    temperature:
      name: "Temperature 1"
      state_topic: HomeKit/temperature_01/TemperatureSensor/CurrentTemperature
    pressure:
      name: "Inverter pressure 1"
  - platform: wifi_signal
    name: "WiFi signal sensor"
    update_interval: 60s

mqtt:
  broker: "mqtt.lan"
  discovery: false
  topic_prefix: esphome/temperature_01

In addition to this, I came across the tuya-convert package, and the Mirabella WiFi bulbs that are $15 at Kmart, so now I have a couple of those. I’m still not happy about not having the smarts at the switch, but since I don’t have neutral wires in the switch, that’s a hard problem to solve.

Functions as Tables in Django and Postgres

Postgres has some excellent features. One of these is set-returning functions. It’s possible to have a function (written in SQL, or some other language) that returns a set of values. For instance, the in-built function generate_series() returns a set of values:

SELECT day
  FROM generate_series(now(),
                       now() + INTERVAL '1 month',
                       '1 day') day;

This uses a set-returning function as a table source: in this case, a single-column table.

You can use scalar set-returning functions from within Django relatively easily: I blogged about it last year.
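For example, one way to consume the query above from Django (not necessarily the approach from that post) is with a raw cursor:

from django.db import connection

def days_in_next_month():
    # Iterate over the next month of days, using the generate_series()
    # query from above.
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT day FROM generate_series("
            "now(), now() + INTERVAL '1 month', '1 day') day"
        )
        return [row[0] for row in cursor.fetchall()]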

It is possible to create your own set-returning functions. Further, the return type can be a SETOF any table type in your database, or even a “new” table.

CREATE OR REPLACE FUNCTION foo(INTEGER, INTEGER)
RETURNS TABLE(id INTEGER,
              bar_id INTEGER,
              baz JSON[]) AS $$

  SELECT foo.id AS id, bar.id AS bar_id,
         ARRAY_AGG(JSON_BUILD_OBJECT(bar.x, foo.y))
    FROM foo
   INNER JOIN bar ON (foo.id = bar.foo_id)
   WHERE foo.y = $1
     AND bar.x > $2
   GROUP BY foo.id, bar.id

$$ LANGUAGE SQL;


It’s possible to have a Postgres VIEW as the data source for a Django model (you just set the Meta.db_table on the model, and mark it as Meta.managed = False). Using a FUNCTION is a bit trickier.
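As a sketch, assuming a view named foo_summary already exists in the database (the view needs to expose an id column for Django’s implicit primary key):

class FooSummary(models.Model):
    # Hypothetical model over a database view "foo_summary".
    total = models.IntegerField()

    class Meta:
        db_table = 'foo_summary'
        managed = False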

You can use the QuerySet.raw() method, something like:

qs = Foo.objects.raw('SELECT * FROM foo(%s, %s)', [x, y])

The downside of using raw is that you can’t apply annotations, or use .filter() to limit the results.

What would be useful is if you could extract the relevant parameters out of a QuerySet, and inject them as the arguments to the function call.

But why would we want to have one of these set (or table) returning functions? Why not write a view?

I have some complex queries that reference a whole bunch of different tables. In order to be able to write a sane query, I decided to use a CTE (Common Table Expression). This allows me to write the query in a more structured manner:

WITH foo AS (
  SELECT ...
    FROM foo_bar
   WHERE ...
),
bar AS (
  SELECT ...
    FROM foo
   WHERE ...
   GROUP BY ...
)
SELECT ...
  FROM bar
 WHERE ...

There is a drawback to this approach, specifically how it interacts with Django. We can turn a query like this into a view, but any filtering we want to do using the ORM will only apply after the view has executed. Normally, this is not a problem, because Postgres can “push down” the filters it finds in the query into the view.

But older versions of Postgres are unable to perform this optimisation on a CTE: each clause of a CTE acts as an “optimisation fence”, and must be fully evaluated before the next one can run. In practice, if a clause of a CTE is not referenced later on, Postgres will not execute that clause, but that is the extent of the optimisation. (Postgres 12 changed this: CTEs can now be inlined, with MATERIALIZED/NOT MATERIALIZED to control the behaviour.)

So, if we had 50 million objects in foo_bar, and we needed to filter them in a dynamic way (i.e., from the ORM), we would be unable to do this efficiently. The initial clause would execute for all 50 million rows, any subsequent clauses would then include all of these, and so on. The filtering would only happen after the view had returned all of its rows.

The workaround, using a function, is to use the parameters to do the filtering as early as possible:


CREATE OR REPLACE FUNCTION filtered_foo(INTEGER, INTEGER, INTEGER)
RETURNS SETOF ... AS $$

  WITH foo_1 AS (
    SELECT *
      FROM foo_bar
     WHERE x BETWEEN $1 AND $2
  ),
  bar AS (
    SELECT *
      FROM foo_1
     INNER JOIN baz USING (baz_id)
     WHERE baz.qux > $3
  )
  SELECT ...
    FROM bar
   GROUP BY ...

$$ LANGUAGE SQL;


Notice that we do the filtering on foo_bar as early as we possibly can, and likewise filter baz the first time we reference it.

Now that we have established why we might want to use a function as a model source, how do we go about doing that?

We are going to dig fairly deep into the internals of Django’s ORM now, so tighten up your boots!

When Django comes across a .filter() call, it looks at the arguments, and applies them to a new copy of the QuerySet’s query object: or, more specifically, to the query.where node. This has a list of children, which Django will turn into SQL and execute later. The QuerySet does some validation at this point: we may only use those fields known to the QuerySet (either through being fields of the Model, or those added using .annotate()). Any others will result in an exception.

This will require some extension, as it is possible that one or more arguments to a Postgres function may not be fields on the Model class used to define the shape of the results.

Each Node within a QuerySet’s query has a method: .as_sql(). This is the part that turns the python objects into actual SQL code. We are going to leverage the fact that even the python object that refers to the table itself uses .as_sql() to safely allow passing parameters to the function-as-a-table.

Normally, the .as_sql() method of the BaseTable object returns just the name of the table (possibly with the current alias attached), and an empty list as params. We can swap this class out with one that will return an SQL fragment, containing function_name(%s, %s) (with the correct number of %s arguments), and a list containing those parameters.
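As a rough sketch of what that replacement might look like (the class name and constructor are my own; alias handling is glossed over, and this is not the published code):

from django.db.models.sql.datastructures import BaseTable

class FunctionTable(BaseTable):
    # Hypothetical: render "foo(%s, %s)" instead of a bare table name.
    def __init__(self, table_name, alias, params):
        super().__init__(table_name, alias)
        self.params = list(params)

    def as_sql(self, compiler, connection):
        placeholders = ', '.join(['%s'] * len(self.params))
        return '{}({})'.format(self.table_name, placeholders), self.params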

Every Postgres function has a unique signature: the function name, plus the list of parameter types. Thus, Postgres will deem the functions:

foo(INTEGER, INTEGER)
foo(INTEGER, INTEGER, BOOLEAN)

as distinct entities. We will ignore for now the fact that it is possible to have optional arguments, variadic arguments and polymorphic functions.

We need some way of storing what the signature of a Postgres function is. Initially, I used an analog (perhaps even just a subclass) of Django’s Model class. This enabled me to create (temporary) Exact(Col()) nodes to go in the query.where.children list, to be later removed and used for function arguments. I needed to ignore the primary key field, and it always felt wrong to add to the WhereNode, only to remove later.

I ended up settling on a class like Django’s Form class. It uses a metaclass (and requires a Meta.function_name), and uses Django’s form fields to define the arguments.

class FooFunction(Function):
    x = forms.IntegerField()
    y = forms.IntegerField()
    z = forms.BooleanField(required=False, default=True)

    class Meta:
        function_name = 'foo'

A Function subclass can generate a Manager on a Model class, and can also create the object that will in turn create the relevant SQL. That part happens automatically, when a queryset created from the manager is filtered using appropriate arguments. The Function subclass uses its fields to validate that the params it will be passing are valid for their types. It’s a bit like the clean() method on a Form.

We then also need a model class (it could be a model you have already defined, if your function returns a SETOF <table-name>):

class Foo(models.Model):
    bar = models.ForeignKey('foo.Bar', related_name='+', on_delete=models.DO_NOTHING)
    baz = JSONField()

    objects = FooFunction.as_manager()

    class Meta:
        db_table = 'foo()'
        managed = False

Because this is a new model that doesn’t map to a real table, we need to set on_delete so that the ORM won’t try to cascade deletes, and mark the model as unmanaged. I also set the Meta.db_table to the function call without arguments, so it looks nicer in repr(queryset). It would be nice to be able to get the actual arguments in there, but I haven’t managed that yet.

If this were just a different way of fetching an existing model, you’d just need to add a new manager to that model. Keep in mind that Django needs a primary key field, so ensure your function provides one.

Then, we can perform a query:

>>> print(Foo.objects.filter(x=1, y=2).query)
SELECT foo.id, foo.bar_id, foo.baz FROM foo(1, 2)
>>> print(Foo.objects.filter(x=1, y=2, z=True).query)
SELECT foo.id, foo.bar_id, foo.baz FROM foo(1, 2, true)
>>> print(Foo.objects.filter(x=1, y=2, bar__gt=5).query)
SELECT foo.id, foo.bar_id, foo.baz FROM foo(1, 2) WHERE foo.bar_id > 5
>>> print(Foo.objects.filter(x=1, y=2).annotate(qux=models.F('bar__name')).query)
SELECT foo.id, foo.bar_id, foo.baz, bar.name AS qux
FROM foo(1, 2) INNER JOIN bar ON (bar.id = foo.bar_id)

Note that filters that are not function parameters will be applied after the function call. You can annotate, and it will automatically create a join through a foreign key. If you omit a required parameter, you’ll get an exception.

So, where’s the code for this?

Well, I need to clean it up a bit, and write up some automated tests on it. Expect it soon.

Orders please!

Things are better when they are ordered.

Why might we want things ordered?

From a user experience perspective, if things are not ordered the same way upon repeated viewings, users are liable to get confused, and think that things have changed, when really it’s only the order that has changed.

Some time ago, there were versions of Microsoft Office that used to move menu items closer to the start of the menu when they had been used more recently and more frequently. This is not “most recent documents”, or anything like that: this was menu items like “Format” and other tools. There’s a pretty good rationale for why they stopped doing that!

Why might we need things ordered?

There are a few reasons why we must have things ordered. One good example is the reason Django will complain about a queryset without ordering when you use pagination: if you select the first 10 items when there is no ordering, and then select the next 10 items, there is no guarantee that the second set will not contain some of the same items as the first. Without ordering, pagination is useless: there’s no way to page through items and ensure that you see all of them.
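As a quick illustration, using a hypothetical Item model: without an ordering, two “pages” are not guaranteed to be disjoint, nor to cover everything.

# Unordered: these two slices may overlap, or miss items entirely.
page_one = Item.objects.all()[:10]
page_two = Item.objects.all()[10:20]

# Ordered (with a unique tie-breaker): pagination is now stable.
page_one = Item.objects.order_by('name', 'pk')[:10]
page_two = Item.objects.order_by('name', 'pk')[10:20]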

Another reason is where you need to process things in a specific order, to ensure that the first (or last) matching item is “the winner”. We’ll come across one of those shortly.

Ordering by a specific value or set of values (or function over the same) is usually a sane way to order. This could be alphabetically, by date of insertion, or some numeric value, for instance.

How should we order things?

Sometimes, it is obvious how we should order a list of things. Sometimes the order doesn’t matter, and sometimes we want to allow an ordering that doesn’t rely at all on any attributes of the objects in question.

Let’s look at a list of condition names that might be used for a workplace agreement in Australia.

  • Regular Hours
  • Public Holiday
  • Saturday
  • Sunday
  • After 6pm
  • Regular Overtime (first 2 hours)
  • Extended Overtime (more than 2 hours)

For those outside of Australia: under many work circumstances, employees will get paid at a higher rate because of what are called “Penalty” conditions. This could include being paid, for example, a 15% loading on a Saturday, and a 25% loading on a Sunday. We can rewrite this list with the loading percentages:

Condition           Penalty   Rule
Regular Hours       100%      When no other rules apply
Public Holiday      200%      Any hours on a public holiday
Saturday            125%      Any hours on a Saturday
Sunday              150%      Any hours on a Sunday
After 6pm           117%      Any hours between 6pm and midnight
Regular Overtime    150%      After 8 hours worked in a day, or 38 hours in a week
Extended Overtime   200%      After 2 hours of overtime worked, or any overtime on a Sunday

So now we have two possible ways of ordering our list, by name:

  • After 6pm
  • Extended Overtime (more than 2 hours)
  • Public Holiday
  • Regular Hours
  • Regular Overtime (first 2 hours)
  • Saturday
  • Sunday

Or by penalty rate:

  • 100% Regular Hours
  • 117% After 6pm
  • 125% Saturday
  • 150% Regular Overtime (first 2 hours)
  • 150% Sunday
  • 200% Extended Overtime (more than 2 hours)
  • 200% Public Holiday

However, there’s actually a little more complexity than this. We can keep the rules in this order, apply them sequentially, and we should get the right result for an employee’s wages. But according to our rules, we should be applying Extended Overtime for any overtime hours on a Sunday, which ordering by penalty rate alone doesn’t capture. So we need some way of controlling the ordering manually.

Specifically, we want to have our conditions ordered in a way that we can check each rule in turn, and ensure that the last rule that is applied will be the one that should apply, if multiple rules do match.

  • Regular hours
  • After 6pm
  • Saturday
  • Sunday
  • Regular Overtime
  • Extended Overtime
  • Public Holiday

So, how do we actually store this custom ordering?

The naive solution is to store an integer for each item, and ensure that we only have one of each value. Let’s look at that as a Django model:

class Condition(models.Model):
    name = models.TextField()
    penalty = models.DecimalField(max_digits=5, decimal_places=2)
    position = models.IntegerField(unique=True)

Then, we can ensure we get our conditions in the correct order by applying an ordering to our queryset:

queryset = Condition.objects.order_by('position')

However, this solution is not without its drawbacks. Primarily, to insert an object between two others, we must renumber all of the subsequent items:

condition = Condition(name='New condition', position=3, penalty=Decimal('1.20'))
Condition.objects.filter(position__gte=condition.position).update(position=F('position') + 1)
condition.save()

We must ensure that, within our database, we are able to suspend constraint checking until our transaction is complete, or we must perform the updates in a specific order. If we care about not having “holes” between our positions, then we also need to renumber the remaining objects when we remove one.
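For instance, a sketch of a delete that keeps the sequence dense (the function name is mine):

from django.db import transaction
from django.db.models import F

def remove_condition(condition):
    # Close up the gap left by the removed condition. Note that the
    # single UPDATE checks the unique constraint row by row, so the
    # constraint needs to be deferrable (or the rows updated in
    # ascending position order) for this to be safe.
    with transaction.atomic():
        position = condition.position
        condition.delete()
        Condition.objects.filter(position__gt=position) \
                         .update(position=F('position') - 1)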

Let’s have a look at how we could handle this, with some client and server code. We can use a formset here, which permits ordering, and then we just need the JS to make that happen:

ConditionPositionFormSet = modelformset_factory(Condition, can_order=True, fields=())

class OrderConditionsView(FormView):
    form_class = ConditionPositionFormSet

This will result in rewriting all items one by one, which must happen in a transaction, or in the correct order.

Alternatively, we could use a form that just indicates the new position of a given condition, and have the view handle the UPDATE queries: this results in just two queries instead.

class ConditionPositionForm(forms.ModelForm):
    class Meta:
        model = Condition
        fields = ('position',)


class OrderConditionsView(UpdateView):
    """
    Handle an AJAX request to move one condition to a new position.

    The condition in that position will be moved to the next position, and so on.
    """
    form_class = ConditionPositionForm

    def form_valid(self, form):
        position = form.cleaned_data['position']
        # Make room: shift this position, and every one after it, along by one.
        self.get_queryset().filter(position__gte=position) \
                           .exclude(pk=form.instance.pk) \
                           .update(position=F('position') + 1)
        form.save()
        return HttpResponse(json.dumps({'status': 'ok'}))

    def form_invalid(self, form):
        return HttpResponse(json.dumps({
            'status': 'error',
            'message': [str(error) for error in form.errors.get('position', [])],
        }), status=409)

    def get_queryset(self):
        return Condition.objects.order_by('position')

# simplified

urlpatterns = [
    path('condition/', include(([
        path('<int:pk>/position/', OrderConditionsView.as_view(), name='position'),
        # ...
    ], 'condition'), namespace='condition')),
    # ...
]
In this case, our HTML and JS can just send back the condition that was moved, and its new position. We’ll want some mechanism for indicating whether the move was successful though, because it may not be permitted to move some objects to specific positions.

<ul id="conditions">
  {% for condition in object_list %}
    <li data-position-url="{% url 'condition:position' pk=condition.pk %}">
      {{ condition.name }} ({{ condition.penalty }}%)
    </li>
  {% endfor %}
</ul>

// Uses SortableJS: https://github.com/SortableJS/Sortable

require(['Sortable'], (Sortable) => {
  function save(evt) {
    const condition = evt.item;

    function revert() {
      const parent = condition.parentElement;
      if (evt.oldIndex < evt.newIndex) {
        // Move back in front of the item now at the old index.
        parent.insertBefore(condition, parent.children[evt.oldIndex]);
      } else {
        // The old index is actually in the wrong spot now: move back in
        // front of the item after it (or to the end of the list).
        parent.insertBefore(condition, parent.children[evt.oldIndex + 1] || null);
      }
    }

    // Note: a real view would also need CSRF handling.
    fetch(condition.dataset.positionUrl, {
      method: 'POST',
      body: new URLSearchParams({position: evt.newIndex}),
      credentials: 'same-origin'
    }).then(
      response => response.json()
    ).then(
      response => {
        if (response.status === 'error') {
          // We got an error: let's put the object back into the previous
          // position and then show that error.
          // Maybe a nicer way than this though!
          revert();
          alert(response.message);
        }
      }
    );
  }

  new Sortable(document.getElementById('conditions'), {onUpdate: save});
});


So, are there any other ways we could store the positions of our conditions, which would allow us to insert a new one, or move one around, without having to renumber a bunch of other items?

Why yes, yes there are.

Instead of storing each position as an integer, we can store two numbers, and use the ratio of these two numbers as our ordering:

POSITION = ExpressionWrapper(
  # Multiply by 1.0 to force a float result: both fields are integers,
  # and Postgres would otherwise perform integer division.
  models.F('position_a') * 1.0 / models.F('position_b'),
  output_field=models.FloatField(),
)
class PositionQuerySet(models.query.QuerySet):
    def with_position(self):
        return self.annotate(position=POSITION)

    def order_by_position(self):
        return self.order_by(POSITION)

class Condition(models.Model):
    name = models.TextField()
    penalty = models.DecimalField(max_digits=5, decimal_places=2)
    position_a = models.IntegerField()
    position_b = models.IntegerField()

    objects = PositionQuerySet.as_manager()

    class Meta:
        unique_together = (
            ('position_a', 'position_b'),  # In fact, this is not quite strict enough, more below.
        )

The logic behind this is that there is always the ability to find a fraction that lies between two other fractions. Our initial “first” item can be given the values (1, 1), which evaluates to 1.0.

To place another item, we have three cases:

  • The new value is before the first item in the list.
  • The new value is between two existing items.
  • The new value is after the last item in the list.

Let’s look at how we can calculate the position for the newly added item (a code sketch of these rules follows the list):

  • Before the first item in the list:

    Increment the denominator (position_b), and use the same numerator (position_a).

    For instance, adding a new first value where our current first value is (1, 1) would give us (1, 2), which evaluates to 0.5, less than 1.0

  • After the last item in the list:

    Increment the numerator (position_a), and use the same denominator (position_b).

    For instance, adding a new value where the last value is (5, 1) would give us (6, 1), which evaluates to 6.0, greater than 5.0

  • Inserting an item between two existing items.

    Use the sum of the two numerators (position_a) as the numerator, and the sum of the two denominators (position_b) as the new denominator. Optionally, you can then simplify this fraction if there are common factors.

    For instance, if we have two values (1, 4) and (5, 1), we can make a new value (6, 5), which evaluates to 1.2, which is between 0.25 and 5.0.
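These three cases translate directly into code. Here is a minimal sketch in Python; the “between” case uses the mediant of the two neighbouring fractions, which always lies strictly between them:

from fractions import Fraction

def position_between(prev, next):
    # Each argument is a (position_a, position_b) pair, or None.
    if prev is None and next is None:
        return (1, 1)                    # The only item in the list.
    if prev is None:
        return (next[0], next[1] + 1)    # Before the first item.
    if next is None:
        return (prev[0] + 1, prev[1])    # After the last item.
    # Mediant: a/b, c/d -> (a + c)/(b + d).
    return (prev[0] + next[0], prev[1] + next[1])

# From the example above: between (1, 4) and (5, 1) we get (6, 5) == 1.2.
assert Fraction(*position_between((1, 4), (5, 1))) == Fraction(6, 5)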

The same operations can be performed on a move: we don’t actually care what the values are, just that the relative values are still consistent. Whenever we move an object, we just change its position_a and/or position_b to whatever makes sense for the new position.

<ul id="conditions">
  {% for condition in object_list %}
    <li class="condition"
        data-position-url="{% url 'condition:position' pk=condition.pk %}"
        data-position-a="{{ condition.position_a }}"
        data-position-b="{{ condition.position_b }}">
      {{ condition.name }}
    </li>
  {% endfor %}
</ul>

require(['Sortable'], (Sortable) => {
  function save(evt) {
    const item = evt.item;
    const prev = item.previousElementSibling;
    const next = item.nextElementSibling;

    // Note: data-position-a appears as dataset.positionA in the DOM.
    if (prev == null && next == null) {
      // This is the only item in the list.
      item.dataset.positionA = 1;
      item.dataset.positionB = 1;
    } else if (prev == null) {
      // This is now the first item in the list.
      item.dataset.positionA = next.dataset.positionA;
      item.dataset.positionB = Number(next.dataset.positionB) + 1;
    } else if (next == null) {
      // This is now the last item in the list.
      item.dataset.positionA = Number(prev.dataset.positionA) + 1;
      item.dataset.positionB = prev.dataset.positionB;
    } else {
      // Neither the first nor the last: take the mediant of the neighbours.
      item.dataset.positionA = Number(prev.dataset.positionA) + Number(next.dataset.positionA);
      item.dataset.positionB = Number(prev.dataset.positionB) + Number(next.dataset.positionB);
    }

    function revert() {
      const parent = item.parentElement;
      if (evt.oldIndex < evt.newIndex) {
        // Move back in front of the item now at the old index.
        parent.insertBefore(item, parent.children[evt.oldIndex]);
      } else {
        // The old index is actually in the wrong spot now: move back in
        // front of the item after it (or to the end of the list).
        parent.insertBefore(item, parent.children[evt.oldIndex + 1] || null);
      }
    }

    // This could do with some error handling...
    fetch(item.dataset.positionUrl, {
      method: 'POST',
      body: new URLSearchParams({
        position_a: item.dataset.positionA,
        position_b: item.dataset.positionB
      }),
      credentials: 'same-origin'
    }).then(response => response.json()).then(
      response => {
        if (response.status === 'error') {
          revert();
        }
      }
    );
  }

  new Sortable(document.getElementById('conditions'), {onUpdate: save});
});

Detecting queries in Django tests

Putting this here so I can find it next time I need to know it…

Django has a useful test assertion (assertNumQueries) that you can use to ensure you make a set number of queries. However, at times this is a bit less useful than it needs to be: something changes, and we do indeed have a different number of queries, but it’s got nothing to do with the actual code under test.
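For reference, it’s used like this (do_the_thing() standing in for whatever code is under test):

from django.test import TestCase

class TestQueryCount(TestCase):
    def test_makes_three_queries(self):
        with self.assertNumQueries(3):
            do_the_thing()  # The code under test.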

If you are running in DEBUG=True mode, then you can examine the queries that have been made to the database connection, and ensure the raw SQL of a specific query matches (and is not duplicated, for instance).

This does require a little bit of trickery:

from django.db import connection
from django.test import TestCase, override_settings

from foo.factories import FooFactory

class TestTheThing(TestCase):
    def test_no_update_query(self):
        foo = FooFactory()

        # Our Foo instance should be smart enough to notice that nothing
        # has changed, and thus should not emit an UPDATE query.
        with override_settings(DEBUG=True):
            foo.save()
            self.assertEqual([], [
                x for x in connection.queries
                if 'UPDATE' in x['sql']
            ])

This is a bit of a contrived case: in my case today it was a celery task that only updated an object if the incoming data differed from the saved data, so there was by necessity a SELECT query to get the object, and then an UPDATE only if the data had changed.

The trick is that we use a context manager to set DEBUG to True: this means Django will start capturing queries, and stop at the end of the context. This also means you can have a bunch of these in the same test case, and each one will have an independent set of queries.

Partial, Deferrable Unique Constraints

I had a bug I came across today that was easiest to reproduce by building up a test case that uses a Factory from factory-boy, that creates a set of related objects.

In order to reproduce the failure conditions, I needed to increment a value from a specific column in each of these. This column is a member of a multi-column UNIQUE constraint:

class MyModel(models.Model):
    team = models.ForeignKey(Team, on_delete=models.CASCADE)
    day_part = models.ForeignKey(DayPart, null=True, on_delete=models.CASCADE)
    level = models.IntegerField()

    class Meta:
        unique_together = (
            ('team', 'day_part', 'level'),
        )

By default, Django creates UNIQUE constraints as immediate, not deferrable, and there is currently no way to change this. That prevents you from doing something like:

MyModel.objects.filter(team=X, day_part=Y).update(level=models.F('level') + 1)

The query it generates should be valid, and arguably should work, because it all happens in one statement. But because the constraint is checked immediately, row by row, you instead need to do:

for instance in MyModel.objects.filter(...).order_by('-level'):
    instance.level += 1
    instance.save()

This results in a number of queries proportional to the number of objects, instead of just one.

Instead, we want to make the constraint deferrable, and initially deferred. The name of the constraint will have been generated by Django, so you’ll need to look in the database, and then create a migration accordingly:

ALTER TABLE myapp_mymodel
DROP CONSTRAINT myapp_mymodel_team_id_day_part_id_level_aaaaaa_uniq;

ALTER TABLE myapp_mymodel
ADD CONSTRAINT myapp_mymodel_team_id_day_part_id_level_aaaaaa_uniq
UNIQUE (team_id, day_part_id, level)
DEFERRABLE INITIALLY DEFERRED;
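A sketch of that migration, using RunSQL (the constraint name is whatever you copied from the database; yours will differ, as will the dependency):

from django.db import migrations

FORWARD = '''
ALTER TABLE myapp_mymodel
DROP CONSTRAINT myapp_mymodel_team_id_day_part_id_level_aaaaaa_uniq;

ALTER TABLE myapp_mymodel
ADD CONSTRAINT myapp_mymodel_team_id_day_part_id_level_aaaaaa_uniq
UNIQUE (team_id, day_part_id, level)
DEFERRABLE INITIALLY DEFERRED;
'''

class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0001_initial'),  # Hypothetical: your latest migration.
    ]

    operations = [
        migrations.RunSQL(FORWARD, reverse_sql=migrations.RunSQL.noop),
    ]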

Because the foreign key to DayPart is nullable, it means that any records with a NULL in this field will not be subject to this constraint. In this case, that is the intention, but the requirements for this model also say that for a given team, there may not be any duplicate levels where there is no linked DayPart. I’d initially modelled this as:

CREATE UNIQUE INDEX myapp_mymodel_no_day_part
ON myapp_mymodel(team_id, level)
WHERE day_part_id IS NULL

But there is no way to apply this index as a deferred constraint. Instead, you need to use an EXCLUDE constraint:

ALTER TABLE myapp_mymodel
ADD CONSTRAINT myapp_mymodel_no_day_part
EXCLUDE USING btree(team_id WITH =, level WITH =)
WHERE (day_part_id IS NULL)
DEFERRABLE INITIALLY DEFERRED;
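With both constraints deferrable and initially deferred, the single-statement bump from earlier should now succeed, as long as it runs inside a transaction:

from django.db import transaction
from django.db.models import F

with transaction.atomic():
    # The deferred constraints are only checked at commit time.
    MyModel.objects.filter(team=X, day_part=Y).update(level=F('level') + 1)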

Apple Watch Stand Hours

As of watchOS 6/iOS 13, I’ve noticed some inconsistencies between stand hours and stand minutes in Health/Activity.

Specifically, there are often hours where, looking at the stand minutes detail, I can see that I have at least one stand minute, but when I view Stand Hours in Activity or Health, there is no stand hour.

This morning, I think I figured it out.

I got up, as I normally do, at 6:55. This is my life hack to ensure I get a stand hour at 6-7 am. I went about my normal routine, making sure I moved enough before 7am to trigger the stand hour. It did.

At around 7:15, I happened to look at my watch again. I still only had one stand hour. I know I had moved around in the previous 15 minutes, so I went into stand minutes. Yep, there was a stand minute after 7am.

At about 7:35, I glanced at my watch again, and noticed there was a second stand hour.

I live in a place that has a non-integer number of hours offset from UTC (+09:30, or +10:30 during DST). I believe that stand hours are now calculated based on UTC, and then offset for the current timezone, whereas they used to use the actual hour within the timezone.

My workaround is to make sure I have a stand minute within both the first and second half-hours of every hour (which can be helped by having half-hourly chimes enabled on the watch).