Using Zoom for remote Cello lessons

With COVID-19 starting to cause significant lockdowns, our music lessons have moved online. Our piano and cello teachers are both using Zoom for this.

Piano was fine, but the cello teacher noticed that when my eldest son was playing some notes (especially the lower ones), the audio was cutting out. When he mentioned that the sound was fine at the start of the note, dropped out in the middle, and then came back at the end, it triggered something in my brain.

Turns out Zoom has automatic background noise suppression. This was treating the low, sustained notes of the cello as background noise and filtering them out.

This is fairly easy to fix:

Settings → Audio → Advanced

Then, set both types of background noise suppression to disabled:

(Screenshot: Zoom Settings → Audio → Advanced, with both background noise suppression options set to Disable.)

I’m not sure which one matters more: you’d think it was probably the intermittent one, but it was easy enough in our case just to leave them both set to Disable.

Lockdown: Day 4

It’s Friday. End of our first week of working and schooling from home.

We bought a new iPad for the boys, which arrived within 25 hours. They are now using it attached to the trampoline; the old iPad was not working with that, so that’s a bonus. We also bought a water filter, so it’s been a productive week.

Working from home has been fine. I’m sure I’m not as productive as I was in previous weeks, but I don’t know how much of that is COVID-19 in general playing on my mind.

Australia’s infection rate is still above what my 23% compounding increase model suggests. https://covid-19-au.github.io is currently showing 854 confirmed cases, whilst my simple predictions said 835 by today. Watch it go over 1000 tomorrow.

According to the Guardian, our rates are slightly better than most other countries’, with Japan being the only real outlier there. Since that’s a logarithmic scale, we are perhaps even better off than it seems. Not sure how much of that is just insufficient testing, though. On the same page, you can see that our total number of deaths is low: however, there is nothing that shows deaths-per-infection.

SA still sits at 50, which is about 30% better than predicted. However, we are not doing nearly as much testing as NSW.

Recommendations are to stay at home all weekend. I suspect I will be taking the boys somewhere - perhaps we could go for a walk in BNP.

Lockdown: Day 3

Took the boys down to the park this afternoon. They hadn’t left the house since Monday evening.

The school adjacent to the park was closed (due to a case of COVID-19), but otherwise, it didn’t feel that different. There were a couple of other groups of people there, and we ran/scooted/played for about an hour.

Getting them out of the house was really good - not only did they get to burn a bunch of energy, but it seems to have made them better able to play together this evening. They’ve been really good so far in terms of not fighting that much, but it is only day 3.

I’d be so happy if we turn out to have gone way too gung-ho on this - if South Australia (and Australia) manage to get this outbreak under control in weeks, rather than many months, I’ll be happy to cop shit for going overboard and pulling the kids out of school. But the number of cases has continued to grow at a rate higher than 23% per day.

My modelling, which is simplistic, looks like this:

    date    │ infected
────────────┼──────────
 2020-03-17 │      449
 2020-03-18 │      552
 2020-03-19 │      679
 2020-03-20 │      835
 2020-03-21 │     1027
 2020-03-22 │     1263
 2020-03-23 │     1553
 2020-03-24 │     1910
 2020-03-25 │     2349
 2020-03-26 │     2889
 2020-03-27 │     3553
 2020-03-28 │     4370
 2020-03-29 │     5375
 2020-03-30 │     6611
 2020-03-31 │     8132
 2020-04-01 │    10002

That’s stopping at 10k, by April 1st.
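
For the record, the table is nothing fancier than compounding 23% per day from the 449 cases on 2020-03-17, rounding to a whole number of people each day. A minimal sketch of the same calculation:

from datetime import date, timedelta

# 23% compound growth from the 449 confirmed cases on 2020-03-17,
# rounded to whole people each day.
infected = 449
day = date(2020, 3, 17)

for _ in range(16):
    print(day.isoformat(), infected)
    infected = round(infected * 1.23)
    day += timedelta(days=1)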

Currently, it’s 2020-03-19, and there are 691 confirmed cases.

Lockdown: Day 2

I really don’t know how this is going to play out.

It seems that the Australian government just doesn’t want to shut schools because they know when they do shut them, they won’t be re-opening them for many months. What they’d prefer to do is hold that off as long as possible, at the cost of people’s lives.

Currently, in Australia, the number of cases of COVID-19 appears to be growing at a rate exceeding 23% per day.

At that rate, we’ll have over 1000 cases by the weekend, and 10k within a fortnight or so. Within three weeks, there will no longer be anywhere near enough intensive care beds or ventilators to support the number of critically ill patients, and we’ll start to see Italy-like levels of mortality.

How will schools stay open when teachers are dying?


Our parents don’t seem to realise how dangerous this is.

My parents are apparently still planning on travelling from a rural town of 6000 people (they actually live on a farm, so that’s even better), where they are relatively safe, to Adelaide to attend a funeral. I understand that it’s one of their friends that died, but to be honest, there will be a lot more of those in the not too distant future.

And my partner’s mother was planning on going to the hairdresser tomorrow.


Personally, we have savings, and I have a stable job (for now). But a large proportion of our customers are in the service industry, and I have little doubt that quite a number of them will be out of business in short order.

There will be a new normal, we just don’t know what that is going to look like.

Lockdown: Day One

The Australian government is in denial.

It’s becoming clear that the global health crisis from COVID-19 is very likely to worsen quite quickly. From information I have come across from reliable medical sources (mostly from the ABC), it seems that our government is probably not doing enough to slow the growth of cases, and that the health risks to all age groups are significant, not just the elderly or vulnerable. Similarly, it feels like our health system has every likelihood of being overwhelmed by the increasing case load of critically ill people.

Because I have the “luxury” of being able to work from home (and still be gainfully employed), I am now doing so; additionally, we are keeping our children home from school. Whilst there is probably very little practical risk from any two students attending one specific school, every person you come into contact with, and more so every person you remain in contact with for a significant period of time, is an increased risk.

Other countries have determined that gatherings of more than five people are sufficient to continue the spread of the virus, and whilst it may seem that Australia (and Adelaide specifically) has less risk right now than some European and Asian countries, we have already seen Victoria declare a State of Emergency. Yet our government maintains that gatherings of 500 people are still safe.

The number of confirmed cases in Australia has reached a higher proportion of the general population than that of the United States, which has many states and cities implementing strict shutdown policies.

The actual number of confirmed cases has started to grow exponentially, with an increase of almost 50% in one day (20 to 29 on March 16th). The school 400m away from where I live has a confirmed case - the principal of my children’s school has a child that attends there. The school my niece attends also has a confirmed case.

Around 4000 doctors sent a petition to the government asking for strict lockdowns, fearing that our outcomes could be worse than that of Italy.

The government continues to maintain that this disease is just not that risky for young people.

This is just a flat-out lie.

The only reason we have not had deaths of young people in Australia is just a matter of numbers. Currently, anyone who has had severe respiratory issues has been able to receive artificial respiration. This will change, as there are only so many respirators in Australian hospitals. As more people are infected, more will get serious complications, and some of them will miss out - either on a respirator altogether, or, later in the epidemic, perhaps even on a bed.

Children appear not to be having severe complications, or even especially strong symptoms, but there is evidence that asymptomatic people shed a higher amount of the virus, and children continue to have the virus in stool samples well after they are deemed to be “clear” according to a throat swab.


So, my kids, partner and I stayed home today. I think it’s still fun for them - we have tried to help them understand the scale of this problem (“We could be here, just the four of us together for at least two weeks, maybe even longer”), but the novelty (oops) of being home-schooled has yet to wear off.

I’m thankful I have the means and opportunity to work from home, and have my immediate family here with me. We aren’t going to see the boys’ grandparents for some weeks, and I’m sure we’ll all go batshit insane from being around one another non-stop for an indefinite period of time.

Stay safe.

Modelling a residential battery in Postgres

Having five-minute interval generation and consumption data from my PV system means that, in theory at least, I should be able to do some predictive modelling of what adding a battery to the system will do to my import and export levels, and more importantly, determine how much money having a battery of a specific capacity and efficiency would save me.

There are several approaches that could be used to model a battery. They all boil down to the same main idea:

  • The battery has a fixed capacity
  • The battery has a current level
  • If generating more power than consuming, this excess could be fed into the battery (assuming the battery is not full)
  • If the battery is full, any excess generation is exported
  • If we are consuming more power than generating, and we have battery level > 0, then we draw power from the battery
  • Any power we are drawing from the battery is divided by the round-trip efficiency of the battery
  • Any excess consumption that cannot be supplied by the battery results in imports

There are some assumptions that this model makes.

  • There is no current (as in Amperes, not “now”) limit to either exporting or importing
  • If we are both importing and exporting, the battery can handle both of these. In practice, most batteries take some time to switch between charging and discharging
  • The usable capacity is the same as the battery capacity

In all of these cases, the limitations we are not addressing would more than likely result in a poorer outcome than the model will predict.

So, on to the technical details of creating the model. In the first instance, I tried using a recursive view (using in this case the 13.5kWh/90% efficient Tesla Powerwall):

CREATE OR REPLACE RECURSIVE VIEW battery_status(
  timestamp,
  capacity,
  efficiency,
  level,
  import,
  export
) AS

SELECT '2019-03-07'::TIMESTAMP,
       13500 AS capacity,
       0.90 AS efficiency,
       0::NUMERIC AS level,
       0::NUMERIC AS import,
       0::NUMERIC AS export

 UNION ALL

SELECT battery.timestamp + INTERVAL '5 minutes',
       battery.capacity,
       battery.efficiency,
       GREATEST(
         0,
         battery.level + delta.amount_to_battery - delta.amount_from_battery
       )::NUMERIC(12, 3) AS level,
       (summary.import - delta.amount_from_battery)::NUMERIC(8, 3) AS import,
       (summary.export - delta.amount_to_battery)::NUMERIC(8, 3) AS export
  FROM battery_status battery
  LEFT OUTER JOIN summary ON (summary.timestamp = battery.timestamp + INTERVAL '5 minutes')
 INNER JOIN LATERAL (
   SELECT LEAST(summary.import, battery.level / battery.efficiency) AS amount_from_battery,
          LEAST(summary.export, battery.capacity - battery.level) AS amount_to_battery
 ) delta ON (true)

Whilst this sort-of worked, it was unsatisfactory in a couple of ways. Firstly, it took ages to run - way longer than I expected, even when I used materialised views, and changed the interval to hourly instead of five-minutely.

Additionally, because I have some gaps in my data (for instance, when my power was disconnected, or the inverter was off because I was cleaning the panels, or whatever), the recursion stopped at the first gap, so I was never actually able to get a result that covered more than a couple of months. Even generating missing values seemed to be insufficient, so at some point I gave up on this.

I even tried the same approach on a daily summary - this was never going to give me anything close to an accurate result, but at that point I was grasping at straws, so I tried a bunch of things.

It seemed unusual that it was taking as long as it did (dozens of seconds, and that was only to build a few months of data). It should only need to pass over the summary data once, storing the battery state as it goes… that gave me an idea. I could write a fairly simple Python generator function to perform the modelling.

def battery_status(capacity, efficiency):
    result = plpy.execute('SELECT * FROM summary ORDER BY "timestamp"')
    energy = 0

    for row in result:
        # Excess generation charges the battery until it is full; whatever
        # does not fit is still exported.
        export_amount = row['export']
        if export_amount and energy < capacity:
            energy += export_amount
            if energy > capacity:
                export_amount = energy - capacity
                energy = capacity
            else:
                export_amount = 0
        # Consumption is supplied from the battery (paying the round-trip
        # efficiency penalty) before anything is imported from the grid.
        import_amount = row['import']
        if import_amount and energy > 0:
            energy -= import_amount / efficiency
            if energy < 0:
                import_amount = -energy * efficiency
                energy = 0
            else:
                import_amount = 0
        yield (row['timestamp'], energy, import_amount, export_amount)

Following it through, we can see some benefits to this approach. We don’t need to pass the capacity and efficiency (or the current energy level in the battery) through to each iteration: because this is a generator, the state remains in the function, and those variables are just re-used over and over again.

The only thing that might be unusual there is the plpy.execute('SELECT * FROM summary'): this hints that this code is actually going inside a plpythonu function and living inside the database. We can use a composite type to make the return values easier to deal with:

CREATE TYPE battery_status AS (
    timestamp TIMESTAMP,
    stored NUMERIC,
    import NUMERIC,
    export NUMERIC
);

CREATE FUNCTION battery_status(capacity NUMERIC, efficiency NUMERIC)
RETURNS SETOF battery_status AS $$

result = plpy.execute('SELECT * FROM summary ORDER BY "timestamp"')
energy = 0

for row in result:
    export_amount = row['export']
    if export_amount and energy < capacity:
        energy += export_amount
        if energy > capacity:
            export_amount = energy - capacity
            energy = capacity
        else:
            export_amount = 0
    import_amount = row['import']
    if import_amount and energy > 0:
        energy -= import_amount / efficiency
        if energy < 0:
            import_amount = -energy * efficiency
            energy = 0
        else:
            import_amount = 0
    yield (row['timestamp'], energy, import_amount, export_amount)

$$ LANGUAGE plpythonu STRICT IMMUTABLE;

Again, I’ll reiterate that this model applies both charging and discharging in the same period (which could happen), but it assumes that there is zero turnaround time required between charge and discharge, and that there is no limit to the charge and discharge rate. We could probably improve the model to take those two into account fairly easily though.
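
As a sketch of how the rate limit could be added (the turnaround time is the fiddlier of the two), the earlier generator just needs to cap how much energy moves into or out of the battery in any one interval. The 5kW max_power here is a number I’ve plucked out of the air, not a measured figure:

def battery_status(capacity, efficiency, max_power=5000):
    # Hypothetical refinement: max_power (in watts) limits how much energy
    # can move into or out of the battery in a single 5-minute interval.
    max_step = max_power * 5 / 60  # watt-hours per interval

    result = plpy.execute('SELECT * FROM summary ORDER BY "timestamp"')
    energy = 0

    for row in result:
        export_amount = row['export'] or 0
        import_amount = row['import'] or 0

        # Charge from excess generation, limited by headroom and rate.
        to_battery = min(export_amount, max_step, capacity - energy)
        energy += to_battery
        export_amount -= to_battery

        # Discharge to cover consumption, limited by stored energy and rate,
        # paying the round-trip efficiency penalty.
        from_battery = min(import_amount / efficiency, max_step, energy)
        energy -= from_battery
        import_amount -= from_battery * efficiency

        yield (row['timestamp'], energy, import_amount, export_amount)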

To revisit what the other data stores look like, let’s have a look at my summary view:

CREATE OR REPLACE VIEW summary AS

SELECT "timestamp",
       COALESCE(generation.energy, 0::double precision) AS generation,
       COALESCE(import.energy, 0) AS import,
       COALESCE(export.energy, 0) AS export,
       import.energy::double precision
           + generation.energy
           - export.energy::double precision AS consumption
  FROM generation
  JOIN import USING ("timestamp")
  JOIN export USING ("timestamp")

My generation table just contains the generation for each interval: the import and export are both views that use the cumulative import/export totals and calculate the value by subtracting the previous row:

CREATE OR REPLACE VIEW import AS

SELECT import_cumulative."timestamp",
       import_cumulative.energy
           - lag(import_cumulative.energy, 1) OVER (ORDER BY import_cumulative."timestamp")
           AS energy
  FROM import_cumulative
 ORDER BY import_cumulative."timestamp";

We can create a materialised view of the battery status per-day, since that’s all we will need for the purposes of cost calculations:

CREATE MATERIALIZED VIEW daily_summary_battery AS

SELECT battery_status."timestamp"::date AS "timestamp",
       sum(battery_status.import) AS import,
       sum(battery_status.export) AS export
  FROM battery_status(13500::numeric, 0.9)
       battery_status("timestamp", stored, import, export)
 GROUP BY (battery_status."timestamp"::date)
 ORDER BY (battery_status."timestamp"::date);

So, let’s have a look at what results we get.

Refreshing the materialised view takes me around 9 seconds (this is quite stable), and I currently have about 106000 records (370 days of data, 288 records per day if we got them all). This is nice: it’s not immediate, but not going to be a deal-breaker if I need to run a bunch of different models of different battery capacity or efficiencies.
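
If I do end up sweeping over a few different capacities, something like this would do the trick - a rough sketch, assuming a local psycopg2 connection to the solar database and the battery_status() function defined above:

import psycopg2

# Total import/export over the whole data set for a few battery sizes (in Wh).
connection = psycopg2.connect(dbname='solar')

with connection, connection.cursor() as cursor:
    for capacity in (6500, 10000, 13500):
        cursor.execute(
            'SELECT SUM(import), SUM(export) FROM battery_status(%s, 0.9)',
            [capacity],
        )
        total_import, total_export = cursor.fetchone()
        print(capacity, total_import, total_export)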

I can use the same cost functions from my actual cost calculator to look at what this battery model will do to my costs:

CREATE VIEW daily_battery_cost AS
SELECT daily_summary_battery."timestamp" AS date,
       daily_summary_battery.import,
       daily_summary_battery.export,
       cost(retailer.*,
            daily_summary_battery.import::integer,
            daily_summary_battery.export::integer) AS cost,
       billing_periods.period
  FROM daily_summary_battery
  JOIN billing_periods ON daily_summary_battery."timestamp" <@ billing_periods.period
  JOIN retailer ON retailer.name = billing_periods.retailer
 ORDER BY daily_summary_battery."timestamp";

And for comparisons in my billing periods:

CREATE VIEW billing_period_battery_costs AS
SELECT daily_battery_cost.period,
       sum(daily_battery_cost.cost) AS cost
  FROM daily_battery_cost
 GROUP BY daily_battery_cost.period
 ORDER BY daily_battery_cost.period;

CREATE VIEW billing_period_battery_savings AS
SELECT costs.period,
       costs.cost - battery.cost AS savings
  FROM billing_period_costs costs
  JOIN billing_period_battery_costs battery USING (period)
 ORDER BY costs.period;
solar=# SELECT * FROM billing_period_battery_savings ;
         period          |            savings
-------------------------+-------------------------------
 [2019-03-07,2019-06-01) | 101.6372820000000000000000000
 [2019-06-01,2019-06-24) |      15.592432500000000000000
 [2019-06-24,2019-07-24) |  20.4731725000000000000000000
 [2019-07-24,2019-08-24) |      17.898683500000000000000
 [2019-08-24,2019-09-24) |  18.5995855000000000000000000
 [2019-09-24,2019-10-24) |  35.1526490000000000000000000
 [2019-10-24,2019-11-24) |  38.4160700000000000000000000
 [2019-11-24,2019-12-24) |  37.7519350000000000000000000
 [2019-12-24,2020-01-24) |  39.2153900000000000000000000
 [2020-01-24,2020-02-24) |  39.5369135000000000000000000
 [2020-02-24,2020-03-24) |  21.2207440000000000000000000
(11 rows)

And as a per-day saving, it works out to around $1.05.

Which is ~$380 per year.

A Tesla Powerwall 2 battery would cost me (excluding installation, but including the SA battery rebate), in excess of $5000. That’s still almost a 15 year payback time.


What’s quite amusing is that I distinctly remember calculating what I thought a battery would save me shortly after getting my PV system installed. I have such a strong recollection of it that I can remember exactly where I was: I was out running at lunch time, and was partway up the hill at the back of Flinders Uni.

And my in-head calculations were that it would save me $1 per day.


I’m actually more interested in a DC-coupled battery system, partly because of the reduction in round-trip efficiency losses, but more so because I should be able to stick more panels in that would just feed the battery. I’m especially interested in the Fronius Gen24 inverters and a compatible battery, however there aren’t clear details on what that would cost. The other benefit is that in SA there is currently a 10kW inverter limit, and this includes the inverter in an AC-coupled battery like the Tesla Powerwall - a DC-coupled battery sits behind the one hybrid inverter, so it doesn’t eat into that limit.

These are likely to come midway through this year. I’ll be interested to see the pricing.

I think I’ll need to do a bit more on my calculations to be able to model this: my panel_factor calculations have a 5kW limit - this goes away, and instead the limit applies after taking the battery charging into account. But this can wait for another day.

QuerySets of various models

In general, the first thing we try to do to reduce the load time of a page in Django is to reduce the number of queries we are making to the database. This often has the greatest impact, sometimes orders of magnitude greater than other improvements.

One problem I recently hit with a specific set of pages was that there are potentially seven different models that may have zero or more items for a given request. This could mean we do seven queries that are all empty.

These are all distinct models, but in this case they are used for the same purpose, and we always need all of them, if there are any relevant records. Whilst there are seven models now, there may be more in the future.

Each of these models has a specific method that validates some data against the rules applicable to that model and its stored attributes. But each model has a different set of attributes, and future models will more than likely have different attributes again.

There are at least three different ways we could have solved this problem. It turns out we have solved similar problems in the past using the first two approaches; the third is the one I’ve used in this case.


Solution 1: Store all data in the same model, and use a JSON field to store the attributes specific to this “class”.

This also requires a “type” field of some sort. Then, when loading data, we have the model use this type field to work out which attributes should apply.
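
A rough sketch of that shape (the model, type values and validators here are all invented for illustration):

from django.contrib.postgres.fields import JSONField
from django.db import models

# Hypothetical per-type validation rules, keyed by the "type" value.
VALIDATORS = {
    'widget': lambda attributes, data: True,
    'gadget': lambda attributes, data: True,
}


class GenericIntegration(models.Model):
    # One table for every logical type: a discriminator column plus a JSON
    # blob holding the attributes specific to that type.
    type = models.TextField(choices=[(key, key) for key in VALIDATORS])
    attributes = JSONField(default=dict)

    def validate_data(self, data):
        # Use the type field to work out which rules apply.
        return VALIDATORS[self.type](self.attributes, data)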

This has a bunch of problems. First and foremost is that it becomes far more difficult to get the database (postgres, in this case) to apply integrity constraints. It’s not impossible, but it is much harder to read a constraint that checks the type field and then applies a JSON expression. Changing a constraint, assuming it’s done using a check constraint and not a trigger, is still possible, but is likely to be harder to understand.

Secondly, it no longer becomes “automatic” to get Django Model Form support. It’s not impossible to use a Django Model Form, but you need to work a bit harder to get the initial data in, and ensure that the cleaned data for the fields is applied to the JSON field correctly.

Finally, as hinted above, using a “type” field means the logic for building the checks is more complex, unless you use class swizzling and proxy models or similar to have a different class for each type value. If an instance was accidentally updated to the wrong type, then all sorts of things could go wrong.

This was the first solution we used for our external integrations, and whilst it was convenient at some level, it turned out to be much harder to manage than distinct model classes. It is not subject to the problem that is the basis of this article: we can always fetch objects of different logical types, as it’s all the same model class. Indeed, to fetch only a single class, we need to perform extra filtering.


Solution 2: Use concrete/multi-table inheritance.

This is the solution we moved to with our external integrations, and has been much nicer than the previous solution. Instead of having a JSON field with an arbitrary blob of data in it, we have distinct fields. This makes it much easier to have unique constraints, as well as requiring values, or values of a specific type. Knowing that the database is going to catch us accidentally putting a text value into the external id field for a system that requires an integer is reassuring.

This overcomes the second problem. We can now just use a Django ModelForm, and this makes writing views much easier. The validation for a given field or set of fields lives on the model, and where possible also in the database, as an exclusion or check constraint.

It also overcomes the third problem. We have distinct classes, which can have their own methods. We don’t need to try to use some magical proxy model code, and it’s easy for new developers to follow.

Finally, thanks to django-model-utils InheritanceManager, we can fetch all of our objects using the concrete parent model, and the .select_subclasses() method to downcast to our required class.
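
In sketch form (again, the model names here are made up), that looks something like:

from django.db import models
from model_utils.managers import InheritanceManager


class BaseIntegration(models.Model):
    # Concrete parent: shared fields live here, and each subclass below
    # gets its own table joined to this one.
    name = models.TextField()

    objects = InheritanceManager()


class WidgetIntegration(BaseIntegration):
    external_id = models.IntegerField()


class GadgetIntegration(BaseIntegration):
    external_code = models.TextField()


# Returns WidgetIntegration/GadgetIntegration instances, not bare
# BaseIntegration rows.
integrations = BaseIntegration.objects.all().select_subclasses()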

There are a couple of drawbacks to using concrete inheritance. Any fetch of an instance will perform a JOIN in your database, but more importantly, it’s impossible to perform a bulk_create() for models of these types.


Solution 3: Use a Postgres VIEW and JSONB to perform one query, and reconstruct models.

In the problem I’ve recently solved, we had a bunch of different models that, although being used in similar ways, didn’t have that much similarity. They were pre-existing, and it wasn’t worth the effort to move them to concrete inheritance, and using JSON fields for all data is not something I would repeat.

Instead, I came up with an idea that seems novel, based on some previous work I’ve done converting JSON data into models:

CREATE VIEW combined_objects AS

SELECT 'foo.Foo' AS model,
       foo.id,
       foo.tenant_id,
       TO_JSONB(foo) AS data
  FROM foo

 UNION ALL

SELECT 'foo.Bar' AS model,
       bar.id,
       baz.tenant_id,
       TO_JSONB(bar) AS data
  FROM bar
 INNER JOIN baz ON (baz.id = bar.baz_id)

 UNION ALL

SELECT 'qux.Qux' AS model,
       qux.id,
       tenant.id AS tenant_id,
       TO_JSONB(qux) AS data
  FROM tenant
  JOIN qux ON (true)

This builds up a postgres view that contains all of the data from each model, but in a generic way. It also contains a tenant_id, which in this case is the mechanism we’ll be using to filter the objects that are required at any given time. This can be a field on the model itself (as shown in the first subquery), or a field on a related model (as shown in the second). It could even be every object in a table for every tenant, as shown in the third.

From there, we need a model that will recreate the model instances correctly:

from django.apps import apps
from django.contrib.postgres.fields import JSONField
from django.db import models
from django.utils.functional import cached_property


class CombinedObjectQuerySet(models.query.QuerySet):
    def for_tenant(self, tenant):
        return self.filter(tenant=tenant)

    def as_model_instances(self):
        return [x.instance for x in self]


class CombinedObject(models.Model):
    model = models.TextField()
    tenant = models.ForeignKey('tenant.Tenant', on_delete=models.DO_NOTHING)
    data = JSONField()

    objects = CombinedObjectQuerySet.as_manager()

    class Meta:
        managed = False
        db_table = 'combined_objects'

    def __str__(self):
        return '{} wrapper ({})'.format(self.model, self.instance)

    def __eq__(self, other):
        return self.instance == other.instance

    @property
    def model_class(self):
        return apps.get_model(*self.model.split('.'))

    @cached_property
    def instance(self):
        return self.model_class(**self.data)

This works great, as long as you don’t apply additional annotations, typecast to python values, or want to deal with related objects. That is where it starts to get a bit tricky.

We can handle annotations and typecasting:

@cached_property
def instance(self):
    data = self.data
    model_class = self.model_class
    field_data = {
        field.name: field.to_python(data[field.name])
        for field in model_class._meta.fields
        if field.name in data
    }
    instance = model_class(**field_data)
    for attr, value in data.items():
        if attr not in field_data:
            setattr(instance, attr, value)
    return instance

There’s still the issue of foreign keys in the target models: in this case I know the code is not going to traverse these and trigger extra database hits. We could look at omitting those fields to prevent that being possible, but this is working well enough for now.

Postgres VIEW from Django QuerySet

It’s already possible, given an existing Postgres (or other database) VIEW, to stick a Django Model in front of it, and have it fetch data from that instead of a table.

Creating the views can currently be done using raw SQL (and a RunSQL migration operation), or using some helpers to store the SQL in files for easy versioning.

It would be excellent if it was possible to use Django’s ORM to actually generate the VIEW, and even better if you could make the migration autodetector generate migrations.

But why would this be necessary? Surely, if you were able to create a QuerySet instance that contains the items in your view, that should be good enough?

Not quite, because currently using the ORM it is not possible to perform the following type of query:

SELECT foo.a,
       foo.b,
       bar.d
  FROM foo
  INNER JOIN (
    SELECT baz.a,
           ARRAY_AGG(baz.c) AS d
      FROM baz
     GROUP BY baz.a) bar ON (foo.a = bar.a)

That is, generating a join to a subquery is not possible in the ORM. You could probably get away with a correlated Subquery here, however that would probably not perform as well as a join, because a subquery in the SELECT clause is evaluated once for each row, whereas a joined subquery is evaluated only once.
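
For reference, the correlated Subquery version would look roughly like this (assuming Foo and Baz models that match the foo and baz tables above):

from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import OuterRef, Subquery

# ARRAY_AGG(baz.c), evaluated once per row of foo.
foos = Foo.objects.annotate(
    d=Subquery(
        Baz.objects.filter(a=OuterRef('a'))
                   .values('a')
                   .annotate(d=ArrayAgg('c'))
                   .values('d')
    )
)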

So, we could use a VIEW for the subquery component:

CREATE OR REPLACE VIEW bar AS

SELECT baz.a,
       ARRAY_AGG(baz.c) AS d
  FROM baz
 GROUP BY baz.a;

And then stick a model in front of that, and join accordingly:

SELECT foo.a,
       foo.b,
       bar.d
  FROM foo
 INNER JOIN bar ON (foo.a = bar.a)

The Django model for the view would look something like:

class Bar(models.Model):
    a = models.OneToOneField(
        'foo.Foo',
        on_delete=models.DO_NOTHING,
        primary_key=True,
        related_name='bar'
    )
    d = django.contrib.postgres.fields.ArrayField(
        base_field=models.TextField()
    )

    class Meta:
        managed = False

The on_delete=models.DO_NOTHING is important: without it, a delete of a Foo instance would trigger an attempted delete of a Bar instance - which would cause a database error, because it’s coming from a VIEW instead of a TABLE.

Then, we’d be able to use:

queryset = Foo.objects.select_related('bar')

So, that’s the logic behind needing to be able to do a subquery, and it becomes even more compelling if you need that subquery/view to filter the objects, or perform some other expression/operation. So, how can we make Django emit code that will enable us to handle that?

There are two problems:

  • Turn a queryset into a VIEW.
  • Get the migration autodetector to trigger VIEW creation.

The other day I came across Create Table As Select in Django, and it made me realise that we can use basically the same logic for creating a view. So, we can create a migration operation that will perform this for us:

class CreateOrReplaceView(Operation):
    def __init__(self, view_name, queryset):
        self.view_name = view_name
        self.queryset = queryset

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        queryset = self.queryset
        compiler = queryset.query.get_compiler(using=schema_editor.connection.alias)
        sql, params = compiler.as_sql()
        sql = 'CREATE OR REPLACE VIEW {view} AS {sql}'.format(
            view=schema_editor.connection.ops.quote_name(self.view_name),
            sql=sql
        )
        schema_editor.execute(sql, params)

    def state_forwards(self, app_label, state):
        pass

We can then use this operation (which needs to be passed a queryset) in a migration.

This doesn’t really solve how to define the queryset for the view, and have some mechanism for resolving changes made to that queryset (so we can generate a new migration if necessary). It also means we have a queryset written in our migration operation. We won’t be able to leave it like that: due to loading issues, you won’t be able to import model classes during the migration setup - and even if you could, you shouldn’t be accessing them during a migration anyway - you should use models from the ProjectState which is tied to where in the migration graph you currently are.

What would be excellent is if we could write something like:

class Bar(models.Model):
    a = models.OneToOneField(
        'foo.Foo',
        on_delete=models.DO_NOTHING,
        primary_key=True,
        related_name='bar',
    )
    d = django.contrib.postgres.fields.ArrayField(
        base_field=models.TextField()
    )

    class Meta:
        managed = False

    @property
    def view_queryset(self):
        return Baz.objects.values('a').annotate(d=ArrayAgg('c'))

And then, if we change our view definition:

@property
def view_queryset(self):
  return Baz.objects.values('a').filter(
      c__startswith='qux',
  ).annotate(
      d=ArrayAgg('c')
  )

… we would want a migration operation generated that includes the new queryset, or at least be able to know that it has changed. Ideally, we’d want to have a queryset attribute inside the Meta class in our model, which could itself be a property. However, that’s not possible without making changes to django itself.

In the meantime, we can borrow the pattern used by RunPython to have a callable that is passed some parameters during application of the migration, which returns the queryset. We can then have a migration file that looks somewhat like:

def view_queryset(apps, schema_editor):
    Baz = apps.get_model('foo', 'Baz')

    return Baz.objects.values('a').filter(
        c__startswith='qux'
    ).annotate(
        d=ArrayAgg('c')
    )


class Migration(migrations.Migration):
    dependencies = [
        ('foo', '0001_initial'),
    ]

    operations = [
        migrations.CreateModel(
            name='Bar',
            fields=[
                ('a', models.OneToOneField(...)),
                ('d', ArrayField(base_field=models.TextField(), ...)),
            ],
            options={
                'managed': False,
            }
        ),
        CreateOrReplaceView('Bar', view_queryset),
    ]

We still need to have the CreateModel statement so Django knows about our model, but the important bit in this file is the CreateOrReplaceView, which references the callable.

Now for the actual migration operation.

class CreateOrReplaceView(migrations.Operation):
    def __init__(self, model, queryset_factory):
        self.model = model
        self.queryset_factory = queryset_factory

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        model = from_state.apps.get_model(app_label, self.model)
        queryset = self.queryset_factory(from_state.apps, schema_editor)
        compiler = queryset.query.get_compiler(using=schema_editor.connection.alias)
        sql, params = compiler.as_sql()
        sql = 'CREATE OR REPLACE VIEW {view_name} AS {query}'.format(
            view_name=model._meta.db_table,
            query=sql,
        )
        schema_editor.execute(sql, params)

The backwards migration is not quite a solved problem yet: I do have a working solution that steps up the stack to determine what the current migration name is, and then finds the previous migration that contains one of these operations for this model, but that’s a bit nasty.


There’s no (clean) way to inject ourselves into the migration autodetector and “notice” when we need to generate a new version of the view, however we can leverage the checks framework to notify the user when our view queryset is out of date compared to the latest migration.

from django.apps import apps
from django.core.checks import register

@register()
def check_view_definitions(app_configs, **kwargs):
    errors = []

    if app_configs is None:
        app_configs = apps.app_configs.values()

    for app_config in app_configs:
        errors.extend(_check_view_definitions(app_config))

    return errors

And then we need to implement _check_view_definitions:

def get_out_of_date_views(app_config):
    app_name = app_config.name

    view_models = [
        model
        # We need the real app_config, not the migration one.
        for model in apps.get_app_config(app_name.split('.')[-1]).get_models()
        if not model._meta.managed and hasattr(model, 'get_view_queryset')
    ]

    for model in view_models:
        latest = get_latest_queryset(model)
        current = model.get_view_queryset()

        if latest is None or current.query.sql_with_params() != latest.query.sql_with_params():
            yield MissingViewMigration(
                model,
                current,
                latest,
                Warning(W003.format(app_name=app_name, model_name=model._meta.model_name), id='sql_helpers.W003'),
            )


def _check_view_definitions(app_config):
    return [x.warning for x in get_out_of_date_views(app_config)]
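
MissingViewMigration and W003 aren’t shown here; something along these lines is what the snippet above assumes (the message wording is mine, and Warning comes from django.core.checks):

from collections import namedtuple

# Everything we know about an out-of-date view, with the checks-framework
# Warning as one of its fields.
MissingViewMigration = namedtuple(
    'MissingViewMigration', ['model', 'current', 'latest', 'warning']
)

# Hypothetical message template for the warning.
W003 = (
    'The view queryset for {app_name}.{model_name} does not match the '
    'latest CreateOrReplaceView migration.'
)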

The last puzzle piece there is get_latest_queryset, which is a bit more complicated:

def get_latest_queryset(model, before=None):
    from django.db.migrations.loader import MigrationLoader
    from django.db import connection

    migration_loader = MigrationLoader(None)
    migrations = dict(migration_loader.disk_migrations)
    migration_loader.build_graph()
    state = migration_loader.project_state()
    app_label = model._meta.app_label
    root_node = dict(migration_loader.graph.root_nodes())[app_label]
    # We want to skip any migrations in our reverse list until we have
    # hit a specific node: however, if that is not supplied, it means
    # we don't skip any.
    seen_before = not before
    for node in migration_loader.graph.backwards_plan((app_label, root_node)):
        if node == before:
            seen_before = True
            continue
        if not seen_before:
            continue
        migration = migrations[node]
        for operation in migration.operations:
            if (
                isinstance(operation, CreateOrReplaceView) and
                operation.model.lower() == model._meta.model_name.lower()
            ):
                return operation.queryset_factory(state.apps, connection.schema_editor())

This also has code to allow us to pass in a node (before), which limits the search to migrations that occur before that node in the forwards migration plan.

Since we already have the bits in place, we could also have a management command that creates a stub migration (without the queryset factory, that’s a problem I haven’t yet solved). I’ve built this into my related “load SQL from files” app.


This is still a bit of a work in progress, but writing it down helped me clarify some concepts.

Smart Lights or Smart Switches, or my philosophy of home automation

I really like home automation. Being able to see when things are turned on, and being able to turn them off remotely, or build automations around events in the home is fun. I’ve built a garage door opener, a bunch of temperature, humidity and air pressure sensors, and played around with various light bulbs and switches.

I haven’t invested deeply in any platform, for a couple of reasons. I have some of the Sonoff devices, and like that it’s possible to flash my own firmware. This allows me to lock down each device so it can only connect to the one MQTT broker in my network: indeed, my IoT devices are all on a separate network that has very strict limits.

But that’s not what I want to talk about today. I want to talk about smart lights and smart switches, and why switches are superior.

I don’t live by myself. I have a family, none of whom share my level of excitement about smart devices, but all of whom need to still turn lights on and off. They don’t have an Apple Watch (or iPhone, in most cases), and should not have to use one to turn the lights on.

In fact, I should not have to use my watch or phone to toggle the lights either. We have managed as a society to come up with a really simple system for controlling lights: you flick the switch, and the light toggles.

Direct control of lights using the existing switches is the number one priority of a smart light system. Having a smart bulb that you can turn off using Siri, but that then requires you to flick the physical switch off and back on before it will turn on again, is not acceptable.

Likewise, requiring someone to use a smart device to turn a light on is unacceptable.

Having a switch that prevents the smart light from being smart (ie, when it’s off, you have to use the switch to turn it on) is also unacceptable.

This makes existing smart bulbs a non-event for me. Even if you have a duplicate “smart” switch next to an existing switch, there’s still the chance someone will turn off the switch that needs to stay on.

A much better solution is to have the switch itself be smart. In most cases, that means the switch will no longer be mechanical, although it could be a press button. There are a bunch of smart switches that work this way: they can be operated manually, but still allow an automated or remote toggle of the state.

These switches will not work in my home.

Pretty much all of these (with one exception from Sonoff - see this video for a really nice description of how it works) require a neutral wire at the switch. It seems that my house is wired up with the neutral at the light only, and then only a pair of wires (live, and switched live) running down to the switch. Thus, the switch is wired in series with the live wire, and the neutral is only connected directly to the fitting. Jonathon does a really good job in the above-linked video of describing the problem.

There is an alternative solution. Sonoff also make another device, the Sonoff Mini. This one is pretty small (much smaller than the Sonoff Basic), and can be wired up and programmed to behave just like a traditional hallway switch: toggling either the manual or smart switch will result in the lights toggling. The nice thing about these is that they have the new DIY mode (just pop them open, add a jumper, and you can flash them without having to connect header pins). In fact, I even wrote a script to automate the process. It’s not quite perfect, but it’s been good for flashing the five of these that I currently own.

You can then have the mini connected up to the light (at least, in other countries you can do this yourself: in Australia this is illegal unless you are an electrician, so be aware of that), and have the switch connected just to the switch pins on the mini. Bingo, smart switches the right way. When the smart fails, the switch still works the same as they have for generations.

(Unless the microcontroller itself dies, but that’s a problem I have not as yet solved).


As an aside, I wanted to comment on the setup from Superhouse. It’s quite nifty: all of the switches are low voltage, and control relays back in the switchboard. However, it relies on logic running in a computer (and running JavaScript!) to map which switch controls which light.

This to me feels wrong. It’s better than having those connections over WiFi, but it still means there is a single point of failure. I think the architecture I have used - smart switches that are independent, and if they fail to connect to WiFi continue to work just the same as an old dumb switch - is superior. It’s more like the philosophy I have with web pages: the pages should still work when all of the JavaScript breaks, but when it doesn’t break, they can be nicer (and have some cool affordances).

To that end, I’ve tried to make each little module of my smart home somewhat independent. For instance, having an IR sensor connected to the same microcontroller that controls the lights it triggers means that even if WiFi breaks, the lights still behave exactly the same way.

Adding a PIR sensor to an Arlec Smart LED Strip

I built a shed on Australia Day.

Actually, I built two of them over the Australia Day Long Weekend. I bought them from easyshed.com.au, and they were pretty nifty. Specifically, I bought a couple of the off the wall sheds, to attach them to the wall of my garage. There’s already a concrete path there that I was able to use as the floor.

One of the sheds is longer than the other, and is used for storage of tools. Mostly I need to get stuff out during the day, but sometimes at night I may need to access something from in there. So, I bought a 2m Grid Connect LED Strip Light. It’s colour changing, but I’m not really that fussed about the colours: this one has a Warm White LED too, so that’s better than trying to twiddle the colours to get the right temperature.

Being Grid Connect, it’s really Tuya, which means it’s flashable. So I did. I’ve got a nice esphome firmware on there, which hooks it up to HomeKit via my MQTT bridge.

I wasn’t really able to get the colour parts working correctly, so I just disabled those pins, and use the light in monochromatic mode.

However, being the tinkerer I am, I opened it up to have a look inside. It’s got one of the standard Tuya mini-boards, in fact a TYWE3S.

This exposes a bunch of pins, most of which were in use:

  • GPIO 0: (Used during boot)
  • GPIO 2: UART0 (TX)
  • GPIO 4: Red LED channel
  • GPIO 5: Warm White LED channel
  • GPIO 12: Green LED channel
  • GPIO 13: unused
  • GPIO 14: Blue LED channel
  • GPIO 15: (Used during boot)
  • GPIO 16: unused

Because the LED strip runs at 12V, I was able to use this to power a PIR sensor, which I then hooked up to GPIO 13. I also tried to connect a DS18B20 temperature sensor to GPIO 16, but was not able to get it to be recognised. From the page linked above, perhaps I needed a pull-up resistor; however, I didn’t really need a temperature sensor anyway.

Having a PIR sensor on the LED strip is worthwhile, however. Otherwise, you’d need to manually turn the lights on when going into the shed.

Having a sensor light is fantastic, but at times you might want the light to stay on, even when it does not detect motion. To achieve this, I have a global variable manual_override, that is set when the light is turned on “manually” (using HomeKit, as there is no physical switch). When this variable is set, the light will not turn off when motion is no longer detected.

I also found it was simpler to have the motion detector (binary_sensor in esphome) have a “delayed off” filter for the period I wanted the light to stay on for, rather than try to manage that in conjunction with the manual override.

The full YAML follows:

esphome:
  name: $device_name
  platform: ESP8266
  board: esp01_1m
  on_loop:
    then:
      - if:
          condition:
            not:
              mqtt.connected:
          then:
            - globals.set:
                id: has_connected_to_mqtt
                value: 'false'

globals:
  - id: has_connected_to_mqtt
    type: bool
    restore_value: no
    initial_value: 'false'
  - id: manual_override
    type: bool
    restore_value: no
    initial_value: 'false'

light:
  - platform: monochromatic
    name: "White"
    id: white
    output: white_channel
    restore_mode: ALWAYS_OFF
    default_transition_length: 0.25s

output:
  - platform: esp8266_pwm
    pin:
      number: GPIO5
    id: white_channel
    inverted: False

sensor:
  - platform: wifi_signal
    name: "WiFi signal sensor"
    update_interval: 5min

binary_sensor:
  - platform: gpio
    pin: GPIO13
    name: "PIR Sensor"
    device_class: motion
    id: pir
    filters:
      - delayed_off: 15s
    on_press:
      then:
        - mqtt.publish:
            topic: HomeKit/${device_name}/MotionSensor/MotionDetected
            payload: "1"
        - if:
            condition:
              light.is_off: white
            then:
              - light.turn_on: white
              - mqtt.publish:
                  topic: HomeKit/${device_name}/Lightbulb/On
                  payload: "1"
    on_release:
      - mqtt.publish:
          topic: HomeKit/${device_name}/MotionSensor/MotionDetected
          payload: "0"
      - if:
          condition:
            lambda: 'return id(manual_override);'
          then:
            - logger.log: "Manual override prevents auto off."
          else:
            - logger.log: "Turning off after motion delay."
            - if:
                condition:
                  - light.is_on: white
                then:
                  - light.turn_off: white
                  - mqtt.publish:
                      topic: HomeKit/${device_name}/Lightbulb/On
                      payload: "0"

ota:

logger:

mqtt:
  broker: "mqtt.lan"
  discovery: false
  topic_prefix: esphome/${device_name}
  on_message:
    - topic: HomeKit/${device_name}/Lightbulb/Brightness
      then:
        - logger.log: "Brightness message"
        - globals.set:
            id: manual_override
            value: 'true'
        - light.turn_on:
            id: white
            brightness: !lambda "return atof(x.c_str()) / 100;"
        - globals.set:
            id: has_connected_to_mqtt
            value: 'true'

    - topic: HomeKit/${device_name}/Lightbulb/On
      payload: "1"
      then:
        - logger.log: "Turn on message"
        - if:
            condition:
              and:
                - binary_sensor.is_off: pir
                - lambda: "return id(has_connected_to_mqtt);"
            then:
              - globals.set:
                  id: manual_override
                  value: 'true'
              - logger.log: "Manual override enabled"
            else:
              - globals.set:
                  id: manual_override
                  value: 'false'
              - logger.log: "Manual override disabled"
        - light.turn_on:
            id: white
        - globals.set:
            id: has_connected_to_mqtt
            value: 'true'
    - topic: HomeKit/${device_name}/Lightbulb/On
      payload: "0"
      then:
        - logger.log: "Turn off message"
        - if:
            condition:
              lambda: 'return id(has_connected_to_mqtt);'
            then:
              - light.turn_off:
                  id: white
              - globals.set:
                  id: manual_override
                  value: 'false'
              - logger.log: "Manual override disabled"
        - globals.set:
            id: has_connected_to_mqtt
            value: 'true'

  birth_message:
    topic: HomeKit/${device_name}/Lightbulb/On
    payload: "1"
  will_message:
    topic: HomeKit/${device_name}/Lightbulb/On
    payload: "0"

There’s also some logic in there to prevent the light turning off and then back on when it first connects to the MQTT broker and is already turned on.