RobotFramework, Chromedriver and Docker

One of my team members implemented RobotFramework support for automated browser testing of our platform a while ago. At the time, we were using Codeship Basic, and I built a helper to run a robot test suite within a tox environment. It was all good, because chromedriver and all of its dependencies were already installed.

But time passes, and we needed to move to Codeship Pro, which has some neater features, but required me to build docker images for everything. We already use docker for deployment, but I didn’t really want to build a bunch of distinct images just for testing that re-implemented the same stuff we have in our deployment images. Even just appending new stuff to them means they could turn out to be a pain in the arse to manage.

And getting chromedriver installed into a docker image is not neat.

I did find a docker image that just has an instance of chromedriver, and exposes that. But getting that to work with robot was still a bunch of work. After much experimentation, I was able to get the connections between everything to work.

First, we need to have the chromedriver container running:

$ docker run -p 4444:4444 -e CHROMEDRIVER_WHITELISTED_IPS='' robcherry/docker-chromedriver:latest

Then, there are a few moving parts that need to be in place to get things to work. Using my djangobot management command (which I had to extend a bit here), a single command can be used to spin up a Django runserver, apply migrations (if necessary), and then run the robot commands. The trick is that you need to teach Robot to speak to the remote WebDriver instance, which in turn speaks to the running django webserver.

First, the RobotFramework commands; my resource.robot file which is referenced by all of my robot test suites contains:

*** Variables ***

${HOSTNAME}         127.0.0.1
${PORT}             8000
${SCHEME}           http
${SERVER}           ${SCHEME}://${HOSTNAME}:${PORT}
${BROWSER}          headlesschrome
${TIMEOUT}          30
${REMOTE_URL}

*** Settings ***

Documentation   A resource file with reusable keywords and variables.
Library         SeleniumLibrary             timeout=${TIMEOUT}      implicit_wait=1
Library         Collections
Library         DebugLibrary
Library         DateTime
Library         String
Library         djangobot.DjangoLibrary     ${HOSTNAME}     ${PORT}

*** Keywords ***

Create Remote Webdriver
    ${chrome_options} =     Evaluate    sys.modules['selenium.webdriver'].ChromeOptions()    sys, selenium.webdriver
    Call Method    ${chrome_options}   add_argument    headless
    Call Method    ${chrome_options}   add_argument    disable-gpu
    Call Method    ${chrome_options}   add_argument    no-sandbox
    ${options}=     Call Method     ${chrome_options}    to_capabilities

    Create Webdriver    Remote   command_executor=${REMOTE_URL}    desired_capabilities=${options}
    Open Browser    ${SERVER}   ${BROWSER}  remote_url=${REMOTE_URL}    desired_capabilities=${options}

Start Session
    Run Keyword If  '${REMOTE_URL}'    Create Remote Webdriver
    Run Keyword If  '${REMOTE_URL}' == ''    Open Browser    ${SERVER}   ${BROWSER}

    Set Window Size     2048  2048
    Fetch Url       login
    Add Cookie      robot   true

    Register Keyword To Run On Failure    djangobot.DjangoLibrary.Dump Error Data

End Session
    Close Browser

Logout
    Fetch Url     logout

Notice that the Start Session keyword determines which type of browser to open - either a local or remote one.

Thus, each *.robot file starts with:

*** Settings ***
Resource                resource.robot
Suite Setup             Start Session
Suite Teardown          End Session
Test Setup              Logout

Because the requests will no longer be coming from localhost, you need to ensure that your runserver is listening on the interface the requests will be coming from. If you can’t detect this, and your machine is not exposed to an insecure network, then you can use 0.0.0.0 to get the django devserver to listen on all interfaces. You will also need to supply the hostname that you will be using for the requests (which won’t be localhost anymore), and ensure this is in your Django settings.ALLOWED_HOSTS.
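
For example, in the settings (mymachine.local here is just the hostname used later in this post; use whatever hostname the remote browser will address):

# settings.py
ALLOWED_HOSTS = ['localhost', '127.0.0.1', 'mymachine.local']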

In my case, I needed to make my robot command allow all this, but ultimately I can now do:

$ ./manage.py robot --runserver 0 \
                    --listen 0.0.0.0 \
                    --hostname mymachine.local \
                    --remote-url http://localhost:4444 \
                    --include tag

This runs against the database I already have prepared, but in my codeship-steps.yml I needed to do a bit more, and hook it up to the other containers:

coverage run --branch --parallel \
    /app/manage.py robot --migrate \
                         --server-url=http://web:8000 \
                         --remote-url=http://chromedriver:4444 \
                         --tests-dir=/app/robot_tests/ --output-dir=/coverage/robot_results/ \
                         --exclude skip  --exclude expected-failure

Now, if only Codeship’s jet tool actually cached multi-stage builds correctly.

Maybe I need to try this.

Update value only if present

We have a bunch of integrations with external systems, and in most of these cases we are unable to use OAuth, or other mechanisms that don’t require us to store a username/password pair. So, we have to store that information (encrypted, because we need to use the value, rather than just storing a hashed value to compare an incoming value against).

Because this data is sensitive, we do not want to show this value to the user, but we do need to allow them to change it. As such, we end up with a form that usually contains a username and a password field, and sometimes a URL field:

class ConfigForm(forms.ModelForm):
    class Meta:
        model = ExternalSystem
        fields = ('username', 'password', 'url')

But this would show the password to the user. We don’t want to do that, but we do want to allow them to include a new password if it has changed.

In the past, I’ve done this on a per-form basis by overriding the clean_password method:

class ConfigForm(forms.ModelForm):
    class Meta:
        model = ExternalSystem
        fields = ('username', 'password', 'url')

    def clean_password(self):
        return self.cleaned_data.get('password') or self.instance.password

But this requires implementing that method on every form. As I mentioned before, we have a bunch of these. And on at least one, we’d missed this method. We could subclass a base form class that implements this method, but I think there is a nicer way.
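
For the record, the base class version would look something like this (a sketch, assuming all of the relevant forms are ModelForms):

class WriteOnlyPasswordForm(forms.ModelForm):
    # Hypothetical base class: each integration form subclasses this.
    def clean_password(self):
        return self.cleaned_data.get('password') or self.instance.password


class ConfigForm(WriteOnlyPasswordForm):
    class Meta:
        model = ExternalSystem
        fields = ('username', 'password', 'url')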

It should be possible to have a field that handles this. The methods that look interesting are clean, and has_changed. Specifically, it would be great if we could just override has_changed:

class WriteOnlyField(forms.CharField):
    def has_changed(self, initial, data):
        return bool(data) and initial != data

However, it turns out this is not called until the form is re-rendered (or perhaps not at all by default; it’s very likely my own code that calls it, to get a list of changed fields to mark as changed, as a UI affordance).

The clean method in a CharField does not have access to the initial value, and there really is not a nice way to get this value attached to the field (other than doing it in the has_changed method, which is not called).

But it turns out this behaviour (apply changes only when a value is supplied) is the same behaviour that is used by FileField: and as such, it gets a special if statement in the form cleaning process, and is passed both the initial and the new values.
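
That special case, paraphrased from Django’s BaseForm._clean_fields, looks something like:

# Paraphrased from django.forms.BaseForm._clean_fields: FileField
# subclasses have the initial value passed to clean() as well.
if isinstance(field, FileField):
    initial = self.get_initial_for_field(field, name)
    value = field.clean(value, initial)
else:
    value = field.clean(value)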

So, we can leverage this and get a field class that does what we want:

class WriteOnlyField(forms.CharField, forms.FileField):
    def clean(self, value, initial):
        return value or initial

    def has_changed(self, initial, data):
        return bool(data) and initial != data

We can even go a bit further, and rely on the behaviour of forms.PasswordInput() to hide the value on an unbound form:

from django.utils.translation import gettext_lazy as _


class WriteOnlyField(forms.CharField, forms.FileField):
    def __init__(self, *args, **kwargs):
        defaults = {
            'widget': forms.PasswordInput(),
            'help_text': _('Leave blank if unchanged'),
        }
        defaults.update(**kwargs)
        super().__init__(*args, **defaults)

    def clean(self, value, initial):
        return value or initial

    def has_changed(self, initial, data):
        return bool(data) and initial != data

Then we just need to override that field on our form definition:

class ConfigForm(forms.ModelForm):
    password = WriteOnlyField()

    class Meta:
        model = ExternalSystem
        fields = ('username', 'password', 'url')

Please note that this technique should not be used when you don’t need the user to be able to change a value, and instead just want to render it. In that case, omit the field from the form, and just use `` instead - you can even put that in a disabled text input widget if you really want it to look like the other fields.


I also use a JavaScript affordance on all password fields that default to hiding the value, but allows clicking on a control to toggle the visibility of the value: UIkit Password Field.

Preventing Model Overwrites in Django and Postgres

I had an idea tonight while helping someone in #django. It revolved around using a postgres trigger to prevent overwrites with stale data.

Consider the following model:

class Person(models.Model):
    first_name = models.TextField()
    last_name = models.TextField()

If we had two users attempting to update a given instance at around the same time, Django would fetch whatever it had in the database when they did the GET request to fetch the form, and display that to them. It would also use whatever they sent back to save the object. In that case, the last update wins. Sometimes, this is what is required, but it does mean that one user’s changes would be completely overwritten, even if they had only changed something that the subsequent user did not change.

There are a couple of solutions to this problem. One is to use something like django-model-utils FieldTracker to record which fields have been changed, and only write those back using instance.save(update_fields=...). If you are using a django Form (and you probably should be), then you can also inspect form.changed_data to see what fields have changed.
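
The form-based variant can be as simple as this sketch (it assumes everything in changed_data is a concrete model field):

if form.is_valid():
    instance = form.save(commit=False)
    # Only write back the fields this user actually changed.
    instance.save(update_fields=form.changed_data)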

However, that may not always be the best behaviour. Another solution would be to refuse to save something that had changed since they initially fetched the object, and instead show them the changes, allow them to update to whatever it should be now, and then resubmit. After which time, someone else may have made changes, but then the process repeats.

But how can we know that the object has changed?

One solution could be to use a trigger (and an extra column).

class Person(models.Model):
    first_name = models.TextField()
    last_name = models.TextField()
    _last_state = models.UUIDField()

And in our database trigger:

CREATE EXTENSION "uuid-ossp";

CREATE OR REPLACE FUNCTION prevent_clobbering()
RETURNS TRIGGER AS $prevent_clobbering$

BEGIN
  IF NEW._last_state != OLD._last_state THEN
    RAISE EXCEPTION 'Object was changed';
  END IF;
  NEW._last_state = uuid_generate_v4();
  RETURN NEW;
END;

$prevent_clobbering$
LANGUAGE plpgsql;

CREATE TRIGGER prevent_clobbering
BEFORE UPDATE ON person_person
FOR EACH ROW EXECUTE PROCEDURE prevent_clobbering();

You’d also want to have some level of handling in Django to capture the exception, and re-display the form. You can’t use the form/model validation handling for this, as it needs to happen during the save.
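
Something like this sketch could work (the view and form names are hypothetical; the trigger’s RAISE EXCEPTION should surface as a django.db.Error subclass, most likely InternalError):

from django.db import InternalError
from django.shortcuts import redirect, render

def update_person(request, pk):
    person = Person.objects.get(pk=pk)
    form = PersonForm(request.POST or None, instance=person)
    if request.method == 'POST' and form.is_valid():
        try:
            form.save()
            return redirect('person-detail', pk=pk)
        except InternalError:
            # The trigger rejected the write: re-display the form.
            form.add_error(None, 'This object was changed by someone else. Please review and resubmit.')
    return render(request, 'person_form.html', {'form': form})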

To make this work would also require the _last_state column to have a DEFAULT uuid_generate_v4(), so that newly created rows would get a value.
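
For example (matching the table above):

ALTER TABLE person_person
    ALTER COLUMN _last_state SET DEFAULT uuid_generate_v4();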


This is only a prototype at this stage, but it does work as a mechanism for preventing overwrites. As usual, there’s probably more work in the application server, and indeed in the UI, that would be required for displaying stale/updated values.

What this does have going for it is that it’s happening at the database level. There is no way that an update could happen (unless the request coming in happened to guess what the new UUID was going to be).

What about drawbacks? Well, there is a bit more storage in the UUID, and we need to generate a new one each time we save a row. Alternatively, we could have something that checks the other columns, looking for changes.

Perhaps we could even store the hash of the previous row’s value in this field - that way it would not matter that there had been N changes; what matters is the value the user saw before they entered their changes.

Another drawback is that it’s hard-coded to a specific column. We could rewrite the function to allow defining the column when we create the trigger:

CREATE TRIGGER prevent_clobbering
BEFORE UPDATE ON person_person
FOR EACH ROW EXECUTE PROCEDURE prevent_clobbering('_last_state');

But that requires a bit more work in the function itself:

CREATE OR REPLACE FUNCTION prevent_clobbering()
RETURNS TRIGGER AS $prevent_clobbering$

BEGIN
  IF to_jsonb(NEW)->TG_ARGV[0] != to_jsonb(OLD)->TG_ARGV[0] THEN
    RAISE EXCEPTION 'Object was changed';
  END IF;
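  -- Note: this assignment is still hard-coded, since plpgsql cannot assign
  -- to a dynamically-named field of NEW; a fully generic version would need
  -- another trick (e.g. rebuilding NEW via jsonb_populate_record).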
  NEW._last_state = uuid_generate_v4();
  RETURN NEW;
END;

$prevent_clobbering$
LANGUAGE plpgsql;

Django properties from expressions, or ComputedField part 2

I’ve discussed the concept of a ComputedField in the past. On the weekend, a friend pointed me towards SQL Alchemy’s Hybrid Attributes. The main difference here is that with a ComputedField, the calculation is always done in the database. Thus, if a change is made to the model instance (and it is not yet saved), then the ComputedField will not change its value. Let’s look at an example from that original post:

class Person(models.Model):
    first_name = models.TextField()
    last_name = models.TextField()
    display_name = ComputedField(
        Concat(F('first_name'), Value(' '), F('last_name')),
        output_field=models.TextField()
    )

We can use this to query, or as an attribute:

Person.objects.filter(display_name__startswith='foo')
Person.objects.first().display_name

But, if we make changes, we don’t see them until we re-query:

person = Person(first_name='Fred', last_name='Jones')
person.display_name  # This is not set

So, it got me thinking. Is it possible to turn a django ORM expression into python code that can execute and have the same output?

And, perhaps the syntax SQL Alchemy uses is nicer?

class Person(models.Model):
    first_name = models.TextField()
    last_name = models.TextField()

    @shared_property
    def display_name(self):
        return Concat(
            F('first_name'),
            Value(' '),
            F('last_name'),
            output_field=models.TextField(),
        )

The advantage to using the decorator approach is that you could have a more complex expression - but perhaps that is actually a disadvantage. It might be nice to ensure that the code can be turned into a python function, after all.


The first step is to get the expression we need to convert to a python function. Writing a python decorator will give us access to the “function” object - we can just call it; as long as it does not refer to self at all, this can be done without an instance:

class shared_property(object):
    def __init__(self, function):
        expression = function(None)

This gives us the expression object. Because this is a python object, we can just look at it directly, and turn that into an AST. Having a class for parsing this makes things a bit simpler. Let’s look at a parser that can handle this expression.

import ast


class Parser:
    def __init__(self, function):
        # Make a copy, in case this expression is used elsewhere, and we change it.
        expression = function(None).copy()
        self.expression = expression
        tree = self.build_expression(expression)
        # Need to turn this into code...
        self.code = compile(tree, mode='eval', filename=function.__code__.co_filename)

    def build_expression(self, expression):
        # Dynamically find the method we need to call to handle this expression.
        return getattr(self, 'handle_{}'.format(expression.__class__.__name__.lower()))(expression)

    def handle_concat(self, concat):
        # A Concat() contains only one source expression: ConcatPair().
        return self.build_expression(*concat.get_source_expressions())

    def handle_concatpair(self, pair):
        left, right = pair.get_source_expressions()
        return ast.BinOp(
            left=self.build_expression(left),
            op=ast.Add(),
            right=self.build_expression(right),
        )

    def handle_f(self, f):
        # Probably some more work here around transforms/lookups...
        # Set this, because without it we get errors. Will have to
        # figure out a better way to handle this later...
        f.contains_aggregate = False
        return ast.Attribute(
            value=ast.Name(id='self'),
            attr=f.name,
        )

    def handle_value(self, value):
        if value.value is None:
            return ast.Name(id='None')

        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value.value, bool):
            return ast.Name(id=str(value.value))

        if isinstance(value.value, str):
            return ast.Str(s=value.value)

        if isinstance(value.value, (int, float)):
            return ast.Num(n=value.value)

        # ... others?
        raise ValueError('Unable to handle {}'.format(value))

There’s a bit more “noise” required in there (every node must have a ctx, lineno and col_offset, and compile() needs a filename), but including those would make it a bit harder to follow.

So, we have our expression, and we have turned that into an equivalent python expression, and compiled it…except it won’t compile. We need to wrap it in an ast.Expression(), and then we can compile it (and call it).
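
That is (ast.fix_missing_locations fills in the lineno/col_offset noise mentioned above):

tree = ast.Expression(body=tree)
ast.fix_missing_locations(tree)
code = compile(tree, filename='<generated>', mode='eval')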

Roughly, we’ll end up with a code object that does:

self.first_name + (' ' + self.last_name)

We can call this with our context set:

eval(code, {'self': instance})

But, before we head down that route (I did, but you don’t need to), it’s worth noticing that not all ORM expressions can be mapped directly onto a single python expression. For instance, if we added an optional preferred_name field to our model, our display_name expression may look like:

@shared_property
def display_name(self):
    return Case(
        When(preferred_name__isnull=True, then=Concat(F('first_name'), Value(' '), F('last_name'))),
        When(preferred_name__exact=Value(''), then=Concat(F('first_name'), Value(' '), F('last_name'))),
        default=Concat(F('first_name'), Value(' ('), F('preferred_name'), Value(') '), F('last_name')),
        output_field=models.TextField()
    )

Since this will roughly translate to:

@property
def display_name(self):
    if all([self.preferred_name is None]):
        return self.first_name + ' ' + self.last_name
    elif all([self.preferred_name == '']):
        return self.first_name + ' ' + self.last_name
    else:
        return self.first_name + ' (' + self.preferred_name + ') ' + self.last_name

Whilst this is still a single ast node, it is not an expression (and cannot easily be turned into an expression - although in this case we could use a dict lookup based on self.preferred_name, but that’s not always going to work). Instead, we’ll need to change our code to generate a statement that contains a function definition, and then evaluate that to get the function object in the context. Then, we’ll have a callable that we can call with our model instance to get our result.
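
A sketch of that wrapping step (assuming Python 3.8+ ast signatures; body_statements stands in for whatever statement nodes the Parser built):

func = ast.FunctionDef(
    name='display_name',
    args=ast.arguments(
        posonlyargs=[],
        args=[ast.arg(arg='self', annotation=None)],
        vararg=None,
        kwonlyargs=[],
        kw_defaults=[],
        kwarg=None,
        defaults=[],
    ),
    body=body_statements,
    decorator_list=[],
    returns=None,
)
module = ast.Module(body=[func], type_ignores=[])
ast.fix_missing_locations(module)
code = compile(module, filename='<generated>', mode='exec')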

There are a few hitches along the way though. The first is turning our method into both a private field and a property. That is the relatively straightforward part:

class shared_property:
    def __init__(self, function):
        self.parsed = Parser(function)
        context = {}
        eval(self.parsed.code, context)
        self.callable = context[function.__code__.co_name]

    def __get__(self, instance, cls=None):
        # Magic Descriptor method: this method will be called when this property
        # is accessed on the instance.
        if instance is None:
            return self
        return self.callable(instance)

    def contribute_to_class(self, cls, name, private_only=False):
        # Magic Django method: this is called by django on class instantiation, and allows
        # us to add our field (and ourself) to the model. Mostly this is the same as
        # a normal Django Field class would do, with the exception of setting concrete
        # to false, and using the output_field instead of ourself.
        field = self.parsed.expression.output_field
        field.set_attributes_from_name(name)
        field.model = cls
        field.concrete = False
        # This next line is important - it's the key to having everything work when querying.
        field.cached_col = ExpressionCol(self.parsed.expression)
        cls._meta.add_field(field, private=True)
        if not getattr(cls, field.attname, None):
            setattr(cls, field.attname, self)

There are a few things to note in that last method.

  • We use the output_field from the expression as the added field.
  • We mark this field as a private, non-concrete field. This prevents django from writing it back to the database, but it also means it will not appear in a .values() unless we explicitly ask for it. That’s actually fine, because we want the python property to execute instead of just using the value the database gave us.
  • The cached_col attribute is used when generating queries - we’ll look more at that now.

When I previously wrote the ComputedField implementation, the place I was not happy with was the get_col() method/the cached_col attribute. Indeed, to get that to work, I needed to use inspect to sniff up the stack to find a query instance to resolve the expression.

This time around though, I took a different approach. I was not able to use the regular resolve_expression path, because fields are assumed not to require access to the query to resolve to a Col expression. Instead, we can delay the resolve until we have something that gives us the query object.

class ExpressionCol:
    contains_aggregate = False
    def __init__(self, expression):
        self.expression = expression
        self.output_field = expression.output_field

    def get_lookup(self, name):
        return self.output_field.get_lookup(name)

    def get_transform(self, name):
        return self.output_field.get_transform(name)

    def as_sql(self, compiler, connection):
        resolved = self.expression.resolve_expression(compiler.query)
        return resolved.as_sql(compiler, connection)

    def get_db_converters(self, connection):
        return self.output_field.get_db_converters(connection) + \
            self.expression.get_db_converters(connection)

This doesn’t need to be a full Expression subclass, because it mostly delegates things to the output field, but when it is turned into SQL, it can resolve the expression before then using that resolved expression to build the SQL.

So, let’s see how this works now (without showing the new Nodes that are handled by the Parser):

Person.objects.filter(display_name__startswith='Bob')

Yeah, that correctly limits the queryset. How about the ability to re-evaluate without a db round trip?

person = Person(first_name='Fred', last_name='Jones')
person.display_name  # -> 'Fred Jones'
person.preferred_name = 'Jonesy'
person.display_name  # -> 'Fred (Jonesy) Jones'

Success!


This project is not done yet: I have improved the Parser (as implied) to support more expressions, but there is still a bit more to go. It did occur to me (but not until I was writing this post) that the ComputedField(expression) version may actually be nicer. As hinted, that requires the value to be an expression, rather than a function that returns one. With the decorator approach, it would be possible to create a function that references self, for instance, and breaks in all sorts of ways.

Redirect all DNS traffic to the pi.hole

This is more to remind me than anything else, but I figured out how to configure my firewall to redirect all DNS traffic (except from the pihole itself) to the pihole.

My pihole has an IP address of 10.1.1.3:

iptables -t nat -A PREROUTING -i br-lan ! -s 10.1.1.3 -p tcp --dport 53 -j DNAT --to 10.1.1.3
iptables -t nat -A PREROUTING -i br-lan ! -s 10.1.1.3 -p udp --dport 53 -j DNAT --to 10.1.1.3
iptables -t nat -A POSTROUTING -j MASQUERADE

In OpenWrt, this needs to be pasted into Network → Firewall → Custom Rules, after which you may need to reboot the router.

It is likely that a reboot is not actually necessary: the MASQUERADE line made me think I was still hitting the external DNS server, but my queries were being transparently handled by the pihole.
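
You can verify this from another machine on the LAN by querying an external resolver directly; if a domain your pihole blocks comes back as blocked, the query never actually reached the external server:

$ dig @8.8.8.8 doubleclick.net

(doubleclick.net is just an example of a domain likely to be on a blocklist.)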

Certificate Expiry Dates without extra software

I’ve got my Home Assistant set up and running, and have obtained a Let’s Encrypt certificate to allow me to serve it all over HTTPS.

One of the things that you can do is set it up to notify you about expiring certificates. However, this requires the installation of a specific package. Since I’m running Home Assistant in a docker image, I can’t really do this.

However, the tools you need to determine a certificate’s expiry date are already in most systems (otherwise how would they be able to tell if the certificate from a site is still valid?).

echo | \
  openssl s_client -connect example.com:443 2>/dev/null | \
  openssl x509 -noout -dates

This gives the very useful:

notBefore=Nov 28 00:00:00 2018 GMT
notAfter=Dec  2 12:00:00 2020 GMT

We can manipulate this using some other commands to get just the expiry date:

echo | \
  openssl s_client -connect example.com:443 2>/dev/null | \
  openssl x509 -noout -dates | \
  tail -n 1 | \
  cut -d '=' -f 2

Now, we want to turn this into a number of days from today. Bash can do arithmetic; we just need to get the values into the right format. In this case, we’ll get date to give us an epoch value, and divide the difference by 3600 * 24.

echo $(( ($(date +%s --date "2020-12-02 12:00:00") - $(date +%s)) / (3600 * 24) ))

That gives us 158 days from the day I wrote this. Now let’s substitute our command for the fixed date:

echo $((
  (
    $(date +%s --date "$(echo | \
  openssl s_client -connect example.com:443 2>/dev/null | \
  openssl x509 -noout -dates | \
  tail -n 1 | \
  cut -d '=' -f 2)") - $(date +%s)
  ) / (3600 * 24)
))

Okay, we still get our 158. That’s a good sign.

Now, to put this into a Home Assistant sensor, we need to edit our configuration.yaml. Note that I needed to change the date parsing format inside the docker container to %b %d %H:%M:%S %Y GMT.

sensor:
  - platform: command_line
    name: SSL Certificate Expiry
    unit_of_measurement: days
    scan_interval: 10800
    command: echo $((
      (
        $(date +%s --date "$(echo | openssl s_client -connect example.com:443 2>/dev/null
                                  | openssl x509 -noout -dates
                                  | tail -n 1
                                  | cut -d '=' -f 2)"
                   -D "%b %d %H:%M:%S %Y GMT") - $(date +%s) ) / (3600 * 24) ))

This should give us a sensor that we can then use to create an automation, as seen in the original post.

Don’t forget to change the domain to your Home Assistant hostname!

Smart Devices Aren't (or why connected devices suck)

I love tinkering with gadgets. I’ve put a bunch of sensors around my house, so I can see the temperature in various places, and have a couple of smart light and power devices too. At the moment, they are limited to my laundry (where the hard-wired switch is in the wrong place, due to the moving of a door), my workbench (because the overhead lights there run from a power point, so it was trivial to put in a smart switch), and the lounge room (where I had room in the light fitting to put a Sonoff Mini).

In all of these cases except the laundry, where the switch is not really accessible, I have taken great care to ensure that the physical switches still toggle the light. In the laundry, I have an Ikea bulb connected to an Ikea dimmer.

In my study, I have a desk lamp with a smart (dimmable) bulb in it, and it irks me no end that I have to use a smart device or computer to turn it on or off. I will be getting some more of the Ikea dimmers to alleviate this, but in the meantime, it’s a pain.

Having said that, I love the option of being able to automate power and lighting, or turn things off from a distance. I just don’t like that being the only way.

I installed Home Assistant on the weekend. But, in order to fit that onto my Raspberry Pi, I needed to use a bigger Micro SD card.

Which meant I needed to clone the old one.

Which took several hours.

I’d already installed Home Assistant before running out of space, and had converted a couple of my esphome devices to use the API instead of just MQTT for connection, including the lounge room light.

Now, it turns out, by default there is an “auto reboot if API not found in 15 minutes” setting, which meant that during the four or five hours it took to create an image of the Micro SD, verify this, copy to a new SD card, and then verify that, my lights (and a powerboard in my office) would flick off every 15 minutes. Likewise, if they cannot connect to a WiFi access point they will power cycle. I believe this second one can be resolved using a Captive AP setting that will mean if they can’t connect to a network, they will create their own.

Which really got me thinking. Smart devices should continue to work in every way possible when they don’t have access to the local network, or the internet. In my case, my smart devices do not have access to the internet anyway, because they don’t need to. However, the point is the same.

In situations where a network connection, or even worse, a working connection to a server that you don’t control, is no longer available, you don’t want your lights (or, god forbid, your coffee machine) to be unable to perform their simple task.

This worries me somewhat about the current trends in smart homes. At some point, companies will stop supporting their devices (this has already happened), and those devices will become less useful than their dumb counterparts, adding further to our global waste problems.

But having a significant system outage (even an intentional one, like in my case), made me think about other aspects of my home automation as well.

I’ve been using NodeRED for a couple of automation tasks. One of them was to have different grind lengths for my coffee grinder, and making this available to Siri.

However, with the device running NodeRED not operating, I was no longer able to rely on this.

I was heading this way philosophically before, but (OMG NO COFFEE) this just cemented something else in my mind. Automations, where they don’t rely on interaction between multiple devices, should live on the local device where possible. Further to this, where the interaction between devices is required for the automation (like the PIR sensor in the laundry I have that turns on the Ikea lightbulb), the devices should connect directly to one another, without requiring some other mechanism to trigger the automation.

In my case, I have a physical button that I press to trigger a long grind. But the grind only stops if the NodeRED server tells it to. And, when NodeRED was not running, I had no way to trigger a short grind.

I was able to fix this: I now have a short press triggering a long grind, and a long press triggering a short grind. That seems backwards, but since I mostly do a long grind in the morning before I’ve had time to properly wake up, I want that to be the easiest one to trigger…


Having to program this in my esphome firmware instead of NodeRED made for an interesting exercise. Because we need to turn off the device after a period of time, but need to be aware of other events that have happened in the meantime, we need to use scripts.

script:
  - id: short_grind
    then:
      - switch.turn_on: relay
      - delay: 13s
      - switch.turn_off: relay
  - id: long_grind
    then:
      - switch.turn_on: relay
      - delay: 17s
      - switch.turn_off: relay

Whenever our relay turns on, we want to start our long grind script, so that even if the relay was triggered some other way than through the script, it will turn off after 17s if not before. Whenever it turns off, we want to stop any instances of our scripts running. We can also use Template Switches to have logical devices we can use to trigger the different scripts, either from Home Assistant, or from button presses:

switch:
  - platform: gpio
    id: relay
    pin: GPIO2
    restore_mode: ALWAYS_OFF
    on_turn_on:
      - script.execute: long_grind
    on_turn_off:
      - script.stop: short_grind
      - script.stop: long_grind
  - platform: template
    name: "Grind a Single"
    optimistic: true
    id: grind_a_single
    icon: mdi:coffee-outline
    turn_on_action:
      - script.execute: short_grind
      - script.wait: short_grind
      - switch.template.publish:
          id: grind_a_single
          state: OFF
    turn_off_action:
      - switch.turn_off: relay
  - platform: template
    name: "Grind a Double"
    optimistic: true
    id: grind_a_double
    icon: mdi:coffee
    turn_on_action:
      - script.execute: long_grind
      - script.wait: long_grind
      - switch.template.publish:
          id: grind_a_double
          state: OFF
    turn_off_action:
      - switch.turn_off: relay

Both of these template switches will also turn off the grinder when toggled off if they are currently on.

There’s only one more bit of logic that’s required, and that’s the handling of the physical button. I wanted this to trigger either length based on the amount of time that the button is held down for, but I also want a UX affordance of knowing when you have held it down long enough to trigger the alternate action. Finally, if it’s on, any type of press should turn it off, and not trigger a new grind.

binary_sensor:
  - platform: gpio
    pin:
      number: GPIO14
      inverted: true
      mode: INPUT_PULLUP
    on_press:
      - light.turn_on:
          id: led
          transition_length: 0s
      - delay: 500ms
      - light.turn_off:
          id: led
          transition_length: 0s
    on_click:
      - max_length: 350ms
        then:
          - if:
              condition:
                - switch.is_on: relay
              then:
                - switch.turn_off: relay
              else:
                - script.execute: long_grind
      - min_length: 500ms
        max_length: 2s
        then:
          - if:
              condition:
                - switch.is_on: relay
              then:
                - switch.turn_off: relay
              else:
                - script.execute: short_grind

Remember that the turning off of the relay will stop any running scripts.

So, now, if you hold down the button until the light turns off, you can release it and it will trigger a short grind. If you just tap the switch and release it immediately, it will trigger a long grind. Any button press when the grinder is already running will turn it off.

Hacking Arlec's 'Smart' sensor light

Quite some time ago, I purchased one of the Arlec Smart Security Lights from Bunnings. The big draw for me was that these devices have an ESP8266, and run a Tuya firmware, which can be trivially flashed without opening up the unit.

In this case, though, the device was not really that capable. Whilst there is a PIR sensor (and an LDR for tuning whether the light should turn on yet or not), the status of the PIR is not exposed at all. Instead, the firmware allows toggling between three states: “ON”, “OFF”, and “SENSOR”.

That’s not actually that useful. For one, it makes using it in conjunction with a physical switch really inconvenient. The behaviour I would prefer is:

  • Light ON/OFF state can be toggled by network request.
  • Light ON/OFF state can be toggled by physical switch. It must not matter which state the switch is in, toggling it must toggle the light.
  • PIR ON turns ON the light.
  • PIR OFF turns OFF the light, but only if it was turned ON by PIR.

As the last point indicates, the only time the PIR turning off should turn the light off is if the light was turned on by the PIR. That is, either physical switch actuation or a network request should turn the light into manual override ON.

There is no manual override OFF.

Most of this was already present in a firmware I wrote for adding a PIR to a light strip. However, the ability to also toggle the state using a physical switch is important to me: not least because there is a switch inside the door where I want to mount this, and I’m very likely to accidentally turn it on when I go outside. It’s also a much better solution for a manual override than having to use HomeKit. I’ll possibly add that feature back into the aforementioned project.

Like all of the other Grid Connect hardware I’ve used so far, it was easy to flash using tuya-convert. But I did run into a bunch of problems from that point onwards. The base contains the ESP8266 (one of the TYWE2S units), and it’s possible to see, without opening up the sensor unit, that this has three GPIO pins connected: GPIO4, GPIO5 and GPIO12.

With a custom firmware, it was possible to see that GPIO5 is connected to the green LED near the PIR sensor, but the other two appear to be connected to another IC on the PCB. I thought perhaps this could be accessed using the TuyaMCU protocol, but had no luck.

As it turns out, I’d prefer not to have to use that. There are two more wires; it would be great if I could connect one of them to the relay, and the other to the PIR.

Indeed, with limited rewiring (I did have to cut some tracks and run wires elsewhere), I was able to connect GPIO12 to the point on the PCB where the output from the other IC triggered the relay, and GPIO4 to the input of the IC that was sensing the PIR output.

I also ran an extra pair of wires from GPIO14 and GND, to use to connect to the physical switch. These will only transmit low voltage.

Unfortunately, I forgot to take photos before putting it all back together and having it mounted on the wall.

Then we just need the firmware:
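
# Assumed: the post does not show the substitutions or wifi sections, but
# something like these must exist for ${device_name} to resolve.
substitutions:
  device_name: arlec_light

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password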

esphome:
  name: $device_name
  platform: ESP8266
  board: esp01_1m

globals:
  - id: manual_override
    type: bool
    restore_value: no
    initial_value: 'false'
  - id: mqtt_triggered
    type: bool
    restore_value: no
    initial_value: 'false'

sensor:
  - platform: wifi_signal
    name: "WiFi signal sensor"
    update_interval: 60s

binary_sensor:
  - platform: gpio
    pin: GPIO4
    id: pir
    device_class: motion
    filters:
      - delayed_off: 15s
    on_press:
      - light.turn_on: green_led
      - mqtt.publish:
          topic: HomeKit/${device_name}/MotionSensor/MotionDetected
          payload: "1"
      - switch.turn_on: relay
    on_release:
      - light.turn_off: green_led
      - mqtt.publish:
          topic: HomeKit/${device_name}/MotionSensor/MotionDetected
          payload: "0"
      - if:
          condition:
            lambda: 'return id(manual_override);'
          then:
            logger.log: "Manual override prevents auto off."
          else:
            switch.turn_off: relay

  - platform: gpio
    pin:
      number: GPIO14
      mode: INPUT_PULLUP
    name: "Toggle switch"
    filters:
      - delayed_on_off: 100ms
    on_state:
      - switch.toggle: relay
      - globals.set:
          id: manual_override
          value: !lambda "return id(relay).state;"

ota:

logger:

output:
  - platform: esp8266_pwm
    id: gpio5
    pin:
      number: GPIO5
    inverted: False

switch:
  - platform: gpio
    id: relay
    pin:
      number: GPIO12
      # inverted: True
      # mode: INPUT_PULLDOWN_16
    on_turn_on:
      - if:
          condition:
            lambda: 'return id(mqtt_triggered);'
          then:
            logger.log: "No MQTT message sent"
          else:
            mqtt.publish:
              topic: HomeKit/${device_name}/Lightbulb/On
              retain: ON
              payload: "1"
    on_turn_off:
      - if:
          condition:
            lambda: 'return id(mqtt_triggered);'
          then:
            logger.log: "No MQTT message sent"
          else:
            mqtt.publish:
              topic: HomeKit/${device_name}/Lightbulb/On
              retain: ON
              payload: "0"


light:
  - platform: monochromatic
    id: green_led
    output: gpio5
    restore_mode: ALWAYS_OFF
    default_transition_length: 100ms

mqtt:
  broker: "mqtt.lan"
  discovery: false
  topic_prefix: esphome/${device_name}
  on_message:
    - topic: HomeKit/${device_name}/Lightbulb/On
      payload: "1"
      then:
        - globals.set:
            id: mqtt_triggered
            value: 'true'
        - switch.turn_on: relay
        - globals.set:
            id: mqtt_triggered
            value: 'false'
        - globals.set:
            id: manual_override
            value: !lambda "return !id(pir).state;"
    - topic:  HomeKit/${device_name}/Lightbulb/On
      payload: "0"
      then:
        - globals.set:
            id: mqtt_triggered
            value: 'true'
        - switch.turn_off: relay
        - globals.set:
            id: manual_override
            value: 'false'
        - globals.set:
            id: mqtt_triggered
            value: 'false'

I’ve also implemented a filter on sending the state to MQTT: basically, we don’t want to send a message to MQTT if we received the same message. Without this, there is a race condition that results in fast toggling of the relay, as each toggle sends a message, but then receives a message with the alternate state. I’ve had this happen on my Sonoff Mini firmware too.

Extracting values from environment variables in tox

Tox is a great tool for automated testing. We use it not only for matrix testing, but to run different types of tests in different environments, enabling us to parallelise our test runs and get better reporting about which types of tests failed.

Recently, we started using Robot Framework for some automated UI testing. This needs to run a django server, and almost certainly wants to run against a different database. This will require our tox -e robot to drop the database if it exists, and then create it.

Because we use dj-database-url to provide our database settings, our Codeship configuration contains a DATABASE_URL environment variable. This contains the host, port and database name, as well as the username/password if applicable. However, we don’t have the database name (or port) directly available in their own environment variables.

Instead, I wanted to extract these out of the postgres://user:password@host:port/dbname string.

My tox environment also needed to ensure that a distinct database was used for robot:

[testenv:robot]
setenv=
  CELERY_ALWAYS_EAGER=True
  DATABASE_URL={env:DATABASE_URL}_robot
  PORT=55002
  BROWSER=headlesschrome
whitelist_externals=
  /bin/sh
commands=
  sh -c 'dropdb --if-exists $(echo {env:DATABASE_URL} | cut -d "/" -f 4)'
  sh -c 'createdb $(echo {env:DATABASE_URL} | cut -d "/" -f 4)'
  coverage run --parallel-mode --branch manage.py robot --runserver={env:PORT}

And this was working great. I’m also using the $PG_USER environment variable, which is supplied by Codeship, but that just clutters things up.

However, when merged to our main repo, which has its own Codeship environment, tests were failing. It would complain about the database not being present when attempting to run the robot tests.

It seems that we were using a different version of postgres, and thus were using a different port.

So, how can we extract the port from the $DATABASE_URL?

commands=
  sh -c 'dropdb --if-exists \
                -p $(echo {env:DATABASE_URL} | cut -d "/" -f 3 | cut -d ":" -f 3) \
                $(echo {env:DATABASE_URL} | cut -d "/" -f 4)'

Which is all well and good, until you have a $DATABASE_URL that omits the port…

dropdb: error: missing required argument database name

Ah, that would mean the command being executed was:

$ dropdb --if-exists -p  <database-name>

Eventually, I came up with the following:

sh -c 'export PG_PORT=$(echo {env:DATABASE_URL} | cut -d "/" -f 3 | cut -d ":" -f 3); \
              dropdb --if-exists \
                     -p $\{PG_PORT:-5432} \
                     $(echo {env:DATABASE_URL} | cut -d "/" -f 4)'

Whew, that is a mouthful!

We store the extracted value in a variable PG_PORT, and then use bash variable substitution (rather than tox variable substitution) to put it in, with a default value. But because of tox variable substitution, we need to escape the curly brace to allow it to be passed through to bash: $\{PG_PORT:-5432}. Also note that you’ll need a space after this before a line continuation, because bash seems to strip leading spaces from the continued line.

Django and Robot Framework

One of my colleagues has spent a bunch of time investigating and then implementing some testing using Robot Framework. Whilst at times the command line feels like it was written by someone who hasn’t used unix much, it’s pretty powerful. There are also some nice tools, like several Google Chrome plugins that will record what you are doing and generate a script based upon that, as well as other tools to help build testing scripts.

There is also an existing DjangoLibrary for integrating with Django.

It’s an interesting approach: you install some extra middleware that allows you to perform requests directly to the server to create instances using Factory Boy, or fetch data from Querysets. However, it requires that the data is serialised before being sent to the django server, and again on the way back. This means, for instance, that you cannot follow object references to get a related object without a bunch of legwork: usually you end up doing another Queryset query.

There are some things in it that I do not like:

  • A new instance of the django runserver command is started for each Test Suite. In our case, this takes over 10 seconds to start as all imports are processed.
  • The database is flushed between Test Suites. We have data that is added through migrations that is required for the system to operate correctly, and in some cases for tests to execute. This is the same problem I’ve seen with TransactionTestCase.
  • Migrations are applied before running each Test Suite. This is unnecessary, and just takes more time.
  • Migrations are created automatically before running each Test Suite. This is just the wrong approach: at worst you’d want to warn that migrations are not up to date - otherwise you are testing migrations that may not have been committed: your CI would pass because the migrations were generated, but your system would fail in reality because those migrations do not really exist. Unless you are also making migrations directly on your production server and not committing them at all, in which case you really should stop that.

That’s in addition to having to install extra middleware.

But, back onto the initial issue: interacting with Django models.

What would be much nicer is if you could just call the python code directly. You’d get python objects back, which means you can follow references, and not have to deal with serialisation.

It’s fairly easy to write a Library for Robot Framework, as it already runs under Python. The tricky bit is that to access Django models (or Factory Boy factories), you’ll want to have the Django infrastructure all managed for you.

Let’s look at what the DjangoLibrary might look like if you are able to assume that django is already available and configured:

import importlib

from django.apps import apps
from django.core.urlresolvers import reverse

from robot.libraries.BuiltIn import BuiltIn


class DjangoLibrary:
    """

    Tools for making interaction with Django easier.

    Installation: ensure that in your `resource.robot` or test file, you have the
    following in your "***Settings***" section:

        Library         djangobot.DjangoLibrary     ${HOSTNAME}     ${PORT}

    The following keywords are provided:


    Factory:        execute the named factory with the args and kwargs. You may omit
                    the 'factories' module from the path to reduce the amount of code
                    required.

        ${obj}=     Factory     app_label.FactoryName       arg  kwarg=value
        ${obj}=     Factory     app_label.factories.FactoryName     arg  kwarg=value


    Queryset:       return a queryset of the installed model, using the default manager
                    and filtering according to any keyword arguments.

        ${qs}=      Queryset    auth.User       pk=1


    Method Call:    Execute the callable with the args/kwargs provided. This differs
                    from the Builtin "Call Method" in that it expects a callable, rather
                    than an instance and a method name.

        ${x}=       Method Call     ${foo.bar}      arg  kwargs=value


    Relative Url:   Resolve the named url and args/kwargs, and return the path. Not
                    quite as useful as the "Url", since it has no hostname, but may be
                    useful when dealing with `?next=/path/` values, for instance.

        ${url}=     Relative Url        foo:bar     baz=qux


    Url:            Resolve the named url with args/kwargs, and return the fully qualified url.

        ${url}=     Url                 foo:bar     baz=qux


    Fetch Url:      Resolve the named url with args/kwargs, and then using SeleniumLibrary,
                    navigate to that URL. This should be used instead of the "Go To" command,
                    as it allows using named urls instead of manually specifying urls.

        Fetch Url   foo:bar     baz=qux


    Url Should Match:   Assert that the current page matches the named url with args/kwargs.

        Url Should Match        foo:bar     baz=qux

    """

    def __init__(self, hostname, port, **kwargs):
        self.hostname = hostname
        self.port = port
        self.protocol = kwargs.pop('protocol', 'http')

    @property
    def selenium(self):
        return BuiltIn().get_library_instance('SeleniumLibrary')

    def factory(self, factory, **kwargs):
        module, name = factory.rsplit('.', 1)
        factory = getattr(importlib.import_module(module), name)
        return factory(**kwargs)

    def queryset(self, dotted_path, **kwargs):
        return apps.get_model(dotted_path)._default_manager.filter(**kwargs)

    def method_call(self, method, *args, **kwargs):
        return method(*args, **kwargs)

    def fetch_url(self, name, *args, **kwargs):
        return self.selenium.go_to(self.url(name, *args, **kwargs))

    def relative_url(self, name, *args, **kwargs):
        return reverse(name, args=args, kwargs=kwargs)

    def url(self, name, *args, **kwargs):
        return '{}://{}:{}'.format(
            self.protocol,
            self.hostname,
            self.port,
        ) + reverse(name, args=args, kwargs=kwargs)

    def url_should_match(self, name, *args, **kwargs):
        self.selenium.location_should_be(self.url(name, *args, **kwargs))

You can write a management command: this allows you to hook into Django’s existing infrastructure. Then, instead of calling robot directly, you use ./manage.py robot.

What’s even nicer about using a management command is that you can have it (optionally, because in development you probably will already have a devserver running) start runserver, and kill it when it’s finished. This is the same philosophy as robotframework-DjangoLibrary, but we can start the server once before running our tests, and kill it at the end.

So, what could our management command look like? Omitting the code for starting runserver, it’s quite neat:

from __future__ import absolute_import

from django.core.management import BaseCommand, CommandError

import robot


class Command(BaseCommand):
    def add_arguments(self, parser):
        parser.add_argument('tests', nargs='?', action='append')
        parser.add_argument('--variable', action='append')
        parser.add_argument('--include', action='append')

    def handle(self, **options):
        robot_options = {
            'outputdir': 'robot_results',
            'variable': options.get('variable') or []
        }
        if options.get('include'):
            robot_options['include'] = options['include']

        args = [
            'robot_tests/{}_test.robot'.format(arg)
            for arg in options['tests'] or ()
            if arg
        ] or ['robot_tests']

        result = robot.run(*args, **robot_options)

        if result:
            raise CommandError('Robot tests failed: {}'.format(result))

I think I’d like to do a bit more work on finding tests, but this works as a starting point. We can call this like:

./manage.py robot foo --variable BROWSER:firefox --variable PORT:8000

This will find a test called robot_tests/foo_test.robot, and execute that. If you omit the test argument, it will run on all tests in the robot_tests/ directory.

I’ve still got a bit to do on cleaning up the code that starts/stops the server, but I think this is useful even without that.
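
For reference, a minimal sketch of that start/stop wrapping (not the final code; the sleep is a crude stand-in for polling until the port accepts connections):

import subprocess
import sys
import time


def run_with_server(run_tests, port=8000):
    # Start a runserver subprocess; --noreload means there is only a single
    # process to terminate when we are done.
    server = subprocess.Popen(
        [sys.executable, 'manage.py', 'runserver', '--noreload', str(port)],
    )
    try:
        time.sleep(2)  # crude: ideally poll until the server is listening
        return run_tests()
    finally:
        server.terminate()
        server.wait()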