More thoughts on HATEOAS

Last night I wrote some stuff about REST, user access and HATEOAS. After I wrote it, I started planning out what our application would look like if it were structured this way. This morning, as I prepared Weetbix for my little boy, I had a bit of an epiphany.

You could use HATEOAS with a non-JavaScript HTML web site. It becomes less ‘dynamic’ in the sense that it is page → click → next page. But each hyperlink represents a change of state: either of what the user is viewing (on a GET request), or what the system is storing (on a PUT/POST/DELETE).

I had been thinking about it from a ‘rich client’ perspective, whether that was in a browser (and loading pages using $.ajax() or similar), or what we have now (a wxWidgets application). But everything makes sense from a pure HTML perspective too. In fact, you can automatically generate links and everything, if you want.

This made me think a bit more about how to define what the system does (or more specifically, what the system can and should do when). One of the problems we have now is that it is such a big system that it’s hard to know what is happening when. I had thought that the whole ‘Use Case’ concept taught in Software Engineering courses was mostly bollocks, but I can see some value in parts of it.

Flow diagrams suddenly make sense, because you need to define what possible things can happen from any given state: what other system states are available to that user at that time.

For instance, a user has logged in, and is viewing their own user details. The things they can do are:

  • update their own details
  • change their password
  • exit viewing their own details

This maps nicely onto link relations, and the resource they are viewing becomes:

  {
    "_links": {
      "self": {"href": "..."},
      "edit": {"href": "...", "title": "Save changes"},
      "change-password": {"href": "...", "title": "Change Password"},
      "up": {"href": "...", "title": "Exit"}
    },
    "formats": {
      "date": "%Y-%m-%d",
      "time": "%H:%M:%S",
      "name": "%(first_name)s %(last_name)s"
    }
  }

If they are not permitted to perform an action (say, they may not edit their own options), then the edit link is simply not there. The client can use this to know that the data is not editable, and present a static text view of it instead of editable fields. I am a big fan of stopping the user from attempting an action, rather than letting them try and fail.
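
As a minimal sketch of that client-side check, assuming the resource has already been parsed into a Python dict (the helper names `can_follow` and `render_field` are made up for illustration):

```python
import html


def can_follow(resource, rel):
    """Return True if the resource advertises a link with this relation."""
    return rel in resource.get("_links", {})


def render_field(resource, name, value):
    """Render an input when 'edit' is offered, otherwise static text."""
    if can_follow(resource, "edit"):
        return '<input name="{0}" value="{1}">'.format(
            html.escape(name, quote=True), html.escape(value, quote=True)
        )
    return "<span>{0}</span>".format(html.escape(value))
```

The point is that the client never asks "is this user allowed to edit?"; it only asks whether the link is there.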

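Now, a (naïve) pure HTML web page could look like this (I’ve only included the interesting parts, and I’m ignoring that browsers will only submit a form as GET or POST, so method="put" would need a workaround in practice):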

<form method="put" action="...">
  <input name="formats:date" value="%Y-%m-%d">
  <input name="formats:time" value="%H:%M:%S">
  <input name="formats:name" value="%(first_name)s %(last_name)s">
  <input name="edit" type="submit" value="Save Changes">
</form>
<a href="...">Change Password</a>
<a href="...">Exit</a>
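
The plain links at the bottom of that page could be generated straight from `_links`. This is only a sketch: `render_links` is a hypothetical helper, it assumes the resource is a parsed dict, and it skips self and edit because those are represented by the page itself and by the form.

```python
import html


def render_links(resource, skip=("self", "edit")):
    """Render one <a> element per link relation, using its title as the text."""
    parts = []
    for rel, link in resource.get("_links", {}).items():
        if rel in skip:
            continue
        parts.append(
            '<a rel="{0}" href="{1}">{2}</a>'.format(
                html.escape(rel, quote=True),
                html.escape(link["href"], quote=True),
                html.escape(link.get("title", rel)),
            )
        )
    return "\n".join(parts)
```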

A more sophisticated client would probably want to have limited choices for those three formats. If it were still pure HTML (ie, generated by the server that will be handling it), then it would know about the available choices for those fields, and you could have something like:

<form method="put" action="...">
  <select name="formats:date">
    <option value="%Y-%m-%d" selected>ISO 8601 (2012-01-26)</option>
    <option value="%b %d, %Y">Long (Jan 26, 2012)</option>
    <option value="%m/%d/%Y">US (01/26/2012)</option>
    <option value="%d/%m/%Y">Australian (26/01/2012)</option>
    ...
  </select>
  ...
  <input name="edit" type="submit" value="Save Changes">
</form>
<a href="...">Change Password</a>
<a href="...">Exit</a>

But how do we provide these choices to a non-pure HTML client? Surely we don’t want to embed them in the resource: that seems wasteful, especially if they are unlikely to change often. Why, we can have a resource that contains the choices:

[
  {
    "value": "%Y-%m-%d",
    "title": "ISO 8601"
  },
  {
    "value": "%b %d, %Y",
    "title": "Long"
  },
  {
    "value": "%m/%d/%Y",
    "title": "US"
  }
]

Note here that I am relying on the rich client to provide the example, based on today’s date, probably. That means we can cache this more.
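
A rough sketch of how a rich client might build those examples itself, assuming the date and time formats are strftime-style patterns (the name format uses a different syntax and is not handled here). The function name `label_choices` is made up for illustration.

```python
import datetime


def label_choices(choices, today=None):
    """Attach a rendered example to each date/time format choice.

    `choices` is a list shaped like the choices resource above; `today`
    defaults to the current date, so the cached resource never goes stale.
    """
    today = today or datetime.date.today()
    return [
        {
            "value": choice["value"],
            "label": "{0} ({1})".format(
                choice["title"], today.strftime(choice["value"])
            ),
        }
        for choice in choices
    ]


choices = [
    {"value": "%Y-%m-%d", "title": "ISO 8601"},
    {"value": "%m/%d/%Y", "title": "US"},
]
for item in label_choices(choices, datetime.date(2012, 1, 26)):
    print(item["label"])
# ISO 8601 (2012-01-26)
# US (01/26/2012)
```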

But where do we link to this?

I’d probably extend it so that all of the formats appear in one resource:

{
  "date": [...],
  "time": [...],
  "name": [...]
}

And then have a link:

{
  "_links": {
    "self": {"href": "..."},
    "edit": {"href": "...", "title": "Save changes"},
    "change-password": {"href": "...", "title": "Change Password"},
    "up": {"href": "...", "title": "Exit"},
    "choices:formats": {"href": "..."}
  },
  "formats": {
    "date": "%Y-%m-%d",
    "time": "%H:%M:%S",
    "name": "%(first_name)s %(last_name)s"
  }
}

Indeed, it might be even more complicated than that: with the name format, for instance, and even with the date and time, we could provide some canned types, and “Custom”, allowing the user to use the valid formatting tokens to make their own. But that’s another story.


What is still part of the same story, however, is how to know what verb types should be used for which links, and what data should be sent in the case of PUT/POST requests. I’ve been sitting down working some stuff out about this, and come up with some conventions I think I’ll look at using.

rel=self

"self": {"href": "...", "title": "User Details"}

rel=self means that this is the address of the resource that is currently being used. We could look at using the title attribute to mean that ‘this is a good title for the current place of interaction within the system’. That also means we can have a different title for the same resource, depending upon other stuff in the system, for instance.

rel=edit

"edit": {"href": "...", "title": "Save changes", "data": {
    "editable-field-name": {
      "format": "string",
      "required": true,
      "multiple": false
    }, ...
  }
}

rel=edit means that this is the URI that should be PUT to, with what the client desires the resource should be. Note that I have played around a bit and added in an optional attribute: data, which contains an object describing every editable field on the source object. If the data attribute is missing, then the implication is that all fields are editable.

This also can be used to provide the client with information as to the data type, and if it is required and/or must contain an array of objects. The formats would be string, number, boolean. I still haven’t worked out how you might include an object here: the format could be object, but then you might need to define further what should go there.

Requesting this link with a PUT does not ‘move’ the application location: essentially you are still viewing the same object. A client may choose to follow the rel=up link after a successful update, however.
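
To make the data attribute a bit more concrete, here is a minimal sketch of a client checking a payload against it before sending the PUT. It assumes that "multiple": true means the value is a list, and that unknown formats (such as the undecided object) are simply not checked; `validate` and `_matches` are hypothetical names.

```python
def _matches(value, fmt):
    """Check a single value against one of the simple formats."""
    if fmt == "boolean":
        return isinstance(value, bool)
    if fmt == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if fmt == "string":
        return isinstance(value, str)
    return True  # unknown formats are not checked in this sketch


def validate(payload, data):
    """Return a dict of field name -> error message (empty when valid)."""
    errors = {}
    for name, spec in data.items():
        if name not in payload:
            if spec.get("required", False):
                errors[name] = "required"
            continue
        value = payload[name]
        fmt = spec.get("format", "string")
        if spec.get("multiple", False):
            if not isinstance(value, list):
                errors[name] = "expected a list"
                continue
            values = value
        else:
            values = [value]
        if not all(_matches(v, fmt) for v in values):
            errors[name] = "expected " + fmt
    return errors
```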

rel=delete

"delete": {"href": "...", "title": "Delete"}

This link can be used to delete an object. I’m not sure what exactly should happen after a deletion: we could have some method of undo that only works until the user visits another link:

{
  "_links": {
    "undelete": {"href": "...", "title": "Restore"},
    "up": {"href": "...", "title": "..."}
  }
}

rel=up, rel=root

"up": {"href": "...", "title": "Back to <...>"}

This link type would be used to move up a level in the application ‘hierarchy’. I think it would need to be present in every resource, except the root resource.

Perhaps we could also have:

"root": {"href": "...", "title": "Home"}

Then we could always jump quickly back to the root resource, without having to navigate through many layers.

rel=[other]

Every other link might be a domain-specific link. For instance, the change-password link in the example above. But we need a way to indicate which HTTP verb should be used.

Here is an example of how I think you could indicate that this should be a POST link:

"change-password": {
  "href": "...", 
  "title": "Change password",
  "data": {
    "old-password": {"format": "string", "required": true},
    "new-password": {"format": "string", "required": true}
  }
}

By having the data attribute, we prevent the need for having a GET request to a form resource, and then a POST from that resource to its rel=edit. However, there may be some value in having that step (it matches up with web page navigation, for instance).

The way I would handle it is that every link with a data attribute is essentially defining its form data, which could be used to construct multiple HTML forms in the one page.
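
Put together, the verb conventions so far could be sketched like this. It assumes that a link with no data attribute is a plain GET, and it does not try to cover rels the conventions leave open (such as undelete); `http_method` is a made-up name.

```python
def http_method(rel, link):
    """Pick the HTTP verb for a link, following the conventions above."""
    if rel == "edit":
        return "PUT"
    if rel == "delete":
        return "DELETE"
    if "data" in link:
        # Any other link that describes form data is submitted with POST.
        return "POST"
    return "GET"
```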

It would be domain specific what happens after this request has been executed. In this case, you would probably want to stay on the user details page, but just flash a message that the password had been changed (or not).

Which brings up how to reply with messages in this context. I am tempted to have a “_messages” attribute at the same level as the “_links” attribute, but I’m not totally sold on that just yet.

In the cases where you want the link to be followed with a GET, simply omit the data attribute. However, if you want to be able to have query parameters, you could use:

"posts": {
  "href": "...",
  "title": "Posts",
  "param": {
    "date": {"format": "date"},
    "tag": {"format": "string", "multiple": true}
  }
}

In this case, neither of those parameters is required, but tag can be supplied multiple times. The IANA Link Relations document seems to imply that rel=search should be used for all search stuff, but I have tried to avoid having multiple links of the same reltype. Having that means that instead of being able to use _links.reltype.href, you would need to use _links.reltype.href || _links.reltype[0].href, or something like that.
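
Here is a small sketch of how a client might follow a link like posts with query parameters. The href "/posts/" in the usage is a placeholder I have made up (the real one is elided above), the helper names are hypothetical, and the code assumes the href carries no query string of its own.

```python
from urllib.parse import urlencode


def first_link(resource, rel):
    """Return the link for rel, tolerating a list of links of the same rel."""
    link = resource["_links"][rel]
    return link[0] if isinstance(link, list) else link


def build_url(link, **params):
    """Append the supplied query parameters, honouring the 'param' description."""
    allowed = link.get("param", {})
    pairs = []
    for name, value in params.items():
        if name not in allowed:
            raise ValueError("unknown parameter: " + name)
        if isinstance(value, (list, tuple)):
            if not allowed[name].get("multiple", False):
                raise ValueError("parameter may only be given once: " + name)
            pairs.extend((name, v) for v in value)
        else:
            pairs.append((name, value))
    if not pairs:
        return link["href"]
    return link["href"] + "?" + urlencode(pairs)


link = {
    "href": "/posts/",  # placeholder href, for illustration only
    "param": {
        "date": {"format": "date"},
        "tag": {"format": "string", "multiple": True},
    },
}
print(build_url(link, date="2012-01-26", tag=["rest", "hateoas"]))
# /posts/?date=2012-01-26&tag=rest&tag=hateoas
```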

versioning resources

There are also some nice bits and bobs in the IANA Link Relations document about how versions of resources can refer to one another. Having a working copy is a nice idea, and would allow you to aggregate a series of changes together, while still allowing you to have business logic (and essentially saves) happening. You can then commit a working copy to create a new version.

I’m still a bit anti-locking of resources, as the danger is that a user might lock a resource from one machine, and the commit/discard request may never arrive, resulting in the version remaining locked, even when the same user attempts to change it again. I have done some stuff with merging conflicts on a 412 (Precondition Failed) response.

The two options you have if you discard locking are:

  1. There may be multiple working-copy versions of a resource. Essentially each user (or each process they are using) would have a separate working-copy. You would then need to merge conflicts when a subsequent working-copy is committed to the resource.
  2. There may only be one working-copy of a resource. Multiple users may interact with it, and conflicts must be resolved before the working-copy will be updated with that user’s changes.

I think I actually prefer #2. It does mean that two users working on the same object would never be able to independently create their own version: it’s probably a bit more svn than git, but for the business domain I am thinking in, it makes a bit more sense. And it seems to be simpler to merge conflicts as soon as possible.
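
A very rough in-memory sketch of option #2, just to show the shape of it. It assumes each client remembers the revision of the working-copy it last saw and sends it along with its changes; a mismatch is where a server would answer with the 412 and the client would have to merge. The class and method names are all made up.

```python
class Conflict(Exception):
    """Raised when an update was based on a stale revision."""


class WorkingCopy:
    """The single shared working-copy of a resource."""

    def __init__(self, data):
        self.data = dict(data)
        self.revision = 0

    def update(self, changes, based_on):
        """Apply changes only if the caller saw the latest revision."""
        if based_on != self.revision:
            raise Conflict(
                "working copy is at revision {0}, update was based on {1}".format(
                    self.revision, based_on
                )
            )
        self.data.update(changes)
        self.revision += 1
        return self.revision

    def commit(self):
        """Return a snapshot that would become the new version of the resource."""
        return dict(self.data)
```

Because there is only one working-copy, the second user finds out about the conflict on their very next save, rather than at commit time.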

Again, this just makes me want to write code.