Postgres Domains and Triggers

It occurred to me tonight (after rewriting my Tax File Number generation script in JavaScript, so it works on Text Expander Touch), that it might be nice to do TFN validation in Postgres.

You can write a function that validates a TFN (this is not the first version I wrote, this one is a bit more capable in terms of the range of values it accepts):

CREATE OR REPLACE FUNCTION valid_tfn(TEXT)
RETURNS BOOLEAN AS $$

WITH weights AS (
  SELECT
    row_number() OVER (), weight
  FROM unnest('{1,4,3,7,5,8,6,9,10}'::integer[]) weight
),
digits AS (
  SELECT
    row_number() OVER (),
    digit
  FROM (
    SELECT
      unnest(digit)::integer as digit
    FROM regexp_matches($1, '^(\d)(\d)(\d)[- ]?(\d)(\d)(\d)[- ]?(\d)(\d)(\d)$') AS digit
  ) digits
)

SELECT
  COALESCE(sum(weight * digit) % 11 = 0, FALSE)
FROM weights INNER JOIN digits USING (row_number);

$$ LANGUAGE SQL IMMUTABLE;

Once you have this (which will incidentally limit to 9 digits, and optional spacer items of - or ` `), you may create a DOMAIN, that validates values:

CREATE DOMAIN tax_file_number AS TEXT
CONSTRAINT valid_tfn CHECK (valid_tfn(VALUE));

Now, we can test it:

# SELECT valid_tfn('123-456-789') ; --> FALSE
# SELECT valid_tfn('123 456 782') ; --> TRUE

However, we might want to convert our data into a canonical “format”: in this case, always store it as XXX-XXX-XXX. We can write a function that does this:

CREATE OR REPLACE FUNCTION normalise_tfn(text)
RETURNS tax_file_number AS $$

SELECT string_agg(block, '-'::text)::tax_file_number
FROM (
  SELECT unnest(value) AS block
  FROM regexp_matches($1, '(\d\d\d)', 'g') value
) value;

$$ LANGUAGE SQL IMMUTABLE;

But how can we make our data that we insert always stored in this format?

Postgres triggers to the rescue. We can’t do it as part of a Domain (although we probably could do it as part of a scalar type, but that’s a whole other kettle of fish).

Let’s set up a table to store our tax declarations in:

CREATE TABLE tax_declaration
(
  tax_declaration_id SERIAL PRIMARY KEY,
  tfn tax_file_number,
  lodgement_date date
);

And now create a trigger function and trigger:

CREATE OR REPLACE FUNCTION rewrite_tfn() RETURNS TRIGGER AS $$

BEGIN
  NEW.tfn = normalise_tfn(NEW.tfn);
  RETURN NEW;
END;

$$ LANGUAGE plpgsql;

CREATE TRIGGER rewrite_tfn BEFORE INSERT OR UPDATE ON tax_declaration
FOR EACH ROW EXECUTE PROCEDURE rewrite_tfn();

Now, we can insert some data:

INSERT INTO tax_declaration (tfn, lodgement_date)
VALUES ('123456782', now()::date);

And now we can query it:

SELECT * FROM tax_declaration;
 tax_declaration_id |     tfn     | lodgement_date
--------------------+-------------+----------------
                  1 | 123-456-782 | 2015-10-13
(1 row)