Ratio Ordering in Postgres

A while ago, I had an idea about how to use a materialised view to handle ordering and pagination. Today in #postgresql on IRC, there was a discussion about using ratio ordering. Which reminded me that I also did some work on that.

It’s possible to use both of these techniques together to allow easy access to ordinal position, and more importantly, fast pagination over large data sets. We can also make use of a few other new postgres features to make things even nicer.

CREATE TABLE condition (
  condition_id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  name TEXT UNIQUE NOT NULL,
  position_a INTEGER NOT NULL,
  position_b INTEGER NOT NULL,
  ratio_position NUMERIC UNIQUE GENERATED ALWAYS AS (position_a::NUMERIC / position_b::NUMERIC) STORED
);

We can add some data to make sure we have correct behaviour on those generated columns:

INSERT INTO condition (name, position_a, position_b)
VALUES ('One', 1, 1),
       ('Two', 2, 1);

We can also try adding in values that should be prevented:

-- Different position_a/position_b, but position_a / position_b clashes.
INSERT INTO condition (name, position_a, position_b)
VALUES ('Three', 2, 2);

-- Not able to divide by zero.
INSERT INTO condition (name, position_a, position_b)
VALUES ('Three', 2, 0);

Okay, it would be really neat if by default we just inserted a new value at the end of the list. However, this would require us to be able to have a DEFAULT on a pair of columns. Perhaps if we were storing these as a custom type, with a custom sequence, we could do that.

Anyway, onward. We can insert values between two others by doing some simple mathematics:

INSERT INTO condition (name, position_a, position_b)
VALUES (
  'One-point-five',
  1 + 2,  -- The sum of the previous and next items' position_a values
  1 + 1   -- The sum of the previous and next items' position_b values
);

And, we can see that we now have the sortable by the correct field:

SELECT name FROM condition ORDER BY ratio_position;
name
One
One-point-five
Two

(3 rows)

So, what about getting the ordinal positions? We can do that using a row_number() window function, but that’s only useful if we are fetching all of them.

We can re-use the trick of creating a materialised view that contains all of them:

CREATE MATERIALIZED VIEW condition_position AS

SELECT condition_id,
       row_number() OVER () AS position
  FROM condition
  ORDER BY ratio_position;

CREATE UNIQUE INDEX condition_position_condition ON condition_position(condition_id);
CREATE UNIQUE INDEX condition_position_position ON condition_position(position);

And now we can use it:

SELECT *
  FROM condition
 INNER JOIN condition_position USING (condition_id)
 ORDER BY position;

We want this to recalculate whenever any positions change:

CREATE OR REPLACE FUNCTION refresh_positions()
RETURNS TRIGGER AS $$
BEGIN
  EXECUTE 'REFRESH MATERIALIZED VIEW ' || TG_ARGV[0];
  RETURN NEW;
END;
$$ LANGUAGE plpgsql STRICT;

CREATE TRIGGER condition_refresh_positions
AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE
ON condition
FOR EACH STATEMENT
EXECUTE PROCEDURE refresh_positions('condition_position');

Now we can add more data:

INSERT INTO condition (name, position_a, position_b)
VALUES ('Three', 5, 1);

And fetch our data again:

SELECT *
  FROM condition
 INNER JOIN condition_position USING (condition_id)
 ORDER BY position;
condition_id name position_a position_b ratio_position position
1 One 1 1 1.00000000000000000000 1
2 Two 2 1 2.0000000000000000 2
5 One-point-five 3 2 1.5000000000000000 3
8 Three 5 1 5.0000000000000000 4

(4 rows)