This is a bit of a follow-up to Aggregating Ranges in Postgres.
Since we don’t have a nice range type in Python, we will just use a tuple that contains a lower and upper bound. We will assume that this is a canonical form: the lower bound is inclusive, the upper bound is non-inclusive. We will also assume (for simplicity) that the 0-th item is always less than the 1-th item in the tuple.
Given a list of these range-tuples, we want to do the two things we saw in the previous post.
- Aggregate them into the simplest possible array of range-tuples, using a union operation.
- Determine the parts that are missing.
We will use in in a similar way:
>>> data =[(2,3),(3,4),(11,12),(5,12),(4,5)] >>> aggregate_union(data) (2, 12) >>> missing_ranges(data) [(None, 2), (12, None)]
None sorts before any integer in python, so we will just be able to use the normal sort.
def aggregate_union(data): if not data: return None sorted_data = sorted(data) result = sorted_data for lower, upper in sorted_data[1:]: # If we ever find result is None, we know it covers any # other possible values, so we can stop at that point. if result is None: break if lower > result: return None # Or raise an exception, if you want. result = (result, max(upper, result)) return result
missing_ranges(data) takes cues from the SQL version too.
def missing_ranges(data): if not data: return (None, None) result =  # We do a little fancy stuff here: append an extra item that # mimics what we were able to use lead for, but in a different # way so we can use [i + 1] later. sorted_data = sorted(data) + [(None, None)] if sorted_data is not None: # Note: the upper bound here is not quite correct! result.append((None, sorted_data)) for i, (lower, upper) in enumerate(sorted_data): # Grab the next item in our sorted list. Normally, we would check # to see if we are at the end, but we will rely on the break later # on: we can always stop processing when the upper bound of our # next item is `None` next = sorted_data[i + 1] if upper < next or (next is None and upper is not None): result.append((upper, next)) # Now, exit before we ever get to a bounds error on getting the next item. if next is None: break return result
However, there is a problem here that is nicely solved by the Range types within postgres:
>>> missing_ranges(data) [(None, 3), (12, None)]
We need to subtract from the first object’s upper bound. Which is easy with integer tuple-ranges (basically, any discrete range type), but not so much with continuous ranges.