Limiting SQL Alchemy queries

Note for future reference: especially since the data migration script I was running today was running out of memory and crashing.

You can use slice notation, which limits the query:

>>> for p in db.query(Person)[:10]:
...   print p

This rocks.

Sometimes, the bugs aren't mine...

I’ve just spent about five hours working out some kinks with data migration between an SQLite database, and a PostgreSQL database. I’m using SQLAlchemy as the transfer tool, since the newer PGSQL database is created and only ever used through this interface.

The big issue I’ve hit is with some Numeric types. One table has Numeric values, which are NULLABLE, but for some reason when I try to load objects from this table, decimal.Decimal throws an exception.

But here is the weird bit. If I rerun the query, and try to load the same object a second time, it succeeds.

Since I needed to access all of the rows from this table, I had to keep trying to load them until I’m no longer throwing exceptions, and then finish. I had a nice code sample from SubEthaEdit, but that seems to have a bit of a bug with the Copy As XHTML feature in the newest build (submitting a bug report now).

Transferring Data using SQLAlchemy

I’ve had cause to transfer a whole stack of data from an old sqlite database to a PostgreSQL database, and since on the new database I am using SQL Alchemy, I used SA to do the transfer.

Because of some of the relations, I have had to keep IDs constant across some of the columns. This is fine, but because they have Sequence objects associated with them, these are not kept up to date with the ‘custom’ created IDs. Thus, when attempting to create a new row in the table, it often fails, since it is trying to add a primary key that already exists.

After a little bit of research, I discovered it is possible to force the sequence object to increment, using the command:

nextID = engine.execute(Sequence('sequence_name'))

Using a little bit of magic, we can find out the largest index currently in use:

maxID = db.query(Object).order_by('-id').first().id

And a simple while loop will keep incrementing until we have reached the correct id.