Capture and test sys.stdout/sys.stderr in unittest.TestCase

Testing in Django is usually done using the unittest framework, which comes with Python. You can also test using doctest, with a little bit of work.

One advantage of doctest is that it’s super-easy to test for an exception: you just include the expected traceback in the expected output (and the traceback body can be trimmed down to a line containing just ...).
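
For example, a doctest for a hypothetical divide() function might look like the sketch below (the exact exception message shown here is Python 2’s; it differs on Python 3):

def divide(a, b):
  """
  >>> divide(1, 0)
  Traceback (most recent call last):
    ...
  ZeroDivisionError: integer division or modulo by zero
  """
  return a / b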

In a unittest.TestCase, you can do a similar thing, but it’s a little more work.

Basically, you want to temporarily replace sys.stdout (or sys.stderr) with a StringIO instance, and set it back after the block you care about has finished.
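
Done by hand, that might look something like this sketch (naive_capture is just an illustrative name, not part of unittest). Note that if the command raises, the restore step never runs:

import sys
from cStringIO import StringIO

def naive_capture(command, *args, **kwargs):
  # Swap the real stdout for an in-memory buffer.
  old_stdout = sys.stdout
  sys.stdout = StringIO()
  command(*args, **kwargs)
  output = sys.stdout.getvalue()
  # If command() raises, we never get here, and stdout stays broken -
  # which is exactly the problem context managers solve.
  sys.stdout = old_stdout
  return output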

Python has had a nice feature for some time called Context Managers. These enable you to ensure that cleanup code will be run, regardless of what happens in the block.

The syntax for running code within a context manager is:

with context_manager(thing) as other:
  # Code we want to run
  # Can use 'other' in here.

One place that you may have seen this syntax, in the context of testing with unittest, is when checking that a specific exception is raised. The context-manager form is handy when the code under test takes keyword arguments, or is a statement rather than a simple callable:

class FooTest(TestCase):
  def test_one_way(self):
    self.assertRaises(ExceptionType, callable, arg1, arg2)

  def test_another_way(self):
    with self.assertRaises(ExceptionType):
      callable(arg1, arg2)
      # Could also be:
      #     callable(arg1, arg2=arg2)
      # Or even:
      #     foo = bar + baz
      # Which are not possible in the test_one_way call.

So, we could come up with a similar way of calling the code whose sys.stdout output we want to capture:

class BarTest(TestCase):
  def test_and_capture(self):
    with capture(callable, *args, **kwargs) as output:
      self.assertEqual("Expected output", output)

And the context manager:

import sys
from cStringIO import StringIO
from contextlib import contextmanager

@contextmanager
def capture(command, *args, **kwargs):
  # Swap the real stdout for an in-memory buffer.
  out, sys.stdout = sys.stdout, StringIO()
  try:
    command(*args, **kwargs)
    # Hand the captured output to the with block.
    sys.stdout.seek(0)
    yield sys.stdout.read()
  finally:
    # Always restore the real stdout, even if command() raised.
    sys.stdout = out

It’s simple enough to do the same with sys.stderr.
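
For example, a stderr version of the same context manager might look like this (capture_stderr is just a name I’ve picked; it assumes the same imports as above):

@contextmanager
def capture_stderr(command, *args, **kwargs):
  # Same trick as capture(), but for sys.stderr.
  err, sys.stderr = sys.stderr, StringIO()
  try:
    command(*args, **kwargs)
    sys.stderr.seek(0)
    yield sys.stderr.read()
  finally:
    sys.stderr = err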

Update: thanks to Justin Patrin for pointing out that we should wrap the command in a try:finally: block.

My own private PyPI

PyPI, formerly the CheeseShop, is awesome. It’s a central repository of Python packages. Knowing you can just do a pip install foo, and it looks on PyPI for a package named foo, is superb. Using pip requirements files, or setuptools install_requires, means you can install all the packages you need really simply.
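
For instance, a setup.py might declare its dependencies like this (the package name and version pins here are purely illustrative):

from setuptools import setup, find_packages

setup(
  name='foo',
  version='1.0.2',
  packages=find_packages(),
  # pip resolves these from PyPI (or whatever index you point it at).
  install_requires=[
    'Django>=1.5,<1.6',
    'requests',
  ],
)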

And, the nice thing about pip is that it won’t bother downloading a package you already have installed, subject to version requirements, unless you specifically force it to. This is better than installing using pip install -e <scm>+https://... from a mercurial or git repository. This is a good reason to have published version numbers.

However, when installing into a new virtualenv, it may still take some time to download all of the packages, and not everything I do can be put onto PyPI: quite a lot of my work is confidential and copyrighted by my employer. So, there is quite a lot of value to me in having a local cache of packages.

You could use a shared (between all virtualenvs) --build directory, but the point of virtualenv is that every environment is isolated. So, a better option is a local cache server. A server is also required for publishing private packages, and being able to use the same workflow for publishing a private package as for an open source package is essential.

Because we deploy using packages, our private package server is located outside of our office network: we need to be able to install packages from it on our production servers. However, this negates the speed advantage of a local PyPI cache. It does mean we control all of the infrastructure required to install: no more “We can’t deploy because GitHub is down.”

So, the ideal situation is to actually have two levels of server: our private package server, and then a local cache server on each developer’s machine. You could also have a single cache server in the local network, or perhaps three levels. I’m not sure how much of a performance hit it is to not have the cache on the local machine.

To do this, you need two things: your local cache needs to be able to use an upstream cache (no dicking around with /etc/hosts, please), and your private server needs to be able to serve data to that cache.
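
As a rough sketch of the behaviour the local cache needs (this is not pypicache’s or localshop’s actual code; the cache directory and URL layout are made up):

import os
import urllib2

CACHE_DIR = '/var/cache/packages'            # hypothetical local cache directory
UPSTREAM = 'https://packages.example.com'    # hypothetical upstream/private server

def fetch_package(filename):
  # Serve from the local cache if we already have the file...
  path = os.path.join(CACHE_DIR, filename)
  if os.path.exists(path):
    with open(path, 'rb') as cached:
      return cached.read()
  # ...otherwise fetch from the upstream server, and cache it for next time.
  data = urllib2.urlopen('%s/packages/%s' % (UPSTREAM, filename)).read()
  with open(path, 'wb') as cached:
    cached.write(data)
  return data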

The two tools I have been using fall short on both counts. pypicache does not handle upstream caching; however, this was easy to patch. My fork handles upstream caching, plus uses setuptools, enabling it to install its own dependencies.

localshop, however, will not work as an upstream cache, at least with pypicache, which uses some APIs other than those used by pip. However, it does have nice security features, and moving away from it would require me to extract the package data out of it. pypicache works to a certain extent with itself as an upstream cache, until you try to use its ‘requirements.txt caching’ feature. Which I tried tonight.

Oh well.

Django and RequireJS

Until very recently, I was very happy with django-compressor. It does a great job of combining and minifying static media files, specifically JavaScript and CSS files. It will manage compilation, allowing you to use, for example, SASS and CoffeeScript. Not that I do.

But, for me, the best part was the cache invalidation. By combining JavaScript (or CSS) into files that get named according to a hash of their contents, it’s trivial for clients to not have an old cached JS or CSS file.

However, recently I have begun using RequireJS. This enables me to declare dependencies, and greatly simplify the various pages within my site that use specific JavaScript modules. But it does not play so well with django-compressor. The problem lies with the fact that there is no real way to tell RequireJS that “instead of js/file.js, it should use js/file.123ABC.js”, where 123ABC is determined by the static files caching storage. RequireJS will do optimisation, and this includes combining files, but that’s not exactly what I want. I could create a built script for each page that has a require() call in it, but that would mean jQuery etc. get downloaded separately for each different script.

I have tried using django-require, but using the {% require_module %} tag fails spectacularly (with a SuspiciousOperation exception). And even then, the files that get required by a dependency hierarchy do not have the relevant version string.

That is, it seems that the only way to get the version numbering is to run Django’s templating system over each of the JavaScript files.

There appear to be two options.

List every static file in require.config({paths: ...}).

This could be done manually, but it may also be possible to rewrite a config.js file automatically, as we do have access to all of the processed files as part of the collectstatic process.

Basically, you need to use {% static 'js/file.js' %}, but strip off the trailing .js.
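
As a sketch of that idea (the module names are examples, and it assumes a hashing static files storage is configured, so that staticfiles_storage.url() returns the versioned name after collectstatic):

import json
import os

from django.contrib.staticfiles.storage import staticfiles_storage

def require_paths(modules):
  # Map each RequireJS module name to its hashed static URL,
  # minus the trailing '.js' (RequireJS appends it again).
  paths = {}
  for module in modules:
    url = staticfiles_storage.url('js/%s.js' % module)
    paths[module] = os.path.splitext(url)[0]
  return paths

def require_config_js(modules):
  # Render a require.config() call mapping module names to hashed files,
  # e.g. require.config({"paths": {"jquery": "/static/js/jquery.123ABC"}});
  return 'require.config(%s);' % json.dumps({'paths': require_paths(modules)})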

Rewrite the static files.

Since we are uglifying the files anyway, we could look at each require([...], function(){ ... }) call, and replace the required modules. I think this would actually be more work, as you would need to reprocess every file.

So, the former looks like the solution. django-require comes close, but, as mentioned, doesn’t quite get there.