Highlighting Liquid Template Blocks in Marked.app

For many years, I’ve used an old version of Jekyll to write this blog. For previewing, I use Marked.app, and one of the things I like about it is how you can get it to preprocess your Markdown files before processing by the markdown processor, or use a custom markdown processor altogether.

In my case, I use Liquid Templates, although the only part of them I use often are the syntax highlighting features. I have some neat TextMate language extensions so that I see the code blocks for Python, SQL and other languages syntax highlighted in the “proper” way for that code block.

Until recently, I think I had a custom markdown processor which used to apply the syntax highlighting so I saw them in Marked.app as I would in the browser after rendering using Jekyll, but that stopped working. So tonight, I wrote a small tool in python to use Pygments to apply the syntax highlighting.

There’s not much to it: it’s more glue code: it uses re.sub to switch out the highlight block with the syntax highlighted version. Something like:

import pathlib
import sys

from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters.html import HtmlFormatter


def highlight_block(match):
    data = match.groupdict()
    formatter = HtmlFormatter(noclasses=True, linenos=False)
    lexer = get_lexer_by_name(data['language'], stripall=True)
    return highlight(data['code'], lexer, formatter)


re.sub(
  r'{% highlight (?P<language>.*?) %}\n(?P<code>.*?)\n{% endhighlight %}',
  highlight_block,
  pathlib.Path(sys.argv[1])
)

However, it is a bit slow to syntax highlight the files. It might be nice to cache them somewhere:

import pathlib

CACHE_DIR = pathlib.Path('/tmp/pygments-cache/')
CACHE_DIR.mkdir(exist_ok=True)


def highlight_block(match):
    data = match.groupdict()
    cache = CACHE_DIR / '{language}.{hash}.html'.format(
        hash=hash(data['code']),
        language=data['language'],
    )

    if cache.exists():
        return cache.open().read()

    formatter = HtmlFormatter(
        noclasses=True,
        linenos='linenos' in data
    )
    lexer = get_lexer_by_name(data['language'], stripall=True)
    output = pygments.highlight(data['code'], lexer, formatter)
    cache.open('w').write(output)
    return output

Now it doesn’t need to rebuild syntax highlighting for blocks that have already been highlighted, and the cache automatically invalidates when there are changes to the block.

This is almost the same solution I implemented as a Jekyll plugin to make that run a bunch faster: although this version does inline styles, which means I don’t have to use the same CSS from my blog.

This is packaged up into a command line tool, and installed using:

$ pipx install --spec \
        hg+https://hg.sr.ht/~schinckel/liquid-highlight \
        liquid_highlight

(or would be if sourcehut’s public urls worked).