July 17, 2007 (last updated October 18, 2007)

Multiline Code Blocks for Markdown With Syntax Highlighting

Since I'm releasing lots of code on this site, being able to have code blocks in news posts sounded like a good idea.

At first, I wrapped the code in standard XHTML tags and added some formatting via CSS.

Then, I started to use the Markdown syntax (through its Python port) in my posts, and their source became much easier to read and write.

Unfortunately, I ran into some serious problems with code blocks. It has been a while and I cannot remember exactly what the issue was, sorry. As far as I can recall, the problems were related to indentation, line breaks, and (unintentional) Markdown syntax inside a code block.

Syntax highlighting was not available either, and I really wanted it for better readability. I had already been using the great Pygments package for my snippets section, where it proved to be an excellent choice.

So, I wanted to have a Markdown syntax element like this:

[sourcecode:lexer]
some code
[/sourcecode]

lexer can be any language short name supported by Pygments.
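
As a quick way to check whether a given short name is known, here is a minimal sketch. The helper name resolve_lexer_name is made up for this illustration; get_lexer_by_name and ClassNotFound are part of Pygments, and ClassNotFound is a subclass of ValueError.

```python
from pygments.lexers import get_lexer_by_name
from pygments.util import ClassNotFound


def resolve_lexer_name(short_name):
    """Return the full lexer name for a short name, or None if unknown.

    Hypothetical helper, for illustration only.
    """
    try:
        return get_lexer_by_name(short_name).name
    except ClassNotFound:  # raised for unknown short names
        return None


print(resolve_lexer_name("python"))
print(resolve_lexer_name("no-such-language"))
```

An unknown name gives None here; the preprocessor shown further down handles the same case by falling back to plain text.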

Here is a Markdown preprocessor that uses Pygments to highlight the content enclosed by the above syntax:

import re

from markdown import Preprocessor
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name, TextLexer


class CodeBlockPreprocessor(Preprocessor):

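    # Group 1 is the lexer short name, group 2 is the code itself.
    # re.S lets "." match newlines, so a block may span several lines.
    pattern = re.compile(
        r'\[sourcecode:(.+?)\](.+?)\[/sourcecode\]', re.S)

    def run(self, lines):
        def repl(m):
            try:
                lexer = get_lexer_by_name(m.group(1))
            except ValueError:
                # Unknown lexer name: fall back to plain text.
                lexer = TextLexer()
            code = highlight(m.group(2), lexer, HtmlFormatter())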
            code = code.replace('\n\n', '\n \n')
            return '\n\n<div class="code">%s</div>\n\n' % code
        return self.pattern.sub(
            repl, '\n'.join(lines)).split('\n')
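
To see what the pattern captures without involving Markdown or Pygments at all, here is a small sketch using only the re module. The sample text is made up for illustration.

```python
import re

# Same pattern as in the preprocessor above.
pattern = re.compile(
    r'\[sourcecode:(.+?)\](.+?)\[/sourcecode\]', re.S)

# Made-up sample input with one multiline and one single-line block.
text = (
    "Intro paragraph.\n"
    "[sourcecode:python]\n"
    "print('a')\n"
    "\n"
    "print('b')\n"
    "[/sourcecode]\n"
    "Middle paragraph.\n"
    "[sourcecode:html]<p>hi</p>[/sourcecode]\n"
)

# Collect (lexer name, code) pairs for every block found.
blocks = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]

for name, code in blocks:
    print(name, repr(code))
```

Because both groups are non-greedy, each match stops at the nearest closing tag, so two blocks in one post do not get merged into one. Note that the captured code keeps the newlines directly after the opening tag and before the closing tag.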

Then, the preprocessor can be integrated like this:

from markdown import Markdown

md = Markdown()
md.preprocessors.insert(0, CodeBlockPreprocessor())
markdown = md.__str__

markdown is then a callable that converts Markdown source to HTML. It can, for example, be passed to the context of a template and used in that template.

Finally, pygmentize -S <some style> -f html > pygments.css creates a stylesheet to be added to the website.
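
The stylesheet can also be produced from Python, which may be handy in a build script. This is only a sketch: the function name build_stylesheet and the choice of the "default" style are assumptions for the example. The ".highlight" selector matches the CSS class that HtmlFormatter() puts on its wrapping div by default.

```python
from pygments.formatters import HtmlFormatter


def build_stylesheet(style_name="default", selector=".highlight"):
    """Return CSS rules for a Pygments style, scoped to the given selector.

    Hypothetical helper, for illustration only.
    """
    formatter = HtmlFormatter(style=style_name)
    return formatter.get_style_defs(selector)


css = build_stylesheet()

# To save it next to the site's other stylesheets:
# with open("pygments.css", "w") as f:
#     f.write(css)
```

Passing a selector prefixes every rule with it, so the highlighting styles apply only inside highlighted code blocks and not to the rest of the page.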

Here we go, enjoy the new colorful code presentation.

Update: As of Pygments 0.9, released October 14, 2007, the code presented here is included in the distribution as external/markdown-processor.py.
