Skip to content
June 18, 2008

Entitled URLs with Werkzeug

# -*- coding: utf-8 -*-

"""
Entitled URLs with Werkzeug
===========================

:Copyright: 2008 Jochen Kupperschmidt
:Date: 18-Jun-2008
:License: MIT
"""

from string import ascii_lowercase, digits

from werkzeug.routing import BaseConverter


class TitleConverter(BaseConverter):
    """A `Werkzeug routing`_ converter that turns strings (usually titles)
    into URL-suitable representations.

    The only characters allowed in the converter's output are ASCII lower case
    letters, digits and the dash; everything else will be removed.  A
    dictionary for custom replacement rules allows e.g. to replace umlauts
    with their two-letter representation to keep the result readable.  Spaces
    are per default replaced with dashes as it is quite common in web blog
    software like Wordpress and others.

    An example URL map using this converter::

        from werkzeug.routing import Map, Rule

        map = Map([
            Rule('/items/<int:id>', defaults={'title': None}, endpoint='show'),
            Rule('/items/<int:id>-<title:title>', endpoint='show'),
            ], converters={'title': TitleConverter})

    Note that there are basically two very similar URLs defined, one having a
    default.  The rules match both URL paths ending with just an ID, but also
    with any string following an ID and a dash.  Also note that a view
    callable that a request is probably dispatched to via the endpoint string
    has to take both the ID and the title as arguments, but can set the latter
    as a keyword argument with a default of ``None`` and ignore it apart from
    that.

    .. _Werkzeug routing: http://werkzeug.pocoo.org/documentation/routing
    """

    # allowed characters
    allowed = frozenset(ascii_lowercase + digits + '-')

    # replacements for alternative spellings instead
    # of having non-allowed characters filtered out
    replacements = {
        u'ä': u'ae',
        u'ö': u'oe',
        u'ü': u'ue',
        u'ß': u'ss',
        u' ': u'-'}

    def to_url(self, value):
        """Format ``value`` as a readable but URL-safe string."""
        value = value.lower()
        for old, new in self.replacements.iteritems():
            value = value.replace(old, new)
        value = value.encode('ascii', 'ignore')
        value = filter(lambda char: char in self.allowed, value)
        return value

Download "entitled_urls.py"

Tagged as , .