Single-pass multiple replace

Replace multiple patterns in a single pass with Python.

I once again wanted to replace multiple patterns in a single pass. Both the solution at ASPN and the one in the print version of the Python Cookbook felt to me like they were either missing something or more complicated than they needed to be, so here is my version:

import re

text = 'i ran from the bob.  after all, he\'s a large mother.'
subs = {
    'bob': 'panda',
    'mother': 'bear',
}

compiled = re.compile('|'.join(map(re.escape, subs)))
def lookup(match):
    return subs[match.group(0)]
new_text = compiled.sub(lookup, text)
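
To see why the single pass matters, here is a small sketch comparing it with chained str.replace calls. The cat/dog text and mapping are made up for illustration, and the chained result assumes Python 3.7+, where a dict keeps insertion order. Chained replaces feed each result into the next call, so a swap undoes itself; the regex version only ever looks at the original text.

```python
import re

text = 'the cat chased the dog'
subs = {'cat': 'dog', 'dog': 'cat'}  # hypothetical swap, for illustration only

# Chained replaces: the second call rewrites what the first one just produced.
chained = text
for old, new in subs.items():
    chained = chained.replace(old, new)

# Single pass: every match is looked up against the original text exactly once.
compiled = re.compile('|'.join(map(re.escape, subs)))
def lookup(match):
    return subs[match.group(0)]
single = compiled.sub(lookup, text)

print(chained)  # the cat chased the cat
print(single)   # the dog chased the cat
```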

My version could only get simpler if I replaced lookup with a lambda, but I'm not a fan. I don't mind using re directly, but it is a bit hard on the eyes, and unnecessarily so if I might use it more than once in the same project. For that situation, I'd use the closure version from the second edition of the Python Cookbook. It reads as follows:

def make_xlat(*args, **kwds):
    adict = dict(*args, **kwds)
    rx = re.compile('|'.join(map(re.escape, adict)))
    def one_xlat(match):
        return adict[match.group(0)]
    def xlat(text):
        return rx.sub(one_xlat, text)
    return xlat
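
Because make_xlat hands its arguments straight to dict, it should accept anything dict accepts. The sketch below repeats the closure so it runs on its own and builds the same translator four ways; the names from_dict, from_pairs, from_kwds and mixed are mine, not the book's.

```python
import re

def make_xlat(*args, **kwds):
    adict = dict(*args, **kwds)
    rx = re.compile('|'.join(map(re.escape, adict)))
    def one_xlat(match):
        return adict[match.group(0)]
    def xlat(text):
        return rx.sub(one_xlat, text)
    return xlat

# Four equivalent ways to describe the same substitutions.
from_dict = make_xlat({'bob': 'panda', 'mother': 'bear'})
from_pairs = make_xlat([('bob', 'panda'), ('mother', 'bear')])
from_kwds = make_xlat(bob='panda', mother='bear')
mixed = make_xlat({'bob': 'panda'}, mother='bear')

text = "i ran from the bob.  after all, he's a large mother."
results = [f(text) for f in (from_dict, from_pairs, from_kwds, mixed)]
print(results[0])  # i ran from the panda.  after all, he's a large bear.
```

The regex is compiled once per translator, so each of these can be reused on as many strings as needed.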

As described in the book, a translator for a particular dictionary can then be created with a simple translate = make_xlat(adict). Finally, I prefer the class version in the book to the one at ASPN because it's a "drop-in replacement" for the closure, and because the names make more sense to me. Here it is:

class make_xlat:
    def __init__(self, *args, **kwds):
        self.adict = dict(*args, **kwds)
        self.rx = self.make_rx()
    def make_rx(self):
        return re.compile('|'.join(map(re.escape, self.adict)))
    def one_xlat(self, match):
        return self.adict[match.group(0)]
    def __call__(self, text):
        return self.rx.sub(self.one_xlat, text)
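
One thing the class version buys you is that make_rx is a separate method, so a subclass can change how the pattern is built without touching anything else. Here is a minimal sketch of that idea, matching whole words only. The subclass name make_xlat_whole_words and the sample sentence are my own for illustration, and the \b anchors assume the keys start and end with word characters.

```python
import re

class make_xlat:
    def __init__(self, *args, **kwds):
        self.adict = dict(*args, **kwds)
        self.rx = self.make_rx()
    def make_rx(self):
        return re.compile('|'.join(map(re.escape, self.adict)))
    def one_xlat(self, match):
        return self.adict[match.group(0)]
    def __call__(self, text):
        return self.rx.sub(self.one_xlat, text)

class make_xlat_whole_words(make_xlat):
    # Hypothetical variant: only the pattern construction is overridden.
    def make_rx(self):
        words = '|'.join(map(re.escape, self.adict))
        return re.compile(r'\b(?:%s)\b' % words)

subs = {'bob': 'panda', 'mother': 'bear'}
sample = 'bob met a bobcat and a grandmother'

plain_result = make_xlat(subs)(sample)
whole_result = make_xlat_whole_words(subs)(sample)

print(plain_result)  # panda met a pandacat and a grandbear
print(whole_result)  # panda met a bobcat and a grandmother
```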

This recipe is a great example of the easy adaptability of Python for similar, but subtly different purposes. It's also a great example of just how educational each one of the recipes in the Python Cookbook can be. Damn that's a fine tome.

keywords: python, regex, multiple-replace
created 2007-12-18
last modified 2009-01-12