Saturday, February 23, 2008

Programmable Python Syntax via Source Encodings

Because of my recent forays into Common Lisp and Haskell, this recipe on ASPN piqued my interest. It demonstrates using Python source file encodings as a hook for introducing alternate Python syntax. Here's the simplest example of the technique I could come up with:

codec1.py:
import codecs

class StreamReader(codecs.StreamReader):
    def decode(self, input, errors='strict'):
        # Naive textual rewrites applied to the raw source text.
        output = input.replace('until ', 'while not ')
        output = output.replace('++', '+= 1')
        return unicode(output), len(input)

def get_my_codec(name):
    # Codec search function: respond only to our made-up encoding name.
    if name == 'play-language':
        return (codecs.utf_8_encode, None, StreamReader, None)

codecs.register(get_my_codec)

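The above creates a codec called "play-language" and defines a very, very simple-minded source code transformation: plain text replacement, which would also rewrite "until " or "++" wherever they appear inside string literals and comments. For actual use, you'd need to work harder at parsing the code.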
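One way to work harder at parsing, sketched here under some assumptions, is to rewrite tokens instead of raw text so that string literals and comments are left alone. This sketch uses the tokenize module as it exists in Python 3 (the listing above is Python 2), and `translate` and `SAMPLE` are hypothetical names used for illustration, not part of the recipe.

```python
import io
import tokenize


def translate(source):
    """Rewrite `until` and a statement-ending `++` at the token level."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.type == tokenize.NAME and tok.string == "until":
            # `until` used as a keyword becomes `while not`.
            out.append((tokenize.NAME, "while"))
            out.append((tokenize.NAME, "not"))
        elif (tok.type == tokenize.OP and tok.string == "+"
              and nxt is not None
              and nxt.type == tokenize.OP and nxt.string == "+"
              and tokens[i + 2].type in (tokenize.NEWLINE, tokenize.COMMENT)):
            # Two `+` tokens that end a statement become `+= 1`.
            out.append((tokenize.OP, "+="))
            out.append((tokenize.NUMBER, "1"))
            i += 1  # skip the second "+"
        else:
            out.append((tok.type, tok.string))
        i += 1
    # With (type, string) pairs, untokenize emits valid code with loose spacing.
    return tokenize.untokenize(out)


SAMPLE = '''\
msg = "wait until c++ is done"
c = 0
until c == 10:
    c++
'''

namespace = {}
exec(translate(SAMPLE), namespace)
print(namespace["c"])    # prints 10
print(namespace["msg"])  # the string literal is left untouched
```

The text inside the string literal survives because the tokenizer hands it over as a single STRING token, which the loop copies through unchanged. A function like this could be called from the codec's decode method in place of the two replace calls.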

Now, if you create a file named play1.py with these contents:
# coding=play-language

c = 0
until c == 10:
    print c
    c++

And execute this from a command line, importing the codec module first so that the codec is registered before play1 is compiled:
python -c 'import codec1;import play1'

You'll get this output:
0
1
2
3
4
5
6
7
8
9
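To see what the codec actually hands to the compiler, here is a rough Python 3 sketch that registers a similar codec and decodes a byte string directly. It only exercises the codec itself and does not go through the "# coding=" import path. The names `decode_play`, `find_codec`, and `SOURCE` are hypothetical, and the underscore handling assumes the name normalization that Python 3.9 and later apply before calling search functions.

```python
import codecs


def decode_play(data, errors="strict"):
    # Decode the raw bytes as UTF-8, then apply the same naive rewrites.
    text = bytes(data).decode("utf-8", errors)
    text = text.replace("until ", "while not ").replace("++", "+= 1")
    return text, len(data)


def find_codec(name):
    # Python 3.9+ passes the name with hyphens converted to underscores,
    # so normalize before comparing.
    if name.replace("-", "_") == "play_language":
        return codecs.CodecInfo(
            encode=codecs.utf_8_encode,
            decode=decode_play,
            name="play-language",
        )
    return None


codecs.register(find_codec)

SOURCE = b"c = 0\nuntil c == 10:\n    c++\n"
rewritten = codecs.decode(SOURCE, "play-language")
print(rewritten)
# c = 0
# while not c == 10:
#     c+= 1
```

The rewritten text is ordinary Python, which is why the interpreter can run play1 once the codec has decoded it.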