Introduction to Python Decorators

This post was originally written in August 2009. Since then, I have come to believe that there is a much better way to explain Python decorators, which I describe in another post.


Writing introductions to decorators is a popular pastime in the Python community. Here, for example, are some useful links on the subject:

But when it comes to technical topics, everyone has his or her own style of learning and one size of explanation does not fit all.

So I thought I’d try my hand at writing an introduction to Python decorators. My goal is not to explain everything about decorators. Instead I want to try to explain just the basics, just enough to give you a workable mental model of what decorators are and how to use them. Just enough to get started on doing useful work with decorators.

As Aristotle said, “Let us begin at the beginning”, which is to say, we begin by looking at functions.

Functions

When the Python interpreter encounters this code:

def hello():
    print ("Hello, world!")

it:

  • compiles the code to create a function object
  • binds the name “hello” to that function object.

Then, to run the function object, you can code

hello()

which causes this to be printed:

Hello, world!

If you code:

print (hello)

you will get something like:

<function hello at 0x02D021E0>

which is the string representation of the hello function object.
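Because "hello" is just a name bound to a function object, that object can be bound to other names and passed around like any other value. Here is a minimal sketch; the name greet is hypothetical and is introduced only for illustration.

```python
def hello():
    print("Hello, world!")

# Bind a second name to the same function object. There are no
# parentheses after "hello", so the function is not called here.
greet = hello

print(greet is hello)    # both names refer to one object
print(callable(greet))   # function objects can be called
greet()                  # calls the function through the new name
```

This is the property that decorators rely on: a function object can be handed to another function as an argument.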
 

Annotations

Many discussions of decorators use the word “decorator” rather loosely, to refer to different decorator-related concepts. This kind of ambiguity is disconcerting at best, and confusing at worst.

To help avoid this ambiguity I will use the term “annotation” in this discussion to refer to lines of code that begin with “@”.

Here is a snippet of Python code that begins with two annotations:

@helloGalaxy
@helloSolarSystem
def hello():
    print ("Hello, world!")

We can say that the definition of the hello function is “decorated” with these two annotations. Since there are multiple annotations, we say that the annotations are “stacked”.

When the interpreter sees these lines of code, here is what it does.

  • It pushes helloGalaxy onto the annotation stack.
  • It pushes helloSolarSystem onto the annotation stack.

then it does the standard processing for a function definition …

  • It compiles the code for hello into a function object (let’s call it functionObject1)
  • It binds the name “hello” to functionObject1.

then…

  • It pops helloSolarSystem off of the annotation stack,
  • passes functionObject1 to helloSolarSystem
  • helloSolarSystem returns a new function object (let’s call it functionObject2), and…
  • the interpreter binds the original name “hello” to functionObject2

then…

  • It pops helloGalaxy off of the annotation stack,
  • passes functionObject2 to helloGalaxy
  • helloGalaxy returns a new function object (let’s call it functionObject3), and…
  • the interpreter binds the original name “hello” to functionObject3

As you can see, this process could be repeated for indefinitely many annotations.

I’ve been vague about what kind of things helloSolarSystem and helloGalaxy are. For now, think of them as a special kind of function: one that takes a function object as an argument, and returns another function object as a result. The annotations:

@helloGalaxy
@helloSolarSystem

are calls to these functions. So this snippet of Python code:

@helloGalaxy
@helloSolarSystem
def hello():
    print ("Hello, world!")

is functionally equivalent to this:

def hello():
    print ("Hello, world!")
hello = helloSolarSystem(hello)
hello = helloGalaxy(hello)
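The order of application can be checked with a small sketch. The decorators outer and inner, the list applied, and the function hello2 are all hypothetical names made up for this illustration; each decorator only records that it ran and returns the function unchanged.

```python
applied = []  # records the order in which the decorators run

def outer(func):
    # hypothetical decorator: notes that it ran, returns func unchanged
    applied.append("outer")
    return func

def inner(func):
    # hypothetical decorator: notes that it ran, returns func unchanged
    applied.append("inner")
    return func

# Version 1: stacked annotations.
@outer
@inner
def hello():
    print("Hello, world!")

# Version 2: the same thing spelled out by hand.
def hello2():
    print("Hello, world!")
hello2 = inner(hello2)
hello2 = outer(hello2)

print(applied)
```

In both versions the annotation closest to the def line is applied first, so the list shows "inner" before "outer" each time.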

Decorators

Now we are ready to define “decorator”.

A decorator is a function that is called by an annotation.

Where do decorators come from?

You write them, just the way that you write other function definitions.

So let’s write some decorators. Here is helloSolarSystem.

def helloSolarSystem(original_function):
    def new_function():
        original_function()  # the () after "original_function" causes original_function to be called
        print("Hello, solar system!")
    return new_function

And let’s write helloGalaxy.

def helloGalaxy(original_function):
    def new_function():
        original_function()  # the () after "original_function" causes original_function to be called
        print("Hello, galaxy!")
    return new_function

As you can see, both of these decorators add a bit of functionality to the function object — original_function — that they receive as input. They wrap a call to original_function in a new function, new_function, put some additional functionality in new_function, and then they return the new_function object. (They return the new_function object to the annotation, which binds the original name to the new function object.)

So now let’s run our whole program and see what we get. Here’s the program.

def helloSolarSystem(original_function):
    def new_function():
        original_function()  # the () after "original_function" causes original_function to be called
        print("Hello, solar system!")
    return new_function
	
def helloGalaxy(original_function):
    def new_function():
        original_function()  # the parentheses after "original_function" cause original_function to be called
        print("Hello, galaxy!")
    return new_function

@helloGalaxy
@helloSolarSystem
def hello():
    print ("Hello, world!")

# Here is where we actually *do* something!
hello()

And here is what we get:

Hello, world!
Hello, solar system!
Hello, galaxy!

Arguments to functions

Now let’s look at decorating functions that take arguments. Let’s modify the hello function so it accepts an argument, like this:

def hello(targetName=None):
    if targetName:
        print("Hello, " + targetName + "!")
    else:
        print("Hello, world!")

If we were to run an undecorated version of the hello function, we’d get a nice greeting, like this:

>>> hello("Earth")
Hello, Earth!

But if we run the decorated version of the hello function, we get this:

TypeError: new_function() takes no arguments (1 given)

What’s the problem?

Remember that we wrapped functionObject1 (created from hello) in functionObject2 (created from helloSolarSystem) and then in functionObject3 (created from helloGalaxy), and then bound the name “hello” to functionObject3. So when we use the “hello” function, we are calling functionObject3.

FunctionObject3 was created by the code for new_function in helloGalaxy, and it accepts no arguments. Which is why we get the error message:

TypeError: new_function() takes no arguments (1 given)
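The failure can be reproduced in a few lines. This is only a sketch, using a single decorator to keep it short; the exact wording of the TypeError message depends on the Python version, so it may not match the message shown above.

```python
def helloGalaxy(original_function):
    def new_function():          # accepts no arguments
        original_function()
        print("Hello, galaxy!")
    return new_function

@helloGalaxy
def hello(targetName=None):
    if targetName:
        print("Hello, " + targetName + "!")
    else:
        print("Hello, world!")

# The name "hello" now refers to the wrapper, not the original function.
print(hello.__name__)

try:
    hello("Earth")
except TypeError as error:
    # The wording of this message varies between Python versions.
    print("TypeError:", error)
```

Printing hello.__name__ shows that the object being called is new_function, which is why the argument is rejected.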

The solution is to add support for arguments to the function objects that our decorators create. We need to change new_function so that it accepts arguments, and we need to change the call to original_function so that it passes along the arguments that its wrapper (new_function) receives.

def helloSolarSystem(original_function):
    def new_function(*args, **kwargs):
        original_function(*args, **kwargs)
        print("Hello, solar system!")
    return new_function

def helloGalaxy(original_function):
    def new_function(*args, **kwargs):
        original_function(*args, **kwargs)
        print("Hello, galaxy!")
    return new_function

And now:

>>> hello("Earth")
Hello, Earth!
Hello, solar system!
Hello, galaxy!
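As a quick check, here is a sketch showing that a wrapper written with *args and **kwargs passes along positional arguments, keyword arguments, and no arguments at all. The helper captured is hypothetical; it exists only to collect what hello prints so that the three calls can be compared.

```python
import contextlib
import io

def helloSolarSystem(original_function):
    def new_function(*args, **kwargs):
        original_function(*args, **kwargs)
        print("Hello, solar system!")
    return new_function

@helloSolarSystem
def hello(targetName=None):
    if targetName:
        print("Hello, " + targetName + "!")
    else:
        print("Hello, world!")

def captured(*args, **kwargs):
    # hypothetical helper: run hello() and return the lines it printed
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        hello(*args, **kwargs)
    return buffer.getvalue().splitlines()

print(captured())                    # no argument
print(captured("Earth"))             # positional argument
print(captured(targetName="Mars"))   # keyword argument
```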

Gotcha — forgetting parentheses

In Python, omitting the trailing parentheses from the end of a method call (one that takes no arguments) is not a syntax error. The place where this most frequently bites me is with the “close” method on file objects. Suppose you have an output file called “foo” and you want to close it. The correct way to do this is:

foo.close()

However, if you accidentally omit the trailing parentheses, and code this:

foo.close

Python will not report a syntax error, because this is not an error in Python. In Python, this is a perfectly legitimate statement that returns the method object “close”. (Remember that methods are first-class objects in Python.) If you do this in the Python interpreter, you will get a message like this:

<built-in method close of file object at 0x007E6AE0>

The nastiness about this gotcha is that if you fail to code the trailing parentheses on a “close” method for an output file, the output file will not be closed properly. The file’s output buffer will not be flushed out to disk, and the part of the output stream that was still left in the output buffer will be lost. After your program finishes, part of your output file will be missing, and you won’t know why.

The best way of dealing with this gotcha is just to be aware that it can be a problem, and to be alert. Be careful to code the parentheses on your method calls, and especially careful to code them on calls to the “close” method of file objects.

And if you find yourself with an output file that seems to be inexplicably truncated, your first thought should be to check for missing parentheses in the file.close() statement that closes the file.

Programs like PyChecker and PyLint may be able to detect this kind of error, which is one good reason to use them.
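The difference between the two spellings can be seen by checking the file object's closed attribute. The sketch below writes to a temporary file (the path is made up for the example). It also shows one way to sidestep the gotcha that is not discussed above: a with block, which closes the file when the block ends, so there is no close call to mistype.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")

foo = open(path, "w")
foo.write("some output")
foo.close                      # no parentheses: names the method, calls nothing
still_open = not foo.closed    # the file has not been closed
foo.close()                    # the call that was intended

# A with block closes the file on exit.
with open(path, "w") as bar:
    bar.write("some output")

print(still_open, foo.closed, bar.closed)
```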

Gotcha — backslashes in Windows filenames

Once upon a time there was a beautiful Windows programmer named Red Ridinghood.

One day, Red’s supervisor told her that they were going to start building a new application called GrandmasHouse. The feature list for the application was so long that they would never have attempted to get to GrandmasHouse if they hadn’t learned about a shortcut through Python Woods that would make the journey much shorter.

So Red started working her way through Python, and indeed found the going quick and easy. She loved the woods, and was happy to be traveling in them.

There was only one problem. Her programs did a lot of file manipulation, and so she had to do a lot of coding of filenames. Windows filenames used a backslash as a separator, but within Python the backslash had the magic power of an escape character, so every time she wanted a backslash in a filename she had to code two backslashes, like this:

myServer = "\\\\aServer" # ==> \\aServer
myFilename = myServer + "\\aSharename\\aDirName\\aFilename"

This feature of Python got very old very quickly. Red started calling it The Wolf, and it was the one part of Python that she hated.

One day as she was walking through the forest, she came to a clearing. In the clearing was a charming little pub, and inside the pub she met a tall, dark, and handsome stranger named Rawstrings.

Rawstrings said he could save her from The Wolf. All she had to do, he said, was to put an “r” in front of her quoted string literals. This would change them from escaped strings into raw strings. The backslash would lose its powers as an escape character, and become just an ordinary character. For example, with raw strings, you could code

r"\t"

and you wouldn’t get a string containing a single tab character; you would get a string containing the backslash character followed by “t”.

So instead of coding

myServer = "\\\\aServer"

Red could just code

myServer = r"\\aServer"

Red was seduced by the things that Rawstrings was telling her, and she began to spend a lot of time in his company.

Then one day, she coded

myDirname = r"c:\aDirname\"

and her program blew up with the following message:

    myDirname = r"c:\aDirname\"
                              ^
SyntaxError: invalid token

After some experimenting, she discovered that — contrary to what Rawstrings had told her — the backslash seemingly hadn’t lost all of its magic powers after all. For example, she could code:

aString = r"abc\"xyz"
print(aString)

When she did this, it seemed perfectly legal. The double-quote just before “xyz” did not close the raw string at all. Somehow the backslash seemed to protect it — it wasn’t recognized as the closing delimiter of the raw string, but was included in the string. When she coded

print(aString)

she got

abc\"xyz

It was this protective character that the backslash had acquired that made

myDirname = r"c:\aDirname\"

blow up. The final backslash was protecting the closing double-quote, so it was not being recognized as a closing quote. And since there was nothing after the double-quote, the raw string was not closed, and she got an error. She tried coding the raw string with two backslashes at the end — as if the backslash was an escape character —

myDirname = r"c:\aDirname\\"

but that didn’t do it either. Instead of getting the single closing backslash that she wanted, she got two backslashes at the end:

c:\aDirname\\

She was in despair. She couldn’t figure out any way to use raw strings to put a single backslash at the end of a string, and she didn’t want to have to go back to fighting The Wolf.

Fortunately, at this point she confided her troubles to Paul Woodman, a co-worker who had started exploring Python a few months earlier. Here is what he told her.

In raw strings, backslashes do not have the magical power of escape characters that they do in regular strings. But they don’t lose all of their magical powers.

In raw strings — as you discovered — backslashes have the magical power of protection characters. Basically, this means that a backslash protects any character that follows it from being recognized as the closing delimiter of the raw string.

Coming from a Windows programming background, you assumed that support for raw strings was a feature whose purpose was to make the work of coding Windows filenames easier by removing the magical escape character powers from the backslash. And you were surprised to discover that raw strings aren’t truly raw in the way that you expected — raw in the sense that the backslash had no magical powers.

The reason for the special powers of backslashes in raw strings is that — contrary to what you assumed — raw strings were not developed to make it easier for Windows programmers to code filenames containing backslash characters. In fact, raw strings were originally developed to make the work of coding regular expressions easier. In raw strings, the backslash has the magical power of a protection character because that is just the kind of behavior it needs to have in order to make it easier to code regular expressions. The feature that you can’t end a raw string with a single backslash is not a bug. It is a feature, because it is not legal to end a regular expression with a single backslash (or an odd number of backslashes).

Unfortunately for you, this power makes it impossible to create a raw string that ends in a single backslash, or in an odd number of backslashes. So raw strings won’t do what you want them to, namely save you from The Wolf.
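The behavior that Red ran into can be checked directly. This is a small sketch; the variable names are made up for the illustration.

```python
tab_like = r"\t"
print(len(tab_like))             # a backslash and a "t": two characters

protected = r"abc\"xyz"
print(protected)                 # the backslash stays in the string
print(len(protected))

doubled = r"c:\aDirname\\"
print(doubled.endswith("\\\\"))  # ends with two backslashes, not one
```

In each case the backslash is kept in the string exactly as typed; it only affects whether the following quote character ends the literal.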

But don’t despair! There is a way…

In Python, there are a number of functions in the os.path module that change forward slashes in a string to the appropriate filename separator for the platform that you are on. One of these functions is os.path.normpath(). The trick is to enter all of your filename strings using forward slashes, and then let os.path.normpath() change them to backslashes for you, like this:

import os
myDirname = os.path.normpath("c:/aDirname/")

It takes a bit of practice to get into the habit of specifying filenames this way, but you’ll find that you adapt to it surprisingly easily, and you’ll find it a lot easier than struggling with The Wolf.
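Here is a sketch of what normpath produces. On Windows, os.path is the ntpath module, so the sketch imports ntpath directly in order to behave the same way on any platform; on other platforms os.path.normpath() leaves forward slashes alone. Note that normpath also drops a trailing separator. Joining the result with an empty final component is one way to put it back.

```python
import ntpath  # the Windows implementation of os.path, importable anywhere

print(ntpath.normpath("c:/newproject/typenames.txt"))

# normpath drops the trailing separator.
dirname = ntpath.normpath("c:/aDirname/")
print(dirname)

# Joining with an empty final component adds a trailing separator.
print(ntpath.join(dirname, ""))
```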

Red was super happy to hear this. She transferred to Woodman’s project team, and they all coded happily ever after!

Gotcha — backslashes are escape characters

This is a language feature that is so common on Unix that Unix programmers never think twice about it. Certainly, a Unix programmer would never consider it to be a gotcha. But for someone coming from a Windows background, it may very well be unfamiliar.

The gotcha may occur when you try to code a Windows filename like this:

myFilename = "c:\newproject\typenames.txt"
myFile = open(myFilename, "r")

and — even though the input file exists — when you run your program, you get the error message

IOError: [Errno 2] No such file or directory:
'c:\newproject\typenames.txt'

To find out what’s going on, you put in some debugging code:

myFilename = "c:\newproject\typenames.txt"
print("(" + myFilename + ")")

And what you see printed on the console is:

(c:
ewproject       ypenames.txt)

What has happened is that you forgot that in Python (as in most languages that evolved in a Unix environment) in quoted string literals the backslash has the magical power of an escape character. This means that a backslash isn’t interpreted as a backslash, but as a signal that the next character is to be given a special interpretation. So when you coded

myFilename = "c:\newproject\typenames.txt"

the “\n” that begins “\newproject” was interpreted as the newline character, and the “\t” that begins “\typenames.txt” was interpreted as the tab character. That’s why, when you printed the filename, you got the result that you did. And it is why Python couldn’t find your file — because no file with the name

c:(newline)ewproject(tab)ypenames.txt

could be found.

To put a backslash into a string, you need to code two backslashes, that is, the escape character followed by a backslash. So to get the filename that you wanted, you needed to code

myFilename = "c:\\newproject\\typenames.txt"

And under some circumstances, if Python prints information to the console, you will see the two backslashes rather than one. For example, this is part of the difference between the repr() function and the str() function.

myFilename = "c:\\newproject\\typenames.txt"
print(repr(myFilename), str(myFilename))

produces

'c:\\newproject\\typenames.txt' c:\newproject\typenames.txt

Escape characters are documented in the Python Language Reference Manual. If they are new to you, you will find them disconcerting for a while, but you will gradually grow to appreciate their power.
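The three ways of writing the filename can be compared side by side. This is a small sketch with made-up variable names. The raw string works here because the filename does not end in a backslash.

```python
broken = "c:\newproject\typenames.txt"     # \n and \t are escape sequences
escaped = "c:\\newproject\\typenames.txt"  # each \\ is one backslash
raw = r"c:\newproject\typenames.txt"       # raw string: backslashes kept as typed

print("\n" in broken, "\t" in broken)  # the newline and tab really are there
print(escaped == raw)                  # two spellings of the same string
print(len(broken), len(escaped))       # the broken version is shorter
```

The broken string is two characters shorter because each of “\n” and “\t” became a single character, swallowing the letter that followed the backslash.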

Python Gotchas

What is a “gotcha”?

The word “gotcha” started out as the expression “Got you!” This is something that someone who speaks idiomatic American English might say when he succeeds in playing a trick or prank on someone else. “I really got you with that trick!”

The expression “Got you!” is pronounced “Got ya!” or “Got cha!”.

Among computer programmers, “gotcha” has become a term for a feature of a programming language that is likely to play tricks on you by displaying behavior that is different from what you expect.

Just as a fly or a mosquito can “bite” you, we say that a gotcha can “bite” you.

About this Page

This is a page devoted to Python “gotchas”. Python is a very clean and intuitive language, so it hasn’t got many gotchas, but it still has a few that often bite beginning Python programmers. My hope is that if you are warned in advance about these gotchas, you won’t be bit quite so hard!

Note that a gotcha isn’t necessarily a problem in the language itself. Rather, it is a situation in which there is a mismatch between the programmer’s expectations of how the language will work, and the way the language actually does work. Often, the source of a gotcha lies not in the language, but in the programmer. Part of what creates a programmer’s expectations is his own personal background. A programmer with a Windows or mainframe background, or a background in COBOL or the Algol-based family of languages (PL/1, Pascal, etc.), is especially prone to experiencing gotchas in Python, a language that evolved in a Unix environment and incorporates a number of conventions of the C family of programming languages (C, C++, Java).

If you’re such a programmer, don’t worry. There aren’t many Python gotchas. Keep learning Python. It is a great language, and you’ll soon come to love it.
