enum in Python

Recently I was reading a post by Eli Bendersky (one of my favorite bloggers) and I ran across a sentence in which Eli says “It’s a shame Python still doesn’t have a functional enum type, isn’t it?”

The comment startled me because I had always thought that it was obvious how to do enums in Python, and that it was obvious that you don’t need any special language features to do it. Eli’s comment made me think that I might need to do a reality-check on my sense of what was and was not obvious about enums in Python.

So I googled around a bit and found that there are a lot of different ideas about how to do enums in Python. I found a very large set of suggestions on StackOverflow here and here and here. There is a short set of suggestion on Python Examples. The ActiveState Python Cookbook has a long recipe, and PEP-354 is a short proposal (that has been rejected). Surprisingly, I found only a couple of posts that suggested what had seemed to me to be THE obvious solution. The clearest was by snakile on StackOverflow.

Anyway, to end the suspense, the answer that seemed to me so obvious was this. An enum is an enumerated data type. An enumerated data type is a type, and a type is a class.

class           Color : pass
class Red      (Color): pass
class Yellow   (Color): pass
class Blue     (Color): pass

Which allows you to do things like this.

class Toy: pass

myToy = Toy()

myToy.color = "blue"  # note we assign a string, not an enum

if myToy.color is Color:
    pass
else:
    print("My toy has no color!!!")    # produces:  My toy has no color!!!

myToy.color = Blue   # note we use an enum

print("myToy.color is", myToy.color.__name__)  # produces: myToy.color is Blue
print("myToy.color is", myToy.color)           # produces: myToy.color is <class '__main__.Blue'>

if myToy.color is Blue:
    myToy.color = Red

if myToy.color is Red:
    print("my toy is red")   # produces: my toy is red
else:
    print("I don't know what color my toy is.")

So that’s what I came up with.

But with so many intelligent people all trying to answer the same question, and coming up with such a wide array of different answers, I had to fall back and ask myself a few questions.

  • Why am I seeing so many different answers to what seems like a simple question?
  • Is there one right answer? If so, what is it?
  • What is the way — the best, or most widely-used, or most pythonic — way to do enums in Python?
  • Is the question really as simple as it seems?

For me, the jury is still out on most of these questions, but until they return with a verdict I have come up with two thoughts on the subject.

First, I think that many programmers come to Python with backgrounds in other languages — C or C++, Java, etc. Their experiences with other languages shape their conceptions of what an enum — an enumerated data type — is. And when they ask “How can I do enums in Python?” they’re asking a question like the question that sparked the longest thread of answers on StackOverflow:

I’m mainly a C# developer, but I’m currently working on a project in Python. What’s the best way to implement the equivalent of an enum [i.e. a C# enum] in Python?

So naturally, the question “How can I implement in Python the equivalent of the kind of enums that I’m familiar with in language X?” has at least as many answers as there are values of X.

My second thought is somewhat related to the first.

Python developers believe in duck typing. So a Python developer’s first instinct is not to ask you:

What do you mean by “enum”?

A Python developer’s first instinct is to ask you:

What kinds of things do you think an “enum” should be able to do?
What kinds of things do you think you should be able to do with an “enum”?

And I think that different developers probably have very different ideas about what one should be able to do with an “enum”. Naturally, that leads them to propose different ways of implementing enums in Python.

As a simple example, consider the question — Should you be able to sort enums?

My personal inclination is to say that — in the most conceptually pure sense of “enum” — the concept of sorting enums makes no sense. And my suggestion for implementing enums in Python reflects this. Suppose you implement a “Color” enum using the technique that I’ve proposed, and then try to sort enums.

# how do enumerated values sort?
colors = [Red, Yellow, Blue]
colors.sort()
for color in colors:
    print(color.__name__)

What you get is this:

Traceback (most recent call last):
  File "C:/Users/ferg_s/pydev/enumerated_data_types/edt.py", line 32, in <module>
    colors.sort()
TypeError: unorderable types: type() < type()

So that suites me just fine.

But I can easily imagine someone (myself?) working with an enum for, say, Weekdays (Sunday, Monday, Tuesday… Saturday). And I think it might be reasonable in that situation to want to be able to sort Weekdays and to do greater than and less than comparisons on them.

So if we’re talking duck typing, I’m happy with enums/ducks that are motionless and silent. My only requirement is that they be different from everything else and different from each other. But I can easily imagine situations where one might reasonably need/want/prefer ducks that can form a conga line, dance, and sing a few bars. And for those situations, you obviously need more elaborate implementations of enums.

So, with these thoughts in mind, I’m inclined to think that there is no single, best way to implement an enum in Python. The concept of an enum is flexible enough to cover a variety of implementations offering a variety of features.

Python’s magic methods

Here are some links to documentation of Python’s magic methods, aka special methods, aka “dunder” (double underscore) methods.

There are also a few other Python features that are sometimes characterized as “magic”.

I’m sure there are other useful Web pages about magic methods that I haven’t found. If you know of one (and feel like sharing it) note that you can code HTML tags into a WordPress comment, like this, and they will show up properly formatted:

I found a useful discussion of magic methods at
<a href="http://www.somebodys_web_site.com/magic-methods">www.somebodys_web_site.com/magic-methods</a>

 

Yet Another Lambda Tutorial

modified to use the WordPress [sourcecode] tag — 2012-01-14

There are a lot of tutorials[1] for Python’s lambda out there. One that I stumbled across recently and really found helpful was Mike Driscoll’s discussion of lambda on the Mouse vs Python blog.

When I first started learning Python, one of the most confusing concepts to get my head around was the lambda statement. I’m sure other new programmers get confused by it as well…

Mike’s discussion is excellent: clear, straight-forward, with useful illustrative examples. It helped me — finally — to grok lambda, and led me to write yet another lambda tutorial.

 


Lambda: a tool for building functions

Basically, Python’s lambda is a tool for building functions (or more precisely, function objects). That means that Python has two tools for building functions: def and lambda.

Here’s an example. You can build a function in the normal way, using def, like this:

def square_root(x): return math.sqrt(x)

or you can use lambda:

square_root = lambda x: math.sqrt(x)

Here are a few other interesting examples of lambda:

sum = lambda x, y:   x + y   #  def sum(x,y): return x + y
out = lambda   *x:   sys.stdout.write(" ".join(map(str,x)))
lambda event, name=button8.getLabel(): self.onButton(event, name)

 


What is lambda good for?

A question that I’ve had for a long time is: What is lambda good for? Why do we need lambda?

The answer is:

  • We don’t need lambda, we could get along all right without it. But…
  • there are certain situations where it is convenient — it makes writing code a bit easier, and the written code a bit cleaner.

What kind of situations?

Well, situations in which we need a simple one-off function: a function that is going to be used only once.

Normally, functions are created for one of two purposes: (a) to reduce code duplication, or (b) to modularize code.

  • If your application contains duplicate chunks of code in various places, then you can put one copy of that code into a function, give the function a name, and then — using that function name — call it from various places in your code.
  • If you have a chunk of code that performs one well-defined operation — but is really long and gnarly and interrupts the otherwise readable flow of your program — then you can pull that long gnarly code out and put it into a function all by itself.

But suppose you need to create a function that is going to be used only once — called from only one place in your application. Well, first of all, you don’t need to give the function a name. It can be “anonymous”. And you can just define it right in the place where you want to use it. That’s where lambda is useful.

But, but, but… you say.

  • First of all — Why would you want a function that is called only once? That eliminates reason (a) for making a function.
  • And the body of a lambda can contain only a single expression. That means that lambdas must be short. So that eliminates reason (b) for making a function.

What possible reason could I have for wanting to create a short, anonymous function?

Well, consider this snippet of code that uses lambda to define the behavior of buttons in a Tkinter GUI interface. (This example is from Mike’s tutorial.)

frame = tk.Frame(parent)
frame.pack()

btn22 = tk.Button(frame, 
        text="22", command=lambda: self.printNum(22))
btn22.pack(side=tk.LEFT)

btn44 = tk.Button(frame, 
        text="44", command=lambda: self.printNum(44))
btn44.pack(side=tk.LEFT)

The thing to remember here is that a tk.Button expects a function object as an argument to the command parameter. That function object will be the function that the button calls when it (the button) is clicked. Basically, that function specifies what the GUI will do when the button is clicked.

So we must pass a function object in to a button via the command parameter. And note that — since different buttons do different things — we need a different function object for each button object. Each function will be used only once, by the particular button to which it is being supplied.

So, although we could code (say)

def __init__(self, parent):
    """Constructor"""
    frame = tk.Frame(parent)
    frame.pack()

    btn22 = tk.Button(frame, 
        text="22", command=self.buttonCmd22)
    btn22.pack(side=tk.LEFT)

    btn44 = tk.Button(frame, 
        text="44", command=self.buttonCmd44)
    btn44.pack(side=tk.LEFT)

def buttonCmd22(self):
    self.printNum(22)

def buttonCmd44(self):
    self.printNum(44)

it is much easier (and clearer) to code

def __init__(self, parent):
    """Constructor"""
    frame = tk.Frame(parent)
    frame.pack()

    btn22 = tk.Button(frame, 
        text="22", command=lambda: self.printNum(22))
    btn22.pack(side=tk.LEFT)

    btn44 = tk.Button(frame, 
        text="44", command=lambda: self.printNum(44))
    btn44.pack(side=tk.LEFT)

When a GUI program has this kind of code, the button object is said to “call back” to the function object that was supplied to it as its command.

So we can say that one of the most frequent uses of lambda is in coding “callbacks” to GUI frameworks such as Tkinter and wxPython.

 


This all seems pretty straight-forward. So…

Why is lambda so confusing?

There are four reasons that I can think of.

First Lambda is confusing because: the requirement that a lambda can take only a single expression raises the question: What is an expression?

A lot of people would like to know the answer to that one. If you Google around a bit, you will see a lot of posts from people asking “In Python, what’s the difference between an expression and a statement?”

One good answer is that an expression returns (or evaluates to) a value, whereas a statement does not. Unfortunately, the situation is muddled by the fact that in Python an expression can also be a statement. And we can always throw a red herring into the mix — assigment statements like a = b = 0 suggest that Python supports chained assignments, and that assignment statements return values. (They do not. Python isn’t C.)[2]

In many cases when people ask this question, what they really want to know is: What kind of things can I, and can I not, put into a lambda?

And for that question, I think a few simple rules of thumb will be sufficient.

  • If it doesn’t return a value, it isn’t an expression and can’t be put into a lambda.
  • If you can imagine it in an assignment statement, on the right-hand side of the equals sign, it is an expression and can be put into a lambda.

Using these rules means that:

  1. Assignment statements cannot be used in lambda. In Python, assignment statements don’t return anything, not even None (null).
  2. Simple things such as mathematical operations, string operations, list comprehensions, etc. are OK in a lambda.
  3. Function calls are expressions. It is OK to put a function call in a lambda, and to pass arguments to that function. Doing this wraps the function call (arguments and all) inside a new, anonymous function.
  4. In Python 3, print became a function, so in Python 3+, print(…) can be used in a lambda.
  5. Even functions that return None, like the print function in Python 3, can be used in a lambda.
  6. Conditional expressions, which were introduced in Python 2.5, are expressions (and not merely a different syntax for an if/else statement). They return a value, and can be used in a lambda.
    lambda: a if some_condition() else b
    lambda x: ‘big’ if x > 100 else ‘small’

 

Second Lambda is confusing because: the specification that a lambda can take only a single expression raises the question: Why? Why only one expression? Why not multiple expressions? Why not statements?

For some developers, this question means simply Why is the Python lambda syntax so weird? For others, especially those with a Lisp background, the question means Why is Python’s lambda so crippled? Why isn’t it as powerful as Lisp’s lambda?

The answer is complicated, and it involves the “pythonicity” of Python’s syntax. Lambda was a relatively late addition to Python. By the time that it was added, Python syntax had become well established. Under the circumstances, the syntax for lambda had to be shoe-horned into the established Python syntax in a “pythonic” way. And that placed certain limitations on the kinds of things that could be done in lambdas.

Frankly, I still think the syntax for lambda looks a little weird. Be that as it may, Guido has explained why lambda’s syntax is not going to change. Python will not become Lisp.[3]

 

Third Lambda is confusing because: lambda is usually described as a tool for creating functions, but a lambda specification does not contain a return statement.

The return statement is, in a sense, implicit in a lambda. Since a lambda specification must contain only a single expression, and that expression must return a value, an anonymous function created by lambda implicitly returns the value returned by the expression. This makes perfect sense.

Still — the lack of an explicit return statement is, I think, part of what makes it hard to grok lambda, or at least, hard to grok it quickly.

 

Fourth Lambda is confusing because: tutorials on lambda typically introduce lambda as a tool for creating anonymous functions, when in fact the most common use of lambda is for creating anonymous procedures.

Back in the High Old Times, we recognized two different kinds of subroutines: procedures and functions. Procedures were for doing stuff, and did not return anything. Functions were for calculating and returning values. The difference between functions and procedures was even built into some programming languages. In Pascal, for instance, procedure and function were different keywords.

In most modern languages, the difference between procedures and functions is no longer enshrined in the language syntax. A Python function, for instance, can act like a procedure, a function, or both. The (not altogether desirable) result is that a Python function is always referred to as a “function”, even when it is essentially acting as a procedure.

Although the distinction between a procedure and a function has essentially vanished as a language construct, we still often use it when thinking about how a program works. For example, when I’m reading the source code of a program and see some function F, I try to figure out what F does. And I often can categorize it as a procedure or a function — “the purpose of F is to do so-and-so” I will say to myself, or “the purpose of F is to calculate and return such-and-such”.

So now I think we can see why many explanations of lambda are confusing.

First of all, the Python language itself masks the distinction between a function and a procedure.

Second, most tutorials introduce lambda as a tool for creating anonymous functions, things whose primary purpose is to calculate and return a result. The very first example that you see in most tutorials (this one included) shows how to write a lambda to return, say, the square root of x.

But this is not the way that lambda is most commonly used, and is not what most programmers are looking for when they Google “python lambda tutorial”. The most common use for lambda is to create anonymous procedures for use in GUI callbacks. In those use cases, we don’t care about what the lambda returns, we care about what it does.

This explains why most explanations of lambda are confusing for the typical Python programmer. He’s trying to learn how to write code for some GUI framework: Tkinter, say, or wxPython. He runs across examples that use lambda, and wants to understand what he’s seeing. He Googles for “python lambda tutorial”. And he finds tutorials that start with examples that are entirely inappropriate for his purposes.

So, if you are such a programmer — this tutorial is for you. I hope it helps. I’m sorry that we got to this point at the end of the tutorial, rather than at the beginning. Let’s hope that someday, someone will write a lambda tutorial that, instead of beginning this way

Lambda is a tool for building anonymous functions.

begins something like this

Lambda is a tool for building callback handlers.

 


So there you have it. Yet another lambda tutorial.


Footnotes

 

[1] Some lambda tutorials:

 

[2] In some programming languages, such as C, an assignment statement returns the assigned value. This allows chained assignments such as x = y = a, in which the assignment statement y = a returns the value of a, which is then assigned to x. In Python, assignment statements do not return a value. Chained assignment (or more precisely, code that looks like chained assignment statements) is recognized and supported as a special case of the assignment statement.

 

[3] Python developers who are familiar with Lisp have argued for increasing the power of Python’s lambda, moving it closer to the power of lambda in Lisp. There have been a number of proposals for a syntax for “multi-line lambda”, and so on. Guido has rejected these proposals and blogged about some of his thinking about “pythonicity” and language features as a user interface. This led to an interesting discussion on Lambda the Ultimate, the programming languages weblog about lambda, and about the idea that programming languages have personalities.

Why import star is a bad idea

When I was learning Python, I of course read the usual warnings. They told me: You can do

from something_or_other import *

but don’t do it. Importing star (asterisk, everything) is a Python Worst Practice.

Don’t “import star”!

But I was young and foolish, and my scripts were short. “How bad can it be?” I thought. So I did it anyway, and everything seemed to work out OK.

Then, like they always do, the quick-and-dirty scripts grew into programs, and then grew into a full-blown system. Before long I had a monster on my hands, and I needed a tool that would look through all of the scripts and programs in the system and do (at least) some basic error checking.

I’d heard good things about pyflakes, so I thought I’d give it a try.

It worked very nicely. It found the basic kinds of errors that I wanted it to find. And it was fast, so I could run it through a directory containing a lot of .py files and it would come out alive and grinning on the other side.

During the process, I learned that pyflakes is designed to be a bit on the quick and dirty side itself, with the quick making up for the dirty. As part of this process, it basically ignores star imports.  Oh, it warns you about the star imports.  What I means is — it doesn’t try to figure out what is imported by the star import.

And that has interesting consequences.

Normally, if your file contains an undefined name — say TARGET_LANGAGE — pyflakes will report it as an error.

But if your file includes any star imports, and your script contains an undefined name like TARGET_LANGAGE, pyflakes won’t report the undefined name as an error.

My hypothesis is that pyflakes doesn’t report TARGET_LANGAGE as undefined because it can’t tell whether TARGET_LANGAGE is truly undefined, or was pulled in by some star import.

This is perfectly understandable. There is no way that pyflakes is going to go out, try to find the something_or_other module, and analyze it to see if it contains TARGET_LANGAGE. And if it doesn’t, but contains star imports, go out and look for all of the modules that something_or_other star imports, and then analyze them. And so on, and so on, and so on. No way!

So, since pyflakes can’t tell whether TARGET_LANGAGE is (a) an undefined name or (b) pulled in via some star import, it does not report TARGET_LANGAGE as an undefined name. Basically, pyflakes ignores it.

And that seems to me to be a perfectly reasonable way to do business, not just for pyflakes but for anything short of the Super Deluxe Hyperdrive model static code analyzer.

The takeaway lesson for me was that using star imports will cripple a static code analyzer. Or at least, cripple the feature that I most want a code analyser for… to find and report undefined names.

So now I don’t use star imports anymore.

There are a variety of alternatives.  The one that I prefer is the “import x as y” feature. So I can write an import statement this way:

import   some_module_with_a_big_long_hairy_name   as    bx

 and rather than coding

x = some_module_with_a_big_long_hairy_name.vector

I can code

x = bx.vector

Works for me.

An alternative to string interpolation

I sort of like this.

# ugly
msg = "I found %s files in %s directories" % (filecount,foldercount)

# better
def Str(*args): return "".join(str(x) for x in args)
:
:
msg = Str("I found ", filecount, " files in ", foldercount, " directories" )

You don’t have to call it “Str”, of course.

How to open a web browser from Python

This goes under the Tips and Tricks category. 

Also under Stuff I wish I had known about a long time ago.

The trick is in the standard library, in the webbrowser module.

"""
For documentation of the webbrowser module,
see http://docs.python.org/library/webbrowser.html
"""
import webbrowser
new = 2 # open in a new tab, if possible

# open a public URL, in this case, the webbrowser docs
url = "http://docs.python.org/library/webbrowser.html"
webbrowser.open(url,new=new)

# open an HTML file on my own (Windows) computer
url = "file://X:/MiscDev/language_links.html"
webbrowser.open(url,new=new)