<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Python Conquers The Universe</title>
	<atom:link href="http://pythonconquerstheuniverse.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://pythonconquerstheuniverse.wordpress.com</link>
	<description>Adventures across space and time with the Python programming language</description>
	<lastBuildDate>Sat, 04 May 2013 20:54:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='pythonconquerstheuniverse.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Python Conquers The Universe</title>
		<link>http://pythonconquerstheuniverse.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://pythonconquerstheuniverse.wordpress.com/osd.xml" title="Python Conquers The Universe" />
	<atom:link rel='hub' href='http://pythonconquerstheuniverse.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Cracking passwords is getting easier</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2013/02/14/cracking-passwords-is-getting-easier/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2013/02/14/cracking-passwords-is-getting-easier/#comments</comments>
		<pubDate>Thu, 14 Feb 2013 12:52:23 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Python features]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=2121</guid>
		<description><![CDATA[Not Python-related, but really worth reading. Why passwords have never been weaker — and crackers have never been stronger on ars technica.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2121&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Not Python-related, but really worth reading.</p>
<p><a href="http://arstechnica.com/security/2012/08/passwords-under-assault/">Why passwords have never been weaker — and crackers have never been stronger</a> on <em>ars technica</em>.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2121&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2013/02/14/cracking-passwords-is-getting-easier/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Death Swamp</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2013/01/17/death-swamp/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2013/01/17/death-swamp/#comments</comments>
		<pubDate>Thu, 17 Jan 2013 13:24:53 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=2112</guid>
		<description><![CDATA[Recently a friend sent me this. I recognized it instantly, although I never knew that it had a name. There is a management technique called &#8220;death swamp&#8221; (or &#8220;death bog&#8221; or &#8220;fly paper&#8221;). It works this way. Occasionally some young &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2013/01/17/death-swamp/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2112&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Recently a friend sent me this.  I recognized it instantly, although I never knew that it had a name.</p>
<blockquote><p>There is a management technique called &#8220;death swamp&#8221; (or &#8220;death bog&#8221; or &#8220;fly paper&#8221;).  It works this way.  Occasionally some young fire-eater comes up with an idea to Do Something.   The bureaucracy can&#8217;t simply reject his idea because then they&#8217;d have to give an explanation for why his idea was rejected.  So they pat him on the back, agree that his idea is a good one, and encourage him to pursue it.  In fact they think so highly of his idea that they helpfully volunteer information about How We Get Things Done Around Here.  They provide a sheaf of forms and advice on how to get the ball rolling.</p>
<p>The young and inexperienced fire-eater happily starts down the road in the direction that has been pointed out to him.  In short order he finds himself in a swamp of procedures and paperwork so thick that he is completely bogged down and making no progress.  Eventually he gives up.  </p>
<p>The next time he comes up with an idea, he is given the same forms again.  This time, seeing the forms, he realizes his mistake.  He politely accepts the forms and walks away.  Around the corner, he throws the forms in the trash and gives up on his idea.  Because by then he knows that the only way to escape the swamp is not to enter it in the first place.</p></blockquote>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2112&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2013/01/17/death-swamp/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>enum in Python</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/08/12/enum-in-python/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/08/12/enum-in-python/#comments</comments>
		<pubDate>Sun, 12 Aug 2012 19:02:15 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Python features]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=2081</guid>
		<description><![CDATA[Recently I was reading a post by Eli Bendersky (one of my favorite bloggers) and I ran across a sentence in which Eli says &#8220;It’s a shame Python still doesn’t have a functional enum type, isn’t it?&#8221; The comment startled &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/08/12/enum-in-python/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2081&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Recently I was reading <a href="http://eli.thegreenplace.net/2012/08/09/using-sub-generators-for-lexical-scanning-in-python/" target="_blank">a post by Eli Bendersky</a> (one of my favorite bloggers) and I ran across a sentence in which Eli says &#8220;It’s a shame Python still doesn’t have a functional enum type, isn’t it?&#8221;</p>
<p>The comment startled me because I had always thought that it was <u>obvious</u> how to do enums in Python, and that it was obvious that you don&#8217;t need any special language features to do it.  Eli&#8217;s comment made me think that I might need to do a reality-check on my sense of what was and was not obvious about enums in Python.</p>
<p>So I googled around a bit and found that there are <strong>a lot</strong> of different ideas about how to do enums in Python.  I found a very large set of suggestions on StackOverflow <a href="http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python" target="_blank">here</a> and <a href="http://stackoverflow.com/questions/702834/whats-the-common-practice-for-enums-in-python" target="_blank">here</a>  and <a href="http://stackoverflow.com/questions/1969005/enumerations-in-python" target="_blank">here</a>.  There is a short set of suggestions on <a href="http://www.pythonexamples.org/2011/01/12/how-to-create-an-enum-in-python/" target="_blank">Python Examples</a>.  The ActiveState Python Cookbook has <a href="http://code.activestate.com/recipes/67107-enums-for-python/" target="_blank">a long recipe</a>, and <a href="http://www.python.org/dev/peps/pep-0354/" title="PEP for enumerated data type in Python" target="_blank">PEP-354</a> is a short proposal (that has been rejected).  Surprisingly, I found only a couple of posts that suggested what had seemed to me to be THE obvious solution. The clearest was by <em>snakile</em> <a href="http://stackoverflow.com/questions/3248851/pythons-enum-equivalent?rq=1" target="_blank">on StackOverflow</a>.</p>
<p>Anyway, to end the suspense, the answer that <u>seemed to me</u> so obvious was this.  An enum is an enumerated data type.  An enumerated data type is a type, and a type is a class.</p>
<pre class="brush: python; title: ; notranslate">
class           Color : pass
class Red      (Color): pass
class Yellow   (Color): pass
class Blue     (Color): pass
</pre>
<p>Which allows you to do things like this.</p>
<pre class="brush: python; title: ; notranslate">
class Toy: pass

myToy = Toy()

myToy.color = &quot;blue&quot;  # note we assign a string, not an enum

if isinstance(myToy.color, type) and issubclass(myToy.color, Color):   # is it one of our Color enums?
    pass
else:
    print(&quot;My toy has no color!!!&quot;)    # produces:  My toy has no color!!!

myToy.color = Blue   # note we use an enum

print(&quot;myToy.color is&quot;, myToy.color.__name__)  # produces: myToy.color is Blue
print(&quot;myToy.color is&quot;, myToy.color)           # produces: myToy.color is &lt;class '__main__.Blue'&gt;

if myToy.color is Blue:
    myToy.color = Red

if myToy.color is Red:
    print(&quot;my toy is red&quot;)   # produces: my toy is red
else:
    print(&quot;I don't know what color my toy is.&quot;)
</pre>
<p>So that&#8217;s what I came up with.  </p>
<p>But with so many intelligent people all trying to answer the same question, and coming up with such a wide array of different answers, I had to fall back and ask myself a few questions.</p>
<ul>
<li>Why am I seeing so many different answers to what seems like a simple question?</li>
<li>Is there one right answer? If so, what is it? </li>
<li>What is <strong>the</strong> way (the best, or most widely used, or most pythonic way) to do enums in Python?</li>
<li>Is the question really as simple as it seems?</li>
</ul>
<p>For me, the jury is still out on most of these questions, but until they return with a verdict I have come up with two thoughts on the subject.</p>
<p>First, I think that many programmers come to Python with backgrounds in other languages &mdash; C or C++, Java, etc.  Their experiences with other languages shape their conceptions of what an enum &mdash; an enumerated data type &mdash; is.  And when they ask &#8220;How can I do enums in Python?&#8221; they&#8217;re asking a question like <a href="http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python" target="_blank">the question that sparked the longest thread of answers on StackOverflow</a>:</p>
<blockquote><p>I&#8217;m mainly a C# developer, but I&#8217;m currently working on a project in Python. What&#8217;s the best way to implement the equivalent of an enum <em>[i.e. a C# enum]</em>  in Python?
</p></blockquote>
<p>So naturally, the question &#8220;How can I implement in Python the equivalent of the kind of enums that I&#8217;m familiar with in language X?&#8221; has at least as many answers as there are values of X.</p>
<p>My second thought is somewhat related to the first.  </p>
<p>Python developers believe in duck typing.  So a Python developer&#8217;s first instinct is not to ask you:</p>
<blockquote><p>What do you mean by &#8220;enum&#8221;?</p></blockquote>
<p>A Python developer&#8217;s first instinct is to ask you:</p>
<blockquote><p>What kinds of things do you think an &#8220;enum&#8221; should be able to do?<br />
What kinds of things do you think you should be able to do with an &#8220;enum&#8221;?</p></blockquote>
<p>And I think that different developers probably have very different ideas about what one should be able to do with an &#8220;enum&#8221;.  Naturally, that leads them to propose different ways of implementing enums in Python.</p>
<p>As a simple example, consider the question &mdash; <em>Should you be able to <u>sort</u> enums?</em>  </p>
<p>My personal inclination is to say that &mdash; in the most conceptually pure sense of &#8220;enum&#8221; &mdash; the concept of sorting enums makes no sense.  And my suggestion for implementing enums in Python reflects this. Suppose you implement a &#8220;Color&#8221; enum using the technique that I&#8217;ve proposed, and then try to sort enums. </p>
<pre class="brush: python; title: ; notranslate">
# how do enumerated values sort?
colors = [Red, Yellow, Blue]
colors.sort()
for color in colors:
    print(color.__name__)
</pre>
<p>What you get is this:</p>
<pre class="brush: plain; title: ; notranslate">
Traceback (most recent call last):
  File &quot;C:/Users/ferg_s/pydev/enumerated_data_types/edt.py&quot;, line 32, in &lt;module&gt;
    colors.sort()
TypeError: unorderable types: type() &lt; type()
</pre>
<p>So that suits me just fine.  </p>
<p>But I can easily imagine someone (myself?) working with an enum for, say, Weekdays (Sunday, Monday, Tuesday&#8230; Saturday).  And I think it might be reasonable in that situation to want to be able to sort Weekdays and to do <em>greater than</em> and <em>less than</em> comparisons on them.</p>
<p>So if we&#8217;re talking duck typing, I&#8217;m happy with enums/ducks that are motionless and silent.  My only requirement is that they be different from everything else and different from each other.  But I can easily imagine situations where one might reasonably need/want/prefer ducks that can form a conga line, dance, and sing a few bars.  And for those situations, you obviously need more elaborate implementations of enums.</p>
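<p>As a rough sketch of one such more elaborate implementation (an assumption for illustration, not a recommendation), each enum class could carry a hypothetical <em>order</em> attribute, and sorting could use that attribute as the key.  The names <em>Weekday</em>, <em>order</em> and <em>weekday_order</em> are made up for this example.</p>
<pre class="brush: python; title: ; notranslate">
class Weekday:
    order = None          # hypothetical attribute: position of the day in the week

class Sunday(Weekday):  order = 0
class Monday(Weekday):  order = 1
class Tuesday(Weekday): order = 2

def weekday_order(day):
    # sort key: the classes themselves are not orderable, but their order values are
    return day.order

days = [Tuesday, Sunday, Monday]
days.sort(key=weekday_order)

names = [day.__name__ for day in days]
print(names)   # produces: ['Sunday', 'Monday', 'Tuesday']
</pre>
<p>The classes are still different from everything else and from each other; the ordering is an extra that only this particular enum pays for.</p>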
<p>So, with these thoughts in mind, I&#8217;m inclined to think that there is no single, best way to implement an enum in Python. The concept of an <em>enum</em> is flexible enough to cover a variety of implementations offering a variety of features.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=2081&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/08/12/enum-in-python/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Python Decorators</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/04/29/python-decorators/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/04/29/python-decorators/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 01:09:12 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Decorators]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1884</guid>
		<description><![CDATA[In August 2009, I wrote a post titled Introduction to Python Decorators. It was an attempt to explain Python decorators in a way that I (and I hoped, others) could grok. Recently I had occasion to re-read that post. It &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/04/29/python-decorators/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1884&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>In August 2009, I wrote a post titled <a href="http://pythonconquerstheuniverse.wordpress.com/2009/08/06/introduction-to-python-decorators-part-1">Introduction to Python Decorators</a>. It was an attempt to explain Python decorators in a way that I (and I hoped, others) could grok.</p>
<p>Recently I had occasion to re-read that post.  It wasn&#8217;t a pleasant experience &mdash; it was pretty clear to me that the attempt had failed.  </p>
<p>That failure &mdash; and two other things &mdash; have prompted me to try again.</p>
<ul>
<li>Matt Harrison has published an excellent e-book <a title="Amazon.com - Guide to Learning Python Decorators" href="http://www.amazon.com/Guide-Learning-Python-Decorators-ebook/dp/B006ZHJSIM/" target="_blank">Guide to: Learning Python Decorators</a>.</li>
<li>I now have a theory about why most explanations of decorators (mine included) fail, and some ideas about how better to structure an introduction to decorators.</li>
</ul>
<p>There is an old saying to the effect that &#8220;Every stick has two ends, one by which it may be picked up, and one by which it may not.&#8221; I believe that most explanations of decorators fail because they pick up the stick by the wrong end. </p>
<p>In this post I will show you what the wrong end of the stick looks like, and point out why I think it is wrong.  And I will show you what I think the right end of the stick looks like. </p>
<p>&nbsp;</p>
<h1>The wrong way to explain decorators</h1>
<p>Most explanations of Python decorators start with an example of a function to be decorated, like this:</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def aFunction():
    print(&quot;inside aFunction&quot;)
</pre>
<p>and then add a decoration line, which starts with an @ sign:</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
@myDecorator
def aFunction():
    print(&quot;inside aFunction&quot;)
</pre>
<p>At this point, the author of the introduction often defines a <em>decorator</em> as the line of code that begins with the &#8220;@&#8221;.   (In my older post, I called such lines &#8220;annotation&#8221; lines.  I now prefer the term &#8220;decoration&#8221; line.)  </p>
<p>For instance, in 2008 Bruce Eckel <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=240808" target="_blank">wrote on his Artima blog</a></p>
<blockquote><p><em>A function decorator is applied to a function definition by placing it on the line before that function definition begins.</em></p></blockquote>
<p>and in 2004, Phillip Eby wrote in <a href="http://www.drdobbs.com/web-development/184406073" target="_blank">an article in Dr. Dobb&#8217;s Journal</a></p>
<blockquote><p><em>Decorators may appear before any function definition&#8230;. You can even stack multiple decorators on the same function definition, one per line.</em></p></blockquote>
<p>Now there are two things wrong with this approach to explaining decorators. The first is that the explanation begins in the wrong place. It starts with an example of a function to be decorated and a decoration line, when it should begin with the decorator itself.  The explanation should end, not start, with the decorated function and the decoration line.  The decoration line is, after all, merely syntactic sugar; it is not at all an essential element in the concept of a decorator.</p>
<p>The second is that the term &#8220;decorator&#8221; is used incorrectly (or ambiguously) to refer both to the decorator and to the decoration line. For example, in his <em>Dr. Dobb&#8217;s Journal </em>article, after using the term &#8220;decorator&#8221; to refer to the decoration line, Phillip Eby goes on to define a &#8220;decorator&#8221; as a callable object.</p>
<blockquote><p><em>But before you can do that, you first need to have some decorators to stack. A decorator is a callable object (like a function) that accepts one argument—the function being decorated.</em></p></blockquote>
<p>So&#8230; it would seem that a decorator is both a callable object (like a function) <b>and</b> a single line of code that can appear before the line of code that begins a function definition. This is sort of like saying that an &#8220;address&#8221; is both a building (or apartment) at a specific location <b>and</b> a set of lines (written in pencil or ink) on the front of a mailing envelope. The ambiguity may be almost invisible to someone familiar with decorators, but it is very confusing for a reader who is trying to learn about decorators from the ground up.</p>
<p>&nbsp;</p>
<h1>The right way to explain decorators</h1>
<p>So how <strong>should</strong> we explain decorators?</p>
<p>Well, we start with the decorator, not the function to be decorated.</p>
<p><strong>One</strong><br />
We start with the <a href="http://www.informit.com/articles/article.aspx?p=1849243" target="_blank">basic notion of a function</a> — a function is something that generates a value based on the values of its arguments.</p>
<p><strong>Two</strong><br />
We note that in Python, functions are first-class objects, so they can be passed around like other values (strings, integers, objects, etc.).</p>
<p><strong>Three</strong><br />
We note that because functions are first-class objects in Python, we can write functions that both (a) accept function objects as argument values, and (b) return function objects as return values.  For example, here is a function <em>foobar </em>that accepts a function object <em>original_function </em>as an argument and returns a function object <em>new_function </em>as a result.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(original_function):

    # make a new function
    def new_function():
        pass  # some code

    return new_function
</pre>
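<p>To make the skeleton above concrete, here is a minimal runnable sketch.  The body of <em>new_function</em> and the helper <em>greet</em> are assumptions invented for illustration; any body would do.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(original_function):

    # make a new function that calls the original and upper-cases its result
    def new_function():
        return original_function().upper()

    return new_function

def greet():
    return 'hello'

loud_greet = foobar(greet)   # pass a function object in, get a function object back

print(greet())        # produces: hello
print(loud_greet())   # produces: HELLO
</pre>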
<p><strong>Four</strong><br />
We define &#8220;decorator&#8221;.</p>
<blockquote><p>A <strong>decorator</strong> is a function (such as <em>foobar</em> in the above example) that takes a function object as an argument, and returns a function object as a return value. </p></blockquote>
<p>So there we have it — the definition of a decorator.  Anything else that we say about decorators is a refinement of, or an expansion of, or an addition to, this definition of a decorator.</p>
<p><strong>Five</strong><br />
We show what the internals of a decorator look like. Specifically, we show different ways that a decorator can use the <em>original_function</em> in the creation of the <em>new_function</em>.  Here is a simple example.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def verbose(original_function):

    # make a new function that prints a message when original_function starts and finishes
    def new_function(*args, **kwargs):
        print(&quot;Entering&quot;, original_function.__name__)
        result = original_function(*args, **kwargs)   # keep the return value
        print(&quot;Exiting &quot;, original_function.__name__)
        return result                                 # and pass it back to the caller

    return new_function
</pre>
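<p>As a second illustration of how a decorator can use <em>original_function</em>, here is a sketch of one that calls it only when it has not already seen the same arguments.  The names <em>remember</em>, <em>square</em> and <em>calls</em> are hypothetical, chosen only for this example.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def remember(original_function):

    cache = {}   # results already computed, keyed by the argument tuple

    # make a new function that calls original_function only for unseen arguments
    def new_function(*args):
        if args not in cache:
            cache[args] = original_function(*args)
        return cache[args]

    return new_function

calls = []   # records each time square really runs

def square(n):
    calls.append(n)
    return n * n

fast_square = remember(square)
results = [fast_square(3), fast_square(3), fast_square(4)]

print(results)   # produces: [9, 9, 16]
print(calls)     # produces: [3, 4]
</pre>
<p>Here the new function sometimes skips the original altogether, whereas <em>verbose</em> always calls it.  Both fit the same definition of a decorator.</p>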
<p><strong>Six</strong><br />
We show how to invoke a decorator — how we can pass into a decorator one function object (its input) and get back from it a different function object (its output).  In the following example, we pass the <em>widget_func</em> function object to the <em>verbose</em> decorator, and we get back a new function object to which we assign the name <em>talkative_widget_func</em>.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def widget_func():
    pass  # some code

talkative_widget_func = verbose(widget_func)
</pre>
<p><strong>Seven</strong><br />
We point out that decorators are often used to add features to the original_function. Or more precisely, decorators are often used to create a <em>new_function</em> that does roughly what <em>original_function</em> does, but also does things <span style="text-decoration:underline;">in addition</span> to what <em>original_function</em> does.</p>
<p>And we note that the output of a decorator is typically used to replace the <em>original function</em> that we passed in to the decorator as an argument. A typical use of decorators looks like this. (Note the change to line 4 from the previous example.)</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def widget_func():
    pass  # some code

widget_func = verbose(widget_func)
</pre>
<p>So for all practical purposes, in a typical use of a decorator we pass a function (<em>widget_func</em>) through a decorator (<em>verbose</em>) and get back an enhanced (or souped-up, or &#8220;decorated&#8221;) version of the function.</p>
<p><strong>Eight</strong><br />
We introduce Python&#8217;s &#8220;decoration syntax&#8221; that uses the &#8220;@&#8221; to create decoration lines. This feature is basically syntactic sugar that makes it possible to re-write our last example this way:</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
@verbose
def widget_func():
    pass  # some code
</pre>
<p>The result of this example is exactly the same as the previous example &mdash; after it executes, we have a <em>widget_func</em> that has all of the functionality of the original <em>widget_func</em>, plus the functionality that was added by the <em>verbose</em> decorator.</p>
<blockquote><p><em>Note that in <b>this</b> way of explaining decorators, the &#8220;@&#8221; and decoration syntax is one of the <b>last</b> things that we introduce, not one of the first.</em></p>
<p><em>And we absolutely do <b>not</b> refer to line 1 as a &#8220;decorator&#8221;.  We might refer to line 1 as, say, a &#8220;decorator invocation line&#8221; or a &#8220;decoration line&#8221; or simply a &#8220;decoration&#8221;&#8230; whatever.  But line 1 is <strong>not</strong> a &#8220;decorator&#8221;.</em></p>
<p><em>Line 1 is a line of code.  A decorator is a function, which is a different animal altogether.</em></p></blockquote>
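<p>The equivalence described in step Eight can be checked with a small sketch.  It assumes a variant of <em>verbose</em> that appends to a list (the hypothetical <em>calls</em>) instead of printing, so that the behaviour of the two forms can be compared.  The function names <em>explicit_widget</em> and <em>sugared_widget</em> are likewise made up.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
calls = []   # hypothetical log, used instead of print so the result can be inspected

def verbose(original_function):

    def new_function(*args, **kwargs):
        calls.append('Entering ' + original_function.__name__)
        result = original_function(*args, **kwargs)
        calls.append('Exiting ' + original_function.__name__)
        return result

    return new_function

def explicit_widget():
    return 'widget'

explicit_widget = verbose(explicit_widget)   # explicit rebinding, as in step Seven

@verbose                                     # decoration line, as in step Eight
def sugared_widget():
    return 'widget'

explicit_result = explicit_widget()
sugared_result = sugared_widget()

print(explicit_result == sugared_result)   # produces: True
print(calls)
</pre>
<p>Both functions end up wrapped in the same way; the decoration line only saves us from writing the rebinding ourselves.</p>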
<p>&nbsp;</p>
<p><strong>Nine</strong><br />
Once we&#8217;ve nailed down these basics, there are a few advanced features to be covered.</p>
<ul>
<li>We explain that a decorator need not be a function (it can be any sort of callable, e.g. a class).</li>
<li>We explain how decorators can be nested within other decorators.</li>
<li>We explain how <del>decorators</del> decoration lines can be &#8220;stacked&#8221;.  A better way to put it would be: we explain how decorators can be &#8220;chained&#8221;.</li>
<li>We explain how additional arguments can be passed to decorators, and how decorators can use them.</li>
</ul>
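<p>A compact sketch of these advanced features might look like the following.  All of the names (<em>repeat</em>, <em>shout</em>, <em>CountCalls</em>, <em>greet</em>, <em>ping</em>) are invented for illustration; the only library call used is <em>functools.wraps</em> from the standard library, which copies the original function&#8217;s name and docstring onto the new function.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
import functools

def repeat(times):
    # a function that takes an argument and returns a decorator;
    # the real decorator is nested inside it
    def decorator(original_function):
        @functools.wraps(original_function)
        def new_function(*args, **kwargs):
            return [original_function(*args, **kwargs) for _ in range(times)]
        return new_function
    return decorator

def shout(original_function):
    @functools.wraps(original_function)
    def new_function(*args, **kwargs):
        return original_function(*args, **kwargs).upper()
    return new_function

class CountCalls:
    # a decorator that is a class: its instances are callable
    def __init__(self, original_function):
        self.original_function = original_function
        self.count = 0

    def __call__(self, *args, **kwargs):
        self.count += 1
        return self.original_function(*args, **kwargs)

@repeat(2)   # applied second (outermost)
@shout       # applied first (innermost)
def greet(name):
    return 'hello ' + name

@CountCalls
def ping():
    return 'pong'

greeting = greet('world')
ping()
ping()

print(greeting)         # produces: ['HELLO WORLD', 'HELLO WORLD']
print(greet.__name__)   # produces: greet
print(ping.count)       # produces: 2
</pre>
<p>Decoration lines are applied from the bottom up, so here <em>shout</em> wraps <em>greet</em> first and <em>repeat(2)</em> wraps the result.</p>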
<p><strong>Ten — A decorators cookbook</strong></p>
<p>The material that we&#8217;ve covered up to this point is what any basic introduction to Python decorators would cover. But a Python programmer needs something more in order to be productive with decorators. He (or she) needs a catalog of recipes, patterns, examples, and commentary that describes / shows / explains when and how decorators can be used to accomplish specific tasks. (Ideally, such a catalog would also include examples and warnings about decorator gotchas and anti-patterns.)  Such a catalog might be called &#8220;Python Decorator Cookbook&#8221; or perhaps &#8220;Python Decorator Patterns&#8221;. </p>
<ul>
<li>
<p>As far as I know, no such decorator cookbook currently exists.</p>
</li>
<li>
<p>The <a title="Python Decorator Library" href="http://wiki.python.org/moin/PythonDecoratorLibrary" target="_blank">Python Decorator Library</a> on the Python wiki is a collection of decorator examples.  It has its uses, but it does not have the systematic organization and explanatory material of a true cookbook.</p>
</li>
<li>
<p>Something similar to a decorator cookbook, although still not systematically organized, can be generated by a <a href="http://code.activestate.com/search/recipes/#q=decorator" title="Decorator recipes in ASPN" target="_blank">search of the ActiveState Python Cookbook, filtering on &#8220;decorator&#8221;.</a></p>
</li>
</ul>
<hr />
<p>So that&#8217;s it.  I&#8217;ve described what I think is wrong (well, let&#8217;s say suboptimal) about most introductions to decorators.  And I&#8217;ve sketched out what I think is a better way to structure an introduction to decorators.</p>
<p>Now I can explain why I like Matt Harrison&#8217;s e-book <a title="Amazon.com - Guide to Learning Python Decorators" href="http://www.amazon.com/Guide-Learning-Python-Decorators-ebook/dp/B006ZHJSIM/" target="_blank">Guide to: Learning Python Decorators</a>.  Matt&#8217;s introduction is structured in the way that I think an introduction to decorators <em>should</em> be structured. It picks up the stick by the proper end. </p>
<p>The first two-thirds of the <em>Guide</em> hardly talk about decorators at all. Instead, Matt begins with a thorough discussion of how Python functions work.  By the time the discussion gets to decorators, we have been given a strong understanding of the internal mechanics of functions.  And since most decorators are functions (remember our definition of <em>decorator</em>), at that point it is relatively easy for Matt to explain the internal mechanics of decorators.</p>
<p>Which is just as it should be.</p>
<hr />
<p><em>Revised 2012-11-26: replaced the word &#8220;annotation&#8221; with &#8220;decoration&#8221;, following terminology ideas discussed in the comments.</em></p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1884&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/04/29/python-decorators/feed/</wfw:commentRss>
		<slash:comments>32</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Unicode &#8211; the basics</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/03/16/unicode-the-basics/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/03/16/unicode-the-basics/#comments</comments>
		<pubDate>Fri, 16 Mar 2012 14:06:40 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Unicode]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1680</guid>
		<description><![CDATA[An introduction to the basics of Unicode, distilled from several earlier posts. In the interests of presenting the big picture, I have painted with a broad brush &#8212; large areas are summarized; nits are not picked; hairs are not split; &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/03/16/unicode-the-basics/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1680&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>An introduction to the basics of Unicode, distilled from several earlier posts.  In the interests of presenting the big picture, I have painted with a broad brush &mdash; large areas are summarized; nits are not picked; hairs are not split; wind resistance is ignored. </em> </p>
<p><strong>Unicode = one character set, plus several encodings</strong></p>
<p><a href="http://en.wikipedia.org/wiki/Unicode" target="_blank">Unicode</a> is actually not one thing, but two separate and distinct things.  The first is a <strong>character set</strong> and the second is a set of <strong>encodings</strong>.</p>
<ul>
<li>The first &mdash; the idea of a character set &mdash; has absolutely nothing to do with computers.</li>
<li>The second &mdash; the idea of encodings for the Unicode character set &mdash; has everything to do with computers.</li>
</ul>
<p><strong>Character sets</strong></p>
<p>The idea of a character set has nothing to do with computers.  So let&#8217;s suppose that you&#8217;re a British linguist living in, say, 1750.  The British Empire is expanding and Europeans are discovering many new languages, both living and dead.  You&#8217;ve known about Chinese characters for a long time, and you&#8217;ve just discovered Sumerian cuneiform characters from the Middle East and Sanskrit characters from India.  </p>
<p>Trying to deal with this huge mass of different characters, you get a brilliant idea &mdash; you will make a numbered list of every character in every language that ever existed.</p>
<p>You start your list with your own familiar set of English characters &mdash; the upper- and lower-case letters, the numeric digits, and the various punctuation marks like period (full stop), comma, exclamation mark, and so on.  And the space character, of course.</p>
<pre class="brush: plain; collapse: false; title: ; wrap-lines: false; notranslate">
01 a
02 b
03 c
...
26 z
27 A
28 B
...
52 Z
53 0
54 1
55 2
...
62 9
63 (space)
64 ? (question mark)
65 , (comma)
... and so on ...
</pre>
<p>Then you add the Spanish, French and German characters with tildes, accents, and umlauts.  You add characters from other living languages &mdash; Greek, Japanese, Chinese, Korean, Sanskrit, Arabic, Hebrew, and so on.  You add characters from dead alphabets &mdash; Assyrian cuneiform &mdash; and so on, until finally you have a very long list of characters.</p>
<ul>
<li>What you have created &mdash; a numbered list of characters &mdash; is known as a <strong>character set</strong>. </li>
<li>The numbers in the list &mdash; the numeric identifiers of the characters in the character set &mdash; are called <strong>code points</strong>. </li>
<li>And because your list is meant to include every character that ever existed, you call your character set the <strong>Universal Character Set</strong>.</li>
</ul>
<p>Congratulations! You&#8217;ve just invented (something similar to) the first half of Unicode &mdash; the <a href="http://en.wikipedia.org/wiki/Universal_Character_Set">Universal Character Set</a> or <strong>UCS</strong>.</p>
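<p>To make the idea of a code point concrete, here is a minimal sketch in Python 3 syntax. It uses the built-in <tt>ord()</tt> and <tt>chr()</tt> functions to move between characters and their code points. Note that the real Universal Character Set numbers its characters differently from the imaginary list above (for example, &#8220;a&#8221; is 97, not 1); the sample characters are chosen only for illustration.</p>

```python
# Look up the code point (the number in the list) for a few characters.
samples = ["a", "A", "?", "\u00e9", "\u4e2d"]   # the last two are e-acute and a Chinese character
for ch in samples:
    print(repr(ch), ord(ch), hex(ord(ch)))

# chr() goes the other way: from a code point back to the character.
letter_a = chr(97)
e_acute_code_point = ord("\u00e9")
print(letter_a, e_acute_code_point)
```

<p>Nothing here involves bytes yet. A code point is just a number; how that number is stored is the job of an encoding.</p>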
<p><strong>Encodings</strong></p>
<p>Now suppose you jump into your time machine and zip forward to the present.  Everybody is using computers.  You have a brilliant idea.  You will devise a way for computers to handle UCS.</p>
<p>You know that computers think in ones and zeros &mdash; bits &mdash; and collections of 8 bits &mdash; bytes.  So you look at the biggest number in your UCS and ask yourself: How many bytes will I need to store a number that big?  The answer you come up with is 4 bytes, 32 bits.  So you decide on a simple and straightforward digital implementation of UCS &mdash; each number will be stored in 4 bytes.  That is, you choose a fixed-length encoding in which every UCS character (code point) can be represented, or <strong>encoded</strong>, in exactly 4 bytes, or 32 bits.</p>
<p>In short, you devise the Unicode <a href="http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings">UCS-4 (Universal Character Set, 4 bytes)</a> encoding, aka  <a href="http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings">UTF-32 (Unicode Transformation Format, 32 bits)</a>.</p>
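<p>As a rough illustration of a fixed-length encoding, the following Python 3 sketch encodes a three-character string with the standard <tt>utf-32-be</tt> codec (big-endian, no byte order mark) and then reads the bytes back four at a time. The sample string is an assumption chosen for the example.</p>

```python
text = "a\u00e9\u4e2d"                 # three characters: a, e-acute, a Chinese character
encoded = text.encode("utf-32-be")     # big-endian UTF-32, without a byte order mark

print(len(text), len(encoded))         # 3 characters, 12 bytes

# Every code point occupies exactly 4 bytes, so we can step through in fours.
code_points = []
for i in range(0, len(encoded), 4):
    chunk = encoded[i:i + 4]
    code_points.append(int.from_bytes(chunk, "big"))
    print(chunk.hex(), code_points[-1])
```

<p>Each chunk decodes to the same number that <tt>ord()</tt> reports for the character, which is all that a fixed-length encoding does.</p>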
<p><strong>UTF-8 and variable-length encodings</strong></p>
<p>UCS-4 is simple and straightforward&#8230; but inefficient. Computers send <em>a lot</em> of strings back and forth, and many of those strings use only ASCII characters &mdash; characters from the old ASCII character set.  One byte &mdash; eight bits &mdash; is more than enough to store such characters.  It is grossly inefficient to use 4 bytes to store an ASCII character.</p>
<p>The key to the solution is to remember that a code point is nothing but a number (an integer).  It may be a short number or a long number, but it is only a number.  We need just one byte to store the shorter numbers of the Universal Character Set, and we need more bytes only when the numbers get longer.  So the solution to our problem is a <em>variable-length</em> encoding.  </p>
<p>Specifically, Unicode&#8217;s <a href="http://en.wikipedia.org/wiki/UTF-8">UTF-8 (Unicode Transformation Format, 8 bit)</a> is a variable-length encoding in which each UCS code point is encoded using 1, 2, 3, or 4 bytes, as necessary.  </p>
<p>In UTF-8, if the first bit of a byte is a &#8220;0&#8221;, then the remaining 7 bits of the byte contain one of the 128 original 7-bit <a href="http://en.wikipedia.org/wiki/ASCII" title="Wikipedia: ASCII" target="_blank">ASCII</a> characters.  If the first bit of the byte is a &#8220;1&#8221;, then the byte is one of several bytes used to represent the code point.  In the first byte of such a sequence, the leading bits also carry the total number of bytes (2, 3, or 4) that are being used to represent the code point.  (For a quick overview of how this works at the bit level, see <a target="_blank" href="http://stackoverflow.com/questions/1543613/how-does-utf-8-variable-width-encoding-work">How does UTF-8 &#8220;variable-width encoding&#8221; work?</a>)</p>
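<p>The variable-length behavior is easy to see from Python 3. This sketch encodes four sample characters (chosen for illustration) and prints the bit pattern of each byte. The helper name <tt>utf8_length</tt> is hypothetical, not part of any library.</p>

```python
def utf8_length(ch):
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

# a, e-acute, a Chinese character, and an emoji from outside the 16-bit range
samples = ["a", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    bits = " ".join(format(b, "08b") for b in encoded)
    print(hex(ord(ch)), utf8_length(ch), bits)

lengths = [utf8_length(ch) for ch in samples]
```

<p>In the printed bit patterns, the single ASCII byte starts with 0, while every byte of a multi-byte sequence starts with 1.</p>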
<p><strong>Just use UTF-8</strong></p>
<p>UTF-8 is a great technology, which is why it has become the <em>de facto</em> standard for encoding Unicode text, and is the most widely used text encoding in the world.  Text strings that use only ASCII characters can be encoded in UTF-8 using only one byte per character, which is very efficient.  And if characters &mdash; Chinese or Japanese characters, for instance &mdash; require multiple bytes, well, UTF-8 can do that, too.</p>
<p><strong>Byte Order Mark</strong></p>
<p>Unicode multi-byte encodings such as UTF-16 and UTF-32 store UCS code points (integers) in multi-byte chunks &mdash; 2-byte chunks in the case of UTF-16 (one or two chunks per code point) and 4-byte chunks in the case of UTF-32.</p>
<p>Unfortunately, different computer architectures &mdash; basically, different processor chips &mdash; use different techniques for storing such multi-byte integers.  In &#8220;little-endian&#8221; computers, the &#8220;little&#8221; (least significant) byte of a multi-byte integer is stored first, at the lowest memory address.  &#8220;Big-endian&#8221; computers do the reverse; the &#8220;big&#8221; (most significant) byte is stored first.  </p>
<ul>
<li> Intel computers are little-endian.</li>
<li> Motorola computers are big-endian.</li>
<li> Microsoft Windows was designed around a little-endian architecture &mdash; it runs only on little-endian computers or computers running in little-endian mode &mdash; which is why Intel hardware and Microsoft software fit together like hand and glove.</li>
</ul>
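<p>The difference between the two byte orders can be sketched in a few lines of Python 3. The integer used here is an arbitrary example value; <tt>int.to_bytes</tt> and <tt>sys.byteorder</tt> are standard Python.</p>

```python
import sys

number = 0x0A0B0C0D                      # an arbitrary 4-byte integer

little = number.to_bytes(4, "little")    # least significant byte first
big = number.to_bytes(4, "big")          # most significant byte first

print(little.hex())    # 0d0c0b0a
print(big.hex())       # 0a0b0c0d
print(sys.byteorder)   # the native byte order of the machine running this code
```

<p>The same four bytes appear in both results, only in opposite order, which is exactly why two machines need to agree on the order before exchanging multi-byte integers.</p>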
<p>Differences in endian-ness can create data-exchange issues between computers.  Specifically, the possibility of differences in endian-ness means that if two computers need to exchange a string of text data, and that string is encoded in a Unicode multi-byte encoding such as UTF-16 or UTF-32, the string should begin with a <strong>Byte Order Mark</strong> (or <strong>BOM</strong>) &mdash; a special character at the beginning of the string that indicates the endian-ness of the string.  </p>
<p>Strings encoded in UTF-8 don&#8217;t require a BOM, so the BOM is basically a non-issue for programmers who use only UTF-8.   </p>
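<p>Here is a small Python 3 sketch of the BOM in action, using the standard <tt>codecs</tt> module. The sample text is arbitrary. It assumes the behavior of Python&#8217;s own codecs: the plain <tt>utf-16</tt> codec writes a BOM when encoding and reads it when decoding, while the <tt>utf-16-le</tt> and <tt>utf-16-be</tt> codecs write none.</p>

```python
import codecs

text = "hi"

le = text.encode("utf-16-le")      # little-endian, no BOM
be = text.encode("utf-16-be")      # big-endian, no BOM
with_bom = text.encode("utf-16")   # BOM first, then the text in native byte order

print(le.hex(), be.hex(), with_bom.hex())

# With a BOM in front, the generic decoder can work out the byte order itself.
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")

# UTF-8 is a sequence of single bytes, so no byte order needs to be marked.
utf8 = text.encode("utf-8")
```

<p>Both decoded strings come back as the original text, even though the underlying bytes were in opposite orders.</p>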
<hr />
<p><strong>Resources</strong></p>
<ul>
<li>Ned Batchelder&#8217;s <a href="http://nedbatchelder.com/text/unipain.html" target="_blank">Pragmatic Unicode</a>.  Highly recommended.</li>
<li><a href="http://www.joelonsoftware.com/articles/Unicode.html" target="_blank">The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a> (2003) by Joel Spolsky is good, and widely read, but now a bit dated.  I think it is rather misleading in the prominence it gives to the BOM.</li>
<li> <a href="http://en.wikipedia.org/wiki/Unicode" target="_blank">http://en.wikipedia.org/wiki/Unicode</a> </li>
<li><a href="http://en.wikipedia.org/wiki/Universal_Character_Set" target="_blank">http://en.wikipedia.org/wiki/Universal_Character_Set</a></li>
<li><a href="http://en.wikipedia.org/wiki/ASCII" target="_blank">http://en.wikipedia.org/wiki/ASCII</a> </li>
<li><a href="http://en.wikipedia.org/wiki/UTF-8" target="_blank">http://en.wikipedia.org/wiki/UTF-8</a></li>
<li><a href="http://en.wikipedia.org/wiki/Byte_order_mark" title="Wikipedia: BYTE ORDER MARK" target="_blank">http://en.wikipedia.org/wiki/Byte_order_mark</a></li>
<li><a href="http://docs.python.org/library/codecs.html#encodings-and-unicode" target="_blank">http://docs.python.org/library/codecs.html#encodings-and-unicode</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/03/16/unicode-the-basics/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Python&#8217;s magic methods</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/03/09/pythons-magic-methods/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/03/09/pythons-magic-methods/#comments</comments>
		<pubDate>Fri, 09 Mar 2012 04:03:04 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Python features]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1413</guid>
		<description><![CDATA[Here are some links to documentation of Python&#8217;s magic methods, aka special methods, aka &#8220;dunder&#8221; (double underscore) methods. Rafe Kettler&#8217;s A Guide to Python&#8217;s Magic Methods ::&#8212;&#62; http://www.rafekettler.com/magicmethods.html Michael Foord&#8217;s chapter on Python Magic Methods from this book IronPython in &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/03/09/pythons-magic-methods/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1413&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Here are some links to documentation of Python&#8217;s <strong>magic methods</strong>, aka <strong>special methods</strong>, aka <a title="Ned Batchelder's recommendation in 2006 of the expression 'dunder'" href="http://nedbatchelder.com/blog/200605/dunder.html" target="_blank">&#8220;dunder&#8221;</a> (double underscore) methods.</p>
<ul>
<li>
<p>Rafe Kettler&#8217;s <em>A Guide to Python&#8217;s Magic Methods</em><br />
		::&mdash;&gt; <a href="http://www.rafekettler.com/magicmethods.html" target="_blank">http://www.rafekettler.com/magicmethods.html</a></p>
</li>
<li>
<p>Michael Foord&#8217;s chapter on <em>Python Magic Methods</em> from his book <a href="http://www.ironpythoninaction.com/" target="_blank">IronPython in Action: Unleashing .NET with Python</a><br />
		::&mdash;&gt; <a href="http://www.ironpythoninaction.com/magic-methods.html" target="_blank">http://www.ironpythoninaction.com/magic-methods.html</a></p>
</li>
<li>
<p>David&#8217;s article on Siafoo on <em>Python __Underscore__ Methods</em><br />
		::&mdash;&gt; <a href="http://www.siafoo.net/article/57" target="_blank">http://www.siafoo.net/article/57</a></p>
</li>
<li>
<p>The <a href="http://infohost.nmt.edu/tcc/help/pubs/python/web/index.html" target="_blank">Python Quick Reference</a> on John Shipman&#8217;s wonderful <a title="Index of New Mexico Tech publications on programming languages" href="http://infohost.nmt.edu/tcc/help/pubs/index/lang.html" target="_blank">New Mexico Tech web site</a><br />
	::&mdash;&gt; <a href="http://infohost.nmt.edu/tcc/help/pubs/python/web/special-methods.html" target="_blank">http://infohost.nmt.edu/tcc/help/pubs/python/web/special-methods.html</a></p>
</li>
<li>The official <em>Python Language Reference</em><br />
		::&mdash;&gt; <a href="http://docs.python.org/reference/datamodel.html#special-method-names" target="_blank">http://docs.python.org/reference/datamodel.html#special-method-names</a></li>
<li>
<p>Kumar McMillan&#8217;s post on <em>Magic Methods</em> on his FarmDev blog has a few interesting observations<br />
		::&mdash;&gt; <a href="http://farmdev.com/src/secrets/magicmethod/index.html" target="_blank">http://farmdev.com/src/secrets/magicmethod/index.html</a></p>
</li>
</ul>
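<p>For readers who have not met magic methods before, here is a minimal sketch (Python 3 syntax) of what the pages above describe. The <tt>Deck</tt> class is hypothetical and exists only for this illustration; the method names <tt>__len__</tt>, <tt>__getitem__</tt> and <tt>__repr__</tt> are the standard special method names from the Python Language Reference.</p>

```python
class Deck:
    """A hypothetical container used only to illustrate special methods."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):
        # called by len(deck)
        return len(self.cards)

    def __getitem__(self, index):
        # called by deck[index]
        return self.cards[index]

    def __repr__(self):
        # called by repr(deck) and by the interactive prompt
        return "Deck(%r)" % (self.cards,)


deck = Deck(["ace", "king"])
print(len(deck), deck[0], repr(deck))
```

<p>The point is that ordinary syntax such as <tt>len(deck)</tt> and <tt>deck[0]</tt> is routed by Python to the double-underscore methods of the class.</p>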
<p>There are also a few other Python features that are sometimes characterized as &#8220;magic&#8221;.</p>
<ul>
<li>some of the <a href="http://docs.python.org/library/functions.html" target="_blank">built-in functions</a>, especially that newcomer: <a target="_blank" href="http://docs.python.org/library/functions.html#super">super()</a>
</li>
<li>decorators that use some of the new built-in functions &mdash; <a href="http://docs.python.org/library/functions.html#staticmethod" target="_blank">@staticmethod</a>, <a href="http://docs.python.org/library/functions.html#classmethod" target="_blank">@classmethod</a>, and <a href="http://docs.python.org/library/functions.html#property" target="_blank">@property</a>
</li>
</ul>
<p>I&#8217;m sure there are other useful Web pages about magic methods that I haven&#8217;t found.  If you know of one (and feel like sharing it) note that you <u>can</u> code HTML tags into a WordPress comment, like this, and they will show up properly formatted: </p>
<blockquote><p><code>I found a useful discussion of magic methods at<br />
&lt;a href="http://www.somebodys_web_site.com/magic-methods"&gt;www.somebodys_web_site.com/magic-methods&lt;/a&gt;<br />
</code></p></blockquote>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/03/09/pythons-magic-methods/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Gotcha &#8212; Mutable default arguments</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/02/15/mutable-default-arguments/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/02/15/mutable-default-arguments/#comments</comments>
		<pubDate>Wed, 15 Feb 2012 07:04:24 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Python gotchas]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1544</guid>
		<description><![CDATA[Goto start of series Note: examples are coded in Python 2.x, but the basic point of the post applies to all versions of Python. There&#8217;s a Python gotcha that bites everybody as they learn Python. In fact, I think it &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/02/15/mutable-default-arguments/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1544&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://pythonconquerstheuniverse.wordpress.com/2008/06/04/python-gotchas/">Goto start of series</a></p>
<p><em>Note: examples are coded in Python 2.x, but the basic point of the post applies to all versions of Python.</em></p>
<p>There&#8217;s a Python gotcha that bites everybody as they learn Python. In fact, I think it was Tim Peters who suggested that every programmer gets caught by it exactly two times. It is called the <em>mutable defaults</em> trap. Programmers are usually bitten by the mutable defaults trap when coding class methods, but I&#8217;d like to begin with explaining it in functions, and then move on to talk about class methods.</p>
<p><strong>Mutable defaults for function arguments</strong></p>
<p>The gotcha occurs when you are coding default values for the arguments to a function or a method. Here is an example for a function named <tt>foobar</tt>:</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(arg_string = &quot;abc&quot;, arg_list = []):
    ...
</pre>
<p>Here&#8217;s what most beginning Python programmers believe will happen when <tt>foobar</tt> is called without any arguments:</p>
<blockquote><p>A new string object containing &#8220;abc&#8221; will be created and bound to the &#8220;arg_string&#8221; variable name. A new, empty list object will be created and bound to the &#8220;arg_list&#8221; variable name. In short, if the arguments are omitted by the caller, the <tt>foobar</tt> will always get &#8220;abc&#8221; and [] in its arguments.</p></blockquote>
<p>This, however, is <em>not</em> what will happen. Here&#8217;s why.</p>
<p>The objects that provide the default values are not created at the time that <tt>foobar</tt> is called. They are created <em>at the time that the statement that defines the function is executed</em>. (See the discussion at <a href="http://www.deadlybloodyserious.com/2008/05/default-argument-blunders/" target="_blank">Default arguments in Python: two easy blunders</a>: &#8220;Expressions in default arguments are calculated when the function is defined, <em>not</em> when it’s called.&#8221;)</p>
<p>If <tt>foobar</tt>, for example, is contained in a module named <tt>foo_module</tt>, then the statement that defines <tt>foobar</tt> will probably be executed at the time when <tt>foo_module</tt> is imported.</p>
<p>When the <tt>def</tt> statement that creates <tt>foobar</tt> is executed:</p>
<ul>
<li>A new function object is created, bound to the name <tt>foobar</tt>, and stored in the namespace of <tt>foo_module</tt>.</li>
<li>Within the <tt>foobar</tt> function object, for each argument with a default value, an object is created to serve as the default value. In the case of <tt>foobar</tt>, a string object containing &#8220;abc&#8221; is created as the default for the <tt>arg_string</tt> argument, and an empty list object is created as the default for the <tt>arg_list</tt> argument.</li>
</ul>
<p>After that, whenever <tt>foobar</tt> is called without arguments, <tt>arg_string</tt> will be bound to the default string object, and arg_list will be bound to the default list object. In such a case, <tt>arg_string</tt> will always be &#8220;abc&#8221;, but <tt>arg_list</tt> may or may not be an empty list. Here&#8217;s why.</p>
<p>There is a crucial difference between a string object and a list object. A string object is immutable, whereas a list object is mutable. That means that the default for <tt>arg_string</tt> can never be changed, but the default for <tt>arg_list</tt> can be changed.</p>
<p>Let&#8217;s see how the default for arg_list can be changed. Here is a program. It invokes <tt>foobar</tt> four times. Each time that <tt>foobar</tt> is invoked it displays the values of the arguments that it receives, then adds something to each of the arguments.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(arg_string=&quot;abc&quot;, arg_list = []): 
    print arg_string, arg_list 
    arg_string = arg_string + &quot;xyz&quot; 
    arg_list.append(&quot;F&quot;)

for i in range(4): 
    foobar()
</pre>
<p>The output of this program is:</p>
<pre class="brush: plain; collapse: false; title: ; wrap-lines: false; notranslate">
abc [] 
abc ['F'] 
abc ['F', 'F'] 
abc ['F', 'F', 'F']
</pre>
<p>As you can see, the first time through, the arguments have exactly the defaults that we expect. On the second and all subsequent passes, the arg_string value remains unchanged — just what we would expect from an immutable object. The line</p>
<pre class="brush: plain; collapse: false; title: ; wrap-lines: false; notranslate">
arg_string = arg_string + &quot;xyz&quot;</pre>
<p>creates a new object — the string &#8220;abcxyz&#8221; — and binds the name &#8220;arg_string&#8221; to that new object, but it doesn&#8217;t change the default object for the <tt>arg_string</tt> argument.</p>
<p>But the case is quite different with arg_list, whose value is a list — a mutable object. On each pass, we append a member to the list, and the list grows. On the fourth invocation of <tt>foobar</tt> — that is, after three earlier invocations — <tt>arg_list</tt> contains three members.</p>
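<p>One way to watch this happen is to inspect the default objects directly. The sketch below is written in Python 3 syntax (unlike the Python 2 examples above) and uses a shortened variant of <tt>foobar</tt> that returns its list. It relies on the function attribute <tt>__defaults__</tt>, which holds the default objects created when the <tt>def</tt> statement ran.</p>

```python
def foobar(arg_string="abc", arg_list=[]):
    arg_list.append("F")
    return arg_list

print(foobar.__defaults__)   # ('abc', [])

first = foobar()
second = foobar()

print(first is second)       # True: both calls returned the same list object
print(foobar.__defaults__)   # ('abc', ['F', 'F'])
```

<p>The list stored in <tt>__defaults__</tt> is the very object that every call without arguments receives, so anything appended to it is still there on the next call.</p>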
<p><strong>The Solution</strong><br />
This behavior is not a wart in the Python language. It really is a feature, not a bug. There are times when you really do want to use mutable default arguments. One thing they can do (for example) is retain a list of results from previous invocations, something that might be very handy.</p>
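<p>As a sketch of that deliberate use (Python 3 syntax; the function name <tt>record</tt> and its arguments are hypothetical), a mutable default can serve as a running history of earlier calls:</p>

```python
def record(result, history=[]):
    """Append result to a history list that persists across calls.

    The mutable default is intentional here: it is created once, when the
    def statement runs, and is shared by every call that omits `history`.
    """
    history.append(result)
    return history

print(record("first"))    # ['first']
print(record("second"))   # ['first', 'second']
```

<p>This works only because the author of the function knows the list is shared, which is exactly the knowledge that beginners lack.</p>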
<p>But for most programmers — especially beginning Pythonistas — this behavior is a gotcha. So for most cases we adopt the following rules.</p>
<ol>
<li>Never use a mutable object — that is: a list, a dictionary, or a class instance — as the default value of an argument.</li>
<li>Ignore rule 1 only if you really, <em>really</em>, <em><span style="text-decoration:underline;">REALLY</span></em> know what you&#8217;re doing.</li>
</ol>
<p>So&#8230; we plan always to follow rule #1. Now, the question is <em>how</em> to do it&#8230; how to code <tt>foobar</tt> in order to get the behavior that we want.</p>
<p>Fortunately, the solution is straightforward. The mutable objects used as defaults are replaced by None, and then the arguments are tested for None.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(arg_string=&quot;abc&quot;, arg_list = None): 
    if arg_list is None: arg_list = [] 
    ...
</pre>
<p>Another solution that you will sometimes see is this:</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
def foobar(arg_string=&quot;abc&quot;, arg_list=None): 
    arg_list = arg_list or [] 
    ...
</pre>
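<p>This solution, however, is <em>not</em> equivalent to the first, and should be avoided. The <tt>or</tt> test treats any false value as missing, so an empty list passed in by the caller is silently replaced with a new list. See <em>Learning Python</em> p. 123 for a discussion of the differences. <em>Thanks to Lloyd Kvam for pointing this out to me.</em></p>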
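<p>The difference between the two idioms shows up when the caller passes in an empty list of its own. The following sketch uses Python 3 syntax; the function names <tt>with_is_none</tt> and <tt>with_or</tt> are hypothetical stand-ins for the two versions of <tt>foobar</tt>.</p>

```python
def with_is_none(arg_list=None):
    if arg_list is None:
        arg_list = []
    arg_list.append("F")
    return arg_list

def with_or(arg_list=None):
    # An empty list is false, so `or` swaps it for a brand-new list.
    arg_list = arg_list or []
    arg_list.append("F")
    return arg_list

callers_list = []
with_is_none(callers_list)
print(callers_list)        # ['F']: the caller's list was updated

other_list = []
with_or(other_list)
print(other_list)          # []: the caller's list was ignored
```

<p>With the <tt>is None</tt> test, only a genuinely omitted argument gets the fresh list; with <tt>or</tt>, a caller who supplies an empty list loses the connection to it.</p>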
<p>And of course, in some situations the best solution is simply not to supply a default for the argument.</p>
<p><strong>Mutable defaults for method arguments</strong></p>
<p>Now let&#8217;s look at how the mutable arguments gotcha presents itself when a class method is given a mutable default for one of its arguments. Here is a complete program.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
# (1) define a class for company employees 
class Employee:
    def __init__ (self, arg_name, arg_dependents=[]): 
        # an employee has two attributes: a name, and a list of his dependents 
        self.name = arg_name 
        self.dependents = arg_dependents
    
    def addDependent(self, arg_name): 
        # an employee can add a dependent by getting married or having a baby 
        self.dependents.append(arg_name)
    
    def show(self): 
        print
        print &quot;My name is.......: &quot;, self.name 
        print &quot;My dependents are: &quot;, str(self.dependents)
#--------------------------------------------------- 
#   main routine -- hire employees for the company 
#---------------------------------------------------

# (2) hire a married employee, with dependents 
joe = Employee(&quot;Joe Smith&quot;, [&quot;Sarah Smith&quot;, &quot;Suzy Smith&quot;])

# (3) hire a couple of unmarried employees, without dependents 
mike = Employee(&quot;Michael Nesmith&quot;) 
barb = Employee(&quot;Barbara Bush&quot;)

# (4) mike gets married and acquires a dependent 
mike.addDependent(&quot;Nancy Nesmith&quot;)

# (5) now have our employees tell us about themselves 
joe.show() 
mike.show() 
barb.show()
</pre>
<p>Let&#8217;s look at what happens when this program is run. </p>
<ol>
<li>First, the code that defines the <tt>Employee</tt> class is run.</li>
<li>Then we hire Joe. Joe has two dependents, so that fact is recorded at the time that the <tt>joe</tt> object is created.</li>
<li>Next we hire Mike and Barb. </li>
<li>Then Mike acquires a dependent.</li>
<li>Finally, the last three statements of the program ask each employee to tell us about himself.</li>
</ol>
<p>Here is the result.</p>
<blockquote><pre class="brush: plain; collapse: false; title: ; wrap-lines: false; notranslate">
My name is.......:  Joe Smith 
My dependents are:  ['Sarah Smith', 'Suzy Smith']

My name is.......:  Michael Nesmith 
My dependents are:  ['Nancy Nesmith']

My name is.......:  Barbara Bush 
My dependents are:  ['Nancy Nesmith']
</pre>
</blockquote>
<p>Joe is just fine. But somehow, when Mike acquired Nancy as his dependent, Barb <em>also</em> acquired Nancy as a dependent. This of course is wrong. And we&#8217;re now in a position to understand what is causing the program to behave this way.</p>
<p>When the code that defines the <tt>Employee</tt> class is run, objects for the class definition, the method definitions, and the default values for each argument are created. The constructor has an argument <tt>arg_dependents</tt> whose default value is an empty list, so an empty list object is created and attached to the <tt>__init__</tt> method as the default value for <tt>arg_dependents</tt>.</p>
<p>When we hire Joe, he already has a list of dependents, which is passed in to the Employee constructor — so the <tt>arg_dependents</tt> argument does not use the default empty list object.</p>
<p>Next we hire Mike and Barb. Since they have no dependents, the default value for <tt>arg_dependents</tt> is used. Remember — this is the empty list object that was created when the code that defined the <tt>Employee</tt> class was run. So in both cases, the empty list is bound to the <tt>arg_dependents</tt> argument, and then — again in both cases — it is bound to the <tt>self.dependents</tt> attribute. The result is that after Mike and Barb are hired, the <tt>self.dependents</tt> attributes of <em>both</em> Mike and Barb <em>point to the same object</em> — the default empty list object.</p>
<p>When Michael gets married, and Nancy Nesmith is added to his <tt>self.dependents</tt> list, Barb also acquires Nancy as a dependent, because Barb&#8217;s <tt>self.dependents</tt> attribute is bound to the same list object as Mike&#8217;s <tt>self.dependents</tt> attribute.</p>
<p>So this is what happens when mutable objects are used as defaults for arguments in class methods. If the defaults are used when the method is called, <em>different class instances end up sharing references to the same object.</em></p>
<p>And <em>that</em> is why you should never, <em>never</em>, <strong><em><span style="text-decoration:underline;">NEVER</span></em></strong> use a list or a dictionary as a default value for an argument to a class method. Unless, of course, you really, <em>really</em>, <strong><em><span style="text-decoration:underline;">REALLY</span></em></strong> know what you&#8217;re doing.</p>
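<p>For completeness, here is a sketch of the <tt>Employee</tt> class with the None-default fix from earlier in the post applied to the constructor. It is written in Python 3 syntax and trimmed to the parts that matter; it is one reasonable way to apply the fix, not the only one.</p>

```python
class Employee:
    def __init__(self, arg_name, arg_dependents=None):
        self.name = arg_name
        # Create a fresh list for each employee who is hired without dependents.
        if arg_dependents is None:
            arg_dependents = []
        self.dependents = arg_dependents

    def addDependent(self, arg_name):
        self.dependents.append(arg_name)


mike = Employee("Michael Nesmith")
barb = Employee("Barbara Bush")
mike.addDependent("Nancy Nesmith")

print(mike.dependents)   # ['Nancy Nesmith']
print(barb.dependents)   # []
```

<p>Because each call to the constructor now builds its own list, Mike and Barb no longer share one.</p>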
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/02/15/mutable-default-arguments/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Backing up your email</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/02/13/backing-up-your-email/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/02/13/backing-up-your-email/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 02:17:03 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1532</guid>
		<description><![CDATA[Just in case someone might find this useful &#8230; I recently had something bad happen to me. I use Thunderbird (on Windows Vista) as my email client. I asked Thunderbird to compact my email files, and it wiped out a &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/02/13/backing-up-your-email/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=pythonconquerstheuniverse.wordpress.com&#038;blog=9223888&#038;post=1532&#038;subd=pythonconquerstheuniverse&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Just in case someone might find this useful &#8230;</p>
<p>I recently had something bad happen to me.  I use Thunderbird (on Windows Vista) as my email client.  I asked Thunderbird to compact my email files, and it wiped out a bunch of my email messages.  (I think that one of my email files must have been corrupt, and when I compacted it, the compaction process wiped out messages that should not have been wiped out.)  </p>
<p>You can recover deleted email messages &#8230; but not after the email file has been compacted.  So the messages were not recoverable. Bummer. </p>
<p>The upside is that this nasty incident led me to learn some things. </p>
<p>One thing that I learned was that the disk backup utility that I was using at the time did NOT back up my email files.  The email files were stored in a directory called AppData, and the AppData directory is a &#8220;hidden&#8221; directory.  So the backup utility didn&#8217;t see the AppData directory, and didn&#8217;t back it up.  So I had no backup of the deleted messages.</p>
<p>Learning that led me to investigate ways to back up my email files, and I found this: <a href="http://www.makeuseof.com/tag/5-ways-to-keep-your-emails-backed-up/" title="Five ways to keep your emails backed up" target="_blank">Five ways to keep your emails backed up</a></p>
<p>For backing up Thunderbird files, it recommends MozBackup as being fast, free and easy to use.  So I tried MozBackup, and those claims seem to be true.</p>
<p>Now I&#8217;m evaluating different disk backup options.</p>
<p>The take-away here is that <strong>you need to pay special attention to backing up your email files.</strong> So if you&#8217;re not backing up your email files, take a look at <a href="http://www.makeuseof.com/tag/5-ways-to-keep-your-emails-backed-up/" title="Five ways to keep your emails backed up" target="_blank">Five ways to keep your emails backed up</a> (and read the comments, which are useful) or google something like <a href="http://www.google.com/search?q=email+backup" title="google EMAIL BACKUP" target="_blank">&#8220;email backup&#8221;</a>.   </p>
<p>[Note that this applies only if you are using an email client such as Thunderbird, Outlook, Outlook Express, etc.  If you don't use an email client, and do all of your email work through a Web interface to your Internet Service Provider, then this is not an issue.]</p>
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/02/13/backing-up-your-email/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
		<item>
		<title>Unicode for dummies &#8212; Encoding</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2012/02/01/unicode-for-dummies-encoding/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2012/02/01/unicode-for-dummies-encoding/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 18:03:19 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Unicode]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=1419</guid>
		<description><![CDATA[Another entry in an irregular series of posts about Unicode. Typos fixed 2012-02-22. Thanks Anonymous, and Clinton, for reporting the typos. This is a story about encoding and decoding, with a minor subplot involving Unicode. As our story begins &#8212; &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2012/02/01/unicode-for-dummies-encoding/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><em>Another entry in <a href="http://pythonconquerstheuniverse.wordpress.com/category/unicode/">an irregular series of posts about Unicode</a>.</em><br />
<em>Typos fixed 2012-02-22.  Thanks Anonymous, and Clinton, for reporting the typos.</em></p>
<p>This is a story about encoding and decoding, with a minor subplot involving Unicode.</p>
<p>As our story begins &#8212; on a dark and stormy night, of course &#8212; we find our protagonist deep in thought.  He is asking himself &#8220;What is an encoding?&#8221; </p>
<p><strong>What is an encoding?</strong></p>
<p>The basic concepts are simple.  First, we start with the idea of a piece of information &#8212; a message &#8212; that exists in a representation that is understandable (perspicuous) to a human being.  I&#8217;m going to call that representation &#8220;plain text&#8221;.  For English-language speakers, for example, English words printed on a page, or displayed on a screen, count as plain text.  </p>
<p>Next, (for reasons that we won&#8217;t explore right now) we need to be able to translate a message in a plain-text representation into some other representation (let&#8217;s call that representation the &#8220;encoded text&#8221;), and we need to be able to translate the encoded text back into plain text.  The translation from plain text to encoded text is called &#8220;encoding&#8221;, and the translation of encoded text back into plain text is called &#8220;decoding&#8221;.</p>
<p><img src="http://pythonconquerstheuniverse.files.wordpress.com/2012/01/encoding_decoding2.jpg?w=584" alt="encoding and decoding" title="encoding and decoding"   class="aligncenter size-full wp-image-1420" border="0" /></p>
<p>There are three points worth noting about this process.  </p>
<p><strong>The first point</strong> is that no information can be lost during encoding or decoding.  It must be possible for us to send a message on a round-trip journey &#8212; from plain text to encoded text, and then back again from encoded text to plain text &#8212; and get back exactly the same plain text that we started with.   That is why, for instance, we can&#8217;t use one natural language (Russian, Chinese, French, Navaho) as an encoding for another natural language (English, Hindi, Swahili).  The mappings between natural languages are too loose to guarantee that a piece of information can make the round-trip without losing something in translation.</p>
<p>The requirement for a lossless round-trip means that the mapping between the plain text and the encoded text must be very tight, very exact.  And that brings us to <strong>the second point</strong>.</p>
<p>In order for the mapping between the plain text and the encoded text to be very tight &#8212; which is to say: in order for us to be able to specify very precisely how the encoding and decoding processes work &#8212; we must specify very precisely what the plain text representation looks like.  </p>
<p>Suppose, for example, we say that plain text looks like this:  the 26 upper-case letters of the Anglo-American alphabet, plus the space and three punctuation symbols: period (full stop), question mark, and dash (hyphen).  This gives us a plain-text alphabet of 30 characters.  If we need numbers, we can spell them out, like this: &#8220;SIX THOUSAND SEVEN HUNDRED FORTY-THREE&#8221;.  </p>
<p>On the other hand, we may wish to say that our plain text looks like this: 26 upper-case letters, 26 lower-case letters, 10 numeric digits, the space character, and a dozen types of punctuation marks: period, comma, double-quote, left parenthesis, right parenthesis, and so on.  That gives us a plain-text alphabet of 75 characters.</p>
<p>Once we&#8217;ve specified exactly what a plain-text representation of a message looks like &#8212; a finite sequence of characters from our 30-character alphabet, or perhaps our 75-character alphabet &#8212; then we can devise a system (a code) that can reliably encode and decode plain-text messages written in that alphabet.  The simplest such system is one in which every character in the plain-text alphabet has one and only one corresponding representation in the encoded text.  A familiar example is Morse code, in which &#8220;SOS&#8221; in plain text corresponds to</p>
<pre>                ... --- ...</pre>
<p>in encoded text.</p>
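<p>Here is a minimal sketch of such a one-to-one code in Python.  The <code>MORSE</code> table is deliberately partial (four letters, just enough for the example), and the names <code>encode</code> and <code>decode</code> are my own, chosen for illustration.  The point is only that a one-to-one mapping can be run in both directions without losing anything.</p>

```python
# Hypothetical, partial Morse table: just enough letters for this example.
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# Reverse table for decoding. It works only because every code is unique.
PLAIN = {code: letter for letter, code in MORSE.items()}


def encode(plain_text):
    """Translate plain text into Morse codes separated by single spaces."""
    return " ".join(MORSE[letter] for letter in plain_text)


def decode(encoded_text):
    """Translate space-separated Morse codes back into plain text."""
    return "".join(PLAIN[code] for code in encoded_text.split(" "))


message = "SOS"
encoded = encode(message)
print(encoded)
# The round trip gives back exactly what we started with.
print(decode(encoded) == message)
```

<p>A character that is not in the table raises a <code>KeyError</code>, which is the code&#8217;s way of saying that the character is not part of our plain-text alphabet.</p>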
<p>In the real world, of course, the selection of characters for the plain-text alphabet is influenced by technological limitations on the encoded text.  Suppose we have several available technologies for storing encoded messages: one technology supports an encoded alphabet of 256 characters, another technology supports only 128 encoded characters, and a third technology supports only 64 encoded characters.  Naturally, we can make our plain-text alphabet much larger if we know that we can use a technology that supports a larger encoded-text alphabet.</p>
<p>And the reverse is also true.  If we know that our plain-text alphabet must be very large, then we know that we must find &#8212; or devise &#8212; a technology capable of storing a large number of encoded characters.</p>
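<p>The arithmetic behind that trade-off is easy to sketch.  Assuming that each encoded character is stored as a fixed-size pattern of bits, the function below (the name <code>bits_needed</code> is my own) computes how many bits it takes to give every character in an alphabet its own pattern.</p>

```python
def bits_needed(alphabet_size):
    """Smallest number of bits that gives each character its own bit pattern.

    With n bits there are 2**n distinct patterns, so we need the smallest n
    for which 2**n is at least alphabet_size. Integer bit_length() finds it
    without any floating-point rounding.
    """
    return (alphabet_size - 1).bit_length()


for size in (30, 64, 75, 128, 256):
    print(size, "characters need", bits_needed(size), "bits each")
```

<p>Our 30-character alphabet fits in 5 bits, but the 75-character alphabet needs 7, which rules out a technology that supports only 64 encoded characters.</p>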
<p>Which brings us to Unicode.</p>
<p><strong>Unicode</strong></p>
<p>Unicode was devised to be a system capable of storing encoded representations of every plain-text character of every human language that has ever existed.  English, French, Spanish.  Greek.  Arabic.  Hindi.  Chinese.  Assyrian (cuneiform characters).</p>
<p>That&#8217;s a lot of characters.</p>
<p>So the first task of the Unicode initiative was simply to list all of those characters, and count them.  That&#8217;s the first half of Unicode, the <a href="http://en.wikipedia.org/wiki/Universal_Character_Set">Universal Character Set</a>.  (And if you really want to &#8220;talk Unicode&#8221;, don&#8217;t call plain-text characters &#8220;characters&#8221;.  Call them &#8220;code points&#8221;.)</p>
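<p>You can poke at that list from Python 3.  The sketch below uses the built-in <code>ord()</code> and <code>chr()</code> functions and the standard <code>unicodedata</code> module; the helper name <code>describe</code> and the three sample characters are my own choices.</p>

```python
import unicodedata


def describe(character):
    """Return the code point in U+XXXX notation plus its official name."""
    return "U+%04X %s" % (ord(character), unicodedata.name(character))


# A Latin letter, an accented Latin letter, and a Chinese character.
for character in "A\u00e9\u4e2d":
    print(describe(character))
    # chr() is the inverse of ord(): it turns the number back into text.
    assert chr(ord(character)) == character
```

<p>Note that nothing here says how a code point is stored.  A code point is just a number in the list; storage is the job of an encoding.</p>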
<p>Once you&#8217;ve done that, you&#8217;ve got to figure out a technology for storing all of the corresponding encoded-text characters. (In Unicode-speak, the encoded-text characters are called &#8220;code values&#8221;.)</p>
<p>In fact Unicode defines not one but several methods of mapping code points to code values.  Each of these methods has its own name.  Some of the names start with &#8220;UTF&#8221;, others start with &#8220;UCS&#8221;: UTF-8, UTF-16, UTF-32, UCS-2, UCS-4, and so on.  The naming convention is &#8220;UTF-&lt;number of bits in a code value&gt;&#8221; and &#8220;UCS-&lt;number of bytes in a code value&gt;&#8221;.  Some (e.g. UCS-4 and UTF-32) are functionally equivalent.   See the <a href="http://en.wikipedia.org/wiki/Unicode#Mapping_and_encodings" target="_blank">Wikipedia article on Unicode</a>.</p>
<p>The most important thing about these methods is that some are fixed-width encodings and some are variable-width encodings.  The basic idea is that the fixed-width encodings are very long &#8212; UCS-4 and UTF-32 are 4 bytes (32 bits) long &#8212; long enough to hold the biggest code value that we will ever need.  </p>
<p>In contrast, the variable-width encodings are designed to be short, but expandable.  UTF-8, for example, can use as few as 8 bits (one byte) to store ASCII (basic Latin) <del>characters</del> code points.  But it also has a sort of &#8220;continued on the next byte&#8221; mechanism that allows it to use 2, 3, or even 4 bytes if it needs to (as it does for most Chinese characters, which take 3 bytes each).  For Western programmers, that means that UTF-8 is both efficient and flexible, which is why UTF-8 is the de facto standard encoding for exchanging Unicode text.</p>
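<p>A quick way to see the difference between fixed-width and variable-width encodings is to count bytes.  This sketch uses Python 3&#8217;s <code>str.encode()</code>; the helper name <code>encoded_lengths</code> and the sample characters are my own.  I use the little-endian codec names (<code>utf-16-le</code>, <code>utf-32-le</code>) only so that no byte-order mark is added to the count.</p>

```python
def encoded_lengths(character):
    """Number of bytes one character occupies in three Unicode encodings."""
    return {name: len(character.encode(name))
            for name in ("utf-8", "utf-16-le", "utf-32-le")}


# An ASCII letter, an accented letter, a Chinese character, and an emoji.
for character in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
    print("U+%04X" % ord(character), encoded_lengths(character))
```

<p>UTF-32 spends 4 bytes on everything.  UTF-8 spends 1 byte on the ASCII letter and grows to 2, 3, and 4 bytes for the others.</p>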
<p>There is, then, no such thing as THE Unicode encoding system or method.  There are <em>several</em> encoding methods, and if you want to exchange text with someone, you need to specify explicitly which encoding method you are using.  </p>
<p>Is it, say, this.</p>
<p><img src="http://pythonconquerstheuniverse.files.wordpress.com/2012/02/encoding_decoding_uf8.png?w=584" alt="encoding decoding UTF-8" title="encoding_decoding_uf8"   class="aligncenter size-full wp-image-1445" /></p>
<p>Or this.</p>
<p><img src="http://pythonconquerstheuniverse.files.wordpress.com/2012/02/encoding_decoding_utf16.jpg?w=584" alt="encoding decoding UTF-16" title="encoding_decoding_utf16"   class="aligncenter size-full wp-image-1443" /></p>
<p>Or something else.</p>
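<p>To see why the choice has to be stated, here is a small sketch of what happens when the sender and the receiver disagree.  The sample word is my own; the codec names are standard Python ones.</p>

```python
text = "caf\u00e9"            # the word "cafe" with an acute accent on the e
data = text.encode("utf-8")   # the sender encodes with UTF-8

# Decoding with the same encoding gives back the original text.
print(data.decode("utf-8") == text)

# Decoding with a different encoding "works", but gives the wrong text.
print(data.decode("latin-1"))

# And some encodings cannot make sense of the bytes at all.
try:
    data.decode("ascii")
except UnicodeDecodeError as error:
    print("ascii cannot decode this:", error.reason)
```

<p>The second case is the nasty one: there is no error, just quietly garbled text.</p>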
<p>Which brings us back to something I said earlier.  </p>
<p><strong>Why encode something in Unicode?</strong></p>
<p>At the beginning of this post I said</p>
<blockquote><p>We start with the idea of a piece of information &#8212; a message &#8212; that exists in a representation that is understandable (perspicuous) to a human being. </p>
<p>Next, (for reasons that we won&#8217;t explore right now) we need to be able to translate a message in a plain-text representation into some other representation.  The translation from plain text to encoded text is called &#8220;encoding&#8221;, and the translation of encoded text back into plain text is called &#8220;decoding&#8221;.</p></blockquote>
<p>OK.  So now it is time to explore those reasons.  Why might we want to translate a message in a plain-text representation into some other representation?</p>
<p>One reason, of course, is that we want to keep a secret.  We want to hide the plain text of our message by <em>encrypting </em>and <em>decrypting </em>it &#8212; basically, by keeping the algorithms for encoding and decoding secret and private.  </p>
<p>But that is a completely different subject.  Right now, we&#8217;re not interested in keeping secrets; we&#8217;re Python programmers and we&#8217;re interested in Unicode.  So:</p>
<blockquote><p><em>Why &#8212; as a Python programmer &#8212; would I need to be able to translate a plain-text message into some encoded representation&#8230; say, a Unicode representation such as UTF-8?</em></p></blockquote>
<p>Suppose you are happily sitting at your PC, working with your favorite text editor, writing the standard Hello World program in Python (specifically, in Python 3+).  This single line is your entire program.</p>
<pre>
                   print("Hello, world!")
</pre>
<p>Here, &#8220;Hello, world!&#8221; is plain text.  You can see it on your screen.  You can read it. You know what it means.  It is just a string and you can (if you wish) do standard string-type operations on it, such as taking a substring (a slice).</p>
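<p>For instance (the variable name <code>greeting</code> is my own):</p>

```python
greeting = "Hello, world!"

print(greeting[0:5])     # a slice: the first five characters
print(greeting[7:12])    # another slice, from the middle of the string
print(len(greeting))     # the length, counted in characters
```

<p>None of these operations needs to know anything about bytes or encodings.</p>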
<p>But now suppose you want to put this string &#8212; &#8220;Hello, world!&#8221; &#8212; into a file and save the file on your hard drive.  Perhaps you plan to send the file to a friend.</p>
<p>That means that you must eject your poor little string from the warm, friendly, protected home in your Python program, where it exists simply as plain-text characters.  You must thrust it into the cold, impersonal, outside world of the file system.  And out there it will exist not as characters, but as mere 1s and 0s, a jumble of dits and dahs, charged and uncharged particles.  And that means that your happy little plain-text string must be represented <em>by some specific configuration of 1s and 0s</em>, so that when somebody wants to retrieve that collection of 1s and 0s and convert it back into readable plain text, they can.  </p>
<p>The process of converting a plain text into a specific configuration of 1s and 0s is a process of <em>encoding</em>.  In order to write a string to a file, you must encode it using some encoding system (such as UTF-8).  And to get it back from a file, you must read the file and decode the collection of 1s and 0s back into plain text.</p>
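<p>If you would like to see the 1s and 0s for yourself, here is a small sketch.  The helper names <code>to_bits</code> and <code>from_bits</code> are my own, and UTF-8 is just the default I picked.</p>

```python
def to_bits(text, encoding="utf-8"):
    """Encode text, then show each resulting byte as eight 1s and 0s."""
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


def from_bits(bits, encoding="utf-8"):
    """Turn space-separated groups of eight bits back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode(encoding)


bits = to_bits("Hi")
print(bits)
print(from_bits(bits))
```

<p>Pass a different <code>encoding</code> argument and the same string comes out as a different configuration of 1s and 0s.</p>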
<p>The need to encode strings when writing them to files, and to decode them when reading them back, isn&#8217;t something new &#8212; it is not an additional burden imposed by Python 3&#8217;s new support for Unicode.  It is something you have always done.  But it wasn&#8217;t always so obvious.  In earlier versions of Python, the default encoding scheme was <a href="http://en.wikipedia.org/wiki/ASCII" title="Wikipedia - ASCII">ASCII</a>.  And because, in those olden times, ASCII was pretty much the only game in town, you didn&#8217;t need to specify that you wanted to write and read your files in ASCII.  Python just assumed it by default and did it.  But &#8212; whether or not you realized it &#8212; whenever one of your programs wrote or read strings from a file, Python was busy behind the scenes, doing the encoding and decoding for you.  </p>
<p>So that&#8217;s why you &#8212; as a Python programmer &#8212; need to be able to encode and decode text into, and out of, UTF-8 (or some other encoding: UTF-16, ASCII, whatever).  You need to encode your strings as 1s and 0s so you can put those 1s and 0s into a file and send the file to someone else.</p>
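<p>In Python 3 the whole journey looks something like this.  It is a sketch, not a recipe: the file name <code>hello.txt</code> and the temporary directory are placeholders of my own, and UTF-8 is simply the encoding I chose to name.</p>

```python
import os
import tempfile

text = "Hello, world!"

# Hypothetical location: a file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

# Writing: Python encodes the string into bytes using the encoding we name.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# What actually sits on the disk is bytes, not characters.
with open(path, "rb") as f:
    raw = f.read()
print(raw)

# Reading: Python decodes the bytes back into a string.
with open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
print(round_trip == text)
```

<p>If you leave out the <code>encoding</code> argument, Python picks a default for you, and your friend&#8217;s computer may pick a different one.  Naming the encoding on both ends avoids that.</p>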
<p><strong>What is plain text?</strong></p>
<p>Earlier, I said that there were three points worth noting about the encoding/decoding process, and I discussed the first two.  Here is <strong>the third point</strong>.</p>
<p>The distinction between plain text and encoded text is relative and context-dependent.  </p>
<p>As programmers, we think of plain text as being written text.  But it is possible to look at matters differently.  For instance, we can think of spoken text as the plain text, and written text as the encoded text.  From this perspective, writing is encoded speech.  And there are many different encodings for speech as writing. Think of Egyptian hieroglyphics, Mayan hieroglyphics, the Latin alphabet, the Greek alphabet, Arabic, Chinese ideograms, wonderfully flowing <a href="http://en.wikipedia.org/wiki/Hindi_script">Devanagari देवनागरी</a>, sharp pointy cuneiform wedges, even shorthand.  These are all written encodings for the spoken word.  They are all, as Thomas Hobbes put it, &#8220;Marks by which we may remember our thoughts&#8221;.</p>
<p>Which reminds us that, in a different context, even speech itself &#8212; language &#8212; may be regarded as a form of encoding.  In much of early modern philosophy (think of Hobbes and Locke) speech (or language) was basically considered to be an encoding of thoughts and ideas.  Communication happens when I encode my thought into language and say something &#8212; speak to you.  You hear the sound of my words and decode it back into ideas.  We achieve communication when I successfully transmit a thought from my mind to your mind via language.  You understand me when &#8212; as a result of my speech &#8212; you have the <em>same</em> idea in your mind as I have in mine.  (See Ian Hacking, <a href="http://www.amazon.com/Why-Does-Language-Matter-Philosophy/dp/0521099986/ref=cm_cr_pr_product_top"><em>Why Does Language Matter to Philosophy?</em></a>)</p>
<p>Finally, note that in other contexts, the &#8220;plain text&#8221; isn&#8217;t even text.  Where the plain text is soundwaves (e.g. music), it can be encoded as an mp3 file.  Where the plain text is an image, it can be encoded as a gif, or png, or jpg file.  Where the plain text is a movie, it can be encoded as a wmv file.  And so on.  </p>
<p>Everywhere, we are surrounded by encoding and decoding.</p>
<hr />
<p><strong>Notes</strong></p>
<p>I&#8217;d like to recommend Eli Bendersky&#8217;s recent post on <a href="http://eli.thegreenplace.net/2012/01/30/the-bytesstr-dichotomy-in-python-3">The bytes/str dichotomy in Python 3</a>, which prodded me &#8212; finally &#8212; to put these thoughts into writing.  I especially like this passage in his post.</p>
<blockquote><p>Think of it this way: a string is an abstract representation of text. A string consists of characters, which are also abstract entities not tied to any particular binary representation. When manipulating strings, we’re living in blissful ignorance. We can split and slice them, concatenate and search inside them. We don’t care how they are represented internally and how many bytes it takes to hold each character in them. We only start caring about this when encoding strings into bytes (for example, in order to send them over a communication channel), or decoding strings from bytes (for the other direction).</p></blockquote>
<p>I strongly recommend Charles Petzold&#8217;s wonderful book <a href="http://www.amazon.com/Code-Language-Computer-Hardware-Software/dp/0735611319/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1328110879&amp;sr=1-1"><em>Code: The Hidden Language of Computer Hardware and Software.</em></a></p>
<p>And finally, I&#8217;ve found Stephen Pincock&#8217;s <a href="http://www.amazon.com/Codebreaker-History-Ciphers-Stephen-Pincock/dp/0802715478/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1328110831&amp;sr=1-1"><em>Codebreaker: The History of Secret Communications</em></a> a delightful read.  It will tell you, among many other things, how the famous WWII Navaho codetalkers could talk about submarines and dive bombers&#8230; despite the fact that there are no Navaho words for &#8220;submarine&#8221; or &#8220;dive bomber&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2012/02/01/unicode-for-dummies-encoding/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>

		<media:content url="http://pythonconquerstheuniverse.files.wordpress.com/2012/01/encoding_decoding2.jpg" medium="image">
			<media:title type="html">encoding and decoding</media:title>
		</media:content>

		<media:content url="http://pythonconquerstheuniverse.files.wordpress.com/2012/02/encoding_decoding_uf8.png" medium="image">
			<media:title type="html">encoding_decoding_uf8</media:title>
		</media:content>

		<media:content url="http://pythonconquerstheuniverse.files.wordpress.com/2012/02/encoding_decoding_utf16.jpg" medium="image">
			<media:title type="html">encoding_decoding_utf16</media:title>
		</media:content>
	</item>
		<item>
		<title>How to post source code on WordPress</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/11/06/posting-sourcecode-on-wordpress/</link>
		<comments>http://pythonconquerstheuniverse.wordpress.com/2011/11/06/posting-sourcecode-on-wordpress/#comments</comments>
		<pubDate>Sun, 06 Nov 2011 16:46:56 +0000</pubDate>
		<dc:creator>Steve Ferg</dc:creator>
				<category><![CDATA[Miscellaneous]]></category>

		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=986</guid>
		<description><![CDATA[This post is for folks who blog about Python (or any programming language for that matter) on WordPress. Updated 2011-11-09 to make it easier to copy-and-paste the [sourcecode] template. My topic today is How to post source code on WordPress. &#8230; <a href="http://pythonconquerstheuniverse.wordpress.com/2011/11/06/posting-sourcecode-on-wordpress/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><em>This post is for folks who blog about Python (or any programming language for that matter) on WordPress.</em><br />
<em>Updated 2011-11-09 to make it easier to copy-and-paste the <code>[<em></em>sourcecode]</code> template.</em></p>
<p>My topic today is <strong>How to post source code on WordPress.</strong></p>
<p>The trick is to use the WordPress <strong>[<em></em>sourcecode]</strong> shortcode, as documented at <a title="How to post source code on WordPress" href="http://en.support.wordpress.com/code/posting-source-code/" target="_blank">http://en.support.wordpress.com/code/posting-source-code/</a>.</p>
<p>Note that when the WordPress docs tell you to enclose the <strong>[<em></em>sourcecode]</strong> shortcode in square &#8212; not pointy &#8212; brackets, they mean it. When you view your post <em>as HTML</em>, what you should see is square brackets around the shortcode tags, not pointy brackets.</p>
<p>Here is the tag I like to use for snippets of Python code.</p>
<blockquote><pre>

[<em></em>sourcecode language="python" wraplines="false" collapse="false"]
your source code goes here
[<em></em>/sourcecode]


</pre>
</blockquote>
<p>The default for <strong>wraplines </strong>is <strong>true</strong>, which causes long lines to be wrapped. That isn&#8217;t appropriate for Python, so I specify <strong>wraplines=&#8221;false&#8221;</strong>.</p>
<p>The default for <strong>collapse</strong> is <strong>false</strong>, which is what I normally want. But I code it explicitly, as a reminder that if I ever want to collapse a long code snippet, I can.</p>
<hr />
<p>Here are some examples.</p>
<p>Note that</p>
<ul>
<li>WordPress knows how to do syntax highlighting for Python. It uses <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">Alex Gorbatchev&#8217;s SyntaxHighlighter</a>.</li>
<li>If you hover your mouse pointer over the code, you get a pop-up toolbar that allows you to look at the original source code snippet, copy it to the clipboard, print it, etc.</li>
</ul>
<p>(1)</p>
<p>First, a normal chunk of relatively short lines of Python code.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
indentCount = 0
textChars = []
suffixChars = []

# convert the line into a list of characters
# and feed the list to the ReadAhead generator
chars = ReadAhead(list(line))

c = chars.next() # get first

while c and c == INDENT_CHAR:
    # process indent characters
    indentCount += 1
    c = chars.next()

while c and c != SYMBOL:
    # process text characters
    textChars.append(c)
    c = chars.next()

if c and c == SYMBOL:
    c = chars.next() # read past the SYMBOL
    while c:
        # process suffix characters
        suffixChars.append(c)
        c = chars.next()
</pre>
<p>(2)</p>
<p>Here is a different code snippet. This one has a line containing a very long comment. Note that the long line is NOT wrapped, and a horizontal scrollbar is available so that you can scroll as far to the right as you need to. That is because we have specified wraplines=&#8221;false&#8221;.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: false; notranslate">
somePythonVariable = 1
# This is a long, single-line, comment.  I put it here to illustrate the effect of the wraplines argument.  In this code snippet, wraplines=&quot;false&quot;, so lines are NOT wrapped, but extend indefinitely, and a horizontal scrollbar is available so that you can scroll as far to the right as you need to.
</pre>
<p>(3)</p>
<p>This is what a similar code snippet would look like if we had specified wraplines=true. Note that line 2 wraps around and there is no horizontal scrollbar.</p>
<pre class="brush: python; collapse: false; title: ; wrap-lines: true; notranslate">
somePythonVariable = 1
# This is a long, single-line, comment.  I put it here to illustrate the effect of the wraplines argument.  In this code snippet, wraplines=&quot;true&quot;, so lines ARE wrapped.  They do NOT extend indefinitely, and a horizontal scrollbar is NOT needed, because nothing runs off to the right.
</pre>
<p>(4)</p>
<p>Finally, the same code snippet with <strong>collapse=true</strong>, so the code snippet initially displays as collapsed. Clicking on the collapsed code snippet will cause it to expand.</p>
<pre class="brush: python; collapse: true; light: false; title: ; toolbar: true; wrap-lines: true; notranslate">
somePythonVariable = 1
# This is a long, single-line, comment.  I put it here to illustrate the effect of the wraplines argument.  In this code snippet, wraplines=&quot;true&quot;, so lines ARE wrapped.  They do NOT extend indefinitely, and a horizontal scrollbar is NOT needed, because nothing runs off to the right.
</pre>
<p>As far as I can tell, once a reader has expanded a snippet that was initially collapsed, there is no way for him to re-collapse it. That would be a nice enhancement for WordPress — to allow a reader to collapse and expand a code snippet.</p>
<hr />
<p>Here is a final thought about wraplines. If you specify <strong>wraplines=&#8221;false&#8221;</strong>, and a reader <em>prints</em> a paper copy of your post, the printed output will not show the scrollbar, and it will show <em>only </em>the portion of each long line that was visible on the screen. In short, the printed output might cut off the right-hand part of long lines.</p>
<p>In most cases, I think, this should not be a problem. The pop-up tools allow a reader to view or print the entire source code snippet if he wants to. Still, I can imagine cases in which I might choose to specify <strong>wraplines=&#8221;true&#8221;</strong>, even for a whitespace-sensitive language such as Python. And I can understand that someone else, simply as a matter of personal taste, might prefer to specify <strong>wraplines=&#8221;true&#8221;</strong> all of the time.</p>
<p>Now that I think of it, another nice enhancement for WordPress would be to allow a reader to toggle wraplines on and off.</p>
<hr />
<p>Keep on bloggin&#8217;!</p>
]]></content:encoded>
			<wfw:commentRss>http://pythonconquerstheuniverse.wordpress.com/2011/11/06/posting-sourcecode-on-wordpress/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/ba3bd1b3d3ba79b1595052ca2f14eb2b?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">Steve Ferg</media:title>
		</media:content>
	</item>
	</channel>
</rss>
