<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Read-Ahead and Python Generators</title>
	<atom:link href="http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/feed/" rel="self" type="application/rss+xml" />
	<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/</link>
	<description>Adventures across space and time with the Python programming language</description>
	<lastBuildDate>Sat, 04 May 2013 20:54:54 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: jnothman</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-1683</link>
		<dc:creator><![CDATA[jnothman]]></dc:creator>
		<pubDate>Mon, 06 Aug 2012 11:22:08 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-1683</guid>
		<description><![CDATA[I don&#039;t think ReadAhead is descriptive of what this function actually does. All it does is tack a None onto the end of an iterator. Apart from the fact that this simply won&#039;t do if the sequence contains None, it can also be implemented as itertools.chain(sequence, (None,)), which is much more explicit. (To me, the name ReadAhead implies the ability to view future sequence items without modifying the underlying state of the iterator. This can be done with itertools.tee.)

As long as there is no chance of StopIteration being raised by something else, the above could be rewritten using a try-except.

However, this particular application would be best solved by a regular expression: indentChars, textChars, suffixChars = re.match(&#039;({indent}*)(.*?)(?:{symbol}(.*))?$&#039;, line).groups().]]></description>
		<content:encoded><![CDATA[<p>I don&#8217;t think ReadAhead is descriptive of what this function actually does. All it does is tack a None onto the end of an iterator. Apart from the fact that this simply won&#8217;t do if the sequence contains None, it can also be implemented as itertools.chain(sequence, (None,)), which is much more explicit. (To me, the name ReadAhead implies the ability to view future sequence items without modifying the underlying state of the iterator. This can be done with itertools.tee.)</p>
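<p>A minimal sketch of the itertools.chain equivalence described above. The names <code>read_ahead_chain</code> and <code>END</code> are made up for illustration, and the unique sentinel object is an added assumption: it is one possible way to cope with sequences that contain None.</p>

```python
import itertools

# Hypothetical sentinel: a unique object that can never appear in the data.
END = object()

def read_ahead_chain(sequence, sentinel=None):
    """Return an iterator over sequence followed by one trailing sentinel."""
    return itertools.chain(sequence, (sentinel,))

# Default sentinel: a None after the last item, like the ReadAhead generator.
print(list(read_ahead_chain("ab")))

# With a unique sentinel, None can be ordinary data.
items = read_ahead_chain([1, None, 2], END)
seen = []
item = next(items)  # get first
while item is not END:
    seen.append(item)
    item = next(items)  # get next
print(seen)
```

<p>Testing with <code>is not END</code> rather than plain truthiness also keeps falsy items such as 0 or an empty string from ending the loop early.</p>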
<p>As long as there is no chance of StopIteration being raised by something else, the above could be rewritten using a try-except.</p>
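<p>The try-except variant could look roughly like this. It is only a sketch, written with the Python 3 spelling <code>next(chars)</code>; the function name <code>split_line</code> and its parameters are hypothetical.</p>

```python
def split_line(line, indent_char=" ", symbol="!"):
    """Split line into (indent count, text, suffix) without a sentinel."""
    indent_count = 0
    text_chars = []
    suffix_chars = []
    chars = iter(line)
    try:
        c = next(chars)  # get first
        while c == indent_char:
            # process indent characters
            indent_count += 1
            c = next(chars)
        while c != symbol:
            # process text characters
            text_chars.append(c)
            c = next(chars)
        # c is the symbol here; everything after it is the suffix
        suffix_chars.extend(chars)
    except StopIteration:
        # the line ended inside one of the loops above
        pass
    return indent_count, "".join(text_chars), "".join(suffix_chars)

print(split_line("   this is the line body ! this is the line suffix !!!"))
```

<p>Because a single except clause covers all three phases, the individual loops no longer need an end-of-input test of their own.</p>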
<p>However, this particular application would be best solved by a regular expression: indentChars, textChars, suffixChars = re.match(&#8216;({indent}*)(.*?)(?:{symbol}(.*))?$&#8217;, line).groups().</p>
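<p>A runnable sketch of the regular-expression approach, assuming the INDENT_CHAR and SYMBOL values from the demonstration program. The helper name <code>split_line_re</code> is hypothetical. The pattern ends with an end-of-line anchor; without one, the lazy text group can match the empty string and the match stops right after the indent.</p>

```python
import re

INDENT_CHAR = " "
SYMBOL = "!"

# re.escape guards against characters that are special inside a pattern.
# The trailing $ forces the lazy text group to run up to the first SYMBOL
# (or to the end of the line) instead of matching nothing.
LINE_PATTERN = re.compile(
    "({indent}*)(.*?)(?:{symbol}(.*))?$".format(
        indent=re.escape(INDENT_CHAR), symbol=re.escape(SYMBOL)
    )
)

def split_line_re(line):
    """Return (indent count, text, suffix) for one line."""
    indent, text, suffix = LINE_PATTERN.match(line).groups()
    # the suffix group is None when the line contains no SYMBOL
    return len(indent), text, suffix or ""

print(split_line_re("   this is the line body ! this is the line suffix !!!"))
```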
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Ferg</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-899</link>
		<dc:creator><![CDATA[Steve Ferg]]></dc:creator>
		<pubDate>Wed, 12 Oct 2011 20:30:46 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-899</guid>
		<description><![CDATA[No, that won&#039;t even come close.   

First of all, the demonstration.  Here is a full Python program using the ReadAhead technique.&lt;pre&gt;###################################################################
indentCount = 0
textChars = []
suffixChars = []

INDENT_CHAR = &quot; &quot; # space character
SYMBOL = &quot;!&quot;
line = &quot;   this is the line body ! this is the line suffix !!!&quot;

def ReadAhead(sequence):
    for item in sequence:
        yield item
    yield None # return the &quot;end of file mark&quot; after the last item

#feed the line to the ReadAhead generator
chars = ReadAhead(line)

# This code works for Python 2.x
# For Python 3.x, change &quot;chars.next()&quot; to &quot;next(chars)&quot;
c = chars.next() # get first

while c and c == INDENT_CHAR:
    # process indent characters
    indentCount += 1
    c = chars.next()

while c and c != SYMBOL:
    # process text characters
    textChars.append(c)
    c = chars.next()

if c and c == SYMBOL:
    c = chars.next() # read past the SYMBOL
    while c:
        # process suffix characters
        suffixChars.append(c)
        c = chars.next()

print(&#039;Line (input)= &quot;&#039; + line + &#039;&quot;&#039;)
print(&#039;Line indent = &#039; + str(indentCount))
print(&#039;Line text   = &quot;&#039; + &quot;&quot;.join(textChars) + &#039;&quot;&#039;)
print(&#039;Line suffix = &quot;&#039; + &quot;&quot;.join(suffixChars) + &#039;&quot;&#039;)
##################################################################&lt;/pre&gt;

It produces this output.&lt;pre&gt;
Line (input)= &quot;   this is the line body ! this is the line suffix !!!&quot;
Line indent = 3
Line text   = &quot;this is the line body &quot;
Line suffix = &quot; this is the line suffix !!!&quot;&lt;/pre&gt;

Now replace the relevant code with your suggested code and look at the output.

Second, the explanation.  To understand &lt;em&gt;why&lt;/em&gt; your suggested code does not work, the best place to start is Michael Jackson&#039;s famous paper
&lt;a href=&quot;http://mcs.open.ac.uk/mj665/GetWrong.pdf&quot; rel=&quot;nofollow&quot;&gt;Getting It Wrong -- A Cautionary Tale&lt;/a&gt; 

After that, his book &lt;a href=&quot;http://www.amazon.com/Principles-Program-Design-APIC-Jackson/dp/0123790506/ref=wl_it_dp_o_npd?ie=UTF8&amp;coliid=I1C5CKQP91IFPT&amp;colid=3RR8GM5V6873R&quot; rel=&quot;nofollow&quot;&gt;Principles of Program Design&lt;/a&gt; would be a good place to go.

For online resources about the Jackson methods, see &lt;a href=&quot;http://www.jacksonworkbench.co.uk/stevefergspages/jsp_and_jsd/index.html&quot; rel=&quot;nofollow&quot;&gt;http://www.jacksonworkbench.co.uk/stevefergspages/jsp_and_jsd/index.html&lt;/a&gt;.]]></description>
		<content:encoded><![CDATA[<p>No, that won&#8217;t even come close.   </p>
<p>First of all, the demonstration.  Here is a full Python program using the ReadAhead technique.</p>
<pre>###################################################################
indentCount = 0
textChars = []
suffixChars = []

INDENT_CHAR = " " # space character
SYMBOL = "!"
line = "   this is the line body ! this is the line suffix !!!"

def ReadAhead(sequence):
    for item in sequence:
        yield item
    yield None # return the "end of file mark" after the last item

#feed the line to the ReadAhead generator
chars = ReadAhead(line)

# This code works for Python 2.x
# For Python 3.x, change "chars.next()" to "next(chars)"
c = chars.next() # get first

while c and c == INDENT_CHAR:
    # process indent characters
    indentCount += 1
    c = chars.next()

while c and c != SYMBOL:
    # process text characters
    textChars.append(c)
    c = chars.next()

if c and c == SYMBOL:
    c = chars.next() # read past the SYMBOL
    while c:
        # process suffix characters
        suffixChars.append(c)
        c = chars.next()

print('Line (input)= "' + line + '"')
print('Line indent = ' + str(indentCount))
print('Line text   = "' + "".join(textChars) + '"')
print('Line suffix = "' + "".join(suffixChars) + '"')
##################################################################</pre>
<p>It produces this output.</p>
<pre>
Line (input)= "   this is the line body ! this is the line suffix !!!"
Line indent = 3
Line text   = "this is the line body "
Line suffix = " this is the line suffix !!!"</pre>
<p>Now replace the relevant code with your suggested code and look at the output.</p>
<p>Second, the explanation.  To understand <em>why</em> your suggested code does not work, the best place to start is Michael Jackson&#8217;s famous paper<br />
<a href="http://mcs.open.ac.uk/mj665/GetWrong.pdf" rel="nofollow">Getting It Wrong &#8212; A Cautionary Tale</a> </p>
<p>After that, his book <a href="http://www.amazon.com/Principles-Program-Design-APIC-Jackson/dp/0123790506/ref=wl_it_dp_o_npd?ie=UTF8&amp;coliid=I1C5CKQP91IFPT&amp;colid=3RR8GM5V6873R" rel="nofollow">Principles of Program Design</a> would be a good place to go.</p>
<p>For online resources about the Jackson methods, see <a href="http://www.jacksonworkbench.co.uk/stevefergspages/jsp_and_jsd/index.html" rel="nofollow">http://www.jacksonworkbench.co.uk/stevefergspages/jsp_and_jsd/index.html</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-897</link>
		<dc:creator><![CDATA[Anonymous]]></dc:creator>
		<pubDate>Wed, 12 Oct 2011 19:32:36 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-897</guid>
		<description><![CDATA[I wonder if this wouldn&#039;t do roughly the same:
&lt;pre&gt;
    indentCount = 0
    textChars = []
    suffixChars = []

    # convert the line into a list of characters
    for c in line:
        if c == INDENT_CHAR:
            # process indent characters
            indentCount += 1
            continue

        if c != SYMBOL:
            # process text characters
            textChars.append(c)
            continue

        if c == SYMBOL:
            for c2 in line:
                # process suffix characters
                suffixChars.append(c2)
&lt;/pre&gt;]]></description>
		<content:encoded><![CDATA[<p>I wonder if this wouldn&#8217;t do roughly the same:</p>
<pre>
    indentCount = 0
    textChars = []
    suffixChars = []

    # convert the line into a list of characters
    for c in line:
        if c == INDENT_CHAR:
            # process indent characters
            indentCount += 1
            continue

        if c != SYMBOL:
            # process text characters
            textChars.append(c)
            continue

        if c == SYMBOL:
            for c2 in line:
                # process suffix characters
                suffixChars.append(c2)
</pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Ferg</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-896</link>
		<dc:creator><![CDATA[Steve Ferg]]></dc:creator>
		<pubDate>Wed, 12 Oct 2011 17:13:44 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-896</guid>
		<description><![CDATA[Thanks for your comment!  I learned something from it. &lt;code&gt;:-) &lt;/code&gt;

The bad news is that it doesn&#039;t work.  The good news is that understanding why it doesn&#039;t work is really useful and interesting, because it helps point out an important feature of the ReadAhead technique and the ReadAhead generator.

I ran this program under Python 2.5:
&lt;pre&gt;import sys
print(sys.version)
listOfItems = &quot;abcd&quot;
items = (x for x in listOfItems)
item = items.next()  # get first
while item:
    print(item)
    item = items.next()  # get next&lt;/pre&gt;And got this.&lt;pre&gt;2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
a
b
c
d
Traceback (most recent call last):
  File &quot;c:\junk\test.py&quot;, line 9, in &amp;lt;module&amp;gt;
    item = items.next()  # get next
StopIteration&lt;/pre&gt;
The &quot;trick&quot; with the ReadAhead generator is that after yielding all of the members of the list, it &lt;em&gt;yields None&lt;/em&gt;, thus avoiding the raising of StopIteration.  In the classic formulation of the ReadAhead technique, the &lt;em&gt;yield None&lt;/em&gt; corresponds to returning the end-of-file mark.

&lt;strong&gt;NOTE&lt;/strong&gt; that with Python 3.x, instead of &lt;code&gt;&lt;strong&gt;items.next()&lt;/strong&gt;&lt;/code&gt; you need to use &lt;code&gt;&lt;strong&gt;next(items)&lt;/strong&gt;&lt;/code&gt;.
]]></description>
		<content:encoded><![CDATA[<p>Thanks for your comment!  I learned something from it. :-)</p>
<p>The bad news is that it doesn&#8217;t work.  The good news is that understanding why it doesn&#8217;t work is really useful and interesting, because it helps point out an important feature of the ReadAhead technique and the ReadAhead generator.</p>
<p>I ran this program under Python 2.5:</p>
<pre>import sys
print(sys.version)
listOfItems = "abcd"
items = (x for x in listOfItems)
item = items.next()  # get first
while item:
    print(item)
    item = items.next()  # get next</pre>
<p>And got this.</p>
<pre>2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
a
b
c
d
Traceback (most recent call last):
  File "c:\junk\test.py", line 9, in &lt;module&gt;
    item = items.next()  # get next
StopIteration</pre>
<p>The &#8220;trick&#8221; with the ReadAhead generator is that after yielding all of the members of the list, it <em>yields None</em>, thus avoiding the raising of StopIteration.  In the classic formulation of the ReadAhead technique, the <em>yield None</em> corresponds to returning the end-of-file mark.</p>
<p><strong>NOTE</strong> that with Python 3.x, instead of <code><strong>items.next()</strong></code> you need to use <code><strong>next(items)</strong></code>.</p>
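<p>For Python 3, the same loop might be sketched as below. Two things here go beyond what is stated above and should be read as assumptions about how one could write it: the built-in <code>next()</code> accepts an optional default, which can stand in for the end-of-file mark without a wrapper generator, and the loop tests <code>is not None</code> so that falsy items such as 0 or an empty string are not mistaken for the end.</p>

```python
list_of_items = "abcd"
items = iter(list_of_items)

processed = []
item = next(items, None)  # get first; None plays the end-of-file mark
while item is not None:
    processed.append(item)
    item = next(items, None)  # get next; no StopIteration at the end

print(processed)
```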
]]></content:encoded>
	</item>
	<item>
		<title>By: rcriii</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-895</link>
		<dc:creator><![CDATA[rcriii]]></dc:creator>
		<pubDate>Wed, 12 Oct 2011 16:30:46 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-895</guid>
		<description><![CDATA[This is cool.  I can see myself using this pattern.  

I don&#039;t know that it isn&#039;t pythonic overall, but it would be _more_ pythonic to replace the function ReadAhead with a generator expression:
&lt;pre&gt;
items = (x for x in listOfItems)
item = items.next()  # get first
while item:
    processItem(item)
    item = items.next()  # get next
&lt;/pre&gt;]]></description>
		<content:encoded><![CDATA[<p>This is cool.  I can see myself using this pattern.  </p>
<p>I don&#8217;t know that it isn&#8217;t pythonic overall, but it would be _more_ pythonic to replace the function ReadAhead with a generator expression:</p>
<pre>
items = (x for x in listOfItems)
item = items.next()  # get first
while item:
    processItem(item)
    item = items.next()  # get next
</pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Williams</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-820</link>
		<dc:creator><![CDATA[Steve Williams]]></dc:creator>
		<pubDate>Mon, 25 Jul 2011 01:43:56 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-820</guid>
		<description><![CDATA[A long time ago I posted this pattern and received a stern rebuke from Martelli that it was &quot;unpythonic&quot;.]]></description>
		<content:encoded><![CDATA[<p>A long time ago I posted this pattern and received a stern rebuke from Martelli that it was &#8220;unpythonic&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: daemian mack</title>
		<link>http://pythonconquerstheuniverse.wordpress.com/2011/07/22/read-ahead-and-python-generators/#comment-817</link>
		<dc:creator><![CDATA[daemian mack]]></dc:creator>
		<pubDate>Sat, 23 Jul 2011 15:08:12 +0000</pubDate>
		<guid isPermaLink="false">http://pythonconquerstheuniverse.wordpress.com/?p=966#comment-817</guid>
		<description><![CDATA[Cool, I hadn&#039;t seen this pattern explained before.

Incidentally, there&#039;s no need to convert the line via list in the final example: since strings are sequences in Python, the ReadAhead generator will iterate over it as-is on a character-by-character basis.]]></description>
		<content:encoded><![CDATA[<p>Cool, I hadn&#8217;t seen this pattern explained before.</p>
<p>Incidentally, there&#8217;s no need to convert the line via list in the final example: since strings are sequences in Python, the ReadAhead generator will iterate over it as-is on a character-by-character basis.</p>
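<p>A quick way to check this, reusing the ReadAhead generator from the post. The sample line is made up for illustration.</p>

```python
def ReadAhead(sequence):
    for item in sequence:
        yield item
    yield None  # return the "end of file mark" after the last item

line = "  ab!"

# Iterating the string directly and iterating its list form
# produce the same stream of characters, followed by None.
from_string = list(ReadAhead(line))
from_list = list(ReadAhead(list(line)))
print(from_string == from_list)
```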
]]></content:encoded>
	</item>
</channel>
</rss>
