Learning about iterators and generators in Python
Python is a cool language! I might be biased, since it’s the language I learned to program with some years ago. It wasn’t until recently, though, that I started learning about some of its more advanced features. Two of these are iterators and generators.
An iterator is, in the words of the official documentation, an object representing a stream of data. An iterator is something you can call next on to get the next item. This is what happens in a for loop from one iteration to the next: the loop calls next behind the scenes to fetch each item.
Typical things to run for loops on include lists and strings. These are not iterators, but iterables. Again in the words of the documentation, an iterable is an object capable of returning its members one at a time. An iterable is not necessarily an iterator. For example (the # comments show the output we get):
    items = [1, 2, 3]
    item = next(items)  # TypeError: 'list' object is not an iterator
However, if you call iter on an iterable, you get an iterator:
    items = [1, 2, 3, 4, 5]  # iterable
    iterator_items = iter(items)  # iterator
    item = next(iterator_items)
    print(item)  # 1
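To make the difference between an iterable and an iterator a bit more concrete, here is a small sketch (the variable names first and second are just for illustration). It shows that each call to iter on a list gives a fresh iterator with its own position, and that an iterator is itself an iterable, since calling iter on it returns the same object.

```python
items = [1, 2, 3]            # iterable, not an iterator

first = iter(items)          # each call to iter() gives a fresh iterator
second = iter(items)

print(next(first))           # 1
print(next(first))           # 2
print(next(second))          # 1  (second keeps its own position)

# An iterator is also an iterable: iter() on it returns the same object.
print(iter(first) is first)  # True
```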
In addition, we have something called generators. A generator is a function which returns an iterator. It looks like a regular function, except that it uses yield instead of return to hand back values. Unfortunately, the term generator is used to mean both the generator function and the generator object it returns.
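One way to keep the two meanings apart is to ask Python itself. The sketch below uses the standard library's inspect module; count_to_two is a made-up generator function, used only for illustration.

```python
import inspect


def count_to_two():  # hypothetical generator function
    yield 1
    yield 2


gen = count_to_two()  # calling it gives a generator object

print(inspect.isgeneratorfunction(count_to_two))  # True
print(inspect.isgenerator(gen))                   # True
print(inspect.isgeneratorfunction(gen))           # False

# The generator object is an iterator, so iter() returns it unchanged.
print(iter(gen) is gen)                           # True
print(list(gen))                                  # [1, 2]
```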
When a generator function is called, a generator object is created, without executing anything in the function body. Only when next is called on the generator object does it start executing, running until it reaches the first yield expression. It then returns the value of that expression and pauses. The next time next is called, the generator continues where it left off, until the next yield expression. And so on. This is an example of lazy evaluation.
Let’s look at an example of a generator.
    def a_generator():
        print("before 0")
        yield 0
        print("before hi")
        yield "hi"
        print("last")

    iterator = a_generator()
    item = next(iterator)  # before 0
    print(item)  # 0
    item = next(iterator)  # before hi
    print(item)  # hi
    next(iterator)  # last, then raises StopIteration
We see that we called next one time too many. This is avoided, however, if we use a for loop to iterate, since it stops when it hits StopIteration:
    iterator = a_generator()
    for item in iterator:
        print(item)
    # before 0
    # 0
    # before hi
    # hi
    # last
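To see why the for loop never shows us the StopIteration error, it can help to write out by hand roughly what the loop does. This is only a sketch of the idea, not how the interpreter is actually implemented, and two_items is a simplified stand-in generator made up for this example.

```python
def two_items():  # hypothetical, simplified generator
    yield 0
    yield "hi"


# Roughly what "for item in two_items(): collected.append(item)" does:
iterator = iter(two_items())    # the loop first asks for an iterator
collected = []
while True:
    try:
        item = next(iterator)   # fetch the next item
    except StopIteration:       # the iterator is exhausted
        break                   # so the loop ends quietly
    collected.append(item)      # the loop body

print(collected)  # [0, 'hi']
```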
Finally, an important thing that the lazy evaluation provides us with is the ability to create infinite iterators!
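As a small sketch of what that can look like, here is a generator that never stops on its own. The name naturals is made up for this example. Since values are only computed when asked for, we can take as many as we want, for instance with itertools.islice from the standard library.

```python
from itertools import islice


def naturals():  # hypothetical infinite generator: 0, 1, 2, ...
    n = 0
    while True:  # never finishes by itself
        yield n  # pauses here until the next value is requested
        n += 1


# Take only the first five values; the rest are never computed.
first_five = list(islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Just be careful not to call list directly on an infinite generator, or loop over it without a break, since that would never finish.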
The iteration framework, and perhaps generators in particular, are powerful tools, and can really help with abstraction, among other things. Ned Batchelder gave a really good talk on this at PyCon 2013. I recommend you check it out!