A few places in the Python language reference coyly refer to "the old-style iteration protocol." What is that protocol and how is it different from "the new-style iteration protocol?" Does it have anything to do with the difference between old-style and new-style classes? Let's dig in to see how iteration works now, and how it used to work.

Iteration In Python

(If you're familiar with the current state of Python iteration, feel free to skip ahead to The Old Iteration Protocol.)

Iteration is the ability to operate on the members of a collection [1] individually, one after the other. It is commonly used in for loops.

>>> collection = [1, 2, 3]
>>> for member in collection:
...     print(member)
1
2
3

There's a lot going on behind the scenes there, which makes it harder to discuss the concepts involved in iteration. To bring those concepts into focus, let's pull back the curtain and look at what's actually going on. We can do this by noting that a for loop could be considered syntactic sugar for an equivalent while loop.

>>> collection = [1, 2, 3]
>>> iterator = iter(collection)
>>> while True:
...     try:
...         member = next(iterator)
...     except StopIteration:
...         break
...     else:
...         print(member)
...
1
2
3

In this example, we want to iterate over the value of the variable collection. Python provides two built-in functions that help us.

The first function is iter, which takes a collection (also known as an iterable) and returns an iterator (conveniently stashed in the variable named iterator). An iterator is an object that keeps track of the state of the iteration, i.e., what members have been seen and what members are still to come.

The second built-in function is next, which takes an iterator and returns its next member. When the iterator is out of members, next raises the StopIteration exception.

A for loop, then, just knows to call iter at the beginning, next as needed, and catch StopIteration at the end.

Iterables and Iterators

Recall that being iterable means that when passed to iter we get back an iterator. The types Iterable and Iterator formally define what this post has been describing in words. Many built-in types are iterable (and thus Iterable), including lists, tuples, sets, and dictionaries.

>>> from collections.abc import Iterable
>>> from collections.abc import Iterator
>>> isinstance([], Iterable)
True
>>> isinstance(iter([]), Iterator)
True
>>> iter([])
<list_iterator object at 0x1042244a8>

>>> isinstance((), Iterable)
True
>>> isinstance(iter(()), Iterator)
True
>>> iter(())
<tuple_iterator object at 0x1042244a8>

>>> isinstance(set(), Iterable)
True
>>> isinstance(iter(set()), Iterator)
True
>>> iter(set())
<set_iterator object at 0x10422ed38>

>>> isinstance({}, Iterable)
True
>>> isinstance(iter({}), Iterator)
True
>>> iter({})
<dict_keyiterator object at 0x104221278>

Our Own Iterables

Python has documented what it means for an object to be iterable and for an object to be an iterator, so we can create our own objects that work with for loops. Making an object Iterable just means defining the __iter__ method to return an Iterator (Iterator is slightly more complex).

>>> class MyIterable(object):
...     def __init__(self, count):
...         self.count = count
...     def __iter__(self):  # This makes this object Iterable
...         return MyIterator(self.count)
...
>>> class MyIterator(object):
...     def __init__(self, count):
...         self.count = count
...         self.value = 0
...     def __next__(self):
...         if self.value >= self.count:
...             raise StopIteration()
...         self.value += 1
...         return self.value
...     def __iter__(self):
...         return self
...
>>> collection = MyIterable(3)
>>> isinstance(collection, Iterable)
True
>>> for i in collection:
...     print(i)
...
1
2
3

Iterators are Iterables

You may have noticed that MyIterator also defined __iter__, even though it's the iterator, not the iterable. Why? Well, one interesting quirk is that Iterator is a subclass of Iterable, and thus every iterator is also an iterable. When passed to iter, an iterator returns itself.

>>> iterator = iter([])
>>> iterator
<list_iterator object at 0x1042244a8>
>>> iter(iterator)
<list_iterator object at 0x1042244a8>
>>> iter(iterator) is iterator
True

The direct consequence of this is that we can pass an iterator to for.

>>> iterator = iter([1, 2, 3])
>>> for member in iterator:
...     print(member)
1
2
3

Why would we want to do that, other than programming convenience? Some objects such as streams (sys.stdin, sockets, and files in general) are inherently stateful—they can only go forwards, not backwards, and not randomly. Unlike iterating across a list, where each iterator is independent and a new one can be started from the beginning at any time, iterating across a stream is always going to move the underlying stream forward. Letting an iterable be its own iterator simplifies the implementation of such objects. I like to call such objects (where the iterable is the iterator) "consumable iterables." Note that iterators in general (such as MyIterator) are consumable iterables because there is no standard way to reset them.
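
As a small illustration of a consumable iterable, the sketch below uses io.StringIO as a stand-in for a real stream (an assumption made only to keep the example self-contained). It shows that the stream is its own iterator, and that a second pass over it finds nothing left, unlike a list.

```python
import io

# io.StringIO stands in for a file or socket here; it is an in-memory text stream.
stream = io.StringIO("first\nsecond\n")

# A stream is its own iterator, so iter() hands back the same object.
stream_is_own_iterator = iter(stream) is stream

# The first pass consumes every line...
first_pass = list(stream)
# ...and a second pass picks up where the first one stopped: at the end.
second_pass = list(stream)

# A list, by contrast, hands out a fresh, independent iterator each time.
numbers = [1, 2, 3]
list_passes = (list(numbers), list(numbers))
```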

Iterables are Containers

In what may be the most surprising twist so far, it turns out that iterables are also containers. A container is an object that can be used with the in operator to test for membership. Usually this means that it implements the __contains__ special method and is a subclass of Container, but for iterables that's not required. If the object doesn't implement __contains__, the in operator falls back to iterating across it and comparing each member for equality. (If iter can't get an iterator from the object, in raises TypeError.)

For consumable iterables, such as iterators, this can be surprising.

>>> iterable = MyIterable(3)
>>> 1 in iterable
True
>>> 1 in iterable
True
>>> iterator = iter(iterable)
>>> 1 in iterator
True
>>> 1 in iterator
False
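
The membership fallback can be approximated in pure Python. The helper name contains below is hypothetical, and the sketch glosses over details of the real implementation; it is only meant to show why a membership test advances an iterator.

```python
def contains(container, item):
    """Rough, simplified stand-in for `item in container`."""
    method = getattr(type(container), "__contains__", None)
    if method is not None:
        return bool(method(container, item))
    # No __contains__: walk the members. This consumes an iterator as it goes,
    # and raises TypeError if the object cannot be iterated at all.
    for member in container:
        if member is item or member == item:
            return True
    return False


iterator = iter([1, 2, 3])
found = contains(iterator, 2)  # the search stops right after reaching 2
remaining = list(iterator)     # only what the search did not consume
```

The search stops as soon as it finds a match, so whatever came before the match is gone and whatever came after is still available.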

The Old Iteration Protocol

The above describes the way iteration in Python works now. But it wasn't always this way. If you've been following along with the links to the Python documentation, you saw that __contains__ referenced "the old sequence iteration protocol." The Iterable class referenced some other way to iterate besides __iter__, and the documentation on iterable types stated that sequence types are always iterable without saying why or how.

It all goes back to Python 2.2, when iter was first introduced, along with the special methods __iter__ and __next__ [2]. Prior to that, the only thing that for knew how to iterate was sequences like lists. for member in sequence: print(member) would have looked something like this as a while loop.

sequence = [1, 2, 3]
_i = 0
while 1:  # `True` wasn't a built-in yet
    try:
        member = sequence[_i]
    except IndexError:
        break
    else:
        print(member)
    _i += 1

In other words, for was implemented by calling __getitem__ with ever-increasing integers until it "ran off the end" and the sequence raised IndexError.

This was suboptimal. For one thing, dictionaries couldn't be iterated with a for loop: their keys (usually) weren't sequential increasing integers, and misses raise KeyError instead of IndexError. For another thing, if you had a custom type you wanted to make iterable (such as a set, which didn't become a built-in type until Python 2.4), you also had to make it subscriptable, which may not make any sense (sets aren't ordered).

Python's BDFL [3] and others recognized the shortcomings, wrote PEP 234 to give us iterables and iterators, and the rest is history.

Except

Except it's not quite history. Even in the latest and greatest Python 3.7 the old iteration protocol is alive and well, as the documentation obliquely references.

>>> class MySequence:
...     def __getitem__(self, i):
...         if i > 3:
...             raise IndexError()
...         return str(i)
...
>>> for member in MySequence():
...     print(member)
...
0
1
2
3

This is implemented with a special type of iterator that knows how to call __getitem__ with increasing integers until IndexError is raised. When iter detects that the object it was given doesn't implement __iter__, it checks to see if it has __getitem__ and, if so, instantiates and returns an iterator of that type.
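
To make that concrete, here is a minimal pure-Python sketch of the fallback. The names SequenceIterator, sketch_iter, and Digits are hypothetical, and the real logic lives inside the interpreter and differs in detail; treat this as an approximation of the behaviour described above, not as the actual implementation.

```python
_MISSING = object()


class SequenceIterator:
    """Hypothetical pure-Python stand-in for the built-in fallback iterator."""

    def __init__(self, sequence):
        self._sequence = sequence
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._sequence is None:  # already exhausted
            raise StopIteration
        try:
            value = self._sequence[self._index]
        except IndexError:
            self._sequence = None  # stay exhausted from now on
            raise StopIteration from None
        self._index += 1
        return value


def sketch_iter(obj):
    """Simplified approximation of how iter() picks an iterator."""
    cls = type(obj)
    dunder_iter = getattr(cls, "__iter__", _MISSING)
    if dunder_iter is None:
        # __iter__ was explicitly set to None: the class opted out.
        raise TypeError(f"{cls.__name__!r} object is not iterable")
    if dunder_iter is not _MISSING:
        return dunder_iter(obj)       # the new protocol
    if hasattr(cls, "__getitem__"):
        return SequenceIterator(obj)  # the old protocol
    raise TypeError(f"{cls.__name__!r} object is not iterable")


class Digits:
    """Example class that only implements __getitem__."""

    def __getitem__(self, i):
        if i > 3:
            raise IndexError()
        return str(i)


members = list(sketch_iter(Digits()))
```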

There are a few small gotchas associated with this behaviour.

One, as Iterable documents, is that this means that things that aren't an instance of Iterable can still be iterated.

>>> isinstance(MySequence(), Iterable)
False
>>> iter(MySequence())
<iterator object at 0x10e1e7790>
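
Because of this, an isinstance check can give a false negative. A more dependable test, sketched below with a hypothetical helper named is_iterable and a hypothetical class named OnlyGetItem, is simply to call iter and see whether it raises TypeError.

```python
from collections.abc import Iterable


class OnlyGetItem:
    """Hypothetical class that supports only the old protocol."""

    def __getitem__(self, i):
        if i > 3:
            raise IndexError()
        return str(i)


def is_iterable(obj):
    """Return True if iter() accepts obj, whichever protocol it uses."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


abc_says = isinstance(OnlyGetItem(), Iterable)  # the ABC only looks for __iter__
iter_says = is_iterable(OnlyGetItem())          # iter() also accepts __getitem__
```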

Another is that if you're implementing a container you may find your __getitem__ called with unexpected values if someone tries to iterate these objects. If your implementation is type-specific this can lead to unexpected errors.

>>> import bisect
>>> class ContainerOfSortedStrings(object):
...     def __init__(self):
...         self._strings = []
...     def __setitem__(self, key, value):
...         assert isinstance(key, str) and isinstance(value, str)
...         bisect.insort_left(self._strings, (key, value))
...     def __getitem__(self, key):
...         # (key,) sorts just before any (key, value) pair with the same key
...         return self._strings[bisect.bisect_left(self._strings, (key,))]
...
>>> cs = ContainerOfSortedStrings()
>>> cs['key'] = 'value'
>>> cs['key']
('key', 'value')
>>> for member in cs: print(member)
...
Traceback (most recent call last):
  ...
TypeError: '<' not supported between instances of 'str' and 'int'

You can work around this problem by assigning __iter__ to None in the class body. On recent versions of Python, trying to iterate such an object then raises an informative TypeError instead of calling __getitem__ with integers.
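
A minimal sketch of that workaround follows. The class name Lookup and its dictionary-backed storage are assumptions made for illustration; the only part that matters is the `__iter__ = None` line.

```python
from collections.abc import Iterable


class Lookup:
    """Hypothetical string-keyed container that opts out of iteration."""

    __iter__ = None  # tells iter() not to fall back to __getitem__

    def __init__(self):
        self._data = {"key": "value"}

    def __getitem__(self, key):
        return self._data[key]


lookup = Lookup()
try:
    iter(lookup)
except TypeError:
    iteration_rejected = True   # iter() refused up front
else:
    iteration_rejected = False

still_subscriptable = lookup["key"]
```

Subscripting keeps working as before; only iteration is turned off, so __getitem__ never sees an integer it did not expect.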

Old Squared

Finally, for some fun, here's an example of the old iteration protocol on an old-style (Python 2) class. (Exercise for the reader: Does this work in Python 3? Why or why not?)

>>> class OldStyle:
...     pass
...
>>> container = [1, 2, 3]
>>> old_style = OldStyle()
>>> old_style.__getitem__ = container.__getitem__
>>> for i in old_style:
...     print(i)
...
1
2
3

Footnotes

[1] In the generic sense of the word, not the collections.abc sense. I would use the word "container," but that does have a collections.abc sense I want to use later on. This way is less confusing. Trust me.
[2] In Python 2, that was actually the next method without the dunders. This was probably because the next function didn't come about until Python 2.6. Instead of next(iterator), users of Python 2.2 were expected to write iterator.next().
[3] Benevolent dictator for life is the title used to refer to Python's creator Guido van Rossum.