Understanding Python’s iterators, iterable objects, and generators

Time: 2020-1-10

Many people find the concepts of iterator, iterable object, and generator in Python a little confusing. Here is my understanding of them; I hope it helps anyone who needs it.

1 iterator protocol

The iterator protocol is the core idea. Once you understand it, the other concepts follow easily.

The iterator protocol requires an iterator to implement the following two methods:

iterator.__iter__()
Return the iterator object itself.

iterator.__next__()
Return the next item from the container.

In other words, any object that supports both methods is an iterator: __iter__() must return the iterator itself, and __next__() must return the next element.
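As a rough illustration, the protocol can be checked by testing for the two methods directly. The helper name looks_like_iterator below is hypothetical, written only for this sketch; it is not part of the standard library.

```python
def looks_like_iterator(obj):
    """Return True if obj has both methods the iterator protocol requires.

    Hypothetical helper for illustration only.
    """
    return hasattr(obj, "__iter__") and hasattr(obj, "__next__")


numbers = [1, 3, 5]
numbers_iter = iter(numbers)

print(looks_like_iterator(numbers_iter))  # True: has __iter__ and __next__
print(looks_like_iterator(numbers))       # False: a list has no __next__
```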

2 iterable objects

Now that we know what an iterator is, what is an iterable object?

This one is simpler: any object that implements an __iter__() method returning an iterator is an iterable object.

For example, the familiar list is an iterable object:

>>> l = [1, 3, 5]
>>> iter(l)
<list_iterator object at 0x...>

Calling iter() invokes the object’s __iter__() method, which here returns a list iterator, so a list is an iterable object.
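The standard library's collections.abc module offers abstract base classes that make the distinction visible. The short sketch below shows that a list is iterable but is not itself an iterator, while the object returned by iter() is both. One caveat: this isinstance check does not detect objects that are iterable only through __getitem__.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 3, 5]
numbers_iter = iter(numbers)

print(isinstance(numbers, Iterable))       # True: a list has __iter__
print(isinstance(numbers, Iterator))       # False: a list has no __next__
print(isinstance(numbers_iter, Iterable))  # True
print(isinstance(numbers_iter, Iterator))  # True
```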

3 writing an iterator by hand

There are several ways to implement an iterator. The first that comes to mind is probably a custom class, so let’s start there.

For illustration, we write an iterator that generates a sequence of odd numbers.

Following the iterator protocol, we implement the two methods above:

class Odd:
    def __init__(self, start=1):
        self.cur = start

    def __iter__(self):
        return self

    def __next__(self):
        ret_val = self.cur
        self.cur += 2
        return ret_val

In the terminal, we instantiate the Odd class to get an object named odd:

>>> odd = Odd()
>>> odd
<__main__.Odd object at 0x...>

Calling iter() invokes its __iter__() method, which returns the object itself:

>>> iter(odd)
<__main__.Odd object at 0x...>

Calling next() invokes the corresponding __next__() method to get the next element:

>>> next(odd)
1
>>> next(odd)
3
>>> next(odd)
5

So the odd object is indeed an iterator.

We can traverse it with a for loop:

odd = Odd()
for v in odd:
    print(v)

Careful readers may notice that this loop prints forever. How do we fix that?

Let’s experiment with a list, getting its iterator object first:

>>> l = [1, 3, 5]
>>> li = iter(l)
>>> li
<list_iterator object at 0x...>

Then fetch the next element manually until none is left, and see what happens:

>>> next(li)
1
>>> next(li)
3
>>> next(li)
5
>>> next(li)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

So a list iterator raises a StopIteration exception when there is no next element. Presumably the for statement relies on this exception to decide when to stop.

Let’s modify the original code to generate odd numbers within a specified range:

class Odd:
    def __init__(self, start=1, end=10):
        self.cur = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.cur > self.end:
            raise StopIteration
        ret_val = self.cur
        self.cur += 2
        return ret_val

Let’s try it with for:

>>> odd = Odd(1, 10)
>>> for v in odd:
...     print(v)
...
1
3
5
7
9

As expected, the loop ends once the range is exhausted.
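The for statement is not the only consumer of the protocol. As a small sketch, built-ins such as list() and sum() drive the same bounded iterator through __next__() until StopIteration is raised. The class below is a copy of the bounded Odd class from this section so that the example runs on its own.

```python
class Odd:
    """Copy of the bounded Odd iterator from this section."""

    def __init__(self, start=1, end=10):
        self.cur = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.cur > self.end:
            raise StopIteration  # signals that no elements are left
        ret_val = self.cur
        self.cur += 2
        return ret_val


as_list = list(Odd(1, 10))
total = sum(Odd(1, 10))

print(as_list)  # [1, 3, 5, 7, 9]
print(total)    # 25
```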

We can simulate how for executes with a while loop.

Target code:

for v in iterable:
    print(v)

Translated code:

iterator = iter(iterable)
while True:
    try:
        v = next(iterator)
        print(v)
    except StopIteration:
        break

In fact, this is essentially how Python’s for statement works. You can think of for as syntactic sugar for this loop.
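The translation can be wrapped in a function and compared against a real for loop. This is a minimal sketch; the helper name for_each is hypothetical and exists only for this example.

```python
def for_each(iterable, action):
    """Hypothetical helper: call action(v) for every v, like a for loop."""
    iterator = iter(iterable)
    while True:
        try:
            v = next(iterator)
        except StopIteration:
            break  # the iterator is exhausted, so the "loop" ends
        action(v)


collected = []
for_each([1, 3, 5], collected.append)

expected = []
for v in [1, 3, 5]:
    expected.append(v)

print(collected)              # [1, 3, 5]
print(collected == expected)  # True
```

Only the call to next() sits inside the try block here, so a StopIteration raised by the loop body itself would not be mistaken for the end of the iteration.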

4 other ways to create iterators

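Generators are also iterators, so any way of creating a generator is also a way of creating an iterator.

4.1 generator functions

Unlike a normal function, which hands back a value with return, a generator function uses yield. Calling it does not run the function body; it returns a generator object.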

>>> def odd_func(start=1, end=10):
...     for val in range(start, end + 1):
...         if val % 2 == 1:
...             yield val
...
>>> of = odd_func(1, 5)
>>> of
<generator object odd_func at 0x...>
>>> iter(of)
<generator object odd_func at 0x...>
>>> next(of)
1
>>> next(of)
3
>>> next(of)
5
>>> next(of)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

4.2 generator expression

>>> g = (v for v in range(1, 5 + 1) if v % 2 == 1)
>>> g
<generator object <genexpr> at 0x101a142b0>
>>> iter(g)
<generator object <genexpr> at 0x101a142b0>
>>> next(g)
1
>>> next(g)
3
>>> next(g)
5
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

4.3 how to choose

So far we have seen three ways to create an iterator. Which should you choose?

The simplest is the generator expression: if an expression meets your needs, use it. If you need more complex logic, use a generator function. If neither is enough, write a custom class. In short, choose the simplest approach that works.
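To make the comparison concrete, the sketch below puts the three approaches side by side for the same task (odd numbers from 1 to 10) and checks that they produce the same result. The class and the generator function are copies of the ones shown earlier in the article.

```python
class Odd:
    """Custom class: copy of the bounded Odd iterator from the article."""

    def __init__(self, start=1, end=10):
        self.cur = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.cur > self.end:
            raise StopIteration
        ret_val = self.cur
        self.cur += 2
        return ret_val


def odd_func(start=1, end=10):
    """Generator function: copy of odd_func from the article."""
    for val in range(start, end + 1):
        if val % 2 == 1:
            yield val


from_class = list(Odd(1, 10))
from_function = list(odd_func(1, 10))
from_expression = list(v for v in range(1, 10 + 1) if v % 2 == 1)

print(from_class)                                      # [1, 3, 5, 7, 9]
print(from_class == from_function == from_expression)  # True
```

The generator expression fits on one line, the generator function takes four, and the class takes more than ten, which is the practical argument for trying them in that order.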

5 characteristics of iterators

5.1 laziness

Iterators don’t compute all their elements in advance; they produce each element only when it is requested.
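Laziness can be observed by recording when each value is actually computed. In this sketch the generator function squares and the list computed are hypothetical names chosen for illustration.

```python
computed = []  # records which inputs have actually been processed


def squares(limit):
    """Hypothetical generator: yields n * n and logs each computation."""
    for n in range(limit):
        computed.append(n)
        yield n * n


lazy = squares(1000)
print(computed)    # []: creating the generator runs none of the body

print(next(lazy))  # 0
print(next(lazy))  # 1
print(computed)    # [0, 1]: only the two requested values were computed
```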

5.2 support for infinite sequences

For example, an instance of the first Odd class we wrote above yields every odd number from start upward, without end (given an odd start). Containers such as lists cannot hold infinitely many elements.
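An infinite iterator is still practical as long as the consumer takes only a finite slice. One way to do that, sketched below, is itertools.islice; the class is a copy of the first, unbounded Odd class from the article.

```python
from itertools import islice


class Odd:
    """Copy of the first (unbounded) Odd iterator from the article."""

    def __init__(self, start=1):
        self.cur = start

    def __iter__(self):
        return self

    def __next__(self):
        ret_val = self.cur
        self.cur += 2
        return ret_val


# Take only the first five elements of the endless sequence.
first_five = list(islice(Odd(), 5))
print(first_five)  # [1, 3, 5, 7, 9]
```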

5.3 space saving

Take 10,000 elements as an example:

>>> from sys import getsizeof
>>> a = [1] * 10000
>>> getsizeof(a)
80064

The list occupies about 80 KB.

What about an iterator?

>>> from itertools import repeat
>>> b = repeat(1, times=10000)
>>> getsizeof(b)
56

It takes only 56 bytes.

The iterator has this advantage because it is lazy.
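The same effect shows up with a generator expression. This is a rough sketch: exact byte counts vary with the Python version and platform, so the code only compares the two sizes. Note also that getsizeof reports the size of the object itself, not of the elements it refers to.

```python
from sys import getsizeof

as_list = [v * 2 for v in range(10000)]  # all 10,000 results stored at once
as_gen = (v * 2 for v in range(10000))   # results produced one at a time

list_size = getsizeof(as_list)
gen_size = getsizeof(as_gen)

print(gen_size < list_size)          # True
print(sum(as_gen) == sum(as_list))   # True: same values, computed lazily
```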

6 some details to pay attention to

6.1 iterators are also iterable objects

An iterator’s __iter__() method returns the iterator itself, which is an iterator, so iterators are also iterable objects.
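One practical consequence, sketched below, is that a partially consumed iterator can be handed to a for loop or a comprehension, which simply continues from where it left off.

```python
numbers_iter = iter([1, 3, 5, 7])

# iter() on an iterator returns the very same object.
same_object = iter(numbers_iter) is numbers_iter
print(same_object)  # True

first = next(numbers_iter)          # consume one element by hand
rest = [v for v in numbers_iter]    # the comprehension picks up the rest

print(first)  # 1
print(rest)   # [3, 5, 7]
```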

6.2 an iterator cannot start over after one traversal

Consider a puzzling example:

>>> l = [1, 3, 5]
>>> li = iter(l)
>>> li
<list_iterator object at 0x...>
>>> 3 in li
True
>>> 3 in li
False

Because li is a list iterator, the first search for 3 consumes 1 and 3 before finding a match, so it returns True. The second search resumes where the first one stopped: only 5 remains, so 3 is not found and the result is False.

So remember: iterators are “disposable”.

Lists, of course, are iterable objects, and searching them repeatedly works fine. (If this is hard to follow, recall how the for statement executes: each time, a new iterator is obtained from the iterable object through iter().)

>>> 3 in l
True
>>> 3 in l
True
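The difference can be sketched in a few lines. The first part shows an exhausted list iterator; the second part shows one possible way to get a reusable object, namely an iterable whose __iter__() builds a fresh generator on every call. The class name OddRange is hypothetical and used only for this illustration.

```python
numbers = [1, 3, 5]
numbers_iter = iter(numbers)

first_pass = list(numbers_iter)
second_pass = list(numbers_iter)
print(first_pass)   # [1, 3, 5]
print(second_pass)  # []: the iterator is already exhausted


class OddRange:
    """Hypothetical re-iterable object: every __iter__ call starts over."""

    def __init__(self, start=1, end=10):
        self.start = start
        self.end = end

    def __iter__(self):
        # A generator function as __iter__ returns a new iterator each time.
        cur = self.start
        while cur <= self.end:
            yield cur
            cur += 2


odds = OddRange(1, 5)
print(list(odds))            # [1, 3, 5]
print(list(odds))            # [1, 3, 5]
print(3 in odds, 3 in odds)  # True True
```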

7 summary

  • Every object that implements the iterator protocol is an iterator
  • Every object that implements an __iter__() method returning an iterator is an iterable object
  • A generator is also an iterator
  • There are three ways to create an iterator: generator expression, generator function, and custom class. Choose the simplest one that fits the situation
  • Iterators are also iterable objects
  • Iterators are “disposable”

The first three items are the key points. Once you understand them, the rest follows, and the terms in the title should no longer be confusing.

8 references

  • https://docs.python.org/3/library/stdtypes.html#iterator-types
  • https://opensource.com/article/18/3/loop-better-deeper-look-iteration-python
  • http://treyhunner.com/2018/06/how-to-make-an-iterator-in-python

Original link: http://www.kevinbai.com/articles/25.html
