Learn Python by Building Data Science Applications

Using generators

Generators are not exactly data structures; they are functions. However, while normal functions compute their results and return them all at once, generators can be paused and resumed, which makes them behave like iterables. In other words, you can loop over a generator, retrieving one value at a time. Unlike lists or tuples, however, generators are lazy: they compute each value only when we ask for it, and not before. As a result, their behavior differs from that of ordinary collections in a few significant ways:

  • First, generators use a small, roughly constant amount of memory. Even if you ask one to compute billions of values, a generator produces and holds just one value at a time. In fact, generators can produce an infinite number of values with no memory issues.
  • Second, as generators do not store the values, there is no way to retrieve values by their index. In order to get the third element, you need to compute the first two, first. Similarly, there is no way to get back to the previous element. If you didn't store it, the value is lost. Also, there is no way to estimate the length of the generator other than computing all the values, but again, generators can be infinite.
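The two points above can be sketched with a small infinite generator. This is only an illustration: the names squares and fails_with_type_error are made up for this example, and itertools.islice from the standard library is used to take a finite slice of the endless stream.

```python
from itertools import islice


def squares():
    """Yield 0, 1, 4, 9, ... forever; only the current n is kept in memory."""
    n = 0
    while True:
        yield n * n
        n += 1


gen = squares()

# Take the first five values lazily; the generator pauses after each one.
first_five = list(islice(gen, 5))

# The generator resumes where it stopped; earlier values are gone.
following = next(gen)


def fails_with_type_error(operation):
    """Hypothetical helper: report whether calling `operation` raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# Generators support neither indexing nor len().
no_indexing = fails_with_type_error(lambda: gen[2])
no_length = fails_with_type_error(lambda: len(gen))

print(first_five, following, no_indexing, no_length)
```

Because the loop inside squares() never ends, calling list(squares()) directly would never finish; an infinite generator always has to be consumed through something that stops, such as islice or a loop with a break.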

For a function to work as a generator, it needs to contain at least one yield statement. Each yield hands a value back to the caller and pauses the function until the next value is requested. Calling the function returns a generator object, which you can loop over as if it were a list or a tuple, or advance one value at a time with Python's built-in next() function:

def my_generator(N, power=2):
    # for loop, which we'll cover in depth in the next chapter.
    # note that loops require another level of indentation
    for el in range(N):
        yield el**power

gen = my_generator(4, power=2)

next(gen)
>>> 0

for el in gen:
    print(el)
>>> 1 # 0 was already consumed by next()
>>> 4
>>> 9
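Once a generator has produced its last value, it is exhausted and cannot be rewound. The following is a minimal sketch of that behavior, reusing my_generator from the example above; the variable names are chosen here for illustration only.

```python
def my_generator(N, power=2):
    for el in range(N):
        yield el**power


gen = my_generator(3)

# Pull all three values by hand.
values = [next(gen), next(gen), next(gen)]

# A further next() call signals exhaustion by raising StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# next() accepts a default that is returned instead of raising.
fallback = next(gen, None)

# Looping over an exhausted generator yields nothing.
leftover = list(gen)

# To get the values again, call the function again for a fresh generator.
fresh = list(my_generator(3))

print(values, exhausted, fallback, leftover, fresh)
```

This is why a value you need more than once should be stored in a variable or a list as it is produced.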

In the preceding example, we used the range() function, which takes one to three integer arguments, of which only one is required. With one argument, range produces the numbers from 0 up to, but not including, that number. With two, the first becomes the starting value and the second the end, which is again excluded. A third argument, if given, is used as the step. Strictly speaking, range returns a lazy range object rather than a generator, but it likewise computes its values on demand. Plenty of other functions in Python return lazy iterables as well. Don't worry if you need a list or tuple as a result; just convert them:

list(range(5))
>>> [0, 1, 2, 3, 4]
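The three call forms of range() can be compared side by side. This is a short sketch; the variable names are illustrative, and the negative step in the last line is an extra case that the text above does not discuss.

```python
# One argument: from 0 up to, but not including, the stop value.
one_arg = list(range(5))

# Two arguments: start is included, stop is excluded.
two_args = list(range(2, 6))

# Three arguments: the third is the step between consecutive values.
three_args = list(range(0, 10, 3))

# A negative step counts downward; stop is still excluded.
countdown = list(range(5, 0, -1))

print(one_arg, two_args, three_args, countdown)
```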

The range object also offers a few conveniences. For example, it can test for membership without actually generating the values, which is simple arithmetic once you know the start, the end, and the step:

20346 in range(0, 100_000_000, 2)  # even numbers
>>> True
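To see why no values need to be generated, the membership test can be written as plain arithmetic. The function in_stepped_range below is a hypothetical helper written for this illustration, not part of Python, and it assumes an integer value and a positive step; it is a sketch of the idea rather than a description of how the interpreter implements it.

```python
def in_stepped_range(value, start, stop, step):
    """Hypothetical helper: mimic `value in range(start, stop, step)`.

    Assumes an integer `value` and a positive `step`.
    """
    if step <= 0:
        raise ValueError("this sketch only handles a positive step")
    # The value must lie within the bounds and sit exactly on a step.
    return start <= value < stop and (value - start) % step == 0


r = range(0, 100_000_000, 2)  # even numbers

# Compare the arithmetic check with the built-in membership test.
checks = [
    (n, in_stepped_range(n, r.start, r.stop, r.step), n in r)
    for n in (20346, 20347, -2, 100_000_000)
]

for n, mine, builtin in checks:
    print(n, mine, builtin)
```

Both checks take the same time regardless of how large the range is, because neither one walks through the numbers.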