Advanced Python Programming
上QQ阅读APP看书,第一时间看更新

Useful algorithms and data structures

Algorithmic improvements are especially effective in increasing performance because they typically allow the application to scale better with increasingly large inputs.

Algorithm running times can be classified according to their computational complexity, a characterization of the resources required to perform a task. Such classification is expressed through the Big-O notation, an upper bound on the operations required to execute the task, which usually depends on the input size.

For example, incrementing each element of a list can be implemented using a for loop, as follows:

    input = list(range(10))
for i, _ in enumerate(input):
input[i] += 1

If the operation does not depend on the size of the input (for example, accessing the first element of a list), the algorithm is said to take constant, or O(1), time. This means that, no matter how much data we have, the time to run the algorithm will always be the same.

In this simple algorithm, the input[i] += 1 operation will be repeated 10 times, which is the size of the input. If we double the size of the input array, the number of operations will increase proportionally. Since the number of operations is proportional to the input size, this algorithm is said to take O(N) time, where N is the size of the input array.

In some instances, the running time may depend on the structure of the input (for example, if the collection is sorted or contains many duplicates). In these cases, an algorithm may have different best-case, average-case, and worst-case running times. Unless stated otherwise, the running times presented in this chapter are considered to be average running times.

In this section, we will examine the running times of the main algorithms and data structures that are implemented in the Python standard library, and understand how improving running times results in massive gains and allows us to solve large-scale problems with elegance.

You can find the code used to run the benchmarks in this chapter in the Algorithms.ipynb notebook, which can be opened using Jupyter.