Algorithm classes
Algorithms are recipes for calculating something. As you are an intermediate programmer, you might surely have heard about the O(n) (that is, the big O) notation which ranks the algorithms performance-wise. This notation describes the relative theoretical performance of algorithms by measuring how the runtime of an algorithm increases with increasing input size. The variable n stands for the input size, and for us the big O notation expresses the independence of any constant factors. Its other meaning will be ignored in this explanation.
The algorithm classes you will encounter in real life are O(n), O(n log n), O(log n), O(n²), O(n³), ..., O(nx), O(x n). These are, in order of appearance, logarithmic, linear, linear-logarithmic, quadratic, cubic, polynomial, and exponential.
The speed of logarithmic algorithms decreases slower than the increase of input size, while the speed of linear ones decreases proportionally to that. The speed of linear-logarithmic algorithms decreases a little slower than pure linear ones. These are the good guys we'd like to use all the time. The other ones are bad and dangerous dudes, which you should omit if you can. The most innocent-looking is the quadratic (that is O(n²)) algorithm—it can be written so easily and doesn't look very dangerous:
for(int i = 0; i < count1; ++i)
{
for(j = 0; j < count2; ++j)
{
...
}
}
But then, look at the graphical illustration of their relative running time in the next figure:
On the left side, we see the amount of work the quadratic algorithm is doing; to the right is the work of a linear one. When it is shown in this way, the difference is quite pronounced! So, stay with the good guys: O(n), O(n log n), and O(log n).
So, what's the difference between a performant algorithm and a bad one?
- A performant algorithm will use additional knowledge of the structure of the data; the simple and inferior algorithm will not.
Using the knowledge about the structure of the data is an example of the basic don't do unnecessary work optimization principle. And if we do less work, then we improve the overall performance!
The simplest example is the search for some value in an array of data. The simplest solution would just go over the array and check its elements. The binary search would use the knowledge that the data in the array are sorted and achieve logarithmic performance. For that, the data has to be sorted, but this will only pay out if we are looking at the elements very often. We can discern two things from this: the design of an algorithm is a trade-off and, secondly, it is intimately bound to the data structures we are using.
Now, we see the reasoning behind Linus's warning, so be a good programmer and think about organizing your data structures so that they match the hidden structure of your problem! Then, use the standard algorithms defined for that data structure. Thus, for the next topic, we will take a short look at data structures that we can use in a program, but before that, let me formulate a short warning.