How it works...
The execution times of the different implementations of the colorReduce function from this chapter are reported here. The absolute runtime numbers will differ from one machine to another (here, we used a 2.83 GHz machine equipped with a 64-bit Intel Core 2 Quad Q9550 processor), so it is more informative to look at their relative differences. These results also depend on the specific compiler that is used to produce the executable file. Our tests report the average time to reduce the colors of an image that has a resolution of 4,288 x 2,848 pixels.
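An average runtime of this kind can be collected with a small timing helper. The following is only a minimal sketch based on the standard chrono clock; the averageMillis name and its parameters are illustrative assumptions, not the harness that produced the numbers reported in this section:

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (an assumption of this sketch): returns the average
// wall-clock time, in milliseconds, of `runs` successive calls to `work`.
double averageMillis(const std::function<void()>& work, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int r = 0; r < runs; ++r) {
        work();  // for example, one call to a colorReduce variant
    }
    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return elapsed.count() / runs;
}
```

When timing an in-place function, each run should probably start from a fresh copy of the image so that every call processes the same input.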
First, we compare the three ways of computing the color reduction as presented in the There's more... section of the Scanning an image with pointers recipe. It is interesting to observe that both the integer division formula and the one with bitwise operators take about the same execution time, that is, 31 ms. The version based on the modulo operator, however, takes 52 ms. The slowest version thus takes about two-thirds longer than the fastest! It is, therefore, important to take the time to identify the most efficient way of computing a result in an image loop, as the net impact can be very significant.
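As a reminder of what is being compared, the three formulas can be sketched on a single 8-bit value as follows. The function names are illustrative, and the bitwise variant assumes that the reduction factor is a power of two (div = 2^n):

```cpp
#include <cassert>

using uchar = unsigned char;

// Integer division: drop the remainder, then move to the interval center.
inline uchar reduceByDivision(uchar v, int div) {
    return static_cast<uchar>(v / div * div + div / 2);
}

// Modulo: subtract the remainder explicitly.
inline uchar reduceByModulo(uchar v, int div) {
    return static_cast<uchar>(v - v % div + div / 2);
}

// Bitwise mask: only valid when div == 2^n (assumption of this sketch).
inline uchar reduceByMask(uchar v, int n) {
    assert(n >= 1 && n <= 7);
    const int div = 1 << n;
    const uchar mask = static_cast<uchar>(0xFF << n);  // n = 4 gives 0xF0
    return static_cast<uchar>((v & mask) + (div >> 1));
}
```

For a power-of-two factor, the three functions return the same value for every input from 0 to 255, so any difference in runtime comes only from the cost of the operators involved.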
When the result is written to an output image that has to be allocated, instead of being processed in place, the execution time becomes 33 ms. The extra 2 ms represent the overhead of memory allocation.
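The difference between the two calling styles can be sketched with a plain byte buffer standing in for the image. The function names and the use of std::vector are assumptions made for illustration only; div is assumed to be a power of two:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using uchar = unsigned char;

// In-place variant: no memory is allocated.
void colorReduceInPlace(std::vector<uchar>& image, int div) {
    assert(div > 0);
    for (uchar& v : image) {
        v = static_cast<uchar>(v / div * div + div / 2);
    }
}

// Output variant: the destination is resized first, which allocates
// memory whenever its current buffer is too small.
void colorReduceTo(const std::vector<uchar>& input,
                   std::vector<uchar>& output, int div) {
    assert(div > 0);
    output.resize(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = static_cast<uchar>(input[i] / div * div + div / 2);
    }
}
```

If the same output buffer is reused across calls, the allocation cost is paid only the first time.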
In a loop, you should avoid repeatedly computing values that could be precomputed instead, since each recomputation costs time. For example, take the following inner loop of the color reduction function:
    int nc= image.cols * image.channels();
    uchar div2= div>>1;
    for (int i=0; i<nc; i++) {
Then, replace it with the following one:
    for (int i=0; i<image.cols * image.channels(); i++) {
        // . . .
        *data++ += div>>1;
This second loop recomputes the total number of elements in a line and the div>>1 result at every iteration. It runs in 61 ms, about twice as long as the original version, which took 31 ms. Note, however, that some compilers might be able to optimize these kinds of loops and still produce efficient code.
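To make the comparison concrete, here is a minimal, self-contained sketch of both loops applied to one row of pixels. A std::vector stands in for the image row, the cols and channels parameters mimic image.cols and image.channels(), and the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using uchar = unsigned char;

// Loop invariants are computed once, before the loop.
void reduceRowHoisted(std::vector<uchar>& row, int cols, int channels, int n) {
    assert(n >= 1 && n <= 7);
    assert(row.size() == static_cast<std::size_t>(cols) * channels);
    const int nc = cols * channels;                    // computed once
    const int div = 1 << n;
    const uchar mask = static_cast<uchar>(0xFF << n);
    const uchar div2 = static_cast<uchar>(div >> 1);   // computed once
    uchar* data = row.data();
    for (int i = 0; i < nc; ++i) {
        *data &= mask;
        *data++ += div2;
    }
}

// Same result, but the bound and the half-interval are written inside
// the loop, so they are re-evaluated unless the compiler hoists them.
void reduceRowRecomputed(std::vector<uchar>& row, int cols, int channels, int n) {
    assert(n >= 1 && n <= 7);
    assert(row.size() == static_cast<std::size_t>(cols) * channels);
    const int div = 1 << n;
    const uchar mask = static_cast<uchar>(0xFF << n);
    uchar* data = row.data();
    for (int i = 0; i < cols * channels; ++i) {        // recomputed
        *data &= mask;
        *data++ += div >> 1;                           // recomputed
    }
}
```

Both functions produce identical pixels; only the amount of work written inside the loop differs.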
The version of the color reduction function that uses iterators, as shown in the Scanning an image with iterators recipe, is slower, at 56 ms. The main objective of iterators is to simplify the image-scanning process and make it less prone to errors, not to speed it up.
For completeness, we also implemented a version of the function that uses the at method for pixel access. The main loop of this implementation would then read simply as follows:
    for (int j=0; j<nl; j++) {
        for (int i=0; i<nc; i++) {
            // process each pixel ---------------------
            image.at<cv::Vec3b>(j,i)[0]= image.at<cv::Vec3b>(j,i)[0]/div*div + div/2;
            image.at<cv::Vec3b>(j,i)[1]= image.at<cv::Vec3b>(j,i)[1]/div*div + div/2;
            image.at<cv::Vec3b>(j,i)[2]= image.at<cv::Vec3b>(j,i)[2]/div*div + div/2;
            // end of pixel processing ----------------
        } // end of line
    }
This implementation is much slower, with a runtime of 91 ms. The at method should therefore be used only for random access to image pixels, never for scanning a whole image.
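One plausible explanation for this gap is that every at call recomputes the pixel address from its row and column, whereas a row pointer is obtained only once per line. The sketch below illustrates the two access patterns on a hypothetical TinyImage type; it only mimics the shape of the at and ptr methods and is not the cv::Mat implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using uchar = unsigned char;

// Hypothetical stand-in for a 3-channel 8-bit image (row-major storage).
struct TinyImage {
    int rows, cols;
    std::vector<uchar> pixels;  // rows * cols * 3 bytes

    TinyImage(int r, int c)
        : rows(r), cols(c), pixels(static_cast<std::size_t>(r) * c * 3) {}

    // Random access: the address is recomputed on every call.
    uchar* at(int row, int col) {
        assert(row >= 0 && row < rows && col >= 0 && col < cols);
        return pixels.data() + (static_cast<std::size_t>(row) * cols + col) * 3;
    }

    // Row access: one address computation per line.
    uchar* ptr(int row) {
        assert(row >= 0 && row < rows);
        return pixels.data() + static_cast<std::size_t>(row) * cols * 3;
    }
};

// Six at() calls per pixel: one read and one write for each channel.
void reduceWithAt(TinyImage& image, int div) {
    for (int j = 0; j < image.rows; ++j)
        for (int i = 0; i < image.cols; ++i)
            for (int c = 0; c < 3; ++c)
                image.at(j, i)[c] = static_cast<uchar>(
                    image.at(j, i)[c] / div * div + div / 2);
}

// One ptr() call per line, then plain indexing.
void reduceWithPtr(TinyImage& image, int div) {
    const int nc = image.cols * 3;
    for (int j = 0; j < image.rows; ++j) {
        uchar* data = image.ptr(j);
        for (int i = 0; i < nc; ++i)
            data[i] = static_cast<uchar>(data[i] / div * div + div / 2);
    }
}
```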
A loop with fewer iterations and several statements per iteration generally executes more efficiently than a loop with many iterations and a single statement each, even if the total number of elements processed is the same. Similarly, if you have N different computations to apply to a pixel, apply all of them in one loop rather than writing N successive loops, one for each computation.
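The following sketch contrasts one fused loop with two successive loops. The two per-pixel computations (color reduction followed by inversion) and all function names are illustrative assumptions, and div is assumed to be a power of two:

```cpp
#include <cassert>
#include <vector>

using uchar = unsigned char;

inline uchar reduceColor(uchar v, int div) {
    return static_cast<uchar>(v / div * div + div / 2);
}

inline uchar invertColor(uchar v) {
    return static_cast<uchar>(255 - v);
}

// One pass: both computations are applied while the pixel is at hand.
void processFused(std::vector<uchar>& data, int div) {
    assert(div > 0);
    for (uchar& v : data) {
        v = invertColor(reduceColor(v, div));
    }
}

// Two passes: the whole buffer is traversed once per computation.
void processSeparate(std::vector<uchar>& data, int div) {
    assert(div > 0);
    for (uchar& v : data) v = reduceColor(v, div);
    for (uchar& v : data) v = invertColor(v);
}
```

Both functions yield the same pixels; the fused version simply reads and writes each element once instead of twice.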
We also performed the continuity test, which produces a single loop for continuous images instead of the regular double loop over lines and columns. For a very large image, such as the one we used in our tests, this optimization is not significant (29 ms instead of 31 ms). It is nevertheless good practice to use this strategy, since it can lead to a significant gain in speed in other cases.
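The continuity test can be sketched as follows. PaddedImage is a hypothetical stand-in whose rows may be followed by padding bytes; its rowBytes, step, isContinuous, and ptr members only mimic the corresponding image concepts and are assumptions of this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using uchar = unsigned char;

// Hypothetical stand-in for an 8-bit image whose rows may be padded.
struct PaddedImage {
    int rows;
    int rowBytes;               // useful bytes per row (cols * channels)
    int step;                   // bytes from one row start to the next
    std::vector<uchar> buffer;  // rows * step bytes

    PaddedImage(int r, int bytes, int padding)
        : rows(r), rowBytes(bytes), step(bytes + padding),
          buffer(static_cast<std::size_t>(r) * (bytes + padding)) {}

    bool isContinuous() const { return step == rowBytes; }

    uchar* ptr(int row) {
        return buffer.data() + static_cast<std::size_t>(row) * step;
    }
};

void colorReduceContinuous(PaddedImage& image, int div) {
    assert(div > 0);
    int nl = image.rows;
    int nc = image.rowBytes;
    if (image.isContinuous()) {
        nc = nc * nl;  // no padding: treat the image as one long row
        nl = 1;
    }
    for (int j = 0; j < nl; ++j) {
        uchar* data = image.ptr(j);
        for (int i = 0; i < nc; ++i) {
            data[i] = static_cast<uchar>(data[i] / div * div + div / 2);
        }
    }
}
```

With a padded image, the function falls back to the regular double loop and leaves the padding bytes untouched.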