Practical Convolutional Neural Networks

How do computers interpret images?

Essentially, every image can be represented as a matrix of pixel values. In other words, a grayscale image can be thought of as a function f that maps from R^2 to R.

f(x, y) gives the intensity value at the position (x, y). In practice, for a standard 8-bit image, the values of the function are integers ranging from 0 to 255. Similarly, a color image can be represented as a stack of three functions, one per color channel. We can write this as a vector:

 f(x, y) = [r(x, y), g(x, y), b(x, y)]

Or we can write this as a mapping:

f: R x R --> R^3

So, a color image is also a function, but in this case the value at each (x, y) position is not a single number. Instead, it is a vector of three light intensities, one for each of the three color channels (red, green, and blue). The following code shows the details of an image as it is presented to a computer.
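
The function view described above can also be sketched on a tiny synthetic array before looking at a real image. This is only an illustration under stated assumptions, not the book's own listing: it assumes NumPy is available, the pixel values are made up, the helper name f is hypothetical, and x is taken to be the column index and y the row index.

```python
import numpy as np

# Hypothetical 2x3 grayscale image: one 8-bit intensity (0 to 255) per pixel.
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)


def f(image, x, y):
    """Return the image value at column x, row y.

    Assumption: x is the column and y is the row, so the NumPy
    index order (row first) is the reverse of the argument order.
    """
    return image[y, x]


# Hypothetical color image: three channel functions r, g, b,
# each the same size as the grayscale image, stacked on the last axis.
r = np.full((2, 3), 255, dtype=np.uint8)
g = np.zeros((2, 3), dtype=np.uint8)
b = gray.copy()
color = np.stack([r, g, b], axis=-1)

print(gray.shape)                # (2, 3): rows, columns
print(int(f(gray, 1, 0)))        # 128: a single intensity
print(color.shape)               # (2, 3, 3): rows, columns, channels
print(f(color, 2, 1).tolist())   # [255, 0, 32]: one value per channel
```

The grayscale lookup returns a single number, while the color lookup returns a three-element vector, which mirrors the two mappings R x R --> R and R x R --> R^3.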