Understanding machine learning
Aren't we taught that computer systems have to be programmed to do certain tasks? They may be a million times faster than we are, but they still have to be programmed: we have to code each and every step, and only then do these systems work and complete a task. Isn't the very notion of machine learning, then, a contradiction?
In the simplest terms, machine learning refers to a method of teaching systems to learn to do certain tasks, such as learning a function. Simple as it sounds, the idea is confusing and difficult to digest. It is confusing because the way computer systems work and the way we learn are two concepts that hardly intersect. It is difficult to digest because learning, though an inherent capability of the human race, is difficult to put into words, let alone teach to a system.
Then what is machine learning? Before we even try to answer this question, we need to understand that at a philosophical level it is something more than just a way to program. Machine learning is a lot of things.
Machine learning can be described in many ways. Continuing from the high-level definition we presented in the previous chapter, let us go through the definition given by Tom Mitchell in 1997:
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Note
Quick note about Prof. Tom Mitchell
Born in 1951, Tom Mitchell is an American computer scientist and professor at Carnegie Mellon University (CMU), where he has also served as chair of the Machine Learning Department. He is well known for his contributions to machine learning, artificial intelligence, and cognitive neuroscience, and he is a member of institutions such as the Association for the Advancement of Artificial Intelligence.
Now let us try to make sense of this concise yet powerful definition with the help of an example. Let us say we want to build a system that predicts the weather. Here, the task (T) of the system is to predict the weather for a certain place. To perform this task, the system needs to rely on weather information from the past; we shall call this its experience (E). Its performance (P) is measured by how well it predicts the weather on any given day. Thus, we can say that a system has successfully learned how to predict the weather (task T) if it gets better at predicting it (improves its performance P) by utilizing past information (experience E).
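To make the three terms concrete, here is a minimal sketch in Python. It is not a real weather model. It assumes, purely for illustration, that daily temperatures at one place scatter randomly around a fixed average, and that the "learner" simply forecasts the mean of the readings it has seen. All names and numbers (`TRUE_MEAN_TEMP`, `DAILY_SPREAD`, `learn`, `performance`, `average_error`) are hypothetical choices made for this sketch.

```python
import random
import statistics

# Hypothetical climate for one place (assumed values, for illustration only).
TRUE_MEAN_TEMP = 20.0   # long-run average temperature in degrees Celsius
DAILY_SPREAD = 5.0      # day-to-day standard deviation


def observe_days(rng, n_days):
    """Simulate n_days of temperature readings for the place."""
    return [rng.gauss(TRUE_MEAN_TEMP, DAILY_SPREAD) for _ in range(n_days)]


def learn(experience):
    """Task T: produce a temperature forecast from past readings (experience E)."""
    return statistics.fmean(experience)


def performance(forecast, actual_temps):
    """Performance P: mean absolute forecast error (lower is better)."""
    return statistics.fmean([abs(forecast - t) for t in actual_temps])


def average_error(n_past_days, trials=300, n_future_days=50, seed=0):
    """Average P over many simulated histories so that one lucky or
    unlucky history does not dominate the result."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        forecast = learn(observe_days(rng, n_past_days))
        future = observe_days(rng, n_future_days)
        errors.append(performance(forecast, future))
    return statistics.fmean(errors)


# Measure P for increasing amounts of experience E.
errors_by_experience = {n: average_error(n) for n in (1, 10, 100, 1000)}
for n, err in errors_by_experience.items():
    print(f"E = {n:4d} past days -> P (mean abs. error) = {err:.2f}")
```

Under these assumptions, the average error should drop noticeably as the experience grows from a single day to many days, and then level off, because the day-to-day randomness in the weather itself cannot be learned away. In Mitchell's terms, the program learns because P improves with E.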
As seen in the preceding example, this definition not only helps us understand machine learning from an engineering point of view, it also gives us the tools to quantify the terms. It makes clear that learning a particular task involves understanding and processing data in the form of experience. It also states that if a computer program learns, its performance improves with experience, much as ours does when we learn.