Data Science Projects with Python

Introduction

The first chapter introduced some basic Python and then equipped you with tools for data exploration. Specifically, we loaded the case study dataset, verified its integrity, and performed our first exploratory analysis on it.

In this chapter, we finish our exploration of the data by examining the response variable. After we've concluded that the data is of high quality and makes sense, we will be ready to move forward with the practical concerns of developing machine learning models. We will take our first steps with scikit-learn, one of the most popular machine learning packages available in the Python language. Before learning the details of how mathematical models work in the next chapter, here we'll start to get comfortable with the syntax for using them in scikit-learn.
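To give a first feel for that syntax, here is a minimal sketch of the pattern most scikit-learn models follow: create a model object, fit it to data, then ask it for predictions. The toy arrays `X`, `y`, and `X_new` are made up for illustration and are not the case study data, and logistic regression is chosen only as a convenient example of a classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data (not the case study dataset):
# one feature per sample, and a binary label for each sample.
X = np.array([[0], [1], [2], [3], [4], [5]])
y = np.array([0, 0, 0, 1, 1, 1])

# Step 1: instantiate the model object with its default options.
model = LogisticRegression()

# Step 2: fit the model to the features and labels.
model.fit(X, y)

# Step 3: predict labels for new samples the model has not seen.
X_new = np.array([[0], [5]])
predictions = model.predict(X_new)
print(predictions)
```

The same instantiate, fit, and predict steps apply to most other scikit-learn estimators, which is why getting used to the syntax early pays off.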

We will also learn some common techniques for answering the question, "Is this model good or not?" There are many possible ways to approach model evaluation. For business applications, some kind of financial analysis to determine the value that the model could create is usually necessary. However, we will reserve this for the end of the book.

There are several important model evaluation criteria that are considered standard knowledge in data science and machine learning. We will cover a few of the most widely used classification model performance metrics here, to give you a strong foundation.
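As a rough preview, the sketch below computes a few common classification metrics with scikit-learn. The text above does not name specific metrics at this point, so accuracy, precision, recall, and the confusion matrix are assumed here as typical examples, and the label lists `y_true` and `y_pred` are invented for illustration.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical true labels and model predictions for eight samples.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0]

# Fraction of all samples that were classified correctly.
accuracy = accuracy_score(y_true, y_pred)

# Of the samples predicted positive, the fraction that are truly positive.
precision = precision_score(y_true, y_pred)

# Of the truly positive samples, the fraction that were predicted positive.
recall = recall_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

print(accuracy, precision, recall)
print(cm)
```

Here 6 of the 8 predictions match the true labels, so the accuracy is 0.75. Precision and recall each look at only the positive class, which is why they can tell a different story from accuracy.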