Python Machine Learning Cookbook(Second Edition)
上QQ阅读APP看书,第一时间看更新

Getting ready

Standardization or mean removal is a technique that simply centers data by removing the average value of each characteristic, and then scales it by dividing non-constant characteristics by their standard deviation. It's usually beneficial to remove the mean from each feature so that it's centered on zero. This helps us remove bias from features. The formula used to achieve this is the following:

Standardization results in the rescaling of features, which in turn represents the properties of a standard normal distribution:

  • mean = 0
  • sd = 1

In this formula, mean is the mean and sd is the standard deviation from the mean.