Python Natural Language Processing

Transforming data

In this stage, we apply feature engineering techniques that convert the text data into numeric data, so that the machine can process our dataset and try to find patterns in it. This stage is therefore basically a data manipulation stage. In the NLP domain, we can use encoding and vectorization techniques for the transformation stage. Don't get scared by the terminology. We will look at all the data manipulation and feature extraction techniques in Chapter 5, Feature Engineering and NLP Algorithms, and Chapter 6, Advanced Feature Engineering and NLP Algorithms.
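To make the idea of turning text into numbers concrete, here is a minimal sketch of one simple vectorization technique, a bag-of-words count vector, written with only the Python standard library. The function names `build_vocabulary` and `vectorize` and the tiny two-sentence corpus are assumptions made for illustration; they are not taken from this book, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map every distinct lowercase token in the corpus to a column index."""
    tokens = {token for doc in documents for token in doc.lower().split()}
    # Sorting keeps the column order stable from run to run.
    return {token: index for index, token in enumerate(sorted(tokens))}


def vectorize(document, vocabulary):
    """Return a list of token counts, one position per vocabulary entry."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        # Tokens that are not in the vocabulary are ignored.
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector


# Hypothetical toy corpus, used only to illustrate the transformation.
corpus = ["the cat sat", "the dog sat on the mat"]
vocabulary = build_vocabulary(corpus)
vectors = [vectorize(doc, vocabulary) for doc in corpus]

print(vocabulary)
for doc, vector in zip(corpus, vectors):
    print(doc, "->", vector)
```

Each sentence becomes a fixed-length list of integers, which is the kind of numeric input a machine learning algorithm can work with. In practice, you would probably rely on a library implementation rather than hand-rolled code like this.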

All the preceding stages are basic steps for preparing a dataset for any NLP or data science-related application. Now, let's see how you can generate data using web scraping.