Modeling
In the modeling stage, you formalize what you discovered during exploration into an explicit description of the steps and data structures needed to extract the desired meaning from your data. This is the model: a combination of data structures and the steps in code that take you from the raw data to your information and conclusions.
The modeling process is iterative. Through exploration of the data, you select the variables required to support your analysis, organize those variables as input to analytical processes, execute the model, and determine how well the model supports your original assumptions. Modeling can include a formal description of the structure of the data, but it can also combine techniques from various analytic domains, including (but not limited to) statistics, machine learning, and operations research.
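One pass through that loop might look like the following minimal sketch. The data, the column names (`ad_spend`, `units_sold`, `store_id`), and the assumption being tested (that sales rise roughly linearly with advertising spend) are all invented for illustration, and the straight-line fit from NumPy stands in for whatever analytical process your problem actually calls for.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data; the column names and values are invented for this sketch.
raw = pd.DataFrame({
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "units_sold": [3.1, 4.9, 7.2, 8.8, 11.0],
    "store_id": ["a", "b", "c", "d", "e"],  # not needed for this analysis
})

# 1. Select the variables that support the analysis.
selected = raw[["ad_spend", "units_sold"]]

# 2. Organize them for input: drop incomplete rows, then split input from target.
clean = selected.dropna()
x = clean["ad_spend"].to_numpy()
y = clean["units_sold"].to_numpy()

# 3. Execute the model: a degree-1 least-squares fit.
slope, intercept = np.polyfit(x, y, 1)

# 4. Check how well the model supports the assumption, using R squared.
predicted = slope * x + intercept
ss_res = ((y - predicted) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

If the fit turned out to be poor, you would return to step 1 with different variables or a different technique, which is what makes the process iterative.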
To facilitate this, pandas provides extensive data modeling facilities. It is in this step that you move from exploring your data to formalizing the data model in DataFrame objects and ensuring that the processes that create these models are succinct. Additionally, because pandas is based in Python, you can use the full power of the language to write programs that automate the process from beginning to end. The models you create are executable.
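As a rough illustration of an executable model, the sketch below wraps the steps from raw rows to a tidy DataFrame in a single function that can be rerun whenever the data changes. The schema (`customer`, `order_date`, `quantity`, `unit_price`) and the function name `build_model` are hypothetical and chosen only for this example.

```python
import pandas as pd


def build_model(raw: pd.DataFrame) -> pd.DataFrame:
    """Turn raw order rows into a per-customer summary (hypothetical schema)."""
    return (
        raw
        .assign(
            # Parse date strings into proper timestamps.
            order_date=lambda d: pd.to_datetime(d["order_date"]),
            # Derive the value of each order line.
            total=lambda d: d["quantity"] * d["unit_price"],
        )
        .groupby("customer", as_index=False)
        .agg(
            orders=("total", "size"),
            revenue=("total", "sum"),
            last_order=("order_date", "max"),
        )
    )


# Invented sample data standing in for whatever source you actually load.
raw = pd.DataFrame({
    "customer": ["ann", "bob", "ann"],
    "order_date": ["2024-01-05", "2024-01-07", "2024-02-01"],
    "quantity": [2, 1, 3],
    "unit_price": [10.0, 25.0, 10.0],
})

model = build_model(raw)
print(model)
```

Keeping the transformation in one chained expression keeps the process succinct, and keeping it in a function means the same steps can be applied unchanged to the next batch of data.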
From an analytic perspective, pandas provides several capabilities, most notably integrated support for descriptive statistics, which is enough to reach your goal for many types of problems. And because pandas is Python-based, if you need more advanced analytic capabilities, it is easy to integrate with other parts of the extensive Python scientific ecosystem.
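To give a sense of the built-in descriptive statistics, here is a small sketch using `describe()`, `mean()`, and `corr()`. The measurements and column names are made up for the example.

```python
import pandas as pd

# Invented measurements; the column names are illustrative only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "weight_kg": [50.0, 60.0, 70.0, 80.0],
})

# Count, mean, standard deviation, min, quartiles, and max for each column.
summary = df.describe()

# Individual statistics are also available as methods.
mean_height = df["height_cm"].mean()

# Pearson correlation between two columns.
corr = df["height_cm"].corr(df["weight_kg"])

print(summary)
print(mean_height, corr)
```

Because a DataFrame column converts to a NumPy array with `to_numpy()`, the same data can be handed to other scientific Python libraries when these summaries are not enough.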