10. Grouping rows

Instruction

Good job! Computing statistics over entire columns is a good start, but we often want to compare groups of rows. For instance, we may want to know which country produces the movies that generate the most revenue, or which director has the best average movie rating.

To answer such questions, we need to group rows by a specific column:

players_by_height = players.groupby('height')

The groupby() method returns a special object of type DataFrameGroupBy, which we store in a variable named players_by_height. If you try to show the contents of players_by_height, you will see something like:

pandas.core.groupby.DataFrameGroupBy object at 0x00000...
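
To see this for yourself, here is a minimal sketch. It assumes pandas is installed and uses a small hand-built players table (a hypothetical stand-in for the course's real data, with a few rows from this lesson):

```python
import pandas as pd

# Hypothetical miniature players table; the course loads the real one elsewhere
players = pd.DataFrame({
    "name": ["David Goffin", "Roger Federer", "Rafael Nadal"],
    "height": [180, 185, 185],
})

players_by_height = players.groupby("height")

# The result is a grouping object, not a new DataFrame
group_type = type(players_by_height).__name__
print(group_type)

# ngroups reports how many distinct heights (groups) were found
print(players_by_height.ngroups)
```

Nothing is calculated at this point. The object only records which rows belong to which group.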

Strange as it may look, this means that the groups were created correctly. The table below shows how rows are assigned to groups behind the scenes:

name                   height  ...
David Goffin           180     ...  group with height=180
Roger Federer          185     ...  group with height=185
Rafael Nadal           185     ...
Dominic Thiem          185     ...
Grigor Dimitrov        191     ...  group with height=191
Jack Sock              191     ...
Marin Cilic            198     ...  group with height=198
Alexander Zverev       198     ...
Juan Martin del Potro  198     ...
Kevin Anderson         203     ...  group with height=203
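
The grouping shown in the table can be checked with a short sketch. It rebuilds only the name and height columns by hand (the remaining columns are omitted, so this table is an assumption rather than the course's actual dataset) and then inspects the groups:

```python
import pandas as pd

# Hand-built copy of the name and height columns from the table above
players = pd.DataFrame({
    "name": [
        "David Goffin", "Roger Federer", "Rafael Nadal", "Dominic Thiem",
        "Grigor Dimitrov", "Jack Sock", "Marin Cilic", "Alexander Zverev",
        "Juan Martin del Potro", "Kevin Anderson",
    ],
    "height": [180, 185, 185, 185, 191, 191, 198, 198, 198, 203],
})

players_by_height = players.groupby("height")

# Number of rows in each group, indexed by height
group_sizes = players_by_height.size()
print(group_sizes)

# Rows that belong to one particular group
height_198 = players_by_height.get_group(198)
print(height_198["name"].tolist())
```

Here size() counts the rows in each group and get_group() returns the rows of a single group as a regular DataFrame.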

Exercise

Create a new variable named movies_by_country that will store movies grouped by the country column.

Stuck? Here's a hint!

Use

groupby('country')