Learn Python for Data Science | Learn Python

10/24 Grouping rows

Good job! Calculating statistics for entire columns is a good start, but we often want to compare groups of rows. For instance, we may want to know which country produces the movies that generate the most revenue, or which director has the best average movie rating.

To answer such questions, we need to group rows by a specific column:

players_by_height = players.groupby('height')

The groupby() method returns a special type of object, a DataFrameGroupBy, which we store in a variable named players_by_height. If you try to display the contents of players_by_height, you will see something like:

pandas.core.groupby.DataFrameGroupBy object at 0x00000...

Strange as it may look, this means that the groups were created correctly. The table below shows what happens behind the scenes and how the groups are created:

name	height	...
David Goffin	180	...	→	group with height=180
Roger Federer	185	...	→	group with height=185
Rafael Nadal	185	...
Dominic Thiem	185	...
Grigor Dimitrov	191	...	→	group with height=191
Jack Sock	191	...
Marin Cilic	198	...	→	group with height=198
Alexander Zverev	198	...
Juan Martin del Potro	198	...
Kevin Anderson	203	...	→	group with height=203
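
The grouping shown in the table can be checked with a small sketch. It assumes a players DataFrame containing only the name and height columns, rebuilt by hand from the rows above; the course's real players data presumably has more columns. The variable names n_groups, group_sizes, and group_185 are introduced here purely for illustration.

```python
import pandas as pd

# Sample rows rebuilt from the table above (assumption: the course's
# real `players` DataFrame has additional columns not shown here).
players = pd.DataFrame({
    "name": [
        "David Goffin", "Roger Federer", "Rafael Nadal", "Dominic Thiem",
        "Grigor Dimitrov", "Jack Sock", "Marin Cilic", "Alexander Zverev",
        "Juan Martin del Potro", "Kevin Anderson",
    ],
    "height": [180, 185, 185, 185, 191, 191, 198, 198, 198, 203],
})

# Group the rows by the values in the height column
players_by_height = players.groupby("height")

# Number of distinct groups (one per unique height)
n_groups = players_by_height.ngroups

# Number of rows in each group, indexed by height
group_sizes = players_by_height.size()

# All rows that belong to one particular group
group_185 = players_by_height.get_group(185)

print(n_groups)
print(group_sizes)
print(group_185)
```

With this sample, ngroups is 5, and size() reports 1, 3, 2, 3, and 1 rows for heights 180, 185, 191, 198, and 203, matching the arrows in the table.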

Create a new variable named movies_by_country that will store movies grouped by the country column.

Hint: use groupby('country').
