Learn Python for Data Science | Learn Python

9/24 Statistical functions – exercise

Excellent! Naturally, pandas offers many more statistical functions than those described here. If you need anything else, you can browse the pandas documentation on your own.

Alright! Let's do one more exercise on basic statistical functions to strengthen your knowledge.

Count the number of movies with an atypical rating. A typical rating lies in the interval from mean - standard deviation to mean + standard deviation. An atypical rating is one outside this interval: strictly smaller than the lower limit or strictly greater than the upper limit. The answer should be presented as a number.

First, calculate the mean() and the std() of the rating column. Assign them to variables to keep the code clean:

movies_mean = movies['rating'].mean()
movies_std = movies['rating'].std()
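
One detail worth knowing here: by default, pandas' std() computes the sample standard deviation (it divides by n - 1, i.e. ddof=1), not the population standard deviation. A minimal sketch, assuming a small made-up list of ratings (the values are hypothetical and not taken from the course dataset), shows the difference:

```python
import pandas as pd

# Hypothetical ratings, for illustration only.
ratings = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

ratings_mean = ratings.mean()            # 5.0
sample_std = ratings.std()               # default ddof=1, about 2.138
population_std = ratings.std(ddof=0)     # divides by n, 2.0

print(ratings_mean, round(sample_std, 3), population_std)
```

The exercise uses the default, so the width of the "typical" interval is based on the sample standard deviation.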

Then build two Boolean conditions that flag the movies whose rating is smaller than movies_mean - movies_std or greater than movies_mean + movies_std:

greater_rating = movies['rating'] > (movies_mean + movies_std)
smaller_rating = movies['rating'] < (movies_mean - movies_std)
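
Each comparison produces a Boolean Series with one True/False value per row. The sketch below, which assumes a small made-up movies table (the titles and ratings are hypothetical), shows what such masks look like and why they are combined with the element-wise | operator rather than Python's or keyword:

```python
import pandas as pd

# Hypothetical data, for illustration only.
movies = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'rating': [2.0, 6.0, 7.0, 9.0],
})

movies_mean = movies['rating'].mean()
movies_std = movies['rating'].std()

greater_rating = movies['rating'] > (movies_mean + movies_std)
smaller_rating = movies['rating'] < (movies_mean - movies_std)

# Element-wise OR: True where either condition holds for that row.
atypical = greater_rating | smaller_rating
print(atypical.tolist())

# Python's `or` asks for a single truth value of a whole Series,
# which pandas refuses to guess, so it raises ValueError.
try:
    greater_rating or smaller_rating
    or_worked = True
except ValueError:
    or_worked = False
print(or_worked)
```

The parentheses around movies_mean + movies_std are not strictly required, since arithmetic binds tighter than comparison, but they make the intent easier to read.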

Use the loc[] indexer to select the rating column, filter it with the two conditions combined by | (element-wise OR), and call count() to count the remaining rows:

movies.loc[greater_rating | smaller_rating, 'rating'].count()
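
To see the whole solution run end to end, here is a minimal sketch on a made-up table. The titles and ratings are hypothetical and stand in for the course dataset, which is not shown here:

```python
import pandas as pd

# Hypothetical data, for illustration only.
movies = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D', 'E', 'F'],
    'rating': [3.0, 6.5, 7.0, 7.5, 8.0, 10.0],
})

movies_mean = movies['rating'].mean()   # 7.0
movies_std = movies['rating'].std()     # about 2.30 (sample standard deviation)

greater_rating = movies['rating'] > (movies_mean + movies_std)
smaller_rating = movies['rating'] < (movies_mean - movies_std)

# Keep only the atypical rows of the rating column and count them.
atypical_count = movies.loc[greater_rating | smaller_rating, 'rating'].count()

# Summing the combined mask counts its True values and gives the same number.
mask_sum = (greater_rating | smaller_rating).sum()

print(atypical_count, mask_sum)  # prints: 2 2
```

With these values the typical interval is roughly 4.70 to 9.30, so only the ratings 3.0 and 10.0 fall outside it.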
