Learn Python for Data Science | Learn Python

11/19 Dealing with outliers

Great! We've identified an outlier in our dataset – now, what should we do about it?

That's another difficult question, because there is no universal answer. An outlier may be a mistake in the data. If that's the case, try to fix the erroneous value. If you can't fix it, it's best to remove it so that it doesn't distort your analysis. An outlier can also be a correct, albeit atypical, value. In this case, your approach depends on what you intend to do with the data. The rule of thumb is this: if you don't really need a value in place of the outlier, just delete the row. Once the outlier is gone, it can no longer distort your analysis results.
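To make the two options concrete, here is a minimal sketch on made-up data. The readings DataFrame, its values, and the assumption that 280.0 is a typo for 28.0 are all hypothetical and not part of the course dataset.

```python
import pandas as pd

# Hypothetical data: the 280.0 reading is assumed to be a typo for 28.0.
readings = pd.DataFrame(
    {
        "year": [1998, 1999, 2000, 2001],
        "temperature": [27.5, 28.1, 280.0, 27.9],
    }
)

# Option 1: the correct value is known, so fix it in a copy of the data.
fixed = readings.copy()
fixed.loc[2, "temperature"] = 28.0

# Option 2: the correct value is unknown, so drop the whole row instead.
removed = readings.drop(2)

# Compare how the mean changes with each approach.
print(readings["temperature"].mean())  # distorted by the outlier
print(fixed["temperature"].mean())
print(removed["temperature"].mean())
```

Either way, the mean temperature is no longer dominated by a single extreme reading.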

A quick recap: to remove the row with index 5 from the cars DataFrame, just use:

cars = cars.drop(5)
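The sketch below shows what that call does on a small, hypothetical stand-in for the cars DataFrame (the column names and values are invented for illustration). Note that drop returns a new DataFrame, which is why the result is assigned back to cars, and that it removes the row by index label without renumbering the remaining rows.

```python
import pandas as pd

# Hypothetical stand-in for the course's cars DataFrame.
cars = pd.DataFrame(
    {
        "brand": ["A", "B", "C", "D", "E", "F", "G"],
        "price": [10500, 9800, 11200, 10100, 9900, 250000, 10700],
    }
)

rows_before = len(cars)

# drop() returns a new DataFrame, so reassign to keep the result.
cars = cars.drop(5)

print(rows_before, len(cars))
print(list(cars.index))  # the label 5 is gone; other labels are unchanged
```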

The row for the year 2000, with a temperature value of 280.0, has index 10. Delete this row from the temperatures DataFrame and store the result in temperatures.

Use:

temperatures = temperatures.drop(index)

Replace "index" with the row index given in the exercise instruction.


Introduction to Python for Data Science
