17. Summary

Instruction

It's time to wrap things up!

In this part, we learned how to deal with the following problems:

1. Missing values - you can use

dataframe.fillna(value)
to replace all NaNs with value.

2. Duplicate rows - you can use

dataframe.drop_duplicates()
to get rid of duplicate rows.

3. Outliers - one way of dealing with outliers is to remove them using

dataframe.drop(index)

4. Joining DataFrames - to join two DataFrames, use:

dataframe_a.merge(dataframe_b, on='column')
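
To see how the four steps fit together, here is a minimal sketch. The orders and customers DataFrames, their column names, and the outlier threshold of 1000 are all made up for illustration; they do not come from the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this sketch.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [20.0, np.nan, np.nan, 35.0, 9000.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "name": ["Ann", "Bob", "Cid", "Dee"],
})

# 1. Missing values: replace every NaN with a chosen value (0 here).
filled = orders.fillna(0)

# 2. Duplicate rows: keep only the first occurrence of each identical row.
deduplicated = filled.drop_duplicates()

# 3. Outliers: collect the index labels of the offending rows, then drop them.
#    The threshold of 1000 is an arbitrary assumption for this toy data.
outlier_index = deduplicated[deduplicated["amount"] > 1000].index
cleaned = deduplicated.drop(outlier_index)

# 4. Joining: combine the two DataFrames on their shared column.
result = cleaned.merge(customers, on="customer_id")
print(result)
```

By default, each of these methods returns a new DataFrame and leaves the original unchanged, so remember to assign the result to a variable.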

Okay, are you ready for a short quiz?

Exercise

Click Next exercise to continue.