7. Automatic check for duplicates

Instruction

Good! Duplicate rows are not easy to spot at first glance. Luckily, just as with NaN values, we can check whether a given DataFrame or a specific column contains duplicates. To look for duplicate rows in a DataFrame named cars, we can write:

cars.duplicated().values.any()

To check whether a specific column named vin contains any duplicate values, we can write:

cars['vin'].duplicated().values.any()

Both of these expressions return True if there is at least one duplicate, and False otherwise.

Usually, it makes more sense to check for duplicates in the whole DataFrame rather than in single columns. For instance, in our case, two states could report exactly the same sales figures, and that would be perfectly legitimate data rather than a duplicate entry.
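To see the difference, here is a minimal sketch with hypothetical data: two states share the same sales figure, so the column check reports a duplicate, but no entire row repeats. (The sales DataFrame below is invented for illustration; it is not the exercise dataset.)

```python
import pandas as pd

# Hypothetical data: TX and NY report the same sales figure,
# but no entire row is duplicated.
sales = pd.DataFrame({
    'state': ['CA', 'TX', 'NY'],
    'sales': [100, 250, 250],
})

print(sales.duplicated().values.any())           # False – no full-row duplicates
print(sales['sales'].duplicated().values.any())  # True – the value 250 repeats
```

This is why a repeated value in one column is not, by itself, a sign of bad data: only a repeated whole row usually indicates a true duplicate record.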

Exercise

Check if there are duplicate rows in the states_sales DataFrame.

Stuck? Here's a hint!

Append

.duplicated().values.any()

to the DataFrame name.