15. Merging two DataFrames – different columns, step 2

Instruction

Good job. To join two datasets on columns with different names, we can use the following syntax:

cars_with_owners = cars.merge(owners_2, left_on='owner_id', right_on='person_id')

Instead of on, we now specify left_on and right_on. "Left" refers to cars, the DataFrame on which we call merge(), so left_on names its column (owner_id). "Right" refers to owners_2, the DataFrame passed as an argument inside the parentheses of merge(), so right_on names its column (person_id).
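
To see the syntax above in action, here is a minimal sketch. The sample rows are hypothetical (the lesson's actual cars and owners_2 data is not shown here); only the DataFrame names and the key columns owner_id and person_id come from the instruction.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
cars = pd.DataFrame({
    "model": ["Civic", "Golf", "Focus"],
    "owner_id": [1, 2, 4],
})
owners_2 = pd.DataFrame({
    "person_id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cleo"],
})

# Match cars.owner_id (left) against owners_2.person_id (right).
cars_with_owners = cars.merge(owners_2, left_on="owner_id", right_on="person_id")

# Both key columns are kept in the result, since their names differ.
print(cars_with_owners)
```

In this sketch the car with owner_id 4 has no matching person_id, so it does not appear in the result.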

Exercise

Join store_a and store_b and save the result in a new variable named stores_comparison. For store_a, use the year column. For store_b, use the period column. After that, show the contents of stores_comparison.

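Note that the result now has two items_sold columns, one ending in "_x" (from store_a) and another ending in "_y" (from store_b): both DataFrames have a column with this name, so pandas adds a suffix to tell them apart. Also note that the result only contains data for the years 2013-2015, because merge() by default keeps only the rows whose key values appear in both DataFrames.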
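
The suffix and row-filtering behavior described in the note can be sketched on a separate pair of tables. The names branch_north and branch_south and all values below are hypothetical and are not part of the exercise data. The sketch also uses the suffixes parameter of merge() to replace the default "_x" and "_y" with more readable endings.

```python
import pandas as pd

# Hypothetical tables, made up for illustration only.
branch_north = pd.DataFrame({"year": [2012, 2013, 2014], "items_sold": [10, 20, 30]})
branch_south = pd.DataFrame({"period": [2013, 2014, 2015], "items_sold": [15, 25, 35]})

# Default behavior: overlapping column names get "_x" (left) and "_y" (right),
# and only keys present in both tables survive (an inner join).
default_result = branch_north.merge(branch_south, left_on="year", right_on="period")
print(default_result.columns.tolist())

# Optional: choose custom suffixes instead of "_x" and "_y".
named_result = branch_north.merge(
    branch_south,
    left_on="year",
    right_on="period",
    suffixes=("_north", "_south"),
)
print(named_result)
```

Here 2012 exists only in the left table and 2015 only in the right one, so neither year appears in the merged result.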

Stuck? Here's a hint!

Start with:

stores_comparison = store_a.merge()