Excellent! The example with employee data was nice and easy because both datasets used the same name for the join column, and every row had a match. As you may guess, reality is sometimes different.
We will now work with a slightly more complicated example. There will be two datasets, each containing the number of items sold in a single store over several years. We will try to compare the sales figures of the two stores year by year. The problem is twofold:
- Dataset 1 uses the column year, but Dataset 2 stores the same information in a column with a different name.
- Dataset 1 contains data for the years 2013-2018, whereas Dataset 2 contains data for the years 2010-2015.
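Both problems can be illustrated with a short sketch. It uses Python with pandas, which is an assumption about the tooling rather than something this lesson prescribes. The column names `sales_year` and `items_sold` and all sales numbers are made up for illustration; only the column name year and the year ranges come from the description above.

```python
import pandas as pd

# Hypothetical data: only the column name "year" and the year ranges
# (2013-2018 and 2010-2015) come from the text; everything else is invented.
store1 = pd.DataFrame({
    "year": list(range(2013, 2019)),
    "items_sold": [10, 20, 30, 40, 50, 60],
})
store2 = pd.DataFrame({
    "sales_year": list(range(2010, 2016)),  # hypothetical name for the year column
    "items_sold": [5, 15, 25, 35, 45, 55],
})

# Problem 1: the join columns have different names.
# Renaming one of them lets us join on a single shared column.
store2 = store2.rename(columns={"sales_year": "year"})

# Problem 2: the year ranges only partly overlap.
# An inner join keeps only the years present in both datasets.
inner = store1.merge(store2, on="year", how="inner",
                     suffixes=("_store1", "_store2"))

# An outer join keeps every year and leaves missing values (NaN)
# where one of the stores has no data.
outer = (
    store1.merge(store2, on="year", how="outer",
                 suffixes=("_store1", "_store2"))
    .sort_values("year")
    .reset_index(drop=True)
)

print(inner)
print(outer)
```

With these ranges, the inner join covers only 2013-2015, while the outer join covers 2010-2018 with gaps for whichever store has no figure in a given year. Which one is appropriate depends on whether the years without a counterpart should stay visible in the comparison.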