11. The inner_join() function


Finally, we'll learn one more SQL-like function: inner_join(). As you might have guessed, this works a lot like INNER JOIN in SQL. There are other joins available in dplyr, but we won't go into them here.

For those not familiar with SQL, inner_join() takes two datasets and combines their rows wherever the values in a specified column match. Rows that have no match in the other dataset are left out of the result. The arguments are the two datasets being joined (like tables in SQL) and by, the name of the column that holds the matching data.
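To make the matching behavior concrete, here is a minimal sketch on two toy data frames. The names gdp and capitals, their columns other than country_name, and all the values are made up for illustration and are not part of the course data.

```r
library(dplyr)

# Toy data: dataset names, column names, and values are invented for this example.
gdp <- data.frame(
  country_name = c("Poland", "Chile", "Kenya"),
  gdp_index = c(3, 2, 1),
  stringsAsFactors = FALSE
)
capitals <- data.frame(
  country_name = c("Chile", "Kenya", "Norway"),
  capital = c("Santiago", "Nairobi", "Oslo"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose country_name appears in both data frames.
joined <- inner_join(gdp, capitals, by = "country_name")
print(joined)
```

Poland appears only in gdp and Norway only in capitals, so neither ends up in joined; only Chile and Kenya have a match on both sides.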

If we have data about countries' GDPs (gross domestic product) in the countries dataset and want to join that data with the corresponding country names, we'd write:

countries <- inner_join(
  countries, country_names,  # country_names: assumed name for the dataset holding the country names
  by = "country_name")


We have data about countries' forested areas in memory. We want to find the percentage of land area covered by forests for each country. Join the forests data (from the forests table) with the countries data, using the country_name column to join the two datasets. Assign the result to countries.

Stuck? Here's a hint!


countries <- inner_join(countries, forests, by = "country_name")
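Once the two datasets are joined, the percentage of forested land could be computed with mutate(). The sketch below uses stand-in data, and the column names land_area and forest_area are assumptions, since the actual columns of countries and forests are not shown here.

```r
library(dplyr)

# Stand-in data: land_area and forest_area are hypothetical column names,
# and the values are invented for this example.
countries <- data.frame(
  country_name = c("A", "B"),
  land_area = c(200, 50),
  stringsAsFactors = FALSE
)
forests <- data.frame(
  country_name = c("A", "B"),
  forest_area = c(50, 10),
  stringsAsFactors = FALSE
)

# Join on country_name, then derive the forested share of land as a percentage.
countries <- inner_join(countries, forests, by = "country_name")
countries <- mutate(countries, forest_pct = forest_area / land_area * 100)
print(countries)
```

The join has to come first, because forest_area and land_area start out in different datasets and mutate() can only combine columns that sit in the same one.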