10. group_by() and summarise()

Instruction

Awesome! Now, let's see why group_by() and summarise() are so great together.

If we wanted to calculate the mean and median populations in groups defined by continent, we'd write:

countries_urban %>%
  group_by(continent) %>%
  summarise(
    mean_pop = mean(population),
    median_pop = median(population))

What do you think this will return? summarise() computes the mean and median populations, and group_by() tells it to do so separately for each continent. The result has one row per continent, with the columns continent, mean_pop, and median_pop. Together, these two functions allow us to summarize information by group. In the real world, we'd use this to compute per-category figures such as subtotals and averages, which are essential in reporting and analysis.
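To make the effect of group_by() concrete, here is a minimal sketch on a made-up table. The name toy_countries and all of its values are hypothetical, invented for illustration only; they are not taken from countries_urban. The sketch runs the same summary twice, once without grouping and once grouped by continent.

```r
library(dplyr)

# Hypothetical toy data (NOT the course's countries_urban table).
toy_countries <- tibble(
  country    = c("A", "B", "C", "D", "E"),
  continent  = c("Europe", "Europe", "Europe", "Asia", "Asia"),
  population = c(10, 20, 60, 100, 300)
)

# Without group_by(): summarise() collapses the whole table into one row.
overall <- toy_countries %>%
  summarise(
    mean_pop   = mean(population),
    median_pop = median(population)
  )

# With group_by(): the same summary is computed once per continent,
# so the result has one row for each continent.
pop_by_continent <- toy_countries %>%
  group_by(continent) %>%
  summarise(
    mean_pop   = mean(population),
    median_pop = median(population)
  )

print(overall)
print(pop_by_continent)
```

In this toy table, the ungrouped call returns a single row, while the grouped call returns two rows, one for Asia and one for Europe. The grouping column is kept in the output alongside the new summary columns.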

Exercise

Show the following statistics for each continent:

  • minimum area as min_area,
  • maximum area as max_area, and
  • mean area as mean_area.

Base those statistics on data from countries_urban.

Stuck? Here's a hint!

Type:

countries_urban %>%
  group_by(continent) %>%
  summarise(
    min_area = min(area),
    max_area = max(area),
    mean_area = mean(area))