13. The tapply() function

Instruction

Great! Let's meet one more function in the apply family: tapply() (the name is short for "table apply").

tapply() is different from the other apply functions: it applies a function to groups of elements (that is, to subsets of the data) rather than to each individual element.
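To make the contrast concrete, here is a minimal sketch using made-up vectors. The names values and groups are hypothetical and not part of the course data. sapply() calls the function once per element, while tapply() calls it once per group:

```r
# Hypothetical toy data, not from the course
values <- c(10, 20, 30, 40)
groups <- c("a", "b", "a", "b")

# sapply(): the function is applied to each element separately
per_element <- sapply(values, function(v) v * 2)

# tapply(): the function is applied once to each group of elements
per_group <- tapply(values, groups, sum)

print(per_element)  # one result per element
print(per_group)    # one result per group ("a" and "b")
```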

Let's explain this concept with a data frame named monday_statistics and our sales list. The column named monday in monday_statistics represents the number of bracelets sold per hour. Another column named monday_admin gives us the name of the site administrator who made sure the site was working properly during those hours.

If you want to see the total number of bracelets sold on Monday, grouped by site administrator, you would use tapply() like this:

tapply(
  X = monday_statistics$monday, 
  INDEX = monday_statistics$monday_admin, 
  FUN = sum)

This function takes three arguments:

  • X – the vector (column) we want to analyze (monday_statistics$monday).
  • INDEX – the vector (column) for grouping the data (monday_statistics$monday_admin).
  • FUN – the function that we want to apply to each subgroup (sum).

The code above gives us this result:

 Alex  John Tanya
 1208  1204  1202

For each administrator, tapply() returned the total sales on Monday. As you can see, Alex, John, and Tanya form the three groups. The values in the monday vector are summarized per group: the same function (sum()) was applied to each of the three groups.
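If you'd like to try the call outside the course environment, the sketch below builds a small stand-in data frame. Only the column names monday and monday_admin come from the lesson; the sales numbers are invented for illustration, so the totals will not match the output shown above.

```r
# Hypothetical stand-in for monday_statistics (the values are made up)
monday_statistics <- data.frame(
  monday = c(5, 7, 3, 8, 2, 6),
  monday_admin = c("Alex", "John", "Tanya", "Alex", "John", "Tanya")
)

# Total bracelets sold per administrator
totals <- tapply(
  X = monday_statistics$monday,
  INDEX = monday_statistics$monday_admin,
  FUN = sum
)
print(totals)

# The result is named by group, so one total can be looked up by name
print(totals["Alex"])

# Swapping FUN changes the summary, e.g. the average per administrator
averages <- tapply(
  X = monday_statistics$monday,
  INDEX = monday_statistics$monday_admin,
  FUN = mean
)
print(averages)
```

The arguments can also be passed by position, as in tapply(monday_statistics$monday, monday_statistics$monday_admin, sum), but naming them keeps the roles of X, INDEX, and FUN easy to read.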

Exercise

Get familiar with the monday_statistics data frame. Then click Next exercise to continue.