Now we have to prepare some more complicated data – for each of the four wealth categories, we have to calculate the percentages of levels of alcohol consumption. To do that, we first count how often each combination of these variables occurs in the dataset, using the count() function, but this time with two arguments, one for each variable:
tab <- count(dataset, variable_x, variable_y)
where variable_x and variable_y are the variables we want to analyze.
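As a rough illustration of what this call returns, here is a minimal sketch assuming the dplyr package is installed. The data frame survey and its columns wealth and alcohol are hypothetical stand-ins, not the dataset used in this tutorial. With two variables, count() gives one row per combination that actually occurs, with the number of occurrences in a column named n:

```r
library(dplyr)

# Hypothetical toy data; names and values are illustrative only
survey <- data.frame(
  wealth  = c("low", "low", "low", "high", "high"),
  alcohol = c("none", "none", "some", "some", "some")
)

# One row per (wealth, alcohol) combination that occurs, plus a count column n
tab <- count(survey, wealth, alcohol)
print(tab)
```

Combinations that never occur in the data (here, high wealth with no alcohol) simply do not appear in the result.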
Then we do something very similar to what we did for the first variable – we calculate a percentage for each combination. Again, we use the mutate() function. However, this time we have two variables, so we must decide which one is the main variable and which is the grouping variable. Earlier, we put wealth category on the x-axis; this makes it the grouping variable, because it determines how the data points are grouped.
To set a variable as a grouping variable, we use the group_by() function:
tab <- group_by(tab, variable_x)
and then we use the mutate() function again to calculate percentages within each group determined by variable_x:
tab <- mutate(tab, percent = n / sum(n) * 100)
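The three steps can be sketched end to end as follows. This is only an illustration under assumptions: it uses dplyr and a small made-up data frame (survey, wealth and alcohol are hypothetical names), and it adds a summarise() check that is not part of the original steps, to confirm that the percentages add up to 100 within each wealth category:

```r
library(dplyr)

# Hypothetical toy data; names and values are illustrative only
survey <- data.frame(
  wealth  = c("low", "low", "low", "low", "high", "high"),
  alcohol = c("none", "none", "none", "some", "none", "some")
)

tab <- count(survey, wealth, alcohol)           # counts per combination
tab <- group_by(tab, wealth)                    # wealth is the grouping variable
tab <- mutate(tab, percent = n / sum(n) * 100)  # sum(n) is computed per wealth group

# Sanity check (an addition for illustration): totals per wealth category
check <- summarise(tab, total = sum(percent))
print(tab)
print(check)
```

Because the table is grouped before mutate() is called, sum(n) is the total for each wealth category rather than for the whole table. Without the group_by() step, the same mutate() call would give each combination's share of the entire dataset instead.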