
Instruction

Now we need to prepare some more complicated data: for each of the four wealth categories, we have to calculate the percentage of each level of alcohol consumption. To do that, we first count how often each combination of the two variables occurs in the dataset. We use the count() function again, but this time we pass two variable names after the dataset, one for each variable:

tab <- count(dataset, variable_x, variable_y)

where variable_x and variable_y are the variables we want to analyze.
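
As a minimal sketch of what this call returns, here is count() applied to a small made-up data frame. The names toy, wealth, drinks and toy_tab are hypothetical and are not part of the course dataset; the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, not the course dataset
toy <- data.frame(
  wealth = c("poor", "poor", "poor", "rich", "rich"),
  drinks = c("low", "low", "high", "low", "high")
)

# One row per combination of wealth and drinks that occurs in the data;
# the number of occurrences is stored in a new column called n
toy_tab <- count(toy, wealth, drinks)
print(toy_tab)
```

The result has one row for each combination that actually appears in the data, and the values in n add up to the number of rows in the original data frame.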

Then we do something very similar to what we did for the first variable: we calculate a percentage for each combination. Again, we use the mutate() function. However, this time we have two variables, so we must decide which one is the main variable and which is the grouping variable. Earlier, we put wealth category on the x-axis; this makes it the grouping variable, because the percentages of the consumption levels should be calculated within each wealth category rather than across the whole dataset.

To set a variable as a grouping variable, we use the group_by function:

tab <- group_by(tab, variable_x)

Then we use the mutate() function to calculate percentages within each group defined by variable_x:

tab <- mutate(tab, percent = n / sum(n) * 100)
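
To see what grouping changes, the following sketch runs the same mutate() call with and without group_by() on a small table of made-up counts. The names toy_tab, wealth, drinks, ungrouped and grouped are hypothetical and only serve as illustration; the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical counts, shaped like the output of count() with two variables
toy_tab <- data.frame(
  wealth = c("poor", "poor", "rich", "rich"),
  drinks = c("high", "low", "high", "low"),
  n      = c(1, 3, 2, 2)
)

# Without grouping, sum(n) is the total over all rows (8 here),
# so the percentages add up to 100 across the whole table
ungrouped <- mutate(toy_tab, percent = n / sum(n) * 100)

# With grouping, sum(n) is computed separately for each wealth level,
# so the percentages add up to 100 within each wealth level
grouped <- group_by(toy_tab, wealth)
grouped <- mutate(grouped, percent = n / sum(n) * 100)
print(grouped)
```

In the grouped version, the "poor" rows get 25 and 75 and the "rich" rows get 50 and 50, whereas the ungrouped version gives 12.5, 37.5, 25 and 25.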

Exercise

Calculate the percentages of the levels of alcohol consumption for each wealth category. Use the three commands above, one after the other. For variable_x, use wealth_index_cat; for variable_y, use consumption_cat. For dataset, use alcohol_wealth2.

When you're done, press the Run and Check Code button.

Stuck? Here's a hint!

You should write:

tab <- count(alcohol_wealth2, wealth_index_cat, consumption_cat)
tab <- group_by(tab, wealth_index_cat)
tab <- mutate(tab, percent = n / sum(n) * 100)