18. Collapsing levels to "Other", part 1

Instruction

It's pretty common to have lots of categories with very few observations in them. Often, you'll see these all lumped together in one Other category.

We can automatically put all low-frequency categories into an Other category with the fct_lump() function.

fct_lump(survey$primary_language, n = 5)

The first argument is the factor whose levels we want to lump. The n argument is the number of categories we want to keep, counted from the most frequent. In the preceding example, we keep the five most common languages and lump the rest into a single Other category.
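To see what n does, here is a minimal sketch on made-up data. The languages vector is hypothetical and stands in for a column like survey$primary_language; it is not the course's data.

```r
library(forcats)

# Hypothetical data: six languages with different frequencies
languages <- factor(c(
  rep("English", 6), rep("Spanish", 5), rep("French", 4),
  rep("German", 3), rep("Polish", 2), "Czech"
))

# Keep the three most common levels; everything else becomes "Other"
lumped <- fct_lump(languages, n = 3)

# The kept levels stay in their original order; "Other" is placed last
levels(lumped)

# German, Polish, and Czech are now counted together under "Other"
fct_count(lumped)
```

In forcats 0.5.0 and later, fct_lump() is marked as superseded; fct_lump_n() covers the same "keep the n most common" case if you prefer the newer name.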

Exercise

Use fct_lump() and fct_count() on the employment_status column in survey to find the five most common employment situations.

Stuck? Here's a hint!

Type:

fct_count(fct_lump(survey$employment_status, n = 5))
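The nested call reads from the inside out: fct_lump() lumps the levels first, then fct_count() counts what is left. The sketch below shows the same pattern on a made-up status vector (hypothetical, not the survey data) and adds sort = TRUE, an optional fct_count() argument that lists the most common levels first.

```r
library(forcats)

# Hypothetical stand-in for survey$employment_status
status <- factor(c(
  rep("Employed full-time", 6), rep("Student", 5), rep("Freelancer", 3),
  rep("Employed part-time", 2), "Retired", "Unemployed"
))

# Inner call lumps all but the three most common levels into "Other";
# outer call counts each remaining level, sorted by frequency
counts <- fct_count(fct_lump(status, n = 3), sort = TRUE)
counts
```

Without sort = TRUE, the rows follow the order of the factor's levels rather than their frequencies.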