Next, we need to determine what data is required to visualize a part of a whole.
The data we'll use for this visualization comes in two forms: raw (single observations) and final (aggregated observations). To answer the above question precisely, we need raw data that provides one of the following:
- One categorical variable for each observation (raw data). We'll calculate frequencies for each category and transform these frequencies into percentages. This will give us the final, aggregated data form.
- One categorical variable and one numerical variable for each observation (raw data). We'll add up the values of the numerical variable in each category and convert these sums into percentages to obtain the aggregated data.
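Both routes from raw to aggregated data can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the survey answers, the sales records, and the helper name `to_percentages` are made up for the example and do not come from a real dataset.

```python
from collections import Counter, defaultdict

def to_percentages(totals):
    """Convert per-category totals into percentages of the grand total."""
    grand_total = sum(totals.values())
    return {category: 100 * value / grand_total for category, value in totals.items()}

# Case 1: one categorical variable per observation (hypothetical survey answers).
answers = ["Agree", "Agree", "Disagree", "Neutral",
           "Agree", "Disagree", "Agree", "Neutral"]
frequencies = Counter(answers)               # count observations per category
answer_shares = to_percentages(frequencies)  # frequencies -> percentages

# Case 2: one categorical and one numerical variable per observation
# (hypothetical sales records: region, amount).
sales = [("North", 120.0), ("South", 80.0), ("North", 60.0), ("East", 140.0)]
totals = defaultdict(float)
for region, amount in sales:
    totals[region] += amount                 # add up the values within each category
region_shares = to_percentages(totals)       # sums -> percentages

print(answer_shares)
print(region_shares)
```

The only difference between the two cases is how the per-category totals are obtained: by counting observations in the first case, and by summing a numerical variable in the second. The conversion to percentages is identical.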
In either case, the final aggregated data consists of two columns:
- One with all the non-overlapping categories (e.g., the Likert answers Agree, Disagree, etc.)
- One with the values calculated for each category. These values add up to the whole being described (e.g., the percentages of all responses sum to 100%).
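When we receive aggregated data, it can be worth checking that it really has this two-column shape before plotting it. The sketch below assumes the data arrives as a list of (category, value) pairs in percent. The function name `is_part_of_whole` and the Likert figures are hypothetical. Note that comparing labels only catches duplicated categories, not categories that overlap in meaning, and the non-negativity check is an extra assumption of this sketch.

```python
import math

def is_part_of_whole(rows, whole=100.0):
    """Check that category labels are unique and the values add up to the whole."""
    categories = [category for category, _ in rows]
    values = [value for _, value in rows]
    no_duplicates = len(categories) == len(set(categories))
    non_negative = all(value >= 0 for value in values)  # assumption: shares cannot be negative
    return no_duplicates and non_negative and math.isclose(sum(values), whole)

# Hypothetical aggregated Likert results, in percent.
likert = [
    ("Strongly agree", 18.0),
    ("Agree", 32.0),
    ("Neutral", 25.0),
    ("Disagree", 15.0),
    ("Strongly disagree", 10.0),
]

print(is_part_of_whole(likert))
```

If the values are raw totals rather than percentages, the `whole` argument can be set to the known grand total instead of 100.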
In practice, we usually receive the final, aggregated data to work with, and that's what we will use in this chapter. However, we should always keep in mind that these aggregates were created from raw data.