Introduction
Key elements of the visualization process
5. Data
Environment - the R Language

Instruction

Without data, there is nothing to visualize. Data is the most important part of any chart. In this course, you will not learn how to make beautiful but useless charts. We're not looking to decorate reports; we want to create charts that present data effectively. The presentation should be visually appealing, but the main goal is to illustrate the data for analysis and decision-making.

So, the first step in the data visualization process is a detailed examination of the data. This is a crucial step, because many of the decisions we make depend on the data we have: for example, the type of data determines which type of chart we choose.

In this course, we will examine the data in a "Know your data" section. This section will always come first (aside from a brief introduction in some chapters), and we will use it to explore the data and identify our variables. Correctly identifying variables is essential to creating accurate visualizations.

Data - variables

What are variables? "Variable" is another name for the data going into our visualizations. Specifically, a variable is a "data item": anything that can be measured, counted, or categorized. Examples of variables include age, country, income level, and fruit.

What are values? All variables are made up of values. These values are the individual data points in our dataset. For example:

  • Age values include 1 month old, 1 year old, and 100 years old.
  • Country values include Poland, Canada, India, and China.
  • Income level values include poor, middle class, wealthy, and very wealthy.
  • Fruit values include apples, cherries, peaches, and figs.
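
The relationship between variables and values can be sketched in a few lines of code. The example below is only an illustration, written in plain Python: the records, the names `records` and `values_by_variable`, and the way the example values are combined into rows are all assumptions made for this sketch, not a dataset from the course. Each key stands for a variable, and each entry stored under that key is one of its values.

```python
# Hypothetical sample records, built from the example values above.
# Each dictionary key is a variable; each entry under it is one value.
records = [
    {"age": "1 month old", "country": "Poland", "income_level": "poor", "fruit": "apples"},
    {"age": "1 year old", "country": "Canada", "income_level": "middle class", "fruit": "cherries"},
    {"age": "100 years old", "country": "India", "income_level": "wealthy", "fruit": "peaches"},
    {"age": "100 years old", "country": "China", "income_level": "very wealthy", "fruit": "figs"},
]


def values_by_variable(rows):
    """Collect the distinct values of each variable, in first-seen order."""
    result = {}
    for row in rows:
        for variable, value in row.items():
            seen = result.setdefault(variable, [])
            if value not in seen:
                seen.append(value)
    return result


variables = values_by_variable(records)

# Print each variable followed by the distinct values observed for it.
for name, values in variables.items():
    print(f"{name}: {values}")
```

Listing the distinct values of each variable in this way is one simple form of examining the data before choosing a chart. In a tabular tool such as an R data frame, the same idea usually appears as columns (variables) and cell entries (values).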