If we want to use a time axis on our charts, we have to make sure that our time variable is stored in a date-time format. Otherwise, the intervals between points in time may not be measured properly.
Aside from the four main data types that we have already discussed, the R language has many other data types. These are specific to certain kinds of data and offer handy extra features that simplify their usage. Some of these data types were designed just for the date-time format. We don't have time to go through all of the R date-time data types; instead, we will focus on learning how to convert the data in the year variable to a date-time format.
We can use the parse_date_time function from the lubridate package to convert to the date-time data type. (The lubridate package has lots of functions that significantly ease your work with dates. Why not check it out later?)
To use this function, we write:
vector <- parse_date_time(vector, orders = "Y")
where the vector within the parse_date_time function is the vector containing the values (string or numeric) to be converted to dates. The orders argument declares the format these values currently have (e.g. "Y" for a four-digit year such as 2012). The function then converts the values to a standard date-time format (e.g. 2012-01-01) and assigns the results to the vector you choose.
We want to convert the year variable. Currently, it has a four-digit format (e.g. 2000, 2001, etc.), so we must pass "Y" as the orders argument to tell the parse_date_time function how to read it. The function will return dates for the first day of each year (e.g. 2000-01-01).
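As a minimal sketch of this conversion, assuming the lubridate package is installed and using a small made-up year vector in place of the real data:

```r
library(lubridate)

# Hypothetical four-digit years standing in for the real year variable
year <- c(2000, 2001, 2002)

# orders = "Y" tells parse_date_time to read each value as a four-digit year
year_parsed <- parse_date_time(year, orders = "Y")

# The result is a date-time (POSIXct) vector rather than plain numbers
class(year_parsed)

# Each year is now represented by the first day of that year
format(year_parsed, "%Y-%m-%d")
```

By default, parse_date_time returns its results in the UTC time zone, so each value here corresponds to midnight on January 1 of the given year.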
In this case, we're not just changing formats; we're actually storing additional information about dates that ggplot will use to plot the time axis more accurately. For example, leap years are now taken into account: the distance between 2000 (a leap year with 366 days) and 2001 will be slightly greater on the time axis than the distance between 2001 and 2002.
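To see the leap-year effect directly, we can measure the gaps between consecutive converted years. This is only an illustrative sketch with made-up values, assuming lubridate is installed; the names dates and gaps are hypothetical:

```r
library(lubridate)

# Hypothetical years: 2000 is a leap year, 2001 is not
dates <- parse_date_time(c(2000, 2001, 2002), orders = "Y")

# Gaps between consecutive dates, expressed in days
gaps <- as.numeric(diff(dates), units = "days")
gaps  # 366 365
```

Had the years stayed plain numbers, both gaps would simply be 1, and the extra day in 2000 would be lost.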
You can learn more about this function in the documentation for parse_date_time.