If we want to use a time axis on our charts, we have to make sure that our time variable is stored in a date-time format. Otherwise, the distances between points in time may not be measured correctly.
Aside from the four main data types that we have already discussed, the R language has many other data types. These are specific to certain kinds of data and offer handy extra features that simplify their usage. Some of these data types were designed just for the date-time format. We don't have time to go through all of the R date-time data types; instead, we will focus on learning how to convert the data in year to a date-time format.
We can use the parse_date_time function from the lubridate package to convert to the date-time data type. (The lubridate package has lots of functions that significantly ease your work with dates. Why not check it out later?)
To use this function, we write:
vector <- parse_date_time(vector, orders = "Y")
where the vector within the parse_date_time function is the vector containing the values (string or numeric) to be converted to dates. The orders argument declares the format these values currently have (e.g. "Y" for a four-digit year such as 2012). The function then converts the values to a standard date-time format (e.g. 2012-01-01), and the result is assigned to the vector named on the left-hand side of <-.
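As a minimal sketch of the call described above, the following converts a character vector and a numeric vector of years. The vector names and values are made up for illustration, and the example assumes the lubridate package is installed.

```r
library(lubridate)

# Made-up example values: the same years stored as strings and as numbers
years_chr <- c("2010", "2011", "2012")
years_num <- c(2010, 2011, 2012)

# orders = "Y" says each value is a four-digit year
dates_chr <- parse_date_time(years_chr, orders = "Y")
dates_num <- parse_date_time(years_num, orders = "Y")

print(dates_chr)         # first day of each year, in UTC by default
print(class(dates_chr))  # a date-time class (POSIXct)
```

Both inputs should give the same date-time values, since the function accepts either strings or numbers.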
We want to convert the year variable. Currently, it has a four-digit format (e.g. 2000, 2001, etc.), so we must pass "Y" as the orders argument to tell the parse_date_time function how to read it. The function will return the date of the first day of each year (e.g. 2000-01-01).
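Applied to a column of a data frame, the conversion might look like the sketch below. The data frame name df, the value column, and all the numbers are hypothetical stand-ins, not the data set used in this lesson.

```r
library(lubridate)

# Hypothetical data frame; `df`, `value`, and the numbers are made up
df <- data.frame(
  year  = c(2000, 2001, 2002),
  value = c(10, 12, 11)
)

# Overwrite the four-digit years with date-time values
df$year <- parse_date_time(df$year, orders = "Y")

# Inspect the converted column
str(df$year)
```

Overwriting the column in place keeps the rest of the data frame unchanged, so any later plotting code can keep referring to year.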
In this case, we're not just changing formats; we're actually storing additional information about dates that ggplot will use to plot the time axis more accurately. For example, leap years are now accounted for: the distance between 2000 (a leap year containing 366 days) and 2001 will be slightly greater on the time axis than the distance between 2001 and 2002.
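The leap-year effect can be checked directly, without drawing a chart. The sketch below, which again assumes lubridate is installed and uses made-up vector names, measures the gaps between consecutive converted years in days.

```r
library(lubridate)

# First day of 2000, 2001, and 2002 as date-time values
dates <- parse_date_time(c("2000", "2001", "2002"), orders = "Y")

# Gaps between consecutive dates, expressed in days
gaps <- diff(dates)
units(gaps) <- "days"

print(gaps)  # 366 days (2000 is a leap year), then 365 days
```

Had the years been left as plain numbers, both gaps would simply be 1, and this difference would be lost.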
You can learn more about this function in the documentation for