Good job! Our managers send their production reports in different formats, but we can handle it.
R has a general function, read.table(), that we can use with a number of different file formats. This function has many arguments. One of the most important is sep, which stands for "separator" and tells R which character separates the fields in each line. To read a CSV file, we set sep to ",". To read a tab-delimited file, we set sep to "\t" (the escape sequence for a tab).
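As a minimal sketch of how sep changes what read.table() expects, the snippet below writes the same small table to two temporary files, one comma-separated and one tab-separated, and reads each back. The column names, values, and temporary files are made up for illustration, and header = TRUE is added on the assumption that the first row holds column names.

```r
# Hypothetical sample data, not taken from any real report
sample_data <- data.frame(site = c("north", "south"), units = c(120, 95))

# Write the same data with two different separators
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".tsv")
write.table(sample_data, csv_path, sep = ",", row.names = FALSE, quote = FALSE)
write.table(sample_data, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read each file back, telling read.table() which separator to expect
df_comma <- read.table(csv_path, sep = ",", header = TRUE)
df_tab <- read.table(tsv_path, sep = "\t", header = TRUE)
```

Both calls should produce the same two-column data frame, because only the separator differs between the files.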
Both tab- and comma-separated formats are common, but sometimes we get data separated by another character, such as a semicolon (;). To read this type of file, we'd write:
df_report_thunderrock <- read.table("data/thunderrock_industries.csv", sep = ";")
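One thing to watch for: when every row has the same number of fields, read.table() does not treat the first row as column names unless we pass header = TRUE. Whether the Thunderrock file has a header row isn't stated here, so the sketch below uses a hypothetical semicolon-separated file written to a temporary location to show the difference.

```r
# Hypothetical semicolon-separated report written to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("site;units", "north;120", "south;95"), path)

# Without header = TRUE, the column names are read as the first data row
df_no_header <- read.table(path, sep = ";")

# With header = TRUE, the first row becomes the column names
df_with_header <- read.table(path, sep = ";", header = TRUE)
```

In the first result the columns get the default names V1 and V2 and the header text appears as a data row. In the second, the columns are named site and units.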
Note: The read.csv() function is essentially read.table() with sep = "," and header = TRUE built in (along with a few other adjusted defaults). The CSV format is so common that it has its own reading function in R. Likewise, read.delim() is essentially read.table() with sep set to "\t" (i.e., a tab) and header = TRUE.
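A quick way to check this relationship is to read the same file both ways and compare the results. The sketch below uses hypothetical temporary files and made-up values. Because read.csv() and read.delim() also default to header = TRUE, the read.table() calls set it explicitly.

```r
# Hypothetical comma- and tab-separated files with a header row
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("site,units", "north,120", "south,95"), csv_path)
writeLines(c("site\tunits", "north\t120", "south\t95"), tsv_path)

# The convenience functions...
df_csv <- read.csv(csv_path)
df_delim <- read.delim(tsv_path)

# ...and the read.table() calls they correspond to on simple files
df_csv_table <- read.table(csv_path, sep = ",", header = TRUE)
df_delim_table <- read.table(tsv_path, sep = "\t", header = TRUE)

# all.equal() returns TRUE when the two data frames match
same_csv <- isTRUE(all.equal(df_csv, df_csv_table))
same_delim <- isTRUE(all.equal(df_delim, df_delim_table))
```

For simple files like these, the two approaches should give matching data frames. On messier files, such as ones with ragged rows or # characters in the data, the other default differences can matter.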