Introduction to data frames
2. Data frames - introduction

Instruction

Interesting! Up until now, we've worked with basic variables and vectors. This time, though, notice that the cities variable contains tabular data. You may know this kind of data as a "spreadsheet" in Excel or as a "table" in a database. In R, such a table is called a data frame!

A data frame is a collection of vectors arranged as columns that each have their own names. For instance, the cities data frame contains five column vectors: country, city, latitude, longitude, and population. The vectors country and city are character vectors; latitude, longitude, and population are numeric vectors.
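As a minimal sketch of that idea, the snippet below builds a small data frame from five separate vectors. The name cities_sample is hypothetical, and only three rows from the cities table are used for illustration.

```r
# Illustrative values only: three rows copied from the cities table in this lesson.
country    <- c("France", "Germany", "Poland")
city       <- c("Paris", "Frankfurt", "Warsaw")
latitude   <- c(48.866693, 50.099977, 52.250001)
longitude  <- c(2.33333533, 8.67501542, 20.99999955)
population <- c(4957588.5, 1787332.0, 1704569.5)

# data.frame() arranges equal-length vectors side by side as named columns.
# stringsAsFactors = FALSE keeps text columns as character vectors in older R versions too.
cities_sample <- data.frame(country, city, latitude, longitude, population,
                            stringsAsFactors = FALSE)

# str() lists each column with its type: chr for character, num for numeric.
str(cities_sample)
```

Each column keeps the name and type of the vector it was built from.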

From here on, we'll call these vectors simply columns. You can think of a data frame as a two-dimensional array – each entry in a data frame belongs to exactly one column and exactly one row.
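To make the two-dimensional picture concrete, here is a short sketch that asks R for the size of a data frame and looks up a single entry by its row and column. The data frame cities_sample is a hypothetical stand-in with values taken from the cities table.

```r
# A small stand-in data frame (hypothetical name, values from the cities table).
cities_sample <- data.frame(
  city = c("Paris", "Frankfurt", "Warsaw"),
  population = c(4957588.5, 1787332.0, 1704569.5),
  stringsAsFactors = FALSE
)

dim(cities_sample)   # number of rows, then number of columns
nrow(cities_sample)  # rows only
ncol(cities_sample)  # columns only

# One entry is identified by its row and its column: here row 2, column "city".
cities_sample[2, "city"]
```

With three rows and two columns, dim() reports 3 and 2, and the entry at row 2 in the city column is "Frankfurt".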

cities (each named field, such as city, is a COLUMN; each numbered record, such as record 4, is a ROW)

id  country   city       latitude   longitude    population
1   France    Paris      48.866693  2.33333533   4957588.5
2   Germany   Berlin     52.521819  13.40154862  32500.4.0
3   Germany   Frankfurt  50.099977  8.67501542   1787332.0
4   Germany   Stuttgart  48.779980  9.19999630   1775644.0
5   Germany   Hamburg    53.550025  9.99999914   1748058.5
6   Poland    Warsaw     52.250001  20.99999955  1704569.5
7   Portugal  Lisbon     38.722723  -9.14486631  1664901.0
8   Poland    Katowice   50.260380  19.0200.405  1527362.0

Exercise

We've set up a data frame named countries.

Display the information stored in this data frame, and observe the data that are returned. Notice that this data frame only contains countries from Western and Eastern Europe.

Stuck? Here's a hint!

Just type countries in the editor.
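Typing the bare name of an object makes R print it, which is why the hint works. The sketch below shows this with a hypothetical stand-in called countries_sample, since the real countries data frame and its columns are provided by the exercise environment and are not shown here.

```r
# Hypothetical stand-in: the real countries data frame comes from the exercise,
# and its actual columns may differ from the single column assumed here.
countries_sample <- data.frame(
  country = c("France", "Germany", "Poland"),
  stringsAsFactors = FALSE
)

# A bare name at the top level is auto-printed by R.
countries_sample

# print() does the same thing explicitly.
print(countries_sample)

# head() limits the display to the first rows, which helps with large data frames.
head(countries_sample, 2)
```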