In this part, we'll work on a recruitment project. Our recruitment agency is using R in its analysis of potential candidates for a specific position. The candidate CVs are parsed from PDF and .docx files and stored in a list object in R.
In our last exercise, we observed a list that saves information about a potential candidate (Emma Stone) for employment that our recruitment agency is considering for a specific role.
In addition to candidate CVs, we also have access to a list of open positions, stored in a separate list. We'll work with both lists.
You're probably wondering: Why use lists to store the contents of applicant CVs? Why not vectors or data frames?
Well, CVs contain lots of different information. They don't have a fixed structure; each CV is specific to one individual. Some candidates write about their past education; some don't. Some write about their known languages; some don't. The structure of each CV differs. Thus, it makes sense to use lists in our analysis. After all, it's quite difficult to place a CV in a 2D structure like a data frame without any duplication of the data (data redundancy – we'll show that later in one of our exercises).
Lists also support nesting, allowing you to store lists in each other. For example, the candidate lists from our previous exercise can be part of a larger list of all candidates.
To start, let's observe Emma's CV and one particular list that we have in our R environment.