17. Not all data is clean data

Instruction

Now let's see how to clean data by removing unwanted characters.

The clients data frame stores client names (first and last) and locations (city and state). Unfortunately, this data is corrupted and contains errors that we need to fix. Let's have a look at it.

Exercise

Using the head() function on the clients data frame, find out what's wrong with our data.
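The course's actual data is not shown here, so the sketch below builds a small stand-in for clients to illustrate what head() does. The column names (name, location) and every value except the Charleston address are assumptions made up for illustration.

```r
# Hypothetical stand-in for the course's clients data frame.
# Column names and all values except the Charleston address are invented.
clients <- data.frame(
  name = c("John4821Smith", "Anna07Lee", "Mark553Brown"),
  location = c("Charleston~*115_South Carolina",
               "Austin~*27_Texas",
               "Denver~*9_Colorado"),
  stringsAsFactors = FALSE
)

# head() returns the first rows of a data frame (six by default);
# the n argument limits how many rows come back.
first_rows <- head(clients, n = 2)
print(first_rows)
```

Calling head(clients) without n would show up to six rows, which is usually enough to spot formatting problems like these.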

As you can see, random digits sit between the first and last names, which are joined into a single string. The same thing happens with the addresses. For example, we have:

Charleston~*115_South Carolina

Okay, we will take care of the addresses; you'll clean up the names. To do this, we'll need some useful commands.
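As a rough idea of what such a command can look like, here is a minimal sketch that cleans the address example with base R's sub() and a regular expression. This is an assumption for illustration only; the course may introduce different functions for the same job.

```r
# The corrupted address example from the text.
location <- "Charleston~*115_South Carolina"

# Replace the first run of characters that are neither letters nor spaces
# with a comma and a space. sub() only touches the first match, which is
# enough here because the junk appears once, between city and state.
cleaned <- sub("[^A-Za-z ]+", ", ", location)
print(cleaned)
```

For this input the result is "Charleston, South Carolina". The space is kept inside the character class so that multi-word state names stay intact.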