Earth’s 7.5 billion people together use several billion devices, generating more than one zettabyte of global IP traffic a year. Out of these impressive numbers, one field rises above the others: “data science”. Is data science an inevitable reality, or will it yet be dismissed as just another “wave of the future” that never came to be?
A few days ago I signed a new employment contract with my company. Finally, my long-time wish came true! After working for the last 10 years as an ETL developer, data warehouse specialist and data analyst, I have arrived at a place of complete satisfaction with my profession. I became, officially, a data scientist. Some call it the hottest job of the 21st century, others the sexiest job of the year, but my satisfaction does not come from holding an attractive title that everyone is talking about. It derives from my everyday work. It comes from knowing that each day will bring new and unexpected challenges, which will lead to new learning and exploration of the unknown. But what is data science, really, and what, exactly, do data scientists do?
What Is Data Science and What Do Data Scientists Do?
Data science is all about getting valuable insights from the available data. Surfing the internet, I discovered a number of definitions, most claiming that being a data scientist means that you are an expert in the following fields:
- Computer science (Cloud computing, Distributed systems)
- Data management (Business Intelligence, Data warehousing, Big Data)
- Math & Algorithms (Predictive modeling, Machine learning, Data mining)
- Art & Design (Visualization, Storytelling)
- Business domain (Data product design, Sales, Business Knowledge)
Each field contains many topics to master, which means lifelong learning in every field. I began to feel embarrassed because, no, I’m not excellent in all of these fields. Did I become all that when I signed my new contract? Of course not! Probably no one person is an expert in all these fields!
Ten years ago my department had 15 data analysts. Today that number has risen to 40, and in spite of that we still need new employees. The number grows, and the demand for this kind of staff is increasing. I read that IBM predicted demand for both data scientists and data engineers will grow by 39% by the year 2020. Isn’t that an impressive number? In a world where several exabytes of data are created each day, somehow I’m not surprised that the demand for data scientists is enormous. Are you?
Data Science Is Not Just Hype
In my opinion, the job description of the “legendary” data scientist is hype, or at least a little overdone. But what about data science itself? It is present and very popular. Zettabytes of data are waiting to be analyzed even now as I write about the topic. A lot can be learned from the massive quantities of data constantly being produced, and in order to research and learn, we need data scientists. My point here is that the field of data science is far more than just hype. We are all aware that data are present in huge amounts, and that those data must be used in decision-making processes. The problem lies in the misunderstanding of what a particular data scientist should do. The skills mentioned above may be necessary for a team of people, but not for just one person!
What Makes Me a Data Scientist?
But there must be some way in which all people of this “data science” profile are alike. What makes me a data scientist, and in what way am I similar to other data scientists? I think the similarity lies in the fact that we are all comfortable working with data. Not only that, but we also love to explore, to learn new things, and to gain new skills. It doesn’t matter whether you know Spark right now. The question is whether you have the will to learn it if the skill is needed. Are you willing to pursue new knowledge and new skills? Even if you do not possess a specific skill at a given moment, if you are always eager to learn, you have a chance in this field.
What I Learned First
Ten years ago I started with SQL. At the time I knew very little about databases, tables, indexes, and primary keys. I knew nothing about business or data warehouse methodologies and very little about statistics or machine learning algorithms. But SQL was my starting point. From there I was continually in the process of upgrading my knowledge. Gradually, almost unconsciously, I expanded my range of tools, technologies and algorithms, both at work and privately at home. And all this because I wanted to be as close as possible to the data!
Let the Starting Point of Your Path Be SQL
If data science sounds tempting to you and you would like to grow in that direction, I recommend that you start with SQL. SQL is a must, because as a data scientist you will at some point work with structured data and relational databases. Although today we often encounter unstructured types of data, relational databases are still present in the digital age of mass data production, and SQL is a perfect tool for analyzing those data. This skill will move you closer to the data, and once you find out what it means to work with data, you will easily expand the range of tools you need for further growth. But make SQL your starting point. Vertabelo Academy offers excellent interactive courses for SQL beginners, so check out the following:
- SQL Basics – You’ll learn basic syntax and how to retrieve necessary data from a database. You’ll learn how to perform simple data analysis, which is a powerful start.
- Standard SQL Functions – You’ll learn slightly more advanced syntax which will allow you to make more complex analyses. You’ll learn how to deal with specific data types and aggregate data and discover null/missing values in the data.
- Operating on Data in SQL – This course will teach you how to insert, delete and update data in a database. These are basic operations that will allow you to control your information in a relational database.
These are really nice courses for a fresh start in the data science field. SQL is still a powerful tool and will remain so for quite some time. Don’t waste your precious time! Start learning now, because somewhere a wealth of data is already waiting for you, ready for your analysis.
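To give a small taste of the three kinds of skills described above (retrieving data, aggregating it while watching for missing values, and modifying it), here is a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database purely so the SQL can be run anywhere; the `orders` table, its columns and its rows are made up for illustration and do not come from any of the courses.

```python
import sqlite3

# In-memory database; the table, columns and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount REAL,"      # NULL here stands for an amount that was not recorded
    " country TEXT)"
)
cur.executemany(
    "INSERT INTO orders (id, customer, amount, country) VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", 120.0, "HR"),
        (2, "Ben", 80.0, "DE"),
        (3, "Ana", None, "HR"),
        (4, "Cleo", 200.0, "DE"),
        (5, "Ben", 50.0, "DE"),
    ],
)

# 1. Basic retrieval: filter rows and sort the result.
big_orders = cur.execute(
    "SELECT customer, amount FROM orders "
    "WHERE amount > 100 ORDER BY amount DESC"
).fetchall()

# 2. Aggregation: orders and total amount per country.
#    SUM skips NULL amounts, COUNT(*) counts every row.
by_country = cur.execute(
    "SELECT country, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY country ORDER BY country"
).fetchall()

# 3. Finding missing values: NULL must be tested with IS NULL, not '='.
missing = cur.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL"
).fetchone()[0]

# 4. Modifying data: fill in the missing amount, then delete one customer's rows.
cur.execute("UPDATE orders SET amount = 95.0 WHERE id = 3")
cur.execute("DELETE FROM orders WHERE customer = 'Ben'")
conn.commit()
remaining = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(big_orders)
print(by_country)
print(missing, remaining)
```

The SQL statements themselves are the point here; the Python around them is only a convenient way to run them without installing a database server.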