Avi's Virtual Enclave

What is Data Science?

If you’ve ever been curious about a career in data science, chances are that you’ve heard that it’s “the sexiest job of the 21st century.” A lot of people have asked me what that actually entails, so I want to dive into that in this article. And to understand data science, it helps to learn a little bit about how it came to be as a professional discipline.

What is data science?

To me, data science is, in its simplest and purest form: the study of solving math problems and identifying patterns in ways that directly drive business value.

This is a little vague, and that’s intentional; data science is an extremely broad discipline.

On one end, we have business analysts, who spend a lot of time preparing data visualizations and communicating with business stakeholders. On the other end are machine-learning researchers, who study AI and design algorithms to solve mathematically abstract problems. Those roles, and everything in between, can all fall under the title of “data scientist”.

And considering the amount of data we’re able to collect these days, it is increasingly impractical to solve these problems by hand; it requires software that can store, structure, and learn from large amounts of data. This includes programming languages (such as Python), databases (typically queried with SQL), and machine learning frameworks (such as TensorFlow).

With that, there are four major pillars to data science: visualizations/dashboards, databases, statistics, and AI/machine learning.
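
As a rough illustration of how those pillars fit together, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made up for the example: the `sales` table, the `ad_spend` and `revenue` columns, the toy numbers, and the helper names `load`, `fit_line`, and `text_chart`. A least-squares line stands in for "machine learning" and a text bar chart stands in for a dashboard; real projects would lean on much heavier tooling.

```python
import sqlite3
import statistics

# Hypothetical toy data: (ad_spend, revenue) pairs, invented for illustration.
ROWS = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def load(rows):
    """Databases: store rows in an in-memory SQLite table and read them back with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (ad_spend REAL, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT ad_spend, revenue FROM sales ORDER BY ad_spend"
    ).fetchall()
    conn.close()
    return result


def fit_line(points):
    """Statistics / ML: ordinary least-squares fit of revenue = slope * ad_spend + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def text_chart(points):
    """Visualization: a crude text bar chart, one '#' per unit of revenue."""
    return "\n".join(f"{x:>4.1f} | {'#' * round(y)}" for x, y in points)


points = load(ROWS)
slope, intercept = fit_line(points)
print(text_chart(points))
print(f"revenue ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```

The point is only the shape of the workflow: data goes into a store, comes back out through a query, gets summarized by a model, and is shown to a person.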

Before and after “data science”

The modern use of the term “data science” is usually traced to around 2008, but interest in it didn’t pick up until around 2013 (see the Google Trends plot below).

Interest in the “data science” search term over time

While data science has only gained traction in the last 15 or so years, jobs in quantitative research and analytics have been around for decades.

Statistics as a field of study has been around for hundreds of years. And mathematical models have been heavily used in academic research, forming a major component of disciplines such as genetics, psychology, and engineering.

Wall Street has employed physicists and other academic researchers to build statistical financial models since the 1980s.

And the concept of analytics as a profession has been discussed since the 1960s.

A big contributor to the incubation of data science was the growth in data size. Previously, only large, insular institutions like universities and banks had the resources to record and store data en masse. With the proliferation of open-source software and inexpensive data storage, more industries and people have access to “bigger data”, creating both the need and the opportunity for analytics specialists to add value by making sense of it all.

The ability to handle and interpret large amounts of data has become so valuable that many companies will pay top dollar for data scientists. At Facebook, for instance, total compensation can exceed $400K/year.

As a result, an increasing number of industries have job opportunities for data scientists, and many universities are offering degrees and majors in data science, legitimizing it as a standalone field of study.