The “hybrid” data scientist is one whose work reflects many different data science roles. Hybrid data scientists, compared to data scientists who play a single, narrower role, possess deeper knowledge in particular data science skills such as machine learning, managing unstructured data and optimization. Organizations can leverage their deeper knowledge to optimize their data science efforts.
The success of a data science program rests on the skills of the data scientists doing the work. Because different types of data scientists have different skills, it’s important that you hire data scientists who possess the skills you need to address the problems you want to solve. While our earlier reporting focused on understanding the differences among four types of data scientists (Business Manager, Developer, Creative and Researcher), we now want to understand the data professional who self-identifies as multiple types of data scientist: the hybrid data scientist. Do these hybrid data scientists possess more skills than their counterparts?
In our survey, which drew over 600 responses, data professionals self-identified as different types of data scientists. Nearly half of the respondents (49%) said that the work they do is best described as falling into one data science role (see Figure 1).
Nearly all of the remaining data professionals self-identified as some form of hybrid data scientist. Nearly a third of the respondents (32%) indicated that the work they do with data is best described as two different roles. Smaller groups indicated that the work they do falls into three (13%) or four (4%) data science roles.
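The arithmetic behind these shares can be checked with a short sketch. The percentages are the ones quoted above; the variable names are illustrative assumptions, and the small share left over is not broken out in the figures quoted here.

```python
# Share of respondents (percent) by number of self-identified roles,
# as quoted in the text. Names are illustrative, not from the survey.
role_count_share = {1: 49, 2: 32, 3: 13, 4: 4}

single_role = role_count_share[1]

# A hybrid data scientist is anyone reporting two or more roles.
hybrid = sum(share for roles, share in role_count_share.items() if roles >= 2)

accounted_for = single_role + hybrid

print(f"single role: {single_role}%")
print(f"hybrid (2+ roles): {hybrid}%")
print(f"accounted for: {accounted_for}%")
```

On these figures, hybrid respondents are about as numerous as single-role respondents.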
There are many different types of data scientists. As you can see in Figure 2, the most common type of data scientist was the Researcher (22%), followed by Business Manager (14%) and Developer (9%). The next most common data scientists were the Researcher/Developer (8%) and Researcher/Creative (8%).
Some specific combinations of hybrid data scientist appeared to be more proficient in some data science skills than data scientists in single roles. Specifically, data professionals who said their work is best described as falling under all four data science roles tended to be the most proficient of all data scientists; they possessed at least an intermediate level of proficiency (i.e., can do work by themselves) in 18 out of the 25 data science skills.
The hybrid data scientist who sees herself and her work as reflecting all four data science roles is the only data scientist who has adequate proficiency (proficiency scores of 60 or higher) to do work in Managing Unstructured Data (66), Optimization (66), Graphical Models (64) and Bayesian Statistics (62).
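As a rough illustration of how the cutoff of 60 works, the sketch below applies it to the four scores quoted above. The function and variable names are hypothetical, and the survey's own scoring procedure is not described here.

```python
# Cutoff for "adequate" proficiency, as quoted in the text.
ADEQUATE_CUTOFF = 60

# Scores reported above for data scientists spanning all four roles.
four_role_scores = {
    "Managing Unstructured Data": 66,
    "Optimization": 66,
    "Graphical Models": 64,
    "Bayesian Statistics": 62,
}


def adequate_skills(scores, cutoff=ADEQUATE_CUTOFF):
    """Return the skills whose score meets the cutoff, highest score first."""
    passing = [skill for skill, score in scores.items() if score >= cutoff]
    return sorted(passing, key=lambda skill: -scores[skill])


adequate = adequate_skills(four_role_scores)
print(adequate)
```

The same function could be applied to any other group's scores to see which skills clear the cutoff.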
Also, data scientists who described themselves as Researchers/Developers and Researchers/Creatives/Developers reported greater proficiency in Machine Learning (62.35 and 59.33) and in Algorithms and Simulations (60.0 and 60.0) than their single-role counterparts (averages were 49.0 and 50; 48.1 and 47.3).
Data professionals self-identify as many different types of data scientists, some with single roles, others with multiple roles. Proficiency in data science skills tends to be higher among data scientists who have multiple roles. For machine learning and for algorithms and simulations, the most proficient data scientists are those who consider themselves both Researchers and Developers. The highest levels of proficiency overall were reported by data scientists who describe their work as falling into all four data science roles.
The study results suggest that organizations, when building their data science capabilities, need to either identify or train hybrid data scientists. Hybrid data scientists, with their deep knowledge of diverse data science skills, could be better suited for upper-level data science roles (e.g., Chief Data Officers, Chief Analytics Officers) in which their expertise would allow them to better manage a diverse team of data scientists. These hybrid data scientists better understand how all the data science components (e.g., business, technology, statistics) can work together to solve a particular business need. Classifying data scientists into the various roles would be a useful step in helping organizations manage their data scientists to optimize their analytics efforts.