The practice of data science requires skills that fall into three general areas: business acumen, computer technology/programming and statistics/math. Depending on whom you ask, the specific set of top skills varies. Dave Holtz describes the data science skills you need to get a job as a data scientist (8 Skills You Need to Be a Data Scientist). Ferris Jumah, examining LinkedIn profiles with the title “Data Scientists,” identified 10 skills (The Data Science Skills Network). BurtchWorks offers their list of skills that are critical to success in data science (9 Must-Have Skills You Need to Become a Data Scientist). RJMetrics, using LinkedIn data, identified the top 20 data science skills (The State of Data Science). In these lists, the top skills either reflect how frequently data professionals list them on their social media profiles or simply reflect what the author thinks is a good set of skills.
Data Science Skills and Possessing Proficiency in the Skills
In our ongoing study of data scientists, we ask data professionals to indicate their proficiency in 25 different data science skills. The 25 skills, listed in Figure 1, reflect the set of skills that are commonly associated with data scientists. These skills are, in fact, included in the studies listed above.
I used the “intermediate” proficiency level as the criterion for whether a data professional possesses a skill. The proficiency level of “intermediate” indicates that a data professional is able to complete tasks as requested and can usually perform the skill without help from others.
I ranked the list of these 25 skills based on the percentage of data professionals who possessed each skill.
This list appears in Figure 2. The first 10 skills listed in the figure (from left to right) were the most common skills across all data professionals. The top 10 data science skills were:
- S – Communication (87% possess skill)
- T – Managing Structured data (75%)
- M – Math (71%)
- B – Project management (71%)
- S – Data Mining and Viz Tools (71%)
- S – Science/Scientific Method (65%)
- S – Data Management (65%)
- B – Product design and development (59%)
- S – Statistics and statistical modeling (59%)
- B – Business development (53%)
Many of the top data science skills fall into the area of Statistics; all five Statistics skills appear in the top 10 list, including Communication, Data Mining and Viz Tools, Science/Scientific Method and Statistics and statistical modeling. Additionally, three skills related to Business acumen appear in the top 10, including Project management and Product design and development. No Programming skills appear in the top 10 list.
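The ranking procedure described above can be sketched in a few lines of Python. This is a minimal sketch, not the code used in the study: the five-level scale in LEVELS, the rule that any level at or above “intermediate” counts as possessing the skill, and the toy responses are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical ordered proficiency scale. Only "intermediate" is named in the
# article; the other level names are assumptions.
LEVELS = ["none", "beginner", "intermediate", "advanced", "expert"]
THRESHOLD = LEVELS.index("intermediate")


def possesses(level):
    """Assumption: a skill counts as possessed at 'intermediate' or above."""
    return LEVELS.index(level) >= THRESHOLD


def rank_skills(responses):
    """Rank skills by the percentage of respondents who possess them.

    responses: list of dicts mapping skill name -> self-rated level.
    Returns a list of (skill, percent) sorted from most to least common,
    with ties broken alphabetically.
    """
    counts = Counter()
    skills = set()
    for response in responses:
        for skill, level in response.items():
            skills.add(skill)
            if possesses(level):
                counts[skill] += 1
    n = len(responses)
    percents = [(skill, 100.0 * counts[skill] / n) for skill in skills]
    return sorted(percents, key=lambda item: (-item[1], item[0]))


# Toy data for illustration only; these are not the study's responses.
toy_responses = [
    {"Communication": "advanced", "Math": "beginner", "Back-end Programming": "none"},
    {"Communication": "intermediate", "Math": "intermediate", "Back-end Programming": "beginner"},
    {"Communication": "expert", "Math": "advanced", "Back-end Programming": "intermediate"},
    {"Communication": "beginner", "Math": "expert", "Back-end Programming": "none"},
]

ranking = rank_skills(toy_responses)
for skill, pct in ranking:
    print(f"{skill}: {pct:.0f}%")
```

Taking the first 10 entries of such a ranking would give a top 10 list like the one above.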
Top 10 Data Science Skills Vary by Job Role
Next, I looked at the top data science skills by job role. This depiction also appears in Figure 2 (and Table 1 below in detail). For each of the job roles, I graphically indicated the frequency with which the data professionals in a particular role possessed the skills. As you can see in Figure 2, some top data science skills are common across all the different roles. These include Communication, Managing structured data, Math, Project management, Data mining and viz tools, Data management and Product design and development. In addition to these similarities, however, there are considerable differences in top data science skills across the job roles. Let’s take a look at each job role.
Business Manager: Top 10 data science skills for data professionals who consider themselves Business Managers (i.e., leader, business person, entrepreneur) are:
- S – Communication (91% possess skill)
- B – Project management (86%)
- B – Business development (77%)
- T – Managing structured data (74%)
- B – Budgeting (71%)
- B – Product design and development (70%)
- M – Math (65%)
- S – Data Management (64%)
- S – Data Mining and Viz Tools (64%)
- B – Governance and Compliance (61%)
Top skills that are unique to Business Managers are, not surprisingly, business-related. These skills include Business development, Budgeting and Governance and Compliance.
Developers: The top 10 data science skills for data professionals who consider themselves Developers (i.e., developers or engineers) are:
- T – Managing structured data (91%)
- S – Communication (85%)
- S – Data Mining and Viz Tools (76%)
- B – Product Design (75%)
- M – Math (75%)
- S – Data Management (75%)
- B – Project management (74%)
- P – Database Administration (73%)
- P – Back-end Programming (70%)
- P – Systems Administration (65%)
Top skills that are unique to Developers are related to Technology and Programming. These top unique skills include Back-end programming, Systems Administration and Design and Database administration. While these data professionals possess these skills, fewer of them possess other technology- and programming-heavy skills that are important in our Big Data world. For example, fewer than half possess skills in Cloud Management (42%), Big and Distributed Data (48%) and NLP and text mining (42%). These results are consistent with the data science study by RJMetrics. I suspect that these percentages will grow as more graduates of data science programs enter the workforce.
Creatives: Top 10 data science skills for data professionals who consider themselves Creatives (i.e., Jack of all trades, artist, hacker) are:
- S – Communication (87% possess skill)
- T – Managing structured data (79%)
- B – Project management (77%)
- S – Data Mining and Viz Tools (77%)
- M – Math (75%)
- B – Product Design and Development (68%)
- S – Science/Scientific Method (68%)
- S – Data Management (67%)
- S – Statistics and statistical modeling (63%)
- B – Business development (58%)
Creatives do not have top skills that are unique to only them. In fact, their list of top data science skills closely matches that of Researchers: the two top 10 lists share eight skills.
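The overlap between the two lists can be checked with a simple set intersection. The sketch below copies the skill names from the Creatives and Researchers lists in this article; lowercasing the names before comparing is my own choice, made because capitalization varies between the lists.

```python
# Top 10 lists as given in this article for the two roles.
creatives = [
    "Communication", "Managing structured data", "Project management",
    "Data Mining and Viz Tools", "Math", "Product Design and Development",
    "Science/Scientific Method", "Data Management",
    "Statistics and statistical modeling", "Business development",
]
researchers = [
    "Communication", "Data Mining and Viz Tools", "Math",
    "Science/Scientific Method", "Statistics and statistical modeling",
    "Managing structured data", "Data Management", "Project management",
    "Machine Learning", "Optimization",
]

# Normalize capitalization so the same skill matches across lists.
creative_set = {skill.casefold() for skill in creatives}
researcher_set = {skill.casefold() for skill in researchers}

shared = creative_set & researcher_set
only_creatives = creative_set - researcher_set
only_researchers = researcher_set - creative_set

print(len(shared))               # 8
print(sorted(only_creatives))    # ['business development', 'product design and development']
print(sorted(only_researchers))  # ['machine learning', 'optimization']
```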
Researcher: The top 10 data science skills for Researchers (i.e., researcher, scientist, statistician) are:
- S – Communication (90% possess skill)
- S – Data Mining and Viz Tools (81%)
- M – Math (80%)
- S – Science/Scientific Method (78%)
- S – Statistics and statistical modeling (75%)
- T – Managing structured data (73%)
- S – Data Management (69%)
- B – Project management (68%)
- T – Machine Learning (58%)
- M – Optimization (56%)
Top data science skills for Researchers are primarily in the area of Statistics. Additionally, top data science skills that are unique to Researchers are highly quantitative in nature, including Machine Learning and Optimization.
Summary and Conclusions
The list of top data science skills depends on the type of data scientist you are considering. While some data science skills appear to be common among different types of data professionals (i.e., Communication, Managing structured data, Math, Project management, Data mining and viz tools, Data management and Product design and development), other data science skills are unique to certain roles. Developers’ top skills include Programming skills; Researchers’ top skills include Math-heavy skills; and Business Managers’ top skills include business-related skills.
These results have implications for data professionals interested in the field of data science as well as the recruiters in pursuit of them and the organizations that hire them. Data professionals could use the results to understand the types of skills they need to possess for different types of jobs. If your skills are focused in Statistics, you might look for jobs that have a strong research component.
Recruiters need to understand the different types of data science roles to better recruit data professionals who best match the role requirements. Matching the skills of job applicants to the job requirements would greatly improve the efficiency of their recruitment efforts. Organizations can optimize their data science teams by ensuring they include different types of data professionals, especially Business Managers, Developers and Researchers.
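One simple way to picture this kind of skill-to-role matching is to count how many of each role’s top 10 skills a person already has. The sketch below uses the four role lists from this article; the counting rule, the role_coverage helper and the example candidate are hypothetical illustrations, not a method used in the study.

```python
# Top 10 lists copied from the role sections of this article.
TOP_SKILLS = {
    "Business Manager": [
        "Communication", "Project management", "Business development",
        "Managing structured data", "Budgeting",
        "Product design and development", "Math", "Data Management",
        "Data Mining and Viz Tools", "Governance and Compliance",
    ],
    "Developer": [
        "Managing structured data", "Communication",
        "Data Mining and Viz Tools", "Product Design", "Math",
        "Data Management", "Project management", "Database Administration",
        "Back-end Programming", "Systems Administration",
    ],
    "Creative": [
        "Communication", "Managing structured data", "Project management",
        "Data Mining and Viz Tools", "Math", "Product Design and Development",
        "Science/Scientific Method", "Data Management",
        "Statistics and statistical modeling", "Business development",
    ],
    "Researcher": [
        "Communication", "Data Mining and Viz Tools", "Math",
        "Science/Scientific Method", "Statistics and statistical modeling",
        "Managing structured data", "Data Management", "Project management",
        "Machine Learning", "Optimization",
    ],
}


def role_coverage(my_skills):
    """Hypothetical helper: count each role's top 10 skills found in my_skills."""
    mine = {skill.casefold() for skill in my_skills}
    return {
        role: sum(skill.casefold() in mine for skill in skills)
        for role, skills in TOP_SKILLS.items()
    }


# Hypothetical candidate whose skills are concentrated in Statistics and Math.
candidate = [
    "Communication", "Math", "Science/Scientific Method",
    "Statistics and statistical modeling", "Machine Learning", "Optimization",
]

coverage = role_coverage(candidate)
best_role = max(coverage, key=coverage.get)
print(coverage)
print(best_role)  # Researcher
```

A count like this ignores proficiency level and how important each skill is to the role, so it is best read as a rough first screen rather than a hiring rule.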
While looking at the frequency with which data scientists possess certain skills is one way of identifying top data skills, I’m interested in understanding how well each of the data science skills is related to job success. It’s possible that even though many data professionals possess a given skill, that skill may not be important to the success of analytics projects. Next week, I will examine the importance of each data science skill in determining the success of a data analytics project.