Most Popular Integrated Development Environments (IDEs) Used by Data Scientists


According to a worldwide survey of data professionals, the most used Integrated Development Environments (IDEs) are: Jupyter (73% have used), Visual Studio (31%), RStudio (30%), PyCharm (29%) and Notepad++ (22%).

Integrated Development Environments (IDEs) help programmers consolidate different aspects of software development. An IDE typically consists of 1) a source code editor, 2) build automation tools and 3) a debugger. IDEs are designed to maximize developer productivity by letting developers debug code faster and more easily than they could without one.

Kaggle conducted a worldwide survey in October 2019 of 19,717 data professionals (2019 Kaggle Machine Learning and Data Science Survey). Their survey included a variety of questions about data science, tool usage, machine learning, education and more. In this week’s post, I looked at the IDEs used by data professionals.

Popular IDEs used by Data Pros

Figure 1. Integrated Development Environments (IDEs) used by data professionals.

Survey respondents were asked to indicate, “Which of the following integrated development environments (IDEs) do you use on a regular basis? (Select all that apply).” On average (median), data professionals said that they use two (2) IDEs on a regular basis. A little less than 2% of respondents indicated that they did not use an IDE. Of the respondents who did use at least one IDE, the top IDEs used were:

  1. Jupyter (JupyterLab, Jupyter Notebooks, etc) (73% used)
  2. Visual Studio / Visual Studio Code (31%)
  3. RStudio (30%)
  4. PyCharm (29%)
  5. Notepad++ (22%)
  6. Spyder (21%)
  7. Sublime Text (18%)
  8. Vim / Emacs (12%)
  9. MATLAB (12%)
  10. Atom (10%)

About five percent of respondents indicated they used “Other” IDEs; the top other IDEs included 1) Eclipse, 2) IntelliJ IDEA, 3) Google Colab, 4) Anaconda and 5) NetBeans.
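
For readers who want to reproduce this kind of tally, here is a minimal sketch of how percentages for a "select all that apply" question can be computed. The toy responses and the `ide_usage` helper are hypothetical and are not taken from the Kaggle data or the original analysis. The sketch assumes that percentages and the median are computed only over respondents who selected at least one IDE.

```python
from collections import Counter
from statistics import median

# Hypothetical toy responses: each set holds the IDEs one respondent selected.
responses = [
    {"Jupyter", "RStudio"},
    {"Jupyter", "PyCharm", "Visual Studio"},
    {"Jupyter"},
    set(),  # this respondent selected no IDE
    {"RStudio", "Notepad++"},
]

def ide_usage(responses):
    """Return (percent per IDE among IDE users, median IDEs per user, percent with no IDE)."""
    # Keep only respondents who picked at least one IDE (assumed base for percentages).
    users = [r for r in responses if r]
    counts = Counter(ide for r in users for ide in r)
    pct = {ide: round(100 * n / len(users)) for ide, n in counts.most_common()}
    med = median(len(r) for r in users)
    none_share = round(100 * (len(responses) - len(users)) / len(responses))
    return pct, med, none_share

pct, med, none_share = ide_usage(responses)
print(pct, med, none_share)
```

Because each respondent can select several IDEs, the percentages are not expected to sum to 100.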

IDE Usage by Job Title

Figure 2. Integrated Development Environment by Job Title.

The popularity of IDEs depends on the job role of respondents. As seen in Figure 2, Jupyter is the most popular IDE for all job titles except one, Statistician. Data Scientists (83% used), Students (77%) and Data Engineers (74%) are big users of Jupyter. For Statisticians, the most popular IDE is RStudio (73% used).

The second most popular IDE among data professionals varies across job titles. The second most used IDE within each job title is:

  1. RStudio: Data Scientist, Research Scientist, Data Analyst and Business Analyst
  2. Visual Studio: Student, Data Engineer, Product/Project Engineer and Software Engineer
  3. Notepad++: DBA/Database Engineer
  4. Jupyter: Statistician

The pattern for the second most popular IDE within job titles suggests that RStudio is favored by roles geared mainly toward statistical analysis, while Visual Studio is favored by data pros who typically build software applications.
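
The ranking within job titles can be sketched in a few lines. The job titles, the usage shares and the `second_most_used` helper below are made up for illustration and do not come from the survey. The sketch only shows the idea of sorting IDEs by usage within each title and taking the runner-up.

```python
# Made-up usage shares (percent) per job title, for illustration only.
usage_by_title = {
    "Title A": {"Jupyter": 80, "RStudio": 40, "Visual Studio": 25},
    "Title B": {"Jupyter": 70, "Visual Studio": 45, "RStudio": 20},
}

def second_most_used(usage_by_title):
    """Map each job title to its second most used IDE (None if fewer than two IDEs)."""
    result = {}
    for title, shares in usage_by_title.items():
        # Sort IDE names from highest to lowest usage share.
        ranked = sorted(shares, key=shares.get, reverse=True)
        result[title] = ranked[1] if len(ranked) > 1 else None
    return result

runner_up = second_most_used(usage_by_title)
print(runner_up)
```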

Integrated Development Environments (IDEs) help data professionals build applications by combining common developer tools into a single graphical user interface. The top IDE is, by far, Jupyter, followed by Visual Studio, RStudio and PyCharm. Jupyter tends to be used a lot by engineers while RStudio tends to be used by analysts. The list of IDE options in the survey was not exhaustive. As such, it will be useful for subsequent surveys of data professionals to cover a wider range of IDEs. For example, Amazon’s SageMaker IDE, released late last year, was not mentioned in this study.

Republished with author's permission from original post.

