(This year, I plan to devote some of my Research Round-Up posts to a discussion of academic research papers about artificial intelligence. Some of these scientific papers will likely focus on comparing the capabilities of AI to those of humans at performing tasks related to marketing. This month’s Research Round-Up features an unpublished paper that compares the performance of AI vs. humans at generating ideas for new products.)
“Ideas are Dimes a Dozen: Large Language Models for Idea Generation in Innovation”
- Authors – Karan Girotra, Cornell Tech and Johnson College of Business, Cornell University; Lennart Meincke, Christian Terwiesch, and Karl T. Ulrich, The Wharton School, University of Pennsylvania
- Date Written – July 10, 2023
This paper describes the results of an experiment designed to compare the performance of generative AI and humans at producing ideas for new consumer products.
The task used in the experiment was to generate ideas for a new product for the college student market that would sell at retail for less than $50. The AI application used in the experiment was OpenAI’s ChatGPT-4.
The experiment used three “pools” of new product ideas.
- First pool (200 ideas) – Ideas created without AI assistance by students enrolled in a product design course at an elite university.
- Second pool (100 ideas) – Ideas generated by ChatGPT based on the same “prompt” as that given to the students.
- Third pool (100 ideas) – Ideas generated by ChatGPT based on the same prompt plus a sample of highly rated product ideas provided as examples.
All 400 product ideas were evaluated by a panel of college-age individuals in the United States. The quality of each idea was measured by purchase intent. Panel members expressed their purchase intent by selecting one of five options – definitely would not purchase, probably would not purchase, might or might not purchase, probably would purchase, or definitely would purchase.
The Results
The average quality of the product ideas produced by ChatGPT was higher than the average quality of the human-generated ideas. The average purchase probability of a human-produced idea was 40.4%, while the average for a ChatGPT idea (without examples) was 46.8%, and the average with examples was 49.3%.
Of the 40 highest-rated ideas in the experiment, 35 (87.5%) were ideas produced by ChatGPT.
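To put these figures in perspective, here is a minimal Python sketch of the arithmetic, using only the numbers quoted above. The variable and function names are my own and purely illustrative; the paper may report these comparisons differently.

```python
# Average purchase probabilities quoted above (in percent).
human_avg = 40.4
chatgpt_avg = 46.8            # ChatGPT without examples
chatgpt_examples_avg = 49.3   # ChatGPT with highly rated examples

def lift(baseline, value):
    """Return (gap in percentage points, relative lift in percent), rounded to one decimal."""
    gap = value - baseline
    return round(gap, 1), round(100 * gap / baseline, 1)

# Gap between each ChatGPT pool and the human pool.
no_examples_lift = lift(human_avg, chatgpt_avg)              # (6.4, 15.8)
with_examples_lift = lift(human_avg, chatgpt_examples_avg)   # (8.9, 22.0)

# ChatGPT's share of the 40 highest-rated ideas versus its share of all 400 ideas.
chatgpt_share_of_top = 100 * 35 / 40               # 87.5
chatgpt_share_of_pool = 100 * (100 + 100) / 400    # 50.0

print(no_examples_lift, with_examples_lift)
print(chatgpt_share_of_top, chatgpt_share_of_pool)
```

In other words, ChatGPT produced half of the ideas in the experiment but accounted for 87.5% of the top 40.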
The researchers also asked members of the evaluating panel to rate the novelty of the new product ideas. In this experiment, the mean novelty value of the human-generated ideas was higher than that of the ideas generated by ChatGPT. However, the researchers noted that novelty did not appear to be significantly correlated with purchase intent.
Implications for Marketers
The Girotra et al. paper has important implications for marketers because it adds to our understanding of the capabilities of AI applications like ChatGPT.
The results of the experiment described in the paper are similar to the findings of other recent research, including an experiment conducted by Boston Consulting Group (BCG) and scholars from four elite universities. I described this study in a post I wrote last fall.
In the BCG study, participants were asked to generate ideas for a new shoe for an underserved market. They were also required to develop a list of the steps needed to launch the product, create marketing slogans, and write a press release for the product. The researchers found that participants who used an AI tool to complete the tasks outperformed those who didn’t by 40%.
The results of these studies suggest that AI tools based on large language models may be better than humans at performing “brainstorming-like” tasks where the objective is to generate a large number of diverse ideas relating to a topic.
This result should not be that surprising. Large language models are trained on enormous volumes of data from highly diverse sources. Drawing on such a broad base of training data enables an AI tool like ChatGPT to excel at brainstorming-like tasks.
For marketers, the findings described in the Girotra et al. paper and similar findings in other studies suggest that AI tools powered by large language models can be particularly well suited to perform content ideation tasks such as generating potential topics for blog posts or producing potential social media posts.