When bots take surveys: AI’s impact on the integrity of market research
Online surveys have been the bread and butter of market research for the past several years—they are fast and efficient. But there’s a growing problem we need to address: AI bots are pretending to be real people and slipping through to complete surveys.
Tools like ChatGPT and similar platforms are being used to fill out surveys meant for human respondents. The result is compromised data, misleading or outright wrong insights, and wasted budgets.
Industry-wide estimates suggest that between 15% and 30% of online survey responses may be fraudulent, and on some platforms, fraud rates can run as high as 45%.
This is a big issue.
Competing priorities
Market researchers are in a bind.
On one side, sample vendors, who are motivated to deliver survey respondents quickly and economically, often push back when too many responses are rejected. They urge researchers to accept more completes to preserve study feasibility, speed, and cost, and they resist any suggestion that the sample they provide may be flawed.
On the other side, researchers are trying to protect the integrity of their work, ensuring that only real human responses make it into the data set delivered to their clients. While AI could be useful during the research process, using it to impersonate respondents defeats the goal of research—understanding what real people think and do.
Why is this happening?
There are several reasons for the recent increase in bot responses:
Panelists trying to earn incentives faster or using AI to help them complete surveys they may not fully understand
Sample farms using bots to fulfill quotas or earn survey rewards on a large scale
Tech-savvy individuals just testing the system for fun or to make money
Regardless of the reason, the result is the same: we risk replacing the voice of real people with the voice of bots.
How can we detect AI responses?
AI-written responses can be sophisticated, but they still leave clues for researchers. In addition to using third-party AI-detection tools to flag responses, here are other signals researchers can look for when analyzing their datasets (a simple sketch of two of these checks follows the list):
Overly polished open-ended responses
Responses may sound too perfect, clean, or over-explained. Punctuation and grammar are flawless, but the content lacks texture.
Lack of personal detail
AI struggles to give the contradictory or even simplistic answers real people give (for example, "I just like it," "I like the taste," "works well," or rating something a 4 instead of a 5 while offering a somewhat negative reason why). It is also poor at sharing personal stories or opinions.
Repeated patterns
Identical or nearly identical responses may appear across different respondents.
Better-than-average survey engagement and completion
A project may show unusually low dropout rates and very fast survey completion, even for hard-to-reach audiences.
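To make two of these signals concrete, here is a minimal sketch in Python of a "repeated patterns" check and a "speeding" check on a toy data set. The field names, similarity threshold, and speed cutoff are illustrative assumptions, not industry standards; real checks would run on the full survey data export.

```python
import difflib
from statistics import median

# Hypothetical respondent records: an ID, one open-ended answer, and the
# time taken to finish the survey in seconds. Field names are illustrative.
responses = [
    {"id": "r1", "text": "I just like it", "seconds": 310},
    {"id": "r2", "text": "The product offers a seamless and intuitive experience that consistently exceeds expectations.", "seconds": 95},
    {"id": "r3", "text": "The product offers a seamless, intuitive experience that consistently exceeds my expectations.", "seconds": 45},
    {"id": "r4", "text": "works well", "seconds": 275},
]

def near_duplicates(records, threshold=0.9):
    """Flag pairs of open-ends whose text similarity exceeds the threshold."""
    flagged = []
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            ratio = difflib.SequenceMatcher(
                None, a["text"].lower(), b["text"].lower()
            ).ratio()
            if ratio >= threshold:
                flagged.append((a["id"], b["id"], round(ratio, 2)))
    return flagged

def speeders(records, fraction=0.33):
    """Flag respondents who finished in under a fraction of the median time."""
    cutoff = median(r["seconds"] for r in records) * fraction
    return [r["id"] for r in records if r["seconds"] < cutoff]

print("Near-duplicate open-ends:", near_duplicates(responses))  # flags r2 vs r3
print("Suspiciously fast completes:", speeders(responses))      # flags r3
```

Neither check is proof of fraud on its own; they are screens that tell a researcher which completes deserve a closer manual look.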
How should we respond to this issue?
Dealing with the AI issue requires a multi-layered approach and the partnership of all stakeholders: sample vendors, research agencies, and users of research.
1. Strengthen sample recruitment
Use trusted panels that verify respondent identity
Monitor and block questionable traffic sources
Use CAPTCHA or other technology to help identify bots
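As an illustration of the CAPTCHA point above, here is a minimal Python sketch of the server-side half of a reCAPTCHA check, using Google's public siteverify endpoint. The secret key is a placeholder, and a real deployment would add error handling and, for reCAPTCHA v3, a score threshold.

```python
import requests

RECAPTCHA_SECRET = "your-secret-key"  # placeholder issued by the CAPTCHA provider

def is_likely_human(token: str) -> bool:
    """Verify a client-side CAPTCHA token against Google's reCAPTCHA API.

    The siteverify endpoint and its secret/response parameters are part of
    the public reCAPTCHA API; error handling is intentionally minimal here.
    """
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data={"secret": RECAPTCHA_SECRET, "response": token},
        timeout=5,
    )
    return resp.json().get("success", False)

# A survey platform would call is_likely_human() with the token submitted
# by the respondent's browser before letting the survey begin.
```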
2. Include ways to detect real human engagement
Include open-ended questions that require detail or some level of experience to answer
Use trap or logic-check questions to identify answer inconsistencies
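To illustrate the trap and logic-check idea, here is a minimal Python sketch of how such questions might be scored after fielding. The question IDs, answer codes, and trap instruction are hypothetical; the point is simply that inconsistencies can be flagged programmatically rather than by eye.

```python
# Minimal sketch of trap and logic checks on one respondent's answers.
# Question IDs and answer codes are hypothetical, not from any platform.

def quality_flags(answers: dict) -> list[str]:
    """Return a list of inconsistencies in a single respondent's answers."""
    flags = []
    # Logic check: "never used" in the screener should rule out regular usage later.
    if answers.get("q3_ever_used") == "never" and answers.get("q7_frequency") in {"daily", "weekly"}:
        flags.append("no usage claimed in Q3 but regular usage reported in Q7")
    # Trap question: the questionnaire instructed respondents to pick 'somewhat agree'.
    if answers.get("q5_trap") != "somewhat_agree":
        flags.append("failed the attention trap in Q5")
    return flags

respondent = {"q3_ever_used": "never", "q7_frequency": "daily", "q5_trap": "agree"}
print(quality_flags(respondent))  # -> both flags raised
```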
3. Collaborate with your sample vendors and end clients
Communicate clear anti-fraud policies to respondents and sample vendors
Work with sample vendors to audit and enforce quality standards
Educate research users on the state of the problem, the steps you are taking to detect and control it, and the potential timing and cost impacts. This is not an easy conversation, but it shows that you have their best interests in mind.
AI is a valuable tool for researchers, but it cannot replace real human feedback. If the industry doesn’t address this issue, the relevance and credibility of primary research will quickly deteriorate.
All stakeholders must act together to protect data quality and ensure business decisions are the result of real human insights.