Qualtrics RelevantID, reCAPTCHA, and other tips for survey research in 2021

Use these settings to detect and prevent spam in a survey dataset.

For my dissertation research at Carnegie Mellon University, I have created a national advertising campaign to recruit interview subjects via an online survey. The resulting interviews of U.S. residents age 18 and older will, in turn, inform the design of a final national survey.

It’s fun to return to two of my passions – connecting with people online and conducting quantitative survey research – EXCEPT when my survey gets flooded with spam! Once study info gets posted to the internet, anyone can copy it to a forum or group where people try to game paid surveys with repeated and/or inauthentic responses. This could max out my quota sampling before I reach the people who actually want to be part of this research.

Below are some of my tips for setting up the survey in Qualtrics, in order to address and prevent spam in my dataset:

  • In Qualtrics’ survey settings, I have enabled RelevantID. This checks in the background for evidence that a response is a duplicate or otherwise fraudulent, and reports the score in the metadata. This helps catch, for example, someone who uses a different email address to take the survey more than once in order to increase the compensation they receive.
  • The “Prevent Ballot Box Stuffing” setting (known as “Prevent Multiple Submissions” in the newer interface) will also help guard against spam duplicates. In past surveys, I have set this to only flag the repeat responses for review. However, for this national survey, I set it to prevent multiple submissions. A message tells anyone caught by this option that they are not able to take the survey more than once.
  • Also in Qualtrics’ survey settings, I have enabled reCAPTCHA bot detection. This is not just the “Prove you are not a robot” challenge question (which I added to the second block in the survey flow). Invisible tech judges the likelihood that the participant is a bot, and reports the score in the metadata.
  • With all of the above enabled, I can manually filter responses in Qualtrics’ Data & Analysis tab. On the top right, the Response Quality label is clickable. It takes me to a report of what issues, if any, the above checks have flagged, and gives me the option to view the problematic responses. Once in that filter, I can use the far-left column of check boxes to delete data and decrement quotas for any or all of the selected responses.
  • Even better, though, is to kick these responses out of the survey before they start. At the top of the Survey Flow, I set Embedded Data fields to record the results of the above checks. Then, I set a branch near the top with conditions matched to the Embedded Data: True for Q_BallotBoxStuffing and Q_RelevantIDDuplicate, and thresholds for Q_DuplicateScore, Q_RecaptchaScore and Q_FraudScore. If any of these conditions is met, the branch sends the respondent to End of Survey. See the image below or the Qualtrics page for Fraud Detection for more info.
  • Finally, I want to help the real humans who respond to my ads decide not to take the survey, if they judge that it’s not worth the risk of having a response thrown out. In my survey email’s auto-responder and in the Qualtrics block with the reCAPTCHA question, I include text to this effect: Note that only one response will be accepted. We may reject responses if the survey metadata reports duplication, low response quality and/or non-U.S. location, if the duration of the survey seems inconsistent with manual human response, or if the responses fail attention checks.
Screengrab from Qualtrics showing the placement and settings for the Fraud Detection blocks in the survey flow. See https://www.qualtrics.com/support/survey-platform/survey-module/survey-checker/fraud-detection/ for more information.
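
The same branch conditions could also be applied after the fact to an exported response file, as a second pass over anything that slipped through. Below is a minimal Python sketch of that idea. The column names are the Embedded Data fields named in the list above and may be labeled differently in an actual export. The threshold values are placeholder assumptions for illustration, not values stated in this post, so check them against the Qualtrics Fraud Detection page before relying on them. The helper names (flag_reasons, split_responses) are hypothetical.

```python
# Placeholder thresholds (assumptions for illustration only).
DEFAULT_THRESHOLDS = {
    "duplicate_score_min": 75.0,   # flag if Q_DuplicateScore >= this
    "fraud_score_min": 30.0,       # flag if Q_FraudScore >= this
    "recaptcha_score_max": 0.5,    # flag if Q_RecaptchaScore < this
}


def _is_true(value):
    """Treat 'true' or '1' (any case, surrounding spaces ignored) as True."""
    return str(value).strip().lower() in {"true", "1"}


def _as_float(value):
    """Return value as a float, or None if it is missing or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def flag_reasons(row, thresholds=DEFAULT_THRESHOLDS):
    """Return the list of checks that one response (a dict) trips."""
    reasons = []
    if _is_true(row.get("Q_BallotBoxStuffing")):
        reasons.append("ballot_box_stuffing")
    if _is_true(row.get("Q_RelevantIDDuplicate")):
        reasons.append("relevantid_duplicate")

    duplicate = _as_float(row.get("Q_DuplicateScore"))
    if duplicate is not None and duplicate >= thresholds["duplicate_score_min"]:
        reasons.append("duplicate_score")

    fraud = _as_float(row.get("Q_FraudScore"))
    if fraud is not None and fraud >= thresholds["fraud_score_min"]:
        reasons.append("fraud_score")

    recaptcha = _as_float(row.get("Q_RecaptchaScore"))
    if recaptcha is not None and recaptcha < thresholds["recaptcha_score_max"]:
        reasons.append("recaptcha_score")
    return reasons


def split_responses(rows, thresholds=DEFAULT_THRESHOLDS):
    """Split rows into (kept, flagged); flagged holds (row, reasons) pairs."""
    kept, flagged = [], []
    for row in rows:
        reasons = flag_reasons(row, thresholds)
        if reasons:
            flagged.append((row, reasons))
        else:
            kept.append(row)
    return kept, flagged
```

The rows can come from csv.DictReader over an exported file, after skipping any extra descriptive header rows the export contains. Missing or blank scores are treated as "no evidence" rather than as a failure, so a response is only flagged when a check actually reports a problem. Flagged rows are returned with their reasons so they can be reviewed by hand instead of being silently dropped.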

Bytes of Good Live podcast: Talking ‘Social Cybersecurity’ with Hack4Impact

One upside of video calls during the COVID-19 pandemic has been that I can attend or speak at virtually any location or event, without having to travel or move my schedule around too much. It’s helped me get more comfortable with public speaking, and exposed me to different audiences for my work.

In my latest public appearance, I joined fellow CMU grad student Tom Magelinski this spring at Bytes of Good Live, organized by Hack4Impact, a student-run nonprofit that promotes software for social good. We talked about our Social Cybersecurity research and what we know of careers in cybersecurity. The recording is available on YouTube, or click on the preview shown below to go to the video. Let me know what you think!