Qualtrics RelevantID, reCAPTCHA, and other tips for survey research in 2021

Use these settings to detect and prevent spam in a survey dataset.

For my dissertation research at Carnegie Mellon University, I have created a national advertising campaign to recruit interview subjects via an online survey. The resulting interviews of U.S. residents age 18 and older will, in turn, inform the design of a final national survey.

It’s fun to return to two of my passions – connecting with people online and conducting quantitative survey research – EXCEPT when my survey gets flooded with spam! Once study info gets posted to the internet, anyone can copy it to a forum or group where people try to game paid surveys with repeated and/or inauthentic responses. This could max out my quota sampling before I reach the people who actually want to be part of this research.

Below are my tips for setting up the survey in Qualtrics to detect and prevent spam in the dataset:

  • In Qualtrics’ survey settings, I have enabled RelevantID. This checks in the background for evidence that a response is a duplicate or otherwise fraudulent, and reports the score in the metadata. This helps catch, for example, whether someone is using a different email address to take the survey more than once and thereby collect compensation more than once.
  • The “Prevent Ballot Box Stuffing” setting (called “Prevent Multiple Submissions” in the newer interface) also guards against duplicate spam. In past surveys, I set this to only flag repeat responses for review; for this national survey, however, I set it to block multiple submissions outright. Anyone caught by this option sees a message that they cannot take the survey more than once.
  • Also in Qualtrics’ survey settings, I have enabled reCAPTCHA bot detection. This is distinct from the visible “prove you are not a robot” challenge question (which I added to the second block in the survey flow): the invisible check scores the likelihood that the participant is a bot and reports that score in the metadata.
  • With all of the above enabled, I can manually filter responses in Qualtrics’ Data & Analysis tab. On the top right, the Response Quality label is clickable. It takes me to a report of what issues, if any, the above checks have flagged, and gives me the option to view the problematic responses. Once in that filter, I can use the far-left column of check boxes to delete data and decrement quotas for any or all of the selected responses. (The same screening can be repeated on an exported file; see the sketch after this list.)
  • Even better, though, is to kick these responses out of the survey before they start. At the top of the Survey Flow, I set Embedded Data elements to record the fields generated by the settings above. Then, I set a branch near the top with conditions matched to that Embedded Data: a value of True for Q_BallotBoxStuffing and Q_RelevantIDDuplicate, and thresholds for Q_DuplicateScore, Q_RecaptchaScore and Q_FraudScore. If any of these conditions is met, the branch leads to an End of Survey element. See the image below or the Qualtrics support page for Fraud Detection for more info.
  • Finally, I want to help the real humans who respond to my ads choose not to take the survey if they judge that it is not worth the risk of having their response thrown out. In my survey email’s auto-responder and in the Qualtrics block with the reCAPTCHA question, I include text to this effect: Note that only one response will be accepted. We may reject responses if the survey metadata reports duplication, low response quality and/or non-U.S. location, if the duration of the survey seems inconsistent with manual human response, or if the responses fail attention checks.
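For post-hoc screening, the same rule the branch applies can be re-run over an exported response file. Below is a minimal sketch in Python with pandas, assuming a CSV export that includes the fraud-detection metadata fields named above; the file name, exact column names (your export may use longer names such as Q_RelevantIDDuplicateScore), and cutoff values are placeholders to adapt before deleting anything.

```python
import pandas as pd

# Load a CSV exported from Qualtrics' Data & Analysis tab.
# Qualtrics CSV exports typically carry two extra header rows; drop them if present.
df = pd.read_csv("survey_export.csv", skiprows=[1, 2])

# Cutoffs are assumptions, not Qualtrics defaults -- tune them to your own
# tolerance for false positives.
DUPLICATE_CUTOFF = 75   # higher RelevantID duplicate score = more likely a duplicate
FRAUD_CUTOFF = 30       # higher RelevantID fraud score = more likely fraudulent
RECAPTCHA_CUTOFF = 0.5  # lower reCAPTCHA score = more likely a bot

def num(col):
    """Parse a metadata column as numeric, treating blanks as missing."""
    return pd.to_numeric(df[col], errors="coerce")

suspect = (
    df["Q_BallotBoxStuffing"].astype(str).str.lower().eq("true")
    | df["Q_RelevantIDDuplicate"].astype(str).str.lower().eq("true")
    | num("Q_DuplicateScore").ge(DUPLICATE_CUTOFF)
    | num("Q_FraudScore").ge(FRAUD_CUTOFF)
    | num("Q_RecaptchaScore").lt(RECAPTCHA_CUTOFF)
)

# Review flagged rows before removing anything; ResponseId cross-references
# the Data & Analysis tab in Qualtrics.
print(df.loc[suspect, ["ResponseId", "Q_DuplicateScore", "Q_FraudScore", "Q_RecaptchaScore"]])
clean = df.loc[~suspect]
```

The cutoffs here deliberately flag rows for review rather than delete them automatically; borderline responses can still be checked by hand against survey duration and the attention-check items before any quotas are decremented.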
Screengrab from Qualtrics showing the placement and settings for the Fraud Detection blocks in the survey flow. See https://www.qualtrics.com/support/survey-platform/survey-module/survey-checker/fraud-detection/ for more information.

Bytes of Good Live podcast: Talking ‘Social Cybersecurity’ with Hack4Impact

One upside of video calls during the COVID-19 pandemic has been that I can attend or speak at virtually any location or event, without having to travel or move my schedule around too much. It’s helped me get more comfortable with public speaking, and exposed me to different audiences for my work.

In my latest public appearance, I spoke this spring with fellow CMU grad student Tom Magelinski at Bytes of Good Live, organized by Hack4Impact, a student-run nonprofit that promotes software for social good. We talked about our Social Cybersecurity research and what we know of careers in cybersecurity. The recording is available on YouTube; click the preview below to go to the video. Let me know what you think!

Alipay and WeChat Pay are everywhere in China – new paper for CSCW 2020 + reflections on cross-cultural research

This is a super-weird week to be submitting the camera-ready version of this research paper for publication at CSCW 2020. On Thursday, the “Executive Order Addressing the Threat Posed by WeChat” set a countdown of 45 days until the Tencent app would be “banned,” along with ByteDance’s TikTok. It recognizes what we document – the central role that these apps’ financial transactions play in the U.S.-intertwined Chinese economy.

Of course: I agree that apps such as these, Alipay and WeChat Pay among them, collect a lot of data about us while we use them for both fun and serious self-expression, and that this data is obtainable through various processes by the government of the country in which their parent companies are headquartered. I’ve long worried about our data security and privacy with regard to a constellation of mobile social media and short-form video apps, along with mobile payment options such as Apple Pay, Google Wallet/Google Pay, PayPal, Venmo, Zelle, Square Cash, and Facebook’s Messenger and Novi. (Disclosure: I work at Facebook this summer, on marketing/ad data literacy.)

I felt grief, however, at the thought of our global internet shrinking just a bit more from fully embracing the marvel of how newly connected so many of us can live and work despite our physical boundaries and limitations. The pandemic has sharpened my keen appreciation for how WeChat and other social media help family and friends bridge great distances, and for how much education, business and other knowledge work depends on reliable and usable communication software being available to everyone, everywhere.

It has been a joy and a fascination to help pilot and design research into a very different manifestation of internet-enhanced life than the one I know in the U.S., directed by lead author Hong Shen (also a graduate of the University of Illinois College of Media) and joined by fellow HCII PhD researcher Haojian Jin and my awesome advisors, Laura Dabbish and Jason Hong. In China, you don’t have to go out with your wallet, just your phone! Even street vendors have QR codes for you to scan! This gives rise to new forms of communication, such as attaching a message to a transfer worth a single penny, and to new threat models, such as thieves coming in the night to swap a vendor’s printed QR code for their own!

And that was just from the pilot interviews. Read the preprint version of the paper for specifics on what my Chinese co-authors discovered when they conducted a survey (n=466) and interviews (n=12) in China about the advantages and the pitfalls of moving to a largely mobile and cashless economy.

I spoke up about my interest in the project in part thanks to Dan Grover, whose blogging (in English, thankfully 🙂) about his experience working at WeChat as a product manager had piqued my interest in the various advances in the Chinese social media ecosystem. I couldn’t agree more with his tweeted responses to the EO on Thursday night: