Unveiling the realities of synthetic respondents - Part One

11 May

Part one in a three-part series that explores synthetic respondents and their impact on the market research industry.

4 min read

A cottage industry selling survey-like AI-generated data will have you believe a revolution is taking shape. But they should hold their horses.

The name "synthetic respondents" is of course a misnomer. Unlike respondents (i.e. real people), synthetic respondents are answers generated by large language models (or, as you will learn in this series, by some other unspecified methodology). However, the synthetic respondent industry and some in academia are presenting it as a legitimate research method, overplaying its applicability.

In this three-part series, I will begin with an introduction to how synthetic respondents are normally generated. Next, I will examine the reliability and validity of synthetic respondents and cover the sources of error involved. And finally, I will discuss the allure of synthetic respondents and the effects of this trend specifically on the market research industry. This first article will give an introduction to synthetic respondents, explaining how they work.

How do synthetic respondents work?
Synthetic respondents are artificial survey responses generated with large language models (LLMs).

From the research user's perspective, reports generated from synthetic responses have the same shape and form as reports based on real data. They are typically structured around survey questions. They purport to have a certain number of responses. And the same types of analysis, such as tabulation, correlation, and multidimensional scaling, can be performed on them.

Generating responses from LLMs takes a fair bit of preparation (i.e. building out a platform that does it), but then only a few minutes per new report. Typically, synthetic respondent sellers will use prompts like this:

Example prompting of GPT-3.5 for generating synthetic responses

Pre-prompt: You are a 45-year-old mother of three children. You eat bread for breakfast, spaghetti for lunch, and you like to vary your meals for dinner. Your budget for everything is limited because you are on a government allowance. Next, please answer several questions asked of you. In giving answers, you will strictly follow the required format and will not add any additional words.

Prompt: How many loaves of bread do you buy per week? (Response format: Numeric)

LLM completion: 2

Prompt: What types of bread do you normally buy? 1. White bread 2. Wholemeal bread 3. Rye bread 4. Other type of bread (Response format: Integer corresponding to the selected option)

LLM completion: 1

Prompt: Describe the taste of the bread that you like the most. (Response format: Open-ended text response up to 255 characters)

LLM completion: Soft, fluffy, and slightly sweet, with a light crust.

This process is repeated multiple times with slight variations of the pre-prompt to match different demographic segments. 
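The loop described above can be sketched in a few lines of Python. Everything here is illustrative: `complete` is a hypothetical stand-in for a real LLM API call, and the persona fields and questions are assumptions modelled on the example prompts, not any vendor's actual platform.

```python
import random

# Hypothetical stand-in for an LLM call. A real implementation would send the
# pre-prompt and question to a chat-completion API; here we just return a
# plausible-looking integer answer so the sketch runs without network access.
def complete(pre_prompt: str, question: str) -> str:
    return str(random.randint(1, 4))

# Persona template, varied per demographic segment (fields are illustrative).
PERSONA_TEMPLATE = (
    "You are a {age}-year-old {role}. Your budget is {budget}. "
    "Next, please answer several questions, strictly following the required format."
)

SEGMENTS = [
    {"age": 45, "role": "mother of three", "budget": "limited"},
    {"age": 28, "role": "single professional", "budget": "comfortable"},
]

QUESTIONS = [
    "How many loaves of bread do you buy per week? (Response format: Numeric)",
    "What types of bread do you normally buy? 1. White 2. Wholemeal 3. Rye 4. Other "
    "(Response format: Integer corresponding to the selected option)",
]

def generate_synthetic_responses(n_per_segment: int = 3) -> list[dict]:
    """One row per fake 'respondent': the pre-prompt is varied slightly,
    then every survey question is asked under that persona."""
    rows = []
    for segment in SEGMENTS:
        pre_prompt = PERSONA_TEMPLATE.format(**segment)
        for i in range(n_per_segment):
            varied = f"{pre_prompt} (respondent variant {i})"
            row = {"segment": segment["role"]}
            for question in QUESTIONS:
                row[question] = complete(varied, question)
            rows.append(row)
    return rows

rows = generate_synthetic_responses()
print(len(rows))  # 6 rows: 2 segments x 3 respondents each
```

The output deliberately mimics a real survey data file: one row per "respondent", one column per question, which is what lets these datasets pass for fieldwork downstream.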

Fine-tuning of models. Some companies claim to “train” (by which they typically mean “fine-tune”) large language models on people’s answers to past surveys. To do so, they either:

  1. Create a separate fine-tuned version of an LLM (such as Llama) for each personality they pretend to represent. However, this approach would be expensive, making it unlikely that companies pursue it in practice.

  2. Create a single fine-tuned LLM that combines different respondent demographics and personalities into one model. This method poses legal risks: if they receive a “right to be forgotten” request, they would be obligated to delete the whole model.

In fairness, both options carry legal risk, because it is unlikely that the original respondents consented to such processing of their responses.

After learning how synthetic respondents work, you might ask: why bother generating individual responses when AI could simply generate the final report? The main reason is to provide internal consistency of the results. If you want a count analysis on synthetic responses, you can have it. If you want correlations, you can. If you want any other type of analysis, you can have that too.

We’ve all been trained at school to think that internal consistency is a marker of the quality of intellectual labour. Fake response generation exploits that view.

In the next instalment of this series, we will delve deeper into the issues surrounding synthetic respondents, exposing their fundamental unreliability and invalidity, as well as exploring the various sources of error that can arise when using these artificial datasets.

For a full list of references and appendices, view the original article here: https://conjointly.com/blog/synthetic-respondents-are-the-homeopathy-of-market-research/