What 1.8 million survey records tell us about data quality, and why your organisation should join the next wave

19 June

Author: Ray Poynter

Esomar Reports

In a typical online survey today, a large share of the responses collected never reach the analysis stage. Across the 1.8 million records in GDQ's latest benchmarking study, Research Agencies removed 29% of responses before analysis and Suppliers removed 21%. In some business-to-business samples, more than half of the people who completed the survey were excluded afterwards. The reassuring reading is that the problems are being caught. The harder one is that using survey data exactly as it arrives is no longer a credible option.

That is the picture from Wave 2 of the Global Data Quality (GDQ) benchmarking report, published this month and built on records contributed by 46 companies across 13 countries, collected between October 2025 and March 2026. The project is research-on-research: a shared, industry-wide effort to define what "good" data quality actually looks like, and to track it consistently over time. It is run on GDQ's behalf by the Insights Association, with Esomar one of the member bodies supporting it.

The numbers below repay a close look, and they make a case I want to put directly to you. We need many more organisations to take part.

Removal rates are high, and they depend on who you are

Across the global benchmarks, Research Agencies recorded total pre-survey plus in-survey removal of 29%, edging ahead of Suppliers at 21%. That reverses the Wave 1 pattern, although the gap is modest and partly a product of Wave 2's revised approach of reporting in-survey fraud and behavioural terminations together. The clearest difference lies in in-survey fraud removals, where Research Agencies report 12% compared with Suppliers' 3%, offset to some degree by Suppliers' higher behavioural terminations.

The headline to hold on to is not which side is higher. It is that double-digit removal has become the norm, a point I return to below.

B2B remains the hardest audience

General B2B again shows the highest combined pre- and in-survey removal at 23%, and the post-survey picture is striking: 60% of qualified B2B completes are removed on review after completion. The country summaries underscore how back-end-heavy B2B has become, with post-survey cleanout ranging from 48% to 95% across markets with sufficient sample size. For anyone buying or selling B2B samples, this is a reminder that front-end screening alone does not solve the problem. Two features of B2B work add to the difficulty: incentives can be much higher, and sample sizes are often much smaller.

Country patterns reward a closer look

Several countries stand out, and the report candidly notes that they reflect a mix of genuine market dynamics, smaller contributor pools, and methodological variance. Australia shows the highest removal in the report, with pre-survey blocking alone reaching 41% to 56%. Canada shows the steepest post-survey cleanout, peaking at 74% for B2C. France records an unusually high B2C pre-survey block rate of 46%, second only to Australia.

What sold versus actual incidence tells us

Incidence rate is the proportion of people reaching a survey who qualify to take part. The project records two versions. "Sold" is the incidence that is quoted, and on which the prices and plans are agreed. "Actual" is the incidence the project delivers once fielding is complete. The gap between them matters commercially. When actual runs below sold, a study is harder to fill than expected, which pushes up cost, lengthens timelines, and puts more strain on the sample source to find qualifying respondents. A narrowing gap indicates more accurate feasibility and pricing.

On that measure, Wave 2 brought good news. The sold-to-actual gap narrowed globally compared with Wave 1. Research Agencies sold an average incidence of 62% against an actual of 59%, a gap of around 3 points, while Suppliers sold 54% against an actual of 46%, a wider gap of around 8 points. Suppliers, therefore, still have more ground to close than Research Agencies.
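For readers who want to see the arithmetic, here is a minimal sketch in Python. It only restates the Wave 2 averages quoted above; the function name `incidence_gap` is my own illustration, not something taken from the GDQ report.

```python
def incidence_gap(sold_pct, actual_pct):
    """Return sold minus actual incidence, in percentage points.

    A positive value means the study qualified fewer people than quoted.
    """
    return sold_pct - actual_pct


# Wave 2 global averages as quoted in the article: (sold %, actual %).
wave2 = {
    "Research Agencies": (62, 59),
    "Suppliers": (54, 46),
}

gaps = {who: incidence_gap(sold, actual) for who, (sold, actual) in wave2.items()}

for who, gap in gaps.items():
    print(f"{who}: sold-to-actual gap of {gap} points")
```

The same subtraction can be run per project: a gap that stays close to zero across studies suggests that feasibility estimates and pricing are holding up in field.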

One key defence against fraud is link encryption. The headline adoption figures for link encryption (Research Agency 46%, Supplier 50%) dropped against Wave 1, but roughly half of all records are missing this field. The sensible reading is that reporting completeness, not adoption, is the immediate problem to fix. Better data in means better benchmarks out.

The data is being cleaned, and that cleaning is not optional

Step back from the individual figures, and a clear message emerges. At every stage and in almost every market, substantial volumes of poor-quality responses are identified and removed before the data reaches analysis. That is encouraging. Problems are being caught at the front gate, during the survey, and on review after completion, rather than flowing through into findings that a client then acts on. The quality control system is doing its job.

The scale of that effort is the other half of the story. When combined removal routinely runs into the double digits, and post-survey cleaning of some B2B samples removes more than half of the completes, it is no longer credible to treat raw survey data as analysis-ready. Using the data exactly as collected, without screening, fraud checks, deduplication and attention controls, is simply not an option. Quality is now something the industry must actively produce on every project, not something it can assume is already there.

This is why participation matters

Benchmarks of this kind work on a simple principle: the more organisations contribute, the more representative and useful the numbers become. Several of the most interesting findings in Wave 2 come with caveats, precisely because the contributor pool in some countries and study types remains small. Wider participation turns "directional, treat with caution" into "robust enough to act on."

From a standards perspective, this matters more than the individual numbers. I should declare an interest here. I am the chair of Esomar's Professional Standards Committee, and through that role, I sit on the board of GDQ. Data quality is at the heart of professional standards, which is why this project matters to me and why I want to see it succeed. Esomar and its sister bodies can write principles about data quality, but principles are stronger when the industry can see, in real figures, what current practice looks like and where it falls short. A well-populated benchmark is a way for the profession to hold itself accountable, rather than simply asserting that quality is improving. That is the link between this project and the Committee's work, and it is why I am asking colleagues across our industry to contribute.

Taking part is also self-interest in the best sense. Once you contribute, you can compare your own studies against the industry overall and by country, company type, and study type; see where you sit relative to the norm; and identify the white spaces where your business might innovate. The framing throughout the project is "benchmark, don't judge." These are reference points for improvement, not a scorecard.

Wave 3 will make taking part easier

The next wave is being redesigned to lower the barrier to entry. The headline change under consideration is a move from respondent-level records to aggregated survey-level metrics. That single shift does three useful things at once: it widens the range of people who can pull and deliver the data, it opens participation to brand-side researchers for the first time, and it removes the anonymisation concerns that come with sharing individual-level records. Wave 3 will also move to a fixed six-month cadence, with submission deadlines of 15 October and 15 April each year.
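To make the difference between respondent-level records and survey-level metrics concrete, here is a rough sketch in Python of what the aggregation step might look like. The status labels, the output field names, and the choice of all records as the denominator are my assumptions for illustration; the actual Wave 3 template has not been published and may define these differently.

```python
from collections import Counter

# Hypothetical status labels; the real GDQ submission template may differ.
STAGES = ("pre_survey_block", "in_survey_removal", "post_survey_removal", "kept")


def survey_level_metrics(statuses):
    """Collapse one survey's respondent-level statuses into aggregate shares.

    Assumption: every share uses all supplied records as its denominator.
    """
    counts = Counter(statuses)
    unknown = set(counts) - set(STAGES)
    if unknown:
        raise ValueError(f"unexpected status labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records supplied")
    metrics = {"records": total}
    for stage in STAGES:
        metrics[f"{stage}_pct"] = round(100 * counts[stage] / total, 1)
    return metrics


# Made-up example: ten respondent records from a single survey.
example = (
    ["pre_survey_block"] * 2
    + ["in_survey_removal"] * 1
    + ["post_survey_removal"] * 3
    + ["kept"] * 4
)
summary = survey_level_metrics(example)
print(summary)
```

Only the summary dictionary would leave the contributing organisation, which is why the anonymisation concern falls away: no individual respondent record is shared.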

If your organisation has held back until now because contributing felt complicated or sensitive, this is the wave to revisit that decision. As a member of GDQ, Esomar will be encouraging participation, and the working group welcomes feedback ahead of the Wave 3 call for data.

Read the full report

The complete Wave 2 report, with the full country and study-type breakdowns, is available to download as the GDQ Wave 2 Data Quality Benchmarking Report.

About GDQ

The Global Data Quality (GDQ) initiative is an industry-wide, research-on-research project to understand and improve data quality. It brings together brands, research agencies, panel providers, technology platforms, digital analytics firms and consultancies to contribute anonymised data from online survey projects, which is aggregated into shared benchmarks. The aim is to establish globally accepted data-quality buying signals and fuel innovation that strengthens the entire research ecosystem. Esomar is one of the member bodies behind GDQ, and the benchmarking project is managed for GDQ by the Insights Association. You can learn more at globaldataquality.org.

Ray Poynter

Chair of the Professional Standards Committee at Esomar

Ray has spent the last 45 years at the intersection of insights, research, and new thinking. Ray has held director-level positions with companies such as The Research Business, IntelliQuest, Millward Brown, and Vision Critical. Ray is committed to the research and insights industry, having been a member of Esomar for over 30 years and a fellow of the MRS.

In recent years, Ray’s work has focused on training, writing, speaking and sharing. Ray has run training workshops for a variety of national and international organisations, including RANZ, TRS, JMRA, MRS and Esomar. Ray has written textbooks, taught at Saitama and Nottingham Universities, regularly blogs, and is active on social media.

In 2023, Ray was elected President of Esomar.
