Polling Problems, 2020 Edition
America saw inaccurate polling for House and Senate races across the country, with errors that generally overestimated Democratic support.
Hey, it wouldn’t be 2020 without something going wrong.
While the 2020 polls appear to have correctly predicted the winner of the national presidential race (Joe Biden), they generally overstated Democratic support by several points on average. Florida is a key example: Trump is on pace to win by over 3 points despite trailing by 2-3 percentage points in pre-election polling. So is Ohio, where Trump turned a narrow polling lead into an easy eight-point win. And the issues were not limited to the presidential race: polls turned out to be inaccurate for House and Senate races across the country, and those errors generally pointed in the same direction, overstating Democratic support in a wide range of races.
To be sure, election pollsters face challenges every year, and Nate Cohn’s explanation of the challenges facing pollsters in 2018 remains true today. This year added the further obstacle of polling during a pandemic and figuring out who would and would not vote in such a strange election year.
Nor did polls miss by as much as people thought on election night. As vote counting has continued since November 3, Biden’s margins over Trump have grown, bringing national results closer to the national polls. And as David Byler argues in the Washington Post, it’s also far too early to trash the polls as a whole. In many states the result fell within the normal margin of polling error, and in some, like Colorado, Minnesota, and Virginia, the polls were spot-on.
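For a sense of scale, here is a back-of-the-envelope sketch of our own, using a hypothetical 800-person state poll, of how much ground that “normal margin of polling error” actually covers:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# A hypothetical state poll of 800 likely voters with a candidate at 50%:
moe = margin_of_error(0.50, 800)
print(f"+/- {moe:.1%} on each candidate's share")  # about +/- 3.5%

# The margin on the *lead* (the gap between two candidates) is roughly
# double that, so a 2-3 point polling lead sits well within sampling
# noise alone, before any non-sampling error enters the picture.
```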
Nevertheless, the fact that surveys got the margin of Biden’s victory over Trump wrong (polls predicted a margin of about eight points between the two candidates) has pundits and the public alike questioning polls. So what’s going on? Below we outline some possible explanations, including differential non-response, the massive nationwide turnout, poll burnout, and late-deciding voters. As we learn more, we will be back with updates and additional insights from the industry.
Huge Turnout Adds Challenges
The record turnout for the 2020 election also posed a challenge. At present, experts project that turnout, as a percentage of the voting-eligible population, will hit 66.5%, the highest in a US election since 1900. As the Washington Post visualizes, turnout was up everywhere.
This surge in turnout is exciting from the perspective of public engagement with politics. But elections with unusually large turnout (such as referenda) also present challenges for pollsters looking to model the electorate: if the electorate looks notably different than in the past, approaches to modeling it that worked before may not work in a year of massively high turnout. And modeling turnout is a key challenge of election polling. Even working from the same baseline data, different pollsters can reach quite different conclusions about the result based on how they model turnout.
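To make that concrete, here is a minimal sketch, using invented respondents, of how two defensible turnout screens applied to the same raw sample can produce different toplines:

```python
# Invented respondents: (candidate, self-reported vote likelihood, voted in 2016?)
respondents = [
    ("Dem", 0.9, True), ("Dem", 0.8, True), ("Dem", 0.6, False),
    ("Rep", 0.9, True), ("Rep", 0.7, False), ("Rep", 0.5, False),
]

def dem_share(sample):
    return sum(1 for cand, *_ in sample if cand == "Dem") / len(sample)

# Model A: count only past voters as likely (reasonable in a normal year).
model_a = [r for r in respondents if r[2]]

# Model B: count anyone at 50%+ self-reported likelihood (reasonable
# if you expect a turnout surge that pulls in irregular voters).
model_b = [r for r in respondents if r[1] >= 0.5]

print(f"Model A (past voters only): Dem {dem_share(model_a):.0%}")  # 67%
print(f"Model B (broad screen):     Dem {dem_share(model_b):.0%}")  # 50%
```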
Likely Voter Models
Most polling organizations divide the electorate into very likely versus somewhat or not likely voters. Each organization keeps the proprietary ingredients of its model under lock and key; few are transparent about their secret sauce. But many models try to recreate the demographic composition of past electorates, which is harder to do if certain individuals are under- or overrepresented in the sample, and harder still if turnout varies drastically from one year to the next.
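One public reference point is the Perry-Gallup index, which scores respondents on a battery of engagement questions and keeps the top slice of the sample, sized to expected turnout. A simplified sketch in the spirit of that approach (the questions and equal weights below are illustrative, not any organization’s actual model):

```python
# Simplified, illustrative likely-voter index. Each item is a yes/no
# answer (True/False), so the score is just a count of engagement signals.

def likely_voter_score(r: dict) -> int:
    return sum([
        r["intends_to_vote"],      # plans to vote this year
        r["high_interest"],        # follows the campaign closely
        r["knows_polling_place"],  # knows where to vote
        r["voted_last_election"],  # self-reported (or validated) past vote
    ])

def likely_electorate(sample: list[dict], expected_turnout: float) -> list[dict]:
    """Keep the top slice of the sample, sized to the turnout assumption."""
    ranked = sorted(sample, key=likely_voter_score, reverse=True)
    return ranked[: int(len(ranked) * expected_turnout)]
```

The pivotal input is the turnout assumption: feed the same sample a 60 percent expectation versus this year’s 66.5 percent and the modeled electorate, and often the topline, shifts with it.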
Differential Non-Response?
One potential cause of the 2016 polling miss was differential non-response: the idea that Trump supporters were systematically less likely to respond to surveys. But post-election analyses in 2016 didn’t find much evidence for this, instead pointing to a different issue: underrepresentation of non-college whites in polls. Since this group of Americans was particularly likely to support Trump in 2016, underrepresenting it in polls skewed the overall results. As a corrective, pollsters can weight polls by education (and race) to compensate. But as Sam Wang points out in his post-election writeup, some groups remain difficult to reach, like non-college and Hispanic voters, and it was among these groups that Trump gained in 2020.
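Here is what weighting by education looks like in miniature. The population targets below are invented; in practice they might come from a source like the Census Bureau’s Current Population Survey:

```python
# Post-stratification weighting by education, in miniature.
# Targets and sample composition are invented for illustration.

population_share = {"college": 0.36, "non_college": 0.64}

# A sample where non-college respondents are underrepresented:
sample = ["college"] * 55 + ["non_college"] * 45

sample_share = {g: sample.count(g) / len(sample) for g in population_share}
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)  # non_college ~1.42, college ~0.65

# The catch: weighting fixes *how many* non-college respondents count,
# not *which* ones respond. If those who answer differ politically from
# those who never pick up, the bias survives reweighting.
```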
There is another group that may have been underrepresented in 2020 polls, which David Shor has pointed to as an underlying issue: low-social-trust voters. Low social trust, as measured in major surveys like the General Social Survey, reduces the likelihood of responding to public opinion surveys. As Shor points out, non-college whites with low social trust were both systematically less likely to respond to surveys and far more likely to vote for Trump.
In 2020, differential non-response is back as a potential problem. As Andrew Gelman and G. Elliott Morris note in their Nov. 6 analysis, the average prediction was off by 2.5 points, which they hypothesize was caused by a combination of differential non-response to polls and differential turnout.
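A toy simulation of our own (with invented response rates, not figures from their analysis) shows why even a modest gap in willingness to respond matters:

```python
import random

random.seed(2020)

# Toy model: a 50/50 electorate in which one side is slightly less
# likely to answer pollsters. The response rates are invented.
N = 1_000_000
RESPONSE_RATE = {"Dem": 0.050, "Rep": 0.045}  # Reps respond 10% less often

poll = []
for _ in range(N):
    voter = "Dem" if random.random() < 0.5 else "Rep"
    if random.random() < RESPONSE_RATE[voter]:
        poll.append(voter)

print(f"True Dem share: 50.0% | Polled: {poll.count('Dem') / len(poll):.1%}")
# Roughly 52.6% Dem: a 10% response-rate gap alone yields an error in
# the same ballpark as that 2.5-point average miss, and a bigger
# sample does nothing to shrink it.
```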
Poll Burnout and Late Deciders
Another potential problem is poll burnout. Given the high demand for polling data in battleground states (and districts), potential respondents may simply be overwhelmed by requests. Does an increasing number of polling contacts end up shifting the population that responds?
Lastly, and hardest for polls to capture: how does the small share of Americans who remain undecided on Election Day end up voting, if they vote at all? This is a source of error that surveys have a difficult time adjusting for, since pollsters have to wrap up fieldwork a few days ahead of Election Day to do the data work needed to publish their results.
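The convention a pollster uses to handle those undecideds can itself move the reported margin. A quick sketch with invented numbers:

```python
# Invented final poll: Dem 49%, Rep 45%, undecided 6%.
dem, rep, und = 0.49, 0.45, 0.06

# (a) Allocate undecideds proportionally to decided support.
dem_a = dem + und * dem / (dem + rep)
rep_a = rep + und * rep / (dem + rep)

# (b) Assume undecideds break 2:1 toward the trailing candidate, as
# some analysts argued late deciders did toward Trump in 2016.
dem_b = dem + und / 3
rep_b = rep + und * 2 / 3

print(f"Proportional: Dem {dem_a:.1%} - Rep {rep_a:.1%}")  # 52.1% - 47.9%
print(f"2:1 break:    Dem {dem_b:.1%} - Rep {rep_b:.1%}")  # 51.0% - 49.0%
```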
#NotAllPolls
Of course, not all polls missed this year. The final Selzer Iowa Poll, the gold standard of all gold standards, caused some panic among Democrats when it showed Trump rolling to a seven-point win. Ann Selzer was right on, and pundits who doubted her (like David Axelrod) quickly owned up and ate their crow.
Furthermore, election polling is hardly the only use of public opinion research. We at the Chicago Council are issue pollsters, not election pollsters. In that role, we aim to study and present the views of the American public on vital issues, like America’s role in the world, US-China relations, and more. We place greater emphasis on trends in opinion (for example, whether the public is more likely to favor engaging with the rest of the world one year versus another) than on quantifying a precise percentage at a single snapshot in time.
Come Back in Six Months
Why can’t we say for certain what happened? Well, for one, states are still counting votes! It’s hard to know exactly how far off the polls were until we have final results. For another, many of the more detailed analyses that organizations like the American Association for Public Opinion Research (AAPOR) will conduct rely on validated voter file information. That data is not available yet, and won’t be for a while. We’d recommend checking back around May and tuning into the discussions at AAPOR’s 2021 conference, when pollsters will come together (virtually) and sort through the data.