It’s now about two months, 60 days, until actual voting begins in that fascinating exercise in democracy, the Iowa caucuses. A few days later the good folks of New Hampshire follow with another great American tradition, their first-in-the-nation primary and its own unique brand of retail politics. As always, the results from these two states will greatly influence, but not necessarily determine, the other battles that lead to two nominees in Cleveland and Philadelphia.
Hillary Clinton and Donald Trump now hold comfortable leads in almost all the national polls and most state surveys, though Bernie Sanders is neck and neck with Clinton in the Granite State. What does history say about the predictive power of national and state polls at this point in the process? Should we trust what the current polls are telling us?
The short answer is no.
Take a look at Table 1 below. It illustrates why the answer to that question is no in the case of the national polls.
Table 1: National polling averages two months out from Iowa, 2008 and 2012
Two months before the three contested nomination battles in 2008 and 2012, the RealClearPolitics (RCP) national poll averages showed a leader different from the eventual Iowa winner. Worse, they did not point to even one of the three eventual nominees.
Neither Clinton nor Rudy Giuliani won Iowa or the nomination in 2008. Herman Cain didn’t win Iowa or the nomination in 2012 — he dropped out before Iowa. Clinton did narrowly win New Hampshire, but Giuliani and Cain did not. In the end, Barack Obama, Mike Huckabee, and Rick Santorum were the three Iowa winners. All three surged in the two months before Iowa and even in the last few weeks.
But, what of the state polls in Iowa two months before the caucuses? Were they accurate in their implicit predictions? Again, no.
In all three caucus battles in 2008 and 2012, the RCP average of the results from polls of Iowa voters two months before the caucuses had someone other than the ultimate Iowa winner in the lead.
Table 2: Iowa polling averages and actual results, 2008 and 2012
These state polls showed Clinton, Mitt Romney, and Cain ahead while the winners in Iowa were Obama, Huckabee, and Santorum. The Iowa polls were more accurate than the national polls’ implicit predictions but still quite inaccurate.
David Byler of RCP has done the nation’s political junkies a favor and calculated the accuracy of the polls in both Iowa and New Hampshire at various points before the vote. Two months before the elections these state polls have only 50% predictive accuracy. The predictive accuracy does not get to 90% until days before the vote and never reaches 100%.
A good example is Santorum in Iowa in 2012. His RCP average in the final polls was 16.3%, and none of the polls had him over 18%. He won with 24.56%, just 34 votes ahead of Romney. Notably, he stood at only 4% in the national polls just before Iowa, yet he ended up essentially finishing second to Romney in the nomination process.
There is a lot of hyperventilating going on now about the unexpected character and characters of this year’s nomination battles. All should remember that a great deal will almost certainly change before Iowa and New Hampshire and then change further because of the results in the Hawkeye and Granite states.
There are several methodological reasons why pre-election polls predict poorly. These problems plague pollsters because they are very expensive and difficult to solve. Even if you commit the money, there is still no guarantee that your sample contains only the people who will actually vote.
The first difficult methodological issue is the crucial decision as to how to select the initial sample for the poll. Most if not all of the current national polls start with a random sample of adults and narrow it to registered voters with a question or two. This point is where the current national polls begin to get methodologically weak.
They take almost all of the Republican or Democratic identifiers in their sample, as well as the partisan leaners, and ask them the vote questions. This sounds reasonable, but it is not. A national sample of registered voters stands in for about 145 million registered voters, including about 70 million registered Democrats and 55 million registered Republicans, and it is representative of those populations.
The problem is that a much smaller portion of these partisan registered voters will actually participate in their state primary or caucus. In 2008, a little less than 40 million people voted in the Democratic nomination contests. But given the competitiveness of the Clinton-Obama race, that was something of an outlier. On the GOP side that cycle, a bit more than 20 million voters participated. And the 2012 Republican race, which was the sole focus of the nomination season, actually had fewer primary voters and caucus goers than in 2008 (slightly less than 20 million). Thus, we can figure that perhaps anywhere from one-half to two-thirds of the interviewed respondents in these polls won’t actually vote in a primary or caucus. In other words, too many non-primary voters are interviewed.
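The arithmetic behind that estimate can be sketched in a few lines. This is a rough illustration only: it uses the rounded figures quoted above (in millions), it assumes today's registration totals can be set against 2008 and 2012 turnout, and the names `REGISTERED`, `TURNOUT`, and `share_not_voting` are made up for the example. Because the polls also interview partisan leaners, the true share of non-voters among respondents is probably somewhat higher than these numbers suggest.

```python
# Rounded figures from the article, in millions. These are approximations,
# and registration totals are assumed to apply to both earlier cycles.
REGISTERED = {"Democratic": 70, "Republican": 55}
TURNOUT = {
    ("Democratic", 2008): 40,  # "a little less than 40 million"
    ("Republican", 2008): 20,  # "a bit more than 20 million"
    ("Republican", 2012): 20,  # "slightly less than 20 million"
}


def share_not_voting(registered_millions, turnout_millions):
    """Fraction of registered partisans who skip the primary or caucus."""
    return 1 - turnout_millions / registered_millions


non_voter_shares = {
    (party, year): share_not_voting(REGISTERED[party], turnout)
    for (party, year), turnout in TURNOUT.items()
}

for (party, year), share in non_voter_shares.items():
    print(f"{party} {year}: about {share:.0%} of registered partisans did not vote")
```

With these inputs the Democratic share comes out a little above four in ten for the unusually competitive 2008 race, and the Republican share a little under two-thirds, which is roughly the one-half to two-thirds range described above.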
The state polls often, but not always, do a better job with their initial sample. Most start with a sample of registered voters from their state’s secretary of state or whoever maintains those rolls. They also get information on which of those voters took part in previous primaries. They then use various methods to supplement this narrow sample with newly registered voters and other registered voters who are especially motivated and unusually likely to vote in a primary this election. In 2016, for example, many Trump and Sanders supporters have shown very high intensity.
But, even with these adjustments it is hard to be sure you have the actual “likely electorate.” Thus many of the surprises we get on election nights during the primaries are caused by these methodological difficulties.
Bottom line, the national polls are very poor predictors of primary/caucus results. While the state polls are better, they are still faced with the very difficult task of basing their predictions on what is at best an artful guess as to which individuals in the samples will actually vote.
The current polls almost surely do not provide a good read on the winners of the early February contests in Iowa and New Hampshire. Rather than relying on these polls to handicap the races, an eclectic multi-factor approach is necessary. That analysis needs to take into account history, candidate strengths and weaknesses, money, ground games, likely turnout, intensity of support, and interest in the election, among other factors.
The national polls will not tell us much about the winners in Iowa and New Hampshire. The state polls in those leadoff states will only get more accurate in their implicit predictions in the days just before the voting, when the folks that are actually going to vote have made up their minds.
Stay tuned as we are in for a political roller-coaster ride.
Alfred J. Tuchfarber is Professor Emeritus of Political Science at the University of Cincinnati and founded the Ohio Poll as part of a long career studying American politics and survey methodology.