Polling, modeling, forecasting

For any scientific task, the general principle is that more data are better than fewer. Garbage-quality data, once recognized, can be thrown away, and all the decent data should be analyzed together. In the case of the coming United States election, there are aggregators worth consulting (see “2 of 3 Fil-Ams are for Kamala,” 8/17/24).

As of Aug. 30, 2024, the national polls are 47.2 percent for Kamala Harris and 43.7 percent for Donald Trump, or +3.5 points for Harris, as reported by abcnews.go.com. In the national tracker of Silver Bulletin (8/30/24), Harris is up by 3.8 points. The consensus is that the Democratic Party, with its new candidate Harris, has caught up in terms of the national contest. The vital question is whether she is also likely to win 270 votes in the US electoral college (EC), recalling the 2016 case where the Democrat Hillary Clinton won the popular vote, and yet the Republican Trump won the election.

Based on long election history, there are some states considered “safe” for the Republican Party; these are the “red” states. There are some considered “safe” for the Democratic Party; these are called “blue” states. The following are the seven so-called “swing” states (with their EC votes in parentheses, totaling 93) that flip their color back and forth, preventing either party from monopolizing the White House: Pennsylvania (19), Georgia (16), North Carolina (16), Michigan (15), Arizona (11), Wisconsin (10), and Nevada (6). Note that Kamala’s running mate is a politician from Minnesota (10 EC votes), which is not one of these seven; choosing him rather than one from Pennsylvania meant forgoing a possible home-state advantage in the largest swing state.

When a statewide poll is reported as having an error margin of plus/minus 4 percentage points, which equals 1/25, that means it came from a sample of about 25×25=625 respondents. If its error margin is plus/minus 5 points, which is 1/20, then it came from a sample of about 20×20=400 respondents. The rule, that the error margin at the usual 95 percent confidence level is about 1 divided by the square root of the sample size, is a statistical formula worth memorizing.
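For readers who like to check the arithmetic, here is a minimal sketch in Python. It compares the rule of thumb with the standard textbook formula for a simple random sample, assuming a 95 percent confidence level (z = 1.96) and a 50-50 split, which is the worst case. The function names are mine, chosen for illustration only.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Textbook margin of error (as a proportion) for a simple random sample.

    Assumes a 95 percent confidence level (z=1.96) and p=0.5 by default.
    """
    return z * math.sqrt(p * (1 - p) / n)


def rule_of_thumb_margin(n):
    """Shortcut: the margin is roughly 1 over the square root of n."""
    return 1 / math.sqrt(n)


def sample_size_for(margin):
    """Sample size implied by the shortcut, for a margin given as a proportion."""
    return round((1 / margin) ** 2)


if __name__ == "__main__":
    for n in (400, 625):
        shortcut = 100 * rule_of_thumb_margin(n)
        textbook = 100 * margin_of_error(n)
        print(f"n={n}: shortcut {shortcut:.2f} points, textbook {textbook:.2f} points")
```

The shortcut slightly overstates the textbook margin, because 1.96 × 0.5 is 0.98 rather than exactly 1, which is close enough for reading news reports of polls.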

The typical US statewide poll needs only a few hundred respondents, provided that the sample is random. (Personally, I don’t put much value in online sampling.) It doesn’t matter how many or how few voters the state has. The same formula works for a scientific provincial or citywide poll in the Philippines, no matter how large or small the voting district is. Minimizing sample size is important for lowering the cost of repeated polling over the course of a campaign.
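Why the number of voters hardly matters can be seen from the standard finite population correction, sketched below in Python. The three electorate sizes are hypothetical round numbers chosen only for illustration, and the calculation again assumes a simple random sample, 95 percent confidence, and a 50-50 split.

```python
import math


def margin_with_fpc(n, population, z=1.96, p=0.5):
    """Margin of error (as a proportion) with the finite population correction."""
    base = z * math.sqrt(p * (1 - p) / n)
    # The correction shrinks the margin only when n is a sizable share of the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return base * fpc


if __name__ == "__main__":
    # Hypothetical electorate sizes: a small city, a province, a large state.
    for population in (50_000, 1_000_000, 20_000_000):
        margin = 100 * margin_with_fpc(625, population)
        print(f"{population:>10,} voters: plus/minus {margin:.2f} points")
```

With a sample of 625, all three margins come out at roughly 3.9 points; the correction only starts to matter when the sample is a large fraction of the electorate.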

Many factors determine a voter’s choice; some are personal characteristics. In the US, about one-fourth call themselves Republicans and an equal one-fourth call themselves Democrats; almost half say they are independent, which implies they are swayable by the campaign.

Many Americans aren’t registered to vote, and many are registered but say they might not vote; bear in mind that election day there is a regular workday. In the US, a national voter turnout of 60 percent is quite high, whereas in the Philippines 80 percent is normal.

Various groups (racial, educational, occupational, etc.) are already known to lean blue or to lean red, but their turnout has room to either expand or shrink, and so needs to be forecast as well.

And what about the economic situation, state by state? This can be modeled into the election forecast, on the basis of historical data relating the state’s economy to its votes. Thus, some forecasts are labeled as “polls-only” while other forecasts also incorporate special models designed by the forecasters, in which the national vote, if included, would just be one of the factors.

Modeling is important for forecasting election outcomes in a parliamentary system. Whereas the US has 50 states, and as many pre-election polls at a time, the United Kingdom has 650 seats in its House of Commons, each won by a plurality choice among multiple candidates, like our system for the lower house.

In principle, 650 separate constituency-wide polls could be done to predict the winner of each seat in the Commons, but that would be rather expensive. In practice, the results of UK national elections are predicted by a model relating the national preference among the major parties to the parties’ share of the parliamentary seats.
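This column does not specify which model the UK forecasters use. One simple and well-known approach is the “uniform national swing,” which assumes that each party’s change in the national vote applies equally in every constituency. The Python sketch below illustrates that idea only; the four seats, the parties A, B, and C, and all the percentages are made up for illustration, and the function name is mine.

```python
from collections import Counter


def project_seats(previous_results, previous_national, current_national):
    """Project seat winners under a uniform national swing.

    previous_results maps each seat to its vote shares (percent) by party
    in the last election; the national figures are percent by party.
    """
    swing = {
        party: current_national[party] - previous_national[party]
        for party in current_national
    }
    seats = Counter()
    for shares in previous_results.values():
        # Apply the same national change to every constituency.
        projected = {
            party: share + swing.get(party, 0.0)
            for party, share in shares.items()
        }
        winner = max(projected, key=projected.get)  # plurality winner
        seats[winner] += 1
    return dict(seats)


# Made-up data: four equal-sized seats and three hypothetical parties.
previous_results = {
    "Seat 1": {"A": 48.0, "B": 40.0, "C": 12.0},
    "Seat 2": {"A": 44.0, "B": 42.0, "C": 14.0},
    "Seat 3": {"A": 35.0, "B": 50.0, "C": 15.0},
    "Seat 4": {"A": 41.0, "B": 38.0, "C": 21.0},
}
previous_national = {"A": 42.0, "B": 42.5, "C": 15.5}
current_national = {"A": 39.0, "B": 45.5, "C": 15.5}  # e.g., from one national poll

if __name__ == "__main__":
    print(project_seats(previous_results, previous_national, current_national))
```

In this toy example, party A held three of the four seats last time; a 3-point national swing from A to B is enough to reverse that to three seats for B. A small change in the national vote can produce a large change in seats, which is why one national poll plus a model can do the work of many local polls.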

In Singapore, the ruling party has nearly all of the 93 elected seats in parliament, even though it gets only 60 percent of the popular vote. It does not need 93 constituency-wide polls to be assured of dominance. One national poll will do.

—————-

Contact: mahar.mangahas@sws.org.ph.
