Pre-election Forecast and Post-election
True Vote Analysis: Data, Assumptions and Methodology
Mar. 5, 2009
The 2008 Election Model (EM) and the Election Calculator (EC) consist of three basic components: recorded (“official”) data, assumptions and methodology. The recorded, official vote data and calculation methods are easily verified; it’s the estimates for the base case assumptions that are the subject of debate. This comprehensive summary will show that the assumptions are based on the best available data.
The base case assumptions are best estimates derived from the following data sources: 2004 and 2008 official recorded vote, pre-election state and national polls, unadjusted state and national exit polls, voter mortality tables, historical returning voter turnout, and Census total votes cast. Because of the margin of error in the polling data and other assumptions, a sensitivity analysis (“stress test”) was provided in both models to examine the effects of changes in the assumptions and determine which of the assumptions are most critical.
2008 Election Model (EM): a pre-election, state and national poll-based popular and electoral vote projection model
Recorded Data
Final State and National polls
Assumptions
Undecided voter allocation (UVA) is based on historical election data and professional pollster practice.
Methodology
The State projection is based on the average of the final state polls plus UVA.
The National projection is based on the average of the final national polls plus UVA.
The probability (P) of winning a state is based on the final projected vote shares, assuming a 3% MoE.
The theoretical Electoral vote is the expected value Σ P(i) * EV(i), i = 1, ..., 51 (the 50 states and DC).
Obama won all 5000 Monte Carlo election trials based on the final state polls; therefore he had a virtual 100% win probability.
EM Sensitivity Analysis: calculates the effects of changes in UVA on the projected vote shares.
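The methodology steps above can be sketched in a few lines of Python. This is a minimal illustration, not the EM's actual implementation: the three states and their poll numbers are hypothetical, the 60% UVA is an assumed value, and the win probability assumes that the 3% MoE is a 95% interval on the two-party share with normally distributed error.

```python
import random
from statistics import NormalDist

MOE = 0.03            # 3% margin of error, as stated in the text
SIGMA = MOE / 1.96    # assumption: the MoE is a 95% interval (normal error)
UVA = 0.60            # assumed share of undecided voters allocated to Obama

# Hypothetical inputs for illustration only, not actual 2008 polls:
# (state, Obama poll average, McCain poll average, undecided, electoral votes)
POLLS = [
    ("State A", 0.52, 0.44, 0.04, 10),
    ("State B", 0.47, 0.48, 0.05, 20),
    ("State C", 0.45, 0.51, 0.04, 15),
]


def projected_two_party_share(dem, rep, undecided, uva=UVA):
    """Allocate undecided voters, then return the Democratic two-party share."""
    dem_proj = dem + uva * undecided
    rep_proj = rep + (1.0 - uva) * undecided
    return dem_proj / (dem_proj + rep_proj)


def win_probability(share, sigma=SIGMA):
    """P(true two-party share > 50%) under the normal error model."""
    return 1.0 - NormalDist(mu=share, sigma=sigma).cdf(0.5)


def state_probabilities(polls):
    """List of (win probability, electoral votes) per state."""
    return [
        (win_probability(projected_two_party_share(d, r, u)), ev)
        for _, d, r, u, ev in polls
    ]


def expected_electoral_vote(polls):
    """Expected value: sum of P(i) * EV(i) over all states."""
    return sum(p * ev for p, ev in state_probabilities(polls))


def monte_carlo_win_rate(polls, trials=5000, seed=1):
    """Fraction of simulated elections won with a majority of electoral votes."""
    rng = random.Random(seed)
    probs = state_probabilities(polls)
    needed = sum(ev for _, ev in probs) // 2 + 1
    wins = 0
    for _ in range(trials):
        ev_won = sum(ev for p, ev in probs if rng.random() < p)
        if ev_won >= needed:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(f"expected EV: {expected_electoral_vote(POLLS):.1f}")
    print(f"simulated win rate: {monte_carlo_win_rate(POLLS):.1%}")
```

With all 51 contests and the final state polls as input, the same loop yields the share of trials won, which is how a result such as "all 5000 trials" would be read as a win probability.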
2008 Election Calculator (EC): a post-election True Vote model based on a feasible, calculated returning voter mix and National Exit Poll vote shares
Recorded Data
2004 and 2008 official vote
Assumptions
Annual voter mortality: U.S. mortality rates
Uncounted votes: U.S. Vote Census
Obama and McCain share of returning and new voters: 2008 National Exit Poll
Methodology
Returning 2004 voters are calculated based on 2004 and 2008 official vote data and the assumptions.
National Exit Poll shares of returning and new voters are applied to derive the True Vote.
EC Sensitivity Analysis: calculates the effects of assumption changes on the True Vote.
Undecided Voters
“In the final USA TODAY/CNN/GALLUP poll before the election, President Bush held a 49-47 edge over Sen. John Kerry when the undecided voters were not allocated to a particular candidate. When Gallup, using a statistical model that assumes that 9 of 10 of those voters would support Kerry, allocated the voters, the poll ended as a dead heat with each candidate garnering 49%. The Gallup allocation formula is based on analyses of previous presidential races involving an incumbent”.
How does Gallup decide how to "allocate" undecided voters?
This is what Frank Newport, Editor in Chief of the Gallup Poll, said about undecided voters just before the 2004 election:
The allocation procedure is a Gallup tradition, and represents Gallup scientists' best estimate of what the final popular vote will be on Election Day.
Here's how it works. The unallocated numbers in the pool of likely voters (that is, the percentages of likely voters supporting Bush and Kerry, not including undecided voters) are 49% for Bush and 47% for Kerry. We assume, based on an analysis of previous presidential and other elections, that there is a high probability that the challenger (in an incumbent race) will receive a higher percentage of the popular vote than he did in the last pre-election poll, while there is a high probability that the incumbent will maintain his share of the vote without any increase. This has been dubbed the "challenger rule." There are various explanations for why this may occur, including the theory that any voter who maintains that he or she is undecided about voting for a well-known incumbent this late in the game is probably leaning toward voting for the challenger.
This persistent historical pattern is the basis for Gallup's decision to allocate the 3% undecided vote to Kerry and Nader/other, making the final estimate 49% Bush, 49% Kerry, and 2% Nader/other.
Zogby said this a few days before the election:
“The key reason why I still think that Kerry will win… traditionally, the undecideds break for the challenger against the incumbent on the basis of the fact, simply, that the voters already know the incumbent, and it's a referendum on the incumbent. And if the incumbent is polling, generally, under 50 percent and leading by less than 10, historically, incumbents have lost 7 out of 10 times. In this instance you have a tie, a President who is not going over 48, undecideds who tell us by small percentages that the President deserves to be reelected. And in essence, it gives all the appearances that the undecideds -- the most important people in the world today -- have made up their minds about President Bush. The only question left is: Can they vote for John Kerry? If it's a good turnout, look for a Kerry victory. If it's a lower turnout, it means that the President has succeeded in raising questions about John Kerry's fitness”.
Note: Final Zogby Election Day polling had Kerry winning by 50-47%, with 311 electoral votes, indicating that 75% of undecided voters broke for Kerry. It was not a good turnout; it was a great turnout. Officially, 122 million voted in 2004, compared to 105m in 2000, a net increase of 17m. But a closer analysis indicates that there must have been close to 30 million new voters. Here’s why: Approximately five million 2000 voters died prior to 2004. Assuming 95% turnout, another five million did not vote, so only 95m former 2000 voters returned to the polls in 2004. In addition, approximately three million ballots in 2004 were uncounted (a total of 125m were cast). Preliminary National Exit Polls indicated that Kerry won 57-62% of new voters, or 6m more than Bush.
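The new-voter arithmetic in the note above can be checked directly. The sketch below uses only the rounded figures quoted in the note (in millions); the variable names are illustrative.

```python
# All figures in millions, taken from the note above.
voted_2000 = 105.0         # official 2000 vote
recorded_2004 = 122.0      # official 2004 vote
uncounted_2004 = 3.0       # approximate uncounted ballots in 2004
died_2000_voters = 5.0     # 2000 voters who died before 2004
turnout_of_living = 0.95   # assumed turnout of surviving 2000 voters

cast_2004 = recorded_2004 + uncounted_2004
returning = (voted_2000 - died_2000_voters) * turnout_of_living
new_voters = cast_2004 - returning

print(f"cast in 2004:     {cast_2004:.0f}m")
print(f"returning voters: {returning:.0f}m")
print(f"new voters:       {new_voters:.0f}m")
```

The gap between the 17m net increase in the official count and the roughly 30m new voters comes from the three adjustments: deaths, non-returning 2000 voters, and uncounted 2004 ballots.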
Harris Interactive on Election Day:
“The final Harris Polls show Senator John Kerry making modest gains at the very end of the campaign in an election that is still too close to call using telephone methods of polling. At the same time, the final Harris Internet-based poll suggests that Kerry will win the White House today in a narrow victory. Harris Interactive’s final online survey of 5,508 likely voters shows a three-point lead for Senator Kerry. The final Harris Interactive telephone survey of 1,509 likely voters shows a one-point lead for President Bush. Both surveys are based on interviews conducted between October 29, 2004 and November 1, 2004. The telephone survey is consistent with most of the other telephone polls, which show the race virtually tied.
If this trend is real, then Kerry may actually do better than these numbers suggest. In the past, presidential challengers tend to do better against an incumbent President among the undecided voters during the last three days of the election, and that appears to be the case here. The reason: undecided voters are more often voters who dislike the President but do not know the challenger well enough to make a decision. When they decide, they frequently split 2:1 to 4:1 for the challenger.”
Uncounted Votes
The difference between the 2004 recorded vote total and the 2004 U.S. Census estimate is 3.45m votes. According to the Census Bureau: “The data are from the November 2004 Voting and Registration Supplement to the Current Population Survey (CPS). Statistics from surveys are subject to sampling and nonsampling error. The CPS estimate of overall turnout (125.7 million) differs from the “official” turnout, as reported by the Clerk of the House (122.3 million). For further information on the source of the data and accuracy of the estimates, including standard errors and confidence intervals, go here.” The published Census survey margin of error is 0.30%.
Note that the 3.4m estimated difference is a net figure. In 13 states, the official vote exceeded the Census estimate by a total of 730,000 votes. The largest discrepancies were in Florida (238k), Ohio (143k) and Tennessee (118k). Apparently more votes were padded than were suppressed in the 13 states. Based on the Census 3.4m net estimate, there is no way to calculate the actual mix of uncounted and padded votes. But according to investigative reporter Greg Palast, actual government records show that 3.006m votes were uncounted. The uncounted votes comprised 1.389m spoiled, 1.091m provisional and 0.526m absentee ballots.
The annual voter mortality rate is calculated based on official statistics.
Voter Mortality and National Exit Poll Age Demographic

| Mortality group age | Mortality group rate | NEP age | Annual rate | Mortality (millions) | 2000 votes cast (millions) |
|---|---|---|---|---|---|
| 15-24 | 0.09% | 18-29 | 0.10% | 0.019 | 18.84 |
| 25-45 | 0.18% | 30-44 | 0.20% | 0.064 | 32.13 |
| 45-64 | 0.71% | 45-59 | 0.60% | 0.199 | 33.24 |
| 65+ | 5.07% | 60+ | 4.00% | 1.064 | 26.59 |
| Total | | | 1.215% | 1.346 | 110.8 |
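The total row of the mortality table is a vote-weighted average of the age-group rates. A short sketch that reproduces it from the NEP age-group rows (the tuple layout is only an illustration):

```python
# (NEP age group, annual mortality rate, 2000 votes cast in millions)
GROUPS = [
    ("18-29", 0.0010, 18.84),
    ("30-44", 0.0020, 32.13),
    ("45-59", 0.0060, 33.24),
    ("60+",   0.0400, 26.59),
]

total_votes = sum(votes for _, _, votes in GROUPS)
annual_deaths = sum(rate * votes for _, rate, votes in GROUPS)
annual_rate = annual_deaths / total_votes   # vote-weighted mortality rate

print(f"votes cast:       {total_votes:.1f}m")
print(f"annual deaths:    {annual_deaths:.3f}m")
print(f"annual mortality: {annual_rate:.3%}")
```

Rounded to 1.2% a year, the rate compounds to roughly 4.8% over the four years between elections, which is the mortality figure used in the calculations that follow.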
To believe the National Exit Poll, you must believe that there were more returning Bush voters than were alive to vote. Obama leads the recorded vote by 69.46-59.94m, a 52.87-45.62% vote share. The 2008 National Exit Poll, which is forced to closely match the recorded vote, indicates that the vote share is 52.62-45.52%. Assuming that the election was fraud-free, using the NEP shares and returning 2004 voter mix, we can determine the required turnout of 2004 voters.
The only assumptions are the following:
4.80% four-year voter mortality rate (1.2% annual).
All votes were counted in 2004 and 2008.
According to the 2004 US Census, there were 3.45m more votes cast than recorded, of which an estimated 75% were Kerry votes. But to be conservative for this analysis, we will assume that there were no uncounted votes.
We assume two scenarios for returning 2004 voters:
1) The 2004 vote was fraud-free (the recorded vote was the True vote).
2) The 2004 election was stolen (the unadjusted exit poll was the True vote).
For each scenario, we will consider two cases for 2004 voter turnout in 2008:
a) Turnout is calculated based on the NEP voter mix.
b) Turnout is 95% for returning Kerry, Bush and Other voters.
In Scenario 1a, Bush voter turnout is an impossible 102%; Kerry turnout is an implausible 86%; third-party turnout is an impossible 451%.
In Scenario 1b, Obama’s vote share is 2.34% higher than the recorded vote; his vote margin is 7 million higher than the recorded margin.
In Scenario 2a, Bush voter turnout is an impossible 110%; Kerry turnout is an implausible 80%; third-party turnout is an impossible 451%.
In Scenario 2b, Obama’s vote share is 4.60% higher than the recorded vote; his vote margin is 13m higher than the recorded margin.
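The "a" cases can be reproduced by asking what turnout of living 2004 voters the NEP mix would require. In this sketch the returning-voter mix (Kerry 37%, Bush 46%, Other 4%) is an assumption: only the 13% new-voter share is stated in this text. The 2004 vote counts are the Scenario 1 and Scenario 2 figures from the Election Calculator tables in this article.

```python
RECORDED_2008 = 131.37   # millions
MORTALITY_4YR = 0.048    # four-year voter mortality

# Assumed 2008 NEP returning-voter mix; only the 13% new-voter share
# appears in the text, so treat these three values as an assumption.
NEP_MIX = {"Kerry": 0.37, "Bush": 0.46, "Other": 0.04}

# 2004 votes in millions under each scenario.
SCENARIOS = {
    "1 (recorded vote)": {"Kerry": 59.03, "Bush": 62.04, "Other": 1.23},
    "2 (unadjusted exit poll)": {"Kerry": 63.59, "Bush": 57.47, "Other": 1.23},
}


def required_turnout(votes_2004, mix=NEP_MIX):
    """Turnout of living 2004 voters implied by the NEP mix, per group."""
    result = {}
    for group, votes in votes_2004.items():
        alive = votes * (1.0 - MORTALITY_4YR)
        implied_2008_voters = mix[group] * RECORDED_2008
        result[group] = implied_2008_voters / alive
    return result


for name, votes in SCENARIOS.items():
    turnout = required_turnout(votes)
    print(name, {g: f"{t:.0%}" for g, t in turnout.items()})
```

Any value above 100% means the mix implies more returning voters than were alive. Small differences from the percentages quoted above are expected because the inputs here are rounded.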
Given:
C = 2008 Recorded vote = 131.37m
P = 2004 Recorded vote = 122.3m
M = 4-year voter mortality = 4.8%
N = new voters in 2008 = 20.77m
Calculate:
RV = 2004 Returning voters = C - N
D = 2004 Voter mortality = P * M
L = Living 2004 Voters = P * (1 - M)
T = 2004 Voter percentage turnout in 2008 = RV / L
Calculating T for N = 20.77m (15.8% of 131.37):
T = (131.37 - 20.77) / (122.3 * (1 - 0.048))
T = 110.60 / (0.952 * 122.3)
T = 110.60 / 116.43 = 95.0%
Calculating T for N = 17.08m (13% of 131.37, according to the NEP):
T = (131.37 - 17.08) / (122.3 * (1 - 0.048))
T = 114.29 / (0.952 * 122.3)
T = 114.29 / 116.43 = 98.2%
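The same turnout calculation as a small function, using the symbols defined above (the function name is illustrative):

```python
def turnout_2004_voters(recorded_2008, recorded_2004, mortality_4yr, new_voters):
    """Share of living 2004 voters who must have voted in 2008."""
    returning = recorded_2008 - new_voters           # RV = C - N
    living = recorded_2004 * (1.0 - mortality_4yr)   # L = P * (1 - M)
    return returning / living                        # T = RV / L


for n in (20.77, 17.08):
    t = turnout_2004_voters(131.37, 122.3, 0.048, n)
    print(f"N = {n:.2f}m -> T = {t:.1%}")
```

Fewer new voters means more of the 2008 total must come from returning 2004 voters, so the required turnout rises from about 95% to about 98%.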
2008 Election Calculator
1) The US Census determined that 2.74% (3.45m) of 125.74m votes cast in 2004 were uncounted.
2) 3.0% of votes cast in 2008 will be uncounted.
3) Obama and Kerry each won approximately 75% of the uncounted vote (over 50% of uncounted votes are in minority districts).
4) 1.2% annual voter mortality (18+ years old).
5) 95% of Kerry, Bush and Other third-party voters still living in 2008 turned out to vote.
6) 2008 Final NEP vote shares.
For those who believe the 2004 election was legitimate:
Scenario 1: The returning 2004 voter mix is based on the Recorded Vote (Bush 50.73-Kerry 48.27%)
For those who believe the 2004 election was stolen:
Scenario 2: The returning 2004 voter mix is based on the Unadjusted Exit Poll (Kerry 52.0-Bush 47.0%)
Kerry uncounted vote share based on:

| Recorded | Exit poll | True Vote |
|---|---|---|
| 52.62% | 45.52% | 1.86% |
| 75.0% | 52.0% | 53.26% |

Uncounted votes

| Year | Rate | Votes (mil.) | Total Cast (mil.) |
|---|---|---|---|
| 2008 | 3.00% | 4.06 | 135.43 |
| 2004 | 2.74% | 3.45 | 125.74 |

Uncounted vote shares

| 2004 Uncounted | Share | 2008 Uncounted | Share |
|---|---|---|---|
| Kerry | 75% | Obama | 75% |
| Bush | 24% | McCain | 24% |
| Other | 1% | Other | 1% |

2004 Annual Voter Mortality

| Rate | Died (mil.) |
|---|---|
| 1.20% | 6.04 |

2004 Voter Turnout in 2008

| Kerry | Bush | Other |
|---|---|---|
| 95% | 95% | 95% |

Vote shares by prior vote

| Candidate | Prior vote | Estimated Share | 2008 NEP Final | 2004 NEP 12:22am | 2004 NEP Final |
|---|---|---|---|---|---|
| Obama | DNV | 71% | 71% | 57% | 54% |
| Obama | Kerry | 89% | 89% | 91% | 90% |
| Obama | Bush | 17% | 17% | 10% | 9% |
| Obama | Other | 66% | 66% | 64% | 71% |
| McCain | DNV | 27% | 27% | 41% | 45% |
| McCain | Kerry | 9% | 9% | 8% | 10% |
| McCain | Bush | 82% | 82% | 90% | 91% |
| McCain | Other | 24% | 24% | 17% | 21% |
| Other | DNV | 2% | 2% | 2% | 1% |
| Other | Kerry | 2% | 2% | 1% | 0% |
| Other | Bush | 1% | 1% | 0% | 0% |
| Other | Other | 10% | 10% | 19% | 8% |
Scenario 1: 2004 election was fraud-free.
Returning voters based on recorded vote (Bush 50.73-Kerry 48.27%)
Result: Obama 55.7-42.7%; 17.7m vote margin
2004 recorded vote and 2008 calculated vote (votes in millions)

| 2004 vote | Recorded | Uncounted | Cast | Deaths | Alive | Turnout | Voted 2008 | Mix | Obama | McCain | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DNV | | | | | | | 21.71 | 16.0% | 71% | 27% | 2% |
| Kerry | 59.03 | 2.58 | 61.61 | 2.96 | 58.65 | 95% | 55.72 | 41.1% | 89% | 9% | 2% |
| Bush | 62.04 | 0.83 | 62.87 | 3.02 | 59.85 | 95% | 56.86 | 42.0% | 17% | 82% | 1% |
| Other | 1.23 | 0.03 | 1.26 | 0.06 | 1.20 | 95% | 1.14 | 0.84% | 66% | 24% | 10% |
| Total | 122.30 | 3.45 | 125.74 | 6.04 | 119.70 | 113.7 | 135.43 | 100% | 55.69% | 42.66% | 1.65% |

| 2008 vote | Total | Obama | McCain | Other |
|---|---|---|---|---|
| Cast (calculated) | 135.43 | 75.43 | 57.77 | 2.23 |
| Recorded share | | 52.87% | 45.62% | 1.51% |
| Recorded | 131.37 | 69.46 | 59.94 | 1.98 |
Scenario 2: 2004 election was fraudulent.
Returning voters based on unadjusted state exit poll (Kerry 52-47%)
Result: Obama 57.5-40.8%; 22.6m vote margin
2004 unadjusted state exit poll aggregate (WPE/IMS) and 2008 calculated vote (votes in millions)

| 2004 vote | Exit Poll | Uncounted | Cast | Deaths | Alive | Turnout | Voted 2008 | Mix | Obama | McCain | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DNV | | | | | | | 21.71 | 16.0% | 71% | 27% | 2% |
| Kerry | 63.59 | 1.79 | 65.38 | 3.14 | 62.25 | 95% | 59.13 | 43.7% | 89% | 9% | 2% |
| Bush | 57.47 | 1.62 | 59.09 | 2.84 | 56.26 | 95% | 53.44 | 39.5% | 17% | 82% | 1% |
| Other | 1.23 | 0.03 | 1.26 | 0.06 | 1.20 | 95% | 1.14 | 0.84% | 66% | 24% | 10% |
| Total | 122.30 | 3.45 | 125.74 | 6.04 | 119.70 | 113.7 | 135.43 | 100% | 57.51% | 40.82% | 1.67% |

| 2008 vote | Total | Obama | McCain | Other |
|---|---|---|---|---|
| Cast (calculated) | 135.43 | 77.88 | 55.28 | 2.27 |
| Recorded share | | 52.87% | 45.62% | 1.51% |
| Recorded | 131.37 | 69.46 | 59.94 | 1.98 |
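The two scenario calculations follow the same steps: add uncounted votes to each 2004 total, remove deaths, apply turnout, treat the remainder of the 2008 votes cast as new voters, and weight by the NEP shares. The sketch below is one way to reproduce the table totals; it is not the Election Calculator itself, and the uncounted-vote split used for Scenario 2 (52/47/1) is inferred from the table figures rather than stated in the text.

```python
MORTALITY_4YR = 0.048    # 1.2% annual over four years
TURNOUT = 0.95           # turnout of living 2004 voters
CAST_2008 = 135.43       # millions: 131.37 recorded plus 3% uncounted
UNCOUNTED_2004 = 3.45    # millions

# 2008 NEP shares (Obama, McCain, Other) by 2004 vote.
NEP_SHARES = {
    "DNV":   (0.71, 0.27, 0.02),
    "Kerry": (0.89, 0.09, 0.02),
    "Bush":  (0.17, 0.82, 0.01),
    "Other": (0.66, 0.24, 0.10),
}


def true_vote(votes_2004, uncounted_share):
    """Return 2008 votes in millions for (Obama, McCain, Other)."""
    voted = {}
    for group, votes in votes_2004.items():
        cast = votes + uncounted_share[group] * UNCOUNTED_2004
        voted[group] = cast * (1.0 - MORTALITY_4YR) * TURNOUT
    voted["DNV"] = CAST_2008 - sum(voted.values())   # new voters fill the gap
    totals = [0.0, 0.0, 0.0]
    for group, n in voted.items():
        for k, share in enumerate(NEP_SHARES[group]):
            totals[k] += n * share
    return tuple(totals)


scenario_1 = true_vote(
    {"Kerry": 59.03, "Bush": 62.04, "Other": 1.23},   # recorded vote
    {"Kerry": 0.75, "Bush": 0.24, "Other": 0.01},
)
scenario_2 = true_vote(
    {"Kerry": 63.59, "Bush": 57.47, "Other": 1.23},   # unadjusted exit poll
    {"Kerry": 0.52, "Bush": 0.47, "Other": 0.01},     # inferred from the table
)

for name, (obama, mccain, other) in (("1", scenario_1), ("2", scenario_2)):
    print(f"Scenario {name}: Obama {obama / CAST_2008:.2%}, "
          f"McCain {mccain / CAST_2008:.2%}, margin {obama - mccain:.1f}m")
```

Because every assumption is a named constant, changing one (for example `TURNOUT`) and rerunning shows how far the result moves.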
Election models consist of recorded data, assumptions (parameters), and calculations. Given that the mathematical logic in the 2008 Election Calculator (EC) is correct, the assumptions should be realistic in order to determine the True Vote. The EC base case assumptions are the best estimates derived from the following data sources: 2008 National Exit Poll and 2004 unadjusted aggregate state exit polls, 2004 and 2008 official recorded vote, voter mortality tables, historical returning voter turnout percentages, Census total votes cast. Due to the margin of error in the data and assumptions, a thorough examination of the effects of changes in the assumptions on the vote share is necessary in order to have confidence in the model.
The Final 2008 National Exit Poll vote shares of returning 2004 and new voters are used as the base case. Since the shares were used to match the recorded vote, it makes perfect sense to use them for the base case. The model calculates the returning voter mix based on plausible, documented assumptions for 2004 uncounted votes, voter mortality and voter turnout in 2008. Along with the vote shares, the assumptions comprise the full base case in the sensitivity analysis.
The EC contains a comprehensive set of 12 sensitivity analysis tables. Each 5x5 table displays vote share and margin for 25 combinations of two input data assumptions. It is very likely that the True Vote is represented in one of the 25 cells. The base case is located in the central cell. The range of plausible True vote shares is reduced by focusing on the input combinations that lie within the margin of error. For example, Table 1 contains two input variables: the Obama share of Kerry and Bush voters. They range over the intervals 85-93% and 13-21% in increments of 2%. The base case Obama True vote share is found in the central cell, where the base case assumptions intersect (89% Kerry; 17% Bush).
To analyze the effect of incremental changes in the assumptions, compare the base case vote share and margin (central cell) to the adjacent cells. The least likely combinations are in the lower left (worst case) and upper right (best case) cells.
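A 5x5 table of the kind described for Table 1 can be generated as follows. This is a sketch under stated assumptions: it uses the Scenario 1 returning-voter counts as the base case (the text does not say which mix the EC's Table 1 uses), holds the Obama shares of new and Other voters at their base values, and lays out the rows and columns in ascending order, which may differ from the EC's own layout.

```python
# 2008 voters in millions by 2004 vote (Scenario 1 figures; an assumption,
# since the text does not state which mix the EC's Table 1 is built on).
VOTED = {"DNV": 21.71, "Kerry": 55.72, "Bush": 56.86, "Other": 1.14}
OBAMA_SHARE = {"DNV": 0.71, "Other": 0.66}   # held at the base case

KERRY_RANGE = [0.85, 0.87, 0.89, 0.91, 0.93]   # Obama share of Kerry voters
BUSH_RANGE = [0.13, 0.15, 0.17, 0.19, 0.21]    # Obama share of Bush voters


def obama_share(kerry_share, bush_share):
    """Obama's total vote share for one combination of the two inputs."""
    votes = (VOTED["DNV"] * OBAMA_SHARE["DNV"]
             + VOTED["Kerry"] * kerry_share
             + VOTED["Bush"] * bush_share
             + VOTED["Other"] * OBAMA_SHARE["Other"])
    return votes / sum(VOTED.values())


def sensitivity_table():
    """Rows: Obama share of Bush voters; columns: Obama share of Kerry voters."""
    return [[obama_share(k, b) for k in KERRY_RANGE] for b in BUSH_RANGE]


table = sensitivity_table()
print("Bush\\Kerry " + " ".join(f"{k:>6.0%}" for k in KERRY_RANGE))
for b, row in zip(BUSH_RANGE, table):
    print(f"{b:>10.0%} " + " ".join(f"{v:>6.1%}" for v in row))
```

In this layout the central cell is the base case, the first cell is the lowest share and the last cell is the highest.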