Pre-election Forecast and Post-election True Vote Analysis: Data, Assumptions and Methodology

 

TruthIsAll

 

Mar. 5, 2009

 

The 2008 Election Model (EM) and the Election Calculator (EC) consist of three basic components: recorded (“official”) data, assumptions and methodology. The recorded, official vote data and calculation methods are easily verified; it’s the estimates for the base case assumptions that are the subject of debate. This comprehensive summary will show that the assumptions are based on the best available data.

 

The base case assumptions are best estimates derived from the following data sources: the 2004 and 2008 official recorded vote, pre-election state and national polls, unadjusted state and national exit polls, voter mortality tables, historical returning-voter turnout, and Census estimates of total votes cast. Because of the margin of error in the polling data and the uncertainty in the other assumptions, both models include a sensitivity analysis (“stress test”) that examines the effects of changes in the assumptions and identifies which of them are most critical.

 

The Election Model 

A pre-election, state and national poll-based popular and electoral vote projection model

 

Recorded Data

Final State and National polls

 

Assumptions

Undecided voter allocation  (UVA) is based on historical election data and professional pollster practice.

 

Methodology

The State projection is based on the average of the final state polls plus UVA.

The National projection is based on the average of the final national polls plus UVA.
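
The projection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `allocate_undecided`, the candidate labels and all of the numbers are hypothetical, and the undecided share is taken to be whatever the poll averages leave unassigned.

```python
def allocate_undecided(poll_shares, uva):
    """Add each candidate's assumed share of the undecided vote to the poll average.

    poll_shares: candidate -> poll average in percent (need not sum to 100)
    uva:         candidate -> fraction of the undecided vote allocated (sums to 1)
    """
    undecided = 100.0 - sum(poll_shares.values())
    return {name: share + uva[name] * undecided
            for name, share in poll_shares.items()}


# Hypothetical poll average and allocation, for illustration only.
polls = {"A": 50.0, "B": 44.0, "Other": 1.0}
uva = {"A": 0.60, "B": 0.40, "Other": 0.0}
projection = allocate_undecided(polls, uva)
print(projection)
```

Changing the `uva` fractions and re-running is the kind of stress test that the EM Sensitivity Analysis performs on the projected vote shares.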

 

The probability (P) of winning a state is based on the final projected vote shares, assuming a 3% MoE.

The theoretical electoral vote is the expected value Σ P(i) * EV(i), summed over i = 1 to 51 (the 50 states plus DC).

Obama won all 5000 Monte Carlo election trials based on the final state polls, giving him a win probability of virtually 100%.
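
A minimal sketch of the probability, expected-value and Monte Carlo steps follows. It assumes a normal error model in which the 3% MoE is the half-width of a 95% interval, two-party vote shares, and independent state errors; none of these details is spelled out above. The three "states" are hypothetical, and the win test is a simple majority of the listed electoral votes rather than 270 of 538.

```python
import random
from statistics import NormalDist

MOE = 0.03            # 3% margin of error, as stated in the text
SIGMA = MOE / 1.96    # assumption: the MoE is a 95% interval half-width


def win_probability(share):
    """P(true two-party share > 50%) under the assumed normal error model."""
    return 1.0 - NormalDist(mu=share, sigma=SIGMA).cdf(0.5)


def expected_ev(states):
    """Expected electoral vote: sum of P(i) * EV(i) over all states."""
    return sum(win_probability(share) * ev for share, ev in states)


def simulate(states, trials=5000, seed=1):
    """Fraction of Monte Carlo trials in which a majority of the EV is won."""
    rng = random.Random(seed)
    total_ev = sum(ev for _, ev in states)
    wins = 0
    for _ in range(trials):
        ev_won = sum(ev for share, ev in states
                     if rng.gauss(share, SIGMA) > 0.5)
        if ev_won > total_ev / 2:
            wins += 1
    return wins / trials


# Hypothetical (projected two-party share, electoral votes) pairs, not real data.
states = [(0.55, 20), (0.50, 10), (0.45, 15)]
print(round(expected_ev(states), 2), simulate(states))
```

With 51 real projections in `states` and a 270-vote threshold, the same loop would give the share of trials won that the text reports.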

 

EM Sensitivity Analysis – calculates the effects of changes in UVA on the projected vote shares.

 

The Election Calculator

A post-election True Vote model based on a feasible, calculated returning voter mix and National Exit Poll vote shares

 

Recorded Data

2004 and 2008 official vote

 

Assumptions

Annual voter mortality: U.S. mortality rates

Uncounted votes:  U.S. Vote Census

Obama and McCain share of returning and new voters: 2008 National Exit Poll

 

Methodology

Returning 2004 voters are calculated from the 2004 and 2008 official vote data and the assumptions.

National Exit Poll shares of returning and new voters are then applied to derive the True Vote.

 

EC Sensitivity Analysis - calculates the effects of assumption changes on the True Vote.

 

Undecided Voters
 

 “In the final USA TODAY/CNN/GALLUP poll before the election, President Bush held a 49-47 edge over Sen. John Kerry when the undecided voters were not allocated to a particular candidate. When Gallup, using a statistical model that assumes that 9 of 10 of those voters would support Kerry, allocated the voters, the poll ended as a dead heat with each candidate garnering 49%. The Gallup allocation formula is based on analyses of previous presidential races involving an incumbent”.

 

How does Gallup decide how to "allocate" undecided voters?

This is what Frank Newport, Editor in Chief of the Gallup Poll, said about undecided voters just before the 2004 election:

The allocation procedure is a Gallup tradition, and represents Gallup scientists' best estimate of what the final popular vote will be on Election Day.

Here's how it works. The unallocated numbers in the pool of likely voters (that is, the percentages of likely voters supporting Bush and Kerry, not including undecided voters) are 49% for Bush and 47% for Kerry. We assume, based on an analysis of previous presidential and other elections, that there is a high probability that the challenger (in an incumbent race) will receive a higher percentage of the popular vote than he did in the last pre-election poll, while there is a high probability that the incumbent will maintain his share of the vote without any increase. This has been dubbed the "challenger rule." There are various explanations for why this may occur, including the theory that any voter who maintains that he or she is undecided about voting for a well-known incumbent this late in the game is probably leaning toward voting for the challenger.

 

This persistent historical pattern is the basis for Gallup's decision to allocate the 3% undecided vote to Kerry and Nader/other, making the final estimate 49% Bush, 49% Kerry, and 2% Nader/other.

 

 Zogby said this a few days before the election:

“The key reason why I still think that Kerry will win… traditionally, the undecideds break for the challenger against the incumbent on the basis of the fact, simply, that the voters already know the incumbent, and it's a referendum on the incumbent. And if the incumbent is polling, generally, under 50 percent and leading by less than 10, historically, incumbents have lost 7 out of 10 times. In this instance you have a tie, a President who is not going over 48, undecideds who tell us by small percentages that the President deserves to be reelected. And in essence, it gives all the appearances that the undecideds -- the most important people in the world today -- have made up their minds about President Bush. The only question left is: Can they vote for John Kerry? If it's a good turnout, look for a Kerry victory. If it's a lower turnout, it means that the President has succeeded in raising questions about John Kerry's fitness”.

 

Note: Final Zogby Election Day polling had Kerry winning by 50-47%, with 311 electoral votes, indicating that 75% of undecided voters broke for Kerry. It was not a good turnout; it was a great turnout. Officially, 122 million voted in 2004, compared to 105m in 2000, a net increase of 17m. But a closer analysis indicates that there must have been close to 30 million new voters. Here’s why: Approximately five million 2000 voters died prior to 2004, leaving about 100m alive. Assuming 95% turnout, another five million did not vote, so only 95m former 2000 voters returned to the polls in 2004. In addition, approximately three million ballots in 2004 were uncounted (a total of 125m were cast), so about 30m of the 125m votes cast came from new voters. Preliminary National Exit Polls indicated that Kerry won 57-62% of new voters, or 6m more than Bush.
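
The new-voter arithmetic in the note can be checked with a short sketch. The inputs are the rounded figures quoted in the note; the variable names are illustrative.

```python
votes_2000 = 105.0   # recorded 2000 vote, millions
deaths = 5.0         # 2000 voters who died before the 2004 election, millions
turnout = 0.95       # assumed turnout of surviving 2000 voters
cast_2004 = 125.0    # 2004 votes cast, including about 3m uncounted, millions

living = votes_2000 - deaths          # 2000 voters still alive in 2004
returning = living * turnout          # 2000 voters who voted again in 2004
new_voters = cast_2004 - returning    # 2004 voters who did not vote in 2000

print(f"returning: {returning:.0f}m, new: {new_voters:.0f}m")
```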

 

 

Harris Interactive on Election Day: 

“The final Harris Polls show Senator John Kerry making modest gains at the very end of the campaign in an election that is still too close to call using telephone methods of polling. At the same time, the final Harris Internet-based poll suggests that Kerry will win the White House today in a narrow victory. Harris Interactive’s final online survey of 5,508 likely voters shows a three-point lead for Senator Kerry. The final Harris Interactive telephone survey of 1,509 likely voters shows a one-point lead for President Bush. Both surveys are based on interviews conducted between October 29, 2004 and November 1, 2004.  The telephone survey is consistent with most of the other telephone polls, which show the race virtually tied.

 

If this trend is real, then Kerry may actually do better than these numbers suggest. In the past, presidential challengers tend to do better against an incumbent President among the undecided voters during the last three days of the election, and that appears to be the case here. The reason: undecided voters are more often voters who dislike the President but do not know the challenger well enough to make a decision. When they decide, they frequently split 2:1 to 4:1 for the challenger.”

 

 

Uncounted Votes

 

The difference between the 2004 recorded vote total and the 2004 U.S. Census estimate is 3.45m votes. According to the Census Bureau: “The data are from the November 2004 Voting and Registration Supplement to the Current Population Survey (CPS). Statistics from surveys are subject to sampling and nonsampling error. The CPS estimate of overall turnout (125.7 million) differs from the “official” turnout, as reported by the Clerk of the House (122.3 million). For further information on the source of the data and accuracy of the estimates, including standard errors and confidence intervals, go here.” The published Census survey margin of error is 0.30%.

 

Note that the 3.4m estimated difference is a net figure. In 13 states, the official vote exceeded the Census estimate by a total of 730,000 votes. The largest discrepancies were in Florida (238k), Ohio (143k) and Tennessee (118k). Apparently more votes were padded than were suppressed in those 13 states. Based on the Census 3.4m net estimate, there is no way to calculate the actual mix of uncounted and padded votes. But according to investigative reporter Greg Palast, actual government records show that 3.006m votes were uncounted: 1.389m spoiled, 1.091m provisional and 0.526m absentee ballots.

 

 

Voter Mortality

 

The annual voter mortality rate is calculated based on official statistics.

 

Voter Mortality and National Exit Poll Age Demographic

Mortality Groups      NEP       Annual     Mortality     2000 Votes
Age       Rate        Age       Rate       (millions)    Cast (millions)
15-24     0.09%       18-29     0.10%      0.019         18.84
25-45     0.18%       30-44     0.20%      0.064         32.13
45-64     0.71%       45-59     0.60%      0.199         33.24
65+       5.07%       60+       4.00%      1.064         26.59
Total                           1.215%     1.346         110.8
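
The total annual rate in the table is the vote-weighted average of the NEP age-group rates. A minimal sketch of that calculation, using only the table's figures (the variable names are illustrative):

```python
# (NEP age group, annual mortality rate, 2000 votes cast in millions)
nep_groups = [
    ("18-29", 0.0010, 18.84),
    ("30-44", 0.0020, 32.13),
    ("45-59", 0.0060, 33.24),
    ("60+",   0.0400, 26.59),
]

total_votes = sum(votes for _, _, votes in nep_groups)
annual_deaths = sum(rate * votes for _, rate, votes in nep_groups)
annual_rate = annual_deaths / total_votes

print(f"deaths: {annual_deaths:.3f}m of {total_votes:.1f}m, rate: {annual_rate:.3%}")
```

Rounded to 1.2% a year, this is the source of the 4.8% four-year mortality used in the turnout calculations.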

 

 

Voter Turnout

2004 Voter Turnout in 2008

To believe the National Exit Poll, you must believe that there were more returning Bush voters than were alive to vote. Obama leads the recorded vote by 69.46-59.94m, a 52.87-45.62% vote share. The 2008 National Exit Poll, which is forced to closely match the recorded vote, indicates that the vote share is 52.62-45.52%. Assuming that the election was fraud-free, we can use the NEP shares and returning 2004 voter mix to determine the required turnout of 2004 voters.

 

The only assumptions are the following:

4.80% voter mortality rate over the four years (1.2% annually).

All votes were counted in 2004 and 2008.

According to the 2004 US Census, there were 3.45m more votes cast than recorded, of which an estimated 75% were Kerry votes. But to be conservative for this analysis, we will assume that there were no uncounted votes.

 

We assume two scenarios for returning 2004 voters:

1) The 2004 vote was fraud-free (the recorded vote was the True vote).

2) The 2004 election was stolen (the unadjusted exit poll was the True vote).

 

For each scenario, we will consider two cases for 2004 voter turnout in 2008:

a) Turnout is calculated based on the NEP voter mix.

b) Turnout is 95% for returning Kerry, Bush and Other voters.

 

In Scenario 1a, Bush voter turnout is an impossible 102%; Kerry turnout is an implausible 86%; third-party turnout is an impossible 451%.

In Scenario 1b, Obama’s vote share is 2.34% higher than the recorded vote; his vote margin is 7 million higher than the recorded margin.

In Scenario 2a, Bush voter turnout is an impossible 110%; Kerry turnout is an implausible 80%; third-party turnout is an impossible 451%.

In Scenario 2b, Obama’s vote share is 4.60% higher than the recorded vote; his vote margin is 13m higher than the recorded margin.

 

 

Plausible 2008 Voter Turnout

 

Given:

C = 2008 Recorded vote = 131.37m

P = 2004 Recorded vote = 122.3m

M = 4-year voter mortality = 4.8%

N = new voters in 2008 = 20.77m

 

Calculate:

T = 2004 Voter percentage turnout in 2008

 

RV = 2004 Returning voters = C - N

D = 2004 Voter mortality = P * M

L = Living 2004 Voters = P * (1-M)

 

T = RV/L = (C – N) /  (P * (1-M))

 

Calculating T for N = 20.77m (15.8% of 131.37m):

T = (131.37 - 20.77) / (122.3 * (1 - 0.048))

T = 110.60 / (0.952 * 122.3)

T = 110.60 / 116.43

 

T = 95.0% (plausible)

 

Calculating T for N = 17.08m (13% of 131.37m, according to the NEP):

T = (131.37 - 17.08) / (122.3 * (1 - 0.048))

T = 114.29 / (0.952 * 122.3)

T = 114.29 / 116.43

 

T = 98.16% (implausible)
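
The two cases can be reproduced with a small function that applies T = (C - N) / (P * (1 - M)) directly. The function and argument names are illustrative; the inputs are the values given above.

```python
def required_turnout(c, p, m, n):
    """Turnout of living 2004 voters implied by the 2008 vote.

    c: 2008 recorded vote (millions)
    p: 2004 recorded vote (millions)
    m: four-year voter mortality rate
    n: new voters in 2008 (millions)
    """
    returning = c - n        # RV = C - N
    living = p * (1 - m)     # L = P * (1 - M)
    return returning / living


t_model = required_turnout(c=131.37, p=122.3, m=0.048, n=20.77)
t_nep = required_turnout(c=131.37, p=122.3, m=0.048, n=17.08)
print(f"N = 20.77m: T = {t_model:.1%}")
print(f"N = 17.08m: T = {t_nep:.2%}")
```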

 

 

2008 Election Calculator

Base Case Assumptions

 

1)      The US Census determined that 2.74% (3.45m) of 125.74m votes cast in 2004 were uncounted.

2)      An estimated 3.0% of votes cast in 2008 were uncounted.

3)      Obama and Kerry each won approximately 75% of the uncounted vote (over 50% of uncounted votes are in minority districts).

4)      1.2% annual voter mortality (18+ years old).

5)      95% of Kerry, Bush and Other third-party voters still living in 2008 turned out to vote.

6)      2008 Final NEP vote shares.

 

For those who believe the 2004 election was legitimate:

Scenario 1: The returning 2004 voter mix is based on the Recorded Vote (Bush 50.73-Kerry 48.27%)

 

For those who believe the 2004 election was stolen:

Scenario 2: The returning 2004 voter mix is based on the Unadjusted Exit Poll (Kerry 52.0-Bush 47.0%)

 

 

Kerry uncounted vote share based on:

Basis        Share
Recorded     75.0%
Exit poll    52.0%
True Vote    53.26%

Final 2008 National Exit Poll

Obama     McCain    Other
52.62%    45.52%    1.86%

Uncounted    Rate     Votes (mil.)   Total Cast (mil.)
2008         3.00%    4.06           135.43
2004         2.74%    3.45           125.74

2004 Unctd   Share    2008 Unctd   Share
Kerry        75%      Obama        75%
Bush         24%      McCain       24%
Other        1%       Other        1%

2004 Annual Voter Mortality
Rate          1.20%
Died (mil.)   6.04

2004 Voter Turnout in 2008
Kerry    95%
Bush     95%
Other    95%

             Estimated   2008 NEP   2004 NEP   2004 NEP
             Share       Final      12:22am    Final
Obama
  DNV        71%         71%        57%        54%
  Kerry      89%         89%        91%        90%
  Bush       17%         17%        10%        9%
  Other      66%         66%        64%        71%
McCain
  DNV        27%         27%        41%        45%
  Kerry      9%          9%         8%         10%
  Bush       82%         82%        90%        91%
  Other      24%         24%        17%        21%
Other
  DNV        2%          2%         2%         1%
  Kerry      2%          2%         1%         0%
  Bush       1%          1%         0%         0%
  Other      10%         10%        19%        8%

 

Scenario 1: 2004 election was fraud-free.

Returning voters based on recorded vote (Bush 50.73-Kerry 48.27%)

Result: Obama 55.7-42.7%; 17.7m vote margin

 

                                                                                                                                                                                                                                                                     

 

 

2004 Unadjusted State Exit Poll Aggregate (WPE/IMS)

                                                        2008                     Calculated Vote
2004      Exit Poll  Uncounted  Cast    Deaths  Alive   Turnout  Voted   Mix     Obama    McCain   Other
DNV                                                              21.71   16.0%   71%      27%      2%
Kerry     59.03      2.58       61.61   2.96    58.65   95%      55.72   41.1%   89%      9%       2%
Bush      62.04      0.83       62.87   3.02    59.85   95%      56.86   42.0%   17%      82%      1%
Other     1.23       0.03       1.26    0.06    1.20    95%      1.14    0.84%   66%      24%      10%

Total     122.30     3.45       125.74  6.04    119.70  113.7    135.43  100%    55.69%   42.66%   1.65%

Cast                                                             135.43          75.43    57.77    2.23
Recorded share                                                                   52.87%   45.62%   1.51%
Recorded votes                                                   131.37          69.46    59.94    1.98

 

Scenario 2: 2004 election was fraudulent.

Returning voters based on unadjusted state exit poll (Kerry 52-47%)

Result: Obama 57.5-40.8%; 22.6m vote margin

 

 

 

 

 

 

 

                                                        2008                     Calculated Vote
2004      Exit Poll  Uncounted  Cast    Deaths  Alive   Turnout  Voted   Mix     Obama    McCain   Other
DNV                                                              21.71   16.0%   71%      27%      2%
Kerry     63.59      1.79       65.38   3.14    62.25   95%      59.13   43.7%   89%      9%       2%
Bush      57.47      1.62       59.09   2.84    56.26   95%      53.44   39.5%   17%      82%      1%
Other     1.23       0.03       1.26    0.06    1.20    95%      1.14    0.84%   66%      24%      10%

Total     122.30     3.45       125.74  6.04    119.70  113.7    135.43  100%    57.51%   40.82%   1.67%

Cast                                                             135.43          77.88    55.28    2.27
Recorded share                                                                   52.87%   45.62%   1.51%
Recorded votes                                                   131.37          69.46    59.94    1.98
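
The calculation behind both scenario tables can be sketched as follows. This is an illustrative reconstruction from the base case assumptions (uncounted votes added to the 2004 vote, 4.8% four-year mortality, 95% turnout, new voters as the remainder of the 135.43m cast, and the Final 2008 NEP shares), not the Election Calculator itself; the function and variable names are hypothetical.

```python
# Final 2008 NEP vote shares by 2004 vote (DNV = did not vote in 2004).
NEP_SHARES = {
    "DNV":   {"Obama": 0.71, "McCain": 0.27, "Other": 0.02},
    "Kerry": {"Obama": 0.89, "McCain": 0.09, "Other": 0.02},
    "Bush":  {"Obama": 0.17, "McCain": 0.82, "Other": 0.01},
    "Other": {"Obama": 0.66, "McCain": 0.24, "Other": 0.10},
}


def true_vote(votes_2004, uncounted_shares, uncounted_2004=3.45,
              mortality=0.048, turnout=0.95, cast_2008=135.43):
    """Return 2008 vote shares implied by the returning-voter mix (votes in millions)."""
    voted = {}
    for group, votes in votes_2004.items():
        cast = votes + uncounted_shares[group] * uncounted_2004
        voted[group] = cast * (1 - mortality) * turnout
    voted["DNV"] = cast_2008 - sum(voted.values())   # new voters are the remainder

    totals = {"Obama": 0.0, "McCain": 0.0, "Other": 0.0}
    for group, voters in voted.items():
        for candidate, share in NEP_SHARES[group].items():
            totals[candidate] += voters * share
    return {candidate: votes / cast_2008 for candidate, votes in totals.items()}


# Scenario 1: returning voters based on the 2004 recorded vote.
s1 = true_vote({"Kerry": 59.03, "Bush": 62.04, "Other": 1.23},
               {"Kerry": 0.75, "Bush": 0.24, "Other": 0.01})
# Scenario 2: returning voters based on the 2004 unadjusted exit poll.
s2 = true_vote({"Kerry": 63.59, "Bush": 57.47, "Other": 1.23},
               {"Kerry": 0.52, "Bush": 0.47, "Other": 0.01})

for name, result in (("Scenario 1", s1), ("Scenario 2", s2)):
    margin = (result["Obama"] - result["McCain"]) * 135.43
    print(f"{name}: Obama {result['Obama']:.2%}, McCain {result['McCain']:.2%}, "
          f"margin {margin:.1f}m")
```

Small differences in the last digit relative to the tables can arise because the tables display rounded intermediate values.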

 

 

2008 Sensitivity Analysis

 

Election models consist of recorded data, assumptions (parameters), and calculations. Given that the mathematical logic in the 2008 Election Calculator (EC) is correct, the True Vote estimate is only as good as the assumptions, so they should be realistic. The EC base case assumptions are the best estimates derived from the following data sources: the 2008 National Exit Poll, the 2004 unadjusted aggregate state exit polls, the 2004 and 2008 official recorded vote, voter mortality tables, historical returning-voter turnout percentages, and Census estimates of total votes cast. Because of the margin of error in the data and assumptions, a thorough examination of how changes in the assumptions affect the vote share is necessary in order to have confidence in the model.

 

Two general cases will be analyzed. The first assumes that the 2004 election was legitimate and that the recorded vote (Bush by 50.7-48.3%) was the True vote; the second assumes that the unadjusted state exit poll aggregate reflected the True Vote (Kerry by 52-47%).

 

The Final 2008 National Exit Poll vote shares of returning 2004 and new voters are used as the base case. Since the shares were used to match the recorded vote, it makes perfect sense to use them for the base case. The model calculates the returning voter mix based on plausible, documented assumptions for 2004 uncounted votes, voter mortality and voter turnout in 2008. Along with the vote shares, the assumptions comprise the full base case in the sensitivity analysis.

 

The EC contains a comprehensive set of 12 sensitivity analysis tables. Each 5x5 table displays vote share and margin for 25 combinations of two input data assumptions. It is very likely that the True Vote is represented in one of the 25 cells. The base case is located in the central cell. The range of plausible True Vote shares is reduced by focusing on the input combinations that lie within the margin of error. For example, Table 1 contains two input variables: the Obama shares of returning Kerry and Bush voters, which range over 85-93% and 13-21%, respectively, in increments of 2%. The base case Obama True Vote share is found in the central cell, where the base case assumptions intersect (89% Kerry; 17% Bush).

 

To analyze the effect of incremental changes in the assumptions, compare the base case vote share and margin (central cell) to the adjacent cells. The least likely combinations are in the lower left (worst case) and upper right (best case) cells.
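
A 5x5 table of the kind described for Table 1 can be sketched as below. The sketch holds the Scenario 1 returning-voter numbers fixed and varies only Obama's share of returning Kerry and Bush voters; the row and column orientation is an arbitrary choice here and may not match the EC's layout.

```python
# Scenario 1 voters in 2008 by 2004 vote, in millions (DNV = new voters).
VOTED = {"DNV": 21.71, "Kerry": 55.72, "Bush": 56.86, "Other": 1.14}
# Base case Obama shares of each group (Final 2008 NEP).
BASE = {"DNV": 0.71, "Kerry": 0.89, "Bush": 0.17, "Other": 0.66}


def obama_share(kerry_share, bush_share):
    """Obama's total share for given shares of returning Kerry and Bush voters."""
    shares = dict(BASE, Kerry=kerry_share, Bush=bush_share)
    return sum(VOTED[g] * shares[g] for g in VOTED) / sum(VOTED.values())


kerry_range = [0.85, 0.87, 0.89, 0.91, 0.93]
bush_range = [0.13, 0.15, 0.17, 0.19, 0.21]

# grid[i][j]: rows follow kerry_range, columns follow bush_range.
grid = [[obama_share(k, b) for b in bush_range] for k in kerry_range]

print("Kerry \\ Bush " + "  ".join(f"{b:>6.0%}" for b in bush_range))
for k, row in zip(kerry_range, grid):
    print(f"{k:>12.0%} " + "  ".join(f"{cell:>6.2%}" for cell in row))
```

The central cell, `grid[2][2]`, is the base case; moving one cell in either direction shows the effect of a two-point change in a single assumption.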