Statistical Analysis of Pennsylvania Voting Data Shows Troubling Patterns - The American Spectator | USA News and Politics
Ballot drop box, Media, Pennsylvania, Oct. 17, 2020 (Abigail McCann/

How did Biden, the “science” and “decency” candidate, prevail in Pennsylvania’s reported vote count over Trump, a president beloved by many despite his sometimes harsh rhetoric and his foibles? Instead of concerning ourselves with whether it was due to unexpectedly large support from the state’s scientific community, or to revenge of “decent” people against “deplorable” folks, let’s explore the data using basic mathematics and statistics.

Our trek through the numerical data will offer a bird’s-eye view of an unusual landscape, so please join in. For example, we will see that gains in votes for Biden (relative to Hillary Clinton’s 2016 performance) exhibit unusual spikes in counties with a high concentration of vacation homes. Biden’s vote gains appear mysteriously large in light of the Democratic Party’s relatively poor performance in increasing its registered voter roll. The tour will now begin.

Trump clobbered the Democratic Party in growing registered voter counts. The 2020 voter population divides into two mutually exclusive segments: those who changed their registration status between 2016 and 2020 (who joined or exited the registered voter pool, or who changed their party affiliation), and those who did not (who were registered to vote in 2016 and kept the same party affiliation or remained unaffiliated).

Those whose registration status changed between 2016 and 2020 are clearly an important group to consider, for two reasons. First, a net gain in registered voters for one party or the other is likely to translate into a net gain in votes, and, in fact, this plausible hypothesis can be tested statistically, as we shall see shortly. Second, a net gain in registered voters in one direction or the other might be an indicator of voter sentiment overall. Data downloadable from Pennsylvania’s Department of State yield the following, indisputable fact: the Republican Party grew its registered voter population in the state at a substantially higher rate than the Democratic Party after 2016, and especially during 2020.

Statewide, Democrats lost 3.3 percent of their registered voters between November 2016 and November 2019, while Republicans lost 0.1 percent of their registered voters (essentially zero). In nearly every county in Pennsylvania, Democrats experienced a net decline in registered voters between November 2016 and November 2019 (see chart 1).


Chart 1: Percent change in registered voters between Nov. 2016 and Nov. 2019, by party affiliation (Note: each dot represents a county, located along the horizontal axis in relation to Trump’s 2016 share of votes)

During 2020, Republicans strongly outperformed Democrats in growing their registered voter roll, particularly within more heavily Republican counties, as well as in Philadelphia (see chart 2). In half of Pennsylvania’s 67 counties, the number of registered Democratic voters grew by less than 1 percent or declined. By contrast, in over half of the counties the number of registered Republicans increased more than 10 percent. For the state overall, Republican Party registrations grew by 9.2 percent, compared to 4.2 percent for the Democratic Party.


Chart 2: Percent change in registered voters between Nov. 2019 and Oct. 2020, by party affiliation (Note: each dot represents a county, located along the horizontal axis in relation to Trump’s 2016 share of votes)

The same official state source reports that during 2020, 71,634 registered Democrats switched their party affiliation to Republican, while 44,713 Republicans switched to the Democrats. Democrats switching to Republican outnumbered those doing the opposite in all but six counties. And even in the week just prior to the 2020 election (October 26 through November 2), 2,086 registered Democrats switched their affiliation to Republican, while 1,150 Republicans switched to Democrat.

In Philadelphia, Trump’s advantage in growing registrations was reflected in strong vote gains. Trump grew his vote count in Philadelphia by an extraordinary 22 percent compared to 2016. In contrast, Biden eked out a meager 3 percent gain relative to Clinton’s 2016 vote count.

As a result, Trump not only held his ground in Philadelphia but trimmed the Democrats’ margin there (by about 4,200 votes relative to 2016). Philadelphia was thus neutralized as a potential threat to Trump’s winning the state in 2020.

In most counties outside of Philadelphia, however, Trump’s advantage in net growth in registrations failed to translate into an advantage in gaining new votes. In most other Pennsylvania counties (in all but eight counties outside of Philadelphia), Biden dominated Trump with respect to percentage gain in votes over 2016 (see chart 3, which excludes Philadelphia). Mathematically, it seems that this is how Biden prevailed.


Chart 3: Percent change in number of votes between 2016 and 2020, by party (Note: each dot represents a county, located along horizontal axis in relation to Trump’s 2016 share of votes)

What makes these county-by-county percentage gains puzzling, aside from Biden’s initial disadvantage with respect to growth in registrations, is their consistency across counties. In Democrat as well as in Republican strongholds, Biden dominated by this measure.

Notably, the county with the largest percentage gain for Biden, appearing as somewhat of an outlier in the chart, is Pike County, with a whopping 41 percent increase in votes relative to Clinton in 2016. This happens to be a county with a large percentage of vacation homes belonging to residents of New York and New Jersey, and this data point is consistent with news reports that residents of New York were voting from second homes in Pike County.

How did Trump fare against Biden after factoring in growth in registrations and other observables? Let’s now embark on the next stage of our magical numerical mystery tour, in which we shall more precisely quantify Biden’s versus Trump’s performance in gaining votes, county by county. Because of its unusual, outlier status, we exclude Philadelphia from this analysis.

For a more apples-to-apples comparison between Trump and Biden, we need to apply a common denominator for their percentage increases in votes between 2016 and 2020 in a county. Rather than measure these in relation to their respective party’s 2016 vote count, as was done previously, we shall now calculate these as a percentage of total votes cast in 2016 for both parties’ candidates (Trump and Clinton) combined.
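The change of denominator can be made concrete with a small sketch. The function name and the county figures below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the Pennsylvania data.

```python
def gain_pct_of_combined(votes_2020, votes_2016, combined_2016):
    """Candidate's 2016-to-2020 vote gain as a percent of the county's
    combined (Trump plus Clinton) 2016 vote count."""
    return 100.0 * (votes_2020 - votes_2016) / combined_2016


# Hypothetical county, for illustration only (not actual Pennsylvania figures).
trump_2016, clinton_2016 = 6_000, 4_000
trump_2020, biden_2020 = 6_600, 4_800
combined_2016 = trump_2016 + clinton_2016

# Both gains are divided by the same 2016 combined count.
trump_gain = gain_pct_of_combined(trump_2020, trump_2016, combined_2016)
biden_gain = gain_pct_of_combined(biden_2020, clinton_2016, combined_2016)
print(trump_gain, biden_gain)
```

In this made-up county, measuring each gain against the candidate's own party's 2016 count would give 10 percent and 20 percent; with the common denominator the two gains become directly comparable shares of the same base.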

We then estimate linear regression equations for each candidate’s percentage increase in votes across counties. These regressions account for the contribution of growth in registrations and other relevant, quantifiable factors. For those readers unfamiliar with linear regression, please just bear with us.

The estimated regression equations yield striking findings. Trump’s increase in votes between 2016 and 2020 in each Pennsylvania county, as a percentage of the county’s combined (Trump plus Clinton) 2016 vote count, is mostly explained by net growth in registered Republicans during 2020 (see chart 4B). The estimated equation also indicates that net growth in registered voters during 2020 maps roughly one-to-one into additional votes. Beyond that, Trump’s vote gain amounted to 1.9 percent of the 2016 combined count in each county. A positive relationship to Trump’s share of the 2016 combined count is also indicated by the regression equation, which makes intuitive sense as it reflects greater enthusiasm for Trump in Republican stronghold counties.
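For readers who want to see how a slope and an intercept are read off such an equation, here is a minimal sketch. It is not the author's model: it uses a single predictor, whereas the regressions described here include additional variables, and the four counties are made up for illustration. It also assumes that registration growth is expressed in the same units as the vote gain (percent of the 2016 combined count).

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up counties, for illustration only (not the Pennsylvania data).
# reg_growth: net growth in a party's registrations (percent of 2016 combined count)
# vote_gain:  the candidate's vote gain (percent of 2016 combined count)
reg_growth = [0.0, 2.0, 4.0, 6.0]
vote_gain = [2.0, 4.0, 6.0, 8.0]

intercept, slope = fit_line(reg_growth, vote_gain)
print(f"intercept = {intercept:.1f}, slope = {slope:.1f}")
```

A slope near 1 corresponds to the "one-to-one" mapping from added registrations to added votes, and the intercept is the part of the vote gain not attributable to registration changes, which is the role the 1.9 percent figure plays for Trump.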



Chart 4A, Biden: Percent change in number of votes between 2016 and 2020 relative to the 2016 combined count, by party — overall versus portion explained by net change in the party’s registered voters (Note: each dot represents a county, located along horizontal axis in relation to Trump’s 2016 share of votes)


Chart 4B, Trump: Percent change in number of votes between 2016 and 2020 relative to the 2016 combined count, by party — overall versus portion explained by net change in the party’s registered voters (Note: each dot represents a county, located along horizontal axis in relation to Trump’s 2016 share of votes)

Similarly, Biden’s net additional registered voters in a county during 2020 (where he had a gain) maps one-to-one into additional votes, and net loss of registered voters during 2020 maps five-to-two into lost votes. On average, beyond the amounts attributable to changes in registered voter counts, Biden’s vote gain amounted to 7.2 percent of the 2016 combined count, nearly quadruple that of Trump.


Other potential explanatory variables, including Clinton’s share of the 2016 combined count, were tested, but none exhibited a statistically significant relationship to Biden’s vote gains in a county, with one important exception. A high concentration of vacation homes in a county in eastern Pennsylvania (based on data from the National Association of Home Builders) is associated with a larger percentage vote gain for Biden, aligning with the previously noted Pike County effect. The statistical significance of this indicator suggests that Pennsylvania may not have had adequate guardrails in place to prevent acquisition of mail-in ballots by non-residents.

Thus, on average, Biden gained nearly four times as many votes as Trump from the long-term (i.e., inherited from 2016) pool of registered voters. This advantage is observed not just in counties that lean Democrat but in many Republican strongholds as well. Both the size of this Democrat advantage and its constancy across counties are puzzling.

No doubt, the total voter population grew through mail-in voting, since mail-in voting made it easier for individuals who had sat out 2016 to participate in 2020. Thus, it is conceivable that the Democrats exploited the availability of mail-in voting more successfully than the Republicans, attracting more voters from this cohort of 2016 non-voters.

If mail-in voting is the explanation, however, why such a wide margin between Republican (1.9 percent) and Democrat (7.2 percent) gains? In other words, it seems odd that among those who had not voted in 2016, so few would vote for Trump relative to Biden when given the mail-in option.

And why the similarity of outcomes between counties that are predominantly Democrat and those that are not? Intuitively, one would expect that Democrats would gain proportionately more via mail-in where their share of the registered voter pool is relatively large.

One might instead hypothesize that the disproportionate increase in Democrat votes between 2016 and 2020 was not tied to mail-in voting but came from 2016 Trump voters who had abandoned Trump. Possibly so, but the abandonment (or return to the Democrat fold) would have had to occur at a similar rate between Republican strongholds and other counties, which again seems counterintuitive. Nor do these scenarios seem consistent with Trump’s strength in attracting new registered voters.

Not only has our foray into statistical analysis so far failed to explain Biden’s success in gaining votes, it has compounded the mystery. But rather than call off our magical numerical mystery tour, let us press on, taking a closer look at the data on mail-in voting.

Data on mail-in voting in Pennsylvania and other states show that Pennsylvania had a larger gap in mail-in ballot return rates between Republicans and Democrats (a lower return rate for Republicans) than any other state for which such data are available. This is not merely an observation that Democrats are more inclined toward mail-in voting. Rather, among those who requested and were sent mail-in ballots, registered Democrats were more likely (by about 8 percentage points) to have returned the ballot (thereby having their votes counted) than registered Republicans.

But this is not the end of the story. The aggregate, statewide gap in mail-in ballot return rates by party affiliation masks the fact that Democrats disproportionately reside in more highly populated counties, which tend to have lower return rates. Another linear regression equation, in this case relating mail-in ballot return rates in a county to number of voters and Trump’s percent of votes in 2016, suggests a much larger disparity, in the neighborhood of 20 percentage points. (Again, we exclude Philadelphia because of its outlier status, and we also exclude Allegheny because it is missing from the data.) Thus, for example, in a county where 95 percent of Democrats are returning their requested mail-in ballots, only 75 percent of Republicans apparently are doing so.

A return rate gap of this magnitude potentially can explain much, if not all, of the Democrat versus Republican disparity in new votes added between 2016 and 2020. Trump received close to 600,000 votes by mail-in; an additional 20 percent implies an additional 120,000 votes.
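The arithmetic behind the 120,000 figure can be written out in a few lines. The sketch applies the 20 percent directly to the mail-in vote count, as the text does; the variable names are illustrative, and 600,000 is the rounded figure quoted above.

```python
trump_mail_in_votes = 600_000  # "close to 600,000", as stated in the text
additional_pct = 20            # the roughly 20-point gap, applied as in the text

# Integer arithmetic keeps the result exact.
additional_votes = trump_mail_in_votes * additional_pct // 100
print(additional_votes)
```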

What are the reasons for the mail-in ballot return rate gap in Pennsylvania? Lacking additional data, we are limited in what further insights we can draw on the basis of statistical analysis. We are nearing the end of our tour.

But there is a potential, partial explanation for the return-rate gap that merits serious consideration, deriving from the fact that anyone voting by mail in the 2020 primary could check a box and automatically receive a mail-in ballot for the general election. Primary voters thus opting for automated send-out of a mail-in ballot for the general election likely received that ballot in a more timely and assured manner than those requesting a mail-in ballot later on.

This distinction very plausibly favored Democrats, because they proportionately were more likely to participate in the primary and may also have been more inclined to vote by mail in the primary. In other words, the design of the mail-in ballot for the 2020 primary could have created a non-level playing field favoring Democrats, facilitating their higher mail-in ballot return rate during the general election.

Data on the percent of registered voters who participated in the 2020 primary election in each county are also available from the Pennsylvania Department of State and can be used to assess the plausibility of the scenario just described. These data, in fact, confirm that the percent of registered Democrats who voted in the primary is predictive of the mail-in ballot return rate in the county, consistent with the posited scenario.

This scenario demonstrates that nuanced details of how mail-in voting was implemented in a state, whether deriving from legislative, judicial, or bureaucratic decisions, could have consequences for whether the process played out fairly. In contrast to Pennsylvania, other states, such as Connecticut and South Dakota, ensured a comparatively level playing field by sending out applications for mail-in ballots to all registered voters at around the same date. Thus, it seems fair to criticize Pennsylvania for proceeding in a manner that clearly could have favored one party over another.

Another aspect of the mail-in voting in Pennsylvania that could have generated return-rate disparities was inconsistency across counties in how mail-in ballots containing flaws were handled. According to an investigative report in the Philadelphia Inquirer, “the state left it to counties to decide how aggressive to be in trying to contact voters to help them fix their ballots.” This resulted in “a patchwork of policies around how — or even whether — people are notified and given a chance to make their votes count.”

Finally, the mail-in vote return-rate gap is consistent with reports of irregularities involving mail-in ballots, including incidents for which affidavits reportedly have been filed. Such reported activities range from deliberate, wholesale discarding of ballots that contained votes for Trump to mysterious deliveries of thousands of ballots all or almost all of which allegedly contained votes for Biden.

To sum up, the observed outcomes are puzzling and counterintuitive, and surely consistent with reports of suspicious activities that may have artificially boosted the return rate of mail-in ballots requested by Democrats and/or impeded return of the ballots that had been requested by Republicans. The oddly systematic nature of Biden’s vote gains despite the Republicans’ success in growing their registered voter rolls, as documented here, provides support for the view that allegations of systematic fraud should be taken seriously.

The analysis also highlights likely effects of arbitrary decisions governing mail-in ballot processes, as well as failure to put in place adequate guardrails against non-resident mail-in voting. Thus, the analysis is supportive of calls for judicial review of whether the mail-in process as implemented in Pennsylvania violated basic constitutional guarantees of free and fair elections.

Editor’s note: Chart 4b was updated on February 1, 2021, to fix a mismatch in ranges between Panel A and Panel B. The data reported has not changed.

The author has a BA in mathematics and a Ph.D. in economics, has many years of experience in economic research, and is now semi-retired. While he considers himself a political independent, he does not pretend to be completely objective: his personal view is that President Trump’s opponents have used shamefully duplicitous if not unlawful tactics in their quest to derail the president, ongoing since 2016. His goal in writing this article, however, is simply to highlight what he sees in publicly available data, and he prefers that readers draw their own conclusions, or revisit the data themselves if they are so inclined.
