The idea that this is somehow proof of election fraud is laughable in my mind.
His analysis splits the electorate up into two groups. One group votes straight ticket and one votes for each candidate individually. His assumption is that people who vote straight ticket and people who vote for each candidate individually should vote for Trump at equal rates.
This assumption is completely untested and unfounded. He barely tries to defend it. He spends no time attempting to demonstrate this is what we should assume is correct. There's no normal comparison group. Nothing. We are just supposed to believe it's true and use this as a basis for saying that votes were shifted.
I think, instead, he's just demonstrating human nature.
I think it's pretty obvious that the people who vote straight ticket are more partisan than those who do not. Whether they are Democrats or Republicans, they're partisan. That means those in the middle are more likely to be swing voters. I think that swing voters are less likely to vote heavily one way over the other; their vote is more evenly distributed between Biden and Trump.
Therefore, when a precinct is highly partisan Republican, that is, when a high percentage of its voters cast a straight party ticket for Republicans, the swing voters are unlikely to be equally heavily Republican, and vice versa. This gives the exact same outcome as his data without the need for fraud.
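To make that concrete, here is a minimal sketch with made-up precincts and no fraud at all. Everything in it is an assumption for illustration: the axis definitions (x is the Republican share among straight-ticket voters, y is Trump's share among individual-candidate voters minus x), the `swing_pull` parameter, and the noise level. None of it comes from his data.

```python
import random


def simulate_precincts(n=200, swing_pull=0.5, seed=0):
    """Return (x, y) pairs for n synthetic precincts with no vote shifting.

    x: Republican share among straight-ticket voters (assumed axis).
    y: Trump share among individual-candidate voters minus x (assumed axis).
    swing_pull: hypothetical parameter; below 1 means individual voters lean
    the same way as the precinct, only less strongly.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x = rng.uniform(0.1, 0.9)
        # Individual voters sit closer to 50/50 than straight-ticket voters.
        trump_individual = 0.5 + swing_pull * (x - 0.5) + rng.gauss(0, 0.02)
        points.append((x, trump_individual - x))
    return points


def ols_slope(points):
    """Ordinary least-squares slope of y against x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


points = simulate_precincts()
slope = ols_slope(points)
print(f"slope of y against x: {slope:.2f}")
```

Under these assumptions the slope comes out negative, close to `swing_pull - 1`. Setting `swing_pull=1.0` (individual voters exactly as partisan as straight-ticket voters) is the only case where the line is flat, which is the assumption he never defends.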
Using SPVoting as a proxy for the "republicanism" of any particular precinct IS problematic.. Because it's ASSUMED that SPV will go up "in some mathematical fashion" correlated with the partisanship of that precinct.
We know intuitively that's true. But SPV doesn't prove that.
Will work for Dems as well..
But it's NOT a definitive measure of the partisanship.. I can give you a couple better ones..
1) Use the percent REGISTERED Repubs in that precinct.. It's available -- have at it..
2) Better choice. Because this MIT analysis never uses any metric other than the Repub votes. And No TURNOUT factor. And turnout boosts the vote for one party or another. So use the relative TURNOUT for each party.. This metric isn't recorded (to my knowledge) in a General election race, because no one is queried for party at polls or by mail.. But it can be DISCOVERED thru inquiries to the registrars or approximated by the percentages posted in the results for the Prez race..
3) Just use the final vote for Trump/Biden in that district as your "measure of partisanship".
So much for his "x-axis" independent variable.. Now look at the Y axis.. (See OPost for definitions)
(OTV - RSPV) / RSPV
This choice ONLY means anything in terms of the method voters choose to fill out a ballot.. NOTHING AT ALL to do with the actual competition or imbalances in Turnout. And actually the "minus" seems kinda arbitrary at first glance. But it's NOT.. The way it's formulated -- as RSPV goes UP -- the numerator goes down linearly, but the fraction goes down even faster.
Seems to me the clearer metric is to use --
(OTV + RSPV) / RSPV
In this case the Y value has strategic meaning.. The numerator is TOTAL VOTES for Trump in that district. It wasn't the other way 'round.. And in this case -- as RSPV goes UP -- the NUMERATOR goes up -- but the fraction still goes down.
MY BET IS -- if you used the latter Y formula -- the curves would be FLAT like he postulates they SHOULD BE -- if there was no "machine hanky panky"..
I'm looking at how much work it would be to get Michigan precinct data for one of Dr. Shiva's examples.. And it would take maybe an hour to RE-RUN a scattergram with that change..
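Before pulling the real precinct files, a quick check on the two Y formulas is worth a minute. This is only a sketch: it assumes OTV and RSPV are raw vote counts per precinct (which is what "TOTAL VOTES for Trump" suggests), and the numbers below are invented, not Michigan data. If the original post defines them as percentages with different denominators, this check does not apply.

```python
def y_minus(otv, rspv):
    """Y as in the analysis under discussion: (OTV - RSPV) / RSPV."""
    return (otv - rspv) / rspv


def y_plus(otv, rspv):
    """Proposed alternative: (OTV + RSPV) / RSPV."""
    return (otv + rspv) / rspv


# Invented (OTV, RSPV) counts for five hypothetical precincts.
precincts = [(400, 100), (350, 200), (300, 300), (250, 400), (200, 500)]

for otv, rspv in precincts:
    a = y_minus(otv, rspv)
    b = y_plus(otv, rspv)
    print(f"RSPV={rspv:4d}  minus={a:6.2f}  plus={b:6.2f}  gap={b - a:.2f}")
```

With counts, the first formula is OTV/RSPV - 1 and the second is OTV/RSPV + 1, so the gap is 2 in every precinct. Under that reading, swapping the minus for a plus would move the whole scattergram up by 2 without changing its slope, so the re-run may need a different Y (or one of the partisanship measures listed above on the X axis) to test whether the curves flatten.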