MIT Shiva Ayyadurai Analysis
RSPV = Republican Straight Party vote (for Trump)
RFBV = Republican Full Ballot Vote fraction of OVT
CPV = Cross party vote
OVT = Other votes for Trump
OVT = RFBV + CPV (does RFBV INCLUDE mail-ins? Dunno)
So the formulation in the presentation, (OVT - RSPV), is really RFBV + CPV - RSPV. There's no way of KNOWING the CPV without registration data; it's just an assumption made in the presentation.
The Y axis metric is ACTUALLY (RFBV + CPV - RSPV)
The X axis just RSPV.
SUMMARY
What the presenters allege to have found is a "man-made" artifact in actual Michigan precinct data, purposely inserted into the vote tally machines. The assumption was that the "redder" the precinct, the more votes there should be for Trump. Their use of RSPV is a proxy for the ACTUAL number of Repubs in that precinct. I question WHY they fixated on the two choices in method of voting (Straight Party vote or Full Ballot vote) as their PRIMARY variables, because these variables are just the percentages of REPUBLICAN voters in that precinct that chose either method. NONE of the variables in the axis metrics includes any feature of the vote count that says anything about the competitiveness of that race. It LACKS:
1) Any relationship to competitive voter TURNOUT advantage in that precinct.
2) Any numerical clues as to the SIZE of that precinct and TOTAL #s cast in the race.
3) Any useful forensic information about the REGISTRATION declarations in that precinct.
Note that RSPV + RFBV as percentages HAVE to equal 100% of the Republicans that voted, REGARDLESS of turn-out or the number of registered Repubs in the district. BOTH variables are SIMPLY "choices of METHOD of voting".
Actually there were 3 choices of voting method for (R)s. Did Dr. Shiva's analysis miss mail-ins???
Or did the data spilled into the graph include mail-ins?? Mail-ins offer the same 2 choices, but the fact that the graph behaved the SAME for early voting as it did for "day of" voting makes me wonder if mail-ins were even counted when this data was spilled into the graphs. Mail-ins wouldn't reduce the choices of Full Ballot or Straight ballot, but would interfere with conclusions on how the vote SHOULD have gone.
If the 2 primary variables HAVE to add to 100% -- then for the Y axis = (RFBV + CPV - RSPV) -- as RSPV goes up as the districts get "REDDER", the other of the 2 choices of method goes down by the SAME AMOUNT. You "rob from Peter to pay Paul". So as the RSPV goes UP -- the Y axis line follows with a linear decline.
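A quick way to see the "rob from Peter to pay Paul" effect is to sweep the straight-party share and compute the Y metric directly. This is only a sketch: the function name `y_metric`, the default 5% CPV, and the assumption that RSPV and RFBV are shares of the same Republican voters (so they add to 100) are mine, not the presenters'.

```python
def y_metric(rspv_pct, cpv_pct=5.0):
    """Y = RFBV + CPV - RSPV, with RFBV forced to be 100 - RSPV (assumption)."""
    rfbv_pct = 100.0 - rspv_pct  # the two voting methods must add to 100%
    return rfbv_pct + cpv_pct - rspv_pct


# Sweep the straight-party share in 10-point steps.
xs = list(range(0, 101, 10))
ys = [y_metric(x) for x in xs]

for x, y in zip(xs, ys):
    print(f"RSPV = {x:3d}%  ->  Y = {y:+6.1f}")

# Change in Y for each 10-point rise in RSPV.
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
print(steps)
```

Every step comes out the same (-20 per 10 points of RSPV), which is just Y = 100 + CPV - 2*RSPV: a straight line with slope -2 before any real data is involved.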
So because the SET-UP methodology was botched by a fascination with using RSPV as an indicator of HOW RED a precinct was -- what was actually PROVED is their contention that RSPV is one way to show the redness of a precinct. The only time you'd expect Y to equal zero and fall on their 0% line is when the (R) voters split their voting method exactly 50%/50% between the 2 choices. And UNFORTUNATELY -- that's the EXACT scenario chosen to present the "Two Types of Voting" graphic that I post below. Don't suspect they did this on purpose.
Maybe the entire production and thinking was just too hasty. This is Algebra 1 level math. But it demonstrates that correctly SETTING UP THE PROBLEM is the larger part of driving a mathematical analysis.
So below I show a couple examples using DIFFERENT data points for the "voting preference" and how the SLOPE of ANY DATA poured into that graph will be a mostly linear line with a negative slope. I also was curious about how to pick a better problem set-up and ventured a couple ideas on how to "maybe" forensically find "man-made" anomalies in the voting results.
**************************************************************************************************************
Examples showing that a 50/50 split in choosing one or the other methods is the ONLY expected solution for Y=0 ::::
From the "Two Types of Voting" chart..
50/50 split on choice between RFBV vs RSPV.. Results in 65% and 60% vote total for Trump respectively.
That corresponds to vote percentage of 62.5% Trump, 37.5% for Biden.
Or 125 votes for Trump, 75 for Biden.
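The arithmetic behind those numbers, as a small sketch. The 200 total ballots is implied by 125 + 75; splitting them into 100 straight-party and 100 full-ballot voters is just one reading of the chart that reproduces the 62.5%, so treat that split and the variable names as assumptions.

```python
total_ballots = 200        # implied by 125 + 75
straight_voters = 100      # assumed: half chose straight party
full_ballot_voters = 100   # assumed: half chose full ballot

trump_straight = straight_voters * 60 / 100   # 60% on the straight-party side
trump_full = full_ballot_voters * 65 / 100    # 65% on the full-ballot side

trump_total = trump_straight + trump_full
biden_total = total_ballots - trump_total
trump_share = 100 * trump_total / total_ballots

print(trump_total, biden_total, trump_share)
```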
What happens when you DON'T assume a 50/50 split in the choice of HOW they voted? Leave CPV the same 5%.
So try 70% RSPV and 30% RFBV::::
So given his example, the outcome was 125 for Trump, 75 for Biden. That CANNOT CHANGE depending JUST on the METHOD that Republicans chose to use.
70% of 125 = 87.5 votes with RSPV (can't help the fraction)
30% of 125 = 37.5 votes with RFBV (can't help the fraction)
We've changed nothing BUT the way (R)s chose to vote. So, leaving CPV as it was, the Y metric becomes:
Y = (RFBV + CPV - RSPV) = 37.5 + CPV - 87.5 = -50 + CPV. HOW DID WE CHANGE Y simply by changing the assumption on how many folks voted Straight Party ticket!!!! At 50/50, Y WAS just the CPV, a +5%...
It's because Y is DEPENDENT on the PREFERENCE to vote Full Ballot or Straight Ballot. And as RSPV goes UP -- RFBV MUST go down. The vote total for Repubs must remain the same: the 125 votes in this case that were cast for Trump.
So the assumption that RSPV should indicate the final vote for Trump ain't shown by the chosen Y metric. Part of the problem was choosing a deceptive 50/50 in vote preference. Get into this more later.
***********************************************************************************************
You can skip this part -- but let's do the same thing assuming a 40% RSPV and a 60% RFBV.
40% of 125 = 50 votes with RSPV
60% of 125 = 75 votes with RFBV
We've changed nothing BUT the way (R)s chose to vote. So the Y metric becomes:
Y = (RFBV + CPV - RSPV) = 75 + CPV - 50 = 25 + CPV. HOW DID WE CHANGE Y simply by changing the assumption on how many folks voted Straight Party ticket!!!!
***********************************************************************************************************
My two examples bracket either side of the 50/50 voting preference in his set-up. Plot the 3 on his graph and we get a line that goes down linearly from 40% RSPV on the X axis (40, 25 + CPV) to 70% RSPV on the X axis (70, -50 + CPV). Ignoring CPV, it goes thru Y=0 at the unfortunate 50/50 vote method point he chose to set up the problem. In fact 50/50 is the ONLY solution that falls on their "0% axis". Accident? Or on purpose? Dunno.
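The three cases can be checked in a few lines. This sketch holds the Trump total at 125 and leaves CPV out (set to 0) so that only the voting-method split changes; the function name `y_votes` is mine.

```python
TRUMP_VOTES = 125  # fixed outcome from the example above


def y_votes(rspv_pct, cpv=0.0):
    """Y = RFBV + CPV - RSPV in votes, for a given straight-party percentage."""
    rspv = TRUMP_VOTES * rspv_pct / 100
    rfbv = TRUMP_VOTES - rspv  # whatever is not straight-party is full ballot
    return rfbv + cpv - rspv


points = [(pct, y_votes(pct)) for pct in (40, 50, 70)]
print(points)

# Slope between neighbouring points; equal slopes mean the points are collinear.
slopes = [
    (y2 - y1) / (x2 - x1)
    for (x1, y1), (x2, y2) in zip(points, points[1:])
]
print(slopes)
```

Both slopes come out to -2.5 votes per percentage point of RSPV, so the three points sit on one straight line even though the 125/75 result never changed.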
At this point I'd LIKE to introduce variables to FIX this. But that would change the METHOD selected by the authors.. And I offer an example of that below in "Was there a way to do this right?"
If you look at the SLOPE of the graph metrics, which is simply Y/X, you get:
(RFBV + CPV - RSPV) / RSPV. Ignoring CPV, an alternate expression for the slope is --> (RFBV/RSPV) - 1
You can easily see that as RSPV GOES UP (Deeper RED district) -- RFBV is gonna decline. Because of the fixed number of Repubs in that precinct and the fact that the 2 methods have to add to 100%..
ANY data spilled into a graph with that metric is GUARANTEED to be a line with a negative slope.. I'm fairly certain that if you plotted the results for ONLY Dems as they did here for ONLY Repubs, you'd get the same result.
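To test that claim without any real election data, here is a sketch that generates random, unmanipulated precincts and fits a line to the resulting (X, Y) points. Everything in it is made up for illustration: the ranges for the straight-party share and CPV, the seed, and the helper `fit_line`. It also bakes in the assumption argued above, that RSPV and RFBV are shares of the same Republican voters.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(2020)  # fixed seed so the run is repeatable
xs, ys = [], []
for _ in range(500):
    rspv = rng.uniform(20, 80)   # % of (R) voters going straight party
    rfbv = 100 - rspv            # the rest vote full ballot
    cpv = rng.uniform(0, 10)     # cross-party noise, unrelated to rspv
    xs.append(rspv)
    ys.append(rfbv + cpv - rspv)

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
```

The fitted slope should land close to -2 with no manipulation anywhere in the data, which is the point: the downward line comes from the metric, not from the votes.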
WAS THERE A WAY TO DO THIS RIGHT????
The objective here was to find "vote flipping" from (R) to (D) due to exploitation of a "feature" embedded into the voting system to "weight" or fractionally manipulate the results. Any feature like that would be EASILY detectable if the PRODUCT of the weighting factors for (D)s and (R)s did NOT equal 1, because it would be betrayed by the fact that the numerical sum of (D) and (R) votes would not equal the VOTE TOTAL.
You would have some choices on HOW to apply that..
1) At the individual precincts versus just at the STATE level tally center.
2) ALL precincts or just "some precincts"
3) Same weight for all -- or a weight related to the "blue/red color" of the precincts.
Doubt that ALL precincts would be manipulated. TACTICALLY hard to do. And some precincts are just too polarized for this to pass the smell test.
And getting fancy by SKIMMING more votes in DEEP RED counties would also raise alarms. Which is what Dr. Shiva alleged happened.
That's too easy to find. So the least detectable method would be FLAT weighting over a few select precincts at the State level of the tally.
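The vote-total check mentioned above (weighted (D) and (R) counts have to add back up to the ballots cast) can be sketched like this. The weights, the function names, and the 125/75 precinct are hypothetical; this says nothing about how any real tabulator works.

```python
def apply_weights(r_votes, d_votes, w_r, w_d):
    """Scale each party's count by its own weight (hypothetical scheme)."""
    return r_votes * w_r, d_votes * w_d


def total_is_conserved(r_votes, d_votes, w_r, w_d, tol=1e-9):
    """True if the weighted counts still add up to the ballots cast."""
    r_w, d_w = apply_weights(r_votes, d_votes, w_r, w_d)
    return abs((r_w + d_w) - (r_votes + d_votes)) < tol


r, d = 125, 75

# Skim 4% off (R) and leave (D) alone: 5 votes go missing from the total.
naive_ok = total_is_conserved(r, d, w_r=0.96, w_d=1.0)
print(naive_ok)

# To hide the skim, the (D) weight has to absorb exactly those 5 votes.
w_d_hidden = (d + 0.04 * r) / d
hidden_ok = total_is_conserved(r, d, w_r=0.96, w_d=w_d_hidden)
print(hidden_ok)
```

The first case fails the total check and the second passes it, so a flat weighting that is balanced to keep the total intact would not be caught by this check alone.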
I think the authors got too fascinated by the chance to use Straight Party Voting as a proxy for how RED a precinct is. The BETTER approach would be to get publicly published data for %(R) and %(D) in each precinct. Using (%R) would NOT be dependent at all on what method -- straight party or full ballot or mail-in -- they chose to use. Any "party cross-over" in the real data would just be noise on the graph (maybe HIGHER than what's shown in THEIR graphs, because INDEPENDENTS now decide elections pretty much). And Independents tend to wane at both extremes of the X axis.
That would be the X axis. It ranges from 0 to 100.
For the Y axis -- just use %TrumpVote, which = RepubTrump + CPV. The suspected FRAUD would be the outliers that don't fit the linear regression line.
The EXPECTED slope would be Y/X = (%Trump) / (%R). As %R goes up -- the %Trump should go up. Maybe with some "never Trump" effect and the variance from CPV. It's ESSENTIALLY what they TRIED to do before all the different choices of voting METHOD kicked in: have a direct measure of "party loyalty" for X -- and a measure of the voting RESULT as the Y variable.
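A minimal sketch of that set-up: fit a line to (%R, %Trump) and flag precincts that sit far from it. The precinct numbers are invented for illustration, as are the helper names and the cutoff of 2 residual spreads; real registration and results data would have to replace them.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical precincts as (%R, %Trump). Invented numbers, NOT real data.
precincts = [
    (20, 24), (30, 33), (40, 42), (50, 52), (55, 56),
    (60, 61), (65, 55), (70, 71), (80, 79), (90, 88),
]

xs = [p[0] for p in precincts]
ys = [p[1] for p in precincts]
slope, intercept = fit_line(xs, ys)

# Distance of each precinct from the fitted line, and the typical spread.
residuals = [y - (intercept + slope * x) for x, y in precincts]
spread = (sum(r * r for r in residuals) / len(residuals)) ** 0.5

CUTOFF = 2.0  # arbitrary: flag anything more than 2 spreads off the line
flagged = [p for p, r in zip(precincts, residuals) if abs(r) > CUTOFF * spread]

print(f"slope = {slope:.2f}, intercept = {intercept:.1f}")
print("flagged:", flagged)
```

With these made-up numbers only the (65, 55) precinct gets flagged. A small, flat shift spread over several precincts would move the residuals much less than that, which is the worry about noise.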
Don't KNOW if a scatterplot would REVEAL hijinks in just a few select precincts. Think there's too much "independent" factor in voting nowadays. And FLIPPING 4% or LESS as a weight would be lost in the noise of the 40% of Independents and 3rd parties in the vote, EXCEPT at the FAR ENDS of the X axis where Indies just become a "rare species".
My opinion is to FIND exploitation of "weighting" vulnerabilities in voting systems, you pretty much have to CATCH THEM IN THE ACT of cranking in the weighting factors..
Scholars have shown Democratic constituencies are more likely to vote straight ticket than other groups.[8] Studies from selected elections show that Democrats are advantaged more than Republicans by straight party voting in vote share.
There's a graph in there showing SPV DECREASED for Texas Repubs from 2016 to 2018 by 10% once Trump was elected. And historically about 1/2 of them voted SPV until it was eliminated there in 2019.
Note in THIS slide what he calls INDIVIDUAL VOTES is the RFBV.. The 50% of Repubs that chose Full Ballot. And shortly after he plots it on the 0% line. It's the ONLY CASE that falls on the 0% line if you define "individual votes" this way..
However in THIS slide -- he appears to be using the TOTAL Trump vote but still referring to it as the "Individual vote", because you cannot have 60% of Repubs choosing Straight Party and 65% of them choosing "individual" or Full Ballot. You've just manufactured 125% of Republicans voting!!!
That's what initially confused me..
But what he uses on the ACTUAL Michigan data charts is the RFBV that he calculates from the vote totals. Not this ALTERNATIVE interpretation in the last slide. So my analysis above is correct if you look at what he extracts from the election results.