
Non-citizen registrations with previous voting history in VA election data – update Oct 2024

We have updated our previous analysis (from March, July, and Sept) with the latest information from the VA Department of Elections data.

Abstract:

Using the data provided by the VA Department of Elections (ELECT), we have identified at least 3,533 unique registrations that were identified as “Declared Non-Citizen” and removed by ELECT from the voter rolls since May of 2023. In the last update period, there was a significant(!) increase, equal to almost 50% of the TOTAL number of non-citizen removals that we’ve seen in our previous reporting (see data below). This increase does not appear to be an artifact of our processing routines, as all of the processing code has remained the same since our last update. We should also note for the record that these are only self-declared removals, and it does not appear from the data available that there were any changes to the standard process used by ELECT.

Of those 3,533 removals there were 537 that also had corresponding records of recent ballots cast at some point in the official Voter History record that we could observe. There were 1,296 associated ballots cast identified since Feb of 2019. There were an additional 2 non-citizen registrations and ballots as per the Daily Absentee List (DAL) data, that were not contained in the Voter History data.  The total number of identified non-citizen ballots cast is therefore 1,298 by 539 registrants when combining unique VHL and DAL identifications.

After our March 2024 post on this topic, we submitted all of the relevant information that we had at the time to the VA AG’s office. We have not heard any response or update on the matter since that time, besides this being considered an active investigation. We subsequently sent our July results as well to the same contact at the AG’s office, but have had no response.

The Arlington County VA Electoral Board undertook their own investigation into this matter after our previous results were posted, and they recently (as of Sept 10 2024) voted 3-0 to send the information to the AG’s office as well. The Arlington County Commonwealth’s Attorney is also reported to have an ongoing investigation into the matter. Similar efforts are underway in multiple other counties, including Loudoun and Fairfax counties, to name a few.

https://www.gazetteleader.com/arlington/news/investigation-launched-have-non-citizens-voted-in-arlington-9379534

https://www.gazetteleader.com/arlington/news/va-attorney-general-to-be-alerted-on-possible-non-citizen-voting-9504753

Background:

The VA Department of Elections continuously tries to identify and remove invalid or out-of-date registration records from the voter rolls. One category used for removal is if a registrant has been determined to be a non-citizen. The VA Constitution requires that only citizens be allowed to vote in VA elections.

In elections by the people, the qualifications of voters shall be as follows: Each voter shall be a citizen of the United States, shall be eighteen years of age, shall fulfill the residence requirements set forth in this section, and shall be registered to vote pursuant to this article. …

VA Constitution, Article II, Section 1. https://law.lis.virginia.gov/constitution/article2/section1/

Additionally, according to VA Code Section 24.2-1004, the act of knowingly casting a ballot by someone who is not eligible to vote is a Class 6 felony.

A. Any person who wrongfully deposits a ballot in the ballot container or casts a vote on any voting equipment, is guilty of a Class 1 misdemeanor.

B. Any person who intentionally (i) votes more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (ii) procures, assists, or induces another to vote more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (iii) votes knowing that he is not qualified to vote where and when the vote is to be given, or (iv) procures, assists, or induces another to vote knowing that such person is not qualified to vote where and when the vote is to be given is guilty of a Class 6 felony.

https://law.lis.virginia.gov/vacode/title24.2/chapter10/section24.2-1004/

ELECT makes available for purchase by qualifying parties various data sets, including the registered voter list (RVL) and the voter history list information file (VHL). Additionally, ELECT makes available a Monthly Update Service (MUS) subscription that is published at the beginning of each month and contains (almost) all of the Voter List changes and transactions for the previous period.

In the MUS data there is a “NVRAReasonCode” field that is associated with each transaction that gives the reason for the update or change in the voter record. This is in accordance with the disclosure and transparency requirements in the NVRA. One of the possible reason codes given for records that are removed is “Declared Non-Citizen.”

EPEC has been consistently purchasing and archiving all of these official records as part of our ongoing work to document and educate the public as to the ongoing operations of our elections. (If you’re interested in supporting this work, please head on over to our donation page, or to our give-send-go campaign to make a tax-deductible donation, as these data purchases are not cheap!)

EPEC looked at the number of records associated with unique voter identification numbers that had been identified for removal from the voter record due to non-citizenship status, per the entries in the MUS, and correlated those results with our accumulated voter history list information in order to determine how many non-citizen registrations had corresponding records of ballots cast in previous elections. We only considered those records that are currently in a non-active state as of the latest MUS transaction log, as some determinations of non-citizenship status in the historical MUS transaction log might have been due to error and subsequently corrected and reinstated to active status. That is, we are not considering those records that had a “Declared Non-Citizen” disqualification, but were then subsequently reinstated and reactivated by ELECT.
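As a rough illustration of the selection logic described above, here is a small Python sketch. It flags voter IDs with a “Declared Non-Citizen” transaction, drops any ID whose most recent transaction shows it active again, and then matches what remains against voter history. Only the NVRAReasonCode field name and the “Declared Non-Citizen” value come from the MUS description; the other field names (voter_id, date, status), the status and reason values, and the sample rows are hypothetical. This is not EPEC’s actual processing code.

```python
from collections import defaultdict

# Hypothetical MUS-style transactions. Only "NVRAReasonCode" and the
# "Declared Non-Citizen" value are taken from the description in the post.
mus = [
    {"voter_id": "A1", "date": "2023-06-01", "status": "CANCELED", "NVRAReasonCode": "Declared Non-Citizen"},
    {"voter_id": "B2", "date": "2023-06-01", "status": "CANCELED", "NVRAReasonCode": "Declared Non-Citizen"},
    {"voter_id": "B2", "date": "2023-08-15", "status": "ACTIVE", "NVRAReasonCode": "Reinstated"},
    {"voter_id": "C3", "date": "2023-07-10", "status": "CANCELED", "NVRAReasonCode": "Moved"},
]
# Hypothetical voter history rows: one row per ballot cast.
vhl = [
    {"voter_id": "A1", "election": "2020 November General"},
    {"voter_id": "A1", "election": "2022 November General"},
    {"voter_id": "B2", "election": "2020 November General"},
]

def non_citizen_removals(transactions):
    """Voter IDs ever flagged as Declared Non-Citizen and not active as of their latest transaction."""
    latest = {}
    flagged = set()
    # ISO dates sort correctly as strings, so later rows overwrite earlier ones.
    for t in sorted(transactions, key=lambda t: t["date"]):
        latest[t["voter_id"]] = t
        if t["NVRAReasonCode"] == "Declared Non-Citizen":
            flagged.add(t["voter_id"])
    return {v for v in flagged if latest[v]["status"] != "ACTIVE"}

removed = non_citizen_removals(mus)

# Collect the ballots on record for each removed registration.
ballots = defaultdict(list)
for row in vhl:
    if row["voter_id"] in removed:
        ballots[row["voter_id"]].append(row["election"])

print(len(removed), "removals;", len(ballots), "with history;",
      sum(len(b) for b in ballots.values()), "ballots")
```

In this toy data B2 is excluded because it was later reinstated, and C3 is excluded because it was removed for a different reason.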

While EPEC has periodically purchased full copies of the Voter History List for our archive, there is a known issue with the way ELECT handles removals from the voter record. Depending on when the VHL file is purchased, this can cause sampling issues and can cause records of legitimately cast ballots to be missing from the VHL. Namely, when ELECT removes a voter from the voter list, they also remove all instances of that voter ID from voter history information and other data files provided to qualified organizations. (IMO … that’s a terrible way to manage the data, but that is the way it is done.) In light of that, EPEC also used its archived versions of the Daily Absentee List (DAL) for recent elections in order to attempt to find records of votes cast that might otherwise be missing from the VHL.

Results:

There were 3,533 unique voter records marked for removal with the reason of “Declared Non-Citizen” and not subsequently reinstated in the accumulated MUS record that EPEC began collecting in mid-2023. Of those 3,533 there were 537 that also had corresponding records of recent ballots cast at some point in the official Voter History record that we could observe. There were 1,296 associated ballots cast identified since Feb of 2019. Figure 1 shows the distribution of non-citizen voters in the cumulative MUS file history. The blue trace represents the total identified and CANCELED non-citizen registrations, and the yellow trace represents the number of those records that also had corresponding records in the accumulated voter history data.

Figure 1: Distribution of the number of identified non-citizen records and ballots in the cumulative ELECT MUS file history. The x-axis is the date that a record was marked as CANCELED for the reason of “Declared Non-Citizen”.

Note that the data contained in the MUS updates often covers more than a single month period. In other words, the individual MUS files are oversampled. Subsequent MUS files can therefore also have repeated entries from previous versions, as their data may overlap. Our analysis used the first unique entry for a given voter ID marked as “Declared Non-Citizen” in the cumulative MUS record in order to build Figure 1. This data oversampling in the MUS helps explain the relative increase in the May 2023 bin.
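The “first unique entry” rule can be sketched in a few lines of Python. This is an illustration only: the voter_id and date field names and the sample rows are made up, and the monthly binning simply mimics the kind of histogram shown in Figure 1.

```python
def first_unique_removals(mus_files):
    """Keep the earliest 'Declared Non-Citizen' entry per voter ID across overlapping files."""
    first = {}
    for rows in mus_files:
        for row in rows:
            if row["NVRAReasonCode"] != "Declared Non-Citizen":
                continue
            vid = row["voter_id"]
            # ISO date strings compare correctly as plain strings.
            if vid not in first or row["date"] < first[vid]["date"]:
                first[vid] = row
    return first

# Two hypothetical monthly files whose date ranges overlap.
june = [{"voter_id": "A1", "date": "2023-05-20", "NVRAReasonCode": "Declared Non-Citizen"}]
july = [
    {"voter_id": "A1", "date": "2023-05-20", "NVRAReasonCode": "Declared Non-Citizen"},  # repeated entry
    {"voter_id": "D4", "date": "2023-06-18", "NVRAReasonCode": "Declared Non-Citizen"},
]

first = first_unique_removals([june, july])

# Bin by month (YYYY-MM) of the first removal date.
bins = {}
for row in first.values():
    month = row["date"][:7]
    bins[month] = bins.get(month, 0) + 1
print(bins)  # {'2023-05': 1, '2023-06': 1}
```

The repeated A1 entry in the second file is counted once, in the month of its first appearance.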

As VHL information can be incomplete depending on the time the VHL data was purchased in relation to the time that registrants were removed from voter records, EPEC also checked these non-citizen removals against the archived history of Daily Absentee List (DAL) files that EPEC has accumulated.  There were an additional 2 non-citizen registrations and ballots as per the Daily Absentee List (DAL) data, that were not contained in the Voter History data.  The total number of identified non-citizen ballots cast is therefore 1,298 by 539 registrants when combining unique VHL and DAL identifications.
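One way to combine the two sources without double counting is a set union over (voter ID, election) pairs, sketched below in Python. The IDs and dates are invented for illustration, and the real record layouts of the VHL and DAL files are not assumed here.

```python
# Hypothetical (voter_id, election date) pairs; a real run would load these
# from the archived VHL and DAL files.
vhl_ballots = {("A1", "2020-11-03"), ("A1", "2022-11-08"), ("E5", "2021-11-02")}
dal_ballots = {("A1", "2022-11-08"), ("F6", "2023-11-07")}  # first pair duplicates a VHL record

dal_only = dal_ballots - vhl_ballots      # ballots seen only in the DAL
all_ballots = vhl_ballots | dal_ballots   # set union avoids double counting
voters = {vid for vid, _ in all_ballots}  # unique registrants with any ballot
print(len(dal_only), len(all_ballots), len(voters))  # 1 4 3

# The totals reported above combine the same way.
assert 1296 + 2 == 1298
assert 537 + 2 == 539
```

Note that 537 + 2 = 539 only holds because the two DAL registrants were not already among the 537 found in the VHL.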

These identifications represent only the individuals who declared non-citizen status through official interactions with ELECT, DMV, or other agencies. Each removed registrant was then contacted by the registrar to confirm their non-citizen status.

The distribution of identified unique voter IDs for the 537 identified non-citizen voters per VA locality is given below in Table 1. It should be noted that each ballot record has a specific locality associated with where the ballot was cast, whereas unique individuals might move between localities over time. The assignment of unique identified individuals to each locality in Table 1 is therefore based on the locality listed in the specific MUS “Declared Non-Citizen” record for that individual, while the assignment of ballots cast to localities is based on the individual VHL/DAL records. A person could have lived and voted multiple times in one county, then moved to another county and voted again before finally being determined as a non-citizen. The same person would have generated multiple VHL/DAL records for each ballot cast, and associated with potentially different localities. This should be kept in mind when attempting to interpret Table 1.


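Table 1: Identified “Declared Non-Citizen” removals, voters and ballots cast per VA locality.

| Locality | MUS Removals | Voted (VHL) | Voted (DAL) | Votes (VHL) | Votes (DAL) |
|---|---|---|---|---|---|
| ACCOMACK COUNTY | 7 | | | | |
| ALBEMARLE COUNTY | 33 | 5 | | 8 | |
| ALEXANDRIA CITY | 159 | 24 | | 41 | |
| AMELIA COUNTY | 2 | 1 | | 3 | |
| APPOMATTOX COUNTY | 1 | | | | |
| ARLINGTON COUNTY | 112 | 18 | | 49 | |
| AUGUSTA COUNTY | 13 | 1 | | 1 | |
| BEDFORD COUNTY | 9 | 2 | | 4 | |
| BOTETOURT COUNTY | 2 | | | | |
| BRISTOL CITY | 1 | | | | |
| BRUNSWICK COUNTY | 2 | 1 | | 2 | |
| BUCKINGHAM COUNTY | 3 | | | | |
| CAROLINE COUNTY | 8 | 2 | | 4 | |
| CARROLL COUNTY | 7 | 2 | | 5 | |
| CHARLES CITY COUNTY | 2 | 1 | | 3 | |
| CHARLOTTESVILLE CITY | 24 | 2 | | 7 | |
| CHESAPEAKE CITY | 92 | 22 | | 56 | |
| CHESTERFIELD COUNTY | 227 | 24 | | 57 | |
| CLARKE COUNTY | 8 | 3 | | 4 | |
| COLONIAL HEIGHTS CITY | 12 | 1 | | 3 | |
| COVINGTON CITY | 1 | | | | |
| CRAIG COUNTY | 1 | | | | |
| CULPEPER COUNTY | 25 | 2 | | 2 | |
| DANVILLE CITY | 17 | 2 | | 8 | |
| DINWIDDIE COUNTY | 9 | 1 | | 1 | |
| EMPORIA CITY | 2 | | | | |
| ESSEX COUNTY | 3 | 1 | | 3 | |
| FAIRFAX CITY | 9 | 3 | | 11 | |
| FAIRFAX COUNTY | 641 | 98 | 1 | 213 | 1 |
| FAUQUIER COUNTY | 23 | 3 | | 10 | |
| FLUVANNA COUNTY | 2 | 1 | | 1 | |
| FRANKLIN COUNTY | 3 | 2 | | 4 | |
| FREDERICK COUNTY | 28 | 2 | | 2 | |
| FREDERICKSBURG CITY | 23 | 2 | | 4 | |
| GALAX CITY | 2 | | | | |
| GILES COUNTY | 2 | | | | |
| GLOUCESTER COUNTY | 2 | 1 | | 1 | |
| GOOCHLAND COUNTY | 5 | | | 1 | |
| GRAYSON COUNTY | 1 | | | | |
| GREENE COUNTY | 6 | 1 | | 2 | |
| HALIFAX COUNTY | 1 | | | | |
| HAMPTON CITY | 58 | 13 | | 17 | |
| HANOVER COUNTY | 13 | 1 | | 2 | |
| HARRISONBURG CITY | 69 | 2 | | 3 | |
| HENRICO COUNTY | 101 | 9 | | 45 | |
| HENRY COUNTY | | | | 2 | |
| ISLE OF WIGHT COUNTY | 1 | | | 2 | |
| JAMES CITY COUNTY | 31 | 4 | | 13 | |
| KING GEORGE COUNTY | 4 | | | | |
| KING WILLIAM COUNTY | 1 | | | | |
| LOUDOUN COUNTY | 222 | 47 | | 110 | |
| LOUISA COUNTY | 9 | | | | |
| LYNCHBURG CITY | 22 | 4 | | 11 | |
| MANASSAS CITY | 50 | 4 | | 11 | |
| MANASSAS PARK CITY | 17 | | | 2 | |
| MARTINSVILLE CITY | 6 | 1 | | | |
| MECKLENBURG COUNTY | 7 | 3 | | 10 | |
| MIDDLESEX COUNTY | 2 | | | | |
| MONTGOMERY COUNTY | | | | 3 | |
| NELSON COUNTY | 2 | | | | |
| NEW KENT COUNTY | 3 | 1 | | | |
| NEWPORT NEWS CITY | 103 | 22 | | 49 | |
| NORFOLK CITY | 89 | 13 | | 33 | |
| NORTHAMPTON COUNTY | 1 | | | | |
| NORTHUMBERLAND COUNTY | 3 | 2 | | 5 | |
| NORTON CITY | 1 | | | | |
| NOTTOWAY COUNTY | 4 | | | | |
| ORANGE COUNTY | 3 | 1 | | 3 | |
| PATRICK COUNTY | 1 | | | | |
| PETERSBURG CITY | 25 | 3 | | 5 | |
| PITTSYLVANIA COUNTY | 7 | 2 | | 4 | |
| PORTSMOUTH CITY | 38 | 13 | | 37 | |
| POWHATAN COUNTY | 4 | | | 1 | |
| PRINCE EDWARD COUNTY | 10 | 3 | | 11 | |
| PRINCE GEORGE COUNTY | 12 | 1 | | 1 | |
| PRINCE WILLIAM COUNTY | 398 | 60 | | 136 | |
| PULASKI COUNTY | 6 | 1 | | 2 | |
| RAPPAHANNOCK COUNTY | 2 | | | | |
| RICHMOND CITY | 161 | 24 | 1 | 67 | 1 |
| ROANOKE CITY | 45 | 2 | | 3 | |
| ROANOKE COUNTY | 19 | 2 | | | |
| ROCKINGHAM COUNTY | 22 | 5 | | 13 | |
| RUSSELL COUNTY | 3 | 1 | | 1 | |
| SALEM CITY | 3 | | | | |
| SCOTT COUNTY | 1 | 1 | | 4 | |
| SHENANDOAH COUNTY | 17 | 1 | | 1 | |
| SMYTH COUNTY | 2 | | | | |
| SPOTSYLVANIA COUNTY | 61 | 4 | | 10 | |
| STAFFORD COUNTY | 75 | 10 | | 28 | |
| STAUNTON CITY | 5 | | | | |
| SUFFOLK CITY | 28 | 11 | | 20 | |
| SURRY COUNTY | 1 | | | | |
| SUSSEX COUNTY | 2 | 1 | | 3 | |
| TAZEWELL COUNTY | 4 | 1 | | 1 | |
| VIRGINIA BEACH CITY | 149 | 19 | | 66 | |
| WARREN COUNTY | 14 | 2 | | 5 | |
| WASHINGTON COUNTY | 5 | 2 | | 6 | |
| WAYNESBORO CITY | 3 | | | | |
| WESTMORELAND COUNTY | 1 | | | | |
| WILLIAMSBURG CITY | 10 | 1 | | | |
| WINCHESTER CITY | 22 | 2 | | 2 | |
| WISE COUNTY | 1 | | | | |
| WYTHE COUNTY | 3 | | | | |
| YORK COUNTY | 21 | 10 | | 38 | |
| Totals | 3533 | 537 | 2 | 1296 | 2 |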
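
The caveat about locality assignment explains why a locality can show ballots cast but no identified voters. The toy Python example below counts voters by the locality on the MUS removal record and ballots by the locality on each VHL/DAL row. The voter ID and the records are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter

# Hypothetical records: the removal locality comes from the MUS entry,
# the ballot locality from each VHL/DAL row.
removal_locality = {"A1": "LOUDOUN COUNTY"}
ballot_rows = [
    ("A1", "FAIRFAX COUNTY"),
    ("A1", "FAIRFAX COUNTY"),
    ("A1", "LOUDOUN COUNTY"),
]

# Each person is counted once, under the locality of their removal record.
voters_by_locality = Counter(removal_locality[v] for v in {v for v, _ in ballot_rows})
# Each ballot is counted under the locality where it was cast.
votes_by_locality = Counter(loc for _, loc in ballot_rows)

print(dict(voters_by_locality))  # {'LOUDOUN COUNTY': 1}
print(dict(votes_by_locality))   # {'FAIRFAX COUNTY': 2, 'LOUDOUN COUNTY': 1}
```

Here the first locality on the ballot rows ends up with two votes and zero voters, which is the same pattern seen in table rows that have a vote count but an empty voter count.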

The distribution of the 1,296 ballots that were identified as being cast by non-citizen voters (the yellow trace in Figure 1) in previous elections is shown in Figure 2. The most significant spikes are in the 2019, 2020, 2021 and 2022 November General elections, as well as the 2020 March Democratic presidential primary. Figure 3 shows this distribution as a percentage of votes cast. Please note that the Y-axis of the plot in Figure 3 is in percent of total ballots cast in each election. These graphs were only produced for the VHL data, and do not include the DAL identified records.

Figure 2: Distribution of identified non-citizen ballots cast in previous elections.
Figure 3: Distribution of identified non-citizen ballots cast in previous elections as percent of total ballots cast, according to entries in the VHL/DAL data files.

Figures 4 and 5 show the distribution of the registration dates of the identified non-citizen records. The same data is plotted in Figures 4 and 5, with the only difference being the scale of the Y-axis, which is changed in order to better observe the dynamic range of the values. When we look at the registration date of these identified records, we see that there is a distinct relative increase starting around 1996, and then again around 2012.

Figure 4: Registration dates of the identified non-citizen records. Absolute count on y-axis.
Figure 5: Registration dates of the identified non-citizen records. Logarithmic Y-axis scale.

EPEC made a FOIA request to the VA Attorney General’s office on March 11, 2024 requesting any records regarding how many prosecutions for non-citizen voting had occurred since June of 2023. We received a response that the AG had no such relevant records.

EPEC subsequently submitted our March analysis dataset to the VA AG’s office upon their request. We have heard no updates or status as to any action taken by the AG’s office since that time, except that it is being considered an ongoing investigation.

Discussion

It appears from the MUS data that the VA Department of Elections (ELECT) is doing routine identification, cleanup and removal of non-citizen registrations, which is a good thing, and we commend them for their continued efforts to maintain clean voter registration lists.

However, the fact that a small number of these identified non-citizen registrations are also associated with (presumably … if the data from ELECT is accurate) illegally cast ballots in previous elections does raise a number of questions that citizens should be (politely) asking and discussing with their legislators and their elected and appointed government officials. Each act of non-citizen voting is a de facto disenfranchisement of legal voters’ rights, and is a punishable offense under VA law.

Q: How did these registrants get placed onto the voter rolls in the first place?

Q: What method and/or data sources are used by the state to identify non-citizen registrations for removal? If that process is exhaustive, and covers all registrations, then these numbers might be considered to represent a statistically complete picture of the problem. If that process is not exhaustive, in that it only uses serendipitous corroborating data sources, then these results likely under-represent the scale of the issues.

Q: As noted above, we are only considering here those individuals who have not had their records re-instated or reactivated after a determination of non-citizen status. We do not have enough information to determine how or why some records were first determined to be non-citizen, canceled and then subsequently re-instated. One potential area of concern is determining whether or not registrants might be falsely or errantly claiming to not be a citizen on official documents in order to be excused from jury duty, for example, and then work to re-instate their voting status once those documents percolate through the system to ELECT and are flagged for removal. This is a wholly separate but serious issue, as making false claims on official documents is itself a punishable offense.

Q: What procedures, processes and technical solutions are in place to prevent current or future registration and casting of ballots by non-citizens? This is especially pertinent given the current state of the flow of illegal immigrants crossing our national borders. According to a recent report by Yahoo Finance, VA is one of the top 30 destinations for illegal migrants, with both Loudoun County and Fairfax making the list.

Q: Why have none of the identified non-citizens who also cast ballots been investigated or prosecuted under VA Code 24.2-1004? As the identification of these ballots comes directly from looking at the official records produced by ELECT, it seems prudent for these to be forwarded by ELECT to the AG’s office with a recommendation to investigate and prosecute. Yet our FOIA request to the VA AG’s office inquiring as to any records associated with these types of investigations or prosecutions produced a “no relevant records exist” response. And since we submitted this information to the AG’s office, there has been no follow up.

Additionally, this evidence, which is derived only from official state records, directly contradicts multiple news media reports and attestations that non-citizen voting is a “Myth”, and that non-citizen voting happens “almost never”. If the data from ELECT is accurate, then there are at least 1,298 ballots that have been cast by non-citizen voters just since 2019. Now, that is still very infrequent, but it is not “almost never.” It is a legitimate concern … and these discoveries are only the registrations that have been found and removed from the voter rolls by ELECT and that we can observe in the data. We do not know how many exist that we do not know about.

It should be reiterated that these are only the records that we can observe given our data repository, and how often we can realistically purchase and acquire voter history and voter registration information. It is therefore likely that this represents a significant undercount of the occurrences of non-citizen voters and non-citizen voting.

It costs us (EPEC) approximately $5K for each purchase of the statewide voter history list, and approximately $15K/year to maintain RVL records using a single baseline full purchase + 2 purchases of the 6mo MUS subscription. Due to the infrequent nature of these data purchases, it is very likely that some individuals have had their voter history or voter registration information completely removed from the record in between our purchases. Additionally, we know that the MUS data does not entirely encompass all transactions performed on the RVL by the department of elections, so there may be yet other unknown transactions that we are missing.

For information that is supposed to be publicly available (according to federal NVRA laws), the state has put up significant hurdles for citizens and organizations that want to acquire it and use it to ensure the transparency and integrity of our electoral process. If we are to have elections that are transparent and accountable to the public, then we must insist that the data be made available and accessible.


A Canary in the Data Mine

Since VA Gov Youngkin issued Executive Order 35, it has been getting quite a bit of press … I wanted to make a few comments on it after having a chance to digest it.

Overall, I think it’s a net positive for Election Integrity efforts in VA, but not because there is any new or groundbreaking policy by the governor or his administration. Most of the items in his EO are already existing policy, and the EO language is worded such that the current actions of the Department of Elections (ELECT) can be arguably said to comply with those policies. There are a couple of small improvements, such as the fact that this EO codifies into transparent public policy the specific requirements that the commissioner must certify in writing. Even though there aren’t any drastically new policies, this does improve on overall public transparency, confidence and accountability, and I think that’s a good thing.

What will be interesting, in my opinion, is whether our team at EPEC will notice any demonstrable difference in the quality of voter registration data that we track from ELECT going forward, now that this EO has been issued.

More specifically, I’m going to be tracking a few very specific records to see if they get addressed or not.

There are a few egregious records in the VA voter registration file that are obviously problematic that should be removed or at least updated/corrected by ELECT. I, and the team at EPEC, have been tracking these specific records for years but have not published or discussed them publicly. A few of them we even directly mentioned to the current Commissioner of Elections during our face-to-face meeting with her last year. These records should be “easy” to find and clean up by ELECT, as they are obviously errant and invalid registrations in their current state.

As of the latest data we have, those records are still in the voter list and listed as ACTIVE registrations, nearly four years after I first found them. They can easily be identified with very simple logical checks to the registration records … descriptions of which I and the team at EPEC have discussed publicly and provided directly to ELECT and local registrars on multiple occasions.

The reason I bring this up is that these can be considered and used as “hold out” test cases, in data science parlance. They are “canaries in the coal mine”, if you will. If we at EPEC start observing that these records are being responsibly updated or removed going forward, in accordance with the mandate in the EO to perform daily scrubbing of the voter rolls for ineligible records, then that would give VA citizens some evidence and verification that the Gov and ELECT are serious about their efforts.

That would be excellent if they did, but seeing as how I’ve been observing these records for 4 years while the administration proclaims what an excellent job it’s doing … I’m not going to hold my breath.

Categories
Election Data Analysis Election Forensics Election Integrity mathematics technical Uncategorized

VA 2024 March Primary Election Fingerprints

Abstract

Examining the Election Night Reporting data from the VA 2024 March Democratic and Republican primaries provides supporting evidence that the Republican primary was impacted and skewed by a large number of Democratic “crossover” voters, resulting in an irregular election fingerprint when the data is plotted.

Background

The US National Academy of Sciences (NAS) published a paper in 2012 titled “Statistical detection of systematic election irregularities.” [1] The paper asked the question, “How can it be distinguished whether an election outcome represents the will of the people or the will of the counters?” The study reviewed the results from elections in Russia and other countries, where widespread fraud was suspected. The study was published in the proceedings of the National Academy of Sciences as well as referenced in multiple election guides by USAID [2][3], among other citations.

The study authors’ thesis was that with a large sample of the voting data, they would be able to see whether or not voting patterns deviated from the voting patterns of elections where there was no suspected fraud. The results of their study showed that there were indeed significant deviations from the expected, normal voting patterns in the elections where fraud was suspected, and also provided a number of interesting insights into the associated “signatures” of various electoral mechanisms as they present themselves in the data.

Statistical results are often graphed, to provide a visual representation of how normal data should look. A particularly useful visual representation of election data, as utilized in [1], is a two-dimensional histogram of the percent voter turnout vs the percent vote share for the winner, or what I call an “election fingerprint”. Under the assumptions of a truly free and fair election, the expected shape of the fingerprint is that of a 2D Gaussian (a.k.a. a “Normal”) distribution [4]. The obvious caveat here is that no election is ever perfect, but with a large enough sample size of data points we should be able to identify large scale statistical properties.

In many situations, the results of an experiment follow what is called a ‘normal distribution’. For example, if you flip a coin 100 times and count how many times it comes up heads, the average result will be 50. But if you do this test 100 times, most of the results will be close to 50, but not exactly. You’ll get almost as many cases with 49, or 51. You’ll get quite a few 45s or 55s, but almost no 20s or 80s. If you plot your 100 tests on a graph, you’ll get a well-known shape called a bell curve that’s highest in the middle and tapers off on either side. That is a normal distribution.

https://news.mit.edu/2012/explained-sigma-0209
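
The coin-flip experiment in the quote above is easy to try. The short Python sketch below runs 100 trials of 100 flips each and prints a crude text histogram of the number of heads per trial. The fixed seed is only there to make the run repeatable and is an arbitrary choice.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

# 100 trials; each trial counts the heads in 100 fair coin flips.
trials = [sum(random.random() < 0.5 for _ in range(100)) for _ in range(100)]
counts = Counter(trials)

# Crude text histogram: one '#' per trial that produced that many heads.
for heads in range(min(counts), max(counts) + 1):
    print(f"{heads:3d} {'#' * counts[heads]}")
```

The printed bars should pile up near 50 and thin out on both sides, which is the bell shape the quote describes.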

In a free and fair election, the plotted graphs of both the Turnout percentage and the percentage of Vote Share for Election Winner should (again … ideally) both resemble Gaussian “Normal” distributions; and their combined distribution should also follow a 2-dimensional Gaussian (or “normal”) distribution. Computing this 2 Dimensional joint distribution of the % Turnout vs. % Vote Share is what I refer to as an “Election Fingerprint”.
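
As a minimal sketch of that computation, the Python function below bins each reporting unit by its turnout fraction and its winner vote share and counts how many units land in each cell. The bin count, the input layout (registered voters, ballots cast, votes for winner) and the sample numbers are assumptions for illustration; this is not the code used to produce the figures in this post, and it omits the blurring filter and model fitting.

```python
def election_fingerprint(units, bins=10):
    """2D histogram: rows = vote share for the winner, columns = turnout."""
    grid = [[0] * bins for _ in range(bins)]
    for registered, ballots, winner_votes in units:
        if registered == 0 or ballots == 0:
            continue  # skip units where the ratios are undefined
        turnout = ballots / registered
        share = winner_votes / ballots
        # Clamp so that 100% (or anything above it) lands in the last bin.
        col = min(int(turnout * bins), bins - 1)
        row = min(int(share * bins), bins - 1)
        grid[row][col] += 1
    return grid

# Hypothetical (registered voters, ballots cast, votes for winner) per precinct.
sample = [(1000, 430, 270), (800, 350, 200), (1200, 500, 310), (500, 0, 0)]
fp = election_fingerprint(sample)
for row in reversed(fp):  # print with high vote share at the top
    print(" ".join(str(n) for n in row))
```

With enough real precincts the grid becomes an image, and the shape of the bright region is what gets compared against the expected 2D Gaussian.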

Figure 1 reprints examples from the referenced National Academy of Sciences paper. The actual election results in Russia, Uganda and Switzerland appear in the left column, the right column is the modeled expected appearance in a fair election with little fraud, and the middle column is the researchers’ model of the as-collected data, with any possible fraud mechanisms included.

Figure 1: NAS Paper Results (reprinted from [1])

As you can see, the election in Switzerland (assumed fair) shows a range of voter turnout, from approximately 30 – 70% across voting districts, and a similar range of votes for the winner. The Switzerland data is consistent across models, and does not show any significant irregularities.

What do the clusters mean in the Russia 2011 and 2012 elections? Of particular concern are the top right corners, showing nearly 100% turnout of voters, and nearly 100% of them voted for the winner.

Both of those events (more than 90% of registered voters turning out to vote and more than 90% of the voters voting for the winner) are statistically improbable, even for very contested elections. Election results that show a strong linear streak away from the main fingerprint lobe indicate ‘ballot stuffing,’ where ballots are added at a specific rate. Voter turnout over 100% indicates ‘extreme fraud’. [1][5]

Note that election results with ‘outliers’ – results that fall outside of expected normal voting patterns – while evidentiary indicators, are not in and of themselves definitive proof of outright fraud or malfeasance. For example, in rare but extreme cases, where the electorate is very split and the split closely follows the geographic boundaries between voting precincts, we could see multiple overlapping Gaussian lobes in the 2D image. Even in that rare case, there should not be distinct structures visible in the election fingerprint, linear streaks, overly skewed or smeared distributions, or exceedingly high turnout or vote share percentages. Additional reviews of voting patterns and election results should be conducted whenever deviations from normal patterns occur in an election.

Additionally it should be noted that “the absence of evidence is not the evidence of absence”: Election Fingerprints that look otherwise normal might still have underlying issues that are not readily apparent with this view of the data.

Results on 2024 VA March Primaries:

Figure 2 and Figure 3 are the computed election fingerprints for the Democratic and Republican VA 2024 March Primaries, respectively. They were computed according to the NAS paper and using official state reported voter turnout and votes for the statewide winner and reported per voting Locality with combined In-Person Early, Election Day, Absentee and Provisional votes. Figures 4 and 5 perform the same process, except each data point is generated per individual precinct in a locality. The color scale moves from precincts with low counts as deep blue, to precincts with high numbers represented as bright yellow. Note that a small blurring filter was applied to the computed image for ease of viewing small isolated Locality or Precinct results.

The upper right inset in each graphic image was computed per the NAS paper; the bottom left inset shows what an idealized model of the data could or should look like, based on the reported voter turnout and vote share for the winner. This ideal model is allowed to have up to 3 Gaussian lobes based on the peak locations and standard deviations in the reported results. The top-left and bottom-right inset plots show the sum of the rows and columns of the fingerprint image. The top-left graph corresponds to the sum of the rows in the upper right image and is the histogram of the vote share for the winner across precincts. The bottom right graph shows the sum of the columns of the upper right image, and is the histogram of the percentage turnout across voting localities.
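
The two marginal plots are just row and column sums of the fingerprint image. A tiny Python illustration follows, using a made-up 3x3 grid in which rows are vote share bins and columns are turnout bins.

```python
# A tiny hypothetical fingerprint: rows = vote share bins, columns = turnout bins.
fingerprint = [
    [0, 1, 0],
    [2, 5, 1],
    [0, 3, 0],
]

share_hist = [sum(row) for row in fingerprint]          # sum across each row
turnout_hist = [sum(col) for col in zip(*fingerprint)]  # sum down each column

print(share_hist)    # [1, 8, 3]
print(turnout_hist)  # [2, 9, 1]
```

Both marginals add up to the same total, since each one counts every data point exactly once.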

Figure 2 Democratic primary, accumulated per Locality:
Figure 3 Republican primary, accumulated per Locality:
Figure 4 Democratic primary, accumulated per Precinct:
Figure 5 Republican primary, accumulated per Precinct:

Analysis:

As can be seen in Figures 2 and 4, the Democratic primary fingerprint looks to fall within the expected normal distribution. Even though the total vote share for the winner (Biden) is up around 90%, this was not unexpected given the current set of contestants and the fact that Biden is the incumbent.

The Republican primary results, as shown in Figures 3 and 5, show significant “smearing” of the percent of total vote share for the winner. The percent of voter turnout (x-axis) does however show a near Gaussian distribution, which is what one would expect. The Republican primary data does not show the linear streaking pattern that the authors in [1] correlate with extreme fraud, but significant smearing of the distribution is observed.

A consideration that might partially explain this smearing of the histogram is that there were at least 17% “crossover voters”, who historically lean Democrat but voted in the Republican primary (see here for more information). Multiple news reports and exit polling suggest that this was due in part to loosely organized efforts by the opposing party to cast “Protest Votes” in order to artificially inflate the vote share of the challenger (Haley) and dilute the expected margin of victory for the winner (Trump), with no intention of supporting a Republican candidate in the General Election. (This is completely legal in VA, by the way, as VA does not require by-party voter registration.)

If we categorize each locality as being either Democratic or Republican leaning based on the average results of the last four presidential elections, and then split the computation of the per precinct results into separate parts accordingly, we can see this phenomenon much more clearly.
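
A rough Python sketch of that split is shown below. The margin convention (Republican percent minus Democratic percent), the rule for an exact tie, the locality names and all of the numbers are assumptions made for illustration, not the values or code used for the figures in this post.

```python
def lean(margins):
    """'R' if the average Republican-minus-Democratic margin is positive, else 'D'."""
    return "R" if sum(margins) / len(margins) > 0 else "D"

# Hypothetical margins (R% minus D%) for the last four presidential elections.
locality_margins = {
    "LOCALITY X": [12.0, 9.5, 15.1, 11.2],
    "LOCALITY Y": [-20.3, -18.0, -25.4, -30.1],
}

# Hypothetical precinct rows: (locality, registered, ballots, winner_votes).
precincts = [
    ("LOCALITY X", 900, 300, 240),
    ("LOCALITY Y", 1100, 250, 120),
    ("LOCALITY X", 700, 210, 170),
]

# Route each precinct to the group of its parent locality.
split = {"R": [], "D": []}
for loc, *counts in precincts:
    split[lean(locality_margins[loc])].append(tuple(counts))

print(len(split["R"]), len(split["D"]))  # 2 1
```

Each group can then be run through the same fingerprint computation separately, which is what the following two figures do.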

Figure 6 shows the per-precinct results for only those precincts that belong to historically Republican leaning localities. It depicts a much tighter distribution and has much less smearing or blurring of the distribution tails. We can see from the data that the Republican base in historically Republican leaning localities seems solidly behind candidate Trump.

Figure 7 shows the per-precinct results for only those precincts that belong to historically Democratic leaning localities. Comparing the two plots, it can clearly be seen that the major contributor to the spread of the total Republican primary distribution is the votes from historically Democratic leaning localities.

Figure 6 Republican primary, accumulated per Precinct in Republican leaning localities:
Figure 7 Republican primary, accumulated per Precinct in Democratic leaning localities:

References:

Categories
Election Data Analysis Election Integrity technical Uncategorized

Technical issues with Enhanced Voting Election Night Return JSON files

As I was going through and processing the (new) VA election night reporting data provided by Enhanced Voting, I noticed a number of technical issues with the data feed. I’ve tried to capture them here in an attempt to help the VA Department of Elections correct bugs and implementation issues with their new reporting format.

While the new ENR data feed is commendable in that it presents the data for the state in an easily obtainable JSON formatted file, the following issues were observed in my processing of the data. I am happy to provide specific examples of these issues to the Enhanced Voting development team in order to help address them.

  • Inconsistent JSON formats being returned. Sometimes locality group results information is a cell of structures, sometimes an array.
  • Occasional malformed JSON, with missing opening or closing braces or brackets, which prevents the file from being parsed by JSON importing functions in Python, MATLAB, etc.
  • Occasional duplicated locality precinct group result information
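
A defensive loader along the following lines can work around all three issues while processing the feed. This is only a sketch: how the format inconsistency shows up depends on the parser, and here it is assumed that a group field may decode as either a single object or a list. The function names and the sample field names are hypothetical and are not taken from the ENR schema.

```python
import json


def load_enr(text):
    """Parse one ENR snapshot; return None if the JSON is malformed."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def normalize_groups(groups):
    """Coerce a single object or a list into a de-duplicated list."""
    if isinstance(groups, dict):
        groups = [groups]
    seen, unique = set(), []
    for group in groups:
        key = json.dumps(group, sort_keys=True)  # canonical form for comparison
        if key not in seen:
            seen.add(key)
            unique.append(group)
    return unique


bad = load_enr('{"a": [1, 2}')  # mismatched bracket, so this is None
single = normalize_groups({"name": "P1", "votes": 3})
deduped = normalize_groups([{"name": "P1"}, {"name": "P1"}, {"name": "P2"}])
```

Snapshots that fail to parse can be logged and skipped, so a single bad file does not stop the rest of the time series from being processed.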
Categories
Election Integrity mathematics technical Uncategorized

Ranked Choice Voting: An Example of a Perverse Social Choice Function

The below is based on the discussion of “Single Transferrable Vote” (“STV”) methods in [1], published in 1977. STV has more recently been called “Ranked Choice Voting” (RCV) or “Instant Runoff Voting” (IRV), among other names, by lobbying groups that are currently pushing for its incorporation into our voting systems. Irrespective of the name used, it represents a family of voting methods, with slightly different variants depending on how votes are removed and/or redistributed in each successive round of voting. [2][5]

What does STV/RCV/IRV entail, in general:

The core system is a proportional voting system in which voters are required to rank their preferred candidates in order. All ballots are collected, and centralized tabulation is performed in multiple rounds until the winner(s), meaning the candidates whose support reaches a specified quota (or “threshold”), are determined.

A common definition of the quota utilized in STV/RCV/IRV systems is the “Droop quota”, which is defined as:

q = FLOOR( # of Voters / (# of Seats + 1) + 1)
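
As a quick illustration, the quota formula can be written with integer division. The function name is an assumption for this sketch; because the added 1 is an integer, taking the floor before or after adding it gives the same result.

```python
def droop_quota(num_voters, num_seats):
    """Droop quota: floor(voters / (seats + 1)) + 1."""
    return num_voters // (num_seats + 1) + 1


single_seat = droop_quota(17, 1)   # 17 voters, 1 seat
three_seats = droop_quota(100, 3)  # 100 voters, 3 seats
```

For 17 voters and a single seat this gives a quota of 9, and for 100 voters and 3 seats it gives 26.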

In a given round the candidate with the least support is eliminated from further evaluation. Surplus votes from candidates that go over the droop threshold and votes from eliminated candidates can be distributed amongst remaining candidates for subsequent rounds. Surplus vote distribution is only applicable when multiple winners are allowed in a contest.

Vote allocation procedure for STV/RCV/IRV. Reprinted from [1].

The arguments used to support and push for RCV have not significantly changed since the time that the original paper was published, but the terms and language utilized have been modified. The authors note that much of the rationale in pushing for STV was centered around the ideas of inclusivity and making sure voters are able to cast “effective” ballots.

“Modern proponents emphasize the system’s effective representation of minorities, its sensitivity and accuracy in ‘measuring changes in popular will,’ and its tendency to encourage independent (nonparty line) voting.”

Doron, G., & Kronick, R. (1977) [1]

The same arguments have been recently repeated and pushed to legislators and the media. The name has changed from “Single Transferrable Vote” to “Ranked Choice Voting” or “Instant Runoff Voting”, but the argument remains largely the same, as can be seen by simply visiting the websites and promotional material for any of the current groups that are lobbying for RCV to be incorporated [3][4].

The issue pointed out by Doron & Kronick:

The authors in [1] note that the STV/RCV/IRV system allows for a “perversion” (their words, not mine) whereby a candidate’s chances of being selected as a winner can be negatively impacted even when that candidate receives increased support.

“… a function that permitted an increased vote for a candidate to cause a decline in that candidate’s rank in the social ordering-would probably strike most of us as a rather absurd, even perverse, method of arriving at a social choice. Consequently, some writers refer to this condition as the ‘Non-Perversity’ condition. All of the democratic social choice functions that have been considered in the literature were assumed to guarantee this condition, but the Single Transferrable Vote system does not.”

Doron, G., & Kronick, R. (1977) [1]

The authors present a hypothetical example to demonstrate the issue. Suppose we have 3 candidates (Candidate X, Candidate Y, Candidate Z) and two different voting groups, which we will refer to as group D and D’. Both D and D’ are fairly similar and only disagree on the relative ranking of two specific candidates.

In the tables below, recreated from [1], the only difference between the two voting groups’ selections is that candidate X receives more support than candidate Y in group D’. However, under the voting rules described above, candidate X wins in D and loses in D’, even though X has increased support in D’.

# of Voters | First Choice | Second Choice | Third Choice
6           | X            | Y             | Z
2           | Y            | X             | Z
4           | Y            | Z             | X
5           | Z            | X             | Y
Voting group D selections. Reprinted from [1].
# of Voters | First Choice | Second Choice | Third Choice
6           | X            | Y             | Z
2           | X            | Y             | Z
4           | Y            | Z             | X
5           | Z            | X             | Y
Voting group D’ selections. Reprinted from [1].

There are 17 voters in each case, and only 1 seat available. Therefore, the Droop quota/threshold is 9 votes required in order to declare a winner.

In group D, it is candidate Z that has the fewest votes in the first round (X has 6, Y has 6, Z has 5) and is eliminated, therefore advancing 5 second-choice votes for X into the next round. Candidate X then has 11 votes, passes the threshold, and wins in the second round.

In group D’, where candidate X received more support than candidate Y, it is candidate Y that has the fewest votes in the first round (X has 8, Y has 4, Z has 5) and is eliminated, therefore advancing 4 second-choice votes for Z into the next round. Candidate Z then has 9 votes, reaches the threshold, and wins in the second round.
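
The two outcomes can be reproduced with a short single-seat elimination routine. This is a minimal sketch rather than a full STV implementation: the function name is an assumption, surplus transfers are omitted because only one seat is available, and ties for last place are not handled since none occur in this example.

```python
from collections import Counter


def irv_winner(ballots, quota):
    """ballots: list of (count, ranking) pairs; ranking is a string like 'XYZ'."""
    remaining = set(c for _, ranking in ballots for c in ranking)
    while True:
        tally = Counter({c: 0 for c in remaining})
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked remaining candidate.
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tally[top] += count
        leader, votes = tally.most_common(1)[0]
        if votes >= quota or len(remaining) == 1:
            return leader
        # Eliminate the candidate with the fewest votes in this round.
        remaining.discard(min(tally, key=tally.get))


group_d = [(6, "XYZ"), (2, "YXZ"), (4, "YZX"), (5, "ZXY")]
group_d_prime = [(6, "XYZ"), (2, "XYZ"), (4, "YZX"), (5, "ZXY")]
quota = 17 // (1 + 1) + 1  # Droop quota for 17 voters and 1 seat

winner_d = irv_winner(group_d, quota)
winner_d_prime = irv_winner(group_d_prime, quota)
```

Running this gives X as the winner for group D and Z as the winner for group D’, matching the walk-through above.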

Bibliography:

  1. Doron, G., & Kronick, R. (1977). Single Transferrable Vote: An Example of a Perverse Social Choice Function. American Journal of Political Science, 21(2), 303–311. https://doi.org/10.2307/2110496
  2. https://ballotpedia.org/Ranked-choice_voting_(RCV)
  3. https://campaignlegal.org/democracyu/accountability/ranked-choice-voting
  4. https://www.hhh.umn.edu/research-centers/center-study-politics-and-governance/research-and-initiatives-cspg/ranked-choice-voting
  5. Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD, eds. Handbook of Computational Social Choice. Cambridge: Cambridge University Press; 2016. https://doi.org/10.1017/CBO9781107446984
Categories
Election Data Analysis Election Forensics Election Integrity programming technical Uncategorized

Records of Early Ballots Cast Do Not Have Corresponding Registration Records in VA 2023 General Election Data

Update (2023-12-14 12:00:00 EST) : Special thank you to Rick Michael of the Chesterfield Electoral board for checking their records on issues #1 and #2 below. There were 3 x Issue #1 records and 9 x Issue #2 records identified in Chesterfield County.

According to Rick, the records in question were populated and visible when viewed via the electronic VERIS (the state’s election database) login available to the Registrar. The 3 x Issue #1 records can be found and are Active records in the electronic system, and the 9 x Issue #2 records had an update that moved the records from Inactive to Active that was not reflected in the data supplied to us.

That implies that the data that we purchased (for approximately $12,000) directly from the department of elections is inaccurate and incomplete. Our initial purchase and download of the June 30 Registered Voter List (RVL) database does not show the registrants identified in Issue #1, even though the Registrar can see them in their electronic terminal. And our Monthly Update Subscription (MUS) we receive is missing the updates showing the registrant records identified in Issue #2 being moved from Inactive to Active status.

The department of elections is required by federal law (NVRA, HAVA) to keep and maintain accurate election records AND to make those records accessible for inspection and verification, and for use by candidates and political parties. Additionally, we have paid (twice!) for this data; once as taxpayers, and once again as a 501c3 entity. If the data that we and other campaigns and candidates are receiving is incomplete, inaccurate, and not representative of the actual records in the database … that needs to be addressed and fixed.

Summary:
  • Issue #1: There are 99 records of ballots cast, according to the VA Department of Elections (ELECT) Daily Absentee List (DAL) data file, that do not have a corresponding voter ID listed in the Registered Voter List (RVL) data.
  • Issue #2: There are 380 records of ballots cast in the DAL where the corresponding RVL record has been listed as “Inactive” since June-30-2023 and no modification to the RVL record has taken place.
  • Issue #3: There are 18 records of ballots cast in the DAL where the corresponding RVL record is listed as “Inactive” as of Dec-01-2023, but there have been previous modifications to the RVL record since June-30-2023.
  • We are currently reaching out in attempts to contact the VA AG’s office and to provide them the details of this analysis in order to have these anomalies further investigated.
Data files utilized for this analysis:

Our 501c3 EPEC purchased and downloaded the full statewide VA RVL on June-30-2023 from ELECT. We additionally purchased the Monthly Update Service (MUS) package from ELECT, where on the 1st of each month we are provided a list of all of the changes that have occurred to the RVL in the previous month. By applying these changes to our baseline data file, we are able to update our copy of the RVL to reflect the latest state as per ELECT. We can also create a cumulative record of all entries associated with a particular voter ID by simply concatenating all of these datafiles.

Additionally, during the VA 2023 General Election, we purchased access to the Daily Absentee List (DAL) file generated by ELECT that documents all of the transactions associated with early mail-in or in-person voting during the 45 day early voting period. The DAL file we utilized for this analysis was downloaded from ELECT on Nov-13-2023 at 6am EST.

Identification of ballots cast via the DAL file can be performed by checking for rows of the DAL data table that have the APP_STATUS field set to “Approved” and have the BALLOT_STATUS field set to any of the following: “Marked” | “Pre-Processed” | “On Machine” | “FWAB” | “Provisional”.
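
The selection rule above can be expressed as a simple row filter. The sketch assumes each DAL row has been read into a dictionary keyed by column name (for example with Python’s csv.DictReader); the function name and the sample rows are illustrative, while the field names and status values are the ones listed above.

```python
CAST_STATUSES = {"Marked", "Pre-Processed", "On Machine", "FWAB", "Provisional"}


def is_cast_ballot(row):
    """True if a DAL row represents an approved application with a cast ballot."""
    return (
        row.get("APP_STATUS") == "Approved"
        and row.get("BALLOT_STATUS") in CAST_STATUSES
    )


# Illustrative rows only; real DAL rows carry many more fields.
rows = [
    {"APP_STATUS": "Approved", "BALLOT_STATUS": "Pre-Processed"},
    {"APP_STATUS": "Approved", "BALLOT_STATUS": "Issued"},
    {"APP_STATUS": "Denied", "BALLOT_STATUS": "Marked"},
]
cast = [row for row in rows if is_cast_ballot(row)]
```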

Once cast ballots are identified in the DAL, the Voter Identification Number can be used to look up all of the corresponding records in our cumulative RVL data. The data issues summarized above can be directly observed using this process. Due to VA law, we cannot publish the full specific records here in this blog, but we have summarized and described our process and results.
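
In outline, the lookup sorts each cast ballot into one of the three issue categories summarized above. The sketch below assumes the cumulative RVL has been arranged as a dictionary from voter ID to a chronologically ordered list of records, baseline first; the "status" key, the voter IDs, and the function name are hypothetical and do not reflect the actual ELECT column names.

```python
def classify(voter_id, cumulative_rvl):
    """Classify a cast ballot's voter ID against the cumulative RVL records."""
    records = cumulative_rvl.get(voter_id, [])
    if not records:
        return "issue_1"  # no registration record at all
    if records[-1]["status"] != "Inactive":
        return "ok"       # latest record is not Inactive
    # Latest status is Inactive: distinguish untouched baseline from modified.
    return "issue_2" if len(records) == 1 else "issue_3"


# Hypothetical voter IDs and records for illustration only.
cumulative_rvl = {
    "1001": [{"status": "Active"}],
    "1002": [{"status": "Inactive"}],
    "1003": [{"status": "Active"}, {"status": "Inactive"}],
}
results = {vid: classify(vid, cumulative_rvl) for vid in ("1001", "1002", "1003", "1004")}
```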

For Issue #1: If there does not exist a corresponding registration record for cast ballots, then the voter should not have been able to have their mail-in ballot approved or issued, or been able to check in to an early voting precinct to vote on-machine. If the voter record actually does exist, then why is it not reflected in the data that we purchased from ELECT? Note that all provisional and Same Day Registration (SDR) ballots were required to be entered into the state’s database (“VERIS”) by the Friday after the election (Fri Nov-10-2023). We specifically waited until we received the Dec-01-2023 MUS data update from ELECT before attempting this or similar analysis, in order to ensure that we would not be missing any last-minute registrations or RVL updates.

For Issue #2: There are 380 records of ballots being cast in the DAL where the baseline June-30-2023 RVL data file shows the registrant as Inactive, and there have been no modifications or adjustments to the record presented in the MUS data files. Therefore these registrants should still have been listed as “Inactive” during the early voting period, which started in September and ran through Election Day (Nov 7).

For Issue #3: There are 18 records that show the cast ballot is from a registrant that is currently listed as “Inactive”, but there have been adjustments to the registration record over the last 6 months. An example is given below. Note that I have captured the MUS data file generation dates in the 5th column to note when the file was generated and received by us.

In the example given below, the first inactivation operation on the registration record appears in the MUS file dated Sep-01, with the earliest transaction date listed as Aug-29-2023. The ballot application was not received until Sept 26 according to the DAL, so the application should never have been approved or the ballot issued, as the registrant status should have been “Inactive” according to the state’s own data.

(Also … yes … I know there is a typo in the spelling of “APP_RECIEPT_DATE” in the tables below … but this is the data as it comes from ELECT).

APP_RECIEPT_DATE      | APP_STATUS | BALLOT_RECEIPT_DATE   | BALLOT_STATUS
“2023-09-26 00:00:00” | Approved   | “2023-10-19 00:00:00” | Pre-Processed
Example Extract of a DAL data record for a mail-in ballot cast during early voting.
TRANSACTIONDATE | TRANSACTIONTIME | Trans_Type | NVRAReasonCode | File Source
30-June-2023    | 12:12:00        | BASELINE   | BASELINE       | Baseline RVL
28-Jul-2023     | 09:34:03        | MODIFY     | Change Out     | MUS 08/01/2023
28-Jul-2023     | 09:34:04        | MODIFY     | Address Change | MUS 08/01/2023
28-Jul-2023     | 09:34:04        | MODIFY     | Change In      | MUS 09/01/2023
28-Jul-2023     | 09:34:03        | MODIFY     | Change Out     | MUS 09/01/2023
28-Jul-2023     | 09:34:04        | MODIFY     | Address Change | MUS 09/01/2023
28-Jul-2023     | 09:34:04        | MODIFY     | Change In      | MUS 09/01/2023
29-Aug-2023     | 11:55:49        | MODIFY     | Inactivate     | MUS 09/01/2023
29-Aug-2023     | 11:55:49        | MODIFY     | Inactivate     | MUS 10/01/2023
Extract of RVL Cumulative Data Records for Voter ID in above DAL entry
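
The date comparison in this example can be sketched as follows. The function names are assumptions; the date formats are inferred from the two extracts above (the baseline row spells out “June” while later rows use three-letter month abbreviations), and only the date and reason code of each RVL transaction are used.

```python
from datetime import datetime


def parse_rvl_date(text):
    """Parse RVL dates such as '29-Aug-2023' or '30-June-2023'."""
    for fmt in ("%d-%b-%Y", "%d-%B-%Y"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text}")


def inactivated_before_application(transactions, app_receipt):
    """True if any 'Inactivate' transaction predates the ballot application."""
    app_date = datetime.strptime(app_receipt, "%Y-%m-%d %H:%M:%S")
    return any(
        reason == "Inactivate" and parse_rvl_date(date) < app_date
        for date, reason in transactions
    )


# (TRANSACTIONDATE, NVRAReasonCode) pairs taken from the extract above.
transactions = [
    ("30-June-2023", "BASELINE"),
    ("28-Jul-2023", "Change Out"),
    ("29-Aug-2023", "Inactivate"),
]
flagged = inactivated_before_application(transactions, "2023-09-26 00:00:00")
```

For the record shown, the inactivation on Aug-29 precedes the application receipt on Sept 26, so the record is flagged.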
Categories
Uncategorized

Fluctuations in election night reporting ballot counts for multiple races in VA 2023 General Election

Well, election day came and went and everyone was glued to the internet to find out the results.

I took the extra step of logging all of the election night return files posted by the VA department of elections (“ELECT”) at 5 minute increments, as I wanted to plot the results over time as the numbers came in.

The data is from this link on ELECT’s website: https://enr.elections.virginia.gov/results/public/api/elections/Virginia/2023-Nov-Gen/files/json

I used a simple wget script to grab this file once every 5 minutes (approximately).

However, when I went to plot the results, I found some data curves that I can’t quite explain. Take, for example, the VA House of Delegates race in the 22nd District:

Now … last I checked, when accumulating counts of ballots … you wouldn’t expect the totals to go down, let alone oscillate back and forth.

This is not the only race where I found ballot curves that show a decrease in one of the ballot counts after a data update. (The gallery is posted below.) Of the 183 races I have looked at so far, 79 had a ballot trace whose count total was reduced after a data update. (I haven’t looked at all of the races yet.)
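
Flagging these traces amounts to checking whether a running total ever drops between consecutive snapshots. A minimal sketch, with a made-up series standing in for one candidate’s vote totals across successive downloads:

```python
def find_decreases(series):
    """Return (index, previous, current) for every drop in a cumulative series."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(series, series[1:]), start=1)
        if cur < prev
    ]


# Made-up totals from six consecutive snapshots (illustrative only).
totals = [0, 120, 340, 310, 340, 500]
drops = find_decreases(totals)
```

A race is counted as affected if any of its traces returns a non-empty list.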

Now, one expects there to be some issues and corrections that have to be made to the election night reporting data. But when 43% of the races sampled have obvious data quality issues like this … I think that deserves some explanation.

So … can ELECT please address this:

  • Why do 79 (and counting) races (~43% of races sampled) in the VA election night reporting have obvious issues where the vote totals decreased after a data reporting update?
  • What was the cause?
  • Why was it not caught by your QA/QC procedures?
  • How will you be addressing it going forward?
Categories
Election Data Analysis Election Integrity Uncategorized

VA Daily Absentee List

The EPEC staff monitors the Virginia Daily Absentee List for unexpected values. We essentially “audit” the electoral process in Virginia during an election cycle. We are currently monitoring the 2022 General Election.

One of the areas of interest is the DAL – Daily Absentee List. It shows the current status of absentee voting in Virginia – by mail in ballot and early voting (absentee in person).

In Virginia, Absentee In-Person Early Voting started on Friday, September 23. Our initial DAL file was saved on Saturday, September 24, at 9 PM.

The official Ballot Status in the DAL at 9 PM was:

Issued: 290,095

Federal Write-In Absentee Ballot (FWAB): 1

Marked: 2,118

On Machine: 8,397

Not Issued: 5,766

Unmarked: 546

Pre-Processed: 1

Deleted: 13,015

Grand Total: 319,939

A total of 19,327 ballots, about 6% of those requested, were in a state that would not be counted if the vote counting period ended today: Not Issued, Unmarked, or Deleted. There was also 1 ballot in the Pre-Processed Ballot Status. The number of ballots in one of these states is surprising but not alarming.
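
For reference, the arithmetic behind these figures can be checked directly from the status counts listed above (the variable names are just for illustration):

```python
dal_status = {
    "Issued": 290_095,
    "FWAB": 1,
    "Marked": 2_118,
    "On Machine": 8_397,
    "Not Issued": 5_766,
    "Unmarked": 546,
    "Pre-Processed": 1,
    "Deleted": 13_015,
}

# Statuses that would not be counted if counting ended today.
at_risk = sum(dal_status[k] for k in ("Not Issued", "Unmarked", "Deleted"))
grand_total = sum(dal_status.values())
at_risk_share = at_risk / grand_total
```

The three at-risk statuses sum to 19,327 out of a grand total of 319,939, or roughly 6%.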

It appears Not Issued means there is either a backlog in mailing out ballots or an issue with voter registration – legal name, address of record in the registration database, citizenship, etc. Unless the backlog or issue is resolved, the voter will be denied a ballot.

Unmarked is associated with mail-in Absentee Ballots. A Marked ballot is moved to an Unmarked status if an election official notices an error with the associated absentee ballot documents such as a name or address error, missing signature, or missing signature verification. Election officers are required to contact voters if their ballot requires a cure – correction to the information accompanying the ballot. If the cure is not provided, the ballot will not be counted. Some voters choose to have a new ballot mailed to them if a cure is required, in which case a ballot in the Unmarked state will be spoiled and marked Deleted in the system. This is one of the reasons we see voters having one or more Deleted ballots associated with them in the DAL files.

Deleted ballots are not supposed to be processed (counted). We believe these are officially referred to as “spoiled ballots”. The process to keep these separate from countable ballots is an area of interest for election integrity observers. The most common reason for ballots to get Deleted (spoiled) is voter error. Examples: a mistake when filling out a ballot in person, resulting in the first ballot being spoiled and a new ballot issued, or a voter surrendering an absentee ballot to vote in person or to receive a new one via the mail.

More accurate voter registration records MAY reduce the volume of initial Not Issued and Deleted ballots. Our post-election observations and recommendations will address this issue. Our initial hypothesis is that changes in residency, relocation within Localities, ineligible voters requesting ballots, and voters passing away probably account for most of the unexpectedly large number of ballots in an “at risk” state.

Categories
Uncategorized

Multiple Active Ballots

Individual voters should NEVER have more than one (1) active ballot. If this occurs, there is a risk that human error by an election official will result in a voter having more than one ballot counted.

Virginia has 226 individuals with two or more active ballots according to the Daily Absentee List file as of 28 October, 6 AM. This is occurring in nearly half of the Localities in Virginia – 59 out of 133.
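
A check of this kind can be sketched as a count of non-deleted ballots per voter. Which statuses count as “active” is an assumption here (everything except Deleted), and the VOTER_ID and LOCALITY field names, along with the sample rows, are hypothetical rather than taken from the DAL file layout.

```python
from collections import Counter


def multiple_active(rows, inactive=("Deleted",)):
    """Return the set of voter IDs with more than one active ballot."""
    counts = Counter(
        row["VOTER_ID"] for row in rows if row["BALLOT_STATUS"] not in inactive
    )
    return {voter for voter, n in counts.items() if n > 1}


# Hypothetical rows for illustration only.
rows = [
    {"VOTER_ID": "1001", "LOCALITY": "Locality A", "BALLOT_STATUS": "Issued"},
    {"VOTER_ID": "1001", "LOCALITY": "Locality A", "BALLOT_STATUS": "Marked"},
    {"VOTER_ID": "1002", "LOCALITY": "Locality B", "BALLOT_STATUS": "Deleted"},
    {"VOTER_ID": "1002", "LOCALITY": "Locality B", "BALLOT_STATUS": "Issued"},
]
flagged = multiple_active(rows)
localities = {row["LOCALITY"] for row in rows if row["VOTER_ID"] in flagged}
```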

This is a process issue – either procedural, or ballot tracking. The process should make it impossible for more than one vote to be counted.

It is possible that these will be caught before they get counted … but mistakes are made when people get overloaded or distracted. Process software should prevent the possibility of this “defect” occurring to prevent the perception of malfeasance.

Categories
Election Data Analysis Uncategorized

VA “Provisional” Ballots

The number of ballots in “Provisional” status is growing. This is to be expected because Virginia began allowing “same day voter registration” on 1 October, and same-day votes are to be labeled Provisional.

A handful of ballots were in Provisional status prior to 1 October, and this ought to be explained. The steady increase of Provisional ballots started on 19 October. The count of Provisional ballots is currently growing by approximately 200 ballots each day. This number is expected to grow exponentially as we approach election day.

The root cause of the Provisional ballot increase is most likely “same day registration and voting” but a detailed study has not yet been performed.