Categories
Election Data Analysis Election Forensics Election Integrity mathematics technical

Identification of 2,502 Potential Matches of Active Voter Registrations Between FL and VA Voter Registration Lists

Building on our previous work computing the string distance between all possible pairs of registered voter records within a single state in order to identify potential matches, we’ve updated the code to allow for cross-state comparisons. The first states that we ran this on were VA and FL, using the dataset produced by the FL Department of Elections on 05-07-2024 and the dataset from the VA Department of Elections dated 05-01-2024. There were a total of 2,502 potential matches between the FL and VA datasets that satisfied our constraints, as detailed below.


Note: All examples of data records given in this writeup have been fictionalized to protect registered voter identities from being published on this website, and only serve as illustrative examples representative of the nature of properties and characteristics discussed. Law enforcement, election or other gov officials, or individuals otherwise authorized to receive and handle voter data as per VA law and the VA Department of Elections are welcome to contact us for specific details and further information.

Each dataset had the First Name, Middle Initial, Last Name, Suffix, Gender, and Year, Month and Day of Birth concatenated into strings that were then compared against each other using the Levenshtein String Distance measure as an initial filtering method to determine potential matches.

Additionally, for each pair we computed the minimum string distance over the four possible pairings of the Primary and Mailing addresses between the two states’ records (FL Primary vs. VA Primary, FL Primary vs. VA Mailing, FL Mailing vs. VA Primary, and FL Mailing vs. VA Mailing). We required that this minimum distance for a pair of registration entries be less than or equal to 12 characters. The value of 12 was determined empirically after review of the data: it is loose enough to allow for common variations in address presentation while not being so loose that the results are overwhelmed with false positives.

We additionally filtered these findings for only those pairings that were of ACTIVE registrations in both datasets AND where the year, month and day of birth were exact matches.

In summary the 2,502 matches were generated according to the following constraints:

  • Only applied to ACTIVE voter registrations
  • Required completed DOB (year, month and day) to exactly match
  • Required [First Name + Middle Initial + Last Name + Suffix + Gender + DOB] strings to be similar to within <=2 characters
  • Required that the minimum distance between any pairwise combination of the Primary or Mailing address between the records be less than or equal to 12 characters.
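
The constraints above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce the results: the record field names (`first`, `middle`, `last`, `suffix`, `gender`, `dob`, `status`, `primary`, `mailing`), the space-separated concatenation that skips empty fields, and all function names are hypothetical choices made for the example.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def identity_key(rec: dict) -> str:
    """Concatenate name, suffix, gender and DOB (assumed: space-joined, empties skipped)."""
    fields = ("first", "middle", "last", "suffix", "gender", "dob")
    return " ".join(rec[f] for f in fields if rec[f])


def min_address_distance(rec_a: dict, rec_b: dict) -> int:
    """Minimum distance over the four Primary/Mailing pairings of two records."""
    pairs = product((rec_a["primary"], rec_a["mailing"]),
                    (rec_b["primary"], rec_b["mailing"]))
    return min(levenshtein(x, y) for x, y in pairs)


def is_potential_match(fl: dict, va: dict, max_name: int = 2, max_addr: int = 12) -> bool:
    """Apply the four constraints: ACTIVE, exact DOB, name distance, address distance."""
    return (fl["status"] == "ACTIVE" and va["status"] == "ACTIVE"
            and fl["dob"] == va["dob"]
            and levenshtein(identity_key(fl), identity_key(va)) <= max_name
            and min_address_distance(fl, va) <= max_addr)
```

Checking the exact DOB and status constraints first is a cheap way to skip the more expensive distance computations for most candidate pairs.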

It should be noted that, from reviewing the potential matched records, the majority of these matches appear to be registrations that originated in FL and were subsequently moved to VA, while the FL record remained listed as active.

Category 1 Matches:

There were 698 matches in Category 1: where the Levenshtein distance measure for the name and DOB was equal to 0 (exact match) and the minimum address distance was also 0 (also an exact match). Examples in this category are exact matches for every considered field. An example is given below.

FL Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 22150

VA Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 22150

Category 2 Matches:

There were 1,533 matches in Category 2: where the Levenshtein distance measure for the name and DOB was equal to 0 (exact match) and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category commonly have differences in how the zip code, apartment number or state code is presented in either the Primary or Mailing address strings. An example is given below.

FL Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259

Category 3 Matches:

There were 44 matches in Category 3: where the Levenshtein distance measure for the name and DOB was equal to 1 and the minimum address distance was equal to 0 (exact match). Examples in this category are most often due to hyphenation or misspellings in the name, or a change in Gender (e.g. from “M” to “U”). An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

VA Active Registration Record:
BENNEE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

Category 4 Matches:

There were 140 matches in Category 4: where the Levenshtein distance measure for the name and DOB was equal to 1 and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category are most often due to hyphenation or misspellings in the name, or a change in Gender (e.g. from “M” to “U”), as well as small differences in how the addresses are presented. An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
BENNEE DAS M 05/14/1945
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259

Category 5 Matches:

There were 19 matches in Category 5: where the Levenshtein distance measure for the name and DOB was equal to 2 and the minimum address distance was equal to 0 (exact match). Examples in this category are most often due to a middle name/initial being present in one record and not being present in the other. An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

VA Active Registration Record:
BENNIE C DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

Category 6 Matches:

There were 68 matches in Category 6: where the Levenshtein distance measure for the name and DOB was equal to 2 and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category are most often due to a middle name/initial being present in one record and not being present in the other, as well as small differences in how the addresses are presented. An example is given below.

FL Active Registration Record:
BENNIE C DAS M 05/14/1945
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
BENNIE DAS M 05/14/1945
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259
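
The six categories above are fully determined by the name/DOB distance (0, 1 or 2) and by whether the minimum address distance is zero or not. A minimal sketch of that mapping, with a hypothetical function name and using the per-category counts reported in this writeup as a consistency check:

```python
def match_category(name_dist: int, addr_dist: int) -> int:
    """Return the category number (1-6) for a pair that already passed the filters."""
    if not (0 <= name_dist <= 2 and 0 <= addr_dist <= 12):
        raise ValueError("pair falls outside the matching constraints")
    # Odd categories have an exact address match; even ones have 0 < distance <= 12.
    return 2 * name_dist + (1 if addr_dist == 0 else 2)


# Per-category counts as reported in the text above.
CATEGORY_COUNTS = {1: 698, 2: 1533, 3: 44, 4: 140, 5: 19, 6: 68}
TOTAL_MATCHES = sum(CATEGORY_COUNTS.values())
```

Summing the reported counts reproduces the headline total of 2,502.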

Table of Results by VA Locality:

Row Labels | LD=0, AD=0 | LD=0, 0<AD<=12 | LD=1, AD=0 | LD=1, 0<AD<=12 | LD=2, AD=0 | LD=2, 0<AD<=12
ACCOMACK COUNTY381100
ALBEMARLE COUNTY13240100
ALEXANDRIA CITY15521611
ALLEGHANY COUNTY130100
AMELIA COUNTY220000
AMHERST COUNTY320000
APPOMATTOX COUNTY500010
ARLINGTON COUNTY27532826
AUGUSTA COUNTY380110
BEDFORD COUNTY4150100
BOTETOURT COUNTY720000
BRISTOL CITY320000
BRUNSWICK COUNTY120000
BUCHANAN COUNTY100000
BUCKINGHAM COUNTY010000
CAMPBELL COUNTY231100
CAROLINE COUNTY020000
CARROLL COUNTY160100
CHARLOTTE COUNTY140000
CHARLOTTESVILLE CITY460001
CHESAPEAKE CITY278741314
CHESTERFIELD COUNTY28492503
CLARKE COUNTY020000
COLONIAL HEIGHTS CITY011000
CRAIG COUNTY210000
CULPEPER COUNTY680000
CUMBERLAND COUNTY200000
DANVILLE CITY210000
DICKENSON COUNTY130000
DINWIDDIE COUNTY030100
ESSEX COUNTY200000
FAIRFAX CITY360000
FAIRFAX COUNTY108259714415
FALLS CHURCH CITY220001
FAUQUIER COUNTY4141000
FLOYD COUNTY111000
FLUVANNA COUNTY230200
FRANKLIN CITY310000
FRANKLIN COUNTY560101
FREDERICK COUNTY1090200
FREDERICKSBURG CITY170000
GALAX CITY200000
GILES COUNTY000100
GLOUCESTER COUNTY6170110
GOOCHLAND COUNTY221010
GRAYSON COUNTY130100
GREENE COUNTY050000
HALIFAX COUNTY120100
HAMPTON CITY10160600
HANOVER COUNTY261210
HARRISONBURG CITY160100
HENRICO COUNTY24330301
HENRY COUNTY350100
ISLE OF WIGHT COUNTY4130102
JAMES CITY COUNTY23251100
KING GEORGE COUNTY241001
KING WILLIAM COUNTY200000
LANCASTER COUNTY211001
LEE COUNTY310000
LEXINGTON CITY020000
LOUDOUN COUNTY29731122
LOUISA COUNTY520000
LYNCHBURG CITY6150200
MADISON COUNTY200000
MANASSAS CITY300000
MANASSAS PARK CITY100000
MARTINSVILLE CITY210000
MATHEWS COUNTY030000
MECKLENBURG COUNTY320000
MIDDLESEX COUNTY040100
MONTGOMERY COUNTY6111100
NELSON COUNTY120100
NEW KENT COUNTY060000
NEWPORT NEWS CITY8170102
NORFOLK CITY145801101
NORTHUMBERLAND COUNTY211000
NOTTOWAY COUNTY010000
ORANGE COUNTY561000
PAGE COUNTY120000
PATRICK COUNTY020000
PETERSBURG CITY210000
PITTSYLVANIA COUNTY370100
POQUOSON CITY100000
PORTSMOUTH CITY591100
POWHATAN COUNTY220100
PRINCE EDWARD COUNTY020000
PRINCE GEORGE COUNTY111101
PRINCE WILLIAM COUNTY408321133
PULASKI COUNTY220000
RADFORD CITY020000
RAPPAHANNOCK COUNTY021000
RICHMOND CITY12291300
ROANOKE CITY14121200
ROANOKE COUNTY14150001
ROCKBRIDGE COUNTY222000
ROCKINGHAM COUNTY150101
RUSSELL COUNTY030001
SALEM CITY210000
SCOTT COUNTY200000
SHENANDOAH COUNTY010101
SMYTH COUNTY120000
SOUTHAMPTON COUNTY020100
SPOTSYLVANIA COUNTY10191100
STAFFORD COUNTY20480404
STAUNTON CITY120000
SUFFOLK CITY12310001
TAZEWELL COUNTY050100
VIRGINIA BEACH CITY46177111112
WARREN COUNTY240000
WASHINGTON COUNTY351100
WAYNESBORO CITY130000
WESTMORELAND COUNTY520001
WILLIAMSBURG CITY110000
WINCHESTER CITY060000
WISE COUNTY070000
WYTHE COUNTY000100
YORK COUNTY12352200
Grand Total | 698 | 1533 | 44 | 140 | 19 | 68

Tabulated Results by FL County Code:

Row Labels | LD=0, AD=0 | LD=0, 0<AD<=12 | LD=1, AD=0 | LD=1, 0<AD<=12 | LD=2, AD=0 | LD=2, 0<AD<=12
MON2200100
ALA0230200
BAK020000
BAY7400410
BRA220000
BRE41391123
BRO12950608
CHA71146121
CIT160100
CLA7472503
CLL1520101
CLM000100
DAD50592621
DES110000
DUV2811442119
ESC1910311003
FLA5110122
FRA110000
GAD100100
GLA100000
GUL040000
HAM300000
HAR310000
HEN100000
HER8160201
HIG010000
HIL296521014
HOL010000
IND9111010
JAC020000
LAK1100101
LEE0460301
LEO3592010
LEV301000
MAD001000
MAN31211101
MRN26160101
MRT4062211
NAS4120100
OKA50313012
OKE100000
ORA11390904
OSC4151000
PAL358931002
PAS0300301
PIN4880603
POL0620902
PUT210000
SAN13420302
SAR17181120
SEM53345303
STJ8221503
STL60204221
SUM2290301
SUW330000
TAY020000
VOL0510303
WAK110000
WAL160000
Grand Total | 698 | 1533 | 44 | 140 | 19 | 68

Addendum + Updates:

In response to a number of questions we have received on this topic, and continued work to dig into this data:

  1. The number of matches above has been corrected from the original 2,527 to 2,502 (a difference of 25) due to a “fat-finger” error in tallying the total number of category 5 matches.
  2. For the strict constraints given above, the number of matched records where there is a vote recorded for the same election date in both the VA and FL data is 13.
  3. We also computed the number of exact [First Name + Middle Initial + Last Name + Gender + Full DOB] matches without requiring our additional address filter. These criteria are stricter in the initial match, but looser in the subsequent filtering.
    • This results in a total of 17,701 matches when considering only Active voters on each of the FL and VA voter lists.
      • There are 343 of these matches where both FL and VA records have a history of votes cast in the same election.
    • The number jumps to 81,155 if we consider either Active or Inactive registrations.
      • There are 382 of these matches where both FL and VA records have a history of votes cast in the same election.
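
Because item 3 uses exact matching only, it does not need pairwise string distances; an index on the concatenated key is enough. The sketch below is a hypothetical illustration: the field names (`first`, `middle`, `last`, `gender`, `dob`, `elections`) and function names are assumptions, and `elections` is assumed to hold the election dates for which a record has a vote recorded.

```python
from collections import defaultdict


def exact_key(rec: dict) -> tuple:
    """Key used for exact matching: name parts, gender and full DOB."""
    return (rec["first"], rec["middle"], rec["last"], rec["gender"], rec["dob"])


def exact_matches(fl_records, va_records):
    """Return (fl, va) pairs whose keys are identical, using an index on VA."""
    index = defaultdict(list)
    for rec in va_records:
        index[exact_key(rec)].append(rec)
    return [(fl, va) for fl in fl_records for va in index.get(exact_key(fl), [])]


def voted_in_same_election(fl: dict, va: dict) -> bool:
    """True when both records show a vote for at least one common election date."""
    return bool(set(fl["elections"]) & set(va["elections"]))
```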
Categories
Election Data Analysis Election Forensics Election Integrity

2024 VA June Democratic Primary Election DAL File Metrics

Below you will find the current summary data and graphics from the 2024 VA June Democratic Primary Election Daily Absentee List files. We pull the DAL file every day and track the count of each specific ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, but with a logarithmic y-scale in order to compress the dynamic range and show the shape of all of the data curves in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS = “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS = “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS = “PROVISIONAL” and APP_STATUS = “APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS = “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS = “MARKED” and APP_STATUS = “APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS = “ON_MACHINE” and APP_STATUS = “APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS = “PRE-PROCESSED” and APP_STATUS = “APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS = “FWAB” and APP_STATUS = “APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE = “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE = “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE = “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “TEMPORARY” and where COUNTABLE is True
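
The column definitions above can be expressed as a small tallying routine. The following is only a sketch, not EPEC’s processing code: it assumes each DAL record is a dict with `BALLOT_STATUS`, `APP_STATUS` and `VOTER_TYPE` keys, and it reads “COUNTABLE is True” as “the record falls into one of the five categories summed into COUNTABLE”. The function and constant names are hypothetical.

```python
from collections import Counter

# Statuses that are only tallied when APP_STATUS is "APPROVED".
APPROVED_ONLY = ("PROVISIONAL", "MARKED", "ON_MACHINE", "PRE-PROCESSED", "FWAB")
# Statuses tallied regardless of APP_STATUS.
UNCONDITIONAL = ("ISSUED", "NOT ISSUED", "DELETED")
VOTER_TYPES = ("MILITARY", "OVERSEAS", "TEMPORARY")


def column_name(status: str) -> str:
    """Map a BALLOT_STATUS value to its column name, e.g. 'PRE-PROCESSED' -> 'PRE_PROCESSED'."""
    return status.replace(" ", "_").replace("-", "_")


def dal_metrics(records) -> Counter:
    """Tally one daily DAL file into the summary columns described above."""
    counts = Counter()
    for rec in records:
        status = rec["BALLOT_STATUS"]
        countable = status in APPROVED_ONLY and rec["APP_STATUS"] == "APPROVED"
        if countable or status in UNCONDITIONAL:
            counts[column_name(status)] += 1
        voter_type = rec.get("VOTER_TYPE")
        if voter_type in VOTER_TYPES:
            counts[voter_type] += 1
            if countable:
                counts[voter_type + "_COUNTABLE"] += 1
    # Derived columns.
    counts["MAIL_IN"] = counts["MARKED"] + counts["PRE_PROCESSED"]
    counts["COUNTABLE"] = sum(counts[column_name(s)] for s in APPROVED_ONLY)
    return counts
```

Running this once per daily file and storing the resulting counts by date would give one row of the summary table per day.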

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.

If you like the work that EPEC is doing, please support us with a donation.

Categories
Election Data Analysis Election Forensics Election Integrity

2024 VA June Republican Primary Election DAL File Metrics

Below you will find the current summary data and graphics from the 2024 VA June Republican Primary Election Daily Absentee List files. We pull the DAL file every day and track the count of each specific ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, but with a logarithmic y-scale in order to compress the dynamic range and show the shape of all of the data curves in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS = “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS = “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS = “PROVISIONAL” and APP_STATUS = “APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS = “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS = “MARKED” and APP_STATUS = “APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS = “ON_MACHINE” and APP_STATUS = “APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS = “PRE-PROCESSED” and APP_STATUS = “APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS = “FWAB” and APP_STATUS = “APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE = “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE = “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE = “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “TEMPORARY” and where COUNTABLE is True

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.

If you like the work that EPEC is doing, please support us with a donation.

Categories
Election Data Analysis Election Forensics Election Integrity technical

Non-citizen registrations with previous voting history in VA election data

Abstract:

Using the data provided by the VA Department of Elections (ELECT), we have identified at least 1,481 unique registrations that were identified as “Determined Non-Citizen” and removed by ELECT from the voter rolls since May of 2023. Of those 1,481 there were 335 that also had corresponding records of recent ballots cast in the official Voter History record. There were 838 associated ballots cast since Feb of 2019.

We submitted a FOIA request to the VA Attorney General’s office requesting any and all documents regarding any prosecutions for non-citizen voters in the same time period as our data covers. We received a response that no relevant records were identified.

Background:

The VA Department of Elections continuously tries to identify and remove invalid or out of date registration records from the voter rolls. One category used for removal is if a registrant has been determined to be a non-citizen. It is required by the VA Constitution that only citizens are allowed to vote in VA elections.

In elections by the people, the qualifications of voters shall be as follows: Each voter shall be a citizen of the United States, shall be eighteen years of age, shall fulfill the residence requirements set forth in this section, and shall be registered to vote pursuant to this article. …

VA Constitution, Article II, Section 1. https://law.lis.virginia.gov/constitution/article2/section1/

Additionally, according to VA Code Section 24.2-1004, the act of knowingly casting a ballot by someone who is not eligible to vote is a Class 6 felony.

A. Any person who wrongfully deposits a ballot in the ballot container or casts a vote on any voting equipment, is guilty of a Class 1 misdemeanor.

B. Any person who intentionally (i) votes more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (ii) procures, assists, or induces another to vote more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (iii) votes knowing that he is not qualified to vote where and when the vote is to be given, or (iv) procures, assists, or induces another to vote knowing that such person is not qualified to vote where and when the vote is to be given is guilty of a Class 6 felony.

https://law.lis.virginia.gov/vacode/title24.2/chapter10/section24.2-1004/

ELECT makes available for purchase by qualifying parties various data sets, including the registered voter list (RVL) and the voter history list information file (VHL). Additionally, ELECT makes available a Monthly Update Service (MUS) subscription that is published at the beginning of each month and contains (almost) all of the Voter List changes and transactions for the previous period.

In the MUS data there is a “NVRAReasonCode” field that is associated with each transaction that gives the reason for the update or change in the voter record. This is in accordance with the disclosure and transparency requirements in the NVRA. One of the possible reason codes given for records that are removed is “Determined Non-Citizen.”

EPEC has been consistently purchasing and archiving all of these official records as part of our ongoing work to document and educate the public as to the ongoing operations of our elections. (If you’re interested in supporting this work, please head on over to our donation page, or to our give-send-go campaign to make a tax-deductible donation, as these data purchases are not cheap!)

EPEC looked at the number of records associated with unique voter identification numbers that had been identified for removal from the voter record due to non-citizenship status, per the entries in the MUS, and correlated those results with our accumulated voter history list information in order to determine how many non-citizen registrations had corresponding records of ballots cast in previous elections. We only considered those records that are currently in a non-active state as of the latest MUS transaction log, as some determinations of non-citizenship status in the historical MUS transaction log might have been due to error and subsequently corrected and reinstated to active status. That is, we are not considering those records that had a “Determined Non-Citizen” disqualification, but were then subsequently reinstated and reactivated by ELECT.

Results:

There were 1,481 unique voter records marked for removal with the reason of “Determined Non-Citizen” and not subsequently reinstated in the accumulated MUS record that EPEC began collecting in mid-2023. Of those 1,481 records there were 335 unique voter IDs that also had a record of casting one or more ballots in the accumulated vote history data that EPEC has been gathering, for a total of 838 ballots cast that can be identified since Feb of 2019. Figure 1 shows the distribution of non-citizen voters in the cumulative MUS file history. The blue trace represents the total identified and CANCELED non-citizen registrations, and the yellow trace represents the number of those records that also had corresponding records in the accumulated voter history data.

Figure 1: Distribution of the number of identified non-citizen ballots in the cumulative ELECT MUS file history. The x-axis is the date that a record was marked as CANCELED for the reason of “Determined Non-Citizen”.

Note that the data contained in the MUS updates often covers more than a single month period. In other words, the individual MUS files are oversampled. Subsequent MUS files can therefore also have repeated entries from previous versions, as their data may overlap. Our analysis used the first unique entry for a given voter ID marked as “Determined Non-Citizen” in the cumulative MUS record in order to build Figure 1. This data oversampling in the MUS helps explain the relative increase in the May 2023 bin, and similar decrease in Feb 2024 that can be observed in Figure 1.
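
The selection logic described above (first unique “Determined Non-Citizen” entry per voter ID, excluding records that were later reinstated, then correlating with the voter history) can be sketched as follows. This is an illustrative outline only: apart from `NVRAReasonCode` and the “Determined Non-Citizen” value, the field names (`voter_id`, `date`, `status`), the “ACTIVE” status value and the function names are assumptions, and dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
def removed_non_citizens(mus_rows) -> dict:
    """Map voter ID -> date of first 'Determined Non-Citizen' entry,
    keeping only IDs whose latest logged status is not ACTIVE."""
    first_flag = {}
    latest_status = {}
    for row in sorted(mus_rows, key=lambda r: r["date"]):
        vid = row["voter_id"]
        latest_status[vid] = row["status"]
        if row["NVRAReasonCode"] == "Determined Non-Citizen":
            # setdefault keeps the earliest entry and ignores later repeats
            # caused by overlapping MUS files.
            first_flag.setdefault(vid, row["date"])
    return {vid: d for vid, d in first_flag.items() if latest_status[vid] != "ACTIVE"}


def ballots_by_voter(flagged: dict, vhl_rows) -> dict:
    """Count voter-history rows (ballots cast) for each flagged voter ID."""
    tally = {}
    for row in vhl_rows:
        vid = row["voter_id"]
        if vid in flagged:
            tally[vid] = tally.get(vid, 0) + 1
    return tally
```

With this shape, the number of flagged IDs, the number of keys in the tally, and the sum of its values would correspond to the three headline figures (registrations, voters, ballots).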

The distribution of identified unique voter IDs for the 335 identified non-citizen voters per VA locality is given below in Table 1. It should be noted that each ballot record has a specific locality associated with where the ballot was cast, whereas unique individuals might move between localities over time. The assignment of unique identified individuals to each locality in Table 1 is therefore based on the locality listed in the specific MUS “Determined Non-Citizen” record for that individual, while the assignment of ballots cast to localities is based on the individual VHL records. A person could have lived and voted multiple times in one county, then moved to another county and voted again before finally being determined to be a non-citizen. The same person would have generated a separate VHL record for each ballot cast, potentially associated with different localities. This should be kept in mind when attempting to interpret Table 1.

Locality Name | Individuals | Ballots
FAIRFAX COUNTY | 58 | 135
LOUDOUN COUNTY | 36 | 103
PRINCE WILLIAM COUNTY | 31 | 74
RICHMOND CITY | 21 | 67
CHESTERFIELD COUNTY | 20 | 48
ALEXANDRIA CITY | 16 | 33
NORFOLK CITY | 13 | 20
VIRGINIA BEACH CITY | 12 | 22
ARLINGTON COUNTY | 10 | 32
CHESAPEAKE CITY | 9 | 24
POWHATAN COUNTY | 9 | 1
YORK COUNTY | 8 | 28
NORTHUMBERLAND COUNTY | 8 | 4
HARRISONBURG CITY | 8 | 3
SUFFOLK CITY | 7 | 18
STAFFORD COUNTY | 4 | 12
WARREN COUNTY | 4 | 11
RUSSELL COUNTY | 4 | 1
CLARKE COUNTY | 3 | 4
HENRY COUNTY | 3 | 2
NEWPORT NEWS CITY | 2 | 33
PORTSMOUTH CITY | 2 | 31
JAMES CITY COUNTY | 2 | 9
HENRICO COUNTY | 2 | 7
CHARLOTTESVILLE CITY | 2 | 6
FAIRFAX CITY | 2 | 6
WASHINGTON COUNTY | 2 | 6
CAROLINE COUNTY | 2 | 5
SPOTSYLVANIA COUNTY | 2 | 5
NEW KENT COUNTY | 2 | 4
ALBEMARLE COUNTY | 2 | 3
PRINCE EDWARD COUNTY | 2 | 3
LYNCHBURG CITY | 2 | 2
HAMPTON CITY | 1 | 14
MECKLENBURG COUNTY | 1 | 9
ROCKINGHAM COUNTY | 1 | 9
MANASSAS CITY | 1 | 4
PETERSBURG CITY | 1 | 4
AMELIA COUNTY | 1 | 3
ORANGE COUNTY | 1 | 3
SOUTHAMPTON COUNTY | 1 | 3
SUSSEX COUNTY | 1 | 3
BRUNSWICK COUNTY | 1 | 2
FAUQUIER COUNTY | 1 | 2
GREENE COUNTY | 1 | 2
AUGUSTA COUNTY | 1 | 1
BEDFORD COUNTY | 1 | 1
CULPEPER COUNTY | 1 | 1
DANVILLE CITY | 1 | 1
DINWIDDIE COUNTY | 1 | 1
FRANKLIN COUNTY | 1 | 1
FREDERICK COUNTY | 1 | 1
FREDERICKSBURG CITY | 1 | 1
LOUISA COUNTY | 1 | 1
PRINCE GEORGE COUNTY | 1 | 1
SHENANDOAH COUNTY | 1 | 1
WINCHESTER CITY | 1 | 1
FRANKLIN CITY | 1 | 0
CAMPBELL COUNTY | 0 | 6
Total | 335 | 838
Table 1: Distribution of unique individuals determined to be non-citizens that voted in each locality, and the number of total non-citizen identified ballots cast.

The distribution of the 838 ballots that were identified as being cast by non-citizen voters (yellow trace in Figure 1) in previous elections is shown in Figure 2a. The most significant spikes are in the 2019, 2020, 2021 and 2022 November General elections, as well as the 2020 March Democratic presidential primary. Figure 2b, which shows this distribution as a percentage of votes cast, was added on 3/24/2024 per feedback on X/Twitter after the initial posting of this article.

Figure 2a: Distribution of identified non-citizen ballots cast in previous elections.
Figure 2b (added 3/24/2024 per feedback on X/twitter): Distribution of identified non-citizen ballots cast in previous elections as percent of total ballots cast, according to entries in the VHL data files.

Figures 3 and 4 show the distribution of the registration dates of the identified non-citizen records. The same data is plotted in Figures 3 and 4, with the only difference being the scale of the y-axis in order to better observe the dynamic range of the values. When we look at the registration dates of these identified records, we see that there is a distinct relative increase starting around 1996, and then again around 2012.

Figure 3: Registration dates of the identified non-citizen records. Absolute count on y-axis.
Figure 4: Registration dates of the identified non-citizen records. Logarithmic Y-axis scale.

EPEC made a FOIA request to the VA Attorney General’s office on March 11, 2024 requesting any records regarding how many prosecutions for non-citizen voting had occurred since June of 2023. We received a response that the AG had no such relevant records.

It should be noted for completeness that if we consider records that have been “Determined Non-Citizen” at any point in the MUS logs, regardless of their current status, we find 1,532 individual records that have been determined non-citizens. Of the 1,532 there were 379 with corresponding vote histories, accounting for 1,019 ballots cast. Omitting the reinstated and reactivated records results in a percent difference of (1532-1481)/1532*100 = 3.33% in total determined non-citizen records, (379-335)/379*100 = 11.6% in the number of determined non-citizen voters, and (1019-838)/1019*100 = 17.76% in the number of determined non-citizen ballots.
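
The percent differences quoted above can be checked with a few lines of arithmetic. The function name below is a hypothetical helper; the inputs are the counts stated in the text.

```python
def percent_difference(all_records: int, kept_records: int) -> float:
    """Percent of records dropped when reinstated/reactivated entries are omitted."""
    return (all_records - kept_records) / all_records * 100


registrations = percent_difference(1532, 1481)  # determined non-citizen records
voters = percent_difference(379, 335)           # records with a vote history
ballots = percent_difference(1019, 838)         # ballots cast

print(f"{registrations:.2f}% {voters:.1f}% {ballots:.2f}%")
```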

Discussion

It appears from the MUS data that the VA Department of Elections (ELECT) is doing routine identification, cleanup and removal of non-citizen registrations, which is a good thing, and we commend them for their continued efforts to maintain clean voter registration lists.

However, the fact that a small number of these identified non-citizen registrations are also associated with (presumably … if the data from ELECT is accurate) illegally cast ballots in previous elections does raise a number of questions that citizens should be (politely) asking and discussing with their legislators and with elected and appointed government officials. Each act of non-citizen voting is a de facto disenfranchisement of legal voters’ rights, and is a punishable offense under VA law.

Q: How did these registrants get placed onto the voter rolls in the first place?

Q: What method and/or data sources are used by the state to identify non-citizen registrations for removal? If that process is exhaustive, and covers all registrations, then these numbers might be considered to represent a statistically complete picture of the problem. If that process is not exhaustive, in that it only uses serendipitous corroborating data sources, then these results likely under-represent the scale of the issue.

Q: As noted above, we are only considering here those individuals who have not had their records re-instated or reactivated after a determination of non-citizen status. We do not have enough information to determine how or why some records were first determined to be non-citizen, canceled and then subsequently re-instated. One potential area of concern is determining whether or not registrants might be falsely or errantly claiming to not be a citizen on official documents in order to be excused from jury duty, for example, and then work to re-instate their voting status once those documents percolate through the system to ELECT and are flagged for removal. This is a wholly separate but serious issue, as making false claims on official documents is itself a punishable offense.

Q: What procedures, processes and technical solutions are in place to prevent current or future registration and casting of ballots by non-citizens? This is especially pertinent given the current state of the flow of illegal immigrants crossing our national borders. According to a recent report by Yahoo Finance, VA is one of the top 30 destinations for illegal migrants, with both Loudoun County and Fairfax making the list.

Q: Why have none of the identified non-citizens who also cast ballots been investigated or prosecuted under VA Code 24.2-1004? As the identification of these ballots comes directly from looking at the official records produced by ELECT, it seems prudent for these to be forwarded by ELECT to the AG’s office with a recommendation to investigate and prosecute. Yet our FOIA request to the VA AG’s office inquiring as to any records associated with these types of investigations or prosecutions produced a “no relevant records exist” response.

Additionally, this evidence, which is derived only from official state records, directly contradicts multiple news media reports and attestations that non-citizen voting is a “Myth”, and that non-citizen voting happens “almost never”. If the data from ELECT is accurate, then there are at least 838 ballots that have been cast by non-citizen voters just since 2019. Now, that is still very infrequent, but it is not “almost never.” It is a legitimate concern … and these discoveries are only the registrations that have been found and removed from the voter rolls by ELECT and that we can observe in the data. We do not know how many exist that we do not know about.

Categories
technical

Information contained in 2D barcode on VA Drivers licenses

This one is quick, I promise.

Last week I got asked a question as to what information is actually contained in the 2D barcode on VA drivers licenses. I didn’t know the answer, so I did some digging.

First of all, I found the following information available on the VA DMV website: https://www.dmv.virginia.gov/sites/default/files/documents/barcode_calibration.pdf

I also was able to find a 2D QR code reader application that would allow anyone to scan their own 2D barcodes and see what information is contained in them. https://apps.apple.com/us/app/qr-reader-plus-2d-barcode/id1263976468

In case anyone else was curious.

Categories
Election Data Analysis Election Forensics Election Integrity technical

Potential Double Voters in VA 2024 DEM and REP March Primary

EPEC has identified at least 28 individual voter IDs in the 2024 VA March Primaries who, according to the Daily Absentee List (DAL) files purchased from the VA Department of Elections (a.k.a. “ELECT”), have a record showing they voted in both the Republican and Democratic primaries.

All of the records identified have the same voter ID and voter information appearing in both the Democratic and Republican DAL files and are recorded as having voted In-Person Early (a.k.a. an “On Machine” ballot). 26 of the individual voter IDs identified have nearly identical timestamps associated with their duplicated ballots, while 2 have significantly differing timestamps.
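
The cross-check described above can be sketched as follows. This is a minimal illustration, not the code EPEC ran: the field names (`voter_id`, `ballot_status`, `timestamp`), the ISO timestamp format, and the 60-second threshold for “nearly identical” are all assumptions, and the records are fictional.

```python
from datetime import datetime


def find_dual_primary_ids(dem_rows, rep_rows, max_gap_seconds=60):
    """Return {voter_id: (dem_time, rep_time, near_identical)} for voter IDs
    recorded as ON_MACHINE in both DAL files."""

    def on_machine(rows):
        # Keep only In-Person Early ("On Machine") records, keyed by voter ID.
        return {
            r["voter_id"]: datetime.fromisoformat(r["timestamp"])
            for r in rows
            if r["ballot_status"] == "ON_MACHINE"
        }

    dem, rep = on_machine(dem_rows), on_machine(rep_rows)
    matches = {}
    # Voter IDs present in both primaries' files.
    for vid in sorted(dem.keys() & rep.keys()):
        gap = abs((dem[vid] - rep[vid]).total_seconds())
        # max_gap_seconds is an assumed cutoff for "nearly identical".
        matches[vid] = (dem[vid], rep[vid], gap <= max_gap_seconds)
    return matches


# Fictional records for illustration only.
dem_rows = [
    {"voter_id": "V001", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-20T10:15:00"},
    {"voter_id": "V002", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-21T09:00:00"},
    {"voter_id": "V003", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-22T11:00:00"},
    {"voter_id": "V004", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-23T12:00:00"},
]
rep_rows = [
    {"voter_id": "V001", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-20T10:15:05"},
    {"voter_id": "V002", "ballot_status": "ON_MACHINE", "timestamp": "2024-02-27T16:30:00"},
    {"voter_id": "V004", "ballot_status": "MARKED", "timestamp": "2024-02-23T12:00:00"},
]
matches = find_dual_primary_ids(dem_rows, rep_rows)
```

Separating the near-identical timestamps from the widely differing ones is useful because the two groups plausibly have different explanations (a duplicated record versus two distinct check-ins).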

The identified records fall into a small handful of localities:

  • Hanover County had 1 identified record (different timestamps)
  • Northampton County had 1 identified record (different timestamps)
  • Bath County had 12 identified records
  • Norfolk County had 11 identified records
  • Franklin County had 1 identified record
  • Harrisonburg City had 1 identified record
  • Staunton City had 1 identified record

Note that the localities with identified records use differing poll-book or poll-pad vendors and optical scanners, so it is unlikely that these errors are tied to a particular vendor.

It is not clear at this point if these records are simply errant due to technology, policy, and/or procedural issues, or if these records truly reflect individuals that cast ballots in both primaries, which is a felony according to VA law. Section 24.2-1004 of the VA code states:

B. Any person who intentionally (i) votes more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (ii) procures, assists, or induces another to vote more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (iii) votes knowing that he is not qualified to vote where and when the vote is to be given, or (iv) procures, assists, or induces another to vote knowing that such person is not qualified to vote where and when the vote is to be given is guilty of a Class 6 felony.

https://law.lis.virginia.gov/vacode/title24.2/chapter10/section24.2-1004/

The VA Department of Elections (“ELECT”) also states:

Virginia will have a dual presidential primary election, which means both the Democratic Party and the Republican Party will have primaries on the same day.

In a dual primary, officers of election will ask voters if they want to cast their ballot in the Democratic Party Primary or the Republican Party Primary. All qualified voters may vote in either primary, but voters may not vote in both primaries.

https://www.elections.virginia.gov/news-releases/early-voting-for-presidential-primaries-begins-jan-19.html
Categories
Election Data Analysis Election Forensics Election Integrity mathematics technical Uncategorized

VA 2024 March Primary Election Fingerprints

Abstract

Examining the Election Night Reporting data from the VA 2024 March Democratic and Republican primaries provides supporting evidence that the Republican primary was impacted and skewed by a large number of Democratic “crossover” voters, resulting in an irregular election fingerprint when the data is plotted.

Background

The US National Academy of Sciences (NAS) published a paper in 2012 titled “Statistical detection of systematic election irregularities.” [1] The paper asked the question, “How can it be distinguished whether an election outcome represents the will of the people or the will of the counters?” The study reviewed the results from elections in Russia and other countries, where widespread fraud was suspected. The study was published in the proceedings of the National Academy of Sciences as well as referenced in multiple election guides by USAID [2][3], among other citations.

The study authors’ thesis was that with a large sample of the voting data, they would be able to see whether or not voting patterns deviated from the voting patterns of elections where there was no suspected fraud. The results of their study showed that there were indeed significant deviations from the expected, normal voting patterns in the elections where fraud was suspected, and provided a number of interesting insights into the associated “signatures” of various electoral mechanisms as they present themselves in the data.

Statistical results are often graphed to provide a visual representation of how normal data should look. A particularly useful visual representation of election data, as utilized in [1], is a two-dimensional histogram of the percent voter turnout vs the percent vote share for the winner, or what I call an “election fingerprint”. Under the assumptions of a truly free and fair election, the expected shape of the fingerprint is that of a 2D Gaussian (a.k.a. a “Normal”) distribution [4]. The obvious caveat here is that no election is ever perfect, but with a large enough sample size of data points we should be able to identify large scale statistical properties.

In many situations, the results of an experiment follow what is called a ‘normal distribution’. For example, if you flip a coin 100 times and count how many times it comes up heads, the average result will be 50. But if you do this test 100 times, most of the results will be close to 50, but not exactly. You’ll get almost as many cases with 49, or 51. You’ll get quite a few 45s or 55s, but almost no 20s or 80s. If you plot your 100 tests on a graph, you’ll get a well-known shape called a bell curve that’s highest in the middle and tapers off on either side. That is a normal distribution.

https://news.mit.edu/2012/explained-sigma-0209

In a free and fair election, the plotted graphs of both the Turnout percentage and the percentage of Vote Share for Election Winner should (again … ideally) both resemble Gaussian “Normal” distributions; and their combined distribution should also follow a 2-dimensional Gaussian (or “normal”) distribution. Computing this 2 Dimensional joint distribution of the % Turnout vs. % Vote Share is what I refer to as an “Election Fingerprint”.
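
As a rough illustration of the joint distribution described above, the sketch below bins each reporting unit (locality or precinct) by turnout and winner vote share. It is a simplified stand-in, not the exact procedure from [1] or the code behind the figures in this post; the input tuple layout, the bin count, and the choice to clamp turnout above 100% into the last bin are assumptions.

```python
def election_fingerprint(units, bins=10):
    """Build a 2D histogram of turnout (columns) vs winner vote share (rows).

    units: iterable of (registered_voters, ballots_cast, winner_votes).
    Returns a bins x bins grid of counts of reporting units.
    """
    grid = [[0] * bins for _ in range(bins)]
    for registered, cast, winner in units:
        if registered <= 0 or cast <= 0:
            continue  # skip units with no usable data
        turnout = cast / registered
        share = winner / cast
        # Values at or above 100% are clamped into the final bin (assumption).
        col = min(int(turnout * bins), bins - 1)
        row = min(int(share * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Fictional reporting units: (registered, ballots cast, votes for winner).
units = [(1000, 500, 250), (1000, 550, 300), (200, 200, 200)]
grid = election_fingerprint(units, bins=10)
```

In this toy input the first two units land in the same central cell, while the third (100% turnout, 100% for the winner) lands in the top-right corner, the region the NAS paper flags as a concern.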

Figure 1 reprints examples from the referenced National Academy of Sciences paper. The actual election results in Russia, Uganda and Switzerland appear in the left column, the right column is the modeled expected appearance in a fair election with little fraud, and the middle column is the researchers’ model of the as-collected data, with any possible fraud mechanisms included.

Figure 1: NAS Paper Results (reprinted from [1])

As you can see, the election in Switzerland (assumed fair) shows a range of voter turnout, from approximately 30 – 70% across voting districts, and a similar range of votes for the winner. The Switzerland data is consistent across models, and does not show any significant irregularities.

What do the clusters mean in the Russia 2011 and 2012 elections? Of particular concern are the top right corners, showing nearly 100% turnout of voters, and nearly 100% of them voted for the winner.

Both of those events (more than 90% of registered voters turning out to vote and more than 90% of the voters voting for the winner) are statistically improbable, even for very contested elections. Election results that show a strong linear streak away from the main fingerprint lobe indicate ‘ballot stuffing,’ where ballots are added at a specific rate. Voter turnout over 100% indicates ‘extreme fraud’. [1][5]

Note that election results with ‘outliers’ – results that fall outside of expected normal voting patterns – while evidentiary indicators, are not in and of themselves definitive proof of outright fraud or malfeasance. For example, in rare but extreme cases, where the electorate is very split and the split closely follows the geographic boundaries between voting precincts, we could see multiple overlapping Gaussian lobes in the 2D image. Even in that rare case, there should not be distinct structures visible in the election fingerprint, linear streaks, overly skewed or smeared distributions, or exceedingly high turnout or vote share percentages. Additional reviews of voting patterns and election results should be conducted whenever deviations from normal patterns occur in an election.

Additionally it should be noted that “the absence of evidence is not the evidence of absence”: Election Fingerprints that look otherwise normal might still have underlying issues that are not readily apparent with this view of the data.

Results on 2024 VA March Primaries:

Figure 2 and Figure 3 are the computed election fingerprints for the Democratic and Republican VA 2024 March Primaries, respectively. They were computed according to the NAS paper and using official state reported voter turnout and votes for the statewide winner and reported per voting Locality with combined In-Person Early, Election Day, Absentee and Provisional votes. Figures 4 and 5 perform the same process, except each data point is generated per individual precinct in a locality. The color scale moves from precincts with low counts as deep blue, to precincts with high numbers represented as bright yellow. Note that a small blurring filter was applied to the computed image for ease of viewing small isolated Locality or Precinct results.

The upper right inset in each graphic image was computed per the NAS paper; the bottom left inset shows what an idealized model of the data could or should look like, based on the reported voter turnout and vote share for the winner. This ideal model is allowed to have up to 3 Gaussian lobes based on the peak locations and standard deviations in the reported results. The top-left and bottom-right inset plots show the sum of the rows and columns of the fingerprint image. The top-left graph corresponds to the sum of the rows in the upper right image and is the histogram of the vote share for the winner across precincts. The bottom right graph shows the sum of the columns of the upper right image, and is the histogram of the percentage turnout across voting localities.
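
The row and column sums described above are just the marginal histograms of the fingerprint grid. A minimal sketch, assuming the grid is stored as a list of rows with vote share along the rows and turnout along the columns:

```python
def marginals(grid):
    """Return (vote_share_hist, turnout_hist) for a fingerprint grid.

    Summing across each row gives the vote-share histogram;
    summing down each column gives the turnout histogram.
    """
    vote_share_hist = [sum(row) for row in grid]
    turnout_hist = [sum(col) for col in zip(*grid)]
    return vote_share_hist, turnout_hist


# Tiny made-up 2 x 2 grid for illustration.
vote_share_hist, turnout_hist = marginals([[1, 2], [3, 4]])
```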

Figure 2 Democratic primary, accumulated per Locality:
Figure 3 Republican primary, accumulated per Locality:
Figure 4 Democratic primary, accumulated per Precinct:
Figure 5 Republican primary, accumulated per Precinct:

Analysis:

As can be seen in Figures 2 and 4, the Democratic primary fingerprint looks to fall within the expected normal distribution. Even though the total vote share for the winner (Biden) is up around 90%, this was not unexpected given the current set of contestants and the fact that Biden is the incumbent.

The Republican primary results, as shown in Figures 3 and 5, show significant “smearing” of the percent of total vote share for the winner. The percent of voter turnout (x-axis) does however show a near Gaussian distribution, which is what one would expect. The Republican primary data does not show the linear streaking pattern that the authors in [1] correlate with extreme fraud, but significant smearing of the distribution is observed.

One consideration that might partially explain this smearing of the histogram is that at least 17% of the voters were “crossover voters” who historically lean Democrat but voted in the Republican primary (see here for more information). Multiple news reports and exit polling suggest that this was due in part to loosely organized efforts by the opposing party to cast “Protest Votes” that artificially inflate the vote for the challenger (Haley) and dilute the expected margin of victory for the winner (Trump), with no intention of supporting a Republican candidate in the General Election. (This is completely legal in VA, by the way, as VA does not require by-party voter registration.)

If we categorize each locality as being either Democratic or Republican leaning based on the average results of the last four presidential elections, and then split the computation of the per precinct results into separate parts accordingly, we can see this phenomenon much more clearly.
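
One way this split could be sketched is shown below. The post does not specify how the “average results” are computed, so the example assumes each locality has a list of (Democratic share minus Republican share) margins, one per presidential election, and labels the locality by the sign of the average. The locality names, margins, and precinct labels are fictional.

```python
def classify_localities(margins):
    """Label each locality "D" or "R" from its average margin.

    margins maps locality name -> list of (Dem share - Rep share) values,
    one per presidential election considered (assumed: the last four).
    """
    labels = {}
    for name, values in margins.items():
        avg = sum(values) / len(values)
        labels[name] = "D" if avg > 0 else "R"  # ties go to "R" (arbitrary)
    return labels


def split_precincts(precincts, labels):
    """Split (locality, precinct_data) pairs by the locality's label."""
    groups = {"D": [], "R": []}
    for locality, data in precincts:
        groups[labels[locality]].append(data)
    return groups


margins = {
    "Locality A": [0.10, 0.12, 0.08, 0.15],
    "Locality B": [-0.20, -0.18, -0.25, -0.22],
}
labels = classify_localities(margins)
precincts = [("Locality A", "A-101"), ("Locality B", "B-201"), ("Locality A", "A-102")]
groups = split_precincts(precincts, labels)
```

Each group can then be fed through the same fingerprint computation separately.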

Figure 6 shows the per-precinct results for only those locality precincts that belong to historic Republican leaning localities. It depicts a much tighter distribution and has much less smearing or blurring of the distribution tails. We can see from the data that the Republican base in historically Republican leaning localities seems solidly behind candidate Trump.

Figure 7 shows the per-precinct results for only those locality precincts that belong to historic Democratic leaning localities. It can clearly be seen by comparing the two plots that the major contributor to the spread of the total Republican primary distribution is the votes from historically Democratic leaning localities.

Figure 6 Republican primary, accumulated per Precinct in Republican leaning localities:
Figure 7 Republican primary, accumulated per Precinct in Democratic leaning localities:

References:

Categories
Election Data Analysis Election Integrity technical Uncategorized

Technical issues with Enhanced Voting Election Night Return JSON files

As I was going through and processing the (new) VA election night reporting data provided by Enhanced Voting, I noticed a number of technical issues with the data feed. I’ve tried to capture them here in an attempt to help the VA Department of Elections correct bugs and implementation issues with their new reporting format.

While the new ENR data feed is commendable in that it presents the data for the state in an easily obtainable JSON formatted file, the following issues were observed in my processing of the data. I am happy to provide specific examples of these issues to the Enhanced Voting development team in order to help address them.

  • Inconsistent JSON formats being returned. Sometimes locality group results information is a cell of structures, sometimes an array.
  • Occasional malformed JSON, with missing opening or closing parentheses or brackets, which prevents the file from being parsed by JSON importing functions in Python, MATLAB, etc.
  • Occasional duplicated locality precinct group result information
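
For anyone else consuming the feed, a defensive loader along the following lines can work around all three issues. This is a sketch under assumptions: the key name `groupResults` is hypothetical (the real feed’s field names are not reproduced here), and the format inconsistency is assumed to show up in Python as a single object appearing where a list is expected.

```python
import json


def load_group_results(text):
    """Parse ENR JSON text and return a de-duplicated list of group results.

    Returns None when the text is not valid JSON or not a JSON object.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return None  # malformed file: caller can retry the download later
    if not isinstance(doc, dict):
        return None
    groups = doc.get("groupResults", [])  # hypothetical key name
    if isinstance(groups, dict):
        groups = [groups]  # normalize a lone object into a one-element list
    seen, unique = set(), []
    for g in groups:
        key = json.dumps(g, sort_keys=True)  # canonical form for comparison
        if key not in seen:
            seen.add(key)
            unique.append(g)
    return unique


single = load_group_results('{"groupResults": {"precinct": "101", "votes": 5}}')
dupes = load_group_results(
    '{"groupResults": [{"precinct": "101", "votes": 5}, {"votes": 5, "precinct": "101"}]}'
)
broken = load_group_results('{"groupResults": [{"precinct": "101"}')
```

Serializing each entry with sorted keys means two records with the same content but a different key order are still treated as duplicates.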
Categories
Election Data Analysis Interesting technical

Observed indications of cross party voting in VA 2024 ongoing primaries

There are currently two completely separate but simultaneous primary elections being held in VA, with actual Election Day coming up fast on March 5th. As part of EPEC’s data analysis on the ongoing Democrat and Republican primaries, I took some time to look at the distribution of voter participation. VA does not have voter registration by party, but participation in primary elections is often used as a surrogate method to try and estimate a voter’s leaning.

I was specifically interested in how many “cross-over” voters were participating in each party’s primary. There have been multiple news articles (here, for example) discussing the potential for Democrats to cross-vote in the 2024 Primaries, and I wanted to see if I could observe evidence of that behavior in the data.

Results:

As can be seen from the image below, there is definitely evidence of crossover voting occurring, with historically Democratic primary voters crossing over and voting in this year’s (2024) Republican primary.

Approximately 17.5% of the 109,395 ballots cast in the 2024 VA Republican primary are associated with historically Democrat leaning registrants. Only 0.35% of the 159,505 ballots cast in the 2024 VA Democratic primary are associated with historically Republican leaning registrants. [Note this plot was updated on 2024-03-11 to reflect the latest values. The previous results from mid-February had the number of crossover D->R voters at ~12%]

Method:

Step 1: Compute an estimate of voter leaning.

The data utilized in this analysis all comes directly from the VA Dept of Elections (“ELECT”) and includes the statewide Registered Voter List (“RVL”) and Voter History List (“VHL”) files dated 03/11/2024, as well as the Daily Absentee List (“DAL”) files corresponding to each of the ongoing Democrat and Republican primaries.

An estimate of each voter’s party leaning is first computed by going through the VHL and, for each unique voter in the VHL, summing the number of Democrat or Republican primaries that voter has participated in historically. We then take the difference of these two counts and divide by the total number of election contests the voter has participated in. This gives us a resultant estimate of the “leaning” for each unique voter.

leaning = (# Dem Primaries – # Rep Primaries) / (# of Total Contests)

A leaning < 0 indicates a Republican lean, and > 0 indicates a Democratic lean. A voter might have a lean == 0 if they had a balanced participation in previous primaries, or if there is no voter history for that particular voter.
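
The leaning formula above translates directly into a few lines of code. The function name and argument names below are illustrative, and returning 0 for a voter with no history follows the convention stated in the text.

```python
def voter_leaning(dem_primaries, rep_primaries, total_contests):
    """leaning = (# Dem primaries - # Rep primaries) / (# total contests).

    Negative values indicate a Republican lean, positive a Democratic lean.
    A voter with no history gets a leaning of 0.
    """
    if total_contests == 0:
        return 0.0
    return (dem_primaries - rep_primaries) / total_contests


lean_dem = voter_leaning(3, 1, 10)   # more Democratic primaries
lean_rep = voter_leaning(0, 2, 8)    # only Republican primaries
lean_none = voter_leaning(0, 0, 0)   # no voter history
```

Because primaries are themselves contests, the result always falls between -1 and 1 when the counts are consistent.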

Step 2: Plot the histogram of voter leaning for ballots cast so far in both the Democratic and Republican primaries. Additionally plot the Computed voter leaning for the entire RVL as a reference.
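
A sketch of the binning and the crossover percentage follows. It assumes a “crossover” ballot is one cast in a party’s primary by a voter whose computed leaning has the opposite sign, and that leanings fall in the range -1 to 1; the bin count and sample values are made up for illustration.

```python
def crossover_share(leanings, primary):
    """Fraction of ballots in `primary` ("DEM" or "REP") cast by voters
    whose leaning points to the other party."""
    if not leanings:
        return 0.0
    if primary == "REP":
        crossed = sum(1 for x in leanings if x > 0)  # Dem-leaning voters
    else:
        crossed = sum(1 for x in leanings if x < 0)  # Rep-leaning voters
    return crossed / len(leanings)


def leaning_histogram(leanings, bins=4):
    """Count leanings into equal-width bins spanning [-1, 1]."""
    counts = [0] * bins
    width = 2.0 / bins
    for x in leanings:
        idx = int((x + 1.0) / width)
        idx = max(0, min(idx, bins - 1))  # keep +1.0 in the last bin
        counts[idx] += 1
    return counts


# Made-up leanings for ballots cast in one primary.
sample = [-0.5, 0.0, 0.25, 0.5]
rep_cross = crossover_share(sample, "REP")
dem_cross = crossover_share(sample, "DEM")
hist = leaning_histogram(sample, bins=4)
```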

Categories
Election Data Analysis Election Forensics Election Integrity

2024 VA March Republican Primary Election DAL File Metrics

Below you will find the current summary data and graphics from the VA 2024 Republican Primary Election Daily Absentee List files. We pull the DAL file every day and track the count of each specific ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, except with a logarithmic y-scale in order to compress the dynamic range and show the shape of all of the data curves in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS = “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS = “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS = “PROVISIONAL” and APP_STATUS = “APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS = “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS = “MARKED” and APP_STATUS = “APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS = “ON_MACHINE” and APP_STATUS = “APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS = “PRE-PROCESSED” and APP_STATUS = “APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS = “FWAB” and APP_STATUS = “APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE = “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE = “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE = “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “TEMPORARY” and where COUNTABLE is True
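
The column definitions above can be expressed as a single pass over the DAL records. The sketch below is an illustration of those rules rather than EPEC’s production code; it assumes each record is a dict keyed by the BALLOT_STATUS, APP_STATUS and VOTER_TYPE field names used in the list, and the sample records are fictional.

```python
from collections import Counter

# Statuses counted regardless of APP_STATUS -> output column name.
ANY_APP = {"ISSUED": "ISSUED", "NOT ISSUED": "NOT_ISSUED", "DELETED": "DELETED"}
# Statuses counted only when APP_STATUS is APPROVED; these are "countable".
APPROVED_ONLY = {
    "PROVISIONAL": "PROVISIONAL",
    "MARKED": "MARKED",
    "ON_MACHINE": "ON_MACHINE",
    "PRE-PROCESSED": "PRE_PROCESSED",
    "FWAB": "FWAB",
}
VOTER_TYPES = ("MILITARY", "OVERSEAS", "TEMPORARY")


def summarize_dal(records):
    """Tally DAL records into the summary columns defined above."""
    c = Counter()
    for r in records:
        status, app = r["BALLOT_STATUS"], r["APP_STATUS"]
        vtype = r.get("VOTER_TYPE", "")
        countable = False
        if status in ANY_APP:
            c[ANY_APP[status]] += 1
        elif status in APPROVED_ONLY and app == "APPROVED":
            c[APPROVED_ONLY[status]] += 1
            countable = True
        if vtype in VOTER_TYPES:
            c[vtype] += 1
            if countable:
                c[vtype + "_COUNTABLE"] += 1
    c["MAIL_IN"] = c["MARKED"] + c["PRE_PROCESSED"]
    c["COUNTABLE"] = sum(c[name] for name in APPROVED_ONLY.values())
    return c


# Fictional records for illustration only.
records = [
    {"BALLOT_STATUS": "ON_MACHINE", "APP_STATUS": "APPROVED", "VOTER_TYPE": ""},
    {"BALLOT_STATUS": "MARKED", "APP_STATUS": "APPROVED", "VOTER_TYPE": "MILITARY"},
    {"BALLOT_STATUS": "PRE-PROCESSED", "APP_STATUS": "APPROVED", "VOTER_TYPE": "OVERSEAS"},
    {"BALLOT_STATUS": "ISSUED", "APP_STATUS": "APPROVED", "VOTER_TYPE": "MILITARY"},
    {"BALLOT_STATUS": "MARKED", "APP_STATUS": "DENIED", "VOTER_TYPE": ""},
]
summary = summarize_dal(records)
```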

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.
