Categories
Election Data Analysis Election Forensics Election Integrity technical

Identified Non-Citizen Voters Stratified by Party Leaning

We have had a lot of interest in our work tracking the number of non-citizen registrant removals by the VA Department of Elections (ELECT). I was recently asked a follow-up question: how do those numbers break out by estimated party leaning for each of the 435 identified non-citizen voters? The method for performing this calculation is presented below. The majority (~77%) could not be associated with either party, as they either had no primary election voting history or the computation resulted in a neutral score. Approximately 21% had a DEM party leaning and only ~2% had a REP party leaning.

Method:

VA does not have registration by party, so there is no direct method for knowing a registrant’s party affiliation. However, for the subset of voters who have voted in party primaries, we can calculate a “leaning” score in a couple of different ways:

Leaning_v1 : This method uses the weighted average of the elections each voter has participated in, with Democratic primaries weighted as +1, and Republican primaries weighted as -1, and all other elections weighted as 0.

Leaning_v2: This method computes the voter leaning as the difference in the ratios of how often a voter participates in either a Dem or Rep primary, with a positive result meaning higher Dem, and a negative result meaning a higher Rep score.

Using the results that we presented previously, we have computed the Leaning scores for each identified non-citizen voter and computed the percentage of the non-citizen voters that fall into each category of Republican, Democrat or Unknown. Both methods of computing the leaning give slightly different, but consistent, results.
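The two scores described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the election-type labels are hypothetical, and for Leaning_v2 we assume each ratio is the number of that party's primaries the voter participated in divided by the number of that party's primaries held. The post does not spell out the denominators, so the actual computation may differ.

```python
# Hypothetical election-type labels; real voter history values may differ.
DEM_PRIMARY = "DEM_PRIMARY"
REP_PRIMARY = "REP_PRIMARY"
OTHER = "OTHER"

# Leaning_v1 weights as described in the text: +1 Dem, -1 Rep, 0 otherwise.
WEIGHTS = {DEM_PRIMARY: 1.0, REP_PRIMARY: -1.0, OTHER: 0.0}


def leaning_v1(history):
    """Weighted average over every election the voter participated in."""
    if not history:
        return 0.0
    return sum(WEIGHTS[e] for e in history) / len(history)


def leaning_v2(history, dem_primaries_held, rep_primaries_held):
    """Difference of participation ratios (denominators are an assumption)."""
    dem = history.count(DEM_PRIMARY) / dem_primaries_held if dem_primaries_held else 0.0
    rep = history.count(REP_PRIMARY) / rep_primaries_held if rep_primaries_held else 0.0
    return dem - rep


def categorize(score):
    """Map a score to the three categories used in the results tables."""
    if score > 0:
        return "Democrat"
    if score < 0:
        return "Republican"
    return "Unknown"


# One hypothetical voter: two Dem primaries, one Rep primary, two other elections.
history = [DEM_PRIMARY, OTHER, OTHER, REP_PRIMARY, DEM_PRIMARY]
v1 = leaning_v1(history)
v2 = leaning_v2(history, dem_primaries_held=4, rep_primaries_held=2)
print(categorize(v1), categorize(v2))
```

Under these assumptions the same voter lands in different categories (v1 is positive, v2 is exactly zero), which is one way the two methods can give slightly different counts.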

Results:
Leaning_v1 of Distinct Identified Non-Citizen Voters:

Category | Percent | Count
Republican (score<0) Leaning_v1 Non-Citizen Voters | 1.38% | 6
Unknown (score=0) Leaning_v1 Non-Citizen Voters | 77.7% | 338
Democrat (score>0) Leaning_v1 Non-Citizen Voters | 20.92% | 91

Leaning_v2 of Distinct Identified Non-Citizen Voters:

Category | Percent | Count
Republican (score<0) Leaning_v2 Non-Citizen Voters | 2.07% | 9
Unknown (score=0) Leaning_v2 Non-Citizen Voters | 77.01% | 335
Democrat (score>0) Leaning_v2 Non-Citizen Voters | 20.92% | 91

Categories
Election Data Analysis Election Forensics Election Integrity technical

Non-citizen registrations with previous voting history in VA election data – update Sept 2024

We have updated our previous analyses (from March and July) with the latest data from the VA Department of Elections.

Corrections (09/21/2024): After initial publication there were two errors discovered in generating the correlated entries from the VHL and DAL file archives.

  • Due to the Department of Elections’ recent (08/28/2024) change to the DOB data, VHL entries that should have been removed as duplicate entries in our combined VHL history were not being correctly handled.
  • Due to a variable naming typo in our processing scripts, the supplemental entries found in the DAL were not being de-duplicated against the entries already found in the VHL, as was reported.

Abstract:

Using the data provided by the VA Department of Elections (ELECT), we have identified at least 2,299 unique registrations that were marked as “Declared Non-Citizen” and removed by ELECT from the voter rolls since May of 2023. Of those 2,299, there were 438 (revised from 453) that also had corresponding records of ballots cast at some point in the official Voter History record that we could observe, accounting for 1,034 (revised from 1,117) ballots cast since Feb of 2019. Separately, 164 of the non-citizen registrations had at least 1 vote cast per the Daily Absentee List (DAL) data, with a total of 204 ballots identified; however, only two of those DAL-identified voters and two ballots were not already identified in the VHL. The total number of identified non-citizen ballots cast is therefore 1,036 (revised from 1,321) by 440 (revised from 617) registrants when combining unique VHL and DAL identifications.

After our March 2024 post on this topic, we submitted all of the relevant information that we had at the time to the VA AG’s office. We have not heard any response or update on the matter since that time, besides this being considered an active investigation. We subsequently sent our July results as well to the same contact at the AG’s office, but have had no response.

The Arlington County VA Electoral Board undertook its own investigation into this matter after our previous results were posted, and recently (as of Sept 10 2024) voted 3-0 to send the information to the AG’s office as well. The Arlington County Commonwealth’s Attorney is also reported to have an ongoing investigation into the matter.

https://www.gazetteleader.com/arlington/news/investigation-launched-have-non-citizens-voted-in-arlington-9379534

https://www.gazetteleader.com/arlington/news/va-attorney-general-to-be-alerted-on-possible-non-citizen-voting-9504753

Background:

The VA Department of Elections continuously tries to identify and remove invalid or out of date registration records from the voter rolls. One category used for removal is if a registrant has been determined to be a non-citizen. It is required by the VA Constitution that only citizens are allowed to vote in VA elections.

In elections by the people, the qualifications of voters shall be as follows: Each voter shall be a citizen of the United States, shall be eighteen years of age, shall fulfill the residence requirements set forth in this section, and shall be registered to vote pursuant to this article. …

VA Constitution, Article II, Section 1. https://law.lis.virginia.gov/constitution/article2/section1/

Additionally, according to VA Code Section 24.2-1004, the act of knowingly casting a ballot by someone who is not eligible to vote is a Class 6 felony.

A. Any person who wrongfully deposits a ballot in the ballot container or casts a vote on any voting equipment, is guilty of a Class 1 misdemeanor.

B. Any person who intentionally (i) votes more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (ii) procures, assists, or induces another to vote more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (iii) votes knowing that he is not qualified to vote where and when the vote is to be given, or (iv) procures, assists, or induces another to vote knowing that such person is not qualified to vote where and when the vote is to be given is guilty of a Class 6 felony.

https://law.lis.virginia.gov/vacode/title24.2/chapter10/section24.2-1004/

ELECT makes available for purchase by qualifying parties various different data sets, including the registered voter list (RVL) and the voter history list information file (VHL). Additionally, ELECT makes available a Monthly Update Service (MUS) subscription that is published at the beginning of each month and contains (almost) all of the Voter List changes and transactions for the previous period.

In the MUS data there is a “NVRAReasonCode” field that is associated with each transaction that gives the reason for the update or change in the voter record. This is in accordance with the disclosure and transparency requirements in the NVRA. One of the possible reason codes given for records that are removed is “Declared Non-Citizen.”

EPEC has been consistently purchasing and archiving all of these official records as part of our ongoing work to document and educate the public as to the ongoing operations of our elections. (If you’re interested in supporting this work, please head on over to our donation page, or to our give-send-go campaign to make a tax-deductible donation, as these data purchases are not cheap!)

EPEC looked at the number of records associated with unique voter identification numbers that had been identified for removal from the voter record due to non-citizenship status, per the entries in the MUS, and correlated those results with our accumulated voter history list information in order to determine how many non-citizen registrations had corresponding records of ballots cast in previous elections. We only considered those records that are currently in a non-active state as of the latest MUS transaction log, as some determinations of non-citizenship status in the historical MUS transaction log might have been due to error and subsequently corrected and reinstated to active status. That is, we are not considering those records that had a “Declared Non-Citizen” disqualification, but were then subsequently reinstated and reactivated by ELECT.
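The filtering and correlation described above might look roughly like the following. This is a minimal sketch with invented rows, not EPEC's processing scripts: the tuple layout and the "Reinstated" reason label are hypothetical placeholders, and the real MUS and VHL files carry many more fields.

```python
from collections import defaultdict

# Hypothetical, simplified MUS rows: (voter_id, transaction_date, type, reason).
# "Reinstated" is an invented placeholder, not a confirmed NVRAReasonCode value.
mus_rows = [
    ("A1", "2023-06-02", "CANCEL", "Declared Non-Citizen"),
    ("A1", "2023-09-15", "ADD", "Reinstated"),  # later reactivated: excluded
    ("B2", "2023-07-10", "CANCEL", "Declared Non-Citizen"),
    ("C3", "2024-01-05", "CANCEL", "Declared Non-Citizen"),
    ("D4", "2024-02-01", "CANCEL", "Deceased"),
]

# Hypothetical voter history rows: (voter_id, election).
vhl_rows = [
    ("B2", "2020 November General"),
    ("B2", "2022 November General"),
    ("D4", "2020 November General"),
]


def removed_non_citizens(rows):
    """IDs flagged "Declared Non-Citizen" whose latest transaction is still a CANCEL."""
    flagged = set()
    latest_type = {}
    # ISO-formatted dates sort chronologically as plain strings.
    for voter_id, _date, ttype, reason in sorted(rows, key=lambda r: r[1]):
        if reason == "Declared Non-Citizen":
            flagged.add(voter_id)
        latest_type[voter_id] = ttype  # later rows overwrite earlier ones
    return {v for v in flagged if latest_type[v] == "CANCEL"}


def ballots_for(ids, vhl):
    """Collect the voter history entries that belong to the given IDs."""
    found = defaultdict(list)
    for voter_id, election in vhl:
        if voter_id in ids:
            found[voter_id].append(election)
    return dict(found)


removed = removed_non_citizens(mus_rows)
with_history = ballots_for(removed, vhl_rows)
print(len(removed), len(with_history), sum(len(b) for b in with_history.values()))
```

In this toy data, voter A1 is dropped because a later transaction reactivates the record, which mirrors the exclusion of reinstated registrations described above.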

While EPEC has periodically purchased full copies of the Voter History List for our archive, there is a known issue with the way ELECT handles removals from the voter record. It can cause sampling issues depending on when the VHL file is purchased, and can cause records of legitimately cast ballots to be absent from the VHL: when ELECT removes a voter from the voter list, they also remove all instances of that voter ID from the voter history information and other data files provided to qualified organizations. (IMO … that’s a terrible way to manage the data, but that is the way it is done.) In light of that, EPEC also used its archived versions of the Daily Absentee List (DAL) for recent elections in order to attempt to find records of votes cast that might otherwise be missing from the VHL.

Results:

There were 2,299 unique voter records marked for removal with the reason of “Declared Non-Citizen” and not subsequently reinstated in the accumulated MUS record that EPEC began collecting in mid-2023. Of those 2,299 there were 438 (revised from 453) that also had corresponding records of recent ballots cast at some point in the official Voter History record that we could observe. There were 1,034 (revised from 1,117) associated ballots cast identified since Feb of 2019. Figure 1 shows the distribution of non-citizen voters in the cumulative MUS file history. The blue trace represents the total identified and CANCELED non-citizen registrations, and the yellow trace represents the number of those records that also had corresponding records in the accumulated voter history data.

Figure 1: Distribution of the number of identified non-citizen records and ballots in the cumulative ELECT MUS file history. The x-axis is the date that a record was marked as CANCELED for the reason of “Declared Non-Citizen”.

Note that the data contained in the MUS updates often covers more than a single month period. In other words, the individual MUS files are oversampled. Subsequent MUS files can therefore also have repeated entries from previous versions, as their data may overlap. Our analysis used the first unique entry for a given voter ID marked as “Declared Non-Citizen” in the cumulative MUS record in order to build Figure 1. This data oversampling in the MUS helps explain the relative increase in the May 2023 bin.
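Keeping only the first entry per voter ID across overlapping files, and then binning by month, can be sketched as below. The rows and file names are made up for illustration, and the tuple layout is an assumption rather than the actual MUS schema.

```python
from collections import Counter

# Hypothetical "Declared Non-Citizen" entries: (voter_id, cancel_date, source_file).
# The same transaction can appear in more than one overlapping MUS file.
entries = [
    ("B2", "2023-05-20", "MUS_2023_06"),
    ("B2", "2023-05-20", "MUS_2023_07"),  # repeat from an overlapping file
    ("C3", "2023-07-03", "MUS_2023_08"),
    ("E5", "2023-05-28", "MUS_2023_06"),
]


def first_entry_per_id(rows):
    """Keep the earliest cancel date seen for each voter ID."""
    first = {}
    for voter_id, date, _source in sorted(rows, key=lambda r: r[1]):
        first.setdefault(voter_id, date)  # ignores later repeats
    return first


def monthly_bins(first):
    """Count unique voter IDs per YYYY-MM bin."""
    return Counter(date[:7] for date in first.values())


bins = monthly_bins(first_entry_per_id(entries))
print(sorted(bins.items()))
```

Without the de-duplication step, voter B2 would be counted twice in the May 2023 bin.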

As VHL information can be incomplete depending on when the VHL data was purchased relative to when registrants were removed from the voter records, EPEC also checked these non-citizen removals against the archived history of Daily Absentee List (DAL) files that EPEC has accumulated. In the DAL data, 164 of the non-citizen registrations had at least 1 vote cast, with a total of 204 ballots identified; however, only two of those DAL-identified voters and two ballots were not already identified in the VHL. The total number of identified non-citizen ballots cast is therefore 1,036 (revised from 1,321) by 440 (revised from 617) registrants when combining unique VHL and DAL identifications.
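Combining the two sources without double counting is essentially a set union. The sketch below uses invented (voter ID, election date) pairs to show the idea, then checks the reported totals; keying a ballot on that pair is an assumption for illustration.

```python
# Hypothetical ballots keyed by (voter_id, election_date).
vhl_ballots = {("B2", "2020-11-03"), ("B2", "2022-11-08"), ("F6", "2021-11-02")}
dal_ballots = {("B2", "2022-11-08"), ("G7", "2023-11-07")}

# Ballots that appear only in the DAL (not already present in the VHL).
dal_only = dal_ballots - vhl_ballots

# Union counts each ballot once, even when both sources record it.
combined = vhl_ballots | dal_ballots
combined_voters = {voter_id for voter_id, _ in combined}

print(len(dal_only), len(combined), len(combined_voters))

# The reported totals follow the same logic: VHL counts plus DAL-only counts.
total_ballots = 1034 + 2
total_voters = 438 + 2
```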

These identifications represent only the individuals who declared themselves as non-citizen status through official interactions with ELECT, DMV, or other agencies.

The distribution of identified unique voter IDs for the 440 identified non-citizen voters per VA locality is given below in Table 1. It should be noted that each ballot record has a specific locality associated with where the ballot was cast, whereas unique individuals might move between localities over time. The assignment of unique identified individuals to each locality in Table 1 is therefore based on the locality listed in the specific MUS “Declared Non-Citizen” record for that individual, while the assignment of ballots cast to localities is based on the individual VHL/DAL records. A person could have lived and voted multiple times in one county, then moved to another county and voted again before finally being determined to be a non-citizen. The same person would have generated a separate VHL/DAL record for each ballot cast, potentially associated with different localities. This should be kept in mind when attempting to interpret Table 1.


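Locality | MUS Removals | Voted (VHL) | Voted (DAL) | Total Voted | Votes (VHL) | Votes (DAL) | Total Votes
ACCOMACK COUNTY | 1 | - | - | 0 | - | - | 0
ALBEMARLE COUNTY | 22 | 4 | - | 4 | 6 | - | 6
ALEXANDRIA CITY | 109 | 21 | - | 21 | 36 | - | 36
AMELIA COUNTY | 2 | 1 | - | 1 | 3 | - | 3
APPOMATTOX COUNTY | 1 | - | - | 0 | - | - | 0
ARLINGTON COUNTY | 69 | 15 | - | 15 | 43 | - | 43
AUGUSTA COUNTY | 6 | 1 | - | 1 | 1 | - | 1
BEDFORD COUNTY | 9 | 2 | - | 2 | 3 | - | 3
BOTETOURT COUNTY | 2 | - | - | 0 | - | - | 0
BRUNSWICK COUNTY | 1 | 1 | - | 1 | 2 | - | 2
BUCKINGHAM COUNTY | 3 | - | - | 0 | - | - | 0
CAROLINE COUNTY | 5 | 1 | - | 1 | 5 | - | 5
CARROLL COUNTY | 6 | 2 | - | 2 | - | - | 0
CHARLES CITY COUNTY | 2 | 1 | - | 1 | 3 | - | 3
CHARLOTTESVILLE CITY | 15 | 2 | - | 2 | 7 | - | 7
CHESAPEAKE CITY | 59 | 16 | - | 16 | 33 | - | 33
CHESTERFIELD COUNTY | 212 | 22 | - | 22 | 51 | - | 51
CLARKE COUNTY | 6 | 3 | - | 3 | 4 | - | 4
COLONIAL HEIGHTS CITY | 5 | 1 | - | 1 | 3 | - | 3
CULPEPER COUNTY | 16 | 1 | - | 1 | 1 | - | 1
DANVILLE CITY | 14 | 2 | - | 2 | 8 | - | 8
DINWIDDIE COUNTY | 7 | 1 | - | 1 | 1 | - | 1
EMPORIA CITY | 2 | - | - | 0 | - | - | 0
ESSEX COUNTY | 1 | - | - | 0 | - | - | 0
FAIRFAX CITY | 9 | 3 | - | 3 | 11 | 1 | 12
FAIRFAX COUNTY | 401 | 78 | 1 | 79 | 175 | - | 175
FAUQUIER COUNTY | 13 | 2 | - | 2 | 9 | - | 9
FLUVANNA COUNTY | 1 | 1 | - | 1 | 1 | - | 1
FRANKLIN COUNTY | 3 | 2 | - | 2 | 4 | - | 4
FREDERICK COUNTY | 19 | 2 | - | 2 | 2 | - | 2
FREDERICKSBURG CITY | 15 | 1 | - | 1 | 1 | - | 1
GALAX CITY | 1 | - | - | 0 | - | - | 0
GILES COUNTY | 2 | - | - | 0 | - | - | 0
GLOUCESTER COUNTY | 1 | 1 | - | 1 | 1 | - | 1
GOOCHLAND COUNTY | 4 | - | - | 0 | - | - | 0
GRAYSON COUNTY | 1 | - | - | 0 | - | - | 0
GREENE COUNTY | 4 | 1 | - | 1 | 2 | - | 2
HALIFAX COUNTY | 1 | - | - | 0 | - | - | 0
HAMPTON CITY | 40 | 9 | - | 9 | 15 | - | 15
HANOVER COUNTY | 8 | 1 | - | 1 | 4 | - | 4
HARRISONBURG CITY | 39 | 2 | - | 2 | 3 | - | 3
HENRICO COUNTY | 29 | 3 | - | 3 | 10 | - | 10
HENRY COUNTY | - | - | - | - | 2 | - | 2
ISLE OF WIGHT COUNTY | 1 | - | - | 0 | 1 | - | 1
JAMES CITY COUNTY | 23 | 4 | - | 4 | 11 | - | 11
KING GEORGE COUNTY | 1 | - | - | 0 | - | - | 0
KING WILLIAM COUNTY | 1 | - | - | 0 | - | - | 0
LOUDOUN COUNTY | 135 | 42 | - | 42 | 89 | - | 89
LOUISA COUNTY | 7 | - | - | 0 | - | - | 0
LYNCHBURG CITY | 16 | 2 | - | 2 | 2 | - | 2
MANASSAS CITY | 25 | 3 | - | 3 | 9 | - | 9
MANASSAS PARK CITY | 11 | - | - | 0 | - | - | 0
MARTINSVILLE CITY | 4 | 1 | - | 1 | - | - | 0
MECKLENBURG COUNTY | 6 | 3 | - | 3 | 10 | - | 10
MIDDLESEX COUNTY | 2 | - | - | 0 | - | - | 0
NELSON COUNTY | 2 | - | - | 0 | - | - | 0
NEW KENT COUNTY | 2 | 1 | - | 1 | - | - | 0
NEWPORT NEWS CITY | 67 | 19 | - | 19 | 45 | - | 45
NORFOLK CITY | 58 | 12 | - | 12 | 29 | - | 29
NORTHUMBERLAND COUNTY | 3 | 2 | - | 2 | 5 | - | 5
NOTTOWAY COUNTY | 2 | - | - | 0 | - | - | 0
ORANGE COUNTY | 2 | 1 | - | 1 | 3 | - | 3
PETERSBURG CITY | 17 | 3 | - | 3 | 5 | - | 5
PITTSYLVANIA COUNTY | 4 | 1 | - | 1 | 1 | - | 1
PORTSMOUTH CITY | 33 | 11 | - | 11 | 35 | - | 35
POWHATAN COUNTY | 3 | - | - | 0 | 1 | - | 1
PRINCE EDWARD COUNTY | 10 | 3 | - | 3 | 11 | - | 11
PRINCE GEORGE COUNTY | 10 | 1 | - | 1 | 1 | - | 1
PRINCE WILLIAM COUNTY | 220 | 43 | - | 43 | 103 | - | 103
PULASKI COUNTY | 5 | 1 | - | 1 | 2 | - | 2
RAPPAHANNOCK COUNTY | 1 | - | - | 0 | - | - | 0
RICHMOND CITY | 112 | 20 | 1 | 21 | 59 | 1 | 60
ROANOKE CITY | 43 | 2 | - | 2 | - | - | 0
ROANOKE COUNTY | 4 | - | - | 0 | 1 | - | 1
ROCKINGHAM COUNTY | 21 | 5 | - | 5 | 13 | - | 13
RUSSELL COUNTY | 2 | 1 | - | 1 | 1 | - | 1
SALEM CITY | 3 | - | - | 0 | - | - | 0
SHENANDOAH COUNTY | 7 | 1 | - | 1 | 1 | - | 1
SMYTH COUNTY | 2 | - | - | 0 | - | - | 0
SPOTSYLVANIA COUNTY | 35 | 3 | - | 3 | 8 | - | 8
STAFFORD COUNTY | 41 | 7 | - | 7 | 21 | - | 21
STAUNTON CITY | 3 | - | - | 0 | - | - | 0
SUFFOLK CITY | 27 | 11 | - | 11 | 20 | - | 20
SUSSEX COUNTY | 2 | 1 | - | 1 | 3 | - | 3
TAZEWELL COUNTY | 3 | 1 | - | 1 | 1 | - | 1
VIRGINIA BEACH CITY | 106 | 15 | - | 15 | 45 | - | 45
WARREN COUNTY | 8 | 2 | - | 2 | 5 | - | 5
WASHINGTON COUNTY | 4 | 2 | - | 2 | 6 | - | 6
WAYNESBORO CITY | 3 | - | - | 0 | - | - | 0
WESTMORELAND COUNTY | 1 | - | - | 0 | - | - | 0
WILLIAMSBURG CITY | 3 | - | - | 0 | - | - | 0
WINCHESTER CITY | 15 | 2 | - | 2 | 2 | - | 2
WYTHE COUNTY | 2 | - | - | 0 | - | - | 0
YORK COUNTY | 15 | 9 | - | 9 | 35 | - | 35
TOTAL | 2299 | 438 | 2 | 440 | 1034 | 2 | 1036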

The distribution of the 1,034 ballots that were identified as being cast by non-citizen voters (yellow trace in Figure 1) in previous elections is shown in Figure 2. The most significant spikes are in the 2019, 2020, 2021 and 2022 November General elections, as well as the 2020 March Democratic presidential primary. Figure 3 shows this distribution as a percentage of votes cast. Please note that the Y-axis of the percent plot in Figure 3 is in units of 10^-3 percent. These graphs were only produced for the VHL data, and do not include the DAL-identified records.

Figure 2: Distribution of identified non-citizen ballots cast in previous elections.
Figure 3: Distribution of identified non-citizen ballots cast in previous elections as percent of total ballots cast, according to entries in the VHL/DAL data files.

Figures 4 and 5 show the distribution of the registration dates of the identified non-citizen records. The same data is plotted in both figures; the only difference is the scale of the Y-axis, which is logarithmic in Figure 5 in order to better show the dynamic range of the values. Looking at the registration dates of these identified records, we see a distinct relative increase starting around 1996, and then again around 2012.

Figure 4: Registration dates of the identified non-citizen records. Absolute count on y-axis.
Figure 5: Registration dates of the identified non-citizen records. Logarithmic Y-axis scale.

EPEC made a FOIA request to the VA Attorney General’s office on March 11, 2024 inquiring for any records regarding how many prosecutions for non-citizen voting had occurred since June of 2023. We received a response that the AG had no such relevant records.

EPEC subsequently submitted our March analysis dataset to the VA AG’s office upon their request. We have heard no updates or status as to any action taken by the AG’s office since that time, except that it is being considered an ongoing investigation.

Discussion

It appears from the MUS data that the VA Department of Elections (ELECT) is doing routine identification, cleanup and removal of non-citizen registrations, which is a good thing, and we commend them for their continued efforts to maintain clean voter registration lists.

However, the fact that a small number of these identified non-citizen registrations are also associated with (presumably … if the data from ELECT is accurate) illegally cast ballots in previous elections does raise a number of questions that citizens should be (politely) asking and discussing with their legislators and with elected and appointed government officials. Each act of non-citizen voting is a de facto disenfranchisement of legal voters’ rights, and is a punishable offense under VA law.

Q: How did these registrants get placed onto the voter rolls in the first place?

Q: What method and/or data sources are used by the state to identify non-citizen registrations for removal? If that process is exhaustive, and covers all registrations, then these numbers might be considered to represent a statistically complete picture of the problem. If that process is not exhaustive, in that it only uses serendipitous corroborating data sources, then these results likely under-represent the scale of the issue.

Q: As noted above, we are only considering here those individuals who have not had their records re-instated or reactivated after a determination of non-citizen status. We do not have enough information to determine how or why some records were first determined to be non-citizen, canceled and then subsequently re-instated. One potential area of concern is determining whether or not registrants might be falsely or errantly claiming to not be a citizen on official documents in order to be excused from jury duty, for example, and then work to re-instate their voting status once those documents percolate through the system to ELECT and are flagged for removal. This is a wholly separate but serious issue, as making false claims on official documents is itself a punishable offense.

Q: What procedures, processes and technical solutions are in place to prevent current or future registration and casting of ballots by non-citizens? This is especially pertinent given the current state of the flow of illegal immigrants crossing our national borders. According to a recent report by Yahoo Finance, VA is one of the top 30 destinations for illegal migrants, with both Loudoun County and Fairfax making the list.

Q: Why have none of the identified non-citizens who also cast ballots been investigated or prosecuted under VA Code 24.2-1004? As the identification of these ballots comes directly from looking at the official records produced by ELECT, it seems prudent for these to be forwarded by ELECT to the AG’s office with a recommendation to investigate and prosecute. Yet our FOIA request to the VA AG’s office inquiring as to any records associated with these types of investigations or prosecutions produced a “no relevant records exist” response. And since we submitted this information to the AG’s office, there has been no follow up.

Additionally, this evidence, which is derived only from official state records, directly contradicts multiple news media reports and attestations that non-citizen voting is a “Myth”, and that non-citizen voting happens “almost never”. If the data from ELECT is accurate, then there are at least 1,034 (revised from 1,117) ballots that have been cast by non-citizen voters just since 2019. Now, that is still very infrequent, but it is not “almost never.” It is a legitimate concern … and these discoveries are only the registrations that have been found and removed from the voter rolls by ELECT and that we can observe in the data. We do not know how many exist that we do not know about.

It should be reiterated that these are only the records that we can observe given our data repository, and how often we can realistically purchase and acquire voter history and voter registration information. It is therefore likely that this represents a significant undercount of the occurrences of non-citizen voters and non-citizen voting.

It costs us (EPEC) approximately $5K for each purchase of the statewide voter history list, and approximately $15K/year to maintain RVL records using a single baseline full purchase + 2 purchases of the 6mo MUS subscription. Due to the infrequent nature of these data purchases, it is very likely that some individuals have had their voter history or voter registration information completely removed from the record in between our purchases. Additionally, we know that the MUS data does not entirely encompass all transactions performed on the RVL by the Department of Elections, so there may be yet other unknown transactions that we are missing.

For information that is supposed to be publicly available (according to federal NVRA laws), the state has put up significant hurdles for citizens and organizations seeking to acquire it and use it to ensure the transparency and integrity of our electoral process. If we are to have elections that are transparent and accountable to the public, then we must insist that the data be made available and accessible.

Categories
Election Data Analysis Election Forensics Election Integrity mathematics technical

‘Dark’ Transactions in VA’s Voter Registration Data

EPEC has compared the changes between two purchased full versions of the VA Registered Voter List (RVL) to the content of the Monthly Update Service (MUS) data covering the same temporal period. Of the ID numbers that were added to the RVL, 3,613 (or 1.0589% of total additions) never appear anywhere in the MUS files covering the same temporal period. Of the ID numbers that were removed from the RVL, 3,355 (or 2.4096% of total removals) never appear anywhere in the MUS files covering the same temporal period.


Since mid-2023 EPEC has been purchasing, processing and archiving copies of both the full Registered Voter List (RVL) and the Monthly Update Service (MUS) files, which give the UPDATE, ADD or CANCEL transactions to the voter list throughout the year.

Once a baseline RVL is established, the MUS files can be used to update that baseline in order to keep the list current. That should be all one needs to keep an accurate dataset of the registered voter list using monthly updates … except there is a catch … the MUS for some reason doesn’t quite capture all of the changes that are occurring in the voter list. In fact, we see about 1-2.5% of the ADD or CANCEL transactions between each RVL snapshot are not reflected by any corresponding entries in the MUS.

All of the changes that are made between two different RVL baseline snapshots should be observable in the corresponding MUS files that cover the same time period, and vice versa. The MUS has transaction logs accounting for new registrants, for registrants who move, for removing deceased individuals, for individuals that have had a change in their felon status, for individuals who are determined to be non-citizens, for administrative updates and corrections, etc. So, in theory, it should be a complete record. However, over the course of working with the VA data files, every so often we have noticed that some transactions seem to be unaccounted for. Therefore, once we had enough data compiled, we decided to test just how well the MUS data actually explains the changes we see between two baseline RVL files.

Method:

For this experiment, we used full RVL snapshots purchased from VA Department of Elections (ELECT) on 2023-06-30 and 2024-08-29, and all of the monthly MUS distributions covering the entire time period in between.

Using the voter ID number field that is present in all datasets, we first determine which ID numbers were added to the 2024 RVL dataset, and which ID numbers were deleted from the 2023 RVL data. We then checked to see how many of those ID numbers appear in any of the MUS data files, for any reason.

Note that this data was processed statewide, such that registrants moving between localities within the state should not affect the total number of computed additions or removals, as the ID numbers should still be present in the datasets, although corresponding locality information may have changed.
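The comparison described in this section reduces to a few set operations on voter ID numbers. The following is a minimal sketch with toy ID sets; the function and variable names are ours for illustration and are not taken from EPEC's scripts.

```python
def dark_transactions(rvl_old_ids, rvl_new_ids, mus_ids):
    """Compare two RVL snapshots and flag changes with no MUS entry at all."""
    added = rvl_new_ids - rvl_old_ids      # IDs only in the newer snapshot
    removed = rvl_old_ids - rvl_new_ids    # IDs only in the older snapshot
    return {
        "added": added,
        "removed": removed,
        "added_not_in_mus": added - mus_ids,
        "removed_not_in_mus": removed - mus_ids,
    }


def percent(part, whole):
    """Percentage of `whole` represented by `part` (0.0 when `whole` is empty)."""
    return 100.0 * len(part) / len(whole) if whole else 0.0


# Toy data: four IDs in each snapshot, three IDs mentioned anywhere in the MUS.
rvl_2023 = {"001", "002", "003", "004"}
rvl_2024 = {"001", "003", "005", "006"}
mus_seen = {"002", "003", "005"}

result = dark_transactions(rvl_2023, rvl_2024, mus_seen)
print(sorted(result["added_not_in_mus"]), sorted(result["removed_not_in_mus"]))
print(percent(result["added_not_in_mus"], result["added"]))
```

Because the comparison is keyed on the ID number alone, a registrant who only changes locality appears in both snapshots and is counted as neither an addition nor a removal.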

Results:

The breakdown of the number of changes that were present in the MUS files over the time period of the RVL snapshots (2023-06-30 through 2024-08-29) is given in Figure 1 below. The MUS data was deduplicated and truncated to only consider transactions with TRANSACTION date information between the dates associated with the RVL datasets. The bars in Figure 1 are logarithmically scaled on the y-axis, with the x-axis representing the NVRAReasonCode given for each transaction in the MUS. The bars are color coded by transaction type. As there are duplicates and oversampling within the collection of MUS files, only the latest transaction for each uniquely identified ID number was used to generate the plot. As can be seen from the various categories along the x-axis of this plot, the data in the MUS logs should be sufficient to capture all of the transactions with the RVL.

Figure 1: Breakdown of MUS transactions between 2023-06-30 and 2024-08-29

Direct Inspection of the RVL Snapshots:

Performing a simple set-difference between the unique ID numbers present in the 2023-06-30 RVL data and those in the 2024-08-29 RVL data shows that there were 341,191 unique IDs added, and 139,232 removed, between the two datasets.

Of the ID numbers that were ADDED between the raw RVL snapshots, 3,613 (or 1.0589%) never appear anywhere in the MUS files covering the same temporal period.

Of the 3,613 ID numbers that were ADDED between the raw RVL snapshots, and that don’t appear in the MUS record, 537 (or 14.863%) have at least one entry in the Voter History List (VHL) data that EPEC has been collecting and archiving.

Of the ID numbers that were REMOVED between the raw RVL snapshots, 3,355 (or 2.4096%) never appear anywhere in the MUS files covering the same temporal period.

Of the 3,355 ID numbers that were REMOVED between the raw RVL snapshots, and that don’t appear in the MUS record, 2,011 (or 59.94%) have at least one entry in the VHL data that EPEC has been collecting and archiving.

Using the MUS-Adjusted RVL baseline

If we ignore the 2024-08-29 dataset, and instead directly apply the transactions in the MUS data files to the 2023-06-30 dataset in order to create a new RVL list, we would end up with 342,888 additions and 137,849 removals of unique voter ID numbers. We see 1,697 more (342,888 - 341,191 = 1,697) additions when directly applying the MUS than when directly comparing RVL snapshots, and 1,383 fewer (139,232 - 137,849 = 1,383) removals. Keep in mind these discrepancies are in addition to the 3,613 and 3,355 discrepancies found using the RVL snapshot baselines, as the ID numbers in each set are unique. So the total number of discrepancies is 3,613 + 3,355 + 1,697 + 1,383 = 10,048 records.
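The arithmetic behind these discrepancy counts can be re-derived directly from the reported figures. The short Python check below uses only numbers stated in this post; the variable names are ours.

```python
# Counts reported from the direct RVL snapshot comparison.
rvl_added, rvl_removed = 341_191, 139_232
# Counts reported from applying the MUS transactions to the 2023-06-30 baseline.
mus_added, mus_removed = 342_888, 137_849
# RVL changes that never appear anywhere in the MUS files.
dark_added, dark_removed = 3_613, 3_355

extra_additions = mus_added - rvl_added        # additions seen only via the MUS
missing_removals = rvl_removed - mus_removed   # removals the MUS does not produce

total_discrepancies = dark_added + dark_removed + extra_additions + missing_removals

pct_added_not_in_mus = round(100 * dark_added / rvl_added, 4)
pct_removed_not_in_mus = round(100 * dark_removed / rvl_removed, 4)

print(extra_additions, missing_removals, total_discrepancies)
print(pct_added_not_in_mus, pct_removed_not_in_mus)
```

The results are 1,697 extra additions, 1,383 missing removals and 10,048 total discrepancies, with percentages of 1.0589 and 2.4096 that match the summary values.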

Summary of these results:
                   Num_Added: 341191
        Num_Added_not_in_MUS: 3613
        Pct_Added_not_in_MUS: 1.0589
  Num_Added_not_in_MUS_wVHL: 537
                 Num_Removed: 139232
      Num_Removed_not_in_MUS: 3355
      Pct_Removed_not_in_MUS: 2.4096
Num_Removed_not_in_MUS_wVHL: 2011
           MUS_Num_Deletions: 137849
           MUS_Num_Additions: 342888
             MUS_Num_Updates: 946248
            NUS_Num_NOOP_ADD: 651
         NUS_Num_NOOP_MODIFY: 334282
Discussion:

We do not yet understand the origin of these discrepancies. It could be a coding error on the part of the developers of the VERIS system, or it could be that there is a category of data adjustments that is not adequately reflected in the RVL or MUS data products. The RVL snapshots are supposed to be the authoritative record of the voter registration data, and the MUS data updates are supposed to capture all of the transactional changes to said registration records.

Regardless of the cause of the discrepancy, the fact remains that there are a small number of transactions and changes to the voter record that are unobservable. They are, in effect, “dark” transactions in the voter registration data that cannot be observed, validated or verified.

Categories
Election Integrity technical

VA Department of Elections Removing Full Date of Birth from Purchased Datasets

The VA Department of Elections (ELECT) has given us at EPEC notice (as of 8/26/2024) that they will be removing the full date of birth information from purchased datasets and replacing it with year of birth only. Note that this is not in relation to publicly available data, but for data that has been purchased by specific qualifying individuals or organizations capable of purchasing and handling expanded datasets according to VA law and ELECT policies. Current VA law does NOT allow full birthdate information in publicly disclosed records, but there is no restriction on it being included in otherwise protected records provided to qualified organizations. In fact, many qualified organizations have relied on this information in order to perform their legitimate functions.

Full birthdate information is an important field when trying to identify and discriminate between individual registrants who might otherwise have the same name information: “John Q. Smith 10/19/1981” can obviously be determined to be a unique registrant vs. “John Q. Smith 04/01/1981”. But if only the year of birth information is available, these two (hypothetical) records become much more difficult to distinguish without supplemental information. Removing the full date of birth information degrades the confidence and accuracy of registration matching queries.

The degradation in the ability to confidently match records due to removing month and day of birth information can be seen in the table below. 1,000,000 records from a recent VA RVL (Aug 2024) were used to demonstrate what happens to the ability to match records when altering the birth information. All other processing was exactly the same. The “% Change” row gives the Year-Only count as a percentage of the Full-DOB count. Even when only considering records that are exactly matching, the removal of month and day of birth information raises the number of potential matches to nearly 400% of the Full-DOB count (from 113 to 441). That means that for every true match that Full-DOB processing would produce, there are roughly 3 additional false-positive matches when only using Year of Birth information. The numbers get increasingly worse as you consider 1, 2, or 3 character differences.

Processing | Exact Matches | 1 Char Diff | 2 Chars Diff | 3 Chars Diff | Total
Full-DOB | 113 | 86 | 334 | 1556 | 2089
Year-Only | 441 | 3512 | 10199 | 34151 | 48303
% Change | 390.27% | 4083.72% | 3053.59% | 2194.79% | 2312.25%
1,000,000 records from the most recent VA Registered Voter List (RVL) were compared for similarity score using different birth information. “Full-DOB” processing utilized the FIRST MIDDLE LAST SUFFIX GENDER MM/DD/YYYY information. Year only processing utilized FIRST MIDDLE LAST SUFFIX GENDER YYYY information. All other processing was exactly the same.
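To make the effect concrete, here is a minimal Python sketch of how truncating the birth date to the year causes otherwise distinct records to collide. It is only an illustration, not the processing EPEC actually ran: the dictionary field names, the `match_key` and `exact_match_pairs` helpers, and the choice to count colliding pairs are all assumptions, and the records are the hypothetical “John Q. Smith” entries from above plus one made-up extra.

```python
from collections import Counter

def match_key(rec, year_only=False):
    """Concatenate FIRST MIDDLE LAST SUFFIX GENDER and birth info into one string."""
    dob = rec["dob"]                        # assumed format: MM/DD/YYYY
    birth = dob[-4:] if year_only else dob  # keep only YYYY when year_only is set
    parts = [rec["first"], rec["middle"], rec["last"], rec["suffix"], rec["gender"], birth]
    return " ".join(p for p in parts if p)

def exact_match_pairs(records, year_only=False):
    """Count pairs of records whose comparison strings are identical."""
    counts = Counter(match_key(r, year_only) for r in records)
    return sum(n * (n - 1) // 2 for n in counts.values())

# Hypothetical records, for illustration only.
records = [
    {"first": "JOHN", "middle": "Q", "last": "SMITH", "suffix": "", "gender": "M", "dob": "10/19/1981"},
    {"first": "JOHN", "middle": "Q", "last": "SMITH", "suffix": "", "gender": "M", "dob": "04/01/1981"},
    {"first": "JANE", "middle": "A", "last": "DOE", "suffix": "", "gender": "F", "dob": "07/04/1975"},
]

full_pairs = exact_match_pairs(records)                  # full DOB keeps the two Smiths apart
year_pairs = exact_match_pairs(records, year_only=True)  # year-only makes them identical
print(full_pairs, year_pairs)
```

With the full date of birth the two Smith records stay distinct; with year only they become an exact "match" that a human would then have to resolve using supplemental information.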

The ability to match or distinguish between registration records in the registered voter list is important for being able to perform a number of different legitimate activities by authorized organizations, such as:

  • The determination if a person is already registered (or not) by get-out-the-vote organizations.
  • Determining the existence of potential duplicate registration records by election integrity, public research, and watchdog organizations.
  • The identification and validation of potential Electoral Board member candidates, Election Officers and authorized Poll Watchers by political parties and candidates as required by VA law.
  • The vetting of volunteers for other partisan candidate and party functions and activities (door-knockers, event organizers, etc.)
  • The identification and verification of deceased individuals and corroborating obituary information.
  • … and much, much, more.

This change in the official data by ELECT will also affect the organizational and operational logistics of multiple organizations. With only a few weeks left before the start of the early voting period, organizations (both partisan and non-partisan alike) will need to expend precious money, time and resources in order to correct all of their data ingest and processing systems to handle the new formats and fields. They will also need to invent new logic to combine and fuse older (full-DOB) data with the newly released (year-only) data in order to maintain their mission effectiveness.

Note: This also means that there is an increased risk of spam phone calls and marketing materials being sent to individuals who otherwise would have been excluded from targeted marketing efforts had the senders been able to confidently discriminate them from similar, but different, registration records. Each of those otherwise unnecessary phone calls or text messages costs time and money for the candidates and campaigns, and the companies they hire, as well as potential annoyance for the recipients.

As it stands, we (the proverbial “we”) are just going to have to deal with this sudden loss of data fidelity for the time being. Even if political or legal remedies are ultimately successful, they will take time. The work still needs to get done in the meantime. It’s not the end of the world, but it decreases the confidence of automated matching systems, and increases the amount of human labor required by volunteers, campaigns, registrars and other election officials.

Some questions I have regarding this matter:

  • Why the sudden change? There does not seem to be a recent court case or legal reason pressing for immediate change, at least that I am aware of.
  • Why not wait until after the election so as not to impact current operations of various election-related organizations? This would be consistent with how the department of elections has operated in the past.
Categories
Election Data Analysis Election Forensics Election Integrity technical

2024 VA November General Election DAL File Metrics

Below you will find the current summary data and graphics from the 2024 VA November General Election Daily Absentee List (DAL) files. We pull the DAL file every day and track the count of each specific ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, except with a logarithmic y-scale in order to compress the dynamic range and see the shape of all of the data curves in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point’s value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here. Please note that you need to give the page time to load before trying to drill down into any of the listed subdirectories, and mobile browsers have shown some issues.

A direct link to a zip file with ALL of the metrics data stratified by Locality, etc is here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS= “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS= “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS= “PROVISIONAL” and APP_STATUS=“APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS= “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS= “MARKED” and APP_STATUS=“APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS= “ON_MACHINE” and APP_STATUS=“APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS= “PRE-PROCESSED” and APP_STATUS=“APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS= “FWAB” and APP_STATUS=“APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE= “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE= “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE= “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE= “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE= “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE= “TEMPORARY” and where COUNTABLE is True
  • “MILITARY_ISSUED” := Number of DAL file records where APP_STATUS==“Approved”, VOTER_TYPE= “MILITARY” and where BALLOT_STATUS==“ISSUED”
  • “OVERSEAS_ISSUED” := Number of DAL file records where APP_STATUS==“Approved”, VOTER_TYPE= “OVERSEAS” and where BALLOT_STATUS==“ISSUED”
  • “TEMPORARY_ISSUED” := Number of DAL file records where APP_STATUS==“Approved”, VOTER_TYPE= “TEMPORARY” and where BALLOT_STATUS==“ISSUED”
  • “COUNTABLE_HIGH_PROP_NG” := Number of DAL file records where COUNTABLE is True and the registrant has voted in 75% or more of the November General elections on record. (i.e. They have a high November General propensity score)
  • “COUNTABLE_MED_PROP_NG” := Number of DAL file records where COUNTABLE is True and the registrant has voted in < 75% and > 0% of the November General elections on record.
  • “COUNTABLE_ZERO_PROP_NG” := Number of DAL file records where COUNTABLE is True and the registrant has never voted in any of the November General elections on record.
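
For readers who want to reproduce the ballot-status columns, the definitions above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not EPEC’s production code: it assumes each DAL record is available as a dictionary with BALLOT_STATUS and APP_STATUS keys, the `dal_metrics` function name is made up, and the “DENIED” status in the sample rows is a hypothetical value used only to show the APP_STATUS filter. The VOTER_TYPE and propensity columns are left out for brevity.

```python
from collections import Counter

# BALLOT_STATUS values counted regardless of APP_STATUS (value -> column name)
ANY_APP_STATUS = {"ISSUED": "ISSUED", "NOT ISSUED": "NOT_ISSUED", "DELETED": "DELETED"}
# BALLOT_STATUS values counted only when APP_STATUS is APPROVED
APPROVED_ONLY = {
    "PROVISIONAL": "PROVISIONAL",
    "MARKED": "MARKED",
    "ON_MACHINE": "ON_MACHINE",
    "PRE-PROCESSED": "PRE_PROCESSED",
    "FWAB": "FWAB",
}

def dal_metrics(rows):
    """Tally the ballot-status columns for one daily DAL file."""
    m = Counter()
    for row in rows:
        ballot = row["BALLOT_STATUS"].strip().upper()
        approved = row["APP_STATUS"].strip().upper() == "APPROVED"
        if ballot in ANY_APP_STATUS:
            m[ANY_APP_STATUS[ballot]] += 1
        elif ballot in APPROVED_ONLY and approved:
            m[APPROVED_ONLY[ballot]] += 1
    # Derived columns are sums of the base counts
    m["MAIL_IN"] = m["MARKED"] + m["PRE_PROCESSED"]
    m["COUNTABLE"] = (m["PROVISIONAL"] + m["MARKED"] + m["PRE_PROCESSED"]
                      + m["ON_MACHINE"] + m["FWAB"])
    return m

# Made-up sample rows, for illustration only.
rows = [
    {"BALLOT_STATUS": "ISSUED", "APP_STATUS": "APPROVED"},
    {"BALLOT_STATUS": "MARKED", "APP_STATUS": "APPROVED"},
    {"BALLOT_STATUS": "MARKED", "APP_STATUS": "DENIED"},   # hypothetical status; not counted
    {"BALLOT_STATUS": "PRE-PROCESSED", "APP_STATUS": "APPROVED"},
    {"BALLOT_STATUS": "ON_MACHINE", "APP_STATUS": "APPROVED"},
    {"BALLOT_STATUS": "NOT ISSUED", "APP_STATUS": "APPROVED"},
]
metrics = dal_metrics(rows)
print(metrics["MAIL_IN"], metrics["COUNTABLE"])
```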

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.

If you like the work that EPEC is doing, please support us with a donation.

Categories
Uncategorized

A Canary in the Data Mine

Since VA Gov Youngkin issued Executive Order 35, it has been getting quite a bit of press … I wanted to make a few comments on it after having a chance to digest it.

Overall, I think it’s a net positive for Election Integrity efforts in VA, but not because there is any new or groundbreaking policy by the governor or his administration. Most of the items in his EO are already existing policy, and the EO language is worded such that the current actions of the Department of Elections (ELECT) can be arguably said to comply with those policies. There are a couple of small improvements, such as the fact that this EO codifies into transparent public policy the specific requirements that the commissioner must certify in writing. Even though there aren’t any drastically new policies, this does improve overall public transparency, confidence and accountability, and I think that’s a good thing.

What will be interesting, in my opinion, is whether our team at EPEC will notice any demonstrable difference in the quality of voter registration data that we track from ELECT going forward, now that this EO has been issued.

More specifically, I’m going to be tracking a few very specific records to see if they get addressed or not.

There are a few egregious records in the VA voter registration file that are obviously problematic that should be removed or at least updated/corrected by ELECT. I, and the team at EPEC, have been tracking these specific records for years but have not published or discussed them publicly. A few of them we even directly mentioned to the current Commissioner of Elections during our face-to-face meeting with her last year. These records should be “easy” to find and clean up by ELECT, as they are obviously errant and invalid registrations in their current state.

As of the latest data we have, those records are still in the voter list and listed as ACTIVE registrations, nearly four years after I first found them. They can easily be identified with very simple logical checks to the registration records … descriptions of which I and the team at EPEC have discussed publicly and provided directly to ELECT and local registrars on multiple occasions.

The reason I bring this up, is that these can be considered and used as “hold out” test cases, in data science parlance. They are “canaries in the coal mine”, if you will. If we at EPEC start observing that these records are being responsibly updated or removed going forward, in accordance with the mandate in the EO to perform daily scrubbing of the voter rolls for ineligible records, then that would give VA citizens some evidence and verification that the Gov and ELECT are serious about their efforts.

That would be excellent if they did, but seeing as how I’ve been observing these records for 4 years while the administration proclaims what an excellent job it’s doing … I’m not going to hold my breath.

Categories
Election Data Analysis Election Forensics Election Integrity technical

Non-citizen registrations with previous voting history in VA election data – update July 2024

We have updated our previous analysis with the latest information from the VA Department of Elections data.

Update 07/27/2024: The below numbers have been revised after discovering a minor programming error and a human “fat-finger” error when I originally transcribed table 1. [The previous erroneous numbers were 2031 registrants removed, 438 with observed voting history, for a total of 1089 ballots. The new corrected totals are 1973, 399, and 938 respectively.]

Abstract:

Using the data provided by the VA Department of Elections (ELECT), we have identified at least 1,973 unique registrations that were identified as “Determined Non-Citizen” and removed by ELECT from the voter rolls since May of 2023. Of those 1,973 there were 399 that also had corresponding records of recent ballots cast at some point in the official Voter History record that we could observe. There were 938 associated ballots cast identified since Feb of 2019.

After our previous post on this topic in March 2024, we submitted all of the relevant information that we had at the time to the VA AG’s office. We have not heard any response or update on the matter since that time, besides this being considered an active investigation.

Background:

The VA Department of Elections continuously tries to identify and remove invalid or out of date registration records from the voter rolls. One category used for removal is if a registrant has been determined to be a non-citizen. It is required by the VA Constitution that only citizens are allowed to vote in VA elections.

In elections by the people, the qualifications of voters shall be as follows: Each voter shall be a citizen of the United States, shall be eighteen years of age, shall fulfill the residence requirements set forth in this section, and shall be registered to vote pursuant to this article. …

VA Constitution, Article II, Section 1. https://law.lis.virginia.gov/constitution/article2/section1/

Additionally, according to VA Code Section 24.2-1004, the act of knowingly casting a ballot by someone who is not eligible to vote is a Class 6 felony.

A. Any person who wrongfully deposits a ballot in the ballot container or casts a vote on any voting equipment, is guilty of a Class 1 misdemeanor.

B. Any person who intentionally (i) votes more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (ii) procures, assists, or induces another to vote more than once in the same election, whether those votes are cast in Virginia or in Virginia and any other state or territory of the United States, (iii) votes knowing that he is not qualified to vote where and when the vote is to be given, or (iv) procures, assists, or induces another to vote knowing that such person is not qualified to vote where and when the vote is to be given is guilty of a Class 6 felony.

https://law.lis.virginia.gov/vacode/title24.2/chapter10/section24.2-1004/

ELECT makes available for purchase by qualifying parties various different data sets, including the registered voter list (RVL) and the voter history list information file (VHL). Additionally, ELECT makes available a Monthly Update Service (MUS) subscription that is published at the beginning of each month and contains (almost) all of the Voter List changes and transactions for the previous period.

In the MUS data there is a “NVRAReasonCode” field that is associated with each transaction that gives the reason for the update or change in the voter record. This is in accordance with the disclosure and transparency requirements in the NVRA. One of the possible reason codes given for records that are removed is “Determined Non-Citizen.”

EPEC has been consistently purchasing and archiving all of these official records as part of our ongoing work to document and educate the public as to the ongoing operations of our elections. (If you’re interested in supporting this work, please head on over to our donation page, or to our give-send-go campaign to make a tax-deductible donation, as these data purchases are not cheap!)

EPEC looked at the number of records associated with unique voter identification numbers that had been identified for removal from the voter record due to non-citizenship status, per the entries in the MUS, and correlated those results with our accumulated voter history list information in order to determine how many non-citizen registrations had corresponding records of ballots cast in previous elections. We only considered those records that are currently in a non-active state as of the latest MUS transaction log, as some determinations of non-citizenship status in the historical MUS transaction log might have been due to error and subsequently corrected and reinstated to active status. That is, we are not considering those records that had a “Determined Non-Citizen” disqualification, but were then subsequently reinstated and reactivated by ELECT.
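
The correlation described above can be sketched roughly as follows. This is a simplified illustration, not our actual pipeline: only the “NVRAReasonCode” field name and the “Determined Non-Citizen” value come from the MUS data, while the `voter_id`, `status` and `election` keys, the two helper functions, and the sample rows are assumptions made for the example. It also assumes the MUS transactions are already sorted chronologically and that a reinstatement shows up as a later transaction with an ACTIVE status.

```python
def noncitizen_removals(mus_transactions):
    """Voter IDs removed as 'Determined Non-Citizen' and not reinstated afterwards."""
    flagged = set()
    for tx in mus_transactions:          # assumed to be in chronological order
        vid = tx["voter_id"]
        if tx["NVRAReasonCode"] == "Determined Non-Citizen":
            flagged.add(vid)
        elif tx["status"] == "ACTIVE":
            flagged.discard(vid)         # later reinstatement: drop from consideration
    return flagged

def ballots_by_voter(flagged, vhl_records):
    """Map each flagged voter ID to the elections found in the voter history list."""
    ballots = {}
    for rec in vhl_records:
        if rec["voter_id"] in flagged:
            ballots.setdefault(rec["voter_id"], []).append(rec["election"])
    return ballots

# Made-up sample data, for illustration only.
mus = [
    {"voter_id": "A1", "NVRAReasonCode": "Determined Non-Citizen", "status": "CANCELED"},
    {"voter_id": "B2", "NVRAReasonCode": "Determined Non-Citizen", "status": "CANCELED"},
    {"voter_id": "B2", "NVRAReasonCode": "", "status": "ACTIVE"},   # reinstated later
    {"voter_id": "C3", "NVRAReasonCode": "Determined Non-Citizen", "status": "CANCELED"},
]
vhl = [
    {"voter_id": "A1", "election": "2020 November General"},
    {"voter_id": "A1", "election": "2021 November General"},
    {"voter_id": "B2", "election": "2020 November General"},
]

flagged = noncitizen_removals(mus)
ballots = ballots_by_voter(flagged, vhl)
n_registrants = len(flagged)
n_voters = len(ballots)
n_ballots = sum(len(v) for v in ballots.values())
print(n_registrants, n_voters, n_ballots)
```

The three printed numbers correspond to the three totals reported in this post: removed registrations, those with observed voting history, and the associated ballots.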

Results:

There were 1,973 unique voter records marked for removal with the reason of “Determined Non-Citizen” and not subsequently reinstated in the accumulated MUS record that EPEC began collecting in mid-2023. Of those 1,973 records there were 399 unique voter ID’s that also had a record of casting one or more ballots in the accumulated vote history data that EPEC has been gathering, for a total of 938 ballots cast that can be identified since Feb of 2019. Figure 1 shows the distribution of non-citizen voters in the cumulative MUS file history. The blue trace represents the total identified and CANCELED non-citizen registrations, and the yellow trace represents the number of those records that also had corresponding records in the accumulated voter history data.

Figure 1: Distribution of the number of identified non-citizen records and ballots in the cumulative ELECT MUS file history. The x-axis is the date that a record was marked as CANCELED for the reason of “Determined Non-Citizen”.

Note that the data contained in the MUS updates often covers more than a single month period. In other words, the individual MUS files are oversampled. Subsequent MUS files can therefore also have repeated entries from previous versions, as their data may overlap. Our analysis used the first unique entry for a given voter ID marked as “Determined Non-Citizen” in the cumulative MUS record in order to build Figure 1. This data oversampling in the MUS helps explain the relative increase in the May 2023 bin.
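
A minimal sketch of that “first unique entry” step is shown below. Again, this is an illustration under assumptions rather than the code we ran: the `voter_id` and `date` keys and the `first_noncitizen_entries` helper are made-up names, the “Moved” reason code is a placeholder for any other reason, and the rows are fabricated purely to show how a repeated entry from overlapping MUS files is collapsed.

```python
from collections import Counter
from datetime import date

def first_noncitizen_entries(mus_rows):
    """Earliest 'Determined Non-Citizen' date per voter ID across overlapping MUS files."""
    first = {}
    for row in mus_rows:
        if row["NVRAReasonCode"] != "Determined Non-Citizen":
            continue
        vid, when = row["voter_id"], row["date"]
        if vid not in first or when < first[vid]:
            first[vid] = when
    return first

# Made-up rows; the second A1 row mimics a repeat carried in a later MUS file.
mus_rows = [
    {"voter_id": "A1", "NVRAReasonCode": "Determined Non-Citizen", "date": date(2023, 5, 10)},
    {"voter_id": "A1", "NVRAReasonCode": "Determined Non-Citizen", "date": date(2023, 5, 10)},
    {"voter_id": "B2", "NVRAReasonCode": "Determined Non-Citizen", "date": date(2023, 6, 2)},
    {"voter_id": "C3", "NVRAReasonCode": "Moved", "date": date(2023, 6, 5)},  # placeholder reason
]

first = first_noncitizen_entries(mus_rows)
# Bin the de-duplicated entries by (year, month), as one might for a plot like Figure 1
monthly = Counter((d.year, d.month) for d in first.values())
print(sorted(monthly.items()))
```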

The distribution of identified unique voter ID’s for the 399 identified non-citizen voters per VA locality is given below in Table 1. It should be noted that each ballot record has a specific locality associated with where the ballot was cast, whereas unique individuals might move between localities over time. The assignment of unique identified individuals to each locality in Table 1 is therefore based on the locality listed in the specific MUS “Determined Non-Citizen” record for that individual, while the assignment of ballots cast to localities is based on the individual VHL records. A person could have lived and voted multiple times in one county, then moved to another county and voted again before finally being determined as a non-citizen. The same person would have generated multiple VHL records, one for each ballot cast, potentially associated with different localities. This should be kept in mind when attempting to interpret Table 1.

Locality | Registrants | Voters | Ballots
ACCOMACK COUNTY | 1 | 0 | 0
ALBEMARLE COUNTY | 19 | 3 | 4
ALEXANDRIA CITY | 96 | 19 | 34
AMELIA COUNTY | 2 | 1 | 3
APPOMATTOX COUNTY | 1 | 0 | 0
ARLINGTON COUNTY | 57 | 13 | 38
AUGUSTA COUNTY | 5 | 1 | 1
BEDFORD COUNTY | 9 | 2 | 3
BOTETOURT COUNTY | 1 | 0 | 0
BRUNSWICK COUNTY | 1 | 1 | 2
BUCKINGHAM COUNTY | 2 | 0 | 0
CAROLINE COUNTY | 5 | 1 | 0
CARROLL COUNTY | 6 | 2 | 5
CHARLES CITY COUNTY | 1 | 0 | 0
CHARLOTTESVILLE CITY | 15 | 2 | 7
CHESAPEAKE CITY | 52 | 15 | 32
CHESTERFIELD COUNTY | 130 | 19 | 41
CLARKE COUNTY | 6 | 3 | 4
COLONIAL HEIGHTS CITY | 4 | 1 | 3
CULPEPER COUNTY | 12 | 0 | 0
DANVILLE CITY | 13 | 2 | 8
DINWIDDIE COUNTY | 7 | 1 | 1
EMPORIA CITY | 2 | 0 | 0
FAIRFAX CITY | 6 | 3 | 9
FAIRFAX COUNTY | 355 | 73 | 162
FAUQUIER COUNTY | 11 | 1 | 2
FRANKLIN COUNTY | 2 | 1 | 1
FREDERICK COUNTY | 16 | 1 | 1
FREDERICKSBURG CITY | 13 | 1 | 1
GALAX CITY | 1 | 0 | 0
GILES COUNTY | 2 | 0 | 0
GLOUCESTER COUNTY | 1 | 1 | 1
GOOCHLAND COUNTY | 4 | 0 | 0
GRAYSON COUNTY | 1 | 0 | 0
GREENE COUNTY | 4 | 1 | 2
HALIFAX COUNTY | 1 | 0 | 0
HAMPTON CITY | 39 | 9 | 15
HARRISONBURG CITY | 30 | 2 | 3
HENRICO COUNTY | 29 | 3 | 8
HENRY COUNTY | 0 | 0 | 2
ISLE OF WIGHT COUNTY | 1 | 0 | 0
JAMES CITY COUNTY | 21 | 4 | 11
KING WILLIAM COUNTY | 1 | 0 | 0
LOUDOUN COUNTY | 126 | 38 | 86
LOUISA COUNTY | 6 | 0 | 0
LYNCHBURG CITY | 14 | 2 | 2
MANASSAS CITY | 21 | 3 | 9
MANASSAS PARK CITY | 9 | 0 | 0
MARTINSVILLE CITY | 4 | 1 | 0
MECKLENBURG COUNTY | 6 | 3 | 10
MIDDLESEX COUNTY | 1 | 0 | 0
NELSON COUNTY | 2 | 0 | 0
NEW KENT COUNTY | 1 | 1 | 0
NEWPORT NEWS CITY | 58 | 18 | 44
NORFOLK CITY | 50 | 11 | 28
NORTHUMBERLAND COUNTY | 2 | 1 | 4
NOTTOWAY COUNTY | 1 | 0 | 0
ORANGE COUNTY | 2 | 1 | 3
PETERSBURG CITY | 17 | 3 | 5
PITTSYLVANIA COUNTY | 3 | 1 | 1
PORTSMOUTH CITY | 31 | 10 | 34
POWHATAN COUNTY | 3 | 0 | 1
PRINCE EDWARD COUNTY | 8 | 2 | 3
PRINCE GEORGE COUNTY | 9 | 1 | 1
PRINCE WILLIAM COUNTY | 193 | 39 | 92
PULASKI COUNTY | 5 | 1 | 2
RAPPAHANNOCK COUNTY | 1 | 0 | 0
RICHMOND CITY | 103 | 19 | 58
ROANOKE CITY | 39 | 3 | 5
ROANOKE COUNTY | 2 | 0 | 0
ROCKINGHAM COUNTY | 20 | 4 | 9
RUSSELL COUNTY | 2 | 1 | 1
SALEM CITY | 3 | 0 | 0
SHENANDOAH COUNTY | 4 | 1 | 1
SMYTH COUNTY | 2 | 0 | 0
SPOTSYLVANIA COUNTY | 28 | 2 | 5
STAFFORD COUNTY | 36 | 6 | 20
STAUNTON CITY | 2 | 0 | 0
SUFFOLK CITY | 27 | 11 | 21
SUSSEX COUNTY | 2 | 1 | 3
TAZEWELL COUNTY | 2 | 1 | 1
VIRGINIA BEACH CITY | 96 | 14 | 38
WARREN COUNTY | 8 | 2 | 5
WASHINGTON COUNTY | 2 | 2 | 6
WAYNESBORO CITY | 3 | 0 | 0
WESTMORELAND COUNTY | 1 | 0 | 0
WILLIAMSBURG CITY | 3 | 0 | 0
WINCHESTER CITY | 13 | 1 | 1
WYTHE COUNTY | 2 | 0 | 0
YORK COUNTY | 15 | 9 | 35
Grand Total | 1973 | 399 | 938
Table 1: Distribution of unique individuals determined to be non-citizens that voted in each locality, and the number of total non-citizen identified ballots cast.

The distribution of the 938 ballots that were identified as being cast by non-citizen voters (yellow trace in Figure 1) in previous elections is shown in Figure 2. The most significant spikes are in the 2019, 2020, 2021 and 2022 November General elections, as well as the 2020 March Democratic presidential primary. Figure 3 shows this distribution as a percentage of votes cast. Note that we do not yet have a voter history dataset from ELECT that covers the recent 2024 March and June primaries, so those elections are omitted from consideration. Also please note the scale of the Y-axis on the percent plot in Figure 3 is in units of 10^-3 percent.

Figure 2: Distribution of identified non-citizen ballots cast in previous elections.
Figure 3: Distribution of identified non-citizen ballots cast in previous elections as percent of total ballots cast, according to entries in the VHL data files.

Figures 4 and 5 show the distribution of the registration dates of the identified non-citizen records. The same data is plotted in figure 4 and 5, with the only difference being the scale of the Y-axis in order to better observe the dynamic range of the values. When we look at the registration date of these identified records, we see that there is a distinct relative increase starting around 1996, and then again around 2012.

Figure 4: Registration dates of the identified non-citizen records. Absolute count on y-axis.
Figure 5: Registration dates of the identified non-citizen records. Logarithmic Y-axis scale.

EPEC made a FOIA request to the VA Attorney General’s office on March 11, 2024 requesting any records regarding how many prosecutions for non-citizen voting had occurred since June of 2023. We received a response that the AG had no such relevant records.

EPEC subsequently submitted our March analysis dataset to the VA AG’s office upon their request. We have heard no updates or status as to any action taken by the AG’s office since that time, except that it is being considered an ongoing investigation.

Discussion

It appears from the MUS data, that the VA Department of Elections (ELECT) is doing routine identification, cleanup and removal of non-citizen registrations, which is a good thing and we commend them for their continued efforts to maintain clean voter registration lists.

However, the fact that a small number of these identified non-citizen registrations are also associated with (presumably … if the data from ELECT is accurate) illegally cast ballots in previous elections does raise a number of questions that citizens should be (politely) asking and discussing with their legislators, elected and appointed government officials. Each act of non-citizen voting is a de facto disenfranchisement of legal voters’ rights, and is a punishable offense under VA law.

Q: How did these registrants get placed onto the voter rolls in the first place?

Q: What method and/or data sources are used by the state to identify non-citizen registrations for removal? If that process is exhaustive, and covers all registrations, then these numbers might be considered to represent a statistically complete picture of the problem. If that process is not exhaustive, in that it only uses serendipitous corroborating data sources, then these results likely under-represent the scale of the issues.

Q: As noted above, we are only considering here those individuals who have not had their records re-instated or reactivated after a determination of non-citizen status. We do not have enough information to determine how or why some records were first determined to be non-citizen, canceled and then subsequently re-instated. One potential area of concern is determining whether or not registrants might be falsely or errantly claiming to not be a citizen on official documents in order to be excused from jury duty, for example, and then work to re-instate their voting status once those documents percolate through the system to ELECT and are flagged for removal. This is a wholly separate but serious issue, as making false claims on official documents is itself a punishable offense.

Q: What procedures, processes and technical solutions are in place to prevent current or future registration and casting of ballots by non-citizens? This is especially pertinent given the current state of the flow of illegal immigrants crossing our national borders. According to a recent report by Yahoo Finance, VA is one of the top 30 destinations for illegal migrants, with both Loudoun County and Fairfax making the list.

Q: Why have none of the identified non-citizens who also cast ballots been investigated or prosecuted under VA Code 24.2-1004? As the identification of these ballots comes directly from looking at the official records produced by ELECT, it seems prudent for these to be forwarded by ELECT to the AG’s office with a recommendation to investigate and prosecute. Yet our FOIA request to the VA AG’s office inquiring as to any records associated with these types of investigations or prosecutions produced a “no relevant records exist” response. And since we submitted this information to the AG’s office, there has been no follow up.

Additionally, this evidence, which is derived only from official state records, directly contradicts multiple news media reports and attestations that non-citizen voting is a “Myth”, and that non-citizen voting happens “almost never”. If the data from ELECT is accurate, then there are at least 938 ballots that have been cast by non-citizen voters just since 2019. Now, that is still very infrequent, but it is not “almost never.” It is a legitimate concern … and these discoveries are only the registrations that have been found and removed from the voter rolls by ELECT and that we can observe in the data. We do not know how many exist that we do not know about.

It should be reiterated that these are only the records that we can observe given our data repository, and how often we can realistically purchase and acquire voter history and voter registration information. It is therefore likely that this represents a significant undercount of the occurrences of non-citizen voters and non-citizen voting.

It costs us (EPEC) approximately $5K for each purchase of the statewide voter history list, and approximately $15K/year to maintain RVL records using a single baseline full purchase + 2 purchases of the 6mo MUS subscription. Due to the infrequent nature of these data purchases, it is very likely that some individuals have had their voter history or voter registration information completely removed from the record in between our purchases. Additionally, we know that the MUS data does not entirely encompass all transactions performed on the RVL by the department of elections, so there may be yet other unknown transactions that we are missing.

For information that is supposed to be publicly available (according to federal NVRA laws), the state has put up significant hurdles for citizens and organizations trying to acquire it and use it for ensuring transparency and integrity of our electoral process. If we are to have elections that are transparent and accountable to the public, then we must insist that the data be made available and accessible.

Categories
Election Data Analysis Election Forensics Election Integrity mathematics technical

Identification of 2,502 Potential Matches of Active Voter Registrations Between FL and VA Voter Registration Lists

Building off of our previous work on computing the string distance between all possible pairs of registered voter records in a single state in order to identify potential matches, we’ve updated the code to allow for cross state comparisons. The first states that we ran this on were VA and FL, using the dataset produced by the FL Department of Elections on 05-07-2024, and the dataset from the VA department of elections dated 05-01-2024. There were a total of 2,502 records that matched our constraints between the FL and VA datasets, as detailed below.


Note: All examples of data records given in this writeup have been fictionalized to protect registered voter identities from being published on this website, and only serve as illustrative examples representative of the nature of properties and characteristics discussed. Law enforcement, election or other gov officials, or individuals otherwise authorized to receive and handle voter data as per VA law and the VA Department of Elections are welcome to contact us for specific details and further information.

Each dataset had the First Name, Middle Initial, Last Name, Suffix, Gender, and Year, Month and Day of Birth concatenated into strings that were then compared against each other using the Levenshtein String Distance measure as an initial filtering method to determine potential matches.
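
For readers unfamiliar with the Levenshtein measure, it counts the minimum number of single-character insertions, deletions or substitutions needed to turn one string into another. The short Python sketch below is a textbook dynamic-programming version, not necessarily the implementation used for this analysis, and the concatenated strings are the fictionalized examples used later in this writeup.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits needed to turn string a into string b."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]

# Fictionalized name + gender + DOB strings
fl_key = "BENNIE DAS M 05/14/1945"
va_key_typo = "BENNEE DAS M 05/14/1945"      # one substituted letter
va_key_mid = "BENNIE C DAS M 05/14/1945"     # middle initial plus a space added

print(levenshtein(fl_key, fl_key))        # 0
print(levenshtein(fl_key, va_key_typo))   # 1
print(levenshtein(fl_key, va_key_mid))    # 2
```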

Additionally, for each pair we computed the minimum string distance measure between all of the four possible permutations of pairings between the Primary and Mailing addresses in each record between the states. We required that this minimum distance for a set of registration entries be less than or equal to 12 characters. The choice of the value of twelve was empirically determined after review of the data, as it is loose enough to allow for common variations in address presentation while not being so loose as to be overwhelmed with false positives.
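
The four-way address comparison might be sketched like this. It is a simplified illustration: the `primary` and `mailing` dictionary keys and the `min_address_distance` helper are assumed names, the Orlando address is made up, and the other addresses are the fictionalized examples from this writeup. A compact Levenshtein function is included so the snippet runs on its own.

```python
from itertools import product

def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def min_address_distance(rec_a, rec_b):
    """Smallest distance over the four Primary/Mailing pairings of two records."""
    pairs = product((rec_a["primary"], rec_a["mailing"]),
                    (rec_b["primary"], rec_b["mailing"]))
    return min(levenshtein(x, y) for x, y in pairs)

fl_record = {
    "primary": "100 EXAMPLE AVE ORLANDO FL 32801",   # made-up address for illustration
    "mailing": "1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150",
}
va_record = {
    "primary": "1267 SLEEPY SONG PL SPRINGFIELD VA 221504259",
    "mailing": "1267 SLEEPY SONG PL SPRINGFIELD VA 221504259",
}

distance = min_address_distance(fl_record, va_record)
print(distance, distance <= 12)
```

Taking the minimum over all four pairings means a record pair still qualifies when, for example, only the FL mailing address lines up with the VA primary address, as in the sample above.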

We additionally filtered these findings for only those pairings that were of ACTIVE registrations in both datasets AND where the year, month and day of birth were exact matches.

In summary the 2,502 matches were generated according to the following constraints:

  • Only applied to ACTIVE voter registrations
  • Required complete DOB (year, month and day) to exactly match
  • Required [First Name + Middle Initial + Last Name + Suffix + Gender + DOB] strings to be similar to within <=2 characters
  • Required that the minimum distance between any pairwise combination of the Primary or Mailing address between the records be less than or equal to 12 characters.
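
Put together, the constraints above amount to a simple predicate over each candidate pair. The sketch below assumes the two distances have already been computed, and the `status` and `dob` keys, the constant names, and the `is_potential_match` function are illustrative names rather than anything taken from the FL or VA file formats.

```python
MAX_NAME_DOB_DISTANCE = 2    # name + gender + DOB strings within <= 2 characters
MAX_ADDRESS_DISTANCE = 12    # minimum Primary/Mailing address distance <= 12 characters

def is_potential_match(fl, va, name_dob_distance, address_distance):
    """Apply the four listed constraints to one FL/VA record pair."""
    return (
        fl["status"] == "ACTIVE" and va["status"] == "ACTIVE"   # ACTIVE in both states
        and fl["dob"] == va["dob"]                              # complete DOB matches exactly
        and name_dob_distance <= MAX_NAME_DOB_DISTANCE
        and address_distance <= MAX_ADDRESS_DISTANCE
    )

# Fictionalized pair, for illustration only
fl = {"status": "ACTIVE", "dob": "05/14/1945"}
va = {"status": "ACTIVE", "dob": "05/14/1945"}

print(is_potential_match(fl, va, name_dob_distance=1, address_distance=7))    # True
print(is_potential_match(fl, va, name_dob_distance=3, address_distance=0))    # False
```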

It should be noted that it is readily apparent from reviewing the potential matched records that the majority of these matches look to have originated in FL and then were subsequently moved to VA, but the FL record remained listed as active.

Category 1 Matches:

There were 698 matches in Category 1: where the Levenshtein distance measure for the name and DOB was equal to 0 (exact match) and the minimum address distance was also 0 (also an exact match). Examples in this category are exact matches for every considered field. An example is given below.

FL Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 22150

VA Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 22150

Category 2 Matches:

There were 1,533 matches in Category 2: where the Levenshtein distance measure for the name and DOB was equal to 0 (exact match) and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category commonly have differences in how the zip code, apartment numbers or state code is presented in either the Primary or Mailing address strings. An example is given below.

FL Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
SOUXIEE Q SMITH F 08/19/1968
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259

Category 3 Matches:

There were 44 matches in Category 3: where the Levenshtein distance measure for the name and DOB was equal to 1 and the minimum address distance was equal to 0 (exact match). Examples in this category are most often due to hyphenation or misspellings in the name, or a change in Gender (i.e. from “M”->”U”). An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

VA Active Registration Record:
BENNEE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

Category 4 Matches:

There were 140 matches in Category 4: where the Levenshtein distance measure for the name and DOB was equal to 1 and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category are most often due to hyphenation or misspellings in the name, or a change in Gender (e.g. from “M” to “U”), as well as small differences in how the addresses are presented. An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
BENNEE DAS M 05/14/1945
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259

Category 5 Matches:

There were 19 matches in Category 5: where the Levenshtein distance measure for the name and DOB was equal to 2 and the minimum address distance was equal to 0 (exact match). Examples in this category are most often due to a middle name/initial being present in one record and not being present in the other. An example is given below.

FL Active Registration Record:
BENNIE DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

VA Active Registration Record:
BENNIE C DAS M 05/14/1945
12345 PEPPERMINT PATTY CREST APT 1000 ASHBURN VA 201475724

Category 6 Matches:

There were 68 matches in Category 6: where the Levenshtein distance measure for the name and DOB was equal to 2 and the minimum address distance was greater than 0, but less than or equal to 12. Examples in this category are most often due to a middle name/initial being present in one record and not being present in the other, as well as small differences in how the addresses are presented. An example is given below.

FL Active Registration Record:
BENNIE C DAS M 05/14/1945
1267 SLEEPY SONG PLACE SPRINGFIELD VA 22150

VA Active Registration Record:
BENNIE DAS M 05/14/1945
1267 SLEEPY SONG PL SPRINGFIELD VA 221504259
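
The six categories reduce to a simple rule on the two distances. The sketch below is illustrative only: the function name is hypothetical, LD denotes the name/DOB Levenshtein distance and AD the minimum address distance (following the column headers of the results tables), and the counts are the per-category totals reported in this post.

```python
def match_category(ld, ad):
    """Map (LD, AD) to a category number 1-6, or None if outside the filters.

    LD in {0, 1, 2} picks the pair of categories; AD == 0 picks the odd
    category and 0 < AD <= 12 the even one.
    """
    if not 0 <= ld <= 2 or not 0 <= ad <= 12:
        return None
    return 2 * ld + (1 if ad == 0 else 2)


# Per-category match counts as reported in the text.
REPORTED_COUNTS = {1: 698, 2: 1533, 3: 44, 4: 140, 5: 19, 6: 68}

if __name__ == "__main__":
    print(match_category(1, 7))           # category for LD=1, 0<AD<=12
    print(sum(REPORTED_COUNTS.values()))  # total matches across categories
```

Summing the reported counts gives 2,502 matches across all six categories.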

Table of Results by VA Locality:

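(LD = Levenshtein distance of the name and DOB string; AD = minimum address distance.)
VA Locality | LD=0, AD=0 | LD=0, 0<AD<=12 | LD=1, AD=0 | LD=1, 0<AD<=12 | LD=2, AD=0 | LD=2, 0<AD<=12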
ACCOMACK COUNTY381100
ALBEMARLE COUNTY13240100
ALEXANDRIA CITY15521611
ALLEGHANY COUNTY130100
AMELIA COUNTY220000
AMHERST COUNTY320000
APPOMATTOX COUNTY500010
ARLINGTON COUNTY27532826
AUGUSTA COUNTY380110
BEDFORD COUNTY4150100
BOTETOURT COUNTY720000
BRISTOL CITY320000
BRUNSWICK COUNTY120000
BUCHANAN COUNTY100000
BUCKINGHAM COUNTY010000
CAMPBELL COUNTY231100
CAROLINE COUNTY020000
CARROLL COUNTY160100
CHARLOTTE COUNTY140000
CHARLOTTESVILLE CITY460001
CHESAPEAKE CITY278741314
CHESTERFIELD COUNTY28492503
CLARKE COUNTY020000
COLONIAL HEIGHTS CITY011000
CRAIG COUNTY210000
CULPEPER COUNTY680000
CUMBERLAND COUNTY200000
DANVILLE CITY210000
DICKENSON COUNTY130000
DINWIDDIE COUNTY030100
ESSEX COUNTY200000
FAIRFAX CITY360000
FAIRFAX COUNTY108259714415
FALLS CHURCH CITY220001
FAUQUIER COUNTY4141000
FLOYD COUNTY111000
FLUVANNA COUNTY230200
FRANKLIN CITY310000
FRANKLIN COUNTY560101
FREDERICK COUNTY1090200
FREDERICKSBURG CITY170000
GALAX CITY200000
GILES COUNTY000100
GLOUCESTER COUNTY6170110
GOOCHLAND COUNTY221010
GRAYSON COUNTY130100
GREENE COUNTY050000
HALIFAX COUNTY120100
HAMPTON CITY10160600
HANOVER COUNTY261210
HARRISONBURG CITY160100
HENRICO COUNTY24330301
HENRY COUNTY350100
ISLE OF WIGHT COUNTY4130102
JAMES CITY COUNTY23251100
KING GEORGE COUNTY241001
KING WILLIAM COUNTY200000
LANCASTER COUNTY211001
LEE COUNTY310000
LEXINGTON CITY020000
LOUDOUN COUNTY29731122
LOUISA COUNTY520000
LYNCHBURG CITY6150200
MADISON COUNTY200000
MANASSAS CITY300000
MANASSAS PARK CITY100000
MARTINSVILLE CITY210000
MATHEWS COUNTY030000
MECKLENBURG COUNTY320000
MIDDLESEX COUNTY040100
MONTGOMERY COUNTY6111100
NELSON COUNTY120100
NEW KENT COUNTY060000
NEWPORT NEWS CITY8170102
NORFOLK CITY145801101
NORTHUMBERLAND COUNTY211000
NOTTOWAY COUNTY010000
ORANGE COUNTY561000
PAGE COUNTY120000
PATRICK COUNTY020000
PETERSBURG CITY210000
PITTSYLVANIA COUNTY370100
POQUOSON CITY100000
PORTSMOUTH CITY591100
POWHATAN COUNTY220100
PRINCE EDWARD COUNTY020000
PRINCE GEORGE COUNTY111101
PRINCE WILLIAM COUNTY408321133
PULASKI COUNTY220000
RADFORD CITY020000
RAPPAHANNOCK COUNTY021000
RICHMOND CITY12291300
ROANOKE CITY14121200
ROANOKE COUNTY14150001
ROCKBRIDGE COUNTY222000
ROCKINGHAM COUNTY150101
RUSSELL COUNTY030001
SALEM CITY210000
SCOTT COUNTY200000
SHENANDOAH COUNTY010101
SMYTH COUNTY120000
SOUTHAMPTON COUNTY020100
SPOTSYLVANIA COUNTY10191100
STAFFORD COUNTY20480404
STAUNTON CITY120000
SUFFOLK CITY12310001
TAZEWELL COUNTY050100
VIRGINIA BEACH CITY46177111112
WARREN COUNTY240000
WASHINGTON COUNTY351100
WAYNESBORO CITY130000
WESTMORELAND COUNTY520001
WILLIAMSBURG CITY110000
WINCHESTER CITY060000
WISE COUNTY070000
WYTHE COUNTY000100
YORK COUNTY12352200
Grand Total | 698 | 1533 | 44 | 140 | 19 | 68

Tabulated Results by FL County Code:

FL County Code | LD=0, AD=0 | LD=0, 0<AD<=12 | LD=1, AD=0 | LD=1, 0<AD<=12 | LD=2, AD=0 | LD=2, 0<AD<=12
MON2200100
ALA0230200
BAK020000
BAY7400410
BRA220000
BRE41391123
BRO12950608
CHA71146121
CIT160100
CLA7472503
CLL1520101
CLM000100
DAD50592621
DES110000
DUV2811442119
ESC1910311003
FLA5110122
FRA110000
GAD100100
GLA100000
GUL040000
HAM300000
HAR310000
HEN100000
HER8160201
HIG010000
HIL296521014
HOL010000
IND9111010
JAC020000
LAK1100101
LEE0460301
LEO3592010
LEV301000
MAD001000
MAN31211101
MRN26160101
MRT4062211
NAS4120100
OKA50313012
OKE100000
ORA11390904
OSC4151000
PAL358931002
PAS0300301
PIN4880603
POL0620902
PUT210000
SAN13420302
SAR17181120
SEM53345303
STJ8221503
STL60204221
SUM2290301
SUW330000
TAY020000
VOL0510303
WAK110000
WAL160000
Grand Total | 698 | 1533 | 44 | 140 | 19 | 68

Addendum + Updates:

In response to a number of questions we have received on this topic, and continued work to dig into this data:

  1. The number of matches above has been corrected from the original 2,527 to 2,502 (a difference of 25) due to a “fat-finger” error in tallying the total number of category 5 matches.
  2. For the strict constraints given above, the number of matched records where there is a vote recorded for the same election date in both the VA and FL data is 13.
  3. We also computed the number of exact [First Name + Middle Initial + Last Name + Gender + Full DOB] matches without requiring our additional address filter. These criteria are stricter in the initial match, but looser in the subsequent filtering.
    • This results in a total of 17,701 matches when considering only Active voters on each of the FL and VA voter lists.
      • There are 343 of these matches where both FL and VA records have a history of votes cast in the same election.
    • The number jumps to 81,155 if we consider either Active or Inactive registrations.
      • There are 382 of these matches where both FL and VA records have a history of votes cast in the same election.
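
The exact-match variant in item 3 can be sketched as a keyed join followed by a check for shared election dates. This is a hedged illustration, not the code behind the numbers above: the record layout (dictionaries with `first`, `middle_initial`, `last`, `gender`, `dob` and `election_dates` fields) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def exact_key(record):
    """Identity key: first name, middle initial, last name, gender, full DOB."""
    return (record["first"], record["middle_initial"], record["last"],
            record["gender"], record["dob"])


def find_exact_matches(fl_records, va_records):
    """Return every (FL, VA) record pair sharing an identical identity key."""
    va_by_key = defaultdict(list)
    for va in va_records:
        va_by_key[exact_key(va)].append(va)
    return [(fl, va)
            for fl in fl_records
            for va in va_by_key.get(exact_key(fl), [])]


def voted_in_same_election(fl, va):
    """True if both records show a vote on at least one common election date."""
    return bool(set(fl["election_dates"]) & set(va["election_dates"]))
```

Filtering the output of `find_exact_matches` with `voted_in_same_election` yields the kind of subset counted in the nested bullets above. An identical key does not by itself establish that two records belong to the same person.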

2024 VA June Democratic Primary Election DAL File Metrics

Below you will find the current summary data and graphics from the 2024 VA June Democratic Primary Election Daily Absentee List (DAL) files. We pull the DAL file every day and track the count of each ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point's value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, but with a logarithmic y-axis that compresses the dynamic range so that the shape of every data curve is visible in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point's value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS = “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS = “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS = “PROVISIONAL” and APP_STATUS = “APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS = “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS = “MARKED” and APP_STATUS = “APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS = “ON_MACHINE” and APP_STATUS = “APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS = “PRE-PROCESSED” and APP_STATUS = “APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS = “FWAB” and APP_STATUS = “APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE = “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE = “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE = “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “TEMPORARY” and where COUNTABLE is True
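
The column definitions above translate fairly directly into a tally over the records of one daily file. The sketch below is a minimal illustration under stated assumptions, not EPEC's processing code: it assumes each DAL record has been parsed into a dictionary keyed by the BALLOT_STATUS, APP_STATUS and VOTER_TYPE column names, and it interprets "COUNTABLE is True" as the record falling into one of the five categories summed in the COUNTABLE column. The function and constant names are hypothetical.

```python
from collections import Counter

# BALLOT_STATUS values counted regardless of APP_STATUS -> summary column.
UNCONDITIONAL = {"ISSUED": "ISSUED", "NOT ISSUED": "NOT_ISSUED",
                 "DELETED": "DELETED"}
# BALLOT_STATUS values counted only when APP_STATUS is "APPROVED".
APPROVED_ONLY = {"PROVISIONAL": "PROVISIONAL", "MARKED": "MARKED",
                 "ON_MACHINE": "ON_MACHINE", "PRE-PROCESSED": "PRE_PROCESSED",
                 "FWAB": "FWAB"}
COUNTABLE_COLUMNS = ("PROVISIONAL", "MARKED", "PRE_PROCESSED",
                     "ON_MACHINE", "FWAB")
VOTER_TYPES = ("MILITARY", "OVERSEAS", "TEMPORARY")


def summarize_dal(records):
    """Tally one day's DAL records into the summary columns listed above."""
    counts = Counter()
    for rec in records:
        status = rec.get("BALLOT_STATUS")
        column = UNCONDITIONAL.get(status)
        if column is None and rec.get("APP_STATUS") == "APPROVED":
            column = APPROVED_ONLY.get(status)
        if column is not None:
            counts[column] += 1
        voter_type = rec.get("VOTER_TYPE")
        if voter_type in VOTER_TYPES:
            counts[voter_type] += 1
            # Assumed meaning of "COUNTABLE is True" for a single record.
            if column in COUNTABLE_COLUMNS:
                counts[voter_type + "_COUNTABLE"] += 1
    counts["MAIL_IN"] = counts["MARKED"] + counts["PRE_PROCESSED"]
    counts["COUNTABLE"] = sum(counts[c] for c in COUNTABLE_COLUMNS)
    return counts
```

Running this once per daily file and stacking the results by date would produce a time series of the kind plotted above.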

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.

If you like the work that EPEC is doing, please support us with a donation.


2024 VA June Republican Primary Election DAL File Metrics

Below you will find the current summary data and graphics from the 2024 VA June Republican Primary Election Daily Absentee List (DAL) files. We pull the DAL file every day and track the count of each ballot category in each daily file.

Note: Page may take a moment to load the graphics objects.

Linear Scale Plot:

Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point's value.

Logarithmic Scale Plot:

The logarithmic plot shows the same underlying data as the linear scale plot, but with a logarithmic y-axis that compresses the dynamic range so that the shape of every data curve is visible in a single graphic. Place your cursor over the series name in the legend at right to see the series highlighted in the graphic. Place your cursor over a specific data point to see that data point's value.

Summary Data Table:

The underlying data for the graphics above is provided in the summary data table.

Additional Data:

Additional CSV datasets stratified by Locality, City, Congressional District, State House District, State Senate District, and Precinct are available here.

Data column descriptions:
  • “ISSUED” := Number of DAL file records where BALLOT_STATUS = “ISSUED”
  • “NOT_ISSUED” := Number of DAL file records where BALLOT_STATUS = “NOT ISSUED”
  • “PROVISIONAL” := Number of DAL file records where BALLOT_STATUS = “PROVISIONAL” and APP_STATUS = “APPROVED”
  • “DELETED” := Number of DAL file records where BALLOT_STATUS = “DELETED”
  • “MARKED” := Number of DAL file records where BALLOT_STATUS = “MARKED” and APP_STATUS = “APPROVED”
  • “ON_MACHINE” := Number of DAL file records where BALLOT_STATUS = “ON_MACHINE” and APP_STATUS = “APPROVED”
  • “PRE_PROCESSED” := Number of DAL file records where BALLOT_STATUS = “PRE-PROCESSED” and APP_STATUS = “APPROVED”
  • “FWAB” := Number of DAL file records where BALLOT_STATUS = “FWAB” and APP_STATUS = “APPROVED”
  • “MAIL_IN” := The sum of “MARKED” + “PRE_PROCESSED”
  • “COUNTABLE” := The sum of “PROVISIONAL” + “MARKED” + “PRE_PROCESSED” + “ON_MACHINE” + “FWAB”
  • “MILITARY” := Number of DAL file records where VOTER_TYPE = “MILITARY”
  • “OVERSEAS” := Number of DAL file records where VOTER_TYPE = “OVERSEAS”
  • “TEMPORARY” := Number of DAL file records where VOTER_TYPE = “TEMPORARY”
  • “MILITARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “MILITARY” and where COUNTABLE is True
  • “OVERSEAS_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “OVERSEAS” and where COUNTABLE is True
  • “TEMPORARY_COUNTABLE” := Number of DAL file records where VOTER_TYPE = “TEMPORARY” and where COUNTABLE is True

All data purchased by Electoral Process Education Corp. (EPEC) from the VA Dept of Elections (ELECT). All processing performed by EPEC.

If you like the work that EPEC is doing, please support us with a donation.