Previously I wrote about finding In-Person Early Vote records inserted into the Daily Absentee List (DAL) records after the close of early voting in VA in 2021. Well, there's been quite a bit of activity since then and I have some updates to share.
I originally discovered this issue and began digging into it around Nov 8th 2021, and finally published it to my blog on Dec 10th 2021. At the same time, queries were sent through the lawyers for the RPV to the ELECT Chairman (Chris Piper) and to a number of registrars to attempt to determine the cause of this issue, but no response was supplied. I also raised this issue with my local Board of Elections chair, and requested that ELECT comment on the matter through their official Twitter account.
Since that time I have continued to publish my findings, have continued to request responses from ELECT, and have offered to work with them to address and resolve these discrepancies. I know ELECT pays attention to this site and my Twitter account, as they have quietly corrected both their website and data files after I have pointed out other errors and discrepancies. Additionally, Chris Piper has continued to publicly insist that there were no major issues in either the 2020 or 2021 election (including under questioning by the VA Senate P&E Committee), and neither he nor any member of ELECT has publicly acknowledged any of the issues I have raised … besides the aforementioned changing of their site contents, of course. I have thankfully had a few local board of elections members work with me, as well as a few local registrars … but I did not see any meaningful response or engagement from anyone at ELECT until Feb 23rd 2022, as discussed below.
On Feb 22nd 2022, I was invited to participate in a meeting arranged by VA State Senator Amanda Chase with the VA Attorney General's office to discuss election integrity issues. I specifically cited a number of the issues that I’ve documented here on my blog, including the added DAL entries, as justification for my belief that there is an arguable case to be made for there being criminal gross negligence and maladministration at ELECT with respect to the administration and handling of VA election data.
That meeting apparently shook some things loose. Good.
The day after that meeting Chris Piper finally sent a response at 10:45am to our inquiry on the subject of the added DAL entries. It is quoted below:
While I am glad that he finally responded, his technical reasoning does not address all of the symptoms that are observed:
- He states the cause was due to the EPBs from one vendor, but the data shows DAL additions being attributed to multiple localities that use different EPB vendors.
- His explanation does not address the distribution of the numbers of DAL additions across all of the precincts observed. A misconfigured or malfunctioning poll book would affect all of the check-ins entered on it, not just a sporadic few.
- This also does not seem like a minor issue, as it's affecting thousands of voters' ballots and voter history. So I’m rather concerned with Mr. Piper’s attitude toward this issue, as well as others. It needs to be addressed as a logic and accuracy testing issue and as a matter of procedures and training, in addition to simply asking vendors to add checks to their software. Also, will this be addressed in VERIS at all, or if/when a replacement system is put in place?
- In response to his last paragraph, I will simply note that I have been actively and consistently working to raise all of these issues through official channels … through official requests via the RPV, working with local election board members and registrars, and asking for input from ELECT through social media. I have not made accusations of malicious or nefarious intent, but I do think there is plenty of evidence to make the case of incompetence and/or gross negligence with our elections data … which is actually a crime in VA … and Chris Piper has been the head of ELECT and the man responsible for ensuring our election data during that time (he is stepping down effective March 11th).
Since his response was sent, a few additional things have occurred:
(A) The AG’s office informed us that they are actually required by law to treat ELECT as their client and defend them from any accusations of wrongdoing. This is frustrating, as there does not seem to be any responsive cognizant authority that is able to act on this matter in the interest of the public. This is not a local jurisdictional issue, as it affects voters statewide, and is therefore in the purview of the AG, as the Department of Elections has heretofore been dismissive of and non-responsive to these matters. I am not a lawyer, however.
(B) I was able to connect with the Loudoun County Deputy Director of Elections (Richard Keech) as well as the Prince William County registrar (Eric Olsen) and have been working through the finer details of Chris’s explanation to verify and validate at the local level. [Note: I previously had Richard erroneously listed here as the Registrar instead of the Deputy Director]. Both Richard and Eric are continuing to look into the matter, and I continue to work with them to get to the bottom of this issue.
- Richard confirmed his belief that the bad OS system date on Election Day EPBs was responsible for the errors, though with some slight differences in the details from Piper’s description. There were multiple vendors affected, not just one. Per Richard, the problem appeared to be that a number of Loudoun poll-books (regardless of vendor) that were used for Election Day had been in storage so long that their batteries had completely depleted. When they were finally powered up, their OS system clocks had a wildly incorrect date. The hardware used was a mixture of Samsung SM-T720 and iPad tablets, depending on poll-book application vendor. The hardware was purchased separately through local contracts with CDW, and the software was uploaded and configured by the vendors.
- In Loudoun, all of the EPBs went through logic and accuracy testing before the election per Richard, but it does not appear that the procedures for the Logic and Accuracy testing had any specific checks for OS date settings.
- In Prince William County the registrar (Eric) was not aware of any issues with the system clocks on the poll-books, and he was skeptical of the distribution of the small numbers of added DAL entries. He noted, as I did above, that if a poll book was misconfigured it would affect all of the records that passed through it, not just a small handful. He also noticed that there was a discrepancy with the attribution of polling place names that I had extracted from the DAL files, where some of the names did not correspond with actual polling places in PWC. He has stated he will look into the matter and get back to me. I will update my blog when he does so.
- From my communications with Richard, the VERIS system imports a text-based file for processing voter credit and does not have any special checks on the dates that would distinguish Election Day in-person records from early voting records. This is why the issue can impact multiple vendors if their applications use the system clock to date-stamp the exported text files for upload into VERIS.
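The kind of date check that appears to be missing can be sketched in a few lines of MATLAB. This is only an illustration under stated assumptions: the device names and date stamps are made up, the check is hypothetical and is not part of VERIS or any vendor's software, and it assumes each Election Day poll-book export carries a date stamp in yyyy-mm-dd form.

```matlab
% Hypothetical pre-import sanity check (NOT part of VERIS or any vendor
% software). Device names and date stamps below are made up.
electionDay = datenum('2021-11-02', 'yyyy-mm-dd');

% Illustrative date stamps taken from Election Day poll-book exports.
deviceIds = {'EPB-A', 'EPB-B', 'EPB-C'};
stamps    = {'2021-11-02', '2021-11-02', '2019-01-01'};

% Convert each stamp to a serial date number so it can be compared.
dn = cellfun(@(s) datenum(s, 'yyyy-mm-dd'), stamps);

% An Election Day export that is not dated Election Day is suspect.
bad = dn ~= electionDay;

for k = find(bad)
    fprintf('%s: export dated %s, expected %s\n', deviceIds{k}, ...
        stamps{k}, datestr(electionDay, 'yyyy-mm-dd'));
end
```

A check like this could run either during logic and accuracy testing (against the device clock) or at upload time (against the file's date stamp); which of those is practical depends on details of the vendor software that are not known here.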
(C) I have reworked my code per my conversations with Ricky and Eric, and fixed a few bugs and parsing errors along the way. Most notably, there are a number of missing or malformed field values in the raw DAL files that were being parsed into ‘undefined’ categorical values by the default MATLAB CSV parser. These ‘undefined’ values, even when located in unimportant fields in the row of a MATLAB table, can cause the entire row to be incomparable to other entries when performing logical operations. I have adjusted my parser and logic to account for and/or ignore these entries as necessary. Additionally, I had previously looked for new entries by comparing values across entire rows, but have adjusted to now only look at voter ID numbers that have not been seen previously, in order to omit those entries that had simply been adjusted (address changes, etc.) after the fact, or that contained ‘undefined’ field elements as mentioned previously. I also noticed that some of the DAL files had duplicate Approved and On-Machine records for the same voter ID. While that is an issue in itself, I de-duplicated those entries for this analysis. This new logic gives the updated results presented below, with a total number of discrepancies now at 2820.
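As a minimal sketch of the ID-only comparison described in (C), the toy example below de-duplicates a pull on voter ID and then keeps only the IDs that have never been seen before. The ID values are made up for illustration, and the variable names are not the ones used in the full script.

```matlab
% Voter IDs already seen in earlier DAL pulls (toy values).
seenIds = [1001; 1002; 1003];

% IDs in the latest pull: 1002 is duplicated, 1004 and 1005 are new.
pullIds = [1002; 1004; 1002; 1005; 1003];

% De-duplicate on ID alone, keeping the first occurrence in file order.
[uniqIds, firstRow] = unique(pullIds, 'first');

% An entry counts as "new" only if its ID has never been seen before,
% regardless of what changed in the other fields of the row.
isNew   = ~ismember(uniqIds, seenIds);
newIds  = uniqIds(isNew);     % IDs first seen in this pull
newRows = firstRow(isNew);    % their row numbers within the pull

% Carry the new IDs forward for comparison against the next pull.
seenIds = [seenIds; newIds];
```

Because only the ID column takes part in the comparison, a row whose address changed, or whose unrelated fields parsed as ‘undefined’, is no longer mistaken for a new record.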
I will note that I am still a little skeptical of the “bad date” explanation as being a complete answer to this issue, as it does not adequately explain the distribution of small numbers of discrepancies attributed to multiple precincts, for one thing. While the bad date may explain part of the issue, it does not adequately account for all of the observed effects, IMO. For example, in Loudoun there are 26 precincts listed below that have inserted DAL records attributed to them, and many of these precincts have only 1 or 2 associated records. If the bad OS date explanation is to blame, then (a) there must have been at least 26 poll-books, one at each precinct, in Loudoun with misconfigured and incorrect dates AND (b) many of these poll-books were used to only check in 1 or 2 people total, as anyone checked in on a misconfigured poll-book would have their voter credit/DAL file entries affected. This would have to have been replicated at ALL of the precincts in ALL of the localities listed below. While the above scenario is admittedly possible, I find it rather implausible.
Update 2022-03-20
I’ve heard back from both Ricky Keech (Loudoun) and Eric Olsen (PWC).
Eric looked into the 10 entries for PWC and all of them were Military voters who did actually walk in, in person, to the registrar's office and vote on machine absentee after the close of early voting, as allowed for by state law. So everything checks out for PWC.
Per discussion with Ricky, there were two issues. The first was that a number of pollbooks used on Election Day at four specific precincts had the wrong date setting, as discussed above. The other issue was possibly a connectivity issue between the pollbooks and the servers in South Riding during the early voting period, which had to be hand corrected. Ricky’s explanation e-mail to me is copied below.
Hi Jon,
Following up on our conversation the other day. So, I did some more digging and was able to figure out what happened. The bulk of the voters (1213) were from the four precincts that had a tablet with the wrong date on it. That accounts for all voters in precincts 214, 416, 628, and 708. I had a staff person go back and pull out the tablets used in those precincts, and we confirmed that each of those four had one table with the wrong date.
That leaves 141 voters that received ‘credit’ after election day. Once we had narrowed it down, I looked for patterns and noticed all the remaining precincts were in South Riding. That jogged my memory and led me to the solution. When I ran the final reports on the Sunday before the election, I noticed that the number showing as voting early seemed to be off by 100 or so. This was odd because our daily checked in count and voted count reconciled at every site every day. So, I went back and compared the voters checked into our pollbooks at early voting to the voters with early voting (On Machine) credit in VERIS and found that there were 137 voters who voted on October 19 at the Dulles South EV site and for some reason did not have credit. I worked on this Monday to make sure it was right, and it was, none of those voters had credit. This could either have been a connectivity issue at Dulles South EV site OR an issue with VERIS when the data was uploaded to mark the voters. I can say definitively that the number checked in on the pollbooks at that site on that day and the number of people who put ballots into the machine was correct, we check that constantly and the observers on site checked as well. I can guarantee that if there had been a discrepancy, we’d have heard about it right away.
So, after determining that was exactly what happened I uploaded credit for those voters at 2:06:51pm on Wednesday, November 3 and the upload completed processing at 2:07:07pm.
When we spoke the other day, I thought it was likely a connectivity issue, but now I’m not entirely sure that’s the case, as if the connection wasn’t working the numbers should have been off. And they were correct on the devices at the EV site and my laptop here at the office. Everything matched.
So long story short, we did an audit, discovered missing credit from one early voting site on one day, and corrected it.
The other four voters were people who voted an emergency early voting ballot on Monday, November 1.
Richard Keech, Deputy Director of Elections for Loudoun County, Mar 11 2022 email to Jon Lareau
Locality | COUNT |
LOUDOUN COUNTY | 1344 |
HANOVER COUNTY | 1302 |
CHESAPEAKE CITY | 92 |
PRINCE WILLIAM COUNTY | 10 |
HENRICO COUNTY | 7 |
WINCHESTER CITY | 7 |
CHARLES CITY COUNTY | 5 |
CAMPBELL COUNTY | 3 |
CHARLOTTE COUNTY | 3 |
FAUQUIER COUNTY | 3 |
LUNENBURG COUNTY | 3 |
WASHINGTON COUNTY | 3 |
ALEXANDRIA CITY | 2 |
AMELIA COUNTY | 2 |
AMHERST COUNTY | 2 |
BATH COUNTY | 2 |
CAROLINE COUNTY | 2 |
FALLS CHURCH CITY | 2 |
HENRY COUNTY | 2 |
NORTHAMPTON COUNTY | 2 |
ORANGE COUNTY | 2 |
ROANOKE CITY | 2 |
VIRGINIA BEACH CITY | 2 |
ALLEGHANY COUNTY | 1 |
APPOMATTOX COUNTY | 1 |
BLAND COUNTY | 1 |
CHARLOTTESVILLE CITY | 1 |
CLARKE COUNTY | 1 |
CULPEPER COUNTY | 1 |
ESSEX COUNTY | 1 |
FAIRFAX CITY | 1 |
FLOYD COUNTY | 1 |
KING GEORGE COUNTY | 1 |
MIDDLESEX COUNTY | 1 |
NELSON COUNTY | 1 |
NOTTOWAY COUNTY | 1 |
POQUOSON CITY | 1 |
STAFFORD COUNTY | 1 |
STAUNTON CITY | 1 |
LOCALITY | PRECINCT | COUNT |
HANOVER COUNTY | 704 – ELMONT | 667 |
HANOVER COUNTY | 602 – LEE DAVIS | 635 |
LOUDOUN COUNTY | 416 – HAMILTON | 443 |
LOUDOUN COUNTY | 214 – SUGARLAND NORTH | 344 |
LOUDOUN COUNTY | 708 – SENECA | 319 |
LOUDOUN COUNTY | 628 – MOOREFIELD STATION | 97 |
LOUDOUN COUNTY | 319 – JOHN CHAMPE | 21 |
LOUDOUN COUNTY | 313 – PINEBROOK | 16 |
LOUDOUN COUNTY | 112 – FREEDOM | 13 |
LOUDOUN COUNTY | 122 – HUTCHISON FARM | 11 |
LOUDOUN COUNTY | 126-GOSHEN POST | 10 |
CHESAPEAKE CITY | 055 – GEORGETOWN EAST | 8 |
LOUDOUN COUNTY | 107 – LITTLE RIVER | 8 |
CHESAPEAKE CITY | 053 – FAIRWAYS | 7 |
LOUDOUN COUNTY | 121 – TOWN HALL | 7 |
LOUDOUN COUNTY | 316 – CREIGHTON’S CORNER | 7 |
LOUDOUN COUNTY | 318 – MADISON’S TRUST | 7 |
CHESAPEAKE CITY | 008 – SOUTH NORFOLK RECREATION | 6 |
CHESAPEAKE CITY | 012 – GEORGETOWN | 6 |
LOUDOUN COUNTY | 114 – DULLES SOUTH | 6 |
CHESAPEAKE CITY | 018 – INDIAN RIVER | 5 |
LOUDOUN COUNTY | 124 – LIBERTY | 5 |
LOUDOUN COUNTY | 320 – STONE HILL | 5 |
CHESAPEAKE CITY | 029 – TANGLEWOOD | 4 |
CHESAPEAKE CITY | 042 – PARKWAYS | 4 |
CHESAPEAKE CITY | 059 – CLEARFIELD | 4 |
CHESAPEAKE CITY | 065 – WATERWAY II | 4 |
LOUDOUN COUNTY | 119 – ARCOLA | 4 |
LOUDOUN COUNTY | 322-BUFFALO TRAIL | 4 |
CHESAPEAKE CITY | 005 – CRESTWOOD | 3 |
CHESAPEAKE CITY | 022 – NORFOLK HIGHLANDS | 3 |
CHESAPEAKE CITY | 057 – CYPRESS | 3 |
LOUDOUN COUNTY | 120 – LUNSFORD | 3 |
LOUDOUN COUNTY | 123 – CARDINAL RIDGE | 3 |
WINCHESTER CITY | 101 – MERRIMANS | 3 |
AMHERST COUNTY | 501 – MADISON | 2 |
CHARLES CITY COUNTY | 101 – PRECINCT 1-1 | 2 |
CHARLES CITY COUNTY | 301 – PRECINCT 3-1 | 2 |
CHARLOTTE COUNTY | 702 – BACON/SAXE | 2 |
CHESAPEAKE CITY | 010 – OSCAR SMITH | 2 |
CHESAPEAKE CITY | 014 – GRASSFIELD | 2 |
CHESAPEAKE CITY | 015 – GREENBRIER MIDDLE SCHOOL | 2 |
CHESAPEAKE CITY | 016 – HICKORY GROVE | 2 |
CHESAPEAKE CITY | 023 – OAK GROVE | 2 |
CHESAPEAKE CITY | 024 – OAKLETTE | 2 |
CHESAPEAKE CITY | 031 – CARVER SCHOOL | 2 |
CHESAPEAKE CITY | 032 – PROVIDENCE | 2 |
CHESAPEAKE CITY | 034 – HICKORY MIDDLE SCHOOL | 2 |
CHESAPEAKE CITY | 043 – PLEASANT CROSSING | 2 |
CHESAPEAKE CITY | 056 – GREEN TREE | 2 |
FALLS CHURCH CITY | 003 – THIRD WARD | 2 |
LOUDOUN COUNTY | 302 – ROUND HILL | 2 |
LOUDOUN COUNTY | 308 – ST LOUIS | 2 |
LOUDOUN COUNTY | 309 – ALDIE | 2 |
LOUDOUN COUNTY | 617 – OAK GROVE | 2 |
ORANGE COUNTY | 101 – ONE WEST | 2 |
PRINCE WILLIAM COUNTY | 409 – TYLER | 2 |
PRINCE WILLIAM COUNTY | 513 – LYNNWOOD | 2 |
PRINCE WILLIAM COUNTY | 712 – LEESYLVANIA | 2 |
WASHINGTON COUNTY | 701 – HIGH POINT | 2 |
WINCHESTER CITY | 201 – VIRGINIA AVENUE | 2 |
ALEXANDRIA CITY | 110 – CHARLES HOUSTON CENTER | 1 |
ALEXANDRIA CITY | 201 – NAOMI L. BROOKS SCHOOL | 1 |
ALLEGHANY COUNTY | 101 – ARRITT | 1 |
AMELIA COUNTY | 301 – NUMBER THREE | 1 |
AMELIA COUNTY | 501 – NUMBER FIVE | 1 |
APPOMATTOX COUNTY | 401 – COURTHOUSE | 1 |
BATH COUNTY | 101 – WARM SPRINGS | 1 |
BATH COUNTY | 201 – HOT SPRINGS | 1 |
BLAND COUNTY | 301 – HOLLYBROOK | 1 |
CAMPBELL COUNTY | 102 – NEW LONDON | 1 |
CAMPBELL COUNTY | 402 – COURT HOUSE | 1 |
CAMPBELL COUNTY | 602 – CONCORD | 1 |
CAROLINE COUNTY | 202 – SOUTH MADISON | 1 |
CAROLINE COUNTY | 401 – DAWN | 1 |
CHARLES CITY COUNTY | 201 – PRECINCT 2-1 | 1 |
CHARLOTTE COUNTY | 201 – RED OAK WYLLIESBURG | 1 |
CHARLOTTESVILLE CITY | 102 – CLARK | 1 |
CHESAPEAKE CITY | 006 – DEEP CREEK | 1 |
CHESAPEAKE CITY | 009 – BELLS MILL | 1 |
CHESAPEAKE CITY | 011 – GENEVA PARK | 1 |
CHESAPEAKE CITY | 020 – E W CHITTUM | 1 |
CHESAPEAKE CITY | 033 – WESTOVER | 1 |
CHESAPEAKE CITY | 046 – BELLS MILL II | 1 |
CHESAPEAKE CITY | 048 – JOLLIFF MIDDLE SCHOOL | 1 |
CHESAPEAKE CITY | 049 – WATERWAY | 1 |
CHESAPEAKE CITY | 050 – RIVER WALK | 1 |
CHESAPEAKE CITY | 051 – COOPERS WAY | 1 |
CHESAPEAKE CITY | 062 – FENTRESS | 1 |
CHESAPEAKE CITY | 063 – POPLAR BRANCH | 1 |
CHESAPEAKE CITY | 064 – DEEP CREEK II | 1 |
CLARKE COUNTY | 301 – MILLWOOD | 1 |
CULPEPER COUNTY | 303 – CARDOVA | 1 |
ESSEX COUNTY | 201 – NORTH | 1 |
FAIRFAX CITY | 001 – ONE | 1 |
FAUQUIER COUNTY | 202 – AIRLIE | 1 |
FAUQUIER COUNTY | 404 – SPRINGS VALLEY | 1 |
FAUQUIER COUNTY | 501 – THE PLAINS | 1 |
FLOYD COUNTY | 301 – COURTHOUSE | 1 |
HENRICO COUNTY | 105 – GREENDALE | 1 |
HENRICO COUNTY | 209 – GLEN LEA | 1 |
HENRICO COUNTY | 304 – JACKSON DAVIS | 1 |
HENRICO COUNTY | 316 – COLONIAL TRAIL | 1 |
HENRICO COUNTY | 416 – SPOTTSWOOD | 1 |
HENRICO COUNTY | 506 – EANES | 1 |
HENRICO COUNTY | 513 – PLEASANTS | 1 |
HENRY COUNTY | 203 – HORSEPASTURE #2 | 1 |
HENRY COUNTY | 501 – BASSETT NUMBER ONE | 1 |
KING GEORGE COUNTY | 101 – COURTHOUSE | 1 |
LOUDOUN COUNTY | 108 – MERCER | 1 |
LOUDOUN COUNTY | 118 – MOOREFIELD | 1 |
LOUDOUN COUNTY | 401 – WEST LOVETTSVILLE | 1 |
LUNENBURG COUNTY | 301 – ROSEBUD | 1 |
LUNENBURG COUNTY | 501 – REEDY CREEK | 1 |
LUNENBURG COUNTY | 502 – PEOPLES COMMUNITY CENTER | 1 |
MIDDLESEX COUNTY | 501 – WILTON | 1 |
NELSON COUNTY | 401 – ROSELAND | 1 |
NORTHAMPTON COUNTY | 201 – PRECINCT 2-1 | 1 |
NORTHAMPTON COUNTY | 401 – PRECINCT 4-1 | 1 |
NOTTOWAY COUNTY | 201 – PRECINCT 2-1 | 1 |
POQUOSON CITY | 001 – CENTRAL | 1 |
PRINCE WILLIAM COUNTY | 103 – GLENKIRK | 1 |
PRINCE WILLIAM COUNTY | 210 – PENN | 1 |
PRINCE WILLIAM COUNTY | 305 – PATTIE | 1 |
PRINCE WILLIAM COUNTY | 311 – SWANS CREEK | 1 |
ROANOKE CITY | 014 – Crystal Spring | 1 |
ROANOKE CITY | 019 – Forest Park | 1 |
STAFFORD COUNTY | 702 – WHITSON | 1 |
STAUNTON CITY | 301 – WARD NO 3 | 1 |
VIRGINIA BEACH CITY | 030 – RED WING | 1 |
VIRGINIA BEACH CITY | 063 – CULVER | 1 |
WASHINGTON COUNTY | 302 – SOUTH ABINGDON | 1 |
WINCHESTER CITY | 301 – WAR MEMORIAL | 1 |
WINCHESTER CITY | 402 – ROLLING HILLS | 1 |
The MATLAB code for generating the above is given below. The raw time-stamped DAL data files, as downloaded from the ELECT website, are loaded from the ‘droot’ directory tree as shown below, and I only utilize the latest daily download of DAL file data for simplicity.
warning off all
% Data directory root
droot = 'SourceData/DAL/2021/';
% Gets the list of DAL files that were downloaded from the ELECT provided
% URL over the course of the 2021 Election.
files = dir([droot,'raw/**/Daily_Absentee_List_*.csv']);
matc = regexp({files.name}, 'Daily_Absentee_List_\d+(T\d+)?\.csv','match');
matc = find(~cellfun(@isempty,matc));
files = files(matc);
% Only process the last updated DAL for each day. I downloaded multiple
% times per day, but we will just take the last file downloaded each day
% for simplicity here.
matc = regexp({files.name}, '(\d+)(T\d+)?','tokens');
fd = []; pc = 0; ic = 0; idx = [];
for i=1:numel(matc)
if isempty(regexp(matc{i}{1}{1},'^2021.*'))
fd(i) = datenum(matc{i}{1}{1},'mmddyyyy');
else
fd(i) = datenum(matc{i}{1}{1},'yyyymmdd');
end
datestr(fd(i))
if pc ~= fd(i)
ic = ic+1;
end
idx(ic) = i;
pc = fd(i);
end
files = files(idx);
% Now that we have our list of files, lets process.
seen = [];
firstseen = [];
astats = [];
astatsbc = {};
cumOnMachine = [];
T = [];
for i = 1:numel(files)
% Extract the date of the ELECT data pull. Note that for the first few
% days I was running the script I was not including the time of the
% pull when I was pulling the data from the ELECT URL and writing to
% disk, so there's some special logic here to handle that issue.
matc = regexp(files(i).name, '(\d+)(T\d+)?','tokens');
matc = [matc{1}{:}];
if isempty(regexp(matc,'^2021.*'))
fdn = datenum(matc,'mmddyyyy')+.5;
else
fdn= datenum(matc,'yyyymmddTHHMMSS');
end
fds = datestr(fdn,30)
fdt = datetime(fdn,'ConvertFrom','datenum');
% Move a copy to the 'byDay' folder so we can keep a reference to the
% data that went into this analysis.
dal2021filename = [files(i).folder,filesep,files(i).name];
ofn = [droot,'byDay/Daily_Absentee_List_',fds,'.csv'];
if ~exist(ofn,'file')
copyfile(dal2021filename,ofn);
end
% Import the DAL file
dal2021 = import2021DALfile(dal2021filename, [2, Inf]);
% Cleanup and handle undefined or missing values.
dal2021.CITY(isundefined(dal2021.CITY)) = 'UNDEFINED';
dal2021.STATE(isundefined(dal2021.STATE)) = 'UNDEFINED';
dal2021.ZIP(isundefined(dal2021.ZIP)) = 'UNDEFINED';
dal2021.COUNTRY(isundefined(dal2021.COUNTRY)) = 'UNDEFINED';
% Do some basic indexing of different DAL status categories and combinations
appvd = dal2021.APP_STATUS == 'Approved' ;
aiv2021 = appvd & dal2021.BALLOT_STATUS == 'Issued';
amv2021 = appvd & dal2021.BALLOT_STATUS == 'Marked';
aomv2021 = appvd & dal2021.BALLOT_STATUS == 'On Machine';
appv2021 = appvd & dal2021.BALLOT_STATUS == 'Pre-Processed';
afwv2021 = appvd & dal2021.BALLOT_STATUS == 'FWAB';
appmv2021 = amv2021 | aomv2021 | appv2021 | afwv2021; % Approved and Countable
% Accumulate the stats for each DAL file
rstats = table(fdt,sum(aiv2021),sum(amv2021),sum(aomv2021),...
sum(appv2021),sum(afwv2021),sum(appmv2021),'VariableNames',{'DALFileDate',...
'NumIssued','NumMarked','NumOnMachine','NumPreProcessed','NumFWAB','NumCountable'});
astats = [astats; rstats];
% Write out the entries that were approved and countable in this DAL
% file
ofn = [droot,'byDayCountable/Daily_Absentee_List_Countable_',fds,'.csv'];
if ~exist(ofn,'file')
writetable(dal2021(appmv2021,:),ofn);
end
% Write out the entries that were approved, countable and marked as 'On
% Machine' (i.e. an Early In-Person Voter Check-In) in this DAL file
ofn = [droot,'byDayCountableOnMachine/Daily_Absentee_List_Countable_OnMachine',fds,'.csv'];
if ~exist(ofn,'file')
writetable(dal2021(aomv2021,:),ofn);
end
% Since the DAL file grows over time, we're going to try and figure out
% which On-Machine entries in each new file:
% (a) We've seen before and are still listed.
% (b) We haven't seen before (a NEW entry).
% (c) We've seen before but the listing is missing (a DELETED
% entry).
if isempty(T)
% Only applicable for the first file we process.
T = dal2021(aomv2021,:);
% There are sometimes duplicate uid numbers in the approved and
% on-machine counts! This is a problem in and of itself, but not
% the problem I'm trying to focus on at the moment. So I'm going
% to remove any duplicated rows based on UID.
[uid, ia, ib] = unique(T.identification_number);
T = T(ia,:);
inew = (1:size(T,1))';
ideleted = [];
firstseen = repmat(fdt,size(T,1),1);
else
dOM = dal2021(aomv2021,:);
% There are sometimes duplicate uid numbers in the approved and
% on-machine counts! This is a problem in and of itself, but not
% the problem I'm trying to focus on at the moment. So I'm going
% to remove any duplicated rows based on UID.
[uid, ia, ib] = unique(dOM.identification_number);
dOM = dOM(ia,:);
%[~,ileft,iright] = innerjoin(T,dOM);
[~,ileft,iright] = intersect(T.identification_number,dOM.identification_number);
% 'inew' will be a boolean vector representing those entries in
% 'dOM' that are new On-Machine records
inew = true(size(dOM,1),1);
inew(iright) = false;
% 'ideleted' will be a boolean vector representing those entries in
% 'T' that are missing On-Machine records in 'dOM'
ideleted = true(size(T,1),1);
ideleted(ileft) = false;
T = [T;dOM(inew,:)];
firstseen = [firstseen; repmat(fdt,sum(inew),1)];
end
clf;
plot(astats.DALFileDate,astats{:,2:end},'LineWidth',2);
grid on;
grid minor;
legend(astats.Properties.VariableNames(2:end),'Location','NorthWest');
xlabel('Date of DAL file pull from ELECT');
ylabel('Counts');
drawnow;
end
ofn = [droot,'byDayStats.csv'];
writetable(astats,ofn);
T.FirstSeen = firstseen;
ofn = [droot,'onMachineRecords.csv'];
writetable(T,ofn);
ofn = [droot,'onMachineRecords_missing.csv'];
writetable(T(ideleted,:),ofn);
cutoffDate = datetime('2021-11-01');
ofn = [droot,'onMachineRecords_after_20211101.csv'];
writetable(T(T.FirstSeen >= cutoffDate,:),ofn);
Ta = T(T.FirstSeen >= cutoffDate,:);
[ulocality,ia,ib] = unique(Ta.LOCALITY_NAME);
clocality = accumarray(ib,1,size(ulocality));
Tu = table(ulocality,clocality,'VariableNames',{'Locality','COUNT'});
ofn = [droot,'numOnMachineRecords_after_20211101_byLocality.csv'];
writetable(Tu,ofn);
Ta = T(T.FirstSeen >= cutoffDate,:);
[uprecinct,ia,ib] = unique(join([string(Ta.LOCALITY_NAME),string(Ta.PRECINCT_NAME)]));
cprecinct = accumarray(ib,1,size(uprecinct));
Tu = table(Ta.LOCALITY_NAME(ia),Ta.PRECINCT_NAME(ia),cprecinct,'VariableNames',{'LOCALITY','PRECINCT','COUNT'});
ofn = [droot,'numOnMachineRecords_after_20211101_byLocalityByPrecinct.csv'];
writetable(Tu,ofn);
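The per-locality tally at the end of the script relies on pairing unique with accumarray. The same pattern is shown below on a handful of made-up labels, as a minimal sketch rather than a replacement for the script. The final descending sort is an added convenience for display and is not something the script above does before writing its CSV files.

```matlab
% Toy locality labels, one per late-appearing record (made-up data).
locality = {'LOUDOUN COUNTY'; 'HANOVER COUNTY'; 'LOUDOUN COUNTY'; ...
            'BATH COUNTY'; 'LOUDOUN COUNTY'};

% unique returns the sorted labels and, for each record, the index of
% the label it belongs to.
[names, ~, groupIdx] = unique(locality);

% accumarray adds 1 into the bin for each record's group.
counts = accumarray(groupIdx, 1, [numel(names), 1]);

% Sort by descending count for display (added here, not in the script).
[counts, order] = sort(counts, 'descend');
names = names(order);

for k = 1:numel(names)
    fprintf('%-20s | %d |\n', names{k}, counts(k));
end
```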
The adjusted MATLAB parser function is listed below:
function dal = import2021DALfile(filename, dataLines)
%IMPORT2021DALFILE Import data from a Daily Absentee List (DAL) CSV file
% DAL = IMPORT2021DALFILE(FILENAME) reads data from text file FILENAME
% for the default selection. Returns the data as a table.
%
% DAL = IMPORT2021DALFILE(FILENAME, DATALINES) reads data for the
% specified row interval(s) of text file FILENAME. Specify DATALINES as
% a positive scalar integer or an N-by-2 array of positive scalar
% integers for dis-contiguous row intervals.
%
% Example:
% dal = import2021DALfile("SourceData/DAL/Daily_Absentee_List_10162021.csv", [2, Inf]);
%
% See also READTABLE.
%
% Auto-generated by MATLAB on 16-Oct-2021 14:19:26
%% Input handling
% If dataLines is not specified, define defaults
if nargin < 2
dataLines = [2, Inf];
end
%% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 38);
% Specify range and delimiter
opts.DataLines = dataLines;
opts.Delimiter = ",";
% Specify column names and types
opts.VariableNames = ["ELECTION_NAME", "LOCALITY_CODE", "LOCALITY_NAME", "PRECINCT_CODE", "PRECINCT_NAME", "LAST_NAME", "FIRST_NAME", "MIDDLE_NAME", "SUFFIX", "ADDRESS_LINE_1", "ADDRESS_LINE_2", "ADDRESS_LINE_3", "CITY", "STATE", "ZIP", "COUNTRY", "INTERNATIONAL", "EMAIL_ADDRESS", "FAX", "VOTER_TYPE", "ONGOING", "APP_RECIEPT_DATE", "APP_STATUS", "BALLOT_RECEIPT_DATE", "BALLOT_STATUS", "identification_number", "PROTECTED", "CONG_CODE_VALUE", "STSENATE_CODE_VALUE", "STHOUSE_CODE_VALUE", "AB_ADDRESS_LINE_1", "AB_ADDRESS_LINE_2", "AB_ADDRESS_LINE_3", "AB_CITY", "AB_STATE", "AB_ZIP", "BALLOTSTATUSREASON", "Ballot_Comment"];
opts.VariableTypes = ["string", "string", "categorical", "string", "string", "string", "string", "string", "string", "string", "string", "string", "categorical", "categorical", "categorical", "categorical", "categorical", "string", "string", "string", "categorical", "string", "categorical", "string", "categorical", "double", "string", "categorical", "categorical", "categorical", "string", "string", "string", "string", "string", "string", "string", "string"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, ["ELECTION_NAME", "LOCALITY_CODE", "PRECINCT_CODE", "PRECINCT_NAME", "LAST_NAME", "FIRST_NAME", "MIDDLE_NAME", "SUFFIX", "ADDRESS_LINE_1", "ADDRESS_LINE_2", "ADDRESS_LINE_3", "EMAIL_ADDRESS", "FAX", "VOTER_TYPE", "APP_RECIEPT_DATE", "BALLOT_RECEIPT_DATE", "PROTECTED", "AB_ADDRESS_LINE_1", "AB_ADDRESS_LINE_2", "AB_ADDRESS_LINE_3", "AB_CITY", "BALLOTSTATUSREASON", "Ballot_Comment"], "WhitespaceRule", "preserve");
opts = setvaropts(opts, ["ELECTION_NAME", "LOCALITY_CODE", "LOCALITY_NAME", "PRECINCT_CODE", "PRECINCT_NAME", "LAST_NAME", "FIRST_NAME", "MIDDLE_NAME", "SUFFIX", "ADDRESS_LINE_1", "ADDRESS_LINE_2", "ADDRESS_LINE_3", "CITY", "STATE", "COUNTRY", "INTERNATIONAL", "EMAIL_ADDRESS", "FAX", "VOTER_TYPE", "ONGOING", "APP_RECIEPT_DATE", "APP_STATUS", "BALLOT_RECEIPT_DATE", "BALLOT_STATUS", "PROTECTED", "CONG_CODE_VALUE", "STSENATE_CODE_VALUE", "STHOUSE_CODE_VALUE", "AB_ADDRESS_LINE_1", "AB_ADDRESS_LINE_2", "AB_ADDRESS_LINE_3", "AB_CITY", "AB_STATE", "BALLOTSTATUSREASON", "Ballot_Comment"], "EmptyFieldRule", "auto");
% Import the data
dal = readtable(filename, opts);
% Perform some cleanup on commonly found issues...
dal.Ballot_Comment = strrep(dal.Ballot_Comment,char([13,10]),". ");
dal.BALLOTSTATUSREASON = strrep(dal.BALLOTSTATUSREASON,char([13,10]),". ");
dal.LOCALITY_NAME = categorical(strtrim(string(dal.LOCALITY_NAME)));
dal.PRECINCT_NAME = categorical(regexprep(strtrim(string(dal.PRECINCT_NAME)),'^(\d+)( +)(\w+)','$1 - $3'));
end
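The last cleanup line in the parser only rewrites precinct names in which the leading number is followed by one or more spaces and then a word character. The short sketch below applies the same pattern to a few sample strings to show which forms are normalized and which pass through untouched. The input strings are illustrative and are not claimed to be the exact raw values in the DAL files.

```matlab
% Sample precinct names in several formats (illustrative inputs only).
raw = {'416 HAMILTON', '126-GOSHEN POST', '708   SENECA', '602 - LEE DAVIS'};

% Same pattern as the parser: leading digits, one or more spaces, then a
% word. The run of spaces is replaced with ' - '.
fixed = regexprep(strtrim(raw), '^(\d+)( +)(\w+)', '$1 - $3');

for k = 1:numel(raw)
    fprintf('"%s" -> "%s"\n', raw{k}, fixed{k});
end
```

Names with no space after the number, such as the 126-GOSHEN POST form, do not match the pattern and are left as they are, which would explain why a couple of rows in the precinct table above keep a different separator style.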