Authoritative Data Sources
NC Voter Registration (weekly)
Primary demographic, status, party, address, precinct, and district data. Sensitive fields (SSN, DOB) are excluded per state rules.
NC Voter History (weekly)
Event-level participation records including election identifiers, method, and party at vote. Must be joined back to registration data via ncid.
Polling Places & Early Voting Sites
Election-specific CSVs that map precincts to locations and hours; support Phase 2 precinct exports and maps.
Precinct Shapefiles
Current and historical polygons used for mapping precinct boundaries and tying voters to geographic context.
AddressNC Address Points
Statewide standardized addresses that enable Phase 3 set-difference analysis to uncover likely unregistered households.
Recommended Architecture
Frontend
React or Svelte single-page application with accessible data tables and MapLibre/Leaflet for precinct visualization.
Backend
Python (FastAPI) or Node (Fastify) serving search, filter, and export endpoints backed by PostgreSQL 15+ with PostGIS and pg_trgm.
Jobs & Pipelines
Airflow or cron-driven ingestion, dbt transformations, and object storage for raw drops. Exports delivered via pre-signed URLs and async jobs for large payloads.
Indexing & Performance
- Create GIN indexes on trigram-enabled name columns to power fuzzy search.
- Use B-tree indexes on county, precinct, and status to accelerate filters.
- Introduce partial indexes scoped to Halifax initially, expanding statewide later.
Security & Privacy
- Enforce HTTPS, IP rate limiting, audit logging, and least-privilege database roles.
- Display terms banner emphasizing non-commercial political use and PII handling policies.
- Allow exports to omit sensitive columns when organizers need to limit field exposure.
Example SQL References
Sample queries illustrate fuzzy search, history filtering, and address set differences.
-- Fuzzy name candidate set (Halifax only)
SELECT v.ncid, v.first_name, v.last_name, v.party, v.status, v.precinct_code,
similarity((v.first_name||' '||v.last_name), $1) AS sim
FROM voter v
WHERE v.county_id = 'HALIFAX'
AND (v.first_name||' '||v.last_name) % $1
ORDER BY sim DESC
LIMIT 200;
-- History filter (voted in 2020_G, not 2022_G)
WITH voted AS (
SELECT ncid,
MAX(CASE WHEN election_id = '2020_G' THEN 1 END) has20,
MAX(CASE WHEN election_id = '2022_G' THEN 1 END) has22
FROM voter_history
GROUP BY ncid
)
SELECT v.*
FROM voter v
JOIN voted h USING (ncid)
WHERE h.has20 = 1
AND COALESCE(h.has22, 0) = 0
AND v.county_id = 'HALIFAX';
-- Address set difference for likely unregistered households
SELECT a.*
FROM addressnc a
LEFT JOIN (
SELECT DISTINCT canonical_address_key
FROM voter
) r ON a.canonical_address_key = r.canonical_address_key
WHERE r.canonical_address_key IS NULL
AND a.county_id = 'HALIFAX';