Data Sources & Architecture

Authoritative data sets, integration cadence, stack recommendations, and security practices that underpin the NC GOTV Web App.

Authoritative Data Sources

NC Voter Registration (weekly)

Primary demographic, status, party, address, precinct, and district data. Sensitive fields (SSN, DOB) are excluded per state rules.

NC Voter History (weekly)

Event-level participation records including election identifiers, method, and party at vote. Must be joined back to registration data via ncid.

Polling Places & Early Voting Sites

Election-specific CSVs that map precincts to locations and hours; support Phase 2 precinct exports and maps.

Precinct Shapefiles

Current and historical polygons used for mapping precinct boundaries and tying voters to geographic context.

AddressNC Address Points

Statewide standardized addresses that enable Phase 3 set-difference analysis to uncover likely unregistered households.

Recommended Architecture

Frontend

React or Svelte single-page application with accessible data tables and MapLibre/Leaflet for precinct visualization.

Backend

Python (FastAPI) or Node (Fastify) serving search, filter, and export endpoints backed by PostgreSQL 15+ with PostGIS and pg_trgm.

Jobs & Pipelines

Airflow or cron-driven ingestion, dbt transformations, and object storage for raw drops. Exports delivered via pre-signed URLs and async jobs for large payloads.

Indexing & Performance

Security & Privacy

Example SQL References

Sample queries illustrate fuzzy search, history filtering, and address set differences.

-- Fuzzy name candidate set (Halifax only)
SELECT v.ncid, v.first_name, v.last_name, v.party, v.status, v.precinct_code,
       similarity((v.first_name||' '||v.last_name), $1) AS sim
FROM voter v
WHERE v.county_id = 'HALIFAX'
  AND (v.first_name||' '||v.last_name) % $1
ORDER BY sim DESC
LIMIT 200;

-- History filter (voted in 2020_G, not 2022_G)
WITH voted AS (
  SELECT ncid,
         MAX(CASE WHEN election_id = '2020_G' THEN 1 END) has20,
         MAX(CASE WHEN election_id = '2022_G' THEN 1 END) has22
  FROM voter_history
  GROUP BY ncid
)
SELECT v.*
FROM voter v
JOIN voted h USING (ncid)
WHERE h.has20 = 1
  AND COALESCE(h.has22, 0) = 0
  AND v.county_id = 'HALIFAX';

-- Address set difference for likely unregistered households
SELECT a.*
FROM addressnc a
LEFT JOIN (
  SELECT DISTINCT canonical_address_key
  FROM voter
) r ON a.canonical_address_key = r.canonical_address_key
WHERE r.canonical_address_key IS NULL
  AND a.county_id = 'HALIFAX';