Data Coverage & Methodology

FARS API serves data from NHTSA's Fatality Analysis Reporting System - the definitive US fatal crash dataset. Here's what's included and what to expect.

What Data is Included

The dataset covers every fatal motor vehicle crash in the United States from 2017 through 2023. A crash is included if it involved a motor vehicle traveling on a public road and resulted in at least one death within 30 days of the crash.

Year   Crashes  Fatalities  Vehicles  Persons
2017    34,560      37,473    53,128   85,840
2018    33,919      36,835    52,286   84,344
2019    33,487      36,355    51,623   82,843
2020    35,935      39,007    54,552   86,396
2021    39,785      43,230    61,802   97,511
2022    39,422      42,721    60,765   96,186
2023    37,769      41,025    58,508   92,768
Total  254,877     276,646   392,664  625,888

Data Source

Crash data is sourced from NHTSA's Fatality Analysis Reporting System (FARS), maintained by the National Highway Traffic Safety Administration, part of the US Department of Transportation. FARS has collected data on every fatal motor vehicle crash in the US since 1975.

Traffic volume (AADT) and road segment data is sourced from FHWA's Highway Performance Monitoring System (HPMS) 2017 public release, maintained by the Federal Highway Administration. We spatially join each FARS crash to its nearest HPMS road segment within an 80-meter radius.
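The nearest-segment join described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: it assumes each road segment is a straight line between two (lat, lon) points, uses a local flat-earth projection, and scans every segment instead of using a spatial index. The names snap_crash and point_to_segment_m and the segment dictionary layout are hypothetical; only the 80-meter radius comes from the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0
MAX_SNAP_M = 80.0  # match radius stated in the text


def _to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) to metres on a plane tangent at (lat0, lon0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def point_to_segment_m(p, a, b):
    """Distance in metres from point p to the straight segment a-b.

    All arguments are (lat, lon) tuples in decimal degrees.
    """
    bx, by = _to_local_xy(b[0], b[1], a[0], a[1])
    px, py = _to_local_xy(p[0], p[1], a[0], a[1])
    seg_len_sq = bx * bx + by * by
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: both endpoints coincide
    else:
        # Clamp the projection so the closest point stays on the segment.
        t = max(0.0, min(1.0, (px * bx + py * by) / seg_len_sq))
    return math.hypot(px - t * bx, py - t * by)


def snap_crash(crash, segments, max_m=MAX_SNAP_M):
    """Return (segment, snap_distance_m) for the nearest segment within
    max_m, or None when no segment is close enough."""
    best = None
    for seg in segments:
        d = point_to_segment_m(crash, seg["start"], seg["end"])
        if d <= max_m and (best is None or d < best[1]):
            best = (seg, d)
    return best
```

Real HPMS geometries are multi-vertex polylines, so a production join would measure distance to every piece of each line and use a spatial index to avoid the full scan.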

Traffic volume / road segment join

204,751 of 254,877 fatal crashes (80.3%) successfully matched an FHWA HPMS 2017 road segment with non-zero AADT. Matched crashes carry the segment's AADT (annual average daily traffic), AADT_truck, functional class (Interstate, Principal Arterial, etc.), lane count, posted speed limit, and the snap distance from the crash GPS point to the segment centerline (median 2.5 m, 95th percentile 22 m).

Per-state match rate:

Coverage          # states  Examples
≥85% (excellent)  11        Texas 91.7%, DC 94.1%, Massachusetts 90.6%, California 85.8%
75–85% (good)     24        Florida 84.7%, New York 83.4%, Georgia 80.0%, Illinois 77.5%
65–75% (fair)     14        Kentucky 67.2%, New Jersey 72.5%, Iowa 72.6%
<65% (poor)       2         Delaware 63.2%, North Carolina 64.8%
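As a quick illustration of how the match rate and the coverage tiers relate, here is a small sketch. The function names are hypothetical, and treating each tier's lower bound as inclusive is an assumption; the table only states this explicitly for the 85% threshold.

```python
def match_rate(matched: int, total: int) -> float:
    """Share of crashes matched to an HPMS segment, in percent."""
    return 100.0 * matched / total


def coverage_tier(rate_pct: float) -> str:
    """Map a match rate to the tier labels used in the table.

    Assumption: lower bounds are inclusive (75.0 counts as "good").
    """
    if rate_pct >= 85:
        return "excellent"
    if rate_pct >= 75:
        return "good"
    if rate_pct >= 65:
        return "fair"
    return "poor"


national = match_rate(204_751, 254_877)  # the national figures quoted earlier
print(f"{national:.1f}% -> {coverage_tier(national)}")
```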

The two low-coverage states (North Carolina and Delaware) reflect a known issue: some state DOT submissions to HPMS omit smaller portions of the federal-aid road network, so a meaningful share of fatal crashes happens on roads that HPMS does not represent at all. We document this and degrade gracefully, returning road_exposure: null for unmatched crashes rather than guessing.
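Client code should therefore expect road_exposure to be null. A minimal sketch of handling both cases follows; the road_exposure key comes from the text, but the surrounding response shape and the aadt and functional_class keys are hypothetical and may not match the actual API schema.

```python
import json


def describe_exposure(crash: dict) -> str:
    """Summarise traffic exposure, tolerating unmatched crashes."""
    exposure = crash.get("road_exposure")
    if exposure is None:
        # Unmatched crash: no HPMS segment within the join radius.
        return "no HPMS match; traffic volume unavailable"
    # Key names below are illustrative assumptions.
    return f"AADT {exposure['aadt']:,} on {exposure['functional_class']}"


# Hypothetical payloads, for illustration only.
matched = json.loads(
    '{"id": 1, "road_exposure": {"aadt": 41200, "functional_class": "Interstate"}}'
)
unmatched = json.loads('{"id": 2, "road_exposure": null}')

for crash in (matched, unmatched):
    print(crash["id"], describe_exposure(crash))
```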

Per-road-class national baselines are computed from the joined FARS+HPMS population and are used to anchor Expected vs Actual claims:

Road class                          Fatal crashes per 100M VMT
Interstate                          0.32
Principal Arterial - Other Freeway  0.42
Principal Arterial - Other          1.34
Minor Arterial                      1.55
Major Collector                     1.89
Local                               0.98

Interstates are the safest road class per vehicle mile traveled, despite carrying the most traffic. This matches well-established findings in transportation safety literature.
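To show how a baseline can anchor an Expected vs Actual comparison, here is a rough sketch. It assumes the conventional definition of vehicle miles traveled (AADT × segment length in miles × days) and ignores leap days; the text does not spell out the exact calculation used, and the function names are hypothetical.

```python
# Baselines from the table above: fatal crashes per 100 million VMT.
BASELINE_PER_100M_VMT = {
    "Interstate": 0.32,
    "Principal Arterial - Other Freeway": 0.42,
    "Principal Arterial - Other": 1.34,
    "Minor Arterial": 1.55,
    "Major Collector": 1.89,
    "Local": 0.98,
}


def vehicle_miles_traveled(aadt: float, length_miles: float, years: float) -> float:
    """VMT over the period, assuming constant AADT and 365-day years."""
    return aadt * length_miles * 365 * years


def expected_crashes(road_class: str, aadt: float, length_miles: float,
                     years: float) -> float:
    """Fatal crashes expected on a segment at the national baseline rate."""
    vmt = vehicle_miles_traveled(aadt, length_miles, years)
    return BASELINE_PER_100M_VMT[road_class] * vmt / 1e8


def actual_vs_expected(actual: int, expected: float) -> float:
    """Ratio above 1.0 means more fatal crashes than the baseline predicts."""
    return actual / expected


# A 10-mile Interstate segment carrying 50,000 vehicles/day over 7 years:
# about 4.09 fatal crashes expected at the baseline rate.
exp = expected_crashes("Interstate", aadt=50_000, length_miles=10, years=7)
print(f"expected {exp:.2f}, ratio {actual_vs_expected(6, exp):.2f}")
```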

What Each Record Contains

Crash Record

Date, time, GPS coordinates, state, county, road type, weather, light conditions, manner of collision, speed limit, number of fatalities, number of vehicles, drunk driver involvement, hit-and-run flag.

Vehicle Record

Make, model, model year, body type, number of occupants, travel speed, speed limit, rollover, fire, hit-and-run, driver drinking status.

Person Record

Age, sex, person type (driver/passenger/pedestrian/cyclist), seat position, injury severity, restraint use, air bag deployment, ejection, alcohol test result.

Update Frequency

NHTSA publishes FARS data annually, typically in Q2 of the year following the crash year, so crashes from the start of a year can take approximately 18 months to appear. 2024 data is expected to be available by mid-2025. farsapi.com ingests new data within one week of NHTSA publication.

The 2023 dataset is currently labeled as "Initial Release" by NHTSA. A final version with corrections will replace it when available. farsapi.com labels data release status in API responses.

Known Limitations

Methodology

FARS API downloads annual CSV files from NHTSA's FTP server, parses the ACCIDENT, VEHICLE, and PERSON tables, translates numeric codes to human-readable labels using NHTSA's coding manuals, computes derived fields (e.g., drunk driver counts from vehicle-level data), and loads the normalized data into a PostgreSQL database with geographic indexing for radius queries.
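Two of the steps above, translating codes and deriving per-crash drunk driver counts, might look roughly like this. It is a simplified sketch, not the actual ingestion code: the column names ST_CASE and DR_DRINK and the weather labels are assumptions that should be checked against NHTSA's coding manual for each data year, and the sample rows are invented.

```python
import csv
import io
from collections import Counter

# Illustrative subset only; verify codes against the coding manual.
WEATHER_LABELS = {"1": "Clear", "2": "Rain", "10": "Cloudy"}


def decode(code: str, labels: dict) -> str:
    """Translate a numeric code to a label, keeping unknown codes visible."""
    return labels.get(code, f"Unknown code {code}")


def drunk_driver_counts(vehicle_rows) -> Counter:
    """Count vehicles with a drinking driver per crash.

    Assumes ST_CASE identifies the crash and DR_DRINK == "1" flags a
    drinking driver in the VEHICLE table.
    """
    counts = Counter()
    for row in vehicle_rows:
        if row["DR_DRINK"] == "1":
            counts[row["ST_CASE"]] += 1
    return counts


# Invented sample standing in for a VEHICLE CSV file.
SAMPLE_VEHICLE_CSV = """ST_CASE,VEH_NO,DR_DRINK
10001,1,1
10001,2,1
10002,1,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_VEHICLE_CSV)))
counts = drunk_driver_counts(rows)
print(counts["10001"], counts["10002"], decode("2", WEATHER_LABELS))
```

Returning a visible placeholder for unmapped codes, rather than failing or dropping the row, keeps new or year-specific codes from silently disappearing during a load.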
