FARS for All

Jan. 12, 2017, 4:44 p.m.

The Fatality Analysis Reporting System (FARS) is a database of all fatal car crashes in the United States. The database spans from 1975 to the present and is the most comprehensive database of fatal crashes available.

Since 1975, FARS has logged over 1.5 million fatal accidents, 26,343 of which involved cyclists. Since 1999, FARS has included the GPS-provided latitude and longitude of each crash location. We’ve been able to match about 10,000 of the accidents involving bikes to an associated OSM way. This is one of our main sources of data for our edge weighting algorithm.

Over the years, FARS has released its data sets in formats that are not very friendly to open source tools. From 1975 to 2014 your two main options are SAS files and DBF files, both of which are proprietary binary formats. If you wish to use the SAS files, be aware that we know of no tools that will import them into a database automatically. There are libraries available for both R and Python, but you’ll have to write your own import script.

We chose to work with the DBF files, since ogr2ogr seems to handle them quite well and will import them into your database automatically. If you wish to do this yourself, you’ll need to make sure GDAL is compiled with the FileGDB extension. To do this, first build the FileGDB extension and then provide the path to it when configuring GDAL:

./configure --with-fgdb=/path/to/FileGDB_API-64
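
With GDAL built, each year’s DBF file can be handed to ogr2ogr. The sketch below is one way this might be scripted, not the exact commands used for our import. It assumes a PostgreSQL target, and the connection string, file path, and table name are hypothetical placeholders.

```python
import subprocess


def build_ogr2ogr_command(dbf_path, pg_connection, table_name):
    """Build the ogr2ogr argument list for loading one DBF file.

    All three arguments are illustrative placeholders (assumptions),
    not values taken from the FARS distribution.
    """
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",   # output format (assumed target database)
        pg_connection,        # destination, e.g. "PG:dbname=fars"
        dbf_path,             # source DBF file for a single year
        "-nln", table_name,   # name of the destination table
        "-append",            # add rows to the table if it already exists
    ]


def import_dbf(dbf_path, pg_connection, table_name):
    """Run ogr2ogr and raise CalledProcessError if it exits with an error."""
    cmd = build_ogr2ogr_command(dbf_path, pg_connection, table_name)
    subprocess.run(cmd, check=True)
```

Calling `import_dbf("FARS1999/accident.dbf", "PG:dbname=fars", "accident_1999")` would load a single year. Building the argument list separately from running it makes the command easy to print and inspect before touching the database.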

We were unable to import years 1984 and 2014 from the DBF format due to encoding issues that GDAL couldn’t handle. For 2014 we were also unable to parse the SAS files with either of the available R or Python packages. Ultimately, the only tool that could read these files was SAS Universal Viewer, which is proprietary and Windows-only but free to use.
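
A possible starting point for diagnosing encoding problems like these is to look at the DBF header, where byte 29 holds the language driver ID (code page mark) that readers use to pick an encoding. The sketch below only reads the fixed 32-byte header. It has not been verified against the FARS files, and the function name is hypothetical.

```python
import struct


def read_dbf_header(raw):
    """Parse the fixed 32-byte header at the start of a DBF file.

    `raw` is the file's leading bytes. Returns the fields most useful
    when a reader rejects the file or decodes its text incorrectly.
    """
    if len(raw) < 32:
        raise ValueError("a DBF header is at least 32 bytes long")
    # Bytes 4-11: record count (uint32), header length (uint16),
    # record length (uint16), all little-endian.
    num_records, header_length, record_length = struct.unpack_from("<IHH", raw, 4)
    return {
        "version": raw[0],
        "num_records": num_records,
        "header_length": header_length,
        "record_length": record_length,
        "language_driver_id": raw[29],  # 0 means no code page is declared
    }
```

For a file on disk, `read_dbf_header(open(path, "rb").read(32))` returns the fields. A language driver ID of 0 means the file declares no code page, so the reader has to guess the encoding.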

If you don’t want to go through the trouble of importing all of this data yourself, we’re providing our imported data set. There are a few things to note before using it. We’re currently only providing the ACCIDENT and PERSON tables, as we don’t have much use for the VEHICLES table. We’ve also merged all years into a single table. The field provided by FARS for joining between their tables is st_case, which is unique only within each year. We’ve added a year field to the PERSON table, so to join the two tables you’ll need to use both st_case and year. Lastly, FARS has added and removed columns over the years, so be aware that some columns will not contain data for all years.
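
The two-column join can be illustrated with a small in-memory SQLite database. This is a sketch under assumptions: only st_case and year come from the description above, while the lowercase table names, the latitude, longitude, and per_no columns, and the sample rows are made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for the merged tables (sample rows are invented)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accident (year INTEGER, st_case INTEGER, "
        "latitude REAL, longitude REAL)"
    )
    conn.execute("CREATE TABLE person (year INTEGER, st_case INTEGER, per_no INTEGER)")
    # The same st_case value appears in two different years.
    conn.executemany(
        "INSERT INTO accident VALUES (?, ?, ?, ?)",
        [(2012, 10001, 45.52, -122.68), (2013, 10001, 33.45, -112.07)],
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(2012, 10001, 1), (2013, 10001, 1), (2013, 10001, 2)],
    )
    return conn


def join_accident_person(conn):
    """Join each person to their accident on both st_case and year."""
    query = """
        SELECT a.year, a.st_case, a.latitude, a.longitude, p.per_no
        FROM accident AS a
        JOIN person AS p
          ON p.st_case = a.st_case
         AND p.year = a.year
        ORDER BY a.year, a.st_case, p.per_no
    """
    return conn.execute(query).fetchall()
```

Joining on st_case alone would pair every person with case number 10001 with both accidents, regardless of year. Adding the year condition keeps each person attached to the single accident they were actually involved in.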

We’re providing two versions of the data set: all accidents and bicycle-involved accidents. You can download the bicycle only data set here. The full data set is located here.