The University of Sheffield
43 files

United States Commutes and Megaregions data for GIS

Version 5 2017-01-31, 10:01
Version 4 2016-12-20, 15:16
Version 3 2016-12-01, 00:07
Version 2 2016-11-02, 09:04
Version 1 2016-11-01, 09:27
posted on 2017-01-31, 10:01 authored by Alasdair Rae, Garrett G.D. Nelson
This Figshare dataset contains the files created by and used in a related PLOS ONE paper, entitled 'An economic geography of the United States: from commutes to megaregions', by Garrett Dash Nelson and Alasdair Rae, published 30 November 2016. 

Update: 27 January 2017 - see item 7. below

In addition to the files listed below, we have also provided a series of maps as high-resolution PNGs. The fifth file below can be styled in QGIS using the QML style file listed as item 6.

Information on files

1. A us_ttw_v3_US_only_epsg5070v2 zipped shapefile (for use in GIS software such as ArcGIS or QGIS) containing individual census tract to census tract commuter links for the entire United States, based on the 2006-2010 American Community Survey dataset, as cited in the paper.

2. A Pajek format file used in the derivation of our megaregions within Combo, the partitioning algorithm we used in the paper. Pajek (Program for Large Network Analysis) is a free software package. Combo is an open source modularity optimisation program developed by researchers at MIT, and the source code can be accessed at:

3. A us_ttw_v3_US_only_upto160km_epsg5070v2 zipped shapefile. This is an extract of 1. above but includes only those commutes of 160km or less (approx 100 miles).

4. A us_ttw_v3_US_only_800km_10plus_epsg5070v2 zipped shapefile. This is an extract of 1. above but includes only those flows of 10 or more commuters which cover a distance of 800km or more (approx 500 miles). There are just over 28,000 of these. We expect these may be occasional, weekly or otherwise irregular 'commuters', but we are not able to discern this from the underlying dataset.

5. A merged_ttw_v3_origin_and_dest_communities zipped shapefile. This contains the commute data but each line also has 'community assignment' fields. The ofips_comm (origin community) and dfips_comm (destination community) fields give the megaregion assigned to each line's origin and destination. The two values are usually the same, but in some cases a flow line crosses between two megaregions. The dfips_comm field was used to decide which megaregion an area belongs to.

6. We also include a community_colors_20_dec_2016_v3.qml file to accompany 5. above. This is a QGIS style file and can be used to style the maps in the same way as our dark background images on Figshare.

7. We have also uploaded a zipped folder with two US megaregion shapefiles - one generalized and one at full resolution. These files are similar to Figure 11 in our paper. The boundaries in Figure 11 were somewhat fluid in their precise location, whereas these files are constructed from census tracts, so they do not exactly match the lines shown in the figure. They do, however, provide a very close match to the regions labelled there. These files are provided so that other researchers can analyse associated data at these scales. The shapefiles have three columns: one contains the original megaregion names from the PLOS ONE paper, another contains a new megaregion name based in part on post-publication interactions with readers, and the third gives the name of a major city in each megaregion. We hope you find these files useful.
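
As a rough illustration of how the extracts in items 3 and 4 and the community fields in item 5 relate to the full set of flows, the sketch below applies the same rules to a handful of made-up records in plain Python. Only ofips_comm and dfips_comm are field names taken from the description above. The dist_km and commuters keys, and every value, are hypothetical stand-ins, and the sketch assumes that "10 or more" in item 4 refers to the number of commuters on a flow. Check the readme for the real attribute names before adapting it.

```python
# Toy flow records. Only ofips_comm and dfips_comm are real field names;
# dist_km, commuters and every value here are hypothetical.
flows = [
    {"dist_km": 12.0, "commuters": 340, "ofips_comm": 3, "dfips_comm": 3},
    {"dist_km": 95.5, "commuters": 25, "ofips_comm": 3, "dfips_comm": 7},
    {"dist_km": 160.0, "commuters": 8, "ofips_comm": 7, "dfips_comm": 7},
    {"dist_km": 1250.0, "commuters": 15, "ofips_comm": 1, "dfips_comm": 7},
    {"dist_km": 2100.0, "commuters": 4, "ofips_comm": 1, "dfips_comm": 3},
]


def up_to_160km(records):
    """Item 3: keep commutes of 160km or less."""
    return [r for r in records if r["dist_km"] <= 160]


def long_distance_10plus(records):
    """Item 4 (as assumed here): 10+ commuters over 800km or more."""
    return [r for r in records if r["commuters"] >= 10 and r["dist_km"] >= 800]


def crosses_megaregions(record):
    """Item 5: True when origin and destination communities differ."""
    return record["ofips_comm"] != record["dfips_comm"]


short_flows = up_to_160km(flows)
long_flows = long_distance_10plus(flows)
crossing = [r for r in flows if crosses_megaregions(r)]

print(len(short_flows), len(long_flows), len(crossing))
```

With the real shapefiles the same three conditions could be applied as attribute filters in QGIS or ArcGIS rather than in a script.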

Please see the attached readme file for full information.
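
For readers who have not met the Pajek format mentioned in item 2, the following sketch writes a tiny weighted network in Pajek's plain-text .net layout: a *Vertices section with numbered labels, followed by a list of links. It is not a reproduction of the file in this dataset. The to_pajek function, the tract labels and the weights are all hypothetical, and the choice of directed *Arcs rather than undirected *Edges is an assumption.

```python
def to_pajek(flows):
    """Return Pajek .net text for (origin, destination, weight) tuples."""
    # Pajek numbers vertices from 1; here in order of first appearance.
    ids = {}
    for origin, dest, _ in flows:
        for tract in (origin, dest):
            ids.setdefault(tract, len(ids) + 1)

    lines = [f"*Vertices {len(ids)}"]
    lines += [f'{number} "{tract}"' for tract, number in ids.items()]
    lines.append("*Arcs")  # assumption: directed, weighted links
    lines += [f"{ids[o]} {ids[d]} {w}" for o, d, w in flows]
    return "\n".join(lines) + "\n"


# Hypothetical tract labels and commuter counts, for illustration only.
pajek_text = to_pajek([("A", "B", 340), ("B", "C", 25), ("C", "A", 8)])
print(pajek_text)
```

Each line after *Arcs gives the origin vertex number, the destination vertex number and the link weight, which is the general shape of input a network partitioning program would read.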



  • There is no personal data, nor any data that requires ethical approval


  • The data complies with the institution's and funders' policies on access and sharing

Sharing and access restrictions

  • The data can be shared openly

Data description

  • The file formats are open or commonly used

Methodology, headings and units

  • There is a readme.txt file describing the methodology, headings and units