Measuring “Walkability”

by Bob Gradeck

February 24, 2022


We’ve been asked to create measures of “walkable” communities for several projects. While we acknowledge that there is no standard definition of what makes a community “walkable,” and that the definition can differ from person to person, we thought an indicator comparing the total length of available sidewalks and public staircases to the total length of streets in a community could be a place to start. In this blog post, we describe how we used open data from the Southwestern Pennsylvania Commission (SPC) and Allegheny County to create a new measure of how “walkable” a community is. We wanted a ratio of the length of a community’s sidewalks to the length of its streets as a measure of pedestrian infrastructure. A ratio of 1 would mean that a community has an equal number of linear feet of sidewalks and streets. A ratio of about 2 would mean that a community has two linear feet of sidewalk for every linear foot of street; in other words, streets generally have a sidewalk on both sides.
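To make the ratio concrete, here is a minimal sketch in Python. The function name and the lengths are made up for illustration; only the definition (sidewalk feet divided by street feet) comes from the description above.

```python
def sidewalk_street_ratio(sidewalk_feet, street_feet):
    """Return linear feet of sidewalk per linear foot of street."""
    if street_feet <= 0:
        # No qualifying streets, so the ratio is undefined.
        return None
    return sidewalk_feet / street_feet


# Hypothetical lengths, in linear feet.
equal_lengths = sidewalk_street_ratio(10_000, 10_000)   # as much sidewalk as street
both_sides = sidewalk_street_ratio(19_000, 10_000)      # most streets have two sidewalks
print(equal_lengths, both_sides)
```

Returning None when there are no qualifying streets is one possible design choice; it keeps a community with no street length from showing up as a ratio of zero.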
In the past, we’ve used data from Walk Score, simply because it was the only source available. While the data matched our perceptions of how “walkable” a place is, we found two major challenges in using Walk Score data as a community indicator. First, the data is based on a particular address, and is not necessarily designed to calculate how “walkable” an entire community actually is. When we obtained the data the first time, we used the block group centroid as a reference point. The second challenge in using the Walk Score indicator is that it is a “black box” algorithm, meaning that we have no details of how the algorithm operates, and don’t know much about the quality of the underlying data or any biases inherent in the model or various data sources it relies on. For this reason, we found that we also couldn’t explain how it works or how it measures what it measures. 


How We Created the Sidewalk to Street Ratio

Now that the SPC has started to collect and publish data about sidewalks on its open data portal, we wanted to try to create our own measure of community “walkability.” Specifically, we wanted to compare sidewalk length (in linear feet) to street length by 2010 Census blockgroup. The sidewalk data produced by SPC includes sidewalks on each side of streets, steps, crosswalks, and trails.
In creating a measure of the ratio of sidewalks to streets, we had to do a little bit of data cleanup. Much of this was by trial and error, ground-truthing the data based on our personal experiences walking in different neighborhoods. Since street data was not shared as open data by many counties in our region, either on PASDA or through the SPC open data portal, we limited our analysis of “walkability” to Allegheny County.
  • In looking at the sidewalk data table and map, we noticed that trails were included. While nice to have in the data, we wanted to exclude them from the ratio. We did this to avoid a situation where a community that had few sidewalks but was in the same blockgroup as a park with trails would get “credit” for being more “walkable” than it actually is according to our definition. We removed all segments where “Trail” was in the “Type_Name” field.
  • We used a similar tabular selection to remove crosswalks from the sidewalk data (“Type_Name” = “Crosswalk”). We kept the steps in the dataset along with the sidewalks.
  • In the street data obtained from Allegheny County’s GIS department, we excluded limited-access highway segments, since pedestrians are prohibited from using them and their presence would have reduced the sidewalk/street ratio in the communities where they are located. We did this by excluding street segments whose values in the “FCC” field (designating type of street) equaled “A11” or “A63.” We also removed trails from this dataset by excluding those classified as “H10.” Since documentation was sparse, we inspected how these features were classified in the data to determine which codes to exclude.
  • After an initial run, we realized that excluding alleyways could also improve the accuracy of our results. Some of the communities with substantial pedestrian infrastructure have alleyways, and including them in the street total would make those communities appear less “walkable” in our indicator. We removed them by dropping records with a value of “Aly” or “Way” in the “St_Type” field. We also excluded streets where the word “Alley” appeared in the street name (“St_Name”) field.
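
The cleanup rules in the list above can be sketched as two small filter functions. This is an illustration and not the GIS workflow we actually ran: the rows are plain Python dictionaries standing in for attribute-table records, the sample values (including the "LOCAL" code and the "length_ft" field) are invented, and matching “Alley” without regard to case is an assumption. The field names and excluded codes are the ones described above.

```python
EXCLUDED_FCC = {"A11", "A63", "H10"}   # limited-access highways (A11, A63) and trails (H10)
ALLEY_TYPES = {"Aly", "Way"}           # St_Type values treated as alleyways


def keep_sidewalk(row):
    """Keep sidewalks and steps; drop trails and crosswalks."""
    type_name = row.get("Type_Name") or ""
    return "Trail" not in type_name and type_name != "Crosswalk"


def keep_street(row):
    """Drop limited-access highways, trails, and alleyways."""
    if row.get("FCC") in EXCLUDED_FCC:
        return False
    if row.get("St_Type") in ALLEY_TYPES:
        return False
    # Assumption: match "Alley" in the street name regardless of case.
    return "alley" not in (row.get("St_Name") or "").lower()


# Invented sample rows; "LOCAL" is a placeholder code, not a real FCC value.
sidewalks = [
    {"Type_Name": "Sidewalk", "length_ft": 400.0},
    {"Type_Name": "Steps", "length_ft": 60.0},
    {"Type_Name": "Crosswalk", "length_ft": 40.0},
    {"Type_Name": "Trail", "length_ft": 900.0},
]
streets = [
    {"St_Name": "Example", "St_Type": "Ave", "FCC": "LOCAL", "length_ft": 300.0},
    {"St_Name": "Sample", "St_Type": "Way", "FCC": "LOCAL", "length_ft": 150.0},
    {"St_Name": "Highway", "St_Type": "", "FCC": "A11", "length_ft": 2000.0},
    {"St_Name": "Alley A", "St_Type": "St", "FCC": "LOCAL", "length_ft": 120.0},
]

kept_sidewalks = [row for row in sidewalks if keep_sidewalk(row)]
kept_streets = [row for row in streets if keep_street(row)]
print([row["Type_Name"] for row in kept_sidewalks])
print([row["St_Name"] for row in kept_streets])
```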


Figure 1: Sidewalks and streets in Downtown Pittsburgh included in the sidewalk-street ratio



Once we were relatively happy with the quality of our underlying data, we went about the work of calculating the ratio of sidewalks to streets for each 2010 blockgroup. We used GIS to clip the sidewalk and street segments and append blockgroup numbers to each segment. We then summed the linear feet of all qualifying sidewalks and streets within each 2010 Census blockgroup to produce our ratio of sidewalk length to street length. The map below shows the ratio of sidewalks to streets (in linear feet) for all blockgroups in Allegheny County, and we also created an interactive map.
Figure 2: Sidewalk-Street Ratio by Blockgroup
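
The aggregation step described above can be sketched as follows, assuming each segment has already been clipped and tagged with a blockgroup in GIS. The "blockgroup" and "length_ft" keys, the blockgroup labels, and the lengths are hypothetical stand-ins, not fields from the published data.

```python
from collections import defaultdict


def ratio_by_blockgroup(sidewalks, streets):
    """Sum segment lengths per blockgroup, then divide sidewalk feet by street feet."""
    sidewalk_feet = defaultdict(float)
    street_feet = defaultdict(float)
    for segment in sidewalks:
        sidewalk_feet[segment["blockgroup"]] += segment["length_ft"]
    for segment in streets:
        street_feet[segment["blockgroup"]] += segment["length_ft"]

    ratios = {}
    for blockgroup, feet in street_feet.items():
        if feet > 0:  # skip blockgroups with no qualifying street length
            ratios[blockgroup] = sidewalk_feet.get(blockgroup, 0.0) / feet
    return ratios


# Hypothetical clipped segments.
sidewalk_segments = [
    {"blockgroup": "BG-A", "length_ft": 500.0},
    {"blockgroup": "BG-A", "length_ft": 700.0},
    {"blockgroup": "BG-B", "length_ft": 150.0},
]
street_segments = [
    {"blockgroup": "BG-A", "length_ft": 600.0},
    {"blockgroup": "BG-B", "length_ft": 300.0},
    {"blockgroup": "BG-B", "length_ft": 300.0},
]

ratios = ratio_by_blockgroup(sidewalk_segments, street_segments)
print(ratios)
```

In this sketch a blockgroup with streets but no sidewalks gets a ratio of 0, while a blockgroup with no qualifying streets is left out entirely.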


Are We Really Measuring “Walkability”?

We are reluctant to call our highest-scoring communities “walkable,” because no matter how many sidewalks a community has, pedestrians are far too often involved in crashes with vehicles. For this reason, we put the term “walkable” in quotation marks in this blog post. Using open crash data from the Pennsylvania Department of Transportation from 2004 through 2020, we analyzed pedestrian crashes in the county, along with those occurring in the “most-walkable” communities.
From 2004 through 2020, there were 4,318 crashes involving 4,544 pedestrians in blockgroups with a sidewalk-to-street ratio over 1, communities whose physical infrastructure should be conducive to walking. These incidents resulted in 99 deaths and 360 major injuries. The state’s data allows for a look at the factors contributing to each crash. In these incidents, some of the leading contributing factors included aggressive driving, distracted driving, driver impairment, and running red lights. The following table and interactive map include information about the total number of crashes involving pedestrians, along with their locations. Clicking on each blockgroup provides the ratio of sidewalks to streets, and clicking on each crash record shows details about the incident, including year, month, time, injuries, and factors associated with the crash.


Table 1: Vehicle Crashes Involving Pedestrians, 2004-2020

                              Total Crashes   Pedestrians Involved   Deaths   Major Injuries
Sidewalk to Street Ratio >1   4,318           4,544                  99       360
Total County                  6,929           7,288                  207      641

Includes only crashes reported to the Pennsylvania State Police. 

Source: PA Department of Transportation Crash Data hosted on WPRDC
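
A tally like the first row of Table 1 could be produced along these lines, assuming each crash record has already been matched to a blockgroup. This is a sketch and not the exact workflow we used; the record keys ("blockgroup", "pedestrians", "deaths", "major_injuries") and the sample values are hypothetical and do not come from the PennDOT data dictionary.

```python
def summarize_crashes(crashes, ratios, threshold=1.0):
    """Total pedestrian crashes in blockgroups whose ratio exceeds the threshold."""
    totals = {"crashes": 0, "pedestrians": 0, "deaths": 0, "major_injuries": 0}
    for crash in crashes:
        ratio = ratios.get(crash["blockgroup"])
        if ratio is None or ratio <= threshold:
            continue  # blockgroup has no ratio or is at or below the cutoff
        totals["crashes"] += 1
        totals["pedestrians"] += crash["pedestrians"]
        totals["deaths"] += crash["deaths"]
        totals["major_injuries"] += crash["major_injuries"]
    return totals


# Invented ratios and crash records for illustration only.
example_ratios = {"BG-A": 1.6, "BG-B": 0.4}
example_crashes = [
    {"blockgroup": "BG-A", "pedestrians": 1, "deaths": 0, "major_injuries": 1},
    {"blockgroup": "BG-A", "pedestrians": 2, "deaths": 1, "major_injuries": 0},
    {"blockgroup": "BG-B", "pedestrians": 1, "deaths": 0, "major_injuries": 0},
    {"blockgroup": "BG-C", "pedestrians": 1, "deaths": 0, "major_injuries": 0},
]

summary = summarize_crashes(example_crashes, example_ratios)
print(summary)
```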


Figure 3: Interactive Map of Sidewalk/Street Ratio with Locations of Crashes Involving Pedestrians

The ratio excludes alleyways, interstates, crosswalks, and trails. Calculated by WPRDC from SPC sidewalk data and Allegheny County Street Centerlines


How this can be used

Here are a few ways this data might be useful. It can be combined with other measures of public health and social determinants of health to start analyzing the impacts that sidewalk infrastructure can have on community dynamics and public health. The data can highlight opportunities to invest in pedestrian infrastructure, and a comparison of local government policies around sidewalk maintenance and development can also highlight the ways that some communities are incentivizing the development and maintenance of sidewalks.


Avenues for future work

We admit that this isn’t a perfect measure. Future improvements to this indicator could involve incorporating data about slope, sidewalk condition, and the presence of curb cuts, and performing some additional data cleanup and ground-truthing. For example, some communities have sidewalks that are impassable due to overgrowth, cars parked on the sidewalk, or poor pavement conditions, and including this in a measure of “walkability” could more accurately reflect the pedestrian experience. We also may have included some roads (like Route 28) that some may not want to include in a community’s calculation of “walkability.”
We like to show our work. We have included our street and sidewalk layers along with the output of our analysis as an open dataset, and welcome you to pick up where we left off. Frameworks like the Age Friendly Communities Evaluation Guide have more resources that can inspire your thinking. This framework incorporates sidewalk condition, perceived and real physical accessibility, presence and safety of crosswalks, rest spaces, and washroom access into its definition. These measures provide a framework for further exploration, but computing them requires additional data and time, two things we lacked.