How We Published Data about Pittsburgh Parking Transactions

by David Walker

July 2, 2019

We at the Western Pennsylvania Regional Data Center have been working in partnership with the Pittsburgh Parking Authority and the City of Pittsburgh to shed light on how people in Pittsburgh use on-street meters and surface parking lots operated by the Authority. Developing a better understanding of how people park can help support efforts by the City and Authority to set rates based on demand. Having access to data can also help the Authority share information on revenue generation with City leadership, and develop a better understanding of mobility in Pittsburgh.

Our initial meetings identified an opportunity to automate the generation of parking reports created for City Council members and the Mayor by the Authority. These reports had been manually generated in a time-consuming process that involved running queries on the management dashboard of Cale, the Authority’s payment system provider. The results of these queries were pasted by staff into separate tabs of a spreadsheet. The process was replicated for each of the City’s approximately 60 parking zones and lots. The process of compiling these reports took approximately 2-3 hours every time they were created.

Bringing the data-handling expertise of the Regional Data Center to bear on the problem, we were able to go beyond quarterly reports and develop a data dashboard that displays daily data updates, built on top of data extracts from the Authority’s parking kiosks and its separate mobile payment system. Users of the tool can view the revenue generated by an on-street parking zone or surface lot over a user-specified date range, from a single day to several years, aggregated by time of day. The story of the work that went into producing the extract is an illuminating case study for anyone interested in what it takes to produce data to support computational mobility initiatives.

On a typical weekday in Pittsburgh, thousands of people park their cars in one of the lots or on-street parking spaces managed by the Parking Authority, paying at one of more than 1,000 kiosks or through the increasingly popular pay-by-phone system. These systems generate about 25,000 transactions per day. Depending on whether people pay at the parking meters or through their phones, the transaction data passes through different systems that operate according to different rules, and it winds up in somewhat different formats. Each transaction is represented by a record consisting of more than 30 database fields containing information about each element of the transaction.
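To illustrate what reconciling two payment channels can involve, here is a minimal Python sketch that maps records from each system onto one shared shape. Every field name in it (`Zone`, `Start`, `Amount`, `zone_name`, `amount_cents`, and so on) and the dollars-versus-cents difference are hypothetical, chosen only for illustration; they are not the actual fields or rules of the Authority’s systems.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One parking payment in a common shape, whichever system produced it."""
    zone: str
    start: datetime
    end: datetime
    amount: float  # dollars
    source: str    # "meter" or "mobile"


def from_meter(record: dict) -> Transaction:
    # Hypothetical meter layout: ISO timestamps, amount as a dollar string.
    return Transaction(
        zone=record["Zone"],
        start=datetime.fromisoformat(record["Start"]),
        end=datetime.fromisoformat(record["End"]),
        amount=float(record["Amount"]),
        source="meter",
    )


def from_mobile(record: dict) -> Transaction:
    # Hypothetical mobile layout: different keys, amount as integer cents.
    return Transaction(
        zone=record["zone_name"],
        start=datetime.fromisoformat(record["session_start"]),
        end=datetime.fromisoformat(record["session_end"]),
        amount=record["amount_cents"] / 100,
        source="mobile",
    )


meter_tx = from_meter({
    "Zone": "Zone A",
    "Start": "2019-07-01T09:00:00",
    "End": "2019-07-01T10:00:00",
    "Amount": "3.00",
})
mobile_tx = from_mobile({
    "zone_name": "Zone A",
    "session_start": "2019-07-01T09:30:00",
    "session_end": "2019-07-01T10:30:00",
    "amount_cents": 300,
})
```

Once both kinds of record share one shape, downstream steps such as aggregation no longer need to know which system a payment came from.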

When faced with the ambiguity of many timestamps with names like “PurchaseDate”, “StartDate”, “PayIntervalStart”, and “DateCreated” and with no documentation about what each field meant or how each payment system worked, we did what any good data detective would do: we went to the nearest meter, paid for the smallest amount of time we could, then ran back to our office to pull down the record of the parking transaction and scrutinize all the field values. Several trips to the parking meter later, we had our first set of clues in an epic journey of discovery that would span months of effort.

A system as complex as the parking system tends to result in knowledge of the system’s operations being spread out among different people, each of whom may use their own naming conventions and approaches for handling and storing the records that they manage or use. By building relationships with Authority and software-vendor staff, and asking a lot of questions, we were finally able to put the whole picture together (joining transaction data from APIs with operational data from Excel spreadsheets and hand-written records) and apply enough structure and standardization to the extracted datasets to make them usable. With that in place, we were able to complete our original objective of synthesizing data from the generated datasets to create a dashboard capable of presenting an aggregate view of parking activity by parking zone, for a user-defined date range.
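The kind of aggregate view described above, revenue by zone over a user-defined date range and broken out by time of day, can be sketched in a few lines of Python. This is a simplified illustration rather than the dashboard’s actual code: the function name, the `(zone, start, amount)` tuple layout, and the sample values are all made up, and each payment is attributed entirely to the hour in which it started.

```python
from collections import defaultdict
from datetime import date, datetime


def revenue_by_zone_and_hour(transactions, first_day, last_day):
    """Sum payments per (zone, hour of day) for an inclusive date range.

    `transactions` is an iterable of (zone, start, amount) tuples, a
    hypothetical layout used only for this sketch.
    """
    totals = defaultdict(float)
    for zone, start, amount in transactions:
        if first_day <= start.date() <= last_day:
            totals[(zone, start.hour)] += amount
    return dict(totals)


# Made-up sample payments, not real parking data.
sample = [
    ("Zone A", datetime(2019, 7, 1, 9, 15), 2.00),
    ("Zone A", datetime(2019, 7, 1, 9, 40), 1.50),
    ("Zone B", datetime(2019, 7, 1, 14, 5), 3.00),
    ("Zone A", datetime(2019, 7, 3, 9, 0), 4.00),  # outside the range below
]

result = revenue_by_zone_and_hour(sample, date(2019, 7, 1), date(2019, 7, 2))
# result == {("Zone A", 9): 3.5, ("Zone B", 14): 3.0}
```

A fuller version might split a long parking session’s payment across each hour it covers; the sketch keeps the simpler rule to stay short.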

Our collaboration with the Authority has had the added benefit of building an internal data culture within the organization. They’re asking questions that can be answered by their data and coming up with ideas for follow-on work, including dynamic pricing, using parking data to inform enforcement, understanding capacity, developing a more accurate picture of parking occupancy on a granular level, generating real-time estimates of how full different zones and lots are, and developing a mobile app that uses this data to help direct motorists to areas where more parking is available. We also anticipate that people outside the Authority will find uses for the data, such as investigating whether parking patterns have changed in response to the use of ride-sharing services, examining parking-activity trends in a given neighborhood (e.g., to study business-district change), investigating the seasonality of parking or impacts of weather, and using pay-by-phone payments as a proxy to characterize smartphone penetration by neighborhood over time.

This collaboration also proved to be a great learning opportunity for the Regional Data Center. Since the project was established in 2015, we have learned that many public-sector agencies like the Parking Authority have a need for internal data management, analytics, and software development capacity. Many of the organizations we talk with are interested in sharing data as open data, but need help to prepare it for publication. Developing long-term relationships with public-sector agencies and nonprofit organizations will allow us to help our partners unlock the potential in their data, and cover the costs of establishing, maintaining, and hosting their data as open data.

We’d like to thank the staff of the Pittsburgh Parking Authority and Cale for help in collecting, working with, and understanding parking-system data, as well as our partners at the City’s Department of Innovation and Performance for collaboration, which was especially important in the early phases of this project. This work was funded as a demonstration project through our initial startup project funding, and we’d also like to thank our financial contributors in the philanthropic community and the University of Pittsburgh.