What is open data?
Commonly cited principles hold that open data is a complete set of primary data made easily and permanently available in a timely fashion using electronic, machine-readable, open file formats. Cost should not pose a barrier to accessing information, and no unreasonable restrictions should limit accessibility, sharing, and re-use.
How do I know what the fields in the data mean?
We encourage our publishers to prepare data dictionaries when publishing tabular datasets. Data dictionaries include field names and narrative descriptions, and sometimes include information on data types, field length, and other details about the file. These dictionaries can be found as a resource on dataset pages.
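As a minimal sketch of how a data dictionary can be used alongside a tabular dataset, the Python example below pairs each column in a file with its description. The column names (`field_name`, `description`, `data_type`) and the sample rows are assumptions made up for illustration; actual dictionaries on dataset pages may be laid out differently.

```python
import csv
import io

# Hypothetical data dictionary: one row per field in the dataset.
DICTIONARY_CSV = """field_name,description,data_type
pin,Parcel identification number,text
sale_date,Date the sale was recorded,date
price,Sale price in US dollars,integer
"""

# Hypothetical dataset extract whose columns the dictionary describes.
DATASET_CSV = """pin,sale_date,price,notes
0001-A,2017-03-01,125000,
0002-B,2017-03-04,98000,corner lot
"""


def load_dictionary(text):
    """Map each field name to its full dictionary row."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["field_name"]: row for row in reader}


def describe_fields(dataset_text, dictionary):
    """Return (field, description) pairs for every column in the dataset header."""
    header = next(csv.reader(io.StringIO(dataset_text)))
    described = []
    for field in header:
        entry = dictionary.get(field)
        description = entry["description"] if entry else "(not documented)"
        described.append((field, description))
    return described


dictionary = load_dictionary(DICTIONARY_CSV)
for field, description in describe_fields(DATASET_CSV, dictionary):
    print(f"{field}: {description}")
```

Columns that do not appear in the dictionary are flagged rather than skipped, which makes gaps in the documentation easy to spot.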
Where can I learn new skills?
There are many places to turn when looking to build your data skills. Our Data 101 classes, developed in collaboration with the Carnegie Library of Pittsburgh, provide a foundation for beginning data users and teach basic data and statistical literacy concepts using paper and group activities. Trainings are also offered through the Regional Data Center, our partners at the Carnegie Library, and many other organizations. We post a wide range of training opportunities in our newsletter and events calendar, and encourage data users to view the tutorials on our website.
How do I get help with data?
Just let us know. We reply to phone calls and emails. You’re also welcome to stop by one of our office hours or other events, or schedule a consultation. Librarians are a great resource for help in your own neighborhood, and our partners at the Carnegie Library of Pittsburgh and the Allegheny County Library Association have encouraged their librarians to provide data services.
What file formats are used in the open data portal?
The Regional Data Center’s open data portal can accommodate many different data formats. Data tables, text documents, geographic files, HTML pages and hyperlinks, archives, images, and even sound files may be found on the open data portal.
Common geospatial data formats include ESRI shapefiles (all components in a .zip archive), GeoJSON, KML, and even links to ESRI REST endpoints on other servers.
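GeoJSON is plain JSON text, so it can be inspected without specialized mapping software. The sketch below uses only Python's standard library to pull point locations out of a small feature collection. The sample features and the `name` property are invented for illustration; GeoJSON stores coordinates in longitude, latitude order.

```python
import json

# Hypothetical GeoJSON content; a real file from the portal will have
# its own properties and may contain lines or polygons instead of points.
GEOJSON_TEXT = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-79.9959, 40.4406]},
      "properties": {"name": "Sample location A"}
    },
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-79.95, 40.444]},
      "properties": {"name": "Sample location B"}
    }
  ]
}
"""


def summarize_points(text):
    """Return (name, latitude, longitude) for every Point feature."""
    collection = json.loads(text)
    rows = []
    for feature in collection["features"]:
        geometry = feature.get("geometry")
        # Skip features with no geometry or with a non-point geometry.
        if not geometry or geometry["type"] != "Point":
            continue
        longitude, latitude = geometry["coordinates"][:2]
        properties = feature.get("properties") or {}
        rows.append((properties.get("name"), latitude, longitude))
    return rows


for name, latitude, longitude in summarize_points(GEOJSON_TEXT):
    print(name, latitude, longitude)
```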
Other image and document files on the site include PDF, GIF, and JPEG files. A full list of file types available through the Regional Data Center can be found on the open data portal’s side panel (desktop version) or under the “Filter Results” menu (mobile version).
How do I download and use data from your site?
We have a tutorial that shows you exactly how to download and open a dataset in your favorite spreadsheet software. If you get stuck, we encourage you to reach out by email or phone, or to attend one of our upcoming training or office-hour events.
There may also be a tool that enables you to use data without having to download it directly from our site. A full list of tools can be found on our website.
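For readers who prefer a script to a spreadsheet, a downloaded CSV file can also be opened programmatically. The sketch below is one possible approach using Python's standard library. The file name `sample_dataset.csv` and its columns are hypothetical; the example writes a small stand-in file to a temporary folder so that it runs without a real download.

```python
import csv
import tempfile
from pathlib import Path

# Stand-in for a file downloaded from the portal (hypothetical columns).
SAMPLE = "id,neighborhood,requests\n1,Area A,12\n2,Area B,7\n"


def read_rows(path):
    """Read a CSV file into a list of dictionaries keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "sample_dataset.csv"
    path.write_text(SAMPLE, encoding="utf-8")
    rows = read_rows(path)

# Values arrive as text, so convert before doing arithmetic.
total = sum(int(row["requests"]) for row in rows)
print(len(rows), "rows; columns:", list(rows[0].keys()))
print("total requests:", total)
```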
What if I need data and you don’t have it?
We realize that there is a lot of data that isn’t available through the Regional Data Center or the Southwestern Pennsylvania Community Profiles website. We’re always working to add more to these websites, and in the meantime we want to help you get the data you need.
There are three primary ways you can request data:
- Make an informal data request through our website: If there’s data you’d like to suggest be made available as open data, please let us know. We will share your request with our partners, and your suggestions help us develop our outreach and publication priorities. This request does not take the place of a formal Right to Know request.
- Contact us directly: For years, we’ve responded to requests for information, and we are always happy to refer you to the right local, state, or national sources. To make a data request, please give us a call or send an email to get the ball rolling.
- Formally request data directly from a government agency: Public agencies are required to respond to external data requests under Pennsylvania’s Right to Know Law (RTKL) and the U.S. Freedom of Information Act (FOIA). These laws have been an important tool for journalists, activists, researchers, and many other people seeking public information. There’s a lot to know about the process before making a request under RTKL (for state and local agencies) or FOIA (for Federal requests). Here are a few places to get more information about the RTKL and FOIA processes:
  - The Pennsylvania Office of Open Records provides information about RTKL and the process for making a request.
  - The Pennsylvania NewsMedia Association also provides information about RTKL and resources for those making a request.
  - FOIA.gov provides information on FOIA, including details on how to make a request and statistics about FOIA requests.
  - The National Archives Office of Government Information Services has additional FOIA resources available on its website.
A new openFOIA website is also being developed to manage Federal FOIA requests.
What if I have an idea for a tool that uses data?
Let us know by email or phone, or attend one of our data user group meetings or office hours to share your idea in person. We can help you frame your idea and give you suggestions on how to take the next step. We also highly encourage you to join a community of other data users at a civic organization such as Code for Pittsburgh, or to participate in a hackathon, data dive, or other type of event.
Can you come to an event and talk more about data?
Yes. We are regularly asked to come to events to talk more about open data and share the story of the Regional Data Center. Please let us know if we can speak at one of your events, join you for a meeting, deliver a guest lecture in your class, or hold office hours in your community.
How do you handle privacy issues in data?
We take privacy seriously here at the Regional Data Center, and weigh the benefit of sharing data against the harm that could be caused to individuals if the data is shared. We are reluctant to publish personally identifiable information, and will not publish data protected by legislation such as HIPAA and FERPA.
We include materials on privacy in our new publisher trainings and encourage organizations to develop internal privacy review processes before sharing data. We will work with our publishers to aggregate some sensitive data to enable publication. We are also available for consultations and can arrange a conversation with outside experts. We provide a final privacy review for each dataset at initial publication.
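To illustrate what aggregation can look like, the sketch below rolls record-level rows up to counts by area and withholds counts below a minimum size. The sample records, the `neighborhood` field, and the threshold of 5 are all assumptions chosen for the example; this is one common approach, not a description of the Regional Data Center's own review process.

```python
from collections import Counter

# Hypothetical record-level data: one row per individual record.
RECORDS = [{"neighborhood": "Area A"}] * 6 + [{"neighborhood": "Area B"}] * 2

# Assumed minimum group size; smaller counts are withheld from the output.
MIN_COUNT = 5


def aggregate_counts(records, key, min_count=MIN_COUNT):
    """Count records per group, replacing small counts with None."""
    counts = Counter(record[key] for record in records)
    table = []
    for group, count in sorted(counts.items()):
        published = count if count >= min_count else None
        table.append({"group": group, "count": published})
    return table


for row in aggregate_counts(RECORDS, "neighborhood"):
    shown = row["count"] if row["count"] is not None else "suppressed"
    print(row["group"], shown)
```

Publishing the group with its count withheld, rather than dropping the row, keeps the table complete while avoiding the release of very small numbers that could point to individuals.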
The Berkman Klein Center for Internet and Society at Harvard University published a helpful Open Data Privacy Playbook in February 2017. It is an excellent resource for learning more about privacy in open data.
Which datasets are the most popular?
Our performance management dashboard provides an easy way to see which datasets are most used. The “Top Tens” tab provides a quick look at the ten most popular datasets over the past 30 days, while the “Dataset Stats” tab allows users to sort the datasets by users, downloads, or pageviews, view trends over time, search by keyword or publisher, and download usage statistics.
I’m from another city and want to learn more about the Regional Data Center.
We are always happy to share more about our project with people from other cities, and welcome the opportunity to talk with anyone interested in our model. We can schedule a call to talk more about our work, and suggest you first read through the information we’ve assembled about the project in order to make the most of your conversation with us.