Frequently Asked Questions
Fundamentals
What is open data?
Commonly cited principles hold that open data is a complete set of primary data made easily and permanently available in a timely fashion using electronic, machine-readable, open file formats. Cost should not pose a barrier to accessing information, and no unreasonable restrictions should limit accessibility, sharing, and re-use.
Data Foraging
How do I download and use data from your site?
We have a tutorial that shows you exactly how to download and open a dataset in your favorite spreadsheet software. If you get stuck, we encourage you to reach out by email or phone, or to attend one of our upcoming training or office-hour events.
There also may be a tool that enables you to use data without having to download it directly from our site. A full list of tools can be found on our web site.
What if I need data and you don’t have it?
We realize that there is a lot of data that isn’t available through the Regional Data Center. We’re always working to add more to these web sites, and in the meantime we want to help you get the data you need.
There are three primary ways you can request data:
- Make an informal data request through our web site: If there’s data you’d like to suggest be made available as open data, please let us know. We will share your request with our partners, and your suggestions help us develop our outreach and publication priorities. This request does not take the place of a formal Right to Know request.
- Contact us directly. For years, we’ve responded to requests for information, and are always happy to make a referral to the right local, state, and national sources. To make a data request, please give us a call or send an email to get the ball rolling.
- Formally request data directly from a government agency: Public agencies are required to respond to external data requests under Pennsylvania’s Right to Know Law (RTKL) and the U.S. Freedom of Information Act (FOIA). These laws have been an important tool used by journalists, activists, researchers, and many other people to obtain public information. There’s a lot to know about the process before making a request under RTKL (for state and local agencies) or FOIA (for federal agencies). Here are a few places to get more information about the RTKL and FOIA processes.
- The Pennsylvania Office of Open Records provides information about RTKL and the process for making a request.
- The Pennsylvania NewsMedia Association also provides information about RTKL and resources for those making a request.
- FOIA.gov provides information on FOIA, including details on how to make a request and statistics about FOIA requests.
- The National Archives Office of Government Information Services has additional FOIA resources available on their web site.
There is also a FOIA web site for making and managing FOIA requests of the federal government.
Data Sensemaking
How do I know what the fields in the data mean?
We encourage our publishers to prepare data dictionaries when publishing tabular datasets. Data dictionaries include field names and definitions of the fields, and sometimes include information on data types, field length, and other details about the records. These dictionaries can generally be found below the table view, where each field is listed with values such as "Column", "Label", and "Description".
One benefit of this form of data dictionary is that, when you look at the table view of the data on our web site, the value from the "Label" field will replace the field name (the "Column" value). Also, if there is a "Description" value (which should contain a definition of the field and any additional notes), there will be a blue "Information Symbol" (circle with an "i" in it) in the table header, next to the field label. If you hover your cursor over that Information Symbol, you will get a pop-up box containing the field's description.
In some cases, data dictionaries can also be found as separate downloadable files on the dataset page.
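If you have downloaded a data dictionary, the same idea of swapping field names for labels can be sketched in a few lines of Python. This is only an illustration: the dictionary entries below are invented, the "(i)" marker simply stands in for the Information Symbol, and the function name display_header is a hypothetical helper, not part of the web site.

```python
# Hypothetical data dictionary entries, using the "Column", "Label",
# and "Description" values described above. The contents are made up.
data_dictionary = [
    {"Column": "muni", "Label": "Municipality", "Description": "Name of the municipality."},
    {"Column": "yr", "Label": "Year", "Description": ""},
]

# Look-up tables from raw field name to label and to description.
labels = {entry["Column"]: entry["Label"] for entry in data_dictionary}
descriptions = {
    entry["Column"]: entry["Description"]
    for entry in data_dictionary
    if entry["Description"]
}


def display_header(column):
    """Return the label for a field, marked with "(i)" if it has a description."""
    label = labels.get(column, column)  # fall back to the raw field name
    return f"{label} (i)" if column in descriptions else label


headers = [display_header(column) for column in ["muni", "yr", "other"]]
print(headers)
```

A field that is missing from the dictionary (here, "other") keeps its original field name.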
What file formats are used in the open data portal?
The Regional Data Center’s open data portal can accommodate many different types of data formats. Data tables, text documents, geographic files, HTML/hyperlinks, archives, images, and even sound files may be found on the open data portal.
One of our preferred tabular data formats is comma-separated values, or CSV. We like CSVs because they are an open file format that is compatible with many software programs. Other common tabular formats found on the site include JSON (JavaScript Object Notation) and XLS (Excel).
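As a small illustration of how widely supported CSV is, the sketch below reads CSV text using only Python's standard library. The sample text, including the PARCELS field, is made up for the example; for a real download you would open the file itself.

```python
import csv
import io

# Made-up CSV text standing in for a downloaded file.
sample = "MUNICIPALITY,PARCELS\nPittsburgh,3\nWilkinsburg,5\n"

# For a real file, replace io.StringIO(sample) with
# open("your_file.csv", newline="") (the file name is a placeholder).
reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)  # each row becomes a dict keyed by the header row

print(reader.fieldnames)
print(rows[0]["MUNICIPALITY"])

# Values are read as text, so convert them before doing arithmetic.
total = sum(int(row["PARCELS"]) for row in rows)
print(total)
```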
Common geospatial data formats include ESRI shapefiles (all components in a .ZIP archive), GeoJSON, KML, and even links to ESRI REST endpoints on other servers.
Other image or document files on the site include PDF files, GIF, and JPEG formats. A full list of file types available on the Regional Data Center can be found on the open data portal’s side panel (desktop version), or under the “Filter Results” menu (mobile version).
What if I need more information about a particular dataset?
We believe context is critical to effectively use open data. We have been writing data guides for some of our most-used or complex datasets. Where data guides haven’t yet been produced, we encourage you to submit a question through the open data portal, or contact us or the data steward listed in the dataset’s metadata.
Data Use
This dataset is too big for me to work with. How can I download only part of it?
If the dataset has a tabular view, you can filter the Data Table down to a more manageable number of records by picking a category that contains the subset you want (e.g., use the MUNICIPALITY field to pick the records for one or two municipalities out of the entire county).
- Above the Data Table, click on the "Add Filter" link. (Data Tables on the landing page of a dataset may not have that link. In this case, find the corresponding data resource under "Data and Resources" and click through to find the filterable Data Table.)
- A "Select a field" dropdown appears above the "Add Filter" link. Select the field you want to filter on from this dropdown.
- A new field-values dropdown will be generated below the selected field. Select a value from the dropdown. For very large tables (millions of records), there may be some delay before the field values can be gathered and presented in the dropdown.
- The Data Table will update, with text to describe the filtered view (e.g., "Showing 1 to 10 of 22,692 entries (filtered from 4,302,279 total entries)").
- You can now sort and browse the filtered view.
- More importantly, you can now download the filtered view by clicking the Download button and then selecting your desired file format.
Note that you can add more filters. Adding more filter values for a given field will combine together all the results of the individual filters (that is, the filters will be ORed together, yielding more records). Adding new fields will further filter the results (the filters will be ANDed together, yielding fewer records).
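The way filters combine can be sketched in a few lines of Python. This is not how the web site itself is implemented; it is a minimal illustration of the OR-within-a-field, AND-across-fields behavior described above. The sample records and the YEAR field are made up, and matches is a hypothetical helper name.

```python
def matches(record, filters):
    """Return True if the record passes the filter for every field.

    `filters` maps a field name to the set of accepted values.
    Values for one field are ORed; different fields are ANDed.
    """
    return all(record.get(field) in accepted for field, accepted in filters.items())


# Made-up sample records, for illustration only.
records = [
    {"MUNICIPALITY": "Pittsburgh", "YEAR": "2016"},
    {"MUNICIPALITY": "Pittsburgh", "YEAR": "2017"},
    {"MUNICIPALITY": "Wilkinsburg", "YEAR": "2017"},
    {"MUNICIPALITY": "McKeesport", "YEAR": "2017"},
]

# One value for one field.
one_value = [r for r in records if matches(r, {"MUNICIPALITY": {"Pittsburgh"}})]

# A second value for the same field is ORed in, yielding more records.
two_values = [
    r for r in records if matches(r, {"MUNICIPALITY": {"Pittsburgh", "Wilkinsburg"}})
]

# A filter on a second field is ANDed in, yielding fewer records.
two_fields = [
    r
    for r in records
    if matches(r, {"MUNICIPALITY": {"Pittsburgh", "Wilkinsburg"}, "YEAR": {"2017"}})
]

print(len(one_value), len(two_values), len(two_fields))
```

With these sample records, the three filtered views contain 2, 3, and 2 records respectively.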
What if I notice a problem or error in a dataset?
Feedback from users is always welcome, especially when it can result in improvements to data quality. The easiest way to let us know about an issue is to post a comment on the dataset page on our web site. We monitor comments and will forward any relevant details to the data steward or other appropriate contact. You are also welcome to share details with us in an email, or reach out directly to the data steward listed in the dataset’s metadata.
Data Ethics
How do you handle privacy issues in data?
We take privacy seriously here at the Regional Data Center, and weigh the benefit of sharing data against the harm that could be caused to an individual if the data is shared. We are reluctant to publish personally identifiable information, and will not publish data protected by legislation such as HIPAA and FERPA.
We include materials on privacy in our new publisher trainings and encourage organizations to develop internal privacy review processes before sharing data. We will work with our publishers to aggregate some sensitive data to enable publication, and are also available for consultations, or can arrange a conversation with outside experts. We also provide a final privacy review for each dataset at initial publication.
The Berkman Klein Center for Internet and Society at Harvard University published a helpful Open Data Privacy Playbook in February 2017. It is an excellent resource for learning more about privacy in open data.
You can also review our privacy policy on our web site to see how we handle things such as email and IP addresses in our work.
Learning More
Where can I learn new skills?
There are many places to turn when looking to build your data skills. Our Data 101 classes, developed in collaboration with the Carnegie Library of Pittsburgh, provide a foundation for beginning data users, and teach basic data and statistical literacy concepts using paper and group activities. Trainings are offered through the Regional Data Center as well as through our partners, including the Carnegie Library and many other organizations. We post a wide range of training opportunities in our events calendar, and encourage data users to view the tutorials found on our web site.
How do I get help with data?
Just let us know. We reply to phone calls and emails. You’re also welcome to stop by one of our office hours or other events, or schedule a consultation. Librarians are also a great place to turn for help in your own neighborhood, and our partners at the Carnegie Library of Pittsburgh and the Allegheny County Library Association have also encouraged their librarians to provide data services.
Beyond the Data Portal
What if I have an idea for a tool that uses data?
Let us know by email or phone, or attend one of our data user group meetings or office hours to share your idea in person. We can help you frame your idea and give you suggestions on how to take the next step. We also highly encourage you to join a community of other data users at a civic organization such as Code for Pittsburgh, or to participate in a hackathon, data dive, or other type of event.
Can you come to an event and talk more about data?
Yes – we regularly are asked to come to events to talk more about open data and share the story of the Regional Data Center. Please let us know if you'd like us to speak at one of your events, join you for a meeting, deliver a guest lecture in your class, or hold office hours in your community.
I’m from another city and want to learn more about the Regional Data Center.
We are always happy to share more about our project with people from other cities, and welcome the opportunity to talk with people interested in our model. We are happy to schedule a call to talk more about our work, and suggest you first read through the information we’ve assembled about the project in order to make the most out of your conversation with us.