Who can publish data through the Regional Data Center?
Any nonprofit organization, public-sector agency, public authority, or academic institution is welcome to share data through the Regional Data Center’s Open Data Portal.
What are the benefits of publishing data?
Organizations share open data for many different reasons. In some cases, the organization is obligated to share data as the result of a law, ordinance, mandate, or directive. In other cases, sharing is voluntary: organizations are looking to increase trust and transparency, enhance public participation, improve collaboration, or inform key community issues. One often-overlooked benefit is enhanced efficiency: many of our publishers share data through the open data portal to minimize the number of external data requests that staff must manage.
How can I become a publisher?
Before a new publisher can share information through the Regional Data Center, the following steps must be completed.
- Return an executed Data Deposit Agreement
  - To share data with the Regional Data Center, the publisher must complete and return a Data Deposit Agreement. The person signing the Agreement must have the authority to enter the publishing organization into an agreement with the University of Pittsburgh; this person is often the board chair, executive director, or an elected official. Electronic submission is encouraged: a scanned copy of the completed agreement can be emailed to the Regional Data Center as an attachment.
- New Publisher Training or Consultation
  - Participate in a training or consultation. Training is usually held on-site at the publisher’s offices and lasts approximately 90 minutes.
  - In this training, new publishers will learn how to:
    - determine publishing priorities
    - protect sensitive information, ensuring it is not shared through the open data portal
    - prepare data for publication
    - load data to the Regional Data Center Website
    - create metadata documentation
- Share a copy of the organization’s logo and a brief description for use on the Website
Does it cost anything to publish data?
We do not want cost to be a barrier to sharing data. For that reason, we do not charge fees to our data publishers. We do encourage our partners to become stewards of the Regional Data Center, and hope they will support us in our fundraising efforts.
What type of data do you accept?
The Regional Data Center can accept information in a number of different file formats. We prefer open file formats, such as CSV for tabular data.
Not all data can be open data. Our Data Deposit Agreement contains a detailed listing of the types of information that we will not publish, including cases where:
- The information is exempt from disclosure or the information is prohibited from being disclosed under state and federal laws and regulations;
- The information is covered by a contractual non-disclosure obligation;
- The information is covered by confidentiality and fiduciary obligations; or
- The information is private, proprietary or privileged.
We take privacy seriously. Even if information does not meet the criteria listed above, we will weigh its value to users against the harm that sharing it may cause, and then decide whether to publish.
Even if data is too sensitive to publish as-is, we are happy to work with our publishers to aggregate or de-identify it so that it can be released as open data.
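As a rough illustration of what aggregating sensitive data can look like, the sketch below turns record-level data into counts per neighborhood and hides counts for very small groups. It is only an example under stated assumptions, not the Regional Data Center's actual procedure: the field names, the sample records, the `aggregate_counts` helper, and the threshold of 5 are all hypothetical.

```python
from collections import Counter

SUPPRESSION_THRESHOLD = 5  # hypothetical cutoff; choose one that fits your data


def aggregate_counts(records, group_field, threshold=SUPPRESSION_THRESHOLD):
    """Count records per group, dropping every other field.

    Groups with fewer than `threshold` records get a count of None so that
    very small groups cannot be used to single out an individual.
    """
    counts = Counter(record[group_field] for record in records)
    return [
        {group_field: group, "count": n if n >= threshold else None}
        for group, n in sorted(counts.items())
    ]


# Hypothetical record-level data with a directly identifying "name" field.
records = (
    [{"name": f"Client {i}", "neighborhood": "Oakland"} for i in range(6)]
    + [{"name": f"Client {i}", "neighborhood": "Shadyside"} for i in range(6, 8)]
)

for row in aggregate_counts(records, "neighborhood"):
    print(row)
```

Only the grouping field and a count survive in the result, so names never reach the published file. Whether a given threshold offers enough protection depends on the dataset and should be decided case by case.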
What data should I share?
Some organizations have had a tough time deciding what type of data to share. Looking at how others prioritized their data releases can help you in your decision. Here are a few ideas to get you started:
Examine data requests
A great place to start is to look at a list of the data requests your organization has recently received. These may consist of both formal “Right to Know Law” requests and informal requests from external partners and residents. Proactively sharing data can lead to enhanced productivity and greater transparency. Staff will spend less time posting frequently requested data to the data portal once than they would sharing it repeatedly by request. The Regional Data Center’s open data portal contains a data request form allowing anyone to submit an informal data request.
Look at data being shared internally
It’s very likely that people in your organization are already sharing data with each other, often through highly inefficient methods (such as e-mail attachments). One of the benefits of an open data portal is that it allows for efficient sharing of information. If this information can be shared publicly, loading it to the open data portal will break down information “silos,” allowing for more efficient data sharing and access by all members of the organization.
Support organizational priorities
Data can support internal priorities of your organization. If a dataset can support a key business process, facilitate collaboration, or inform important decisions, it is probably a good candidate for release as open data. If your organization measures its performance through a series of indicators, you may also want to publish this information for others to see.
Help address community priorities
Your data may also have value to others in your community. Information from your organization can be essential to improving the lives of your neighbors and informing the work of your community partners. For example, property information shared by local governments often helps community development organizations understand market conditions and target programs. If your data can support the work of others in your community, consider sharing it with them. As part of this process, it can be helpful to ask others for recommendations on data they’d like your organization to share.
Look at what other organizations share
Open data programs have been in place around the U.S. for nearly a decade. Organizations in other communities, or counterparts in Western Pennsylvania, can provide you with inspiration. If there’s an organization similar to yours that is effectively using information in its work, see if there’s anything you can emulate. Imitation is the sincerest form of flattery.
Look at legal obligations
Some organizations are required to share information with the public. If this is the case for your organization, an open data portal provides an easy way to share that information in a form that others can readily use.
How frequently should the data we share be updated?
Update frequency is an important consideration after deciding what data to share. Factors that often go into this decision include the difficulty or time required to prepare the data update and the degree to which the data informs important organizational or community initiatives and critical business processes. Publishers often take an incremental approach to updating data, assessing how the data is being used before determining how often to refresh the information on the data portal. However your organization decides to proceed, we encourage you to edit the “update frequency” in the metadata to let users know when to expect an update.
Organizations that are experienced in open data often develop a publishing calendar. If you’d like to stay on top of your open data publishing, a calendar can help you organize your publishing efforts. Sharing your open data calendar can also be a way to let your data users know when to expect a data update.
If feasible, the Regional Data Center will work with publishing partners to automate data updates through an “Extract, Transform, Load” (ETL) process. ETL processes are in place for several datasets on the portal, and allow for near real-time data updates. If you think one of your organization’s datasets is a good ETL candidate, please let us know.
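For readers unfamiliar with the term, the sketch below shows the three ETL stages in miniature: it reads rows from a source export, cleans them, and writes an open-format CSV. It is an illustration only and does not describe the Regional Data Center's actual pipelines. The column names, the `SENSITIVE_FIELDS` set, the sample data, and the three function names are hypothetical, and the load step simply writes to a file-like object rather than to the portal.

```python
import csv
import io
from datetime import datetime

SENSITIVE_FIELDS = {"caller_name"}  # hypothetical columns that must not be published


def extract(source_text):
    """Extract: read rows from a CSV export of the source system."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: drop sensitive columns, trim whitespace, use ISO 8601 dates."""
    cleaned = []
    for row in rows:
        out = {k: v.strip() for k, v in row.items() if k not in SENSITIVE_FIELDS}
        out["opened"] = datetime.strptime(out["opened"], "%m/%d/%Y").date().isoformat()
        cleaned.append(out)
    return cleaned


def load(rows, destination):
    """Load: write the cleaned rows as CSV to a file-like destination."""
    if not rows:
        return
    writer = csv.DictWriter(destination, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical export from an internal system.
source = (
    "request_id,opened,caller_name,type\n"
    "1,03/05/2024,Jane Doe, Pothole\n"
    "2,03/06/2024,John Roe,Graffiti \n"
)

output = io.StringIO()
load(transform(extract(source)), output)
print(output.getvalue())
```

In an automated setup, a script along these lines would run on a schedule, with the extract step reading from the live source system and the load step handing the result to the portal.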
The condition of our data is terrible and an organizational embarrassment. Why should I share it?
Don’t worry if the quality of your organization’s data is not perfect. Users often provide feedback that will result in improvements to the quality of your data. We will work with you to make sure your data doesn’t contain sensitive information before it’s published. We’ll also help you document the condition of your data so that users know exactly what they’re working with, warts and all.
Do you work with publishers outside of Allegheny County?
Yes. We are actively looking for partners willing to help us understand what capacity and partnerships are needed to achieve positive outcomes in communities far from our offices on the University of Pittsburgh’s campus. If you want to work with us, we’d love to talk with you.
What is metadata?
Metadata is a structured framework for documenting data. Some people like to say it’s data about data. It’s essential if anyone hopes to find and use your data. Metadata appears with every dataset on the open data portal. We invite you to learn more about our metadata standard and how it was developed.
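To make "data about data" concrete, here is a minimal sketch of a metadata record for a single dataset, together with a check that the basic fields are filled in. The field names, the `REQUIRED_FIELDS` list, and the `missing_fields` helper are hypothetical and chosen for illustration; they are not the Regional Data Center's metadata standard.

```python
import json

# Hypothetical minimum set of fields a publisher might be asked to complete.
REQUIRED_FIELDS = ("title", "description", "publisher", "update_frequency")


def missing_fields(metadata):
    """Return the required fields that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical metadata record describing one dataset.
metadata = {
    "title": "Example Service Requests",
    "description": "One row per service request received by the organization.",
    "publisher": "Example Organization",
    "update_frequency": "Monthly",
    "fields": [
        {"name": "request_id", "description": "Unique identifier for the request"},
        {"name": "opened", "description": "Date the request was opened (ISO 8601)"},
    ],
}

print(json.dumps(metadata, indent=2))
print("Missing fields:", missing_fields(metadata))
```

A record like this tells a prospective user what the dataset contains, who publishes it, and how often to expect updates, before they open the file itself.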
Where can I turn for more information on open data policies or legislation?
Open data policies are a great way for communities to institutionalize open data through a legislative framework. Our friends at the Sunlight Foundation have developed expertise in helping local communities develop open data policies. We encourage you to check out all of the resources available on their Website, which include sample policies, guidelines, and tools to help you craft and get feedback on policies of your own. If you’d like to speak with someone from Sunlight, feel free to reach out directly; we would also be happy to make an introduction.