Data Literacy for Data Stewards: Data Management Workshop

by Bob Gradeck

January 17, 2023

Our January 13 workshop on data management plans was geared toward getting participants comfortable and confident asking up-front questions about how data would be created, collected, used, documented, organized, stored, shared, and archived. Data management plans have been a requirement for federal research grants, and academic libraries have developed tools and expertise in documenting important aspects of how research projects handle data throughout the data life cycle. Civic data projects can benefit from community conversations around many of the same discussion prompts used to develop research data management plans. 

In our data management workshop, we talked about the importance of being intentional about data stewardship. Librarians at the University of Pittsburgh have talked about how the choices we make about data can go a long way toward supporting discoverability, long-term sustainability, security, provenance, usability, interoperability, and efficiency. A data management plan is just one of several structures and frameworks that can help data stewards be better custodians of data. 

We then asked participants to begin writing a data management plan for a community garden. The plan would have to account for data related to: 

  • volunteer recruitment and management;  
  • distribution of crops between volunteers, the community food bank, and those sold at the farmers market;  
  • fundraising, revenue and expenses; and  
  • documenting growing operations, which include planting schedules, soil testing, and rainfall. 

Our activity asked participants to select one type of data that is important to managing the garden, and have a conversation using the following questions and prompts that are often a part of a data management planning process: 

Questions that describe the data 

  • Briefly describe the data. What information is included in this dataset (fields, variables, etc.)? 
  • Why does the data exist? What purpose does it serve? 
  • How is it collected and used? Include information about frequency, people, tools and instruments. 
  • Who is responsible for capturing this data? What training is provided? 
  • Who was consulted in the design of this dataset? 
  • How will you check your work and make sure you’re following the plan? 

Questions about documentation, organization, and storage 

  • Are there any standards that relate to this data? 
  • What documentation will be created to make the data understandable to others (metadata)? 
  • What is the file format for this data?  
  • What file naming conventions are used? 
  • Where will data be stored? How will it be backed up? 
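
As one illustration of the file naming and documentation prompts above, here is a minimal sketch in Python for the garden's soil testing data. The naming pattern, the function names, and the metadata fields are all assumptions made up for this example. They were not produced in the workshop, and a real plan would settle on its own conventions.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<dataset>_<YYYY-MM-DD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^[a-z0-9-]+_[a-z0-9-]+_\d{4}-\d{2}-\d{2}_v\d{2}\.[a-z0-9]+$"
)


def slug(text):
    """Lowercase the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, dataset, collected_on, version, ext="csv"):
    """Build a file name that follows the hypothetical convention."""
    return (
        f"{slug(project)}_{slug(dataset)}_"
        f"{collected_on.isoformat()}_v{version:02d}.{ext}"
    )


def follows_convention(filename):
    """Return True if an existing file name matches the convention."""
    return NAME_PATTERN.match(filename) is not None


def build_metadata(filename, description, fields, steward):
    """Collect basic documentation (metadata) to keep alongside the file."""
    return {
        "file": filename,
        "description": description,
        "fields": fields,
        "steward": steward,
    }


# Illustrative values only; the field names and steward role are invented.
name = build_filename("Community Garden", "Soil Testing", date(2023, 1, 13), 1)
record = build_metadata(
    name,
    "Soil test results by garden bed",
    ["bed_id", "sample_date", "ph"],
    "Garden data steward",
)
```

A check like follows_convention could be run on a shared folder from time to time, which is one small way of answering the earlier prompt about making sure the plan is being followed.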

Questions about access, sharing, and re-use 

  • Will this data be shared? How and with whom? 
  • Describe any privacy, ethical, or confidentiality concerns related to this data. 
  • Is there a plan to protect and secure this data as needed? 
  • Will this data be made publicly available? Where? How will it be licensed? 
  • Who owns this data? What laws may relate to this data? 

Questions about archiving 

  • Which data has long-term value?  
  • How will it be archived and preserved? 
  • How long will the data be kept? 

Participants also suggested that data management plans could add questions to help institutionalize equitable data practices. Some of the suggestions included: 

  • How have we attempted to incorporate community ownership of this data into the data system? 
  • How will you communicate what the data will/should be used for? 
  • How are the identities of people most affected by use of the data reflected in the identities of the data team? 
  • Do the ways that data is collected, managed, and used exclude anyone from participating? 

Participants felt strongly that data management plans are best if written before any data is collected or any systems are built, and that plans should be regularly updated. They would be eager to engage in more of these conversations about the design of public data systems.  

They also saw parallels between data systems and community gardens, in that both need a lot of planning, maintenance, and care, and often rely on “hidden labor” in order to be successful.  

In the penultimate workshop of our initial cohort this Friday, we’ll learn about the importance of designing our procurement systems to reinforce equitable data practices. 

Additional Resources