On Friday, January 6, we held the ninth workshop in the Data Literacy for Data Stewards series.
Our objectives for this workshop included helping participants understand the risks and benefits that can come from the use of public algorithms and data-collection devices. We also wanted them to practice asking critical questions about public algorithms and to learn more about the Pittsburgh Task Force on Public Algorithms.
In a definition adopted by the Task Force, a “public algorithmic system” means “any system, software, or process that uses computation, including those derived from machine learning or other data-processing or artificial intelligence (AI) techniques, to aid or replace government decisions, judgments, and/or policy implementations that impact opportunities, access, liberties, rights, and/or safety.”
While a robot judge is an extreme fictional example of a public algorithm, public agencies are already using algorithms in their work. In Allegheny County, algorithms contribute to staff assessments in the foster care system, detect gunshots, connect residents with resources, manage traffic lights, plan transportation routes, and identify people who may have engaged in plagiarism.
According to the Task Force report, there are many potential benefits from the use of algorithms, including:
- Help solving complex problems
- Fairer, less-biased decisions
- Improved service delivery
- Faster response times
- Increased efficiency
- Public agencies that are more responsive to demands and needs
- Reduced costs
There are also potential problems that can come from the use of algorithms, including:
- Errors
- Biased data
- Non-representative data
- Erosion of privacy and freedom
- Lack of transparency
- Lack of public ownership and control
- Lack of diversity among the developers of algorithms
- Overconfidence in the ability of technology to solve problems
- Inadequate public deliberation and input
All of these problems can enable systemic racism/discrimination and other structural biases to persist, and algorithms can even amplify them. That’s why the Task Force suggests that public agencies adopt a risk-based approach for assessing an algorithm’s impact against the status quo.
In groups of 3-4 people, workshop participants were given a scenario involving a public algorithm and asked to list the benefits and risks of using it. They were also asked to add to and customize a list of questions, developed by the Task Force, that people can ask about algorithms.
Here’s one of the scenarios developed for the workshop. In it, an algorithm is intended to improve tax collections by identifying the returns with the highest potential for an audit.
Three years ago, East Versailles township contracted with a private vendor to manage tax collection efforts, hoping that an outside company could bring in more revenue than the existing collection process run by municipal staff. Additional revenue driven by greater compliance would forestall tax increases. After accounting for the cost of the contract, however, the revenue received by the local government has remained virtually unchanged since the township outsourced this work. The vendor has grown increasingly worried that the township may decide not to renew the contract when it expires in two years. As a result, the company has proposed the use of a proprietary algorithmic tool that, according to its marketing materials, “uses a range of data from previous returns, public records, and commercial sources (including credit scores)” to flag people whose returns should be audited because the algorithm suggests that they may not have been paying their fair share. Thanks to the township’s new algorithms ordinance, the township must get approval from its commissioners before purchasing the tool from the vendor. As a commissioner, you will attend a public hearing on the matter.
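The scenario leaves the vendor’s model unspecified (it is proprietary), but a deliberately simplified sketch can make the credit-score concern concrete. Everything below is invented for illustration – the features, weights, and example filers are assumptions, not a description of any real tool. It shows how blending an income-discrepancy signal with a credit-score signal can flag a low-income filer with a small discrepancy ahead of a higher-income filer with a larger one.

```python
# Hypothetical illustration only -- not the vendor's actual model.
from dataclasses import dataclass


@dataclass
class TaxReturn:
    filer_id: str
    reported_income: int   # dollars reported on the return
    estimated_income: int  # dollars suggested by public/commercial records
    credit_score: int      # 300-850; tracks income and hardship, not honesty


def audit_risk(ret: TaxReturn) -> float:
    """Toy risk score in [0, 1]; higher means 'flag for audit'."""
    # Discrepancy between reported and externally estimated income:
    # a plausible, directly relevant signal.
    gap = max(0, ret.estimated_income - ret.reported_income)
    gap_signal = min(gap / 25_000, 1.0)
    # A low credit score raises the risk score here, even though credit
    # scores correlate with poverty rather than tax compliance -- this is
    # the proxy bias the breakout group worried about.
    credit_signal = (850 - ret.credit_score) / 550
    return 0.5 * gap_signal + 0.5 * credit_signal


returns = [
    TaxReturn("A", reported_income=40_000, estimated_income=42_000, credit_score=520),
    TaxReturn("B", reported_income=100_000, estimated_income=112_000, credit_score=800),
]
for r in returns:
    print(f"filer {r.filer_id}: risk {audit_risk(r):.2f}")
```

In this toy example, filer A (a $2,000 gap but a 520 credit score) scores 0.34 and outranks filer B (a gap six times larger but an 800 credit score) at 0.29. The flag follows poverty rather than evasion, which is exactly the kind of harm the questions below are designed to surface.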
Here are the benefits, risks, and questions that one of the breakout groups developed.
What are some of the benefits that can come from this algorithm?
- Increase revenue
- Catch people who are breaking the law
- Encourage compliance
What are some of the risks or harms that could come from the use of this algorithm?
- Inaccuracies
- Bias – could unfairly target some demographic groups or low-income communities, particularly through the use of credit scores and commercial sources
- Privacy risks
- Seems to be based on the vendor’s financial interest rather than the true public interest
What questions would you ask at a public meeting about this algorithm (please add to, modify, or delete the existing suggestions from the Task Force’s Public Toolkit)?
- What is the government agency trying to achieve with this system?
- What is the policy goal? Increasing revenue, catching delinquent tax filers – or increasing the vendor’s bottom line?
- Why is an algorithm the most effective way to achieve this policy goal?
- Were alternatives considered and what were they?
- Is an algorithm even better than a human at this task? Do you have data showing its effectiveness?
- What are the potential social, racial, economic, and privacy harms?
- What is the worst harm that could result if this system were inaccurate?
- How will potential risks and harms be mitigated?
- What are safeguards to ensure it doesn’t target low-income residents or specific demographic groups?
- Who is funding the system?
- And what is the cost/benefit analysis? How much will it cost the township to institute this system compared to how much revenue it could bring in? (Especially given that the vendor’s contract cost is roughly equivalent to the additional revenue the vendor has brought in.)
- Who will design the system?
- Does the team include diverse inputs?
- How are they seeking community input?
- Has this tool been used in other communities, and what changes have been made in response to previous community input/engagement?
- Will the government agency own and maintain access to the data?
- Will the public be able to view their own information?
- How will the public even know about the use of this tool?
- What data sources are being used?
- How are they guarding against biased data and other common issues with data that exacerbate bias in algorithmic systems?
- Do data sources cost additional money, like using credit scores or other commercial sources?
- Who is scrutinizing the system?
- Is the system auditable?
- Is there continued evaluation throughout the lifecycle of the system?
- How and when will we know the results of those evaluations?
- Will the government be able to challenge results or encourage changes to the algorithm once it is in use?
- Will members of the public be able to challenge the results of the algorithm, and if so, how?
- Do people know when and how the algorithm is applied to them?
- Can someone – member of government or member of public – challenge the result? And how?
- Who rules on a challenge? Can you also challenge the underlying data – for example, if someone had their identity stolen and the algorithm is run while they are still repairing the damage?
- Who has ultimate decision-making authority? The vendor, the contractor, the government? Who in the government (commissioners or municipal employees)?
As we do each week, we wrapped up the session by asking everyone to list “better” practices that they’ll adopt following the workshop. The bulk of responses mentioned the need for public agencies to create opportunities for people to be active participants in the public decision-making process when considering the adoption of algorithms, and to be transparent about algorithms in use or under consideration. People outside of government identified a role for themselves in questioning the use of algorithms and in public education about them.
This Friday (January 13, 2023), we will discuss data management. Participants will learn more about the importance of documenting their data management practices through the framework provided by a data management plan. Describing the processes and tools we use to create, collect, manage, and share data enables us to be more intentional in how we build equity and sustainability into our data systems.
If you are interested in participating in the next cohort of our Data Literacy for Data Stewards peer learning series starting in the first quarter of 2023, email us at wprdc@pitt.edu and we will let you know when registration is open.