Things data users always wanted to know about the Census (but were afraid to ask)

by Bob Gradeck

April 2, 2020

Now that Census Day is upon us, I thought it was time to share some of the things I learned earlier this year about how Census data is collected and how tract boundary changes are made. I have been using Census data since the 1990 Census was released, and have long wanted to learn more about the processes used to create it. I hope the information I share here will help other data users understand more about the context behind the data, and provide an appreciation of the planning and partnerships that are essential in producing the statistics that we depend on so much. I’ll touch on some of the recent changes prompted by COVID-19, but since plans and timelines seem to change day by day, I’ll avoid getting into too many details about current events.

The questions I’ll focus on in this post include:

  • Why do we have the Census?
  • How does the Census Bureau update its address file?
  • How are data collection processes different this year?
  • How are changes to tract and block group boundaries determined?


Why do we have the Census?

To start, the Census has been conducted every ten years since 1790. We do it because Article 1, Section 2 of the U.S. Constitution mandates a count in order to allocate congressional representatives among the states according to population. The number of seats in the House of Representatives increased with the size of the country until 1913, when it was fixed at 435 members. In 1790, every member of the House represented about 57,000 people. By 1913, this had grown to one representative for every 193,000 people, and it now stands at about one per 750,000. If representation is a measure of political power, then Pennsylvania has lost half of its power since 1910, falling from 36 to 18 Representatives. In addition to allocating congressional representatives across the states, Census data is also used to draw political district boundaries for the U.S. Congress, along with State Senate and House districts, County Council, City Council, and School Board districts.
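For readers who like to check the arithmetic, here is a minimal Python sketch of the representation figures above. It uses only the numbers quoted in this post. The function names are my own, and treating 435 as the total for both of Pennsylvania's seat counts is a simplifying assumption.

```python
# Figures quoted in the post; function names are illustrative only.
HOUSE_SEATS = 435  # fixed size of the House since 1913


def people_per_seat(population, seats=HOUSE_SEATS):
    """Average number of people represented by one House member."""
    return population / seats


def seat_share(state_seats, total_seats=HOUSE_SEATS):
    """Fraction of all House seats held by one state."""
    return state_seats / total_seats


# About 750,000 people per seat implies a national population of roughly:
implied_population = 750_000 * HOUSE_SEATS

# Pennsylvania's share of the House, then and now
# (assumes 435 total seats in both cases).
pa_share_then = seat_share(36)
pa_share_now = seat_share(18)
pa_power_ratio = pa_share_now / pa_share_then

print(f"Implied population: {implied_population:,}")
print(f"PA share then: {pa_share_then:.1%}, now: {pa_share_now:.1%}")
print(f"PA relative power: {pa_power_ratio:.0%}")
```

Because the size of the House is fixed, a state's share of seats depends only on its own seat count, which is why halving the delegation halves the share.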

The Census is also used to allocate nearly one trillion dollars in federal funding each year among the states. Some of the programs whose funding is allocated using Census data include Medicaid and Medicare, student loans and Pell Grants, food assistance, housing choice vouchers, educational grants, and transportation funding. The data is also vital for community planning, business, research, and many other uses.


How does the Census Bureau update its address file?

The Census Bureau maintains a Master Address File (MAF) containing information about all addresses in the U.S. The MAF is compiled from the U.S. Postal Service’s Delivery Sequence File, address data provided by local governments, and field operations. The Census Bureau is prohibited from sharing its MAF with the public, but through the Local Update of Census Addresses Operation (LUCA), local governments that chose to participate were able to review and comment on the Census Bureau’s MAF. In our region, most counties and many local governments worked through the LUCA process to share address updates with the Census Bureau.


Figure 1: 2020 LUCA Participation in the Pittsburgh Region.

This map shows which states (yellow), counties (blue) and local governments (purple and orange) participated in the LUCA process. An interactive version of this map is also available. 

Address canvassing also saw a drastic change in the run-up to this year’s Census. In the past, Census employees visited every street in the U.S. in order to validate the MAF. This year, the Bureau made use of the Block Assessment, Research, and Classification Application (BARCA), which incorporated administrative data and aerial imagery to verify about 65% of all addresses. Validating a block of addresses with BARCA took approximately two minutes, compared to the two hours it takes using manual methods. There was still a need to manually validate addresses on about 35% of streets in the U.S. from August to October 2019.


Figure 2: BARCA Address Validation Software

Screenshot of the BARCA software used to remotely validate addresses


How are data collection processes different this year?

Prior to 1960, much of the Census data was gathered by enumerators who visited most households in person. Starting with the 1960 Census, the Census Bureau began to use mail-back forms, and this remained the dominant method of data collection until this year, when many households are being given the option to reply over the internet or by phone (the “internet first” option). The motivations for encouraging a digital response include reducing cost, improving data quality, and providing additional response options. People living in areas with poor internet access or in communities expected to have below-average response were mailed a form and also given the option to reply by phone or online (the “internet choice” option). The following map shows communities that were given the “internet first” option (in purple) and those that were given the “internet choice” option (in green).


Figure 3: Map of Census Response Options

See which communities were provided with an “internet first” (purple) or “internet choice” (green) response option on the Census 2020 Hard to Count Map

People who received the “internet first” option will get several reminder mailings asking them to reply online or by phone and, if they don’t reply, will then be mailed a paper form in April. In-person visits were then planned to gather data from households that haven’t responded, but it remains to be seen how the pandemic will affect these plans and delay the calendar. If people still haven’t replied, enumerators plan to ask neighbors and landlords for any information they can provide about people in households where data is missing. In 2010, the combination of self-response, in-person visits by enumerators, and information provided by others resulted in at least some usable information for 99% of all households in the U.S.

Given the switch to internet-based data collection, there is a lot of concern that people lacking internet access and people with lower levels of digital literacy may have difficulty responding to the Census. There is also concern over data security and the integrity of the data collection systems that were designed to support the Census’ first-ever internet response. For more background on these concerns, we recommend the report titled “Preparing for the Digital Decennial Census: Building Consent, Equity, and Safety into Digital Transition” by the New School’s Digital Equity Laboratory.

Separate processes have been established to count people in group quarters facilities, including people living in on-campus student housing, correctional facilities, nursing and treatment facilities, and military bases. There are also processes to count people experiencing homelessness. During the pandemic, the Census Bureau is encouraging the use of eResponse enumeration methods, where property owners and managers provide data to the Census Bureau electronically to maximize the response and minimize in-person contact.


How are changes to tract and block group boundaries determined?

The Census Bureau reports data using a variety of geographies, but tracts and block groups serve as the building blocks for many of our nation’s spatial statistics. Every ten years, the Census Bureau modifies maps to ensure that tract and block group boundaries stay within the Bureau’s optimal size thresholds shown in the table below. While the Census Bureau tries to minimize the number of changes, tracts whose population growth brings them close to the upper size threshold are split, while tracts whose population is in danger of falling below the minimum threshold are combined with an adjacent tract.


Figure 4: Small-Area Census Geographies and Population Size Thresholds

Figure 5: Map of 2010 Small Area Census Geography in Pittsburgh

2010 Census Blocks in light gray, Block Groups in orange, and Tracts in dark gray
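The split-or-combine rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Bureau's software. The `Tract` class, the `review_action` function, the tract IDs and populations, and the example thresholds of 1,200 and 8,000 people are all my own assumptions, so please rely on the Bureau's published criteria for the real values. The sketch also simplifies "approaching" a threshold to "crossing" it.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    geoid: str        # hypothetical identifier, not a real tract ID
    population: int


def review_action(tract, min_pop, max_pop):
    """Return 'split', 'combine', or 'keep' for one tract.

    Simplification: a tract is flagged only once it crosses a threshold,
    whereas the Bureau also considers tracts approaching one.
    """
    if tract.population > max_pop:
        return "split"
    if tract.population < min_pop:
        return "combine"
    return "keep"


# Example thresholds are assumptions for illustration only.
MIN_POP, MAX_POP = 1_200, 8_000

# Made-up tracts to show each outcome.
tracts = [Tract("A", 9_500), Tract("B", 900), Tract("C", 4_000)]
actions = {t.geoid: review_action(t, MIN_POP, MAX_POP) for t in tracts}
print(actions)
```

Passing the thresholds in as arguments means the same check could be reused for block groups, which have their own size limits.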

The Participant Statistical Areas Program (PSAP) is the name of the process that the Census Bureau uses to modify tract and block group boundaries. The PSAP process starts with the Census Bureau identifying tracts that will need to be combined or split, and sharing this information with the local Metropolitan Planning Organization (MPO). The MPO is charged with recommending boundary changes to the Census Bureau, and is given 120 days to submit updates. The MPO is provided with software to help with the process of drawing new boundaries. The MPO in our region is the Southwestern Pennsylvania Commission (SPC), which gives counties and local governments the opportunity to recommend boundary changes if they choose to participate. Once recommendations are received from local partners, they’re reviewed by the SPC, then shared with the Census Bureau for final review. Once the proposed changes have received final approval, the new tract and block group maps will be released by the Bureau on a rolling state-by-state basis.

As part of a course project for the Open Government Data class (LIS 2970) taught by Nora Mattern at the University of Pittsburgh, Lisa Over compiled the 2000-2010 tract relationship file as a dataset on our open data portal.


Happy Census Day!

April 1st is Census Day! Unlike Tax Day (April 15), Census Day isn’t a deadline for responding to the Census, but instead serves as a reference date for the entire data collection process. The Census Bureau asks people to respond at the address where they live and sleep most of the time as of April 1st, but new guidance is now being issued on how to be counted if someone was forced to move as a result of the pandemic. To stay on top of the latest guidance for students and others, please see the Census 2020 pandemic webpage and the local Allegheny County-City of Pittsburgh Complete Count Committee’s website.

One of the many challenges presented by conducting a Census in the middle of a pandemic involves spreading the word about the importance of the count. So many in-person outreach activities that were planned have been cancelled, and the latest response data shows that we still have a lot of work to do. 


Figure 6: Census Response Map as of April 1, 2020

The Census 2020 Hard to Count Map shows daily change in Census responses

To help ensure a complete count, please remind everyone you connect with to respond to the Census. A response now will make the data we rely on more accurate, and will minimize the health risks to Census enumerators once in-person follow-up activities resume. As we respond to and ultimately work to recover from the impacts of this pandemic, keep in mind that many decisions and funding allocations will be made based on data provided by the 2020 Census. Your participation counts now more than ever. Even if you don’t have the Census ID that was mailed to you in March, you can respond by visiting

The information in this blog post was obtained by researching the Census Bureau’s website, and through very helpful conversations with people at the Southwestern Pennsylvania Commission and the Allegheny County GIS team.