Census Bureau statisticians and outside experts are trying to unravel a mystery: Why did so many questions about households in the 2020 census go unanswered?
Residents left many questions about gender, race, Hispanic background, family relationships and age unanswered, even when they provided a count of the people living in the home, according to documents released by the agency. Statisticians had to fill in the gaps.
Reflecting an early stage in calculating the numbers, the documents show that 10% to 20% of questions were left unanswered in the 2020 census, depending on the question and the state. According to the Census Bureau, later stages of processing show that the actual rates were lower.
Rates have averaged 1% to 3% over the past 170 years of U.S. censuses, according to University of Minnesota demographer Steven Ruggles.
The information is important because data with demographic details will be used to draw congressional and legislative districts. The data, which the Census Bureau will release Thursday, is also used to distribute $1.5 trillion in federal spending each year.
The documents, released in response to an open records request from a Republican redistricting advocacy group, don’t shed much light on why the questions went unanswered, although theories abound. Some observers say the software used in the first census that most Americans could answer online allowed people to skip questions. Others say the pandemic made it more difficult to reach people who had not responded.
Confusion over some questions, including the long-standing uncertainty among Hispanics about how to answer the race question, may have been a factor, but some experts hint at a more sinister possibility. They say the Trump administration’s attempt to end the count early and its unsuccessful efforts to add a citizenship question to the form and exclude people who were in the United States illegally had a chilling effect.
“I think it’s the pandemic and Trump. The very threat that citizenship was on the questionnaire, the very notion that it might have been there, may have dissuaded some Latinos from filling it out,” said Andrew Beveridge, a sociologist at Queens College and the City University of New York Graduate School and University Center. “I think a lot of us are flabbergasted by this. This is a very high number.”
Ruggles initially believed it had to do with the software used by people responding online – roughly two-thirds of American households. Other countries like Australia and Canada, which used similar software for their censuses, saw the number of unanswered questions drop to almost zero because respondents could not continue if they did not answer a question.
“I guess in the American version, they just must have accepted incomplete answers,” Ruggles said. “If the non-response rate was consistently high in all response modes, it is just strange.”
Acting Census Bureau Director Ron Jarmin recently said in a blog post that blank answers spanned all categories of questions and all modes of response – online, by paper, by phone or in face-to-face interviews.
“These blank answers left holes in the data that we had to fill in,” Jarmin said.
In a statement last week to The Associated Press, Jarmin declined to go into details, saying only that the bureau would issue updated rates later this month “based on the correct numbers.”
To fill in the gaps, Census Bureau statisticians searched other administrative records such as tax forms, Social Security card applications or previous censuses to find people’s race, age, gender and Hispanic origin.
If the available records did not provide the necessary information, they turned to a statistical technique called imputation that the Census Bureau has used for 60 years. The technique has been challenged and upheld in court after previous censuses.
In some cases, statisticians looked for information about one family member, such as race, and applied it to another member who had blank answers. Or they assigned a gender based on the respondent’s first name. In other cases, when an entire household had no information, they filled it in using data from similar neighbors.
“Imputation has been shown to improve data quality and accuracy over leaving these fields blank or without respondent information,” Census Bureau officials Roberto Ramirez and Christine Borman recently wrote in a blog post.
The Census Bureau released the 2020 census state population totals in April. These are used to divide congressional seats among the states in a once-a-decade process known as apportionment.
The agency released a slide presentation on the high rate of unanswered questions, along with group housing records and early details on the non-response rate, in response to an open records request from Fair Lines America Foundation. The Republican advocacy group sued the Census Bureau for information about how the count was done in dorms, prisons, nursing homes and other places where people live in groups. Fair Lines says it is concerned about the accuracy of the group housing count and wants to ensure that anomalies do not affect state population counts.
With reports showing high imputation rates, some Republican-controlled states may try to leave students out of the redistricting data, claiming they were also counted at their parents’ homes, to gain a partisan advantage, said Jeffrey Wice, a Democratic redistricting expert.
“It will be difficult to prove but would inject more uncertainty and possible delay into the redistricting,” Wice said.
Follow Mike Schneider on Twitter at https://twitter.com/MikeSchneiderAP