Census data weirdness

turbofish

Gold Member
Apr 19, 2021
OK, some initial weirdness, but on my part. I have played with census data for decades now. It has always been one of the fun data projects I do on my own time after working with data for a living all day long. Heck, I will even pay for monthly ZIP code data with a breakout of race, income, house value, age, gender, and population estimated to the month per ZIP code. A weird hobby, I know.

So I downloaded all of the available data from the 2020 census, divided it into the 4 file groups [Geo, File01, File02, File03 - this is how the government separates them this time], and created a database along with an SSIS package that imported the data - 22 GB of it - into my MS-SQL server. In the past, they spent more columns on households, age of households, number of people per house, house values, payroll, businesses, and age, and fewer on race. Now the house value and payroll aren't in the data at all. There are hundreds of columns on race, but the interesting part is what is omitted. They have never concentrated so much on that one topic before. It is less about the general census and more about the racial profile of the nation. But even that data has been messed up...
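For anyone who wants to try the same kind of import without SSIS, here is a minimal sketch of the idea in Python, with SQLite standing in for MS-SQL. Everything specific in it is an assumption for illustration: the pipe delimiter, the header row, the column names (`LOGRECNO`, `NAME`, `TOTAL_POP`), and the sample rows are made up, and the real files are far wider. It only shows the pattern of loading the Geo file and one data file as separate tables and joining them on a shared record key.

```python
import csv
import io
import sqlite3

# Made-up sample rows; column names and the header row are assumptions.
GEO_SAMPLE = "LOGRECNO|NAME\n1|Sample County A\n2|Sample County B\n"
FILE01_SAMPLE = "LOGRECNO|TOTAL_POP\n1|1200\n2|3400\n"


def load_table(conn, table, text):
    """Create `table` from pipe-delimited text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(text), delimiter="|"))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def build_demo():
    """Load both sample files into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load_table(conn, "geo", GEO_SAMPLE)
    load_table(conn, "file01", FILE01_SAMPLE)
    return conn


def joined_rows(conn):
    """Attach each geography name to its population count via the shared key."""
    sql = (
        "SELECT g.NAME, CAST(f.TOTAL_POP AS INTEGER) "
        "FROM geo g JOIN file01 f ON f.LOGRECNO = g.LOGRECNO "
        "ORDER BY g.LOGRECNO"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for name, pop in joined_rows(build_demo()):
        print(name, pop)
```

With real downloads you would read from the files on disk instead of the sample strings and declare proper column types, but the join on the shared key is the part that ties the file groups together.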

Population has been broken out into the following races: White, Black/African American, American Indian/Alaska Native, Asian, Hawaiian, or other. Then it goes on to the population of two races, with a breakout of the races mentioned. Then three races with another breakout. Then four combinations, five, and finally six. Always omitting the Hispanic population. Then you get a count of those who are 100% Hispanic. It will then go through the same breakout but only counting the races in the first list. Under the Hispanic data, it will separate White and Black/African American, White and Indian,...Black/African American and Asian,...
Notice the missing data?
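To give a feel for why this turns into so many columns, here is a small sketch that counts the combinations in a breakout like the one described above. The six labels are shorthand chosen for this example, not the official column names, and the count covers only one pass through the breakout.

```python
from itertools import combinations
from math import comb

# Shorthand labels for the six groups in the first list (not official names).
RACES = [
    "White",
    "Black/African American",
    "American Indian/Alaska Native",
    "Asian",
    "Hawaiian/Pacific Islander",
    "Other",
]


def breakout_sizes(groups):
    """Number of distinct combinations for each count of races (1 up to all)."""
    return {k: comb(len(groups), k) for k in range(1, len(groups) + 1)}


def two_race_pairs(groups):
    """Every unordered pair, written the way the breakout labels them."""
    return [" and ".join(pair) for pair in combinations(groups, 2)]


if __name__ == "__main__":
    sizes = breakout_sizes(RACES)
    print(sizes)                # combinations per number of races
    print(sum(sizes.values()))  # total categories in one pass
    print(two_race_pairs(RACES)[0])
```

Six groups give 6 single-race categories, 15 pairs, 20 triples, 15 four-way, 6 five-way, and 1 six-way combination, 63 in all. Repeating that breakout for a second population and again by age is how the column count climbs into the hundreds.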

In 2010, they broke out Hispanic into people of Mexican, Puerto Rican, and Cuban origin. They also had gender data, with a breakout of male/female by age. Same with the 2000 data. Now they count population ages only with the initial race data, along with the same multiple-race breakout as long as those races don't include Hispanic, and ages of Hispanics only if 100% Hispanic. No gender data at all. No financial data at all: no information on businesses, no income per household, no housing value, no persons per household. The only housing data is whether the housing unit was occupied or vacant.

It is all kind of a waste of data.
 
Our small, rural Wisconsin county lost over 700 people, as of the 2020 Census.

We keep searching the woods for them, but nobody knows what happened to them. Maybe Ed Gein ate them.
 
It's pretty obvious what they are doing. They are concealing, or attempting to conceal, the ~10-12% of the US population that is here illegally. I don't think the average American understands the magnitude of the problem. We could potentially have as many as 40M illegal aliens in this country. When factoring their offspring (anchor babies) into the equation, we most certainly have that many.
 
