## Exercise 2 -- Analyzing Other Population Characteristics

Database:  USCOsp.por

In this exercise you will apply several measures commonly used to describe a population in more detail.

1. The Sex Ratio
1. Calculate the sex ratio for California counties and for the state. Compare the state total to that of the 58 counties. What reasons can you suggest for the large differences between some of the counties?
   a. Under the Data menu, select the Compute option and enter the sex ratio formula (i.e., SexR  =  P0050001  *  100  /  P0050002).
2. Compute the sex ratio for other states and compare these to California. Why do you think some states have higher or lower values?
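
The SexR expression above can also be sketched outside the statistics package. The Python below is only an illustration: it assumes the numerator variable counts males and the denominator counts females (the usual males-per-100-females convention), and the county names and counts are hypothetical, not census figures.

```python
def sex_ratio(males, females):
    """Return males per 100 females, mirroring SexR = males * 100 / females."""
    if females == 0:
        raise ValueError("female population must be nonzero")
    return males * 100 / females


# Hypothetical (male, female) counts for two made-up counties.
counties = {
    "County A": (52_000, 50_000),
    "County B": (9_000, 12_000),
}

# Ratio for each county.
ratios = {name: sex_ratio(m, f) for name, (m, f) in counties.items()}

# The state ratio is computed from summed counts, not by averaging county ratios.
state_males = sum(m for m, _ in counties.values())
state_females = sum(f for _, f in counties.values())
state_ratio = sex_ratio(state_males, state_females)

for name, value in ratios.items():
    print(f"{name}: {value:.1f}")
print(f"State: {state_ratio:.1f}")
```

Summing the counts before dividing matters: a small county with an extreme ratio barely moves the state figure, which is one reason county values can differ so much from the state total.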

2. The Location Quotient
1. Compute the location quotients for Class of Worker for all California counties and compare these to the values for Los Angeles County in the module. Which counties have the greatest concentration of government workers and of self-employed persons?
2. Compute location quotients for the six civilian industry employment categories for California counties. To calculate each location quotient, create a fixed variable holding the California proportion of total employment in that industry category, then divide the county proportion of total employment in the same category by it.

Item135 Percent Employed in Agriculture, Forestry, and Fisheries 1990

Item136 Percent Employed in Manufacturing 1990

Percent Employed in Retail Trade 1990

Item138 Percent Employed in Finance, Insurance, and Real Estate 1990

Item139 Percent Employed in Health Services 1990

Item140 Percent Employed in Public Administration 1990

3. Prepare a table of counties (rows) versus employment categories (columns). Note which counties have scores less than 0.3 or greater than 3 on any of the six employment categories. Using a map of California and your knowledge of the geography of California, explain as well as you can the reasons for any three of these unusually low or high location quotients.
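
The location quotient described in this section is the county share divided by the state share for the same category. The following Python sketch illustrates the division and the 0.3 and 3 cutoffs. The category names and percentages are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def location_quotients(county_pct, state_pct):
    """Divide each county percentage by the matching state percentage."""
    return {
        category: county_pct[category] / state_pct[category]
        for category in state_pct
    }


# Hypothetical percent-employed values (not taken from the database).
state_pct = {"agriculture": 3.0, "manufacturing": 16.0, "retail": 17.0}
county_pct = {"agriculture": 12.0, "manufacturing": 4.0, "retail": 17.0}

lq = location_quotients(county_pct, state_pct)

# Flag unusually low (< 0.3) or unusually high (> 3) quotients.
flagged = {cat: value for cat, value in lq.items() if value < 0.3 or value > 3}

for category, value in lq.items():
    marker = " *" if category in flagged else ""
    print(f"{category}: {value:.2f}{marker}")
```

A quotient of 1 means the county's share matches the state's. In this made-up county, agriculture is four times as concentrated as in the state, and manufacturing only a quarter as concentrated.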

3. The Entropy Index

1. Compute the entropy index for California and its counties across five major ethnic groups: non-Hispanic whites, blacks, American Indians (including Aleuts and Eskimos), Asian and Pacific Islanders, and Hispanics.

Item005      Total population 1990

P0100001      Non-Hispanic white population

P0070002      Black population

P00703_05      American Indian-Eskimo-Aleut population

P0070006 thru P0717_24       Asian and Pacific Islander population

P0090002 thru P0090005       Hispanic population

2. First compute summary variables for the Asian and Pacific Islander population and for the Hispanic population (called Asian and Hispanic in the equation below). Then compute the entropy index. In the equation below, the white, black, American Indian, Asian and Pacific Islander, and Hispanic populations are each divided by the total population (Item5).
   a. To compute the entropy index, select the Data menu and the Compute option, then enter:
H  =  - ((P0100001 /  Item5) * LN(P0100001 / Item5) +
(P0070002 / Item5) * LN(P0070002 / Item5) +
(P00703_05 / Item5) * LN(P00703_05 / Item5) +
(Asian / Item5) * LN(Asian / Item5) +
(Hispanic / Item5) * LN(Hispanic / Item5)) / 1.609
Note the leading negative sign and the final division by 1.609. This number is the maximum possible diversity score using the natural logarithm (log base e) and five groups, so dividing by it scales H to a maximum of 1. It can be determined by computing H for an equal proportion of the population in each category. For example, if the five groups were evenly distributed in a county, each group would make up 0.2, or 20 percent, of the county's population, and H before the division would equal ln(5), or about 1.609. How do you think the value might change if you had used census tracts instead of counties?

3. Print out the diversity index values along with the ethnic population percentages for each of the counties. Which counties have the two highest and the two lowest index values, and why do you think these four counties are exceptional in having such high or such low diversity? Which groups dominate in counties with low entropy? You have just identified the most and the least ethnically diverse counties in California according to an appropriate statistical technique. Are there other dimensions of diversity that you think should be incorporated into its measurement statistically? Explain.
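
The entropy computation in this section, including the division by 1.609, can be checked with a short Python sketch. It uses hypothetical group counts and makes one assumption that the Compute expression does not state: a group with zero population is treated as contributing nothing to H, since the logarithm of zero is undefined.

```python
import math


def entropy_index(group_counts, total):
    """Scaled entropy: -sum(p * ln p) / ln(k), where k is the number of groups."""
    h = 0.0
    for count in group_counts:
        p = count / total
        if p > 0:  # assumption: an empty group contributes 0 to H
            h -= p * math.log(p)
    return h / math.log(len(group_counts))


# The scaling constant for five groups is ln(5).
max_h = math.log(5)
print(f"ln(5) = {max_h:.3f}")

# Hypothetical county with five equal groups: maximum diversity.
equal = entropy_index([200, 200, 200, 200, 200], 1000)

# Hypothetical county with one group only: minimum diversity.
single = entropy_index([1000, 0, 0, 0, 0], 1000)

print(f"equal groups: {equal:.3f}")
print(f"single group: {single:.3f}")
```

As in the Compute expression, each group is divided by the total population, so if the five groups do not account for everyone in a county, the proportions will sum to less than 1.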

4. Geographic Association
1. People often assume that higher-income areas receive better health care. To test this assumption, examine the association between median household income (Item79) and the infant death rate per thousand persons (Item52) for the counties of the United States.
2. Generate a scattergram of these two variables to portray the strength and direction of the association. Describe this relationship.
3. Calculate the Pearson product-moment correlation for the pair of variables. Is this relationship statistically significant?
4. Run a regression on these two variables using household income as the independent variable and infant death rate as the dependent variable.
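
The correlation and regression steps above can be illustrated with a minimal Python sketch. The four data points are hypothetical and stand in for county values of income and infant death rate. The function names are illustrative, not part of any statistics package.

```python
import math


def _sums(x, y):
    """Return mean of x, mean of y, and the centered sums Sxy, Sxx, Syy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    _, _, sxy, sxx, syy = _sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def ols_line(x, y):
    """Least-squares slope and intercept with x as the independent variable."""
    mx, my, sxy, sxx, _ = _sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical values: income in thousands of dollars, deaths per thousand.
income = [20, 30, 40, 50]
infant_death_rate = [12, 9, 9, 6]

r = pearson_r(income, infant_death_rate)
slope, intercept = ols_line(income, infant_death_rate)

print(f"r = {r:.4f}")
print(f"rate = {intercept:.2f} + ({slope:.2f}) * income")
```

A negative r and a negative slope both indicate that the death rate falls as income rises in this made-up sample. Judging significance additionally requires a test statistic, such as t with n - 2 degrees of freedom, which statistics packages report alongside the correlation.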