Exercise on Critical Thinking and Testing Hypotheses Involving Three Variables (CT2)
Nelson, Department of Sociology
© SSRIC; Last Modified 21 September 2001
Note to the instructor: The data set used in this exercise is RELG9800, which is a combination of the 1998 and 2000 General Social Surveys. (Some of the variables in the GSS have been recoded to make them easier to use and some new variables have been created.) This exercise uses CROSSTABS to test hypotheses involving the relationships among three variables and RECODE to combine categories. It is meant to be used after the exercise that focuses on the relationship between two variables (i.e., "Exercise on Critical Thinking and Testing Hypotheses Involving Two Variables"). In CROSSTABS, students are asked to use percentages to interpret the tables. You could modify this exercise by adding Chi Square and measures of association. A good reference on using SPSS is SPSS for Windows Version 9.0: A Basic Tutorial by Richard Shaffer, Edward Nelson, Nan Chico, John Korey, Elizabeth Nelson and Jim Ross. To order this book, call McGraw-Hill at 1-800-338-3987. The ISBN is 0-07-241445-6. You have permission to use this exercise and to revise it to fit your needs. Please send a copy of any revision to the author.
Department of Sociology, M/S SS107
California State University, Fresno
Fresno, CA 93740
Please contact the author for additional information.
The goal of this exercise is to learn how to develop hypotheses involving three variables, develop arguments to support these hypotheses, use SPSS to get the tables to test these hypotheses, and interpret these tables to decide whether the data support the hypotheses.
There is a PowerPoint presentation to accompany this exercise, which can be downloaded from the Teaching Resources Depository.
Part I. Hypotheses involving two variables.
In another exercise, we developed a hypothesis that stated the expected relationship between two variables from the data set RELG9800. As our example, we used church attendance and opinions on pornography laws. Our hypothesis was that "those who attend church frequently are more likely to think there should be laws against the distribution of pornography for everyone regardless of age."
The table from our data set supports this hypothesis. But this does not prove that a causal relationship exists between these two variables. By the way, it would be a good exercise to run the table for ATTEND and PORNLAW for yourself. If you do this, be sure to recode ATTEND first. Recode ATTEND into a different variable and call this new variable ATTEND1. Combine every week (value 7) and more than once a week (8) into one category and give this category a value of 1. Combine once a month (4), two to three times a month (5), and nearly every week (6) into another category and give this a value of 2. Finally, combine never (0), less than once a year (1), once a year (2), and several times a year (3) into another category and give this a value of 3. Now you have three categories--often (1), sometimes (2), and infrequently (3). Add the value labels so your output will be labeled properly. You should get this table:
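The recoding scheme for ATTEND described above can also be written out as a simple lookup, which may help in checking that every original value ends up in the intended category. The following is only a sketch in Python, not SPSS; the exercise itself is meant to be done with the SPSS recode dialog. The names `ATTEND_TO_ATTEND1`, `ATTEND1_LABELS`, and `recode_attend` are made up for this illustration, and treating any unlisted code as missing is an assumption.

```python
# Illustration only: a lookup table that mirrors the ATTEND -> ATTEND1 recode.
ATTEND_TO_ATTEND1 = {
    7: 1, 8: 1,              # every week, more than once a week -> often
    4: 2, 5: 2, 6: 2,        # once a month through nearly every week -> sometimes
    0: 3, 1: 3, 2: 3, 3: 3,  # never through several times a year -> infrequently
}

# Value labels for the new variable.
ATTEND1_LABELS = {1: "often", 2: "sometimes", 3: "infrequently"}


def recode_attend(value):
    """Return the ATTEND1 code, or None for any code not listed above
    (assumption: unlisted codes are treated as missing)."""
    return ATTEND_TO_ATTEND1.get(value)


# Show where each original code (0 through 8) lands.
for code in range(9):
    print(code, "->", ATTEND1_LABELS[recode_attend(code)])
```

Every original value from 0 to 8 appears exactly once in the lookup, which is the same check you should make on your own recode before running the table.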
Part II. Thinking about other variables that might affect a relationship.
We know that other variables are related to church attendance and feelings about pornography laws. For example, women are more likely to attend church than men and are also more likely to feel pornography ought to be illegal to everyone regardless of age. How do we know this? Let's use our data set to check on this.
First, we will need to crosstabulate church attendance and sex to see if women actually go to church more frequently than men. We also need to be sure to get the column percents. Here is the table from our data set:
We were right, but not by very much. Women are more likely to attend church compared to men--about 29% of women compared to 20% of men go to church often.
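To make the idea of column percents concrete, here is a minimal sketch in Python of how they are computed: each cell count is divided by the total for its column, so the percents in each column add to 100. This is not how SPSS is invoked and it is not the exercise's data. The ten respondents below are hypothetical toy values invented for illustration, and the function name `column_percents` is likewise made up.

```python
from collections import Counter


def column_percents(rows, cols):
    """Cross-tabulate two equal-length lists.

    Returns {column value: {row value: percent of that column}}.
    """
    cell_counts = Counter(zip(rows, cols))
    col_totals = Counter(cols)
    table = {}
    for (row, col), n in cell_counts.items():
        table.setdefault(col, {})[row] = 100.0 * n / col_totals[col]
    return table


# Hypothetical toy data, NOT taken from RELG9800.
sex = ["male"] * 4 + ["female"] * 6
attend1 = [
    "often", "sometimes", "infrequently", "infrequently",          # males
    "often", "often", "often", "sometimes", "sometimes", "infrequently",  # females
]

# Sex is the column variable, church attendance the row variable.
table = column_percents(attend1, sex)
for col in sorted(table):
    for row, pct in sorted(table[col].items()):
        print(f"{col:8} {row:14} {pct:5.1f}%")
```

Because the percents are computed within each column, we compare across columns (men versus women) within a given row, which is how the tables in this exercise are read.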
Now let's crosstabulate sex and opinions on pornography laws to see if women are more likely to favor making pornography illegal. Here is our table.
Women are clearly more likely to favor making pornography illegal to everyone--about 46% of women compared to 26% of men favor making pornography illegal to all.
So what does this mean? We want to know if church attendance might be a cause of how people feel about pornography laws. Our two-variable table showing the relationship between church attendance and opinions on pornography laws suggests that this is a possibility.
However, we know that showing that a statistical relationship exists does not prove causality. There could be other explanations. Perhaps the reason those who attend church frequently are more likely to feel that pornography ought to be illegal to everyone is that women attend church more frequently than men and women are also more likely to feel that pornography ought to be illegal to everyone. In other words, sex might explain away the relationship between church attendance and opinion on pornography laws.
Now it's your turn. Think about the analysis we have just done with sex, church attendance, and opinion on pornography laws. Can we think of another variable besides sex that might account for the relationship between church attendance and opinion on pornography laws? We want to try to think of a variable that will be related to both variables. What about age?
Write a short paragraph explaining what you think will be the relationship between age and church attendance and between age and opinion on pornography laws. Then use SPSS to run these two tables. Remember to put age in the column box so it will be the column variable and to put the other variables in the row box. Also remember to ask for the column percents. Write another short paragraph interpreting these two tables. Use the percents to help describe the relationships in these two tables.
Before we run these tables, we will need to recode AGE into a different variable which we could call AGE1. Let's use the following categories: 18 to 29, 30 to 44, 45 to 59, and 60 to 89. (There is no one younger than 18 or older than 89 in the data.) Be sure to add variable labels to make your output easier to read.
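The four age categories above can be expressed as ranges, which makes it easy to confirm that the boundaries do not overlap and that no age between 18 and 89 is left out. The sketch below is in Python rather than SPSS and is only an illustration. The codes 1 through 4, the name `recode_age`, and the choice to return a missing value for ages outside 18 to 89 are assumptions made here, not part of the exercise.

```python
# Illustration only: AGE -> AGE1 using the four ranges given in the text.
AGE1_LABELS = {1: "18 to 29", 2: "30 to 44", 3: "45 to 59", 4: "60 to 89"}


def recode_age(age):
    """Return the AGE1 code for an age in years, or None if the age is
    outside 18 to 89 (assumption: such values are treated as missing)."""
    if 18 <= age <= 29:
        return 1
    if 30 <= age <= 44:
        return 2
    if 45 <= age <= 59:
        return 3
    if 60 <= age <= 89:
        return 4
    return None


# Check the boundary ages of each category.
for age in (18, 29, 30, 44, 45, 59, 60, 89):
    print(age, "->", AGE1_LABELS[recode_age(age)])
```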
Part III. Using a third variable in the analysis of relationships.
Let's go back to our example of sex, church attendance, and pornography laws. How can we check on the possibility that the relationship between church attendance and opinion on pornography laws is due to sex? We can separate the males and females into two tables and then look at the relationship between church attendance and opinion on pornography laws separately for men and for women. We can do that in SPSS by putting ATTEND1 (our recoded independent variable) in the column box, putting PORNLAW (our dependent variable) in the row box, and putting SEX in the third box. Sex is the variable we are holding constant and is often called the control variable. Here's what you should get:
Let's see what happens to the relationship between church attendance and opinion on pornography laws when we hold sex constant. We'll start by looking at the males only (i.e., the top half of the table). The numbers are different, but the pattern is the same. Men who attend church often are more likely than men who attend infrequently to feel that pornography ought to be illegal to everyone.
Now we'll do the same thing for the females (i.e., the bottom half of the table). Again, the numbers are different, but the pattern is the same. Women who attend church often are more likely than women who attend infrequently to feel that pornography ought to be illegal to everyone.
What does this mean? If the relationship had been due to sex, then the relationship between church attendance and opinion on pornography laws would have disappeared or decreased when we took out the effect of sex by holding it constant. However, the relationship did not disappear. Therefore, the relationship is not due to sex. It is not spurious when we hold sex constant. Spurious means that there is a statistical relationship, but not a causal relationship. We know that the relationship is not spurious due to sex, but it might be spurious due to some other variable. So now it's your turn again.
In Part II we learned that age is related to both church attendance and opinion on pornography laws. So letís hold age constant. Run a table in SPSS in which church attendance is the column variable, opinion on pornography laws is the row variable, and the recoded version of age (AGE1) is the control variable. Write a couple of paragraphs explaining the relationship between church attendance and opinion on pornography laws for all four age groups (i.e., those under 30, those 30 to 44, those 45 to 59, and those 60 and over). Did age affect the relationship between church attendance and opinion on pornography laws? What does this tell us about the possible spuriousness of the relationship due to age? What does this suggest about causality? Can we ever prove that a causal relationship really exists? Why or why not?
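The three-variable tables used in this part amount to running the same two-variable table once for each category of the control variable. The following is a minimal sketch of that idea in Python, not SPSS. The records are hypothetical toy values invented for illustration (they are not from RELG9800), the opinion variable is simplified to two categories, and the function name `controlled_column_percents` is made up.

```python
from collections import Counter, defaultdict


def controlled_column_percents(rows, cols, controls):
    """Column percents computed separately within each control category.

    Returns {control value: {column value: {row value: percent}}}.
    """
    cell_counts = Counter(zip(controls, cols, rows))
    col_totals = Counter(zip(controls, cols))
    tables = defaultdict(lambda: defaultdict(dict))
    for (ctrl, col, row), n in cell_counts.items():
        tables[ctrl][col][row] = 100.0 * n / col_totals[(ctrl, col)]
    return tables


# Hypothetical toy records: (control, column variable, row variable).
records = (
    [("male", "often", "illegal to all")] * 2
    + [("male", "often", "other")] * 1
    + [("male", "infrequently", "illegal to all")] * 1
    + [("male", "infrequently", "other")] * 3
    + [("female", "often", "illegal to all")] * 3
    + [("female", "often", "other")] * 1
    + [("female", "infrequently", "illegal to all")] * 1
    + [("female", "infrequently", "other")] * 1
)
controls = [r[0] for r in records]
cols = [r[1] for r in records]
rows = [r[2] for r in records]

tables = controlled_column_percents(rows, cols, controls)
for ctrl in sorted(tables):
    for col in sorted(tables[ctrl]):
        pct = tables[ctrl][col].get("illegal to all", 0.0)
        print(f"{ctrl:7} {col:13} illegal to all: {pct:5.1f}%")
```

If the difference between the attendance categories persists within every category of the control variable, the control variable does not explain away the relationship. If the difference disappears or shrinks substantially, the original relationship may be spurious.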