This section explains how to set up a file with new data. After finishing this chapter, you should be able to create an SPSS data file that includes 1) the data and 2) some labeling indicating what the data is about. Also, if you don’t have complete data for a case (for example, if someone didn’t answer a question or chose two answers to a question), you will be able to mark it as missing so it will be excluded from the analysis. To illustrate this process, we will use a shortened version of the questionnaire used by the General Social Survey (GSS) conducted by the National Opinion Research Center (NORC). For this example, our students wanted to see if their opinions on social issues were similar to those of the national sample. More details can be found in the General Social Survey codebook; see General Social Survey, Davis, Smith, and Marsden, 2001.
The students knew they were not a representative sample, even of college students, but this questionnaire is an interesting way to learn how to create a new data file. They decided to use the following questions[1]:
·What is your age?
·Are you male or female?
·What is your religious preference?
·Generally speaking, in politics do you consider yourself as conservative, liberal, middle of the road?
·What kind of marriage do you think is the more satisfying way of life: one where the husband provides for the family and the wife takes care of the house and children or one where both the husband and wife have jobs and both take care of the house and children?
·Do you think it should be possible for a pregnant woman to obtain a legal abortion:
If there is a strong chance of a serious defect in the baby? [ABDEFECT[2]]
If she is married and does not want any more children? [ABNOMORE]
If the woman's own health is seriously endangered by pregnancy? [ABHLTH]
If the family has a very low income and cannot afford any more children? [ABPOOR]
If she became pregnant as a result of rape? [ABRAPE]
If she is not married and does not want to marry the man? [ABSINGLE]
If the woman wants it for any reason [ABANY]
Basic Steps in Creating a Data File
There are a few things that always need to be done to create a data file. It is best to start your data file with some careful planning.
1. First, we will want to assign each respondent an identification number, not so individuals can be identified, but so we can keep track of each case when we go back to check the accuracy of the data entry. For each question (variable), we need a variable name that is simple but expresses something about the variable. SPSS limits variable names to eight characters or fewer, starting with a letter. Variable names can contain numbers or letters but not spaces, and only a few special characters are permitted, so don’t use any odd symbols. AGE and SEX would be easy variable names for the first two questions. For the questions on abortion, we decided to use the first three characters of the variable names used by the General Social Survey (in brackets after each question). We used MG for the preferred type of marriage and called political orientation C-L. Each variable name can be given an extended variable label that gives more detail, and these labels can use spaces or special characters. For example, C-L could have a variable label that said Conservative-Liberal.
2. After we have given each variable a name and label, we give each possible response to the question a numeric code, which is often the number corresponding to the order of the answers. (We could use another system, but this is the easiest because SPSS works best with numeric codes to represent the data.) For example, SEX could use 1 for male and 2 for female; C-L could use 1 for conservative, 2 for liberal, and 3 for middle of the road. Each code is then given a value label such as Male, Female, Conservative, Liberal, or Middle of the Road.
3. Sometimes respondents do not answer a question, give more than one answer, or do something else that would make their answers unusable. In our example, respondent #2 marked both yes and no on the last question, respondent #3 wrote in “none” on question 4, and respondent #13 didn’t answer the marriage question. We can assign these responses missing value codes so they don’t distort the analysis. Often 9 is used to indicate missing data, or 99 if it is a two-digit value. (Note that this would cause problems in the analysis if 9 or 99 were real codes, for example, if there were 9 possible responses to a question or if age included some ninety-nine-year-olds. So think carefully before you choose numbers for missing values.)
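The planning in steps 1 to 3 can be written down as a small codebook before any data are entered. The following Python sketch is only an illustration and is not part of SPSS: the dictionary layout and the helper names check_name and describe are assumptions made for this example, while the variable names, labels, and codes come from the steps above.

```python
# Hypothetical codebook for three of the questionnaire variables.
# Names, labels, and codes follow the planning steps; the layout is an assumption.
CODEBOOK = {
    "AGE": {"label": "Age at last birthday", "values": {}, "missing": 99},
    "SEX": {"label": "Sex", "values": {1: "Male", 2: "Female"}, "missing": 9},
    "C-L": {
        "label": "Conservative-Liberal",
        "values": {1: "Conservative", 2: "Liberal", 3: "Middle of the Road"},
        "missing": 9,
    },
}


def check_name(name):
    """Apply only the naming rules stated in the text:
    eight characters or fewer, starts with a letter, no spaces."""
    return len(name) <= 8 and name[:1].isalpha() and " " not in name


def describe(variable, code):
    """Return the value label for a code, or 'Missing' for the missing code."""
    entry = CODEBOOK[variable]
    if code == entry["missing"]:
        return "Missing"
    # Variables without value labels (such as AGE) are reported as the number itself.
    return entry["values"].get(code, str(code))


if __name__ == "__main__":
    for name in CODEBOOK:
        print(name, check_name(name))
    print(describe("SEX", 2))   # Female
    print(describe("C-L", 9))   # Missing
    print(describe("AGE", 20))  # 20
```

Keeping the missing value code next to the value labels makes it harder to pick a missing code that collides with a real answer.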
It is a good idea to plan all this carefully. It is often useful to put the data in a matrix like Table 2.1 before entering it into the SPSS Data Editor.
| id | age | sex | rel | c-l | mg | abd | abn | abh | abp | abr | abs | aba |
| 01 | 20 | 1 | 4 | 2 | 2 | 2 | 2 | 1 | 3 | 1 | 2 | 2 |
| 02 | 24 | 2 | 5 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 9 |
| 03 | 21 | 2 | 2 | 9 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 04 | 24 | 2 | 5 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 05 | 26 | 2 | 4 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 06 | 28 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 2 |
| 07 | 23 | 1 | 1 | 2 | 2 | 1 | 2 | 1 | 1 | 1 | 2 | 2 |
| 08 | 22 | 2 | 4 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 09 | 22 | 1 | 5 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 10 | 22 | 2 | 4 | 4 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 11 | 23 | 1 | 2 | 2 | 1 | 2 | 2 | 1 | 2 | 1 | 2 | 3 |
| 12 | 24 | 2 | 2 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 2 |
| 13 | 51 | 2 | 1 | 2 | 9 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 14 | 22 | 2 | 2 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 15 | 21 | 2 | 4 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 16 | 37 | 1 | 1 | 3 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 2 |
| 17 | 22 | 2 | 4 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 2 | 2 |
| 18 | 22 | 2 | 3 | 3 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 2 |
| 19 | 22 | 2 | 4 | 3 | 2 | 3 | 2 | 1 | 2 | 1 | 1 | 1 |
| 20 | 30 | 2 | 5 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 21 | 25 | 2 | 5 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 22 | 23 | 1 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 23 | 21 | 1 | 1 | 2 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 |
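To show what the missing value codes accomplish, the following Python sketch takes four rows of the matrix above (respondents 01, 02, 03, and 13), converts the missing value codes to a missing marker, and counts the answers to a question without the missing cases. It illustrates the idea only; SPSS does this itself once missing values are declared. The names ROWS, MISSING, to_cases, and frequencies are assumptions made for this example.

```python
from collections import Counter

COLUMNS = ["id", "age", "sex", "rel", "c-l", "mg",
           "abd", "abn", "abh", "abp", "abr", "abs", "aba"]

# Four rows copied from the data matrix (respondents 01, 02, 03, and 13).
ROWS = [
    [1, 20, 1, 4, 2, 2, 2, 2, 1, 3, 1, 2, 2],
    [2, 24, 2, 5, 2, 2, 1, 1, 1, 1, 1, 1, 9],
    [3, 21, 2, 2, 9, 2, 2, 2, 2, 2, 2, 2, 2],
    [13, 51, 2, 1, 2, 9, 1, 1, 1, 1, 1, 1, 1],
]

# Missing value code per variable: 99 for age, 9 for the one-digit items.
# The id column has no missing code.
MISSING = {name: 9 for name in COLUMNS[2:]}
MISSING["age"] = 99


def to_cases(rows):
    """Turn each row into a dict, replacing missing codes with None."""
    cases = []
    for row in rows:
        case = {}
        for name, value in zip(COLUMNS, row):
            case[name] = None if MISSING.get(name) == value else value
        cases.append(case)
    return cases


def frequencies(cases, variable):
    """Count the valid (non-missing) codes for one variable."""
    return Counter(c[variable] for c in cases if c[variable] is not None)


if __name__ == "__main__":
    cases = to_cases(ROWS)
    print(frequencies(cases, "mg"))   # respondent 13 is excluded
    print(frequencies(cases, "aba"))  # respondent 02 is excluded
```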
Getting Started in SPSS
To create the data file in SPSS, open SPSS (probably by clicking on the SPSS icon on the desktop). If it asks “What would you like to do?”, choose “Type in data” and click OK (see Figure 2-1). This opens a matrix similar to a spreadsheet such as Excel.
1. You’ll be using the first column for the respondents’ ID numbers, so type “1” into the first cell (don’t type the quotation marks, just the number). If you click the “Variable View” tab, you can assign variable names and value labels (Figure 2-3).
3. The second column will be age, so change to “Variable View” (the tab at the bottom left of the SPSS screen), and in row 2, type age under Name and tab over to Missing. Click the little gray box to open the “Missing Values” dialog box, click “Discrete missing values”, type in 99, and click “OK” (Figure 2-4). (Now, if someone does not give his/her age, we’ll code it 99 and hope no one is really 99 years old. If you want to, you can change back to Data View to see that the column is now headed age.)
5. The next variable is religion, which we’re going to name RELIG. Notice that it has five possibilities: Protestant, Catholic, Jewish, other, and no religion. Go ahead and work out the variable label, values and value labels, and missing values just as you did above. You can refer to the “Codebook for Student Questionnaire” located at the end of this chapter. So far, your data file should look like Figure 2-6.
6. Continue entering the variables for the rest of the data set. Some people, especially those who are used to working with spreadsheets, like to enter all the data in Data View before they set up the variable names, etc., so you’ll have to figure out what works best for you. It is very important to save your work as you go along, so do that now. Click the “Save” icon or use “Save” under “File”, and give your data file a sensible name. Notice that SPSS automatically adds ".sav" at the end of the file name.
7. Enter the codes for each variable in “Data View”. Then check the accuracy of your data file by scanning down each column looking for codes that would be impossible. For example, sex can have only three possibilities since male is 1, female is 2, and missing information is 9, so a 5 would be a mistake. The best check is to have one person read the codes while another checks the entries on the Data File. [3]
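The column-by-column check for impossible codes can also be sketched in a few lines of Python. This is an illustration, not an SPSS feature: the VALID table and the function find_impossible are assumptions made for this example, with the allowed codes taken from the coding described in this chapter (for sex: 1, 2, and the missing code 9).

```python
# Allowed codes per variable, including the missing value code 9.
# Only two variables are listed to keep the sketch short.
VALID = {
    "sex": {1, 2, 9},
    "mg": {1, 2, 9},
}


def find_impossible(cases):
    """Return (id, variable, code) for every code that is not allowed."""
    problems = []
    for case in cases:
        for variable, allowed in VALID.items():
            code = case[variable]
            if code not in allowed:
                problems.append((case["id"], variable, code))
    return problems


if __name__ == "__main__":
    # Hypothetical entries: the second case contains a typing error (sex = 5).
    entered = [
        {"id": 1, "sex": 1, "mg": 2},
        {"id": 2, "sex": 5, "mg": 2},
    ]
    for problem in find_impossible(entered):
        print(problem)  # (2, 'sex', 5)
```

A check like this only catches codes that are impossible; a wrong but valid code (a 1 typed instead of a 2) still needs the read-aloud comparison described above.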
Chapter Two Exercises
At California State University, Fresno, the Friendly Visitors Service hires college students to do in-home care for elderly people so they can remain independent and stay in their homes as long as possible. The students do cleaning, yard work, shopping, etc. The staff begins by interviewing clients in their homes and assessing their need for services. The following information is used to match the seniors with the students who want employment:
·Age: Age at last birthday
·Sex: Male or Female
·Lives alone: Yes or No
·Low income: Yes = Eligible for Supplemental Security Income (SSI)
·Need for assistance with the activities of daily living (ADL): Bathing, Dressing, Toileting, Transferring in/out of bed, Eating
·Total number of ADLs needing help:
·Need for assistance with the instrumental activities of daily living (IADL): Using telephone, Shopping, Preparing food, Light housework, Heavy housework, Finances
·Total number of IADLs needing help:
To keep track of the needs of potential clients, the program could create a data file and use it in SPSS. (Data from one month’s new applications is presented in Table 2.3.) For this example, we’ll just use the count of the number of activities for which the seniors need help, but note that they could include the yes/no responses for each of the activities of daily living.
Exercise Idea for Instructors to Set Up:
Sometimes a university will be willing to provide raw data on the students enrolled on your campus by age and sex. If so, it is interesting to get the data for the most recent year and for five or ten years ago, so students can enter it into an SPSS data file and use it to learn how to do a variety of statistics with SPSS.
Table 2.2 Sample Data Set: Friendly Visitor Service Clients
| id | age | sex | alone | low income | # ADL | # IADL |
| 001 | 74 | M | N | N | 0 | 4 |
| 002 | 66 | M | N | N | 4 | 6 |
| 003 | 81 | M | N | N | 2 | 5 |
| 004 | 76 | F | N | N | 0 | 4 |
| 005 | 74 | M | N | N | 1 | 5 |
| 006 | 69 | F | N | Y | 0 | 4 |
| 007 | 79 | F | Y | N | 0 | 4 |
| 008 | 80 | M | N | Y | 3 | 6 |
| 009 | 89 | M | N | N | 3 | 5 |
| 010 | 60 | F | Y | N | 2 | 6 |
| 011 | 88 | F | Y | N | 0 | 3 |
| 012 | 82 | F | Y | N | 2 | 4 |
| 013 | 79 | F | Y | N | 1 | 4 |
| 014 | 77 | M | N | N | 3 | 6 |
| 015 | 62 | M | Y | N | 1 | 4 |
| 016 | 83 | M | N | N | 4 | 6 |
| 017 | 80 | F | Y | N | 0 | 2 |
| 018 | 85 | F | N | N | 1 | 4 |
| 019 | 66 | F | Y | N | 1 | 3 |
| 020 | 84 | M | N | N | 4 | 6 |
| 021 | 74 | F | N | N | 4 | 4 |
| 022 | 74 | M | N | N | 0 | 2 |
| 023 | 74 | F | Y | N | 0 | 5 |
| 024 | 92 | M | N | N | 3 | 6 |
| 025 | 66 | F | N | N | 2 | 6 |
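Because SPSS works best with numeric codes, the letters in this table may be easier to handle once they are converted to numbers. The Python sketch below shows one possible recoding for the first three clients. The numeric codes chosen here (1 = male, 2 = female; 1 = yes, 2 = no) and the function name recode are assumptions for illustration, not a scheme given in the exercise.

```python
# Assumed numeric codes; the exercise does not prescribe a scheme.
SEX_CODES = {"M": 1, "F": 2}
YES_NO_CODES = {"Y": 1, "N": 2}

# First three clients: id, age, sex, alone, low income, # ADL, # IADL.
CLIENTS = [
    ("001", 74, "M", "N", "N", 0, 4),
    ("002", 66, "M", "N", "N", 4, 6),
    ("003", 81, "M", "N", "N", 2, 5),
]


def recode(client):
    """Replace the letter codes in one row with the assumed numeric codes."""
    cid, age, sex, alone, low_income, adl, iadl = client
    return (
        int(cid),
        age,
        SEX_CODES[sex],
        YES_NO_CODES[alone],
        YES_NO_CODES[low_income],
        adl,
        iadl,
    )


if __name__ == "__main__":
    for client in CLIENTS:
        print(recode(client))
```

Whatever scheme is chosen, it should be recorded in a codebook so that the numbers can be given matching value labels in Variable View.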
What is your age? ________
Are you ____ male or ___ female?
What is your religious preference?
___ Protestant ___Catholic ___ Jewish ___ Some other religion ___No religion
Generally speaking, in politics, do you consider yourself as
___conservative, ___ liberal, __ middle of the road, or
What kind of marriage do you think is the more satisfying way of life?
___ One where the husband provides for the family and the wife takes care of the house and children
___ One where both the husband and wife have jobs and both take care of the house and children
Do you think it should be possible for a pregnant woman to obtain a legal abortion:
If there is a strong chance of serious defect in the baby? __Yes __ No ___Don’t Know
If she is married and does not want any more children? __Yes __ No ___Don’t Know
If the woman's own health is seriously endangered by pregnancy?
__Yes __ No ___Don’t Know
If the family has a very low income and cannot afford any more children?
__Yes __ No ___Don't Know
If she became pregnant as a result of rape? __Yes __ No ___Don’t Know
If she is not married and does not want to marry the man? __Yes __No __ Don’t Know
If the woman wants it for any reason __Yes __ No ___Don’t Know
Codebook for Student Questionnaire
| Missing Values | 9 or 99 |
| Age | Age at last birthday |
| Sex | 1 = male, 2 = female |
| Religious Preference | 1 = Protestant, 2 = Catholic, 3 = Jewish, 4 = Other, 5 = No |
| Political | 1 = Conservative, 2 = Liberal, 3 = Middle of the road |
| Preferred Marriage | 1 = Traditional, 2 = Shared |
| Abortion if Birth Defect | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion if No More Children | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion if Health Risk | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion if Poor | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion if Rape | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion if Not Married | 1 = Yes, 2 = No, 3 = Don't Know |
| Abortion For Any Reason | 1 = Yes, 2 = No, 3 = Don't Know |
References
Davis, James A., Smith, Tom W., and Marsden, Peter. 2001. General Social Surveys: 1972-2000.