Chapter 2: Aggregating Variables
This exercise illustrates aggregating variables, that is, summing the same measure over time. It uses the PERS dataset, consisting of 90 cases and 968 variables. The variables represent measures of traits and relevant behaviors for the dimensions of extraversion (outgoingness) and conscientiousness, reported each week for three weeks by a group of undergraduate psychology students.
Go to the section in the PERS codebook on "Variables Measured Every Week for 3 Consecutive Weeks." Note that each of these measures is available for three separate weeks, along with a total (actually, an average) over the three weeks. This brief exercise shows how the aggregates were created.
Get descriptive statistics for the first of these repeatedly assessed measures: DO1, the participant's report of their conscientiousness, in general, for the past 24 hours ("conscientiousness today"). We will look at this measure for week #1 (W1DO1), week #2 (W2DO1), week #3 (W3DO1), and the three-week average that was created (DO1):
Analyze>Descriptive Statistics>Descriptives
Select W1DO1, W2DO1, W3DO1, and DO1 as the variables
Note that the minimum score for DO1 (1.67) is not a whole number, since this is the average of the three weeks for this measure.
Use Transform>Compute to create your own 3-week average (aggregate) for this variable:
Call your target variable (the average you are creating) w123do1
For the numeric expression, type: (w1do1+w2do1+w3do1)/3
Now get descriptive statistics (see above) for your variable w123do1. The mean, standard deviation, minimum, and maximum should be identical to those of the DO1 aggregate variable provided in the dataset. This is how all of the aggregates (averages) were created. This is called "temporal aggregation," or aggregation over time (occasions of measurement).
You can also visually check the values that are being averaged for each person by listing the data values of the relevant variables for a few participants. To do this, you will need to open a syntax window: File>New>Syntax.
Type the following command in the syntax window:
List /variables=w1do1 w2do1 w3do1 do1 /cases=from 1 to 10.
In the syntax window menu, click Run>All.
From the output, you should be able to manually verify that DO1 is the average of W1DO1, W2DO1, and W3DO1 for each participant.
You can check other 3-week averages in the same manner as above.
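The menu steps above can also be run from the same syntax window. The following is a minimal sketch of the equivalent syntax, assuming the variable names used in this exercise; the /STATISTICS line simply requests the statistics discussed above and can be changed as needed.

```spss
* Descriptives for the three weekly scores and the provided average.
DESCRIPTIVES VARIABLES=w1do1 w2do1 w3do1 do1
  /STATISTICS=MEAN STDDEV MIN MAX.

* Create your own three-week average, as in Transform>Compute.
COMPUTE w123do1 = (w1do1 + w2do1 + w3do1) / 3.
EXECUTE.

* Compare your average with the one provided in the dataset.
DESCRIPTIVES VARIABLES=do1 w123do1
  /STATISTICS=MEAN STDDEV MIN MAX.
```

Note that with this arithmetic expression, a participant who is missing any one of the three weekly scores receives a missing value on w123do1. If your results differ from DO1, missing weekly scores are one thing to check.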
**NOTE: If you are ultimately interested in the correlation of a measure with other measures, it DOES NOT MATTER if you "average" or simply "sum" across occasions when creating your aggregate. Thus, the term "aggregate" is often used interchangeably with "average." Each method is favored by some people.
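The reason it does not matter can be shown in a few lines. This is a sketch using symbols that are not part of the PERS dataset: S is the three-week sum, A = S/3 is the three-week average, and Y is any other variable.

```latex
\begin{align*}
A &= \tfrac{1}{3} S \\
\operatorname{cov}(A, Y) &= \tfrac{1}{3}\operatorname{cov}(S, Y),
\qquad \sigma_A = \tfrac{1}{3}\sigma_S \\
r_{AY} &= \frac{\operatorname{cov}(A, Y)}{\sigma_A \sigma_Y}
        = \frac{\tfrac{1}{3}\operatorname{cov}(S, Y)}{\tfrac{1}{3}\sigma_S \sigma_Y}
        = \frac{\operatorname{cov}(S, Y)}{\sigma_S \sigma_Y}
        = r_{SY}
\end{align*}
```

Dividing by 3 rescales the covariance and the standard deviation by the same factor, so the factor cancels and the correlation is unchanged.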
An argument for averaging is that it allows you to interpret the scores using the same measurement scale as the single-occasion measures. The PERS dataset uses averages for this reason. However, an argument for summing (totaling up) without averaging is that differences between scores accumulate across occasions and are therefore easier to see than when the occasions are averaged. For our do1 variable, these two alternatives yield the following:
Aggregate method   Minimum score   Maximum score   Mean      Std. Dev.
Average            1.67            7.00            4.9296    1.0404
Sum                5.00            21.00           14.7889   3.1211
Either of these variables would yield exactly the same correlation with any other specified variable. HOWEVER, IF you are interested in the actual score of the aggregated variable, it obviously will matter.
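To check this yourself, you can create the sum and compare it with the average. The following is a minimal sketch; sumdo1 is a hypothetical name chosen for this example, and w1do1 is used only as a convenient "other" variable for the correlation.

```spss
* sumdo1 is a hypothetical name for the three-week sum.
COMPUTE sumdo1 = w1do1 + w2do1 + w3do1.
EXECUTE.

* The sum should show three times the mean and spread of the average.
DESCRIPTIVES VARIABLES=do1 sumdo1
  /STATISTICS=MEAN STDDEV MIN MAX.

* do1 and sumdo1 should correlate identically with w1do1.
CORRELATIONS
  /VARIABLES=do1 sumdo1 w1do1.
```

If DO1 was created as described above, the correlation between do1 and sumdo1 should be 1.0, and the two variables should have the same correlation with w1do1. The descriptives for the sum should be three times those for the average, apart from rounding (for example, 3 x 1.0404 = 3.1212, against the 3.1211 reported above).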