Chapter 3: Reliability of Single vs. Aggregated Measures
This exercise illustrates internal consistency of aggregated measures and Cronbach’s alpha. It uses the PERS dataset, consisting of 90 cases and 968 variables. The variables represent measures of traits and relevant behaviors for the dimensions of extraversion (outgoingness) and conscientiousness, reported each week for three weeks by a group of undergraduate psychology students.
This exercise examines the variability of the same behavioral measure over time. It illustrates that behavioral measures vary from occasion to occasion, and that summing them over time produces an aggregate measure of the behavior that is more reliable (consistent) than any of the single-occasion measurements. This is because situational, circumstantial, and idiosyncratic factors vary from one occasion to another and contribute to error in the measurement of the behavior at a single occasion; when scores are aggregated, these occasion-specific errors tend to cancel out. See Epstein (1983) and Rushton, Brainerd, and Pressley (1983) for information about the process and benefits of aggregation in measurement. This exercise uses Cronbach’s coefficient alpha to estimate the reliability of the aggregate measure, and the correlations between the same behavioral measure at different times to estimate the reliability of the single-occasion measurement.
The exercise also illustrates the concept of internal consistency reliability, that is, the consistency of the items within a test or measure with one another. Chapter 2 showed that the "3-week average" measures in the dataset are simply the sum (divided by 3, giving an average) of the scores for each of the three weeks they were measured. If we treat each aggregate measure as a "score" totaling three items (the three weeks added up to form the aggregate), then we can use the SPSS Reliability program to find the reliability of these three-item "scores."
We will use the daily neatness measures from Chapter 2. First, we will look at the "reliability" of the first neatness item, time spent on appearance. See how this varies across the three weeks by getting bivariate correlations between W1DCN1, W2DCN1, and W3DCN1. These correlations represent the consistency in this behavior across the three weeks. The correlations are .580, .590, and .698 (the median is .590). Thus, this measure of a person’s attention to personal neatness is only moderately reliable, correlating about .59 from week to week.
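The week-to-week correlations described above can also be sketched outside SPSS. The following Python example is a minimal illustration, not part of the original exercise: the helper name pearson_r and the six rows of scores are hypothetical and are not taken from the PERS dataset. Only the three reported correlations (.580, .590, .698) come from the text.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six people (NOT taken from the PERS dataset).
weeks = {
    "W1DCN1": [20, 35, 15, 40, 25, 30],
    "W2DCN1": [25, 30, 20, 35, 20, 40],
    "W3DCN1": [15, 40, 20, 45, 30, 25],
}

# One correlation per pair of weeks: (W1, W2), (W1, W3), (W2, W3).
pairwise = {
    (a, b): pearson_r(weeks[a], weeks[b])
    for a, b in combinations(weeks, 2)
}
for (a, b), r in pairwise.items():
    print(f"{a} with {b}: r = {r:.3f}")

# Median of the three correlations reported in the text.
reported = [0.580, 0.590, 0.698]
print(f"median reported r = {median(reported):.3f}")
```

With three weekly measures there are three distinct pairs, so the median is simply the middle correlation.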
The 3-week aggregate (DCN1) was created by simply summing each person’s scores for W1DCN1, W2DCN1, and W3DCN1 (dividing the sum by 3 to obtain an average does not change its reliability). In essence, this creates a "3-item scale," with DCN1 being the total score for this "scale." We can use the SPSS Reliability program to estimate the reliability of this "scale" (our 3-week aggregate measure).
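To make the "3-item scale" idea concrete, here is a minimal Python sketch of Cronbach’s alpha computed from item variances and the variance of the total score. It is an illustration under stated assumptions rather than a reproduction of the SPSS procedure: the function name cronbach_alpha and the six rows of scores are hypothetical and are not taken from the PERS dataset.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores (one per person).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    # Total score per person: the sum across the k items (here, the k weeks).
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical weekly scores for six people (NOT taken from the PERS dataset).
w1 = [20, 35, 15, 40, 25, 30]
w2 = [25, 30, 20, 35, 20, 40]
w3 = [15, 40, 20, 45, 30, 25]

alpha = cronbach_alpha([w1, w2, w3])
print(f"Cronbach's alpha for the 3-week aggregate: {alpha:.2f}")
```

Each week plays the role of an item, and the per-person total plays the role of the aggregate score. The more the weekly scores covary, the larger the variance of the total relative to the sum of the weekly variances, and the higher alpha becomes.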
To run the reliability analysis, select Analyze > Scale > Reliability Analysis. For the items, select W1DCN1, W2DCN1, and W3DCN1. Select the box to "List item labels." Click on the Statistics button and select the following optional statistics: Descriptives for item, scale, and scale if item deleted; Inter-item correlations. The output shows that Cronbach’s alpha (indicating internal consistency reliability) for our "scale" is .83. This is a large improvement over the consistency of each single-week measurement (median of .59 across the three weeks; see above).
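As a rough check on the reported value, the three week-to-week correlations given earlier can be entered into the standardized form of alpha, which is the Spearman-Brown formula applied to the mean inter-item correlation. This is a sketch under an assumption: SPSS computes its default alpha from item variances and covariances, so the standardized version matches it only approximately, and most closely when the weekly variances are similar. The function name standardized_alpha is hypothetical.

```python
def standardized_alpha(correlations, k):
    """Standardized alpha from the mean inter-item correlation among k items.

    alpha_std = k * r_bar / (1 + (k - 1) * r_bar)
    """
    r_bar = sum(correlations) / len(correlations)
    return k * r_bar / (1 + (k - 1) * r_bar)


# Week-to-week correlations reported in the text for W1DCN1, W2DCN1, W3DCN1.
week_to_week_r = [0.580, 0.590, 0.698]

alpha_std = standardized_alpha(week_to_week_r, k=3)
print(f"standardized alpha = {alpha_std:.2f}")  # prints 0.83
```

The result agrees with the reported alpha of .83 to two decimal places, which shows how three measurements that each correlate only about .6 with one another combine into a considerably more reliable aggregate.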
Select any other measure from the dataset that was measured each week for three weeks, compute its correlations across the three weeks, and then use SPSS Reliability to determine the reliability of the 3-week aggregate measure.
References
Epstein, S. (1983). Aggregation and beyond: Some basic issues on the prediction of behavior. Journal of Personality, 51, 360-392.
Rushton, J. P., Brainerd, C. J., & Pressley, M. (1983). Behavioral development and construct validity: The principle of aggregation. Psychological Bulletin, 94, 18-38.






