SSRIC Teaching Resources Depository
James Gerber, San Diego State University

Macroeconomics Chapter 4: Testing hypotheses with t-statistics

© The Author, 1998; Last Modified 14 August 1998
In this chapter we examine two procedures for testing whether an observed difference in averages (means) is statistically significant. The first case looks at the difference between unemployment rates for blacks and whites. In this example, we ask whether the observed differences are large enough and systematic enough to give us a high degree of confidence that unemployment affects the black population more severely than whites. This form of inquiry is an attempt to rule out the possibility that the observed differences are solely a reflection of random variation in unemployment rates for both groups.

The second case asks a general question about the US's experience with supply side economics during the 1980s. Supply side policy makers and journalists made extravagant claims about the positive effects of supply side economics. In particular, they argued that deep tax cuts and extensive deregulation would improve incentives for working, investing, and saving. It is well known and widely accepted that higher savings and investment rates are associated with faster growth in real GDP and productivity. In its most extreme form, the supply side argument held that the tax cuts would help to shrink the federal deficit. This flawed reasoning was based on a serious overestimate of the growth stimulus provided by tax cuts and deregulation. In simple terms, supply siders argued that when government revenue becomes a smaller percentage of GDP, the economy grows so much that revenue actually rises in dollar terms.

In order to examine these issues, we must conceptualize the economy as a process which generates many different outcomes. The outcomes are the measured values of the variables in the dataset. The measured values, however, are not entirely determined by the systematic operation of our economic system. There are also random factors that play a role, as well as a certain (unknown) amount of measurement error. The value of every variable in the data set is a result of all three of these factors: the systematic processes of the economy, random and uncontrollable factors external to the economy, and measurement error.

Recognition of randomness and measurement error complicates the simple act of comparing variables. For example, we would like to compare black and white unemployment rates in order to determine the average difference. We have already calculated averages for both races, and black rates are higher. The problem, however, is that we cannot say for certain if there is a systematic component to the difference, given that the higher unemployment rates for blacks could be due to a couple of years of random events or a couple of years of measurement errors. Hypothesis tests for a difference in means enable us to test this possibility. As you might imagine, the procedure depends on both the average unemployment rates and the amount of variation they exhibit over time.

We may also want to compare the values of a single variable measured at different points in time. For example, the 1980s look different from the 1970s. Deficits were higher, inflation was lower, real rates of interest were higher, and so forth. Once again, however, the differences may not be large enough to rule out the possibility that they are due to measurement error or random, non-repeated processes. What we really want to know is whether these differences are systematic enough to give us a high degree of confidence that they cannot be fully explained by the normal amount of variation which is always occurring.

In the following, we are trying to determine if the observed difference in black and white unemployment rates is large enough and persistent enough so that we can rule out the possibility that the "true" underlying difference is zero. Formally, let $\mu_B$ represent the true average rate of unemployment for blacks, and $\mu_W$ the rate for whites. Our hypothesis is $\mu_B = \mu_W$, or alternatively, $\mu_B - \mu_W = 0$. If we rule this out, then it must be the case that $\mu_B \neq \mu_W$, which we will designate our alternative hypothesis. Formally we call these the null and alternative hypotheses, where the term "null" conveys the idea of no difference. Symbolically, they can be written:
$H_0: \mu_B = \mu_W$,
$H_1: \mu_B \neq \mu_W$,

where $H_0$ is the symbol for the null hypothesis.

In fact, however, we never observe the true averages. Instead, we have sample averages which are based on the available data for a group of years. The sample averages are subject to measurement error and random variation due to unique events in particular years. They also reflect the systematic and persistent factors that determine unemployment rates for each group. The relationship between the sample average and the true average is:

Sample average = $\bar{x} = \mu \pm t \cdot \frac{\sigma_{ur}}{\sqrt{n}}$,

where $t$ is the t-statistic and $\sigma_{ur} / \sqrt{n}$ is the standard error of the sample average, that is, the standard deviation of the unemployment rate ($\sigma_{ur}$) divided by the square root of the sample size ($\sqrt{n}$). The t-statistic is the relevant value of a Student's t distribution for $n-1$ degrees of freedom, and (usually) .025 in each tail. (See a statistics text for a complete treatment.)
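As a rough illustration of the relationship above, the sketch below computes a sample average, its standard error, and the range implied by the t-statistic. The nine data values are hypothetical (they are not taken from the module's dataset), the sample standard deviation stands in for $\sigma_{ur}$, and the critical value 2.306 is the standard table entry for 8 degrees of freedom with .025 in each tail, so it applies only to a sample of nine.

```python
import math
import statistics

# Critical t value for 8 degrees of freedom with .025 in each tail,
# taken from a standard t table (valid only for a sample of 9 values).
T_CRIT_8_DF = 2.306


def mean_interval(values, t_crit):
    """Return (sample mean, standard error, lower bound, upper bound)."""
    n = len(values)
    x_bar = statistics.mean(values)
    # Sample standard deviation (n - 1 divisor) divided by sqrt(n).
    std_error = statistics.stdev(values) / math.sqrt(n)
    margin = t_crit * std_error
    return x_bar, std_error, x_bar - margin, x_bar + margin


# Hypothetical annual unemployment rates in percent (NOT the module's data).
rates = [5.2, 6.1, 7.4, 6.8, 5.9, 5.5, 7.0, 6.3, 5.8]
x_bar, std_error, lower, upper = mean_interval(rates, T_CRIT_8_DF)
print(f"mean={x_bar:.3f}  SE={std_error:.3f}  range=({lower:.3f}, {upper:.3f})")
```

A larger sample or a less variable series shrinks the standard error, which narrows the range in which the true average is likely to lie.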

The procedure for carrying out this test in SPSS is straightforward. We will test three pairs of unemployment rates, those for black and white men, women, and teens.

    1. Select Statistics from the menu bar, choose Compare Means, and Paired Samples t test;
    2. Highlight bm20u in the variable list box (this clicks it into the Current Selections box);
    3. Highlight wm20u in the variable list box, and click the arrow to put the pair into the Paired Variables list box;
    4. Do the same for bw20u and ww20u;
    5. Do the same for btu and wtu;
    6. Click OK.
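For readers who want to see what a paired-samples test computes, here is a minimal sketch in Python. It assumes the usual textbook definition (the mean of the year-by-year differences divided by the standard error of those differences); the function name `paired_t` and the five pairs of values are hypothetical and are not the module's bm20u and wm20u series.

```python
import math
import statistics


def paired_t(first, second):
    """Return (mean difference, t-statistic, degrees of freedom) for paired data."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Work with the difference inside each pair, one per year.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / se_diff, n - 1


# Hypothetical unemployment rates for five years (NOT the module's data).
black = [12.0, 13.5, 11.0, 14.0, 12.5]
white = [6.0, 7.0, 5.5, 7.5, 6.5]
mean_diff, t_stat, df = paired_t(black, white)
print(f"mean difference={mean_diff:.2f}  t={t_stat:.2f}  df={df}")
```

Because the test uses the differences within each year, a gap that is steady from year to year produces a small standard error and therefore a large t-statistic, even when both series move around a great deal.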
The SPSS output for black and white men is in Table 5. SPSS prints its results for each pair of variables in two parts. In the upper part of the table, it prints a set of descriptive statistics, including means, standard deviations, and the standard error of the estimate of the mean (SE of Mean). The latter indicates how far the sample mean is likely to lie from the "true" population mean, given that this is a sample based on 25 observations. Between the descriptive statistics for bm20u and wm20u, SPSS prints the number of observations (25), the correlation coefficient (0.949; see Chapter 5), and a test statistic to determine if bm20u and wm20u are significantly correlated.

Table 5
T-tests for Paired Samples

Variable    Number of pairs    Corr     2-tail Sig    Mean    SD    SE of Mean
            25                 0.949    0.000

Paired differences
Mean    SD    SE of Mean    t-value    df    2-tail Sig

In the second part of the table, SPSS puts the results of the test $H_0: \mu_B = \mu_W$. This is the most important information, and the point at which interpretation of results becomes important. The average difference is 6.3307; the t-statistic for the test is 18.17. The 2-tail Sig is the probability, if the null hypothesis is true, of obtaining a t-statistic of 18.17 or larger in absolute value. To three decimal places, that probability is zero. Another way to look at the t-statistic is as the value of the mean difference (6.3307) when it is transferred to a t-distribution scale under the assumption that the null hypothesis is true (no difference in the "true" population means). Since a t-statistic this large has essentially zero probability under the null hypothesis, there is essentially zero probability of getting a sample difference of 6.3307 when the true difference is zero. Hence, we reject the null hypothesis.
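The reported figures can be checked by hand. The arithmetic below assumes the standard paired-samples formula, in which the t-statistic is the mean difference $\bar{d}$ divided by its standard error; the critical value 2.064 is the usual table entry for 24 degrees of freedom (25 pairs minus 1) with .025 in each tail.

```latex
\[
t = \frac{\bar{d}}{s_d / \sqrt{n}}
\quad\Longrightarrow\quad
\frac{s_d}{\sqrt{n}} = \frac{\bar{d}}{t} = \frac{6.3307}{18.17} \approx 0.348
\]
\[
|t| = 18.17 \gg t_{.025,\,24} \approx 2.064
\]
```

The mean difference is roughly eighteen standard errors away from zero, far beyond the cutoff of about two standard errors, which is why the 2-tail Sig rounds to zero.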

What about women? Is the difference between black and white women significant (i.e. significantly different from zero)? What about teens? In general, should we reject the idea that the underlying "true" rates are the same? How confident can you be about this?

Proponents of supply side economics appeared on the scene in the late 1970s, at a time when the traditional Keynesian consensus was in disarray. Growth had fallen in the 1970s, inflation had continued to creep up, unemployment rates were consistently higher than they had been in the 1960s, and Keynesian policy prescriptions seemed to hold little promise for improving the situation. Compounding these macroeconomic problems were several microeconomic ones. The US automobile industry experienced some of its worst years ever and the onslaught of more fuel efficient and reliable Japanese imports began to swamp Detroit. The US steel industry, consumer electronics, machine tools, and a number of other traditional manufacturing strengths also experienced their first real challenge in domestic markets. Some of these industries disappeared from the US altogether (consumer electronics) while others were forced to make painful choices in order to restructure over a period of years (steel).

Given the turmoil in domestic markets and the macroeconomy, it is not surprising that radical alternatives to mainstream economic analysis suddenly began to appear. Supply side economics was the most successful of the radical views. Its proponents managed to win the support of an extremely popular president and were blessed (or cursed) with the opportunity to enact major parts of their program.

During the 1970s, mainstream conservative economists began to examine the macroeconomic effects of taxes and regulations. They came up with a number of widely accepted and credible empirical studies which showed that various taxes and business regulations had become obstacles to economic growth. The conclusion of many of their studies was that if these disincentives to work and invest were addressed, then there would probably be modest improvements in the overall rate of economic growth. In no way did this body of work support the idea that the much higher rates of growth of the 1950s and 1960s would return; rather it showed a potential for relatively modest increases in economic growth.

In the hands of the supply siders, conservative ideas about taxes and regulation were turned into a panacea for every economic problem, including inflation, budget deficits, trade deficits, productivity growth, GDP growth, loss of manufacturing, low savings and investment, and so on. The key promise they made, however, was that with a cut in taxes, saving and investment rates would rise. They argued that when people were allowed to keep a larger piece of future income, they would work, save, and invest more. The rise in work effort, savings and investment would raise the rate of growth of GDP and productivity (output per hour worked).

In 1981, President Reagan took office on the promise that he would enact many of the supply side proposals. The cornerstone of his policy was an across the board income tax cut. Legislation was quickly passed cutting income tax rates by 5% in 1981, 10% in 1982, and 10% in 1983. In addition, he continued the trend that was begun under his predecessor, President Carter, of deregulating various sectors of the economy.

We will examine a number of variables to see if there is any evidence to support the supply siders' claims. In Chapter 3 we created the variable "is," the share of investment in GDP. According to the proponents of supply side economics, this variable should have increased in the 1980s. Similarly, the variable psp, personal savings as a share of disposable personal income, should have risen. The growth rates of productivity (prod1 and/or prod2) and GDP should have risen, and the size of the average deficit should have shrunk.

In each case, we can test for the predicted effects by testing the hypothesis that the mean value (is, psp, GDP growth, productivity growth, deficit as a share of GDP) for the 1970s is different from the mean for the 1980s. The steps to do this first require the computation of the variables not already in the data set:

    1. Select Transform from the menu bar, then choose Compute . . .;
    2. If you have not already done so, create new variables:
        - growth rate of GDP;
        - deficits/GDP;
        - growth rate of productivity;
        - investment/GDP.
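The computed variables are simple transformations. The sketch below shows one way to express them in Python; the function names `growth_rate` and `share`, the choice to report both as percentages, and the sample figures are all assumptions made for illustration, so follow the definitions from Chapter 3 if they differ.

```python
def growth_rate(series):
    """Percent change from the previous period; the first period has no prior value."""
    if not series:
        return []
    rates = [None]  # no growth rate can be computed for the first observation
    for prev, curr in zip(series, series[1:]):
        rates.append(100.0 * (curr - prev) / prev)
    return rates


def share(numerator, denominator):
    """Element-by-element ratio expressed as a percentage (e.g. investment/GDP)."""
    return [100.0 * a / b for a, b in zip(numerator, denominator)]


# Hypothetical figures (NOT the module's data).
gdp = [1000.0, 1030.0, 1050.6]
investment = [150.0, 160.0, 170.0]

gdp_growth = growth_rate(gdp)
investment_share = share(investment, gdp)
print(gdp_growth)
print(investment_share)
```

Note that a growth-rate series always has one fewer usable observation than the series it is computed from.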
Use the recode function to create a marker for the 1970s and 1980s (if you did not do this in the last chapter).
    1. Select Transform from the menu bar, then Recode, and Into Different Variable;
    2. Highlight year in the variable list and use the arrow to move it into the Numeric Variable -> Output box;
    3. Type sside in the Output Variable box and click Change;
    4. Click Old and New Values;
    5. In the Old Value box, click the Range button and put 1971 and 1980 in the two boxes;
    6. In the New Value box type 1 and click Add;
    7. Go back to the Range boxes and type 1981 and 1990;
    8. In the New Value box type 2 and click Add;
    9. Click Continue and then click OK.
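The recode steps above amount to a simple mapping from year to a group marker. A minimal sketch of the same logic in Python follows; the function name `recode_sside` is hypothetical, and returning `None` for years outside both ranges is meant to mirror a missing value for years not covered by either range.

```python
def recode_sside(year):
    """Return 1 for 1971-1980, 2 for 1981-1990, and None for any other year."""
    if 1971 <= year <= 1980:
        return 1
    if 1981 <= year <= 1990:
        return 2
    return None  # year falls outside both decades


years = [1970, 1971, 1980, 1981, 1990, 1991]
sside = [recode_sside(y) for y in years]
print(list(zip(years, sside)))
```

Years that receive no marker drop out of the comparison, so each group contains exactly ten annual observations when the data cover 1971 through 1990.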
Test the hypothesis for each variable,
$H_0: \mu_{70s} = \mu_{80s}$,
$H_1: \mu_{70s} \neq \mu_{80s}$,

using the Independent Samples t test:

    1. Select Statistics from the menu bar, choose Compare Means, then Independent Samples T-Test;
    2. Highlight psp and click the arrow to put it into the Test Variable(s) box;
    3. Do the same for the other variables (investment share, rate of growth of GDP and productivity, deficits as a share of GDP);
    4. Highlight sside and click the arrow to put it into the Grouping Variable box, then click Define Groups . . .;
    5. In Group 1, type 1 and in Group 2, type 2;
    6. Click Continue, then OK;
SPSS will perform t-tests on each variable, comparing the mean value for the 1970s to the mean for the 1980s. For each variable, there are two tables, one with the means and standard deviations, and the second with the t value for the tests. Note that SPSS also automatically performs a test to see if the variances are the same during the two periods (Levene's test) and calculates separate t values for each case (equal variances, unequal variances). If the variances are the same, then the procedure pools all the data from both periods to calculate a pooled variance. This makes the t-test slightly more powerful if it is valid to pool the data.
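To make the distinction between the two t values concrete, here is a minimal sketch of both calculations using their standard textbook formulas: the pooled-variance version and the unequal-variance (Welch) version. The function names and the data are hypothetical, and the sketch does not reproduce Levene's test.

```python
import math
import statistics


def pooled_t(group1, group2):
    """Equal-variance t-statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, n1 + n2 - 2


def welch_t(group1, group2):
    """Unequal-variance t-statistic with approximate degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    a = statistics.variance(group1) / n1
    b = statistics.variance(group2) / n2
    t = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical values for two periods (NOT the module's data).
seventies = [7.0, 8.0, 9.0, 8.0]
eighties = [6.0, 5.0, 7.0, 6.0]

t_pooled, df_pooled = pooled_t(seventies, eighties)
t_welch, df_welch = welch_t(seventies, eighties)
print(f"pooled: t={t_pooled:.3f}, df={df_pooled}")
print(f"welch:  t={t_welch:.3f}, df={df_welch:.2f}")
```

In this example the two groups have the same size and the same variance, so the two versions agree. When the variances differ, the unequal-variance version has fewer degrees of freedom, which is the price paid for not pooling.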

What can you conclude? Did the growth rate of real GDP increase? Did any of the variables perform as predicted by supply side politicians? Why do you suppose supply side theory is ignored by mainstream economists?
