Evaluating Fund Stabilization Programs with Multicountry Data: Some Methodological Pitfalls
Author:
Mr. Morris Goldstein https://isni.org/isni/0000000404811396 International Monetary Fund

and
Mr. Peter J Montiel

Abstract

WILLIAM. white, who joined the International Monetary Fund in 1948, spent his entire professional life in the Research Department. Present and past staff members, many of whom benefited from his advice, have asked that his contribution to the work of the Fund should receive recognition in Staff Papers. This appreciation draws on excerpts from written recollections of some of his colleagues.

A noteworthy by-product of the continuing debate over the benefits and costs of Fund conditionality has been the development of a considerable empirical literature on Fund stabilization programs (that is, on stand-by and extended arrangements with the Fund). Furthermore, although the first studies of Fund program experience were carried out almost exclusively by Fund staff (for example, Reichmann and Stillson (1978), Reichmann (1978)), the past seven years have witnessed at least as much quantitative scrutiny of Fund programs from outside the Fund as from within it.1

A common practice in many of these studies has been to compare the behavior of one or more key macroeconomic variables (for example, the current account, overall balance of payments, rate of inflation, growth rate of real output, and the like) before the program with the behavior during or after the program. To account for changes in the international economic environment that could alter macroeconomic outcomes independently of the program, it also has become increasingly popular (see Donovan (1982) and Gylfason (1983)) to supplement the “before-after” calculations for program countries with a similar comparison for a reference or “control group” of nonprogram countries.

The primary purpose of this paper is to present and to discuss several methodological problems or pitfalls that can cause “true” program effects to differ from “estimated” program effects when either the before-after approach or the control-group approach is used.2 Particular attention is paid to the inferences that can properly be drawn about the independent effects of a Fund program from a comparison of program and nonprogram countries. More specifically, we attempt to spell out the conditions under which the observed behavior of macroeconomic outcomes in non-program countries can serve as a good predictor of the unobserved behavior of program countries in the absence of a program, and to identify the biases in estimates of program effectiveness if these conditions are not satisfied.

Because the issue of program effectiveness is a broad and controversial one, it is worthwhile at the outset to indicate three particular caveats relevant to this study. First, we interpret or define “program effectiveness” as the difference between the actual macroeconomic performance observed under a Fund program and the performance that would have been expected in the absence of such a program. As noted by Guitián (1981, pp. 36–37), this is only one of at least three possible measuring rods.3 Two others, both of which have been used in some earlier studies, are the difference between actual macroeconomic performance under the program and actual performance before the program, and the difference between actual performance under the program and the performance specified in the targets of the program. Obviously, these three alternative performance indicators can yield different verdicts about program effectiveness. Our preference for the first measure, despite its subjective nature, rests on the argument that it is the only one that can provide an estimate of the program’s independent effect in the real world, where non-program factors (for example, oil price shocks, varying rates of economic activity in industrial countries, and the like) are also operating on observed macroeconomic outcomes.

A second caveat is that the interpretation of program effects in this paper depends critically on our definition of a “program.” Specifically, a program is defined to be in effect when a country has a formal arrangement with the Fund, and not when a country adopts a “Fund-type” policy package on its own. Under this definition of a program, and using our preferred definition of program effectiveness, a program would be judged to have no effect if the country would have adopted the identical set of policies anyway, even though the policies themselves may have substantial impact on the economy and even though these Fund-type policies could be better than some other set of policies.4 It might be argued that the effects of Fund-type policies, rather than of Fund involvement, are the more relevant issue. To investigate the effects of Fund-type policies, it is not necessary to differentiate between program and nonprogram countries; instead, the relevant comparison would be between macroeconomic outcomes under Fund-type policies and those under some other set of policies. Program countries would no doubt be included in the data set, but they would not be identified as such. This approach would have the advantage that the considerable diversity in the policy mix and in country circumstances that characterizes Fund programs can be reflected in the analysis. This is not the case with the “on-off” approach typically adopted with large multicountry samples. Nevertheless, much of the existing empirical literature on Fund stabilization programs does make direct comparisons between program and nonprogram countries. It is therefore of some interest to identify what can and cannot be legitimately inferred from such comparisons.

Third, this paper deals exclusively with the methodology of estimating program effects. Specifically, we do not offer our own estimates of Fund program effects. We do present (in Section III) some empirical examples of how estimated program effects can differ depending on the methodology used, but these should not be viewed as reliable estimates of program effects themselves. Indeed, it is one of the central tenets of this paper that reliable estimates of Fund program effects from multicountry data must await, among other things, further testing of the issues and pitfalls outlined here.5 In this sense the calculations presented in this paper should not alter anyone’s view about whether Fund stabilization programs “work”; these calculations do, however, have implications for the kinds of evidence that one may want to collect in the future to determine if and how programs work.

The plan of the rest of the paper is as follows. In Section I we introduce a simple but fairly general model of the relationship between macroeconomic outcomes and the presence or absence of a Fund stabilization program. This model not only permits Fund programs to affect macroeconomic outcomes in program countries through a variety of channels, but it also permits prior macroeconomic outcomes to affect the probability that a country will embark upon a Fund program. In addition, the model admits the possibility of stabilizing macroeconomic policy actions in the absence of a Fund program. We then use this model to analyze the conditions under which true program effects would equal estimated program effects under two shorthand calculations: before-after comparisons of (mean) macroeconomic outcomes for program countries alone, and before-after comparisons of (mean) outcomes for program countries relative to those for nonprogram countries. In anticipation of what follows, potentially serious estimation biases are found to exist when the “selection” of program countries is nonrandom and when the determinants of macroeconomic outcomes are correlated with the determinants of Fund program selection.

In Section II we outline a procedure (the “modified control-group approach”) for removing sample-selectivity bias from control-group estimates of the effects of Fund programs when the selection of program countries is nonrandom. This modified control-group approach is also capable (in principle) of providing information on how total program effects are apportioned among induced changes in policy instruments, induced changes in behavioral parameters, and general “confidence effects.” Practical estimation problems associated with the modified control-group approach are discussed.

In Section III we investigate the empirical relevance of the most important methodological pitfalls mentioned above. For this purpose we utilize a sample of developing countries and of Fund stabilization programs over the 1974–81 period. Estimates of Fund program effects are then compared against three alternative estimators—a before-after comparison of mean outcomes for program countries alone, a before-after comparison of mean outcomes for program countries relative to that for nonprogram countries, and a reduced-form regression estimate of program effects that controls for revealed preprogram differences between program and nonprogram countries. The three alternative estimators are demonstrated to produce substantially different estimates of Fund program effects. Conclusions are summarized in Section IV.

I. Comparing Alternative Estimators of Fund Program Effects

In this section we introduce an explicit analytical framework (in the form of a simple four-equation model) for analyzing the effects of Fund stabilization programs.

A Simple Model of Program Effects

For the purposes of this paper, it was desirable for such a model to have four broad features. First, the model should be general enough that the two dominant existing statistical approaches to ex post program evaluation (the before-after approach and the control-group approach) could be treated as special cases of the more general model. In this way the assumptions implicit in the existing methodologies can be identified and evaluated. Second, the model should incorporate nonprogram determinants of macroeconomic outcomes of both an international and a country-specific nature. Third, given that Fund stabilization programs operate primarily by altering the design or stance (or both) of macroeconomic policies, the model should indicate the determinants of indigenous changes in macroeconomic policy so that the macroeconomic outcomes expected in the absence of a program can be explicitly defined. In other words, we want a model in which the policy instruments and not just macroeconomic outcomes are endogenous. Finally, the model should indicate what objective factors, if any, determine the probability that a country will have a Fund program during a given period. The reason for treating Fund program status or program-country selection as an endogenous variable is that this is the only way in which to investigate the consequences of systematic differences between program countries and nonprogram countries. Obviously, if such differences exist before a program period, they need to be taken into account in any subsequent comparison of program and nonprogram countries to the extent that they affect macroeconomic performance. Failure to include such differences would mean that variations in macroeconomic performance between the two country groups could be attributed to the presence or absence of a program, when in reality the differences might in large part reflect other factors.

In equations (1) through (4) below, we set out a model of Fund program effects that contains these basic features:

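$$ \Delta y_{ij} = \Delta x_i' \beta_{ij} + \Delta W' \alpha_{ij} + \beta_{ij}^{IMF} \Delta d_i + \Delta \epsilon_{ij} \tag{1} $$
$$ \Delta x_i = \gamma \left[ y_i^d - (y_i)_{-1} \right] + \eta_i \tag{2} $$
$$ z_i = \left[ y_i^d - (y_i)_{-1} \right]' \delta + \pi_i \tag{3} $$
$$ d_i = \begin{cases} 1 & \text{if } z_i > z^* \\ 0 & \text{if } z_i \le z^* \end{cases} \tag{4} $$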

In these equations, $y_{ij}$ is the $j$th macroeconomic outcome or target variable in country $i$; $x_i$ is a $K$-element vector of macroeconomic policy variables that would be observed in country $i$ in the absence of a Fund program; $W$ is an $M$-element random vector of world nonprogram variables; $z_i$ is a random variable that serves as the index of country-specific characteristics that determines the probability of country $i$ having a Fund program during a given period; $d_i$ is a dummy variable that takes the value of unity if a country has a Fund program and the value of zero otherwise; $y_i^d$ is the desired value of the vector $y_i$; $z^*$ is the threshold value of $z_i$ that divides program from nonprogram countries; $\epsilon_{ij}$, $\eta_i$, and $\pi_i$ are unobservable error terms (with zero means and fixed variances) that are serially and (for simplicity) mutually uncorrelated; $\beta_{ij}$, $\alpha_{ij}$, $\beta_{ij}^{IMF}$, $\gamma$, and $\delta$ are constants with the appropriate dimensions; $\Delta$ is the first-difference operator; the subscript $-1$ indicates the previous period; and a prime (′) denotes the transpose of a matrix.

The variable yij in equation (1) should be considered as one of the primary targets of a stabilization program, such as the current account, the overall balance of payments, the inflation rate, the real growth rate, and the like.6 Equation (1) can then be interpreted as positing that the change in this macroeconomic outcome or target variable will be a function of four factors: (1) changes in macroeconomic policy instruments (for example, the rate of domestic credit expansion; government tax revenues, expenditure, or both; the exchange rate; and the like) that would have occurred in the absence of a program; (2) changes in world economic conditions (such as changes in world oil prices or changes in real economic activity in industrial countries); (3) the total effect of a Fund program if the country has a program in place during that period; and (4) a host of unobservable shocks that are specific to country i.

A special word of comment is appropriate for βijIMF, which is the coefficient that indicates the effect of a Fund program on macroeconomic outcomes. In our view, this coefficient should incorporate at least three channels or avenues by which Fund programs can affect yij. First, Fund programs can alter the value of macroeconomic policy instruments from what that value would be in the absence of such programs. Note, however, that since Δxi is defined as the change in policy instruments that would occur in the absence of a program, Δxi is directly observable only for non-program countries; for program countries, Δxi must be estimated (through equation (2)). In any case, the important implication is that a program can affect yij by making the actual change in policy instruments different from Δxi. The second potential channel of program effect is by altering what might be called the general state of confidence about the economy of country i. Here the successful negotiation of a credible program with the Fund may, for example, have a positive effect on private and official capital inflows into country i that may indeed be quantitatively more significant than the financial resources supplied by the Fund itself in support of the stabilization package. This, like the first channel of effect, is of course an empirical question; suffice it to note here that the measurement of such confidence effects of programs is an extremely difficult task in practice. The third and final channel of potential program effect is by changing the parameters βij for any given size change in the policy instruments. In other words, programs can work not only by, say, making monetary and fiscal policies more restrictive than they would otherwise be, but also by improving or reducing the effectiveness of any given stance of policy. 
The ways in which behavioral parameters can shift in response to policy changes have been outlined by Lucas (1976), but here it is enough merely to note that programs can change the expectations of agents in the economy about the future course of xi and yij, and these altered expectations can in turn affect Δyij.7 The assumptions in equation (1) that unobservable country-specific shocks have zero expected means and are serially uncorrelated imply that, other things being equal, a negative shock to, say, country i’s balance of payments in period t is not expected to be repeated in the next period. In other words, the model contains a regression-to-the-mean characteristic for macroeconomic outcomes that provides for some automatic stabilization. Of course, only the data can decide whether such an assumption is consistent with the recent experience of program and nonprogram countries.

The basic notion represented in equation (2) is that the authorities display a systematic policy reaction to perceived disequilibria in their macroeconomic target variables. More specifically, equation (2) says that the change in country i’s macroeconomic policy instruments between the current and previous period will be a function of the difference between the desired value of the macroeconomic target variables in this period, yid, and their actual value in the preceding period, (yi)-1, with γ serving as the coefficient that indicates the responsiveness of the policy instruments to such target disequilibria. For example, in the case of stabilizing policy behavior, equation (2) would suggest that a current account deficit in the preceding period that was large relative to the authorities’ target deficit would call for a downward adjustment in, say, the rate of domestic credit expansion in this period. Because Δxi is defined as the change in policy instruments in country i that would occur in the absence of a program, equation (2) spells out “normal” policy behavior by the authorities and thus provides one approach to estimating the counterfactual for program countries.

In addition, so long as γ carries the correct sign, equation (2) also implies that there may well be stabilizing policy action in nonprogram countries. Idiosyncratic, country-specific policy behavior is intended to be captured by the error term ηi in equation (2).

Equations (3) and (4) constitute perhaps the greatest departure in this model from the earlier literature on program evaluation by suggesting that the presence or absence of a Fund program should itself be treated endogenously and, in particular, as a function of observable country-specific characteristics. There are strong a priori reasons for believing that Fund program status is not random. A necessary (but not sufficient) condition for the use of Fund resources is that the country display a balance of payments need.8 This implies that, among the population of potential claimants for Fund resources, the sample of countries with Fund programs in place at any given time is likely to have displayed less favorable external balance performance before the program period itself than the population at large.

As written, equation (3) uses the difference between the desired values of macroeconomic target variables in this period and their actual values in the preceding period to explain the probability that country i will have a Fund program this period. Under the assumption that the desired target values yid are constant over time but not necessarily across countries, this difference reduces to a formulation in which the actual values of macroeconomic outcomes in the preprogram period, (yi)-1, influence program-country selection. Equation (3) should therefore be capable of capturing systematic selection of program countries by the Fund on the basis of balance of payments need because the vector (yi)-1 will include preprogram values of country i’s external accounts.

Note further that such specification of equation (3) deliberately makes for a potentially serious problem. As written, the preprogram outcomes (yi)-1 help to explain program selection in equation (3) and also policy reaction Δxi in the absence of a program in equation (2). Because both zi and Δxi can influence the change in macroeconomic outcomes between the program and preprogram year, it can be seen that our model sets up the troublesome possibility that it will be difficult to separate program from non-program determinants of Δyi. As we shall demonstrate later, this problem will be present so long as the determinants of Fund program status di are correlated with the determinants of Δyi, whether through Δxi or through any other variable explaining Δyi. As also discussed later, this problem disappears if program selection is random, since δ in equation (3) will then be zero; that is, it will not be possible to relate program-country selection zi—hence, ultimately, di—to any observable objective factors.
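To make the structure of equations (1) through (4) concrete, the following is a minimal simulation sketch of a scalar version of the model. It is an illustration only, not the authors' procedure: the function name `simulate_model`, all parameter values, the normal distributions, the setting of the desired target to zero, the assumption that no program was in place in the previous period (so that Δd equals d), and the direct drawing of Δε as an independent shock are assumptions introduced here.

```python
import random
from statistics import mean

def simulate_model(n=20_000, gamma=0.5, delta=1.0, beta=1.0, beta_imf=1.0,
                   dW_alpha=-0.5, y_desired=0.0, z_star=0.5, seed=0):
    """Draw one cross-section from a scalar version of equations (1)-(4).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y_lag = rng.gauss(0.0, 1.0)              # (y_i)_{-1}: preprogram outcome
        gap = y_desired - y_lag                  # y_i^d - (y_i)_{-1}
        dx = gamma * gap + rng.gauss(0.0, 1.0)   # equation (2): policy reaction
        z = delta * gap + rng.gauss(0.0, 1.0)    # equation (3): selection index
        d = 1 if z > z_star else 0               # equation (4): program status
        # Equation (1); no program in the previous period, so delta d_i = d_i.
        dy = dx * beta + dW_alpha + beta_imf * d + rng.gauss(0.0, 1.0)
        rows.append((y_lag, d, dy))
    return rows

rows = simulate_model()
program_lag = mean(y for y, d, _ in rows if d == 1)
nonprogram_lag = mean(y for y, d, _ in rows if d == 0)
print(f"mean preprogram outcome: program {program_lag:.2f}, "
      f"nonprogram {nonprogram_lag:.2f}")
```

With a positive `delta`, countries whose preprogram outcome falls short of the desired value are more likely to cross the threshold, so the simulated program group starts from a weaker position than the nonprogram group. The same preprogram outcome also drives the policy reaction in equation (2), which is the source of the difficulty described above.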

With the general outlines of this model of Fund program effects in mind, we can next proceed to analyze how estimated program effects will differ from true program effects under a variety of shorthand estimation techniques.

The Before-After Approach

This approach to ex post program evaluation has been used both by Fund staff (for example, Reichmann and Stillson (1978)) and by outside observers (for example, Connors (1979), Killick and Chapman (1982)). Although these studies utilized multicountry samples, this approach is not necessarily a cross-sectional technique because the (implicit) parameters estimated are allowed to differ across countries.

Recalling that $\beta_{ij}^{IMF}$ is the “true” effect of a Fund program on the $j$th target variable in country $i$, the before-after approach estimates $\beta_{ij}^{IMF}$ (call the estimate $\hat{\beta}_{ij}^{IMF,A}$) as:

$$ \hat{\beta}_{ij}^{IMF,A} = \Delta y_{ij}, \quad i \in P, \tag{5} $$

where $P$ denotes the set of program countries. Thus, any change in a target variable in a program country (or in a group of program countries) is attributed exclusively to program effects. The estimate, $\hat{\beta}_{ij}^{IMF,A}$, is sometimes subjected to (nonparametric) statistical tests of significance and sometimes not.

The fatal flaw of the before-after approach is that it relies on assumptions of other things being equal that are highly implausible. To see this, let us introduce only equation (1) from the general model of program effects:

$$ \Delta y_{ij} = \Delta x_i' \beta_{ij} + \Delta W' \alpha_{ij} + \beta_{ij}^{IMF} \Delta d_i + \Delta \epsilon_{ij}. \tag{1} $$

Now suppose that the preceding period was one during which there was no Fund program in effect (so that $d_i = 0$ for $t - 1$). Then,

$$ \Delta y_{ij} = \beta_{ij}^{IMF} + \Delta x_i' \beta_{ij} + \Delta W' \alpha_{ij} + \Delta \epsilon_{ij} \quad \text{for } i \in P. \tag{6} $$

From equation (5), the before-after approach then gives

$$ \hat{\beta}_{ij}^{IMF,A} = \beta_{ij}^{IMF} + \Delta x_i' \beta_{ij} + \Delta W' \alpha_{ij} + \Delta \epsilon_{ij} \quad \text{for } i \in P. \tag{7} $$

Taking expectations of equation (7), conditional on the presence of a Fund program in country i and on observed changes in the world economic environment, we have

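$$ E(\hat{\beta}_{ij}^{IMF,A} \mid i \in P, \Delta W) = \beta_{ij}^{IMF} + E(\Delta x_i' \mid i \in P, \Delta W)\, \beta_{ij} + \Delta W' \alpha_{ij} + E(\Delta \epsilon_{ij} \mid i \in P, \Delta W). \tag{8} $$

Thus, the before-after approach would produce an unbiased estimate of program effects, that is, $E(\hat{\beta}_{ij}^{IMF,A} \mid i \in P, \Delta W) = \beta_{ij}^{IMF}$, if and only if

$$ E(\Delta x_i' \mid i \in P, \Delta W)\, \beta_{ij} + \Delta W' \alpha_{ij} + E(\Delta \epsilon_{ij} \mid i \in P, \Delta W) = 0. \tag{9} $$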

In other words, an unbiased estimate of program effects would require that the nonprogram determinants of yij would have behaved in such a way as to leave yij unchanged, on average, between the preprogram and current program periods. The 1973–81 period, when large changes in world oil prices, large year-to-year changes in industrial country real gross national product (GNP), and significant shifts in real interest rates created serious difficulties for the external positions of developing countries (see International Monetary Fund (1983), Goldstein and Khan (1982), and Khan and Knight (1983)), gives sufficient reason to doubt that ΔW′αij would be zero for most program countries, even over short periods. By the same token, the ex post record of money supply growth and fiscal deficits by developing countries during the same period (see International Monetary Fund (1983) and Gylfason (1983)) generates skepticism that changes in domestic policy instruments would have been such as to offset exactly, on the average, the effects of external and internal shocks for most program countries. In short, we should expect the other-things-equal assumption of the before-after approach to be violated in practice. As such, estimates of program effects under this approach are likely to be contaminated by nonprogram factors.
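The role of condition (9) can be illustrated numerically. The sketch below draws program-country outcomes from equation (6) and averages the before-after estimate of equation (5). It is a hedged illustration: the function name `before_after_estimate`, the true effect of 1.0, the size of the world shock, and the normal, mean-zero draws for the policy and country-specific terms are all assumptions made here, not values from the paper.

```python
import random
from statistics import mean

def before_after_estimate(n=50_000, beta_imf=1.0, dW_alpha=-2.0, seed=0):
    """Average of equation (5) over program countries drawn from equation (6).

    beta_imf is the (hypothetical) true program effect; dW_alpha stands for
    the world-shock term in equation (1).
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n):
        dx_beta = rng.gauss(0.0, 1.0)  # policy change absent a program (mean zero, assumed)
        d_eps = rng.gauss(0.0, 1.0)    # country-specific shock (mean zero)
        changes.append(beta_imf + dx_beta + dW_alpha + d_eps)
    return mean(changes)

# No world shock: condition (9) holds and the estimate is near the true 1.0.
print(round(before_after_estimate(dW_alpha=0.0), 2))
# Adverse world shock: the estimate is near 1.0 - 2.0 = -1.0 instead.
print(round(before_after_estimate(dW_alpha=-2.0), 2))
```

In this stylized setting the entire world shock is attributed to the program, so a program with a positive true effect appears harmful.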

The Traditional Control-Group Approach

A second approach, hereafter called the traditional control-group approach, has a long history in empirical labor economics but appears to have been first applied to analysis of experience with Fund programs by Donovan (1982).9 More recently, Gylfason (1983) has adopted a more sophisticated version of it.

This technique in effect uses the behavior of a control group (a group of nonprogram countries) to estimate what would have happened in the program group in the absence of programs. Thus it implicitly assumes that only the program itself distinguishes the group of program countries from the control group. It is therefore natural to interpret this as a cross-sectional approach. Specifically, in terms of the model, we can drop all the country i subscripts from the coefficients because these are now assumed to be identical across countries. In addition, because we are now dealing with country groups, βijIMF represents the mean effect of Fund programs on the j th macroeconomic target variable. The equation for Δyij can now be written as

$$ \Delta y_{ij} = \Delta x_i' \beta_j + \Delta W' \alpha_j + \beta_j^{IMF} \Delta d_i + \Delta \epsilon_{ij}. \tag{10} $$

Under the control-group approach, $\beta_j^{IMF}$ is estimated by

$$ \hat{\beta}_j^{IMF,B} = (\Delta \bar{y}_j)_P - (\Delta \bar{y}_j)_N, \tag{11} $$

where a bar over a variable represents its mean, and N denotes the set of nonprogram countries.

To investigate the properties of this estimator, we again take expectations. Applying this procedure to equation (11) yields10

$$ E(\hat{\beta}_j^{IMF,B}) = \beta_j^{IMF} + E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid i \in P] - E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid i \in N]. \tag{12} $$

From equation (12), it can be seen that the condition for $\hat{\beta}_j^{IMF,B}$ to represent an unbiased estimate of the true program effects, $\beta_j^{IMF}$, is that

$$ E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid i \in P] - E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid i \in N] = 0. \tag{13} $$

In other words, the groups of program and nonprogram countries have to be drawn from the same population in the sense that the expected value of the change in nonprogram determinants of yij must be the same for members of both groups. Comparing equation (13) with equation (9) shows that the control-group approach is not necessarily less restrictive than the before-after approach. Although the control-group approach controls for the effect of changes in the global economic environment (that is, the term ΔW′αij that appears in equation (9) drops out as a source of bias in equation (13) because such global factors are assumed to affect program and nonprogram countries equally), it does so at the expense of introducing a new source of bias—the characteristics of nonprogram countries (that is, the term E[(Δxiβj+Δєij)|iєN] appears in equation (13) but not in equation (9)).11

The foregoing suggests that the choice between the before-after approach and the control-group approach to estimating program effects ought to depend on one’s a priori beliefs about similarities between program and nonprogram countries and about the relationship between domestic and global determinants of Δyij. Specifically, if program and nonprogram countries are believed to be quite similar on average, and if the domestic determinants of Δyij are not believed to offset international influences on Δyij, then equation (13) is more likely to be satisfied than equation (9); hence the control-group approach will provide a better (less biased) estimate of program effects than will the before-after approach. We next proceed to investigate, first, the nature of the bias that is generated by this methodology when the determinants of program-country selection are correlated with the determinants of macroeconomic performance and, second, the nature of the biases in the specific (but probably most relevant) case in which both program-country selection and macroeconomic performance depend on macroeconomic performance before the program period.
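The trade-off between conditions (9) and (13) can be sketched in a case where the control-group condition holds by construction. In the simulation below, program status is assigned at random and a common world shock hits every country. The function name `compare_estimators`, the 30 percent program share, the shock size, and the normal draws are assumptions introduced for illustration.

```python
import random
from statistics import mean

def compare_estimators(n=50_000, beta_imf=1.0, dW_alpha=-2.0,
                       p_program=0.3, seed=0):
    """Return (before-after, control-group) estimates under random selection."""
    rng = random.Random(seed)
    program, nonprogram = [], []
    for _ in range(n):
        d = 1 if rng.random() < p_program else 0   # random selection (assumed)
        # Equation (10): policy term + world shock + program effect + shock.
        dy = rng.gauss(0.0, 1.0) + dW_alpha + beta_imf * d + rng.gauss(0.0, 1.0)
        (program if d else nonprogram).append(dy)
    before_after = mean(program)                       # equation (5), averaged
    control_group = mean(program) - mean(nonprogram)   # equation (11)
    return before_after, control_group

ba, cg = compare_estimators()
print(f"before-after: {ba:.2f}, control-group: {cg:.2f}")
```

Because the world shock enters both group means, it cancels in the control-group difference but not in the before-after figure. This favorable result depends entirely on selection being random here; it says nothing about the case of systematic selection.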

Nonrandom Selection of Program Countries

To examine these issues it is helpful to introduce the index of unobservable country-specific characteristics, zi, that regulates the probability that country i will have a program during any given period. Specifically, we now introduce equation (4) from the general model of program effects:

$$ d_i = \begin{cases} 1 & \text{if } z_i > z^* \\ 0 & \text{if } z_i \le z^*, \end{cases} \tag{4} $$

where $z^*$ is an arbitrary threshold value for $z_i$. Instead of also introducing equation (3), assume for the moment that $E(z_i) = 0$ and $E(z_i^2) = \sigma_z^2$. Equation (4) says that a country will have a program if its index of country-specific characteristics is greater than $z^*$; if not, it will not have a Fund program, at least in that period. The probability that a country will have a program is therefore equal to the probability that $z_i > z^*$.

We can now use equation (4) to rewrite the necessary condition, as previously expressed in equation (13), for an unbiased estimator of the program effects under the control-group approach; that is,

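$$ E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid z_i > z^*] - E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid z_i \le z^*] = 0. \tag{13a} $$

Recall that both $\Delta x_i' \beta_j + \Delta \epsilon_{ij}$ and $z_i$ are random variables. Suppose that the correlation between these two variables is given by $\rho_{xz}$ and that the expected value of $\Delta x_i' \beta_j + \Delta \epsilon_{ij}$ is $\Delta \bar{x}' \beta_j$. We show in Appendix I that if $\Delta x_i' \beta_j + \Delta \epsilon_{ij}$ and $z_i$ have a joint normal distribution, then

$$ E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid z_i > z^*] \begin{cases} > \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} > 0 \\ = \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} = 0 \\ < \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} < 0 \end{cases} \tag{14a} $$
$$ E[(\Delta x_i' \beta_j + \Delta \epsilon_{ij}) \mid z_i \le z^*] \begin{cases} < \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} > 0 \\ = \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} = 0 \\ > \Delta \bar{x}' \beta_j & \text{if } \rho_{xz} < 0. \end{cases} \tag{14b} $$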

In other words, if Δx′i βj+Δєij and zi are correlated (ρxz ≠ 0), then our expectation of Δx′i βj+Δєij will depend on the value taken by zi. This result is intuitive. Suppose, for example, that ρxz is positive. Then, relatively large values of Δx′i βj+Δєij are associated with relatively large values of zi. Thus, if we know that zi is relatively large for some country i but do not observe Δx′i βj+Δєij, our expectation is that Δx′i βj+Δєij will also be relatively large for this country. Likewise, if ρxz is negative, then relatively large values of zi will be associated with relatively small values of Δx′i βj+Δєij, and observing a large zi will lead us to expect a small Δx′i βj+Δєij. In contrast, if Δx′i βj+Δєij and zi are known to be uncorrelated, then observing a large zi gives no basis on which to expect Δx′i βj+Δєij to be either particularly large or particularly small for the ith country.
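For readers who want magnitudes as well as signs, the standard truncated bivariate normal result gives closed forms for these conditional expectations. The block below states that textbook result; it is not reproduced from Appendix I, and the shorthand $u_i$, $\mu_u$, $\sigma_u$, and $c$ is notation introduced here for illustration.

```latex
% Shorthand (assumed here): u_i = \Delta x_i' \beta_j + \Delta \epsilon_{ij},
% with mean \mu_u = \Delta \bar{x}' \beta_j and standard deviation \sigma_u.
% With E(z_i) = 0 and E(z_i^2) = \sigma_z^2, define c = z^* / \sigma_z.
% \phi and \Phi are the standard normal density and distribution function.
\begin{align*}
E[u_i \mid z_i > z^*]   &= \mu_u + \rho_{xz}\, \sigma_u\, \frac{\phi(c)}{1 - \Phi(c)}, \\
E[u_i \mid z_i \le z^*] &= \mu_u - \rho_{xz}\, \sigma_u\, \frac{\phi(c)}{\Phi(c)}.
\end{align*}
```

Both ratios are strictly positive, so the sign of each deviation from $\mu_u$ is governed by the sign of $\rho_{xz}$ alone, and both deviations vanish when $\rho_{xz} = 0$.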

Because program countries are those for which zi>z*, program countries as a group will exhibit a relatively large zi. Likewise, the representative z for the nonprogram group will be relatively small. It follows that if ρxz > 0 the difference between the program and nonprogram groups with respect to the expected change in target variable yj consists of both the effect of the program on the change in yj and the difference between the relatively large Δx′i βj+Δєij expected for the program group and the relatively small Δx′i βj+Δєij expected for the nonprogram group. Because this second component of the expected difference must be positive, the expected difference will exceed the true effect of the program on the change in target yj. A similar analysis establishes that the expected difference will be less than the true program effect when ρxz<0. When ρxz = 0, it remains the case that zi is relatively large for program countries and relatively small for non-program ones, but this gives no reason to expect that Δx′i βj+Δєij will be systematically different between the two groups; therefore, the only difference we are justified in expecting is that attributable to the effects of the program. These considerations imply that

$$
E(\hat{\beta}_j^{IMF,B})
\begin{cases}
> \beta_j^{IMF} & \text{if } \rho_{xz} > 0 \\
= \beta_j^{IMF} & \text{if } \rho_{xz} = 0 \\
< \beta_j^{IMF} & \text{if } \rho_{xz} < 0.
\end{cases}
\tag{15}
$$

The relations in expression (15) can be derived formally by substituting expressions (14a) and (14b) in equation (12).

Thus, ρxz is the crucial parameter in determining the direction of the bias in the control-group methodology. Specifically, if the determinants of program selection (zi) are positively correlated with the determinants of macroeconomic performance that would have occurred in the absence of a program (Δx′i βj + Δεij), then the control-group estimate of program effects (β^jIMF,B) will overstate true program effects (βjIMF); if the correlation is negative, the estimate will understate them. Only if the determinants of program selection are uncorrelated with the determinants of macroeconomic performance (ρxz = 0) is the control-group estimator an unbiased indicator of true program effects.
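The role of ρxz can be illustrated with a small simulation. The sketch below is not part of the original analysis. It assumes that zi and the nonprogram determinants of Δyij (collapsed here into a single scalar `u`) are jointly standard normal with correlation `rho`, that the true program effect `beta_imf` equals 1, and that the selection threshold `z_star` is 0. All names and parameter values are hypothetical.

```python
import random
from statistics import fmean

def control_group_estimate(rho, beta_imf=1.0, n=20000, z_star=0.0, seed=0):
    """Difference in mean changes between program and nonprogram groups."""
    rng = random.Random(seed)
    prog, nonprog = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                      # selection index
        # u stands in for the nonprogram determinants of the change in y;
        # by construction corr(u, z) = rho (an assumption of this sketch).
        u = rho * z + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        d = 1 if z > z_star else 0                   # program status
        dy = u + beta_imf * d                        # observed change in y
        (prog if d else nonprog).append(dy)
    return fmean(prog) - fmean(nonprog)

positive = control_group_estimate(rho=0.5)
zero = control_group_estimate(rho=0.0)
negative = control_group_estimate(rho=-0.5)
```

Under these assumptions, the estimate obtained with a positive correlation lies well above the true effect of 1, the estimate obtained with a negative correlation lies well below it, and the estimate obtained with zero correlation is close to 1, in line with expression (15).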

The significance of the preceding analysis is that it permits us to move from the vague statement that “if the program and non-program groups are different, then the control-group approach will be biased”—a statement that is not correct—to the precise identification of ρxz as the critical parameter determining both the presence and the direction of bias.12 In assessing the adequacy of the control-group methodology, the relevant question then is whether there are any reasons inherent in the nature of the problem that would lead us to believe that this correlation (ρxz) will be nonzero.

The model embodies precisely such a nonzero correlation because both the determinants of program status in equation (4) and of normal policy changes in equation (2) are linear functions of macroeconomic outcomes before the program period. To show this formally, first rewrite the model by taking the transpose of equation (2), substituting for Δx′i in equation (1), and making some small changes in notation:13

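$$
\Delta y_{ij} = \beta_{0ij} - (y_i)'_{-1}\gamma'\beta_j + \beta_j^{IMF}\,\Delta d_i + \Delta\tilde{\epsilon}_{ij} \tag{16}
$$

$$
z_i = \delta_{0i} - (y_i)'_{-1}\delta + \pi_i \tag{17}
$$

$$
d_i =
\begin{cases}
1 & \text{if } z_i > z^* \\
0 & \text{if } z_i \le z^*,
\end{cases}
\tag{4}
$$

where

$$
\beta_{0ij} = y_i^{d\prime}\gamma'\beta_j, \qquad
\Delta\tilde{\epsilon}_{ij} = \Delta\epsilon_{ij} + \eta_i'\beta_j, \qquad
\delta_{0i} = y_i^{d\prime}\delta.
$$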

To determine whether the control-group estimator of program effects will be biased, we again need to examine the correlations between −(yi)′-1γ′βj and zi and between Δε̃ij and zi. These will be determined by the signs of

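$$
\operatorname{cov}\left[-(y_i)'_{-1}\gamma'\beta_j,\; z_i\right] = \beta_j'\gamma\,\Sigma_{y_{-1}}\delta \tag{18a}
$$

$$
\operatorname{cov}(\Delta\tilde{\epsilon}_{ij},\; z_i) = \sigma_\epsilon^2\,\delta_j, \tag{18b}
$$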

where Σy-1 is the covariance matrix of (yi)-1, σε2 is the variance of εij, and δj is the jth component of δ. These results assume that εij, ηi, and πi are mutually uncorrelated.

The crucial thing to notice about the covariances portrayed in equations (18a) and (18b) is that they can in general be expected to be nonzero—a finding that implies that the control-group estimator of program effects will be biased. The more interesting issue, however, is why this estimator turns out to be biased. Our analysis suggests that the determinants of program status will be correlated with the nonprogram determinants of Δyij for two reasons.

First, preprogram values of key macroeconomic target variables, (yi)-1, are likely to trigger policy responses, Δxi, even in the absence of programs, as originally suggested in our policy reaction function (equation (2)). In terms of equation (18a), this assumption shows up as a nonzero value for the covariance β′jγΣy-1δ.

Second, negative transitory shocks in the preprogram period are by their very nature unlikely to recur during the program period (recall that εij has an expected value of zero and is assumed to be serially uncorrelated), with the result that changes in macroeconomic target variables between the preprogram and program periods, Δyi, will display regression to the mean with respect to past shocks; in terms of equation (18b), this result shows up as a nonzero value for σε2δj.

We next inquire about the direction of this bias. For that source of bias that arises from regression to the mean, we can provide an unambiguous answer under reasonable assumptions. This is not possible for the bias arising from the existence of policy reaction functions; in this case we can, however, spell out the conditions necessary for that source of bias to disappear.

Consider the bias arising from regression to the mean. Because Fund programs are designed to move target variables in the desired direction, we expect the product of βjIMF and δj to be greater than zero (that is, βjIMF δj > 0). The logic is that, if a below-target value of yij (for example, the current account surplus) causes a country to come to the Fund for assistance (δj > 0), the Fund program will seek to increase actual yij (that is, βjIMF > 0); hence, βjIMF δj > 0. Likewise, for those target variables (for example, the rate of inflation) for which the likelihood of program participation is increased when (yi)-1 > yid, we have δj < 0 and can expect βjIMF < 0; here, too, the product βjIMF δj will be greater than zero. The relevance of βjIMF δj > 0 is that, because we know from equation (18b) that the correlation between Δε̃ij and zi carries the same sign as δj, we can conclude that regression to the mean contributes to a correlation between the determinants of program status (zi) and the nonprogram determinants of Δyij (that is, ρxz), a correlation that has the same sign as βjIMF. From our earlier analysis, especially equation (15), we then know that in these circumstances the control-group approach will overstate the true effect of a Fund program. In short, if program countries are more likely to have experienced negative temporary shocks in the preprogram period, a comparison of changes in mean macroeconomic outcomes between program and nonprogram countries will, under plausible assumptions, overstate the beneficial effect of a program. A negative shock in the preprogram period simultaneously increases the probability of program participation and increases the probability of a positive change in yij in the program period. Thus, attributing all of this improvement in yij to a Fund program overstates the true independent effect of the program.
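The regression-to-the-mean argument can be checked numerically. The following sketch rests on simplifying assumptions that are ours, not the paper's: a single target variable with its desired value normalized to zero, no policy reaction (γ = 0), serially uncorrelated standard normal shocks, and a selection threshold of zero. The function name and all parameter values are hypothetical.

```python
import random
from statistics import fmean

def first_difference_estimate(delta_j, beta_imf, n=20000, seed=1):
    """Control-group estimate of the program effect from changes in y."""
    rng = random.Random(seed)
    prog, nonprog = [], []
    for _ in range(n):
        eps_prev = rng.gauss(0.0, 1.0)   # transitory shock, preprogram year
        eps_now = rng.gauss(0.0, 1.0)    # independent shock, program year
        pi = rng.gauss(0.0, 1.0)         # unobserved selection factor
        y_prev = eps_prev                # desired target normalized to 0
        z = -delta_j * y_prev + pi       # selection index, as in equation (17)
        d = 1 if z > 0.0 else 0
        y_now = beta_imf * d + eps_now   # no policy reaction (gamma = 0)
        (prog if d else nonprog).append(y_now - y_prev)
    return fmean(prog) - fmean(nonprog)

current_account_like = first_difference_estimate(delta_j=1.0, beta_imf=1.0)
inflation_like = first_difference_estimate(delta_j=-1.0, beta_imf=-1.0)
random_selection = first_difference_estimate(delta_j=0.0, beta_imf=1.0)
```

In both of the first two cases the product of the program effect and δj is positive, and the simulated estimate exceeds the true effect in absolute value; with δj = 0 (random selection) the estimate is close to the true effect.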

The direction of bias arising from the existence of policy reaction functions depends on the characteristics of such functions, which is of course an empirical question. Nevertheless, we can show that the bias will disappear under two conditions.

The first condition is that δ = 0; that is, Fund program status can no longer be related to observable country characteristics, and all countries therefore have an equal probability of becoming program countries. In this case, the covariance represented by equation (18a) is zero as long as πi is uncorrelated with εij and ηi. In this case of random selection, both sources of bias disappear because the original premise of the control-group approach (that program and nonprogram countries are similar) is satisfied.

The second condition for the policy-reaction bias to disappear is γ = 0. Again, this would make the covariance represented in equation (18a) equal to zero under our assumption. In other words, if γ = 0, the policy reactions of the authorities cannot be systematically related to observable characteristics; that is, we would not observe the systematic policy reaction functions represented by equation (2). Note, however, that even when γ = 0, the bias in the control-group estimator attributable to regression to the mean would still remain. This is so because γ = 0 eliminates the improvement in yij that is attributable to nonprogram policy actions but not that improvement attributable to automatic stabilization from reversible country-specific shocks.

To sum up, we have argued in this section that there are strong ex ante reasons for believing that the past procedures used to estimate the effects of Fund programs in multicountry samples are subject to significant sources of statistical bias. Because the non-program determinants of macroeconomic outcomes cannot in general be expected to behave in such a way as to leave these outcomes unchanged from year to year, the potential problems with the before-after approach can be readily acknowledged. The problems with the control-group approach are also important, but they are perhaps more subtle. As shown above, comparing mean macroeconomic outcomes between groups of program and non-program countries will lead to biased estimates of program effects whenever the determinants of program selection are correlated with the determinants of macroeconomic outcomes that would have occurred in the absence of a program.

II. Obtaining Unbiased Control-Group Estimates Under Nonrandom Selection

In this section, we first describe a modified control-group estimator and show why it is capable of producing unbiased estimates of program effects even when program and nonprogram countries are different. Second, we discuss some of the operational problems that would have to be faced in actually using this estimator. Finally, we show how this estimator could be used to obtain information not only on total program effects but also on how these effects are achieved. It must be emphasized that we describe only the modifications required to control for observable differences between program and nonprogram countries. Sample-selectivity bias would remain because of unobservable differences between program and nonprogram countries. Although statistical procedures are available to handle this source of bias, we do not describe them here. Furthermore, the modifications we discuss also cannot manage other potential biases (for example, aggregation effects or interdependence between program and nonprogram countries) that may be intrinsic to multicountry data and that may distort the true effects of programs.

A Modified Control-Group Estimator

Consider the following modified estimator, β^jIMF,M, for Fund program effects:

$$
\hat{\beta}_j^{IMF,M} = (\bar{y}_j)_P - (\bar{y}_j)_N - (\bar{x}_P - \bar{x}_N)'\beta_j. \tag{19}
$$

Reference to equation (11) reveals that this modified estimator differs from the traditional control-group estimator in two respects: the modified estimator contains the additional term (x̄P − x̄N)′βj, and it is specified in level rather than in first-difference form.14

To investigate the properties of this estimator, write the basic equation for the jth macroeconomic target variable in country i (equation (1)) in level form:

$$
y_{ij} = x_i'\beta_j + W'\alpha_j + \beta_j^{IMF} d_i + \epsilon_{ij}. \tag{20}
$$

Taking expectations of equation (19), after substituting from equation (20), we then obtain

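$$
\begin{aligned}
E(\hat{\beta}_j^{IMF,M}) &= \beta_j^{IMF} + E\left[(\bar{\epsilon}_j)_P - (\bar{\epsilon}_j)_N\right] \\
&= \beta_j^{IMF} + E(\epsilon_{ij} \mid z_i > 0) - E(\epsilon_{ij} \mid z_i \le 0) \\
&= \beta_j^{IMF}.
\end{aligned}
\tag{21}
$$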

Thus, the modified control-group estimator will be unbiased so long as the unobservable country-specific determinants of yij (that is, εij) are uncorrelated with the determinants of program status (zi). In such a case, one can set E(εij | zi > 0) = E(εij | zi ≤ 0) = 0 and thereby justify the last equality above.

The reason that the modified estimator is unbiased can be explained intuitively by using the conclusions from Section I. Recall that we established there that the traditional control-group estimator would be biased if the nonprogram determinants of Δyij (that is, changes in domestic macroeconomic policy and changes in unobservable shocks) differed systematically between program and nonprogram countries (that is, if Δxi and Δεij were correlated with Fund program status, zi). The modified control-group estimator removes both sources of bias present in the traditional version. By subtracting the term (x̄P − x̄N)′βj, an adjustment is made for any differences in indigenous macroeconomic policy between program and nonprogram countries. As regards the second potential source of bias (regression to the mean), note that systematic differences between program and nonprogram countries with respect to changes in unobservable shocks are to be expected only because the program-selection rule makes it more likely that countries with negative shocks in the preprogram period will subsequently adopt programs. But also note that the expected level of such shocks, that is, E(εij), is zero for all countries. Thus, under our assumptions about the distribution of εij, this source of bias is present in estimators expressed in first-difference form that fail to control for prior shocks but not in those (such as the modified estimator) expressed in level form.
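A numerical sketch may help fix ideas about equation (19). It assumes, purely for illustration, a scalar policy instrument whose counterfactual value is known for every country and is positively correlated with the selection index, a known coefficient `beta_j`, and level shocks that are independent of program status. None of these names or values comes from the paper, and in practice the counterfactual policy of program countries is not observed.

```python
import random
from statistics import fmean

def level_estimates(beta_j=2.0, beta_imf=1.0, n=20000, seed=2):
    """Return (unadjusted level difference, modified estimator of eq. (19))."""
    rng = random.Random(seed)
    y = {1: [], 0: []}
    x = {1: [], 0: []}
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)               # selection index
        x_i = 0.8 * z + rng.gauss(0.0, 1.0)   # counterfactual policy, correlated with z
        eps = rng.gauss(0.0, 1.0)             # level shock, independent of z
        d = 1 if z > 0.0 else 0
        y[d].append(x_i * beta_j + beta_imf * d + eps)
        x[d].append(x_i)
    unadjusted = fmean(y[1]) - fmean(y[0])
    # Subtract the policy-difference term of equation (19).
    modified = unadjusted - (fmean(x[1]) - fmean(x[0])) * beta_j
    return unadjusted, modified

unadjusted, modified = level_estimates()
```

Under these assumptions the unadjusted difference in mean levels is far above the true effect of 1, whereas the estimate that subtracts the policy-difference term is close to it.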

Operational Aspects of the Modified Control-Group Estimator

The traditional control-group estimator has an obvious attraction: estimated program effects require only the calculation of mean changes in macroeconomic outcomes for program and for nonprogram countries; that is, only of (Δȳj)P and (Δȳj)N. The estimation requirements for the modified control-group estimator, however, are substantially more demanding. Not only do we need values for three additional variables or parameters (x̄N, x̄P, and βj), but we also face the problem that two of these (x̄P and βj) are not observed directly. Recall that x̄P is not observed because xi refers to policies that would have been undertaken in the absence of programs; thus, xi is equal to observed policies in nonprogram countries but not in program countries. Hence, implementation of the modified control-group approach requires estimating xi for program countries (as well as estimating the parameter βj linking Δxi to Δyij).

The policy vector xi is generated by the reaction function (2). In practice, an important limitation of the modified control-group estimator is that such reaction functions may be highly unstable, both across countries and in a given country over time. On the one hand, in extreme cases of instability the problem of estimating the counterfactual becomes insoluble. On the other hand, if cross-country instability is dominant, the solution is to abandon multicountry samples in favor of country studies. In any case, the issue is an empirical one. In what follows, we describe the calculation of a modified control-group estimator that is conditional on the existence of stable policy reaction functions.

The first step in estimating xi for program countries is to fit the reaction function (equation (2)) to observable data for nonprogram countries. The only unobserved variable in equation (2) is the country-specific vector of desired macroeconomic outcomes, yid. If this variable can be assumed to be constant over time, it can be captured by a set of country-specific constants, giving the policy-reaction equation the following final form:

$$
\Delta x_i = \gamma_{0i} - \gamma\,(y_i)_{-1} + \eta_i. \tag{22}
$$

The fitted values of this equation for program countries constitute the counterfactual Δxi. In effect, this procedure uses data on observed policy behavior in nonprogram countries to identify normal policy reaction in given policy-target circumstances.15 This normal policy reaction is then used to estimate what “would have been” in program countries if there had not been a Fund program.
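The two steps just described (fit the reaction function on nonprogram observations, then evaluate it at program-country preprogram outcomes) can be sketched as follows. This is a deliberately stripped-down illustration: it assumes one policy instrument, one target variable, and a single common intercept in place of the country-specific constants of equation (22). The function names and the data are hypothetical.

```python
from statistics import fmean

def fit_reaction_function(y_lag, dx):
    """OLS fit of dx = gamma0 - gamma * y_lag; returns (gamma0, gamma)."""
    y_bar, dx_bar = fmean(y_lag), fmean(dx)
    sxy = sum((y - y_bar) * (v - dx_bar) for y, v in zip(y_lag, dx))
    sxx = sum((y - y_bar) ** 2 for y in y_lag)
    gamma = -sxy / sxx                  # the regression slope equals -gamma
    gamma0 = dx_bar + gamma * y_bar
    return gamma0, gamma

def counterfactual_dx(gamma0, gamma, y_lag_program):
    """Policy change a program country would have made without a program."""
    return [gamma0 - gamma * y for y in y_lag_program]

# Hypothetical nonprogram observations generated without noise from
# dx = 0.5 - 0.4 * y_lag, so the fit should recover these coefficients.
y_lag_nonprogram = [-2.0, -1.0, 0.0, 1.0, 3.0]
dx_nonprogram = [0.5 - 0.4 * y for y in y_lag_nonprogram]

gamma0, gamma = fit_reaction_function(y_lag_nonprogram, dx_nonprogram)
dx_hat = counterfactual_dx(gamma0, gamma, [-4.0])
```

Note that the program-country value used here lies outside the range of the nonprogram data, which is the typical situation when program countries start from weaker preprogram outcomes; the counterfactual then rests on the assumed linearity and stability of the reaction function.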

An important caveat is in order about another potential source of systematic differences between program and nonprogram countries. Because both the setting of policy instruments in equation (22) and the acceptance by a country of a Fund program as specified in equation (3) reflect policy decisions of the authorities, any unobservable factors, πi, that make a given country more likely to go to the Fund for assistance (such as a general commitment to adjustment) may also make that country more likely to have adopted a different policy package in the absence of a program, Δxi, than another country facing similar observable (policy-target) circumstances. In this case, the behavior of nonprogram countries would not be a good guide to the counterfactual in program countries, even after observable preprogram characteristics of the two groups are controlled for. Formally, this possibility would manifest itself in the model as correlation between the error terms πi in equation (3) and ηi in equation (2). If such a correlation is present, then equation (22) will provide a biased estimate of Δxi for program countries, in essence because it fails to remove this additional source of sample-selectivity bias.16

But this additional source of bias can be eliminated, even though both ηi and πi are unobservable. The reader is referred to Heckman (1979) for a description of the appropriate procedure. For our purposes, we note that the procedure requires the specification and estimation of a model of program participation (that is, of equation (3)). Thus, removal of the two sources of sample-selectivity bias we have identified requires the specification and estimation of models of endogenous policy formation (equation (2)) and program participation (equation (3)).

With (ȳj)P, (ȳj)N, and x̄N observed directly, and with x̄P estimated as outlined above, the remaining element necessary for application of the modified control-group estimator is the parameter vector βj, which links normal policy changes in the absence of programs to changes in the macroeconomic target variables. Until now, this vector has been assumed to be known. For our purposes, any unbiased estimator of βj will suffice. Perhaps the simplest way to produce such an estimate is to fit the macroeconomic outcome equation (20) in level form to a pooled cross-sectional time-series data sample by using observed values for the policy vector xi.17

If the objective is solely to obtain an unbiased estimate of total program effects, we can substitute the policy-reaction equation (2) for Δxi into the level-form equation (20) and derive

$$
y_{ij} = \beta_{0i} - (y_i)'_{-1}\gamma'\beta_j + (x_i)'_{-1}\beta_j + W'\alpha_j + \beta_j^{IMF} d_i + (\epsilon_{ij} + \eta_i'\beta_j). \tag{23}
$$

Fitting equation (23) to observable data will then yield an estimate of total program effects through the estimated coefficient βjIMF on the dummy variable di. This procedure does not, however, take into account any sample-selectivity bias arising from systematic differences in reaction functions between program and non-program countries. If the error terms in equations (2) and (3) are correlated, then the reduced-form approach has to be augmented by the Heckman (1979) correction in order to obtain unbiased estimates of program effects. This shortcut works because it essentially controls for observable differences between program and nonprogram countries. But it cannot yield information on how total program effects are apportioned between changes in policy instruments and other factors.

Analyzing How Programs Work

To analyze the three different channels by which programs can affect macroeconomic outcomes, it is helpful to introduce some additional notation. Let xi,IMF be the vector of policy instruments adopted under a program, βj,IMF the vector of coefficients linking these policy instruments to the target variable yij, and CONi,IMF any unmeasurable confidence effects on yij attributable to a program. As before, xi and βj will be the values of policy instruments and their coefficients in the absence of a program. We can then express the total effect of a Fund program, βjIMF, as:

$$
\beta_j^{IMF} = (x_{i,IMF}'\,\beta_{j,IMF} - x_i'\beta_j) + CON_{i,IMF}. \tag{24}
$$

Rewriting the level-form equation (20) for yij with the substitution for βjIMF yields

$$
y_{ij} = x_i'\beta_j + W'\alpha_j + \left[CON_{i,IMF} + (x_{i,IMF}'\,\beta_{j,IMF} - x_i'\beta_j)\right] d_i + \epsilon_{ij}. \tag{25}
$$

It is clear that if separate estimates of βj,IMF and CONi,IMF could be obtained, it would be possible to identify the separate channels through which a program affects yij. Given the estimate of xi for program countries and the estimate of βj as outlined above, we next need to estimate the following equation:

$$
y_{ij} = x_i'(1 - d_i)\beta_j + x_{i,IMF}'\,d_i\,\beta_{j,IMF} + CON_{i,IMF}\,d_i + \epsilon_{ij}. \tag{26}
$$

The estimated coefficients on xi,IMF di and on di will then be the estimates of βj,IMF and of CONi,IMF that we seek. Of course, if estimation of equation (26) produces the result that βj is not significantly different from βj,IMF, then we can put aside shifts in behavioral parameters as a source of program effects and deal exclusively with (xi,IMF − xi)′βj and CONi,IMF.

To summarize, in this section we have shown that the presence of systematic differences between program and nonprogram countries need not render useless the control-group approach to the estimation of program effects. One way of handling the problem is to account for any differences in indigenous macroeconomic policy between program and nonprogram countries and to use the level of macroeconomic performance in the program period rather than its change. This “modified” estimator, however, is significantly more difficult to calculate than the traditional control-group estimator. Yet one important feature of the more structural version of this estimator is that it can be used to provide information not only on total program effects but also on how these effects are apportioned among induced changes in policy instruments, shifts in behavioral parameters, and general confidence effects.18

III. Some Empirical Exercises

Demonstrating that several alternative estimation methods can in theory yield different results about the effects of Fund programs is one thing. Illustrating the empirical relevance of that point with actual data on Fund programs is quite another. In this section we provide an exploratory empirical investigation of the aforementioned methodological pitfalls by comparing estimates of Fund program effects across the three estimators discussed earlier: a before-after comparison of mean outcomes for program countries alone (estimator A); a before-after comparison for program countries relative to that for nonprogram countries (the traditional control-group estimator B); and a reduced-form regression estimate of program effects that controls only for observed preprogram differences between program and nonprogram countries (a version of the modified control-group estimator M). The data samples are drawn from the population of Fund stabilization programs over the 1974–81 period.

As suggested earlier, although we think that these empirical results are instructive for testing the sensitivity of estimated program effects to alternative estimation methods, we do not think that much confidence ought to be placed in any of the estimates of program effects themselves. We say this because the particular equations tested, even for the modified control-group estimator, accommodate only one of the possible sources of bias outlined in Sections I and II (we have not investigated the empirical relevance of correlations between the unobservable components of policy-reaction functions and the factors affecting program participation); because we do not construct a carefully specified, structural economic model for the macroeconomic outcome variables or for indigenous policy reaction; because we have experimented with only one short time span for program effects (that is, the change from the preprogram year to the program year);19 and because the goodness-of-fit characteristics of the estimates themselves do not merit such confidence. Having made these qualifications, we should also point out that most of the same deficiencies also plague the earlier empirical literature on program effects using multicountry data.20

Data Base

Our estimates were made using a sample that contains observations from 58 developing countries during the 1974–81 sample period. It consists of 397 country-year observations, 68 of which are program-year observations. The 58 countries in the sample are those for which data were available for all relevant macroeconomic variables for at least two consecutive years during 1974–81 (not necessarily for the entire period). Consecutive-year Fund programs, including those classified as extended Fund facility programs, are contained in the sample. (A list of the program countries represented is given in Table 6 in Appendix II.)21

Definition of Variables

As in most earlier studies, we have selected some popular indicators of external and internal balance as the appropriate outcome or target variables for Fund stabilization programs. Specifically, the four outcome variables that serve as the empirical counterparts to the y variables of the theoretical sections are: the ratio of the overall balance of payments to nominal GNP, BOP/GNP; the ratio of the current account of the balance of payments to nominal GNP, CA/GNP; the rate of inflation as measured by the consumer price index, ΔCPI/CPIt-1; and the rate of growth of real gross domestic product, ΔRGDP/RGDPt-1. These four summary indicators are of course not the only relevant yardsticks of the success of a Fund program, but it would be difficult to argue that they are not important ones.22 For the purposes of this study, they also carry the advantage of facilitating comparison with earlier empirical work on program effects.23

Recall from Section II that calculation of the modified control-group estimator requires data on the vector of policy instruments for both program and nonprogram countries. For this purpose, we collected data on total domestic credit (D) and on the real effective exchange rate (REX) for each of the sample countries.24 These measures serve as the empirical counterpart to the x variables of the theoretical sections. Again, it is not difficult to think of other policy instruments that would be pertinent to Fund stabilization programs, but few would deny the key roles accorded these two instruments in most programs.

Finally, to create the dummy variable di that captures the presence (di = 1) or absence (di=0) of a Fund program, we assigned a program to a given year if it was approved (by the Fund’s Executive Board) during the first six months of that year. Otherwise, the program was assigned to the following year. Also, the phrase “program countries” is used in what follows to refer to those (country-year) observations during which Fund programs were in effect. The data source for each of the variables is identified in Appendix II.
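The dating rule for the program dummy can be written down compactly. The sketch below simply encodes the rule stated above (approval in the first six months assigns the program to that year, otherwise to the following year); the function name and the example dates are hypothetical.

```python
from datetime import date

def program_year(approval_date: date) -> int:
    """Year to which a program is assigned under the six-month rule."""
    if approval_date.month <= 6:
        return approval_date.year
    return approval_date.year + 1

first_half = program_year(date(1979, 3, 15))    # approved in first six months
second_half = program_year(date(1979, 9, 1))    # approved in last six months
```

One consequence of this convention is that a program approved late in a calendar year is compared against a preprogram year in which it was already partly in effect.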

How the Estimators Were Calculated

All that remains before examining the results themselves is to review briefly how the three alternative estimators of Fund program effects in the tables that follow were actually calculated.

For the simple before-after estimator (β^jIMF, A), we computed the mean change across the group of program countries for each of the four macroeconomic outcome variables. In terms of earlier symbols, the before-after estimator becomes

$$
\hat{\beta}_j^{IMF,A} = (\Delta\bar{y}_j)_P. \tag{27}
$$

For computational convenience, the traditional control-group estimator was calculated by running the following regression equation on the combined sample of program and nonprogram countries:

$$
\Delta y_i = \alpha_1 + \alpha_2 d_i, \tag{28}
$$

where, as before, di is the dummy variable for Fund program status and where α1 and α2 are estimated coefficients. The estimate of α2 will then be the traditional control-group estimator, β^jIMF,B.
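That the coefficient on the program dummy in equation (28) reproduces the difference in group means can be verified on a toy example. The sketch below uses the closed-form ordinary least squares solution for a regression on a constant and one regressor; the five observations are invented for illustration.

```python
from statistics import fmean

def dummy_regression(dy, d):
    """OLS of dy on a constant and a 0/1 dummy; returns (alpha1, alpha2)."""
    d_bar, dy_bar = fmean(d), fmean(dy)
    sxy = sum((di - d_bar) * (yi - dy_bar) for di, yi in zip(d, dy))
    sxx = sum((di - d_bar) ** 2 for di in d)
    alpha2 = sxy / sxx
    alpha1 = dy_bar - alpha2 * d_bar
    return alpha1, alpha2

dy = [1.0, 3.0, 2.0, 6.0, 4.0]   # hypothetical changes in an outcome variable
d = [0, 0, 0, 1, 1]              # program dummy

alpha1, alpha2 = dummy_regression(dy, d)
mean_program = fmean([v for v, di in zip(dy, d) if di == 1])
mean_nonprogram = fmean([v for v, di in zip(dy, d) if di == 0])
```

Here `alpha1` equals the nonprogram mean change and `alpha2` equals the program mean change minus the nonprogram mean change, so the regression is only a convenient way of obtaining the same number together with its standard error.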

Last, we have the modified control-group estimator, β^jIMF,M. As suggested in Section II, there are several ways to calculate it. Because our primary purpose here is to determine how sensitive estimated program effects are to alternative assumptions, it seemed acceptable to concern ourselves only with total program effects. We therefore chose to use the reduced-form version of the modified control-group estimator given in equation (23), since it is so much easier to calculate. Again, we did not correct for any possible correlation between the unobservable components of program participation and of the policy-reaction function.

By subtracting (yij)-1 from both sides of equation (23), this equation can be estimated in the form

$$
\Delta y_{ij} = \beta_{0i} - \sum_{h \ne j} (y_{ih})_{-1}\lambda_h - (1 + \lambda_j)(y_{ij})_{-1} + (x_i)'_{-1}\beta_j + W'\alpha_j + \beta_j^{IMF} d_i + (\epsilon_{ij} + \eta_i'\beta_j), \tag{23a}
$$

where λ = γ′βj is a column vector with jth element equal to λj. As a proxy for W, the variable which measures the international economic environment, we introduced a set of time dummy variables. Also, the β0i are coefficients of a set of country dummy variables designed to capture intercountry differences in desired target values for the yij. It is also possible to test formally whether the additional variables peculiar to the modified control-group estimator (that is, the lagged values of the vectors yi and xi) make as a set a significant contribution to the explanation of Δy. To do so, one performs an F-test on the null hypothesis that the coefficients of these variables are all zero. Observe also that, even if prior statistical tests document that program countries differ systematically from nonprogram countries with respect to these variables, these preprogram period characteristics must show as a group a statistically significant effect on Δy for there to be a bias in the traditional control-group estimator of program effects. If preprogram characteristics are not related to Δy, then the estimates of program effects using the traditional and modified control-group methodologies will yield the same results.25
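For reference, the F-test mentioned above compares the residual sum of squares of the regression with the lagged variables excluded (restricted) against that of the regression with them included (unrestricted). The sketch below implements the standard textbook formula; the function name, argument names, and the numbers in the example are hypothetical and are not taken from the estimates reported in the tables.

```python
def f_statistic(rss_restricted, rss_unrestricted, q, n, k):
    """F statistic for q zero restrictions.

    n is the number of observations and k the number of coefficients
    in the unrestricted regression (including all dummy variables).
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator

# Hypothetical example: four excluded variables, 110 observations,
# 10 coefficients in the unrestricted equation.
f_value = f_statistic(rss_restricted=120.0, rss_unrestricted=100.0, q=4, n=110, k=10)
```

The resulting value would be compared with the critical value of the F distribution with q and n − k degrees of freedom.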

Table 1.

“Before-After” Estimates of Program Effects, 1974–81

(In percent)

Here and in Tables 2–4, variables are as defined in Section III of the text (under “Definition of Variables”).

The growth effect was negative but negligible.

Results

The results of principal interest are set forth in Tables 1–5. Tables 1 and 2 provide estimates of program effects under the before-after estimator and the traditional control-group estimator, respectively. Table 3 presents the results of a test for differences between program and nonprogram countries in the level of macroeconomic outcomes before the program period. Table 4 gives the estimates of program effects using the modified control-group estimator. Finally, Table 5 presents a summary of the sensitivity of estimated program effects to the estimation methodology.

Table 1, although it is confined to changes in macroeconomic outcomes for program countries alone, already raises some doubts about the quality of estimates based on simple before-after calculations. There is a marked difference in the nature and pattern of estimated program effects from year to year. Note, for example, the difference in estimated program effects between, say, 1976 programs and 1980 programs. Although it is possible that true program effects really do change markedly from year to year, it seems more likely that this temporal instability arises because the nonprogram determinants of changes in macroeconomic outcomes (for example, oil shocks, foreign demand conditions, agricultural supply shocks, and the like) often change significantly from year to year. Because the before-after methodology does not acknowledge such nonprogram influences on Δy, it cannot control for them in estimating program effects.

Table 2, which conveys the traditional control-group estimates of program effects, illustrates two noteworthy features of the results. First, the size and even the direction of estimated program effects sometimes change quite noticeably from values obtained with the before-after estimator. Specifically, once the performance of nonprogram countries is used as a measuring rod, Fund programs become associated with an improvement in the current account and with slightly better growth. Second, Table 2 documents the importance of applying tests of statistical significance to observed differences in performance between program and nonprogram countries. Whereas the macroeconomic performance of program countries is always different from that of nonprogram countries in each of the four comparisons shown in Table 2, in none of them could it be legitimately concluded that the observed difference was statistically significant (that is, not the outcome of chance).

As emphasized in the preceding sections, we must suspect that the traditional control-group estimates of program effects will be biased if the selection of program countries is nonrandom and if these nonrandom characteristics are correlated with macroeconomic performance during the program period. Tables 3 and 4 address these two questions. In particular, Table 3 tests our earlier argument that Fund program status is likely to be related systematically to the country’s level of macroeconomic performance before the program period. The results are straightforward and can be summarized as follows. Program countries do seem to be different from nonprogram countries. In the year before the inception of a Fund program, program countries experienced (on average) larger balance of payments deficits in proportion to GNP, larger current account deficits in proportion to GNP, higher rates of inflation, and lower rates of real output growth than did nonprogram countries. Each of these differences is statistically significant at the 5 percent level or better. This significance is revealed not only by the t-test results shown in Table 3 but also by χ2 tests for differences in the whole set of mean comparisons. These differences in preprogram conditions between program and nonprogram countries appear in all the samples we examined. Indeed, the existence of these preprogram differences between program and nonprogram countries was the single most robust empirical finding of our tests.

Table 2. Traditional Control-Group Estimates of Program Effects (In percent)

Table 3. Differences Between Program and Nonprogram Countries: Means of Outcome Variables in Preprogram Year (In percent)

Two asterisks indicate statistical significance at the 1 percent level.

Table 4 takes the analysis one step further by testing whether these revealed preprogram differences in macroeconomic outcomes affect the change in macroeconomic performance between the preprogram year and the year of the program. Again the results of interest can be conveniently summarized. First, preprogram levels of macroeconomic outcomes do appear to affect the change in these outcomes. For all four equations in Table 4, the change in the outcome variable is related in a statistically significant way to two or more of the four outcome-level variables in the preprogram year. In each case, the outcome-change variable is related to its own lagged level with a negative coefficient. This finding can be taken as supporting the notion advanced earlier, that macroeconomic outcomes in both program and nonprogram countries may display a regression-to-the-mean property. For example, the greater is the size of the current account deficit in period t – 1, the greater is the improvement in the current account between period t – 1 and period t.
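The regression-to-the-mean property can be illustrated with a small simulation. The sketch below is not based on the sample used in this paper: it assumes, purely for illustration, that each country's outcome is the sum of a permanent country-specific component and a transitory shock with equal variances, and that no program is in operation. The helper `ols_slope` and all parameter values are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


rng = random.Random(0)  # fixed seed so the run is reproducible
n_countries = 5000

lagged_level = []
change = []
for _ in range(n_countries):
    permanent = rng.gauss(0.0, 1.0)            # country-specific mean outcome
    y_prev = permanent + rng.gauss(0.0, 1.0)   # outcome in period t - 1
    y_curr = permanent + rng.gauss(0.0, 1.0)   # outcome in period t
    lagged_level.append(y_prev)
    change.append(y_curr - y_prev)

slope = ols_slope(lagged_level, change)
print(f"slope of change on lagged level: {slope:.3f}")
```

Under these assumptions the population slope is -0.5: countries with unusually poor outcomes in period t – 1 tend to improve in period t even though nothing has been done to them, which is the pattern a negative own-lag coefficient would pick up.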

Table 4. Modified Control-Group Estimates of Program Effects

Note: Figures in parentheses are standard errors. Coefficients of time and country-specific dummy variables are not reported. A single asterisk indicates statistical significance at the 5 percent level; two asterisks indicate statistical significance at the 1 percent level.

Measured as fractions.

Measured in percent.

Second, and not surprisingly, estimated program effects under the modified control-group estimator are quite different from those obtained under the traditional control-group estimator. This can be seen most vividly in Table 5, where the three estimators are shown side by side. In addition, and consistent with our earlier expectations about the direction of bias, we find that estimated program effects, after allowing for preprogram differences between program and nonprogram countries, are almost always less favorable. For example, the improvement in the current account ratio disappears entirely, the deterioration in the balance of payments ratio is magnified, and the favorable outcomes for inflation and growth are reversed. Tests of statistical significance again indicate, however, that observed differences in macroeconomic performance between program and nonprogram countries are not significant.

Finally, although the explanatory power of the regression equations in Table 4 is rather low (in the range of 1 to 2 without country and time dummy variables and 2 to 3 with them), the explanatory power is significantly higher (in a statistical sense) when those variables peculiar to the modified control-group estimator are included in the equations. In this respect, an F-test reveals that, for each of the equations in Table 4, the modified control-group variables (that is, the lagged values of targets and instruments) are statistically significant as a group at the 1 percent level. In other words, the modified control-group equations in Table 4 hardly provide a “full” or even a “good” explanation for observed changes in macroeconomic outcomes, but it is hard to argue that preprogram characteristics can be ignored when Fund program effects are estimated from multicountry samples.
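The joint test referred to above compares the fit of an equation with and without a group of regressors. A minimal sketch of the standard F statistic for such a comparison is given below; the function name `f_statistic` is hypothetical, and the residual sums of squares, the number of observations, and the number of coefficients are made-up inputs chosen only to show the arithmetic, not values from Table 4.

```python
def f_statistic(rss_restricted, rss_unrestricted, n_restrictions, n_obs, n_params):
    """F statistic for the joint significance of a group of regressors.

    rss_restricted   : residual sum of squares without the group
    rss_unrestricted : residual sum of squares with the group included
    n_restrictions   : number of regressors in the group
    n_obs            : number of observations
    n_params         : number of coefficients in the unrestricted equation
    """
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - n_params)
    return numerator / denominator


# Made-up inputs, chosen only to show the arithmetic.
f_value = f_statistic(rss_restricted=120.0, rss_unrestricted=100.0,
                      n_restrictions=4, n_obs=205, n_params=5)
print(f"F = {f_value:.2f}")
```

The resulting value would be compared with the critical value of the F distribution with (4, 200) degrees of freedom in this example. A modest gain in fit can therefore be highly significant as a group even when the overall explanatory power of the equation stays low.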

Table 5. Implications of Alternative Statistical Methodologies for Estimates of Program Effects (In percent)

To summarize, we have shown in this section, if only in a preliminary way, that it does make a significant difference how one estimates the effect of Fund programs from cross-sectional data. None of the estimates reported in Table 5 can be construed as a true estimate of program effects. We have not attempted a rigorous implementation of the modified control-group technique and are not convinced that it would be worthwhile to do so. We have, however, demonstrated that point estimates of program effects are not robust when variables are included that measure the preprogram characteristics of countries, and that the direction of change in estimated program effects is as expected a priori. Thus, some of the theoretical sources of bias outlined in the earlier sections do appear to be of more than academic interest.
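The sensitivity of the estimate to the choice of estimator can be reproduced in a stylized simulation. The sketch below is not a re-estimation of Tables 2 through 5. The data-generating process, the parameter values (`TRUE_EFFECT`, `GLOBAL_SHOCK`, `THRESHOLD`), and the selection rule (a program is adopted whenever the lagged outcome falls below a threshold) are all assumptions made for illustration, and the modified estimator is represented by a regression of the change in the outcome on a program dummy and the lagged outcome level only.

```python
import random
from statistics import mean

rng = random.Random(42)   # fixed seed for a reproducible run

TRUE_EFFECT = 0.5     # assumed true program effect on the outcome
GLOBAL_SHOCK = -0.3   # assumed change in the world environment, common to all
THRESHOLD = -0.5      # assumed rule: program adopted if lagged outcome is below this


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def residuals(x, y):
    """Residuals of y after a least-squares regression on x and a constant."""
    mx, my, b = mean(x), mean(y), slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


lagged, change, dummy = [], [], []
for _ in range(20_000):
    permanent = rng.gauss(0.0, 1.0)
    y_prev = permanent + rng.gauss(0.0, 1.0)
    d = 1.0 if y_prev < THRESHOLD else 0.0          # nonrandom selection
    y_curr = permanent + rng.gauss(0.0, 1.0) + GLOBAL_SHOCK + TRUE_EFFECT * d
    lagged.append(y_prev)
    change.append(y_curr - y_prev)
    dummy.append(d)

program = [c for c, d in zip(change, dummy) if d == 1.0]
nonprogram = [c for c, d in zip(change, dummy) if d == 0.0]

before_after = mean(program)
traditional = mean(program) - mean(nonprogram)
# Modified estimator: coefficient on the program dummy after partialling
# out the lagged outcome level from both the change and the dummy.
modified = slope(residuals(lagged, dummy), residuals(lagged, change))

print(f"assumed true effect:        {TRUE_EFFECT:.2f}")
print(f"before-after estimate:      {before_after:.2f}")
print(f"traditional control group:  {traditional:.2f}")
print(f"modified control group:     {modified:.2f}")
```

In this setup the before-after and traditional control-group estimates both overstate the assumed effect, because countries selected for their poor preprogram outcomes would have improved anyway, while the modified estimate lies close to the assumed value. That result depends on selection being driven entirely by an observed preprogram variable, which is a much stronger condition than can be expected to hold in practice.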

IV. Conclusions

Given the pivotal role assigned to Fund stabilization programs in the past and present economic policy strategies of many developing countries, and given the continuing controversy over the effects of these programs, it is not surprising that there has been strong interest in empirical measures of program effectiveness. Because the large number of such programs makes the case-by-case approach a laborious and time-consuming way to arrive at an estimate of “average” program effectiveness, it is likewise understandable that the cross-country approach to program evaluation has dominated the empirical literature. We have argued in this paper, however, that if the estimated program effects from such a cross-country analysis are to be representative of “true” program effects, then certain methodological pitfalls need to be avoided. At the risk of ignoring some problems and of unduly simplifying others, the main lessons of the preceding analysis can be summarized as follows.

  • In comparing the performance of program countries with that of nonprogram countries, it is strongly advisable to subject any differences to tests of statistical significance. As brought out in our empirical investigation, it frequently turns out that observed differences in performance between the two groups during the program period would not be judged statistically significant at conventional levels of confidence.

  • A before-after comparison of mean macroeconomic outcomes for program countries is unlikely to yield a good estimate of true program effects because the nonprogram determinants of macroeconomic outcomes typically change between the preprogram period and the program period. As such, ascribing all of the observed change in outcomes to the program alone will invariably overstate or understate the true independent effect of the program.

  • If the mean change in outcomes for nonprogram countries is subtracted from the mean change for program countries, the bias in program estimates attributable to ignoring the nonprogram determinants of macroeconomic outcomes will be reduced. A new source of bias will be introduced, however, whenever program countries differ systematically from nonprogram countries in some characteristic that is related to subsequent macroeconomic performance. In the particular case in which the determinants of Fund program selection are positively correlated with the nonprogram determinants of changes in macroeconomic outcomes, this traditional control-group estimate of program effects will overstate true program effects. Furthermore, preliminary empirical tests suggest that in practice (at least for the 1974–81 period) program countries did have significantly less favorable macroeconomic performance than did nonprogram countries before the program period, and that such preprogram outcomes were significantly related to subsequent performance during the program period itself. Not surprisingly, therefore, estimates of program effects that held constant the preprogram levels of macroeconomic outcomes were quite different from those that did not. In any case, the “moral” is that if the program countries are not selected randomly, then these nonrandom selection criteria must be identified either so that a control group can be found with the same characteristics or so that these group differences can be accounted for in any comparison of outcomes between the two groups.

  • Because Fund programs probably work in good measure by changing the stance of policy instruments from what would pertain in the absence of such programs, any estimate of program effects that does not allow for this channel of influence runs the risk of capturing only part of total program effects (for example, only “confidence” effects) and thus of understating true program effectiveness (see, for example, Killick and Chapman (1982)). In this paper we have outlined an estimation procedure that in principle permits calculation of how total program effects are apportioned among induced changes in policy instruments, induced changes in behavioral parameters, and general confidence effects. Central to this procedure is the estimation of “policy reaction functions” for both program and nonprogram countries. Although we would not want to underestimate the practical difficulties associated with obtaining credible estimates of such reaction functions for developing countries (particularly when underlying economic and political conditions are changing markedly at frequent intervals), we see no other way of estimating the counterfactual for program countries. If cross-country stability of policy reaction functions is not observed, then the cross-sectional approach using multicountry samples must be abandoned. If temporal stability of such functions does not obtain for individual countries, the country studies may also fail to tell much about program effects per se. We would then be left to analyze the effects of Fund-type policies and to speculate in each case about the alternative policies that would have been pursued in the absence of a program.

  • On a broad level, the methodological problems we have described lead us to the view that considerable caution is needed in attempting to estimate and interpret the effects of Fund programs by using multicountry data.

APPENDIX I Bias in the Traditional Control-Group Approach

Equations (14a) and (14b) in the text are crucial for establishing the presence of bias in the traditional control-group approach under nonrandom selection of program countries. In this Appendix we derive these equations.

Denote the variance of $\Delta x_i'\beta_j + \Delta\varepsilon_{ij}$ as $\sigma_x^2$. Suppose that $\Delta x_i'\beta_j + \Delta\varepsilon_{ij}$ and $z_i$ have a joint normal distribution, with the correlation between $\Delta x_i'\beta_j + \Delta\varepsilon_{ij}$ and $z_i$ denoted as $\rho_{xz}$. Finally, let $\phi$ and $\Phi$ represent respectively the standard normal density and distribution functions. For the $i$th country, it will be true that

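$$E\left[z_i \mid \left(\Delta x_i'\beta_j + \Delta\varepsilon_{ij}\right)\right] = \rho_{xz}\,\frac{\sigma_z}{\sigma_x}\left(\Delta x_i'\beta_j + \Delta\varepsilon_{ij} - \Delta x'\beta_j\right). \tag{29}$$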

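The probability that country i will be a program country is then

$$\operatorname{prob}(i \in P) = \operatorname{prob}(z_i > z) = 1 - \Phi\left\{\left[z - \rho_{xz}\,\frac{\sigma_z}{\sigma_x}\left(\Delta x_i'\beta_j + \Delta\varepsilon_{ij} - \Delta x'\beta_j\right)\right] \Big/ \sigma_z\right\}. \tag{30}$$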

In the special case in which ρxz is zero, equation (30) reduces to

$$\operatorname{prob}(i \in P) = 1 - \Phi(z/\sigma_z). \tag{30a}$$

The key difference between equations (30) and (30a) is that, whereas the probability of being a program country is a function of $\Delta x_i'\beta_j + \Delta\varepsilon_{ij}$ in equation (30) and thus will differ across countries, in equation (30a) this probability is the same for all countries in the sample.

The next step toward discovering the direction of bias in the control-group estimate of program effects under conditions of nonrandom program selection is to write

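$$E\left[\left(\Delta x_i'\beta_j + \Delta\varepsilon_{ij}\right) \mid z_i > z\right] = \Delta x'\beta_j + \rho_{xz}\,\sigma_x\,\frac{\phi(z/\sigma_z)}{1 - \Phi(z/\sigma_z)}. \tag{31}$$

$$E\left[\left(\Delta x_i'\beta_j + \Delta\varepsilon_{ij}\right) \mid z_i \le z\right] = \Delta x'\beta_j - \rho_{xz}\,\sigma_x\,\frac{\phi(z/\sigma_z)}{\Phi(z/\sigma_z)}. \tag{32}$$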

Since $\sigma_x$ and $\phi$ are both positive, and since $\Phi$ is bounded between zero and unity, $\sigma_x\phi/(1-\Phi)$ and $\sigma_x\phi/\Phi$ are both positive. Equations (14a) and (14b) in the text then follow directly from equations (31) and (32), respectively.
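The conditional means in equations (31) and (32) can be checked numerically. The sketch below draws from a bivariate normal distribution and compares the simulated group means with the standard truncated-normal expressions, in which the selected group's mean is shifted up by $\rho_{xz}\sigma_x\phi/(1-\Phi)$ and the unselected group's mean is shifted down by $\rho_{xz}\sigma_x\phi/\Phi$. All parameter values are assumptions chosen for illustration, the selection index is taken to have mean zero, and the variable names are hypothetical.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)   # fixed seed for a reproducible run
std_normal = NormalDist()

# Assumed parameter values, chosen only for illustration.
mean_x = 0.5      # unconditional mean of the nonprogram term
sigma_x = 2.0     # its standard deviation
sigma_z = 1.5     # standard deviation of the selection index z_i (mean zero)
rho_xz = 0.6      # correlation between the two
threshold = 0.8   # a country is a program country when z_i > threshold

program_x, nonprogram_x = [], []
for _ in range(200_000):
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    z = sigma_z * u
    # Build x so that corr(x, z) = rho_xz.
    x = mean_x + sigma_x * (rho_xz * u + math.sqrt(1.0 - rho_xz ** 2) * v)
    (program_x if z > threshold else nonprogram_x).append(x)

c = threshold / sigma_z
pdf, cdf = std_normal.pdf(c), std_normal.cdf(c)
formula_program = mean_x + rho_xz * sigma_x * pdf / (1.0 - cdf)   # as in (31)
formula_nonprogram = mean_x - rho_xz * sigma_x * pdf / cdf        # as in (32)

sim_program = sum(program_x) / len(program_x)
sim_nonprogram = sum(nonprogram_x) / len(nonprogram_x)
print(f"program:    simulated {sim_program:.3f}, formula {formula_program:.3f}")
print(f"nonprogram: simulated {sim_nonprogram:.3f}, formula {formula_nonprogram:.3f}")
```

With a positive correlation, the program group's mean lies above the unconditional mean and the nonprogram group's mean lies below it, so the difference between the two group means is nonzero even when there is no program effect at all.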

APPENDIX II Description of Sample and Data Used

The program countries contained in the sample are listed in Table 6.

The sources of the data used in Section III are as follows: for net foreign assets, International Financial Statistics (IFS) (Washington: International Monetary Fund, various issues), line 31n; for nominal GNP, real GDP, and current accounts, Fund staff estimates (Current Studies Division data file); for consumer price indices, IFS, line 64; for domestic credit, IFS, line 32; for real effective exchange rates, Fund staff estimates (Developing Countries Studies Division data file). The precise definition of the real exchange rate (REX) is

$$REX = 100 \exp\left[\sum_{s=1}^{n} \ln\left(EXI_i / EXI_s\right) W_s - \sum_{s=1}^{n} \ln\left(CPI_s / CPI_i\right) W_s\right],$$

where $EXI$ is the nominal exchange rate index; $i$ is the reporting country; $s$ is the partner country; $W_s$ is the import weight for partner country $s$; and $CPI$ is the consumer price index.
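The definition can be turned into a short calculation. The sketch below assumes that the two weighted sums enter with a minus sign between them, so that the index rises when the reporting country's nominal index or its price level rises relative to partners, and that the import weights sum to one. The function name `real_exchange_rate` and all index values are hypothetical.

```python
import math


def real_exchange_rate(exi_home, cpi_home, partners):
    """Real effective exchange rate index following the REX definition.

    partners: list of (exi_partner, cpi_partner, import_weight) tuples;
    the import weights are assumed to sum to one.
    Assumes the nominal and price sums are combined with a minus sign.
    """
    nominal_term = sum(w * math.log(exi_home / exi_s) for exi_s, _, w in partners)
    price_term = sum(w * math.log(cpi_s / cpi_home) for _, cpi_s, w in partners)
    return 100.0 * math.exp(nominal_term - price_term)


# Hypothetical index values for two partner countries: (EXI_s, CPI_s, W_s).
partners = [
    (100.0, 100.0, 0.6),
    (100.0, 100.0, 0.4),
]
base = real_exchange_rate(100.0, 100.0, partners)
appreciated = real_exchange_rate(110.0, 105.0, partners)
print(f"base period index: {base:.1f}")
print(f"later period index: {appreciated:.1f}")
```

When the reporting country's indices equal those of its partners the index is 100. A nominal index 10 percent above partners combined with a price level 5 percent above partners gives 100 × 1.10 × 1.05 = 115.5.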

Table 6. Program Countries in Sample

REFERENCES

  • Ashenfelter, Orley, “Estimating the Effect of Training Programs on Earnings,” Review of Economics and Statistics (Cambridge, Massachusetts), Vol. 60 (February 1978), pp. 47–57.

  • Ashenfelter, Orley, and David Card, Using the Longitudinal Structure of Earnings to Estimate the Effect of Training Programs, Working Paper 174 (Princeton, New Jersey: Princeton University, Industrial Relations Section, July 1984).

  • Beveridge, W.A., “Fiscal Adjustment in Financial Programs Supported by Stand-By Arrangements in the Upper Credit Tranches, 1978–79” (unpublished; Washington: International Monetary Fund, July 1981).

  • Beveridge, W.A., and Margaret R. Kelly, “Fiscal Content of Financial Programs Supported by Stand-By Arrangements in the Upper Credit Tranches, 1969–78,” Staff Papers, International Monetary Fund (Washington), Vol. 27 (June 1980), pp. 205–49.

  • Cline, William R., and Sidney Weintraub, eds., Economic Stabilization in Developing Countries (Washington: The Brookings Institution, 1981).

  • Connors, Thomas A., “The Apparent Effects of Recent IMF Stabilization Programs,” International Finance Discussion Paper 135 (Washington: Board of Governors of the Federal Reserve System, International Finance Division, April 1979).

  • Donovan, Donal J., “Real Responses Associated with Exchange Rate Action in Selected Upper Credit Tranche Stabilization Programs,” Staff Papers, International Monetary Fund (Washington), Vol. 28 (December 1981), pp. 698–727.

  • Donovan, Donal J., “Macroeconomic Performance and Adjustment Under Fund-Supported Programs: The Experience of the Seventies,” Staff Papers, International Monetary Fund (Washington), Vol. 29 (June 1982), pp. 171–203.

  • Goldstein, Morris, The Global Effects of Fund-Supported Adjustment Programs, Occasional Paper 42 (Washington: International Monetary Fund, March 1986).

  • Goldstein, Morris, and Mohsin S. Khan, Effects of Slowdown in Industrial Countries on Growth in Non-Oil Developing Countries, Occasional Paper 12 (Washington: International Monetary Fund, August 1982).

  • Guitián, Manuel, Fund Conditionality: Evolution of Principles and Practices, Pamphlet Series No. 38 (Washington: International Monetary Fund, 1981).

  • Gylfason, Thorvaldur, Credit Policy and Economic Activity in Developing Countries: An Evaluation of Stabilization Programs Supported by the IMF, 1977–79, Seminar Paper 268 (Stockholm: University of Stockholm, Institute for International Economic Studies, December 1983).

  • Heckman, James J., “Sample Selection Bias as a Specification Error,” Econometrica (Evanston, Illinois), Vol. 47 (January 1979), pp. 153–61.

  • International Monetary Fund, World Economic Outlook: A Survey by the Staff of the International Monetary Fund (Washington, 1983).

  • Kelly, Margaret R., “Fiscal Adjustment and Fund-Supported Programs, 1971–80” (unpublished; Washington: International Monetary Fund, September 1982).

  • Khan, Mohsin S., and Malcolm Knight, “Determinants of Current Account Balances of Non-Oil Developing Countries in the 1970s: An Empirical Analysis,” Staff Papers, International Monetary Fund (Washington), Vol. 30 (December 1983), pp. 819–42.

  • Killick, Tony, The Quest for Economic Stabilization: The IMF and the Third World (New York: St. Martin’s, 1984).

  • Killick, Tony, and M. Chapman, “Much Ado About Nothing? Testing the Impact of IMF Stabilization Programmes in Developing Countries,” Overseas Development Institute Working Paper 7 (London, March 1982).

  • Kmenta, Jan, Elements of Econometrics (New York: Macmillan, 1971).

  • Loxley, John, The IMF and the Poorest Countries: The Performance of the Least Developed Countries Under IMF Stand-By Arrangements (Ottawa: The North-South Institute, 1984).

  • Lucas, Robert E., Jr., “Econometric Policy Evaluation: A Critique,” in The Phillips Curve and Labor Markets, ed. by Karl Brunner and Allan H. Meltzer, Carnegie-Rochester Conference Series on Public Policy, Vol. 1 (Amsterdam: North-Holland, 1976; New York: Elsevier, 1976), pp. 19–46.

  • Odling-Smee, John, “Adjustment with Financial Assistance from the Fund: The Experience of Seven Countries,” Finance & Development (Washington), Vol. 19 (December 1982), pp. 26–30.

  • Reichmann, Thomas M., “The Fund’s Conditional Assistance and the Problems of Adjustment, 1973–75,” Finance & Development (Washington), Vol. 15 (December 1978), pp. 38–41.

  • Reichmann, Thomas M., and Richard T. Stillson, “Experience with Programs of Balance of Payments Adjustment: Stand-By Arrangements in the Higher Credit Tranches, 1963–72,” Staff Papers, International Monetary Fund (Washington), Vol. 25 (June 1978), pp. 293–309.

  • Williamson, John, The Lending Policies of the International Monetary Fund (Washington: Institute for International Economics, 1982).

  • Williamson, John, ed., IMF Conditionality (Washington: Institute for International Economics, 1983).

  • Zulu, Justin B., and Saleh M. Nsouli, Adjustment Programs in Africa: The Recent Experience, Occasional Paper 34 (Washington: International Monetary Fund, 1985).

*

Mr. Goldstein, Advisor in the Research Department when this paper was written, is now Advisor in the External Relations Department. He is a graduate of Rutgers University and New York University. Mr. Montiel, an economist in the Developing Country Studies Division of the Research Department when this paper was written, is now in the Macroeconomics Division of the Development Research Department, the World Bank. He is a graduate of Yale University and the Massachusetts Institute of Technology.

2

Although not all of the previous studies of Fund program experience sought to identify the independent effects of Fund programs, this paper evaluates the before-after approach and the control-group approach as estimators of such program effects.

3

For a thorough discussion of alternative interpretations of “program effects” and of their relative strengths and weaknesses, see Goldstein (1986).

4

In this connection, one would also have to account for the effects of Fund involvement on the availability of additional external resources, either from the Fund itself or through the catalytic effect of Fund involvement on other lenders. For more on this point, see Section II.

5

The empirical examples in the paper are not reliable indicators of program effectiveness for many reasons. To mention just three, the paper deals with only one (sample-selection bias) of many potential sources of bias in cross-country estimates (for example, we ignore bias associated with interdependence between outcomes in program and nonprogram countries, as well as bias from aggregation across different types of programs); the paper considers only short-term (one-year) effects of programs; and the calculations cover only a few of the wide range of policy instruments actually specified in Fund programs.

6

Such an interpretation would be consistent with Guitián’s view (1981, p. 30) that the broad objective of a Fund-supported stabilization program is “… the restoration and maintenance of viability to the balance of payments in an environment of price stability and sustainable rates of economic growth.”

7

For example, an announced new target for the real exchange rate may be viewed as more likely to be adhered to if it is a component of a Fund program than otherwise. In that case, the response of the private sector to a given change in the real exchange rate may depend on whether that change occurs in the context of a Fund program.

8

Although there is no explicit formula for judging balance of payments needs, the three indicators given foremost attention are the actual balance of payments, the level of international reserves, and recent changes in the level of reserves.

9

Many of the methodological issues discussed in this paper have been analyzed earlier in the literature of labor economics concerning “treatment effects”; for example, see Ashenfelter (1978) and Ashenfelter and Card (1984).

10

To simplify the notation, expectations of group averages in equation (12) and elsewhere in the paper are expressed in terms of the “representative” member of each group; that is, we implicitly assume that all members of a group are identical.

11

We are indebted to Rüşdü Saracoglu for drawing this to our attention.

12

Note that, when equation (4) holds but $\rho_{xz} = 0$, the program and nonprogram groups can be quite different without implying the existence of bias in the control-group methodology.

13

We also have dropped the global variable $\Delta W'$ and its coefficient $\alpha_{ij}$ from equation (1) to simplify the exposition.

14

This is only one of several equivalent modified estimators that could be proposed. Their common feature is that outcomes are measured net of observable nonprogram influences that can be estimated on the basis of preprogram information.

15

In a pooled cross-sectional time-series sample, this would include observations in nonprogram periods for countries that are program countries in other periods.

16

The direction of the bias depends in part on the correlation between $\pi_i$ and $\eta_i$. Intuitively, if (for a given set of observable preprogram circumstances) countries that would have pursued “worse” policies are more likely to adopt Fund programs, then the behavior of nonprogram countries would provide an excessively optimistic counterfactual, and the beneficial effects of programs would be understated. Conversely, if programs are more likely to be adopted by countries that would have undertaken “better” policies anyway, then the beneficial effects of programs would be overstated because the favorable effects of the policies would be erroneously attributed to Fund involvement.

17

If Fund programs induce parameter shifts, then only data on nonprogram countries could be used for this purpose.

18

If economic and political conditions change markedly at frequent intervals, if governments with different policy-reaction functions appear frequently, or both, then it may not be feasible to identify empirically a “stable” policy-reaction function. But this is a matter for empirical testing.

19

The question of when a program country stops being a program country is a particularly difficult one to answer, yet it can have an important effect on program estimates based on multicountry data. Suppose, for example, that two countries face identical current account deficits. Country A, with a Fund program, undertakes a policy of devaluation with expenditure reduction while country B, without a program, adopts increased trade restrictions. Over a one-year period, the change in the current account could well be quite similar for the two countries. Over a longer period (after the program), one might expect country A to show better growth and external balance performance than country B, but this improvement would not be reflected in one-year comparisons. Indeed, country A would be classified as a nonprogram country after the program year.

20

In this respect, the attention devoted by Donovan (1982) to both long- and short-term effects of programs, and by Gylfason (1983) to the theoretical channels by which domestic credit can affect economic growth as well as the balance of payments, is particularly commendable.

21

We also ran some tests on several smaller samples. Because the results were qualitatively similar to those reported here, we did not include them in the text.

22

In addition to the four indicators mentioned above (measured somewhat differently), Donovan (1982) also examined changes in savings and investment ratios and changes in the growth rate of real consumption.

23

In some of the earlier studies, the external balance variables were scaled by nominal exports rather than by nominal GNP, but we doubt that this difference has any material effect on the qualitative nature of the results.

24

The real effective exchange rate is an import-weighted index, with the consumer price index (CPI) used as the relevant deflator; see Appendix II for a more precise definition.

25

Some readers will recognize this as an application of “specification bias due to an omitted variable”; see, for example, Kmenta (1971, p. 391). In brief, if equation (23a) is the “true model” but we estimate equation (28) instead, then the bias attaching to $\alpha_2$ in equation (28) will be a product of two factors: the correlations between the program dummy $d_i$ and the omitted variables (here, the lagged values of $y$ and $x$); and the coefficients on the omitted variables. If either of these factors is zero, then the $\alpha_2$ in equation (28) will equal the $\beta^{IMF}_j$ in equation (23a).
