Further Cross-Country Evidence on the Accuracy of the Private Sector’s Output Forecasts

Author(s):
Robert Flood
Published Date:
April 2002

“… the ability to produce accurate predictions of the course of the economy in the near-term is probably the main criterion by which the public judges the usefulness of our entire profession.”

Victor Zarnowitz (1986, p. 1)

This paper evaluates the performance of Consensus Forecasts of real GDP growth for a large number of industrialized and developing countries for the time period October 1989 to December 1999. The questions addressed are the following:

  • How accurate are private sector forecasts?

  • How does their accuracy compare with that of the IMF’s World Economic Outlook?

  • How well do forecasters predict rare events such as recessions or crises?

  • Is discord among forecasters associated with lower forecast accuracy?

The evidence on such questions is useful for a number of reasons. First, private sector capital flows have supplanted official funds as the dominant form of external financing for many countries. Hence, private-sector assessments of the relative macroeconomic outlook for various countries play a role in guiding the allocation of capital across the globe.

Second, many in the "official" sector are increasingly relying on these forecasts as a summary of the private sector’s assessment of the macroeconomic outlook. In addition to being extensively used in the multilateral institutions for this purpose,1 Consensus Forecasts are used by national government agencies, as revealed, for example, in the following quote from a speech by New Zealand’s central bank governor Donald Brash (1998):

“We do not ourselves make forecasts of the international economy, but instead use the monthly Consensus Forecasts.… We certainly have no reason to believe that we could produce better forecasts for our overseas markets than can the forecasters ‘on the ground’ in the countries concerned.”

Third, growth forecasts are an important component for several other forecasts, such as those of the trade balance and government fiscal balance.

Despite the increasing visibility of Consensus Forecasts, there has been very little independent analysis of their accuracy. To our knowledge, the only studies are by Artis (1997), Batchelor (1997), Harvey, Leybourne and Newbold (1999), and Gallo, Granger and Jeon (2002). The first two restrict attention to the G-7 countries, the third to the United Kingdom, and the last to the United States, the United Kingdom, and Japan.2

I. Data Description and Terminology

Consensus Forecasts has provided macroeconomic forecasts for industrialized countries on a monthly basis since October 1989. Over time, the coverage has expanded to encompass many developing countries; forecasts for these countries are reported in the publication’s off-shoots, namely, Latin American Consensus Forecasts (published bimonthly since 1993), Asia Pacific Consensus Forecasts (monthly since 1995), and Eastern Europe Consensus Forecasts (bi-monthly since 1998). Each of these publications surveys a number of prominent financial and economic analysts and reports their individual forecasts as well as simple statistics summarizing the distribution of forecasts. The focus of this paper is on the mean forecast (the "consensus") and the standard deviation across forecasters.3

Table 1 provides a list of the 63 countries used in the analysis, the sample period over which forecasts are available for each, and whether each is classified as an "industrialized" country or a "developing" country. As noted, for the industrialized countries the forecasts start in October 1989 (with a single exception). For the developing countries, the starting dates are more varied: in about 25 percent of the cases, the starting date is between October 1989 and October 1993; in the remaining cases, the starting date is January 1995. Most of the results in this paper are based on an "unbalanced" panel (i.e., countries enter the sample at different dates) in order to make use of all available information.

Table 1. List of Countries

| Industrialized Countries | Asia-Pacific Economies | Latin American Economies | Transition and Other Economies |
|---|---|---|---|
| Austria (1) | Bangladesh (4) | Argentina (4) | Bulgaria (4) |
| Australia (1) | China (4) | Bolivia (4) | Czech Republic (4) |
| Belgium (1) | Hong Kong (2) | Brazil (1) | Hungary (2) |
| Canada (1) | India (4) | Chile (4) | Poland (2) |
| Denmark (1) | Indonesia (2) | Colombia (4) | Romania (4) |
| Finland (1) | Korea (1) | Costa Rica (4) | Russia (4) |
| France (1) | Malaysia (1) | Dominican Rep. (4) | Slovakia (4) |
| Germany (1) | Pakistan (4) | Ecuador (4) | Slovenia (4) |
| Greece (3) | Philippines (4) | Mexico (1) | Ukraine (4) |
| Ireland (1) | Singapore (1) | Panama (4) | |
| Israel (1) | Sri Lanka (4) | Paraguay (4) | Egypt (4) |
| Italy (1) | Taiwan (1) | Peru (4) | Saudi Arabia (4) |
| Japan (1) | Thailand (2) | Uruguay (4) | South Africa (3) |
| Netherlands (1) | Vietnam (4) | Venezuela (4) | Turkey (4) |
| New Zealand (1) | | | |
| Norway (1) | | | |
| Portugal (1) | | | |
| Spain (1) | | | |
| Sweden (1) | | | |
| Switzerland (1) | | | |
| United Kingdom (1) | | | |
| United States (1) | | | |

The Asia-Pacific, Latin American, and Transition and Other groups are the developing countries.
Sample period: (1) from Oct. 1989; (2) from April 1991; (3) from Oct. 1993; (4) from Jan. 1995.

The "event" being forecast is annual average real GDP growth. Every month (or, as noted above, every other month in the case of Latin American or East European countries) a new forecast is made of this event. To ensure consistency of treatment across countries, we study the bi-monthly sequence of forecasts for each event. To take a concrete example, suppose that the event is 1999 real GDP growth. The sequence of forecasts that we study for this event are the twelve forecasts made between February 1998 and December 1999. The first six forecasts, the ones made during 1998, are referred to as year-ahead forecasts; the six forecasts made during 1999 are called current-year forecasts.

In discussing the results, it will sometimes be convenient to index the forecasts by the forecasting horizon, the number of months before the terminal date. So, to continue the example, the February 1998 forecast is considered as made at a 23-month horizon and the December 1999 forecast as made at a one-month horizon.

The comparison with IMF forecasts is made on the basis of the April and October forecasts, both year-ahead and current-year. In terms of the notation just discussed, these are the forecasts made at horizons 21, 15, 9, and 3 months.
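The horizon indexing described above can be sketched in a few lines of Python. This is only an illustration: the function names `forecast_horizon` and `forecast_type` are hypothetical, and the formula is inferred from the two worked examples in the text (February 1998 maps to 23 months, December 1999 to one month).

```python
def forecast_horizon(forecast_year: int, forecast_month: int, target_year: int) -> int:
    """Months between the forecast date and the end of the target year.

    A forecast made in December of the target year has a horizon of 1.
    """
    if not 1 <= forecast_month <= 12:
        raise ValueError("forecast_month must be between 1 and 12")
    horizon = 12 * (target_year - forecast_year) + (13 - forecast_month)
    if horizon < 1:
        raise ValueError("forecast is dated after the target year")
    return horizon


def forecast_type(horizon: int) -> str:
    """Horizons of 12 months or less are current-year; longer ones are year-ahead."""
    return "current-year" if horizon <= 12 else "year-ahead"


# The 1999 growth example from the text.
print(forecast_horizon(1998, 2, 1999))   # February 1998 -> 23
print(forecast_horizon(1999, 12, 1999))  # December 1999 -> 1

# April and October forecasts, year-ahead and current-year.
print([forecast_horizon(y, m, 1999) for y, m in [(1998, 4), (1998, 10), (1999, 4), (1999, 10)]])
```

The last line reproduces the four horizons (21, 15, 9, and 3 months) used for the comparison with the IMF forecasts.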

II. Accuracy of Private Sector Forecasts

The forecast errors are given by:

$$e(t) = A(t) - F(t)$$

where A(t) is a vector of growth outcomes (the "actuals"), and F(t) is the corresponding vector of forecasts. A perennial issue in the forecasting literature is whether the "actual" value should correspond to the early releases of the data or later revisions. Using a very early release may not be satisfactory as the number may be highly preliminary. But using the final release may not be satisfactory either as it may incorporate information (such as revisions of weights, changes in methods of construction, etc.) that forecasters simply could not have been aware of at the time of the forecast. Given our large sample of countries, there will likely be quite a few data revisions of this kind. Our compromise is to use the real GDP data as reported in the May WEO of the following year—this likely falls in between a highly preliminary number and later revised versions.4

Three measures of forecast accuracy are used. The first is the Mean Absolute Error (MAE), which is the average across all countries and over all years of the absolute differences between actual and forecast values, that is, disregarding the sign of the error. The second is the root mean square error (RMSE). To compute RMSE, the forecast errors are squared and averaged over the sample to get the mean squared error (MSE); RMSE is the square root of MSE. The third measure, Theil’s U-Statistic (TU), is defined as follows:

$$TU = \sqrt{\frac{\sum_t \left(A(t) - F(t)\right)^2}{\sum_t \left(A(t) - A(t-1)\right)^2}}$$

TU accomplishes two things. It scales RMSE by the variability of the underlying data, and it offers a way of evaluating forecasting performance relative to a "naive" forecast of no change in the growth rate between t-1 and t. Forecasts with a TU of less than 1 are said to beat the naive forecast.
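As a rough illustration of the three measures, the sketch below computes MAE, RMSE, and TU for a handful of made-up observations. It assumes that TU is the ratio of the forecast's RMSE to the RMSE of a no-change forecast, which is how the text describes it; the function names and the numbers are hypothetical and are not taken from the paper's data.

```python
import math


def mae(actual, forecast):
    """Mean absolute error: average size of the error, ignoring its sign."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def theil_u(actual, forecast, previous_actual):
    """RMSE of the forecast relative to a naive forecast of unchanged growth."""
    return rmse(actual, forecast) / rmse(actual, previous_actual)


# Made-up growth rates (percent) for four country-years.
actual = [2.0, -1.0, 3.0, 4.0]
forecast = [3.0, 1.0, 3.0, 2.0]
previous_actual = [4.0, 2.0, -1.0, 3.0]  # growth in the preceding year

print(mae(actual, forecast))    # 1.25
print(rmse(actual, forecast))   # 1.5
print(theil_u(actual, forecast, previous_actual) < 1)  # True: beats the naive forecast
```

In this toy sample the forecasts beat the no-change benchmark because growth swings widely from year to year, which is the same mechanism that lowers TU for more volatile economies.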

The results are reported in Table 2, for the full sample and also for the industrialized and developing countries separately. There are two clear findings. First, as one would expect, the magnitude of the forecast error declines as the forecast horizon gets shorter. For instance, MAE for the full sample is about 2 percentage points at a 23-month horizon and declines to just under 1 percentage point at the one-month horizon. Similar patterns are evident for RMSE and TU.

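Table 2. Accuracy of Private Sector Forecasts

| Sample | Year-ahead or Current-year | Horizon | Mean Absolute Error (MAE) | Root Mean Square Error (RMSE) | Theil's U-Statistic (TU) |
|---|---|---|---|---|---|
| All countries | Year-ahead | 23 | 2.10 | 3.32 | 0.86 |
| All countries | Year-ahead | 21 | 2.08 | 3.28 | 0.85 |
| All countries | Year-ahead | 19 | 2.03 | 3.21 | 0.83 |
| All countries | Year-ahead | 17 | 1.99 | 3.14 | 0.81 |
| All countries | Year-ahead | 15 | 1.87 | 2.96 | 0.81 |
| All countries | Year-ahead | 13 | 1.79 | 2.76 | 0.75 |
| All countries | Current-year | 11 | 1.63 | 2.45 | 0.67 |
| All countries | Current-year | 9 | 1.54 | 2.28 | 0.62 |
| All countries | Current-year | 7 | 1.31 | 1.95 | 0.53 |
| All countries | Current-year | 5 | 1.15 | 1.72 | 0.47 |
| All countries | Current-year | 3 | 0.98 | 1.52 | 0.41 |
| All countries | Current-year | 1 | 0.87 | 1.34 | 0.36 |
| Industrialized countries | Year-ahead | 23 | 1.43 | 1.98 | 1.01 |
| Industrialized countries | Year-ahead | 21 | 1.42 | 1.95 | 1.00 |
| Industrialized countries | Year-ahead | 19 | 1.39 | 1.91 | 0.97 |
| Industrialized countries | Year-ahead | 17 | 1.36 | 1.87 | 0.95 |
| Industrialized countries | Year-ahead | 15 | 1.25 | 1.71 | 0.92 |
| Industrialized countries | Year-ahead | 13 | 1.21 | 1.62 | 0.87 |
| Industrialized countries | Current-year | 11 | 1.10 | 1.47 | 0.79 |
| Industrialized countries | Current-year | 9 | 1.02 | 1.37 | 0.74 |
| Industrialized countries | Current-year | 7 | 0.92 | 1.25 | 0.67 |
| Industrialized countries | Current-year | 5 | 0.83 | 1.13 | 0.61 |
| Industrialized countries | Current-year | 3 | 0.71 | 0.99 | 0.53 |
| Industrialized countries | Current-year | 1 | 0.63 | 0.89 | 0.48 |
| Developing countries | Year-ahead | 23 | 2.67 | 4.13 | 0.91 |
| Developing countries | Year-ahead | 21 | 2.63 | 4.09 | 0.90 |
| Developing countries | Year-ahead | 19 | 2.60 | 4.04 | 0.88 |
| Developing countries | Year-ahead | 17 | 2.55 | 3.95 | 0.87 |
| Developing countries | Year-ahead | 15 | 2.45 | 3.78 | 0.87 |
| Developing countries | Year-ahead | 13 | 2.32 | 3.49 | 0.81 |
| Developing countries | Current-year | 11 | 2.06 | 3.04 | 0.70 |
| Developing countries | Current-year | 9 | 1.96 | 2.82 | 0.65 |
| Developing countries | Current-year | 7 | 1.66 | 2.40 | 0.55 |
| Developing countries | Current-year | 5 | 1.43 | 2.10 | 0.49 |
| Developing countries | Current-year | 3 | 1.24 | 1.88 | 0.43 |
| Developing countries | Current-year | 1 | 1.09 | 1.63 | 0.38 |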

Second, while in absolute terms errors are larger for developing countries than for the industrialized countries (as shown by the MAE and RMSE columns), taking account of the variability of the underlying data reverses this conclusion. Values of TU are always a bit higher for the industrialized country sample than for developing countries. Another way of thinking about this result is that the year-ahead forecasts for industrialized countries either do not beat the naive forecast of an unchanged growth rate or just barely beat it, whereas for developing countries they do notably better than a naive forecast.

How accurate are the private sector’s output forecasts? One way to answer this question is to note that real GDP growth averaged about 3 percent a year over this period (2.3 percent for industrialized countries; 3.6 percent for developing countries). Against this benchmark, the accuracy of the forecasts, particularly of the year-ahead forecasts, is not especially impressive.

A preferable, and less subjective, way to evaluate accuracy is to compare the private sector’s forecasts with forecasts available from other sources. The next section compares the private sector’s forecasts with another prominent source of cross-country forecasts, the IMF’s World Economic Outlook (WEO).

III. Comparison of Private Sector and IMF Forecasts

The IMF’s World Economic Outlook (WEO), published in May and October of each year, reports current-year and year-ahead forecasts for member countries. For about 50 of the largest countries, accounting for 90 percent of world output, the forecasts are updated for each WEO exercise; these countries are referred to as “Group A” countries. For the other countries, the WEO forecasts are from the most recent Article IV consultation or IMF program document, but they are “incrementally adjusted to reflect changes in assumptions and global economic conditions.”5 The May WEO forecasts are compared with the April private sector Consensus Forecasts, and the October WEO forecasts with the October consensus.6

Table 3 presents measures of accuracy for the two sources of forecasts. It is evident that the accuracy of the private sector’s forecasts is in almost every case a little bit better than that of the WEO forecasts.7 The differences are marginal in the case of forecasts for industrialized countries (particularly current-year forecasts), but rather more substantial in the case of developing countries. So, assessed against the performance of perhaps the single best-known alternative source of cross-country forecasts, the private sector forecasts perform well in terms of accuracy.8

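Table 3. Comparison of Private Sector and WEO Accuracy

| Sample | Year-ahead or Current-year | Horizon | MAE Consensus | MAE WEO | RMSE Consensus | RMSE WEO | TU Consensus | TU WEO |
|---|---|---|---|---|---|---|---|---|
| All countries | Year-ahead | 21 | 2.08 | 2.18 | 3.28 | 3.40 | 0.85 | 0.88 |
| All countries | Year-ahead | 15 | 1.87 | 2.09 | 2.96 | 3.31 | 0.81 | 0.90 |
| All countries | Current-year | 9 | 1.54 | 1.67 | 2.28 | 2.55 | 0.62 | 0.69 |
| All countries | Current-year | 3 | 0.98 | 1.05 | 1.52 | 1.63 | 0.41 | 0.44 |
| Industrialized | Year-ahead | 21 | 1.42 | 1.52 | 1.95 | 2.04 | 1.00 | 1.04 |
| Industrialized | Year-ahead | 15 | 1.25 | 1.35 | 1.71 | 1.82 | 0.92 | 0.98 |
| Industrialized | Current-year | 9 | 1.02 | 1.03 | 1.37 | 1.37 | 0.74 | 0.74 |
| Industrialized | Current-year | 3 | 0.71 | 0.75 | 0.99 | 0.99 | 0.53 | 0.53 |
| Developing | Year-ahead | 21 | 2.63 | 2.76 | 4.09 | 4.25 | 0.90 | 0.93 |
| Developing | Year-ahead | 15 | 2.45 | 2.79 | 3.78 | 4.26 | 0.87 | 0.98 |
| Developing | Current-year | 9 | 1.96 | 2.23 | 2.82 | 3.24 | 0.65 | 0.75 |
| Developing | Current-year | 3 | 1.24 | 1.34 | 1.88 | 2.07 | 0.43 | 0.48 |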

As Diebold and Mariano (1995, p. 262) have emphasized, the superiority of one source of forecasts in terms of forecast accuracy does not necessarily imply that forecasts from other sources contain no additional information. To test if WEO forecasts add information to the Consensus forecasts, we estimate the following regression:

$$A(t) = b_0 + b_1\,\mathrm{CONSENSUS}(t) + b_2\,\mathrm{WEO}(t) + u(t)$$

where A(t) is the outcome, CONSENSUS(t) and WEO(t) are the forecasts from the two sources, b0 is the constant term, and u(t) is the regression error. Equations of the kind shown above are based on the literature on forecast encompassing (see Fair and Shiller (1990), and Holden and Thompson (1997)).9 If we cannot reject the null hypothesis that the estimated b2 is zero, the WEO forecasts are encompassed by the Consensus forecast. We report regressions with and without the constant term to see if the results of this exercise are sensitive to assumptions about the unbiasedness of the Consensus forecast.
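To make the encompassing regression concrete, here is a minimal sketch that regresses outcomes on a constant and the two forecasts by ordinary least squares. Everything in it is an assumption for illustration: the helper `ols` is a hand-rolled solver for the normal equations rather than the estimation routine used in the paper, and the data are made up so that the outcome equals the Consensus forecast minus 0.5, with the WEO forecast adding nothing.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up forecasts (percent growth); not data from the paper.
consensus = [2.0, 3.0, 1.0, 4.0, 2.5]
weo = [2.5, 2.0, 1.5, 4.5, 3.5]
outcome = [c - 0.5 for c in consensus]

# Each row is (constant, CONSENSUS, WEO).
design = [(1.0, c, w) for c, w in zip(consensus, weo)]
coefficients = ols(design, outcome)
print([round(b, 6) for b in coefficients])  # constant, b1, b2
```

On this constructed sample the estimated coefficient on the WEO forecast is zero up to rounding, the pattern that encompassing by the Consensus forecast would produce. A formal test of b2 = 0 also needs standard errors, which this sketch does not compute.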

The regression results are reported in Table 4. Results are reported for all horizons and countries pooled together and for each of the four horizons and two country types separately. This gives a total of 30 regressions. The range of estimates for b1 is between 0.7 and 1.3, with a substantial number of estimates very close to 1. The range of estimates for b2 is -0.2 to 0.4, with a majority of the estimates very close to zero. To the extent that the WEO forecasts are not encompassed by the Consensus forecasts, that tends to be the case for industrialized countries, and even then for current-year forecasts only.

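Table 4. Forecast Encompassing Tests

PANEL A: REGRESSIONS WITH A CONSTANT

| Sample | Year-ahead and/or Current-year | Horizon | CONSENSUS | WEO | Constant | Adj. R2 | No. of Obs. |
|---|---|---|---|---|---|---|---|
| All countries | Both | All | 1.00 (0.03) | 0.004 (0.01) | -0.39 (0.10) | 0.52 | 1681 |
| All countries | Year-ahead | 21 | 0.83 (0.08) | 0.01 (0.02) | -0.34 (0.34) | 0.21 | 403 |
| All countries | Year-ahead | 15 | 0.96 (0.07) | 0.00 (0.02) | -0.41 (0.28) | 0.34 | 393 |
| All countries | Current-year | 9 | 1.21 (0.09) | -0.10 (0.09) | -0.50 (0.15) | 0.67 | 455 |
| All countries | Current-year | 3 | 1.03 (0.02) | 0.01 (0.01) | -0.07 (0.08) | 0.87 | 430 |
| Industrialized countries | Both | All | 0.97 (0.10) | 0.18 (0.09) | -0.54 (0.10) | 0.55 | 796 |
| Industrialized countries | Year-ahead | 21 | 1.01 (0.30) | 0.17 (0.24) | -1.01 (0.48) | 0.21 | 188 |
| Industrialized countries | Year-ahead | 15 | 1.25 (0.22) | 0.10 (0.21) | -1.18 (0.32) | 0.41 | 189 |
| Industrialized countries | Current-year | 9 | 0.85 (0.17) | 0.36 (0.15) | -0.50 (0.14) | 0.70 | 210 |
| Industrialized countries | Current-year | 3 | 0.84 (0.09) | 0.28 (0.09) | -0.22 (0.07) | 0.91 | 209 |
| Developing countries | Both | All | 1.01 (0.04) | 0.01 (0.13) | -0.63 (0.17) | 0.50 | 885 |
| Developing countries | Year-ahead | 21 | 0.92 (0.13) | 0.01 (0.03) | -1.07 (0.68) | 0.19 | 215 |
| Developing countries | Year-ahead | 15 | 0.97 (0.10) | 0.01 (0.03) | -0.68 (0.52) | 0.30 | 204 |
| Developing countries | Current-year | 9 | 1.28 (0.12) | -0.15 (0.12) | -0.76 (0.27) | 0.65 | 245 |
| Developing countries | Current-year | 3 | 1.02 (0.03) | 0.01 (0.02) | -0.04 (0.15) | 0.86 | 221 |

PANEL B: REGRESSIONS WITHOUT A CONSTANT

| Sample | Year-ahead and/or Current-year | Horizon | CONSENSUS | WEO | Adj. R2 | No. of Obs. |
|---|---|---|---|---|---|---|
| All countries | Both | All | 0.93 (0.02) | 0.002 (0.01) | 0.72 | 1681 |
| All countries | Year-ahead | 21 | 0.76 (0.04) | 0.00 (0.02) | 0.54 | 403 |
| All countries | Year-ahead | 15 | 0.88 (0.04) | 0.00 (0.02) | 0.62 | 393 |
| All countries | Current-year | 9 | 1.17 (0.09) | -0.14 (0.09) | 0.80 | 455 |
| All countries | Current-year | 3 | 1.02 (0.02) | 0.01 (0.01) | 0.92 | 430 |
| Industrialized countries | Both | All | 0.82 (0.10) | 0.15 (0.09) | 0.80 | 796 |
| Industrialized countries | Year-ahead | 21 | 0.69 (0.26) | 0.15 (0.25) | 0.65 | 183 |
| Industrialized countries | Year-ahead | 15 | 0.97 (0.22) | -0.03 (0.20) | 0.72 | 189 |
| Industrialized countries | Current-year | 9 | 0.67 (0.16) | 0.37 (0.15) | 0.87 | 210 |
| Industrialized countries | Current-year | 3 | 0.80 (0.09) | 0.25 (0.09) | 0.96 | 209 |
| Developing countries | Both | All | 0.91 (0.02) | 0.001 (0.01) | 0.70 | 885 |
| Developing countries | Year-ahead | 21 | 0.74 (0.06) | 0.00 (0.03) | 0.51 | 215 |
| Developing countries | Year-ahead | 15 | 0.86 (0.06) | -0.00 (0.03) | 0.59 | 204 |
| Developing countries | Current-year | 9 | 1.22 (0.12) | -0.21 (0.12) | 0.79 | 245 |
| Developing countries | Current-year | 3 | 1.01 (0.03) | 0.01 (0.02) | 0.91 | 221 |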

Another dimension on which forecasts can be compared is directional accuracy. We compute whether the forecasted change in growth is in the same direction as the actual change in growth. The fraction of forecasts that get the direction right is shown in Table 5. For both sources of forecasts the fraction increases as the horizon gets shorter and is higher for industrialized than for developing countries in the case of current-year forecasts. The relative performance does not produce a clear winner: the WEO does better at a three-month horizon but generally a bit worse at the other horizons.
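The directional accuracy calculation can be sketched as follows. The sketch assumes that both the forecasted and the actual change are measured against the previous year's actual growth rate, and that a zero change in either counts as a miss; neither detail is spelled out in the text, and the function name and numbers are made up.

```python
def directional_accuracy(actual, forecast, previous_actual):
    """Fraction of cases in which forecast and outcome move growth in the same direction."""
    hits = 0
    for a, f, prev in zip(actual, forecast, previous_actual):
        predicted_change = f - prev
        actual_change = a - prev
        # Same sign means the direction was called correctly; ties count as misses.
        if predicted_change * actual_change > 0:
            hits += 1
    return hits / len(actual)


# Made-up growth rates (percent) for four country-years.
actual = [2.0, -1.0, 3.0, 4.0]
forecast = [3.0, 1.0, 3.0, 2.0]
previous_actual = [4.0, 2.0, -1.0, 3.0]

print(directional_accuracy(actual, forecast, previous_actual))  # 0.75
```

Three of the four made-up forecasts call the direction of the change correctly, so the fraction is 0.75, on the same scale as the entries in Table 5.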

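Table 5. Directional Accuracy of Consensus and WEO Forecasts

| Year-ahead or Current-year | Horizon | All Countries: Consensus | All Countries: WEO | Industrialized Countries: Consensus | Industrialized Countries: WEO | Developing Countries: Consensus | Developing Countries: WEO |
|---|---|---|---|---|---|---|---|
| Year-ahead | 23 | 0.62 | | 0.61 | | 0.62 | |
| Year-ahead | 21 | 0.62 | 0.60 | 0.60 | 0.56 | 0.62 | 0.62 |
| Year-ahead | 19 | 0.62 | | 0.61 | | 0.63 | |
| Year-ahead | 17 | 0.62 | | 0.60 | | 0.63 | |
| Year-ahead | 15 | 0.62 | 0.60 | 0.62 | 0.58 | 0.62 | 0.61 |
| Year-ahead | 13 | 0.68 | | 0.71 | | 0.66 | |
| Current-year | 11 | 0.70 | | 0.75 | | 0.67 | |
| Current-year | 9 | 0.71 | 0.70 | 0.77 | 0.78 | 0.65 | 0.66 |
| Current-year | 7 | 0.73 | | 0.82 | | 0.68 | |
| Current-year | 5 | 0.75 | | 0.85 | | 0.70 | |
| Current-year | 3 | 0.77 | 0.80 | 0.87 | 0.89 | 0.71 | 0.75 |
| Current-year | 1 | 0.80 | | 0.91 | | 0.75 | |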

To summarize the results of this section, directional accuracy is a statistical “dead heat,” whereas the evidence tends to favor the private sector forecasts as being a little more accurate than, and encompassing, the WEO forecasts. The dominance of private sector forecasts is not wholly unexpected. The Consensus forecasts are updated in a far more timely manner and with a shorter "production lag" than the WEO forecasts. The consensus is also an average of several individual forecasts, which, even allowing for "herding" tendencies, should tend to produce a more accurate forecast.10

It should also be noted that forecasts are only one component of the WEO; far more attention is devoted to analyzing the global outlook, risks to it, and special topics of policy interest. For Consensus Forecasts, however, forecasts are the raison d’être.

IV. Forecasting Recessions

Fintzen and Stekler (1999, p. 309) note that “one of the most disturbing findings of forecast evaluations is that, in the United States, recessions have generally not been predicted prior to their occurrence.” As one country’s macroeconomic history only yields a few observations of recessions, the cross-country sample used here affords an opportunity for testing if the failure to forecast recessions is a ubiquitous feature of growth forecasts.11 One limitation is that the term recession (that is, a year in which real GDP declined) has to be interpreted rather broadly to encompass cyclical downturns (as in the case of the United States in 1991), declines in output associated with transition from planned economies to market economies (as in the case of Hungary and Poland), and declines associated with crises of various kinds (e.g., the ERM crisis of 1992-93, the Mexican crisis (1995), and the global financial crises of 1997-99). There were a total of 72 episodes of recessions in our sample (Table 6).

The properties of forecasts during recession years are summarized in Table 7. As shown in the first column, very few recessions are predicted a year in advance; for instance, at a 13-month horizon negative growth was forecast in only 8 of the 72 episodes. There is then a discrete jump in the number of forecasts of recessions; at an 11-month horizon the number of cases of negative growth forecasts has risen to 23 out of the possible 72 cases.12 The number of cases of forecasts of recessions then rises steadily, reaching 54 at a one-month horizon.

While forecasters increasingly start to recognize recessions in the year in which they occur, the results in the second column show that the magnitude of the downturn is underpredicted in the vast majority of cases. For instance, even as late as December of the year of the recession, the forecast is more optimistic than the outcome in 40 cases.
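The two counts discussed above (how often negative growth was forecast, and how often the forecast was more optimistic than the outcome) can be tabulated with a short sketch like the one below. It assumes, following the text, that a recession episode is a year in which actual growth was negative; the function name and the numbers are illustrative only.

```python
def recession_summary(actual, forecast):
    """Return (episodes, forecasts below zero, forecasts above the outcome)."""
    # Keep only the country-years in which real GDP actually declined.
    episodes = [(a, f) for a, f in zip(actual, forecast) if a < 0]
    predicted = sum(1 for a, f in episodes if f < 0)
    too_optimistic = sum(1 for a, f in episodes if f > a)
    return len(episodes), predicted, too_optimistic


# Made-up growth outcomes and forecasts (percent) at a single horizon.
actual = [-2.0, 1.5, -0.5, -3.0, 2.0]
forecast = [1.0, 2.0, -1.0, -1.5, 2.5]

print(recession_summary(actual, forecast))  # (3, 2, 2)
```

Running the same tabulation separately at each horizon would give counts of the kind reported in the first two columns of Table 7.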

Table 6. List of 72 Episodes of Recessions
(Recession years are listed after the country name; countries with no years listed did not experience a recession over the sample period.)

| Industrialized Countries | Asia-Pacific Economies | Latin American Economies | Transition and Other Economies |
|---|---|---|---|
| Austria 1993 | Bangladesh | Argentina 1995, '99 | Bulgaria 1996, '97 |
| Australia 1991 | China | Bolivia | Czech Rep. 1995, '99 |
| Belgium 1993 | Hong Kong 1998 | Brazil 1990, '92 | Hungary 1990, '91, '92, '93 |
| Canada 1991 | India | Chile 1999 | Poland 1990, '91 |
| Denmark | Indonesia 1998 | Colombia 1999 | Romania 1997, '98 |
| Finland 1991, '92, '93 | Korea 1998 | Costa Rica 1996 | Russia 1995, '96, '98 |
| France 1993 | Malaysia 1998 | Dominican Rep. | Slovakia |
| Germany 1993 | Pakistan | Ecuador 1999 | Slovenia |
| Greece 1993 | Philippines 1998 | Mexico 1995 | Ukraine 1995, '96, '97, '98, '99 |
| Ireland | Singapore | Panama | |
| Israel | Sri Lanka | Paraguay | Egypt |
| Italy 1993 | Taiwan | Peru | Saudi Arabia 1995, '99 |
| Japan 1998 | Thailand 1997, '98 | Uruguay 1995, '99 | South Africa |
| Netherlands | Vietnam | Venezuela 1993, '94, '96, '98, '99 | Turkey 1999 |
| New Zealand 1991, '98 | | | |
| Norway | | | |
| Portugal 1993 | | | |
| Spain 1993 | | | |
| Sweden 1991, '92, '93 | | | |
| Switzerland 1991, '92, '93, '96 | | | |
| UK 1991, '92 | | | |
| USA 1991 | | | |

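Table 7. Forecast Performance During Recession Episodes
(Total number of episodes = 72)

| Year-ahead or Current-year | Horizon | Forecast < 0 | Forecast > Actual | Average Forecast Error: Full Sample | Average Forecast Error: Industrialized | Average Forecast Error: Developing |
|---|---|---|---|---|---|---|
| Year-ahead | 23 | 2 | 60 | 6.18 | 3.55 | 7.94 |
| Year-ahead | 21 | 2 | 60 | 6.05 | 3.50 | 7.76 |
| Year-ahead | 19 | 2 | 56 | 5.90 | 3.41 | 7.77 |
| Year-ahead | 17 | 2 | 56 | 5.67 | 3.27 | 7.46 |
| Year-ahead | 15 | 4 | 56 | 5.14 | 2.89 | 6.84 |
| Year-ahead | 13 | 8 | 60 | 4.66 | 2.48 | 6.11 |
| Current-year | 11 | 23 | 63 | 3.70 | 1.96 | 4.81 |
| Current-year | 9 | 26 | 61 | 3.16 | 1.60 | 4.17 |
| Current-year | 7 | 35 | 59 | 2.30 | 1.21 | 3.11 |
| Current-year | 5 | 41 | 54 | 1.69 | 0.87 | 2.31 |
| Current-year | 3 | 50 | 50 | 1.06 | 0.51 | 1.45 |
| Current-year | 1 | 54 | 40 | 0.57 | 0.21 | 0.81 |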

The final columns show the average forecast error at the different forecast horizons over all 72 episodes and also for the industrialized countries and developing countries separately. Note that average forecast errors continue to be quite substantial even for forecasts made fairly late in the year of the recession. For instance, at the three-month horizon, the average forecast error is 0.5 percentage points for industrialized countries and nearly 1.5 percentage points for developing countries; at the one-month horizon the corresponding numbers are 0.2 and 0.8 percentage points.

A Goldman Sachs report (2001)13 contains an extended discussion of why private sector economists tend to avoid forecasting recessions.14 First, recessions are relatively rare events, which makes forecasting positive growth a pretty good bet in most years. (In the sample used in this paper, years marked by recessions comprise about 15 percent of the total number of country years.) Second, as a related point, macroeconomic models are built to capture normal relationships among variables and are hence not always suited to predicting recessions unless there is a clear large exogenous shock. Third, there is a “herding” tendency among forecasters. The report notes (p. 4) that,

“to forecast a recession substantially ahead of the pack, a forecaster must be willing to deviate from the consensus in an extremely transparent manner, knowing very clearly that in any given year a recession is a low probability outcome.”

V. Forecaster Discord and Forecaster Uncertainty

In addition to the mean forecast, users of Consensus Forecasts often look at the degree of discord across forecasters, as measured by the standard deviation of the individual forecasts. For instance, the high standard deviation of forecasts in the case of Japan over the last few years is taken to be a signal that developments in Japan have been particularly difficult to forecast in recent times. The high standard deviation could serve as a warning that the forecast accuracy for Japan’s growth may be low. Conversely, a low level of forecaster discord might suggest that the country’s growth prospects are relatively easy to forecast, and hence that one should expect that the forecast error will be low.

How reliable is forecaster discord as a predictor of accuracy? To investigate this issue, we estimate a regression of the (absolute value of the) forecast error on the standard deviation of the forecast and other variables.15 The sample consists of 26 countries, listed in Table 8, for which individual forecasts are available. The data are pooled for these countries for four years (1995 to 1998) and for two forecast horizons (the current-year April forecast and the current-year October forecast);16 this yields a total of 208 observations.
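The two variables in this regression can be built from individual forecasts as in the sketch below. It is a simplified illustration under stated assumptions: discord is taken to be the sample standard deviation of the individual forecasts (the text does not say whether the sample or population formula is used), the slope is from a one-regressor least-squares fit without the dummy variables used in the paper, and all names and numbers are made up.

```python
import statistics


def discord_and_error(individual_forecasts, actual):
    """Return (standard deviation across forecasters, absolute error of the mean forecast)."""
    consensus = statistics.mean(individual_forecasts)
    discord = statistics.stdev(individual_forecasts)  # sample standard deviation (assumption)
    return discord, abs(actual - consensus)


def simple_slope(x, y):
    """Least-squares slope of y on x with an intercept."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Made-up surveys: (individual growth forecasts, actual growth), in percent.
surveys = [
    ([2.0, 2.2, 1.8], 2.1),
    ([1.0, 3.0, 2.0], 0.5),
    ([3.0, 3.5, 4.0, 2.5], 5.0),
]

pairs = [discord_and_error(forecasts, actual) for forecasts, actual in surveys]
discord = [d for d, _ in pairs]
abs_error = [e for _, e in pairs]
print(simple_slope(discord, abs_error) > 0)  # True: more discord, larger error in this toy sample
```

A positive slope in this toy sample mirrors the sign of the relationship under investigation; it says nothing about its strength in the actual data.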

Table 8. List of Countries Used in Forecaster Discord Regressions

| Industrialized Countries | Asia-Pacific Economies | Latin American Economies |
|---|---|---|
| Canada | Australia | Argentina |
| France | China | Brazil |
| Germany | Hong Kong | Chile |
| Italy | India | Mexico |
| Japan | Indonesia | Venezuela |
| Netherlands | Korea | |
| Spain | Malaysia | |
| Sweden | New Zealand | |
| United Kingdom | Singapore | |
| United States | Taiwan | |
| | Thailand | |

The results of the estimation are given in Table 9. In addition to the standard deviation of the forecasts, the following explanatory variables are included: (1) dummy variables for each region and each year; (2) dummy variables to test whether the results are due to outliers; and (3) a dummy variable that takes the value 1 if the forecast was made in April, and zero if the forecast was made in October. Since, as was shown above, the forecast error is higher earlier on in the year than later, the expected sign of the coefficient on this dummy variable is positive.

Table 9. Forecast Errors and Forecaster Discord
(Dependent variable: current-year forecast error, absolute value (AFE))

| Independent Variables | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Standard deviation of forecasts | 2.0 (0.2) | 1.9 (0.2) | 1.8 (0.2) | 0.7 (0.2) | 0.6 (0.2) |
| (0,1) Dummy for April or Oct. forecast (April=1) | 0.5 (0.14) | 0.5 (0.14) | 0.5 (0.14) | 0.3 (0.1) | 0.2 (0.1) |
| (0,1) Dummy for Industrialized countries | | -0.4 (0.2) | -0.5 (0.2) | -0.4 (0.1) | -0.2 (0.15) |
| (0,1) Dummy for Asia-Pacific countries | | -0.2 (0.2) | -0.2 (0.2) | -0.2 (0.13) | 0.07 (0.14) |
| (0,1) Dummy for 1995 | | | -0.1 (0.2) | -0.1 (0.14) | -0.1 (0.1) |
| (0,1) Dummy for 1996 | | | -0J (0.2) | -0.25 (0.14) | -0.2 (0.15) |
| (0,1) Dummy for 1997 | | | -0.1 (0.2) | -0.02 (0.14) | -0.1 (0.15) |
| (0,1) Dummy for very high AFE (mean + 2 times s.d.) | | | | 4.0 (0.3) | |
| (0,1) Dummy for high AFE (mean + s.d.) | | | | | 2.9 (0.2) |
| Intercept | -0.2 (0.13) | 0.1 (0.2) | 0.4 (0.3) | 0.6 (0.2) | 0.4 (0.2) |
| Adjusted R2 | 0.37 | 0.37 | 0.38 | 0.72 | 0.73 |

Number of observations: 208 (26 countries x 4 years x 2 forecast horizons)
Note: Numbers in parentheses are standard errors

The initial regression in column (1) shows that higher forecaster discord is indeed associated with higher forecast errors, controlling for the month of the forecast (April or October). The coefficient estimate is positive and significantly different from zero.

Adding on region-specific fixed effects (column (2)) and year-specific fixed effects (column (3)) does not materially affect the strength of the positive correlation between (absolute) forecast error and forecaster discord. The inclusion of dummy variables to pick up the effects of outliers (columns (4) and (5)) attenuates the correlation, but it remains positive and significantly different from zero.

Overall, these results provide some support for the common practice of using the standard deviation of the forecast as a rough indication of the difficulty of the forecasting "terrain," and consequently as one determinant of the magnitude of the forecast error.17

VI. Conclusions

This paper has assembled further evidence on the properties of private sector growth forecasts for a large sample of countries. The main questions addressed, and the answers suggested by the evidence, are as follows:

  • How accurate are private sector forecasts?

  • In absolute terms, the magnitude of the errors tends to be larger for developing than for industrialized countries. However, growth is more variable for developing countries; if one adjusts for this fact, say by scaling the forecast error for a country by the variability of its growth, the errors are a bit smaller for developing countries.

  • How does private sector accuracy compare with that of the IMF’s World Economic Outlook?

  • Private sector and WEO forecasts seem to be quite similar. Measures of accuracy such as mean absolute error and root mean square error are better for Consensus forecasts than for WEO forecasts, but the differences are not overwhelming. Tests of directional accuracy do not reveal substantial differences between the two sources either. Nevertheless, the WEO forecasts are encompassed by the Consensus forecasts, suggesting that they do not add explanatory power to the private sector’s forecasts.

  • How well do forecasters predict recessions?

  • Updating earlier work, we show that very few of the 72 recessions that occurred over the sample were predicted a year in advance and two-thirds remained undetected by the April of the year in which the recession occurred. In over half the cases, the forecast made in December of the year of the recession underestimated its extent. This predictive failure could arise either because forecasters lack the requisite information (in terms of reliable real-time data or reliable models) or because they lack the incentives to predict recessions; further work would be needed to discriminate between these two classes of theories.

  • Is forecaster discord a reliable predictor of forecast accuracy?

  • There is a positive relationship between the two: when there is greater discord across forecasters, the forecast error tends to be larger, on average. At the same time, the relationship is not overwhelmingly strong. This means that forecast discord can be used as one element in trying to gauge the likely magnitude of the forecast error, but it cannot be used as the only element.

References

Grace Juhn is a student at Harvard University’s Kennedy School of Government, and Prakash Loungani is Assistant to the Director of External Relations at the International Monetary Fund. Work on this paper was completed while Juhn was a Research Assistant in the IMF’s Research Department. We acknowledge useful comments from Frank Diebold.

1. Publications such as the IMF’s World Economic Outlook (WEO), the World Bank’s Global Economic Prospects (GEP), and the OECD’s Economic Outlook (EO) contain references to the Consensus forecasts. See, for instance, WEO: Interim Assessment (December 1997, pp. 34-36), Staff Studies for the WEO (December 1997, pp. 23-25), and GEP (1999, p. 9).

2. Earlier work by Loungani (2001a and b) also contains an evaluation of Consensus Forecasts of output growth. This paper builds on that work in five ways: (1) the entire sequence of bi-monthly forecasts is studied, instead of just the April and October forecasts; (2) forecast encompassing tests are presented to test more formally for the relative information content of private and official sector forecasts, instead of the scatter plots presented in the earlier work; (3) evidence is presented on directional accuracy of consensus and WEO forecasts; (4) the relationship between forecaster discord and forecast accuracy is studied; and (5) the sample period is extended by a year, not a trivial increase when the sample period is as short as it is here. The additional year is particularly useful in updating the evidence on forecasting recessions that was presented in the earlier work.

3. In future work, it would be interesting to examine the properties of the median forecast as well.

4. For example, the 1990 forecast was compared to the realization as reported in the May 1991 WEO. In cases where this was not possible, because the data were not reported, we used the first available realization reported in the WEO.

5. Preface to October 1998 WEO.

6. The correlation between Consensus Forecasts for any two adjacent months is very high, 0.95 or better. This suggests that our results are not likely to have been much affected by using the May forecasts instead of the April forecasts.

7. One interesting extension to pursue would be to see if forecast accuracy in the case of countries with IMF-supported programs differs from that in other cases. On the one hand, forecasts for program countries are subject to greater scrutiny, which may lead to greater accuracy. On the other hand, forecasts for program countries are often arrived at after negotiations with the country’s authorities and may not represent true forecasts. See Musso and Phillips (2002) for a further discussion and evidence on the accuracy of projections made as part of IMF-supported programs.

8. We carried out a test, based on Diebold and Mariano (1995) and Diebold (2001, pp. 293-94), of whether the better performance of the Consensus relative to the WEO is statistically significant. Our preliminary results suggest that it is, but this result will need to be tested more rigorously in future work. One reason is that the test is intended for a time series rather than a panel data context; we used fixed effects to control for the panel nature of our data, but this may not be an adequate control.

9. Equations of this kind can also be motivated on the basis of an older literature on combining forecasts (Bates and Granger (1969), and Granger and Ramanathan (1984)), where the focus is on finding the optimal linear combination of available forecasts of an event. Diebold (1989) discusses the links between the forecast combination and forecast encompassing literatures.

10. See Gallo, Granger, and Jeon (2002) for evidence on copycat behavior among the individual forecasters included in the Consensus Forecasts.

11. For evidence on how well recoveries are forecast, see Loungani (2002).

12. That this jump coincides with the arrival of a new year suggests that there is a heightened focus by both forecasters and their clients on the growth outcomes for the current year and perhaps lesser interest in outcomes for the following year.

13. Cited with permission from Goldman Sachs.

14. In a related discussion, Loungani (2000, 2001a) discusses two classes of theories for why recessions might not be forecast. The first is that the information needed is lacking: forecasters either do not have access to reliable real-time information or lack reliable models for translating available information into predictions of a recession. The second is that the incentives for producing an "outlier" forecast (a recession or a strong boom) are lacking.

15. Recent work (e.g., Alizadeh, Brandt, and Diebold, 2002) makes it clear that the range of forecasts (that is, max.-min.) can be a very informative volatility measure. It would be interesting to use the range instead of the standard deviation in regressions of the sort reported in Table 9.

16. In principle, one could carry out a similar analysis for the year-ahead standard deviation as well.

17. Zarnowitz and Lambros (1985) and Gallo, Granger, and Jeon (2002) caution against using the standard deviation (across analysts) as a measure of the standard deviation of the consensus.
