Recent developments in the sphere of international economic policy coordination produced an agreement at the May 1986 Tokyo Summit that the major countries should focus on a set of economic indicators as a means of strengthening the degree of cooperation in macroeconomic policymaking already in existence. The Fund was given the formal responsibility for carrying this suggestion forward. In the subsequent development of this idea (see, in particular, Crockett and Goldstein (1987)), emphasis has been given to a taxonomy of indicators of current economic developments, distinguishing those which are signals of policy posture from those which measure intermediate variables, and which in turn are distinguished from those measuring economic performance. Indicators may be used in a number of ways. On a rising scale of international interdependence, they may provide individual countries with a checklist of variables against which to monitor the short-run progress of their economies; they may provide information on the medium-run sustainability of policies; and they may signal in a formal way the need for multilateral discussion of policies.

Because of the lags in the economic process, it is clear that indicators of current developments cannot be a substitute for forecasting; on the contrary, for any of the purposes listed above forecasts are needed for the evolution of the relevant indicators. Here, the relevance of the present study should become apparent. Students of the analytics of economic policy coordination (see, for example, Cooper (1984)) have long stressed the significance of agreement about propositions in positive economics to the success of international policy coordination: that agreement must embrace both the evaluation of responses of performance indicators to policy indicators and the baseline forecast evolution of the indicators. The Fund’s World Economic Outlook (WEO) has long been in the business of projecting the latter, making forecasts of the development of the performance indicators subject, essentially, to starting assumptions about policies. For the successful functioning of an indicator system, the degree of forecasting accuracy must be tolerably good, given the alternatives. This setting provides more than adequate motivation for an examination of the Fund’s forecasting track record as distilled from the projections published in the World Economic Outlook and publications of the same kind that were circulated internally within the Fund for nearly a decade before regular publication began in 1980.1


The scope of this study was restricted to the examination of the World Economic Outlook forecasting record for the principal performance indicators for the major industrial countries and corresponding aggregates and for groups of non-oil developing countries. Moreover, the analysis only covered the short-term forecasts—that is, those covering the current year and one year ahead. Necessarily set on one side has been the mass of detail presented in the forecasts pertaining to the components of national expenditure and the medium-term “scenarios” formulated in recent years.

Several criteria were used in evaluating the forecasts: the computation and evaluation of various summary statistics of forecast accuracy, bias, and efficiency; comparisons with alternative forecasts—naive forecasts and forecasts produced by the Organization for Economic Cooperation and Development (OECD) and by national forecasting agencies; the examination of turning-point errors and forecast performance in defined episodes; and, finally, some attempt to explain forecast error in terms of unanticipated developments in policy variables and oil prices.

In judging the forecast performance of the World Economic Outlook, a number of points must be kept in mind. Most important, it has to be recognized that the period since the inception of the World Economic Outlook as a regular forecasting exercise has been extraordinarily rich in economic upheavals, which have made the odds against accurate forecasting formidable. It should also be recalled that the objective of the World Economic Outlook is not to forecast the most likely outcome but rather to provide conditional estimates of economic developments under the assumption of unchanged policies and exchange rates. Indeed, the quintessential purpose of the World Economic Outlook exercise is to assist the Fund in carrying out bilateral as well as multilateral surveillance by helping to identify tensions in the projections that may call for policy adjustments or may result in exchange rate changes. Finally, it must remain true that the standard of accuracy required in a forecast is relative to the task for which the forecasts are required. Adapting the results recorded here to this criterion must remain an exercise for the interested reader.

Against this background, the forecast performance appears to have been reasonably accurate, particularly for output and inflation, with the industrial country forecasts generally more accurate than those for the developing countries. Although the average absolute error for year-ahead forecasts of growth in the seven major industrial countries as a group over the 1973–85 period has been slightly above 1 percentage point, this result is strongly influenced by a few outliers attributable to some of the large disturbances experienced over the period, notably the first major round of oil price increases in 1973–74 and the tight monetary conditions in the early 1980s. Excluding 1974 and 1982, the mean absolute error for year-ahead output growth in the seven major industrial countries has been only 0.7 of 1 percentage point. While the biggest errors can be related to the large fluctuations in oil prices, the significance of unforeseen fiscal policy developments in explaining forecast error seems strictly limited. The results also show that forecast accuracy is quite sensitive to forecast lead time, so that the errors typically diminish as more information becomes available, particularly for the industrial countries.

The forecasts for output growth, both for industrial and for developing countries, appear to have suffered from a degree of “optimism bias” in the sense that output forecasts have been mostly on the high side in relation to realized values. This bias in an optimistic direction is concentrated in the 1971–80 period. It undoubtedly reflects the fact that the slowdown in growth in this period was only gradually perceived to be a break in trend growth rather than primarily a cyclical phenomenon. Since 1980, output forecast errors have been more evenly distributed. Moreover, there is little evidence of inefficiency in the World Economic Outlook forecasts, in the sense that the forecast errors cannot systematically be explained by the level of the forecast itself and are not obviously statistically biased. The World Economic Outlook forecasts also appear to be efficient in that they are generally incapable of being improved by adding information from the available forecasts produced by the OECD or by national forecasters.

As between the performance indicators considered, forecasts for the current account of the balance of payments are inferior to those for output and inflation, at least for the industrial countries. This result, which might appear surprising for forecasts that are prepared on an internationally consistent basis, must be qualified in two important respects. First, the current account is the balance of very large gross flows of exports and imports (of goods, services, and transfers); even small errors in the growth estimates of exports or imports may thus show up as large errors in the current account. The second qualification concerns the questionable quality of balance of payments statistics as evidenced by the large and highly volatile discrepancy in the world current account, particularly since the late 1970s. As argued in a recent report on this problem (International Monetary Fund (1987)), large fluctuations in the current account discrepancy have undoubtedly been an important source of error, not only for balance of payments forecasts but also for world growth projections.
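The first qualification can be illustrated with a short piece of arithmetic. The figures below are purely hypothetical and are not drawn from the study; they assume a current account CA equal to gross credits X minus gross debits M, with the balance small relative to the gross flows.

```latex
\[
CA = X - M, \qquad X = 1000,\quad M = 990 \;\Longrightarrow\; CA = 10 .
\]
\[
\text{Exports overpredicted by 1 percent: } \hat{X} = 1010
\;\Longrightarrow\; \widehat{CA} = 1010 - 990 = 20 .
\]
\[
\frac{\widehat{CA} - CA}{CA} = \frac{20 - 10}{10} = 100 \text{ percent}.
\]
```

In this example a 1 percent error in one gross flow doubles the projected balance, which is why current account errors can look large even when the underlying trade forecasts are close.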

On the track record of the World Economic Outlook relative to national forecasting agencies, comparisons are hampered by differences in source dates of available forecasts, making generalization hazardous. However, it would appear that the World Economic Outlook forecasts do not generally provide any distinct improvement over those of national agencies in forecasting national output growth and inflation. Indeed, one outcome of the comparisons with other forecasters is the finding that there is a high degree of common sharing in the principal forecasting errors. The largest of these are traceable to the two large oil price rises, especially to the first (1973–74), though there are turning-point errors outside of these episodes which also appear to be widely shared by national and international forecasters. Although this may seem disappointing, it would be a mistake to conclude that international forecasts such as those of the World Economic Outlook are therefore redundant. Indeed, in most cases, the national forecasts are prepared on the basis of assumptions about each country’s international environment that typically originate from forecasting exercises like those of the Fund or the OECD. Projections prepared on an internationally consistent basis are also necessary as an input into Fund multilateral surveillance activities and into any attempt to coordinate economic policies among countries. Such projections are also required as a basis for monitoring the situation of the indebted developing countries.

The question remains whether the World Economic Outlook’s forecasting accuracy can be significantly improved. It would probably not be helpful to be overly ambitious in this respect. Nevertheless, the results of this study do suggest that there may be scope for improvement in several areas. In particular, it seems clear that a reduction in the magnitude and, more especially, the volatility of the world current account discrepancy could only enhance the quality of an internationally consistent exercise such as the World Economic Outlook. An early implementation of the recommendations contained in the recent report on the discrepancy may well be a necessary condition for significant improvements in the accuracy of the projections. Another conclusion that has emerged from this report is the sensitivity of forecast accuracy to lead time, which underlines the importance of being able promptly to take into account any new information that becomes available. This raises the question of whether the accuracy of the World Economic Outlook could be improved by a more widespread use of formal, model-based methods which would reduce processing time and would allow more frequent ad hoc updates of the forecasts. The ready availability of such methods would permit a given baseline projection based on the judgment of individual desk officers to be adjusted incrementally at short notice for changes in the main exogenous assumptions underlying the projections and would ease the task of providing scenario analyses. Whilst a move in this direction should not be expected to yield early dividends, it is probably also true that a more formal methodology, simply by being more explicit, more easily allows constructive post mortem analysis of forecast error which should help to improve forecast performance over time.

Forecasting Methods and Criteria for Evaluation


This is not the place to explain the construction of the World Economic Outlook forecasts in detail. (Goldstein (1986) may be consulted for such a description.) But it is essential to spell out some of the principal characteristics of the forecasting process, for these affect the post mortem techniques that can be used.

First, it is important to stress that the forecasts are conditional. They are prepared on certain assumptions about “exogenous” variables: fiscal and monetary policy, exchange rates, and oil prices are the leading variables in question. The basic assumption about policies is that “present policies” will be held unchanged during the forecast period, though “present policies” are interpreted to include any currently known announcements about future policy adaptations and may also “encompass certain policy adaptations or changes that seem likely to occur even though they have not been announced by the authorities.”2 Exchange rates are currently projected at the real (formerly, nominal) levels prevailing at a recent base date, whilst the oil price is also usually projected (in the absence of more specific indicators otherwise) as constant in real terms.

The reasons these variables are treated in this particular way are mixed. The treatment of policies follows the customary practice of national official forecasting and its many derivatives where a prime originating purpose of the forecasting exercise is to provide a consistency check on policy itself. A similar justification applies here too and the World Economic Outlook draws conclusions for desirable policy adjustment from its analysis of the future outlook. For market-based policy instruments such as interest rates, World Economic Outlook projections must also be inhibited by the knowledge that a Fund forecast might move the market in a way which could force the hand of a member government, which would be an embarrassing prospect.3 Somewhat similar considerations may affect the treatment of oil prices, but the practice of the World Economic Outlook here is like that of other forecasters and to this extent reflects a belief that predicting the timing and magnitude of changes in oil prices is a particularly hazardous undertaking.

The fact that World Economic Outlook forecasts are conditional on assumptions about policy and oil prices suggests that, subject to measurement problems, it is important to allow for the falsification of the conditional assumptions in reviewing the track record (see the last section of this paper). The position on exchange rates is rather different. The typical conditional projection in the World Economic Outlook of an unchanged pattern of exchange rates cannot be defended on the argument that exchange rates are a policy instrument, at least not one independent of fiscal and monetary policy, but rather because the undoubted power of the Fund to “move markets” would make the publication of exchange rate forecasts inappropriate.

Strictly speaking, since policy adjustment is not allowed to take the strain of supporting the pattern of exchange rates assumed and the exchange rate is not allowed to take the strain of supporting the set of policies assumed, the collection of conditional assumptions the World Economic Outlook is forced to make about these variables can only be squared with theoretical considerations by invoking “portfolio shifts” of just the right type and magnitude to sustain them. In principle, nothing is more likely than that this assumption of accommodating portfolio shifts will fail. But this cannot mean that it would be right to treat deviations of exchange rates from their “forecast” paths as a reason for forecast error elsewhere. First of all, the failure of the exchange rate assumption to materialize may reflect a failure of other parts of the forecast just as much as the other way around; second, pragmatically but most importantly, despite theoretical considerations, the power of structural models to predict the exchange rate is extremely low. In practice, it is not clear that the conditional exchange rate baseline projection—which is, after all, a form of random walk prediction—can be significantly bettered. Finally, also pragmatically, exchange rate effects take a considerable time to work through onto output (although less time onto prices); they could have little impact within the typical short-term forecast period. On the other hand, it seems fair to say that failures of the exchange rate assumption may have more rapid and noticeable effects on balance of payments forecast errors, and since these turn out to be the most problematic part of the track record, some attempt to relate them to exchange rate forecast errors seems worthwhile.

A second important characteristic of the World Economic Outlook forecasting exercise is that it is comparatively informal, much more so, say, than the leading forecasting models in the United Kingdom. (For a recent review of these the interested reader may consult Wallis et al. (1986).) While model-based exercises are conducted at various stages of the production of the forecast, there is no computer-based “world model” behind the forecast as a whole. This is not necessarily a drawback in itself, but the implication for post mortem analysis is that it is not possible to decompose an ex post forecast error into exogenous variable, judgmental, and model-based error in the way that would be appropriate, and feasible, for a model-based exercise (see Osborn and Teal (1979) for an original exercise of this type). Nevertheless, it should be possible, measurement problems permitting, to relate the forecast errors to exogenous variable errors, as discussed below.

A third important characteristic of the World Economic Outlook forecast procedure is that it has, at its heart, a consistency check not shared by national forecasters. As described in Goldstein (1986), original country-desk-based forecasts, prepared against environmental assumptions specified by the Fund’s Research Department, are aggregated to check for the consistency of their trade and balance of payments implications. Identified discrepancies are then removed by an iterative process in which the country desk forecasts are successively revised, until the check is satisfied. The opportunity, and indeed the need, to conduct this check obviously arises from the closed economy nature of world forecasting which contrasts with the open economy basis of national forecasting. It would be useful to identify a way of confirming the value of consistency checks. One approach might be to compare the ex post accuracy of the initial and final forecasts made in each round, but the records available do not allow this comparison to be made. An additional problem relates to the fact that there is a significant discrepancy in the world current account, which may reduce the value of the consistency check.


Given the selection of variables to be examined, the principal tools used for assessing the World Economic Outlook forecasts in the sections below comprise the following: inspection of forecast error summary statistics; investigation of systematic bias in the forecasts taken over a long period; comparison with alternative forecasts; and the investigation of the rationality of forecast error. These checks are supplemented by an identification of outstanding episodes in the track record and an attempt to explain these in more detail by recourse to narrative material. Some explanation of these tools is in order.

Summary statistics. The principal summary statistics deployed in examining the World Economic Outlook track record are the average absolute error of forecast, the root mean square error, and the Theil inequality statistic. Because a forecast error series may display both positive and negative errors, the simple mean may be a highly misleading indicator of accuracy, and for this reason the average absolute error is preferred. As a basis for comparing this statistic among series, the mean absolute value of the realized series itself is also presented. It is commonplace in economic analysis to prefer a measure which penalizes a large deviation more highly than a series of smaller ones of equivalent total size; for analytical tractability a quadratic measure is often used and for this reason the root mean square error (RMSE) is a preferred statistic in studies like the present one. This statistic too needs to be normalized in some fashion to facilitate comparison between series and the study makes use of such a normalization in the Theil inequality statistic, which can be generally defined as the ratio of the RMSE of the forecast under consideration to the RMSE of an alternative forecast. In the main text tables displayed below, this alternative is provided by the naive “no change” forecast, where the forecast for year t of variable x is the t − 1 value for x (where x may be the growth rate of real GDP or the rate of inflation).4 (Appendix IV also presents the Theil statistics produced by the alternative naive standard that the forecast of x for year t corresponds to the ten-year moving average of x.)
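The three summary statistics can be sketched in a few lines of Python. This is a minimal illustration rather than the study's own computation: the function names, the sign convention (error equals forecast minus realization), and the sample growth rates are all assumptions introduced here.

```python
import math

def forecast_errors(forecasts, realizations):
    # Assumed sign convention: a positive error means the forecast was too high.
    return [f - r for f, r in zip(forecasts, realizations)]

def average_absolute_error(forecasts, realizations):
    errors = forecast_errors(forecasts, realizations)
    return sum(abs(e) for e in errors) / len(errors)

def root_mean_square_error(forecasts, realizations):
    # Quadratic measure: one large miss is penalized more than several small ones.
    errors = forecast_errors(forecasts, realizations)
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def theil_statistic(forecasts, alternative_forecasts, realizations):
    # Ratio of the RMSE of the forecast to the RMSE of an alternative forecast.
    return (root_mean_square_error(forecasts, realizations)
            / root_mean_square_error(alternative_forecasts, realizations))

# Hypothetical growth rates in percent (illustrative only).
realized = [2.0, 3.0, 1.0, 4.0]    # x in years t = 1..4
no_change = [1.0, 2.0, 3.0, 1.0]   # x in year t - 1, the naive "no change" forecast
forecast = [2.5, 2.5, 2.0, 3.0]    # the forecast under evaluation

aae = average_absolute_error(forecast, realized)
rmse = root_mean_square_error(forecast, realized)
theil = theil_statistic(forecast, no_change, realized)
print(f"AAE={aae:.3f} RMSE={rmse:.3f} Theil={theil:.3f}")
```

A Theil statistic below 1 indicates that the forecast has a smaller RMSE than the naive alternative; a value above 1 indicates the reverse.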

Realization—forecast regressions. The efficiency of forecasting may be tested by performing the regression of the realizations on the forecasts themselves, as R(t) = a + b F(t) + u(t). A perfect forecast would identify the intercept in such a regression as zero, the slope as unity and yield a correlation coefficient of 1.00. Where knowledge of the realization-forecast relationship itself can reduce the forecast error variance, these conditions will not hold. It seems a natural interpretation, within the terms of this regression framework, to identify a failure of the two expectations about the intercept and slope terms with the presence of bias; but this inference is not necessarily correct.5 The essence of the matter is that the realization-prediction regression detects whether the pattern of forecast errors can be related to the level of the forecast, not whether the average error is significantly different from zero, which can be tested for directly by measuring the average error and asking whether it is significantly different from zero.6 In answering some of these questions it is useful to supplement the results that can be obtained for specific countries (areas or aggregates) by pooling the data. While this procedure permits the benefit of offsetting country error, it enhances the power of significance tests.
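The distinction drawn above, between the realization-forecast regression and a direct test of the mean error, can be made concrete with a small sketch. It assumes ordinary least squares with a single regressor, defines the error as realization minus forecast, and uses hypothetical data; the function names are illustrative, and the standard errors for the intercept and slope that a full test would need are omitted.

```python
import math

def realization_forecast_regression(forecasts, realizations):
    """OLS fit of R(t) = a + b F(t) + u(t); returns (a, b, correlation)."""
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_r = sum(realizations) / n
    sff = sum((f - mean_f) ** 2 for f in forecasts)
    srr = sum((r - mean_r) ** 2 for r in realizations)
    sfr = sum((f - mean_f) * (r - mean_r)
              for f, r in zip(forecasts, realizations))
    b = sfr / sff
    a = mean_r - b * mean_f
    correlation = sfr / math.sqrt(sff * srr)
    return a, b, correlation

def mean_error_t_statistic(forecasts, realizations):
    """t statistic for the hypothesis that the average error is zero."""
    errors = [r - f for f, r in zip(forecasts, realizations)]
    n = len(errors)
    mean_e = sum(errors) / n
    var_e = sum((e - mean_e) ** 2 for e in errors) / (n - 1)
    return mean_e / math.sqrt(var_e / n)

# Hypothetical forecasts and outturns (percent growth), illustrative only.
forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
realizations = [0.7, 1.4, 2.6, 3.3, 4.5]

a, b, corr = realization_forecast_regression(forecasts, realizations)
t_mean = mean_error_t_statistic(forecasts, realizations)
print(f"a={a:.3f} b={b:.3f} corr={corr:.3f} t(mean error)={t_mean:.2f}")
```

In this made-up sample the slope is close to unity, so the errors are only weakly related to the level of the forecast, yet the average error is clearly negative. The two checks answer different questions, which is the point made in the text.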

Comparisons. “Absolute” measures of forecast accuracy are useless in themselves; they need to be related, on the one hand, to the standards of accuracy required by the purpose for which they are sought and, on the other, to comparable measures generated by alternative forecasting techniques. In the latter category, the normalizations already noted compare the forecasts with those generated by two alternative prediction schemes. It would be possible also to generate univariate time series and multivariate (Bayesian vector autoregression) models as a further source of alternative forecasts; ex post facto it might well prove possible to generate a model in this class which would be superior to the World Economic Outlook forecasts, but the achievement would not be very interesting because the alternative model does not represent a feasible alternative forecasting technique. Even if models of this class, possessing superior forecasting qualities, could be built on a purely ex ante basis, their usefulness and plausibility would be in doubt if they did not enforce consistency and could not accommodate variation for policy or environmental change. Given these drawbacks, this type of alternative was not explored.

The alternative actual forecasts with which the World Economic Outlook forecasts are compared here are those produced by the OECD and, following Llewellyn and Arai (1984), by a set of national forecasters. With the OECD the comparison is with another international agency producing forecasts of a nearly comparable scope in country coverage, assumptions, and detail. However, a difficulty with both types of comparison is that it is not possible to align the forecast dates exactly and so differences in the information sets conditioning the forecasts inescapably contaminate the comparisons.

Explanations for Forecast Errors

Given the conditional nature of the forecasts, explained above, testing the rationality of the forecasts involves assessing the contributions of “innovations” (unexpected changes) in the exogenous variables. The principal difficulties in implementing this approach are measurement problems. While it is possible to derive a reasonably satisfactory series for the innovations in oil prices, it is less easy to do this for fiscal policy and appears not to be feasible for monetary policy. In the case of fiscal policy the problem is less conceptual than practical: series of fiscal policy anticipations and outturns exist, but are for one reason or another less than satisfactory. For monetary policy there is the substantial conceptual problem that indicators like the growth rate of the money supply reflect not only policy but the economy more generally. While measures of fiscal policy like the fiscal impulse or the structural budget balance attempt to normalize for the influence of the economy, no comparable measure exists for monetary policy. Hence, even though fiscal policy and oil prices are not the only driving variables in world economic forecasting, for practical reasons it is only the contribution of innovations in these variables to explanations of the forecast error that is assessed. In a fully “rational” forecast these variables should only appear in the form of current innovations; neither lagged innovations nor actual values should in principle explain current errors if the forecasters have fully taken on board the implications of previous changes and have a correct model of the significance of their own current anticipations of these variables for those they are forecasting.7
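The rationality check described above can be sketched as follows. This is only an illustration under stated assumptions: an innovation is taken to be the actual value of an exogenous variable minus the value assumed in the forecast, the forecast error is realization minus forecast, a single-regressor least squares slope stands in for whatever specification the study actually used, and all figures are invented.

```python
def innovations(actual, assumed):
    # Unexpected change: outturn of the exogenous variable minus the
    # value assumed when the forecast was made (assumed definition).
    return [a - s for a, s in zip(actual, assumed)]

def ols_slope(x, y):
    # Slope from a least squares regression of y on x with an intercept.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical oil price changes in percent: assumed versus actual.
assumed_oil = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
actual_oil = [10.0, 0.0, -5.0, 20.0, 0.0, -10.0]

# Hypothetical output growth forecast errors in percentage points.
output_errors = [-0.5, 0.0, 0.25, -1.0, 0.0, 0.5]

oil_innovations = innovations(actual_oil, assumed_oil)

# Current innovations may explain current errors in a rational forecast.
slope_current = ols_slope(oil_innovations[1:], output_errors[1:])

# Lagged innovations should not: the forecaster already knew about them.
slope_lagged = ols_slope(oil_innovations[:-1], output_errors[1:])

print(f"current: {slope_current:.4f}  lagged: {slope_lagged:.4f}")
```

A sizable coefficient on the lagged innovation would suggest that the implications of earlier surprises had not been fully absorbed into later forecasts. A proper test would need standard errors and a longer sample than this toy series.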

The role of systematic analysis of the complete time series of forecasts is not to avoid the challenge of historical analysis of forecast error so much as to provide a context for it and to avoid the trap of choosing specific explanations which fit the facts in any one episode but have no overall power to improve the forecasts in general. Moreover, there is a dimension of forecasting quality which lends itself best to graphical and narrative analysis and this is the question of turning-point error. An allegedly common failing of forecasts is their failure to spot the significant cyclical turning points.

Data Base

The data base used in this study comprises the forecasts in the published versions of the World Economic Outlook and similar data from the earlier comparable unpublished documents. The nature of this data base, in terms of the forecast horizons used and regularity of the forecast exercises conducted, is indicated in Table 1. This shows that while there has been some irregularity in forecast production dates, particularly up to 1982, there has nearly always been a forecast for the year in question produced in the second quarter. An earlier forecast for the year has generally been available in the fourth, and often as early as the third, quarter of the previous year. In the last two years, the first forward look has been taken even earlier, with forecasts for the following year appearing as early as April.8 Besides producing a main forecast, there have been many occasions when uncertainties about principal conditioning variables (such as oil prices and exchange rates) have been felt to be sufficiently acute as to warrant the production of variant “scenarios.”

Table 1. Forecast Horizon Content of World Economic Outlook


Some figures given for the first half of the following year.


The content of the World Economic Outlook is extraordinarily rich; forecasts are produced not simply for the principal variables of interest in the main countries, but in considerable detail both for these economies and for regional and analytical groupings embracing the entire world economy (with the exception of the U.S.S.R. and other countries of Eastern Europe that are not members of the Fund). In order to make progress in assessing the accuracy of the forecasts, it is necessary to make a number of decisions about which variables and forecasts to exclude.

The identification of forecast error plainly requires a definition of the outturn or realization with which the forecast can be compared. Because of the incidence of revisions of economic data, there is more than one possible series of realizations that might be chosen. Investigators generally take the view that the purpose of forecasts is to be right about the true evolution of the economy and that, at any point in time, that is most nearly revealed by the latest, revised-to-date series of data. This view, though clearly quite a persuasive one, is perhaps too superficial. The latest available set of data is not homogeneous in vintage: early data are many times revised, while latest data are perhaps still preliminary or partial estimates. Re-basing economic series may make it quite inappropriate to use the latest available data as a check on the forecast: the latter will have been formulated on data with a different base, and different properties, and it may not be feasible to reconstruct the data on a consistent base.9 Then again, policy (and short-term forecasting post mortems) will inevitably be based on early, not subsequently revised, data. For all these reasons there is room for choice about the realization series to be used and expedient criteria may legitimately affect the decision. In the present study (as detailed in the next section), three types of realization are deployed, which one is in play at any time being made clear in context. None of the more general conclusions arrived at appears to depend on the particular choice of realization series, though some conclusions are drawn from experiments involving the use of a specific series which were not, or could not be, replicated on the latest available set as was done for all the processing described in the next section.

Forecasting Accuracy

In this section we consider the accuracy of World Economic Outlook forecasts of principal variables over the whole available period, using a selection of the standard criteria discussed in the previous section. This discussion covers three topics: first, the variables selected for study; second, the forecasts for industrial countries; and third, the forecasts for developing countries.


As indicated in the previous section, World Economic Outlook forecasts embrace a large number of variables for several individually specified countries and aggregate groupings of various kinds. To be useful, a study must suppress secondary detail and select a primary set of variables. Recent discussion of the use of indicators in multilateral surveillance draws attention to the relationship between indicators and the transmission mechanism of economic policy. Thus, a conventional view of the latter directs attention to indicators of policy input (as, for example, the structural budget balance) at one end of the transmission mechanism and indicators of performance at the other (such as output growth, inflation, or the balance of payments); in between stand intermediate variables such as the exchange rate and perhaps interest rates. This study concentrates on the indicators of economic performance, measured by real GNP/GDP growth, GNP/GDP deflator or consumer price inflation, and the current account of the balance of payments. In addition, because the global context of World Economic Outlook forecasts lends trade a particular interest—World Economic Outlook trade forecasts often being cited by national forecasters—export and import volume growth and the development of the terms of trade are also investigated.

The country coverage of the projections examined also needs to be determined. Here, the institutional importance of the Group of Seven major industrial countries, their weight in world output and trade (in 1984–85, 56.9 and 53.5 percent, respectively), and the fact that the World Economic Outlook has consistently provided forecasts for the Group of Seven members individually, dictate that the forecast record for each of these countries and for the group as a whole should be examined. At the same time, aggregates for the industrial countries as a whole and for “Europe” as a group can also be easily and usefully examined. In addition to the industrial countries, the developing countries need also to be examined; none of these is as large in combined trade and output weight as the smallest of the Group of Seven countries, and World Economic Outlook forecasts have traditionally distinguished various groupings of the developing country bloc. The longest standing of such groupings and thus the most amenable to analysis over a reasonably long period of time are the regional groupings, where among the non-oil bloc, Africa, Asia, Europe, the Middle East, and the Western Hemisphere are separately distinguished.

Finally, a choice has to be made of the forecast horizon and of the vintage of realization or outturn data to be employed. Table 1 gives a brief summary of the projection content of successive World Economic Outlook iterations; the variable dates of these imply that, whatever selection is made, no set of forecasts is homogeneous in its timing relative to the forecast horizon. However, a distinction was drawn between two groups of broadly homogeneous forecasts: current-year forecasts, where the forecast for year t is made during year t itself, and year-ahead forecasts, where the forecast for year t is made in year t–1, the preceding year.10

In practice even this distinction proved an ideal rather than a rigorously enforceable rule, as the actual sourcing for the two categories of forecast shown in Table 2 illustrates. The current-year forecasts are considerably more homogeneous in timing, varying by only three months from April to July at the maximum, compared to a maximum variation of seven months from August to March (of the following year!) in the case of the year-ahead forecasts.11 The additional variability nevertheless seemed a price worth paying to obtain a reasonably long series. In the choice of outturn data, the main analysis deploys two categories. For the current-year forecasts, the outturn is identified with the “first available” estimate, the figure reported in the following year’s World Economic Outlook; in the case of the year-ahead forecasts, however, the outturn is identified with the “first settled” estimate, that available in the World Economic Outlook of the following-year-but-one (that is, the year-ahead forecast for 1980 is compared with the outturn data published in the forecast source in 1981). These choices of outturn data had certain specific advantages over the use of latest available estimates. First, the definitions of some of the aggregates were changed over the course of time, and the use of these outturn data enabled the resultant inconsistencies to be minimized or even eliminated in a way that would not have been so straightforward with latest available data. Second, the combination of “first settled” estimates as outturn data with the year-ahead forecasts allowed these to be compared with OECD and national forecasts prepared on a similar basis for the paper by Llewellyn and Arai (1984) and extended in the present study. Latest available data were nevertheless used in replication of all the principal computations of the main analysis; a summary of these results appears in Appendix II.

Table 2.

Classification of World Economic Outlook Forecasts

Note: Dates refer to those of World Economic Outlook documents, published where stated, otherwise unpublished. The publication lag is generally one to two months.

Summary Statistics: Industrial Countries

Tables 3 to 6 provide evidence of the track record of World Economic Outlook forecasting based on averaging over the whole period: 1971–86 for the current-year forecasts, 1973–85 for the year-ahead forecasts.

Table 3.

World Economic Outlook Forecast Accuracy: Industrial Countries’ Output Growth

(In percent)

Note: The definitions of current-year and year-ahead forecasts are discussed in the text and in Appendix I. Mean absolute actual value is defined as $\sum_i |R_i|/n$, where $R_i$ is the realization (“actual”) in year $i$ and $n$ is the number of years in the sample; mean absolute error is $\sum_i |F_i - R_i|/n$, where $F_i$ is the forecast for year $i$. RMSE is $\sqrt{\sum_i (F_i - R_i)^2/n}$, and Theil’s inequality statistic is $\mathrm{RMSE}(F)/\mathrm{RMSE}(F_a)$, where $F_a$ is a naive “no change” forecast. The regression data are for the regression of $R_i$ on $F_i$, and figures in parentheses are t-statistics: those for the intercept test for a difference from zero; those for the slope test for a difference from unity.
Table 4.

World Economic Outlook Forecast Accuracy: Industrial Countries’ Inflation

(In percent)

Note: For definitions etc., see Note to Table 3.
Table 5.

World Economic Outlook Forecast Accuracy: Industrial Countries’ Export Growth

(In percent)

Note: For definitions etc., see Note to Table 3.
Table 6.

World Economic Outlook Forecast Accuracy: Industrial Countries’ Import Growth

(In percent)

Note: For definitions etc., see Note to Table 3.

Subject to a finding (see below) of some bias when the data are pooled, the track record for output growth forecasts, in the first table, is by a small margin the best of the three. The current-year forecasts show comparatively low average absolute errors compared to the mean absolute value of the series, while the Theil statistics indicate that the root mean square error of the forecasts is only some 20–40 percent, in typical cases, of the error that would be incurred by a “naive” forecaster. The realization-forecast regressions provide no indication of inefficiency.12 As might be expected, the current-year forecasts are superior by these criteria to the year-ahead forecasts, where the RMSEs and Theil statistics are higher, the fit of the realization-forecast regression poorer, and average absolute errors in relation to actual mean absolute values higher than they are in the current-year forecasts. Even so, these results appear fairly satisfactory: the Theil statistics are all well below unity, and the average absolute errors are well below the mean absolute value of the output growth series itself.
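The summary statistics referred to here can be illustrated with a short computation. The Python sketch below follows the definitions given in the Note to Table 3. The function name `accuracy_stats`, the sample figures, and the construction of the naive benchmark (the previous year's outturn carried forward unchanged) are illustrative assumptions and are not taken from the study.

```python
import math


def accuracy_stats(forecasts, realizations, naive_forecasts):
    """Summary accuracy statistics for one forecast series (hypothetical helper).

    naive_forecasts holds the "no change" benchmark for each year; how that
    benchmark is built is an assumption made for this illustration.
    """
    n = len(realizations)
    mean_abs_actual = sum(abs(r) for r in realizations) / n
    mean_abs_error = sum(abs(f - r) for f, r in zip(forecasts, realizations)) / n
    rmse = math.sqrt(sum((f - r) ** 2 for f, r in zip(forecasts, realizations)) / n)
    rmse_naive = math.sqrt(
        sum((p - r) ** 2 for p, r in zip(naive_forecasts, realizations)) / n
    )
    return {
        "mean_abs_actual": mean_abs_actual,
        "mean_abs_error": mean_abs_error,
        "rmse": rmse,
        # Theil statistic: forecast RMSE relative to the naive forecaster's RMSE.
        "theil": rmse / rmse_naive,
    }


# Hypothetical growth rates in percent; not World Economic Outlook data.
realizations = [3.0, 1.0, 4.0, 2.0]
forecasts = [2.5, 1.5, 3.5, 2.5]
# Assumed naive benchmark: each year repeats the previous year's outturn.
naive_forecasts = [2.0, 3.0, 1.0, 4.0]

stats = accuracy_stats(forecasts, realizations, naive_forecasts)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A Theil statistic below unity means the forecast has a smaller root mean square error than the naive benchmark; a value above unity means the naive forecaster would have done better.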

The track record for inflation (Table 4) is marginally less satisfactory than that for output, though still overall highly acceptable. The superiority of the current-year forecasts again stands out. These forecasts display, with the single exception of Germany, smaller average absolute errors, lower RMSEs, and lower Theil statistics than the year-ahead forecasts. The current-year forecasts provide no evidence of inefficiency, yielding a good fit in the realization-forecast regressions. The year-ahead forecasts provide a poorer fit in these regressions, and for Italy indicate inefficiency; for this same country, moreover, the Theil statistic exceeds unity. Elsewhere, however, the general run of evidence is favorable, even if the performance is not so good as in the nearer-term horizon of the current-year forecast or the comparable output forecasts.

Turning to the evidence on export and import volume forecasts, Tables 5 and 6, the track record now suggests little difference between the current-year and year-ahead forecasts (though the current-year statistics for imports are better than those for exports). In terms of overall quality, both appear equally good, with low Theil statistics suggesting generally that these forecasts provide a distinct improvement on the naive standard. There is no evidence of inefficiency and the overall fits of the realization-forecast regressions are on the whole not unreasonable except for the export growth statistics for Italy.
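The efficiency check used throughout these tables regresses realizations on forecasts and tests the intercept against zero and the slope against unity. The following is a minimal sketch of that calculation, assuming ordinary least squares on a single series; the function name `realization_forecast_regression` and the data are hypothetical, and the study's own estimation details may differ.

```python
import math


def realization_forecast_regression(forecasts, realizations):
    """OLS of realizations R on forecasts F: R = a + b*F + e (illustrative)."""
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_r = sum(realizations) / n
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    sxy = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecasts, realizations))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_f

    residuals = [r - (intercept + slope * f) for f, r in zip(forecasts, realizations)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((r - mean_r) ** 2 for r in realizations)
    s2 = sse / (n - 2)  # residual variance with two estimated parameters

    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_f ** 2 / sxx))
    return {
        "intercept": intercept,
        "slope": slope,
        "t_intercept": intercept / se_intercept,  # tested against zero
        "t_slope": (slope - 1.0) / se_slope,      # tested against unity
        "r_squared": 1.0 - sse / sst,
    }


# Hypothetical forecast and outturn pairs, in percent.
forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
realizations = [1.2, 1.9, 3.1, 3.9, 5.1]

result = realization_forecast_regression(forecasts, realizations)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Small t-statistics on both parameters are consistent with efficiency, in the sense that the forecast errors show no systematic relationship with the forecasts themselves; the R-squared corresponds to what the text calls the fit or explanatory power of the regression.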

The record for balance of payments forecasts in Table 7 is considerably less reassuring than for output and inflation. The Theil statistics, especially for the year-ahead forecasts, are notably high, showing that the forecasts are little better than a naive projection, while the average absolute errors are high in relation to the absolute mean values, and in two of the year-ahead forecasts (for France and the Group of Seven) are actually somewhat higher. The realization-forecast regression suggests inefficiency in two cases, both for the year-ahead (Group of Seven and Total Industrial) and for the current-year (Canada and Japan) forecasts, whilst the overall explanatory power of the forecast is rated very low, at least in the year-ahead forecasts. There is, it is important to note, a considerable improvement in forecast accuracy when the forecast horizon is reduced—though the current-year forecasts are still markedly inferior to forecasts of a corresponding term for output or inflation.

Table 7.

World Economic Outlook Forecast Accuracy: Industrial Countries’ Balance of Payments on Current Account

(In billions of dollars)

Note: For definitions etc., see Note to Table 3.

Includes official transfers.

Excludes official transfers.

Table 8.

World Economic Outlook Forecast Accuracy: Terms of Trade and Trade Volumes

(In percentage changes)

Note: For definitions etc., see Note to Table 3.

The relative weakness of the balance of payments forecasts revealed by these summary statistics is not unexpected and is in line with experience at a national level and with the OECD forecasting track record (see the next section). The problem is evidently related to the sizable fluctuations that have been observed in the world current account discrepancy, particularly since the late 1970s. As the reasons for this discrepancy are imperfectly understood, World Economic Outlook projections have to be based on an implicit assumption of relative stability in the projected path of the discrepancy.13 It should also be noted that the current account is the difference between two large flows, each of which has a volume as well as a price component. Relatively small forecast errors in any of the underlying volume or price changes can thus induce relatively large errors in the absolute difference between the nominal flows. Finally, as discussed in Appendix VI, exchange rate innovations may at times have contributed to the errors in current account projections.
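The point that small errors in two large flows can produce a large error in their difference is easily shown with arithmetic. The sketch below uses invented figures and a hypothetical helper, `balance_error`; it is an illustration of the mechanism only, not a reconstruction of any World Economic Outlook projection.

```python
def balance_error(exports, imports, export_error_pct, import_error_pct):
    """Return the actual balance, the forecast balance, and the forecast
    error expressed as a percentage of the actual balance (illustrative)."""
    forecast_exports = exports * (1 + export_error_pct / 100)
    forecast_imports = imports * (1 + import_error_pct / 100)
    actual = exports - imports
    forecast = forecast_exports - forecast_imports
    error_pct = 100 * (forecast - actual) / abs(actual)
    return actual, forecast, error_pct


# Hypothetical flows in billions of dollars: exports overpredicted by
# 1 percent, imports underpredicted by 1 percent.
actual, forecast, error_pct = balance_error(500.0, 490.0, 1.0, -1.0)
print(f"actual balance:   {actual:.1f}")
print(f"forecast balance: {forecast:.1f}")
print(f"error as percent of actual balance: {error_pct:.0f}")
```

In this hypothetical case, errors of only 1 percent in each flow move the projected balance from 10 to about 19.9, an error of roughly the same size as the balance itself.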

Table 8 shows the forecasts for world trade and industrial countries’ terms of trade. (The statistics on industrial countries’ export and import volume forecasts are repeated in the table for convenience.) The overall record for trade conveyed in these figures is one in which the shortening of the forecast horizon contributes greatly to accuracy, with a marked decline in the error statistics (the RMSE and average absolute error more than halve between the year-ahead and current-year forecasts) and a sharp rise in the explanatory power of the forecasts. Turning to the terms of trade, there is again a marked improvement in quality as the forecast horizon is reduced, yet in both cases there is strong evidence of inefficiency, with the forecasts underestimating whenever the terms of trade improve significantly. Inspection of the time series of the errors shows that, while there are large positive forecast errors associated with the two rounds of large oil price increases, these are more than offset by persistent negative forecast errors in the subsequent periods.

Table 9.

World Economic Outlook Forecast Accuracy: Non-Oil Developing Countries’ Output Growth

(In percent)

Note: For definitions etc., see Note to Table 3.

Current-year data for Europe cover the period 1978–86.

Year-ahead data for Europe cover the period 1980–85.

The direct tests for bias in the forecast errors, in the sense of significant differences from zero, are reported in full in Appendix III where, in addition to the processing of individual country data, tests are also conducted on the pooled set of errors. The results suggest a degree of output optimism in World Economic Outlook forecasts. Although individual country output forecast errors are not significant, they are predominantly of the same sign, so that upon pooling, a significant amount of bias is suggested—on the order of 0.3 percent in the current-year forecasts and somewhat higher at about 0.5 percent in the year-ahead forecasts (if the 1974 error is excluded from these). The output optimism appears to have been most pronounced in the second half of the 1970s, undoubtedly reflecting the fact that the deceleration in growth in many countries was only gradually perceived as a break in trend, rather than as a cyclical downturn. On the other hand there appears to be no bias in the inflation forecasts, at least not if the 1974 year-ahead error is excluded. Interestingly, the inflation and output errors are significantly negatively correlated in the pooled data set, lending support to the contention (see Kenen and Schwartz (1986)) that the implicit World Economic Outlook forecasts for nominal income are more robust than those for either real output growth or inflation.
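The way pooling can reveal bias that is not significant country by country can be sketched with a simple t-test on the mean forecast error. The example below is an assumption-laden illustration: the error series are invented, the sign convention (forecast minus realization, so that positive values indicate optimism) is chosen for the example, and the test treats errors as independent across countries and years, which the tests reported in Appendix III need not do.

```python
import math
import statistics


def bias_t_stat(errors):
    """Mean forecast error and its t-statistic against zero (illustrative)."""
    n = len(errors)
    mean_error = statistics.mean(errors)
    std_error = statistics.stdev(errors) / math.sqrt(n)
    return mean_error, mean_error / std_error


# Hypothetical errors in percentage points (forecast minus realization).
errors_by_country = {
    "Country A": [1.2, -0.4, 0.9, -0.3, 0.8, 0.2],
    "Country B": [0.5, 1.1, -0.6, 0.7, -0.2, 0.9],
    "Country C": [-0.5, 1.0, 0.6, -0.1, 1.3, 0.1],
}

# Two-sided 5 percent critical values of the t distribution.
T_CRIT_5_DF = 2.571   # 6 observations per country
T_CRIT_17_DF = 2.110  # 18 pooled observations

individual = {name: bias_t_stat(errs) for name, errs in errors_by_country.items()}
pooled_errors = [e for errs in errors_by_country.values() for e in errs]
pooled_mean, pooled_t = bias_t_stat(pooled_errors)

for name, (mean_error, t_stat) in individual.items():
    print(f"{name}: mean error {mean_error:.2f}, t = {t_stat:.2f}")
print(f"Pooled: mean error {pooled_mean:.2f}, t = {pooled_t:.2f}")
```

Each hypothetical country shows a positive mean error that falls short of significance on its own, while the pooled sample, with three times as many observations and the same mean, exceeds the critical value.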

Summary Statistics: Developing Countries

Statistics of the forecasting record for developing countries are presented in Tables 9–13. These plainly show a much poorer track record than that for the industrial countries.

Table 10.

World Economic Outlook Forecast Accuracy: Non-Oil Developing Countries’ Inflation

(In percent)

Note: For definitions etc., see Note to Table 3. Inflation is measured by consumer price indices.

For Europe, year-ahead data cover 1980–85.

Table 11.

World Economic Outlook Forecast Accuracy: Non-Oil Developing Countries’ Export Growth

(In percent)

Note: For definitions etc., see Note to Table 3. Regional details for year-ahead data are not available.

Current-year data for Total Non-Oil Developing Countries cover 1972–86.

Table 12.

World Economic Outlook Forecast Accuracy: Non-Oil Developing Countries’ Import Growth

(In percent)

Note: For definitions etc., see Note to Table 3. Regional detail for year-ahead data not available.

Current-year data for Total Non-Oil Developing Countries cover 1972–86, while current-year data for Europe cover 1980–86.

Table 13.

World Economic Outlook Forecast Accuracy: Non-Oil Developing Countries’ Balance of Payments on Current Account

(In billions of dollars)

Note: For definitions etc., see Note to Table 3.

Current-year data for Non-Oil Developing Countries cover 1972–86, while current-year data for Europe cover 1980–86.

Year-ahead data for Europe cover the period 1981–85.

In the output growth forecasts, for example, a majority of the Theil statistics for the year-ahead forecasts exceed unity, while half of those for the current-year forecasts do so. Apparently, a naive prediction of no change in output growth would have been a better forecast in these instances than the actual World Economic Outlook forecasts.14 The fit of the realization-forecast regressions is also problematic, violation of efficiency being particularly strongly indicated for the Asia group in the year-ahead forecasts. Nevertheless, average absolute errors appear reasonably low in relation to the mean absolute value of the outturn series and there appears to be an improvement with the reduction in forecast horizon length upon moving from the year-ahead to the current-year forecast sets. Much the same statements can be made of the record in relation to forecasts of inflation. These are considerably poorer than the corresponding forecasts for the industrial countries, with some notably high Theil statistics in the year-ahead forecasts (where all but one exceed unity), generally low overall explanatory power in the realization-forecast regression, and evidence of inefficiency in several cases.

For export and import volume growth, regional detail is available only for the current-year forecasts; here the evidence is somewhat more reassuring. In the statistics on the export volume forecasts, only one of the Theil statistics exceeds unity (for the Middle East grouping, for which the average absolute forecast error itself exceeds the mean absolute value of the outturn series), although for import growth, Theil statistics exceed unity in three out of the seven cases. The overall explanatory power of the forecasts in the realization-forecast regression is generally low although indications of inefficiency are confined to European import growth forecasts.

The balance of payments forecasts, finally, provide some indication of weakness; the year-ahead forecasts produce three instances of Theil statistics above unity, with evidence of bias in the Asia grouping and generally low explanatory power for the forecasts in the realization-forecast regression. The current-year forecasts are somewhat better. The average absolute errors are generally lower in relation to the absolute mean of the outturns, the Theil statistics (that for Europe excepted) are lower, and the overall explanatory power of the forecasts higher, although with more evidence of departure from the efficiency requirements on the parameters of the regression.

The results of directly testing for bias, both on individual area results and the pooled data as a whole, are again reported in Appendix III. For the developing countries too, these tests suggest a tendency toward output optimism, at least in the year-ahead sample. Some individual area bias in inflation estimates also appears, though this is not significant when the data are pooled.

One reason for weakness in developing country forecasting is the extent to which such forecasts must rely upon projections of commodity prices, themselves known to be associated with large margins of uncertainty. Unfortunately, changes of definition and non-continuities in reporting such forecasts in the World Economic Outlook documents make it impossible to examine more than a small run of years of commodity price projections. Table 14 reports some summary statistics on forecasts made for four individual groups of commodities and the aggregate of interest here, non-oil developing countries’ exports. The variability of these prices is notably high and it is not too surprising that the average absolute errors—ranging from 5.7 to 8.5 percent—are also rather big. Even so, the forecasts do at least compare well with the naive standard and only a proportion appear to infringe the efficiency criteria in the realization-forecast regression.

Table 14.

Primary Product Price Forecast Accuracy

(In percent)

Note: For definitions etc., see Note to Table 3.

The summary statistics reviewed, based on the overall record, support a number of general conclusions. First, industrial country forecasting appears to be much better than that for developing countries. This is not surprising: developed countries are better understood, and their data streams are less thin and more reliable. It should also be borne in mind that the quality of the data analyzed in these tables is poorer for the developing countries owing to frequent changes in definitions and coverage.

Second, among the industrial country forecasts, the balance of payments forecasts appear considerably worse than those for output, inflation, and export or import volumes. This should not be cause for great surprise (though it may be cause for concern): the balance of payments is the difference between two series; small changes in these can induce large changes in the difference, which as a data series may be volatile both in behavior and in its revisions. Moreover, the emergence of a relatively large and volatile discrepancy in the world’s aggregate current account—which, in principle, should be in balance—casts some doubt on the quality of balance of payments data even for the industrial countries. Difficulty in forecasting the balance of payments is a common complaint among national forecasters.15

A third conclusion that can be drawn for the industrial countries is that the current-year forecasts are superior to the year-ahead forecasts. While it may not seem surprising (it may even appear obvious) that near-term forecasting is more accurate than longer-term forecasting, such results are not invariably recorded (see, for example, Burns (1986) for a contrary instance).

Fourth, the record appears comparatively free from inefficiency in the sense that country-by-country and area-by-area the parameters of the realization-forecast regression conform by and large to the requirements of efficiency. However, when the data are pooled, direct tests for the significance of forecast bias produce some evidence of an output optimism error (more pronounced for the year-ahead than for the current-year forecasts).

Finally, it should be noted that the relative inferiority of the balance of payments forecasts and of the year-ahead forecasts for the industrial countries does not carry over to the developing countries. For this group, output forecasts are the weakest, with forecasts for export growth and the current-year balance of payments somewhat better. By and large, these results are similar to those arrived at by Kenen and Schwartz (1986) in their study of the Fund’s forecasting, and they were confirmed by additional replicating calculations using the latest available estimates of outturns (Appendix II).


It is natural to enquire how well forecasting in the Fund compares with other forecasts. Here we consider two alternatives, the OECD (see Table 15) and national forecasting agencies. In these comparisons we are able to follow the lead set by Llewellyn and Arai (1984), who have already compared OECD and national forecasting records.

Table 15.

Published Basis for OECD-World Economic Outlook Comparison


Comparisons with OECD

There have been a number of analyses of the OECD’s track record besides that of Llewellyn and Arai (such as Smyth (1983), Smyth and Ash (1975), and Holden, Peel, and Sandhu (1987)). It would be useful if, in choosing the OECD as a comparison, one were also choosing a forecast that is, at least in more recent years, less “judgmental” and more model-based than that underlying the World Economic Outlook forecasts, for this would give added point to the comparison. However, it is not clear how far such a contrast is realistic.16

The forecasts compared here are those for output growth, inflation, and the balance of payments on the current account of the Group of Seven countries individually and in aggregate. In Llewellyn and Arai (1984) attention was focused on the OECD’s forecasts a year ahead, using issues of the OECD Economic Outlook for December of year t–1 for forecasts for year t, the realizations coming from the OECD Economic Outlook for December of the following year (t+1). An immediate problem is that World Economic Outlook forecasts are not always based on the same information as that conditioning the OECD forecasts, the historically less regular World Economic Outlook round drawing on some forecasts made as early as August of the previous year and as late as March in the year in question. In order to achieve as close a match as possible, since 1981 year-ahead forecasts in July issues of the OECD Economic Outlook have been compared with the World Economic Outlook’s August forecasts. But this is only a partial solution to the problem, and the remaining differences in timing, whilst apparently not severe, are unfortunate.17 The fact that calendar time discrepancies between forecast dates are small gives only uncertain assurance that discrepancies between the corresponding information sets are, in the relevant sense, also small. The comparison covers the period from 1973 to 1985.

Table 16.

OECD Forecast Accuracy—Summary Statistics: Group of Seven

Note: For definitions etc., see Note to Table 3.