How Informative Are Real Time Output Gap Estimates in Europe?
International Monetary Fund

Abstract

We study the properties of the IMF-WEO estimates of real-time output gaps for countries in the euro area as well as the determinants of their revisions over 1994-2017. The analysis shows that staff typically saw economies as operating below their potential. In real time, output gaps tend to have large and negative averages that are largely revised away in later vintages. Most of the mis-measurement in real time can be explained by the difficulty in predicting recessions and by overestimation of the economy’s potential capacity. We also find, in line with earlier literature, that real-time output gaps are not useful for predicting inflation. In addition, in countries where slack (and potential growth) is overestimated to a larger extent, primary fiscal balances tend to be lower, and public debt ratios are higher and increase faster than projected. Previous research suggests that national authorities’ real-time output gaps suffer from a similar bias. To the extent these estimates play a role in calibrating fiscal policy, over-optimism about long-term growth could contribute to excessive deficits and debt buildup.

I. Introduction

How informative are IMF-World Economic Outlook (IMF-WEO) output gap estimates for countries in the euro area? This is an important question since the output gap—the deviation of output from its potential—is often considered a key input to stabilization policy. The output gap (a summary measure of slack) typically informs central banks and fiscal authorities about the appropriateness of their policy stance, with larger slack calling for more stimulus. And yet, the output gap is notoriously hard to measure since potential output—commonly defined as the maximal level of output attainable without stoking inflationary pressures (Okun, 1962)—is unobservable.

Given its importance for policymaking, vast efforts have been put into estimating the output gap, with methodologies ranging from simple univariate and purely data-driven statistical filters to complex, micro-founded structural models. While some methodologies may perform better along specific dimensions, they all face limitations when it comes to estimating output gaps in real time. Real-time estimates are the relevant measures—since policy decisions about macroeconomic stabilization cannot wait—but they are always subject to considerable uncertainty and are revised ex post once more information becomes available.

This paper provides an analysis of the IMF’s real-time output gap estimates for countries in the euro area. We observe that real-time output gap estimates drawn from the WEO database tend to have large and persistently negative means. For several large euro area countries the average real-time output gap is close to -2 percent of potential output over the period 1994–2017. The negative real-time gaps are even larger before the Great Financial Crisis (GFC) and tend to be systematically “revised away” in later WEO vintages, pointing to persistent negative bias in real time. Such patterns are qualitatively in line with experiences of real-time estimates by other institutions as well as with earlier research on revisions of real-time output gap estimates for the euro area, the U.S. and OECD countries.2 Earlier research at the IMF suggests that the country authorities’ estimates of real time output gaps suffer from similar bias (see Eyraud and others, 2018). To the extent that fiscal authorities rely on output gap estimates to assess the cyclical position of the economy, a large and persistent negative real-time bias could lead to systematic over-optimism about future output and fiscal revenues and thus to a larger-than-planned accumulation of public debt. Large and persistent negative real-time bias could also lead to unwarranted monetary stimulus. All in all, our paper cautions against an excessive focus on output gap estimates in real time when calibrating counter-cyclical policy.

But how do large and persistent negative real-time output gap estimates come about? To shed light on this question, we use the IMF multivariate filter (MVF, developed by Benes and others, 2010) to construct time series of artificial real-time output gap estimates based on all information available to staff in real time (data as well as forecasts). These artificial output gap estimates show what staff estimates would have been in real time if they had used the MVF in the past. This natural benchmark allows us to isolate the respective roles of data revisions, forecast uncertainty, and judgment in explaining the negative bias in real-time output gap estimates.

  • Data revisions. Following Orphanides and others (2000, 2005) we compare output gap estimates based on data available in real time against “quasi-real-time” estimates that are based on the final data vintage truncated at the year at which the gap is estimated. We corroborate the long-established result that data revisions play only a minor role in explaining large and persistent negative real time output gap estimates.

  • Forecast errors. WEO growth forecasts tend to be too optimistic on average because the staff baseline forecast is a modal forecast that typically fails to predict recessions and other tail events, a result established in earlier literature (see Box 3 for details). This has important implications for the estimation of potential output since overly optimistic forecasts tend to pull up the estimates of potential output. We estimate that, based on our MVF benchmark, forecast errors lowered the real-time output gap estimates by about 0.7 percentage points on average in the euro area.

  • Judgment. When estimating output gaps, economists typically incorporate specialized knowledge that is not directly captured by the benchmark modelling framework. To analyze the role of judgment, we compare actual WEO real-time output gap estimates to MVF estimates based on WEO data and forecasts that were actually available in real time. Because we rely on a particular benchmark (MVF), what we call “judgment” may reflect staff’s reliance on a different methodology in the past as well as any additional off-model information that could inform the assessment of potential output. The results reveal that judgment typically increased potential output by almost 1.0 percentage point of potential GDP in our sample, accounting for just above one-half of the overall tendency towards negative output gap estimates in real time.

GDP Growth Forecast Errors Across WEO Vintages

WEO forecasts for the euro area tend to overpredict real GDP growth. The main reason is that baseline forecasts typically do not incorporate the possibility of severe recessions. They are “modal” forecasts, which display the most likely outcome, not “average” forecasts based on distributional assumptions. Tail events have no quantitative bearing on the baseline forecast. Therefore, when they occur, recessions give rise to large forecast errors.
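The difference between a modal and an average forecast can be made concrete with a small numerical sketch. The two-outcome distribution below is purely hypothetical (it is not taken from WEO data), and the function name `mode_and_mean` is introduced here only for illustration. It shows that when recessions are rare but deep, a forecaster who always reports the most likely outcome will over-predict growth on average.

```python
def mode_and_mean(outcomes):
    """Return (modal outcome, probability-weighted mean) for (growth, probability) pairs."""
    total = sum(p for _, p in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    mode = max(outcomes, key=lambda gp: gp[1])[0]   # most likely outcome
    mean = sum(g * p for g, p in outcomes)          # expected outcome
    return mode, mean


# Hypothetical distribution: 2 percent growth in normal times, a rare deep recession.
outcomes = [(2.0, 0.85), (-3.0, 0.15)]
modal_forecast, mean_outcome = mode_and_mean(outcomes)

# Expected forecast error (outcome minus forecast) when the modal forecast is reported.
expected_error = mean_outcome - modal_forecast
print(modal_forecast, round(mean_outcome, 2), round(expected_error, 2))
```

Under these assumed numbers the expected error is negative, that is, growth is over-predicted on average even though the forecast is correct in most individual years.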

Building on the IMF IEO report (2014), we analyze GDP growth forecast errors for the 12 euro area countries over the period 1994-2017. We confirm the results established in the literature (e.g., Timmermann (2006), De Resende (2014)) that forecasts for nearly all countries and over all forecast horizons (apart from the real-time nowcast) exhibit an optimistic bias. However, the bias is statistically significant in fewer than half of the countries in the sample. If the crisis period is excluded from the sample, the size, incidence, and statistical significance of the optimistic bias are reduced substantially.

The optimistic bias is stronger in medium-term growth forecasts, where forecast errors are larger and more pronounced, and more countries exhibit statistically significant optimistic bias compared to short-term forecasts. The size of bias also varies with the time-period and countries studied, and mostly stems from a failure to predict recessions. It is, however, noteworthy that the real-time “nowcasts” tend to exhibit a small negative bias, indicating that in real time GDP growth has generally been under-projected.

As GDP growth forecasts are an important input in the estimates of potential output, ex ante optimistic bias in forecasts translates into an overestimation of potential output in real time. This optimistic bias could be further exacerbated by an informal institutional rule that requires the output gap to close by the end of the five-year forecast horizon (Timmermann, 2006; De Resende, 2014). For example, starting from a negative output gap, actual output has to be forecast to grow rapidly to catch up with potential so that the gap closes in the medium term. As a result, growth forecast errors can be significant contributors to output gap revisions.

Drawing from the IMF-WEO database, we show that systematic upward revisions to real-time output gaps are positively correlated with both public debt levels and public debt WEO forecast errors in the main countries of the euro area. Similarly, there is a robust, and negative, empirical association between primary fiscal balances and revisions to the WEO output gap estimates, controlling for a variety of other relevant variables.

We also establish that real time output gap estimates are generally not robust predictors of inflation, even though the Phillips curve relationship is clearly established using final estimates of the output gap (see also Abdih and others, 2018). Unlike real time output gap estimates, real time WEO inflation forecasts are not significantly revised and therefore are informative for monetary policy purposes. Real-time output gaps at best contain information on directional changes in business cycles and inflation, while final or real time inflation is the best predictor of future inflation.

The rest of the paper is structured as follows. Section 2 reviews the most common methods to estimate the output gap and their key properties. Section 3 presents key stylized facts showing the persistent and often large negative bias in WEO real time output gap estimates for the countries in the euro area, while section 4 decomposes the output gap bias, attributing most of it to judgment and forecast errors. Section 5 looks at the role of real time output gap in predicting inflation in real time and related implications for monetary policy. Section 6 analyses the correlation between the upward revisions in real-time output gap estimates and debt buildup. Section 7 concludes and maps out possible avenues to minimize the real-time output gap bias and associated policy errors.

II. Estimation Methods and Desirable Properties

A. Estimation Methods

Different approaches to measuring the output gap largely reflect variations in the concept of potential output. From a purely statistical perspective, potential output can be seen as the trend component of actual output. In economic terms, potential output may be characterized by the level of utilization of the production factors consistent with stable inflation. The concept of “sustainable” output (IMF, 2015), on the other hand, refers to a level of GDP that the economy can sustainably produce over the medium term without stoking imbalances. Reflecting these variations, the most commonly used methods to estimate potential output and the output gap range from purely statistical univariate filters to fully structural Bayesian estimation of large Dynamic Stochastic General Equilibrium (DSGE) models. All these methods have conceptual or practical strengths and weaknesses (see Blagrave and others, 2015; Álvarez and Gómez-Loscos, 2018).

Univariate filters rely on purely statistical methods to extract a trend component from GDP time series. The most popular univariate filters are the Hodrick and Prescott (HP) filter (Hodrick and Prescott, 1997) and the band pass filters by Baxter and King (1999) and Christiano and Fitzgerald (2003). The filtered output is a smoothed GDP series that is interpreted as potential GDP. The appeal of univariate filters is in their simplicity and transparency; they do not require assumptions about the structure of the economy and instead rely on exogenous parameters to determine the degree of smoothness of a time varying trend. However, the filtered “trend” may not be consistent with the above-mentioned economic interpretation of potential output and may be of little relevance for policy purposes. Univariate filters also suffer from the well-known end-point problem that typically results in large revisions of recent years’ estimates of potential output (Blagrave and others, 2015).
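The end-point problem can be illustrated with a minimal pure-Python sketch of the HP filter. The series `log_gdp` is hypothetical, the smoothing parameter of 100 is a common convention for annual data rather than a value taken from this paper, and the helper names are introduced here for illustration only. The sketch compares the gap estimated for each year using data up to that year with the gap obtained from the full sample.

```python
def hp_trend(y, lam=100.0):
    """HP trend: solve (I + lam * K'K) tau = y, with K the second-difference operator."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Dense system matrix; adequate for short annual series.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        cols, w = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for ci, wi in zip(cols, w):
            for cj, wj in zip(cols, w):
                a[ci][cj] += lam * wi * wj
    # Gaussian elimination (the matrix is symmetric positive definite).
    b = [float(v) for v in y]
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            if f != 0.0:
                for j in range(k, n):
                    a[i][j] -= f * a[k][j]
                b[i] -= f * b[k]
    tau = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * tau[j] for j in range(i + 1, n))
        tau[i] = s / a[i][i]
    return tau


def real_time_gaps(y, lam=100.0, first=5):
    """Gap for each year t, estimated with data up to and including t only."""
    out = []
    for t in range(first, len(y) + 1):
        sample = y[:t]
        out.append(sample[-1] - hp_trend(sample, lam)[-1])
    return out


# Hypothetical 100 * log(GDP): steady growth, a boom, then a recession.
log_gdp = [0.0, 2.0, 4.1, 6.3, 8.6, 11.0, 13.5, 15.8, 13.0, 13.5,
           14.5, 15.8, 17.2, 18.7, 20.2]
trend_full = hp_trend(log_gdp)
final = [y - t for y, t in zip(log_gdp, trend_full)][4:]
rt = real_time_gaps(log_gdp)
revisions = [f - r for f, r in zip(final, rt)]
print([round(v, 2) for v in revisions])
```

By construction the last revision is zero, because the real-time and full-sample estimates for the final year use the same data. The earlier entries show how much the end-of-sample estimate moves once later observations become available.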

The production function estimation is a supply-side approach that decomposes output into its structural and cyclical components. The production factors include labor, capital, and total factor productivity (TFP), which is calculated as a residual. Potential output is derived by individually filtering labor supply and TFP. While this approach allows potential growth to be decomposed into its input contributions, thus providing an economic interpretation of potential, the estimates can suffer from the same end-point problem as univariate filtering. Moreover, as is the case with univariate filters, the production function technique ignores important information on the labor market and inflation that is central to an economic definition of slack (measured as the gap between output and potential output).

Multivariate filters are typically built around economic relationships such as Okun’s law and the Phillips curve, and may incorporate financial or other variables. The multivariate filter (MVF) presented in Benes and others (2010) and Blagrave and others (2015) embodies basic principles of economic theory, in that it features a Phillips curve, which relates inflation to the output gap, as well as an Okun’s law relationship linking the output gap to the unemployment gap. Potential output and the NAIRU are estimated simultaneously, in line with Okun’s (1962) concept of the output gap, and are therefore readily interpretable for the calibration of countercyclical policies.
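The two relationships at the core of such a filter can be written in a stylized textbook form. The equations below are an illustrative sketch only: the symbols ($Y_t$, $\bar{Y}_t$, $\pi^{e}_t$, $U_t$, $\bar{U}_t$, $\beta$, $\kappa$) are notation assumed here, and the published MVF specification may differ in its lag structure and treatment of expectations.

```latex
\begin{aligned}
y_t &= Y_t - \bar{Y}_t
  && \text{(output gap: log output minus log potential)} \\
\pi_t &= \pi^{e}_t + \beta\, y_t + \varepsilon^{\pi}_t, \quad \beta > 0
  && \text{(Phillips curve)} \\
U_t - \bar{U}_t &= -\kappa\, y_t + \varepsilon^{u}_t, \quad \kappa > 0
  && \text{(Okun's law, with } \bar{U}_t \text{ the NAIRU)}
\end{aligned}
```

In this form a positive output gap is associated with inflation above expectations and unemployment below the NAIRU, which is what allows inflation and unemployment data to help identify the unobserved gap.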

Multivariate filters are very flexible and can in principle incorporate variables that are only loosely related to the concept of business cycle in the conventional sense. In this vein, Borio and others (2014) augment the multivariate filter with real credit and real house price growth, variables directly related to the housing cycle, in an attempt to capture macroeconomic stability more broadly. The resulting trend is then closer to a concept of sustainable level of output (attainable without stoking imbalances) than to the traditional, business cycle concept of potential output. Results (reported in IMF, 2015) show that at times of credit and housing booms, the sustainable level of output tends to be below potential output, leading to higher than conventional output gaps and suggesting that estimated potential output may not be sustained. IMF (2015) also cautions that real-time identification of sustainable output is very difficult, and in practice the concept of sustainable output is best used as a “fire alarm” to complement the conventional output gap estimates.

Structural model-based approaches include estimated DSGE models that allow for joint estimation of structural shocks and potential output (see Vetlov and others, 2011). The output gap is typically closed in the steady state and consistent with stable inflation. As potential output is derived under flexible wages and prices, output can deviate from potential due to nominal rigidities. While theoretically the soundest approach, the results can be highly sensitive to specific modelling structure and parametrization.

B. Properties of Output Gap Estimates

As potential output and output gap are unobservable, their reliability and usefulness for policymaking can typically be assessed against the following statistical properties.

Stability (Size of Ex-Post Revisions)

Forecasts of observable variables are usually assessed ex-post by their accuracy, i.e. by their distance to actual realizations. Since output gaps are unobservable, another criterion needs to be introduced. We substitute stability, which we interpret as the average size and variation of ex-post revisions of real-time output gaps, for accuracy. The literature documents that real-time output gap estimates tend to be too negative and are substantially revised upwards over time. The ex-post revisions to real-time output gap estimates for the euro area, the U.S. and OECD countries are often at least as large as the estimates themselves and are characterized by a high degree of persistence (see Box 2). Stability is thus a key property for output gap estimates. The more stable the output gap, the smaller the policy error associated with its misestimation. For example, in the EU, output gap estimates are used in structural fiscal balance rules within the Stability and Growth Pact (SGP). Since past fiscal policy actions cannot be “revised” in hindsight, unstable and downward-biased output gap estimates may result in involuntary buildups of debt and/or costly adjustments.

Real-Time Output Gap Assessments in International Organizations

The negative bias in real-time output gap estimates by international organizations for the euro area countries has been widely documented.

ECB (2005, 2011) and Marcellino and Musso (2011) find that real-time output gap estimates for the euro area (EC 1999-2004, OECD 2002-2010, IMF 1999-2010) are characterized by a high degree of instability, with large and predominantly positive revisions. They conclude that the information content of real-time output gap estimates tends to be low and suggest that the real-time assessment of the degree of slack in the economy should be based on a wide set of indicators. Rünstler (2002) shows that real time estimates of output gap in the euro area could be improved to a considerable extent if the cyclical co-movement between the output gap (by the ECB over 1970:Q1-2000:Q4), factor inputs and capacity utilization is exploited.

Kempkes (2014) analyzes real-time output gaps for EU 15 countries over the 1996-2011 period, as estimated by the EU, the IMF and the OECD. He similarly finds a negative bias in real-time estimates (i) irrespective of the source of the data; (ii) in all real-time vintages; and (iii) across the entire cross-section of countries; the bias is estimated at 0.5 percentage points of potential GDP per year on average. He notes that a systematic downward bias in real-time output gap estimates implies that structural balances tend to be over-estimated in real time and suggests that fiscal rules should incorporate ex-post checks of the unbiasedness of the cyclical components.

Hernández de Cos and others (2016), based on the EC estimates for the EU 15 over 2004-14, find that the direction of revisions to the real-time estimates (defined as an estimate of the output gap in year t made in Spring of year t-1) depends on the state of the economy: upward in expansions and downward in recessions. Based on this finding, they argue that this asymmetry should be taken into account in the computation of structural balances. Ademmer and others (2019) similarly find that, for the EU 28 over the 2004-17 period, the real-time output gap estimates are revised upwards in boom periods and downwards in recessions.

Using the OECD estimates, Tosetto (2008) and Turner and others (2016) find the revisions of real-time estimates to be large and persistent for 15 OECD countries over 2003–08 and for the G7 over 2007–09. Turner and others (2016) show that additional cyclical adjustments using manufacturing capacity utilization, the share of investment in GDP, as well as house prices and credit help to improve the reliability of OECD output gap estimates. Edge and Rudd (2016) and Champagne and others (2018) find that the revisions of output gap estimates have become smaller in more recent samples, based on data for the U.S. and Canada, respectively.

Efficiency

Output gap estimates are efficient if they utilize all information available at the time of estimation—in line with the concept of forecast efficiency (Nordhaus, 1987). If an estimate is efficient, future revisions should not be predictable and past revisions should not help explain subsequent revisions, i.e. revisions should not be serially correlated. This property also contributes to stability as efficient estimates should undergo smaller ex-post revisions.
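One simple way to check for serial correlation is the first-order autocorrelation of a country's revisions ordered by year. The sketch below is illustrative and is not necessarily the test used in this paper; the function name `autocorr1` and both sample series are hypothetical.

```python
def autocorr1(revisions):
    """First-order sample autocorrelation of a sequence of revisions."""
    n = len(revisions)
    if n < 2:
        raise ValueError("need at least two revisions")
    mean = sum(revisions) / n
    den = sum((r - mean) ** 2 for r in revisions)
    if den == 0.0:
        raise ValueError("revisions are constant")
    num = sum((revisions[t] - mean) * (revisions[t - 1] - mean) for t in range(1, n))
    return num / den


# Hypothetical revisions (percent of potential output), ordered by year.
persistent = [1.5, 1.25, 1.0, 1.0, 0.5, 0.25, -0.25, -0.5]
alternating = [1.0, -1.0, 1.0, -1.0]

print(round(autocorr1(persistent), 2), autocorr1(alternating))
```

A coefficient well above zero, as in the persistent series, suggests that last year's revision helps predict this year's, which is the pattern an efficient estimate should not display.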

Asymptotic Zero-Mean

All filter-based methods, e.g. the HP filter or the MVF, imply that the estimated output gap has zero mean asymptotically. This reflects the notion that gaps are temporary and should sum to zero, consistent with an economy growing along a balanced growth path in the long term. Similarly, ex-post revisions should be distributed around a mean of zero. One-sided revisions reflect a systematic bias in the real-time output gap estimates. If the bias is negative, for example (one-sided positive revisions), structural fiscal deficits may be underestimated in real time, potentially leading to higher than desired debt buildup, which would increase the risk of procyclical adjustment in the future. Several studies document systematic negative biases in real-time output gap estimates, with some noting potential negative consequences in terms of fiscal sustainability (see Section III.C).

The zero mean output gap property is a general condition traditionally associated with the steady-state of the economy (e.g. in the Phillips curve) and is consistent with traditional business cycle analysis in which cycles are broadly symmetric. It is also embedded in output gap estimates that feed into standard policy reaction functions, highlighting that the long run zero mean is a desirable property in practice as it serves as an anchor in particular for fiscal policies.3 A long-run zero mean is also reflected in the empirical assessments of output gaps by international institutions (see Section III.B). It is only under a zero mean that the average structural fiscal balance (i.e. the fiscal balance adjusted for the output gap) is in line with the average fiscal position over the cycle: if the cycle is symmetric, the structural balance gives the average fiscal position over the cycle, and fiscal rules can be anchored around the structural balance.
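A short arithmetic sketch shows how a biased real-time gap feeds into the structural balance. It assumes the common linear adjustment, structural balance = headline balance minus a budgetary semi-elasticity times the output gap. The semi-elasticity of 0.5 and all figures below are illustrative assumptions, not estimates from this paper.

```python
def structural_balance(headline_balance, output_gap, semi_elasticity=0.5):
    """Cyclically adjusted balance in percent of GDP (linear adjustment, assumed form)."""
    return headline_balance - semi_elasticity * output_gap


# Hypothetical country-year: headline deficit of 3 percent of GDP.
headline = -3.0
gap_real_time = -2.0   # real-time estimate: sizeable slack
gap_revised = 0.0      # later vintage: slack revised away

sb_real_time = structural_balance(headline, gap_real_time)
sb_revised = structural_balance(headline, gap_revised)
overstatement = sb_real_time - sb_revised

print(sb_real_time, sb_revised, overstatement)
```

With these assumed numbers, the structural position looks one percentage point of GDP stronger in real time than it does after the gap is revised, which is how a negative real-time bias can accommodate larger deficits under a structural balance rule.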

In contrast, if output gap estimates are on average negative, fiscal rules based on structural balances would result in excessive deficits. Indeed, separate strands of business cycle literature have recently questioned the symmetry of the business cycle, pointing out that there may be good reasons why countries may occasionally spend more time below potential. This could occur, for example, in case of downward nominal wage rigidities leading to asymmetric business cycle responses that have to be assessed in conjunction with potential effects of hysteresis (see Box 1). Also, stabilization policies can suffer from asymmetric limitations—e.g. ZLB on monetary policy (that going forward can be magnified by low neutral real interest rates, see Rachel and Summers, 2019) or constraints on fiscal policies imposed by high debt—which makes managing overheating easier than providing stimulus during recessions.

Nominal Wage Rigidities, Hysteresis, and Output Gap Estimates

Some DSGE models that feature asymmetric nominal wage rigidities can produce negative long-run output gaps. The output gap in those models is derived as the difference between output under nominal rigidity and output under flexible wages and prices. In the framework of Benigno and Ricci (2011), since downward wage rigidity dominates the forward-looking reaction of wage setters, the output gap is always nonpositive in the long run (it is zero without uncertainty).

Aiyar and Voigts (2019) argue that downward nominal wage rigidities will lead to a negative average output gap by inducing negative demand shocks to have larger impact on unemployment and output than positive shocks. Therefore, in their view conventional methods to estimate potential output such as HP or MVF exhibit an intrinsic upward bias in output gap estimates that is especially large in deep demand-driven recessions. Bound by the asymptotic zero mean property, instead of allowing for asymmetry in shock adjustment and thus in output gap estimates, these filters spuriously lower the estimate of potential.

Coibion and others (2017) find that real-time potential GDP estimates across several institutions tend to be sensitive to demand shocks and under-respond to supply shocks. Following Blanchard and Quah (BQ, 1989) they estimate a bivariate VAR with an identifying restriction that only supply shocks can have permanent effects on output. Deriving the real-time potential GDP as a historical contribution of supply shocks to growth they find that in the aftermath of the GFC the US potential output tended to be under-estimated and consequently the output gaps should currently be larger and more negative than estimated by most institutions.

It is debated whether demand shocks only have transitory effects on GDP. Blanchard and others (2015) find that 83 percent of recessions associated with supply shocks result in sustained decline in output. They also find that recessions triggered by demand shocks are frequently followed by lower output or even lower output growth and can thus have permanent effects. Of all the recessions associated with intentional disinflation—the purest cases of demand shocks—almost two-thirds are associated with lower long-term output. In earlier work, Blanchard and Summers (1986) and Ball (1999) questioned the assumption of long-run neutrality of money and argued that monetary policy and other demand factors can have permanent effects.

These findings suggest important hysteresis effects, implying that potential output can decline more at times of recessions and thus output gaps can be higher (or less negative). Alichi and others (2019) apply the MVF with labor market hysteresis to US data, finding significantly higher estimates of the NAIRU, lower estimates of potential and substantially higher (less negative) output gaps throughout the GFC as well as the 1980s compared to model simulations without hysteresis. The univariate regression approach of Hamilton (2018) can capture persistence and centers cycles around a zero mean.

A separate line of research has shown how demand shocks can have long-run effects on aggregate output by inducing shifts in a long-run aggregate supply curve. Bashar (2011) shows that in all G-7 countries aggregate demand shocks positively affect aggregate supply shocks (through correlation of demand shocks with labor productivity ruled out in standard BQ identification) causing permanent effects on the output level.

Economic Consistency

As real-time output gap estimates are intended to accurately signal inflationary pressures in the economy, they should be correlated with inflation and consistent with other indicators of slack such as unemployment, capacity utilization or labor market tightness. These additional indicators of slack can be incorporated directly into the MVF or used separately to assess real-time measures of the output gap.

III. Output Gap Estimates Through WEO Vintages

A. Definitions

In this paper we define the real-time output gap as the output gap estimated for year t in the Fall WEO of the same year. For example, the real-time output gap estimate for country i, $y_{i,t|t}$, for the year 2000 is taken from the 2000 Fall WEO vintage, the real-time gap for the year 2001 is taken from the 2001 Fall WEO vintage, and so forth. Therefore, real-time output gap estimates are defined at a time when about half of the actual data for any particular year is available. Using this definition, we compile an unbalanced panel for euro area countries starting in 1994, when systematic reporting of the output gap estimates became available in the WEO. Our data extends to the 2017 Fall WEO vintage, which we refer to as providing the “final” WEO output gap estimates.

Our primary question of interest is the information content of output gap estimates when policy decisions are actually made. Following Orphanides and van Norden (2002) we therefore work with contemporaneous real time gaps, estimated at time t conditional on information available at that time as these are the relevant measures for fiscal and monetary policy decisions inevitably taken in real time.4 This also allows us to isolate different sources of real-time output gap bias, including the effects of data revisions and forecast errors.

To evaluate the quality of macroeconomic forecasts, analysts typically rely on the statistical properties of forecast errors. Since output gaps are unobservable and subject to continuous revisions in subsequent years, we examine both output gap estimates (levels) and revisions of output gap estimates (changes) against a wide set of properties, as outlined in Section II. Revisions are defined as deviations of the real-time output gap from final (2017 WEO) estimates, or from WEO estimates up to five years later ($y_{i,t|t+5}$). We interpret systematic asymmetry in revisions of real-time estimates (changes) as “bias”. Since final estimates broadly satisfy the zero-mean property, the revision bias translates into a level bias for real-time estimates.5 We further investigate whether real-time estimates are consistent with other indicators of slack such as inflation and estimates of unemployment gaps. Results are presented for all euro area countries as well as for a baseline sample of 11 countries for which full data for 1994–2017 is available. Finally, we compare the statistical properties of the WEO output gap estimates for the euro area with estimates for other advanced economies, and with estimates by other international organizations (see Box 2).
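The definitions above can be sketched as a small lookup over WEO vintages. Everything in the example is hypothetical: the country code "AAA", the gap values, and the dictionary layout keyed by (country, year, vintage) are assumptions made for illustration. The sign convention, later estimate minus real-time estimate, is also an assumption, chosen so that an upward revision is positive.

```python
# gaps[(country, year, vintage)] = output gap estimate, percent of potential output.
gaps = {
    ("AAA", 2000, 2000): -2.0, ("AAA", 2000, 2005): -0.5, ("AAA", 2000, 2017): 0.5,
    ("AAA", 2001, 2001): -1.5, ("AAA", 2001, 2006): -0.5, ("AAA", 2001, 2017): 0.0,
    ("AAA", 2002, 2002): -1.0, ("AAA", 2002, 2007): -1.5, ("AAA", 2002, 2017): -1.25,
}
FINAL_VINTAGE = 2017


def revision(gaps, country, year, vintage):
    """Later-vintage estimate minus the real-time estimate (vintage == year)."""
    return gaps[(country, year, vintage)] - gaps[(country, year, year)]


years = [2000, 2001, 2002]
final_rev = [revision(gaps, "AAA", y, FINAL_VINTAGE) for y in years]
five_year_rev = [revision(gaps, "AAA", y, y + 5) for y in years]

mean_final_rev = sum(final_rev) / len(final_rev)
share_downward = sum(1 for r in final_rev if r < 0) / len(final_rev)

print(final_rev, five_year_rev, mean_final_rev, round(share_downward, 2))
```

A positive mean revision combined with a small share of downward revisions is the kind of asymmetry that the text interprets as bias.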

B. Properties of WEO Real-Time Output Gap Estimates

Large and Asymmetric Ex-Post Revisions

The revisions of real-time output gap estimates tend to be large and mostly upwards. Figure 1 presents the WEO real-time output gap estimates for 1994–2017 as well as the “final” ($y_{i,t|2017}$) and “5-year later” ($y_{i,t|t+5}$) revised WEO estimates.6 Real-time output gaps tend to be revised substantially over time. For the euro area, real-time estimates are revised upwards by more than 1 percent of potential output on average, with substantial variation across countries and time periods. Throughout the WEO history, downward revisions (compared to the final 2017 estimates) constitute only about one-sixth of all observations and are attributable to very few countries. In other words, real-time estimates do not perform well against either the stability or the zero-mean property.

Figure 1.

Real-Time Output Gaps Through WEO Vintages

Citation: IMF Working Papers 2019, 200; 10.5089/9781513512549.001.A001

Sources: IMF WEO and staff calculations.

The real-time estimates are revised gradually over time. Table 1 presents mean and median estimates for the short- and medium-term revisions, as well as the number of countries that have one-sided (upward) revisions on average, statistically significant bias, and serially correlated revisions (see also Appendix Figures 1 and 2). While short-term revisions tend to vary with the cycle (upward in expansions and downward in recessions) for some countries, the 5-year later revisions as well as the revisions in the 2017 vintage are predominantly upward throughout the sample period. Short-term revisions (one or two years after) are small, are likely driven by data and short-term forecast revisions, and do not suffer from serial correlation. Already after two years, however, revisions start to exhibit serial correlation that increases in subsequent years, implying that output gap estimates have not been efficient: past revisions could be used to improve future output gap estimates.
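The kind of statistics summarized in Table 1 can be sketched as follows. This is a minimal illustration, not the paper's procedure: the function name `revision_stats` and the numbers are hypothetical, the t-statistic assumes independent observations (which serially correlated revisions would violate), and the revision is taken as the later vintage minus the real-time estimate, so that a positive value is an upward revision.

```python
import numpy as np

def revision_stats(real_time, later):
    """Summarise revisions, defined here as later vintage minus real-time estimate."""
    real_time = np.asarray(real_time, dtype=float)
    later = np.asarray(later, dtype=float)
    rev = later - real_time                      # positive = upward revision
    mean = rev.mean()
    # t-statistic for H0: mean revision is zero (simple i.i.d. standard error,
    # an assumption that serial correlation in revisions would violate)
    t_stat = mean / (rev.std(ddof=1) / np.sqrt(rev.size))
    # first-order autocorrelation of revisions
    dev = rev - mean
    rho1 = (dev[1:] * dev[:-1]).sum() / (dev ** 2).sum()
    return {
        "mean": float(mean),
        "median": float(np.median(rev)),
        "share_upward": float((rev > 0).mean()),
        "t_stat": float(t_stat),
        "rho1": float(rho1),
    }

# Made-up numbers for illustration only (percent of potential GDP).
real_time = [-2.0, -1.5, -1.0, -2.5, -3.0, -1.0]
final = [-0.5, -0.5, 0.5, -1.0, -2.0, -1.2]
stats = revision_stats(real_time, final)
```

A mean well above zero with a large share of upward revisions would correspond to the one-sided pattern described above, and a clearly positive `rho1` to the inefficiency implied by serially correlated revisions.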

Appendix Figure 1.

WEO Real Time Output Gap Estimates and Subsequent Revisions


Sources: IMF WEO and staff calculations.
Appendix Figure 2.

WEO Real Time, 2-Year Ahead and 2-Year Back Estimates


Sources: IMF WEO and staff calculations.
Table 1.

Descriptive Statistics of Revisions to WEO Real-Time Output Gaps (EA11)

Sources: IMF WEO and staff calculations.

Negative Bias in Real-Time Estimates

Systematic upward revisions manifest themselves in large and persistently negative output gap estimates in real time. Indeed, over the whole WEO history, France and Italy have not recorded a single positive output gap estimate in real time, Germany has recorded four, Spain five and the euro area only one. Table 2 compares the means of output gaps estimated in real time and in the final (2017) WEO vintage, both for our baseline sample of 11 countries for which longer time series are available and for all euro area countries, and tests whether the mean output gaps are statistically different from zero. Only in four countries (Luxembourg, Ireland, Malta, and Cyprus) are we unable to reject the hypothesis that the mean is zero. In all other countries the output gaps are statistically significantly negative on average (p-value<0.1), and for France, Italy, Finland, Portugal and Spain they are close to, or exceed, -2 percent of potential GDP. Such patterns indicate that a negative bias has historically been a persistent feature of real-time output gap estimates.

Table 2.

Real-Time Output Gaps: Mean Estimates and Significance (1994-2017)

Source: IMF WEO and staff calculations. Note: Latest vintage estimates are provided for common sample with real-time estimates (second column) as well as for full sample (third column). The 11 baseline countries are shown in the upper part of the table. Real-time output gap estimates for the euro area start from 2010, for individual countries based on data availability. For Slovakia, the 2018 Spring WEO is used as the latest vintage for consistency in revisions.

The negative average output gap estimates are largely revised away over time. By the final (2017) WEO vintage, for most countries the output gaps, averaged over a comparable 1994–2017 period, are substantially more positive and not statistically different from zero (notwithstanding the GFC and irregular business cycle length). We conclude that the real-time estimates are biased, as reflected in predominantly upward revisions, indicating that staff’s assessment of potential tends to be more optimistic in real time than in hindsight. Still, France and Italy as well as the weighted euro area retain large and statistically significant negative gap estimates also in the final WEO.7

For newer euro area countries (e.g. Baltic states, Slovenia, Slovakia, Malta) real-time output gap estimates are available only for a shorter period, much of which spans the GFC and which does not cover a full business cycle. The sample period for these countries is thus not sufficiently long and representative to assess the asymptotic zero-mean property. Still, in the latest WEO, which spans fuller cycles and a longer period, their output gap estimates are not statistically significantly different from zero (although the standard errors tend to be large, reflecting more volatile business cycles).

Similarly, the sample period 1994–2017 spans from the beginning of a recession to the end of a recession, thus covering incomplete cycles and contributing to marginally negative (although statistically not different from zero) averages in the 2017 vintage. The mean estimates over full lengths of business cycles are closer to zero (less negative) also for baseline countries. Over a longer history and full business cycles, the WEO, OECD and EC final vintage weighted output gap estimates for the euro area on aggregate are smaller, averaging -0.3, -0.3, and 0 percent of potential GDP, respectively (text chart).8 This is in line with the asymptotic zero mean property with broadly similar variation across countries evident for all three institutions.


Final Output Gap Estimates for the Euro Area

(Percent of potential GDP; weighted average)


Sources: European Commission; IMF WEO and OECD. Note: unbalanced weighted averages for euro area countries based on data availability. Starting point reflects full cycles.

Revisions to WEO Real-Time Output Gaps


Sources: IMF WEO and staff calculations. Note: average WEO revisions for a baseline sample of 11 countries.

A question that often arises is whether the negative bias in real-time estimates is uniquely driven by the global financial crisis (GFC). The answer is no. Real-time output gaps were persistently negative and revised upwards throughout the 1994–2004 period preceding the pre-GFC boom-bust cycle. As the text chart illustrates, the negative bias in real-time output gaps is not merely an artifact of the crisis years. The bias is, however, larger in the immediate pre-crisis period (2005–2008), reflecting the difficulty in forecasting recessions and turning points in business cycles in real time. Following the GFC, growth in the pre-crisis years was widely assessed to have been fundamentally unsustainable (see Section VI).

Economic Consistency Property

The bias in the real-time output gap estimates can also be seen from other indicators of slack.

Okun’s Law

The time-tested Okun’s law stipulates a consistent relationship between the output gap and unemployment rate deviations from its long-run trend. A positive output gap should correspond to the unemployment rate below its long-term trend and vice versa. However, real-time WEO output gap estimates are largely nonpositive even at times of below-trend real-time unemployment when one would have expected actual output to exceed potential.

We look at two measures of trend: the long-run average unemployment rate and the NAIRU. In the first case, France, Italy, Finland and Portugal still exhibited real-time WEO output gaps of around -2 percent of potential GDP when unemployment was below its long-term average—prima facie, this points to an inconsistency in the gap estimate. Using the NAIRU, which accounts for possible downward revisions of structural unemployment, such an inconsistency is less visible, indicating that the bias mostly reflects overly high estimates of potential output. Nevertheless, at times when the unemployment rate is below NAIRU, real-time output gap estimates are rarely positive (see Appendix Table 1).
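The consistency check described above amounts to averaging the real-time output gap over the years in which unemployment is below its trend. A minimal sketch follows; the function name and the numbers are hypothetical, and the trend can be passed either as a single long-run average or as a year-by-year NAIRU series.

```python
import numpy as np

def mean_gap_when_unemployment_low(output_gap, unemployment, trend):
    """Mean output gap over years with unemployment below its trend.

    `trend` may be a scalar (long-run average) or a series (e.g. a NAIRU path).
    Returns None when unemployment is never below trend.
    """
    gap = np.asarray(output_gap, dtype=float)
    u = np.asarray(unemployment, dtype=float)
    trend = np.broadcast_to(np.asarray(trend, dtype=float), u.shape)
    low = u < trend                      # years of below-trend unemployment
    if not low.any():
        return None
    return float(gap[low].mean())

# Made-up numbers for illustration only.
gap = [-2.0, -1.0, 0.5, -2.5]            # percent of potential GDP
u = [8.0, 9.5, 7.5, 10.0]                # unemployment rate, percent
result = mean_gap_when_unemployment_low(gap, u, np.mean(u))
```

Under Okun’s law one would expect this conditional mean to be positive; a clearly negative value is the kind of inconsistency reported in Appendix Table 1.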

Appendix Table 1.

Real-Time Output Gaps: Mean Estimates and Significance at Times of Low Real-Time Unemployment (1994-2017)

Sources: IMF WEO and staff estimates.

Phillips Curve

The second building block in understanding business cycles is the short-run trade-off between inflation and a measure of slack such as the output gap. At times of negative output gaps, inflation should fall below its expected or targeted level. However, we find a disconnect between the levels of the real-time output gap and inflation. Despite persistently large negative real-time output gaps, headline inflation has been hovering around or just below 2 percent in many countries (Figure 2), which is also where inflation expectations in the euro area have been anchored (see Abdih and others, 2018).

Figure 2.

Inflation and Real-Time Output Gap


Sources: IMF WEO and staff estimates.

This disconnect is mirrored in revisions to real time inflation and real time output gaps. In our baseline sample of 11 larger euro area countries for the years 1994–2012, revisions to the Fall WEO real time output gaps have been large and systematically upwards (1.3 percent of potential GDP unweighted average), while revisions to real-time inflation have been marginal (-0.04 percentage points unweighted average) and, on average, downwards.9 Therefore, if the final vintage versions of output gap estimates are useful for forecasting inflation, which we show in section V, then the real time estimates are unlikely to be, or at best can be informative in terms of the direction and not the level of inflation.

C. Real-Time Estimates for Other Advanced Economies

Similar patterns can be seen in other advanced countries as well as in real-time output gap estimates by other organizations. Table 3 presents WEO mean output gap estimates for the 4 largest non-euro area countries: US, UK, Canada and Japan. For all these countries the WEO real-time output gap estimates have been significantly negative, whereas—and with the exception of Japan—the mean estimates in the 2017 Fall WEO center around zero. From 1994 to 2017 Japan has not recorded a single positive output gap estimate in real time, Canada has recorded two and the UK four. Box 2 documents similar findings of negative real-time output gap estimates and upward revisions at other international organizations.

Table 3.

Real-Time WEO Output Gaps in Major Non-Euro Area Economies: Mean Estimates and Significance (1994–2017)

Sources: IMF WEO and staff estimates.

IV. Explaining the Negative Bias in Real-Time Output Gaps

A. Decomposing the Real-Time Output Gap Bias

This section seeks to explain what drives the large negative means of real-time output gap estimates and their sizable positive revisions. In their seminal work, Orphanides and van Norden (2002) were mostly concerned with the impact of data and model or parameter uncertainty on potential output estimates in real time. They found that model or parameter uncertainty in the form of unreliable end-of-sample estimates resulted in considerable mismeasurement of output gaps in real time. Building on their work, we isolate the impact of three measurable sources of bias in real-time output gap estimates: (i) data revisions, (ii) systematic bias in 5-year ahead forecasts, typically used to extend the data sample for computing real-time output gaps, and (iii) judgment, i.e. deviations from the MVF estimates (capturing model or parameter uncertainty, “expert” knowledge etc.). To do so we rely on the IMF’s multivariate filter (MVF, see section II.A, Benes and others, 2010 and Blagrave and others, 2015 for details) to obtain three counterfactual series of real-time output gap estimates. This allows us to test each of the competing explanations in isolation by using the same methodology to process data in real time. Our sample is limited by the availability of WEO real-time vintage data (GDP, inflation and unemployment). It spans the period 1990 to 2017 and covers 11 countries for which a full set of real-time data is available: Austria, Belgium, France, Finland, Germany, Greece, Italy, Ireland, the Netherlands, Portugal and Spain.10

Using the MVF is appealing since it has desirable stability properties, asymptotic mean-zero and a straightforward economic interpretation that also satisfies the economic consistency properties. Nevertheless, any counterfactual exercise of the sort requires benchmarks and to a certain extent is model-dependent. For example, the impact of data revisions is conditional on using the MVF and does not necessarily need to coincide with its impact when a different estimation method is used. This caveat is especially relevant for the analysis of “judgment” that is calculated by comparing counterfactual simulations against the actual WEO real-time output gap estimates and captures both discretionary expert-judgment and methodological differences between the MVF and other filtering methods used at that time.11

B. Data Revisions

The impact of data revisions on output gap estimates is relatively small. Following Orphanides and van Norden (2002), we analyze it by comparing quasi-real-time estimates of the output gap with real-time estimates. While the real-time estimate for a given year T uses the (vintage) WEO data from year T, the quasi-real-time estimate for T is based on the final data (from the fall 2017 WEO) truncated at year T. Differences between the two estimates are caused by data revisions, as data revisions are only embodied in the final data.12 The first column of Table 4 in sub-section E shows the sample averages of the impact of data revisions on output gap estimates for the 11 baseline countries. The last two rows average across countries, showing that data revisions caused real-time estimates to be generally below final estimates, but on average by less than 0.2 percentage point of potential GDP. The text charts show the examples of Germany and France (see Appendix Figure 3 for other countries).
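The comparison of real-time and quasi-real-time estimates can be sketched as below. Two simplifications are assumptions of this sketch and not the paper's method: a univariate Hodrick-Prescott filter stands in for the MVF, and the sample ends at T with no forecast extension. Function names and numbers are hypothetical, and both series are assumed to start in the same year.

```python
import numpy as np

def hp_trend(y, lam=100.0):
    """Hodrick-Prescott trend (a stand-in here for the MVF, which is not reproduced)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    d = np.diff(np.eye(n), n=2, axis=0)          # (n-2) x n second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * d.T @ d, y)

def end_of_sample_gap(log_gdp, lam=100.0):
    """Output gap in percent at the last observation of the sample."""
    log_gdp = np.asarray(log_gdp, dtype=float)
    return 100.0 * (log_gdp[-1] - hp_trend(log_gdp, lam)[-1])

def data_revision_impact(vintage_log_gdp, final_log_gdp, lam=100.0):
    """Quasi-real-time minus real-time gap for year T, the last year of the vintage."""
    t = len(vintage_log_gdp)
    real_time = end_of_sample_gap(vintage_log_gdp, lam)         # vintage data up to T
    quasi_real_time = end_of_sample_gap(final_log_gdp[:t], lam)  # final data truncated at T
    return quasi_real_time - real_time

# Made-up GDP levels for illustration only.
vintage = np.log([100.0, 102.0, 104.0, 106.0, 107.0])
final = np.log([100.0, 102.0, 104.5, 107.0, 109.0, 111.0])
impact = data_revision_impact(vintage, final)
```

Because both estimates use the same filter and the same end year, any difference between them comes from the data alone, which is the logic of the first column of Table 4.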


Real-Time and Quasi Real-Time Estimates


Sources: IMF WEO and staff estimates.
Appendix Figure 3.

Real-Time and Quasi Real-Time MVF Estimates


Sources: IMF WEO and staff calculations.
Table 4.

MVF Estimates of Real Time Output Gap Biases

Sources: IMF WEO and staff calculations.

Sum of the impacts of data revisions, forecast errors and judgment for 1994-2012.

Difference between the 2017 Fall WEO and real-time output gap estimates for 1994-2012.

Note: A positive mean indicates the extent of downward bias, causing mean output gap estimates to fall below their respective benchmarks. P-values indicate statistical difference from zero. Weighted average is based on GDP weights.

C. Forecast Accuracy

It is well documented that IMF GDP forecasts, as well as those of other institutions, tend to be too optimistic on average (see Box 3 for details). Forecast accuracy matters for all filtering methods because the data sample underlying the estimation typically does not end in year T when the estimation takes place. Instead, to improve the end-of-sample stability of the estimates, i.e. to reduce the extent to which a real-time estimate in year T is revised when data for T+1 become available, real-time estimates typically include several years of forecasts. In the case of the MVF real-time estimates presented in this paper, a five-year forecast of the three variables entering the estimation (i.e. output, inflation and unemployment) is used to extend the data sample up to T+5.


Impact of Forecast on Today’s Potential Estimate


However, extending the data sample comes at the cost of introducing potential spillovers from forecast inaccuracy to inaccuracy in real-time output gap estimates. The reason is that potential output is modelled to be “smooth”, so estimated potential in T also depends on estimated potential in T+1 to T+5, which is informed by the GDP forecast (and its errors). The text chart illustrates the mechanism. The solid black line indicates the observed output in the historical area of the chart (left of “today”) and true future output thereafter, whereas the solid red line symbolizes a too optimistic GDP forecast. Estimated potential is indicated by the dashed lines and, in this simplified example, is thought of as an average between output a few years ago and (forecasted) output a few years ahead. The black dashed line shows estimated potential under a perfect foresight forecast, taking into account the true future output and resulting in a positive output gap estimate for today. When instead forecasted output is taken into account, the estimated potential (red dashed line) exceeds today’s output, yielding a negative output gap. In this example, the overly optimistic forecast spills over into a “too negative” output gap estimate (relative to the one obtained under perfect foresight).
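The simplified example in the text chart can be reproduced numerically. The sketch below takes potential as the plain average of output a few years ago and (forecasted) output a few years ahead, as in the stylized description above; all numbers and names are made up for illustration.

```python
def toy_potential(past_output, future_output):
    """Stylized potential: average of past output and (forecasted) future output."""
    return 0.5 * (past_output + future_output)

def output_gap_pct(output, potential):
    """Output gap in percent of potential."""
    return 100.0 * (output / potential - 1.0)

# Made-up levels for illustration only.
output_past, output_today = 100.0, 104.0
true_future = 106.0            # output that actually materializes
optimistic_forecast = 112.0    # overly optimistic forecast made today

gap_perfect_foresight = output_gap_pct(
    output_today, toy_potential(output_past, true_future))          # potential = 103
gap_with_forecast = output_gap_pct(
    output_today, toy_potential(output_past, optimistic_forecast))  # potential = 106
```

With these numbers the perfect-foresight gap is positive while the forecast-based gap is negative, so the sign of the estimated gap flips purely because of the optimistic forecast.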

Before turning to a systematic analysis, we define the following two concepts. First, the vintage forecast estimate for year T is based on year-T vintage data and five years ahead WEO vintage forecast. Second, the perfect-foresight estimate for year T is different in that the sample is extended by a forecast that is perfect, in the sense that the implied growth rates are correct in hindsight. That is, we compute growth rates for T+1 to T+5 in the final data (i.e. the fall 2017 WEO) and use those rates to extrapolate the vintage data from year T onwards.13 The two estimates differ because of the inaccuracy of the vintage forecast, i.e. because of deviations of forecasted growth rates from ex-post observed growth rates.
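The construction of the perfect-foresight sample for output can be sketched as follows, assuming the vintage and final series are index-aligned (same start year) and that the final series extends at least five years past T. The function name and the numbers are hypothetical.

```python
import numpy as np

def perfect_foresight_path(vintage_levels, final_levels, horizon=5):
    """Extend vintage levels (ending in year T) with final-data growth rates.

    Growth rates for T+1..T+horizon are taken from the final data and applied
    to the last vintage level, so the vintage history itself is left unrevised.
    """
    vintage = np.asarray(vintage_levels, dtype=float)
    final = np.asarray(final_levels, dtype=float)
    t = vintage.size
    if final.size < t + horizon:
        raise ValueError("final data do not cover the full forecast horizon")
    growth = final[t:t + horizon] / final[t - 1:t + horizon - 1]  # gross growth rates
    extension = vintage[-1] * np.cumprod(growth)
    return np.concatenate([vintage, extension])

# Made-up levels for illustration only: the final data grow 2 percent a year.
vintage = [100.0, 102.0]
final = [50.0, 51.0, 52.02, 53.0604]
path = perfect_foresight_path(vintage, final, horizon=2)
```

Using growth rates rather than final levels keeps the two samples identical up to year T, so the difference between the vintage forecast and perfect-foresight estimates reflects forecast inaccuracy and not data revisions.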

The second column of Table 4 in sub-section E shows that forecast inaccuracy tends to lower output gap estimates (increase potential output) by around 0.7 percentage points on average, which corroborates the finding of optimistic forecast bias reported in Box 3. Appendix Figure 4 shows time-series charts for 11 countries. The text chart depicts the example of France and shows that the implications of forecast inaccuracy (the difference between the blue and red lines) tend to be larger in the years preceding a sharp slowdown.14 The closer we get to 2008, the more prominent is the influence of the crisis period in the 5-year forecast horizon. For the vintage forecast series, the proximity to the crisis has no immediate implications since the crisis was not forecasted and therefore the potential output estimate is too positive and the output gap too small. In the perfect foresight series, in contrast, the crisis is part of the forecast. As we get closer to 2008, the crisis increasingly weighs down on the estimate of potential output and thereby leads to larger positive output gaps.


France: Vintage and Perfect Foresight Estimates


Sources: IMF WEO and staff estimates.
Appendix Figure 4.

Vintage Forecast, Perfect Foresight, and WEO Real-Time Estimates


Sources: IMF WEO and staff calculations.

D. The Role of Judgment

Finally, we analyze judgment, i.e. the deviation of staff’s output gap estimates from the filter results. We define historical judgment as the deviations of WEO real-time estimates from the pure MVF-based vintage estimates. The MVF-based vintage estimates are based on all information available to staff in real time (data as well as forecasts) and can therefore be interpreted as a natural benchmark, an estimate that is free of expert judgment. In other words, the analysis can be interpreted as showing what staff estimates would have been had staff used the MVF since the beginning of the sample. Because of the reliance on a particular estimation method (the MVF), however, judgment does not only capture additional information on economic structure (e.g. impact of structural reforms), but also the choice of modelling framework15 as well as the impact of any other “external” factors, such as overoptimism about convergence and potential growth of EU member countries (see footnote 11).

The third column of Table 4 in the next sub-section shows that the real-time WEO estimates of the output gap are on average about 1 percentage point below the MVF-based vintage estimates. Note that the MVF-based vintage forecast estimates should not be understood as being necessarily superior to WEO real-time estimates because the application of a filter can only complement, but not substitute for, staff’s assessment, which is based on a much broader information set than the variables entering a filter. However, it is striking that the deviations of WEO real-time from vintage forecast estimates are that large and predominantly negative.

E. Comparison to WEO Revisions

The first three columns in Table 4 show the individual contributions of data revision, forecast errors and judgment to the total bias summarized in column 4. Column 5 shows the actual revisions to WEO output gap estimates (average final estimate minus average real-time estimate). Comparing the last two columns shows that the total impact of the three sources of uncertainty is close to the size of WEO revisions, with an unweighted average difference of 0.2 percentage points of potential GDP. According to this exercise, just below half of WEO revisions can be attributed to forecast errors and the remaining half to judgment, with data revisions having a minor role.16
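The accounting behind Table 4 can be written compactly. The notation below is introduced here for illustration and does not appear in the original: g denotes an output gap estimate for country i and year T, with superscripts QRT (quasi-real-time), RT (MVF real-time), PF (perfect foresight), VF (MVF vintage forecast) and WEO (published WEO estimate). Whether the MVF real-time and vintage forecast estimates coincide is not stated in this section, so they are kept as separate symbols. Signs follow the table note, under which a positive value indicates a downward bias in real time.

```latex
\begin{aligned}
\text{data}_{i,T}     &= g^{\mathrm{QRT}}_{i,T} - g^{\mathrm{RT}}_{i,T}
                      && \text{(column 1)} \\
\text{forecast}_{i,T} &= g^{\mathrm{PF}}_{i,T} - g^{\mathrm{VF}}_{i,T}
                      && \text{(column 2)} \\
\text{judgment}_{i,T} &= g^{\mathrm{VF}}_{i,T} - g^{\mathrm{WEO}}_{i,T|T}
                      && \text{(column 3)} \\
\text{total}_{i,T}    &= \text{data}_{i,T} + \text{forecast}_{i,T} + \text{judgment}_{i,T}
                      && \text{(column 4)} \\
\text{revision}_{i,T} &= g^{\mathrm{WEO}}_{i,T|2017} - g^{\mathrm{WEO}}_{i,T|T}
                      && \text{(column 5)}
\end{aligned}
```

The comparison in the text is then between the sample averages of the last two lines, which need not coincide exactly because the three components are computed from counterfactual MVF runs and not from an exact identity.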

V. Relevance to Monetary Policy and Inflation Forecast

Monetary policy reaction functions usually include a measure of slack, most often captured by the output gap, and a measure of inflation expectations, with various leads and lags to denote both the forward-looking nature of policy and a desire for gradual adjustment (through interest rate persistence). Therefore, real-time biases in output gap estimates could lead to policy errors either through their direct impact on the policy instrument (Orphanides, 2003) or through lower predictability of inflation (Orphanides and van Norden, 2005).17

In this section we test whether the WEO output gap is useful in explaining inflation. Tables 5 and 6 show basic Phillips curve relationships in real time and in the final 2017 WEO vintage, respectively, for different sample sizes and time periods. While the Phillips curve relationship seems to be strongly confirmed based on final 2017 WEO data, the coefficient for the real-time output gap is small and generally not statistically significant.18 The real-time WEO output gap does not seem to provide additional information over and above what is already contained in past inflation and inflation expectations. It is therefore of limited use as a basis for advice on monetary policy.
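To illustrate how a Phillips curve slope can be sizable for a final output gap yet small for a noisy real-time one, the sketch below runs a pooled OLS on synthetic data. It is not the specification of Tables 5 and 6: it omits inflation expectations and the time and fixed effects, the data-generating process is assumed, and measurement-error attenuation is only one textbook mechanism that could produce such a pattern. All names and parameter values are hypothetical.

```python
import numpy as np

def phillips_ols(inflation, lagged_inflation, gap):
    """OLS of inflation on a constant, lagged inflation and an output gap.

    Returns the coefficients [constant, persistence, gap slope].
    """
    x = np.column_stack([np.ones_like(gap), lagged_inflation, gap])
    coef, *_ = np.linalg.lstsq(x, inflation, rcond=None)
    return coef

# Synthetic data, for illustration only (not WEO data).
rng = np.random.default_rng(0)
n = 2000
final_gap = rng.normal(0.0, 1.5, n)
# Assumed real-time gap: the final gap shifted down and measured with noise.
real_time_gap = final_gap - 1.3 + rng.normal(0.0, 3.0, n)
lagged_inflation = rng.normal(2.0, 1.0, n)
inflation = (0.5 + 0.6 * lagged_inflation + 0.3 * final_gap
             + rng.normal(0.0, 0.5, n))

slope_final = phillips_ols(inflation, lagged_inflation, final_gap)[2]
slope_real_time = phillips_ols(inflation, lagged_inflation, real_time_gap)[2]
```

In this setup the constant shift in the real-time gap is absorbed by the intercept, while the noise pulls the estimated slope toward zero, so the level bias and the weak slope are separate features of the real-time series.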

Only when the sample size is reduced to the years when countries entered the monetary union can the real-time output gap be statistically associated with inflation, in some specifications. Nevertheless, such association is rare and not robust, and even for this time period, the revisions to real-time output gaps are substantial. Notwithstanding large revisions, simple correlation coefficients between the changes in real-time and final output gap estimates are positive at 67 percent for 11 baseline countries and 28 percent for all countries, and highly significant. Thus, real-time output gaps can to some extent be informative on the direction or turning points of final output gap and hence inflation. This echoes earlier findings by Tosetto (2008) who points out that notwithstanding large and persistent revisions to the initial output gaps published by the OECD, preliminary estimates are strongly correlated with the successive ones, can serve as useful predictors of the latter, and in this respect may still contain useful information for future inflation and monetary policy.

The lack of a “real-time” Phillips curve and negligible ex-post revisions to WEO inflation forecasts mean that the real-time output gap bias is not reflected in inflation forecasts, mirroring the important disconnect between inflation (fluctuating around 2 percent, close to the euro area inflation expectations) and persistently negative real-time output gaps (see Section III.B). Due to high inflation persistence in Europe and small revisions, inflation in real time is a useful predictor of final inflation and informative for monetary policy.19 To the extent that real-time output gap bias leads to a loss in inflation predictability, or policy rules and decisions rely on real-time estimates of the cyclical position, monetary policy can still suffer from information uncertainty.20

Table 5.

Basic Real-Time Phillips Curve Regressions

Source: IMF WEO and staff estimates. Note: Standard errors in parentheses. ***, **, and * denote significance at 1, 5, and 10 percent, respectively. All regressions include time and fixed effects.
Table 6.

Basic Ex-Post Phillips Curve Regressions

Source: IMF WEO and staff estimates. Note: Standard errors in parentheses. ***, **, and * denote significance at 1, 5, and 10 percent, respectively. All regressions include time and fixed effects.

VI. Relevance to Fiscal Policy Outcomes

Persistent bias in real-time output gap estimates could have implications for fiscal policy. Output gaps are typically used to separate fiscal balances into their cyclical and structural components. The latter is supposed to provide a measure of the government’s underlying fiscal policy stance—cleaned from temporary cyclical influence—and help steer the desirable setting of fiscal policy given the authorities’ medium-to-long-term debt consolidation objectives (which in the European stability and growth pact framework is formalized in terms of the medium-term objective). If real-time output gaps fail to properly disentangle trend from cycle, then the long-term public debt implications of a given deficit would be inaccurately estimated. For example, a negative output gap bias in real time would result in an excessively optimistic view of the long-term level of output and fiscal revenues.

To the extent that the estimates of the output gap and structural fiscal balance play a role in calibrating fiscal policy, over-optimism about long-term growth could translate into excessive deficit and debt buildup. Simulations by Ley and Misch (2014) based on WEO data on output gap revisions for 175 countries and over 17 years show that in more than one-fifth of cases the implied revisions of the overall and structural fiscal balances exceed 1 percent of GDP, possibly leading to unplanned debt accumulation or surprise debt reduction. They caution against taking real-time estimates of structural balances at face value and recommend building safety margins in the fiscal policy targets.

To fix ideas about the magnitudes involved, consider the following highly simplified thought experiment. Assume that countries’ fiscal authorities had relied on real-time estimates of the output gap to calibrate fiscal policy, with similar biases as the WEO estimates in terms of the magnitudes and signs, two conditions that would require further verification. We can then translate the real time output gap bias into a cyclical component of fiscal balances, using an approach similar to Kempkes (2014). The resulting bias in fiscal policy would range from 0.7 to 0.9 percent of potential GDP annually.21 Year after year overestimation of potential would thus imply significant unplanned deficits and large debt buildup over a 19-year baseline sample.22
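A rough sense of the cumulative magnitude can be obtained by simply adding up the annual bias. The sketch below assumes, for illustration only, that the bias is constant every year and ignores interest payments, growth effects on the debt ratio and any policy response; the function name is hypothetical.

```python
def cumulative_unplanned_deficit(annual_bias_pct, years):
    """Sum of a constant annual bias, ignoring interest, growth and policy response."""
    return annual_bias_pct * years

# Annual bias range from the thought experiment, over the 19-year baseline sample.
low = cumulative_unplanned_deficit(0.7, 19)
high = cumulative_unplanned_deficit(0.9, 19)
```

Under these strong simplifications the range works out to roughly 13 to 17 percent of potential GDP, which indicates the order of magnitude involved and is not an estimate from the paper.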

In a similar exercise, Eyraud and others (2018) and Eyraud and Wu (2015) compare ex post estimates of the output gap with real-time estimates in the stability programs prepared annually by the euro area country authorities for the European Commission. For the period 2003–16, Eyraud and others (2018) find that the output gap was underestimated in real time by 1.3 percentage points of GDP on average. Assuming an elasticity of revenue to output of 1, an elasticity of expenditure to output of 0, and an average expenditure-to-GDP ratio of 45 percent, the underestimation of the output gap in real time would imply an overestimation of cyclically adjusted balances by 0.5 percentage point on average per year. Given that euro area countries are subject to a cyclically adjusted balance rule, negative bias in the authorities’ real-time output gap estimates may have led to unplanned deficits.
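The mapping from an output gap bias to a bias in the cyclically adjusted balance can be sketched with a standard budget semi-elasticity formula, which is an assumption of this sketch and not necessarily the exact calculation behind the figure quoted above. With a revenue elasticity of 1 the revenue term drops out, so the revenue ratio passed below is a placeholder; function names are hypothetical.

```python
def budget_semi_elasticity(rev_elasticity, exp_elasticity, rev_ratio, exp_ratio):
    """Semi-elasticity of the balance-to-GDP ratio with respect to the output gap.

    Standard formulation assumed here: (eta_R - 1) * R/Y - (eta_G - 1) * G/Y.
    """
    return (rev_elasticity - 1.0) * rev_ratio - (exp_elasticity - 1.0) * exp_ratio

def cab_overestimate(gap_underestimate_pp, semi_elasticity):
    """Overestimate of the cyclically adjusted balance, in percentage points of GDP."""
    return semi_elasticity * gap_underestimate_pp

# Inputs quoted in the text; rev_ratio is a placeholder that drops out when eta_R = 1.
eps = budget_semi_elasticity(rev_elasticity=1.0, exp_elasticity=0.0,
                             rev_ratio=0.45, exp_ratio=0.45)
bias = cab_overestimate(1.3, eps)
```

With these rounded inputs the product is just under 0.6 percentage point, in the neighborhood of the roughly 0.5 reported above; the difference presumably reflects rounding or inputs not stated in this section.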


Average Revisions to Real Time Output Gap and Public Debt


Sources: IMF WEO and staff calculations.

Average Revisions to Real-Time Output Gap and 5Y Ahead Public Debt Forecast Errors


Sources: IMF WEO and staff calculations.

Drawing on the WEO database over the period 1994 to 2017, we find a positive association between the level of the debt-to-GDP ratio in 2017 and the average output gap revisions over the past 24 years (text chart). High-debt euro area countries have seen larger upward revisions to WEO real-time output gaps. As shown in Appendix Table 3, the average revision in higher-debt countries amounts to 1.1 percent of potential GDP, while real-time estimates in lower-debt countries are on average only revised by 0.5 percent, thus closing the difference between the average output gap estimates for both country groups in the final vintage. Similarly, WEO output gap revisions over the period 1994–2012 are positively associated with WEO 5-year ahead forecast errors for the public debt ratio.23

Appendix Table 3.

Real-Time Output Gaps for High and Low Debt Countries: Mean Estimates and Significance (1994-2017)

Sources: IMF WEO and staff estimates.

Box 4 builds on the cross-sectional correlations between average output gap revisions over time on the one hand and the public debt ratio in 2017 as well as forecast errors for public debt on the other. Specifically, it investigates the correlation between output gap revisions and primary fiscal balances, while conditioning on a variety of variables that can affect these fiscal balances. In so doing, it takes advantage of both the time series and cross-sectional dimensions of the data. As expected, the association between the final output gap estimate and the primary fiscal balances is positive: the better the economy is doing, the higher is the fiscal balance. The association between revisions to real time output gaps estimates and primary fiscal balances is, however, strongly negative: larger upward revisions to WEO output gaps are associated with lower primary fiscal balances. In other words, the countries for which WEO forecasters have seen too much economic slack and thus too much potential for the economy to grow in the future also tend to be the countries that have seen lower primary fiscal balances.

Fiscal Outcomes and Real-Time Output Gap Bias

To study the relationship between fiscal outcomes and WEO output gap revisions, we adapt a standard framework of fiscal reaction functions popularized by Bohn (1998, 2008). Table 7 reports the results of regression analysis that relates primary fiscal balances to the lagged primary balance (persistence), lagged public debt and detrended real government consumption (optimizing conditions due to Barro, 1979) as well as the WEO output gap and revisions to the real-time WEO output gap (the primary channel of interest). The latter is also broken down by MVF simulated sources of real-time bias (derived in the previous section). The regressions further control for a standard set of institutional and economic variables, along with 5-year ahead debt forecast errors to capture any possible feedback of future changes in fiscal policies on output gap revisions, as well as real-time inflation and unemployment gaps to net out residual cyclical effects.1
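One way to write down a reaction function of this kind is given below. The notation is introduced here for illustration and is not taken from the original, and the timing of some regressors (for example, whether government consumption enters contemporaneously) is an assumption. Here pb is the primary balance, d the public debt ratio, \tilde{c} detrended real government consumption, g the output gap, r the revision to the real-time output gap, x a vector of further controls and \alpha_i a country effect.

```latex
\begin{aligned}
pb_{i,t} &= \alpha_i + \rho\, pb_{i,t-1} + \beta\, d_{i,t-1} + \gamma\, \tilde{c}_{i,t}
          + \delta\, g_{i,t|2017} + \phi\, r_{i,t} + \theta' x_{i,t} + \varepsilon_{i,t}, \\
r_{i,t}  &= g_{i,t|2017} - g_{i,t|t}.
\end{aligned}
```

In this notation the findings discussed in the box correspond to \delta > 0 and \phi < 0, and the breakdown by source of bias replaces r with its data revision, forecast error and judgment components.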

Table 7.

Primary Balance Regressions

Sources: IMF WEO and staff estimates. Note: Standard errors in parentheses. ***, **, and * denote significance at 1, 5, and 10 percent, respectively. Baseline sample for 11 larger euro area countries for the period 1994-2012. All regressions are estimated with the bias-corrected LSDV dynamic panel data estimator with bootstrapped standard errors.

Our choice of estimator is the bias-corrected Least Squares Dummy Variable (LSDV) dynamic panel estimator of Bruno (2005), which corrects for the “dynamic panel bias” in samples with small N and has been applied in similar frameworks of primary balance regressions (see, for example, Debrun and Kinda, 2013). We do not consider issues of endogeneity and simultaneity in fiscal policy in much depth because we are mostly interested in checking whether the insights from the bivariate, cross-sectional correlations between output gap revisions and fiscal outcomes discussed above are robust to further testing in a multivariate, time series framework. We find that this is the case.
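To illustrate why a correction is needed, the following minimal sketch simulates a first-order autoregressive panel with country fixed effects and applies the uncorrected LSDV (within) estimator. It does not implement the Bruno (2005) correction; it only shows the downward bias in the estimated persistence that the correction is designed to offset. The true persistence of 0.6, the unit-variance shocks, and the panel dimensions (chosen to roughly mirror 11 countries over 1994-2012) are hypothetical choices for illustration.

```python
import random

def simulate_unit(n_periods, rho, rng, burn_in=50):
    """One unit's path of y[t] = alpha + rho*y[t-1] + e[t], of length n_periods + 1."""
    alpha = rng.gauss(0.0, 1.0)      # unit-specific fixed effect
    y = alpha / (1.0 - rho)          # start at the unit's long-run mean
    path = []
    for t in range(burn_in + n_periods + 1):
        y = alpha + rho * y + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            path.append(y)
    return path

def lsdv_rho(panel):
    """Uncorrected LSDV estimate of rho: demean y[t] and y[t-1] by unit, then pooled OLS."""
    num = den = 0.0
    for path in panel:
        lag, cur = path[:-1], path[1:]
        lag_mean = sum(lag) / len(lag)
        cur_mean = sum(cur) / len(cur)
        for l, c in zip(lag, cur):
            num += (l - lag_mean) * (c - cur_mean)
            den += (l - lag_mean) ** 2
    return num / den

rng = random.Random(0)
TRUE_RHO = 0.6                     # hypothetical persistence of the primary balance
N_UNITS, N_PERIODS = 11, 19        # illustrative panel dimensions
N_REPLICATIONS = 500

estimates = []
for _ in range(N_REPLICATIONS):
    panel = [simulate_unit(N_PERIODS, TRUE_RHO, rng) for _ in range(N_UNITS)]
    estimates.append(lsdv_rho(panel))

mean_estimate = sum(estimates) / len(estimates)
print(f"true rho = {TRUE_RHO}, mean uncorrected LSDV estimate = {mean_estimate:.3f}")
```

Across replications, the average uncorrected estimate falls below the true persistence parameter, and the shortfall grows as the time dimension shrinks. This is the bias that a corrected estimator aims to remove.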

The coefficient on real-time output gap revisions is statistically highly significant and economically almost as large as the one on the final output gap. Estimated negative coefficients in columns 2–5 indicate that predominantly positive (upward) revisions to WEO real-time output gaps, averaging 1.3–1.4 percent of potential GDP (Table 7), are associated with lower primary balance estimates on the order of 0.8–1.2 percent of GDP annually. The regression results also indicate that the association is broad-based, working through both the MVF-simulated effect of forecast errors and judgment, with similar coefficients. While data revisions tend to have a statistically significant positive coefficient, likely indicating that revisions to real-time GDP achieve better consistency with fiscal policy outcomes, the economic magnitude of such revisions is small and driven mostly by one country in the sample.

The estimated associations could in principle be driven by a common factor: countries where GDP growth has disappointed can end up with both upward revisions to output gaps and lower primary balances (higher deficits), leading to higher debt-to-GDP ratios. In this regard, the positive coefficient on the current account balance, besides accounting for twin deficits, is expected to capture correlated international spillovers (see Checherita-Westphal and Žďárek, 2017), while inflation and unemployment gaps control for the impact of the business cycle on the deficit. The baseline bias-corrected LSDV estimates are also robust to time effects that capture a basic form of cross-country dependence (see Appendix Table 4). Estimated coefficients of other determinants of primary fiscal balances are broadly in line with earlier studies. The highly significant autoregressive component indicates persistence in primary balances. The positive coefficient on lagged public debt may indicate that governments with higher public debt try to achieve higher primary balances (the “weak sustainability condition”). Results are robust to changes in sample size and to the use of real-time debt (results not reported).

Appendix Table 4.

Robustness of Primary Balance Regressions

Sources: IMF WEO and staff estimates. Note: Standard errors in parentheses. ***, **, and * denote significance at 1, 5, and 10 percent, respectively. Baseline sample for 11 larger euro area countries for the period 1994-2012. The bias-corrected LSDV dynamic panel data estimates are reported with bootstrapped standard errors.
1 Shifts in the cyclical and trend components are captured by the “inflation gap” (deviation of inflation from its long-term average), the “unemployment gap” (deviation of the unemployment rate from the NAIRU), and a “NAIRU gap” (deviation of the estimated NAIRU from the long-term average unemployment rate). The first two capture mostly cyclical movements, while the NAIRU gap captures shifts in (common) trends (see Section III.B for discussion).

The excess slack in WEO real-time estimates is related both to overly optimistic forecasts and to overoptimistic potential growth incorporated through judgment. Both are associated with lower primary balances (Box 4). While the former is a corollary of difficulties with projecting recessions (see Section IV), the origin of the latter is not clearly identifiable. It could reflect overestimates of the impact of structural and fiscal reforms on potential output, but also many other factors. The results suggest that such over-optimism is strongly associated with lower-than-expected primary balances and with debt accumulation.

To conclude, we derive the following insights from our analysis of output gaps and fiscal outcomes: (i) WEO forecasters have seen more slack in real time than they considered warranted ex post, that is, their real-time output gap estimates were persistently negative; (ii) this has been more pronounced for countries with high public debt; (iii) where it has been more pronounced, it has also been associated with (a) larger WEO forecast errors with respect to public debt ratios, and (b) lower primary balances. A simple back-of-the-envelope calculation shows that the unexpected debt accumulation due to real-time output gap biases could potentially be substantial. Similarly, evidence presented by Eyraud and others (2018) and Eyraud and Wu (2015) suggests that a similarly large negative bias is present in country authorities’ real-time output gap estimates, which may have led to unplanned deficits under a structural fiscal balance rule. Further investigation is required to establish whether and to what extent persistently negative real-time output gap estimates have lured country authorities into adopting looser fiscal policies than they would have wanted.
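The logic of such a back-of-the-envelope calculation can be sketched as follows. This is a stylized illustration, not the computation behind the results reported in this paper. It assumes that fiscal policy targets a structural balance, so that a real-time output gap that is too negative by a given amount makes the structural balance look better than it is by the semi-elasticity times that amount; that the resulting shortfall recurs every year; and that interest and growth effects on the debt ratio are ignored. The semi-elasticity of 0.5 and the 10-year horizon are hypothetical values, while the bias of 1.3 percent of potential GDP corresponds to the lower end of the average revision cited above.

```python
def unplanned_debt(bias_pct, semi_elasticity, years):
    """Stylized unplanned deficit and debt from a negative real-time output gap bias.

    bias_pct        : final minus real-time output gap (percent of potential GDP)
    semi_elasticity : response of the budget balance (percent of GDP) to a
                      one-point change in the output gap (hypothetical value)
    years           : number of years over which the bias persists (hypothetical)

    Returns (annual unplanned deficit, cumulated unplanned debt), both in
    percent of GDP, ignoring interest payments and GDP growth.
    """
    annual_deficit = semi_elasticity * bias_pct
    cumulated_debt = annual_deficit * years
    return annual_deficit, cumulated_debt

# Illustrative inputs only; see the assumptions stated above.
annual, cumulated = unplanned_debt(bias_pct=1.3, semi_elasticity=0.5, years=10)
print(f"annual unplanned deficit: {annual:.2f} percent of GDP")
print(f"unplanned debt after 10 years: {cumulated:.1f} percent of GDP")
```

Under these illustrative inputs, the unplanned deficit is 0.65 percent of GDP per year and cumulates to 6.5 percent of GDP over a decade, which conveys why a persistent bias of this size can matter for debt dynamics.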

The uncertainty associated with estimating an unobservable concept such as potential GDP calls for fiscal policy prudence. This is especially relevant for high-debt countries that lack fiscal space, as an overestimation of potential output could increase the risk of unsustainable fiscal outcomes. In practice, the acknowledgment of high uncertainty would suggest stepping away from point estimates and relying more on the whole distribution of output gap estimates. The dispersion and skewness of the estimated distribution can provide useful information to policymakers when calibrating fiscal policies. More generally, our analysis cautions against an extensive focus on real-time output gaps when calibrating fiscal policy. Eyraud and others (2018) point out that expenditure rules that allow automatic stabilizers to work, although more sensitive to initial conditions, can be more resilient to measurement errors. Andrle and others (2015) show with counterfactual simulations for France and Italy that the differences between real-time and ex post outcomes of public debt from 2001 onwards would have been significantly smaller under an expenditure rule than under a cyclically adjusted balance rule.

VII. Conclusions

The output gap, while unobservable, is typically considered an important input for policymaking. Policymakers need to be able to assess the underlying growth rate of the economy to separate trend from cycle and optimally calibrate the stance of monetary and fiscal policies. Failure to do so may result in excessive inflation or debt buildup as monetary and fiscal policy become pro-cyclical. However, measuring the output gap in real time is very difficult since information is scarce and hard to interpret, structural shifts may not be immediately visible, and recessions are difficult to forecast. A large empirical literature has shown that output gaps tend to be revised over time, and the revisions may be large, to the point that what looks like an optimal countercyclical policy in real time may appear pro-cyclical in hindsight.

In this paper we audit the WEO real-time output gap estimates for countries in the euro area against a set of desirable properties. We show that revisions of WEO real-time output gap estimates for the main euro area countries are large, especially before the GFC, and systematically upward. Consequently, staff’s real-time output gap estimates are persistently biased downward, with the bias sometimes extending over more than a decade. Counterfactual simulations show that overestimation of the economy’s productive capacity and over-optimistic growth forecasts are the main drivers of the negative bias. The forecast errors primarily occur because WEO forecasters aim at predicting the mode of the growth distribution which, in the presence of downwardly skewed risks, exceeds average growth. The judgment component of the negative bias argues for caution in incorporating off-model information (such as the impact of structural or fiscal reforms) and favors filtering methods that are not too sensitive to the end of the sample.
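The point about the mode exceeding the mean under downwardly skewed risks can be illustrated with a small numerical sketch. The three growth outcomes and their probabilities below are entirely hypothetical and serve only to show the mechanism: a forecaster who reports the most likely outcome is too optimistic on average when there is a small probability of a recession.

```python
# Hypothetical discrete distribution of next-year GDP growth (percent).
# All outcomes and probabilities are illustrative assumptions.
growth_outcomes = [-2.0, 1.5, 2.0]
probabilities = [0.15, 0.25, 0.60]

def distribution_mean(values, probs):
    """Probability-weighted average outcome."""
    return sum(v * p for v, p in zip(values, probs))

def distribution_mode(values, probs):
    """Most likely outcome."""
    return max(zip(values, probs), key=lambda pair: pair[1])[0]

def distribution_skewness(values, probs):
    """Standardized third central moment; negative means a long left tail."""
    mu = distribution_mean(values, probs)
    variance = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    third_moment = sum(p * (v - mu) ** 3 for v, p in zip(values, probs))
    return third_moment / variance ** 1.5

mean_growth = distribution_mean(growth_outcomes, probabilities)
mode_growth = distribution_mode(growth_outcomes, probabilities)
skewness = distribution_skewness(growth_outcomes, probabilities)

# Average forecast error (outcome minus forecast) of a forecaster who
# always reports the mode.
expected_error = mean_growth - mode_growth

print(f"mode = {mode_growth:.2f}, mean = {mean_growth:.3f}, skewness = {skewness:.2f}")
print(f"average error of a mode forecast = {expected_error:.3f}")
```

In this example the mode forecast overstates growth on average even though it is the single most likely outcome, so growth projections are systematically too high.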

Such uncertainties over real-time output gap estimates can have important consequences for policy to the extent that it is informed by these estimates. We find that in countries where slack (and potential growth) is overestimated in the WEO data, primary fiscal balances tend to be lower, and public debt ratios tend to be higher and to increase faster than projected. It is thus possible that the biases in output gap estimates reflected an underappreciation of a slowdown in potential growth, which WEO forecasters and policymakers have come to understand only gradually. What were deemed cyclical revenue losses became structural losses, leading to a large accumulation of debt. Available evidence suggests that the national authorities’ real-time output gap estimates have suffered from a similar negative bias. To the extent that those estimates played a role in calibrating fiscal policy, over-optimism about long-term growth could have contributed to excessive deficits and larger debt buildup.

We also find, in line with earlier literature, that real-time output gaps are not useful for predicting inflation, as evidenced by the apparent lack of a “real-time” Phillips curve. Real-time output gaps at best contain information on directional changes in business cycles and inflation. Because inflation is persistent and not revised, inflation in real time remains a useful indicator for forecasting inflation.

To minimize the occurrence of large measurement errors in the future, staff has moved towards a multi-pronged approach to the estimation of output gaps (IMF, 2015). Potential output is estimated using a multivariate filter (MVF) that, thanks to the economic structure imposed during estimation, yields real-time potential output estimates that are more in line with desirable economic properties of output gaps and less prone to revisions.

As the model structure imposed on the data may not capture important structural changes in individual economies, the approach still allows expert judgment to be introduced. The latter is added during the filtering procedure and is based on growth accounting via a production function approach.

Several options are currently being studied to further improve the reliability of real-time output gap estimates and their usefulness for policymaking. First, increasing the number of indicators (capacity utilization, investment to GDP, labor vacancy rates, etc.) may improve the precision of the MVF and reduce the size of ex post revisions (Turner and others, 2016). Second, incorporating some measure of the asymmetry of the forecast distribution in the central forecast may reduce forecast errors and ex post revisions. One possibility would be to rely on the mean instead of the mode of forecasts. As shown in this paper, failure to predict recessions has been an important driver of the systematic downward bias in real-time output gap estimates. Third, complementary measures of the output gap can help hedge against first-order mistakes in real time. IMF (2015) discusses a related but different concept of “sustainable” output: the GDP level that an economy can sustainably produce over the medium term in the absence of imbalances. This approach yields a complementary measure of potential output that can provide timely and useful information for real-time policymaking, especially when the two measures diverge significantly. Finally, policymakers could consider building in safety margins or other robust measures to shield against real-time biases. This would require building buffers during upswings, increasing the resources available for effective counter-cyclical fiscal policy in downturns. All in all, our paper cautions against an exclusive focus on output gap estimates in real time when calibrating counter-cyclical policy and stresses the importance of an encompassing approach.

Appendix Table 2.

Potential Fiscal Implications of Negative Real Time Output Gap Bias

(Percent of Potential GDP)

Sources: IMF WEO and staff estimates. Note: Computed using the EC's semi-elasticity of budget balance to changes in the output gap (see Mourre and others, 2014).

References

  • Abdih, Y., Lin, L. and A.-C. Paret, 2018, “Understanding Euro Area Inflation Dynamics: Why So Low for So Long?” IMF Working Paper WP/18/188.

  • Ademmer, M., Boysen-Hogrefe, J., Carstensen, K., Hauber, P., Jannsen, N., Kooths, S., Rossian, T. and U. Stolzenburg, 2019, “Schätzung von Produktionspotenzial und -lücke: Eine Analyse des EU-Verfahrens und mögliche Verbesserungen,” Kieler Beiträge zur Wirtschaftspolitik, No. 19.

  • Alichi, A., Avetisyan, H., Laxton, D., Mkhatrishvili, S., Nurbekyan, A., Torosyan, L. and H. Wang, 2019, “Multivariate Filter Estimation of Potential Output for the United States: An Extension with Labor Market Hysteresis,” IMF Working Paper, forthcoming.

  • Álvarez, L.J. and A. Gómez-Loscos, 2018, “A menu on output gap estimation methods,” Journal of Policy Modeling, 40(4), pp. 827-850.

  • Aiyar, S. and S. Voigts, 2019, “The negative mean output gap,” IMF Working Paper, forthcoming.

  • Andrle, M., Bluedorn, J., Eyraud, L., Kinda, T., Koeva Brooks, P., Schwartz, G. and A. Weber, 2015, “Reforming Fiscal Governance in the European Union,” IMF Staff Discussion Note, SDN/15/09.

  • Ball, L., 1999, “Aggregate Demand and Long-Run Unemployment”, Brookings Papers on Economic Activity, vol. 2, pp. 189-251.

  • Barro, R., 1979, “On the Determination of Public Debt,” Journal of Political Economy, 87(5), pp. 940-971.

  • Bashar, O., 2011, “On the permanent effect of an aggregate demand shock: Evidence from the G-7 countries”, Economic Modelling, 28(3), pp. 1374-1382.

  • Benes, J., Clinton, K., Garcia-Saltos, R., Johnson, M., Laxton, D., Manchev, P. and T. Matheson, 2010, “Estimating Potential Output with a Multivariate Filter,” IMF Working Paper WP/10/285.

  • Benigno, P. and L.A. Ricci, 2011, “The Inflation-Output Trade-Off with Downward Wage Rigidities,” American Economic Review, 101(4), pp. 1436-1466.

  • Berger, H., Dowling, T., Lanau, S., Lian, W., Mrkaic, M., Rabanal, P. and M.T. Sanjani, 2015, “Steady as She Goes-Estimating Potential Output During Financial “Booms and Busts”,” IMF Working Paper WP/15/233.

  • Blagrave, P., Garcia-Saltos, R., Laxton, D. and F. Zhang, 2015, “A Simple Multivariate Filter for Estimating Potential Output,” IMF Working Paper WP/15/79.

  • Blanchard, O., Cerutti, E. and L. Summers, 2015, “Inflation and Activity - Two Explorations and their Monetary Policy Implications,” IMF Working Paper WP/15/230.

  • Blanchard, O., and L. Summers, 1986, “Hysteresis and the European Unemployment Problem,” NBER Macroeconomics Annual, vol. 1, pp. 15-90.

  • Bohn, H., 2008, “The Sustainability of Fiscal Policy in the United States.” In: Neck, R., and Sturm, J.E., Eds., Sustainability of Public Debt, MIT Press, Massachusetts, 15-49.

  • Bohn, H., 1998, “The Behavior of U.S. Public Debt and Deficits,” Quarterly Journal of Economics, 113(3), pp. 949-63.

  • Borio, C., Disyatat, P. and M. Juselius, 2014, “A parsimonious approach to incorporating economic information in measures of potential output,” BIS Working Paper No 442.

  • Bruno, G.S.F., 2005, “Approximating the bias of the LSDV estimator for dynamic unbalanced panel data models,” Economics Letters, 87(3), pp. 361-366.
