Mind the Gap: What is the Best Measure of Slack in the Euro Area?

Contributor Notes

Author’s E-Mail Address: kross@imf.org; angel.ubide@tudor.com

Assessing the magnitude of the output gap is critical to achieving an optimal policy mix. Unfortunately, the gap is an unobservable variable, which, in practice, has been estimated in a variety of ways, depending on the preferences of the modeler. This model selection problem leads to a substantial degree of uncertainty regarding the magnitude of the output gap, which can reduce its usefulness as a policy tool. To overcome this problem, in this paper we attempt to insert some discipline into this search by providing two metrics (inflation forecasting and business cycle dating) against which different options can be evaluated using aggregated euro-area GDP data. Our results suggest that Gali, Gertler, and Lopez-Salido's (2001) inefficiency wedge performs best in inflation forecasting and the production function methodology dominates in the prediction of turning points. If, however, a unique methodology must be selected, the quadratic trend delivers the best overall results.



I. Introduction

Recent economic developments have shown the importance of assessing the magnitude of the output gap in order to achieve a correct policy mix. The current discussion on the stance of fiscal policy in the euro area relies on the achievement of structural balance, the nominal balance adjusted for the output gap, over the medium term. Whether the current fiscal deterioration is due only to the play of automatic stabilizers or the result of discretionary fiscal stimulus cannot be resolved without an assessment of the magnitude of the output gap. In fact, the European Commission has just published a paper with detailed guidelines for the calculation of output gaps and structural balances2, which will be used in the assessment of stability programs. Critics of the recent stance of monetary policy in the United States blame the Federal Reserve for having mistaken a widening output gap for an increase in potential output, a mistake that led, according to these critics, to the need to tighten policy too fast and resulted in the abrupt slowdown of late 2000 and 2001. Other examples abound. The increasing variability of the output gap has been blamed, for example, for the Federal Reserve’s policy miscues that led to the stagflation of the 1970s. Also, the failure to correctly identify turning points has resulted, in many instances, in pro-cyclical fiscal policies. This evidence, together with the widespread adoption of inflation targeting frameworks that rely on output gap measures (see Svensson (1999) and Clarida, Gertler, and Gali (2000)), has spurred a recent surge in research related to the measurement of the output gap (see, among many others, Camba-Mendez and Rodriguez-Palenzuela (2001), Martins (2001), and Mestre and Fabiani (2001)).

The problem encountered by policy makers is that the output gap is the distance between potential and actual output, and potential output is not directly observable. Thus, multiple different methods have been developed to estimate the output gap, which encompass three different approaches: identification based on statistical properties of the GDP series, identification based on economic theory, and identification based on survey data. These different approaches can also be combined into “thick estimates” a la Granger (2000) that combine the information embedded into the different models.

However, since all of these methodologies rely on different assumptions for identification, they are bound to deliver divergent results and can be considered as “different windows through which economists can examine their models and data” (Canova (1998)). Thus, the selection of the methodology has to be based on the preferences of the researcher and on the question being investigated, and it is open to considerable discretion. A way to eliminate this discretion is to provide a metric against which to measure the different options, in order to select the “best” measure and minimize the subsequent error in the determination of the policy stance.

This paper, after showing how largely uninformative the output gap is unless there is an objective selection criterion across methodologies, attempts to carry out such an exercise for the euro area, by calculating the output gap using several different approaches and assessing their performance according to two metrics: the ability to forecast turning points, and the ability to forecast inflation. These metrics combine the desirable properties of minimizing the risk of pro-cyclical policies and maximizing the scope for inflation stabilization. Because of the absence of an official chronology of the euro area business cycle, we provide a dating of turning points for the level of real GDP and then assess the accuracy of turning point forecasting using three alternative methods. As for inflation forecasting, we conduct a simulated forecasting exercise, whereby a simulated series is constructed of the forecast of inflation that a model would have produced had it been used historically to generate a forecast of inflation. The results are then compared to a naive (no change) forecast. Our paper relates to the work performed for the United States by Canova (1999), Stock and Watson (1999), and Atkeson and Ohanian (2001), and by Camba-Mendez and Rodriguez-Palenzuela (2001) for the euro area.

The results show the wide range of estimates that different methodologies deliver, which add to the already large uncertainty inherent in the stochastic nature of these calculations. As for the best measure of the output gap, our results suggest that Gali, Gertler, and López-Salido’s (2001) inefficiency wedge performs best in inflation forecasting and the production function methodology dominates in the prediction of turning points. If, however, a unique methodology must be selected, the quadratic trend delivers the best overall results.

The rest of the paper is organized as follows: Section II briefly presents the different methodologies for the estimation of the output gap; Section III discusses the performance criteria used in the paper and the results of their application to the different measures of the output gap; Section IV discusses the results and presents some conclusions.

II. Methodologies for the Estimation of the Output Gap3

The concept of economic slack derives from the assumption that there is a potential level of output that can be achieved given the resources available in the economy. This potential level of output is, by definition, unobservable, and therefore indirect methods have been devised to extract this unobserved variable from the observed output series. There are three broad approaches to the estimation of the amount of available economic slack: identification based on the statistical properties of the GDP series, identification based on economic theory, and identification based on survey data. These different alternatives can then be used in isolation or combined to create “thick” estimates a la Granger (2000). Given that the nature of the paper is to discuss the properties of alternative methods that can be used in practical macroeconomic surveillance, for each of the three categories mentioned above we have selected methodologies that are easily replicable and widely available in standard econometric packages. In this spirit, other less popular methodologies, such as the Beveridge-Nelson decomposition, Wavelet filters, or multivariate common trends, have been excluded from the analysis.4

A. Based on Statistical Properties

GDP can be characterized as a cycle that evolves around a long term trend. Thus, identification based on the statistical properties of the unobserved components amounts to defining the main features of the trend and cycle (such as order and type of integration of the trend, and length or periodicity of the cycle) and the relationship between trend and cycle. There are several approaches depending on these hypotheses: some define cycle and trend according to their statistical properties and the relationship between trend and cycle (linear and quadratic trends, first order differences), others use an identification procedure based on the definition of cycle (Hodrick-Prescott, frequency domain filters). Throughout the paper we will denote the natural logarithm of the time series as y_t, its trend as x_t and its cyclical component as c_t.

The simplest procedure is linear (LT) and quadratic (QT) detrending, where it is assumed that x_t is a deterministic process which can be approximated by a polynomial function of time and that x_t and c_t are uncorrelated.
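As a minimal sketch of the LT and QT procedures, the trend can be fitted by ordinary least squares on a polynomial in time and the gap taken as the residual. The function name `polynomial_detrend` and the sample series are illustrative assumptions and are not taken from the paper.

```python
def polynomial_detrend(y, degree):
    """Fit a polynomial time trend by OLS; return (trend, cycle)."""
    n = len(y)
    # Rescale time to [0, 1] so the normal equations stay well conditioned.
    t = [i / (n - 1) for i in range(n)]
    k = degree + 1
    # Normal equations A b = r, with A[i][j] = sum t^(i+j) and r[i] = sum t^i * y.
    A = [[sum(ti ** (i + j) for ti in t) for j in range(k)] for i in range(k)]
    r = [sum((ti ** i) * yi for ti, yi in zip(t, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda row: abs(A[row][col]))
        A[col], A[piv] = A[piv], A[col]
        r[col], r[piv] = r[piv], r[col]
        for row in range(col + 1, k):
            f = A[row][col] / A[col][col]
            for j in range(col, k):
                A[row][j] -= f * A[col][j]
            r[row] -= f * r[col]
    # Back substitution for the polynomial coefficients.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (r[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    trend = [sum(b[p] * ti ** p for p in range(k)) for ti in t]
    cycle = [yi - xi for yi, xi in zip(y, trend)]
    return trend, cycle


# Hypothetical log-GDP series with a mild quadratic trend (illustration only).
y = [4.0 + 0.005 * i + 0.0001 * i * i for i in range(40)]
lt_trend, lt_cycle = polynomial_detrend(y, 1)  # LT
qt_trend, qt_cycle = polynomial_detrend(y, 2)  # QT
```

On this artificial series the quadratic fit absorbs the whole path, while the linear fit leaves a systematic residual, which illustrates how the choice of polynomial order alone changes the measured gap.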

The Hodrick-Prescott filter is widely used among applied macroeconomists because of its simplicity and its ability to replicate NBER turning points in U.S. GDP (see Canova 1999). Identification is achieved by assigning the business cycle a periodicity of 2-8 years, and defining a smooth but variable stochastic trend that is uncorrelated with c_t. We use the standard value of lambda equal to 1600 as the smoothing parameter (HP 1600). Given the traditional criticism of the crude application of the HP filter5, a modified version of this filter is also calculated, the HP-ARIMA (Maravall and Kaiser (2000)), which forecasts and backcasts the series with an ARIMA model to minimize the end-of-sample problem inherent to this filter (HPA1600). However, the choice of the smoothing parameter amounts to identifying the allocation of variations in output to the trend and to the cyclical component. To give robustness to the analysis, we will therefore use two additional alternatives introduced by Marcet and Ravn (2001), who suggest calculating the value of lambda in a cross-country setting so as to equalize the volatility of the trend across countries. We will use the United States as the benchmark. They propose two methods: allowing for a larger variability of the growth rate in countries with a more volatile cyclical component, which yielded a lambda of 3,137 (HP3137); and assuming similar economic structures between the benchmark and the comparator country, which yielded a lambda of 893 (HP893).6
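For illustration, the HP trend can be computed as the solution of the penalized least squares problem that balances fit against the smoothness of the trend, i.e. by solving (I + lambda D'D) x = y, where D is the second-difference operator. The sketch below uses a dense solver in plain Python for transparency; the function names and the sample series are assumptions, and the ARIMA extension used for HPA1600 is not covered.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            if f != 0.0:
                for j in range(col, n + 1):
                    M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) from the Hodrick-Prescott filter."""
    n = len(y)
    # A = I + lam * D'D, where each row of D holds (1, -2, 1).
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        d = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in d.items():
            for j, dj in d.items():
                A[i][j] += lam * di * dj
    trend = solve(A, list(y))
    cycle = [yi - xi for yi, xi in zip(y, trend)]
    return trend, cycle


# Hypothetical quarterly log-GDP series: linear growth plus a 5-year cycle.
y = [4.0 + 0.005 * t + 0.01 * math.sin(2 * math.pi * t / 20) for t in range(60)]
# The three smoothing parameters used in the text.
gaps = {lam: hp_filter(y, lam)[1] for lam in (893.0, 1600.0, 3137.0)}
```

A larger lambda penalizes changes in trend growth more heavily, so more of the variation in output is attributed to the cycle; comparing the three entries of `gaps` makes that allocation visible.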

Frequency domain filters assume that c_t and x_t are independent, that x_t has the power concentrated in the low frequency band of the spectrum, and that the power of the secular component decays rapidly away from zero. Baxter and King (1999) provide a time-domain version of this filter (BK). Here, the results are presented for the case where the cycles have a length of less than 32 quarters.

B. Based on Economic Properties

A typical criticism of the statistical or “atheoretical” methods is that the implicit output trend does not necessarily coincide with the definition of potential output, namely the maximum utilization of resources that is compatible with price stability.

Economic identification can be achieved in two main ways. The first one is using economic theory to statistically identify the trend. A widely used methodology is that of Blanchard and Quah (1989) (BQ), which uses the definition of supply and demand shocks in a bivariate structural VAR to identify potential output. The crucial assumption is that demand shocks have no long run effect on output and unemployment but supply shocks have long run effects on output.

The second procedure is estimating the variables using reduced-form equations derived from economic models. Perhaps the most common approach is the direct estimation of a (typically Cobb-Douglas) production function where the parameters are calibrated to match the specifications of the economy under study (see, for example, McMorrow and Roeger (2001)). An alternative method is the estimation of Phillips curves (see, among others, Staiger, Stock, and Watson (1997)). Because of the long standing debate about the validity of the standard Phillips curve as a macroeconomic tool, we will use two recent innovations in this area. The first one tries to exploit the intrinsic economic relationship between potential output and the NAIRU by estimating a system of equations that explicitly incorporates the covariation restrictions on cyclical output and cyclical unemployment, while taking into account the available information on inflation (see Apel and Jansson (1999a, 1999b)). Such a simultaneous estimation technique arrives at the desired joint labor- and goods-market assessment of how far the economy is from the levels of output and labor utilization that are consistent with stable inflation. The second approach stems from the New Phillips Curve literature (see Goodfriend and King (1997) for a survey) and has been developed empirically by Gali, Gertler and López-Salido (GGL) (2001).


Blanchard and Quah (1989) (BQ) propose an identification where, in a bivariate structural VAR of output and unemployment, supply shocks have long run effects on output but not on unemployment and demand shocks have no long run effects on either output or unemployment. The implied trend-cycle decomposition assumes that trend and cycle are uncorrelated.

Based on the Production Function

Potential GDP may be estimated using a production function approach.7 Aggregate production takes a Cobb-Douglas form with constant returns to scale and neutral technical progress. Potential output is then defined as:


where l_t is the log of the level of employment consistent with unemployment being at the NAIRU, k_{t-1} is the one-period lag of the log of the capital stock, tfp is trend total factor productivity, β is the share of capital in value added, and (1–β) is the average wage share. The NAIRU-consistent level of employment is defined by:


where lf_t is the labor force and U_t^n is a time-varying NAIRU.8 The evolution of the capital stock is defined as:


where δ denotes a rate of depreciation and inv_t represents real gross investment. Total factor productivity is estimated as a Hodrick-Prescott filtered Solow residual. We calibrate the model as in the ECB’s area-wide model (see Fagan, Henry, and Mestre 2001): β is set at 0.41 and δ is set at one percent per quarter, or 4 percent per annum. Finally, the output gap is defined as the ratio of real GDP to potential GDP.
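The three displayed equations of this subsection did not survive in the text above. The following is a sketch of the standard forms that match the surrounding definitions (Cobb-Douglas technology with constant returns, a NAIRU-consistent employment level, and perpetual-inventory capital accumulation). The symbol y_t^* for log potential output, the uppercase K_t for the capital stock in levels, and the treatment of lf_t as a logarithm are notational assumptions, so the exact form in the original may differ.

```latex
\begin{align}
y_t^{*} &= tfp_t + (1-\beta)\, l_t + \beta\, k_{t-1} \\
l_t     &= lf_t + \ln\!\left(1 - U_t^{n}\right) \\
K_t     &= (1-\delta)\, K_{t-1} + inv_t
\end{align}
```

Under the calibration stated above, β = 0.41 gives a labor weight of 0.59, and δ = 0.01 means that one percent of the previous quarter's capital stock is written off before current real gross investment is added.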

Based on the Phillips Curve: Unobserved Components

Apel and Jansson (1999a and 1999b) propose a procedure for joint estimation of the unobserved potential output and NAIRU variables based on the standard Phillips curve. Formally, the model contains the following equations:


where π_t is the log difference of the CPI, u_t the unemployment rate, u_t^n the NAIRU, z_t exogenous (supply-shock) variables, y_t the log of real output, and y_t^p the log of potential output. All innovations in the system (ε_t^pc, ε_t^ol, ε_t^n, ε_t^p, ε_t^c) are assumed to be i.i.d., mutually uncorrelated, with zero means and constant variances.

Equation (4) is a representation of the Gordon “triangle model” (Gordon 1997), whereby inflation is a function of inertia, demand, and supply, and embeds an expectation-augmented Phillips curve that controls for supply shocks. Equation (5) is a version of Okun’s Law, linking the cyclical components of output and unemployment.9 In order to close the model it is necessary to make some assumptions about the stochastic characteristics of the unobserved variables. Equations (6) and (7) assume that both potential output and the NAIRU contain a stochastic trend, and equation (8) specifies the evolution of cyclical employment as an autoregressive distributed lag.
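Since the equation block itself is not reproduced above, the system can be sketched as follows, using only the verbal description, the variable list, and the innovation labels. The lag polynomials ρ(L), η(L), ω(L), γ(L), ψ(L) and the drift α are assumed notation, the assignment of (6) to the NAIRU and (7) to potential output is inferred from the innovation superscripts, and equation (8) is written here in terms of the unemployment gap; none of these details is confirmed by the text.

```latex
\begin{align}
\pi_t &= \rho(L)\,\pi_{t-1} + \eta(L)\,(u_t - u_t^{n}) + \omega(L)\, z_t + \varepsilon_t^{pc} \tag{4} \\
y_t - y_t^{p} &= -\gamma(L)\,(u_t - u_t^{n}) + \varepsilon_t^{ol} \tag{5} \\
u_t^{n} &= u_{t-1}^{n} + \varepsilon_t^{n} \tag{6} \\
y_t^{p} &= \alpha + y_{t-1}^{p} + \varepsilon_t^{p} \tag{7} \\
u_t - u_t^{n} &= \psi(L)\,(u_{t-1} - u_{t-1}^{n}) + \varepsilon_t^{c} \tag{8}
\end{align}
```

In this form the unemployment gap is the single cyclical driver: it enters the Phillips curve as the demand term and maps into the output gap through the Okun relation, which is what allows potential output and the NAIRU to be estimated jointly.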

For purposes of estimation, the model is rewritten in state space form and the unknown parameters of the model and the time series of the unobservable components are found through application of the Kalman filter and maximum likelihood estimation (see Harvey (1991)).

Because this approach gives the modeler discretion regarding the underlying structure of potential output and the NAIRU, four versions of this model will be estimated.10 The first unobservable components model (UC1) is defined as above; the NAIRU is assumed to follow a pure random walk, while potential output follows a random walk plus constant drift. The second model (UC2) is based on the empirical observation that, at least for certain samples, unemployment in the euro area can be characterized as an I(2) variable (see Camba-Mendez and Rodriguez-Palenzuela (2001)). Thus, the NAIRU and potential output are assumed to follow random walks with stochastic trends. This requires the addition of two stochastic trend terms to equations (3) and (4) defined as:


The third model (UC3), which traces its lineage to Jaeger and Parkinson’s (1994) work on hysteresis effects, allows lagged cyclical unemployment in equation (5) to affect the current natural rate as shown in equation (3 c).


The coefficient θ measures the degree of hysteresis in the unemployment rate series.11 The final model (UC4) is derived from Martins (2001), who models the NAIRU as a random walk with a stochastic trend and potential output as a random walk with a constant drift. Thus, UC4 replaces equation 3 with (6a) and (6b) and equation 4 with


Based on the Phillips Curve: GGL’s “Inefficiency Wedge”

In their examination of inflation dynamics in the euro area, GGL (2001) argue that, in the context of the New Keynesian Phillips curve, real marginal cost, and not the output gap, is the theoretically correct measure of real sector inflationary pressures. In their view, real marginal costs tend to move rather sluggishly through the cycle (in line with the evidence of persistence in inflation), as labor market frictions and wage rigidities prevent markets from clearing smoothly. Detrended measures of output, however, generally lack this sluggishness (since they do not encompass these labor market imperfections) and cannot account for the influence of productivity and wage pressures on inflation.

In GGL (2001), log real marginal cost (mc_t) can be decomposed into two parts:

$$mc_t = \mu_t^{w} + \left[(c_t + \varphi n_t) - (y_t - n_t)\right]$$

(i) a gross wage markup (μ_t^w), which measures the degree of frictions or imperfections in the labor market, and (ii) an inefficiency wedge [(c_t + φn_t) - (y_t - n_t)], where c_t is the log of non-durable consumption, n_t is log employment per household, y_t is the log of real output, and the parameter φ represents the inverse of the labor supply elasticity. Theoretically, the inefficiency wedge (IW) is simply the ratio of a household’s marginal cost of supplying labor to its marginal product of labor, and measures the current level of output in relation to the efficient level of output. Empirically, as in GGL, the inefficiency wedge can easily be calculated from euro-area data on real consumption, real output, employment and the labor force.12

C. Based on Survey Data

An alternative to the estimation of the output gap is the use of survey-based measures of economic slack. The three most important are the measure of capacity utilization in industry (CU), and the indices of consumer (CC) and business confidence (BC). All three indices are transformed by normalizing them with their respective means and standard deviations. Because of data availability, we use a GDP weighted average of survey data series from four core countries (Germany, France, Italy, and Spain) until 1985 and Eurostat data for the euro area thereafter.
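The two transformations described here can be sketched as below, reading "normalizing" as subtracting the mean and dividing by the standard deviation (an assumed interpretation). The country codes, weights, and survey values are hypothetical numbers for illustration only.

```python
from statistics import mean, pstdev


def gdp_weighted_average(series_by_country, gdp_weights):
    """Combine country survey series into one aggregate using GDP weights."""
    total = sum(gdp_weights[c] for c in series_by_country)
    length = len(next(iter(series_by_country.values())))
    return [
        sum(gdp_weights[c] * series_by_country[c][t] for c in series_by_country) / total
        for t in range(length)
    ]


def standardize(series):
    """Subtract the series mean and divide by its standard deviation."""
    m, s = mean(series), pstdev(series)
    return [(v - m) / s for v in series]


# Hypothetical capacity utilization readings (percent) and GDP weights.
surveys = {"DE": [80.0, 82.0, 85.0, 83.0], "FR": [78.0, 81.0, 84.0, 80.0]}
weights = {"DE": 0.6, "FR": 0.4}
cu = standardize(gdp_weighted_average(surveys, weights))
```

After standardization each survey index has zero mean and unit variance, which puts CU, CC, and BC on a common scale even though their raw units differ.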

D. Based on “Thick” Estimates

In the process of selecting the best goodness of fit, researchers often choose amongst different variables until they find the one that best suits their needs. However, Granger (2000) suggests that, by discarding sub-optimal specifications, a wealth of valuable information is ignored, and several reasons suggest that this may not be a good practice. For example, as Granger (2000) indicates, a combination of forecasts is often superior to the best forecast, in a fashion similar to the well-known finance axiom that investing in a portfolio of assets is usually superior to investing in a single asset. Thus, many alternative specifications of similar quality are then combined into a single output. Three different “thick” estimates are presented in this paper: the unweighted mean (MEAN); the weighted mean (where the weights are the inverse of the standard deviation) (WMEAN); and the median (MEDIAN).
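The three combinations can be sketched as follows. The paper does not state which standard deviation is used for the weights; the sketch assumes each gap series' own sample standard deviation over time. The function name and the input values are illustrative.

```python
from statistics import mean, median, stdev


def thick_estimates(gaps):
    """gaps maps a method name to its gap series; all series share one length."""
    names = list(gaps)
    periods = len(gaps[names[0]])
    # Assumed weighting: inverse of each series' own standard deviation.
    w = {k: 1.0 / stdev(gaps[k]) for k in names}
    w_total = sum(w.values())
    out = {"MEAN": [], "WMEAN": [], "MEDIAN": []}
    for t in range(periods):
        vals = [gaps[k][t] for k in names]
        out["MEAN"].append(mean(vals))
        out["WMEAN"].append(sum(w[k] * gaps[k][t] for k in names) / w_total)
        out["MEDIAN"].append(median(vals))
    return out


# Hypothetical gap estimates (percent) from three methods.
gaps = {
    "A": [1.0, -1.0, 0.5, -0.5],
    "B": [2.0, -2.0, 1.0, -1.0],
    "C": [0.0, 0.0, 3.0, -3.0],
}
thick = thick_estimates(gaps)
```

Inverse-volatility weights pull the combined estimate toward the smoother series, whereas the median is insensitive to a single extreme method in any given quarter.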

III. Data and Estimates of Slack

Quarterly data (1970Q1 to 2000Q4) on real GDP, unemployment, and consumer prices were taken from OECD and IFS databases, and euro-area aggregates were constructed using PPP weights of euro-area countries. Where applicable, a similar procedure was undertaken to create the supply shock variables used in the unobservable component models as well as the inefficiency wedge variable. To ensure compatibility, in all cases we compared the data to those found in the ECB’s area wide model database (see Fagan, Henry, and Mestre (2001)) and to harmonized data from Eurostat.

The four panels of Figure 1 show the output gap estimates under each methodology. Table 1 contains output gap as well as potential GDP estimates from each of the models in 2000. Two results stand out. First, the wide dispersion of estimates: estimates of the rate of growth of potential output in 2000 vary between 2.1 percent and 3.4 percent, while the estimates of the output gap for 2000 range from -4.6 percent to 2.2 percent. Second, this dispersion occurs even within estimates that follow similar methodologies: for example, output gap measures with the unobserved components methodology range from -4.6 percent to 1.5 percent. To further illustrate this point, we plot in Figure 2 the bands resulting from taking the maximum and minimum values over all gap estimates for every point in time. Up to 1992 the bands are rather symmetric, and only in a few periods in the early 1980s can it be argued that the output gap was unambiguously negative. After 1992 the dispersion widens, and no conclusion can be drawn on whether the output gap was positive or negative or even whether it was widening or narrowing.
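The band plotted in Figure 2 amounts to a pointwise minimum and maximum across methods. A small sketch, with hypothetical gap values and an assumed helper name, shows how the band also identifies the periods in which the sign of the gap is unambiguous.

```python
def gap_band(gaps):
    """Return pointwise (lower, upper, sign) across all gap series."""
    columns = list(zip(*gaps.values()))
    lower = [min(col) for col in columns]
    upper = [max(col) for col in columns]
    # The sign is unambiguous only when the whole band lies on one side of zero.
    sign = [
        "negative" if hi < 0 else "positive" if lo > 0 else "ambiguous"
        for lo, hi in zip(lower, upper)
    ]
    return lower, upper, sign


# Hypothetical gap estimates (percent) for three quarters.
gaps = {
    "HP1600": [-0.5, 0.4, 1.2],
    "QT": [-1.1, -0.3, 0.8],
    "PF": [-2.0, 0.6, 0.1],
}
lower, upper, sign = gap_band(gaps)
```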

Figure 1.

Euro-Area Output Gaps

Citation: IMF Working Papers 2001, 203; 10.5089/9781451874457.001.A001

Note: See footnote to Table 1 for definitions of the model abbreviations.
1/ The inefficiency wedge has been scaled by a factor of 10.
2/ The HP893 and HP3137 are similar to the HP 1600 and are not shown.
Table 1.

Output Gaps and Potential Growth Rates In 2000 From Different Methods

Note: Method abbreviations have been defined explicitly in the text. Broadly, the models are defined as follows: UC1 is the base Apel-Jansson unobservable components model, where the NAIRU follows a random walk and potential output follows a random walk with constant drift; UC2 contains random walks with stochastic trends; UC3 allows for hysteresis effects in the NAIRU; UC4 allows the NAIRU to follow a random walk with stochastic trend while potential output follows a random walk with constant drift; PF is the production function; IW is the GGL inefficiency wedge; BQ is the Blanchard-Quah method; CC is consumer confidence; BC is business confidence; CU is capacity utilization; LT is linear trend; QT is quadratic trend; HP 1600 is the Hodrick-Prescott filter (lambda = 1600); HP893 is the Hodrick-Prescott filter (lambda = 893); HP3137 is the Hodrick-Prescott filter (lambda = 3137); HPA1600 is the Hodrick-Prescott filter (lambda = 1600) with ARIMA backcasting and forecasting; BK is Baxter-King; MEAN is the unweighted mean; WMEAN is the weighted mean; and MEDIAN is the median.
Figure 2.

Minimum-Maximum Values Across Euro-Area Output Gaps 1/


1/ The maximum (minimum) values have been calculated at each point in time across all 20 output gaps.

Thus, the menu of available methodologies adds a considerable amount of uncertainty to the estimation of the output gap, in addition to the uncertainty involved in the estimation of any particular model—an issue consciously ignored in this paper.13 This makes the need to have an objective criterion for the selection of a specific methodology crucial for the achievement of the right policy mix. We tackle this problem in the remainder of the paper.

IV. Assessment Criteria

A. Dating the Euro Area Business Cycle

There is no official or generally agreed business cycle chronology for the euro area. For the US the NBER dates peaks and troughs, and the corresponding cycles, representing periods of expansion and contraction in the level of activity, have become known as classical business cycles. Bry and Boschan (1971) developed a mechanical method to emulate the NBER dating using a univariate method, and versions of it have been used on individual or groups of economies by King and Watson (1994), Watson (1994), Artis, Kontolemis, and Osborn (1997), Artis, Krolzig, and Toro (1999), among others.

To the best of our knowledge, no attempt has been made to date a euro-area business cycle using aggregate area-wide data.14 Therefore, we will apply a simplified version of the Bry-Boschan (BB) procedure on euro area GDP data to create a unique reference cycle for the euro area. We use the set of BB turning points as a reference to assess the dating performance of the different measures of the output gap found under the various detrending procedures. To ensure robustness and for comparison purposes, we will also use the dating obtained by Artis, Kontolemis, and Osborn (AKO) using industrial production data and by Artis, Krolzig and Toro (AKT) using a multi-country approach. The reference cycle dates emanating from both the AKO and AKT methods are broadly similar to our results using a euro-area aggregate real GDP series under the BB procedure (Table 2). Appendix I provides a detailed description of the three dating methodologies.

Table 2.

Chronology of Euro Area Business Cycles


Abbreviations are Bry-Boschan (BB), Artis, Kontolemis, and Osborn (AKO), and Artis, Krolzig, and Toro (AKT).

The cyclical duration is recorded as the number of quarters.

Change in real GDP during peak to trough contraction.

B. Rules for Dating Output Gaps

The procedure used to date the different measures of slack assumes cyclical movements around an underlying trend (growth cycles), and follows Canova (1999): a trough is defined as a situation where two consecutive declines in the cyclical component of GDP are followed by an increase, i.e., at time t, c_{t+1} > c_t < c_{t-1} < c_{t-2}. Similarly, a peak is defined as a situation where two consecutive increases in the cyclical component of GDP are followed by a decline, i.e., c_{t+1} < c_t > c_{t-1} > c_{t-2}.

The application of this procedure to the real GDP series yields essentially the same turning points as the BB method.15
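The dating rule of this subsection translates directly into a scan over the cyclical component. The sketch below is a literal reading of the two inequalities; the function name and the sample series are illustrative assumptions.

```python
def date_turning_points(c):
    """Return (peaks, troughs) as index lists under the two-quarter rule."""
    peaks, troughs = [], []
    # t needs two observations before it and one after it.
    for t in range(2, len(c) - 1):
        if c[t + 1] < c[t] > c[t - 1] > c[t - 2]:
            peaks.append(t)      # two increases followed by a decline
        elif c[t + 1] > c[t] < c[t - 1] < c[t - 2]:
            troughs.append(t)    # two declines followed by an increase
    return peaks, troughs


# Hypothetical output gap series (percent).
c = [0.0, 0.5, 1.0, 0.6, 0.1, -0.4, -0.2, 0.3, 0.9, 0.7]
peaks, troughs = date_turning_points(c)
# peaks == [2, 8], troughs == [5]
```

Because the rule looks only one quarter ahead, it reacts to short-lived reversals, which is consistent with the large number of turning points that some gap measures record.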

C. Inflation Performance Criteria

The inflation forecasting exercise was undertaken within the following standard Phillips curve framework:


where CPI inflation16 depends on lagged values of the inflation and of the output gap measure of interest. To ensure robustness of the analysis, two other variables were included in separate specifications: the price of oil, to account for the impact of supply shocks on inflation dynamics;


and the change in the output gap, to account for speed limit effects that may affect inflation dynamics in a situation of rapidly accelerating activity (see Lown and Rich (1997)).
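The three specifications referred to as (10a), (10b), and (10c) are not reproduced in the text. A sketch consistent with the verbal description is given below; the coefficient symbols, the lag lengths p, q, r, s, and the labels gap, oil, and Δgap are assumed notation. Whether (10c) also retains the oil term cannot be recovered from the text, so it is omitted here.

```latex
\begin{align}
\pi_t &= \alpha + \sum_{i=1}^{p} \beta_i\, \pi_{t-i} + \sum_{j=1}^{q} \gamma_j\, gap_{t-j} + \varepsilon_t \tag{10a} \\
\pi_t &= \alpha + \sum_{i=1}^{p} \beta_i\, \pi_{t-i} + \sum_{j=1}^{q} \gamma_j\, gap_{t-j}
         + \sum_{k=1}^{r} \delta_k\, oil_{t-k} + \varepsilon_t \tag{10b} \\
\pi_t &= \alpha + \sum_{i=1}^{p} \beta_i\, \pi_{t-i} + \sum_{j=1}^{q} \gamma_j\, gap_{t-j}
         + \sum_{m=1}^{s} \theta_m\, \Delta gap_{t-m} + \varepsilon_t \tag{10c}
\end{align}
```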


For each measure of slack we conducted a simulated forecasting exercise. We first estimate equations (10a), (10b) and (10c) with quarterly data from 1970Q1 to 1997Q4. For each quarter t from 1998Q1 to 2000Q4, we construct simulated forecasts of inflation over the next 12 quarters by estimating equations (10a), (10b) and (10c) in turn using all data available from 1970Q1 up to quarter t. We consider specifications of each equation with up to four lags for each of the regressors, thus considering 256 different specifications of each equation for each of the twenty different estimates of the output gap. For each specification, the Theil U forecast statistic is calculated, namely the ratio of the root mean square error of the forecast under the model to the root mean square error for a “no change” or naive forecast. Values of this statistic greater than 1 indicate that the naive forecast is superior to the regression specification.
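The mechanics of the exercise can be sketched with an expanding estimation window and the Theil U ratio. To keep the example short, a one-lag autoregression stands in for equations (10a) to (10c) and forecasts are one step ahead rather than 12 quarters ahead; both simplifications, the function names, and the artificial series are assumptions made for illustration.

```python
import math


def ols_ar1(y):
    """OLS of y_t on a constant and y_{t-1}; returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sxx
    return mz - b * mx, b


def rmse(actual, forecast):
    """Root mean square forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def theil_u(actual, model_fc, naive_fc):
    """RMSE of the model relative to the RMSE of the naive forecast."""
    return rmse(actual, model_fc) / rmse(actual, naive_fc)


def simulated_forecasts(pi, first_origin):
    """One-step forecasts; the forecast made at t uses only pi[0..t]."""
    actual, model_fc, naive_fc = [], [], []
    for t in range(first_origin, len(pi) - 1):
        a, b = ols_ar1(pi[: t + 1])   # re-estimate on data available at t
        actual.append(pi[t + 1])
        model_fc.append(a + b * pi[t])
        naive_fc.append(pi[t])        # "no change" forecast
    return actual, model_fc, naive_fc


# Artificial inflation path generated exactly by pi_{t+1} = 1 + 0.5 * pi_t.
pi = [10.0]
for _ in range(19):
    pi.append(1.0 + 0.5 * pi[-1])
actual, model_fc, naive_fc = simulated_forecasts(pi, first_origin=5)
u = theil_u(actual, model_fc, naive_fc)  # far below 1: the model is exact here
```

With real data the ratio is informative in the opposite direction as well: a value above 1 says that the gap-based regression did worse than simply assuming that inflation stays where it is.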

In order to avoid biasing the results because of the behavior of the series during the forecasting period, four different measures of inflation π_t were examined: (i) quarterly inflation, π_t^q = 100·[ln(p_t) - ln(p_{t-1})]; (ii) annual inflation, π_t^a = 100·[ln(p_t) - ln(p_{t-4})]; (iii) the change in annual inflation 4 quarters ahead, (π_{t+4}^a - π_t^a); and (iv) annual inflation 4 quarters ahead, (π_{t+4}^a). To fill out 2001 for the more forward looking measures, actual inflation values were used for the first two quarters, with WEO forecasted values used for the 3rd and 4th quarters. The top panel of Figure 3 contains these four inflation rate series over the full sample.17
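As an illustration, the four measures can be computed from a quarterly price index as log differences. The function name, the dictionary keys, and the sample index are assumptions; entries that cannot be computed at the start or end of the sample are returned as None.

```python
import math


def inflation_measures(p):
    """Return the four inflation series implied by a quarterly price index p."""
    n = len(p)
    lp = [math.log(v) for v in p]
    # (i) quarterly and (ii) annual inflation, in percent.
    pi_q = [None if t < 1 else 100.0 * (lp[t] - lp[t - 1]) for t in range(n)]
    pi_a = [None if t < 4 else 100.0 * (lp[t] - lp[t - 4]) for t in range(n)]
    # (iv) annual inflation four quarters ahead.
    pi_a_lead = [pi_a[t + 4] if t + 4 < n else None for t in range(n)]
    # (iii) change in annual inflation over the next four quarters.
    d_pi_a = [
        None if pi_a[t] is None or pi_a_lead[t] is None else pi_a_lead[t] - pi_a[t]
        for t in range(n)
    ]
    return {
        "quarterly": pi_q,
        "annual": pi_a,
        "annual_change_4q_ahead": d_pi_a,
        "annual_4q_ahead": pi_a_lead,
    }


# Hypothetical price index growing 1 percent per quarter.
p = [100.0 * 1.01 ** t for t in range(12)]
m = inflation_measures(p)
```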

Figure 3.

Inflation Series Used in the Forecasting Exercise


Sources: Staff estimates.

V. The Results

A. Dating of Business Cycles

The results of the exercise are reported in Table 3. A few basic statistical measures are presented first, with additional statistics on each detrending method in relation to a particular reference cycle presented as well. The maximum and minimum ranges vary across methods, with an average range between -1.7 percent and +2 percent. An extreme case is the UC1 model (where both potential output and the NAIRU are modeled as random walks), where the output gap varies from -6.1 percent to +6 percent. A possible explanation would be that, as suggested by Camba-Mendez and Rodriguez-Palenzuela (2001), the unemployment rate in the euro area, at least in this specific sample, is an I(2) series, whereas the UC1 specification models it as I(1), resulting in a somewhat downward trending or non-stationary output gap. Although debatable from a theoretical point of view, this issue is addressed with our model UC2, where unemployment is modeled as an I(2) series.

Table 3.

Business Cycle Statistics for the Euro Area Under Various Detrending Methods 1/

Note: See footnote to Table 1 for definition of the method abbreviations.

The table shows the results of comparing turning points determined by a simple dating rule of the cyclical components (or output gaps) stemming from the 20 detrending methods versus a reference cycle. The reference cycle is as shown in Table 2 and has been determined through an application of the Bry-Boschan algorithm. The following dating rule was applied to the cyclical components: a trough is defined as a situation where two consecutive declines in the cyclical component of GDP are followed by an increase, i.e., at time t, c_{t+1} > c_t < c_{t-1} < c_{t-2}. Similarly, a peak is defined as a situation where two consecutive increases in the cyclical component of GDP are followed by a decline, i.e., c_{t+1} < c_t > c_{t-1} > c_{t-2}.

A false alarm occurs if there is no reference turning point within ± 3 quarters of the signal date.

A missing signal occurs if the method does not signal a turning point within ± 3 quarters of the reference date.
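The dating rule in the table note can be illustrated with a short sketch. It is not the authors' code: the function name `find_turning_points` and the sample series are hypothetical, and the gap series is assumed to be a plain list of quarterly values.

```python
def find_turning_points(c):
    """Apply the simple dating rule to a cyclical component c (a list of floats).

    Trough at t: c[t+1] > c[t] < c[t-1] < c[t-2]  (two declines, then a rise)
    Peak at t:   c[t+1] < c[t] > c[t-1] > c[t-2]  (two rises, then a decline)
    Returns (peaks, troughs) as lists of positions in c.
    """
    peaks, troughs = [], []
    # t needs two earlier observations and one later observation.
    for t in range(2, len(c) - 1):
        if c[t - 2] > c[t - 1] > c[t] < c[t + 1]:
            troughs.append(t)
        elif c[t - 2] < c[t - 1] < c[t] > c[t + 1]:
            peaks.append(t)
    return peaks, troughs


# Hypothetical output gap series (percent of potential), for illustration only.
gap = [0.0, 0.5, 1.0, 0.4, -0.2, -0.9, -0.3, 0.2, 0.8, 0.1]
peaks, troughs = find_turning_points(gap)
print(peaks, troughs)
```

Because the rule needs only two consecutive moves in one direction, a noisy gap series can generate many turning points, which is relevant for the false alarm counts discussed in the text.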

All economic methods record average negative output gaps and, accordingly, show the economy spending most of its time below trend. Troughs identified with economic methods are significantly lower than those found under statistical, survey or thick modeling methods. This is also confirmed by the average amplitude of the contractions. Also, almost all gap measures record the euro-area economy as more likely than not to be in an expansionary phase. Thus one could categorize the euro area as a relatively slow growing economy, which is generally producing at a point below full potential.

The number of peaks and troughs recorded by the different gap measures under our dating rule tends to be relatively high, in double digits in many cases. Hodrick-Prescott methods report a total of 26 peaks and troughs, or 13 complete cycles, while the production function yields 10 peaks and troughs, or 5 complete cycles. Next we discuss a few summary statistics of performance for each measure of slack under the three different reference cycles. Specifically, we report summary statistics on false alarms and missing signals as in Canova (1999). A signal is classified as false if no reference cycle turning point appears within a ± 3 quarter interval around the signal date. A signal is classified as missing if no signal appears within a ± 3 quarter interval around the actual reference cycle turning point.
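As a rough sketch of how the two statistics could be computed for one type of turning point (peaks or troughs), consider the following. The function name `signal_rates` is hypothetical, and the denominators are an assumption: false alarms are expressed as a share of the signals issued, and missing signals as a share of the reference turning points. The dates are illustrative quarter indices, not data from the paper.

```python
def signal_rates(signals, reference, window=3):
    """Return (false_alarm_pct, missing_signal_pct) for one turning-point type.

    signals:   quarter indices at which the gap measure signals a turning point
    reference: quarter indices of the reference cycle turning points
    window:    tolerance in quarters (the text uses +/- 3)
    """
    # A signal is false if no reference turning point lies within the window.
    false_alarms = [s for s in signals
                    if not any(abs(s - r) <= window for r in reference)]
    # A reference turning point is missed if no signal lies within the window.
    missing = [r for r in reference
               if not any(abs(s - r) <= window for s in signals)]
    false_pct = 100.0 * len(false_alarms) / len(signals) if signals else 0.0
    missing_pct = 100.0 * len(missing) / len(reference) if reference else 0.0
    return false_pct, missing_pct


# Hypothetical peak dates, for illustration only.
false_pct, missing_pct = signal_rates([4, 12, 20, 31, 40], [5, 21, 30])
print(false_pct, missing_pct)
```

A measure that signals many turning points will rarely miss a reference date but will tend to produce more false alarms, so the two percentages are best read together.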

A worst case scenario would be a gap measure providing a false or missing signal 100 percent of the time on both types of turning points. The sum of these results would be 200 percent for peaks and 200 percent for troughs, or 400 percent for the total. Gap measures that are more in line with the reference cycle would record lower values; those that correctly identify all peaks and troughs would record 0 percent false alarms and missing signals for both peaks and troughs. At the bottom of Table 3, we provide a ranking of each gap measure against each reference cycle comparator using these summary statistics. The average of these rankings provides a rough guide to the better performing gap measures. Although there are exceptions, one feature that stands out among these rankings of the various detrending methods is their relative uniformity across different reference cycles. Taken together, this supports the results of this metric: turning point outcomes are not overly sensitive to the reference cycle used.
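The scoring and ranking logic described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure behind Table 3: the function names, the method and reference cycle labels, and the scores are all hypothetical, and ties are broken by input order rather than by any rule from the paper.

```python
def total_score(peak_false, peak_missing, trough_false, trough_missing):
    """Sum of the four percentages; 0 is a perfect match, 400 the worst case."""
    return peak_false + peak_missing + trough_false + trough_missing


def average_ranks(scores_by_cycle):
    """Rank methods within each reference cycle (1 = lowest score), then average.

    scores_by_cycle: {cycle_label: {method_label: total_score}}
    Returns {method_label: mean rank across reference cycles}.
    """
    ranks = {}
    for scores in scores_by_cycle.values():
        ordered = sorted(scores, key=scores.get)  # lowest total score first
        for rank, method in enumerate(ordered, start=1):
            ranks.setdefault(method, []).append(rank)
    return {m: sum(r) / len(r) for m, r in ranks.items()}


# Hypothetical scores for three methods under two reference cycles.
scores = {
    "ref_1": {"method_a": 40.0, "method_b": 120.0, "method_c": 90.0},
    "ref_2": {"method_a": 60.0, "method_b": 50.0, "method_c": 200.0},
}
print(total_score(100.0, 100.0, 100.0, 100.0))
print(average_ranks(scores))
```

Averaging ranks rather than raw scores keeps one unusually poor result under a single reference cycle from dominating the comparison, at the cost of discarding information on how large the differences are.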

With a sizable number of turning points signaled by the detrending methods, it is not surprising that the percentage of missing signals is extremely low across all reference cycles. This contrasts with Canova's work on U.S. data using NBER and DOC reference cycles, where he finds numerous missing signals across a variety of detrending methods. Overall, the results suggest that the euro area business cycle is smoother than the U.S. business cycle.

Naturally there are more false alarms given the number of identified troughs and peaks from the detrending methods. Our detrending rankings are more or less similar across the three reference cycles, with the one exception of the production function. Under both the AKO and AKT reference cycle comparators, the production function was able to dramatically improve its ability to correctly identify cyclical turning points. Under the AKO reference cycle, the production function did not miss any peak or trough signal and only 20 percent of the time did it provide a false alarm for peaks or troughs. Some of this improvement is due to the smaller number of turning points identified using the production function resulting in fewer false alarms.

Overall, the best measures to identify cyclical turning points are the quadratic trend and the production function. Consumer confidence, the UC4 model and the inefficiency wedge measure were the least effective in capturing turning points.

B. Inflation Forecasting

Results of the inflation forecasting exercise under the three different specifications of the forecasting equation and the four different definitions of inflation are presented in Tables 4-6. Nearly all the output gap measures are useful for forecasting quarterly inflation, with Theil U statistics generally below 1. Results vary for other measures of inflation, but overall the worst results were obtained when forecasting annual inflation four quarters ahead, perhaps the most interesting variable from a monetary policy point of view. Statistical methods generally perform poorly, and this result is robust across all three specifications of the forecasting equation and all four measures of inflation. In terms of the different output gap measures, the most successful are the inefficiency wedge (IW), the UC3 model (Phillips curve with hysteresis effects) and the quadratic trend.
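For readers unfamiliar with the statistic, a Theil U of the kind reported in these tables is commonly computed as the ratio of the root mean squared error of the model forecast to that of a benchmark forecast, so that a value below 1 means the gap-based forecast beats the benchmark. The sketch below assumes that definition; the benchmark series, function names, and numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, forecast):
    """Root mean squared forecast error over paired observations."""
    errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(errors) / len(errors))


def theil_u(actual, model_forecast, benchmark_forecast):
    """Ratio of model RMSE to benchmark RMSE; below 1 favors the model."""
    return rmse(actual, model_forecast) / rmse(actual, benchmark_forecast)


# Hypothetical inflation outcomes and forecasts (percent), for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
model = [1.0, 2.0, 3.0, 5.0]       # forecast using an output gap measure
benchmark = [2.0, 3.0, 4.0, 5.0]   # assumed benchmark forecast
print(theil_u(actual, model, benchmark))
```

The statistic is only as informative as the benchmark it is measured against, so comparisons across tables are meaningful only when the benchmark is held fixed.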

Table 4.

Theil U Statistics from Phillips Curve Forecasting Exercise Using Equation 10a 1/

Note: See footnote to Table 1 for definition of the method abbreviations.

The inflation regression includes lags of inflation and output gaps as explanatory variables.

Table 5.

Theil U Statistics from Phillips Curve Forecasting Exercise Using Equation 10b 1/