This Time They Are Different: Heterogeneity and Nonlinearity in the Relationship Between Debt and Growth1
  • International Monetary Fund

We study the long-run relationship between public debt and growth in a large panel of countries. Our analysis takes particular note of theoretical arguments and data considerations in modeling the debt-growth relationship as heterogeneous across countries. We investigate the issue of nonlinearities (debt thresholds) in both the cross-country and within-country dimensions, employing novel methods and diagnostics from the time-series literature adapted for use in the panel. We find some support for a nonlinear relationship between debt and long-run growth across countries, but no evidence for common debt thresholds within countries over time.


1 Introduction

The presence of a common threshold or turning point beyond which the detrimental impact of debt on growth is significant, or significantly increases, is currently taken as a given within many policy circles. In the United States, although many political battles impinge on the Congressional debate over the debt ceiling and the resulting government shutdown of October 2013, this state of affairs at least in part reflects a widespread belief that ‘debt is dangerous’ and that fiscal austerity represents the only way towards restoring sustainable growth for the world’s largest economy. In the United Kingdom, Chancellor George Osborne displays a similar sentiment when telling his annual party conference that dealing with the repercussions of the financial crisis is not over “[u]ntil we’ve fixed the addiction to debt that got this country into this mess in the first place” (official Conservative Party Conference speech, Manchester, September 30, 2013). These convictions and the ensuing actions were strongly influenced by the work of Carmen Reinhart and Kenneth Rogoff, who were among the first to suggest a debt-to-GDP threshold of around 90% beyond which economic growth is seriously affected by the debt burden (Reinhart & Rogoff, 2009, 2010a,b, 2011; Reinhart, Reinhart & Rogoff, 2012).1

This study is not about the analysis in Reinhart & Rogoff (2010b) – which is primarily descriptive – but about the substantial empirical literature these authors point to as a means of support for their findings (Kumar & Woo, 2010; Cecchetti, Mohanty & Zampolli, 2011; Checherita-Westphal & Rother, 2012). We build on this literature and approach the issue of nonlinearity in the debt-growth relationship with a number of alternative empirical strategies which enable us to distinguish a nonlinearity across countries from a within-country nonlinearity, a key distinction which has so far been entirely absent from the empirical literature. Identifying a within-country threshold effect would indeed inform policy makers of the presence of a country-specific tipping point, which may guide macroeconomic policies and fiscal adjustments. If debt-growth dynamics differ across countries, then the assumption of a common threshold across countries, as is the practice in the existing literature, leads to one-size-fits-all policies which are misleading at best and growth-retarding at worst.

We analyse the debt-growth nexus within a standard neoclassical growth model for aggregate economy data, employing an empirical framework which allows for different long-run equilibrium relationships between debt and growth across countries, while simultaneously accounting for short-run effects and the impact of unobserved global shocks and local spillover effects.2 Using total public debt data from 105 developing, emerging and developed economies over the period 1972 to 2009, we find that long-run debt coefficients differ across countries and provide tentative evidence that countries with higher average debt-to-GDP ratios are more likely to see a negative effect on their long-run growth performance. We cannot, however, find any evidence that a specific debt threshold common to all countries triggers a systematic parameter shift for individual countries, as is widely suggested in the existing literature.

Four features of our empirical approach distinguish this study from the literature on debt and growth. First, we employ a flexible dynamic empirical framework which allows us to distinguish the long-run from the short-run relationship between debt and growth. We estimate the long-run and short-run parameters in a standard error correction model (ECM), test for the existence of a long-run equilibrium relationship (cointegration) and investigate concerns over endogeneity in the panel using recent panel time series methods.

Second, we put particular emphasis on modelling the debt-growth relationship as potentially differing across economies in an a priori unspecified way. Given theoretical arguments for structural (parameter) differences across countries, model and specification uncertainty, as well as serious shortcomings in the available data on public debt, we argue that flexibility in the cross-section dimensions of our panel econometric framework represents a crucial requirement and considerable strength when investigating complex entities such as national economies.

Third, we identify the structural parameters of the relationship between debt and growth by accounting for the distorting impact of cross-section dependence in the form of unobserved global shocks and local spillover effects,3 both of which are likely to affect different economies in the sample to a different extent.

Fourth, we investigate the issue of nonlinearities in both the cross- and within-country dimensions, employing novel approaches and diagnostics from the time-series literature adapted for use in the panel. The presence of a nonlinearity across countries is studied by estimating a heterogeneous dynamic ECM and subsequently analysing the cross-country patterns of short-run and long-run debt coefficients. Our analysis of the within-country type of nonlinearity includes two approaches: (i) we investigate an asymmetric dynamic model, where we pick a range of threshold values, including the 90% debt-to-GDP ratio, as potential ‘tipping point’ for our debt-growth analysis; (ii) we present results from static regression models with squared and cubed debt terms. These empirical specifications are informed by testing procedures for variable summability, as well as for balance and co-summability of the empirical specifications: since integration and cointegration are linear concepts we cannot apply these conventional tests to diagnose our nonlinear empirical models and instead adapt these novel time series methodologies for the panel.

The theoretical foundations for a negative and/or possibly non-linear relationship between debt and growth are rather tenuous (see Panizza & Presbitero, 2013, for a recent survey). While some models arrive at a negative long-run relationship (Elmendorf & Mankiw, 1999) which may be more pronounced if higher debt stocks lead to uncertainty or expectations of future financial repression (Cochrane, 2011), there are alternatives which suggest that in the presence of wage rigidities and unemployment this negative relationship disappears (Greiner, 2011). A nonlinearity or debt threshold can be motivated in developing countries by the presence of debt overhang (Krugman, 1988; Sachs, 1989) but it is difficult to extend this argument to advanced economies. Nonlinearities may also arise if there is a tipping point of fiscal sustainability (Ghosh et al., 2013; Greenlaw et al., 2013). However we are not aware of any theoretical models incorporating such debt tipping points in a growth framework.

Given the recent interest in this topic, our paper is naturally far from alone in studying the effect of the fiscal stance on growth in a cross-country regression framework (recent studies include Cordella, Ricci & Ruiz-Arranz, 2010; Kumar & Woo, 2010; Cecchetti, Mohanty & Zampolli, 2011; Checherita-Westphal & Rother, 2012; Panizza & Presbitero, 2012) – we provide a synthetic review of this literature in a Technical Appendix.4 Although individually quite rich in empirical results and proposed robustness checks, four features can broadly distinguish the analysis in these existing studies: (a) the data used (external or total debt) and country coverage (Euro area, OECD economies, developing countries, or emerging and developed countries); (b) the modelling of the hypothesised debt-growth nonlinearity/threshold (linear and squared debt terms in the regression, spline regression using preconceived thresholds, endogenous threshold regression); (c) the proposed time horizon of the results (short-run, long-run debt-growth relationship) depending on static or dynamic empirical specifications or, supposedly, the use of time-averaged or annual data; and (d) the identification strategy (standard IV/2SLS estimators, Arellano & Bond (1991)-type estimators). None of these studies however address more than one or arguably two of the four features we highlighted above (long-run versus short-run, cross-section dependence, cross-country heterogeneity, nonlinearity and asymmetry in integrated and cross-sectionally dependent macro panels), which we argue are of great importance for identification, analysis and interpretation.

In empirical spirit this study is closest to that of Kraay & Nehru (2006, p.342), who investigate debt sustainability and argue that “a common single debt sustainability threshold is not appropriate because it fails to recognize the role of institutions and policies that matter for the likelihood of debt distress”. In a similar vein, Reinhart, Rogoff & Savastano (2003) and Reinhart & Rogoff (2010c, p. 24) suggested that “debt thresholds are importantly country-specific”, while recent papers emphasise the heterogeneity of the debt-growth nexus across countries (Kourtellos, Stengos & Tan, 2014). Within the wider growth empirics literature, we add to the recent work employing more flexible empirical specifications to account for cross-country correlations in order to identify the substantive relationship of interest (Pedroni, 2007; Eberhardt, Helmers & Strauss, 2013; Eberhardt & Teal, 2014), adopting empirical methods from the panel time series literature (Pesaran, 2006; Kapetanios, Pesaran & Yamagata, 2011; Chudik & Pesaran, 2013). We also provide methodological innovation in transferring the asymmetric cointegration framework (Shin, Yu & Greenwood-Nimmo, 2013) from the single time series setup to the panel and similarly for the analysis of summability, balance and co-summability (Berenguer-Rico & Gonzalo, 2013a,b).

The remainder of this article is organised as follows: Section 2 considers how the complexities of the economic theory and data realities should inform our empirical analysis. Section 3 describes our data and provides an overview of the econometric methods we apply. In Section 4 we present our empirical results and detailed analysis of heterogeneity and nonlinearity in the debt-growth relationship across and within countries. Section 5 concludes.

2 Linking Theory and Empirics

2.1 Commonality and Heterogeneity

Two aspects of our approach are related to the modelling of empirical processes as common or different across countries. However, our interpretation of ‘common’ is somewhat different from what one may expect, in that we are concerned about common shocks (examples include the 1970s oil crises or the recent global financial crisis) and their distorting impact on identifying the debt-growth nexus in the data.

We start by providing some simple descriptive analysis highlighting the cross-sectional dependence of debt accumulation across countries. The data and sources are described in detail in Section 3.1 and the Data Appendix. Figure 1 provides a histogram for the years in which countries in our sample reach their debt-to-GDP ratio peak: although there is some heterogeneity as to the sample coverage for this period, it is notable that in over one-third of countries these peaks occurred in only three years, namely 1985, 1994 and 2009. Given that the data stretches over forty years, it is a remarkable indication of common effects across countries that the debt-to-GDP ratio peaks are clustered around a much smaller number of dates.

A second illustration, provided in Figure 2, links the debt-to-GDP ratio peaks for each country to the deviation of the per capita GDP growth rate in the ‘peak years’ (defined ad hoc as running from two years prior to two years after the debt-to-GDP maximum) from that of the full time horizon (excluding the five peak years).5 We again highlight observations for the three years 1985, 1994 and 2009, as well as a small number of outliers. We can make a number of observations regarding this crude depiction of our empirical relationship of interest: first, there seems to be a negative correlation between the maximum debt level and relative growth performance between peak debt and other years. However, this negative relationship is not statistically significant (linear regression result reported in the figure footnote). Second, the figure highlights considerable heterogeneity across countries: for instance, among the countries for which debt-to-GDP peaked in 1994 (blue squares), one country experienced growth at around 2% above its growth rate in all other years, while another country experienced a ‘peak years’ average growth rate which was 4% lower.6 Third, and perhaps most important with a view to the present debate in the literature, we note the dashed vertical line marking a debt-to-GDP ratio of 90%: a considerable number of countries had better growth performance in their peak debt years than at any other point since 1972, even at what some commentators refer to as ‘dangerous’ levels of debt.
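The growth deviation plotted against peak debt can be sketched in a few lines. This is a minimal illustration of the ad hoc definition above, not the authors' code: the function name `peak_growth_gap`, the tie-breaking rule (first maximum) and the toy series are assumptions made here.

```python
from statistics import mean


def peak_growth_gap(years, debt_ratio, growth, half_window=2):
    """Return (peak_year, peak_debt, gap) for one country.

    gap = mean growth in the window around the debt peak
          minus mean growth in all remaining years.
    """
    # Position of the maximum debt-to-GDP ratio (first one if tied).
    peak_pos = max(range(len(debt_ratio)), key=lambda i: debt_ratio[i])
    peak_year = years[peak_pos]

    in_window, outside = [], []
    for year, g in zip(years, growth):
        if abs(year - peak_year) <= half_window:
            in_window.append(g)   # the (up to) five 'peak years'
        else:
            outside.append(g)     # every other year in the sample

    if not outside:
        raise ValueError("no observations outside the peak window")
    return peak_year, debt_ratio[peak_pos], mean(in_window) - mean(outside)


# Toy series for a single hypothetical country (not real data).
years = list(range(2000, 2010))
debt = [40, 45, 50, 60, 95, 80, 70, 60, 55, 50]               # % of GDP
growth = [3.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0]   # % per year

peak_year, peak_debt, gap = peak_growth_gap(years, debt, growth)
```

In this toy case debt peaks in 2004 and growth in the five peak years is two percentage points below growth in the remaining years, so the country would sit below the zero line in a plot of this kind.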

In order to move away from matching single debt observations and average growth rates, we provide further descriptive analysis using interquartile ranges (IQR) for debt and growth. Here we focus on the cross-section variation in these two variables over time, further differentiating countries by income levels (high, middle, low). Figure 3 provides IQRs for debt (grey shading, right axis; three debt peak years highlighted in black) and growth (black whiskers, left axis) across all countries and the three income categories. Note that with the exception of 2009 in the High-Income Country sample, none of the three debt-peak years highlighted before looks in any way remarkable in terms of either the debt or the growth distribution: clearly, while there is some commonality in the timing of debt peaks across a number of countries, there are also other countries for which these years are in no way remarkable, thus reducing the spread of the debt IQR, a clear indication of the heterogeneity of the relationship between debt levels and growth. We can however deduce a pattern whereby the growth rate distribution seems to follow an inverted U-shape over time (this is apparent in the full sample and all three sub-samples), while the distribution of debt, perhaps apart from an initial decline in the early 1970s, does not show any clearly discernible patterns.

In econometric terms we are interested in accounting for the impact of ‘cross-section dependence’ in both the unobservable and the observable parts of our empirical model. The conventional empirical approach adopted in the literature, however, assumes cross-section independence in the panel, i.e. that regression residuals show no systematic patterns of correlation across countries. The problems arising from such correlation are well-known in the econometric literature (Phillips & Sul, 2003; Andrews, 2005; Pesaran, 2006; Bai, 2009; Pesaran & Tosetti, 2011) but have found only comparatively limited recognition in applied work. We briefly sketch the standard framework from the panel time series literature, the common factor model, and indicate the identification problem arising if common shocks are present. For simplicity we adopt a static model with a single covariate x and a single unobserved common factor f with heterogeneous factor loadings λi:


Cross-sectional dependence arises from this error structure and further from the assumption (or rather generalisation) that the same unobserved factors are also affecting the evolution of the covariate (εit above and eit below are stochastic shocks):


Applied to a discussion of the debt-growth nexus, this setup suggests, quite uncontroversially, that there are unobservable time-invariant (ψi) and time-varying (ft) processes driving output (y), possibly including geography or climate amongst the former and institutions, business environment or intangible capital amongst the latter. ft also represents shocks, such as the 1970s oil crises, which affect all countries in the world (albeit to a different extent) and more localised spillover effects, e.g. productivity spillovers between neighbouring countries. Further, these unobserved processes are also suggested to affect the evolution of the determinants of output (x), including in our model the stock of debt. This is a particularly salient point given the recent experience of the global financial crisis and ensuing debt crises for a number of European economies.

We can now illustrate how the parameter of interest (βi or an average thereof) is unidentified unless the unobserved common factors are accounted for. Solving equation (2) for ft and plugging into (1) obtains
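A minimal sketch of this substitution follows. It assumes that equation (1) has the form implied by the description above, and it writes equation (2) with an intercept $\varrho_i$ and a factor loading $\kappa_i$. Both symbols are placeholders chosen here rather than the paper's own notation, and $\kappa_i \neq 0$ is assumed so that (2) can be solved for $f_t$:

```latex
\begin{align}
y_{it} &= \beta_i x_{it} + u_{it}, \qquad
          u_{it} = \psi_i + \lambda_i f_t + \varepsilon_{it} \tag{1} \\
x_{it} &= \varrho_i + \kappa_i f_t + e_{it} \tag{2} \\
f_t    &= \kappa_i^{-1}\,\bigl(x_{it} - \varrho_i - e_{it}\bigr) \notag \\
y_{it} &= \underbrace{\bigl(\beta_i + \lambda_i \kappa_i^{-1}\bigr)}_{\zeta_i}\, x_{it}
          + \bigl(\psi_i - \lambda_i \kappa_i^{-1} \varrho_i\bigr)
          + \bigl(\varepsilon_{it} - \lambda_i \kappa_i^{-1} e_{it}\bigr) \notag
\end{align}
```

In this sketch the coefficient on $x_{it}$ mixes the structural parameter with the ratio of the two factor loadings, so it equals $\beta_i$ only if $\lambda_i = 0$.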


where in principle ζi ≠ βi. This idea extends to multiple factors and the multivariate context: if the unobservable ft is merely a ‘weak’ factor (representing only local spillovers between a small number of countries) then the estimate of the βi coefficients or their average may not be seriously biased; however, if we have multiple factors of the ‘weak’ and ‘strong’ type (the latter affecting all countries in the sample), the β coefficient is not identified.7 We can extend this setup by arguing for the following relationship in any observable variable which potentially could be employed as an instrument:


Some observable z is correlated with x and is thus potentially (depending on φi) an informative instrument, but at the same time it is correlated via ft with the unobservables in equation (1) and is therefore invalid. The notion that a small number of unobserved common factors drive all the macroeconomic variables underlies the application of principal component analysis in the macro forecasting literature (e.g. Stock & Watson, 2002) and thus does not seem far-fetched at all. The same sentiment is expressed in recent work of applied economists investigating macro panel data (Durlauf, Johnson & Temple, 2005; Clemens & Bazzi, 2013) and the general lack of robustness of IV results in the cross-country growth literature has seriously weakened this literature.

Finally, even ignoring the instrumentation issue, we can see that the specification in equation (1) also creates difficulties for standard pooled estimators which lead to heterogeneity bias (Pesaran & Smith, 1995). One problem here is that pooling introduces data dependencies in the residual terms if variable series are integrated, a data property typically assigned to macro data such as those employed in the analysis of the debt-growth nexus (Lee, Pesaran & Smith, 1997; Pedroni, 2007; Bond, Leblebicioglu & Schiantarelli, 2010). Heterogeneity misspecification enters linear combinations of integrated variables into the error term, raising the potential for spurious regression (Kao, 1999; Phillips & Sul, 2003). Existing research has found very different results when moving away from full sample analysis in homogeneous parameter regression models and investigating sub-samples along geographic, institutional or income lines (International Monetary Fund, 2012; Kourtellos, Stengos & Tan, 2014).

There are a number of reasons to assume the equilibrium relationship between debt and growth differs across countries. First, in line with the ‘new growth’ literature (see Temple, 1999) production technology may differ across countries, and in the same vein the relationship between debt and growth.8 Second, vulnerability to public debt depends not only on debt levels, but also on debt composition (Inter-American Development Bank, 2006). Unfortunately, existing data for the analysis of debt and growth represent a mixture of information relating to general and central government debt, debt in different denominations and with different terms attached (be they explicit or implicit). All of this implies that comparability of the debt data across countries may be compromised (Panizza & Presbitero, 2013). In addition, even assuming that debt stocks are comparable across countries and over time, the possible effect of public debt on GDP may depend on the reason why debt has been accumulated and on whether it has been consumed or invested (and in which economic activities). Third, different stocks of debt may impinge differently on economic growth. In particular, one could argue that debt could hinder GDP growth when it becomes unsustainable, affecting interest rates and triggering a financial crisis, thus affecting the level of GDP. However, the capacity to tolerate high debts depends on a number of country-specific characteristics, related to past crises and the macro and institutional framework (Reinhart, Rogoff & Savastano, 2003; Kraay & Nehru, 2006; Manasse & Roubini, 2009). The argument of heterogeneity in the debt-growth relationship is a simple extension, which provides greater modelling flexibility and further allows for empirical testing of the validity of this assumption via residual diagnostics.

2.2 Heterogeneity and Nonlinearity

Following the standard strategy in the microeconomic literature, many empirical studies on the debt-growth nexus either include squared debt terms or use spline specifications in their empirical framework to capture the heterogeneous impact of debt across different levels of indebtedness (recent examples include Cordella, Ricci & Ruiz-Arranz, 2010; Pattillo, Poirson & Ricci, 2011; Checherita-Westphal & Rother, 2012). It is notable that this specification is part of a model which assumes common parameters across countries (pooled model), whereas we have just developed a number of arguments why we may want to investigate the possibility that each country follows a different long-run relationship between debt and growth. Given this innovation, the notion of a non-linearity and/or debt threshold becomes a question about the appropriate data dimension: does the nonlinearity distinguish the debt-growth nexus across different countries or within countries over time?

Beginning with the former, Haque, Pesaran & Sharma (1999) provide a detailed discussion of the consequences of neglected parameter heterogeneity in the context of static and dynamic cross-country savings regressions. They particularly indicate the potential for seeming nonlinear relations as a result of mis-specification, concluding that “[t]he linearity hypothesis may be rejected not because of the existence of a genuine nonlinearity between yit and xit, but due to slope heterogeneity” (Haque, Pesaran & Sharma, 1999, p.11).9

We provide a number of illustrations regarding the potential for heterogeneity misspecification in the debt-growth relationship. Figure 4 plots a fractional polynomial regression line (as well as a 95% confidence interval) for per capita GDP against the debt-to-GDP ratio (both variables in logs) – the former is taken in deviation from the country-specific means (‘within’ transformation) to take account of different income levels across countries and thus focus on changes relative to the country mean.10 As can be seen there is clearly a nonlinear relationship between these two variables, in line with the standard arguments advanced by Reinhart and Rogoff as well as many others discussed above, with a ‘threshold’ of 4.5 log points (equivalent to 90% debt-to-GDP) a distinct possibility: a higher debt burden is associated with lower per capita GDP, although this is obviously not a statement regarding causality. In a second plot in the same figure we add the actual observations for this regression in the form of a scatter graph; the intention here is to cast some doubt over the ‘very obvious’ nonlinear relationship just discussed. In a third plot we provide country-specific fractional polynomial regression lines for all countries in our sample, while a fourth plot randomly selects thirty countries from the previous plot. In our view this highlights that the seeming nonlinearity under a pooled empirical model (black regression line and shaded confidence intervals) is far from obvious when we assume an empirical model which allows the relationship to differ across countries.
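The correspondence between 4.5 log points and a 90% ratio, and the ‘within’ transformation, can be checked with a short sketch. It assumes the debt-to-GDP ratio is measured in percent (90 rather than 0.9), and the GDP series is a made-up example rather than data from the sample.

```python
import math
from statistics import mean

# A 90% debt-to-GDP ratio, with the ratio measured in percent (assumption).
threshold_pct = 90.0
threshold_log = math.log(threshold_pct)   # about 4.4998, i.e. roughly 4.5
back_to_pct = math.exp(4.5)               # about 90.02


def within_transform(values):
    """Subtract the country-specific mean from each observation."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical log per capita GDP series for one country (not real data).
log_gdp_pc = [8.0, 8.1, 8.3, 8.4]
demeaned = within_transform(log_gdp_pc)   # deviations sum to zero
```

After the transformation each country's series is centred on zero, so the pooled plot compares movements around country means rather than income levels.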

Our descriptive analysis thus suggests that the raw data (adopting levels variables to elicit the long-run relationship) shows a clear non-linearity or threshold between the debt-to-GDP ratio and income at around 90% debt burden, provided we assume that all countries in the sample follow the same equilibrium path. However, relaxing this assumption in line with the motivation provided in the previous section seriously challenges this conclusion.

Of course this form of descriptive analysis is highly stylised, not to mention that there are other determinants of economic development and that such plots cannot provide any insights into any potentially causal relationship, be it from debt to growth or vice versa. Although our discussion is by no means conclusive, we feel that the illustrations provided above cast some doubt over the stringent implicit assumptions adopted in most of the existing literature: first, that we can carry out empirical analysis assuming that correlation across countries does not matter when running standard panel regression analysis which assumes cross-section independence. Second, the assumption that all countries, regardless of their level of economic development, their industrial structure or institutional environment, follow the same equilibrium relationship between debt and growth. Third, the notion that all countries are subject to the same debt threshold, beyond which growth is affected detrimentally, which is econometrically implemented by use of exogenous or endogenous debt thresholds or by adopting a polynomial specification for debt within a pooled empirical model, thus providing no insights whether the nonlinearity hides heterogeneity across countries or heterogeneity within countries over time.

Thus our empirical analysis of nonlinearity in the debt-growth nexus begins by considering a nonlinearity across countries. We adopt standard linear regression models, albeit of a fashion which accounts for both observed and unobserved heterogeneity. Identification of the long-run and short-run coefficients on debt is achieved by use of the Pesaran (2006) common correlated effects (CCE) estimator, which accounts for the presence of unobserved heterogeneity through a simple augmentation of the regression equation. Due to the dynamic setup and thus the presence of a lagged dependent variable it is necessary to adjust this augmentation following the suggestions in Chudik & Pesaran (2013). We then analyse the relationship between the estimated long-run coefficients and country-specific averages of debt levels, of debt-to-GDP ratios as well as peak debt-to-GDP ratios.
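The augmentation idea can be illustrated on simulated data. The sketch below is a static mean group comparison only: it has no lagged dependent variable and none of the Chudik & Pesaran (2013) adjustments, and every parameter value, the data-generating process and the helper names `solve` and `ols` are assumptions made for illustration. It uses only the standard library; applied work would use a linear algebra package.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back-substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (no intercept added)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


rng = random.Random(0)
N, T, BETA = 30, 40, 1.0                               # assumed design
f = [rng.gauss(0.0, 2.0) for _ in range(T)]            # unobserved common factor
xs, ys = [], []
for _ in range(N):
    lam = rng.uniform(0.5, 1.5)                        # loading of f in y
    kap = rng.uniform(0.5, 1.5)                        # loading of f in x
    x = [kap * f[t] + rng.gauss(0.0, 1.0) for t in range(T)]
    y = [BETA * x[t] + lam * f[t] + rng.gauss(0.0, 0.5) for t in range(T)]
    xs.append(x)
    ys.append(y)

ones = [1.0] * T
x_bar = [sum(x[t] for x in xs) / N for t in range(T)]  # cross-section averages
y_bar = [sum(y[t] for y in ys) / N for t in range(T)]

# Mean group estimates: average the country-specific slopes on x.
naive = sum(ols(y, [x, ones])[0] for x, y in zip(xs, ys)) / N
cce = sum(ols(y, [x, ones, y_bar, x_bar])[0] for x, y in zip(xs, ys)) / N
```

In this design the unaugmented country regressions attribute part of the factor's effect to x, so `naive` lies well above the true slope of 1, while the estimate augmented with cross-section averages, `cce`, stays close to it.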

Next, we consider nonlinearity in the debt-growth nexus at the country-level. We are not the first to consider such an empirical setup: Caner, Grennes & Koehler-Geib (2010) argue that provided a debt-threshold exists, this would arguably differ across countries given the heterogeneity in financial market development, openness, institutional development amongst other causes. Kourtellos, Stengos & Tan (2014) argue that if there exist heterogeneities in the debt-growth relationship (thresholds) then there may be other nonlinearities inherent in the empirical model employed to investigate this phenomenon. They find that while there does not exist a generic threshold or tipping point beyond which debt has a detrimental effect on growth, there does exist such a threshold determined by countries’ level of democracy. The main concern for our empirical analysis here is the most appropriate specification with regard to the time-series properties of the data: reliable inference on a relationship involving variable series which are nonstationary involves establishing that these variables are cointegrated, and within both time series and panel time series econometrics a number of alternative approaches are available to test for this property. Crucially, however, cointegration defines a linear combination of variables integrated of order one (in our case) which is stationary (i.e. integrated of order zero). Difficulties for the analysis of potentially nonlinear relationships such as that between debt and growth arise given that the order of integration of the square or cube of an integrated variable is not defined within the linear integration and cointegration framework. We apply novel methods on the order of summability and the concept of co-summability from the time series econometric literature (Berenguer-Rico & Gonzalo, 2013a,b) to provide pre-estimation testing as to the validity of our empirical equation incorporating country-specific nonlinearities. To the best of our knowledge our study is the first to adopt these methods in the panel context, further addressing the concerns over cross-section dependence.

We adopt two approaches to investigating a nonlinearity at the country level: first we employ the nonlinear dynamic model by Shin, Yu & Greenwood-Nimmo (2013), where following selection of an exogenously given threshold (we focus on 52%, 75% and 90% in the debt-to-GDP ratio) we are able to investigate heterogeneous growth regimes (below and above the threshold) whilst accounting for cross-section dependence. Informed by our (co-)summability analysis our second approach will employ the familiar microeconometric practice of including polynomial terms of the debt stock variable in a static regression model whilst accounting for cross-section dependence.
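The asymmetric framework of Shin, Yu & Greenwood-Nimmo (2013) works with partial sums of a regressor. One way to build regime-specific debt variables for a given threshold is sketched below. The rule used here (each period's change in log debt is assigned to the regime indicated by that period's debt-to-GDP ratio) is an assumption of this sketch and may differ from the construction used in the estimation; the function name and the toy series are likewise hypothetical.

```python
def regime_partial_sums(log_debt, debt_to_gdp, threshold):
    """Split cumulative changes in log debt into above- and below-threshold parts.

    Returns (above, below), each the same length as log_debt, such that
    log_debt[t] == log_debt[0] + above[t] + below[t] for every t.
    """
    above, below = [0.0], [0.0]
    for t in range(1, len(log_debt)):
        change = log_debt[t] - log_debt[t - 1]
        if debt_to_gdp[t] > threshold:          # regime rule: an assumption
            above.append(above[-1] + change)
            below.append(below[-1])
        else:
            above.append(above[-1])
            below.append(below[-1] + change)
    return above, below


# Toy series for one hypothetical country (not real data).
log_debt = [5.0, 5.2, 5.5, 5.4, 5.1]
ratio = [70.0, 85.0, 95.0, 92.0, 80.0]          # debt-to-GDP in percent
above, below = regime_partial_sums(log_debt, ratio, threshold=90.0)
```

The two partial sums add back to the original series, so a regression that enters them separately nests the linear model as the special case where both carry the same coefficient.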

3 Data and Empirical Strategy

3.1 Data

Our main variables are GDP, the capital stock and the total public debt stock (all in logarithms of real US$). The capital stock is constructed from gross fixed capital formation using the standard perpetual inventory method, assuming a common and constant 5% depreciation rate. Data are taken from the World Bank World Development Indicators (WDI) database and, in the case of the debt stock, from an update to Panizza (2008). The debt variable is total (external and domestic) general government debt in nominal terms (face value), in the raw data expressed as a share of GDP. We express all variables in per capita terms, including the debt stock. Our empirical setup thus imposes constant returns to scale, which we believe is a reasonable assumption.
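The perpetual inventory recursion with a 5% depreciation rate can be sketched as follows. The choice of starting value is not described in this section, so the steady-state rule in `initial_capital_guess` (with an assumed 3% growth rate) is only one common convention, and the investment figures are hypothetical.

```python
def capital_stock(investment, initial_capital, depreciation=0.05):
    """Perpetual inventory: K_t = (1 - depreciation) * K_{t-1} + I_t."""
    stocks = []
    k = initial_capital
    for inv in investment:
        k = (1.0 - depreciation) * k + inv
        stocks.append(k)
    return stocks


def initial_capital_guess(first_investment, growth=0.03, depreciation=0.05):
    """Steady-state starting value K_0 = I_0 / (g + depreciation) (assumed convention)."""
    return first_investment / (growth + depreciation)


# Hypothetical investment flows in constant prices (not real data).
investment = [10.0, 10.0, 10.0]
stocks = capital_stock(investment, initial_capital=200.0)
```

With investment of 10 and a stock of 200, depreciation of 5% exactly offsets new investment, so the stock stays constant in this example.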

For a small number of empirical results we further make use of cross-section averages for data on trade openness and financial development. Trade openness (imports plus exports as a share of GDP) is taken from the NYU Global Development Network Growth Database – in turn based on the World Bank’s WDI and Global Development Finance databases – while financial development (ratio of bank credit to bank deposits) is taken from Thorsten Beck and Asli Demirguc-Kunt’s Financial Structure Database, updated in 2010.11 A Data Appendix provides more details on the construction of our variables and descriptive statistics. Detailed information about the sample make-up is confined to a Technical Appendix.

In the following we introduce our linear dynamic and asymmetric dynamic models of the debt-growth nexus as well as the concepts of summability, balance and co-summability in some more detail.

3.2 Empirical Specification: Linear Dynamic Model

The basic equation of interest in our analysis of the debt-development nexus is a static neoclassical production function augmented with a debt stock term:


where y is aggregate GDP, cap is capital stock and debt is the total debt stock; all variables are in logarithms of per capita terms, which imposes constant returns to scale. Our specification of endogenous TFP in the form of common factors does, however, allow for externalities at the local and/or global level. These variables constitute the observable processes captured in our model, with their parameter coefficients βij (for j=K, D) allowed to differ across countries;12 this heterogeneity is a central feature of our empirical setup, as motivated in the previous section. In addition we include country-specific Total Factor Productivity (TFP) levels (αi) and a set of common factors ft with country-specific ‘factor loadings’ λi to account for the evolution of unobservable TFP over time. The common factors can be a combination of ‘strong’ factors, representing global shocks such as the recent financial crisis, the 1970s oil crises or the emergence of China as a major economic power; and ‘weak’ factors, capturing local spillover effects following channels determined by shared cultural heritage, geographic proximity, and economic and social interaction (Chudik, Pesaran & Tosetti, 2011). We assume that these unobservable factors drive not only our measure of output but also the other covariates in the above model:13 this gives rise to a standard endogeneity problem whereby the βij parameters are not identified unless some means to account for the unobservable factors in the error term u is found. At the same time this endogenises TFP and allows for externalities of production. We will return to the identification strategy in our discussion of the empirical implementation below. Suffice it to highlight that standard instrumentation in a pooled empirical framework is invalid in the present setup, as we cannot obtain instruments which are both informative and valid due to the omnipresence of unobserved factors and/or the underlying equilibrium relationship differing across countries.
Finally, we allow for the unobserved factors to be nonstationary, which has important implications for empirical analysis since all observable and unobservable processes in the model are now integrated and standard inference is invalid (Kao, 1999).
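Written out from the definitions in the paragraph above, the debt-augmented production function with a multifactor error structure can be sketched as follows. The idiosyncratic error ε and the exact placement of the indices are assumptions made for illustration; the remaining symbols follow the text.

```latex
\begin{align*}
y_{it} &= \alpha_i + \beta_{iK}\, cap_{it} + \beta_{iD}\, debt_{it} + u_{it}, \\
u_{it} &= \lambda_i' f_t + \varepsilon_{it}.
\end{align*}
```

Because the same factors f_t are assumed to drive cap and debt as well, omitting them leaves the regressors correlated with u, which is the identification problem described above.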

Given the importance of time series properties and dynamics in macro panel analysis, we employ an error correction model (ECM) representation of the above substantive equation of interest. This offers at least three advantages over a static model such as the above or restricted dynamic specifications such as those commonly investigated in the literature:14 (i) we can readily distinguish short-run from long-run behaviour; (ii) we can investigate the error correction term and deduce the speed of adjustment for the economy to the long-run equilibrium; and (iii) we can test for cointegration in the ECM by closer investigation of the statistical significance of the error correction term. The ECM representation of the above model is as follows:


where the βij in equation (6) represent the long-run equilibrium relationship between GDP (y) and the measures for capital and debt in our model, while the γij represent the short-run relations. ρi indicates the speed of convergence of the economy to its long-run equilibrium. The term in round brackets represents the candidate cointegrating relationship we seek to identify in our panel time series approach. In equation (7) we have simply relaxed the ‘common factor restriction’ implicit in the nonlinear relationship between parameters in equation (6) and reparameterized the model to highlight that from the coefficients on the ‘levels’ terms (πij for j = K, D) we can back out the long-run parameters


whereas from the coefficient on the terms in first difference (πim for m = k, d, lowercase to distinguish from the long-run coefficients) we can read off the short-run parameters directly. πiEC indicates the speed at which the economy returns to the long-run equilibrium, with a half-life (in our data: in years) computable as (log(0.5)/log(1+πiEC)). Inference on this πiEC parameter will provide insights into the presence of a long-run equilibrium relationship: if πiEC = ρi = 0 we have no cointegration and the model reduces to a regression with variables in first differences (i.e. the term in brackets in equation (6) drops out). If πiEC = ρi ≠ 0 we observe ‘error correction’, i.e. following a shock the economy returns to the long-run equilibrium path, and thus cointegration between the variables and processes in round brackets/levels. Note that we have included the unobservable common factors f in our long-run equation: this implies that we seek to investigate cointegration between output, capital, debt and TFP.
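The two quantities read off the reparameterised ECM can be computed as in the sketch below. The half-life formula is the one given in the text. The long-run coefficient is recovered as the negative ratio of the levels coefficient to the error correction coefficient, which is the standard ECM result and is assumed here. Function and argument names are hypothetical.

```python
import math


def long_run_coefficient(pi_level, pi_ec):
    """Long-run parameter implied by an ECM: beta = -pi_level / pi_ec.

    Assumes the standard ECM reparameterisation; undefined when there is
    no error correction (pi_ec == 0).
    """
    if pi_ec == 0.0:
        raise ValueError("no error correction: long-run coefficient undefined")
    return -pi_level / pi_ec


def half_life(pi_ec):
    """Half-life of a deviation from equilibrium: log(0.5) / log(1 + pi_ec).

    Only meaningful for a stable, monotone adjustment, i.e. -1 < pi_ec < 0.
    """
    if not -1.0 < pi_ec < 0.0:
        raise ValueError("half-life requires -1 < pi_ec < 0")
    return math.log(0.5) / math.log(1.0 + pi_ec)
```

With annual data, an error correction coefficient of -0.5 corresponds to a half-life of exactly one period, so coefficients larger in absolute value imply a half-life below one year.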

In the spirit of Banerjee & Carrion-i-Silvestre (2011) we employ cross-section averages of all variables in the model to replace unobservables as well as omitted elements of the cointegration relationship.


Recent work by Chudik & Pesaran (2013) has highlighted that this approach is subject to small sample bias, in particular for moderate time series dimensions. Furthermore, these authors relax the assumption of strict exogeneity and thus allow for feedback between (in our application) debt, capital stock and output, which provides a more serious challenge to the original Pesaran (2006) approach:


If αi1 = 0 we maintain the assumption of strict exogeneity and can proceed with the standard CCE augmentation, whereas if αi1 ≠ 0 this approach is valid if and only if the dynamic common factor restrictions hold. Chudik & Pesaran (2013) provide the following empirical strategy employing cross-section averages in the presence of weakly exogenous regressors: in addition to the cross-section averages detailed in equation (8) they suggest (i) the inclusion of lags of the cross-section averages, in our ECM setup


and (ii) the inclusion of cross-section averages of one or more further covariates (other than cap and debt) which may help identify the unobserved common factors in the spirit of Pesaran, Smith & Yamagata (2013), in our ECM setup


for covariate z and similarly for further covariates. Chudik & Pesaran (2013) show that once augmented with a sufficient number of lagged cross-section averages (p = T^(1/3) can be employed as a rule of thumb) the CCE mean group estimator performs well even in a dynamic model with weakly exogenous regressors.
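The augmentation terms themselves are simple to construct. The sketch below computes cross-section averages for one variable and stacks their lags. It assumes a balanced panel stored as a dictionary of equal-length lists, and it reads the rule of thumb for the number of lags as the integer part of T^(1/3), following Chudik & Pesaran's recommendation. All function names are hypothetical.

```python
def cross_section_averages(panel):
    """Average a variable across countries, period by period.

    panel -- dict mapping country code -> list of T observations
             (balanced panel assumed for simplicity).
    """
    series = list(panel.values())
    periods = len(series[0])
    if any(len(s) != periods for s in series):
        raise ValueError("balanced panel expected")
    return [sum(s[t] for s in series) / len(series) for t in range(periods)]


def lag_order(periods):
    """Largest integer p with p**3 <= periods (integer part of T^(1/3))."""
    p = 0
    while (p + 1) ** 3 <= periods:
        p += 1
    return p


def with_lags(average, p):
    """Rows [avg_t, avg_{t-1}, ..., avg_{t-p}] for every t with p lags available."""
    return [[average[t - j] for j in range(p + 1)]
            for t in range(p, len(average))]
```

Each country regression would then include these columns as additional regressors alongside the country's own variables; the integer search in `lag_order` avoids floating-point surprises from taking a cube root directly.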

Our empirical framework and implementation thus provide a great deal of flexibility to aid our attempts in capturing the long-run and short-run relationships between debt and growth across a set of diverse economies. We do not claim that our empirical approach is ‘superior’ to the existing literature by pointing to asymptotic results in econometric theory or Monte Carlo simulation studies of known data generating processes. Instead, we highlight a set of assumptions which different empirical implementations make and provide diagnostic tests as to the validity of these assumptions. The comparison of results from different empirical estimators presented below does not constitute an exercise in data mining until the desired result emerges, but an attempt at testing the explicit and implicit assumptions made in each empirical model.

An important feature of the empirical implementation adopted here is that all models are estimated by OLS: modelling features such as nonstationarity, cross-section correlation, heterogeneity in the equilibrium relationship across countries and nonlinearity/asymmetry in the long-run and/or short-run relationship are captured by the empirical specification and the use of additional terms in the regression equation. While estimation is thus relatively straightforward, we rely on simulated critical values for various inferential and diagnostic statistics.

3.3 Empirical Specification: Weak Exogeneity Testing

In our factor model setup we have emphasised one type of endogeneity, whereby common factors drive both inputs and output, leading to identification issues unless the factors are accounted for. In the present context, a second form of endogeneity which implies reverse causality is deemed of particular importance for the interpretation of the empirical results: can we argue our empirical model derived from a neoclassical production function augmented with debt is correctly specified, or do we estimate a disguised demand equation for debt or investment?15 We adjust the basic common factor model introduced in equations (1) and (2) accordingly:


for a single covariate x and single factor f contained in both the y- and x-equations. Due to the presence of ψiεit in the second equation we should be concerned over whether y ‘causes’ x, x ‘causes’ y, or both. The standard approach in the literature has been to instrument for x using one or a set of variables z which satisfy the conditions of informativeness (𝔼[zx] ≠ 0) and validity (𝔼[zε] = 0). Having adopted a panel time series approach, the issues of endogeneity and direction of causation can here take an alternative pathway: provided our variables are nonstationary and cointegrated, we can apply a test for weak exogeneity. This test, described in detail below, can help us determine whether our empirical results can be interpreted as arising from a production function, rather than a misspecified input demand function. This identification strategy is not as clean as microeconometric alternatives, such as a controlled or natural experiment and instrumental variable estimation.16 We argue that neither of these strategies is suitable in a macroeconomic context: experiments may provide insights into a unique episode or a single country experience, but arguably lack the external validity required to answer our research question. We already argued above that instrument validity is difficult to justify in macro panel analyses of a globalised world. With the empirical questions addressed in this study in mind, we believe our empirical strategy is the best we can do.

Provided there exists a cointegrating relationship between variables the Granger Representation Theorem (Engle & Granger, 1987) states that these series can be represented in the form of a dynamic ECM. Generically, for a pair of cointegrated variables x and y we can write


where êt−1 represents the ‘disequilibrium term’ $\hat{e} = y - \hat{\beta}_i x - \hat{d}$ constructed using the estimated cointegrating relationship between these two variables (d represents deterministic terms). Equations (12) and (13) further include lagged differences of the variables in the cointegrating relationship. In the above example there are only two equations, since we have two variables in the cointegrating relationship. The Granger Representation Theorem implies that for a long-run equilibrium relationship to exist between y and x at least one of λ1i and λ2i must be non-zero: if (and only if) λ1i ≠ 0 then x has a causal impact on y, if (and only if) λ2i ≠ 0 then the causal impact is reversed. If both λ1i and λ2i are non-zero they determine each other jointly.
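The two-equation system described above can be sketched in generic form as follows. Only λ1i, λ2i and the lagged disequilibrium term are taken from the text; the intercepts c, the short-run coefficients ψ, the errors ε and the lag length p are assumptions introduced for illustration.

```latex
\begin{align*}
\Delta y_{it} &= c_{1i} + \lambda_{1i}\,\hat{e}_{i,t-1}
  + \sum_{j=1}^{p} \psi_{11ij}\,\Delta y_{i,t-j}
  + \sum_{j=1}^{p} \psi_{12ij}\,\Delta x_{i,t-j} + \varepsilon_{1it}, \\
\Delta x_{it} &= c_{2i} + \lambda_{2i}\,\hat{e}_{i,t-1}
  + \sum_{j=1}^{p} \psi_{21ij}\,\Delta y_{i,t-j}
  + \sum_{j=1}^{p} \psi_{22ij}\,\Delta x_{i,t-j} + \varepsilon_{2it}.
\end{align*}
```

Weak exogeneity of x with respect to the long-run relationship corresponds to λ2i = 0: x does not adjust to past disequilibrium, while y does.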

3.4 Empirical Specification: Asymmetric Dynamic Model

We follow the discussion in Shin, Yu & Greenwood-Nimmo (2013) and define the asymmetric long-run regression model


where we again assume observable and unobservable processes are nonstationary and debt stock has been decomposed into $debt_{it} = debt_{i0} + debt_{it}^{+} + debt_{it}^{-}$. The latter two terms are partial sums of values above and below a specific threshold, while $debt_{i0}$ has been subsumed into the constant term. For instance, if we assume a threshold of zero then they define positive and negative changes in debt accumulation. For each country i let


This setup would suit the analysis of an asymmetric response to debt accumulation and relief, whereby the hypothesised substantial growth benefits of debt relief could be shown to be questionable given the differential relationship between debt accumulation and growth on the one hand and debt reduction and growth on the other. In our present study we instead create partial sums for debt stock below and above a number of (exogenously determined) debt-to-GDP ratio thresholds, namely 52% (sample median), 75% and the ‘canonical’ 90%. Thus the partial sums are constructed from the per capita debt stock variable, while the assignment to one or the other regime is determined by the debt-to-GDP ratio. We adopt this practice in order to be able to compare our results with those in the literature adopting the debt-to-GDP ratio as the primary variable of interest.
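A minimal sketch of this regime-based decomposition for a single country is given below. It accumulates changes in the (log per capita) debt stock into a ‘low’ or ‘high’ partial sum according to whether the debt-to-GDP ratio is above the threshold. Using the contemporaneous ratio to assign each change is an assumption, as are the function and argument names.

```python
def regime_partial_sums(debt, ratio, threshold):
    """Split changes in `debt` into below- and above-threshold partial sums.

    debt      -- list of debt stock observations for one country
    ratio     -- list of debt-to-GDP ratios for the same periods
    threshold -- regime boundary, e.g. 0.52, 0.75 or 0.90

    Returns (low, high) such that debt[t] == debt[0] + low[t] + high[t].
    """
    if len(debt) != len(ratio):
        raise ValueError("debt and ratio must have the same length")
    low, high = [0.0], [0.0]
    for t in range(1, len(debt)):
        change = debt[t] - debt[t - 1]
        if ratio[t] > threshold:
            # Change accrues to the high-debt regime partial sum.
            high.append(high[-1] + change)
            low.append(low[-1])
        else:
            # Change accrues to the low-debt regime partial sum.
            low.append(low[-1] + change)
            high.append(high[-1])
    return low, high
```

The two partial sums then enter the regression as separate regressors, so that each regime receives its own long-run coefficient.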

The ECM version of our asymmetric dynamic model is thus


The dynamic asymmetry can be included in the long-run relationship (lagged levels terms), in the short-run behaviour (first difference terms) or both. As before we allow for cross-country heterogeneity in all long-run and short-run parameters and account for the presence of unobserved time-varying heterogeneity by augmenting the country regressions with cross-section averages of the dependent and independent variables. While in the original Shin, Yu & Greenwood-Nimmo (2013) time series approach the parameter estimates are identified by augmentation of the empirical equation with additional lagged differences, our panel approach relies on the common factor framework as developed in Section 2 above for identification. The same issues as highlighted in Section 3.2 apply and we shall augment the estimation equation with further lags of the cross-section averages (Chudik & Pesaran, 2013). Note that the implementation raises a number of problems in the case where the debt threshold is relatively high: if only a very small number/share of observations for a specific country are above the threshold, then the estimated coefficient may be very imprecise. In order to guard against this we present results of the estimated long-run debt parameters in the low and high debt regimes only for those countries where at least 20% of all time series observations are in one regime. For the 90% debt/GDP threshold this amounts to a total of 30 countries, 45 countries in case of the 75% threshold and 55 countries for the 52% threshold.17

3.5 Empirical Specification: Order of Summability, Balancedness and Co-Summability

The previous two sections provided empirical specifications either without country-specific non-linearities or where the within-country non-linearity was modelled as an asymmetry. In this section we discuss the fundamental difficulties arising for conventional empirical analysis when assuming a non-linear model in the presence of integrated variables and introduce a novel time series approach to deal with these issues. Suppose a single time series relationship yt = f (xt, θ) + ut for a nonstationary covariate xt ∼ I(1), stationary ut and some non-linear function f (·).18 In this context, it becomes difficult to apply our standard notion of integration to f (·), given that integration is a linear concept: although we may be able to determine the order of integration of xt, the order of integration of f (xt, θ) (and thus yt) may not be well defined for many non-linear transformations f (·). Assuming for illustration $f(x_t) = \theta x_t^2$ we can make this point somewhat clearer: let xt = xt−1 + εt and $\varepsilon_t \sim \text{i.i.d.}(0, \sigma_\varepsilon^2)$, then we know that


In words, we can show that the Engle & Granger (1987) characterisation of a stationary process holds for Δxt (finite variance is one of five characteristics, albeit the crucial one for our illustration), such that xt can be concluded to follow an I(1) process. Now investigate the same property for $\Delta x_t^2$:


We can see that the finite variance characteristic is violated, given that the variance is a function of time t; further differencing does not change this outcome. Although we can define xt within the integration framework, we cannot state the order of integration of $x_t^2$, which creates fundamental problems if the empirical analysis of yt = f (xt, θ) + ut is to be based on arguments of cointegration.
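The argument can be made explicit with a short worked derivation. It is a sketch under two additional assumptions that are not stated in the text: the random walk starts at x_0 = 0 and ε_t has a finite fourth moment.

```latex
\begin{align*}
\Delta x_t &= \varepsilon_t, \qquad
  \mathrm{E}[\Delta x_t] = 0, \qquad
  \mathrm{Var}(\Delta x_t) = \sigma_\varepsilon^2, \\
\Delta x_t^2 &= x_t^2 - x_{t-1}^2 = 2\,x_{t-1}\varepsilon_t + \varepsilon_t^2, \\
\mathrm{Var}(\Delta x_t^2) &= 4\,\mathrm{E}[x_{t-1}^2]\,\sigma_\varepsilon^2 + \mathrm{Var}(\varepsilon_t^2)
  = 4\,(t-1)\,\sigma_\varepsilon^4 + \mathrm{Var}(\varepsilon_t^2).
\end{align*}
```

The cross term vanishes because x_{t-1} is independent of ε_t and has mean zero. The variance of the differenced squared series therefore grows linearly in t, whereas the variance of Δx_t is constant.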

Berenguer-Rico & Gonzalo (2013b) develop an alternative approach, based on the ‘order of summability’ S(δ) of linear or non-linear processes: “[t]he order of summability, δ, gives a summary measure of the stochastic properties – such as persistence – of the time series without relying on linear structures” (p.3). Using OLS we estimate for each country i


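where $k = 1, \ldots, T$, $Y_{ik}^{*} = Y_{ik} - Y_{i1}$, $U_{ik}^{*} = U_{ik} - U_{i1}$ and $Y_{ik} = \log\left(\sum_{t=1}^{k}(y_{it} - m_t)\right)^{2}$, with $m_t$ the country-specific partial mean of $y_{it}$, namely $m_t = (1/t)\sum_{j=1}^{t} y_{ij}$. This is the definition for $m_t$ in the ‘intercept only’ case. Given the trending nature of our data we further investigate the ‘constant and linear trend’ case, where $m_t = (1/t)\sum_{j=1}^{t} y_{ij} - (2/t)\sum_{j=1}^{t}\left(y_{ij} - (1/j)\sum_{l=1}^{j} y_{il}\right)$. This implies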


from which we then obtain our estimate of the order of summability $\hat{\delta}_i^{*} = (\hat{\beta}_i^{*} - 1)/2$. This approach essentially investigates the rate of convergence of a rescaled sum constructed from the variable series yit. In the single time series case inference can be established using confidence intervals constructed via estimation in subsamples; here, in the panel, where there is no natural ordering of countries in the cross-section dimension, we take random draws of N countries (and in each country the full time series T), each time capturing the mean and median summability statistic, to create subsample estimates for inference.
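For a single country the point estimate can be sketched as below, using the ‘intercept only’ partial demeaning. Two details are assumptions made for this illustration: the regression of Y*_k on log k is run through the origin, and leading partial sums that are exactly zero (the first partially demeaned observation always is) are dropped before taking logs, with k re-indexed from 1. Function names are hypothetical.

```python
import math


def partial_demeaned_sums(y):
    """Partial sums of y_t - m_t, with m_t the mean of y_1, ..., y_t."""
    sums = []
    running_total = 0.0  # y_1 + ... + y_t
    running_sum = 0.0    # partial sum of partially demeaned observations
    for t, value in enumerate(y, start=1):
        running_total += value
        running_sum += value - running_total / t
        sums.append(running_sum)
    return sums


def order_of_summability(sums):
    """Estimate delta = (beta - 1) / 2 from partial sums S_k.

    Y_k = log(S_k ** 2); beta is the OLS slope (no intercept) from
    regressing Y_k - Y_1 on log(k).
    """
    start = 0
    while start < len(sums) and sums[start] == 0.0:
        start += 1  # drop leading zeros, whose log is undefined
    usable = sums[start:]
    if len(usable) < 3 or any(s == 0.0 for s in usable):
        raise ValueError("need at least three non-zero partial sums")
    big_y = [math.log(s * s) for s in usable]
    numerator = denominator = 0.0
    for k, y_k in enumerate(big_y, start=1):
        log_k = math.log(k)
        numerator += (y_k - big_y[0]) * log_k
        denominator += log_k * log_k
    beta = numerator / denominator
    return (beta - 1.0) / 2.0
```

A typical call is `order_of_summability(partial_demeaned_sums(series))`. Point estimates from short series are noisy, which is why the text relies on subsample-based confidence intervals rather than the point estimate alone.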

It bears emphasising that summability is a more general concept than integration, but that the latter is closely related to the former in the following fashion: if a time series xt is integrated of order d, I(d) with d ≥ 0, then it is also summable of order d, S (d). It is the breakdown of the reverse of this condition in cases where xt is a nonlinear transformation which necessitates our adoption of the concept of summability. In our empirical application we will analyse the order of summability of all variables entering the polynomial specifications.

Next, in analogy to the analysis of integrated variables, the ‘balance’ of the empirical relationship needs to be tested, namely the condition that both sides of the empirical equation of interest have the same order of summability: S (δy) = S (δz) for z = f (xt, θ) = θf (xt) – see below for a comment on the linearity in parameters we assume here. Such a test of balance is equivalent to testing the null of βni ≡ (βyi − βzi) = 0 in the country-specific regression


where $Y_{yik}^{*}$ is for the LHS variable y and defined as in the summability analysis above, and $Y_{zik}^{*}$ is constructed from the partially demeaned sum of all RHS processes, $Y_{zik} = \log\left(\sum_{t=1}^{k}(z_{it} - m_t)\right)^{2}$, accounting for initial conditions in the same fashion as above by taking the deviation from the first observation. In practice, all elements of z (RHS variables) are summed and appropriately partially demeaned; their estimated order of summability is then subtracted from that for y and the result divided by two.19 Again inference in the single time series test is based on subsample estimation. In the panel we employ the same strategy to create subsample estimates and thus confidence bands as detailed above. Under the null of balance the resulting confidence interval includes zero; balancedness is a necessary but not sufficient condition for a valid empirical specification.20

Finally, let êit be the OLS residuals from a balanced country-specific regression $y_{it} = \hat{\theta} g(x_{it}) + \hat{e}_{it}$; then ‘strong co-summability’ will imply that the order of summability of $\hat{e}_{it}$, $S(\delta_{\hat{e}_{it}})$, is statistically close to zero. We employ the above approach to estimate the order of summability for êit, which enables us to determine whether our balanced model is co-summable or not. Note that the residual series êit as defined above will sum to zero by construction under the least squares principle; in practice we therefore do not subtract the estimate for the intercept term in each country regression. Inference in the original time series and in our panel application follows the same principles as the previous two testing procedures.

The above routines imply a sequence of tests (summability, balance, co-summability) which in principle bear close resemblance to the integration-cointegration concepts and testing procedures. The simplicity of the above approach is marred by the presence of deterministic components in the variable evolution. Intercept and trend terms are addressed by repeated partial demeaning of the variable series as suggested in Berenguer-Rico & Gonzalo (2013b).21 We assume non-linearity in variables but not in parameters:


The econometric theory of the approach is at present being extended to nonlinearity in parameters. However, the restriction to linearity in parameters is in line with the standard implementations in the literature adopting debt thresholds (exogenous or endogenous debt/GDP threshold with subsequent analysis splitting observations into separate below/above threshold values/terms) or nonlinearities through polynomial functions (linear, squared and cubed debt terms).

We provide an extension to the above panel versions of the balance and co-summability tests, whereby in the spirit of the recent panel time series literature we include the cross-section averages (CA) of all variables in the specification of the empirical test (Pesaran, 2006; Chudik & Pesaran, 2013). The motivation for this approach is the same as the one we provided for our panel models above: country-by-country investigation of the variable and specification properties assumes these to be cross-sectionally independent. Both theorising and empirical practice have shown that in a globalising world where countries trade and share similar social, economic and/or cultural heritage this assumption is likely to be violated.

We adopt two variants of the cross-section average augmentation: (i) a standard approach such as that outlined above, (ii) an approach where in addition to the CA of all model variables we also include the CA of ‘other covariates,’ similar to the approach in the dynamic heterogeneous panel estimations (Chudik & Pesaran, 2013).

3.6 Empirical Specification: Nonlinear Static Model

Following the investigation of summability, balance and co-summability in a flexible nonlinear model we estimate static models with polynomial approximations to unknown nonlinear functions:


We begin by reporting results for a static linear model, then introduce the squared debt term, and finally the cubed debt term. Due to the assumption of cross-country parameter heterogeneity on the one hand, and the reliance of the Mean Group-type estimation approach on parameter averages across a subset of countries on the other, the averaged results from this exercise do not tell the whole story, given that we average across heterogeneous linear, U-shaped, inverted-U-shaped etc. specifications at the country level. We provide the average results as a benchmark for comparison with existing work, along with diagnostic tests and descriptive information as to the general patterns of debt-growth relationships contained in the panel.
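The caveat about averaging across heterogeneous shapes can be illustrated with the quadratic specification. The sketch below classifies a country's estimated relationship from its linear and squared debt coefficients and reports the implied turning point, -b1/(2·b2), in the units of the debt variable. The function name and the coefficient values in the usage note are hypothetical, not estimates from the paper.

```python
def quadratic_shape(b1, b2):
    """Classify y = ... + b1 * debt + b2 * debt**2 and locate its turning point.

    Returns (shape, turning_point); turning_point is None for a linear fit.
    """
    if b2 == 0.0:
        return ("linear", None)
    turning_point = -b1 / (2.0 * b2)  # where the marginal effect b1 + 2*b2*debt is zero
    shape = "inverted-U" if b2 < 0.0 else "U"
    return (shape, turning_point)
```

For example, one country with coefficients (0.5, -0.25) is inverted-U-shaped and another with (-0.5, 0.25) is U-shaped, yet their simple average is (0, 0), which would suggest no relationship at all.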

4 Empirical Results

4.1 Initial Analysis

We carried out panel unit root tests following Pesaran (2007) and investigated the cross-section correlation properties of the raw data including formal CD tests following Pesaran (2004). Results are provided in a technical appendix and indicate that the levels variable series are integrated of order 1 and subject to considerable cross-section dependence.

We conduct a similar descriptive analysis to that pursued in Reinhart & Rogoff (2010b) for our sample of countries, with results presented in Figure 5: within each income group (High, Upper- and Lower-Middle, Low Income) all observations are divided into four bins based on the debt-to-GDP ratio. Although arguably not as clearcut as these authors’ illustration for a set of developed and emerging economies, the means (dark grey bars) and medians (light grey) for different income groups by level of indebtedness may be taken as evidence for a differential growth performance beyond a 90% debt-to-GDP threshold, at least for the high-, lower middle- and low-income samples. We now provide empirical evidence that this descriptive result is misleading.

4.2 Results: Linear Dynamic Models

Table 1 presents results derived from an ECM specification, with results for a standard two-way fixed effects and pooled CCE in columns [1] and [2] assuming parameter homogeneity across countries and all other models in columns [3]-[10] allowing for differential relationships (we report robust mean estimates). The models in columns [4] and [5] represent the standard CCE estimator in the Mean Group version, while models in columns [6], [7] and [9] add further lags of the cross-section averages as suggested in Chudik & Pesaran (2013). Models in columns [8]-[10] experiment with the cross-section averages and lags of additional covariates outside the model: we adopt proxies for trade openness (‘open’) and financial development (‘findev’), both in logs. These variables only enter the empirical model in form of their cross-section averages. The aim here is to help identify the unobserved common factors ft, which represent global shocks and local spillover effects,22 so that adopting variables which are directly linked to globalisation was deemed a suitable choice here.

In each model we focus on the long-run estimates as well as the coefficient on the lagged level of GDP to investigate error correction – full ECM results are available on request. LRA refers to the ‘long-run average’ coefficient, which is calculated directly from the pooled model ECM results in [1] and [2] and the weighted averages – we follow standard practice in this literature and employ robust regression (see Hamilton, 1992) to weigh down outliers in the computation of the averages – of the heterogeneous model ECM results in [3]-[10]. LRA standard errors are computed via the Delta method. ALR refers to the ‘average long-run’ coefficient in the heterogeneous models, whereby the long-run coefficients are computed from the ECM results in each country and then averaged across the panel.23 In the ALR case standard errors are constructed following Pesaran & Smith (1995). For all heterogeneous models which address concerns over cross-section dependence there is evidence of error correction – the lagged GDP pc levels variable is highly statistically significant – and the average long-run coefficients appear statistically significant and positive throughout, whereas short-run coefficients are insignificant. The latter does not imply the absence of any significant effects, but rather highlights the heterogeneity across countries with dynamics on average cancelling out. Coefficients on lagged per capita GDP levels imply reasonable estimates for the speed of convergence, with a half-life of just under a year in most CMG specifications.24 Diagnostic tests highlight that the use of cross-section averages considerably reduces residual cross-section dependence – the CD statistic drops from 18 in the MG to between 2 and 3 in the CMG models. 
Based on work by Bailey, Kapetanios & Pesaran (2012) it is suggested that the implicit null hypothesis of this test is weak (rather than strong) cross-section dependence (Pesaran, 2013) – recall that dependence of the weak type only affects inference, whereas strong dependence can lead to an identification problem.25
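The ‘average long-run’ coefficient and its standard error can be sketched as follows. The sketch uses an unweighted mean of the country-specific long-run coefficients together with the nonparametric variance estimator of Pesaran & Smith (1995), whereas the results reported in the paper use robust (outlier-weighted) means, so this is a simplification. The function name is hypothetical.

```python
import math


def mean_group(coefficients):
    """Unweighted mean group estimate and its Pesaran-Smith standard error.

    coefficients -- list of country-specific estimates (at least two)

    The variance of the mean is estimated nonparametrically from the
    cross-country dispersion: sum((b_i - mean)**2) / (N * (N - 1)).
    """
    n = len(coefficients)
    if n < 2:
        raise ValueError("need at least two country estimates")
    mean = sum(coefficients) / n
    variance = sum((b - mean) ** 2 for b in coefficients) / (n * (n - 1))
    return mean, math.sqrt(variance)
```

Because the standard error is driven by the dispersion of the country estimates, an insignificant average is compatible with sizeable but offsetting country-level effects.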

Once we move from a pooled to a heterogeneous parameter specification, statistically significant positive average long-run coefficients such as those we find in our sample only provide insights regarding the central tendency of the panel. This result may indicate that, on average, the countries in our sample are on the ‘right’ side of a hypothetical Debt Laffer curve. This is hardly surprising as the median debt-to-GDP ratio is around 50% (Table A1), a value well below the ‘tipping points’ identified by the literature on developing and advanced economies (see Table TA1). In Figure 6 we therefore provide a number of plots indicating the cross-section dispersion of the long-run debt coefficients, primarily focusing on the estimates in the dynamic CCE model with one additional lag (column [6] of Table 1) given its favourable diagnostic results. With the exception of panel (b) all plots capture the country-specific average debt-to-GDP ratio over the entire sample period (in logs) on the x-axis and estimated debt coefficients on the y-axis (all long-run except for panel (f), which plots short-run coefficients). Panel (a) suggests that there is a nonlinear relationship between the debt-to-GDP ratio and the long-run impact of debt, which turns negative at around 90% debt-to-GDP. Panel (c) makes the same point by grouping countries into quintiles based on average debt/GDP ratio and providing distributional plots for each of them (group #5 represents a debt burden over 90% of GDP).

Panel (b) however cautions against this conclusion: instead of average debt-to-GDP ratio we plot here the debt-to-GDP ratio peak for each country. It is notable that many countries still have positive coefficients despite peak debt-to-GDP ratios in excess of 90%. Panel (d) splits the data into the 25% richest countries and the rest – the nonlinearity between debt burden and the long-run debt coefficient across countries seems to primarily be driven by the poorer countries in the sample. Panels (e) and (f) provide fitted fractional polynomial regression lines for the CMG models in Table 1 for which the residual CD test is below 3: [4]-[7] and [10]. With regard to long-run results in panel (e), the average relationship emerging seems to be fairly robust to the choice of empirical specification. There is no evidence for any systematic heterogeneity in the short-run coefficients presented in panel (f).

We thus find some tentative evidence for a nonlinearity in the long-run relationship between debt and growth across countries. We can be reasonably certain that these empirical models represent cointegrating relationships between debt and income, but this does not rule out the possibility of feedback from income to debt, which would call into question the validity of our empirical results. As a next step we therefore turn to weak exogeneity testing for all of our heterogeneous parameter models.

4.3 Results: Weak Exogeneity Testing

In Table 2 we present the results for the MG and various CMG models – models refer to the column numbering in Table 1. For each estimator we provide weak exogeneity tests using specifications with one or two lags, in each case providing three sets of results: for an output equation, a capital stock equation and a debt stock equation. If our suggestion is correct that the empirical models analysed represent augmented production functions, rather than investment demand or debt demand equations, thus (informally) allowing us to argue for a causal relation from capital and debt stock to output and not vice versa, we would expect a pattern whereby the various test statistics for the output equation reject the null of no causal relation from ‘inputs’ to output, whereas those in the two ‘input’ equations cannot reject their respective nulls. Taking in the results as a whole, there appears to be fairly strong evidence for the setup described: p-values for the statistic constructed from averaged t-statistics are typically below 10 percent in the output and close to unity in the input equations; the t-statistics on the averaged λi coefficients are typically very large in the former and typically below 1.96 in the latter.

The purpose of the analysis in this section was to investigate the possibility of a nonlinear relationship between the debt burden and the long-run debt coefficient in the cross-country dimension. A number of empirical models including dynamic CCE which allows for cross-section dependence in a dynamic model were evaluated and while the empirical results are somewhat fragile in a moderate-T sample, one might conclude that on balance there is some evidence for heterogeneity in the long-run coefficients across countries. We now turn to empirical models which allow for heterogeneous long-run relations across countries while at the same time allowing for thresholds in the relationship within countries over time, which represents a direct test of the consensus of a common threshold effect as propagated in the existing empirical literature.

4.4 Results: Asymmetric Dynamic Models

In Figure 7 we present results from the asymmetric (heterogeneous) dynamic regressions where we account for unobserved common factors by inclusion of cross-section averages of all covariates as well as one further lag of the cross-section averages. The three plots correspond to subsamples for an adopted threshold of 52% (top), 75% (middle) and 90% (bottom) for the debt-to-GDP ratio – in each case we only include countries which have at least 20% of their observations in one of the two regimes (below/above threshold), amounting to 55, 45 and 30 for the three thresholds, respectively. Empirical results on which these graphs are based can be found in Table 3: model [5] with one additional lag and asymmetry in the long- and short-run specification.
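The sample-selection rule for each threshold can be illustrated with a minimal Python sketch. It assumes that the 20% requirement is meant to ensure both regimes are populated, i.e. that the less populated regime must hold at least 20% of a country's observations; the function name, the data layout and the toy data are hypothetical and are not drawn from the dataset used here.

```python
from collections import defaultdict


def eligible_countries(observations, threshold, min_share_pct=20):
    """Return countries with enough observations on both sides of a debt threshold.

    observations: iterable of (country, debt_to_gdp_percent) pairs (hypothetical layout).
    threshold: debt/GDP cutoff in percent, e.g. 52, 75 or 90.
    min_share_pct: minimum share (in percent) required in the smaller regime (assumed reading).
    """
    below = defaultdict(int)
    total = defaultdict(int)
    for country, debt_ratio in observations:
        total[country] += 1
        if debt_ratio < threshold:
            below[country] += 1

    keep = []
    for country, n_obs in total.items():
        n_below = below[country]
        n_above = n_obs - n_below
        # Integer comparison avoids floating-point edge cases at exactly 20%.
        if min(n_below, n_above) * 100 >= min_share_pct * n_obs:
            keep.append(country)
    return sorted(keep)


# Made-up illustrative data: three fictitious countries with ten observations each.
sample = (
    [("AAA", 40)] * 7 + [("AAA", 95)] * 3      # 30% of observations above 90
    + [("BBB", 30)] * 10                       # never above 90
    + [("CCC", 100)] * 9 + [("CCC", 80)] * 1   # only 10% below 90
)
print(eligible_countries(sample, threshold=90))
```

Under this reading, a higher threshold mechanically shrinks the sample, because fewer countries spend a sufficient share of years in the high-debt regime.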

The x-axis in each plot represents the average debt burden over the entire time horizon, expressed as the average debt-to-GDP ratio (in logs) in the left column and, as in our regressions, as the total debt stock per worker (in logs) in the right column – in both sets of plots the left tip of each arrow represents the average value for the ‘low debt’ regime where debt is below 52%, 75% or 90% of GDP, while the right arrow tip marks the average value for the ‘high debt’ regime above these thresholds. The y-axis in each plot captures the estimated long-run debt coefficient which by construction is allowed to differ across regimes (and countries). Under the working hypothesis that a shift to the ‘high debt’ regime would have a negative, step-change type impact on long-run growth, we would expect most arrows to indicate a negative relationship. As can be seen, this hypothesis is not borne out by the empirical results: there is no evidence for any systematic change in the relationship between debt and growth when countries shift from a ‘low’ to ‘high’ debt regime, with only around one in two countries experiencing an increase in the debt coefficient.26 Average coefficient changes in each of the three cases are statistically insignificant (standard or robust means).

Thus our first test of within-country threshold effects in the debt-growth relationship suggests that the consensus in the empirical literature of a common debt threshold does not hold up for the cutoffs tested if we allow for observed and unobserved heterogeneity across countries. Before we move on to investigate the same issue with an alternative approach, adopting static polynomial specifications of the debt-growth nexus, we first provide results from the summability, balance and co-summability analysis which will determine the preferred polynomial specification.

4.5 Results: Summability, Balancedness and Co-Summability

Table 4 presents the summability results, with models assuming a constant term in the left panel and constant and trend terms in the right panel, with the latter a more natural choice given the trending nature of our data. It appears that all of the variables investigated reject summability of order 0, S(0), which justifies our concern about time series properties – recall the analogy with unit root tests, whereby integrated data of order 1 or higher provides evidence for nonstationarity. In the lower panel we carry out summability testing for the growth rates of per capita GDP, debt stock and capital stock. For the former two we can broadly conclude that these first difference series are S(0), while the capital stock growth rate appears to reject this null hypothesis.
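The subsampling design described in the notes to Table 4 (subsamples of b = int(√N) + 1 countries, with N − b + 1 subsamples in total) can be sketched as follows. This is a simplified illustration only: the percentile bands below ignore any rescaling by the estimator's rate of convergence that a formal subsampling procedure would apply, and the function name, the seed and the simulated input values are assumptions made for the example.

```python
import math
import random
import statistics


def subsample_bands(estimates, seed=0):
    """Mean- and median-based subsampling bands for country-specific estimates.

    estimates: list of country-specific summability-order estimates (simulated here).
    Returns (b, number_of_subsamples, mean_band, median_band).
    """
    n = len(estimates)
    b = int(math.sqrt(n)) + 1      # subsample size, e.g. 11 when N = 105
    n_sub = n - b + 1              # number of subsamples, e.g. 95 when N = 105
    rng = random.Random(seed)

    means, medians = [], []
    for _ in range(n_sub):
        draw = rng.sample(estimates, b)   # random draw of b countries, no repeats
        means.append(statistics.mean(draw))
        medians.append(statistics.median(draw))

    def band(values):
        # Simple 2.5% / 97.5% percentile band (a simplification, see lead-in).
        ordered = sorted(values)
        lower = ordered[int(0.025 * (len(ordered) - 1))]
        upper = ordered[math.ceil(0.975 * (len(ordered) - 1))]
        return lower, upper

    return b, n_sub, band(means), band(medians)


# Simulated (not actual) estimates for 105 countries, centred on 1.
generator = random.Random(42)
fake_estimates = [generator.gauss(1.0, 0.3) for _ in range(105)]
b, n_sub, mean_band, median_band = subsample_bands(fake_estimates)
print(b, n_sub)
print(mean_band, median_band)
```

A band that excludes zero would correspond to a rejection of S(0) in the sense used in the text.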

Table 5 presents the results from balance tests, with (unaugmented) ‘standard’ specifications in Panel A, specifications augmented in the common correlated effect fashion in Panel B and specifications which further add cross-section averages from two ‘openness’ variables in Panel C. Recall that for the two sides of the equation to be balanced, i.e. be made up of variables with the same order of summability, the balance statistic should be close to zero. We highlight all those specifications where this requirement is statistically rejected by underlining the estimate and 95% confidence bands. In each of the three panels we provide results for a specification with a constant and a specification with a constant and trend term, where again the latter appears a priori the more suitable choice. Across all three panels there is relatively strong evidence for the linear specification to represent a balanced model, with mean and median estimates for the balance statistics close to zero. There is comparatively less evidence for the two nonlinear specifications, with only the median estimates and 95% confidence intervals for the model with linear, squared and cubed debt terms containing zero. Having said that, the rejection of the null of equal order of summability on both sides of the model equation is marginal in the specification with linear and squared debt terms of both Panels B and C.

In the co-summability results presented in Table 6 we highlight those specifications for which balance was somewhat uncertain by printing them in grey, whereas the specifications which were confirmed as balanced are printed in black. We again have three blocks of results, for a standard panel version of co-summability (equivalent to Panel A in the balance results in Table 5), for a version which includes cross-section averages of all model variables (Panel B) and for a version which in addition to these cross-section averages includes those from ‘other covariates’ (Panel C). None of the specifications without cross-section averages is co-summable, and the estimated test statistics – summability statistics for model residuals – are some distance away from zero (which would signify co-summability). Results for the specifications with cross-section averages are noticeably closer to zero, but still reject co-summability in the linear model. Results for the final set of specifications which include further cross-section averages in the empirical model then move even closer to zero, with the linear specification now co-summable if we focus on the median statistic. The nonlinear models in this case also appear to be co-summable, however it bears reminding that there was comparatively less evidence for these models to be balanced, which as a prerequisite for co-summability renders these models at best as uncertain with regard to the presence or absence of a long-run equilibrium relationship.

We draw three conclusions from this analysis: first, there is strong evidence for significant persistence in the data investigated, which as argued above may seriously impact estimation and inference. Second, it appears that an approach which assumes cross-section independence yields very different results from one which relaxes this assumption. In the context of the recent panel econometric literature this finding is not at all surprising, given the importance of accounting for cross-section correlation in the analysis of macro panel datasets. Further investigation of this result is beyond the scope of this article and left for future research. Third, the only empirical model tested for which we found fairly convincing evidence of it representing a balanced and co-summable specification is the linear model augmented with standard and additional cross-section averages. There is less convincing evidence for nonlinear models, even though some only fail the balance tests marginally. It bears reminding that the purpose of this exercise was to identify linear or nonlinear specifications which represent long-run equilibrium relationships.27 Whatever the identification strategy of existing studies in the literature, these results suggest that the adoption of linear and squared debt terms in a flexible specification to model debt thresholds may represent a seriously misspecified empirical model which could lead to spurious regression results.

4.6 Results: Nonlinear Static Models

We present the averaged debt coefficients from static production function models in various specifications in Table 7.28 All of the models presented are heterogeneous parameter specifications, but we also investigated various pooled model specifications (Pooled OLS, Fixed Effects, CCE Pooled) and found strong evidence of nonstationary residuals in these models, thus highlighting the potentially spurious nature of estimates from pooled empirical models (results available on request).

The nine models for which results are presented in Table 7 correspond to the same nine tested for balance and co-summability (Table 6), namely standard Mean Group (MG) and Common Correlated Effects Mean Group (CMG) estimates, alongside CMG estimates where we further added cross-section averages of two ‘openness’ variables in the empirical specification (CMG+). Of these we only found strong evidence for the linear model in column [7] to represent a balanced and co-summable specification. Average estimates for the linear specifications in Table 7 indicate a negative relationship in the MG and no substantive relationship in the two CMG models, while the models with linear and squared debt terms on average indicate a concave relationship in all three models. The nonlinear model including a cubed debt term is on average statistically insignificant in the MG and CMG models but not in the CMG+ model.
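To make the averaging step concrete, the following Python sketch computes a simple Mean Group estimate from country-specific coefficients together with the nonparametric standard error of Pesaran & Smith (1995), which is based on the cross-country dispersion of the estimates. It uses an unweighted mean rather than the robust mean reported in Table 7, and the function name and coefficient values are invented for illustration.

```python
import math


def mean_group(coefs):
    """Unweighted Mean Group estimate with a nonparametric standard error.

    coefs: list of country-specific coefficient estimates (invented values below).
    Returns (estimate, standard_error, t_ratio).
    """
    n = len(coefs)
    estimate = sum(coefs) / n
    # Variance of the mean based on the dispersion of country estimates:
    # sum_i (b_i - b_bar)^2 / (N * (N - 1)).
    variance = sum((b - estimate) ** 2 for b in coefs) / (n * (n - 1))
    std_error = math.sqrt(variance)
    return estimate, std_error, estimate / std_error


# Invented debt coefficients for five fictitious countries.
debt_coefs = [0.3, -0.1, 0.1, 0.2, 0.0]
estimate, std_error, t_ratio = mean_group(debt_coefs)
n_positive = sum(1 for b in debt_coefs if b > 0)
n_negative = sum(1 for b in debt_coefs if b < 0)
print(round(estimate, 3), round(std_error, 3), round(t_ratio, 2))
print(n_positive, n_negative)
```

Because the average is taken over coefficients of both signs, a modest or insignificant mean can coexist with substantial country-level heterogeneity.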

Based on residual diagnostics – residual series from all models were found to be stationary – we can see how the MG models suffer from very serious residual cross-section dependence (CD test statistics in excess of 20), which is dramatically reduced for the CMG models (around 3.5 to 4.5, thus still rejecting cross-section independence) and the CMG+ models. The latter could be argued to be largely free from cross-section dependence of the strong type, so that any remaining error dependence is not likely to seriously affect estimation (Pesaran, 2013). We also report the number of countries in each estimator for which we found statistically significant debt coefficients. Although we cannot trust country-specific estimates in this empirical approach, this certainly highlights the heterogeneity in the country-specific results, with no linear or nonlinear relationship for the debt-growth nexus emerging as clearly dominant: for instance, in the linear models we find similar numbers of positive and negative slope coefficients on debt once we account for the distorting impact of cross-section correlation. In the models with linear and squared debt there is more evidence for concave relations – in line with the Reinhart & Rogoff (2010b) debt threshold story – but it would be difficult to claim that this result is uniform across all countries. In the models with three debt terms the averaged coefficients hide a great deal of heterogeneity across countries. Thus on the whole we cannot provide any support for the notion that countries possess similar or even identical nonlinearities in the debt-growth relationship over time once we relax the assumption of common parameters across countries.
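For readers unfamiliar with the CD statistic, the sketch below computes the Pesaran (2004) test for a balanced panel of residuals as CD = sqrt(2T / (N(N − 1))) times the sum of all pairwise residual correlations; under the null of cross-section independence it is approximately standard normal. The balanced-panel form is an assumption made for brevity (an unbalanced panel would require pair-specific sample sizes), and the function names and residual values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cd_statistic(residuals):
    """CD statistic for a balanced panel: residuals[i] is the series for unit i."""
    n_units = len(residuals)
    t_periods = len(residuals[0])
    corr_sum = 0.0
    for i in range(n_units - 1):
        for j in range(i + 1, n_units):
            corr_sum += pearson(residuals[i], residuals[j])
    return math.sqrt(2.0 * t_periods / (n_units * (n_units - 1))) * corr_sum


# Made-up residuals: three units that move together perfectly (extreme dependence).
dependent = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [0, 1, 2, 3, 4]]
print(round(cd_statistic(dependent), 3))
```

Values far above conventional normal critical values, such as those in excess of 20 reported for the MG models, indicate strong residual cross-section correlation.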

5 Concluding Remarks

This article empirically investigates the relationship between public debt and long-run growth and provides important insights for the current debate on threshold effects in the debt-growth nexus sparked by the work of Carmen Reinhart and Kenneth Rogoff (Reinhart & Rogoff, 2009, 2010a,b, 2011). Our paper makes three contributions to this empirical literature: first, we investigated the long-run relationship by means of a dynamic empirical model and adopted time series arguments to establish the presence of a long-run equilibrium, taking into account possible endogeneity issues. Since estimation results are likely to be spurious and seriously biased if these well-known data properties are not recognised and addressed in the empirical analysis, our approach signals a significant departure from the standard empirical modelling in this literature and arguably provides more reliable estimates. Second, we adopted empirical specifications which allowed for heterogeneity in the long-run relationship across countries, thus reflecting a host of theoretical and empirical arguments. This heterogeneity in the specification extends to the relevant unobservable determinants of growth and debt burden, which we have addressed by means of a flexible common factor model framework. Ours is the first panel study on debt and growth to address parameter heterogeneity and cross-section dependence, thus allowing for a closer match between economic theory and data restrictions on the one hand and empirical modelling on the other. Third, we used a number of novel empirical estimators and testing procedures to shed light on the potential nonlinearity in the debt-growth relationship, focusing on both the possibility of a debt-growth nonlinearity across and within countries, a distinction previously entirely absent from the empirical literature. It bears emphasising that no empirical study modelling the debt-growth relationship in a pooled panel model can claim to be able to distinguish these two types of nonlinearity.

Our empirical analysis provided some evidence for systematic differences in the debt-growth relationship across countries, but no evidence for systematic within-country nonlinearities in the debt-growth relationship for all countries in our sample. With regard to the first result we observed that long-run debt coefficients appeared to be lower in countries with higher average debt burden, although the average long-run debt coefficient across countries was positive. Regarding the second result, empirical tests seemed to support a linear specification rather than the polynomial specifications popular in the empirical literature. When we employed piecewise linear specifications adopting various pre-specified thresholds, the change in the debt coefficient at the threshold was just as likely to be positive as negative. These findings imply that whatever the shape and form of the debt-growth relationship, it differs across countries, so that appropriate policies for one country may be seriously misguided in another. The commonly found 90% debt threshold is likely to be the outcome of empirical misspecification – a pooled instead of heterogeneous model – and subsequently a misinterpretation of the results, whereby it is assumed that pooled model estimates – obtained from polynomial or piecewise linear specifications for debt – imply that a common nonlinearity detected applies within all countries over time.


Figure 1:

Peak Debt/GDP Ratio Distribution

Citation: IMF Working Papers 2013, 248; 10.5089/9781484309285.001.A001

Notes: The histogram indicates the distribution of peak years for the debt-to-GDP ratio in our sample of 105 countries. Three years, 1985, 1994 and 2009 account for over one third of all debt/GDP peaks.
Figure 2:

Peak Debt/GDP Ratio and Relative Growth


Notes: Along the x-axis we arrange countries by the value of the maximum debt-to-GDP ratio (in logarithms), highlighting three years in particular: 1985 (Triangles: 9/105 countries), 1994 (Squares: 17/105), and 2009 (Diamonds: 11/105). All other years (68/105) are indicated with hollow circles. Along the y-axis we plot the deviation of countries’ (i) average per capita growth rate in the five years around their peak debt year (i.e. peak debt occurs in year 3) from (ii) their average per capita growth rate over the entire time horizon 1972-2009 excluding the five ‘peak debt years.’ For peak debt year 2009 we only construct the growth average from 2007-2009 observations, similarly for 1972 and 1973 peak debt years. A simple (outlier-robust) linear regression of average per capita growth rates on debt-to-GDP peaks (in logarithms) yields the following result (absolute t-ratios in brackets): $0.011\ [0.54] - 0.005\ [1.12]\,\log(\text{debt/GDP})_i^{\max}$
Figure 3:

Interquartile ranges—Growth and Indebtedness


Notes: Each plot shows the interquartile range for GDP per capita growth and the debt/GDP ratio in the year indicated. The three years during which debt/GDP ratios peaked in a substantial number of countries are highlighted. Note that data coverage varies across the sample: for High Income Countries (N = 29) the plot covers 70% of countries in the early 1970s, rising to 80% or more in the late 1970s and early 1980s, and in excess of 90% for the remainder of the period of observation; for Middle Income Countries (N = 53) the 1970s cover a minimum of 60% of countries, rising to more than 80% in the 1980s and in excess of 90% from the early 1990s onwards; for Low Income Countries (N = 23) coverage is poor until 1982 (32%-56%), when over 70% of countries are covered, rising to 90% by the mid-1980s and beyond until 2007, when coverage drops to 68% and then 60% until the end of the sample period.
Figure 4:

Nonlinearities in the Country-Specific Debt-Income Nexus


Notes: This figure plots the unconditional relation between debt/GDP ratio and within-transformed per capita GDP (both in logs). We employ fractional polynomial regression (solid regression line; shaded 95% confidence intervals) for all observations – see Footnote 10 for details on how we limit the sample to aid presentation. In the right plot of the first row we also provide a scatter of all observations; in the left plot of the second row we instead add fractional polynomial regression lines estimated for each country separately, while in the right plot of the same row we pick these for 30 countries at random.
Figure 5:

The Reinhart and Rogoff (2010) approach in our dataset


Notes: In each plot the light-grey bars represent median growth rates, the dark-grey bars the mean growth rates (both left axis), the black line the share of total observations (right axis) for each group respectively. For High Income Countries we have a total of 1,001 observations (29 countries), for the Upper Middle Income, Lower Middle Income and Low Income Countries these figures are 752 (23 countries), 1,021 (30 countries) and 685 (23 countries), respectively. Income classification follows the World Bank approach.
Figure 6:

Patterns for CMG debt coefficients


Notes: We plot the country specific long-run coefficients for debt in each country, taken from the dynamic CMG model with one additional lag (in column [6] of Table 1) against (a) the country-specific average debt/GDP ratio (in logs), and (b) the country-specific peak value for debt/GDP (in logs)—for both plots we reduce the number of countries as detailed below to improve illustration. In both cases we added fitted fractional polynomial regression lines along with 5% and 95% confidence bands (shaded area). We further provide (c) box plots for all 105 country-estimates divided into quintiles of the average country debt/GDP ratio distribution—outliers are omitted from these box plots and we focus on the medians and interquartile ranges (shaded). In (d) we split the sample into the top 25% and bottom 75% by average income and fit fractional polynomial regression lines alongside 5% and 95% confidence bands for each grouping (reduced sample in the plot for illustration). The final set of plots in (e) and (f) presents fitted fractional polynomial regression lines of long-run and short-run debt coefficients against average debt/GDP ratio for all CMG models (columns [4]-[10]), respectively. In each case (as in the first two scatter plots) we omit the observations with average debt/GDP ratio below 12% (ARE, CHN, LUX) as well as countries with absolute long-run debt coefficients (ALR) over 0.5 resulting in 92 [4], 95 [5], 92 [6], 91 [7], 95 [8], 94 [9] and 93 [10] out of 105 observations. This practice excludes the following country estimates in four or more of the seven models: GNB, GUY, HUN, IRL, SLE, SYC, TGO. In all plots we add a horizontal line to mark zero, in most plots we also add a vertical line at 4.5 log points (=90%) of the debt/GDP ratio.
Figure 7:

Debt Coefficient Comparison: three debt-to-GDP thresholds


Notes: We plot the long-run debt coefficients in the low and high debt regime for (top) 52%, (middle) 75% and (bottom) 90% debt/GDP thresholds. In each case we use the CCEMG results with one additional lag of cross-section averages (model [5] in Table 3) for 55, 44 and 29 countries, respectively—countries are only included if they have at least 20% of their observations in one of the two regimes (below/above threshold). The values on the x-axis represent the average debt/GDP ratio and the average debt stock per capita (both in logarithms) for the lower and higher regimes (average over all years in each regime), in the left and right plot respectively. We carried out empirical tests for statistical significance of average coefficient changes at each threshold and report the mean and robust mean estimates together with respective t-ratios.
Table 1:

Linear Dynamic Models

Notes: Results for full sample of N = 105 countries, based on an error correction model with the first difference of log real GDP per worker as dependent variable. We report the robust mean of coefficients across countries in the heterogeneous parameter models in [3]-[10] (Hamilton, 1992) unless indicated; standard errors in these models are constructed following Pesaran & Smith (1995). † The CMG estimator (Pesaran, 2006; Chudik & Pesaran, 2013) is implemented using further cross-section averages (CA) of (a) additional lags and/or (b) other variables (open – trade/GDP, findev – financial development, both in logs) as indicated – see main text for details. $ These models are augmented with country-specific linear trend terms. ‘LRA’ refers to the long-run average coefficient, which is calculated directly from the pooled model ECM results in [1] and [2] and the averages of the heterogeneous model ECM results (standard errors computed via the Delta method) in [3]-[10]. ‘ALR’ refers to the average long-run coefficient in the heterogeneous models, whereby the long-run coefficients are computed from the ECM results in each country and then averaged across the panel. ‘SR’ refers to the short-run coefficients. ‘ALR threshold’ indicates the cross-country implied threshold (i.e. where the impact of debt becomes negative) derived from a linear regression of the long-run debt coefficient on average debt-to-GDP ratio over the entire time period. ♭ The first set of t-statistics are non-parametric statistics derived from the country-specific coefficients following Pesaran & Smith (1995). The second set represent averages across country-specific t-statistics. *, ** and *** indicate significance at 10%, 5% and 1% level respectively. RMSE is the root mean squared error, CD test reports the Pesaran (2004) test, which under the null of cross-section independence is distributed standard normal.
Table 2:

Weak Exogeneity Testing

Notes: Numbers in brackets correspond to the columns in Table 1. For the tests in Panel B cross-section averages of all variables are added to the estimation equation, whereas in Panel A we do not include these. All results are for N = 105. Equation refers to the ECM regression where the named variable is on the LHS, lags reports the number of lagged differences included in the regression. GM-t gives the group-mean average of country-specific t-ratios for the coefficient on the disequilibrium term ($\hat{\lambda}_i$), which is distributed N(0,1); p indicates the corresponding p-value. Avg $\hat{\lambda}_i$ refers to the robust mean coefficient on the ECM term, t-stat the corresponding t-statistic. Underlined p-values or ‘robust’ t-statistics indicate evidence against the hypothesis of a well-specified production function.
Table 3:

Asymmetric Dynamic Models

Notes: We present average long-run coefficients (based on country-specific long-run results) for debt from models which allow for asymmetry in the debt coefficients, adapted to the panel from the time-series approach by Shin, Yu & Greenwood-Nimmo (2013). The dependent variable is the GDP per capita growth rate. Three thresholds are adopted to split the data into two (high/low debt) ‘regimes’: 52% (sample median), 75% and 90% debt/GDP ratio. Countries are only included in the analysis if they have at least 20% of their observations in one of the two regimes (below/above threshold), resulting in 55, 45 and 30 countries, respectively. Models [2]-[6] add cross-section averages to the regressions, those in [5] and [6] further add lags of the cross-section averages in the spirit of Chudik & Pesaran (2013). All models allow for long-run (LR) and short-run (SR) asymmetry, with the exception of model [4], which only allows for long-run asymmetry. ♭ For the coefficient on lagged GDP per capita we report the Pesaran & Smith (1995) nonparametric t-statistics as well as the average of country-specific t-statistics (t), the CD Test is distributed N (0,1) under the null of cross-section independence. RMSE is the root-mean squared error.
Table 4:

Estimated Order of Summability

Notes: The table presents the panel statistics for N = 105 country-specific estimates of the order of summability $\hat{\delta}^*$ (see main text for further details). All variables are in logarithms. We account for the constant term by partial demeaning, and for the additional linear trend term by double partial demeaning as detailed in Berenguer-Rico & Gonzalo (2013b). For each variable we present two sets of statistics: the upper (lower) panel presents mean (median) $\hat{\delta}^*$ across the panel as well as the mean-(median-)based subsampling results (lower and upper 95% confidence bands). Each of the $N - b + 1 = 95$ subsamples of size $b = \mathrm{int}(\sqrt{N}) + 1 = 11$ countries is a random draw of countries from our full sample of N = 105.
Table 5:

Estimated Balance

Notes: The table presents distributional statistics for N = 105 country-specific estimates of the balance in the indicated regression models, $(\hat{\delta}_y - \hat{\delta}_g)$ (see main text for further details). The RHS of each model always includes $\text{cap}_{it}$ and $\text{debt}_{it}$. All variables are in logarithms. We account for a constant term by partial demeaning, and for an additional linear trend term by repeated partial demeaning as detailed in Berenguer-Rico & Gonzalo (2013b). CA refers to the augmentation of the static country regression with cross-section averages following Pesaran (2006): in Panel B we include the model variables, in Panel C we include CA of the openness and financial development variables in addition to the model variables. Underlined mean or median balance statistics indicate evidence against the hypothesis of a balanced regression model, $(\hat{\delta}_y - \hat{\delta}_g) = 0$.
Table 6:


Notes: The table presents distributional statistics for N = 105 country-specific order of summability estimates for the respective model residuals. The RHS of each model always includes $\text{cap}_{it}$ and $\text{debt}_{it}$. All variables are in logarithms. CA refers to the augmentation of the static country regression with cross-section averages following Pesaran (2006), ‘Additional CA’ refer to the CA for the openness and financial development variables (in logs) as described in the main text. We account for a constant term by partial demeaning as detailed in Berenguer-Rico & Gonzalo (2013b). Underlined mean or median co-summability statistics indicate evidence against the hypothesis of a co-summable model specification, $\hat{\delta} = 0$. Since co-summability is conditional on balance, we only print those specifications in black for which we have convincing evidence from the balance testing in Table 5, whereas all other specifications are printed in grey.
Table 7:

Static Linear and Nonlinear Models

Notes: We report the estimates and diagnostic tests for static production functions with linear and polynomial debt terms. All estimates are robust means (see Table 1). The MG models further include trend terms. We do not report the averaged capital stock coefficients and constant terms for any of the models (available on request). ‘Bal & Co-Sum’ indicates the specification which was found to be balanced and co-summable (Tables 5 and 6). † We report the number of countries with statistically significant (5% level) positive and negative debt coefficient using / and \, respectively; ⋃ and ⋂ report the number of countries with statistically significant (5% level) convex and concave debt-growth relationships, respectively. ~ reports the number of countries for which all three debt terms are statistically significant at the 5% level. ‡ All residual series were found to be stationary.