Growth Determinants Revisited

Charalambos Tsangarides, and Alin Mirestean
Published Date: December 2009

I. Introduction

Economic growth and human development topics have been examined since the beginning of recorded history.1 Over the last two decades, the philosophical rhetoric has emphasized the primacy of human development as the ultimate objective of economic pursuits, while empirical work has tried to explain why some countries have experienced rapid long-term growth rates in income while others have not. Economic growth has been described as “the part of macroeconomics that really matters,” not least because relatively small differences in growth rates when cumulated over time can have major consequences for standards of living.

Despite the vast number of cross-country growth studies that followed the seminal papers of Barro (1991) and Mankiw, Romer, and Weil (1992), there is little consensus on the mechanics of economic growth. A fundamental problem for researchers is the lack of an explicit theory identifying the determinants of growth, with surveys of the empirical literature (e.g. Durlauf and Quah (1999), and Durlauf, Johnson, and Temple (2005)) identifying more variables partially correlated with growth than the number of countries for which data are available. Indeed, the neoclassical Solow-Swan (1956) growth model and the endogenous growth models by Romer (1986) and Lucas (1988) are “open-ended” (see Brock and Durlauf (2001)) as they admit a vast range of logical and testable extensions, and a broad number of possible specifications.

Unsystematic searches of “ad hoc” growth model configurations may result in overconfident and fragile inferences (even contradictory conclusions) and fundamentally ignore model uncertainty. As a result, a growing number of growth researchers are turning to Bayesian Model Averaging (BMA) methods to deal with the problem of model uncertainty. Building on the work of Raftery (1995), Fernández, Ley and Steel (2001), Brock and Durlauf (2001), and Sala-i-Martin, Doppelhofer and Miller (2004) introduced model averaging to the growth empirics literature. More recent applications of model averaging suggest several modifications of the earlier BMA framework. For example, Brock, Durlauf, and West (2003), and Durlauf, Kourtelos and Tan (2008) discuss testing for growth theories instead of particular variables, while Ley and Steel (2007, 2009), and Doppelhofer and Weeks (2009) attempt to quantify the degree to which development determinants act “jointly” to affect growth.

However, despite the increasing interest in using BMA to investigate growth empirics, most of the work focuses on cross section analysis with data averaged over the time dimension, thus ignoring dynamic relationships among variables and the dynamic evolution of the growth process. Moreover, very few methodologies allow for the inclusion of variables that are endogenous in a statistical sense, that is, correlated with the disturbance term.2 Both of these issues, modeling dynamics and incorporating endogeneity, are of particular relevance to growth analyses.

This paper revisits the cross-country growth empirics debate using a proposed limited information BMA (LIBMA) methodology to address model uncertainty in the context of a dynamic panel data growth model with endogenous regressors. In particular, we construct a small sample counterpart of the LIBMA developed by Chen, Mirestean, and Tsangarides (2009). The proposed BMA methodology is a limited information technique based on Generalized Method of Moments (GMM) estimation, where the posteriors are obtained through a simple Bayesian procedure taking advantage of the linear structure of the model. Our empirical findings suggest that once endogeneity and model uncertainty are accounted for, various economic factors such as initial conditions and macroeconomic environment are robustly correlated with economic growth. In particular, we find the strongest evidence that initial income, investment, life expectancy, and population growth are robust growth determinants. We also find strong evidence that debt, openness, and inflation are robust growth determinants. Overall, the set of growth determinants for which we find evidence of robustness is different from the sets found by other studies that incorporate model uncertainty. These results suggest that it is important to investigate growth empirics in a setting that explicitly accounts for dynamics and endogeneity.

The rest of the paper is organized as follows. Section 2 presents the theoretical specification, discusses estimation issues, and describes the estimator used for the robustness analysis. Section 3 presents the data and identifies the growth determinants. Section 4 summarizes the results, and Section 5 concludes.

II. Theoretical Considerations

A. A Dynamic Growth Model with Endogenous Regressors

A generic representation of the canonical cross-country growth regression is

$$g = Z\theta + u \qquad (1)$$

where g is the growth rate of output per worker, Y, between the period t and t − 1; Z is an n × k matrix of growth regressors including those suggested by the Solow growth model (population growth, technological change, physical and human capital, and savings rates) and those suggested by new growth theories; θ = (θ1, θ2, …, θk)′ is a vector of unknown parameters to be estimated; and u is the error term.

Much of the work on growth empirics attempts to identify the variables that comprise Z. Suppose there is a universe of k possible explanatory variables indexed by U = {1, 2, …, j, j + 1, …, k}. Let Z be the matrix of all possible explanatory variables. For a given model $M_j$ that considers only a subset of the possible explanatory variables, $M_j \subset U$, let $C_{M_j} = \{c_{mn,M_j}\}_{m,n=1}^{k}$ be a k × k diagonal choice matrix whose diagonal has 1’s if the corresponding variable is included in the model and 0’s otherwise. Hence $c_{ii,M_j} = 1\{i \in M_j\}$, and for a given model $M_j$ the regressor matrix is $Z C_{M_j}$.
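The role of the choice matrix can be illustrated with a small sketch. The data, the 0-based indexing, and the helper names `choice_matrix` and `matmul` are assumptions made for illustration only; they are not part of the paper.

```python
def choice_matrix(k, included):
    """Return the k x k diagonal choice matrix for a model.

    `included` holds the 0-based positions of the variables in the model
    (the text indexes variables from 1; 0-based is used here for convenience).
    """
    return [[1.0 if row == col and row in included else 0.0 for col in range(k)]
            for row in range(k)]


def matmul(a, b):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(a[r][i] * b[i][c] for i in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


# Hypothetical data: n = 2 observations on k = 3 candidate regressors.
Z = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

# A model that contains the first and third variables only.
C = choice_matrix(3, {0, 2})
Z_model = matmul(Z, C)  # the column of the excluded variable becomes zero
```

Post-multiplying by the choice matrix keeps the dimensions of Z fixed across models, so every model can be written with the same k-dimensional parameter vector, with zeros in the excluded positions.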

Assume further that the universe of potential explanatory variables, indexed by the set U, consists of the lagged dependent variable, indexed by 1, a set of m exogenous variables, indexed by X, a set of p predetermined variables, indexed by P, as well as a set of q endogenous variables, indexed by W, such that {{1}, X, P, W} is a partition of U.

Let us define yit as the log of the output per worker, Yit, that is, yit = log(Yit). Therefore, the dynamic growth model for panel data for a given set of explanatory variables, that is, a particular model Mj ⊂ U, can be written as

where Yit, xit, pit, and wit are observed variables, ηi is the unobserved individual effect while vit is the idiosyncratic random error. The exact distributions for vit and ηi are not specified here, but assumptions about some of their moments and correlation with the regressors are made explicit below. It is assumed that E(vit) = 0 and that the vit’s are not serially correlated. xit is a 1 × m vector of exogenous variables, pit is a 1 × p vector of predetermined variables, while wit is a 1 × q vector of endogenous variables. Therefore, the total number of possible explanatory variables is k = m + q + p + 1. The observed variables span N countries and T periods, where T is small relative to N. The unknown parameters α, θx, θp, and θw are to be estimated. In this model, α is a scalar, θx is a 1 × m vector, θp is a 1 × p vector, while θw is a 1 × q vector.
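For orientation, a model of this kind (referred to as model (2) in the text) can be sketched as follows. This is one form consistent with the definitions in this subsection, not necessarily the authors' exact display. The transposes are an assumption that follows from the parameters being described as row vectors, and the restriction to a particular model Mj (zeros in the excluded positions) is left implicit.

```latex
y_{it} = \alpha\, y_{i,t-1} + x_{it}\theta_x' + p_{it}\theta_p' + w_{it}\theta_w' + u_{it},
\qquad u_{it} = \eta_i + v_{it},
\qquad i = 1, \ldots, N;\; t = 1, \ldots, T .
```

Here the composite error $u_{it}$ collects the individual effect and the idiosyncratic error, which is how $u_{it}$ is used in the moment conditions for the levels equation.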

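Given the assumptions made so far, for any model Mj and any exogenous variable $x^l_{it} \in x_{it}$, we have $E(x^l_{it} v_{is}) = 0$ for all i, t, s. Similarly, for any endogenous variable $w^l_{it} \in w_{it}$ we have

$$E(w^l_{it} v_{is}) \neq 0 \ \text{ for } s \le t, \qquad E(w^l_{it} v_{is}) = 0 \ \text{ otherwise,}$$

while for predetermined variables $p^l_{it} \in p_{it}$ the conditions are

$$E(p^l_{it} v_{is}) \neq 0 \ \text{ for } s < t, \qquad E(p^l_{it} v_{is}) = 0 \ \text{ otherwise.}$$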

B. Estimation and Moment Conditions

A common approach for estimating the model (2) is to use the system GMM framework (see Arellano and Bover (1995), and Blundell and Bond (1998)). This implies constructing the instruments set and moment conditions for the “levels equation” (2) and combining them with the moment conditions using the instruments corresponding to the “first-difference” equation written as
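As a sketch, and assuming the levels equation has the additive form $y_{it} = \alpha y_{i,t-1} + x_{it}\theta_x' + p_{it}\theta_p' + w_{it}\theta_w' + \eta_i + v_{it}$ (an assumption consistent with the definitions above, not a quotation of the authors' display), taking first differences removes the individual effect:

```latex
\Delta y_{it} = \alpha\, \Delta y_{i,t-1} + \Delta x_{it}\theta_x' + \Delta p_{it}\theta_p'
              + \Delta w_{it}\theta_w' + \Delta v_{it},
\qquad t = 2, 3, \ldots, T,
```

since $\Delta u_{it} = (\eta_i + v_{it}) - (\eta_i + v_{i,t-1}) = \Delta v_{it}$. This is why the instruments for this equation only need to be uncorrelated with $\Delta v_{it}$, not with $\eta_i$.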

One assumption required for the first difference equation is that the initial value of y, yi0, is predetermined, that is, E(yi0 vis) = 0 for s = 2, 3, …, T. Since yi,t−2 is not correlated with Δvit it can be used as an instrument, and we have E(yi,t−2 Δvit) = 0 for t = 2, 3, …, T. Moreover, since yi,t−3 is also not correlated with Δvit, it can be used as an instrument as long as we have enough observations (that is, T ≥ 3). Assuming that we have more than two observations in the time dimension, the following moment conditions could be used for estimation

$$E(y_{i,t-s}\,\Delta v_{it}) = 0, \qquad t = 2, 3, \ldots, T;\; s = 2, 3, \ldots, t.$$

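Similarly, the exogenous variable $x^l_{it}$, $x^l_{it} \in x_{it}$, is not correlated with Δvit and therefore we can use it as an instrument.3 That gives us the additional moment conditions

$$E(x^l_{it}\,\Delta v_{it}) = 0, \qquad t = 2, 3, \ldots, T;\; l = 1, 2, \ldots, m.$$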

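The predetermined variable $p^l_{i,t-1}$ (the first lag of an element $p^l_{it}$ of pit) is not correlated with Δvit and therefore it can be used as an instrument. We have the following possible moment conditions

$$E(p^l_{i,t-s}\,\Delta v_{it}) = 0, \qquad t = 2, 3, \ldots, T;\; s = 1, 2, \ldots, t-1;\; l = 1, 2, \ldots, p.$$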

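The endogenous variable $w^l_{i,t-2}$ (the second lag of an element $w^l_{it}$ of wit) is not correlated with Δvit and therefore it can be used as an instrument. We have the following possible moment conditions

$$E(w^l_{i,t-s}\,\Delta v_{it}) = 0, \qquad t = 3, 4, \ldots, T;\; s = 2, 3, \ldots, t-1;\; l = 1, 2, \ldots, q.$$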

Table A summarizes the moment conditions that could be used for the first difference equation. Basically, the first difference equation provides T (T − 1)/2 moment conditions for the lagged dependent variable, m (T − 1) moment conditions for the exogenous variables, and q (T − 2)(T − 1)/2 moment conditions for the endogenous variables.

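Table A. Moment Conditions for the First Difference Equation

| Variable | Instruments | Moment conditions |
|---|---|---|
| $\Delta y_{i,t-1}$ | $y_{i,t-2}, \ldots, y_{i,0}$ | $E(y_{i,t-s}\Delta v_{it}) = 0$, t = 2, 3, …, T; s = 2, 3, …, t |
| $\Delta x^l_{it}$ | $x^l_{it}, \ldots, x^l_{i1}$ | $E(x^l_{it}\Delta v_{it}) = 0$, t = 2, 3, …, T; l = 1, 2, …, m |
| $\Delta p^l_{it}$ | $p^l_{i,t-1}, \ldots, p^l_{i,1}$ | $E(p^l_{i,t-s}\Delta v_{it}) = 0$, t = 2, 3, …, T; s = 1, 2, …, t − 1; l = 1, 2, …, p |
| $\Delta w^l_{it}$ | $w^l_{i,t-2}, \ldots, w^l_{i,1}$ | $E(w^l_{i,t-s}\Delta v_{it}) = 0$, t = 3, 4, …, T; s = 2, 3, …, t − 1; l = 1, 2, …, q |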
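The counts implied by Table A can be checked by enumerating the (t, s) index pairs directly. The sketch below is only a counting aid; the function names are hypothetical and the example values of T, m, p, and q are arbitrary.

```python
def lag_pairs(T, first_t, first_s, last_s):
    """List the (t, s) pairs for which the s-th lag instruments the equation at period t."""
    return [(t, s) for t in range(first_t, T + 1)
            for s in range(first_s, last_s(t) + 1)]


def first_difference_counts(T, m, p, q):
    """Number of first-difference moment conditions by variable type, following Table A."""
    per_lagged_dep = len(lag_pairs(T, 2, 2, lambda t: t))          # t = 2..T, s = 2..t
    per_predetermined = len(lag_pairs(T, 2, 1, lambda t: t - 1))   # t = 2..T, s = 1..t-1
    per_endogenous = len(lag_pairs(T, 3, 2, lambda t: t - 1))      # t = 3..T, s = 2..t-1
    return {
        "lagged dependent": per_lagged_dep,
        "exogenous": m * (T - 1),                                  # one condition per t = 2..T
        "predetermined": p * per_predetermined,
        "endogenous": q * per_endogenous,
    }


# Arbitrary example: T = 6 periods, 5 exogenous, 1 predetermined, 6 endogenous variables.
counts = first_difference_counts(T=6, m=5, p=1, q=6)
```

With T = 6, each lagged-dependent or predetermined variable contributes T(T − 1)/2 = 15 conditions and each endogenous variable contributes (T − 2)(T − 1)/2 = 10, which is the quadratic growth in T discussed later in the text.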

For the levels equation (2), it is easy to see that first differences of the lagged dependent variable are not correlated with either the individual effects or the idiosyncratic error term and hence we can use the following moment conditions

$$E(\Delta y_{i,t-1}\, u_{it}) = 0, \qquad t = 2, 3, \ldots, T.$$

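Similarly, for the endogenous variables, the first difference $\Delta w^l_{i,t-1}$ is not correlated with uit. Therefore, assuming that $w^l_{i,1}$ is observable, and as long as T ≥ 3, we have the following additional moment conditions

$$E(\Delta w^l_{i,t-1}\, u_{it}) = 0, \qquad t = 3, 4, \ldots, T;\; l = 1, 2, \ldots, q.$$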

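For the predetermined variables, the first difference $\Delta p^l_{it}$ is not correlated with uit. Therefore, assuming that $p^l_{i,1}$ is observable, and as long as T ≥ 2, we have the following additional moment conditions

$$E(\Delta p^l_{it}\, u_{it}) = 0, \qquad t = 2, 3, \ldots, T;\; l = 1, 2, \ldots, p.$$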

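Finally, based on the assumptions made so far, the first differences of the exogenous variables, $\Delta x^l_{it}$ with $x^l_{it} \in x_{it}$, are not correlated with current realizations of uit and hence one can use another set of moment conditions

$$E(\Delta x^l_{it}\, u_{it}) = 0, \qquad t = 2, 3, \ldots, T;\; l = 1, 2, \ldots, m.$$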

Table B summarizes the moment conditions for the level equation.

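Table B. Moment Conditions for the Level Equation

| Variable | Instruments | Moment conditions |
|---|---|---|
| $y_{i,t-1}$ | $\Delta y_{i,t-1}$ | $E(\Delta y_{i,t-1} u_{it}) = 0$, t = 2, 3, …, T |
| $x^l_{it}$ | $\Delta x^l_{it}$ | $E(\Delta x^l_{it} u_{it}) = 0$, t = 2, 3, …, T; l = 1, 2, …, m |
| $p^l_{it}$ | $\Delta p^l_{it}$ | $E(\Delta p^l_{it} u_{it}) = 0$, t = 2, 3, …, T; l = 1, 2, …, p |
| $w^l_{it}$ | $\Delta w^l_{i,t-1}$ | $E(\Delta w^l_{i,t-1} u_{it}) = 0$, t = 3, 4, …, T; l = 1, 2, …, q |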

The equation in levels provides (T − 1) moment conditions for the lagged dependent variable, m(T − 1) moment conditions for the exogenous variables, p(T − 1) moment conditions for the predetermined variables, and q(T − 2) moment conditions for the endogenous variables.

Furthermore, as shown by Ahn and Schmidt (1995), (T − 1) additional linear moment conditions are available if the vit disturbances are assumed to be homoskedastic through time and $E(\Delta y_{i1} u_{i2}) = 0$. Specifically,

Let ui and Dvi denote the T × 1 and (T − 1) × 1 matrices of the error term and the first differenced idiosyncratic random error, respectively, as defined in model (2): ui = (ui1, ui2, …, uiT)′ and Dvi = (Δvi2, Δvi3, …, ΔviT)′. Define a (2T − 1) × 1 matrix $U_i = (u_i' \;\; Dv_i')'$ that contains both the error term and the first differenced idiosyncratic random error. The full set of moment conditions can now be written in matrix form

$$E(G_i' U_i) = 0$$

where Gi is a (2T − 1) × (T + 2m − 2 + p(T + 2)(T − 1)/2 + (T + 1)((T − 2)q + T)/2) matrix defined as

C. Model Uncertainty

Given a universe of k possible explanatory variables for our growth regression, we have a set of $K = 2^k$ models M = (M1, …, MK) under consideration. In the spirit of Bayesian inference, priors p(θ|Mj) for the parameters of each model, and a prior p(Mj) for each model in the model space M are specified. Let D = (Y Z) denote the data set available to the researcher. The probability that Mj is the correct model, given the data D, is, by Bayes’ rule

$$p(M_j|D) = \frac{p(D|M_j)\,p(M_j)}{\sum_{i=1}^{K} p(D|M_i)\,p(M_i)},$$

where

$$p(D|M_j) = \int p(D|\theta, M_j)\,p(\theta|M_j)\,d\theta$$

is the marginal probability of the data given model Mj.

Hypothesis testing for the comparison of model Mj against Mi is based on the posterior probabilities and expressed by the posterior odds ratio $\frac{p(M_j|D)}{p(M_i|D)} = \frac{p(D|M_j)}{p(D|M_i)} \cdot \frac{p(M_j)}{p(M_i)}$. Essentially, the data updates the prior odds ratio $\frac{p(M_j)}{p(M_i)}$ through the Bayes factor $\frac{p(D|M_j)}{p(D|M_i)}$ to measure the extent to which the data support Mj over Mi.4 Evaluating the Bayes factors needed for hypothesis testing and Bayesian model selection or model averaging requires calculating the marginal likelihood $p(D|M_j) = \int p(D|\theta, M_j)\,p(\theta|M_j)\,d\theta$.
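The updating of prior odds by the Bayes factor can be made concrete with a short sketch. It assumes the log marginal likelihoods of the two models are already available; the numbers and the function name `posterior_odds` are purely illustrative.

```python
import math


def posterior_odds(log_ml_j, log_ml_i, prior_j, prior_i):
    """Posterior odds of model j against model i: Bayes factor times prior odds.

    Marginal likelihoods are passed in logs to avoid numerical underflow.
    """
    bayes_factor = math.exp(log_ml_j - log_ml_i)
    prior_odds = prior_j / prior_i
    return bayes_factor * prior_odds


# Hypothetical log marginal likelihoods for two competing models.
odds_equal_priors = posterior_odds(-100.0, -102.0, 0.5, 0.5)
odds_skewed_priors = posterior_odds(-100.0, -102.0, 0.25, 0.75)
```

With equal prior model probabilities the prior odds equal one, so the posterior odds reduce to the Bayes factor alone.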

Since our growth model is dynamic and we have to account for endogeneity, we are going to account for model uncertainty by using the limited information Bayesian model averaging methodology proposed by Chen, Mirestean, and Tsangarides (2009). They advanced a method for constructing the marginal likelihoods (and posteriors) based only on information elicited from moment conditions, with no specific distributional assumptions. Chen, Mirestean, and Tsangarides (2009) consider a likelihood dependent, unit information prior (see Kass and Wasserman (1995)) which enables the derivation of a posterior in a simple Bayesian Information Criterion (BIC)-like form. Following their approach the model likelihood for a given model Mj for which θ has kj elements different from zero is given by

where $\hat{\theta}_{0,j}$ denotes the estimate for θ.

Then the moment conditions associated with model Mj can be written as $E[G_i'(\tilde{y}_i - \tilde{z}_i C_{M_j}\theta_0)] = 0$, where Gi is the instrument matrix. Using (8), the posterior odds ratio of two models M1 and M2 is given by

which has the same form of BIC as fully specified models. We further assume a uniform distribution over the model space, which implies that there is no preference for a specific model, so p(M1) = p(M2) = … = p(MK) = 1/K.

Using Bayesian Model Averaging, inference for a quantity of interest Γ can be constructed based on the posterior distribution

$$p(\Gamma|D) = \sum_{j=1}^{K} p(\Gamma|D, M_j)\,p(M_j|D)$$

which follows by the law of total probability.5 Therefore, the full posterior distribution of Γ is a weighted average of the posterior distributions under each model (M1, …, MK), where the weights are the posterior model probabilities p(Mj|D). Going back to the linear regression model (2), BMA allows the computation of the inclusion probability for every possible explanatory variable


Using (10), posterior means and variances for the parameters θi can be constructed, respectively, as follows
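A minimal sketch of these BMA summaries is given below. It uses the standard model-averaging formulas (inclusion probability as the sum of posterior probabilities of models containing the variable, posterior mean as the probability-weighted average with zeros for excluded variables, and posterior variance from the law of total variance), which may differ in detail from the authors' expressions. All inputs and function names are hypothetical.

```python
import math


def posterior_model_probs(log_marginal_liks, priors=None):
    """Posterior model probabilities from log marginal likelihoods (uniform prior by default)."""
    K = len(log_marginal_liks)
    if priors is None:
        priors = [1.0 / K] * K
    log_w = [lml + math.log(pr) for lml, pr in zip(log_marginal_liks, priors)]
    top = max(log_w)                       # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def bma_summary(models, probs, k):
    """Inclusion probabilities, posterior means and variances for k candidate variables.

    `models` is a list of dicts mapping a variable position to the pair
    (posterior mean, posterior variance) of its coefficient under that model.
    """
    incl, mean, second = [0.0] * k, [0.0] * k, [0.0] * k
    for model, pr in zip(models, probs):
        for i, (m_i, v_i) in model.items():
            incl[i] += pr
            mean[i] += pr * m_i
            second[i] += pr * (v_i + m_i ** 2)   # E[theta^2 | D, M_j], weighted
    var = [s - m * m for s, m in zip(second, mean)]
    return incl, mean, var


# Hypothetical example: three equally supported models over k = 2 variables.
models = [{0: (0.5, 0.01)}, {0: (0.4, 0.02), 1: (-0.1, 0.05)}, {}]
probs = posterior_model_probs([-10.0, -10.0, -10.0])
incl, mean, var = bma_summary(models, probs, k=2)
```

In this example the first variable appears in two of three equally likely models, so its inclusion probability is 2/3 and its posterior mean is (0.5 + 0.4)/3 = 0.3.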


D. Reducing the Number of Moment Conditions

As suggested by (5) the number of instruments grows quadratically with T, and this may pose a problem as T rises relative to N. While GMM is consistent in short panels, properties of the GMM estimators are sensitive to the choice of instruments when T rises and N is small. Too many instruments may increase asymptotic efficiency but may cause bias/increased variance in small samples (Donald, Imbens, and Newey (2008)). Roodman (2009) shows that a large instrument collection overfits endogenous variables and leads to imprecise estimates of the GMM optimal weighting matrix.

There have been several contributions in the literature on the performance of IV/GMM estimators when instruments are many, also known as “instrument proliferation”.6 More recently, Roodman (2009), and Mehrhoff (2009) propose transformations of the instrument set such as limiting the lag length of the instruments and/or collapsing the instrument set, where each of these transformations makes the instrument count linear in T. Arellano (2003b) and Donald, Imbens, and Newey (2008) attempt to model or select the optimal instruments.

Given the lack of a widely accepted rule for limiting the instrument count, we follow the recent approach suggested by Roodman (2009) and experiment with collapsing the instruments. We discuss below several ways to reduce the number of instruments by collapsing the instruments matrix and reducing the number of lags used. In Appendix I we present the Monte Carlo results (and the Monte Carlo experiment in Appendix II). We group the moment conditions for the first-difference and levels equations into matrices as follows.

For the first difference equation

The first difference equation provides T(T − 1)/2 moment conditions for the lagged dependent variable. We can reduce the count of moment conditions to (T − 1) by stacking the instruments as in matrix $Y_i^a$. In this case we are still using all possible lags of the dependent variable for a given period t.

We can further reduce the count of instruments by limiting the number of lags used. For example, $Y_i^1$, $Y_i^2$, $Y_i^3$ are the (T − 1) × 1, (T − 1) × 2, and (T − 1) × 3 stacked matrices of instruments using at most 1, 2, or 3 of all the possible lags of the dependent variable:
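As a rough illustration of stacking, the sketch below builds a collapsed instrument matrix for the lagged dependent variable in the style of Roodman (2009): one row per period t = 2, …, T and one column per lag distance, with zeros where a lag is unavailable. The column ordering (most recent lag first) and the function name are assumptions, and the series is made up.

```python
def collapsed_lag_instruments(y, max_lags=None):
    """Stacked (collapsed) instruments for the first-difference equation.

    `y` is the list [y_0, y_1, ..., y_T] for one country. The row for period t
    holds y_{t-2}, y_{t-3}, ..., y_0 (most recent lag first), padded with zeros.
    With `max_lags=L` only the L most recent available lags are kept.
    """
    T = len(y) - 1
    width = T - 1 if max_lags is None else max_lags
    rows = []
    for t in range(2, T + 1):
        lags = [y[t - s] for s in range(2, t + 1)]   # y_{t-2}, ..., y_0
        lags = lags[:width]
        rows.append(lags + [0.0] * (width - len(lags)))
    return rows


# Hypothetical series y_0, ..., y_4, so T = 4.
y = [10.0, 11.0, 12.0, 13.0, 14.0]
Y_all = collapsed_lag_instruments(y)              # (T - 1) x (T - 1)
Y_1 = collapsed_lag_instruments(y, max_lags=1)    # (T - 1) x 1
Y_2 = collapsed_lag_instruments(y, max_lags=2)    # (T - 1) x 2
```

Each column yields a single moment condition summed over periods, so the count falls from T(T − 1)/2 to T − 1 when all lags are kept, and to L when at most L lags are used.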

The lagged dependent variable is in fact a predetermined variable. Therefore, the discussion on the instruments of the lagged dependent variable also applies to the instruments of the predetermined variables. The only difference may occur from the fact that at time t = 0 the predetermined variables may not have been observed and hence yi0 would be replaced by 0 in the instruments matrix. Assuming that L represents the maximum number of lags used, the number of moment conditions for the predetermined variables will be Lp.

The first difference equation provides q(T − 2)(T − 1)/2 moment conditions for the endogenous variables. As discussed in the case of the lagged dependent variable, we can reduce the count of moment conditions for the endogenous variables by stacking the matrix of instruments and limiting the number of lags. Hence the matrices of instruments using at most 1 and 2 of all the possible lags of the endogenous variables, $W_i^1$ and $W_i^2$, are given by

Therefore, the number of moment conditions has been reduced to q and 2q, respectively. More generally, if L represents the maximum number of lags used, the number of moment conditions for the endogenous variables will be Lq. Further, in a similar manner, we can reduce the number of moment conditions for the exogenous variables from m(T − 1) to m. Let Xi denote the (T − 1) × m matrix of instruments for the exogenous variables:

For the level equation

For the level equation we can reduce the number of moment conditions by simply stacking the instruments. For example, we can reduce the number of moment conditions for the lagged dependent variable from T − 1 to 1 by just stacking the (T − 1) instruments. Let DYi denote the instruments matrix consisting of first differences of the dependent variable, and DWi the T × q instruments matrix consisting of first differences of the endogenous variables.

Further, let DXi denote the T × m matrix of the first differenced exogenous variables

Finally, let Yi be the T × (T − 1) instrument matrix used for the moment conditions derived from the Ahn and Schmidt (1995) homoskedasticity restriction:

Finally, depending on the maximum number of lags used, we can define the moment conditions in matrix form as:

where the matrix $G_i^a$ corresponds to all lags, $G_i^1$ to 1 lag and $G_i^2$ to a maximum of 2 lags. $G_i^a$ is a (2T − 1) × (m + 1 + (2 + q)(T − 1)) matrix defined as

Similarly, $G_i^1$ and $G_i^2$ are (2T − 1) × (T + 2q + m + 1) and (2T − 1) × (T + 3q + m + 2) matrices. More generally, if L denotes the maximum number of lags being used, the corresponding matrix $G_i^L$ has dimension (2T − 1) × (T + (L + 1)q + m + L) and is defined similarly to $G_i^a$.

As an illustration, Table C below presents the number of moment conditions for various options of T, m, and q for the full set of instruments as well as the collapsed and/or lag reduced options. All the cases presented assume that only one predetermined variable enters the model, i.e. the lagged dependent variable. As the table indicates, collapsing and/or reducing the lags yields dramatic reductions in the number of moment conditions. For example, for a case of 19 regressors (with 6 exogenous, one predetermined and 12 endogenous regressors) and 6 time periods, simply collapsing reduces the number of instruments from 205 to 77, while collapsing and further reducing the lag length to, say, 2, reduces the count further to 50. This is particularly relevant for the analysis in this paper where the sample size is limited.

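Table C. Instruments for various options of T, m, and q

| | T = 6 | T = 6 | T = 6 | T = 10 | T = 10 | T = 10 |
|---|---|---|---|---|---|---|
| Exogenous | m = 5 | m = 5 | m = 6 | m = 5 | m = 5 | m = 6 |
| Endogenous | q = 6 | q = 8 | q = 12 | q = 6 | q = 8 | q = 12 |
| Uncollapsed full | 119 | 147 | 205 | 337 | 425 | 603 |
| Uncollapsed 2 lags | 95 | 117 | 163 | 183 | 229 | 323 |
| Uncollapsed 1 lag | 73 | 87 | 123 | 133 | 165 | 231 |
| Collapsed full | 46 | 56 | 77 | 78 | 96 | 133 |
| Collapsed 3 lags | 38 | 46 | 63 | 42 | 50 | 67 |
| Collapsed 2 lags | 31 | 37 | 50 | 35 | 41 | 54 |
| Collapsed 1 lag | 24 | 28 | 37 | 28 | 32 | 41 |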
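Several rows of Table C can be reproduced from the column dimensions of the instrument matrices stated in the text. The sketch below covers the rows for which the text gives a formula (uncollapsed full, collapsed full, and collapsed with L lags); the function names are hypothetical, and p = 0 reflects the table's assumption that the lagged dependent variable is the only predetermined variable.

```python
def uncollapsed_full(T, m, q, p=0):
    """Column count of the full instrument matrix G_i, as stated in the text."""
    return T + 2 * m - 2 + p * (T + 2) * (T - 1) // 2 + (T + 1) * ((T - 2) * q + T) // 2


def collapsed_full(T, m, q):
    """Column count of the collapsed matrix that keeps all lags."""
    return m + 1 + (2 + q) * (T - 1)


def collapsed_lags(T, m, q, L):
    """Column count of the collapsed matrix that keeps at most L lags."""
    return T + (L + 1) * q + m + L


# The example discussed in the text: T = 6, m = 6 exogenous, q = 12 endogenous regressors.
full = uncollapsed_full(6, 6, 12)
collapsed = collapsed_full(6, 6, 12)
collapsed_two_lags = collapsed_lags(6, 6, 12, L=2)
```

For this case the three counts are 205, 77, and 50, matching the corresponding column of Table C. The uncollapsed rows with limited lags are not covered because the text does not state a formula for them.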

III. The Data

A. Growth Determinants

In this paper we consider growth determinants that capture (proxy) proposed growth theories, policies, institutional characteristics, and other exogenous factors that stimulate growth. In addition to the variables suggested by the “augmented” neoclassical Solow-Swan model, surveys of the empirical growth literature (e.g. Durlauf and Quah (1999), and Durlauf, Johnson and Temple (2005)) identify a large number of explanatory variables grouped into “categories” or distinct growth theories.7 Following these approaches, we construct a sample of 42 growth determinants grouped into 10 categories. We describe the variables and the broad categories below (for more details on the motivation of the choice and literature review see Tsangarides (2004)).

1. Solow-Swan determinants and human capital

The three variables suggested by the “augmented” neoclassical Solow-Swan model are rates of human and physical capital, and population growth. We capture the effect of (i) physical capital through ratios of real investment to GDP; (ii) human capital development through measures of health and educational status (such as life expectancy and school enrollment rates); and (iii) population through population growth rates.

2. Macroeconomic stability

Macroeconomic policies can affect economic growth directly through their effect on the accumulation of capital, or indirectly through their impact on the efficiency with which the factors of production are used. They also send important signals to the private sector about the commitment and credibility of a country’s authorities to manage the economy efficiently and increase the opportunity for profitable investments. Macroeconomic stability is reflected in sustainable budget deficits and low consumption to GDP ratios, low and stable rates of inflation, a limited departure of the real exchange rate from its equilibrium level, and an appropriate exchange rate regime. The impact of macroeconomic stability is captured by (i) inflation and its volatility, (ii) the government budget balance, (iii) government consumption relative to GDP, (iv) debt to GDP ratios, (v) indices of exchange rate overvaluation, and (vi) exchange rate regime classification.

3. Financial development

Financial deepening lowers the cost of borrowing, increases domestic saving, and thus stimulates investment. Also, financial sector development may benefit growth by facilitating access to credit and improving risk-sharing and resource allocation. Financial sector development is measured by the ratio of broad money to GDP and by the ratio of assets of deposit money banks to total bank assets.

4. Trade regime

The proposition that more outward-oriented economies tend to grow faster has been tested extensively. Most studies tend to support the idea that openness to international trade accelerates development and growth by increasing access to free markets and returns from specialization. Theoretical foundations of the positive links between trade openness strategies, growth and poverty reduction come from at least two sources: the neoclassical and the more recent endogenous growth theories. On the one hand, the neoclassical approach explains the gains from trade liberalization by comparative advantages, be they in the form of resource endowment (as in the Heckscher-Ohlin model) or differences in technology (as shown by the Ricardian model). On the other hand, the endogenous growth literature asserts that trade openness positively affects per capita income and growth through economies of scale and technological diffusion between countries. The trade regime and the external environment, generally, are captured by the degree of openness and exogenous terms-of-trade changes.

5. External environment

We capture changes in the (exogenous) external environment by improvements in the terms of trade and estimates of the external regime volatility faced by countries, both of which are associated with improved international competitiveness. We also capture other changes in the external environment by foreign direct investment to GDP, capital flows to GDP, and foreign aid as a percent of GDP.

6. Internal environment

We capture agricultural productivity by the ratio of arable land to total area. We also use proxies for the characteristics of the population like measures of ethnic heterogeneity and ethno-linguistic diversity.

7. Institutions and governance

The distribution of growth benefits is likely to depend not only on the sectoral pattern of growth but also on the degree of popular representation at the policy making level and the effectiveness of the governing institutions. Also, through its likely positive impact on the rule of law and the rate of investment, democracy’s main impact on growth is indirect through the role of secure property rights. In this paper we examine the hypothesis that political freedom is a significant determinant of economic growth using the democracy and autocracy variables as measures of the general openness of political institutions, as well as indices of civil liberties.

8. Violence, war, and conflict

In examining the hypothesis that ethnic divisions influence economic growth, polarized societies may have more difficulties agreeing on the provision of such public goods as infrastructure, education, and growth-enhancing policies, simply because polarization impedes agreement between ethnic groups engaged in competitive rent-seeking. We use proxies for war prevalence as well as domestic conflict and regional conflict to capture spillover effects.

9. Geography and fixed factors

The relationship between geography and growth is complex. The majority of empirical evidence concludes that geographic attributes like tropical climate or being landlocked correlate negatively with recent rates of economic growth. To examine the extent to which geography does matter, we use a variety of factors, including distance to coastline or navigable river and percentage of land area in tropics.

10. Regional characteristics/unobserved heterogeneity

To capture the unexplained regional heterogeneity, we include a set of dummy variables that capture regional groups (e.g. sub-Saharan African countries, Latin American countries, etc.), resource rich country groups and country income groups.

B. Variable Definitions and Sources

The database constructed for the analysis consists of annual data from the Summers and Heston data set (Penn World Tables, version 6.2) and data from other sources. Switching from a cross section to panel estimation is made possible by dividing the total period into shorter time spans. Following earlier studies in the literature, we focus on five-year time intervals, so we obtain a total of eight panels: 1961-1965, 1966-1970, 1971-1975, 1976-1980, 1981-1985, 1986-1990, 1991-1995, and 1996-2000. In addition, we construct eight-year time intervals (which result in five panels, namely, 1961-1968, 1969-1976, 1977-1984, 1985-1992, and 1993-2000) to examine the robustness of the results to the change in the time span. Differences in data availability across countries and variables lead to different sample sizes for different combinations of explanatory variables. Given that for variables classified as endogenous we need at least three observations in order to have useful moment conditions, we filter out countries with fewer than two observations. Moreover, for a given country we restrict the sample such that each variable in the considered universe has the same number of observations. However, we do allow for sample variation across countries in order to use as much information as possible. In this fashion we arrive at an unbalanced, regularly spaced panel set of observations. Table B1 in Appendix III contains details for each category, the component variables, and their sources.
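The construction of the time spans can be sketched as follows. The span boundaries come from the text; the averaging helper, its handling of missing years, and the sample series are assumptions made for illustration.

```python
def period_spans(first_year, last_year, length):
    """Split [first_year, last_year] into consecutive non-overlapping spans of `length` years."""
    return [(start, start + length - 1)
            for start in range(first_year, last_year + 1, length)]


def span_average(series, span):
    """Average the available annual values of `series` (year -> value) within a span.

    Returns None when the span contains no observation.
    """
    values = [series[year] for year in range(span[0], span[1] + 1) if year in series]
    return sum(values) / len(values) if values else None


five_year = period_spans(1961, 2000, 5)    # eight panels
eight_year = period_spans(1961, 2000, 8)   # five panels

# Hypothetical annual series with gaps.
series = {1961: 1.0, 1962: 3.0, 1970: 5.0}
averages = [span_average(series, span) for span in five_year]
```

Spans with no data return `None`, which is one simple way to represent the unbalanced panel that results from differences in data availability.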

From the categories of growth theories identified in the previous section, we identify 15 proxies and consider additional time dummies to capture time effects (see discussion below). As discussed, we consider 5 and 8 year spans for the averages, which result in a universe of $2^k = 2^{22}$ (4,194,304) and $2^k = 2^{19}$ (524,288) regressions, respectively. Based on the discussion in Section 2 and the Monte Carlo results, we focus on collapsed instruments with two lags. The baseline estimation covers 107 countries with 593 observations over the period 1960-2005, with an average of 5.5 observations per country.

IV. Results

We apply the LIBMA methodology to the investigation of growth determinants. In this section, we present results of our analysis and demonstrate how differences in the estimation approach translate into differences in the results. In particular, (i) we demonstrate how model uncertainty affects growth determinants when model uncertainty is not explicitly accounted for in the “ad hoc” growth estimations, (ii) once model uncertainty is accounted for, we demonstrate how accounting for dynamics results in different conclusions; and (iii) once model uncertainty and dynamics are accounted for, we demonstrate how accounting for endogeneity results in different conclusions.

Impact of model uncertainty in “ad hoc” growth regressions

Our investigation for robust growth determinants begins by examining how fragile the results of ad hoc cross-country growth specifications are. Following Tsangarides (2003), we first estimate ad hoc growth regressions using variables from our data set, and then summarize results found in the empirical literature. In the first investigation, we discover how drastically conclusions change with relatively small variations in explanatory variables.8 Adding variables to, say, the Solow determinants, changes the significance and sometimes the sign of various coefficients. Further, the fragility of parameter estimates and the impact of model uncertainty can be detected by observing Appendix B of Durlauf, Johnson and Temple (2005) which summarizes the results of recent empirical work on growth correlates. The significance of parameter estimates tends to fluctuate a lot across studies that use different subsets of the control variables.9 Sometimes the same authors even present different conclusions in studies from different years or when their control variables change.

This investigation confirms that any lessons drawn from ad hoc specifications can be problematic. Because the list of possible regressors can be a linear function of an arbitrary set of control variables, it is difficult to assign a statistical significance or make policy recommendations based on a subset of these variables. This confirms the common tendency for some growth empirical investigation to yield fragile econometric estimates, and underscores the importance of incorporating model uncertainty in the estimation.

Robustness Analysis of Growth Determinants

Table 1 presents the results of the baseline estimation based on a universe of $2^k$ possible models. To examine the effect of using 5-year or 8-year averaged data, the set of k variables includes 15 variables from the categories described in Section 4 and adds, accordingly, time variables corresponding to the spans over which the data were averaged. In particular, for the 5-year averaged data we have $2^{22}$ possible models (namely the 15 variables and 7 time variables corresponding to 7 of the 8 time spans); similarly, for the 8-year averaged data we have $2^{19}$ possible models (15 variables and the 4 time variables corresponding to 4 of the 5 time spans). We choose to depart from the standard demeaning procedures commonly used in the literature because the demeaning approach would be equivalent to having all the time variables present in all the models, effectively assigning them a probability of inclusion equal to 1. Said differently, we let the time effects enter as any other variable in order to avoid imposing the presence of time effects in all models. Therefore, we create time variables for the periods considered in the sample and include them in the set of possible explanatory variables, thus allowing inferences about the relevance of time effects for all the periods considered.
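As a quick check on the size of the model space described above, the following minimal Python sketch counts the candidate models for each data set. The function and variable names are illustrative and are not taken from the paper.

```python
# Size of the model space when every candidate regressor may be in or out.
N_CORE_VARIABLES = 15  # growth determinants described in the text


def model_space_size(n_time_variables: int) -> int:
    """Number of distinct models: 2**k, with k = core + time variables."""
    k = N_CORE_VARIABLES + n_time_variables
    return 2 ** k


five_year = model_space_size(7)   # 7 time variables, so k = 22
eight_year = model_space_size(4)  # 4 time variables, so k = 19
print(five_year, eight_year)
```

With more than four million candidate models in the 5-year case, evaluating a single hand-picked specification covers only a negligible part of the model space, which is the motivation for averaging over models.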

Table 1. Robustness of Growth Determinants: Marginal Evidence of Importance

5-year averages

| Variable | Posterior inclusion probability | Mean | Standard Error |
|---|---|---|---|
| 1 Log(initial income) | 0.87 | -0.211 | 0.204 |
| 2 Log(investment) | 0.84 | 0.147 | 0.184 |
| 3 Log(population growth) | 0.66 | -0.322 | 0.497 |
| 4 Log(years of education) | 0.31 | 0.002 | 0.128 |
| 5 Life expectancy | 0.70 | 0.007 | 0.017 |
| 6 Log(inflation) | 0.28 | -0.035 | 0.191 |
| 7 Debt | 0.67 | -0.039 | 0.064 |
| 8 Overvaluation Index | 0.49 | -0.030 | 0.160 |
| 10 Terms of Trade | 0.24 | 0.087 | 0.580 |
| 11 Ethnic heterogeneity | 0.40 | -0.046 | 0.391 |
| 12 Polity | 0.27 | -0.001 | 0.007 |
| 13 War | 0.44 | -0.038 | 0.169 |
| 14 Tropics | 0.30 | -0.019 | 0.165 |
| 15 Sub-Saharan Africa | 0.47 | -0.019 | 0.264 |
| 16 Panel 1970 | 0.34 | 0.008 | 0.061 |
| 17 Panel 1975 | 0.33 | -0.008 | 0.059 |
| 18 Panel 1980 | 0.19 | 0.002 | 0.039 |
| 19 Panel 1985 | 0.23 | -0.012 | 0.046 |
| 20 Panel 1990 | 0.70 | 0.023 | 0.056 |
| 21 Panel 1995 | 0.35 | -0.004 | 0.047 |
| 22 Panel 2000 | 0.39 | 0.015 | 0.059 |

8-year averages

| Variable | Posterior inclusion probability | Mean | Standard Error |
|---|---|---|---|
| 1 Log(initial income) | 0.97 | -0.202 | 0.066 |
| 2 Log(investment) | 0.73 | 0.212 | 0.077 |
| 3 Log(population growth) | 0.61 | -0.208 | 0.522 |
| 4 Log(years of education) | 0.28 | 0.026 | 0.065 |
| 5 Life expectancy | 0.47 | 0.006 | 0.001 |
| 6 Log(inflation) | 0.75 | -0.370 | 0.412 |
| 7 Debt | 0.29 | -0.035 | 0.007 |
| 8 Overvaluation Index | 0.27 | 0.014 | 0.048 |
| 9 Openness | 0.29 | -0.011 | 0.039 |
| 10 Terms of Trade | 0.26 | 0.091 | 0.985 |
| 11 Ethnic heterogeneity | 0.23 | -0.004 | 0.141 |
| 12 Polity | 0.25 | 0.000 | 0.000 |
| 13 War | 0.21 | -0.019 | 0.039 |
| 14 Tropics | 0.30 | -0.013 | 0.038 |
| 15 Sub-Saharan Africa | 0.34 | 0.016 | 0.077 |
| 16 Panel 1976 | 0.31 | 0.007 | 0.005 |
| 17 Panel 1984 | 0.75 | -0.045 | 0.012 |
| 18 Panel 1992 | 0.34 | 0.013 | 0.007 |
| 19 Panel 2000 | 0.38 | 0.021 | 0.012 |

Notes: 1. Boxed areas indicate inclusion probabilities above 0.50.

Our priors are based on the assumption that each variable considered has the same probability of being included in the model, namely 0.50. The posterior inclusion probability shown in the second column of Table 1 reflects how much the data favors including a particular variable in the regression. The boxed areas in Table 1 indicate variables identified as “robust”. These are the variables for which the posterior inclusion probability is at or above the prior (that is, $p(Z_i|D) \geq 0.50$).10 The unconditional mean and standard deviation, shown in the third and fourth columns, respectively, are computed taking into account all the possible models according to equations (12) and (13). These statistics are useful in examining the marginal impact of a variable, without accounting for the inclusion probability.
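To make these summary statistics concrete, the sketch below computes a posterior inclusion probability and an unconditional mean and standard deviation from a small set of models. It uses the standard model-averaging formulas, which we assume correspond to equations (12) and (13); those equations are not reproduced in this section. The function name `bma_summary` and the three-model example are hypothetical.

```python
import numpy as np


def bma_summary(post_prob, coefs, variances):
    """Summarize a model-averaged posterior for k variables over J models.

    post_prob: (J,) posterior model probabilities, summing to 1.
    coefs, variances: (J, k) per-model estimates, np.nan where the
    variable is excluded from that model.
    """
    post_prob = np.asarray(post_prob, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    variances = np.asarray(variances, dtype=float)
    included = ~np.isnan(coefs)
    # Inclusion probability: total probability of models containing the variable.
    pip = post_prob @ included.astype(float)
    # Assumption: an excluded variable contributes a coefficient of exactly zero.
    b = np.where(included, coefs, 0.0)
    v = np.where(included, variances, 0.0)
    mean = post_prob @ b
    second_moment = post_prob @ (v + b ** 2)
    sd = np.sqrt(second_moment - mean ** 2)
    return pip, mean, sd


# Hypothetical example: three models, two candidate variables.
probs = [0.5, 0.3, 0.2]
coefs = [[1.0, np.nan], [0.8, 0.5], [np.nan, 0.4]]
variances = [[0.04, np.nan], [0.09, 0.01], [np.nan, 0.01]]
pip, mean, sd = bma_summary(probs, coefs, variances)
print(pip, mean, sd)
```

In this example the first variable would count as “robust” under the 0.50 cutoff, since the models that contain it carry 0.8 of the posterior mass.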

The results from the robustness analysis of growth determinants can be summarized as follows. The baseline estimations in Table 1 using 5-year and 8-year averaged data identify six and four variables as robust, respectively. For the 5-year averaged data, the results (shown in the top part of Table 1) show that initial income, investment, population growth, life expectancy, debt, and openness are robust growth determinants. The first four reflect the neoclassical theory variables “augmented” to include measures of human capital. The elasticity of the per capita growth rate with respect to initial income is negative and strongly robust, providing empirical evidence that conditional convergence holds. In addition, three of the remaining four Solow-Swan determinants (investment, population growth, and life expectancy) enter with high inclusion probabilities, indicating that the data favors the inclusion of these variables. Evidence for the inclusion of the second proxy for human capital, education, is weak. Finally, we find that trade openness and the level of debt relative to GDP are robust growth determinants. In addition, the time variable “panel 1990” is robust with a positive coefficient, indicating that a time effect might be present for the span 1986-1990. The results using the 8-year averages sample (shown in the bottom part of Table 1) are broadly in line with those of the 5-year averages: initial income, investment, and population growth enter with high inclusion probabilities. Debt, openness, and life expectancy have inclusion probabilities below 0.50 (though life expectancy is very close, at 0.47). In addition, inflation enters as a new robust variable. Differences between the results of the 5-year and 8-year averages suggest differences in the short-run effects of growth dynamics.
Finally, the estimated rate of convergence λ is 3–4.5 percent, which indicates that, after controlling for model uncertainty and other potential inconsistencies arising from omitted variable and/or endogeneity biases, the estimated rate of convergence is higher than the “standard” cross-section finding of 2–3 percent.11

The finding that initial income and investment are strongly and robustly related to growth is in line with the results from the robustness analyses of Fernández, Ley, and Steel (2001a), Sala-i-Martin, Doppelhofer, and Miller (2004), Papageorgiou and Masanjala (2005), and Moral-Benito (2009). All these BMA studies find the strongest evidence of robustness for initial income, and strong evidence of robustness for the investment measure. In addition, although the magnitudes of the inclusion probabilities in those studies are significantly higher than the ones we report, Fernández, Ley, and Steel (2001a), Sala-i-Martin, Doppelhofer, and Miller (2004), and Papageorgiou and Masanjala (2005) also find high inclusion probabilities for life expectancy, while Sala-i-Martin, Doppelhofer, and Miller (2004) also add school enrollment as a robust determinant, and Moral-Benito (2009) adds the price of investment goods, distance, and political rights. Further, the findings on the importance of the (augmented) Solow-Swan determinants confirm the results of many studies in the growth literature analyzing growth patterns, which have, in particular, reported a significant and positive association with progress on the human development front. Finally, the finding on openness reflects the view that trade plays an important role among the driving factors of growth, confirming policy recommendations based on export-led growth and trade liberalization, which have been at the heart of policy advice for many years.

A number of variables that have been shown in the empirical literature to affect economic growth (such as proxies of macroeconomic stability, institutions, political environment, and geographical factors) appear to have a less robust association with growth in our analysis, since they enter with inclusion probabilities below the 0.50 cutoff. This does not suggest that these determinants are unimportant for growth, but rather that they play a less important role than the ones identified as robust.

Accounting for dynamics and endogeneity

In Table 2 we present the results of applying the methodology used by Fernández, Ley, and Steel (2001a) and Sala-i-Martin, Doppelhofer, and Miller (2004) to our data set. More precisely, we transform our data in order to conduct the cross-section analysis exactly as it was done by Fernández, Ley, and Steel (FLS, 2001a) and in the BACE approach of Sala-i-Martin, Doppelhofer, and Miller (BACE, 2004). The former is a fully Bayesian method that allows for the explicit specification of the parameter priors, while the latter assumes diffuse priors, in a sense reflecting the researcher’s ignorance. For the FLS methodology we use improper noninformative priors for the parameters that are common to all models, and a g-prior structure for the slope parameters (with two values for the latter, identified as “prior 1” and “prior 9” in Fernández, Ley, and Steel (2001b)). For all the simulations we assume an equal prior probability for all the models (each equal to $2^{-k}$). Since FLS and BACE are cross-section analyses, they do not explicitly model dynamics. As a result, differences between the LIBMA results and the FLS and BACE results are attributed to accounting for dynamics and endogeneity.
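For readers unfamiliar with cross-section model averaging, the following sketch illustrates BACE-style posterior model weights on synthetic data, using the BIC-type weight proportional to $n^{-k_j/2} SSE_j^{-n/2}$ under equal model priors (as in Sala-i-Martin, Doppelhofer, and Miller (2004)). It is a simplified illustration under stated assumptions, not the procedure used to produce Table 2: the data are simulated, the function name `bace_weights` is hypothetical, and neither the FLS g-prior nor LIBMA is implemented here.

```python
import itertools

import numpy as np


def bace_weights(y, X):
    """Posterior weights for all 2**k OLS models with equal model priors."""
    n, k = X.shape
    models, log_w = [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            # Every model keeps a constant; the subset selects the regressors.
            Z = np.column_stack([np.ones(n), X[:, list(subset)]])
            beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
            resid = y - Z @ beta
            sse = float(resid @ resid)
            # Log of n^(-k_j/2) * SSE_j^(-n/2), kept in logs for stability.
            log_w.append(-0.5 * len(subset) * np.log(n) - 0.5 * n * np.log(sse))
            models.append(subset)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Inclusion probability: sum of weights of models containing variable i.
    pip = np.array(
        [w[np.array([i in m for m in models])].sum() for i in range(k)]
    )
    return models, w, pip


# Synthetic example: only columns 0 and 2 enter the true model.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)
models, weights, pip = bace_weights(y, X)
print(np.round(pip, 3))
```

Because the sketch treats every regressor as exogenous and ignores the time dimension, it shares the limitation noted above for the cross-section approaches.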

Table 2. LIBMA Comparison with BACE and FLS: Marginal Evidence of Importance

| Variables | FLS prior 1: inclusion probability | FLS prior 1: mean | FLS prior 1: standard error | FLS prior 9: inclusion probability | FLS prior 9: mean | FLS prior 9: standard error | BACE: inclusion probability | BACE: mean | BACE: standard error |
|---|---|---|---|---|---|---|---|---|---|
| 1 Log(initial income) | 1.00 | -0.705 | 0.056 | 1.00 | -0.708 | 0.003 | 1.00 | -0.712 | 0.003 |
| 2 Log(investment) | 0.83 | 0.200 | 0.118 | 0.80 | 0.194 | 0.015 | 0.87 | 0.211 | 0.013 |
| 3 Log(population growth) | 0.09 | -0.001 | 0.071 | 0.07 | -0.002 | 0.003 | 0.11 | 0.000 | 0.006 |
| 4 Log(years of education) | 0.14 | 0.014 | 0.052 | 0.11 | 0.012 | 0.002 | 0.15 | 0.014 | 0.003 |
| 5 Life expectancy | 1.00 | 0.042 | 0.008 | 1.00 | 0.043 | 0.000 | 1.00 | 0.042 | 0.000 |
| 6 Log(inflation) | 0.09 | 0.004 | 0.073 | 0.07 | 0.003 | 0.004 | 0.11 | 0.005 | 0.006 |
| 7 Debt | 1.00 | -0.315 | 0.039 | 1.00 | -0.315 | 0.001 | 1.00 | -0.320 | 0.002 |
| 8 Overvaluation Index | 0.81 | 0.230 | 0.146 | 0.76 | 0.217 | 0.023 | 0.85 | 0.247 | 0.020 |
| 9 Openness | 0.32 | 0.046 | 0.081 | 0.27 | 0.040 | 0.006 | 0.35 | 0.048 | 0.007 |
| 10 Terms of Trade | 0.12 | -0.153 | 0.679 | 0.09 | -0.109 | 0.331 | 0.14 | -0.180 | 0.546 |
| 11 Ethnic heterogeneity | 0.11 | -0.013 | 0.073 | 0.08 | -0.009 | 0.004 | 0.12 | -0.015 | 0.006 |
| 12 Polity | 0.14 | 0.001 | 0.003 | 0.10 | 0.001 | 0.000 | 0.16 | 0.001 | 0.000 |
| 13 War | 0.14 | -0.027 | 0.099 | 0.10 | -0.020 | 0.007 | 0.17 | -0.032 | 0.012 |
| 15 Sub-Saharan Africa | 0.09 | 0.000 | 0.041 | 0.07 | 0.000 | 0.001 | 0.11 | 0.001 | 0.002 |

Notes: 1. Boxed areas indicate inclusion probabilities above 0.50. 2. Areas in yellow are common to the results in Table 1; areas in green (blue) have P(inclusion) > 0.50 in Table 1 but not in Table 2 (have P(inclusion) > 0.50 in Table 2 but not in Table 1).

Compared with the results in Table 1, the results in Table 2 show that both FLS and BACE also find initial income, investment, life expectancy, and debt to be strong determinants of growth. However, openness and population growth, which are identified as robust growth determinants using LIBMA in Table 1, have low inclusion probabilities in Table 2. In addition, unlike in Table 1, the overvaluation index enters with a high inclusion probability in Table 2. More generally, these differences suggest that panel growth analyses that investigate dynamics (and perhaps give a richer picture of growth patterns that is missing from cross-sectional analyses) identify a different set of robust growth determinants from those of cross-section analyses.

Next, we modify the FLS and BACE approaches for implementation in a panel context.12 While these approaches were built with the cross-section analysis in mind (and hence do not address dynamics or endogeneity issues), we construct their “panel analogues” in order to explicitly investigate differences with the LIBMA results. Therefore, since the resulting “panel FLS” and “panel BACE” estimators are constructed in a panel context, the comparison with the LIBMA results in Table 1 would identify differences arising “strictly” from accounting for endogeneity.

Table 3 shows the results from estimating robust growth determinants using the “panel FLS” and “panel BACE” methods on both the 5-year and 8-year averaged data samples. These methodologies identify nine and seven variables as robust for the 5-year and 8-year data sets, respectively, and several time effects as robust growth determinants. Comparing first with the results in Table 2, the panel analogues of FLS and BACE seem to identify four more robust variables (namely, population growth, inflation, openness, and terms of trade). Next, comparisons of the Table 3 and Table 1 results reveal both important similarities and differences. In terms of the similarities, the majority of the robust determinants identified by LIBMA in Table 1 are also identified by the “panel FLS” and “panel BACE” in Table 3. In particular, for the 5-year sample both identify initial income, investment, population growth, life expectancy, debt, and openness, while for the 8-year sample both identify initial income, population growth, inflation, debt, and openness. However, in the former sample the “panel FLS” and “panel BACE” identify inflation, overvaluation, and terms of trade (when LIBMA does not), and in the latter sample the “panel FLS” and “panel BACE” identify openness, overvaluation, and war (when LIBMA does not) and do not identify investment (when LIBMA does). In summary, there are six “wrongly” identified variables in Table 3 compared to Table 1. These differences suggest that accounting for endogeneity (as done by the LIBMA) identifies different robust determinants than the case where endogeneity is not accounted for (as in the “panel FLS” and “panel BACE”). Specifically, inflation, investment, and openness (all endogenous variables) are “wrongly” identified in Table 3, as is terms of trade (an exogenous variable). Finally, the “panel FLS” and “panel BACE” methods identify many time effects as robust determinants when the LIBMA does not.
As time dummies could be proxies of other effects, it is possible that the LIBMA better identifies their effect.

Table 3. LIBMA Comparison with BACE and FLS (Panel Data Format): Marginal Evidence of Importance

5-year averages

| Variables | FLS prior 1: inclusion probability | FLS prior 1: mean | FLS prior 1: standard error | FLS prior 9: inclusion probability | FLS prior 9: mean | FLS prior 9: standard error | BACE: inclusion probability | BACE: mean | BACE: standard error |
|---|---|---|---|---|---|---|---|---|---|
| 1 Log(initial income) | 1.00 | -0.286 | 0.253 | 1.00 | -0.287 | 0.064 | 1.00 | -0.286 | 0.064 |
| 3 Log(population growth) | 1.00 | -0.208 | 0.355 | 1.00 | -0.207 | 0.127 | 1.00 | -0.207 | 0.126 |
| 4 Log(years of education) | 0.30 | -0.021 | 0.197 | 0.33 | -0.024 | 0.043 | 0.30 | -0.021 | 0.039 |
| 5 Life expectancy | 0.56 | 0.003 | 0.015 | 0.57 | 0.003 | 0.000 | 0.56 | 0.003 | 0.000 |
| 7 Debt | 1.00 | -0.044 | 0.093 | 1.00 | -0.044 | 0.009 | 1.00 | -0.044 | 0.009 |
| 8 Overvaluation Index | 0.79 | 0.055 | 0.231 | 0.82 | 0.057 | 0.055 | 0.79 | 0.055 | 0.053 |
| 9 Openness | 0.70 | 0.058 | 0.267 | 0.72 | 0.058 | 0.073 | 0.70 | 0.058 | 0.072 |
| 10 Terms of Trade | 0.96 | 0.260 | 0.860 | 0.96 | 0.261 | 0.742 | 0.96 | 0.260 | 0.739 |
| 11 Ethnic heterogeneity | 0.05 | -43.609 | 4260.216 | 0.05 | -48.450 | 20059011.614 | 0.05 | -43.735 | 18180937.791 |
| 12 Polity | 0.06 | 0.000 | 0.004 | 0.07 | 0.000 | 0.000 | 0.06 | 0.000 | 0.000 |
| 13 War | 0.15 | -0.005 | 0.087 | 0.16 | -0.005 | 0.008 | 0.15 | -0.005 | 0.008 |
| 14 Tropics | 0.04 | 64.500 | 8727.567 | 0.05 | 71.353 | 84085045.370 | 0.04 | 64.641 | 76267725.694 |
| 15 Sub-Saharan Africa | 0.04 | 0.000 | 0.001 | 0.04 | 0.000 | 0.000 | 0.04 | 0.000 | 0.000 |
| 16 Panel 1970 | 0.09 | -0.004 | 0.097 | 0.10 | -0.004 | 0.010 | 0.09 | -0.004 | 0.009 |
| 17 Panel 1975 | 0.32 | -0.015 | 0.108 | 0.32 | -0.014 | 0.012 | 0.32 | -0.014 | 0.012 |
| 18 Panel 1980 | 0.63 | 0.028 | 0.125 | 0.66 | 0.029 | 0.016 | 0.64 | 0.028 | 0.016 |
| 19 Panel 1985 | 0.43 | -0.021 | 0.102 | 0.41 | -0.020 | 0.010 | 0.43 | -0.021 | 0.010 |
| 20 Panel 1990 | 0.65 | 0.054 | 0.145 | 0.67 | 0.056 | 0.022 | 0.65 | 0.054 | 0.021 |
| 21 Panel 1995 | 0.64 | 0.054 | 0.159 | 0.67 | 0.057 | 0.027 | 0.65 | 0.055 | 0.026 |
| 22 Panel 2000 | 0.97 | 0.097 | 0.190 | 0.97 | 0.100 | 0.038 | 0.97 | 0.098 | 0.036 |

8-year averages

| Variables | FLS prior 1: inclusion probability | FLS prior 1: mean | FLS prior 1: standard error | FLS prior 9: inclusion probability | FLS prior 9: mean | FLS prior 9: standard error | BACE: inclusion probability | BACE: mean | BACE: standard error |
|---|---|---|---|---|---|---|---|---|---|
| 1 Log(initial income) | 1.00 | -0.411 | 0.129 | 1.00 | -0.411 | 0.129 | 1.00 | -0.411 | 0.129 |
| 2 Log(investment) | 0.22 | 0.015 | 0.035 | 0.23 | 0.015 | 0.035 | 0.23 | 0.015 | 0.036 |
| 3 Log(population growth) | 0.54 | -0.109 | 0.455 | 0.55 | -0.110 | 0.456 | 0.55 | -0.110 | 0.460 |
| 4 Log(years of education) | 0.09 | -0.005 | 0.031 | 0.09 | -0.005 | 0.031 | 0.09 | -0.005 | 0.031 |
| 5 Life expectancy | 0.33 | 0.002 | 0.000 | 0.33 | 0.002 | 0.000 | 0.34 | 0.002 | 0.000 |
| 6 Log(inflation) | 0.99 | -0.115 | 0.113 | 0.99 | -0.115 | 0.113 | 0.99 | -0.115 | 0.113 |
| 7 Debt | 0.99 | -0.059 | 0.025 | 0.99 | -0.059 | 0.025 | 0.99 | -0.059 | 0.025 |
| 8 Overvaluation Index | 0.53 | 0.044 | 0.077 | 0.53 | 0.044 | 0.077 | 0.54 | 0.045 | 0.078 |
| 9 Openness | 0.87 | 0.108 | 0.173 | 0.87 | 0.108 | 0.173 | 0.87 | 0.109 | 0.173 |
| 10 Terms of Trade | 0.06 | 0.003 | 0.159 | 0.06 | 0.003 | 0.159 | 0.06 | 0.003 | 0.159 |
| 11 Ethnic heterogeneity | 0.06 | 85.864 | 46669196.186 | 0.06 | 86.091 | 46791357.240 | 0.06 | 86.206 | 46803836.596 |
| 12 Polity | 0.07 | 0.000 | 0.000 | 0.07 | 0.000 | 0.000 | 0.07 | 0.000 | 0.000 |
| 13 War | 0.71 | -0.061 | 0.090 | 0.71 | -0.061 | 0.090 | 0.71 | -0.061 | 0.090 |
| 14 Tropics | 0.06 | 217.775 | 220721773.374 | 0.06 | 218.345 | 221296428.873 | 0.06 | 218.339 | 221340492.117 |
| 15 Sub-Saharan Africa | 0.05 | 0.000 | 0.000 | 0.05 | 0.000 | 0.000 | 0.05 | 0.000 | 0.000 |
| 16 Panel 1976 | 0.08 | 0.002 | 0.009 | 0.08 | 0.002 | 0.009 | 0.08 | 0.002 | 0.009 |
| 17 Panel 1984 | 0.14 | -0.001 | 0.012 | 0.14 | -0.001 | 0.012 | 0.14 | -0.001 | 0.012 |
| 18 Panel 1992 | 0.89 | 0.068 | 0.047 | 0.89 | 0.068 | 0.047 | 0.89 | 0.068 | 0.047 |
| 19 Panel 2000 | 0.99 | 0.129 | 0.065 | 0.99 | 0.129 | 0.065 | 0.99 | 0.129 | 0.065 |

Notes: 1. Boxed areas indicate inclusion probabilities above 0.50. 2. Areas in yellow are common to the results in Table 1; areas in green (blue) have P(inclusion) > 0.50 in Table 1 but not in Table 2 (have P(inclusion) > 0.50 in Table 2 but not in Table 1).

V. Conclusions

This paper aims to provide some insights into the mechanics of economic growth by investigating robust determinants of economic growth across the world. The methodology used in this paper combines dynamic panel estimation and Bayesian Model Averaging to simultaneously address endogeneity, omitted variable bias, and model uncertainty, problems that have previously plagued empirical work on growth.

Based on a broad number of growth determinants, our investigation shows that once model uncertainty and other potential inconsistencies are accounted for, there are several factors that robustly affect growth. Our main results are summarized as follows. First, we find the strongest evidence for the robustness of four determinants, namely, initial income, investment, population growth, and life expectancy, while there is strong evidence for the robustness of inflation, debt, and trade openness. Given the robustness of initial income, the conditional convergence hypothesis holds, with estimated rates of convergence in the range of 3–4.5 percent. In addition, several other variables that have been used in “ad hoc” growth regressions in the literature are generally not found to be robust. Second, we identify significant differences between our results and those of the existing literature that addresses model uncertainty but fails to account for dynamics and endogeneity. These differences underscore the importance of addressing dynamics and endogeneity in addition to model uncertainty in growth empirics, and suggest that LIBMA may be a useful tool in this investigation.


References

    Ahn, S., and P. Schmidt, 1995, “Efficient Estimation of a Model with Dynamic Panel Data,” Journal of Econometrics, Vol. 68, pp. 5–27.

    Anand, S., and A. Sen, 2000, “Human Development and Economic Sustainability,” World Development, Vol. 28, pp. 2029–49.

    Arellano, M., 2003, Panel Data Econometrics (Oxford University Press).

    Arellano, M., and O. Bover, 1995, “Another Look at the Instrumental-Variable Estimation of Error Components Models,” Journal of Econometrics, Vol. 68, pp. 29–52.

    Arellano, M., 2002, “Modelling Optimal Instrumental Variables for Dynamic Panel Data Models,” CEMFI Working Paper no. 0310.

    Barro, R., 1991, “Economic Growth in a Cross Section of Countries,” Quarterly Journal of Economics, Vol. 106, pp. 223–51.

    Blundell, R., and S. Bond, 1998, “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models,” Journal of Econometrics, Vol. 87, pp. 114–343.

    Brock, W., and S. Durlauf, 2001, “Growth Economics and Reality,” World Bank Economic Review, Vol. 15, pp. 229–72.

    Brock, W., S.N. Durlauf, and K. West, 2003, “Policy Evaluation in Uncertain Economic Environments,” Brookings Papers on Economic Activity, 1, pp. 235–322.

    Chen, H., A. Mirestean, and C. Tsangarides, 2009, “Limited Information Bayesian Model Averaging for Dynamic Panels with Short Time Periods,” IMF Working Papers 09/74.

    Donald, S., G. Imbens, and W. Newey, 2008, “Empirical Likelihood Estimation and Consistent Tests with Conditional Moment Restrictions,” mimeo.

    Doppelhofer, G., and M. Weeks, 2009, “Jointness of Growth Determinants,” Journal of Applied Econometrics, Vol. 24, pp. 209–244.

    Durlauf, S., and D. Quah, 1999, “The New Empirics of Economic Growth,” in J.B. Taylor and M. Woodford (eds.), Handbook of Macroeconomics, Vol. IA (Amsterdam: North Holland).

    Durlauf, S., P. Johnson, and J. Temple, 2005, “Growth Econometrics,” in P. Aghion and S. Durlauf (eds.), Handbook of Economic Growth, Vol. IA (Elsevier: North Holland).

    Durlauf, S.N., A. Kourtellos, and C.M. Tan, 2008, “Are Any Growth Theories Robust?” Economic Journal, Vol. 118, pp. 329–346.

    Fernández, C., E. Ley, and M. Steel, 2001a, “Model Uncertainty in Cross-Country Growth Regressions,” Journal of Applied Econometrics, Vol. 16, pp. 563–76.

    Fernández, C., E. Ley, and M.F.J. Steel, 2001b, “Benchmark Priors for Bayesian Model Averaging,” Journal of Econometrics, Vol. 100, pp. 381–427.

    Hayashi, F., 2009, Econometrics (Princeton University Press).

    Kass, R., and L. Wasserman, 1995, “A Reference Bayesian Test for Nested Hypotheses and its Relationship to the Schwarz Criterion,” Journal of the American Statistical Association, Vol. 90, pp. 928–34.

    Ley, E., and M.F.J. Steel, 2007, “Jointness in Bayesian Variable Selection with Applications to Growth Regression,” Journal of Macroeconomics, Vol. 29, pp. 476–493.

    Ley, E., and M.F.J. Steel, 2009, “On the Effect of Prior Assumptions in Bayesian Model Averaging with Applications to Growth Regression,” Journal of Applied Econometrics, forthcoming.

    Lucas, R., 1988, “On the Mechanics of Economic Development,” Journal of Monetary Economics, Vol. 22, pp. 3–42.

    Mankiw, G., D. Romer, and D. Weil, 1992, “A Contribution to the Empirics of Economic Growth,” Quarterly Journal of Economics, Vol. 107, pp. 407–37.

    Mehrhoff, J., 2009, “A Solution to the Problem of Too Many Instruments in Dynamic Panel Data GMM,” Deutsche Bundesbank Discussion Papers 2009/31.

    Moral-Benito, E., 2009, “Determinants of Economic Growth: a Bayesian Panel Data Approach,” World Bank Policy Research Working Paper Number 4830.

    Masanjala, W., and C. Papageorgiou, 2008, “Rough and Lonely Road to Prosperity: A Reexamination of the Sources of Growth in Africa using Bayesian Model Averaging,” Journal of Applied Econometrics, Vol. 23, pp. 671–682.

    Raftery, A.E., 1995, “Bayesian Model Selection in Social Research,” Sociological Methodology, Vol. 25, pp. 111–63.

    Romer, P.M., 1986, “Increasing Returns and Long Run Growth,” Journal of Political Economy, Vol. 94, pp. 1002–37.

    Roodman, D., 2009, “A Note on the Theme of Too Many Instruments,” Oxford Bulletin of Economics and Statistics, Vol. 71, pp. 135–58.

    Sala-i-Martin, X., G. Doppelhofer, and R.I. Miller, 2004, “Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach,” American Economic Review, Vol. 94, pp. 813–35.

    Solow, R.M., 1956, “A Contribution to the Theory of Economic Growth,” Quarterly Journal of Economics, Vol. 70, pp. 65–94.

    Tsangarides, C., 2004, “A Bayesian Approach to Model Uncertainty,” IMF Working Paper No. 04/68 (Washington: International Monetary Fund).

    Wooldridge, J.M., 2002, Econometric Analysis of Cross Section and Panel Data, 1st edition (Cambridge, MA: MIT Press).


Appendix I: Monte Carlo experiment results

This Appendix describes the Monte Carlo simulations intended to assess the performance of LIBMA when a reduced instrument count is used. We compute posterior model probabilities, inclusion probabilities for each variable in the universe considered, and parameter statistics.

We consider the case where the universe of potential explanatory variables contains 12 variables, namely, 5 exogenous variables (out of which 2 are time invariant), 6 endogenous variables, and the lagged dependent variable, which is predetermined. Throughout our simulations we keep the number of periods constant, that is, T = 6, and we vary the number of individuals, N = 50, 75, 90, and 100. We examine three cases of instrument sets, using (i) all lags of the instruments, (ii) 2 lags of the instruments, and (iii) 1 lag of the instruments. For all three cases we use both collapsed and non-collapsed forms of the instruments. As a result, for the full (collapsed) forms of the three sets, the moment conditions are as follows: (i) for the “all” lags set, we have 119 (51) moment conditions; (ii) for 2 lags, 95 (36) moment conditions; and (iii) for 1 lag, 73 (29) moment conditions. Table A lists the moment conditions for T = 6, m = 5, and q = 6, together with their ratio to the sample size N.

We generate 500 instances of the data generating process with time invariant variables $\iota_{it}$, regular exogenous variables $x_{it}$, endogenous variables $w_{it}$, and parameter values $(\alpha\ \theta)'$. Further, we assume that both the random error term $v_{it}$ and the individual effect $\eta_i$ are drawn from a Normal distribution, $v_{it} \sim N(0, \sigma_v^2)$ and $\eta_i \sim N(0, \sigma_\eta^2)$, respectively, and consider the case where $\sigma_v^2 = 0.10$ and $\sigma_\eta^2 = 0.10$. Appendix II discusses the data generating process in detail and presents the results of the analysis.

Table A1 reports the inclusion probability (defined as the sum of the posterior probabilities of all models that contain that particular variable) for each variable considered, along with the true model in the second column of the table, for various N and instrument transformations.13 Given the assumptions made about the model priors, the prior probability of inclusion for each variable is the same and equal to 0.50. Comparing the collapsed and non-collapsed cases, it is immediately clear that collapsing the instruments improves the inclusion probabilities dramatically for both the included and non-included variables. This is particularly the case for smaller N, where the inclusion probabilities in many cases improve by a factor of 1.5 or more. As the sample size increases, the posterior inclusion probabilities approach 1 for all the relevant variables, and for the variables not contained in the true model the median posterior probability of inclusion decreases with the sample size. A comparison among the collapsed forms for various lags suggests that, overall, using 1 or 2 lags rather than the full set of lags gives higher inclusion probabilities, particularly for the endogenous variables and for lower values of N, but there is no clear difference between using 1 and 2 lags. For higher values of N, selecting fewer lags among the collapsed forms does not improve the results dramatically.

We turn now to the parameter estimates and examine how the estimated values compare with the true parameter values. Table A2 presents the median values of the estimated parameters compared to the parameters of the true model (discussed in the previous section). As in the case of inclusion probabilities, collapsing the instruments always improves the parameter estimation, with both the bias and variance decreasing (and with even more improvements as the sample gets larger). Again, as in the case of the inclusion probabilities, using 1 or 2 lags rather than all lags among the stacking options is preferred.

While it is beyond the scope of our approach, we also present results in terms of model selection. Table A3 presents relevant statistics for the posterior probability of the true model; the ratio of the posterior model probability of the true model to the highest posterior probability of all the other models (excluding the true model); and how often our methodology recovers the true model, measured by how many times the true model has the highest posterior probability. Stacking instruments gives better results, particularly for the recovery of the true model. Importantly, even with poorer results in terms of model selection (e.g. cases where the recovery rate of the true model is poor or the true model receives low posterior probability), the BMA is able to differentiate between the relevant and non-relevant variables, as can be seen from Tables A1 and A2.
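The three model-selection statistics described above can be sketched for a single simulated data set as follows. This is an illustrative helper rather than the code behind Table A3; the function name and the toy posterior are hypothetical.

```python
def model_selection_stats(posterior, true_model):
    """Compare the true model with its best competitor.

    posterior: dict mapping a model (tuple of 0/1 inclusion flags) to its
    posterior probability. true_model: the key of the true model.
    """
    p_true = posterior[true_model]
    best_other = max(p for m, p in posterior.items() if m != true_model)
    return {
        "p_true": p_true,                  # posterior probability of the true model
        "ratio": p_true / best_other,      # true model vs. best of the others
        "recovered": p_true > best_other,  # True if the true model ranks first
    }


# Toy posterior over three candidate models (hypothetical numbers).
posterior = {(1, 0, 1): 0.45, (1, 1, 1): 0.30, (1, 0, 0): 0.25}
stats = model_selection_stats(posterior, true_model=(1, 0, 1))
print(stats)
```

Averaging the `recovered` flag over the simulated data sets would give a recovery rate of the kind reported as “% Correct”.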

In summary, there is clear evidence that collapsing the instruments improves the results in terms of inclusion probabilities, parameter estimates, and model selection. This is largely because collapsing the instruments reduces the ratio of instruments to sample size, G/N. Once the instruments are collapsed, further reducing the number of lags yields some additional improvements, which disappear as the ratio G/N becomes smaller. So, while in the case of N = 75 using 1 or 2 lags rather than the full set of lags gives better results, comparing cases like N = 75 collapsed 1 lag, N = 90 collapsed 2 lags, and N = 100 collapsed all lags (all of which have G/N between 0.4 and 0.5) yields similar results. For our investigation of growth determinants, we use collapsed instruments with 2 lags.

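Table A. Instruments for various options of T = 6, m = 5, and q = 6

| Instrument options | Instruments (G) | G/N, N = 50 | G/N, N = 75 | G/N, N = 90 | G/N, N = 100 |
|---|---|---|---|---|---|
| Uncollapsed full | 119 | 2.38 | 1.59 | 1.32 | 1.19 |
| Uncollapsed 2 lags | 95 | 1.90 | 1.27 | 1.06 | 0.95 |
| Uncollapsed 1 lag | 73 | 1.46 | 0.97 | 0.81 | 0.73 |
| Collapsed full | 51 | 1.02 | 0.68 | 0.57 | 0.51 |
| Collapsed 2 lags | 36 | 0.72 | 0.48 | 0.40 | 0.36 |
| Collapsed 1 lag | 29 | 0.58 | 0.39 | 0.32 | 0.29 |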
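The ratios in Table A follow directly from the moment-condition counts quoted in Appendix I. A minimal sketch that reproduces them is given below; the dictionary layout and names are ours.

```python
# Moment-condition counts G for each instrument option (from Appendix I).
moment_counts = {
    "Uncollapsed full": 119,
    "Uncollapsed 2 lags": 95,
    "Uncollapsed 1 lag": 73,
    "Collapsed full": 51,
    "Collapsed 2 lags": 36,
    "Collapsed 1 lag": 29,
}
sample_sizes = [50, 75, 90, 100]

# Ratio of instruments to sample size, G/N, rounded as in the table.
ratios = {
    option: [round(g / n, 2) for n in sample_sizes]
    for option, g in moment_counts.items()
}

for option, row in ratios.items():
    print(f"{option:<20}", row)
```

The uncollapsed full instrument set exceeds the sample size for every N considered, whereas every collapsed option keeps G/N at or below roughly 1.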

Appendix II: Monte Carlo data generating process

Consider the case where the universe of potential explanatory variables contains 12 variables, namely, 5 exogenous variables (out of which 2 are time invariant), 6 endogenous variables and the lagged dependent variable.

We begin by generating the two time invariant exogenous variables for every individual i and period t, as follows

where $r_i^{\iota}$ is a vector random variable with two independent and uniformly distributed elements with discrete support $\{0, \sqrt{3/2}, 2\sqrt{3/2}\}$. We select the size of the support so that the variance of the resultant random variable is 1. Next, we generate three exogenous variables by sampling from a normal distribution,

where $I_3$ is the three-dimensional identity matrix.
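The choice of support can be verified numerically: a uniform variable on three equally spaced points 0, a, and 2a has variance 2a²/3, which equals 1 when a = √(3/2). The sketch below checks this and draws the exogenous variables. It is a partial illustration only: the generating equations themselves are not reproduced here, so the zero mean of the normal draws, the array shapes, and all names are assumptions.

```python
import numpy as np

# Three equally spaced support points chosen so that the variance is 1.
step = np.sqrt(3.0 / 2.0)
support = np.array([0.0, step, 2.0 * step])

# Exact population variance of a uniform draw from the support.
support_variance = np.mean(support ** 2) - np.mean(support) ** 2

rng = np.random.default_rng(0)
N, T = 50, 6  # individuals and periods, as in Appendix I

# Two independent time-invariant draws per individual.
r_iota = rng.choice(support, size=(N, 2))

# Three exogenous variables per individual and period.
# Assumption: zero mean with identity covariance I_3.
x = rng.normal(loc=0.0, scale=1.0, size=(N, T, 3))

print(support_variance, r_iota.shape, x.shape)
```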

Similarly, for the endogenous variables, $(w_{it}^1 \ldots w_{it}^6)$, we have the following data generating process

Here 1 denotes the vector of 1’s with appropriate dimension. As the data generating process for the endogenous variables indicates, the overall error term vit is assumed to be distributed normally here.

For t = 0, the dependent variable is generated by

where θ = 0.23, $\iota_{i0} = (\iota_{i0}^1\ \iota_{i0}^2)$, $x_{i0} = (x_{i0}^1\ x_{i0}^2\ x_{i0}^3)$, and $w_{i0} = (w_{i0}^1 \ldots w_{i0}^6)$. In addition, m = (1 0 1 1 0 1 1 1 0 0 0)′ is the model selection vector. It indicates that we choose the model with 1 time invariant variable, 2 regular exogenous variables, and 3 endogenous variables as the true model.
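The reading of the model selection vector can be checked with a short sketch. It assumes that the 11 entries are ordered as 2 time invariant, 3 regular exogenous, and 6 endogenous variables, which is consistent with the counts stated above; the group labels are ours.

```python
# Model selection vector from the text: 1 = variable enters the true model.
m = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0]

# Assumed ordering and sizes of the variable groups.
groups = {"time_invariant": 2, "exogenous": 3, "endogenous": 6}

counts = {}
start = 0
for name, size in groups.items():
    counts[name] = sum(m[start:start + size])  # selected variables in group
    start += size

print(counts)
```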

For t = 1, 2, …, T the data generating process is given by

The theoretical $R^2$ of the generated models varies between 0.50 and 0.60.

Table A1.Model Recovery: Medians and Variances of Posterior Inclusion Probability for Each Variable True model vs BMA posterior inclusion probability for various N, α = 0.50, and σv2=0.1


True value
All lags used1 lag used2 lags used
Not stackedStackedNot stackedStackedNot stackedStacked
Table A2. Model Recovery: Medians and Variances of Estimated Parameter Values
True model vs. BMA coefficients’ estimated values for various N, α = 0.50, and $\sigma_v^2 = 0.1$

True value | All lags used: Not stacked | All lags used: Stacked | 1 lag used: Not stacked | 1 lag used: Stacked | 2 lags used: Not stacked | 2 lags used: Stacked
Table A3. Posterior Probabilities
Summary statistics for various N, α = 0.50, and $\sigma_v^2 = 0.1$
Sample | All lags used: Not stacked | All lags used: Stacked | 1 lag used: Not stacked | 1 lag used: Stacked | 2 lags used: Not stacked | 2 lags used: Stacked
Probability of retrieving the true model
  % Correct | 0 | 2 | 1 | 6 | 1 | 3
  % Correct | 1 | 11 | 5 | 13 | 1 | 13
  % Correct | 3 | 16 | 11 | 20 | 6 | 21
  % Correct | 7 | 20 | 12 | 27 | 8 | 23
Posterior probability of the true model
Posterior probability ratio of true model/best among the other models
Table B1: Sample Data Variable Definitions and Sources
No. | Variable | Type | Source | Definition
Dependent Variable
 | DIFFY | | Penn World Table 6.2 | Growth of real GDP per capita (2000 US dollars at PPP)
Explanatory Variables
1 Solow determinants
1 | LNY0 | Endogenous/Predet | Penn World Table 6.2 | Logarithm of initial real GDP per capita (2000 US dollars at PPP)
2 | LNI | Endogenous | Penn World Table 6.2 | Logarithm of real investment as ratio to GDP (2000 US dollars at PPP)
2 | LNI0 | Endogenous/Predet | Penn World Table 6.2 | Logarithm of initial real investment (2000 US dollars at PPP)
2 | DIFFI | | Penn World Table 6.2 | Growth of real investment to GDP (2000 US dollars at PPP)
3 | LNPOPGR | Endogenous | Penn World Table 6.2 | Logarithm of annual population growth rate plus 0.05
2 Human capital (Augmented Solow)
4 | LTOTED2 | Endogenous | Barro and Lee dataset | Logarithm of total average stock of years of primary and secondary education
4 | LTOTED3 | Endogenous | Barro and Lee dataset | Logarithm of total average stock of years of primary and secondary education (added years)
5 | LFEXP2 | Endogenous | World Development Indicators (World Bank) | Life expectancy at birth (total) with filled in years
3 Macroeconomic stability
6 | LNINFL | Endogenous | International Financial Statistics (IMF) | Logarithm of one plus the inflation rate
7 | VOL_INFLATION_NEW | Endogenous | International Financial Statistics (IMF) | Volatility of inflation in each year, calculated from monthly IFS data
8 | BALGDP | Endogenous | World Economic Outlook (IMF) | Government balance as share of GDP, current LCU
9 | LNG | Endogenous | Penn World Table 6.2 | Logarithm of real government consumption as ratio to GDP (2000 US dollars at PPP)
10 | DEBTGDP | Endogenous | World Development Indicators (World Bank) | Nominal debt to GDP
10 | D_DEBTGDP | Endogenous | Authors’ calculations | Classification 0-4 based on percentiles of debtgdp: 0=unreported, 1<25 perc, 2=25-49 perc, 3=50
10 | D_DEBT_WDI | Invariant | World Development Indicators (World Bank) | 1=less, 2=moderately, 3=severely indebted economy, 0=other WDI debt classification
11 | OVERVAL1_NEW | Endogenous | Penn World Table 6.2 | Index of overvaluation/undervaluation based on PPP
11 | INDEXPWT | Endogenous | Penn World Table 6.2 | Over/undervaluation index=exp(ln_rer-p_ln_rer); >1: undervalued, <1: overvalued
11 | LNINDEXPWT | Endogenous | Penn World Table 6.2 | Ln of index of over/undervaluation
12 | DEFACTO_AGG | Endogenous | IMF | IMF de facto aggregate classification
12 | MCM_F_DEFACTO | Endogenous | IMF | IMF de facto fine classification
12 | RR_AGG | Endogenous | Reinhart and Rogoff | Reinhart-Rogoff aggregated
12 | RR_FULL | Endogenous | Reinhart and Rogoff | Reinhart-Rogoff full (includes collapsing)
12 | LYS_AGG | Endogenous | Levy-Yeyati/Sturzenegger | Levy-Yeyati/Sturzenegger aggregated
12 | JS_JSPEG | Endogenous | Shambaugh | Shambaugh: binary coding of peg = 1 and nonpeg = 0
4 Financial development
13 | DMBCB | Endogenous | International Financial Statistics (IMF) | Ratio of assets of deposit money banks to total bank assets (DBA/(DBA+CBA))
14 | BRMGDP | Endogenous | World Economic Outlook (IMF) | Ratio of broad money to GDP
5 Trade regime
15 | EXPGR | Endogenous | World Economic Outlook (IMF) | Exports growth
16 | OPEN_NEW | Endogenous | Penn World Table 6.2 | Exports plus imports as share of GDP (2000 US dollars at PPP)
16 | OPEN_NEWGR | Endogenous | Penn World Table 6.2 | Average annual rate of growth of openness
6 External environment (exogenous)
17 | TOTGR | Exogenous | World Economic Outlook (IMF) | Terms of trade (goods and services) growth
18 | G3VLNONROL | Exogenous | World Economic Outlook (IMF) | G-3 (=US, Euro, Japan) real exchange rate volatility faced by country
7 External environment (other)
19 | FDIGDP | Endogenous | World Economic Outlook (IMF) | Direct investment abroad to GDP
19 | PRIFLOWGDP | Endogenous | World Economic Outlook (IMF) | Private capital flows to GDP
20 | AIDGDP | Endogenous | Global Development Finance/World Dev. Indicat | Foreign aid as percentage of GDP
8 Internal environment: resources
21 | LLAND | Predetermined | World Development Indicators (World Bank) | Logarithm of arable land per capita, hectares, average over five years
22 | EHET2 | Invariant | Sambanis 2001 | Ethnic heterogeneity (Vanhanen’s measure): sum of racial, linguistic, and religious division resc
22 | ELFO2 | Invariant | Sambanis | Updated index of ethnolinguistic fractionalization
23 | RELIPR | Endogenous | Penn World Table 6.2 | Relative investment price level (PI/PC) (2000 US dollars at PPP)
9 Internal environment: institutions/governance
24 | CIVIL_LIBERTY_NEW | Endogenous | Freedom House | 1 to 7, with 1 being highest degree of freedom, Freedom House
25 | AUTOC2_NEW | Endogenous | Polity IV | Institutionalized autocracy: 0 to 10 (most autocratic), Polity IV
25 | DEMOC2_NEW | Endogenous | Polity IV | Aggregate index of democracy. Institutionalized democracy: 0 to 10 (most democratic), Polity IV
25 | DEMOC2LAG | Predetermined | Polity IV | Aggregate index of democracy, lagged once
25 | POLITY2_NEW | Endogenous | Polity IV | Aggregate index of autocracy and democracy
25 | POLITY2LAG | Predetermined | Polity IV | Aggregate index of autocracy and democracy, lagged once
26 | POLITY2DIF | Endogenous | Polity IV | Annual change in the Polity index
10 Internal environment: violence/war
27 | WAR | Exogenous | Sambanis (2004) and Doyle and Sambanis (2006) | War prevalence
28 | NATWAR | Exogenous | Sambanis (2004) and Doyle and Sambanis (2006) | Neighbors at war
29 | TNATWAR | Exogenous | Sambanis (2004) and Doyle and Sambanis (2006) | Total neighbors at war
11 Fixed Factors: Geography/Physical Factors
30 | CENLAT | Invariant | Gallup, Mellinger, Sachs (CID datasets) | Latitude of country centroid
30 | DISTCR | Invariant | Gallup, Mellinger, Sachs (CID datasets) | Mean distance to nearest ice-free coastline or sea-navigable river (km)
30 | CENCR | Invariant | Gallup, Mellinger, Sachs (CID datasets) | Distance from centroid of country to nearest ice-free coastline or sea-navigable river (km)
30 | POP100CR | Invariant | Gallup, Mellinger, Sachs (CID datasets) | Ratio of population within 100 km of ice-free coast/navigable river to total population
30 | TROPICAR | Invariant | Gallup, Mellinger, Sachs (CID datasets) | % Land area in geographical tropics
30 | LCR100KM | Invariant | Gallup, Mellinger, Sachs (CID datasets) | % Land area within 100 km of ice-free coast/navigable river
12 Fixed Factors: Regional characteristics
31 | EAP | Invariant | World Bank | East Asia and Pacific Regional Dummy
32 | ECA | Invariant | World Bank | Europe and Central Asia Dummy
33 | MENA | Invariant | World Bank | Middle East and North Africa Dummy
34 | LAC | Invariant | World Bank | Latin America and Caribbean Dummy
35 | SA | Invariant | World Bank | South Asia Dummy
36 | SSA | Invariant | World Bank | Sub-Saharan Africa Dummy
37 | OTHER | Invariant | World Bank | Other
38 | LIC | Invariant | |
39 | OECD | Invariant | |
40 | ADVANCED_SID2 | Invariant | |
41 | EMERGING_SID2 | Invariant | |
1We thank Huigang Chen for very helpful comments related to the Monte Carlo simulations. We also thank Andy Berg, Steve Durlauf, Rex Ghosh, Chris Papageorgiou, David Romer and seminar participants at the IMF’s Research Department for helpful comments and suggestions.
1Anand and Sen (2000), for example, quote Aristotle as favoring human development: “Wealth is evidently not the good we are seeking, for it is merely useful and for the sake of something else.”
2Durlauf, Kourtelos and Tan (2008) construct instruments for variables that are endogenously determined in the economic sense and introduce a model averaged version of two-stage least squares (2SLS). Eicher, Lenkoski and Raftery (2009) introduce an instrumental variable BMA (IVBMA) approach. Moral-Benito (2009) considers a panel data model where the lagged dependent variable is correlated with the individual effects but not correlated with the error term.
3It is common in the literature to use $x_{it-l}$ as an instrument, instead of $\Delta x_{it-l}$. Then the moment condition becomes $E(x_{it-l}\Delta v_{it}) = 0$.
4Often the prior odds ratio is set to 1, representing the lack of preference for either model, in which case the posterior odds ratio equals the Bayes factor $B_{ji}$.
5Model selection seeks to find the model $M_j$ in $M = (M_1, \ldots, M_K)$ that actually generated the data. So, a natural strategy for model selection is to choose the most probable model $M_j$, namely the one with the highest posterior probability, $p(M_j|D)$.
6See Roodman (2009) and discussion in textbooks including Hayashi (2000), Wooldridge (2002), and Arellano (2003).
7The former survey identifies 36 different categories of 87 explanatory variables, while the latter identifies 43 categories and 143 explanatory variables. With cross-country datasets of 100 or, in the best of cases, 120 country observations, the empirical investigation of growth determinants essentially becomes an exercise in small sample econometrics.
8For brevity, these results are not reported here but are available from the authors.
9Clearly, different authors also use different datasets, so presumably some (though not all) of the differences in results can be attributed to that.
10Some researchers (see, for example, Raftery (1994)) further identify inclusion probability thresholds to label variables as “strongly robust,” “very strongly robust,” etc. suggesting stronger evidence. However, these chosen cutoffs are not strictly grounded in statistical theory and remain, therefore, merely indicative of a set of variables that we consider well estimated or robust.
11A rate of convergence of 2, 3, and 4.5 percent suggests that a country will need approximately 35, 23, and 15 years, respectively, to cover half the distance between its initial position and its steady state.
12For brevity, we don’t report the details about the construction of the “panel FLS” and “panel BACE” estimators, but these are available from the authors. We also thank the authors for making their original codes available at the Journal of Applied Econometrics Data Archive for FLS, and BACE.
13A value of 1(0) in column 2 indicates that the true model contains (excludes) that variable.
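The half-life figures quoted in footnote 11 can be checked with a short calculation. The sketch below assumes the standard exponential-decay approximation, in which the gap to the steady state shrinks at a constant rate λ per year so that half the gap is closed after ln(2)/λ years; the footnote does not state this formula explicitly, and the function name `half_life_years` is hypothetical.

```python
import math


def half_life_years(rate):
    """Years needed to close half the gap to the steady state at a given convergence rate."""
    return math.log(2) / rate


# Convergence rates of 2, 3, and 4.5 percent, as in footnote 11.
half_lives = [round(half_life_years(r)) for r in (0.02, 0.03, 0.045)]
```

Rounded to the nearest year, the results are 35, 23, and 15, in line with the footnote.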
