The Monetary Model Strikes Back: Evidence from the World
Author:
Ms. Valerie Cerra
and
Ms. Sweta Chaman Saxena https://isni.org/isni/0000000404811396 International Monetary Fund


Abstract

We revisit the dramatic failure of monetary models in explaining exchange rate movements. Using the information content from 98 countries, we find strong evidence for cointegration between nominal exchange rates and monetary fundamentals. We also find fundamentals-based models very successful in beating a random walk in out-of-sample prediction.

I. Introduction

Monetary models of nominal exchange rate determination were a mainstay of international economics in the 1970s, and the key relationships continue to form an important part of current international macro models. These models appeared to fit in-sample empirical estimations fairly well. Nonetheless, the models were dealt a severe blow by the seminal work of Meese and Rogoff (1983). Using a set of post-Bretton Woods exchange rates for several major industrial countries, Meese and Rogoff showed that a simple random walk had more out-of-sample predictive power than the monetary models, even when the future realizations of the explanatory variables in the monetary models were used to generate the out-of-sample forecast. Subsequent authors tried to overturn these results, but any promising findings turned out to be fragile, and the literature has remained pessimistic about the link between exchange rates and monetary fundamentals (Frankel and Rose, 1995; Rogoff, 1999).

A recent resurgence of empirical work tries to evaluate exchange rate models using new methods for in-sample and out-of-sample evaluation. With advances in the econometrics of non-stationary data, in-sample analysis has turned to cointegration to look for long-run relationships between exchange rates and fundamentals. Evidence for cointegration has been mixed, with results depending on the country and sample used. For example, MacDonald and Taylor (1993) provide early favorable evidence for cointegration between nominal exchange rates and monetary fundamentals for the U.S. dollar-Deutsche Mark exchange rate. Rapach and Wohar (2002) use data for 14 industrial countries that span as long as 115 years (1880-1995), and find some evidence of cointegration for 8 of the 14 countries. Very recent work focuses on using panel cointegration tests to take advantage of the power of using multiple country exchange rates and fundamentals. Husted and MacDonald (1998) find evidence of cointegrating relationships in panel data sets for the U.S. dollar, German mark, and Japanese yen exchange rates using annual data for the recent floating experience. Motivated by the idea of cointegration between variables, the recent out-of-sample analysis examines whether the current deviation of the exchange rate from its long-run equilibrium is useful for predicting future exchange rate returns (Mark, 1995; Mark and Sul, 2001).

This paper exploits the power of panel cointegration tests by including a broad country sample, which has a low degree of cross-sectional dependence. Although recent literature has made advances using panel cointegration, the country samples used tend to suffer from considerable cross-sectional dependence, in part because the panel data sets of industrial countries contain many highly linked EMS countries. For instance, over the period 1984-2004, the average pairwise correlation of exchange rate changes for the countries in Mark and Sul (2001) and Groen (2000) is above 0.65. In contrast, the average pairwise correlation of exchange rate changes in our broader data set of 98 countries is below 0.2. Thus, we exploit a larger sample with substantially more independent variation. We also take measures to control for even the low level of cross-sectional dependence in our dataset, using the most recent advances in controlling for cross-sectional dependencies in the cointegration tests. These methods include extracting a common time effect from the data; using the Pesaran (2007) cross-sectionally augmented ADF test; and doing bootstrap trials that resample from the vector of correlated residuals.
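The average pairwise correlation quoted above can be illustrated with a minimal sketch. The function names and the toy data are hypothetical, and the sketch assumes every country is observed in the same years, which need not hold in an unbalanced panel.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_pairwise_correlation(changes):
    """Mean correlation over all distinct country pairs.

    `changes` maps a country label to its list of annual log exchange
    rate changes (same years for every country, an assumption here).
    """
    pairs = list(combinations(sorted(changes), 2))
    return sum(pearson(changes[a], changes[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy data: A and B move together, C moves against them.
toy = {
    "A": [0.01, 0.03, -0.02, 0.04],
    "B": [0.02, 0.06, -0.04, 0.08],
    "C": [-0.01, -0.03, 0.02, -0.04],
}
avg_corr = average_pairwise_correlation(toy)
```

In this toy panel one pair is perfectly positively correlated and two pairs are perfectly negatively correlated, so the average is -1/3. A value near zero indicates largely independent variation across countries.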

The previous literature has largely ignored the information provided by a large set of countries. One reason for this neglect has been a concern that the exchange rate regime has been fixed for many non-industrial countries. We argue that the mix of exchange rate regimes in our country sample is no more an issue than for the extant literature, first because of the high frequency at which countries have adjusted their pegs in recent decades, and second because there may be more independent flexibility for the broad sample of countries than for the industrial countries. The proportion of observations in our data sample in which the dollar exchange rate did not change from one year to the next is under 8 percent. Obstfeld and Rogoff (1995) point out that aside from a few minor tourist economies, oil sheikdoms, and heavily dependent principalities, only a very small number of fixed exchange rates survive intact for several years. Klein and Marion (1997) showed that the average duration of pegs in the Western Hemisphere countries was only 10 months. Second, the extant literature on industrial country exchange rates has often ignored the long stretches of links to the Deutsche Mark in studying the “floating period.” Indeed, Klein and Shambaugh (2006) show that pegged exchange rate regimes accounted for about 40 percent of the observations for industrial countries during the years 1973-2004. The long-span data in Rapach and Wohar (2002) cover not only the post-Bretton Woods period of floating exchange rates, but also long spells of fixed exchange rates during the gold standard and the Bretton Woods era. Against this background, our large data set has the advantage of providing considerably more observations of independent exchange rate adjustment than the previous studies, as is evident from the low cross-sectional correlation of exchange rate changes.

A second problem in some panel cointegration tests is the assumption of a homogeneous slope coefficient. Mark and Sul (2001) check for cointegration in a panel of countries by testing the significance of the slope coefficient in a regression of the exchange rate return on the deviation of the exchange rate from its fundamental value:

$$\Delta s_{it} = \beta (f_{i,t-1} - s_{i,t-1}) + \varepsilon_{it}$$

They estimate the model using panel dynamic OLS with controls for country and time effects. If the exchange rate, s, is cointegrated with the fundamentals, f, then the errors will be stationary, whereas the error will be nonstationary under the null hypothesis of no cointegration. However, they assume that the slope coefficient, β, is homogeneous across countries in the panel. If the homogeneity assumption is incorrect and β differs across countries, then the error will contain the term $(\beta_i - \beta)(f_{i,t-1} - s_{i,t-1})$, violating the consistency requirement that the regressors and errors are uncorrelated. The same issue arises in the Groen (2000) paper, which first estimates the cointegrating vector and then uses the Levin and Lin (LL) panel unit root method to test the residuals for nonstationarity. The LL test assumes a homogeneous coefficient on the lag level of the residual, $\hat{\mu}_{it}$:

$$\Delta \hat{\mu}_{it} = \rho \hat{\mu}_{i,t-1} + \sum_{j=1}^{p} \phi_{ij} \Delta \hat{\mu}_{i,t-j} + \varepsilon_{it}$$

To address this issue, we employ recent panel methods that allow for heterogeneous adjustment coefficients in the alternative hypothesis of panel unit root tests.
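The heterogeneity problem in the pooled return regression can be written out explicitly. The following is a short worked derivation in the notation of that regression, under the assumption that the true slope for country i is $\beta_i$ while the estimated model imposes a common $\beta$:

```latex
\begin{aligned}
\Delta s_{it} &= \beta_i (f_{i,t-1} - s_{i,t-1}) + \varepsilon_{it} \\
              &= \beta (f_{i,t-1} - s_{i,t-1})
                 + \underbrace{(\beta_i - \beta)(f_{i,t-1} - s_{i,t-1}) + \varepsilon_{it}}_{\text{composite error}}
\end{aligned}
```

Because the composite error contains the regressor itself, it is correlated with the regressor whenever $\beta_i \neq \beta$ for some country, which is the source of the inconsistency.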

We complement our in-sample cointegration tests with out-of-sample prediction analysis. We employ specifications and testing procedures that include both Meese and Rogoff’s original out-of-sample fit method and the out-of-sample forecasts of exchange rate returns used in the more recent literature. For example, Mark and Sul use the current deviation of the exchange rate from its equilibrium value, as determined by the cointegrating relationship, to form forecasts of the change in the exchange rate between the current period and various future horizons. However, Engel and West (JPE, 2005) show that there should be very little forecastability of exchange rates based on current and past information if exchange rates behave like asset prices. That is, market expectations of future fundamentals, as derived from a current information set, will already be built into the exchange rate. They show that under reasonable assumptions the correlation between future exchange rate returns and current/past fundamentals is extremely low, typically below 0.1 for the most likely parameter calibrations. In contrast, Meese and Rogoff’s out-of-sample fit method uses the realized future values of the fundamental variables. Future fundamentals incorporate future innovations, i.e., those that are unknown at the current time but subsequently impact exchange rate changes. Therefore, if the models are correct, actual future changes in fundamentals will be highly correlated with future changes in exchange rates. In principle, out-of-sample fit (using actual future outturns of fundamentals) should thus be a more powerful model evaluation method than the out-of-sample forecast (using only current information) method. Indeed, Meese and Rogoff’s work generated such pessimism about exchange rate models precisely because the models work poorly in spite of being given the advantage of knowing the future fundamentals.
Since this paper is the first attempt to examine the out-of-sample behavior of exchange rates and monetary fundamentals for a broad country sample, we take an agnostic stance and “let the data speak” for both testing procedures.

We also introduce a revised specification of the exchange rate model, which outperforms the other traditional models. That is, in addition to Meese and Rogoff’s specification relating the level of the exchange rate to the level of the fundamentals, and Mark and Sul’s specification relating the exchange rate return to the deviation of the exchange rate from its cointegrating equilibrium, we also provide a model specification of the changes in the exchange rate related to the changes in the fundamentals. This model is more robust to a structural break than the levels specification. We also provide a test of the directional forecasting accuracy for out-of-sample evaluation in addition to the standard root mean squared error measure.
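The two out-of-sample criteria mentioned here, root mean squared error and directional accuracy, can be sketched as follows. The numbers are hypothetical, and the benchmark is a driftless random walk, which predicts no change in the exchange rate. This is an illustration of the evaluation measures only, not the paper's test procedure.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error of predictions."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def directional_accuracy(actual_changes, predicted_changes):
    """Share of periods in which the predicted change has the realized sign."""
    hits = sum(1 for a, p in zip(actual_changes, predicted_changes) if a * p > 0)
    return hits / len(actual_changes)


# Hypothetical out-of-sample log exchange rate changes and model predictions.
actual = [0.10, -0.05, 0.20, 0.02]
model = [0.08, -0.02, 0.15, -0.01]
random_walk = [0.0] * len(actual)  # driftless random walk predicts no change

rmse_ratio = rmse(actual, model) / rmse(actual, random_walk)
hit_rate = directional_accuracy(actual, model)
```

An RMSE ratio below one means the model's predictions are closer to the realized changes than the random walk's. A hit rate above one half means the model predicts the direction of change more often than a coin toss would.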

Our larger dataset of countries also provides other advantages for the out-of-sample analysis. We are able to do a long horizon forecast that avoids the size distortions and other statistical problems associated with overlapping observations. Instead of overlapping observations, we use non-overlapping five-year intervals and exploit the large number of countries to gain observations. In addition, we compare fundamentals models to both the random walk and random walk with drift, and use the cross-country dimension to demonstrate the relationship between fundamentals and the drift rate in the random walk with drift model.
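As a minimal sketch of the non-overlapping construction, the hypothetical helper below computes changes in a log series over consecutive five-year windows, so that no annual observation enters two windows. How the paper aligns its windows is not stated in this section.

```python
def non_overlapping_changes(log_levels, horizon=5):
    """Changes in a log series over consecutive, non-overlapping windows.

    `log_levels` is a list of annual observations; window k spans
    observations k*horizon to (k+1)*horizon.
    """
    return [
        log_levels[start + horizon] - log_levels[start]
        for start in range(0, len(log_levels) - horizon, horizon)
    ]


# Hypothetical series of 11 annual log exchange rates (years 0..10).
series = [0.1 * t for t in range(11)]
changes = non_overlapping_changes(series)
```

Eleven annual observations give only two five-year changes per country, which is why the cross-country dimension is needed to obtain enough observations.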

II. Structural Specification and Data

The structural specification centers on the relationship between the nominal exchange rate, money, and output relative to a numeraire country. These are the core variables in both flexible price (Frenkel-Bilson) and sticky price (Dornbusch-Frankel) monetary models.1 Additional variables could include interest rates, expected inflation, and trade balances. However, market interest rates are often difficult to obtain for many emerging market and developing countries, and sometimes contain a large component of volatile risk premium that would need to be disentangled. In addition, nominal exchange rates in developing countries may depend on other factors, such as terms of trade. As in Mark (1995) and Mark and Sul (2001), we focus on the core set of monetary model fundamentals for the purpose of this paper, to determine how far these can explain exchange rate changes, but leave other factors for future work. Thus, we have:

$$s_t = \alpha + \beta_1 m_t + \beta_2 y_t + e_t \qquad (1)$$

where s is the log of the nominal exchange rate at the end of the year, m is the log of the money supply at the end of the year relative to that of the numeraire country (U.S.), and y is the log of the relative outputs during the year.

The original work on nominal exchange rate models tested variations of equation (1) using OLS, GLS, and IV estimation. However, following advances in the development of the econometrics of nonstationary data, recent work has emphasized the potential cointegrating relationship between these variables. Thus we test the log of exchange rates, relative money supplies, and relative real outputs for unit root processes, and test the combination of variables for cointegration.

The data set contains exchange rates for as many countries as have sufficient time series data for estimation and forecasting purposes. The data on nominal exchange rates, money supplies, and output is taken from the IMF’s International Financial Statistics. Exchange rates and money supplies are measured as end-of-period values, and the exchange rate is measured using the U.S. dollar as numeraire. Money supply data is constructed from balance sheet information of the central bank and commercial banks. Because banks have reporting requirements, including for prudential reasons, the monetary statistics tend to be available and among the most accurate of macroeconomic data. In contrast, output data is not available for many countries. Thus, some countries had to be eliminated for this reason or because their available sample is too short (particularly transition countries) for individual estimation in models with heterogeneous coefficients. The dataset is an unbalanced panel of 98 countries vis-à-vis the U.S. using annual observations from 1960 to 2004 (see Appendix for country list). Annual data has the advantage that it removes problems with extracting seasonality, which would surely be more pronounced for many lower income countries. Moreover, since some of the exchange rate regimes have been de facto adjustable pegs for a few years, quarterly or monthly data would likely capture too many of the pegged observations, and thus deter rather than add to explanatory power.

We also disaggregate the large country set into regional and income groups and high/low inflation groups. The countries are split into five regional groups based on World Bank classifications (Africa, Asia, Developed Countries, Middle East, and Western Hemisphere) and four income groups: low income (per capita real GDP less than or equal to 735 dollars), lower middle income (per capita real GDP between 736 and 2,935 dollars), upper middle income (per capita real GDP between 2,936 and 9,075 dollars), and high income (per capita real GDP over 9,075 dollars). For each country, we also define a “high inflation episode” as any year in which inflation is above 30 percent in the current year or any of the previous four years. A “high inflation country” is defined as a country that experienced even one episode of high inflation any time during the period 1960-2004.
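The episode and country definitions above translate directly into a small classification routine. The sketch below is illustrative only: the function names are hypothetical, and the treatment of the first years of the sample (where fewer than four previous years are available) is an assumption.

```python
def high_inflation_episodes(inflation, threshold=30.0, lookback=4):
    """Flag each year whose inflation, or that of any of the previous
    `lookback` years, is above `threshold` percent."""
    flags = []
    for t in range(len(inflation)):
        # At the start of the sample the window is shorter (an assumption).
        window = inflation[max(0, t - lookback): t + 1]
        flags.append(any(rate > threshold for rate in window))
    return flags


def is_high_inflation_country(inflation, threshold=30.0):
    """A country is high inflation if any single year exceeds the threshold."""
    return any(rate > threshold for rate in inflation)


# Hypothetical annual inflation rates in percent.
rates = [5.0, 45.0, 10.0, 8.0, 6.0, 4.0, 3.0]
flags = high_inflation_episodes(rates)
```

A single year at 45 percent marks that year and the four following years as a high inflation episode, so one spike removes five observations from the low inflation sub-sample.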

III. In-sample Analysis: Panel Cointegration Tests and Estimation

A. Unit Root Tests

Cointegration reflects a long term relationship between nonstationary data. Thus, we must first establish whether the nominal exchange rate and monetary fundamentals are nonstationary, that is, integrated at least of order one. We test each variable (exchange rate, relative money supply, and relative output) for a unit root.

Levin, Lin, and Chu (2002) and Im, Pesaran, and Shin (2003) have developed panel unit root tests that allow for heterogeneous dynamics. The basic form of the test is the following:

$$\Delta y_{it} = \rho y_{i,t-1} + \alpha_i + \mu_{it} + \eta_{it}$$

where $\mu_{it}$ represents the short-run dynamics:

$$\mu_{it} = \sum_{k=1}^{K} \phi_k \Delta y_{i,t-k}$$

The heterogeneous short-run dynamics of the Levin, Lin, and Chu (2002) test (LLC) can be estimated parametrically or semi-parametrically. Parametric estimation consists of estimating country-specific ADF regressions and using these regressions to concentrate out the short-run dynamics from the dependent variable, $\Delta y_{it}$, and the regressor, $y_{i,t-1}$. The residuals of each are then used in a pooled regression with no dynamics. Alternatively, we can account for the short-run heterogeneous dynamics by using the Newey-West kernel estimator for the long run variance and forward spectrum for each member in a regression of the change in the variable on its lagged level. This is a semi-parametric test, which has been developed in pooled rho and pooled t-stat versions.

The null hypothesis of the LLC test is that every country’s data contains a unit root. That is, $H_0: \rho = 0$. Under the alternative hypothesis of stationarity, the common slope is negative for all countries, $\rho_i = \rho < 0 \;\forall i$. Im, Pesaran, and Shin (IPS) develop a group mean test that allows for heterogeneity even in the autoregressive coefficient, relaxing the strong assumption of the alternative hypothesis of the LLC test. The IPS test estimates individual ADF regressions for each country. The individual t-statistics are averaged, providing the group mean value of t-statistic for the panel. Thus, although there is full heterogeneity of coefficients, the group mean estimator pools along the “between” dimension, that is, the cross-country dimension.
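The group mean idea can be sketched in a few lines. The example below is a simplified illustration, not the paper's procedure: it uses a Dickey-Fuller regression with an intercept and no lag terms, simulated data, and hypothetical function names, and it does not compare the statistic to critical values.

```python
from math import sqrt
import random


def df_t_statistic(y):
    """t-statistic on rho in: dy_t = a + rho * y_{t-1} + e_t (no lag terms)."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in x)
    rho = sum((v - mx) * (d - md) for v, d in zip(x, dy)) / sxx
    a = md - rho * mx
    ssr = sum((d - a - rho * v) ** 2 for v, d in zip(x, dy))
    se_rho = sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho


def group_mean_t(panel):
    """Average the country-by-country t-statistics (the IPS t-bar)."""
    stats = [df_t_statistic(series) for series in panel]
    return sum(stats) / len(stats)


rng = random.Random(0)


def simulate(phi, length=200):
    """Simulated AR(1) series; phi = 1 gives a random walk."""
    y = [0.0]
    for _ in range(length):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


unit_root_panel = [simulate(1.0) for _ in range(10)]
stationary_panel = [simulate(0.2) for _ in range(10)]
t_bar_unit_root = group_mean_t(unit_root_panel)
t_bar_stationary = group_mean_t(stationary_panel)
```

Each country keeps its own autoregressive coefficient, and only the t-statistics are pooled. The stationary panel produces a far more negative average than the random walk panel, which is the pattern the one-sided test looks for.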

Our unit root tests allow for heterogeneous trends and cross-sectional dependence. We can account for a simple form of cross-sectional dependence by removing period averages from each variable prior to the test. Such period effects might be important, for example, given that each country’s exchange rate, money supply, and output are expressed relative to a common numeraire. We also use estimates of the long-run variances and pooled variance to provide weighted versions of the tests. For completeness, we estimate all combinations of these specifications. Critical values of the tests are taken from Pedroni (2004) and Pedroni (1999).
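Removing period averages amounts to subtracting the cross-sectional mean at each date from every country's observation. A minimal sketch, assuming a balanced panel for simplicity (the paper's panel is unbalanced) and using hypothetical names and data:

```python
def remove_period_effects(panel):
    """Subtract the cross-sectional mean at each date from every country.

    `panel[i][t]` is the value for country i in year t (balanced panel assumed).
    """
    n_countries = len(panel)
    n_periods = len(panel[0])
    period_means = [
        sum(panel[i][t] for i in range(n_countries)) / n_countries
        for t in range(n_periods)
    ]
    return [
        [panel[i][t] - period_means[t] for t in range(n_periods)]
        for i in range(n_countries)
    ]


# Hypothetical log exchange rates sharing a common numeraire shock in year 1.
rates = [
    [1.0, 1.5, 1.0],
    [2.0, 2.5, 2.0],
    [3.0, 3.5, 3.0],
]
demeaned = remove_period_effects(rates)
```

The common movement in the middle year, which could reflect a shock to the numeraire currency, disappears from every series after demeaning.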

All variants of panel unit root tests on exchange rates and their hypothesized fundamentals are shown in Table 1. The null hypothesis of a unit root would be rejected only for large negative values (a one-sided test, as in an ADF test). All of the test values for the relative outputs are positive, thus we are unable to reject the unit root. The null hypothesis of a unit root in the nominal exchange rates and relative money supplies can be rejected only for 2 out of 32 test statistics for each variable: the IPS tests with heterogeneous trends but no time effects for the exchange rate, and the unweighted Levin and Lin ADF with heterogeneous time trends for the relative money supplies. Thus, the overall preponderance of evidence suggests that the nominal exchange rate, relative money supplies, and relative outputs are integrated.

Table 1. Panel Unit Root Tests

B. Cointegration Tests

Having found strong evidence that exchange rates and fundamentals are nonstationary, we perform cointegration tests in this section to look for stable long run relationships among them. If a set of variables is cointegrated, the residuals from the cointegrating equation should be stationary. Panel tests of the null hypothesis of no cointegration are therefore essentially panel unit root tests applied to the estimated residuals of cointegrating regressions. The first step in the cointegration test is to estimate the cointegrating equation. Because least squares is a superconsistent estimator of the point values of the coefficients, it is sufficient to estimate each equation by OLS in this first stage. Of course, the standard errors on the coefficients may be invalid under some circumstances, but these are not required for the cointegration test. It is necessary only to estimate the equation and obtain the residuals. Then, the second step of the cointegration test is to do a panel version of augmented Dickey Fuller tests on these residuals.

We estimate $\Delta \hat{\mu}_{it} = \rho \hat{\mu}_{i,t-1} + \sum_{j=1}^{p} \phi_{ij} \Delta \hat{\mu}_{i,t-j} + \varepsilon_{it}$, where $i$ is the country and $t$ is the year, and conduct a one-sided test of the null hypothesis that the parameter of adjustment to long-run equilibrium $\rho = 0$, against the alternative that $\rho < 0$.
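The two-step logic can be sketched for a single country. This is a simplified illustration under stated assumptions: the data are simulated so that the exchange rate is cointegrated with money and output by construction, the second-step regression has no lag terms (p = 0), and all names are hypothetical. The resulting statistic would have to be compared with the appropriate nonstandard critical values, which the sketch does not attempt.

```python
from math import sqrt
import random


def cointegrating_residuals(s, m, y):
    """Step 1: OLS of s on a constant, m and y; return the residuals."""
    n = len(s)
    ms, mm, my = sum(s) / n, sum(m) / n, sum(y) / n
    sd = [v - ms for v in s]
    md = [v - mm for v in m]
    yd = [v - my for v in y]
    smm = sum(a * a for a in md)
    syy = sum(a * a for a in yd)
    smy = sum(a * b for a, b in zip(md, yd))
    sms = sum(a * b for a, b in zip(md, sd))
    sys_ = sum(a * b for a, b in zip(yd, sd))
    det = smm * syy - smy ** 2  # determinant of the 2x2 normal equations
    b_m = (sms * syy - sys_ * smy) / det
    b_y = (sys_ * smm - sms * smy) / det
    return [a - b_m * b - b_y * c for a, b, c in zip(sd, md, yd)]


def residual_df_t(u):
    """Step 2: t-statistic on rho in du_t = rho * u_{t-1} + e_t."""
    lag = u[:-1]
    du = [b - a for a, b in zip(u[:-1], u[1:])]
    sxx = sum(v * v for v in lag)
    rho = sum(v * d for v, d in zip(lag, du)) / sxx
    ssr = sum((d - rho * v) ** 2 for v, d in zip(lag, du))
    return rho / sqrt(ssr / (len(lag) - 1) / sxx)


# Simulated data: m and y are random walks, s = m - y + stationary noise.
rng = random.Random(1)
m, y = [0.0], [0.0]
for _ in range(199):
    m.append(m[-1] + rng.gauss(0.0, 1.0))
    y.append(y[-1] + rng.gauss(0.0, 1.0))
s = [a - b + rng.gauss(0.0, 0.5) for a, b in zip(m, y)]
t_rho = residual_df_t(cointegrating_residuals(s, m, y))
```

Only the residuals from the first step are carried forward, which is why invalid first-stage standard errors do not matter for the test.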

Hakkio and Rush (1991) show that the power of unit root and cointegration tests depends on the data’s span rather than the data’s frequency and argue that the short time span of the post-Bretton Woods period leads to low power of cointegration tests. The recent literature thus uses either long span data or panel data to increase the power of cointegration tests. For instance, Groen (2000) finds evidence for cointegration using the Levin and Lin (1993) panel unit root tests for 14 dollar and mark exchange rates, although he finds that the choice of numeraire country matters for cointegration results in sub-panels. However, Rapach and Wohar (2002) argue that using long span data to increase the power of cointegration tests is superior to panel data tests which impose cross-country homogeneity restrictions on the adjustment parameter, ρ, in the alternative hypothesis (that is, $\rho = \rho_i \;\forall i$). They point out that support for cointegration may be overstated using the LL test since it is necessary to accept the alternative for all countries. The IPS panel unit root test, on the other hand, overcomes this problem by allowing heterogeneity of the adjustment parameter. Im, Pesaran, and Shin also show that the size and power characteristics of the IPS test are much better than the LL test in small samples.

We thus perform a variety of cointegration tests, including those that allow for heterogeneity of the adjustment parameter, ρ. Pedroni (2004) and Pedroni (1999) provide critical values for several different panel cointegration tests, all of which allow for heterogeneous cointegrating vectors and heterogeneous short-run dynamics. The pooled tests assume only a common autoregressive coefficient in the residuals, as in the LLC panel unit root tests, whereas the group mean tests relax this restriction, as in the IPS panel unit root tests. For the pooled and group mean tests, semi-parametric rho and t-statistic tests (as in Phillips-Perron, 1988) and parametric t-tests (analogous to ADF regressions) are available. A nonparametric pooled variance ratio statistic (analogous to the Phillips-Ouliaris variance statistic) is also available. All of these tests can be weighted or unweighted, and exclude or include time trends in the cointegrating equation.

Panel studies that do not control for cross-sectional dependence among the countries can result in biased panel cointegration test results. This could be a particularly important issue with regard to the set of major currencies that are typically analyzed in the literature, as many of the EMS countries were linked to the Deutsche Mark for a substantial proportion of the post-Bretton Woods era. To control for cross sectional dependence in the form of a common unobserved factor, we also do the tests by removing the period effects (cross-sectional means at each point in time). Doing so also purges the influence of the numeraire country, as its relative fundamentals enter the regression as a common factor for each country.

Table 2 presents seven different panel cointegration tests. Each test allows for weighted or unweighted versions, excluding or including period effects, and excluding or including heterogeneous trends. The null hypothesis for all of the tests is that the residuals of the cointegrating vectors contain unit roots, implying no cointegration. The test statistics are distributed as standard normal. The panel variance test in the first row has a one-sided rejection region consisting of large positive values, whereas the other tests reject for large negative values. The null hypothesis of no cointegration is easily rejected for 53 out of 56 cointegration tests. We are unable to reject the null only for the unweighted pooled semi-parametric rho and t-tests and parametric t-test that exclude time effects and trends. The group mean version of each test rejects the null, as do the remainder of the tests. Thus, overall we strongly reject unit roots in the residuals of the cointegrating vectors, which is the same as finding strong evidence for cointegration of exchange rates, relative money supplies, and relative outputs.

Table 2. Panel Cointegration Tests

In addition to the tests above that control for fixed period effects, we also perform a more general cointegration test in the presence of cross-sectional dependency. The Pesaran (2007) cross-sectionally augmented ADF (CADF) test supplements the standard ADF regression with the cross-section averages of both lagged levels and first differences of the individual series. The individual CADF statistics can then be used in a modified version (CIPS) of the IPS test. The pth order CADF equation for each country is given by:

$$\Delta y_{it} = b_i y_{i,t-1} + c_i \bar{y}_{t-1} + \sum_{j=0}^{p} d_{ij} \Delta \bar{y}_{t-j} + \sum_{j=1}^{p} \delta_{ij} \Delta y_{i,t-j} + e_{it}$$

where yit is the residual of the cointegrating equation for country i in year t. We estimate this equation for each of the 98 countries over the period 1960-2004 and compare the resulting CIPS test statistics to the critical values in Table II(a) of Pesaran (2007). We choose the specification without an intercept term because the residuals of the cointegrating regression average zero by construction. Table 3 shows that all of the CIPS test statistics are significant at the one percent level, for lag orders spanning 0 to 7. Thus, the null hypothesis of a unit root in the residuals from the cointegrating equations can be rejected even when controlling for more general cross sectional dependence.
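For the simplest case of lag order p = 0, the CADF regression has three regressors and no intercept: the country's own lagged level, the lagged cross-section average, and the current change in the cross-section average. The sketch below computes the t-statistic on the first coefficient for each country and averages them. It is an illustration under stated assumptions (simulated stationary series with a common factor, a balanced panel, hypothetical function names), and it does not reproduce the critical value comparison.

```python
from math import sqrt
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def first_coef_t(rows, target):
    """OLS of target on rows (no intercept); t-statistic of the first coefficient."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(rows, target)]
    sigma2 = sum(e * e for e in resid) / (len(rows) - k)
    inv_col = solve(xtx, [1.0] + [0.0] * (k - 1))  # first column of (X'X)^-1
    return beta[0] / sqrt(sigma2 * inv_col[0])


def cips(panel):
    """Average of the country CADF t-statistics with lag order p = 0."""
    n, t_len = len(panel), len(panel[0])
    ybar = [sum(series[t] for series in panel) / n for t in range(t_len)]
    stats = []
    for series in panel:
        rows = [[series[t - 1], ybar[t - 1], ybar[t] - ybar[t - 1]]
                for t in range(1, t_len)]
        target = [series[t] - series[t - 1] for t in range(1, t_len)]
        stats.append(first_coef_t(rows, target))
    return sum(stats) / n


# Simulated stationary residuals that share a common factor each year.
rng = random.Random(2)
panel = [[0.0] for _ in range(8)]
for _ in range(100):
    common = rng.gauss(0.0, 1.0)
    for series in panel:
        series.append(0.3 * series[-1] + common + rng.gauss(0.0, 1.0))
cips_stat = cips(panel)
```

The cross-section averages act as proxies for the unobserved common factor, so the coefficient on the country's own lagged level is estimated net of the shared movement.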

Table 3. CIPS Tests

Statistics are based on univariate AR(p) specifications of CADF regressions on the residuals from the cointegrating equation between the exchange rate, relative money supply, and relative outputs for the 98 countries. Asterisks (***) denote significance at the one percent level.

As a final robustness check, we also take cross sectional dependency into account by comparing the test statistics from panel cointegration tests on the data to critical values generated from bootstrap trials. We focus on the IPS test, as it has the most flexibility in allowing for heterogeneity. The bootstrap exercises also have other merits in accounting for other features of the data. The IPS statistic is a group mean test of the ADF t-values. But it must be standardized as follows: $Z_{NT}^{IPS} = \sqrt{N/\nu}\,(\bar{t}_{\rho} - \mu)$, where $\bar{t}_{\rho} = N^{-1} \sum_{i=1}^{N} t_{\rho_i}$ and $\mu$ and $\nu$ are adjustment values under the null hypothesis that $\rho_i = 0$. When T is large, we can use the asymptotic $\mu$ and $\nu$ adjustment values regardless of serial correlation. But in finite samples, the critical values of the ADF test depend on the lag order used in the test, the sample size, and the unknown nuisance parameters arising from serially correlated errors. Im, Pesaran, and Shin suggest using $\mu = E[t_{\rho_i}]$, $\nu = \mathrm{Var}[t_{\rho_i}]$. They report small sample approximations for $\mu$ and $\nu$ conditioning on a common sample size and common lag order truncation in each ADF test, but a bootstrap can be used to find these values for heterogeneous dynamics and can condition on the estimated serial correlation properties.2
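The standardization step can be sketched as follows, assuming the usual form of the IPS statistic, in which the group mean t-statistic is centered at the null mean and scaled by the square root of N over the null variance. The moments here are estimated from a handful of hypothetical null draws purely for illustration. In practice they would come from the tabulated approximations or from a large number of bootstrap trials.

```python
from math import sqrt


def ips_z(t_stats, mu, nu):
    """Standardized IPS statistic: sqrt(N / nu) * (t_bar - mu)."""
    n = len(t_stats)
    t_bar = sum(t_stats) / n
    return sqrt(n / nu) * (t_bar - mu)


def moments(null_draws):
    """Mean and variance of t-statistics generated under the unit root null."""
    k = len(null_draws)
    mu = sum(null_draws) / k
    nu = sum((d - mu) ** 2 for d in null_draws) / (k - 1)
    return mu, nu


# Hypothetical numbers: simulated null t-statistics and observed country statistics.
null_draws = [-1.0, -2.0, -1.5, -0.5, -2.5]
mu, nu = moments(null_draws)
z = ips_z([-3.5, -4.0, -3.0, -3.5], mu, nu)
```

Large negative values of the standardized statistic count against the unit root null, in line with the one-sided rejection region of the underlying ADF tests.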

We experiment with three different bootstrap procedures. The first procedure parallels Groen (2000). We construct I(1) bootstrap fundamentals and errors, and then form the bootstrap exchange rate series as $s_{it}^{b} = \hat{\alpha}_i + \hat{\beta}_{mi} m_{it}^{b} + \hat{\beta}_{yi} y_{it}^{b} + e_{it}^{b}$, where the α and β coefficients are the estimates from the cointegrating regressions on the true data. We then estimate the two-step panel cointegration tests using the bootstrap exchange rates and bootstrap fundamentals. The second procedure constructs I(1) bootstrap fundamentals and exchange rates, and then forms bootstrap errors as $e_{it}^{b} = s_{it}^{b} - \hat{\alpha}_i - \hat{\beta}_{mi} m_{it}^{b} - \hat{\beta}_{yi} y_{it}^{b}$, using the parameter estimates from the cointegrating regressions. These bootstrap errors are subjected to the panel unit root tests. In the third procedure, we work directly with the residuals from the cointegrating equations ($\hat{e}_{it}$) and apply a bootstrap similar to Wu and Wu (2001). Namely, we construct I(1) bootstrap versions of these residuals, $\hat{e}_{it}^{b}$, and conduct panel unit root tests on these bootstrap series. For all of the bootstrap methods, we estimate serial correlation properties of the actual data, choosing a lag length based on a general to specific step down procedure. We use block resampling to initialize series, and we generate innovations by a nonparametric resampling of the data. For each variable, we resample the vector of data across the countries to maintain the pattern of cross-sectional dependencies. In all cases, we find that the IPS test statistic on the actual data lies far to the left of the bootstrap critical values, and it is easy to reject the unit root on the data at significance levels much lower than one percent.
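The key device for preserving cross-sectional dependence is to resample whole dates, so that each draw carries the innovations of all countries for that year together. The sketch below illustrates only this device and the cumulation into I(1) series. It omits the serial correlation modeling and block initialization described above, and the names and data are hypothetical.

```python
import random


def resample_cross_section(innovations, length, rng):
    """Draw whole cross-sectional vectors of innovations with replacement.

    `innovations[t]` is the list of innovations for all countries at date t.
    Sampling dates (not individual cells) keeps each date's cross-country
    pattern intact.
    """
    return [innovations[rng.randrange(len(innovations))] for _ in range(length)]


def cumulate(draws):
    """Build I(1) bootstrap series by cumulating the resampled innovations."""
    n = len(draws[0])
    levels = [[0.0] * n]
    for vector in draws:
        levels.append([prev + e for prev, e in zip(levels[-1], vector)])
    return levels[1:]


# Hypothetical innovations for two countries that always move together.
innovations = [[0.1, 0.1], [-0.2, -0.2], [0.3, 0.3]]
rng = random.Random(0)
bootstrap_levels = cumulate(resample_cross_section(innovations, 10, rng))
```

Because the two hypothetical countries share every innovation, their bootstrap paths coincide. Resampling each country independently would destroy that comovement and understate the dependence in the bootstrap distribution.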

C. Panel Estimation of Cointegrating Vectors

In the previous section, we found that exchange rates, money, and income are cointegrated, but we are also interested in the coefficient estimates of the cointegrating vectors. Table 4 displays the point estimates from OLS that uses pooled estimates for relative money supplies and relative outputs and fixed country effects for the intercepts. The regressions are shown for cases that both exclude and include dummies for period effects. We provide estimates for all countries and by regional and income groups. The coefficients on the fundamentals have the correct signs for all samples and are remarkably close to the theoretical values for most groups. We also check that the point estimates are not being driven by the episodes of high inflation. Of all inflation sub-samples, we find the point estimates to be closest to theoretical values for low inflation episodes when controlling for common time effects, with the coefficient on the relative money supply estimated at 0.97 and the coefficient on the relative income level estimated at -0.95.
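Pooled slopes with country-specific intercepts can be obtained by demeaning each country's data and running one pooled regression, which is numerically equivalent to including a dummy for every country. The sketch below uses hypothetical, noise-free data built with slopes of 1 and -1, so the estimator should recover those values exactly. Period dummies are not included.

```python
def within_estimator(panel):
    """Pooled slopes on m and y with country fixed effects.

    `panel` is a list of (s, m, y) tuples of equal-length lists, one tuple
    per country.
    """
    smm = syy = smy = sms = sys_ = 0.0
    for s, m, y in panel:
        n = len(s)
        ms, mm, my = sum(s) / n, sum(m) / n, sum(y) / n
        for a, b, c in zip(s, m, y):
            ds, dm, dy = a - ms, b - mm, c - my  # deviations from country means
            smm += dm * dm
            syy += dy * dy
            smy += dm * dy
            sms += dm * ds
            sys_ += dy * ds
    det = smm * syy - smy ** 2
    beta_m = (sms * syy - sys_ * smy) / det
    beta_y = (sys_ * smm - sms * smy) / det
    return beta_m, beta_y


# Hypothetical exact data: s = alpha_i + 1.0 * m - 1.0 * y, different intercepts.
m_a, y_a = [0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 3.0]
m_b, y_b = [1.0, 2.0, 4.0, 5.0], [2.0, 2.0, 3.0, 1.0]
s_a = [2.0 + m - y for m, y in zip(m_a, y_a)]
s_b = [5.0 + m - y for m, y in zip(m_b, y_b)]
beta_m, beta_y = within_estimator([(s_a, m_a, y_a), (s_b, m_b, y_b)])
```

The different intercepts (2 and 5) drop out with the country means, leaving the common slopes identified from within-country variation only.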

Table 4. Pooled Least Squares Dummy Variable

Note: The estimation period is 1960-2004. T-statistics are below the coefficients. An observation is classified as a high inflation episode if inflation exceeded 30 percent in the current or any of the previous four years. A country is classified as high inflation if it had any episode of high inflation from 1960-2004.

OLS is a superconsistent estimator of the coefficients of cointegrated variables. Indeed, OLS was used in the first stage of cointegration tests in the previous section because the main interest was in obtaining and testing the residuals of the cointegrating vector for stationarity. However, in reporting the results of the estimation in this section, we need to recognize that the standard errors of OLS are biased and thus invalid for hypothesis testing under conditions of serial correlation and endogeneity. Methods have been developed to address these problems. We employ fully modified OLS (FMOLS) and dynamic OLS (DOLS) methods. An alternative estimation approach would be a vector error correction model,3 but general VECMs are not feasible for panels with many countries due to the large number of parameters.

Pedroni (2000) derives a panel group mean FMOLS estimator, which has the advantage of allowing heterogeneous dynamics and heterogeneous cointegrating vectors. This estimator uses the group mean of individual FMOLS estimators, which correct for endogeneity and serial correlation by estimating the long-run covariance directly (Phillips and Hansen, 1990).
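The group mean step itself is simple once the country-level estimates are in hand. In the sketch below the coefficient is the plain average of the individual estimates. The aggregate t-statistic, formed as the sum of the individual t-statistics divided by the square root of N, is an assumption of this sketch rather than something stated in the text, and the inputs are hypothetical numbers standing in for country-level FMOLS results.

```python
from math import sqrt


def group_mean(estimates, t_stats):
    """Group mean coefficient and an aggregate t-statistic.

    The coefficient is the simple average of the country estimates. The
    aggregate t-statistic sums the country t-statistics and divides by
    sqrt(N), which is an assumption of this sketch.
    """
    n = len(estimates)
    return sum(estimates) / n, sum(t_stats) / sqrt(n)


# Hypothetical country-level money coefficients and t-statistics.
beta_bar, t_bar = group_mean([0.9, 1.1, 1.0, 1.0], [4.0, 6.0, 5.0, 5.0])
```

Since each country contributes its own estimate, the cointegrating vectors are free to differ across countries, and the group mean summarizes their central tendency.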

The group mean dynamic OLS uses the group mean of the Stock and Watson (1993) DOLS estimator, in which leads and lags of the differenced right hand side variables are used to correct for endogeneity and serial correlation:

$$y_{it} = \alpha_i + \beta_i X_{it} + \sum_{k=-K_i}^{K_i} \gamma_{ik}\, \Delta X_{i,t-k} + u_{it}$$
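The mechanics of this estimator can be sketched in a few lines of Python. The example below handles a single regressor, estimates the unit-by-unit DOLS slope by ordinary least squares on a constant, the level of the regressor, and leads and lags of its first difference, and then averages the slopes across units. All function names are hypothetical, the small linear solver is included only to keep the example self-contained, and the sketch produces point estimates only (no long-run covariance correction for the t-statistics).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def dols_beta(y, x, n_leads_lags=2):
    """DOLS slope for one unit: regress y_t on 1, x_t, and dx_{t-k}
    for k = -K..K, and return the coefficient on x_t."""
    K = n_leads_lags
    dx = [None] + [x[t] - x[t - 1] for t in range(1, len(x))]
    rows, ys = [], []
    for t in range(K + 1, len(x) - K):  # keep only t with all leads and lags
        rows.append([1.0, x[t]] + [dx[t - k] for k in range(-K, K + 1)])
        ys.append(y[t])
    return ols(rows, ys)[1]


def group_mean_dols(units, n_leads_lags=2):
    """Average the unit-level DOLS slopes; units is a list of (y, x) pairs."""
    betas = [dols_beta(y, x, n_leads_lags) for y, x in units]
    return sum(betas) / len(betas)
```

Because each unit gets its own regression, the slopes and the short-run dynamics are allowed to differ across units, and only the final averaging step pools information.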

We use FMOLS and DOLS estimators for the cointegrating relationship between the nominal exchange rate, money, and output. In order to control for cross sectional dependence and also reduce the influence of the numeraire country, we also estimate cointegrating vectors for the variables after common time effects have been extracted by demeaning across countries at each year. The time demeaned data is equivalent to the residuals in a panel regression that includes only time dummies. In other words, the cross-sectional average at each year is removed from the exchange rate variables, and the relative money and relative output variables.
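The demeaning step is simple enough to show in code. The following minimal sketch, assuming the panel is stored as a nested dictionary (country, then year), subtracts the cross-sectional average in each year. The data layout and names are assumptions for illustration.

```python
def demean_by_year(panel):
    """Remove the cross-sectional mean at each year.

    panel: dict mapping country -> dict mapping year -> value.
    Returns a panel of the same shape holding deviations from the
    average across countries in each year.
    """
    years = {year for series in panel.values() for year in series}
    year_means = {}
    for year in years:
        values = [series[year] for series in panel.values() if year in series]
        year_means[year] = sum(values) / len(values)
    return {country: {year: value - year_means[year]
                      for year, value in series.items()}
            for country, series in panel.items()}


# Hypothetical log exchange rates against the numeraire currency.
rates = {"A": {2000: 1.0, 2001: 2.0}, "B": {2000: 3.0, 2001: 6.0}}
print(demean_by_year(rates))
```

A shock that moves every country's exchange rate by the same amount in a given year, such as a swing in the numeraire currency, shifts the year mean one for one and therefore leaves the demeaned data unchanged.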

In Table 5, the FMOLS coefficient estimates for money and income (group mean values) are shown for the sample of all countries, as well as by regional and income group breakdowns. The estimates for money are all positive and highly significant, while those for income are all negative and highly significant. The second set of estimates is obtained after extracting period effects from the data. This extraction has the advantage of mitigating any impact of movements in the U.S. dollar, which is the common numeraire. The point estimate for money rises to 0.95 and the point estimate for income falls to -0.97 when time dummies are included to extract the period effects. These values are very close to the theoretical values of positive and negative unity, respectively, for money and output.

Table 5.

Group Mean Fully Modified OLS

The estimation runs from 1960-2004. T-statistics are below coefficient values. High inflation countries are those in which inflation exceeded 30 percent for even a single year.

Tables 6 and 7 provide results from the group mean and pooled DOLS estimators for the cointegrating vector. The results for the group mean DOLS are similar to those of FMOLS, with the coefficient on money estimated at 0.92 when period dummies are included. However, the estimate on income increases in absolute value above unity. The high elasticity, if not a statistical aberration, could reflect the impact of relative productivity growth on the equilibrium real exchange rate, which is ignored by the monetary model except to the extent productivity growth raises the transactions demand for money. The coefficient on income for the Asian region has the opposite sign to its theoretical prediction. However, the coefficient estimate becomes -0.96 after extracting the time effects. Pooled dynamic OLS results (Table 7) indicate estimates for relative money and outputs that are very close to theoretical predictions. The exception is the Asian region, in which the coefficient on money is much lower than unity.

Table 6.

Group Mean Dynamic OLS

Group mean DOLS using two leads and lags of differences, and Newey-West long run covariances. The estimation runs from 1960-2004. T-statistics are below coefficient values. High inflation countries are those in which inflation exceeded 30 percent for even a single year.
Table 7.

Pooled Dynamic OLS

Note: The estimation period is 1960-2004. T-statistics are below the coefficients. Two leads and lags of the regressors are included. An observation is classified as a high inflation episode if inflation exceeded 30 percent in the current or any of the previous four years. A country is classified as high inflation if it had any episode of high inflation from 1960-2004.

For all of the different estimators, the coefficients are closer to theoretical values when the common time effects have been removed. This result may arise because the raw data (no period effects extracted) may be overly influenced by any deviation of the US dollar from its equilibrium level given by its fundamentals. This deviation would show up in each country’s regression since the US is the numeraire. Of course the difference in results could also reflect the influence of another unobserved common factor.

IV. Out-of-sample analysis

In addition to testing for cointegration, we compare the out-of-sample predictions of our various fundamentals-based models with those from a random walk. This comparison constitutes the acid test for evaluating exchange rate models ever since the Meese-Rogoff (1983) seminal paper, and is included in the work by Mark and Sul (2001). Out-of-sample analysis has been popular in part because it illustrates how in-sample estimation can be misleading when the structural coefficients are unstable over time. The in-sample coefficients are calculated to find the best fit in the sample period, yet this does not provide much guidance over future relationships when the coefficients are likely to change.

The standard metric for evaluating exchange rate models has been the root mean squared forecasting error (RMSE) of the model versus a driftless random walk. However, in an evaluation of the forecasting ability of Markov-switching models for 18 exchange rates, Engel (1994) considers whether the random walk with drift or driftless random walk is the appropriate standard. We include both the driftless random walk and the random walk with drift as benchmark models for comparing the economic models. In our sample, the random walk model has a slight competitive advantage over the other models given that some countries kept their exchange rates fixed for periods of time longer than a year. In order to reduce the number of observations that reflect a fixed exchange rate regime, we compare forecasts from monetary models for each country year observation conditional on a change of any magnitude in the exchange rate. Of course, since we are conditioning on any change in the exchange rate, the monetary models will continue to be disadvantaged relative to a random walk if the exchange rate is not strictly fixed but is forced to trade within a narrow band. In addition, conditioning on a change in the exchange rate also does not offset the advantage of a random walk with drift that arises from the fact that some countries occasionally adopted a crawling peg.

Importantly, we show that the random walk with drift is not a naïve statistical model. Exploiting our large panel of countries, we show that the “drift” in the random walk with drift model is highly linked to the drift rates in monetary fundamentals. As such, it contains similar economic information as in the explicit fundamentals-based models.

Engel (1994) also notes another useful model evaluation criterion: the proportion of forecasts that correctly predict the direction of change of the exchange rate. This utility-based criterion follows from the work of Leitch and Tanner (1991), who find that the direction of change criterion is the best among several evaluation criteria for choosing forecasts of interest rates on their ability to maximize expected trading profits. Abhyankar, Sarno, and Valente (2005) employ similar arguments to show that there is economic value in exchange rate forecasts from a fundamentals model. The RMSE criterion, in contrast, compares models based on the distance between the forecast and the actual outturn, regardless of whether the direction of the forecast is correct. Hypothetically, a monetary model could perfectly forecast the direction of change in the exchange rate at all periods and yet be defeated by a random walk using the RMSE criterion if the monetary model consistently over-predicted the magnitude of change. For both policy makers and market participants, however, the ability to accurately forecast the direction of change in the exchange rate may be as useful as a precise point estimate. Therefore, we also compare our models by calculating the percentage of predictions in which the model forecasts the correct sign or direction of the change.
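The hypothetical case in this paragraph is easy to reproduce numerically. In the sketch below, a made-up model always predicts the right direction but overshoots the size of the change by a factor of three, while the driftless random walk predicts no change. The numbers and function names are assumptions chosen for illustration.

```python
import math


def rmse(actual_changes, forecast_changes):
    """Root mean squared error between forecast and actual changes."""
    errors = [f - a for a, f in zip(actual_changes, forecast_changes)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def sign(x):
    """Return 1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def direction_hit_rate(actual_changes, forecast_changes):
    """Share of observations where the forecast has the sign of the actual change."""
    hits = sum(1 for a, f in zip(actual_changes, forecast_changes)
               if a != 0 and sign(a) == sign(f))
    return hits / len(actual_changes)


# Hypothetical changes in the log exchange rate.
actual = [0.10, -0.10, 0.10, -0.10]
model = [3 * a for a in actual]      # right direction, three times too large
random_walk = [0.0] * len(actual)    # driftless random walk: no change

print("RMSE model:", rmse(actual, model))
print("RMSE random walk:", rmse(actual, random_walk))
print("Direction hit rate, model:", direction_hit_rate(actual, model))
```

Here the model loses on RMSE even though its direction-of-change record is perfect, which is why the two criteria are reported side by side.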

We evaluate our set of fundamentals-based models using these out-of-sample criteria, making several adjustments to offset some of the statistical problems encountered in previous literature. We examine both one year and five year forecast horizons. Given our large data sample, we are able to select non-overlapping five year horizons to avoid the thorny statistical problems with overlapping forecasts. We include a set of results in which the time period effects have been removed from the data prior to analysis. By removing the average at each time period, we not only remove an important source of cross-sectional dependence, but we also mitigate the effect of the numeraire country. In working with demeaned data (that is, data that removes the period effect or cross-sectional average at each year), we are exploiting the fact that we are interested in model evaluation in this study, not in constructing a true forecasting model. Thus, we treat the demeaned data as if it were the true data for comparing the out-of-sample performance of various models for the results labeled “demeaned.” Obviously, if we were interested in true forecasting, we would then need to forecast the time effect.

For the RMSE comparisons, we present the Theil ratios of each model’s RMSE to that of the benchmark random walk or random walk with drift, and we also present tests of statistical significance. The Theil ratio will be below unity when the raw RMSE of the fundamentals-based model is less than that of the benchmark model. For non-nested models, we use the Diebold-Mariano (1995) test to gauge statistical significance. We correct for serial correlation and calculate robust standard errors. In addition, we use a GLS estimator to prevent any outlier country from driving the results. For comparison of models that are nested, we use the recent test of Clark and West (2007). Clark and West argue that under the null that the additional parameters in the larger unconstrained model are zero, estimating those additional parameters introduces estimation error that inflates the RMSE of the alternative (fundamentals) model. Their test adjusts the mean squared prediction error of the unconstrained model to offset this bias. Because of this adjustment, the Clark-West measure sometimes provides evidence that the unconstrained (fundamentals-based) model is statistically better than the benchmark model, even when the Theil ratio is above unity. On the other hand, since we use standard errors that are robust to serial correlation and heteroskedasticity, it is possible to find that a Theil ratio much below unity is still not statistically significant.
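A minimal sketch of the two statistics follows. The Theil ratio is the model RMSE divided by the benchmark RMSE. The Clark-West statistic is computed from the adjusted loss differential, the benchmark's squared error minus the model's squared error plus the squared difference between the two forecasts, and tested with a one-sided t-test on its mean. For simplicity the sketch uses a plain standard error on a single series; it does not implement the serial correlation correction, robust standard errors, or GLS weighting described above, and the function names and data are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def theil_ratio(actual, model_fc, bench_fc):
    """Model RMSE relative to benchmark RMSE; below one favors the model."""
    return rmse(actual, model_fc) / rmse(actual, bench_fc)


def clark_west_stat(actual, bench_fc, model_fc):
    """t-statistic on the mean of the Clark-West adjusted loss differential.

    f_t = (a - bench)^2 - [(a - model)^2 - (bench - model)^2]
    Large positive values favor the larger (model) specification.
    """
    f = [(a - b) ** 2 - ((a - m) ** 2 - (b - m) ** 2)
         for a, b, m in zip(actual, bench_fc, model_fc)]
    n = len(f)
    mean = sum(f) / n
    variance = sum((v - mean) ** 2 for v in f) / (n - 1)
    return mean / math.sqrt(variance / n)


# Hypothetical exchange rate changes; the model overshoots by a factor of 2.1.
actual = [1.0, -1.0, 2.0, -2.0, 1.5, -1.5]
bench = [0.0] * len(actual)
model = [2.1 * a for a in actual]

print("Theil ratio:", theil_ratio(actual, model, bench))
print("Clark-West statistic:", clark_west_stat(actual, bench, model))
```

In this toy case the Theil ratio is above unity, yet the Clark-West statistic exceeds the usual one-sided 5 percent critical value of 1.645, which mirrors the possibility noted in the text.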

A. Model Specification

We perform the out-of-sample analysis for three specifications of the monetary model. Inspired by the tests of Meese and Rogoff for advanced countries, we use the estimated coefficients from a regression of the exchange rate on relative money supplies and relative outputs, together with the future values of the fundamentals to form the forecasts. We are principally interested in whether adding information about the fundamentals can provide better predictions for the future exchange rate than knowledge about the current value of the exchange rate alone. We deviate from Meese and Rogoff’s specification only by using panel regressions. That is, we pool the coefficient estimates on the fundamental variables for each sample, although we allow for country fixed effects for the intercept.

In addition, we use the error correction specification studied by Mark and Sul (2001) for a panel of 19 countries. They regress the exchange rate return on the current deviation of the exchange rate from its fundamental value. The fundamental value is constructed by imposing values of unity and minus unity on relative money supplies and relative output. We perform panel estimation by each group to obtain the pooled estimate of the coefficient of adjustment to equilibrium, allowing for fixed country effects for the intercept. The error correction forecast is a true forecast, in the sense that it does not include any future information.4

We also introduce two specifications of the monetary model in growth rates, rather than in the levels specification of Meese and Rogoff.5 Hendry and Mizon (2001) show that in the presence of structural breaks and policy-regime shifts, a differenced model can have a smaller forecast bias than a model in levels, because it is robust to forecasting after the equilibrium mean shift. Moreover, Rossi (2005) and Flood and Rose (2007) show that when the error term of the structural specification (1) is highly serially correlated, the random walk is likely to have a lower RMSE forecast, even when the fundamentals have explanatory power. In fact, Rossi also shows that the random walk can have lower RMSE compared to the fundamentals-based model that tries to take serial correlation into account, due to the familiar bias in estimating the serial correlation coefficient. This issue is relevant for the monetary model estimated in levels, and is mitigated by writing the monetary model in growth rates, a specification that nests the random walk. We estimate a set of regressions for the monetary model in growth rates, pooling the coefficients on money and output from each group, and including country fixed effects.6 We use these estimates to construct the out-of-sample forecasts. Separately, we construct a set of forecast errors from imposing the values of one and minus one as the coefficients on money growth and output growth, respectively, and setting the intercept to zero.
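The nesting can be made explicit with a short worked derivation. It assumes that the levels specification takes the form $s_{i,t} = \beta_{i0} + \beta_m m_{i,t} + \beta_y y_{i,t} + u_{i,t}$, which is our reading of the structural specification and not a quotation of it; the symbols $\theta_{i0}$, $\theta_m$, $\theta_y$, $\rho$, and $\varepsilon$ are introduced here for illustration.

```latex
\begin{align*}
% First difference of the assumed levels equation:
\Delta s_{i,t+1} &= \beta_m \,\Delta m_{i,t+1} + \beta_y \,\Delta y_{i,t+1} + \Delta u_{i,t+1} \\
% Estimated growth-rate specification with a country intercept:
\Delta s_{i,t+1} &= \theta_{i0} + \theta_m \,\Delta m_{i,t+1} + \theta_y \,\Delta y_{i,t+1} + e_{i,t+1} \\
% Restriction theta_m = theta_y = 0 gives the random walk with drift:
s^{f}_{i,t+1} &= s_{i,t} + \theta_{i0} \\
% Adding theta_{i0} = 0 gives the driftless random walk:
s^{f}_{i,t+1} &= s_{i,t} \\
% If u follows an AR(1) with rho close to one, differencing removes most of the persistence:
u_{i,t+1} = \rho\, u_{i,t} + \varepsilon_{i,t+1}
  \;&\Rightarrow\; \Delta u_{i,t+1} = (\rho - 1)\, u_{i,t} + \varepsilon_{i,t+1} \approx \varepsilon_{i,t+1}
\end{align*}
```

Under these assumptions, both random walk benchmarks are restricted versions of the growth-rate model, so a nested-model test is the natural comparison for that specification.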

As in Meese and Rogoff, we estimate each model over an initial period (we start with 1960-1983), make forecasts, and then recursively increase the estimation period. We obtain one-year-ahead-forecast errors from the following set of models:

Random walk:

(A) $s^{f}_{i,t+1} = s_{i,t}$

Random walk with drift:

(B) $s^{f}_{i,t+1} = s_{i,t} + \hat{\tau}_{i}$

Monetary model in levels:

(C) $s^{f}_{i,t+1} = \hat{\beta}_{i0} + \hat{\beta}_{im}\, m_{i,t+1} + \hat{\beta}_{iy}\, y_{i,t+1}$

Error correction:

(D) $s^{f}_{i,t+1} = s_{i,t} + \hat{\gamma}_{i0,1} + \hat{\gamma}_{iec,1}\,(s_{i,t} - m_{i,t} + y_{i,t})$

Monetary model in growth rates, using estimated coefficients:

(E) $s^{f}_{i,t+1} = s_{i,t} + \hat{\theta}_{i0} + \hat{\theta}_{im}\, \Delta m_{i,t+1} + \hat{\theta}_{iy}\, \Delta y_{i,t+1}$

Monetary model in growth rates, using theoretical coefficients:

(F) $s^{f}_{i,t+1} = s_{i,t} + (\Delta m_{i,t+1} - \Delta y_{i,t+1})$
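The six one-year-ahead forecasts can be written as small functions. The sketch below takes already estimated coefficients as plain arguments; the function and parameter names are hypothetical, and the estimation step itself is not shown.

```python
def forecast_random_walk(s_t):
    """(A) Driftless random walk: next year's rate equals today's."""
    return s_t


def forecast_rw_drift(s_t, tau):
    """(B) Random walk with an estimated country drift tau."""
    return s_t + tau


def forecast_levels(b0, bm, by, m_next, y_next):
    """(C) Monetary model in levels, using realized future fundamentals."""
    return b0 + bm * m_next + by * y_next


def forecast_error_correction(s_t, g0, g_ec, m_t, y_t):
    """(D) Error correction toward the fundamental value m - y,
    using only current information."""
    return s_t + g0 + g_ec * (s_t - m_t + y_t)


def forecast_growth_estimated(s_t, th0, th_m, th_y, dm_next, dy_next):
    """(E) Monetary model in growth rates with estimated coefficients."""
    return s_t + th0 + th_m * dm_next + th_y * dy_next


def forecast_growth_theory(s_t, dm_next, dy_next):
    """(F) Growth-rate model with coefficients imposed at +1 and -1."""
    return s_t + (dm_next - dy_next)


# Hypothetical inputs: log exchange rate 1.0, money growth 8%, output growth 3%.
print(forecast_growth_theory(1.0, 0.08, 0.03))
print(forecast_error_correction(1.0, 0.0, -0.1, 0.5, 0.2))
```

Setting the slope coefficients of model (E) to zero reproduces model (B), and setting them to one and minus one with a zero intercept reproduces model (F), which shows how the growth-rate specification nests the benchmarks.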

Mark (1995) shows that a fundamentals-based error correction model had greater power to predict exchange rates at longer horizons than at short horizons. However, Berkowitz and Giorgianni (2001) and Kilian (1999) find that Mark’s favorable long horizon results arise from size distortions and statistical problems associated with the high degree of dependence in overlapping observations at forecasting horizons longer than one period. They argue that long horizon regressions offer no more forecasting power than short horizon regressions. We consider Mark’s long horizon regressions, but address the criticism by constructing non-overlapping forecasts.

We construct five-year-ahead-forecasts as follows:

(A) $s^{f}_{i,t+5} = s_{i,t}$
(B) $s^{f}_{i,t+5} = s_{i,t} + 5\hat{\tau}_{i}$
(C) $s^{f}_{i,t+5} = \hat{\beta}_{i0} + \hat{\beta}_{im}\, m_{i,t+5} + \hat{\beta}_{iy}\, y_{i,t+5}$
(D) $s^{f}_{i,t+5} = s_{i,t} + \hat{\gamma}_{i0,5} + \hat{\gamma}_{iec,5}\,(s_{i,t} - m_{i,t} + y_{i,t})$
(E) $s^{f}_{i,t+5} = s_{i,t} + 5\hat{\theta}_{i0} + \hat{\theta}_{im}\,(m_{i,t+5} - m_{i,t}) + \hat{\theta}_{iy}\,(y_{i,t+5} - y_{i,t})$
(F) $s^{f}_{i,t+5} = s_{i,t} + (m_{i,t+5} - m_{i,t}) - (y_{i,t+5} - y_{i,t})$

In the monetary models in levels and growth rates, we use actual future values of the fundamentals. The error correction models contain only current variables as regressors.
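The timing of the recursive exercise can be sketched as a schedule of forecast origins and target years. The example assumes the first estimation sample ends in 1983 and the data end in 2004, as stated in the text, and that each forecast origin is the last year of the estimation sample; the function name and the stepping rule are assumptions for illustration.

```python
def forecast_schedule(first_origin, last_year, horizon):
    """Return (origin, target) year pairs for non-overlapping forecasts.

    The origin moves forward by `horizon` years each time, so consecutive
    forecast windows never share a year of exchange rate changes.
    """
    schedule = []
    origin = first_origin
    while origin + horizon <= last_year:
        schedule.append((origin, origin + horizon))
        origin += horizon
    return schedule


one_year = forecast_schedule(1983, 2004, horizon=1)
five_year = forecast_schedule(1983, 2004, horizon=5)
print(len(one_year), one_year[0], one_year[-1])
print(five_year)
```

With a five-year horizon this stepping rule yields target years 1988, 1993, 1998, and 2003, so each five-year change is used only once.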

B. Out-of-Sample Results

Tables 8-15 provide all of the out-of-sample results, for the complete set of countries and for breakdowns by regions, income groups, and inflation groups. We present a one year horizon and a five year horizon. Each of these horizons is presented for the raw data and for the demeaned data (the deviation of the actual data from the cross-sectional means in each year). The top panels of Tables 8, 10, 12, and 14 provide the root mean squared errors (RMSE) of the random walk model, the random walk with drift model, and the various fundamentals-based models. The best model (i.e., that with the lowest RMSE) is shown in bold. The bottom panels of the same tables show the proportion of observations in which the model correctly predicts the direction of change of the exchange rate. The best model is that with the highest percentage and is shown in bold. Tables 9, 11, 13, and 15 provide the Theil ratios of the fundamentals-based models measured against benchmarks of the random walk (top panel) and the random walk with drift (bottom panel). The Theil ratios below unity are bolded. However, we also provide a measure of the statistical significance as determined by either Clark-West tests or Diebold-Mariano tests, depending on whether the models are nested or non-nested, respectively. As discussed earlier, it is possible to observe a significant result even with a Theil ratio above unity, or a non-significant result even if the Theil ratio is much below unity.

Table 8.

Out-of-Sample Forecast: One-Year-Ahead-Forecast

Note: Models C-D are estimated with pooled equations. In model F, the theoretical elasticities on money and output are imposed as plus and minus unity. Initial estimation period is from 1960-83 for 98 countries. Forecasts run from 1984 to 2004. An observation is classified as a low inflation episode if inflation did not exceed 30 percent in the current or any of the previous four years. A country is classified as a low inflation country if it did not have any episode of high inflation from 1960-2004.
Table 9.

Theil Statistic: One-Year-Ahead-Forecast

Note: Bolded values represent models that outperform the random walk (top panel) and random walk with drift (bottom panel). The asterisks denote significant Theil statistics, at one percent (***), five percent (**), and ten percent (*) significance levels. The Clark-West test is used for nested models and the Diebold-Mariano test for non-nested models.
Table 10.

Out-of-Sample Forecast: One-Year-Ahead-Forecast (demeaned data)

Note: Models C-D are estimated with pooled equations. In model F, the theoretical elasticities on money and output are imposed as plus and minus unity. Initial estimation period is from 1960-83 for 98 countries. Forecasts run from 1984 to 2004. An observation is classified as a low inflation episode if inflation did not exceed 30 percent in the current or any of the previous four years. A country is classified as a low inflation country if it did not have any episode of high inflation from 1960-2004.
Table 11.

Theil Statistic: One-Year-Ahead-Forecast (demeaned data)

Note: Bolded values represent models that outperform the random walk (top panel) and random walk with drift (bottom panel). The asterisks denote significant Theil statistics, at one percent (***), five percent (**), and ten percent (*) significance levels. The Clark-West test is used for nested models and the Diebold-Mariano test for non-nested models.
Table 12.

Out-of-Sample Forecast: Five-Year-Ahead-Forecast

Note: Models C-D are estimated with pooled equations. In model F, the theoretical elasticities on money and output are imposed as plus and minus unity. Initial estimation period is from 1960-83 for 98 countries. Out of sample tests use non-overlapping 5 year forecasts for years 1988, 1993, 1998, and 2003. An observation is classified as a low inflation episode if inflation did not exceed 30 percent in the current or any of the previous four years. A country is classified as a low inflation country if it did not have any episode of high inflation from 1960-2004.
Table 13.

Theil Statistic: Five-Year-Ahead-Forecast

Note: Bolded values represent models that outperform the random walk (top panel) and random walk with drift (bottom panel). The asterisks denote significant Theil statistics, at one percent (***), five percent (**), and ten percent (*) significance levels. The Clark-West test is used for nested models and the Diebold-Mariano test for non-nested models.
Table 14.

Out-of-Sample Forecast: Five-Year-Ahead-Forecasts (demeaned data)

Note: Models C-D are estimated with pooled equations. In model F, the theoretical elasticities on money and output are imposed as plus and minus unity. Initial estimation period is from 1960-83 for 98 countries. Out of sample tests use non-overlapping 5 year forecasts for years 1988, 1993, 1998, and 2003. An observation is classified as a low inflation episode if inflation did not exceed 30 percent in the current or any of the previous four years. A country is classified as a low inflation country if it did not have any episode of high inflation from 1960-2004.