Nonlinear Exchange Rate Models
A Selective Overview
Author: Lucio Sarno, International Monetary Fund (https://isni.org/isni/0000000404811396)

Contributor Notes

Author’s E-Mail Address: Lsarno@imf.org

Abstract

This paper provides a selective overview of nonlinear exchange rate models recently proposed in the literature and assesses their contribution to understanding exchange rate behavior. Two key questions are examined. The first question is whether nonlinear autoregressive models of real exchange rates help resolve the "purchasing power parity (PPP) puzzles." The second question is whether recently developed nonlinear, regime-switching vector equilibrium correction models of the nominal exchange rate can beat a random walk model, the standard benchmark in the exchange rate literature, in terms of out-of-sample forecasting performance. Finally, issues related to the adequateness of standard methods of evaluation of (linear and nonlinear) exchange rate models are discussed with reference to different forecast accuracy criteria.

I. Introduction

In recent years, a growing body of literature in exchange rate economics has devoted a great deal of attention to the role of nonlinear dynamics in exchange rates. This paper provides a selective overview of nonlinear exchange rate models recently proposed by researchers and assesses their contribution to understanding exchange rate behavior.2 Two key questions are addressed. The first is concerned with whether nonlinear autoregressive models of real exchange rates help us resolve the “purchasing power parity (PPP) puzzles”—namely the fact that the real exchange rate either displays no tendency to revert to a stable long-run equilibrium level consistent with PPP or shows an implausibly slow speed of mean reversion. The second question is whether recently developed nonlinear, regime-switching vector equilibrium correction models of the nominal exchange rate can beat a random walk model, the standard benchmark in the exchange rate literature, in terms of out-of-sample forecasting performance. Finally, issues related to the adequateness of standard methods of evaluation of (linear and nonlinear) exchange rate models are discussed with reference to both point and density forecast accuracy criteria.

The remainder of the paper is organized as follows. Section II briefly reviews the literature on testing the validity of long-run PPP and on modeling the real exchange rate, discusses the theoretical rationale for nonlinear mean reversion in the real exchange rate, and then summarizes the empirical evidence on the ability of nonlinear models of real exchange rate behavior to address the PPP puzzles. Section III provides an overview of the literature on out-of-sample exchange rate forecasting using economic fundamentals and the term structure of forward premia, followed by a discussion of the recent work on nonlinear models of the term structure and an assessment of their ability to beat a random walk forecast and to improve upon linear specifications. Section IV reviews some of the recent work on evaluating density forecasts, which is particularly relevant in the context of exchange rates and of nonlinear models that are consistent with nonnormal densities. This section also includes a simple application of density forecast evaluation in the context of linear and nonlinear models of exchange rates. A final section concludes.

II. Real Exchange Rate Behavior and Long-Run Purchasing Power Parity

A. A Brief Overview of the Literature

The purchasing power parity (PPP) hypothesis states that national price levels should be equal when expressed in a common currency. Although very few economists would believe that this simple proposition holds at each point in time, a large literature in international finance has examined empirically the validity of PPP over the long run either by testing whether nominal exchange rates and relative prices move together in the long run or by testing whether the real exchange rate has a tendency to revert to a stable equilibrium level over time. The latter approach is motivated by the fact that the real exchange rate may be defined as the nominal exchange rate adjusted for relative national price levels. More formally, the real exchange rate, qt, may be expressed in logarithmic form as

$$q_t \equiv s_t - p_t + p_t^{*}, \tag{1}$$

where $s_t$ is the logarithm of the nominal exchange rate (domestic price of foreign currency), and $p_t$ and $p_t^{*}$ denote the logarithms of the domestic and foreign price levels, respectively. The real exchange rate $q_t$ may thus be interpreted as a measure of the deviation from PPP and must be stationary for long-run PPP to hold (see the surveys of Froot and Rogoff, 1995; Rogoff, 1996; Sarno and Taylor, 2002).
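As a small numerical illustration of definition (1), the sketch below computes the log real exchange rate from levels. The function name and all numbers are hypothetical, chosen so that PPP holds exactly in the first case.

```python
import math

def log_real_exchange_rate(spot, price_domestic, price_foreign):
    """Log real exchange rate q = s - p + p*, with inputs given in levels.

    spot is the domestic price of one unit of foreign currency.
    """
    return math.log(spot) - math.log(price_domestic) + math.log(price_foreign)

# Hypothetical values: domestic prices are 25 percent above foreign prices and
# foreign currency costs 1.25 domestic units, so the PPP deviation is zero.
q_ppp = log_real_exchange_rate(1.25, 125.0, 100.0)

# A nominal depreciation with unchanged prices raises q above zero.
q_dev = log_real_exchange_rate(1.50, 125.0, 100.0)
```

In the second case the deviation equals log(1.2), since the nominal rate is 20 percent above its PPP value.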

Although long-run PPP is a very simple proposition about exchange rate behavior, it has attracted the attention of researchers for decades. Indeed, whether long-run PPP holds or whether the real exchange rate is stationary has important economic implications on a number of fronts. In particular, the degree of persistence in the real exchange rate can be used to infer the principal impulses driving exchange rate movements. For example, if the real exchange rate is highly persistent or close to a random walk, then the shocks are likely to be real-side, principally technology shocks, whereas if it is not very persistent, then the shocks must be principally to aggregate demand, such as, for example, innovations to monetary policy (Rogoff, 1996). Further, from a theoretical perspective, if PPP is not a valid long-run international parity condition, this casts doubts on the predictions of much open-economy macroeconomics that is based on the assumption of long-run PPP. Indeed, the implications of open economy dynamic models are very sensitive to the presence or absence of a unit root in the real exchange rate (e.g., Lane, 2001; Sarno, 2001). Finally, estimates of PPP exchange rates are often used for practical purposes such as determining the degree of misalignment of the nominal exchange rate and the appropriate policy response, the setting of exchange rate parities, and the international comparison of national income levels. These practical uses of the PPP concept, and in particular the calculation of PPP exchange rates, would obviously be of very limited use if PPP deviations contain a unit root.

Regardless of the great interest in this area of research, manifested by the large number of papers on PPP published over the last few decades, and regardless of the increasing quality of data sets utilized and of the econometric techniques employed, the validity of long-run PPP and the properties of PPP deviations remain the subject of ongoing controversy. Specifically, earlier cointegration studies generally reported the absence of significant mean reversion of the real exchange rate for the recent floating experience (Taylor, 1988; Mark, 1990), but were supportive of reversion toward PPP for the gold standard period (McCloskey and Zecher, 1984; Diebold, Husted, and Rush, 1991), for the interwar float (Taylor and McMahon, 1988), for the 1950s U.S.-Canadian float (McNown and Wallace, 1989), and for the exchange rates of high-inflation countries (Choudhry, McNown, and Wallace, 1991). More recent applied work on long-run PPP among the major industrialized economies has, however, been more favorable toward the long-run PPP hypothesis for the recent float (e.g., Corbae and Ouliaris, 1988; Cheung and Lai, 1993a, 1993b, 1994, 1998; Frankel and Rose, 1996).

One well-documented explanation for the inability to find clear-cut evidence of PPP is the low power of conventional statistical tests to reject a false null hypothesis of a unit root in the real exchange rate or no cointegration between the nominal exchange rate and relative prices with a sample span corresponding to the length of the recent float (Frankel, 1986, 1990; Froot and Rogoff, 1995; Lothian and Taylor, 1997). Researchers have sought to overcome the power problem in testing for mean reversion in the real exchange rate either through long-span studies (e.g., Kim, 1990; Lothian and Taylor, 1996; Taylor, 2002) or through panel unit root studies (e.g., Abuaf and Jorion, 1990; Frankel and Rose, 1996; O’Connell, 1998; Papell, 1998; Sarno and Taylor, 1998; Taylor and Sarno, 1998). However, whether the long-span or panel-data studies do in fact answer the question of whether PPP holds in the long run remains contentious. As far as the long-span studies are concerned, as noted in particular by Frankel and Rose (1996), the long samples required to generate a reasonable level of statistical power with standard univariate unit root tests may be unavailable for many currencies (perhaps thereby generating a “survivorship bias” in tests on the available data) and, in any case, may potentially be inappropriate because of differences in real exchange rate behavior both across different historical periods and across different nominal exchange rate regimes (e.g., Baxter and Stockman, 1989; Hegwood and Papell, 1999; Taylor, 2002). As for panel-data studies, these provide mixed evidence. While, for example, Abuaf and Jorion (1990), Frankel and Rose (1996) and Taylor and Sarno (1998) find results favorable to long-run PPP, O’Connell (1998) rejects it on the basis of his empirical evidence.

In light of the evidence provided by this literature, there remain several unresolved puzzles, among which two are prominent. First, it is still controversial whether long-run PPP is valid during the recent floating exchange rate regime. Second, it is puzzling why the majority of studies find estimates of the persistence of PPP deviations (half lives of shocks ranging between three and five years) that are too high to be explained by conventional nominal rigidities and that cannot be reconciled with the large short-term volatility of real exchange rates (Rogoff, 1996).
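To make the persistence figures concrete, the following sketch converts between a half life and the coefficient of a first-order autoregression. It assumes a simple linear AR(1) for the real exchange rate, which is an illustrative simplification and not the specification used in the studies cited; the function names are hypothetical.

```python
import math

def half_life(rho):
    """Periods needed for half of a shock to die out in q_t = rho * q_{t-1} + e_t."""
    if not 0.0 < rho < 1.0:
        raise ValueError("half life is defined here only for 0 < rho < 1")
    return math.log(0.5) / math.log(rho)

def rho_from_half_life(periods):
    """AR(1) coefficient implied by a given half life."""
    return 0.5 ** (1.0 / periods)

# Monthly AR(1) coefficients implied by half lives of three and five years.
rho_3y = rho_from_half_life(36)
rho_5y = rho_from_half_life(60)
```

Under this assumption both monthly coefficients exceed 0.98, which shows how close to a unit root a series with a three-to-five-year half life is at the monthly frequency.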

B. Nonlinear Mean Reversion in Real Exchange Rates: Rationale and Modeling Procedures

In the procedures conventionally applied to test for long-run PPP, the null hypothesis is usually that the process generating the real exchange rate series has a unit root, while the alternative hypothesis is that all of the roots of the process lie within the unit circle. Thus, the maintained hypothesis in the conventional framework assumes a linear autoregressive process for the real exchange rate, which means that adjustment is both continuous and of constant speed, regardless of the size of the deviation from PPP. However, the presence of transactions costs may imply a nonlinear process, which has important implications for the conventional unit root tests of long-run PPP. The idea that there may be nonlinearities in real exchange rate adjustment dates at least from Heckscher (1916), who suggested that there may be significant deviations from the law of one price due to international transactions costs between spatially separated markets. A similar viewpoint can be discerned in the writings of Cassel (e.g., Cassel, 1922) and, to a greater or lesser extent, in other earlier writers (Officer, 1982). More recently, a number of authors have developed theoretical models of nonlinear real exchange rate adjustment arising from transactions costs in international arbitrage (e.g., Benninga and Protopapadakis, 1988; Williams and Wright, 1991; Dumas, 1992; Sercu, Uppal and Van Hulle, 1995). In most of these models, proportional or “iceberg” transport costs (“iceberg” because a fraction of goods are presumed to “melt” when shipped) create a band for the real exchange rate within which the marginal cost of arbitrage exceeds the marginal benefit. Assuming instantaneous goods arbitrage at the edges of the band then typically implies that the thresholds become reflecting barriers.

Drawing on recent work on the theory of investment under uncertainty, some of these studies show that the thresholds should be interpreted more broadly than as simply reflecting shipping costs and trade barriers per se, but also as resulting from the sunk costs of international arbitrage and the resulting tendency for traders to wait for sufficiently large arbitrage opportunities to open up before entering the market (see in particular Dumas, 1992; Obstfeld and Rogoff, 2000).

O’Connell and Wei (2002) extend the iceberg model to allow for fixed as well as proportional costs of arbitrage. This results in a two-threshold model where the real exchange rate is reset by arbitrage to an upper or lower inner threshold whenever it hits the corresponding outer threshold. Intuitively, arbitrage will be heavy once it is profitable enough to outweigh the initial fixed cost, but will stop short of returning the real rate to the PPP level because of the proportional arbitrage costs.

Overall, these models suggest that the exchange rate will become increasingly mean reverting with the size of the deviation from the equilibrium level. In some models the jump to mean-reverting behavior is sudden, whilst in others it is smooth, and Dumas (1994) suggests that even in the former case, time aggregation will tend to smooth the transition between regimes. Moreover, if the real exchange rate is measured using price indices made up of goods prices each with a different size of international arbitrage costs, one would expect adjustment of the overall real exchange rate to be smooth rather than discontinuous.

Some empirical evidence of the effect of transactions costs on tests of PPP is provided by Davutyan and Pippenger (1990). More recently, Obstfeld and Taylor (1997), Taylor (2001), Sarno, Taylor, and Chowdhury (2003), and Leon and Najarian (2003) have investigated the nonlinear nature of the adjustment process in terms of a threshold autoregressive (TAR) model (Tong, 1990). The TAR model allows for a transactions costs band within which no adjustment takes place, so that deviations from PPP may exhibit unit root behavior, while outside of the band the process switches abruptly to become stationary autoregressive. While discrete switching of this kind may be appropriate when considering the effects of arbitrage on disaggregated goods prices (Obstfeld and Taylor, 1997), discrete adjustment of the aggregate real exchange rate would clearly be most appropriate only when firms and traded goods are identical. Moreover, several theoretical studies suggest that smooth rather than discrete adjustment may be more appropriate in the presence of proportional transactions costs and, as suggested by Teräsvirta (1994), Dumas (1994), and Bertola and Caballero (1990), time aggregation and nonsynchronous adjustment by heterogeneous agents are likely to result in smooth aggregate regime switching.
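The discrete switching described above can be sketched in a few lines. The version below is one common band-TAR form in which reversion outside the band is toward the nearer band edge; that choice, the function name, and the parameter values are assumptions made for illustration, not a specification taken from any of the cited papers.

```python
def band_tar_step(q_prev, band, rho_outer, shock=0.0):
    """One step of a symmetric band-TAR process for the PPP deviation.

    Inside [-band, band] the deviation follows a random walk. Outside, the
    distance from the nearer band edge shrinks by the factor rho_outer.
    """
    if abs(q_prev) <= band:
        return q_prev + shock
    edge = band if q_prev > 0 else -band
    return edge + rho_outer * (q_prev - edge) + shock

# Hypothetical band of 10 percent and outer-regime coefficient of 0.5.
inside = band_tar_step(0.05, band=0.10, rho_outer=0.5)    # no adjustment
outside = band_tar_step(0.30, band=0.10, rho_outer=0.5)   # pulled toward the edge
```

The abrupt change in behavior at the band edge is what the smooth transition models discussed next are designed to relax.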

An alternative characterization of nonlinear adjustment, which allows for smooth rather than discrete adjustment, is in terms of a smooth transition autoregressive (STAR) model (Granger and Teräsvirta, 1993). In the STAR model, adjustment takes place in every period but the speed of adjustment varies with the extent of the deviation from parity. A STAR model may be written as follows:

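$$[q_t - \mu] = \sum_{j=1}^{p} \beta_j [q_{t-j} - \mu] + \left( \sum_{j=1}^{p} \beta_j^{*} [q_{t-j} - \mu] \right) \Phi[\theta; q_{t-d} - \mu] + \varepsilon_t, \tag{2}$$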

where $\{q_t\}$ is a stationary and ergodic process, $\varepsilon_t \sim \text{iid}(0, \sigma^2)$ and $(\theta, \mu) \in \{\Re^{+} \times \Re\}$, where $\Re$ denotes the real line $(-\infty, \infty)$ and $\Re^{+}$ the positive real line $(0, \infty)$. The transition function $\Phi[\theta; q_{t-d} - \mu]$ determines the degree of mean reversion and is itself governed by the parameter $\theta$, which effectively determines the speed of mean reversion, and the parameter $\mu$, which is the equilibrium level of $\{q_t\}$; the integer $d > 0$ denotes a delay parameter. A simple transition function suggested by Granger and Teräsvirta (1993) is the exponential function:

$$\Phi[\theta; q_{t-d} - \mu] = 1 - \exp\left\{ -\theta^2 [q_{t-d} - \mu]^2 \right\}, \tag{3}$$

in which case (2) would be termed an exponential STAR or ESTAR model. The exponential transition function is bounded between zero and unity, $\Phi: \Re \to [0, 1]$, has the properties $\Phi[0] = 0$ and $\lim_{x \to \pm\infty} \Phi[x] = 1$, and is symmetrically inverse-bell shaped around zero. These properties of the ESTAR model are attractive in the present modeling context because they allow a smooth transition between regimes and symmetric adjustment of the real exchange rate for deviations above and below the equilibrium level. The transition parameter $\theta$ determines the speed of transition between the two extreme regimes, with lower absolute values of $\theta$ implying slower transition. The inner regime corresponds to $q_{t-d} = \mu$, when $\Phi = 0$ and (2) becomes a linear AR($p$) model:

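$$[q_t - \mu] = \sum_{j=1}^{p} \beta_j [q_{t-j} - \mu] + \varepsilon_t. \tag{4}$$

The outer regime corresponds, for a given $\theta$, to $\lim_{[q_{t-d} - \mu] \to \pm\infty} \Phi[\theta; q_{t-d} - \mu]$, where (2) becomes a different AR($p$) model:

$$[q_t - \mu] = \sum_{j=1}^{p} (\beta_j + \beta_j^{*}) [q_{t-j} - \mu] + \varepsilon_t, \tag{5}$$

with a correspondingly different speed of mean reversion so long as $\beta_j^{*} \neq 0$ for at least one value of $j$.3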
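The properties of the exponential transition function and the two limiting regimes can be checked numerically. The sketch below assumes a single lag (p = 1) for simplicity; the function names and the parameter values are hypothetical.

```python
import math

def estar_transition(deviation, theta):
    """Exponential transition function: 1 - exp(-theta^2 * deviation^2)."""
    return 1.0 - math.exp(-(theta ** 2) * deviation ** 2)

def effective_ar_coefficient(deviation, theta, beta, beta_star):
    """Weight on the lagged deviation in an ESTAR model with one lag."""
    return beta + beta_star * estar_transition(deviation, theta)

# Hypothetical parameters: random walk in the inner regime (beta = 1) and
# full mean reversion in the outer regime (beta + beta_star = 0).
inner = effective_ar_coefficient(0.0, theta=0.5, beta=1.0, beta_star=-1.0)
outer = effective_ar_coefficient(100.0, theta=0.5, beta=1.0, beta_star=-1.0)
```

For intermediate deviations the effective coefficient lies between the inner and outer values, which is the sense in which adjustment occurs in every period but at a speed that depends on the size of the deviation.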

It is also instructive to reparameterize the STAR model (2) as

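$$\Delta q_t = \alpha + \rho q_{t-1} + \sum_{j=1}^{p-1} \phi_j \Delta q_{t-j} + \left\{ \alpha^{*} + \rho^{*} q_{t-1} + \sum_{j=1}^{p-1} \phi_j^{*} \Delta q_{t-j} \right\} \Phi[\theta; q_{t-d}] + \varepsilon_t, \tag{6}$$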

where $\Delta q_{t-j} \equiv q_{t-j} - q_{t-j-1}$. In this form, the crucial parameters are $\rho$ and $\rho^{*}$. Our above discussion of the effect of transactions costs suggests that the larger the deviation from PPP, the stronger will be the tendency to move back to equilibrium. This implies that while $\rho \geq 0$ is admissible, we must have $\rho^{*} < 0$ and $(\rho + \rho^{*}) < 0$. That is, for small deviations $q_t$ may be characterized by unit root or even explosive behavior, but for large deviations the process is mean reverting. This analysis has implications for the conventional test for a unit root in the real exchange rate process, which is based on a linear AR($p$) model, written below as an augmented Dickey-Fuller regression:

$$\Delta q_t = \alpha' + \rho' q_{t-1} + \sum_{j=1}^{p-1} \phi_j' \Delta q_{t-j} + \varepsilon_t. \tag{7}$$

Assuming that the true process for $q_t$ is given by the nonlinear model (6), estimates of the parameter $\rho'$ in (7) will tend to lie between $\rho$ and $(\rho + \rho^{*})$, depending upon the distribution of observed deviations from the equilibrium level $\mu$. Hence, the null hypothesis $H_0: \rho' = 0$ (a single unit root) may not be rejected against the stationary linear alternative hypothesis $H_1: \rho' < 0$, even though the true nonlinear process is globally stable with $(\rho + \rho^{*}) < 0$. Thus, failure to reject the unit root hypothesis on the basis of a linear model does not necessarily invalidate long-run PPP.
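The claim that the linear estimate falls between the inner and outer regime coefficients can be illustrated by simulation. The sketch below generates data from a restricted version of (6) with no intercepts, no lagged differences, $\mu = 0$, $\rho = 0$ and $\rho^{*} = -1$, and then fits the Dickey-Fuller regression by ordinary least squares. All parameter values are hypothetical and chosen only to make the pattern visible; no unit root test statistic is computed.

```python
import math
import random

def simulate_estar(n, theta, sigma, seed=0):
    """Simulate q_t = q_{t-1} - Phi(q_{t-1}) * q_{t-1} + e_t (rho = 0, rho* = -1)."""
    rng = random.Random(seed)
    q = [0.0]
    for _ in range(n):
        prev = q[-1]
        phi = 1.0 - math.exp(-(theta ** 2) * prev ** 2)
        q.append(prev - phi * prev + rng.gauss(0.0, sigma))
    return q

def dickey_fuller_slope(q):
    """OLS slope from regressing the change in q on its lagged level, with an intercept."""
    x = q[:-1]
    y = [b - a for a, b in zip(q[:-1], q[1:])]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

q_sim = simulate_estar(5000, theta=0.7, sigma=0.5, seed=1)
rho_hat = dickey_fuller_slope(q_sim)
```

In this setup the fitted slope is an average of the state-dependent adjustment speeds, so it lies strictly between the outer-regime value of -1 and the inner-regime value of 0. When most observations sit near equilibrium, the slope is pulled toward zero, which is the situation in which a linear test has little power.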

C. Empirical Evidence on Nonlinear Mean Reversion in Real Exchange Rates

We now turn to the empirical evidence on nonlinear mean reversion in real exchange rates. Michael, Nobay, and Peel (MNP) (1997) apply the ESTAR model to monthly interwar data for the French franc-U.S. dollar, French franc-pound sterling, and pound sterling-U.S. dollar as well as for the Lothian and Taylor (1996) long span data set. Their results clearly reject the linear framework in favor of an ESTAR process. The systematic pattern in the estimates of the nonlinear models provides strong evidence of mean-reverting behavior for PPP deviations, and helps explain the mixed results of previous studies. However, the periods examined by MNP are ones over which the relevance of long-run PPP is uncontentious (Taylor and McMahon, 1988; Lothian and Taylor, 1996).

Using data for the recent float, however, Taylor, Peel and Sarno (TPS) (2001) report empirical results that provide strong confirmation that four major real bilateral dollar exchange rates are well characterized by nonlinearly mean reverting processes. For example, the estimated model for dollar-sterling over the 1973–1996 sample period is as follows:

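$$\hat{q}_t = q_{t-1} - \left[ 1 - \exp\left\{ -\underset{(2.771)\ [0.002]}{0.452} \left( q_{t-1} + \underset{(4.274)}{0.149} \right)^2 \right\} \right] \left[ q_{t-1} + \underset{(4.274)}{0.149} \right], \tag{8}$$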

where the hat denotes the fitted value and figures in parentheses are t-ratios; the p-value for the significance of the speed of adjustment parameter in the transition function, calculated by Monte Carlo methods and shown in square brackets, is 0.002. The recorded $R^2$ for this simple nonlinear AR(1) is 0.94. This estimated model, which may be seen as representative of the results reported by TPS (2001), implies an equilibrium level of the real exchange rate in the neighborhood of which the behavior of the log-level of the real exchange rate is close to a random walk, becoming increasingly mean reverting with the absolute size of the deviation from equilibrium, consistent with the recent theoretical literature on the nature of real exchange rate dynamics in the presence of international arbitrage costs.
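The behavior implied by the estimated model can be traced out directly. The sketch below assumes that the reported coefficient 0.452 multiplies the squared deviation inside the exponential and that the equilibrium level is -0.149, which is one reading of (8); the function names are hypothetical.

```python
import math

THETA_SQ = 0.452   # reported coefficient, read as multiplying the squared deviation
MU = -0.149        # equilibrium log real exchange rate implied by (8)

def fitted_next(q_prev):
    """Fitted value of q_t given q_{t-1} under the assumed reading of (8)."""
    dev = q_prev - MU
    phi = 1.0 - math.exp(-THETA_SQ * dev ** 2)
    return q_prev - phi * dev

def reversion_share(dev):
    """Share of a deviation from equilibrium removed in one period."""
    return 1.0 - math.exp(-THETA_SQ * dev ** 2)

# One-period reversion shares for a range of log deviations from equilibrium.
shares = {dev: reversion_share(dev) for dev in (0.01, 0.05, 0.10, 0.20, 0.40)}
```

Under this reading, a log deviation of 0.40 is corrected by roughly 7 percent in one period, while a deviation of 0.01 is corrected by less than 0.01 percent, so the series is very close to a random walk near equilibrium.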

TPS also estimated the impulse response functions corresponding to their estimated nonlinear real exchange rate models by Monte Carlo integration.4 By taking account of statistically significant nonlinearities, TPS find the speed of real exchange rate adjustment to be typically much faster than the very slow speeds of real exchange rate adjustment hitherto recorded in the literature. For example, the estimated half lives (in months) conditional on average initial history of the real exchange rate are the following:

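[Table not reproduced: estimated half lives, in months, of shocks of different sizes to the dollar-sterling, dollar-mark, dollar-franc, and dollar-yen real exchange rates.]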

where in the first row we report the size of the shock (in percentage terms) to the level of the real exchange rate. The estimated half lives of these four major real dollar exchange rates illustrate the nonlinear nature of the response to shocks quite clearly, with larger shocks mean reverting much faster than smaller shocks. The dollar-sterling and dollar-mark models show a marked degree of similarity in terms of the estimated half lives, displaying quite fast mean reversion, ranging from a half life of under one year for the largest shocks of 40 percent to just under three years for very small shocks of 1 percent; for shocks of 5–10 percent, the half lives are just over two years. The dollar-franc displays slightly higher persistence, conditional on average history, with half lives ranging from 13 months for a 40 percent shock to 40 months for a 1 percent shock, while for the dollar-yen the range is 14 to 42 months. For the other sizes of shocks considered, the half lives for dollar-franc and dollar-yen are very close.
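The idea of computing shock-dependent half lives by simulation can be sketched as follows. This is a simplified illustration, not the TPS procedure: every path starts at the equilibrium level instead of being conditioned on observed histories, the baseline path receives no shock at the initial date, and the residual standard deviation `sigma` is an assumed value. The model coefficients follow the same reading of (8) as above, and the function names are hypothetical.

```python
import math
import random

def estar_next(q_prev, mu, c, shock):
    """One step of the restricted ESTAR model: the deviation shrinks by exp(-c * dev^2)."""
    dev = q_prev - mu
    return mu + dev * math.exp(-c * dev * dev) + shock

def girf_half_life(delta, mu, c, sigma, horizon=600, n_paths=200, seed=0):
    """First horizon at which the average response to a shock delta is at most half of delta.

    Simplifying assumption: every path starts at the equilibrium level mu,
    rather than from observed histories of the series.
    """
    rng = random.Random(seed)
    response = [0.0] * (horizon + 1)
    for _ in range(n_paths):
        base, hit = mu, mu + delta
        for h in range(1, horizon + 1):
            eps = rng.gauss(0.0, sigma)  # identical future innovation on both paths
            base = estar_next(base, mu, c, eps)
            hit = estar_next(hit, mu, c, eps)
            response[h] += (hit - base) / n_paths
    for h in range(1, horizon + 1):
        if abs(response[h]) <= 0.5 * abs(delta):
            return h
    return None  # the shock has not half-decayed within the horizon

# Hypothetical inputs: c and mu follow the reading of (8) used above; sigma = 0.03
# is an assumed residual standard deviation, not a value reported in the text.
hl_small = girf_half_life(0.05, mu=-0.149, c=0.452, sigma=0.03)
hl_large = girf_half_life(0.40, mu=-0.149, c=0.452, sigma=0.03)
```

Averaging over future innovations matters in a nonlinear model because later shocks move the series into regions where adjustment is faster, so the half life differs from what a noise-free iteration of the fitted equation would suggest.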

These results therefore seem to shed some light on Rogoff’s (1996) PPP puzzle. Only for small shocks occurring when the real exchange rate is near its equilibrium do nonlinear models consistently yield half lives in the range of three to five years, which Rogoff (1996) terms “glacial.” For dollar-mark and dollar-sterling in particular, even small shocks of one to five percent have a half life under three years, conditional on average history. For larger shocks, the speed of mean reversion is even faster.5

III. Out-of-Sample Forecasts of the Nominal Exchange Rate

A. The Failure of Conventional Exchange Rate Models

A logical way of examining the empirical ability of exchange rate models is to examine their out-of-sample forecasting performance. In a highly influential paper, Meese and Rogoff (1983) compare the out-of-sample forecasts produced by various exchange rate models with forecasts produced by a random walk model, by the forward exchange rate, by a univariate regression of the spot rate, and by a vector autoregression. They use rolling regressions to generate a succession of out-of-sample forecasts for each model and for various time horizons. The conclusion which emerges from this study is that, on a comparison of root mean square errors (RMSEs), none of the asset-market exchange rate models outperforms the simple random walk, even though actual future values of the right-hand-side variables are allowed in the dynamic forecasts (thereby giving the models a very large informational advantage). In particular, Meese and Rogoff compare random-walk forecasts with those produced by the flexible-price monetary model, Frankel’s (1979) real interest rate differential variant of the monetary model, and a synthesis of the monetary and portfolio balance models suggested by Hooper and Morton (1982).
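The mechanics of a rolling out-of-sample comparison against a random walk can be sketched as follows. This is a minimal illustration with a single hypothetical predictor dated at the forecast origin; it does not reproduce the Meese and Rogoff design, which used several structural models and actual future values of the right-hand-side variables. The function names and the synthetic data are assumptions.

```python
import math

def ols(x, y):
    """Intercept and slope of a univariate least squares regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def rolling_rmse(s, x, window):
    """One-step-ahead RMSEs of a rolling regression forecast and of a random walk.

    s[t] is the log spot rate and x[t] a predictor observed at time t. Each
    forecast of s[t+1] uses only data available at time t.
    """
    err_model, err_rw = [], []
    for t in range(window, len(s) - 1):
        xs = x[t - window:t]
        ys = [s[i + 1] - s[i] for i in range(t - window, t)]
        a, b = ols(xs, ys)
        err_model.append(s[t + 1] - (s[t] + a + b * x[t]))
        err_rw.append(s[t + 1] - s[t])  # random walk forecast: no change
    rmse = lambda errors: math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse(err_model), rmse(err_rw)

# Hypothetical noise-free data in which the predictor is informative by construction.
x_demo = [math.sin(i) for i in range(60)]
s_demo = [0.0]
for i in range(59):
    s_demo.append(s_demo[-1] + 0.5 * x_demo[i])
rmse_model, rmse_rw = rolling_rmse(s_demo, x_demo, window=20)
```

With real exchange rate data the ranking is an empirical matter; the synthetic example only confirms that the procedure rewards a predictor when one exists.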

A variant of the Meese-Rogoff approach involves employing a time-varying parameter model. In fact, the poor forecasting performance noted by Meese and Rogoff may be due to the fact that the parameters in the estimated equations are unstable. This instability may be rationalized on a number of grounds: as a response to policy regime changes, an example of a Lucas critique problem (Lucas, 1976); because of implicit instability in the money demand or PPP equations; or because of agents’ heterogeneity leading to different responses to macroeconomic developments over time. For example, Schinasi and Swamy (1989) use a Kalman-filter maximum likelihood estimation technique to estimate time-varying parameter models which are found to outperform the random walk model of the exchange rate for certain time periods and currencies.

A general finding in this literature is that one key to improving forecast performance based on economic fundamentals lies in the introduction of equation dynamics. This has been done in various ways: by using dynamic forecasting equations for the forcing variables in the forward-looking, rational expectations version of the flexible-price monetary model, by incorporating dynamic partial adjustment terms into the estimating equation, by using time-varying parameter estimation techniques, and by using dynamic error correction forms (e.g., Koedijk and Schotman, 1990; MacDonald and Taylor, 1993, 1994).

Nevertheless, it remains true that most studies which claim to have beaten the random walk in out-of-sample forecasting turn out to be fragile in the sense that it is generally hard to replicate the superior forecasting performance for alternative periods and alternative currencies.

A related approach, originally due in the context of foreign exchange market analysis to Mark (1995), considers long-horizon predictability through analysis of equations of the form:

$$\Delta_k s_{t+k} = \alpha + \beta_k (z_t - s_t) + u_{t+k}, \tag{9}$$

where $z_t$ is an exchange rate fundamental, for example that suggested by the monetary class of models, $z_t \equiv [(m_t - m_t^{*}) - \kappa(y_t - y_t^{*})]$, and $u_{t+k}$ is a disturbance term.6 If the fundamental in question helps forecast the exchange rate, then we should find $\beta_k > 0$ and significantly different from zero. In a series of forecasting tests over long horizons for a number of quarterly dollar exchange rates, Mark finds that equation (9) may be able to predict the nominal exchange rate only at long horizons, such as the four-year horizon. Moreover, both the goodness of in-sample fit and the estimated value of $\beta_k$ rise as the horizon $k$ rises. Mark interprets this as evidence that, while quarter-to-quarter exchange rate movements may be noisy, systematic movements related to the fundamentals become apparent in long-horizon changes.
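The construction of the long-horizon regression (9) can be sketched as follows. The data are hypothetical and noise-free: the exchange rate closes a fixed fraction of the gap to a constant fundamental each period. The function names are assumptions, and the sketch ignores the inference problems created by overlapping observations in real applications.

```python
def ols_slope(x, y):
    """Slope of a univariate least squares regression with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def long_horizon_beta(s, z, k):
    """OLS slope from regressing s[t+k] - s[t] on z[t] - s[t]."""
    x = [z[t] - s[t] for t in range(len(s) - k)]
    y = [s[t + k] - s[t] for t in range(len(s) - k)]
    return ols_slope(x, y)

# Hypothetical partial-adjustment data: each period the exchange rate closes
# 10 percent of the gap to a constant fundamental, with no noise.
lam = 0.1
z_demo = [0.0] * 80
s_path = [1.0]
for _ in range(79):
    s_path.append(s_path[-1] + lam * (z_demo[len(s_path) - 1] - s_path[-1]))
betas = {k: long_horizon_beta(s_path, z_demo, k) for k in (1, 4, 16)}
```

In this example the slope at horizon k equals 1 - 0.9^k, so it rises with the horizon even though the underlying adjustment speed is constant.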

In general, long-horizon regressions have been used extensively in the literature, but with mixed success (see Kilian, 1999). One reason may be that previous research has focused on linear models. In fact, in a linear world, it can be argued that there is no rationale for conducting long-horizon forecast tests. The problem is that under linearity k-step ahead forecasts are obtained by linear extrapolation from 1-step ahead forecasts. Thus, by construction there cannot be any gain in power at higher horizons (see Berkowitz and Giorgianni, 1999; Kilian, 1999; Berben and van Dijk, 1998). However, the mounting evidence of nonlinear exchange rate dynamics provides a new rationale for the use of long-horizon regression tests.

B. The Information in the Term Structure of the Forward Premia

Clarida and Taylor (1997) argued that the failure of the forward rate optimally to predict the future spot rate did not necessarily imply that forward rates did not contain valuable information for forecasting future spot exchange rates. Clarida and Taylor develop what they term an “agnostic” framework for linking spot rate and forward rate movements without assuming anything at all specific about risk premia or expectations formation except that departures from the risk-neutral efficient markets hypothesis (RNEMH) drive at most a stationary wedge between forward and expected future spot rates. This is sufficient to establish the existence of a linear vector equilibrium correction model (VECM) for spot and forward exchange rates. Using this framework, Clarida and Taylor are able to extract sufficient information from the term structure of forward premia to outperform the random walk forecast—and a range of alternative forecasts—for several exchange rates in out-of-sample forecasting. Indeed, at the one-year forecasting horizon, their improvement over the naive random walk is of the order of 40 percent in terms of root mean square errors.

To illustrate the rationale behind a VECM for spot and forward rates, let $s_t$ and $f_t^{h(k)}$ be, respectively, the spot exchange rate and the $h(k)$-period forward exchange rate, each at time $t$. It is now well documented that nominal exchange rates between the currencies of the major industrialized economies are well described by unit root processes. We can therefore write the spot exchange rate as the sum of two components:

$$s_t = m_t + q_t, \tag{10}$$

where $m_t$ is a unit-root process evolving as a random walk with drift, and $q_t$ is a stationary process having mean zero and a finite variance (Beveridge and Nelson, 1981). If agents are risk-neutral and the market is efficient in the sense that exchange rates fully reflect all information in a given information set $\Omega_t$ (so that, in effect, the market conforms to the rational expectations hypothesis), then the forward exchange rate $f_t^{h(k)}$ should predict the $h(k)$-period ahead future value of the spot exchange rate optimally given $\Omega_t$. This is the essence of the RNEMH. There now exists a large literature rejecting the RNEMH, although it is unclear whether rejection is due to a failure of the assumptions of risk neutrality or of rational expectations or of both (e.g., Sarno and Taylor, 2002).

We may in general define departures from the RNEMH, due either to the presence of risk premia or to a failure of rational expectations, or both, as follows:

$$\gamma_t \equiv f_t^{h(k)} - E(s_{t+h(k)} \mid \Omega_t), \tag{11}$$

where $E(\cdot \mid \Omega_t)$ denotes the mathematical expectation conditional on $\Omega_t$. From (10) and (11) we can obtain:

$$f_t^{h(k)} = \gamma_t + h(k)\theta + E(q_{t+h(k)} \mid \Omega_t) + m_t, \tag{12}$$

where θ is the drift of the random walk process mt. Subtracting (10) from (12), we obtain an expression for the forward premium at time t:

$$f_t^{h(k)} - s_t = \gamma_t + h(k)\theta + E(q_{t+h(k)} - q_t \mid \Omega_t). \tag{13}$$
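The step from (10) and (11) to (12) can be spelled out as follows, using only the stated assumption that $m_t$ is a random walk with drift $\theta$ and the additional assumption that $m_t$ belongs to the information set $\Omega_t$:

```latex
\begin{align*}
E(s_{t+h(k)} \mid \Omega_t)
  &= E(m_{t+h(k)} \mid \Omega_t) + E(q_{t+h(k)} \mid \Omega_t)
  && \text{by (10)} \\
  &= m_t + h(k)\theta + E(q_{t+h(k)} \mid \Omega_t)
  && \text{random walk with drift } \theta \\
f_t^{h(k)}
  &= \gamma_t + E(s_{t+h(k)} \mid \Omega_t)
  && \text{by (11)} \\
  &= \gamma_t + h(k)\theta + E(q_{t+h(k)} \mid \Omega_t) + m_t .
\end{align*}
```

Subtracting $s_t = m_t + q_t$ removes the stochastic trend $m_t$, which is why only stationary terms remain in the forward premium.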

Equation (13) says that if the departure from the RNEMH, $\gamma_t$, is stationary, then, given $q_t \sim I(0)$, the forward premium $(f_t^{h(k)} - s_t)$ must also be stationary. This implies that forward and spot rates exhibit a common stochastic trend and are cointegrated with cointegrating vector $[1, -1]$. Moreover, since this is true for any $h(k)$, if we consider the vector of forward rates for $h(1)$ to $h(m)$ periods, together with the current spot rate, $[s_t, f_t^{h(1)}, f_t^{h(2)}, f_t^{h(3)}, \ldots, f_t^{h(m)}]$, then this must be cointegrated with $m$ unique cointegrating vectors, each given by a row of the matrix $[-\iota, I_m]'$, where $I_m$ is an $m$-dimensional identity matrix and $\iota$ is an $m$-dimensional column vector of ones. Further, by the Granger Representation Theorem (Engle and Granger, 1987) the same set of forward and spot rates must possess a VECM representation in which the term structure of forward premia plays the part of the equilibrium errors.
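The structure of the cointegrating vectors can be made concrete with a short sketch. Each vector places -1 on the spot rate and 1 on one forward rate, so applying all of them to the stacked vector of rates returns the term structure of forward premia. The function names and the numbers are hypothetical.

```python
def cointegrating_vectors(m):
    """List of m vectors over (s, f1, ..., fm): -1 on the spot rate, 1 on one forward rate."""
    return [[-1.0] + [1.0 if j == i else 0.0 for j in range(m)] for i in range(m)]

def equilibrium_errors(y):
    """Forward premia implied by y = [s, f1, ..., fm]."""
    m = len(y) - 1
    return [sum(b * v for b, v in zip(vec, y)) for vec in cointegrating_vectors(m)]

# Hypothetical log spot rate and three forward rates.
premia = equilibrium_errors([1.0, 1.5, 2.0, 3.0])
```

These forward premia are the equilibrium errors that enter the error correction terms of the VECM.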

C. Allowing for Nonlinear Dynamics in a Spot-Forward VECM

Alongside the work on exchange rate forecasting, another strand of the literature has developed in which increasingly strong evidence of nonlinearities of one sort or another in exchange rate movements has been reported. One element of this, dating at least to Booth and Glassman (1987), has been the mounting evidence that the conditional distribution of nominal exchange rate changes is well described by a mixture of normal distributions and that, consequently, a Markov switching model may be a logical characterization of exchange rate behavior (e.g., see Engel and Hamilton, 1990; LeBaron, 1992; Engel, 1994; Engel and Hakkio, 1996; Engel and Kim, 1999). However, although Markov-switching models fit nominal exchange rate data very well, in general they do not produce superior forecasts to a random walk or the forward rate on the basis of conventional forecasting criteria (e.g., Engel 1994). An exception in this context is the study by Engel and Hamilton (1990), who apply the Markov-switching model developed by Hamilton (1988, 1989) to dollar exchange rate data and show that the model generates better forecasts than a random walk. In the light of the subsequent literature, however, these forecasting results appear to be somewhat fragile. Overall, in fact, the literature on nonlinear modeling of exchange rates has produced models that fit satisfactorily and forecast well in sample but that in general fail to beat simple random walk models or linear specifications in out-of-sample forecasting (e.g., see Diebold and Nason, 1990; Engel, 1994; Meese and Rose, 1990, 1991).

Clarida, Sarno, Taylor, and Valente (CSTV) (2003) investigate whether allowing for nonlinearities in the underlying data-generating process for the term structure yields superior exchange rate forecasts. This is done through estimating a fairly general three-regime Markov-switching vector equilibrium correction model (MS-VECM) for spot rates and the term structure of forward rates which is essentially based on an extension of Markovian regime shifts to a nonstationary framework, for which the underlying econometric theory has recently been developed (e.g., Krolzig, 1997). The model proposed by CSTV allows regime shifts in both the intercept and the variance-covariance matrix and is governed by three different regimes:

$$\Delta y_t = \nu(z_t) + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \omega_t, \tag{14}$$

where $\Pi = \alpha\beta'$, $\omega_t \sim NIID(0, \Sigma(z_t))$ and $z_t = 1, 2, 3$. This model, termed Markov-Switching-Intercept-Heteroskedastic-VECM or MSIH-VECM, is estimated by CSTV by maximum likelihood (Dempster, Laird and Rubin, 1977), using dollar rates for each of France, Germany, Japan and the UK, over the sample 1979–1995. With few exceptions, the estimation yields fairly plausible estimates of the coefficients for the VECMs estimated.

For each country we find that three regimes are appropriate in describing the data, and that in each case the three regimes are driven mainly by the joint variability of spot and forward exchange rates. Shifts from one regime to another appear to be due largely to shifts in the variance of the term structure equilibrium. On the other hand, shifts in the intercept terms are found to be relatively smaller in magnitude, albeit highly statistically significant. This appears to be in line with the extensive empirical literature investigating the time-varying nature of exchange rate risk premia. One tentative interpretation of our MSIH-VECM is, in fact, in terms of shifts in the mean and variance of foreign exchange returns consistent with deviations from the equilibrium levels implied by conventional macroeconomic fundamentals that may be caused, for example, by “peso problems” or by other kinds of departures from the standard efficient markets hypothesis (see Engel and Hamilton, 1990).

D. Empirical Evidence on the Importance of Nonlinear Dynamics in Out-of-Sample Exchange Rate Forecasting

In order to assess the usefulness of the nonlinear VECM characterization of the term structure, CSTV (2003) construct dynamic out-of-sample forecasts of the spot rate using the MSIH(3)-VECM(1) described in the previous section. In particular, the forecasting exercises were performed on the period January 1996–December 1998 with forecast horizons up to 52 weeks ahead. The out-of-sample forecasts for a given horizon j=1,…,52 are constructed according to a recursive procedure, conditional only upon information up to the date of the forecast and with successive re-estimation as the date on which forecasts are conditioned moves through the data set.7

Forecast accuracy is evaluated using absolute and square error criteria, namely the mean absolute error (MAE) and the root mean square error (RMSE). The forecasts produced by the MSIH-VECM are compared to the forecasts generated by a simple random walk benchmark as well as the forecasts generated by the linear term-structure VECM originally proposed by Clarida and Taylor (1997). Further, in order to assess the accuracy of forecasts derived from two different models, the Diebold and Mariano (1995) test is employed:

$$DM = \frac{\bar{d}}{\sqrt{2\pi \hat{f}(0)/T}}, \tag{15}$$

where $\bar{d}$ is an average (over T observations) of a general loss differential function and $\hat{f}(0)$ is a consistent estimate of the spectral density of the loss differential function at frequency zero. Diebold and Mariano show that the DM statistic is distributed as standard normal under the null hypothesis of equal forecast accuracy. Consistent with a large literature (see, inter alia, Mark, 1995) the loss differential function considered is the difference between the (absolute and square) forecast errors.
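The DM statistic in equation (15) can be sketched in a few lines. The version below is only illustrative: it assumes that the term 2πf(0) is estimated by the long-run variance of the loss differential with a rectangular lag window (one common choice, with truncation lag h–1 for h-step-ahead forecasts), which is not necessarily the estimator used by CSTV. The function name, arguments and the forecast errors in the usage lines are hypothetical.

```python
import math


def diebold_mariano(errors_a, errors_b, power=2, lags=0):
    """DM statistic and two-sided p-value for equal accuracy of two forecasts.

    power=1 uses absolute-error loss, power=2 squared-error loss.
    lags is the truncation lag of the long-run variance estimate.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error series must have the same length")
    # Loss differential d_t = L(e_a) - L(e_b)
    d = [abs(a) ** power - abs(b) ** power for a, b in zip(errors_a, errors_b)]
    T = len(d)
    d_bar = sum(d) / T

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, T)) / T

    # Estimate of 2*pi*f(0): autocovariances summed with a rectangular window
    lrv = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, lags + 1))
    if lrv <= 0.0:
        raise ValueError("non-positive long-run variance estimate")
    dm = d_bar / math.sqrt(lrv / T)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))  # standard normal, two-sided
    return dm, p_value


# Hypothetical one-step-ahead forecast errors: benchmark (a) versus model (b).
benchmark_errors = [0.5, -0.4, 0.6, -0.5, 0.7, -0.3, 0.5, -0.6]
model_errors = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 0.1, -0.2]
dm_stat, dm_p = diebold_mariano(benchmark_errors, model_errors, power=2)
```

A positive statistic indicates that the first series of errors implies larger losses than the second; with overlapping multi-step forecasts, `lags` should be set above zero.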

The forecast accuracy results for the dollar-franc, dollar-mark, dollar-sterling and dollar-yen systems, using the MAE and RMSE criteria, provide evidence in favor of the predictive superiority of the MSIH-VECM models against the naive random walk and, to a lesser extent, against linear VECM models. The MSIH-VECM models give substantially more accurate forecasts than a random walk. At the four-week horizon, the models achieve average improvements ranging from 28 to 38 percent across currencies using the MAE, and from 27 to 31 percent using the RMSE. At the 52-week horizon, average improvements range from 8 to 70 percent using the MAE, and from 5 to 68 percent using the RMSE, with a maximum reduction of 70 percent in the case of the dollar-yen rate using the MAE. The statistical significance of these results is confirmed by the DM test.8

These results extend the findings of Clarida and Taylor (1997) who, using a linear VECM framework for the term structure of forward foreign exchange premia, were able to provide out-of-sample forecasts of spot exchange rates which were superior to alternative conventional forecasting methods. By explicitly incorporating nonlinearity into the modeling framework, the MS-VECM improves upon the Clarida-Taylor results. In particular, the gains obtained relative to the linear VECMs, on average across currencies, range from 1 to 10 percent at the four-week horizon and from 10 to 38 percent at the 52-week horizon using the MAE. Using the RMSE, the gains range from 1 to 7 percent at the four-week horizon and from 10 to 38 percent at the 52-week horizon. Therefore, the gain from using an MSIH-VECM rather than a linear VECM is relatively small at short horizons, albeit generally statistically significant; however, this gain increases with the forecast horizon and becomes substantial at the 52-week horizon.

IV. Further Issues in Evaluating Nonlinear Forecasting Exchange Rate Models

A. Density Forecast Evaluation

A large body of literature in econometrics and applied economics has focused on evaluating the forecast accuracy of economic models (e.g., see the survey of Diebold and Lopez, 1996, and the references therein). Although this literature has traditionally focused on accuracy evaluations based on point forecasts, several authors have recently emphasized the importance of evaluating the forecasting ability of economic models on the basis of density, as opposed to point, forecasting performance (see, inter alia, the survey by Tay and Wallis, 2000, and the references therein).

In a decision-theoretical context, the need to consider the predictive density of a time series—as opposed to considering only its conditional mean and variance—seems fairly accepted in the light of the argument that economic agents may not have loss functions that depend symmetrically on the realizations of future values of potentially non-Gaussian variables. In this case, agents are interested in knowing not only the mean and variance of the variables in question, but their full predictive densities. In various contexts in economics and finance—among which the recent boom in financial risk management represents an obvious case (Diebold, Hahn and Tay, 1999; Berkowitz, 2001)—there is an increasingly strong need to provide and evaluate density forecasts. These issues are particularly important in the context of nonlinear models since these models may provide highly non-normal densities.

Several researchers have recently proposed methods for evaluating density forecasts. For example, Diebold, Gunther and Tay (1998) extend previous work on the probability integral transform and show how it is possible to evaluate a model-based predictive density and to test formally the hypothesis that the predictive density implied by a particular model is equal to the true predictive density.9 Similar ideas have been developed by, inter alia, Anderson, Hall and Titterington (1994), Li (1996), Granger and Pesaran (1999) and Berkowitz (2001). In general, this line of research has produced several methods either to measure the closeness of two density functions or to test the hypothesis that the predictive density generated by a particular model is equal to the true predictive density.

Sarno and Valente (2003) recently proposed a test statistic for the null hypothesis that two competing models have equal density forecast accuracy, in an attempt to provide a more accurate description of the uncertainty surrounding forecasts than traditional methods based on point forecasting. This test is, in the context of density forecasting, the analogue of the test statistic developed by Diebold and Mariano (1995) for testing the null hypothesis that two models have equal forecast accuracy in the context of point forecasting. Let f(y), g1(y) and g2(y) be three probability density functions with distribution functions F, G1 and G2 respectively. Let f(y) be the probability density function of the variable yt over the period t = 1,…,T, whereas g1(y) and g2(y) are the probability density functions implied by two competing forecasting models, say M1 and M2.

We are interested in testing the null hypothesis of equidistance of the probability densities g1(y) and g2(y) from f(y), that is,

$$H_0: \mathrm{dist}[f(y), g_1(y)] = \mathrm{dist}[f(y), g_2(y)], \tag{16}$$

where the operator dist denotes a generic measure of distance.

A conventional measure of global closeness between two functions is the integrated square difference (ISD) (e.g., see Pagan and Ullah, 1999):

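$$ISD = \int [\phi(x) - \gamma(x)]^2 \, dx, \tag{17}$$

where $\phi(\cdot)$ and $\gamma(\cdot)$ denote probability density functions; $ISD \geq 0$, and $ISD = 0$ only if $\phi(x) = \gamma(x)$. Using equation (17) we can rewrite the null hypothesis $H_0$ as

$$H_0: \int [f(y) - g_1(y)]^2 \, dy = \int [f(y) - g_2(y)]^2 \, dy \;\Longleftrightarrow\; ISD_1 - ISD_2 = 0, \tag{18}$$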

where the null hypothesis of equal density forecast accuracy of models M1 and M2 is written as the null hypothesis of equality of two integrated square differences or, equivalently, as the null hypothesis that the difference between two integrated square differences is zero.

Consider two series of forecasts, $\{\hat{y}_{1t}\}_{t=1}^{T_1}$ and $\{\hat{y}_{2t}\}_{t=1}^{T_2}$, obtained from the two competing models M1 and M2. Let $g_1(y)$ and $g_2(y)$ be the probability density functions of the two forecast series $\{\hat{y}_{1t}\}_{t=1}^{T_1}$ and $\{\hat{y}_{2t}\}_{t=1}^{T_2}$ respectively.10 With observations $\{y_t\}_{t=1}^{T}$, $\{\hat{y}_{1t}\}_{t=1}^{T}$ and $\{\hat{y}_{2t}\}_{t=1}^{T}$ we can estimate the unknown functions $f(y)$, $g_1(y)$ and $g_2(y)$ using kernel estimation, obtaining:

$$\hat{f}(y) = \frac{1}{Th} \sum_{i=1}^{T} K\left(\frac{y_i - y}{h}\right) \tag{19}$$

$$\hat{g}_1(y) = \frac{1}{Th} \sum_{i=1}^{T} K\left(\frac{\hat{y}_{1i} - y}{h}\right) \tag{20}$$

$$\hat{g}_2(y) = \frac{1}{Th} \sum_{i=1}^{T} K\left(\frac{\hat{y}_{2i} - y}{h}\right) \tag{21}$$

where $K(\cdot)$ is the kernel function and $h$ is the smoothing parameter.11 Using (19)–(21) we can then obtain a consistent estimate of the integrated square differences $ISD_1$ and $ISD_2$, say $\widehat{ISD}_1$ and $\widehat{ISD}_2$. Define $d = \widehat{ISD}_1 - \widehat{ISD}_2$ as the estimated relative distance of the probability density functions $g_1(y)$ and $g_2(y)$ from $f(y)$. In order to test for the statistical significance of $d$, the next step is to calculate a confidence interval for it.
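The kernel estimates and the integrated square differences can be computed as in the sketch below. It is illustrative only and rests on assumptions that are not taken from the text: a Gaussian kernel, a rule-of-thumb bandwidth of 1.06 times the sample standard deviation times T to the power –1/5, and trapezoid-rule integration on a fixed grid. The data are simulated, and all function and variable names are hypothetical.

```python
import math
import random
import statistics


def kernel_density(sample, h):
    """Gaussian-kernel estimate: y -> (1 / (T h)) * sum_i K((y_i - y) / h)."""
    scale = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def density(y):
        return scale * sum(math.exp(-0.5 * ((yi - y) / h) ** 2) for yi in sample)

    return density


def integrated_square_difference(phi, gamma, grid):
    """Trapezoid-rule approximation to the integral of [phi(x) - gamma(x)]^2."""
    sq = [(phi(x) - gamma(x)) ** 2 for x in grid]
    return sum(0.5 * (sq[i] + sq[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))


# Simulated data: realizations plus forecasts from two made-up models.
rng = random.Random(0)
T = 200
actual = [rng.gauss(0.0, 1.0) for _ in range(T)]
forecast_1 = [rng.gauss(0.0, 0.2) for _ in range(T)]  # too concentrated
forecast_2 = [rng.gauss(0.0, 1.0) for _ in range(T)]  # spread similar to the data

h = 1.06 * statistics.stdev(actual) * T ** (-0.2)  # rule-of-thumb bandwidth (assumption)
pooled = actual + forecast_1 + forecast_2
lo, hi = min(pooled) - 4.0 * h, max(pooled) + 4.0 * h
grid = [lo + (hi - lo) * i / 200 for i in range(201)]

f_hat = kernel_density(actual, h)
g1_hat = kernel_density(forecast_1, h)
g2_hat = kernel_density(forecast_2, h)

isd_1 = integrated_square_difference(f_hat, g1_hat, grid)
isd_2 = integrated_square_difference(f_hat, g2_hat, grid)
d = isd_1 - isd_2  # positive when the second model's density is closer to the data
```

In this simulated example the first set of forecasts is far too concentrated, so its estimated density lies further from that of the realizations and d is positive.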

In the spirit of the analysis of Hall (1992), define $\{y_t^j\}_{t=1}^{T}$, $\{\hat{y}_{1t}^j\}_{t=1}^{T}$ and $\{\hat{y}_{2t}^j\}_{t=1}^{T}$ as the j-th resample of the original data $\{y_t\}_{t=1}^{T}$, $\{\hat{y}_{1t}\}_{t=1}^{T}$ and $\{\hat{y}_{2t}\}_{t=1}^{T}$, drawn randomly with replacement. From these resamples it is possible to obtain consistent bootstrap estimates of the density functions $\hat{f}^j(y)$, $\hat{g}_1^j(y)$ and $\hat{g}_2^j(y)$ and, consequently, of $d_j = \widehat{ISD}_1^j - \widehat{ISD}_2^j$.12

Consider a sample path $\{d_j\}_{j=1}^{B}$, where B is the number of bootstrap replications. Under general conditions (see Kendall and Stuart, 1976, Ch. 11), we have

$$\sqrt{B}(\bar{d} - \mu) \xrightarrow{d} N(0, \sigma^2), \tag{22}$$

where

$$\bar{d} = \frac{1}{B} \sum_{j=1}^{B} d_j = \frac{1}{B} \sum_{j=1}^{B} \left(\widehat{ISD}_1^j - \widehat{ISD}_2^j\right) \tag{23}$$

is the average difference of the estimated relative distances over B bootstrap replications. Because in large samples the average difference $\bar{d}$ is approximately normally distributed with mean $\mu$ and variance $\sigma^2/B$, the large-sample statistic for testing the null hypothesis that models M1 and M2 have equal density forecast accuracy is:

$$\eta = \frac{\bar{d}}{\sqrt{\hat{\sigma}^2/B}} \xrightarrow{d} N(0, 1), \tag{24}$$

where $\hat{\sigma}^2$ is a consistent estimate of $\sigma^2$.13
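The bootstrap procedure leading to the η statistic can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of Sarno and Valente (2003): a Gaussian kernel with a rule-of-thumb bandwidth is used, the three series are resampled jointly with the same indices, and σ² is estimated by the sample variance of the bootstrap differences. All names and the simulated data are hypothetical.

```python
import math
import random
import statistics


def kde_on_grid(sample, h, grid):
    """Gaussian-kernel density estimate evaluated at every grid point."""
    scale = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [scale * sum(math.exp(-0.5 * ((yi - y) / h) ** 2) for yi in sample)
            for y in grid]


def isd_on_grid(p, q, step):
    """Trapezoid rule on a uniform grid for the integral of (p - q)^2."""
    sq = [(a - b) ** 2 for a, b in zip(p, q)]
    return step * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


def eta_test(actual, fc1, fc2, B=100, n_grid=101, seed=0):
    """Return (eta, two-sided p-value) for equal density forecast accuracy."""
    if not (len(actual) == len(fc1) == len(fc2)):
        raise ValueError("all three series must have the same length")
    rng = random.Random(seed)
    T = len(actual)
    h = 1.06 * statistics.stdev(actual) * T ** (-0.2)  # bandwidth (assumption)
    pooled = list(actual) + list(fc1) + list(fc2)
    lo, hi = min(pooled) - 3.0 * h, max(pooled) + 3.0 * h
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + step * i for i in range(n_grid)]
    d = []
    for _ in range(B):
        idx = [rng.randrange(T) for _ in range(T)]  # draw with replacement
        f = kde_on_grid([actual[i] for i in idx], h, grid)
        g1 = kde_on_grid([fc1[i] for i in idx], h, grid)
        g2 = kde_on_grid([fc2[i] for i in idx], h, grid)
        d.append(isd_on_grid(f, g1, step) - isd_on_grid(f, g2, step))
    d_bar = statistics.fmean(d)
    sigma2 = statistics.variance(d)  # variance estimate (assumption)
    eta = d_bar / math.sqrt(sigma2 / B)
    return eta, math.erfc(abs(eta) / math.sqrt(2.0))


# Simulated example: the first model's forecasts are far too concentrated.
data_rng = random.Random(1)
y = [data_rng.gauss(0.0, 1.0) for _ in range(80)]
y_hat_1 = [data_rng.gauss(0.0, 0.2) for _ in range(80)]
y_hat_2 = [data_rng.gauss(0.0, 1.0) for _ in range(80)]
eta, p_value = eta_test(y, y_hat_1, y_hat_2, B=30, n_grid=61)
```

A positive and significant η indicates that the density implied by the second model is closer to the density of the realizations than the density implied by the first.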

This test statistic displays several attractive properties in that it has a known limiting standard normal distribution and—unlike available testing procedures—does not involve testing a joint hypothesis. The test is easy to implement in practice, as illustrated below in an application to exchange rate forecasting. Also, the test is found to have satisfactory empirical size and power properties in a simulation exercise (see Sarno and Valente, 2003). Nevertheless, this test circumvents the problem of testing a joint hypothesis by relying on somewhat stronger assumptions than other methods proposed in the literature that are based on the probability integral transform. Relaxation of these assumptions is an immediate avenue for future research. In particular, the assumption of time-invariance of the densities over the forecast horizon could be relaxed by using recursive kernel estimation, which would allow us to test the null hypothesis of equal density forecast accuracy on time-varying densities period by period (Yamato, 1971; Nobel, Morvai, and Kulkarni, 1998).

B. An Illustrative Application to Exchange Rate Forecasting

We shall illustrate the practical use of the η test statistic with an application to out-of-sample exchange rate forecasting and financial risk management. In recent years, trading accounts at large financial institutions have grown dramatically and become increasingly complex. Partly in response to this trend, major trading institutions have developed large-scale risk measurement models designed to manage risk. These models generally employ the Value-at-Risk (VaR) methodology. VaR may be defined as the expected maximum loss over a target horizon within a given confidence interval (Jorion, 1997). More formally, VaR is an interval forecast, typically a one-sided 95 or 99 percent interval of the distribution of expected wealth or returns. Users of the VaR methodology generally assume that expected returns are normally or t-distributed. However, this assumption contrasts with the large amount of empirical evidence suggesting that the distribution of exchange rate returns is not normal. Point forecast analysis and testing procedures based upon it do not take these features into account, so that VaR analysis often relies on dubious parametric distributional assumptions. In this example we analyze the out-of-sample forecasting performance of two empirical models of exchange rates and we investigate the implications of these forecasts for a risk manager who has to quantify the risk associated with a simple internationally diversified portfolio over a one-week horizon.

As competing models we consider two multivariate models of the nominal exchange rate. The first model (say Model M1) is the linear vector error correction model (VECM) based on the spot-forward relationship suggested by Clarida and Taylor (1997). The second model (say Model M2) is a nonlinear generalization of Model M1 which allows the intercept and the variance-covariance matrix of the VECM to be regime-shifting, as proposed by CSTV (2003).14

Specifically, M1 can be written as follows:

$$\Delta y_t = \nu + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \Pi y_{t-1} + u_t, \tag{25}$$

where $y_t = [s_t, f_t]$; $\Gamma_i = -\sum_{j=i+1}^{p} \Pi_j$ are matrices of parameters; $\Pi = \sum_{i=1}^{p} \Pi_i - I = \alpha\beta'$ is the long-run impact matrix whose rank $r$ determines the number of cointegrating vectors (e.g., Johansen, 1995); and the vector of disturbances $u_t \sim NIID(0, \Sigma)$. On the other hand, M2 is a Markov-switching VECM which allows the intercept and the variance-covariance matrix to be regime-shifting:

$$\Delta y_t = \nu(z_t) + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \omega_t, \tag{26}$$

where $y_t = [s_t, f_t]$, $\omega_t \sim NIID(0, \Sigma(z_t))$, and the regime variable takes the values $z_t = 1, 2$. In this application, we set the number of regimes equal to two for convenience.
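To make the structure of the regime-switching model concrete, the sketch below simulates a two-regime process of the form (26) in its simplest version. It is not the estimated model: it assumes no lagged differences, a cointegrating vector [–1, 1] (so that the equilibrium error is the forward premium), uncorrelated shocks within each regime, and entirely hypothetical parameter values. Regimes are labeled 0 and 1 in the code.

```python
import random


def simulate_ms_vecm(n, seed=0):
    """Simulate y_t = [s_t, f_t] from a two-regime MSIH-VECM without lagged differences."""
    rng = random.Random(seed)
    alpha = (0.1, -0.2)   # adjustment coefficients (hypothetical)
    beta = (-1.0, 1.0)    # cointegrating vector: forward premium f_t - s_t
    nu = {0: (0.000, 0.000), 1: (0.002, -0.002)}  # regime-dependent intercepts
    sd = {0: (0.005, 0.005), 1: (0.020, 0.020)}   # regime-dependent shock std. devs.
    stay = {0: 0.95, 1: 0.90}  # probability of remaining in the current regime
    s, f, z = 0.0, 0.0, 0
    path, regimes = [(s, f)], [z]
    for _ in range(n):
        if rng.random() > stay[z]:
            z = 1 - z  # first-order Markov switch between the two regimes
        ec = beta[0] * s + beta[1] * f  # equilibrium error beta' y_{t-1}
        ds = nu[z][0] + alpha[0] * ec + rng.gauss(0.0, sd[z][0])
        df = nu[z][1] + alpha[1] * ec + rng.gauss(0.0, sd[z][1])
        s, f = s + ds, f + df
        path.append((s, f))
        regimes.append(z)
    return path, regimes


path, regimes = simulate_ms_vecm(500)
premium = [f - s for s, f in path]  # stays bounded while s and f wander
```

With these values the spot and forward rates each follow a stochastic trend, while the forward premium reverts to a regime-dependent mean and is more volatile in the high-variance regime.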

In order to calculate the η test we first estimated the competing models (25) and (26) using weekly bilateral US dollar exchange rate data (domestic price of the foreign currency) vis-à-vis the Japanese yen and the pound sterling. Time series for bilateral dollar exchange rates and one-month forward rates over the sample period from 1979:1 to 2000:52 were obtained from the Bank for International Settlements. We estimated models (25) and (26) using data from 1979:1 to 1991:52, leaving data from 1992:1 to the end of the sample period for calculating out-of-sample dynamic one-week-ahead forecasts.15 In order to take into account the possibility that the implied one-week-ahead forecast distribution might be time-varying, we split our forecasting period into four non-overlapping equally sized sub-periods: 1992:1–1994:13, 1994:14–1996:27, 1996:28–1998:39, 1998:40–2000:52. One-week-ahead forecasts were constructed recursively, namely conditional only upon information up to the date of the forecast and with successive re-estimation as the date on which forecasts are conditioned moves through the data set.

Inspecting Figures 1 and 2, which show the predictive densities from the two competing models M1 and M2 together with the true predictive density for each exchange rate examined and for different sub-sample periods, a clear result arises. Across sub-periods, the predictive densities produced by the linear VECM (25) are more leptokurtic than the ones obtained from the Markov-switching VECM (26). Simple visual inspection of the graphs in Figures 1 and 2 shows that the distance between the forecast density of the Markov-switching VECM (26) and the true predictive density is shorter than the distance between the predictive density of the linear VECM (25) and the true predictive density. This visual evidence is, in fact, supported by the results of the η test, reported in Table 1. For both exchange rates examined and across subperiods, the η test, calculated using 100 bootstrap replications, is positive and statistically significant, strongly rejecting the null hypothesis of equidistance of the competing predictive densities from the true predictive density. In turn, these results imply that the Markov-switching model (Model M2) is superior to the linear model (Model M1) in terms of out-of-sample forecasting performance, suggesting that nonlinearity plays some role in forecasting exchange rates.

Figure 1. Kernel Density Estimation: Japan

Citation: IMF Working Papers 2003, 111; 10.5089/9781451853490.001.A001

Figure 2. Kernel Density Estimation: United Kingdom


Table 1. Exchange Rate Forecasting Results

Notes: Models M1 and M2 denote the linear VECM based on the spot-forward relationship (25) and the Markov-switching VECM (26) respectively. η test is the test statistic for the null hypothesis that Models M1 and M2 have equal density forecast accuracy, using 100 bootstrap replications. Figures in brackets denote p-values; p-values equal to zero up to the 8th decimal point are recorded as [0].

C. Implications for Risk Management and VaR Analysis

We now further investigate the practical implications of the forecasting results reported in the previous sub-section in the context of a simple risk management problem. Given the predictions of the two competing models M1 and M2, assume that a risk manager wishes to quantify the one-week-ahead risk associated with an internationally diversified portfolio selected at time t. For simplicity, assume that the portfolio comprises only one domestic bond and one foreign bond. These domestic and foreign bonds are identical in all respects except for the currency of denomination and yield the continuously compounded returns r and r* respectively, expressed in local currency. Given initial wealth Wt = 1 and defining ωt as the predetermined allocation to the foreign bond at time t, the end-of-week wealth is

$$W_{t+1} = (1 - \omega_t) \exp(r) + \omega_t \exp(r^* + \Delta s_{t+1}), \tag{27}$$

where Δst+1 denotes the weekly change in the nominal exchange rate.16 Given that the two bonds are riskless in local currency, the only source of uncertainty to be taken into consideration by the risk manager is the future nominal exchange rate, Δst+1. Models M1 and M2 provide the one-week-ahead density forecasts of Δst+1, which, in turn, determine implied densities for the end-of-week wealth. On the basis of these densities the risk manager calculates the VaR of the portfolio as the dollar loss relative to the mean:

$$VaR = E_t(W_{t+1}) - W_{t+1}^*, \tag{28}$$

where $E_t(W_{t+1})$ is the mean of the end-of-week wealth distribution and $W_{t+1}^*$ is the lowest portfolio value at the given confidence level c. In our example the VaR is calculated according to equation (28) at the 99 percent confidence level for losses (i.e. c = 0.99), for both models M1 and M2.
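Equations (27) and (28) translate directly into a short calculation once draws from a density forecast of the exchange rate change are available. The sketch below is illustrative: the two sets of draws (a normal density and a two-component normal mixture with a similar variance but fatter tails), the bond returns and the portfolio weight are all hypothetical, and the lowest portfolio value at confidence level c is taken to be the empirical (1–c) quantile of simulated wealth.

```python
import math
import random


def end_of_week_wealth(ds, r, r_star, omega):
    """Equation (27) with initial wealth normalized to one."""
    return (1.0 - omega) * math.exp(r) + omega * math.exp(r_star + ds)


def value_at_risk(ds_draws, r, r_star, omega, c=0.99):
    """Equation (28): mean end-of-week wealth minus its (1 - c) empirical quantile."""
    wealth = sorted(end_of_week_wealth(ds, r, r_star, omega) for ds in ds_draws)
    n = len(wealth)
    # Index of the (1 - c) quantile; rounding guards against floating-point noise
    k = max(math.ceil(round((1.0 - c) * n, 9)) - 1, 0)
    return sum(wealth) / n - wealth[k]


# Hypothetical density forecasts of the weekly exchange rate change.
rng = random.Random(0)
n_draws = 20000
normal_draws = [rng.gauss(0.0, 0.015) for _ in range(n_draws)]
mixture_draws = [rng.gauss(0.0, 0.04 if rng.random() < 0.1 else 0.008)
                 for _ in range(n_draws)]

var_normal = value_at_risk(normal_draws, r=0.001, r_star=0.001, omega=0.5)
var_mixture = value_at_risk(mixture_draws, r=0.001, r_star=0.001, omega=0.5)
```

In this made-up example the fat-tailed mixture implies a noticeably larger 99 percent VaR than the normal density, even though the two densities have similar variances.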

The results, reported in Table 2, suggest that there is a substantial difference between the estimated values of the VaRs. For both exchange rates and across different subperiods, the linear VECM (25) generates VaRs that are at least ten times smaller than the ones implied by the Markov-switching VECM (26). In turn, this implies that the violations from the VaRs experienced over the sub-sample periods are larger under the linear VECM (25) than under the Markov-switching VECM (26). In fact, the violation rate, calculated as the ratio of the number of violations that occurred over a subsample period to the total number of observations in the subsample period, is different for the two competing models, denoting a general tendency for the linear VECM (25) to produce VaRs that underestimate the probability of large losses. On the other hand, the Markov-switching VECM (26) produces VaRs that are generally in line with the theoretical 99 percent confidence level.

This evidence tells us nothing, however, about the statistical significance of these differences. One way to shed light on the statistical significance of the differences is by comparing the targeted violation rate (i.e., 1–0.99 = 0.01) with the observed violation rate and by testing formally for their equality. In general, conventional testing procedures applied to testing for the statistical significance of the coverage rate (i.e. the difference between the target violation rate and the observed violation rate) are known to have low power (Kupiec, 1995; Christoffersen, 1998). Thus, in order to test for the statistical significance of the coverage rate, we calculated the likelihood ratio tests proposed by Christoffersen (1998), who first showed that, for violations to be nonsystematic, they should not only occur one percent of the time but also be i.i.d. over time. These test statistics (LR1, LR2 and LR3)17, reported in Table 2, confirm that the departures from VaR of the linear VECM (25) are strongly statistically significant. Indeed, all the likelihood ratio tests LR1, LR2 and LR3 exhibit very low p-values. On the other hand, the Markov-switching VECM (26) produces out-of-sample forecasts that imply VaRs associated with departures that are infrequent (generally close to the theoretical 99 percent confidence level) and often statistically insignificant at conventional significance levels.

Table 2. VaR Estimation and Backtesting

Notes: This table shows backtests of estimated VaR based upon the one-week-ahead forecasts obtained using the linear VECM (25) and the Markov-switching VECM (26) over four subperiods. VaR (mean) is the Value-at-Risk relative to the mean calculated according to equation (28), using the 99 percent confidence level. The number (No.) of violations is calculated as the number of times the realized dollar loss exceeded the estimated VaR over the subsample period. The violation rate is calculated as the ratio of the number of violations to the total number of observations in the subsample period. LR1, LR2 and LR3 are the likelihood ratio tests proposed by Christoffersen (1998): LR1 is a test of unconditional coverage, where the null hypothesis is that the observed violation rate is not different from the target violation rate 1 – c = 1–0.99 = 0.01; LR2 is a test of conditional coverage where the null hypothesis is that the observed violation rate is not different from the target violation rate and the violations are i.i.d.; LR3 is a test of the null hypothesis that the violations are i.i.d. LR1 and LR2 are distributed as χ²(1) and χ²(2) under their respective null hypotheses; LR3 is calculated as the difference between LR2 and LR1 and is distributed as χ²(1) under the null hypothesis.
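The three likelihood ratio statistics described in the notes to Table 2 can be sketched as follows. The code is an illustration under stated assumptions: the input is a 0/1 sequence of VaR violations, LR1 uses all observations while LR3 uses the first-order transition counts, and LR2 is obtained as their sum (which matches the relation LR3 = LR2 – LR1 given in the notes). Function names and the two example sequences are hypothetical.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention that the term is zero when x is zero."""
    return 0.0 if x == 0 else x * math.log(y)


def christoffersen_tests(hits, target=0.01):
    """Return (LR1, LR2, LR3) and their p-values for a 0/1 violation sequence."""
    hits = [int(bool(v)) for v in hits]
    n, n1 = len(hits), sum(hits)
    n0 = n - n1
    pi_hat = n1 / n
    # LR1: unconditional coverage (observed rate versus target rate)
    lr1 = -2.0 * ((_xlogy(n0, 1.0 - target) + _xlogy(n1, target))
                  - (_xlogy(n0, 1.0 - pi_hat) + _xlogy(n1, pi_hat)))
    # Transition counts n_ij: state i at t-1 followed by state j at t
    c = [[0, 0], [0, 0]]
    for prev, cur in zip(hits[:-1], hits[1:]):
        c[prev][cur] += 1
    pi01 = c[0][1] / (c[0][0] + c[0][1]) if (c[0][0] + c[0][1]) else 0.0
    pi11 = c[1][1] / (c[1][0] + c[1][1]) if (c[1][0] + c[1][1]) else 0.0
    pi2 = (c[0][1] + c[1][1]) / (n - 1)
    # LR3: independence (first-order Markov alternative versus i.i.d. violations)
    ll_iid = _xlogy(c[0][0] + c[1][0], 1.0 - pi2) + _xlogy(c[0][1] + c[1][1], pi2)
    ll_markov = (_xlogy(c[0][0], 1.0 - pi01) + _xlogy(c[0][1], pi01)
                 + _xlogy(c[1][0], 1.0 - pi11) + _xlogy(c[1][1], pi11))
    lr3 = -2.0 * (ll_iid - ll_markov)
    lr1, lr3 = max(lr1, 0.0), max(lr3, 0.0)  # remove tiny negative rounding errors
    lr2 = lr1 + lr3  # LR2: conditional coverage
    p1 = math.erfc(math.sqrt(lr1 / 2.0))  # chi-square(1) tail probability
    p2 = math.exp(-lr2 / 2.0)             # chi-square(2) tail probability
    p3 = math.erfc(math.sqrt(lr3 / 2.0))  # chi-square(1) tail probability
    return (lr1, lr2, lr3), (p1, p2, p3)


# Hypothetical backtests: isolated violations versus a cluster of violations.
isolated = [0] * 300
for position in (50, 150, 250):
    isolated[position] = 1
clustered = [0] * 100 + [1] * 10 + [0] * 190

stats_isolated, pvals_isolated = christoffersen_tests(isolated)
stats_clustered, pvals_clustered = christoffersen_tests(clustered)
```

The first sequence has exactly one percent of isolated violations, so none of the tests rejects; the second has both too many violations and strong clustering, so all three statistics are large.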

In this simple application we have shown how density forecasts can help us discriminate among competing exchange rate models. In our example, the linear exchange rate model M1 has produced forecasts that do not capture satisfactorily the higher moments of the predictive distribution of the exchange rate, generating VaRs that underestimate the probability of large losses. By contrast, the Markov-switching model M2, which does better than the linear model at matching the higher moments of the predictive distribution of exchange rates, produced VaRs that are generally in line with the target violation rate of one percent.

V. Concluding Remarks

Exchange rate economics is alive and continues to attract the attention of academics, policymakers, and practitioners. One reason this field receives attention is the large number of puzzles the profession has been unable to resolve. This paper has reviewed a selection of recent papers that have shed some light on various puzzles. The common feature of the literature reviewed here is that it gives special attention to the importance of allowing for nonlinear dynamics in exchange rate behavior. More than simply a sophisticated refinement of the linear econometric techniques more commonly used by researchers, the nonlinear approach to exchange rate modeling and forecasting has produced results that can be viewed as very promising.

With respect to the behavior of the real exchange rate, the theoretical literature focusing on the importance of international trade costs has led researchers to consider nonlinear models of real exchange rates that give a role to the size of the deviation from equilibrium in determining the speed of mean reversion, a feature that cannot be captured within a linear framework. The empirical evidence provided by these studies suggests that major real bilateral dollar exchange rates are well characterized by nonlinearly mean reverting processes over the floating rate period since 1973. These models imply an equilibrium level of the real exchange rate in the neighborhood of which the behavior of the real exchange rate is close to a random walk, becoming increasingly mean reverting with the absolute size of the deviation from equilibrium. In addition, the half lives of shocks to the real exchange rates implied by these models suggest much faster real exchange rate adjustment than typically recorded in the literature, hence shedding some light on Rogoff’s (1996) PPP puzzle.

With respect to the ability of empirical exchange rate models to explain and forecast the nominal exchange rate, the literature is still somewhat in the dark. About 20 years after Meese and Rogoff’s (1983) paper, their findings that empirical exchange rate models are not able to beat a random walk have not been convincingly overturned. Economic fundamentals typically suggested by open-economy macro theory do not appear to contain sufficient information to provide satisfactory out-of-sample forecasts of the exchange rate, especially at short horizons. However, the information contained in the term structure of forward premia appears to be more useful, especially in empirical models that allow for nonlinearities. An analysis of spot and forward exchange rates in a multivariate Markov-switching framework, inspired by encouraging results previously reported in the literature on the presence of nonlinearities (and particularly by the success of Markov-switching models) in the context of exchange rate modeling, produces very satisfactory forecasting results. Indeed, this framework generates forecasts that are strongly superior to the random walk forecasts at a range of forecasting horizons up to 52 weeks ahead, using standard forecasting accuracy criteria and on the basis of standard tests of significance. Moreover, this model also outperforms its linear counterpart in out-of-sample forecasting, although the gain from using a nonlinear model relative to a linear one is rather small at short horizons.

With regard to the evaluation of forecasting models, although the relevant literature has traditionally focused on accuracy evaluations based on point forecasts, several authors have recently emphasized the importance of evaluating the forecast accuracy of economic models on the basis of density, as opposed to point, forecasting performance. Especially when evaluating nonlinear models, which are capable of producing highly nonnormal forecast densities, it would seem appropriate to consider a model’s density forecasting performance. In the last part of this paper, we have presented a recently developed testing procedure designed to discriminate between competing forecasting models in terms of density forecast accuracy. An illustration of the potential use of this framework in the context of exchange rate forecasting suggests that the benefits of using a nonlinear model relative to a linear one lie in the forecasts of the higher moments of the exchange rate distribution. In other words, not only does this nonlinear framework produce better point forecasts of the level of the exchange rate, but it also provides a better measure of the uncertainty surrounding these forecasts. This seems particularly important for risk management and also for the policymaker interested in forecasting the probability of large changes in the exchange rate (depreciations or appreciations). The probability of large exchange rate movements risks being seriously underestimated when relying on linear models.

Overall, we end this survey with some degree of optimism. Nonlinear exchange rate models may not be the solution to every puzzle in exchange rate economics, but they have led us to a stage where the faith in PPP professed by many international economists for many decades shows some scientific support, and where we can provide forecasts of exchange rate movements that are better than a model that simply assumes no change.