Empirical Exchange Rate Models of the Nineties
Are Any Fit to Survive?
Authors: Yin-Wong Cheung (International Monetary Fund) and Menzie David Chinn (International Monetary Fund)


Abstract

We reassess exchange rate prediction using a wider set of models that have been proposed in the last decade. The performance of these models is compared against two reference specifications: purchasing power parity and the sticky-price monetary model. The models are estimated in first-difference and error-correction specifications, and model performance is evaluated at forecast horizons of 1, 4, and 20 quarters, using the mean squared error, direction of change metrics, and the "consistency" test of Cheung and Chinn (1998). Overall, model/specification/currency combinations that work well in one period do not necessarily work well in another period.

I. Introduction

The recent movements in the dollar and the euro have appeared puzzling in the context of standard models. While the dollar may not have been "dazzling," as it was described in the mid-1980s, until recent months it has been characterized as overly "darling."2 And the euro's ability to repeatedly confound predictions has only been highlighted by its recent ascent.

It is against this backdrop that several new models have been developed in the past decade. Some explanations are motivated by findings in the empirical and theoretical literature, such as the correlation between net foreign asset positions and real exchange rates and those based on productivity differences. None of these models, however, have been subjected to rigorous examination of the sort that Meese and Rogoff conducted in their seminal work, the original title of which we have appropriated and amended for this study.3

We believe that a systematic examination of these newer empirical models is long overdue, for a number of reasons. First, although these models have become prominent in policy and financial circles, they have not been subjected to the sort of systematic out-of-sample testing conducted in academic studies. For instance, productivity did not make an appearance in earlier comparative studies, but has come to be viewed as an important determinant of the euro/dollar exchange rate (Owen, 2001; Rosenberg, 2000).4

Second, most of the recent academic treatments of exchange rate forecasting performance rely upon a single model—such as the monetary model—or some other limited set of models of 1970s vintage, such as purchasing power parity or real interest differential models.

Third, the same criteria are often used, neglecting alternative dimensions of model forecast performance. That is, the first- and second-moment metrics, such as mean error and mean squared error, are considered, while other aspects that might be of greater importance are often neglected. We have in mind the direction of change—perhaps more important from a market-timing perspective—and other indicators of forecast attributes.

In this study, we extend the forecast comparison of exchange rate models in several dimensions.

  • Five models are compared against the random walk. Purchasing power parity is included because of its importance in the international finance literature and the fact that the parity condition is commonly used to gauge the degree of exchange rate misalignment. The sticky-price monetary model of Dornbusch and Frankel is the only structural model that has been the subject of previous systematic analyses. The other models include one incorporating productivity differentials, an interest rate parity specification, and a composite specification incorporating a number of channels identified in differing theoretical models.

  • The behavior of the U.S. dollar-based exchange rates of the Canadian dollar, British pound, deutsche mark, and Japanese yen is examined. To ensure that our conclusions are not driven by dollar-specific results, we also examine (but do not report) the results for the corresponding yen-based rates.

  • The models are estimated in two ways: in first-difference and error-correction specifications.

  • Forecasting performance is evaluated at several horizons (1-, 4-, and 20-quarter horizons) and two sample periods (post-Louvre accord (Feb. 1987) and post-1982).

  • We augment the conventional metrics with a direction of change statistic and the “consistency” criterion of Cheung and Chinn (1998).

Before proceeding further, it may prove worthwhile to emphasize why we focus on out-of-sample prediction as our basis for judging the relative merits of the models. It is not that we believe that we can necessarily outforecast the market in real time. Indeed, our forecasting exercises are in the nature of ex post simulations, where in many instances contemporaneous values of the right-hand-side variables are used to predict future exchange rates. Rather, we construe the exercise as a means of protecting against data mining that might occur when relying solely on in-sample inference.5

The exchange rate models considered in the exercise are summarized in Section II. Section III discusses the data, the estimation methods, and the criteria used to compare forecasting performance. The forecasting results are reported in Section IV. Section V concludes.

II. Theoretical Models

The universe of empirical models that have been examined over the floating rate period is enormous. Consequently any evaluation of these models must necessarily be selective. Our criteria require that the models are (1) prominent in the economic and policy literature, (2) readily implementable and replicable, and (3) not previously evaluated in a systematic fashion. We use the random walk model as our benchmark naive model, in line with previous work, but we also select the purchasing power parity and the basic Dornbusch (1976) and Frankel (1979) model as two comparator specifications, as they still provide the fundamental intuition for how flexible exchange rates behave. The purchasing power parity condition examined in this study is given by

s_t = β_0 + p̂_t,  (1)

where s is the log exchange rate, p is the log price level (CPI), and “^” denotes the intercountry difference. Strictly speaking, (1) is the relative purchasing power parity condition. The relative version is examined because price indexes rather than the actual price levels are considered.
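To make the mechanics concrete, the following sketch shows how relative PPP generates a prediction without any estimation: the forecast change in the log exchange rate is simply the change in the intercountry log price differential. The CPI figures are made-up illustrative numbers, not data from the study.

```python
import numpy as np

# Hypothetical quarterly CPI levels for the home and foreign country.
log_cpi_home = np.log(np.array([100.0, 101.0, 102.5, 104.0]))
log_cpi_foreign = np.log(np.array([100.0, 100.5, 101.0, 101.2]))

# "^" denotes the intercountry difference, as in equation (1).
p_hat = log_cpi_home - log_cpi_foreign

# Under relative PPP, the predicted change in s_t equals the change in p_hat.
ppp_forecast_change = np.diff(p_hat)
```

Since home inflation exceeds foreign inflation in every quarter here, the condition predicts s rising in each period.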

The sticky price monetary model can be expressed as follows:

s_t = β_0 + β_1 m̂_t + β_2 ŷ_t + β_3 î_t + β_4 π̂_t + u_t,  (2)

where m is log money, y is log real GDP, i and π are the interest rate and inflation rate, respectively, and u_t is an error term. The characteristics of this model are well known, so we will not devote time to discussing the theory behind the equation. We note, however, that the list of variables included in (2) encompasses those employed in the flexible-price version of the monetary model, as well as the micro-based general equilibrium models of Stockman (1980) and Lucas (1982). In addition, two observations are in order. First, the sticky-price model can be interpreted as an extension of equation (1), with the price variables replaced by macro variables that capture money demand and overshooting effects. Second, we do not impose coefficient restrictions in equation (2) because theory gives us little guidance regarding the exact values of all the parameters.

Next, we assess models in the Balassa-Samuelson vein, in that they accord a central role to productivity differentials in explaining movements in real, and hence also nominal, exchange rates. Real versions of the model can be traced to De Gregorio and Wolf (1994), while nominal versions include Clements and Frenkel (1980) and Chinn (1997). Such models drop the purchasing power parity assumption for broad price indices and allow the real exchange rate to depend upon the relative price of nontradables, itself a function of productivity (z) differentials. A generic productivity differential exchange rate equation is

s_t = β_0 + β_1 m̂_t + β_2 ŷ_t + β_3 î_t + β_5 ẑ_t + u_t.  (3)

Although equations (2) and (3) bear a superficial resemblance, the two expressions embody quite different economic and statistical implications. The central difference is that (2) assumes PPP holds in the long run, while the productivity-based model makes no such presumption. In fact, the nominal exchange rate can drift arbitrarily far away from PPP, although in this model its path is determined by productivity differentials.

The fourth model is a composite model that incorporates a number of familiar relationships. A typical specification is:

s_t = β_0 + p̂_t + β_5 ω̂_t + β_6 r̂_t + β_7 ĝdebt_t + β_8 tot_t + β_9 nfa_t + u_t,  (4)

where ω is the relative price of nontradables, r is the real interest rate, gdebt is the government debt-to-GDP ratio, tot is the log terms of trade, and nfa is the net foreign asset position. Note that we impose a unitary coefficient on the intercountry log price level p̂, so that (4) could be re-expressed as determining the real exchange rate.

Although this particular specification closely resembles the behavioral equilibrium exchange rate (BEER) model of Clark and MacDonald (1999), it also shares attributes with the NATREX model of Stein (1999) and the real equilibrium exchange rate model of Edwards (1989), as well as a number of other approaches. Consequently, we will henceforth refer to this specification as the “composite” model. Again, relative to (1), the composite model incorporates the Balassa-Samuelson effect (via ω), the overshooting effect (r), and the portfolio balance effect (gdebt, nfa).6

Models based upon this framework have been the predominant approach to determining the rate toward which currencies will gravitate over some intermediate horizon, especially in the context of policy issues. For instance, the behavioral equilibrium exchange rate approach is the model most often used to determine the long-term value of the euro.7

The final specification assessed is not a model per se; rather it is an arbitrage relationship—uncovered interest rate parity:

s_{t+k} = s_t + î_{t,k},  (5)

where i_{t,k} is the interest rate of maturity k. Similar to the relative purchasing power parity condition (1), this relation need not be estimated in order to generate predictions.
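Since (5) requires no estimation, generating a UIP forecast is a one-line computation. The numbers below are purely illustrative.

```python
import numpy as np

# Hypothetical current log spot rate and k-period interest rates.
s_t = np.log(1.25)
i_home, i_foreign = 0.05, 0.03

# Equation (5): the k-period-ahead forecast is the current spot rate plus
# the intercountry interest differential at maturity k.
i_hat = i_home - i_foreign
s_forecast_t_plus_k = s_t + i_hat
```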

Interest rate parity is included in the forecast comparison exercise mainly because it has recently gathered empirical support at long horizons (Alexius, 2001; Meredith and Chinn, 1998), in contrast to the disappointing results at shorter horizons. MacDonald and Nagayasu (2000) have also demonstrated that long-run interest rates appear to predict exchange rate levels. On the basis of these findings, we anticipate that this specification will perform better at longer horizons than at shorter ones.8

III. Data, Estimation, and Forecasting Comparison

A. Data

The analysis uses quarterly data for the United States, Canada, United Kingdom, Japan, Germany, and Switzerland over the 1973q2 to 2000q4 period. The exchange rate, money, price and income variables are drawn primarily from the IMF’s International Financial Statistics. The productivity data were obtained from the Bank for International Settlements, while the interest rates used to conduct the interest rate parity forecasts are essentially the same as those used in Meredith and Chinn (1998). See the Data Appendix for a more detailed description.

Two out-of-sample periods are used to assess model performance: 1987q2 to 2000q4 and 1983q1 to 2000q4. The former period conforms to the post-Louvre Accord period, while the latter spans the period after the end of monetary targeting in the U.S. The shorter out-of-sample period (1987–2000) spans a period of relative dollar stability (and appreciation in the case of the mark). The longer out-of-sample period subjects the models to a more rigorous test, in that the prediction takes place over a large dollar appreciation and subsequent depreciation (against the mark) and a large dollar depreciation (from 250 to 150 yen per dollar). In other words, this longer span encompasses more than one “dollar cycle.” The use of this long out-of-sample forecasting period has the added advantage that it ensures that there are many forecast observations to conduct inference upon.

B. Estimation and Forecasting

We adopt the convention in the empirical exchange rate modeling literature of implementing "rolling regressions," established by Meese and Rogoff. That is, the model is estimated over a given data sample, out-of-sample forecasts are produced, and then the sample is "rolled" forward one observation before the procedure is repeated. This process continues until all the out-of-sample observations are exhausted. While rolling regressions do not incorporate possible efficiency gains as the sample moves forward through time, the procedure has the potential benefit of alleviating the effects of parameter instability over time, a phenomenon commonly thought to afflict exchange rate modeling.
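The rolling-regression scheme can be sketched as follows, here with a single synthetic fundamental and OLS; the window length, sample size, and data-generating process are arbitrary illustrative choices, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(0)
T, window = 60, 40
x = rng.normal(size=T)                         # synthetic fundamental
s = 0.5 * x + rng.normal(scale=0.1, size=T)    # synthetic exchange rate

forecasts = []
for start in range(T - window):
    end = start + window
    # Estimate on the current window only...
    X = np.column_stack([np.ones(window), x[start:end]])
    beta, *_ = np.linalg.lstsq(X, s[start:end], rcond=None)
    # ...forecast one step ahead, then roll the window forward.
    forecasts.append(beta[0] + beta[1] * x[end])
forecasts = np.array(forecasts)
```

Each forecast uses only the most recent `window` observations, so the coefficients are re-estimated as the sample rolls forward, which is what mitigates parameter instability.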

Two specifications of these theoretical models were estimated: (1) an error-correction specification, and (2) a first-difference specification. These two specifications entail different implications for the interactions between exchange rates and their determinants. It is well known that both the exchange rate and its economic determinants are I(1). The error-correction specification explicitly allows for the long-run interaction of these variables (as captured by the error-correction term) in generating forecasts. The first-difference model, on the other hand, emphasizes the effects of changes in the macro variables on exchange rates. If the variables are cointegrated, then the former specification is more efficient than the latter and is expected to forecast better at long horizons. If the variables are not cointegrated, the error-correction specification can lead to spurious results. Because it is not easy to determine unambiguously whether these variables are cointegrated, we consider both specifications.

Since implementation of the error-correction specification is relatively involved, we will address the first-difference specification to begin with. Consider the general expression for the relationship between the exchange rate and fundamentals:

s_t = X_t Γ + ε_t,  (6)

where Xt is a vector of fundamental variables under consideration. The first-difference specification involves the following regression:

Δs_t = ΔX_t Γ + u_t.  (7)

These estimates are then used to generate one- and multi-quarter ahead forecasts.9 Since these exchange rate models imply joint determination of all variables in the equations, it makes sense to apply instrumental variables. However, previous experience indicates that the gains in consistency are far outweighed by the loss in efficiency, in terms of prediction (Chinn and Meese, 1995). Hence, we rely solely on ordinary least squares (OLS).
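A minimal sketch of the first-difference forecasting step in (7), using synthetic data and OLS without instruments, as in the text; note that the prediction uses the realized (ex post) change in the fundamental.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50
x = np.cumsum(rng.normal(size=T))                # I(1) synthetic fundamental
s = 0.8 * x + rng.normal(scale=0.2, size=T)      # synthetic exchange rate

ds, dx = np.diff(s), np.diff(x)

# Estimate equation (7) by OLS on all but the last observation...
gamma, *_ = np.linalg.lstsq(dx[:-1, None], ds[:-1], rcond=None)

# ...then predict the final change using the ex post value of Δx.
pred_change = gamma[0] * dx[-1]
```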

The error-correction estimation involves a two-step procedure. In the first step, the long-run cointegrating relation implied by (6) is identified using the Johansen procedure. The estimated cointegrating vector (Γ̃) is incorporated into the error-correction term, and the resulting equation

s_t − s_{t−k} = δ_0 + δ_1(s_{t−k} − X_{t−k} Γ̃) + u_t  (8)

is estimated via OLS. Equation (8) can be thought of as an error-correction model stripped of short run dynamics. A similar approach was used in Mark (1995) and Chinn and Meese (1995), except for the fact that in those two cases, the cointegrating vector was imposed a priori. The use of this specification is motivated by the difficulty in estimating the short run dynamics in exchange rate equations.10
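The two-step procedure can be sketched as follows. The paper identifies the cointegrating vector with the Johansen procedure; to keep this illustration self-contained, the first step below substitutes a static OLS levels regression (Engle-Granger style), a deliberate simplification, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
T, k = 80, 4
x = np.cumsum(0.2 * rng.normal(size=T))          # I(1) synthetic fundamental
s = 1.0 + 0.8 * x + rng.normal(scale=0.3, size=T)

# Step 1: estimate the long-run relation s_t = a + b x_t (stand-in for Johansen).
A = np.column_stack([np.ones(T), x])
a, b = np.linalg.lstsq(A, s, rcond=None)[0]
ect = s - (a + b * x)                            # error-correction term

# Step 2: regress the k-period change on the lagged error-correction term (eq. 8).
y = s[k:] - s[:-k]
Z = np.column_stack([np.ones(T - k), ect[:-k]])
delta, *_ = np.linalg.lstsq(Z, y, rcond=None)

# A true ex ante forecast: only information dated t is needed.
forecast_change = delta[0] + delta[1] * ect[-1]
```

A negative estimated δ_1 indicates that deviations from the long-run relation are subsequently corrected, which is what the error-correction term is meant to capture.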

One key difference between our implementation of the error-correction specification and that undertaken in some other studies involves the treatment of the cointegrating vector. In some other prominent studies (MacDonald and Taylor, 1993), the cointegrating relationship is estimated over the entire sample, and then out of sample forecasting undertaken, where the short run dynamics are treated as time varying but the long-run relationship is not. While there are good reasons for adopting this approach—in particular one wants to use as much information as possible to obtain estimates of the cointegrating relationships—the asymmetry in estimation approach is troublesome and makes it difficult to distinguish quasi-ex ante forecasts from true ex ante forecasts. Consequently, our estimates of the long-run cointegrating relationship vary as the data window moves.11

It is also useful to stress the difference between the error-correction specification forecasts and the first-difference specification forecasts. In the latter, ex post values of the right hand side variables are used to generate the predicted exchange rate change. In the former, contemporaneous values of the right hand side variables are not necessary, and the error-correction predictions are true ex ante forecasts. Hence, we are affording the first-difference specifications a tremendous informational advantage in forecasting.

C. Forecast Comparison

To evaluate the forecasting accuracy of the different structural models, the ratio between the mean squared error (MSE) of the structural models and a driftless random walk is used. A value smaller (larger) than one indicates a better performance of the structural model (random walk). Inferences are based on a formal test for the null hypothesis of no difference in the accuracy (i.e., in the MSE) of the two competing forecasts—structural model vs. driftless random walk. In particular, we use the Diebold-Mariano statistic (Diebold and Mariano, 1995) which is defined as the ratio between the sample mean loss differential and an estimate of its standard error; this ratio is asymptotically distributed as a standard normal.12 The loss differential is defined as the difference between the squared forecast error of the structural models and that of the random walk. A consistent estimate of the standard deviation can be constructed from a weighted sum of the available sample autocovariances of the loss differential vector. Following Andrews (1991), a quadratic spectral kernel is employed, together with a data-dependent bandwidth selection procedure.13
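A sketch of the Diebold-Mariano computation follows. The paper uses Andrews' quadratic spectral kernel with a data-dependent bandwidth; for brevity this version substitutes a Bartlett (Newey-West) kernel with a fixed truncation lag, and the forecast errors are synthetic.

```python
import numpy as np

def diebold_mariano(e_model, e_rw, lags=4):
    """DM statistic for squared-error loss; asymptotically N(0, 1) under the null."""
    d = e_model**2 - e_rw**2                     # loss differential
    n = d.size
    d_bar = d.mean()
    dc = d - d_bar
    # HAC variance via Bartlett weights (simplification vs. quadratic spectral).
    var = dc @ dc / n
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)
        var += 2.0 * w * (dc[j:] @ dc[:-j]) / n
    return d_bar / np.sqrt(var / n)

rng = np.random.default_rng(3)
e_rw = rng.normal(size=200)                      # random walk forecast errors
e_model = rng.normal(scale=2.0, size=200)        # clearly worse structural errors
dm = diebold_mariano(e_model, e_rw)              # large positive: model is worse
```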

We also examine the predictive power of the various models along different dimensions. One might be tempted to conclude that we are merely changing the well-established "rules of the game" by doing so. However, there are very good reasons to use other evaluation criteria. First, there is the intuitively appealing rationale that minimizing the mean squared error (or, relatedly, mean absolute error) may not be important from an economic standpoint. A less pedestrian motivation is that the typical mean squared error criterion may miss important aspects of predictions, especially at long horizons. Christoffersen and Diebold (1998) point out that the standard mean squared error criterion indicates no improvement for predictions that take into account cointegrating relationships vis-à-vis univariate predictions. But surely any reasonable criterion would put some weight on the tendency for predictions from cointegrated systems to "hang together."

Hence, our first alternative evaluation metric for the relative forecast performance of the structural models is the direction of change statistic, which is computed as the number of correct predictions of the direction of change over the total number of predictions. A value above (below) 50 percent indicates a better (worse) forecasting performance than a naive model that predicts the exchange rate has an equal chance to go up or down. Again, Diebold and Mariano (1995) provide a test statistic for the null of no forecasting performance of the structural model. The statistic follows a binomial distribution, and its studentized version is asymptotically distributed as a standard normal. Not only does the direction of change statistic constitute an alternative metric, Leitch and Tanner (1991), for instance, argue that a direction of change criterion may be more relevant for profitability and economic concerns, and hence a more appropriate metric than others based on purely statistical motivations. The criterion is also related to tests for market timing ability (Cumby and Modest, 1987).
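The direction of change statistic and its studentized test can be computed in a few lines; the forecasts below are synthetic and deliberately informative.

```python
import numpy as np

def direction_of_change(pred_change, actual_change):
    # Proportion of correctly predicted signs; under the null of no ability
    # the hit count is binomial(n, 1/2), so the studentized statistic is ~N(0, 1).
    hits = float((np.sign(pred_change) == np.sign(actual_change)).mean())
    n = len(actual_change)
    z = (hits - 0.5) / np.sqrt(0.25 / n)
    return hits, z

rng = np.random.default_rng(4)
actual = rng.normal(size=100)
pred = actual + rng.normal(scale=0.5, size=100)  # noisy but informative forecasts
hits, z = direction_of_change(pred, actual)
```

A significantly negative z would itself be informative: as noted in the text, systematically wrong signs could in principle be traded against.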

The third metric we used to evaluate forecast performance is the consistency criterion proposed in Cheung and Chinn (1998). This metric focuses on the time-series properties of the forecast. The forecast of a given spot exchange rate is labeled as consistent if (1) the two series have the same order of integration; (2) they are cointegrated; and (3) the cointegration vector satisfies the unitary elasticity of expectations condition. Loosely speaking, a forecast is consistent if it moves in tandem with the spot exchange rate in the long run. While the two previous criteria focus on the precision of the forecast, the consistency requirement is concerned with the long-run relative variation between forecasts and actual realizations. One may argue that the criterion is less demanding than the MSE and direction of change metrics. A forecast that satisfies the consistency criterion can (1) have a MSE larger than that of the random walk model; (2) have a direction of change statistic less than ½; or (3) generate forecast errors that are serially correlated. However, given the problems related to modeling, estimation, and data quality, the consistency criterion can be a more flexible way to evaluate a forecast. Cheung and Chinn (1998) provide a more detailed discussion on the consistency criterion and its implementation.
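The full consistency test requires checking integration orders and cointegration between forecast and realization; those steps are omitted here. The fragment below only illustrates condition (3), the unitary elasticity requirement, via an informal levels regression on synthetic series; it is not a formal test.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 120
spot = np.cumsum(rng.normal(size=T))             # I(1) synthetic spot rate
forecast = spot + rng.normal(scale=0.5, size=T)  # forecast tracking the spot rate

# Condition (3): the slope in the long-run relation between the spot rate and
# its forecast should be (close to) one.
A = np.column_stack([np.ones(T), forecast])
intercept, slope = np.linalg.lstsq(A, spot, rcond=None)[0]
unitary = abs(slope - 1.0) < 0.1                 # informal check only
```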

It is not obvious which of the three evaluation criteria is best, as each has a different focus. The MSE is a standard evaluation criterion, the direction of change metric emphasizes the ability to predict directional changes, and the consistency test is concerned with the long-run interaction between forecasts and their realizations. Instead of arguing that one criterion is better than another, we consider these criteria complementary, providing a multifaceted picture of the forecast performance of these structural models. Of course, depending on the purpose of a specific exercise, one may favor one metric over another.

IV. Comparing the Forecast Performance

A. MSE Criterion

The comparison of forecasting performance based on mean squared error (MSE) ratios is summarized in Table 1. The table contains MSE ratios and p-values for five dollar-based currency pairs, five model specifications, the error-correction and first-difference specifications, three forecasting horizons, and two forecasting samples. Each cell in the table has two entries. The first is the MSE ratio (the MSE of a structural model relative to that of the random walk specification). The entry underneath the MSE ratio is the p-value of the Diebold-Mariano statistic testing the null hypothesis that the difference between the MSEs of the structural and random walk models is zero (i.e., that there is no difference in forecast accuracy). Because of the lack of data, the composite model is not estimated for the dollar-Swiss franc and dollar-yen exchange rates. Altogether, there are 216 MSE ratios, which are spread evenly across the two forecasting samples. Of these 216 ratios, 138 are computed from the error-correction specification and 78 from the first-difference one.

Table 1.

MSE Ratios from Dollar-Based Exchange Rates

Source: Authors’ own estimates. Note: The results are based on dollar-based exchange rates and their forecasts. Each cell in the Table has two entries. The first one is the MSE ratio (the MSEs of a structural model to the random walk specification). The entry underneath the MSE ratio is the p-value of the hypothesis that the MSEs of the structural and random walk models are the same (based on Diebold and Mariano, 1995, described in Appendix II). The notation used in the table is ECM: error-correction specification; FD: first-difference specification; PPP: purchasing power parity model; S-P: sticky-price model; IRP: interest rate parity model; PROD: productivity differential model; and COMP: composite model. The forecasting horizons (in quarters) are listed under the heading “Horizon.” The results for the post-Louvre Accord forecasting period are given under the label “Sample 1” and those for the post-1983 forecasting period are given under the label “Sample 2.” A “.” indicates the statistics are not generated due to unavailability of data.

Note that in the tables, only “error-correction specification” entries are reported for the purchasing power parity and interest rate parity models. In fact, the two models are not estimated; rather the predicted spot rate is calculated using the parity conditions. To the extent that the deviation from a parity condition can be considered the error-correction term, we believe this categorization is most appropriate.

Overall, the MSE results are not favorable to the structural models. Of the 216 MSE ratios, 151 are not significant (at the 10 percent significance level) and 65 are significant. That is, in the majority of cases one cannot differentiate the forecasting performance of a structural model from that of the random walk. Of the 65 significant cases, there are 63 in which the random walk model is significantly better than the competing structural models and only 2 in which the opposite is true. The significant cases are quite evenly distributed across the two forecasting periods. As 10 percent is the size of the test and 2 cases constitute less than 10 percent of the total of 216, the empirical evidence can hardly be interpreted as supportive of superior forecasting performance by the structural models.

Inspection of the MSE ratios does not reveal many consistent patterns in terms of outperformance. It appears that the productivity model does not do particularly badly for the dollar-mark rate at the 1- and 4-quarter horizons. The MSE ratios of the purchasing power parity and interest rate parity models are less than unity (even though not significant) only at the 20-quarter horizon—a finding consistent with the perception that these parity conditions work better at long rather than at short horizons. As the yen-based results for the MSE ratios—as well as the other two metrics—display the same pattern, we do not report them. They can be found in the working paper version of this article (Cheung, Chinn, and Garcia Pascual, 2003).

Consistent with the existing literature, our results are supportive of the assertion that it is very difficult to find forecasts from a structural model that can consistently beat the random walk model using the MSE criterion. The current exercise further strengthens the assertion as it covers both dollar- and yen-based exchange rates, two different forecasting periods, and some structural models that have not been extensively studied before.

B. Direction of Change Criterion

Table 2 reports the proportion of forecasts that correctly predict the direction of the dollar exchange rate movement and, underneath these sample proportions, the p-values for the hypothesis that the reported proportion is significantly different from ½. When the proportion statistic is significantly larger than ½, the forecast is said to have the ability to predict the direction of change. On the other hand, if the statistic is significantly less than ½, the forecast tends to give the wrong direction of change. For trading purposes, information about significantly incorrect predictions can be used to derive a potentially profitable trading rule by going against the prediction generated by the model. Following this argument, one might consider that cases in which the proportion of "correct" forecasts is larger than ½ and cases in which it is less than ½ contain the same information. However, in evaluating the ability of the models to describe exchange rate behavior, we treat the two cases separately.

Table 2.

Direction of Change Statistics from Dollar-Based Exchange Rates

Source: Authors’ own estimates. Note: The table reports the proportion of forecasts that correctly predict the direction of the dollar exchange rate movement. Underneath each direction of change statistic, the p-value for the hypothesis that the reported proportion is significantly different from ½ is listed. When the statistic is significantly larger than ½, the forecast is said to have the ability to predict the direction of change. If the statistic is significantly less than ½, the forecast tends to give the wrong direction of change. The notation used in the table is ECM: error-correction specification; FD: first-difference specification; PPP: purchasing power parity model; S-P: sticky-price model; IRP: interest rate parity model; PROD: productivity differential model; and COMP: composite model. The forecasting horizons (in quarters) are listed under the heading “Horizon.” The results for the post-Louvre Accord forecasting period are given under the label “Sample 1” and those for the post-1983 forecasting period are given under the label “Sample 2.” A “.” indicates the statistics are not generated due to unavailability of data.
Table 3.

Cointegration Between Dollar-Based Exchange Rates and Their Forecasts

Source: Authors’ own estimates. Note: The table reports the Johansen maximum eigenvalue statistic for the null hypothesis that a dollar-based exchange rate and its forecast are not cointegrated. “*” indicates significance at the 10 percent level. Tests for the null of one cointegrating vector were also conducted, but in all cases the null was not rejected. The notation used in the table is ECM: error-correction specification; FD: first-difference specification; PPP: purchasing power parity model; S-P: sticky-price model; IRP: interest rate parity model; PROD: productivity differential model; and COMP: composite model. The forecasting horizons (in quarters) are listed under the heading “Horizon.” The results for the post-Louvre Accord forecasting period are given under the label “Sample 1” and those for the post-1983 forecasting period are given under the label “Sample 2.” A “.” indicates the statistics are not generated due to unavailability of data.

There is mixed evidence on the ability of the structural models to correctly predict the direction of change. Among the 216 direction of change statistics, 50 are significantly larger than ½ at the 10 percent level and 23 are significantly smaller. The incidence of significant outperformance (23 percent) is higher than that implied by the 10 percent size of the test. The results indicate that the structural model forecasts can correctly predict the direction of change, while the proportion of cases in which a random walk outperforms the competing models is only about what one would expect if they occurred randomly.

Let us take a closer look at the incidences in which the forecasts are in the right direction. Approximately 58 percent of the 50 cases are associated with the error-correction model and the remainder with the first difference specification. Thus, the error-correction specification—which incorporates the empirical long-run relationship—provides a slightly better specification for the models under consideration. The forecasting period does not have a major impact on forecasting performance, since exactly half of the successful cases are in each forecasting period.

Among the five models under consideration, the purchasing power parity specification has the highest number (18) of forecasts that correctly predict the direction of change, followed by the sticky-price, composite, and productivity models (10, 9, and 8, respectively), and the interest rate parity model (5). Thus, at least on this count, the newer exchange rate models do not edge out the “old fashioned” purchasing power parity doctrine and the sticky-price model. Because the number of forecasts differs across models, owing to data limitations and specification differences, the counts do not map exactly into proportions. Proportionately, the purchasing power parity model does the best.

Interestingly, the success of direction of change prediction appears to be currency-specific. The dollar-yen exchange rate accounts for 13 of the 50 correct direction of change predictions. In contrast, the dollar-pound rate accounts for only 4 of the 50.

The cases of correct direction prediction appear to cluster at the long forecast horizon. The 20-quarter horizon accounts for 22 of the 50 cases, while the 4-quarter and 1-quarter horizons have 18 and 10 direction of change statistics, respectively, that are significantly larger than ½. Since there have not been many studies utilizing the direction of change statistic in similar contexts, it is difficult to make comparisons. Chinn and Meese (1995) apply the direction of change statistic to 3-year horizons for three conventional models, and find that performance is largely currency-specific: the no-change prediction is outperformed in the case of the dollar-yen exchange rate, while all models are outperformed in the case of the dollar-pound rate. In contrast, in our study at the 20-quarter horizon, the positive results appear to be fairly evenly distributed across the currencies, with the exception of the dollar-pound rate.14 Mirroring the MSE results, it is interesting to note that the direction of change statistic works for purchasing power parity at the 4-quarter and 20-quarter horizons and for the interest rate parity model only at the 20-quarter horizon. This pattern is entirely consistent with the findings that the two parity conditions hold better at long horizons.15

C. Consistency Criterion

The consistency criterion requires only that the forecast and the actual realization co-move one-to-one in the long run. To assess consistency, we first test whether the forecast and the realization are cointegrated.16 If they are, we then test whether the cointegrating vector satisfies the (1, -1) restriction. The cointegration results are reported in Table 3, while the test results for the (1, -1) restriction are reported in Table 4.

Table 4.

Results of the (1, −1) Restriction Test: Dollar-Based Exchange Rates

Source: Authors’ own estimates. Note: The likelihood ratio test statistic for the restriction of (1, −1) on the cointegrating vector and its p-value are reported. The test is only applied to the cointegration cases present in Table 5. The notation used in the table is ECM: error-correction specification; FD: first-difference specification; PPP: purchasing power parity model; S-P: sticky-price model; IRP: interest rate parity model; PROD: productivity differential model; and COMP: composite model. The forecasting horizons (in quarters) are listed under the heading “Horizon.” The results for the post-Louvre Accord forecasting period are given under the label “Sample 1” and those for the post-1983 forecasting period are given under the label “Sample 2.”