Time-Varying Thresholds
An Application to Purchasing Power Parity
  • 1 International Monetary Fund

Abstract

This paper introduces a time-varying threshold autoregressive model (TVTAR), which is used to examine the persistence of deviations from PPP. We find support for the stationary TVTAR against the unit root hypothesis; however, for some developing countries, we do not reject the TVTAR with a unit root in the corridor regime. We calculate magnitudes, frequencies, and durations of the deviations of exchange rates from forecasted changes in exchange rates. A key result is asymmetric adjustment. In developing countries, the average cumulative deviation from forecasts during periods when exchange rates are below forecasts is twice the corresponding measure during periods when exchange rates are above forecasts.

I. Introduction

This paper introduces and estimates a threshold autoregressive model (TAR) that allows for time-varying thresholds. Our model focuses on adjustment dynamics when real exchange rate changes exceed upper and lower forecast thresholds. The estimated model allows us to calculate the magnitudes, frequencies, and durations of the deviations from forecast thresholds, both for depreciations and appreciations. We evaluate the fit of our estimated models using some new tests that compare the simulated density from the estimated model with the density of the actual data. Our results indicate asymmetric adjustment for over-depreciations compared to over-appreciations and for advanced economies compared to developing countries.

We begin with the notion that real exchange rates follow a nonlinear adjustment process that can be represented as a regime-switching process.2 Regime-switching may arise from transaction costs in international arbitrage (Sercu, Uppal, and Van Hulle (1995); Obstfeld and Taylor (1997); Coleman (1995); O’Connell and Wei (2002)). Deviations from purchasing power parity (PPP) are assumed to go uncorrected if they are small relative to the costs of trading, creating a band for the real exchange rate within which the marginal cost of arbitrage exceeds the marginal benefit. Dixit (1989) and Krugman (1989) argue that thresholds may also arise because of sunk costs of international arbitrage and the tendency for traders to wait for sufficiently large arbitrage opportunities before entering the market. Thresholds can also occur because governments care about large and persistent deviations, given the potential effect of real exchange rate misalignments on the current account and the cost of servicing external debt (Dutta and Leon (2002)).3 Such intervention could be effected, for example, in currency markets using foreign currency reserves, through subsidies and the imposition of various trade restrictions, or through monetary policies that affect domestic price levels. In fact, Calvo, Reinhart, and Végh (1995) concluded that the real exchange rate is perhaps the most popular real target in developing countries.

Our research addresses three related issues. First, we examine the persistence of deviations from PPP. One of the reasons for the common finding of a unit root in real exchange rates is the low power of unit root tests when the real exchange rate follows a nonlinear process. It is well known that the power of standard unit root tests falls sharply when the true model is a threshold process (Pippenger and Goering (1993 and 2000)). However, tests for nonstationarity in the presence of nonlinearity have only recently been developed. In our unit root tests, we follow Caner and Hansen (2001), who address the problem of disentangling nonstationarity from nonlinearity by allowing for both simultaneously. Using a general TAR(k) model with unrestricted autoregressive orders, Caner and Hansen (2001) propose Wald tests for a threshold effect (nonlinearity) when the series of interest follows a unit root, and Wald and t-tests for unit roots (nonstationarity) when the threshold nonlinearity is either present or absent.4

Second, we examine whether the observed changes in real exchange rates are consistent with the accepted view of fixed thresholds and symmetric reversion toward the band of inaction. While transaction cost models typically assume symmetric adjustment, nonlinearity due to hysteretic behavior, one-sided hedging, or government intervention suggests asymmetry. For example, Dutta and Leon (2002) argue that countries may choose to defend depreciations more or less vigorously than appreciations, thereby generating asymmetric adjustment behavior. Further, if thresholds are determined endogenously, for example, as monitored targets, the fixed threshold model will be misspecified. We propose a specification that is more general than the fixed threshold case used by Obstfeld and Taylor (1997) and the symmetric TAR used by Michael, Nobay, and Peel (1994). In our model, which allows for time-varying bands and asymmetric adjustment speeds, the time-varying thresholds are determined by forecasts of the real exchange rate. In their test for unit roots, Berben and van Dijk (1999) allow for asymmetry in the speed of adjustment under the alternative, but their “drifting” thresholds are defined as a linear combination of the maximum and minimum of the order statistics of the threshold variable.

Third, we implement tests to compare nonlinear models by evaluating how well they replicate the characteristics of the data. The empirical characteristics commonly found in financial time series and the difficulty of interpreting formal tests of hypotheses in a nonlinear setting suggest the need for alternative measures of model adequacy. We follow Pagan (2002) and Breunig, Najarian, and Pagan (2002) (BNP) and evaluate our estimated models by comparing the densities they imply with the density of the data. To employ these tests, we simulate the models to discover their implied population characteristics and compare these population characteristics with their sample equivalents. We complement the BNP tests with Hamilton’s (2001) flexible parametric nonlinearity test, applied to the residuals of our model.5

We estimate the models for 60 countries, using monthly data on real effective exchange rates. Our sample includes all G-7 countries, a selection of other advanced economies, and some emerging market and developing countries from Asia, Africa, and Latin America. Our results provide support for both stationary regime-switching processes and asymmetric adjustment dynamics. The Wald tests show that the unrestricted TVTAR outperforms both the linear specifications (stationary as well as nonstationary) and the identified threshold nonstationary model (unit root with threshold effects). We find support in some developing countries for the threshold model with a unit root in the corridor regime. As regards asymmetry, we calculate the speed of response to deviations from forecasts and the duration of time spent outside threshold bands to gauge the potential impact of real exchange rate misalignments. We find that G-7 and Asian and other developing (mainly African) countries in our sample respond more strongly to developments relating to over-appreciations; in contrast, other advanced economies and Western Hemisphere (WH) developing countries respond more strongly to developments relating to over-depreciations. Durations are longer for over-depreciations in the Asian developing countries, but for over-appreciations in the other advanced economies (non-G-7), WH, and other developing countries. We calculate the average cumulative deviation (excess deviation) for periods when the actual exchange rate changes lie outside the upper and lower forecast bounds and find that this excess deviation measure for over-depreciations is about twice that for over-appreciations, and is larger for developing countries than for advanced economies. In terms of model adequacy, we evaluate all the models for their ability to replicate five characteristics of the densities of the data. We find that the TVTAR specifications explain the mean and variance and, to a lesser extent, the persistence characteristics of the data, but do less well, especially for developing countries, in replicating the observed asymmetry and interquartile range. These results suggest the need to develop specifications capable of explaining higher moments and other characteristics of the density of observed data.

Our results have the following policy implications. First, the lower persistence implied by our finding of stationarity implies that demand side shocks may also drive exchange rate movements. Second, countries with longer durations of misalignment, larger deviations from threshold bands, or higher excess deviations are likely to have a higher probability of experiencing hysteretic-type effects through their effects on the value of firms. These probabilities appear higher for over-depreciations than for over-appreciations, and more so for developing countries than for advanced economies. Consequently, an argument can be made for policies aimed at reducing the variability and duration of misalignments outside a desired range. Third, exchange rate dynamics seem to vary at least with the level of economic development.

The rest of this paper is organized as follows. Section II discusses the modeling framework used in estimating the real exchange rate dynamics. In Section III we present the results. A brief summary follows in Section IV.

II. Modeling Framework

Nonlinear modeling of economic variables assumes that different states of the world or regimes exist and that the dynamic behavior of economic variables depends on the regime occurring at a point in time. We consider models that are characterized as piecewise linear processes, such that the process is linear in each regime. We examine a Threshold Autoregressive (TAR) model, with a discrete jump at a threshold value, for which the switching function is dependent on the value of the transition variable relative to the threshold value (Tong and Lim (1980)). The series can then be categorized into states consistent with the threshold variable reaching the threshold values separating the regimes. In the context of real exchange rates, the TAR model allows for a band within which no adjustment to the deviations from PPP takes place. This implies that within the band, deviations from PPP may exhibit unit root behavior, but the adjustment process is reverting or stationary in the outer bands.

We assume that the bands of inaction may vary over time. This may be because transactions costs and other market frictions defining arbitrage opportunities vary; expectations on foreign exchange transactions change; and policy intervention may vary with the level of monitored economic aggregates. We propose the following time-varying TAR (TVTAR), which allows for asymmetric time-varying thresholds and adjustment parameters, as well as regime specific means that may be different from the neighboring thresholds. Our model is:

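$$\Delta y_t = \theta_L' x_{t-1} I_{t,L} + \theta_H' x_{t-1} I_{t,H} + \theta_C' x_{t-1} + \varepsilon_t \qquad (1)$$

$$x_{t-1} = (1, y_{t-1}, \Delta y_{t-1}, \ldots, \Delta y_{t-k})', \quad \theta_R = (\beta_{0R}, \rho_R, \beta_{1R}, \ldots, \beta_{kR})', \quad R = L, C, H, \text{ and}$$

$$I_{t,L} = \begin{cases} 1 & \text{if } z_t < 0 \text{ and } |z_t| > |P_{t-1,L}(z_t)| \\ 0 & \text{otherwise} \end{cases}$$

$$I_{t,H} = \begin{cases} 1 & \text{if } z_t > 0 \text{ and } |z_t| > |P_{t-1,H}(z_t)| \\ 0 & \text{otherwise} \end{cases}$$

For $z_t = \Delta y_{t-1}$,

$$P_{t-1,R}(z_t) = \alpha_{t-1,R}\, z_{t-1} + (1 - \alpha_{t-1,R})\, P_{t-2,R}(z_{t-1})$$

$$\alpha_{t-1,R} = \left| \frac{S_{t-1,R}}{A_{t-1,R}} \right|, \text{ with}$$

$$S_{t-1,R} = \delta_R\, dev_{t-1,R} + (1 - \delta_R)\, S_{t-2,R}$$

$$A_{t-1,R} = \delta_R\, |dev_{t-1,R}| + (1 - \delta_R)\, A_{t-2,R}, \text{ and}$$

$$dev_{t-1,R} = z_{t-1} - P_{t-2,R}(z_{t-1})$$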

Pt−1 (zt) is the expected forecast value of the transition variable, based on exponential smoothing with adaptive response (time varying) weights for the exponential rate of decay.

Thus, the 3-regime TVTAR divides the regression according to whether the absolute value of the percentage change in the real exchange rate exceeds the upper and lower forecast bounds, Pt−1,R (zt). The corridor regime occurs when the real exchange rate, during one month, neither appreciates by more than the upper forecast bound, Pt−1,H (zt), nor depreciates by more than the lower forecast bound, Pt−1,L (zt). The transition variable zt = Δyt−d is assumed to be known, stationary, and have a continuous distribution; however, the delay factor d, the lag length k, and the threshold values are unknown. Each δL, δH depends on a functional of the sample. I(A) denotes the indicator function for the event A, such that I(A) = 1 if A is true and I(A) = 0 otherwise. In interpreting the coefficients, R is an index for the alternative regimes, ρR are the slope coefficients on yt−1; β0R are the slope coefficients on the deterministic components; and βiR are the slope coefficients on the (Δyt−1,…,Δyt−k) in the alternative regimes. The model can be nonstationary within one or more regimes, though the alternation between regimes can make it overall stationary.
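The threshold construction described above can be sketched in Python as follows. This is an illustrative reading of the smoothing recursion, not the authors' code: the function names are hypothetical, the start-up forecast is assumed to equal the first observation, the weight α is assumed to be zero whenever the smoothed absolute error is zero, and the upper regime is assumed to require a positive change in the transition variable.

```python
import numpy as np

def adaptive_forecast(z, delta):
    """Return f, where f[t] is the forecast of z[t] formed at t-1.

    Implements P = alpha * z_prev + (1 - alpha) * P_prev with
    alpha = |S / A|, S and A being smoothed signed and absolute errors.
    """
    z = np.asarray(z, dtype=float)
    f = np.empty_like(z)
    f[0] = z[0]          # assumed start-up value (not from the paper)
    s = a = 0.0          # smoothed error and smoothed absolute error
    for t in range(1, len(z)):
        dev = z[t - 1] - f[t - 1]                  # last forecast error
        s = delta * dev + (1.0 - delta) * s
        a = delta * abs(dev) + (1.0 - delta) * a
        alpha = abs(s / a) if a > 0 else 0.0       # assumed guard for A = 0
        f[t] = alpha * z[t - 1] + (1.0 - alpha) * f[t - 1]
    return f

def regime_indicators(z, delta_low, delta_high):
    """Boolean indicators for the lower (L) and upper (H) regimes."""
    z = np.asarray(z, dtype=float)
    p_low = adaptive_forecast(z, delta_low)
    p_high = adaptive_forecast(z, delta_high)
    i_low = (z < 0) & (np.abs(z) > np.abs(p_low))
    i_high = (z > 0) & (np.abs(z) > np.abs(p_high))
    return i_low, i_high
```

Observations for which neither indicator is true fall in the corridor regime. Because each regime has its own δ, the two forecast bounds can respond at different speeds to recent changes, which is what makes the band asymmetric.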

Unit Root Tests

Hansen (1997) indicates that conventional tests of the null of a linear autoregression (AR) versus TAR models have nonstandard distributions, as the threshold parameter is not identified under the null of linearity (see Davies (1987)); also, the sampling distribution of the threshold estimates is not standard. We follow Caner and Hansen (2001) in constructing Wald tests for distinguishing between nonlinearity (threshold effects) and possible nonstationarity in real exchange rate series.6 We consider the following hypotheses:

Wald 1: Linear Stationary-ergodic AR versus Unrestricted TAR

$$H_0: \theta_L = \theta_H = 0,\ \rho_C < 0 \qquad H_A: \theta_L \neq 0,\ \theta_H \neq 0$$

Wald 2: Hansen’s Unidentified Threshold Scenario

$$H_0: \theta_L = \theta_H = 0,\ \rho_C = 0 \qquad H_A: \text{unrestricted 3-regime TAR}$$

Wald 3: Hansen’s Identified Threshold

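$$H_0: \theta_L \neq 0,\ \theta_H \neq 0,\ \rho_L = \rho_H = \rho_C = 0 \qquad H_A: \theta_L \neq 0,\ \theta_H \neq 0,\ \rho_L < 0,\ \rho_H < 0,\ \rho_C < 0 \ \text{(unrestricted 3-regime TAR)}$$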

Wald 4: Unit Root in Corridor Regime, Partial Unit Root

$$H_0: \theta_L \neq 0,\ \theta_H \neq 0,\ \rho_L < 0,\ \rho_H < 0,\ \rho_C = 0 \qquad H_A: \text{unrestricted 3-regime TAR}$$

The test is an F-statistic calculated as the ratio of residual variance of the linear model (null) to that of the TAR model (alternative); however, the F-statistic does not have the standard χ2 (chi-square) asymptotic distribution. The F-statistic is:

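$$F_T(\delta_L, \delta_H, k) = T \left( \frac{\tilde{\sigma}_T^2 - \hat{\sigma}_T^2(\delta_L, \delta_H, k)}{\hat{\sigma}_T^2(\delta_L, \delta_H, k)} \right)$$

where $\tilde{\sigma}_T^2$ is the residual variance under the null hypothesis, and $\hat{\sigma}_T^2(\delta_L, \delta_H, k)$ is the residual variance under the alternative. Because $F_T(\delta_L, \delta_H, k)$ is a decreasing function of $\hat{\sigma}_T^2(\delta_L, \delta_H, k)$, it follows that $F_T(\hat{\delta}_L, \hat{\delta}_H, \hat{k})$ is equivalent to the supremum of the pointwise test statistic $F_T(\delta_L, \delta_H, k)$ over the allowable values for $(\delta_L, \delta_H, k)$, that is

$$F_T(\hat{\delta}_L, \hat{\delta}_H, \hat{k}) = \sup_{(\delta_L, \delta_H, k) \in \Lambda_1 \times \Lambda_2 \times \Lambda_3} F_T(\delta_L, \delta_H, k)$$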

Thus the Wald statistic for H0 is often called the “Sup-Wald” statistic. Given the dependence of the critical values on the particular null and alternative, as well as the presence of nuisance (unidentified under the null) parameters, we calculate the critical values for our test statistics using bootstrap approximations to the Wald statistics.7 The unidentified threshold scenario, which performed better in Caner and Hansen’s Monte Carlo tests, makes use of the constrained bootstrap method,8 and the identified threshold bootstrap is conducted through a simulation from a unit root TAR. Wald 1 is a test for the existence of a threshold; Wald 2 tests for a unit root when there is no threshold effect; Wald 3 tests for a unit root in the presence of threshold effects; and Wald 4 tests for a (partial) unit root only in the corridor regime. In the presence of threshold effects, these threshold unit root tests have greater power than the conventional ADF unit root tests.
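The mechanics of the statistic and its bootstrap p-value can be illustrated with a short Python sketch. It assumes the statistic takes the usual Hansen (1997) form T(σ̃² − σ̂²)/σ̂², with residual variances computed as mean squared residuals; the function names are hypothetical, and the simulation of data under each null (which depends on the bootstrap design) is left out.

```python
import numpy as np

def f_statistic(resid_null, resid_alt):
    """Pointwise statistic T * (var_null - var_alt) / var_alt."""
    resid_null = np.asarray(resid_null, dtype=float)
    resid_alt = np.asarray(resid_alt, dtype=float)
    n_obs = len(resid_alt)
    var_null = np.mean(resid_null ** 2)   # residual variance under the null
    var_alt = np.mean(resid_alt ** 2)     # residual variance under the alternative
    return n_obs * (var_null - var_alt) / var_alt

def sup_statistic(resid_null, resid_alt_candidates):
    """Supremum of the pointwise statistic over candidate alternatives,
    one residual vector per grid point (delta_L, delta_H, k)."""
    return max(f_statistic(resid_null, r) for r in resid_alt_candidates)

def bootstrap_p_value(observed_stat, simulated_stats):
    """Share of statistics from data simulated under the null that
    exceed the statistic computed from the observed sample."""
    simulated_stats = np.asarray(simulated_stats, dtype=float)
    return float(np.mean(simulated_stats > observed_stat))
```

Since the statistic falls as the alternative's residual variance rises, taking the supremum over the grid is the same as evaluating it at the variance-minimizing grid point.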

Estimation

We estimate equation (1) using sequential least squares (Hansen (1997)). Our δR are initialized through a grid search over [0,1] in increments of 0.1, which determines the αR, the threshold sequences, and the indicator variables (IL, IH). We use the lagged difference of the exchange rate as the transition variable and set the delay parameter to unity.9 Our choice of zt = Δyt−1 is stationary whether yt is I(1) or I(0). We also initialize St−2,R = 0, At−2,R = 0, and Pt−2,R = Δyt−2. For each triple (δL, δH, k), consisting of the parameters that determine the lower and upper thresholds and the lag k on Δyt−k, we estimate by ordinary least squares (OLS)10

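$$\Delta y_t = \hat{\theta}_L(\delta_L, \delta_H, k)' x_{t-1} I_{t,L} + \hat{\theta}_H(\delta_L, \delta_H, k)' x_{t-1} I_{t,H} + \hat{\theta}_C(\delta_L, \delta_H, k)' x_{t-1} + \hat{\varepsilon}_t(\delta_L, \delta_H, k)$$

Let $\hat{\sigma}^2(\delta_L, \delta_H, k) = T^{-1} \sum_{t=1}^{T} \hat{\varepsilon}_t(\delta_L, \delta_H, k)^2$ be the OLS estimate of $\sigma^2$ for fixed $(\delta_L, \delta_H, k)$. Then the least squares estimate of the threshold values is found by minimizing $\hat{\sigma}^2(\delta_L, \delta_H, k)$:

$$(\hat{\delta}_L, \hat{\delta}_H, \hat{k}) = \arg\min_{(\delta_L, \delta_H, k) \in \Lambda_L \times \Lambda_H \times \Lambda_k} \hat{\sigma}^2(\delta_L, \delta_H, k)$$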

The parameters of the model can be estimated consistently as long as the true threshold values lie in the interior of the grid space and each regime has sufficient data points to produce reliable estimates of the autoregressive parameters. The least squares estimates of the other parameters and residuals are found by substitution of the point estimates (δ^L,δ^H,k^).

These estimates are used to conduct inference concerning the parameters of interest.
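The grid search can be sketched as below. This is a simplified illustration under stated assumptions rather than the authors' Ox implementation: `design`, `residual_variance`, and `grid_search` are hypothetical names, `indicator_fn` is a user-supplied function mapping the transition variable and a pair (δL, δH) to the two regime indicators, the estimation sample is held fixed across lag lengths so residual variances are comparable, and `min_obs` is an assumed minimum number of observations per outer regime.

```python
import itertools
import numpy as np

def design(y, k, start):
    """Build the dependent variable, x_{t-1}, and z_t = dy_{t-1} for t >= start."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[t-1] holds y[t] - y[t-1]
    rows = range(start, len(y))
    target = np.array([dy[t - 1] for t in rows])
    x = np.array([[1.0, y[t - 1]] + [dy[t - 1 - j] for j in range(1, k + 1)]
                  for t in rows])
    z = np.array([dy[t - 2] for t in rows])
    return target, x, z

def residual_variance(target, x, i_low, i_high):
    """OLS of equation (1) for given indicators; returns (sigma2, coefficients)."""
    regressors = np.hstack([x * i_low[:, None], x * i_high[:, None], x])
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ coef
    return float(np.mean(resid ** 2)), coef

def grid_search(y, indicator_fn, lags, deltas=None, min_obs=10):
    """Minimize the residual variance over (delta_L, delta_H, k)."""
    if deltas is None:
        deltas = np.linspace(0.0, 1.0, 11)    # [0, 1] in increments of 0.1
    start = max(lags) + 1                     # common sample for every k >= 1
    best = None
    for d_low, d_high, k in itertools.product(deltas, deltas, lags):
        target, x, z = design(y, k, start)
        i_low, i_high = indicator_fn(z, d_low, d_high)
        if i_low.sum() < min_obs or i_high.sum() < min_obs:
            continue                          # skip thinly populated regimes
        sigma2, coef = residual_variance(target, x, i_low, i_high)
        if best is None or sigma2 < best[0]:
            best = (sigma2, d_low, d_high, k, coef)
    return best
```

The function returns `None` when no grid point leaves enough observations in both outer regimes, which mirrors the requirement that each regime contain sufficient data points.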

Model Evaluation

We evaluate the estimated models for evidence of remaining nonlinearity, based on Hamilton’s (2001) general linearity test, and for their ability to replicate empirical properties of the data. If our focus is the data generating process (DGP), it is natural to focus on the density describing the variable of interest. In practice, however, researchers tend to focus on some characteristics of the density, depending on the objectives of the modeling exercise. For example, these may include the conditional mean, if the objective is prediction of a point estimate, and volatility, if our interest is uncertainty. In this paper, we focus on tests developed by Pagan (2002) and BNP (2002) that allow us to compare the performance of competing nonlinear models without a priori assumptions that either model is the true DGP. This is particularly important because, in most cases, the researcher does not know which model may have generated the hypothesized shift in regime.

Suppose the analyst (policy maker) is interested in some function of the data, $\hat{g}(y)$. Let $g(\hat{\theta})$ be the corresponding implied population characteristic, obtained from simulated data based on the estimated model. Label the difference between these two measures as $d = \hat{g}(y) - g(\hat{\theta})$. Then, we can think of these tests as comparing a consistent estimator of $g(y)$ to an efficient estimator, $g(\hat{\theta})$, if the model is valid, enabling us to formulate the variance of $d$ as $\mathrm{var}(d) = \mathrm{var}(\hat{g}(y)) - \mathrm{var}(g(\hat{\theta}))$ (see Hausman (1978)). Although the variance of $\hat{g}(y)$ is simply derived from the observed series, the analytical expression for $\mathrm{var}(g(\hat{\theta}))$ may be difficult to obtain for complicated nonlinear specifications. Because the test statistic $T^* = \hat{d}'[\mathrm{var}(\hat{g}(y)) - \mathrm{var}(g(\hat{\theta}))]^{-1}\hat{d} > T = \hat{d}'[\mathrm{var}(\hat{g}(y))]^{-1}\hat{d}$, Pagan (2002) suggests using the conservative test $T$. A rejection based on $T$ (compared to $\chi^2(1)$) would therefore imply an even stronger rejection based on $T^*$. A robust estimator of $\mathrm{var}(\hat{g}(y))$, compatible with many alternative models, can be obtained using the Newey-West (1987) covariance matrix.
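For a characteristic that can be written as a sample average, the conservative statistic can be sketched in Python as follows. The sketch assumes a scalar characteristic, a Bartlett-kernel Newey-West estimate of the long-run variance, and a model-implied value already obtained by simulation; the function names are hypothetical.

```python
import numpy as np

def newey_west_variance_of_mean(h, lags):
    """Newey-West (Bartlett kernel) variance of the sample mean of h."""
    h = np.asarray(h, dtype=float)
    n_obs = len(h)
    u = h - h.mean()
    long_run_var = u @ u / n_obs                     # lag-0 autocovariance
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)              # Bartlett weight
        long_run_var += 2.0 * weight * (u[j:] @ u[:-j]) / n_obs
    return long_run_var / n_obs

def conservative_test(h_data, simulated_value, lags=9):
    """T = d^2 / var(g_hat(y)), with d = g_hat(y) - g(theta_hat).

    h_data holds the per-period contributions to the characteristic
    (for the mean, the series itself); simulated_value is the
    model-implied characteristic from simulated data.
    """
    d = float(np.mean(h_data)) - simulated_value
    return d ** 2 / newey_west_variance_of_mean(h_data, lags)
```

The resulting value would be compared with a χ2(1) critical value (about 3.84 at the 5 percent level). Omitting the variance of the simulated characteristic makes the denominator larger than in T*, which is why the test is conservative.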

III. Results

We examine real effective exchange rates of 60 countries for the period 1981:03 to 2001:12, using Ox Professional 3.0 (see footnotes 11 and 12). All data are taken from the International Financial Statistics (IFS) database of the International Monetary Fund (IMF). The real effective exchange rate (REER), based on consumer prices, measures movements in the nominal exchange rate adjusted for differentials between the domestic price index and trade-weighted foreign price indices.

Empirical Characteristics

We investigate the estimated speed of response to deviations from forecasts, the time spent outside threshold bounds, and a measure of deviations between actual changes and forecast thresholds during periods outside of thresholds. We present results for groupings of advanced and developing economies. Summaries of the characteristics of the threshold bands and estimates of duration are shown in Tables 1 and 2 and described below.

Table 1: Characteristics of Threshold Bands

Note: Let subscript R depict the alternative regimes, with L corresponding to over-depreciation, H to over-appreciation, and Cor to the corridor. The columns report the parameters from the forecast measure that characterizes the time-varying bands (δR and αR), the optimal lag length (κ*), and the percentage of time the series spends in each of the intervention regimes.

Table 2: Duration and Loss Estimates

Note: Let subscript R depict the alternative regimes, with L corresponding to over-depreciation, and H to over-appreciation. MaxD shows the average maximum duration of excess deviations on each side of the band (number of periods), and AveD is the average duration per spell of excess deviation, across countries for each regime; CumL is the average excess deviation (area between the tolerance margin and the observed realizations when the band is crossed), and AveL is the average excess deviation per spell, across countries for each regime.

Response: The adaptive response weight parameters αL and αH show the quickness of response to relatively recent exchange rate variations. The response for deviations toward over-depreciation is quicker for advanced economies than that for developing countries (0.59 vs. 0.49), implying narrower, closely watched bands. In contrast, the response for over-appreciation is much quicker for developing countries (0.54 vs. 0.45). The differences are more marked in subregions. For over-depreciations, the other (non-G-7) advanced economies have the fastest response (0.62), Asia the slowest (0.43); for over-appreciations, the other advanced economies have the slowest response (0.41), other developing countries the fastest (0.65). If this design of the thresholds reflects a measure of relative tolerance for these exchange rate variations, then the results suggest that Asia and other developing countries have low tolerance for over-appreciations. Further, the longer average lag for developing countries relative to advanced economies (7 vs. 5 months) suggests a more complex structure for short-term interaction between nominal exchange rates and relative prices.

On average, both advanced and developing economies display asymmetric responses to changes in real exchange rates, with G-7 (0.55 vs. 0.51), Asia (0.58 vs. 0.43), and other developing countries (0.65 vs. 0.53) placing greater weight on recent developments relating to appreciations when predicting the tolerance margin. The opposite is true for the other advanced (0.62 vs. 0.41) and WH (0.49 vs. 0.43) economies, which react more strongly to developments relating to over-depreciations.

Duration: Maximum durations are longer for over-depreciations for G-7 and Asia and for over-appreciations in the other groups. The maximum duration for the G-7 occurs in the lower regime (4.6 months), but in the upper regime for the other advanced economies (4.5 months). Similarly, the maximum duration for Asia is in the lower regime (4.6 months), but in the upper regime for the WH countries (4.7 months) and other developing countries (4.4 months). The average durations show similar patterns, with the other advanced economies and WH displaying longer average durations for over-appreciations and Asia exhibiting longer average duration for over-depreciations. The G-7 and other developing countries have equal durations for both types of deviations. Given the difference in response towards depreciation and appreciation deviations of the subgroups, the evidence on duration is probably informative about the speed or effectiveness of the policy measures used to reverse deviations from forecasts. Average duration in the upper regime is greater than the average duration in the lower regime in 51 percent of developing countries, compared to 72 percent of advanced economies. Average duration varies across regions: for the other advanced economies, 83 percent record durations in the upper regime in excess of durations for the lower regime; for the Asian countries, the corresponding figure is 29 percent. As regards the distribution of observations across regimes, there is a tendency in developing countries for more observations to lie in the upper regime (29% vs. 25%), more so for WH countries (33% vs. 24%), consistent with longer average durations for and slower response to over-appreciations. For advanced economies, there is a slight tendency in G-7 for more observations in the lower regime and in other advanced economies for more observations in the upper regime.

Excess Deviation: If we define the cumulated difference between the actual exchange rate change and the expected change for the duration of a crossing as an excess deviation measure, we find that, for all groups, the excess deviation for a depreciation spell (crossing beyond the lower threshold) is at least twice as large as that for an appreciation spell (crossing beyond the upper threshold). The overall average is 0.25 in the lower regime and 0.09 in the upper regime. For developing countries, the excess deviation for depreciations is three times higher than that for appreciations; in contrast, the factor is 1.5 for advanced economies. Further, the excess deviation for depreciations is about 4.5 times higher for developing countries relative to advanced economies; for appreciations, that relative factor is about 2.5.

For developing countries, the excess deviation per spell for depreciations (AveLL) is about twice that for appreciations. Excess deviation per spell is also larger for developing countries than for advanced economies, for both depreciations and appreciations. The excess deviation per spell for depreciations in the developing countries is four times that of the advanced economies; for the group other developing countries (mostly African countries), this factor is five. Yet, for appreciations, developing countries have excess deviations per spell that are at most twice that of the advanced economies. Also, the excess deviation per spell in the lower regime is greater than the excess deviation per spell in the upper regime in 83 percent of developing countries (and in all countries in the other developing countries group), compared to 56 percent for the advanced economies.
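The duration and excess deviation measures for a single country and regime can be sketched in Python as follows. The sketch assumes that a spell is a run of consecutive periods outside the band and that the excess deviation of a spell is the cumulated absolute gap between the actual change and the forecast bound; the function name and the returned keys are hypothetical and only loosely mirror MaxD, AveD, CumL, and AveL.

```python
import numpy as np

def spell_statistics(outside, z, bound):
    """Summarize spells outside the band for one regime.

    outside: boolean indicator of being beyond the forecast bound
    z:       actual changes in the real exchange rate
    bound:   forecast bound for the same periods
    """
    outside = np.asarray(outside, dtype=bool)
    gap = np.abs(np.asarray(z, dtype=float) - np.asarray(bound, dtype=float))
    durations, excesses = [], []
    length, excess = 0, 0.0
    for is_out, g in zip(outside, gap):
        if is_out:                      # spell continues (or starts)
            length += 1
            excess += g
        elif length:                    # spell has just ended
            durations.append(length)
            excesses.append(excess)
            length, excess = 0, 0.0
    if length:                          # spell still open at sample end
        durations.append(length)
        excesses.append(excess)
    if not durations:
        return {"max_duration": 0, "avg_duration": 0.0,
                "cum_excess": 0.0, "avg_excess_per_spell": 0.0}
    return {"max_duration": max(durations),
            "avg_duration": float(np.mean(durations)),
            "cum_excess": float(np.sum(excesses)),
            "avg_excess_per_spell": float(np.mean(excesses))}
```

Group figures of the kind reported in Table 2 would then be averages of these country-level values within each regime.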

We use the cross-section data on duration and excess deviation to explore whether the observed asymmetry is correlated with trade openness (ratio of exports plus imports to GDP) and debt (external liabilities to GDP). PPP theory indicates that PPP deviations are corrected over time through adjustments in trade flows, suggesting that the speed of reversion may be related to trade openness. Also, Lane and Milesi-Ferretti (2001) investigate the link between the real exchange rate and net external position and find the magnitude of the “transfer-effect” varies with country characteristics like openness, size, level of development, and composition of external liabilities. Our results relate measures of real exchange rate dynamics to two country characteristics, openness and external debt. The results in Table 3 show that for the sample of 60 countries, openness is positively correlated with the average duration for over-appreciations but uncorrelated with the average duration for over-depreciations.13 Further, openness is negatively correlated with the excess deviation for over-depreciations but uncorrelated with the excess deviation for over-appreciations. Using the results for the 34 countries in the sample for which both debt and openness data were available (the developing countries), we obtain similar results. We find a positive correlation between average openness and average duration for over-appreciations but no correlation between openness and average duration for over-depreciations; further, there is a positive correlation between the average debt-to-GDP ratio and the excess deviation per spell for over-depreciations but no correlation between the debt ratio and the excess deviation per spell for over-appreciations. For the developing countries, both the excess deviation for over-depreciation and longer average durations in the lower regime are positively correlated with the debt ratio and negatively correlated with openness. The finding that openness may be related to duration of over-appreciation misalignments but debt ratios are related to excess deviations of over-depreciations merits further research.

Table 3: Cross-Section Regression Estimates

Note: 1/ Sample is restricted to countries for which both debt and openness data were available. Let subscript R depict the alternative regimes, with L corresponding to over-depreciation and H to over-appreciation. AveDt is the average duration per spell of excess deviation, AveL is the average excess deviation per spell, and CumLs is the average excess deviation (area between the tolerance margin and the observed realizations when the band is crossed), across countries for each regime. DL> DH indicates that the average duration per spell in the lower regime is greater than in the upper regime. t-statistics, based on the Newey-West heteroskedasticity-consistent covariance matrix, are in parentheses.

Parameter Estimates

Tables 4 and 5 summarize the TAR estimates and the Wald tests (see Appendix Table 1).

Table 4: Average Reversion Coefficients

Note: Subscripts depict the alternative regimes, with L corresponding to over-depreciation, H to over-appreciation, and C to the corridor. LIN refers to the linear model.

Table 5: Summary of Wald Tests

For the unrestricted TAR model, ρL > ρH for the other advanced economies and other developing countries, and ρH > ρL for G-7 and WH countries, but the reversion rates are faster in developing countries compared to the advanced economies. For G-7, only ρH < 0; on the other hand, for the other advanced economies, only ρL < 0. In the corridor regime, all reversion coefficients are negative. For the TARurCor model, |ρL| > |ρH| for all groups of developing countries, with approximate equality for G-7 and other developing economies. In the lower regime, reversion is faster for developing countries than for advanced economies. On average, WH and other developing countries revert twice as fast in the lower regime compared to G-7, other advanced economies, and Asian developing countries, which revert at similar speeds. On the other hand, Asia and WH have the slowest reversion rates in the upper regime, about one-half the speed of the other advanced economies and other developing countries. The existence of threshold effects suggests that the results for the linear model are averages across the three regimes.

We calculate Wald statistics to test for threshold effects and/or unit roots. The tests measure whether the Data Generating Process (DGP) under the null produces a residual variance that is significantly larger than the residual variance obtained from the fit of the alternative hypothesis, in our case the unrestricted TAR specification. Table 5 shows the percentage of countries for which the various null hypotheses are plausible. These statistics are based on estimated bootstrap p-values, representing the percentage of Wald statistics calculated from the simulated data that exceed the Wald statistics calculated from the observed sample.

The unconstrained bootstrap results indicate an overwhelming rejection of the first three null hypotheses.14 The unrestricted TAR specification outperforms the benchmark stationary ergodic linear process. It is also preferred over both the linear non-stationary I(1) specification, the p-values for which are obtained by constructing a bootstrap distribution that imposes an unidentified threshold effect, and the unit root TAR process.15 Because the unidentified threshold model was less sensitive to nuisance parameters, Caner and Hansen (2001) recommend calculating p-values using the unidentified threshold bootstrap. The intermediate case, which we label an identified threshold partial unit root process (I(1) in the corridor regime combined with an otherwise stationary TAR), yields different outcomes for advanced and developing economies. While the null is still rejected against the stationary ergodic TAR for most advanced economies, the developing countries do not reject the partial unit root TAR as their preferred specification. Thus, the partial unit root model could characterize the data dynamics for these countries.

Model Evaluation

Applying Hamilton’s (2001) generalized test for nonlinearity to the residuals of our estimated models indicates that both the unrestricted TAR and the TARur model explain the nonlinearity in the advanced economies. The incidence of remaining nonlinearity is about 15 percent for the advanced economies and about 33 percent for the developing countries (see Appendix Table 2). The TARurCor model shows remaining nonlinearity in about one-third of both groups of economies, suggesting that it performs as well as the TARur in developing countries but is less adequate as a characterization of the data dynamics across all economies.

For BNP (2002), we consider tests for the first two moments (mean and variance), the interquartile range (the middle 50% of the observations), and measures of asymmetry and persistence. For asymmetry and persistence, we measure how well the data simulated under the estimated models replicates the features of EGARCH-asymmetry and GARCH-persistence in the conditional variance of the empirical sample. The tests are based on the comparison of the series’ empirical density, estimated nonparametrically, and the density implied by each of the models, obtained from simulations using 1000 replications as the trimming margin. In calculating the Newey-West standard errors, 9 lags were used to account for possible serial correlation. In interpreting the reported statistics, a positive value of a statistic generally indicates a proportional under-representation of the corresponding indicator in the series implied by the model (see Appendix Table 3). For example, the linear AR model tends to over-predict the mean relative to the linear unit root AR, and the TARur tends to over-predict the mean relative to the unrestricted TAR. For the asymmetry and persistence test, we report absolute values of the tests.
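Two of the ingredients mentioned above, the interquartile range and Newey-West standard errors with 9 lags, can be sketched as follows. This is not the BNP (2002) test itself: it only shows a Bartlett-kernel (Newey-West) standard error for a sample mean and a simple interquartile-range calculation, with hypothetical function names.

```python
import numpy as np


def newey_west_se_of_mean(x, lags=9):
    """Newey-West (Bartlett kernel) standard error of the sample mean."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    # Lag-0 autocovariance.
    lrv = np.dot(d, d) / n
    # Add weighted autocovariances up to the chosen lag length.
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)
        gamma_j = np.dot(d[j:], d[:-j]) / n
        lrv += 2.0 * weight * gamma_j
    return float(np.sqrt(lrv / n))


def interquartile_range(x):
    """Distance between the 75th and 25th percentiles."""
    q75, q25 = np.percentile(np.asarray(x, dtype=float), [75, 25])
    return float(q75 - q25)


# Illustrative data only.
print(interquartile_range([1, 2, 3, 4, 5]))  # 2.0
```

With `lags=0` the standard error reduces to the usual one for uncorrelated data; positive lag lengths widen or narrow it according to the serial correlation in the series.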

These statistics show that, in terms of relative performance, the corridor unit root model (TARurCor) performs the least well in matching the two densities. The unrestricted (stationary) threshold model performs slightly better than the threshold model with a unit root in each regime (TARur). The performances of the linear and linear unit root models are similar, probably indicating a near unit root estimate. In contrast to the Wald tests, the BNP tests are less discriminating, because they are conservative and therefore under-reject.16,17 But they provide critical information on the exact moment-based measure that is responsible for misspecification in the estimated model.

Although most models perform well in matching the mean and variance of the data, their ability to replicate the interquartile range and asymmetry effect is less impressive. An interpretation of this result is that the TAR models are more capable of explaining the mean, variance, and a specific form of persistence in the data. In about 50 percent of advanced economies, all the characteristics tested are replicated by at least one model; it is also clear that the TAR framework is less successful in replicating characteristics of the data densities of the developing countries. For the advanced economies, the linear models are also capable of replicating the characteristics of the data densities.

IV. Conclusions

This paper introduces a time-varying threshold autoregressive model and examines whether deviations from PPP are stationary in the presence of that nonlinear specification. Our results are threefold. First, we find support for both stationary regime-switching processes and asymmetric adjustment dynamics. The Wald tests show that the unrestricted TVTAR outperforms both the linear specifications (stationary as well as nonstationary) and the identified threshold nonstationary model (unit root with threshold effects). We find support in some developing countries for the threshold model with a unit root in the corridor regime. As regards asymmetry, we find that G-7 and Asian and other developing (mainly African) countries in our sample respond more strongly to developments relating to over-appreciations; similarly, other advanced economies and Western Hemisphere (WH) developing countries respond more strongly to developments relating to over-depreciations. Second, for both advanced and developing economies, the excess deviation for over-depreciations is at least twice that associated with over-appreciations; further, that measure is larger for developing countries than for advanced economies for both over-appreciations and over-depreciations. For developing countries, the excess deviation per spell is twice as large for over-depreciations as for over-appreciations, and about four times larger than the excess deviation per spell for advanced economies. Third, durations are longer for over-depreciations in the Asian developing countries, but for over-appreciations in the other advanced economies (non-G-7), WH, and other developing countries.

Our results have the following implications. First, the finding of asymmetrical durations and excess deviations suggests that the macroeconomic consequences of real exchange rate misalignments vary with the level and type of misalignment, that is, over-appreciations or over-depreciations. Second, trade openness may be related to the duration of over-appreciation misalignments, but debt ratios are related to excess deviations of over-depreciations. Third, exchange rate dynamics seem to vary with country characteristics, at least with the level of economic development.

Appendix I

Appendix Table 1:

Parameters of Interest

Note: Subscripts depict the regimes, with L corresponding to over-depreciation, H to over-appreciation, and C to the corridor. LIN refers to the linear model.
Appendix Table 2:

Hamilton’s Nonlinearity Test

Note: Numbers are p-values for the null hypothesis of no remaining nonlinearity.
Appendix Table 3:

Unconditional Moments Tests

Note: “na” implies that after a large number of simulations, the estimates from these models lead to a divergence between the theoretical and empirically observed properties. IRQ is the interquartile range. For the asymmetry and persistence statistics, we report absolute values.