Early Warning Systems
A Survey and a Regime-Switching Approach

Contributor Notes

Author’s E-Mail Address: aabiad@imf.org

Previous early-warning systems (EWSs) for currency crises have relied on models that require a priori dating of crises. This paper proposes an alternative EWS, based on a Markov-switching model, which identifies and characterizes crisis periods endogenously; this also allows the model to utilize information contained in exchange rate dynamics. The model is estimated using data for the period 1972-99 for the Asian crisis countries, taking a country-by-country approach. The model outperforms standard EWSs, both in signaling crises and reducing false alarms. Two lessons emerge. First, accounting for the dynamics of exchange rates is important. Second, different indicators matter for different countries, suggesting that the assumption of parameter constancy underlying panel estimates of EWSs may contribute to poor performance.



I. Introduction

A succession of currency crises in the past decade has led to a proliferation of theoretical and empirical papers on the factors that brought about these crises. Several papers have also focused on the issue of anticipation—devising early warning systems that give policymakers and market participants warning that a crisis is likely to occur. Two approaches to constructing early warning systems have become standard: limited dependent variable probit/logit models and the indicators approach of Kaminsky, Lizondo, and Reinhart (1998, hereinafter referred to as KLR). Berg and others (2002) assess the performance of these models and find that they have outperformed alternative measures of vulnerability such as bond spreads and credit ratings. However, while these models are able to anticipate some crises, they also generate many false alarms.

There are several well-known methodological issues associated with the existing early warning models. Perhaps the most significant is that they require an a priori dating of crisis episodes before they can be estimated. The most common procedure is to take changes in exchange rates, reserves, and/or interest rates; choose weights for each; combine them into an index of speculative pressure; specify a sample-dependent threshold; and identify crises based on whether or not the index exceeds the threshold. But as is evident from the survey of 26 recent empirical studies of currency crises in Section II below, this simple procedure has been applied in a multitude of ways, resulting in different periods being identified as crises.2

The threshold procedure provides a set of crisis dates, but raises even more problems. First, the choice of the crisis-identification threshold is arbitrary. A selected sampling of thresholds used in the literature includes the threshold of 1.5 × σ (where σ is the sample standard deviation) used in Aziz and others (1999), 1.645 × σ in Caramazza and others (2000), 1.75 × σ in Kamin and others (2001), 2.5 × σ in Edison (2000), and 3 × σ in KLR. Different choices of threshold will obviously result in different crisis dates and different estimated coefficients. Moreover, the threshold is sometimes treated as a free parameter and chosen so that the fit of the model is maximized (Kamin and others, 2001), or so that a set percentage, say 5 percent, of all observations are crises (Caramazza and others, 2000).
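The sensitivity of crisis dating to the threshold can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated, the index uses equal weights on exchange rate and reserve changes, and the threshold is taken to be the sample mean plus k sample standard deviations. The function names `pressure_index` and `date_crises` are hypothetical and do not correspond to any of the cited studies.

```python
import numpy as np

def pressure_index(d_fx, d_res, w_fx=1.0, w_res=1.0):
    """Weighted index: depreciation raises pressure, reserve gains lower it."""
    d_fx = np.asarray(d_fx, dtype=float)
    d_res = np.asarray(d_res, dtype=float)
    return w_fx * d_fx - w_res * d_res

def date_crises(index, k):
    """Flag periods where the index exceeds its sample mean by k sample std devs."""
    index = np.asarray(index, dtype=float)
    threshold = index.mean() + k * index.std(ddof=1)
    return index > threshold

rng = np.random.default_rng(0)
d_fx = rng.normal(0.0, 1.0, 240)   # hypothetical monthly % depreciation
d_res = rng.normal(0.0, 1.0, 240)  # hypothetical monthly % change in reserves
idx = pressure_index(d_fx, d_res)

# Number of crisis months implied by each multiplier quoted in the text
counts = {k: int(date_crises(idx, k).sum()) for k in (1.5, 1.645, 1.75, 2.5, 3.0)}
```

Because each higher multiplier flags a subset of the periods flagged by a lower one, the crisis count can only fall as k rises, so the same data yield different crisis samples under each convention.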

Second, the sample-dependent nature of the threshold definition implies that future data can affect the identification of past crises. Thus one can observe cases of disappearing crises, as documented by Edison (2000). Since the threshold is defined in terms of the sample standard deviation, the occurrence of a new, relatively large crisis, such as the Asian crisis, results in previously identified crises no longer being identified as such. Edison notes that the threshold methodology identifies five crises in Malaysia using pre-1997 data, but these all disappear, and only one crisis is identified (the 1997 crisis itself), when data up to 1999 are included in the sample.
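The mechanism behind disappearing crises can be reproduced on a stylized series. The sketch below is not Edison's data or procedure: the index values are invented, and the threshold is assumed to be the sample mean plus three sample standard deviations.

```python
import numpy as np

def crisis_dates(index, k=3.0):
    """Return positions where the index exceeds mean + k * sample std dev."""
    index = np.asarray(index, dtype=float)
    return np.flatnonzero(index > index.mean() + k * index.std(ddof=1))

# Hypothetical index: a quiet series with one moderate spike at t = 50
early = np.zeros(100)
early[50] = 5.0

# Extend the sample with a much larger episode at t = 110
full = np.concatenate([early, np.zeros(20)])
full[110] = 40.0

before = crisis_dates(early)  # crisis dates using the early sample only
after = crisis_dates(full)    # crisis dates once the large episode is included
```

In the early sample the spike at t = 50 is well above the threshold. Once the larger episode enters the sample, the standard deviation rises enough that only t = 110 qualifies, and the earlier crisis is no longer identified.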

Third, many of these studies make ad hoc adjustments to the binary crisis variable that may introduce artificial serial correlation. One common procedure is the use of “exclusion windows,” which omit any crises identified by the threshold method if they follow a previous crisis within a certain window of time. As is the case with the threshold level, the width of the exclusion window is arbitrary, and has been chosen to be anywhere from one quarter (Eichengreen, Rose, and Wyplosz, 1996) to as long as 18 months (Aziz and others, 1999) and even three years (Frankel and Rose, 1996, using annual data). The motivation for using exclusion windows is to avoid identifying speculative pressure episodes as new crises if they are just continuations of previous ones. But in doing so, one eliminates any information the sample contains regarding crisis duration. More seriously, it introduces artificial serial correlation in the dependent variable that few studies account for. The estimated probit/logit models implicitly assume independence across observations t. But using an exclusion window means that C_t = 1 ⇒ Pr(C_{t+j} = 1) = 0 for j = 1, 2, …, J, where J is the width of the exclusion window.3
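A minimal sketch of an exclusion window shows how the restriction is imposed mechanically. It assumes the window is measured from the most recent retained crisis; some studies may instead measure it from the most recent raw signal. The function name `apply_exclusion_window` is hypothetical.

```python
def apply_exclusion_window(crisis, window):
    """Keep a crisis flag only if no retained crisis occurred in the previous `window` periods."""
    kept = []
    last = None  # position of the most recent retained crisis
    for t, c in enumerate(crisis):
        keep = bool(c) and (last is None or t - last > window)
        if keep:
            last = t
        kept.append(keep)
    return kept

raw = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]  # hypothetical threshold-based crisis flags
filtered = apply_exclusion_window(raw, window=3)
```

In this example the flags at t = 2 and t = 4 are dropped because they fall within three periods of the crisis retained at t = 1, and the flag at t = 9 is dropped because of the crisis retained at t = 8. By construction, every retained crisis is followed by at least `window` zeros, which is the serial dependence described above.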

Finally, information is lost when transforming a continuous variable into a binary variable. In particular, potentially useful information on the dynamics of the dependent variable is discarded. The critique regarding information loss can also be made regarding the treatment of the indicators in the KLR approach, where the explanatory variables themselves are transformed into binary signals.

Given these problems, is there an alternative approach? This paper proposes an EWS methodology, based on a Markov-switching model with time-varying transition probabilities, that can address these issues. First, the model does not require a priori dating of crisis episodes; instead, identification and characterization of crisis periods are part of the model’s output, estimated simultaneously with the crisis forecast probabilities in a maximum-likelihood framework. One thus avoids the pitfalls associated with the threshold dating procedure described above. Additionally, by exploiting information in the dynamics of the dependent variable itself, the model is better able to send warning that a significant exchange rate adjustment is likely.

The assumptions that underlie a Markov-switching model are both concise and intuitive. The first assumption is that there are two states: tranquil periods and speculative attack periods. But we do not directly observe these states; that is, this binary “crisis” variable is latent. This brings us to our second assumption: there are directly observable variables whose behavior changes depending on the value of the crisis variable. Most obviously, the behavior of exchange rates is different during periods of speculative pressure than during tranquil periods.4 In particular, we expect much greater exchange rate volatility as well as higher average depreciations during speculative attacks. Finally, we assume that given the current state—tranquil or crisis—there is a certain probability of staying in the same state, or of moving to the other state. In our model, the probability of moving from the tranquil state to the crisis state depends on the strength or weakness of a country’s fundamentals.
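These three assumptions can be made concrete by simulating from such a process. The sketch below is illustrative only: the logistic form for the tranquil-to-crisis probability, the constant probability of remaining in crisis, and all parameter values are assumptions chosen for the example, not the specification or estimates of this paper. Here `x` stands for a single hypothetical fundamental, with higher values denoting weaker fundamentals.

```python
import numpy as np

def simulate(x, beta0=-3.0, beta1=2.0, p_stay_crisis=0.7,
             mu=(0.0, 3.0), sigma=(1.0, 5.0), seed=0):
    """Simulate a latent two-state chain and a state-dependent observed series."""
    rng = np.random.default_rng(seed)
    T = len(x)
    s = np.zeros(T, dtype=int)  # latent state: 0 = tranquil, 1 = crisis
    y = np.zeros(T)             # observed series, e.g. % exchange rate change
    for t in range(T):
        if t > 0:
            if s[t - 1] == 0:
                # Tranquil -> crisis probability rises with the lagged fundamental
                p_crisis = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x[t - 1])))
            else:
                # Probability of remaining in crisis is held constant here
                p_crisis = p_stay_crisis
            s[t] = int(rng.random() < p_crisis)
        # Mean and volatility of the observed series depend on the latent state
        y[t] = rng.normal(mu[s[t]], sigma[s[t]])
    return s, y

x = np.linspace(-2.0, 2.0, 200)  # fundamentals deteriorating over time
states, dep = simulate(x)
```

In estimation the direction is reversed: only `dep` and `x` are observed, and the state sequence is inferred along with the parameters by maximum likelihood.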

Several studies have used Markov-switching models in developing theoretical models of speculative attacks. Jeanne and Masson (1998) and Fratzscher (1999) develop currency crisis models with multiple equilibria and use a Markov-switching variable to model switches between these equilibria. In both cases, however, the probability of switching from one equilibrium to another is constant. In contrast, the model in this paper allows switches from the optimistic, no-attack equilibrium to the pessimistic, speculative attack equilibrium to be a function of various indicators.

Two other papers have used Markov-switching with time-varying probabilities to empirically model currency crises. Cerra and Saxena (2002) use a Markov-switching model to look at the 1997 Indonesian crisis and investigate whether the crisis was due to domestic factors, monsoonal factors, or pure contagion from neighboring countries. Their model differs from the one explored here, mainly because the only variable that affects the time-varying probability in their model is a measure of contagion, based on exchange market pressure in neighboring countries. Fundamentals in their model affect only the mean of the exchange rate. In contrast, our view is that domestic and external fundamentals affect the probability of a crisis occurring and, hence, should enter into the time-varying probability equation rather than affecting only the level of the exchange rate.

The most closely related work is by Martinez-Peria (2002), who also estimates a Markov switching model with time-varying probabilities to model speculative attacks on the European Monetary System (EMS), using data from 1979 to 1993. That paper evaluated the ability of the Markov-switching model to identify crisis episodes,5 and assessed the degree to which five variables—domestic credit growth, the import-export ratio, the unemployment rate, the fiscal deficit, and interest rates—determined crisis vulnerability in the EMS. We extend this work and focus primarily on the use of the model as an early-warning system. First of all, we begin by looking at a wider set of 22 early-warning indicators. In addition to the standard macroeconomic indicators used in other early-warning systems, we also explore indicators relating to the characteristics of capital flows and to financial sector soundness. Second, the predictive ability of the model is assessed both in sample and out-of-sample. Finally, the Martinez-Peria study assumed that the parameters of the model were uniform across countries and pooled the data to get parameter estimates. This is probably an innocuous assumption for the set of advanced economies in her study, which are broadly similar. But for developing countries, such an assumption might not hold. If a country is relatively more open or has fewer capital controls than other countries, for example, the coefficients on measures of external imbalance may be larger. In this paper, the model is estimated separately for each of the five Asian crisis countries (Indonesia, Korea, Malaysia, the Philippines, and Thailand).

With the usual caveat that all early-warning systems are far from perfect and serve only to synthesize information and supplement more in-depth country knowledge, the model does a good job of anticipating crises. It correctly anticipates two-thirds of crisis periods in the sample and, just as important, sends many fewer false alarms than existing models. In the January 2000–July 2001 out-of-sample period, no warning signals are sent for three of the countries (Korea, Thailand, and Indonesia), but vulnerabilities were signaled for Malaysia and the Philippines in mid-2001, mainly owing to a decline in competitiveness and a slowdown in exports.

The other significant contribution of this paper is a detailed survey of the recent empirical literature on currency crises. Covering 26 studies written since 1998, the survey is meant to complement the survey of the pre-1997 empirical literature provided by KLR. We contrast the studies in terms of the crisis definition they use; the geographical coverage, time period, and frequency of their datasets; and the methodologies used. This survey can be found in Section II. Section III describes the Markov-switching model with time-varying probabilities in detail. The data used in the estimation are described in Section IV, and Section V presents the estimation results and a country-by-country analysis. Section VI assesses the model’s predictive ability both in sample and out-of-sample, and Section VII concludes.

II. A Survey of the Post-1997 Empirical Literature on Currency Crises

This section summarizes the results of 26 selected empirical studies of currency crises written since 1998.6 The table in Appendix I provides a summary of the studies. The first row lists the datasets used in the studies, in terms of country coverage, frequency of the data used, and the sample period. The second row gives the definition of currency crisis used in the study. The remaining rows describe the indicators examined, the methodological approach used, and the main findings of each of the studies. The survey is organized thematically, first comparing the various crisis definitions used, followed by a discussion of their datasets, and concluding with a more thorough discussion of methodology, indicators used, and the main conclusions.

A. An Assortment of Crisis Definitions

KLR already noted a large variation in the way “crisis” is defined in the pre-1997 studies. If anything, this variation has increased in recent studies, as can be seen in the second row of the Appendix table. Only nine of the studies surveyed use the speculative pressure index in what has come to be considered the standard form, consisting of a weighted average of nominal exchange rate, reserve and interest rate changes (the latter frequently dropped due to lack of market-determined data), and converted into a binary crisis variable using a threshold specified using a sample-dependent standard deviation.7 Even among these, there is considerable variation with regard to the inclusion of interest rate changes, the weighting of the various components, the threshold used to define the binary variable, and the treatment of high-inflation episodes.

Four other studies use variants of the speculative pressure index approach. Burkart and Coudert (2000) combine the KLR crisis dates with those identified using the dates of Milesi-Ferretti and Razin (1998), which are based solely on large and accelerating exchange rate depreciations, and further amend these dates using “expert judgment.” Ghosh and Ghosh (2002) concentrate solely on “deep” currency crises, which they define as crises resulting in a decline in the GDP growth rate of at least 3 percentage points. Kamin, Schindler, and Samuel (2001) use real as opposed to nominal exchange rate changes in the speculative pressure index, eliminating the need to treat high-inflation countries separately. And Zhang (2001) avoids averaging and weighting issues altogether, by treating exchange rate and reserve changes separately, identifying crises when either of the two crosses a sample-dependent threshold.

Seven of the studies focus exclusively on successful speculative attacks, which result in a significant depreciation.8 But there is even greater variation within this class, as regards whether to use real or nominal depreciations, and how large and rapid the depreciation must be to qualify as a crisis.

Finally, six of the studies eschew the use of a binary crisis variable altogether.9 The primary rationale for doing so is the loss of information that comes from transforming a continuous variable into a binary one. Why should speculative pressure just below the threshold be coded the same as a completely tranquil period?10 And why shouldn’t one differentiate between moderate speculative attacks which are just above the threshold and extreme ones which far exceed it? Thus, four of these studies use the continuous speculative pressure index as their dependent variable. The remaining two transform the index into a score with a bounded range: in Hawkins and Klau (2000), the crisis measure is discrete and ranges from -10 to 10, while in Eliasson and Kreuter (2001) an extreme-value distribution is fitted to tail-end observations and its cumulative distribution function (cdf), which is bounded between 0 and 1, is used as the continuous dependent variable.11

B. Differences in Coverage

The empirical studies surveyed differ widely in regard to their geographical coverage. All of the studies except one use a panel of countries, and all focus on emerging market economies, with only a fourth of the studies also including advanced economies in their datasets. And although most of the studies have a global orientation, trying to draw lessons that can be generalized across regions, seven of the studies focused on specific regions—Bruggemann and Linne (2000) and Krkoska (2000) study transition economies, Herrera and Garcia (1999) look exclusively at Latin American economies, and Cerra and Saxena (2002), Kwack (1999), Nag and Mitra (1999), and Zhang (2001) analyze the Asian crisis countries.

Unlike some of the studies surveyed in KLR which investigate crises as far back as the early 1950s, all of the recent papers focus only on the last three decades, and in fact only one-fourth have data going back as early as the 1970s. The remainder are split evenly between those that use data from the 1980s onwards, and those that focus exclusively on the crises in the 1990s. Data availability was the primary determinant of the time periods covered in the studies, and also determined the frequency of the data. Unsurprisingly, most studies indicate a clear preference for higher frequency data, both for a more precise dating of crisis episodes and also since the usefulness of an early-warning system depends largely on the timeliness of the data it is based on. Sixteen of the studies use monthly data, and another three use quarterly data. The seven studies that rely on annual data do so either because their focus is on a cross-sectional comparison, or because they utilize special data which is unavailable at a higher frequency. Two papers that fall into the latter category are Ghosh and Ghosh (2002), who look at the impact of institutional variables on crisis likelihood, and Kaufmann, Mehrez, and Schmukler (2000), who use an annual survey of local managers to gauge the extent to which they have private information about a country’s vulnerability to crisis.

We now turn to a more detailed description of the various studies, which can be grouped into two categories based on the methodology employed. The first group consists of studies which use one of the two standard approaches in the literature—either the indicators approach or limited dependent variable probit/logit modeling—but introduce some novelty, such as a new set of indicators or a new method for assessing performance. The other group consists of studies which utilize methodological alternatives to the standard approaches.

C. Standard Methodologies, with a Twist

Roughly half of the studies use one of the two standard methodologies, but introduce a novel twist in the form of a new set of indicators, an alternative transformation for the indicators, a different crisis definition, or a new approach to assessing the results.12 Berg and Pattillo (1999) revisit the indicators model of KLR, adding two new indicators—the ratio of M2 to reserves, and the current account to GDP ratio—and updating the data through 1996 to see whether the Asian crisis could have been anticipated. They then compare this predictive performance with three probit-based alternatives: one that uses the KLR signals as right-hand side inputs, a second that uses piecewise linear functions (which allows for a more general nonlinearity than the step function used in KLR), and a third using a simple linear specification for the variables. They find that the probit model which uses untransformed variables does better than both the signals-based and piecewise linear probit models in predicting the Asian crisis out-of-sample. The piecewise linear probit model fits better than the KLR probit model in sample, but its poorer out-of-sample performance indicates that this is probably due to overfitting. Berg and Pattillo are also very thorough in comparing the models, using various measures of predictive power including various accuracy scores such as the quadratic probability score, as well as goodness-of-fit based on the percentage of correctly called crises relative to false alarms.
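The two kinds of evaluation measure mentioned here can be sketched as follows. The quadratic probability score is written in its usual form, the average of 2(P_t − R_t)² over observations, where P_t is the forecast probability and R_t the realized binary outcome; it ranges from 0 (perfect) to 2. The goodness-of-fit shares, the 0.5 cutoff, and the function names are assumptions for illustration rather than the exact definitions used by Berg and Pattillo.

```python
def quadratic_probability_score(probs, outcomes):
    """Average of 2 * (P_t - R_t)^2; 0 is a perfect score, 2 is the worst."""
    n = len(probs)
    return sum(2.0 * (p - r) ** 2 for p, r in zip(probs, outcomes)) / n

def signal_table(probs, outcomes, cutoff=0.5):
    """Share of crises called, and share of alarms that were false, at a given cutoff."""
    alarms = [p >= cutoff for p in probs]
    hits = sum(1 for a, r in zip(alarms, outcomes) if a and r == 1)
    false_alarms = sum(1 for a, r in zip(alarms, outcomes) if a and r == 0)
    n_crises = sum(outcomes)
    n_alarms = sum(alarms)
    return {
        "share_crises_called": hits / n_crises if n_crises else float("nan"),
        "share_false_alarms": false_alarms / n_alarms if n_alarms else float("nan"),
    }

probs = [0.9, 0.2, 0.6, 0.1]  # hypothetical forecast probabilities
outcomes = [1, 0, 0, 1]       # hypothetical realized crisis indicators
qps = quadratic_probability_score(probs, outcomes)
table = signal_table(probs, outcomes)
```

The score uses the full forecast probability, whereas the table depends on the chosen cutoff, which is one reason to report both.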

Edison (2000) also provides a very thorough assessment of the KLR methodology, by updating and conducting out-of-sample forecasts, and by submitting it to a battery of various sensitivity tests. She finds that the performance of the various indicators is reasonably robust, although there are differences in the performance of various indicators across regions. She also identifies some shortcomings of the various crisis identification methods, including a documentation of disappearing crises due to the use of sample-dependent thresholds. Finally, she examines the usefulness of the model for monitoring a single country over time, using Mexico as an example, and for doing cross-country vulnerability comparison at a point in time.

A third paper which, along with Berg and Pattillo (1999) and Edison (2000), sets the standard for the proper assessment of early-warning systems, is Kumar, Moorthy, and Perraudin (2002), who use a standard logit model to construct a predictive model for currency crashes. They focus solely on successful speculative attacks, and hence use only exchange rate changes to define crisis episodes. They use various accuracy scores, goodness-of-fit tables, and a much larger hold-out sample (one-third of total observations) for out-of-sample evaluation. One of the non-statistical methods they employ for evaluating their model is whether a trading strategy based on its predictions can be profitable.

Kamin, Schindler, and Samuel (2001) use annual data on 26 emerging markets from 1981–1999 to investigate the relative influence of domestic and external factors in bringing about currency crises. In contrast to other studies, they construct their speculative pressure index using real instead of nominal exchange rate changes, eliminating the need for treating high-inflation episodes separately. Five domestic indicators, four external balance indicators, and three external shock indicators are used. Although they use a standard probit approach with indicators that have been explored in other studies, their contribution lies in their decomposition of probability changes into those brought about by domestic factors and those due to external factors. They find that external balance and external shock variables contribute little to the average probability of crisis, but account for a significant portion of the increases or spikes in the probabilities during the crisis years themselves. They interpret this as evidence that domestic factors are still the primary determinants of vulnerability to crisis, but that it is external factors that often push vulnerable countries over the edge.

The next group of studies explore new measures and explanatory variables that contribute to vulnerability. Caramazza, Ricci, and Salgado (2000) investigate the importance of trade and financial linkages in the transmission of currency crises across countries. Critiquing the trade linkage measures used by Glick and Rose (1998), they construct a new measure of trade linkages based on both the price effects (via the REER) and income effects (via growth slowdowns in partner countries) due to crises in other countries.13 Financial linkages are measured both through common creditor channels (using data from BIS-reporting banks), as well as by stock market correlations with the original crisis country. After controlling for standard economic factors such as overvaluation, current account imbalances and output growth, they find that financial linkages in the form of a common creditor channel significantly increase the likelihood of contagion. Trade linkages, on the other hand, are found to be (marginally) insignificant. However, when the trade linkage measures are interacted with the current account, the interaction term becomes significant, indicating a strong trade spillover effect for countries with already weak external positions.

Bussiere and Mulder (1999a) study the effect of political instability on crisis depth, the latter being measured by a weighted average of exchange rate and reserve changes. To do this, they supplement Tornell’s (1999) cross-section analysis of the Tequila and Asian crises with four measures of political instability. These include two measures of fragmentation (in the legislature and in the ruling coalition), one measure of electoral uncertainty, and dummy variables for election dates. They find the first two to be insignificant and the latter two to be significant and robust. Interestingly, they find that post-election periods, even more than pre-election periods, bring about greater vulnerability. A second paper (Bussiere and Mulder 1999b) finds that the presence of an IMF-supported program significantly reduces crisis depth, even beyond any positive effect a program might have on included fundamentals.

A third, more recent paper (Mulder, Perrelli, and Rocha 2002) investigates the role of corporate balance sheet indicators, as well as governance standards as measured by the strength of creditor and shareholder rights. These indicators are tested in both the probit model of Berg and Pattillo (1999) to assess their impact on crisis likelihood, as well as in the Sachs, Tornell, and Velasco (1996) cross-sectional analysis to assess their impact on crisis depth. They find that high leverage and short maturity structures increase both the likelihood of crises as well as the depth of crises. The impact of these balance sheet variables is greater when bank credit to the corporate sector is large, suggesting that corporate weaknesses are transmitted through the banking sector. Shareholder rights also have a large impact on crisis probabilities.

Another study that looks at new explanatory variables is Kaufmann, Mehrez, and Schmukler (2000), who explore whether local managers have an informational advantage over other market participants. They first look at descriptive evidence from financial markets. Mutual fund holdings did not decrease, except in Malaysia, before the Asian crisis, indicating that the crisis was not anticipated by these funds. Similarly, BIS bank lending did not decrease, although BIS bank deposits by residents of Korea and Thailand increased pre-crisis, indicating capital flight in anticipation of the crisis. There was some evidence that currency forecasters expected a slight depreciation in Thailand, but not in the other Asian crisis countries. Finally, credit rating downgrades followed rather than preceded the crisis. They then examine more direct evidence of local managers’ views, taken from the Global Competitiveness Survey. Here as well, there is evidence of deteriorating local sentiment in Thailand in December 1996, but not in other countries. Kaufmann, Mehrez, and Schmukler undertake an econometric analysis by extracting the private information of local managers— measured by the residual from an ordered probit regression of survey opinions on macroeconomic variables—and find that this private information is a significant predictor of exchange rate volatility.

Weller (2001) examines whether financial liberalization affects the degree to which a country is vulnerable to a currency crisis. He does this first by comparing tranquil and crisis period means of various indicators, both before and after financial liberalization, as well as comparing pre- and post-liberalization crisis periods directly. He then runs separate logit regressions on the pre- and post-liberalization subsamples, and finds significant differences between the two estimates. Sensitivity to changes in vulnerability indicators—especially short-term loans to reserves and real overvaluation—increases significantly after liberalization.

Grier and Grier (2001) investigate whether the exchange rate regime of a country affects the degree of depreciation, or the returns of the stock market. Examining a cross-section of 25 countries in 1997, they find that after controlling for other factors, countries with a peg depreciated more than those without one. Countries with a peg also had lower stock returns. Finally, Kwack (2000) investigates how much of a role external factors (as measured by LIBOR) and financial fragilities (as measured by non-performing loan ratios and corporate leverage ratios) played in the Asian crisis, by running a panel regression on annual 1995-1997 data from seven Asian countries. Despite the limited number of observations (fourteen) in the OLS regression, he finds the LIBOR and the nonperforming loan ratio significant in explaining the variance of the exchange market pressure index. A second regression finds that the non-performing loan ratio can be explained by the debt-equity ratio in the corporate sector. These two results are then combined into a semi-reduced form OLS regression, where the crisis index is significantly related to LIBOR and to the corporate debt-equity ratio; these two alone are found to explain 85 percent of the variance in the crisis index.

Bruggemann and Linne (2000) investigate whether the indicators approach can be used to cover a qualitatively very different set of countries—the transition economies, more specifically the EU accession countries plus Russia and Turkey. In addition to assessing the usefulness of conventional indicators for these economies, they also analyze indicators of capital flight risk and banking sector fragility. They find that the standard macroeconomic variables, particularly dwindling reserves, an overvalued exchange rate and a rising budget deficit, are useful predictors for the transition economies as well, but they also find that some banking sector indicators—the ratio of lending to deposit rates, and the size of bank deposits relative to GDP—also have predictive power.

The final study in this group, Eliasson and Kreuter (2001), uses a standard limited dependent variable (logit) methodology, but they question the use of a binary crisis measure and suggest an alternative continuous crisis measure based on a five-step procedure. This entails fitting an extreme-value distribution to the tail end of exchange rate changes, and using the c.d.f. of the distribution as a bounded measure of crisis intensity. The same procedure is applied to interest rate changes and to deviations of the interest rate from its long-term mean, and the maximum of the three c.d.f. values is used, thus avoiding aggregation via a weighted average. Finally, Eliasson and Kreuter smooth the crisis intensity index via an exponentially weighted moving average over the last six observations, with the largest of the six observations getting the residual weight. The crisis measure obtained is used as the dependent variable in a multinomial logit model. They find that the continuous crisis variables provide a more accurate description of the crises in the 1990s than the binary crisis variable, and that their estimated model performs well in explaining these events.

D. New Methodologies

Almost half of the studies in this survey explore alternative econometric specifications to the standard indicators and probit/logit models. Despite the variety of approaches proposed, most of these studies identify the same set of shortcomings in the existing approaches—which we have already enumerated in the introduction—to justify an exploration of alternatives. But few, if any, of the proposed alternatives are able to address all these shortcomings. Several still rely on the crisis dating methodology of the standard approaches. Others are more successful at addressing the weaknesses of the standard models, but have problems of their own. What should be clear after surveying these new approaches is that although none of the models is perfect, each has its own strengths, and an awareness of the advantages and disadvantages of each model is essential so that practitioners can select the proper model for a given task. Indeed, one of the objectives in compiling this survey was to increase awareness of the existence of these models, many of which have remained as unpublished working papers. We now turn to a discussion of the individual studies.14

Nag and Mitra (1999) use an artificial neural network (ANN) to construct an early-warning system for currency crises, and compare its performance to the indicators approach using monthly data for Indonesia, Malaysia and Thailand from 1980-1998. The primary advantages of ANNs are their flexible specification and their ability to capture complex interactions among variables. However, this flexibility can also be a drawback, as the danger of overfitting is much greater than in other methodologies, given the large number of variables and the ability of the neuron layers to fit the data. Another drawback is the “black box” nature of ANNs. Because there are no coefficient estimates, and the interactions among the variables can be very complicated, it is difficult to determine which indicators are behaving abnormally and driving the forecast probabilities.15

In their country-by-country replication of the KLR approach using sixteen indicators, Nag and Mitra find that different indicators are useful for different countries; somewhat surprisingly, they rarely find real overvaluation to be a significant indicator. They then estimate a different ANN model for each country, using a genetic algorithm to train the models. Because lags of anywhere from zero to four quarters are also allowed, the final models for each country have a large number of variables, from 13 in Indonesia to 23 in Malaysia. Inadequate information is given regarding the construction of the ANN, such as the number of hidden layers and hidden neurons, the transformation function used, or the parameters of the genetic algorithm used to train the ANN. They do not report in-sample forecasts for the model, which is estimated up to end-1996, but their out-of-sample results show very high crisis probabilities (in the vicinity of 80 percent) for all three countries in the months before the crisis broke. A more thorough evaluation is needed for this potentially very promising approach.

Collins (2001) uses a latent variable threshold model to study the timing of currency crises. She assumes that a crisis occurs when some unobservable process crosses a threshold, where the latent variable is assumed to follow Brownian motion with drift. Conditional on the drift factor, the distance to the threshold and the variance of the Brownian motion, the probability of crisis occurrence has an inverse Gaussian distribution. The distance and drift factors are modeled as linear functions of five standard indicators, with overvaluation, the CA/GDP ratio and short-term debt to reserves affecting distance to the threshold, and export and reserves growth affecting the drift factor. Estimating the model using monthly data for 25 emerging markets from 1985-1988, she finds that short-term debt influences distance, while reserve growth influences the drift factor. Interestingly, the drift factor is always negative, indicating that the process is moving away from the crisis threshold over time. Collins also tests the model against two alternative specifications, a probit model and a Poisson model, and finds that the threshold model fits better than either alternative.

Blejer and Schumacher (1998) propose using a value-at-risk approach to analyze central banks’ balance sheets and assess solvency risk, which can reduce the credibility of an exchange rate peg. They consider exposure to several risk factors, including volatility in exchange rates, international interest rates, and country risk. The methodology is developed in detail, but Blejer and Schumacher refrain from estimating the model using existing data, instead laying down guidelines on how this approach might be made operational.

Vlaar (2000) also takes a different methodological approach to early-warning systems. First, he uses the continuous crisis index itself instead of a binary crisis dummy. Second, he models the crisis index as being drawn from a mixture of two normal distributions. The mean and volatility of each distribution are modeled as functions of various indicators, as is the relative weighting of the two distributions. He finds that inflation, overvaluation and reserve losses are the most significant determinants of vulnerability, but in addition, he finds a substantial degree of dynamics: past exchange rate and reserve deterioration and volatility are themselves significant predictors of future vulnerability. Regional exchange rate volatility also affects vulnerability, suggesting the possibility of contagion. Both in-sample and out-of-sample, the model is able to predict about 7 of 8 crises, but at the expense of sending false alarms through approximately a third of tranquil periods.

Zhang (2001) proposes using the autoregressive conditional hazard (ACH) model developed by Hamilton and Jorda (2002). In essence, it is simply an addition of a new explanatory variable (the duration of the last tranquil period) and an alternative functional form for the crisis probability: $\Pr(\text{Crisis}_{i,t}=1) = \dfrac{1}{C + \alpha d_{i,t-1} + \gamma Z_{i,t}}$, where $d$ gives the duration or length of the last tranquil period and $Z$ is a vector of standard indicators.16 Using monthly data for Indonesia, Korea, the Philippines and Thailand from 1993-1997, Zhang estimates both a probit model and the ACH model and asserts that the ACH model fits the data better, based on a comparison of log likelihoods; however, such a comparison is invalid since the models are not nested. Further, he finds that none of the standard vulnerability indicators is significant, but that the duration variable is highly significant. A contagion measure is also constructed, based on the duration of the most recent tranquil period in any of the four countries, and not only is it significant, but it also drives out the effect of the domestic duration variable.

It is difficult to assess the generality of Zhang’s findings, since the model is estimated on a very short sample. Not surprisingly, the fitted model is able to predict the crises in Korea, Indonesia and the Philippines on the basis of the onset of the crisis in Thailand. Zhang identifies May 1997 as the starting date of Thailand’s crisis, since he adopts a different crisis dating methodology, treating exchange rate and reserve changes separately and using a three-year moving window for computing the standard deviation threshold.

Adopting a more standard econometric approach, Krkoska (2001) estimates a restricted VAR to analyze vulnerability in four transition countries. An index of speculative pressure is constructed, and the continuous index enters the VAR along with four other endogenous variables: the real exchange rate, industrial production, FDI, and the current account. Five exogenous variables also enter the specification: the CA-FDI gap, growth in real domestic credit, inflation, the DM-US$ exchange rate, and industrial production in the EU. Given the limited data (quarterly data from 1994-1999) and the large number of parameters, many ad hoc restrictions were imposed for the VAR to be tractable. Krkoska finds that the gap between the current account and FDI is the most significant predictor of crisis vulnerability; in every instance in which this gap exceeded 5 percent of GDP, a crisis ensued. Overvaluation and a slowdown in the EU are also found to be significant predictors of vulnerability.

Burkart and Coudert (2000) utilize Fisher discriminant analysis in their analysis of currency crises. Discriminant analysis aims to classify a dependent variable into one of K given states, based on information from a set of predictor variables. The principle underlying the analysis is to determine whether the K states differ with regard to the mean of a variable, and then to use that variable to construct predictions for the states.17 In this study, the sample is divided into K=2 states, where the four quarters that precede each crisis are one state (“crisis”), and all other periods are the other state (“tranquil”). Burkart and Coudert start with a set of 34 potential indicators and reduce this set, based on performance, to six indicators: the ratio of reserves to both M2 and debt, the ratio of short-term debt to total debt, real overvaluation, inflation, and a regional contagion indicator. They find that the discriminant model based on these indicators yields relatively good performance: four out of five crises are predicted correctly and only one out of five non-crises results in a false alarm.

In contrast to the other studies surveyed here, Hawkins and Klau (2000) are not interested in the estimation of a model. Rather, their objective is to present vulnerabilities in a simple, transparent manner. They do this by transforming various indicators of external and banking sector vulnerability into discrete scores, where higher scores indicate increased vulnerability. For example, four continuous indicators for external vulnerability are individually transformed into discrete measures, taking on five possible values: -2, -1, 0, 1 and 2. These scores are then summed, using weights of 1.25 so that the score falls within an interpretable range of -10 to 10. A similar transformation is also done to compute speculative pressure itself, using reserve changes, real interest rate changes, and exchange rate changes at both the 3-month and 12-month horizons. But in its essence, the Hawkins and Klau proposal is just a variant of the KLR approach. The only differences are that (i) they classify indicators into five categories, as opposed to the binary 0-1 categories of KLR; (ii) for the speculative pressure index, they categorize and then sum, whereas KLR sum first and then categorize into a 0–1 crisis dummy; and (iii) Hawkins and Klau sum their indicators using equal weights, while KLR weight their indicators based on predictive performance. Finally, although a five-category variable is more informative than the binary transformation done in the signals approach of KLR, the thresholds used to create the five categories are arbitrary, unlike the threshold used in KLR which is chosen to maximize the signal-to-noise ratio.
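The scoring arithmetic described above can be illustrated with a minimal Python sketch. The cutoff values and the function names `discretize` and `external_vulnerability_score` are hypothetical and chosen only for illustration; the thresholds actually used by Hawkins and Klau are not reproduced here.

```python
from bisect import bisect_right

def discretize(value, cutoffs):
    """Map a continuous indicator to a score in {-2, -1, 0, 1, 2}.

    `cutoffs` is an ascending list of four hypothetical thresholds;
    a value that exceeds more thresholds gets a higher (more vulnerable) score.
    """
    if len(cutoffs) != 4:
        raise ValueError("exactly four cutoffs are needed for five categories")
    return bisect_right(cutoffs, value) - 2

def external_vulnerability_score(values, cutoffs_per_indicator, weight=1.25):
    """Weighted sum of four discrete scores, which lies between -10 and 10."""
    scores = [discretize(v, c) for v, c in zip(values, cutoffs_per_indicator)]
    return weight * sum(scores)

# Hypothetical cutoffs, identical for all four indicators for simplicity.
cutoffs = [[-1.0, -0.5, 0.5, 1.0]] * 4
most_vulnerable = external_vulnerability_score([2.0, 2.0, 2.0, 2.0], cutoffs)
least_vulnerable = external_vulnerability_score([-2.0, -2.0, -2.0, -2.0], cutoffs)
mixed = external_vulnerability_score([2.0, 0.0, -0.7, 0.6], cutoffs)
```

With four indicators each scored between -2 and 2, the weight of 1.25 is what stretches the raw sum (between -8 and 8) to the range of -10 to 10.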

Several of the studies surveyed emphasize the importance of potential interactions among various indicator variables. Nitithanprapas and Willett (2000) are critical of other empirical studies for failing to account for these interactions. They note, for example, that current account deficits which are due to overvaluation are worrisome, while those that are due to inflows of FDI are probably less dangerous. Using a slope dummies regression approach, they test the Lawson dogma that current account deficits are worrisome only if caused by a fiscal deficit, as well as other hypotheses linking current account deficits to real overvaluation and to FDI inflows. They also test whether lending booms increase vulnerability to a crisis, and whether adequate reserves (measured against both M2 and short-term debt) mitigate these risks. They find no support for the Lawson dogma, at least in the context of their sample (the Mexican and Asian crises), but they do find evidence of interactions between the current account, overvaluation and FDI. They also find the effect of lending booms significant, and that adequate reserves lessen crisis vulnerability.

In addition to addressing the issue of interactions in their study, Ghosh and Ghosh (2002) introduce several additional innovations. They focus exclusively on “deep” currency crises, defined as currency crises—identified using a standard speculative pressure index and threshold—which result in an appreciable decline in GDP growth of at least 3 or 5 percentage points. They also expand the range of indicators by supplementing five standard macroeconomic variables with two corporate leverage ratios and, more uniquely, a slew of institutional quality variables, including six measures of rule of law, eight of shareholders’ rights, and five of creditors’ rights, each of which is aggregated using principal components analysis. Most importantly, they explore interactions among these variables using a methodology called binary recursive tree, which works as follows. First, thresholds for each indicator are identified that minimize (the sum of) Type I and Type II errors. The sample is then split into two “branches” using the threshold of the best indicator. These two steps are then repeated to construct sub-branches, and the process is continued until some stopping rule (which penalizes overfitting via too many branches) is satisfied. In their sample, countries with poor public sector governance are much more likely to have a crisis; and of those with poor public sector governance, those which also have current account deficits greater than 2.6 percent of GDP are more vulnerable than others, and so on. These interactions are not easily captured in the linear structure of standard probit models.
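The first two steps of the binary recursive tree can be sketched in Python as follows. This is only an illustration under stated assumptions: the rule "signal a crisis when the indicator exceeds the cutoff", the definition of Type I errors as missed crises and Type II errors as false alarms, the use of error rates rather than counts, and the names `best_threshold` and `best_split` are all choices made here, not details taken from Ghosh and Ghosh. The recursion into sub-branches and the stopping rule are omitted.

```python
def best_threshold(values, crisis):
    """Cutoff on one indicator that minimizes Type I + Type II error rates.

    Assumed rule: signal a crisis when value > cutoff.
    Type I rate  = share of crisis observations that are missed.
    Type II rate = share of non-crisis observations that are flagged.
    Returns (combined error, cutoff).
    """
    n_crisis = sum(crisis)
    n_calm = len(crisis) - n_crisis
    if n_crisis == 0 or n_calm == 0:
        raise ValueError("need both crisis and non-crisis observations")
    best = (float("inf"), None)
    for cutoff in sorted(set(values)):
        missed = sum(1 for v, c in zip(values, crisis) if c and v <= cutoff)
        false_alarms = sum(1 for v, c in zip(values, crisis) if not c and v > cutoff)
        error = missed / n_crisis + false_alarms / n_calm
        if error < best[0]:
            best = (error, cutoff)
    return best

def best_split(indicators, crisis):
    """Indicator (and cutoff) with the lowest combined error; defines the first branch."""
    results = {name: best_threshold(vals, crisis) for name, vals in indicators.items()}
    name = min(results, key=lambda k: results[k][0])
    return name, results[name][1], results[name][0]

# Made-up data: six observations, two hypothetical indicators.
indicators = {
    "ca_deficit": [1.0, 3.0, 4.0, 0.5, 5.0, 2.0],
    "inflation": [2.0, 1.0, 3.0, 2.5, 1.5, 3.5],
}
crisis = [0, 1, 1, 0, 1, 0]
split = best_split(indicators, crisis)
```

Applying `best_split` again to the observations on each side of the chosen cutoff would produce the sub-branches described in the text.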

Another study that uses classification rules is Osband and Van Rijckeghem (2000), who identify values for indicators which keep a country safe from currency crises. Using 18 monthly vulnerability indicators for 31 emerging markets over the period 1985-1998, their objective is the identification of safe or near-safe regions by the use of three types of filters: simple filters (“X>g1”), intersection filters (“X>g1 AND Y>g2”), and linear combination filters (“aX+bY>g3”). Filters are assessed based on their ability to separate or extract tranquil periods from crisis periods in sample, as well as their “marginal extraction” rate, i.e. their ability to extract tranquil periods which are not extracted by other filters. They find that a set of nine filters (three simple, five intersection and one linear combination) is able to identify 47 percent of tranquil periods as being safe or near-safe. External debt and reserve adequacy feature heavily in these filters. This model and the binary recursive tree model of Ghosh and Ghosh are useful complements to the standard models. Osband and Van Rijckeghem suggest that their model can be used as the first part of a two-stage early-warning system: to initially identify which countries are safe or near-safe based on these filters, and then to identify the degree of vulnerability using more standard predictive early-warning systems.
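The three filter types and a simple extraction measure can be sketched in Python. The data, cutoffs and function names below are invented for illustration, and the extraction measure (the share of tranquil observations marked safe, together with a count of crisis observations wrongly marked safe) is an assumed simplification of the assessment Osband and Van Rijckeghem describe.

```python
def simple_filter(x, g1):
    """Safe if X > g1."""
    return x > g1

def intersection_filter(x, y, g1, g2):
    """Safe if X > g1 AND Y > g2."""
    return x > g1 and y > g2

def linear_filter(x, y, a, b, g3):
    """Safe if aX + bY > g3."""
    return a * x + b * y > g3

def extraction_stats(safe_flags, crisis):
    """Return (share of tranquil periods marked safe, crisis periods marked safe)."""
    tranquil_total = sum(1 for c in crisis if not c)
    tranquil_safe = sum(1 for s, c in zip(safe_flags, crisis) if s and not c)
    crisis_safe = sum(1 for s, c in zip(safe_flags, crisis) if s and c)
    return tranquil_safe / tranquil_total, crisis_safe

# Made-up observations for two hypothetical indicators.
reserves_to_debt = [2.5, 0.8, 1.9, 0.6, 3.0]
export_growth = [5.0, -2.0, 1.0, -4.0, 0.5]
crisis = [0, 0, 0, 1, 0]

flags_simple = [simple_filter(x, 1.5) for x in reserves_to_debt]
flags_both = [intersection_filter(x, y, 1.5, 0.8)
              for x, y in zip(reserves_to_debt, export_growth)]
stats_simple = extraction_stats(flags_simple, crisis)
stats_both = extraction_stats(flags_both, crisis)
```

A marginal extraction rate could be computed in the same way by counting only tranquil periods that a filter marks safe and that no previously selected filter has already marked safe.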

The final study that emphasizes interactions among indicators is Apoteker and Barthelemy (2001). Their model is based on the evaluation of five “fundamental balance” charts. These are scatterplots that illustrate certain aspects of a country’s economy; for example, the growth balance scatterplot combines a domestic growth indicator (GDP per capita growth) with an external balance indicator (current account balance). A genetic algorithm is used to identify quadrants in the fundamental balance charts which are most associated with four types of crises: transfer crises, liquidity crises, exchange rate crises and cyclical development crises. Apoteker and Barthelemy define exchange rate crises as movements of the real exchange rate by at least 20 percent in one quarter, 30 percent in two quarters, or 40 percent between three and six quarters. The genetic algorithm provides a set of conditions which are associated with crises, and vulnerability is measured by how many of these conditions are satisfied at a given point in time. Precise definitions of the indicators used are absent, and there is no formal testing of the model. Instead, charts are presented indicating Mexico’s vulnerability to a cyclical crisis—but not to an exchange rate crisis—from the second quarter of 1994 onwards. A similar result is found for Thailand in 1997.

III. A Markov-Switching Approach to Early-Warning Systems

Regime switching models have long been a tool available to empirical economists, with early work on these models going back to Quandt (1958), Goldfeld and Quandt (1973), and Hamilton (1990). Applications have only become common in the last decade, however, with the advent of greater computing power. Markov-switching models with constant transition probabilities have been applied to interest rates (Hamilton 1988), the behavior of GNP (Hamilton 1989), stock returns (Cecchetti, Lam and Mark 1990), and floating exchange rates (Engel and Hamilton 1990).18 One serious limitation of the earlier Markov-switching models, however, was the restriction of constant transition probabilities. The baseline model was thus extended to allow for time-varying transition probabilities, by Lee (1991) and Diebold, Weinbach and Lee (1994) and used to model long swings in the dollar-pound rate, as well as by Filardo (1993, 1994) to analyze business cycle phases.

As mentioned in the introduction, there are two primary motivations for using a Markov-switching model with time-varying probabilities in modeling speculative attacks. First, one can avoid the many ad hoc assumptions required in the standard models. Even if, as many of the studies claim, their results are robust to these ad hoc assumptions, we believe there is virtue in simplicity. Second, using exchange rates or the index of speculative pressure directly avoids the loss of information that results when these variables are transformed into a binary crisis dummy variable. In particular, exchange rate dynamics may themselves be informative about the likelihood of a large speculative attack. A small increase in volatility (e.g., from a widening of an exchange rate band) or small devaluations in the span of a few months might foreshadow a coming currency crisis, but this information remains unutilized (and in fact is erased by the threshold dating process) in the standard approaches. As we will see below, even small changes in exchange rate behavior are utilized in a regime-switching framework as signs of increasing speculative pressure.

There are three disadvantages to using Markov-switching models. The first is computational; Markov-switching models with time-varying probabilities are still not part of the standard econometric software packages. But this drawback has become minor, as more researchers use the methodology and make their code available, and since software programs such as EViews now allow the creation of general log likelihood objects.19 A second drawback is the difficulty in testing Markov-switching models against the null of no switching, as one encounters problems with unidentified nuisance parameters (the coefficient parameters in the transition probability matrix), as well as with a singular information matrix. Various tests have been suggested, including Davies (1977, 1987), Hansen (1992, 1996), Hamilton (1996), Garcia (1998) and Mariano and Gong (1998) for testing a constant transition probability model against a null of no switching. For the time-varying transition probability case, one can do a sequential test: first, test the constant transition probability model against a null of no switching, and then test the time-varying transition probability model against a constant transition probability model. Note that testing the significance of individual coefficient estimates, as well as testing the overall model against a null of constant transition probabilities, can easily be done using standard t-statistics and likelihood ratio tests. The third drawback is that the likelihood surface can have several local maxima and is sometimes ill-behaved. The model may fail to converge when too many explanatory variables are included, and t-statistics may be sensitive to the choice of step size, since derivatives are calculated numerically. Thus, a judicious choice of start-up values and step size in the maximum likelihood estimation is important.

A. Model Specification and Estimation

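The latent variable in the model follows a first-order, two-state Markov chain $\{s_t\}_{t=1}^{T}$, where $s_t=1$ denotes a crisis state and $s_t=0$ denotes a tranquil state. Although $s_t$ is not directly observable, the behavior of our dependent variable $y_t$ (which can be either the nominal exchange rate change or the speculative pressure index) is dependent on $s_t$ as follows:

$$y_t \mid s_t \;\sim\; \text{i.i.d. } N\!\left(\mu_{s_t},\, \sigma_{s_t}^{2}\right) \qquad (1)$$

so that both the mean and variance of $y_t$ can shift with the regime.20 The density of $y_t$ conditional on $s_t$ is then

$$f(y_t \mid s_t) = \frac{1}{\sqrt{2\pi}\,\sigma_{s_t}} \exp\!\left(-\frac{(y_t-\mu_{s_t})^{2}}{2\sigma_{s_t}^{2}}\right) \qquad (2)$$

for $s_t = 0, 1$.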

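The latent regime-switching variable $s_t$ evolves according to the transition probability matrix $P_t$:

$$P_t = \begin{bmatrix} p_t^{00} & p_t^{01} \\ p_t^{10} & p_t^{11} \end{bmatrix} = \begin{bmatrix} F(x_{t-1}'\beta_0) & 1 - F(x_{t-1}'\beta_0) \\ 1 - F(x_{t-1}'\beta_1) & F(x_{t-1}'\beta_1) \end{bmatrix} \qquad (3)$$

where $p_t^{ij}$ is the probability of going from state $i$ in period $t-1$ to state $j$ in period $t$, and $F$ is a cumulative distribution function, most typically the logistic or the normal c.d.f. The elements of the $k\times 1$ vector $x_{t-1}$ are the early-warning indicators that can affect the transition probabilities.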
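As an illustration, the transition probability matrix could be built as in the Python sketch below. It assumes that F is the logistic c.d.f., that β0 governs the probability of remaining in the tranquil state and that β1 governs the probability of remaining in the crisis state; the function names and the numerical values are hypothetical.

```python
import math

def logistic(z):
    """Logistic c.d.f., one common choice for F."""
    return 1.0 / (1.0 + math.exp(-z))

def transition_matrix(x_prev, beta0, beta1):
    """Return P_t as [[p00, p01], [p10, p11]] given the indicators x_{t-1}.

    x_prev, beta0 and beta1 are equal-length sequences; put a leading 1.0
    in x_prev if the specification includes an intercept.
    """
    p00 = logistic(sum(b * x for b, x in zip(beta0, x_prev)))  # stay tranquil
    p11 = logistic(sum(b * x for b, x in zip(beta1, x_prev)))  # stay in crisis
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]

# Hypothetical values: an intercept plus one indicator.
P = transition_matrix([1.0, 2.0], beta0=[3.0, -1.0], beta1=[0.0, 0.0])
```

In this example the negative coefficient on the indicator in `beta0` means that a higher indicator value lowers the probability of remaining in the tranquil state.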

One final quantity needed to complete the model is the start-up value $p_1^1=\Pr(s_1=1)$, which gives the unconditional probability of being in state 1 at time 1. As Diebold, Weinbach and Lee (1994) note, the treatment of this quantity depends on whether $x_t$ is stationary or not. If $x_t$ is stationary, then $p_1^1$ is simply the long-run probability that $s_1=1$, which in turn would be a function of $(\beta_0, \beta_1)$. If $x_t$ is nonstationary, then $p_1^1$ is an additional parameter that must be estimated. In practice, for a long enough time series this value has a negligible effect on the likelihood function, and whether one calculates it as a function of $(\beta_0, \beta_1)$, estimates it as a separate parameter, or just sets it at a constant value makes little difference.

The estimation procedure we use is direct maximization of the likelihood, where the likelihood function is calculated using the iteration described in Hamilton (1994, pp. 692-93). Using the information available up to time $t$, denoted $\Omega_t$, we can construct $\Pr(s_t = j \mid \Omega_t;\theta)$, the conditional (filtered) probability that the $t$th observation was generated by regime $j$, for $j = 1, 2, \ldots, N$, where $N$ is the number of states (in this paper, $N=2$). Collect these conditional probabilities into an $(N\times 1)$ vector $\hat{\xi}_{t|t}$.

One can also form forecasts using the conditional (forecast) probability of being in regime $j$ at time $t+1$, given the information $\Omega_t$ available up to time $t$: $\Pr(s_{t+1} = j \mid \Omega_t;\theta)$, for $j = 1, 2, \ldots, N$. Collect these forecast probabilities in an $(N\times 1)$ vector $\hat{\xi}_{t+1|t}$. Lastly, let $\eta_t$ denote the $(N\times 1)$ vector whose $j$th element is the conditional density of $y_t$ in equation (2). These filtered and forecast probabilities are calculated for each date $t$ by iterating on the following equations:

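$$\hat{\xi}_{t|t} = \frac{\hat{\xi}_{t|t-1} \circ \eta_t}{\mathbf{1}'\left(\hat{\xi}_{t|t-1} \circ \eta_t\right)} \qquad (4)$$

$$\hat{\xi}_{t+1|t} = P_{t+1}'\,\hat{\xi}_{t|t} \qquad (5)$$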

where $P_t$ is the $(N\times N)$ transition probability matrix going from period $t-1$ to period $t$, described in equation (3), $\mathbf{1}$ is an $(N\times 1)$ vector of ones, and $\circ$ denotes element-by-element multiplication. Equation (4) calculates $\Pr(s_t = j \mid \Omega_t; \theta)$ as the ratio of the joint distribution $f(y_t, s_t = j \mid \Omega_{t-1};\theta)$ to the marginal distribution $f(y_t \mid \Omega_{t-1};\theta)$, where $\Omega_{t-1}$ is the information available up to time $t-1$, the latter being obtained by summing the former over the states $1, 2, \ldots, N$. Equation (5) implies that once we have our best guess as to what state we are in today, we just pre-multiply by the transpose of the transition probability matrix $P$ to obtain the forecast probabilities of being in various states in the next period.

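Given an initial value for the parameters, $\theta$, and for $\hat{\xi}_{1|0}$, which in our model is just $[1-p_1^1,\; p_1^1]'$, we can then iterate on (4) and (5) to obtain values of $\hat{\xi}_{t|t}$ and $\hat{\xi}_{t+1|t}$ for $t = 1, 2, \ldots, T$. The log likelihood function $L(\theta)$ can be computed from these as

$$L(\theta) = \sum_{t=1}^{T} \log f(y_t \mid \Omega_{t-1};\theta), \qquad \text{where} \quad f(y_t \mid \Omega_{t-1};\theta) = \mathbf{1}'\left(\hat{\xi}_{t|t-1} \circ \eta_t\right).$$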

One can then evaluate this at different values of θ to find the maximum likelihood estimate.
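The filtering iteration and the resulting log likelihood can be sketched in plain Python for the two-state case. This is a minimal illustration, not the estimation code used for the paper: it assumes a Gaussian density for each regime, and the function names, argument layout and numerical values are hypothetical. The matrix stored at position t of `P_seq` is the one used to move from observation t to observation t+1, with rows indexing the current state and columns the next state.

```python
import math

def normal_pdf(y, mu, sigma):
    """Gaussian density (an assumption made for this sketch)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def hamilton_filter(y, P_seq, mu, sigma, p_start):
    """Filtered and forecast regime probabilities plus the log likelihood.

    y       : list of observations
    P_seq   : list of 2x2 transition matrices, one per observation
    mu, sigma : regime means and standard deviations, indexed by state 0, 1
    p_start : start-up probability of being in state 1 at the first date
    """
    pred = [1.0 - p_start, p_start]              # start-up forecast vector
    loglik, filtered, forecasts = 0.0, [], []
    for t, y_t in enumerate(y):
        eta = [normal_pdf(y_t, mu[j], sigma[j]) for j in (0, 1)]
        joint = [pred[j] * eta[j] for j in (0, 1)]    # element-by-element product
        marginal = joint[0] + joint[1]                # density of y_t given the past
        filt = [joint[j] / marginal for j in (0, 1)]  # filtered probabilities
        P = P_seq[t]
        # Forecast step: pre-multiply by the transpose of the transition matrix.
        pred = [P[0][j] * filt[0] + P[1][j] * filt[1] for j in (0, 1)]
        loglik += math.log(marginal)
        filtered.append(filt)
        forecasts.append(pred)
    return loglik, filtered, forecasts

# Hypothetical inputs: constant transition matrix, one large depreciation.
P_const = [[0.95, 0.05], [0.20, 0.80]]
y = [0.1, -0.2, 8.0, 0.0]
loglik, filtered, forecasts = hamilton_filter(
    y, [P_const] * len(y), mu=[0.0, 5.0], sigma=[1.0, 10.0], p_start=0.1)
```

To estimate the model, the returned log likelihood would be passed (with its sign reversed) to a numerical optimizer over the parameters, trying several start-up values because the likelihood surface can have several local maxima.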

IV. Data Description and Transformation

The model is estimated using monthly data from January 1972 to December 1999 for the five Asian crisis countries: Indonesia, the Republic of Korea, Malaysia, the Philippines, and Thailand. The dependent variable in our model is the month-to-month percentage change in the nominal exchange rate. Nothing precludes the use of the speculative pressure index as the dependent variable, if one is interested in unsuccessful speculative attacks as well. Another alternative to using the index of speculative pressure, which avoids the need to weight the various components, is described in Abiad (2002). That paper adds reserve changes and interest rate changes as dependent variables, in addition to exchange rate changes, but rather than combining the three variables into a weighted average, the variables are stacked into a 3×1 vector whose distribution is dependent on the Markov-switching crisis variable st. The main finding is that adding reserve and interest rate changes to the model does not help identify any additional crisis episodes not already picked up by the univariate model based on exchange rates alone.

We explore a broad set of twenty-two early-warning indicators, which are listed in Table 1. The indicators can be classified into three categories. The first group includes standard measures of macroeconomic imbalance. There are three measures of external imbalance: deviations of the real exchange rate from a Hodrick-Prescott trend,21 the current account balance relative to GDP, and the growth rate of exports. There are three measures of the adequacy of central bank reserves: the level and the growth rate of M2/reserves, and the growth rate of reserves. We also look at credit expansion, as measured by the growth rate of real domestic credit. Two measures of real economic activity are used: the growth rate of industrial production and real GDP growth interpolated from quarterly data. Some crises have been preceded by the bursting of an asset market bubble, usually in the equities market or the property market, so we include the six-month change in the country’s stock market index as well. Finally, we include the real interest rate.

Table 1.

Early-Warning Indicators


The second category of indicators relates to capital flows. The first indicator in this group is the 3-month LIBOR, which has been a primary determinant of the level of capital flows to emerging markets. A second indicator captures the idea that large capital inflows usually fuel a lending boom; one measure of this lending boom, first used by Sachs, Tornell and Velasco (1996), is the growth in the ratio of bank assets to GDP. The three other indicators in this category focus on the composition of capital flows: the level of short-term debt to reserves, the stock of non-FDI investment (measured as a cumulation of flows) relative to GDP, and the ratio of cumulative portfolio inflows to total cumulative inflows.

The final category includes indicators of financial fragility. Kaminsky and Reinhart (1999) have noted that currency crises and banking crises tend to occur together, and that based on their sample of 20 countries over the period 1970-95, problems in the banking sector typically precede a currency crisis. The first indicator of financial sector soundness we use is a rough measure of capital adequacy, the ratio of bank reserves to bank assets. A second indicator is central bank credit to banks, relative to total banking liabilities; an increase in central bank credit may indicate financial weakness, if its purpose is to prop up or bail out weak banks. The ratio of bank deposits to M2 indicates the relative confidence that households and businesses have in the banking system, with a low ratio indicating a lack of confidence; we include both the level and the growth rate of this ratio. Finally, we look at both the level and the growth rate of the loan-deposit ratio. A high and/or rising loans-to-deposits ratio may indicate increased banking system fragility, with an inadequate level of liquidity to respond to shocks. It should also be noted that one of the macroeconomic indicators, the real interest rate, is also frequently used as an indicator of financial sector soundness, as high real interest rates often lead to an increase in nonperforming loans.22

Given the large number of indicators, a general-to-specific procedure was used to pare down the set and identify the final model. For each country, the model was run using each of the 22 early-warning indicators, one at a time.23 The coefficient estimates from these regressions can be found in Table 2. The coefficients on the indicators correspond to the parameter $\beta_0$ that enters into $p_t^{00}$, the probability of remaining in the tranquil state. All the variables are transformed such that an increase in the variable lowers the probability of remaining in the tranquil state, so that a negative coefficient estimate is “correct”.

Table 2.

Coefficient Estimates for the 22 EWIs (Bivariate Regressions)


Examining the results of Table 2 more closely, we see that the real overvaluation indicator is correctly signed and significant across all five countries. In fact, it is the only indicator that is uniformly correctly signed and significant. Four other variables (the level and growth rate of M2/reserves, the growth rate of real GDP, and the LIBOR) are correctly signed in all cases, but are only occasionally significant. All the other indicators have coefficient estimates that have correct signs and/or are significant for some countries, but not for others. This brings us to another important point: indicator performance clearly varies widely from country to country. An assumption of parameter equality across countries, underlying EWS model estimates which are based on a panel of countries, may lead to incorrect results, and may contribute to poor predictive performance.

The set of indicators for each country was then narrowed based on which coefficient estimates were correctly signed. The desire to monitor a wider set of indicators argued against eliminating correctly signed coefficients whose t-statistics were not significant at the 5 percent level.24 The moderate correlation among the early warning indicators also suggested that the t-statistics may be misleading. In addition, a likelihood-ratio test of the joint significance of the explanatory variables showed them to be significant.

How high must forecast probabilities rise to be considered significant? It has been standard practice in the early-warning systems literature to map a model’s forecast probability into a binary “alarm signal” by determining some cutoff probability, and letting the signal equal 1 if the forecast probability rises above this threshold, and 0 otherwise.25 An assessment of predictive ability is then conducted by computing the number of crises the model signal correctly calls (by sending an alarm within a particular window, usually 24 months before the actual crisis occurrence), and the number of false alarms the model sends.
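The mapping from forecast probabilities to alarm signals, and the counting of called crises and false alarms, can be sketched in Python. The exact conventions are assumptions made here for illustration: a crisis counts as called if at least one alarm occurs in the `window` periods before it, and an alarm counts as false if no crisis begins within the following `window` periods. The function names and the data are hypothetical.

```python
def alarm_signals(probabilities, cutoff=0.5):
    """Signal equals 1 when the forecast probability exceeds the cutoff, else 0."""
    return [1 if p > cutoff else 0 for p in probabilities]

def evaluate_signals(signals, crisis_periods, window=24):
    """Return (number of crises called, number of false alarms).

    crisis_periods holds the 0-based indices at which crises begin.
    """
    called = sum(
        1 for c in crisis_periods if any(signals[max(0, c - window):c])
    )
    false_alarms = sum(
        1 for t, s in enumerate(signals)
        if s and not any(t < c <= t + window for c in crisis_periods)
    )
    return called, false_alarms

# Made-up forecast probabilities, one crisis at index 3, two-period window.
probs = [0.1, 0.6, 0.2, 0.1, 0.7, 0.1, 0.1, 0.1]
signals = alarm_signals(probs, cutoff=0.5)
result = evaluate_signals(signals, crisis_periods=[3], window=2)
```

Lowering the cutoff would raise the number of crises called at the cost of more false alarms, which is the trade-off that the choice of threshold has to balance.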

In this context, it should be noted that forecast probabilities from competing early warning systems are not directly comparable; in particular, one should adjust for the time horizon the model is using. Most of the early warning systems in the literature focus on relatively long-horizon forecasting, with horizons of 12 or 24 months being the norm. The regime-switching model we use here, on the other hand, estimates one-month-ahead forecasts. To make forecast probabilities from different models comparable, the forecast horizons must be matched, and the most straightforward way to do this would be to transform the short-horizon forecast into a long-horizon equivalent, using:

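$$\begin{aligned}
\Pr(\text{crisis over next } n \text{ months}) &= 1 - \Pr(\text{no crisis over next } n \text{ months}) \\
&= 1 - \Pr(\text{no crisis over next 1 month})^{n} \\
&= 1 - \left(1 - \Pr(\text{crisis over next 1 month})\right)^{n}
\end{aligned} \qquad (16)$$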

Of course, this transformation is made under the assumption that the fundamentals that determine the crisis probability neither worsen nor improve. If the former, then the n-month crisis probability will be higher; if the latter, then the crisis probability will be lower.26 As an example, a 10 percent probability of a crisis over one month would be equivalent, ceteris paribus, to a three-month crisis probability of $1-(0.90)^{3} = 27$ percent, and a one-year crisis probability of $1-(0.90)^{12} = 72$ percent.
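The horizon conversion is easy to check numerically. The short Python sketch below uses a hypothetical function name and applies the transformation under the same assumption of unchanged fundamentals.

```python
def n_month_probability(p_one_month, n):
    """Probability of at least one crisis over n months, assuming a constant
    one-month crisis probability (i.e., unchanged fundamentals)."""
    return 1.0 - (1.0 - p_one_month) ** n

three_month = n_month_probability(0.10, 3)   # roughly 0.27
one_year = n_month_probability(0.10, 12)     # roughly 0.72
```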

V. Estimation Results

The final model estimates for the five countries can be found in Table 3. For all five countries, State 0 is identified as a low-mean, low-volatility regime while State 1 is a high-mean, high-volatility regime. Average volatility, as measured by the standard deviation, is very low in the tranquil state (less than 1 percent per month in all five countries), while average volatility during crisis periods is quite large, with the highest crisis volatilities estimated for Indonesia, at 29 percent per month. In fact, volatility seems to be the primary distinguishing characteristic between tranquil and crisis periods, as $\sigma_1$ is significantly different from $\sigma_0$ in all cases. The average depreciation in tranquil periods is effectively zero (less than a quarter percent per month) in all countries, while in crisis periods it ranges from 2.1 percent per month in Thailand to 12.6 percent per month in Indonesia, but the standard errors are large enough that one cannot reject equality of $\mu_1$ and $\mu_0$. The coefficients on the indicators in the time-varying probabilities are all correctly signed, but as noted earlier, they are insignificant in most cases. This might be due to correlations among the indicator variables; in fact, likelihood-ratio tests for the joint significance of the indicators are significant for all countries except Malaysia, where the test of joint significance is marginally insignificant, with a p-value of 0.16. We now turn to a country-by-country analysis.

Table 3. Final Model Estimates
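To make the structure of these estimates concrete, the following is a minimal sketch of a filter for a two-state model of this kind: each state has its own mean and standard deviation of monthly exchange rate changes, and the probability of moving from the tranquil state (State 0) to the crisis state (State 1) depends on indicators. It is not the author's code. The logistic link, the function names, the constant probability of remaining in crisis, and every parameter value below are assumptions chosen for illustration, not the Table 3 estimates.

```python
import numpy as np


def normal_pdf(y, mu, sigma):
    """Normal density of y under each state's mean and standard deviation."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def hamilton_filter(y, x, mu, sigma, beta, p_stay_crisis, init_crisis_prob=0.5):
    """Filtered probability of the crisis state and the log-likelihood.

    y[t] : monthly exchange rate change (percent)
    x[t] : indicator vector (including a constant) assumed to be observed
           before month t; it drives the tranquil-to-crisis probability
    mu, sigma : state means and standard deviations, ordered (State 0, State 1)
    beta : coefficients of the assumed logistic link
    p_stay_crisis : constant probability of remaining in the crisis state
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    beta = np.asarray(beta, dtype=float)

    filtered = np.empty(len(y))
    loglik = 0.0
    prev = np.array([1.0 - init_crisis_prob, init_crisis_prob])
    for t in range(len(y)):
        # Time-varying probability of switching from tranquil to crisis.
        p01 = 1.0 / (1.0 + np.exp(-(x[t] @ beta)))
        transition = np.array([[1.0 - p01, p01],
                               [1.0 - p_stay_crisis, p_stay_crisis]])
        predicted = prev @ transition            # P(s_t | data up to t-1)
        joint = predicted * normal_pdf(y[t], mu, sigma)
        likelihood = joint.sum()
        loglik += np.log(likelihood)
        prev = joint / likelihood                # P(s_t | data up to t)
        filtered[t] = prev[1]
    return filtered, loglik


# Hypothetical data: four calm months, two turbulent months, one calm month.
y = [0.0, 0.1, -0.1, 0.0, 15.0, -20.0, 0.1]
x = np.ones((len(y), 1))                         # constant only, no indicators
filtered, loglik = hamilton_filter(
    y, x, mu=[0.0, 5.0], sigma=[0.5, 20.0], beta=[-3.0], p_stay_crisis=0.8
)
print(np.round(filtered, 3))
```

In estimation, the log-likelihood returned by such a filter would be maximized over the means, standard deviations, and transition coefficients; the sketch only evaluates it at fixed values.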

A. Indonesia

In the estimated model for Indonesia (Table 3), six indicators are used: real overvaluation, export growth, the level of M2/reserves, reserve growth, central bank credit to the banking sector, and growth of the M2/deposits ratio. There are five speculative pressure episodes in Indonesia in our 1972-1999 sample (Table 4): a devaluation of 50 percent in November 1978, currency volatility in late 1982 that culminated in a 38 percent devaluation in April 1983, moderate volatility and a 5 percent depreciation in mid-1984, a 44 percent devaluation in September 1986, and the Asian crisis, which began in July 1997 with a 6 percent decline in the rupiah. Figure 1 plots these crisis dates, along with 12-month forecast probabilities and alarm signals based on a 50-percent cutoff. Alarm signals are sent at least once in the 12 months preceding four of the five crisis episodes, with the only uncalled crisis being the smallest one, the 5 percent depreciation in mid-1984. However, the Asian crisis was not well-signaled for Indonesia; an alarm was generated in only one month (October 1996), and reflected increased currency volatility during that period. The forecast probabilities do increase steadily, to 45 percent in June 1997, but stay below the signaling threshold of 50 percent.
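The rule used here to decide whether an episode was signaled can be sketched as follows. The function names and the probability series are hypothetical; only the 50-percent cutoff and the 12-month window come from the text.

```python
def alarm_signals(probabilities, cutoff=0.5):
    """An alarm is sent in every month whose forecast probability exceeds the cutoff."""
    return [p > cutoff for p in probabilities]


def crisis_was_signaled(alarms, crisis_start, window=12):
    """True if at least one alarm was sent in the `window` months before the crisis.

    `crisis_start` is the index of the first crisis month; that month itself
    is excluded from the window.
    """
    first = max(0, crisis_start - window)
    return any(alarms[first:crisis_start])


# Hypothetical forecast probabilities for the 12 months before a crisis.
# The first series rises steadily but never crosses the cutoff.
rising_but_silent = [0.05] * 6 + [0.20, 0.30, 0.35, 0.40, 0.42, 0.45]
# The second series crosses the cutoff in a single month.
single_alarm = [0.05] * 6 + [0.20, 0.55, 0.30, 0.40, 0.42, 0.45]

print(crisis_was_signaled(alarm_signals(rising_but_silent), crisis_start=12))
print(crisis_was_signaled(alarm_signals(single_alarm), crisis_start=12))
```

The first series mirrors the pattern described for 1997, where probabilities climb toward the threshold without producing an alarm in the final months; the second shows that a single month above the cutoff is enough for an episode to count as called under this rule.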

B. Korea

The indicators that enter the final model for Korea are real overvaluation, the current account to GDP ratio, the level of M2/reserves, industrial production growth, stock market performance, and the share of portfolio flows in total capital flows (Table 3). Three crisis periods are included in the sample (Table 5)—a 20 percent devaluation in January 1980, a depreciation of 7 percent in September-November of the same year (which was likely a continuation of earlier speculative pressure), and the Asian crisis (Figure 2).27 Interestingly, the model already identifies March 1997, when the won depreciated by 4 percent, as a period of speculative pressure. In terms of predicting these episodes, the model does not anticipate the January 1980 depreciation; however, after the initial devaluation it continues to send signals in anticipation of further speculative pressure, which did occur later that year.

Table 4. Speculative Pressure Episodes and Alarm Signals in Indonesia

Table 5. Speculative Pressure Episodes and Alarm Signals in the Republic of Korea

Figure 1. Indonesia: Crisis Dates, Forecast Probabilities, and Alarm Signals

Citation: IMF Working Papers 2003, 032; 10.5089/9781451845136.001.A001

Figure 2. Republic of Korea: Crisis Dates, Forecast Probabilities, and Alarm Signals

With regard to the Asian crisis, the Korean model illustrates the gains from letting an EWS model use information available in the exchange rate behavior. There was already a moderate increase in the won’s volatility before the Asian crisis, beginning as early as the middle of 1996, when the won depreciated by 3 percent. As a result of this increased volatility—and combined with Korea’s weakening external position, a decline in the stock market and the high share of portfolio flows—the model begins signaling in February 1997.

C. Malaysia

The final model for Malaysia contains six indicators: real overvaluation, domestic credit growth, real GDP growth, the real interest rate, the LIBOR, and the ratio of M2 to deposits. Relative to the four other countries, Malaysia’s exchange rate regime was much less of a peg and more of a dirty float. Thus, unlike the other countries, which experienced rarer but sharper devaluations, the speculative pressure episodes in Malaysia are protracted periods characterized by increased volatility and a slow deterioration of the exchange rate. Instead of identifying the individual spikes, we therefore group them into four periods, which are described in Table 6.

Table 6. Speculative Pressure Episodes and Alarm Signals in Malaysia

The model is able to anticipate three of Malaysia’s four speculative pressure periods (Figure 3). Analyzing the individual indicators that enter the model, one finds that the rise in world interest rates contributed to the speculative pressure that occurred in the late 1970s and early 1980s; overvaluation, high real interest rates, and a slowdown in real growth contributed to the 1985-86 depreciation; and overvaluation and a domestic credit boom increased vulnerability in the run-up to the Asian crisis.

Figure 3. Malaysia: Crisis Dates, Forecast Probabilities, and Alarm Signals

D. The Philippines

The six indicators that enter into the Philippine model are real overvaluation, export growth, both the level and the growth rate of M2/reserves, industrial production growth, and the growth rate of deposits/M2. The model was estimated from 1982 onwards, since data on industrial production growth is unavailable before 1982. The model identifies three protracted periods of speculative pressure (Table 7).28 The first is from August 1982 to April 1986, when a financial crisis and political turmoil resulted in high exchange rate volatility and a 140 percent depreciation over the period. The second is from June 1988, a period of moderate volatility where the peso depreciated by 34 percent. The third period is the Asian crisis, which began for the Philippines with a 10 percent depreciation in July 1997.

Table 7. Speculative Pressure Episodes and Alarm Signals in the Philippines

The model for the Philippines is able to anticipate these three crisis periods (Figure 4). Analyzing the individual indicators, one finds that different factors were behind each crisis. A slowdown in both exports and industrial production played some role in triggering the crisis in the early 1980s, but a rise in both the level and growth rate of M2/reserves, as well as a sharp fall in the deposits/M2 ratio (an indicator of the banking crisis that occurred), played a role in prolonging the crisis. Reserve inadequacy, as measured by both the level and growth rate of M2/reserves, also increased vulnerability in the late 1980s, which culminated in a 17 percent depreciation in the latter half of 1990, during the Gulf War. Finally, weakening competitiveness that began in late 1996 (resulting from the appreciation against the yen of the dollar, to which the peso was pegged) increased the Philippines’ vulnerability, and resulted in the depreciation of the peso following the float of the Thai baht in July 1997.

Figure 4. Philippines: Crisis Dates, Forecast Probabilities, and Alarm Signals

E. Thailand

The final model for Thailand includes the following indicators: real overvaluation, the level of M2/reserves, reserve growth, real GDP growth, the real interest rate, and the share of non-FDI capital flows in total flows. There are three crisis periods in the sample: a 10 percent depreciation in 1981, a 19 percent depreciation between November 1984 and December 1985, and the Asian crisis, which began in Thailand in July 1997 (Table 8).

Table 8. Speculative Pressure Episodes and Alarm Signals in Thailand

All three crisis periods are anticipated, as can be seen in Figure 5. However, signals are sent only two months prior to the July 1981 devaluation. Better warning is provided for the latter crisis episodes. Real overvaluation seems to have played some role in all three crises. Reserve inadequacy, as measured by the level of M2/reserves, also played a role in the 1980s episodes, but not in the Asian crisis. Based on the model, three factors seem to have played a role in increasing Thailand’s vulnerability to crisis in 1997—a loss of external competitiveness, a slowdown in the real economy, and an increasing proportion of non-FDI flows in total capital flows.

Figure 5. Thailand: Crisis Dates, Forecast Probabilities, and Alarm Signals

VI. Forecast Assessment

We now perform a more rigorous evaluation of the predictive performance of the model, both in-sample and out-of-sample. Table 9 contains in-sample goodness-of-fit tables for each country model, as well as overall for all five countries. Each 2×2 matrix shows the number of correctly called tranquil and crisis periods, as well as the number of false alarms and the number of missed signals. We summarize this information further in Table 10, which also provides goodness-of-fit measures for five other models evaluated in Berg and Pattillo (1999, henceforth BP). The comparison is only meant to be indicative, as the Markov-switching model differs from the five other models in several important ways, beyond just the differences in model specification. First, the identified crisis dates are different. Second, the forecast horizons are not the same: the models reviewed by BP all use a forecast horizon of 24 months, as opposed to the 12-month forecast horizon used here. Third, the data underlying the estimates, and the transformations applied to them, are similar but not identical. Finally, the models in this paper are estimated country-by-country, whereas all five BP models were estimated on a panel of countries, a point we return to below.

We see that the Markov-switching model correctly calls 81 percent of observations. This is slightly lower than the 82–85 percent performance of the standard models in BP, when they use a 50 percent cutoff probability. However, the high predictive performance in those models is driven mostly by their ability to call tranquil periods correctly; they correctly classify 98–100 percent of tranquil periods, as opposed to 89 percent in the Markov-switching model. But in terms of correctly called crises, the Markov-switching model performs much better, calling 65 percent of pre-crisis periods correctly; that is, it sends signals in 65 percent of the months where a crisis ensued within a year’s time. The standard models, in contrast, only call 7–19 percent of pre-crisis months correctly. The poorer performance is probably due in part to the longer 24-month forecast horizon they aim for, and also because the 50 percent signaling threshold is too high for those models. BP also report goodness-of-fit for the standard models when the cut-off is lowered to 25 percent, which we replicate in the bottom half of Table 10. The lower signaling threshold increases the number of correctly called pre-crisis periods (the models now correctly send signals in 41–48 percent of pre-crisis periods), but this comes at the expense of a much higher fraction of false alarms. With the lower threshold, false alarms account for 57–65 percent of total alarms, i.e., almost two out of every three signals are false alarms.

Although the Markov-switching specification probably accounts for part of the improved performance, it is also possible that a substantial portion of the improvement is due to the fact that the standard models are estimated on a panel of countries, under the assumption that the coefficients are uniform across countries. As we saw in Section V, indicators that matter for crises in one country may not even be pointing in the right direction during crises in another country.

Table 9. Forecast Assessment

Table 10. Measures of Predictive Power

A pre-crisis period is correctly called when the estimated probability of crisis is above the cut-off probability and the crisis ensues within 12 months (Abiad), or within 24 months (other models).

A tranquil period is correctly called when the estimated probability of crisis is below the cut-off probability and no crisis ensues within 12 months (Abiad), or within 24 months (other models).

A false alarm is an observation with an estimated probability of crisis above the cut-off (an alarm) not followed by a crisis within 12 months (Abiad), or within 24 months (other models).
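The definitions above translate into a simple tabulation, sketched below. The class and function names, the sample of 20 months, and the probability series are hypothetical; the 12-month horizon and 50 percent cutoff are the ones used for the Markov-switching models, and the data are not those underlying Tables 9 and 10.

```python
from dataclasses import dataclass


@dataclass
class GoodnessOfFit:
    correct_crisis: int    # alarm sent, crisis ensues within the horizon
    missed: int            # no alarm, crisis ensues within the horizon
    false_alarm: int       # alarm sent, no crisis within the horizon
    correct_tranquil: int  # no alarm, no crisis within the horizon

    @property
    def share_observations_correct(self):
        total = self.correct_crisis + self.missed + self.false_alarm + self.correct_tranquil
        return (self.correct_crisis + self.correct_tranquil) / total

    @property
    def share_pre_crisis_correct(self):
        return self.correct_crisis / (self.correct_crisis + self.missed)

    @property
    def share_tranquil_correct(self):
        return self.correct_tranquil / (self.correct_tranquil + self.false_alarm)

    @property
    def false_alarm_share_of_alarms(self):
        return self.false_alarm / (self.false_alarm + self.correct_crisis)


def pre_crisis_flags(n_months, crisis_starts, horizon=12):
    """Flag each month that is followed by a crisis within `horizon` months."""
    flags = [False] * n_months
    for start in crisis_starts:
        for t in range(max(0, start - horizon), min(start, n_months)):
            flags[t] = True
    return flags


def tabulate(probabilities, crisis_starts, cutoff=0.5, horizon=12):
    flags = pre_crisis_flags(len(probabilities), crisis_starts, horizon)
    table = GoodnessOfFit(0, 0, 0, 0)
    for p, pre_crisis in zip(probabilities, flags):
        alarm = p > cutoff
        if alarm and pre_crisis:
            table.correct_crisis += 1
        elif pre_crisis:
            table.missed += 1
        elif alarm:
            table.false_alarm += 1
        else:
            table.correct_tranquil += 1
    return table


# Hypothetical 20-month sample followed by a crisis in month 20.
probs = [0.1] * 5 + [0.6] + [0.1] * 2 + [0.3] * 6 + [0.7] * 6
table = tabulate(probs, crisis_starts=[20])
print(table)
print(table.share_observations_correct, table.false_alarm_share_of_alarms)
```

Lowering `cutoff` in this sketch raises the count of correctly called pre-crisis months at the cost of more false alarms, which is the trade-off discussed for the 25 percent threshold.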

What are the out-of-sample predictions of the country Markov-switching models? The models were estimated using data up to the end of 1999. An attempt to estimate the model up to end-1996, to see whether the Asian crisis was forecastable using the model, was not possible in this case, mainly because the model was estimated country-by-country; eliminating the Asian crisis not only removes the most informative episode in the sample, but also results in overfitting and/or nonconvergence of the maximum likelihood algorithm. Hence the only alternative is to look at model forecasts beyond the end of 1999. Admittedly, the hold-out sample from January 2000–July 2001 is relatively small, and moreover, none of the five countries had a crisis during this period. But it is still an informative exercise to see what kinds of probabilities and signals the country models send.

The forecast probabilities and alarm signals for the out-of-sample period of January 2000–July 2001 can be seen in Figures 1–5. For three of the countries (Indonesia, Korea, and Thailand) no alarm signals are sent during the period. There was still a moderate probability (about 20 percent) of a crisis in Indonesia through much of 2000, but vulnerabilities (at least those measured by the indicators in the model) have dropped since then. Thailand has shown lower susceptibility to a crisis, and Korea even less so.

In contrast, the models for Malaysia and the Philippines did signal some vulnerability in the out-of-sample period, although only for a few months. Crisis probabilities in Malaysia were actually dropping toward the end of the estimation period (1999) and were low through most of 2000, but started increasing in the last quarter of 2000 and accelerated in 2001 up until July 2001, the last available data point. In fact, probabilities were high enough that the model began sending signals in May 2001. What was driving this increase in vulnerability? An analysis of the indicators entering the Malaysian model identifies several weaknesses. First, there was a steady decline in competitiveness, as measured by the real exchange rate, through 2000 and 2001.29 Second, there was a slowdown in the real economy. And third, there was a sharp rise in real domestic credit growth.

The Philippines also showed some weaknesses in the out-of-sample period, according to the model. Crisis probabilities were actually low in 2000, but the model started indicating moderate vulnerabilities beginning in the second quarter of 2001. A spike in the crisis probability led to a signal in June 2001, but probabilities decreased in July, the last data point. There was one primary factor behind the increased vulnerability in the Philippines: a weakened external position, seen most clearly in rapidly contracting exports. Over the January 2000–July 2001 out-of-sample period, then, signals were sent for only 4 out of 95 months: May–July 2001 for Malaysia, and June 2001 for the Philippines, and these reflected vulnerabilities due to external weaknesses present in these two countries at that time.

VII. Conclusions

There is a general consensus among economists that early-warning systems, no matter how sophisticated, will not be able to forecast crises with a high degree of accuracy. Even economists who construct such models are aware of this, and see these models as no more than useful supplements to more informed country analyses, and as a means of summarizing information in an unbiased, objective manner. Nevertheless, increased emphasis on crisis prevention (as opposed to crisis resolution) means that policymakers need to utilize all the tools available for assessing countries’ vulnerabilities, and to improve these tools when possible. This paper hopes to assist in this effort, first by surveying the recent empirical literature on currency crises, and second by analyzing an alternative EWS approach that addresses some of the shortcomings of existing models.

The survey of 30 selected empirical studies written since 1998 is meant to increase awareness of the various econometric approaches to early warning systems that have been developed, so that practitioners have at their disposal a larger set of tools in assessing vulnerability. Many of the proposed approaches look promising, and virtually all report some improvement over the standard probit/logit and indicators models. However, many of the studies do not perform rigorous evaluations of performance. Adoption of standard evaluation procedures—including goodness-of-fit tables and measures, accuracy scores, and out-of-sample testing—will help potential users gauge how useful these models really are. Furthermore, it is difficult to assess relative performance across models, given differences in the datasets used and in the sample of countries studied. A true “horse race” among competing models—where each specification is estimated using the same data and sample of countries—will help resolve this issue.

But even in-sample and out-of-sample tests are only indicative; the true test of these models is in operationalizing them. In this regard, an additional measure of a model’s usefulness is simplicity of application. Early warning systems should be easy to replicate and estimate. That is, there should be minimal reliance on ad hoc assumptions, the data should come from published sources, and one should ideally be able to estimate the model using standard software packages or with programming code provided by the authors. If these conditions are satisfied, then it should be possible to monitor these models in real time at low cost.

In addition to surveying the recent literature on early warning systems, this paper also contributes to it by suggesting an alternative EWS approach based on a Markov-switching model with time-varying transition probabilities. The model does an adequate job of anticipating crises. It correctly anticipates two-thirds of crisis periods in sample (compared to about 50 percent for the standard models), and just as important, sends a much smaller proportion of false alarms. In the January 2000–July 2001 out-of-sample period, no warning signals are sent for three of the five countries studied (Korea, Thailand and Indonesia), but vulnerabilities were signaled for Malaysia and the Philippines in mid-2001, mainly due to a decline in competitiveness and a slowdown in exports.

Beyond the performance of the Markov-switching model itself, there are some lessons that apply to the construction of early warning systems in general. First, accounting for dynamics is important. There is useful information in both the level and the volatility of the exchange rate itself that existing models have ignored. More specifically, some crises have been preceded by a series of smaller depreciations, by a widening of an exchange rate band, and/or an increase in the volatility of exchange rates, and this has not been utilized in existing models. Second, although there are some indicators which are common across countries in their predictive ability (with the real exchange rate being the most uniformly successful), the country-by-country analysis in this paper shows that the performance of individual indicators varies greatly across countries, so that different sets of variables are relevant for different countries. In this light, the one-size-fits-all, panel data approach used in estimating most early warning systems might be one of the causes for their only moderate success. The performance of early warning systems, regardless of the econometric specification chosen, might be improved markedly by taking more care in verifying that the countries used in the estimation possess similar characteristics, or failing that, by estimating the models on a country-by-country basis.

The model presented here is only the simplest variant of what can be done in a Markov-switching EWS. Most obviously, those with a better knowledge of each country can estimate these models using a more informed selection of indicators. Given the role that politics and political stability have played in triggering or exacerbating several crises, most notably Indonesia in 1997, the use of sociopolitical variables could be explored. In light of increased financial globalization, other external factors in addition to world interest rates might be considered, such as global equity market volatility or the spread on high-yield bonds. Regarding the specification itself, one can extend the model in several directions. First, because the focus was on crisis anticipation, the early warning indicators in the current model only affected the probability of moving from a tranquil to a crisis state. But one could let these same indicators (or a different set of indicators) also affect the probability of getting out of a crisis. Second, the current model has only two states: a tranquil state, and a speculative pressure state whose main characteristic is high exchange rate volatility. One could extend the model to allow for three (or more) states, where the three states might correspond to tranquil periods, periods of depreciation pressure and periods of appreciation pressure. Finally, the issue of modeling contagion across countries within the context of a Markov-switching EWS awaits further investigation.

Early Warning Systems: A Survey and a Regime-Switching Approach
Author: Mr. Abdul d Abiad