A Framework for Macroprudential Bank Solvency Stress Testing
Application to S-25 and Other G-20 Country FSAPs

The global financial crisis has placed the spotlight squarely on bank stress tests. Stress tests conducted in the lead-up to the crisis, including those by IMF staff, were not always able to identify the right risks and vulnerabilities. Since then, IMF staff has developed more robust stress testing methods and models and adopted a more coherent and consistent approach. This paper articulates the solvency stress testing framework that is being applied in the IMF’s surveillance of member countries’ banking systems, and discusses examples of its actual implementation in FSAPs to 18 countries which are in the group comprising the 25 most systemically important financial systems (“S-25”) plus other G-20 countries. In doing so, the paper also offers useful guidance for readers seeking to develop their own stress testing frameworks and country authorities preparing for FSAPs. A detailed Stress Test Matrix (STeM) comparing the stress test parameters applied in each of these major country FSAPs is provided, together with our stress test output templates.


I. Introduction

The global financial crisis has placed the spotlight squarely on the stress testing of financial institutions, notably that of banks. On one hand, the crisis revealed the shortcomings of stress tests as a tool for detecting important vulnerabilities during the lead-up period, which forestalled possible mitigating actions. On the other, the experience highlighted the usefulness of credible stress tests in restoring market confidence in the financial system, as demonstrated by the successful Supervisory Capital Assessment Program (SCAP) exercise undertaken by the U.S. authorities in 2009 (Bernanke, 2010). Ultimately, the crisis has underscored that stress tests, irrespective of their level of sophistication or regularity of implementation, are not fail-safe, stand-alone diagnostic tools.

Post mortems following the crisis show that the stress tests conducted by supervisory authorities, IMF staff and financial institutions themselves were not always able to identify the right risks and exposures. As such, they frequently failed to provide sufficient early warning of potential vulnerabilities to shocks (Borio and others, 2012). In some cases, the simulated shocks and resulting impact were not sufficiently severe, sometimes reflecting the reluctance of the participants to overtly recognize the possible realization of certain extreme scenarios; in others, failure was attributable to the specifications of the stress tests themselves, including inadequate techniques to capture complex financial instruments or second-round effects. Elsewhere, inadequate data or weaknesses in scenario design, such as the exclusion or cursory treatment of certain types of risks and insufficient focus on spillover risks across different segments of the financial system within a country, as well as across borders, also contributed to the lack of robustness of the stress tests.

At the IMF, stress testing has become a central aspect of staff’s macroprudential surveillance of individual financial systems and of the international financial system itself. It is a key component of the Financial Sector Assessment Program (FSAP) and has become an important part of the conjunctural analysis in the Global Financial Stability Report (GFSR); it is also being applied in Article IV and crisis program work. Stress testing has also become increasingly important for IMF member countries. In addition to microprudential (or supervisory) stress testing, some jurisdictions have established national macroprudential authorities which will also be engaging in macroprudential stress testing. Countries are also increasingly requesting technical assistance on stress testing from the IMF as they too seek to build or enhance their capacity in this area. These developments have underscored the need for a coherent and consistent approach to stress testing by IMF staff in their engagement with the membership.

As a result of the attention drawn to stress testing, exercises conducted by IMF staff have come under intense scrutiny. Consistency in the implementation of these stress tests and the comparability of findings across member countries have taken on significant importance as market participants increasingly place a premium on transparency, especially in the current volatile environment. In this context:

  • The large menu of choices in terms of approaches, models, scenarios and underlying assumptions applied in staff’s analyses has given rise to questions about what the results actually represent and their implications for cross-country comparisons. The lack of generally-accepted “best practice” principles (at least for some dimensions of stress tests) on the one hand (see for example, Board of Governors of the Federal Reserve System/FDIC/OCC, 2012) and evolving practices on the other further complicate this issue. In this context, IMF staff is emphasizing the use of prescriptive guidelines in IMF-related stress testing exercises. The aim is to ensure sufficient coverage and a modicum of uniformity for comparison purposes, both within a financial system and at the very least, across “peer” countries.

  • The communication of stress test results has also become an increasingly sensitive issue for the IMF’s membership. Both financial supervisors and financial institutions are struggling to balance the call for increased transparency with the need to avoid unduly alarming the markets and creating self-fulfilling prophecies, especially in the current fraught environment.

In the decade since stress testing was introduced into the IMF’s surveillance toolkit, stress tests have been conducted on the banking and non-bank financial sectors, with a strong focus on the former. Since 2003, FSAP stress tests on the insurance sector have been conducted in only 10 countries and on the pension funds sector in two countries, compared to more than 50 on the banking sector since the onset of the global financial crisis in 2008 alone. To support this work, staff has made significant efforts to develop more robust stress testing methods and models, more so since the start of the crisis. Based on the IMF’s vast practical experience with stress tests through more than a decade of FSAPs to its member countries, staff recently proposed a set of “best practice” principles for macrofinancial stress testing (IMF, 2012a). The principles cover areas such as the institutional perimeter, shock channels, risks, market perspectives and tail risks.

Work on stress tests of the banking sector is most advanced at the IMF, given its systemic importance for practically all member countries. In particular, stress testing for bank solvency risk has been the main focus, and work to continually develop a comprehensive and robust framework is ongoing. Separately, the development of liquidity stress tests by IMF staff, which will be covered in a forthcoming paper by the authors, has also intensified in response to lessons learned from the crisis. This paper complements IMF (2012a) by providing an operational perspective of those “best practice” principles within the bank solvency stress testing framework that is being applied by IMF staff, and which is continually being enhanced. Specifically, this paper:

  • Articulates the framework and demonstrates the actual application of those principles in the implementation of the key elements of this framework in the IMF’s surveillance of banking systems in selected FSAPs. Our sample group consists of 18 countries that have participated in FSAPs since the 2010 fiscal year (FY), out of the 30 jurisdictions comprising the top 25 most systemically important financial systems (“S-25”) that are subject to mandatory assessments every five years (IMF, 2010a and 2010b) plus the remaining five other G-20 countries which are not among the S-25 (hereafter “major countries” per Table 1).

  • Presents the framework in a detailed cross-country Stress Testing Matrix (STeM) to compare actual implementation across the major country FSAPs to date (Appendix I). An abridged version of this STeM for each country is typically presented in the main FSAP report, the Financial System Stability Assessment (FSSA), to enhance the transparency of each exercise.

  • Aims to provide useful guidance for readers seeking to develop their own stress testing frameworks and for country authorities preparing for FSAPs. The paper is illustrative in this regard in that it discusses precisely how the set-up of FSAP stress tests is conceived.

Table 1.

S-25 and Other G-20 Countries: Status of FSAPs since FY 2010

Source: IMF (2010); and Monetary and Capital Markets Department, IMF.
Note: S-25 countries are ranked according to the size and interconnectedness of their financial systems. The IMF’s fiscal year (FY) runs from May 1 of the previous year to April 30 of the current year.
* FSAPs currently in progress; stress tests are not conducted for the FY2013 European Union FSAP.
** FSAPs scheduled for completion in FY2014.

To date, eight of the 18 countries have published all the details of their respective FSAP stress tests. They comprise the United States, Germany, United Kingdom, Sweden, Japan, France, Spain and Australia. Of the remaining 10 countries, all but one have consented, for the purpose of this paper, to the inclusion of the full suite of information on their respective FSAP stress tests, some of which is not contained in their previously published reports.

IMF practices so far suggest that while concerted efforts are being made to standardize FSAP stress tests across countries, with some degree of success, further improvements are possible. However, there are instances where expert judgment or ad hoc rules may be necessary, and where “one size fits all” rules may be irrelevant. Moreover, it is important to recognize that surveillance stress tests are not fail-safe, stand-alone diagnostic tools, although the value of well-designed exercises should not be underestimated. The availability and quality of data applied as input into such tests are also crucial for their usefulness.

This paper is organized as follows. Section II puts into context the nature of the stress testing work conducted by IMF staff. This is followed in Section III by detailed coverage of the various components and elements of the stress testing framework and their application in FSAPs. Section IV concludes.

II. IMF Stress Testing in Context

Stress testing is a forward-looking technique that attempts to measure the sensitivity of a portfolio, an institution, or even an entire financial system to events that have a very small probability of occurrence but which have significant impact if they occur. Methods such as scenario and/or sensitivity analysis are applied in a “what if” exercise: a rough estimation of what might happen if certain “extreme but plausible” risks were to crystallize. In the decade-and-a-half since the concept was first introduced, stress testing has been used by central banks, supervisory agencies and international organizations, such as the IMF, to identify vulnerabilities and incipient risks in the financial sector from a rapid deterioration in the operational and market environment. Stress tests are used for various purposes, which may be broadly classified as macroprudential, microprudential or risk management (Figure 1).

Figure 1.

Solvency Stress Testing Applications

Citation: IMF Working Papers 2013, 068; 10.5089/9781616355074.001.A001

Source: Authors.
Note: Top-down stress tests are either conducted using the data of individual banks and then aggregated or on an aggregated portfolio; bottom-up stress tests are conducted by individual institutions using their own internal risk models and data.

Stress testing conducted by IMF staff as part of the institution’s surveillance mandate is typically for macroprudential purposes (IMF/World Bank, 2003; Moretti and others, 2008). It is aimed at assessing system-wide resilience to shocks over the medium term, uncovering vulnerabilities to any rapid deterioration in the macroeconomic environment and, more generally, identifying potential threats to overall financial stability. In this context, IMF stress tests (for both solvency and liquidity risks), notably in FSAPs, tend to incorporate very severe stress scenarios to assess the ability of the financial system to withstand tail risks. The findings of the IMF’s surveillance stress tests typically do not require management action by financial institutions; rather, they are used to inform policy discussions with country authorities about the frameworks in place to deal with systemic shocks.

Ultimately, the robustness and credibility of IMF stress tests are largely dependent on the extent of the cooperation extended by country authorities, which is crucial in terms of the scope of the exercise (see below). Article VIII of the Articles of Agreement of the IMF states that member countries are under no obligation to disclose information of individuals or corporations. This means that the IMF cannot compel country authorities to provide the necessary confidential bank-by-bank data for the stress tests. In some cases, authorities have refused outright to share any supervisory information and IMF staff has had to rely solely on publicly available data, which reduces the specificity of the results; in others, authorities have only consented to running the tests themselves, based on some agreed upon parameters, and sharing the aggregated results. The recourse for IMF staff is to ensure that the transparency of the process—or any limitations thereof—is clearly documented in the official documents.

The IMF’s objectives may be contrasted with the stress testing undertaken by supervisory authorities, usually for microprudential purposes (Fell, 2006). Such exercises are normally embedded in regular supervisory processes wherein the supervisor would run stress tests involving individual institutions on a periodic basis to assess their financial soundness under adverse economic conditions, such as the U.S. Comprehensive Capital Analysis and Review (CCAR) exercise (Board of Governors of the Federal Reserve System, 2012a and 2012b). Supervisory stress tests may be applied regardless of whether an institution is systemic, and “failure” would typically require some form of management action, which may include recapitalization.

The crisis has introduced a new concept of stress testing, i.e., that with a crisis management objective, which IMF staff refers to as “crisis stress testing.” Largely macroprudential, as the aim is to restore and sustain market confidence in the financial system, it can also be considered microprudential in that it examines the soundness of individual financial institutions and “failure” would typically require recapitalization or even restructuring. Such stress tests tend to have a more short-term focus, compared to surveillance stress tests. In the United States, system-wide (solvency) stress testing of banks was used by the authorities in 2009 for crisis management purposes, through the SCAP exercise (Board of Governors of the Federal Reserve System, 2009), the predecessor of the CCAR; the EU authorities also made a similar effort through the region-wide stress testing exercise conducted by the Committee of European Banking Supervisors (CEBS) in 2009 and 2010 and then by its successor, the European Banking Authority (EBA) in 2011 (CEBS, 2010; EBA, 2011a, 2011b), as did Ireland (Central Bank of Ireland, 2011) and Spain (Banco de España, 2012). IMF teams working on crisis countries may sometimes run stress tests to determine the condition of the banking sector as an input in designing a program.

Separately, financial institutions regularly carry out stress tests for risk management purposes. In these internal exercises, financial institutions develop and implement their own stress testing programs which assess their ability to meet capital and liquidity requirements under stressed conditions. IMF staff sometimes relies on banks’ stress testing infrastructure for the FSAP bottom-up stress tests (see below). In some countries, supervisors have issued guidance on stress testing to the financial institutions under their supervision (e.g., Hong Kong, Singapore and the United Kingdom). However, this practice is not yet widely implemented, including in some of the world’s largest financial systems. The Basel Committee on Banking Supervision (BCBS) has also issued guidelines for stress testing by individual banks (BCBS, 2009), followed up by a peer review of supervisory authorities’ implementation of those principles (BCBS, 2012a).

III. A Framework for Bank Solvency Stress Testing

The objective of the bank solvency stress tests conducted by IMF staff is to assess the soundness of banking systems under adverse macroeconomic conditions. Tests are designed to anticipate banking sector performance relative to a pre-defined baseline scenario in the event of a manifestation of severe macrofinancial stress over the short and medium term. The aim is to determine the sector’s vulnerabilities and its capacity to absorb shocks.

Within the framework, the development of plausible and coherent tests requires a thorough understanding of the financial system in question and its institutions. In other words, knowledge of structural and other specific characteristics of a particular financial sector is crucial if particular nuances are to be adequately captured. The differences in banks’ business models, their role in the domestic financial sector and increasingly, cross-border linkages must also be taken into account. While financial intermediation in smaller countries lends itself quite readily to the identification of vulnerabilities, more complex banks in larger economies and financial centers may create conceptual challenges for stress testing.

Since the inception of the FSAP, the IMF has conducted assessments of about 140 countries, comprising advanced, emerging and low-income countries. Of these, solvency stress tests have been conducted in practically all instances in recent years. Thus, the framework for FSAP solvency stress tests must necessarily be applicable across financial systems—it should support appropriate and consistent applications of assumptions and models and be sufficiently flexible to accommodate vastly different circumstances (e.g., normal or crisis times), systems (e.g., sophisticated or basic), regulatory regimes (e.g., Basel I or Basel II/III) as well as be sensitive to when and how the outcomes are presented and communicated (Table 2). Further, the FSAP stress testing exercise necessarily requires trade-offs among the scope, scenario design and methodologies applied in the context of staff and authorities’ resources and time constraints.

Table 2.

A Framework for Macroprudential Bank Solvency Stress Testing

Source: Authors.

A. Scope

The scope of a stress testing exercise needs to be sufficiently comprehensive to capture the key aspects of a particular financial system. Key considerations are: (i) the stress testing approach(es); (ii) the coverage in terms of the institutions, their market shares and the sources of their earnings and exposures; and (iii) the source(s), granularity and timeliness of the data applied and their reliability. In this regard, stress tests conducted by IMF staff for financial surveillance purposes are typically undertaken in close collaboration with supervisory authorities. In many instances, staff is given access to the necessary granular, supervisory data during FSAPs (on agreement of strict confidentiality); data quality is further enhanced when individual financial institutions participate in the exercise.


In FSAPs, surveillance stress testing of banks’ solvency risk usually consists of a “top-down” (TD) approach, which is sometimes combined with a “bottom-up” (BU) approach. These are carried out in the following manner:

  • TD tests are conducted by IMF staff or by the authorities or by both, typically in close collaboration with one another. In these exercises, tests are either conducted using the data of individual banks and then aggregated, or on an aggregated group of banks to analyze the impact of pre-defined shocks on the system as a whole. A common macrofinancial environment is assumed and a standardized set of behavioral assumptions (see below) is applied across the board. TD stress tests may be used as a standalone analysis or to complement the BU exercise, if one is conducted.

  • The BU approach is used by FSAP teams where authorities are supportive of having individual institutions conduct their own stress tests and banks have sufficient expertise to do so. In the BU approach, individual institutions run the stress tests using their own internal risk models and data. As with the TD approach, common macroeconomic shocks and selected standardized assumptions are prescribed by IMF staff to isolate the impact of shocks on banks’ financial soundness in order to identify specific vulnerabilities.

IMF staff advocates conducting both BU and TD stress tests, as much as possible, to enrich the surveillance analysis in FSAPs. Each approach has its strengths and weaknesses, and the two are considered complementary for cross-validation purposes rather than substitutes for one another. The process of reconciling the BU and TD results is usually an important learning process in itself, with any divergence in the results from the two approaches usually traced to differences in the model design, the scope of the stress testing exercise (including the type of underlying data used), behavioral assumptions and/or modeling of sensitivities. For instance, bank-specific assumptions and the application of internal models based on more granular data can lead to differences in the projection of profits and losses, and consequently in the impact on the capital ratios, for individual banks under the various scenarios.
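The two top-down variants described above (shocking bank-level data and then aggregating, versus shocking an aggregated portfolio) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than part of the IMF framework: the `Bank` fields, the single uniform loan loss rate, the constant risk-weighted assets, and the balance sheet figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    name: str
    capital: float  # regulatory capital (hypothetical units)
    rwa: float      # risk-weighted assets, held constant in this sketch
    loans: float    # gross loan exposure hit by the shock


def post_shock_ratio(capital, rwa, loans, loss_rate):
    """Capital ratio after deducting loan losses from capital."""
    return (capital - loss_rate * loans) / rwa


def top_down_bank_level(banks, loss_rate):
    """Variant 1: apply the common shock bank by bank, then aggregate."""
    ratios = {
        b.name: post_shock_ratio(b.capital, b.rwa, b.loans, loss_rate)
        for b in banks
    }
    system_capital = sum(b.capital - loss_rate * b.loans for b in banks)
    system_rwa = sum(b.rwa for b in banks)
    return ratios, system_capital / system_rwa


def top_down_aggregate(banks, loss_rate):
    """Variant 2: apply the common shock to the aggregated portfolio."""
    return post_shock_ratio(
        sum(b.capital for b in banks),
        sum(b.rwa for b in banks),
        sum(b.loans for b in banks),
        loss_rate,
    )


# Hypothetical two-bank system and a 4 percent loan loss rate.
banks = [Bank("A", 12.0, 100.0, 150.0), Bank("B", 5.0, 50.0, 80.0)]
ratios, system_ratio = top_down_bank_level(banks, 0.04)
aggregate_ratio = top_down_aggregate(banks, 0.04)
```

With a uniform shock, both variants give the same system-wide ratio, but only the bank-level variant shows how the impact is distributed across institutions. In this sketch, bank B ends up well below bank A even though the system-wide figure masks the difference.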

The decision as to whether BU stress tests are conducted to complement TD tests or if TD stress tests are performed by country authorities or by IMF staff, or jointly, is mostly made on an ad-hoc, country-by-country basis. It is usually based on data and resource availability, and on the receptiveness and degree of involvement by authorities. Around half of the FSAPs to the major countries since 2010 have run both BU and TD tests (e.g., Australia, China, France, India, Indonesia, Japan, Mexico, Russia, Turkey and the United Kingdom). TD tests are either conducted by the IMF team only (e.g., Indonesia, Netherlands, Turkey, Saudi Arabia, Sweden, India and Australia) or by the authorities only (e.g., Luxembourg, Russia and Japan), or in some cases, separately by both, using different methods (e.g., China, France, Mexico and the United Kingdom).

The solvency stress testing of the banking sector in the 2011 United Kingdom FSAP Update epitomizes the necessary collaboration among country authorities, the IMF and individual financial institutions (IMF, 2011a). In this instance, both BU and TD solvency stress tests are conducted (together with TD liquidity risk stress tests). BU stress tests are run by the seven major U.K. banks, in close coordination with the FSAP team and the Financial Services Authority (Figure 2). At the same time, TD tests are separately performed by the Bank of England (BoE) using its Risk Assessment Model for Systemic Institutions (RAMSI) and by the FSAP team using the Systemic Contingent Claims Analysis (SCCA) model, applying macroeconomic forecasts and projections from the IMF and FSA, respectively, and satellite model outputs from the BoE.

Figure 2.

Example of IMF Stress Testing Exercise: U.K. FSAP Update

Source: IMF (2011a).


The coverage of a stress test is crucial for the usefulness and thus credibility of the exercise. Ideally, surveillance stress testing of the banking sector for macroprudential purposes should include all institutions, if data availability and resources permit. Realistically, all systemically important institutions, as well as second-tier banks which are potentially systemic depending on circumstances, should be covered. Smaller institutions which may be considered at risk could also be included.

FSAPs typically focus on stress testing the major commercial banks in their respective jurisdictions. The banks included in the various major country stress testing exercises, usually selected in collaboration with the authorities, have accounted for 60 percent or more of the total assets of the sector, and up to 100 percent in six of the 18 major countries in which FSAPs have been conducted since 2010 (e.g., Brazil, India, Indonesia, Japan, Luxembourg and Russia). Where resource constraints dictate that only a small sample of banks can be considered, especially in the case of BU stress tests, the usual practice is to focus on the obviously systemic institutions.
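As a rough illustration of the market share coverage measure, the sketch below picks the largest banks by total assets until a target share of sector assets is reached. The function name, the greedy selection rule and the asset figures are hypothetical; in practice the sample is determined with the authorities and also reflects systemic importance, not size alone.

```python
def select_banks_for_coverage(assets_by_bank, sector_total_assets, target_share=0.60):
    """Return the largest banks needed to reach target_share of sector assets.

    assets_by_bank: mapping of bank name to total assets (hypothetical data).
    sector_total_assets: total assets of the whole banking sector.
    target_share: illustrative threshold, set here to 60 percent.
    """
    if sector_total_assets <= 0:
        raise ValueError("sector_total_assets must be positive")
    selected, covered = [], 0.0
    # Walk through banks from largest to smallest by total assets.
    for name, assets in sorted(
        assets_by_bank.items(), key=lambda item: item[1], reverse=True
    ):
        if covered / sector_total_assets >= target_share:
            break
        selected.append(name)
        covered += assets
    return selected, covered / sector_total_assets


# Hypothetical sector with total assets of 1,000, of which 900 sit in four banks.
assets = {"A": 400.0, "B": 250.0, "C": 150.0, "D": 100.0}
sample, share = select_banks_for_coverage(assets, 1000.0)
```

If the target cannot be reached with the banks supplied, the function returns all of them together with the share they actually cover.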

The identification of systemically important domestic banks is still not clear-cut. While some banks are of obvious systemic importance in their own respective countries and their selection for stress tests is indisputable, the difficulty has been in identifying those that are systemic at the margins, e.g., some of the smaller institutions which may have the potential to become systemic depending on the environment at a particular point in time (IMF/BIS/FSB, 2009). Thus, the definition of what constitutes a systemic bank remains largely ad hoc in IMF-related stress testing exercises, and a more structured approach is desirable. The BCBS methodology for identifying global systemically important banks (G-SIBs) has facilitated this process (BCBS, 2011; FSB, 2011), while the guidelines on the implementation of supervisory measures for domestic systemically important banks (D-SIBs) and the policy recommendations by the Financial Stability Board (FSB, 2012) for their identification represent another positive step in this direction (BCBS, 2012b).


The availability and sufficiency of timely and reliable data underpin the robustness and credibility of the stress test results. The type, quantity and quality of data play a crucial role in determining the stress tests that can be conducted, the risks that are possible to cover and the models that may be applied in the tests (Howard, 2009). As much as possible, FSAP stress tests utilize the latest audited and corresponding supervisory data alongside the latest macroeconomic projections, all of which contribute to the determination of the appropriate cut-off date. In situations where the authorities are less forthcoming, IMF staff relies on publicly available data on individual banks, which may be less granular. Supervisory data have been provided in almost all major country FSAPs to date. In 17 out of 18 cases, supervisory authorities have made available to IMF teams the relevant data from regulatory returns, which are usually supplemented by publicly available information; only in one instance was staff wholly dependent on public information for the stress testing exercise.

One area in which there has been little standardization across FSAPs is the nature of consolidation of bank financial data applied to the stress tests. While about half of the FSAPs to date have used consolidated banking group data for the stress tests (e.g., Australia, China, Brazil, France, Japan, Netherlands, Sweden, the United Kingdom and the United States), most of the others have utilized unconsolidated local entity data (e.g., Germany, India, Luxembourg, Mexico, Russia, Spain and Turkey).

The main focus of bilateral FSAPs is typically on the domestic banking system, which suggests that data of banks’ local businesses should be utilized on a local-consolidated basis. Such data would avoid double counting local business operations. The use of consolidated level data would not allow consideration of issues such as ring-fencing of subsidiary profits, capital and liquidity by host countries, which may be important for large international groups (Cerutti and Schmieder, 2012). That said, the decision as to which type of data to use may sometimes be moot as it could be constrained by the type of data that are collected for supervisory purposes.

The use of forward-looking market data and other variables reflecting point-in-time risks to complement accounting information is growing, especially for data-rich advanced economies. Market data have been found to add value to the analysis insofar as they provide corroborating evidence of market perceptions of what existing book values represent. They can also be used as a benchmark for internal ratings based (IRB) parameters (i.e., those derived from banks’ own credit risk models to quantify required capital) and for other risks, namely, market risk and operational risk.

An important caution with regard to stress testing in general lies in the use and interpretation of the data. Expert judgment is a crucial supplement to the quantitative approach at all times. It should also be emphasized that FSAPs do not conduct audits of banks’ accounts and therefore cannot corroborate the quality of the reported data used in stress tests. In instances where staff may be concerned about the effects of issues such as loan misclassifications and/or lender forbearance on the accuracy of the data, caveats should be explicitly noted (e.g., Spain and the United Kingdom).

B. Scenario Design

Risk horizon

For surveillance purposes, the choice of a risk horizon is important in terms of designing an exercise that would yield valuable information for policy discussions. Covering a longer risk horizon for macro-scenario solvency stress tests offers several benefits, namely: (i) major macrofinancial distress events typically have a lasting impact spanning several years, especially in the case of credit risk; and (ii) regulatory reforms are likely to be protracted and take several years to implement (e.g., the implementation of Basel III). While the degree of uncertainty also increases as the risk horizon lengthens, surveillance stress testing is not a forecasting exercise; rather, the exercise should adequately capture any medium-term effects of shocks. In contrast, sensitivity tests are usually applied to assess instantaneous shocks.

It is important to balance consistency of the risk horizon across countries with the usefulness of the findings for individual country circumstances. As in other aspects of stress testing, expert judgment is crucial: while major country FSAPs typically apply a five-year risk horizon in their macro-scenario design, exceptions may be made in cases where staff is of the view that a longer horizon would not be constructive. As an example, the FSAP stress test for Spain applies a two-year risk horizon to accommodate the rapidly changing financial landscape as a result of ongoing restructuring efforts (IMF, 2012b). In the majority of emerging market economies, whose banking systems are less mature (e.g., Indonesia, China, Turkey, Mexico), risk horizons of 1–3 years have been used.

Stress scenarios

Stress tests are based on scenario shocks and/or sensitivity analysis. In scenario tests, a baseline scenario is first established and post-shock assessments are made relative to the baseline scenario. In FSAPs, the IMF’s World Economic Outlook (WEO) projections are typically used as the baseline for stress tests. Stress scenarios are then defined based on either historical simulation; hypothetical scenarios that have not yet happened but are particularly relevant given specific vulnerabilities in banks’ portfolios; or ad-hoc expert judgment. The stress scenarios are then applied consistently across banks within the same system.

One of two approaches is used to construct the appropriate stress scenarios for FSAP solvency tests, depending on the availability of data and the modeling capabilities. Scenarios may reflect a hypothetical state of risk parameters under stress affecting solvency conditions (the “direct approach”), which is often used in the case of ad-hoc scenarios or historical simulation, or be based on adverse macroeconomic scenarios, which need to be translated into financial stress parameters (the “indirect approach”). The latter approach consists of:

  • An estimation of economic and financial variables conditional on the macroeconomic scenario. Common methods for predicting economic and financial variables conditional upon certain macroeconomic conditions include: (i) structural econometric models; (ii) vector autoregressive (VAR) methods; and (iii) pure statistical approaches (Foglia, 2008). As a general rule, these macrofinancial linkages would need to be clearly documented and back-tested.

  • The translation of these economic and financial variables into financial risk parameters via various types of “satellite” (or auxiliary) models. This step links different macrofinancial shocks, reflected in macroeconomic variables, to the main determinants of bank solvency, i.e., pre-impairment profit, impairments and risk-weighted assets (RWA), since macroeconomic models do not usually include financial balance sheet variables (and credit aggregates in particular). Common explanatory variables include:

    • (i) macroeconomic variables, such as economic growth, unemployment, short- and long-term interest rates, inflation, and exchange rates;

    • (ii) sectoral (asset price) indicators, such as residential and commercial real estate prices and equity market conditions (Figure 3); as well as

    • (iii) micro-level data, such as bank-specific credit growth (e.g., deleveraging under severe stress conditions), which could also be modeled as a macroeconomic variable, operational/financial leverage and funding gaps.

Figure 3.

Example of Macro Scenarios for Stress Testing: U.K. FSAP Update

Citation: IMF Working Papers 2013, 068; 10.5089/9781616355074.001.A001

Source: IMF (2011a). Note: BoE fan charts are based on BoE, rather than WEO, projections.

Recent FSAPs have attempted to introduce similarly severe macro-scenario shocks in the respective solvency stress tests. The aim has been to facilitate the identification of other factors that drive differences across institutions and to facilitate comparisons across peer countries. Growth shocks are defined in terms of standard deviations from long-term historical averages, usually one (mild adverse) and/or two (severe adverse) standard deviations, over varying periods as deemed appropriate. For example, the two standard deviation shock imposed on the Australian banking system is estimated over a 50-year period, whereas the two standard deviation shock applied to several EU countries is calculated over a 30-year period. In about half the exercises, a prolonged slow growth scenario is also included as a separate stress (e.g., Australia, Brazil, China, Germany, Japan, Turkey, Sweden, the United Kingdom and the United States).
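As a rough illustration of how such shock magnitudes might be derived, the sketch below computes growth rates lying one and two standard deviations below a long-term historical average. The growth series, the function name `adverse_growth` and the `window` argument are hypothetical; how an individual FSAP maps the shock onto a multi-year path is not specified here and is left out.

```python
from statistics import mean, stdev

def adverse_growth(history, n_sd, window=None):
    """Growth rate n_sd sample standard deviations below the historical mean.

    `window` restricts the calculation to the most recent observations,
    mimicking the choice of a 30-year versus a 50-year estimation period.
    """
    sample = history[-window:] if window else history
    return mean(sample) - n_sd * stdev(sample)

# Hypothetical annual real GDP growth rates in percent (not actual data).
growth_history = [3.0, 2.0, 4.0, 1.0, 5.0, 3.0, 2.0, 4.0]

mild_adverse = adverse_growth(growth_history, n_sd=1)
severe_adverse = adverse_growth(growth_history, n_sd=2)
recent_severe = adverse_growth(growth_history, n_sd=2, window=4)

print(f"historical mean:   {mean(growth_history):.2f}")
print(f"mild adverse:      {mild_adverse:.2f}")
print(f"severe adverse:    {severe_adverse:.2f}")
print(f"severe, 4-yr window: {recent_severe:.2f}")
```

The same nominal "two standard deviation" label can therefore correspond to different growth outcomes depending on the estimation period, which is one reason the period should be reported alongside the shock size.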

Flexibility in the scenario design remains key, even though the application of these shock magnitudes has become more or less a general rule of thumb for recent FSAPs. The prevailing macroeconomic environment and main risks to financial stability should continue to drive the decision as to what constitutes the most appropriate and credible tail shock scenario(s) for a particular financial system. For example, the issue of overheating was a key risk for Turkey at the time of its FSAP and was therefore incorporated into the design of the stress scenario. For Spain, the one standard deviation shock is applied to a revised baseline that took into account the rapidly deteriorating economic outlook and a fiscal adjustment.

Nonetheless, there remains significant room for improvement in this area of the IMF’s stress testing work, notably, from a spillover perspective. A current weakness is the inability of IMF staff to extend the shock scenarios for the home country in question to consistently and comprehensively quantify the impact that such scenarios may have on the macro environments of other countries where the international banks in question are active. In such cases, IMF staff sometimes has to rely on the banks themselves to estimate the corresponding scenarios in their footprint countries in BU exercises, potentially giving rise to inconsistencies in projections, and thus the resulting impact on banks’ financial performance and position possibly for the same countries.

FSAP stress scenarios emphasize the importance of tail risks. The tests are aimed at identifying the vulnerabilities of a country’s financial system and the ability of its supervisory and crisis management frameworks to deal with the realization of extreme but plausible risks. In the U.K. FSAP, for instance, capital losses are estimated for a 0.1 percent probability event (IMF, 2011a and 2011b)—the U.K. Financial Services Authority (FSA) had ascribed a two percent probability to a two standard deviation shock to growth materializing (FSA, 2011) and the IMF’s model subsequently calculates capital losses at the 95th percentile of this scenario (i.e., falling into the 5 percent tail of the scenario distribution), i.e., 0.05 × 0.02 = 0.001. That said, it is sometimes difficult to convince national authorities of the importance of running extreme tail scenarios that would show the demise of their financial institutions or system. A useful way forward may be to also run reverse stress tests, i.e., stress tests that aim to determine scenarios that would cause a bank to become insolvent.
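A reverse stress test of the kind mentioned above can be viewed as a search for the shock size at which a solvency threshold is just breached. The sketch below is a deliberately simplified, hypothetical example: a single loss rate on the loan book is written off against capital, RWA are held constant, and bisection finds the smallest loss rate at which the capital ratio falls to a chosen threshold (zero for outright insolvency, or a regulatory hurdle). All names and numbers are assumptions for illustration.

```python
def capital_ratio(capital, rwa, loans, loss_rate):
    """Post-shock capital ratio when loss_rate * loans is written off (RWA held constant)."""
    return (capital - loss_rate * loans) / rwa

def reverse_stress(capital, rwa, loans, threshold, tol=1e-9):
    """Smallest loss rate in [0, 1] at which the ratio falls to the threshold.

    Returns None if even a total loss on the loan book leaves the ratio above it.
    """
    lo, hi = 0.0, 1.0
    if capital_ratio(capital, rwa, loans, hi) > threshold:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if capital_ratio(capital, rwa, loans, mid) > threshold:
            lo = mid   # still above the threshold: the breach needs a larger shock
        else:
            hi = mid   # threshold breached: try a smaller shock
    return hi

# Hypothetical bank: capital 10, RWA 100, loans 120, threshold 8 percent.
breaking_loss_rate = reverse_stress(10.0, 100.0, 120.0, threshold=0.08)
print(f"loss rate that breaches the threshold: {breaking_loss_rate:.4%}")
```

In this linear setting the answer could be obtained in closed form; the bisection is used only because it carries over to cases where the capital ratio is a non-linear function of the shock.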

Separately, sensitivity tests provide useful information on the immediate impact of individual shocks. These are usually applied as the only type of stress tests for financial systems with little or poor quality data, or to complement the scenario analyses conducted on more complex financial systems. Several risk factors could also be combined to determine the impact of concurrent multiple shocks to a system. Sensitivity analysis has been conducted in the majority of major country FSAP stress testing exercises on various market risk factors.

Risk factors

The selection of main risk drivers to incorporate and the choice(s) and manner in which they are integrated (or not) have significant bearing on the stress test results. The focus on risks in FSAP solvency stress tests, and thus the manner of tests conducted, has evolved and expanded over time and indeed sharpened following the global financial crisis. FSAPs attempt to cover all key risks borne by a financial institution and the system as a whole. Prior to the global financial crisis, these tests had focused largely on credit and market risks (e.g., interest rates, exchange rates, equity, credit spreads and commodity prices). While these risks remain the mainstay of FSAP solvency stress tests, lessons learned since the onset of the crisis have also motivated the inclusion of additional types of exposures.

Risks which had previously been in the periphery have taken center stage in the design of FSAP stress tests in the throes of the current crisis. They include:

  • Exposures to sovereign and other previously low-default assets. Prior to the global financial crisis, exposures to sovereign debt did not figure prominently in stress tests, if at all. They were considered “risk free” and were typically assigned the lowest (often zero) risk weightings for regulatory capital requirements under the Basel framework. However, recent FSAPs have acknowledged rising sovereign risks by estimating the potential asset price losses for such exposures. The future yield-to-maturity (and the corresponding haircut) of a bond of a given country can be determined based on the impact of changes to the individual sovereign credit risk on its bond price (Jobst and others, forthcoming). The same issue applies to other previously low-default portfolios, such as holdings of bank debt. Shocks to bank holdings of sovereign assets have been incorporated into the recent FSAP stress tests in S-25 EU countries, as well as Japan; and some have also applied the same treatment to portfolios of bank debt.

  • Banking and trading books. For securities, stress tests had previously considered shocks to trading books only, largely because longer horizons were not covered. However, some institutions moved their securities to banking books during the crisis, supported, in some cases, by regulatory forbearance, underscoring the need for stress tests to cover securities in all accounts. FSAP stress tests now attempt to estimate valuation losses in both the available-for-sale (AfS) portfolio, where losses are not included in net income but put through a reserve under shareholders’ equity (unlike those associated with the trading book), and the held-to-maturity (HtM) portfolio in the banking book (modeled via provisions). However, not all country authorities are receptive to a comprehensive application of shocks to banks’ securities holdings. In the FSAP stress tests for France, Japan, the Netherlands, Sweden and the United Kingdom, valuation haircuts are applied to both portfolios (excluding the “AAA”-rated sovereigns in the HtM portfolio), but only to the AfS portfolio in the case of Russia and Spain.

  • Funding costs. The experience from the global financial crisis has emphasized the importance of incorporating the impact of rising funding costs on bank solvency (as part of the simulation of income under stress more generally). Funding costs change disproportionately to changes in solvency conditions, rising sharply as a bank’s capital position worsens (especially for banks with sizeable portions of wholesale funding). Stress test calculations link net funding costs (simulating the impact on both assets and liabilities) to income. The explicit incorporation of funding costs into FSAP solvency stress tests is a nascent practice (e.g., France, Germany, Sweden and the United Kingdom) and is not yet widely implemented even for the major countries.

  • Off-balance sheet items. The crisis saw the realization of contingent liabilities arising from explicit and implicit guarantees of investment vehicles that contributed to the sudden realization of large losses. Thus, incorporating off-balance sheet positions that could give rise to such contingent liabilities (such as guarantees, commitments, and derivatives) is important to adequately capture the impact of extreme stress on all relevant exposures. That said, such data are not as readily available especially from public and sometimes even supervisory sources.

  • Cross-border exposures. Prior to the crisis, credit risk tests focused largely on banks’ exposures to domestic corporates and households, paying scant attention to their overseas exposures (through branches and subsidiaries). Since then, FSAPs have incorporated spillover risks in the form of network (e.g., Australia, France, Japan and Spain) and ring-fencing (e.g., Spain) analyses as separate modules in the TD approach. In some cases, international banks are required to take into account shocks to the countries in which they are active, in the BU assessments (e.g., the United Kingdom).

The modeling of impairment parameters has also taken on significant import during the crisis. Estimates of credit losses, usually simulated via probabilities of default (PDs) and losses given default (LGDs), and the resulting potential losses under stress should account for differences in banks’ respective business models and/or specific risks. The decision as to whether through-the-cycle (TTC) or point-in-time (PIT) PDs (and LGDs) should be applied at various points in the economic cycle could have a significant impact on the stress test results. In order to form a view on current risks, it is desirable to use PIT risk parameters, especially during stressed periods. Another key challenge in FSAPs is to ensure the availability of these parameters for the universe (or at least the majority) of banks tested, and if necessary, to proxy by other methods, such as from loan loss provisions (Schmieder and others, 2011).
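The sensitivity of results to the choice between TTC and PIT parameters can be illustrated with the standard expected loss identity, expected loss = PD × LGD × exposure at default (EAD). The sketch below is hypothetical: the portfolio segments, exposures and parameter values are invented for illustration, and EAD is an added assumption not discussed in the text above.

```python
def expected_loss(pd, lgd, ead):
    """Expected credit loss for one segment: PD x LGD x exposure at default."""
    return pd * lgd * ead

# Hypothetical portfolio: (segment, EAD, TTC PD, stressed PIT PD, LGD).
portfolio = [
    ("mortgages", 500.0, 0.010, 0.025, 0.20),
    ("corporate", 300.0, 0.020, 0.060, 0.45),
]

ttc_loss = sum(expected_loss(ttc_pd, lgd, ead) for _, ead, ttc_pd, _, lgd in portfolio)
pit_loss = sum(expected_loss(pit_pd, lgd, ead) for _, ead, _, pit_pd, lgd in portfolio)

print(f"expected loss with TTC PDs:          {ttc_loss:.2f}")
print(f"expected loss with stressed PIT PDs: {pit_loss:.2f}")
```

With these illustrative inputs the PIT-based estimate is several times the TTC-based one, which is the kind of gap that makes the choice of parameter type material for the stress test outcome.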

Factors that management controls

Standardized prescriptions which control for strategic decisions and behavioral adjustments are particularly important for surveillance stress tests. Specifically, common assumptions on factors that management controls ensure that findings on the capital adequacy of banks under adverse macroeconomic conditions can be analyzed in a consistent and comparable manner. In FSAPs, common assumptions are especially pertinent for BU stress tests, in which banks use their own internal models, although imposing them sacrifices flexibility and some degree of realism. Assumptions adopted in FSAPs are also typically (and appropriately) on the conservative side. The main behavioral variables include:

  • Balance sheet growth. This assumption determines the trend growth in core items on the assets and liabilities sides of banks’ balance sheets. FSAP stress tests typically assume constant (i.e., growing with nominal GDP or some pre-defined rule) or static balance sheets (possibly in combination with a constant credit portfolio). Indeed, major country FSAPs to date have been split almost evenly on the adoption of either assumption.

  • Credit growth. In FSAPs, credit growth assumptions are usually based on models (e.g., Brazil, Spain and Sweden) or on descriptive empirical evidence (e.g., Turkey) and expert judgment. More broadly, it is assumed that banks under stress are likely to reduce lending in line with a slowdown or reversal in balance sheet growth, usually consistent with changes in nominal GDP.

  • Dividend payout. The assessment of potential capital shortfalls takes into account assumptions regarding dividend payouts. In most of the major country FSAPs, the dividend payout is assumed to be zero under stress. For the others, assumptions include payouts based on Basel III capital conservation standards (e.g., Sweden) or on historical ratios (e.g., Brazil, France and Japan); the general rule is that dividends are assumed to be paid only by banks that satisfy all three measures of capital adequacy, as relevant (i.e., total capital, Tier 1 and core/common equity Tier 1) after making adequate provisions for asset impairments and transfers of profits to statutory reserves.

  • Strategic changes and asset disposal. FSAP stress tests typically do not consider changes to business operations that require managerial involvement, such as plans to increase operational efficiencies. Moreover, non-realized and/or strategic disposals (e.g., loan books in run-off or sales of non-core businesses) or acquisitions (except when there are legally binding commitments under competition rules, e.g., as agreed with the European Commission in the case of EU countries) are generally eschewed. Firms are also assumed to replace maturing exposures unless there is a sound basis for assuming that this will not happen (e.g., deleveraging plans for banks in IMF program countries).
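The general dividend rule described in the list above (payouts only by banks that meet all three capital adequacy measures after provisions and transfers to reserves) can be written as a simple conditional. The sketch below is a hypothetical illustration: the class and function names, the payout ratio and the hurdle values are assumptions, and the ratios are taken as given rather than recomputed after the payout.

```python
from dataclasses import dataclass

@dataclass
class CapitalRatios:
    total: float   # total capital ratio
    tier1: float   # Tier 1 ratio
    cet1: float    # core/common equity Tier 1 ratio

def stressed_dividend(distributable_profit, ratios, hurdles, payout_ratio):
    """Dividend paid only if profit is positive and all three ratios meet their hurdles.

    distributable_profit is profit after provisions for impairments and
    transfers to statutory reserves.
    """
    meets_all = (
        ratios.total >= hurdles.total
        and ratios.tier1 >= hurdles.tier1
        and ratios.cet1 >= hurdles.cet1
    )
    if distributable_profit <= 0 or not meets_all:
        return 0.0
    return payout_ratio * distributable_profit

# Illustrative hurdles and two hypothetical banks.
hurdles = CapitalRatios(total=0.08, tier1=0.06, cet1=0.045)
strong_bank = CapitalRatios(total=0.12, tier1=0.10, cet1=0.08)
weak_bank = CapitalRatios(total=0.09, tier1=0.065, cet1=0.04)  # fails the CET1 hurdle

paid_by_strong = stressed_dividend(10.0, strong_bank, hurdles, payout_ratio=0.4)
paid_by_weak = stressed_dividend(10.0, weak_bank, hurdles, payout_ratio=0.4)
print(paid_by_strong, paid_by_weak)
```

Setting `payout_ratio` to zero reproduces the more common FSAP assumption of no dividends under stress.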

There is not necessarily a specific “best practice” associated with each assumption on the factors that bank management controls. However, conservatism should be an important consideration. FSAPs have sought to ensure some uniformity in their application where possible, and to match their specific relevance to the country in question. Detailed guidance on these assumptions is usually provided for FSAP stress tests as relevant (Appendix II).

C. Capital Standards

The capital standards applied in a stress test comprise several components, which are key in the determination of bank solvency, the main objective of the stress test, and are therefore critical to measure appropriately. The main elements underpinning a capital assessment are: (i) the definition of capital; and (ii) the calculation of capital adequacy, which requires decisions on the capital metric(s), hurdle rate(s), assumptions on RWA, and the nature of data consolidation. In the event that capital shortfalls arise under stress, the amount of potential recapitalization needed post-stress is estimated. From a transparency perspective, the composition of the various definitions of capital that are applicable to a particular jurisdiction would ideally be disclosed (Appendix III), along with information on the planned adoption of regulatory changes (e.g., the phasing-out of some types of eligible capital; BCBS, 2010a and 2010b) over the stress test risk horizon.
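One simple way to express the recapitalization need is as the capital required to restore the post-stress ratio to the hurdle rate, holding RWA fixed. The sketch below follows that reading; the bank names, balance sheet figures and the 8 percent hurdle are hypothetical, and actual FSAP calculations may differ (for example, by applying several metrics at once).

```python
def capital_shortfall(capital, rwa, hurdle):
    """Capital needed to lift the post-stress ratio to the hurdle (zero if already above it)."""
    return max(0.0, hurdle * rwa - capital)

# Hypothetical post-stress positions: (bank, capital, RWA), in the same currency units.
banks = [("Bank A", 9.0, 100.0), ("Bank B", 5.0, 80.0), ("Bank C", 12.0, 120.0)]
HURDLE = 0.08  # illustrative total capital hurdle rate

shortfalls = {name: capital_shortfall(capital, rwa, HURDLE) for name, capital, rwa in banks}
system_shortfall = sum(shortfalls.values())

for name, amount in shortfalls.items():
    print(f"{name}: shortfall {amount:.2f}")
print(f"system-wide recapitalization need: {system_shortfall:.2f}")
```

Surpluses at banks above the hurdle are not netted against shortfalls elsewhere, since capital is generally not transferable across institutions.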

The definition of capital applied in FSAPs is usually that required by local regulations. Use of the Basel I definition among the major countries is rare (e.g., Indonesia, and specific groups of banks for the United States). With almost all major countries having adopted Basel II (BCBS, 2012c) and many in transition to Basel III (BCBS, 2012d), capital definitions used in FSAP stress tests depend on the jurisdiction under consideration.2 They either:

  • follow Basel II requirements (e.g., Australia, India, Indonesia, Luxembourg, Netherlands, Turkey, Russia and the United States);

  • change in line with the Basel III transition schedule for those that are moving to the new regime (e.g., Brazil, France, Japan, Spain and Sweden); in a couple of cases, own national transitional schedules are applied (e.g., Brazil and Japan);

  • use benchmark parameters from the BCBS’ Sixth Quantitative Impact Study (QIS-6; BCBS, 2010a)—a comprehensive study to ascertain the impact of Basel III on the global banking system—to simulate the likely impact of regulatory reforms on bank solvency (e.g., Germany and the United Kingdom, where a separate and additional transitioning arrangement is also included for the BU exercise in the form of the interim capital regime); or

  • apply a separate local regulatory capital definition (i.e., Mexico).

Capital metrics, and hence the appropriate hurdle rates, which are used to define bank solvency typically vary across countries. For countries where the Basel II capital definition is applied, total regulatory capital is used to determine the hurdle rate. Where the Basel III transition or a national modified version is applied, the metrics usually comprise total capital, Tier 1 capital and core Tier 1 capital, along with the associated Basel III hurdle rates (Table 3) or the national requirements, respectively. On a couple of occasions, the hurdle rates have included the capital conservation buffer (e.g., France and Japan), while the loss absorbency requirement for G-SIBs is also captured in one case (i.e., France). In a few cases, hurdle rates are set in line with existing regulatory standards (e.g., Australia and the Netherlands, and the United Kingdom as an additional benchmark). In one instance, the 2019 Basel III target for core Tier 1 is applied as a supplementary benchmark for crisis credibility purposes (i.e., Spain).

Table 3.

Original Basel III Transition Schedule

Currently Basel II. Transition to Basel III.
Source: BCBS.
Note: See BCBS (2010b and 2010c) and Appendix III for capital definitions. According to recent revisions to the liquidity risk framework under Basel III (BCBS, 2013), the introduction of the Liquidity Coverage Ratio (LCR) will now be graduated. Specifically, the LCR will be introduced as planned on January 1, 2015, but the minimum requirement will begin at 60 percent, rising in equal annual steps of 10 percentage points to reach 100 percent on January 1, 2019.

The manner in which RWA is assumed to change over the risk horizon is another important variable in determining capital adequacy. Although the rising riskiness of assets in stress scenarios has to be recognized—as implied by the positive relationship between RWA (i.e., potential worst-case losses) and default risk (and the resulting recovery rates) in economic capital models and the credit risk assumptions underpinning Basel II—actual practice has varied across major country FSAPs to date, namely:

  • RWA are kept constant (e.g., China, Japan and Mexico);

  • RWA weights are kept constant, but the total RWA amounts are adjusted for credit growth and/or credit losses. It corresponds approximately to the evolution of RWA for banks using the Basel II Standardized Approach (e.g., Russia, Saudi Arabia and Spain);

  • RWA weights change under stress due to changes in the risk profile, in addition to the effects from asset growth (e.g., France, Germany, Japan, Luxembourg, Netherlands and the United Kingdom). It is consistent with the rules for risk weights according to Basel II, 2.5, and III, which are either implicitly captured (e.g., based on QIS information, such as for Germany and the United Kingdom), or are treated more explicitly (e.g., France). In other words, the evolution of RWA is determined by changes in the estimated PDs and LGDs on a firm and/or portfolio level for IRB banks, while accounting for the evolution of total credit exposure under stress. For some countries, implicit IRB risk weights were simulated, to reflect the economic risk profile of banks that are still under the standardized approach (e.g., Brazil); or

  • RWA for operational and market risks are often assumed to remain unchanged, or to change proportionally with the changes in RWA for credit risk (mainly for market risk). FSAP stress tests are typically based on the assumption that the asset structure of banks remains the same during the stress test horizon, i.e., that banks do not replace maturing loans with securities, which are assigned different (usually lower) risk-weights.
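The first three treatments of credit RWA listed above can be contrasted in a short sketch. For the case where risk weights change under stress, the example uses the Basel II IRB risk-weight function for corporate exposures (without the 1.06 scaling factor) as a stand-in; which function, portfolio mapping and parameters individual FSAPs used is not specified here, so the function choice, names and all numbers are assumptions for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def irb_corporate_risk_weight(pd, lgd, maturity=2.5):
    """Basel II IRB corporate risk weight as a fraction of exposure (no 1.06 scaling)."""
    w = (1 - exp(-50 * pd)) / (1 - exp(-50))
    r = 0.12 * w + 0.24 * (1 - w)                    # supervisory asset correlation
    b = (0.11852 - 0.05478 * log(pd)) ** 2           # maturity adjustment term
    z = (_STD_NORMAL.inv_cdf(pd) + sqrt(r) * _STD_NORMAL.inv_cdf(0.999)) / sqrt(1 - r)
    k = lgd * (_STD_NORMAL.cdf(z) - pd)              # capital charge for unexpected loss
    k *= (1 + (maturity - 2.5) * b) / (1 - 1.5 * b)
    return 12.5 * k

def rwa_constant(rwa_start):
    """Treatment 1: RWA kept constant over the horizon."""
    return rwa_start

def rwa_constant_weights(rwa_start, credit_growth, loss_rate):
    """Treatment 2: weights fixed; RWA move with credit growth and written-off losses."""
    return rwa_start * (1 + credit_growth) * (1 - loss_rate)

def rwa_changing_weights(exposure_start, credit_growth, stressed_pd, lgd):
    """Treatment 3: weights recomputed from stressed PDs, on top of exposure growth."""
    return irb_corporate_risk_weight(stressed_pd, lgd) * exposure_start * (1 + credit_growth)

# Hypothetical corporate portfolio.
exposure = 1000.0
base_pd, stressed_pd, lgd = 0.01, 0.03, 0.45
rwa_start = irb_corporate_risk_weight(base_pd, lgd) * exposure

paths = {
    "constant RWA": rwa_constant(rwa_start),
    "constant weights": rwa_constant_weights(rwa_start, credit_growth=-0.02, loss_rate=0.01),
    "changing weights": rwa_changing_weights(exposure, -0.02, stressed_pd, lgd),
}
for label, value in paths.items():
    print(f"{label}: {value:.1f}")
```

In this example the treatment with changing weights yields the highest stressed RWA even though exposure shrinks, so the same capital level translates into a lower post-stress ratio than under the other two treatments.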

D. Method

Once the key elements of the stress testing framework have been determined, one or more quantitative stress testing methods are used to estimate capital adequacy under projected financial stress. However, the stress testing literature to date provides little guidance on the selection and application of appropriate models in different circumstances. This issue has given rise to questions about the consistency and comparability of FSAP stress test results across countries and their implications for the associated stability analysis.

A comprehensive FSAP solvency stress testing exercise would preferably comprise three components: a balance sheet module, a portfolio model utilizing market information and spillover analysis. Balance sheet-based methods cover a wide range of items for which granular data tend to be most accessible and thus represent the core of solvency stress tests. Portfolio models are better able to capture dependencies and thus facilitate the computation of tail risks, provided that market data are available (see below). Spillover analysis, which captures contagion risk and feedback effects, has become an important element of solvency stress tests in increasingly interconnected financial systems; however, the development of robust models in this area remains nascent (e.g., Espinosa-Vega and Solé, 2011), in large part due to data limitations.

Macroeconomic and satellite modelling

System-wide stress tests that are informed by adverse macroeconomic conditions affecting the profitability and solvency of banks necessitate the use of satellite models, which help project the impact of key sources of risk. Specifically, satellite models are used to determine credit losses and various components of profit, including funding costs, under various scenarios. Under each stress scenario, macro and financial sector variables are projected as input into the solvency stress tests (Figure 4). These would have a bearing on the net interest income, non-interest income, trading income, credit growth and credit losses of banks. Satellite models can be run at the economy level, sectoral level and also at the level of individual banks or of one of their specific portfolios.

Figure 4.

General Representation of Satellite Modeling in Bank Solvency Stress Testing

Source: Authors.

The construction of satellite models typically comprises three key steps. They are:

(i) the choice of the estimation method;

(ii) the selection of the dependent variable and a set of potential explanatory variables that form the initial model specification; and

(iii) the iterative process of fitting the model (and completing robustness checks).

Various types of modeling may be used. These include time series analysis, regression models (e.g., OLS regression, logistic regression, and panel data analysis) and structural models (Foglia, 2008; Drehmann, 2009). Most major country FSAP stress tests have typically relied on the authorities’ satellite models, on the basis that these models would have undergone repeated calibrations and robustness checks over time. The FSAP team sometimes cross-validates with IMF staff’s own satellite models in parallel TD tests (see Figures 5 and 6 for application in U.K. FSAP Update).
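The three construction steps can be made concrete with a minimal satellite model. The sketch below assumes OLS as the estimation method, the logit of an aggregate default rate as the dependent variable and real GDP growth as the single explanatory variable; the data are hypothetical, and the iterative refinement and robustness checks of step (iii) are not shown. The logit transformation is a common device to keep projected default rates between zero and one, not a prescription from the text.

```python
from math import exp, log

def logit(p):
    return log(p / (1 - p))

def inv_logit(z):
    return 1 / (1 + exp(-z))

def fit_ols(x, y):
    """Ordinary least squares for a single regressor; returns (intercept, slope)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope

# Hypothetical history: annual real GDP growth (percent) and default rates.
gdp_growth = [3.1, 2.5, 0.4, -2.0, 1.2, 2.8, 3.5, 1.9]
default_rate = [0.012, 0.015, 0.028, 0.051, 0.022, 0.014, 0.010, 0.018]

intercept, slope = fit_ols(gdp_growth, [logit(p) for p in default_rate])

def projected_pd(growth):
    """Default rate implied by the fitted model for a given growth rate."""
    return inv_logit(intercept + slope * growth)

baseline_pd = projected_pd(2.0)    # growth under a hypothetical baseline
adverse_pd = projected_pd(-3.0)    # growth under a hypothetical adverse scenario
print(f"slope on GDP growth: {slope:.3f}")
print(f"baseline PD: {baseline_pd:.3%}, adverse PD: {adverse_pd:.3%}")
```

The projected default rates would then feed into the credit loss calculations of the solvency test; in practice, additional regressors, lags and bank- or sector-level panels would typically be considered.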

Figure 5.

Example of Satellite Model Estimations for Bank Solvency Stress Testing: U.K. FSAP Update

Source: IMF (2011a).

Figure 6.

Example of Application of Satellite Model Outputs to Top-down Bank Solvency Stress Test Models: U.K. FSAP Update

Source: IMF (2011a).

Stress test models

There is no one specific stress test model that is perfectly suited for a particular financial system. What is important is that the model is able to adequately capture the complexity, uniqueness and idiosyncrasies of that system, subject to data availability. In this context, an important challenge in FSAPs has been to ensure that appropriate stress test model(s) are applied on each occasion. FSAP stress tests for simple financial systems with a predominantly domestic financial sector are normally less resource intensive and require less-sophisticated models. In contrast, stress tests of more complex systems have applied correspondingly more advanced stress testing methods to capture the gamut of risks.

The stress testing methods that are applied to estimate capital adequacy under projected financial stress are based on either a deterministic or stochastic framework. Deterministic approaches are predicated on prudential information in balance sheet based stress test specifications, while stochastic frameworks incorporate uncertainty around these accounting identities using historical volatility and/or market information, usually in the context of portfolio-based models. Both approaches allow for running scenario or sensitivity analysis.

It is important to be aware of the differences between stress testing methods and their implications for the results. As a general rule, the more sophisticated the model, the greater the estimation uncertainty, an issue which needs to be taken into account when drawing policy conclusions from stress tests. At the same time, simpler methods might be inadequate for highly interconnected and complex banking sectors with large credit and market risk exposures. When different approaches (TD, BU) and models are used in an FSAP, the results are cross-validated and the differences reconciled; discussions on the assumptions and caveats attached to the different models are also included in the write-up.

There is still significant room for improvement in stress test modeling. For instance, existing FSAP stress tests do not adequately capture feedback effects beyond the initial impact of macroeconomic shocks on the banking sector, despite some recent work in this area (Vitek and Bayoumi, 2011). An important reason is that the interaction between adverse macroeconomic scenarios, such as changes in credit aggregates, and firm-level financial soundness complicates the specification of feedback effects (BIS, 2009). The literature on, and the actual use of, stress test models that incorporate feedback effects from the financial sector to the general economy remain very limited to date (Alfaro and Drehmann, 2009).

A suite of stress testing models is currently used by IMF staff for surveillance stress testing. They can be categorized into two broad strands, supplemented by a third, and are discussed below. These approaches are not mutually exclusive in that there are overlaps in the types of data that are utilized (Table 4 and Figure 7). IMF staff is presently cataloguing models developed within the institution to improve transparency in the models used in FSAPs and other areas of IMF work (Čihák and Ong, forthcoming).

Table 4.

Scorecard on Data and IMF Stress Test Models

Source: Čihák and Ong (forthcoming).
Note: For descriptions of models, see: Espinosa-Vega and Solé (2011) for the network approach; Chan-Lau and others (2012) for the extreme value theory approach; Jobst and Gray (2013) and Gray and Jobst (2011) for systemic contingent claims analysis; Segoviano and Padilla (2006) for the distress dependence framework.

Figure 7.

Stress Test Models Developed by IMF Staff

Source: Čihák and Ong (forthcoming).

The accounting-based (balance sheet) approach

This approach has the longest history of use. It is applicable to the widest range of countries (advanced, emerging and developing economies) given its relative simplicity (e.g., simulations could be done in a spreadsheet). It has the added attraction of directly producing results in terms of regulatory variables (e.g., capital adequacy ratios). A variety of such tools suitable for banking systems at various levels of development have been developed by IMF staff and deployed for FSAPs and other surveillance work and through technical assistance (Čihák, 2007; Ong and others, 2010; Schmieder and others, 2011). This approach remains the cornerstone of FSAP stress testing and continues to be applied even in the largest, most systemic financial systems as evidenced by its application in all major country FSAPs to date. The network model used for spillover analysis in FSAPs (i.e., Espinosa-Vega and Solé, 2011) can also be considered an accounting-based approach.

Market price-based models

The market price-based models are often built on portfolio risk management techniques and typically derive concise “systemic risk measures” from estimated dependencies among different risk factors (e.g., sovereign, credit and market). These dependencies are typically excluded when modeling the default risk of each institution in isolation (Segoviano and Padilla, 2006; Gray and others, 2010; Gray and Jobst, 2011; Jobst and Gray, 2013). Unlike accounting values, risk-based measures of solvency take into account the following considerations to inform the assessment of capital adequacy under stressful conditions (Figure 8):

  • The possibility that institutions may fail simultaneously (joint default risk). Most conventional stress tests do not account for default dependencies across institutions, i.e., when one risk factor increases the likelihood of realization of other risk factors (with common shocks affecting multiple firms at the same time), especially under stressful conditions. Further, given that large shocks are transmitted across entities differently from small shocks, measuring non-linear dependence in stress testing can provide important insights into the joint tail risks that arise in extreme loss scenarios. This would also include measuring the differential effects of combinations of risk factors on the realization of joint outcomes, which affects system-wide capital adequacy.

  • The sensitivity of stress test results to the historical volatility of risk factors (risk-based capital adequacy). Prudential information based purely on accounting identities observed at a certain point in time reflects the outcome of a stochastic process rather than a discrete value. In contrast, the individual and joint default risks of banks within a system vary over time and depend on the individual bank’s propensity to cause and/or propagate shocks as a result of adverse changes in one or more risk factors (a distribution-based approach). Thus, there are clear conceptual differences in loss measurement under balance sheet- and distribution-based approaches, and these affect the comprehensiveness of the capital assessment. Unlike RWA, risk-based measures of solvency (such as market-implied expected losses and the corresponding capital shortfalls) consider the actual historical dynamics of default risk, summarized by measures such as Value-at-Risk (VaR) or the Expected Shortfall, i.e., the average of extreme losses beyond VaR at a selected percentile level. Hence, in this distribution-based approach, the capital adequacy assessment takes into account the variability of both assets and liabilities at different levels of statistical confidence.
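The two tail measures named in the second bullet can be illustrated with a minimal empirical sketch. It is not the SCCA or DiDe methodology. The function name `var_and_es` and the loss sample are hypothetical. The convention used here is an assumption: VaR is the smallest loss among the worst (1 − level) share of observations, and Expected Shortfall is the average of those worst observations. Other quantile conventions exist and give slightly different numbers.

```python
def var_and_es(losses, level=0.95):
    """Empirical Value-at-Risk and Expected Shortfall of a loss sample.

    Losses are positive numbers. Assumed convention: the tail consists of
    the worst round((1 - level) * n) observations (at least one); VaR is
    the smallest loss in that tail and ES is the tail average.
    """
    ordered = sorted(losses)
    n_tail = max(1, round((1.0 - level) * len(ordered)))
    tail = ordered[-n_tail:]
    value_at_risk = tail[0]
    expected_shortfall = sum(tail) / len(tail)
    return value_at_risk, expected_shortfall


# Hypothetical loss sample: 100 equally likely outcomes from 1 to 100.
sample_losses = list(range(1, 101))
var_95, es_95 = var_and_es(sample_losses, level=0.95)

print(f"95% VaR: {var_95}, 95% Expected Shortfall: {es_95}")
```

Because Expected Shortfall averages over the whole tail, it is never smaller than VaR at the same level, which is why it is more sensitive to the extreme loss scenarios discussed above.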

Figure 8.

Key Conceptual Differences in Loss Measurements between the Accounting-based and Market Price-based Approaches

Citation: IMF Working Papers 2013, 068; 10.5089/9781616355074.001.A001

Source: Authors.

The stress test outcomes of market price-based models are likely to involve valuation methods and tend to be less tractable. They usually do not show direct links to key regulatory ratios, which need to be derived in separate, additional steps. Owing to data limitations—the prices of certain market instruments (e.g., equity prices and CDS spreads) are not always readily available—this approach has so far only been applied in FSAPs to supplement the accounting-based approach. The systemic contingent claims analysis (SCCA) and/or the distress dependence (DiDe) models have been used in only a handful of major country FSAPs, where the necessary data are available for credible implementation (e.g., Germany, Mexico, Spain, Sweden, the United Kingdom and the United States).

Macrofinancial models

Macrofinancial models represent the third strand, which may be considered more as a separate dimension of the first two. They are used to examine systemic risks arising from links between the macroeconomic and financial environments. By specifying certain macroeconomic situations, stress testers would apply consistent combinations of multiple shocks (e.g., GDP, employment, inflation, exchange rate, interest rates and asset prices) that could simultaneously affect various segments of banks’ businesses and exposures, and hence potentially amplify overall losses. Macrofinancial stress testing could be implemented with both accounting-based and market price-based models, by estimating additional macrofinancial linkage models that directly connect the macroeconomic assumptions to the risk parameters used in the simulation exercises. The market price-based models that fall into this category include the SCCA and DiDe, while satellite models may also be classified as macrofinancial in nature.
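A minimal sketch of such a linkage is given below, assuming a logistic satellite model that maps GDP growth to a probability of default (PD), which then feeds an expected loss calculation. The functional form, the coefficients (`intercept`, `beta_gdp`), the loss given default and the exposure are all hypothetical placeholders chosen for illustration. In practice, the coefficients would be estimated from country data.

```python
import math


def stressed_pd(gdp_growth, intercept=-3.5, beta_gdp=-0.25):
    """Hypothetical logistic satellite model: PD as a function of GDP growth (%).

    A negative beta_gdp means that lower growth raises the PD.
    """
    z = intercept + beta_gdp * gdp_growth
    return 1.0 / (1.0 + math.exp(-z))


def expected_loss(pd, lgd, ead):
    """Expected loss = probability of default x loss given default x exposure."""
    return pd * lgd * ead


# Hypothetical macro scenarios: annual real GDP growth in percent.
scenarios = {"baseline": 2.0, "adverse": -3.0}

scenario_losses = {
    name: expected_loss(stressed_pd(growth), lgd=0.45, ead=1000.0)
    for name, growth in scenarios.items()
}

for name, loss in scenario_losses.items():
    print(f"{name}: expected loss {loss:.1f}")
```

The resulting scenario losses could then be fed into either an accounting-based or a market price-based solvency calculation.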

E. Communication

Presentation of outputs

Stress tests are aimed at drawing the attention of senior supervisors and, if necessary, prompting action by them. It is thus important that the results be presented in an accessible manner in order to appropriately convey the findings, namely, by highlighting the relevant risks and vulnerabilities. In FSAPs, stress test results, especially those generated via the BU approach, are often aggregated by the authorities for confidentiality reasons, which means that the design of a meaningful presentation format for analysis by the FSAP team is essential (Figures 9–11). Specifically, the presentation of the aggregated results by the authorities should be:

  • consistent with local regulatory requirements and where relevant, any transition to a new regulatory regime (e.g., Basel III); and

  • sufficiently granular, such that it:

    • lists the individual institutions or (if constrained by confidentiality) peer groups, at the very least;

    • shows some measure of dispersion, such as the minimum, inter-quartile range (e.g., the 25th, 50th and 75th percentiles of the distribution of capital adequacy levels) and the maximum, if they are not presented by institution;

    • shows the outcome for each year of the risk horizon;

    • shows the amount of capital required in instances where there is a failure to meet the pre-defined hurdle, in absolute terms, as a percentage of GDP, and as a percentage of total sector assets under consideration;

    • details the contributions of different drivers (e.g., profitability, credit/trading losses, RWA) of the results; and

    • clarifies assumptions and key limitations to the stress tests.
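The dispersion and capital shortfall items in the list above can be sketched as follows. This is an illustrative aggregation only, not the FSAP output template. The function `summarize_results`, the bank records, the hurdle rate and the GDP figure are all hypothetical, and the shortfall is approximated as the gap to the hurdle multiplied by post-stress RWA.

```python
import statistics


def summarize_results(banks, hurdle, gdp):
    """Aggregate post-stress capital ratios without naming individual banks."""
    cars = sorted(b["car"] for b in banks)
    # Quartiles (25th, 50th, 75th percentiles) of the capital ratio distribution.
    q25, q50, q75 = statistics.quantiles(cars, n=4, method="inclusive")
    # Capital needed to bring each failing bank back up to the hurdle.
    shortfall = sum(max(0.0, (hurdle - b["car"]) * b["rwa"]) for b in banks)
    total_assets = sum(b["assets"] for b in banks)
    return {
        "min": cars[0],
        "q25": q25,
        "median": q50,
        "q75": q75,
        "max": cars[-1],
        "shortfall": shortfall,
        "shortfall_pct_gdp": 100.0 * shortfall / gdp,
        "shortfall_pct_assets": 100.0 * shortfall / total_assets,
    }


# Hypothetical post-stress results for four banks (amounts in one currency unit).
results = [
    {"car": 0.060, "rwa": 200.0, "assets": 400.0},
    {"car": 0.090, "rwa": 150.0, "assets": 300.0},
    {"car": 0.120, "rwa": 100.0, "assets": 250.0},
    {"car": 0.075, "rwa": 400.0, "assets": 800.0},
]

summary = summarize_results(results, hurdle=0.08, gdp=2000.0)
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

Reporting the minimum, quartiles and maximum in this way conveys the spread of outcomes even when confidentiality prevents institution-level disclosure.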

The findings of the stress tests are then used for two main purposes, which are to:

  • provide quantitative support for the FSAP’s stability risk assessment by estimating the impact from the realization of key tail risks; and

  • facilitate policy discussions with the authorities on risk mitigation strategies and crisis preparedness.

Figure 9.

Example of Bottom-up Bank Solvency Stress Test Output Template Provided to Banks: U.K. FSAP Update 1/

Source: Authors. 1/ See Excel attachment for actual template.

Figure 10.

Example of Bottom-up Bank Solvency Stress Test Output Template Provided to Authorities: U.K. FSAP Update 1/

Source: Authors. 1/ See Excel attachment for actual template.

Figure 11.

Example of Bottom-up Bank Solvency Stress Test Summary Template Provided to Authorities: U.K. FSAP Update 1/

Source: Authors. 1/ See Excel attachment for actual template.


The manner in which FSAP stress test results are conveyed to the public is a critical element of the exercise. In addition to providing a meaningful judgment on the outcome of the test (for instance, the fact that no bank fails a test does not mean that vulnerabilities do not exist), a substantial part of the effort in FSAPs is dedicated to the communication of results. Not surprisingly, disclosure of stress test results is a very sensitive issue, especially for supervisors and the financial institutions they oversee. Thus, the presentation of stress test findings should be appropriately nuanced to ensure that the information does not promote a false sense of security or cause undue alarm:

  • The objectives, definitions, assumptions, models and limitations of stress tests are usually written up in detail, either in Technical Notes and/or as supplementary information in the FSSA report. Publication of these documents is voluntary for country authorities.

  • More recently, mandatory summaries of the stress testing exercises are also presented in the FSSA in a standard framework format, i.e., the STeM, to improve transparency and facilitate comparisons across countries (Table 5).

  • The aggregated results of FSAP stress tests of a particular financial system are always disclosed in the reports. As a minimum, information such as the relevant post-stress ratio(s) and the respective amount(s) of capital shortfall is presented. Rarely do authorities agree to make available the results of individual banks.

Table 5.

Example of Stress Test Matrix (STeM) for Bank Solvency Risk: Spain FSAP Update

Source: IMF (2012b).

All the countries in our sample have published their FSSAs. In almost all cases, Technical Notes on the respective stress testing exercises have been produced (with the exception of Australia and Spain, where the details are described in appendices to the respective FSSAs), but only a few countries (i.e., Germany, Sweden, the United Kingdom and the United States) have consented to their publication (see IMF, 2010c; 2011c; 2011d, respectively).

IV. Concluding Remarks

Surveillance stress tests are not fail-safe, stand-alone diagnostic tools. This fact is abundantly clear from the performance of FSAP stress tests in the lead-up to the global financial crisis. Conceptually, the implementation of stress tests is very challenging: the institutions undergoing the stress tests have a diversity of business models and activities; models are subject to varying degrees of estimation uncertainty and assumptions or may not be sufficiently robust to capture all the relevant risks; constraints to data availability and quality may be insurmountable; and stress scenarios are subject to negotiation and political sensitivities. The complexity of running stress tests is magnified during a crisis, amid a rapidly changing financial landscape and heightened market expectations.

At the IMF, significant efforts have been made to address the identified shortcomings. Steps taken include: (i) standardizing the shock scenarios across countries, where possible, and making nascent attempts to quantify the likelihood of the realization of specific scenarios; (ii) applying more encompassing stress tests (i.e., complementary accounting- and market price-based models) and undertaking a wider coverage of risks; and (iii) ensuring a more organized and cohesive presentation of assumptions and results. The issue of consistency and comparability of implementation across member countries has also become very important as stress tests increasingly come into the limelight.

Building on the progress so far, IMF stress tests will need to be continually enhanced to adequately capture risks in a post-crisis world. Important areas for improvement include integration between solvency and liquidity risks; spillover analysis, both within a financial system and across borders; and the incorporation of feedback loops between the real economy and financial sector, among others. IMF staff recently published a set of “best practice” principles on macrofinancial stress testing, drawing on actual experience with more than a decade of FSAPs. This paper, in turn, discusses the stress testing framework and related key elements in the design of IMF stress tests that have been conducted recently with member countries (notably in the major country FSAPs) and demonstrates the application of those “best practice” principles in the actual implementation of the framework.

Nevertheless, the standardization of stress tests across countries is likely to remain elusive. While common shocks facilitate comparison of results to some extent, qualitative analysis and expert judgment are and will continue to be indispensable. Given the many “moving parts” of stress tests (the constantly evolving risks and methodologies required to adequately capture them and the specific nature of local regulatory requirements) and the political sensitivities depending on the macrofinancial environment at the time of a particular FSAP, these exercises will continue to be an art form rather than an exact science. Perfect standardization may never be possible, nor is it desirable given the purpose of the tool, and thus the output resulting from such exercises should always be interpreted and presented with due care. That said, these challenges should not undermine the value of well-designed stress tests.

Appendix I. FSAP Solvency Stress Tests since FY2010: STeM for S-25 and Other G-20 Countries 1/

Source: Compiled by authors with contributions from respective FSAP stress testers. 1/ The IMF fiscal year runs from May 1 to April 30. A larger version of this table is included as a separate pdf file.