Front Matter
Author: Mr. Jorge A Chan-Lau, Ruofei Hu, Maksym Ivanyna (International Monetary Fund, https://isni.org/isni/0000000404811396), Ritong Qu, and Cheng Zhong

Copyright Page

© 2022 International Monetary Fund

WP/23/41

IMF Working Paper

Strategy, Policy and Review Department

Surrogate Data Models: Interpreting Large-scale Machine Learning Crisis Prediction Models

Prepared by Jorge A. Chan-Lau, Ruofei Hu, Maksym Ivanyna, Ritong Qu, and Cheng Zhong

Authorized for distribution by Natalia Tamirisa

February 2023

IMF Working Papers describe research in progress by the author(s) and are published to elicit comments and to encourage debate. The views expressed in IMF Working Papers are those of the author(s) and do not necessarily represent the views of the IMF, its Executive Board, or IMF management.

ABSTRACT: Machine learning models are becoming increasingly important in the prediction of economic crises. These models, however, use datasets comprising a large number of predictors (features), which impairs their interpretability and their ability to provide adequate guidance in the design of crisis prevention and mitigation policies. This paper introduces surrogate data models as dimensionality reduction tools in large-scale crisis prediction models. The appropriateness of this approach is assessed by applying it to large-scale crisis prediction models developed at the IMF. The results are consistent with economic intuition and validate the use of surrogates as interpretability tools.


Title Page

WORKING PAPERS

Surrogate Data Models: Interpreting Large-scale Machine Learning Crisis Prediction Models

Prepared by Jorge A. Chan-Lau, Ruofei Hu, Maksym Ivanyna, Ritong Qu, and Cheng Zhong¹

Contents

  • Introduction

  • A short survey of ML crisis prediction models

  • Enhancing ML crisis prediction model interpretability using surrogate data models

    • Surrogate models and feature importance

    • Surrogate Data Models

  • An application to the IMF ML crisis models

    • Feature selection

    • Surrogate data models: estimation

    • Surrogate data models: results

  • Conclusions

  • References

  • Annex I. IMF ML models, crisis event definitions

  • Annex II. IMF ML models, model features

  • FIGURES

    • 1. Surrogate model and surrogate data model approaches

    • 2. Linear combination of country-groups and global models

    • 3. Gap-block cross-validation

    • 4. VE indices, SDM indices, projections, and adverse scenario

    • 5. SDM indices: distribution of Shapley values

    • 6. SDM indices: SHAP decomposition

  • TABLES

    • 1. Variables used in surrogate models

    • 2. Optimal weights of income-based country-group models

    • 3. Out of sample performance

    • 4. Five-year scenarios: IMF (2022) baseline and adverse scenario

¹ The authors would like to thank Peter Dohlman, Michael Evans, Xuehui Han, Sandile Hlatshwayo, Luca Mungo, Natalia Tamirisa, and Weining Xin for their useful comments. Any errors or omissions are the authors’ sole responsibility. Please address correspondence to all authors.
