Selected Issues


A Machine Learning Approach to Forecasting GDP¹

1. GDP is a critical indicator of the health of the economy, but it is often published with a lag. In addition, GDP data is subject to revisions, which makes it difficult to assess the current state of the economy. Suriname’s GDP is released after 3 quarters but is subject to revisions for 3 years, some of which can be large. Many policymakers turn to high frequency data to make an assessment, but in many countries such data does not exist.

2. The Central Bank (CBvS) estimates economic activity for policy making using a monthly economic activity index (MEAI). The index is built from high frequency data, some of which is not publicly available. The publicly available high frequency data is sparse and is often still subject to lags. The current lag of the MEAI is around 5 months.

3. We propose a method of estimating GDP with publicly available high frequency data using machine learning (ML) approaches. ML is a powerful tool, but its use in macroeconomics has been somewhat limited because it requires very large datasets. We develop a method to expand the available dataset for Suriname: we identify cross-country structural characteristics using ML, which help expand the dataset available for each individual country. We assume that countries that are structurally similar to the country of interest will be subject to the same external shocks and that these shocks will propagate through the economy in a similar way. This is done in 2 stages:

Stage 1: Identify the countries that have structural similarities.

4. Using big data on the structure of the economy and the categories of exports from the CIA Factbook, we group countries by structural similarities using two ML methods. In the first approach, we use principal component analysis (PCA) for dimensionality reduction to encode countries into their latent factors and then use the encoded latent factors to group similar countries using a Gaussian mixture model (GMM). In the second approach, we use SimRank to find countries similar to Suriname based on their major shared industries.
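The first approach in Stage 1 can be illustrated with a minimal sketch. It assumes scikit-learn is available and uses a hypothetical feature matrix of randomly generated structural indicators; the number of latent factors, the number of groups, the diagonal covariance choice, and the helper names `group_countries` and `peers_of` are illustrative assumptions, not the specification used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def group_countries(features, n_factors=2, n_groups=2, seed=0):
    """Encode countries into latent factors with PCA, then group them with a GMM.

    features: array of shape (n_countries, n_indicators).
    Returns (latent factors, group label for each country).
    """
    scaled = StandardScaler().fit_transform(features)           # put indicators on one scale
    factors = PCA(n_components=n_factors).fit_transform(scaled)  # dimensionality reduction
    gmm = GaussianMixture(n_components=n_groups,
                          covariance_type="diag",
                          random_state=seed)
    labels = gmm.fit_predict(factors)                             # most likely group per country
    return factors, labels


def peers_of(target, names, labels):
    """Countries that fall in the same GMM group as the target country."""
    group = labels[names.index(target)]
    return [n for n, g in zip(names, labels) if g == group and n != target]


# Hypothetical data: two clearly different economic structures, 10 countries each.
rng = np.random.default_rng(0)
commodity_like = rng.normal([60.0, 10.0, 20.0, 10.0], 1.0, size=(10, 4))
services_like = rng.normal([10.0, 60.0, 10.0, 20.0], 1.0, size=(10, 4))
features = np.vstack([commodity_like, services_like])
names = [f"country_{k}" for k in range(20)]

factors, labels = group_countries(features)
print(peers_of("country_0", names, list(labels)))
```

In practice the number of factors and groups would need to be chosen from the data, for example with an information criterion.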

Suriname’s Structurally Similar Countries

Source: Fund staff calculations.
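The second approach in Stage 1 can be sketched in the same spirit. The code below is a plain iterative implementation of SimRank (Jeh and Widom, 2002) on a country-industry graph, where two countries are similar if they are linked to similar industries. The country and industry names, the decay factor of 0.8, and the number of iterations are hypothetical choices for illustration, not the inputs used in the paper.

```python
from itertools import product


def build_bipartite(industries_by_country):
    """Undirected country-industry graph as a dict: node -> set of linked nodes."""
    neighbors = {}
    for country, industries in industries_by_country.items():
        neighbors.setdefault(country, set()).update(industries)
        for industry in industries:
            neighbors.setdefault(industry, set()).add(country)
    return neighbors


def simrank(neighbors, decay=0.8, n_iter=10):
    """Iterative SimRank: s(a, a) = 1 and, for a != b,
    s(a, b) = decay * average of s(u, v) over neighbors u of a and v of b."""
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(n_iter):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
            elif not neighbors[a] or not neighbors[b]:
                new[(a, b)] = 0.0
            else:
                total = sum(sim[(u, v)] for u in neighbors[a] for v in neighbors[b])
                new[(a, b)] = decay * total / (len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


def most_similar(sim, target, candidates):
    """Candidates ranked by SimRank score against the target, highest first."""
    others = [c for c in candidates if c != target]
    return sorted(others, key=lambda c: sim[(target, c)], reverse=True)


# Hypothetical major industries per country (placeholder names only).
industries_by_country = {
    "country_A": {"ind_mining", "ind_oil"},
    "country_B": {"ind_mining", "ind_oil", "ind_timber"},
    "country_C": {"ind_tourism"},
}
sim = simrank(build_bipartite(industries_by_country))
print(most_similar(sim, "country_A", list(industries_by_country)))
```

Countries with no industry in common with the target, directly or through a chain of shared industries, keep a score of zero and drop to the bottom of the ranking.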

Stage 2: Employ elastic net regression method to forecast the variable of interest.

5. Elastic net regression is similar to an ordinary least squares (OLS) regression with two penalty terms. The first is the ridge penalty, which shrinks the estimates towards zero. The second is the LASSO penalty, which sets very small coefficients exactly to zero, resulting in a parsimonious model. The elastic net approach accepts some bias in exchange for lower variance in order to maximize out-of-sample forecasting accuracy. We augment the naive elastic net regression model to accommodate the addition of the GDP growth rates of the countries identified in the previous exercise:


Here i runs from 1 to N and indexes the countries, and j runs from 1 to N_i and indexes the observations in the ith country's sample, such that,


The parameters β and b are optimized by minimizing the loss function, where y are the GDP growth rates at time t and X are the predictors, which include the SWIFT data at time t and the GDP growth rates at t-1. The nowcast equation is then:


where the SWIFT messages used for inflows and outflows are MT103 (financial institutions transfers) and MT700 (trade-related messages).
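As a point of reference for the loss function and nowcast equation described above, a pooled elastic net objective in the standard form of Zou and Hastie (2005) can be written as follows. This is a sketch under assumptions rather than the authors' exact specification: the penalty weights λ1 (LASSO) and λ2 (ridge), the treatment of b as a single unpenalized intercept, and the stacking of predictors into the vector x are all assumed for illustration.

```latex
% Pooled elastic net loss over countries i = 1..N and observations j = 1..N_i
(\hat{\beta}, \hat{b}) \;=\; \arg\min_{\beta,\, b}\;
\sum_{i=1}^{N} \sum_{j=1}^{N_i}
\left( y_{ij} - b - x_{ij}^{\top} \beta \right)^{2}
\;+\; \lambda_{2} \lVert \beta \rVert_{2}^{2}
\;+\; \lambda_{1} \lVert \beta \rVert_{1}

% One-step nowcast for period t, with x_t holding the SWIFT data at t
% and the GDP growth rates at t-1
\hat{y}_{t} \;=\; \hat{b} + x_{t}^{\top} \hat{\beta}
```

Setting λ1 = 0 gives a ridge regression, setting λ2 = 0 gives the LASSO, and setting both to zero recovers OLS.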

6. We add high frequency SWIFT data, which captures financial transactions and international trade, to address the lag, as these data are released 9 days after the close of the period. The forecasts are 1-step ahead, and the training set used to optimize the model runs from time 0 to t-1. We then run the following optimizations (Figure 1):

  • i. using Suriname’s SWIFT data, we first estimate GDP growth with an AR(1) model as a benchmark (RMSE 2.6%), then apply the naive elastic net regression to the same data (RMSE 2.9%), and find that the AR(1) still performs better;

  • ii. adding the one-period-lagged GDP of the countries identified by GMM, we find a slight improvement in the forecast (RMSE 2.5%);

  • iii. using the SWIFT data of the countries identified by SimRank and the augmented elastic net regression approach, we see an improvement in forecasting power (RMSE 1.6%);

  • iv. adding the one-period-lagged GDP of the SimRank countries improves the forecasting power of the model significantly (RMSE 1.0%).
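The evaluation scheme behind these comparisons (fit on observations 0 to t-1, forecast t, and summarize the errors as an RMSE) can be sketched as follows. The coordinate descent solver is a textbook implementation of the naive elastic net, the series are synthetic, and the penalty values and minimum training length are arbitrary assumptions. None of the numbers it prints correspond to the results reported above.

```python
import math


def soft_threshold(z, g):
    """Soft-thresholding operator, the source of exact zeros under the LASSO penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit_elastic_net(X, y, l1=0.1, l2=0.1, n_iter=200):
    """Coordinate descent for
    (1/2n) * sum (y - b - x.beta)^2 + l1 * |beta|_1 + (l2/2) * |beta|_2^2.
    With l2 = 0, every predictor column must have some variation."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b = sum(y) / n
    for _ in range(n_iter):
        for k in range(p):
            rho, zk = 0.0, 0.0
            for i in range(n):
                # prediction that leaves out predictor k
                partial = b + sum(X[i][m] * beta[m] for m in range(p) if m != k)
                rho += X[i][k] * (y[i] - partial)
                zk += X[i][k] ** 2
            beta[k] = soft_threshold(rho / n, l1) / (zk / n + l2)
        # unpenalized intercept: mean of the remaining residuals
        b = sum(y[i] - sum(X[i][m] * beta[m] for m in range(p)) for i in range(n)) / n
    return b, beta


def predict(model, x):
    b, beta = model
    return b + sum(xi * bi for xi, bi in zip(x, beta))


def expanding_window_rmse(X, y, fit, min_train=8):
    """One-step-ahead forecasts: train on 0..t-1, forecast t, then report the RMSE."""
    errors = []
    for t in range(min_train, len(y)):
        model = fit(X[:t], y[:t])
        errors.append(y[t] - predict(model, X[t]))
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic series (hypothetical): growth depends on its own lag and a SWIFT-like indicator.
swift = [math.sin(0.7 * t) for t in range(40)]
growth = [0.0]
for t in range(1, 40):
    growth.append(0.5 * growth[-1] + 2.0 * swift[t])

y = growth[1:]
X_ar1 = [[growth[t - 1]] for t in range(1, 40)]             # AR(1) benchmark: lagged growth only
X_aug = [[growth[t - 1], swift[t]] for t in range(1, 40)]   # lagged growth plus indicator at t

print("AR(1) RMSE:", expanding_window_rmse(X_ar1, y, lambda A, c: fit_elastic_net(A, c, 0.0, 0.0)))
print("Elastic net RMSE:", expanding_window_rmse(X_aug, y, lambda A, c: fit_elastic_net(A, c, 0.01, 0.01)))
```

Passing zero penalties reduces the solver to OLS, which is how the AR(1) benchmark is obtained here; peer-country predictors would enter as additional columns of X.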

Figure 1.

Suriname: Real GDP Growth Forecasts Using Elastic Net Regression

Citation: IMF Staff Country Reports 2018, 377; 10.5089/9781484391853.002.A003

Sources: Suriname General Statistics Office; and IMF staff calculations.

7. The additional forecasting accuracy of the ML approaches suggests that this is a useful tool for policymakers. Further expansion of the dataset with other big data sources such as exchange rates, financial market data, COMTRADE, APIs, or media/word count/IoT data could help increase forecasting accuracy even further. The team is developing a tool that can be easily employed by researchers and policymakers.


  • Bishop, C. (2006). Pattern recognition and machine learning. New York, NY: Springer.

  • Jeh, G. and Widom, J. (2002). SimRank: a measure of structural-context similarity. In KDD '02 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 538–543). New York, NY: ACM.

  • Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320.

¹ Prepared by Thomas Dowling (WHD), Yang Liu, and Mamoon Saeed (both ITD).

Suriname: Selected Issues
Author: International Monetary Fund. Western Hemisphere Dept.