Back Matter
  • 1 https://isni.org/isni/0000000404811396, International Monetary Fund

References

  • Acevedo, Sebastian, 2016, “Nowcasting GDP with Google Trends,” Proof of Concept for the IMF’s Big Data Challenge, Unpublished.

  • Adachi, Yuta, Motoki Masuda, and Fumiko Takeda, 2017, “Google search intensity and its relationship to the returns and liquidity of Japanese startup stocks,” Pacific-Basin Finance Journal, 46, 243–257.

  • Afkhami, Mohamad, Lindsey Cormack, and Hamed Ghoddusi, 2017, “Google search keywords that best predict energy price volatility,” Energy Economics, 67, 17–27.

  • Aouadi, Amal, Mohamed Arouri, and Frédéric Teulon, 2013, “Investor attention and stock market activity: Evidence from France,” Economic Modelling, 35, 674–681.

  • Araujo, Juliana D., Antonio C. David, Carlos van Hombeeck, and Chris Papageorgiou, 2017, “Joining the club? Procyclicality of private capital inflows in lower income developing economies,” Journal of International Money and Finance, 70, 157–182. http://dx.doi.org/10.1016/j.jimonfin.2016.08.006

  • Artola, Concha, Fernando Pinto, and Pablo de Pedraza García, 2015, “Can internet searches forecast tourism inflows?” International Journal of Manpower, 36:1, 103–116. https://doi.org/10.1108/IJM-12-2014-0259

  • Askitas, Nikolaos, and Klaus F. Zimmermann, 2009, “Google Econometrics and Unemployment Forecasting,” DIW Berlin Discussion Paper No. 899.

  • Bangwayo-Skeete, Prosper F., and Ryan W. Skeete, 2015, “Can Google data improve the forecasting performance of tourist arrivals? Mixed-data sampling approach,” Tourism Management, 46(C): 454–464.

  • Barbieri, Maria M., and James O. Berger, 2004, “Optimal Predictive Model Selection,” Annals of Statistics, 32:3, 879–897. DOI: 10.1214/009053604000000238

  • Barreira, Nuno, Pedro Godinho, and Paulo Melo, 2013, “Nowcasting unemployment rate and new car sales in south-western Europe with Google Trends,” Netnomics, 14, 129–165. DOI: 10.1007/s11066-013-9082-8

  • Barro, Robert J., 2015, “Convergence and modernization,” The Economic Journal, 125 (585): 911–942. https://doi.org/10.1111/ecoj.12247

  • Buono, Dario, Gian Luigi Mazzi, George Kapetanios, Massimiliano Marcellino, and Fotis Papailias, 2017, “Big data types for macroeconomic nowcasting,” Eurostat Review on National Accounts and Macroeconomic Indicators, 1/2017, 93–145.

  • Calhoun, Gray, 2014, “Out-of-sample comparisons of overfit models,” Economics Working Papers, No. 11002. Iowa State University.

  • Campbell, Donald T., 1979, “Assessing the impact of planned social change,” Evaluation and Program Planning, 2 (1): 67–90. DOI: 10.1016/0149-7189(79)90048-X.

  • Campos, I., G. Cortazar, and T. Reyes, 2017, “Modeling and predicting oil VIX: Internet search volume versus traditional variables,” Energy Economics, 66, 194–204.

  • Carrière-Swallow, Yan, and Felipe Labbé, 2013, “Nowcasting with Google Trends in an Emerging Market,” Journal of Forecasting, 32 (4): 289–298.

  • Chadwick, Meltem Gülenay, and Gönül Sengül, 2015, “Nowcasting the unemployment rate in Turkey: Let’s ask Google,” Central Bank Review, 15, 15–40.

  • Chamberlin, Graeme, 2010, “Googling the present,” Economic and Labour Market Review, 4:12, 59–95.

  • Chinn, Menzie D., and Hiro Ito, 2006, “What matters for financial development? Capital controls, institutions, and interactions,” Journal of Development Economics, 81:1, 163–192. https://web.pdx.edu/~ito/Chinn-Ito_website.htm

  • Choi, Hyunyoung, and Hal R. Varian, 2012, “Predicting the present with Google Trends,” Economic Record, 88, 2–9.

  • Choi, Sangyup, and Yuko Hashimoto, 2018, “Does transparency pay? Evidence from IMF data transparency policy reforms and emerging market sovereign bond spreads,” Journal of International Money and Finance, 88, 171–190.

  • Da, Zhi, Joseph Engelberg, and Pengjie Gao, 2011, “In Search of Attention,” Journal of Finance, 66 (5): 1461–1499.

  • Da, Zhi, Joseph Engelberg, and Pengjie Gao, 2015, “The sum of all FEARS investor sentiment and asset prices,” Review of Financial Studies, 28:1, 1–32. https://doi.org/10.1093/rfs/hhu072

  • D’Amuri, Francesco, and Juri Marcucci, 2017, “The predictive power of Google searches in forecasting US unemployment,” International Journal of Forecasting, 33, 801–816.

  • De Luca, Giuseppe, and Jan R. Magnus, 2011, “Bayesian model averaging and weighted-average least squares: equivariance, stability, and numerical issues,” Stata Journal, 11:4, 518–544.

  • Diebold, Francis X., 2015, “Comparing predictive accuracy, twenty years later: a personal perspective on the use and abuse of Diebold–Mariano tests,” Journal of Business and Economic Statistics, 33:1, 1–9. DOI: 10.1080/07350015.2014.983236

  • Diebold, Francis X., and Roberto S. Mariano, 1995, “Comparing Predictive Accuracy,” Journal of Business and Economic Statistics, 13:3, 253–263. DOI: 10.1080/07350015.1995.10524599

  • Dimpfl, Thomas, and Stephan Jank, 2016, “Can Internet search queries help to predict stock market volatility?” European Financial Management, 22:2, 171–192. https://doi.org/10.1111/eufm.12058

  • Donaldson, Dave, and Adam Storeygard, 2016, “The view from above: applications of satellite data in economics,” Journal of Economic Perspectives, 30:4, 171–198.

  • Engstrom, Ryan, Jonathan Hersh, and David Newhouse, 2017, “Poverty from space: using high-resolution satellite imagery for estimating economic well-being,” Policy Research Working Paper, Washington: World Bank.

  • Ferreira, Pedro, 2014, “Improving Prediction of Unemployment Statistics with Google Trends: Part 2,” Eurostat Working Paper.

  • Fondeur, Y., and F. Karamé, 2013, “Can Google data help predict French youth unemployment?” Economic Modelling, 30, 117–125. https://doi.org/10.1016/j.econmod.2012.07.017

  • GADM, 2018, Database of Global Administrative Areas, https://gadm.org/, accessed in September 2018.

  • Ginsberg, Jeremy, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, and Larry Brilliant, 2009, “Detecting influenza epidemics using search engine query data,” Nature, 457:7232, 1012–1014.

  • Goddard, John, Arben Kita, and Qingwei Wang, 2015, “Investor attention and FX market volatility,” Journal of International Financial Markets, Institutions and Money, 38, 79–96.

  • Götz, Thomas B., and Thomas A. Knetsch, 2019, “Google data in bridge equation models for German GDP,” International Journal of Forecasting, 35, 45–66.

  • Hamid, A., and M. Heiden, 2015, “Forecasting Volatility with Empirical Similarity and Google Trends,” Journal of Economic Behavior and Organization, 117, 62–81.

  • Hammer, Cornelia L., Diane C. Kostroch, Gabriel Quirós, and STA Internal Group, 2017, “Big data: potential, challenges, and statistical implications,” IMF Staff Discussion Note, Washington: International Monetary Fund.

  • Harchaoui, Tarek M., and Robert V. Janssen, 2018, “How can big data enhance the timeliness of official statistics? The case of the US consumer price index,” International Journal of Forecasting, 34:2, 225–234.

  • Hashimoto, Yuko, and K. M. Wacker, 2016, “The role of information for international capital flows: new evidence from the SDDS,” Review of World Economics, 152:3, 529–557. DOI: 10.1007/s10290-016-0250-4

  • Henderson, J. Vernon, Adam Storeygard, and David N. Weil, 2012, “Measuring economic growth from outer space,” American Economic Review, 102:2, 994–1028. https://doi.org/10.1257/aer.102.2.994

  • Independent Evaluation Office, 2014, “IMF forecasts: process, quality, and country perspectives,” Evaluation Report, Washington: International Monetary Fund.

  • Independent Evaluation Office, 2016, “Behind the scenes with data at the IMF: an IEO evaluation,” Evaluation Report, Washington: International Monetary Fund.

  • Inoue, Atsushi, and Lutz Kilian, 2005, “In-sample or out-of-sample tests of predictability: which one should we use?” Econometric Reviews, 23:4, 371–402. DOI: 10.1081/ETC-200040785

  • International Monetary Fund, 2015a, “Macroeconomic developments in low-income developing countries,” Policy Paper, Washington: International Monetary Fund.

  • International Monetary Fund, 2015b, “Myanmar: 2015 Article IV Consultation Staff Report,” Country Report No. 15/267, Washington: International Monetary Fund.

  • International Monetary Fund, 2015c, “Samoa: 2015 Article IV Consultation Staff Report,” Country Report No. 15/191, Washington: International Monetary Fund.

  • International Monetary Fund, 2018a, Financial Flows Analytics, confidential and internally accessed in November 2018, Washington: International Monetary Fund.

  • International Monetary Fund, 2018b, International Financial Statistics, http://data.imf.org/?sk=4C514D48-B6BA-49ED-8AB9-52B0C1A0179B, accessed in November 2018, Washington: International Monetary Fund.

  • International Monetary Fund, 2018c, “Macroeconomic developments in low-income developing countries—2018,” Policy Paper, March, Washington: International Monetary Fund.

  • International Monetary Fund, 2018d, “Overarching strategy on data and statistics at the fund in the digital age,” Policy Paper, March, Washington: International Monetary Fund.

  • International Monetary Fund, 2018e, Statistical Appendix of the October 2018 World Economic Outlook, Washington: International Monetary Fund.

  • Jean, Neal, Marshall Burke, Michael Xie, W. Matthew Davis, David B. Lobell, and Stefano Ermon, 2016, “Combining satellite imagery and machine learning to predict poverty,” Science, 353:6301, 790–794. DOI: 10.1126/science.aaf7894

  • Joseph, Kissan, M. Babajide Wintoki, and Zelin Zhang, 2011, “Forecasting abnormal stock returns and trading volume using investor sentiment: Evidence from online search,” International Journal of Forecasting, 27, 1116–1127.

  • Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer, 2015, “Prediction Policy Problems,” American Economic Review, 105:5, 491–495. http://dx.doi.org/10.1257/aer.p20151023

  • Koop, Gary, and Luca Onorante, 2013, “Macroeconomic Nowcasting Using Google Probabilities,” retrieved at: https://www.ecb.europa.eu/events/pdf/conferences/140407/OnoranteKoop_MacroeconomicNowcastingUsingGoogleProbabilities.pdf.

  • Laframboise, Nicole, Nkunde Mwase, Joonkyu Park, and Yingke Zhou, 2014, “Revisiting tourism flows to the Caribbean: what is driving arrivals?” IMF Working Paper 14/229, Washington: International Monetary Fund.

  • Lampos, Vasileios, Andrew C. Miller, Steve Crossan, and Christian Stefansen, 2015, “Advances in nowcasting influenza-like illness rates using search query logs,” Scientific Reports, 5:12760. DOI: 10.1038/srep12760

  • Leamer, Edward E., 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, New York: Wiley.

  • Lee, Suzanne S., and Per A. Mykland, 2008, “Jumps in financial markets: a new nonparametric test and jump dynamics,” Review of Financial Studies, 21:6, 2535–2563. https://doi.org/10.1093/rfs/hhm056

  • Li, Xin, Wei Shang, Shouyang Wang, and Jian Ma, 2015, “A MIDAS modelling framework for Chinese inflation index forecast incorporating Google search data,” Electronic Commerce Research and Applications, 14:2, 112–125. https://doi.org/10.1016/j.elerap.2015.01.001

  • Li, Xin, Jian Ma, Shouyang Wang, and Xun Zhang, 2015, “How does Google search affect trader positions and crude oil prices?” Economic Modelling, 49, 162–171.

  • Li, Xin, Bing Pan, Rob Law, and Xiankai Huang, 2017, “Forecasting tourism demand with composite search index,” Tourism Management, 59, 57–66.

  • Moussa, Faten, Ezzeddine Delhoumi, and Olfa Ben Ouda, 2017, “Stock return and volatility reactions to information demand and supply,” Research in International Business and Finance, 39, 54–67.

  • Mullainathan, Sendhil, and Jann Spiess, 2017, “Machine learning: an applied econometric approach,” Journal of Economic Perspectives, 31:2, 87–106.

  • Njuguna, Christopher, 2018, “Rnightlights: Satellite Nightlight Data Extraction,” Comprehensive R Archive Network, https://cran.r-project.org/web/packages/Rnightlights/index.html (for a developer version, https://github.com/chrisvwn/Rnightlights, accessed in September 2018).

  • Peltomäki, Jarkko, Michael Graham, and Anton Hasselgren, 2018, “Investor attention to market categories and market volatility: The case of emerging markets,” Research in International Business and Finance, 44, 532–546. http://dx.doi.org/10.1016/j.ribaf.2017.07.124

  • Preis, Tobias, Helen Susannah Moat, and H. Eugene Stanley, 2013, “Quantifying Trading Behavior in Financial Markets Using Google Trends,” Scientific Reports, 3:1684.

  • Reis, Fernando, Pedro Ferreira, and Vittorio Perduca, 2014, “The use of web activity evidence to increase the timeliness of official statistics indicators,” Eurostat Working Paper.

  • Rivera, Roberto, 2016, “A dynamic linear model to forecast hotel registrations in Puerto Rico using Google Trends data,” Tourism Management, 57, 12–20.

  • Ross, Andrew, 2013, “Nowcasting with Google Trends: a keyword selection method,” Fraser of Allander Economic Commentary, 37:2, 54–64.

  • Rossi, Barbara, and Atsushi Inoue, 2012, “Out-of-sample forecast tests robust to the choice of window size,” Journal of Business and Economic Statistics, 30:3, 432–453.

  • Scott, Steven L., and Hal R. Varian, 2015, “Bayesian variable selection for nowcasting economic time series,” Chapter 4, 119–135, in Economic Analysis of the Digital Economy, edited by Avi Goldfarb, Shane M. Greenstein, and Catherine E. Tucker, Illinois: University of Chicago Press.

  • Siliverstovs, Boriss, and Daniel S. Wochner, 2018, “Google Trends and reality: Do the proportions match? Appraising the informational value of online search behavior: Evidence from Swiss tourism regions,” Journal of Economic Behavior and Organization, 145 (C): 1–23.

  • Smith, Geoffrey Peter, 2012, “Google Internet search activity and volatility prediction in the market for foreign currency,” Finance Research Letters, 9, 103–110.

  • Smith, Paul, 2016, “Google’s MIDAS Touch: Predicting UK Unemployment with Internet Search Data,” Journal of Forecasting, 35:3, 263–284.

  • Somaini, Paulo, and Frank Wolak, 2016, “An Algorithm to Estimate the Two-Way Fixed Effects Model,” Journal of Econometric Methods, 5:1, 143–152. https://doi.org/10.1515/jem-2014-0008

  • StatCounter, 2018, Global Stats. http://gs.statcounter.com/search-engine-market-share, accessed in July 2018.

  • Stephens-Davidowitz, Seth, 2017, Everybody lies: big data, new data, and what the Internet can tell us about who we really are, New York: HarperCollins Publishers.

  • Stephens-Davidowitz, Seth, and Hal R. Varian, 2015, “A Hands-on Guide to Google Data,” retrieved at http://people.ischool.berkeley.edu/~hal/Papers/2015/primer.pdf, Google.

  • Takeda, Fumiko, and Takumi Wakao, 2014, “Google search intensity and its relationship with returns and trading volume of Japanese stocks,” Pacific-Basin Finance Journal, 27, 1–18.

  • Tang, Wenbin, and Lili Zhu, 2017, “How security prices respond to a surge in investor attention: Evidence from Google Search of ADRs,” Global Finance Journal, 33, 38–50.

  • Tantaopas, Parkpoom, Chaiyuth Padungsaksawasdi, and Sirimon Treepongkaruna, 2016, “Attention effect via internet search intensity in Asia-Pacific stock markets,” Pacific-Basin Finance Journal, 38, 107–124.

  • Thomas, Evan, Luis Alberto Andrés, Christian Borja-Vega, and Germán Sturzenegger, eds., 2018, “Innovations in WASH impact measures: water and sanitation measurement technologies and practices to inform the Sustainable Development Goals,” Directions in Development, Washington: World Bank. DOI: 10.1596/978-1-4648-1197-5. License: Creative Commons Attribution CC BY 3.0 IGO.

  • United Nations Economic Commission for Europe, 2013, “Classification of types of big data.” https://statswiki.unece.org/display/bigdata/Classification+of+Types+of+Big+Data

  • Varian, Hal R., 2014, “Big data: new tricks for econometrics,” Journal of Economic Perspectives, 28:2, 3–28.

  • Vicente, María Rosalía, Ana J. López-Menéndez, and Rigoberto Pérez, 2015, “Forecasting unemployment with internet search data: Does it help to improve predictions when job destruction is skyrocketing?” Technological Forecasting and Social Change, 92, 132–139.

  • Vlastakis, Nikolaos, and Raphael N. Markellos, 2012, “Information demand and stock market volatility,” Journal of Banking and Finance, 36, 1808–1821.

  • Vosen, Simeon, and Torsten Schmidt, 2011, “Forecasting private consumption: survey-based indicators vs. Google Trends,” Journal of Forecasting, 30, 565–578.

  • Vozlyublennaia, Nadia, 2014, “Investor attention, index performance, and return predictability,” Journal of Banking and Finance, 41, 17–35. http://dx.doi.org/10.1016/j.jbankfin.2013.12.010

  • Welagedara, Venura, Saikat Sovan Deb, and Harminder Singh, 2017, “Investor attention, analyst recommendation revisions, and stock prices,” Pacific-Basin Finance Journal, 45, 211–223.

  • Wesolowski, Amy, Nathan Eagle, Andrew J. Tatem, David L. Smith, Abdisalan M. Noor, Robert W. Snow, and Caroline O. Buckee, 2012, “Quantifying the impact of human mobility on malaria,” Science, 338:6104, 267–270. DOI: 10.1126/science.1223467

  • World Bank, 2018, World Development Indicators, https://datacatalog.worldbank.org/dataset/world-development-indicators, accessed in November 2018, Washington: World Bank.

  • Wu, Lynn, and Erik Brynjolfsson, 2015, “The future of prediction: how Google searches foreshadow housing prices and sales,” Chapter 3, 89–118, in Economic Analysis of the Digital Economy, edited by Avi Goldfarb, Shane M. Greenstein, and Catherine E. Tucker, Illinois: University of Chicago Press.

  • Yang, Xin, Bing Pan, James A. Evans, and Benfu Lv, 2015, “Forecasting Chinese tourist volume with search engine data,” Tourism Management, 46, 386–397.

  • Yung, Kenneth, and Nadia Nafar, 2017, “Investor attention and the expected returns of REITs,” International Review of Economics and Finance, 48, 423–439.


Appendix I. Technical Details

A. Introduction to Google’s Search Volume Index (SVI)

The Google Trends service compiles an index, the SVI, which measures how often a keyword (or the keywords under a topic) has been submitted to the Google search engine. A search topic, rather than just a keyword, can be specified to resolve the ambiguity that homographs create. Appendix Figure 1 shows an example: the SVI of search queries on the country “Kenya” as a topic, from all over the world (specified as “Worldwide”), classified under the finance category (specified as “Finance”). Data points A and B are first calculated as the ratios of searches related to the topic “Kenya” to the total searches for all queries from the same location (“Worldwide”), under the same category (“Finance”), for each period (October 2011 and July 2015, respectively). In this case, point B is the maximum of such ratios over time. The SVI for July 2015 is therefore 100, and the SVI for October 2011 is 58, computed as the ratio of A to B multiplied by 100.

Appendix Figure 1.

Google Trends search for “Kenya” as a search topic

Citation: IMF Working Papers 2018, 286; 10.5089/9781484390177.001.A999

Source: Google Trends’ website (https://trends.google.com/trends/).

The SVI is constructed from sub-samples of the total search data, randomly drawn at regular intervals to strike a balance between usefulness and anonymity. Although all submitted queries are stored, the Google Trends service draws a random sample and uses only a fraction of the entire search data to construct an SVI. Observations that are too small are also suppressed. Re-sampling is done periodically (e.g., daily), which complicates the replication of previously downloaded data. Researchers are therefore advised to download the same data repeatedly, take the average, and focus on inferred population moments. It is also reported, however, that the sampling generally gives reasonably precise estimates and that more than a single sample may not be needed in practice (Stephens-Davidowitz and Varian, 2015).
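The averaging across repeated downloads can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function name `average_repeated_downloads` and the numbers are hypothetical, and each inner list stands in for one download of the same SVI series on a different day.

```python
from statistics import mean


def average_repeated_downloads(downloads):
    """Average several downloads of the same SVI series, period by period.

    `downloads` is a list of equally long lists; each inner list is one
    download of the same request (hypothetical data, not real Google output).
    """
    # zip(*downloads) groups the values that belong to the same period.
    return [mean(values) for values in zip(*downloads)]


# Three hypothetical downloads of a three-period series.
downloads = [
    [58.0, 71.0, 100.0],
    [60.0, 69.0, 100.0],
    [56.0, 70.0, 100.0],
]
averaged = average_repeated_downloads(downloads)
print(averaged)
```

Averaging period by period keeps the time dimension intact while smoothing out the sampling noise that differs from one download to the next.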

We retrieve monthly SVIs via Google Trends’ Application Programming Interface (API), which differs from the website in two major ways. First, the SVI from the API is compiled from a 10-percent sample of total Google searches, compared with a 1-percent sample for the website. Second, the API provides monthly data only, while the website (https://trends.google.com/trends/) provides daily data (for queries covering less than 90 days), weekly data (for less than 5 years), and monthly data. Access to the API is provided through a proprietary arrangement. We use code written in Python to retrieve data through the API.

B. Two-layer normalization of the SVI

Two layers of normalization are applied in constructing an SVI. The Google servers store the “search volume,” which is the total number of searches on query $q$ submitted to the Google search service from location $l$ in time $t$, denoted by $SV_{t,l}(q)$. However, $SV_{t,l}(q)$ is not available from the Google Trends service. Instead, we observe an SVI, which is defined in two normalization steps as follows. First, the search volume on a query is normalized by the total search volume on all queries. That is, the search volume ratio (SVR), denoted by $SVR_{t,l}(q)$, is the ratio of the search volume on query $q$ to the search volume of all the queries submitted in time $t$ at location $l$:

$$SVR_{t,l}(q) \overset{\text{def}}{=} \frac{SV_{t,l}(q)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}.$$

Second, the SVR is further normalized such that the highest value under a particular data request takes 100, which defines the SVI, denoted by $SVI_{t,l}(q)$, as follows:

$$SVI_{t,l}(q) \overset{\text{def}}{=} \frac{SVR_{t,l}(q)}{\max_{t \in T_0} SVR_{t,l}(q)} \times 100,$$

where $T_0$ is the set of the time periods under the data request. Letting $t^*$ denote the time that attains the maximum under the data request (and hence it will change under a different data request), we have:

$$SVI_{t,l}(q) = \left[\frac{SV_{t,l}(q)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \left[\frac{SV_{t^*,l}(q)}{\sum_{\tilde{q}} SV_{t^*,l}(\tilde{q})}\right]^{-1} \times 100 = \left[\frac{SV_{t,l}(q)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times \text{constant}.$$

Therefore, $SVI_{t,l}(q)$ is an index proportional to the frequency of searches on query $q$ relative to the frequency of searches on all the queries submitted at location $l$ at time $t$.
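The two normalization steps can be illustrated with a short sketch. The raw counts below are hypothetical, since actual search volumes are not published by Google Trends, and the function names are chosen here only for illustration.

```python
def search_volume_ratio(query_volume, total_volume):
    """First layer: share of all searches that are for the query, per period."""
    return [q / total for q, total in zip(query_volume, total_volume)]


def search_volume_index(query_volume, total_volume):
    """Second layer: rescale so the largest ratio in the request equals 100."""
    svr = search_volume_ratio(query_volume, total_volume)
    peak = max(svr)  # ratio at t*, the period attaining the maximum
    return [r / peak * 100.0 for r in svr]


# Hypothetical raw counts for three periods (not real Google data).
query_volume = [29, 40, 100]
total_volume = [1000, 2000, 2000]

svi = search_volume_index(query_volume, total_volume)
print([round(v) for v in svi])
```

In this toy request the query volume rises from 29 to 40 between the first two periods, yet the index falls because total searches double over the same interval.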

The two layers of normalization are intended to provide an accessible and meaningful metric. The first step controls for trivial changes in search volumes, such as the general upward trend in search volumes observed for virtually all queries and the tendency for more populated locations to generate higher search volumes (Stephens-Davidowitz and Varian, 2015). The second step scales the SVI to take a value between 0 and 100 for any selection of query, time, and location, which makes the SVI accessible to a wide range of users.

However, the two-layer normalization complicates the analysis of the SVI. For example, an increase in the SVI for query $q$ from time $t_1$ to time $t_2$ ($> t_1$), with the location held fixed, does not necessarily mean that query $q$ was searched more often at time $t_2$. Taking the ratio of the two SVIs yields

$$\frac{SVI_{t_2,l}(q)}{SVI_{t_1,l}(q)} = \left[\frac{SV_{t_2,l}(q)}{SV_{t_1,l}(q)}\right] \left[\frac{\sum_{\tilde{q}} SV_{t_2,l}(\tilde{q})}{\sum_{\tilde{q}} SV_{t_1,l}(\tilde{q})}\right]^{-1},$$

which fluctuates not only because of the change in the search volume for query $q$ from time $t_1$ to time $t_2$, but also because of the change in the total search volume for all the queries submitted from time $t_1$ to time $t_2$. In general, the total number of searches trends upward over time, and thus the SVI would increase only if the search volume for query $q$ grew faster than the total number of searches.
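A small numerical sketch makes the point concrete. The figures are invented for illustration, and `svi_ratio` is a hypothetical helper that computes the ratio of two SVIs from the same data request as the growth in the query's search volume divided by the growth in total search volume.

```python
def svi_ratio(query_t1, query_t2, total_t1, total_t2):
    """Ratio SVI(t2) / SVI(t1) within one data request.

    The request-specific scaling constant cancels, leaving the growth in
    the query's search volume divided by the growth in total search volume.
    """
    query_growth = query_t2 / query_t1
    total_growth = total_t2 / total_t1
    return query_growth / total_growth


# Hypothetical volumes: searches for the query rise by 20 percent,
# while total searches rise by 50 percent.
ratio = svi_ratio(1_000, 1_200, 1_000_000, 1_500_000)
print(ratio < 1.0)  # the SVI falls even though the query is searched more
```

Here the query is searched more often in the second period, but the SVI declines by roughly one fifth because total searches grew faster.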

In addition, the scaling adjustment made per data request prevents researchers from directly comparing different SVIs in levels. The units of SVIs differ across data requests to the Google Trends service. This would not be a problem if researchers could download all the SVIs of interest at once in one data request. But this is not the case in practice, not only because researchers may have second thoughts on which SVIs are needed for their analyses, but also because there are limits on the size of data requests (i.e., “quota limits”), which prevent such a massive data request at once.

C. Making SVIs comparable

SVIs are not comparable as downloaded, because of the normalization, but they can be made comparable across queries (in our case, across countries). Although we cannot infer search volumes in levels, i.e., $SV_{t,l}(q)$ itself, because of the first layer of the normalization, we can control for the scaling per data request made in the second layer. After downloading two SVIs to be compared for a given period, we submit one data request for the averages of the two SVIs over the period of interest and use these values to convert one of the two SVIs into the unit of the other.

The specific procedure is as follows. Consider two SVIs, denoted by $SVI^{1}_{t,l}(q_1)$ and $SVI^{2}_{t,l}(q_2)$, to be compared for the same $T$ periods, where superscripts 1 and 2 indicate that they are downloaded in two separate data requests. The scaling per data request results in two constants, $C_1$ and $C_2$, associated with these SVIs as follows:

$$SVI^{1}_{t,l}(q_1) = \left[\frac{SV_{t,l}(q_1)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times C_1, \qquad SVI^{2}_{t,l}(q_2) = \left[\frac{SV_{t,l}(q_2)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times C_2.$$

Downloading the averages of these SVIs over time in one data request, indicated by superscript 3 and associated with a scaling constant $C_3$, provides the two values as follows:

$$\frac{1}{T}\sum_{t} SVI^{3}_{t,l}(q_1) = \frac{1}{T}\sum_{t} \left[\frac{SV_{t,l}(q_1)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times C_3,$$

$$\frac{1}{T}\sum_{t} SVI^{3}_{t,l}(q_2) = \frac{1}{T}\sum_{t} \left[\frac{SV_{t,l}(q_2)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times C_3.$$

Combining these, we adjust $SVI^{2}_{t,l}(q_2)$ as follows:

$$SVI^{1}_{t,l}(q_2) = SVI^{2}_{t,l}(q_2) \times \left[\frac{\frac{1}{T}\sum_{t} SVI^{3}_{t,l}(q_2)}{\frac{1}{T}\sum_{t} SVI^{2}_{t,l}(q_2)}\right] \times \left[\frac{\frac{1}{T}\sum_{t} SVI^{1}_{t,l}(q_1)}{\frac{1}{T}\sum_{t} SVI^{3}_{t,l}(q_1)}\right] = \left[\frac{SV_{t,l}(q_2)}{\sum_{\tilde{q}} SV_{t,l}(\tilde{q})}\right] \times C_1,$$

where $SVI^{1}_{t,l}(q_2)$ denotes the adjusted SVI for query $q_2$, which shares the scaling constant $C_1$ with $SVI^{1}_{t,l}(q_1)$. In this way, $SVI^{1}_{t,l}(q_1)$ and $SVI^{1}_{t,l}(q_2)$ become comparable with each other.
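The adjustment can be checked on simulated data. The sketch below is an illustration under stated assumptions: the underlying search volume ratios and the constant for the third request are invented, and `rescale_to_common_unit` is a hypothetical name rather than the authors' code.

```python
def mean(values):
    return sum(values) / len(values)


def rescale_to_common_unit(svi1_q1, svi2_q2, avg3_q1, avg3_q2):
    """Express the series for query 2 in the unit (constant C1) of request 1.

    svi1_q1, svi2_q2: SVI series from two separate data requests.
    avg3_q1, avg3_q2: time averages of both queries from one joint request.
    """
    factor = (avg3_q2 / mean(svi2_q2)) * (mean(svi1_q1) / avg3_q1)
    return [value * factor for value in svi2_q2]


# Hypothetical underlying search volume ratios (unobservable in practice).
ratios_q1 = [0.02, 0.05, 0.04]
ratios_q2 = [0.001, 0.002, 0.0015]

# Each separate request scales its own maximum to 100.
c1 = 100.0 / max(ratios_q1)
c2 = 100.0 / max(ratios_q2)
c3 = 1234.0  # arbitrary constant for the joint request of averages

svi1_q1 = [r * c1 for r in ratios_q1]
svi2_q2 = [r * c2 for r in ratios_q2]
avg3_q1 = c3 * mean(ratios_q1)
avg3_q2 = c3 * mean(ratios_q2)

adjusted_q2 = rescale_to_common_unit(svi1_q1, svi2_q2, avg3_q1, avg3_q2)
expected_q2 = [r * c1 for r in ratios_q2]  # query 2 in the unit of request 1
print([round(v, 6) for v in adjusted_q2])
```

The constants $C_2$ and $C_3$ cancel in the adjustment factor, so the result does not depend on the arbitrary scaling of the joint request.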

We apply this adjustment bilaterally to all pairs of SVIs of interest so that all SVIs share one common constant, denoted by $C_0$ henceforth. The value 100 under these comparable SVIs now indicates the highest value among all the SVIs over time and across queries (i.e., country names) in our data set.

We cannot apply this adjustment for SVIs across categories, unfortunately. The Google Trends service does not provide the averages of SVIs across categories, which we need in the adjustment procedure. Therefore, we cannot make SVIs under different categories comparable. For example, in our data set, for Uganda, the SVI under the travel category is higher than that of the finance category, but it does not necessarily mean that more queries are submitted under the travel category than the finance category.

D. Conditions for proper measurement of people’s attention

We establish a simple set of conditions under which the SVI captures the degree of people’s attention to the subject of the search. Following the idea of Da, Engelberg, and Gao (2011), we assume that the search volume on query $q$ at time $t$ in location $l$ is associated with some degree of people’s attention to the entity represented by query $q$, denoted by $A_{t,l}(q)$. We need to be careful in establishing the relationship between $A_{t,l}(q)$ and $SVI_{t,l}(q)$, the latter of which requires access to the Internet and the use of the Google search service.

We first simply assume that $A_{t,l}(q)$ is the fraction of people who are interested in the entity represented by query $q$:

$$A_{t,l}(q) \overset{\text{def}}{=} \frac{N_{t,l}(q)}{\text{Population}_{t,l}},$$

where $\text{Population}_{t,l}$ denotes the total population and $N_{t,l}(q)$ the number of people who are interested in the entity represented by query $q$, at location $l$ in time $t$, regardless of their access to the Internet and their use of Google search. In this way, we put aside the intensive margin of people’s attention, such as the case where some people are more attentive than others. Still, we need to take into account that only part of the people interested in query $q$ have access to the Internet and use Google search to submit query $q$ (Appendix Figure 2).

Appendix Figure 2.

Internet access, use of Google search, and people’s attention

Citation: IMF Working Papers 2018, 286; 10.5089/9781484390177.001.A999

Source: Authors.

We make three assumptions to establish a meaningful relationship between the SVI and people’s attention. The first assumption is about the number of searches on Google per person on average, conditional on making at least one search, which is denoted by $\bar{Q}_{t,l}(q)$. We have:

$$SV_{t,l}(q) = \bar{Q}_{t,l}(q)\, G_{t,l}(q), \qquad SVI_{t,l}(q) = \left[\frac{\bar{Q}_{t,l}(q)\, G_{t,l}(q)}{\sum_{\tilde q} \bar{Q}_{t,l}(\tilde q)\, G_{t,l}(\tilde q)}\right] \times C_0,$$

where $G_{t,l}(q)$ denotes the number of people who search query $q$ (at least once) on Google and $C_0$ is the common constant discussed in Appendix I, Section C. The second assumption simplifies the relationship between $G_{t,l}(q)$ and the number of people interested in the query, $N_{t,l}(q)$. The third assumption deals with the difficulty stemming from multiple counting in the sum over all submitted queries.

Assumption 1: Focus on the extensive margin

The average number of Google searches regarding query $q$ per person, conditional on making at least one search, is constant across queries: $\bar{Q}_{t,l}(q) = \bar{Q}_{t,l}$ for any $t$, $l$, and $q$.

We make Assumption 1, for convenience, to focus only on the extensive margin of the search volume. Under this assumption, we have

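$$SVI_{t,l}(q) = C_0 \frac{SV_{t,l}(q)}{\sum_{\tilde q} SV_{t,l}(\tilde q)} = C_0 \frac{\bar{Q}_{t,l}\, G_{t,l}(q)}{\bar{Q}_{t,l} \sum_{\tilde q} G_{t,l}(\tilde q)} = C_0 \frac{G_{t,l}(q)}{\sum_{\tilde q} G_{t,l}(\tilde q)}.$$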

That is, the SVI is proportional to the number of people who submitted query $q$ as a fraction of the total number of people who use the Google search at location $l$ in time $t$. Assumption 1 claims that the pattern of such multiple search query submissions does not change significantly or systematically across queries. It is reasonable in practice to consider that the levels of SVIs for different queries are not entirely dominated by different degrees of multiple searches across queries (i.e., the intensive margin), but mostly reflect the varying number of searchers across queries (i.e., the extensive margin). Note that the Google Trends service excludes repeated searches from the same person over a short period (Google Trends Help, https://support.google.com/trends/answer/4365533?hl=en&ref_topic=6248052).

Assumption 1 may not hold in several important cases. Some search activities require high-frequency updates, such as seeking real-time financial investment opportunities. In this case, the SVI would be higher than the fraction of people who are interested in query $q$. Therefore, people’s attention based on the SVI would be overestimated for queries related to financial-sector activities (e.g., stock ticker symbols, exchange rates), compared with other, slower activities (e.g., car or home purchases, tourism). Another case is that people who are familiar with information technology may tend to submit more queries than others, and such familiarity may be correlated with some types of queries. Similarly, in the 2000s, most Google searchers were people from colleges and universities (Stephens-Davidowitz and Varian, 2015), and they may have submitted search queries regarding their research activities (e.g., “science”, “statistics”) more frequently than ordinary users did for general search queries. In our application, people in the information technology industry may tend to be interested in queries about countries where the information technology industry is large or emerging (e.g., India). In this case, people’s attention would be overestimated for these countries.

Assumption 2: Random Google search across queries

People have access to the Internet and submit queries of interest to the Google search service at random, with a constant probability that can depend on time $t$ and location $l$ but does not depend on query $q$. In other words, there is no correlation between using the Google search service and being interested in the entity represented by query $q$.

Assumption 2 simplifies the relationship between the SVI and the number of people interested in query q, although the assumption may be too strong. It yields:

$$\frac{G_{t,l}(q)}{\sum_{\tilde q} G_{t,l}(\tilde q)} = \frac{g_{t,l}(q)\, N_{t,l}(q)}{\sum_{\tilde q} g_{t,l}(\tilde q)\, N_{t,l}(\tilde q)} = \frac{g_{t,l}\, N_{t,l}(q)}{g_{t,l} \sum_{\tilde q} N_{t,l}(\tilde q)} = \frac{N_{t,l}(q)}{\sum_{\tilde q} N_{t,l}(\tilde q)},$$

where $g_{t,l}(q)$ denotes the probability that people who are interested in query $q$ make a search using the Google search service and, by Assumption 2, its dependence on query $q$ is dropped at the second equality. Therefore, combined with Assumption 1, the SVI is now proportionate to the number of people who are interested in query $q$. Note that this holds regardless of the improved Internet access and the increase in the use of Google search in developing countries in general during our sample period, because Assumption 2 allows $g_{t,l}$ to change over time and vary across locations.

Assumption 2 fails in cases largely similar to those that violate Assumption 1. As discussed for Assumption 1, the trend shift in the composition of Google search users from people in colleges and universities to a much broader population from the early 2000s to date (Stephens-Davidowitz and Varian, 2015) indicates that the probability of searching the term “science” or “statistics” was higher than that of other terms in the 2000s, violating Assumption 2. Also, those who are interested in information technology would be more likely to use the Google search than others. Such correlation may generate an upward bias on queries about countries where the information technology industry is large or emerging (e.g., India), as discussed for Assumption 1. Assumption 2 also implicitly requires that no queries be submitted by people who are not actually interested in the entities represented by those queries. Such query submissions without interest violate Assumption 2 and add noise to the SVI.

Assumption 3: Stable multiple interests

People may well be interested in multiple queries, but the average number of queries of interest per person is constant over time and across locations.

Assumption 3 is very useful (albeit parsimonious) in establishing a connection between the SVI and economic and social fundamentals. The SVI uses the sum of all submitted queries as its denominator, but this sum is very difficult to analyze in general. Assumption 2 simplifies the denominator to the gross headcount of people who are interested in any of the submitted queries. But it is still difficult to see how large such a gross headcount would be, except for a guess that it would be much larger than the population, because the sum over queries counts one person several times if that person is interested in multiple queries. Assumption 3 claims that this multiple counting occurs for everyone to the same extent on average, establishing the following simple relationship:

$$\sum_{\tilde q} N_{t,l}(\tilde q) = \sum_i M_{t,l}(i) = \bar{M} \times \text{Population}_{t,l},$$

where $M_{t,l}(i)$ denotes the number of queries that person $i$ at location $l$ in time $t$ is interested in, and $\bar{M}$ is the average of that number per person, assumed to be constant by Assumption 3. The first equality holds because the sum counted over queries on the left-hand side is simply recounted as the sum over persons on the right-hand side. Note that Assumption 3 has nothing to do with whether people access the Internet or how often they search on Google. Rather, Assumption 3 is about the human tendency to be interested in multiple things, which would be generic enough to justify the parsimonious assumption that the average per person does not differ across locations or over time.

Proposition 1: SVI as a measure of attention

Under Assumptions 1, 2, and 3, the SVI is proportionate to the degree of people’s attention on the entity represented by a query:

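$$SVI_{t,l}(q) = C_0 \frac{G_{t,l}(q)}{\sum_{\tilde q} G_{t,l}(\tilde q)} = C_0 \frac{N_{t,l}(q)}{\sum_{\tilde q} N_{t,l}(\tilde q)} = \left(\frac{C_0}{\bar{M}}\right) \frac{N_{t,l}(q)}{\text{Population}_{t,l}} = \left(\frac{C_0}{\bar{M}}\right) A_{t,l}(q).$$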

Proposition 1 formalizes the use of the SVI to analyze people’s attention in general. It justifies the use of the SVI and sets a basis to discuss possible biases that could arise in the estimates based on the SVI.
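The chain of equalities in Proposition 1 can be verified numerically. The sketch below is illustrative only: the population size, the five-query universe, the interest probabilities, and the values of `g`, `Q_bar`, and `C0` are all hypothetical, and the three assumptions are imposed by construction rather than tested.

```python
import numpy as np

rng = np.random.default_rng(0)
population, n_queries = 1000, 5

# interest[i, q] is True when person i is interested in query q (hypothetical rates).
rates = np.array([0.5, 0.2, 0.1, 0.05, 0.3])
interest = rng.random((population, n_queries)) < rates

N = interest.sum(axis=0)             # people interested in each query
M_bar = interest.sum(axis=1).mean()  # average number of queries of interest per person
A = N / population                   # attention as defined in the text

g = 0.3       # Assumption 2: search probability does not depend on the query
Q_bar = 2.5   # Assumption 1: searches per searcher do not depend on the query
C0 = 100.0    # common scaling constant

G = g * N                    # searchers per query
SV = Q_bar * G               # search volume per query
SVI = C0 * SV / SV.sum()     # index as a share of all searches
implied = (C0 / M_bar) * A   # right-hand side of Proposition 1
```

Here `SVI` and `implied` coincide because the sum of `N` over queries equals `M_bar * population`, which is the recounting identity behind Assumption 3.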

E. How to detect jumps in the SVIs

We employ a methodology from the finance literature to detect acute increases in the SVIs. Specifically, we apply the continuous-time model of Lee and Mykland (2008) for the log level of stock prices to the log level of SVIs. It uses the difference between squared percent changes and the products of consecutive absolute percent changes (called bi-power variations) to identify huge changes within a period. While the finance model is intended to be applied at a very high frequency, such as a 30-minute window, we apply it to monthly data and thus suffer from lower efficiency and a higher bias from the remaining mean drift component, which is negligible only as the observation frequency goes to infinity. To mitigate this, we first regress the log of SVIs on a third-order polynomial trend and monthly dummies. We then use the residuals to calculate squared and bi-power percent changes. The jump detection is based on statistical inference at the 1 percent significance level. It requires estimating the time-varying instantaneous volatility without jumps, for which we use the rolling-window bi-power variation over the past 36 months, excluding the current month. To retain as many observations as possible, the rolling-window estimation is conducted forward (i.e., over the 36 months ahead) for the first 36 months in the sample. We focus only on positive jumps and ignore negative jumps (i.e., huge drops), because an acute decrease in people’s attention is not intuitive, and thus its detection would likely be erroneous. To increase accuracy, we repeat the procedure once more after removing the detected jumps. Furthermore, we conduct the same procedure at the quarterly frequency, taking period averages and setting the size of the rolling window at 12 quarters. We then take the union of the jumps detected at the monthly and quarterly frequencies.
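The monthly step of this procedure might be sketched as follows. This is a simplified illustration under stated assumptions, not the authors’ implementation: the function name and the synthetic series are hypothetical, the series is assumed to be a gap-free monthly log SVI, the critical value uses the Gumbel-type constants commonly associated with the Lee and Mykland (2008) test, and the second iteration and the quarterly pass are omitted.

```python
import numpy as np

def detect_positive_jumps(log_svi, window=36, alpha=0.01):
    """Return a boolean array flagging months with a significant upward jump."""
    y = np.asarray(log_svi, dtype=float)
    n = y.size
    t = np.arange(n)

    # Step 1: remove a cubic trend and monthly seasonality by least squares.
    ts = t / n  # rescaled time keeps the polynomial columns well conditioned
    trend = np.column_stack([np.ones(n), ts, ts**2, ts**3])
    dummies = (t[:, None] % 12 == np.arange(1, 12)[None, :]).astype(float)
    X = np.hstack([trend, dummies])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef

    # Step 2: residual changes and products of consecutive absolute changes.
    r = np.diff(resid)                     # r[i] is the change into month i + 1
    bp = np.abs(r[1:]) * np.abs(r[:-1])    # bp[k] = |r[k + 1]| * |r[k]|
    m = r.size
    if m < 2 * window:
        raise ValueError("series too short for the chosen window")

    # Step 3: local volatility from the rolling bi-power variation.
    sigma = np.empty(m)
    for i in range(m):
        if i >= window:
            local = bp[i - window:i - 1]   # past window, excluding change i
        else:
            local = bp[i + 1:i + window]   # forward window early in the sample
        sigma[i] = np.sqrt(local.mean())

    # Step 4: compare the standardized change with the critical value.
    c = np.sqrt(2.0 / np.pi)
    root = np.sqrt(2.0 * np.log(m))
    c_n = root / c - (np.log(np.pi) + np.log(np.log(m))) / (2.0 * c * root)
    s_n = 1.0 / (c * root)
    beta_star = -np.log(-np.log(1.0 - alpha))
    stat = (r / sigma - c_n) / s_n         # signed, so only upward moves can exceed

    flags = np.zeros(n, dtype=bool)
    flags[1:] = stat > beta_star
    return flags

# Synthetic example: 13 years of monthly data with a level shift in month 100.
rng = np.random.default_rng(0)
series = 0.002 * np.arange(156) + 0.05 * rng.standard_normal(156)
series[100:] += 1.5
flags = detect_positive_jumps(series)
```

Using the signed statistic rather than its absolute value is one simple way to ignore large drops. Repeating the call on a series with the flagged months removed, and on quarterly averages with `window=12`, would mirror the remaining steps described above.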

Appendix II. Supplementary Tables

Appendix Table 1.

Groupings of the economies

Source: World Economic Outlook (IMF, 2018e).

See also IMF (2018c, Appendix I) for the update of the classification of the LIDCs.

EMEs are defined as the residual group of economies that are not included in AEs nor LIDCs.

Appendix Table 2.

Categories under the Google Trends service

Source: Google Trends website (https://trends.google.com/trends/). Note: Queries are assigned to categories using a natural language processing algorithm, whose details are not disclosed to the public.
Appendix Table 3.

Variable definitions and data sources

Sources: Chinn and Ito (2006); Earth Observation Group; Financial Flows Analytics (FFA, IMF, 2018a); GADM (2018); Google Trends; Henderson, Storeygard, and Weil (2012); International Financial Statistics (IFS, IMF, 2018b); National Geophysical Data Center (with U.S. Air Force Weather Agency); World Development Indicators (WDI, World Bank, 2018); and World Economic Outlook (WEO, IMF, 2018e). Note. Nighttime lights measure the light intensity at some instant between 8:30 and 10:00 pm local time, depending on the location, digitized as an integer between 0 (no light) and 63 (Henderson, Storeygard, and Weil, 2012). The R package Rnightlights (Njuguna, 2018) compiles nighttime light data for 1992–2013 based on DMSP OLS data and for 2015–2016 based on the Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band (DNB) data. The DMSP OLS data are based on the processed images provided by National Geophysical Data Center, while images are collected by U.S. Air Force Weather Agency. The VIIRS DNB data are produced by the Earth Observation Group, NOAA/NCEI. The FFA database is compiled from the IMF’s Balance of Payments Statistics, IFS, and WEO databases, World Bank’s WDI database, Haver Analytics, CEIC Asia database, and CEIC China database. DMSP OLS: Defense Meteorological Satellite Program Operational Linescan System; GEE: Global Economic Environment; LIDCs: low-income developing countries; NCEI: National Centers for Environmental Information; NOAA: National Oceanic and Atmospheric Administration.
Appendix Table 4.

Summary statistics for LIDCs

Sources: Chinn and Ito (2006); Earth Observation Group; Financial Flows Analytics (IMF, 2018a); GADM (2018); Google Trends; Henderson, Storeygard, and Weil (2012); International Financial Statistics (IFS, IMF, 2018b); National Geophysical Data Center (with U.S. Air Force Weather Agency); World Development Indicators (World Bank, 2018); World Economic Outlook (IMF, 2018e); and the authors’ calculation. Note. Sample period: 2004–2016. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. LIDCs: low-income developing countries; REER: real effective exchange rate.
Appendix Table 5.

Pairwise correlation coefficients for selected variables for LIDCs

Sources: Chinn and Ito (2006); Earth Observation Group; Financial Flows Analytics (IMF, 2018a); GADM (2018); Google Trends; Henderson, Storeygard, and Weil (2012); International Financial Statistics (IMF, 2018b); National Geophysical Data Center (with U.S. Air Force Weather Agency); World Development Indicators (World Bank, 2018); World Economic Outlook (IMF, 2018e); and the authors’ calculation. Note. Sample period: 2004–2016. Superscript * indicates significance at the one percent level. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. LIDCs: low-income developing countries.
Appendix Table 6.

Nighttime lights (NLs) and real GDP in LIDCs

Sources: Chinn and Ito (2006); Earth Observation Group; GADM (2018); Google Trends; Henderson, Storeygard, and Weil (2012); National Geophysical Data Center (with U.S. Air Force Weather Agency); World Development Indicators (World Bank, 2018); World Economic Outlook (IMF, 2018e); and the authors’ estimation. Note. Cluster-robust standard errors are reported in parentheses. Superscripts *, **, and *** indicate statistical significance at the 10 percent, 5 percent, and 1 percent level, respectively. The “NLs from HSW (2012)” line shows the coefficients on NLs data (variable lndn) compiled by Henderson, Storeygard, and Weil (2012), available for 1992–2008. The “NL from Rnightlights” line shows the coefficients on NL data compiled by the R package Rnightlights developed by Njuguna (2018), available for 1992–2013 based on DMSP OLS data (also used by Henderson, Storeygard, and Weil, 2012) and for 2015–2016 based on the Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band (DNB). The DMSP OLS data are based on the processed images provided by National Geophysical Data Center, while images are collected by U.S. Air Force Weather Agency. The VIIRS data are produced by the Earth Observation Group, NOAA/NCEI. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. DMSP OLS: Defense Meteorological Satellite Program Operational Linescan System; LIDCs: low-income developing countries; NCEI: National Centers for Environmental Information; NOAA: National Oceanic and Atmospheric Administration; REER: real effective exchange rate; SVI: search volume index.
Appendix Table 7.

Nighttime lights (NLs) and real GDP in EMEs

Sources: Chinn and Ito (2006); Earth Observation Group; GADM (2018); Google Trends; Henderson, Storeygard, and Weil (2012); National Geophysical Data Center (with U.S. Air Force Weather Agency); World Development Indicators (World Bank, 2018); World Economic Outlook (IMF, 2018e); and the authors’ estimation. Note. Cluster-robust standard errors are reported in parentheses. Superscripts *, **, and *** indicate statistical significance at the 10 percent, 5 percent, and 1 percent level, respectively. The “NLs from HSW (2012)” line shows the coefficients on NLs data (variable lndn) compiled by Henderson, Storeygard, and Weil (2012), available for 1992–2008. The “NL from Rnightlights” line shows the coefficients on NL data compiled by the R package Rnightlights developed by Njuguna (2018), available for 1992–2013 based on DMSP OLS data (also used by Henderson, Storeygard, and Weil, 2012) and for 2015–2016 based on the Visible Infrared Imaging Radiometer Suite (VIIRS) Day/Night Band (DNB). The DMSP OLS data are based on the processed images provided by National Geophysical Data Center, while images are collected by U.S. Air Force Weather Agency. The VIIRS data are produced by the Earth Observation Group, NOAA/NCEI. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. Among EMEs, the NL data exclude countries identified as outliers by Henderson, Storeygard, and Weil (2012, footnote 16, p. 1011; Bahrain, Equatorial Guinea, Serbia, Montenegro). For the data compiled by Rnightlights, several large economies are also excluded due to their heavy computational burden (Brazil, Chile, China, Indonesia, India, Mexico, Peru, Russia).
DMSP OLS: Defense Meteorological Satellite Program Operational Linescan System; EMEs: emerging market economies; NCEI: National Centers for Environmental Information; NOAA: National Oceanic and Atmospheric Administration; REER: real effective exchange rate; SVI: search volume index.
Appendix Table 8.

Best specifications that minimize out-of-sample MSE for LIDCs

Sources: Chinn and Ito (2006), Google Trends, Financial Flows Analytics (IMF, 2018a), International Financial Statistics (IMF, 2018b), World Development Indicators (World Bank, 2018), World Economic Outlook (IMF, 2018e), and the authors’ estimation. Note. All control variables are one-year lagged, while SVIs are contemporaneous. See the note under Table 7 for the estimation details. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. LIDCs: low-income developing countries; MSE: mean square error; REER: real effective exchange rate; SVI: search volume index.
Appendix Table 9.

Best specifications that minimize out-of-sample MSE for EMEs

Sources: Chinn and Ito (2006), Google Trends, Financial Flows Analytics (IMF, 2018a), International Financial Statistics (IMF, 2018b), World Development Indicators (World Bank, 2018), World Economic Outlook (IMF, 2018e), and the authors’ estimation. Note. All control variables are one-year lagged, while SVIs are contemporaneous. See the note under Table 7 for the estimation details. See Appendix Table 1 for country groupings and Appendix Table 3 for variable definitions (most variables are in natural logarithm or percent change) and data sources. EMEs: emerging market economies; MSE: mean square error; REER: real effective exchange rate; SVI: search volume index.