Published in: Empirical Economics 6/2021

26-01-2021

Forecasting building permits with Google Trends

Authors: David Coble, Pablo Pincheira



Abstract

We propose a way to predict building permits in the USA that exploits rich data from web search queries. Our work is relevant because the time series on building permits is used as a leading indicator of economic activity in the construction sector. However, new data on building permits are released with a lag of a few weeks, so an accurate nowcast of this leading indicator is desirable. In this paper, we show that models including Google search queries nowcast and forecast better than many of our benchmarks, which are good rather than naïve. We show this with both in-sample and out-of-sample exercises. In addition, we show that these results are robust to different specifications, to the use of rolling or expanding windows and, in some cases, to the forecasting horizon. Since Google query information is free, our approach is a simple and inexpensive way to predict building permits in the USA.


Footnotes
1
For example, Strauss (2013) finds that building permits outperform other standard leading indicators of overall economic activity, such as interest rates and oil prices, in most US states.
 
2
In the USA, the federal agency in charge of collecting these data from granting government agencies is the US Census Bureau, which provides a monthly estimate through the Building Permits Survey. See more information in the data section.
 
3
Aruoba and Diebold (2010), for example, stressed the importance of having higher-frequency, real-time data to monitor macroeconomic variables. Also, the term nowcasting, which was coined by Giannone et al. (2008), was introduced in the literature to refer to their methodology for updating forecasts of lower-frequency variables, such as quarterly GDP, as relevant higher-frequency information, such as monthly industrial production, becomes available.
 
4
Naturally, Google Trends has also been used in other research areas, such as oil consumption (Yu et al. 2019), youth unemployment (Naccarato et al. 2018) and macroeconomics, for example to forecast inflation and consumer confidence (Niesert et al. 2019), to name a few. For a review of the use of Google Trends in research during the last decade, see Jun et al. (2018).
 
5
We are only interested in the aggregate number of building permits in the USA, a series that has no missing data.
 
7
D’Amuri and Marcucci (2017) clarify this point. They present the following equations for calculating the Google search index (GI):
  • The search share of a term on day \(d\) in geographical location \(r\) is the number of searches for the term, \(V_{d,r}\), divided by the total number of searches, \(T_{d,r}\). The daily relative search volume of the term is therefore \(S_{d,r} = \frac{V_{d,r}}{T_{d,r}}\).
  • The relative weekly searches for the term in week \(T\) are calculated as a simple average of the daily values: \(S_{T,r} = \frac{1}{7}\sum_{d = \text{Sunday}}^{\text{Saturday}} S_{d,r}\).
Google also scales the index as follows: \(\text{GI}_{T,r} = \frac{100}{\max_{T}\left(S_{T,r}\right)} S_{T,r}\). D’Amuri and Marcucci (2017) interpret GI as the probability that a random user in location \(r\) searches Google for a particular term during a given week.
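The three steps above can be sketched in a few lines of Python. This is only an illustration of the formulas as stated in this footnote: the function name `google_index`, the array layout (one row per week, seven columns from Sunday to Saturday) and the counts are assumptions made for the example, and Google does not publish the raw counts \(V_{d,r}\) and \(T_{d,r}\).

```python
import numpy as np

def google_index(term_counts, total_counts):
    """Weekly index from daily counts (hypothetical helper, not a Google API).

    term_counts, total_counts: arrays of shape (n_weeks, 7), Sunday to Saturday.
    """
    term_counts = np.asarray(term_counts, dtype=float)
    total_counts = np.asarray(total_counts, dtype=float)
    daily_share = term_counts / total_counts          # S_{d,r} = V_{d,r} / T_{d,r}
    weekly_share = daily_share.mean(axis=1)           # S_{T,r}: average of the 7 daily shares
    return 100.0 * weekly_share / weekly_share.max()  # GI_{T,r}: the peak week equals 100

# Made-up counts for two weeks, for illustration only.
V = [[10] * 7, [20] * 7]
T = [[1000] * 7, [1000] * 7]
gi = google_index(V, T)
```

Because of the final scaling step, the index only carries information about search intensity relative to the peak week of the sample, not about absolute search volume.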
 
8
It is also important to mention that D’Amuri and Marcucci (2017) show that the effects of sampling errors in Google Trends are quite negligible when applied to unemployment data.
 
9
Examples of this literature include Ginsberg et al. (2009), who select 45 queries from 50 million search terms using out-of-sample goodness of fit for illness data, and Scott and Varian (2014), who use Bayesian methods to automatically select predictors of initial claims and retail sales.
 
10
Of course, it is possible to use both simultaneously: for example, using the first method to narrow down some terms, and then using judgment to discard terms that are most likely spurious. Examples of this approach are Fondeur and Karamé (2013), Choi and Varian (2012) and D’Amuri and Marcucci (2017).
 
11
For our preferred search queries, we find high correlations between building permits and each of these variables, with and without seasonal adjustment. The lowest correlation is 0.86 for the query “new housing development” while the highest is 0.96 for the seasonally adjusted query “real estate exam.”
 
13
Notice that estimates of the drift terms are removed from Table 7.
 
14
The penalty for the number of parameters is much higher with BIC than with AIC in estimation windows of 50 observations.
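A small numerical check makes the size of this difference concrete. The sketch below assumes the textbook forms of the two criteria, in which AIC adds 2 per estimated parameter and BIC adds ln(n) per parameter, where n is the number of observations. The paper may report versions normalized by the sample size, which would leave the ratio unchanged. The function names are hypothetical.

```python
import math

def aic_penalty(k):
    """Textbook AIC penalty for k estimated parameters."""
    return 2 * k

def bic_penalty(k, n):
    """Textbook BIC penalty for k parameters and n observations."""
    return k * math.log(n)

n = 50  # estimation window size mentioned in the footnote
ratio = bic_penalty(1, n) / aic_penalty(1)  # BIC penalty relative to AIC, per parameter
```

Under these assumptions each extra parameter costs nearly twice as much under BIC as under AIC when n = 50, so BIC tends to select more parsimonious models.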
 
15
Here, \(\gamma \left( L \right)\) and \(x_{t}\) are defined as in expression (4).
 
16
Table 8 in “Appendix” shows estimates and diagnostic statistics of models (10) and (12) for building permits and the four different Google search queries under consideration. We have again removed the estimates of the drift terms. We observe that our SARIMA specifications seem to offer a better representation of the data relative to our models (3) and (5). In particular, all the coefficients shown in Table 8 are statistically significant at usual levels, the values of the Schwarz criterion are lower in Table 8 than the comparable figures in Table 7, and the Durbin-Watson statistics are closer to 2. This last point indicates that the SARIMA specifications are more successful at removing excess first-order autocorrelation in the errors than our simple specifications in (3) and (5). Finally, while the coefficients of determination show an important degree of heterogeneity, relative to our univariate linear specifications, we observe that SARIMA models tend to produce a higher coefficient of determination for our Google search queries, and a slightly lower one for building permits. This is the only aspect in which the basic linear model in (3) seems to be slightly better than the SARIMA specification in (9) and (11).
 
17
Notice that for estimation of our models we only use a total of R observations both for building permits and Google Trends. The extra observation of Google Trends is used only in the generation of nowcasts and forecasts.
 
18
When recursive or expanding windows are used instead, the size of the estimation window grows with the number of available observations for estimation. For instance, the first nowcast is constructed estimating the models with R observations, whereas the last nowcast is constructed estimating the models with T observations.
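The difference between the two schemes can be illustrated by listing the estimation sample used at each forecast origin. This is a minimal sketch: the function name `estimation_windows` and the zero-based, end-exclusive index convention are assumptions made for the example, and the values of T and R are illustrative rather than those used in the paper.

```python
def estimation_windows(T, R, scheme="rolling"):
    """Return (start, end) index pairs, end exclusive, one per forecast origin.

    T: total number of observations; R: size of the first estimation window.
    """
    windows = []
    for end in range(R, T + 1):
        # Rolling: always the last R observations. Expanding: everything so far.
        start = end - R if scheme == "rolling" else 0
        windows.append((start, end))
    return windows

rolling = estimation_windows(T=8, R=5, scheme="rolling")
expanding = estimation_windows(T=8, R=5, scheme="expanding")
```

With rolling windows every model is estimated on exactly R observations, while with expanding windows the first estimation uses R observations and the last uses all T.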
 
19
Simulation evidence reported by Clark and McCracken (2013) and Pincheira and West (2016) shows that normal critical values tend to work well when multistep-ahead forecasts are constructed using the iterative method, at least when the data generating process is not very persistent. This is very important because in this paper we use the iterative method for the construction of multistep-ahead forecasts.
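As an illustration of what the iterative method means, the sketch below iterates a one-step-ahead forecast forward, feeding each forecast back in as if it were data. An AR(1) with known coefficients is assumed purely for simplicity. The models in the paper are richer, and the function name and parameter values are hypothetical.

```python
def iterated_forecast(c, phi, y_last, horizon):
    """Iterate the one-step AR(1) forecast y_hat = c + phi * y forward."""
    path = []
    y_hat = y_last
    for _ in range(horizon):
        y_hat = c + phi * y_hat  # the previous forecast replaces the unknown future value
        path.append(y_hat)
    return path

# Illustrative coefficients and last observed value.
path = iterated_forecast(c=1.0, phi=0.5, y_last=4.0, horizon=3)
```

The direct method, by contrast, would fit a separate regression for each forecasting horizon instead of reusing the one-step model.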
 
20
We use the word “lags” in parentheses because we are also including in expression (1) contemporaneous terms of the search queries.
 
21
Let us recall that in nested environments the CW test removes a term that should be zero in population under the null hypothesis, but that is not zero in finite samples. Tables 4, 13, 14, and 15 corroborate this prior as the corresponding t-statistics of the GW/DMW test are always lower than the comparable t-statistics of the CW test.
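The role of the adjustment term can be seen in a short sketch. It assumes the usual construction for one-step-ahead forecasts: the DMW statistic is based on the difference in squared forecast errors between the nested (small) model and the nesting (large) model, and the CW (Clark and West) statistic adds back the squared difference between the two forecasts. The function name, the simulated data and the simple t-statistic without an autocorrelation-robust variance are assumptions made for illustration.

```python
import numpy as np

def dmw_and_cw(y, f_small, f_large):
    """Return (t_dmw, t_cw, mean_dmw, mean_cw) for one-step-ahead forecasts."""
    y, f_small, f_large = (np.asarray(a, dtype=float) for a in (y, f_small, f_large))
    d_dmw = (y - f_small) ** 2 - (y - f_large) ** 2  # difference in squared errors
    adjustment = (f_small - f_large) ** 2            # term the CW test adds back
    d_cw = d_dmw + adjustment

    def tstat(d):
        # Plain t-statistic of the mean; no HAC correction in this sketch.
        return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))

    return tstat(d_dmw), tstat(d_cw), d_dmw.mean(), d_cw.mean()

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
f_small = np.zeros(200)               # nested benchmark forecast
f_large = 0.1 * rng.normal(size=200)  # forecast from the larger model
t_dmw, t_cw, mean_dmw, mean_cw = dmw_and_cw(y, f_small, f_large)
```

Since the adjustment term is a square, the numerator of the CW statistic can never be smaller than that of the DMW statistic. The ordering of the t-statistics themselves also depends on the estimated variances, which is why the footnote points to the tables for confirmation.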
 
22
The pairwise Pearson correlations of the 15 series for “real estate exam” fluctuate between 0.97 and 0.99. For “new construction,” all correlations are at least 0.99. For “new housing development,” the correlations are between 0.90 and 0.96. Finally, the correlations for “new home construction” fluctuate between 0.97 and 0.99.
 
Literature
Ang A, Piazzesi M, Wei M (2006) What does the yield curve tell us about GDP growth? J Econom 131(1–2):359–403
Askitas N, Zimmermann K (2011) Detecting mortgage delinquencies with Google Trends. IZA Discussion Paper 5895
Beracha E, Wintoki M (2013) Forecasting residential real estate price changes from online search activity. J Real Estate Res 35(3):283–312
Berge TJ, Jordà Ò (2011) Evaluating the classification of economic activity into recessions and expansions. Am Econ J Macroecon 3(2):246–277
Case KE, Shiller RJ (1989) The efficiency of the market for single-family homes. Am Econ Rev 79(1):125–137
Clark T, McCracken M (2013) Evaluating the accuracy of forecasts from vector autoregressions. In: Fomby T, Kilian L, Murphy A (eds) Vector autoregressive modeling – new developments and applications: essays in honor of Christopher A. Sims. Emerald Group Publishing Limited, Bingley, United Kingdom
D’Amuri F, Marcucci J (2017) The predictive power of Google searches in forecasting US unemployment. Int J Forecast 33(4):801–816
Fondeur Y, Karamé F (2013) Can Google data help predict French youth unemployment? Econ Model 30(C):117–125
Kouwenberg R, Zwinkels R (2014) Forecasting the US housing market. Int J Forecast 30:415–425
Marcellino M, Stock J, Watson M (2006) A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. J Econom 135(1–2):499–526
McGuckin RH, Ozyildirim A, Zarnowitz V (2007) A more timely and useful index of leading indicators. J Bus Econ Stat 25:110–120
Pincheira P, Gatty A (2016) Forecasting Chilean inflation with international factors. Empir Econ 51(3):981–1010
Plakandaras V, Gupta R, Gogas P (2015) Forecasting the U.S. real house price index. Econ Model 45:259–267
Rapach D, Strauss J (2009) Differences in housing price forecastability across US states. Int J Forecast 25(2):351–372
Scott S, Varian H (2014) Predicting the present with Bayesian structural time series. IJMNO 5:4–23
Stock JH, Watson MW (1989) New indexes of coincident and leading economic indicators. In: Blanchard OJ, Fischer S (eds) NBER macroeconomics annual. MIT Press, Cambridge, Massachusetts, London, England, pp 351–409
Wu L, Brynjolfsson E (2015) The future of prediction: how Google searches foreshadow housing prices and sales. In: Goldfarb A, Greenstein S, Tucker C (eds) Economic analysis of the digital economy. University of Chicago Press, pp 89–118
Yu L, Zhao Y, Tang L, Yang Z (2019) Online big data-driven oil consumption forecasting with Google trends. Int J Forecast 35(1):213–223
Metadata
Title
Forecasting building permits with Google Trends
Authors
David Coble
Pablo Pincheira
Publication date
26-01-2021
Publisher
Springer Berlin Heidelberg
Published in
Empirical Economics / Issue 6/2021
Print ISSN: 0377-7332
Electronic ISSN: 1435-8921
DOI
https://doi.org/10.1007/s00181-020-02011-1
