Predicting the Present with Google Trends
Hyunyoung Choi, Hal Varian
June 2009
Problem statement
Government agencies and other organizations produce monthly reports on economic activity:
Retail Sales
House Sales
Automotive Sales
Unemployment
Problems with reports
Google Trends releases a daily and weekly index of search queries by industry vertical
Can Google Trends data help predict current economic activity?
Note: Queries from 2009-01-01 to 2009-04-30 & Growth Comparison w/ the same time window
Home Inspections & Appraisal
Real Estate Agencies
Rental Listings & Referrals
Subcategories under Real Estate by Query Shares
New Home Sales
Housing affordability with Average/Median Home Price
Recent Trend with New Home Sales at t-1
Seasonality with New Home Sales at t-12
New Residential Sales from US Census
Google Trends Real Estate by Category
Y_t = 446.1 + 0.864 * Y_{t-1} - 4.340 * us378.1_t + 4.198 * us96.2_t - 0.001 * AvgP_{t-1}
Y_t: New houses sold in month t
AvgP_{t-1}: Average sales price of new one-family houses sold in month t-1
us378.1_t: Google Trends index for vertical id = 378 (Rental Listings & Referrals), 1st week of month t
us96.2_t: Google Trends index for vertical id = 96 (Real Estate Agencies), 2nd week of month t
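As a quick arithmetic sketch, the fitted equation above can be evaluated directly. Only the coefficients come from the slide; the function name `predict_new_home_sales`, the argument names, and the input values in the usage lines are hypothetical placeholders. The units are an assumption as well: sales in thousands of houses (to match the "K" figures reported) and average price in dollars.

```python
def predict_new_home_sales(prev_sales, rental_trend_w1, agency_trend_w2, prev_avg_price):
    """Evaluate the fitted new-home-sales equation from the slide.

    prev_sales       -- new houses sold in month t-1 (assumed: thousands)
    rental_trend_w1  -- Google Trends index, vertical 378, 1st week of month t
    agency_trend_w2  -- Google Trends index, vertical 96, 2nd week of month t
    prev_avg_price   -- average new-home sales price in month t-1 (assumed: dollars)
    """
    return (446.1
            + 0.864 * prev_sales
            - 4.340 * rental_trend_w1
            + 4.198 * agency_trend_w2
            - 0.001 * prev_avg_price)


# Hypothetical inputs, chosen only to exercise the formula (not data from the slides).
example = predict_new_home_sales(
    prev_sales=500.0,
    rental_trend_w1=10.0,
    agency_trend_w2=5.0,
    prev_avg_price=300_000.0,
)
print(round(example, 2))
```

The signs carry the interpretation: more rental-listing searches lower the predicted sales, while more real-estate-agency searches raise them.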
Actual = 515K
Predicted = 442.98K
Z-score = 2.53
August 2008 Prediction = 417.52K
Bus & Rail
Cruises & Charters
Attractions & Activities
Car Rental & Taxi Services
Hotels & Accommodations
Subcategories under Travel by Query Shares
Visitors Arrival Statistics from Hong Kong Tourism Board
Google Trends Travel by Category
log(Y_{i,t}) = 0.664 + 0.113 * log(Y_{i,t-1}) + 0.828 * log(Y_{i,t-12}) + 0.001 * X_{i,t,2} + 0.001 * X_{i,t,3}
             + 0.005 * FXrate_{i,t} + η_i + e_{i,t}
e_{i,t} ~ N(0, 0.0938^2), η_i ~ N(0, 0.0228^2)
Y_{i,t} = Arrivals to Hong Kong from the i-th country in month t
X_{i,t,1} = Google Trends search index from the i-th country, 1st week of month t
X_{i,t,2} = Google Trends search index from the i-th country, 2nd week of month t
X_{i,t,3} = Google Trends search index from the i-th country, 3rd week of month t
FXrate_{i,t} = Hong Kong dollars per unit of the i-th country's local currency in month t. The average FX rate over the first week is used as a proxy for each month's FX rate.
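Because this model is fitted on the log scale, a prediction has to be exponentiated to get back to arrival counts. The sketch below assumes natural logarithms and treats the country-level random effect as an optional input defaulting to zero (its mean); the function and argument names are illustrative, not taken from the slides. Exponentiating the predicted log ignores any retransformation adjustment, so the result is best read as a rough point estimate.

```python
import math


def predict_hk_arrivals(prev_month, same_month_last_year,
                        trend_w2, trend_w3, fx_rate, country_effect=0.0):
    """Point prediction of arrivals from one country, using the slide's coefficients.

    country_effect stands in for the random effect eta_i; 0.0 is its mean.
    """
    log_y = (0.664
             + 0.113 * math.log(prev_month)
             + 0.828 * math.log(same_month_last_year)
             + 0.001 * trend_w2
             + 0.001 * trend_w3
             + 0.005 * fx_rate
             + country_effect)
    return math.exp(log_y)


# Hypothetical inputs for illustration only.
baseline = predict_hk_arrivals(100_000, 110_000, trend_w2=0.0, trend_w3=0.0, fx_rate=7.8)
higher_search = predict_hk_arrivals(100_000, 110_000, trend_w2=10.0, trend_w3=10.0, fx_rate=7.8)
print(higher_search / baseline)  # multiplicative effect of the extra search interest
```

On the log scale the search terms act multiplicatively: a 10-point rise in both weekly indices scales the prediction by exp(0.02), about 2%.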
US Auto Sales by Make
Google Trends under Vehicle Brands Category
NOTE: Area represents query volume in the first half of 2008; color represents the yearly growth rate of queries
Auto Sales by Make (Top 9 Makes by Sales): Monthly Sales vs. Google Trends in the Second Week of Each Month
Fixed effects model:
log(Y_{i,t}) = 2.4276 + 0.2552 * log(Y_{i,t-1}) + 0.4930 * log(Y_{i,t-12})
             + 0.0005 * X_{i,t,1} + 0.0014 * X_{i,t,2} + a_i * Make_i + e_{i,t}
e_{i,t} ~ N(0, 0.1347^2), Adjusted R^2 = 0.9829
Y_{i,t} = Auto sales of the i-th make in month t
X_{i,t,1} = Google Trends search index for the i-th make, 1st week of month t
X_{i,t,2} = Google Trends search index for the i-th make, 2nd week of month t
Make_i = Dummy variable for auto make
a_i = Coefficient capturing the mean level of auto sales by make
                   Df   Sum Sq  Mean Sq     F value  Pr(>F)
trends1             1    12.89    12.89    710.3542  < 2e-16 ***
trends2             1     0.05     0.05      2.7987  0.09455 .
log(s1)             1  1532.95  1532.95  84452.7530  < 2e-16 ***
log(s12)            1    24.07    24.07   1325.9741  < 2e-16 ***
as.factor(brand)   26     3.34     0.13      7.0696  < 2e-16 ***
Residuals        1480    26.86     0.02
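The F values in the table above follow from the other columns: each is the term's mean square (Sum Sq / Df) divided by the residual mean square. The sketch below recomputes them from the printed figures as a sanity check. Because the printed sums of squares are rounded to two decimals, the results match the table only approximately; the variable and function names are illustrative.

```python
# (Df, Sum Sq) for each term, as printed in the ANOVA table.
anova_rows = {
    "trends1": (1, 12.89),
    "trends2": (1, 0.05),
    "log(s1)": (1, 1532.95),
    "log(s12)": (1, 24.07),
    "as.factor(brand)": (26, 3.34),
}
residual_df, residual_ss = 1480, 26.86


def f_statistics(rows, res_df, res_ss):
    """Return F = (Sum Sq / Df) / (residual Sum Sq / residual Df) for each term."""
    residual_mean_sq = res_ss / res_df
    return {name: (ss / df) / residual_mean_sq for name, (df, ss) in rows.items()}


f_values = f_statistics(anova_rows, residual_df, residual_ss)
for name, f_value in f_values.items():
    print(f"{name:<18} F ~ {f_value:.2f}")
```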
According to the NBER, the current recession started in December 2007.
The national unemployment rate passed 5% in mid-2008, and search queries in the [Welfare and Unemployment] category increased at the same time.
Window For Long Term Model
Window For Short Term Model
Signif. codes: 0.001 ‘***’ 0.01 ‘**’ 0.05 ‘*’
Reference ARIMA(0,1,1) × (1,0,0)_12 Model
ARIMA(0,1,1) × (1,0,0)_12 Model with Google Trends
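For readers unfamiliar with the notation, the two specifications named above can be written out with the backshift operator. This uses the standard textbook form of a seasonal ARIMA model and the sign convention in which the moving-average term enters with a plus sign. How the Google Trends variables enter is an assumption here (a regression with ARIMA errors); the slides do not spell it out, and the symbols \Phi, \theta, \beta, x_t, and n_t are introduced only for this sketch.

```latex
% Reference model: one regular difference, MA(1), seasonal AR(1) at lag 12.
% B is the backshift operator, B y_t = y_{t-1}.
(1 - \Phi B^{12})(1 - B)\, y_t = (1 + \theta B)\, \varepsilon_t,
\qquad \varepsilon_t \sim N(0, \sigma^2)

% With Google Trends regressors x_t (assumed: regression with ARIMA errors).
y_t = \beta^{\top} x_t + n_t,
\qquad (1 - \Phi B^{12})(1 - B)\, n_t = (1 + \theta B)\, \varepsilon_t
```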
Model fit improved significantly: smaller standard deviation, higher log likelihood, and smaller AIC
Initial Claims are positively correlated with searches on Jobs and Welfare.
With Google Trends, the out-of-sample prediction MAE decreases by 16.84%.
Prediction with rolling window from 1/11/2009 to 4/12/2009
Prediction Error at t:
Mean Absolute Error:
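The formulas for these two quantities did not survive in the text, so the sketch below falls back on the common definitions as an assumption: the prediction error at t is the prediction minus the actual value, and the MAE is the average of the absolute errors. It also assumes a fixed-length rolling window with one-step-ahead forecasts. The slides may have used a different error definition (for example, a percentage error) or window scheme. All function names and the toy series are hypothetical.

```python
def rolling_one_step_forecasts(series, window, forecaster):
    """For each t >= window, forecast series[t] from the preceding `window` points."""
    return [forecaster(series[t - window:t]) for t in range(window, len(series))]


def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual| over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mae_reduction_pct(mae_baseline, mae_with_trends):
    """Percentage decline in MAE relative to the baseline model."""
    return 100.0 * (mae_baseline - mae_with_trends) / mae_baseline


# Toy series and a naive "repeat the last value" forecaster, for illustration only.
toy_series = [1.0, 2.0, 3.0, 4.0]
forecasts = rolling_one_step_forecasts(toy_series, window=2, forecaster=lambda w: w[-1])
toy_mae = mean_absolute_error(toy_series[2:], forecasts)
print(forecasts, toy_mae)
```

A reduction such as the percentages quoted in these slides would then be `mae_reduction_pct(mae_reference_model, mae_trends_model)`, with both MAEs computed over the same out-of-sample window.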
With Google Trends, the out-of-sample prediction MAE decreases by 19.23%.
Prediction errors are within the same range as for the LT Model.
Fit improvement is greater with the ST Model.
Google Trends significantly improves out-of-sample prediction of state unemployment, up to 18 days in advance of data release.
Mean absolute error for out-of-sample predictions declines by 16.84% for the long-term (LT) model and 19.23% for the short-term (ST) model.
Can examine metro level data
Other local data (real estate)
Combine with other predictors
Detect turning points?