ABSTRACT
We present a hybrid method that turns off-the-shelf information retrieval (IR) systems into future event predictors. Given a query, a time series model is trained on the publication dates of the retrieved documents to capture trends and periodicity of the associated events. The periodicity of the historic data is then used to estimate a probabilistic model that predicts future bursts. Finally, a hybrid model is obtained by intertwining the probabilistic model and the time series model. Our empirical results on the New York Times corpus show that autocorrelation functions of the time series suffice to classify queries accurately and that our hybrid models yield more accurate future event predictions than baseline competitors.
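The role of the autocorrelation function can be illustrated with a minimal sketch. This is not the authors' implementation: the monthly binning, the function names, and the 0.5 threshold are all assumptions chosen for illustration. The sketch bins the publication dates of retrieved documents into a monthly count series, computes the sample autocorrelation, and reports the lag with the strongest autocorrelation as a candidate period.

```python
from collections import Counter
from datetime import date


def monthly_counts(pub_dates, start_year, end_year):
    """Count retrieved documents per calendar month (assumed binning)."""
    counts = Counter((d.year, d.month) for d in pub_dates)
    return [counts.get((y, m), 0)
            for y in range(start_year, end_year + 1)
            for m in range(1, 13)]


def autocorrelation(series, max_lag):
    """Sample autocorrelation of the series for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        # A constant series carries no periodic signal.
        return [0.0] * max_lag
    return [
        sum((series[t] - mean) * (series[t + lag] - mean)
            for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def dominant_period(series, max_lag, threshold=0.5):
    """Return the lag with the highest autocorrelation, or None if it is
    below the (hypothetical) threshold, i.e. the query looks aperiodic."""
    acf = autocorrelation(series, max_lag)
    best_lag = max(range(1, max_lag + 1), key=lambda lag: acf[lag - 1])
    return best_lag if acf[best_lag - 1] >= threshold else None


# Synthetic example: five documents every December from 2000 to 2009.
dates = [date(y, 12, 1) for y in range(2000, 2010) for _ in range(5)]
series = monthly_counts(dates, 2000, 2009)
period = dominant_period(series, max_lag=24)  # 12 (months)
```

A query whose series yields a period would be treated as periodic and its next burst extrapolated from that lag; a query that returns `None` would fall back to trend-only modelling. How the paper itself classifies queries and combines the two models is not specified in the abstract.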