DOI: 10.1145/2835776.2835825
research-article

Barbara Made the News: Mining the Behavior of Crowds for Time-Aware Learning to Rank

Published: 08 February 2016

ABSTRACT

On Twitter and other microblogging services, the content generated by the crowd is often biased towards immediacy: what is happening now. Prompted by commentary and information propagating through multiple media, Web users interact with and produce new posts about newsworthy topics, giving rise to trending topics. This paper proposes to leverage the behavioral dynamics of users to estimate the most relevant time periods for a topic. Our hypothesis stems from the observation that when a real-world event occurs, it usually has peak times on the Web: a higher volume of tweets, new visits and edits to related Wikipedia articles, and news published about the event.

In this paper, we propose a novel time-aware ranking model that leverages multiple sources of crowd signals. Our approach makes two main contributions. First, a unifying approach that, given a query q, mines and represents temporal evidence from multiple sources of crowd signals, which allows us to predict the temporal relevance of documents for q. Second, a principled retrieval model that integrates temporal signals in a learning-to-rank framework to rank results according to the predicted temporal relevance. Evaluation on the TREC 2013 and 2014 Microblog track datasets demonstrates that the proposed model achieves a relative improvement of 13.2% over lexical retrieval models and 6.2% over a learning-to-rank baseline.
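The two steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Gaussian kernel density estimate as the temporal representation of each crowd source, a fixed bandwidth, and hand-set linear weights standing in for a learned ranker. All function names, the toy timestamps (in days), and the weights are hypothetical.

```python
import math

def kde_density(timestamps, t, bandwidth):
    """Gaussian kernel density estimate at time t from a source's event timestamps."""
    if not timestamps:
        return 0.0
    norm = len(timestamps) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in timestamps) / norm

def temporal_features(doc_time, sources, bandwidth=1.0):
    """One density feature per crowd source, evaluated at the document's timestamp."""
    return [kde_density(sources[name], doc_time, bandwidth) for name in sorted(sources)]

def score(lexical_score, temporal, weights):
    """Linear ranking function: weights[0] scales the lexical score,
    the remaining weights scale the temporal features (assumed, not learned)."""
    features = [lexical_score] + temporal
    return sum(w * f for w, f in zip(weights, features))

def rank(docs, sources, weights, bandwidth=1.0):
    """Return document ids ordered by descending combined score."""
    scored = [
        (score(d["lexical"], temporal_features(d["time"], sources, bandwidth), weights), d["id"])
        for d in docs
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Toy crowd signals for one query: activity clusters around day 10.
sources = {
    "news": [10.3, 10.8],
    "tweets": [10.0, 10.2, 10.5, 11.0],
    "wiki_views": [10.1, 10.4],
}
docs = [
    {"id": "old", "time": 2.0, "lexical": 1.0},    # better lexical match, far from the peak
    {"id": "peak", "time": 10.3, "lexical": 0.9},  # slightly weaker match, posted at the peak
]

lexical_only = rank(docs, sources, weights=[1.0, 0.0, 0.0, 0.0])
time_aware = rank(docs, sources, weights=[1.0, 0.5, 0.5, 0.5])
print(lexical_only, time_aware)
```

With the temporal weights set to zero the ordering follows the lexical score alone; with non-zero temporal weights the document posted during the burst of crowd activity moves to the top. In a learning-to-rank setting the weights would be fitted from relevance judgments rather than chosen by hand.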


      • Published in

        WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
        February 2016
        746 pages
        ISBN: 9781450337168
        DOI: 10.1145/2835776

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        WSDM '16 paper acceptance rate: 67 of 368 submissions, 18%. Overall acceptance rate: 498 of 2,863 submissions, 17%.
