DOI: 10.1145/1242572.1242595
Article

Why we search: visualizing and predicting user behavior

Published: 08 May 2007

ABSTRACT

The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper, we take a first step toward achieving this goal. We present a large-scale study correlating the behaviors of Internet users on multiple systems ranging in size from 27 million queries to 14 million blog posts to 20,000 news articles. We formalize a model for events in these time-varying datasets and study their correlation. We have created an interface for analyzing the datasets, which includes a novel visual artifact, the DTWRadar, for summarizing differences between time series. Using our tool, we identify a number of behavioral properties that allow us to understand the predictive power of patterns of use.
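The name DTWRadar suggests dynamic time warping (DTW), a standard technique for comparing two time series that may be shifted or stretched relative to each other. The abstract does not describe how the DTWRadar is built, so the following is only a minimal sketch of textbook DTW, not the authors' method. The function name `dtw_distance` and the sample series `queries` and `posts` are hypothetical and chosen purely for illustration.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two numeric sequences.

    Illustrative only: uses absolute difference as the local cost and
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical daily counts: the second series is the first delayed by one day.
queries = [0, 1, 5, 2, 0, 0]
posts = [0, 0, 1, 5, 2, 0]

pointwise = sum(abs(x - y) for x, y in zip(queries, posts))
print("pointwise difference:", pointwise)
print("DTW cost:", dtw_distance(queries, posts))
```

In this toy case the pointwise difference is large while the DTW cost is zero, because warping absorbs the one-day lag. A summary of the warp itself (how far and in which direction one series must be shifted to match the other) is the kind of information a lag-oriented visualization could convey.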


    • Published in

      WWW '07: Proceedings of the 16th international conference on World Wide Web
      May 2007
      1382 pages
      ISBN: 9781595936547
      DOI: 10.1145/1242572

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
