Article

Why we search: visualizing and predicting user behavior

Authors:
Eytan Adar (University of Washington: CSE, Seattle, WA),
Daniel S. Weld (University of Washington: CSE, Seattle, WA),
Brian N. Bershad (University of Washington: CSE, Seattle, WA),
Steven S. Gribble (University of Washington: CSE, Seattle, WA)

WWW '07: Proceedings of the 16th international conference on World Wide Web, May 2007, Pages 161–170
https://doi.org/10.1145/1242572.1242595

Published: 8 May 2007

ABSTRACT

The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper, we take a first step toward this goal. We present a large-scale study correlating the behaviors of Internet users on multiple systems, with datasets ranging in size from 27 million queries to 14 million blog posts to 20,000 news articles. We formalize a model for events in these time-varying datasets and study their correlation. We have created an interface for analyzing the datasets, which includes a novel visual artifact, the DTWRadar, for summarizing differences between time series. Using our tool, we identify a number of behavioral properties that allow us to understand the predictive power of patterns of use.
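
The abstract names dynamic time warping (DTW) as the basis for comparing time series but gives no algorithmic detail. As a rough illustration only, the following is a minimal sketch of textbook DTW in Python, not the authors' implementation or the DTWRadar itself. The function names `dtw` and `mean_lag`, the absolute-difference local cost, and the toy series are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def dtw(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the DTW cost and one optimal warping path between a and b.

    Textbook formulation with |a[i] - b[j]| as the local cost (an assumption;
    the abstract does not say which cost or constraints the paper uses).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover one optimal alignment.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal: advance both series
            (cost[i - 1][j], i - 1, j),          # advance a only
            (cost[i][j - 1], i, j - 1),          # advance b only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


def mean_lag(path: List[Tuple[int, int]]) -> float:
    """Hypothetical summary: average of (j - i); positive means b trails a."""
    return sum(j - i for i, j in path) / len(path)


if __name__ == "__main__":
    # Toy data: the second series is the first one delayed by one time step.
    queries = [0, 0, 1, 2, 1, 0, 0]
    blog_posts = [0, 0, 0, 1, 2, 1, 0]
    total_cost, path = dtw(queries, blog_posts)
    print(total_cost, mean_lag(path))
```

In this toy case the warping path absorbs the one-step delay, so the alignment cost is zero while the average offset along the path is positive, which is the kind of lead/lag signal a summary of the warp could expose.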

Index Terms

Why we search: visualizing and predicting user behavior
1. Information systems
  1. Information systems applications
    1. Data mining


Published in
WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572
General Chairs: Carey Williamson (University of Calgary, Canada), Mary Ellen Zurko (IBM, USA)
Program Chairs: Peter Patel-Schneider (Bell Labs Research, USA), Prashant Shenoy (University of Massachusetts at Amherst, USA)
Copyright © 2007 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
DTW
data mining
user behavior
visualization
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%