ABSTRACT
In this paper, we undertake a large-scale study of online user behavior based on search and toolbar logs. We propose a new CCS taxonomy of pageviews consisting of Content (news, portals, games, verticals, multimedia), Communication (email, social networking, forums, blogs, chat), and Search (Web search, item search, multimedia search). We show that roughly half of all pageviews online are content, one-third are communication, and the remaining one-sixth are search. We then give further breakdowns to characterize the pageviews within each high-level category.
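As a minimal sketch of how the CCS taxonomy could be represented and how category shares could be tallied, consider the following. The category and page-type names come from the abstract; the function name `category_shares`, the lowercase labels, and the input format (one page-type label per pageview) are assumptions made for illustration and are not the authors' classifier or log format.

```python
from collections import Counter

# CCS taxonomy as listed in the abstract: top-level category -> page types.
CCS_TAXONOMY = {
    "content": ["news", "portals", "games", "verticals", "multimedia"],
    "communication": ["email", "social networking", "forums", "blogs", "chat"],
    "search": ["web search", "item search", "multimedia search"],
}

# Reverse index: page type -> top-level category.
TYPE_TO_CATEGORY = {
    page_type: category
    for category, page_types in CCS_TAXONOMY.items()
    for page_type in page_types
}


def category_shares(page_types):
    """Return the fraction of pageviews in each top-level CCS category.

    `page_types` is an iterable of page-type labels, one per pageview
    (a hypothetical input format chosen for this sketch).
    """
    counts = Counter(TYPE_TO_CATEGORY[t] for t in page_types)
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in CCS_TAXONOMY}
    return {category: counts.get(category, 0) / total for category in CCS_TAXONOMY}


# Toy log of six pageviews (made-up data, chosen to mirror the reported split).
toy_log = ["news", "games", "portals", "email", "chat", "web search"]
shares = category_shares(toy_log)
```

On this toy log the shares come out to one-half content, one-third communication, and one-sixth search, which matches the proportions the abstract reports only because the toy data was chosen that way.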
Next, we study the extent to which pages of certain types are revisited by the same user over time, and the mechanisms by which users move from page to page, within and across hosts, and within and across page types. We consider robust schemes for assigning responsibility for a pageview to ancestors along the chain of referrals. We show that mail, news, and social networking pageviews are insular in nature, appearing primarily in homogeneous sessions of one type. Search pageviews, on the other hand, appear on the path to a disproportionate number of pageviews, but cannot be viewed as the principal mechanism by which those pageviews were reached.
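The abstract does not say which schemes are used to assign responsibility along the referral chain. One simple possibility, shown here purely as an illustration, is to walk the chain of referrers from a pageview and split one unit of credit among its ancestors with geometrically decaying weights. The function name `ancestor_credit`, the `decay` parameter, and the `parent` mapping are all hypothetical and should not be read as the paper's method.

```python
def ancestor_credit(parent, page, decay=0.5):
    """Split one unit of credit for `page` among its referral ancestors.

    `parent` maps a pageview id to the id of the pageview that referred it
    (None, or absent, when there is no referrer). The nearest ancestor gets
    weight 1, the next one `decay`, then `decay**2`, and so on; the weights
    are normalized to sum to 1. This decay rule is an assumption of the sketch.
    """
    chain = []
    seen = {page}  # guards against cycles in malformed referral data
    node = parent.get(page)
    while node is not None and node not in seen:
        chain.append(node)
        seen.add(node)
        node = parent.get(node)
    if not chain:
        return {}
    weights = [decay ** i for i in range(len(chain))]
    total = sum(weights)
    return {ancestor: w / total for ancestor, w in zip(chain, weights)}


# Toy referral chain (made-up ids): search -> results -> article.
toy_parent = {"article": "results", "results": "search", "search": None}
credit = ancestor_credit(toy_parent, "article")
```

Under a rule of this kind, a search pageview that sits several hops back on the path receives only a small share of the credit. That is one way to separate "appears on the path" from "principal mechanism", though the paper's own schemes may differ.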
Finally, we study the burstiness of pageviews associated with a URL, and show that by and large, online browsing behavior is not significantly affected by "breaking" material with non-uniform visit frequency.