DOI: 10.1145/2786451.2786468
research-article

What can be Found on the Web and How: A Characterization of Web Browsing Patterns

Published: 28 June 2015

ABSTRACT

In this paper, we suggest a novel approach to studying user browsing behavior, i.e., the ways users get to different pages on the Web. Specifically, we classify all user browsing paths leading to web pages into several types, or browsing patterns. To define browsing patterns, we consider three important points of the browsing path: its origin, the last page before the user gets to the domain of the target page, and the target page referrer. Each point can be of several types, which leads to 56 possible patterns. The distribution of the browsing paths over these patterns forms the navigational profile of a web page.
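As a rough illustration of the last sentence, a navigational profile can be sketched as the normalized frequency of patterns among the paths that reach one page. This is a minimal sketch, not the authors' implementation: the abstract does not list the point types or how they combine into 56 patterns, so the `BrowsingPath` fields, the `navigational_profile` function, and the type labels in the example ("search", "bookmark", and so on) are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class BrowsingPath(NamedTuple):
    """One path ending at a target page, reduced to its three key points.

    The string labels used for each point are hypothetical placeholders.
    """
    origin_type: str    # type of the origin of the path
    entry_type: str     # type of the last page before the target's domain
    referrer_type: str  # type of the target page referrer


Pattern = Tuple[str, str, str]


def navigational_profile(paths: Iterable[BrowsingPath]) -> Dict[Pattern, float]:
    """Return the share of paths that fall into each observed pattern."""
    counts = Counter(tuple(p) for p in paths)
    total = sum(counts.values())
    if total == 0:
        return {}  # no incoming paths, so no profile
    return {pattern: n / total for pattern, n in counts.items()}


# Four made-up paths leading to the same target page.
paths = [
    BrowsingPath("search", "search", "internal"),
    BrowsingPath("search", "search", "internal"),
    BrowsingPath("bookmark", "none", "none"),
    BrowsingPath("social", "social", "external"),
]
profile = navigational_profile(paths)
```

Here `profile` maps each observed pattern to its fraction of the incoming paths, so the values sum to 1 for any page with at least one path.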

We conducted a comprehensive large-scale study of navigational profiles of different web pages. First, we demonstrated that the navigational profile of a web page carries crucial information about the properties of this page (e.g., its popularity and age). Second, we found that the Web consists of several typical non-overlapping clusters formed by pages with similar ranges of incoming traffic. These clusters can be characterized by the functionality of their pages.

    • Published in

      WebSci '15: Proceedings of the ACM Web Science Conference
      June 2015
      366 pages
      ISBN:9781450336727
      DOI:10.1145/2786451

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 218 of 875 submissions, 25%
