2014 | OriginalPaper | Chapter

Understanding User Behavior Through Log Data and Analysis

Authors: Susan Dumais, Robin Jeffries, Daniel M. Russell, Diane Tang, Jaime Teevan

Published in: Ways of Knowing in HCI

Publisher: Springer New York

Abstract

HCI researchers are increasingly collecting rich behavioral traces of user interactions with online systems in situ at a scale not previously possible. These logs can be used to characterize user interactions with existing systems and compare different designs. Large-scale log studies give rise to new challenges in experimental design, data collection and interpretation, and ethics. The chapter discusses how to address these challenges using search engine logs, but the methods are applicable to other types of log data.

Metadata

Title: Understanding User Behavior Through Log Data and Analysis
Authors: Susan Dumais, Robin Jeffries, Daniel M. Russell, Diane Tang, Jaime Teevan
Copyright Year: 2014
Publisher: Springer New York
DOI: https://doi.org/10.1007/978-1-4939-0378-8_14