Deriving User Preferences of Mobile Apps from Their Management Activities

Published: 11 July 2017

Abstract

App marketplaces host millions of mobile apps that are downloaded billions of times. Investigating how people manage mobile apps in their everyday lives creates a unique opportunity to understand the behavior and preferences of mobile device users, infer the quality of apps, and improve user experience. The existing literature offers very limited knowledge about app management activities because app usage data at scale has been lacking. This article analyzes a very large app management log collected through a leading Android app marketplace. The dataset covers 5 months of detailed downloading, updating, and uninstallation activities, involving 17 million anonymized users and 1 million apps. We present a surprising finding: the metrics commonly used to rank apps in app stores do not truly reflect users’ real attitudes. We then identify behavioral patterns in the app management activities that more accurately indicate user preferences for an app, even when no explicit rating is available. We design a systematic statistical analysis to evaluate machine learning models trained to predict user preferences from these behavioral patterns; the analysis features an inverse probability weighting method to correct for selection biases in the training process.
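
The inverse probability weighting idea named in the abstract can be illustrated with a small sketch. This is a generic, self-normalized weighting example and not the authors' implementation: the toy population, the rating probabilities (propensities), and the function name `ipw_mean` are all assumptions made for illustration. It supposes that users who dislike an app are more likely to leave a rating than users who like it, so the rated sample over-represents dislikes.

```python
def ipw_mean(outcomes, propensities):
    """Self-normalized inverse-probability-weighted mean.

    outcomes:     observed labels (1 = likes the app, 0 = dislikes it)
    propensities: probability that each observed user left a rating at all
    """
    if len(outcomes) != len(propensities):
        raise ValueError("outcomes and propensities must have equal length")
    if any(p <= 0.0 or p > 1.0 for p in propensities):
        raise ValueError("propensities must lie in (0, 1]")
    # Each observed user stands in for 1/p users of the full population.
    weights = [1.0 / p for p in propensities]
    weighted_sum = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_sum / sum(weights)


# Hypothetical population: 500 users like the app and 500 dislike it.
# Assumed rating behavior: likers rate with probability 0.2, dislikers
# with probability 0.8, so the rated sample holds 100 likers and 400 dislikers.
observed = [(1, 0.2)] * 100 + [(0, 0.8)] * 400
outcomes = [y for y, _ in observed]
propensities = [p for _, p in observed]

naive = sum(outcomes) / len(outcomes)         # ignores who chose to rate
corrected = ipw_mean(outcomes, propensities)  # reweights toward the population

print(f"naive estimate:     {naive:.2f}")
print(f"corrected estimate: {corrected:.2f}")
```

In this toy setting the unweighted share of likers among raters is 0.20, while the weighted estimate recovers the population share of 0.50. In practice the propensities are not known and would themselves have to be estimated, which this sketch does not attempt.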

      • Published in

        ACM Transactions on Information Systems, Volume 35, Issue 4
        Special issue: Search, Mining and their Applications on Mobile Devices
        October 2017
        461 pages
        ISSN: 1046-8188
        EISSN: 1558-2868
        DOI: 10.1145/3112649

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 11 July 2017
        • Accepted: 1 November 2016
        • Revised: 1 October 2016
        • Received: 1 June 2016

        Qualifiers

        • research-article
        • Research
        • Refereed
