ABSTRACT
Smartphones and tablets, together with their apps, have pervaded our everyday lives, creating a new demand for search tools that help users find the right apps to satisfy their immediate needs. While a few commercial mobile app search engines are available, the new task of mobile app retrieval has not yet been rigorously studied. Indeed, no test collection yet exists for quantitatively evaluating this new retrieval task. In this paper, we first study the effectiveness of state-of-the-art retrieval models for the app retrieval task using a new app retrieval test collection that we created. We then propose and study a novel approach that generates a new representation for each app. Our key idea is to leverage user reviews to identify important features of apps and to bridge the vocabulary gap between app developers and users. Specifically, we jointly model app descriptions and user reviews using a topic model in order to generate app representations while excluding noise in reviews. Experimental results indicate that the proposed approach is effective and outperforms state-of-the-art retrieval models for app retrieval.
Index Terms
- Leveraging User Reviews to Improve Accuracy for Mobile App Retrieval