DOI: 10.1145/2766462.2767759
research-article

Leveraging User Reviews to Improve Accuracy for Mobile App Retrieval

Published: 09 August 2015

ABSTRACT

Smartphones and tablets, with their apps, have pervaded our everyday life, leading to a new demand for search tools that help users find the right apps to satisfy their immediate needs. While a few commercial mobile app search engines are available, the new task of mobile app retrieval has not yet been rigorously studied. Indeed, no test collection yet exists for quantitatively evaluating this new retrieval task. In this paper, we first study the effectiveness of state-of-the-art retrieval models for the app retrieval task using a new app retrieval test collection that we created. We then propose and study a novel approach that generates a new representation for each app. Our key idea is to leverage user reviews to identify important features of apps and to bridge the vocabulary gap between app developers and users. Specifically, we jointly model app descriptions and user reviews using a topic model in order to generate app representations while excluding noise in reviews. Experimental results indicate that the proposed approach is effective and outperforms state-of-the-art retrieval models for app retrieval.
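
To make the vocabulary-gap idea concrete, the following is a minimal sketch and not the topic model proposed in the paper. It assumes a much simpler stand-in: each app is represented by its description term counts plus down-weighted review term counts, and apps are ranked by Dirichlet-smoothed query likelihood. All names (`app_representation`, `rank_apps`, `review_weight`, `mu`) and the toy data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also normalize and stem.
    return text.lower().split()


def app_representation(description, reviews, review_weight=0.5):
    # Assumption: review terms count for less than description terms.
    counts = Counter()
    for term in tokenize(description):
        counts[term] += 1.0
    for review in reviews:
        for term in tokenize(review):
            counts[term] += review_weight
    return counts


def query_log_likelihood(query, rep, background, mu=10.0):
    # Dirichlet-smoothed unigram query likelihood over one app representation.
    rep_len = sum(rep.values())
    bg_len = sum(background.values())
    score = 0.0
    for term in tokenize(query):
        # Add-one smoothing keeps the background probability above zero.
        p_bg = (background.get(term, 0.0) + 1.0) / (bg_len + len(background) + 1.0)
        p = (rep.get(term, 0.0) + mu * p_bg) / (rep_len + mu)
        score += math.log(p)
    return score


def rank_apps(query, apps, review_weight=0.5, mu=10.0):
    # apps maps an app name to a (description, list_of_reviews) pair.
    reps = {
        name: app_representation(desc, revs, review_weight)
        for name, (desc, revs) in apps.items()
    }
    background = Counter()
    for rep in reps.values():
        background.update(rep)  # Counter.update adds counts
    scored = [
        (name, query_log_likelihood(query, rep, background, mu))
        for name, rep in reps.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: the word "selfie" appears only in user reviews, not in descriptions.
apps = {
    "photo_app": (
        "edit and share pictures with filters",
        ["great selfie app", "love the selfie filters"],
    ),
    "notes_app": (
        "write and organize notes",
        ["handy for class notes"],
    ),
}

ranking = rank_apps("selfie", apps)
```

In this toy example the query term occurs only in the reviews of `photo_app`, so the review counts are what let that app match the query. The sketch does nothing to exclude noisy review text, which is the part the abstract attributes to the joint topic model.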


    Published in
      SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
      August 2015
      1198 pages
      ISBN: 9781450336215
      DOI: 10.1145/2766462

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      SIGIR '15 Paper Acceptance Rate: 70 of 351 submissions, 20%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
