skip to main content
10.1145/2837614.2837661acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
research-article
Public Access

Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis

Published:11 January 2016Publication History

ABSTRACT

Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.

References

  1. A. Aiken and E.L. Wimmers. Solving systems of set constraints. In Logic in Computer Science, 1992. LICS ’92., Proceedings of the Seventh Annual IEEE Symposium on, pages 329–340, Jun 1992.Google ScholarGoogle ScholarCross RefCross Ref
  2. Alexander Aiken. Set constraints: Results, applications and future directions. In Principles and Practice of Constraint Programming, pages 326–335. Springer, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Alexander Aiken. Introduction to set constraint-based program analysis. Sci. Comput. Program., 35(2-3):79–111, November 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Alexander Aiken, Dexter Kozen, and Ed Wimmers. Decidability of systems of set constraints with negative constraints. Information and Computation, 122, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. AppBrain. Number of available android applications. Available from http://www.appbrain.com/stats/number-of-android-apps.Google ScholarGoogle Scholar
  6. Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. Flowdroid: Precise context, flow, field, objectsensitive and lifecycle-aware taint analysis for android apps. In Proceedings of the 35th Conference on Programming Language Design and Implementation (PLDI), June 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Saurabh Chakradeo, Bradley Reaves, Patrick Traynor, and William Enck. Mast: Triage for market-scale mobile malware analysis. In Proceedings of the Sixth ACM Conference on Security and Privacy in Wireless and Mobile Networks, WiSec ’13, pages 13–24, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Patrick P.F. Chan, Lucas C.K. Hui, and S. M. Yiu. Droidchecker: Analyzing android applications for capability leak. In Proceedings of the Fifth ACM Conference on Security and Privacy in Wireless and Mobile Networks, WISEC ’12, pages 125–136, New York, NY, USA, 2012. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Witold Charatonik and Leszek Pacholski. Set constraints with projections. J. ACM, 57(4):23:1–23:37, May 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Erika Chin, Adrienne Porter Felt, Kate Greenwood, and David Wagner. Analyzing Inter-Application Communication in Android. In Proceedings of the 9th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Mihai Christodorescu and Somesh Jha. Static analysis of executables to detect malicious patterns. In Proceedings of the 12th Conference on USENIX Security Symposium - Volume 12, SSYM’03, pages 12–12, Berkeley, CA, USA, 2003. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Patrick Cousot and Michael Monerau. Probabilistic abstract interpretation. In Helmut Seidl, editor, Programming Languages and Systems, volume 7211 of Lecture Notes in Computer Science, pages 169–193. Springer Berlin Heidelberg, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Matthew Dering and Patrick McDaniel. Android market reconstruction and analysis. In Proceedings of the Military Communications Conference (MILCOM), 10 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. William Enck, Damien Octeau, Patrick McDaniel, and Swarat Chaudhuri. A Study of Android Application Security. In Proceedings of the 20th USENIX Security Symposium, August 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Adrienne Porter Felt, Helen J. Wang, Alexander Moshchuk, Steven Hanna, and Erika Chin. Permission Re-Delegation: Attacks and Defenses. In Proceedings of the 20th USENIX Security Symposium, August 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Yu Feng, Saswat Anand, Isil Dillig, and Alex Aiken. Apposcopy: Semantics-based detection of android malware through static analysis. In Proceedings of the 22nd ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE ’14), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. A. Filieri, C.S. Pasareanu, and W. Visser. Reliability analysis in symbolic pathfinder. In Software Engineering (ICSE), 2013 35th International Conference on, pages 622–631, May 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Wouter Gelade and Frank Neven. Succinctness of the complement and intersection of regular expressions. ACM Transactions on Computer Logic, 13(1):4, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Jaco Geldenhuys, Matthew B. Dwyer, and Willem Visser. Probabilistic symbolic execution. In Proceedings of the 2012 International Symposium on Software Testing and Analysis, ISSTA 2012, pages 166–176, New York, NY, USA, 2012. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Leo A. Goodman and William H. Kruskal. Measures of association for cross classifications. Journal of the American Statistical Association, 49(268):pp. 732–764, 1954.Google ScholarGoogle Scholar
  21. Google. Intents and intent filters. Available from https://developer.android.com/guide/components/ intents-filters.html.Google ScholarGoogle Scholar
  22. Michael Grace, Yajin Zhou, Zhi Wang, and Xuxian Jiang. Systematic Detection of Capability Leaks in Stock Android Smartphones. In NDSS ’12, 2012.Google ScholarGoogle Scholar
  23. Andrew Hinton, Marta Kwiatkowska, Gethin Norman, and David Parker. Prism: A tool for automatic verification of probabilistic systems. In Holger Hermanns and Jens Palsberg, editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 3920 of Lecture Notes in Computer Science, pages 441–444. Springer Berlin Heidelberg, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Jianjun Huang, Xiangyu Zhang, Lin Tan, Peng Wang, and Bin Liang. Asdroid: Detecting stealthy behaviors in android applications by user interface and program behavior contradiction. In Proceedings of the 36th International Conference on Software Engineering, ICSE 2014, pages 1036–1046, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. B. Johnson, Yoonki Song, E. Murphy-Hill, and R. Bowdidge. Why don’t software developers use static analysis tools to find bugs? In Software Engineering (ICSE), 2013 35th International Conference on, pages 672–681, May 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. William Klieber, Lori Flynn, Amar Bhosale, Limin Jia, and Lujo Bauer. Android taint flow analysis for app sets. In Proceedings of the 3rd ACM SIGPLAN International Workshop on the State of the Art in Java Program Analysis, SOAP ’14, pages 1–6, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Ted Kremenek and Dawson Engler. Z-ranking: Using statistical analysis to counter the impact of static analysis approximations. In Proceedings of the 10th International Symposium on Static Analysis, SAS’03, pages 295–315, Berlin, Heidelberg, 2003. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Li Li, Alexandre Bartel, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick Mcdaniel. IccTA: Detecting Inter-Component Privacy Leaks in Android Apps. In Proceedings of the 37th International Conference on Software Engineering (ICSE 2015), 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Li Li, Alexandre Bartel, Jacques Klein, and Yves Le Traon. Automatically exploiting potential component leaks in android applications. In Proceedings of the 13th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2014). IEEE, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Benjamin Livshits, Aditya V. Nori, Sriram K. Rajamani, and Anindya Banerjee. Merlin: Specification inference for explicit information flow problems. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’09, pages 75–86, New York, NY, USA, 2009. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. V. Benjamin Livshits and Monica S. Lam. Finding security vulnerabilities in java applications with static analysis. In Proceedings of the 14th Conference on USENIX Security Symposium - Volume 14, SSYM’05, pages 18–18, Berkeley, CA, USA, 2005. USENIX Association. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Long Lu, Zhichun Li, Zhenyu Wu, Wenke Lee, and Guofei Jiang. Chex: statically vetting android apps for component hijacking vulnerabilities. In Proceedings of the 2012 ACM conference on Computer and communications security, CCS ’12, pages 229–240. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Amiya Kumar Maji, Fahad A Arshad, Saurabh Bagchi, and Jan S Rellermeyer. An empirical study of the robustness of inter-component communication in android. In Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on, pages 1–12. IEEE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Piotr Mardziel, Stephen Magill, Michael Hicks, and Mudhakar Srivatsa. Dynamic enforcement of knowledge-based security policies. In Proceedings of the 2011 IEEE 24th Computer Security Foundations Symposium, CSF ’11, pages 114–128, Washington, DC, USA, 2011. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. David Monniaux. An abstract monte-carlo method for the analysis of probabilistic programs. In Proceedings of the 28th ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, POPL ’01, pages 93–101, New York, NY, USA, 2001. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Damien Octeau, Somesh Jha, and Patrick McDaniel. Retargeting android applications to java bytecode. In Proceedings of the 20th International Symposium on the Foundations of Software Engineering, November 2012. Available from http://siis.cse.psu.edu/ dare/. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Damien Octeau, Daniel Luchaup, Matthew Dering, Somesh Jha, and Patrick McDaniel. Composite Constant Propagation: Application to Android Inter-Component Communication Analysis. In Proceedings of the 37th International Conference on Software Engineering (ICSE), May 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Damien Octeau, Patrick McDaniel, Somesh Jha, Alexandre Bartel, Eric Bodden, Jacques Klein, and Yves Le Traon. Effective intercomponent communication mapping in android with epicc: an essential step towards holistic security analysis. In Proceedings of the 22nd USENIX Security Symposium, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Veselin Raychev, Martin Vechev, and Andreas Krause. Predicting program properties from ”big code”. In Proceedings of the 42Nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’15, pages 111–124, New York, NY, USA, 2015. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Ben Fox Rubin. Amazon appstore nears 400k apps on ’huge progress’. Available from http://www.cnet.com/news/ amazon-appstore-nears-400k-apps-on-huge-progress/.Google ScholarGoogle Scholar
  41. Sriram Sankaranarayanan, Aleksandar Chakarov, and Sumit Gulwani. Static analysis for probabilistic programs: Inferring whole program properties from finitely many paths. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 447–458, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Feng Shen, Namita Vishnubhotla, Chirag Todarka, Mohit Arora, Babu Dhandapani, Eric John Lehner, Steven Y. Ko, and Lukasz Ziarek. Information flows as a permission mechanism. In To appear in Proceedings of the 29th IEEE/ACM International Conference on Automated Software Engineering (ASE ’14), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Omer Tripp, Salvatore Guarnieri, Marco Pistoia, and Aleksandr Aravkin. Aletheia: Improving the usability of static security analysis. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, pages 762–774, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Saksham Varma, Robert J. Walls, Brian Lynn, and Brian Neil Levine. Efficient smart phone forensics based on relevance feedback. In Proceedings of the 4th ACM Workshop on Security and Privacy in Smartphones & Mobile Devices, SPSM ’14, pages 81–91, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Nicolas Viennot, Edward Garcia, and Jason Nieh. A measurement study of google play. In The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’14, pages 221–233, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Fengguo Wei, Sankardas Roy, Xinming Ou, and Robby. Amandroid: A precise and general inter-component data flow analysis framework for security vetting of android apps. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, pages 1329–1341, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Daoyuan Wu, Xiapu Luo, and Rocky KC Chang. A sink-driven approach to detecting exposed component vulnerabilities in android apps. arXiv preprint arXiv:1405.6282, 2014.Google ScholarGoogle Scholar
  48. Lei Wu, Michael Grace, Yajin Zhou, Chiachih Wu, and Xuxian Jiang. The impact of vendor customizations on android security. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, CCS ’13, pages 623–634, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Mu Zhang and Heng Yin. Appsealer: Automatic generation of vulnerability-specific patches for preventing component hijacking attacks in android applications. In Proceedings of the 21th Annual Network and Distributed System Security Symposium (NDSS ’14), 2014.Google ScholarGoogle ScholarCross RefCross Ref
  50. Yajin Zhou and Xuxian Jiang. Dissecting android malware: Characterization and evolution. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 95–109. IEEE, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Yajin Zhou and Xuxian Jiang. Detecting passive content leaks and pollution in android applications. In Proceedings of the 20th Annual Symposium on Network and Distributed System Security, 2013.Google ScholarGoogle Scholar

Index Terms

  1. Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
        January 2016
        815 pages
        ISBN:9781450335492
        DOI:10.1145/2837614
        • cover image ACM SIGPLAN Notices
          ACM SIGPLAN Notices  Volume 51, Issue 1
          POPL '16
          January 2016
          815 pages
          ISSN:0362-1340
          EISSN:1558-1160
          DOI:10.1145/2914770
          • Editor:
          • Andy Gill
          Issue’s Table of Contents

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 11 January 2016

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate824of4,130submissions,20%

        Upcoming Conference

        POPL '25

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader