ABSTRACT
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
- A. Aiken and E.L. Wimmers. Solving systems of set constraints. In Logic in Computer Science, 1992. LICS ’92., Proceedings of the Seventh Annual IEEE Symposium on, pages 329–340, Jun 1992.Google ScholarCross Ref
- Alexander Aiken. Set constraints: Results, applications and future directions. In Principles and Practice of Constraint Programming, pages 326–335. Springer, 1994. Google ScholarDigital Library
- Alexander Aiken. Introduction to set constraint-based program analysis. Sci. Comput. Program., 35(2-3):79–111, November 1999. Google ScholarDigital Library
- Alexander Aiken, Dexter Kozen, and Ed Wimmers. Decidability of systems of set constraints with negative constraints. Information and Computation, 122, 1995. Google ScholarDigital Library
- AppBrain. Number of available android applications. Available from http://www.appbrain.com/stats/number-of-android-apps.Google Scholar
- Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. Flowdroid: Precise context, flow, field, objectsensitive and lifecycle-aware taint analysis for android apps. In Proceedings of the 35th Conference on Programming Language Design and Implementation (PLDI), June 2014. Google ScholarDigital Library
- Saurabh Chakradeo, Bradley Reaves, Patrick Traynor, and William Enck. Mast: Triage for market-scale mobile malware analysis. In Proceedings of the Sixth ACM Conference on Security and Privacy in Wireless and Mobile Networks, WiSec ’13, pages 13–24, New York, NY, USA, 2013. ACM. Google ScholarDigital Library
- Patrick P.F. Chan, Lucas C.K. Hui, and S. M. Yiu. Droidchecker: Analyzing android applications for capability leak. In Proceedings of the Fifth ACM Conference on Security and Privacy in Wireless and Mobile Networks, WISEC ’12, pages 125–136, New York, NY, USA, 2012. ACM. Google ScholarDigital Library
- Witold Charatonik and Leszek Pacholski. Set constraints with projections. J. ACM, 57(4):23:1–23:37, May 2010. Google ScholarDigital Library
- Erika Chin, Adrienne Porter Felt, Kate Greenwood, and David Wagner. Analyzing Inter-Application Communication in Android. In Proceedings of the 9th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys), 2011. Google ScholarDigital Library
- Mihai Christodorescu and Somesh Jha. Static analysis of executables to detect malicious patterns. In Proceedings of the 12th Conference on USENIX Security Symposium - Volume 12, SSYM’03, pages 12–12, Berkeley, CA, USA, 2003. USENIX Association. Google ScholarDigital Library
- Patrick Cousot and Michael Monerau. Probabilistic abstract interpretation. In Helmut Seidl, editor, Programming Languages and Systems, volume 7211 of Lecture Notes in Computer Science, pages 169–193. Springer Berlin Heidelberg, 2012. Google ScholarDigital Library
- Matthew Dering and Patrick McDaniel. Android market reconstruction and analysis. In Proceedings of the Military Communications Conference (MILCOM), 10 2014. Google ScholarDigital Library
- William Enck, Damien Octeau, Patrick McDaniel, and Swarat Chaudhuri. A Study of Android Application Security. In Proceedings of the 20th USENIX Security Symposium, August 2011. Google ScholarDigital Library
- Adrienne Porter Felt, Helen J. Wang, Alexander Moshchuk, Steven Hanna, and Erika Chin. Permission Re-Delegation: Attacks and Defenses. In Proceedings of the 20th USENIX Security Symposium, August 2011. Google ScholarDigital Library
- Yu Feng, Saswat Anand, Isil Dillig, and Alex Aiken. Apposcopy: Semantics-based detection of android malware through static analysis. In Proceedings of the 22nd ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE ’14), 2014. Google ScholarDigital Library
- A. Filieri, C.S. Pasareanu, and W. Visser. Reliability analysis in symbolic pathfinder. In Software Engineering (ICSE), 2013 35th International Conference on, pages 622–631, May 2013. Google ScholarDigital Library
- Wouter Gelade and Frank Neven. Succinctness of the complement and intersection of regular expressions. ACM Transactions on Computer Logic, 13(1):4, 2012. Google ScholarDigital Library
- Jaco Geldenhuys, Matthew B. Dwyer, and Willem Visser. Probabilistic symbolic execution. In Proceedings of the 2012 International Symposium on Software Testing and Analysis, ISSTA 2012, pages 166–176, New York, NY, USA, 2012. ACM. Google ScholarDigital Library
- Leo A. Goodman and William H. Kruskal. Measures of association for cross classifications. Journal of the American Statistical Association, 49(268):pp. 732–764, 1954.Google Scholar
- Google. Intents and intent filters. Available from https://developer.android.com/guide/components/ intents-filters.html.Google Scholar
- Michael Grace, Yajin Zhou, Zhi Wang, and Xuxian Jiang. Systematic Detection of Capability Leaks in Stock Android Smartphones. In NDSS ’12, 2012.Google Scholar
- Andrew Hinton, Marta Kwiatkowska, Gethin Norman, and David Parker. Prism: A tool for automatic verification of probabilistic systems. In Holger Hermanns and Jens Palsberg, editors, Tools and Algorithms for the Construction and Analysis of Systems, volume 3920 of Lecture Notes in Computer Science, pages 441–444. Springer Berlin Heidelberg, 2006. Google ScholarDigital Library
- Jianjun Huang, Xiangyu Zhang, Lin Tan, Peng Wang, and Bin Liang. Asdroid: Detecting stealthy behaviors in android applications by user interface and program behavior contradiction. In Proceedings of the 36th International Conference on Software Engineering, ICSE 2014, pages 1036–1046, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- B. Johnson, Yoonki Song, E. Murphy-Hill, and R. Bowdidge. Why don’t software developers use static analysis tools to find bugs? In Software Engineering (ICSE), 2013 35th International Conference on, pages 672–681, May 2013. Google ScholarDigital Library
- William Klieber, Lori Flynn, Amar Bhosale, Limin Jia, and Lujo Bauer. Android taint flow analysis for app sets. In Proceedings of the 3rd ACM SIGPLAN International Workshop on the State of the Art in Java Program Analysis, SOAP ’14, pages 1–6, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- Ted Kremenek and Dawson Engler. Z-ranking: Using statistical analysis to counter the impact of static analysis approximations. In Proceedings of the 10th International Symposium on Static Analysis, SAS’03, pages 295–315, Berlin, Heidelberg, 2003. Springer-Verlag. Google ScholarDigital Library
- Li Li, Alexandre Bartel, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick Mcdaniel. IccTA: Detecting Inter-Component Privacy Leaks in Android Apps. In Proceedings of the 37th International Conference on Software Engineering (ICSE 2015), 2015. Google ScholarDigital Library
- Li Li, Alexandre Bartel, Jacques Klein, and Yves Le Traon. Automatically exploiting potential component leaks in android applications. In Proceedings of the 13th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2014). IEEE, 2014. Google ScholarDigital Library
- Benjamin Livshits, Aditya V. Nori, Sriram K. Rajamani, and Anindya Banerjee. Merlin: Specification inference for explicit information flow problems. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’09, pages 75–86, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- V. Benjamin Livshits and Monica S. Lam. Finding security vulnerabilities in java applications with static analysis. In Proceedings of the 14th Conference on USENIX Security Symposium - Volume 14, SSYM’05, pages 18–18, Berkeley, CA, USA, 2005. USENIX Association. Google ScholarDigital Library
- Long Lu, Zhichun Li, Zhenyu Wu, Wenke Lee, and Guofei Jiang. Chex: statically vetting android apps for component hijacking vulnerabilities. In Proceedings of the 2012 ACM conference on Computer and communications security, CCS ’12, pages 229–240. ACM, 2012. Google ScholarDigital Library
- Amiya Kumar Maji, Fahad A Arshad, Saurabh Bagchi, and Jan S Rellermeyer. An empirical study of the robustness of inter-component communication in android. In Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on, pages 1–12. IEEE, 2012. Google ScholarDigital Library
- Piotr Mardziel, Stephen Magill, Michael Hicks, and Mudhakar Srivatsa. Dynamic enforcement of knowledge-based security policies. In Proceedings of the 2011 IEEE 24th Computer Security Foundations Symposium, CSF ’11, pages 114–128, Washington, DC, USA, 2011. IEEE Computer Society. Google ScholarDigital Library
- David Monniaux. An abstract monte-carlo method for the analysis of probabilistic programs. In Proceedings of the 28th ACM SIGPLANSIGACT Symposium on Principles of Programming Languages, POPL ’01, pages 93–101, New York, NY, USA, 2001. ACM. Google ScholarDigital Library
- Damien Octeau, Somesh Jha, and Patrick McDaniel. Retargeting android applications to java bytecode. In Proceedings of the 20th International Symposium on the Foundations of Software Engineering, November 2012. Available from http://siis.cse.psu.edu/ dare/. Google ScholarDigital Library
- Damien Octeau, Daniel Luchaup, Matthew Dering, Somesh Jha, and Patrick McDaniel. Composite Constant Propagation: Application to Android Inter-Component Communication Analysis. In Proceedings of the 37th International Conference on Software Engineering (ICSE), May 2015. Google ScholarDigital Library
- Damien Octeau, Patrick McDaniel, Somesh Jha, Alexandre Bartel, Eric Bodden, Jacques Klein, and Yves Le Traon. Effective intercomponent communication mapping in android with epicc: an essential step towards holistic security analysis. In Proceedings of the 22nd USENIX Security Symposium, 2013. Google ScholarDigital Library
- Veselin Raychev, Martin Vechev, and Andreas Krause. Predicting program properties from ”big code”. In Proceedings of the 42Nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’15, pages 111–124, New York, NY, USA, 2015. ACM. Google ScholarDigital Library
- Ben Fox Rubin. Amazon appstore nears 400k apps on ’huge progress’. Available from http://www.cnet.com/news/ amazon-appstore-nears-400k-apps-on-huge-progress/.Google Scholar
- Sriram Sankaranarayanan, Aleksandar Chakarov, and Sumit Gulwani. Static analysis for probabilistic programs: Inferring whole program properties from finitely many paths. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 447–458, New York, NY, USA, 2013. ACM. Google ScholarDigital Library
- Feng Shen, Namita Vishnubhotla, Chirag Todarka, Mohit Arora, Babu Dhandapani, Eric John Lehner, Steven Y. Ko, and Lukasz Ziarek. Information flows as a permission mechanism. In To appear in Proceedings of the 29th IEEE/ACM International Conference on Automated Software Engineering (ASE ’14), 2014. Google ScholarDigital Library
- Omer Tripp, Salvatore Guarnieri, Marco Pistoia, and Aleksandr Aravkin. Aletheia: Improving the usability of static security analysis. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, pages 762–774, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- Saksham Varma, Robert J. Walls, Brian Lynn, and Brian Neil Levine. Efficient smart phone forensics based on relevance feedback. In Proceedings of the 4th ACM Workshop on Security and Privacy in Smartphones & Mobile Devices, SPSM ’14, pages 81–91, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- Nicolas Viennot, Edward Garcia, and Jason Nieh. A measurement study of google play. In The 2014 ACM International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’14, pages 221–233, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- Fengguo Wei, Sankardas Roy, Xinming Ou, and Robby. Amandroid: A precise and general inter-component data flow analysis framework for security vetting of android apps. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, pages 1329–1341, New York, NY, USA, 2014. ACM. Google ScholarDigital Library
- Daoyuan Wu, Xiapu Luo, and Rocky KC Chang. A sink-driven approach to detecting exposed component vulnerabilities in android apps. arXiv preprint arXiv:1405.6282, 2014.Google Scholar
- Lei Wu, Michael Grace, Yajin Zhou, Chiachih Wu, and Xuxian Jiang. The impact of vendor customizations on android security. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, CCS ’13, pages 623–634, New York, NY, USA, 2013. ACM. Google ScholarDigital Library
- Mu Zhang and Heng Yin. Appsealer: Automatic generation of vulnerability-specific patches for preventing component hijacking attacks in android applications. In Proceedings of the 21th Annual Network and Distributed System Security Symposium (NDSS ’14), 2014.Google ScholarCross Ref
- Yajin Zhou and Xuxian Jiang. Dissecting android malware: Characterization and evolution. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 95–109. IEEE, 2012. Google ScholarDigital Library
- Yajin Zhou and Xuxian Jiang. Detecting passive content leaks and pollution in android applications. In Proceedings of the 20th Annual Symposium on Network and Distributed System Security, 2013.Google Scholar
Index Terms
- Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis
Recommendations
Combining static analysis with probabilistic models to enable market-scale Android inter-component analysis
POPL '16Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, ...
Android Malware Static Analysis Techniques
CISR '15: Proceedings of the 10th Annual Cyber and Information Security Research ConferenceDuring 2014, Business Insider announced that there are over a billion users of Android worldwide. Government officials are also trending towards acquiring Android mobile devices. Google's application architecture is already ubiquitous and will keep ...
HybriDroid: static analysis framework for Android hybrid applications
ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software EngineeringMobile applications (apps) have long invaded the realm of desktop apps, and hybrid apps become a promising solution for supporting multiple mobile platforms. Providing both platform-specific functionalities via native code like native apps and user ...
Comments