Skip to main content
Top
Published in:
Cover of the book

2019 | OriginalPaper | Chapter

AndroParse - An Android Feature Extraction Framework and Dataset

Authors : Robert Schmicker, Frank Breitinger, Ibrahim Baggili

Published in: Digital Forensics and Cyber Crime

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Android malware has become a major challenge. As a consequence, practitioners and researchers spend a significant time analyzing Android applications (APK). A common procedure (especially for data scientists) is to extract features such as permissions, APIs or strings which can then be analyzed. Current state of the art tools have three major issues: (1) a single tool cannot extract all the significant features used by scientists and practitioners (2) Current tools are not designed to be extensible and (3) Existing parsers can be timely as they are not runtime efficient or scalable. Therefore, this work presents AndroParse which is an open-source Android parser written in Golang that currently extracts the four most common features: Permissions, APIs, Strings and Intents. AndroParse outputs JSON files as they can easily be used by most major programming languages. Constructing the parser allowed us to create an extensive feature dataset which can be accessed by our independent REST API. Our dataset currently has 67,703 benign and 46,683 malicious APK samples.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
2
https://​64.​251.​61.​74/​ (last accessed 13-April-2018).
 
3
A prominent example that these services are valuable for the community is the UCI Machine Learning Repository [25] which includes a multitude of data and repositories and is frequently referenced in literature.
 
4
http://​www.​malgenomeproject​.​org (last accessed 13-April-2018).
 
6
 
7
This portion of code must be performed sequentially as there is a low-level JVM memory error when multiple threads access the library at once.
 
8
https://​golang.​org/​pkg/​plugin/​ (last accessed 13-April-2018).
 
9
One can use any language as long as the code can be compiled into a shared object file.
 
Literature
3.
go back to reference Anonymous. CAPIL: Component-API linkage for android malware detection (2016, unpublished) Anonymous. CAPIL: Component-API linkage for android malware detection (2016, unpublished)
6.
go back to reference Apvrille, L., Apvrille, A.: Identifying unknown android malware with feature extractions and classification techniques. In: 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 182–189. IEEE (2015) Apvrille, L., Apvrille, A.: Identifying unknown android malware with feature extractions and classification techniques. In: 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 182–189. IEEE (2015)
7.
go back to reference Arp, D., Spreitzenbarth, M., Hübner, M., Gascon, H., Rieck, K., CERT Siemens: DREBIN: effective and explainable detection of android malware in your pocket. In: Proceedings of the Annual Symposium on Network and Distributed System Security (NDSS) (2014). https://www.sec.cs.tu-bs.de/~danarp/drebin/. Accessed 13 Apr 2018 Arp, D., Spreitzenbarth, M., Hübner, M., Gascon, H., Rieck, K., CERT Siemens: DREBIN: effective and explainable detection of android malware in your pocket. In: Proceedings of the Annual Symposium on Network and Distributed System Security (NDSS) (2014). https://​www.​sec.​cs.​tu-bs.​de/​~danarp/​drebin/​. Accessed 13 Apr 2018
8.
go back to reference Au, K.W.Y., Zhou, Y.F., Huang, Z., Lie, D.: PScout: analyzing the android permission specification. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 217–228. ACM (2012) Au, K.W.Y., Zhou, Y.F., Huang, Z., Lie, D.: PScout: analyzing the android permission specification. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security, pp. 217–228. ACM (2012)
9.
go back to reference Aung, Z., Zaw, W.: Permission-based android malware detection. Int. J. Sci. Technol. Res. 2(3), 228–234 (2013) Aung, Z., Zaw, W.: Permission-based android malware detection. Int. J. Sci. Technol. Res. 2(3), 228–234 (2013)
10.
go back to reference Babu Rajesh, V., Reddy, P., Himanshu, P., Patil, M.U.: Droidswan: detecting malicious android applications based on static feature analysis. Comput. Sci. Inf. Technol., 163 (2015) Babu Rajesh, V., Reddy, P., Himanshu, P., Patil, M.U.: Droidswan: detecting malicious android applications based on static feature analysis. Comput. Sci. Inf. Technol., 163 (2015)
11.
go back to reference Baskaran, B., Ralescu, A.: A study of android malware detection techniques and machine learning. University of Cincinnati (2016) Baskaran, B., Ralescu, A.: A study of android malware detection techniques and machine learning. University of Cincinnati (2016)
13.
go back to reference Desnos, A.: Androguard-reverse engineering, malware and goodware analysis of android applications. URL code. google.com/p/androguard (2013) Desnos, A.: Androguard-reverse engineering, malware and goodware analysis of android applications. URL code. google.com/p/androguard (2013)
15.
go back to reference Faruki, P., Bharmal, A., Laxmi, V., Gaur, M.S., Conti, M., Rajarajan, M.: Evaluation of android anti-malware techniques against Dalvik bytecode obfuscation. In: 2014 IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications, pp. 414–421. IEEE (2014) Faruki, P., Bharmal, A., Laxmi, V., Gaur, M.S., Conti, M., Rajarajan, M.: Evaluation of android anti-malware techniques against Dalvik bytecode obfuscation. In: 2014 IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications, pp. 414–421. IEEE (2014)
16.
go back to reference Feizollah, A., Anuar, N.B., Salleh, R., Wahab, A.W.A.: A review on feature selection in mobile malware detection. Digit. Invest. 13, 22–37 (2015)CrossRef Feizollah, A., Anuar, N.B., Salleh, R., Wahab, A.W.A.: A review on feature selection in mobile malware detection. Digit. Invest. 13, 22–37 (2015)CrossRef
17.
go back to reference Fereidooni, H., Moonsamy, V., Conti, M., Batina, L.: Efficient classification of android malware in the wild using robust static features (2016) Fereidooni, H., Moonsamy, V., Conti, M., Batina, L.: Efficient classification of android malware in the wild using robust static features (2016)
19.
go back to reference Holmes, G., Donkin, A., Witten, I.H.: WEKA: a machine learning workbench. In: Proceedings of the 1994 Second Australian and New Zealand Conference on Intelligent Information Systems, pp. 357–361. IEEE (1994) Holmes, G., Donkin, A., Witten, I.H.: WEKA: a machine learning workbench. In: Proceedings of the 1994 Second Australian and New Zealand Conference on Intelligent Information Systems, pp. 357–361. IEEE (1994)
20.
go back to reference Kaushik, P., Jain, A.: Malware detection techniques in android. Int. J. Comput. Appl. 122(17), 22–26 (2015) Kaushik, P., Jain, A.: Malware detection techniques in android. Int. J. Comput. Appl. 122(17), 22–26 (2015)
21.
go back to reference Maggi, F., Valdi, A., Zanero, S.: Andrototal: a flexible, scalable toolbox and service for testing mobile malware detectors. In: Proceedings of the Third ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 49–54. ACM (2013) Maggi, F., Valdi, A., Zanero, S.: Andrototal: a flexible, scalable toolbox and service for testing mobile malware detectors. In: Proceedings of the Third ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 49–54. ACM (2013)
22.
go back to reference Maiorca, D., Ariu, D., Corona, I., Aresu, M., Giacinto, G.: Stealth attacks: an extended insight into the obfuscation effects on android malware. Comput. Secur. 51, 16–31 (2015)CrossRef Maiorca, D., Ariu, D., Corona, I., Aresu, M., Giacinto, G.: Stealth attacks: an extended insight into the obfuscation effects on android malware. Comput. Secur. 51, 16–31 (2015)CrossRef
23.
go back to reference Malik, S., Khatter, K.: AndroData: a tool for static & dynamic feature extraction of android apps. Int. J. Appl. Eng. Res. 10(94), 98–102 (2015) Malik, S., Khatter, K.: AndroData: a tool for static & dynamic feature extraction of android apps. Int. J. Appl. Eng. Res. 10(94), 98–102 (2015)
28.
go back to reference Pehlivan, U., Baltaci, N., Acartürk, C., Baykal, N.: The analysis of feature selection methods and classification algorithms in permission based android malware detection. In: 2014 IEEE Symposium on Computational Intelligence in Cyber Security (CICS), pp. 1–8. IEEE (2014) Pehlivan, U., Baltaci, N., Acartürk, C., Baykal, N.: The analysis of feature selection methods and classification algorithms in permission based android malware detection. In: 2014 IEEE Symposium on Computational Intelligence in Cyber Security (CICS), pp. 1–8. IEEE (2014)
29.
go back to reference Rami, K., Desai, V.: Performance base static analysis of malware on android (2013) Rami, K., Desai, V.: Performance base static analysis of malware on android (2013)
30.
go back to reference Sahs, J., Khan, L.: A machine learning approach to android malware detection. In: 2012 European Intelligence and Security Informatics Conference (EISIC), pp. 141–147. IEEE (2012) Sahs, J., Khan, L.: A machine learning approach to android malware detection. In: 2012 European Intelligence and Security Informatics Conference (EISIC), pp. 141–147. IEEE (2012)
31.
go back to reference Sanz, B., Santos, I., Laorden, C., Ugarte-Pedrero, X., Bringas, P.G., Álvarez, G.: PUMA: permission usage to detect malware in android. In: Herrero, Á., et al. (eds.) International Joint Conference CISIS’12-ICEUTE’ 12-SOCO’ 12. AISC, vol. 189, pp. 289–298. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33018-6_30CrossRef Sanz, B., Santos, I., Laorden, C., Ugarte-Pedrero, X., Bringas, P.G., Álvarez, G.: PUMA: permission usage to detect malware in android. In: Herrero, Á., et al. (eds.) International Joint Conference CISIS’12-ICEUTE’ 12-SOCO’ 12. AISC, vol. 189, pp. 289–298. Springer, Heidelberg (2013). https://​doi.​org/​10.​1007/​978-3-642-33018-6_​30CrossRef
32.
go back to reference Seth, R., Kaushal, R.: Permission based malware analysis & detection in android (2014) Seth, R., Kaushal, R.: Permission based malware analysis & detection in android (2014)
33.
go back to reference Spreitzenbarth, M., Schreck, T., Echtler, F., Arp, D., Hoffmann, J.: Mobile-sandbox: combining static and dynamic analysis with machine-learning techniques. Int. J. Inf. Secur. 14(2), 141–153 (2015)CrossRef Spreitzenbarth, M., Schreck, T., Echtler, F., Arp, D., Hoffmann, J.: Mobile-sandbox: combining static and dynamic analysis with machine-learning techniques. Int. J. Inf. Secur. 14(2), 141–153 (2015)CrossRef
38.
go back to reference Wang, X., Yang, Y., Zeng, Y.: Accurate mobile malware detection and classification in the cloud. SpringerPlus 4(1), 1 (2015)CrossRef Wang, X., Yang, Y., Zeng, Y.: Accurate mobile malware detection and classification in the cloud. SpringerPlus 4(1), 1 (2015)CrossRef
40.
go back to reference Winsniewski, R.: Android–apktool: a tool for reverse engineering android APK files (2012) Winsniewski, R.: Android–apktool: a tool for reverse engineering android APK files (2012)
41.
go back to reference Yerima, S.Y., Sezer, S., Muttik, I.: Android malware detection using parallel machine learning classifiers. In: 2014 Eighth International Conference on Next Generation Mobile Apps, Services and Technologies, pp. 37–42. IEEE (2014) Yerima, S.Y., Sezer, S., Muttik, I.: Android malware detection using parallel machine learning classifiers. In: 2014 Eighth International Conference on Next Generation Mobile Apps, Services and Technologies, pp. 37–42. IEEE (2014)
42.
go back to reference Zhang, X., Breitinger, F., Baggili, I.: Rapid android parser for investigating dex files (RAPID). Digit. Invest. 17, 28–39 (2016)CrossRef Zhang, X., Breitinger, F., Baggili, I.: Rapid android parser for investigating dex files (RAPID). Digit. Invest. 17, 28–39 (2016)CrossRef
44.
go back to reference Zhou, Y., Wang, Z., Zhou, W., Jiang, X.: Hey, you, get off of my market: detecting malicious apps in official and alternative android markets. In: NDSS, vol. 25, pp. 50–52 (2012) Zhou, Y., Wang, Z., Zhou, W., Jiang, X.: Hey, you, get off of my market: detecting malicious apps in official and alternative android markets. In: NDSS, vol. 25, pp. 50–52 (2012)
Metadata
Title
AndroParse - An Android Feature Extraction Framework and Dataset
Authors
Robert Schmicker
Frank Breitinger
Ibrahim Baggili
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-05487-8_4

Premium Partner