Published in: Knowledge and Information Systems 1/2016

01-10-2016 | Regular Paper

A scalable approach for detecting plagiarized mobile applications

Authors: Ciprian Oprişa, Dragoş Gavriluţ, George Cabău

Abstract

Plagiarism cases are quite common in mobile application ecosystems like the Android market. An application can be decompiled, modified and repackaged under a different author name. The modifications can affect the user’s privacy or even contain malicious logic. If the original application is supported by advertisements, they are usually replaced so that the ad revenue goes to the repackager. Such events can damage the legitimate author both reputationally and financially, so they need to be detected. A plagiarism detection system is proposed that detects plagiarized applications based on features extracted from code. Two similarity functions are given, along with techniques for finding similar applications in a large collection. The main issue with this search is that, for a large collection, it cannot be performed sequentially by comparing a given item with every other item. The proposed solution improves the search time by comparing the searched item only with those items that have a high probability of being similar. The greatest advantage of our approach is scalability. The system’s database can be built, updated and queried in reasonable time, even when large quantities of data are involved. Our experiments were conducted on a large collection of over one million samples and identified a concerning number of plagiarism cases.
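The search strategy in the abstract (compare a query only against items likely to be similar) can be illustrated with a minimal Python sketch. The abstract does not name the two similarity functions or the indexing technique, so everything here is an assumption made for illustration: Jaccard similarity over sets of code features, MinHash signatures, and banded bucketing to select candidates. The names `jaccard`, `minhash_signature` and `CandidateIndex`, and all parameter values, are hypothetical and not taken from the paper.

```python
import hashlib
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed similarity function)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(seed, feature):
    """Deterministic 64-bit hash of a feature under a given seed."""
    data = f"{seed}:{feature}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(features, num_hashes=32):
    """One minimum hash value per seed; similar sets tend to share entries."""
    return tuple(
        min((_hash(seed, f) for f in features), default=0)
        for seed in range(num_hashes)
    )


class CandidateIndex:
    """Hypothetical index: apps sharing at least one signature band collide."""

    def __init__(self, num_hashes=32, bands=16):
        if num_hashes % bands != 0:
            raise ValueError("num_hashes must be divisible by bands")
        self.num_hashes = num_hashes
        self.bands = bands
        self.rows = num_hashes // bands
        self.buckets = defaultdict(set)   # (band index, band values) -> app ids
        self.features = {}                # app id -> frozenset of features

    def _band_keys(self, features):
        sig = minhash_signature(features, self.num_hashes)
        for b in range(self.bands):
            yield (b, sig[b * self.rows:(b + 1) * self.rows])

    def add(self, app_id, features):
        features = frozenset(features)
        self.features[app_id] = features
        for key in self._band_keys(features):
            self.buckets[key].add(app_id)

    def query(self, features, threshold=0.8):
        """Score only the colliding candidates, not the whole collection."""
        features = frozenset(features)
        candidates = set()
        for key in self._band_keys(features):
            candidates |= self.buckets.get(key, set())
        scored = ((jaccard(features, self.features[c]), c) for c in candidates)
        return sorted(((s, c) for s, c in scored if s >= threshold), reverse=True)
```

A query computes one signature and inspects a handful of buckets, so its cost depends on the number of colliding candidates rather than on the size of the collection. The trade-off is that a similar item can be missed if it shares no band with the query; the number of bands and rows per band controls how likely that is.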

Metadata
Title
A scalable approach for detecting plagiarized mobile applications
Authors
Ciprian Oprişa
Dragoş Gavriluţ
George Cabău
Publication date
01-10-2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 1/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0903-y
