Skip to main content
Top
Published in: Data Mining and Knowledge Discovery 5-6/2014

01-09-2014

Classy: fast clustering streams of call-graphs

Author: Orestis Kostakis

Published in: Data Mining and Knowledge Discovery | Issue 5-6/2014

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

An abstraction resilient to common malware obfuscation techniques is the call-graph. A call-graph is the representation of an executable file as a directed graph with labeled vertices, where the vertices correspond to functions and the edges to function calls. Unfortunately, most of the interesting graph comparison problems, including full-graph comparison and computing the largest common subgraph, belong to the \(NP\)-hard class. This makes the study and use of graphs in large scale systems difficult. Existing work has focused only on offline clustering and has not addressed the issue of clustering streams of graphs. In this paper we present Classy, a scalable distributed system that clusters streams of large call-graphs for purposes including automated malware classification and facilitating malware analysts. Since algorithms aimed at clustering sets are not suitable for clustering streams of objects, we propose the use of a clustering algorithm that relies on the notion of candidate clusters and reference samples therein. We demonstrate via thorough experimentation that this approach yields results very close to the offline optimal. Graph similarity is determined by computing a graph edit distance (GED) of pairs of graphs using an adapted version of simulated annealing. Furthermore, we present a novel lower bound for the GED. We also study the problem of approximating statistics of clusters of graphs when the distances of only a fraction of all possible pairs have been computed. Finally, we present results and statistics from a real production-side system that has clustered and contains more than 0.8 million graphs.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
go back to reference Aggarwal CC, Han J, Wang J, Yu PS (2003) A framework for clustering evolving data streams. In: Proceedings of the 29th international conference on very large data bases-volume 29, VLDB Endowment, pp 81–92 Aggarwal CC, Han J, Wang J, Yu PS (2003) A framework for clustering evolving data streams. In: Proceedings of the 29th international conference on very large data bases-volume 29, VLDB Endowment, pp 81–92
go back to reference Aggarwal CC, Han J, Wang J, Yu PS (2004) A framework for projected clustering of high dimensional data streams. In: Proceedings of the thirtieth international conference on very large data bases-volume 30, VLDB Endowment, pp 852–863 Aggarwal CC, Han J, Wang J, Yu PS (2004) A framework for projected clustering of high dimensional data streams. In: Proceedings of the thirtieth international conference on very large data bases-volume 30, VLDB Endowment, pp 852–863
go back to reference Aggarwal C, Zhao Y, Yu P (2010) On clustering graph streams. In: Proceedings of the SIAM international conference on data mining, pp 478–489 Aggarwal C, Zhao Y, Yu P (2010) On clustering graph streams. In: Proceedings of the SIAM international conference on data mining, pp 478–489
go back to reference Akutsu T (1993) A polynomial time algorithm for finding a largest common subgraph of almost trees of bounded degree. IEICE Trans Fundam Electron Commun Comput Sci 76(9):1488–1493 Akutsu T (1993) A polynomial time algorithm for finding a largest common subgraph of almost trees of bounded degree. IEICE Trans Fundam Electron Commun Comput Sci 76(9):1488–1493
go back to reference Bayer U, Comparetti PM, Hlauschek C, Kruegel C, Kirda E (2009) Scalable, behavior-based malware clustering. In: 16th Network & distributed system security conference, vol 9, pp 8–11 Bayer U, Comparetti PM, Hlauschek C, Kruegel C, Kirda E (2009) Scalable, behavior-based malware clustering. In: 16th Network & distributed system security conference, vol 9, pp 8–11
go back to reference Bourquin M, King A, Robbins E (2013) Binslayer: accurate comparison of binary executables. In: Proceedings of the 2nd ACM SIGPLAN program protection and reverse engineering workshop, ACM, p 4 Bourquin M, King A, Robbins E (2013) Binslayer: accurate comparison of binary executables. In: Proceedings of the 2nd ACM SIGPLAN program protection and reverse engineering workshop, ACM, p 4
go back to reference Briones I, Gomez A (2008) Graphs, entropy and grid computing: automatic comparison of malware. Proceedings of the virus bulletin conference, pp 1–12 Briones I, Gomez A (2008) Graphs, entropy and grid computing: automatic comparison of malware. Proceedings of the virus bulletin conference, pp 1–12
go back to reference Bunke H (1997) On a relation between graph edit distance and maximum common subgraph. Pattern Recognit Lett 18(8):689–694CrossRefMathSciNet Bunke H (1997) On a relation between graph edit distance and maximum common subgraph. Pattern Recognit Lett 18(8):689–694CrossRefMathSciNet
go back to reference Burkhard W, Keller R (1973) Some approaches to best-match file searching. Commun ACM 16(4):230–236CrossRefMATH Burkhard W, Keller R (1973) Some approaches to best-match file searching. Commun ACM 16(4):230–236CrossRefMATH
go back to reference Cao F, Ester M, Qian W, Zhou A (2006) Density-based clustering over an evolving data stream with noise. In: Proceedings of the SIAM international conference on data mining, pp 328–339 Cao F, Ester M, Qian W, Zhou A (2006) Density-based clustering over an evolving data stream with noise. In: Proceedings of the SIAM international conference on data mining, pp 328–339
go back to reference Carrera E, Erdélyi G (2004) Digital genome mapping-advanced binary malware analysis. In: Proceedings of the virus bulletin conference, pp 187–197 Carrera E, Erdélyi G (2004) Digital genome mapping-advanced binary malware analysis. In: Proceedings of the virus bulletin conference, pp 187–197
go back to reference Charikar M, O’Callaghan L, Panigrahy R (2003) Better streaming algorithms for clustering problems. In: Proceedings of the ACM symposium on theory of computing, ACM, pp 30–39 Charikar M, O’Callaghan L, Panigrahy R (2003) Better streaming algorithms for clustering problems. In: Proceedings of the ACM symposium on theory of computing, ACM, pp 30–39
go back to reference Cheng J, Ke Y, Ng W (2009) Efficient query processing on graph databases. ACM Trans Database Syst (TODS) 34(1):2CrossRef Cheng J, Ke Y, Ng W (2009) Efficient query processing on graph databases. ACM Trans Database Syst (TODS) 34(1):2CrossRef
go back to reference Christodorescu M, Jha S (2004) Testing malware detectors. ACM SIGSOFT Softw Eng Notes 29(4):34–44CrossRef Christodorescu M, Jha S (2004) Testing malware detectors. ACM SIGSOFT Softw Eng Notes 29(4):34–44CrossRef
go back to reference Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. Int J Pattern Recognit Artif Intell 18(03):265–298CrossRef Conte D, Foggia P, Sansone C, Vento M (2004) Thirty years of graph matching in pattern recognition. Int J Pattern Recognit Artif Intell 18(03):265–298CrossRef
go back to reference Datar M, Immorlica N, Indyk P, Mirrokni VS (2004) Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the twentieth annual symposium on computational geometry, ACM, pp 253–262 Datar M, Immorlica N, Indyk P, Mirrokni VS (2004) Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the twentieth annual symposium on computational geometry, ACM, pp 253–262
go back to reference Dean T, Boddy M (1988) An analysis of time-dependent planning. In: Proceedings of the 17th national conference on artificial intelligence, pp 49–54 Dean T, Boddy M (1988) An analysis of time-dependent planning. In: Proceedings of the 17th national conference on artificial intelligence, pp 49–54
go back to reference Dullien T, Rolles R (2005) Graph-based comparison of executable objects. SSTIC 5:1–3 Dullien T, Rolles R (2005) Graph-based comparison of executable objects. SSTIC 5:1–3
go back to reference Elhadi AAE, Maarof MA, Barry BI (2013) Improving the detection of malware behaviour using simplified data dependent api call graph. Int J Secur Appl 7(5):29–42 Elhadi AAE, Maarof MA, Barry BI (2013) Improving the detection of malware behaviour using simplified data dependent api call graph. Int J Secur Appl 7(5):29–42
go back to reference Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. KDD 96:226–231 Ester M, Kriegel HP, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. KDD 96:226–231
go back to reference Flake H (2004) Structural comparison of executable objects. In: Proceedings of the international GI workshop on detection of intrusions and malware & vulnerability assessment, pp 161–174 Flake H (2004) Structural comparison of executable objects. In: Proceedings of the international GI workshop on detection of intrusions and malware & vulnerability assessment, pp 161–174
go back to reference Gascon H, Yamaguchi F, Arp D, Rieck K (2013) Structural detection of android malware using embedded call graphs. In: Proceedings of the 2013 ACM workshop on artificial intelligence and security, ACM, pp 45–54 Gascon H, Yamaguchi F, Arp D, Rieck K (2013) Structural detection of android malware using embedded call graphs. In: Proceedings of the 2013 ACM workshop on artificial intelligence and security, ACM, pp 45–54
go back to reference Gionis A, Indyk P, Motwani R et al (1999) Similarity search in high dimensions via hashing. VLDB 99:518–529 Gionis A, Indyk P, Motwani R et al (1999) Similarity search in high dimensions via hashing. VLDB 99:518–529
go back to reference Gionis A, Mannila H, Tsaparas P (2005) Clustering aggregation. In: Proceedings of the 21st international conference on data engineering (ICDE), IEEE, pp 341–352 Gionis A, Mannila H, Tsaparas P (2005) Clustering aggregation. In: Proceedings of the 21st international conference on data engineering (ICDE), IEEE, pp 341–352
go back to reference Giugno R, Shasha D (2002) Graphgrep: a fast and universal method for querying graphs. In: Proceedings of the 16th international conference on pattern recognition, IEEE, vol 2, pp 112–115 Giugno R, Shasha D (2002) Graphgrep: a fast and universal method for querying graphs. In: Proceedings of the 16th international conference on pattern recognition, IEEE, vol 2, pp 112–115
go back to reference Guha S, Meyerson A, Mishra N, Motwani R, O’Callaghan L (2003) Clustering data streams: theory and practice. IEEE Trans Knowl Data Eng 15(3):515–528CrossRef Guha S, Meyerson A, Mishra N, Motwani R, O’Callaghan L (2003) Clustering data streams: theory and practice. IEEE Trans Knowl Data Eng 15(3):515–528CrossRef
go back to reference He H, Singh A (2006) Closure-tree: an index structure for graph queries. In: Proceedings of the 22nd international conference on data engineering, IEEE, pp 38–38 He H, Singh A (2006) Closure-tree: an index structure for graph queries. In: Proceedings of the 22nd international conference on data engineering, IEEE, pp 38–38
go back to reference Hegedus J, Miche Y, Ilin A, Lendasse A (2011) Methodology for behavioral-based malware analysis and detection using random projections and k-nearest neighbors classifiers. In: Seventh international conference on computational intelligence and security (CIS), IEEE, pp 1016–1023 Hegedus J, Miche Y, Ilin A, Lendasse A (2011) Methodology for behavioral-based malware analysis and detection using random projections and k-nearest neighbors classifiers. In: Seventh international conference on computational intelligence and security (CIS), IEEE, pp 1016–1023
go back to reference Hu X, Chiueh T, Shin K (2009) Large-scale malware indexing using function-call graphs. In: Proceedings of the 16th ACM conference on computer and communications security, ACM, pp 611–620 Hu X, Chiueh T, Shin K (2009) Large-scale malware indexing using function-call graphs. In: Proceedings of the 16th ACM conference on computer and communications security, ACM, pp 611–620
go back to reference Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on theory of computing, ACM, pp 604–613 Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on theory of computing, ACM, pp 604–613
go back to reference Jiang H, Wang H, Yu P, Zhou S (2007) Gstring: a novel approach for efficient search in graph databases. In: Proceedings of the IEEE 23rd international conference on data engineering, IEEE, pp 566–575 Jiang H, Wang H, Yu P, Zhou S (2007) Gstring: a novel approach for efficient search in graph databases. In: Proceedings of the IEEE 23rd international conference on data engineering, IEEE, pp 566–575
go back to reference Kang MG, Poosankam P, Yin H (2007) Renovo: a hidden code extractor for packed executables. In: Proceedings of the 2007 ACM workshop on recurring malcode, ACM, pp 46–53 Kang MG, Poosankam P, Yin H (2007) Renovo: a hidden code extractor for packed executables. In: Proceedings of the 2007 ACM workshop on recurring malcode, ACM, pp 46–53
go back to reference Kinable J, Kostakis O (2011) Malware classification based on call graph clustering. J Comput Virol 7(4):233–245CrossRef Kinable J, Kostakis O (2011) Malware classification based on call graph clustering. J Comput Virol 7(4):233–245CrossRef
go back to reference Kolbitsch C, Comparetti PM, Kruegel C, Kirda E, Zhou Xy, Wang X (2009) Effective and efficient malware detection at the end host. In: USENIX security symposium, pp 351–366 Kolbitsch C, Comparetti PM, Kruegel C, Kirda E, Zhou Xy, Wang X (2009) Effective and efficient malware detection at the end host. In: USENIX security symposium, pp 351–366
go back to reference Kollios G, Potamias M, Terzi E (2013) Clustering large probabilistic graphs. IEEE Trans Knowl Data Eng 25(2):325–336CrossRef Kollios G, Potamias M, Terzi E (2013) Clustering large probabilistic graphs. IEEE Trans Knowl Data Eng 25(2):325–336CrossRef
go back to reference Kostakis O, Kinable J, Mahmoudi H, Mustonen K (2011) Improved call graph comparison using simulated annealing. In: Proceedings of the 2011 ACM symposium on applied computing, ACM, pp 1516–1523 Kostakis O, Kinable J, Mahmoudi H, Mustonen K (2011) Improved call graph comparison using simulated annealing. In: Proceedings of the 2011 ACM symposium on applied computing, ACM, pp 1516–1523
go back to reference Kulis B, Basu S, Dhillon I, Mooney R (2009) Semi-supervised graph clustering: a kernel approach. Mach Learn 74(1):1–22CrossRef Kulis B, Basu S, Dhillon I, Mooney R (2009) Semi-supervised graph clustering: a kernel approach. Mach Learn 74(1):1–22CrossRef
go back to reference Lin IJ, Kung SY (1997) Coding and comparison of dag’s as a novel neural structure with applications to on-line handwriting recognition. IEEE Trans Signal Process 45(11):2701–2708CrossRef Lin IJ, Kung SY (1997) Coding and comparison of dag’s as a novel neural structure with applications to on-line handwriting recognition. IEEE Trans Signal Process 45(11):2701–2708CrossRef
go back to reference Martignoni L, Christodorescu M, Jha S (2007) Omniunpack: fast, generic, and safe unpacking of malware. In: Twenty-third annual computer security applications conference (ACSAC) 2007, IEEE, pp 431–441 Martignoni L, Christodorescu M, Jha S (2007) Omniunpack: fast, generic, and safe unpacking of malware. In: Twenty-third annual computer security applications conference (ACSAC) 2007, IEEE, pp 431–441
go back to reference Mishra N, Schreiber R, Stanton I, Tarjan RE (2007) Clustering social networks. In: Algorithms and models for the web-graph. Springer, Berlin, pp 56–67 Mishra N, Schreiber R, Stanton I, Tarjan RE (2007) Clustering social networks. In: Algorithms and models for the web-graph. Springer, Berlin, pp 56–67
go back to reference Moser A, Kruegel C, Kirda E (2007a) Exploring multiple execution paths for malware analysis. In: IEEE symposium on security and privacy, IEEE, pp 231–245 Moser A, Kruegel C, Kirda E (2007a) Exploring multiple execution paths for malware analysis. In: IEEE symposium on security and privacy, IEEE, pp 231–245
go back to reference Moser A, Kruegel C, Kirda E, (2007b) Limits of static analysis for malware detection. In: Computer security applications conference, 2007. ACSAC 2007. Twenty-third annual, IEEE, pp 421–430 Moser A, Kruegel C, Kirda E, (2007b) Limits of static analysis for malware detection. In: Computer security applications conference, 2007. ACSAC 2007. Twenty-third annual, IEEE, pp 421–430
go back to reference Papapetrou P, Athitsos V, Kollios G, Gunopulos D (2009) Reference-based alignment in large sequence databases. Proc VLDB Endow 2(1):205–216CrossRef Papapetrou P, Athitsos V, Kollios G, Gunopulos D (2009) Reference-based alignment in large sequence databases. Proc VLDB Endow 2(1):205–216CrossRef
go back to reference Ramon J, Gärtner T (2003) Expressivity versus efficiency of graph kernels. First international workshop on mining graphs, trees and sequences, pp 65–74 Ramon J, Gärtner T (2003) Expressivity versus efficiency of graph kernels. First international workshop on mining graphs, trees and sequences, pp 65–74
go back to reference Rieck K, Holz T, Willems C, Düssel P, Laskov P (2008) Learning and classification of malware behavior. In: Detection of intrusions and malware, and vulnerability assessment. Springer, Berlin, pp 108–125 Rieck K, Holz T, Willems C, Düssel P, Laskov P (2008) Learning and classification of malware behavior. In: Detection of intrusions and malware, and vulnerability assessment. Springer, Berlin, pp 108–125
go back to reference Riesen K, Bunke H (2009) Approximate graph edit distance computation by means of bipartite graph matching. Image Vis Comput 27(7):950–959CrossRef Riesen K, Bunke H (2009) Approximate graph edit distance computation by means of bipartite graph matching. Image Vis Comput 27(7):950–959CrossRef
go back to reference Schietgat L, Ramon J, Bruynooghe M (2013) A polynomial-time maximum common subgraph algorithm for outerplanar graphs and its application to chemoinformatics. Ann Math Artif Intell 69(4):343–376CrossRefMATHMathSciNet Schietgat L, Ramon J, Bruynooghe M (2013) A polynomial-time maximum common subgraph algorithm for outerplanar graphs and its application to chemoinformatics. Ann Math Artif Intell 69(4):343–376CrossRefMATHMathSciNet
go back to reference Seward HH (1954) Information sorting in the application of electronic digital computers to business operations. PhD thesis, Department of Electrical Engineering, Massachusetts Institute of Technology Seward HH (1954) Information sorting in the application of electronic digital computers to business operations. PhD thesis, Department of Electrical Engineering, Massachusetts Institute of Technology
go back to reference Shervashidze N, Schweitzer P, Van Leeuwen EJ, Mehlhorn K, Borgwardt KM (2011) Weisfeiler–Lehman graph kernels. J Mach Learn Res 12:2539–2561MATHMathSciNet Shervashidze N, Schweitzer P, Van Leeuwen EJ, Mehlhorn K, Borgwardt KM (2011) Weisfeiler–Lehman graph kernels. J Mach Learn Res 12:2539–2561MATHMathSciNet
go back to reference Tarjan R, Van Leeuwen J (1984) Worst-case analysis of set union algorithms. J ACM 31(2):245–281CrossRefMATH Tarjan R, Van Leeuwen J (1984) Worst-case analysis of set union algorithms. J ACM 31(2):245–281CrossRefMATH
go back to reference Tian Y, Patel J (2008) Tale: A tool for approximate large graph matching. In: Proceedings of the IEEE 24th international conference on data engineering, IEEE, pp 963–972 Tian Y, Patel J (2008) Tale: A tool for approximate large graph matching. In: Proceedings of the IEEE 24th international conference on data engineering, IEEE, pp 963–972
go back to reference Veeramani R, Rai N (2012) Windows api based malware detection and framework analysis. In: International conference on networks and cyber security, p 25 Veeramani R, Rai N (2012) Windows api based malware detection and framework analysis. In: International conference on networks and cyber security, p 25
go back to reference Venkateswaran J, Lachwani D, Kahveci T, Jermaine C (2006) Reference-based indexing of sequence databases. In: Proceedings of the 32nd international conference on very large data bases, VLDB Endowment, pp 906–917 Venkateswaran J, Lachwani D, Kahveci T, Jermaine C (2006) Reference-based indexing of sequence databases. In: Proceedings of the 32nd international conference on very large data bases, VLDB Endowment, pp 906–917
go back to reference Vishwanathan S, Schraudolph NN, Kondor R, Borgwardt KM (2010) Graph kernels. J Mach Learn Res 11:1201–1242MATHMathSciNet Vishwanathan S, Schraudolph NN, Kondor R, Borgwardt KM (2010) Graph kernels. J Mach Learn Res 11:1201–1242MATHMathSciNet
go back to reference Willems C, Holz T, Freiling F (2007) Toward automated dynamic malware analysis using cwsandbox. Proceedings of the 28th IEEE symposium on security and privacy, vol 5(2), pp 32–39 Willems C, Holz T, Freiling F (2007) Toward automated dynamic malware analysis using cwsandbox. Proceedings of the 28th IEEE symposium on security and privacy, vol 5(2), pp 32–39
go back to reference Williams D, Huan J, Wang W (2007) Graph database indexing using structured graph decomposition. In: Proceedings of the IEEE 23rd international conference on data engineering, IEEE, pp 976–985 Williams D, Huan J, Wang W (2007) Graph database indexing using structured graph decomposition. In: Proceedings of the IEEE 23rd international conference on data engineering, IEEE, pp 976–985
go back to reference Xu JY, Sung AH, Chavez P, Mukkamala S (2004) Polymorphic malicious executable scanner by api sequence analysis. In: Fourth international conference on hybrid intelligent systems, HIS’04., IEEE, pp 378–383 Xu JY, Sung AH, Chavez P, Mukkamala S (2004) Polymorphic malicious executable scanner by api sequence analysis. In: Fourth international conference on hybrid intelligent systems, HIS’04., IEEE, pp 378–383
go back to reference Xu M, Wu L, Qi S, Xu J, Zhang H, Ren Y, Zheng N (2013) A similarity metric method of obfuscated malware using function-call graph. J Comput Virol Hacking Tech 9(1):35–47CrossRef Xu M, Wu L, Qi S, Xu J, Zhang H, Ren Y, Zheng N (2013) A similarity metric method of obfuscated malware using function-call graph. J Comput Virol Hacking Tech 9(1):35–47CrossRef
go back to reference Yan X, Yu P, Han J (2005) Substructure similarity search in graph databases. In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, ACM, pp 766–777 Yan X, Yu P, Han J (2005) Substructure similarity search in graph databases. In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, ACM, pp 766–777
go back to reference Zeng Z, Tung A, Wang J, Feng J, Zhou L (2009) Comparing stars: on approximating graph edit distance. Proc VLDB Endow 2(1):25–36CrossRef Zeng Z, Tung A, Wang J, Feng J, Zhou L (2009) Comparing stars: on approximating graph edit distance. Proc VLDB Endow 2(1):25–36CrossRef
go back to reference Zhao P, Yu J, Yu P (2007) Graph indexing: tree+ delta\(\le \) graph. In: Proceedings of the 33rd international conference on very large data bases, VLDB Endowment, pp 938–949 Zhao P, Yu J, Yu P (2007) Graph indexing: tree+ delta\(\le \) graph. In: Proceedings of the 33rd international conference on very large data bases, VLDB Endowment, pp 938–949
go back to reference Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2(1):718–729CrossRef Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2(1):718–729CrossRef
Metadata
Title
Classy: fast clustering streams of call-graphs
Author
Orestis Kostakis
Publication date
01-09-2014
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 5-6/2014
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-014-0367-9

Other articles of this Issue 5-6/2014

Data Mining and Knowledge Discovery 5-6/2014 Go to the issue

Premium Partner