Abstract
This paper introduces a new definition of dense subgraph pattern, the DN -graph. DN -graph considers both the size of the substructure and the minimum level of interactions between any pair of the vertices.
The mining of DN -graphs inherits the difficulty of finding clique, the fully-connected subgraphs. We thus opt for approximately locating the DN -graphs using the state-of-the-art graph triangulation methods. Our solution consists of a family of algorithms, each of which targets a different problem setting. These algorithms are iterative, and utilize repeated scans through the triangles in the graph to approximately locate the DN -graphs. Each scan on the graph triangles improves the results. Since the triangles are not physically materialized, the algorithms have small memory footprint.
With our solution, the users can adopt a "pay as you go" approach. They have the flexibility to terminate the mining process once they are satisfied with the quality of the results. As a result, our algorithms can cope with semi-streaming environment where the graph edges cannot fit into main memory. Results of extensive performance study confirmed our claims.
- Netflix Prize Data Set, 2009 (accessed July 23, 2009).Google Scholar
- J. Abello, M. Resende, and R. Sudarsky. Massive quasi-clique detection. In Proc. 5th Latin American Symposium on Theoretical Informatics, pages 598--612, 2002. Google Scholar
- I. Akihiro, W. Takashi, and M. Hiroshi. Complete mining of frequent patterns from graphs: Mining graph data. Machine Learning, 50(3):321--354, 2003. Google Scholar
- P. Aloy, B. BäPttcher, H. Ceulemans, C. Leutwein, C. Mellwig, S. Fischer, A. C. Gavin, P. Bork, S. G. Furga, L. Serrano, and R. D. Russell. Structure-based assembly of protein complexes in yeast. Science, 303(5666):2026--2029, 2004.Google Scholar
- L. Becchetti, P. Boldi, C. Castillo, and A. Gionis. Efficient semi-streaming algorithms for local triangle counting in massive graphs. In ACM KDD '08, pages 16--24, New York, NY, USA, 2008. Google Scholar
- V. Boginski, S. Butenko, and P. Pardalos. Mining market data: a network approach. Computer Operational Research, 33(11):3171--3184, 2006. Google Scholar
- M. Brockington and J. Culberson. Camouflaging independent sets in quasi-random graphs. DIMACS Series, 26:75--88, 1994.Google Scholar
- D. Gibson, R. Kumar, and A. Tomkins. Discovering large dense subgraphs in massive graphs. In VLDB'05, pages 721--732, 2005. Google Scholar
- J. Han, N. Stefanovic, and K. Koperski. Selective materialization: An efficient method for spatial data cube construction. In PAKDD'98 {Lecture Notes in Artificial Intelligence, 1394, Springer Verlag, 1998}, pages 144--158, 1998. Google Scholar
- H. Hu, X. Yan, Y. Huang, J. Han, and X. Zhou. Mining coherent dense subgraphs across massive biological networks for functional discovery. Bioinformatics, 21:213--221, 2005. Google Scholar
- C. F. J. Rivas. Proteincprotein interactions essentials: Key concepts to building and analyzing interactome networks. PLoS Computational Biology, 6:6, 2010.Google Scholar
- R. Karp. Reducibility among combinatorial problems. The Journal of Symbolic Logic, 40:618--619, 1975.Google Scholar
- M. Latapy. Practical algorithms for triangle computations in very large (sparse (power-law)) graphs. Journal of Theoretical Computer Science, 407:458--473, 2008. Google Scholar
- T. Schank and D. Wagner. Finding, counting and listing all triangles in large graphs, an experimental study. In WEA, pages 606--609, 2005. Google Scholar
- N. Wang, S. Parthasarathy, K. L. Tan, and A. Tung. Csv: visualizing and mining cohesive subgraphs. In SIGMOD'08, pages 445--458, 2008. Google Scholar
- I. Xenarios and L. S. etc. DIP, the database of interacting proteins: A research tool for studying cellular networks of protein interactions. Nucleic Acids Research, 30(1):303--305, 2002.Google Scholar
Index Terms
- On triangulation-based dense neighborhood graph discovery
Recommendations
Lossless graph summarization using dense subgraphs discovery
IMCOM '15: Proceedings of the 9th International Conference on Ubiquitous Information Management and CommunicationDense subgraph discovery, in a large graph, is useful to solve the community search problem. Motivated from this, we propose a graph summarization method where we search and aggregate dense subgraphs into super nodes. Since the dense subgraphs have high ...
A new closure concept preserving graph Hamiltonicity and based on neighborhood equivalence
A graph is Hamiltonian if it contains a cycle which goes through all vertices exactly once. Determining if a graph is Hamiltonian is known as an NP-complete problem, and no satisfactory characterization for these graphs has been found. In 1976, Bondy ...
Dense graphs are antimagic
An antimagic labeling of graph a with m edges and n vertices is a bijection from the set of edges to the integers 1,…,m such that all n vertex sums are pairwise distinct, where a vertex sum is the sum of labels of all edges incident with the same ...
Comments