article

Graph mining: Laws, generators, and algorithms

Authors:
Deepayan Chakrabarti

Yahoo! Research and Carnegie Mellon University, Sunnyvale, CA

Yahoo! Research and Carnegie Mellon University, Sunnyvale, CA
View Profile

,
Christos Faloutsos

Yahoo! Research and Carnegie Mellon University, Sunnyvale, CA

Yahoo! Research and Carnegie Mellon University, Sunnyvale, CA
View Profile

Authors Info & Claims

ACM Computing Surveys Volume 38 Issue 1pp 2–eshttps://doi.org/10.1145/1132952.1132954

Published:29 June 2006Publication History

ACM Computing Surveys

Abstract

How does the Web look? How could we tell an abnormal social network from a normal one? These and similar questions are important in many fields where the data can intuitively be cast as a graph; examples range from computer networks to sociology to biology and many more. Indeed, any M : N relation in database terminology can be represented as a graph. A lot of these questions boil down to the following: “How can we generate synthetic but realistic graphs?” To answer this, we must first understand what patterns are common in real-world graphs and can thus be considered a mark of normality/realism. This survey give an overview of the incredible variety of work that has been done on these problems. One of our main contributions is the integration of points of view from physics, mathematics, sociology, and computer science. Further, we briefly describe recent advances on some related and interesting graph problems.

References

Adamic, L. A. and Huberman, B. A. 2000. Power-law distribution of the World Wide Web. Science 287, 2115.]]Google ScholarCross Ref
Adamic, L. A. and Huberman, B. A. 2001. The Web's hidden order. Comm. ACM 44, 9, 55--60.]] Google ScholarDigital Library
Adamic, L. A., Lukose, R. M., Puniyani, A. R., and Huberman, B. A. 2001. Search in power-law networks. Physical Rev. E 64, 4, 046135 1--8.]]Google ScholarCross Ref
Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules. In International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA.]] Google ScholarDigital Library
Aiello, W., Chung, F., and Lu, L. 2000. A random graph model for massive graphs. In ACM Symposium on Theory of Computing. ACM Press, New York, NY, 171--180.]] Google ScholarDigital Library
Aiello, W., Chung, F., and Lu, L. 2001. Random evolution in massive graphs. In IEEE Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Albert, R. and Barabási, A.-L. 2000. Topology of evolving networks: local events and universality. Physical Rev. Lett. 85, 24, 5234--5237.]]Google ScholarCross Ref
Albert, R. and Barabási, A.-L. 2002. Statistical mechanics of complex networks. Rev. Modern Physics 74, 1, 47--97.]]Google ScholarCross Ref
Albert, R., Jeong, H., and Barabási, A.-L. 1999. Diameter of the World-Wide Web. Nature 401, 130--131.]]Google ScholarCross Ref
Albert, R., Jeong, H., and Barabási, A.-L. 2000. Error and attack tolerance of complex networks. Nature 406, 378--381.]]Google ScholarCross Ref
Alon, N. 1998. Spectral techniques in graph algorithms. In Lecture Notes in Computer Science 1380, C. L. Lucchesi and A. V. Moura, Eds. Springer-Verlag, Berlin, 206--215.]] Google ScholarDigital Library
Alon, N., Yuster, R., and Zwick, U. 1997. Finding and counting given length cycles. Algorithmica 17, 3, 209--223.]]Google ScholarCross Ref
Amaral, L. A. N., Scala, A., Barthélémy, M., and Stanley, H. E. 2000. Classes of small-world networks. Proceedings of the National Academy of Sciences 97, 21, 11149--11152.]]Google ScholarCross Ref
Baeza-Yates, R. and Poblete, B. 2003. Evolution of the Chilean Web structure composition. In Latin American Web Congress. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Bailey, N. T. J. 1974. The Mathematical Theory of Infectious Diseases and its Applications 2nd Ed. Charles Griffin, London, UK.]]Google Scholar
Baker, W. E. and Faulkner, R. R. 1993. The social organization of conspiracy: Illegal networks in the Heavy Electrical Equipment industry. Amer. Sociolog. Rev. 58, 6, 837--860.]]Google ScholarCross Ref
Bar-Yossef, Z., Kumar, R., and Sivakumar, D. 2002. Reductions in streaming algorithms, with an application to counting triangles in graphs. In ACM-SIAM Symposium on Discrete Algorithms. SIAM, Philadelphia, PA.]] Google ScholarDigital Library
Barabási, A.-L. 2002. Linked: The New Science of Networks 1st Ed. Perseus Books Group, New York, NY.]]Google Scholar
Barabási, A.-L. and Albert, R. 1999. Emergence of scaling in random networks. Science 286, 509--512.]]Google ScholarCross Ref
Barabási, A.-L., Jeong, H., Néda, Z., Ravasz, E., Schubert, A., and Vicsek, T. 2002. Evolution of the social network of scientific collaborations. Physica A 311, 590--614.]]Google ScholarCross Ref
Beirlant, J., de Wet, T., and Goegebeur, Y. 2005. A goodness-of-fit statistic for Pareto-type behaviour. J. Computat. Appl. Math. 186, 1, 99--116.]]Google ScholarCross Ref
Ben-Hur, A. and Guyon, I. 2003. Detecting stable clusters using principal component analysis. In Methods in Molecular Biology. M. J. Brownstein and A. Khudorsky, Eds. Humana Press, Totowa, NJ, 159--182.]]Google Scholar
Berger, N., Borgs, C., Chayes, J. T., D'Souza, R. M., and Kleinberg, B. D. 2005. Competition-induced preferential attachment. Combinat. Probabil. Comput. 14, 697--721.]] Google ScholarDigital Library
Berry, M. W. 1992. Large scale singular value computations. Inte. J. Supercomput. Applic. 6, 1, 13--49.]]Google ScholarDigital Library
Bi, Z., Faloutsos, C., and Korn, F. 2001. The DGX distribution for mining massive, skewed data. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY, 17--26.]] Google ScholarDigital Library
Bianconi, G. and Barabási, A.-L. 2001. Competition and multiscaling in evolving networks. Europhysics Letters 54, 4, 436--442.]]Google ScholarCross Ref
Boguñá, M. and Pastor-Satorras, R. 2002. Epidemic spreading in correlated complex networks. Physical Rev. E 66, 4, 047104.]]Google ScholarCross Ref
Boldi, P., Codenotti, B., Santini, M., and Vigna, S. 2002. Structural properties of the African Web. In International World Wide Web Conference. ACM Press, New York, NY.]]Google Scholar
Bollobás, B. 1985. Random Graphs. Academic Press, London, UK.]]Google Scholar
Bollobás, B., Borgs, C., Chayes, J. T., and Riordan, O. 2003. Directed scale-free graphs. In ACM-SIAM Symposium on Discrete Algorithms. SIAM, Philadelphia, PA.]] Google ScholarDigital Library
Bollobás, B. and Riordan, O. 2002. The diameter of a scale-free random graph. Combinatorica.]]Google Scholar
Bonacich, P. 1987. Power and centrality: A family of measures. Amer. J. Sociol. 92, 5 (Mar.), 1170--1182.]]Google ScholarCross Ref
Borgatti, S. 2002. The key player problem. In Proceedings of the National Academy of Sciences Workshop on Terrorism. National Academy of Sciences, Washington DC.]]Google Scholar
Borgatti, S. and Everett, M. G. 1989. The class of all regular equivalences: Algebraic structure and computation. Social Networks 11, 65--88.]]Google ScholarCross Ref
Borgatti, S. and Everett, M. G. 1999. Models of core/periphery structures. Social Networks 21, 275--295.]]Google Scholar
Borgatti, S., Everett, M. G., and Freeman, L. C. 1999. UCINET V User's Guide. Analytic Technologies.]]Google Scholar
Borgs, C., Chayes, J. T., Mahdian, M., and Saberi, A. 2004. Exploring the community structure of newsgroups (Extended Abstract). In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Brandes, U., Gaertler, M., and Wagner, D. 2003. Experiments on graph clustering algorithms. In European Symposium on Algorithms. Springer Verlag, Berlin, Germany.]]Google Scholar
Brin, S. and Page, L. 1998. The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. ISDN Syst. 30, 1--7, 107--117.]] Google ScholarDigital Library
Broder, A. Z., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., and Wiener, J. 2000. Graph structure in the web: experiments and models. In International World Wide Web Conference. ACM Press, New York, NY.]] Google ScholarDigital Library
Bu, T. and Towsley, D. 2002. On distinguishing between Internet power law topology generators. In IEEE INFOCOM. IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
Burt, R. S. 1992. Structural Holes. Harvard University Press, Cambridge, MA.]]Google Scholar
Burt, R. S. 2001. Structural holes versus network closure as social capital. In Social Capital: Theory and Research. N. Lin, K. S. Cook, and R. S. Burt, Eds. Aldine de Gruyter, Hawthorne, NY.]]Google Scholar
Callaway, D. S., Newman, M. E. J., Strogatz, S. H., and Watts, D. J. 2000. Network robustness and fragility: Percolation on random graphs. Physical Rev. Lett. 85, 25, 5468--5471.]]Google ScholarCross Ref
Calvert, K. L., Doar, M. B., and Zegura, E. W. 1997. Modeling Internet topology. IEEE Comm. 35, 6, 160--163.]]Google ScholarDigital Library
Carlson, J. M. and Doyle, J. 1999. Highly optimized tolerance: A mechanism for power laws in designed systems. Physical Rev. E 60, 2, 1412--1427.]]Google ScholarCross Ref
Chakrabarti, D. 2004. AutoPart: Parameter-free graph partitioning and outlier detection. In Conference on Principles and Practice of Knowledge Discovery in Databases. Springer, Berlin, Germany.]] Google ScholarDigital Library
Chakrabarti, D., Papadimitriou, S., Modha, D. S., and Faloutsos, C. 2004. Fully automatic Cross-associations. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Chakrabarti, D., Zhan, Y., and Faloutsos, C. 2004. R-MAT: A recursive model for graph mining. In SIAM Data Mining Conference. SIAM, Philadelphia, PA.]]Google Scholar
Chakrabarti, S. 1999. Recent results in automatic Web resource discovery. ACM Comput. Surv. 31, 4.]] Google ScholarDigital Library
Chakrabarti, S., Punera, K., and Subramanyam, M. 2002. Accelerated focused crawling through online relevance feedback. In International World Wide Web Conference. ACM Press, New York, NY, 148--159.]] Google ScholarDigital Library
Chakrabarti, S., van den Berg, M., and Dom, B. E. 1999. Focused crawling: A new approach to topic-specific Web resource discovery. Comput. Netw. 31, 11--16, 1623--1640.]] Google ScholarDigital Library
Chen, Q., Chang, H., Govindan, R., Jamin, S., Shenker, S., and Willinger, W. 2001. The origin of power laws in Internet topologies revisited. In IEEE INFOCOM. IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
Clauset, A., Newman, M. E. J., and Moore, C. 2004. Finding community structure of very large networks. Physical Rev. E 70, 066111.]]Google ScholarCross Ref
Coleman, J. S. 1988. Social capital in the creation of human capital. Amer. J. Sociol. 94, S95--S121.]]Google ScholarCross Ref
Cooper, C. and Frieze, A. 2003. A general model of web graphs. Rand. Struct. Algor. 22, 3, 311--335.]] Google ScholarDigital Library
Cooper, C. and Frieze, A. 2004. The size of the largest strongly connected component of a random digraph with a given degree sequence. Combinat. Probabil. Comput. 13, 3, 319--337.]] Google ScholarDigital Library
Cormen, T. H., Leiserson, C. E., and Rivest, R. L. 1992. Introduction to Algorithms 6th Ed. MIT Press and McGraw-Hill Book Company, Cambridge, MA.]] Google ScholarDigital Library
Cross, R., Borgatti, S., and Parker, A. 2002. Making invisible work visible: Using social network analysis to support strategic collaboration. CA. Manag. Rev. 44, 2, 25--46.]]Google ScholarCross Ref
Crovella, M. and Taqqu, M. S. 1999. Estimating the heavy tail index from scaling properties. Metho. Comput. Appl. Probabil. 1, 1, 55--79.]] Google ScholarDigital Library
de Solla Price, D. J. 1976. A general theory of bibliometric and other cumulative advantage processes. J. Amer. Soci. Inform. Science 27, 292--306.]]Google ScholarCross Ref
Dehaspe, L. and Toivonen, H. 1999. Discovery of frequent datalog patterns. Data Mining Knowl. Discov. 3, 1, 7--36.]] Google ScholarDigital Library
Dehaspe, L., Toivonen, H., and King, R. D. 1998. Finding frequent substructures in chemical compounds. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]]Google Scholar
Dill, S., Kumar, R., McCurley, K. S., Rajagopalan, S., Sivakumar, D., and Tomkins, A. 2001. Self-similarity in the Web. In International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA.]] Google ScholarDigital Library
Dombroski, M., Fischbeck, P., and Carley, K. M. 2003. Estimating the shape of covert networks. In Proceedings of the 8th International Command and Control Research and Technology Symposium.]]Google Scholar
Domingos, P. and Richardson, M. 2001. Mining the network value of customers. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Dorogovtsev, S. N., Goltsev, A. V., and Mendes, J. F. 2002. Pseudofractal scale-free web. Physical Rev. E 65, 6, 066122.]]Google ScholarCross Ref
Dorogovtsev, S. N. and Mendes, J. F. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford University Press, Oxford, UK.]] Google ScholarDigital Library
Dorogovtsev, S. N., Mendes, J. F., and Samukhin, A. N. 2000. Structure of growing networks with preferential linking. Physical Rev. Lett. 85, 21, 4633--4636.]]Google ScholarCross Ref
Dorogovtsev, S. N., Mendes, J. F., and Samukhin, A. N. 2001. Giant strongly connected component of directed networks. Physical Rev. E 64, 025101 1--4.]]Google ScholarCross Ref
Doyle, J. and Carlson, J. M. 2000. Power laws, highly optimized tolerance, and generalized source coding. Physical Rev. Lett. 84, 24 (June), 5656--5659.]]Google ScholarCross Ref
Drineas, P., Frieze, A., Kannan, R., Vempala, S., and Vinay, V. 1999. Clustering in large graphs and matrices. In ACM-SIAM Symposium on Discrete Algorithms. SIAM, Philadelphia, PA.]] Google ScholarDigital Library
Džeroski, S. and Lavrač, N. 2001. Relational Data Mining. Springer Verlag, Berlin, Germany.]] Google ScholarDigital Library
Erdős, P. and Rényi, A. 1960. On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Acadamy of Science 5, 17--61.]]Google Scholar
Erdős, P. and Rényi, A. 1961. On the strength of connectedness of random graphs. Acta Mathematica Scientia Hungary 12, 261--267.]]Google ScholarCross Ref
Everitt, B. S. 1974. Cluster Analysis. John Wiley, New York, NY.]]Google Scholar
Fabrikant, A., Koutsoupias, E., and Papadimitriou, C. H. 2002. Heuristically optimized trade-offs: A new paradigm for power laws in the Internet. In Proceedings of the International Colloquium on Automata, Languages and Programming. Springer Verlag, Berlin, Germany, 110--122.]] Google ScholarDigital Library
Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the Internet topology. In Conference of the ACM Special Interest Group on Data Communications (SIGCOMM). ACM Press, New York, NY, 251--262.]] Google ScholarDigital Library
Feuerverger, A. and Hall, P. 1999. Estimating a tail exponent by modelling departure from a Pareto distribution. Annals Statist. 27, 2, 760--781.]]Google ScholarCross Ref
Flake, G. W., Lawrence, S., and Giles, C. L. 2000. Efficient identification of Web communities. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Freeman, L. C. 1977. A set of measures of centrality based on betweenness. Sociometry 40, 1, 35--41.]]Google ScholarCross Ref
Gibson, D., Kleinberg, J., and Raghavan, P. 1998. Inferring web communities from link topology. In ACM Conference on Hypertext and Hypermedia. ACM Press, New York, NY, 225--234.]] Google ScholarDigital Library
Giles, C. L., Bollacker, K., and Lawrence, S. 1998. Citeseer: An automatic citation indexing system. In ACM Conference on Digital Libraries. ACM Press, New York, NY.]] Google ScholarDigital Library
Girvan, M. and Newman, M. E. J. 2002. Community structure in social and biological networks. In Proceedings of the National Academy of Sciences. Vol. 99. National Academy of Sciences, Washington DC.]]Google Scholar
Glover, F. 1989. Tabu search---part 1. ORSA J. Comput. 1, 3, 190--206.]]Google ScholarCross Ref
Goh, K.-I., Oh, E., Jeong, H., Kahng, B., and Kim, D. 2002. Classificaton of scale-free networks. In Proceedings of the National Academy of Sciences. Vol. 99. National Academy of Sciences, Washington DC, 12583--12588.]]Google Scholar
Goldstein, M. L., Morris, S. A., and Yen, G. G. 2004. Problems with fitting to the power-law distribution. European Physics J. B 41, 255--258.]]Google ScholarCross Ref
Govindan, R. and Tangmunarunkit, H. 2000. Heuristics for Internet map discovery. In IEEE INFOCOM. IEEE Computer Society Press, Los Alamitos, CA, 1371--1380.]]Google Scholar
Granovetter, M. S. 1973. The strength of weak ties. Amer. J. Sociol. 78, 6 (May), 1360--1380.]]Google ScholarCross Ref
Hanneman, R. A. and Riddle, M. 2005. Introduction to social network methods. http://faculty.ucr.edu/hanneman/nettext/.]]Google Scholar
Hill, B. M. 1975. A simple approach to inference about the tail of a distribution. Annals Statist. 3, 5, 1163--1174.]]Google ScholarCross Ref
Holder, L., Cook, D., and Djoko, S. 1994. Substructure discovery in the SUBDUE system. In National Conference on Artificial Intelligence Workshop on Knowledge Discovery in Databases. AAAI Press, Menlo Park, CA, 169--180.]]Google Scholar
Inokuchi, A., Washio, T., and Motoda, H. 2000. An apriori-based algorithm for mining frequent substructures from graph data. In Conference on Principles and Practice of Knowledge Discovery in Databases. Springer, Berlin, Germany.]] Google ScholarDigital Library
Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., and Barabási, A.-L. 2000. The large-scale organization of metabolic networks. Nature 407, 6804, 651--654.]]Google Scholar
Kannan, R., Vempala, S., and Vetta, A. 2000. On clusterings -- good, bad and spectral. In IEEE Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Karypis, G. and Kumar, V. 1998. Multilevel algorithms for multi-constraint graph partitioning. Tech. Rep. 98-019, University of Minnesota.]]Google Scholar
Kautz, H., Selman, B., and Shah, M. 1997. Referralweb: Combining social networks and collaborative filtering. Comm. ACM 40, 3, 63--65.]] Google ScholarDigital Library
Kempe, D., Kleinberg, J., and Demers, A. J. 2001. Spatial gossip and resource location protocols. In ACM Symposium on Theory of Computing. ACM Press, New York, NY.]] Google ScholarDigital Library
Kephart, J. O. and White, S. R. 1991. Directed-graph epidemiological models of computer viruses. In IEEE Symposium on Research in Security and Privacy. IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
Kephart, J. O. and White, S. R. 1993. Measuring and modeling computer virus prevalence. In IEEE Symposium on Research in Security and Privacy. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Killworth, P. D. and Bernard, H. R. 1978. Reverse small world experiment. Social Networks 1, 2, 103--210.]]Google ScholarCross Ref
Kleinberg, J. 1999a. Authoritative sources in a hyperlinked environment. J. ACM 46, 5, 604--632.]] Google ScholarDigital Library
Kleinberg, J. 1999b. The small-world phenomenon: An algorithmic perspective. Tech. Rep. 99-1776, Computer Science Department, Cornell.]] Google ScholarDigital Library
Kleinberg, J. 2001. Small world phenomena and the dynamics of information. In Neural Information Processing Systems Conference. MIT Press, Cambridge, MA.]]Google Scholar
Kleinberg, J., Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. 1999. The web as a graph: Measurements, models and methods. In International Computing and Combinatorics Conference. Springer, Berlin, Germany.]]Google Scholar
Krapivsky, P. L. and Redner, S. 2001. Organization of growing random networks. Physical Rev. E 63, 6, 066123 1--14.]]Google ScholarCross Ref
Krebs, V. E. 2001. Mapping networks of terrorist cells. Connections 24, 3, 43--52.]]Google Scholar
Kumar, R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tomkins, A., and Upfal, E. 2000. Stochastic models for the Web graph. In IEEE Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. 1999. Extracting large-scale knowledge bases from the web. In International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA.]] Google ScholarDigital Library
Kuramochi, M. and Karypis, G. 2001. Frequent subgraph discovery. In IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA, 313--320.]] Google ScholarDigital Library
Kuramochi, M. and Karypis, G. 2002. Discovering frequent geometric subgraphs. In IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Leskovec, J., Chakrabarti, D., Kleinberg, J., and Faloutsos, C. 2005. Realistic, mathematically tractable graph generation and evolution, using Kronecker Multiplication. In Conference on Principles and Practice of Knowledge Discovery in Databases. Springer, Berlin, Germany.]]Google Scholar
Leskovec, J., Kleinberg, J., and Faloutsos, C. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Madeira, S. C. and Oliveira, A. L. 2004. Biclustering algorithms for biological data analysis: A survey. IEEE Trans. Computat. Biology Bioinform. 1, 1, 24--45.]] Google ScholarDigital Library
McGovern, A. and Jensen, D. 2003. Identifying predictive structures in relational data using multiple instance learning. In International Conference on Machine Learning. AAAI Press, Menlo Park, CA.]]Google Scholar
Medina, A., Matta, I., and Byers, J. 2000. On the origin of power laws in Internet topologies. In Conference of the ACM Special Interest Group on Data Communications (SIGCOMM). ACM Press, New York, NY, 18--34.]]Google Scholar
Mihail, M. and Papadimitriou, C. H. 2002. On the eigenvalue power law. In International Workshop on Randomization and Approximation Techniques in Computer Science. Springer Verlag, Berlin, Germany.]] Google ScholarDigital Library
Milgram, S. 1967. The small-world problem. Psychology Today 2, 60--67.]]Google Scholar
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovshii, D., and Alon, U. 2002. Network motifs: Simple building blocks of complex networks. Science 298, 824--827.]]Google ScholarCross Ref
Mitzenmacher, M. 2001. A brief history of generative models for power law and lognormal distributions. In Proceedings of the 39th Annual Allerton Conference on Communication, Control, and Computing. UIUC Press, Urbana-Champaign, IL.]]Google Scholar
Montgomery, A. L. and Faloutsos, C. 2001. Identifying Web browsing trends and patterns. IEEE Comput. 34, 7, 94--95.]] Google ScholarDigital Library
Moody, J. 2001. Race, school integration, and friendship segregation in America. Amer. J. Sociol. 107, 3, 679--716.]]Google ScholarCross Ref
Newman, M. E. J. 2003. The structure and function of complex networks. SIAM Rev. 45, 167--256.]]Google ScholarDigital Library
Newman, M. E. J. 2005. Power laws, pareto distributions and Zipf's law. Contemp. Physics 46, 323--351.]]Google ScholarCross Ref
Newman, M. E. J., Forrest, S., and Balthrop, J. 2002. Email networks and the spread of computer viruses. Physical Rev. E 66, 3, 035101 1--4.]]Google ScholarCross Ref
Newman, M. E. J., Girvan, M., and Farmer, J. D. 2002. Optimal design, robustness and risk aversion. Physical Rev. Lett. 89, 2, 028301 1--4.]]Google ScholarCross Ref
Newman, M. E. J., Strogatz, S. H., and Watts, D. J. 2001. Random graphs with arbitrary degree distributions and their applications. Physical Rev. E 64, 2, 026118 1--17.]]Google ScholarCross Ref
Nijssen, S. and Kok, J. 2001. Faster association rules for multiple relations. In International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Francisco, CA.]]Google Scholar
Palmer, C., Gibbons, P. B., and Faloutsos, C. 2002. ANF: A fast and scalable tool for data mining in massive graphs. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY.]] Google ScholarDigital Library
Palmer, C. and Steffan, J. G. 2000. Generating network topologies that obey power laws. In IEEE Global Telecommunications Conference. IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
Pandurangan, G., Raghavan, P., and Upfal, E. 2002. Using PageRank to characterize Web structure. In International Computing and Combinatorics Conference. Springer, Berlin, Germany.]] Google ScholarDigital Library
Pastor-Satorras, R., Vásquez, A., and Vespignani, A. 2001. Dynamical and correlation properties of the Internet. Physical Rev. Lett. 87, 25, 258701 1--4.]]Google ScholarCross Ref
Pastor-Satorras, R. and Vespignani, A. 2001b. Epidemic dynamics and endemic states in complex networks. Physical Rev. E 63, 6, 066117 1--8.]]Google ScholarCross Ref
Pastor-Satorras, R. and Vespignani, A. 2001a. Epidemic spreading in scale-free networks. Physical Rev. Lett. 86, 14, 3200--3203.]]Google ScholarCross Ref
Pastor-Satorras, R. and Vespignani, A. 2002a. Epidemic dynamics in finite size scale-free networks. Physical Rev. E 65, 3, 035108 1--4.]]Google ScholarCross Ref
Pastor-Satorras, R. and Vespignani, A. 2002b. Immunization of complex networks. Physical Rev. E 65, 3, 036104 1--8.]]Google ScholarCross Ref
Pennock, D. M., Flake, G. W., Lawrence, S., Glover, E. J., and Giles, C. L. 2002. Winners don't take all: Characterizing the competition for links on the Web. Proceedings of the National Academy of Sciences 99, 8, 5207--5211.]]Google ScholarCross Ref
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. 1992. Numerical Recipes in C 2nd Ed. Cambridge University Press, Cambridge, UK.]] Google ScholarDigital Library
Ravasz, E. and Barabási, A.-L. 2002. Hierarchical organization in complex networks. Physical Rev. E 65, 026112 1--7.]]Google Scholar
Redner, S. 1998. How popular is your paper? an empirical study of the citation distribution. European Physics J. B 4, 131--134.]]Google ScholarCross Ref
Richardson, M. and Domingos, P. 2002. Mining knowledge-sharing sites for viral marketing. In Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM Press, New York, NY, 61--70.]] Google ScholarDigital Library
Schwartz, M. F. and Wood, D. C. M. 1993. Discovering shared interests using graph analysis. Comm. ACM 36, 8, 78--89.]] Google ScholarDigital Library
Simon, H. 1955. On a class of skew distribution functions. Biometrika 42, 3/4, 425--440.]]Google Scholar
Solé, R. V. and Montoya, J. M. 2001. Complexity and fragility in ecological networks. In Proceedings of the Royal Society of London B. Vol. 268. The Royal Society, London, UK, 2039--2045.]]Google Scholar
Sparrow, M. K. 1991. The application of network analysis to criminal intelligence: An assessment of the prospects. Social Networks 13, 3, 251--274.]]Google ScholarCross Ref
Spielman, D. A. and Teng, S.-H. 1996. Spectral partitioning works: Planar graphs and finite element meshes. In IEEE Symposium on Foundations of Computer Science. IEEE Computer Society Press, Los Alamitos, CA, 96--105.]] Google ScholarDigital Library
Tangmunarunkit, H., Govindan, R., Jamin, S., Shenker, S., and Willinger, W. 2001. Network topologies, power laws, and hierarchy. Tech. Rep. 01-746, University of Southern California.]]Google Scholar
Tangmunarunkit, H., Govindan, R., Jamin, S., Shenker, S., and Willinger, W. 2002. Network topology generators: Degree-based vs. structural. In Conference of the ACM Special Interest Group on Data Communications (SIGCOMM). ACM Press, New York, NY.]] Google ScholarDigital Library
Tauro, S. L., Palmer, C., Siganos, G., and Faloutsos, M. 2001. A simple conceptual model for the Internet topology. In Global Internet. IEEE Computer Society Press, Los Alamitos, CA.]]Google Scholar
Tibshirani, R., Walther, G., and Hastie, T. 2001. Estimating the number of clusters in a dataset via the Gap statistic. J. Royal Statist. Soc., B 63, 411--423.]]Google ScholarCross Ref
Travers, J. and Milgram, S. 1969. An experimental study of the Small World problem. Sociometry 32, 4, 425--443.]]Google ScholarCross Ref
Tyler, J. R., Wilkinson, D. M., and Huberman, B. A. 2003. Email as Spectroscopy: Automated Discovery of Community Structure Within Organizations. Kluwer, B. V., The Netherlands.]]Google Scholar
van Dongen, S. M. 2000. Graph clustering by flow simulation. Ph.D. thesis, Univesity of Utrecht.]]Google Scholar
Virtanen, S. 2003. Clustering the Chilean Web. In Latin American Web Congress. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Wang, Y., Chakrabarti, D., Wang, C., and Faloutsos, C. 2003. Epidemic spreading in real networks: An eigenvalue viewpoint. In Symposium on Reliable Distributed Systems. IEEE Computer Society Press, Los Alamitos, CA, 25--34.]]Google Scholar
Wasserman, S. and Faust, K. 1994. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, UK.]]Google Scholar
Watts, D. J. 2003. Six Degrees: The Science of a Connected Age 1st Ed. W. W. Norton and Company, New York, NY.]]Google Scholar
Watts, D. J., Dodds, P. S., and Newman, M. E. J. 2002. Identity and search in social networks. Science 296, 1302--1305.]]Google ScholarCross Ref
Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of ‘small-world’ networks. Nature 393, 440--442.]]Google ScholarCross Ref
Waxman, B. M. 1988. Routing of multipoint connections. IEEE J. Select. Areas Comm. 6, 9 (Dec.), 1617--1622.]]Google ScholarDigital Library
Weeks, M. R., Clair, S., Borgatti, S., Radda, K., and Schensul, J. J. 2002. Social networks of drug users in high-risk sites: Finding the connections. AIDS Behav. 6, 2, 193--206.]]Google ScholarCross Ref
Winick, J. and Jamin, S. 2002. Inet-3.0: Internet Topology Generator. Tech. Rep. CSE-TR-456-02, University of Michigan, Ann Arbor, MS.]]Google Scholar
Wu, F. and Huberman, B. A. 2004. Finding communities in linear time: A physics approach. European Physics J. B 38, 2, 331--338.]]Google ScholarCross Ref
Yan, X. and Han, J. 2002. gSpan: Graph-based substructure pattern mining. In IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA.]] Google ScholarDigital Library
Yook, S.-H., Jeong, H., and Barabási, A.-L. 2002. Modeling the Internet's large-scale topology. Proceedings of the National Academy of Sciences 99, 21, 13382--13386.]]Google ScholarCross Ref

Index Terms

Graph mining: Laws, generators, and algorithms

Recommendations

On mining cross-graph quasi-cliques
KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining

Joint mining of multiple data sets can often discover interesting, novel, and reliable patterns which cannot be obtained solely from any single source. For example, in cross-market customer segmentation, a group of customers who behave similarly in ...
Read More
A Characterization of $K_{2,4}$-Minor-Free Graphs

We provide a complete structural characterization of $K_{2,4}$-minor-free graphs. The 3-connected $K_{2,4}$-minor-free graphs consist of nine small graphs on at most eight vertices, together with a family of planar graphs that contains $2n-8$ ...
Read More
An efficient method for mining frequent itemsets with double constraints

Constraint-based frequent itemset mining is necessary when the needs and interests of users are the top priority. In this task, two opposite types of constraint are studied, namely anti-monotone and monotone constraints. Previous approaches have mainly ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Computing Surveys Volume 38, Issue 1
2006
173 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/1132952
Issue’s Table of Contents

Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 June 2006
Published in csur Volume 38, Issue 1

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Generators
graphs
patterns
social networks
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 427
  Total Citations
  View Citations
- 9,503
  Total Downloads
- Downloads (Last 12 months)147
- Downloads (Last 6 weeks)9
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Graph mining: Laws, generators, and algorithms

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

On mining cross-graph quasi-cliques

A Characterization of $K_{2,4}$-Minor-Free Graphs

An efficient method for mining frequent itemsets with double constraints

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Graph mining: Laws, generators, and algorithms

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

On mining cross-graph quasi-cliques

A Characterization of $K_{2,4}$-Minor-Free Graphs

An efficient method for mining frequent itemsets with double constraints

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media