skip to main content
research-article

ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank

Authors Info & Claims
Published:12 October 2015Publication History
Skip Abstract Section

Abstract

In this article, we explore the relationships among digital objects in terms of their similarity based on vertex similarity measures. We argue that SimRank—a famous similarity measure—and its families, such as P-Rank and SimRank++, fail to capture similar node pairs in certain conditions, especially when two nodes can only reach each other through paths of odd lengths. We present new similarity measures ASCOS and ASCOS++ to address the problem. ASCOS outputs a more complete similarity score than SimRank and SimRank’s families. ASCOS++ enriches ASCOS to include edge weight into the measure, giving all edges and network weights an opportunity to make their contribution. We show that both ASCOS++ and ASCOS can be reformulated and applied on a distributed environment for parallel contribution. Experimental results show that ASCOS++ reports a better score than SimRank and several famous similarity measures. Finally, we re-examine previous use cases of SimRank, and explain appropriate and inappropriate use cases. We suggest future SimRank users following the rules proposed here before naïvely applying it. We also discuss the relationship between ASCOS++ and PageRank.

References

  1. Evrim Acar, Daniel M. Dunlavy, and Tamara G. Kolda. 2009. Link prediction on evolving data using matrix and tensor factorizations. In IEEE International Conference on Data Mining Workshops. IEEE, Washington DC, USA, 262--269. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Lada A. Adamic and Eytan Adar. 2003. Friends and neighbors on the web. Social Networks 25, 3, 211--230.Google ScholarGoogle ScholarCross RefCross Ref
  3. Alekh Agarwal and Soumen Chakrabarti. 2007. Learning random walks to rank nodes in graphs. In Proceedings of the 24th International Conference on Machine learning. ACM, New York, NY, USA, 9--16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Charu Aggarwal, Yan Xie, and Philip S. Yu. 2012. On dynamic link inference in heterogeneous networks. In Proceedings of the 12th SIAM International Conference on Data Mining. SIAM, Anaheim, CA, USA, 415--426.Google ScholarGoogle Scholar
  5. Mohammad Al Hasan and Mohammed J. Zaki. 2011. A survey of link prediction in social networks. Social Network Data Analytics, 243--275.Google ScholarGoogle Scholar
  6. Réka Albert and Albert-László Barabási. 2002. Statistical mechanics of complex networks. Reviews of Modern Physics 74, 1, 47--97.Google ScholarGoogle ScholarCross RefCross Ref
  7. Ioannis Antonellis, Hector Garcia Molina, and Chi Chao Chang. 2008. Simrank++: Query rewriting through link analysis of the click graph. Proceedings of the VLDB Endowment 1, 1, 408--421. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Lars Backstrom and Jure Leskovec. 2011. Supervised random walks: Predicting and recommending links in social networks. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining. ACM, Hong Kong, China, 635--644. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Shenghua Bao, Guirong Xue, Xiaoyuan Wu, Yong Yu, Ben Fei, and Zhong Su. 2007. Optimizing web search using social annotations. In Proceedings of the 16th International Conference on World Wide Web. ACM, Banff, Alberta, Canada, 501--510. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439, 509--512.Google ScholarGoogle Scholar
  11. Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30, 1, 107--117. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Rajmonda Sulo Caceres, Tanya Berger-Wolf, and Robert Grossman. 2011. Temporal scale of processes in dynamic networks. In IEEE International Conference on Data Mining Workshops. IEEE, Vancouver, Canada, 925--932. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Yuanzhe Cai, Miao Zhang, Chris Ding, and Sharma Chakravarthy. 2010. Closed form solution of similarity algorithms. In Proceedings of the 33rd International SIGIR Conference on Research and Development in Information Retrieval. ACM, Geneva, Switzerland, 709--710. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Hung-Hsuan Chen, Yan-Bin Ciou, and Shou-De Lin. 2012. Information propagation game: A tool to acquire humanplaying data for multiplayer influence maximization on social networks. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Beijing, China, 1524--1527. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Hung-Hsuan Chen and C. Lee Giles. 2013. ASCOS: An asymmetric network structure context similarity measure. In Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on. IEEE, Niagara Falls, Canada, 442--449. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Hung-Hsuan Chen, Liang Gou, Xiaolong Zhang, and C. Lee Giles. 2011. CollabSeer: A search engine for collaboration discovery. In Proceeding of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries. ACM, Ottawa, Canada, 231--240. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Hung-Hsuan Chen, Liang Gou, Xiaolong Zhang, and C. Lee Giles. 2012a. Discovering missing links in networks using vertex similarity measures. In The 27th ACM Symposium on Applied Computing. ACM, Riva del Garda (Trento), Italy, 138--143. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Hung-Hsuan Chen, Liang Gou, Xioalong Zhang, and C. Lee Giles. 2012b. Predicting recent links in FOAF networks. In Social Computing, Behavioral-Cultural Modeling and Prediction. Springer, College Park, MD, USA, 156--163. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Hung-Hsuan Chen, Liang Gou, Xiaolong Luke Zhang, and C. Lee Giles. 2013a. Towards the discovery of diseases related by genes using vertex similarity measures. In Healthcare Informatics (ICHI), 2013 IEEE International Conference on. IEEE, Philadelphia, PA, USA, 505--510. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Hung-Hsuan Chen, David J. Miller, and C. Lee Giles. 2013b. The predictive value of young and old links in a social network. In Proceedings of the ACM SIGMOD Workshop on Databases and Social Networks. ACM, New York, NY, USA, 43--48. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Fan Chung. 2007. The heat kernel as the pagerank of a graph. Proc. Natl. Acad. Sci. USA 104, 50, 19735--19740.Google ScholarGoogle ScholarCross RefCross Ref
  22. Fan Chung. 2009. A local graph partitioning algorithm using heat kernel pagerank. Internet Mathematics 6, 3, 315--330.Google ScholarGoogle ScholarCross RefCross Ref
  23. Adele Cutler and Leo Breiman. 1994. Archetypal analysis. Technometrics 36, 4, 338--347.Google ScholarGoogle ScholarCross RefCross Ref
  24. Yuxiao Dong, Qing Ke, Bai Wang, and Bin Wu. 2011. Link prediction based on local information. In Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on. IEEE, Kaohsiung, Taiwan, 382--386. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Leo A. Goodman. 1961. Snowball sampling. The Annals of Mathematical Statistics 32, 1, 148--170.Google ScholarGoogle ScholarCross RefCross Ref
  26. Guoming He, Haijun Feng, Cuiping Li, and Hong Chen. 2010. Parallel simrank computation on large graphs with iterative aggregation. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Washington, DC, USA, 543--552. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. Rolx: Structural role extraction & mining in large graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Beijing, China, 1231--1239. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Glen Jeh and Jennifer Widom. 2002. SimRank: A measure of structural-context similarity. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Edmonton, Alberta, Canada, 538--543. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1, 39--43.Google ScholarGoogle ScholarCross RefCross Ref
  30. David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Washington, DC, USA, 137--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Kyle Kloster and David F. Gleich. 2014. Heat kernel based community detection. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, New York, NY, USA, 1386--1395. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Danai Koutra, Joshua T. Vogelstein, and Christos Faloutsos. 2013. DeltaCon: A principled massive-graph similarity function. In Proceedings of the SIAM International Conference on Data Mining. SIAM, Austin, Texas, 162--170.Google ScholarGoogle ScholarCross RefCross Ref
  33. Elizabeth A. Leicht, Petter Holme, and Mark E. J. Newman. 2006. Vertex similarity in networks. Physical Review E 73, 2, 26120.Google ScholarGoogle ScholarCross RefCross Ref
  34. Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the 8th ACM International Conference on Knowledge Discovery and Data Mining. ACM, Chicago, IL, USA, 177--187. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. 2007. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, San Jose, CA, USA, 420--429. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Cuiping Li, Jiawei Han, Guoming He, Xin Jin, Yizhou Sun, Yintao Yu, and Tianyi Wu. 2010a. Fast computation of simrank for static and dynamic information networks. In Proceedings of the 13th International Conference on Extending Database Technology. ACM, Lausanne, Switzerland, 465--476. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Pei Li, Hongyan Liu, Jeffrey Xu Yu, Jun He, and Xiaoyong Du. 2010b. Fast single-pair simrank computation. In Proceedings of the SIAM International Conference on Data Mining. SIAM, Columbus, Ohio, USA, 571--582.Google ScholarGoogle ScholarCross RefCross Ref
  38. David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology 58, 7, 1019--1031. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Ryan Lichtnwalter and Nitesh V. Chawla. 2012. Link prediction: Fair and effective evaluation. In Advances in Social Networks Analysis and Mining (ASONAM), 2012 IEEE/ACM International Conference on. IEEE, Istanbul, Turkey, 376--383. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6, 1150--1170.Google ScholarGoogle Scholar
  41. Rashid Mehmood and Jon Crowcroft. 2005. Parallel iterative solution method for large sparse linear equation systems. Computer Laboratory: University of Cambridge.Google ScholarGoogle Scholar
  42. Douglas L. Nelson, Cathy L. McEvoy, and Thomas A. Schreiber. 2004. The university of south florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers 36, 3, 402--407.Google ScholarGoogle ScholarCross RefCross Ref
  43. Mark E. J. Newman. 2003. The structure and function of complex networks. SIAM REVIEW 45, 2, 167--256.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. 2006. Introduction to Data Mining. Pearson Addison Wesley.Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Erzsébet Ravasz, Anna Lisa Somera, Dale A. Mongru, Zoltán N. Oltvai, and A.-L. Barabási. 2002. Hierarchical organization of modularity in metabolic networks. Science 297, 5586, 1551--1555.Google ScholarGoogle Scholar
  46. Yousef Saad. 2003. Iterative Methods for Sparse Linear Systems. SIAM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Gerard Salton. 1989. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Charalampos E. Tsourakakis. 2014. Towards quantifying vertex similarity in networks. Internet Mathematics 10, 3--4, 263--286.Google ScholarGoogle ScholarCross RefCross Ref
  49. Amos Tversky. 1977. Features of similarity. Psychological Review 84, 4, 327--352.Google ScholarGoogle ScholarCross RefCross Ref
  50. Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, Zheng Chen, Jun Yan, and Dong Zhuang. 2005. SimFusion: Measuring similarity using unified relationship matrix. In Proceedings of the 28th Annual International SIGIR Conference on Research and Development in Information Retrieval. ACM, Salvador, Brazil, 130--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Weiren Yu, Xuemin Lin, Wenjie Zhang, Ying Zhang, and Jiajin Le. 2012. SimFusion+: Extending simfusion towards efficient estimation on large and dynamic networks. In Proceedings of the 35th International SIGIR Conference on Research and Development in Information Retrieval. ACM, Portland, OR, USA, 365--374. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. 2014. Social Media Mining: An Introduction. Cambridge University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Jun Zhang, Chaokun Wang, Philip S. Yu, and Jianmin Wang. 2013. Learning latent friendship propagation networks with interest awareness for link prediction. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’13). ACM, Dublin, Ireland, 63--72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Peixiang Zhao, Jiawei Han, and Yizhou Sun. 2009. P-Rank: A comprehensive structural similarity measure over information networks. In Proceeding of the 18th ACM Conference on Information and Knowledge Management. ACM, Hong Kong, China, 553--562. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Tao Zhou, Linyuan Lü, and Yi-Cheng Zhang. 2009. Predicting missing links via local information. The European Physical Journal B-Condensed Matter and Complex Systems 71, 4, 623--630.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Knowledge Discovery from Data
          ACM Transactions on Knowledge Discovery from Data  Volume 10, Issue 2
          October 2015
          291 pages
          ISSN:1556-4681
          EISSN:1556-472X
          DOI:10.1145/2835206
          Issue’s Table of Contents

          Copyright © 2015 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 12 October 2015
          • Accepted: 1 May 2015
          • Revised: 1 January 2015
          • Received: 1 August 2014
          Published in tkdd Volume 10, Issue 2

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader