ABSTRACT
Discovering similar objects in a social network has many interesting issues. Here, we present ASCOS, an Asymmetric Structure COntext Similarity measure that captures the similarity scores among any pairs of nodes in a network. The definition of ASCOS is similar to that of the well-known SimRank since both define score values recursively. However, we show that ASCOS outputs a more complete similarity score than SimRank because SimRank (and several of its variations, such as P-Rank and SimFusion) on average ignores half paths between nodes during calculation. To make ASCOS tractable in both computation time and memory usage, we propose two variations of ASCOS: a low rank approximation based approach and an iterative solver Gauss-Seidel for linear equations. When the target network is sparse, the run time and the required computing space of these variations are smaller than computing SimRank and ASCOS directly. In addition, the iterative solver divides the original network into several independent sub-systems so that a multi-core server or a distributed computing environment, such as MapReduce, can efficiently solve the problem. We compare the performance of ASCOS with other global structure based similarity measures, including SimRank, Katz, and LHN. The experimental results based on user evaluation suggest that ASCOS gives better results than other measures. In addition, the asymmetric property has the potential to identify the hierarchical structure of a network. Finally, variations of ASCOS (including one distributed variation) can also reduce computation both in space and time.
- E. Acar, D. Dunlavy, and T. Kolda. Link prediction on evolving data using matrix and tensor factorizations. In IEEE International Conference on Data Mining Workshops, pages 262--269. IEEE, 2009. Google ScholarDigital Library
- L. Adamic and E. Adar. Friends and neighbors on the web. Social Networks, 25(3): 211--230, 2003.Google ScholarCross Ref
- M. Adams. A distributed memory unstructured Gauss-Seidel algorithm for multigrid smoothers. In Supercomputing, ACM/IEEE 2001 Conference. IEEE, 2001. Google ScholarDigital Library
- M. Al, V. Chaoji, S. Salem, and M. Zaki. Link prediction using supervised learning. In SDM Workshop on Link Analysis, Counter-terrorism and Security, 2006.Google Scholar
- A. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439): 509--512, 1999.Google ScholarCross Ref
- Y. Cai, M. Zhang, C. Ding, and S. Chakravarthy. Closed form solution of similarity algorithms. In Proceedings of the 33rd International SIGIR Conference on Research and Development in Information Retrieval. ACM, 2010. Google ScholarDigital Library
- H.-H. Chen, L. Gou, X. Zhang, and C. Giles. Collabseer: a search engine for collaboration discovery. In Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries. ACM, 2011. Google ScholarDigital Library
- H.-H. Chen, L. Gou, X. Zhang, and C. Giles. Discovering missing links in networks using vertex similarity measures. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, pages 138--143. ACM, 2012. Google ScholarDigital Library
- H.-H. Chen, L. Gou, X. Zhang, and C. L. Giles. Capturing missing edges in social networks using vertex similarity. In Proceedings of the sixth international conference on Knowledge capture. ACM, 2011. Google ScholarDigital Library
- J. Cullum and R. Willoughby. Lanczos algorithms for large symmetric eigenvalue computations, volume 1. Society for Industrial Mathematics, 2002. Google ScholarDigital Library
- Y. Dong, Q. Ke, B. Wang, and B. Wu. Link prediction based on local information. In Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on, pages 382--386. IEEE, 2011. Google ScholarDigital Library
- Y. Fujiwara, M. Nakatsuji, T. Yamamuro, H. Shiokawa, and M. Onizuka. Efficient personalized pagerank with accuracy assurance. In Proceedings of the 18th ACM International Conference on Knowledge Discovery and Data Mining. ACM, 2012. Google ScholarDigital Library
- G. Golub and C. Van Loan. Matrix computations. Johns Hopkins Univ Pr, 1996.Google Scholar
- G. He, H. Feng, C. Li, and H. Chen. Parallel simrank computation on large graphs with iterative aggregation. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2010. Google ScholarDigital Library
- G. Jeh and J. Widom. Simrank: a measure of structural-context similarity. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 538--543. ACM, 2002. Google ScholarDigital Library
- L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1): 39--43, 1953.Google ScholarCross Ref
- D. Knuth. The Stanford GraphBase: a platform for combinatorial computing. ACM Press, 1993. Google Scholar
- E. Leicht, P. Holme, and M. Newman. Vertex similarity in networks. Physical Review E, 73(2): 026120, 2006.Google ScholarCross Ref
- C. Li, J. Han, G. He, X. Jin, Y. Sun, Y. Yu, and T. Wu. Fast computation of simrank for static and dynamic information networks. In Proceedings of the 13th International Conference on Extending Database Technology. ACM, 2010. Google ScholarDigital Library
- D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7): 1019--1031, 2007. Google ScholarDigital Library
- J.-S. Liu and K.-C. Ning. Applying link prediction to ranking candidates for high-level government post. In Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on, pages 145--152. IEEE, 2011. Google ScholarDigital Library
- L. Lu and T. Zhou. Link prediction in complex networks: a survey. Physica A: Statistical Mechanics and Its Applications, 390(6): 1150--1170, 2011.Google ScholarCross Ref
- A. Maguitman, F. Menczer, F. Erdinc, H. Roinestad, and A. Vespignani. Algorithmic computation and approximation of semantic similarity. World Wide Web, 9(4): 431--456, 2006. Google ScholarDigital Library
- E. Minkov, W. Cohen, and A. Ng. Contextual search and name disambiguation in email using graphs. In Proceedings of the 29th International SIGIR Conference on Research and Development in Information Retrieval, pages 27--34. ACM, 2006. Google ScholarDigital Library
- D. L. Nelson, C. L. McEvoy, and T. A. Schreiber. The university of south florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers, 36(3): 402--407, 2004.Google Scholar
- M. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3): 036104, 2006.Google ScholarCross Ref
- R. Panigrahy, M. Najork, and Y. Xie. How user behavior is related to social affinity. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM). ACM, 2012. Google ScholarDigital Library
- E. Ravasz, A. Somera, D. Mongru, Z. Oltvai, and A. Barabási. Hierarchical organization of modularity in metabolic networks. Science, 297(5586), 2002.Google Scholar
- G. Salton. Automatic text processing: the transformation, analysis, and retrieval of information by computer. Addison-Wesley, 1989. Google ScholarDigital Library
- P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to data mining. Addison-Wesley, 2006. Google ScholarDigital Library
- A. Tversky. Features of similarity. Psychological Review, 84(4): 327, 1977.Google ScholarCross Ref
- W. Xi, E. Fox, W. Fan, B. Zhang, Z. Chen, J. Yan, and D. Zhuang. Simfusion: measuring similarity using unified relationship matrix. In Proceedings of the 28th Annual International SIGIR Conference on Research and Development in Information Retrieval. ACM, 2005. Google ScholarDigital Library
- W. Yu, X. Lin, W. Zhang, Y. Zhang, and J. Le. Simfusion+: extending simfusion towards efficient estimation on large and dynamic networks. In Proceedings of the 35th International SIGIR Conference on Research and Development in Information Retrieval. ACM, 2012. Google ScholarDigital Library
- P. Zhao, J. Han, and Y. Sun. P-rank: a comprehensive structural similarity measure over information networks. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 553--562. ACM, 2009. Google ScholarDigital Library
- T. Zhou, L. Lu, and Y. Zhang. Predicting missing links via local information. The European Physical Journal B-Condensed Matter and Complex Systems, 2009.Google ScholarCross Ref
Index Terms
- ASCOS: an asymmetric network structure COntext similarity measure
Recommendations
ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank
In this article, we explore the relationships among digital objects in terms of their similarity based on vertex similarity measures. We argue that SimRank—a famous similarity measure—and its families, such as P-Rank and SimRank++, fail to capture ...
A novel similarity/dissimilarity measure for intuitionistic fuzzy sets and its application in pattern recognition
Among the most interesting measures in intuitionistic fuzzy sets (IFSs) theory, the similarity measure is an essential tool to compare and determine degree of similarity between IFSs. Although there exist many similarity measures for IFSs, most of them ...
Single-valued neutrosophic similarity measures based on cotangent function and their application in the fault diagnosis of steam turbine
Similarity measure is an important tool in pattern recognition and fault diagnosis. This paper proposes two cotangent similarity measures for single-valued neutrosophic sets (SVNSs) based on cotangent function. Then, the weighted cotangent similarity ...
Comments