skip to main content
research-article

Temporal Link Prediction Using Matrix and Tensor Factorizations

Published:01 February 2011Publication History
Skip Abstract Section

Abstract

The data in many disciplines such as social networks, Web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this article, we consider the problem of temporal link prediction: Given link data for times 1 through T, can we predict the links at time T + 1? If our data has underlying periodic structure, can we predict out even further in time, i.e., links at time T + 2, T + 3, etc.? In this article, we consider bipartite graphs that evolve over time and consider matrix- and tensor-based methods for predicting future links. We present a weight-based method for collapsing multiyear data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem. Additionally, we show that tensor-based techniques are particularly effective for temporal data with varying periodic patterns.

References

  1. Acar, E. and Yener, B. 2009. Unsupervised multiway data analysis: A literature survey. IEEE Trans. Knowl. Data Engin. 21, 1, 6--20. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Acar, E., Çamtepe, S. A., and Yener, B. 2006. Collective sampling and analysis of high order tensors for chatroom communications. In Proceedings of the IEEE International Conference on Intelligence and Security Informatics (ISI’06). Lecture Notes in Computer Science, vol. 3975. Springer, 213--224. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Acar, E., Dunlavy, D. M., and Kolda, T. G. 2009. Link prediction on evolving data using matrix and tensor factorizations. In Proceedings of the Workshop on Large-Scale Data Mining: Theory and Applications (LDMTA’09). 262--269. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Acar, E., Dunlavy, D. M., and Kolda, T. G. 2010. A scalable optimization approach for fitting canonical tensor decompositions. J. Chemometrics. To appear.Google ScholarGoogle Scholar
  5. Bader, B. W. and Kolda, T. G. 2007. Efficient MATLAB computations with sparse and factored tensors. SIAM J. Scient. Comput. 30, 1, 205--231. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Bader, B. W., Berry, M. W., and Browne, M. 2007. Discussion tracking in Enron email using PARAFAC. In Survey of Text Mining: Clustering, Classification, and Retrieval 2nd Ed., M. W. Berry and M. Castellanos Eds. Springer, 147--162.Google ScholarGoogle Scholar
  7. Bast, H. and Majumdar, D. 2005. Why spectral retrieval works. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’05). 11--18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Bell, R. M. and Koren, Y. 2007. Lessons from the Netflix Prize Challenge. ACM SIGKDD Explor. Newslett. 9, 75--79. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Carroll, J. D. and Chang, J. J. 1970. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 35, 283--319.Google ScholarGoogle ScholarCross RefCross Ref
  10. Chatfield, C. and Yar, M. 1988. Holt-winters forecasting: Some practical issues. J. Royal Statist. Soc. Series D (The Statistician) 37, 2, 129--140.Google ScholarGoogle Scholar
  11. Clauset, A., Moore, C., and Newman, M. 2008. Hierarchical structure and the prediction of missing links in networks. Nature 453.Google ScholarGoogle Scholar
  12. Dumais, S. T., Furnas, G. W., Landauer, T. K., Deerwester, S., and Harshman, R. 1988. Using latent semantic analysis to improve access to textual information. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’88). ACM Press, 281--285. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Gardner, E. S. 2006. Exponential smoothing: The state of the art - part ii. Int. J. Forecast. 22, 637--666.Google ScholarGoogle ScholarCross RefCross Ref
  14. Getoor, L. and Diehl, C. P. 2005. Link mining: A survey. ACM SIGKDD Explor. Newslett. 7, 2, 3--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Harshman, R. A. 1970. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16, 1--84. http://www.psychology.uwo.ca/faculty/harshman/wpppfac0.pdf.Google ScholarGoogle Scholar
  16. Hasan, M. A., Chaoji, V., Salem, S., and Zaki, M. 2006. Link prediction using supervised learning. In Proceedings of the SIAM of the Data Mining Workshop on Link Analysis, Counterterrorism, and Security.Google ScholarGoogle Scholar
  17. Huang, Z. and Lin, D. K. J. 2009. The time-series link prediction problem with applications in communication surveillance. INFORMS J. Comput. 21, 286--303. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Huang, Z., Li, X., and Chen, H. 2005. Link prediction approach to collaborative filtering. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL’05). 141--142. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Katz, L. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1, 39--43.Google ScholarGoogle ScholarCross RefCross Ref
  20. Kolda, T. G. and Bader, B. W. 2009. Tensor decompositions and applications. SIAM Rev. 51, 3, 455--500. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Kolda, T. G., Bader, B. W., and Kenny, J. P. 2005. Higher-order web link analysis using multilinear algebra. In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM’05). IEEE Computer Society, 242--249. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Koren, Y. 2009. Collaborative filtering with temporal dynamics. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’09). ACM, New York, NY, 447--456. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Koren, Y., Bell, R., and Volinsky, C. 2009. Matrix factorization techniques for recommender systems. IEEE Comput. 42, 30--37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Kruskal, J. B. 1989. Rank, decomposition, and uniqueness for 3-way and N-way arrays. In Multiway Data Analysis. R. Coppi and S. Bolasco Eds. North-Holland, Amsterdam, 7--18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Liben-Nowell, D. and Kleinberg, J. 2007. The link-prediction problem for social networks. J. Amer. Soc. Inform. Sci. Techn. 58, 7, 1019--1031. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Liu, Y. and Kou, Z. 2007. Predicting who rated what in large-scale datasets. ACM SIGKDD Explor. Newslett. 9, 62--65. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Makridakis, S. and Hibon, M. 2000. The m3 competition: Results, conclusions and implications. Int. J. Forecast. 16, 451--476.Google ScholarGoogle ScholarCross RefCross Ref
  28. Rattigan, M. J. and Jensen, D. 2005. The case for anomalous link discovery. ACM SIGKDD Explor. Newslett. 7, 2, 41--47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Saad, Y. 1992. Numerical Methods for Large Eigenvalue Problems. Manchester University Press.Google ScholarGoogle Scholar
  30. Sarkar, P., Siddiqi, S. M., and Gordon, G. J. 2007. A latent space approach to dynamic embedding of co-occurrence data. In Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (AI-STATS’07).Google ScholarGoogle Scholar
  31. Sharan, U. and Neville, J. 2008. Temporal-relational classifiers for prediction in evolving domains. In Proceedings of the 8th IEEE International Conference on Data Mining (ICDM’08). IEEE Computer Society, 540--549. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Smola, A. and Kondor, R. 2003. Kernels and regularization on graphs. In Proceedings of the Annual Conference on Computational Learning Theory and Kernel Workshop. B. Schölkopf and M. Warmuth Eds., Lecture Notes in Computer Science. Springer.Google ScholarGoogle Scholar
  33. Stager, M., Lukowicz, P., and Troster, G. 2006. Dealing with class skew in context recognition. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW’06). 58. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Sun, J., Papadimitriou, S., Lin, C.-Y., Cao, N., Liu, S., and Qian, W. 2009. Multivis: Content-based social network exploration through multi-way visual analysis. In Proceedings of the 9th SIAM International Conference on Data Mining (SDM’09). 1064--1075.Google ScholarGoogle Scholar
  35. Tong, H., Papadimitriou, S., Yu, P. S., and Faloutsos, C. 2008. Proximity tracking on time-evolving bipartite graphs. In Proceedings of the 8th SIAM International Conference on Data Mining (SDM’08). 704--715.Google ScholarGoogle Scholar
  36. Tucker, L. R. 1963. Implications of factor analysis of three-way matrices for measurement of change. In Problems in Measuring Change. C. W. Harris Ed., University of Wisconsin Press, 122--137.Google ScholarGoogle Scholar
  37. Tucker, L. R. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31, 279--311.Google ScholarGoogle ScholarCross RefCross Ref
  38. Wang, C., Satuluri, V., and Parthasarathy, S. 2007. Local probabilistic models for link prediction. In Proceedings of the 7th IEEE Conference on Data Mining (ICDM’07). 322--331. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Xiong, L., Chen, X., Huang, T.-K., Schneider, J., and Carbonell, J. G. 2010. Temporal collaborative filtering with Bayesian probabilistic tensor factorization. In Proceedings of the SIAM International Conference on Data Mining.Google ScholarGoogle Scholar
  40. Yan, H., Grosky, W. I., and Fotouhi, F. 2008. Augmenting the power of LSI in text retrieval: Singular value rescaling. Data Knowl. Engin. 65, 108--125. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Temporal Link Prediction Using Matrix and Tensor Factorizations

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image ACM Transactions on Knowledge Discovery from Data
            ACM Transactions on Knowledge Discovery from Data  Volume 5, Issue 2
            February 2011
            192 pages
            ISSN:1556-4681
            EISSN:1556-472X
            DOI:10.1145/1921632
            Issue’s Table of Contents

            Copyright © 2011 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 1 February 2011
            • Accepted: 1 July 2010
            • Received: 1 June 2010
            Published in tkdd Volume 5, Issue 2

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article
            • Research
            • Refereed

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader