Skip to main content
Top
Published in: Knowledge and Information Systems 3/2019

24-05-2018 | Regular Paper

TensorCast: forecasting and mining with coupled tensors

Authors: Miguel Araujo, Pedro Ribeiro, Hyun Ah Song, Christos Faloutsos

Published in: Knowledge and Information Systems | Issue 3/2019

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Given an heterogeneous social network, can we forecast its future? Can we predict who will start using a given hashtag on twitter? Can we leverage side information, such as who retweets or follows whom, to improve our membership forecasts? We present TensorCast, a novel method that forecasts time-evolving networks more accurately than current state-of-the-art methods by incorporating multiple data sources in coupled tensors. TensorCast is (a) scalable, being linearithmic on the number of connections; (b) effective, achieving over 20% improved precision on top-1000 forecasts of community members; (c) general, being applicable to data sources with different structure. We run our method on multiple real-world networks, including DBLP, epidemiology data, power grid data, and a Twitter temporal network with over 310 million nonzeros, where we predict the evolution of the activity of the use of political hashtags.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
One of the experiments in Sect. 5 deals with this scenario.
 
2
For instance, we know that the second biggest element is one of \(\varvec{a}_{2}\varvec{b}_1\varvec{s}_1\), \(\varvec{a}_{1}\varvec{b}_{2}\varvec{s}_1\) or \(\varvec{a}_{1}\varvec{b}_1\varvec{s}_{2}\).
 
3
Remember that \(\varvec{A}\) and \(\varvec{B}\) are nonnegative matrices. In the worst-case, the score of the Kth biggest element is taken from a single power-law and the contribution of the rest of the factors is 0, hence \(K^{-\alpha _m}\) is a lower-bound for the Kth biggest value.
 
4
In the worst-case scenario, this element is at position x in every of the factors.
 
5
We consider the sum of the nonzeros of both tensors.
 
6
Note that the quality of absolute precision numbers is affected by (1) how imbalanced the two classes are and (2) the cost of false positives. An improvement from 2 to 5% precision might imply that 1 out of 20 phone-calls we make target a potential customer versus every 1 in 50.
 
Literature
1.
go back to reference Acar E, Aykut-Bingol C, Bingol H, Bro R, Yener B (2007) Multiway analysis of epilepsy tensors. Bioinformatics 23(13):i10–i18CrossRef Acar E, Aykut-Bingol C, Bingol H, Bro R, Yener B (2007) Multiway analysis of epilepsy tensors. Bioinformatics 23(13):i10–i18CrossRef
2.
go back to reference Araujo M, Günnemann S, Mateos G, Faloutsos C (2014) Beyond blocks: hyperbolic community detection. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 50–65 Araujo M, Günnemann S, Mateos G, Faloutsos C (2014) Beyond blocks: hyperbolic community detection. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 50–65
4.
go back to reference Baiocchi G, Distaso W (2003) GRETL: econometric software for the gnu generation. J Appl Econom 18(1):105–110CrossRef Baiocchi G, Distaso W (2003) GRETL: econometric software for the gnu generation. J Appl Econom 18(1):105–110CrossRef
5.
go back to reference Beutel A, Talukdar PP, Kumar A, Faloutsos C, Papalexakis EE, Xing EP (2014) Flexifact: scalable flexible factorization of coupled tensors on hadoop. IN: SDM, SIAM, pp 109–117 Beutel A, Talukdar PP, Kumar A, Faloutsos C, Papalexakis EE, Xing EP (2014) Flexifact: scalable flexible factorization of coupled tensors on hadoop. IN: SDM, SIAM, pp 109–117
6.
go back to reference Box GE, Pierce DA (1970) Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. J Am Stat Assoc 65(332):1509–1526MathSciNetCrossRefMATH Box GE, Pierce DA (1970) Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. J Am Stat Assoc 65(332):1509–1526MathSciNetCrossRefMATH
7.
go back to reference Breunig MM, Kriegel H-P, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. In: ACM sigmod record, vol 29, ACM, pp 93–104 Breunig MM, Kriegel H-P, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. In: ACM sigmod record, vol 29, ACM, pp 93–104
8.
go back to reference Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):15CrossRef Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv (CSUR) 41(3):15CrossRef
9.
go back to reference Cheng J, Adamic L, Dow PA, Kleinberg JM, Leskovec J (2014) Can cascades be predicted? In: Proceedings of the 23rd international conference on world wide web, ACM, pp 925–936 Cheng J, Adamic L, Dow PA, Kleinberg JM, Leskovec J (2014) Can cascades be predicted? In: Proceedings of the 23rd international conference on world wide web, ACM, pp 925–936
10.
go back to reference Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Trans Knowl Discov Data (TKDD) 5(2):10 Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Trans Knowl Discov Data (TKDD) 5(2):10
11.
go back to reference Ermiş B, Cemgil AT, Acar E (2013) Generalized coupled symmetric tensor factorization for link prediction. In: 2013 21st signal processing and communications applications conference (SIU), IEEE, pp 1–4 Ermiş B, Cemgil AT, Acar E (2013) Generalized coupled symmetric tensor factorization for link prediction. In: 2013 21st signal processing and communications applications conference (SIU), IEEE, pp 1–4
12.
go back to reference Gao S, Denoyer L, Gallinari P (2011) Temporal link prediction by integrating content and structure information. In: Proceedings of the 20th ACM international conference on information and knowledge management, ACM, pp 1169–1174 Gao S, Denoyer L, Gallinari P (2011) Temporal link prediction by integrating content and structure information. In: Proceedings of the 20th ACM international conference on information and knowledge management, ACM, pp 1169–1174
13.
go back to reference Grimmett G, Stirzaker D (2001) Probability and random processes. Oxford University Press, OxfordMATH Grimmett G, Stirzaker D (2001) Probability and random processes. Oxford University Press, OxfordMATH
14.
go back to reference Guha S, Mishra N, Roy G, Schrijvers O (2016) Robust random cut forest based anomaly detection on streams. In: International conference on machine learning, pp 2712–2721 Guha S, Mishra N, Roy G, Schrijvers O (2016) Robust random cut forest based anomaly detection on streams. In: International conference on machine learning, pp 2712–2721
15.
go back to reference Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an“explanatory” multi-modal factor analysis. University of California, Los Angeles Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an“explanatory” multi-modal factor analysis. University of California, Los Angeles
16.
go back to reference He Z, Xie S, Zdunek R, Zhou G, Cichocki A (2011) Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering. IEEE Trans Neural Netw 22(12):2117–2131CrossRef He Z, Xie S, Zdunek R, Zhou G, Cichocki A (2011) Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering. IEEE Trans Neural Netw 22(12):2117–2131CrossRef
17.
go back to reference Iasemidis LD, Sackellares JC (1996) Review: chaos theory and epilepsy. Neuroscientist 2(2):118–126CrossRef Iasemidis LD, Sackellares JC (1996) Review: chaos theory and epilepsy. Neuroscientist 2(2):118–126CrossRef
19.
go back to reference Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 426–434 Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 426–434
20.
go back to reference Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97CrossRef Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97CrossRef
21.
go back to reference Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562 Lee DD, Seung HS (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562
22.
go back to reference Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C (2005) Realistic, mathematically tractable graph generation and evolution, using Kronecker multiplication. In: European conference on principles of data mining and knowledge discovery, Springer, pp 133–145 Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C (2005) Realistic, mathematically tractable graph generation and evolution, using Kronecker multiplication. In: European conference on principles of data mining and knowledge discovery, Springer, pp 133–145
23.
go back to reference Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Am Soc Inf Sci Technol 58(7):1019–1031CrossRef Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Am Soc Inf Sci Technol 58(7):1019–1031CrossRef
24.
go back to reference Matsubara Y, Sakurai Y, Faloutsos C, Iwata T, Yoshikawa M (2012) Fast mining and forecasting of complex time-stamped events. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 271–279 Matsubara Y, Sakurai Y, Faloutsos C, Iwata T, Yoshikawa M (2012) Fast mining and forecasting of complex time-stamped events. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 271–279
25.
go back to reference Menon AK, Elkan C (2011) Link prediction via matrix factorization. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 437–452 Menon AK, Elkan C (2011) Link prediction via matrix factorization. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 437–452
26.
27.
go back to reference Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C (2003) Loci: fast outlier detection using the local correlation integral. In: Proceedings of the 19th international conference on data engineering, 2003, IEEE, pp 315–326 Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C (2003) Loci: fast outlier detection using the local correlation integral. In: Proceedings of the 19th international conference on data engineering, 2003, IEEE, pp 315–326
28.
go back to reference Papalexakis EE, Faloutsos C, Sidiropoulos ND (2012) Parcube: sparse parallelizable tensor decompositions. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 521–536 Papalexakis EE, Faloutsos C, Sidiropoulos ND (2012) Parcube: sparse parallelizable tensor decompositions. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 521–536
29.
go back to reference Pasta MQ, Jan Z, Sallaberry A, Zaidi F (2013) Tunable and growing network generation model with community structures. In: 2013 third international conference on cloud and green computing (CGC), IEEE, pp 233–240 Pasta MQ, Jan Z, Sallaberry A, Zaidi F (2013) Tunable and growing network generation model with community structures. In: 2013 third international conference on cloud and green computing (CGC), IEEE, pp 233–240
30.
go back to reference Ram P, Gray AG (2012) Maximum inner-product search using cone trees. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 931–939 Ram P, Gray AG (2012) Maximum inner-product search using cone trees. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 931–939
31.
go back to reference Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, Dakos V, Held H, Van Nes EH, Rietkerk M, Sugihara G (2009) Early-warning signals for critical transitions. Nature 461(7260):53CrossRef Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, Dakos V, Held H, Van Nes EH, Rietkerk M, Sugihara G (2009) Early-warning signals for critical transitions. Nature 461(7260):53CrossRef
32.
go back to reference Sharan U, Neville J (2008) Temporal-relational classifiers for prediction in evolving domains. In: 2008 eighth IEEE international conference on data mining, IEEE, pp 540–549 Sharan U, Neville J (2008) Temporal-relational classifiers for prediction in evolving domains. In: 2008 eighth IEEE international conference on data mining, IEEE, pp 540–549
33.
go back to reference Shi C, Li Y, Zhang J, Sun Y, Philip SY (2017) A survey of heterogeneous information network analysis. IEEE Trans Knowl Data Eng 29(1):17–37CrossRef Shi C, Li Y, Zhang J, Sun Y, Philip SY (2017) A survey of heterogeneous information network analysis. IEEE Trans Knowl Data Eng 29(1):17–37CrossRef
34.
go back to reference Shrivastava A, Li P (2014) Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS). arXiv preprint arXiv:1410.5410 Shrivastava A, Li P (2014) Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS). arXiv preprint arXiv:​1410.​5410
35.
go back to reference Şimşekli U, Ermiş B, Cemgil, AT, Acar E (2013) Optimal weight learning for coupled tensor factorization with mixed divergences. In: 21st European signal processing conference (EUSIPCO 2013), IEEE, pp 1–5 Şimşekli U, Ermiş B, Cemgil, AT, Acar E (2013) Optimal weight learning for coupled tensor factorization with mixed divergences. In: 21st European signal processing conference (EUSIPCO 2013), IEEE, pp 1–5
36.
go back to reference Song H A, Hooi B, Jereminov M, Pandey A, Pileggi L, Faloutsos C (2017) Powercast: mining and forecasting power grid sequences. In: Ceci M, Hollmén J, Todorovski L, Vens C, Džeroski S (eds) Machine learning and knowledge discovery in databases. ECML PKDD 2017. Lecture Notes in Computer Science, vol 10535. Springer, Cham, pp 606–621 Song H A, Hooi B, Jereminov M, Pandey A, Pileggi L, Faloutsos C (2017) Powercast: mining and forecasting power grid sequences. In: Ceci M, Hollmén J, Todorovski L, Vens C, Džeroski S (eds) Machine learning and knowledge discovery in databases. ECML PKDD 2017. Lecture Notes in Computer Science, vol 10535. Springer, Cham, pp 606–621
37.
go back to reference Sun J, Tao D, Faloutsos C (2006) Beyond streams and graphs: dynamic tensor analysis. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 374–383 Sun J, Tao D, Faloutsos C (2006) Beyond streams and graphs: dynamic tensor analysis. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 374–383
38.
go back to reference Tao D, Maybank S, Hu W, Li X (2005) Stable third-order tensor representation for color image classification. In: Proceedings of the 2005 IEEE/WIC/ACM international conference on web intelligence, IEEE Computer Society, pp 641–644 Tao D, Maybank S, Hu W, Li X (2005) Stable third-order tensor representation for color image classification. In: Proceedings of the 2005 IEEE/WIC/ACM international conference on web intelligence, IEEE Computer Society, pp 641–644
39.
go back to reference Teflioudi C, Gemulla R, Mykytiuk O (2015) Lemp: fast retrieval of large entries in a matrix product. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, ACM, pp 107–122 Teflioudi C, Gemulla R, Mykytiuk O (2015) Lemp: fast retrieval of large entries in a matrix product. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, ACM, pp 107–122
40.
go back to reference Walker PB, Gilpin S, Fooshee S, Davidson I (2015) Constrained tensor decomposition via guidance: increased inter and intra-group reliability in FMRI analyses. In: International conference on augmented cognition, Springer, pp 361–369 Walker PB, Gilpin S, Fooshee S, Davidson I (2015) Constrained tensor decomposition via guidance: increased inter and intra-group reliability in FMRI analyses. In: International conference on augmented cognition, Springer, pp 361–369
41.
go back to reference Welling M, Weber M (2001) Positive tensor factorization. Pattern Recognit Lett 22(12):1255–1261CrossRefMATH Welling M, Weber M (2001) Positive tensor factorization. Pattern Recognit Lett 22(12):1255–1261CrossRefMATH
42.
go back to reference Xie Z, Li X, Wang X (2007) A new community-based evolving network model. Phys A Stat Mech Appl 384(2):725–732CrossRef Xie Z, Li X, Wang X (2007) A new community-based evolving network model. Phys A Stat Mech Appl 384(2):725–732CrossRef
43.
go back to reference Yılmaz YK (2012) Generalized tensor factorization. Ph.D. thesis, Citeseer Yılmaz YK (2012) Generalized tensor factorization. Ph.D. thesis, Citeseer
44.
go back to reference Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 57(298):348–368MathSciNetCrossRefMATH Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 57(298):348–368MathSciNetCrossRefMATH
45.
go back to reference Zhou X, Xiang L, Xiao-Fan W (2008) Weighted evolving networks with self-organized communities. Commun Theor Phys 50(1):261CrossRefMATH Zhou X, Xiang L, Xiao-Fan W (2008) Weighted evolving networks with self-organized communities. Commun Theor Phys 50(1):261CrossRefMATH
Metadata
Title
TensorCast: forecasting and mining with coupled tensors
Authors
Miguel Araujo
Pedro Ribeiro
Hyun Ah Song
Christos Faloutsos
Publication date
24-05-2018
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1223-9

Other articles of this Issue 3/2019

Knowledge and Information Systems 3/2019 Go to the issue

Premium Partner