Abstract
The increasing popularity of social media encourages more and more users to participate in various online activities and produces data at an unprecedented rate. Social media data is big, linked, noisy, highly unstructured, and incomplete; it differs from the data used in traditional data mining, which has given rise to a new research field: social media mining. Social theories from the social sciences help explain social phenomena, but the scale and properties of social media data are very different from those of the data that social scientists use to develop such theories. As a new type of social data, social media data therefore raises a fundamental question: can we apply social theories to social media data? Recent advances in computer science provide the computational tools and techniques needed to verify social theories on large-scale social media data, and social theories have been applied to mining social media. In this article, we review some key social theories in mining social media, their verification approaches, interesting findings, and state-of-the-art algorithms. We also discuss some future directions in this active area of mining social media with social theories.
Index Terms
- Mining social media with social theories: a survey