skip to main content
research-article

Geosocial Co-Clustering: A Novel Framework for Geosocial Community Detection

Published:13 June 2020Publication History
Skip Abstract Section

Abstract

As location-based services using mobile devices have become globally popular these days, social network analysis (especially, community detection) increasingly benefits from combining social relationships with geographic preferences. In this regard, this article addresses the emerging problem of geosocial community detection. We first formalize the problem of geosocial co-clustering, which co-clusters the users in social networks and the locations they visited. Geosocial co-clustering detects higher-quality communities than existing approaches by improving the mapping clusterability, whereby users in the same community tend to visit locations in the same region. While geosocial co-clustering is soundly formalized as non-negative matrix tri-factorization, conventional matrix tri-factorization algorithms suffer from a significant computational overhead when handling large-scale datasets. Thus, we also develop an efficient framework for geosocial co-clustering, called GEOsocial COarsening and DEcomposition (GEOCODE). To achieve efficient matrix tri-factorization, GEOCODE reduces the numbers of users and locations through coarsening and then decomposes the single whole matrix tri-factorization into a set of multiple smaller sub-matrix tri-factorizations. Thorough experiments conducted using real-world geosocial networks show that GEOCODE reduces the elapsed time by 19–69 times while achieving the accuracy of up to 94.8% compared with the state-of-the-art co-clustering algorithm. Furthermore, the benefit of the mapping clusterability is clearly demonstrated through a local expert recommendation application.

References

  1. Lada A. Adamic, Jun Zhang, Eytan Bakshy, and Mark S. Ackerman. 2008. Knowledge sharing and Yahoo answers: Everyone knows something. In Proceedings of the 17th International Conference on World Wide Web. 665--674.Google ScholarGoogle Scholar
  2. Waseem Ahmad and Ashfaq Khokhar. 2007. cHawk: An efficient biclustering algorithm based on bipartite graph crossing minimization. In Proceedings of the VLDB Workshop on Data Mining in Bioinformatics. 1553--1558.Google ScholarGoogle Scholar
  3. Nikos Armenatzoglou, Stavros Papadopoulos, and Dimitris Papadias. 2013. A general framework for geo-social query processing. Proc. VLDB Endow. 6, 10 (2013), 913--924.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Lars Backstrom, Eric Sun, and Cameron Marlow. 2010. Find me if you can: Improving geographical prediction with social and spatial proximity. In Proceedings of the 19th International Conference on World Wide Web. 61--70.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Jie Bao, Yu Zheng, and Mohamed F. Mokbel. 2012. Location-based and preference-aware recommendation using sparse geo-social networking data. In Proceedings of the 20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. 199--208.Google ScholarGoogle Scholar
  6. Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 33, 8 (2011), 1548--1560.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Fazli Can and Esen A. Ozkarahan. 1990. Concepts and effectiveness of the cover-coefficient-based clustering methodology for text databases. ACM Trans. Database Syst. 15, 4 (1990), 483--517.Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Carlos Castro-Herrera, Chuan Duan, Jane Cleland-Huang, and Bamshad Mobasher. 2009. A recommender system for requirements elicitation in large-scale software projects. In Proceedings of the 2009 ACM Symposium on Applied Computing. 1419--1426.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Zhiyuan Cheng, James Caverlee, Himanshu Barthwal, and Vandana Bachani. 2014. Who is the Barbecue King of Texas?: A geo-spatial approach to finding local experts on Twitter. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval. 335--344.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Eunjoon Cho, Seth A. Myers, and Jure Leskovec. 2011. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1082--1090.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Minsoo Choy, Jae-Gil Lee, Gahgene Gweon, and Daehoon Kim. 2014. Glaucus: Exploiting the wisdom of crowds for location-based queries in mobile environments. In Proceedings of the 8th AAAI International Conference on Weblogs and Social Media. 61--70.Google ScholarGoogle Scholar
  12. Inderjit S. Dhillon, Subramanyam Mallela, and Dharmendra S. Modha. 2003. Information-theoretic co-clustering. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 89--98.Google ScholarGoogle Scholar
  13. Chris Ding, Xiaofeng He, and Horst D. Simon. 2005. On the equivalence of nonnegative matrix factorization and spectral clustering. In Proceedings of the 2005 SIAM International Conference on Data Mining. 606--610.Google ScholarGoogle Scholar
  14. Chris Ding, Tao Li, Wei Peng, and Haesun Park. 2006. Orthogonal nonnegative matrix tri-factorizations for clustering. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 126--135.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Peter D. Grünwald, In Jae Myung, and Mark A Pitt. 2005. Advances in Minimum Description Length: Theory and Applications. MIT Press.Google ScholarGoogle Scholar
  16. Quanquan Gu and Jie Zhou. 2009. Co-clustering on manifolds. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 359--368.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Negar Hariri, Carlos Castro-Herrera, Mehdi Mirakhorli, Jane Cleland-Huang, and Bamshad Mobasher. 2013. Supporting domain analysis through mining and recommending features from online product listings. IEEE Trans. Softw. Eng. 39, 12 (2013), 1736--1752.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Xiaofei He and Partha Niyogi. 2004. Locality preserving projections. In Advances in Neural Information Processing Systems. 153--160.Google ScholarGoogle Scholar
  19. Andrew E. G. Jonas. 2012. Region and place: Regionalism in question. Progr. Hum. Geogr. 36, 2 (2012), 263--272.Google ScholarGoogle ScholarCross RefCross Ref
  20. George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 1 (1998), 359--392.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Jungeun Kim, Minseo Kang, Sungsu Lim, and Jae-Gil Lee. 2015. Triangle counting in networks using a multi-level branching technique. In Proceedings of the 2015 International Conference on Big Data and Smart Computing. 47--50.Google ScholarGoogle ScholarCross RefCross Ref
  22. Jungeun Kim and Jae-Gil Lee. 2015. Community detection in multi-layer graphs: A survey. ACM SIGMOD Rec. 44, 3 (2015), 37--48.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Jungeun Kim, Jae-Gil Lee, and Sungsu Lim. 2016. Differential flattening: A novel framework for community detection in multi-layer graphs. ACM Trans. Intell. Syst. Technol. 8, 2 (2016), 27.Google ScholarGoogle Scholar
  24. Jungeun Kim, Sungsu Lim, Jae-Gil Lee, and Byung Lee. 2018. LinkBlackHole*: Robust overlapping community detection using link embedding. IEEE Trans. Knowl. Data Eng. 31, 11 (2018), 2138--2150.Google ScholarGoogle ScholarCross RefCross Ref
  25. Jae-Gil Lee and Minseo Kang. 2015. Geospatial big data: Challenges and opportunities. Big Data Res. 2, 2 (2015), 74--81.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Thomas Lee. 2001. An introduction to coding theory and the two-part minimum description length principle. Int. Stat. Rev. 69, 2 (2001), 169--183.Google ScholarGoogle ScholarCross RefCross Ref
  27. Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. Retrieved from http://snap.stanford.edu/data.Google ScholarGoogle Scholar
  28. Jure Leskovec, Kevin J. Lang, and Michael Mahoney. 2010. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th International Conference on World Wide Web. 631--640.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Kenneth Wai-Ting Leung, Dik Lun Lee, and Wang-Chien Lee. 2011. CLR: A collaborative location recommendation framework based on co-clustering. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. 305--314.Google ScholarGoogle Scholar
  30. Yafei Li, Rui Chen, Jianliang Xu, Qiao Huang, Haibo Hu, and Byron Choi. 2015. Geo-social K-cover group queries for collaborative spatial computing. IEEE Trans. Knowl. Data Eng. 27, 10 (2015), 2729--2742.Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. David Liben-Nowell and Jon M. Kleinberg. 2007. The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 58, 7 (2007), 1019--1031.Google ScholarGoogle ScholarCross RefCross Ref
  32. Richard M. Medina and George F. Hepner. 2011. Advancing the understanding of sociospatial dependencies in terrorist networks. Trans. GIS 15, 5 (2011), 577--597.Google ScholarGoogle ScholarCross RefCross Ref
  33. Kyriakos Mouratidis, Jing Li, Yu Tang, and Nikos Mamoulis. 2015. Joint search by social and spatial proximity. IEEE Trans. Knowl. Data Eng. 27, 3 (2015), 781--793.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Valerio Perrone, Paul A Jenkins, Dario Spano, and Yee Whye Teh. 2017. Poisson random fields for dynamic feature models. J. Mach. Learn. Res. 18, 1 (2017), 4626--4670.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Peter J. Rousseeuw. 1987. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20 (1987), 53--65.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Mohamed Sarwat, Justin J. Levandoski, Ahmed Eldawy, and Mohamed F. Mokbel. 2014. LARS*: An efficient and scalable location-aware recommender system. IEEE Trans. Knowl. Data Eng. 26, 6 (2014), 1384--1399.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Paulo Shakarian, Patrick Roos, Devon Callahan, and Cory Kirk. 2013. Mining for geographically disperse communities in social networks by leveraging distance modularity. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1402--1409.Google ScholarGoogle ScholarCross RefCross Ref
  38. Fanhua Shang, L. C. Jiao, and Fei Wang. 2012. Graph dual regularization non-negative matrix factorization for co-clustering. Pattern Recogn. 45, 6 (2012), 2237--2250.Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Jieming Shi, Nikos Mamoulis, Dingming Wu, and David W. Cheung. 2014. Density-based place clustering in geo-social networks. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. 99--110.Google ScholarGoogle Scholar
  40. Yves van Gennip, Blake Hunter, Raymond Ahn, Peter Elliott, Kyle Luh, Megan Halvorson, Shannon Reid, Matthew Valasik, James Wo, George E. Tita, Andrea L. Bertozzi, and P. Jeffrey Brantingham. 2013. Community detection using spectral clustering on sparse geosocial data. SIAM J. Appl. Math. 73, 1 (2013), 67--83.Google ScholarGoogle ScholarCross RefCross Ref
  41. Hua Wang, Feiping Nie, Heng Huang, and Chris Ding. 2011. Nonnegative matrix tri-factorization based high-order co-clustering and its fast implementation. In Proceedings of the 11th IEEE International Conference on Data Mining. 774--783.Google ScholarGoogle ScholarCross RefCross Ref
  42. Hao Wang, Manolis Terrovitis, and Nikos Mamoulis. 2013. Location recommendation in location-based social networks using user check-in data. In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. 364--374.Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Meng Wang, Chaokun Wang, Jeffrey Xu Yu, and Jun Zhang. 2015. Community detection in social networks: An in-depth benchmarking study with a procedure-oriented framework. Proc. VLDB Endow. 8, 10 (2015), 998--1009.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Xiaoyang Wang, Ying Zhang, Wenjie Zhang, and Xuemin Lin. 2016. Distance-aware influence maximization in geo-social network. In Proceedings of the 32nd IEEE International Conference on Data Engineering. 1--12.Google ScholarGoogle ScholarCross RefCross Ref
  45. Yao Wu, Xudong Liu, Min Xie, Martin Ester, and Qing Yang. 2015. CCCF: Improving collaborative filtering via scalable user-item co-clustering. In Proceedings of the 9th ACM International Conference on Web Search and Data Mining. 73--82.Google ScholarGoogle Scholar
  46. De-Nian Yang, Chih-Ya Shen, Wang-Chien Lee, and Ming-Syan Chen. 2012. On socio-spatial group query for location-based social networks. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 949--957.Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Hongzhi Yin, Zhiting Hu, Xiaofang Zhou, Hao Wang, Kai Zheng, Nguyen Quoc Viet Hung, and Shazia Wasim Sadiq. 2016. Discovering interpretable geo-social communities for user behavior prediction. In Proceedings of the 32nd IEEE International Conference on Data Engineering. 942--953.Google ScholarGoogle ScholarCross RefCross Ref
  48. Jia-Dong Zhang and Chi-Yin Chow. 2013. iGSLR: Personalized geo-social location recommendation—A kernel density estimation approach. In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. 334--343.Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Marinka Zitnik and Blaz Zupan. 2015. Data fusion by matrix factorization. IEEE Trans. Pattern Anal. Mach. Intell. 37, 1 (2015), 41--53.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Geosocial Co-Clustering: A Novel Framework for Geosocial Community Detection

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Intelligent Systems and Technology
      ACM Transactions on Intelligent Systems and Technology  Volume 11, Issue 4
      Survey Paper and Regular Paper
      August 2020
      358 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3401889
      Issue’s Table of Contents

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 June 2020
      • Online AM: 7 May 2020
      • Accepted: 1 March 2020
      • Received: 1 June 2019
      Published in tist Volume 11, Issue 4

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article
      • Research
      • Refereed

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format .

    View HTML Format