research-article

Telco User Activity Level Prediction with Massive Mobile Broadband Data

Published: 02 May 2016

Abstract

Telecommunication (telco) operators aim to provide users with optimized services and bandwidth in a timely manner, improving user experience while remaining profitable. Knowing in advance how users' behavior patterns change, as reflected in their activity levels, helps operators adjust their management strategies and reduce operational risk. To this end, operators can use knowledge discovered from a telco's historical mobile broadband (MBB) records to predict mobile access activity levels at an early stage. In this article, we report our research in a real-world telco setting involving more than one million telco users. Our novel contribution includes representing users as documents containing a collection of changing spatiotemporal “words” that express user behavior. From users’ space-time access records extracted from MBB data, we use latent Dirichlet allocation (LDA) to learn user-specific compact topic features for user activity level prediction. We propose a scalable online expectation-maximization (OEM) algorithm that scales LDA to massive MBB data and is significantly faster than several state-of-the-art online LDA algorithms. Using these real-world MBB data, we confirm high performance in user activity level prediction. In addition, the inferred topics indicate that future activity level anomalies correlate highly with early skewed bandwidth supply and demand relations. Thus, our prediction system can also guide telco operators in balancing supply and demand in the telecommunication network, saving future deployment costs and cell tower energy.
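To make the "users as documents" idea concrete, here is a minimal sketch of how space-time access records might be turned into per-user bags of spatiotemporal "words". The abstract does not specify the record layout or the word encoding, so the `(user_id, cell_id, hour)` tuples, the 6-hour time slots, and the `cell@slot` word format below are illustrative assumptions, not the authors' actual scheme.

```python
from collections import Counter, defaultdict

# Hypothetical access records: (user_id, cell_id, hour_of_day).
# The real MBB record schema is not given in the abstract.
records = [
    ("u1", "cellA", 8),
    ("u1", "cellA", 9),
    ("u1", "cellB", 20),
    ("u2", "cellB", 20),
    ("u2", "cellB", 21),
]


def time_slot(hour, slot_hours=6):
    """Map an hour (0-23) to a coarse slot index; the 6-hour width is an assumption."""
    return hour // slot_hours


def spatiotemporal_word(cell_id, hour):
    """Combine a location and a time slot into one token (assumed encoding)."""
    return f"{cell_id}@slot{time_slot(hour)}"


def build_user_documents(records):
    """Group records by user and count each spatiotemporal word (bag of words)."""
    docs = defaultdict(Counter)
    for user_id, cell_id, hour in records:
        docs[user_id][spatiotemporal_word(cell_id, hour)] += 1
    return {user: dict(counts) for user, counts in docs.items()}


docs = build_user_documents(records)
for user, counts in sorted(docs.items()):
    print(user, counts)
```

Each resulting word-count dictionary plays the role of one document. A topic model such as LDA could then be fitted to these counts to obtain a compact topic vector per user, which in turn would serve as input features for an activity level classifier.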



    • Published in

      ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 4
      Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
      July 2016
      498 pages
      ISSN: 2157-6904
      EISSN: 2157-6912
      DOI: 10.1145/2906145
      Editor: Yu Zheng

      Copyright © 2016 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 May 2016
      • Accepted: 1 December 2015
      • Revised: 1 November 2015
      • Received: 1 February 2015


      Qualifiers

      • research-article
      • Research
      • Refereed
