2021 | OriginalPaper | Chapter

An Influence-Based Approach for Root Cause Alarm Discovery in Telecom Networks

Authors: Keli Zhang, Marcus Kalander, Min Zhou, Xi Zhang, Junjian Ye

Published in: Service-Oriented Computing – ICSOC 2020 Workshops

Publisher: Springer International Publishing

Abstract

Alarm root cause analysis is a significant component of day-to-day telecommunication network maintenance, and it is critical for efficient and accurate fault localization and failure recovery. In practice, accurate and self-adjustable alarm root cause analysis is a great challenge due to network complexity and the vast number of alarms. A popular approach for failure root cause identification is to construct a graph with approximate edges, commonly based on either event co-occurrences or conditional independence tests. However, considerable expert knowledge is typically required for edge pruning. We propose a novel data-driven framework for root cause alarm localization that combines causal inference and network embedding techniques. In this framework, we design a hybrid causal graph learning method (HPCI), which combines a Hawkes process with conditional independence tests, and propose a novel Causal Propagation-Based Embedding algorithm (CPBE) to infer edge weights. We subsequently discover root cause alarms in a real-time data stream by applying an influence maximization algorithm on the weighted graph. We evaluate our method on artificial data and real-world telecom data, showing a significant improvement over the best baselines.
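
The final step described in the abstract, selecting root cause alarms by influence maximization on a weighted causal graph, can be illustrated with a minimal sketch. The abstract does not say which influence maximization algorithm the authors apply, so the code below assumes the classic greedy selection with Monte Carlo simulation under the independent cascade model. The alarm names, edge weights, and function names are hypothetical and do not come from the chapter.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """Run one independent-cascade simulation; return the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, prob in graph.get(node, {}).items():
                # A newly activated node gets one chance to trigger each neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def expected_spread(graph, seeds, rng, runs=200):
    """Estimate the expected number of activated nodes by Monte Carlo averaging."""
    total = sum(len(simulate_cascade(graph, seeds, rng)) for _ in range(runs))
    return total / runs


def greedy_root_causes(graph, k, runs=200, seed=0):
    """Greedily pick up to k nodes that maximize the estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_spread = None, -1.0
        # Sorted iteration makes tie-breaking reproducible.
        for candidate in sorted(nodes - set(chosen)):
            spread = expected_spread(graph, chosen + [candidate], rng, runs)
            if spread > best_spread:
                best_node, best_spread = candidate, spread
        chosen.append(best_node)
    return chosen


# Hypothetical weighted alarm graph: graph[u][v] is the assumed probability
# that alarm u triggers alarm v.
alarm_graph = {
    "LINK_DOWN": {"BGP_PEER_LOSS": 0.9, "HIGH_LATENCY": 0.8},
    "BGP_PEER_LOSS": {"ROUTE_FLAP": 0.7},
    "HIGH_LATENCY": {},
    "ROUTE_FLAP": {},
}
print(greedy_root_causes(alarm_graph, k=1))
```

In this reading, an alarm whose activation is expected to explain the largest number of downstream alarms is ranked as the most likely root cause. The greedy loop is simple but costly on large graphs, since every candidate requires its own batch of simulations.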

Metadata
Title: An Influence-Based Approach for Root Cause Alarm Discovery in Telecom Networks
Authors: Keli Zhang, Marcus Kalander, Min Zhou, Xi Zhang, Junjian Ye
Copyright Year: 2021
DOI: https://doi.org/10.1007/978-3-030-76352-7_16
