DOI: 10.1145/3097983.3098022
research-article

FLAP: An End-to-End Event Log Analysis Platform for System Management

Published: 13 August 2017

ABSTRACT

Many systems, such as distributed operating systems, complex networks, and high-throughput web-based applications, continuously generate large volumes of event logs. These logs contain useful information that helps system administrators understand the running status of a system and pinpoint system failures. Because of the scale and complexity of modern systems, the generated logs generally exceed what human analysts can process manually. It is therefore imperative to develop a comprehensive log analysis system to support effective system management. Although a number of log mining techniques have been proposed to address specific log analysis use cases, few research or industrial efforts have been devoted to integrated systems that provide an end-to-end solution for routine log analysis.

In this paper, we design and implement an integrated system, called FIU Log Analysis Platform (FLAP), that aims to facilitate data analytics for system event logs. FLAP provides an end-to-end solution that uses advanced data mining techniques to help log analysts conduct event log knowledge discovery, system status investigation, and system failure diagnosis conveniently, promptly, and accurately. Specifically, in FLAP, state-of-the-art template learning techniques are used to extract useful information from unstructured raw logs; advanced data transformation techniques are proposed and leveraged for event transformation and storage; effective event pattern mining, event summarization, event querying, and failure prediction techniques are designed and integrated for log analytics; and user-friendly interfaces present the analysis results intuitively and vividly. Since 2016, FLAP has been used by Huawei Technologies Co. Ltd for internal event log analysis, and has provided effective support for its system operation and workflow optimization.


Supplemental Material

li_system_management.mp4 (mp4, 343.9 MB)


Published in

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN: 9781450348874
DOI: 10.1145/3097983

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '17 Paper Acceptance Rate: 64 of 748 submissions, 9%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
