
Survey and Analysis of Kernel and Userspace Tracers on Linux: Design, Implementation, and Overhead

Published: 12 March 2018

Abstract

As applications and operating systems are becoming more complex, the last decade has seen the rise of many tracing tools all across the software stack. This article presents a hands-on comparison of modern tracers on Linux systems, both in user space and kernel space. The authors implement microbenchmarks that not only quantify the overhead of different tracers, but also sample fine-grained metrics that unveil insights into the tracers’ internals and show the cause of each tracer’s overhead. Internal design choices and implementation particularities are discussed, which helps us to understand the challenges of developing tracers. Furthermore, this analysis aims to help users choose and configure their tracers based on their specific requirements to reduce their overhead and get the most out of them.

Published in

ACM Computing Surveys, Volume 51, Issue 2
March 2019
748 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3186333
Editor: Sartaj Sahni

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 12 March 2018
• Accepted: 1 November 2017
• Revised: 1 August 2017
• Received: 1 November 2016
