Abstract
Studying the execution of concurrent real-time online systems to identify far-reaching, hard-to-reproduce latency and performance problems requires a mechanism able to cope with the voluminous information extracted from execution traces. Furthermore, tracing must not disturb the workload; otherwise, the problematic behavior may become unreproducible.
In order to satisfy this low-disturbance constraint, we created the LTTng kernel tracer. It is designed to enable safe and race-free attachment of probes virtually anywhere in the operating system, including sites executed in non-maskable interrupt context.
In addition to being reentrant with respect to all kernel execution contexts, LTTng offers good performance and scalability, mainly due to its use of per-CPU data structures, local atomic operations as the main buffer synchronization primitive, and the RCU (Read-Copy Update) mechanism to control tracing.
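The idea of reserving buffer space with an atomic operation on a per-CPU structure, rather than with a lock, can be illustrated with a minimal user-space sketch. This is not LTTng's code: the names `cpu_buffer`, `reserve_slot`, `NR_CPUS`, and `BUF_SIZE` are hypothetical, and C11 atomics stand in for the kernel's CPU-local atomic operations, which only need to be atomic with respect to the CPU that owns the buffer.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS  4      /* hypothetical CPU count, for illustration only */
#define BUF_SIZE 4096   /* hypothetical per-CPU buffer size, in bytes */

/* One buffer per CPU, so writers on different CPUs never contend. */
struct cpu_buffer {
    atomic_size_t write_offset;   /* offset of the next free byte */
    char data[BUF_SIZE];
};

static struct cpu_buffer buffers[NR_CPUS];

/*
 * Reserve len bytes in the buffer of the given CPU.
 * Returns a pointer to the reserved space, or NULL if it does not fit.
 * The compare-and-swap loop retries if a nested context (for example an
 * interrupt handler) reserved space between the load and the update, so
 * no lock is taken and interrupts need not be disabled.
 */
static char *reserve_slot(int cpu, size_t len)
{
    struct cpu_buffer *buf = &buffers[cpu];
    size_t old = atomic_load_explicit(&buf->write_offset,
                                      memory_order_relaxed);
    size_t new_off;

    do {
        if (len > BUF_SIZE - old)
            return NULL;              /* not enough room left */
        new_off = old + len;
        /* On failure, old is refreshed with the current offset. */
    } while (!atomic_compare_exchange_weak_explicit(&buf->write_offset,
                                                    &old, new_off,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return &buf->data[old];
}
```

A tracing site would call `reserve_slot()` with the current CPU number and the event size, then copy the event payload into the returned region. A real tracer would switch sub-buffers instead of returning NULL when space runs out; that part is omitted here.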
Given that the kernel infrastructure used by the tracer could lead to infinite recursion if traced, and typically requires non-atomic synchronization, this paper proposes an asynchronous mechanism to inform the kernel that a buffer is ready to be read. This ensures that tracing sites do not require any kernel primitive, and therefore protects against infinite recursion.
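One possible shape for such an asynchronous notification is a flag that the tracing site sets and that a separate, safe context polls later. The sketch below assumes that shape; the abstract does not spell out the actual mechanism. All names (`wakeup_pending`, `mark_buffer_ready`, `poll_pending_wakeups`, `readers_woken`) are hypothetical, and the counter is a stand-in for waking a real reader.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical CPU count, for illustration only */

/* One "buffer ready" flag per CPU; static storage starts out false. */
static atomic_bool wakeup_pending[NR_CPUS];

/* Stand-in for the real action of waking a blocked reader. */
static int readers_woken;

/*
 * Called from the tracing site when a buffer becomes ready to read.
 * It performs a single atomic store and calls no other service, so it
 * cannot recurse into instrumented code.
 */
static void mark_buffer_ready(int cpu)
{
    atomic_store_explicit(&wakeup_pending[cpu], true,
                          memory_order_release);
}

/*
 * Called later from a context where waking a reader is safe, such as a
 * periodic timer (an assumption of this sketch). Each pending flag is
 * cleared and acted upon exactly once.
 */
static void poll_pending_wakeups(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (atomic_exchange_explicit(&wakeup_pending[cpu], false,
                                     memory_order_acquire))
            readers_woken++;   /* placeholder for the real wake-up */
    }
}
```

The trade-off in this design is latency: the reader learns about a ready buffer only at the next poll, in exchange for keeping the tracing site free of any call that could itself be traced.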
This paper presents the core of LTTng's buffering algorithms and measures their performance.
Index Terms
- Lockless multi-core high-throughput buffering scheme for kernel tracing