Abstract
Uncontrolled use of the cache hierarchy of a multicore processor by real-time tasks can adversely affect their worst-case execution times. Several operating system techniques have recently been proposed to manage caches in multiprocessors and improve predictability, such as cache partitioning, cache locking, and real-time scheduling. However, the contention caused by the cache coherence protocol, and its implications for real-time tasks, remains an open problem. In this paper, we present the design and evaluation of a real-time operating system for cache-coherent multicore architectures. The operating system infrastructure includes real-time schedulers, cache partitioning, and detection of cache coherence contention through hardware performance counters. We evaluate the operating system in terms of run-time overhead, schedulability of real-time tasks, cache partitioning performance, and the usability of hardware performance counters. Our results indicate that: (i) a real-time operating system designed from scratch reduces run-time overhead, and thus improves real-time schedulability, when compared to a patched operating system; (ii) cache partitioning reduces contention in the shared cache and provides safe real-time bounds; and (iii) hardware performance counters can detect when real-time tasks interfere with each other at the shared cache level. Together, scheduling, cache partitioning, and hardware performance counters are a step toward providing real-time bounds in cache-coherent architectures.
Index Terms
- On the Design and Evaluation of a Real-Time Operating System for Cache-Coherent Multicore Architectures