research-article

Analysis of Shared Cache Interference in Multi-Core Systems using Event-Arrival Curves

Authors:
Thilo L. Fischer

Institute of Embedded Systems, Hamburg University of Technology, Germany

Institute of Embedded Systems, Hamburg University of Technology, Germany

0000-0002-6309-8979
View Profile

,
Heiko Falk

Institute of Embedded Systems, Hamburg University of Technology, Germany

Institute of Embedded Systems, Hamburg University of Technology, Germany

0000-0003-1196-0122
View Profile

RTNS '23: Proceedings of the 31st International Conference on Real-Time Networks and SystemsJune 2023Pages 23–33https://doi.org/10.1145/3575757.3593643

Published:07 June 2023Publication History

RTNS '23: Proceedings of the 31st International Conference on Real-Time Networks and Systems

Pages 23–33

ABSTRACT

Caches are used to bridge the gap between main memory and the significantly faster processor cores. In multi-core architectures, the last-level cache is often shared between cores. However, sharing a cache causes inter-core interference to emerge. Concurrently running tasks will experience additional cache misses as the competing tasks issue interfering accesses and trigger the eviction of data contained in the shared cache. Thus, to compute a task’s worst-case execution time (WCET), a safe bound on the effects of inter-core cache interference has to be determined. In this paper, we propose a novel analysis approach for shared caches using the least recently used (LRU) replacement policy. The presented analysis leverages timing information to produce tight bounds on the worst-case interference. We describe how inter-core cache interference may be expressed as a function of time using event-arrival curves. Thus, by determining the maximal duration between subsequent accesses to a cache block, it is possible to bound the inter-core interference. This enables us to classify accesses as cache hits or potential misses. We implemented the analysis in a WCET analyzer and evaluated its performance for multi-core systems containing 2, 4, and 8 cores using shared caches from 4 KB to 32 KB. The analysis achieves significant improvements compared to a standard interference analysis with WCET reductions of up to 60%. The average WCET reduction is 9% for dual-core, 15% for quad-core, and 11% for octa-core systems. The analysis runtime overhead ranges from a factor of 4 × to 7 × compared to the baseline analysis.

References

Patrick Cousot and Radhia Cousot. 1977. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In Proceedings of the 4th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages(POPL 1977). 238–252. https://doi.org/10.1145/512950.512973Google ScholarDigital Library
P. Padma Priya Dharishini and P. V. R. Murthy. 2021. Precise Shared Instruction Cache Analysis to Estimate WCET of Multi-threaded Programs. In 2021 IEEE 18th India Council International Conference (INDICON). 1–7. https://doi.org/10.1109/INDICON52576.2021.9691620Google Scholar
Heiko Falk and Paul Lokuciejewski. 2010. A Compiler Framework for the Reduction of Worst-Case Execution Times. Real-Time Systems 46, 2 (2010), 251–300. https://doi.org/10.1007/s11241-010-9101-xGoogle ScholarDigital Library
Christian Ferdinand and Reinhard Wilhelm. 1999. Efficient and Precise Cache Behavior Prediction for Real-Time Systems. Real-Time Systems 17, 2 (1999), 131–181. https://doi.org/10.1023/A:1008186323068Google ScholarDigital Library
Jan Gustafsson, Adam Betts, Andreas Ermedahl, and Björn Lisper. 2010. The Mälardalen WCET Benchmarks – Past, Present and Future. In 10th International Workshop on Worst-Case Execution Time Analysis (WCET 2010), Björn Lisper (Ed.). OCG, Brussels, Belgium, 136–146. https://doi.org/10.4230/OASIcs.WCET.2010.136Google Scholar
Damien Hardy, Thomas Piquet, and Isabelle Puaut. 2009. Using Bypass to Tighten WCET Estimates for Multi-Core Processors with Shared Instruction Caches. In 30th IEEE Real-Time Systems Symposium (RTSS). 68–77. https://doi.org/10.1109/RTSS.2009.34Google ScholarDigital Library
Yau-Tsun Steven Li and Sharad Malik. 1997. Performance Analysis of Embedded Software Using Implicit Path Enumeration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) 16, 12 (1997), 1477–1487. https://doi.org/10.1109/43.664229Google ScholarDigital Library
Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, and Vivy Suhendra. 2012. Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores. Real-Time Systems 48, 6 (2012), 638–680. https://doi.org/10.1007/s11241-012-9160-2Google ScholarDigital Library
Mingsong Lv, Nan Guan, Jan Reineke, Reinhard Wilhelm, and Wang Yi. 2016. A Survey on Static Cache Analysis for Real-Time Systems. Leibniz Transactions on Embedded Systems (LITES) 3, 1 (2016), 05:1–05:48. https://doi.org/10.4230/LITES-v003-i001-a005Google Scholar
Claire Maiza, Hamza Rihani, Juan M. Rivas, Joël Goossens, Sebastian Altmeyer, and Robert I. Davis. 2019. A Survey of Timing Verification Techniques for Multi-Core Real-Time Systems. ACM Computing Surveys (CSUR) 52, 3 (2019), 1–38. https://doi.org/10.1145/3323212Google ScholarDigital Library
Kartik Nagar. 2016. Precise analysis of Private and Shared Caches for tight WCET Estimates. Ph. D. Dissertation. Indian Institute of Science Bangalore.Google Scholar
Kartik Nagar and Y. N. Srikant. 2014. Precise Shared Cache Analysis using Optimal Interference Placement. In IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS). 125–134. https://doi.org/10.1109/RTAS.2014.6925996Google ScholarCross Ref
Dominic Oehlert. 2021. Worst Case Execution Time Oriented Code Optimization of Hard Real-Time Multicore Systems. Ph. D. Dissertation. Technische Universität Hamburg.Google Scholar
Dominic Oehlert, Selma Saidi, and Heiko Falk. 2018. Compiler-Based Extraction of Event Arrival Functions for Real-Time Systems Analysis. In 30th Euromicro Conference on Real-Time Systems (ECRTS). 4:1–4:22. https://doi.org/10.4230/LIPIcs.ECRTS.2018.4Google Scholar
The Embedded Microprocessor Benchmark Consortium. 2023. About the EEMBC AutoBench™ Performance Benchmark Suite. EEMBC. Retrieved 2023-01-20 from https://www.eembc.org/autobench/Google Scholar
Valentin Touzeau, Claire Maïza, David Monniaux, and Jan Reineke. 2019. Fast and Exact Analysis for LRU Caches. Proceedings of the ACM on Programming Languages (POPL) 3 (2019), 54:1–54:29. https://doi.org/10.1145/3290367Google ScholarDigital Library
Wei Zhang, Mingsong Lv, Wanli Chang, and Lei Ju. 2022. Precise and Scalable Shared Cache Contention Analysis for WCET Estimation. In Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC). 1267–1272. https://doi.org/10.1145/3489517.3530613Google ScholarDigital Library

Index Terms

Analysis of Shared Cache Interference in Multi-Core Systems using Event-Arrival Curves
1. Computer systems organization
  1. Real-time systems
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Formal software verification

Recommendations

An Improved Multi-core Shared Cache Replacement Algorithm
DCABES '12: Proceedings of the 2012 11th International Symposium on Distributed Computing and Applications to Business, Engineering & Science

Many multi-core processors employ a large last-level cache (LLC) shared among the multiple cores. Past research has demonstrated that traditional LRU and its approximation can lead to poor performance and unfairness when the multiple cores compete for ...
Read More
High performance cache replacement using re-reference interval prediction (RRIP)
ISCA '10

Practical cache replacement policies attempt to emulate optimal replacement by predicting the re-reference interval of a cache block. The commonly used LRU replacement policy always predicts a near-immediate re-reference interval on cache hits and ...
Read More
High performance cache replacement using re-reference interval prediction (RRIP)
ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture

Practical cache replacement policies attempt to emulate optimal replacement by predicting the re-reference interval of a cache block. The commonly used LRU replacement policy always predicts a near-immediate re-reference interval on cache hits and ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in

RTNS '23: Proceedings of the 31st International Conference on Real-Time Networks and Systems
June 2023
242 pages
ISBN:9781450399838
DOI:10.1145/3575757

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 June 2023
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
WCET analysis
event-arrival curve
multi-core
shared cache
Qualifiers
- research-article
- Research
- Refereed limited
Conference

Acceptance Rates
Overall Acceptance Rate119of255submissions,47%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 0
  Total Citations
  View Citations
- 52
  Total Downloads
- Downloads (Last 12 months)52
- Downloads (Last 6 weeks)6
Other Metrics
View Author Metrics
Cited By
This publication has not been cited yet

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

Analysis of Shared Cache Interference in Multi-Core Systems using Event-Arrival Curves

RTNS '23: Proceedings of the 31st International Conference on Real-Time Networks and Systems

ABSTRACT

References

Cited By

Index Terms

Recommendations

An Improved Multi-core Shared Cache Replacement Algorithm

High performance cache replacement using re-reference interval prediction (RRIP)

High performance cache replacement using re-reference interval prediction (RRIP)

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

HTML Format

Caption

Analysis of Shared Cache Interference in Multi-Core Systems using Event-Arrival Curves

RTNS '23: Proceedings of the 31st International Conference on Real-Time Networks and Systems

ABSTRACT

References

Cited By

Index Terms

Recommendations

An Improved Multi-core Shared Cache Replacement Algorithm

High performance cache replacement using re-reference interval prediction (RRIP)

High performance cache replacement using re-reference interval prediction (RRIP)

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

HTML Format

Share this Publication link

Share on Social Media