ABSTRACT
This paper shows how to exploit the inherent error resilience of a wide range of applications to mitigate the memory wall, the growing discrepancy between core and memory speed. We introduce a new microarchitecturally triggered approximation technique called rollback-free value prediction. This technique predicts the values of safe-to-approximate loads when they miss in the cache, without tracking mispredictions or paying for costly recovery from misspeculation. It thus mitigates the memory wall by allowing the core to continue computing instead of stalling on long-latency memory accesses. Our detailed study of the quality trade-offs shows that, on a modern out-of-order processor, an average 8% (up to 19%) performance improvement is achievable with an average 0.8% (up to 1.8%) quality loss on an approximable subset of SPEC CPU 2000/2006.
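To make the mechanism concrete, the following is a minimal behavioral sketch of rollback-free value prediction. It assumes a last-value predictor indexed by load PC and a toy fully associative cache; the paper's actual predictor organization and cache model may differ, and the class and method names here are illustrative only.

```python
class LastValuePredictor:
    """Predicts that a load will return the last value it produced."""
    def __init__(self):
        self.table = {}  # load PC -> last observed value

    def predict(self, pc):
        return self.table.get(pc, 0)  # cold entries predict 0

    def train(self, pc, value):
        self.table[pc] = value


class ApproxLoadUnit:
    """On a cache miss, a safe-to-approximate load returns the predicted
    value immediately instead of stalling. The value fetched from memory
    only trains the predictor -- there is no misprediction tracking and
    no rollback, which is the key difference from speculative value
    prediction."""
    def __init__(self, memory, cache_lines=4):
        self.memory = memory       # address -> value
        self.cache = {}            # toy fully associative cache
        self.cache_lines = cache_lines
        self.predictor = LastValuePredictor()

    def load(self, pc, addr, approximate):
        """Returns (value, used_prediction)."""
        if addr in self.cache:                      # hit: exact value
            value = self.cache[addr]
            self.predictor.train(pc, value)
            return value, False
        prediction = self.predictor.predict(pc)     # available at once
        fetched = self.memory[addr]                 # long-latency fetch
        if len(self.cache) >= self.cache_lines:     # FIFO eviction
            self.cache.pop(next(iter(self.cache)))
        self.cache[addr] = fetched
        self.predictor.train(pc, fetched)           # train off critical path
        if approximate:
            return prediction, True                 # core keeps computing
        return fetched, False                       # precise load must wait
```

The approximate path never verifies the prediction against the fetched value, so output quality, not rollback hardware, absorbs any error; this is why the technique is restricted to safe-to-approximate loads.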
Index Terms
- Rollback-free value prediction with approximate loads