The TLB slice—a low-cost high-speed address translation mechanism

Authors: George Taylor, Peter Davies, Michael Farmwald

MIPS Computer Systems, 930 Arques Avenue, Sunnyvale, CA

ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture, May 1990, Pages 355–363
https://doi.org/10.1145/325164.325161

Published: 01 May 1990

ABSTRACT

The MIPS R6000 microprocessor relies on a new type of translation lookaside buffer — called a TLB slice — which is less than one-tenth the size of a conventional TLB and as fast as one multiplexer delay, yet has a high enough hit rate to be practical. The fast translation makes it possible to use a physical cache without adding a translation stage to the processor's pipeline. The small size makes it possible to include address translation on-chip, even in a technology with a limited number of devices.

The key idea behind the TLB slice is to have both a virtual tag and a physical tag on a physically-indexed cache. Because of the virtual tag, the TLB slice needs to hold only enough physical page number bits (typically 4 to 8) to complete the physical cache index, in contrast with a conventional TLB, which needs to hold both a virtual page number and a physical page number. The virtual page number is unnecessary because the TLB slice needs to provide only a hint for the translated physical address rather than a guarantee. The full physical page number is unnecessary because the cache hit logic is based on the virtual tag. Furthermore, if the cache is multi-level and references to the TLB slice are “shielded” by hits in a virtually indexed primary cache, the slice can get by with very few entries, once again lowering its cost and increasing its speed. With this mechanism, the simplicity of a physical cache can be combined with the speed of a virtual cache.
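
The lookup described above can be illustrated with a small software model. This is only a sketch under assumed parameters, not the R6000 design: the page size, line size, slice width, number of slice entries, the way the slice is indexed, and all class and method names are hypothetical choices made for illustration, and address-space identifiers are ignored. It shows the slice supplying a few physical page number bits to complete the cache index, with the hit decision made on the virtual tag.

```python
# Illustrative model only; every constant and name below is an assumption.
PAGE_BITS = 14       # assumed page size of 16 KB
LINE_BITS = 5        # assumed cache line size of 32 bytes
SLICE_BITS = 4       # assumed physical page number bits held per slice entry
SLICE_ENTRIES = 8    # assumed number of slice entries


class TLBSliceCache:
    """Physically indexed cache with a virtual tag, fronted by a TLB slice."""

    def __init__(self):
        # Each slice entry holds only the low SLICE_BITS of a physical page number.
        self.slice = [0] * SLICE_ENTRIES
        # Cache lines keyed by physical index: (virtual_tag, physical_tag, data).
        self.lines = {}

    def _slot(self, vaddr):
        # Assumed indexing scheme: low bits of the virtual page number.
        return (vaddr >> PAGE_BITS) % SLICE_ENTRIES

    def _physical_index(self, vaddr):
        # Untranslated page-offset bits select the line within a page ...
        offset_index = (vaddr & ((1 << PAGE_BITS) - 1)) >> LINE_BITS
        # ... and the slice supplies the remaining index bits as a hint.
        ppn_low = self.slice[self._slot(vaddr)]
        return (ppn_low << (PAGE_BITS - LINE_BITS)) | offset_index

    def lookup(self, vaddr):
        """Return cached data, or None when the virtual tag does not match."""
        line = self.lines.get(self._physical_index(vaddr))
        if line is not None and line[0] == vaddr >> PAGE_BITS:
            return line[2]
        return None  # miss: a full translation is needed before refilling

    def fill(self, vaddr, paddr, data):
        """Miss path: record the slice bits from a full translation, then fill."""
        ppn = paddr >> PAGE_BITS
        self.slice[self._slot(vaddr)] = ppn & ((1 << SLICE_BITS) - 1)
        self.lines[self._physical_index(vaddr)] = (vaddr >> PAGE_BITS, ppn, data)
```

In this model a stale or conflicting slice entry can only send the lookup to the wrong cache line, where the virtual tag comparison fails and the access is treated as a miss. That is the sense in which the slice provides a hint rather than a guarantee. The physical tag is stored but plays no part in the hit decision here.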


Index Terms

The TLB slice—a low-cost high-speed address translation mechanism
1. Hardware
  1. Hardware validation

Published in

ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture
May 1990, 378 pages
ISBN: 0897913663
DOI: 10.1145/325164
Chairmen: Jean-Loup Baer, Larry Snyder, James Goodman

ACM SIGARCH Computer Architecture News, Volume 18, Issue 2SI
Special Issue: Proceedings of the 17th annual international symposium on Computer Architecture
June 1990, 356 pages
ISSN: 0163-5964
DOI: 10.1145/325096
Chairmen: Jean-Loup Baer, Larry Snyder, James Goodman

Copyright © 1990 Authors
Publisher: Association for Computing Machinery, New York, NY, United States
