ABSTRACT
We discuss the translation lookaside buffer (TLB) consistency problem for multiprocessors, and introduce the Mach shootdown algorithm for maintaining TLB consistency in software. This algorithm has been implemented on several multiprocessors, and is in regular production use. Performance evaluations establish the basic costs of the algorithm and show that it has minimal impact on application performance. As a result, TLB consistency does not pose an insurmountable obstacle to multiprocessors with several hundred processors. We also discuss hardware support options for TLB consistency ranging from a minor interrupt structure modification to complete hardware implementations. Features are identified in current hardware that compound the TLB consistency problem; removal or correction of these features can simplify and/or reduce the overhead of maintaining TLB consistency in software.
Translation lookaside buffer consistency: a software approach
Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems