2018 | Book

Modern X86 Assembly Language Programming

Covers x86 64-bit, AVX, AVX2, and AVX-512

Author: Daniel Kusswurm

Publisher: Apress

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Gain the fundamentals of x86 64-bit assembly language programming and focus on the updated aspects of the x86 instruction set that are most relevant to application software development. This book covers topics including x86 64-bit programming and Advanced Vector Extensions (AVX) programming.

The focus in this second edition is exclusively on 64-bit base programming architecture and AVX programming. Modern X86 Assembly Language Programming’s structure and sample code are designed to help you quickly understand x86 assembly language programming and the computational capabilities of the x86 platform. After reading and using this book, you’ll be able to code performance-enhancing functions and algorithms using x86 64-bit assembly language and the AVX, AVX2 and AVX-512 instruction set extensions.

What You Will Learn

Discover details of the x86 64-bit platform including its core architecture, data types, registers, memory addressing modes, and the basic instruction set

Use the x86 64-bit instruction set to create performance-enhancing functions that are callable from a high-level language (C++)

Employ x86 64-bit assembly language to efficiently manipulate common data types and programming constructs including integers, text strings, arrays, and structures

Use the AVX instruction set to perform scalar floating-point arithmetic

Exploit the AVX, AVX2, and AVX-512 instruction sets to significantly accelerate the performance of computationally-intense algorithms in problem domains such as image processing, computer graphics, mathematics, and statistics

Apply various coding strategies and techniques to optimally exploit the x86 64-bit, AVX, AVX2, and AVX-512 instruction sets for maximum possible performance

Who This Book Is For

Software developers who want to learn how to write code using x86 64-bit assembly language. It’s also ideal for software developers who already have a basic understanding of x86 32-bit or 64-bit assembly language programming and are interested in learning how to exploit the SIMD capabilities of AVX, AVX2 and AVX-512.

Frontmatter

Chapter 1. X86-64 Core Architecture

Abstract

Chapter 1 examines the x86-64’s core architecture from the perspective of an application program. It opens with a brief historical overview of the x86 platform in order to provide a frame of reference for subsequent content. This is followed by a review of fundamental, numeric, and SIMD data types. X86-64 core architecture is examined next, which includes explanations of processor register sets, status flags, instruction operands, and memory addressing modes. The chapter concludes with an overview of the core x86-64 instruction set.

Daniel Kusswurm

Chapter 2. X86-64 Core Programming – Part 1

Abstract

In the previous chapter, you learned about the fundamentals of the x86-64 platform including its data types, register sets, memory addressing modes, and the core instruction set. In this chapter, you learn how to code basic x86-64 assembly language functions that are callable from C++. You also learn about the semantics and syntax of an x86-64 assembly language source code file. The sample source code and accompanying remarks of this chapter are intended to complement the instructive material presented in Chapter 1.

Daniel Kusswurm

Chapter 3. X86-64 Core Programming – Part 2

Abstract

The previous chapter introduced the fundamentals of x86-64 assembly language programming. You learned how to use the x86-64 instruction set to perform integer addition, subtraction, multiplication, and division. You also examined source code that illustrated use of logical instructions, shift operations, memory addressing modes, and conditional jumps and moves. In addition to learning about frequently used instructions, your initiation to x86-64 assembly language programming has also covered important practical details including assembler directives and calling convention requirements.

Daniel Kusswurm

Chapter 4. Advanced Vector Extensions

Abstract

In the first three chapters of this book, you learned about the core x86-64 platform including its data types, general-purpose registers, and memory addressing modes. You also examined a cornucopia of sample code that illustrated the fundamentals of x86-64 assembly language programming, including basic operands, integer arithmetic, compare operations, conditional jumps, and manipulation of common data structures.

Daniel Kusswurm

Chapter 5. AVX Programming – Scalar Floating-Point

Abstract

In the previous chapter, you learned about the architecture and computing capabilities of AVX. In this chapter, you’ll learn how to use the AVX instruction set to perform scalar floating-point calculations. The first section includes a couple of sample programs that illustrate basic scalar floating-point arithmetic including addition, subtraction, multiplication, and division. The next section contains code that explains use of the scalar floating-point compare and conversion instructions. This is followed by two examples that demonstrate scalar floating-point operations using arrays and matrices. The final section of this chapter formally describes the Visual C++ calling convention.

Daniel Kusswurm

Chapter 6. AVX Programming – Packed Floating-Point

Abstract

The source code examples of the previous chapter elucidated the fundamentals of AVX programming using scalar floating-point arithmetic. In this chapter, you’ll learn how to use the AVX instruction set to perform operations using packed floating-point operands. The chapter begins with three source code examples that demonstrate common packed floating-point operations, including basic arithmetic, data comparisons, and data conversions. The next set of source code examples illustrate how to carry out SIMD computations using floating-point arrays. The final two source code examples explain how to use the AVX instruction set to accelerate matrix transposition and multiplication.

Daniel Kusswurm

Chapter 7. AVX Programming – Packed Integers

Abstract

In the previous chapter, you learned how to use the AVX instruction set to perform calculations using packed floating-point operands. In this chapter, you learn how to carry out computations using packed integer operands. Similar to the previous chapter, the first few source code examples in this chapter demonstrate basic arithmetic operations using packed integers. The remaining source code examples illustrate how to use the computational resources of AVX to perform common image processing operations, including histogram creation and thresholding.

Daniel Kusswurm
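
The histogram creation and thresholding operations named in the Chapter 7 abstract can be sketched in plain scalar C++. This is only a reference baseline of the kind a SIMD routine is usually checked against, not the book's code; the function names `build_histogram` and `threshold_image` and the 8-bit grayscale pixel layout are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how often each 8-bit gray level occurs in the image.
std::array<std::uint32_t, 256> build_histogram(const std::vector<std::uint8_t>& pixels) {
    std::array<std::uint32_t, 256> hist{};  // value-initialized to all zeros
    for (std::uint8_t p : pixels) {
        ++hist[p];
    }
    return hist;
}

// Write 0xFF where a pixel is strictly above the threshold, 0x00 otherwise.
// dst must already have the same size as src.
void threshold_image(const std::vector<std::uint8_t>& src,
                     std::vector<std::uint8_t>& dst,
                     std::uint8_t threshold) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = (src[i] > threshold) ? 0xFF : 0x00;
    }
}
```

The 0xFF/0x00 output convention is chosen here because packed integer compare instructions generally produce all-ones or all-zeros per element, which makes a scalar baseline in this form easy to compare against a vectorized version.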

Chapter 8. Advanced Vector Extensions 2

Abstract

In the previous four chapters, you learned about the architecture and processing capabilities of AVX. These chapters explicated AVX’s register sets, data types, and instructions. They also included numerous source code examples that illustrated how to perform scalar floating-point arithmetic, packed floating-point computations, and packed integer calculations. Many of the packed floating-point and packed integer source code examples exemplified important SIMD programming strategies and techniques whose exploitation often results in faster executing code.

Daniel Kusswurm

Chapter 9. AVX2 Programming – Packed Floating-Point

Abstract

In Chapter 6, you learned how to use the AVX instruction set to perform packed floating-point operations using the XMM register set and 128-bit wide operands. In this chapter, you learn how to carry out packed floating-point operations using the YMM register set and 256-bit wide operands. The chapter begins with a simple example that demonstrates the basics of packed floating-point arithmetic and YMM register use. This is followed by three source code examples that illustrate how to perform packed calculations with floating-point arrays.

Daniel Kusswurm

Chapter 10. AVX2 Programming – Packed Integers

Abstract

In Chapter 7, you learned how to use the AVX instruction set to perform packed integer operations using 128-bit wide operands and the XMM register set. In this chapter, you learn how to carry out similar operations using AVX2 instructions with 256-bit wide operands and the YMM register set. Chapter 10’s source code examples are divided into two major sections. The first section contains elementary examples that illustrate basic operations using AVX2 instructions and 256-bit wide packed integer operands. The second section includes examples that are a continuation of the image processing techniques first presented in Chapter 7.

Daniel Kusswurm

Chapter 11. AVX2 Programming – Extended Instructions

Abstract

In this chapter, you learn how to use some of the instruction set extensions that were introduced in Chapter 8. The first section contains a couple of source code examples that exemplify use of the scalar and packed fused-multiply-add (FMA) instructions. The second section covers instructions that involve the general-purpose registers. This section includes source code examples that explain flagless multiplication and bit shifting. It also surveys some of the enhanced bit-manipulation instructions. The final section discusses the instructions that perform half-precision floating-point conversions.

Daniel Kusswurm
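
As a rough illustration of the fused-multiply-add operation mentioned in the Chapter 11 abstract, the sketch below uses the standard C++ function `std::fma`, which computes `a * x + y` with a single rounding step. The function name `axpy_fma` is hypothetical, and whether the compiler emits a hardware FMA instruction for it depends on the target and build flags; the book itself works in assembly language.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes y[i] = a * x[i] + y[i] for every element.
// std::fma rounds once, after the multiply and add, instead of twice.
void axpy_fma(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = std::fma(a, x[i], y[i]);
    }
}
```

A packed FMA instruction would apply the same per-element operation to several array elements at once; the scalar loop here only shows the arithmetic.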

Chapter 12. Advanced Vector Extensions 512

Abstract

In the previous eight chapters, you learned about the scalar floating-point, packed floating-point, and packed integer capabilities of AVX and AVX2. In this chapter, you’ll learn about Advanced Vector Extensions 512 (AVX-512). AVX-512 is undoubtedly the largest and perhaps the most consequential extension of the x86 platform to date. It doubles the number of available SIMD registers and broadens the width of each register from 256 to 512 bits. AVX-512 also extends the instruction syntax of AVX and AVX2 to support additional capabilities not available in the earlier extensions, including conditional execution and merging, embedded broadcasts, and instruction-level rounding control for floating-point operations.

Daniel Kusswurm

Chapter 13. AVX-512 Programming – Floating-Point

Abstract

In previous chapters, you learned how to carry out scalar and packed floating-point operations using the AVX and AVX2 instruction sets. In this chapter, you learn how to perform these operations using the AVX-512 instruction set. The first part of this chapter contains source code examples that illustrate basic AVX-512 programming concepts using scalar floating-point operands. This includes examples that illustrate conditional executions, merge and zero masking, and instruction-level rounding. The second part of this chapter demonstrates how to use the AVX-512 instruction set to carry out packed floating-point calculations using 512-bit wide operands and the ZMM register set.

Daniel Kusswurm
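
The difference between merge masking and zero masking, both named in the Chapter 13 abstract, can be emulated in portable C++. The sketch below is a scalar model of the semantics only, assuming one mask bit per element and at most 16 single-precision elements (the number that fit in a 512-bit register); the function names are hypothetical and no AVX-512 instructions are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Merge masking: elements whose mask bit is 0 keep the old destination value.
std::vector<float> add_merge_masked(const std::vector<float>& dst, std::uint16_t mask,
                                    const std::vector<float>& a,
                                    const std::vector<float>& b) {
    assert(a.size() == b.size() && a.size() == dst.size() && a.size() <= 16);
    std::vector<float> result = dst;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (mask & (1u << i)) {
            result[i] = a[i] + b[i];
        }
    }
    return result;
}

// Zero masking: elements whose mask bit is 0 are set to zero.
std::vector<float> add_zero_masked(std::uint16_t mask,
                                   const std::vector<float>& a,
                                   const std::vector<float>& b) {
    assert(a.size() == b.size() && a.size() <= 16);
    std::vector<float> result(a.size(), 0.0f);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (mask & (1u << i)) {
            result[i] = a[i] + b[i];
        }
    }
    return result;
}
```

With a mask of 0x5, for example, only elements 0 and 2 receive the sum; the other elements either retain their previous contents (merge) or become zero (zero masking).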

Chapter 14. AVX-512 Programming – Packed Integers

Abstract

In Chapters 7 and 10, you learned how to use the AVX and AVX2 instruction sets to perform packed integer operations using 128-bit and 256-bit wide operands. In this chapter, you learn how to use the AVX-512 instruction set to carry out packed integer operations using 512-bit wide operands. You also learn how to use AVX-512 instructions with 256-bit and 128-bit wide packed integer operands. The first source code example explains how to perform basic packed integer arithmetic using ZMM registers. This is followed by several examples that exemplify image-processing algorithms and techniques using AVX-512 instructions. Like the previous chapter, all of the source code examples in this chapter require a processor and operating system that support AVX-512 and the following instruction set extensions: AVX512F, AVX512CD, AVX512BW, AVX512DQ, and AVX512VL. You can use one of the freely available utilities listed in Appendix A to determine whether your system supports these extensions.

Daniel Kusswurm

Chapter 15. Optimization Strategies and Techniques

Abstract

In the preceding chapters, you learned the fundamentals of x86-64 assembly language programming. You also learned how to use the computational resources of Advanced Vector Extensions to perform SIMD operations. To maximize the performance of your x86 assembly language code, it is often necessary to understand important details about the inner workings of an x86 processor. In this chapter, you’ll explore the internal hardware components of a modern x86 multi-core processor and its underlying microarchitecture. You’ll also learn how to apply specific coding strategies and techniques to boost the performance of your x86-64 assembly language code.

Daniel Kusswurm

Chapter 16. Advanced Programming

Abstract

The final chapter of this book reviews several source code examples that demonstrate advanced x86 assembly language programming techniques. The first example explains how to use the cpuid instruction to detect specific x86 instruction set extensions. This is followed by two examples that illustrate how to accelerate SIMD processing functions using non-temporal memory stores and data prefetch instructions. The concluding example elucidates the use of an assembly language calculating function in a multithreaded application.

Daniel Kusswurm
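
Detecting instruction set extensions, the first topic in the Chapter 16 abstract, can also be approximated from C++ without writing the cpuid instruction by hand. The sketch below assumes GCC or Clang on an x86 target and relies on the compiler builtin `__builtin_cpu_supports`; on any other compiler or architecture it simply reports every feature as absent. The `CpuFeatures` struct and `query_cpu_features` function are hypothetical names, and this is not the approach the book takes.

```cpp
// Hypothetical result type; the field names are illustrative only.
struct CpuFeatures {
    bool avx = false;
    bool avx2 = false;
    bool avx512f = false;
};

// Queries a few feature flags through the GCC/Clang builtin when available.
// Assumption: on non-x86 targets or other compilers, all flags stay false.
CpuFeatures query_cpu_features() {
    CpuFeatures features;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    features.avx = __builtin_cpu_supports("avx") != 0;
    features.avx2 = __builtin_cpu_supports("avx2") != 0;
    features.avx512f = __builtin_cpu_supports("avx512f") != 0;
#endif
    return features;
}
```

A caller would typically run this check once at startup and select a scalar or SIMD code path based on the result.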

Backmatter

Title: Modern X86 Assembly Language Programming
Author: Daniel Kusswurm
Publisher: Apress
Electronic ISBN: 978-1-4842-4063-2
Print ISBN: 978-1-4842-4062-5
DOI: https://doi.org/10.1007/978-1-4842-4063-2
