Modern X86 Assembly Language Programming: Covers X86 64-bit, AVX, AVX2, and AVX-512 [3 ed.] 1484296028, 9781484296028, 9781484296035

This book is an instructional text that will teach you how to code x86-64 assembly language functions. It also explains

297 169 3MB

English Pages 700 [688] Year 2023

Report DMCA / Copyright

DOWNLOAD FILE

Modern X86 Assembly Language Programming: Covers X86 64-bit, AVX, AVX2, and AVX-512 [3 ed.]
 1484296028, 9781484296028, 9781484296035

  • Commentary
  • Publisher PDF

Table of contents :
Table of Contents
About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: X86-64 Core Architecture
Historical Overview
Data Types
Fundamental Data Types
Numerical Data Types
SIMD Data Types
Miscellaneous Data Types
X86-64 Processor Architecture
General-Purpose Registers
Instruction Pointer
RFLAGS Register
Floating-Point and SIMD Registers
MXCSR Register
Instruction Operands
Memory Addressing
Condition Codes
Differences Between X86-64 and X86-32
Legacy Instruction Sets
Summary
Chapter 2: X86-64 Core Programming – Part 1
Source Code Overview
Assembler Basics
Integer Arithmetic
Integer Addition and Subtraction – 32-Bit
Bitwise Logical Operations
Shift Operations
Integer Addition and Subtraction – 64-bit
Integer Multiplication and Division
Summary
Chapter 3: X86-64 Core Programming – Part 2
Simple Stack Arguments
Mixed-Type Integer Arithmetic
Memory Addressing Modes
Condition Codes
Assembly Language For-Loops
Summary
Chapter 4: X86-64 Core Programming – Part 3
Arrays
One-Dimensional Arrays
Multiple One-Dimensional Arrays
Two-Dimensional Arrays
Strings
Counting Characters
Array Compare
Array Copy and Fill
Array Reversal
Assembly Language Structures
Summary
Chapter 5: AVX Programming – Scalar Floating-Point
Floating-Point Programming Concepts
Scalar Floating-Point Registers
Single-Precision Floating-Point Arithmetic
Temperature Conversions
Cone Volume/Surface Area Calculation
Double-Precision Floating-Point Arithmetic
Floating-Point Comparisons and Conversions
Floating-Point Comparisons
Floating-Point Conversions
Floating-Point Arrays
Summary
Chapter 6: Run-Time Calling Conventions
Calling Convention Overview
Calling Convention Requirements for Visual C++
Stack Frames
Using Non-volatile General-Purpose Registers
Using Non-volatile XMM Registers
Calling External Functions
Calling Convention Requirements for GNU C++
Stack Arguments
Using Non-volatile General-Purpose Registers and Stack Frames
Calling External Functions
Summary
Chapter 7: Introduction to X86-AVX SIMD Programming
SIMD Programming Concepts
What Is SIMD?
SIMD Integer Arithmetic
Wraparound vs. Saturated Arithmetic
SIMD Floating-Point Arithmetic
SIMD Data Manipulation Operations
X86-AVX Overview
AVX/AVX2 SIMD Architecture Overview
SIMD Registers
SIMD Data Types
Instruction Syntax
Differences Between X86-SSE and X86-AVX
Summary
Chapter 8: AVX Programming – Packed Integers
Integer Arithmetic
Addition and Subtraction
Multiplication
Bitwise Logical Operations
Arithmetic and Logical Shifts
Integer Arrays
Pixel Minimum and Maximum
Pixel Mean
Summary
Chapter 9: AVX Programming – Packed Floating-Point
Packed Floating-Point Arithmetic
Elementary Operations
Packed Comparisons
Packed Conversions
Packed Floating-Point Arithmetic – Arrays
Mean and Standard Deviation
Distance Calculations
Packed Floating-Point Arithmetic – Matrices
Column Means
Summary
Chapter 10: AVX2 Programming – Packed Integers
Integer Arithmetic
Elementary Operations
Size Promotions
Image Processing
Pixel Clipping
Benchmarking
RGB to Grayscale
Pixel Conversions
Image Histogram
Summary
Chapter 11: AVX2 Programming – Packed Floating-Point – Part 1
Floating-Point Arrays
Least Squares
Floating-Point Matrices
Matrix Multiplication
Single-Precision
Double-Precision
Matrix (4 × 4) Multiplication
Single-Precision
Double-Precision
Matrix (4 × 4) Vector Multiplication
Single-Precision
Double-Precision
Covariance Matrix
Summary
Chapter 12: AVX2 Programming – Packed Floating-Point – Part 2
Matrix Inversion
Single-Precision
Double-Precision
Signal Processing – Convolutions
1D Convolution Arithmetic
1D Convolution Using Variable-Size Kernel
Single-Precision
Double-Precision
1D Convolution Using Fixed-Size Kernel
Single-Precision
Double-Precision
Summary
Chapter 13: AVX-512 Programming – Packed Integers
AVX-512 Overview
Execution Environment
Merge Masking and Zero Masking
Embedded Broadcasts
Instruction-Level Rounding
Integer Arithmetic
Elementary Operations
Masked Operations
Image Processing
Image Thresholding
Image Statistics
Image Histogram
Summary
Chapter 14: AVX-512 Programming – Packed Floating-Point – Part 1
Floating-Point Arithmetic
Elementary Operations
Packed Comparisons
Floating-Point Arrays
Floating-Point Matrices
Covariance Matrix
Matrix Multiplication
Single-Precision
Double-Precision
Matrix (4 × 4) Vector Multiplication
Single-Precision
Double-Precision
Summary
Chapter 15: AVX-512 Programming – Packed Floating-Point – Part 2
Signal Processing
1D Convolution Using Variable-Size Kernel
Single-Precision
Double-Precision
1D Convolution Using Fixed-Size Kernel
Single-Precision
Double-Precision
Summary
Chapter 16: Advanced Assembly Language Programming
CPUID Instruction
Processor Vendor Information
X86-AVX Detection
Non-temporal Memory Stores
Integer Non-temporal Memory Stores
Floating-Point Non-temporal Memory Stores
SIMD Text Processing
Summary
Chapter 17: Assembly Language Optimization and Development Guidelines
Assembly Language Optimization Guidelines
Basic Techniques
Floating-Point Arithmetic
Branch Instructions
Branch Prediction Unit
Data Alignment
SIMD Techniques
Assembly Language Development Guidelines
Identify Functions for x86-64 Assembly Language Coding
Select Target x86-AVX Instruction Set
Establish Benchmark Timing Objectives
Code x86-64 Assembly Language Functions
Benchmark x86-64 Assembly Language Code
Optimize x86-64 Assembly Language Code
Repeat Benchmarking and Optimization Steps
Summary
Appendix A: Source Code and Development Tools
Source Code Download and Setup
Source Code Development Tools
Windows Development Tools
Running a Source Code Example
Creating a Visual Studio C++ Project
Create a C++ Project
Add an Assembly Language File
Set Project Properties
Edit the Source Code
Build and Run the Project
Linux Development Tools
Additional Configuration
Build and Run
Make Utility
Appendix B: References and Resources
X86 Assembly Language Programming References
Algorithm References
C++ References
Software Development Tools
Miscellaneous Utilities, Tools, and Libraries
Index

Polecaj historie