
Intel Architecture MMX Technology Programmer's Reference Manual
Chapter 2
INTEL ARCHITECTURE MMX TECHNOLOGY FEATURES
This chapter provides a general overview of the architectural features of the Intel Architecture MMX technology.
MMX technology provides the following new features, while maintaining backward compatibility with all existing Intel Architecture microprocessors, IA applications, and operating systems.
- New data types
- Eight MMX registers
- Enhanced instruction set
Applications that use these new features of MMX technology can achieve enhanced performance.
The principal data type of the IA MMX technology is the packed fixed-point integer. The decimal point of the fixed-point values is implicit and is left for the user to control for maximum flexibility.
The IA MMX technology defines the following four new 64-bit data types (See Figure 2-1):
- Packed byte: eight bytes packed into one 64-bit quantity
- Packed word: four words packed into one 64-bit quantity
- Packed doubleword: two doublewords packed into one 64-bit quantity
- Quadword: one 64-bit quantity

Figure 2-1. Packed Data Types
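The four data types are different views of the same 64 bits. The following is a minimal Python sketch of that idea, assuming a 64-bit quantity is modeled as a Python integer and that element 0 occupies the least significant bits. The helper names `unpack_lanes` and `pack_lanes` are illustrative and are not part of the MMX technology.

```python
def unpack_lanes(value, lane_bits):
    """Split a 64-bit integer into unsigned elements; element 0 is least significant."""
    lane_mask = (1 << lane_bits) - 1
    return [(value >> shift) & lane_mask for shift in range(0, 64, lane_bits)]


def pack_lanes(lanes, lane_bits):
    """Inverse of unpack_lanes: rebuild the 64-bit integer from its elements."""
    lane_mask = (1 << lane_bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & lane_mask) << (index * lane_bits)
    return value


quadword = 0x0807060504030201
packed_bytes = unpack_lanes(quadword, 8)         # eight 8-bit elements
packed_words = unpack_lanes(quadword, 16)        # four 16-bit elements
packed_doublewords = unpack_lanes(quadword, 32)  # two 32-bit elements
```

Packing any of the three views back with `pack_lanes` reproduces the original quadword, since no bits are changed by the reinterpretation.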
The IA MMX technology provides eight 64-bit, general-purpose registers. These registers are aliased on the floating-point registers. The operating system handles the MMX technology as it would handle floating-point. (See Section 4.3 for more details on register aliasing.)
The MMX registers can hold packed 64-bit data types. The MMX instructions access the MMX registers directly using the register names MM0 to MM7 (See Figure 2-2).
MMX registers can be used to perform calculations on data. They cannot be used to address memory; addressing is accomplished by using the integer registers and standard IA addressing modes.

Figure 2-2. MMX Register Set
The IA MMX instruction set supplies a rich set of instructions that operate on all data elements of a packed data type, in parallel. The MMX instructions can operate on either signed or unsigned data elements.
The MMX instructions implement two new principles, both discussed below:
- Operations on packed data
- Saturation arithmetic
The MMX instructions can operate on groups of eight bytes, four words, and two doublewords. These groups of 64 bits are referred to as packed data. The same 64 bits of data can be treated as any one of the packed data types. Data is cast by the type specified by the instruction.
For example, the PADDB (Add Packed Bytes) instruction adds two groups of eight packed bytes. The PADDW (Add Packed Words) instruction could operate on the same 64 bits as the PADDB instruction, treating them as four 16-bit words.
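The difference between the byte and word interpretations shows up in where carries stop. The sketch below emulates a wraparound packed add in Python, assuming registers are modeled as 64-bit integers with element 0 in the least significant bits. The function `padd_wrap` is a hypothetical helper; its `lane_bits` argument stands in for the choice between the byte and word forms of the instruction.

```python
def padd_wrap(dest, src, lane_bits):
    """Add corresponding elements; a carry out of an element is discarded."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 64, lane_bits):
        a = (dest >> shift) & lane_mask
        b = (src >> shift) & lane_mask
        result |= ((a + b) & lane_mask) << shift
    return result


a = 0x00FF00FF00FF00FF
b = 0x0001000100010001
sum_as_bytes = padd_wrap(a, b, 8)   # each 0xFF + 0x01 wraps to 0x00 within its byte
sum_as_words = padd_wrap(a, b, 16)  # each 0x00FF + 0x0001 gives 0x0100 within its word
```

The same two operands give different results because the byte form never propagates a carry from one byte into the next.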
The MMX technology supports a new arithmetic capability known as saturating arithmetic. Saturation is best defined by contrasting it with wraparound mode.
In wraparound mode, results that overflow or underflow are truncated and only the lower (least significant) bits of the result are returned. That is, the carry is ignored.
In saturation mode, results of an operation that overflow or underflow are clipped (saturated) to a data-range limit for the data type (see Table 2-1). The result of an operation that exceeds the range of a data-type saturates to the maximum value of the range. A result that is less than the range of a data type saturates to the minimum value of the range. This is useful in many cases, such as color calculations.
For example, when the result exceeds the data range limit for signed bytes, it is saturated to 0x7F (0xFF for unsigned bytes). If a value is less than the data range limit, it is saturated to 0x80 for signed bytes (0x00 for unsigned bytes).
Saturation provides a useful feature of avoiding wraparound artifacts. In the example of color calculations, saturation causes a color to remain pure black or pure white without allowing for an inversion.
Table 2-1. Data Range Limits for Saturation
| Data type | Lower limit (hexadecimal) | Lower limit (decimal) | Upper limit (hexadecimal) | Upper limit (decimal) |
|---|---|---|---|---|
| Signed byte | 80H | -128 | 7FH | 127 |
| Signed word | 8000H | -32,768 | 7FFFH | 32,767 |
| Unsigned byte | 00H | 0 | FFH | 255 |
| Unsigned word | 0000H | 0 | FFFFH | 65,535 |
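To make the contrast between the two modes concrete, here is a small Python sketch that applies the limits of Table 2-1 to a single element. It is an illustration only: the names `LIMITS`, `saturate`, and `wrap` are assumptions introduced here, and the arithmetic is done on ordinary Python integers rather than on MMX registers.

```python
# Data range limits from Table 2-1, keyed by (signedness, width in bits).
LIMITS = {
    ("signed", 8): (-128, 127),
    ("signed", 16): (-32768, 32767),
    ("unsigned", 8): (0, 255),
    ("unsigned", 16): (0, 65535),
}


def saturate(value, signedness, bits):
    """Clip an out-of-range result to the nearest limit of the data type."""
    low, high = LIMITS[(signedness, bits)]
    return max(low, min(high, value))


def wrap(value, signedness, bits):
    """Keep only the low-order bits of the result, as wraparound mode does."""
    raw = value & ((1 << bits) - 1)
    if signedness == "signed" and raw >= 1 << (bits - 1):
        raw -= 1 << bits  # reinterpret the bit pattern as two's complement
    return raw


# Brightening an unsigned 8-bit color component of 200 by 100:
wrapped_pixel = wrap(200 + 100, "unsigned", 8)        # 44, a dark value
saturated_pixel = saturate(200 + 100, "unsigned", 8)  # 255, stays at full intensity
```

In the color example the wrapped result turns a bright component dark, which is the inversion artifact that saturation avoids.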
MMX instructions do not indicate overflow or underflow occurrence by generating exceptions or setting flags.
This section provides an overview of the MMX instruction groups. See Chapter 5 for detailed information on the instructions, including information on encoding, operation, and exceptions. The fifty-seven new MMX instructions are grouped into these categories:
- Arithmetic Instructions
- Comparison Instructions
- Conversion Instructions
- Logical Instructions
- Shift Instructions
- Data Transfer Instructions
- Empty MMX State (EMMS) Instruction
Packed Addition and Subtraction
The PADD (Packed Add) and PSUB (Packed Subtract) instructions add or subtract the signed or unsigned data elements of the source operand to or from the destination operand in wraparound mode. These instructions support packed byte, packed word, and packed doubleword data types.
The PADDS (Packed Add with Saturation) and PSUBS (Packed Subtract with Saturation) instructions add or subtract the signed data elements of the source operand to or from the signed data elements of the destination operand and saturate the result to the limits of the signed data-type range. These instructions support packed byte and packed word data types.
The PADDUS (Packed Add Unsigned with Saturation) and PSUBUS (Packed Subtract Unsigned with Saturation) instructions add or subtract the unsigned data elements of the source operand to or from the unsigned data elements of the destination operand and saturate the result to the limits of the unsigned data-type range. These instructions support packed byte and packed word data types.
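The saturating forms can be sketched the same way as the wraparound add, with a clamp in place of truncation. The Python below is an emulation under stated assumptions: registers are 64-bit integers with element 0 in the least significant bits, and `packed_add_saturate` is a hypothetical helper whose `signed` flag selects between PADDS-style and PADDUS-style behavior.

```python
def to_signed(raw, bits):
    """Reinterpret an unsigned bit pattern as a two's complement value."""
    return raw - (1 << bits) if raw >= 1 << (bits - 1) else raw


def packed_add_saturate(dest, src, lane_bits, signed):
    """Add corresponding elements and clamp each sum to the data-type range."""
    lane_mask = (1 << lane_bits) - 1
    if signed:
        low, high = -(1 << (lane_bits - 1)), (1 << (lane_bits - 1)) - 1
    else:
        low, high = 0, lane_mask
    result = 0
    for shift in range(0, 64, lane_bits):
        a = (dest >> shift) & lane_mask
        b = (src >> shift) & lane_mask
        if signed:
            a, b = to_signed(a, lane_bits), to_signed(b, lane_bits)
        total = max(low, min(high, a + b))
        result |= (total & lane_mask) << shift
    return result


# Unsigned bytes: 0xF0 + 0x20 clamps to 0xFF, while 0x10 + 0x20 fits and gives 0x30.
unsigned_bytes = packed_add_saturate(0x10F0, 0x2020, 8, signed=False)
# Signed words: -32768 + (-1) clamps to 0x8000, and 32767 + 1 clamps to 0x7FFF.
signed_words = packed_add_saturate(0x7FFF8000, 0x0001FFFF, 16, signed=True)
```

A saturating subtract would follow the same pattern with `a - b` in place of `a + b`.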
Packed Multiplication
Packed multiplication instructions perform four multiplications on pairs of signed 16-bit operands, producing 32-bit intermediate results. Users may choose the low-order or high-order parts of each 32-bit result.
The PMULHW (Packed Multiply High) and PMULLW (Packed Multiply Low) instructions multiply the signed words of the source and destination operands and write the high-order or low-order 16 bits of each of the results to the destination operand.
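As an illustration of how the high and low halves relate to the 32-bit intermediate product, the following Python sketch computes both for each of the four word pairs. It assumes registers are modeled as 64-bit integers with element 0 in the least significant bits; `pmul_words` is an illustrative name that returns both halves at once, whereas each real instruction produces only one of them.

```python
def to_signed16(raw):
    """Reinterpret a 16-bit pattern as a two's complement value."""
    return raw - 0x10000 if raw & 0x8000 else raw


def pmul_words(dest, src):
    """Return (high, low): the upper and lower 16 bits of each 32-bit product."""
    high = 0
    low = 0
    for shift in range(0, 64, 16):
        a = to_signed16((dest >> shift) & 0xFFFF)
        b = to_signed16((src >> shift) & 0xFFFF)
        product = (a * b) & 0xFFFFFFFF  # 32-bit two's complement product
        low |= (product & 0xFFFF) << shift
        high |= (product >> 16) << shift
    return high, low


# Words, element 0 first: dest = [4096, -1, 2, 0], src = [4096, 2, 3, 5]
high_halves, low_halves = pmul_words(0x00000002FFFF1000, 0x0005000300021000)
```

Element 0 shows why both halves matter: 4096 * 4096 is 0x01000000, so the low half is zero and the whole result lives in the high half.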
Packed Multiply Add
The PMADDWD (Packed Multiply and Add) instruction calculates the products of the signed words of the source and destination operands. The four intermediate 32-bit doubleword products are summed in pairs to produce two 32-bit doubleword results.
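The pairwise summation is easiest to see in code. This Python sketch emulates the operation described above, assuming registers are 64-bit integers with element 0 in the least significant bits. The function name `pmaddwd` mirrors the instruction mnemonic for readability only; it is an emulation, not an interface to the hardware instruction.

```python
def to_signed16(raw):
    """Reinterpret a 16-bit pattern as a two's complement value."""
    return raw - 0x10000 if raw & 0x8000 else raw


def pmaddwd(dest, src):
    """Multiply four signed word pairs, then sum adjacent products into two doublewords."""
    result = 0
    for pair in range(2):
        total = 0
        for lane in (2 * pair, 2 * pair + 1):
            a = to_signed16((dest >> (16 * lane)) & 0xFFFF)
            b = to_signed16((src >> (16 * lane)) & 0xFFFF)
            total += a * b
        result |= (total & 0xFFFFFFFF) << (32 * pair)
    return result


# Words, element 0 first: dest = [1, 2, 3, 4], src = [5, 6, 7, -8]
# Low doubleword:  1*5 + 2*6    = 17
# High doubleword: 3*7 + 4*(-8) = -11
dot_pairs = pmaddwd(0x0004000300020001, 0xFFF8000700060005)
```

Adding the two doubleword results together gives the dot product of the two four-element vectors, which is a common reason to use this operation.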
The PCMPEQ (Packed Compare for Equal) and PCMPGT (Packed Compare for Greater Than) instructions compare the corresponding data elements in the source and destination operands for equality or value greater than, respectively. These instructions generate a mask of ones or zeros that is written to the destination operand: each element is set to all ones where the comparison is true and to all zeros where it is false. Logical operations can use the mask to select elements. This can be used to implement a packed conditional move operation without a branch or a set of branch instructions. No flags are set.
These instructions support packed byte, packed word and packed doubleword data types.
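One way to picture the branch-free conditional move is a packed maximum built from a compare mask and the logical operations. The Python sketch below assumes registers are 64-bit integers with element 0 in the least significant bits; `pcmpgtw` emulates a signed greater-than compare on words, and `packed_max_words` is a hypothetical helper that combines the mask with AND, AND NOT, and OR.

```python
MASK64 = (1 << 64) - 1


def to_signed16(raw):
    """Reinterpret a 16-bit pattern as a two's complement value."""
    return raw - 0x10000 if raw & 0x8000 else raw


def pcmpgtw(dest, src):
    """Set a word to all ones where dest > src (signed), else to all zeros."""
    mask = 0
    for shift in range(0, 64, 16):
        a = to_signed16((dest >> shift) & 0xFFFF)
        b = to_signed16((src >> shift) & 0xFFFF)
        if a > b:
            mask |= 0xFFFF << shift
    return mask


def packed_max_words(a, b):
    """Select a where a > b and b elsewhere, with no per-element branch on the data path."""
    mask = pcmpgtw(a, b)
    return (a & mask) | (b & ~mask & MASK64)


# Words, element 0 first: a = [5, -3, 100, 0], b = [7, -4, 50, 0]
a = 0x00000064FFFD0005
b = 0x00000032FFFC0007
greater_mask = pcmpgtw(a, b)
maxima = packed_max_words(a, b)  # words [7, -3, 100, 0]
```

The Python loop in `pcmpgtw` does branch, but only because it emulates the compare; the selection step itself is pure bitwise logic, which is the part the text describes.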
Pack and Unpack
The Pack and Unpack instructions perform conversions between the packed data types.
The PACKSS (Pack with Signed Saturation) instruction converts signed words into signed bytes or signed doublewords into signed words, in signed saturation mode.
The PACKUS (Pack with Unsigned Saturation) instruction converts signed words into unsigned bytes, in unsigned saturation mode.
The PUNPCKH (Unpack High Packed Data) and PUNPCKL (Unpack Low Packed Data) instructions convert bytes to words, words to doublewords, or doublewords to quadwords by interleaving elements from the high or low half of the source and destination operands.
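The two conversion directions can be sketched for the byte and word case as follows. This is a Python emulation under assumptions: registers are 64-bit integers with element 0 in the least significant bits, and the functions `packuswb` and `punpcklbw` borrow the mnemonics of the word-to-byte pack and the low byte unpack purely as descriptive names.

```python
def to_signed16(raw):
    """Reinterpret a 16-bit pattern as a two's complement value."""
    return raw - 0x10000 if raw & 0x8000 else raw


def packuswb(dest, src):
    """Narrow eight signed words to unsigned bytes: dest fills the low half, src the high half."""
    result = 0
    for index in range(8):
        operand = dest if index < 4 else src
        word = to_signed16((operand >> (16 * (index % 4))) & 0xFFFF)
        result |= max(0, min(255, word)) << (8 * index)  # unsigned saturation
    return result


def punpcklbw(dest, src):
    """Interleave the low four bytes of dest and src; dest bytes land in the even positions."""
    result = 0
    for index in range(4):
        d = (dest >> (8 * index)) & 0xFF
        s = (src >> (8 * index)) & 0xFF
        result |= d << (16 * index)
        result |= s << (16 * index + 8)
    return result


# Unpacking against zero widens unsigned bytes to words.
widened = punpcklbw(0x04030201, 0)
# Words, element 0 first: dest = [300, -5, 128, 0], src = [255, 256, 1, 32767]
narrowed = packuswb(0x00000080FFFB012C, 0x7FFF0001010000FF)
```

Unpacking with a zero source and later packing with unsigned saturation is a common round trip: widen bytes to words, compute with headroom, then narrow with clamping.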
The PAND (Bitwise Logical And), PANDN (Bitwise Logical And Not), POR (Bitwise Logical OR), and PXOR (Bitwise Logical Exclusive OR) instructions perform bitwise logical operations on 64-bit quantities.
The logical shift left, logical shift right and arithmetic shift right instructions shift each element by a specified number of bits. The logical left and right shifts also enable a 64-bit quantity (quadword) to be shifted as one block, assisting in data type conversions and alignment operations.
The PSLL (Packed Shift Left Logical) and PSRL (Packed Shift Right Logical) instructions perform a logical left or right shift, and fill the empty high or low order bit positions with zeros. These instructions support packed word, packed doubleword, and quadword data types.
The PSRA (Packed Shift Right Arithmetic) instruction performs an arithmetic right shift, copying the sign bit into empty bit positions on the upper end of the operand. This instruction supports packed word and packed doubleword data types.
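The logical and arithmetic right shifts differ only in what fills the vacated high-order bits. A minimal Python sketch for packed words follows, assuming registers are 64-bit integers with element 0 in the least significant bits; `psrlw` and `psraw` are emulations named after the word forms of the instructions.

```python
def psrlw(value, count):
    """Logical right shift of each word: vacated high bits are filled with zeros."""
    result = 0
    for shift in range(0, 64, 16):
        word = (value >> shift) & 0xFFFF
        result |= (word >> count) << shift
    return result


def psraw(value, count):
    """Arithmetic right shift of each word: vacated high bits copy the sign bit."""
    result = 0
    for shift in range(0, 64, 16):
        word = (value >> shift) & 0xFFFF
        if word & 0x8000:
            word -= 0x10000  # reinterpret as a negative two's complement value
        result |= ((word >> count) & 0xFFFF) << shift
    return result


# Words, element 0 first: [0xFFF0 (-16), 0x0010, 0x8000 (-32768), 0x7FFF]
value = 0x7FFF80000010FFF0
logical = psrlw(value, 4)     # words [0x0FFF, 0x0001, 0x0800, 0x07FF]
arithmetic = psraw(value, 4)  # words [0xFFFF, 0x0001, 0xF800, 0x07FF]
```

Non-negative words give identical results under both shifts; only the words with the sign bit set differ.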
The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data from memory to MMX registers and vice versa, or from integer registers to MMX registers and vice versa.
The MOVQ (Move 64 Bits) instruction transfers 64 bits of packed data from memory to MMX registers and vice versa, or transfers data between MMX registers.
The EMMS instruction empties the MMX state. This instruction must be used to clear the IA MMX state (empty the floating-point tag word) at the end of an MMX routine before calling other routines that can execute floating-point instructions.
All MMX instructions, except the EMMS instruction, reference and operate on two operands: the source and destination operands. The right operand is the source and the left operand is the destination. The destination operand may also be a second source operand for the operation. The instruction overwrites the destination operand with the result.
For example, a two-operand instruction would be decoded as:
DEST (left operand) ← DEST (left operand) OP SRC (right operand)
The source operand for all the MMX instructions (except the data transfer instructions) can reside either in memory or in an MMX register. The destination operand resides in an MMX register.
For data transfer instructions, the source and destination operands can also be an integer register (for the MOVD instruction) or memory location (for both the MOVD and MOVQ instructions).
The IA MMX state is aliased upon the IA floating-point state. No new state or mode is added to support the MMX technology. The same floating-point instructions that save and restore the floating-point state also handle the IA MMX state (for example, during context switching).
MMX technology uses the same interface techniques between the floating-point architecture and the operating system (primarily for task switching purposes). For more detail, see Section 4.1.
All rights reserved by Fan Yipeng.