Numeric Type Conversion and Related Instructions

beyes · posted 2010-1-4 14:27:22

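The IA-32 instruction set includes many instructions that convert data from one data type to another. Programs often need to convert floating-point data to integer values, or the reverse. These instructions provide a convenient way to do so without writing your own conversion algorithm.

There are many conversion instructions because there are many data types to convert between. The following table lists them: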

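Instruction	Conversion
CVTDQ2PD	Packed doubleword integers to packed double-precision FP (XMM)
CVTDQ2PS	Packed doubleword integers to packed single-precision FP (XMM)
CVTPD2DQ	Packed double-precision FP to packed doubleword integers (XMM)
CVTPD2PI	Packed double-precision FP to packed doubleword integers (MMX)
CVTPD2PS	Packed double-precision FP to packed single-precision FP (XMM)
CVTPI2PD	Packed doubleword integers to packed double-precision FP (XMM)
CVTPI2PS	Packed doubleword integers to packed single-precision FP (XMM)
CVTPS2DQ	Packed single-precision FP to packed doubleword integers (XMM)
CVTPS2PD	Packed single-precision FP to packed double-precision FP (XMM)
CVTPS2PI	Packed single-precision FP to packed doubleword integers (MMX)
CVTTPD2PI	Packed double-precision FP to packed doubleword integers (MMX, truncated)
CVTTPD2DQ	Packed double-precision FP to packed doubleword integers (XMM, truncated)
CVTTPS2DQ	Packed single-precision FP to packed doubleword integers (XMM, truncated)
CVTTPS2PI	Packed single-precision FP to packed doubleword integers (MMX, truncated)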

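In the Conversion column, the parentheses name the destination register that receives the result, which is either an MMX or an XMM register. The last four instructions are truncating conversions. For the other instructions, when the conversion is inexact, the result is rounded under the control of bits 13 and 14 of the MXCSR register. These bits select the rounding mode: to nearest (the default), down, up, or toward zero. A truncating conversion always rounds toward zero, regardless of these bits.
The source value can come from a memory location, an MMX register (for 64-bit values), or an XMM register (for 64-bit or 128-bit values).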
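
To make the role of MXCSR bits 13 and 14 concrete, here is a minimal C sketch that runs the same conversion under each rounding mode. It uses compiler intrinsics rather than hand-written assembly: _mm_cvtps_epi32 corresponds to cvtps2dq, and _mm_getcsr / _mm_setcsr read and write MXCSR. The helper name convert_with_mode and the RC_* constants are made up for this illustration, and the code assumes an x86 compiler with SSE2 enabled.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _mm_set1_ps */
#include <emmintrin.h>  /* _mm_cvtps_epi32, _mm_cvtsi128_si32 (SSE2) */

/* Rounding-control field of MXCSR (bits 13 and 14). Names are illustrative. */
#define RC_MASK    0x6000u
#define RC_NEAREST 0x0000u  /* both bits clear: round to nearest (even) */
#define RC_DOWN    0x2000u  /* bit 13 set: round toward -infinity */
#define RC_UP      0x4000u  /* bit 14 set: round toward +infinity */
#define RC_ZERO    0x6000u  /* both bits set: round toward zero */

/* Hypothetical helper: convert x with cvtps2dq under the given mode,
 * then restore the caller's MXCSR. */
int convert_with_mode(float x, unsigned int rc)
{
    unsigned int saved = _mm_getcsr();
    volatile float in = x;   /* volatile keeps the conversion from being */
    volatile int out;        /* folded or moved across the MXCSR writes  */

    _mm_setcsr((saved & ~RC_MASK) | rc);
    out = _mm_cvtsi128_si32(_mm_cvtps_epi32(_mm_set1_ps(in)));
    _mm_setcsr(saved);
    return out;
}
```

With this sketch, convert_with_mode(124.79f, RC_DOWN) yields 124 while RC_NEAREST and RC_UP yield 125. For the tie -312.5f, RC_NEAREST yields -312 (the even neighbor) and RC_DOWN yields -313.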

Here is a test program:

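.section .data
        .align 16                       # SSE memory operands must be 16-byte aligned
value1:
        .float 1.25, 124.79, 200.0, -312.5
value2:
        .int 1, -435, 0, -25

.section .bss
        .lcomm data, 16

.section .text
.global _start
_start:
        nop
        cvtps2dq value1, %xmm0          # floats -> dwords, rounded per MXCSR
        cvttps2dq value1, %xmm1         # floats -> dwords, truncated toward zero
        cvtdq2ps value2, %xmm2          # dwords -> floats
        movdqu    %xmm0, data           # unaligned store of the rounded result

        movl $1, %eax                   # sys_exit
        movl $0, %ebx                   # exit status 0
        int $0x80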

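In this program, value1 holds a packed single-precision floating-point value (four floats) and value2 holds a packed doubleword integer value (four 32-bit integers).
After executing the first instruction (the one following nop), examine the xmm0 register: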

(gdb) print $xmm0
$1 = {v4_float = {1.40129846e-45, 1.75162308e-43, 2.80259693e-43, -nan(0x7ffec8)}, v2_double = {
    2.6524947387115311e-312, -nan(0xffec8000000c8)}, v16_int8 = {1, 0, 0, 0, 125, 0, 0, 0, -56, 0,
    0, 0, -56, -2, -1, -1}, v8_int16 = {1, 0, 125, 0, 200, 0, -312, -1}, v4_int32 = {1, 125, 200,
    -312}, v2_int64 = {536870912001, -1340029796152}, uint128 = 0xfffffec8000000c80000007d00000001}

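As shown above, the four single-precision values were converted to packed doubleword integers. This conversion uses the default rounding mode, round to nearest: 124.79 becomes 125, and the tie -312.5 goes to the even neighbor -312 rather than -313.

Step over the second instruction and look at the contents of xmm1: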

(gdb) print $xmm1
$2 = {v4_float = {1.40129846e-45, 1.7376101e-43, 2.80259693e-43, -nan(0x7ffec8)}, v2_double = {
    2.6312747808018783e-312, -nan(0xffec8000000c8)}, v16_int8 = {1, 0, 0, 0, 124, 0, 0, 0, -56, 0,
    0, 0, -56, -2, -1, -1}, v8_int16 = {1, 0, 124, 0, 200, 0, -312, -1}, v4_int32 = {1, 124, 200,
    -312}, v2_int64 = {532575944705, -1340029796152}, uint128 = 0xfffffec8000000c80000007c00000001}

cvttps2dq uses truncation, which rounds the value toward zero, so 124.79 becomes 124.

Finally, after the movdqu instruction executes, examine the memory at data:

(gdb) x/4d &data
0x80490c0 <data>: 1 125 200 -312

The output shows that the packed doubleword integers produced in xmm0 were stored correctly at the memory location data.
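
The gdb session never looks at xmm2, the result of cvtdq2ps. As a rough cross-check, the three conversions can be sketched in C with SSE2 intrinsics, which compile to the same instructions. The function name convert_demo is made up for this example, the input arrays are copied from the test program, and the rounded results assume MXCSR is in its default round-to-nearest mode.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper mirroring the test program's three conversions. */
void convert_demo(int rounded[4], int truncated[4], float as_float[4])
{
    static const float value1[4] = { 1.25f, 124.79f, 200.0f, -312.5f };
    static const int   value2[4] = { 1, -435, 0, -25 };

    /* Unaligned loads, so the arrays need no special alignment. */
    __m128  f = _mm_loadu_ps(value1);
    __m128i i = _mm_loadu_si128((const __m128i *)value2);

    _mm_storeu_si128((__m128i *)rounded,   _mm_cvtps_epi32(f));   /* cvtps2dq  */
    _mm_storeu_si128((__m128i *)truncated, _mm_cvttps_epi32(f));  /* cvttps2dq */
    _mm_storeu_ps(as_float, _mm_cvtepi32_ps(i));                  /* cvtdq2ps  */
}
```

Under those assumptions, rounded receives {1, 125, 200, -312}, truncated receives {1, 124, 200, -312}, and as_float receives {1.0, -435.0, 0.0, -25.0}, since every integer in value2 is exactly representable in single precision.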

beyes · posted 2010-1-4 15:08:39

MXCSR is a 32-bit register that holds the control and status flags for the SSE instruction set. As of SSE3, only bits 0-15 are defined, as follows:

Mnemonic	Bit Location	Description
FZ	bit 15	Flush To Zero
R+	bit 14 set, bit 13 clear	Round Positive
R-	bit 13 set, bit 14 clear	Round Negative
RZ	bits 13 and 14 both set	Round To Zero
RN	bits 13 and 14 both 0	Round To Nearest
PM	bit 12	Precision Mask
UM	bit 11	Underflow Mask
OM	bit 10	Overflow Mask
ZM	bit 9	Divide By Zero Mask
DM	bit 8	Denormal Mask
IM	bit 7	Invalid Operation Mask
DAZ	bit 6	Denormals Are Zero
PE	bit 5	Precision Flag
UE	bit 4	Underflow Flag
OE	bit 3	Overflow Flag
ZE	bit 2	Divide By Zero Flag
DE	bit 1	Denormal Flag
IE	bit 0	Invalid Operation Flag
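
The bit layout in the table can be decoded with a few shifts and masks. The following is a small C sketch; the function names are invented for this example and it works on any plain 32-bit value, so it does not touch the real register.

```c
#include <stdio.h>

/* Rounding mode selected by bits 13-14: "RN", "R-", "R+" or "RZ". */
const char *mxcsr_rounding_name(unsigned int mxcsr)
{
    static const char *const names[4] = { "RN", "R-", "R+", "RZ" };
    return names[(mxcsr >> 13) & 3u];
}

/* Value (0 or 1) of a single MXCSR bit, e.g. bit 12 for PM. */
int mxcsr_bit(unsigned int mxcsr, int bit)
{
    return (int)((mxcsr >> bit) & 1u);
}

/* Print a one-line summary of the fields listed in the table:
 * masks are bits 7-12, exception flags are bits 0-5. */
void print_mxcsr(unsigned int mxcsr)
{
    printf("FZ=%d RC=%s masks=0x%02x DAZ=%d flags=0x%02x\n",
           mxcsr_bit(mxcsr, 15), mxcsr_rounding_name(mxcsr),
           (mxcsr >> 7) & 0x3Fu, mxcsr_bit(mxcsr, 6), mxcsr & 0x3Fu);
}
```

For example, print_mxcsr(0x1F80) decodes the power-on default value of MXCSR: all six masks set, round to nearest, no flags raised.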

Explanation of the bits:
FZ mode causes all underflowing operations to simply go to zero.  This saves some processing time, but loses precision.
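
A minimal C sketch of the FZ effect, assuming an x86 compiler with SSE enabled and the underflow exception left masked (the default). The helper name mul_with_fz is hypothetical. It multiplies two floats with mulss (_mm_mul_ss) once with bit 15 clear and once with it set.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _mm_mul_ss */

#define MXCSR_FZ 0x8000u  /* bit 15: Flush To Zero */

/* Hypothetical helper: multiply a*b with FZ forced on or off,
 * then restore the caller's MXCSR. */
float mul_with_fz(float a, float b, int fz_on)
{
    unsigned int saved = _mm_getcsr();
    volatile float va = a, vb = b;  /* volatile keeps the multiply between */
    volatile float out;             /* the two MXCSR writes                */

    _mm_setcsr(fz_on ? (saved | MXCSR_FZ) : (saved & ~MXCSR_FZ));
    out = _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(va), _mm_set_ss(vb)));
    _mm_setcsr(saved);
    return out;
}
```

The product 1e-30f * 1e-10f is about 1e-40, which is below the smallest normal single-precision value. With FZ off the result is a tiny nonzero denormal; with FZ on it is flushed to zero. Results that do not underflow are unaffected.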

The R+, R-, RN, and RZ rounding modes determine how the lowest bit is generated.  Normally, RN is used.

PM, UM, OM, ZM, DM, and IM are masks that tell the processor to ignore the corresponding exceptions if they occur.  This keeps the program from having to deal with problems, but might cause invalid results.

DAZ tells the CPU to treat all Denormal inputs as zero.  A Denormal is a number so small that the FPU can't normalize it because of the limited exponent range.  Denormals behave like normal numbers, but they take considerably longer to process.  Note that not all processors support DAZ.

PE, UE, OE, ZE, DE, and IE are the exception flags, each set when the corresponding exception occurs.  Programs can check these to see if something interesting happened.  These bits are "sticky", which means that once they're set, they stay set until the program clears them.  This means that the indicated exception could have happened several operations ago, but nobody bothered to clear it.
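
The sticky behavior can be observed with a short C sketch, assuming an x86 compiler with SSE2 enabled. The helper name convert_and_read_flags is made up for this example. It optionally clears the six flag bits, performs one cvtps2dq conversion, and returns the flags afterwards.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _mm_set1_ps */
#include <emmintrin.h>  /* _mm_cvtps_epi32, _mm_cvtsi128_si32 (SSE2) */

#define MXCSR_FLAGS 0x003Fu  /* bits 0-5: IE, DE, ZE, OE, UE, PE */
#define MXCSR_PE    0x0020u  /* bit 5: Precision (inexact) Flag */

/* Hypothetical helper: convert x to an integer with cvtps2dq and return
 * the MXCSR exception flags afterwards. If clear_first is nonzero, the
 * flags are cleared before the conversion. */
unsigned int convert_and_read_flags(float x, int clear_first)
{
    volatile float in = x;  /* volatile keeps the conversion ordered */
    volatile int out;       /* relative to the MXCSR accesses        */

    if (clear_first)
        _mm_setcsr(_mm_getcsr() & ~MXCSR_FLAGS);
    out = _mm_cvtsi128_si32(_mm_cvtps_epi32(_mm_set1_ps(in)));
    (void)out;
    return _mm_getcsr() & MXCSR_FLAGS;
}
```

Converting 124.79f is inexact, so PE is set. A following exact conversion of 200.0f without clearing still reports PE, because nothing has reset it. Clearing first and converting 200.0f leaves PE at 0.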

DAZ wasn't available in the first version of SSE.  Since setting a reserved bit in MXCSR causes a general protection fault, we need to be able to check the availability of this feature without causing problems.  To do this, set up a 16-byte-aligned, 512-byte area of memory, save the SSE state into it with fxsave, and then inspect bytes 28 through 31 for the MXCSR_MASK value.  If bit 6 is set, DAZ is supported; otherwise, it isn't.
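
The check described above might look like this in C. This is a sketch only: it relies on GCC/Clang extensions (inline assembly and the aligned attribute), it assumes an x86 processor that supports fxsave, and the names fxsave_area and daz_supported are invented for the example.

```c
#include <string.h>  /* memset, memcpy */

/* 512-byte FXSAVE image; fxsave requires 16-byte alignment. */
struct fxsave_area {
    unsigned char bytes[512];
} __attribute__((aligned(16)));

/* Hypothetical helper: returns 1 if MXCSR_MASK reports DAZ (bit 6), else 0. */
int daz_supported(void)
{
    static struct fxsave_area area;
    unsigned int mask;

    memset(&area, 0, sizeof area);
    __asm__ volatile ("fxsave %0" : "=m" (area));

    /* MXCSR_MASK lives in bytes 28 through 31 of the saved image. */
    memcpy(&mask, area.bytes + 28, sizeof mask);

    /* A mask of 0 (oldest SSE processors) also means DAZ is unavailable. */
    return (int)((mask >> 6) & 1u);
}
```

A program would call daz_supported() once at startup and set bit 6 of MXCSR only when it returns 1.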
