The .previous directive

beyes · posted 2011-2-23 16:33:25

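The .previous directive is used to switch sections: it returns to the section (and subsection) that was in use before the current one.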

Consider this piece of assembly:

.section A
.subsection 1
   .word 0x1234

.subsection 2
   .word 0x5678

.previous
  .word 0x9abc

.section .text
.global _start

_start:
        nop
        movl $1, %eax
        movl $0, %ebx
        int  $0x80

Let's look at the result with objdump:

objdump -D previous

previous:     file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
   0:   90                      nop
   1:   b8 01 00 00 00          mov    $0x1,%eax
   6:   bb 00 00 00 00          mov    $0x0,%ebx
   b:   cd 80                   int    $0x80

Disassembly of section A:

00000000 <A>:
   0:   34 12                   xor    $0x12,%al
   2:   bc                      .byte 0xbc
   3:   9a                      .byte 0x9a
   4:   78 56                   js     5c <_start+0x5c>

And also:

objdump -h previous

previous:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000000d  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000044  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000044  2**2
                  ALLOC
  3 A             00000006  00000000  00000000  00000044  2**0
                  CONTENTS, READONLY

The 0x9abc that follows .previous was placed in subsection 1 of section A, right after 0x1234 and ahead of the 0x5678 in subsection 2.

Now look at another piece of assembly:

.section A
  .subsection 1
       # Now in section A subsection 1
       .word 0x1234
.section B
     .subsection 0
       # Now in section B subsection 0
       .word 0x5678
     .subsection 1
       # Now in section B subsection 1
       .word 0x9abc
     .previous
       # Now in section B subsection 0
       .word 0xdef0

.section .text
.global _start

_start:
        nop
        movl $1, %eax
        movl $0, %ebx
        int  $0x80

Again, inspect it with objdump:

> objdump -D previous4

previous4:     file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
   0:   90                      nop
   1:   b8 01 00 00 00          mov    $0x1,%eax
   6:   bb 00 00 00 00          mov    $0x0,%ebx
   b:   cd 80                   int    $0x80

Disassembly of section A:

00000000 <A>:
   0:   34 12                   xor    $0x12,%al

Disassembly of section B:

00000000 <B>:
   0:   78 56                   js     58 <_start+0x58>
   2:   f0                      lock
   3:   de                      .byte 0xde
   4:   bc                      .byte 0xbc
   5:   9a                      .byte 0x9a

And also:

objdump -h previous4

previous4:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000000d  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000044  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000044  2**2
                  ALLOC
  3 A             00000002  00000000  00000000  00000044  2**0
                  CONTENTS, READONLY
  4 B             00000006  00000000  00000000  00000046  2**0
                  CONTENTS, READONLY

As shown above, the 0xdef0 that follows .previous was placed in section B, subsection 0, right after 0x5678.

A third assembly example (a modified version of the first one):

.section A
.subsection 1
   .word 0x1234

.subsection 2
   .word 0x5678

.section .text
.global _start

_start:
        nop
        .previous
           .word 0x9090
        movl $3, %ecx
        movl $1, %eax
        movl $0, %ebx
        int  $0x80

In the code above, .previous has been inserted in the middle of the .text section. Use objdump again to see how things are arranged in the output file:

objdump -D previous3

previous3:     file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
   0:   90                      nop

Disassembly of section A:

00000000 <A>:
   0:   34 12                   xor    $0x12,%al
   2:   78 56                   js     5a <_start+0x5a>
   4:   90                      nop
   5:   90                      nop
   6:   b9 03 00 00 00          mov    $0x3,%ecx
   b:   b8 01 00 00 00          mov    $0x1,%eax
  10:   bb 00 00 00 00          mov    $0x0,%ebx
  15:   cd 80                   int    $0x80

And also:

$ objdump -h previous3

previous3:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000001  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000038  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000038  2**2
                  ALLOC
  3 A             00000017  00000000  00000000  00000038  2**0

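As shown above, section A is listed after .bss, and its size is 0x17 (23) bytes. In other words, the code that looks as if it belongs to .text was, because of .previous, placed in A instead; only the nop that precedes .previous stayed in .text.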
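The Size column printed by objdump -h is hexadecimal, so 00000017 means 23 bytes. As a quick sanity check, the sketch below simply adds up the encoded lengths read off the disassembly of section A above. The list is transcribed by hand from that listing; the variable names are only illustrative.

```python
# (source line, encoded length in bytes), read off the objdump -D listing of section A
listing = [
    (".word 0x1234", 2),
    (".word 0x5678", 2),
    (".word 0x9090", 2),    # objdump shows these two bytes as two nop instructions
    ("movl $3, %ecx", 5),
    ("movl $1, %eax", 5),
    ("movl $0, %ebx", 5),
    ("int  $0x80", 2),
]

total = sum(length for _, length in listing)
print(hex(total), total)
```

The total agrees with the 00000017 shown for section A, while .text keeps only the single nop byte.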

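From the examples above, code that follows .previous goes back into the section (and subsection) that was current before the most recent section switch; in effect, .previous swaps the current and the previous section/subsection pair. As for where inside that section (or subsection) the code ends up: the assembler appends it after whatever that subsection already contains, and the subsections of a section are laid out in numeric order in the output.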
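The behaviour seen in the three examples can be reproduced with a small model. The sketch below is a toy, not the assembler's real code: SectionTracker and its method names are made up for illustration, and it assumes that assembly starts in .text subsection 0, that every .section or .subsection switch records the pair being left, that .previous swaps the two pairs, and that subsections are concatenated in numeric order. Instruction bytes are copied from the objdump listings above.

```python
from collections import defaultdict


class SectionTracker:
    """Toy model (not gas itself) of current/previous section bookkeeping."""

    def __init__(self):
        self.current = (".text", 0)            # assumed start: .text, subsection 0
        self.previous = None
        self.chunks = defaultdict(bytearray)   # (section, subsection) -> bytes

    def _switch(self, target):
        # Every switch remembers the pair we are leaving.
        self.previous, self.current = self.current, target

    def section(self, name):                   # .section NAME
        self._switch((name, 0))

    def subsection(self, number):              # .subsection N
        self._switch((self.current[0], number))

    def swap_previous(self):                   # .previous
        if self.previous is not None:
            self.current, self.previous = self.previous, self.current

    def emit(self, hex_bytes):                 # already-encoded instruction bytes
        self.chunks[self.current] += bytes.fromhex(hex_bytes)

    def word(self, value):                     # .word VALUE (2 bytes, little-endian)
        self.chunks[self.current] += value.to_bytes(2, "little")

    def layout(self, name):
        # Subsections of one section are concatenated in numeric order.
        subs = sorted(sub for sec, sub in self.chunks if sec == name)
        return b"".join(self.chunks[(name, sub)] for sub in subs).hex()


def first_example():
    t = SectionTracker()
    t.section("A")
    t.subsection(1)
    t.word(0x1234)
    t.subsection(2)
    t.word(0x5678)
    t.swap_previous()          # back in A, subsection 1
    t.word(0x9abc)
    return t.layout("A")


def second_example():
    t = SectionTracker()
    t.section("A")
    t.subsection(1)
    t.word(0x1234)
    t.section("B")
    t.subsection(0)
    t.word(0x5678)
    t.subsection(1)
    t.word(0x9abc)
    t.swap_previous()          # back in B, subsection 0
    t.word(0xdef0)
    return t.layout("A"), t.layout("B")


def third_example():
    t = SectionTracker()
    t.section("A")
    t.subsection(1)
    t.word(0x1234)
    t.subsection(2)
    t.word(0x5678)
    t.section(".text")
    t.emit("90")               # nop
    t.swap_previous()          # back in A, subsection 2
    t.word(0x9090)
    t.emit("b903000000")       # movl $3, %ecx
    t.emit("b801000000")       # movl $1, %eax
    t.emit("bb00000000")       # movl $0, %ebx
    t.emit("cd80")             # int  $0x80
    return t.layout(".text"), t.layout("A")


print(first_example())         # 3412bc9a7856
print(second_example())        # ('3412', '7856f0debc9a')
print(third_example()[0])      # 90
```

The byte strings match the three objdump -D listings: 0x9abc lands between 0x1234 and 0x5678 in the first example, 0xdef0 lands before 0x9abc in the second, and in the third everything after .previous is appended to subsection 2 of A.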

The next example uses a piece of kernel code (the spin_lock implementation in kernel version 2.6.8):

#define spin_lock_string \
        "\n1:\t" \
        "lock ; decb %0\n\t" \
        "2:\t" \
        "cmpb $0,%0\n\t" \
        "rep;nop\n\t" \
        "jle 2b\n\t" \
        "jmp 1b\n" \
        ".previous"

To make it easier to read and analyze, we mimic this code in pure assembly:

.section .text
.global _start

_start:
.previous
        movb $0, %al
1:
        decb %al
        js 2f
        .section .text.lock, "ax"
2:
        cmpb $0, %al
        rep;nop
        jle  2b
        jmp  1b
        .previous

        movl $0, %ebx
        int  $0x80

The focus here is the .previous directive. As said above, .previous switches sections. There are two sections in this code: .text and a custom section, .text.lock. The following code belongs to .text.lock:

cmpb $0, %al
        rep;nop
        jle  2b
        jmp  1b

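Because of .previous, the assembler puts these two instructions:

movl $0, %ebx
int $0x80

back into the .text section (the section that was current before .text.lock, which is exactly what ".previous" means). This can be confirmed by inspecting the object file with objdump: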

[root@SLinux assembly]# objdump -D previous.o

previous.o:     file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
   0:    b0 00                    mov    $0x0,%al
   2:    fe c8                    dec    %al
   4:    0f 88 fc ff ff ff        js     6 <_start+0x6>
   a:    bb 00 00 00 00           mov    $0x0,%ebx
   f:    cd 80                    int    $0x80

Disassembly of section .text.lock:

00000000 <.text.lock>:
   0:    3c 00                    cmp    $0x0,%al
   2:    f3 90                    pause
   4:    7e fa                    jle    0 <.text.lock>
   6:    e9 fe ff ff ff           jmp    9 <.text.lock+0x9>

The listing clearly shows which code belongs to each of the two sections.
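The branch targets printed in this listing (js 6 <_start+0x6>, jmp 9 <.text.lock+0x9>) can look puzzling. A plausible reading, assuming previous.o is an unlinked object file: js 2f and jmp 1b cross from one section to the other, so their 32-bit displacement fields still hold placeholder values that relocations will fix up at link time (objdump -r would list them; they are not part of the output quoted here). objdump -D just decodes whatever is in the field. The sketch below redoes that arithmetic from the bytes in the listing; branch_target is an illustrative helper, not part of any tool.

```python
def branch_target(address, encoding, disp_size):
    """Target as a disassembler computes it: end of instruction + signed displacement."""
    raw = bytes.fromhex(encoding)
    disp = int.from_bytes(raw[-disp_size:], "little", signed=True)
    return address + len(raw) + disp


# .text:      4: 0f 88 fc ff ff ff   js  (rel32 field holds -4)
js_target = branch_target(0x4, "0f 88 fc ff ff ff", 4)

# .text.lock: 4: 7e fa               jle (rel8 is final: same section)
jle_target = branch_target(0x4, "7e fa", 1)

# .text.lock: 6: e9 fe ff ff ff      jmp (rel32 field holds -2)
jmp_target = branch_target(0x6, "e9 fe ff ff ff", 4)

print(hex(js_target), hex(jle_target), hex(jmp_target))  # 0x6 0x0 0x9
```

Only jle 2b, which stays inside .text.lock, already points at its real destination (offset 0, label 2). The other two values reproduce what objdump printed and should not be read as the final jump targets.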

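So when execution reaches the js instruction, if the SF flag is not set (the result of decb is not negative), the program falls through to mov $0x0,%ebx instead of going to cmp $0x0,%al. Only when SF is set (the result is negative) does js jump to label 2 in .text.lock.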
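The js decision depends only on the sign bit left by decb. The sketch below models that on an 8-bit value; decb and next_instruction are illustrative names, and the reading "1 means unlocked" is an assumption about the kernel's lock byte, not something shown in the listing.

```python
def decb(value):
    """Model x86 decb on an 8-bit value: return (result, sign_flag)."""
    result = (value - 1) & 0xFF
    sign_flag = bool(result & 0x80)    # SF mirrors bit 7 of the result
    return result, sign_flag


def next_instruction(byte_before):
    """Which instruction runs after 'decb; js 2f' for a given starting byte."""
    _, sf = decb(byte_before)
    if sf:
        return "cmpb $0, %al   (label 2, in .text.lock)"
    return "movl $0, %ebx  (fall through, in .text)"


print(decb(1), next_instruction(1))    # 1 -> 0, SF clear: stays in .text
print(decb(0), next_instruction(0))    # 0 -> 0xff, SF set: jumps to .text.lock
```

With movb $0, %al as in the pure-assembly version above, decb produces 0xff, so SF is set and the jump to label 2 is taken. Starting from 1, the result is 0 and execution simply falls through.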

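The reason this code is put in a separate section is that in most cases the spin lock is acquired successfully, so the code between .section and .previous is rarely executed. If it were mixed in with frequently executed instructions, it would waste instruction-cache space. This also shows how the Linux kernel implementation has to pay attention to efficiency at every step.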

The passage above comes from an article found online, and it makes good sense!

There is also an explanation of rep;nop:

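This is a rather interesting instruction :). At first glance it is just a no-op, but in practice it throttles the spin loop, which lowers power consumption and, more importantly, improves overall efficiency. If the loop runs too fast, it issues a flood of reads of the lock variable while another CPU may be about to write that same variable. Modern CPUs reorder instructions for speed, and with that many reads in flight the CPU has to give up the benefits of pipelined execution in order to preserve the dependencies between them. Starting with the Pentium 4, Intel introduced the pause instruction specifically for the spin-lock case; according to the quoted article, Intel's documentation says that adding pause can improve efficiency by up to 25 times! A rep prefix in front of nop (bytes f3 90) is the encoding of pause, intended for "Spin-Wait and Idle Loops".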

The objdump output above confirms this: rep;nop is indeed disassembled as the pause instruction.
