How Assembly Instructions Execute on the Sunway (神威) Chip

1. Overview

This article examines how instructions execute on the Sunway chip by walking through a few examples.

2. Details

2.1. Example 1

We start with a simple addition example. The code is as follows:

// test.c

int test()
{
    int a = 1;
    int b = 2;
    int c = a + b;
    return c;
}

Compile it to a .o file and disassemble it with the following commands:

tecocc -c -device-only -ffp-contract=fast test.c -O0
tecoobjdump -Dr test.o

Note that the -O0 optimization level is used, so the output is the most direct, unoptimized assembly. The code is as follows:

0000000000000000 <slave_test>:
0: e0 ff 9e e7 ldi $30, -32($30)
4: 1d 00 de c3 stl $15, 24($30)
8: 20 00 de e3 ldi $15, 32($30)
c: f7 ff 0f b4 stw $16, -12($15)
10: f3 ff 4f b4 stw $17, -16($15)
14: f6 ff 4f b0 ldw $1, -12($15)
18: f2 ff 8f b0 ldw $2, -16($15)
1c: 01 00 42 40 addw $1, $2, $1
20: ef ff 4f b0 stw $1, -20($15)
24: ee ff 0f b0 ldw $0, -20($15)
28: e0 ff 8f e7 ldi $30, -32($15)
2c: 1c 00 de c3 ldl $15, 24($30)
30: 20 00 9e e7 ldi $30, 32($30)
34: 01 60 da 07 call $31, ($26), 1

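Here, $30 is the stack pointer, $15 is the frame pointer, $31 always reads as 0, $26 holds the return address, and $0 holds the integer return value. The figure below shows how the stack changes: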
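To make the register roles concrete, the listing above can be replayed on a toy machine model. The following is only a sketch under stated assumptions: the `Machine` type, the helper names, and the starting values of $30 and $15 are hypothetical, and it assumes that `stw`/`ldw` move 32-bit values while `stl`/`ldl` move 64-bit values. Because the listing reads its two operands from $16 and $17, the sketch takes them as inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Register numbers as described in the text; the enum names are hypothetical. */
enum { V0 = 0, FP = 15, A0 = 16, A1 = 17, SP = 30 };

typedef struct {
    int64_t reg[32];
    uint8_t mem[256];   /* toy stack memory, addresses 0..255 */
} Machine;

/* Bounds-checked pointer into the toy memory. */
static uint8_t *at(Machine *m, int64_t addr, size_t size) {
    assert(addr >= 0 && (size_t)addr + size <= sizeof m->mem);
    return &m->mem[addr];
}

/* Assumption: stw/ldw are 32-bit accesses, stl/ldl are 64-bit accesses. */
static void st32(Machine *m, int64_t a, int32_t v) { memcpy(at(m, a, 4), &v, 4); }
static void st64(Machine *m, int64_t a, int64_t v) { memcpy(at(m, a, 8), &v, 8); }
static int32_t ld32(Machine *m, int64_t a) { int32_t v; memcpy(&v, at(m, a, 4), 4); return v; }
static int64_t ld64(Machine *m, int64_t a) { int64_t v; memcpy(&v, at(m, a, 8), 8); return v; }

/* Replays the listing, one instruction per line. */
void run_listing(Machine *m) {
    int64_t *r = m->reg;
    r[SP] = r[SP] - 32;                   /* ldi  $30, -32($30)  allocate a 32-byte frame */
    st64(m, r[SP] + 24, r[FP]);           /* stl  $15, 24($30)   save the old frame pointer */
    r[FP] = r[SP] + 32;                   /* ldi  $15, 32($30)   fp = top of the new frame */
    st32(m, r[FP] - 12, (int32_t)r[A0]);  /* stw  $16, -12($15) */
    st32(m, r[FP] - 16, (int32_t)r[A1]);  /* stw  $17, -16($15) */
    r[1] = ld32(m, r[FP] - 12);           /* ldw  $1, -12($15) */
    r[2] = ld32(m, r[FP] - 16);           /* ldw  $2, -16($15) */
    r[1] = (int32_t)(r[1] + r[2]);        /* addw $1, $2, $1 */
    st32(m, r[FP] - 20, (int32_t)r[1]);   /* stw  $1, -20($15) */
    r[V0] = ld32(m, r[FP] - 20);          /* ldw  $0, -20($15)   return value */
    r[SP] = r[FP] - 32;                   /* ldi  $30, -32($15) */
    r[FP] = ld64(m, r[SP] + 24);          /* ldl  $15, 24($30)   restore the frame pointer */
    r[SP] = r[SP] + 32;                   /* ldi  $30, 32($30)   release the frame */
}

/* Runs the listing with arbitrary (hypothetical) starting values for sp and fp. */
Machine run_with_args(int32_t a, int32_t b) {
    Machine m;
    memset(&m, 0, sizeof m);
    m.reg[SP] = 128;
    m.reg[FP] = 200;
    m.reg[A0] = a;
    m.reg[A1] = b;
    run_listing(&m);
    return m;
}
```

Calling `run_with_args(1, 2)` and inspecting `reg[V0]`, `reg[SP]`, and `reg[FP]` shows the sum in $0 and both pointers back at their starting values, which is the balance between prologue and epilogue that the listing is meant to preserve.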

This example is fairly simple. Next, let's look at one that involves a function call.

2.2. Example 2

int add(int a, int b)
{
    int c = a + b;
    return c;
}

int test()
{
    int a = 1;
    int b = 2;
    int c = add(a, b);
    return c;
}

The assembly code is as follows:

0000000000000000 <slave_add>:
0: e0 ff 9e e7 ldi $30, -32($30)
4: 1d 00 de c3 stl $15, 24($30)
8: 20 00 de e3 ldi $15, 32($30)
c: f7 ff 0f b4 stw $16, -12($15)
10: f3 ff 4f b4 stw $17, -16($15)
14: f6 ff 4f b0 ldw $1, -12($15)
18: f2 ff 8f b0 ldw $2, -16($15)
1c: 01 00 42 40 addw $1, $2, $1
20: ef ff 4f b0 stw $1, -20($15)
24: ee ff 0f b0 ldw $0, -20($15)
28: e0 ff 8f e7 ldi $30, -32($15)
2c: 1c 00 de c3 ldl $15, 24($30)
30: 20 00 9e e7 ldi $30, 32($30)
34: 01 60 da 07 call $31, ($26), 1

0000000000000000 <slave_test>:
0: 00 00 5b f7 ldih $29, 0($27)
0000000000000000: R_SWAI_GPDISP .text1.slave_test+0x4
4: 00 00 5d e7 ldi $29, 0($29)
8: e0 ff 9e e7 ldi $30, -32($30)
c: 1d 00 9e c6 stl $26, 24($30)
10: 15 00 de c3 stl $15, 16($30)
14: 20 00 de e3 ldi $15, 32($30)
18: 01 00 5f e0 ldi $1, 1($31)
1c: ef ff 4f b0 stw $1, -20($15)
20: 02 00 5f e0 ldi $1, 2($31)
24: eb ff 4f b0 stw $1, -24($15)
28: ee ff 0f b4 ldw $16, -20($15)
2c: ea ff 4f b4 ldw $17, -24($15)
30: 04 00 5d c0 ldl $1, 0($29)
0000000000000030: R_SWAI_LITERAL slave_add
34: 1b 4e 5f 40 or $1, $31, $27
38: 00 60 9b 06 call $26, ($27), 0
0000000000000038: R_SWAI_LITUSE slave_add+0x3
3c: 00 00 5a f7 ldih $29, 0($26)
000000000000003c: R_SWAI_GPDISP .text1.slave_test+0x4
40: 00 00 5d e7 ldi $29, 0($29)
44: e7 ff 0f b0 stw $0, -28($15)
48: e6 ff 0f b0 ldw $0, -28($15)
4c: e0 ff 8f e7 ldi $30, -32($15)
50: 14 00 de c3 ldl $15, 16($30)
54: 1c 00 9e c6 ldl $26, 24($30)
58: 20 00 9e e7 ldi $30, 32($30)
5c: 01 60 da 07 call $31, ($26), 1

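As you can see, slave_add is exactly the same as the assembly in Example 1, so we will not go over it again and will focus on slave_test instead. Here, $16 and $17 are the argument registers. The figure below shows how the stack changes: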
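The frame that slave_test builds can also be written down as a C struct, read directly off the offsets in the listing. This is a sketch rather than an official ABI description: the `TestFrame` type and its field names are hypothetical, and it again assumes 32-bit `stw`/`ldw` and 64-bit `stl`/`ldl`. Fields are listed from the lowest address ($30) upward, and $15 equals $30 + 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the 32-byte frame of slave_test, starting at $30. */
typedef struct {
    int32_t unused;    /* sp+0  = fp-32: not written by the listing */
    int32_t c;         /* sp+4  = fp-28: stw $0, -28($15), result of add */
    int32_t b;         /* sp+8  = fp-24: stw $1, -24($15) */
    int32_t a;         /* sp+12 = fp-20: stw $1, -20($15) */
    int64_t saved_fp;  /* sp+16: stl $15, 16($30) */
    int64_t saved_ra;  /* sp+24: stl $26, 24($30) */
} TestFrame;

/* The frame size must match the 32 bytes reserved by ldi $30, -32($30). */
_Static_assert(sizeof(TestFrame) == 32, "frame is 32 bytes");

/* Converts an offset from $30 into the equivalent offset from $15 (fp = sp + 32). */
long fp_relative(size_t sp_offset) {
    assert(sp_offset <= sizeof(TestFrame));
    return (long)sp_offset - (long)sizeof(TestFrame);
}
```

Unlike slave_add, this frame also reserves a slot for $26, since the `call $26, ($27), 0` instruction overwrites the return address that slave_test itself needs in order to return.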