doc.go raw
1 // Copyright 2018 The Go Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style
3 // license that can be found in the LICENSE file.
4
5 /*
6 Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
7 syntax, but we can still follow the general rules to map between them.
8
9 Instructions mnemonics mapping rules
10
11 1. Most instructions use width suffixes of instruction names to indicate operand width rather than
12 using different register names.
13
14 Examples:
15 ADC R24, R14, R12 <=> adc x12, x24
16 ADDW R26->24, R21, R15 <=> add w15, w21, w26, asr #24
17 FCMPS F2, F3 <=> fcmp s3, s2
18 FCMPD F2, F3 <=> fcmp d3, d2
19 FCVTDH F2, F3 <=> fcvt h3, d2
20
21 2. Go uses .P and .W suffixes to indicate post-increment and pre-increment.
22
23 Examples:
24 MOVD.P -8(R10), R8 <=> ldr x8, [x10],#-8
25 MOVB.W 16(R16), R10 <=> ldrsb x10, [x16,#16]!
26 MOVBU.W 16(R16), R10 <=> ldrb x10, [x16,#16]!
27
28 3. Go uses a series of MOV instructions as load and store.
29
30 64-bit variant ldr, str, stur => MOVD;
31 32-bit variant str, stur, ldrsw => MOVW;
32 32-bit variant ldr => MOVWU;
33 ldrb => MOVBU; ldrh => MOVHU;
34 ldrsb, sturb, strb => MOVB;
35 ldrsh, sturh, strh => MOVH.
36
37 4. Go moves conditions into opcode suffix, like BLT.
38
39 5. Go adds a V prefix for most floating-point and SIMD instructions, except cryptographic extension
40 instructions and floating-point(scalar) instructions.
41
42 Examples:
43 VADD V5.H8, V18.H8, V9.H8 <=> add v9.8h, v18.8h, v5.8h
44 VLD1.P (R6)(R11), [V31.D1] <=> ld1 {v31.1d}, [x6], x11
45 VFMLA V29.S2, V20.S2, V14.S2 <=> fmla v14.2s, v20.2s, v29.2s
46 AESD V22.B16, V19.B16 <=> aesd v19.16b, v22.16b
47 SCVTFWS R3, F16 <=> scvtf s17, w6
48
49 6. Align directive
50
51 Go asm supports the PCALIGN directive, which indicates that the next instruction should be aligned
52 to a specified boundary by padding with NOOP instruction. The alignment value supported on arm64
53 must be a power of 2 and in the range of [8, 2048].
54
55 Examples:
56 PCALIGN $16
57 MOVD $2, R0 // This instruction is aligned with 16 bytes.
58 PCALIGN $1024
59 MOVD $3, R1 // This instruction is aligned with 1024 bytes.
60
61 PCALIGN also changes the function alignment. If a function has one or more PCALIGN directives,
62 its address will be aligned to the same or coarser boundary, which is the maximum of all the
63 alignment values.
64
65 In the following example, the function Add is aligned with 128 bytes.
66 Examples:
67 TEXT ·Add(SB),$40-16
68 MOVD $2, R0
69 PCALIGN $32
70 MOVD $4, R1
71 PCALIGN $128
72 MOVD $8, R2
73 RET
74
75 On arm64, functions in Go are aligned to 16 bytes by default, we can also use PCALGIN to set the
76 function alignment. The functions that need to be aligned are preferably using NOFRAME and NOSPLIT
77 to avoid the impact of the prologues inserted by the assembler, so that the function address will
78 have the same alignment as the first hand-written instruction.
79
80 In the following example, PCALIGN at the entry of the function Add will align its address to 2048 bytes.
81
82 Examples:
83 TEXT ·Add(SB),NOSPLIT|NOFRAME,$0
84 PCALIGN $2048
85 MOVD $1, R0
86 MOVD $1, R1
87 RET
88
89 Special Cases.
90
91 (1) umov is written as VMOV.
92
93 (2) br is renamed JMP, blr is renamed CALL.
94
95 (3) No need to add "W" suffix: LDARB, LDARH, LDAXRB, LDAXRH, LDTRH, LDXRB, LDXRH.
96
97 (4) In Go assembly syntax, NOP is a zero-width pseudo-instruction serves generic purpose, nothing
98 related to real ARM64 instruction. NOOP serves for the hardware nop instruction. NOOP is an alias of
99 HINT $0.
100
101 Examples:
102 VMOV V13.B[1], R20 <=> mov x20, v13.b[1]
103 VMOV V13.H[1], R20 <=> mov w20, v13.h[1]
104 JMP (R3) <=> br x3
105 CALL (R17) <=> blr x17
106 LDAXRB (R19), R16 <=> ldaxrb w16, [x19]
107 NOOP <=> nop
108
109
110 Register mapping rules
111
112 1. All basic register names are written as Rn.
113
114 2. Go uses ZR as the zero register and RSP as the stack pointer.
115
116 3. Bn, Hn, Dn, Sn and Qn instructions are written as Fn in floating-point instructions and as Vn
117 in SIMD instructions.
118
119
120 Argument mapping rules
121
122 1. The operands appear in left-to-right assignment order.
123
124 Go reverses the arguments of most instructions.
125
126 Examples:
127 ADD R11.SXTB<<1, RSP, R25 <=> add x25, sp, w11, sxtb #1
128 VADD V16, V19, V14 <=> add d14, d19, d16
129
130 Special Cases.
131
132 (1) Argument order is the same as in the GNU ARM64 syntax: cbz, cbnz and some store instructions,
133 such as str, stur, strb, sturb, strh, sturh stlr, stlrb. stlrh, st1.
134
135 Examples:
136 MOVD R29, 384(R19) <=> str x29, [x19,#384]
137 MOVB.P R30, 30(R4) <=> strb w30, [x4],#30
138 STLRH R21, (R19) <=> stlrh w21, [x19]
139
140 (2) MADD, MADDW, MSUB, MSUBW, SMADDL, SMSUBL, UMADDL, UMSUBL <Rm>, <Ra>, <Rn>, <Rd>
141
142 Examples:
143 MADD R2, R30, R22, R6 <=> madd x6, x22, x2, x30
144 SMSUBL R10, R3, R17, R27 <=> smsubl x27, w17, w10, x3
145
146 (3) FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS, FNMSUBD, FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>
147
148 Examples:
149 FMADDD F30, F20, F3, F29 <=> fmadd d29, d3, d30, d20
150 FNMSUBS F7, F25, F7, F22 <=> fnmsub s22, s7, s7, s25
151
152 (4) BFI, BFXIL, SBFIZ, SBFX, UBFIZ, UBFX $<lsb>, <Rn>, $<width>, <Rd>
153
154 Examples:
155 BFIW $16, R20, $6, R0 <=> bfi w0, w20, #16, #6
156 UBFIZ $34, R26, $5, R20 <=> ubfiz x20, x26, #34, #5
157
158 (5) FCCMPD, FCCMPS, FCCMPED, FCCMPES <cond>, Fm. Fn, $<nzcv>
159
160 Examples:
161 FCCMPD AL, F8, F26, $0 <=> fccmp d26, d8, #0x0, al
162 FCCMPS VS, F29, F4, $4 <=> fccmp s4, s29, #0x4, vs
163 FCCMPED LE, F20, F5, $13 <=> fccmpe d5, d20, #0xd, le
164 FCCMPES NE, F26, F10, $0 <=> fccmpe s10, s26, #0x0, ne
165
166 (6) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, $<imm>, $<nzcv>
167
168 Examples:
169 CCMP MI, R22, $12, $13 <=> ccmp x22, #0xc, #0xd, mi
170 CCMNW AL, R1, $11, $8 <=> ccmn w1, #0xb, #0x8, al
171
172 (7) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, <Rm>, $<nzcv>
173
174 Examples:
175 CCMN VS, R13, R22, $10 <=> ccmn x13, x22, #0xa, vs
176 CCMPW HS, R19, R14, $11 <=> ccmp w19, w14, #0xb, cs
177
178 (9) CSEL, CSELW, CSNEG, CSNEGW, CSINC, CSINCW <cond>, <Rn>, <Rm>, <Rd> ;
179 FCSELD, FCSELS <cond>, <Fn>, <Fm>, <Fd>
180
181 Examples:
182 CSEL GT, R0, R19, R1 <=> csel x1, x0, x19, gt
183 CSNEGW GT, R7, R17, R8 <=> csneg w8, w7, w17, gt
184 FCSELD EQ, F15, F18, F16 <=> fcsel d16, d15, d18, eq
185
186 (10) TBNZ, TBZ $<imm>, <Rt>, <label>
187
188
189 (11) STLXR, STLXRW, STXR, STXRW, STLXRB, STLXRH, STXRB, STXRH <Rf>, (<Rn|RSP>), <Rs>
190
191 Examples:
192 STLXR ZR, (R15), R16 <=> stlxr w16, xzr, [x15]
193 STXRB R9, (R21), R19 <=> stxrb w19, w9, [x21]
194
195 (12) STLXP, STLXPW, STXP, STXPW (<Rf1>, <Rf2>), (<Rn|RSP>), <Rs>
196
197 Examples:
198 STLXP (R17, R19), (R4), R5 <=> stlxp w5, x17, x19, [x4]
199 STXPW (R30, R25), (R22), R13 <=> stxp w13, w30, w25, [x22]
200
201 2. Expressions for special arguments.
202
203 #<immediate> is written as $<immediate>.
204
205 Optionally-shifted immediate.
206
207 Examples:
208 ADD $(3151<<12), R14, R20 <=> add x20, x14, #0xc4f, lsl #12
209 ADDW $1864, R25, R6 <=> add w6, w25, #0x748
210
211 Optionally-shifted registers are written as <Rm>{<shift><amount>}.
212 The <shift> can be <<(lsl), >>(lsr), ->(asr), @>(ror).
213
214 Examples:
215 ADD R19>>30, R10, R24 <=> add x24, x10, x19, lsr #30
216 ADDW R26->24, R21, R15 <=> add w15, w21, w26, asr #24
217
218 Extended registers are written as <Rm>{.<extend>{<<<amount>}}.
219 <extend> can be UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW or SXTX.
220
221 Examples:
222 ADDS R19.UXTB<<4, R9, R26 <=> adds x26, x9, w19, uxtb #4
223 ADDSW R14.SXTX, R14, R6 <=> adds w6, w14, w14, sxtx
224
225 Memory references: [<Xn|SP>{,#0}] is written as (Rn|RSP), a base register and an immediate
226 offset is written as imm(Rn|RSP), a base register and an offset register is written as (Rn|RSP)(Rm).
227
228 Examples:
229 LDAR (R22), R9 <=> ldar x9, [x22]
230 LDP 28(R17), (R15, R23) <=> ldp x15, x23, [x17,#28]
231 MOVWU (R4)(R12<<2), R8 <=> ldr w8, [x4, x12, lsl #2]
232 MOVD (R7)(R11.UXTW<<3), R25 <=> ldr x25, [x7,w11,uxtw #3]
233 MOVBU (R27)(R23), R14 <=> ldrb w14, [x27,x23]
234
235 Register pairs are written as (Rt1, Rt2).
236
237 Examples:
238 LDP.P -240(R11), (R12, R26) <=> ldp x12, x26, [x11],#-240
239
240 Register with arrangement and register with arrangement and index.
241
242 Examples:
243 VADD V5.H8, V18.H8, V9.H8 <=> add v9.8h, v18.8h, v5.8h
244 VLD1 (R2), [V21.B16] <=> ld1 {v21.16b}, [x2]
245 VST1.P V9.S[1], (R16)(R21) <=> st1 {v9.s}[1], [x16], x28
246 VST1.P [V13.H8, V14.H8, V15.H8], (R3)(R14) <=> st1 {v13.8h-v15.8h}, [x3], x14
247 VST1.P [V14.D1, V15.D1], (R7)(R23) <=> st1 {v14.1d, v15.1d}, [x7], x23
248 */
249 package arm64
250