doc.go

Documentation: cmd/internal/obj/arm64

// Copyright 2018 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
syntax, but we can still follow the general rules to map between them.

Instruction mnemonic mapping rules

1. Most instructions use width suffixes of instruction names to indicate operand width rather than
using different register names.

  Examples:
    ADC R24, R14, R12          <=>     adc x12, x14, x24
    ADDW R26->24, R21, R15     <=>     add w15, w21, w26, asr #24
    FCMPS F2, F3               <=>     fcmp s3, s2
    FCMPD F2, F3               <=>     fcmp d3, d2
    FCVTDH F2, F3              <=>     fcvt h3, d2

2. Go uses .P and .W suffixes to indicate post-increment and pre-increment.

  Examples:
    MOVD.P -8(R10), R8         <=>      ldr x8, [x10],#-8
    MOVB.W 16(R16), R10        <=>      ldrsb x10, [x16,#16]!
    MOVBU.W 16(R16), R10       <=>      ldrb w10, [x16,#16]!

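The difference between the two suffixes can be modeled in a few lines of Go. This is only a sketch of the addressing behavior; the function names postIndex and preIndex and the base values are made up for illustration.

```go
package main

import "fmt"

// postIndex models the .P suffix: the access uses the old base value,
// and the base register is updated afterwards.
func postIndex(base, imm int64) (addr, newBase int64) {
	return base, base + imm
}

// preIndex models the .W suffix: the base register is updated first,
// and the access uses the updated value.
func preIndex(base, imm int64) (addr, newBase int64) {
	return base + imm, base + imm
}

func main() {
	addr, r10 := postIndex(0x1000, -8) // MOVD.P -8(R10), R8
	fmt.Printf("%#x %#x\n", addr, r10) // 0x1000 0xff8
	addr, r16 := preIndex(0x2000, 16)  // MOVB.W 16(R16), R10
	fmt.Printf("%#x %#x\n", addr, r16) // 0x2010 0x2010
}
```
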
3. Go uses a series of MOV instructions for loads and stores.

64-bit variant ldr, str, stur => MOVD;
32-bit variant str, stur, ldrsw => MOVW;
32-bit variant ldr => MOVWU;
ldrb => MOVBU; ldrh => MOVHU;
ldrsb, sturb, strb => MOVB;
ldrsh, sturh, strh => MOVH.

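The list above can be read as a lookup table. The sketch below writes it as a Go map; the "/64" and "/32" key suffixes are a hypothetical way to tell the 64-bit and 32-bit variants apart and are not assembler syntax.

```go
package main

import "fmt"

// goMove maps a GNU load/store mnemonic to the Go mnemonic listed above.
// Keys ending in "/64" or "/32" are illustrative names for the
// 64-bit and 32-bit variants of the same GNU mnemonic.
var goMove = map[string]string{
	"ldr/64": "MOVD", "str/64": "MOVD", "stur/64": "MOVD",
	"str/32": "MOVW", "stur/32": "MOVW", "ldrsw": "MOVW",
	"ldr/32": "MOVWU",
	"ldrb":   "MOVBU", "ldrh": "MOVHU",
	"ldrsb": "MOVB", "sturb": "MOVB", "strb": "MOVB",
	"ldrsh": "MOVH", "sturh": "MOVH", "strh": "MOVH",
}

func main() {
	fmt.Println(goMove["ldr/32"], goMove["ldrsw"], goMove["strh"])
}
```
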
4. Go moves the condition into the opcode suffix, as in BLT.

5. Go adds a V prefix for most floating-point and SIMD instructions, except cryptographic extension
instructions and floating-point (scalar) instructions.

  Examples:
    VADD V5.H8, V18.H8, V9.H8         <=>      add v9.8h, v18.8h, v5.8h
    VLD1.P (R6)(R11), [V31.D1]        <=>      ld1 {v31.1d}, [x6], x11
    VFMLA V29.S2, V20.S2, V14.S2      <=>      fmla v14.2s, v20.2s, v29.2s
    AESD V22.B16, V19.B16             <=>      aesd v19.16b, v22.16b
    SCVTFWS R3, F16                   <=>      scvtf s16, w3

6. Align directive

Go asm supports the PCALIGN directive, which indicates that the next instruction should be aligned
to a specified boundary by padding with NOOP instructions. The alignment value supported on arm64
must be a power of 2 and in the range of [8, 2048].

  Examples:
    PCALIGN $16
    MOVD $2, R0          // This instruction is aligned to 16 bytes.
    PCALIGN $1024
    MOVD $3, R1          // This instruction is aligned to 1024 bytes.

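The constraint on the alignment value and the amount of padding can be sketched as follows. This is an illustration, not the assembler's code: validAlign and noopsNeeded are hypothetical names, and the sketch assumes that pc is a multiple of 4 measured from a function start that is itself suitably aligned.

```go
package main

import "fmt"

// validAlign reports whether v satisfies the rule stated above:
// a power of 2 in the range [8, 2048].
func validAlign(v int64) bool {
	return v >= 8 && v <= 2048 && v&(v-1) == 0
}

// noopsNeeded returns how many 4-byte NOOP instructions are needed to
// pad pc up to the next multiple of align.
func noopsNeeded(pc, align int64) int64 {
	padding := (align - pc%align) % align
	return padding / 4
}

func main() {
	fmt.Println(validAlign(16), validAlign(24), validAlign(4096)) // true false false
	fmt.Println(noopsNeeded(4, 16), noopsNeeded(32, 16))          // 3 0
}
```
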
PCALIGN also changes the function alignment. If a function has one or more PCALIGN directives,
its address will be aligned to the same or coarser boundary, which is the maximum of all the
alignment values.

In the following example, the function Add is aligned to 128 bytes.

  Examples:
    TEXT ·Add(SB),$40-16
    MOVD $2, R0
    PCALIGN $32
    MOVD $4, R1
    PCALIGN $128
    MOVD $8, R2
    RET

On arm64, functions in Go are aligned to 16 bytes by default; PCALIGN can also be used to set the
function alignment. Functions that need to be aligned should preferably use NOFRAME and NOSPLIT
to avoid the impact of the prologues inserted by the assembler, so that the function address will
have the same alignment as the first hand-written instruction.

In the following example, PCALIGN at the entry of the function Add will align its address to 2048 bytes.

  Examples:
    TEXT ·Add(SB),NOSPLIT|NOFRAME,$0
      PCALIGN $2048
      MOVD $1, R0
      MOVD $1, R1
      RET

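The resulting function alignment in the two examples above can be computed as a maximum over the 16-byte default and every PCALIGN value in the function. The helper funcAlign below is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// funcAlign returns the alignment of a function: the 16-byte default or
// the largest PCALIGN value used in the function, whichever is larger.
func funcAlign(pcaligns ...int64) int64 {
	align := int64(16)
	for _, a := range pcaligns {
		if a > align {
			align = a
		}
	}
	return align
}

func main() {
	fmt.Println(funcAlign())        // 16
	fmt.Println(funcAlign(32, 128)) // 128
	fmt.Println(funcAlign(2048))    // 2048
}
```
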
7. Move large constants to vector registers.

Go asm uses VMOVQ/VMOVD/VMOVS to move 128-bit, 64-bit and 32-bit constants into vector registers, respectively.
For a 128-bit integer, VMOVQ takes two 64-bit operands, for the high and low parts separately.

  Examples:
    VMOVS $0x11223344, V0
    VMOVD $0x1122334455667788, V1
    VMOVQ $0x1122334455667788, $0x8877665544332211, V2   // V2=0x11223344556677888877665544332211

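The 128-bit value in the VMOVQ comment is the concatenation of two 64-bit halves. The sketch below reproduces that value; join128 is a hypothetical helper, and which VMOVQ operand supplies which half should be checked against the assembler version in use.

```go
package main

import "fmt"

// join128 formats a 128-bit value from its high and low 64-bit halves.
func join128(hi, lo uint64) string {
	return fmt.Sprintf("0x%016x%016x", hi, lo)
}

func main() {
	fmt.Println(join128(0x1122334455667788, 0x8877665544332211))
	// 0x11223344556677888877665544332211
}
```
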
Special Cases.

(1) umov is written as VMOV.

(2) br is renamed JMP, blr is renamed CALL.

(3) No need to add "W" suffix: LDARB, LDARH, LDAXRB, LDAXRH, LDTRH, LDXRB, LDXRH.

(4) In Go assembly syntax, NOP is a zero-width pseudo-instruction that serves a generic purpose and is
unrelated to any real ARM64 instruction. NOOP stands for the hardware nop instruction. NOOP is an alias of
HINT $0.

  Examples:
    VMOV V13.B[1], R20      <=>      umov w20, v13.b[1]
    VMOV V13.H[1], R20      <=>      umov w20, v13.h[1]
    JMP (R3)                <=>      br x3
    CALL (R17)              <=>      blr x17
    LDAXRB (R19), R16       <=>      ldaxrb w16, [x19]
    NOOP                    <=>      nop


Register mapping rules

1. All basic register names are written as Rn.

2. Go uses ZR as the zero register and RSP as the stack pointer.

3. Bn, Hn, Dn, Sn and Qn registers are written as Fn in floating-point instructions and as Vn
in SIMD instructions.


Argument mapping rules

1. The operands appear in left-to-right assignment order.

Go reverses the arguments of most instructions.

  Examples:
    ADD R11.SXTB<<1, RSP, R25      <=>      add x25, sp, w11, sxtb #1
    VADD V16, V19, V14             <=>      add d14, d19, d16

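For the common case, the mapping is a plain reversal of the operand list. The sketch below shows only that reordering; reverse is a hypothetical helper, it does not translate register names, and it does not apply to the special cases.

```go
package main

import "fmt"

// reverse returns the operands in the opposite order, which is how most
// Go operand lists relate to their GNU counterparts.
func reverse(ops []string) []string {
	out := make([]string, len(ops))
	for i, op := range ops {
		out[len(ops)-1-i] = op
	}
	return out
}

func main() {
	// VADD V16, V19, V14 lists its operands as V16, V19, V14 in Go.
	fmt.Println(reverse([]string{"V16", "V19", "V14"})) // [V14 V19 V16]
}
```
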
Special Cases.

(1) Argument order is the same as in the GNU ARM64 syntax: cbz, cbnz and some store instructions,
such as str, stur, strb, sturb, strh, sturh, stlr, stlrb, stlrh, st1.

  Examples:
    MOVD R29, 384(R19)    <=>    str x29, [x19,#384]
    MOVB.P R30, 30(R4)    <=>    strb w30, [x4],#30
    STLRH R21, (R19)      <=>    stlrh w21, [x19]

(2) MADD, MADDW, MSUB, MSUBW, SMADDL, SMSUBL, UMADDL, UMSUBL <Rm>, <Ra>, <Rn>, <Rd>

  Examples:
    MADD R2, R30, R22, R6       <=>    madd x6, x22, x2, x30
    SMSUBL R10, R3, R17, R27    <=>    smsubl x27, w17, w10, x3

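The multiply-add operand order is not a plain reversal, so it may help to spell it out. The sketch below reorders the Go operands into GNU order and models the 64-bit MADD and MSUB results; all three function names are hypothetical.

```go
package main

import "fmt"

// gnuMulAdd reorders the Go operands <Rm>, <Ra>, <Rn>, <Rd> into the
// GNU order <Rd>, <Rn>, <Rm>, <Ra>.
func gnuMulAdd(rm, ra, rn, rd string) [4]string {
	return [4]string{rd, rn, rm, ra}
}

// madd models the 64-bit MADD result: Rd = Ra + Rn*Rm (wrapping).
func madd(rm, ra, rn uint64) uint64 { return ra + rn*rm }

// msub models the 64-bit MSUB result: Rd = Ra - Rn*Rm (wrapping).
func msub(rm, ra, rn uint64) uint64 { return ra - rn*rm }

func main() {
	// MADD R2, R30, R22, R6 with the registers already renamed.
	fmt.Println(gnuMulAdd("x2", "x30", "x22", "x6")) // [x6 x22 x2 x30]
	fmt.Println(madd(3, 10, 4), msub(3, 20, 4))      // 22 8
}
```
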
(3) FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS, FNMSUBD, FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>

  Examples:
    FMADDD F30, F20, F3, F29    <=>    fmadd d29, d3, d30, d20
    FNMSUBS F7, F25, F7, F22    <=>    fnmsub s22, s7, s7, s25

(4) BFI, BFXIL, SBFIZ, SBFX, UBFIZ, UBFX $<lsb>, <Rn>, $<width>, <Rd>

  Examples:
    BFIW $16, R20, $6, R0      <=>    bfi w0, w20, #16, #6
    UBFIZ $34, R26, $5, R20    <=>    ubfiz x20, x26, #34, #5

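To make the roles of $<lsb> and $<width> concrete, the following sketch models the results of BFIW and UBFIZ on plain integers. The function names are hypothetical, and the sketch assumes the width is smaller than the register size.

```go
package main

import "fmt"

// bfi models BFIW $lsb, Rn, $width, Rd: the low width bits of rn are
// copied into rd starting at bit lsb; other bits of rd are unchanged.
func bfi(rd, rn uint32, lsb, width uint) uint32 {
	mask := ((uint32(1) << width) - 1) << lsb
	return (rd &^ mask) | ((rn << lsb) & mask)
}

// ubfiz models UBFIZ $lsb, Rn, $width, Rd: the low width bits of rn are
// placed at bit lsb and every other bit of the result is zero.
func ubfiz(rn uint64, lsb, width uint) uint64 {
	mask := (uint64(1) << width) - 1
	return (rn & mask) << lsb
}

func main() {
	fmt.Printf("%#x\n", bfi(0xffffffff, 0, 16, 6)) // 0xffc0ffff
	fmt.Printf("%#x\n", ubfiz(0xff, 34, 5))        // 0x7c00000000
}
```
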
(5) FCCMPD, FCCMPS, FCCMPED, FCCMPES <cond>, <Fm>, <Fn>, $<nzcv>

  Examples:
    FCCMPD AL, F8, F26, $0     <=>    fccmp d26, d8, #0x0, al
    FCCMPS VS, F29, F4, $4     <=>    fccmp s4, s29, #0x4, vs
    FCCMPED LE, F20, F5, $13   <=>    fccmpe d5, d20, #0xd, le
    FCCMPES NE, F26, F10, $0   <=>    fccmpe s10, s26, #0x0, ne

(6) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, $<imm>, $<nzcv>

  Examples:
    CCMP MI, R22, $12, $13     <=>    ccmp x22, #0xc, #0xd, mi
    CCMNW AL, R1, $11, $8      <=>    ccmn w1, #0xb, #0x8, al

(7) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, <Rm>, $<nzcv>

  Examples:
    CCMN VS, R13, R22, $10     <=>    ccmn x13, x22, #0xa, vs
    CCMPW HS, R19, R14, $11    <=>    ccmp w19, w14, #0xb, cs

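The $<nzcv> operand is the flag value used when the condition does not hold. A rough Go model of the 64-bit register form of CCMP is sketched below; ccmp is a hypothetical function, the condition is passed in as an already-evaluated bool, and the flags are packed as N=8, Z=4, C=2, V=1.

```go
package main

import "fmt"

// ccmp models CCMP cond, Rn, Rm, $nzcv. If cond holds, the flags come
// from the subtraction rn - rm; otherwise they are set to nzcv.
func ccmp(cond bool, rn, rm uint64, nzcv uint8) uint8 {
	if !cond {
		return nzcv & 0xf
	}
	diff := rn - rm
	var flags uint8
	if int64(diff) < 0 {
		flags |= 8 // N: the result is negative
	}
	if diff == 0 {
		flags |= 4 // Z: the operands are equal
	}
	if rn >= rm {
		flags |= 2 // C: no borrow occurred
	}
	if ((rn^rm)&(rn^diff))>>63 == 1 {
		flags |= 1 // V: signed overflow
	}
	return flags
}

func main() {
	fmt.Println(ccmp(false, 1, 2, 0xd)) // 13
	fmt.Println(ccmp(true, 5, 5, 0))    // 6
	fmt.Println(ccmp(true, 1, 2, 0))    // 8
}
```
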
(8) CSEL, CSELW, CSNEG, CSNEGW, CSINC, CSINCW <cond>, <Rn>, <Rm>, <Rd> ;
FCSELD, FCSELS <cond>, <Fn>, <Fm>, <Fd>

  Examples:
    CSEL GT, R0, R19, R1        <=>    csel x1, x0, x19, gt
    CSNEGW GT, R7, R17, R8      <=>    csneg w8, w7, w17, gt
    FCSELD EQ, F15, F18, F16    <=>    fcsel d16, d15, d18, eq

(9) TBNZ, TBZ $<imm>, <Rt>, <label>

(10) STLXR, STLXRW, STXR, STXRW, STLXRB, STLXRH, STXRB, STXRH  <Rf>, (<Rn|RSP>), <Rs>

  Examples:
    STLXR ZR, (R15), R16    <=>    stlxr w16, xzr, [x15]
    STXRB R9, (R21), R19    <=>    stxrb w19, w9, [x21]

(11) STLXP, STLXPW, STXP, STXPW (<Rf1>, <Rf2>), (<Rn|RSP>), <Rs>

  Examples:
    STLXP (R17, R19), (R4), R5      <=>    stlxp w5, x17, x19, [x4]
    STXPW (R30, R25), (R22), R13    <=>    stxp w13, w30, w25, [x22]

2. Expressions for special arguments.

#<immediate> is written as $<immediate>.

Optionally-shifted immediate.

  Examples:
    ADD $(3151<<12), R14, R20     <=>    add x20, x14, #0xc4f, lsl #12
    ADDW $1864, R25, R6           <=>    add w6, w25, #0x748

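The two examples above correspond to a 12-bit immediate with an optional left shift by 12. The sketch below splits a constant into that form; splitImm is a hypothetical helper, and constants that fit neither form are simply reported as not encodable here, whatever the assembler may do with them.

```go
package main

import "fmt"

// splitImm returns the 12-bit immediate and the shift (0 or 12) that
// represent v, or ok=false if v fits neither form.
func splitImm(v uint64) (imm12 uint64, shift uint, ok bool) {
	switch {
	case v < 1<<12:
		return v, 0, true
	case v&0xfff == 0 && v>>12 < 1<<12:
		return v >> 12, 12, true
	}
	return 0, 0, false
}

func main() {
	imm, shift, ok := splitImm(3151 << 12)
	fmt.Printf("%#x %d %v\n", imm, shift, ok) // 0xc4f 12 true
	imm, shift, ok = splitImm(1864)
	fmt.Printf("%#x %d %v\n", imm, shift, ok) // 0x748 0 true
}
```
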
Optionally-shifted registers are written as <Rm>{<shift><amount>}.
The <shift> can be <<(lsl), >>(lsr), ->(asr), @>(ror).

  Examples:
    ADD R19>>30, R10, R24     <=>    add x24, x10, x19, lsr #30
    ADDW R26->24, R21, R15    <=>    add w15, w21, w26, asr #24

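The four shift operators can be modeled on a 64-bit value as follows. This is a sketch of the arithmetic only; the function name shift is hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// shift applies the Go assembly shift operator op to x by n bits.
func shift(op string, x uint64, n uint) uint64 {
	switch op {
	case "<<": // lsl: logical shift left
		return x << n
	case ">>": // lsr: logical shift right, zero fill
		return x >> n
	case "->": // asr: arithmetic shift right, sign fill
		return uint64(int64(x) >> n)
	case "@>": // ror: rotate right
		return bits.RotateLeft64(x, -int(n))
	}
	panic("unknown shift operator " + op)
}

func main() {
	top := uint64(1) << 63
	fmt.Printf("%#x\n", shift(">>", top, 63)) // 0x1
	fmt.Printf("%#x\n", shift("->", top, 63)) // 0xffffffffffffffff
	fmt.Printf("%#x\n", shift("@>", 1, 1))    // 0x8000000000000000
}
```
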
Extended registers are written as <Rm>{.<extend>{<<<amount>}}.
<extend> can be UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW or SXTX.

  Examples:
    ADDS R19.UXTB<<4, R9, R26     <=>    adds x26, x9, w19, uxtb #4
    ADDSW R14.SXTX, R14, R6       <=>    adds w6, w14, w14, sxtx

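An extended register operand first narrows the register to a byte, halfword, word or doubleword, zero- or sign-extends it, and then applies the optional left shift. The sketch below models that for a 64-bit destination; extend is a hypothetical helper.

```go
package main

import "fmt"

// extend applies the named extension to x and shifts the result left by
// amount bits.
func extend(op string, x uint64, amount uint) uint64 {
	var v uint64
	switch op {
	case "UXTB":
		v = uint64(uint8(x))
	case "UXTH":
		v = uint64(uint16(x))
	case "UXTW":
		v = uint64(uint32(x))
	case "SXTB":
		v = uint64(int64(int8(x)))
	case "SXTH":
		v = uint64(int64(int16(x)))
	case "SXTW":
		v = uint64(int64(int32(x)))
	case "UXTX", "SXTX":
		v = x
	default:
		panic("unknown extend " + op)
	}
	return v << amount
}

func main() {
	fmt.Printf("%#x\n", extend("UXTB", 0x1ff, 4)) // 0xff0
	fmt.Printf("%#x\n", extend("SXTB", 0x80, 1))  // 0xffffffffffffff00
}
```
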
Memory references: [<Xn|SP>{,#0}] is written as (Rn|RSP), a base register with an immediate
offset is written as imm(Rn|RSP), and a base register with an offset register is written as (Rn|RSP)(Rm).

  Examples:
    LDAR (R22), R9                  <=>    ldar x9, [x22]
    LDP 28(R17), (R15, R23)         <=>    ldp x15, x23, [x17,#28]
    MOVWU (R4)(R12<<2), R8          <=>    ldr w8, [x4, x12, lsl #2]
    MOVD (R7)(R11.UXTW<<3), R25     <=>    ldr x25, [x7,w11,uxtw #3]
    MOVBU (R27)(R23), R14           <=>    ldrb w14, [x27,x23]

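The simpler memory forms above can be rendered in GNU syntax with two small helpers. This is a sketch only: memImm and memReg are hypothetical names, the register names are assumed to be already translated, and shifted or extended index registers are left out.

```go
package main

import "fmt"

// memImm renders a base register with an immediate offset; a zero
// offset is written as the bare base register.
func memImm(base string, imm int64) string {
	if imm == 0 {
		return "[" + base + "]"
	}
	return fmt.Sprintf("[%s,#%d]", base, imm)
}

// memReg renders a base register with an offset register.
func memReg(base, index string) string {
	return fmt.Sprintf("[%s,%s]", base, index)
}

func main() {
	fmt.Println(memImm("x22", 0))     // (R22)      => [x22]
	fmt.Println(memImm("x17", 28))    // 28(R17)    => [x17,#28]
	fmt.Println(memReg("x27", "x23")) // (R27)(R23) => [x27,x23]
}
```
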
Register pairs are written as (Rt1, Rt2).

  Examples:
    LDP.P -240(R11), (R12, R26)    <=>    ldp x12, x26, [x11],#-240

Register with arrangement and register with arrangement and index.

  Examples:
    VADD V5.H8, V18.H8, V9.H8                     <=>    add v9.8h, v18.8h, v5.8h
    VLD1 (R2), [V21.B16]                          <=>    ld1 {v21.16b}, [x2]
    VST1.P V9.S[1], (R16)(R21)                    <=>    st1 {v9.s}[1], [x16], x21
    VST1.P [V13.H8, V14.H8, V15.H8], (R3)(R14)    <=>    st1 {v13.8h-v15.8h}, [x3], x14
    VST1.P [V14.D1, V15.D1], (R7)(R23)            <=>    st1 {v14.1d, v15.1d}, [x7], x23
*/
package arm64