The Wayback Machine - https://web.archive.org/web/20210301161113/https://github.com/JuliaLang/julia/issues/39820
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Darwin/ARM64: Linker corrupts generated code #39820

Open
Keno opened this issue Feb 25, 2021 · 10 comments
Open

Darwin/ARM64: Linker corrupts generated code #39820

Keno opened this issue Feb 25, 2021 · 10 comments

Comments

@Keno
Copy link
Member

@Keno Keno commented Feb 25, 2021

LLVM emits the following:

Lloh1699:
        adrp    x8, __MergedGlobals.1@PAGE
Lloh1700:
        add     x9, x8, __MergedGlobals.1@PAGEOFF
Ltmp3214:
        .loc    2 87 0
        add     x8, x22, #1
Ltmp3215:
        .loc    14 197 0
        ldr     x10, [x21, #8]
Ltmp3216:
        .loc    2 444 0
Lloh1701:
        ldr     x9, [x9]

LOH is what's called Linker-optimization-hint which is an aarch64-specific thing where the linker patches out some instructions (an explanation of how that works is here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64CollectLOH.cpp). However, after linking the instruction sequence is:

    0x108017f4c <+1004>: adrp   x8, 23520
    0x108017f50 <+1008>: nop    
    0x108017f54 <+1012>: add    x8, x22, #0x1             ; =0x1 
    0x108017f58 <+1016>: ldr    x10, [x21, #0x8]
->  0x108017f5c <+1020>: ldr    x9, [x8, #0x740]

looks to me like what happened here is that the relocation offset overflowed the size of the available immediate operand of the subsequent load, causing it to overflow into the source register field, causing crashes.

@Keno Keno mentioned this issue Feb 25, 2021
20 of 29 tasks complete
@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

For posterity, the linker in question here is ld64-607.2. I'm gonna check if this is fixed in a newer linker version.

@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

Actually, the linker is fine here. The issue is the intervening assignment to x8, so the compiler shouldn't have asked for an loh.

@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

Looks like the fix in https://reviews.llvm.org/D80834 needs to be extended to the adrp/add/ldr case also.

@aemerson
Copy link

@aemerson aemerson commented Feb 25, 2021

Do you have an IR reproducer? I tried to re-construct the issue by using this MIR:

--- |
  @sym = local_unnamed_addr global i32 zeroinitializer, align 8

  define i32 @main() {
    ret i32 0
  }

...
---
name:            main
alignment:       4
tracksRegLiveness: true
liveins:
  - { reg: '$x22', virtual-reg: '' }
  - { reg: '$x21', virtual-reg: '' }
body:             |
  bb.0:
    liveins: $x21, $x22
    renamable $x8 = ADRP target-flags(aarch64-page) @sym
    renamable $x9 = ADDXri killed renamable $x8, target-flags(aarch64-pageoff, aarch64-nc) @sym, 0
    renamable $x8 = ADDXri killed renamable $x22, 1, 0
    $x10 = LDRXui $x21, 8
    $x9 = LDRXui $x9, 0
    RET undef $lr

...

but linking this I get a binary that has the last load becoming a literal load, which seems correct:

	nop
	nop
	add	x8, x22, #0x1
	ldr	x10, [x21, #0x40]       ; Latency: 4
	ldr	x9, #0x50               ; Latency: 4
	ret
@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

Can you try making the address of @sym more than a page away from the pc of the load instruction? In the original example above, the linker had to retain the adrp, but could fold the offset into the subsequent load. I'll generate you an end-to-end reproducer in the meantime.

@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

@aemerson Here's a full end-to-end example for you: https://gist.github.com/Keno/97c670bb659866ed1925081624fa1fb6. The .s is generated from the .ll in the gist using the command at the top (this is on LLVM11, but the issue reproduced on trunk also, though I didn't check this particular test case). You can see the problematic sequence here: https://gist.github.com/Keno/97c670bb659866ed1925081624fa1fb6#file-extract-loh-s-L476-L488

@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

(Note this is only extracting the one function. If that turns out to not be big enough to reproduce this fully, I can send you the whole .bc file. It's about 100MB though.

@aemerson
Copy link

@aemerson aemerson commented Feb 25, 2021

Ok, adding some very large padding in the assembly seems to have reproduced it. I now see:

_main:
	adrp	x8, 293 ; 0x10012c000
	nop
	add	x8, x22, #0x1
	ldr	x10, [x21, #0x40]       ; Latency: 4
	ldr	x9, [x8, #0xf80]        ; Latency: 4
	ret
@Keno
Copy link
Member Author

@Keno Keno commented Feb 25, 2021

Yep, that's the issue. The x8 will have been clobbered by the intervening add x8, x22, #0x1, so that'll crash at runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants