As much as possible, ELF dynamic linking defers the resolution of jump/call addresses until the last minute. The technique is inspired by the i386 design, and is based on the following constraints.
1) The calling technique should not force a change in the assembly code produced for apps; it MAY cause changes in the way assembly code is produced for PIC code (i.e. libraries).
2) Executable areas must never be modified, and modified areas must never be executed.
To do this, there are three steps involved in a typical jump:
1) in the code
2) through the PLT
3) using a pointer from the GOT
When the executable or library is first loaded, the GOT entry points to code which implements dynamic name resolution and code finding. On the first invocation, the function is located and the GOT entry is replaced by the address of the real function. Subsequent calls go through 1)-2)-3) and end up calling the real code.
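The lazy patching described above can be modelled in a few lines of Python. This is only a toy sketch: the GOT holds callables instead of addresses, and real_puts, SYMBOLS, make_lazy_entry and plt_stub are hypothetical names invented for the illustration, not part of any real loader.

```python
# Toy model of lazy binding: got[] holds callables instead of addresses.
calls_to_resolver = []


def real_puts(text):
    # Stand-in for the real library function (hypothetical).
    return "puts:" + text


# Hypothetical lookup table: PLT index n -> real code.
SYMBOLS = {0: real_puts}


def make_lazy_entry(got, n):
    """Return the initial content of GOT[n+3]: a hop into the resolver."""
    def resolve_then_call(*args):
        calls_to_resolver.append(n)
        target = SYMBOLS[n]      # dynamic name resolution and code finding
        got[n + 3] = target      # replace the GOT entry with the real function
        return target(*args)     # the first call still reaches the real code
    return resolve_then_call


got = [None, None, None, None]   # three special entries, then one PLT slot
got[3] = make_lazy_entry(got, 0)


def plt_stub(n, *args):
    # Steps 2) and 3): jump through whatever GOT[n+3] currently holds.
    return got[n + 3](*args)


first = plt_stub(0, "a")    # goes through the resolver and patches got[3]
second = plt_stub(0, "b")   # goes straight to real_puts
```

After the first call, got[3] holds real_puts itself, so the resolver runs exactly once per imported function.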
1) In the code:
The call site is typical ARM code: a 26 bit relative jump (b) or call (bl). The target is an entry in the PLT. Note that this call is identical to a normal call.
2) In the PLT:
The PLT is a synthetic area, created by the linker. It exists in both executables and libraries. It is an array of stubs, one per imported function call. It looks like this:
PLT[n+1]: ldr   ip, 1f          @load an offset
          add   ip, pc, ip      @add the offset to the pc
          ldr   pc, [ip]        @jump to that address
       1: .word GOT[n+3] - .
The add on the second line makes ip = &GOT[n+3], which contains either a pointer to PLT[0] (the fixup trampoline) or a pointer to the actual code.
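The reason the add lands exactly on the GOT slot is that an ARM instruction reading pc sees its own address plus 8, which for the second instruction of the stub is the address of the .word itself. A minimal sketch of that arithmetic follows; the function name stub_ip and the addresses are made up for the example.

```python
PC_AHEAD = 8   # on ARM, reading pc yields the current instruction address + 8


def stub_ip(stub_addr, got_slot_addr):
    """Value of ip after the add in a PLT[n+1] stub placed at stub_addr."""
    word_addr = stub_addr + 12                 # .word follows three 4-byte instructions
    offset = got_slot_addr - word_addr         # what the linker stores: GOT[n+3] - .
    pc_at_add = (stub_addr + 4) + PC_AHEAD     # the add is the second instruction
    return (pc_at_add + offset) & 0xFFFFFFFF   # 32-bit wraparound


# Made-up addresses: a stub at 0x8010 and its GOT slot at 0x1000C.
ip = stub_ip(0x8010, 0x1000C)
```

The mask keeps the result correct when the GOT sits below the PLT and the stored offset is negative.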
The first PLT entry is slightly different, and is used to form a trampoline to the fixup code.
PLT[0]: str     lr, [sp, #-4]!  @push the lr
        ldr     lr, [pc, #16]   @load from 6 words ahead
        add     lr, pc, lr      @form an address
        ldr     pc, [lr, #8]!   @jump to the contents of that addr
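The lr is pushed on the stack and then used for calculations. The load on the second line loads lr with &GOT[3] - . - 20, where . is the address of the add on the third line. On the third line, the pc reads as . + 8, so the addition leaves
    lr = (&GOT[3] - . - 20) + (. + 8)
    lr = (&GOT[3] - 12)
    lr = &GOT[0]
On the fourth line, the load uses pre-indexed addressing with writeback, so the pc and lr are both updated:
    pc = GOT[2]
    lr = &GOT[0] + 8 = &GOT[2]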
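The four instructions of the first PLT entry can be traced step by step. The sketch below assumes 4-byte words, a first PLT entry of four instructions followed directly by the first ordinary stub, and made-up base addresses; plt0_trace and got_words are hypothetical names used only for this illustration.

```python
WORD = 4
PC_AHEAD = 8   # on ARM, reading pc yields the current instruction address + 8


def plt0_trace(plt, got, got_words):
    """Return (pc, lr) after the first PLT entry has run.

    plt and got are base addresses; got_words maps a GOT index to the
    value stored in that entry.
    """
    # The borrowed .word belongs to the next stub: 16 bytes of this
    # entry plus three instructions, and it holds GOT[3] - .
    borrowed_addr = plt + 28
    borrowed = (got + 3 * WORD) - borrowed_addr

    # ldr lr, [pc, #16] at plt+4 reads exactly that word.
    assert (plt + 4) + PC_AHEAD + 16 == borrowed_addr
    lr = borrowed

    # add lr, pc, lr at plt+8 forms the address of GOT[0].
    lr = (plt + 8) + PC_AHEAD + lr

    # ldr pc, [lr, #8]! at plt+12: pre-indexed, lr is written back.
    lr = lr + 8
    pc = got_words[(lr - got) // WORD]
    return pc, lr


# Made-up layout: PLT at 0x8000, GOT at 0x10000, resolver at 0x4000ABCD.
pc, lr = plt0_trace(0x8000, 0x10000, {2: 0x4000ABCD})
```

With these values pc ends up holding the content of the third GOT entry and lr holds that entry's address.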
3) In the GOT:
The GOT (global offset table) contains helper pointers for both PLT fixups and GOT fixup. The first 3 entries are special. The next M entries belong to the PLT fixups. The next D entries belong to various data fixups.
The GOT is also a synthetic area, created by the linker. It exists in both executables and libraries.
When the GOT is first set up, all the GOT entries relating to PLT fixups point back at PLT[0].
The special entries in the GOT are:
    GOT[0] = linked list pointer used by the dyn-loader
    GOT[1] = pointer to the reloc table for this module
    GOT[2] = pointer to the fixup/resolver code
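As a quick illustration of the layout, the index ranges for a GOT with M PLT fixup entries and D data fixup entries can be written down directly. The helper names got_layout and got_index_for_plt are invented for this sketch.

```python
def got_layout(m, d):
    """Index ranges of a GOT with m PLT fixup slots and d data fixup slots."""
    return {
        "special": range(0, 3),            # the three special entries
        "plt": range(3, 3 + m),            # one slot per PLT stub
        "data": range(3 + m, 3 + m + d),   # data fixups follow
    }


def got_index_for_plt(n):
    # PLT[n+1] jumps through GOT[n+3]; the first PLT entry is the
    # trampoline and has no slot of its own.
    return n + 3


layout = got_layout(m=4, d=2)
```

For m=4 and d=2 the PLT slots are indices 3 to 6 and the data slots are indices 7 and 8.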
The first invocation of a function call goes through the fixup/resolver code. On entry to the fixup/resolver code:
    ip = &GOT[n+3]
    lr = &GOT[2]
    stack[0] = lr of the function call
    [r0, r1, r2, r3 are still caller data]
This is enough information for the fixup/resolver code to work with. Before the fixup/resolver code returns, it repairs the entry at &GOT[n+3] so that it points to the real code, and actually calls the requested function.
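One way to see why these registers are enough: with ip pointing at the function's GOT slot and lr pointing at the third special entry, the stub index n falls out of a subtraction. The text does not say how the resolver actually computes it, so the sketch below is an assumption that is merely consistent with the register state; plt_index, repair, and the addresses are hypothetical.

```python
WORD = 4


def plt_index(ip, lr):
    """Recover n, assuming ip = &GOT[n+3] and lr = &GOT[2]."""
    return (ip - lr) // WORD - 1


def repair(got_words, got, ip, real_addr):
    """Store the real address in the slot that ip points at."""
    got_words[(ip - got) // WORD] = real_addr


# Made-up state: GOT at 0x10000, call made through the stub with n = 5.
got = 0x10000
lr = got + 2 * WORD
ip = got + (5 + 3) * WORD
n = plt_index(ip, lr)

got_words = {}
repair(got_words, got, ip, 0x40001234)   # 0x40001234: pretend real address
```

Here n comes out as 5 and the repaired slot is GOT index 8, so the next call through the same stub reaches the real code directly.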
NOTE: PLT[0] borrows an offset .word from PLT[1]. I know this is a little "tight", but it allows us to keep all the PLT entries the same size.