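One gcc-specific way to do this is to use typeof and nested functions to create a function pointer that embeds the call to the underlying function but takes no arguments itself.
This pointer can be passed to a thunk method, which calls it and verifies ABI compliance.
Here is an example of transforming a call to int add3(int, int, int) using this approach.
The original call looks like this: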
int res = add3(a, b, c);
Then you wrap the call in a macro, like this (see footnote 2):
CALL_THUNKED(int res, add3, (a,b,c));
...which expands to something like:
typedef typeof(add3 (a,b,c)) ret_type;
ret_type closure() {
return add3 (a,b,c);
}
typedef ret_type (*typed_closure)(void);
typedef ret_type (*thunk_t)(typed_closure);
thunk_t thunk = (thunk_t)closure_thunk;
int res = thunk(&closure);
We create the closure() function on the stack, which calls directly into add3 with the original arguments. We can take the address of this closure and pass it to an asm function without difficulty: calling it has the ultimate effect of calling add3 with the original arguments (see footnote 1).
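The rest of the typedefs basically deal with the return type. We have only a single closure_thunk method, declared as void* closure_thunk(void (*)(void)); and implemented in assembly. It takes a function pointer (any function pointer can be converted to any other function pointer), but its return type is "wrong". We cast it to thunk_t, a dynamically generated typedef for a function with the "right" return type.
Of course, this is certainly not legal for C functions, but we are implementing the function in asm, so we kind of sidestep the issue (if you wanted to be a bit more compliant, you could perhaps ask the asm code for a function pointer of the right type, which it could "generate" each time, outside the reach of the standard; of course, it would just return the same pointer every time).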
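To make the expansion concrete, here is a minimal sketch of how the CALL_THUNKED macro itself might be written. The macro body is an assumption reconstructed from the expansion shown above, not the author's actual definition. The closure_thunk here is a hypothetical pass-through stand-in that performs no register checks and only jumps to the closure; it assumes gcc on x86-64 Linux with the GNU assembler. add3_checked is an illustrative wrapper name.

```c
/* Declared exactly as in the text; the real implementation lives in asm. */
void *closure_thunk(void (*)(void));

/* ASSUMPTION: pass-through stand-in for the real asm thunk (x86-64 Linux,
   GNU as). It does no ABI checking: it tail-jumps to the closure whose
   address arrives in rdi, so any return value passes through untouched. */
__asm__(
    ".pushsection .text\n"
    ".globl closure_thunk\n"
    ".type closure_thunk, @function\n"
    "closure_thunk:\n"
    "    jmp *%rdi\n"
    ".popsection\n"
);

/* Hypothetical macro body. __typeof__ is gcc's alternate spelling of typeof.
   The macro defines fixed names (closure, thunk, ...), so it can be used
   only once per scope. */
#define CALL_THUNKED(decl, fn, args)                      \
    typedef __typeof__(fn args) ret_type;                 \
    ret_type closure(void) { return fn args; }            \
    typedef ret_type (*typed_closure)(void);              \
    typedef ret_type (*thunk_t)(typed_closure);           \
    thunk_t thunk = (thunk_t)closure_thunk;               \
    decl = thunk(&closure)

int add3(int a, int b, int c) { return a + b + c; }

/* Illustrative caller: routes the add3 call through the thunk. */
int add3_checked(int a, int b, int c) {
    CALL_THUNKED(int res, add3, (a, b, c));
    return res;
}
```

Calling add3_checked(1, 2, 3) goes through the nested closure and the stand-in thunk before reaching add3. Because taking the address of a nested function makes gcc build a trampoline on the stack, the linker may warn that the program needs an executable stack.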
The closure_thunk function in asm is implemented along these lines:
GLOBAL closure_thunk:function
closure_thunk:
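push rsi                ; save rsi, which holds the function name used in error messages
push_callee_saved       ; macro (not shown): pushes rbp, rbx, r12, r13, r14, r15, in that order
call rdi                ; call the closure passed as the first argument
; load the saved function name into rdi for the bad_* handlers
mov rdi, [rsp + 48]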
; now check whether any regs were clobbered
cmp rbp, [rsp + 40]
jne bad_rbp
cmp rbx, [rsp + 32]
jne bad_rbx
cmp r12, [rsp + 24]
jne bad_r12
cmp r13, [rsp + 16]
jne bad_r13
cmp r14, [rsp + 8]
jne bad_r14
cmp r15, [rsp]
jne bad_r15
add rsp, 7 * 8
ret
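That is, push all the registers we want to check onto the stack (along with the function name), call the function in rdi, and then do the checks. The bad_* methods aren't shown, but they basically spit out an error message like "function add3 clobbered rbp... naughty!" and abort() the process.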
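Since the bad_* handlers are not shown, here is one way their C side could look. This is a sketch under assumptions: report_clobber and format_clobber_message are hypothetical names, and it assumes each bad_* label loads a register-name string as the second argument before transferring control (the function name is already in rdi, the first argument, at that point).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds the diagnostic text into buf.
   Returns the number of characters snprintf would have written. */
int format_clobber_message(char *buf, size_t len,
                           const char *fn_name, const char *reg_name) {
    return snprintf(buf, len, "function %s clobbered %s... naughty!",
                    fn_name, reg_name);
}

/* Hypothetical target for the bad_* labels: prints the message and aborts.
   ASSUMPTION: the asm passes the function name and register name as the
   first and second arguments. */
void report_clobber(const char *fn_name, const char *reg_name) {
    char msg[128];
    format_clobber_message(msg, sizeof msg, fn_name, reg_name);
    fprintf(stderr, "%s\n", msg);
    abort();
}
```

Splitting formatting from reporting keeps the message testable without aborting the process.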
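This breaks if any arguments are passed on the stack, but it does work for return values passed on the stack (since the ABI for that case passes a pointer to the location of the return value in rax).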
1 How this is accomplished is kind of magic: gcc actually writes a few bytes of executable code onto the stack, and the closure function pointer points there. Those few bytes basically load a register with a pointer to the region that contains the captured variables (a, b, c in this case), and then call the actual (read-only) closure() code, which can then access the captured variables through that pointer (and pass them to add3).
2 As it turns out, we could probably use gcc's statement expression syntax to write the macro in a more usual function-like syntax, something like int res = CALL_THUNKED(add3, (a,b,c)).
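The statement-expression variant suggested in footnote 2 might look like the sketch below. It is an assumption about how such a macro could be written, not a tested drop-in for the real thunk: closure_thunk is again a hypothetical pass-through stand-in with no register checks (gcc, x86-64 Linux, GNU assembler), and sum_twice is an illustrative name.

```c
/* Declared as in the text; the real implementation lives in asm. */
void *closure_thunk(void (*)(void));

/* ASSUMPTION: pass-through stand-in for the real asm thunk. It only
   tail-jumps to the closure passed in rdi and checks nothing. */
__asm__(
    ".pushsection .text\n"
    ".globl closure_thunk\n"
    ".type closure_thunk, @function\n"
    "closure_thunk:\n"
    "    jmp *%rdi\n"
    ".popsection\n"
);

/* Hypothetical expression-style macro: the ({ ... }) block gives every use
   its own scope, and the value of the last expression is the result. */
#define CALL_THUNKED(fn, args) ({                         \
    typedef __typeof__(fn args) ret_type;                 \
    ret_type closure(void) { return fn args; }            \
    typedef ret_type (*typed_closure)(void);              \
    typedef ret_type (*thunk_t)(typed_closure);           \
    ((thunk_t)closure_thunk)(&closure);                   \
})

int add3(int a, int b, int c) { return a + b + c; }

/* Illustrative caller: two thunked calls in the same scope. */
int sum_twice(int a, int b, int c) {
    int first = CALL_THUNKED(add3, (a, b, c));
    int second = CALL_THUNKED(add3, (first, b, c));
    return second;
}
```

Because each use opens its own block, the fixed names inside the macro no longer collide, so the macro can appear several times in one function and inside larger expressions.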