iOS Internals: The Message-Sending Mechanism

2023-05-16

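Preface

An earlier exploration of the class cache showed that methods are cached in the class object's cache member, and that the purpose of the cache is to make method calls respond faster.

Entries are written through the insert method of the cache_t struct. This article explores how the cache is read.

The comments in the objc-cache source, where insert lives, list the cache readers: objc_msgSend and cache_getImp.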

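Message sending

C is a static language and Objective-C is a dynamic one: the concrete type of an object need not be known at compile time. The type is checked at run time, and the implementation is found by method name. What makes the language dynamic is the runtime API, whose topics revolve around 2 core ideas:

1. Dynamic configuration. Class information can be modified at run time: adding properties and methods, and even changing the values of instance variables.

2. Message passing, which covers both sending and forwarding. At compile time a method call is converted into a call to objc_msgSend. Sending the message is the process of using the sel (method name) to find the imp (method implementation).

objc_msgSend

Take the following demo as an example: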


#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import <malloc/malloc.h>

@interface Student : NSObject

- (void)study;
- (void)play;
+ (void)eat;

@end

@implementation Student

- (void)study {
    NSLog(@"%s",__func__);
}
- (void)play {
    NSLog(@"%s",__func__);
}
+ (void)eat {
    NSLog(@"%s",__func__);
}

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        Student *student = [Student alloc];
        [student study];
        [student play];
        
    }
    return 0;
}

Open a terminal and, in the project directory, use clang to rewrite main.m into a C++ file with the .cpp extension:

clang -rewrite-objc main.m

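Open the generated file and find the main function. After rewriting, every method call goes through objc_msgSend, which shows that a method call is essentially a message send.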

// @end
int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 
        Student *student = ((Student *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("Student"), sel_registerName("alloc"));
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)student, sel_registerName("study"));
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)student, sel_registerName("play"));
    }
    return 0;
}

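objc_msgSend always takes 2 implicit parameters: the message receiver, of type id, and the method name, of type SEL.

The initial alloc call sends a message to the class object, objc_getClass("Student").

If the receiver is an instance, the instance's isa leads to the class object, where the instance method is found. Class methods work the same way and are found in the metaclass object.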
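
To make the isa hop concrete, here is a minimal sketch in plain C rather than the real runtime. The toy_object and toy_class types, the toy_responder function and the sample data are all hypothetical, and real selectors are compared as unique pointers rather than with strcmp. It only illustrates that one lookup routine can serve both cases, because it always searches the class that the receiver's isa points to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_class;

/* Every object, a class object included, begins with an isa pointer. */
struct toy_object { const struct toy_class *isa; };

struct toy_class {
    struct toy_object base;        /* base.isa points to the metaclass */
    const char *name;
    const char *const *methods;    /* NULL-terminated selector names */
};

/* Return the name of the class whose method list contains sel, or NULL. */
const char *toy_responder(const struct toy_object *receiver, const char *sel) {
    assert(receiver != NULL && sel != NULL);
    const struct toy_class *cls = receiver->isa;
    for (const char *const *m = cls->methods; *m != NULL; m++) {
        if (strcmp(*m, sel) == 0) return cls->name;
    }
    return NULL;
}

/* Hypothetical data mirroring the Student demo. */
const char *const toy_instance_methods[] = { "study", "play", NULL };
const char *const toy_class_methods[]    = { "eat", NULL };
const struct toy_class toy_student_meta  = { { NULL }, "Student (metaclass)", toy_class_methods };
const struct toy_class toy_student_class = { { &toy_student_meta }, "Student", toy_instance_methods };
const struct toy_object toy_student      = { &toy_student_class };
```

Passing &toy_student searches the Student class, while passing &toy_student_class.base (the class object as receiver) searches the metaclass, so "eat" is only found in the second case.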

If the method takes an argument:

- (void)study:(NSString *)what {
    NSLog(@"study %@", what ?: @"nothing");
}

After rewriting, the argument is appended at the end:

((void (*)(id, SEL, NSString *))(void *)objc_msgSend)((id)student, sel_registerName("study:"), (NSString *)&__NSConstantStringImpl__var_folders_gb_l37zch3569s4z3rjfpkwts000000gn_T_main_c85dd8_mi_4);

Try calling the method directly through these APIs:

#import <objc/message.h> // remember to import this header
// Option 1
((void (*)(id, SEL))(void *)objc_msgSend)((id)student, sel_registerName("play"));
// Option 2
((void (*)(id, SEL))(void *)objc_msgSend)((id)student, NSSelectorFromString(@"play"));

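Press Option + Up Arrow to jump to the top of the cpp file. There you can see that objc_msgSend is not a single function but a family.

In order, the functions below send a message to the receiver's own class, send a message starting from the superclass, handle methods that return a struct, handle super calls that return a struct, and handle methods that return a floating-point value.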

__OBJC_RW_DLLIMPORT void objc_msgSend(void);
__OBJC_RW_DLLIMPORT void objc_msgSendSuper(void);
__OBJC_RW_DLLIMPORT void objc_msgSend_stret(void);
__OBJC_RW_DLLIMPORT void objc_msgSendSuper_stret(void);
__OBJC_RW_DLLIMPORT void objc_msgSend_fpret(void);

objc_msgSendSuper

Switch to another demo to test super:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        FFToys *toy = [[FFToys alloc] init];
        [toy testInstancePrint];
    }
    return 0;
}

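Surprisingly, the classes printed by the method are identical.

Rewrite to cpp again to find out why, this time rewriting the class's implementation file. The call through super is sent with objc_msgSendSuper.

In Xcode, open the documentation (menu bar -> Help -> Developer Documentation) and read the Discussion section for this function:

When it encounters a method call, the compiler generates a call to one of the functions objc_msgSend, objc_msgSend_stret, objc_msgSendSuper, or objc_msgSendSuper_stret. Messages sent to an object's superclass (using the super keyword) are sent using objc_msgSendSuper; other messages are sent using objc_msgSend. Methods that have data structures as return values are sent using objc_msgSendSuper_stret and objc_msgSend_stret.

The parameters are documented as follows:

super
A pointer to an objc_super data structure. Pass values identifying the context the message was sent to, including the instance of the class that is to receive the message and the superclass at which to start searching for the method implementation.
op
A pointer of type SEL. Pass the selector of the method that will handle the message.
…
A variable argument list containing the arguments to the method.

Since the message goes to "the instance of the class that is to receive the message", look back at the rewritten code: the receiver here is still self.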

(__rw_objc_super){(id)self, (id)class_getSuperclass(objc_getClass("FFToys"))}

What does "the superclass at which to start searching for the method implementation" mean?

Look at the objc_super struct:

/// Specifies the superclass of an instance. 
struct objc_super {
    /// Specifies an instance of a class.
    __unsafe_unretained _Nonnull id receiver;

    /// Specifies the particular superclass of the instance to message. 
#if !defined(__cplusplus)  &&  !__OBJC2__
    /* For compatibility with old objc-runtime.h header */
    __unsafe_unretained _Nonnull Class class;
#else
    __unsafe_unretained _Nonnull Class super_class;
#endif
    /* super_class is the first class to search */
};

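Following the arguments in the rewritten source, {(id)self, (id)class_getSuperclass(objc_getClass("FFToys"))}, we can imitate what super does: build an objc_super struct whose receiver is self and whose super_class is the parent class, FFGoods.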

- (void)testInstancePrint {
//    [super testInstancePrint];
    
    struct objc_super ff_objc_super;
    ff_objc_super.receiver = self;
    ff_objc_super.super_class = FFGoods.class;

    // testInstancePrint returns void, so the typed function pointer returns void as well
    void (*objc_msgSendSuperTyped)(struct objc_super *super, SEL op) = (void *)objc_msgSendSuper;
    objc_msgSendSuperTyped(&ff_objc_super,@selector(testInstancePrint));
}

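The parent class's method is printed successfully.

So the receiver of a message and the class where the lookup starts are not necessarily the same.

super is only a keyword. The super_class field of the struct is the parent class, which means the lookup starts from the parent class object. It does not mean that the receiver is a parent class object.

This usage is adapted from Stack Overflow.

What sets objc_msgSendSuper apart is only where the method lookup starts. For example, setting super_class to FFToys causes infinite recursion, because the lookup finds FFToys's own testInstancePrint, which then sends the same message again:

super_class = FFGoods.class; 
// infinite recursion
super_class = FFToys.class;

Setting it to NSObject crashes immediately, because the method cannot be found.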
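
The split between "who receives" and "where the search starts" can be modelled in a few lines of plain C. This is only a sketch under my own assumptions: toy_class, toy_super and toy_find are hypothetical names, each toy class implements at most one selector, and selectors are compared as strings. It is not how the runtime stores methods.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy class: a name, one optional selector, and a superclass link. */
struct toy_class {
    const char *name;
    const char *method;              /* selector this class implements, or NULL */
    const struct toy_class *super;
};

/* Same shape as objc_super: who receives, and where the search starts. */
struct toy_super {
    const void *receiver;
    const struct toy_class *start;
};

/* Walk up from start and return the class that implements sel, or NULL. */
const struct toy_class *toy_find(const struct toy_class *start, const char *sel) {
    assert(sel != NULL);
    for (const struct toy_class *c = start; c != NULL; c = c->super) {
        if (c->method != NULL && strcmp(c->method, sel) == 0) return c;
    }
    return NULL;
}

/* Hypothetical hierarchy mirroring FFToys : FFGoods : NSObject. */
const struct toy_class toy_root  = { "NSObject", NULL, NULL };
const struct toy_class toy_goods = { "FFGoods", "testInstancePrint", &toy_root };
const struct toy_class toy_toys  = { "FFToys",  "testInstancePrint", &toy_goods };
```

Starting at toy_goods finds the parent's implementation while the receiver stays whatever was stored in toy_super. Starting at toy_toys finds FFToys's own implementation again, which is the recursion case, and starting at toy_root finds nothing, which corresponds to the crash.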

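Fast lookup

objc_msgSend has a separate implementation for each architecture.

Taking arm64 as an example, the implementation is written in assembly, and ENTRY marks the entry point.

Why assembly? It is faster, and it can use the argument registers directly, which avoids the cost of copying arguments. It also lets the function jump to an imp of any signature while leaving the caller's arguments untouched, something plain C cannot express.

The compiler prefixes C functions and global variables with an underscore '_' to avoid symbol collisions, because one program can mix assembly and C files that share the same symbol namespace.

Debug the assembly on a real device: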

#import "ViewController.h"
#import "FFToys.h"

@interface ViewController ()

@end

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    
    FFToys *toy = [[FFToys alloc] init];
    [toy testInstancePrint];
    
}

@end

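Assembly breakpoint:

Step into the objc_msgSend call. The function has 2 incoming arguments: the message receiver and the method name.

On arm64, x0 holds the first argument, the receiver, and x1 holds the second, the method name.

Print them to verify:

LLDB output: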

(lldb) register read x0
      x0 = 0x0000000280fd44a0
(lldb) po 0x0000000280fd44a0
<FFToys: 0x280fd44a0>

(lldb) register read x1
      x1 = 0x00000001023a25d9  "testInstancePrint"
(lldb)

Keep single-stepping until you reach the function's implementation:

It matches the assembly in the source, starting with cmp p0:


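// Entry point of objc_msgSend
	ENTRY _objc_msgSend
// Unwind info: this function sets up no stack frame
	UNWIND _objc_msgSend, NoFrame

// Compare p0 (the receiver) with 0; this covers both the nil check and the tagged pointer check
	cmp	p0, #0			// nil check and tagged pointer check

// If tagged pointers are supported
#if SUPPORT_TAGGED_POINTERS
// b is a branch; b.le branches when p0 <= 0 as a signed value, that is, nil or a tagged pointer
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
// Without tagged pointers, branch to LReturnZero when p0 == 0 (nil receiver)
	b.eq	LReturnZero
#endif
// Load isa from the receiver into p13
	ldr	p13, [x0]		// p13 = isa
// Extract the class from isa and store it in p16
	GetClassFromIsa_p16 p13, 1, x0	// p16 = class
// LGetIsaDone is a label, reached once the class is in p16
LGetIsaDone:
	// calls imp or objc_msgSend_uncached
// Search the cache: call the imp on a hit, or go to __objc_msgSend_uncached on a miss
	CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached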
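
Why a single cmp can stand for 2 checks is easier to see in C. The sketch below is my own illustration, not runtime code: toy_classify and the toy_kind enum are hypothetical, and it assumes the usual two's complement conversion from uint64_t to int64_t. Read as a signed number, nil is 0 and any pointer with its most significant bit set is negative, so "<= 0" catches both.

```c
#include <assert.h>
#include <stdint.h>

enum toy_kind { TOY_NIL_OR_TAGGED, TOY_NORMAL };

enum toy_kind toy_classify(uint64_t receiver_bits) {
    /* Signed view of the same 64 bits, as the b.le condition sees them
       (assumes the usual two's complement conversion). */
    int is_special = (int64_t)receiver_bits <= 0;
    /* Sanity check: "signed <= 0" is the same as "zero, or top bit set". */
    assert(is_special == (receiver_bits == 0 || (receiver_bits >> 63) != 0));
    return is_special ? TOY_NIL_OR_TAGGED : TOY_NORMAL;
}
```

The heap address from the LLDB session above, 0x0000000280fd44a0, has its top bit clear and therefore takes the normal path.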

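Print isa, which is in x13.

The class is obtained from isa with a mask and saved in x16. Add code that prints the class, then run again:

With the class in hand, the source goes on to the CacheLookup macro.

Its implementation starts by moving the class from x16 into x15:

From the class object, an offset of 16 bytes (past the 8-byte isa and the 8-byte superclass pointer) gives the cache member:

What is loaded into x11 is the first member of cache, _bucketsAndMaybeMask.

ANDing x11 with 0xffffffffffff gives x10, the buckets pointer.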

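// The buckets() method of cache_t
struct bucket_t *buckets() const;
// Implementation
struct bucket_t *cache_t::buckets() const
{
    uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
    return (bucket_t *)(addr & bucketsMask);
}
// bucketsMask depends on the build configuration. The definition below (~0ul, all bits set)
// is the one used when the mask is stored in a separate field; the arm64 assembly above
// masks with 0xffffffffffff instead.
static constexpr uintptr_t bucketsMask = ~0ul;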
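
The masking step can be tried out in isolation. The following C sketch assumes a layout of my own choosing for illustration: the low 48 bits of one 64-bit word hold the buckets pointer and the high 16 bits hold the mask. The names toy_pack, toy_buckets and toy_mask are hypothetical, and the real field layout varies between runtime configurations.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch only: low 48 bits hold the buckets
   pointer, high 16 bits hold the mask. */
#define TOY_BUCKETS_MASK UINT64_C(0x0000ffffffffffff)
#define TOY_MASK_SHIFT   48

/* Pack a buckets address and a mask into one 64-bit word. */
uint64_t toy_pack(uint64_t buckets, uint16_t mask) {
    assert((buckets & ~TOY_BUCKETS_MASK) == 0);   /* address must fit in 48 bits */
    return ((uint64_t)mask << TOY_MASK_SHIFT) | buckets;
}

/* The counterpart of "x11 & 0xffffffffffff": keep only the pointer bits. */
uint64_t toy_buckets(uint64_t packed) {
    return packed & TOY_BUCKETS_MASK;
}

/* Shift the pointer bits away to recover the mask. */
uint16_t toy_mask(uint64_t packed) {
    return (uint16_t)(packed >> TOY_MASK_SHIFT);
}
```

Packing both values into one word means a single load fetches everything the probe loop needs, at the cost of one AND and one shift to separate them again.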

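In other words, everything in the assembly so far serves to obtain buckets, the container of cached methods. The next part of the flow searches for the method's sel and, once it is found, executes CacheHit.

That is the cache-hit case; "call or return imp" means the implementation (imp) is either called or returned.

Inside CacheHit, the Mode passed in by CacheLookup selects the NORMAL branch; "authenticate and call imp" means the imp is authenticated and then called.

If the current bucket does not match, the next bucket is checked, looping until the method is found. If the loop finishes without a match, __objc_msgSend_uncached is called.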
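
The probe loop can be approximated in C. Treat this as a rough sketch under stated assumptions rather than a transcription of the assembly: the starting index is simply sel & mask, the scan steps to the previous slot and wraps around, an empty slot (sel == 0) ends the search, and imp is an int stand-in. The names toy_bucket, toy_cache_lookup and toy_table are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* sel == 0 marks an empty slot; imp is a plain int stand-in. */
struct toy_bucket { uintptr_t sel; int imp; };

/* Probe a table of mask + 1 buckets (a power of two). Returns the cached
   imp value, or -1 on a miss, where the real code would continue with the
   uncached path. */
int toy_cache_lookup(const struct toy_bucket *buckets, uintptr_t mask, uintptr_t sel) {
    assert(sel != 0);
    uintptr_t begin = sel & mask;          /* assumed hash: low bits of sel */
    uintptr_t i = begin;
    do {
        if (buckets[i].sel == sel) return buckets[i].imp;   /* cache hit */
        if (buckets[i].sel == 0)   return -1;               /* empty slot: miss */
        i = (i == 0) ? mask : i - 1;       /* previous slot, wrapping around */
    } while (i != begin);
    return -1;                             /* every slot scanned: miss */
}

/* Hypothetical 4-slot table: sel 5 hashes to slot 1; sel 9 collides with
   it and therefore sits in the previous slot, 0. */
const struct toy_bucket toy_table[4] = { { 9, 90 }, { 5, 50 }, { 0, 0 }, { 0, 0 } };
```

Looking up sel 9 shows the collision case: slot 1 holds sel 5, so the scan moves to slot 0 and hits there. Looking up sel 13 walks slots 1, 0 and 3 before the empty slot reports a miss.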

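That is the fast lookup flow for a message (summarized at the end of the article).

Slow lookup

Search for __objc_msgSend_uncached to reach its entry point, which is declared static with STATIC_ENTRY.

Inside MethodTableLookup:

Here bl calls _lookUpImpOrForward. Drop the leading underscore and you find the function lookUpImpOrForward.

The function first defines forward_imp, the imp used for message forwarding. It then checks class initialization, takes the lock, checks that the class is a known class, and so on; set those aside for now. The key part is the for loop that follows, which looks like an infinite loop:

// unreasonableClassCount() is the upper bound on the number of iterations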
for (unsigned attempts = unreasonableClassCount();;) {
      if (curClass->cache.isConstantOptimizedCache(/* strict */true)) {
#if CONFIG_USE_PREOPT_CACHES
          imp = cache_getImp(curClass, sel);
          if (imp) goto done_unlock;
          curClass = curClass->cache.preoptFallbackClass();
#endif
      } else {
          // curClass method list.
          method_t *meth = getMethodNoSuper_nolock(curClass, sel);
          if (meth) {
              imp = meth->imp(false);
              goto done;
          }

          if (slowpath((curClass = curClass->getSuperclass()) == nil)) {
              // No implementation found, and method resolver didn't help.
              // Use forwarding.
              imp = forward_imp;
              break;
          }
      }

      // Halt if there is a cycle in the superclass chain.
      if (slowpath(--attempts == 0)) {
          _objc_fatal("Memory corruption in class list.");
      }

      // Superclass cache.
      imp = cache_getImp(curClass, sel);
      if (slowpath(imp == forward_imp)) {
          // Found a forward:: entry in a superclass.
          // Stop searching, but don't cache yet; call method
          // resolver for this class first.
          break;
      }
      if (fastpath(imp)) {
          // Found the method in a superclass. Cache it in this class.
          goto done;
      }
}

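The first if branch looks in the cache again through cache_getImp; it applies when the class has a constant (preoptimized) cache. A second cache check of this kind also helps when another thread has cached the method in the meantime.

The else branch: if the cache still has nothing, search the current class's method lists.

Look at getMethodNoSuper_nolock: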


/***********************************************************************
 * getMethodNoSuper_nolock
 * fixme
 * Locking: runtimeLock must be read- or write-locked by the caller
 **********************************************************************/
static method_t *
getMethodNoSuper_nolock(Class cls, SEL sel)
{
    runtimeLock.assertLocked();

    ASSERT(cls->isRealized());
    // fixme nil cls? 
    // fixme nil sel?

    auto const methods = cls->data()->methods();
    for (auto mlists = methods.beginLists(),
              end = methods.endLists();
         mlists != end;
         ++mlists)
    {
        // <rdar://problem/46904873> getMethodNoSuper_nolock is the hottest
        // caller of search_method_list, inlining it turns
        // getMethodNoSuper_nolock into a frame-less function and eliminates
        // any store from this codepath.
        method_t *m = search_method_list_inline(*mlists, sel);
        if (m) return m;
    }

    return nil;
}

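Jump to search_method_list_inline. fastpath marks the branch that is most likely to be taken, and the name findMethodInSortedMethodList tells us that it searches a sorted method list. According to its comment, the other branch does a linear search of an unsorted method list, so there is no need to read it.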

ALWAYS_INLINE static method_t *
search_method_list_inline(const method_list_t *mlist, SEL sel)
{
    int methodListIsFixedUp = mlist->isFixedUp();
    int methodListHasExpectedSize = mlist->isExpectedSize();
    // Sorted list: binary search
    if (fastpath(methodListIsFixedUp && methodListHasExpectedSize)) {
        return findMethodInSortedMethodList(sel, mlist);
    } else {
        // Linear search of unsorted method list
        if (auto *m = findMethodInUnsortedMethodList(sel, mlist))
            return m;
    }

#if DEBUG
    // sanity-check negative results
    if (mlist->isFixedUp()) {
        for (auto& meth : *mlist) {
            if (meth.name() == sel) {
                _objc_fatal("linear search worked when binary search did not");
            }
        }
    }
#endif

    return nil;
}

Jump to findMethodInSortedMethodList; ALWAYS_INLINE means the function is always inlined.

// Always inlined
ALWAYS_INLINE static method_t *
findMethodInSortedMethodList(SEL key, const method_list_t *list)
{
    if (list->isSmallList()) {
        if (CONFIG_SHARED_CACHE_RELATIVE_DIRECT_SELECTORS && objc::inSharedCache((uintptr_t)list)) {
            return findMethodInSortedMethodList(key, list, [](method_t &m) { return m.getSmallNameAsSEL(); });
        } else {
            return findMethodInSortedMethodList(key, list, [](method_t &m) { return m.getSmallNameAsSELRef(); });
        }
    } else {
        return findMethodInSortedMethodList(key, list, [](method_t &m) { return m.big().name; });
    }
}

All 3 branches end up in the following overload, which finds the method by binary search.

/***********************************************************************
 * search_method_list_inline
 **********************************************************************/
template<class getNameFunc>
ALWAYS_INLINE static method_t *
findMethodInSortedMethodList(SEL key, const method_list_t *list, const getNameFunc &getName)
{
    ASSERT(list);
    // Binary search.
    // auto deduces the variable's type from its initializer.
    auto first = list->begin();
    auto base = first;
    // decltype(first) is the declared type of first, so probe gets the same type.
    decltype(first) probe;

    uintptr_t keyValue = (uintptr_t)key;
    uint32_t count;
    
    for (count = list->count; count != 0; count >>= 1) {
        probe = base + (count >> 1);
        
        uintptr_t probeValue = (uintptr_t)getName(probe);
        
        if (keyValue == probeValue) {
            // `probe` is a match.
            // Rewind looking for the *first* occurrence of this value.
            // This is required for correct category overrides.
            while (probe > first && keyValue == (uintptr_t)getName((probe - 1))) {
                probe--;
            }
            return &*probe;
        }
        
        if (keyValue > probeValue) {
            base = probe + 1;
            count--;
        }
    }
    
    return nil;
}

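Simulate the for loop:

// Assume count = 16; initially base = 0.

count >> 1 shifts right by one bit, which is the same as dividing by 2. So in the first round:

// probe = 8
// Suppose keyValue < probeValue: go straight to the second round.

Second round:

// count = 8, probe = 4, base = 0
// Suppose keyValue > probeValue:
// base = probe + 1 = 5, then count-- gives count = 7

Third round:

// count = (7 >> 1) = 3, probe = 5 + (3 >> 1) = 6
// Suppose keyValue == probeValue at this point.

After the match, the while loop runs probe-- and exits at the first occurrence in the list (the smallest probe).

This is why categories take priority: a category's method with the same name sits earlier in the list. When several categories define the same method, this ensures that the one compiled last is called first.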
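
The same loop can be run outside the runtime. The C port below keeps the control flow of findMethodInSortedMethodList but searches a plain array of integers instead of method_t entries; toy_find_first and toy_sorted are hypothetical names, and the duplicate values stand in for a method that categories have overridden.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary search over keys sorted in ascending order. Returns the index of
   the first element equal to key, or -1 if there is none. */
ptrdiff_t toy_find_first(const uintptr_t *keys, uint32_t n, uintptr_t key) {
    assert(keys != NULL || n == 0);
    const uintptr_t *first = keys;
    const uintptr_t *base = first;
    const uintptr_t *probe;
    uint32_t count;

    for (count = n; count != 0; count >>= 1) {
        probe = base + (count >> 1);
        if (key == *probe) {
            /* Rewind to the first occurrence of this value. */
            while (probe > first && key == *(probe - 1)) {
                probe--;
            }
            return probe - first;
        }
        if (key > *probe) {
            /* Continue in the upper half, excluding the probed element. */
            base = probe + 1;
            count--;
        }
    }
    return -1;
}

/* Hypothetical sorted keys; 20 appears three times. */
const uintptr_t toy_sorted[] = { 10, 20, 20, 20, 30, 40, 50 };
```

Searching for 20 first lands on index 3 and then rewinds to index 1, the earliest duplicate, which is the position a category override would occupy.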

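In methods() you can see a check for rwe, the extra storage that is allocated when, for example, categories are attached to the class.

That completes the lookup in the current class.

After the lookup

Back in lookUpImpOrForward:

If the method was found, control jumps to the done block:

Jump to log_and_fill_cache: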


/***********************************************************************
* log_and_fill_cache
* Log this method call. If the logger permits it, fill the method cache.
* cls is the method whose cache should be filled. 
* implementer is the class that owns the implementation in question.
**********************************************************************/
static void
log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
{
#if SUPPORT_MESSAGE_LOGGING
    if (slowpath(objcMsgLogEnabled && implementer)) {
        bool cacheIt = logMessageSend(implementer->isMetaClass(), 
                                      cls->nameForLogging(),
                                      implementer->nameForLogging(), 
                                      sel);
        if (!cacheIt) return;
    }
#endif
    cls->cache.insert(sel, imp, receiver);
}

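So once the method is found, it is inserted into the class's method cache; at this point the method has not been executed yet.

Back in the main function: what if the search of the current class finds nothing? curClass is set to the superclass, and the superclass's cache is searched next.

cache_getImp has only a declaration: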

// returns:
// - the cached IMP when one is found
// - nil if there's no cached value and the cache is dynamic
// - `value_on_constant_cache_miss` if there's no cached value and the cache is preoptimized
extern "C" IMP cache_getImp(Class cls, SEL sel, IMP value_on_constant_cache_miss = nil);

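Its implementation is also in assembly.

If the superclass's cache has no entry either, the loop starts over with the superclass and continues until the superclass is nil: if (slowpath((curClass = curClass->getSuperclass()) == nil)).

At that point message forwarding takes over, which a later article will cover.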
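
The order of the checks inside that loop is easy to lose track of, so here is a stripped-down model in C. It is a sketch under my own simplifications: each toy class has at most one method-list entry and one cache entry, imps are ints, and TOY_FORWARD and TOY_CORRUPT stand in for forward_imp and the _objc_fatal path. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FORWARD (-1)   /* stand-in for forward_imp */
#define TOY_CORRUPT (-2)   /* stand-in for the _objc_fatal path */

struct toy_class {
    const struct toy_class *super;
    const char *method;  int method_imp;   /* one method-list entry, or NULL */
    const char *cached;  int cached_imp;   /* one cache entry, or NULL */
};

int toy_lookup_imp(const struct toy_class *cls, const char *sel, unsigned attempts) {
    assert(cls != NULL && sel != NULL);
    const struct toy_class *cur = cls;
    for (;;) {
        /* 1. The current class's own method list. */
        if (cur->method != NULL && strcmp(cur->method, sel) == 0)
            return cur->method_imp;
        /* 2. Move to the superclass; past the root, fall back to forwarding. */
        cur = cur->super;
        if (cur == NULL) return TOY_FORWARD;
        /* 3. Give up if the superclass chain looks like a cycle. */
        if (--attempts == 0) return TOY_CORRUPT;
        /* 4. The superclass's cache, before its method list on the next pass. */
        if (cur->cached != NULL && strcmp(cur->cached, sel) == 0)
            return cur->cached_imp;
    }
}

/* Hypothetical hierarchy: Toys -> Goods -> root. */
const struct toy_class toy_root  = { NULL, "description", 1, NULL, 0 };
const struct toy_class toy_goods = { &toy_root, "method1", 2, "method1", 2 };
const struct toy_class toy_toys  = { &toy_goods, NULL, 0, NULL, 0 };
/* A corrupted chain whose superclass points back to itself. */
const struct toy_class toy_cycle = { &toy_cycle, NULL, 0, NULL, 0 };
```

With these sample classes, "method1" sent to toy_toys is answered from the superclass cache, "description" is found in the root's method list, an unknown selector ends in TOY_FORWARD, and the self-referencing chain is stopped by the attempts counter.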

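Calling a superclass method on a subclass

A question: if a subclass instance calls a method implemented by its superclass, which class caches it?

Verify with code: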


@interface Goods : NSObject

- (void)method1;
- (void)method2;

@end

@implementation Goods

- (void)method1 {
    NSLog(@"%s", __func__);
}
- (void)method2 {
    NSLog(@"%s", __func__);
}

@end

@interface Toys : Goods

@end

@implementation Toys

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Toys *toy = [[Toys alloc] init];
        [toy method1];
        [toy method2];
    }
    return 0;
}

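When the subclass does not implement the superclass's methods, list.ptr of its methods is nil, which shows that the subclass does not store the superclass's methods.

After running the methods, look in the superclass's cache first:

Its buckets are nil from the start, so nothing was cached in the superclass. Now check whether the subclass has the entry.

On x86_64 the class's cache already holds 2 methods, so inserting a third triggers an expansion that empties the cache. Calling just 2 methods is enough.

The output is long, so it is split across 2 screenshots:

The cache of the Toys class object contains method2.

Looking back at the slow lookup function: cls is the class that was passed in, and curClass is a local variable.

In the end the entry is inserted into the cache of cls.

Conclusion: the method is cached in the class that was passed in.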
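
The roles of cls and curClass can be reproduced in a small C model. As before this is an illustration with hypothetical names (toy_class, toy_lookup_and_cache) and a one-entry cache per class, not the runtime's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_class {
    struct toy_class *super;
    const char *method;   /* the single selector this class implements, or NULL */
    const char *cached;   /* the single selector in this class's cache, or NULL */
};

/* Search from cls upward. On success, cache the selector in cls, the class
   the lookup started from, not in the class that owns the implementation.
   Returns the implementing class, or NULL. */
struct toy_class *toy_lookup_and_cache(struct toy_class *cls, const char *sel) {
    assert(sel != NULL);
    for (struct toy_class *cur = cls; cur != NULL; cur = cur->super) {
        if (cur->method != NULL && strcmp(cur->method, sel) == 0) {
            cls->cached = sel;   /* mirrors cls->cache.insert(sel, imp, receiver) */
            return cur;
        }
    }
    return NULL;
}

/* Hypothetical classes mirroring Toys : Goods. */
struct toy_class toy_goods = { NULL, "method1", NULL };
struct toy_class toy_toys  = { &toy_goods, NULL, NULL };
```

After a lookup of "method1" through toy_toys, the implementation comes from toy_goods, yet only toy_toys.cached is set; toy_goods.cached stays NULL.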

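A quick test

Use a demo to verify the points above:

Although the class does not implement the method, no error is raised, because NSObject is the root class of every class. The lookup walks up the superclass chain and finally finds the method in NSObject.

Nothing in this article treats instance methods and class methods differently, which further shows that they differ only in where they are stored.

Summary

A method call is essentially a message send

At compile time, the compiler converts a method call into a call to objc_msgSend. The function uses the message receiver and the method name to find the concrete implementation. If the receiver is an instance, its isa leads to the class object, and the method name is then looked up in the class object's method cache. If the receiver is a class object, the lookup happens in the metaclass.

objc_msgSendSuper

Calling a superclass method through the super keyword sends the message with objc_msgSendSuper. The difference between calling through super and through self is the starting point of the lookup: self starts from the current class, while super starts from the current class's superclass.

Fast lookup flow

1. Check whether the receiver (the message receiver) exists.
2. The receiver's isa leads to the class.
3. An offset from the start of the class gives the cache.
4. The buckets container is read from the cache.
5. The buckets are traversed and each element's method name is compared (an element is a bucket_t struct with the members _sel and _imp).
6. If a match is found, CacheHit runs and the imp is called.
7. If not, __objc_msgSend_uncached runs.

Class methods follow the same flow; only the receiver differs.

Slow lookup flow

After the assembly jumps to lookUpImpOrForward:

1. Look in the cache again, because another thread may have cached the method in the meantime.
2. Get the method lists from the current class's bits and search them: binary search for sorted lists, linear search for unsorted ones.
3. If nothing is found, look in the superclass's cache.
4. Repeat step 2 with the superclass's method lists.
5. If still nothing is found, continue until the superclass is nil.

Where the method is cached

In the class passed in to the method call.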
