其实上一篇文章《Kotlin Native编译原理02 - 简单「深入」理解Objective-C运行时(一)》写完后,这篇文章就立马开始写了。但是在写文章的那段时间,有很多活动,所以写文章的事情也渐渐耽搁了下来,直到最近。
上一篇文章写完后,我就在思考,写的文章是不是跑题了?明明我要讲的是"运行时",为什么会牵涉到很多内核甚至指令集的知识?但是事实就是这么有意思,"运行时"确实是底层原理息息相关。
OOP与消息#
世间本无OOP,OOP的概念是如何发明的呢?
1950-1960年代,大家还在用LISP语言开发时,开始对某块连续的内存(结构体)描述成对象。1960年代,Simula语言诞生,首次引入了对象、类、继承、虚过程(早期的vtable)的概念,是世界上第一门面向对象的语言。Simula首次将代码跟子过程绑定在一起,后期这一概念称为"方法",即某个类对一个函数的实现。类里的每个方法都会被编译为单独的函数,每个函数的第一个参数固定为对象地址。这样调用某个对象的方法,其实是调用方法对应的函数,第一个参数传入该对象的地址。
Alan Kay受Simula的启发,结合自己对OOP的理解,于1970年发布第一门纯面向对象语言Smalltalk。Alan Kay对OOP的理解和Simula是不完全相同的,他认为对象间只能通过消息通讯,而不是方法:
I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful). Alan Kay
消息跟方法有什么不同?方法本质上还是函数,只是第一个参数写死为对象地址,所以不能调用动态地往对象里加方法。而消息是让对象执行某个逻辑的请求,对象收到消息后内部决定是否处理,以及如何处理消息。不管是编译时还是运行时,给对象发任何消息都是可行的。在实现上,对象内部会有一个类似消息派发中心的逻辑,专门负责处理消息。如果消息能被处理,再派发到对应的处理子过程/函数里。
1981年Brad Cox在ITT上班,开始接触Smalltalk。而他也意识到C语言面向过程的局限性,决定给C语言加上Smalltalk的特性。并于1983年,发布了支持Smalltalk对象特性的C语言预编译器:OOPC
后面他们发现,用预编译器来实现面向对象的特性,局限性太大了。于是他转向支持C语言面向对象拓展的开发,并于1986年,通过Stepstone发布了支持面向对象特性的C语言——Objective-C。
1988年,乔布斯离开Apple,创办NeXT公司,开发NeXTSTEP操作系统,正苦于为NeXTSTEP寻找一门面向对象且效率高并且支持C语言的语言。乔布斯先前访问过开发Smalltalk语言的公司,Smalltalk对他产生了非常大的影响。后面他发现了Objective-C,所以一拍即合,选择Objective-C作为NeXTSTEP系统的开发语言。
后面的事情大家也知道了。20世纪末,乔布斯重返Apple,Apple收购了NeXT公司,NeXTSTEP里优秀的Cocoa库收归Apple。Brad Cox创办的Stepstone公司也于20世纪末期被Apple收购,至此Objective-C成为了Apple开发首选语言,直到2014年Swift的发布。
为什么Objective-C调用对象子过程被称为消息?因为这本来就是Objective-C诞生的原因。
初探objc_msgSend#
在上一章我们就知道,对于[objc sayHello]:
- 实际调用
objc_msgSend(objc, SEL("sayHello"))。 SEL("sayHello")是伪代码,实际是一个指向sayHello的可读区域字符串指针。objc_msgSend的原型是
OBJC_EXPORT id _Nullableobjc_msgSend(id _Nullable self, SEL _Nonnull op, ...)OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);其中:
id传入接收消息的对象SELselectorop可变参数,是消息参数,并受 method type encoding 与 ABI 约束
SEL#
我们来看一个demo:
#include<Foundation/Foundation.h>#include<objc/message.h>@interfaceMyClass: NSObject@end@implementationMyClass- (void)sayHi {printf("Hi\n");}@endintmain() {MyClass *obj = [[MyClass alloc] init];// 1((void (*)(id, SEL))objc_msgSend)(obj, @selector(sayHi));// 2((void (*)(id, SEL))objc_msgSend)(obj, NSSelectorFromString(@"sayHi"));// 3 下面的代码会导致异常((void (*)(id, SEL))objc_msgSend)(obj, "sayHi");}3出现异常,说明在Objective-C里,"sayHi" 与 @selector(sayHi) / NSSelectorFromString(@"sayHi") 并不属于同一条消息。
@selector是什么?
从上一章我们可以知道,@selector(msg_name)实际上是从__objc_selrefs段取指向__objc_methname段里字符串为msg_name的指针。
NSSelectorFromString是什么?
我们先看结果,看下 NSSelectorFromString(@"sayHi") 返回什么:
#include<Foundation/Foundation.h>#include<objc/message.h>intmain() {printf("%p\n", @selector(sayHi));printf("%p\n", NSSelectorFromString(@"sayHi"));}0x10244ca220x10244ca22@selector(sayHi) == NSSelectorFromString(@"sayHi") ,这也就能说明为什么demo中1和2的执行结果一致。
并且,"sayHi" 存在字符串常量区,地址与 @selector(sayHi) / NSSelectorFromString(@"sayHi") 不同。这也能够说明对于两个selector,Objective-C是通过比对selector的地址来判断是否属于同一条消息,并不是通过简单的字符串比对判断。换句话说,SEL 的唯一性来自 sel_registerName 的**字符串驻留(intern)**逻辑。
而这种「将字符串当作SEL传入objc_msgSend」的行为,属于UB。实际编码过程万万不可这么写。😩
NSSelectorFromString 为什么会返回一个指向__objc_methname 段里的selector呢?NSSelectorFromString 并没有开源,我们写个程序打断点看看。
写一个非常简单的demo程序#
#include<Foundation/Foundation.h>#include<objc/message.h>intmain() {void*sel =NSSelectorFromString(@"sayHi");}编译+打断点#
(lldb) brset-rNSSelectorFromStringBreakpoint1:where=Foundation`NSSelectorFromString, address = 0x000000018ee87f20(lldb) rProcess6955 launched: '/Users/orangeboy/Downloads/untitled folder/test' (arm64)Process6955 stopped* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1frame#0: 0x000000018ee87f20 Foundation`NSSelectorFromStringFoundation`NSSelectorFromString:-> 0x18ee87f20<+0>:pacibsp0x18ee87f24<+4>:stpx22,x21, [sp, #-0x30]!0x18ee87f28<+8>:stpx20,x19, [sp, #0x10]0x18ee87f2c<+12>:stpx29,x30, [sp, #0x20]Target0: (test) stopped.(lldb) disaFoundation`NSSelectorFromString:-> 0x18ee87f20 <+0>: pacibsp0x18ee87f24 <+4>: stp x22, x21, [sp, #-0x30]!0x18ee87f28 <+8>: stp x20, x19, [sp, #0x10]0x18ee87f2c <+12>: stp x29, x30, [sp, #0x20]0x18ee87f30 <+16>: add x29, sp, #0x200x18ee87f34 <+20>: sub sp, sp, #0x3f00x18ee87f38 <+24>: adrp x8, 4029650x18ee87f3c <+28>: ldr x8, [x8, #0x388]0x18ee87f40 <+32>: ldr x8, [x8]0x18ee87f44 <+36>: stur x8, [x29, #-0x28]0x18ee87f48 <+40>: cbz x0, 0x18ee87fc0 ; <+160>0x18ee87f4c <+44>: mov x19, x00x18ee87f50 <+48>: bl 0x18fc95e80 ; objc_msgSend$length0x18ee87f54 <+52>: mov x20, x00x18ee87f58 <+56>: mov x2, sp0x18ee87f5c <+60>: mov x0, x190x18ee87f60 <+64>: mov w3, #0x3e8 ; =10000x18ee87f64 <+68>: mov w4, #0x4 ; =40x18ee87f68 <+72>: bl 0x18fc8f1c0 ; objc_msgSend$getCString:maxLength:encoding:0x18ee87f6c <+76>: cbz w0, 0x18ee87f88 ; <+104>0x18ee87f70 <+80>: mov x0, sp0x18ee87f74 <+84>: bl 0x18f88336c ; symbol stub for: strlen0x18ee87f78 <+88>: cmp x0, x200x18ee87f7c <+92>: b.ne 0x18ee87f88 ; <+104>0x18ee87f80 <+96>: mov x0, sp0x18ee87f84 <+100>: b 0x18ee87fb4 ; <+148>0x18ee87f88 <+104>: cbz x20, 0x18ee87fac ; <+140>0x18ee87f8c <+108>: mov x21, #0x0 ; =00x18ee87f90 <+112>: mov x0, x190x18ee87f94 <+116>: mov x2, x210x18ee87f98 <+120>: bl 0x18fc891e0 ; objc_msgSend$characterAtIndex:0x18ee87f9c <+124>: cbz w0, 0x18ee87fbc ; <+156>0x18ee87fa0 <+128>: add x21, x21, #0x10x18ee87fa4 <+132>: cmp x20, x210x18ee87fa8 <+136>: b.ne 0x18ee87f90 ; <+112>0x18ee87fac <+140>: mov x0, x190x18ee87fb0 <+144>: bl 0x18fc7b220 ; objc_msgSend$UTF8String0x18ee87fb4 <+148>: bl 0x18f88322c ; symbol stub for: sel_registerName0x18ee87fb8 <+152>: b 0x18ee87fc0 ; <+160>0x18ee87fbc <+156>: mov x0, #0x0 ; =00x18ee87fc0 <+160>: ldur x8, [x29, #-0x28]0x18ee87fc4 <+164>: adrp x9, 4029650x18ee87fc8 <+168>: ldr x9, [x9, #0x388]0x18ee87fcc <+172>: ldr x9, [x9]0x18ee87fd0 <+176>: cmp x9, x80x18ee87fd4 <+180>: b.ne 0x18ee87fec ; <+204>0x18ee87fd8 <+184>: add sp, sp, #0x3f00x18ee87fdc <+188>: ldp x29, x30, [sp, #0x20]0x18ee87fe0 <+192>: ldp x20, x19, [sp, #0x10]0x18ee87fe4 <+196>: ldp x22, x21, [sp], #0x300x18ee87fe8 <+200>: retab0x18ee87fec <+204>: bl 0x18f8811cc ; symbol stub for: __stack_chk_failNSSelectorFromString 反汇编路径解读#
前半部分很简单,是把传入的NSString转为CString(实际是一个char数组)。先调用NSString的getCString:maxLength:encoding: ,如果失败就尝试调用strlen
这个CString会传递给sel_registerName 。正好 sel_registerName 是开源的,我们看看里面具体逻辑:
sel_registerName#
staticobjc::ExplicitInitDenseSet<constchar*> namedSelectors;...SELsel_registerName(constchar*name) {return__sel_registerName(name, 1, 1); // YES lock, YES copy}...staticSEL__sel_registerName(constchar*name, boolshouldLock, boolcopy){SEL result =0;if (shouldLock) lockdebug::assert_unlocked(&selLock.get());elselockdebug::assert_locked(&selLock.get());if (!name) return (SEL)0;result =_sel_searchBuiltins(name);if (result) return result;conditional_mutex_locker_tlock(selLock, shouldLock);auto it = namedSelectors.get().insert(name);if (it.second) {// No match. Insert.*it.first = (constchar*)sel_alloc(name, copy);}return (SEL)*it.first;}...SEL_sel_searchBuiltins(constchar*name){#ifSUPPORT_PREOPTif (SEL result = (SEL)_dyld_get_objc_selector(name))return result;#endifreturn nil;}...staticSELsel_alloc(constchar*name, boolcopy){lockdebug::assert_locked(&selLock.get());return (SEL)(copy ?strdupIfMutable(name) : name);}逻辑非常清晰也很简单,主要分为以下几步:
- 判断该selector有没有加载过。实际是调用
dyld的_dyld_get_objc_selector尝试获取selector表,有则直接返回。 - 尝试往
namedSelectors插入selector。namedSelectors是一个ExplicitInitDenseSetDenseSet<const char *>,可以简单理解为一个Set,在插入时会比对字符串的值。如果存在相同的值就直接返回,否则新插入一条selector。
为什么插入的时候会比较字符串内容呢?
DenseSet#
实际上 ExplicitInitDenseSetDenseSet<const char *> 有以下继承关系:
ExplicitInitDenseSetDenseSet<const char *> ← ExplicitInit<DenseSet<const char *>>
ExplicitInit 可暂时不看,是方便初始化用的。我们先来看看 DenseSet<const char *> ,实际实现是
template <typename ValueT, typename ValueInfoT = DenseMapInfo<ValueT>>class DenseSet : public detail::DenseSetImpl<ValueT, DenseMap<ValueT, detail::DenseSetEmpty,DenseMapValueInfo<detail::DenseSetEmpty>,ValueInfoT, detail::DenseSetPair<ValueT>>,ValueInfoT> {using BaseT =detail::DenseSetImpl<ValueT,DenseMap<ValueT, detail::DenseSetEmpty,DenseMapValueInfo<detail::DenseSetEmpty>,ValueInfoT, detail::DenseSetPair<ValueT>>,ValueInfoT>;public:using BaseT::BaseT;};上面的ValueT = const char * ,那么 DenseSetImpl 是什么呢
/// Base class for DenseSet and DenseSmallSet.////// MapTy should be either////// DenseMap<ValueT, detail::DenseSetEmpty,/// DenseMapValueInfo<detail::DenseSetEmpty>,/// ValueInfoT, detail::DenseSetPair<ValueT>>////// or the equivalent SmallDenseMap type. ValueInfoT must implement the/// DenseMapInfo "concept".template <typename ValueT, typename MapTy, typename ValueInfoT>class DenseSetImpl {static_assert(sizeof(typename MapTy::value_type) ==sizeof(ValueT),"DenseMap buckets unexpectedly large!");MapTy TheMap;template <typename T>using const_arg_type_t= typename const_pointer_or_const_ref<T>::type;public:using key_type = ValueT;using value_type = ValueT;using size_type =unsigned;...std::pair<iterator, bool>insert(const ValueT &V) {detail::DenseSetEmpty Empty;return TheMap.try_emplace(V, Empty);}std::pair<iterator, bool>insert(ValueT &&V) {detail::DenseSetEmpty Empty;return TheMap.try_emplace(std::move(V), Empty);}...可见,DenseSet 实际是一个Key为实际值(这里是const char *),Value为空的 DenseMap 。DenseMap 会通过template类型 调用 DenseSet 的 insert ,实际是调用 DenseMap 的 try_emplace 。
DenseMap#
DenseMap 的原型如下:
template <typename KeyT, typename ValueT,typename ValueInfoT = DenseMapValueInfo<ValueT>,typename KeyInfoT = DenseMapInfo<KeyT>,typename BucketT = detail::DenseMapPair<KeyT, ValueT>>class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, ValueInfoT, KeyInfoT, BucketT>,KeyT, ValueT, ValueInfoT, KeyInfoT, BucketT> {friend class DenseMapBase<DenseMap, KeyT, ValueT, ValueInfoT, KeyInfoT, BucketT>;...结构有点复杂,但可以注意到,DenseMap 是通过 KeyInfoT = DenseMapInfo<KeyT> 来获取Key的信息的,比如Key的哈希值、两个Key是否相等等。而 DenseMapInfo<const char *> 是什么呢?
// Provide DenseMapInfo for cstrings.template<>struct DenseMapInfo<constchar*> {staticinlineconstchar*getEmptyKey() {return reinterpret_cast<constchar*>((intptr_t)-1);}staticinlineconstchar*getTombstoneKey() {return reinterpret_cast<constchar*>((intptr_t)-2);}staticunsignedgetHashValue(constchar*const&Val) {return_objc_strhash(Val);}staticboolisEqual(constchar*const&LHS, constchar*const&RHS) {if (LHS == RHS) {returntrue;}if (LHS ==getEmptyKey() || RHS ==getEmptyKey()) {returnfalse;}if (LHS ==getTombstoneKey() || RHS ==getTombstoneKey()) {returnfalse;}return0==strcmp(LHS, RHS);}};// objc-private.hstatic __inline uint32_t_objc_strhash(constchar*s) {uint32_t hash =0;for (;;) {int a =*s++;if (0== a) break;hash += (hash <<8) + a;}return hash;}注意 getHashValue 和 isEqual 两个方法,说明 DenseSet<const char *> 是通过字符串本身计算哈希值。所以有两个值相同,但地址不同的字符串存入 DenseSet<const char*> ,最后只会存一份字符串(是否共享字符串内存取决于 copy 参数与是否 strdupIfMutable)。这也能够说明为什么同一个字符串只能对应一个selector。
select进入namedSelectors的时机#
selector sayHi是什么时候插入进namedSelectors 的? namedSelectors 是什么时候初始化的?
通过观察,我们可以看到存在:
map_images → map_images_nolock → sel_init↓_read_images → sel_registerNameNoLock → namedSelectors这条调用链。
/************************************************************************ sel_init* Initialize selector tables and register selectors used internally.**********************************************************************/voidsel_init(size_tselrefCount){#ifSUPPORT_PREOPTif (PrintPreopt) {_objc_inform("PREOPTIMIZATION: using dyld selector opt");}#endifnamedSelectors.init((unsigned)selrefCount);// Register selectors used by libobjcmutex_locker_tlock(selLock);SEL_cxx_construct =sel_registerNameNoLock(".cxx_construct", NO);SEL_cxx_destruct =sel_registerNameNoLock(".cxx_destruct", NO);}sel_init 是谁调用的?
voidmap_images_nolock(unsignedmhCount, conststruct _dyld_objc_notify_mapped_info infos[],bool*disabledClassROEnforcement,_dyld_objc_mark_image_mutable makeImageMutable){...if (firstTime) {sel_init(selrefCount);...}而map_images_nolock是谁调用的呢?
/************************************************************************ map_images* Process the given images which are being mapped in by dyld.* Calls ABI-agnostic code after taking ABI-specific locks.** Locking: write-locks runtimeLock**********************************************************************/voidmap_images(unsignedcount, conststruct _dyld_objc_notify_mapped_info infos[],_dyld_objc_mark_image_mutable makeImageMutable){bool takeEnforcementDisableFault;{mutex_locker_tlock(runtimeLock);map_images_nolock(count, infos, &takeEnforcementDisableFault, makeImageMutable);}if (takeEnforcementDisableFault) {if (DebugClassRXSigning == Fatal)_objc_fatal("class_rx signing mismatch");#ifTARGET_OS_IPHONE&&!TARGET_OS_SIMULATORif (!DisableClassROFaults)_objc_fault("class_ro_t enforcement disabled");#endif}}map_images_nolock 由 map_images 调用。上一章也提到,map_images 在动态库载入时会被调用,所以在程序初始化时就能够完成 namedSelectors 的初始化。
初始化的流程我们理解了,但是问题还没解决:映像 __objc_selrefs 段里存的selector什么时候转到namedSelectors 里呢?
上面的代码可以看到,map_images 会调用 map_images_nolock 。而在 map_images_nolock 里会调用一个函数 _read_images
voidmap_images_nolock(unsignedmhCount, conststruct _dyld_objc_notify_mapped_info infos[],bool*disabledClassROEnforcement,_dyld_objc_mark_image_mutable makeImageMutable){...if (hCount >0) {_read_images(mappedInfos, hCount, totalClasses, unoptimizedTotalClasses, makeImageMutable);}}我们看看里面的作用
/************************************************************************ _read_images* Perform initial processing of the headers in the linked* list beginning with headerList.** Called by: map_images_nolock** Locking: runtimeLock acquired by map_images**********************************************************************/void_read_images(mapped_image_info infosParam[], uint32_thCount, inttotalClasses, intunoptimizedTotalClasses,_dyld_objc_mark_image_mutable makeImageMutable){...staticsize_t UnfixedSelectors;{mutex_locker_tlock(selLock);for (auto& info : infos) {if (info.dyldObjCRefsOptimized()) continue;bool isBundle = info.hi->isBundle();SEL*sels = info.hi->selrefs(&count);UnfixedSelectors += count;for (i =0; i < count; i++) {constchar*name =sel_cname(sels[i]);SEL sel =sel_registerNameNoLock(name, isBundle);if (sels[i] != sel) {// The infos array is reversed, but dyld expects the original indexconstuint32_t infoIndex = (hCount -1) - infos.index(&info);makeImageMutable(infoIndex);withMutableSharedCache(info.tproEnabled(), [&] {sels[i] = sel;});}}}}...}这里调用了 sel_registerNameNoLock ,实际就是将映像里的selector一个个地存进namedSelectors 里。所以能够说明:
@selector(...)与NSSelectorFromString(...)会得到同一个 selector- 因为 selector 已在加载期完成注册/驻留
发送消息#
发送消息部分,就是Objective-C的精髓。
猜测#
已知,对象的isa里存着baseMethods ,实际是一个数组,每一项是{SEL & type(传参类型) & 跳转地址 }。所以我们可以猜测:
每次调用objc_msgSend,都会去对象的 baseMethods 里查跳转地址并执行跳转。
但是 baseMethods 是一个数组,每调一次 objc_msgSend 都需要去遍历数组查找实现,能不能把 baseMethods 存在一个map里,这样查找的时间复杂度就下来了?所以:
每个isa里存在一个map缓存,key为selector。调用 objc_msgSend 时会先去这个缓存查找,如果没找到再去 baseMethods 里查找。
确实,Apple也是这么做的。
深入汇编#
因为objc_msgSend 的调用频次很高。Apple为了提升效率,特意用汇编来实现,属实良心。
不同CPU架构下的汇编指令也会不同。Apple甚至对不同的CPU做了指令差异处理,太良心了。
不过,核心逻辑是大致相同的。我们这里以arm64架构为例,先来看看 objc_msgSend 的具体实现:
MSG_ENTRY _objc_msgSendUNWIND _objc_msgSend, NoFramecmp p0, #0// nil check and tagged pointer check#ifSUPPORT_TAGGED_POINTERSb.le LNilOrTagged // (MSB tagged pointer looks negative)#elseb.eq LReturnZero#endifldr p14, [x0] // p14 = raw isaGetClassFromIsa_p16 p14, 1, x0 // p16 = classLGetIsaDone:// calls imp or objc_msgSend_uncachedCacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncachedobjc_msgSend 的关键步骤可以概括为:
- nil / tagged pointer 检查
- 从对象取出 isa
- 在 class cache 中查找 selector → IMP
- 命中则跳转;未命中则进入慢路径
我让G老师写了一段伪代码,方便理解:
IMP objc_msgSend(id receiver, SEL sel, ...) {if (receiver == nil) return 0;cls = decode_isa(receiver->isa);imp = cache_lookup(cls, sel);if (imp == NULL) {imp = __objc_msgSend_uncached(cls, sel);}return imp(receiver, sel, ...);}LNilOrTagged#
LNilOrTagged 是什么呢?
#ifSUPPORT_TAGGED_POINTERSLNilOrTagged:b.eq LReturnZero // nil checkGetTaggedClassb LGetIsaDone// SUPPORT_TAGGED_POINTERS#endifLReturnZero:// x0 is already zeromov x1, #0movi d0, #0movi d1, #0movi d2, #0movi d3, #0retEND_ENTRY _objc_msgSend其实就是:
- 如果传入的对象为nil,就直接跳到
LReturnZero - 否则解析Tagged Pointer拿到class 指针,存在
x16
这里也能够说明,为什么Objective-C里能够对空指针发消息。
拿到了isa,就到了缓存查找部分:CacheLookup
CacheLookup#
缓存是什么?回顾上一章,我们复习一下isa 的结构:
struct objc_class : objc_object {objc_class(const objc_class&) = delete;objc_class(objc_class&&) = delete;void operator=(const objc_class&) = delete;void operator=(objc_class&&) = delete;// Class ISA;Class superclass;cache_t cache; // formerly cache pointer and vtableclass_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags...cache_t是什么?
structcache_t {private:explicit_atomic<uintptr_t> _bucketsAndMaybeMask;union {// Note: _flags on ARM64 needs to line up with the unused bits of// _originalPreoptCache because we access some flags (specifically// FAST_CACHE_HAS_DEFAULT_CORE and FAST_CACHE_HAS_DEFAULT_AWZ) on// unrealized classes with the assumption that they will start out// as 0.struct {#ifCACHE_MASK_STORAGE==CACHE_MASK_STORAGE_OUTLINED&&!__LP64__// Outlined cache mask storage, 32-bit, we have mask and occupied.explicit_atomic<mask_t> _mask;uint16_t _occupied;#elifCACHE_MASK_STORAGE==CACHE_MASK_STORAGE_OUTLINED&&__LP64__// Outlined cache mask storage, 64-bit, we have mask, occupied, flags.explicit_atomic<mask_t> _mask;uint16_t _occupied;uint16_t _flags;# defineCACHE_T_HAS_FLAGS1#elif__LP64__// Inline cache mask storage, 64-bit, we have occupied, flags, and// empty space to line up flags with originalPreoptCache.//// Note: the assembly code for objc_release_xN knows about the// location of _flags and the// FAST_CACHE_HAS_CUSTOM_DEALLOC_INITIATION flag within. Any changes// must be applied there as well.uint32_t _disguisedPreoptCacheSignature;uint16_t _occupied;uint16_t _flags;# defineCACHE_T_HAS_FLAGS1#else// Inline cache mask storage, 32-bit, we have occupied, flags.uint16_t _occupied;uint16_t _flags;# defineCACHE_T_HAS_FLAGS1#endif};explicit_atomic<preopt_cache_t*, PTRAUTH_STR(originalPreoptCache, ptrauth_key_process_independent_data)> _originalPreoptCache;};_bucketsAndMaybeMask 是什么呢?我们上一章讲过,指针被mask是为了安全。其实,这里buckets的实际内存布局是:
[ bucket0 ][ bucket1 ][ bucket2 ][ bucket3 ] ...bucket是什么呢?
structbucket_t {private:// IMP-first is better for arm64e ptrauth and no worse for arm64.// SEL-first is better for armv7* and i386 and x86_64.#if__arm64__explicit_atomic<uintptr_t> _imp;explicit_atomic<SEL> _sel;#elseexplicit_atomic<SEL> _sel;explicit_atomic<uintptr_t> _imp;#endif说白了就是imp+sel的紧凑结构。
现在先来一个思考题,已知isa的地址,怎么获取bucket的基地?
// 1️⃣ decode isa(mask 取 class pointer)Class cls = raw_isa & ISA_MASK;// 2️⃣ cache_t 在 class 内的偏移cache_t *cache = (cache_t *)((uint8_t*)cls + CACHE_OFFSET);// 3️⃣ buckets 在 cache_t 内的偏移bucket_t *buckets = *(bucket_t **)((uint8_t*)cache + BUCKETS_OFFSET);并且:
CACHE_OFFSET= 0x10BUCKETS_OFFSET= 0x00
我们开始看CacheLookUp的汇编代码
/********************************************************************** CacheLookup NORMAL|GETIMP|LOOKUP <function> MissLabelDynamic MissLabelConstant** MissLabelConstant is only used for the GETIMP variant.** Locate the implementation for a selector in a class method cache.** When this is used in a function that doesn't hold the runtime lock,* this represents the critical section that may access dead memory.* If the kernel causes one of these functions to go down the recovery* path, we pretend the lookup failed by jumping the JumpMiss branch.** Takes:* x1 = selector* x16 = class to be searched** Kills:* x9,x10,x11,x12,x13,x15,x17** Untouched:* x14** On exit: (found) calls or returns IMP* with x16 = class, x17 = IMP* In LOOKUP mode, the two low bits are set to 0x3* if we hit a constant cache (used in objc_trace)* (not found) jumps to LCacheMiss* with x15 = class* For constant caches in LOOKUP mode, the low bit* of x16 is set to 0x1 to indicate we had to fallback.* In addition, when LCacheMiss is __objc_msgSend_uncached or* __objc_msgLookup_uncached, 0x2 will be set in x16* to remember we took the slowpath.* So the two low bits of x16 on exit mean:* 0: dynamic hit* 1: fallback to the parent class, when there is a preoptimized cache* 2: slowpath* 3: preoptimized cache hit*********************************************************************/#define NORMAL 0#define GETIMP 1#define LOOKUP 2// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa.macro CacheHit.if 0ドル == NORMALTailCallCachedImp x17, x10, x1, x16 // authenticate andcall imp.elseif 0ドル == GETIMPmov p0, p17cbz p0, 9f // don't ptrauth a nil impAuthAndResignAsIMP x0, x10, x1, x16, x17 // authenticate imp and re-sign as IMP9: ret // return IMP.elseif 0ドル == LOOKUP// No nil check for ptrauth: the caller would crash anyway when they// jump to a nil IMP. We don't care if that jump also fails ptrauth.AuthAndResignAsIMP x17, x10, x1, x16, x10 // authenticate imp and re-sign as IMPcmp x16, x15cinc x16, x16, ne // x16 += 1 when x15 != x16 (for instrumentation ; fallback to the parent class)ret // return imp via x17.else.abort oops.endif.endmacro.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant//// Restart protocol://// As soon as we're past the LLookupStart\Function label we may have// loaded an invalid cache pointer or mask.//// When task_restartable_ranges_synchronize() is called,// (or when a signal hits us) before we're past LLookupEnd\Function,// then our PC will be reset to LLookupRecover\Function which forcefully// jumps to the cache-miss codepath which have the following// requirements://// GETIMP:// The cache-miss is just returning NULL (setting x0 to 0)//// NORMAL andLOOKUP:// - x0 contains the receiver// - x1 contains the selector// - x16 contains the isa// - other registers are set as per calling conventions//mov x15, x16 // stash the original isaLLookupStart\Function:// p1 = SEL, p16 = isa#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRSldr p10, [x16, #CACHE] // p10 = mask|bucketslsr p11, p10, #48 // p11 = maskand p10, p10, #0xffffffffffff // p10 = bucketsand w12, w1, w11 // x12 = _cmd & mask#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16ldr p11, [x16, #CACHE] // p11 = mask|buckets#if CONFIG_USE_PREOPT_CACHES#if __has_feature(ptrauth_calls)tbnz p11, #0, LLookupPreopt\Functionand p10, p11, #0x0000ffffffffffff // p10 = buckets#elseand p10, p11, #0x0000fffffffffffe // p10 = bucketstbnz p11, #0, LLookupPreopt\Function#endifeor p12, p1, p1, LSR #7and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd>> 7)) & mask#elseand p10, p11, #0x0000ffffffffffff // p10 = bucketsand p12, p1, p11, LSR #48 // x12 = _cmd & mask#endif // CONFIG_USE_PREOPT_CACHES#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4ldr p11, [x16, #CACHE] // p11 = mask|bucketsand p10, p11, #~0xf // p10 = bucketsand p11, p11, #0xf // p11 = maskShiftmov p12, #0xfffflsr p11, p12, p11 // p11 = mask = 0xffff>> p11and p12, p1, p11 // x12 = _cmd & mask#else#error Unsupported cache mask storage for ARM64.#endifadd p13, p10, p12, LSL #(1+PTRSHIFT)// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))// do {1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--cmp p9, p1 // if (sel != _cmd) {b.ne 3f // scan more// } else {2: CacheHit \Mode // hit:callor return imp// }3: cbz p9, \MissLabelDynamic // if (sel == 0) goto Miss;cmp p13, p10 // } while (bucket>= buckets)b.hs 1b// wrap-around:// p10 = first bucket// p11 = mask (and maybe other bits on LP64)// p12 = _cmd & mask//// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.// So stop when we circle back to the first probed bucket// rather than when hitting the first bucket again.//// Note that we might probe the initial bucket twice// when the first probed slot is the last entry.#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRSadd p13, p10, w11, UXTW #(1+PTRSHIFT)// p13 = buckets + (mask << 1+PTRSHIFT)#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))// p13 = buckets + (mask << 1+PTRSHIFT)// see comment about maskZeroBits#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4add p13, p10, p11, LSL #(1+PTRSHIFT)// p13 = buckets + (mask << 1+PTRSHIFT)#else#error Unsupported cache mask storage for ARM64.#endifadd p12, p10, p12, LSL #(1+PTRSHIFT)// p12 = first probed bucket// do {4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--cmp p9, p1 // if (sel == _cmd)b.eq 2b // goto hitcmp p9, #0 // } while (sel != 0 &&ccmp p13, p12, #0, ne // bucket> first_probed)b.hi 4bLLookupEnd\Function:LLookupRecover\Function:b \MissLabelDynamic#if CONFIG_USE_PREOPT_CACHES#if CACHE_MASK_STORAGE != CACHE_MASK_STORAGE_HIGH_16#error config unsupported#endifLLookupPreopt\Function:#if __has_feature(ptrauth_calls)and p10, p11, #0x007ffffffffffffe // p10 = bucketsautdb x10, x16 // auth as early as possible#endif// x12 = (_cmd - first_shared_cache_sel)adrp x9, _MagicSelRef@PAGEldr p9, [x9, _MagicSelRef@PAGEOFF]sub p12, p1, p9// w9 = ((_cmd - first_shared_cache_sel)>> hash_shift & hash_mask)#if __has_feature(ptrauth_calls)// bits 63..60 of x11 are the number of bits in hash_mask// bits 59..55 of x11 is hash_shiftlsr x17, x11, #55 // w17 = (hash_shift, ...)lsr w9, w12, w17 //>>= shiftlsr x17, x11, #60 // w17 = mask_bitsmov x11, #0x7ffflsr x11, x11, x17 // p11 = mask (0x7fff>> mask_bits)and x9, x9, x11 // &= mask#else// bits 63..53 of x11 is hash_mask// bits 52..48 of x11 is hash_shiftlsr x17, x11, #48 // w17 = (hash_shift, hash_mask)lsr w9, w12, w17 //>>= shiftand x9, x9, x11, LSR #53 // &= mask#endif// sel_offs is 26 bits because it needs to address a 64 MB buffer (~ 20 MB as of writing)// keep the remaining 38 bits for the IMP offset, which may need to reach// across the shared cache. This offset needs to be shifted << 2. We did this// to give it even more reach, given the alignment of source (the class data)// and destination (the IMP)ldr x17, [x10, x9, LSL #3] // x17 == (sel_offs << 38) | imp_offscmp x12, x17, LSR #38.if \Mode == GETIMPb.ne \MissLabelConstant // cache misssbfiz x17, x17, #2, #38 // imp_offs = combined_imp_and_sel[0..37] << 2sub x0, x16, x17 // imp = isa - imp_offsSignAsImp x0, x17ret.elseb.ne 5f // cache misssbfiz x17, x17, #2, #38 // imp_offs = combined_imp_and_sel[0..37] << 2sub x17, x16, x17 // imp = isa - imp_offs.if \Mode == NORMALbr x17.elseif \Mode == LOOKUPorr x16, x16, #3 // for instrumentation, note that we hit a constant cacheSignAsImp x17, x10ret.else.abort unhandled mode \Mode.endif5: ldur x9, [x10, #-16] // offset -16 is the fallback offsetadd x16, x16, x9 // compute the fallback isab LLookupStart\Function // lookup again with a new isa.endif#endif // CONFIG_USE_PREOPT_CACHES.endmacro代码非常多。其实,这个代码一共分成三个阶段:
准备数据#
mov x15, x16 // stash the original isaLLookupStart\Function:// p1 = SEL, p16 = isa#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRSldr p10, [x16, #CACHE] // p10 = mask|bucketslsr p11, p10, #48 // p11 = maskand p10, p10, #0xffffffffffff // p10 = bucketsand w12, w1, w11 // x12 = _cmd & mask#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16ldr p11, [x16, #CACHE] // p11 = mask|buckets#if CONFIG_USE_PREOPT_CACHES#if __has_feature(ptrauth_calls)tbnz p11, #0, LLookupPreopt\Functionand p10, p11, #0x0000ffffffffffff // p10 = buckets#elseand p10, p11, #0x0000fffffffffffe // p10 = bucketstbnz p11, #0, LLookupPreopt\Function#endifeor p12, p1, p1, LSR #7and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd>> 7)) & mask#elseand p10, p11, #0x0000ffffffffffff // p10 = bucketsand p12, p1, p11, LSR #48 // x12 = _cmd & mask#endif // CONFIG_USE_PREOPT_CACHES#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4ldr p11, [x16, #CACHE] // p11 = mask|bucketsand p10, p11, #~0xf // p10 = bucketsand p11, p11, #0xf // p11 = maskShiftmov p12, #0xfffflsr p11, p12, p11 // p11 = mask = 0xffff>> p11and p12, p1, p11 // x12 = _cmd & mask#else#error Unsupported cache mask storage for ARM64.#endif首先,外部固定会传入:
x1: selx16: isa
接着,会从isa+${#CACHE}取值并存到x10里。而CACHE是什么呢?
#define CACHE (2 * __SIZEOF_POINTER__)所以,x10实际是buckets
再者,x12 = _cmd & mask。mask是什么?mask=缓存的大小-1。所以,x12就是sel计算后的缓存索引位置。用hashmap的话来讲,是哈希值。
[bucket0][bucket1][bucket2][bucket3][bucket4][bucket5][bucket6][bucket7]^│x12 ← (_cmd & mask) 得到的 index从中往前遍历#
add p13, p10, p12, LSL #(1+PTRSHIFT)// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))// do {1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--cmp p9, p1 // if (sel != _cmd) {b.ne 3f // scan more// } else {2: CacheHit \Mode // hit:callor return imp// }3: cbz p9, \MissLabelDynamic // if (sel == 0) goto Miss;cmp p13, p10 // } while (bucket>= buckets)b.hs 1b从bucket[x13]遍历到bucket[0]。如果找到sel==cmd,则跳转到CacheHit (传入的参数),否则x13--
该轮遍历结束后,x13位置:
[bucket0][bucket1][bucket2][bucket3][bucket4][bucket5][bucket6][bucket7]^│x13从后往中遍历#
// wrap-around:// p10 = first bucket// p11 = mask (and maybe other bits on LP64)// p12 = _cmd & mask//// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.// So stop when we circle back to the first probed bucket// rather than when hitting the first bucket again.//// Note that we might probe the initial bucket twice// when the first probed slot is the last entry.#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRSadd p13, p10, w11, UXTW #(1+PTRSHIFT)// p13 = buckets + (mask << 1+PTRSHIFT)#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))// p13 = buckets + (mask << 1+PTRSHIFT)// see comment about maskZeroBits#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4add p13, p10, p11, LSL #(1+PTRSHIFT)// p13 = buckets + (mask << 1+PTRSHIFT)#else#error Unsupported cache mask storage for ARM64.#endifadd p12, p10, p12, LSL #(1+PTRSHIFT)// p12 = first probed bucket// do {4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--cmp p9, p1 // if (sel == _cmd)b.eq 2b // goto hitcmp p9, #0 // } while (sel != 0 &&ccmp p13, p12, #0, ne // bucket> first_probed)b.hi 4b接着,x13指向buckets最后一个元素:
[bucket0][bucket1][bucket2][bucket3][bucket4][bucket5][bucket6][bucket7]^│x13然后向前遍历,查找sel==_cmd。有就跳到CacheHit,否则x13--,直到x13==x12
结果#
如果找不到,就跳到MissLabelDynamic。对于objc_msgSend来说,是objc_msgSend_uncached
而如果跳转成功呢?CacheHit会根据传入的参数分为三种情况:
// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa.macro CacheHit.if 0ドル == NORMALTailCallCachedImp x17, x10, x1, x16 // authenticate andcall imp.elseif 0ドル == GETIMPmov p0, p17cbz p0, 9f // don't ptrauth a nil impAuthAndResignAsIMP x0, x10, x1, x16, x17 // authenticate imp and re-sign as IMP9: ret // return IMP.elseif 0ドル == LOOKUP// No nil check for ptrauth: the caller would crash anyway when they// jump to a nil IMP. We don't care if that jump also fails ptrauth.AuthAndResignAsIMP x17, x10, x1, x16, x10 // authenticate imp and re-sign as IMPcmp x16, x15cinc x16, x16, ne // x16 += 1 when x15 != x16 (for instrumentation ; fallback to the parent class)ret // return imp via x17.else.abort oops.endif.endmacro当然,objc_msgSend调用的是NORMAL类型的,实际是跳转到TailCallCachedImp。TailCallCachedImp 的作用,实际是传入imp、SEL并执行跳转。
macro TailCallCachedImp// 0ドル = cached imp, 1ドル = address of cached imp, 2ドル = SEL, 3ドル = isaeor 0ドル, 0ドル, 3ドル.ifndef LTailCallCachedImpIndirectBranchLTailCallCachedImpIndirectBranch:.endifbr 0ドル.endmacroobjc_msgSend_uncached#
.macro MethodTableLookupSAVE_REGS MSGSEND// lookUpImpOrForward(obj, sel, cls, LOOKUP_INITIALIZE | LOOKUP_RESOLVER)// receiver and selector already in x0 and x1mov x2, x16mov x3, #3bl _lookUpImpOrForward// IMP in x0mov x17, x0RESTORE_REGS MSGSEND.endmacroSTATIC_ENTRY __objc_msgSend_uncachedUNWIND __objc_msgSend_uncached, FrameWithNoSaves// THIS IS NOT A CALLABLE C FUNCTION// Out-of-band p15 is the class to searchMethodTableLookupTailCallFunctionPointer x17END_ENTRY __objc_msgSend_uncached.macro TailCallFunctionPointer// 0ドル = function pointer valuebr 0ドル.endmacro逻辑很简单,就是走到MethodTableLookup 这个宏,然后再跳到x17的地址上。
lookUpImpOrForward#
这个方法是用C写的:
NEVER_INLINEIMP lookUpImpOrForward(id inst, SEL sel, Class cls, intbehavior){const IMP forward_imp = (IMP)_objc_msgForward_impcache;IMP imp = nil;Class curClass;lockdebug::assert_unlocked(&runtimeLock.get());if (slowpath(!cls->isInitialized())) {// The first message sent to a class is often +new or +alloc, or +self// which goes through objc_opt_* or various optimized entry points.//// However, the class isn't realized/initialized yet at this point,// and the optimized entry points fall down through objc_msgSend,// which ends up here.//// We really want to avoid caching these, as it can cause IMP caches// to be made with a single entry forever.//// Note that this check is racy as several threads might try to// message a given class for the first time at the same time,// in which case we might cache anyway.behavior |= LOOKUP_NOCACHE;}// runtimeLock is held during isRealized and isInitialized checking// to prevent races against concurrent realization.// runtimeLock is held during method search to make// method-lookup + cache-fill atomic with respect to method addition.// Otherwise, a category could be added but ignored indefinitely because// the cache was re-filled with the old value after the cache flush on// behalf of the category.runtimeLock.lock();// We don't want people to be able to craft a binary blob that looks like// a class but really isn't one and do a CFI attack.//// To make these harder we want to make sure this is a class that was// either built into the binary or legitimately registered through// objc_duplicateClass, objc_initializeClassPair or objc_allocateClassPair.checkIsKnownClass(cls);cls =realizeAndInitializeIfNeeded_locked(inst, cls, behavior & LOOKUP_INITIALIZE);// runtimeLock may have been dropped but is now locked againlockdebug::assert_locked(&runtimeLock.get());curClass = cls;// The code used to lookup the class's cache again right after// we take the lock but for the vast majority of the cases// evidence shows this is a miss most of the time, hence a time loss.//// The only codepath calling into this without having performed some// kind of cache lookup is class_getInstanceMethod().// Has this class been disabled? Act like a message to nil.if (!cls ||!cls->ISA()) {#if__arm64__imp = _objc_returnNil;goto done;#elif__x86_64if (behavior & LOOKUP_FPRET)imp = _objc_msgNil_fpret;elseif (behavior & LOOKUP_FP2RET)imp = _objc_msgNil_fp2ret;elseimp = _objc_msgNil;// We can't cache these on x86, in case some other caller tries sending// this selector with a different return type. If we con't cache then we// always come back here, and always choose the correct IMP for the// caller's expected return type.behavior |= LOOKUP_NOCACHE;goto done;#else#errorDon't know how to handle messages to disabled classes on this target.#endif}for (unsigned attempts =unreasonableClassCount();;) {if (curClass->cache.isConstantOptimizedCache(/* strict */true)) {#ifCONFIG_USE_PREOPT_CACHESimp =cache_getImp(curClass, sel);if (imp) goto done_unlock;curClass = curClass->cache.preoptFallbackClass();#endif} else {// curClass method list.method_t*meth =getMethodNoSuper_nolock(curClass, sel);if (meth) {imp = meth->imp(false);goto done;}if (slowpath((curClass = curClass->getSuperclass()) == nil)) {// No implementation found, and method resolver didn't help.// Use forwarding.imp = forward_imp;break;}}// Halt if there is a cycle in the superclass chain.if (slowpath(--attempts ==0)) {_objc_fatal("Memory corruption in class list.");}// Superclass cache.imp =cache_getImp(curClass, sel);if (slowpath(imp == forward_imp)) {// Found a forward:: entry in a superclass.// Stop searching, but don't cache yet; call method// resolver for this class first.break;}if (fastpath(imp)) {// Found the method in a superclass. Cache it in this class.goto done;}}// No implementation found. Try method resolver once.if (slowpath(behavior & LOOKUP_RESOLVER)) {behavior ^= LOOKUP_RESOLVER;returnresolveMethod_locked(inst, sel, cls, behavior);}done:if (fastpath((behavior & LOOKUP_NOCACHE) ==0)) {#ifCONFIG_USE_PREOPT_CACHESwhile (cls->cache.isConstantOptimizedCache(/* strict */true)) {cls = cls->cache.preoptFallbackClass();}#endiflog_and_fill_cache(cls, imp, sel, inst, curClass);}#ifCONFIG_USE_PREOPT_CACHESdone_unlock:#endifruntimeLock.unlock();if (slowpath((behavior & LOOKUP_NIL) && imp == forward_imp)) {return nil;}return imp;}其实,可以分为三部分:
-
类初始化。类信息可能还在只读区域里,需要把这些信息挪到可读可写的区域。
-
从类信息查找selector对应的函数指针IMP。
-
缓存SEL和IMP。
类初始化#
调用关系:
realizeAndInitializeIfNeeded_locked -> realizeClassMaybeSwiftAndLeaveLocked -> realizeClassMaybeSwiftMaybeRelock
/************************************************************************ realizeClassMaybeSwift (MaybeRelock / AndUnlock / AndLeaveLocked)* Realize a class that might be a Swift class.* Returns the real class structure for the class.* Locking:* runtimeLock must be held on entry* runtimeLock may be dropped during execution* ...AndUnlock function leaves runtimeLock unlocked on exit* ...AndLeaveLocked re-acquires runtimeLock if it was dropped* This complication avoids repeated lock transitions in some cases.**********************************************************************/staticClassrealizeClassMaybeSwiftMaybeRelock(Classcls, mutex_t&lock, boolleaveLocked){lockdebug::assert_locked(&lock);if (!cls->isSwiftStable_ButAllowLegacyForNow()) {// Non-Swift class. Realize it now with the lock still held.// fixme wrong in the future for objc subclasses of swift classescls =realizeClassWithoutSwift(cls, nil);if (!leaveLocked) lock.unlock();} else {// Swift class. We need to drop locks and call the Swift// runtime to initialize it.lock.unlock();cls =realizeSwiftClass(cls);ASSERT(cls->isRealized()); // callback must have provoked realizationif (leaveLocked) lock.lock();}return cls;}这里分为两种情况:
- 对objc类初始化
- 对swift类初始化
对objc类初始化#
/************************************************************************ realizeClassWithoutSwift* Performs first-time initialization on class cls,* including allocating its read-write data.* Does not perform any Swift-side initialization.* Returns the real class structure for the class.* Locking: runtimeLock must be write-locked by the caller**********************************************************************/staticClassrealizeClassWithoutSwift(Classcls, Classpreviously){lockdebug::assert_locked(&runtimeLock.get());class_rw_t*rw;Class supercls;Class metacls;if (!cls) returnnil;if (cls->isRealized()) {validateAlreadyRealizedClass(cls);return cls;}ASSERT(cls ==remapClass(cls));// fixme verify class is not in an un-dlopened part of the shared cache?auto ro = cls->safe_ro();auto isMeta = ro->flags & RO_META;if (ro->flags & RO_FUTURE) {// This was a future class. rw data is already allocated.rw = cls->data();ro = cls->data()->ro();ASSERT(!isMeta);cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);} else {// Normal class. Allocate writeable class data.rw = objc::zalloc<class_rw_t>();rw->set_ro(ro);rw->flags = RW_REALIZED|RW_REALIZING|isMeta;cls->setData(rw);}cls->cache.initializeToEmptyOrPreoptimizedInDisguise();#ifFAST_CACHE_METAif (isMeta) cls->cache.setBit(FAST_CACHE_META);#endif// Choose an index for this class.// Sets cls->instancesRequireRawIsa if indexes no more indexes are availablecls->chooseClassArrayIndex();if (PrintConnecting) {_objc_inform("CLASS: realizing class '%s'%s%p%p #%u%s%s",cls->nameForLogging(), isMeta ?" (meta)":"",(void*)cls, ro, cls->classArrayIndex(),cls->isSwiftStable() ?"(swift)":"",cls->isSwiftLegacy() ?"(pre-stable swift)":"");}// Realize superclass and metaclass, if they aren't already.// This needs to be done after RW_REALIZED is set above, for root classes.// This needs to be done after class index is chosen, for root metaclasses.// This assumes that none of those classes have Swift contents,// or that Swift's initializers have already been called.// fixme that assumption will be wrong if we add support// for ObjC subclasses of Swift classes.supercls =realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);metacls =realizeClassWithoutSwift(remapClass(cls->ISA()), nil);// If there's no superclass and this is not a root class, then we have a// missing weak superclass. Disable the class and return.if (!supercls &&!(cls->safe_ro()->flags & RO_ROOT)) {if (PrintConnecting)_objc_inform("CLASS: '%s'%s%p has missing weak superclass, disabling.",cls->nameForLogging(), isMeta ?" (meta)":"", (void*)cls);addRemappedClass(cls, nil);// Set the metaclass to nil to signal that this class is disabled.// Root classes have a nil superclass, but all (non-disabled) classes// have a non-nil isa pointer, so this can be used as a quick check for// disabled classes.cls->initIsa(nil);returnnil;}#ifSUPPORT_NONPOINTER_ISAif (isMeta) {// Metaclasses do not need any features from non pointer ISA// This allows for a faspath for classes in objc_retain/objc_release.cls->setInstancesRequireRawIsa();} else {// Disable non-pointer isa for some classes and/or platforms.// Set instancesRequireRawIsa.bool instancesRequireRawIsa = cls->instancesRequireRawIsa();bool rawIsaIsInherited =false;staticbool hackedDispatch =false;constchar*name;if (DisableNonpointerIsa) {// Non-pointer isa disabled by environment or app SDK versioninstancesRequireRawIsa =true;}elseif (!hackedDispatch&& (name = ro->getName()) // Yes, we mean to assign here&&0==strcmp(name, "OS_object")){// hack for libdispatch et al - isa also acts as vtable pointerhackedDispatch =true;instancesRequireRawIsa =true;}elseif (supercls && supercls->getSuperclass() &&supercls->instancesRequireRawIsa()){// This is also propagated by addSubclass()// but nonpointer isa setup needs it earlier.// Special case: instancesRequireRawIsa does not propagate// from root class to root metaclassinstancesRequireRawIsa =true;rawIsaIsInherited =true;}if (instancesRequireRawIsa) {cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);}}// SUPPORT_NONPOINTER_ISA#endif// Update superclass and metaclass in case of remappingcls->setSuperclass(supercls);cls->initClassIsa(metacls);// Reconcile instance variable offsets / layout.// This may reallocate class_ro_t, updating our ro variable.if (supercls &&!isMeta) reconcileInstanceVariables(cls, supercls, ro);// Set fastInstanceSize if it wasn't set already.cls->setInstanceSize(ro->instanceSize);// Copy some flags from ro to rwif (ro->flags & RO_HAS_CXX_STRUCTORS) {cls->setHasCxxDtor();if (! (ro->flags & RO_HAS_CXX_DTOR_ONLY)) {cls->setHasCxxCtor();}}// Propagate the associated objects forbidden flag from ro or from// the superclass.if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||(supercls && supercls->forbidsAssociatedObjects())){rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;}// Connect this class to its superclass's subclass listsif (supercls) {addSubclass(supercls, cls);} else {addRootClass(cls);}// Attach categoriesmethodizeClass(cls, previously);return cls;}具体做了什么?
- 分配类的读写空间(
rwdata) - 初始化缓存
- 初始化
superclass、metaclass - Tagged Pointer优化
- rwdata初始化(
methodizeClass)、初始化方法表- 安装 class 自己的方法
- 处理 preoptimized method lists
- root metaclass 处理
- attach categories
对swift类初始化#
实现逻辑在realizeSwiftClass里
/************************************************************************ realizeSwiftClass* Performs first-time initialization on class cls,* including allocating its read-write data,* and any Swift-side initialization.* Returns the real class structure for the class.* Locking: acquires runtimeLock indirectly**********************************************************************/staticClassrealizeSwiftClass(Classcls){lockdebug::assert_unlocked(&runtimeLock.get());// Some assumptions:// * Metaclasses never have a Swift initializer.// * Root classes never have a Swift initializer.// (These two together avoid initialization order problems at the root.)// * Unrealized non-Swift classes have no Swift ancestry.// * Unrealized Swift classes with no initializer have no ancestry that// does have the initializer.// (These two together mean we don't need to scan superclasses here// and we don't need to worry about Swift superclasses inside// realizeClassWithoutSwift()).// fixme some of these assumptions will be wrong// if we add support for ObjC sublasses of Swift classes.#ifDEBUGruntimeLock.lock();ASSERT(remapClass(cls) == cls);ASSERT(cls->isSwiftStable_ButAllowLegacyForNow());ASSERT(!cls->isMetaClassMaybeUnrealized());ASSERT(cls->getSuperclass());runtimeLock.unlock();#endif// Look for a Swift metadata initialization function// installed on the class. If it is present we call it.// That function in turn initializes the Swift metadata,// prepares the "compiler-generated" ObjC metadata if not// already present, and calls _objc_realizeSwiftClass() to finish// our own initialization.if (auto init = cls->swiftMetadataInitializer()) {if (PrintConnecting) {_objc_inform("CLASS: calling Swift metadata initializer ""for class '%s' (%p)", cls->nameForLogging(), cls);}Class newcls =init(cls, nil);if (cls != newcls) {mutex_locker_tlock(runtimeLock);addRemappedClass(cls, newcls);}return newcls;}else {// No Swift-side initialization callback.// Perform our own realization directly.mutex_locker_tlock(runtimeLock);returnrealizeClassWithoutSwift(cls, nil);}}实际上会调用swift runtime的初始化函数。
查找IMP#
这一步的逻辑很简单:
for (unsigned attempts =unreasonableClassCount();;) {if (curClass->cache.isConstantOptimizedCache(/* strict */true)) {#ifCONFIG_USE_PREOPT_CACHESimp =cache_getImp(curClass, sel);if (imp) goto done_unlock;curClass = curClass->cache.preoptFallbackClass();#endif} else {// curClass method list.method_t*meth =getMethodNoSuper_nolock(curClass, sel);if (meth) {imp = meth->imp(false);goto done;}if (slowpath((curClass = curClass->getSuperclass()) ==nil)) {// No implementation found, and method resolver didn't help.// Use forwarding.imp = forward_imp;break;}}// Halt if there is a cycle in the superclass chain.if (slowpath(--attempts ==0)) {_objc_fatal("Memory corruption in class list.");}// Superclass cache.imp =cache_getImp(curClass, sel);if (slowpath(imp == forward_imp)) {// Found a forward:: entry in a superclass.// Stop searching, but don't cache yet; call method// resolver for this class first.break;}if (fastpath(imp)) {// Found the method in a superclass. Cache it in this class.goto done;}}isConstantOptimizedCache-> dyld塞的prebuild buckets。如果有,尝试去读取- 否则,调用
getMethodNoSuper_nolock,去查方法表
/************************************************************************ getMethodNoSuper_nolock* fixme* Locking: runtimeLock must be read- or write-locked by the caller**********************************************************************/staticmethod_t*getMethodNoSuper_nolock(Classcls, SELsel){lockdebug::assert_locked(&runtimeLock.get());ASSERT(cls->isRealized());// fixme nil cls?// fixme nil sel?auto alternates = cls->data()->methodAlternates();if (auto *relativeList = alternates.relativeList)returngetMethodFromRelativeList(relativeList, sel);if (alternates.list)returngetMethodFromListArray(&alternates.list, 1, sel);if (auto *array = alternates.array) {auto listAlternates = array->listAlternates();if (listAlternates.oneList)returngetMethodFromListArray(&listAlternates.oneList, 1, sel);if (auto innerArray = listAlternates.array)returngetMethodFromListArray(innerArray, listAlternates.arrayCount, sel);if (auto *relativeList = listAlternates.listList)returngetMethodFromRelativeList(relativeList, sel);}returnnil;}cls->data() 对应的就是class的rwdata。而methodAlternative的代码如下:
// Get the class's method lists without wrapping the different// representations in a method_array_t. This allows the caller to directly// access the underlying representations and have separate code for them,// rather than relying on the iterator abstraction being sufficiently// optimized. This exists for getMethodNoSuper_nolock to call, other callers// should be able to just use methods().ALWAYS_INLINEMethodListAlternates methodAlternates() const {MethodListAlternates result = {};auto v =get_ro_or_rwe();if (v.is<class_rw_ext_t*>()) {result.array =&v.get<class_rw_ext_t*>(&ro_or_rw_ext)->methods;} else {auto &baseMethods = v.get<constclass_ro_t*>(&ro_or_rw_ext)->baseMethods;result.list = baseMethods.dyn_cast<method_list_t*>();result.relativeList = baseMethods.dyn_cast<relative_list_list_t<method_list_t>*>();}return result;}功能为获取class的方法表
写入缓存#
最后,将拿到的IMP写入到cache里,并返回IMP,为了方便下次做查询。
done:if (fastpath((behavior & LOOKUP_NOCACHE) ==0)) {#ifCONFIG_USE_PREOPT_CACHESwhile (cls->cache.isConstantOptimizedCache(/* strict */true)) {cls = cls->cache.preoptFallbackClass();}#endiflog_and_fill_cache(cls, imp, sel, inst, curClass);}#ifCONFIG_USE_PREOPT_CACHESdone_unlock:#endifruntimeLock.unlock();if (slowpath((behavior & LOOKUP_NIL) && imp == forward_imp)) {returnnil;}return imp;/************************************************************************ log_and_fill_cache* Log this method call. If the logger permits it, fill the method cache.* cls is the method whose cache should be filled.* implementer is the class that owns the implementation in question.**********************************************************************/staticvoidlog_and_fill_cache(Classcls, IMPimp, SELsel, idreceiver, Classimplementer){#ifSUPPORT_MESSAGE_LOGGINGif (slowpath(objcMsgLogEnabled && implementer)) {bool cacheIt =logMessageSend(implementer->isMetaClass(),cls->nameForLogging(),implementer->nameForLogging(),sel);if (!cacheIt) return;}#endifif (slowpath(msgSendCacheMissHook.isSet())) {auto hook = msgSendCacheMissHook.get();hook(cls, receiver, sel, imp);}cls->cache.insert(sel, imp, receiver);}收尾#
最后,执行栈回到TailCallFunctionPointer x17上,跳转到IMP对应的地址上,完成一次消息调用。
总结#
本文稍微深入得讲了下objc_msgSend的原理。我本来以为,objc_msgSend无非就是一个类似于消息中心一样的东西,没想到Apple针对这个做了很多非常细致的性能优化。
目前为止,Objective-C运行时消息派发的部分已经全部完结了🤮。接下来,终于可以回到Kotlin Native部分。