commit 8cca79a72cccbdb54726125d690d7d0095fc2409 added use of SYS_pause to exit() without accounting for newer archs omitting the syscall. use the newly-added __sys_pause abstraction instead, which uses SYS_ppoll when SYS_pause is missing.
per the C and POSIX standards, calling exit "more than once", including via return from main, produces undefined behavior. this language predates threads, and at the time it was written, could only have applied to recursive calls to exit via atexit handlers. C++ likewise makes calls to exit from global dtors undefined. nonetheless, by the present specification as written, concurrent calls to exit by multiple threads also have undefined behavior. originally, our implementation of exit did have locking to handle concurrent calls safely, but that was changed in commit 2e55da911896a91e95b24ab5dc8a9d9b0718f4de based on it being undefined. from a standpoint of both hardening and quality of implementation, that change seems to have been a mistake. this change adds back locking, but with awareness of the lock owner so that recursive calls to exit can be trapped rather than deadlocking. this also opens up the possibility of allowing recursive calls to succeed, if future consensus ends up being in favor of that. prior to this change, exit already behaved partly as if protected by a lock as long as atexit was linked, but multiple threads calling exit could concurrently "pop off" atexit handlers and execute them in parallel with one another rather than serialized in the reverse order of registration. this was a likely unnoticed but potentially very dangerous manifestation of the undefined behavior. if on the other hand atexit was not linked, multiple threads calling exit concurrently could each run their own instance of global dtors, if any, likely producing double-free situations. now, if multiple threads call exit concurrently, all but the first will permanently block (in SYS_pause) until the process terminates, and all atexit handlers, global dtors, and stdio flushing/position consistency will be handled in the thread that arrived first. this is really the only reasonable way to define concurrent calls to exit. it is not recommended usage, but may become so in the future if there is consensus/standardization, as there is a push from the rust language community (and potentially other languages interoperating with the C runtime) to make concurrent calls to the language's exit interfaces safe even when multiple languages are involved in a program, and this is only possible by having the locking in the underlying C exit.
this cleans up what had become widespread direct inline use of "GNU C" style attributes directly in the source, and lowers the barrier to increased use of hidden visibility, which will be useful to recovering some of the efficiency lost when the protected visibility hack was dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially on archs where the PLT ABI is costly.
commit ad1cd43a86645ba2d4f7c8747240452a349d6bc1 eliminated preprocessor-level omission of references to the init/fini array symbols from object files going into libc.so. the references are weak, and the intent was that the linker would resolve them to zero in libc.so, but instead it leaves undefined references that could be satisfied at runtime. normally these references would be harmless, since the code using them does not even get executed, but some older binutils versions produce a linking error: when linking a program against libc.so, ld first tries to use the hidden init/fini array symbols produced by the linker script to satisfy the references in libc.so, then produces an error because the definitions are hidden. ideally ld would have already provided definitions of these symbols when linking libc.so, but the linker script for -shared omits them. to avoid this situation, the dynamic linker now provides its own dummy definitions of the init/fini array symbols for libc.so. since they are hidden, everything binds at ld time and no references remain in the dynamic symbol table. with modern binutils and --gc-sections, both the dummy empty array objects and the code referencing them get dropped at link time, anyway. the _init and _fini symbols are also switched back to using weak definitions rather than weak references since the latter behave somewhat problematically in general, and the weak definition approach was known to work well.
use weak definitions that the dynamic linker can override instead of preprocessor conditionals on SHARED so that the same libc start and exit code can be used for both static and dynamic linking.
this was originally added as a cheap but portable way to quell warnings about reaching the end of a function that does not return, but since _Exit is marked _Noreturn, it's not needed. removing it makes the call to _Exit into a tail call and shaves off a few bytes of code from minimal static programs.
the purpose of this logic is to avoid linking __stdio_exit unless any stdio reads (which might require repositioning the file offset at exit time) or writes (which might require flushing at exit time) could have been performed. previously, exit called two wrapper functions for __stdio_exit named __flush_on_exit and __seek_on_exit. both of these functions actually performed both tasks (seek and flushing) by calling the underlying __stdio_exit. in order to avoid doing this twice, an overridable data object __towrite_used was used to cause __seek_on_exit to act as a nop when __towrite was linked. now, exit only makes one call, directly to __stdio_exit. this is satisfiable by a weak dummy definition in exit.c, but the real definition is pulled in by either __toread.c or __towrite.c through their referencing a symbol which is defined only in __stdio_exit.c.
calling exit more than once invokes undefined behavior. in some cases it's desirable to detect undefined behavior and diagnose it via a predictable crash, but the code here was silently covering up an uncommon case (exit from more than one thread) and turning a much more common case (recursive calls to exit) into a permanent hang.
modern (4.7.x and later) gcc uses init/fini arrays, rather than the legacy _init/_fini function pasting and crtbegin/crtend ctors/dtors system, on most or all archs. some archs had already switched a long time ago. without following this change, global ctors/dtors will cease to work under musl when building with new gcc versions. the most surprising part of this patch is that it actually reduces the size of the init code, for both static and shared libc. this is achieved by (1) unifying the handling main program and shared libraries in the dynamic linker, and (2) eliminating the glibc-inspired rube goldberg machine for passing around init and fini function pointers. to clarify, some background: the function signature for __libc_start_main was based on glibc, as part of the original goal of being able to run some glibc-linked binaries. it worked by having the crt1 code, which is linked into every application, static or dynamic, obtain and pass pointers to the init and fini functions, which __libc_start_main is then responsible for using and recording for later use, as necessary. however, in neither the static-linked nor dynamic-linked case do we actually need crt1.o's help. with dynamic linking, all the pointers are available in the _DYNAMIC block. with static linking, it's safe to simply access the _init/_fini and __init_array_start, etc. symbols directly. obviously changing the __libc_start_main function signature in an incompatible way would break both old musl-linked programs and glibc-linked programs, so let's not do that. instead, the function can just ignore the information it doesn't need. new archs need not even provide the useless args in their versions of crt1.o. existing archs should continue to provide it as long as there is an interest in having newly-linked applications be able to run on old versions of musl; at some point in the future, this support can be removed.