musl/src/stdio/fgetc.c, branch master musl - an implementation of the standard library for Linux-based systems optimize hot paths of getc with manual shrink-wrapping 2018年10月18日T03:16:35+00:00 Rich Felker dalias@aerifal.cx 2018年10月16日T05:08:21+00:00 dd8f02b7dce53d6b1c4282439f1636a2d63bee01 with these changes, in a program that has not created any threads besides the main thread and that has not called f[try]lockfile, getc performs indistinguishably from getc_unlocked. this was measured on several i386 and x86_64 models, and should hold on other archs too simply by the properties of the code generation. the case where the caller already holds the lock (via flockfile) is improved significantly as well (40-60% reduction in time on machines tested) and the case where locking is needed is improved somewhat (roughly 10%). the key technique used here is forcing the non-hot path out-of-line and enabling it to be a tail call. a static noinline function (conditional on __GNUC__) is used rather than the extern hiddens used elsewhere for this purpose, so that the compiler can choose non-default calling conventions, making it possible to tail-call to a callee that takes more arguments than the caller on archs where arguments are passed on the stack or must have space reserved on the stack for spilling the. the tid could just be reloaded via the thread pointer in locking_getc, but that would be ridiculously expensive on some archs where thread pointer load requires a trap or syscall.
with these changes, in a program that has not created any threads
besides the main thread and that has not called f[try]lockfile, getc
performs indistinguishably from getc_unlocked. this was measured on
several i386 and x86_64 models, and should hold on other archs too
simply by the properties of the code generation.
the case where the caller already holds the lock (via flockfile) is
improved significantly as well (40-60% reduction in time on machines
tested) and the case where locking is needed is improved somewhat
(roughly 10%).
the key technique used here is forcing the non-hot path out-of-line
and enabling it to be a tail call. a static noinline function
(conditional on __GNUC__) is used rather than the extern hiddens used
elsewhere for this purpose, so that the compiler can choose
non-default calling conventions, making it possible to tail-call to a
callee that takes more arguments than the caller on archs where
arguments are passed on the stack or must have space reserved on the
stack for spilling the. the tid could just be reloaded via the thread
pointer in locking_getc, but that would be ridiculously expensive on
some archs where thread pointer load requires a trap or syscall.
separate getc/putc from fgetc/fputc 2012年10月27日T23:52:40+00:00 Rich Felker dalias@aerifal.cx 2012年10月27日T23:52:40+00:00 8fc7b5965ac6a000c93c7362276a6a7b193647f4 for conformance, two functions should not have the same address. a conforming program could use the addresses of getc and fgetc in ways that assume they are distinct. normally i would just use a wrapper, but these functions are so small and performance-critical that an extra layer of function call could make the one that's a wrapper nearly twice as slow, so I'm just duplicating the code instead.
for conformance, two functions should not have the same address. a
conforming program could use the addresses of getc and fgetc in ways
that assume they are distinct. normally i would just use a wrapper,
but these functions are so small and performance-critical that an
extra layer of function call could make the one that's a wrapper
nearly twice as slow, so I'm just duplicating the code instead.
add some ugly aliases for LSB ABI compatibility 2012年06月03日T01:20:21+00:00 Rich Felker dalias@aerifal.cx 2012年06月03日T01:20:21+00:00 6a4b9472fb0a85e55030b37ec3017ba0319e03f9 for some nonsensical reason, glibc's headers use inline functions that redirect some of the standard functions to ugly nonstandard names (and likewise for some of their nonstandard functions).
for some nonsensical reason, glibc's headers use inline functions that
redirect some of the standard functions to ugly nonstandard names (and
likewise for some of their nonstandard functions).
add proper fuxed-based locking for stdio 2011年07月30日T12:02:14+00:00 Rich Felker dalias@aerifal.cx 2011年07月30日T12:02:14+00:00 dba68bf98fc708cea4c478278c889fc7ad802b00 previously, stdio used spinlocks, which would be unacceptable if we ever add support for thread priorities, and which yielded pathologically bad performance if an application attempted to use flockfile on a key file as a major/primary locking mechanism. i had held off on making this change for fear that it would hurt performance in the non-threaded case, but actually support for recursive locking had already inflicted that cost. by having the internal locking functions store a flag indicating whether they need to perform unlocking, rather than using the actual recursive lock counter, i was able to combine the conditionals at unlock time, eliminating any additional cost, and also avoid a nasty corner case where a huge number of calls to ftrylockfile could cause deadlock later at the point of internal locking. this commit also fixes some issues with usage of pthread_self conflicting with __attribute__((const)) which resulted in crashes with some compiler versions/optimizations, mainly in flockfile prior to pthread_create.
previously, stdio used spinlocks, which would be unacceptable if we
ever add support for thread priorities, and which yielded
pathologically bad performance if an application attempted to use
flockfile on a key file as a major/primary locking mechanism.
i had held off on making this change for fear that it would hurt
performance in the non-threaded case, but actually support for
recursive locking had already inflicted that cost. by having the
internal locking functions store a flag indicating whether they need
to perform unlocking, rather than using the actual recursive lock
counter, i was able to combine the conditionals at unlock time,
eliminating any additional cost, and also avoid a nasty corner case
where a huge number of calls to ftrylockfile could cause deadlock
later at the point of internal locking.
this commit also fixes some issues with usage of pthread_self
conflicting with __attribute__((const)) which resulted in crashes with
some compiler versions/optimizations, mainly in flockfile prior to
pthread_create.
major stdio overhaul, using readv/writev, plus other changes 2011年03月28日T05:14:44+00:00 Rich Felker dalias@aerifal.cx 2011年03月28日T05:14:44+00:00 e3cd6c5c265cd481db6e0c5b529855d99f0bda30 the biggest change in this commit is that stdio now uses readv to fill the caller's buffer and the FILE buffer with a single syscall, and likewise writev to flush the FILE buffer and write out the caller's buffer in a single syscall. making this change required fundamental architectural changes to stdio, so i also made a number of other improvements in the process: - the implementation no longer assumes that further io will fail following errors, and no longer blocks io when the error flag is set (though the latter could easily be changed back if desired) - unbuffered mode is no longer implemented as a one-byte buffer. as a consequence, scanf unreading has to use ungetc, to the unget buffer has been enlarged to hold at least 2 wide characters. - the FILE structure has been rearranged to maintain the locations of the fields that might be used in glibc getc/putc type macros, while shrinking the structure to save some space. - error cases for fflush, fseek, etc. should be more correct. - library-internal macros are used for getc_unlocked and putc_unlocked now, eliminating some ugly code duplication. __uflow and __overflow are no longer used anywhere but these macros. switch to read or write mode is also separated so the code can be better shared, e.g. with ungetc. - lots of other small things.
the biggest change in this commit is that stdio now uses readv to fill
the caller's buffer and the FILE buffer with a single syscall, and
likewise writev to flush the FILE buffer and write out the caller's
buffer in a single syscall.
making this change required fundamental architectural changes to
stdio, so i also made a number of other improvements in the process:
- the implementation no longer assumes that further io will fail
 following errors, and no longer blocks io when the error flag is set
 (though the latter could easily be changed back if desired)
- unbuffered mode is no longer implemented as a one-byte buffer. as a
 consequence, scanf unreading has to use ungetc, to the unget buffer
 has been enlarged to hold at least 2 wide characters.
- the FILE structure has been rearranged to maintain the locations of
 the fields that might be used in glibc getc/putc type macros, while
 shrinking the structure to save some space.
- error cases for fflush, fseek, etc. should be more correct.
- library-internal macros are used for getc_unlocked and putc_unlocked
 now, eliminating some ugly code duplication. __uflow and __overflow
 are no longer used anywhere but these macros. switch to read or
 write mode is also separated so the code can be better shared, e.g.
 with ungetc.
- lots of other small things.
initial check-in, version 0.5.0 2011年02月12日T05:22:29+00:00 Rich Felker dalias@aerifal.cx 2011年02月12日T05:22:29+00:00 0b44a0315b47dd8eced9f3b7f31580cf14bbfc01

AltStyle によって変換されたページ (->オリジナル) /