musl/src/stdio/getc.c, branch master
musl - an implementation of the standard library for Linux-based systems
optimize hot paths of getc with manual shrink-wrapping 2018-10-18T03:16:35+00:00 Rich Felker dalias@aerifal.cx 2018-10-16T05:08:21+00:00 dd8f02b7dce53d6b1c4282439f1636a2d63bee01
with these changes, in a program that has not created any threads
besides the main thread and that has not called f[try]lockfile, getc
performs indistinguishably from getc_unlocked. this was measured on
several i386 and x86_64 models, and should hold on other archs too
simply by the properties of the code generation.
the case where the caller already holds the lock (via flockfile) is
improved significantly as well (40-60% reduction in time on machines
tested) and the case where locking is needed is improved somewhat
(roughly 10%).
the key technique used here is forcing the non-hot path out-of-line
and enabling it to be a tail call. a static noinline function
(conditional on __GNUC__) is used rather than the extern hiddens used
elsewhere for this purpose, so that the compiler can choose
non-default calling conventions, making it possible to tail-call to a
callee that takes more arguments than the caller on archs where
arguments are passed on the stack or must have space reserved on the
stack for spilling them. the tid could just be reloaded via the thread
pointer in locking_getc, but that would be ridiculously expensive on
some archs where thread pointer load requires a trap or syscall.
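a minimal sketch of the shape described above, not the actual musl source: the hot path does a single test on the lock word and either reads inline or tail-calls a static noinline helper, passing along the tid it already loaded. struct toy_file, toy_tid, the slow_calls counter and every toy_* name are assumptions made up for illustration, and the plain stores stand in for real atomic locking.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-in for FILE; not musl's real layout */
struct toy_file {
	const unsigned char *rpos, *rend; /* unread bytes in the buffer */
	int lock;       /* <0: locking not needed, 0: free, >0: owner tid */
	int slow_calls; /* demo-only counter of cold-path entries */
};

/* hypothetical: stands in for the tid found via the thread pointer */
static int toy_tid = 1;

/* unlocked read: next buffered byte, or -1 when the buffer is empty */
static int toy_getc_unlocked(struct toy_file *f)
{
	return f->rpos != f->rend ? *f->rpos++ : -1;
}

/* cold path, forced out of line so the hot path stays small. it takes
 * one more argument than its caller; being static lets the compiler
 * pick a calling convention that still permits a tail call. */
#ifdef __GNUC__
__attribute__((__noinline__))
#endif
static int toy_locking_getc(struct toy_file *f, int tid)
{
	assert(f->lock == 0); /* single-threaded demo: lock must be free */
	f->slow_calls++;
	f->lock = tid;        /* real code would take the lock atomically */
	int c = toy_getc_unlocked(f);
	f->lock = 0;
	return c;
}

/* hot path: no locking needed, or the caller already holds the lock */
static int toy_getc(struct toy_file *f)
{
	int l = f->lock;
	int tid = toy_tid;
	if (l < 0 || l == tid)
		return toy_getc_unlocked(f);
	return toy_locking_getc(f, tid); /* tail position */
}
```

with lock set to -1 or to the caller's own tid, toy_getc never enters the helper; only a free lock sends it down the cold path.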
separate getc/putc from fgetc/fputc 2012-10-27T23:52:40+00:00 Rich Felker dalias@aerifal.cx 2012-10-27T23:52:40+00:00 8fc7b5965ac6a000c93c7362276a6a7b193647f4
for conformance, two functions should not have the same address. a
conforming program could use the addresses of getc and fgetc in ways
that assume they are distinct. normally i would just use a wrapper,
but these functions are so small and performance-critical that an
extra layer of function call could make the one that's a wrapper
nearly twice as slow, so I'm just duplicating the code instead.
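a small sketch of the trade-off, using hypothetical dup_getc/dup_fgetc functions over a made-up struct dup_stream rather than the real stdio symbols: both functions carry the same body, so neither pays for an extra call, and their addresses still compare unequal.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stream type, for illustration only */
struct dup_stream {
	const unsigned char *rpos, *rend;
};

/* the body is duplicated on purpose instead of having one function
 * call the other, so each is a distinct function with no wrapper cost */
int dup_fgetc(struct dup_stream *f)
{
	assert(f != NULL);
	return f->rpos != f->rend ? *f->rpos++ : -1;
}

int dup_getc(struct dup_stream *f)
{
	assert(f != NULL);
	return f->rpos != f->rend ? *f->rpos++ : -1;
}
```

an alias would give both names a single address, and a wrapper would add a call on every character; duplication avoids both at the cost of a few repeated lines.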
major stdio overhaul, using readv/writev, plus other changes 2011-03-28T05:14:44+00:00 Rich Felker dalias@aerifal.cx 2011-03-28T05:14:44+00:00 e3cd6c5c265cd481db6e0c5b529855d99f0bda30
the biggest change in this commit is that stdio now uses readv to fill
the caller's buffer and the FILE buffer with a single syscall, and
likewise writev to flush the FILE buffer and write out the caller's
buffer in a single syscall.
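the read side of this idea can be sketched as follows, assuming a hypothetical struct rv_stream with a fixed 64-byte buffer (not musl's FILE, and not its actual read backend): one readv call fills the caller's buffer first and lets any surplus spill into the stream's own buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* hypothetical stream with its own buffer; not musl's FILE */
struct rv_stream {
	int fd;
	unsigned char buf[64];
	size_t rpos, rend; /* unread bytes are buf[rpos..rend) */
};

/* read up to len bytes into dest with a single readv; bytes beyond len
 * land in s->buf for later reads. assumes s->buf is empty on entry.
 * returns the number of bytes stored in dest, or -1 on error. */
ssize_t rv_read(struct rv_stream *s, unsigned char *dest, size_t len)
{
	assert(s->rpos == s->rend);
	struct iovec iov[2] = {
		{ .iov_base = dest,   .iov_len = len },
		{ .iov_base = s->buf, .iov_len = sizeof s->buf },
	};
	ssize_t n = readv(s->fd, iov, 2);
	if (n < 0)
		return -1;
	s->rpos = 0;
	if ((size_t)n <= len) {
		s->rend = 0;           /* nothing spilled into s->buf */
		return n;
	}
	s->rend = (size_t)n - len; /* surplus now buffered in s->buf */
	return (ssize_t)len;
}
```

the write side would mirror this with writev: the first iovec pointing at the pending stream buffer and the second at the caller's data.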
making this change required fundamental architectural changes to
stdio, so i also made a number of other improvements in the process:
- the implementation no longer assumes that further io will fail
 following errors, and no longer blocks io when the error flag is set
 (though the latter could easily be changed back if desired)
- unbuffered mode is no longer implemented as a one-byte buffer. as a
 consequence, scanf unreading has to use ungetc, so the unget buffer
 has been enlarged to hold at least 2 wide characters.
- the FILE structure has been rearranged to maintain the locations of
 the fields that might be used in glibc getc/putc type macros, while
 shrinking the structure to save some space.
- error cases for fflush, fseek, etc. should be more correct.
- library-internal macros are used for getc_unlocked and putc_unlocked
 now, eliminating some ugly code duplication. __uflow and __overflow
 are no longer used anywhere but these macros. switch to read or
 write mode is also separated so the code can be better shared, e.g.
 with ungetc.
- lots of other small things.
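the macro item in the list above can be pictured roughly like this. it is a sketch under assumptions: struct m_stream, m_uflow and m_overflow are hypothetical stand-ins for the internal FILE fields and for __uflow/__overflow, and the stubs only count calls instead of refilling or flushing anything.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stream; field names chosen for this sketch only */
struct m_stream {
	unsigned char *rpos, *rend; /* read window */
	unsigned char *wpos, *wend; /* write window */
	int uflow_calls, overflow_calls; /* demo-only counters */
};

/* stub slow paths: a real one would refill or flush the buffer */
static int m_uflow(struct m_stream *f)
{
	assert(f->rpos == f->rend); /* only reached on an empty buffer */
	f->uflow_calls++;
	return -1;
}

static int m_overflow(struct m_stream *f, int c)
{
	(void)c;
	f->overflow_calls++;
	return -1;
}

/* fast paths as macros; note that f is evaluated more than once */
#define m_getc_unlocked(f) \
	((f)->rpos != (f)->rend ? *(f)->rpos++ : m_uflow((f)))

#define m_putc_unlocked(c, f) \
	((f)->wpos != (f)->wend \
		? (*(f)->wpos++ = (unsigned char)(c)) \
		: m_overflow((f), (c)))
```

keeping the buffer check in one macro means every caller shares the same fast path, and the slow-path functions are referenced from exactly one place.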
initial check-in, version 0.5.0 2011-02-12T05:22:29+00:00 Rich Felker dalias@aerifal.cx 2011-02-12T05:22:29+00:00 0b44a0315b47dd8eced9f3b7f31580cf14bbfc01
