musl/src/stdio/getdelim.c, branch master musl - an implementation of the standard library for Linux-based systems fix undefined behavior in getdelim via null pointer arithmetic and memcpy 2021年09月12日T01:21:43+00:00 Rich Felker dalias@aerifal.cx 2021年09月12日T01:21:43+00:00 e3e7189c11d909199155327fd6a93dcc6b68c7b3 both passing a null pointer to memcpy with length 0, and adding 0 to a null pointer, are undefined. in some sense this is 'benign' UB, but having it precludes use of tooling that strictly traps on UB. there may be better ways to fix it, but conditioning the operations which are intended to be no-ops in the k==0 case on k being nonzero is a simple and safe solution.
both passing a null pointer to memcpy with length 0, and adding 0 to a
null pointer, are undefined. in some sense this is 'benign' UB, but
having it precludes use of tooling that strictly traps on UB. there
may be better ways to fix it, but conditioning the operations which
are intended to be no-ops in the k==0 case on k being nonzero is a
simple and safe solution.
getdelim: only grow buffer when necessary, improve OOM behavior 2018年09月17日T03:03:12+00:00 Rich Felker dalias@aerifal.cx 2018年09月17日T02:42:59+00:00 1f6cbdb434114139081fe65a9bafe775e9ab6c41 commit b114190b29417fff6f701eea3a3b3b6030338280 introduced spurious realloc of the output buffer in cases where the result would exactly fit in the caller-provided buffer. this is contrary to a strict reading of the spec, which only allows realloc when the provided buffer is "of insufficient size". revert the adjustment of the realloc threshold, and instead push the byte read by getc_unlocked (for which the adjustment was made) back into the stdio buffer if it does not fit in the output buffer, to be read in the next loop iteration. in order not to leave a pushed-back byte in the stdio buffer if realloc fails (which would violate the invariant that logical FILE position and underlying open file description offset match for unbuffered FILEs), the OOM code path must be changed. it would suffice move just one byte in this case, but from a QoI perspective, in the event of ENOMEM the entire output buffer (up to the allocated length reported via *n) should contain bytes read from the FILE stream. otherwise the caller has no way to distinguish trunated data from uninitialized buffer space. the SIZE_MAX/2 check is removed since the sum of disjoint object sizes is assumed not to be able to overflow, leaving just one OOM code path.
commit b114190b29417fff6f701eea3a3b3b6030338280 introduced spurious
realloc of the output buffer in cases where the result would exactly
fit in the caller-provided buffer. this is contrary to a strict
reading of the spec, which only allows realloc when the provided
buffer is "of insufficient size".
revert the adjustment of the realloc threshold, and instead push the
byte read by getc_unlocked (for which the adjustment was made) back
into the stdio buffer if it does not fit in the output buffer, to be
read in the next loop iteration.
in order not to leave a pushed-back byte in the stdio buffer if
realloc fails (which would violate the invariant that logical FILE
position and underlying open file description offset match for
unbuffered FILEs), the OOM code path must be changed. it would suffice
move just one byte in this case, but from a QoI perspective, in the
event of ENOMEM the entire output buffer (up to the allocated length
reported via *n) should contain bytes read from the FILE stream.
otherwise the caller has no way to distinguish trunated data from
uninitialized buffer space.
the SIZE_MAX/2 check is removed since the sum of disjoint object sizes
is assumed not to be able to overflow, leaving just one OOM code path.
fix null pointer subtraction and comparison in stdio 2018年09月16日T18:37:22+00:00 Rich Felker dalias@aerifal.cx 2018年09月16日T17:46:46+00:00 849e7603e9004fd292a93df64dd3524025f2987a morally, for null pointers a and b, a-b, a<b, and a>b should all be defined as 0; however, C does not define any of them. the stdio implementation makes heavy use of such pointer comparison and subtraction for buffer logic, and also uses null pos/base/end pointers to indicate that the FILE is not in the corresponding (read or write) mode ready for accesses through the buffer. all of the comparisons are fixed trivially by using != in place of the relational operators, since the opposite relation (e.g. pos>end) is logically impossible. the subtractions have been reviewed to check that they are conditional the stream being in the appropriate reading- or writing-through-buffer mode, with checks added where needed. in fgets and getdelim, the checks added should improve performance for unbuffered streams by avoiding a do-nothing call to memchr, and should be negligible for buffered streams.
morally, for null pointers a and b, a-b, a<b, and a>b should all be
defined as 0; however, C does not define any of them.
the stdio implementation makes heavy use of such pointer comparison
and subtraction for buffer logic, and also uses null pos/base/end
pointers to indicate that the FILE is not in the corresponding (read
or write) mode ready for accesses through the buffer.
all of the comparisons are fixed trivially by using != in place of the
relational operators, since the opposite relation (e.g. pos>end) is
logically impossible. the subtractions have been reviewed to check
that they are conditional the stream being in the appropriate reading-
or writing-through-buffer mode, with checks added where needed.
in fgets and getdelim, the checks added should improve performance for
unbuffered streams by avoiding a do-nothing call to memchr, and should
be negligible for buffered streams.
fix failure of getdelim to set stream orientation on error 2018年09月16日T12:46:26+00:00 Rich Felker dalias@aerifal.cx 2018年09月16日T12:40:46+00:00 5cd309f0cc3c92f3fabbaa499652a8329137c4de if EINVAL or ENOMEM happened before the first getc_unlocked, it was possible that the stream orientation had not yet been set.
if EINVAL or ENOMEM happened before the first getc_unlocked, it was
possible that the stream orientation had not yet been set.
reduce spurious inclusion of libc.h 2018年09月12日T18:34:37+00:00 Rich Felker dalias@aerifal.cx 2018年09月12日T04:08:09+00:00 5ce3737931bb411a8d167356d4d0287b53b0cbdc libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
libc.h was intended to be a header for access to global libc state and
related interfaces, but ended up included all over the place because
it was the way to get the weak_alias macro. most of the inclusions
removed here are places where weak_alias was needed. a few were
recently introduced for hidden. some go all the way back to when
libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented)
cancellation points had to include it.
remaining spurious users are mostly callers of the LOCK/UNLOCK macros
and files that use the LFS64 macro to define the awful *64 aliases.
in a few places, new inclusion of libc.h is added because several
internal headers no longer implicitly include libc.h.
declarations for __lockfile and __unlockfile are moved from libc.h to
stdio_impl.h so that the latter does not need libc.h. putting them in
libc.h made no sense at all, since the macros in stdio_impl.h are
needed to use them correctly anyway.
remove unused MIN macro from getdelim source file 2018年02月24日T16:38:53+00:00 Rich Felker dalias@aerifal.cx 2018年02月24日T16:38:53+00:00 aaa29c26eed4a09625e61c6af31d16b1a4163fc3
fix overly pessimistic realloc strategy in getdelim 2015年12月20日T05:39:35+00:00 Rich Felker dalias@aerifal.cx 2015年12月20日T05:32:46+00:00 c673158d91ad995ed59dd910777cd6464f61fe8e previously, getdelim was allocating twice the space needed every time it expanded its buffer to implement exponential buffer growth (in order to avoid quadratic run time). however, this doubling was performed even when the final buffer length needed was already known, which is the common case that occurs whenever the delimiter is in the FILE's buffer. this patch makes two changes to remedy the situation: 1. over-allocation is no longer performed if the delimiter has already been found when realloc is needed. 2. growth factor is reduced from 2x to 1.5x to reduce the relative excess allocation in cases where the delimiter is not initially in the buffer, including unbuffered streams. in theory these changes could lead to quadratic time if the same buffer is reused to process a sequence of lines successively increasing in length, but once this length exceeds the stdio buffer size, the delimiter will not be found in the buffer right away and exponential growth will still kick in.
previously, getdelim was allocating twice the space needed every time
it expanded its buffer to implement exponential buffer growth (in
order to avoid quadratic run time). however, this doubling was
performed even when the final buffer length needed was already known,
which is the common case that occurs whenever the delimiter is in the
FILE's buffer.
this patch makes two changes to remedy the situation:
1. over-allocation is no longer performed if the delimiter has already
been found when realloc is needed.
2. growth factor is reduced from 2x to 1.5x to reduce the relative
excess allocation in cases where the delimiter is not initially in the
buffer, including unbuffered streams.
in theory these changes could lead to quadratic time if the same
buffer is reused to process a sequence of lines successively
increasing in length, but once this length exceeds the stdio buffer
size, the delimiter will not be found in the buffer right away and
exponential growth will still kick in.
avoid updating caller's size when getdelim fails to realloc 2015年12月20日T04:43:31+00:00 Rich Felker dalias@aerifal.cx 2015年12月20日T04:43:31+00:00 d87f0a9a95f0a1228ee5579e5822a8c93bc96823 getdelim was updating *n, the caller's stored buffer size, before calling realloc. if getdelim then failed due to realloc failure, the caller would see in *n a value larger than the actual size of the allocated block, and use of that value is unsafe. in particular, passing it again to getdelim is unsafe. now, temporary storage is used for the desired new size, and *n is not written until realloc succeeds.
getdelim was updating *n, the caller's stored buffer size, before
calling realloc. if getdelim then failed due to realloc failure, the
caller would see in *n a value larger than the actual size of the
allocated block, and use of that value is unsafe. in particular,
passing it again to getdelim is unsafe.
now, temporary storage is used for the desired new size, and *n is not
written until realloc succeeds.
fix single-byte overflow of malloc'd buffer in getdelim 2015年10月25日T02:42:10+00:00 Rich Felker dalias@aerifal.cx 2015年10月25日T02:42:10+00:00 b114190b29417fff6f701eea3a3b3b6030338280 the buffer enlargement logic here accounted for the terminating null byte, but not for the possibility of hitting the delimiter in the buffer-refill code path that uses getc_unlocked, in which case two additional bytes (the delimiter and the null termination) are written without another chance to enlarge the buffer. this patch and the corresponding bug report are by Felix Janda.
the buffer enlargement logic here accounted for the terminating null
byte, but not for the possibility of hitting the delimiter in the
buffer-refill code path that uses getc_unlocked, in which case two
additional bytes (the delimiter and the null termination) are written
without another chance to enlarge the buffer.
this patch and the corresponding bug report are by Felix Janda.
fix getdelim to set the error indicator on all failures 2015年04月04日T14:53:09+00:00 Szabolcs Nagy nsz@port70.net 2015年04月04日T11:27:06+00:00 05e0e301e3efbeb399b9f3d96fab63aac18e601a

AltStyle によって変換されたページ (->オリジナル) /