Clean up atomics, non-normative changes

Jens Gustedt, INRIA and ICube, France

2025-04-29

target

integration into IS ISO/IEC 9899:202y

document history

document number   date     comment
n2389             201906   non-normative parts of n1955, n2064 and n2329
                           proposal for atomic_fetch_OP_explicit resolution
n3523             202503   rebase for C2y
                           removal of atomic_fetch_OP_explicit resolution
                           improve the integration into C terminology
                           extend to some terminology issues in Annex K
                           remove filler sentences
                           remove spurious double quotes
n3542             202504   this document, restrict silent wrap around to generic functions
                           remove allusion to multiple mappings in the same execution

1 Introduction

The concurrent integration of atomics, threads and Annex K into the C standard had a lot of difficulties, such as the use of non-normative terminology (processes, address types, regular type, personification of actions, directories), missing integration of threads and atomic synchronization, misleading introductory references to implicated parts of the standard, missing integration between clause 6 and 7, or spurious claims such as

This paper builds on the following series of papers: n1955, n2064, n2329, and n2389.

In particular, there were the following votes on n2329 during the 2019 London meeting (see n2376).

As an answer to that, n2389 combined these two aspects, but then did not find consensus in the 2019 Ithaca meeting because of the atomic_fetch_OP_explicit functions.

This paper therefore tries to address the non-normative changes that were agreed upon during the London meeting, and removes mis-wordings and ambiguities from the description of the atomic_fetch_OP_explicit functions. Many of the identified problems have already been resolved through other paths, and C23 now has far fewer defects in this domain. So we may now concentrate on the issues that are still left.

Additionally, we make a pass on the other material (mostly in non-normative text) that had been included for C11 but where proper integration between the different additions and with the standard terminology has been missed.

2 Proposed procedure

All the proposed changes are intended to be non-normative and are mostly independent of one another. Any of them could therefore have been presented in its own paper, resulting in perhaps about 10 papers. That approach would block about 5 hours of committee time, which appears far too much.

Instead, I propose that members of WG14 raise objections against particular changes they find in this paper. If these objections cannot be resolved by the end of June 2025, say, they should be raised, discussed and then voted on separately in the Brno session. For the remaining, undisputed parts I will then propose a single vote.

I will ask the convenor to reserve an appropriate time slot in the session, depending on the objections that are still open by then.

3 Wording changes

New text is (追記) underlined green (追記ここまで), removed text is (削除) struck-out red (削除ここまで). Possible reorganization of the paragraphs is left to the discretion of the editors.

5.2.2.5 Multi-threaded executions and data races

This clause is particularly misleading, since synchronization operations are not limited to the library and, even within the library, concern much more than the indicated subclauses.

5 (削除) The library defines atomic operations (7.17) and operations on mutexes (7.29.4) that are specially identified as synchronization operations. (削除ここまで) (追記) There are operations that are specially identified as synchronization operations. If the implementation supports the atomics extension these are operators and generic functions that act on atomic objects (6.5 and 7.17). If the implementation supports the thread extension these are calls to initialization functions (7.24.1 and 7.29.2), memory management functions (7.24.4), operations on mutexes (7.29.3.5, 7.29.3.6 and 7.29.4), and calls to some thread functions (7.29.5.1, 7.29.5.5 and 7.29.5.6). (追記ここまで) These operations play a special role in making (削除) assignments (削除ここまで) (追記) side effects (追記ここまで) in one thread visible to another. A synchronization operation on one or more memory locations is one of an acquire operation, a release operation, both an acquire and release operation, or a consume operation. A synchronization operation without an associated memory location is a fence and can be either an acquire fence, a release fence, or both an acquire and release fence. In addition, there are relaxed atomic operations, which are not synchronization operations (追記) but still are indivisible and strictly ordered (追記ここまで), and atomic read-modify-write operations, which (削除) have special characteristics. (削除ここまで) (追記) are those operations defined in 6.5 and 7.17 that act on an atomic object by reading its value, by performing an optional operation with that value and by storing back a value into that object. (追記ここまで)

...

11 Certain (削除) library calls (削除ここまで) (追記) operations (追記ここまで) synchronize with other (削除) library calls (削除ここまで) (追記) operations (追記ここまで) performed by another thread. In particular, an atomic operation A that performs a release operation on an object M synchronizes with an atomic operation B that performs an acquire operation on M and reads a value written by any side effect in the release sequence headed by A.
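Illustration (not part of the proposed wording): the release/acquire pairing described in this paragraph can be sketched as follows. The names payload, ready, publish and try_consume are hypothetical and chosen for this sketch only; in a real program the two functions would run in different threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;        /* plain, non-atomic data */
static atomic_bool ready;  /* static storage duration: starts out false */

/* Producer side: write the data, then publish it with a release store. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: an acquire load that reads true synchronizes with the
   release store above, so the write to payload happens before the read
   of payload below and there is no data race on payload. */
bool try_consume(int *out) {
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;      /* nothing published yet */
    }
    *out = payload;
    return true;
}
```

If both accesses to ready were memory_order_relaxed instead, they would still be indivisible, but they would no longer order the accesses to payload.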

...

Remove the use of non-standard terminology and spurious quotes, and use direct language instead of double negation.

33 NOTE 16 This effectively (削除) disallows compiler reordering (削除ここまで) (追記) enforces the ordering (追記ここまで) of atomic operations to a single object, even if both operations are (削除) " (削除ここまで)relaxed(削除) " (削除ここまで) loads. (削除) By doing so, it effectively makes the "cache coherence" guarantee provided by most hardware available to C atomic operations. (削除ここまで)

34 NOTE 17 The value observed by a load of an atomic object depends on the (削除) " (削除ここまで)happens before(削除) " (削除ここまで) relation, which in turn depends on the values observed by loads of atomic objects. The intended reading is that there exists an association of atomic loads with modifications they observe that, together with suitably chosen modification orders and the (削除) " (削除ここまで)happens before(削除) " (削除ここまで) relation derived as described previously, satisfy the resulting constraints as imposed here.

As defined here, a data race is not an event that fits into the happens before relation. So we can’t speak of a result of it.

35 The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. (削除) Any such data race results in undefined behavior. (削除ここまで) (追記) If a program execution contains a data race, the behavior is undefined. (追記ここまで)

Move to standard terminology and remove claims about "data-race-free" programs which is a term that is not introduced (only data-race-free program executions are).

36 NOTE 18 It can be shown that programs that correctly use (削除) simple mutexes (削除ここまで) (追記) operations on mtx_t (追記ここまで) and memory_order_seq_cst (追記) atomic (追記ここまで) operations to prevent all data races, and use no other synchronization operations, behave as though the operations executed by their constituent threads were (削除) simply (削除ここまで) interleaved, with each value computation of an object being the last value stored in that interleaving. This is (削除) normally (削除ここまで) referred to as (削除) " (削除ここまで)sequential consistency(削除) " (削除ここまで). (削除) However, this applies only to data-race-free programs, and data-race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded (削除ここまで) (追記) Many (追記ここまで) program transformations (追記) that are valid in the absence of multiple threads (追記ここまで) continue to be allowed (追記) for sequentially consistent programs (追記ここまで), since any (追記) execution of such a (追記ここまで) program that behaves differently as a result of such transformations necessarily has undefined behavior even (削除) before such a transformation is applied (削除ここまで) (追記) if the transformation were not applied (追記ここまで).

What is a "compiler transformation"? What are "atomics in question"?

37 NOTE 19 (削除) Compiler (削除ここまで) (追記) Program (追記ここまで) transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this document, since such an assignment can overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race. This includes implementations of data member assignment that overwrite adjacent members in separate memory locations. Reordering of atomic loads in cases in which the (削除) atomics in question can (削除ここまで) (追記) atomic operands potentially (追記ここまで) alias is also generally precluded, since this (削除) can (削除ここまで) (追記) may (追記ここまで) violate the coherence requirements.

Remove some useless blabla, move to standard terminology.

38 NOTE 20 Transformations that introduce a speculative read of a potentially shared memory location possibly (削除) will not preserve the semantics of the program as defined in this document, since they potentially (削除ここまで) introduce a data race. However, they (削除) are typically (削除ここまで) (追記) may be (追記ここまで) valid in the context of (削除) an optimizing compiler that targets a specific machine (削除ここまで) (追記) a specific implementation (追記ここまで) with well-defined semantics for data races. They (削除) would be invalid for a hypothetical machine (削除ここまで) (追記) are invalid for an implementation (追記ここまで) that is not tolerant of (追記) data (追記ここまで) races or provides hardware (追記) data (追記ここまで) race detection.

...

6.2.6 Representations of types

We collect all information about atomic types and their operation here to have a unified text. Some phrases in individual clauses on operators, for example, may then be removed.

6.2.6.1 General

9 (削除) Loads and stores of objects with atomic types are done with memory_order_seq_cst semantics. (削除ここまで) (追記) If not specified otherwise, synchronizing operations on atomic objects have memory_order_seq_cst memory consistency. (追記ここまで)

6.5.3.5 Postfix increment and decrement operators

2 ... (削除) Postfix ++ on an object with atomic type is a read-modify-write operation with memory_order_seq_cst memory order semantics (削除ここまで).

6.5.17.3 Compound assignment

4 ... (削除) If E1 has an atomic type, compound assignment is a read-modify-write operation with memory_order_seq_cst memory order semantics. (削除ここまで)
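Illustration (not part of the proposed wording): under the general rule in 6.2.6.1, operators applied to an atomic object are sequentially consistent without the individual operator clauses saying so. A minimal sketch, with hypothetical function names, that spells the same sequence once with operators and once with generic functions:

```c
#include <assert.h>
#include <stdatomic.h>

/* Operator forms: each one is an operation on the atomic object with
   memory_order_seq_cst consistency. */
int with_operators(void) {
    _Atomic int n = 0;
    n = 5;       /* atomic store */
    n++;         /* atomic read-modify-write */
    n += 10;     /* atomic read-modify-write */
    return n;    /* atomic load */
}

/* The same sequence written with generic functions and an explicit order. */
int with_functions(void) {
    _Atomic int n = 0;
    atomic_store_explicit(&n, 5, memory_order_seq_cst);
    atomic_fetch_add_explicit(&n, 1, memory_order_seq_cst);
    atomic_fetch_add_explicit(&n, 10, memory_order_seq_cst);
    return atomic_load_explicit(&n, memory_order_seq_cst);
}

/* Both spellings compute the same value. */
void check_equivalence(void) {
    assert(with_operators() == with_functions());
}
```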

6.7.2.4 Atomic type specifiers

add an example at the end of the clause

(追記) 5 EXAMPLE This disambiguation of the grammar is necessary in case a qualifier or specifier is followed by an opening parenthesis. (追記ここまで)
(追記)
typedef double toto;

void ic(int const tutu); // valid prototype, void g(int tutu)
void hc(int const(tutu)); // valid prototype, void g(int tutu)
void gc(int const(toto)); // valid prototype, void g(int(*)(double))

void ia(int _Atomic tutu); // valid prototype, void g(int tutu)
void ha(int _Atomic(tutu)); // invalid prototype, tutu not a type for _Atomic()
void ga(int _Atomic(toto)); // invalid prototype, two type names in parameter declaration
(追記ここまで)

7.17 Atomics <stdatomic.h>

7.17.1 Introduction

Atomics are not only relevant for threads but also for communication with signal handlers.

1 The header <stdatomic.h> defines several macros and declares several types and functions for performing atomic operations on data shared (追記) with signal handlers and (追記ここまで) between threads.302)
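Illustration (not part of the proposed wording): a minimal sketch of data shared with a signal handler through an atomic object. It assumes that atomic_int is always lock-free on the target, which the static assertion checks; the names seen, on_signal and handler_saw_signal are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption of this sketch: atomic_int is always lock-free, so the
   handler may access it. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "sketch assumes lock-free atomic_int");

static atomic_int seen;

static void on_signal(int sig) {
    (void)sig;
    atomic_store_explicit(&seen, 1, memory_order_relaxed);
}

/* Installs the handler, raises SIGINT once and reports whether the
   handler's store is visible afterwards. */
bool handler_saw_signal(void) {
    atomic_store(&seen, 0);
    void (*previous)(int) = signal(SIGINT, on_signal);
    if (previous == SIG_ERR) {
        return false;
    }
    raise(SIGINT);             /* the handler runs before raise returns */
    signal(SIGINT, previous);  /* restore the earlier disposition */
    return atomic_load(&seen) == 1;
}
```

The signal is raised only once because it is implementation-defined whether the disposition is reset to SIG_DFL when the handler is entered.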

...

This paragraph about the synopses is currently a weird mixture of adding precision to the synopses and adding semantic properties to certain operations.

6 In the following synopses:

Remove the useless definition of M (no specific operation is defined for pointer types).

The following item attempts to add a semantic property to the non-_explicit operations. This addition is superfluous because it is already governed by a general rule and so normatively it is not necessary. Nevertheless, it might be good to point this property out, so we transform this into a note. Also add a plural "s" because there is one generic function that has two memory_order parameters.

Note to the editors: maybe this note should be shifted to the end of the clause and joined with the note that is already placed there.

  • (削除) The functions not ending in _explicit have the same semantics as the corresponding _explicit function with memory_order_seq_cst for the memory_order arguments. (削除ここまで)
(追記) NOTE: Most of the operations are provided by a generic function with a name with the suffix _explicit and one that omits that suffix. For the first, one or two trailing parameters of type memory_order specify the operation ordering. For the second, the prototype omits these trailing parameters and the operation ordering is as if additional arguments of value memory_order_seq_cst were provided to the first kind of function in every function call. (追記ここまで)
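Illustration (not part of the proposed wording): the two kinds of generic function side by side, once for an operation with one trailing memory_order parameter and once for the compare-exchange operation that has two. The function names add_both_ways and cas_both_ways are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One trailing memory_order parameter: fetch-and-add. */
int add_both_ways(void) {
    atomic_int n = 0;
    atomic_fetch_add(&n, 1);  /* as if memory_order_seq_cst were passed */
    atomic_fetch_add_explicit(&n, 1, memory_order_seq_cst);
    return atomic_load(&n);
}

/* Two trailing memory_order parameters (success, failure):
   compare-exchange. */
bool cas_both_ways(void) {
    atomic_int n = 7;
    int expected = 7;
    bool first = atomic_compare_exchange_strong(&n, &expected, 8);
    expected = 8;
    bool second = atomic_compare_exchange_strong_explicit(
        &n, &expected, 9, memory_order_seq_cst, memory_order_seq_cst);
    assert(first && second);
    return atomic_load(&n) == 9;
}
```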

...

Replace some unredacted text from the original proposal by standardese.

8 NOTE (削除) Many operations are volatile-qualified. The "volatile as device register" semantics have not changed in the standard. This qualification means that volatility is preserved when applying these operations to volatile objects. (削除ここまで) (追記) Most of these generic functions have volatile-qualified parameters to allow their application to volatile-qualified objects. (追記ここまで)

...

7.17.2.2 The atomic_init generic function

...

Be clear that atomic_init does not synchronize and avoid repetition.

3 Although this function initializes an atomic object, (削除) it does not avoid data races (削除ここまで) (追記) it is not a synchronizing operation (追記ここまで); concurrent access to the object being initialized, even via an atomic operation, constitutes a data race.
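Illustration (not part of the proposed wording): because atomic_init does not synchronize, the initialization has to be complete before the object becomes reachable by any other thread. A minimal sketch with a hypothetical struct counter:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct counter {
    atomic_uint hits;
};

/* Must complete before any other thread can reach *c, for example
   before the pointer is handed to a newly created thread. */
void counter_setup(struct counter *c) {
    assert(c != NULL);
    atomic_init(&c->hits, 0u);
}

/* Safe to call concurrently once counter_setup has completed and the
   object has been published by some synchronizing operation. */
unsigned counter_hit(struct counter *c) {
    return atomic_fetch_add(&c->hits, 1u) + 1u;
}
```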

...

7.17.3 Order and consistency

7.17.3.1 General

It is not necessarily clear to the occasional reader what a "stronger" memory_order specification (see 7.17.7.5) would be. Therefore, add a new note after p12 to provide words for a relation between different memory consistency models.

p12′ (追記) NOTE 2′ The memory orderings of memory_order impose different ordering constraints on certain operations. memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_acq_rel and memory_order_seq_cst form an inclusive chain of such constraints, from weakest to strongest. memory_order_release imposes constraints that are incompatible with memory_order_consume and memory_order_acquire, and that are stronger than memory_order_relaxed and weaker than memory_order_acq_rel. (追記ここまで)
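Illustration (not part of the proposed wording): the relation described in this note is a partial order, which can be sketched as a predicate. The helper names chain_position and order_at_least are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Position in the chain relaxed < consume < acquire < acq_rel < seq_cst.
   memory_order_release is not on the chain and yields -1. */
static int chain_position(memory_order order) {
    switch (order) {
    case memory_order_relaxed: return 0;
    case memory_order_consume: return 1;
    case memory_order_acquire: return 2;
    case memory_order_acq_rel: return 3;
    case memory_order_seq_cst: return 4;
    default:                   return -1;
    }
}

/* True if a imposes at least the constraints that b imposes. */
bool order_at_least(memory_order a, memory_order b) {
    if (a == b) {
        return true;
    }
    if (a == memory_order_release) {
        return b == memory_order_relaxed;  /* only stronger than relaxed */
    }
    if (b == memory_order_release) {
        return chain_position(a) >= 3;     /* acq_rel or seq_cst */
    }
    assert(chain_position(a) >= 0 && chain_position(b) >= 0);
    return chain_position(a) >= chain_position(b);
}
```

With this predicate, memory_order_release and memory_order_acquire are incomparable: neither order_at_least(release, acquire) nor order_at_least(acquire, release) holds.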

7.17.5 Lock-free property

7.17.5.1 General

1 The atomic lock-free macros indicate the lock-free property of integer and (削除) address (削除ここまで) atomic (追記) pointer (追記ここまで) types. A value of 0 indicates that the type is never lock-free; a value of 1 indicates that the type is sometimes lock-free; a value of 2 indicates that the type is always lock-free.

2 (追記) NOTE In addition to the synchronization properties between threads, the lock-free property of a type warrants that operations are perceived indivisible and strictly ordered in the presence of signals, see 5.2.2.4. (追記ここまで)
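Illustration (not part of the proposed wording): a minimal sketch that maps the value of a lock-free macro to the wording of paragraph 1. The function names are hypothetical; ATOMIC_INT_LOCK_FREE and ATOMIC_POINTER_LOCK_FREE are the standard macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Maps the value of a lock-free macro to the wording of 7.17.5.1 p1. */
const char *lock_free_wording(int macro_value) {
    switch (macro_value) {
    case 0:  return "never lock-free";
    case 1:  return "sometimes lock-free";
    case 2:  return "always lock-free";
    default: return NULL;  /* not a value the macros can take */
    }
}

const char *int_lock_free(void) {
    const char *wording = lock_free_wording(ATOMIC_INT_LOCK_FREE);
    assert(wording != NULL);
    return wording;
}

const char *pointer_lock_free(void) {
    const char *wording = lock_free_wording(ATOMIC_POINTER_LOCK_FREE);
    assert(wording != NULL);
    return wording;
}
```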

Recommended practice

Operations that are lock-free should also be address-free. That is, atomic operations on the same (削除) memory (削除ここまで) (追記) storage (追記ここまで) location via two different addresses (追記) (e.g. when mapped by an implementation-specific feature to different addresses in concurrent program executions) (追記ここまで) will (削除) communicate atomically (削除ここまで) (追記) synchronize (for a memory order other than relaxed) and be indivisible and strictly ordered (追記ここまで). The implementation should not depend on any (削除) per-process (削除ここまで) (追記) execution-specific (追記ここまで) state. (削除) This restriction enables communication via memory mapped into a process more than once and memory shared between two processes. (削除ここまで)

7.17.6 Atomic integer types

...

Recommended practice

3 The representation of an atomic integer type is not required to have the same size as the (削除) corresponding regular type (削除ここまで) (追記) non-atomic version of the direct type (追記ここまで) but it should have the same size whenever possible, as it eases effort required to port existing code.

...

7.17.7 Operations on atomic types

7.17.7.1 General

1 (削除) There are only a few kinds of operations on atomic types, though there are many instances of those kinds. This subclause specifies each general kind. (削除ここまで) (追記) This clause describes several generic functions that operate on all atomic types other than atomic_flag. Also, 7.17.7.6 provides such generic functions for some read-modify-write operations on atomic integer types that are defined in addition to the operators defined in clauses 6.5.3.5, 6.5.4.2 and 6.5.17. (追記ここまで)

7.17.7.5 The atomic_compare_exchange generic functions

...

Description

The failure argument shall not be memory_order_release nor memory_order_acq_rel. The failure argument shall (削除) be no stronger (削除ここまで) (追記) not impose more constraints on the operation (追記ここまで) than the success argument.
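Illustration (not part of the proposed wording): a compare-exchange loop whose failure order is memory_order_acquire and whose success order is memory_order_acq_rel, so the failure argument imposes no more constraints than the success argument. The function name raise_to_at_least is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Raises *target to at least candidate and returns the value that was
   observed before any update. */
int raise_to_at_least(atomic_int *target, int candidate) {
    assert(target != NULL);
    int seen = atomic_load_explicit(target, memory_order_relaxed);
    while (seen < candidate &&
           !atomic_compare_exchange_weak_explicit(
               target, &seen, candidate,
               memory_order_acq_rel,    /* success: read-modify-write */
               memory_order_acquire)) { /* failure: load only */
        /* seen now holds the value that made the comparison fail;
           the loop condition decides whether to retry */
    }
    return seen;
}
```

Passing memory_order_release or memory_order_acq_rel as the failure argument would violate the first requirement, since a failed compare-exchange performs no store.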

7.17.7.6 The atomic_fetch and modify generic functions

...

In the Synopsis replace M by C.

...

Description

Atomically replaces the value pointed to by object with the result of the computation applied to the value pointed to by object and the given operand. Memory is affected according to the value of order. These operations are atomic read-modify-write operations (5.2.2.5). (削除) For signed integer types, arithmetic performs silent wraparound on integer overflow; there are no undefined results. (削除ここまで) (追記) The value stored is the mathematical result of the operation wrapped around to the width of the non-atomic type C. FTNA) (追記ここまで)

(追記) FTNA) For the generic functions of this clause this completely defines the behavior. (追記ここまで)

(削除) For address types, the result may be an undefined address, but the operations otherwise have no undefined behavior. (削除ここまで)
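Illustration (not part of the proposed wording): for the generic functions of this clause, the stored value wraps around even for signed types, whereas the same addition on a plain int would overflow. The function name fetch_add_result is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Adds delta to a fresh atomic int holding start and returns the value
   that ends up stored in the object. */
int fetch_add_result(int start, int delta) {
    atomic_int n = start;
    int previous = atomic_fetch_add(&n, delta);
    assert(previous == start);  /* the function returns the old value */
    return atomic_load(&n);
}
```

Under this rule fetch_add_result(INT_MAX, 1) yields INT_MIN, while the expression INT_MAX + 1 on non-atomic operands has undefined behavior.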

7.17.8 Atomic flag type and operations

7.17.8.1 General

1 The atomic_flag type provides (削除) the classic test-and-set functionality. It has (削除ここまで) (追記) an atomic data primitive that has exactly (追記ここまで) two states, set and clear.

2 ...

3 NOTE Hence, as per 7.17.5, the operations should also be address-free. No other type requires lock-free operations, so the atomic_flag type is the minimum (削除) hardware-implemented (削除ここまで) type needed to conform to this document (追記) that is asynchronous signal safe and that is expected to be compatible with implementation-specific extensions for shared objects between different program executions (追記ここまで). (削除) The remaining types can be emulated with atomic_flag, though with less than ideal properties. (削除ここまで)
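Illustration (not part of the proposed wording): the two states of atomic_flag are enough to build a minimal try-lock. The names busy, try_enter and leave are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag busy = ATOMIC_FLAG_INIT;  /* starts in the clear state */

/* Returns true if the caller moved the flag from clear to set. */
bool try_enter(void) {
    /* test-and-set returns the previous state: false means it was clear */
    return !atomic_flag_test_and_set_explicit(&busy, memory_order_acquire);
}

/* Puts the flag back into the clear state. */
void leave(void) {
    atomic_flag_clear_explicit(&busy, memory_order_release);
}
```

Since operations on atomic_flag are required to be lock-free, this sketch does not rely on any per-type lock-free macro.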

...

7.30.3.1 General

...

386) This does not mean that these functions are forbidden to read global state that describes the time and calendar settings of the execution, such as the LC_TIME locale or the implementation-defined specification of the local time zone. Only the setting of that state by setlocale or by means of implementation-defined functions can constitute (追記) data (追記ここまで) races.

K.3.5.2 Operations on files

K.3.5.2.1 The tmpfile_s function

...

Recommended practice

(note to the editors: the paragraph number is missing, similar changes should also be applied to the clause of tmpfile)

(追記) 5′ (追記ここまで)

It (削除) should be (削除ここまで) (追記) is (追記ここまで) possible to open at least TMP_MAX_S temporary files during the (削除) lifetime of the program (削除ここまで) (追記) program execution (追記ここまで) (this limit can be shared with tmpnam_s) and there should be no limit on the number simultaneously open other than this limit and any limit on the number of open files (FOPEN_MAX).

...

K.3.5.2.2 The tmpnam_s function

...

Recommended practice

People don’t create files, concurrent program executions do. Race condition is not an introduced term. Directories are not a concept that we can refer to in this standard.

7 After a program (追記) execution (追記ここまで) obtains a file name using the tmpnam_s function and before the program (追記) execution (追記ここまで) creates a file with that name, the possibility exists that (削除) someone else can create (削除ここまで) (追記) a concurrent program execution creates (追記ここまで) a file with that same name. To avoid this (削除) race (削除ここまで) condition, the tmpfile_s function should be used instead of tmpnam_s when possible. (削除) One situation that requires the use of the tmpnam_s function is when the program needs to create a temporary directory rather than a temporary file. (削除ここまで)

8 Implementations should take care in choosing the patterns used for names returned by tmpnam_s. For example, making a thread ID part of the names avoids the (削除) race condition and (削除ここまで) possible conflict (削除) when (削除ここまで) (追記) that (追記ここまで) multiple programs (追記) and threads (追記ここまで) run (削除) simultaneously by the same user (削除ここまで) (追記) concurrently and (追記ここまで) generate the same temporary file names.

Acknowledgments

Thanks to Javier Múgica for review and text improvements.
