Jens Gustedt, INRIA and ICube, France
2025-05-27
integration into IS ISO/IEC 9899:202y
| document number | date | comment |
|---|---|---|
| n3447 | 202501 | Original proposal, partially accepted in Graz |
| n3558 | 202505 | Elaborate the different changes |
| | | Add one new UB that flew under the radar |
CC BY, see https://creativecommons.org/licenses/by/4.0
The recent campaign for slaying daemons has revealed that some of the undefined behavior (UB) in the current C standard doesn’t even exist: some of the situations in J.2 that would in principle result in UB cannot trigger at all. The cause is misformulations in the normative text that seem to indicate UB where in fact there are only constraint violations or unspecified behavior.
We say that a semantic non-constraint requirement is a ghost-UB if no conforming program in any execution may ever violate it.
The present paper deals with ghost-UB that is attributed to constant expressions, namely J.2 (50) to (53) in the counting as in n3467 before Graz. The principal observation here is that the term "constant expression" is a syntax term and not a semantic term. So either an expression is a constant expression or it isn’t, and the difference between the two is not behavior (semantics) but syntax alone.
In fact, all uses in the standard of the term are either covered by constraint violations (such as for array designators in initializers) or by the fact that whether an expression is a constant expression (or not) distinguishes different categories (such as VLA and arrays of known constant size). In all of these cases, it makes no sense to speak of UB.
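The two kinds of situation mentioned above can be sketched in C as follows. This is only an illustration, assuming a compiler that supports designated initializers and VLAs (VLA support is optional since C11); the names N, table, size_fixed and size_vla are hypothetical and do not appear in the standard.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };  /* N is an enumeration constant, so N is an integer constant expression */

/* Array designators must be integer constant expressions; a non-constant
   designator is a constraint violation that is diagnosed, not UB. */
static const int table[] = { [N] = 1 };

/* int[N] has known constant size, so this can be checked during translation. */
static_assert(sizeof(int[N]) == N * sizeof(int), "size known at translation time");

/* a has known constant size: sizeof(a) is determined at translation time. */
size_t size_fixed(void) {
    int a[N];
    return sizeof(a);
}

/* n is not a constant expression, so b is a VLA and sizeof(b) is
   evaluated each time this function is executed. */
size_t size_vla(int n) {
    int b[n];
    return sizeof(b);
}
```

Both functions return the same number for n equal to N; what differs is the category of the array type, which is decided by syntax alone.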
In fact, there are only two types of UB that remain in this clause. The first concerns a possible discrepancy between evaluation of floating-point expressions in the translation and execution environment; this UB is now handled as J.2 (45) in n3550. The second is one that so far flew below the radar, namely the use of the member operator on union constants for members other than the one that is initialized.
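The first of these two situations can be pictured with a small C sketch. It is an illustration only, under the assumption that the initializer of a static object is evaluated during translation while the function body is evaluated at run time; the names third_translated, third_executed and third_discrepancy are hypothetical.

```c
/* Arithmetic constant expression: an implementation may compute this
   value in the translation environment. */
static const double third_translated = 1.0 / 3.0;

/* The volatile qualification keeps the operands from being constant
   folded, so the division happens in the execution environment. */
double third_executed(void) {
    volatile double one = 1.0;
    volatile double three = 3.0;
    return one / three;
}

/* Absolute difference between the two evaluations. */
double third_discrepancy(void) {
    double d = third_translated - third_executed();
    return d < 0.0 ? -d : d;
}
```

The two values need not be bit-identical; the requirement discussed here only concerns the translation-time evaluation not having less range or precision than the run-time one.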
Otherwise, the standard does not seem to intend to leave constant expressions as a UB extension point: 6.6.1 p14 explicitly states that the use of extensions to the concept of constant expressions is implementation-defined.
Once convinced that we have ghost-UB, the easiest way to deal with the situation is just to remove the useless listings (50) to (53) in J.2 of n3467. This has already been voted in Graz and in n3550 the items are already removed.
We think that this alone would not be very user friendly and that users still would trip over the many "shall" that are confusingly applied in the text.
In fact, most of the text in 6.6.1 is even placed in the wrong category. The definitions made there are purely descriptive and convey no semantics beyond that. Thus we propose to reorder the text such that most of it appears under "Description", and to leave in "Constraints" and "Semantics" only those entries that must remain there. These deal with cases where the evaluated value does not fit:
Interestingly, for the last two the implied UB didn’t even make it into J.2’s list of n3467, so we also add it there.
No normative change to the concept of "constant expression" is intended by this paper.
There is a subtle change, though, for extended constant expressions. The current text formulates a constraint for operators that may be used within a constant expression and thereby either
-pedantic in some environments).
This paper moves this possible use to syntax alone. Thereby it enables implementations in particular to extend constant expressions to constexpr functions (5, 6) to be accounted as an integer constant expression.
New text is (追記) underlined green (追記ここまで), removed text is (削除) stroke-out red (削除ここまで). Reorganization of the paragraphs is indicated in the running text; Xa refers to new paragraph numbers, pX to current numbers.
Existing footnotes are unchanged and their numbers refer to n3550.
We propose to reorder this clause completely and to remove most "shall" by just factual description and add some explanations where this seems appropriate. This means that the following is a complete replacement of the corresponding section. Diff-marks are applied on a paragraph level, that is moved paragraphs without other changes occur without diff-marks.
6.6.1 General
Syntax
1a constant-expression: conditional-expression
Description
2a
(削除) A constant expression can be evaluated during translation rather than runtime, and accordingly can be used in any place that a constant can be. (削除ここまで)(追記) The fact that a given conditional expression forms a constant expression is detected in translation phases 4XXX) and 7. In most of the cases, the value of the constant expression is determined in translation phases 1-7 (i.e. before linking). Values of address constants or arithmetic constant expressions with floating type and values that are derived from these are possibly only determined during linking (translation phase 8) or just before program startup. (追記ここまで)
Add the footnote
(追記) XXX) These are the integer constant expressions that are used for conditional inclusion (6.10.2) and binary resource inclusion (6.10.4). (追記ここまで)
Reuse a descriptive phrase from p5, complement it with more information and form a new paragraph.
3a An expression that evaluates to a constant is required in several contexts. (追記) The most general form appears in initializers for objects whose value is determined at translation time or program startup such as objects with static storage duration or with the constexpr specifier. Additionally, for some uses the property of an expression E being an integer constant expression or not changes the semantics of the program. In particular, it determines if an array declaration T A[E] forms a VLA or not and thus if a seemingly benign expression sizeof(A) is determined at translation time or evaluated each time when it is met during the program execution. (追記ここまで)
Reuse p3 from the constraints section.
4a Constant expressions
(削除) shall (削除ここまで)(追記) do (追記ここまで) not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.116) (追記) Additionally, such a constant expression is, or evaluates to, a null pointer constant (6.3.3.3) or one of the categories that are described in this clause: (追記ここまで)
Use the list from p7
- a compound literal constant,
- a named constant,
- an integer constant expression,
- an arithmetic constant expression,
- (削除) a null pointer constant, (削除ここまで)
- an address constant, or
- an address constant for a complete object type plus or minus an integer constant expression.
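As an illustration outside the proposed wording, the exception for subexpressions that are not evaluated can be sketched as follows. The names probe, probe_size, buffer and buffer_size are hypothetical.

```c
/* Hypothetical function: declared only, never defined and never called. */
int probe(void);

/* The function call is the operand of sizeof and therefore not evaluated,
   so the whole expression remains an integer constant expression. */
enum { probe_size = sizeof(probe()) };

/* An integer constant expression is usable where a constant is required,
   here as the size of an array with static storage duration. */
static char buffer[probe_size];

/* Returns the number of bytes in buffer. */
unsigned long buffer_size(void) {
    return (unsigned long)sizeof(buffer);
}
```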
Continue with p6
5a A compound literal with storage-class specifier constexpr is a compound literal constant, as is a postfix expression that applies the . member access operator to a compound literal constant of structure or union type, even recursively. A compound literal constant is a constant expression with the type and value of the unnamed object.
6a An identifier that is:
- an enumeration constant,
- a predefined constant, or
- declared with storage-class specifier constexpr and has an object type,
is a named constant, as is a postfix expression that applies the . member access operator to a named constant of structure or union type, even recursively. For enumeration and predefined constants, their value and type are defined in the respective clauses; for constexpr objects, such a named constant is a constant expression with the type and value of the declared object.
7a An integer constant expression118) (削除) shall have (削除ここまで)(追記) has (追記ここまで) integer type and (削除) shall (削除ここまで)only (削除) have (削除ここまで)(追記) has (追記ここまで) operands that are integer literals, named and compound literal constants of integer type, character literals, sizeof or _Countof expressions which are integer constant expressions, alignof expressions, and floating, named, or compound literal constants of arithmetic type that are the immediate operands of casts. Cast operators in an integer constant expression (削除) shall (削除ここまで)only convert arithmetic types to integer types, except as part of an operand to the typeof operators, sizeof operator, _Countof operator, or alignof operator.
Paragraph p9 has already been used above.
(削除) p9 More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following: (削除ここまで)
(削除)
- a named constant,
- a compound literal constant,
- an arithmetic constant,
- a null pointer constant,
- an address constant, or
- an address constant for a complete object type plus or minus an integer constant expression.
(削除ここまで)
Continue with p10
8a An arithmetic constant expression (削除) shall have (削除ここまで)(追記) has (追記ここまで) arithmetic type and (削除) shall (削除ここまで)only (削除) have (削除ここまで)(追記) has (追記ここまで) operands that are floating literals, named or compound literal constants of arithmetic type and integer constant expressions. Cast operators in an arithmetic constant expression (削除) shall (削除ここまで)only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, sizeof operator, _Countof operator, or alignof operator.
9a An address constant is a null pointer,119) a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it (削除) shall be (削除ここまで)(追記) is (追記ここまで) created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly using an expression of array or function type.
10a The array-subscript [] and member-access -> operator, the address & and indirection * unary operators, and pointer casts can be used in the creation of an address constant, (削除) but the (削除ここまで)(追記) if no (追記ここまで) value of an object (削除) shall not be (削除ここまで)(追記) is (追記ここまで) accessed by use of these operators.120)
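As an illustration outside the proposed wording, the following C sketch forms address constants with the operators named above without accessing the value of any object. All identifiers (pair, table, item, third_slot and so on) are hypothetical.

```c
struct pair { int first; int second; };

static int table[8];
static struct pair item;

/* Each initializer is an address constant: the operators only compute
   an address, no stored value is read. */
static int *const third_slot = &table[3];           /* [] and unary & */
static int *const second_field = &(&item)->second;  /* -> and unary & */
static int *const first_slot = &*table;             /* unary * and &  */

/* By contrast, &table[table[0]] would read the value of table[0] and
   is therefore not an address constant. */

int *get_third_slot(void) { return third_slot; }
int *get_second_field(void) { return second_field; }
int *get_first_slot(void) { return first_slot; }
```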
11a A structure or union constant is a named constant or compound literal constant with structure or union type, respectively.
12a An implementation may accept other forms of constant expressions, called extended constant expressions. It is implementation-defined whether extended constant expressions are usable in the same manner as the constant expressions defined in this document, including whether or not extended integer constant expressions are considered to be integer constant expressions.121)
The next paragraph is merely a repetition and probably confuses more than it helps:
(削除) p15 Starting from a structure or union constant, the member-access . operator can be used to form a named constant or compound literal constant as described previously in this subclause. (削除ここまで)
Move p4 here and start the constraints section
(追記) Constraints (追記ここまで)
13a Each constant expression shall evaluate to a constant that is in the range of representable values for its type.
Move p5 here and start the semantics section
(追記) Semantics (追記ここまで)
14a
(削除) An expression that evaluates to a constant is required in several contexts. (削除ここまで)If a floating expression is evaluated in the translation environment, the arithmetic range and precision shall be at least as great as if the expression were being evaluated in the execution environment.117)
Continue with p16
15a If the member-access operator . accesses a member of a union constant, the accessed member shall be the same as the member that is initialized by the union constant’s initializer.
16a
(削除) The (削除ここまで)(追記) Otherwise, the (追記ここまで) semantic rules for the evaluation of a constant expression are the same as for nonconstant expressions.122)
Forward references: array declarators (6.7.7.3), initialization (6.7.11).
Remove the following four entries (50)-(53) (counting as in n3467 before Graz, already done in the Graz meeting)
(削除) (50) An expression that is required to be an integer constant expression does not have an integer type; has operands that are not integer literals, named constants, compound literal constants, enumeration constants, character literals, predefined constants, sizeof or _Lengthof expressions whose results are integer constant expressions, alignof expressions, or immediately-cast floating literals; or contains casts (outside operands to sizeof, _Lengthof and alignof operators) other than conversions of arithmetic types to integer types (6.6). (削除ここまで)
(削除) (51) A constant expression in an initializer is not, or does not evaluate to, one of the following: a named constant, a compound literal constant, an arithmetic constant expression, a null pointer constant, an address constant, or an address constant for a complete object type plus or minus an integer constant expression (6.6). (削除ここまで)
(削除) (52) An arithmetic constant expression does not have arithmetic type; has operands that are not integer literals, floating literals, named and compound literal constants of arithmetic type, character literals, predefined constants, sizeof or _Lengthof expressions whose results are integer constant expressions, or alignof expressions; or contains casts (outside operands to sizeof or alignof operators) other than conversions of arithmetic types to arithmetic types (6.6). (削除ここまで)
(削除) (53) The value of an object is accessed by an array-subscript [], member-access . or ->, address &, or indirection * operator or a pointer cast in creating an address constant (6.6). (削除ここまで)
Add one new entry (counting as in n3550, already done in the Graz meeting)
(追記) (45) The value of a floating expression as determined in the translation environment in the context of the evaluation of a constant expression is outside the arithmetic range or has less precision than if it were evaluated in the execution environment (6.6.1). (追記ここまで)
Add one new entry (new with this paper)
(追記)
(45′) The member-access operator . accesses a member of a union constant and the accessed member is not the same member that is initialized by the union constant’s initializer (6.6.1).
(追記ここまで)
There is a branch on WG14’s gitlab that reflects the proposed changes:
https://gitlab.gwdg.de/iso-c/draft/-/tree/CE
Thanks to Martin Uecker, Javier Múgica, Joseph Myers and Chris Bazley for review and discussions.