Commit 45fa404

Nadrieril authored and tshepang committed
Explain the important concepts of exhaustiveness checking
1 parent 5606d30 commit 45fa404

1 file changed, with 138 additions and 8 deletions

‎src/pat-exhaustive-checking.md

Lines changed: 138 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -7,7 +7,7 @@ are exhaustive.
77
## Pattern usefulness
88

99
The central question that usefulness checking answers is:
10-
"in this match expression, is that branch reachable?".
10+
"in this match expression, is that branch redundant?".
1111
More precisely, it boils down to computing whether,
1212
given a list of patterns we have already seen,
1313
a given new pattern might match any new value.
@@ -42,10 +42,8 @@ because a match expression can return a value).
4242

4343
## Where it happens
4444

45-
This check is done to any expression that desugars to a match expression in MIR.
46-
That includes actual `match` expressions,
47-
but also anything that looks like pattern matching,
48-
including `if let`, destructuring `let`, and similar expressions.
45+
This check is done anywhere you can write a pattern: `match` expressions, `if let`, `let else`,
46+
plain `let`, and function arguments.
4947

5048
```rust
5149
// `match`
@@ -80,9 +78,141 @@ fn foo(Foo { x, y }: Foo) {
8078

8179
## The algorithm
8280

83-
Exhaustiveness checking is implemented in [`check_match`].
84-
The core of the algorithm is in [`usefulness`].
81+
Exhaustiveness checking is run before MIR building in [`check_match`].
82+
It is implemented in the [`rustc_pattern_analysis`] crate,
83+
with the core of the algorithm in the [`usefulness`] module.
8584
That file contains a detailed description of the algorithm.
8685

86+
## Important concepts
87+
88+
### Constructors and fields
89+
90+
In the value `Pair(Some(0), true)`, `Pair` is called the constructor of the value, and `Some(0)` and
91+
`true` are its fields. Every matchable value can be decomposed in this way. Examples of
92+
constructors are: `Some`, `None`, `(,)` (the 2-tuple constructor), `Foo {..}` (the constructor for
93+
a struct `Foo`), and `2` (the constructor for the number `2`).
94+
95+
Each constructor takes a fixed number of fields; this is called its arity. `Pair` and `(,)` have
96+
arity 2, `Some` has arity 1, `None` and `42` have arity 0. Each type has a known set of
97+
constructors. Some types have many constructors (like `u64`) or even infinitely many (like `&str`
98+
and `&[T]`).
99+
100+
Patterns are similar: `Pair(Some(_), _)` has constructor `Pair` and two fields. The difference is
101+
that we get some extra pattern-only constructors, namely: the wildcard `_`, variable bindings,
102+
integer ranges like `0..=10`, and variable-length slices like `[_, .., _]`. We treat or-patterns
103+
separately.
104+
105+
Now to check if a value `v` matches a pattern `p`, we check if `v`'s constructor matches `p`'s
106+
constructor, then recursively compare their fields if necessary. A few representative examples:
107+
108+
- `matches!(v, _) := true`
109+
- `matches!((v0, v1), (p0, p1)) := matches!(v0, p0) && matches!(v1, p1)`
110+
- `matches!(Foo { a: v0, b: v1 }, Foo { a: p0, b: p1 }) := matches!(v0, p0) && matches!(v1, p1)`
111+
- `matches!(Ok(v0), Ok(p0)) := matches!(v0, p0)`
112+
- `matches!(Ok(v0), Err(p0)) := false` (incompatible variants)
113+
- `matches!(v, 1..=100) := matches!(v, 1) || ... || matches!(v, 100)`
114+
- `matches!([v0], [p0, .., p1]) := false` (incompatible lengths)
115+
- `matches!([v0, v1, v2], [p0, .., p1]) := matches!(v0, p0) && matches!(v2, p1)`
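The rules above can be sketched as a small recursive function. This is only a toy model to illustrate the idea: `Value`, `Pat` and `matches` are hypothetical names invented for this sketch, not the types used by `rustc_pattern_analysis`, and slices and or-patterns are left out.

```rust
// Hypothetical toy model of values and patterns (not rustc's real types).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(u32),
    // A named constructor applied to its fields, e.g. `Some(0)` or `(a, b)`.
    Ctor(&'static str, Vec<Value>),
}

#[derive(Debug, Clone)]
enum Pat {
    Wild,                         // `_`
    Int(u32),                     // a literal such as `2`
    Range(u32, u32),              // an inclusive range such as `1..=100`
    Ctor(&'static str, Vec<Pat>), // e.g. `Some(p0)` or `(p0, p1)`
}

// A value matches a pattern if the constructors are compatible and the
// fields match pairwise.
fn matches(v: &Value, p: &Pat) -> bool {
    match (v, p) {
        // The wildcard never inspects the value.
        (_, Pat::Wild) => true,
        (Value::Int(n), Pat::Int(m)) => n == m,
        // A range stands for the set of integer constructors it contains.
        (Value::Int(n), Pat::Range(lo, hi)) => lo <= n && n <= hi,
        (Value::Ctor(vc, vs), Pat::Ctor(pc, ps)) => {
            vc == pc
                && vs.len() == ps.len()
                && vs.iter().zip(ps).all(|(v, p)| matches(v, p))
        }
        // Incompatible constructors, e.g. `Ok(..)` against `Err(..)`.
        _ => false,
    }
}
```

In this model, `Ok(5)` matches `Ok(1..=100)` because the constructors agree and the single field matches, while it fails against `Err(_)` without looking at the fields at all.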
116+
117+
This concept is absolutely central to pattern analysis. The [`constructor`] module provides
118+
functions to extract, list and manipulate constructors. This is a useful enough concept that
119+
variations of it can be found in other places of the compiler, like in the MIR-lowering of a match
120+
expression and in some clippy lints.
121+
122+
### Constructor grouping and splitting
123+
124+
The pattern-only constructors (`_`, ranges and variable-length slices) each stand for a set of
125+
normal constructors, e.g. `_: Option<T>` stands for the set {`None`, `Some`} and `[_, .., _]` stands
126+
for the infinite set {`[,]`, `[,,]`, `[,,,]`, ...} of the slice constructors of arity >= 2.
127+
128+
In order to manage these constructors, we keep them as grouped as possible. For example:
129+
130+
```rust
131+
match (0, false) {
132+
(0 ..=100, true) => {}
133+
(50..=150, false) => {}
134+
(0 ..=200, _) => {}
135+
}
136+
```
137+
138+
In this example, all of `0`, `1`, ..., `49` match the same arms, and thus can be treated as a group.
139+
In fact, in this match, the only ranges we need to consider are: `0..50`, `50..=100`,
140+
`101..=150`, `151..=200` and `201..`. Similarly:
141+
142+
```rust
143+
enum Direction { North, South, East, West }
144+
# let wind = (Direction::North, 0u8);
145+
match wind {
146+
(Direction::North, 50..) => {}
147+
(_, _) => {}
148+
}
149+
```
150+
151+
Here we can treat all the non-`North` constructors as a group, giving us only two cases to handle:
152+
`North`, and everything else.
153+
154+
This is called "constructor splitting" and is crucial to having exhaustiveness run in reasonable
155+
time.
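For the integer-range example above, the grouping can be sketched as follows. This is a minimal illustration, assuming `u32` values and inclusive ranges. The function name `split_ranges` is made up for this sketch and does not correspond to the actual implementation in the compiler.

```rust
/// Splits the whole `u32` domain into maximal groups of values that are
/// covered by exactly the same input ranges.
/// Each range is given as an inclusive `(lo, hi)` pair.
fn split_ranges(ranges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    // Every point where the set of covering ranges can change starts a new group.
    let mut starts = vec![0u32];
    for &(lo, hi) in ranges {
        starts.push(lo);
        // The value just past the end of a range also starts a new group,
        // unless the range already reaches `u32::MAX`.
        if let Some(next) = hi.checked_add(1) {
            starts.push(next);
        }
    }
    starts.sort_unstable();
    starts.dedup();

    // Each group runs from one start to just before the next one.
    let mut groups = Vec::new();
    for (i, &start) in starts.iter().enumerate() {
        let end = match starts.get(i + 1) {
            Some(&next) => next - 1,
            None => u32::MAX,
        };
        groups.push((start, end));
    }
    groups
}
```

Calling `split_ranges(&[(0, 100), (50, 150), (0, 200)])` yields the five groups listed above: `0..=49`, `50..=100`, `101..=150`, `151..=200`, and `201..=u32::MAX`. Only one representative per group then needs to be considered, rather than every integer.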
156+
157+
### Usefulness vs reachability in the presence of empty types
158+
159+
This is likely the subtlest aspect of exhaustiveness. To be fully precise, a match doesn't operate
160+
on a value, it operates on a place. In certain unsafe circumstances, it is possible for a place to
161+
not contain valid data for its type. This has subtle consequences for empty types. Take the
162+
following:
163+
164+
```rust
165+
enum Void {}
166+
let x: u8 = 0;
167+
let ptr: *const Void = &x as *const u8 as *const Void;
168+
unsafe {
169+
match *ptr {
170+
_ => println!("Reachable!"),
171+
}
172+
}
173+
```
174+
175+
In this example, `ptr` is a valid pointer pointing to a place with invalid data. The `_` pattern
176+
does not look at the contents of the place `*ptr`, so this code is ok and the arm is taken. In other
177+
words, despite the place we are inspecting being of type `Void`, there is a reachable arm. If the
178+
arm had a binding however:
179+
180+
```rust
181+
# #[derive(Copy, Clone)]
182+
# enum Void {}
183+
# let x: u8 = 0;
184+
# let ptr: *const Void = &x as *const u8 as *const Void;
185+
# unsafe {
186+
match *ptr {
187+
_a => println!("Unreachable!"),
188+
}
189+
# }
190+
```
191+
192+
Here the binding loads the value of type `Void` from the `*ptr` place. In this example, this causes
193+
UB since the data is not valid. In the general case, this asserts validity of the data at `*ptr`.
194+
Either way, this arm will never be taken.
195+
196+
Finally, let's consider the empty match `match *ptr {}`. If we consider this exhaustive, then
197+
having invalid data at `*ptr` must be UB. In other words, the empty match is semantically
198+
equivalent to the `_a => ...` match. In the interest of explicitness, we prefer the case with an
199+
arm, hence we won't tell the user to remove the `_a` arm. In other words, the `_a` arm is
200+
unreachable yet not redundant. This is why we lint on redundant arms rather than unreachable
201+
arms, despite the fact that the lint says "unreachable".
202+
203+
These considerations only affect certain places, namely those that can contain non-valid data
204+
without UB. These are: pointer dereferences, reference dereferences, and union field accesses. We
205+
track during exhaustiveness checking whether a given place is known to contain valid data.
206+
207+
Having said all that, the current implementation of exhaustiveness checking does not follow the
208+
above considerations. On stable, empty types are for the most part treated as non-empty. The
209+
[`exhaustive_patterns`] feature errs on the other end: it allows omitting arms that could be
210+
reachable in unsafe situations. The [`never_patterns`] experimental feature aims to fix this and
211+
permit the correct behavior of empty types in patterns.
212+
87213
[`check_match`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/check_match/index.html
88-
[`usefulness`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/usefulness/index.html
214+
[`rustc_pattern_analysis`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/index.html
215+
[`usefulness`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/usefulness/index.html
216+
[`constructor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_pattern_analysis/constructor/index.html
217+
[`never_patterns`]: https://github.com/rust-lang/rust/issues/118155
218+
[`exhaustive_patterns`]: https://github.com/rust-lang/rust/issues/51085
