Disintegration theorem

Theorem in measure theory

In mathematics, the disintegration theorem is a result in measure theory and probability theory. It rigorously defines the idea of a non-trivial "restriction" of a measure to a measure zero subset of the measure space in question. It is related to the existence of conditional probability measures. In a sense, "disintegration" is the opposite process to the construction of a product measure.

Motivation

Consider the unit square $S=[0,1]\times [0,1]$ in the Euclidean plane $\mathbb {R} ^{2}$. Consider the probability measure $\mu$ defined on $S$ by the restriction of two-dimensional Lebesgue measure $\lambda ^{2}$ to $S$. That is, the probability of an event $E\subseteq S$ is simply the area of $E$. We assume $E$ is a measurable subset of $S$.

Consider a one-dimensional subset of $S$ such as the line segment $L_{x}=\{x\}\times [0,1]$. The segment $L_{x}$ has $\mu$-measure zero, and since the Lebesgue measure space is a complete measure space, every subset of $L_{x}$ is measurable and is a $\mu$-null set: $E\subseteq L_{x}\implies \mu (E)=0$.

While true, this is somewhat unsatisfying. It would be nice to say that $\mu$ "restricted to" $L_{x}$ is the one-dimensional Lebesgue measure $\lambda ^{1}$, rather than the zero measure. The probability of a "two-dimensional" event $E$ could then be obtained as an integral of the one-dimensional probabilities of the vertical "slices" $E\cap L_{x}$: more formally, if $\mu _{x}$ denotes one-dimensional Lebesgue measure on $L_{x}$, then $\mu (E)=\int _{[0,1]}\mu _{x}(E\cap L_{x})\,\mathrm {d} x$ for any "nice" $E\subseteq S$. The disintegration theorem makes this argument rigorous in the context of measures on metric spaces.
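The slicing identity can be checked numerically on a concrete set. The sketch below is only an illustration under stated assumptions: it takes the hypothetical event $E=\{(x,y)\in S : y\leq x^{2}\}$, whose vertical slice $E\cap L_{x}$ has length $x^{2}$, and compares a midpoint-rule integral of the slice lengths with a direct grid estimate of the area of $E$. The function names are invented for this example.

```python
def slice_length(x: float) -> float:
    """Length of the vertical slice E ∩ L_x for the assumed set E = {(x, y) : y <= x**2}."""
    return x ** 2


def area_by_slices(n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of slice_length over [0, 1]."""
    h = 1.0 / n
    return sum(slice_length((i + 0.5) * h) for i in range(n)) * h


def area_by_grid(n: int = 400) -> float:
    """Fraction of the midpoints of an n-by-n grid on S that fall in E (direct area estimate)."""
    h = 1.0 / n
    hits = 0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if y <= x ** 2:
                hits += 1
    return hits / (n * n)


if __name__ == "__main__":
    # Both numbers should be close to the exact area 1/3.
    print(area_by_slices(), area_by_grid())
```

Both estimates approximate the exact value $\int _{0}^{1}x^{2}\,\mathrm {d} x=1/3$; the agreement illustrates the identity for this one set and does not, of course, prove it in general.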

Statement of the theorem

(Hereafter, ${\mathcal {P}}(X)$ will denote the collection of Borel probability measures on a topological space $(X,T)$.) The assumptions of the theorem are as follows:

Let $Y$ and $X$ be two Radon spaces (i.e. topological spaces on which every Borel probability measure is inner regular, e.g. Polish spaces, that is, separable completely metrizable spaces; in particular, every probability measure on such a space is outright a Radon measure).
Let $\mu \in {\mathcal {P}}(Y)$.
Let $\pi :Y\to X$ be a Borel-measurable function. Here one should think of $\pi$ as a function to "disintegrate" $Y$, in the sense of partitioning $Y$ into the fibers $\{\pi ^{-1}(x)\ |\ x\in X\}$. For example, for the motivating example above, one can define $\pi ((a,b))=a$ for $(a,b)\in [0,1]\times [0,1]$, which gives $\pi ^{-1}(a)=\{a\}\times [0,1]$, a slice we want to capture.
Let $\nu \in {\mathcal {P}}(X)$ be the pushforward measure $\nu =\pi _{*}(\mu )=\mu \circ \pi ^{-1}$. This measure provides the distribution of $x$ (which corresponds to the events $\pi ^{-1}(x)$).

The conclusion of the theorem: There exists a $\nu$-almost everywhere uniquely determined family of probability measures $\{\mu _{x}\}_{x\in X}\subseteq {\mathcal {P}}(Y)$, which provides a "disintegration" of $\mu$ into $\{\mu _{x}\}_{x\in X}$, such that:

the function $x\mapsto \mu _{x}$ is Borel measurable, in the sense that $x\mapsto \mu _{x}(B)$ is a Borel-measurable function for each Borel-measurable set $B\subseteq Y$;
$\mu _{x}$ "lives on" the fiber $\pi ^{-1}(x)$: for $\nu$-almost all $x\in X$, $\mu _{x}\left(Y\setminus \pi ^{-1}(x)\right)=0,$ and so $\mu _{x}(E)=\mu _{x}(E\cap \pi ^{-1}(x))$;
for every Borel-measurable function $f:Y\to [0,\infty ]$, $\int _{Y}f(y)\,\mathrm {d} \mu (y)=\int _{X}\int _{\pi ^{-1}(x)}f(y)\,\mathrm {d} \mu _{x}(y)\,\mathrm {d} \nu (x).$ In particular, for any event $E\subseteq Y$, taking $f$ to be the indicator function of $E$,[1] $\mu (E)=\int _{X}\mu _{x}(E)\,\mathrm {d} \nu (x).$
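
As a worked check, the motivating example can be fitted into this statement. The identifications below are a sketch for that one case, not part of the theorem. Take $Y=[0,1]\times [0,1]$, $X=[0,1]$, $\mu =\lambda ^{2}|_{Y}$ and $\pi ((a,b))=a$. The pushforward is then $\nu (A)=\mu (\pi ^{-1}(A))=\lambda ^{2}(A\times [0,1])=\lambda ^{1}(A)$, so $\nu$ is one-dimensional Lebesgue measure on $[0,1]$. One version of the disintegration is $\mu _{x}=\delta _{x}\otimes \lambda ^{1}$, that is,

$\mu _{x}(E)=\lambda ^{1}(\{y\in [0,1] : (x,y)\in E\}).$

Each $\mu _{x}$ is a probability measure concentrated on the fiber $\pi ^{-1}(x)=\{x\}\times [0,1]$, the map $x\mapsto \mu _{x}(B)$ is Borel measurable by Tonelli's theorem, and the third property reduces to Tonelli's theorem itself:

$\int _{Y}f\,\mathrm {d} \mu =\int _{0}^{1}\left(\int _{0}^{1}f(x,y)\,\mathrm {d} y\right)\mathrm {d} x=\int _{X}\int _{\pi ^{-1}(x)}f\,\mathrm {d} \mu _{x}\,\mathrm {d} \nu (x).$

By the uniqueness clause, any other disintegration of $\mu$ along $\pi$ agrees with this family for $\lambda ^{1}$-almost every $x$.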

Applications

Product spaces

The original example was a special case of the problem of product spaces, to which the disintegration theorem applies.

When $Y$ is written as a Cartesian product $Y=X_{1}\times X_{2}$ and $\pi _{i}:Y\to X_{i}$ is the natural projection, then each fibre $\pi _{1}^{-1}(x_{1})$ can be canonically identified with $X_{2}$ and there exists a Borel family of probability measures $\{\mu _{x_{1}}\}_{x_{1}\in X_{1}}$ in ${\mathcal {P}}(X_{2})$ (which is $(\pi _{1})_{*}(\mu )$-almost everywhere uniquely determined) such that $\mu =\int _{X_{1}}\mu _{x_{1}}\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)=\int _{X_{1}}\mu _{x_{1}}\,\mathrm {d} (\pi _{1})_{*}(\mu )(x_{1}).$ Writing $\mu (\cdot \mid x_{1})$ for $\mu _{x_{1}}$, this means in particular that $\int _{X_{1}\times X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{1},\mathrm {d} x_{2})=\int _{X_{1}}\left(\int _{X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{2}\mid x_{1})\right)\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)$ for Borel-measurable $f:X_{1}\times X_{2}\to [0,\infty ]$, and $\mu (A\times B)=\int _{A}\mu \left(B\mid x_{1}\right)\,\mu \left(\pi _{1}^{-1}(\mathrm {d} x_{1})\right)$ for Borel sets $A\subseteq X_{1}$ and $B\subseteq X_{2}$.

The relation to conditional expectation is given by the identities $\operatorname {E} (f\mid \pi _{1})(x_{1})=\int _{X_{2}}f(x_{1},x_{2})\,\mu (\mathrm {d} x_{2}\mid x_{1}),$ $\mu (A\times B\mid \pi _{1})(x_{1})=1_{A}(x_{1})\cdot \mu (B\mid x_{1}).$
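
On a finite product space the integrals become sums, and the product-space formulas can be verified exactly. The following sketch assumes a hypothetical joint distribution on $X_{1}=\{0,1\}$ and $X_{2}=\{a,b,c\}$; the table of probabilities and all function names are made up for illustration. It computes the pushforward $(\pi _{1})_{*}(\mu )$, the conditional measures $\mu (\cdot \mid x_{1})$, and then reassembles $\mu (A\times B)$ from them.

```python
from fractions import Fraction as F

# Hypothetical joint distribution mu on X1 x X2, X1 = {0, 1}, X2 = {"a", "b", "c"}.
mu = {
    (0, "a"): F(1, 8), (0, "b"): F(1, 8), (0, "c"): F(1, 4),
    (1, "a"): F(1, 4), (1, "b"): F(1, 4), (1, "c"): F(0),
}


def pushforward(mu):
    """Marginal nu = (pi_1)_* mu on X1."""
    nu = {}
    for (x1, _), p in mu.items():
        nu[x1] = nu.get(x1, F(0)) + p
    return nu


def disintegrate(mu):
    """Return nu and the conditional measures mu(. | x1) for each x1 with nu(x1) > 0."""
    nu = pushforward(mu)
    cond = {x1: {} for x1, mass in nu.items() if mass > 0}
    for (x1, x2), p in mu.items():
        if x1 in cond:
            cond[x1][x2] = p / nu[x1]
    return nu, cond


def reassemble(nu, cond, A, B):
    """mu(A x B) as the sum over x1 in A of mu(B | x1) * nu(x1)."""
    total = F(0)
    for x1 in A:
        if x1 in cond:  # points with nu(x1) = 0 contribute nothing
            total += nu[x1] * sum(p for x2, p in cond[x1].items() if x2 in B)
    return total


nu, cond = disintegrate(mu)
A, B = {1}, {"a", "c"}
direct = sum(p for (x1, x2), p in mu.items() if x1 in A and x2 in B)
```

Points $x_{1}$ with zero marginal mass are skipped, which mirrors the fact that the family $\{\mu _{x_{1}}\}$ is only determined $(\pi _{1})_{*}(\mu )$-almost everywhere. Exact rational arithmetic is used so that the two sides of the identity can be compared with equality rather than a tolerance.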

Vector calculus

The disintegration theorem can also be seen as justifying the use of a "restricted" measure in vector calculus. For instance, in Stokes' theorem as applied to a vector field flowing through a compact surface $\Sigma \subset \mathbb {R} ^{3}$, it is implicit that the "correct" measure on $\Sigma$ is the disintegration of three-dimensional Lebesgue measure $\lambda ^{3}$ on $\Sigma$, and that the disintegration of this measure on $\partial \Sigma$ is the same as the disintegration of $\lambda ^{3}$ on $\partial \Sigma$.[2]

Conditional distributions

The disintegration theorem can be applied to give a rigorous treatment of conditional probability distributions in statistics, while avoiding purely abstract formulations of conditional probability.[3] The theorem is related to the Borel–Kolmogorov paradox, for example.

References

[1] Dellacherie, C.; Meyer, P.-A. (1978). Probabilities and Potential. North-Holland Mathematics Studies. Amsterdam: North-Holland. ISBN 0-7204-0701-X.
[2] Ambrosio, L.; Gigli, N.; Savaré, G. (2005). Gradient Flows in Metric Spaces and in the Space of Probability Measures. ETH Zürich, Birkhäuser Verlag, Basel. ISBN 978-3-7643-2428-5.
[3] Chang, J.T.; Pollard, D. (1997). "Conditioning as disintegration". Statistica Neerlandica. 51 (3): 287. CiteSeerX 10.1.1.55.7544. doi:10.1111/1467-9574.00056. S2CID 16749932.
