Polar factorization theorem
In optimal transport, a branch of mathematics, polar factorization of vector fields is a basic result due to Brenier (1987),[1] with antecedents by Knott and Smith (1984)[2] and Rachev (1985),[3] that generalizes several earlier results, including the polar decomposition of real matrices and the rearrangement of real-valued functions.
The theorem
Notation. Denote by {\displaystyle \xi _{\#}\mu } the image measure of {\displaystyle \mu } through the map {\displaystyle \xi }.
Definition: Measure preserving map. Let {\displaystyle (X,\mu )} and {\displaystyle (Y,\nu )} be probability spaces and {\displaystyle \sigma :X\rightarrow Y} a measurable map. Then {\displaystyle \sigma } is said to be measure preserving if and only if {\displaystyle \sigma _{\#}\mu =\nu }, where {\displaystyle \sigma _{\#}\mu } denotes the pushforward of {\displaystyle \mu } by {\displaystyle \sigma }. Spelled out: for every {\displaystyle \nu }-measurable subset {\displaystyle \Omega } of {\displaystyle Y}, the preimage {\displaystyle \sigma ^{-1}(\Omega )} is {\displaystyle \mu }-measurable, and {\displaystyle \mu (\sigma ^{-1}(\Omega ))=\nu (\Omega )}. The latter is equivalent to:
- {\displaystyle \int _{X}(f\circ \sigma )(x)\mu (dx)=\int _{X}(\sigma ^{*}f)(x)\mu (dx)=\int _{Y}f(y)(\sigma _{\#}\mu )(dy)=\int _{Y}f(y)\nu (dy)}
where {\displaystyle f} is {\displaystyle \nu }-integrable and {\displaystyle f\circ \sigma } is {\displaystyle \mu }-integrable.
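The integral characterization above can be checked numerically. The following is a minimal sketch, not part of the original statement: it assumes {\displaystyle X=Y=[0,1)} with the Lebesgue measure, uses the doubling map {\displaystyle x\mapsto 2x{\bmod {1}}} as an illustrative measure preserving map, and approximates both integrals with a midpoint rule. The names `doubling_map` and `integral_gap` are hypothetical helpers introduced for this example.

```python
import numpy as np

def doubling_map(x):
    """Illustrative map sigma(x) = 2x mod 1 on [0, 1); it preserves Lebesgue measure."""
    return (2.0 * x) % 1.0

def integral_gap(f, sigma, n=100_000):
    """Midpoint-rule estimate of |int f(sigma(x)) dx - int f(y) dy| over [0, 1)."""
    x = (np.arange(n) + 0.5) / n  # midpoints of n equal cells of [0, 1)
    return abs(np.mean(f(sigma(x))) - np.mean(f(x)))

# The gap is at the level of quadrature error for the measure preserving map ...
gap_preserving = integral_gap(np.cos, doubling_map)
# ... and clearly nonzero for x -> x^2, which does not preserve Lebesgue measure.
gap_not_preserving = integral_gap(np.cos, lambda x: x**2)

print(gap_preserving, gap_not_preserving)
```

A single test function cannot prove that a map is measure preserving; the check only illustrates the identity for one choice of {\displaystyle f}.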
Theorem. Consider a map {\displaystyle \xi :\Omega \rightarrow R^{d}} where {\displaystyle \Omega } is a convex subset of {\displaystyle R^{d}}, and {\displaystyle \mu } a measure on {\displaystyle \Omega } which is absolutely continuous with respect to the Lebesgue measure. Assume that {\displaystyle \xi _{\#}\mu } is also absolutely continuous. Then there is a convex function {\displaystyle \varphi :\Omega \rightarrow R} and a map {\displaystyle \sigma :\Omega \rightarrow \Omega } preserving {\displaystyle \mu } such that
{\displaystyle \xi =\left(\nabla \varphi \right)\circ \sigma }
In addition, {\displaystyle \nabla \varphi } and {\displaystyle \sigma } are uniquely defined almost everywhere.[1] [4]
Applications and connections
Dimension 1
When {\displaystyle d=1} and {\displaystyle \mu } is the Lebesgue measure (that is, the uniform distribution) over the unit interval {\displaystyle \left[0,1\right]}, the result specializes to Ryff's theorem.[5] In this case the polar decomposition boils down to
{\displaystyle \xi \left(t\right)=F_{X}^{-1}\left(\sigma \left(t\right)\right)}
where {\displaystyle F_{X}} is the cumulative distribution function of the random variable {\displaystyle \xi \left(U\right)} and {\displaystyle U} has a uniform distribution over {\displaystyle \left[0,1\right]}. {\displaystyle F_{X}} is assumed to be continuous, and {\displaystyle \sigma \left(t\right)=F_{X}\left(\xi \left(t\right)\right)} preserves the Lebesgue measure on {\displaystyle \left[0,1\right]}.
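A discrete analogue of this one-dimensional factorization can be sketched with a sort. The example below is an illustration under stated assumptions, not the construction used in the references: the uniform measure is replaced by an equally weighted grid on {\displaystyle \left[0,1\right]}, the map `xi` is an arbitrary non-monotone function chosen for the example, the sorted values play the role of the quantile function {\displaystyle F_{X}^{-1}}, and the ranks play the role of {\displaystyle \sigma }.

```python
import numpy as np

n = 1000
t = (np.arange(n) + 0.5) / n           # grid standing in for the uniform measure on [0, 1]
xi = np.sin(2 * np.pi * t) + 0.5 * t   # hypothetical non-monotone map, chosen for illustration

order = np.argsort(xi, kind="stable")  # order[k] = index of the k-th smallest value
ranks = np.empty(n, dtype=int)
ranks[order] = np.arange(n)            # ranks[i] = position of xi[i] in sorted order

quantile = xi[order]   # discrete stand-in for F_X^{-1} on the grid (nondecreasing)
sigma = t[ranks]       # rank-based stand-in for F_X(xi(t)); a permutation of the grid

# Discrete factorization: xi(t_i) = F_X^{-1}(sigma(t_i))
reconstructed = quantile[ranks]
print(np.array_equal(reconstructed, xi))
```

Because `sigma` only permutes the grid points, it preserves the equally weighted discrete measure, and the nondecreasing `quantile` is the one-dimensional counterpart of the gradient of a convex function.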
Polar decomposition of matrices
When {\displaystyle \xi } is a linear map and {\displaystyle \mu } is the standard Gaussian distribution, the result coincides with the polar decomposition of matrices. Assuming {\displaystyle \xi \left(x\right)=Mx} where {\displaystyle M} is an invertible {\displaystyle d\times d} matrix and considering {\displaystyle \mu } the {\displaystyle {\mathcal {N}}\left(0,I_{d}\right)} probability measure, the polar decomposition boils down to
{\displaystyle M=SO}
where {\displaystyle S} is a symmetric positive definite matrix, and {\displaystyle O} an orthogonal matrix. The connection with the polar factorization is {\displaystyle \varphi \left(x\right)=x^{\top }Sx/2} which is convex, and {\displaystyle \sigma \left(x\right)=Ox} which preserves the {\displaystyle {\mathcal {N}}\left(0,I_{d}\right)} measure.
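The factors {\displaystyle S} and {\displaystyle O} can be computed from a singular value decomposition. The following is a minimal sketch with NumPy, assuming a small fixed invertible matrix chosen only for illustration: writing {\displaystyle M=U\Sigma V^{\top }}, it takes {\displaystyle S=U\Sigma U^{\top }} and {\displaystyle O=UV^{\top }}, then checks that {\displaystyle \nabla \varphi (\sigma (x))=Mx}. The function name `grad_phi` is introduced for this example.

```python
import numpy as np

# Hypothetical invertible matrix (det = 2), chosen for illustration.
M = np.array([[2.0, 1.0],
              [0.0, 1.0]])

# SVD: M = U diag(s) Vt
U, s, Vt = np.linalg.svd(M)
S = U @ np.diag(s) @ U.T   # symmetric positive definite, since all s > 0
O = U @ Vt                 # orthogonal, so x -> O x preserves N(0, I_d)

def grad_phi(x):
    """Gradient of the convex function phi(x) = x^T S x / 2."""
    return S @ x

x = np.array([1.0, -2.0])
# Polar factorization of the linear map: M x = grad_phi(sigma(x)) with sigma(x) = O x
print(np.allclose(grad_phi(O @ x), M @ x))
```

The map {\displaystyle x\mapsto Ox} preserves {\displaystyle {\mathcal {N}}\left(0,I_{d}\right)} because {\displaystyle OX} has covariance {\displaystyle OO^{\top }=I_{d}} when {\displaystyle X} is standard Gaussian.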
Helmholtz decomposition
The result also makes it possible to recover the Helmholtz decomposition. A smooth vector field {\displaystyle x\mapsto V\left(x\right)} can be written in a unique way as
{\displaystyle V=w+\nabla p}
where {\displaystyle p} is a smooth real function defined on {\displaystyle \Omega }, unique up to an additive constant, and {\displaystyle w} is a smooth divergence free vector field, parallel to the boundary of {\displaystyle \Omega }.
The connection can be seen by assuming {\displaystyle \mu } is the Lebesgue measure on a compact set {\displaystyle \Omega \subset R^{n}} and by writing {\displaystyle \xi } as a perturbation of the identity map
{\displaystyle \xi _{\epsilon }(x)=x+\epsilon V(x)}
where {\displaystyle \epsilon } is small. The polar decomposition of {\displaystyle \xi _{\epsilon }} is given by {\displaystyle \xi _{\epsilon }=(\nabla \varphi _{\epsilon })\circ \sigma _{\epsilon }}. Then, for any test function {\displaystyle f:R^{n}\rightarrow R} the following holds:
{\displaystyle \int _{\Omega }f(x+\epsilon V(x))dx=\int _{\Omega }f((\nabla \varphi _{\epsilon })\circ \sigma _{\epsilon }\left(x\right))dx=\int _{\Omega }f(\nabla \varphi _{\epsilon }\left(x\right))dx}
where the second equality uses the fact that {\displaystyle \sigma _{\epsilon }} preserves the Lebesgue measure.
In fact, as {\displaystyle \textstyle \varphi _{0}(x)={\frac {1}{2}}\Vert x\Vert ^{2}}, one can expand {\displaystyle \textstyle \varphi _{\epsilon }(x)={\frac {1}{2}}\Vert x\Vert ^{2}+\epsilon p(x)+O(\epsilon ^{2})}, and therefore {\displaystyle \textstyle \nabla \varphi _{\epsilon }\left(x\right)=x+\epsilon \nabla p(x)+O(\epsilon ^{2})}. As a result, comparing the terms of order {\displaystyle \epsilon } in the identity above gives {\displaystyle \textstyle \int _{\Omega }\left(V(x)-\nabla p(x)\right)\cdot \nabla f(x)\,dx=0} for any smooth function {\displaystyle f}, which implies that {\displaystyle w\left(x\right)=V(x)-\nabla p(x)} is divergence-free.[1] [6]
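The first-order comparison can be written out as follows. This is a sketch of the formal computation, assuming that {\displaystyle f} is smooth enough for the Taylor remainders to be {\displaystyle O(\epsilon ^{2})} uniformly on {\displaystyle \Omega }; it expands the leftmost and rightmost integrals of the identity above.

```latex
\begin{aligned}
\int_{\Omega} f\bigl(x+\epsilon V(x)\bigr)\,dx
  &= \int_{\Omega} f(x)\,dx
   + \epsilon \int_{\Omega} V(x)\cdot\nabla f(x)\,dx + O(\epsilon^{2}),\\
\int_{\Omega} f\bigl(\nabla\varphi_{\epsilon}(x)\bigr)\,dx
  &= \int_{\Omega} f(x)\,dx
   + \epsilon \int_{\Omega} \nabla p(x)\cdot\nabla f(x)\,dx + O(\epsilon^{2}).
\end{aligned}
```

Since the two left-hand sides are equal for every small {\displaystyle \epsilon }, subtracting the expansions and dividing by {\displaystyle \epsilon } leaves {\displaystyle \textstyle \int _{\Omega }\left(V(x)-\nabla p(x)\right)\cdot \nabla f(x)\,dx=0}, which is the weak formulation of {\displaystyle w=V-\nabla p} being divergence-free and parallel to the boundary of {\displaystyle \Omega }.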
See also
- Polar decomposition – Representation of invertible matrices as unitary operator multiplying a Hermitian operator
References
- Brenier, Yann (1991). "Polar factorization and monotone rearrangement of vector‐valued functions" (PDF). Communications on Pure and Applied Mathematics. 44 (4): 375–417. doi:10.1002/cpa.3160440402. Retrieved 16 April 2021.
- Knott, M.; Smith, C. S. (1984). "On the optimal mapping of distributions". Journal of Optimization Theory and Applications. 43: 39–49. doi:10.1007/BF00934745. S2CID 120208956. Retrieved 16 April 2021.
- Rachev, Svetlozar T. (1985). "The Monge–Kantorovich mass transference problem and its stochastic applications" (PDF). Theory of Probability & Its Applications. 29 (4): 647–676. doi:10.1137/1129093. Retrieved 16 April 2021.
- Santambrogio, Filippo (2015). Optimal transport for applied mathematicians. New York: Birkhäuser. CiteSeerX 10.1.1.726.35.
- Ryff, John V. (1965). "Orbits of L1-Functions Under Doubly Stochastic Transformation". Transactions of the American Mathematical Society. 117: 92–100. doi:10.2307/1994198. JSTOR 1994198. Retrieved 16 April 2021.
- Villani, Cédric (2003). Topics in optimal transportation. American Mathematical Society.