
Min-max theorem

From Wikipedia, the free encyclopedia
Not to be confused with Minimax theorem.
"Variational theorem" redirects here; not to be confused with variational principle.

In linear algebra and functional analysis, the min-max theorem, or variational theorem, or Courant–Fischer–Weyl min-max principle, is a result that gives a variational characterization of eigenvalues of compact Hermitian operators on Hilbert spaces. It can be viewed as the starting point of many results of similar nature.

This article first discusses the finite-dimensional case and its applications before considering compact operators on infinite-dimensional Hilbert spaces. We will see that for compact operators, the proof of the main theorem uses essentially the same idea as in the finite-dimensional argument.

When the operator is non-Hermitian, the theorem instead characterizes the associated singular values. The min-max theorem can be extended to self-adjoint operators that are bounded below.

Matrices


Let A be an n × n Hermitian matrix. As with many other variational results on eigenvalues, one considers the Rayleigh–Ritz quotient $R_A : \mathbb{C}^n \setminus \{0\} \to \mathbb{R}$ defined by

$$R_A(x) = \frac{(Ax, x)}{(x, x)}$$

where $(\cdot, \cdot)$ denotes the Euclidean inner product on $\mathbb{C}^n$. Equivalently, the Rayleigh–Ritz quotient can be replaced by

$$f(x) = (Ax, x), \qquad \|x\| = 1.$$

The Rayleigh quotient of an eigenvector $v$ is its associated eigenvalue $\lambda$, because $R_A(v) = (\lambda v, v)/(v, v) = \lambda$. For a Hermitian matrix $A$, the range of the continuous functions $R_A(x)$ and $f(x)$ is a compact interval $[a, b]$ of the real line. The maximum $b$ and the minimum $a$ are the largest and smallest eigenvalues of $A$, respectively. The min-max theorem is a refinement of this fact.
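As a quick numerical illustration (a minimal NumPy sketch; random sampling, not a proof), every Rayleigh quotient of a Hermitian matrix lands between the smallest and largest eigenvalues:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
    A = (X + X.conj().T) / 2                     # random Hermitian matrix

    def rayleigh(A, x):
        # (Ax, x) / (x, x); real-valued when A is Hermitian
        return np.real(np.vdot(x, A @ x) / np.vdot(x, x))

    w = np.linalg.eigvalsh(A)                    # eigenvalues in ascending order
    samples = [rayleigh(A, rng.standard_normal(5) + 1j * rng.standard_normal(5))
               for _ in range(10_000)]
    assert w[0] - 1e-12 <= min(samples) and max(samples) <= w[-1] + 1e-12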

Min-max theorem


Let $A$ be Hermitian on an inner product space $V$ of dimension $n$, with spectrum ordered in descending order $\lambda_1 \geq \cdots \geq \lambda_n$.

Let $v_1, \ldots, v_n$ be the corresponding unit-length, mutually orthogonal eigenvectors.

Reverse the spectrum ordering, so that $\xi_1 = \lambda_n, \ldots, \xi_n = \lambda_1$.

(Poincaré's inequality) Let $M$ be a subspace of $V$ with dimension $k$. Then there exist unit vectors $x, y \in M$ such that $\langle x, Ax \rangle \leq \lambda_k$ and $\langle y, Ay \rangle \geq \xi_k$.

Proof

Part 2 is a corollary, using $-A$.

Since $N := \operatorname{span}(v_k, \ldots, v_n)$ has dimension $n - k + 1$ while $M$ has dimension $k$, we have $\dim M + \dim N = n + 1 > n$, so $M \cap N$ must contain at least a single line.

Take a unit vector $x \in M \cap N$; this is the vector we need.

Since $x \in N$, we can write $x = \sum_{i=k}^{n} a_i v_i$. Since $\sum_{i=k}^{n} |a_i|^2 = 1$, we find $\langle x, Ax \rangle = \sum_{i=k}^{n} |a_i|^2 \lambda_i \leq \lambda_k$.

min-max theorem
$$\lambda_k = \max_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = k}} \; \min_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle = \min_{\substack{\mathcal{M} \subset V \\ \dim(\mathcal{M}) = n-k+1}} \; \max_{\substack{x \in \mathcal{M} \\ \|x\| = 1}} \langle x, Ax \rangle.$$

Proof

Part 2 is a corollary of part 1, by using $-A$.

By Poincaré's inequality, $\lambda_k$ is an upper bound for the first (max–min) expression: every $k$-dimensional $\mathcal{M}$ contains a unit vector $x$ with $\langle x, Ax \rangle \leq \lambda_k$.

By setting $\mathcal{M} = \operatorname{span}(v_1, \ldots, v_k)$, the upper bound is achieved.
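The following NumPy sketch (an illustration, not a proof) confirms numerically that the span of the top $k$ eigenvectors achieves the max–min value $\lambda_k$:

    import numpy as np

    rng = np.random.default_rng(1)
    n, k = 6, 3
    X = rng.standard_normal((n, n))
    A = (X + X.T) / 2                      # real symmetric (special case of Hermitian)
    w, V = np.linalg.eigh(A)               # ascending eigenvalues, orthonormal columns
    lam, V = w[::-1], V[:, ::-1]           # reorder so lam[0] = λ1 ≥ ... ≥ λn

    B = V[:, :k]                           # orthonormal basis of M = span(v1, ..., vk)
    # On M, the minimum of the Rayleigh quotient is the smallest eigenvalue
    # of the compression B^T A B, which here equals λk.
    assert np.isclose(np.linalg.eigvalsh(B.T @ A @ B)[0], lam[k - 1])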

Define the partial trace $\operatorname{tr}_V(A)$ to be the trace of the compression of $A$ to $V$. It is equal to $\sum_i v_i^* A v_i$ for any orthonormal basis $(v_i)$ of $V$.
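Concretely (an illustrative sketch, assuming NumPy), if the columns of a matrix $B$ form an orthonormal basis of the subspace, the partial trace is the trace of the compression $B^* A B$:

    import numpy as np

    def partial_trace(A, B):
        # B: matrix whose orthonormal columns span the subspace V
        return np.trace(B.conj().T @ A @ B).real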

Wielandt minimax formula ([1]: 44) Let $1 \leq i_1 < \cdots < i_k \leq n$ be integers. Define a partial flag to be a nested collection $V_1 \subset \cdots \subset V_k$ of subspaces of $\mathbb{C}^n$ such that $\dim(V_j) = i_j$ for all $1 \leq j \leq k$.

Define the associated Schubert variety $X(V_1, \ldots, V_k)$ to be the collection of all $k$-dimensional subspaces $W$ such that $\dim(W \cap V_j) \geq j$ for each $j$.

$$\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) = \sup_{V_1, \ldots, V_k} \; \inf_{W \in X(V_1, \ldots, V_k)} \operatorname{tr}_W(A)$$

Proof

The $\leq$ case.

Let $V_j = \operatorname{span}(e_1, \ldots, e_{i_j})$, where $e_1, \ldots, e_n$ are unit eigenvectors of $A$ corresponding to $\lambda_1, \ldots, \lambda_n$. Taking any $W \in X(V_1, \ldots, V_k)$, it remains to show that
$$\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) \leq \operatorname{tr}_W(A).$$

To show this, we construct an orthonormal set of vectors $v_1, \ldots, v_k$ such that $v_j \in V_j \cap W$. These form an orthonormal basis of $W$, so
$$\operatorname{tr}_W(A) = \sum_j \langle v_j, A v_j \rangle \geq \sum_j \lambda_{i_j}(A),$$
where the inequality holds because each $v_j$ lies in $V_j = \operatorname{span}(e_1, \ldots, e_{i_j})$, on which the Rayleigh quotient is at least $\lambda_{i_j}(A)$.

Since $\dim(V_1 \cap W) \geq 1$, we pick any unit $v_1 \in V_1 \cap W$. Next, since $\dim(V_2 \cap W) \geq 2$, we pick any unit $v_2 \in V_2 \cap W$ that is perpendicular to $v_1$, and so on.

The $\geq$ case.

For any such sequence of subspaces $V_j$, we must find some $W \in X(V_1, \ldots, V_k)$ such that
$$\lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) \geq \operatorname{tr}_W(A).$$

We prove this by induction on $n$.

The $n = 1$ case is the Courant–Fischer theorem. Assume now $n \geq 2$.

If $i_1 \geq 2$, then we can apply induction. Let $E = \operatorname{span}(e_{i_1}, \ldots, e_n)$. We construct a partial flag within $E$ from the intersections of $E$ with $V_1, \ldots, V_k$.

We begin by picking an $(i_k - (i_1 - 1))$-dimensional subspace $W_k' \subset E \cap V_k$, which exists by counting dimensions. It has codimension $i_1 - 1$ within $V_k$.

Then we go down by one space, picking an $(i_{k-1} - (i_1 - 1))$-dimensional subspace $W_{k-1}' \subset W_k' \cap V_{k-1}$, which still exists by counting dimensions, and so on. Since $\dim(E) \leq n - 1$, the induction hypothesis applies: there exists some $W \in X(W_1', \ldots, W_k')$ such that
$$\lambda_{i_1 - (i_1 - 1)}(A|_E) + \cdots + \lambda_{i_k - (i_1 - 1)}(A|_E) \geq \operatorname{tr}_W(A).$$
Here $\lambda_{i_j - (i_1 - 1)}(A|_E)$ is the $(i_j - (i_1 - 1))$-th eigenvalue of $A$ orthogonally projected down to $E$. By the Cauchy interlacing theorem, $\lambda_{i_j - (i_1 - 1)}(A|_E) \leq \lambda_{i_j}(A)$. Since $X(W_1', \ldots, W_k') \subset X(V_1, \ldots, V_k)$, we are done.

If $i_1 = 1$, then we perform a similar construction. Let $E = \operatorname{span}(e_2, \ldots, e_n)$. If $V_k \subset E$, then we can induct directly. Otherwise, we construct a partial flag $W_2', \ldots, W_k'$ within $E$ as before. By induction, there exists some $W' \in X(W_2', \ldots, W_k') \subset X(V_2, \ldots, V_k)$ such that
$$\lambda_{i_2 - 1}(A|_E) + \cdots + \lambda_{i_k - 1}(A|_E) \geq \operatorname{tr}_{W'}(A),$$
and thus, by Cauchy interlacing,
$$\lambda_{i_2}(A) + \cdots + \lambda_{i_k}(A) \geq \operatorname{tr}_{W'}(A).$$
It remains to find a unit vector $v$ such that $W' \oplus \operatorname{span}(v) \in X(V_1, \ldots, V_k)$; since $\langle u, Au \rangle \leq \lambda_1(A) = \lambda_{i_1}(A)$ for every unit vector $u$, adjoining one more direction increases the partial trace by at most $\lambda_{i_1}(A)$, preserving the required bound.

If $V_1 \not\subset W'$, then any $v \in V_1 \setminus W'$ would work. Otherwise, if $V_2 \not\subset W'$, then any $v \in V_2 \setminus W'$ would work, and so on. If none of these work, then $V_k \subset W' \subset E$, a contradiction.

This has some corollaries:[1] : 44 

Extremal partial trace
$$\lambda_1(A) + \cdots + \lambda_k(A) = \sup_{\dim(V) = k} \operatorname{tr}_V(A)$$

$$\xi_1(A) + \cdots + \xi_k(A) = \inf_{\dim(V) = k} \operatorname{tr}_V(A)$$
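A short numerical check (a NumPy sketch, sampling random subspaces rather than proving anything): the partial trace over any $k$-dimensional subspace is dominated by the sum of the top $k$ eigenvalues, with equality on the top eigenspace.

    import numpy as np

    rng = np.random.default_rng(2)
    n, k = 7, 3
    X = rng.standard_normal((n, n))
    A = (X + X.T) / 2
    w, V = np.linalg.eigh(A)                   # ascending eigenvalues
    top_k = w[-k:].sum()

    for _ in range(1000):
        Q, _ = np.linalg.qr(rng.standard_normal((n, k)))   # random k-dim subspace
        assert np.trace(Q.T @ A @ Q) <= top_k + 1e-10

    B = V[:, -k:]                              # span of the top-k eigenvectors
    assert np.isclose(np.trace(B.T @ A @ B), top_k)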

Corollary The map $A \mapsto \lambda_1(A) + \cdots + \lambda_k(A)$ is a convex function (a supremum of the linear maps $A \mapsto \operatorname{tr}_V(A)$), and $A \mapsto \xi_1(A) + \cdots + \xi_k(A)$ is concave.

(Schur–Horn inequality)
$$\xi_1(A) + \cdots + \xi_k(A) \leq a_{i_1 i_1} + \cdots + a_{i_k i_k} \leq \lambda_1(A) + \cdots + \lambda_k(A)$$
for any subset of indices $1 \leq i_1 < \cdots < i_k \leq n$.

Equivalently, this states that the diagonal vector of $A$ is majorized by its eigenspectrum.
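The majorization statement is easy to test numerically (an illustrative NumPy sketch):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 6
    X = rng.standard_normal((n, n))
    A = (X + X.T) / 2
    d = np.sort(np.diag(A))[::-1]                  # diagonal, descending
    lam = np.sort(np.linalg.eigvalsh(A))[::-1]     # spectrum, descending

    # Majorization: partial sums of the sorted diagonal are dominated,
    # and the full sums agree (both equal tr A).
    for k in range(1, n + 1):
        assert d[:k].sum() <= lam[:k].sum() + 1e-10
    assert np.isclose(d.sum(), lam.sum())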

Schatten-norm Hölder inequality Given Hermitian $A, B$ and a Hölder pair $1/p + 1/q = 1$,
$$|\operatorname{tr}(AB)| \leq \|A\|_{S^p} \|B\|_{S^q}$$

Proof

WLOG, $B$ is diagonalized; then we need to show $\left| \sum_i B_{ii} A_{ii} \right| \leq \|A\|_{S^p} \|(B_{ii})\|_{\ell^q}$.

By the standard Hölder inequality, it suffices to show $\|(A_{ii})\|_{\ell^p} \leq \|A\|_{S^p}$.

By the Schur–Horn inequality, the diagonal of $A$ is majorized by the eigenspectrum of $A$, and since the map $f(x_1, \ldots, x_n) = \|x\|_p$ is symmetric and convex, it is Schur-convex; this yields $\|(A_{ii})\|_{\ell^p} \leq \|A\|_{S^p}$.
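A quick numerical spot check (a NumPy sketch; for Hermitian matrices the Schatten norm is the $\ell^p$ norm of the eigenvalues):

    import numpy as np

    rng = np.random.default_rng(4)
    n, p = 5, 3.0
    q = p / (p - 1)                            # Hölder conjugate: 1/p + 1/q = 1
    X, Y = rng.standard_normal((2, n, n))
    A, B = (X + X.T) / 2, (Y + Y.T) / 2

    def schatten(M, r):
        # Schatten r-norm of a Hermitian matrix: l^r norm of its eigenvalues
        return np.sum(np.abs(np.linalg.eigvalsh(M)) ** r) ** (1 / r)

    assert abs(np.trace(A @ B)) <= schatten(A, p) * schatten(B, q) + 1e-10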

Counterexample in the non-Hermitian case


Let N be the nilpotent matrix

$$N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Define the Rayleigh quotient $R_N(x)$ exactly as above in the Hermitian case. Then it is easy to see that the only eigenvalue of $N$ is zero, while the maximum value of the Rayleigh quotient is 1/2. That is, the maximum value of the Rayleigh quotient is larger than the maximum eigenvalue.
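This is simple to see numerically (a minimal NumPy sketch, scanning unit vectors in the real plane):

    import numpy as np

    N = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    theta = np.linspace(0.0, 2.0 * np.pi, 10_001)
    xs = np.stack([np.cos(theta), np.sin(theta)])    # unit vectors in R^2
    quotients = np.einsum('it,ij,jt->t', xs, N, xs)  # (Nx, x) for each unit x
    print(np.linalg.eigvals(N))                      # [0. 0.]
    print(quotients.max())                           # ≈ 0.5, attained near x = (1, 1)/√2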

Applications


Min-max principle for singular values


The singular values {σk} of a square matrix M are the square roots of the eigenvalues of M*M (equivalently MM*). An immediate consequence of the first equality in the min-max theorem is:

$$\sigma_k^{\downarrow} = \max_{S : \dim(S) = k} \; \min_{x \in S, \|x\| = 1} (M^* M x, x)^{1/2} = \max_{S : \dim(S) = k} \; \min_{x \in S, \|x\| = 1} \|Mx\|.$$

Similarly,

$$\sigma_k^{\downarrow} = \min_{S : \dim(S) = n - k + 1} \; \max_{x \in S, \|x\| = 1} \|Mx\|.$$

Here $\sigma_k^{\downarrow}$ denotes the $k$th entry in the decreasing sequence of the singular values, so that $\sigma_1^{\downarrow} \geq \sigma_2^{\downarrow} \geq \cdots$.
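The sketch below (NumPy, illustrative only) checks the first formula: the span of the top $k$ right singular vectors achieves the max–min value $\sigma_k^{\downarrow}$.

    import numpy as np

    rng = np.random.default_rng(5)
    n, k = 6, 2
    M = rng.standard_normal((n, n))
    sigma = np.linalg.svd(M, compute_uv=False)   # decreasing singular values
    _, _, Vt = np.linalg.svd(M)

    S = Vt[:k].T                                 # top-k right singular vectors, n x k
    # On this subspace, min ||Mx|| over unit x is the square root of the
    # smallest eigenvalue of the compression of M*M, which equals σk.
    minimum = np.sqrt(np.linalg.eigvalsh(S.T @ (M.T @ M) @ S)[0])
    assert np.isclose(minimum, sigma[k - 1])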

Cauchy interlacing theorem


Let A be a symmetric n × n matrix. The m × m matrix B, where m ≤ n, is called a compression of A if there exists an orthogonal projection P onto a subspace of dimension m such that PAP* = B. The Cauchy interlacing theorem states:

Theorem. If the eigenvalues of A are α1 ≤ ... ≤ αn, and those of B are β1 ≤ ... ≤ βj ≤ ... ≤ βm, then for all j ≤ m,
$$\alpha_j \leq \beta_j \leq \alpha_{n-m+j}.$$

This can be proven using the min-max principle. Let βi have corresponding eigenvector bi, and let Sj be the j-dimensional subspace Sj = span{b1, ..., bj}; then

$$\beta_j = \max_{x \in S_j, \|x\| = 1} (Bx, x) = \max_{x \in S_j, \|x\| = 1} (PAP^* x, x) \geq \min_{S_j} \max_{x \in S_j, \|x\| = 1} (A(P^* x), P^* x) = \alpha_j.$$

According to the first part of the min-max theorem, αj ≤ βj. On the other hand, if we define Sm−j+1 = span{bj, ..., bm}, then

$$\beta_j = \min_{x \in S_{m-j+1}, \|x\| = 1} (Bx, x) = \min_{x \in S_{m-j+1}, \|x\| = 1} (PAP^* x, x) = \min_{x \in S_{m-j+1}, \|x\| = 1} (A(P^* x), P^* x) \leq \alpha_{n-m+j},$$

where the last inequality is given by the second part of min-max.

When n − m = 1, we have αj ≤ βj ≤ αj+1, hence the name interlacing theorem.
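Taking the compression onto the first m coordinates (a principal submatrix) gives an easy numerical check (a NumPy sketch):

    import numpy as np

    rng = np.random.default_rng(6)
    n, m = 7, 4
    X = rng.standard_normal((n, n))
    A = (X + X.T) / 2
    B = A[:m, :m]                        # compression onto the first m coordinates
    alpha = np.linalg.eigvalsh(A)        # ascending
    beta = np.linalg.eigvalsh(B)         # ascending
    for j in range(m):                   # zero-indexed form of α_j ≤ β_j ≤ α_{n−m+j}
        assert alpha[j] <= beta[j] + 1e-10
        assert beta[j] <= alpha[n - m + j] + 1e-10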

Lidskii's inequality


Lidskii inequality If $1 \leq i_1 < \cdots < i_k \leq n$, then
$$\lambda_{i_1}(A+B) + \cdots + \lambda_{i_k}(A+B) \leq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \lambda_1(B) + \cdots + \lambda_k(B)$$

$$\lambda_{i_1}(A+B) + \cdots + \lambda_{i_k}(A+B) \geq \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \xi_1(B) + \cdots + \xi_k(B)$$

Proof

The second inequality is the first applied to $-A$ and $-B$. The first follows from the Wielandt minimax formula, together with the extremal partial trace bound $\operatorname{tr}_W(B) \leq \lambda_1(B) + \cdots + \lambda_k(B)$:

$$\begin{aligned}
&\lambda_{i_1}(A+B) + \cdots + \lambda_{i_k}(A+B) \\
&= \sup_{V_1, \ldots, V_k} \; \inf_{W \in X(V_1, \ldots, V_k)} \left( \operatorname{tr}_W(A) + \operatorname{tr}_W(B) \right) \\
&\leq \sup_{V_1, \ldots, V_k} \left( \inf_{W \in X(V_1, \ldots, V_k)} \operatorname{tr}_W(A) + \lambda_1(B) + \cdots + \lambda_k(B) \right) \\
&= \lambda_{i_1}(A) + \cdots + \lambda_{i_k}(A) + \lambda_1(B) + \cdots + \lambda_k(B)
\end{aligned}$$

Note that $\sum_i \lambda_i(A+B) = \operatorname{tr}(A+B) = \sum_i \lambda_i(A) + \sum_i \lambda_i(B)$. In other words, $\lambda(A+B) - \lambda(A) \preceq \lambda(B)$, where $\preceq$ denotes majorization. By the Schur convexity theorem, we then have:

p-Wielandt–Hoffman inequality
$$\|\lambda(A+B) - \lambda(A)\|_{\ell^p} \leq \|B\|_{S^p},$$
where $\|\cdot\|_{S^p}$ stands for the Schatten $p$-norm.
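A numerical spot check of the inequality (a NumPy sketch; both eigenvalue vectors are sorted the same way):

    import numpy as np

    rng = np.random.default_rng(7)
    n, p = 6, 2.5
    X, Y = rng.standard_normal((2, n, n))
    A, B = (X + X.T) / 2, (Y + Y.T) / 2

    lam = lambda M: np.linalg.eigvalsh(M)          # eigenvalues, ascending
    diff = lam(A + B) - lam(A)
    lhs = np.sum(np.abs(diff) ** p) ** (1 / p)     # l^p norm of the eigenvalue shift
    rhs = np.sum(np.abs(lam(B)) ** p) ** (1 / p)   # Schatten p-norm of Hermitian B
    assert lhs <= rhs + 1e-10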

Compact operators


Let A be a compact, Hermitian operator on a Hilbert space H. Recall that the non-zero spectrum of such an operator consists of real eigenvalues with finite multiplicities whose only possible cluster point is zero. If A has infinitely many positive eigenvalues, they accumulate at zero. In this case, we list the positive eigenvalues of A as

$$\cdots \leq \lambda_k \leq \cdots \leq \lambda_1,$$

where entries are repeated with multiplicity, as in the matrix case. (To emphasize that the sequence is decreasing, we may write $\lambda_k = \lambda_k^{\downarrow}$.) We now apply the same reasoning as in the matrix case. Letting $S_k \subset H$ be a $k$-dimensional subspace, we can obtain the following theorem.

Theorem (Min-Max). Let A be a compact, self-adjoint operator on a Hilbert space H, whose positive eigenvalues are listed in decreasing order ... ≤ λk ≤ ... ≤ λ1. Then:
$$\begin{aligned}
\max_{S_k} \; \min_{x \in S_k, \|x\| = 1} (Ax, x) &= \lambda_k^{\downarrow}, \\
\min_{S_{k-1}} \; \max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) &= \lambda_k^{\downarrow}.
\end{aligned}$$

A similar pair of equalities holds for negative eigenvalues.

Proof

Let $S'$ be the closure of the linear span $S' = \operatorname{span}\{u_k, u_{k+1}, \ldots\}$, where $u_i$ denotes an eigenvector associated with $\lambda_i$. The subspace $S'$ has codimension $k - 1$. By the same dimension-count argument as in the matrix case, $S' \cap S_k$ has positive dimension. So there exists $x \in S' \cap S_k$ with $\|x\| = 1$. Since it is an element of $S'$, such an $x$ necessarily satisfies

$$(Ax, x) \leq \lambda_k.$$

Therefore, for all $S_k$,

$$\inf_{x \in S_k, \|x\| = 1} (Ax, x) \leq \lambda_k.$$

But A is compact, so the function f(x) = (Ax, x) is weakly continuous; furthermore, any bounded set in H is weakly compact. This lets us replace the infimum by a minimum:

$$\min_{x \in S_k, \|x\| = 1} (Ax, x) \leq \lambda_k.$$

So

$$\sup_{S_k} \; \min_{x \in S_k, \|x\| = 1} (Ax, x) \leq \lambda_k.$$

Because equality is achieved when $S_k = \operatorname{span}\{u_1, \ldots, u_k\}$,

$$\max_{S_k} \; \min_{x \in S_k, \|x\| = 1} (Ax, x) = \lambda_k.$$

This is the first part of the min-max theorem for compact self-adjoint operators.

Analogously, consider now a $(k-1)$-dimensional subspace $S_{k-1}$, whose orthogonal complement is denoted by $S_{k-1}^{\perp}$. If $S' = \operatorname{span}\{u_1, \ldots, u_k\}$,

$$S' \cap S_{k-1}^{\perp} \neq \{0\}.$$

So

$$\exists x \in S' \cap S_{k-1}^{\perp}, \quad \|x\| = 1, \quad (Ax, x) \geq \lambda_k.$$

This implies

$$\max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) \geq \lambda_k,$$

where the compactness of A was applied to attain the maximum. Indexing over the collection of (k − 1)-dimensional subspaces gives

$$\inf_{S_{k-1}} \; \max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) \geq \lambda_k.$$

Picking $S_{k-1} = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$, we deduce

$$\min_{S_{k-1}} \; \max_{x \in S_{k-1}^{\perp}, \|x\| = 1} (Ax, x) = \lambda_k.$$
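In finite dimensions (the simplest compact self-adjoint case), the second identity is easy to verify numerically (a NumPy sketch):

    import numpy as np

    rng = np.random.default_rng(8)
    n, k = 8, 3
    X = rng.standard_normal((n, n))
    A = (X + X.T) / 2
    w, U = np.linalg.eigh(A)
    lam, U = w[::-1], U[:, ::-1]            # descending eigenvalues and eigenvectors

    # With S_{k-1} = span{u_1, ..., u_{k-1}}, the max of (Ax, x) over
    # unit x in its orthogonal complement is λ_k.
    P = U[:, k - 1:]                        # orthonormal basis of the complement
    assert np.isclose(np.linalg.eigvalsh(P.T @ A @ P).max(), lam[k - 1])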

Self-adjoint operators


The min-max theorem also applies to (possibly unbounded) self-adjoint operators.[2][3] Recall that the essential spectrum is the spectrum without isolated eigenvalues of finite multiplicity. Sometimes we have some eigenvalues below the essential spectrum, and we would like to approximate these eigenvalues and the corresponding eigenfunctions.

Theorem (Min-Max). Let A be self-adjoint, and let $E_1 \leq E_2 \leq E_3 \leq \cdots$ be the eigenvalues of A below the essential spectrum. Then

$$E_n = \min_{\psi_1, \ldots, \psi_n} \; \max \{ \langle \psi, A\psi \rangle : \psi \in \operatorname{span}(\psi_1, \ldots, \psi_n), \; \|\psi\| = 1 \}.$$

If we only have N eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing min-max with inf-sup.

Theorem (Max-Min). Let A be self-adjoint, and let $E_1 \leq E_2 \leq E_3 \leq \cdots$ be the eigenvalues of A below the essential spectrum. Then

$$E_n = \max_{\psi_1, \ldots, \psi_{n-1}} \; \min \{ \langle \psi, A\psi \rangle : \psi \perp \psi_1, \ldots, \psi_{n-1}, \; \|\psi\| = 1 \}.$$

If we only have N eigenvalues and hence run out of eigenvalues, then we let $E_n := \inf \sigma_{\mathrm{ess}}(A)$ (the bottom of the essential spectrum) for $n > N$, and the above statement holds after replacing max-min with sup-inf.

The proofs[2] [3] use the following results about self-adjoint operators:

Theorem. Let A be self-adjoint. Then $(A - E) \geq 0$ for $E \in \mathbb{R}$ if and only if $\sigma(A) \subseteq [E, \infty)$.[2]: 77
Theorem. If A is self-adjoint, then

$$\inf \sigma(A) = \inf_{\psi \in \mathfrak{D}(A), \|\psi\| = 1} \langle \psi, A\psi \rangle$$

and

$$\sup \sigma(A) = \sup_{\psi \in \mathfrak{D}(A), \|\psi\| = 1} \langle \psi, A\psi \rangle.$$[2]: 77
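In practice, the min-max values of an operator bounded below can be approximated by restricting the quadratic form to a finite-dimensional subspace. The sketch below (NumPy, a finite-difference illustration rather than part of the theory above) discretizes the harmonic oscillator $A = -\mathrm{d}^2/\mathrm{d}x^2 + x^2$, whose eigenvalues are $2n + 1$:

    import numpy as np

    # Discretize -d^2/dx^2 + x^2 on [-L, L] with Dirichlet boundary conditions.
    L, m = 8.0, 1000
    x = np.linspace(-L, L, m)
    h = x[1] - x[0]
    H = (np.diag(2.0 / h**2 + x**2)
         + np.diag(-np.ones(m - 1) / h**2, 1)
         + np.diag(-np.ones(m - 1) / h**2, -1))
    print(np.linalg.eigvalsh(H)[:4])   # ≈ [1, 3, 5, 7]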


References

  1. Tao, Terence (2012). Topics in Random Matrix Theory. Graduate Studies in Mathematics. Providence, RI: American Mathematical Society. ISBN 978-0-8218-7430-1.
  2. Teschl, G. Mathematical Methods in Quantum Mechanics (GSM 99). https://www.mat.univie.ac.at/~gerald/ftp/book-schroe/schroe.pdf
  3. Lieb, E.; Loss, M. (2001). Analysis. Graduate Studies in Mathematics. Vol. 14 (2nd ed.). Providence: American Mathematical Society. ISBN 0-8218-2783-9.