Extreme value theorem
In real analysis, a branch of mathematics, the extreme value theorem states that if a real-valued function {\displaystyle f} is continuous on the closed and bounded interval {\displaystyle [a,b]}, then {\displaystyle f} must attain a maximum and a minimum, each at least once.[1] [2] That is, there exist numbers {\displaystyle c} and {\displaystyle d} in {\displaystyle [a,b]} such that: {\displaystyle f(c)\leq f(x)\leq f(d)\quad \forall x\in [a,b].}
The extreme value theorem is more specific than the related boundedness theorem, which states merely that a continuous function {\displaystyle f} on the closed interval {\displaystyle [a,b]} is bounded on that interval; that is, there exist real numbers {\displaystyle m} and {\displaystyle M} such that: {\displaystyle m\leq f(x)\leq M\quad \forall x\in [a,b].}
This does not say that {\displaystyle M} and {\displaystyle m} are necessarily the maximum and minimum values of {\displaystyle f} on the interval {\displaystyle [a,b],} which is what the extreme value theorem stipulates must also be the case.
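As a numerical illustration (a sketch only, not a proof), the following Python snippet samples a continuous function on a closed, bounded interval and reports grid points playing the roles of {\displaystyle c} and {\displaystyle d}. The function `f`, the interval {\displaystyle [0,3]}, the grid size and the helper name `sample_extremes` are all assumptions chosen for illustration.

```python
def f(x):
    """Illustrative continuous function (an assumed example)."""
    return (x - 1) ** 2


def sample_extremes(func, a, b, n):
    """Return (c, d): grid points of [a, b] where func is smallest and largest."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    c = min(xs, key=func)  # grid point with the least sampled value
    d = max(xs, key=func)  # grid point with the greatest sampled value
    return c, d


c, d = sample_extremes(f, 0.0, 3.0, 300)
print("minimum near x =", c, "value", f(c))
print("maximum near x =", d, "value", f(d))
```

A grid search only approximates the extrema in general; the theorem itself guarantees that exact points {\displaystyle c} and {\displaystyle d} exist.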
The extreme value theorem is used to prove Rolle's theorem. In a formulation due to Karl Weierstrass, the extreme value theorem states that a continuous function from a non-empty compact space to a subset of the real numbers attains a maximum and a minimum.
History
The extreme value theorem was originally proven by Bernard Bolzano in the 1830s in his work Function Theory, but the work remained unpublished until 1930. Bolzano's proof consisted of showing that a continuous function on a closed interval was bounded, and then showing that the function attained a maximum and a minimum value. Both proofs involved what is known today as the Bolzano–Weierstrass theorem.[3]
Functions to which the theorem does not apply
The following examples show why the function domain must be closed and bounded in order for the theorem to apply. Each fails to attain a maximum on the given interval.
- {\displaystyle f(x)=x} defined over {\displaystyle [0,\infty )} is not bounded from above.
- {\displaystyle f(x)={\frac {x}{1+x}}} defined over {\displaystyle [0,\infty )} is bounded but does not attain its least upper bound {\displaystyle 1}.
- {\displaystyle f(x)={\frac {1}{x}}} defined over {\displaystyle (0,1]} is not bounded from above.
- {\displaystyle f(x)=1-x} defined over {\displaystyle (0,1]} is bounded but never attains its least upper bound {\displaystyle 1}.
Defining {\displaystyle f(0)=0} in the last two examples shows that both theorems require continuity on {\displaystyle [a,b]}.
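The two bounded examples above can be checked with exact rational arithmetic. This is a minimal sketch: the names `g` and `h` and the particular sample points are illustrative assumptions, and a finite sample can only suggest, not prove, that the value {\displaystyle 1} is never reached.

```python
from fractions import Fraction


def g(x):
    """x / (1 + x) on [0, infinity): values stay strictly below 1."""
    return x / (1 + x)


def h(x):
    """1 - x on (0, 1]: values stay strictly below 1."""
    return 1 - x


# Exact fractions avoid floating-point rounding up to 1.
g_values = [g(Fraction(10) ** k) for k in range(1, 8)]   # x = 10, 100, ...
h_values = [h(Fraction(1, 10 ** k)) for k in range(1, 8)]  # x = 1/10, 1/100, ...

for gv, hv in zip(g_values, h_values):
    print(float(gv), float(hv))
```

In both cases the gap to the least upper bound shrinks but stays positive, because the point where it would close ({\displaystyle x\to \infty } or {\displaystyle x=0}) lies outside the domain.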
Generalization to metric and topological spaces
When moving from the real line {\displaystyle \mathbb {R} } to metric spaces and general topological spaces, the appropriate generalization of a closed bounded interval is a compact set. A set {\displaystyle K} is said to be compact if it has the following property: from every collection of open sets {\displaystyle U_{\alpha }} such that {\textstyle \bigcup U_{\alpha }\supset K}, a finite subcollection {\displaystyle U_{\alpha _{1}},\ldots ,U_{\alpha _{n}}} can be chosen such that {\textstyle \bigcup _{i=1}^{n}U_{\alpha _{i}}\supset K}. This is usually stated in short as "every open cover of {\displaystyle K} has a finite subcover". The Heine–Borel theorem asserts that a subset of the real line is compact if and only if it is both closed and bounded. Correspondingly, a metric space has the Heine–Borel property if every closed and bounded set is also compact.
The concept of a continuous function can likewise be generalized. Given topological spaces {\displaystyle V,\ W}, a function {\displaystyle f:V\to W} is said to be continuous if for every open set {\displaystyle U\subset W}, {\displaystyle f^{-1}(U)\subset V} is also open. Given these definitions, continuous functions can be shown to preserve compactness:[4]
Theorem—If {\displaystyle V,\ W} are topological spaces, {\displaystyle f:V\to W} is a continuous function, and {\displaystyle K\subset V} is compact, then {\displaystyle f(K)\subset W} is also compact.
In particular, if {\displaystyle W=\mathbb {R} }, then this theorem implies that {\displaystyle f(K)} is closed and bounded for any compact set {\displaystyle K}, which in turn implies that {\displaystyle f} attains its supremum and infimum on any (nonempty) compact set {\displaystyle K}. Thus, we have the following generalization of the extreme value theorem:[4]
Theorem—If {\displaystyle K} is a nonempty compact set and {\displaystyle f:K\to \mathbb {R} } is a continuous function, then {\displaystyle f} is bounded and there exist {\displaystyle p,q\in K} such that {\displaystyle f(p)=\sup _{x\in K}f(x)} and {\displaystyle f(q)=\inf _{x\in K}f(x)}.
Slightly more generally, the existence of a maximum also holds for an upper semicontinuous function (see compact space#Functions and compact spaces).
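The step from "{\displaystyle f(K)} is closed and bounded" to "{\displaystyle f} attains its supremum" can be spelled out as follows. This is a sketch for the continuous case; the symbols {\displaystyle S} and {\displaystyle y_{n}} are auxiliary names introduced here and do not appear in the cited source.

```latex
% Sketch: a nonempty compact set S = f(K) in the reals contains its supremum.
\begin{align*}
M &:= \sup S < \infty
  && \text{($S$ is nonempty and bounded)} \\
M - \tfrac{1}{n} &< y_n \le M \ \text{ for some } y_n \in S,\ n \ge 1
  && \text{(definition of the supremum)} \\
y_n &\to M
  && \text{(squeeze)} \\
M &\in S
  && \text{($S$ is closed)} \\
f(p) &= M \ \text{ for some } p \in K
  && \text{(since $S = f(K)$)}
\end{align*}
```

The same argument applied to {\displaystyle \inf S} gives the point {\displaystyle q}.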
Proving the theorems
We look at the proof for the upper bound and the maximum of {\displaystyle f}. By applying these results to the function {\displaystyle -f}, the existence of the lower bound and the result for the minimum of {\displaystyle f} follow. Also note that everything in the proof is done within the context of the real numbers.
We first prove the boundedness theorem, which is a step in the proof of the extreme value theorem. The basic steps involved in the proof of the extreme value theorem are:
- Prove the boundedness theorem.
- Find a sequence so that its image converges to the supremum of {\displaystyle f}.
- Show that there exists a subsequence that converges to a point in the domain.
- Use continuity to show that the image of the subsequence converges to the value of {\displaystyle f} at that point, which must therefore equal the supremum.
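The steps above can be traced on a concrete example. The sketch below assumes the function {\displaystyle f(x)=x(1-x)} on {\displaystyle [0,1]}, whose supremum {\displaystyle M=1/4} is known in closed form; the helper `pick_d` and the grid resolution are hypothetical choices made only for illustration. In this example the whole sequence happens to converge, so no subsequence needs to be extracted.

```python
from fractions import Fraction


def f(x):
    """Illustrative continuous function on [0, 1] (an assumed example)."""
    return x * (1 - x)


M = Fraction(1, 4)  # supremum of f on [0, 1], known exactly for this example


def pick_d(n, grid=1000):
    """Return the first grid point d of [0, 1] with M - 1/n < f(d) <= M."""
    for i in range(grid + 1):
        d = Fraction(i, grid)
        if M - Fraction(1, n) < f(d):
            return d
    raise ValueError("grid too coarse for this n")


ns = (1, 10, 100, 1000, 10000)
ds = [pick_d(n) for n in ns]

for n, d in zip(ns, ds):
    print(n, d, float(M - f(d)))
```

The chosen points approach {\displaystyle 1/2}, where {\displaystyle f} takes the value {\displaystyle M}, mirroring the last two steps of the outline.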
Proof of the boundedness theorem
Boundedness Theorem—If {\displaystyle f(x)} is continuous on {\displaystyle [a,b],} then it is bounded on {\displaystyle [a,b].}
Suppose the function {\displaystyle f} is not bounded above on the interval {\displaystyle [a,b]}. Pick a sequence {\displaystyle (x_{n})_{n\in \mathbb {N} }} such that {\displaystyle x_{n}\in [a,b]} and {\displaystyle f(x_{n})>n}. Because {\displaystyle [a,b]} is bounded, the Bolzano–Weierstrass theorem implies that there exists a convergent subsequence {\displaystyle (x_{n_{k}})_{k\in \mathbb {N} }} of {\displaystyle ({x_{n}})}. Denote its limit by {\displaystyle x}. As {\displaystyle [a,b]} is closed, it contains {\displaystyle x}. Because {\displaystyle f} is continuous at {\displaystyle x}, we know that {\displaystyle f(x_{{n}_{k}})} converges to the real number {\displaystyle f(x)} (as {\displaystyle f} is sequentially continuous at {\displaystyle x}). But {\displaystyle f(x_{{n}_{k}})>n_{k}\geq k} for every {\displaystyle k}, which implies that {\displaystyle f(x_{{n}_{k}})} diverges to {\displaystyle +\infty }, a contradiction. Therefore, {\displaystyle f} is bounded above on {\displaystyle [a,b]}. ∎
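To see where closedness of the interval enters the argument above, consider {\displaystyle f(x)=1/x} on the half-open interval {\displaystyle (0,1]}. The sketch below builds the sequence {\displaystyle x_{n}=1/(n+1)}, an illustrative choice not taken from the proof, which satisfies {\displaystyle f(x_{n})>n} and converges to {\displaystyle 0}, a point outside the domain.

```python
from fractions import Fraction


def f(x):
    """1/x, continuous on the half-open interval (0, 1]."""
    return 1 / x


# x_n = 1/(n+1) for n = 1..10; every x_n lies in (0, 1].
xs = [Fraction(1, n + 1) for n in range(1, 11)]

# Each triple (n, x_n, f(x_n)) witnesses f(x_n) > n.
witnesses = [(n, x, f(x)) for n, x in enumerate(xs, start=1)]

for n, x, fx in witnesses:
    print(n, x, fx)
```

The sequence converges, but its limit {\displaystyle 0} is not in {\displaystyle (0,1]}, so continuity of {\displaystyle f} at the limit cannot be invoked and no contradiction arises.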
Alternative proof: Consider the set {\displaystyle B} of points {\displaystyle p} in {\displaystyle [a,b]} such that {\displaystyle f(x)} is bounded on {\displaystyle [a,p]}. We note that {\displaystyle a} is one such point, for {\displaystyle f(x)} is bounded on {\displaystyle [a,a]} by the value {\displaystyle f(a)}. If {\displaystyle e>a} is another point, then all points between {\displaystyle a} and {\displaystyle e} also belong to {\displaystyle B}. In other words {\displaystyle B} is an interval closed at its left end by {\displaystyle a}.
Now {\displaystyle f} is continuous on the right at {\displaystyle a}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(a)|<1} for all {\displaystyle x} in {\displaystyle [a,a+\delta ]}. Thus {\displaystyle f} is bounded by {\displaystyle f(a)-1} and {\displaystyle f(a)+1} on the interval {\displaystyle [a,a+\delta ]} so that all these points belong to {\displaystyle B}.
So far, we know that {\displaystyle B} is an interval of non-zero length, closed at its left end by {\displaystyle a}.
Next, {\displaystyle B} is bounded above by {\displaystyle b}. Hence the set {\displaystyle B} has a supremum in {\displaystyle [a,b]}; let us call it {\displaystyle s}. From the non-zero length of {\displaystyle B} we can deduce that {\displaystyle s>a}.
Suppose {\displaystyle s<b}. Now {\displaystyle f} is continuous at {\displaystyle s}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<1} for all {\displaystyle x} in {\displaystyle [s-\delta ,s+\delta ]} so that {\displaystyle f} is bounded on this interval. But it follows from the supremacy of {\displaystyle s} that there exists a point belonging to {\displaystyle B}, {\displaystyle e} say, which is greater than {\displaystyle s-\delta /2}. Thus {\displaystyle f} is bounded on {\displaystyle [a,e]} which overlaps {\displaystyle [s-\delta ,s+\delta ]} so that {\displaystyle f} is bounded on {\displaystyle [a,s+\delta ]}. This however contradicts the supremacy of {\displaystyle s}.
We must therefore have {\displaystyle s=b}. Now {\displaystyle f} is continuous on the left at {\displaystyle s}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<1} for all {\displaystyle x} in {\displaystyle [s-\delta ,s]} so that {\displaystyle f} is bounded on this interval. But it follows from the supremacy of {\displaystyle s} that there exists a point belonging to {\displaystyle B}, {\displaystyle e} say, which is greater than {\displaystyle s-\delta /2}. Thus {\displaystyle f} is bounded on {\displaystyle [a,e]} which overlaps {\displaystyle [s-\delta ,s]} so that {\displaystyle f} is bounded on {\displaystyle [a,s]}. ∎
Proofs of the extreme value theorem
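By the boundedness theorem, {\displaystyle f} is bounded from above, hence, by the Dedekind-completeness of the real numbers, the least upper bound (supremum) {\displaystyle M} of {\displaystyle f} exists. It is necessary to find a point {\displaystyle d} in {\displaystyle [a,b]} such that {\displaystyle M=f(d)}. Let {\displaystyle n} be a natural number. As {\displaystyle M} is the least upper bound, {\displaystyle M-1/n} is not an upper bound for {\displaystyle f}. Therefore, there exists {\displaystyle d_{n}} in {\displaystyle [a,b]} so that {\displaystyle M-1/n<f(d_{n})}. This defines a sequence {\displaystyle \{d_{n}\}}. Since {\displaystyle M} is an upper bound for {\displaystyle f}, we have {\displaystyle M-1/n<f(d_{n})\leq M} for all {\displaystyle n}. Therefore, the sequence {\displaystyle \{f(d_{n})\}} converges to {\displaystyle M}.
The Bolzano–Weierstrass theorem tells us that there exists a subsequence {\displaystyle \{d_{n_{k}}\}}, which converges to some {\displaystyle d} and, as {\displaystyle [a,b]} is closed, {\displaystyle d} is in {\displaystyle [a,b]}. Since {\displaystyle f} is continuous at {\displaystyle d}, the sequence {\displaystyle \{f(d_{n_{k}})\}} converges to {\displaystyle f(d)}. But {\displaystyle \{f(d_{n_{k}})\}} is a subsequence of {\displaystyle \{f(d_{n})\}} that converges to {\displaystyle M}, so {\displaystyle M=f(d)}. Therefore, {\displaystyle f} attains its supremum {\displaystyle M} at {\displaystyle d}. ∎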
Alternative proof: The set {y ∈ R : y = f(x) for some x ∈ [a,b]} is a bounded set. Hence, its least upper bound exists by the least upper bound property of the real numbers. Let M = sup(f(x)) on [a, b]. If there is no point x on [a, b] so that f(x) = M, then f(x) < M on [a, b]. Therefore, 1/(M − f(x)) is continuous on [a, b].
However, to every positive number ε, there is always some x in [a, b] such that M − f(x) < ε because M is the least upper bound. Hence, 1/(M − f(x)) > 1/ε, which means that 1/(M − f(x)) is not bounded. Since every continuous function on [a, b] is bounded, this contradicts the conclusion that 1/(M − f(x)) was continuous on [a, b]. Therefore, there must be a point x in [a, b] such that f(x) = M. ∎
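The contradiction in this argument can also be written as a short chain of inequalities. In the sketch below, {\displaystyle g} and the bound {\displaystyle K} are auxiliary symbols introduced here for illustration; they do not appear in the text above.

```latex
% Sketch of the contradiction; g and K are auxiliary names introduced for illustration.
\begin{align*}
g(x) &:= \frac{1}{M - f(x)} > 0 \quad \text{for all } x \in [a,b]
  && \text{(assuming $f(x) < M$ everywhere)} \\
g(x) &\le K \quad \text{for some } K > 0
  && \text{(boundedness theorem applied to $g$)} \\
M - f(x) &\ge \frac{1}{K}
  && \text{(take reciprocals)} \\
f(x) &\le M - \frac{1}{K} < M
  && \text{(so $M - 1/K$ is a smaller upper bound)}
\end{align*}
```

The last line contradicts the choice of M as the least upper bound, so f(x) = M must hold for some x in [a, b].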
Proof using the hyperreals
In the setting of non-standard calculus, let N be an infinite hyperinteger. The interval [0, 1] has a natural hyperreal extension. Consider its partition into N subintervals of equal infinitesimal length 1/N, with partition points {\displaystyle x_{i}=i/N} as i "runs" from 0 to N. The function ƒ is also naturally extended to a function ƒ* defined on the hyperreals between 0 and 1. Note that in the standard setting (when N is finite), a point with the maximal value of ƒ can always be chosen among the N+1 points {\displaystyle x_{i}}, by induction. Hence, by the transfer principle, there is a hyperinteger {\displaystyle i_{0}} such that {\displaystyle 0\leq i_{0}\leq N} and {\displaystyle f^{*}(x_{i_{0}})\geq f^{*}(x_{i})} for all i = 0, ..., N. Consider the real point {\displaystyle c=\mathbf {st} (x_{i_{0}})} where st is the standard part function. An arbitrary real point x lies in a suitable sub-interval of the partition, namely {\displaystyle x\in [x_{i},x_{i+1}]}, so that {\displaystyle \mathbf {st} (x_{i})=x}. Applying st to the inequality {\displaystyle f^{*}(x_{i_{0}})\geq f^{*}(x_{i})}, we obtain {\displaystyle \mathbf {st} (f^{*}(x_{i_{0}}))\geq \mathbf {st} (f^{*}(x_{i}))}. By continuity of ƒ we have
- {\displaystyle \mathbf {st} (f^{*}(x_{i_{0}}))=f(\mathbf {st} (x_{i_{0}}))=f(c)}.
Hence ƒ(c) ≥ ƒ(x), for all real x, proving c to be a maximum of ƒ.[5] ∎
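The finite (standard) half of this argument is easy to illustrate: among the N+1 partition points of [0, 1], one with the largest value of ƒ can always be selected. The sketch below does this for a few finite N; the function `f` (with its maximum at 1/3) and the helper `grid_argmax` are assumptions made for illustration, and nothing here models the hyperreal step itself.

```python
def f(x):
    """Illustrative continuous function on [0, 1] with its maximum at x = 1/3."""
    return -(x - 1 / 3) ** 2


def grid_argmax(func, n):
    """Return the partition point x_i = i/n (0 <= i <= n) with the largest func value."""
    best_index = max(range(n + 1), key=lambda i: func(i / n))
    return best_index / n


# Grid maximizers for increasingly fine finite partitions.
maximizers = {n: grid_argmax(f, n) for n in (10, 100, 1000)}

for n, x in maximizers.items():
    print(n, x)
```

As N grows the selected partition point moves closer to the true maximizer, which is the finite analogue of taking the standard part of {\displaystyle x_{i_{0}}}.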
Proof from first principles
Statement: If {\displaystyle f(x)} is continuous on {\displaystyle [a,b]} then it attains its supremum on {\displaystyle [a,b]}.
By the Boundedness Theorem, {\displaystyle f(x)} is bounded above on {\displaystyle [a,b]} and by the completeness property of the real numbers has a supremum in {\displaystyle [a,b]}. Let us call it {\displaystyle M}, or {\displaystyle M[a,b]}. It is clear that the restriction of {\displaystyle f} to the subinterval {\displaystyle [a,x]} where {\displaystyle x\leq b} has a supremum {\displaystyle M[a,x]} which is less than or equal to {\displaystyle M}, and that {\displaystyle M[a,x]} increases from {\displaystyle f(a)} to {\displaystyle M} as {\displaystyle x} increases from {\displaystyle a} to {\displaystyle b}.
If {\displaystyle f(a)=M} then we are done. Suppose therefore that {\displaystyle f(a)<M} and let {\displaystyle d=M-f(a)}. Consider the set {\displaystyle L} of points {\displaystyle x} in {\displaystyle [a,b]} such that {\displaystyle M[a,x]<M}.
Clearly {\displaystyle a\in L}; moreover if {\displaystyle e>a} is another point in {\displaystyle L} then all points between {\displaystyle a} and {\displaystyle e} also belong to {\displaystyle L} because {\displaystyle M[a,x]} is monotonic increasing. Hence {\displaystyle L} is a non-empty interval, closed at its left end by {\displaystyle a}.
Now {\displaystyle f} is continuous on the right at {\displaystyle a}, hence there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(a)|<d/2} for all {\displaystyle x} in {\displaystyle [a,a+\delta ]}. Thus {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [a,a+\delta ]} so that all these points belong to {\displaystyle L}.
Next, {\displaystyle L} is bounded above by {\displaystyle b} and has therefore a supremum in {\displaystyle [a,b]}: let us call it {\displaystyle s}. We see from the above that {\displaystyle s>a}. We will show that {\displaystyle s} is the point we are seeking i.e. the point where {\displaystyle f} attains its supremum, or in other words {\displaystyle f(s)=M}.
Suppose the contrary viz. {\displaystyle f(s)<M}. Let {\displaystyle d=M-f(s)} and consider the following two cases:
- {\displaystyle s<b}. As {\displaystyle f} is continuous at {\displaystyle s}, there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<d/2} for all {\displaystyle x} in {\displaystyle [s-\delta ,s+\delta ]}. This means that {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [s-\delta ,s+\delta ]}. But it follows from the supremacy of {\displaystyle s} that there exists a point, {\displaystyle e} say, belonging to {\displaystyle L} which is greater than {\displaystyle s-\delta }. By the definition of {\displaystyle L}, {\displaystyle M[a,e]<M}. Let {\displaystyle d_{1}=M-M[a,e]} then for all {\displaystyle x} in {\displaystyle [a,e]}, {\displaystyle f(x)\leq M-d_{1}}. Taking {\displaystyle d_{2}} to be the minimum of {\displaystyle d/2} and {\displaystyle d_{1}}, we have {\displaystyle f(x)\leq M-d_{2}} for all {\displaystyle x} in {\displaystyle [a,s+\delta ]}. Hence {\displaystyle M[a,s+\delta ]<M} so that {\displaystyle s+\delta \in L}. This however contradicts the supremacy of {\displaystyle s} and completes the proof.
- {\displaystyle s=b}. As {\displaystyle f} is continuous on the left at {\displaystyle s}, there exists {\displaystyle \delta >0} such that {\displaystyle |f(x)-f(s)|<d/2} for all {\displaystyle x} in {\displaystyle [s-\delta ,s]}. This means that {\displaystyle f} is less than {\displaystyle M-d/2} on the interval {\displaystyle [s-\delta ,s]}. But it follows from the supremacy of {\displaystyle s} that there exists a point, {\displaystyle e} say, belonging to {\displaystyle L} which is greater than {\displaystyle s-\delta }. By the definition of {\displaystyle L}, {\displaystyle M[a,e]<M}. Let {\displaystyle d_{1}=M-M[a,e]} then for all {\displaystyle x} in {\displaystyle [a,e]}, {\displaystyle f(x)\leq M-d_{1}}. Taking {\displaystyle d_{2}} to be the minimum of {\displaystyle d/2} and {\displaystyle d_{1}}, we have {\displaystyle f(x)\leq M-d_{2}} for all {\displaystyle x} in {\displaystyle [a,b]}. This contradicts the supremacy of {\displaystyle M} and completes the proof. ∎
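The objects used in this proof, the running supremum {\displaystyle M[a,x]}, the set {\displaystyle L} and its supremum {\displaystyle s}, can be approximated on a grid. The sketch below assumes the example {\displaystyle f(x)=x(1-x)} on {\displaystyle [0,1]} and a grid of step 1/100; the names `running_sup`, `L_points`, `s_lower` and `first_hit` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def f(x):
    """Illustrative continuous function on [0, 1] (an assumed example)."""
    return x * (1 - x)


def running_sup(func, grid):
    """Return the grid analogue of M[a, x]: the max of func over grid points <= x."""
    out, best = [], None
    for x in grid:
        fx = func(x)
        best = fx if best is None else max(best, fx)
        out.append(best)
    return out


grid = [Fraction(i, 100) for i in range(101)]
sups = running_sup(f, grid)
M = sups[-1]  # grid analogue of M[a, b]

# Grid analogue of L: points x where the running supremum is still below M.
L_points = [x for x, m in zip(grid, sups) if m < M]
s_lower = max(L_points)  # largest grid point in L
first_hit = min(x for x, m in zip(grid, sups) if m == M)  # first point reaching M

print(M, s_lower, first_hit, f(first_hit))
```

On this grid the set {\displaystyle L} ends just before {\displaystyle x=1/2}, and the function takes the value {\displaystyle M} at the next grid point, matching the claim {\displaystyle f(s)=M}.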
Extension to semi-continuous functions
If the continuity of the function f is weakened to semi-continuity, then the corresponding half of the boundedness theorem and the extreme value theorem hold and the values −∞ or +∞, respectively, from the extended real number line can be allowed as possible values.
A function {\displaystyle f:[a,b]\to [-\infty ,\infty )} is said to be upper semi-continuous if {\displaystyle \limsup _{y\to x}f(y)\leq f(x)\quad \forall x\in [a,b].}
Theorem—If a function f : [a, b] → [–∞, ∞) is upper semi-continuous, then f is bounded above and attains its supremum.
If {\displaystyle f(x)=-\infty } for all x in [a,b], then the supremum is also {\displaystyle -\infty } and the theorem is true. In all other cases, the proof is a slight modification of the proofs given above. In the proof of the boundedness theorem, the upper semi-continuity of f at x only implies that the limit superior of the subsequence {\displaystyle \{f(x_{n_{k}})\}} is bounded above by f(x) < ∞, but that is enough to obtain the contradiction. In the proof of the extreme value theorem, upper semi-continuity of f at d implies that the limit superior of the subsequence {\displaystyle \{f(d_{n_{k}})\}} is bounded above by f(d), but this suffices to conclude that f(d) = M. ∎
Applying this result to −f proves a similar result for the infima of lower semi-continuous functions.
A function {\displaystyle f:[a,b]\to (-\infty ,\infty ]} is said to be lower semi-continuous if {\displaystyle \liminf _{y\to x}f(y)\geq f(x)\quad \forall x\in [a,b].}
Theorem—If a function f : [a, b] → (–∞, ∞] is lower semi-continuous, then f is bounded below and attains its infimum.
A real-valued function is upper as well as lower semi-continuous, if and only if it is continuous in the usual sense. Hence these two theorems imply the boundedness theorem and the extreme value theorem.
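A small example shows that upper semi-continuity gives only the "maximum" half of the theorem. The sketch below assumes the function with f(0) = 1 and f(x) = x for 0 < x ≤ 1, which is upper but not lower semi-continuous at 0. This function and the sample points are illustrative choices, and a finite sample only suggests the behaviour of the infimum.

```python
from fractions import Fraction


def f(x):
    """f(0) = 1 and f(x) = x for 0 < x <= 1 (upper semi-continuous on [0, 1])."""
    return Fraction(1) if x == 0 else x


# Sample 0 together with the points 1/2^20, ..., 1/2, 1.
samples = [Fraction(0)] + [Fraction(1, 2 ** k) for k in range(20, -1, -1)]
values = [f(x) for x in samples]

print("largest sampled value:", max(values))
print("smallest sampled value:", float(min(values)))
```

The supremum 1 is attained (at x = 0 and at x = 1), while the sampled values approach the infimum 0 without reaching it.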
References
- Spivak, Michael (September 1994). Calculus. Publish or Perish publishing. ISBN 978-0-914098-89-8.
- Abbott, Stephen (2001). Understanding Analysis. Undergraduate Texts in Mathematics. New York: Springer-Verlag. ISBN 978-0387950600.
- Rusnock, Paul; Kerr-Lawson, Angus (2005). "Bolzano and Uniform Continuity". Historia Mathematica. 32 (3): 303–311. doi:10.1016/j.hm.2004.11.003.
- Rudin, Walter (1976). Principles of Mathematical Analysis. New York: McGraw Hill. pp. 89–90. ISBN 0-07-054235-X.
- Keisler, H. Jerome (1986). Elementary Calculus: An Infinitesimal Approach (PDF). Boston: Prindle, Weber & Schmidt. p. 164. ISBN 0-87150-911-3.
Further reading
- Adams, Robert A. (1995). Calculus: A Complete Course. Reading: Addison-Wesley. pp. 706–707. ISBN 0-201-82823-5.
- Protter, M. H.; Morrey, C. B. (1977). "The Boundedness and Extreme–Value Theorems". A First Course in Real Analysis. New York: Springer. pp. 71–73. ISBN 0-387-90215-5.
External links
- A Proof for extreme value theorem at cut-the-knot
- Extreme Value Theorem by Jacqueline Wandzura with additional contributions by Stephen Wandzura, the Wolfram Demonstrations Project.
- Weisstein, Eric W. "Extreme Value Theorem". MathWorld.
- Mizar system proof: http://mizar.org/version/current/html/weierstr.html#T15