Fixed-point computation
Fixed-point computation refers to the process of computing an exact or approximate fixed point of a given function.[1] In its most common form, the given function {\displaystyle f} satisfies the condition to the Brouwer fixed-point theorem: that is, {\displaystyle f} is continuous and maps the unit d-cube to itself. The Brouwer fixed-point theorem guarantees that {\displaystyle f} has a fixed point, but the proof is not constructive. Various algorithms have been devised for computing an approximate fixed point. Such algorithms are used in various tasks, such as
Definitions
[edit ]The unit interval is denoted by {\displaystyle E:=[0,1]}, and the unit d-dimensional cube is denoted by {\displaystyle E^{d}}. A continuous function {\displaystyle f} is defined on {\displaystyle E^{d}} (from {\displaystyle E^{d}} to itself). Often, it is assumed that {\displaystyle f} is not only continuous but also Lipschitz continuous, that is, for some constant {\displaystyle L}, {\displaystyle |f(x)-f(y)|\leq L\cdot |x-y|} for all {\displaystyle x,y} in {\displaystyle E^{d}}.
A fixed point of {\displaystyle f} is a point {\displaystyle x} in {\displaystyle E^{d}} such that {\displaystyle f(x)=x}. By the Brouwer fixed-point theorem, any continuous function from {\displaystyle E^{d}} to itself has a fixed point. But for general functions, it is impossible to compute a fixed point precisely, since it can be an arbitrary real number. Fixed-point computation algorithms look for approximate fixed points. There are several criteria for an approximate fixed point. Several common criteria are:[2]
- The residual criterion: given an approximation parameter {\displaystyle \varepsilon >0} , An ε-residual fixed-point of {\displaystyle f} is a point {\displaystyle x} in {\displaystyle E^{d}}' such that {\displaystyle |f(x)-x|\leq \varepsilon }, where here {\displaystyle |\cdot |} denotes the maximum norm. That is, all {\displaystyle d} coordinates of the difference {\displaystyle f(x)-x} should be at most ε.[3] : 4
- The absolute criterion: given an approximation parameter {\displaystyle \delta >0}, A δ-absolute fixed-point of {\displaystyle f} is a point {\displaystyle x} in {\displaystyle E^{d}} such that {\displaystyle |x-x_{0}|\leq \delta }, where {\displaystyle x_{0}} is any fixed-point of {\displaystyle f}.
- The relative criterion: given an approximation parameter {\displaystyle \delta >0}, A δ-relative fixed-point of {\displaystyle f} is a point x in {\displaystyle E^{d}} such that {\displaystyle |x-x_{0}|/|x_{0}|\leq \delta }, where {\displaystyle x_{0}} is any fixed-point of {\displaystyle f}.
For Lipschitz-continuous functions, the absolute criterion is stronger than the residual criterion: If {\displaystyle f} is Lipschitz-continuous with constant {\displaystyle L}, then {\displaystyle |x-x_{0}|\leq \delta } implies {\displaystyle |f(x)-f(x_{0})|\leq L\cdot \delta }. Since {\displaystyle x_{0}} is a fixed-point of {\displaystyle f}, this implies {\displaystyle |f(x)-x_{0}|\leq L\cdot \delta }, so {\displaystyle |f(x)-x|\leq (1+L)\cdot \delta }. Therefore, a δ-absolute fixed-point is also an ε-residual fixed-point with {\displaystyle \varepsilon =(1+L)\cdot \delta }.
The most basic step of a fixed-point computation algorithm is a value query: given any {\displaystyle x} in {\displaystyle E^{d}}, the algorithm is provided with an oracle {\displaystyle {\tilde {f}}} to {\displaystyle f} that returns the value {\displaystyle f(x)}. The accuracy of the approximate fixed-point depends upon the error in the oracle {\displaystyle {\tilde {f}}(x)}.
The function {\displaystyle f} is accessible via evaluation queries: for any {\displaystyle x}, the algorithm can evaluate {\displaystyle f(x)}. The run-time complexity of an algorithm is usually given by the number of required evaluations.
Contractive functions
[edit ]A Lipschitz-continuous function with constant {\displaystyle L} is called contractive if {\displaystyle L<1}; it is called weakly-contractive if {\displaystyle L\leq 1}. Every contractive function satisfying Brouwer's conditions has a unique fixed point. Moreover, fixed-point computation for contractive functions is easier than for general functions.
The first algorithm for fixed-point computation was the fixed-point iteration algorithm of Banach. Banach's fixed-point theorem implies that, when fixed-point iteration is applied to a contraction mapping, the error after {\displaystyle t} iterations is in {\displaystyle O(L^{t})}. Therefore, the number of evaluations required for a {\displaystyle \delta }-relative fixed-point is approximately {\displaystyle \log _{L}(\delta )=\log(\delta )/\log(L)=\log(1/\delta )/\log(1/L)}. Sikorski and Wozniakowski[4] showed that Banach's algorithm is optimal when the dimension is large. Specifically, when {\displaystyle d\geq \log(1/\delta )/\log(1/L)}, the number of required evaluations of any algorithm for {\displaystyle \delta }-relative fixed-point is larger than 50% the number of evaluations required by the iteration algorithm. Note that when {\displaystyle L} approaches 1, the number of evaluations approaches infinity. No finite algorithm can compute a {\displaystyle \delta }-absolute fixed point for all functions with {\displaystyle L=1}.[5]
When {\displaystyle L} < 1 and d = 1, the optimal algorithm is the Fixed Point Envelope (FPE) algorithm of Sikorski and Wozniakowski.[4] It finds a δ-relative fixed point using {\displaystyle O(\log(1/\delta )+\log \log(1/(1-L)))} queries, and a δ-absolute fixed point using {\displaystyle O(\log(1/\delta ))} queries. This is faster than the fixed-point iteration algorithm.[6]
When {\displaystyle d>1} but not too large, and {\displaystyle L\leq 1}, the optimal algorithm is the interior-ellipsoid algorithm (based on the ellipsoid method).[7] It finds an ε-residual fixed-point using {\displaystyle O(d\cdot \log(1/\varepsilon ))} evaluations. When {\displaystyle L<1}, it finds a {\displaystyle \delta }-absolute fixed point using {\displaystyle O(d\cdot [\log(1/\delta )+\log(1/(1-L))])} evaluations.
Shellman and Sikorski[8] presented an algorithm called BEFix (Bisection Envelope Fixed-point) for computing an ε-residual fixed-point of a two-dimensional function with '{\displaystyle L\leq 1}, using only {\displaystyle 2\lceil \log _{2}(1/\varepsilon )\rceil +1} queries. They later[9] presented an improvement called BEDFix (Bisection Envelope Deep-cut Fixed-point), with the same worst-case guarantee but better empirical performance. When {\displaystyle L<1}, BEDFix can also compute a {\displaystyle \delta }-absolute fixed-point using {\displaystyle O(\log(1/\varepsilon )+\log(1/(1-L)))} queries.
Shellman and Sikorski[2] presented an algorithm called PFix for computing an ε-residual fixed-point of a d-dimensional function with L ≤ 1, using {\displaystyle O(\log ^{d}(1/\varepsilon ))} queries. When {\displaystyle L} < 1, PFix can be executed with {\displaystyle \varepsilon =(1-L)\cdot \delta }, and in that case, it computes a δ-absolute fixed-point, using {\displaystyle O(\log ^{d}(1/[(1-L)\delta ]))} queries. It is more efficient than the iteration algorithm when {\displaystyle L} is close to 1. The algorithm is recursive: it handles a d-dimensional function by recursive calls on (d-1)-dimensional functions.
Algorithms for differentiable functions
[edit ]When the function {\displaystyle f} is differentiable, and the algorithm can evaluate its derivative (not only {\displaystyle f} itself), the Newton method can be used and it is much faster.[10] [11]
General functions
[edit ]For functions with Lipschitz constant {\displaystyle L} > 1, computing a fixed-point is much harder.
One dimension
[edit ]For a 1-dimensional function (d = 1), a {\displaystyle \delta }-absolute fixed-point can be found using {\displaystyle O(\log(1/\delta ))} queries using the bisection method: start with the interval {\displaystyle E:=[0,1]}; at each iteration, let {\displaystyle x} be the center of the current interval, and compute {\displaystyle f(x)}; if {\displaystyle f(x)>x} then recurse on the sub-interval to the right of {\displaystyle x}; otherwise, recurse on the interval to the left of {\displaystyle x}. Note that the current interval always contains a fixed point, so after {\displaystyle O(\log(1/\delta ))} queries, any point in the remaining interval is a {\displaystyle \delta }-absolute fixed-point of {\displaystyle f} Setting {\displaystyle \delta :=\varepsilon /(L+1)}, where {\displaystyle L} is the Lipschitz constant, gives an ε-residual fixed-point, using {\displaystyle O(\log(L/\varepsilon )=\log(L)+\log(1/\varepsilon ))} queries.[3]
Two or more dimensions
[edit ]For functions in two or more dimensions, the problem is much more challenging. Shellman and Sikorski[2] proved that for any integers d ≥ 2 and {\displaystyle L} > 1, finding a δ-absolute fixed-point of d-dimensional {\displaystyle L}-Lipschitz functions might require infinitely many evaluations. The proof idea is as follows. For any integer T > 1 and any sequence of T of evaluation queries (possibly adaptive), one can construct two functions that are Lipschitz-continuous with constant {\displaystyle L}, and yield the same answer to all these queries, but one of them has a unique fixed-point at (x, 0) and the other has a unique fixed-point at (x, 1). Any algorithm using T evaluations cannot differentiate between these functions, so cannot find a δ-absolute fixed-point. This is true for any finite integer T.
Several algorithms based on function evaluations have been developed for finding an ε-residual fixed-point
- The first algorithm to approximate a fixed point of a general function was developed by Herbert Scarf in 1967.[12] [13] Scarf's algorithm finds an ε-residual fixed-point by finding a fully labeled "primitive set", in a construction similar to Sperner's lemma.
- A later algorithm by Harold Kuhn [14] used simplices and simplicial partitions instead of primitive sets.
- Developing the simplicial approach further, Orin Harrison Merrill[15] presented the restart algorithm.
- B. Curtis Eaves[16] presented the Homotopy method. The algorithm works by starting with an affine function that approximates {\displaystyle f}, and deforming it towards {\displaystyle f} while following the fixed point.
- A book by Michael Todd[1] surveys various algorithms developed until 1976.
- David Gale [17] showed that computing a fixed point of an n-dimensional function (on the unit d-dimensional cube) is equivalent to deciding who is the winner in a d-dimensional game of Hex (a game with d players, each of whom needs to connect two opposite faces of a d-cube). Given the desired accuracy ε
- Construct a Hex board of size kd, where {\displaystyle k>1/\varepsilon }. Each vertex z corresponds to a point z/k in the unit n-cube.
- Compute the difference {\displaystyle f}(z/k) - z/k; note that the difference is an n-vector.
- Label the vertex z by a label in 1, ..., d, denoting the largest coordinate in the difference vector.
- The resulting labeling corresponds to a possible play of the d-dimensional Hex game among d players. This game must have a winner, and Gale presents an algorithm for constructing the winning path.
- In the winning path, there must be a point in which fi(z/k) - z/k is positive, and an adjacent point in which fi(z/k) - z/k is negative. This means that there is a fixed point of {\displaystyle f} between these two points.
In the worst case, the number of function evaluations required by all these algorithms is exponential in the binary representation of the accuracy, that is, in {\displaystyle \Omega (1/\varepsilon )}.
Query complexity
[edit ]Hirsch, Papadimitriou and Vavasis proved that[3] any algorithm based on function evaluations, that finds an ε-residual fixed-point of f, requires {\displaystyle \Omega (L'/\varepsilon )} function evaluations, where {\displaystyle L'} is the Lipschitz constant of the function {\displaystyle f(x)-x} (note that {\displaystyle L-1\leq L'\leq L+1}). More precisely:
- For a 2-dimensional function (d=2), they prove a tight bound {\displaystyle \Theta (L'/\varepsilon )}.
- For any d ≥ 3, finding an ε-residual fixed-point of a d-dimensional function requires {\displaystyle \Omega ((L'/\varepsilon )^{d-2})} queries and {\displaystyle O((L'/\varepsilon )^{d})} queries.
The latter result leaves a gap in the exponent. Chen and Deng[18] closed the gap. They proved that, for any d ≥ 2 and {\displaystyle 1/\varepsilon >4d} and {\displaystyle L'/\varepsilon >192d^{3}}, the number of queries required for computing an ε-residual fixed-point is in {\displaystyle \Theta ((L'/\varepsilon )^{d-1})}.
Discrete fixed-point computation
[edit ]A discrete function is a function defined on a subset of {\displaystyle \mathbb {Z} ^{d}} (the d-dimensional integer grid). There are several discrete fixed-point theorems, stating conditions under which a discrete function has a fixed point. For example, the Iimura-Murota-Tamura theorem states that (in particular) if {\displaystyle f} is a function from a rectangle subset of {\displaystyle \mathbb {Z} ^{d}} to itself, and {\displaystyle f} is hypercubic direction-preserving , then {\displaystyle f} has a fixed point.
Let {\displaystyle f} be a direction-preserving function from the integer cube {\displaystyle \{1,\dots ,n\}^{d}} to itself. Chen and Deng[18] prove that, for any d ≥ 2 and n > 48d, computing such a fixed point requires {\displaystyle \Theta (n^{d-1})} function evaluations.
Chen and Deng[19] define a different discrete-fixed-point problem, which they call 2D-BROUWER. It considers a discrete function {\displaystyle f} on {\displaystyle \{0,\dots ,n\}^{2}} such that, for every x on the grid, {\displaystyle f}(x) - x is either (0, 1) or (1, 0) or (-1, -1). The goal is to find a square in the grid, in which all three labels occur. The function {\displaystyle f} must map the square {\displaystyle \{0,\dots ,n\}^{2}}to itself, so it must map the lines x = 0 and y = 0 to either (0, 1) or (1, 0); the line x = n to either (-1, -1) or (0, 1); and the line y = n to either (-1, -1) or (1,0). The problem can be reduced to 2D-SPERNER (computing a fully-labeled triangle in a triangulation satisfying the conditions to Sperner's lemma), and therefore it is PPAD-complete. This implies that computing an approximate fixed-point is PPAD-complete even for very simple functions.
Relation between fixed-point computation and root-finding algorithms
[edit ]Given a function {\displaystyle g} from {\displaystyle E^{d}} to R, a root of {\displaystyle g} is a point x in {\displaystyle E^{d}} such that {\displaystyle g}(x)=0. An ε-root of g is a point x in {\displaystyle E^{d}} such that {\displaystyle g(x)\leq \varepsilon }.
Fixed-point computation is a special case of root-finding: given a function {\displaystyle f} on {\displaystyle E^{d}}, define {\displaystyle g(x):=|f(x)-x|}. X is a fixed-point of {\displaystyle f} if and only if x is a root of {\displaystyle g}, and x is an ε-residual fixed-point of {\displaystyle f} if and only if x is an ε-root of {\displaystyle g}. Therefore, any root-finding algorithm (an algorithm that computes an approximate root of a function) can be used to find an approximate fixed-point.
The opposite is not true: finding an approximate root of a general function may be harder than finding an approximate fixed point. In particular, Sikorski[20] proved that finding an ε-root requires {\displaystyle \Omega (1/\varepsilon ^{d})} function evaluations. This gives an exponential lower bound even for a one-dimensional function (in contrast, an ε-residual fixed-point of a one-dimensional function can be found using {\displaystyle O(\log(1/\varepsilon ))} queries using the bisection method). Here is a proof sketch.[3] : 35 Construct a function {\displaystyle g} that is slightly larger than ε everywhere in {\displaystyle E^{d}} except in some small cube around some point x0, where x0 is the unique root of {\displaystyle g}. If {\displaystyle g} is Lipschitz continuous with constant {\displaystyle L}, then the cube around x0 can have a side-length of {\displaystyle \varepsilon /L}. Any algorithm that finds an ε-root of {\displaystyle g} must check a set of cubes that covers the entire {\displaystyle E^{d}}; the number of such cubes is at least {\displaystyle (L/\varepsilon )^{d}}.
However, there are classes of functions for which finding an approximate root is equivalent to finding an approximate fixed point. One example[18] is the class of functions {\displaystyle g} such that {\displaystyle g(x)+x} maps {\displaystyle E^{d}} to itself (that is: {\displaystyle g(x)+x} is in {\displaystyle E^{d}} for all x in {\displaystyle E^{d}}). This is because, for every such function, the function {\displaystyle f(x):=g(x)+x} satisfies the conditions of Brouwer's fixed-point theorem. X is a fixed-point of {\displaystyle f} if and only if x is a root of {\displaystyle g}, and x is an ε-residual fixed-point of {\displaystyle f} if and only if x is an ε-root of {\displaystyle g}. Chen and Deng[18] show that the discrete variants of these problems are computationally equivalent: both problems require {\displaystyle \Theta (n^{d-1})} function evaluations.
Communication complexity
[edit ]Roughgarden and Weinstein[21] studied the communication complexity of computing an approximate fixed-point. In their model, there are two agents: one of them knows a function {\displaystyle f} and the other knows a function {\displaystyle g}. Both functions are Lipschitz continuous and satisfy Brouwer's conditions. The goal is to compute an approximate fixed point of the composite function {\displaystyle g\circ f}. They show that the deterministic communication complexity is in {\displaystyle \Omega (2^{d})}.
References
[edit ]- ^ a b The Computation of Fixed Points and Applications. Lecture Notes in Economics and Mathematical Systems. Vol. 124. 1976. doi:10.1007/978-3-642-50327-6. ISBN 978-3-540-07685-8.
- ^ a b c Shellman, Spencer; Sikorski, K. (December 2003). "A recursive algorithm for the infinity-norm fixed point problem". Journal of Complexity. 19 (6): 799–834. doi:10.1016/j.jco.200306001 .
- ^ a b c d Hirsch, Michael D; Papadimitriou, Christos H; Vavasis, Stephen A (December 1989). "Exponential lower bounds for finding Brouwer fix points". Journal of Complexity. 5 (4): 379–416. doi:10.1016/0885-064X(89)90017-4. S2CID 1727254.
- ^ a b Sikorski, K; Woźniakowski, H (December 1987). "Complexity of fixed points, I". Journal of Complexity. 3 (4): 388–405. doi:10.1016/0885-064X(87)90008-2 .
- ^ Sikorski, Krzysztof A. (2001). Optimal Solution of Nonlinear Equations. Oxford University Press. ISBN 978-0-19-510690-9.[page needed ]
- ^ Sikorski, K. (1989). "Fast Algorithms for the Computation of Fixed Points". Robustness in Identification and Control. pp. 49–58. doi:10.1007/978-1-4615-9552-6_4. ISBN 978-1-4615-9554-0.
- ^ Huang, Z; Khachiyan, L; Sikorski, K (June 1999). "Approximating Fixed Points of Weakly Contracting Mappings". Journal of Complexity. 15 (2): 200–213. doi:10.1006/jcom.1999.0504 .
- ^ Shellman, Spencer; Sikorski, K. (June 2002). "A Two-Dimensional Bisection Envelope Algorithm for Fixed Points". Journal of Complexity. 18 (2): 641–659. doi:10.1006/jcom.2001.0625 .
- ^ Shellman, Spencer; Sikorski, K. (September 2003). "Algorithm 825: A deep-cut bisection envelope algorithm for fixed points". ACM Transactions on Mathematical Software. 29 (3): 309–325. doi:10.1145/838250.838255. S2CID 7786886.
- ^ Kellogg, R. B.; Li, T. Y.; Yorke, J. (September 1976). "A Constructive Proof of the Brouwer Fixed-Point Theorem and Computational Results". SIAM Journal on Numerical Analysis. 13 (4): 473–483. doi:10.1137/0713041.
- ^ Smale, Steve (July 1976). "A convergent process of price adjustment and global newton methods". Journal of Mathematical Economics. 3 (2): 107–120. doi:10.1016/0304-4068(76)90019-7.
- ^ Scarf, Herbert (September 1967). "The Approximation of Fixed Points of a Continuous Mapping". SIAM Journal on Applied Mathematics. 15 (5): 1328–1343. doi:10.1137/0115116.
- ^ H. Scarf found the first algorithmic proof: Voitsekhovskii, M.I. (2001) [1994]. "Brouwer theorem". Encyclopedia of Mathematics . EMS Press. ISBN 1-4020-0609-8..
- ^ Kuhn, Harold W. (1968). "Simplicial Approximation of Fixed Points". Proceedings of the National Academy of Sciences of the United States of America. 61 (4): 1238–1242. doi:10.1073/pnas.61.4.1238 . JSTOR 58762. PMC 225246 . PMID 16591723.
- ^ Merrill, Orin Harrison (1972). Applications and Extensions of an Algorithm that Computes Fixed Points of Certain Upper Semi-continuous Point to Set Mappings (Thesis). OCLC 570461463. NAID 10006142329.
- ^ Eaves, B. Curtis (December 1972). "Homotopies for computation of fixed points". Mathematical Programming. 3–3 (1): 1–22. doi:10.1007/BF01584975. S2CID 39504380.
- ^ Gale, David (1979). "The Game of Hex and Brouwer Fixed-Point Theorem". The American Mathematical Monthly. 86 (10): 818–827. doi:10.2307/2320146. JSTOR 2320146.
- ^ a b c d Chen, Xi; Deng, Xiaotie (2005). "On algorithms for discrete and approximate brouwer fixed points". Proceedings of the thirty-seventh annual ACM symposium on Theory of computing. pp. 323–330. doi:10.1145/1060590.1060638. ISBN 1581139608. S2CID 16942881.
- ^ Chen, Xi; Deng, Xiaotie (October 2009). "On the complexity of 2D discrete fixed point problem". Theoretical Computer Science. 410 (44): 4448–4456. doi:10.1016/j.tcs.2009年07月05日2. S2CID 2831759.
- ^ Sikorski, K. (June 1984). "Optimal solution of nonlinear equations satisfying a Lipschitz condition". Numerische Mathematik. 43 (2): 225–240. doi:10.1007/BF01390124. S2CID 120937024.
- ^ Roughgarden, Tim; Weinstein, Omri (2016). "On the Communication Complexity of Approximate Fixed Points". 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS). pp. 229–238. doi:10.1109/FOCS.2016.32. ISBN 978-1-5090-3933-3. S2CID 87553.
Further reading
[edit ]- Yannakakis, Mihalis (May 2009). "Equilibria, fixed points, and complexity classes". Computer Science Review. 3 (2): 71–85. arXiv:0802.2831 . doi:10.1016/j.cosrev.200903004.