Bartels–Stewart algorithm
In numerical linear algebra, the Bartels–Stewart algorithm is used to numerically solve the Sylvester matrix equation $AX - XB = C$. Developed by R. H. Bartels and G. W. Stewart in 1972,[1] it was the first numerically stable method that could be systematically applied to solve such equations. The algorithm works by using the real Schur decompositions of $A$ and $B$ to transform $AX - XB = C$ into a triangular system that can then be solved using forward or backward substitution. In 1979, G. Golub, C. Van Loan and S. Nash introduced an improved version of the algorithm,[2] known as the Hessenberg–Schur algorithm. It remains a standard approach for solving Sylvester equations when $X$ is of small to moderate size.
The algorithm
Let $X, C \in \mathbb{R}^{m \times n}$, and assume that the eigenvalues of $A$ are distinct from the eigenvalues of $B$. Then the matrix equation $AX - XB = C$ has a unique solution. The Bartels–Stewart algorithm computes $X$ by applying the following steps:[2]
1. Compute the real Schur decompositions
- $R = U^{T}AU,$
- $S = V^{T}B^{T}V.$
The matrices $R$ and $S$ are block upper triangular, with diagonal blocks of size $1 \times 1$ or $2 \times 2$.
2. Set $F = U^{T}CV.$
3. Solve the simplified system $RY - YS^{T} = F$, where $Y = U^{T}XV$. This can be done using forward substitution on the blocks. Specifically, if $s_{k-1,k} = 0$, then
- $(R - s_{kk}I)y_{k} = f_{k} + \sum_{j=k+1}^{n} s_{kj}y_{j},$
where $y_{k}$ is the $k$th column of $Y$. When $s_{k-1,k} \neq 0$, the columns $[y_{k-1} \mid y_{k}]$ should be concatenated and solved for simultaneously.
4. Set $X = UYV^{T}.$
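The four steps above can be sketched in Python with NumPy and SciPy. This is a simplified illustration, not a library routine: it uses the complex Schur form, so every diagonal block is $1 \times 1$ and the joint solve for $2 \times 2$ blocks in step 3 is never needed.

```python
import numpy as np
from scipy.linalg import schur

def bartels_stewart(A, B, C):
    """Solve AX - XB = C, assuming eig(A) and eig(B) are disjoint.

    Simplified sketch using the complex Schur form, so the
    column-by-column substitution never meets a 2x2 diagonal block.
    """
    m, n = C.shape
    # Step 1: Schur decompositions R = U^H A U and S = V^H B^H V.
    R, U = schur(A, output="complex")
    S, V = schur(B.conj().T, output="complex")
    # Step 2: transform the right-hand side, F = U^H C V.
    F = U.conj().T @ C @ V
    # Step 3: substitution over the columns of Y, last column first:
    # (R - conj(s_kk) I) y_k = f_k + sum_{j>k} conj(s_kj) y_j.
    Y = np.zeros((m, n), dtype=complex)
    for k in range(n - 1, -1, -1):
        rhs = F[:, k] + Y[:, k + 1:] @ S[k, k + 1:].conj()
        Y[:, k] = np.linalg.solve(R - S[k, k].conj() * np.eye(m), rhs)
    # Step 4: transform back; for real inputs the imaginary part of the
    # result is roundoff noise.
    return U @ Y @ V.conj().T
```

A production implementation, such as LAPACK's `*trsyl`, works with the real Schur form instead and handles the $2 \times 2$ diagonal blocks by solving for pairs of columns simultaneously, avoiding complex arithmetic on real data.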
Computational cost
Using the QR algorithm, the real Schur decompositions in step 1 require approximately $10(m^{3} + n^{3})$ flops, so the overall computational cost is $10(m^{3} + n^{3}) + 2.5(mn^{2} + nm^{2})$ flops.[2]
Simplifications and special cases
In the special case where $B = -A^{T}$ and $C$ is symmetric, the solution $X$ will also be symmetric. This symmetry can be exploited so that $Y$ is found more efficiently in step 3 of the algorithm.[1]
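With $B = -A^{T}$, the equation $AX - XB = C$ becomes the Lyapunov equation $AX + XA^{T} = C$, which SciPy solves directly. A small illustration of the symmetric special case (using SciPy's Lyapunov solver, not a symmetry-exploiting Bartels–Stewart implementation):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

np.random.seed(3)
A = np.random.randn(4, 4) - 5 * np.eye(4)  # eig(A) and eig(-A^T) disjoint
C0 = np.random.randn(4, 4)
C = C0 + C0.T                              # symmetric right-hand side
# With B = -A^T, AX - XB = C is the Lyapunov equation AX + XA^T = C;
# the unique solution X inherits the symmetry of C.
X = solve_continuous_lyapunov(A, C)
```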
The Hessenberg–Schur algorithm
The Hessenberg–Schur algorithm[2] replaces the decomposition $R = U^{T}AU$ in step 1 with the decomposition $H = Q^{T}AQ$, where $H$ is an upper Hessenberg matrix. This leads to a system of the form $HY - YS^{T} = F$ that can be solved using forward substitution. The advantage of this approach is that $H = Q^{T}AQ$ can be found using Householder reflections at a cost of $(5/3)m^{3}$ flops, compared to the $10m^{3}$ flops required to compute the real Schur decomposition of $A$.
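The cheaper reduction can be demonstrated with SciPy's Hessenberg routine, which computes $H = Q^{T}AQ$ via Householder reflections (a minimal illustration of the decomposition itself, not of the full Hessenberg–Schur solver):

```python
import numpy as np
from scipy.linalg import hessenberg

np.random.seed(1)
A = np.random.randn(5, 5)
# Householder reduction H = Q^T A Q costs about (5/3) m^3 flops,
# versus about 10 m^3 flops for the full real Schur decomposition.
H, Q = hessenberg(A, calc_q=True)
# H is upper Hessenberg: zero below the first subdiagonal.
```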
Software and implementation
The subroutines required for the Hessenberg–Schur variant of the Bartels–Stewart algorithm are implemented in the SLICOT library. These are used in the MATLAB control system toolbox.
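In Python, SciPy exposes a Bartels–Stewart-type solver as `scipy.linalg.solve_sylvester`. Note that SciPy's sign convention is $AX + XB = Q$, so $B$ must be negated to match the convention $AX - XB = C$ used here:

```python
import numpy as np
from scipy.linalg import solve_sylvester

np.random.seed(2)
A = np.random.randn(4, 4) + 5 * np.eye(4)
B = np.random.randn(3, 3) - 5 * np.eye(3)
C = np.random.randn(4, 3)
# SciPy solves AX + XB = Q, so pass -B to solve AX - XB = C.
X = solve_sylvester(A, -B, C)
```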
Alternative approaches
For large systems, the $\mathcal{O}(m^{3} + n^{3})$ cost of the Bartels–Stewart algorithm can be prohibitive. When $A$ and $B$ are sparse or otherwise structured, so that linear solves and matrix–vector products involving them are cheap, iterative algorithms can potentially perform better. These include projection-based methods, which use Krylov subspace iterations; methods based on the alternating direction implicit (ADI) iteration; and hybrids that combine projection with ADI.[3] Iterative methods can also be used to directly construct low-rank approximations to $X$ when solving $AX - XB = C$.
References
- Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation AX + XB = C [F4]". Communications of the ACM. 15 (9): 820–826. doi:10.1145/361573.361582. ISSN 0001-0782.
- Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control. 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. hdl:1813/7472. ISSN 0018-9286.
- Simoncini, V. (2016). "Computational Methods for Linear Matrix Equations". SIAM Review. 58 (3): 377–441. doi:10.1137/130912839. hdl:11585/586011. ISSN 0036-1445. S2CID 17271167.