9. Solving Linear Equations

Michael Friendly and John Fox

2024年11月24日

Source: vignettes/a9-linear-equations.Rmd

a9-linear-equations.Rmd

This vignette illustrates the ideas behind solving systems of linear equations of the form $\mathbf{A x = b}$ where

$\mathbf{A}$ is an $m \times n$ matrix of coefficients for $m$ equations in $n$ unknowns
$\mathbf{x}$ is an $n \times 1$ vector unknowns, $x_1, x_2, \dots x_n$
$\mathbf{b}$ is an $m \times 1$ vector of constants, the "right-hand sides" of the equations

or, spelled out,

$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix} \begin{pmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ \end{pmatrix} \quad=\quad \begin{pmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{m} \\ \end{pmatrix}$ For three equations in three unknowns, the equations look like this:

A <- matrix (paste0 ("a_{", outer (1:3, 1:3, FUN = paste0), "}"), 
 nrow=3) 
b <- paste0 ("b_", 1:3)
x <- paste0 ("x", 1:3)
showEqn (A, b, vars = x, latex=TRUE)

\begin{array}{lllllll} a_{11} \cdot x_1 &+& a_{12} \cdot x_2 &+& a_{13} \cdot x_3 &=& b_1 \\ a_{21} \cdot x_1 &+& a_{22} \cdot x_2 &+& a_{23} \cdot x_3 &=& b_2 \\ a_{31} \cdot x_1 &+& a_{32} \cdot x_2 &+& a_{33} \cdot x_3 &=& b_3 \\ \end{array}

Conditions for a solution

The general conditions for solutions are:

the equations are consistent (solutions exist) if r(A|b)=r(A)r( \mathbf{A} | \mathbf{b}) = r( \mathbf{A})
- the solution is unique if $r( \mathbf{A} | \mathbf{b}) = r( \mathbf{A}) = n$
- the solution is underdetermined if $r( \mathbf{A} | \mathbf{b}) = r( \mathbf{A}) < n$
the equations are inconsistent (no solutions) if $r( \mathbf{A} | \mathbf{b}) > r( \mathbf{A})$

We use c( R(A), R(cbind(A,b)) ) to show the ranks, and all.equal( R(A), R(cbind(A,b)) ) to test for consistency.

library (matlib ) # use the package

Equations in two unknowns

Each equation in two unknowns corresponds to a line in 2D space. The equations have a unique solution if all lines intersect in a point.

Two consistent equations

A <- matrix (c (1, 2, -1, 2), 2, 2)
b <- c (2,1)
showEqn (A, b)

## 1*x1 - 1*x2 = 2 
## 2*x1 + 2*x2 = 1

Check whether they are consistent:

c ( R (A), R (cbind (A,b)) ) # show ranks

## [1] 2 2

all.equal ( R (A), R (cbind (A,b)) ) # consistent?

## [1] TRUE

Plot the equations:

plotEqn (A,b)

## x[1] - 1*x[2] = 2 
## 2*x[1] + 2*x[2] = 1

Plot of two consistent equations which plot as lines intersecting in a point

Solve() is a convenience function that shows the solution in a more comprehensible form:

Solve (A, b, fractions = TRUE)

## x1 = 5/4 
## x2 = -3/4

Three consistent equations

For three (or more) equations in two unknowns, $r(\mathbf{A}) \le 2$ , because $r(\mathbf{A}) \le \min(m,n)$ . The equations will be consistent if $r(\mathbf{A}) = r(\mathbf{A | b})$ . This means that whatever linear relations exist among the rows of $\mathbf{A}$ are the same as those among the elements of $\mathbf{b}$ .

Geometrically, this means that all three lines intersect in a point.

A <- matrix (c (1,2,3, -1, 2, 1), 3, 2)
b <- c (2,1,3)
showEqn (A, b)

## 1*x1 - 1*x2 = 2 
## 2*x1 + 2*x2 = 1 
## 3*x1 + 1*x2 = 3

c ( R (A), R (cbind (A,b)) ) # show ranks

## [1] 2 2

all.equal ( R (A), R (cbind (A,b)) ) # consistent?

## [1] TRUE

Solve (A, b, fractions=TRUE) # show solution

## x1 = 5/4 
## x2 = -3/4 
## 0 = 0

Plot the equations:

plotEqn (A,b)

## x[1] - 1*x[2] = 2 
## 2*x[1] + 2*x[2] = 1 
## 3*x[1] + x[2] = 3

Plot of three consistent equations which plot as three lines intersecting in a point

Three inconsistent equations

Three equations in two unknowns are inconsistent when $r(\mathbf{A}) < r(\mathbf{A | b})$ .

A <- matrix (c (1,2,3, -1, 2, 1), 3, 2)
b <- c (2,1,6)
showEqn (A, b)

## 1*x1 - 1*x2 = 2 
## 2*x1 + 2*x2 = 1 
## 3*x1 + 1*x2 = 6

c ( R (A), R (cbind (A,b)) ) # show ranks

## [1] 2 3

all.equal ( R (A), R (cbind (A,b)) ) # consistent?

## [1] "Mean relative difference: 0.5"

You can see this in the result of reducing $\mathbf{A} | \mathbf{b}$ to echelon form, where the last row indicates the inconsistency because it represents the equation $0 x_1 + 0 x_2 = -3$ .

echelon (A, b)

## [,1] [,2] [,3]
## [1,] 1 0 2.75
## [2,] 0 1 -2.25
## [3,] 0 0 -3.00

Solve() shows this more explicitly, using fractions where possible:

Solve (A, b, fractions=TRUE)

## x1 = 11/4 
## x2 = -9/4 
## 0 = -3

An approximate solution is sometimes available using a generalized inverse. This gives $\mathbf{x} = (2, -1)$ as a best close solution.

x <- MASS::ginv (A) %*%  b
x

## [,1]
## [1,] 2
## [2,] -1

Plot the equations. You can see that each pair of equations has a solution, but all three do not have a common, consistent solution.

par (mar=c (4,4,0,0)+.1)
plotEqn (A,b, xlim=c (-2, 4))

## x[1] - 1*x[2] = 2 
## 2*x[1] + 2*x[2] = 1 
## 3*x[1] + x[2] = 6

# add the ginv() solution
points (x[1], x[2], pch=15)

Plot of the lines corresponding to three inconsistent equations. They do not all intersect in a point, indicating that there is no common solution.

Equations in three unknowns

Each equation in three unknowns corresponds to a plane in 3D space. The equations have a unique solution if all planes intersect in a point.

Three consistent equations

An example:

A <- matrix (c (2, 1, -1,
 -3, -1, 2,
 -2, 1, 2), 3, 3, byrow=TRUE)
colnames (A) <- paste0 ('x', 1:3)
b <- c (8, -11, -3)
showEqn (A, b)

## 2*x1 + 1*x2 - 1*x3 = 8 
## -3*x1 - 1*x2 + 2*x3 = -11 
## -2*x1 + 1*x2 + 2*x3 = -3

Are the equations consistent?

c ( R (A), R (cbind (A,b)) ) # show ranks

## [1] 3 3

all.equal ( R (A), R (cbind (A,b)) ) # consistent?

## [1] TRUE

Solve for $\mathbf{x}$ .

solve (A, b)

## x1 x2 x3 
## 2 3 -1

Other ways of solving:

solve (A) %*%  b

## [,1]
## x1 2
## x2 3
## x3 -1

inv (A) %*%  b

## [,1]
## [1,] 2
## [2,] 3
## [3,] -1

Yet another way to see the solution is to reduce $\mathbf{A | b}$ to echelon form. The result of this is the matrix $[\mathbf{I \quad | \quad A^{-1}b}]$ , with the solution in the last column.

echelon (A, b)

## x1 x2 x3 
## [1,] 1 0 0 2
## [2,] 0 1 0 3
## [3,] 0 0 1 -1

`echelon() can be asked to show the steps, as the row operations necessary to reduce $\mathbf{X}$ to the identity matrix $\mathbf{I}$ .

echelon (A, b, verbose=TRUE, fractions=TRUE)

## 
## Initial matrix:

## x1 x2 x3 
## [1,] 2 1 -1 8
## [2,] -3 -1 2 -11
## [3,] -2 1 2 -3
## 
## row: 1 
## 
## exchange rows 1 and 2

## x1 x2 x3 
## [1,] -3 -1 2 -11
## [2,] 2 1 -1 8
## [3,] -2 1 2 -3
## 
## multiply row 1 by -1/3

## x1 x2 x3 
## [1,] 1 1/3 -2/3 11/3
## [2,] 2 1 -1 8
## [3,] -2 1 2 -3
## 
## multiply row 1 by 2 and subtract from row 2

## x1 x2 x3 
## [1,] 1 1/3 -2/3 11/3
## [2,] 0 1/3 1/3 2/3
## [3,] -2 1 2 -3
## 
## multiply row 1 by 2 and add to row 3

## x1 x2 x3 
## [1,] 1 1/3 -2/3 11/3
## [2,] 0 1/3 1/3 2/3
## [3,] 0 5/3 2/3 13/3
## 
## row: 2 
## 
## exchange rows 2 and 3

## x1 x2 x3 
## [1,] 1 1/3 -2/3 11/3
## [2,] 0 5/3 2/3 13/3
## [3,] 0 1/3 1/3 2/3
## 
## multiply row 2 by 3/5

## x1 x2 x3 
## [1,] 1 1/3 -2/3 11/3
## [2,] 0 1 2/5 13/5
## [3,] 0 1/3 1/3 2/3
## 
## multiply row 2 by 1/3 and subtract from row 1

## x1 x2 x3 
## [1,] 1 0 -4/5 14/5
## [2,] 0 1 2/5 13/5
## [3,] 0 1/3 1/3 2/3
## 
## multiply row 2 by 1/3 and subtract from row 3

## x1 x2 x3 
## [1,] 1 0 -4/5 14/5
## [2,] 0 1 2/5 13/5
## [3,] 0 0 1/5 -1/5
## 
## row: 3 
## 
## multiply row 3 by 5

## x1 x2 x3 
## [1,] 1 0 -4/5 14/5
## [2,] 0 1 2/5 13/5
## [3,] 0 0 1 -1
## 
## multiply row 3 by 4/5 and add to row 1

## x1 x2 x3 
## [1,] 1 0 0 2
## [2,] 0 1 2/5 13/5
## [3,] 0 0 1 -1
## 
## multiply row 3 by 2/5 and subtract from row 2

## x1 x2 x3 
## [1,] 1 0 0 2
## [2,] 0 1 0 3
## [3,] 0 0 1 -1

Now, let’s plot them.

plotEqn3d() uses rgl for 3D graphics. If you rotate the figure, you’ll see an orientation where all three planes intersect at the solution point, $\mathbf{x} = (2, 3, -1)$

plotEqn3d (A,b, xlim=c (0,4), ylim=c (0,4))

Three inconsistent equations

A <- matrix (c (1, 3, 1,
 1, -2, -2,
 2, 1, -1), 3, 3, byrow=TRUE)
colnames (A) <- paste0 ('x', 1:3)
b <- c (2, 3, 6)
showEqn (A, b)

## 1*x1 + 3*x2 + 1*x3 = 2 
## 1*x1 - 2*x2 - 2*x3 = 3 
## 2*x1 + 1*x2 - 1*x3 = 6

Are the equations consistent? No.

c ( R (A), R (cbind (A,b)) ) # show ranks

## [1] 2 3

all.equal ( R (A), R (cbind (A,b)) ) # consistent?

## [1] "Mean relative difference: 0.5"