
3. Inverse of a matrix

Michael Friendly

November 24, 2024

Source: vignettes/a3-inv-ex1.Rmd

The inverse of a matrix plays the same role in matrix algebra that the reciprocal of a number and division play in ordinary arithmetic. Just as we can solve a simple equation like $4x = 8$ for $x$ by multiplying both sides by the reciprocal,

$$4 x = 8 \Rightarrow 4^{-1} 4 x = 4^{-1} 8 \Rightarrow x = 8 / 4 = 2$$

we can solve a matrix equation like $\mathbf{A x} = \mathbf{b}$ for the vector $\mathbf{x}$ by multiplying both sides by the inverse of the matrix $\mathbf{A}$,

$$\mathbf{A x} = \mathbf{b} \Rightarrow \mathbf{A}^{-1} \mathbf{A x} = \mathbf{A}^{-1} \mathbf{b} \Rightarrow \mathbf{x} = \mathbf{A}^{-1} \mathbf{b}$$
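
The matrix case can be sketched in a few lines. This is a minimal illustration that uses only base R (`solve()` rather than matlib's `inv()`); the right-hand side `b` is an arbitrary choice made for this example.

```r
# Coefficient matrix: the same 3 x 3 matrix used in the examples below
A <- matrix(c(5, 1, 0,
              3,-1, 2,
              4, 0,-1), nrow=3, byrow=TRUE)
b <- c(7, 3, 3)          # illustrative right-hand side (an assumption)

x <- solve(A) %*% b      # x = A^{-1} b; here x = (1, 2, 1)
A %*% x                  # reproduces b, up to rounding error

# solve(A, b) solves the system directly, without forming the inverse
all.equal(as.vector(x), solve(A, b))
```

In practice `solve(A, b)` is usually preferred for solving a system, since it avoids computing the full inverse; the explicit inverse is used here only to mirror the algebra.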

The following examples illustrate the basic properties of the inverse of a matrix.

Load the matlib package

This defines inv() and Inverse(); the standard R function for matrix inverse is solve().

 library(matlib)

Create a 3 x 3 matrix

The ordinary inverse is defined only for square matrices.

 A <- matrix ( c (5, 1, 0,
 3,-1, 2,
 4, 0,-1), nrow=3, byrow=TRUE)
 det (A)
## [1] 16

Basic properties

1. det(A) != 0, so inverse exists

Only non-singular matrices have an inverse.

 (AI <- inv (A))
## [,1] [,2] [,3]
## [1,] 0.0625 0.0625 0.125
## [2,] 0.6875 -0.3125 -0.625
## [3,] 0.2500 0.2500 -0.500

2. Definition of the inverse: $A^{-1} A = A A^{-1} = I$ or AI %*% A = diag(nrow(A))

The inverse of a matrix $A$ is defined as the matrix $A^{-1}$ which multiplies $A$ to give the identity matrix, just as, for a scalar $a$, $a a^{-1} = a / a = 1$.

NB: Sometimes you will get very tiny off-diagonal values (like 1.341e-13). The function zapsmall() will round those to 0.

 AI %*%  A
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
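
When rounding error does show up, an exact comparison with the identity matrix can fail even though the product is correct. A minimal sketch of two tolerant checks, using base R's `solve()` as a stand-in for `inv()` (assumed to agree up to rounding):

```r
A <- matrix(c(5, 1, 0,
              3,-1, 2,
              4, 0,-1), nrow=3, byrow=TRUE)
AI <- solve(A)                 # base R inverse
P  <- AI %*% A                 # the identity, up to rounding error

zapsmall(P)                    # rounds negligible entries to 0 for display
all.equal(P, diag(nrow(A)))    # comparison that tolerates rounding error
```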

3. Inverse is reflexive: inv(inv(A)) = A

Taking the inverse twice gets you back to where you started.

 inv (AI)
## [,1] [,2] [,3]
## [1,] 5 1 0
## [2,] 3 -1 2
## [3,] 4 0 -1

4. inv(A) is symmetric if and only if A is symmetric

 inv ( t (A) )
## [,1] [,2] [,3]
## [1,] 0.0625 0.6875 0.25
## [2,] 0.0625 -0.3125 0.25
## [3,] 0.1250 -0.6250 -0.50
## [1] FALSE
## [1] FALSE

Here is a symmetric case:

 B <- matrix ( c (4, 2, 2,
 2, 3, 1,
 2, 1, 3), nrow=3, byrow=TRUE)
 inv (B)
## [,1] [,2] [,3]
## [1,] 0.50 -0.25 -0.25
## [2,] -0.25 0.50 0.00
## [3,] -0.25 0.00 0.50
 inv ( t (B) )
## [,1] [,2] [,3]
## [1,] 0.50 -0.25 -0.25
## [2,] -0.25 0.50 0.00
## [3,] -0.25 0.00 0.50
## [1] TRUE
## [1] TRUE
 all.equal ( inv (B), inv ( t (B) ) )
## [1] TRUE
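
The symmetry claim can also be checked directly. The following is a sketch that assumes base R's `isSymmetric()` and `solve()`; the checks that produced the TRUE/FALSE output above may use a different helper.

```r
A <- matrix(c(5, 1, 0,
              3,-1, 2,
              4, 0,-1), nrow=3, byrow=TRUE)   # not symmetric
B <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 3), nrow=3, byrow=TRUE)   # symmetric

isSymmetric(A)           # FALSE
isSymmetric(solve(A))    # FALSE
isSymmetric(B)           # TRUE
isSymmetric(solve(B))    # TRUE, within isSymmetric()'s numerical tolerance
```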

More properties of matrix inverse

1. inverse of diagonal matrix = diag( 1/ diagonal)

In these simple examples, it is often useful to show the results of matrix calculations as fractions, using MASS::fractions().

 D <- diag (c (1, 2, 4))
 inv (D)
## [,1] [,2] [,3]
## [1,] 1 0.0 0.00
## [2,] 0 0.5 0.00
## [3,] 0 0.0 0.25
 MASS::fractions ( diag (1 / c (1, 2, 4)) )
## [,1] [,2] [,3]
## [1,] 1 0 0 
## [2,] 0 1/2 0 
## [3,] 0 0 1/4

2. Inverse of an inverse: inv(inv(A)) = A

 A <- matrix (c (1, 2, 3, 2, 3, 0, 0, 1, 2), nrow=3, byrow=TRUE)
 AI <- inv (A)
 inv (AI)
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 2 3 0
## [3,] 0 1 2

3. inverse of a transpose: inv(t(A)) = t(inv(A))

 inv ( t (A) )
## [,1] [,2] [,3]
## [1,] 1.50 -1.0 0.50
## [2,] -0.25 0.5 -0.25
## [3,] -2.25 1.5 -0.25
 t ( inv (A) )
## [,1] [,2] [,3]
## [1,] 1.50 -1.0 0.50
## [2,] -0.25 0.5 -0.25
## [3,] -2.25 1.5 -0.25
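
Why the two results agree can be seen in one line from the transpose rule $(XY)^T = Y^T X^T$. A short derivation, given as a sketch:

```latex
A^T (A^{-1})^T = (A^{-1} A)^T = I^T = I
\quad\Rightarrow\quad (A^T)^{-1} = (A^{-1})^T
```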

4. inverse of a scalar * matrix: inv( k*A ) = (1/k) * inv(A)

 inv (5 * A)
## [,1] [,2] [,3]
## [1,] 0.3 -0.05 -0.45
## [2,] -0.2 0.10 0.30
## [3,] 0.1 -0.05 -0.05
 (1/5) * inv (A)
## [,1] [,2] [,3]
## [1,] 0.3 -0.05 -0.45
## [2,] -0.2 0.10 0.30
## [3,] 0.1 -0.05 -0.05
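
The same kind of check works for the scalar rule: multiplying $kA$ by the proposed inverse gives the identity, provided $k \neq 0$.

```latex
(kA)\left(\tfrac{1}{k} A^{-1}\right) = \left(k \cdot \tfrac{1}{k}\right) A A^{-1} = I
\quad\Rightarrow\quad (kA)^{-1} = \tfrac{1}{k} A^{-1}, \qquad k \neq 0
```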

5. inverse of a matrix product: inv(A %*% B) = inv(B) %*% inv(A)

 B <- matrix (c (1, 2, 3, 1, 3, 2, 2, 4, 1), nrow=3, byrow=TRUE)
 C <- B[, 3:1]
 A %*%  B
## [,1] [,2] [,3]
## [1,] 9 20 10
## [2,] 5 13 12
## [3,] 5 11 4
 inv (A %*%  B)
## [,1] [,2] [,3]
## [1,] 4.0 -1.50 -5.50
## [2,] -2.0 0.70 2.90
## [3,] 0.5 -0.05 -0.85
 inv (B) %*%  inv (A)
## [,1] [,2] [,3]
## [1,] 4.0 -1.50 -5.50
## [2,] -2.0 0.70 2.90
## [3,] 0.5 -0.05 -0.85

This extends to any number of terms: the inverse of a product is the product of the inverses in reverse order.

 (ABC <- A %*%  B %*%  C)
## [,1] [,2] [,3]
## [1,] 77 118 49
## [2,] 53 97 42
## [3,] 41 59 24
 inv (A %*%  B %*%  C)
## [,1] [,2] [,3]
## [1,] 1.5 -0.59 -2.03
## [2,] -4.5 1.61 6.37
## [3,] 8.5 -2.95 -12.15
 inv (C) %*%  inv (B) %*%  inv (A)
## [,1] [,2] [,3]
## [1,] 1.5 -0.59 -2.03
## [2,] -4.5 1.61 6.37
## [3,] 8.5 -2.95 -12.15
 inv (ABC)
## [,1] [,2] [,3]
## [1,] 1.5 -0.59 -2.03
## [2,] -4.5 1.61 6.37
## [3,] 8.5 -2.95 -12.15
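
The reversal of order is what makes the inner factors cancel in pairs. A brief derivation for two and three factors, assuming all matrices are square and non-singular:

```latex
(AB)(B^{-1} A^{-1}) = A (B B^{-1}) A^{-1} = A I A^{-1} = A A^{-1} = I
\quad\Rightarrow\quad (AB)^{-1} = B^{-1} A^{-1}

(ABC)^{-1} = \big((AB)\,C\big)^{-1} = C^{-1} (AB)^{-1} = C^{-1} B^{-1} A^{-1}
```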

6. $\det(A^{-1}) = 1 / \det(A) = [\det(A)]^{-1}$

The determinant of an inverse is the inverse (reciprocal) of the determinant.

 det (AI)
## [1] 0.25
 1 / det (A)
## [1] 0.25
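
This follows from the product rule for determinants, $\det(XY) = \det(X) \det(Y)$, applied to $A A^{-1} = I$:

```latex
\det(A)\,\det(A^{-1}) = \det(A A^{-1}) = \det(I) = 1
\quad\Rightarrow\quad \det(A^{-1}) = \frac{1}{\det(A)}
```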

Geometric interpretations

Some of these properties of the matrix inverse can be more easily understood from geometric diagrams. Here, we take a $2 \times 2$ non-singular matrix $A$,

A <- matrix (c (2, 1, 
 1, 2), nrow=2, byrow=TRUE)
A
## [,1] [,2]
## [1,] 2 1
## [2,] 1 2
det (A)
## [1] 3

The larger the determinant of $A$, the smaller is the determinant of $A^{-1}$.

AI <- inv (A)
MASS::fractions (AI)
## [,1] [,2]
## [1,] 2/3 -1/3
## [2,] -1/3 2/3
det (AI)
## [1] 0.3333

Now, plot the rows of $A$ as vectors $a_1, a_2$ from the origin in a 2D space. As illustrated in vignette("a1-det-ex1"), the area of the parallelogram defined by these vectors is the determinant.

par (mar=c (3,3,1,1)+.1)
xlim <- c (-1,3)
ylim <- c (-1,3)
plot (xlim, ylim, type="n", xlab="X1", ylab="X2", asp=1)
sum <- A[1,] + A[2,]
# draw the parallelogram determined by the rows of A
polygon ( rbind (c (0,0), A[1,], sum, A[2,]), col=rgb (1,0,0,.2))
vectors (A, labels=c (expression (a[1]), expression (a[2])), pos.lab=c (4,2))
vectors (sum, origin=A[1,], col="gray")
vectors (sum, origin=A[2,], col="gray")
text (mean (A[,1]), mean (A[,2]), "A", cex=1.5)

The rows of the inverse $A^{-1}$ can be shown as vectors $a^1, a^2$ from the origin in the same space.

vectors (AI, labels=c (expression (a^1), expression (a^2)), pos.lab=c (4,2))
sum <- AI[1,] + AI[2,]
polygon ( rbind (c (0,0), AI[1,], sum, AI[2,]), col=rgb (0,0,1,.2))
text (mean (AI[,1])-.3, mean (AI[,2])-.2, expression (A^{-1}), cex=1.5)

Thus, we can see:

  • The shape of $A^{-1}$ is a $90^\circ$ rotation of the shape of $A$.

  • $A^{-1}$ is small in the directions where $A$ is large.

  • The vector $a^2$ is at right angles to $a_1$ and $a^1$ is at right angles to $a_2$.

  • If we multiplied $A$ by a constant $k$ to make its determinant larger (by a factor of $k^2$), the inverse would have to be divided by $k$, shrinking its determinant by the same factor $k^2$, to preserve $A A^{-1} = I$.
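
The right-angle and scaling observations can be checked numerically. This is a rough sketch using base R's `solve()` in place of `inv()`; the value `k = 2` is an arbitrary choice for illustration.

```r
A  <- matrix(c(2, 1,
               1, 2), nrow=2, byrow=TRUE)
AI <- solve(A)                    # base R inverse, standing in for inv()

# Right angles: inner products of a^2 with a_1, and of a^1 with a_2
drop(AI[2,] %*% A[1,])            # 0, up to rounding error
drop(AI[1,] %*% A[2,])            # 0, up to rounding error

# Scaling A by k multiplies its determinant by k^2 (for a 2 x 2 matrix)
k <- 2
det(k * A) / det(A)               # k^2 = 4
all.equal(solve(k * A), AI / k)   # the inverse itself is divided by k
```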

One might wonder whether these properties depend on symmetry of $A$, so here is another example, for the matrix A <- matrix(c(2, 1, 1, 1), nrow=2), where $\det(A) = 1$.

(A <- matrix (c (2, 1, 1, 1), nrow=2))
## [,1] [,2]
## [1,] 2 1
## [2,] 1 1
(AI <- inv (A))
## [,1] [,2]
## [1,] 1 -1
## [2,] -1 2

The areas of the two parallelograms are the same because $\det(A) = \det(A^{-1}) = 1$.
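
Note that the matrix in this second example is itself symmetric. As a further sketch, the matrix `N` below is a non-symmetric example chosen purely for illustration (it is not part of the original examples), again with base R's `solve()`. The determinant relation still holds, while the right-angle relation now pairs the rows of the inverse with the columns of `N` rather than its rows.

```r
N  <- matrix(c(2, 1,
               0, 1), nrow=2, byrow=TRUE)   # hypothetical non-symmetric example
NI <- solve(N)

det(N) * det(NI)            # 1, up to rounding error

# Row i of the inverse is orthogonal to column j of N for i != j,
# because NI %*% N is the identity matrix
drop(NI[2,] %*% N[,1])      # 0, up to rounding error
drop(NI[1,] %*% N[,2])      # 0, up to rounding error

# ... but not, in general, to the rows of N
drop(NI[2,] %*% N[1,])      # 1 here, not 0
```

For a symmetric matrix the rows and columns coincide, which is why the earlier examples could be described in terms of rows alone.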
