Matrix Inverse
Reading time: 16 mins.Preamble
It's worth noting that when a matrix is orthogonal, meaning a matrix in which the scale is 1, it can easily be inverted by just taking its transpose. Matrix transposition is explained in the lesson on Geometry. An orthogonal matrix \(Q\) has the property that its transpose \(Q^T\) is equal to its inverse \(Q^{-1}\), i.e., \(Q^TQ = QQ^T = I\), where \(I\) is the identity matrix. This property significantly simplifies the inversion process for orthogonal matrices, as no complex computations are required beyond transposition.
You should only need to invert the matrix using one of the most common methods (Gauss-Jordan, explainedi in this lesson, and the Adjoint), which is more computationally expensive than a simple transpose, whenever your matrix is non-orthogonal.
Gauss-Jordina Matrix Inversion Method
In this lesson, we will demonstrate how to derive the inverse of a matrix using the Gauss-Jordan (or reduced row) elimination method. Finding the inverse of a matrix implies several prerequisites, including the matrix being invertible. To be invertible, a matrix must be a square matrix (having an equal number of rows and columns, such as 2x2, 3x3, 4x4, etc.), and its determinant must be non-zero (for more on matrix determinants, refer to Geometry: Matrix Operations). A matrix with a zero determinant is considered singular, which means it is not invertible.
Before we begin, it’s important to note that a matrix can represent a system of linear equations solvable using row elementary operations. These operations preserve the set of solutions represented by the matrix. There are three types of row elementary operations:
-
Operation 1: Swapping two rows of the matrix.
-
Operation 2: Multiplying each element of a row by a non-zero constant \(k\). Here, 'multiplying' also encompasses division—for example, multiplying by \(\frac{1}{2}\) is the same as dividing by 2. The constant \(k\) can be positive or negative, such as 2 or -2.
-
Operation 3: Replacing a row with the sum of itself and a multiple of another row. For example, adding the elements of row 4, possibly scaled by some constant \(k\), to those of row 1:
The above operations are illustrated in the following examples, which include row switching, row multiplication, and row addition:
In the second example, we multiply all coefficients of row 1 by 1/2 (or equivalently, divide them by 2), and the coefficients of row 3 by 2. In the final example, we add the coefficients of row 1 to those of row 2.
The Gauss-Jordan elimination method aims to use these elementary row operations to transform the 4x4 matrix on the left side of the augmented matrix into the identity matrix (we say that \(M\) is row-reduced), as shown below. We start by augmenting \(M\) by the identity matrix and then obtain the inverse of \(M\) by applying the same row operations to the 4x4 identity matrix on the right side of the augmented matrix.
The Gauss-Jordan elimination method proceeds as follows:
-
Begin with two matrices: the matrix \(M\) that we want to invert and another matrix, which is initially set to the identity matrix (the Matrix class constructor defaults to the identity matrix). Our matrix \(M\) is then augmented.
-
Apply row elementary operations to convert the original matrix \(M\) into the identity matrix, a process known as row reduction. Perform these same operations on the identity matrix in parallel.
-
If the reduction of \(M\) to the identity matrix is successful, the identity matrix will be transformed into \(M^{-1}\), the inverse of \(M\). You can then choose to return \(M^{-1}\) (in the method named
inverse
below) or copy the inverted matrix into \(M\) (in the method calledinvertIt
below).
Matrix44& Matrix44::invertIt() { Matrix44 inv; // start from identity matrix ... *this = inv; return *this; } Matrix44 Matrix44::inverse(const Matrix44& M) const { Matrix44 temp(M); temp.invertIt(); return temp; }
Let's examine how this process is carried out.
Step 1: Set The Row So That The Pivot Is Different From Zero
The coefficients forming the diagonal of the matrix are known as the pivots of the matrix.
As mentioned earlier, the objective of the matrix inversion process is to utilize row elementary operations to adjust the pivot of each column to 1 and all other coefficients to 0 (by the end of this process, we will obtain the identity matrix). To accomplish this, the most effective approach is to row-reduce each column one after another, starting from the left. Proceeding from left to right ensures that as we progress from column to column, we do not alter the values of columns that have already been row-reduced.
First, we examine the value of the pivot coefficient for the current column (the column being processed). This value must be different from zero. If it is not zero, we can directly move to step two. If it is zero, we need to find another row in the matrix for this column with a non-zero value and swap these two rows (the first of the permitted row elementary operations). However, if more than one row has a non-zero value, which one should we choose? We will select the row with the highest absolute value.
In the example on the right, we are examining the pivot coefficient of the second column. Because the value of that coefficient is zero, we swap row 2 with the row of the second column whose absolute value is the greatest (in this example, row 3).
If we can't find a suitable value for the pivot coefficient (i.e., if all other values in the column are zero), then the matrix cannot be inverted and is considered singular.
Upon identifying a valid pivot coefficient for the current column we are processing, we can proceed to the next step.
Matrix4<T>& invertIt() { Matrix4<T> inv; for (uint32_t column = 0; column < 4; ++column) { // Swap row in case our pivot is zero if (m[column][column] == 0) { uint32_t big = column; for (uint32_t row = 0; row < 4; ++row) if (fabs(m[row][column]) > fabs(m[big][column])) big = row; // Indicate singular matrix, return identity? if (big == column) fprintf(stderr, "Singular matrix\n"); // Swap rows else for (uint32_t j = 0; j < 4; ++j) { std::swap(m[column][j], m[big][j]); std::swap(inv[column][j], inv[big][j]); } } ... } }
Step 2: Eliminate All Numbers Under the Diagonal Element
The Gauss-Jordan elimination method, at this step, processes each column from left to right.
In this step, we first row-reduce the coefficients (in other words, eliminate the numbers) of each column below the pivot coefficient, which we will not alter for now. We process in order: \(M_{21}\) (we skip \(M_{11}\) since it's the pivot coefficient), \(M_{31}\), \(M_{41}\), then \(M_{32}\) (we skip \(M_{22}\) since it's the pivot coefficient), \(M_{42}\), and finally \(M_{43}\). Note that we can skip the last column since there is no coefficient under \(M_{44}\). We are about to eliminate the numbers in green in the matrix below.
$$ \begin{bmatrix} \text{pivot} & M_{12} & M_{13} & M_{14}\\ \color{green}{M_{21}} & \text{pivot} & M_{23} & M_{24}\\ \color{green}{M_{31}} & \color{green}{M_{32}} & \text{pivot} & M_{34}\\ \color{green}{M_{41}} & \color{green}{M_{42}} & \color{green}{M_{43}} & \text{pivot} \end{bmatrix} $$Remember, we can only use row elementary operations, meaning you can modify a single coefficient through a series of operations, applying the same set of operations to all other coefficients in the same row. If not done this way, the solution set of the matrix is not preserved. This condition is the basis for row-reduction. You can multiply all coefficients in a row by the same constant, swap rows, and add the coefficients of one row to those of another.
Let's illustrate this with the first column. Starting with the first coefficient \(\textcolor{red}{M_{21}}\) (ignoring the pivot coefficient \(M_{11}\) for now), let's label it as ext{Joe}. Suppose ext{Joe}'s value is 2. The objective is to make ext{Joe} 0 using one or a combination of the three possible row elementary operations. To eliminate ext{Joe}, we might subtract the value 2. How? Consider the coefficient \(\textcolor{blue}{M_{11}}\), above ext{Joe}, equal to 4 and also the pivot coefficient of column 1, which ext{Joe} belongs to.
$$ \begin{bmatrix} \textcolor{blue}{M_{11}} & M_{12} & M_{13} & M_{14}\\ \textcolor{red}{M_{21}} & M_{22} & M_{23} & M_{24}\\ M_{31} & M_{32} & M_{33} & M_{34}\\ M_{41} & M_{42} & M_{43} & M_{44} \end{bmatrix} \;=\; \begin{bmatrix} \textcolor{blue}{4} & x & x & x\\ \textcolor{red}{2} & x & x & x\\ x & x & x & x\\ x & x & x & x \end{bmatrix} \;=\; \begin{bmatrix} \textcolor{blue}{\text{pivot}} & x & x & x\\ \textcolor{red}{\text{Joe}} & x & x & x\\ x & x & x & x\\ x & x & x & x \end{bmatrix} $$Operation 3 allows us to add the coefficients of any matrix row to our current row's coefficients. Using the coefficient of the row above ext{Joe}, \(\textcolor{blue}{M_{11}}\) (equal to 4), results in 6:
$$2 + 4 = 6$$That differs from our goal. However, operation 2 states we can also multiply the coefficients of a row by a constant. Imagine this constant is -1. Multiplying \(\textcolor{blue}{M_{11}}\) by this constant gives:
$$2 + -1 \times 4 = -2$$We're closer to 0 but not there yet. Suppose our constant was \(-\frac{1}{2}\) or \(-\frac{2}{4}\):
$$2 - \frac{2}{4} \times 4 = 0$$Now, we've canceled the value of the coefficient \(\textcolor{red}{M_{21}}\), making it 0. It's noteworthy that 2 is the value of the coefficient \(\textcolor{red}{M_{21}}\), Joe itself, and 4 is the coefficient value of \(\textcolor{blue}{M_{11}}\), the pivot coefficient of the first column, to which Joe belongs.
Remember, the operations applied to Joe must also apply to all coefficients in the row Joe belongs to. This rule underpins the use of row-elementary operations. Hence, every coefficient on Joe's row must undergo the same treatment:
$$ \begin{array}{l} \textcolor{red}{M_{21}} &-=& k \times M_{11},\\ M_{22} &-=& k \times M_{12},\\ M_{23} &-=& k \times M_{13},\\ M_{24} &-=& k \times M_{14}. \end{array} $$With:
$$k = -\frac{\textcolor{red}{M_{21}}}{\textcolor{blue}{M_{11}}}$$Let's now generalize:
Here's the revised text with corrected grammar and markdown formatting preserved:
-
For every coefficient \(M_{ij}\) in the matrix below the pivot \(M_{ii}\), where \(i\) represents the coefficient's row and \(j\) its column, proceed in order through the columns first and then, for each column, from top to bottom.
-
Create a constant \(k\) that consists of \(M_{ij}\) divided by the pivot coefficient of the row above \(M_{ij}\), \(i-1\): \(M_{(i-1)(i-1)}\). If processing \(M_{21}\) (row 2, column 1), then use the pivot coefficient of the first row, \(M_{11}\). If processing \(M_{32}\) (row 3, column 2), then use the pivot coefficient \(M_{22}\) (row 2), and so on.
-
Negate that number to get \(k\).
-
Add the coefficients from row \(j\) multiplied by \(k\) to the coefficients of row \(i\) (the row of \(M_{ij}\)). This is permissible under operation 3.
-
\(M_{ij}\) will then equal 0.
It's important to loop through the columns first (\(j\)) and for each column, process the coefficients below the pivot from top to bottom (\(i\)), skipping the column's pivot coefficient (\(i == j\)).
By the end of this process, all coefficients below the diagonal should be 0, excluding the pivot coefficients.
The corresponding C++ code is as follows:
// Go left to right. There's no need to process the last column // since there's no coefficient under pivot m[3][3]. for (uint32_t column = 0; column < 3; ++column) { // Make each coefficient in the column = 0. Proceed top to bottom. for (uint32_t row = column + 1; row < 4; ++row) { // This is our constant k. T constant = m[row][column] / m[column][column]; // Process each coefficient on the current row. for (uint32_t j = 0; j < 4; ++j) { m[row][j] -= constant * m[column][j]; inv[row][j] -= constant * inv[column][j]; } // Set the element to 0 for safety. m[row][column] = 0; } }
Note how the same operations are applied to the right-hand side of the augmented matrix. This step is known as the forward substitution step. The matrix should now appear as follows:
$$ \begin{bmatrix} \text{pivot} & x & x & x\\ 0 & \text{pivot} & x & x\\ 0 & 0 & \text{pivot} & x\\ 0 & 0 & 0 & \text{pivot} \end{bmatrix} $$Step 3: Scale All The Pivot Coefficients To 1
Now, the pivot coefficients can be set to 1. This requires scaling the coefficients by their value. Similarly to the approach in step 2, we will loop over each row and each column, dividing the current coefficient by the pivot coefficient value for the current row. Remember to apply the same operation to the inverted matrix.
for (uint32_t row = 0; row < 4; ++row) { float divisor = m[row][row]; for (uint32_t column = 0; column < 4; ++column) { m[row][column] /= divisor; inv[row][column] /= divisor; } }
The matrix should now be represented as follows:
$$ \begin{bmatrix} 1 & x & x & x\\ 0 & 1 & x & x\\ 0 & 0 & 1 & x\\ 0 & 0 & 0 & 1 \end{bmatrix} $$Step 4 (and Final): Eliminate All Numbers Above the Diagonal
We are now tasked with eliminating the numbers above the diagonal. This process is simpler than in Step 2. Here, we will subtract the coefficient itself from the one we wish to eliminate (reduce to 0). As in Step 2, this can only be done using one of the three possible row operations. We will employ operation 3: replacing the row with the sum of its elements and another row's elements (multiplied by some constant). For the constant, we will choose the coefficient we are eliminating (the one we referred to as ext{Joe} previously). If this coefficient (ext{Joe}) is \(M_{ij}\), where \(i\) is the row index and \(j\) the column index, we will add the coefficients of row \(j\) multiplied by \(-M_{ij}\) to the coefficients of row \(i\). In code, this translates to the following:
// For each row for (uint32_t row = 0; row < 4; ++row) { // For each coefficient above the diagonal for this row for (uint32_t column = row + 1; column < 4; ++column) { T constant = m[row][column]; for (uint32_t k = 0; k < 4; ++k) { m[row][k] -= m[column][k] * constant; inv[row][k] -= inv[column][k] * constant; } // In case of a round-off error m[row][column] = 0.f; } }
This step is sometimes referred to as backward substitution. We step through the rows and then through the columns, processing only the elements of the matrix that are above the diagonal. This approach can be applied regardless of whether you process the columns from left to right (as in our example) or right to left. The math holds because we have already set the matrix elements below the diagonal to 0 (which is why this method isn't applicable in Step 2).
Finally, we synchronize the current matrix with the result of our right inside matrix (inv
in the code). We have successfully inverted the matrix using the Gauss-Jordan elimination technique.
Here is the complete code for the method:
Matrix4 Matrix4::Inverse() const { Matrix4 s; Matrix4 t(*this); // Forward elimination for (uint32_t i = 0; i < 3; i++) { // Step 1: Choose a pivot uint32_t pivot = i; float pivotSize = t[i][i]; if (pivotSize < 0) pivotSize = -pivotSize; for (uint32_t j = i + 1; j < 4; j++) { float tmp = t[j][i]; if (tmp < 0) tmp = -tmp; if (tmp > pivotSize) { pivot = j; pivotSize = tmp; } } if (pivotSize == 0) { return Matrix4(); } if (pivot != i) { for (uint32_t j = 0; j < 4; j++) { float tmp; tmp = t[i][j]; t[i][j] = t[pivot][j]; t[pivot][j] = tmp; tmp = s[i][j]; s[i][j] = s[pivot][j]; s[pivot][j] = tmp; } } // Step 2: Eliminate all the numbers below the diagonal for (uint32_t j = i + 1; j < 4; j++) { float f = t[j][i] / t[i][i]; for (uint32_t k = 0; k < 4; k++) { t[j][k] -= f * t[i][k]; s[j][k] -= f * s[i][k]; } // Set the column value to exactly 0 in case // numeric round-off left it a very tiny number t[j][i] = 0.f; } } // Step 3: Set elements along the diagonal to 1.0 for (uint32_t i = 0; i < 4; i++) { float divisor = t[i][i]; for (uint32_t j = 0; j < 4; j++) { t[i][j] = t[i][j] / divisor; s[i][j] = s[i][j] / divisor; } // Set the diagonal to 1.0 exactly to avoid // possible round-off error t[i][i] = 1.f; } // Step 4: Eliminate all the numbers above the diagonal for (uint32_t i = 0; i < 3; i++) { for (uint32_t j = i + 1; j < 4; j++) { float constant = t[i][j]; for (uint32_t k = 0; k < 4; k++) { t[i][k] -= t[j][k] * constant; s[i][k] -= s[j][k] * constant; } t[i][j] = 0.f; // In case of round-off error } } return s; }
What's Next?
The other popular method for matrix inversion is often referred to as the adjoint method or the method involving the adjugate (adjoint) matrix and determinants. This method is particularly useful for finding the inverse of a square matrix, provided that the matrix is non-singular (i.e., it has a non-zero determinant). The process is somewhat more straightforward conceptually than the Gauss-Jordan elimination method but can be more computationally intensive, especially as the size of the matrix increases. This method will be explored in a future revision of this lesson.