More Properties of Matrices - Day 17 of ML

Jun 24, 2024
7 min read

Time spent: 4h
Total: 45h/10000h


I’m reading through my linear algebra book, and this is a brief overview of the things I covered today.

LU Decomposition

We briefly talked about upper and lower triangular matrices on Day 10 of ML. Today we’ll look a bit further into how they can be used to solve systems. We’ll also be going over matrix inverses and transposes.

NOTE: A prerequisite for reading this section would be having read Day 10 of ML, since that’s where we go over the principles of the upper and lower triangular matrices.

Following the principles of Day 10 of ML, we know that the lower triangular matrix L is used for forward elimination, and the upper triangular matrix U is used for back-substitution. With forward elimination, we go from b to c, and with back-substitution we go from c to x.

The system Ax = b can be split into two parts: Lc = b and Ux = c. Knowing that A = LU, we can multiply the second equation by L to get LUx = Lc, and since Lc = b, this is simply Ax = b.

Elimination algorithms tend to work exactly this way. The algorithm can be broken into two steps:

  1. factoring (finding L and U from A), and
  2. solving (finding x using L and U).
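Here’s a rough sketch of those two steps in NumPy. This is my own toy version without row swaps, so it assumes every pivot is nonzero (real libraries like SciPy’s `lu_factor` handle pivoting for you); the example system is made up for illustration:

```python
import numpy as np

def lu_decompose(A):
    """Step 1 (factoring): Doolittle LU factorization, no pivoting."""
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float).copy()
    for k in range(n):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]    # multiplier used to eliminate U[i, k]
            U[i, :] -= L[i, k] * U[k, :]   # zero out the entry below the pivot
    return L, U

def lu_solve(L, U, b):
    """Step 2 (solving): Lc = b by forward elimination, then Ux = c by back-substitution."""
    n = len(b)
    c = np.zeros(n)
    for i in range(n):                     # forward: go from b to c
        c[i] = b[i] - L[i, :i] @ c[:i]
    x = np.zeros(n)
    for i in reversed(range(n)):           # backward: go from c to x
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[2.0, 1.0], [4.0, 5.0]])
b = np.array([3.0, 6.0])
L, U = lu_decompose(A)
x = lu_solve(L, U, b)   # solves Ax = b
```

Factoring once and then reusing L and U is exactly why this split matters: for many right-hand sides b, step 1 happens once and only the cheap step 2 repeats.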

The Permutation Matrix P

What if you have a system of equations that looks something like this:

\begin{bmatrix} 0 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}

We run into a problem: the first pivot is zero, so no multiple of row 1 can eliminate the element a_{21} = 3. We can remedy this by swapping rows 1 and 2, which gives us a normal-looking system again. To express this in matrix terms, we use the permutation matrix P that swaps the rows. It looks like this:

P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad PA = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 0 & 2 \end{bmatrix}

Note that we also have to swap the rows in b. The new system is PAx = Pb. For a matrix with n rows, there are n! = (n)(n-1)\dots(1) possible row permutations.
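In NumPy, the row swap is just a matrix product (the values in b below are made up for illustration):

```python
import numpy as np

A = np.array([[0.0, 2.0], [3.0, 4.0]])   # zero pivot in the top-left corner
P = np.array([[0.0, 1.0], [1.0, 0.0]])   # permutation matrix that swaps rows 1 and 2
b = np.array([2.0, 7.0])                 # made-up right-hand side

PA = P @ A                               # rows swapped: the pivot 3 is back on top
x = np.linalg.solve(PA, P @ b)           # solve the fixed system PAx = Pb
```

Because P only reorders the equations, the solution x of PAx = Pb also satisfies the original Ax = b.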

Inverses

The inverse of some matrix A exists when there is a matrix A^{-1} that satisfies the property A^{-1}A = I. If you multiply by A first and then multiply by A^{-1}, you get back to where you started. The inverse of an n \times n matrix is another n \times n matrix.

Inverse matrices are useful for solving systems. If A is invertible, the one and only solution to Ax = b is x = A^{-1}b. Let’s prove this by multiplying Ax = b by A^{-1}:

A^{-1}Ax = A^{-1}b \implies x = A^{-1}b

Here are some things to note about inverses:

  1. The inverse exists only for nonsingular matrices.
  2. A matrix has at most one inverse.
  3. A 2 \times 2 matrix is invertible only if its determinant is not zero.
  4. The inverse of a product of two invertible matrices is (AB)^{-1} = B^{-1}A^{-1}. A similar rule holds for N invertible matrices. Consider a set of matrices S = \{A_i\}_{i=1}^{N}. The inverse of the product of all these matrices is:
    \left(\prod\limits_{i=1}^N{A_i}\right)^{-1} = \prod\limits_{i=N}^1{A_i^{-1}}.
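We can check the reversed-order rule numerically with two small made-up invertible matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # det = -2, so invertible
B = np.array([[2.0, 0.0], [1.0, 2.0]])   # det = 4, so invertible

lhs = np.linalg.inv(A @ B)                 # (AB)^-1
rhs = np.linalg.inv(B) @ np.linalg.inv(A)  # B^-1 A^-1 -- note the reversed order
```

The intuition for the reversal: to undo "apply B, then apply A", you have to undo A first, then B.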

Calculating Inverses with the Gauss-Jordan Method

The Gauss-Jordan Method is a simple procedure for computing the inverse of some matrix A. Here are the steps on how it works:

  1. Combine A into one matrix with the identity matrix I (also known as the augmented matrix [A \vert I]).
  2. Reduce A to upper triangular form, so we end up with [U \vert L^{-1}].
  3. Create zeros above the pivots as well and divide each row by its pivot to form the identity matrix. This results in [I \vert U^{-1}L^{-1}] = [I \vert A^{-1}].
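The steps above can be sketched as a small NumPy function. This is my own toy version; it also throws in partial pivoting (the row swaps from the permutation section) so it doesn’t die on a zero pivot:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing the augmented matrix [A | I] into [I | A^-1]."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])        # the augmented matrix [A | I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting: pick the largest pivot
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]              # swap rows (a permutation step)
        M[col] /= M[col, col]                          # scale the pivot row so the pivot is 1
        for row in range(n):
            if row != col:                             # clear the column above and below the pivot
                M[row] -= M[row, col] * M[col]
    return M[:, n:]                                    # the right half is now A^-1

A = np.array([[1.0, 2.0, 1.0],
              [3.0, 4.0, 1.0],
              [1.0, 2.0, 2.0]])
A_inv = gauss_jordan_inverse(A)
```

In practice you’d rarely form an explicit inverse; solving Ax = b directly via LU is cheaper and more stable. But as a way to see elimination, pivots, and permutations working together, Gauss-Jordan is hard to beat.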

Let’s see an example (ChatGPT-generated) for this. This will be the matrix for our example:

A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 4 & 1 \\ 1 & 2 & 2 \end{bmatrix}

We start with the augmented matrix [A \vert I]:

\left[\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 3 & 4 & 1 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \\ \end{array}\right]

Step 1: We first eliminate the entries below the first pivot (subtract 3 times row 1 from row 2, and row 1 from row 3).

\left[\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & -2 & -2 & -3 & 1 & 0 \\ 0 & 0 & 1 & -1 & 0 & 1 \\ \end{array}\right]

Step 2: Use the third row to zero out the third column above it (subtract row 3 from row 1, and add 2 times row 3 to row 2).

\left[\begin{array}{ccc|ccc} 1 & 2 & 0 & 2 & 0 & -1 \\ 0 & -2 & 0 & -5 & 1 & 2 \\ 0 & 0 & 1 & -1 & 0 & 1 \\ \end{array}\right]

Step 3: Normalize the second row by multiplying it by -\frac{1}{2}.

\left[\begin{array}{ccc|ccc} 1 & 2 & 0 & 2 & 0 & -1 \\ 0 & 1 & 0 & 2.5 & -0.5 & -1 \\ 0 & 0 & 1 & -1 & 0 & 1 \\ \end{array}\right]

Step 4: Clear the second column in row 1 by subtracting 2 times row 2 (row 3 already has a zero there).

\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -3 & 1 & 1 \\ 0 & 1 & 0 & 2.5 & -0.5 & -1 \\ 0 & 0 & 1 & -1 & 0 & 1 \\ \end{array}\right]

The right side of the augmented matrix now represents the inverse of A:

A^{-1} = \begin{bmatrix} -3 & 1 & 1 \\ 2.5 & -0.5 & -1 \\ -1 & 0 & 1 \\ \end{bmatrix}

The Transpose Of a Matrix

Fortunately, this idea is much simpler than the inverse. The transpose A^T of a matrix A essentially just swaps the rows and columns of the matrix. Yes, it is as simple as it sounds. Take a look:

\text{If } A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \text{, then } A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}

As with inverses, the product of matrix transposes follows a similar rule. The transpose of a product of two matrices is (AB)^{T} = B^{T}A^{T}.

Again, consider a set of matrices S = \{A_i\}_{i=1}^{N}. The transpose of the product of all these matrices is:

\left(\prod\limits_{i=1}^N{A_i}\right)^{T} = \prod\limits_{i=N}^1{A_i^{T}}.
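A quick numeric check of the reversed-order rule with made-up rectangular matrices (note that A^T B^T wouldn’t even have compatible shapes here, which is a nice hint of why the order must flip):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # 2x3
B = np.arange(12.0).reshape(3, 4)      # 3x4

lhs = (A @ B).T    # (AB)^T: the 2x4 product, transposed to 4x2
rhs = B.T @ A.T    # B^T A^T: 4x3 times 3x2, also 4x2
```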

Symmetric matrices

A symmetric matrix A is one that possesses the following property: A^T = A. Here are some examples of symmetric matrices:

A = \begin{bmatrix} 1 & 3 \\ 3 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & -1 & 4 \\ -1 & 3 & 5 \\ 4 & 5 & 6 \end{bmatrix}, \qquad C = \begin{bmatrix} 7 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 9 \end{bmatrix}
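A quick NumPy check of the A^T = A property (the matrix M below is made up; M + M^T and M^T M are standard ways to build a symmetric matrix from any square M):

```python
import numpy as np

A = np.array([[1, 3], [3, 2]])
is_symmetric = np.array_equal(A, A.T)  # the defining test: A equals its transpose

M = np.array([[1.0, 2.0],
              [5.0, 7.0]])             # an arbitrary, non-symmetric square matrix
S = M + M.T                            # symmetric: (M + M^T)^T = M^T + M
G = M.T @ M                            # also symmetric: (M^T M)^T = M^T M
```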

Conclusion

That was that. Not sure what I’ll be doing in the upcoming days. Perhaps I’ll look into something like decision trees and just keep going with linear algebra. I really want to get into neural networks and stuff but I want to have a great linear algebra base before I do that.