Background Information

Diagonalization 

Before we dive into SVD, it helps to recall matrix diagonalization. A square n x n matrix A with n linearly independent eigenvectors can be factored as

A = PDP^{-1}

where P is the matrix whose columns are the eigenvectors of A and D is a diagonal matrix holding the corresponding eigenvalues, here sorted in descending order. If A is also symmetric, its eigenvectors can be chosen to be orthonormal, which makes P an orthogonal matrix. We can then take advantage of the property that the inverse of an orthogonal matrix equals its transpose.

P^{-1} = P^T

This allows us to redefine our diagonalization equation as

A = PDP^T
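To make this concrete, here is a minimal NumPy sketch. It assumes A is symmetric, which is what guarantees orthonormal eigenvectors. The example matrix and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical symmetric example matrix (symmetry gives orthonormal eigenvectors).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, P = np.linalg.eigh(A)

# Reorder so the eigenvalues descend, matching the convention used in the text.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
P = P[:, order]
D = np.diag(eigenvalues)

# P is orthogonal, so its inverse should equal its transpose.
inverse_matches_transpose = np.allclose(np.linalg.inv(P), P.T)

# Rebuild A from its factors, using the transpose in place of the inverse.
A_rebuilt = P @ D @ P.T
print(inverse_matches_transpose, np.allclose(A, A_rebuilt))
```

Both checks should come back true for any real symmetric matrix. For a non-symmetric matrix, the transpose shortcut generally does not hold and the explicit inverse would be needed.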

Singular Value Decomposition

Now suppose we want to work with non-square matrices, e.g. A is an m x n matrix. Can we still diagonalize it? Sure! We can use Singular Value Decomposition, or SVD for short. To do this, we first need to “squarify” A. We can accomplish this by multiplying A by its transpose, resulting in

AA^T or A^TA

Proof of Squareness:

AA^T:

[m x n] [n x m] = [m x m]

which is a square m x m matrix.

A^TA:

[n x m] [m x n] = [n x n]

which is a square n x n matrix.
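The same shape check can be run in a few lines of NumPy. This is only a quick sketch; the dimensions m = 4 and n = 3 and the names left and right are arbitrary choices for illustration.

```python
import numpy as np

m, n = 4, 3  # hypothetical dimensions, chosen only for illustration
A = np.arange(m * n, dtype=float).reshape(m, n)

left = A @ A.T    # [m x n] [n x m] -> [m x m]
right = A.T @ A   # [n x m] [m x n] -> [n x n]

print(left.shape, right.shape)  # (4, 4) (3, 3)
```

Both products are also symmetric, which is what lets us pick orthonormal eigenvectors for them.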

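From now on, to simplify explanations, let U be the matrix whose columns are the orthonormal eigenvectors of AA^T, and V the matrix whose columns are the orthonormal eigenvectors of A^TA. Both AA^T and A^TA are symmetric, so U and V are orthogonal, and they play roles similar to the P and P^T terms. But what about D? Well, SVD also has a term similar to it: Σ. Σ is the [m x n] matrix whose diagonal holds the square roots of the nonzero eigenvalues shared by AA^T and A^TA, in descending order, padded with zeros everywhere else. The values on its diagonal are called singular values.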

Multiplying U, Σ, and the transpose of V allows us to reconstruct A.

A = UΣV^T
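Here is a small sketch of that reconstruction using NumPy's np.linalg.svd, which returns the singular values as a 1-D array and V already transposed. The random 4 x 3 matrix and the variable names are assumptions made for illustration. The sketch also checks that the singular values match the square roots of the eigenvalues of A^TA.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # hypothetical m x n matrix with m = 4, n = 3

# full_matrices=True gives U as [m x m] and Vt (V transposed) as [n x n].
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Build the padded [m x n] Sigma with the singular values on its diagonal.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# Reconstruct A from the three factors.
A_rebuilt = U @ Sigma @ Vt

# Singular values should equal the square roots of the eigenvalues of A^T A.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]            # descending order
singular_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))  # clip tiny negatives

print(np.allclose(A, A_rebuilt), np.allclose(s, singular_from_eig))
```

In practice a library SVD routine is preferable to forming A^TA explicitly, since squaring the matrix tends to lose numerical precision. The eigenvalue check here is only meant to confirm the relationship.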

A visual representation of this can be found below

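Writing u_i and v_i for the i-th columns of U and V, σ_i for the i-th singular value, and r for the rank of A, the spectral decomposition of this is

A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + … + σ_r u_r v_r^T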

You’re probably wondering, what is the significance of the spectral decomposition? Well, each successive term contributes a piece of matrix A, and because the singular values are sorted in descending order, the earlier terms contribute the most. Now suppose matrix A represents an image. We can apply SVD and acquire the spectral decomposition. The more terms we use, the closer we get to reconstructing the original image. But we don’t necessarily have to use all of the terms to get back a visually similar image. We can in fact use fewer terms and get back something that is compressed and pretty hard for us to distinguish from the original. Pretty cool, right?!
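The idea of keeping only the first few terms can be sketched as follows. The helper rank_k_approximation is a hypothetical name, and a random array stands in for a grayscale image, so treat this as an illustration of the mechanics, not a full compression pipeline.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Sum the first k terms sigma_i * u_i * v_i^T of the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scaling the first k columns of U by s[:k] and multiplying by the
    # first k rows of Vt is equivalent to summing k rank-one terms.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
image = rng.random((64, 48))  # stand-in for a 64 x 48 grayscale image

# Reconstruction error (Frobenius norm) for an increasing number of terms.
ks = (1, 5, 20, 48)
errors = [np.linalg.norm(image - rank_k_approximation(image, k)) for k in ks]

for k, err in zip(ks, errors):
    print(f"k = {k:2d}  error = {err:.4f}")
```

The error shrinks as k grows and is essentially zero once all 48 terms are used. Storing k terms takes roughly k(m + n + 1) numbers instead of m x n, which is where the compression comes from. Random noise compresses poorly, so a real photograph would typically look close to the original at a much smaller k than this stand-in suggests.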