Singular Value Decomposition

This is the first in a short series of posts on an application of matrix theory to the mechanics of continuous media.  I’m planning my Vector and Tensor Analysis course for the winter quarter, and wanted to include a discussion of stress in elastic media as an application of tensors interesting to the engineering students.  Identifying the prerequisite material needed for this topic and to what extent I can expect the students to have been exposed to it through our curriculum has been an interesting exercise.  The physical interpretation of the stress tensor (which I will define in a later post) comes through the real polar decomposition of a matrix.  The real polar decomposition can be deduced from the singular value decomposition which can be deduced from the real spectral theorem.  Here’s how that last step goes.

Theorem. (Singular Value Decomposition)  Every m\times n real matrix A can be factored as A=USV^T where U is an $m\times m$ orthogonal matrix, S is an m\times n matrix with non-negative real numbers on the diagonal (called the singular values of A) and V is an n\times n orthogonal matrix.

With an understanding of the spectral theorem and how to interpret matrix multiplication one can actually derive a procedure for computing this factorization.  The important things to remember are:

  • if M is an m\times n matrix then M^TM is a symmetric n\times n matrix and MM^T is a symmetric $m\times m$ matrix because of the two properties of the transpose that (CD)^T=D^TC^T and (C^T)^T=C;
  • what it means for a matrix W to be orthogonal is that W is square and W^TW=I=WW^T, i.e., W is invertible and W^T=W^{-1};
  • the condition W^TW=I also says that the columns of W are orthonormal vectors since the (i,j)-entry of the product W^TW is precisely the dot product of row i of W^T (i.e., column i of W) with column j of W;
  • if D=[d_{ij}] is a diagonal matrix of size m\times m then DM is the result of multiplying the rows of M by the corresponding diagonal entries of D, while if D is diagonal of size $n\times n$ then MD is the result of multiplying the columns of M by the corresponding entries of D.

Let’s suppose for a moment that A=USV^T with U, S, and V as in the statement of the theorem.  Then

A^TA=(USV^T)^T(USV^T)=VS^TU^TUSV^T=VS^TSV^T\Rightarrow A^TAV=VS^TS

and

AA^T=(USV^T)(USV^T)^T=USV^TVS^TU^T=USS^TU^T\Rightarrow AA^TU=USS^T

by the orthogonality of U and V.  Due to the structure of S, the matrix S^TS is an n\times n diagonal matrix whose first \ell=\min\{m,n\} diagonal entries are the squares of the \ell diagonal entries of S.  Likewise, the matrix SS^T is an m\times m matrix whose first \ell diagonal entries are the squares of the \ell diagonal entries of S (in the same order).  The matrix equation (A^TA)V=V(S^TS) then says  that the columns of V are eigenvectors of A^TA.  Similarly, the equation (AA^T)U=U(SS^T) then says that the columns of U are eigenvectors of AA^T.

Fortunately, the real spectral theorem guarantees that the eigenvectors of the symmetric matrices A^TA and AA^T are orthogonal, and thus can be chosen to be orthonormal.  If u is an normalized eigenvector of AA^T with associated eigenvalue \lambda (i.e., AA^Tu=\lambda u) then \lambda=\lambda u^Tu=u^T\lambda u=u^T(AA^Tu)=(A^Tu)^T(Au)\ge 0 being the dot product of a vector with itself.  Similarly, the eigenvalues of A^A are non-negative.  So it is certainly possible that the non-negative diagonal entries of SS^T and S^TS are the eigenvalues of AA^T and A^TA, respectively.  But, it must be the case that the i^{th} eigenvalue of AA^T is equal to the i^{th} eigenvalue of A^TA for 1\le i\le \ell in order for our factorization to work since both must be equal to the square of the i^{th} diagonal entry of S.  This is where a clever observation saves the day.  The matrix A converts eigenvectors of A^TA into eigenvectors of AA^T!  Indeed, if v is an eigenvector of A^TA with eigenvalue \lambda, then

AA^T(Av)=A(A^TAv)=A\lambda v=\lambda (Av).

Hence, if Av\not=0 then it is an eigenvalue of AA^T with the same eigenvalue \lambda!  Of course, if Av=0 but v is non-zero (being an eigenvector of A^TA) then v\in\ker A and so the associated eigenvalue of A^TA is zero.  Likewise, if u is an eigenvector of AA^T and A^Tu is non-zero then A^Tu is an eigenvector of A^TA with the same eigenvalue.  It follows that at most \ell of the eigenvalues of AA^T and A^TA are nonzero and, in fact, \ell of them are the same.  The square roots of these \ell joint eigenvalues of A^TA and AA^T are called the singular values of A.

The upshot is that we can compute a set of normalized eigenvectors for AA^T and order them in descending order of their associated eigenvalues (so that those in the kernel occur at the end) and use them to form the columns of the orthogonal matrix U.  We can compute a set of normalized eigenvectors for A^TA and order them in descending order of their associated eigenvalues to form the columns of the orthogonal matrix V.  Then, if S is the m\times n matrix whose non-diagonal entries are zero but with diagonal entries equal to the singular values of A (also listed in descending order) we are guaranteed that A=USV^T.  This constructs the singular value decomposition of A.

About these ads

Posted on July 20, 2012, in Matrix Theory and tagged , , , . Bookmark the permalink. 1 Comment.

  1. A quick couple of things you probably already know:

    1) there are some java demos out there illustrating svd used to do image compression. Nice eye-candy!

    2) also, svd gives a great way to visualize linear transformations from R^n to R^n–i.e. the vectors in some priveleged o.n. basis are first scaled individually (the ones scaled by 0 span the kernel) and then the resulting orthogonal set is mapped isometrically into the target (i.e. by rotation/reflection if target = source)

    Finally, a couple of my honors alg students last year took on the project of programming svd from scratch. The business of finding evals/evecs of a largish symmetric matrix was pretty enlightening.

    -Jeff

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 144 other followers

%d bloggers like this: