Instructor (Andrew Ng): Question about this?

Interviewee: [Inaudible].

Instructor (Andrew Ng): Which ones? This one?

Interviewee: No, the line.

Instructor (Andrew Ng): Oh, this one. Oh, yes. Thank you. [Inaudible]. So let's talk about how to actually implement this now. Okay. How many of you know what an SVD or singular value decomposition is? Wow, that's a lot of you. That's a lot more than I thought. Curious. Did you guys learn it as undergrads or as graduate students?

All right. Let me talk about it anyway. I wasn't expecting so many of you to know what SVD is, but I want to get this on tape, just so everyone else can learn about this, too. So I'll say a little bit about how to implement PCA. The problem I was alluding to just now was that when you have these very high-dimensional vectors, then sigma is a large matrix.
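For reference, sigma here is presumably the empirical covariance matrix defined earlier in the lecture for the normalized training examples x_i:

```latex
\Sigma \;=\; \frac{1}{m}\sum_{i=1}^{m} x_i\, x_i^{\top} \;\in\; \mathbb{R}^{n\times n}
```

Its size is n by n, where n is the dimension of each input, no matter how many training examples m you have.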

In particular, for our text example, if the vectors x_i are 50,000-dimensional, then the covariance matrix will be 50,000 by 50,000, which is much too big to represent explicitly. I guess many of you already know this, but I'll just say it anyway. It turns out there's another way to implement PCA, which is this: if A is any N by N matrix, then one of the most remarkable results of linear algebra is that the matrix A can be decomposed into a singular value decomposition.

What that means is that the matrix A, which is N by N, can always be decomposed into a product of three matrices, U times D times V transpose. U is N by N, D is a square matrix, which is N by N, and V is also N by N. D is going to be diagonal, with zeros on the off-diagonals, and the values sigma_i on its diagonal are called the singular values of the matrix A.
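Written out, the decomposition being described is:

```latex
A \;=\; U\,D\,V^{\top},
\qquad
D \;=\;
\begin{pmatrix}
\sigma_1 & & \\
 & \ddots & \\
 & & \sigma_N
\end{pmatrix}
```

with U, D, and V all N by N, and the nonnegative values sigma_1, ..., sigma_N on the diagonal of D being the singular values.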

Almost all of you said you learned this as a graduate student, rather than as an undergrad, so it turns out that when you take a class in undergraduate linear algebra, usually you learn a bunch of decompositions. You usually learn about the QR decomposition, maybe the LU factorization of matrices. Most undergrad courses don't get to talk about singular value decompositions, but in almost everything I do in machine learning, you actually find that you end up using SVDs much more than any of the decompositions you learned in a typical undergrad linear algebra class.

So personally, I [inaudible] an SVD dozens of times in the last year, but as for LU and QR decompositions, I think I used the QR decomposition once and an LU decomposition in the last year. So let's see. I'll say a bit more about this. So I'm going to draw the picture, I guess.

For example, if A is an N by N matrix, it can be decomposed into another matrix, U, which is also N by N, the same size; D, which is N by N; and another square matrix, V transpose, which is also N by N. Furthermore, in a singular value decomposition, the columns of the matrix U will be the eigenvectors of A A transpose, and the columns of V will be the eigenvectors of A transpose A.
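This relationship between the SVD and the two eigenvector problems is easy to check numerically. Here is a minimal NumPy sketch (the small random matrix is just illustrative data, not something from the lecture):

```python
import numpy as np

# A small random square matrix standing in for the N-by-N matrix A in the lecture.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# Singular value decomposition: A = U @ np.diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)
V = Vt.T

# Columns of U are eigenvectors of A A^T, and columns of V are eigenvectors
# of A^T A; in both cases the eigenvalues are the squared singular values.
print(np.allclose(A @ A.T @ U, U * s**2))   # True: (A A^T) u_i = sigma_i^2 u_i
print(np.allclose(A.T @ A @ V, V * s**2))   # True: (A^T A) v_i = sigma_i^2 v_i
```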

To compute it, you just use the SVD command in Matlab or Octave. Today, the state of the art in numerical linear algebra is that SVDs, singular value decompositions of matrices, can be computed extremely [inaudible]. We've used packages like Matlab or Octave to compute, say, the eigenvectors of a matrix; SVD routines are even more numerically stable than eigenvector routines for finding the eigenvectors of a matrix. So you can safely use a routine like this, similar to the way you use a square root command without thinking about how it's computed.
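To connect this back to PCA, below is a minimal NumPy sketch of the same computation, standing in for the Matlab/Octave svd command he mentions. Taking the SVD of the m-by-n design matrix, rather than forming the n-by-n covariance matrix, is the standard trick; treating that as the point being set up here is an assumption about where the lecture is heading.

```python
import numpy as np

def pca_via_svd(X, k):
    """Top-k principal components of the rows of X (m examples, n features),
    computed from an SVD of the design matrix rather than the n-by-n covariance.

    Sketch of the standard trick; assumes X has already been mean-centered
    (and variance-normalized, if desired) as described earlier in the lecture."""
    # Economy-size SVD: X = U * diag(s) * Vt, with Vt of shape (min(m, n), n).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Rows of Vt (columns of V) are eigenvectors of X^T X, i.e. the directions
    # of maximal variance; s**2 / m are the corresponding variances.
    return Vt[:k].T  # n-by-k matrix whose columns are the principal components

# Example: 100 examples in 50,000 dimensions. The covariance matrix would be
# 50,000 x 50,000, but the SVD of the 100 x 50,000 design matrix is cheap.
X = np.random.randn(100, 50_000)
X -= X.mean(axis=0)
components = pca_via_svd(X, k=10)  # shape (50000, 10)
```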

Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4