
Okay, so Justin?

Student: Yeah, I was just wondering, I guess I'm a little confused because it's like, okay, you have two classes of data. And you can say, "Okay, please draw me a line such that you maximize the distance between – the smallest distance that [inaudible] between the line and the data points." And it seems like that's kind of what we're doing, but it seems like this is more complicated than that. And I guess I'm wondering what the difference is.

Instructor (Andrew Ng): I see. So I mean, the question is [inaudible]: two classes of data – trying to find a separating hyperplane. And this seems more complicated than trying to find a line [inaudible]. So I'm just repeating the question in case – since I'm not sure how well the audio catches it.

So the answer is, this is actually exactly that problem. Given the two classes of data, positive and negative examples, this is exactly the formalization of the problem where the goal is to find a line that separates the two – the positive and negative examples – maximizing the worst-case distance between the [inaudible] point and this line.
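As a minimal sketch of the quantity being maximized – the code and the toy data here are illustrative, not from the lecture: for a candidate line (w, b), the worst-case geometric margin is the smallest distance from any training point to the line, and the problem asks for the (w, b) that makes this number as large as possible.

```python
import numpy as np

def worst_case_margin(w, b, X, y):
    """Smallest geometric margin over the training set.

    w, b : parameters of the hyperplane w^T x + b = 0
    X    : (m, n) array of inputs
    y    : (m,) array of labels in {-1, +1}
    """
    functional = y * (X @ w + b)                  # functional margins y_i (w^T x_i + b)
    return functional.min() / np.linalg.norm(w)   # divide by ||w|| to get true distances

# Hypothetical toy data: two linearly separable classes
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# The max-margin problem: choose (w, b) to maximize this quantity
print(worst_case_margin(np.array([1.0, 1.0]), 0.0, X, y))  # about 1.4142
```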

Okay? Yeah, [Inaudible]?

Student: So why do you care about the worst-case distance [inaudible]?

Instructor (Andrew Ng): Yeah, let me – for now, why do we care about the worst-case distance? For now, let's just say – let's just care about the worst-case distance for now. We'll come back, and we'll fix that later. Caring about the worst case is just a nice way to formulate this optimization problem. I'll come back, and I'll change that later.

Okay, raise your hand if this makes sense – if this formulation makes sense? Okay, yeah, cool.

Great. So let's see – this is just a different way of posing the same optimization problem. And on the one hand, I've gotten rid of this nasty, nonconvex constraint, while on the other hand, I've now added a nasty, nonconvex objective. In particular, this is not a convex function in the parameters w.

And so you can't – you don't have the usual guarantees, like that if you [inaudible] global minimum. At least that does not follow immediately from this, because this is nonconvex.
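For reference, the optimization problem being discussed at this point – reconstructed here, since the board work is not in the transcript – is

\[
\max_{\hat{\gamma},\, w,\, b} \ \frac{\hat{\gamma}}{\|w\|}
\quad \text{s.t.} \quad y^{(i)}\left(w^{T}x^{(i)} + b\right) \ge \hat{\gamma}, \quad i = 1, \dots, m,
\]

where \(\hat{\gamma}\) is the worst-case functional margin. The ratio \(\hat{\gamma}/\|w\|\) is the nonconvex objective in question.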

So what I'm going to do is – earlier, I said that you can pose any of a number of even fairly bizarre scaling constraints on w. So you can choose any scaling constraint like this, and things are still fine. And so here's the scaling I'm going to choose to add. Again, for the purposes of today's lecture, I'm gonna assume that these examples are linearly separable – that you can actually separate the positive and negative classes – and we'll come back and fix this later as well.
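As a quick numerical check that rescaling (w, b) is harmless – again an illustrative sketch with hypothetical data, not from the lecture:

```python
import numpy as np

# Hypothetical toy data, same as in the earlier sketch
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def geom_margin(w, b):
    # Worst-case geometric margin: min_i y_i (w^T x_i + b) / ||w||
    return (y * (X @ w + b)).min() / np.linalg.norm(w)

w, b, c = np.array([1.0, 1.0]), 0.0, 10.0

# Scaling (w, b) -> (c*w, c*b) multiplies every functional margin by c,
# but ||w|| also grows by c, so the geometric margin is unchanged.
print(geom_margin(w, b))          # about 1.4142
print(geom_margin(c * w, c * b))  # same value
```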

But here's the scaling constraint I want to impose on w. I want to impose the constraint that the functional margin is equal to 1. And another way of writing that is that I want to impose the constraint that \(\min_i\, y^{(i)}\left(w^{T}x^{(i)} + b\right) = 1\) – that in the worst case, the functional margin is equal to 1.

And clearly, this is a scaling constraint, because if you solve for w and b and you find that your worst-case functional margin is actually 10 or whatever, then by dividing w and b through by a factor of 10, I can get my functional margin to be equal to 1. So this is a scaling constraint [inaudible] would imply. And this is just more compactly written as follows: this is imposing the constraint that the functional margin be equal to 1.
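To spell out why this scaling is useful – a standard step, reconstructed here: with the worst-case functional margin fixed at 1, the objective \(\hat{\gamma}/\|w\|\) becomes \(1/\|w\|\), and maximizing \(1/\|w\|\) is equivalent to minimizing \(\|w\|^{2}\), which is convex:

\[
\min_{w,\, b} \ \frac{1}{2}\|w\|^{2}
\quad \text{s.t.} \quad y^{(i)}\left(w^{T}x^{(i)} + b\right) \ge 1, \quad i = 1, \dots, m.
\]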





Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4