Evaluate your classification model's effectiveness by examining accuracy and error types. Understand the nuances of true positives, true negatives, false positives, and false negatives to enhance predictive performance.
Key Insights
- The overall accuracy of the evaluated classification model is 77%, indicating a relatively strong predictive capability on approximately 3,000 test samples.
- Sample assessments revealed variation in prediction accuracy across small slices of the test set, ranging from 90% correct (2 incorrect out of 20 samples) down to 75% correct (5 incorrect out of 20 samples).
- The article distinguishes clearly between the types of prediction errors: false negatives (predicted to stay but left) and false positives (predicted to leave but stayed), highlighting the importance of analyzing errors for improving model performance.
Let's run the same evaluation we ran on the linear regression and see how we did. First, let's just take a look at our predictions: model, go predict based on the test data.
I'll save that as predictions. Then I want to print a list version of it, but there are going to be about 3,000 entries here, and we don't want to print them all out.
So make a list of y_test and give me the first 20, and then do the same thing for our predictions: give me the first 20.
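Here is roughly what that step looks like in code. This is a minimal sketch: the names model, X_test, and y_test are assumed from the earlier train/test split, and the exact variable names in the lesson's notebook may differ.

```python
# Predict on the held-out test data (assumes a fitted scikit-learn classifier
# called `model`, plus `X_test` and `y_test` from an earlier train/test split).
predictions = model.predict(X_test)

# There are roughly 3,000 test samples, so only peek at the first 20 of each.
print(list(y_test[:20]))       # actual answers: 0 = stayed, 1 = left
print(list(predictions[:20]))  # what the model predicted for the same rows
```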
And they're not perfectly matched, but in this case we got almost all of them right. All these zeros are stays.
The top list is the actual answers. The third one actually left, and we did not predict that they left. Another one stayed, and we instead predicted they left.
So that's two wrong out of 20, which is 90%. That's pretty good. Now let's take a look at samples 20 to 40.
Looking at the next 20, we didn't do quite as well. Here we got one wrong.
We got another one wrong, here at number five. And there were three more that left that we didn't catch at all.
So that's five wrong out of 20 in that case, which is only 75%. But these are tiny samples: just 20 at a time out of about 3,000.
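As a rough sketch of that comparison, again assuming the y_test and predictions names from above, counting the misses in the next slice of 20 looks something like this:

```python
import numpy as np

# Compare the next slice of 20 samples (indices 20 through 39) and count misses.
# Assumes `y_test` and `predictions` from the step above.
actual = np.asarray(y_test)[20:40]
guessed = np.asarray(predictions)[20:40]
print(list(actual))
print(list(guessed))
print("wrong:", int((actual != guessed).sum()))  # e.g. 5 wrong out of 20 -> 75%
```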
Let's get an actual score. Our accuracy score, in this case, is not about guessing the mean, and it's not about getting numbers closer or further away.
It's just: how many predictions did you get right out of how many predictions you made? So again, this batch would be 75% because we missed five out of 20, and the previous one would be 90% because we only missed two out of 20.
Let's take a look. Also, I think the math I just did was wrong, but that's why we have computers.
Okay, so we're going to call model.score and pass it the test data and the answers from the test data. What does this evaluate to? Overall we got 77%. That's not bad.
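For a scikit-learn classifier, score reports exactly that fraction of correct predictions. Here is a minimal sketch, assuming the same model, X_test, y_test, and predictions names from above:

```python
import numpy as np

# For a scikit-learn classifier, .score() returns plain accuracy:
# the fraction of test predictions that match the true labels.
accuracy = model.score(X_test, y_test)
print(accuracy)  # roughly 0.77 in this run

# The same number computed by hand: correct predictions / total predictions.
manual = (np.asarray(predictions) == np.asarray(y_test)).mean()
print(manual)
```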
It's pretty good. Next we're going to analyze exactly what we got wrong, because even though that's a pretty good score overall, we're going to see that there's some real variance in what we got right and what we got wrong.
So let's take a look. One way to be correct: we predicted they stayed, and they did.
That's ones like this. The third one, we predicted they stayed, and yeah, that was right.
The other way to be correct: we predicted they left, and they did. There's actually none of those in this sample, so let me undo this and run it again.
I thought there were some; maybe not. Let's try samples 40 to 60 and see if we can find one of those. Let's see: one, two, three, four, five.
One, two, three, four, five, six. Nope, that one was wrong.
I think this one is right. One, two, three, four, five, six. Yeah, this one we predicted they left, and they did.
When they match, that's a correct prediction. Predicting zero when it was actually zero is called a true negative.
Predicting one when it was actually one is a true positive. We didn't exactly guess; we estimated it was a one, and it was.
We predicted it. Now, there are two different kinds of errors, and this will be important. The first: we predicted they stayed, but they left.
We predicted zero, but it was one. Here's one of those, where we predicted they stayed, but they left.
That's a false negative. We said, nope, they didn't leave, and actually they did.
We predicted negative, but that prediction was false. Then there's the other direction: we predicted they left, but they actually stayed.
That's a false positive. Counting along: one, two, three, four, five, six. Our sixth one here, we predicted they left,
one, two, three, four, five, six, but they actually stayed. That's a false positive.
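Putting these four outcomes together: a standard way to count true negatives, false positives, false negatives, and true positives all at once is scikit-learn's confusion_matrix. This is just a sketch for reference, assuming the same y_test and predictions from above; the lesson may tally these a different way:

```python
from sklearn.metrics import confusion_matrix

# Rows are the actual labels, columns are the predictions (0 = stayed, 1 = left):
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_test, predictions)
print(cm)

tn, fp, fn, tp = cm.ravel()
print("true negatives: ", tn)  # predicted stayed, actually stayed
print("false positives:", fp)  # predicted left,   actually stayed
print("false negatives:", fn)  # predicted stayed, actually left
print("true positives: ", tp)  # predicted left,   actually left
```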
We'll be analyzing these errors next. Overall it's a good score, but we got some wrong. Which ones did we get wrong in general, and which ones did we get right? Let's take a look at those more advanced evaluations in a moment.