Evaluate the accuracy of your linear regression models clearly and effectively. Learn how model scores measure improvement over simple mean-based predictions.
Key Insights
- The model's accuracy score of approximately 69% indicates its predictions are significantly better than merely guessing the mean value (29.47) for every data point.
- An accuracy score in linear regression does not represent the percentage of exact predictions, but rather how much better the model performs compared to baseline predictions using the mean.
- A negative accuracy score is possible and indicates that the model's predictions are worse than a simple mean-based approach.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Let's find out exactly how accurate our predictions were. What we're going to use this time is dot score. Every one of these models has a dot score method that tells you what the accuracy score is.
Now the score method will use different measurements depending on what you're doing. With a linear regression we get the accuracy score. We'll talk about what accuracy is in a second.
We could say score equals model dot score. And what we give it is the x-test and the y-test. We're giving it the answers and saying, how'd you do? How do your predictions, the ones from running the x-test data through the model, compare to the y-test, the actual answers? And now we can print out the score.
And it's not bad. It's actually very good. It's about 69%.
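Here's a minimal sketch of what that whole step looks like, assuming a scikit-learn-style workflow. The dataset and variable names below are made up for illustration; in scikit-learn, a regressor's `.score` returns the R² (coefficient of determination), which is the "accuracy" being discussed here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data: one feature with a roughly linear relationship plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X[:, 0] + 5 + rng.normal(0, 4, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)

# Run X_test through the model and compare its predictions
# to the actual answers in y_test.
score = model.score(X_test, y_test)
print(score)  # the lesson's model scored about 0.69; this made-up data will score differently
```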
So what does that mean? It doesn't mean that 69% of them were correct. And that's important to know because, honestly, I bet 0% were exactly correct, right? We're talking about continuous values, which means nailing a prediction to the decimal point would be very rare. So it certainly doesn't mean that.
So what is this 69% measuring? It's measuring how much better our model's predictions were than those of a model that just predicted the mean. A model that says: look at all these values, average them out, and guess that average every time. And actually, we can get that mean ourselves. That's pretty easy.
We can say y-test is a list, so we add up all the y-test values and divide by how many y-test values there are. That's the mean.
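As a quick sketch, assuming y_test is the list or array of actual test values from before:

```python
# The baseline "model": just guess the average of the actual values every time.
mean_y = sum(y_test) / len(y_test)
print(mean_y)  # about 29.47 in the lesson's data
```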
So 29.47. If our model just guessed 29.47 for every single one of these, it's just like, yeah, they're all around there. What's this one? 29.47. What do you predict given this set of features? 29.47. If it guessed that for every single one, its score would be zero, because the score is comparing it against exactly that baseline.
A score of zero means it's no better or worse than just guessing the mean. And it's actually possible to have a negative score here. What that means is your model is worse than just guessing the mean every time.
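That behaviour comes straight from how the score is defined: one minus the model's squared error divided by the squared error of the mean-only baseline. Here's a small sketch using scikit-learn's r2_score, with made-up actual values chosen so they average to 29.47 like the lesson's data:

```python
import numpy as np
from sklearn.metrics import r2_score

y_test = np.array([25.0, 31.0, 28.5, 33.0, 29.85])  # made-up actuals, mean = 29.47
mean_y = y_test.mean()

# Baseline: guess the mean for every single data point -> score of exactly 0.
baseline_preds = np.full_like(y_test, mean_y)
print(r2_score(y_test, baseline_preds))  # 0.0

# Predictions that miss badly do worse than the baseline -> negative score.
bad_preds = np.full_like(y_test, mean_y + 50)
print(r2_score(y_test, bad_preds))  # a large negative number
```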
But we're not there. We're, in fact, way better than that: 69% better than just guessing the mean every time. And that's actually really good. That's a very good score.
So next we'll talk about how to make that better.