Evaluate machine learning predictions effectively by interpreting accuracy scores and detailed classification reports. Understand precisely how precision, recall, and F1 scores reveal your model's strengths and weaknesses.
Key Insights
- The KNN model's accuracy was evaluated at 97%, indicating only one incorrect prediction out of 30 test cases (see the setup sketch after this list).
- A detailed classification report from sklearn.metrics showed perfect precision and recall for the setosa category, but highlighted mild inaccuracies distinguishing between versicolor and virginica.
- The classification report provided critical evaluation metrics, including precision, recall, and F1 score, helping to better understand the model's predictive performance.
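For context, here is a minimal sketch of the kind of setup the transcript below assumes. The variable names (knn_model, X_test, y_test) and the split parameters are assumptions inferred from the narration, not shown in the lesson itself:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset and hold out 30 of the 150 samples for testing
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Fit a KNN classifier on the training portion
knn_model = KNeighborsClassifier()
knn_model.fit(X_train, y_train)
```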
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Let's check our score a couple of different ways. First, accuracy: out of all the predictions we made, how many were correct? We can get that by calling knn_model.score. And it looks like, ah, we need to give it some data to score.
Missing two required positional arguments, X and y, indeed. In order to score it, we need to give it the testing data. Here's the X_test data.
Make your predictions based on that, and then here are the answers: tell me how many we got right. And that's pretty good, 97%.
So that means we only missed 3%, which with 30 test cases means we got exactly one wrong: 29 correct out of 30 rounds to 97%.
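A minimal sketch of that scoring call, assuming the knn_model, X_test, and y_test names from the setup sketch above:

```python
# score() predicts labels for X_test internally and compares them
# to y_test, returning mean accuracy: correct / total predictions
accuracy = knn_model.score(X_test, y_test)
print(accuracy)  # 29 of 30 correct comes out to roughly 0.97
```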
We could sit here and eyeball the predictions to figure out which one it is. I'm tempted to do that, but we definitely got one of them wrong, and we can see it better if we get a classification report. It'll tell us what we missed.
If you remember, we talked about precision and recall. Precision: when we guessed a category, how often was that guess right? Recall: out of all the samples that actually were that category, how often did we identify them correctly? We can get all of that, plus the F1 score, which is the harmonic mean of precision and recall, using the classification report.
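sklearn.metrics also exposes these metrics individually, which is one way to see the definitions in action. A sketch, again assuming the names from the setup above (the predictions variable is introduced here for illustration):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

predictions = knn_model.predict(X_test)

# Precision: of the times we predicted a class, how often were we right?
print(precision_score(y_test, predictions, average=None))
# Recall: of the samples that truly were that class, how many did we catch?
print(recall_score(y_test, predictions, average=None))
# F1: the harmonic mean of precision and recall, per class
print(f1_score(y_test, predictions, average=None))
```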
That's a function given to us by sklearn.metrics. Let's make a report: we'll call classification_report and pass it the actual answers and our model's predictions. And also, just to make this easier for us to read, we'll give it the iris data's target names, and then we'll print that report.
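Putting that together, a sketch of the call the transcript describes (the report variable name is an assumption):

```python
from sklearn.metrics import classification_report

# Actual answers first, then the model's predictions; target_names maps
# the numeric labels 0/1/2 to readable species names in the output
report = classification_report(
    y_test, predictions, target_names=iris.target_names
)
print(report)
```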
And here it is. We can see that we had perfect precision and recall on setosa, but we got a little bit wrong on versicolor and virginica. We'll dive into that more in the next video.