Master slicing DataFrames confidently by understanding the difference between Pandas' iloc and loc methods. Clear up common indexing confusion to streamline your data analysis process.
Key Insights
- Use negative indexing with `iloc[-3:]` to dynamically select the last three rows or columns, preventing errors when new data is added.
- Remember that `iloc` slicing excludes the stop index, requiring an incremented value (e.g., use `154:157` to select rows 154–156), whereas `loc` includes the stop index (e.g., use `154:156`).
- With `loc`, leverage column names (e.g., `'fuel efficiency':'power perf factor'`) for clearer, more readable DataFrame slicing, and calculate row positions dynamically using `len(cars.index) - 3` to ensure robustness.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Let's take a look at the solution for this challenge. We're going to get the last three rows and the last three columns, and let's do it with iloc first. We can say cars.iloc. Especially if you're used to regular Python programming and do that a lot, it's very easy to forget the .iloc or the .loc. And you may see me even make that mistake at some point.
But I remembered this time. Okay, so with .iloc, we're going to give the values for the last three rows first. And they are rows 154 to 156.
However, if we write 154 to 156, that's going to be exclusive, remember, with iloc. We actually want to say 154 to 157, which is, of course, not even a row. But don't include 157, we say.
Up to, but not including. That should be rows 154, 155, and 156. Now, for the columns, there are 16 columns in this.
And we want to say that we want columns 14, 15, and 16. We can say 14, 15, and 16, that will be 14, up to, but not including, 17, because it's not a column. So don't include it.
So that's how we can say that. Let's see if that looks like what we expect it to. It doesn't.
We've made a mistake. That's fantastic. So what mistake have we made? We've got the three rows, but we do not have the three columns.
We only have two. And that's because I, like many people working with indexes, have made an error. There are 16 columns, but the last one is 15.
They are 13, 14, and 15. Because computers start counting at zero. So zero is the first one, 15 is the last one, not 16.
With 16 columns, the last index is, again, 15. We can fix this. It's a very easy mistake to make.
We actually want columns 13, 14, and 15. We're going to say I want 13 to 16. So that'll be columns 13, 14, and 15.
Let's try that again. There we go. We now have just what we were expecting to get.
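The off-by-one slip and its fix can be sketched as follows. This is a minimal illustration, not the lesson's actual notebook: the `cars` DataFrame here is a hypothetical stand-in with the same shape (157 rows, 16 columns), and every column name except `fuel efficiency` and `power perf factor` is a placeholder.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's cars data: 157 rows, 16 columns.
# Only "fuel efficiency" and "power perf factor" come from the lesson;
# the other column names are placeholders.
columns = [f"col_{i}" for i in range(13)] + [
    "fuel efficiency", "latest launch", "power perf factor"
]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Off by one: with 16 columns the last position is 15, so 14:17 only
# reaches positions 14 and 15. An out-of-range slice stop is clipped
# silently, so no error is raised.
too_few = cars.iloc[154:157, 14:17]

# Corrected: rows 154-156 and columns 13-15 (the stop value is excluded).
last_three = cars.iloc[154:157, 13:16]

print(too_few.shape)     # (3, 2)
print(last_three.shape)  # (3, 3)
```

Because the slice does not raise an error when the stop runs past the end, checking the shape of the result is a quick way to catch this kind of mistake.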
Now, I didn't do that on purpose. I wish I had. That'd be cool.
But it's very easy to make those off-by-one errors. I mean, it's called the off-by-one error. There's an official name for that kind of error.
It's very easy to make those off-by-one errors with iloc. There's one way we can fix this. We can make this a little easier.
One way is making these numbers more semantic. In order to use hard-coded numbers, we have to know what the last three row numbers and the last three column numbers are. And, in fact, again, I got that wrong.
It also means that if we change the data, let's say we add another car model, well, suddenly this is no longer the last three. With one more row, the last three would be 155, 156, and 157, not 154, 155, and 156. So adding a row or adding a column would throw off this, quote-unquote, give me the last three.
So, however, we can write something a little more semantic, a little more human meaningful. Instead of hard coding it, we can say, okay, well, whatever the last three are, whatever those numbers are. So, here's how we can do that.
Instead of this, or more semantically, we can say cars.iloc negative three until the end. If you leave off the number after the colon, it assumes, oh, until the end then. So, this means from the third to last on.
And we can do the same thing for column, from index negative three as in the third from last on. Now, if we add more columns or rows, we won't run into any problems. This will still give us always the last three rows, last three columns.
And it's also a little more obvious what we're doing and harder to get wrong. I want the last three, negative three. Let's try running that and confirm it's still the same result.
Still the same result. Okay. Let's do the loc version.
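Here is a small sketch of why the negative-index form holds up when data is added. It assumes a hypothetical `cars` DataFrame of the same shape as the one in the lesson (157 rows, 16 columns, placeholder column names), and it simulates "adding another car model" by appending a copy of the first row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's cars data (placeholder column names).
columns = [f"col_{i}" for i in range(13)] + [
    "fuel efficiency", "latest launch", "power perf factor"
]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# On the original data, both forms select the same block.
hard_coded = cars.iloc[154:157, 13:16]
relative = cars.iloc[-3:, -3:]
print(hard_coded.equals(relative))  # True

# Simulate adding one more car: append a copy of the first row.
bigger = pd.concat([cars, cars.iloc[[0]]], ignore_index=True)

# The hard-coded slice no longer points at the last three rows...
print(list(bigger.iloc[154:157].index))  # [154, 155, 156]
# ...while the negative slice still does.
print(list(bigger.iloc[-3:].index))      # [155, 156, 157]
```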
Now, for this one, I'm going to create a new code block so we can continue to compare it, compare the values to this one. I didn't do this with the last one, which meant that up here, like, we could only output one of them unless we print. So, I'm going to make a new code block and say now using loc.
And let's take a look at what that would look like. We could say cars.loc and say, okay, I want 154 to 156. Now, we still have to use numbers because our rows just don't have names.
But for the columns, we can make this a lot easier. Fuel, oh, and of course, this is inclusive. So, it's 154 up to and including 156.
Notice we needed 154 to 157 in the iloc version. All right. So, the columns we want are fuel efficiency to power perf factor.
Power perf factor. Inclusive. So, we can say fuel efficiency to power perf factor.
And we should get the same result. Although, you know, always worth checking. We did.
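As a sketch of the comparison being made here, the snippet below puts the loc and iloc versions side by side. The `cars` DataFrame is a hypothetical stand-in with a default 0-to-156 row index; only the two column names quoted in the lesson are real, the rest are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's cars data (placeholder column names).
columns = [f"col_{i}" for i in range(13)] + [
    "fuel efficiency", "latest launch", "power perf factor"
]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# loc slices by label and includes the stop label on both axes.
by_label = cars.loc[154:156, "fuel efficiency":"power perf factor"]

# iloc slices by position and excludes the stop position.
by_position = cars.iloc[154:157, 13:16]

print(list(by_label.index))          # [154, 155, 156]
print(by_label.equals(by_position))  # True
```

The row labels look like positions only because the index is the default integer range; with a different index, the loc row slice would need that index's own labels.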
We did get the exact same result. Now, we could write this to be a little more like the other one. A little more like negative three.
However, because we're dealing with row labels and not positions anymore, 154 to 156 are the names of these rows, not really numbers anymore. Just like these are the names for these columns. We can't quite do the negative three there.
But if we want to do that, we could say I want the length of cars dot index. Cars dot index is the rows of cars. The index numbers.
The length of that list minus three. So, that would be 154. This way, if we add another one to the end, that would still be the one we'd want to start with, if we still want to get the last three.
And then we could add a colon to say, from that row on. So, from the third to last on to the end.
Similar to this. And then I'm going to copy and paste for the next part. I want still these columns.
All right. Let's see if we get the same result. Looks good.
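The dynamic loc version described above might look like the sketch below. It assumes a hypothetical `cars` DataFrame with the default integer index running from 0 to 156 (placeholder column names apart from the two quoted in the lesson). The approach relies on that default index: if the row labels were not 0, 1, 2, and so on, `len(cars.index) - 3` would not be the label of the third-to-last row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lesson's cars data (placeholder column names).
columns = [f"col_{i}" for i in range(13)] + [
    "fuel efficiency", "latest launch", "power perf factor"
]
cars = pd.DataFrame(np.arange(157 * 16).reshape(157, 16), columns=columns)

# Compute the starting row label instead of hard-coding 154.
start = len(cars.index) - 3
print(start)  # 154

# Leave off the stop label to run to the end; columns are still named.
last_three = cars.loc[start:, "fuel efficiency":"power perf factor"]

print(list(last_three.index))                     # [154, 155, 156]
print(last_three.equals(cars.iloc[-3:, -3:]))     # True
```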
That's the solution for this challenge. I hope that was good practice for your skills in slicing Pandas DataFrames.