Unpack how image pixel data fuels machine learning models in recognizing handwritten digits. Gain a deeper understanding of how grayscale arrays translate into recognizable numeric forms.
Key Insights
- The dataset contains 60,000 images, each represented as a 28 by 28 pixel array, totaling 784 grayscale values per image, with intensities ranging from 0 (black) to 255 (white).
- Each image corresponds to a handwritten digit, labeled clearly to train the neural network for accurate digit recognition.
- Data is stored in NumPy arrays, allowing visualization in platforms like Jupyter Notebook, where numerical arrays can be directly interpreted as visual pixel images for analysis.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Okay, let's talk about these training images. They have a shape of (60000, 28, 28), meaning there are 60,000 rows of data, and each row is a 28 × 28 array. Each of those arrays represents a handwritten digit as a grid of pixels: 28 pixels across, 28 pixels down.
Each of those values, and there are 784 of them per image (28 times 28), is an integer from 0 to 255. That's the grayscale range: 0 is all the way black, 255 is all the way white, and everything in between is a shade of gray.
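The shapes described above can be sketched like this. The lecture's loading code isn't shown here, so this uses a zero-filled stand-in array in place of the real dataset:

```python
import numpy as np

# Stand-in for the real training set: 60,000 images,
# each a 28x28 grid of grayscale values from 0 (black) to 255 (white).
training_images = np.zeros((60000, 28, 28), dtype=np.uint8)

print(training_images.shape)  # (60000, 28, 28): 60,000 rows, each a 28x28 array
print(28 * 28)                # 784 values per image
print(training_images.dtype)  # uint8: integers in the range 0..255
```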
And we're gonna use those 60,000 images to train our neural network machine learning model. Let's take a look at one image. Let's print out the type of training images at index zero.
Let's print out the number of dimensions it has. And finally, let's just output the whole thing: training images at index zero. All right, let's run all that. It's a NumPy array, it's got two dimensions because it's 28 × 28, and it printed out like this.
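Those inspection steps might look like the following sketch, with a random-valued stand-in in place of training images at index zero (the real values come from the dataset):

```python
import numpy as np

# Stand-in for training_images[0]: random grayscale values 0..255.
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

print(type(image))  # <class 'numpy.ndarray'>
print(image.ndim)   # 2, because it's a 28 x 28 array
image               # in a notebook cell, this shows the array's repr
```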
Now, if I print it the regular way, print training images zero, I get the full array. There's a square bracket at the front that isn't closed until the square bracket at the end here, as you can see. Everything in between is the image, one row per line.
The first row of pixels is all black, the second is all black, the third is all black, and so on for the first bunch of rows. But then we finally start to see some lighter pixels appearing, row after row, and they start to make a little bit of a shape here.
If we just output the image again without saying print, our Jupyter Notebook will interpret that as a request to render it, and it makes a little 28 × 28 pixel image. And right around the fifth row or so, where we had started to see lighter values, there's a little white at the end of the row.
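The exact display mechanism the notebook uses isn't shown here; one common way to render such an array as a picture is matplotlib's `imshow`. A minimal sketch, assuming matplotlib is installed:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

# Stand-in 28x28 image; the real one would be training_images[0].
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

plt.imshow(image, cmap="gray", vmin=0, vmax=255)  # 0 -> black, 255 -> white
plt.axis("off")
plt.savefig("digit.png")  # in a notebook, plt.show() instead
```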
We'll visualize this in different ways as we go, but this is what it is: each of those little dots is one entry in the 28 × 28 array, a value from 0 to 255, and together they make the image visible.
Let's take a look: the image is a five. Great, we can see it's a five. Let's check the digit by looking at the training label.
So the image has an answer, right? What's training labels at zero? We print it: five. And we can in fact look at the first 10 values, from index zero up to but not including 10.
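Indexing and slicing the labels works like any NumPy array. A sketch, using the commonly cited first ten MNIST training labels as a stand-in (verify against your own copy of the dataset):

```python
import numpy as np

# Stand-in for training_labels; in MNIST the first label is 5
# and the second is 0, matching the images shown in the lecture.
training_labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=np.uint8)

print(training_labels[0])     # the answer for the first image: 5
print(training_labels[0:10])  # slice from index 0 up to but not including 10
```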
And there they are: our first 10 labels, one for each image. And we can see the handwritten version of the next one, which is a zero, by just outputting it as an image here.
And there it is. So our machine learning model is going to get all those lists of lists, and it has to look at all these numbers that we know represent pixels of digits. It doesn't really know that; it has to be able to say, okay, that looks like a zero to me. Let's explore further how we're going to do that.