Standard Deviation and the Bell Curve

Understand how standard deviation measures variations from the average, and visualize its role in common scenarios like human height. This article clarifies the bell curve distribution and explains the practical meaning behind standard deviations.

Key Insights

Standard deviation quantifies how much individual values in a data set vary from the mean. In a normal distribution, approximately 68.2% of values fall within one standard deviation of the mean.
The normal distribution, commonly called a bell curve, represents how data clusters around the mean, with progressively fewer data points at greater standard deviations, capturing 95.4% within two standard deviations.
The concept of standard deviation is demonstrated through real-world examples such as the height of white American males, illustrating that most heights cluster closely around the mean of five-foot-nine.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Let's talk about standard deviation. It's a measure of how far values are from the mean. We usually measure from the mean because the important question is: mathematically, what is the middle of our values, and how much do things deviate from it?

There's a distribution called the normal distribution, so named because it's very common. In a normal distribution, most values cluster around the mean, and the distribution is symmetrical. It's also called a bell curve.

What does that mean? On a bell curve we have a few outliers, and most things are right in the middle, so we see a big rise in the curve toward the mean.

Now, the standard deviation is a measurement of how the values vary from the mean. In a normal distribution, one standard deviation on either side of the mean encompasses 68.2% of the overall values, so about two-thirds of all values fall within one standard deviation.


And that's how we read a standard deviation: what difference, plus or minus from the mean, encompasses that percentage of values? Then we have two standard deviations: take that difference between the middle and our value, again plus or minus, and double it.

With two standard deviations, 95.4% of our values fall within that range. With three standard deviations, we're getting almost all of our values, about 99.7%. At that point, only about three out of 1,000 values will be more than three standard deviations from the mean.
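As a quick check on those percentages, here is a minimal sketch that computes the share of a normal distribution within one, two, and three standard deviations of the mean. It uses the standard identity that this share equals erf(k / sqrt(2)). The function name `fraction_within` and the dictionary `coverage` are illustrative names, not part of the lesson's notebook. Note that the exact one-standard-deviation figure is closer to 68.27%, which the lesson quotes as 68.2%.

```python
import math


def fraction_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    # For a normal distribution, P(|X - mean| < k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


# Coverage for one, two, and three standard deviations.
coverage = {k: fraction_within(k) for k in (1, 2, 3)}

for k, share in coverage.items():
    outside_per_thousand = (1 - share) * 1000
    print(f"within {k} sd: {share:.2%}  (about {outside_per_thousand:.1f} per 1,000 outside)")
```

The last column shows how many values out of 1,000 land outside each range, which is where the "about three out of 1,000" figure for three standard deviations comes from.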

So let's take a look at what this looks like. If we execute this cell, it'll load an image for us from our Google Drive. This is a visualization of the bell curve we've been talking about, so called because it's vaguely bell-shaped.

The 68.2% here is these two middle sections: half of it to the left of the mean and half to the right. The vertical lines mark how many standard deviations we are from the mean. The symbol for standard deviation is sigma, the Greek letter σ. Then a smaller share of values, about 27%, falls between one and two standard deviations away.

And a very small amount is more than two standard deviations away: the last four and a half percent or so. Of that, the last 0.1% or so on each side is more than three standard deviations away: the extreme outliers. You see this kind of distribution all the time.
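The sections of the bell curve described above can be reproduced with a short sketch. It assumes a standard normal distribution (mean 0, standard deviation 1) and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper `band_share` and the `bands` dictionary are illustrative names chosen here, not something from the lesson's notebook.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1, so positions are in units of sigma.
standard_normal = NormalDist(mu=0, sigma=1)


def band_share(lower: float, upper: float) -> float:
    """Share of values falling between `lower` and `upper` standard deviations."""
    return standard_normal.cdf(upper) - standard_normal.cdf(lower)


# Shares on ONE side of the mean; the other side mirrors these by symmetry.
bands = {
    "0 to 1 sigma": band_share(0, 1),
    "1 to 2 sigma": band_share(1, 2),
    "2 to 3 sigma": band_share(2, 3),
    "beyond 3 sigma": 1 - standard_normal.cdf(3),
}

for label, share in bands.items():
    print(f"{label}: {share:.1%} on each side, {2 * share:.1%} in total")
```

Because the curve is symmetrical, the four one-sided shares add up to one half, and doubling each one gives the totals quoted in the text.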

Human height is a good example. For, let's say, white men in America, the average height is five foot nine. You see white men in America tightly clustered around that height, within two or three inches of it.

Fewer people in this subset of the population are shorter than five foot six. So five foot six might be one standard deviation below the mean, which would make a standard deviation three inches. I'm estimating here.

Going the other direction, six feet is the same deviation from five foot nine. So if three inches is a standard deviation, most people, about two-thirds, would be between five foot six and six feet. And then you have your more extreme outliers: very short people over on this side and your basketball players over on this side.
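To put numbers on the height example, here is a small sketch that models heights as a normal distribution with a mean of 69 inches (five foot nine) and a standard deviation of 3 inches. Both figures come from the lesson, and the 3 inches is the instructor's own rough estimate, so treat the output as illustrative rather than as real population statistics. The variable and function names are assumptions made for this example.

```python
from statistics import NormalDist

# Figures from the lesson: mean 5 ft 9 in = 69 in; standard deviation of 3 in is an estimate.
MEAN_INCHES = 69
SD_INCHES = 3
heights = NormalDist(mu=MEAN_INCHES, sigma=SD_INCHES)


def feet_and_inches(total_inches: int) -> str:
    """Format a whole number of inches as feet and inches."""
    feet, inches = divmod(total_inches, 12)
    return f"{feet} ft {inches} in"


# One standard deviation either side of the mean: 66 in and 72 in.
low = MEAN_INCHES - SD_INCHES
high = MEAN_INCHES + SD_INCHES
share_between = heights.cdf(high) - heights.cdf(low)

# Two standard deviations above the mean: 75 in.
tall_cutoff = MEAN_INCHES + 2 * SD_INCHES
share_taller = 1 - heights.cdf(tall_cutoff)

print(f"{feet_and_inches(low)} to {feet_and_inches(high)}: {share_between:.1%}")
print(f"taller than {feet_and_inches(tall_cutoff)}: {share_taller:.1%}")
```

Under these assumptions, roughly two-thirds of heights fall between five foot six and six feet, and only a small share sits more than two standard deviations above the mean.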

So that's standard deviation. We're going to take a look at how to calculate it next.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.
