Data with Python: Analyzing Min, Max, and Mean Values

Demonstrate calculating minimum, maximum, and mean values from lists and pandas data frames using Python functions.

Learn how Python and pandas simplify analyzing large datasets to quickly identify key statistics like minimum, maximum, and mean values. Understand the importance of statistical methods when handling sizable data, such as car resale values across extensive databases.

Key Insights

  • Python provides built-in functions such as min() and max() to easily find the lowest and highest values in a given list, demonstrated by identifying temperatures ranging from 48°F to higher summer temperatures.
  • Pandas offers specialized methods like .min() and .max() for analyzing data sets, allowing efficient determination of the lowest ($5,160) and highest ($68,000) year resale values within a data frame containing approximately 157 vehicle records.
  • The article illustrates computing the mean average using basic Python functions (sum() and len()) and NumPy's mean() function, then rounding the result to a more readable precision (79.4°F) for practical data presentation.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Now we're going to take a look at some of the ways we can pull fundamental statistics out of a series of data points. We'll start off here with the degrees list that we gave you. This is just a Python list with a bunch of temperatures in degrees.

Looks like it's, you know, mostly summer temperatures except for one little 48. I can eyeball that here, but when we're looking at a giant amount of data, we wouldn't be able to do so, and we'll take a look at some data where we can't.

If we want to find the minimum and maximum values here, what was the low temperature and what was the high temperature, there are some built-in Python functions for doing so. If I want the low temperature, I can look at the min, that's a built-in Python function, min of this list.

What's the minimum value of degrees? To evaluate that, I just use a shortcut, by the way: Command+Enter or Control+Enter on macOS, and Control+Enter on Windows and Linux.

That's a shortcut I use quite a lot to run the code without having to use the mouse.


Just like that: I wrote some code and, without even taking my hands off the keyboard, hit Command or Control plus Enter or Return, whatever is on yours. Okay. So I just ran that.

Let's see what it evaluates to. 48. Which, again, we can eyeball this time.

Let's make a new code block to look at the max just so we can keep that value there. Actually, instead, let's print it out. Like so.

Now we can print that out, and it still prints out as 48. Then, with this code block, we can evaluate the max of degrees. That works the same way: it's a built-in Python function for getting the highest number in a list of numbers. There we go.
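Here is a minimal sketch of those two calls. The readings below are made up for illustration (the lesson's actual degrees list isn't reproduced here); they just follow the same pattern of summer temperatures with a single 48 mixed in.

```python
# Hypothetical readings standing in for the lesson's degrees list.
degrees = [81, 84, 79, 48, 88, 90, 85, 77, 83, 80, 78]

low = min(degrees)    # built-in function: smallest value in the list
high = max(degrees)   # built-in function: largest value in the list

print(low)
print(high)
```

Saving each result in a variable, or printing it, keeps the value visible when the next cell runs.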

All right. So, this is, you know, mildly useful here, but it's very useful when we're talking about values in a database, or in our case, in the data frame.

So, let's look at the year resale value for our cars.

This is a column. If we look at the full data frame here, year resale value is how much the cars are worth a year in, in thousands of dollars.

So, if we want to find which car has the smallest resale value after a year, and which car has the biggest, we can't eyeball that. Not in 157 rows. So, let's take a look at it using some built-in pandas data frame methods.

First, the minimum value. Actually, the very first thing I want to do is answer this: how do we look at just one column in a pandas data frame? We could say cars, and then year resale value in square brackets.

If I execute that and look at the value, it's got a column name, year resale value, and then all the indexes on the left and all the values on the right, including these NaNs. We'll talk about what those mean in a little while.

So, we got this cars at year resale value. Like, what is this thing? It's not a data frame. And it's not a list.

Let's look at the type of that. And if you don't know type, it's a built-in Python function for telling you the data type of something. It is a pandas series.
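As a rough sketch of that step, here is the same idea on a tiny stand-in data frame. The real cars data has 157 rows, and the model names, the values, and the exact column name `year_resale_value` are assumptions made for this example.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's cars data frame; the models,
# values, and exact column name are made up for illustration.
cars = pd.DataFrame({
    "model": ["Alpha", "Bravo", "Charlie", "Delta"],
    "year_resale_value": [16.36, None, 5.16, 68.0],  # None becomes NaN
})

# Square brackets with a column name pull out that one column.
resale = cars["year_resale_value"]

print(resale)        # index labels on the left, values on the right
print(type(resale))  # a pandas Series, not a DataFrame or a list
```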

As it says in the documentation, it's a one-dimensional array with axis labels. So, it's a Series, and a Series means a row or a column by itself, not as part of a data frame.

So, a pandas Series doesn't work exactly like a list. We shouldn't reach for the built-in min or max here, because those are Python functions designed for plain lists, and they don't reliably handle the NaNs in this column. But fortunately pandas just builds this in as a method, meaning after the value, we write a dot and then the method name.

Give me the min: the dot min of that. And it's 5.16. There's a car that's only worth about $5,000 a year after you buy it.
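To see why the method is the safer choice, here is a small sketch. The built-in min() will technically accept a Series, but a missing value can throw it off, while the pandas method skips NaN by default. The values are hypothetical, with a NaN deliberately placed first to expose the difference.

```python
import math
import pandas as pd

# Hypothetical resale values in thousands of dollars, with a missing
# value placed first to show how NaN trips up the built-in function.
resale = pd.Series([float("nan"), 16.36, 5.16, 68.0], name="year_resale_value")

builtin_result = min(resale)   # plain Python function: gets stuck on the NaN
method_result = resale.min()   # pandas method: skips NaN by default

print(builtin_result)
print(method_result)
```

Every comparison against NaN is false, so the built-in never replaces its first candidate. That makes its answer depend on where the missing values happen to sit.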

That's unfortunate for owners of that car. Now, let's print that out so we can continue to see it. And we might print a label: we can print "Minimum resale value".

If you're printing out multiple things, a string and then a value, you have to put a comma in between. You're printing out this value and this value. Okay.

If we print that out, we get: minimum resale value, 5.16. Let's get the maximum value.

Let's print "Maximum resale value" as a label, and then the cars year resale value column. And then after the square bracket, because that's the one value, that's the Series, dot max.

And let's print both of those out. There we go. This car is worth $68,000 a year later.

Pretty good, depending on how much you spent on it. Okay.
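Putting the two labeled prints together might look like the sketch below. The data frame is again a hypothetical stand-in, with values chosen so the minimum and maximum match the 5.16 and 68 seen in the lesson. The column name is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the cars data frame (values in thousands).
cars = pd.DataFrame({
    "model": ["Alpha", "Bravo", "Charlie", "Delta", "Echo"],
    "year_resale_value": [16.36, None, 5.16, 68.0, 29.725],
})

min_resale = cars["year_resale_value"].min()
max_resale = cars["year_resale_value"].max()

# print() takes several comma-separated values and puts a space between them.
print("Minimum resale value:", min_resale)
print("Maximum resale value:", max_resale)
```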

So, that's how we can look at maximum and minimum values. Let's take a look at our next task: the mean average, sometimes just known as the mean in math.

People who learned this in grade school might have learned it as just "the average." It's generally the way people calculate an average, but it's one of many averages, and we'll be discussing that.

So, if you have three values, or let's say any number of values, the way you calculate the mean is you sum up all the elements and divide by the number of elements. There are a couple of different ways we can do that, and we're going to end up using NumPy's mean function.

If we have three numbers, 100, 0, and 100, we add them up: 100 plus 0 plus 100. And we divide by how many values there were: three. That gives us a rough average, a mean average. We could do it using Python. Let's do that.
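Before moving to code, the arithmetic just described can be written out as a worked equation. The symbols here are notation added for this note, not from the lesson: x̄ for the mean, x₁ through xₙ for the values, and n for how many there are.

```latex
\[
\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}
\qquad\text{so}\qquad
\frac{100 + 0 + 100}{3} = \frac{200}{3} \approx 66.7
\]
```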

First, we're going to work with our degrees list from up here, and get the mean average of those. So, we look at all those numbers added together.

There's a built-in Python function for that (I sometimes say method when I mean function): sum of degrees.

That will give us one number that is all the numbers in degrees added together. And then we look at how many values there are.

There's a built-in Python function for that too, called len, short for length. That will give you how many items are in a list. So, we'll say len of degrees.

I want the sum of degrees divided by the len of degrees. Let's print that out: print, and surround that with parentheses.
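A minimal sketch of that calculation follows. The list is hypothetical, not the course's data; its values were picked so the mean comes out close to the 79.36 shown in the lesson.

```python
# Hypothetical readings standing in for the lesson's degrees list.
degrees = [81, 84, 79, 48, 88, 90, 85, 77, 83, 80, 78]

total = sum(degrees)   # all the values added together
count = len(degrees)   # how many values are in the list

mean_manual = total / count
print(mean_manual)
```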

79.36. Great. Let's do it with NumPy. And this time, I'm going to save it into a variable.

We'll say I want mean degrees to be a new variable referencing the value that this evaluates to: NumPy, which we've already imported, dot mean of degrees. And then we can evaluate what mean degrees is.
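The NumPy version could be sketched like this. It assumes NumPy is imported under the common alias np (the alias used in the lesson isn't shown) and reuses a made-up degrees list.

```python
import numpy as np

# Hypothetical readings standing in for the lesson's degrees list.
degrees = [81, 84, 79, 48, 88, 90, 85, 77, 83, 80, 78]

mean_degrees = np.mean(degrees)   # same arithmetic as sum(degrees) / len(degrees)

print(mean_degrees)
print(type(mean_degrees))         # a NumPy floating-point type
```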

It's a NumPy float, and it's the same number. Okay. But let's do one more thing to mean degrees.

Let's round it off. Nobody's particularly interested in multiple decimal places for degrees. Usually, if you're getting that exact, it's 65.3 degrees, or your average body temperature, 98.6. One decimal place is generally how far we go with degrees.

So, we could say mean degrees is now equal to the rounded version: whatever we get back when we pass mean degrees into Python's round function. We give it the value we want to round and how many digits past the decimal point to include. Now, we look at mean degrees again. And there it is.
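As a sketch of the rounding step, again with the hypothetical list and the assumed np alias:

```python
import numpy as np

# Hypothetical readings standing in for the lesson's degrees list.
degrees = [81, 84, 79, 48, 88, 90, 85, 77, 83, 80, 78]

mean_degrees = np.mean(degrees)

# round(value, ndigits): keep one digit past the decimal point.
mean_degrees = round(mean_degrees, 1)

print(mean_degrees)
```

Reassigning to the same variable replaces the long value with the rounded one, so later cells only see the tidy version.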

It's 79.4. So, that's how we could calculate mean. We're going to get into some other averages, talk about the problem with mean in the next video.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.
