Finding the Mean Using Python

Explore the concepts of mean, median, and mode, their real-world applications, and how they can be computed using Python. Learn how these statistical measurements, despite their different methods and implications, serve as an efficient way of understanding data in numerous fields, including investment.

Key Insights

The mean, median, and mode are statistical measurements of the central tendency of data. Each uses a different method and has its own advantages and disadvantages.
The mean, often referred to as the average, is a summary measurement that takes all data values into account: it divides their sum by the number of data points. It is commonly used to understand data in various fields.
Despite its wide use, the mean is heavily affected by outliers, which can skew the result and give a misleading picture in situations such as investment returns.
To calculate the mean, add up all the numbers (the sum) and then divide that sum by the number of data points (the count).
Python offers a straightforward way to calculate the mean with two built-in functions: len to count the data points and sum to add up all the values.
For those new to Python or looking to improve their skills, Python Classes are available both in-person and live online.

In this series of posts, we'll cover various applications of statistics in Python. This first post talks about calculating the mean using Python.

Over the next three posts, we will tackle the topics of mean, median, and mode. We will discuss the motivation behind finding them, how to calculate them, and how easy it is to code these statistics in Python. All three measurements are useful for finding the central tendency of the data, but they use different methods, and each method has its own pros and cons.

Basics of Mean

You probably learned about these numerical measurements when you were younger, but may not have appreciated their real-world applications. In this post, let's discuss the mean, more commonly referred to as the average. The mean is used as a “summary” measurement because it takes all the data values into account: it divides their sum by the number of data points. It is very commonly used in all walks of life because it is an efficient way to understand data. Think of an investment fund that has 25 current investments. If you want to know whether the fund is doing well but do not have time to analyze each investment, the mean is a great way to see whether the fund is performing well “on average.”

However, the main drawback of the mean is that it is heavily affected by outliers. For example, if the fund has made poor decisions on 24 of its investments but has one extremely lucrative investment, that single investment will skew the mean and make it seem as if the fund has made consistently good investments, even though it has not.
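To make the outlier effect concrete, here is a minimal sketch. The return figures are invented purely for illustration (24 investments that each lose 2% and one that gains 300%); they are not taken from any real fund.

```python
# Hypothetical returns in percent for a fund with 25 investments.
# Assumption: 24 investments each lose 2%, and one gains 300%.
returns = [-2.0] * 24 + [300.0]

# Mean across all 25 investments, including the outlier.
mean_return = sum(returns) / len(returns)

# Mean of the 24 ordinary investments, with the outlier left out.
ordinary = returns[:24]
mean_without_outlier = sum(ordinary) / len(ordinary)

print(mean_return)           # positive, despite 24 losing investments
print(mean_without_outlier)  # negative
```

In this made-up scenario the overall mean is positive even though 24 of the 25 investments lost money, which is exactly the kind of misleading summary described above.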

Finding the Mean: Tutorial

So how do we find the mean? The mean is a relatively easy measure to find mathematically; it takes only two steps. To calculate the mean, add up all the numbers and then divide that sum by the number of data points. In math terms, the mean is simply the sum divided by the count (remember these two terms). It is good to know the math and understand where the number comes from, but Python can do all of this for us with two short built-in functions. The steps below show how easy it is to compute the mean in Python.

Step 1: Create a variable named test_scores and populate it with a list of individual test scores.
Step 2: Use the len function to count how many data points are in test_scores, and use the sum function to add up all the scores in test_scores.
Step 3: Create a variable named count and set it equal to 12 (the value returned by the len function), and create a variable named sum_scores and set it equal to 1024 (the value returned by the sum function).
Step 4: Divide sum_scores by count to find the mean, and then use the print function to show the output to the user.
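
The four steps above can be sketched roughly as follows. The individual scores are not listed in the text, so the values in test_scores here are hypothetical, chosen only so that the list has 12 entries that add up to 1024, matching the numbers in Step 3. Instead of typing 12 and 1024 by hand, this sketch stores the results of len and sum directly, which keeps the code correct if the list changes.

```python
# Step 1: a list of individual test scores.
# Assumption: these values are illustrative, picked to give 12 scores totaling 1024.
test_scores = [90, 85, 78, 92, 88, 76, 95, 89, 84, 80, 87, 80]

# Steps 2 and 3: count the data points and add them up.
count = len(test_scores)       # number of scores
sum_scores = sum(test_scores)  # total of all scores

# Step 4: the mean is the sum divided by the count.
mean = sum_scores / count
print(mean)  # roughly 85.33
```

Python's standard library also provides statistics.mean, which computes the same value in a single call; the manual version is shown here because it mirrors the sum-divided-by-count definition.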

Note: If you are new to Python or need to brush up on your skills, check out our Python Classes, which are offered in-person or live online.
