[Tut] Python List Average

Python List Average

<div><p><em>Don’t Be Mean, Be Median.</em></p>
<p>This article shows you how to calculate the average of a given list of numerical inputs in Python. </p>
<p>In case you attended your last statistics course a few years ago, let’s quickly recap the definition of the average: <em>sum all values and divide the result by the number of values.</em></p>
<p><strong>So, how to calculate the average of a given list in Python?</strong></p>
<p><strong><em>Python 3.x doesn’t have a built-in function to calculate the average. Instead, simply divide the sum of the list values by the number of list elements using the two built-in functions <code>sum()</code> and <code>len()</code>. You calculate the average of a given list <code>lst</code> in Python as <code>sum(lst)/len(lst)</code>. For a list of integers or floats, the return value is of type float.</em></strong></p>
<p>Here’s a short example that calculates the average income of income data $80000, $90000, and $100000:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">income = [80000, 90000, 100000]
average = sum(income) / len(income)
print(average)
# 90000.0</pre>
<p>You can see that the return value is of type float, even though the list data is of type integer. The reason is that the division operator <code>/</code> in Python 3 always performs true division and returns a <a rel="noreferrer noopener" href="https://blog.finxter.com/decimal-pythons-float-trap-and-how-to-solve-it/" target="_blank">floating point</a> result, even if you divide two integers.</p>
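<p>To make the difference concrete, here is a minimal sketch that compares true division (<code>/</code>) with floor division (<code>//</code>) on the same income data. The variable names <code>true_div</code> and <code>floor_div</code> are only chosen for illustration.</p>

```python
income = [80000, 90000, 100000]

true_div = sum(income) / len(income)    # true division, always returns a float
floor_div = sum(income) // len(income)  # floor division, stays an int for int operands

print(true_div, type(true_div).__name__)
# 90000.0 float
print(floor_div, type(floor_div).__name__)
# 90000 int
```

<p>Floor division discards the fractional part, so it is usually not what you want for an average.</p>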
<p><strong>Puzzle</strong>: Try to modify the elements in the list <code>income</code> so that the average is 80000.0 instead of 90000.0. Here’s the code to start from:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Define the list data
income = [80000, 90000, 100000]

# Calculate the average as the sum divided
# by the length of the list (float division)
average = sum(income) / len(income)

# Print the result to the shell
print(average)

# Puzzle: modify the income list so that
# the result is 80000.0</pre>
<p>This is the absolute minimum you need to know about calculating basic statistics such as the average in Python. But there’s far more to it and studying the other ways and alternatives will actually make you a better coder. So, let’s dive into some related questions and topics you may want to learn!</p>
<h2>Python List Average Median</h2>
<p>What’s the median of a Python list? Formally, the median is “the value separating the higher half from the lower half of a data sample” (<a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/Median" target="_blank">wiki</a>). </p>
<figure class="wp-block-image size-large is-resized"><img src="https://blog.finxter.com/wp-content/uploads/2020/04/MedianAverage-1024x576.jpg" alt="" class="wp-image-7476" width="400" height="225" srcset="https://blog.finxter.com/wp-content/uploads/2020/04/MedianAverage-scaled.jpg 1024w, https://blog.finxter.com/wp-content/uplo...00x169.jpg 300w, https://blog.finxter.com/wp-content/uplo...68x432.jpg 768w" sizes="(max-width: 400px) 100vw, 400px" /></figure>
<p><strong>How to calculate the median of a Python list?</strong></p>
<ul>
<li>Sort the list of elements using the <code>sorted()</code> built-in function in Python.</li>
<li>Calculate the index of the middle element (see graphic) by dividing the length of the list by 2 using integer division.</li>
<li>Return the middle element.</li>
</ul>
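<p>Together, you can get the middle element with the expression <code>median = sorted(income)[len(income)//2]</code>. For a list with an odd number of elements, this is exactly the median. For a list with an even number of elements, it returns the upper of the two middle values, whereas the median is conventionally defined as the mean of those two values.</p>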
<p>Here’s the concrete code example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">income = [80000, 90000, 100000, 88000]

average = sum(income) / len(income)
# For an even-length list, this picks the upper of the two middle values
median = sorted(income)[len(income)//2]

print(average)
# 89500.0

print(median)
# 90000</pre>
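<p>The one-liner above is only exact for lists of odd length. The following sketch, with a hypothetical helper named <code>median()</code>, averages the two middle values for even lengths and compares the result with <code>statistics.median()</code> from the standard library.</p>

```python
import statistics

def median(values):
    """Return the median; average the two middle values for even lengths."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median() of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

income = [80000, 90000, 100000, 88000]
print(median(income))
# 89000.0
print(statistics.median(income))
# 89000.0
```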
<p><strong>Related tutorials:</strong></p>
<ul>
<li><a rel="noreferrer noopener" href="https://blog.finxter.com/python-list-sort/" target="_blank">Detailed tutorial on how to sort a list in Python on this blog</a>.</li>
</ul>
<h2>Python List Average Mean</h2>
<p>The mean value is exactly the same as the average value: sum up all values in your sequence and divide by the length of the sequence. You can either use the calculation <code>sum(lst) / len(lst)</code> or import the <code>statistics</code> module and call <code>statistics.mean(lst)</code>.</p>
<p>Here are both examples:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [1, 4, 2, 3]

# method 1
average = sum(lst) / len(lst)
print(average)
# 2.5

# method 2
import statistics
print(statistics.mean(lst))
# 2.5</pre>
<p>Both methods return the same value for this list. The <code>statistics</code> module also provides related functions for measuring the center of your data (<a href="https://docs.python.org/3.4/library/statistics.html" target="_blank" rel="noreferrer noopener">source</a>):</p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.mean">mean()</a></td>
<td>Arithmetic mean (“average”) of data.</td>
</tr>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.median">median()</a></td>
<td>Median (middle value) of data.</td>
</tr>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.median_low">median_low()</a></td>
<td>Low median of data.</td>
</tr>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.median_high">median_high()</a></td>
<td>High median of data.</td>
</tr>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.median_grouped">median_grouped()</a></td>
<td>Median, or 50th percentile, of grouped data.</td>
</tr>
<tr>
<td><a href="https://docs.python.org/3.4/library/statistics.html#statistics.mode">mode()</a></td>
<td>Mode (most common value) of discrete data.</td>
</tr>
</tbody>
</table>
</figure>
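<p>The functions <code>median_low()</code> and <code>median_high()</code> are especially interesting if your list has an even number of elements, so that there are two middle values and you want to decide which one to take.</p>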
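<p>As a short illustration of the functions in the table, the following sketch applies them to a small list with an even number of elements. The list <code>data</code> is made up for this example.</p>

```python
import statistics

data = [1, 3, 5, 7]  # even number of elements: the middle values are 3 and 5

print(statistics.median(data))
# 4.0
print(statistics.median_low(data))
# 3
print(statistics.median_high(data))
# 5
print(statistics.mode([1, 1, 2, 3]))
# 1
```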
<h2>Python List Average Standard Deviation</h2>
<p>The standard deviation measures how far the data values typically deviate from the average. It is defined as the square root of the variance (<a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/Standard_deviation" target="_blank">wiki</a>) and is used to measure the dispersion of a data set. You can calculate the sample standard deviation of the values in a list by using the <code>stdev()</code> function of the <code>statistics</code> module:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import statistics as s

lst = [1, 0, 4, 3]
print(s.stdev(lst))
# 1.8257418583505538</pre>
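<p>To see what <code>stdev()</code> computes, here is a sketch that derives the value by hand. It assumes the usual definitions: the sample standard deviation divides the sum of squared deviations by <code>n - 1</code>, the population standard deviation (<code>statistics.pstdev()</code>) divides by <code>n</code>. The intermediate variable names are only illustrative.</p>

```python
import math
import statistics

lst = [1, 0, 4, 3]
mean = sum(lst) / len(lst)
squared_deviations = [(x - mean) ** 2 for x in lst]

# Sample standard deviation: divide by n - 1
sample_sd = math.sqrt(sum(squared_deviations) / (len(lst) - 1))
# Population standard deviation: divide by n
population_sd = math.sqrt(sum(squared_deviations) / len(lst))

print(math.isclose(sample_sd, statistics.stdev(lst)))
# True
print(math.isclose(population_sd, statistics.pstdev(lst)))
# True
```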
<h2>Python List Average Min Max</h2>
<p>In contrast to the average, there are <a href="https://blog.finxter.com/subscribe/" target="_blank" rel="noreferrer noopener">Python built-in functions</a> that calculate the <a href="https://blog.finxter.com/how-to-get-the-key-with-minimum-value-in-a-python-dictionary/" target="_blank" rel="noreferrer noopener">minimum</a> and <a href="https://blog.finxter.com/how-to-get-the-key-with-the-maximum-value-in-a-dictionary/" target="_blank" rel="noreferrer noopener">maximum</a> of a given list. The <code>min(lst)</code> function returns the minimum value and the <code>max(lst)</code> function returns the maximum value in a list <code>lst</code>.</p>
<p>Here’s an example of the minimum, maximum and average computations on a <a href="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener">Python list</a>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [1, 1, 2, 0]

average = sum(lst) / len(lst)
minimum = min(lst)
maximum = max(lst)

print(average)
# 1.0

print(minimum)
# 0

print(maximum)
# 2</pre>
<h2>Python List Average Sum</h2>
<p>How do you calculate the average using the built-in <code>sum()</code> function? Simple: divide the result of the <code>sum(lst)</code> function call by the number of elements in the list, <code>len(lst)</code>. This normalizes the sum and gives you the average of all elements in the list.</p>
<p>Again, the following example shows how to do this:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [1, 1, 2, 0]

average = sum(lst) / len(lst)
print(average)
# 1.0</pre>
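<p>One thing to watch out for: <code>sum(lst) / len(lst)</code> raises a <code>ZeroDivisionError</code> if the list is empty. The following sketch wraps the calculation in a small helper. The function name <code>average()</code> and its <code>default</code> parameter are assumptions made for this example, not part of Python.</p>

```python
def average(values, default=None):
    """Return the arithmetic mean of values, or default if values is empty."""
    if not values:
        return default
    return sum(values) / len(values)

print(average([1, 1, 2, 0]))
# 1.0
print(average([]))
# None
```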
<h2>Python List Average NumPy</h2>
<p>Python’s package for data science computation <a rel="noreferrer noopener" href="https://blog.finxter.com/numpy-tutorial/" target="_blank">NumPy</a> also has great statistics functionality. You can calculate all basic statistics such as <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-calculate-weighted-average-numpy-array-along-axis/" target="_blank">average</a>, median, <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-calculate-variance-numpy-array/" target="_blank">variance</a>, and <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-calculate-column-standard-deviation-2d-numpy-array/" target="_blank">standard deviation</a> on NumPy arrays. Simply import the NumPy library and use the <code>np.average(a)</code> function to calculate the average value of the NumPy array <code>a</code>.</p>
<p>Here’s the code:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import numpy as np

a = np.array([1, 2, 3])
print(np.average(a))
# 2.0</pre>
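<p>The other statistics mentioned above work the same way. Here is a brief sketch using <code>np.median()</code>, <code>np.var()</code>, and <code>np.std()</code>. Note that, by default, NumPy computes the population variance and standard deviation (<code>ddof=0</code>); pass <code>ddof=1</code> for the sample versions.</p>

```python
import numpy as np

a = np.array([1, 2, 3])

print(np.median(a))
# 2.0
print(np.var(a))          # population variance (ddof=0)
print(np.std(a))          # population standard deviation (ddof=0)
print(np.std(a, ddof=1))  # sample standard deviation
# 1.0
```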
<h2>Python Average List of (NumPy) Arrays</h2>
<p><a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.average.html" target="_blank" rel="noreferrer noopener">NumPy’s average function</a> computes the average of all numerical values in a NumPy array. When used without parameters, it simply calculates the numerical average of all values in the array, no matter the array’s dimensionality. For example, the expression <code>np.average([[1,2],[2,3]])</code> results in the average value <code>(1+2+2+3)/4 = 2.0</code>.</p>
<p>However, what if you want to calculate the <strong>weighted average</strong> of a NumPy array? In other words, you want to <em>overweight</em> some array values and <em>underweight</em> others.</p>
<p>You can easily accomplish this by passing the <code>weights</code> argument to NumPy’s <code>average</code> function.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import numpy as np

a = [-1, 1, 2, 2]

print(np.average(a))
# 1.0

print(np.average(a, weights = [1, 1, 1, 5]))
# 1.5</pre>
<p>In the first example, we simply averaged over all array values: <code>(-1+1+2+2)/4 = 1.0</code>. However, in the second example, we overweight the last array element 2—it now carries five times the weight of the other elements resulting in the following computation: <code>(-1+1+2+(2+2+2+2+2))/8 = 1.5</code>.</p>
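<p>The weighted computation can also be written out in plain Python. This sketch assumes the standard weighted-average formula, the sum of <code>weight * value</code> divided by the sum of the weights, and reproduces the result without NumPy. The variable names are only illustrative.</p>

```python
values = [-1, 1, 2, 2]
weights = [1, 1, 1, 5]

# Weighted average: sum of weight * value, divided by the sum of the weights
weighted_sum = sum(w * x for w, x in zip(weights, values))
weighted_average = weighted_sum / sum(weights)

print(weighted_average)
# 1.5
```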
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<div class="ast-oembed-container"><iframe title="How to Calculate the Weighted Average of a Numpy Array in Python?" width="1400" height="788" src="https://www.youtube.com/embed/b0U31GkE7ho?feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div>
</div>
</figure>
<p>Let’s explore the different parameters we can pass to <code>np.average(...)</code>.</p>
<ul>
<li><strong>The NumPy array</strong> <code>a</code>, which can be multi-dimensional.</li>
<li>(Optional) <strong>The axis</strong> along which you want to average (<code>axis</code>). If you don’t specify the argument, the averaging is done over the whole array.</li>
<li>(Optional) <strong>The weights</strong> associated with the values (<code>weights</code>). A one-dimensional weights sequence must have the same length as the array along the specified axis. If you don’t specify the argument, all values are weighted equally.</li>
<li>(Optional) <strong>The <code>returned</code> flag</strong>. If you set this to <code>True</code>, you get a tuple <code>(average, sum_of_weights)</code> as a result instead of only the average. This may help you to normalize the output. In most cases, you can skip this argument.</li>
</ul>
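<p>The parameters listed above can be sketched on a small 2D array as follows. The array <code>a</code> and the weights <code>[3, 1]</code> are made up for illustration.</p>

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

print(np.average(a))          # average over all values
# 2.5
print(np.average(a, axis=0))  # one average per column
# [2. 3.]
print(np.average(a, axis=1))  # one average per row
# [1.5 3.5]

# Weight the first row three times as much as the second row
avg, weight_sum = np.average(a, axis=0, weights=[3, 1], returned=True)
print(avg)
# [1.5 2.5]
print(weight_sum)  # sum of the weights used for each column
```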
<p>Here is an example of how to <a href="https://blog.finxter.com/numpy-average-along-axis/" target="_blank" rel="noreferrer noopener">average along the columns</a> of a 2D NumPy array with specified weights for both rows.</p>
<p><pre data-enlighter-language="python" class="EnlighterJSRAW">
import numpy as np

# daily stock prices
# [morning, midday, evening]
solar_x = np.array(
    [[2, 3, 4],   # today
     [2, 2, 5]])  # yesterday

# midday - weighted average
print(np.average(solar_x, axis=0, weights=[3/4, 1/4])[1])
</pre>
</p>
<p><em>What is the output of this puzzle?</em><br />*Beginner Level* (solution below)</p>
<p>You can also solve this puzzle in our puzzle-based learning app (100% FREE): <a rel="noreferrer noopener" href="https://app.finxter.com/learn/computer/science/433" target="_blank">Test your skills now!</a></p>
<p><strong>Related article: </strong></p>
<ul>
<li><a href="https://blog.finxter.com/how-to-calculate-weighted-average-numpy-array-along-axis/" target="_blank" rel="noreferrer noopener">NumPy (Weighted) Averaging</a></li>
</ul>
<h2>Python Average List of Dictionaries</h2>
<p><strong>Problem</strong>: Given a list of dictionaries, your goal is to calculate the average of the values associated with a specific key across all dictionaries.</p>
<p><strong>Example</strong>: Consider the following list of database entries, each stored as a dictionary. You want to get the average of the values stored under the key <code>'age'</code>.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">db = [{'username': 'Alice', 'joined': 2020, 'age': 23},
      {'username': 'Bob', 'joined': 2018, 'age': 19},
      {'username': 'Alice', 'joined': 2020, 'age': 31}]

average = ...  # Averaging magic goes here
print(average)</pre>
<p>The output should be the average of the three ages: <code>(23+19+31)/3 = 24.333</code> (rounded).</p>
<p><strong>Solution</strong>: You use a <a href="https://blog.finxter.com/10-python-one-liners/" target="_blank" rel="noreferrer noopener">generator expression</a> in Python to iterate over all <code>age</code> values without building an intermediate list. Then, you sum them up and divide the sum by the number of dictionaries. The result is the average of all <code>age</code> values in the list of dictionaries.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">db = [{'username': 'Alice', 'joined': 2020, 'age': 23},
      {'username': 'Bob', 'joined': 2018, 'age': 19},
      {'username': 'Alice', 'joined': 2020, 'age': 31}]

average = sum(d['age'] for d in db) / len(db)
print(average)
# 24.333333333333332</pre>
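<p>The solution above assumes that every dictionary contains the key <code>'age'</code>; otherwise it raises a <code>KeyError</code>. If some entries may lack the key, one possible approach is to collect only the values that are present and divide by their count. The entry without an age below is a hypothetical example.</p>

```python
db = [{'username': 'Alice', 'age': 23},
      {'username': 'Bob'},               # hypothetical entry without 'age'
      {'username': 'Carl', 'age': 31}]

# Collect only the ages that are present
ages = [d['age'] for d in db if 'age' in d]

# Guard against the case that no entry has an age at all
average = sum(ages) / len(ages) if ages else None
print(average)
# 27.0
```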
<p>Let’s move on to the next question: how to calculate the average of a list of floats?</p>
<h2>Python Average List of Floats</h2>
<p>Averaging a list of floats is as simple as averaging a list of integers. Just sum them up and divide them by the number of float values. Here’s the code:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [1.0, 2.5, 3.0, 1.5]
average = sum(lst) / len(lst)
print(average)
# 2.0</pre>
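<p>With floats, rounding errors can accumulate during summation. On older Python versions, <code>sum()</code> adds the values naively, so the result can be slightly off (since Python 3.12, <code>sum()</code> uses a more accurate algorithm for floats). The standard library function <code>math.fsum()</code> tracks the rounding errors on any version. A small sketch:</p>

```python
import math

lst = [0.1] * 10

# May differ slightly from 0.1, depending on the Python version
print(sum(lst) / len(lst))

# math.fsum() avoids the loss of precision during summation
print(math.fsum(lst) / len(lst))
# 0.1
```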
<h2>Python Average List of Tuples</h2>
<p><strong>Problem</strong>: How to average all values if the values are stored in a list of tuples?</p>
<p><strong>Example</strong>: You have the list of tuples <code>[(1, 2), (2, 2), (1, 1)]</code> and you want the average value <code>(1+2+2+2+1+1)/6 = 1.5</code>. </p>
<p><strong>Solution</strong>: There are three solution ideas:</p>
<ul>
<li><a rel="noreferrer noopener" href="https://blog.finxter.com/what-is-asterisk-in-python/" target="_blank">Unpack</a> the tuple values into a list and calculate the average of this list.</li>
<li>Use only <a href="https://blog.finxter.com/list-comprehension/" target="_blank" rel="noreferrer noopener">list comprehension</a> with nested for loop.</li>
<li>Use a simple nested for loop.</li>
</ul>
<p>Next, I’ll give all three examples in a single code snippet:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [(1, 2), (2, 2), (1, 1)]

# 1. Unpacking
lst_2 = [*lst[0], *lst[1], *lst[2]]
print(sum(lst_2) / len(lst_2))
# 1.5

# 2. List comprehension
lst_3 = [x for t in lst for x in t]
print(sum(lst_3) / len(lst_3))
# 1.5

# 3. Nested for loop
lst_4 = []
for t in lst:
    for x in t:
        lst_4.append(x)
print(sum(lst_4) / len(lst_4))
# 1.5</pre>
<p><strong>Unpacking</strong>: The asterisk operator in front of an iterable “unpacks” all values in the iterable into the outer context. You can use it only in a container data structure that’s able to catch the unpacked values.</p>
<p><strong>List comprehension</strong> is a compact way of creating lists. The simple formula is [ expression + context ].</p>
<ul>
<li>Expression: What to do with each list element?</li>
<li>Context: What list elements to select? It consists of an arbitrary number of for and if statements.</li>
</ul>
<p>The example <code>[x for x in range(3)]</code> creates the list <code>[0, 1, 2]</code>.</p>
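<p>The unpacking variant hard-codes the three tuple indices, so it only works for exactly three tuples. As an alternative sketch that works for any number of tuples, you could flatten the list with <code>itertools.chain.from_iterable()</code> from the standard library. The second part shows a related idea, the average per tuple position, using <code>zip(*lst)</code>.</p>

```python
from itertools import chain

lst = [(1, 2), (2, 2), (1, 1)]

# Flatten all tuples into one list, regardless of how many tuples there are
flat = list(chain.from_iterable(lst))
print(sum(flat) / len(flat))
# 1.5

# Average per tuple position: zip(*lst) groups all first and all second values
position_averages = [sum(col) / len(col) for col in zip(*lst)]
print(position_averages)
```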
<h2>Python Average Nested List</h2>
<p><strong>Problem</strong>: How to calculate the average of a nested list?</p>
<p><strong>Example</strong>: Given the nested list <code>[[1, 2, 3], [4, 5, 6]]</code>, you want to calculate the average <code>(1+2+3+4+5+6)/6 = 3.5</code>. How do you do that?</p>
<p><strong>Solution</strong>: Again, there are three solution ideas:</p>
<ul>
<li><a rel="noreferrer noopener" href="https://blog.finxter.com/what-is-asterisk-in-python/" target="_blank">Unpack</a> the values of the inner lists into a new list and calculate the average of this list.</li>
<li>Use only <a href="https://blog.finxter.com/list-comprehension/" target="_blank" rel="noreferrer noopener">list comprehension</a> with nested for loop.</li>
<li>Use a simple nested for loop.</li>
</ul>
<p>Next, I’ll give all three examples in a single code snippet:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">lst = [[1, 2, 3], [4, 5, 6]]

# 1. Unpacking
lst_2 = [*lst[0], *lst[1]]
print(sum(lst_2) / len(lst_2))
# 3.5

# 2. List comprehension
lst_3 = [x for t in lst for x in t]
print(sum(lst_3) / len(lst_3))
# 3.5

# 3. Nested for loop
lst_4 = []
for t in lst:
    for x in t:
        lst_4.append(x)
print(sum(lst_4) / len(lst_4))
# 3.5</pre>
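<p>All three solutions assume exactly one level of nesting. If the lists can be nested to an arbitrary depth, one possible approach is a small recursive generator. The helper name <code>flatten()</code> and the more deeply nested example list are assumptions made for this sketch.</p>

```python
def flatten(nested):
    """Yield all numbers from an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            # Recurse into inner lists
            yield from flatten(item)
        else:
            yield item

lst = [[1, 2, 3], [4, [5, 6]]]
flat = list(flatten(lst))
print(sum(flat) / len(flat))
# 3.5
```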
<h2>Where to Go From Here</h2>
<p><em><strong><em>Python 3.x doesn’t have a built-in function to calculate the average. Instead, simply divide the sum of the list values by the number of list elements using the two built-in functions <code>sum()</code> and <code>len()</code>. You calculate the average of a given list <code>lst</code> in Python as <code>sum(lst)/len(lst)</code>. For a list of integers or floats, the return value is of type float.</em></strong></em></p>
<p>If you keep struggling with those basic Python commands and you feel stuck in your learning progress, I’ve got something for you: <a rel="noreferrer noopener" href="https://www.amazon.com/gp/product/B07ZY7XMX8" target="_blank">Python One-Liners</a> (Amazon Link). </p>
<p>In the book, I’ll give you a thorough overview of critical computer science topics such as machine learning, regular expression, data science, NumPy, and Python basics—all in a single line of Python code!</p>
<p><a rel="noreferrer noopener" href="https://www.amazon.com/gp/product/B07ZY7XMX8" target="_blank">Get the book from Amazon!</a></p>
<p><strong>OFFICIAL BOOK DESCRIPTION:</strong> <em>Python One-Liners will show readers how to perform useful tasks with one line of Python code. Following a brief Python refresher, the book covers essential advanced topics like slicing, list comprehension, broadcasting, lambda functions, algorithms, regular expressions, neural networks, logistic regression and more. Each of the 50 book sections introduces a problem to solve, walks the reader through the skills necessary to solve that problem, then provides a concise one-liner Python solution with a detailed explanation.</em></p>
</div>


https://www.sickgaming.net/blog/2020/04/...t-average/