When does the IndexError: list assignment index out of range appear?
Python throws an IndexError if you try to assign a value to a list index that doesn’t exist yet. For example, if you execute the assignment lst[1] = 10 on an empty list lst, Python throws the IndexError. To resolve it, add elements to your list until the index actually exists.
Here’s the minimal example that throws the IndexError:
lst = []
lst[1] = 10
If you run this code, you’ll see that Python throws an IndexError:
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 2, in <module>
    lst[1] = 10
IndexError: list assignment index out of range
You can resolve it by adding two “dummy” elements to the list so that the index 1 actually exists in the list:
lst = [None, None]
lst[1] = 10
print(lst)
Now, Python will print the expected output:
[None, 10]
So what are some other occurrences of the IndexError?
IndexError in For Loop
Frequently, the IndexError happens if you use a for loop to modify some list elements like here:
# WRONG CODE:
lst = []
for i in range(10):
    lst[i] = i
print(lst)
Again, the result is an IndexError:
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 4, in <module>
    lst[i] = i
IndexError: list assignment index out of range
You modify a list element at index i that doesn’t exist in the list. Instead, create the list using the list(range(10)) list constructor.
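As a minimal sketch, here are three ways to repair the loop above. The variable names lst2 and lst3 are only illustrative and do not appear in the original code.

```python
# Option 1: grow the list with append() instead of assigning to a missing index
lst = []
for i in range(10):
    lst.append(i)
print(lst)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Option 2: build the list directly from the range
lst2 = list(range(10))
print(lst2)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Option 3: pre-allocate placeholder elements, then assign by index
lst3 = [None] * 10
for i in range(10):
    lst3[i] = i
print(lst3)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Option 3 is mainly useful when you fill the positions in a non-sequential order.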
You’ve learned how to resolve one error. By doing this, your Python skills have improved a little bit. Do this every day and soon, you’ll be a skilled master coder.
What Are Alternative Methods to Convert a List of Strings to a String?
Python is flexible—you can use multiple methods to achieve the same thing. So what are the different methods to convert a list to a string?
Method 1: Use the method ''.join(list) to concatenate all strings in a given list to a single string. The string on which you call the method is the delimiter between the list elements.
Method 2: Start with an empty string variable. Use a simple for loop to iterate over all elements in the list and add the current element to the string variable.
Method 3: Use the list comprehension [str(x) for x in list] if the list contains elements of different types to convert all elements to the string data type. Combine them using the ''.join(newlist) method.
Method 4: Use the map function map(str, list) if the list contains elements of different types to convert all elements to the string data type. Combine them using the ''.join(newlist) method.
Here are all four variants in some code:
lst = ['learn', 'python', 'fast']

# Method 1
print(''.join(lst))
# learnpythonfast

# Method 2
s = ''
for st in lst:
    s += st
print(s)
# learnpythonfast

# Method 3
lst = ['learn', 9, 'python', 9, 'fast']
s = ''.join([str(x) for x in lst])
print(s)
# learn9python9fast

# Method 4
lst = ['learn', 9, 'python', 9, 'fast']
s = ''.join(map(str, lst))
print(s)
# learn9python9fast
So far so good. You’ve learned how to convert a list to a string. But that’s not all! Let’s dive into some more specifics of converting a list to a string.
Python List to String with Commas
Problem: Given a list of strings, how do you convert the list to a string by concatenating all strings in the list, using a comma as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn,python,fast'.
Solution: to convert a list of strings to a string, call the ','.join(list) method on the delimiter string ',' that glues together all strings in the list and returns a new string.
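A short sketch of the comma variant, using the example list from above (the variable name result is an illustrative choice):

```python
lst = ['learn', 'python', 'fast']

# The delimiter ',' is placed between the elements, not before or after them
result = ','.join(lst)
print(result)
# learn,python,fast
```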
Python List to String with Spaces
Problem: Given a list of strings, how do you convert the list to a string by concatenating all strings in the list, using a space as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn python fast'. (Note the empty spaces between the terms.)
Solution: to convert a list of strings to a string, call the ' '.join(list) method on the string ' ' (space character) that glues together all strings in the list and returns a new string.
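The space variant can be sketched the same way. The round trip through split() is an addition for illustration; it only restores the original list when the individual strings contain no whitespace themselves.

```python
lst = ['learn', 'python', 'fast']

sentence = ' '.join(lst)
print(sentence)
# learn python fast

# split() without arguments splits on whitespace and reverses the join here
print(sentence.split() == lst)
# True
```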
Python List to String with Newlines
Problem: Given a list of strings, how do you convert the list to a string by concatenating all strings in the list, using a newline character as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn\npython\nfast' or as a multiline string:
'''learn
python
fast'''
Solution: to convert a list of strings to a string, call the '\n'.join(list) method on the newline character '\n' that glues together all strings in the list and returns a new string.
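Here is a minimal sketch of the newline variant. Printing the string shows three lines, while repr() reveals the '\n' characters inside the single string (the name multiline is illustrative):

```python
lst = ['learn', 'python', 'fast']

multiline = '\n'.join(lst)
print(multiline)
# learn
# python
# fast

print(repr(multiline))
# 'learn\npython\nfast'
```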
Python List to String with Quotes
Problem: Given a list of strings, how do you convert the list to a string by concatenating all strings in the list, using a comma followed by a space as the delimiter between the list elements? Additionally, you want to wrap each string in double quotes.
Example: You want to convert list ['learn', 'python', 'fast'] to the string '"learn", "python", "fast"' :
Solution: to convert a list of strings to a string, call ', '.join('"' + x + '"' for x in lst). The join method is called on the delimiter string ', ', which glues together all strings and returns a new string. The generator expression modifies each element of the original list so that it is enclosed in double quote " characters.
Code: Let’s have a look at the code.
lst = ['learn', 'python', 'fast']
print(', '.join('"' + x + '"' for x in lst))
The output is:
"learn", "python", "fast"
Python List to String with Brackets
Problem: Given a list of strings, how do you convert the list to a string by concatenating all strings in the list, using a comma followed by a space as the delimiter between the list elements? Additionally, you want to wrap the whole string in square brackets to indicate that it represents a list.
Example: You want to convert list ['learn', 'python', 'fast'] to the string '[learn, python, fast]' :
Solution: to convert a list of strings to a string, evaluate the expression '[' + ', '.join(lst) + ']'. The join method is called on the delimiter string ', ', which glues together all strings in the list, and the two string concatenations add the surrounding brackets.
Although the resulting string looks similar to the printed list, its data type is string, whereas the data type of the original is list.
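The following sketch shows the bracket variant and checks the data types of both objects. The variable name s is an illustrative choice.

```python
lst = ['learn', 'python', 'fast']

s = '[' + ', '.join(lst) + ']'
print(s)
# [learn, python, fast]

print(type(s))
# <class 'str'>
print(type(lst))
# <class 'list'>
```

Note that print(lst) would show the quotes around each element, so the two printed forms are similar but not identical.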
Convert List of Int to String
Problem: You want to convert a list into a string but the list contains integer values.
Example: Convert the list [1, 2, 3] to a string '123'.
Solution: Use the join method in combination with a generator expression to convert the list of integers to a single string value:
lst = [1, 2, 3]
print(''.join(str(x) for x in lst))
# 123
The generator expression converts each element in the list to a string. You can then combine the string elements using the join method of the string object.
If you miss the conversion from integer to string, you get the following TypeError:
lst = [1, 2, 3]
print(''.join(lst))

'''
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 2, in <module>
    print(''.join(lst))
TypeError: sequence item 0: expected str instance, int found
'''
Python List to String One Line
To convert a list to a string in one line, use one of the following three methods:
Use the ''.join(list) method to glue together all list elements to a single string.
Use the list comprehension method [str(x) for x in lst] to convert all list elements to type string.
Use str(list) to convert the list to a string representation.
Here are three examples:
lst = ['finxter', 'is', 'awesome']
print(' '.join(lst))
# finxter is awesome

lst = [1, 2, 3]
print([str(x) for x in lst])
# ['1', '2', '3']

print(str(lst))
# [1, 2, 3]
Where to Go From Here
Want to increase your Python skill on a daily basis? Just by following a series of FREE Python course emails? Then join the #1 Python Email Academy in the world!
For my subscribers, I regularly publish educative emails about the most important Python topics. Register and join my community of thousands of ambitious coders. I guarantee, you will love it!
(Besides—it’s free and you can unsubscribe anytime so you’ve nothing to lose and everything to gain.)
[Spoiler] Which function filters a list faster: filter() vs list comprehension? For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
To answer this question, I’ve written a short script that tests the runtime performance of filtering large lists of increasing sizes using the filter() and the list comprehension methods.
My hypothesis is that the list comprehension method should be slightly faster for larger list sizes because it leverages the efficient CPython implementation of list comprehension and doesn’t need to call an extra function.
I used my notebook with an Intel(R) Core(TM) i7-8565U 1.8GHz processor (with Turbo Boost up to 4.6 GHz) and 8 GB of RAM.
Try It Yourself:
import time

# Compare runtime of both methods
list_sizes = [i * 10000 for i in range(100)]
filter_runtimes = []
list_comp_runtimes = []

for size in list_sizes:
    lst = list(range(size))

    # Get time stamps
    time_0 = time.time()
    list(filter(lambda x: x%2, lst))
    time_1 = time.time()
    [x for x in lst if x%2]
    time_2 = time.time()

    # Calculate runtimes
    filter_runtimes.append((size, time_1 - time_0))
    list_comp_runtimes.append((size, time_2 - time_1))

# Plot everything
import matplotlib.pyplot as plt
import numpy as np

f_r = np.array(filter_runtimes)
l_r = np.array(list_comp_runtimes)

print(filter_runtimes)
print(list_comp_runtimes)

plt.plot(f_r[:,0], f_r[:,1], label='filter()')
plt.plot(l_r[:,0], l_r[:,1], label='list comprehension')

plt.xlabel('list size')
plt.ylabel('runtime (seconds)')

plt.legend()
plt.savefig('filter_list_comp.jpg')
plt.show()
The code compares the runtimes of the filter() function and the list comprehension variant to filter a list. Note that the filter() function returns a filter object, so you need to convert it to a list using the list() constructor.
Here’s the resulting plot that compares the runtime of the two methods. On the x axis, you can see the list size from 0 to 1,000,000 elements. On the y axis, you can see the runtime in seconds needed to execute the respective functions.
The resulting plot shows that both methods are extremely fast for a few tens of thousands of elements. In fact, they are so fast that the time() function of the time module cannot capture the elapsed time.
But as you increase the size of the lists to hundreds of thousands of elements, the list comprehension method starts to win:
For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
The reason is the efficient implementation of the list comprehension statement. There is an interesting observation, though. If you don’t convert the filter object to a list, you get the following result:
Suddenly the filter() function has constant runtime of close to 0 seconds—no matter how many elements are in the list. Why is this happening?
The explanation is simple: the filter function returns an iterator, not a list. The iterator doesn’t need to compute a single element until it is requested to compute the next() element. So, the filter() function computes the next element only if it is required to do so. Only if you convert it to a list, it must compute all values. Otherwise, it doesn’t actually compute a single value beforehand.
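The lazy behavior can be made visible with a small sketch. The helper is_odd and the list calls are hypothetical names introduced only to record when the predicate actually runs.

```python
calls = []

def is_odd(x):
    calls.append(x)  # record every value the predicate is actually called with
    return x % 2 == 1

lazy = filter(is_odd, range(10))
print(calls)
# []  (creating the filter object evaluates nothing)

first = next(lazy)  # request a single element
print(first, calls)
# 1 [0, 1]

rest = list(lazy)  # converting to a list forces all remaining elements
print(rest)
# [3, 5, 7, 9]
print(len(calls))
# 10
```

This is why the unconverted filter() call in the benchmark appears to take almost no time: the work is deferred until the elements are consumed.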
Where to Go From Here
This tutorial has shown you the filter() function in Python and compared it against the list comprehension way of filtering: [x for x in list if condition]. You’ve seen that the latter is not only more readable and more Pythonic, but also faster. So take the list comprehension approach to filter lists!
If you love coding and you want to do this full-time from the comfort of your own home, you’re in luck:
I’ve created a free webinar that shows you how I started as a Python freelancer after my computer science studies working from home (and seeing my kids grow up) while earning a full-time income working only part-time hours.
There are three equally valid interpretations of the term “nested list comprehension”:
Coming from a computer science background, I was assuming that “nested list comprehension” refers to the creation of a list of lists. In other words: How to create a nested list with list comprehension?
But after a bit of research, I learned that there is a second interpretation of nested list comprehension: How to use a nested for loop in the list comprehension?
A few months later, I realized that some people use “nested list comprehension” to mean the use of a list comprehension statement as expression within a list comprehension statement. In other words: How to use a list comprehension statement within a list comprehension statement?
How to Create a Nested List with List Comprehension?
It is possible to create a nested list with list comprehension in Python. What is a nested list? It’s a list of lists. Here is an example:
## Nested List Comprehension
lst = [[x for x in range(5)] for y in range(3)]
print(lst)
# [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
As you can see, we create a list with three elements. Each list element is a list by itself.
Everything becomes clear when we go back to our magic formula of list comprehension: [ expression + context]. The expression part generates a new list consisting of 5 integers. The context part repeats this three times. Hence, each of the three nested lists has five elements.
If you are an advanced programmer, you may ask whether there is some aliasing going on here. Aliasing in this context means that the three list elements point to the same list [0, 1, 2, 3, 4]. This is not the case: the expression is evaluated separately for each of the three context executions, so a new list is created each time.
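One way to check for aliasing is to modify one inner list and see whether the others change. The sketch below does this and contrasts it with list multiplication, which does create aliases; the name aliased is illustrative.

```python
lst = [[x for x in range(5)] for y in range(3)]

# Changing the first inner list leaves the other two untouched
lst[0][0] = 99
print(lst)
# [[99, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
print(lst[0] is lst[1])
# False

# Contrast: multiplying a list repeats references to the SAME inner list
aliased = [[0, 1, 2, 3, 4]] * 3
aliased[0][0] = 99
print(aliased)
# [[99, 1, 2, 3, 4], [99, 1, 2, 3, 4], [99, 1, 2, 3, 4]]
```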
How to Use a Nested For Loop in the List Comprehension?
To be frank, this is super-simple stuff. Do you remember the formula of list comprehension (= ‘[‘ + expression + context + ‘]’)?
The context is an arbitrarily complex construct of for loops and if restrictions with the goal of specifying the data items on which the expression should be applied.
In the expression, you can use any variable you define within a for loop in the context. Let’s have a look at an example.
Suppose you want to use list comprehension to make this code more concise (for example, you want to find all possible pairs of users in your social network application):
# BEFORE
users = ["John", "Alice", "Ann", "Zach"]
pairs = []
for x in users:
    for y in users:
        if x != y:
            pairs.append((x,y))
print(pairs)
#[('John', 'Alice'), ('John', 'Ann'), ('John', 'Zach'), ('Alice', 'John'), ('Alice', 'Ann'), ('Alice', 'Zach'), ('Ann', 'John'), ('Ann', 'Alice'), ('Ann', 'Zach'), ('Zach', 'John'), ('Zach', 'Alice'), ('Zach', 'Ann')]
Now, this code is a mess! How can we fix it? Simply use nested list comprehension!
# AFTER
pairs = [(x,y) for x in users for y in users if x!=y]
print(pairs)
# [('John', 'Alice'), ('John', 'Ann'), ('John', 'Zach'), ('Alice', 'John'), ('Alice', 'Ann'), ('Alice', 'Zach'), ('Ann', 'John'), ('Ann', 'Alice'), ('Ann', 'Zach'), ('Zach', 'John'), ('Zach', 'Alice'), ('Zach', 'Ann')]
As you can see, we are doing exactly the same thing as with un-nested list comprehension. The only difference is to write the two for loops and the if statement in a single line within the list notation [].
How to Use a List Comprehension Statement Within a List Comprehension Statement?
Our goal is to solve the following problem: given a multiline string, create a list of lists—each consisting of all the words in a line that have more than three characters.
## Data
text = '''
Call me Ishmael. Some years ago - never mind how long precisely - having
little or no money in my purse, and nothing particular to interest me
on shore, I thought I would sail about a little and see the watery part
of the world. It is a way I have of driving off the spleen, and regulating
the circulation. - Moby Dick'''

words = [[x for x in line.split() if len(x)>3] for line in text.split('\n')]

print(words)
This is a nested list comprehension statement. It creates a new inner list as an element of the outer list. Each inner list contains all words with more than three characters. The outer list contains one inner list for each line of text.
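To see what the nested statement does step by step, here is a sketch that compares it with equivalent explicit loops on a shortened two-line text. The shortened text and the names words_loop and inner are assumptions made for illustration.

```python
text = 'Call me Ishmael.\nSome years ago'

# Nested list comprehension: the inner comprehension is the expression of the outer one
words = [[x for x in line.split() if len(x) > 3] for line in text.split('\n')]

# Equivalent explicit loops
words_loop = []
for line in text.split('\n'):
    inner = []
    for x in line.split():
        if len(x) > 3:
            inner.append(x)
    words_loop.append(inner)

print(words)
# [['Call', 'Ishmael.'], ['Some', 'years']]
print(words == words_loop)
# True
```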
In this article, you’ll learn the ins and outs of filtering lists in Python. In particular, you’re going to learn how to filter a list of dictionaries. So let’s get started!
Short answer: The list comprehension statement [x for x in lst if condition(x)] creates a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s a quick and minimal example:
l = [{'key':10}, {'key':4}, {'key':8}]

def condition(dic):
    ''' Define your own condition here'''
    return dic['key'] > 7

filtered = [d for d in l if condition(d)]

print(filtered)
# [{'key': 10}, {'key': 8}]
You’ll now get the step-by-step explanation of this solution. I tried to keep it as simple as possible. So keep reading!
Filter a List of Dictionaries By Value
Problem: Given a list of dictionaries. Each dictionary consists of one or more (key, value) pairs. You want to filter them by value of a particular dictionary key (attribute). How do you do this?
Minimal Example: Consider the following example where you’ve three user dictionaries with username, age, and play_time keys. You want to get a list of all users that meet a certain condition such as play_time>100. Here’s what you try to accomplish:
Solution: Use the list comprehension [x for x in lst if condition(x)] to create a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s the code that shows you how to filter out all user dictionaries that don’t meet the condition of having played more than 100 hours.
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25, 'play_time': 121},]

superplayers = [user for user in users if user['play_time']>100]

print(superplayers)
The output is the filtered list of dictionaries that meet the condition:
[{'username': 'alice', 'age': 23, 'play_time': 101}, {'username': 'ann', 'age': 25, 'play_time': 121}]
Filter a List of Dictionaries By Key
Problem: Given a list of dictionaries. Each dictionary consists of one or more (key, value) pairs. You want to filter them by key (attribute). All dictionaries that don’t have this key (attribute) should be filtered out. How do you do this?
Minimal Example: Consider the following example again where you’ve three user dictionaries with username, age, and play_time keys. You want to get a list of all users for which the key play_time exists. Here’s what you try to accomplish:
The output should look like this where the play_time attribute determines whether a dictionary passes the filter or not (as long as it exists, it shall pass the filter).
Solution: Use the list comprehension [x for x in lst if condition(x)] to create a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s the code that shows you how to filter out all user dictionaries that don’t meet the condition of having a key play_time.
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25},]

superplayers = [user for user in users if 'play_time' in user]

print(superplayers)
The output is the filtered list of dictionaries that meet the condition:
[{'username': 'alice', 'age': 23, 'play_time': 101}, {'username': 'bob', 'age': 31, 'play_time': 88}]
Problem: Given a list of dictionaries. Each dictionary consists of multiple (key, value) pairs. You want to sort them by value of a particular dictionary key (attribute). How do you sort this list of dictionaries?
Minimal Example: Consider the following example where you want to sort a list of salary dictionaries by value of the key 'Alice'.
Solution: You have two main ways to do this—both are based on defining the key function of Python’s sorting methods. The key function maps each list element (in our case a dictionary) to a single value that can be used as the basis of comparison.
Use a lambda function as key function to sort the list of dictionaries.
Use the itemgetter function as key function to sort the list of dictionaries.
Here’s the code of the first option using a lambda function that returns the value of the key 'Alice' from each dictionary:
# Create the dictionary of Bob's and Alice's salary data
salaries = [{'Alice': 100000, 'Bob': 24000},
            {'Alice': 121000, 'Bob': 48000},
            {'Alice': 12000, 'Bob': 66000}]

# Use the sorted() function with key argument to create a new list.
# Each dictionary list element is "reduced" to the value stored for key 'Alice'
sorted_salaries = sorted(salaries, key=lambda d: d['Alice'])

# Print everything to the shell
print(sorted_salaries)
The output is the sorted list of dictionaries. Note that the first dictionary has the smallest salary of Alice and the third dictionary has the largest salary of Alice.
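The second option, itemgetter, is only named above. A minimal sketch of how it could look with the same salary data follows; itemgetter comes from the standard library module operator.

```python
from operator import itemgetter

salaries = [{'Alice': 100000, 'Bob': 24000},
            {'Alice': 121000, 'Bob': 48000},
            {'Alice': 12000, 'Bob': 66000}]

# itemgetter('Alice') returns a function that looks up the key 'Alice' in its argument
sorted_salaries = sorted(salaries, key=itemgetter('Alice'))

print(sorted_salaries)
# [{'Alice': 12000, 'Bob': 66000}, {'Alice': 100000, 'Bob': 24000}, {'Alice': 121000, 'Bob': 48000}]
```

Both options produce the same ordering, so the choice between lambda and itemgetter is mostly a matter of style.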
In this article, you’ve learned how to filter a list of dictionaries easily with a simple list comprehension statement. That’s more readable, and for large lists often faster, than using the filter() function proposed in many other blog tutorials. Guido van Rossum, the creator of Python, disliked the filter() function!
I’ve realized that professional coders tend to use dictionaries more often than beginners due to their superior understanding of the benefits of dictionaries. If you want to learn about those, check out my in-depth tutorial of Python dictionaries.
If you want to stop learning and start earning with Python, check out my free webinar “How to Become a Python Freelance Developer?”. It’s a great way of starting your thriving coding business online.
Here’s your free PDF cheat sheet showing you all Python list methods on one simple page. Click the image to download the high-resolution PDF file, print it, and post it to your office wall:
lst.sort(): Sorts the elements in the list lst in ascending order.
If you’ve studied the table carefully, you’ll know the most important list methods in Python. Let’s have a look at some examples of above methods:
>>> l = []
>>> l.append(2)
>>> l
[2]
>>> l.clear()
>>> l
[]
>>> l.append(2)
>>> l
[2]
>>> l.copy()
[2]
>>> l.count(2)
1
>>> l.extend([2,3,4])
>>> l
[2, 2, 3, 4]
>>> l.index(3)
2
>>> l.insert(2, 99)
>>> l
[2, 2, 99, 3, 4]
>>> l.pop()
4
>>> l.remove(2)
>>> l
[2, 99, 3]
>>> l.reverse()
>>> l
[3, 99, 2]
>>> l.sort()
>>> l
[2, 3, 99]
Where to Go From Here?
Want more cheat sheets? Excellent. I believe learning with cheat sheets is one of the most efficient learning techniques. Join my free Python email list where I’ll send you more than 10 new Python cheat sheets and regular Python courses for continuous improvement. It’s free!
In this article, I’ll show you how to divide a list into equally-sized chunks in Python. Step-by-step, you’ll arrive at the following great code that accomplishes exactly that:
If you need some explanations, read on because I’ll explain it to you in detail:
Chunking Your List
Let’s make this question more palpable by transforming it into a practical problem:
Problem: Imagine that you have a temperature sensor that sends data every 6 minutes, which makes 10 data points per hour. All these data points are stored in one list for each day.
Now, we want to have a list of hourly average temperatures for each day—this is why we need to split the list of data for one day into evenly sized chunks.
Solution: To achieve this, we use a for-loop and Python’s built-in function range() which we have to examine in depth.
The range() function can be used either with one, two or three arguments.
If you use it with one single argument, e.g., range(10), we get a range object containing the numbers 0 to 9. So, if you call range with one argument, this argument will be interpreted as the max or stop value of the range, but it is excluded from the range.
You can also call the range() function with two arguments, e.g., range(5, 10). This call with two arguments returns a range object containing the numbers 5 to 9. So, now we have a lower and an upper bound for the range. Contrary to the stop value, the start value is included in the range.
In a call of the function range() with three parameters, the first parameter is the start value, the second one is the stop value and the third value is the step size. For example, range(5, 15, 2) returns a range object containing the following values: 5, 7, 9, 11, 13. As you can see, the range starts with the start and then it adds the step value as long as the values are less than the stop value.
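The three call forms can be checked quickly by converting each range object to a list. The variable names below are illustrative.

```python
one_arg = list(range(10))           # stop only: start defaults to 0
two_args = list(range(5, 10))       # start and stop
three_args = list(range(5, 15, 2))  # start, stop, and step

print(one_arg)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(two_args)
# [5, 6, 7, 8, 9]
print(three_args)
# [5, 7, 9, 11, 13]
```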
In our problem, our chunks have a length of 10, the start value is 0 and the max value is the end of the list of data.
Putting all together: Calling range(0, len(data), 10) will give us exactly what we need to iterate over the chunks. Let’s put some numbers there to visualize it.
For one single day, we have a data length of 24 * 10 = 240, so the call of the range function would be this: range(0, 240, 10) and the resulting range would be 0, 10, 20, 30, …, 230. Pause a moment and consider these values: they represent the indices of the first element of each chunk.
So what do we have now? The start indices of each chunk and also the length – and that’s all we need to slice the input data into the chunks we need.
The slicing operator takes two or three arguments separated by the colon : symbol. They have the same meaning as in the range function.
data = [15.7, 16.2, 16.5, 15.9, ..., 27.3, 26.4, 26.1, 27.2]
chunk_length = 10

for i in range(0, len(data), chunk_length):
    print(data[i:i+chunk_length])
However, we can still improve this code and make it reusable by creating a generator out of it.
Chunking With Generator Expressions
A generator is a function but instead of a return statement it uses the keyword yield.
The keyword yield pauses the function and returns a value. The next time a value is requested from the generator, execution resumes after the yield, the next value is returned, and the function pauses again. This behavior can be used in a for-loop, where we want to get a value from the generator, work with this value inside the loop and then repeat it with the next value. Now, let’s take a look at the improved version of our code:
data = [15.7, 16.2, 16.5, 15.9, ..., 27.3, 26.4, 26.1, 27.2]
chunk_length = 10

def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

for chunk in make_chunks(data, chunk_length):
    print(chunk)
That looks already pretty pythonic and we can reuse the function make_chunks() for all the other data we need to process.
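To see the pause-and-resume behavior of yield, you can request the chunks one at a time with next(). This sketch uses a small made-up list of five numbers instead of the temperature data.

```python
def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

chunks = make_chunks([1, 2, 3, 4, 5], 2)  # creates the generator, runs no loop body yet

first = next(chunks)
second = next(chunks)
third = next(chunks)
print(first, second, third)
# [1, 2] [3, 4] [5]

# The generator is now exhausted; another next(chunks) would raise StopIteration
```

Note that the last chunk is shorter when the list length is not a multiple of the chunk length.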
Let’s finish the code so that we get a list of hourly average temperatures as result.
import random

def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

def process(chunk):
    return round(sum(chunk)/len(chunk), 2)

n = 10
# generate random temperature values
day_temperatures = [random.random() * 20 for x in range(24 * n)]
avg_per_hour = []

for chunk in make_chunks(day_temperatures, n):
    r = process(chunk)
    avg_per_hour.append(r)

print(avg_per_hour)
And that’s it, this cool pythonic code solves our problem. We can make the code even a bit shorter but I consider this code less readable because you need to know really advanced Python concepts.
import random

make_chunks = lambda data, n: (data[i:i + n] for i in range(0, len(data), n))
process = lambda data: round(sum(data)/len(data), 2)

n = 10
# generate random temperature values
day_temperatures = [random.random() * 20 for x in range(24 * n)]
avg_per_hour = []

for chunk in make_chunks(day_temperatures, n):
    r = process(chunk)
    avg_per_hour.append(r)

print(avg_per_hour)
So, what did we do? We reduced the helper functions to lambda expressions, and instead of the generator function we use a special shorthand: a generator expression written in parentheses.
Summary
To sum up the solution: We used the range function with three arguments, the start value, the stop value and the step value. By setting the step value to our desired chunk length, the start value to 0 and the stop value to the total data length, we get a range object containing all the start indices of our chunks. With the help of slicing we can access exactly the chunk we need in each iteration step.
Where to Go From Here?
Want to start earning a full-time income with Python—while working only part-time hours? Then join our free Python Freelancer Webinar.
It shows you exactly how you can grow your business and Python skills to a point where you can work comfortably for 3-4 hours from home and enjoy the rest of the day (= 20 hours) spending time with the people you love, doing things you enjoy.
Want to calculate the standard deviation of a column in your Pandas DataFrame?
In case you attended your last statistics course a few years ago, let’s quickly recap the definition of variance: it’s the average squared deviation of the list elements from the average value. The standard deviation is the square root of the variance.
You can do this by using the df.std() method that calculates the standard deviation along all numeric columns. You can then get the column you’re interested in after the computation.
import pandas as pd

# Create your Pandas DataFrame
d = {'username': ['Alice', 'Bob', 'Carl'],
     'age': [18, 22, 43],
     'income': [100000, 98000, 111000]}
df = pd.DataFrame(d)

print(df)
Your DataFrame looks like this:
  username  age  income
0    Alice   18  100000
1      Bob   22   98000
2     Carl   43  111000
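Here’s how you can calculate the standard deviation of all numeric columns. The argument numeric_only=True skips the non-numeric username column, which recent pandas versions would otherwise reject with an error:
print(df.std(numeric_only=True))
The output is the standard deviation of all numeric columns:
age         13.428825
income    7000.000000
dtype: float64
To get the standard deviation of an individual column, access it using simple indexing:
print(df.std(numeric_only=True)['age'])
# approximately 13.43 (the variance, by comparison, is 180.33333333333334)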
Standard Deviation in NumPy Library
Python’s package for data science computation NumPy also has great statistics functionality. You can calculate all basic statistics functions such as average, median, variance, and standard deviation on NumPy arrays. Simply import the NumPy library and use the np.std(a) function to calculate the standard deviation of NumPy array a.
Here’s the code:
import numpy as np

a = np.array([1, 2, 3])
print(np.std(a))
# 0.816496580927726
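Note that NumPy and Pandas use different defaults: np.std() divides by n (ddof=0, the population standard deviation), whereas the Pandas std() method divides by n - 1 (ddof=1, the sample standard deviation). As a sketch of the difference, the following uses the standard library's statistics module rather than NumPy, so it runs without extra packages:

```python
import statistics

a = [1, 2, 3]

# Population standard deviation: divides by n, like np.std(a) with its default ddof=0
population_std = statistics.pstdev(a)

# Sample standard deviation: divides by n - 1, like the Pandas default ddof=1
sample_std = statistics.stdev(a)

print(population_std)  # approximately 0.8165
print(sample_std)      # 1.0
```

With NumPy, passing ddof=1 to np.std() gives the sample version as well.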
Where to Go From Here?
Before you can become a data science master, you first need to master Python. Join my free Python email course and receive your daily Python lesson directly in your INBOX. It’s fun!
You work in law enforcement for the US Department of Labor, finding companies that pay below minimum wage so you can initiate further investigations. Like hungry dogs on the back of a meat truck, your Fair Labor Standards Act (FLSA) officers are already waiting for the list of companies that violated the minimum wage law. Can you give it to them?
This article shows you how to calculate the average of a given list of numerical inputs in Python.
In case it’s been a few years since your last statistics course, let’s quickly recap the definition of the average: sum over all values and divide the sum by the number of values.
So, how to calculate the average of a given list in Python?
Python 3.x doesn’t have a built-in function to calculate the average. Instead, simply divide the sum of the list values by the number of list elements using the two built-in functions sum() and len(). You calculate the average of a given list in Python as sum(list)/len(list). The return value is of type float.
Here’s a short example that calculates the average income of income data $80000, $90000, and $100000:
income = [80000, 90000, 100000]
average = sum(income) / len(income)
print(average)
# 90000.0
You can see that the return value is of type float, even though the list data is of type integer. The reason is that the default division operator in Python performs floating point arithmetic, even if you divide two integers.
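To make the difference concrete, here is a small sketch that compares the true division operator / with the floor division operator // on the same income data. The variable names are illustrative.

```python
income = [80000, 90000, 100000]

# True division always returns a float, even for two integers
float_average = sum(income) / len(income)

# Floor division on integers returns an int, rounded down
int_average = sum(income) // len(income)

print(float_average, type(float_average).__name__)  # 90000.0 float
print(int_average, type(int_average).__name__)      # 90000 int
```

Keep in mind that both variants raise a ZeroDivisionError for an empty list, because len() returns 0 in that case.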
Puzzle: Try to modify the elements in the list income so that the average is 80000.0 instead of 90000.0:
# Define the list data
income = [80000, 90000, 100000]

# Calculate the average as the sum divided
# by the length of the list (float division)
average = sum(income) / len(income)

# Print the result to the shell
print(average)

# Puzzle: modify the income list so that
# the result is 80000.0
This is the absolute minimum you need to know about calculating basic statistics such as the average in Python. But there’s far more to it and studying the other ways and alternatives will actually make you a better coder. So, let’s dive into some related questions and topics you may want to learn!
Python List Average Median
What’s the median of a Python list? Formally, the median is “the value separating the higher half from the lower half of a data sample” (wiki).
How to calculate the median of a Python list?
Sort the list of elements using the sorted() built-in function in Python.
Calculate the index of the middle element by dividing the length of the list by 2 using integer division.
Return the middle element.
Together, you can simply get the median by executing the expression median = sorted(income)[len(income)//2].
Here’s the concrete code example:
income = [80000, 90000, 100000, 88000]

average = sum(income) / len(income)
median = sorted(income)[len(income)//2]

print(average)
# 89500.0

print(median)
# 90000
The mean value is exactly the same as the average value: sum up all values in your sequence and divide by the length of the sequence. You can use either the calculation sum(list) / len(list) or you can import the statistics module and call mean(list).
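As a quick illustration, the following sketch computes the mean of the income list from the median example both ways. It only uses the standard library.

```python
from statistics import mean

income = [80000, 90000, 100000, 88000]

# Manual computation with the two built-in functions
manual_average = sum(income) / len(income)

# The same computation via the statistics module
library_average = mean(income)

print(manual_average)   # 89500.0
print(library_average)  # same value as the manual computation
```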
The statistics module also provides the functions median(), median_low(), and median_high(). These are especially interesting if the list has two middle values and you want to decide which one to take.
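For a list with an even number of elements, there are two middle values. The sketch below shows how the three median functions of the standard statistics module treat this case, using the income list from the example above.

```python
import statistics

income = [80000, 90000, 100000, 88000]
# sorted(income) is [80000, 88000, 90000, 100000]

median_value = statistics.median(income)      # mean of the two middle values
lower_median = statistics.median_low(income)  # smaller of the two middle values
upper_median = statistics.median_high(income) # larger of the two middle values

print(median_value)  # 89000.0
print(lower_median)  # 88000
print(upper_median)  # 90000
```

The expression sorted(income)[len(income)//2] picks the larger of the two middle values, so it matches median_high() here.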
Python List Average Standard Deviation
Standard deviation is defined as the square root of the variance, that is, of the average squared deviation of the data values from the average (wiki). It’s used to measure the dispersion of a data set. You can calculate the sample standard deviation of the values in a list by using the stdev() function of the statistics module:
import statistics as s

lst = [1, 0, 4, 3]
print(s.stdev(lst))
# 1.8257418583505538
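To see where this number comes from, here is a minimal sketch that recomputes it step by step from the definition. It assumes the sample formula (division by n - 1), which is what statistics.stdev uses; the intermediate variable names are illustrative.

```python
import math
import statistics

lst = [1, 0, 4, 3]

# Step 1: the average of the values
average = sum(lst) / len(lst)  # 2.0

# Step 2: the squared deviation of each value from the average
squared_deviations = [(x - average) ** 2 for x in lst]  # [1.0, 4.0, 4.0, 1.0]

# Step 3: sample variance, dividing by n - 1
sample_variance = sum(squared_deviations) / (len(lst) - 1)

# Step 4: the standard deviation is the square root of the variance
sample_std = math.sqrt(sample_variance)

print(sample_std)
print(math.isclose(sample_std, statistics.stdev(lst)))  # True
```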
Python List Average Min Max
In contrast to the average, there are Python built-in functions that calculate the minimum and maximum of a given list. The min(list) function calculates the minimum value and the max(list) function calculates the maximum value in a list.
Here’s an example of the minimum, maximum and average computations on a Python list:
lst = [1, 1, 2, 0]

average = sum(lst) / len(lst)
minimum = min(lst)
maximum = max(lst)

print(average)
# 1.0

print(minimum)
# 0

print(maximum)
# 2
Python List Average Sum
How to calculate the average using the sum() built-in Python method? Simple, divide the result of the sum(list) function call by the number of elements in the list. This normalizes the result and calculates the average of all elements in a list.
Again, the following example shows how to do this:
lst = [1, 1, 2, 0]

average = sum(lst) / len(lst)
print(average)
# 1.0
Python List Average NumPy
Python’s package for data science computation NumPy also has great statistics functionality. You can calculate all basic statistics functions such as average, median, variance, and standard deviation on NumPy arrays. Simply import the NumPy library and use the np.average(a) method to calculate the average value of NumPy array a.
Here’s the code:
import numpy as np

a = np.array([1, 2, 3])
print(np.average(a))
# 2.0
Python Average List of (NumPy) Arrays
NumPy’s average function computes the average of all numerical values in a NumPy array. When used without parameters, it simply calculates the numerical average of all values in the array, no matter the array’s dimensionality. For example, the expression np.average([[1,2],[2,3]]) results in the average value (1+2+2+3)/4 = 2.0.
However, what if you want to calculate the weighted average of a NumPy array? In other words, you want to overweight some array values and underweight others.
You can easily accomplish this with NumPy’s average function by passing the weights argument to the NumPy average function.
In the first example, we simply averaged over all array values: (-1+1+2+2)/4 = 1.0. However, in the second example, we overweight the last array element 2—it now carries five times the weight of the other elements resulting in the following computation: (-1+1+2+(2+2+2+2+2))/8 = 1.5.
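The two computations described above can be reproduced with the following sketch. The array and the weights are reconstructed from the arithmetic in the text, so treat them as an assumption rather than the original listing.

```python
import numpy as np

a = np.array([-1, 1, 2, 2])

# Plain average: every element carries the same weight
plain = np.average(a)

# Weighted average: the last element counts five times as much as the others
weighted = np.average(a, weights=[1, 1, 1, 5])

print(plain)     # 1.0
print(weighted)  # 1.5
```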
Let’s explore the different parameters we can pass to np.average(...).
a: The NumPy array, which can be multi-dimensional.
axis (optional): The axis along which you want to average. If you don’t specify the argument, the averaging is done over the whole array.
weights (optional): The weights of the values along the specified axis. If you don’t specify the argument, all values are weighted equally.
returned (optional): Whether to also return the sum of the weights. Only if you set this to True, you will get a tuple (average, weights_sum) as a result. This may help you to normalize the output. In most cases, you can skip this argument.
Here is an example how to average along the columns of a 2D NumPy array with specified weights for both rows.
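A minimal sketch of such a call could look as follows. The array values and the weights [1, 3] are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical 2D array with two rows and three columns
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=0 averages down each column; weights holds one weight per row
column_averages = np.average(a, axis=0, weights=[1, 3])
print(column_averages)  # [3.25 4.25 5.25]

# returned=True additionally returns the sum of the weights
averages, weight_sums = np.average(a, axis=0, weights=[1, 3], returned=True)
print(weight_sums)
```

For the first column, the computation is (1*1 + 3*4) / (1 + 3) = 3.25.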
Problem: Given is a list of dictionaries. Your goal is to calculate the average of the values associated to a specific key from all dictionaries.
Example: Consider the following example where you want to get the average value of a list of database entries (e.g., each stored as a dictionary) stored under the key 'age'.
db = [{'username': 'Alice', 'joined': 2020, 'age': 23},
      {'username': 'Bob', 'joined': 2018, 'age': 19},
      {'username': 'Alice', 'joined': 2020, 'age': 31}]

average = ...  # ... Averaging Magic Here ...
print(average)
The output should be the average of the ages: (23+19+31)/3 = 24.333.
Solution: Use a generator expression to produce the age values on the fly, sum them up, and divide the sum by the number of dictionaries. The result is the average of all age values in the list of dictionaries.
db = [{'username': 'Alice', 'joined': 2020, 'age': 23},
      {'username': 'Bob', 'joined': 2018, 'age': 19},
      {'username': 'Alice', 'joined': 2020, 'age': 31}]

average = sum(d['age'] for d in db) / len(db)
print(average)
# 24.333333333333332
Let’s move on to the next question: how to calculate the average of a list of floats?
Python Average List of Floats
Averaging a list of floats is as simple as averaging a list of integers: sum them up and divide the sum by the number of float values.
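A minimal sketch with a hypothetical list of float values follows. It also shows math.fsum from the standard library, which tracks intermediate rounding errors and can be the safer choice for long lists of floats.

```python
import math

prices = [1.5, 2.5, 3.5]  # hypothetical float values

# Same formula as for integers
average = sum(prices) / len(prices)
print(average)  # 2.5

# math.fsum reduces accumulated floating point error in the sum
precise_average = math.fsum(prices) / len(prices)
print(precise_average)  # 2.5
```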
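The same idea extends to a list of tuples such as [(1, 2), (2, 2), (1, 1)]: flatten the tuple values into a single list and calculate the average of this list. Next, I’ll give three ways of doing this (unpacking, list comprehension, and a nested for loop) in a single code snippet: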
lst = [(1, 2), (2, 2), (1, 1)]

# 1. Unpacking
lst_2 = [*lst[0], *lst[1], *lst[2]]
print(sum(lst_2) / len(lst_2))
# 1.5

# 2. List comprehension
lst_3 = [x for t in lst for x in t]
print(sum(lst_3) / len(lst_3))
# 1.5

# 3. Nested for loop
lst_4 = []
for t in lst:
    for x in t:
        lst_4.append(x)
print(sum(lst_4) / len(lst_4))
# 1.5
Unpacking: The asterisk operator in front of an iterable “unpacks” all values in the iterable into the outer context. You can use it only in a container data structure that’s able to catch the unpacked values.
List comprehension is a compact way of creating lists. The simple formula is [ expression + context ].
Expression: What to do with each list element?
Context: What list elements to select? It consists of an arbitrary number of for and if statements.
The example [x for x in range(3)] creates the list [0, 1, 2].
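To illustrate the two parts of the formula, here is a short sketch with one comprehension that uses an if clause in the context and one that uses two for clauses. The example values are illustrative.

```python
# Expression: x * x
# Context: for x in range(6) if x % 2 == 0
even_squares = [x * x for x in range(6) if x % 2 == 0]
print(even_squares)  # [0, 4, 16]

# Two for clauses in the context flatten a list of tuples
pairs = [(1, 2), (2, 2), (1, 1)]
flat = [x for t in pairs for x in t]
print(flat)  # [1, 2, 2, 2, 1, 1]
```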
Python Average Nested List
Problem: How to calculate the average of a nested list?
Example: Given a nested list [[1, 2, 3], [4, 5, 6]]. You want to calculate the average (1+2+3+4+5+6)/6=3.5. How do you do that?
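Solution: Again, there are three solution ideas:
Unpack the values of the inner lists into a single list and calculate the average of this list.
Use a list comprehension to flatten the nested list and calculate the average of the flat list.
Use a nested for loop to collect all values in a flat list and calculate the average of this list.
Next, I’ll give all three examples in a single code snippet: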
lst = [[1, 2, 3], [4, 5, 6]]

# 1. Unpacking
lst_2 = [*lst[0], *lst[1]]
print(sum(lst_2) / len(lst_2))
# 3.5

# 2. List comprehension
lst_3 = [x for t in lst for x in t]
print(sum(lst_3) / len(lst_3))
# 3.5

# 3. Nested for loop
lst_4 = []
for t in lst:
    for x in t:
        lst_4.append(x)
print(sum(lst_4) / len(lst_4))
# 3.5
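The unpacking variant hardcodes the number of inner lists. As an alternative that is not part of the three ideas above, the following sketch uses itertools.chain.from_iterable from the standard library, which flattens any number of inner lists.

```python
from itertools import chain

lst = [[1, 2, 3], [4, 5, 6]]

# chain.from_iterable concatenates all inner lists into one iterable
flat = list(chain.from_iterable(lst))

average = sum(flat) / len(flat)
print(average)  # 3.5
```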
Where to Go From Here
Python 3.x doesn’t have a built-in function to calculate the average. Instead, simply divide the sum of the list values by the number of list elements using the two built-in functions sum() and len(). You calculate the average of a given list in Python as sum(list)/len(list). The return value is of type float.
If you keep struggling with those basic Python commands and you feel stuck in your learning progress, I’ve got something for you: Python One-Liners (Amazon Link).
In the book, I’ll give you a thorough overview of critical computer science topics such as machine learning, regular expression, data science, NumPy, and Python basics—all in a single line of Python code!
OFFICIAL BOOK DESCRIPTION: Python One-Liners will show readers how to perform useful tasks with one line of Python code. Following a brief Python refresher, the book covers essential advanced topics like slicing, list comprehension, broadcasting, lambda functions, algorithms, regular expressions, neural networks, logistic regression and more. Each of the 50 book sections introduces a problem to solve, walks the reader through the skills necessary to solve that problem, then provides a concise one-liner Python solution with a detailed explanation.