Note the dot character inside the character set. As you may know, the dot metacharacter matches an arbitrary character if it is used outside a character set.
The answer is that the dot inside the character set matches the dot symbol—and not an arbitrary character. The reason is that the character set removes the special meaning of the dot symbol.
If you’ve already learned how to make basic 3D plots in matplotlib and want to take them to the next level, then look no further. In this article, I’ll teach you how to create the two most common 3D plots (surface and wireframe plots) and a step-by-step method you can use to create any shape you can imagine.
In addition to importing matplotlib.pyplot as plt and calling plt.show(), you need to do four things to create a 3D plot in matplotlib:
Import the Axes3D object
Initialize your Figure and Axes3D object
Get some 3D data
Plot it using Axes notation
Here’s a wireframe plot:
# Standard import
import matplotlib.pyplot as plt

# Import 3D Axes
from mpl_toolkits.mplot3d import axes3d

# Set up Figure and 3D Axes
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Get some data
X, Y, Z = axes3d.get_test_data(0.1)

# Plot using Axes notation
ax.plot_wireframe(X, Y, Z)
plt.show()
Changing the plot call to ax.plot_surface(X, Y, Z) gives a surface plot instead.
Great! You’ve just created your first 3D wireframe and surface plots. Don’t worry if that was a bit fast; let’s dive into a more detailed example.
But first, note that your plots may look different to mine because I use the seaborn style throughout. You can set this by installing the seaborn library and calling the set function at the top of your code.
import seaborn as sns; sns.set()
Matplotlib 3D Plot Example
The four steps needed to create advanced 3D plots are the same as those needed to create basic ones. If you don’t understand those steps, check out my article on how to make basic 3D plots first.
The most difficult part of creating surface and wireframe plots is step 3: getting 3D data. Matplotlib actually includes a helper function, axes3d.get_test_data(), to generate some data for you. It accepts a float; for best results, choose a value between 0 and 1. It always produces the same surface, but different floats give you differently sized arrays and thus change how detailed the plot is.
However, the best way to learn 3D plotting is to create custom plots.
At the end of step 3, you want to have three numpy arrays X, Y and Z, which you will pass to ax.plot_wireframe() or ax.plot_surface(). You can break step 3 down into four steps:
Define the x-axis and y-axis limits
Create a grid of XY-points (to get X and Y)
Define a z-function
Apply the z-function to X and Y (to get Z)
In matplotlib, the z-axis is vertical by default. So, the ‘bottom’ of the Axes3D object is a grid of XY points. For surface or wireframe plots, each pair of XY points has a corresponding Z value. So, we can think of surface/wireframe plots as the result of applying some z-function to every XY-pair on the ‘bottom’ of the Axes3D object.
Since there are infinitely many numbers on the XY-plane, it is not possible to map every one to a Z-value. You just need an amount large enough to deceive humans – anything above 50 pairs usually works well.
To create your XY-points, you first need to define the x-axis and y-axis limits. Let’s say you want X-values ranging from -5 to +5 and Y-values from -2 to +2. You can create an array of numbers for each of these using the np.linspace() function. For reasons that will become clear later, I will make x have 100 points, and y have 70.
x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)
Both x and y are 1D arrays containing num equally spaced floats in the ranges [-5, 5] and [-2, 2] respectively.
Since the XY-plane is a 2D object, you now need to create a rectangular grid of all xy-pairs. To do this, use the numpy function np.meshgrid(). It takes N 1D arrays and turns them into an N-dimensional grid. In this case, it takes two 1D arrays and turns them into a 2D grid.
X, Y = np.meshgrid(x, y)
Now you’ve created X and Y, so let’s inspect them.
print(f'Type of X: {type(X)}')
print(f'Shape of X: {X.shape}\n')
print(f'Type of Y: {type(Y)}')
print(f'Shape of Y: {Y.shape}')
Type of X: <class 'numpy.ndarray'>
Shape of X: (70, 100)

Type of Y: <class 'numpy.ndarray'>
Shape of Y: (70, 100)
Both X and Y are numpy arrays of the same shape: (70, 100). This corresponds to the size of y and x respectively. As you would expect, the size of y dictates the height of the array, i.e., the number of rows and the size of x dictates the width, i.e., the number of columns.
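To make the grid idea concrete, here is a minimal sketch with much smaller arrays than the ones above. The values 0, 1, 2 and 10, 20 are made up purely for illustration, so that you can read the whole grid at a glance.

```python
import numpy as np

x = np.array([0, 1, 2])   # 3 x-values -> 3 columns
y = np.array([10, 20])    # 2 y-values -> 2 rows
X, Y = np.meshgrid(x, y)

print(X)
# [[0 1 2]
#  [0 1 2]]
print(Y)
# [[10 10 10]
#  [20 20 20]]

# Reading X and Y at the same position gives one (x, y) pair of the grid
pairs = [(int(X[i, j]), int(Y[i, j])) for i in range(2) for j in range(3)]
print(pairs)
# [(0, 10), (1, 10), (2, 10), (0, 20), (1, 20), (2, 20)]
```

Each row of X repeats the x-values, and each column of Y repeats the y-values, which is why both arrays end up with the shape (len(y), len(x)).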
Note that I used lowercase x and y for the 1D arrays and uppercase X and Y for the 2D arrays. This is standard practice when making 3D plots, and I use it throughout the article.
Now you’ve created your grid of points; it’s time to define a function to apply to them all. Since this function outputs z-values, I call it a z-function. Common z-functions contain np.sin() and np.cos() because they create repeating, cyclical patterns that look interesting when plotted in 3D. Additionally, z-functions usually combine both X and Y variables as 3D plots look at how all the variables interact.
# Define z-function with 2 arguments: x and y
def z_func(x, y):
    return np.sin(np.cos(x) + y)

# Apply to X and Y
Z = z_func(X, Y)
Here I defined a z-function that accepts 2 variables – x and y – and is a combination of np.sin() and np.cos() functions. Then I applied it to X and Y to get the Z array. Thanks to numpy’s vectorized element-wise operations, Python applies the z-function to every XY pair almost instantly and saves you from having to write a wildly inefficient for loop.
Note that Z is the same shape and type as both X and Y.
print(f'Type of Z: {type(Z)}')
print(f'Shape of Z: {Z.shape}')
Type of Z: <class 'numpy.ndarray'>
Shape of Z: (70, 100)
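If you want to convince yourself that the single call z_func(X, Y) really does the same work as a loop over every XY pair, the following sketch compares the two. The names Z_vectorized and Z_loop are mine and only used for this check.

```python
import numpy as np

def z_func(x, y):
    return np.sin(np.cos(x) + y)

x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)
X, Y = np.meshgrid(x, y)

# One call on the full arrays
Z_vectorized = z_func(X, Y)

# The explicit (slow) alternative: visit every XY pair by hand
Z_loop = np.empty_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z_loop[i, j] = z_func(X[i, j], Y[i, j])

print(np.allclose(Z_vectorized, Z_loop))
# True
```

Both approaches fill a (70, 100) array with the same values; the vectorized call just lets numpy do the looping in compiled code.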
Now that you have got your data, all that is left to do is make the plots. Let’s put all the above code together:
import matplotlib.pyplot as plt
import numpy as np

# Set up Figure and 3D Axes
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Create x and y 1D arrays
x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)

# Create X and Y 2D arrays
X, Y = np.meshgrid(x, y)

# Define z-function
def z_func(x, y):
    return np.sin(np.cos(x) + y)

# Create Z 2D array
Z = z_func(X, Y)

# Plot using Axes notation
ax.plot_wireframe(X, Y, Z)
# Set axes labels
ax.set(xlabel='x', ylabel='y', zlabel='z')
plt.show()
Great, I found the above plot by playing around with different z-functions and think it looks pretty cool! Z-functions containing np.log(), np.exp(), np.sin(), np.cos() and combinations of x and y usually lead to interesting plots – I encourage you to experiment yourself.
Now I’ll create 3 different z-functions with the same X and Y as before and create a subplot of them so you can see the differences.
# Set up Figure and Axes
fig, axes = plt.subplots(1, 3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

# Create 3 z-functions
def z_1(x, y):
    return np.exp(np.cos(x)*y)

def z_2(x, y):
    return np.log(x**2 + y**4)

def z_3(x, y):
    return np.sin(x * y)

# Create 3 Z arrays
Z_arrays = [z_1(X, Y), z_2(X, Y), z_3(X, Y)]

# Titles for the plots
z_func_names = ['np.exp(np.cos(x)*y)', 'np.log(x**2 + y**4)', 'np.sin(x * y)']

# Plot all 3 wireframes
for Z_array, z_name, ax in zip(Z_arrays, z_func_names, axes):
    ax.plot_wireframe(X, Y, Z_array)
    ax.set(title=z_name)
plt.show()
I think all of these images demonstrate the power of 3D plotting, and I hope they have encouraged you to create your own.
Now you know how to create any surface or wireframe plot with your data. But so far, you have only used the default settings. Let’s modify them using the available keyword arguments.
Matplotlib 3D Plot Wireframe
To make a wireframe plot, call ax.plot_wireframe(X, Y, Z). These plots give you an overview of the surface. Plus, you can see through them to more easily identify peaks and troughs that may otherwise be hidden.
A wireframe plot works by only plotting a sample of the data passed to it. You can modify how large the samples are with 4 keyword arguments:
rstride and cstride, or
rcount and ccount
The r and c stand for row and column respectively. The difference between the stride and count arguments is similar to the difference between np.arange() and np.linspace().
The stride arguments default to 1 and set the step sizes between each sampled point. A stride of 1 means that every value is chosen, and a stride of 10 means that every 10th value is chosen. In this way, it is similar to np.arange() where you select the step size. A larger stride means fewer values are chosen, so your plot renders faster and is less detailed.
The count arguments default to 50 and set the number of (equally spaced) rows/columns sampled. A count of 1 means you use 1 row/column, and a count of 100 means you use 100. In this way, it is similar to np.linspace() where you select the total number of values with the num keyword argument. A larger count means more values are chosen, so your plot renders slower and is more detailed.
The matplotlib docs say that you should use the count arguments. However, both are still available, and it doesn’t look like the stride arguments will be deprecated any time soon. Note, though, that you cannot use both count and stride: if you try to do so, matplotlib raises a ValueError.
By setting any of the keyword arguments to 0, you do not sample data along that axis. The result is then a 3D line plot rather than a wireframe.
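The np.arange() versus np.linspace() comparison can be sketched in plain numpy. This is only an analogy for how stride and count pick rows out of the 70 rows of Z from the example; it is not matplotlib’s internal sampling code, and the variable names are mine.

```python
import numpy as np

n_rows = 70  # number of rows in the Z array from the example

# Stride: you choose the step size, like np.arange()
stride = 10
rows_by_stride = np.arange(0, n_rows, stride)
print(rows_by_stride)
# [ 0 10 20 30 40 50 60]

# Count: you choose how many rows you want, like np.linspace()
count = 5
rows_by_count = np.linspace(0, n_rows - 1, num=count).astype(int)
print(rows_by_count)
# [ 0 17 34 51 69]
```

With a stride, the number of sampled rows depends on the size of your data. With a count, it stays the same no matter how large the arrays are.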
To demonstrate the differences between different counts or strides, I’ll create a subplot with the same X, Y and Z arrays as the first example but with different stride and count values.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

# Same as first example
x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)
X, Y = np.meshgrid(x, y)

def z_func(x, y):
    return np.sin(np.cos(x) + y)

Z = z_func(X, Y)

# Define different strides
strides = [1, 5, 10]

for stride, ax in zip(strides, axes.flat):
    ax.plot_wireframe(X, Y, Z, rstride=stride, cstride=stride)
    ax.set(title=f'stride={stride}')

plt.show()
Here you can see that a larger stride produces a less detailed wireframe plot. Note that stride=1 is the default and is incredibly detailed for a plot that is supposed to give a general overview of the data.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

counts = [5, 20, 50]

for count, ax in zip(counts, axes.flat):
    # Use same data as the above plots
    ax.plot_wireframe(X, Y, Z, rcount=count, ccount=count)
    ax.set(title=f'count={count}')

plt.show()
Here you can see that a larger count produces a more detailed wireframe plot. Again note that the default count=50 produces a very detailed plot.
To make a surface plot, call ax.plot_surface(X, Y, Z). Surface plots are the same as wireframe plots, except that the spaces in between the lines are colored. Plus, there are some additional keyword arguments you can use, which can add a ton of value to the plot.
First, let’s make the same plots as above with the default surface plot settings and different rcount and ccount values.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

counts = [5, 20, 50]

for count, ax in zip(counts, axes.flat):
    # Use same data as the above plots
    surf = ax.plot_surface(X, Y, Z, rcount=count, ccount=count)
    ax.set(title=f'count={count}')

plt.show()
In contrast to wireframe plots, the space in between each line is filled with the color blue. Note that the plots get whiter as the count gets larger. This is because the lines are white, and, as the count increases, there are more lines on each plot. You can modify this by setting the linewidth or lw argument to a smaller number, such as 0.1 or even 0.
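Here is a sketch of the same three surface plots with the line width set to 0. It rebuilds the X, Y and Z arrays from the earlier example so that it runs on its own. The Axes3D import is only there for older matplotlib versions that need it to register the 3D projection.

```python
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older matplotlib)

# Same data as the first example
x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.cos(X) + Y)

fig, axes = plt.subplots(nrows=1, ncols=3,
                         subplot_kw=dict(projection='3d'),
                         figsize=plt.figaspect(1/3))

counts = [5, 20, 50]
for count, ax in zip(counts, axes.flat):
    # lw=0 hides the white lines between the colored patches
    ax.plot_surface(X, Y, Z, rcount=count, ccount=count, lw=0)
    ax.set(title=f'count={count}')

plt.show()
```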
Much nicer! Now you can see the color of the plot rather than the color of the lines. It is possible to almost completely remove the lines by setting antialiased=False.
Antialiasing removes noise from data and smooths out images. By turning it off, the surface is less smooth, and so you can’t see the lines as easily.
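A minimal sketch of a single surface plot with antialiasing turned off, again using the data from the first example so that it is self-contained:

```python
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older matplotlib)

# Same data as the first example
x = np.linspace(-5, 5, num=100)
y = np.linspace(-2, 2, num=70)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.cos(X) + Y)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# No lines between patches and no antialiasing
surf = ax.plot_surface(X, Y, Z, lw=0, antialiased=False)

plt.show()
```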
Now the surface is slightly less smooth, and so you can’t see the lines.
Matplotlib 3D Surface Plot Cmap
Arguably the most crucial keyword argument for surface plots is cmap which sets the colormap. When you look at a surface plot from different angles, having a colormap helps you understand which parts of the surface are where. Usually, you want high points to be one color (e.g., orange) and low points to be another (e.g., black). Having two distinct colors is especially helpful if you look at a plot from different angles (which I will show you how to do in a moment).
The colormap copper maps large z-values to orange and smaller ones to black.
Now I’ll use 3 different and commonly used colormaps for the same plot to give you an idea of how color can help and (massively) hinder your plots.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

cmaps = ['copper', 'coolwarm', 'jet']

for cmap, ax in zip(cmaps, axes):
    ax.plot_surface(X, Y, Z, lw=0, cmap=cmap)
    ax.set(title=f'{cmap}')

plt.show()
The coolwarm colormap works well if you want to highlight extremely high and extremely low points. This non-technical paper defines a colormap similar to coolwarm and argues it should be the default cmap for all data science work.
The jet colormap is well known and is a terrible choice for all of your plotting needs. It contains so many colors that it is hard for a human to know which corresponds to high, low, or middle points. I included it as an example here but urge you to never use it in any of your plots.
Now let’s look at how the count and stride arguments can affect the color of your surface plots. For brevity, I will just make one subplot demonstrating different rcount and ccount sizes and leave the reader to experiment with rstride and cstride.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

counts = [5, 20, 50]

for count, ax in zip(counts, axes.flat):
    # Use same data as the above plots
    ax.plot_surface(X, Y, Z, rcount=count, ccount=count, cmap='copper', lw=0)
    ax.set(title=f'count={count}')

plt.show()
If you pass a lower value to the count keyword arguments, there are fewer areas that can be colored. As such, the colors have much more distinct bands when you set the count keyword arguments to smaller values. The change in color is much smoother in the plots that have large count arguments.
Matplotlib 3D Plot Colorbar
Adding a colorbar to a 3D surface plot is the same as adding them to other plots.
The simplest method is to save the output of ax.plot_surface() in a variable such as surf and pass that variable to plt.colorbar().
Here’s an example using the three different colormaps from before.
fig, axes = plt.subplots(nrows=1, ncols=3, subplot_kw=dict(projection='3d'), figsize=plt.figaspect(1/3))

cmaps = ['copper', 'coolwarm', 'jet']

for cmap, ax in zip(cmaps, axes):
    # Save surface in a variable: surf
    surf = ax.plot_surface(X, Y, Z, lw=0, cmap=cmap)
    # Plot colorbar on the correct Axes: ax
    fig.colorbar(surf, ax=ax)
    ax.set(title=f'{cmap}')

plt.show()
It’s essential to provide a colorbar for any colored plots you create, especially if you use different colormaps. Remember that colorbar() is a Figure (not Axes) method, and you must use the ax keyword argument to place it on the correct Axes.
Now, let’s see why colormaps are so crucial by rotating the surface plots and viewing them from different angles.
Matplotlib 3D Plot View_Init
One way to rotate your plots is by using the magic command %matplotlib notebook at the top of your Jupyter notebooks. If you do this, all your plots appear in interactive windows. If instead, you use %matplotlib inline (the default settings), you have to rotate your plots using code.
Two attributes control the rotation of a 3D plot: ax.elev and ax.azim, which represent the elevation and azimuthal angles of the plot, respectively.
The elevation is the angle above the XY-plane and the azimuth (don’t worry, I hadn’t heard of it before either) is the counter-clockwise rotation about the z-axis. Note that they are properties of the Axes3D object and so you can happily create subplots where each has a different angle.
Let’s find the default values.
fig = plt.figure()
ax = plt.axes(projection='3d')

print(f'The default elevation angle is: {ax.elev}')
print(f'The default azimuth angle is: {ax.azim}')
The default elevation angle is: 30
The default azimuth angle is: -60
You can see that the defaults are 30 and -60 degrees for the elevation and azimuth, respectively.
You can set them to any float you want, and there are two ways to do it:
Reassign the ax.azim and ax.elev attributes, or
Use the ax.view_init(elev, azim) method
Here’s an example with method 1.
# Same as usual
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, lw=0, cmap='copper')
# Set axis labels so you know what you are looking at
ax.set(xlabel='x', ylabel='y', zlabel='z')

# Reassign rotation angles to 0
ax.azim, ax.elev = 0, 0
plt.show()
Here I set both angles to 0, and you can see the y-axis at the front, the x-axis at the side, and the z-axis as vertical.
I’ll now create the same plot using the ax.view_init() method, which accepts two floats: the elevation and azimuth.
# Same as usual
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, lw=0, cmap='copper')
# Set axis labels so you know what you are looking at
ax.set(xlabel='x', ylabel='y', zlabel='z')

# Set both rotation angles to 0
ax.view_init(elev=0, azim=0)
plt.show()
Excellent! This plot looks identical to the one above, but I used the ax.view_init() method instead. If you just want to change one of the angles, only pass one of the keyword arguments.
# Same as usual
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, lw=0, cmap='copper')
# Set axis labels so you know what you are looking at
ax.set(xlabel='x', ylabel='y', zlabel='z')

# Set elevation to 90 degrees
ax.view_init(elev=90)
plt.show()
Here I set the elevation to 90 degrees but left the azimuth with its default value. This demonstrates one more reason why colormaps are important: you can infer the shape of the surface from the color (black is low, light is high).
Conclusion
Now you know how to create the most critical 3D plots: wireframe and surface plots.
You’ve learned how to create custom 3D plot datasets using np.linspace(), np.meshgrid() and z-functions. Plus, you can create them with varying degrees of accuracy by modifying the count and stride keyword arguments.
You can make surface plots of any color and colormap and modify them so that the color of the lines doesn’t take over the plot. Finally, you can rotate them by setting the ax.azim or ax.elev attributes to a float of your choice and even use the ax.view_init() method to do the same thing.
Congratulations on mastering these plots! Creating other advanced ones such as contour, tri-surface, and quiver plots will now be easy for you. You know all the high-level skills; you just need to go out there and practice.
When does the IndexError: list assignment index out of range appear?
Python throws an IndexError if you try to assign a value to a list index that doesn’t exist yet. For example, if you execute the statement lst[1] = 10 on an empty list lst, Python throws the IndexError. Simply resolve it by adding elements to your list until the index actually exists.
Here’s the minimal example that throws the IndexError:
lst = []
lst[1] = 10
If you run this code, you’ll see that Python throws an IndexError:
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 2, in <module>
    lst[1] = 10
IndexError: list assignment index out of range
You can resolve it by adding two “dummy” elements to the list so that the index 1 actually exists in the list:
lst = [None, None]
lst[1] = 10
print(lst)
Now, Python will print the expected output:
[None, 10]
So what are some other occurrences of the IndexError?
IndexError in For Loop
Frequently, the IndexError happens if you use a for loop to modify some list elements like here:
# WRONG CODE:
lst = []
for i in range(10):
    lst[i] = i
print(lst)
Again, the result is an IndexError:
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 4, in <module>
    lst[i] = i
IndexError: list assignment index out of range
You modify a list element at index i that doesn’t exist in the list. Instead, create the list using the list(range(10)) list constructor.
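Here is a short sketch of three ways to fix the loop. The first is the list(range(10)) constructor mentioned above. The other two (append() and pre-allocating with None) are common alternatives that I added for comparison; the variable names are only illustrative.

```python
# Option 1: build the list directly with the list constructor
lst_1 = list(range(10))

# Option 2: grow the list with append() instead of index assignment
lst_2 = []
for i in range(10):
    lst_2.append(i)

# Option 3: pre-allocate the list so that every index already exists
lst_3 = [None] * 10
for i in range(10):
    lst_3[i] = i

print(lst_1)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(lst_1 == lst_2 == lst_3)
# True
```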
You’ve learned how to resolve one error. By doing this, your Python skills have improved a little bit. Do this every day and soon, you’ll be a skilled master coder.
What Are Alternative Methods to Convert a List of Strings to a String?
Python is flexible—you can use multiple methods to achieve the same thing. So what are the different methods to convert a list to a string?
Method 1: Use the method ''.join(list) to concatenate all strings in a given list to a single string. The string on which you call the method is the delimiter between the list elements.
Method 2: Start with an empty string variable. Use a simple for loop to iterate over all elements in the list and add the current element to the string variable.
Method 3: Use the list comprehension [str(x) for x in list] if the list contains elements of different types to convert all elements to the string data type. Combine them using the ''.join(newlist) method.
Method 4: Use the map function map(str, list) if the list contains elements of different types to convert all elements to the string data type. Combine them using the ''.join(newlist) method.
Here are all four variants in some code:
lst = ['learn', 'python', 'fast']

# Method 1
print(''.join(lst))
# learnpythonfast

# Method 2
s = ''
for st in lst:
    s += st
print(s)
# learnpythonfast

# Method 3
lst = ['learn', 9, 'python', 9, 'fast']
s = ''.join([str(x) for x in lst])
print(s)
# learn9python9fast

# Method 4
lst = ['learn', 9, 'python', 9, 'fast']
s = ''.join(map(str, lst))
print(s)
# learn9python9fast
So far so good. You’ve learned how to convert a list to a string. But that’s not all! Let’s dive into some more specifics of converting a list to a string.
Python List to String with Commas
Problem: Given a list of strings. How to convert the list to a string by concatenating all strings in the list—using a comma as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn,python,fast'.
Solution: to convert a list of strings to a string, call the ','.join(list) method on the delimiter string ',' that glues together all strings in the list and returns a new string.
Problem: Given a list of strings. How to convert the list to a string by concatenating all strings in the list—using a space as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn python fast'. (Note the empty spaces between the terms.)
Solution: to convert a list of strings to a string, call the ' '.join(list) method on the string ' ' (space character) that glues together all strings in the list and returns a new string.
Problem: Given a list of strings. How to convert the list to a string by concatenating all strings in the list—using a newline character as the delimiter between the list elements?
Example: You want to convert list ['learn', 'python', 'fast'] to the string 'learn\npython\nfast' or as a multiline string:
'''learn
python
fast'''
Solution: to convert a list of strings to a string, call the '\n'.join(list) method on the newline character '\n' that glues together all strings in the list and returns a new string.
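The comma, space, and newline variants described above differ only in the delimiter string on which you call join(). A minimal sketch (the variable names are mine):

```python
lst = ['learn', 'python', 'fast']

# Comma as delimiter
with_commas = ','.join(lst)
print(with_commas)
# learn,python,fast

# Space as delimiter
with_spaces = ' '.join(lst)
print(with_spaces)
# learn python fast

# Newline as delimiter
with_newlines = '\n'.join(lst)
print(with_newlines)
# learn
# python
# fast
```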
Problem: Given a list of strings. How to convert the list to a string by concatenating all strings in the list—using a comma character followed by an empty space as the delimiter between the list elements? Additionally, you want to wrap each string in double quotes.
Example: You want to convert list ['learn', 'python', 'fast'] to the string '"learn", "python", "fast"':
Solution: to convert a list of strings to a string, call the ', '.join('"' + x + '"' for x in lst) method on the delimiter string ', ' that glues together all strings in the list and returns a new string. You use a generator expression to modify each element of the original list so that it is enclosed by the double quote " character.
Code: Let’s have a look at the code.
lst = ['learn', 'python', 'fast']
print(', '.join('"' + x + '"' for x in lst))
The output is:
"learn", "python", "fast"
Python List to String with Brackets
Problem: Given a list of strings. How to convert the list to a string by concatenating all strings in the list—using a comma character followed by an empty space as the delimiter between the list elements? Additionally, you want to wrap the whole string in a square bracket to indicate that’s a list.
Example: You want to convert list ['learn', 'python', 'fast'] to the string '[learn, python, fast]':
Solution: to convert a list of strings to a string, use the expression '[' + ', '.join(lst) + ']'. The join() method, called on the delimiter string ', ', glues together all strings in the list and returns a new string. You then concatenate the opening and closing brackets around it.
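A minimal sketch of this solution, which also prints the data types of the result and of the original list:

```python
lst = ['learn', 'python', 'fast']

# Join with ', ' and wrap the result in square brackets
s = '[' + ', '.join(lst) + ']'

print(s)
# [learn, python, fast]
print(type(s))
# <class 'str'>
print(type(lst))
# <class 'list'>
```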
Although the output of both the converted list and the original list look the same, you can see that the data type is string for the former and list for the latter.
Convert List of Int to String
Problem: You want to convert a list into a string but the list contains integer values.
Example: Convert the list [1, 2, 3] to a string '123'.
Solution: Use the join method in combination with a generator expression to convert the list of integers to a single string value:
lst = [1, 2, 3]
print(''.join(str(x) for x in lst))
# 123
The generator expression converts each element in the list to a string. You can then combine the string elements using the join method of the string object.
If you miss the conversion from integer to string, you get the following TypeError:
lst = [1, 2, 3]
print(''.join(lst))

'''
Traceback (most recent call last):
  File "C:\Users\xcent\Desktop\code.py", line 2, in <module>
    print(''.join(lst))
TypeError: sequence item 0: expected str instance, int found
'''
Python List to String One Line
To convert a list to a string in one line, use any of the following three methods:
Use the ''.join(list) method to glue together all list elements to a single string.
Use the list comprehension method [str(x) for x in lst] to convert all list elements to type string.
Use str(list) to convert the list to a string representation.
Here are three examples:
lst = ['finxter', 'is', 'awesome']
print(' '.join(lst))
# finxter is awesome

lst = [1, 2, 3]
print([str(x) for x in lst])
# ['1', '2', '3']

print(str(lst))
# [1, 2, 3]
[Spoiler] Which function filters a list faster: filter() vs list comprehension? For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
To answer this question, I’ve written a short script that tests the runtime performance of filtering large lists of increasing sizes using the filter() and the list comprehension methods.
My hypothesis is that the list comprehension method should be slightly faster for larger list sizes because it leverages the efficient CPython implementation of list comprehension and doesn’t need to call an extra function.
I used my notebook with an Intel(R) Core(TM) i7-8565U 1.8GHz processor (with Turbo Boost up to 4.6 GHz) and 8 GB of RAM.
Try It Yourself:
import time

# Compare runtime of both methods
list_sizes = [i * 10000 for i in range(100)]
filter_runtimes = []
list_comp_runtimes = []

for size in list_sizes:
    lst = list(range(size))

    # Get time stamps
    time_0 = time.time()
    list(filter(lambda x: x%2, lst))
    time_1 = time.time()
    [x for x in lst if x%2]
    time_2 = time.time()

    # Calculate runtimes
    filter_runtimes.append((size, time_1 - time_0))
    list_comp_runtimes.append((size, time_2 - time_1))

# Plot everything
import matplotlib.pyplot as plt
import numpy as np

f_r = np.array(filter_runtimes)
l_r = np.array(list_comp_runtimes)

print(filter_runtimes)
print(list_comp_runtimes)

plt.plot(f_r[:,0], f_r[:,1], label='filter()')
plt.plot(l_r[:,0], l_r[:,1], label='list comprehension')

plt.xlabel('list size')
plt.ylabel('runtime (seconds)')

plt.legend()
plt.savefig('filter_list_comp.jpg')
plt.show()
The code compares the runtimes of the filter() function and the list comprehension variant to filter a list. Note that the filter() function returns a filter object, so you need to convert it to a list using the list() constructor.
Here’s the resulting plot that compares the runtime of the two methods. On the x axis, you can see the list size from 0 to 1,000,000 elements. On the y axis, you can see the runtime in seconds needed to execute the respective functions.
The resulting plot shows that both methods are extremely fast for a few tens of thousands of elements. In fact, they are so fast that the time() function of the time module cannot capture the elapsed time.
But as you increase the size of the lists to hundreds of thousands of elements, the list comprehension method starts to win:
For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
The reason is the efficient implementation of the list comprehension statement. There is an interesting observation, though: if you don’t convert the filter object to a list, you get the following result:
Suddenly the filter() function has constant runtime of close to 0 seconds—no matter how many elements are in the list. Why is this happening?
The explanation is simple: the filter function returns an iterator, not a list. The iterator doesn’t need to compute a single element until it is requested to compute the next() element. So, the filter() function computes the next element only if it is required to do so. Only if you convert it to a list, it must compute all values. Otherwise, it doesn’t actually compute a single value beforehand.
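You can observe this lazy behavior directly. The following sketch records every call to the filter condition in a list; the names is_odd and calls are made up for this illustration.

```python
calls = []

def is_odd(x):
    calls.append(x)  # record every value the condition is checked on
    return x % 2 == 1

# Creating the filter object does not check a single element
lazy = filter(is_odd, range(10))
print(len(calls))
# 0

# Asking for the next element checks values until the first match
first = next(lazy)
print(first, calls)
# 1 [0, 1]

# Converting to a list forces the remaining elements to be checked
rest = list(lazy)
print(rest)
# [3, 5, 7, 9]
print(len(calls))
# 10
```

This is why the runtime of filter() alone looks constant: the actual work is postponed until something consumes the iterator.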
Where to Go From Here
This tutorial has shown you the filter() function in Python and compared it against the list comprehension way of filtering: [x for x in list if condition]. You’ve seen that the latter is not only more readable and more Pythonic, but also faster. So take the list comprehension approach to filter lists!
If you love coding and you want to do this full-time from the comfort of your own home, you’re in luck:
I’ve created a free webinar that shows you how I started as a Python freelancer after my computer science studies working from home (and seeing my kids grow up) while earning a full-time income working only part-time hours.
There are three equally valid interpretations of the term “nested list comprehension”:
Coming from a computer science background, I was assuming that “nested list comprehension” refers to the creation of a list of lists. In other words: How to create a nested list with list comprehension?
But after a bit of research, I learned that there is a second interpretation of nested list comprehension: How to use a nested for loop in the list comprehension?
A few months later, I realized that some people use “nested list comprehension” to mean the use of a list comprehension statement as the expression within a list comprehension statement. In other words: How to use a list comprehension statement within a list comprehension statement?
How to Create a Nested List with List Comprehension?
It is possible to create a nested list with list comprehension in Python. What is a nested list? It’s a list of lists. Here is an example:
## Nested List Comprehension
lst = [[x for x in range(5)] for y in range(3)]
print(lst)
# [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
As you can see, we create a list with three elements. Each list element is a list by itself.
Everything becomes clear when we go back to our magic formula of list comprehension: [ expression + context]. The expression part generates a new list consisting of 5 integers. The context part repeats this three times. Hence, each of the three nested lists has five elements.
If you are an advanced programmer, you may ask whether there is some aliasing going on here. Aliasing in this context means that the three list elements point to the same list [0, 1, 2, 3, 4]. This is not the case: because the expression is evaluated separately for each of the three context executions, a new list is created every time.
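A minimal sketch can check the aliasing question directly. The comparison with list multiplication is an addition for contrast, and the variable names aliased, distinct, and shared are hypothetical.

```python
# Each inner list comes from a separate evaluation of the expression part.
lst = [[x for x in range(5)] for y in range(3)]
lst[0].append(99)   # modify only the first inner list
print(lst)
# [[0, 1, 2, 3, 4, 99], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

# Contrast (not from the original article): list multiplication
# repeats references to one and the same inner list.
aliased = [[0, 1, 2, 3, 4]] * 3
aliased[0].append(99)
print(aliased)
# [[0, 1, 2, 3, 4, 99], [0, 1, 2, 3, 4, 99], [0, 1, 2, 3, 4, 99]]

distinct = lst[0] is not lst[1]      # True: separate list objects
shared = aliased[0] is aliased[1]    # True: the same list object
print(distinct, shared)
```

Changing one inner list of lst leaves the others untouched, which shows that no aliasing happens in the list comprehension version.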
How to Use a Nested For Loop in the List Comprehension?
To be frank, this is super-simple stuff. Do you remember the formula of list comprehension (= ‘[‘ + expression + context + ‘]’)?
The context is an arbitrarily complex combination of for loops and if restrictions with the goal of specifying the data items to which the expression should be applied.
In the expression, you can use any variable you define within a for loop in the context. Let’s have a look at an example.
Suppose you want to use list comprehension to make this code more concise (for example, you want to find all possible pairs of users in your social network application):
# BEFORE
users = ["John", "Alice", "Ann", "Zach"]
pairs = []
for x in users:
    for y in users:
        if x != y:
            pairs.append((x,y))
print(pairs)
#[('John', 'Alice'), ('John', 'Ann'), ('John', 'Zach'), ('Alice', 'John'), ('Alice', 'Ann'), ('Alice', 'Zach'), ('Ann', 'John'), ('Ann', 'Alice'), ('Ann', 'Zach'), ('Zach', 'John'), ('Zach', 'Alice'), ('Zach', 'Ann')]
Now, this code is a mess! How can we fix it? Simply use nested list comprehension!
# AFTER
pairs = [(x,y) for x in users for y in users if x!=y]
print(pairs)
# [('John', 'Alice'), ('John', 'Ann'), ('John', 'Zach'), ('Alice', 'John'), ('Alice', 'Ann'), ('Alice', 'Zach'), ('Ann', 'John'), ('Ann', 'Alice'), ('Ann', 'Zach'), ('Zach', 'John'), ('Zach', 'Alice'), ('Zach', 'Ann')]
As you can see, we are doing exactly the same thing as with un-nested list comprehension. The only difference is to write the two for loops and the if statement in a single line within the list notation [].
How to Use a List Comprehension Statement Within a List Comprehension Statement?
Our goal is to solve the following problem: given a multiline string, create a list of lists—each consisting of all the words in a line that have more than three characters.
## Data
text = '''
Call me Ishmael. Some years ago - never mind how long precisely - having
little or no money in my purse, and nothing particular to interest me
on shore, I thought I would sail about a little and see the watery part
of the world. It is a way I have of driving off the spleen, and regulating
the circulation. - Moby Dick'''

words = [[x for x in line.split() if len(x)>3] for line in text.split('\n')]

print(words)
This is a nested list comprehension statement. It creates a new inner list as an element of the outer list. Each inner list contains all words with more than three characters. The outer list contains one inner list for each line of text.
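To see the shape of the result, here is the same statement applied to a shortened two-line sample. The variable sample is a hypothetical stand-in for the longer text above.

```python
# A shortened, hypothetical two-line sample instead of the full paragraph.
sample = "Call me Ishmael.\nSome years ago"

# Outer comprehension: one inner list per line.
# Inner comprehension: keep only words longer than three characters.
words = [[x for x in line.split() if len(x) > 3] for line in sample.split('\n')]

print(words)
# [['Call', 'Ishmael.'], ['Some', 'years']]
```

Note that punctuation counts toward the word length because split() only separates on whitespace.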
Where to Go From Here?
Want to increase your Python skill on a daily basis? Just by following a series of FREE Python course emails? Then join the #1 Python Email Academy in the world!
For my subscribers, I regularly publish educative emails about the most important Python topics. Register and join my community of thousands of ambitious coders. I guarantee, you will love it!
(Besides—it’s free and you can unsubscribe anytime so you’ve nothing to lose and everything to gain.)
In this article, you’ll learn how to filter a list of dictionaries in Python. So let’s get started!
Short answer: The list comprehension statement [x for x in lst if condition(x)] creates a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s a quick and minimal example:
l = [{'key':10}, {'key':4}, {'key':8}]

def condition(dic):
    ''' Define your own condition here'''
    return dic['key'] > 7

filtered = [d for d in l if condition(d)]

print(filtered)
# [{'key': 10}, {'key': 8}]
You’ll now get a step-by-step explanation of this solution. I tried to keep it as simple as possible. So keep reading!
Filter a List of Dictionaries By Value
Problem: Given a list of dictionaries. Each dictionary consists of one or more (key, value) pairs. You want to filter them by value of a particular dictionary key (attribute). How do you do this?
Minimal Example: Consider the following example where you have three user dictionaries with username, age, and play_time keys. You want to get a list of all users that meet a certain condition such as play_time>100.
Solution: Use the list comprehension [x for x in lst if condition(x)] to create a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s the code that shows you how to filter out all user dictionaries that don’t meet the condition of having played more than 100 hours.
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25, 'play_time': 121},]

superplayers = [user for user in users if user['play_time']>100]

print(superplayers)
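The output is the filtered list of dictionaries that meet the condition:
# [{'username': 'alice', 'age': 23, 'play_time': 101}, {'username': 'ann', 'age': 25, 'play_time': 121}]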
Problem: Given a list of dictionaries. Each dictionary consists of one or more (key, value) pairs. You want to filter them by key (attribute). All dictionaries that don’t have this key (attribute) should be filtered out. How do you do this?
Minimal Example: Consider the following example again where you have three user dictionaries with username, age, and play_time keys. You want to get a list of all users for which the key play_time exists.
The output should look like this where the play_time attribute determines whether a dictionary passes the filter or not (as long as it exists, it shall pass the filter).
Solution: Use the list comprehension [x for x in lst if condition(x)] to create a new list of dictionaries that meet the condition. All dictionaries in lst that don’t meet the condition are filtered out. You can define your own condition on list element x.
Here’s the code that shows you how to filter out all user dictionaries that don’t meet the condition of having a key play_time.
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25},]

superplayers = [user for user in users if 'play_time' in user]

print(superplayers)
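The output is the filtered list of dictionaries that meet the condition:
# [{'username': 'alice', 'age': 23, 'play_time': 101}, {'username': 'bob', 'age': 31, 'play_time': 88}]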
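The two filters can also be combined. The following sketch is one possible approach, not part of the original solution: it assumes that a missing play_time key should be treated like a play time of 0, and uses dict.get() with a default so that dictionaries without the key do not raise a KeyError.

```python
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25}]

# user['play_time'] would raise a KeyError for ann.
# Assumption for this sketch: a missing key counts as a play time of 0.
superplayers = [user for user in users if user.get('play_time', 0) > 100]

print(superplayers)
# [{'username': 'alice', 'age': 23, 'play_time': 101}]
```

An equivalent condition is 'play_time' in user and user['play_time'] > 100, which makes both checks explicit.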
Problem: Given a list of dictionaries. Each dictionary consists of multiple (key, value) pairs. You want to sort them by value of a particular dictionary key (attribute). How do you sort this list of dictionaries?
Minimal Example: Consider the following example where you want to sort a list of salary dictionaries by value of the key 'Alice'.
Solution: You have two main ways to do this—both are based on defining the key function of Python’s sorting methods. The key function maps each list element (in our case a dictionary) to a single value that can be used as the basis of comparison.
Use a lambda function as key function to sort the list of dictionaries.
Use the itemgetter function as key function to sort the list of dictionaries.
Here’s the code of the first option using a lambda function that returns the value of the key 'Alice' from each dictionary:
# Create the list of dictionaries with Bob's and Alice's salary data
salaries = [{'Alice': 100000, 'Bob': 24000},
            {'Alice': 121000, 'Bob': 48000},
            {'Alice': 12000, 'Bob': 66000}]

# Use the sorted() function with key argument to create a new list.
# Each dictionary list element is "reduced" to the value stored for key 'Alice'
sorted_salaries = sorted(salaries, key=lambda d: d['Alice'])

# Print everything to the shell
print(sorted_salaries)
The output is the sorted list of dictionaries. Note that the first dictionary has the smallest salary for Alice and the third dictionary has the largest salary for Alice.
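The second option, the itemgetter function from the standard library's operator module, can be sketched in the same way. It reuses the salaries data from the lambda example.

```python
from operator import itemgetter

salaries = [{'Alice': 100000, 'Bob': 24000},
            {'Alice': 121000, 'Bob': 48000},
            {'Alice': 12000, 'Bob': 66000}]

# itemgetter('Alice') returns a callable that looks up d['Alice'],
# so it plays the same role as lambda d: d['Alice'].
sorted_salaries = sorted(salaries, key=itemgetter('Alice'))

print(sorted_salaries)
# [{'Alice': 12000, 'Bob': 66000}, {'Alice': 100000, 'Bob': 24000}, {'Alice': 121000, 'Bob': 48000}]
```

Both variants produce the same order; which one reads better is largely a matter of taste.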
In this article, you’ve learned how to filter a list of dictionaries easily with a simple list comprehension statement. That’s more efficient than using the filter() function proposed in many other blog tutorials. Guido van Rossum, the creator of Python, was no fan of the filter() function either!
I’ve realized that professional coders tend to use dictionaries more often than beginners due to their superior understanding of the benefits of dictionaries. If you want to learn about those, check out my in-depth tutorial of Python dictionaries.
If you want to stop learning and start earning with Python, check out my free webinar “How to Become a Python Freelance Developer?”. It’s a great way of starting your thriving coding business online.
Here’s your free PDF cheat sheet showing you all Python list methods on one simple page. Click the image to download the high-resolution PDF file, print it, and post it to your office wall:
Sorts the elements in the list lst in ascending order.
If you’ve studied the table carefully, you’ll know the most important list methods in Python. Let’s have a look at some examples of the above methods:
>>> l = []
>>> l.append(2)
>>> l
[2]
>>> l.clear()
>>> l
[]
>>> l.append(2)
>>> l
[2]
>>> l.copy()
[2]
>>> l.count(2)
1
>>> l.extend([2,3,4])
>>> l
[2, 2, 3, 4]
>>> l.index(3)
2
>>> l.insert(2, 99)
>>> l
[2, 2, 99, 3, 4]
>>> l.pop()
4
>>> l.remove(2)
>>> l
[2, 99, 3]
>>> l.reverse()
>>> l
[3, 99, 2]
>>> l.sort()
>>> l
[2, 3, 99]
Where to Go From Here?
Want more cheat sheets? Excellent. I believe learning with cheat sheets is one of the most efficient learning techniques. Join my free Python email list where I’ll send you more than 10 new Python cheat sheets and regular Python courses for continuous improvement. It’s free!
In this article, I’ll show you how to divide a list into equally-sized chunks in Python. Step by step, you’ll arrive at code that accomplishes exactly that, and I’ll explain each step in detail along the way.
Chunking Your List
Let’s make this question more tangible by transforming it into a practical problem:
Problem: Imagine that you have a temperature sensor that sends data every 6 minutes, which makes 10 data points per hour. All these data points are stored in one list for each day.
Now, we want to have a list of hourly average temperatures for each day—this is why we need to split the list of data for one day into evenly sized chunks.
Solution: To achieve this, we use a for-loop and Python’s built-in function range() which we have to examine in depth.
The range() function can be used either with one, two or three arguments.
If you use it with one single argument, e.g., range(10), we get a range object containing the numbers 0 to 9. So, if you call range with one argument, this argument will be interpreted as the max or stop value of the range, but it is excluded from the range.
You can also call the range() function with two arguments, e.g., range(5, 10). This call with two arguments returns a range object containing the numbers 5 to 9. So, now we have a lower and an upper bound for the range. Contrary to the stop value, the start value is included in the range.
In a call of the function range() with three parameters, the first parameter is the start value, the second one is the stop value and the third value is the step size. For example, range(5, 15, 2) returns a range object containing the following values: 5, 7, 9, 11, 13. As you can see, the range starts with the start and then it adds the step value as long as the values are less than the stop value.
In our problem, our chunks have a length of 10, the start value is 0 and the max value is the end of the list of data.
Putting all together: Calling range(0, len(data), 10) will give us exactly what we need to iterate over the chunks. Let’s put some numbers there to visualize it.
For one single day, we have a data length of 24 * 10 = 240, so the call of the range function would be this: range(0, 240, 10) and the resulting range would be 0, 10, 20, 30, …, 230. Pause a moment and consider these values: they represent the indices of the first element of each chunk.
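The three call forms of range() described above can be checked in a few lines. The variable names are hypothetical and only serve the illustration; list() is used to make the values of each range object visible.

```python
# One argument: the stop value, which is excluded.
one_arg = list(range(10))
print(one_arg)        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Two arguments: start (included) and stop (excluded).
two_args = list(range(5, 10))
print(two_args)       # [5, 6, 7, 8, 9]

# Three arguments: start, stop, and step size.
three_args = list(range(5, 15, 2))
print(three_args)     # [5, 7, 9, 11, 13]

# Start indices of the hourly chunks for one day (24 * 10 = 240 data points).
chunk_starts = list(range(0, 240, 10))
print(len(chunk_starts), chunk_starts[:3], chunk_starts[-1])
# 24 [0, 10, 20] 230
```

The last range yields exactly 24 start indices, one per hour.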
So what do we have now? The start indices of each chunk and also the length – and that’s all we need to slice the input data into the chunks we need.
The slicing operator takes two or three arguments separated by the colon : symbol. They have the same meaning as in the range function.
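A short sketch shows slicing on a small stand-in list. The list of 25 integers is hypothetical sample data, chosen so that the last chunk is shorter than the chunk length.

```python
# Hypothetical stand-in data: 25 values instead of real sensor readings.
data = list(range(25))

middle = data[5:10]        # start included, stop excluded
every_second = data[0:10:2]  # third argument is the step size
last_chunk = data[20:30]   # a stop beyond the end is simply cut off

print(middle)        # [5, 6, 7, 8, 9]
print(every_second)  # [0, 2, 4, 6, 8]
print(last_chunk)    # [20, 21, 22, 23, 24]
```

Because a slice that runs past the end of the list does not raise an error, the chunking loop needs no special handling for a shorter final chunk.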
data = [15.7, 16.2, 16.5, 15.9, ..., 27.3, 26.4, 26.1, 27.2]
chunk_length = 10

for i in range(0, len(data), chunk_length):
    print(data[i:i+chunk_length])
However, we can still improve this code and make it reusable by creating a generator out of it.
Chunking With Generator Expressions
A generator is a function but instead of a return statement it uses the keyword yield.
The keyword yield pauses the function and hands a value back to the caller. The next time a value is requested from the generator, execution resumes right after the yield, the next value is returned, and the function pauses again. This behavior can be used in a for-loop, where we want to get a value from the generator, work with this value inside the loop and then repeat it with the next value. Now, let’s take a look at the improved version of our code:
data = [15.7, 16.2, 16.5, 15.9, ..., 27.3, 26.4, 26.1, 27.2]
chunk_length = 10

def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

for chunk in make_chunks(data, chunk_length):
    print(chunk)
That already looks pretty pythonic, and we can reuse the function make_chunks() for all the other data we need to process.
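To watch the generator pause and resume, you can request values by hand with the built-in next() function. This sketch uses a short hypothetical list of five numbers instead of temperature data.

```python
def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

# Calling the generator function runs no code yet; it returns a generator object.
gen = make_chunks([1, 2, 3, 4, 5], 2)

first = next(gen)    # runs until the first yield
second = next(gen)   # resumes after the yield and runs to the next one
rest = list(gen)     # consumes whatever is left

print(first, second, rest)
# [1, 2] [3, 4] [[5]]
```

A for-loop does the same thing behind the scenes: it requests the next value until the generator is exhausted.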
Let’s finish the code so that we get a list of hourly average temperatures as result.
import random

def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

def process(chunk):
    return round(sum(chunk)/len(chunk), 2)

n = 10
# generate random temperature values
day_temperatures = [random.random() * 20 for x in range(24 * n)]

avg_per_hour = []
for chunk in make_chunks(day_temperatures, n):
    r = process(chunk)
    avg_per_hour.append(r)

print(avg_per_hour)
And that’s it, this cool pythonic code solves our problem. We can make the code even a bit shorter but I consider this code less readable because you need to know really advanced Python concepts.
import random

make_chunks = lambda data, n: (data[i:i + n] for i in range(0, len(data), n))
process = lambda data: round(sum(data)/len(data), 2)

n = 10
# generate random temperature values
day_temperatures = [random.random() * 20 for x in range(24 * n)]

avg_per_hour = []
for chunk in make_chunks(day_temperatures, n):
    r = process(chunk)
    avg_per_hour.append(r)

print(avg_per_hour)
So, what did we do? We reduced the helper functions to lambda expressions, and instead of a generator function we use a special shorthand: a generator expression, written in parentheses.
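A possible middle ground, sketched here as an alternative rather than as the article's solution, keeps the named helper functions and replaces only the collecting loop with a list comprehension. The readings list is hypothetical, deterministic data (hour h is recorded as the value h ten times) so that the result is easy to verify.

```python
def make_chunks(data, length):
    for i in range(0, len(data), length):
        yield data[i:i + length]

def process(chunk):
    return round(sum(chunk) / len(chunk), 2)

n = 10
# Hypothetical deterministic readings: ten identical values per hour.
readings = [float(hour) for hour in range(24) for _ in range(n)]

# The list comprehension replaces the explicit loop with append().
avg_per_hour = [process(chunk) for chunk in make_chunks(readings, n)]

print(avg_per_hour[:3], len(avg_per_hour))
# [0.0, 1.0, 2.0] 24
```

The helper functions stay readable and reusable, while the main computation shrinks to a single line.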
Summary
To sum up the solution: We used the range function with three arguments, the start value, the stop value and the step value. By setting the step value to our desired chunk length, the start value to 0 and the stop value to the total data length, we get a range object containing all the start indices of our chunks. With the help of slicing we can access exactly the chunk we need in each iteration step.
Where to Go From Here?
Want to start earning a full-time income with Python—while working only part-time hours? Then join our free Python Freelancer Webinar.
It shows you exactly how you can grow your business and Python skills to a point where you can work comfortably for 3-4 hours from home and spend the rest of the day (about 20 hours) with the people you love, doing the things you enjoy.
Want to calculate the standard deviation of a column in your Pandas DataFrame?
In case you attended your last statistics course a few years ago, let’s quickly recap the definitions. The variance is the average squared deviation of the list elements from the average value, and the standard deviation is the square root of the variance.
You can do this by using the DataFrame method df.std(), which calculates the standard deviation of the columns. You can then get the column you’re interested in after the computation.
import pandas as pd

# Create your Pandas DataFrame
d = {'username': ['Alice', 'Bob', 'Carl'],
     'age': [18, 22, 43],
     'income': [100000, 98000, 111000]}
df = pd.DataFrame(d)

print(df)
Your DataFrame looks like this:
  username  age  income
0    Alice   18  100000
1      Bob   22   98000
2     Carl   43  111000
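Here’s how you can calculate the standard deviation of all numeric columns. The argument numeric_only=True skips the text column username, which recent pandas versions would otherwise reject with an error:
print(df.std(numeric_only=True))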
The output is the standard deviation of all columns:
age 13.428825
income 7000.000000
dtype: float64
To get the standard deviation of an individual column, access it using simple indexing:
print(df.std(numeric_only=True)['age'])
# 13.428825 (rounded to six decimal places)
Standard Deviation in NumPy Library
Python’s package for data science computation, NumPy, also has great statistics functionality. You can calculate all basic statistics such as average, median, variance, and standard deviation on NumPy arrays. Simply import the NumPy library and use the np.std(a) function to calculate the standard deviation of NumPy array a.
Here’s the code:
import numpy as np

a = np.array([1, 2, 3])
print(np.std(a))
# 0.816496580927726
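One detail is worth knowing when you compare results: by default, pandas computes the sample standard deviation (dividing by n-1), while NumPy computes the population standard deviation (dividing by n). The following sketch cross-checks both numbers from this article with Python's built-in statistics module, used here as a stand-in so the example runs without third-party libraries.

```python
import math
import statistics

ages = [18, 22, 43]   # the 'age' column from the DataFrame example
a = [1, 2, 3]         # the values from the NumPy example

# Sample standard deviation (divides by n - 1), like the pandas default.
sample_std_ages = statistics.stdev(ages)

# Population standard deviation (divides by n), like the NumPy default.
population_std_a = statistics.pstdev(a)

# The same data gives a different result with the other definition.
sample_std_a = statistics.stdev(a)

print(round(sample_std_ages, 6))   # 13.428825
print(population_std_a)
print(sample_std_a)                # 1.0

# The standard deviation is the square root of the variance.
assert math.isclose(sample_std_ages, math.sqrt(statistics.variance(ages)))
```

In NumPy, passing ddof=1 to np.std() switches to the sample definition, so np.std(a, ddof=1) matches what pandas reports for the same values.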
Where to Go From Here?
Before you can become a data science master, you first need to master Python. Join my free Python email course and receive your daily Python lesson directly in your INBOX. It’s fun!