Posted on Leave a comment

How to Filter a Dictionary in Python? (… The Most Pythonic Way)

Problem: Given a dictionary and a filter condition. How to filter a dictionary by …

  • key so that only those (key, value) pairs in the dictionary remain where the key satisfies the condition?
  • value so that only those (key, value) pairs remain where the value satisfies the condition?

In this tutorial, you’ll learn four methods and how they compare against each other. It’s loosely based on this tutorial but extends it by many additional examples and code explanations.

If you’re too busy to read this tutorial, here’s the spoiler:

Method 4 — Dictionary comprehension {k:v for (k,v) in dict.items() if condition} is the most Pythonic and fastest way to filter a dictionary in Python.

However, by reading this short 8-minute tutorial, you’re going to learn a lot about the nuances of writing Pythonic code. So keep reading!

Method 1: Simple Iteration to Filter Dictionary

You should always start with the simplest method to solve a problem (see Occam’s razor) because premature optimization is the root of all evil!

So let’s have a look at the straightforward loop iteration method to filter a dictionary.

You start with the following dictionary of names:

names = {1: 'Alice', 2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

Filter Python Dictionary By Key (Simple Loop)

You want to keep those (key, value) pairs where key meets a certain condition (such as key%2 == 1).

newDict = dict() # Iterate over all (k,v) pairs in names
for key, value in names.items(): # Is condition satisfied? if key%2 == 1: newDict[key] = value

Let’s have a look at the output:

print(newDict)
# {1: 'Alice', 3: 'Carl', 5: 'Liz'}

Only the (key, value) pairs where the key is an odd integer remain in the filtered dictionary newDict.

But what if you want to filter the dictionary by a condition on the values?

Filter Python Dictionary By Value (Simple Loop)

To filter by value, you only need to replace one line in the previous code snippet: instead of writing if key%2 == 1: ..., you use the value to determine whether to add a certain (key, value) pair: if len(value)<5: ....

newDict = dict() # Iterate over all (k,v) pairs in names
for key, value in names.items(): # Is condition satisfied? if len(value)<5: newDict[key] = value print(newDict)
# {2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

Only the dictionary (key, value) pairs remain in the filtered dictionary newDict where the length of the name string value is less than five characters.

Try It Yourself in Our Interactive Cod Shell (Click “run”):

Now, you know the basic method of filtering a dictionary in Python (by key and by value). But can we do better? What if you need to filter many dictionaries by many different filtering conditions? Do we have to rewrite the same code again and again?

The answer is no! Read on to learn about a more generic way to make filtering a dictionary as easy as calling a function passing the dictionary and the filter function.

Method 2: Generic Function to Filter Dictionary

How can you use different filtering functions on different dictionaries without writing the same code again and again? The answer is simple: create your own generic filtering function!

Your goal is to create a function filter_dict(dictionary, filter_func) that takes a dictionary to be filtered and a filter_func to determine for each (key, value) pair whether it should be included in the filtered dictionary.

You start with the following dictionary of names:

names = {1: 'Alice', 2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

Let’s create the generic filter function!

def filter_dict(d, f): ''' Filters dictionary d by function f. ''' newDict = dict() # Iterate over all (k,v) pairs in names for key, value in d.items(): # Is condition satisfied? if f(key, value): newDict[key] = value return newDict

The function takes two arguments: the dictionary d to be filtered and the function f that decides if an element should be included in the new dictionary.

You create an empty dictionary newDict and decide for all elements of the original dictionary d whether they should be included. To accomplish this, you iterate over each original (key, value) pair and pass it to the function f: key, value --> Boolean. The function f returns a Boolean value. If it evaluates to True, the (key, value) pair is added to the new dictionary. Otherwise, it’s skipped. The return value is the newly created dictionary newDict.

Here’s how you can use the filtering function to filter by key:

Filter Python Dictionary By Key Using Generic Function

If you want to accomplish the same thing as above—filtering by key to include only odd keys—you simply use the following one-liner call:

print(filter_dict(names, lambda k,v: k%2 == 1))
# {1: 'Alice', 3: 'Carl', 5: 'Liz'}

That was easy! The lambda function you pass returns k%2 == 1 which is the Boolean filtering value associated to each original element in the dictionary names.

Similarly, if you want to filter by key to include only even key, you’d do the following:

print(filter_dict(names, lambda k,v: k%2 == 0))
# {2: 'Bob', 4: 'Ann'}

In fact, it’s equally convenient to filter by value using this exact same strategy:

Filter Python Dictionary By Value Using Generic Function

Here’s how you use our function filter_dict to filter by value:

print(filter_dict(names, lambda k,v: len(v)<5))
# {2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

In the previous code snippet, you filter the dictionary so that only those (key, value) pairs remain where the value has less than five characters.

print(filter_dict(names, lambda k,v: v.startswith('A')))
# {1: 'Alice', 4: 'Ann'}

In this code snippet, you filter the dictionary so that only those (key, value) pairs remain where the value starts with character 'A'.

Try It Yourself in Our Interactive Cod Shell (Click “run”):

But is this the most Pythonic way? Hardly so! Read on to learn about a functional approach to accomplish the same thing with less code!

(More with less in Python is usually a good thing… Check out my book “Python One-Liners” to master the art of compressing complicated code into a single line.)

Method 3: filter() Function on Dictionary

The filter(function, iterable) function takes a function as input that takes one argument (an element of an iterable) and returns a Boolean value whether this element should pass the filter. All elements that pass the filter are returned as a new iterable object (a filter object).

You can use the lambda function statement to create the function right where you pass it as an argument. The syntax of the lambda function is lambda x: expression and it means that you use x as an input argument and you return expression as a result (that can or cannot use x to decide about the return value). For more information, see my detailed blog article about the lambda function.

Filter Python Dictionary By Key Using filter() + Lambda Functions

Here’s how you can filter a dictionary by key using only a single line of code (without defining your custom filter function as in method 2):

# FILTER BY KEY
print(dict(filter(lambda x: x[0]%2 == 1, names.items())))
# {1: 'Alice', 3: 'Carl', 5: 'Liz'}

You may recognize the same filter lambda function lambda x: x[0]%2 == 1 that returns True if the key is an odd integer. Note that this lambda function takes only a single input as this is how the filter() function works (it requires that you pass a function object that takes one argument and maps it to a Boolean value).

You operate on the iterable names.items() that gives you all (key, value) pairs (an element of the iterable is a (key, value) tuple).

After filtering, you convert the filter object back to a dictionary using the dict(...) constructor function.

Another example:

print(dict(filter(lambda x: x[0]%2 == 0, names.items())))
# {2: 'Bob', 4: 'Ann'}

Let’s see how this works for filtering by value:

Filter Python Dictionary By Value Using filter() + Lambda Functions

You can use the same basic idea—filter() + lambda + dict()—to filter a dictionary by value. For example, if you want to filter out all (key, value) pairs where the value has less than five characters, use the following one-liner:

print(dict(filter(lambda x: len(x[1])<5, names.items())))
# {2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

And a second example:

print(dict(filter(lambda x: x[1].startswith('A'), names.items())))
# {1: 'Alice', 4: 'Ann'}

Try It Yourself in Our Interactive Cod Shell (Click “run”):

So, far so good. But can we do any better? I mean, a single line of code is a single line of code, right? How can we possible improve on that?

Method 4: Filter By Dictionary Comprehension

The best way to filter a dictionary in Python is to use the powerful method of dictionary comprehension.

Dictionary comprehension allows you to transform one dictionary into another one—by modifying each (key, value) pair as you like.

Filter Python Dictionary By Key Using Dictionary Comprehension

The general framework for dictionary comprehension is { expression context }.

  • expression defines how you would like to change each (key, value) pair.
  • context defines the (key, value) pairs, you’d like to include in the new dictionary.

Here’s a practical example that filters all (key, value) pairs with odd keys:

print({k:v for (k,v) in names.items() if k%2 == 1})
# {1: 'Alice', 3: 'Carl', 5: 'Liz'}

And here’s an example that filters all (key, value) pairs with even keys:

print({k:v for (k,v) in names.items() if k%2 == 0})
# {2: 'Bob', 4: 'Ann'}

Let’s look at dictionary filtering by value! Is it any different?

Filter Python Dictionary By Value Using Dictionary Comprehension

No! It’s exactly the same:

print({k:v for (k,v) in names.items() if len(v)<5})
# {2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'}

This powerful framework is not only fast and easy to understand, it’s also concise and consistent. The filtering criteria is the last part of the expression so that you can quickly grasp how it’s filtered:

print({k:v for (k,v) in names.items() if v.startswith('A')})
# {1: 'Alice', 4: 'Ann'}

Try It Yourself in Our Interactive Cod Shell (Click “run”):

Related tutorials:

All Four Methods for Copy&Paste

Here are all four methods from the tutorial to simplify copy&pasting:

names = {1: 'Alice', 2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'} ''' Method 1: Simple For Loop ''' # FILTER BY KEY
newDict = dict() # Iterate over all (k,v) pairs in names
for key, value in names.items(): # Is condition satisfied? if key%2 == 1: newDict[key] = value print(newDict)
# {1: 'Alice', 3: 'Carl', 5: 'Liz'} # FILTER BY VALUE
newDict = dict() # Iterate over all (k,v) pairs in names
for key, value in names.items(): # Is condition satisfied? if len(value)<5: newDict[key] = value print(newDict)
# {2: 'Bob', 3: 'Carl', 4: 'Ann', 5: 'Liz'} ''' Method 2: Custom Function ''' def filter_dict(d, f): ''' Filters dictionary d by function f. ''' newDict = dict() # Iterate over all (k,v) pairs in names for key, value in d.items(): # Is condition satisfied? if f(key, value): newDict[key] = value return newDict # FILTER BY KEY
print(filter_dict(names, lambda k,v: k%2 == 1))
print(filter_dict(names, lambda k,v: k%2 == 0)) # FILTER BY VALUE
print(filter_dict(names, lambda k,v: len(v)<5))
print(filter_dict(names, lambda k,v: v.startswith('A'))) ''' Method 3: filter() ''' # FILTER BY KEY
print(dict(filter(lambda x: x[0]%2 == 1, names.items())))
print(dict(filter(lambda x: x[0]%2 == 0, names.items()))) # FITER BY VALUE
print(dict(filter(lambda x: len(x[1])<5, names.items())))
print(dict(filter(lambda x: x[1].startswith('A'), names.items()))) ''' Method 4: Dict Comprehension ''' # FITER BY KEY
print({k:v for (k,v) in names.items() if k%2 == 1})
print({k:v for (k,v) in names.items() if k%2 == 0}) # FITER BY VALUE
print({k:v for (k,v) in names.items() if len(v)<5})
print({k:v for (k,v) in names.items() if v.startswith('A')})

But how do all of those methods compare in terms of algorithmic complexity and runtime? Let’s see…

Algorithmic Analysis

Before we benchmark those methods against each other, let’s quickly discuss computational complexity.

All four methods have linear runtime complexity in the number of elements in the original dictionary to be filtered—assuming that the filtering condition itself has constant runtime complexity (it’s independent of the number of elements in the dictionary).

Method Complexity
Method 1: Loop O(n)
Method 2: Custom O(n)
Method 3: filter() O(n)
Method 4: Dictionary Comprehension O(n)

But this doesn’t mean that all methods are equally efficient. The dictionary comprehension is usually fastest for these types of filtering operations because it doesn’t use an intermediate function filter(). And it’s also easiest to understand. Therefore, you should use dictionary comprehension in your own code to filter a dictionary by key or by value!

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

Runtime Complexity of Python List Methods [Easy Table Lookup]

What’s the runtime complexity of various list methods?

The following table summarizes the runtime complexity of all list methods.

Assume that the length of the data type is defined as n (that is—len(data_type)). You can now categorize the asymptotic complexity of the different complexity functions as follows:

Operation Example Complexity
Index l[i] O(1)
Store l[i] = 42 O(1)
Length len(l) O(1)
Append l.append(42) O(1)
Pop l.pop() O(1)
Clear l.clear() O(1)
Slicing l[a:b] O(b-a)
Extend l1.extend(l2) O(len(l1)+len(l2))
Constructor list(iter) O(len(iter))
Equality l1 == l2 O(n)
Slice Assign l[a:b] = ... O(n)
Delete del l[i] O(n)
Remove l.remove(...) O(n)
Membership x in l / x not in l O(n)
Copy l.copy() O(n)
Pop l.pop(0) O(n)
Min min(l) O(n)
Max max(l) O(n)
Reverse l.reverse() O(n)
Iterate for x in l: O(n)
Sort l.sort() O(n log(n))
Multiply l*k O(n k)

Need to learn more about these methods? Watch me giving you a quick overview of all Python list methods:

You can read more about all methods in my detailed tutorial on the Finxter blog.

Here’s your free PDF cheat sheet showing you all Python list methods on one simple page. Click the image to download the high-resolution PDF file, print it, and post it to your office wall:

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

Complexity of Python Operations

In this tutorial, you’ll learn the runtime complexity of different Python operations.

Then, you’ll learn how to calculate the complexity of your own function by combining the complexity classes of its constituents. This is called “static analysis”

The tutorial is loosely based on (source) but it extends it significantly with more practical examples, interactive snippets, and explanatory material.

Introducing Big-O

Definition: The complexity of an operation (or an algorithm for that matter) is the number of resources that are needed to run it (source). Resources can be time (runtime complexity) or space (memory complexity).

So, how can you measure the complexity of an algorithm? In most cases, the complexity of an algorithm is not static. It varies with the size of the input to the algorithm or the operation.

For example, the list method list.sort() has one input argument: the list object to be sorted. The runtime complexity is not constant, it increases with increasing size of the list. If there are more elements to be sorted, the runtime of the algorithm increases.

Complexity Function

Given input size n, you can describe the complexity of your algorithm with a function of the input nf(n) that defines the number of “resource units” (e.g., time, memory) needed to finish it (worst-case or average-case).

Say, you’ve got three implementations to sort a list: implementation_1, implementation_2, implementation_3.

The figure shows the three complexity functions. The x axis measures the input size (in our example, it would be the list size). The y axis measures the complexity with respect to this input.

  • implementation_1 has a quadratic complexity function f(n) = n².
  • implementation_2 has a quadratic complexity function f(n) = 2n².
  • implementation_3 has a logarithmic complexity function f(n) = n log(n).

You can see the code we used to generate this plot here:

import matplotlib.pyplot as plt
import math implementation_1 = [n**2 for n in range(1, 100, 10)]
implementation_2 = [2*n**2 for n in range(1, 100, 10)]
implementation_3 = [n*math.log(n) for n in range(1, 100, 10)] plt.plot(implementation_1, '--.', label='implementation 1')
plt.plot(implementation_2, '-o', label='implementation 2')
plt.plot(implementation_3, '-x', label='implementation 3') plt.legend()
plt.xlabel('Input (ex: List Size)')
plt.ylabel('Complexity (ex: Runtime)')
plt.grid()
plt.show()

Of course, it’s good to waste as little resources as possible so the logarithmic complexity function is superior to the quadratic complexity functions.

Big-O Notation

For large inputs, the runtime behavior will be dominated by the part of the complexity function that grows fastest. For example, a quadratic runtime complexity function f(n) = 1000n² + 100000n + 999 will be much better than a cubic runtime complexity function g(n) = 0.1n³.

Why? Because sooner or later the function g(n) will produce much higher values than f(n) as the input size n increases.

In fact, you can argue that the only important part of a complexity function is the part that grows fastest with increasing input size.

This is exactly what’s the Big-O notation is all about:

The Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. (wiki)

Roughly speaking, you remove everything but the fastest-growing term from the complexity function. This allows you to quickly compare different algorithms against each other.

To show this, have a look at our two examples:

  • Complexity function f(n) = 1000n² + 100000n + 999 grows like O(n²).
  • Complexity function g(n) = 0.1n³ grows like O(n³).

By reducing the complexity function to its asymptotic growth, you can immediately see that the former is superior to the latter in terms of runtime complexity—without being distracted by all the constant factors in front of the constituents or the constituents with smaller asymptotic growth.

Examples Big-O of Complexity Functions

So, here are a few examples of complexity functions and their asymptotic growth in Big-O notation:

Complexity Function Asymptotic Growth
f(n) = 10000 O(1)
f(n) = n + 1000000 O(n)
f(n) = 33n + log(n) O(n log(n))
f(n) = 33n + 4n * log(n) O(n log(n))
f(n) = n² + n O()
f(n) = 1000n² + 100000n + 999 O()
f(n) = 0.000000001n³ + 4n² + 100000n + 999 O()
f(n) = n * n³ + 33n O()

You can see that the asymptotic growth of a function (in Big-O notation) is dominated by the fastest-growing term in the function equation.

Python Complexity of Operations

Let’s explore the complexity of Python operations—classified by the data structure on which you perform those operations. A great coder will always use the data structure that suits their needs best.

In general, the list data structure supports more operations than the set data structure as it keeps information about the ordering of the elements—at the cost of higher computational complexity.

Python List Complexity

Assume that the length of the data type is defined as n (that is—n = len(data_type)). You can now categorize the asymptotic complexity of the different complexity functions as follows:

Operation Example Complexity
Index l[i] O(1)
Store l[i] = 42 O(1)
Length len(l) O(1)
Append l.append(42) O(1)
Pop l.pop() O(1)
Clear l.clear() O(1)
Slicing l[a:b] O(b-a)
Extend l1.extend(l2) O(len(l1)+len(l2))
Constructor list(iter) O(len(iter))
Equality l1 == l2 O(n)
Slice Assign l[a:b] = ... O(n)
Delete del l[i] O(n)
Remove l.remove(...) O(n)
Membership x in l / x not in l O(n)
Copy l.copy() O(n)
Pop l.pop(0) O(n)
Min min(l) O(n)
Max max(l) O(n)
Reverse l.reverse() O(n)
Iterate for x in l: O(n)
Sort l.sort() O(n log(n))
Multiply l*k O(n k)

Tuples are similar as lists—with a few exceptions: you cannot modify a tuple because they are immutable.

Let’s consider another important data structure:

Python Set Complexity

Assume that the length of the data type is defined as n (that is—n = len(data_type)). If there are two sets in a single operation such as s1 == s2, the lengths are given by the variables n1 and n2. You can now categorize the asymptotic complexity of the different complexity functions as follows:

Operation Example Complexity
Length len(s) O(1)
Add s.add(42) O(1)
Membership 42 in s / 42 not in s O(1)
Remove s.remove(42) O(1)
Pop s.pop() O(1)
Clear s.clear() O(1)
Constructor set(iter) O(n)
Equality s1 == s2 / s1 != s2 O(min(n1, n2))
Union s1 | s2 O(n1+n2)
Intersection s1 & s2 O(min(n1, n2))
Difference s1 - s2 O(n2)
Symmetric Difference s1 ^ s2 O(n1)
Iteration for x in s: O(n)
Copy s.copy() O(n)

I highlighted the set operations that are more efficient than the corresponding list operations. The reason for those being O(1) rather than O(n) is that the list data structure also maintains the ordering of the elements—which incurs additional overhead.

Python Dictionary Complexity

Now, have a look at the time complexity of Python dictionary operations:

Operation Example Complexity
Index (Get) dict[k] O(1)
Store dict[k] = v O(1)
Length len(dict) O(1)
Delete del dict[key] O(1)
Pop dict.pop(k) O(1)
Clear dict.clear() O(1)
Keys dict.keys() O(1)
Construction dict(...) O(n)
Iteration for k in dict: O(n)

Most operations are O(1) because Python dictionaries share multiple properties of Python sets (such as fast membership operation).

Composing Complexity Classes

Now, you know the complexity of different operations. But how do you obtain the complexity of your algorithm?

The concept is simple: you break the big problem (knowing the complexity of the whole algorithm) into a series of smaller problems (knowing the complexity of individual operations).

Then, you recombine the individual operation’s complexities to obtain the solution to the big problem.

How? Let’s start with a small example: you first get a list from a dictionary of list values. Then you sort the list:

d = {1: [3, 1, 2, 4], 2: [4, 4, 1, 9]} lst = d.get(1)
lst.sort()
print(lst)
# [1, 2, 3, 4]

Try it yourself in our interactive code shell (click “run” to execute the code):

You have four operations in the code snippet. Let’s annotate each operation with a complexity class.

# Operation 1: Dictionary Creation --> O(n)
d = {1: [3, 1, 2, 4], 2: [4, 4, 1, 9]} # Operation 2: Dictionary Get --> O(1)
lst = d.get(1) # Operation 3: List Sorting --> O(n log n)
lst.sort() # Operation 4: Print List --> O(n)
print(lst)

Knowing the complexity classes of the different operations, you can recombine it as follows:

O(n) + O(1) + O(n log n) + O(n)
(1) = O(n + 1 + n log n + n)
(2) = O(n log n + 2n + 1)
(3) = O(n log n)

You see in Equation (1) what you could call the “chain rule of complexity analysis”: O(f(n)) + O(g(n)) = O(f(n) + g(n)).

You can see in Equation (3) that the Big-O notation focuses only on the largest growing term. In our case, O(n log n) grows faster than O(2n).

Here are two important examples of Big-O recombination:

  • O(f(n)) + O(f(n)) = O(f(n) + f(n)) = O(2f(n)) = O(f(n). In other words, Executing an operation a constant (fixed) number of times, doesn’t change the overall complexity of the algorithm.
  • O(f(n) + g(n)) = O(f(n)) if the complexity function f(n) grows faster than g(n). An example is O(n³ + 1000n³) = O(n³).

In programming, you can also have conditional execution:

if condition: f(n)
else: g(n)

You can recombine the complexity class of the overall code snippet as follows: O(max(f(n), g(n)). Roughly speaking, (if the condition can be true), the complexity of the conditional execution is the maximum of both blocks f(n) or g(n).

Example:

if lst: lst.sort()
else: lst = [1, 2, 3, 4]

The complexity is O(n log n) because it grows faster than O(n) — the complexity of the block in the else statement.

Another possibility is to repeatedly execute a certain function (e.g., in a for loop).

If you repeat function f(n) m times, the computational complexity is m * O(f(n)) = O(m f(n)). If m is a constant, the computational complexity simplifies to O(f(n)).

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

Python List Concatenation: Add (+) vs INPLACE Add (+=) vs extend()

A wildly popular operation you’ll find in any (non-trivial) code base is to concatenate lists—but there are multiple methods to accomplish this. Master coders will always choose the right method for the right problem.

This tutorial shows you the difference between three methods to concatenate lists:

  • Concatenate two lists with the + operator. For example, the expression [1, 2, 3] + [4, 5] results in a new list [1, 2, 3, 4, 5]. More here.
  • Concatenate two lists with the += operator. This operation is inplace which means that you don’t create a new list and the result of the expression lst += [4, 5] is to add the elements on the right to the existing list object lst. More here.
  • Concatenate two lists with the extend() method of Python lists. Like +=, this method modifies an existing list in place. So the result of lst.extend([4, 5]) adds the elements 4 and 5 to the list lst. More here.

To summarize: the difference between the + method and the += and extend() methods is that the former creates a new list and the latter modify an existing list object in-place.

You can quickly compare those three methods in the following interactive code shell:

Puzzle: Can you already figure out the outputs of this code snippet?

Fear not if you can’t! I’ll explain you each detailed example next.

Method 1: Add (+)

The standard way of adding two lists is to use the + operator like this:

# METHOD 1: ADD +
lst = ['Alice', 'Bob', 'Ann']
lst_new = lst + [42, 21]
print(lst)
print(lst_new)

While the + operator is the most readable one (especially for beginner coders), it’s not the best choice in most scenarios. The reason is that it creates a new list each time you call it. This can become very slow and I’ve seen many practical code snippets where the list data structure used with the + operator is the bottleneck of the whole algorithm.

In the above code snippet, you create two list objects in memory—even though your goal is probably just to update the existing list ['Alice', 'Bob', 'Ann'].

This can be nicely demonstrated in the code visualization tool:

Just keep clicking “Next” until the second list appears in memory.

Method 2: INPLACE Add (+=)

The += operator is not well understood by the Python community. Many of my students (join us for free) believe the add operation lst += [3, 4] is just short for lst = lst + [3, 4]. This is wrong and I’ll demonstrate it in the following example:

# METHOD 2: INPLACE ADD +=
lst = ['Alice', 'Bob', 'Ann']
lst_old = lst
lst += [42, 21]
print(lst)
print(lst_old)

Again, you can visualize the memory objects with the following interactive tool (click “Next”):

The takeaway is that the += operation performs INPLACE add. It changes an existing list object rather than creating a new one. This makes it more efficient in the majority of cases. Only if you absolutely need to create a new list, you should use the + operator. In all other cases, you should use the += operator or the extend() method.

Speaking of which…

Method 3: Extend()

Like the previous method +=, the list.extend(iterable) method adds a number of elements to the end of a list. The method operators in-place so no new list object is created.

# METHOD 3: EXTEND()
lst = ['Alice', 'Bob', 'Ann']
lst_old = lst
lst.extend([42, 21])
print(lst)
print(lst_old)

Here’s the interactive memory visualization:

Click “Next” and explore how the memory allocation “unfolds” as the execution proceeds.

Speed Comparison Benchmark

Having understood the differences of the three methods + vs += vs extend(), you may ask: what’s the fastest?

To help you understand why it’s important to choose the best method, I’ve performed a detailed speed benchmark on my Intel i7 (8th Gen) Notebook (8GB RAM) concatenating lists with increasing sizes using the three methods described previously.

Here’s the result:

The plot shows that with increasing list size, the runtime difference between the + method (Method 1), and the += and extend() methods (Methods 2 and 3) becomes increasingly evident. The former creates a new list for each concatenation operation—and this slows it down.

Result: Thus, both INPLACE methods += and extend() are more than 30% faster than the + method for list concatenation.

You can reproduce the result with the following code snippet:

import time # Compare runtime of three methods
list_sizes = [i * 300000 for i in range(40)]
runtimes_1 = [] # Method 1: + Operator
runtimes_2 = [] # Method 2: += Operator
runtimes_3 = [] # Method 3: extend() for size in list_sizes: to_add = list(range(size)) # Get time stamps time_0 = time.time() lst = [1] lst = lst + to_add time_1 = time.time() lst = [1] lst += to_add time_2 = time.time() lst = [1] lst.extend(to_add) time_3 = time.time() # Calculate runtimes runtimes_1.append((size, time_1 - time_0)) runtimes_2.append((size, time_2 - time_1)) runtimes_3.append((size, time_3 - time_2)) # Plot everything
import matplotlib.pyplot as plt
import numpy as np runtimes_1 = np.array(runtimes_1)
runtimes_2 = np.array(runtimes_2)
runtimes_3 = np.array(runtimes_3) print(runtimes_1)
print(runtimes_2)
print(runtimes_3) plt.plot(runtimes_1[:,0], runtimes_1[:,1], label='Method 1: +')
plt.plot(runtimes_2[:,0], runtimes_2[:,1], label='Method 2: +=')
plt.plot(runtimes_3[:,0], runtimes_3[:,1], label='Method 3: extend()') plt.xlabel('list size')
plt.ylabel('runtime (seconds)') plt.legend()
plt.savefig('speed.jpg')
plt.show()

If you liked this tutorial, join my free email list where I’ll send you the most comprehensive FREE Python email academy right in your INBOX.

Join the Finxter Community Now!

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

[Free PDF Download] Coffee Break Python – Mastery Workout

Want to boost your Python skills to the next level as an intermediate coder? Want to know how to overcome being stuck at average coding level? Do you enjoy solving puzzles?

We’re excited to release a new fresh, and tough Finxter book with 99 never-seen Python puzzles. It’s the hardest one in our Coffee Break Python series.

IF YOU CAN DO IT THERE, YOU CAN DO IT EVERYWHERE!

If you want to download a 50-page sample of the book for FREE, you’re on the right spot:

Title: Coffee Break Python – Mastery Workout

Subtitle: 99 Tricky Python Puzzles to Push You to Programming Mastery

Download link PDF (sample): https://drive.google.com/open?id=1bBH0-Eu2fsF2U31xucd75IpjJxnBxz3z

Related articles:

Posted on Leave a comment

[Top 6] What’s the Best YouTube Channel to Learn Python? Channel #1 Will Surprise You

YouTube is a great way of learning Python. But those channels top all others. In this article, you’ll find a list of the top YouTube Channels — in reverse order!


Corey Schäfer #6

Channel link: https://www.youtube.com/user/schafer5/

37,444,096 channel views

Channel Description: This channel is focused on creating tutorials and walkthroughs for software developers, programmers, and engineers. We cover topics for all different skill levels, so whether you are a beginner or have many years of experience, this channel will have something for you.

We’ve already released a wide variety of videos on topics that include: Python, Git, Development Environments, Terminal Commands, SQL, Programming Terms, JavaScript, Computer Science Fundamentals, and plenty of other tips and tricks which will help you in your career.


Clever Programmer #5

Channel link: https://www.youtube.com/channel/UCqrILQNl5Ed9Dz6CGMyvMTQ

13,893,366 channel views

Channel Description: You can find awesome programming lessons here! Also, expect programming tips and tricks that will take your coding skills to the next level.


Real Python #4

Channel Link: https://www.youtube.com/channel/UCI0vQvr9aFn27yR6Ej6n5UA

2,166,889 channel views

Channel Description: On this channel you’ll get new Python videos and screencasts every week. They’re bite-sized and to the point so you can fit them in with your day and pick up new Python skills on the side:


CS Dojo #3

Channel Link: https://www.youtube.com/channel/UCxX9wt5FWQUAAz4UrysqK9A

36,733,368 channel views

Channel Description: The videos are mostly about programming and computer science (but also some interviews).


Socratia #2

Channel Link: https://www.youtube.com/user/SocraticaStudios

23,042,289 channel views

Channel Description: Socratica makes high-quality educational videos on math and science. The videos are …. different. Check them out to see what I mean!


Sentdex #1

Channel Link: https://www.youtube.com/user/sentdex

67,432,457 channel views

Channel Description: Python Programming tutorials, going further than just the basics. Learn about machine learning, finance, data analysis, robotics, web development, game development and more.


These are the best Python channels on YouTube. Check them out, there’s an infinite number of YT videos that will make you a better coder — for free!

The Finxter Channel

You may also check out the Finxter channel which is a small Python Business related channel. If you want to improve your Python business skills, this channel is for you!

Channel Link: https://www.youtube.com/channel/UCRlWL2q80BnI4sA5ISrz9uw

Subscribe to the Python email list for continuous improvement in computer science. It’s free!

Posted on Leave a comment

Python One-Liner Webserver HTTP

Want to create your own webserver in a single line of Python code? No problem, just use this command in your shell:

$ python -m http.server 8000

The terminal will tell you:

Serving HTTP on 0.0.0.0 port 8000

To shut down your webserver, kill the Python program with CTRL+c.

This works if you’ve Python 3 installed on your system. To check your version, use the command python --version in your shell.

You can run this command in your Windows Powershell, Win Command Line, MacOS Terminal, or Linux Bash Script.

You can see in the screenshot that the server runs on your local host listening on port 8000 (the standard HTTP port to serve web requests).

Note: The IP address is NOT 0.0.0.0—this is an often-confused mistake by many readers. Instead, your webserver listens at your “local” IP address 127.0.0.1 on port 8000. Thus, only web requests issued on your computer will arrive at this port. The webserver is NOT visible to the outside world.

Python 2: To run the same simple webserver on Python 2, you need to use another command using SimpleHTTPServer instead of http:

$ python -m SimpleHTTPServer 8000
Serving HTTP on 0.0.0.0 port 8000 ...

If you want to start your webserver from within your Python script, no problem:

import http.server
import socketserver PORT = 8000 Handler = http.server.SimpleHTTPRequestHandler with socketserver.TCPServer(("", PORT), Handler) as httpd: print("serving at port", PORT) httpd.serve_forever()

You can execute this in our online Python browser (yes, you’re creating a local webserver in the browser—how cool is that)!

This code comes from the official Python documentation—feel free to read more if you’re interested in setting up the server (most of the code is relatively self-explanatory).

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

Tilde Python Pandas DataFrame

Python’s Tilde ~n operator is the bitwise negation operator: it takes the number n as binary number and “flips” all bits 0 to 1 and 1 to 0 to obtain the complement binary number. For example, the tilde operation ~1 becomes 0 and ~0 becomes 1 and ~101 becomes 010.

Read all about the Tilde operator in my detailed tutorial on this blog.

Sometimes, you’ll see the tilde operator in a Pandas DataFrame for indexing. Here’s an example:

import pandas as pd # Create a DataFrame
df = pd.DataFrame([{'User': 'Alice', 'Age': 22}, {'User': 'Bob', 'Age': 24}])
print(df) ''' User Age
0 Alice 22
1 Bob 24 ''' # Use Tilde to access all lines where user doesn't contain 'A'
df = df[~df['User'].str.contains('A')]
print(df) ''' User Age
1 Bob 24 '''

To improve your practical understanding, feel free to run this code in your browser in our interactive Python shell:

The tilde operator negates the Boolean values in the DataFrame: True becomes False and False becomes True.

You can see this in action when printing the result of different operations:

This is the original DataFrame in the code:

print(df) ''' User Age
0 Alice 22
1 Bob 24 '''

Now apply the contains operation to find all user names that contain the character 'A'.

print(df['User'].str.contains('A')) '''
0 True
1 False
Name: User, dtype: bool '''

The result is a DataFrame with Boolean values that indicate whether a user contains the character 'A' or not.

Let’s apply the Tilde operator on the result:

print(~df['User'].str.contains('A')) '''
0 False
1 True
Name: User, dtype: bool '''

Now, we use this DataFrame to access only those rows with users that don’t contain the character 'A'.

df = df[~df['User'].str.contains('A')]
print(df) ''' User Age
1 Bob 24 '''

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

How to Add an Element to a Python List at an Index?

To add an element to a given Python list, you can use either of the three following methods:

  1. Use the list insert method list.insert(index, element).
  2. Use slice assignment lst[index:index] = [element] to overwrite the empty slice with a list of one element.
  3. Use list concatenation with slicing lst[:2] + ['Alice'] + lst[2:] to create a new list object.

In the following, you’ll learn about all three methods in greater detail. But before that, feel free to test those yourself in our interactive Python shell (just click “Run” to see the output):

Method 1: insert(index, element)

The list.insert(i, element) method adds an element element to an existing list at position i. All elements j>i will be moved by one index position to the right.

Here’s an example with comments:

# Create the list
lst = [2, 4, 6, 8] # Insert string at index 2
lst.insert(2, 'Alice') # Print modified list object
print(lst)
# [2, 4, 'Alice', 6, 8]
Properties of insert()
Operates on existing list object
Simple
Fast

Check out the objects in memory while executing this code snippet (in comparison to the other methods discussed in this article):

Click “Next” to move on in the code and observe the memory objects creation.

Related article: Python List insert() Method

Method 2: Slice Assignment

Slice assignment is a little-used, beautiful Python feature to replace a slice with another sequence.

Simply select the slice you want to replace on the left and the values to replace it on the right side of the equation.

For example, the slice assignment list[2:4] = [42, 42] replaces the list elements with index 2 and 3 with the value 42.

Here’s another example that shows you how to insert the string 'Alice' into a list with four integers.

Let’s have a look at the code:

# Create the list
lst = [2, 4, 6, 8] # Insert string at index 2
lst[2:2] = ['Alice'] # Print modified list object
print(lst)
# [2, 4, 'Alice', 6, 8]
Properties of Slice Assignment
Operates on existing list object
Slightly more complex
Fast

Related article: Python Slice Assignment

Method 3: List Concatenation

If you use the + operator on two integers, you’ll get the sum of those integers. But if you use the + operator on two lists, you’ll get a new list that is the concatenation of those lists.

Here’s the same example, you’ve already seen in the previous sections:

# Create the list
lst = [2, 4, 6, 8] # Insert string at index 2
lst = lst[:2] + ['Alice'] + lst[2:] # Print modified list object
print(lst)
# [2, 4, 'Alice', 6, 8]
Properties of List Concatenation
Creates a new list object
Slightly more complex
Slower

Related article: How to Concatenate Lists in Python? [Interactive Guide]

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects is how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

Posted on Leave a comment

Google’s FooBar Challenge | See How I Passed Levels 1 to 5 in Real-Time

I just got invited to perform Google’s FooBar challenge. In this article, I want to share with you how I solved the problems in real-time. The purpose of this article is to educate you—and to have some fun. So, are you ready?

Level 1: Prime Numbers

The first goal was to find an identifier for a new employee at the “minions” company.

The identifier is selected base on a random number i. How do we come from the random integer i to the identifier of the new minion employee?

  • Create a sequence of prime numbers '23571113...'.
  • The identifier is the subsequence starting from index i and ending in index i+4 (included).
  • The value i is an integer between 0 and 10000.

Here’s the solution I implemented in the video:

def solution(i): # Determine prime sequence primes = getPrimeNumbers() return primes[i:i+5] def getPrimeNumbers(): '''Returns the string of prime numbers up to 10k+5 positions.''' s = '' prime = 2 while len(s) < 10005: # Add new prime to s s += str(prime) # Calculate next prime prime += 1 while not is_prime(prime): prime += 1 return s def is_prime(n): '''Tests if a number is prime. ''' for i in range(2,n): if n % i == 0: return False return True print(solution(0))
# 23571 print(solution(3))
# 71113

Level 2 | Part 1: Sequence Sum

Here’s the problem posed by Google:

Numbers Station Coded Messages
When you went undercover in Commander Lambda's organization, you set up a coded messaging system with Bunny Headquarters to allow them to send you important mission updates. Now that you're here and promoted to Henchman, you need to make sure you can receive those messages - but since you need to sneak them past Commander Lambda's spies, it won't be easy! Bunny HQ has secretly taken control of two of the galaxy's more obscure numbers stations, and will use them to broadcast lists of numbers. They've given you a numerical key, and their messages will be encrypted within the first sequence of numbers that adds up to that key within any given list of numbers. Given a non-empty list of positive integers l and a 
target positive integer t, write a function solution(l,
 t) which verifies if there is at least one consecutive
 sequence of positive integers within the list l (i.e. 
a contiguous sub-list) that can be summed up to the 
given target positive integer t (the key) and returns 
the lexicographically smallest list containing the 
smallest start and end indexes where this sequence can 
be found, or returns the array [-1, -1] in the case 
that there is no such sequence (to throw off Lambda's spies, not all number broadcasts will contain a coded message). For example, given the broadcast list l as [4, 3, 5, 7, 8] and the key t as 12, the function solution(l, t) would return the list [0, 2] because the list l contains the sub-list [4, 3, 5] starting at index 0 and ending at index 2, for which 4 + 3 + 5 = 12, even though there is a shorter sequence that happens later in the list (5 + 7). On the other hand, given the list l as [1, 2, 3, 4] and the key t as 15, the function solution(l, t) would return [-1, -1] because there is no sub-list of list l that can be summed up to the given target value t = 15. To help you identify the coded broadcasts, Bunny HQ has agreed to the following standards: - Each list l will contain at least 1 element but never more than 100.
- Each element of l will be between 1 and 100.
- t will be a positive integer, not exceeding 250.
- The first element of the list l has index 0.
- For the list returned by solution(l, t), the start index must be equal or smaller than the end index. Remember, to throw off Lambda's spies, Bunny HQ might include more than one contiguous sublist of a number broadcast that can be summed up to the key. You know that the message will always be hidden in the first sublist that sums up to the key, so solution(l, t) should only return that sublist. Languages
To provide a Python solution, edit solution.py
To provide a Java solution, edit Solution.java Test cases
Your code should pass the following test cases.
Note that it may also be run against hidden test cases not shown here. -- Python cases --
Input:
solution.solution([1, 2, 3, 4], 15)
Output:
-1,-1
Input:
solution.solution([4, 3, 10, 2, 8], 12)
Output:
2,3 -- Java cases --
Input:
Solution.solution({1, 2, 3, 4}, 15)
Output:
-1,-1
Input:
Solution.solution({4, 3, 10, 2, 8}, 12)
Output:
2,3 Use verify [file] to test your solution and see how it does. When you are finished editing your code, use submit [file] to submit your answer. If your solution passes the test cases, it will be removed from your home folder.

Here’s the first code that I tried:

def solution(l, t): start = 0 while start < len(l): for stop in range(start, len(l)): s = sum(l[start:stop+1]) if s == t: return [start, stop] elif s > t: break start += 1 return [-1, -1] 

The code solves the problem but it takes quadratic runtime so I though—can we do better? Yes, we can! There’s a linear runtime solution:

def solution(l, t): start = stop = 0 while start <= stop and stop < len(l): s = sum(l[start:stop+1]) if s == t: return [start, stop] elif s < t: stop += 1 else: start += 1 stop = max(start, stop) return [-1, -1]

Both solutions work—but the latter is much faster. Here’s the output and the test cases:

print(solution([250,0,0], 250))
print(solution([1,2,3,4], 15))
print(solution([4, 3, 10, 2, 8], 12))
print(solution([4, 3, 5, 7, 8], 12))
print(solution([260], 260)) '''
[0, 0]
[-1, -1]
[2, 3]
[0, 2]
[0, 0] '''

After submitting the solution in my browser shell, Google tells me that there is one more challenge to go to reach the next level: