[Tut] Python Join List of DataFrames


To join a list of DataFrames, say dfs, use the pandas.concat(dfs) function, which concatenates an arbitrary number of DataFrames into a single one.

When browsing StackOverflow, I recently stumbled upon the following interesting problem. By thinking about solutions to those small data science problems, you can improve your data science skills, so let’s dive into the problem description.

Problem: Given a list of Pandas DataFrames, how do you merge them into a single DataFrame?

Example: You have the list of Pandas DataFrames:

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# List of DataFrames
dfs = [df1, df2, df3]

Say you want to get the following DataFrame:

       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000

Method 1: Pandas Concat


This is the easiest and most straightforward way to concatenate multiple DataFrames.

import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# List of DataFrames
dfs = [df1, df2, df3]

# Method 1: concatenate all DataFrames row-wise
df = pd.concat(dfs)

This generates the following output:

print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000
'''

The resulting DataFrame contains all original data from all three DataFrames. The index labels 0 to 2 repeat because concat keeps the index of each input DataFrame.
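
If you would rather have one continuous index, a possible variation is to pass ignore_index=True to pandas.concat. The sketch below reuses the three DataFrames from the example; the names df_default and df_reset are only illustrative and do not appear in the original tutorial.

```python
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})
dfs = [df1, df2, df3]

# Default behavior: each DataFrame keeps its own index labels
df_default = pd.concat(dfs)

# ignore_index=True discards the old labels and numbers the rows 0..n-1
df_reset = pd.concat(dfs, ignore_index=True)

print(df_default.index.tolist())  # [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(df_reset.index.tolist())    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Both results hold the same nine rows; only the index labels differ.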

Method 2: Reduce + DataFrame Merge


The following method uses the reduce function to repeatedly merge all DataFrames in the list, no matter how many it contains. Two DataFrames are merged with the df.merge() method. You can choose among several merge strategies; the example uses "outer":

from functools import reduce
import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# List of DataFrames
dfs = [df1, df2, df3]

# Method 2: merge the DataFrames pairwise, from left to right
df = reduce(lambda left, right: left.merge(right, how="outer"), dfs)

This generates output like the following (the row order may differ depending on your pandas version):

print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
3         19       25
4      25000    11000
5         20       26
6      26000    10000
'''
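
To see what reduce does in Method 2, it can help to write the same computation as an explicit loop. This is only a sketch for illustration; the variable names merged and reduced are assumptions and not part of the original tutorial.

```python
from functools import reduce
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})
dfs = [df1, df2, df3]

# Explicit loop: start with the first DataFrame and merge in the others one by one
merged = dfs[0]
for right in dfs[1:]:
    merged = merged.merge(right, how="outer")

# The same sequence of merges expressed with reduce
reduced = reduce(lambda left, right: left.merge(right, how="outer"), dfs)

print(merged.equals(reduced))  # True
```

Because no join columns are given, merge joins on all columns the two DataFrames share, here both Alice and Bob. The row (scientist, student) occurs in every DataFrame and therefore appears only once in the result.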

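The pandas documentation for DataFrame.merge() describes the different merge strategies. If you used "inner" instead, you would obtain the following result, because the row (scientist, student) is the only one that appears in all three DataFrames: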

       Alice      Bob
0  scientist  student
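
The two strategies can be compared side by side with a small helper. The function merge_all below is a hypothetical name introduced only for this sketch; it wraps the reduce call from Method 2 and forwards the strategy through the how parameter of DataFrame.merge().

```python
from functools import reduce
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})
dfs = [df1, df2, df3]

def merge_all(frames, how):
    """Hypothetical helper: merge a list of DataFrames pairwise with one strategy."""
    return reduce(lambda left, right: left.merge(right, how=how), frames)

outer = merge_all(dfs, "outer")  # union of all rows
inner = merge_all(dfs, "inner")  # only rows present in every DataFrame

print(len(outer))  # 7
print(len(inner))  # 1
```

The outer merge keeps every distinct row, while the inner merge keeps only the row shared by all three DataFrames.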



