06-16-2020, 07:16 AM
Python Join List of DataFrames
<div><p class="has-background has-luminous-vivid-amber-background-color"><strong>To <a href="https://blog.finxter.com/python-join-list/" target="_blank" rel="noreferrer noopener">join</a> a list of DataFrames, say <code>dfs</code>, use the <code>pandas.concat(dfs)</code> function, which concatenates an arbitrary number of DataFrames into a single one.</strong></p>
<p>When browsing <a rel="noreferrer noopener" href="https://stackoverflow.com/questions/32444138/concatenate-a-list-of-pandas-dataframes-together" target="_blank">StackOverflow</a>, I recently stumbled upon the following interesting problem. By thinking about solutions to those small data science problems, you can <a href="https://blog.finxter.com/coffee-break-numpy/" target="_blank" rel="noreferrer noopener">improve your data science skills</a>, so let’s dive into the problem description.</p>
<p><strong>Problem</strong>: Given a list of Pandas <a href="https://blog.finxter.com/tilde-python-pandas-dataframe/" target="_blank" rel="noreferrer noopener">DataFrames</a>, how do you combine them into a single DataFrame?</p>
<p><strong>Example</strong>: You have the list of <a href="https://blog.finxter.com/pandas-cheat-sheets/" target="_blank" rel="noreferrer noopener">Pandas </a>DataFrames:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# List of DataFrames
dfs = [df1, df2, df3]</pre>
<p>Say you want to get the following DataFrame:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000</pre>
<p><em><strong>Exercise</strong>: Print the resulting DataFrame. Run the code. Which merging strategy is used?</em></p>
<h2>Method 1: Pandas Concat</h2>
<p><code>pd.concat()</code> is the most direct way to stack multiple DataFrames: it takes the whole list in a single call and appends the rows in list order.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# list of dataframes
dfs = [df1, df2, df3]

# Method 1: stack all DataFrames row-wise
df = pd.concat(dfs)</pre>
<p>This generates the following output:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
0         19       25
1  scientist  student
2      25000    11000
0         20       26
1  scientist  student
2      26000    10000
'''</pre>
<p>The resulting DataFrame contains all rows from the three original DataFrames. Each row keeps its original index label, which is why the labels 0, 1, 2 repeat.</p>
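<p>If the repeated index labels get in the way, <code>pd.concat()</code> has two documented options that may help: <code>ignore_index=True</code> and <code>keys</code>. The following is a minimal sketch using the same three DataFrames; the key labels <code>'df1'</code>, <code>'df2'</code>, <code>'df3'</code> are arbitrary names chosen for this illustration.</p>

```python
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})
dfs = [df1, df2, df3]

# Default: every frame keeps its own 0..2 labels, so the labels repeat.
stacked = pd.concat(dfs)

# ignore_index=True discards the old labels and numbers the rows 0..8.
renumbered = pd.concat(dfs, ignore_index=True)

# keys=... adds an outer index level recording which frame each row came from.
# The labels below are arbitrary names picked for this sketch.
labeled = pd.concat(dfs, keys=['df1', 'df2', 'df3'])

print(stacked.index.tolist())     # [0, 1, 2, 0, 1, 2, 0, 1, 2]
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(labeled.loc['df2'])         # the three rows that came from df2
```

<p>Which variant fits depends on whether the original row labels carry meaning in your data.</p>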
<h2>Method 2: Reduce + DataFrame Merge</h2>
<p>The following method uses <code>functools.reduce()</code> to repeatedly merge all DataFrames in the list, no matter how long it is. Each pair of DataFrames is merged with the <a rel="noreferrer noopener" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html" target="_blank"><code>df.merge()</code></a> method. You can choose among several merging strategies; in the example, you use <code>"outer"</code>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'Alice' : [18, 'scientist', 24000], 'Bob' : [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice' : [19, 'scientist', 25000], 'Bob' : [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice' : [20, 'scientist', 26000], 'Bob' : [26, 'student', 10000]})

# list of dataframes
dfs = [df1, df2, df3]

# Method 2: fold an outer merge over the list, pair by pair
df = reduce(lambda left, right: left.merge(right, how="outer"), dfs)</pre>
<p>This generates the following output:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(df)
'''
       Alice      Bob
0         18       24
1  scientist  student
2      24000    12000
3         19       25
4      25000    11000
5         20       26
6      26000    10000
'''</pre>
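<p>The merged output has seven rows rather than nine because <code>merge()</code>, when called without <code>on</code>, joins on every column the two DataFrames share, so rows that are identical in both collapse into one. The sketch below illustrates this for the first two DataFrames only; the helper <code>rows()</code> is a hypothetical name introduced here to compare results without depending on row order, which may differ between pandas versions.</p>

```python
import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})


def rows(df):
    """Hypothetical helper: the DataFrame's rows as a set of tuples."""
    return set(map(tuple, df.itertuples(index=False)))


# No `on` given: pandas joins on all shared columns (here Alice and Bob).
implicit = df1.merge(df2, how="outer")

# Spelling the join keys out gives the same rows.
explicit = df1.merge(df2, how="outer", on=["Alice", "Bob"])

# Stacking and then dropping duplicate rows also yields the same set of rows.
union = pd.concat([df1, df2]).drop_duplicates()

print(rows(implicit) == rows(explicit) == rows(union))  # True
print(len(implicit))  # 5: three rows from each frame, minus one shared row
```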
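<p>Because <code>merge()</code> joins on all shared columns by default, the identical scientist/student row appears only once in the outer result. You can find a description of the different merge strategies <a rel="noreferrer noopener" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html" target="_blank">here</a>. If you used <code>"inner"</code> instead, you would keep only the row that occurs in all three DataFrames:</p>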
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">       Alice      Bob
0  scientist  student</pre>
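<p>To compare strategies side by side, the reduce-and-merge step can be wrapped in a small function. This is a sketch only: <code>merge_all</code> is a hypothetical helper name, not part of pandas, and it simply forwards the <code>how</code> argument to <code>DataFrame.merge()</code>.</p>

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'Alice': [18, 'scientist', 24000], 'Bob': [24, 'student', 12000]})
df2 = pd.DataFrame({'Alice': [19, 'scientist', 25000], 'Bob': [25, 'student', 11000]})
df3 = pd.DataFrame({'Alice': [20, 'scientist', 26000], 'Bob': [26, 'student', 10000]})
dfs = [df1, df2, df3]


def merge_all(frames, how):
    """Hypothetical helper: fold DataFrame.merge over a list of DataFrames."""
    return reduce(lambda left, right: left.merge(right, how=how), frames)


outer = merge_all(dfs, "outer")  # union of all distinct rows
inner = merge_all(dfs, "inner")  # only rows present in every frame

print(len(outer))  # 7
print(len(inner))  # 1
print(inner.iloc[0].tolist())  # ['scientist', 'student']
```

<p>Note that neither strategy reproduces the nine-row result of <code>pd.concat(dfs)</code> for this data, since merging treats rows as join keys rather than simply stacking them.</p>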
</div>
https://www.sickgaming.net/blog/2020/06/...ataframes/