[Tut] Python List of Dicts to Pandas DataFrame - xSicKxBot - 04-11-2023

Python List of Dicts to Pandas DataFrame

<div>
<p>In this article, I will discuss a popular and efficient way to work with structured data in Python using DataFrames. </p>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> A <strong>DataFrame</strong> is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns). It can be thought of as a table or a spreadsheet with rows and columns that can hold a variety of data types. </p>
<p>One common challenge is <strong><em>converting a Python list of dictionaries into a DataFrame</em></strong>.</p>
<p class="has-global-color-8-background-color has-background"><strong>To create a DataFrame from a Python list of dicts, you can use the <code>pandas.DataFrame(list_of_dicts)</code> constructor.</strong></p>
<p>Here’s a minimal example: </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd
list_of_dicts = [{'key1': 'value1', 'key2': 'value2'}, {'key1': 'value3', 'key2': 'value4'}]
df = pd.DataFrame(list_of_dicts) </pre>
<p>With this simple code, you can transform your list of dictionaries directly into a pandas DataFrame, giving you a clean and structured dataset to work with.</p>
<p>A similar problem is discussed in this Finxter blog post: </p>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: <a href="https://blog.finxter.com/how-to-convert-list-of-lists-to-a-pandas-dataframe/" data-type="post" data-id="7942" target="_blank" rel="noreferrer noopener">How to Convert List of Lists to a Pandas Dataframe</a></p>
<h2 class="wp-block-heading">Converting Python List of Dicts to DataFrame</h2>
<p>Let’s go through various methods and techniques, including using the DataFrame constructor, handling missing data, and assigning column names and indexes. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f603.png" alt="😃" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<h3 class="wp-block-heading">Using DataFrame Constructor</h3>
<p>The simplest way to convert a list of dictionaries to a DataFrame is by using the pandas DataFrame constructor. You can do this in just one line of code:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
df = pd.DataFrame(data)
</pre>
<p>Now, <code>df</code> is a DataFrame with the contents of the list of dictionaries. Easy peasy! <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f60a.png" alt="😊" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
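<p>To check what the constructor produced, a minimal sketch like the following may help. It reuses the <code>data</code> list from above; <code>df_alt</code> is a name introduced here for illustration, and <code>DataFrame.from_records</code> is shown only as an alternative constructor that also accepts a list of dicts.</p>

```python
import pandas as pd

data = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
df = pd.DataFrame(data)

# Each dict became one row; the dict keys became the column labels.
print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels taken from the dict keys
print(df.dtypes)         # one dtype per column

# from_records is an alternative constructor that also accepts a list of dicts.
df_alt = pd.DataFrame.from_records(data)
print(df_alt)
```

<p>For a plain list of dicts the two constructors should give the same table, so the ordinary constructor is usually all you need.</p>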
<h3 class="wp-block-heading">Handling Missing Data</h3>
<p>When your list of dictionaries contains missing keys or values, pandas automatically fills in the gaps with <code><a href="https://blog.finxter.com/check-for-nan-values-in-python/" data-type="post" data-id="273492" target="_blank" rel="noreferrer noopener">NaN</a></code> values. Let’s see an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">data = [{'a': 1, 'b': 2}, {'a': 3, 'c': 4}]
df = pd.DataFrame(data)
</pre>
<p>The resulting DataFrame will have <code>NaN</code> values in the missing spots:</p>
<pre class="wp-block-preformatted"><code>   a    b    c
0  1  2.0  NaN
1  3  NaN  4.0
</code></pre>
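<p>You don’t have to insert placeholders yourself, but the gaps are still there: a column that receives a <code>NaN</code> is stored as floating point (which is why <code>b</code> and <code>c</code> show <code>2.0</code> and <code>4.0</code>), and you may still want to fill or drop the missing values before analysis. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f44d.png" alt="👍" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>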
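<p>As a rough sketch of what you might do with those <code>NaN</code> values afterwards, the snippet below counts them and then either fills or drops them. The names <code>filled</code> and <code>complete_rows</code> are made up for this example, and filling with <code>0</code> is just one possible choice.</p>

```python
import pandas as pd

data = [{'a': 1, 'b': 2}, {'a': 3, 'c': 4}]
df = pd.DataFrame(data)

# Columns that received a NaN are stored as float64; 'a' stays an integer column.
print(df.dtypes)

# Count the missing cells per column.
print(df.isna().sum())

# Two common follow-ups: replace the gaps, or drop rows that have any gap.
filled = df.fillna(0)
complete_rows = df.dropna()
```

<p>Note that in this tiny dataset every row has one gap, so <code>dropna()</code> leaves no rows at all; on real data you would typically restrict it to the columns you care about.</p>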
<h3 class="wp-block-heading">Assigning Column Names and Indexes</h3>
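<p>You may want to assign custom column names or index labels when creating the DataFrame. The <code>index</code> parameter sets the row labels directly. The <code>columns</code> parameter, however, does not rename anything when the input is a list of dicts: it selects which dictionary keys become columns, so labels that match no key produce columns filled with <code>NaN</code>. To get custom column names, build the DataFrame first and then rename the columns:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">index_names = ['row_1', 'row_2']
df = pd.DataFrame(data, index=index_names)
# Map the original dict keys to the new column names
df = df.rename(columns={'a': 'col_1', 'b': 'col_2', 'c': 'col_3'})
</pre>
<p>This will create a DataFrame with the specified column names and index labels:</p>
<pre class="wp-block-preformatted"><code>       col_1  col_2  col_3
row_1      1    2.0    NaN
row_2      3    NaN    4.0</code></pre>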
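<p>A note on the <code>columns</code> parameter: when the input is a list of dicts, pandas treats it as a selection of dictionary keys rather than as a list of new names. The sketch below illustrates this; <code>subset</code>, <code>unmatched</code>, and the label <code>'col_x'</code> are hypothetical names chosen for the example.</p>

```python
import pandas as pd

data = [{'a': 1, 'b': 2}, {'a': 3, 'c': 4}]

# With a list of dicts, `columns` picks keys and fixes their order.
subset = pd.DataFrame(data, columns=['c', 'a'])
print(subset)

# A label that matches no key yields a column of NaN rather than a rename.
unmatched = pd.DataFrame(data, columns=['a', 'col_x'])  # 'col_x' is a made-up label
print(unmatched)
```

<p>If you want different names, renaming after construction (for example with <code>rename(columns=...)</code>) is the safer route.</p>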
<h2 class="wp-block-heading">Working with the Resulting DataFrame</h2>
<p>Once you’ve converted your Python list of dictionaries into a pandas DataFrame, you can work with the data in a more structured and efficient way. </p>
<p>In this section, I will discuss three common operations you may want to perform with a DataFrame: </p>
<ul>
<li>filtering and selecting data, </li>
<li>sorting and grouping data, and </li>
<li>applying functions and calculations. </li>
</ul>
<p>Let’s dive into each of these sub-sections! <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f603.png" alt="😃" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<h3 class="wp-block-heading">Filtering and Selecting Data</h3>
<p>Working with data in a DataFrame allows you to easily filter and select specific data using various techniques. To select specific columns, you can index the DataFrame with column names or use the <code>loc</code> (label-based) and <code>iloc</code> (position-based) indexers.</p>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: <a href="https://blog.finxter.com/pandas-loc-and-iloc-a-simple-guide-with-video/" data-type="URL" data-id="https://blog.finxter.com/pandas-loc-and-iloc-a-simple-guide-with-video/" target="_blank" rel="noreferrer noopener">Pandas loc() and iloc() – A Simple Guide with Video</a></p>
<p>For example, if your DataFrame has columns named A and B (the examples in this section assume such a DataFrame), you can select them with the following approach:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
selected_columns = df[['A', 'B']]
</pre>
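<p>Here is a small sketch comparing the three selection styles mentioned above. The sample data and the variable names <code>by_name</code>, <code>by_label</code>, and <code>by_position</code> are assumptions made for illustration.</p>

```python
import pandas as pd

# Hypothetical sample data with columns 'A', 'B' and 'C'.
df = pd.DataFrame([
    {'A': 2, 'B': 20, 'C': 'x'},
    {'A': 7, 'B': 5, 'C': 'y'},
    {'A': 9, 'B': 12, 'C': 'z'},
])

by_name = df[['A', 'B']]          # list of column labels
by_label = df.loc[:, ['A', 'B']]  # label-based: all rows, columns 'A' and 'B'
by_position = df.iloc[:, 0:2]     # position-based: all rows, first two columns

print(by_name)
```

<p>All three return the same two columns here; <code>iloc</code> depends on column order, so the label-based forms tend to be more robust when the layout changes.</p>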
<p>If you want to filter rows based on certain conditions, you can use <a href="https://blog.finxter.com/pandas-dataframe-indexing/" data-type="post" data-id="64801" target="_blank" rel="noreferrer noopener">boolean indexing</a>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
filtered_data = df[(df['A'] > 5) &amp; (df['B'] &lt; 10)]
</pre>
<p>This will return all the rows where column A contains values greater than 5 and column B contains values less than 10. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
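<p>To see the filter in action, here is a self-contained sketch on made-up data. The <code>mask</code> and <code>same_rows</code> names are introduced for this example, and <code>query()</code> is shown as one alternative way to write the same condition.</p>

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([
    {'A': 2, 'B': 20},
    {'A': 7, 'B': 5},
    {'A': 9, 'B': 12},
])

# Each comparison needs its own parentheses because & binds tighter than > and <.
mask = (df['A'] > 5) & (df['B'] < 10)
filtered_data = df[mask]
print(filtered_data)

# query() expresses the same condition as a string.
same_rows = df.query('A > 5 and B < 10')
```

<p>Only the middle row satisfies both conditions, and it keeps its original index label.</p>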
<h3 class="wp-block-heading">Sorting and Grouping Data</h3>
<p>Sorting your DataFrame can make it easier to analyze and visualize the data. You can sort the data using the <code>sort_values</code> method, specifying the column(s) to sort by and the sorting order:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
sorted_data = df.sort_values(by=['A'], ascending=True)
</pre>
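<p>Sorting by more than one column works the same way. The following sketch, on hypothetical data, sorts by A ascending and breaks ties by B descending; <code>sorted_multi</code> is a name chosen for the example.</p>

```python
import pandas as pd

# Hypothetical sample data with a tie in column 'A'.
df = pd.DataFrame([
    {'A': 2, 'B': 1},
    {'A': 1, 'B': 5},
    {'A': 2, 'B': 9},
])

# One ascending flag per sort column: A ascending, then B descending.
sorted_multi = df.sort_values(by=['A', 'B'], ascending=[True, False])

# sort_values returns a new DataFrame; reset_index renumbers the rows.
sorted_multi = sorted_multi.reset_index(drop=True)
print(sorted_multi)
```

<p>The original <code>df</code> is left unchanged, which is why the result is assigned to a new variable.</p>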
<p>Grouping data is also a powerful operation to perform statistical analysis or data aggregation. You can use the <code>groupby</code> method to group the data by a specific column:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
grouped_data = df.groupby(['A']).sum()
</pre>
<p>In this case, I’m grouping the data by column A and aggregating the values using the sum function. These operations can help you better understand patterns and trends in your data. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ca.png" alt="📊" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
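<p>A worked sketch of grouping on invented data is shown below. It also uses named aggregation via <code>agg</code> to compute different statistics per column; the output column names <code>B_total</code> and <code>C_mean</code> are arbitrary labels picked for this example.</p>

```python
import pandas as pd

# Hypothetical sample data: 'A' is the grouping key.
df = pd.DataFrame([
    {'A': 'x', 'B': 1, 'C': 10},
    {'A': 'y', 'B': 2, 'C': 20},
    {'A': 'x', 'B': 3, 'C': 30},
])

# Sum every remaining column within each group of 'A'.
totals = df.groupby('A').sum()
print(totals)

# Named aggregation: a different statistic for each column.
summary = df.groupby('A').agg(B_total=('B', 'sum'), C_mean=('C', 'mean'))
print(summary)
```

<p>The grouping key becomes the index of the result, so individual groups can be looked up with <code>loc</code>.</p>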
<h3 class="wp-block-heading">Applying Functions and Calculations</h3>
<p>DataFrames allow you to easily apply functions and calculations on your data. You can use the <code><a href="https://blog.finxter.com/the-pandas-apply-function/" data-type="post" data-id="37756" target="_blank" rel="noreferrer noopener">apply</a></code> and <code><a href="https://blog.finxter.com/how-to-apply-a-function-to-each-cell-in-a-pandas-dataframe/" data-type="post" data-id="595293" target="_blank" rel="noreferrer noopener">applymap</a></code> methods to apply functions to columns, rows, or individual cells.</p>
<p>For example, if you want to calculate the square of each value in column A, you can use the <code>apply</code> method:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
df['A_squared'] = df['A'].apply(lambda x: x**2)
</pre>
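<p>For simple arithmetic like this, a vectorized expression gives the same result without calling a Python function per value. The sketch below compares both and adds a row-wise <code>apply</code> with <code>axis=1</code>; the data and the column names <code>A_squared_vec</code> and <code>A_plus_B</code> are assumptions for illustration.</p>

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([
    {'A': 1, 'B': 10},
    {'A': 2, 'B': 20},
    {'A': 3, 'B': 30},
])

# Element-wise on one column with apply ...
df['A_squared'] = df['A'].apply(lambda x: x**2)

# ... and the equivalent vectorized expression.
df['A_squared_vec'] = df['A'] ** 2

# axis=1 passes each row to the function, so several columns can be combined.
df['A_plus_B'] = df.apply(lambda row: row['A'] + row['B'], axis=1)
print(df)
```

<p>On large DataFrames the vectorized form is usually noticeably faster, so <code>apply</code> is best kept for logic that has no built-in equivalent.</p>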
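<p>Alternatively, if you need to apply a function to all cells in the DataFrame, you can use the <code>applymap</code> method (in pandas 2.1 and later, <code>applymap</code> is deprecated in favor of the equivalent <code>DataFrame.map</code>):</p>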
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">
df_cleaned = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
</pre>
<p>In this example, I’m using <code>applymap</code> to <a href="https://blog.finxter.com/python-string-strip/" data-type="post" data-id="26104" target="_blank" rel="noreferrer noopener">strip</a> all strings in the DataFrame, removing leading and trailing whitespace while leaving non-string values untouched. These methods keep your data processing code short and easier to manage; for large DataFrames, built-in vectorized operations are usually faster than calling a Python function on every element. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4aa.png" alt="💪" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
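<p>Because pandas 2.1 renamed <code>applymap</code> to <code>DataFrame.map</code>, a version-tolerant sketch might look like the following. The sample data, the helper <code>strip_if_str</code>, and the variable <code>elementwise</code> are hypothetical names introduced here.</p>

```python
import pandas as pd

# Hypothetical sample data with stray whitespace in the string column.
df = pd.DataFrame([
    {'name': '  Ann ', 'age': 31},
    {'name': 'Bob  ', 'age': 25},
])

def strip_if_str(x):
    """Strip leading/trailing whitespace from strings; return other values as-is."""
    return x.strip() if isinstance(x, str) else x

# Prefer DataFrame.map where it exists (pandas 2.1+), else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap
df_cleaned = elementwise(strip_if_str)
print(df_cleaned)

# For a single string column, the .str accessor does the same job.
names = df['name'].str.strip()
```

<p>If only a few columns hold strings, the <code>.str.strip()</code> form is often the simpler choice.</p>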
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<p>To keep improving your data science skills, make sure you know what you’re getting yourself into: <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f447.png" alt="👇" class="wp-smiley" style="height: 1em; max-height: 1em;" /> </p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><a href="https://blog.finxter.com/data-scientist-income-and-opportunity/" target="_blank" rel="noreferrer noopener"><img decoding="async" loading="lazy" width="987" height="567" src="https://blog.finxter.com/wp-content/uploads/2023/04/image-71.png" alt="" class="wp-image-1271712" srcset="https://blog.finxter.com/wp-content/uploads/2023/04/image-71.png 987w, https://blog.finxter.com/wp-content/uploads/2023/04/image-71-300x172.png 300w, https://blog.finxter.com/wp-content/uploads/2023/04/image-71-768x441.png 768w" sizes="(max-width: 987px) 100vw, 987px" /></a></figure>
</div>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: <a href="https://blog.finxter.com/data-scientist-income-and-opportunity/" data-type="post" data-id="332478" target="_blank" rel="noreferrer noopener">Data Scientist – Income and Opportunity</a></p>
</div>


https://www.sickgaming.net/blog/2023/04/06/python-list-of-dicts-to-pandas-dataframe/