{"id":133029,"date":"2023-04-06T13:18:51","date_gmt":"2023-04-06T13:18:51","guid":{"rendered":"https:\/\/blog.finxter.com\/?p=1271692"},"modified":"2023-04-06T13:18:51","modified_gmt":"2023-04-06T13:18:51","slug":"python-list-of-dicts-to-pandas-dataframe","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2023\/04\/06\/python-list-of-dicts-to-pandas-dataframe\/","title":{"rendered":"Python List of Dicts to Pandas DataFrame"},"content":{"rendered":"\n<p>In this article, I will discuss a popular and efficient way to work with structured data in Python using DataFrames. <\/p>\n<p class=\"has-base-2-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> A <strong>DataFrame<\/strong> is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns). It can be thought of as a table or a spreadsheet with rows and columns that can hold a variety of data types. 
<\/p>\n<p>One common challenge is <strong><em>converting a Python list of dictionaries into a DataFrame<\/em><\/strong>.<\/p>\n<p class=\"has-global-color-8-background-color has-background\"><strong>To create a DataFrame from a Python list of dicts, you can use the <code>pandas.DataFrame(list_of_dicts)<\/code> constructor.<\/strong><\/p>\n<p>Here&#8217;s a minimal example: <\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"4\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import pandas as pd\nlist_of_dicts = [{'key1': 'value1', 'key2': 'value2'}, {'key1': 'value3', 'key2': 'value4'}]\ndf = pd.DataFrame(list_of_dicts) <\/pre>\n<p>With this simple code, you can transform your list of dictionaries directly into a pandas DataFrame, giving you a clean and structured dataset to work with.<\/p>\n<p>A similar problem is discussed in this Finxter blog post: <\/p>\n<p class=\"has-base-2-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Recommended<\/strong>: <a href=\"https:\/\/blog.finxter.com\/how-to-convert-list-of-lists-to-a-pandas-dataframe\/\" data-type=\"post\" data-id=\"7942\" target=\"_blank\" rel=\"noreferrer noopener\">How to Convert List of Lists to a Pandas Dataframe<\/a><\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/python-list-of-dicts-to-pandas-dataframe\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FpcF4rYfqs34%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n<h2 class=\"wp-block-heading\">Converting Python List of 
Dicts to DataFrame<\/h2>\n<p>Let&#8217;s go through various methods and techniques, including using the DataFrame constructor, handling missing data, and assigning column names and indexes. <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f603.png\" alt=\"\ud83d\ude03\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Using DataFrame Constructor<\/h3>\n<p>The simplest way to convert a list of dictionaries to a DataFrame is by using the pandas DataFrame constructor. You can do this in just one line of code:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"3\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import pandas as pd\ndata = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]\ndf = pd.DataFrame(data)\n<\/pre>\n<p>Now, <code>df<\/code> is a DataFrame with the contents of the list of dictionaries. Easy peasy! <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f60a.png\" alt=\"\ud83d\ude0a\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Handling Missing Data<\/h3>\n<p>When your list of dictionaries contains missing keys or values, pandas automatically fills in the gaps with <code><a href=\"https:\/\/blog.finxter.com\/check-for-nan-values-in-python\/\" data-type=\"post\" data-id=\"273492\" target=\"_blank\" rel=\"noreferrer noopener\">NaN<\/a><\/code> values. 
Let&#8217;s see an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">data = [{'a': 1, 'b': 2}, {'a': 3, 'c': 4}]\ndf = pd.DataFrame(data)\n<\/pre>\n<p>The resulting DataFrame will have <code>NaN<\/code> values in the missing spots:<\/p>\n<pre class=\"wp-block-preformatted\"><code> a b c\n0 1 2.0 NaN\n1 3 NaN 4.0\n<\/code><\/pre>\n<p>Numeric columns with missing entries are stored as floats, which is why <code>2<\/code> appears as <code>2.0<\/code>. No need to manually handle missing data! <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f44d.png\" alt=\"\ud83d\udc4d\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Assigning Column Names and Indexes<\/h3>\n<p>You may want to assign custom column names or index labels when creating the DataFrame. The <code>index<\/code> parameter sets the row labels directly. With a list of dicts, however, the <code>columns<\/code> parameter selects existing keys rather than renaming them, so assign the new column names after construction:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">column_names = ['col_1', 'col_2', 'col_3']\nindex_names = ['row_1', 'row_2']\n# columns= would only select existing keys, so rename after construction\ndf = pd.DataFrame(data, index=index_names)\ndf.columns = column_names\n<\/pre>\n<p>This will create a DataFrame with the specified column names and index labels:<\/p>\n<pre class=\"wp-block-preformatted\"><code> col_1 col_2 col_3\nrow_1 1 2.0 NaN\nrow_2 3 NaN 4.0<\/code><\/pre>\n<h2 class=\"wp-block-heading\">Working with the Resulting DataFrame<\/h2>\n<p>Once you&#8217;ve converted your Python list of dictionaries into a pandas DataFrame, you can work with the data in a more structured and efficient way. 
<\/p>\n<p>In this section, I will discuss three common operations you may want to perform with a DataFrame: <\/p>\n<ul>\n<li>filtering and selecting data, <\/li>\n<li>sorting and grouping data, and <\/li>\n<li>applying functions and calculations. <\/li>\n<\/ul>\n<p>Let&#8217;s dive into each of these sub-sections! <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f603.png\" alt=\"\ud83d\ude03\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Filtering and Selecting Data<\/h3>\n<p>Working with data in a DataFrame allows you to easily filter and select specific data using various techniques. To select specific columns, you can index by column name or use the label-based <code>loc<\/code> and position-based <code>iloc<\/code> indexers.<\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/python-list-of-dicts-to-pandas-dataframe\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FJQBOpbhxQrM%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n<p class=\"has-base-2-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Recommended<\/strong>: <a href=\"https:\/\/blog.finxter.com\/pandas-loc-and-iloc-a-simple-guide-with-video\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/pandas-loc-and-iloc-a-simple-guide-with-video\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pandas loc() and iloc() \u2013 A Simple Guide with Video<\/a><\/p>\n<p>For example, if your DataFrame has columns named <code>A<\/code> and <code>B<\/code> and you need to select them, you can use the following approach:<\/p>\n<pre class=\"EnlighterJSRAW\" 
data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\nselected_columns = df[['A', 'B']]\n<\/pre>\n<p>If you want to filter rows based on certain conditions, you can use <a href=\"https:\/\/blog.finxter.com\/pandas-dataframe-indexing\/\" data-type=\"post\" data-id=\"64801\" target=\"_blank\" rel=\"noreferrer noopener\">boolean indexing<\/a>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\nfiltered_data = df[(df['A'] > 5) &amp; (df['B'] &lt; 10)]\n<\/pre>\n<p>This will return all the rows where column A contains values greater than 5 and column B contains values less than 10. <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f680.png\" alt=\"\ud83d\ude80\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Sorting and Grouping Data<\/h3>\n<p>Sorting your DataFrame can make it easier to analyze and visualize the data. You can sort the data using the <code>sort_values<\/code> method, specifying the column(s) to sort by and the sorting order:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\nsorted_data = df.sort_values(by=['A'], ascending=True)\n<\/pre>\n<p>Grouping data is also a powerful operation to perform statistical analysis or data aggregation. 
You can use the <code>groupby<\/code> method to group the data by a specific column:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\ngrouped_data = df.groupby(['A']).sum()\n<\/pre>\n<p>In this case, I&#8217;m grouping the data by column A and aggregating the values using the sum function. These operations can help you better understand patterns and trends in your data. <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4ca.png\" alt=\"\ud83d\udcca\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<h3 class=\"wp-block-heading\">Applying Functions and Calculations<\/h3>\n<p>DataFrames allow you to easily apply functions and calculations on your data. You can use the <code><a href=\"https:\/\/blog.finxter.com\/the-pandas-apply-function\/\" data-type=\"post\" data-id=\"37756\" target=\"_blank\" rel=\"noreferrer noopener\">apply<\/a><\/code> and <code><a href=\"https:\/\/blog.finxter.com\/how-to-apply-a-function-to-each-cell-in-a-pandas-dataframe\/\" data-type=\"post\" data-id=\"595293\" target=\"_blank\" rel=\"noreferrer noopener\">applymap<\/a><\/code> methods to apply functions to columns, rows, or individual cells.<\/p>\n<p>For example, if you want to calculate the square of each value in column A, you can use the <code>apply<\/code> method:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\ndf['A_squared'] = df['A'].apply(lambda x: x**2)\n<\/pre>\n<p>Alternatively, if you need to apply a function to all cells in the DataFrame, you can use the <code>applymap<\/code> method:<\/p>\n<pre class=\"EnlighterJSRAW\" 
data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\ndf_cleaned = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)\n<\/pre>\n<p>In this example, I&#8217;m using <code>applymap<\/code> to <a href=\"https:\/\/blog.finxter.com\/python-string-strip\/\" data-type=\"post\" data-id=\"26104\" target=\"_blank\" rel=\"noreferrer noopener\">strip<\/a> all strings in the DataFrame, removing any leading and trailing whitespace. In pandas 2.1 and later, <code>applymap<\/code> is deprecated in favor of <code>DataFrame.map<\/code>, which is called the same way. Using these methods will make your data processing and analysis tasks more efficient and easier to manage. <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4aa.png\" alt=\"\ud83d\udcaa\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/><\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p>To keep improving your data science skills, make sure you know what you&#8217;re getting yourself into: <img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f447.png\" alt=\"\ud83d\udc47\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><a href=\"https:\/\/blog.finxter.com\/data-scientist-income-and-opportunity\/\" target=\"_blank\" rel=\"noreferrer noopener\"><img decoding=\"async\" loading=\"lazy\" width=\"987\" height=\"567\" src=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2023\/04\/image-71.png\" alt=\"\" class=\"wp-image-1271712\" srcset=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2023\/04\/image-71.png 987w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2023\/04\/image-71-300x172.png 300w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2023\/04\/image-71-768x441.png 768w\" sizes=\"auto, (max-width: 987px) 100vw, 987px\" \/><\/a><\/figure>\n<\/div>\n<p class=\"has-base-2-background-color 
has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Recommended<\/strong>: <a href=\"https:\/\/blog.finxter.com\/data-scientist-income-and-opportunity\/\" data-type=\"post\" data-id=\"332478\" target=\"_blank\" rel=\"noreferrer noopener\">Data Scientist &#8211; Income and Opportunity<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, I will discuss a popular and efficient way to work with structured data in Python using DataFrames. A DataFrame is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns). It can be thought of as a table or a spreadsheet with rows and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[857],"tags":[73,468,528],"class_list":["post-133029","post","type-post","status-publish","format-standard","hentry","category-python-tut","tag-programming","tag-python","tag-tutorial"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/133029","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=133029"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/133029\/revisions"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=133029"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickga
ming.net\/blog\/wp-json\/wp\/v2\/categories?post=133029"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=133029"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}