[Tut] Python – Finding the Most Common Element in a Column - Printable Version +- Sick Gaming (https://www.sickgaming.net) +-- Forum: Programming (https://www.sickgaming.net/forum-76.html) +--- Forum: Python (https://www.sickgaming.net/forum-83.html) +--- Thread: [Tut] Python – Finding the Most Common Element in a Column (/thread-99913.html) |
[Tut] Python – Finding the Most Common Element in a Column - xSicKxBot - 09-06-2022 Python – Finding the Most Common Element in a Column <div> <div class="kk-star-ratings kksr-auto kksr-align-left kksr-valign-top" data-payload="{"align":"left","id":"644227","slug":"default","valign":"top","ignore":"","reference":"auto","class":"","count":"1","readonly":"","score":"5","best":"5","gap":"5","greet":"Rate this post","legend":"5\/5 - (1 vote)","size":"24","width":"142.5","_legend":"{score}\/{best} - ({count} {votes})","font_factor":"1.25"}"> <div class="kksr-stars"> <div class="kksr-stars-inactive"> <div class="kksr-star" data-star="1" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="2" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="3" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="4" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="5" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> </p></div> <div class="kksr-stars-active" style="width: 142.5px;"> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> </p></div> </div> <div class="kksr-legend" style="font-size: 19.2px;"> 5/5 – (1 vote) </div> </div> <h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">This article will show you how to find the most common element in a Pandas Column.</p> <p>To make it more interesting, we have the following running scenario:</p> <p><em>You have been provided with a <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">downloadable</a> CSV file containing crime statistics for the San Diego area, including their respective <a rel="noreferrer noopener" href="https://secure.ssa.gov/poms.nsf/lnx/0202613900" data-type="URL" data-id="https://secure.ssa.gov/poms.nsf/lnx/0202613900" target="_blank">NCIC</a> Crime Codes. </em></p> <hr class="wp-block-separator has-alpha-channel-opacity wp-embed-aspect-16-9 wp-has-aspect-ratio"/> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><em><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ac.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would you determine the most common <a rel="noreferrer noopener" href="https://secure.ssa.gov/poms.nsf/lnx/0202613900" data-type="URL" data-id="https://secure.ssa.gov/poms.nsf/lnx/0202613900" target="_blank">NCIC</a> Crime Code that occurs in San Diego’s jurisdiction?</em></p> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this task by one of the following options:</p> <ul type="video" class="wp-embed-aspect-16-9 wp-has-aspect-ratio"> <li><strong>Method </strong>1: Use Pandas <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-mode-method/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-mode-method/" target="_blank"><code>mode()</code></a></li> <li><strong>Method </strong>2: Use Pandas <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a></li> <li><strong>Method </strong>3: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-methods-idxmax-idxmin-reindex-reindex_like-rename-rename_axis/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-methods-idxmax-idxmin-reindex-reindex_like-rename-rename_axis/" target="_blank"><code>idxmax()</code></a></li> <li><strong>Method 4</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> and <a rel="noreferrer noopener" href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.keys.html" data-type="URL" data-id="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.keys.html" target="_blank"><code>keys()</code></a></li> <li><strong>Method 5</strong>: Use Pandas <a rel="noreferrer noopener" href="https://blog.finxter.com/the-pandas-groupby-method/" data-type="URL" data-id="https://blog.finxter.com/the-pandas-groupby-method/" target="_blank"><code>groupby()</code></a></li> </ul> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Preparation</h2> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Before moving forward, please ensure the <a href="https://blog.finxter.com/category/pandas-library/" data-type="URL" data-id="https://blog.finxter.com/category/pandas-library/" target="_blank" rel="noreferrer noopener">Pandas</a> library is installed. Click <a href="https://blog.finxter.com/how-to-install-pandas-in-python/" data-type="URL" data-id="https://blog.finxter.com/how-to-install-pandas-in-python/" target="_blank" rel="noreferrer noopener">here</a> if you require instructions.</p> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Then, add the following code to the top of each script. This snippet will allow the code in this article to run error-free.</p> <pre class="EnlighterJSRAW wp-embed-aspect-16-9 wp-has-aspect-ratio" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd</pre> <p>After importing the <a rel="noreferrer noopener" href="https://blog.finxter.com/numpy-tutorial/" data-type="URL" data-id="https://blog.finxter.com/numpy-tutorial/" target="_blank"></a><a rel="noreferrer noopener" href="https://blog.finxter.com/category/pandas-library/" data-type="URL" data-id="https://blog.finxter.com/category/pandas-library/" target="_blank">Pandas</a> library, this library is referenced by calling the shortcode (<code>pd</code>).</p> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Method 1: Use Pandas mode()</h2> <p class="has-global-color-8-background-color has-background">This example uses the <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-mode-method/" target="_blank"><code>mode()</code></a> method to determine the single most common crime committed in San Diego on a given day.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr']) max_crime = df['crimedescr'].mode() print(max_crime)</pre> <p>The above code reads in the <code>crimedescr </code>column from the <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">crimes.csv</a> file downloaded earlier. This saves to the DataFrame <code>df</code>.</p> <p>Next, the <code>crimedescr </code>column is then accessed, and the <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-mode-method/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-mode-method/" target="_blank"><code>mode()</code></a> method is appended. This method returns a value or set of values that appear most often along a selected axis. The results save to <code>max_crime</code>.</p> <p>These results are output to the terminal.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>0 10851(A)VC TAKE VEH W/O OWNER<br />Name: crimedescr, dtype: object</code></td> </tr> </tbody> </table> </figure> <p>So, out of 7,854 rows of crimes committed on a given day for San Diego, the above offense was committed the highest number of times.</p> <p>The above code only provides us with the name of the most common crime; what if we need the crime name and the respective count?</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2-3" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) max_crime = df['crimedescr'].mode() max_count = df['ucr_ncic_code'].mode() print(max_crime) print(max_count)</pre> <hr class="wp-block-separator has-alpha-channel-opacity"/> <p>The above code is output to the terminal and displays the following.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>0 <strong>10851(A)VC TAKE VEH W/O OWNER</strong><br />Name: crimedescr, dtype: object<br />0 <strong>7000</strong><br />Name: ucr_ncic_code, dtype: int64</code></td> </tr> </tbody> </table> </figure> <p>Now, you are equipped to return to your boss and tell them that 7,000 offenses of <strong>10851 (A) VC TAKE VEH W/O OWNER</strong> occurred on a given day in San Diego.</p> <figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/python-finding-the-most-common-element-in-a-column/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F-JKVy_HliQE%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Method 2: Use value_counts()</h2> <p class="has-global-color-8-background-color has-background">This example uses the <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> function to determine the top 5 most common crimes committed in San Diego on a given day.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) top5_names = df['crimedescr'].value_counts()[:5].index.tolist() print(top5_names)</pre> <p>The above code reads in the <code>crimedescr</code> and <code>ucr_ncic_code </code>columns from the <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">crimes.csv</a> file downloaded earlier. This saves to the DataFrame <code>df</code>.</p> <p>Then, the <code>crimedescr</code> column is accessed, and the <code><a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-mode-method/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-mode-method/" target="_blank"></a><a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank">value_counts()</a></code> function is appended. This function returns a series containing the counts of unique values. </p> <p>However, since <a rel="noreferrer noopener" href="https://blog.finxter.com/introduction-to-slicing-in-python/" data-type="URL" data-id="https://blog.finxter.com/introduction-to-slicing-in-python/" target="_blank">slicing</a> is also appended (<code>[:5]</code>), only the top five (5) common crimes are retrieved and then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a>. The results save to <code>top5_names</code>.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>['10851(A)VC TAKE VEH W/O OWNER', 'TOWED/STORED VEH-14602.6', '459 PC BURGLARY VEHICLE', 'TOWED/STORED VEHICLE', '459 PC BURGLARY RESIDENCE']</code></td> </tr> </tbody> </table> </figure> <p>The above code only provides us with the names of the top 5 most common crimes; what if we need the names and their respective counts?</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) top5 = df['crimedescr'].value_counts()[:5].sort_values(ascending=False) print(top5)</pre> <p>The above output is sent to the terminal.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td>10851(A)VC TAKE VEH W/O OWNER</td> <td>653</td> </tr> <tr> <td>TOWED/STORED VEH-14602.6</td> <td>463</td> </tr> <tr> <td>459 PC BURGLARY VEHICLE</td> <td>462</td> </tr> <tr> <td>TOWED/STORED VEHICLE</td> <td>434</td> </tr> <tr> <td>459 PC BURGLARY RESIDENCE</td> <td>356</td> </tr> <tr> <td>Name: crimedescr, dtype: int64</td> <td></td> </tr> </tbody> </table> </figure> <figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/python-finding-the-most-common-element-in-a-column/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FD2ZueuWXST8%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure> <p>A cleaner way to achieve the same results is to use the following code.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) top5 = df['crimedescr'].value_counts().nlargest(5) print(top5)</pre> <p>The above code calls the <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-nlargest-method/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-nlargest-method/" target="_blank"><code>nlargest()</code></a> method to determine and retrieve the top five (5) common crimes. The output is identical to the above.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td>10851(A)VC TAKE VEH W/O OWNER</td> <td>653</td> </tr> <tr> <td>TOWED/STORED VEH-14602.6</td> <td>463</td> </tr> <tr> <td>459 PC BURGLARY VEHICLE</td> <td>462</td> </tr> <tr> <td>TOWED/STORED VEHICLE</td> <td>434</td> </tr> <tr> <td>459 PC BURGLARY RESIDENCE</td> <td>356</td> </tr> <tr> <td>Name: crimedescr, dtype: int64</td> <td></td> </tr> </tbody> </table> </figure> <p>A much cleaner and more precise output to send to the boss!</p> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Method 3: Use value_counts() and idxmax()</h2> <p class="has-global-color-8-background-color has-background">This example uses <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-methods-idxmax-idxmin-reindex-reindex_like-rename-rename_axis/" target="_blank"><code>idxmax()</code></a> to determine the single most common crime committed in San Diego on a given day.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) max_crime = df['crimedescr'].value_counts().idxmax() print(max_crime)</pre> <p>The above code reads in the <code>crimedescr</code> and <code>ucr_ncic_code </code>columns from the <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">crimes.csv</a> file downloaded earlier. This saves to the DataFrame <code>df</code>.</p> <p>Then, the <code>crimedescr </code>column is accessed, and the <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-mode-method/" data-type="URL" data-id="https://blog.finxter.com/pandas-dataframe-mode-method/" target="_blank"></a><a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" data-type="URL" data-id="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> function is appended. This function returns a series containing the count of unique values.</p> <p>Next, <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-methods-idxmax-idxmin-reindex-reindex_like-rename-rename_axis/" target="_blank"><code>idxmax()</code></a> is appended. This method returns the index of the first occurrence of the maximum index(es) over a selected axis. </p> <p>The results save to <code>max_crime</code> and are output to the terminal.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>10851(A)VC TAKE VEH W/O OWNER</code></td> </tr> </tbody> </table> </figure> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Method 4: Use value_counts() and keys()</h2> <p class="has-global-color-8-background-color has-background">This example uses <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-count-occurrences-of-elements-in-pandas/" target="_blank"><code>value_counts()</code></a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-methods-idxmax-idxmin-reindex-reindex_like-rename-rename_axis/" target="_blank"></a><a rel="noreferrer noopener" href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.keys.html" target="_blank"><code>keys()</code></a> to determine the top 5 most common crimes committed in unique grid areas of San Diego on a given day.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'grid', 'ucr_ncic_code']) top5_grids = df['grid'].value_counts().keys()[:5] print(top5_grids)</pre> <p>The above code reads in the <code>crimedescr</code>, <code>grid</code>, and the <code>ucr_ncic_code </code>columns from the <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">crimes.csv</a> file downloaded earlier. This saves to the DataFrame <code>df</code>.</p> <p>Let’s break the highlighted line down. </p> <p>If <code>df['grid'].value_counts()</code> was output to the terminal, the following would display (snippet). However, we have added a heading row to make it more understandable, and only five (5) rows are displayed.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><strong>Grid #</strong></td> <td><strong>Grid Total</strong></td> </tr> <tr> <td>742</td> <td>115</td> </tr> <tr> <td>969</td> <td>105</td> </tr> <tr> <td>958 </td> <td>100</td> </tr> <tr> <td>564</td> <td>80</td> </tr> <tr> <td>1084</td> <td>71</td> </tr> </tbody> </table> </figure> <p>Next, the code <code>keys()[:5]</code> is appended. The final output displays as follows.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>Int64Index([742, 969, 958, 564, 1084], dtype='int64')</code></td> </tr> </tbody> </table> </figure> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Method 5: Use groupby()</h2> <p class="has-global-color-8-background-color has-background">This examples uses <a rel="noreferrer noopener" href="https://blog.finxter.com/the-pandas-groupby-method/" data-type="URL" data-id="https://blog.finxter.com/the-pandas-groupby-method/" target="_blank"><code>groupby()</code></a> to group our data on the Crime Code and displays the totals in descending order.</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">df = pd.read_csv('crimes.csv', usecols=['crimedescr', 'ucr_ncic_code']) res = (df.groupby(['ucr_ncic_code','crimedescr']).size() .sort_values(ascending=False) .reset_index(name='count')) print(res)</pre> <p>The above code reads in the <code>crimedescr</code> and the <code>ucr_ncic_code </code>columns from the <a rel="noreferrer noopener" href="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" data-type="URL" data-id="https://blog.finxter.com/wp-content/uploads/2022/02/crimes.csv" target="_blank">crimes.csv</a> file downloaded earlier. This saves to the DataFrame <code>df</code>.</p> <p>Next, the <a rel="noreferrer noopener" href="https://blog.finxter.com/the-pandas-groupby-method/" data-type="URL" data-id="https://blog.finxter.com/the-pandas-groupby-method/" target="_blank"><code>groupby()</code></a> function is called and passed the first argument: <code>df.groupby(['ucr_ncic_code','crimedescr']).size()</code>. If this was output to the terminal at this point, the following would display (snippet).</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(df.groupby(['ucr_ncic_code','crimedescr']).size())</pre> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td>ucr_ncic_code</td> <td>crimedescr</td> <td></td> </tr> <tr> <td>909 </td> <td></td> <td>2</td> </tr> <tr> <td>999</td> <td></td> <td>1</td> </tr> <tr> <td>197</td> <td></td> <td>1</td> </tr> <tr> <td>664</td> <td></td> <td>1</td> </tr> <tr> <td>1099</td> <td></td> <td>1</td> </tr> </tbody> </table> </figure> <p>As you can see, the other arguments need to be added to turn this into something usable. Sorting the data in descending order and adding a count column will provide the results we are looking for. </p> <p>If the original Method 5 code example was output to the terminal, the following would display.</p> <figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td></td> <td>ucr_ncic_code</td> <td>crimedescr</td> <td>count</td> </tr> <tr> <td>0</td> <td>2404</td> <td>10851(A)VC TAKE VEH W/O OWNER</td> <td>653</td> </tr> <tr> <td>1</td> <td>7000</td> <td>TOWED/STORED VEH-14602.6</td> <td>463</td> </tr> <tr> <td>2</td> <td>2299</td> <td>459 PC BURGLARY VEHICLE</td> <td>462</td> </tr> <tr> <td>3</td> <td>7000</td> <td>TOWED/STORED VEHICLE</td> <td>434</td> </tr> <tr> <td>4</td> <td>2204</td> <td>459 PC BURGLARY RESIDENCE</td> <td>356</td> </tr> </tbody> </table> </figure> <figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/python-finding-the-most-common-element-in-a-column/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FgUk4hMCuu2g%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Summary</h2> <p>This article has provided five (5) ways to find the most common element in a Panda Column. These examples should provide you with enough information to select the one that best meets your coding requirements.</p> <p>Good Luck & Happy Coding!</p> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Programming Humor – Python</h2> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="518" height="588" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-65.png" alt="" class="wp-image-471102" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-65.png 518w, https://blog.finxter.com/wp-content/uploads/2022/07/image-65-264x300.png 264w" sizes="(max-width: 518px) 100vw, 518px" /><figcaption><em>“I wrote 20 short programs in Python yesterday. It was wonderful. Perl, I’m leaving you.”</em> — <a rel="noreferrer noopener" href="https://imgs.xkcd.com/comics/python.png" data-type="URL" data-id="https://imgs.xkcd.com/comics/python.png" target="_blank">xkcd</a></figcaption></figure> </div> </div> https://www.sickgaming.net/blog/2022/09/05/python-finding-the-most-common-element-in-a-column/ |