[Tut] 5 Best Ways to Check a List for Duplicates in Python - xSicKxBot - 05-23-2022
5 Best Ways to Check a List for Duplicates in Python
<div>
<h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">In this article, you’ll learn how to check a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener">List</a> for Duplicates in Python.</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">To make it more fun, we have the following running scenario:</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio"><em>The <a rel="noreferrer noopener" href="https://academy.finxter.com/" data-type="URL" data-id="https://academy.finxter.com/" target="_blank">Finxter Academy</a> has given you an extensive list of usernames. Somewhere along the line, duplicate entries were added. They need you to check if their <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> contains duplicates. For testing purposes, a small sampling of this <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> is used.</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><em><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4ac.png" alt="💬"
class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would we write Python code to check a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> for duplicate elements?</em></p> <p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this task by one of the following options:</p> <ul type="video" class="wp-embed-aspect-16-9 wp-has-aspect-ratio"> <li><strong>Method 1</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> to return a <strong>Duplicate-Free</strong> <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a></li> <li><strong>Method 2</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>For</code></a> loop and <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> to return a List of <strong>Duplicates</strong> found.</li> <li><strong>Method 3</strong>: Use a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>For</code></a> loop to return 
<strong>Duplicates</strong> and <strong>Counts</strong></li>
<li><strong>Method 4</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/" target="_blank"><code>any()</code></a> to check for <strong>Duplicates</strong> and return a Boolean</li>
<li><strong>Method 5</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> to return a List of all <strong>Duplicates</strong></li>
</ul>
<hr class="wp-block-separator wp-embed-aspect-16-9 wp-has-aspect-ratio"/>
<h2>Method 1: Use set() and List to return a Duplicate-Free List</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, which removes any duplicate values (<code>set(users)</code>) to produce a <strong>Duplicate-Free</strong> <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>. This set is then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> (<code>list(set(users))</code>).
</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']
dup_free = list(set(users))  # set() drops repeats; list() makes the result indexable
print(dup_free)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <code>users</code> is passed as an argument to <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, which discards the repeated usernames. Then, the new set is converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> and saved to <code>dup_free</code>.</p>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"> <div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Python set() Function — A Simple Guide" width="780" height="439" src="https://www.youtube.com/embed/fZJsKQPlzRg?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> </div> </figure>
<p>If the result of <code>set(users)</code> were output to the terminal before converting it to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a>, the result would be a <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, which is <em>not subscriptable</em>: its elements can be iterated over, but they cannot be accessed by index.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>{'csealker', 'cdriver', 'shoeguy', 'ollie3', 'kyliek', 'stewieboy', 'AmyP'}</code></td> </tr> </tbody> </table> </figure>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Note:</strong> Any attempt to access an element of a set by index (for example, <code>set(users)[0]</code>) raises <code>TypeError: 'set' object is not subscriptable</code>. A set is also unordered, so the order of the usernames may differ from the output shown here.</p>
<p>In this example, the <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> was converted to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a>, so a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a> of <strong>Duplicate-Free</strong> values is displayed. Comparing <code>len(dup_free)</code> with <code>len(users)</code> shows whether duplicates were present: here the lengths are 7 and 10, so <code>users</code> contains duplicates.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>['csealker', 'cdriver', 'shoeguy', 'ollie3', 'kyliek', 'stewieboy', 'AmyP']</code></td> </tr> </tbody> </table> </figure>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡"
class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Note:</strong> Calling <code>set()</code> with no argument returns an <em>empty</em> set.</p>
<hr class="wp-block-separator"/>
<h2>Method 2: Use set(), For loop, and List to return a List of Duplicates Found</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> and a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop, written as a generator expression, to check for and return any <strong>Duplicates</strong> found (<code>set(x for x in users if (x in tmp or tmp.add(x)))</code>) to <code>dups</code>. The <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> is then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> and printed (<code>print(list(dups))</code>).</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="4-6" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']
tmp = set()  # usernames seen so far
dups = set(x for x in users if (x in tmp or tmp.add(x)))  # add() returns None, so only repeats pass
print(list(dups))</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, a new empty set, <code>tmp</code>, is declared. A <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop then checks each element in <code>users</code>. A username seen for the first time is not yet in <code>tmp</code>, so <code>tmp.add(x)</code> runs and adds it; because <code>add()</code> returns <code>None</code>, the condition is false and the username is skipped. A username that is already in <code>tmp</code> is a <strong>duplicate</strong> and is kept. The results are saved to <code>dups</code> as a <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>.</p>
<p><strong>Output</strong></p>
<p>In this example, the <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> was converted to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a> and displays a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a> of the duplicate values found in the original <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a>, <code>users</code>. Because a set is unordered, the order of the usernames may differ.</p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>['kyliek', 'ollie3', 'shoeguy']</code></td> </tr> </tbody> </table> </figure>
<hr class="wp-block-separator"/>
<h2>Method 3: Use a <code>For</code> loop to return <strong>Duplicates</strong> and <strong>Counts</strong></h2>
<p class="has-global-color-8-background-color has-background">This method uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop to navigate through and check each element of <code>users</code> while
keeping track of all usernames and the number of times they appear. A <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank">Dictionary</a> of <strong>Duplicates</strong>, including the <strong>Usernames</strong> and <strong>Counts</strong>, is returned.</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']
count = {}      # every username and the number of times seen so far
dup_count = {}  # only usernames seen more than once, with their counts

for i in users:
    if i not in count:
        count[i] = 1
    else:
        count[i] += 1
        dup_count[i] = count[i]

print(dup_count)</pre>
<p>This code declares the same sampling of usernames, <code>users</code>, and two (2) empty Dictionaries, <code>count</code> and <code>dup_count</code> respectively.</p>
<p>A <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop is instantiated to loop through each element of <code>users</code> and does the following:</p>
<ul>
<li>If the element <code>i</code> is not in <code>count</code>, then its entry in <code>count</code> is set to one (1) (<code>count[i]=1</code>).</li>
<li>If the element <code>i</code> is found in <code>count</code>, it falls to <code>else</code>, where one (1) is added to its entry in <code>count</code> (<code>count[i]+=1</code>) and the updated total is stored in <code>dup_count</code> (<code>dup_count[i]=count[i]</code>).</li>
</ul>
<p>This code repeats until the end of <code>users</code> has been reached.</p>
<p>At this point, a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank">Dictionary</a> containing the <strong>Duplicates</strong> and the number of times they appear is displayed.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>{'ollie3': 2, 'shoeguy': 2, 'kyliek': 2}</code></td> </tr> </tbody> </table> </figure>
<hr class="wp-block-separator"/>
<h2>Method 4: Use any() to Check for Duplicate Values</h2>
<p class="has-global-color-8-background-color has-background">This example uses <code><a href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/" target="_blank" rel="noreferrer noopener">any()</a></code> and iterates over the <a href="https://blog.finxter.com/iterators-iterables-and-itertools/" data-type="post" data-id="29507" target="_blank" rel="noreferrer noopener">iterable</a> <code>users</code> to locate <strong>Duplicates</strong>. If one is found, <code>True</code> is returned. Otherwise, <code>False</code> is returned. Best used on small <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">Lists</a>, because <code>users.count(x)</code> scans the entire List for every element it checks.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']
dups = any(users.count(x) > 1 for x in users)  # True once any username occurs more than once
print(dups)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <a href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/"><code>any()</code></a> is called and loops through each element of <code>users</code>, checking whether the element appears more than once and stopping at the first <strong>duplicate</strong> it finds. If found, <code>True</code> is assigned.
Otherwise, <code>False</code> is assigned. The result is saved to <code>dups</code> and the output displays as follows:</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>True</code></td> </tr> </tbody> </table> </figure>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio"> <div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Python Built-in Functions - all() and any()" width="780" height="585" src="https://www.youtube.com/embed/F6W3-N1-NtE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> </div> </figure>
<hr class="wp-block-separator"/>
<h2>Method 5: Use List Comprehension to return a <strong>List of all Duplicates</strong></h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> to loop through <code>users</code>, checking for duplicates. Every username that appears at least twice is appended to <code>dups</code>.
</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']
dups = [x for x in users if users.count(x) >= 2]  # keeps every occurrence of a repeated username
print(dups)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> extracts the <strong>duplicate</strong> usernames and saves them to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener">List</a>. Because every occurrence is kept, each <strong>duplicate</strong> username appears as many times as it does in <code>users</code>. The <strong>duplicate</strong> values are output to the terminal.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes"> <table> <tbody> <tr> <td><code>['ollie3', 'shoeguy', 'kyliek', 'ollie3', 'shoeguy', 'kyliek']</code></td> </tr> </tbody> </table> </figure>
<hr class="wp-block-separator"/>
<h2>Summary</h2>
<p>These five (5) methods of checking a List for Duplicates should give you enough information to select the best one for your coding requirements.</p>
<p>Good Luck &amp; Happy Coding!</p>
<hr class="wp-block-separator"/>
</div>
https://www.sickgaming.net/blog/2022/05/17/5-best-ways-to-check-a-list-for-duplicates-in-python/
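<p>As a supplementary sketch, not one of the five methods above: the counting done by hand in Method 3 and the Boolean check in Method 4 can probably also be written with the standard library's <code>collections.Counter</code> and a length comparison. This assumes the same <code>users</code> sample; the names <code>has_dups</code> and <code>dup_count</code> are chosen here for illustration only.</p>

```python
from collections import Counter

users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek',
         'ollie3', 'stewieboy', 'csealker',
         'shoeguy', 'cdriver', 'kyliek']

# Boolean check: a set drops repeats, so a shorter set means duplicates exist.
has_dups = len(set(users)) != len(users)

# Count every username in a single pass, then keep those seen more than once.
counts = Counter(users)
dup_count = {name: n for name, n in counts.items() if n > 1}

print(has_dups)   # True
print(dup_count)  # {'ollie3': 2, 'shoeguy': 2, 'kyliek': 2}
```

<p>Both checks pass over <code>users</code> once, so they avoid the repeated <code>users.count(x)</code> scans of Methods 4 and 5, which may matter on larger Lists.</p>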