[Tut] 5 Best Ways to Check a List for Duplicates in Python


<div>
<h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">In this article, you’ll learn how to check a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener">List</a> for Duplicates in Python.</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">To make it more fun, we have the following running scenario:</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio"><em>The <a rel="noreferrer noopener" href="https://academy.finxter.com/" data-type="URL" data-id="https://academy.finxter.com/" target="_blank">Finxter Academy</a> has given you an extensive list of usernames. Somewhere along the line, duplicate entries were added. They need you to check if their <em><a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> </em>contains duplicates. For testing purposes, a small sampling of this <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> is used.</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><em><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4ac.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would we write Python code to check a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> for duplicate elements?</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this task by one of the following options:</p>
<ul type="video" class="wp-embed-aspect-16-9 wp-has-aspect-ratio">
<li><strong>Method 1</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> to return a <strong>Duplicate-Free</strong> <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a></li>
<li><strong>Method 2</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>For</code></a> loop and <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> to return a List of <strong>Duplicates</strong> found.</li>
<li><strong>Method 3</strong>: Use a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>For</code></a> loop to return <strong>Duplicates</strong> and <strong>Counts</strong></li>
<li><strong>Method 4</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/" target="_blank"><code>any()</code></a> to check for <strong>Duplicates</strong> and return a Boolean</li>
<li><strong>Method 5</strong>: Use <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> to return a List of all <strong>Duplicates </strong></li>
</ul>
<hr class="wp-block-separator wp-embed-aspect-16-9 wp-has-aspect-ratio"/>
<h2>Method 1: Use set() and List to return a Duplicate-Free List</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, which removes any duplicate values (<code>set(users)</code>), to produce a <strong>Duplicate-Free</strong> <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank">set</a>. This set is then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">List</a> (<code>list(set(users))</code>).</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

dup_free = list(set(users))
print(dup_free)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <code>users</code> is passed as an argument to <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>. The resulting set is then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> and saved to <code>dup_free</code>.</p>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Python set() Function — A Simple Guide" width="780" height="439" src="https://www.youtube.com/embed/fZJsKQPlzRg?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
</figure>
<p>If the result of <code>set(users)</code> were output to the terminal before converting it to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a>, the result would be a <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>, which is <em>not subscriptable</em>. This means the elements cannot be accessed by index in this format, although you can still iterate over them. A set is also unordered, so the usernames may appear in a different order each time the code runs.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>{'csealker', 'cdriver', 'shoeguy', 'ollie3', 'kyliek', 'stewieboy', 'AmyP'}</code></td>
</tr>
</tbody>
</table>
</figure>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Note:</strong> Any attempt to access a set element by index, such as <code>set(users)[0]</code>, raises a <code>TypeError</code> stating that the set object is <em>not subscriptable</em>.</p>
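<p>To make the note above concrete, here is a minimal sketch of what happens when a set is indexed. The names <code>unique_users</code> and <code>index_error</code> are illustrative and not part of the original example.</p>

```python
users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

unique_users = set(users)

# Indexing a set raises TypeError because sets have no positions.
try:
    unique_users[0]
    index_error = None
except TypeError as exc:
    index_error = str(exc)

print(index_error)

# Membership tests and iteration still work on the set itself.
print('AmyP' in unique_users)

# sorted() returns a List, which gives a repeatable order.
print(sorted(unique_users))
```

<p>Sorting is only one way to get a stable order. Converting with <code>list()</code> works as well, but the order of the result is not guaranteed.</p>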
<p>In this example, the <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> was converted to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a>, so the output is a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a> of <strong>Duplicate-Free</strong> values. The conversion does not restore the original order of <code>users</code>.</p>
<p><strong>Output </strong></p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>['csealker', 'cdriver', 'shoeguy', 'ollie3', 'kyliek', 'stewieboy', 'AmyP']</code></td>
</tr>
</tbody>
</table>
</figure>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Note:</strong> Calling <code>set()</code> with no argument returns an <em>empty</em> set.</p>
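<p>Method 1 removes duplicates but does not say whether any existed, and it loses the original order. The sketch below shows one possible way to cover both points: comparing lengths gives a Boolean, and <code>dict.fromkeys()</code> keeps the first occurrence of each username in order. The names <code>has_duplicates</code> and <code>dup_free_ordered</code> are illustrative assumptions.</p>

```python
users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

# A set drops repeats, so a shorter set means the List had duplicates.
has_duplicates = len(set(users)) != len(users)
print(has_duplicates)

# dict.fromkeys() keeps the first occurrence of each key in insertion
# order (guaranteed for dictionaries since Python 3.7).
dup_free_ordered = list(dict.fromkeys(users))
print(dup_free_ordered)
```

<p>Both approaches require the elements to be hashable, which is the case for strings such as usernames.</p>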
<hr class="wp-block-separator"/>
<h2>Method 2: Use set(), For loop, and List to return a List of Duplicates Found</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> and a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop, written as a generator expression, to check for any <strong>Duplicates</strong> and save them to <code>dups</code> (<code>set(x for x in users if (x in tmp or tmp.add(x)))</code>). The <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> is then converted to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank"><code>List</code></a> and printed (<code>print(list(dups))</code>).</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="4-6" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

tmp = set()
dups = set(x for x in users if (x in tmp or tmp.add(x)))
print(list(dups))</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, a new empty set, <code>tmp</code>, is declared. A <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop inside a generator expression then checks each element in <code>users</code>. If the element is already in <code>tmp</code>, it is a <strong>duplicate</strong> and is kept. Otherwise, <code>tmp.add(x)</code> adds it to <code>tmp</code> and returns <code>None</code>, which is falsy, so a first occurrence is filtered out. The results are saved to <code>dups</code> as a <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a>.</p>
<p><strong>Output</strong></p>
<p>In this example, the <a rel="noreferrer noopener" href="https://blog.finxter.com/sets-in-python/" data-type="URL" data-id="https://blog.finxter.com/sets-in-python/" target="_blank"><code>set()</code></a> was converted to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a>, which displays the duplicate values found in the original <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/"><code>List</code></a>, <code>users</code>. Because sets are unordered, the order of the usernames may vary.</p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>['kyliek', 'ollie3', 'shoeguy']</code></td>
</tr>
</tbody>
</table>
</figure>
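<p>The one-line generator expression in Method 2 relies on <code>tmp.add(x)</code> returning <code>None</code>, which can be hard to read. As a rough equivalent, the same logic can be sketched as an explicit loop. The name <code>seen</code> is an illustrative stand-in for <code>tmp</code>.</p>

```python
users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

seen = set()  # usernames encountered so far
dups = set()  # usernames encountered more than once

for name in users:
    if name in seen:
        dups.add(name)   # second or later occurrence
    else:
        seen.add(name)   # first occurrence

# sorted() is used here only to make the printed order repeatable.
print(sorted(dups))
```

<p>Each membership test on a set takes roughly constant time on average, so this loop visits every username only once.</p>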
<hr class="wp-block-separator"/>
<h2>Method 3: Use a <code>For</code> loop to return <strong>Duplicates</strong> and <strong>Counts</strong></h2>
<p class="has-global-color-8-background-color has-background">This method uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop to step through each element of <code>users</code> while keeping track of every username and the number of times it appears. The result is a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank">Dictionary</a> of <strong>Duplicates</strong> that maps each duplicated <strong>Username</strong> to its <strong>Count</strong>.</p>
<p>Here’s an example:</p>
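<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3', 'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']
count = {}      # occurrences of every username seen so far
dup_count = {}  # occurrences of usernames seen more than once
for i in users:
    if i not in count:
        count[i] = 1
    else:
        count[i] += 1
        dup_count[i] = count[i]
print(dup_count)</pre>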
<p>This code works on the same <code>users</code> List as the earlier examples and declares two (2) empty Dictionaries, <code>count</code> and <code>dup_count</code> respectively.</p>
<p>A <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>For</code></a> loop is instantiated to loop through each element of <code>users</code> and does the following:</p>
<ul>
<li>If the element <code>i</code> is not yet in <code>count</code>, its entry is created and set to one (1) (<code>count[i] = 1</code>).</li>
<li>If the element <code>i</code> is already in <code>count</code>, execution falls to <code>else</code>, where one (1) is added to its entry (<code>count[i] += 1</code>) and the updated total is stored in <code>dup_count</code> (<code>dup_count[i] = count[i]</code>).</li>
</ul>
<p>This code repeats until the end of <code>users</code> has been reached.</p>
<p>At this point, a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank">Dictionary</a> containing the <strong>Duplicates</strong> and the number of times each appears is displayed.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>{'ollie3': 2, 'shoeguy': 2, 'kyliek': 2}</code></td>
</tr>
</tbody>
</table>
</figure>
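<p>The counting loop in Method 3 can also be expressed with <code>collections.Counter</code> from the standard library. This is an alternative sketch, not the approach used above, and the name <code>counts</code> is an illustrative assumption.</p>

```python
from collections import Counter

users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

# Counter tallies how many times each username occurs.
counts = Counter(users)

# Keep only the usernames that occur more than once.
dup_count = {name: n for name, n in counts.items() if n > 1}
print(dup_count)
```

<p>For this sample, the resulting Dictionary holds the same usernames and counts as the loop version.</p>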
<hr class="wp-block-separator"/>
<h2>Method 4: Use any() to Check for Duplicate Values</h2>
<p class="has-global-color-8-background-color has-background">This example passes a generator expression to <code><a href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/" target="_blank" rel="noreferrer noopener">any()</a></code>, which walks the <a href="https://blog.finxter.com/iterators-iterables-and-itertools/" data-type="post" data-id="29507" target="_blank" rel="noreferrer noopener">iterable</a> <code>users</code> and checks each element with <code>users.count(x)</code> to locate <strong>Duplicates</strong>. If one is found, <code>True</code> returns. Otherwise, <code>False</code> returns. Because <code>count()</code> rescans the whole List for every element, the running time grows quadratically with the length of the List, so this method is best used on small <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank">Lists</a>.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

dups = any(users.count(x) > 1 for x in users)
print(dups)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <a href="https://blog.finxter.com/python-any-function/" data-type="URL" data-id="https://blog.finxter.com/python-any-function/"><code>any()</code></a> is called with a generator expression that loops through each element of <code>users</code> and checks whether it appears more than once. As soon as a <strong>duplicate</strong> is found, <code>True</code> is returned and the loop stops. Otherwise, <code>False</code> is returned. The result is saved to <code>dups</code>, and the output displays as follows:</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>True</code></td>
</tr>
</tbody>
</table>
</figure>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Python Built-in Functions - all() and any()" width="780" height="585" src="https://www.youtube.com/embed/F6W3-N1-NtE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
</figure>
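<p>For larger Lists, the repeated <code>count()</code> calls in Method 4 can become slow. One possible alternative, sketched below, tracks the usernames seen so far in a set and stops at the first repeat. The function name <code>has_duplicates</code> is hypothetical, and the sketch assumes the elements are hashable.</p>

```python
def has_duplicates(items):
    """Return True as soon as a repeated item is found (hypothetical helper)."""
    seen = set()
    for item in items:
        if item in seen:
            return True   # early exit on the first repeat
        seen.add(item)
    return False


users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

print(has_duplicates(users))
print(has_duplicates(['AmyP', 'cdriver']))
```

<p>The trade-off is extra memory for the <code>seen</code> set in exchange for a single pass over the List.</p>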
<hr class="wp-block-separator"/>
<h2>Method 5: Use List Comprehension to return a <strong>List of all Duplicates</strong></h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> to loop through <code>users</code> and keep every element that appears at least twice. The matching <strong>Duplicates</strong> are saved to <code>dups</code>.</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

dups = [x for x in users if users.count(x) >= 2]
print(dups)</pre>
<p>This code declares a small sampling of <a rel="noreferrer noopener" href="https://finxter.com/" data-type="URL" data-id="https://finxter.com/" target="_blank">Finxter</a> usernames and saves them to <code>users</code>.</p>
<p>Next, <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> extracts the <strong>duplicate</strong> usernames and saves them to a <a href="https://blog.finxter.com/python-lists/" data-type="URL" data-id="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener">List</a>. Each duplicated username is included once for every time it occurs in <code>users</code>, so the resulting List itself contains repeats. The <strong>duplicate</strong> values are then output to the terminal.</p>
<p><strong>Output</strong></p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><code>['ollie3', 'shoeguy', 'kyliek', 'ollie3', 'shoeguy', 'kyliek']</code></td>
</tr>
</tbody>
</table>
</figure>
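<p>If each duplicated username should be listed only once, the comprehension can iterate over the unique usernames instead. The following is a sketch under that assumption. It uses <code>collections.Counter</code> for the counts and <code>dict.fromkeys()</code> to keep first-seen order, and the name <code>dups_once</code> is illustrative.</p>

```python
from collections import Counter

users = ['AmyP', 'ollie3', 'shoeguy', 'kyliek', 'ollie3',
         'stewieboy', 'csealker', 'shoeguy', 'cdriver', 'kyliek']

# Count every username once up front instead of calling count() per element.
counts = Counter(users)

# dict.fromkeys(users) yields each username once, in first-seen order.
dups_once = [name for name in dict.fromkeys(users) if counts[name] >= 2]
print(dups_once)
```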
<hr class="wp-block-separator"/>
<h2>Summary</h2>
<p>These five (5) methods of checking a List for Duplicates should give you enough information to select the best one for your coding requirements.</p>
<p>Good Luck &amp; Happy Coding!</p>
<hr class="wp-block-separator"/>
</div>


https://www.sickgaming.net/blog/2022/05/...in-python/