{"id":111674,"date":"2020-04-17T12:43:57","date_gmt":"2020-04-17T12:43:57","guid":{"rendered":"https:\/\/blog.finxter.com\/?p=7622"},"modified":"2020-04-17T12:43:57","modified_gmt":"2020-04-17T12:43:57","slug":"python-lists-filter-vs-list-comprehension-which-is-faster","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2020\/04\/17\/python-lists-filter-vs-list-comprehension-which-is-faster\/","title":{"rendered":"Python Lists filter() vs List Comprehension \u2013 Which is Faster?"},"content":{"rendered":"<p><strong>[Spoiler] Which filters a list faster: <code>filter()<\/code> or a list comprehension? For large lists with one million elements, filtering with a list comprehension is 40% faster than the built-in <code>filter()<\/code> function.<\/strong><\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\">\n<div class=\"wp-block-embed__wrapper\">\n<div class=\"ast-oembed-container\"><iframe loading=\"lazy\" title=\"How to Filter a List in Python?\" width=\"1400\" height=\"788\" src=\"https:\/\/www.youtube.com\/embed\/3nG4TLkqzf8?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/div>\n<\/div>\n<\/figure>\n<p>To answer this question, I&#8217;ve written a short script that measures the runtime of filtering lists of increasing size with <code>filter()<\/code> and with a list comprehension. 
<\/p>\n<p>My hypothesis is that the list comprehension should be slightly faster for larger lists because it uses the efficient CPython implementation of list comprehensions and doesn&#8217;t need to call an extra function (the predicate passed to <code>filter()<\/code>) for every element.<\/p>\n<p><strong>Related Article:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/blog.finxter.com\/how-to-filter-a-list-in-python\/\" target=\"_blank\" rel=\"noreferrer noopener\">How to Filter a List in Python?<\/a><\/li>\n<\/ul>\n<p>I used my notebook with an Intel(R) Core(TM) i7-8565U 1.8 GHz processor (with Turbo Boost up to 4.6 GHz) and 8 GB of RAM.<\/p>\n<p>Try It Yourself:<\/p>\n<p> <iframe loading=\"lazy\" height=\"600px\" width=\"100%\" src=\"https:\/\/repl.it\/@finxter\/filtervslistcomp?lite=true\" scrolling=\"no\" frameborder=\"no\" allowtransparency=\"true\" allowfullscreen=\"true\" sandbox=\"allow-forms allow-pointer-lock allow-popups allow-same-origin allow-scripts allow-modals\"><\/iframe> <\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import time\n\n\n# Compare runtime of both methods\nlist_sizes = [i * 10000 for i in range(100)]\nfilter_runtimes = []\nlist_comp_runtimes = []\n\nfor size in list_sizes:\n\n    lst = list(range(size))\n\n    # Get time stamps\n    time_0 = time.time()\n    list(filter(lambda x: x%2, lst))\n    time_1 = time.time()\n    [x for x in lst if x%2]\n    time_2 = time.time()\n\n    # Calculate runtimes\n    filter_runtimes.append((size, time_1 - time_0))\n    list_comp_runtimes.append((size, time_2 - time_1))\n\n\n# Plot everything\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nf_r = np.array(filter_runtimes)\nl_r = np.array(list_comp_runtimes)\n\nprint(filter_runtimes)\nprint(list_comp_runtimes)\n\nplt.plot(f_r[:,0], f_r[:,1], label='filter()')\nplt.plot(l_r[:,0], l_r[:,1], label='list comprehension')\n\nplt.xlabel('list size')\nplt.ylabel('runtime (seconds)')
plt.legend()\nplt.savefig('filter_list_comp.jpg')\nplt.show()\n<\/pre>\n<p>The code compares the runtime of the <code>filter()<\/code> function with that of the equivalent list comprehension. Note that the <code>filter()<\/code> function returns a filter object, so you need to convert it to a list using the <code>list()<\/code> constructor.<\/p>\n<p>Here&#8217;s the resulting plot that compares the runtime of the two methods. On the x-axis, you can see the list size from 0 to 1,000,000 elements. On the y-axis, you can see the runtime in seconds needed to filter the list with each method.<\/p>\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp.jpg\" alt=\"\" class=\"wp-image-7619\" srcset=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp.jpg 640w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp-300x225.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/figure>\n<p>The resulting plot shows that both methods are extremely fast for a few tens of thousands of elements. In fact, they are so fast that the resolution of the <code>time()<\/code> function of the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/time.html#time.time\" target=\"_blank\">time module<\/a> is too coarse to capture the elapsed time.<\/p>\n<p>But as you increase the size of the lists to hundreds of thousands of elements, the list comprehension starts to win:<\/p>\n<p><strong>For large lists with one million elements, filtering with a list comprehension is 40% faster than the built-in <code>filter()<\/code> function.<\/strong><\/p>\n<p>The reason is the efficient implementation of list comprehensions. There is one interesting observation, though. 
If you don&#8217;t convert the result of <code>filter()<\/code> to a list, you get the following result:<\/p>\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp-1.jpg\" alt=\"\" class=\"wp-image-7620\" srcset=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp-1.jpg 640w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2020\/04\/filter_list_comp-1-300x225.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/figure>\n<p>Suddenly the <code>filter()<\/code> function has a constant runtime of close to 0 seconds, no matter how many elements are in the list. Why is this happening?<\/p>\n<p>The explanation is simple: the <code>filter()<\/code> function returns an iterator, not a list. The iterator doesn&#8217;t compute a single element until it is asked for the <code>next()<\/code> one, so each element is computed only when it is actually requested. Only when you convert the iterator to a list must it compute all values. Otherwise, it doesn&#8217;t compute a single value beforehand.<\/p>\n<h2>Where to Go From Here<\/h2>\n<p><strong>This tutorial has shown you the <code>filter()<\/code> function in Python and compared it against the list comprehension way of filtering: <code>[x for x in lst if condition]<\/code>. You&#8217;ve seen that the latter is not only more readable and more Pythonic, but also faster. 
So take the list comprehension approach to filter lists!<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>[Spoiler] Which function filters a list faster: filter() vs list comprehension? For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method. 
To answer this question, I&#8217;ve written a short script that tests the runtime performance of filtering large lists of increasing sizes using the filter() [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[857],"tags":[73,468,528],"class_list":["post-111674","post","type-post","status-publish","format-standard","hentry","category-python-tut","tag-programming","tag-python","tag-tutorial"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/111674","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=111674"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/111674\/revisions"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=111674"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=111674"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=111674"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}