[Tut] Your Python Regex Pattern Doesn’t Match? Try This!


<div>
<h2>Problem Formulation</h2>
<p>Say you want to find a <a href="https://blog.finxter.com/python-regex/" data-type="post" data-id="6210" target="_blank" rel="noreferrer noopener">regex</a> pattern in a given string. You know the pattern exists in the string. You use the <code>re.match(pattern, string)</code> function to get the <a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex-match/" data-type="post" data-id="5759" target="_blank">match object</a> for the place where the pattern matches in the string.</p>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4ac.png" alt="💬" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Problem</strong>: The Python regular expression pattern is not found in the string. The pattern doesn’t match anything, so the match object is <code>None</code>. How do you fix this?</p>
<p>Here’s an example in which you’re searching for the pattern <code>'h[a-z]+'</code> which should match the substring <code>'hello'</code>. </p>
<p>But it doesn’t match! <img src="https://s.w.org/images/core/emoji/13.1.0/72x72/26a1.png" alt="⚡" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

my_string = 'say hello world'
pattern = re.compile('h[a-z]+')

# Try to find the pattern in the string
match = re.match(pattern, my_string)

if match:
    print('found!')
else:
    print('not found!')</pre>
<p>Output:</p>
<pre class="wp-block-preformatted"><code>not found!</code></pre>
<p>Where is the bug? And how to fix it, so that the pattern matches the substring <code>'hello'</code>?</p>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Learn More</strong>: Improve your regex superpower by studying <em><strong>character classes</strong></em> used in the example pattern <code>'h[a-z]+'</code> by visiting <a rel="noreferrer noopener" href="https://blog.finxter.com/python-character-set-regex-tutorial/" data-type="post" data-id="6208" target="_blank">this tutorial on the Finxter blog</a>.</p>
<h2>Solution: Use re.search() instead of re.match()</h2>
<p class="has-global-color-8-background-color has-background">A common reason why your Python regular expression pattern is not matching in a given string is that you mistakenly used <code>re.match(pattern, string)</code> instead of <code>re.search(pattern, string)</code> or <code>re.findall(pattern, string)</code>. The former attempts to match the <code>pattern</code> at the beginning of the <code>string</code>, whereas the latter two functions attempt to match anywhere in the string.</p>
<p>Here’s a quick recap of the three regex functions:</p>
<ul>
<li><code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex-match/" data-type="post" data-id="5759" target="_blank">re.match(pattern, string)</a></code> returns a match object if the <code>pattern</code> matches <em><strong>at the beginning</strong></em> of the <code>string</code>. The match object contains useful information such as the matching groups and the matching positions.</li>
<li><code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex-search/" data-type="post" data-id="5749" target="_blank">re.search(pattern, string)</a></code> matches the first occurrence of the <code>pattern</code> in the <code>string</code> and returns a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-one-line-regex-match/" target="_blank">match </a>object.</li>
<li><code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-re-findall/" data-type="post" data-id="5729" target="_blank">re.findall(pattern, string)</a></code> scans <code>string</code> from left to right, searching for all non-overlapping matches of the <code>pattern</code>. It returns a list of strings in the matching order when scanning the string from left to right.</li>
</ul>
<p>Thus, the following code uses re.search() to fix our problem:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

my_string = 'say hello world'
pattern = re.compile('h[a-z]+')

# re.search() scans the whole string for the first match
match = re.search(pattern, my_string)

if match:
    print('found!')
else:
    print('not found!')</pre>
<p>Output:</p>
<pre class="wp-block-preformatted"><code>found!</code></pre>
<p>Now the pattern <code>'h[a-z]+'</code> matches the substring <code>'hello'</code>.</p>
<p>Note that you can also use the <code>re.findall()</code> function if you’re interested in just the string matches of your pattern (without a match object). We’ll explain <code>re.match()</code>, <code>re.search()</code>, <code>re.findall()</code>, and match objects in a moment, but first, let’s have a look at the same example with <code>re.findall()</code>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

my_string = 'say hello world'
pattern = re.compile('h[a-z]+')

# re.findall() returns all matches as a list of strings
match = re.findall(pattern, my_string)
print(match)
# ['hello']

if match:
    print('found!')
else:
    print('not found!')</pre>
<p>Output:</p>
<pre class="wp-block-preformatted"><code>['hello']
found!</code></pre>
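<p>To see the difference between the three functions side by side, here is a minimal sketch that runs all of them on one string. The sample string <code>'say hello world'</code> and the variable names are assumptions chosen for illustration; the only requirement is that the match does not start at position 0.</p>

```python
import re

# Assumed sample string: the match does not start at position 0.
text = 'say hello world'
pattern = 'h[a-z]+'

at_start = re.match(pattern, text)     # only tries to match at position 0
anywhere = re.search(pattern, text)    # first match at any position
all_hits = re.findall(pattern, text)   # all non-overlapping matches as strings

print(at_start)          # None
print(anywhere.group())  # hello
print(all_hits)          # ['hello']
```

<p>If <code>re.match()</code> returns <code>None</code> although you can see the substring in your text, switching to <code>re.search()</code> is usually the first thing to try.</p>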
<h2>Understanding re.match()</h2>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Python Regex Match: A Complete Guide to re.match()" width="780" height="439" src="https://www.youtube.com/embed/5d3vQ8N0MJg?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
</figure>
<p class="has-pale-cyan-blue-background-color has-background">The <code>re.match(pattern, string)</code> method returns a match object if the <code>pattern</code> matches <em><strong>at the beginning</strong></em> of the <code>string</code>. The match object contains useful information such as the matching groups and the matching positions. An optional argument <code>flags</code> allows you to customize the regex engine, for example to ignore capitalization.</p>
<p><strong>Specification</strong>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">re.match(pattern, string, flags=0)</pre>
<p>The <code>re.match()</code> method has up to three arguments.</p>
<ul>
<li><strong><code>pattern</code></strong>: the regular expression pattern that you want to match.</li>
<li><strong><code>string</code></strong>: the string which you want to search for the pattern.</li>
<li><strong><code>flags</code> </strong>(optional argument): a more advanced modifier that allows you to customize the behavior of the function. Want to know <a href="https://blog.finxter.com/python-regex-flags/">how to use those flags? Check out this detailed article</a> on the Finxter blog.</li>
</ul>
<p>We’ll explore them in more detail later. </p>
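<p>As a rough illustration of the optional <code>flags</code> argument, the following sketch uses <code>re.IGNORECASE</code> to ignore capitalization. The sample string is an assumption made for this example.</p>

```python
import re

text = 'Hello world'  # assumed sample string starting with a capital letter

# Without flags, the lowercase pattern does not match the capital 'H'.
strict = re.match('h[a-z]+', text)

# With re.IGNORECASE, both the literal 'h' and the class [a-z] ignore case.
relaxed = re.match('h[a-z]+', text, flags=re.IGNORECASE)

print(strict)           # None
print(relaxed.group())  # Hello
```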
<p><strong>Return Value:</strong></p>
<p>The <code>re.match()</code> method returns a match object if the <code>pattern</code> matches at the beginning of the <code>string</code>, and <code>None</code> otherwise. You may ask (and rightly so): what is a match object?</p>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Learn More</strong>: Understanding <code><a href="https://blog.finxter.com/python-regex-match/" data-type="post" data-id="5759" target="_blank" rel="noreferrer noopener">re.match()</a></code> on the Finxter blog.</p>
<h3>What’s a Match Object?</h3>
<p>If a regular expression matches a part of your string, there’s a lot of useful information that comes with it: what’s the exact position of the match? Which regex groups were matched—and where? </p>
<p>The <a href="https://docs.python.org/3/library/re.html#match-objects" target="_blank" rel="noreferrer noopener">match object</a> is a simple wrapper for this information. Some functions of Python’s <code>re</code> module, such as <code>search()</code>, return a match object for the first pattern match.</p>
<p>At this point, you don’t need to explore the match object in detail. Just know that we can access the start and end positions of the match in the string by calling the methods <code>m.start()</code> and <code>m.end()</code> on the match object <code>m</code>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> m = re.search('h...o', 'hello world')
>>> m.start()
0
>>> m.end()
5
>>> 'hello world'[m.start():m.end()]
'hello'</pre>
<p>In the first line, you create a match object m by using the <code>re.search()</code> method. The pattern <code>'h...o'</code> matches in the string <code>'hello world'</code> at start position 0. </p>
<p>You use the start and end position to access the substring that matches the pattern (using the popular <a href="https://blog.finxter.com/introduction-to-slicing-in-python/">Python technique of slicing</a>).</p>
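<p>Besides the positions, the match object also gives access to the matched text and the matching groups. The following is a small sketch; the pattern with two capture groups and the sample string are assumptions made for illustration.</p>

```python
import re

# Assumed example: two capture groups, one per word.
m = re.search('(h[a-z]+) (w[a-z]+)', 'say hello world')

print(m.group())   # hello world  (the whole match)
print(m.group(1))  # hello        (first group)
print(m.group(2))  # world        (second group)
print(m.groups())  # ('hello', 'world')
print(m.span())    # (4, 15)      (start and end position as a tuple)
```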
<hr class="wp-block-separator"/>
<p>Now that you understand the purpose of the match object, let’s have a look at the alternative to the <code>re.match()</code> function next! <img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f680.png" alt="🚀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<h2>Understanding re.search()</h2>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Python Regex re.search() - A Simple Guide with Example" width="780" height="439" src="https://www.youtube.com/embed/Mv2VVpUgypc?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
</figure>
<p class="has-global-color-8-background-color has-background">The <code>re.search(pattern, string)</code> method matches the first occurrence of the <code>pattern</code> in the <code>string</code> and returns a <a rel="noreferrer noopener" title="Python One Line Regex Match" href="https://blog.finxter.com/python-one-line-regex-match/" target="_blank">match </a>object. </p>
<p><strong>Specification</strong>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">re.search(pattern, string, flags=0)</pre>
<p>The <code>re.search()</code> method has up to three arguments.</p>
<ul>
<li><strong><code>pattern</code></strong>: the regular expression pattern that you want to match.</li>
<li><strong><code>string</code></strong>: the string which you want to search for the pattern.</li>
<li><strong><code>flags </code></strong>(optional argument): a more advanced modifier that allows you to customize the behavior of the function. Want to know <a href="https://blog.finxter.com/python-regex-flags/">how to use those flags? Check out this detailed article</a> on the Finxter blog.</li>
</ul>
<p>We’ll explore them in more detail later. </p>
<p><strong>Return Value:</strong></p>
<p>The <code>re.search()</code> method returns a match object for the first occurrence of the <code>pattern</code>, or <code>None</code> if the pattern doesn’t match anywhere in the <code>string</code>.</p>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Learn More</strong>: Understanding <code><a href="https://blog.finxter.com/python-regex-search/" data-type="post" data-id="5749" target="_blank" rel="noreferrer noopener">re.search()</a></code> on the Finxter blog.</p>
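<p>Because <code>re.search()</code> gives you <code>None</code> when nothing matches, it’s a good idea to check the result before calling methods on it. Here is one possible way to do that; the helper name <code>first_word_starting_with_h</code> is hypothetical and only used for this sketch.</p>

```python
import re

def first_word_starting_with_h(text):
    """Hypothetical helper: return the first match of 'h[a-z]+' or None."""
    m = re.search('h[a-z]+', text)
    # Calling m.group() on None would raise an AttributeError, so check first.
    return m.group() if m else None

print(first_word_starting_with_h('say hello world'))  # hello
print(first_word_starting_with_h('goodbye'))          # None
```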
<h2>Understanding re.findall()</h2>
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Python Regex Findall()" width="780" height="439" src="https://www.youtube.com/embed/QqUVPaP8fpA?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>
</figure>
<p class="has-pale-cyan-blue-background-color has-background">The <code>re.findall(pattern, string)</code> method scans <code>string</code> from <strong>left to right</strong>, searching for all <strong>non-overlapping matches</strong> of the <code>pattern</code>. It returns a <strong>list of strings</strong> in the matching order when scanning the string from left to right.</p>
<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img loading="lazy" src="https://blog.finxter.com/wp-content/uploads/2020/11/refindall-1024x576.jpg" alt="re.findall() Visual Explanation" class="wp-image-17238" width="768" height="432" srcset="https://blog.finxter.com/wp-content/uploads/2020/11/refindall-scaled.jpg 1024w, https://blog.finxter.com/wp-content/uplo...00x169.jpg 300w, https://blog.finxter.com/wp-content/uplo...68x432.jpg 768w, https://blog.finxter.com/wp-content/uplo...150x84.jpg 150w" sizes="(max-width: 768px) 100vw, 768px" /></figure>
</div>
<p><strong>Specification</strong>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">re.findall(pattern, string, flags=0)</pre>
<p>The <code>re.findall()</code> method has up to three arguments.</p>
<ul>
<li><strong><code>pattern</code></strong>: the regular expression pattern that you want to match.</li>
<li><strong><code>string</code></strong>: the string which you want to search for the pattern.</li>
<li><strong><code>flags</code> </strong>(optional argument): a more advanced modifier that allows you to customize the behavior of the function. Want to know <a href="https://blog.finxter.com/python-regex-flags/">how to use those flags? Check out this detailed article</a> on the Finxter blog.</li>
</ul>
<p>We will have a look at each of them in more detail. </p>
<p><strong>Return Value:</strong></p>
<p>The <code>re.findall()</code> method returns a list of strings. Each string element is a matching substring of the string argument.</p>
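<p>A short sketch of the return value, using an assumed sample string with two matches. It also shows one detail that is easy to miss: if the pattern contains more than one capture group, <code>re.findall()</code> returns a list of tuples with the groups instead of the full matches.</p>

```python
import re

text = 'hello happy world'  # assumed sample string with two matches

# No groups: a list of the matching substrings, in left-to-right order.
words = re.findall('h[a-z]+', text)
print(words)   # ['hello', 'happy']

# Two capture groups: a list of tuples, one tuple of groups per match.
parts = re.findall('(h)([a-z]+)', text)
print(parts)   # [('h', 'ello'), ('h', 'appy')]

# No match at all: an empty list (not None).
nothing = re.findall('z[a-z]+', text)
print(nothing) # []
```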
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/13.1.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Learn More</strong>: Understanding <code><a href="https://blog.finxter.com/python-re-findall/" data-type="post" data-id="5729">re.findall()</a></code> on the Finxter blog.</p>
</div>


https://www.sickgaming.net/blog/2022/05/...-try-this/