
Python Regex Flags

<div><p>Many functions in Python's re module accept an optional argument called <em>flags</em>. What are flags and how do they work?</p>
<p>Flags allow you to control the regular expression engine. They switch certain features on or off, for example whether the engine ignores capitalization when matching your regex.</p>
<p>For example, here’s how flags appears as the third argument of the <a href="https://blog.finxter.com/python-re-findall/">re.findall() function</a>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">re.findall(pattern, string, flags=0)</pre>
<p>The flags argument is an integer-like value with a default of 0, which means that no flag is set. To change the default regex behavior, pass one of the predefined flag constants. You can access these constants via the re module:</p>
<figure class="wp-block-table is-style-stripes">
<table class="">
<tbody>
<tr>
<td><strong>Syntax</strong></td>
<td><strong>Meaning</strong></td>
</tr>
<tr>
<td> <strong>re.ASCII</strong></td>
<td>If you don’t use this flag, the special Python regex symbols \w, \W, \b, \B, \d, \D, \s and \S will match Unicode characters. If you use this flag, those special symbols will match only ASCII characters — as the name suggests. </td>
</tr>
<tr>
<td> <strong>re.A</strong> </td>
<td>Same as re.ASCII </td>
</tr>
<tr>
<td> <strong>re.DEBUG</strong> </td>
<td>If you use this flag, Python prints debugging information about the compiled pattern to the shell, which helps you debug your regex. </td>
</tr>
<tr>
<td> <strong>re.IGNORECASE</strong> </td>
<td>If you use this flag, the regex engine will perform case-insensitive matching. So if you’re searching for [A-Z], it will also match [a-z]. </td>
</tr>
<tr>
<td> <strong>re.I</strong> </td>
<td>Same as re.IGNORECASE </td>
</tr>
<tr>
<td> <strong>re.LOCALE</strong> </td>
<td>Avoid this flag. Its use is discouraged: the idea was to make \w, \W, \b, \B and case-insensitive matching depend on your current locale, but the locale mechanism is unreliable. In Python 3 it works only with bytes patterns. </td>
</tr>
<tr>
<td> <strong>re.L</strong> </td>
<td>Same as re.LOCALE </td>
</tr>
<tr>
<td> <strong>re.MULTILINE</strong> </td>
<td>This flag switches on the following feature: the start-of-the-string regex ‘^’ matches at the beginning of each line (rather than only at the beginning of the string). The same holds for the end-of-the-string regex ‘$’ that now matches also at the end of each line in a multi-line string. </td>
</tr>
<tr>
<td> <strong>re.M</strong> </td>
<td>Same as re.MULTILINE </td>
</tr>
<tr>
<td> <strong>re.DOTALL</strong> </td>
<td>Without using this flag, the dot regex ‘.’ matches all characters except the newline character ‘\n’. Switch on this flag to really match all characters including the newline character. </td>
</tr>
<tr>
<td> <strong>re.S</strong> </td>
<td>Same as re.DOTALL </td>
</tr>
<tr>
<td> <strong>re.VERBOSE</strong> </td>
<td>To improve the readability of complicated regular expressions, you may want to allow comments and (multi-line) formatting of the regex itself. This is possible with this flag: whitespace in the pattern is ignored (except inside a character class or when escaped with a backslash), and everything from an unescaped ‘#’ to the end of the line is treated as a comment. </td>
</tr>
<tr>
<td> <strong>re.X</strong> </td>
<td>Same as re.VERBOSE </td>
</tr>
</tbody>
</table>
</figure>
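<p>To make the table more concrete, here is a minimal sketch that compares the default behavior with re.MULTILINE, re.DOTALL, and re.ASCII. The sample strings and variable names are illustrative assumptions, not part of the original article.</p>

```python
import re

# Hypothetical sample text with two lines.
text = "first line\nsecond line"

# re.MULTILINE: '^' also matches right after each newline.
starts_default = re.findall(r"^\w+", text)
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# re.DOTALL: '.' also matches the newline character.
dot_default = re.findall(r"line.second", text)
dot_dotall = re.findall(r"line.second", text, flags=re.DOTALL)

# re.ASCII: '\w' no longer matches non-ASCII letters such as 'ï' (U+00EF).
word_default = re.findall(r"\w+", "na\u00efve")
word_ascii = re.findall(r"\w+", "na\u00efve", flags=re.ASCII)

print(starts_default)    # ['first']
print(starts_multiline)  # ['first', 'second']
print(dot_default)       # []
print(dot_dotall)        # ['line\nsecond']
print(word_default)      # ['naïve']
print(word_ascii)        # ['na', 've']
```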
<h2>How to Use These Flags?</h2>
<p>Simply include the flag as the optional flag argument as follows:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

text = '''
Ha! let me see her: out, alas! he's cold:
Her blood is settled, and her joints are stiff;
Life and these lips have long been separated:
Death lies on her like an untimely frost
Upon the sweetest flower of all the field.
'''

print(re.findall('HER', text, flags=re.IGNORECASE))
# ['her', 'Her', 'her', 'her']</pre>
<p>As you see, the re.IGNORECASE flag ensures that all occurrences of the string ‘her’ are matched—no matter their capitalization. </p>
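<p>The same flag can be passed to other functions of the re module as well. The following sketch assumes a single sample line from the text above and uses re.compile(), re.search(), and re.sub(). Note that flags is not always the third positional argument, so passing it by keyword is the safer habit.</p>

```python
import re

line = "Her blood is settled, and her joints are stiff;"

# re.compile() takes flags as its second argument; the compiled
# pattern keeps them for every later call.
pattern = re.compile("her", flags=re.IGNORECASE)
all_matches = pattern.findall(line)

# re.search() returns only the first match.
first = re.search("HER", line, flags=re.IGNORECASE)

# re.sub() has a count parameter before flags, so pass flags by keyword.
replaced = re.sub("her", "his", line, flags=re.IGNORECASE)

print(all_matches)    # ['Her', 'her']
print(first.group())  # Her
print(replaced)       # his blood is settled, and his joints are stiff;
```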
<h2>How to Use Multiple Flags?</h2>
<p>Combine them with the bitwise OR operator (|). Summing distinct flags with + produces the same value, but | is the form used in the Python documentation:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

text = '''
Ha! let me see her: out, alas! he's cold:
Her blood is settled, and her joints are stiff;
Life and these lips have long been separated:
Death lies on her like an untimely frost
Upon the sweetest flower of all the field.
'''

print(re.findall(' HER # Ignored', text, flags=re.IGNORECASE | re.VERBOSE))
# ['her', 'Her', 'her', 'her']
</pre>
<p>You use both flags: re.IGNORECASE (all lowercase and uppercase variants of the string ‘her’ are matched) and re.VERBOSE (comments and whitespace in the regex are ignored). The expression re.IGNORECASE | re.VERBOSE indicates that you want both.</p>
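<p>Flags behave like bit flags, which is why combining them works at all. The sketch below is an illustration with assumed variable names: it shows that a combined value can be tested for individual flags, and that addition only matches bitwise OR as long as no flag is repeated.</p>

```python
import re

combined = re.IGNORECASE | re.VERBOSE

# Each flag occupies its own bit, so membership can be tested with '&'.
has_ignorecase = bool(combined & re.IGNORECASE)
has_dotall = bool(combined & re.DOTALL)

# For two distinct flags, '+' and '|' give the same value.
same_for_distinct = (re.IGNORECASE + re.VERBOSE) == combined

# OR-ing a flag with itself changes nothing, while adding it twice
# produces a different value that no longer means IGNORECASE.
or_twice = re.IGNORECASE | re.IGNORECASE
add_twice = re.IGNORECASE + re.IGNORECASE

print(has_ignorecase, has_dotall)   # True False
print(same_for_distinct)            # True
print(or_twice == re.IGNORECASE)    # True
print(add_twice == re.IGNORECASE)   # False
```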
</div>

https://www.sickgaming.net/blog/2020/01/...gex-flags/

xSicKxBot