03-01-2020, 11:33 AM
Python Re Escape
<div><p>I don’t know how often I sat in front of my computer, writing regular expressions and wondering: how to escape this or that character? The problem is that some special characters have a special meaning in Python strings and regular expressions. If you want to remove the special meaning, you need to escape the characters with an additional backslash.</p>
<p>If you have this problem too, you’re in luck. This article is the ultimate guide to escape special characters in Python. Just click on the topic that interests you and learn how to escape the special character you’re currently struggling with!</p>
<p>If you’re in a hurry, you’re in luck too. Just put a backslash in front of the special character you want to escape: <code>\x</code> escapes the special character <code>x</code>.</p>
<p>Here are a few examples:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\( \{ \" \. \* \+', r'( { " . * +')
['( { " . * +']</pre>
<p>You can also watch the following video where I give you a quick example:</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-rich is-provider-embed-handler wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<div class="ast-oembed-container"><iframe title="Python Regex - How to Escape Special Characters?" width="1100" height="619" src="https://www.youtube.com/embed/F6LY3qo3J9c?feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div>
</div>
</figure>
<h2>Python Regex Escape Characters</h2>
<p>If you use special characters in strings, they carry a special meaning. Sometimes you don’t need that. The general idea is to escape the special character <code>x</code> with an additional backslash <code>\x</code> to get rid of the special meaning.</p>
<p>In the following, I show how to escape the most common special characters in Python strings and regular expressions:</p>
<h3>Python Regex Escape Parentheses ()</h3>
<p>How to escape the parentheses <code>(</code> and <code>)</code> in Python regular expressions?</p>
<p>Parentheses have a special meaning in Python regular expressions: they open and close <a href="https://blog.finxter.com/python-re-groups/" target="_blank" rel="noreferrer noopener" aria-label="matching groups (opens in a new tab)">matching groups</a>. </p>
<p>You can get rid of the special meaning of parentheses by using the backslash prefix: <code>\(</code> and <code>\)</code>. This way, you can match the parentheses characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\(.*\)', 'Python is (really) great')
['(really)']</pre>
<p>The result shows a string that contains the “special” characters <code>'('</code> and <code>')'</code>. </p>
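<p>To make the difference concrete, here is a minimal sketch that runs the same word through an unescaped and an escaped pattern. The variable names are made up for illustration:</p>

```python
import re

text = 'Python is (really) great'

# Unescaped parentheses form a capturing group,
# so findall returns only the group's content.
group_matches = re.findall(r'(really)', text)

# Escaped parentheses match the literal characters '(' and ')'.
literal_matches = re.findall(r'\(really\)', text)

print(group_matches)    # ['really']
print(literal_matches)  # ['(really)']
```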
<h3>Python Regex Escape Square Brackets []</h3>
<p>How to escape the square brackets <code>[</code> and <code>]</code> in Python regular expressions?</p>
<p>Square brackets have a special meaning in Python regular expressions: they open and close <a rel="noreferrer noopener" aria-label="matching groups (opens in a new tab)" href="https://blog.finxter.com/python-re-groups/" target="_blank">character sets</a>. </p>
<p>You can get rid of the special meaning of brackets by using the backslash prefix: <code>\[</code> and <code>\]</code>. This way, you can match the brackets characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\[.*\]', 'Is Python [really] easy?')
['[really]']</pre>
<p>The result shows a string that contains the “special” characters <code>'['</code> and <code>']'</code>. </p>
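<p>As a small illustrative sketch (the variable names are assumptions, not part of the original example), compare what happens when the brackets are left unescaped:</p>

```python
import re

text = 'Is Python [really] easy?'

# Unescaped: [really] is a character set that matches any ONE
# of the characters r, e, a, l, y.
set_matches = re.findall(r'[really]', text)

# Escaped: \[ and \] match the literal bracket characters.
literal_matches = re.findall(r'\[really\]', text)

print(len(set_matches) > 1)  # True: many single-character matches
print(literal_matches)       # ['[really]']
```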
<h3>Python Regex Escape Curly Brace (Brackets)</h3>
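<p>How to escape the curly braces <code>{</code> and <code>}</code> in Python regular expressions?</p>
<p>Curly braces have no special meaning in plain Python string literals, but in regular expressions they form the repetition quantifier, as in <code>a{2,3}</code>. If the braces don’t form a valid quantifier, Python’s regex engine treats them as literal characters, so in that case you don’t need to escape them with a leading backslash character <code>\</code>. Escaping them is harmless and removes any ambiguity, as you see in the following example:</p>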
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\{.*\}', 'if (2==2) { y = 3; }')
['{ y = 3; }']
>>> re.findall(r'{.*}', 'if (2==2) { y = 3; }')
['{ y = 3; }']
>>> re.findall('{.*}', 'if (2==2) { y = 3; }')
['{ y = 3; }']</pre>
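<p>All three calls match the same string enclosed in curly braces, even though the last two don’t escape the braces and the third doesn’t use the raw string <code>r''</code>. This works because <code>{.*}</code> is not a valid quantifier, so the braces are taken literally. </p>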
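<p>One caveat is worth a quick sketch: when the braces do form a valid repetition quantifier such as <code>{2}</code>, the regex engine treats them as special, and escaping becomes necessary to match them literally. The sample text below is made up for illustration:</p>

```python
import re

text = 'a aa a{2}'

# Braces that form a valid quantifier are special:
# a{2} means "exactly two a's in a row".
quantified = re.findall(r'a{2}', text)

# Escaping the braces forces a literal match of the text 'a{2}'.
literal = re.findall(r'a\{2\}', text)

print(quantified)  # ['aa']
print(literal)     # ['a{2}']
```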
<h3>Python Regex Escape Slash (Backslash and Forward-Slash)</h3>
<p>How to escape the slash characters—backslash <code>\</code> and forward-slash <code>/</code>—in Python regular expressions?</p>
<p>The backslash has a special meaning in Python regular expressions: it escapes special characters and, thus, removes the special meaning. (How meta.)</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\\...', r'C:\home\usr\dir\hello\world')
['\\hom', '\\usr', '\\dir', '\\hel', '\\wor']</pre>
<p>You can see that the resulting matches are displayed with doubled backslashes. Each match actually contains a single backslash: the shell prints the <code>repr</code> of each string, which writes a backslash as <code>\\</code> because the backslash is the escape character in normal string literals. Note that you didn’t need to escape the backslash characters when writing the raw string <code>r'C:\home\usr\dir\hello\world'</code> because the raw string already removes the special meaning from the backslashed characters. But if you want to use a normal string instead of a raw string, you need to escape the backslash character yourself:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'\\...', 'C:\\home\\usr\\dir\\hello\\world')
['\\hom', '\\usr', '\\dir', '\\hel', '\\wor']</pre>
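<p>The same reasoning applies to the pattern itself. The following sketch (with made-up variable names) shows why a raw string is usually the more readable choice when the pattern has to match a backslash:</p>

```python
import re

path = r'C:\home\usr'

# Without a raw string, one literal backslash needs four in the source:
# the string literal reduces them to two, and the regex engine reads
# those two as one escaped backslash.
non_raw = re.findall('\\\\', path)

# The raw-string pattern is equivalent and easier to read.
raw = re.findall(r'\\', path)

print(non_raw == raw)  # True
print(len(raw[0]))     # 1: each match is a single backslash character
```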
<p>In contrast to the backslash, the forward-slash doesn’t need to be escaped. Why? Because it doesn’t have a special meaning in Python strings and regular expressions. You can see this in the following example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('/...', '/home/usr/dir/hello/world')
['/hom', '/usr', '/dir', '/hel', '/wor']</pre>
<p>The result shows that even in a non-raw string, you can use the forward-slash without leading escape character.</p>
<h3>Python Regex Escape String Single Quotes</h3>
<p>How to escape the single quotes <code>'</code> in Python regular expressions?</p>
<p>Single quotes have no special meaning in regular expressions, but they do in Python source code: they open and close <a href="https://blog.finxter.com/python-string-replace/" target="_blank" rel="noreferrer noopener" aria-label="strings (opens in a new tab)">strings</a>. </p>
<p>You can get rid of the special meaning of single quotes by using the backslash prefix: <code>\'</code>. This way, you can match the string quote characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('\'.*\'', "hello 'world'")
["'world'"]</pre>
<p>The result shows a string that contains the single quote characters. The example also shows an alternative that avoids escaping altogether: enclose the string in double quotes, as in <code>"hello 'world'"</code>. </p>
<h3>Python Regex Escape String Double Quotes</h3>
<p>How to escape the double quotes <code>"</code> in Python regular expressions?</p>
<p>Double quotes have no special meaning in regular expressions, but they do in Python source code: they open and close <a rel="noreferrer noopener" aria-label="strings (opens in a new tab)" href="https://blog.finxter.com/python-string-replace/" target="_blank">strings</a>. </p>
<p>You can get rid of the special meaning of double quotes by using the backslash prefix: <code>\"</code>. This way, you can match the string quote characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('\".*\"', 'hello "world"')
['"world"']</pre>
<p>The result shows a string that contains the double quote characters. The example also shows an alternative that avoids escaping altogether: enclose the string in single quotes, as in <code>'hello "world"'</code>. </p>
<h3>Python Regex Escape Dot (Period)</h3>
<p>How to escape the regex dot (or <em>period</em>) meta character <code>.</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-re-dot/" target="_blank" rel="noreferrer noopener" aria-label="dot character (opens in a new tab)">dot character</a> has a special meaning in Python regular expressions: it matches an arbitrary character (except newline). </p>
<p>You can get rid of the special meaning of the dot character by using the backslash prefix: <code>\.</code>. This way, you can match the dot character in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'..\.', 'my. name. is. python.')
['my.', 'me.', 'is.', 'on.']</pre>
<p>The result shows four strings that contain the “special” characters <code>'.'</code>. </p>
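<p>For comparison, here is a short sketch of what the unescaped dot does with similar input. The sample text is an assumption chosen to make the difference visible:</p>

```python
import re

text = 'version 3.7 and 3x7'

# Unescaped dot: matches any character between the digits.
any_char = re.findall(r'3.7', text)

# An escaped dot, or a dot inside a character set,
# matches only a literal '.'.
literal_dot = re.findall(r'3\.7', text)
in_set = re.findall(r'3[.]7', text)

print(any_char)     # ['3.7', '3x7']
print(literal_dot)  # ['3.7']
print(in_set)       # ['3.7']
```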
<h3>Python Regex Escape Plus</h3>
<p>How to escape the plus symbol <code>+</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-re-plus/" target="_blank" rel="noreferrer noopener" aria-label="plus symbol (opens in a new tab)">plus symbol</a> has a special meaning in Python regular expressions: it’s the one-or-more quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the regex plus symbol by using the backslash prefix: <code>\+</code>. This way, you can match the plus symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\++', '+++python+++rocks')
['+++', '+++']</pre>
<p>The result shows both usages: the plus symbol with and without leading escape character. If it is escaped <code>\+</code>, it matches the raw plus character. If it isn’t escaped <code>+</code>, it quantifies the regex pattern just in front of it (in our case the plus symbol itself).</p>
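<p>A typical place where this matters is matching a name like <code>C++</code>. The following is a minimal sketch with made-up sample text:</p>

```python
import re

text = 'I like C and C++'

# Escaped: matches the literal text 'C++'.
literal = re.findall(r'C\+\+', text)

# Unescaped: '+' quantifies the preceding 'C' (one or more 'C').
quantified = re.findall(r'C+', text)

print(literal)     # ['C++']
print(quantified)  # ['C', 'C']
```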
<h3>Python Regex Escape Asterisk</h3>
<p>How to escape the asterisk symbol <code>*</code> in Python regular expressions?</p>
<p>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.finxter.com/python-re-asterisk/" target="_blank">asterisk symbol</a> has a special meaning in Python regular expressions: it’s the zero-or-more quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the regex asterisk symbol by using the backslash prefix: <code>\*</code>. This way, you can match the asterisk symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
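>>> re.findall(r'\**', '***python***rocks')
['***', '', '', '', '', '', '', '***', '', '', '', '', '', '']</pre>
<p>The pattern shows both usages: the asterisk symbol with and without leading escape character. If it is escaped <code>\*</code>, it matches the raw asterisk character. If it isn’t escaped <code>*</code>, it quantifies the regex pattern just in front of it (in our case the asterisk symbol itself). Because <code>*</code> also allows zero repetitions, the pattern additionally produces an empty match at every position where no asterisk starts. Use <code>\*+</code> if you only want the non-empty runs <code>'***'</code>.</p>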
<h3>Python Regex Escape Question Mark</h3>
<p>How to escape the question mark symbol <code>?</code> in Python regular expressions?</p>
<p>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.finxter.com/python-re-question-mark/" target="_blank">question mark symbol</a> has a special meaning in Python regular expressions: it’s the zero-or-one quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the question mark symbol by using the backslash prefix: <code>\?</code>. This way, you can match the question mark symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'...\?', 'how are you?')
['you?']</pre>
<p>The result shows that the question mark symbol was matched in the given string.</p>
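<p>To contrast the two meanings, here is a small sketch with an illustrative sample string:</p>

```python
import re

text = 'color colour colou?r'

# Unescaped: 'u?' makes the 'u' optional.
optional_u = re.findall(r'colou?r', text)

# Escaped: '\?' matches a literal question mark.
literal_q = re.findall(r'colou\?r', text)

print(optional_u)  # ['color', 'colour']
print(literal_q)   # ['colou?r']
```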
<h3>Python Regex Escape Underscore</h3>
<p>How to escape the underscore character <code>_</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/underscore-in-python/" target="_blank" rel="noreferrer noopener" aria-label="underscore (opens in a new tab)">underscore</a> doesn’t have a special meaning in Python regular expressions or Python strings. </p>
<p>Therefore, you don’t need to escape the underscore character—just use it in your regular expression unescaped. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('..._', 'i_use_underscore_not_whitespace')
['use_', 'ore_', 'not_']</pre>
<p>However, it doesn’t hurt to escape it either:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'...\_', 'i_use_underscore_not_whitespace')
['use_', 'ore_', 'not_']</pre>
<p>In both cases, Python finds the underscore characters in the string and matches them in the result.</p>
<h3>Python Regex Escape Pipe</h3>
<p>How to escape the pipe symbol <code>|</code> (vertical line) in Python regular expressions?</p>
<p>The pipe symbol has a special meaning in Python regular expressions: the<a href="https://blog.finxter.com/python-regex-or/" target="_blank" rel="noreferrer noopener" aria-label=" regex OR operator (opens in a new tab)"> regex OR operator</a>.</p>
<p>You can get rid of the special meaning of the pipe symbol by using the backslash prefix: <code>\|</code>. This way, you can match the pipe character in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'.\|.', 'a|b|c|d|e')
['a|b', 'c|d']</pre>
<p>By escaping the pipe symbol, you get rid of the special meaning. The result is just the matched pipe symbol with leading and trailing arbitrary character. </p>
<p>If you don’t escape the pipe symbol, the result will be quite different:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('.|.', 'a|b|c|d|e')
['a', '|', 'b', '|', 'c', '|', 'd', '|', 'e']</pre>
<p>In this case, the regex <code>.|.</code> matches<em> “an arbitrary character or an arbitrary character”</em>—quite meaningless!</p>
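<p>A slightly more realistic sketch of the same contrast, using made-up sample text:</p>

```python
import re

text = 'cat|dog bird'

# Unescaped: alternation, matches either 'cat' or 'dog'.
either = re.findall(r'cat|dog', text)

# Escaped: matches the literal text 'cat|dog'.
literal = re.findall(r'cat\|dog', text)

print(either)   # ['cat', 'dog']
print(literal)  # ['cat|dog']
```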
<h3>Python Regex Escape Dollar</h3>
<p>How to escape the dollar symbol <code>$</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-regex-start-of-line-and-end-of-line/" target="_blank" rel="noreferrer noopener" aria-label="dollar symbol (opens in a new tab)">dollar symbol</a> has a special meaning in Python regular expressions: it matches at the end of the string. </p>
<p>You can get rid of the special meaning by using the backslash prefix: <code>\$</code>. This way, you can match the dollar symbol in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\$\d+', 'Your house is worth $1000000')
['$1000000']</pre>
<p>Note that the <code>\d+</code> regex matches one or more decimal digits (0 to 9). </p>
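<p>As a quick sketch of both roles of the dollar symbol (the sample text is an assumption):</p>

```python
import re

text = 'Price: $25'

# Unescaped: '$' anchors the match at the end of the string,
# so this finds the trailing digits.
anchored = re.findall(r'\d+$', text)

# Escaped: '\$' matches the literal dollar sign before the digits.
literal = re.findall(r'\$\d+', text)

print(anchored)  # ['25']
print(literal)   # ['$25']
```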
<h3>Python Regex Escape Greater Than and Smaller Than</h3>
<p>How to escape the less-than <code><</code> and greater-than <code>></code> symbols in Python regular expressions?</p>
<p>On their own, the less-than and greater-than symbols don’t have a special meaning in Python regular expressions. Therefore, you don’t need to escape them.</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('<.*>.*<.*>', '<div>hello world</div>')
['<div>hello world</div>']</pre>
<p>The result shows that the regex matches the whole string even without escaping the HTML tag symbols. </p>
<h3>Python Regex Escape Hyphen</h3>
<p>How to escape the hyphen <code>-</code> in Python regular expressions?</p>
<p><strong>Outside</strong> a <a rel="noreferrer noopener" aria-label="character class (opens in a new tab)" href="https://blog.finxter.com/python-character-set-regex-tutorial/" target="_blank">character set</a>, the hyphen doesn’t have a special meaning and you don’t need to escape it. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('..-', 'this is-me')
['is-']</pre>
<p>The unescaped hyphen character in the regex matches the hyphen in the string.</p>
<p>However, <strong>inside </strong>a <a rel="noreferrer noopener" aria-label="character set (opens in a new tab)" href="https://blog.finxter.com/python-character-set-regex-tutorial/" target="_blank">character set</a>, the hyphen stands for the range symbol (e.g. <code>[0-9]</code>) so you need to escape it if you want to get rid of its special meaning and match the hyphen symbol itself. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'[a-z\-]+', 'hello-world is one word')
['hello-world', 'is', 'one', 'word']</pre>
<p>Note that, in this case, if you don’t escape the hyphen in the character set, you get the same result:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('[a-z-]+', 'hello-world is one word')
['hello-world', 'is', 'one', 'word']</pre>
<p>The reason is that the hyphen appears at the end of the character set, where it can have only one meaning: the hyphen symbol itself (the same holds at the very start of the set). Between two characters, however, an unescaped hyphen is read as the range operator, which can lead to surprising matches. A good practice is, thus, to escape the hyphen in the character set by default.</p>
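<p>To see what a hyphen between two characters does, consider this minimal sketch with an illustrative sample string:</p>

```python
import re

text = 'a-z b m'

# Unescaped hyphen between two characters: the range a through z.
as_range = re.findall(r'[a-z]', text)

# Escaped hyphen: only the three characters 'a', '-', and 'z'.
as_literal = re.findall(r'[a\-z]', text)

print(as_range)    # ['a', 'z', 'b', 'm']
print(as_literal)  # ['a', '-', 'z']
```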
<h3>Python Regex Escape Newline</h3>
<p>On <a href="https://stackoverflow.com/questions/14689531/how-to-match-a-new-line-character-in-python-raw-string" target="_blank" rel="noreferrer noopener" aria-label="StackOverflow (opens in a new tab)">StackOverflow</a>, I read the following question:</p>
<p><em>I got a little confused about Python raw string. I know that if we use raw string, then it will treat <code>'\'</code> as a normal backslash (ex. <code>r'\n'</code> would be <code>'\'</code> and <code>'n'</code>). However, I was wondering what if I want to match a new line character in raw string. I tried <code>r'\n'</code>, but it didn’t work. Anybody has some good idea about this?</em></p>
<p>The coder asking the question has understood that the Python interpreter doesn’t give the two characters <code>\</code> and <code>n</code> any special meaning in raw strings (in contrast to normal strings). </p>
<p>However, those two symbols have a special meaning for the regex engine! So if you use them as a regular expression pattern, they will indeed match the newline character:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> text = '''This
is
a
multiline
string'''
>>> re.findall(r'[a-z]+\n', text)
['his\n', 'is\n', 'a\n', 'multiline\n']</pre>
<p>Therefore, you don’t need to escape the newline character again to match it in a given string.</p>
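<p>The following sketch compares the normal-string and raw-string forms of the pattern. It also shows how to match a literal backslash followed by <code>n</code>, which is my own addition to the question above:</p>

```python
import re

text = 'first\nsecond'

# Normal string: Python converts '\n' to a newline character
# before the regex engine ever sees the pattern.
from_normal = re.findall('\n', text)

# Raw string: the regex engine receives backslash + 'n'
# and interprets that pair as a newline itself.
from_raw = re.findall(r'\n', text)

print(from_normal == from_raw)  # True

# To match a literal backslash followed by 'n', escape the backslash.
literal = re.findall(r'\\n', r'first\nsecond')
print(len(literal))  # 1
```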
<h2>Python re.escape Method</h2>
<p>If you know that your string has a lot of special characters, you can also use the convenience method <code>re.escape(pattern)</code> from Python’s <a rel="noreferrer noopener" aria-label="re package. (opens in a new tab)" href="https://blog.finxter.com/python-regex/" target="_blank">re module. </a></p>
<p><strong>Specification</strong>: <code>re.escape(pattern)</code></p>
<p><strong>Definition</strong>: escapes all special regex meta characters in the given <code>pattern</code>. </p>
<p><strong>Example</strong>: you can escape all special symbols in one go:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.escape('https://www.finxter.com/')
'https://www\\.finxter\\.com/'</pre>
<p>The dot is the only character in the string <code>'https://www.finxter.com/'</code> that has a special meaning in a regular expression. Therefore, only the two dots are escaped.</p>
<p>Note that “only characters that can have special meaning in a regular expression are escaped. As a result, <code>'!'</code>, <code>'"'</code>, <code>'%'</code>, <code>"'"</code>, <code>','</code>, <code>'/'</code>, <code>':'</code>, <code>';'</code>, <code>'<'</code>, <code>'='</code>, <code>'>'</code>, <code>'@'</code>, and <code>"`"</code> are no longer escaped” (<a href="https://docs.python.org/3/library/re.html">source</a>).</p>
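<p>A common use case, sketched here under the assumption that the search term comes from a user and should be matched literally (the variable names and sample text are hypothetical):</p>

```python
import re

# Hypothetical user-supplied search term containing regex metacharacters.
search_term = '3.5 (beta)'
text = 'Release 3.5 (beta) replaces 3x5 beta'

# Without escaping, '.' matches any character and the parentheses
# form a capturing group, so the wrong part of the text is found.
unescaped = re.findall(search_term, text)

# re.escape() makes every metacharacter in the term literal.
escaped = re.findall(re.escape(search_term), text)

print(unescaped)  # ['beta']
print(escaped)    # ['3.5 (beta)']
```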
<h2>Python Regex Bad Escape</h2>
<p>There are some common errors in relation to escaping in Python regular expressions.</p>
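<p>If you escape an ASCII letter that doesn’t form a known escape sequence, Python (3.6 and later) raises a “bad escape” error. Escaping punctuation that has no special meaning, such as <code>\_</code>, is accepted: </p>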
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('\m', 'hello {world}')
Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    re.findall('\m', 'hello {world}')
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\re.py", line 223, in findall
    return _compile(pattern, flags).findall(string)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 930, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 507, in _parse
    code = _escape(source, this, state)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 402, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
re.error: bad escape \m at position 0</pre>
<p>As the error message suggests, there’s no escape sequence <code>\m</code>, so you need to remove the backslash to avoid the error.</p>
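<p>If a pattern comes from an untrusted or generated source, one option is to catch the error instead of letting it propagate. The helper <code>try_compile</code> below is a hypothetical name used only for this sketch:</p>

```python
import re

def try_compile(pattern):
    """Return the error message if the pattern fails to compile, else None."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)

# Backslash + ASCII letter without a defined meaning is rejected.
print(try_compile(r'\m'))

# Escaping punctuation such as the underscore is accepted.
print(try_compile(r'\_'))  # None
```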
<h2>Where to Go From Here</h2>
<p>Wow, you either have read about a lot of escaped character sequences or you did a lot of scrolling to reach this point.</p>
<p>In both cases, you have a great advantage over other coders: you’re a persistent guy or gal!</p>
<p>Do you want to increase your advantage over your peers? Then join my <a rel="noreferrer noopener" aria-label="Python email academy (opens in a new tab)" href="https://blog.finxter.com/subscribe/" target="_blank">Python email academy</a>! I’ll teach you the ins and outs of Python coding—all free!</p>
<p><a href="https://blog.finxter.com/subscribe/" target="_blank" rel="noreferrer noopener" aria-label="Join Finxter Email Academy, become a better coder, and download your free Python cheat sheets! (opens in a new tab)">Join Finxter Email Academy, become a better coder, and download your free Python cheat sheets!</a></p>
</div>
https://www.sickgaming.net/blog/2020/02/...re-escape/
<div><p>I don’t know how often I sat in front of my computer, writing regular expressions and wondering: how to escape this or that character? The problem is that some special characters have a special meaning in Python strings and regular expressions. If you want to remove the special meaning, you need to escape the characters with an additional backslash.</p>
<p>If you have this problem too, you’re in luck. This article is the ultimate guide to escape special characters in Python. Just click on the topic that interests you and learn how to escape the special character you’re currently struggling with!</p>
<p>If you’re the impatient guy, you’re in luck too. Just try to add the backslash to your special character you want to escape: <code>\x</code> to escape special character <code>x</code>.</p>
<p>Here are a few examples:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('\( \{ \" \. \* \+', r'( { " . * +')
['( { " . * +']</pre>
<p>You can also watch the following video where I give you a quick example:</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-rich is-provider-embed-handler wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<div class="ast-oembed-container"><iframe title="Python Regex - How to Escape Special Characters?" width="1100" height="619" src="https://www.youtube.com/embed/F6LY3qo3J9c?feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div>
</p></div>
</figure>
<h2>Python Regex Escape Characters</h2>
<p>If you use special characters in strings, they carry a special meaning. Sometimes you don’t need that. The general idea is to escape the special character <code>x</code> with an additional backslash <code>\x</code> to get rid of the special meaning.</p>
<p>In the following, I show how to escape all possible special characters for Python strings and regular expressions:</p>
<h3>Python Regex Escape Parentheses ()</h3>
<p>How to escape the parentheses <code>(</code> and <code>)</code> in Python regular expressions?</p>
<p>Parentheses have a special meaning in Python regular expressions: they open and close <a href="https://blog.finxter.com/python-re-groups/" target="_blank" rel="noreferrer noopener" aria-label="matching groups (opens in a new tab)">matching groups</a>. </p>
<p>You can get rid of the special meaning of parentheses by using the backslash prefix: <code>\(</code> and <code>\)</code>. This way, you can match the parentheses characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\(.*\)', 'Python is (really) great')
['(really)']</pre>
<p>The result shows a string that contains the “special” characters <code>'('</code> and <code>')'</code>. </p>
<h3>Python Regex Escape Square Brackets []</h3>
<p>How to escape the square brackets <code>[</code> and <code>]</code> in Python regular expressions?</p>
<p>Square brackets have a special meaning in Python regular expressions: they open and close <a rel="noreferrer noopener" aria-label="matching groups (opens in a new tab)" href="https://blog.finxter.com/python-re-groups/" target="_blank">character sets</a>. </p>
<p>You can get rid of the special meaning of brackets by using the backslash prefix: <code>\[</code> and <code>\]</code>. This way, you can match the brackets characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\[.*\]', 'Is Python [really] easy?')
['[really]']</pre>
<p>The result shows a string that contains the “special” characters <code>'['</code> and <code>']'</code>. </p>
<h3>Python Regex Escape Curly Brace (Brackets)</h3>
<p>How to escape the curly braces<code>{</code> and <code>}</code> in Python regular expressions?</p>
<p>The curly braces don’t have any special meaning in Python strings or regular expressions. Therefore, you don’t need to escape them with a leading backslash character <code>\</code>. However, you can do so if you wish as you see in the following example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\{.*\}', 'if (2==2) { y = 3; }')
['{ y = 3; }']
>>> re.findall(r'{.*}', 'if (2==2) { y = 3; }')
['{ y = 3; }']
>>> re.findall('{.*}', 'if (2==2) { y = 3; }')
['{ y = 3; }']</pre>
<p>All three cases match the same string enclosed in curly braces—even though we did not escape them and didn’t use the raw string <code>r''</code> in the third example. </p>
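<p>One caveat is worth a quick check: braces that do form a valid quantifier are not matched literally. The following is a minimal sketch with a made-up sample string that contrasts the two readings:</p>

```python
import re

text = 'a{2} aa'

# Unescaped: {2} is read as the quantifier "exactly two of the preceding a".
as_quantifier = re.findall(r'a{2}', text)

# Escaped: the braces are matched as literal characters.
as_literal = re.findall(r'a\{2\}', text)

print(as_quantifier)  # ['aa']
print(as_literal)     # ['a{2}']
```

<p>So if the text between the braces could ever look like a number or a number range, escaping the braces avoids surprises.</p>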
<h3>Python Regex Escape Slash (Backslash and Forward-Slash)</h3>
<p>How to escape the slash characters—backslash <code>\</code> and forward-slash <code>/</code>—in Python regular expressions?</p>
<p>The backslash has a special meaning in Python regular expressions: it escapes special characters and, thus, removes the special meaning. (How meta.)</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\\...', r'C:\home\usr\dir\hello\world')
['\\hom', '\\usr', '\\dir', '\\hel', '\\wor']</pre>
<p>You can see that the resulting matches are displayed with doubled backslashes. Each match actually contains only a single backslash: the shell displays the <code>repr()</code> of each string, which writes a backslash as <code>\\</code> because the backslash has a special meaning in normal string literals. Note that you didn’t need to escape the backslash characters when writing the raw string <code>r'C:\home\usr\dir\hello\world'</code> because the raw string already removes all the special meaning from the backslashed characters. But if you don’t want to use a raw string but a normal string, you need to escape the backslash character yourself:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'\\...', 'C:\\home\\usr\\dir\\hello\\world')
['\\hom', '\\usr', '\\dir', '\\hel', '\\wor']</pre>
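<p>If the doubled backslashes in the shell output look confusing, a small check of the string contents may help. This is only an illustrative sketch with a shortened sample path:</p>

```python
import re

path = r'C:\home\usr'
matches = re.findall(r'\\...', path)

print(matches)          # list display uses repr(): ['\\hom', '\\usr']
print(matches[0])       # print() shows the real content: \hom
print(len(matches[0]))  # 4 characters: one backslash plus three letters

# Without a raw string, the pattern needs four backslashes to mean the same thing.
same_pattern = '\\\\...' == r'\\...'
print(same_pattern)     # True
```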
<p>In contrast to the backslash, the forward-slash doesn’t need to be escaped. Why? Because it doesn’t have a special meaning in Python strings and regular expressions. You can see this in the following example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('/...', '/home/usr/dir/hello/world')
['/hom', '/usr', '/dir', '/hel', '/wor']</pre>
<p>The result shows that even in a non-raw string, you can use the forward-slash without leading escape character.</p>
<h3>Python Regex Escape String Single Quotes</h3>
<p>How to escape the single quotes <code>'</code> in Python regular expressions?</p>
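<p>Single quotes have no special meaning in regular expressions themselves. They are special in Python source code, where they open and close <a href="https://blog.finxter.com/python-string-replace/" target="_blank" rel="noreferrer noopener" aria-label="strings (opens in a new tab)">strings</a>. This matters as soon as you write a pattern inside a single-quoted string.</p>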
<p>You can get rid of the special meaning of single quotes by using the backslash prefix: <code>\'</code>. This way, you can match the string quote characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('\'.*\'', "hello 'world'")
["'world'"]</pre>
<p>The result shows a string that contains the “special” single quote characters. The example also shows an alternative to escaping: enclose the string in double quotes, as in <code>"hello 'world'"</code>, so that the single quotes inside need no backslash.</p>
<h3>Python Regex Escape String Double Quotes</h3>
<p>How to escape the double quotes <code>"</code> in Python regular expressions?</p>
<p>Double quotes have no special meaning in regular expressions themselves. They are special in Python source code, where they open and close <a rel="noreferrer noopener" aria-label="strings (opens in a new tab)" href="https://blog.finxter.com/python-string-replace/" target="_blank">strings</a>. This matters as soon as you write a pattern inside a double-quoted string.</p>
<p>You can get rid of the special meaning of double quotes by using the backslash prefix: <code>\"</code>. This way, you can match the string quote characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('\".*\"', 'hello "world"')
['"world"']</pre>
<p>The result shows a string that contains the “special” double quote characters. The example also shows an alternative to escaping: enclose the string in single quotes, as in <code>'hello "world"'</code>, so that the double quotes inside need no backslash.</p>
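<p>Because quotes only matter to the Python string literal and not to the regex engine, choosing the other quote style for the pattern often avoids backslashes entirely. A small sketch of that idea, with illustrative variable names:</p>

```python
import re

# Pattern written in double quotes, so the single quotes inside need no backslash.
single_quoted = re.findall(r"'.*?'", "hello 'world'")

# Pattern written in single quotes, so the double quotes inside need no backslash.
double_quoted = re.findall(r'".*?"', 'hello "world"')

print(single_quoted)  # ["'world'"]
print(double_quoted)  # ['"world"']
```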
<h3>Python Regex Escape Dot (Period)</h3>
<p>How to escape the regex dot (or <em>period</em>) meta character <code>.</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-re-dot/" target="_blank" rel="noreferrer noopener" aria-label="dot character (opens in a new tab)">dot character</a> has a special meaning in Python regular expressions: it matches an arbitrary character (except newline). </p>
<p>You can get rid of the special meaning of the dot character by using the backslash prefix: <code>\.</code>. This way, you can match the dot character in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'..\.', 'my. name. is. python.')
['my.', 'me.', 'is.', 'on.']</pre>
<p>The result shows four strings that contain the “special” characters <code>'.'</code>. </p>
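<p>To see what the escape changes, it can help to run the same pattern with and without the backslash. The sample text below is made up for illustration:</p>

```python
import re

text = '1.5 105 1x5'

# Unescaped dot: any character may sit between 1 and 5.
unescaped = re.findall(r'1.5', text)

# Escaped dot: only a literal period matches.
escaped = re.findall(r'1\.5', text)

# Inside a character set the dot is literal as well.
in_set = re.findall(r'1[.]5', text)

print(unescaped)  # ['1.5', '105', '1x5']
print(escaped)    # ['1.5']
print(in_set)     # ['1.5']
```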
<h3>Python Regex Escape Plus</h3>
<p>How to escape the plus symbol <code>+</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-re-plus/" target="_blank" rel="noreferrer noopener" aria-label="plus symbol (opens in a new tab)">plus symbol</a> has a special meaning in Python regular expressions: it’s the one-or-more quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the regex plus symbol by using the backslash prefix: <code>\+</code>. This way, you can match the plus symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\++', '+++python+++rocks')
['+++', '+++']</pre>
<p>The result shows both usages: the plus symbol with and without leading escape character. If it is escaped <code>\+</code>, it matches the raw plus character. If it isn’t escaped <code>+</code>, it quantifies the regex pattern just in front of it (in our case the plus symbol itself).</p>
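<p>A slightly more practical sketch mixes both roles in one pattern: the quantifier <code>+</code> after <code>\d</code> and the literal <code>\+</code> between the numbers. The sample text is an assumption made for this example:</p>

```python
import re

text = '1+2 and 33+44'

# \d+ means "one or more digits"; \+ means "a literal plus sign".
sums = re.findall(r'\d+\+\d+', text)

print(sums)  # ['1+2', '33+44']
```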
<h3>Python Regex Escape Asterisk</h3>
<p>How to escape the asterisk symbol <code>*</code> in Python regular expressions?</p>
<p>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.finxter.com/python-re-asterisk/" target="_blank">asterisk symbol</a> has a special meaning in Python regular expressions: it’s the zero-or-more quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the regex asterisk symbol by using the backslash prefix: <code>\*</code>. This way, you can match the asterisk symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\**', '***python***rocks')
['***', '', '', '', '', '', '', '***', '', '', '', '', '', '']</pre>
<p>The pattern shows both usages: the asterisk symbol with and without leading escape character. If it is escaped <code>\*</code>, it matches the raw asterisk character. If it isn’t escaped <code>*</code>, it quantifies the regex pattern just in front of it (in our case the asterisk symbol itself). Because the unescaped asterisk also allows zero repetitions, the pattern produces an empty match at every position where no asterisk starts, which is why the result contains empty strings next to the two <code>'***'</code> matches.</p>
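<p>Since the zero-or-more quantifier is also satisfied by an empty match, <code>\**</code> returns empty strings wherever no asterisk is found. If only the runs of asterisks are of interest, one option is to combine the escaped asterisk with the plus quantifier, as this short sketch suggests:</p>

```python
import re

text = '***python***rocks'

# Zero or more literal asterisks: also matches the empty string.
zero_or_more = re.findall(r'\**', text)

# One or more literal asterisks: only non-empty runs.
one_or_more = re.findall(r'\*+', text)

print('' in zero_or_more)  # True
print(one_or_more)         # ['***', '***']
```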
<h3>Python Regex Escape Question Mark</h3>
<p>How to escape the question mark symbol <code>?</code> in Python regular expressions?</p>
<p>The <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.finxter.com/python-re-question-mark/" target="_blank">question mark symbol</a> has a special meaning in Python regular expressions: it’s the zero-or-one quantifier of the preceding regex. </p>
<p>You can get rid of the special meaning of the question mark symbol by using the backslash prefix: <code>\?</code>. This way, you can match the question mark symbol characters in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'...\?', 'how are you?')
['you?']</pre>
<p>The result shows that the question mark symbol was matched in the given string.</p>
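<p>For comparison, here is a brief sketch that puts the quantifier and the escaped question mark side by side. The sample strings are chosen only for illustration:</p>

```python
import re

# Unescaped: the u is optional, so both spellings match.
optional_u = re.findall(r'colou?r', 'color colour')

# Escaped: a literal question mark must follow the word.
literal_q = re.findall(r'\w+\?', 'why? because')

print(optional_u)  # ['color', 'colour']
print(literal_q)   # ['why?']
```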
<h3>Python Regex Escape Underscore</h3>
<p>How to escape the underscore character <code>_</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/underscore-in-python/" target="_blank" rel="noreferrer noopener" aria-label="underscore (opens in a new tab)">underscore</a> doesn’t have a special meaning in Python regular expressions or Python strings. </p>
<p>Therefore, you don’t need to escape the underscore character—just use it in your regular expression unescaped. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('..._', 'i_use_underscore_not_whitespace')
['use_', 'ore_', 'not_']</pre>
<p>However, it doesn’t hurt to escape it either. Write the pattern as a raw string so that Python doesn’t complain about the unknown string escape <code>\_</code>:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'...\_', 'i_use_underscore_not_whitespace')
['use_', 'ore_', 'not_']</pre>
<p>In both cases, Python finds the underscore characters in the string and matches them in the result.</p>
<h3>Python Regex Escape Pipe</h3>
<p>How to escape the pipe symbol <code>|</code> (vertical line) in Python regular expressions?</p>
<p>The pipe symbol has a special meaning in Python regular expressions: it is the <a href="https://blog.finxter.com/python-regex-or/" target="_blank" rel="noreferrer noopener" aria-label=" regex OR operator (opens in a new tab)">regex OR operator</a>.</p>
<p>You can get rid of the special meaning of the pipe symbol by using the backslash prefix: <code>\|</code>. This way, you can match the pipe character in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'.\|.', 'a|b|c|d|e')
['a|b', 'c|d']</pre>
<p>By escaping the pipe symbol, you get rid of the special meaning. The result is just the matched pipe symbol with leading and trailing arbitrary character. </p>
<p>If you don’t escape the pipe symbol, the result will be quite different:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('.|.', 'a|b|c|d|e')
['a', '|', 'b', '|', 'c', '|', 'd', '|', 'e']</pre>
<p>In this case, the regex <code>.|.</code> matches<em> “an arbitrary character or an arbitrary character”</em>—quite meaningless!</p>
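<p>With more distinctive alternatives, the difference between the OR operator and the literal pipe is easier to see. A minimal sketch with an invented sample string:</p>

```python
import re

text = 'cat|dog'

# Unescaped: "cat" OR "dog", so each word is found separately.
either = re.findall(r'cat|dog', text)

# Escaped: the pipe is an ordinary character between the two words.
literal = re.findall(r'cat\|dog', text)

print(either)   # ['cat', 'dog']
print(literal)  # ['cat|dog']
```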
<h3>Python Regex Escape Dollar</h3>
<p>How to escape the dollar symbol <code>$</code> in Python regular expressions?</p>
<p>The <a href="https://blog.finxter.com/python-regex-start-of-line-and-end-of-line/" target="_blank" rel="noreferrer noopener" aria-label="dollar symbol (opens in a new tab)">dollar symbol</a> has a special meaning in Python regular expressions: it matches at the end of the string. </p>
<p>You can get rid of the special meaning by using the backslash prefix: <code>\$</code>. This way, you can match the dollar symbol in a given string. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall(r'\$\d+', 'Your house is worth $1000000')
['$1000000']</pre>
<p>Note that the <code>\d+</code> regex matches one or more decimal digits between 0 and 9. </p>
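<p>The two meanings of the dollar symbol can be compared directly. This is a small sketch on a made-up sentence:</p>

```python
import re

text = 'costs $5 or $10'

# Escaped: a literal dollar sign followed by digits.
prices = re.findall(r'\$\d+', text)

# Unescaped: digits that sit at the very end of the string.
at_end = re.findall(r'\d+$', text)

print(prices)  # ['$5', '$10']
print(at_end)  # ['10']
```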
<h3>Python Regex Escape Greater Than and Smaller Than</h3>
<p>How to escape the less than <code><</code> and greater than <code>></code> symbols in Python regular expressions?</p>
<p>Less than and greater than symbols don’t have a special meaning on their own in Python regular expressions. They only play a role inside special group syntax such as named groups and lookbehinds. Therefore, you don’t need to escape them.</p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('<.*>.*<.*>', '<div>hello world</div>')
['<div>hello world</div>']</pre>
<p>The result shows that the regex matches the whole string even without escaping the HTML tag symbols. </p>
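<p>The one place where angle brackets are part of the regex syntax is inside group constructs such as a named group. In the sketch below, the group name <code>tag</code> is an arbitrary choice for this example; the first pair of angle brackets belongs to the group syntax, the second pair is matched literally:</p>

```python
import re

# (?P<tag>...) defines a group named "tag"; <\w+> matches a literal opening tag.
match = re.search(r'(?P<tag><\w+>)', '<div>hello world</div>')
tag = match.group('tag')

print(tag)  # <div>
```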
<h3>Python Regex Escape Hyphen</h3>
<p>How to escape the hyphen <code>-</code> in Python regular expressions?</p>
<p><strong>Outside</strong> a <a rel="noreferrer noopener" aria-label="character class (opens in a new tab)" href="https://blog.finxter.com/python-character-set-regex-tutorial/" target="_blank">character set</a>, the hyphen doesn’t have a special meaning and you don’t need to escape it. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('..-', 'this is-me')
['is-']</pre>
<p>The unescaped hyphen character in the regex matches the hyphen in the string.</p>
<p>However, <strong>inside </strong>a <a rel="noreferrer noopener" aria-label="character set (opens in a new tab)" href="https://blog.finxter.com/python-character-set-regex-tutorial/" target="_blank">character set</a>, the hyphen stands for the range symbol (e.g. <code>[0-9]</code>) so you need to escape it if you want to get rid of its special meaning and match the hyphen symbol itself. Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall(r'[a-z\-]+', 'hello-world is one word')
['hello-world', 'is', 'one', 'word']</pre>
<p>Note that, in this case, if you don’t escape the hyphen in the character set, you get the same result:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('[a-z-]+', 'hello-world is one word')
['hello-world', 'is', 'one', 'word']</pre>
<p>The reason is that the hyphen appears at the end of the character set where it can have only one meaning: the hyphen symbol itself. The same holds at the very start of the set. However, a hyphen placed between two characters is read as the range operator, which can result in strange behavior. A good practice is, thus, to escape the hyphen in the character class by default.</p>
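<p>The “strange behavior” of an unescaped hyphen between two characters can be shown in a short sketch. The sample string is chosen only to make the difference visible:</p>

```python
import re

text = 'a-c b'

# Unescaped between two characters: the range a to c, so b matches and - does not.
as_range = re.findall(r'[a-c]', text)

# Escaped: exactly the three characters a, - and c.
as_literal = re.findall(r'[a\-c]', text)

print(as_range)    # ['a', 'c', 'b']
print(as_literal)  # ['a', '-', 'c']
```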
<h3>Python Regex Escape Newline</h3>
<p>In a recent <a href="https://stackoverflow.com/questions/14689531/how-to-match-a-new-line-character-in-python-raw-string" target="_blank" rel="noreferrer noopener" aria-label="StackOverflow (opens in a new tab)">StackOverflow</a> article, I read the following question:</p>
<p><em>I got a little confused about Python raw string. I know that if we use raw string, then it will treat <code>'\'</code> as a normal backslash (ex. <code>r'\n'</code> would be <code>'\'</code> and <code>'n'</code>). However, I was wondering what if I want to match a new line character in raw string. I tried <code>r'\n'</code>, but it didn’t work. Anybody has some good idea about this?</em></p>
<p>The coder asking the question has understood that, in raw strings, the Python interpreter doesn’t give the two characters <code>\</code> and <code>n</code> any special meaning (in contrast to normal strings). </p>
<p>However, those two symbols have a special meaning for the regex engine! So if you use them as a regular expression pattern, they will indeed match the newline character:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> text = '''This
is
a
multiline
string'''
>>> re.findall(r'[a-z]+\n', text)
['his\n', 'is\n', 'a\n', 'multiline\n']</pre>
<p>Therefore, you don’t need to escape the newline character again to match it in a given string.</p>
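<p>A quick way to convince yourself is to compare the raw and the normal pattern, and to see what it takes to match a literal backslash followed by <code>n</code> instead. The sample strings are assumptions for this sketch:</p>

```python
import re

text = 'line one\nline two'

# Raw pattern: the regex engine translates \n into a newline.
with_raw = re.findall(r'\n', text)

# Normal pattern: Python already translated \n into a newline.
with_normal = re.findall('\n', text)

# To match the two characters backslash and n, escape the backslash itself.
literal = re.findall(r'\\n', r'a\nb')

print(with_raw == with_normal)  # True
print(literal)                  # ['\\n']
```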
<h2>Python re.escape Method</h2>
<p>If you know that your string has a lot of special characters, you can also use the convenience method <code>re.escape(pattern)</code> from Python’s <a rel="noreferrer noopener" aria-label="re package. (opens in a new tab)" href="https://blog.finxter.com/python-regex/" target="_blank">re module. </a></p>
<p><strong>Specification</strong>: <code>re.escape(pattern)</code></p>
<p><strong>Definition</strong>: escapes all special regex meta characters in the given <code>pattern</code>. </p>
<p><strong>Example</strong>: you can escape all special symbols in one go:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.escape('https://www.finxter.com/')
'https://www\\.finxter\\.com/'</pre>
<p>The dot is the only character with a special regex meaning in the string <code>'https://www.finxter.com/'</code>. Therefore, only the two dots are escaped and all other characters stay unchanged.</p>
<p>Note that, since Python 3.7, “only characters that can have special meaning in a regular expression are escaped. As a result, <code>'!'</code>, <code>'"'</code>, <code>'%'</code>, <code>"'"</code>, <code>','</code>, <code>'/'</code>, <code>':'</code>, <code>';'</code>, <code>'<'</code>, <code>'='</code>, <code>'>'</code>, <code>'@'</code>, and <code>"`"</code> are no longer escaped” (<a href="https://docs.python.org/3/library/re.html">source</a>).</p>
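<p>A typical use of <code>re.escape()</code> is searching for a piece of text that should be taken literally, for example user input. The following sketch uses an invented search string and sentence; the exact escaped form may differ between Python versions, so only the search results are shown:</p>

```python
import re

text = 'I say 1+1=2 (really?) every day'
needle = '1+1=2 (really?)'

# Used directly, the needle is read as a regex (+, ?, and the group) and finds nothing.
unescaped = re.findall(needle, text)

# After re.escape(), every special character is matched literally.
escaped = re.findall(re.escape(needle), text)

print(unescaped)  # []
print(escaped)    # ['1+1=2 (really?)']
```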
<h2>Python Regex Bad Escape</h2>
<p>There are some common errors in relation to escaping in Python regular expressions.</p>
<p>If you try to escape an ASCII letter that doesn’t form a known escape sequence, Python will raise a “bad escape” error: </p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> re.findall('\m', 'hello {world}')
Traceback (most recent call last):
  File "<pyshell#61>", line 1, in <module>
    re.findall('\m', 'hello {world}')
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\re.py", line 223, in findall
    return _compile(pattern, flags).findall(string)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 930, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 507, in _parse
    code = _escape(source, this, state)
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 402, in _escape
    raise source.error("bad escape %s" % escape, len(escape))
re.error: bad escape \m at position 0</pre>
<p>As the error message suggests, there’s no escape sequence <code>\m</code> so you need to get rid of it to avoid the error.</p>
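<p>If a pattern comes from an untrusted or generated source, one way to handle this is to catch <code>re.error</code> when compiling. The helper <code>compiles()</code> below is a hypothetical name introduced only for this sketch; it assumes a Python version in which unknown letter escapes are errors, as in the traceback above:</p>

```python
import re

def compiles(pattern):
    """Hypothetical helper: True if the pattern compiles, False on a regex error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

letter_escape = compiles(r'\m')  # unknown escape of an ASCII letter
punct_escape = compiles(r'\_')   # escaped punctuation is accepted
plain_letter = compiles(r'm')    # the fix: drop the backslash

print(letter_escape, punct_escape, plain_letter)  # False True True
```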
</div>
https://www.sickgaming.net/blog/2020/02/...re-escape/