{"id":109684,"date":"2020-02-27T16:56:58","date_gmt":"2020-02-27T16:56:58","guid":{"rendered":"https:\/\/blog.finxter.com\/?p=6430"},"modified":"2020-02-27T16:56:58","modified_gmt":"2020-02-27T16:56:58","slug":"python-re-escape","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2020\/02\/27\/python-re-escape\/","title":{"rendered":"Python Re Escape"},"content":{"rendered":"<p>I don&#8217;t know how often I sat in front of my computer, writing regular expressions and wondering: how to escape this or that character? The problem is that some special characters have a special meaning in Python strings and regular expressions. If you want to remove the special meaning, you need to escape the characters with an additional backslash.<\/p>\n<p>If you have this problem too, you&#8217;re in luck. This article is the ultimate guide to escape special characters in Python. Just click on the topic that interests you and learn how to escape the special character you&#8217;re currently struggling with!<\/p>\n<p>If you&#8217;re the impatient guy, you&#8217;re in luck too. Just try to add the backslash to your special character you want to escape: <code>\\x<\/code> to escape special character <code>x<\/code>.<\/p>\n<p>Here are a few examples:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\( \\{ \\\" \\. \\* \\+', r'( { \" . * +')\n['( { \" . 
* +']<\/pre>\n<p>You can also watch the following video where I give you a quick example:<\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-rich is-provider-embed-handler wp-embed-aspect-16-9 wp-has-aspect-ratio\">\n<div class=\"wp-block-embed__wrapper\">\n<div class=\"ast-oembed-container\"><iframe loading=\"lazy\" title=\"Python Regex - How to Escape Special Characters?\" width=\"1100\" height=\"619\" src=\"https:\/\/www.youtube.com\/embed\/F6LY3qo3J9c?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/div>\n<\/p><\/div>\n<\/figure>\n<h2>Python Regex Escape Characters<\/h2>\n<p>If you use special characters in strings, they carry a special meaning. Sometimes you don&#8217;t need that. The general idea is to escape the special character <code>x<\/code> with an additional backslash <code>\\x<\/code> to get rid of the special meaning.<\/p>\n<p>In the following, I show how to escape all possible special characters for Python strings and regular expressions:<\/p>\n<h3>Python Regex Escape Parentheses ()<\/h3>\n<p>How to escape the parentheses <code>(<\/code> and <code>)<\/code> in Python regular expressions?<\/p>\n<p>Parentheses have a special meaning in Python regular expressions: they open and close <a href=\"https:\/\/blog.finxter.com\/python-re-groups\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"matching groups (opens in a new tab)\">matching groups<\/a>. <\/p>\n<p>You can get rid of the special meaning of parentheses by using the backslash prefix: <code>\\(<\/code> and <code>\\)<\/code>. This way, you can match the parentheses characters in a given string. 
Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall(r'\\(.*\\)', 'Python is (really) great')\n['(really)']<\/pre>\n<p>The result shows a string that contains the &#8220;special&#8221; characters <code>'('<\/code> and <code>')'<\/code>. <\/p>\n<h3>Python Regex Escape Square Brackets []<\/h3>\n<p>How to escape the square brackets <code>[<\/code> and <code>]<\/code> in Python regular expressions?<\/p>\n<p>Square brackets have a special meaning in Python regular expressions: they open and close <a rel=\"noreferrer noopener\" aria-label=\"matching groups (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-re-groups\/\" target=\"_blank\">character sets<\/a>. <\/p>\n<p>You can get rid of the special meaning of brackets by using the backslash prefix: <code>\\[<\/code> and <code>\\]<\/code>. This way, you can match the bracket characters in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall(r'\\[.*\\]', 'Is Python [really] easy?')\n['[really]']<\/pre>\n<p>The result shows a string that contains the &#8220;special&#8221; characters <code>'['<\/code> and <code>']'<\/code>. <\/p>\n<h3>Python Regex Escape Curly Brace (Brackets)<\/h3>\n<p>How to escape the curly braces <code>{<\/code> and <code>}<\/code> in Python regular expressions?<\/p>\n<p>Curly braces have a special meaning in Python regular expressions only when they form a quantifier such as <code>{3}<\/code> or <code>{2,5}<\/code> that repeats the preceding regex. In any other position, the regex engine treats them as literal characters. Therefore, in patterns like the ones below, you don&#8217;t need to escape them with a leading backslash character <code>\\<\/code>. 
However, you can do so if you wish as you see in the following example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall(r'\\{.*\\}', 'if (2==2) { y = 3; }')\n['{ y = 3; }']\n>>> re.findall(r'{.*}', 'if (2==2) { y = 3; }')\n['{ y = 3; }']\n>>> re.findall('{.*}', 'if (2==2) { y = 3; }')\n['{ y = 3; }']<\/pre>\n<p>All three cases match the same string enclosed in curly braces&#8212;even though we did not escape them and didn&#8217;t use the raw string <code>r''<\/code> in the third example. <\/p>\n<h3>Python Regex Escape Slash (Backslash and Forward-Slash)<\/h3>\n<p>How to escape the slash characters&#8212;backslash <code>\\<\/code> and forward-slash <code>\/<\/code>&#8212;in Python regular expressions?<\/p>\n<p>The backslash has a special meaning in Python regular expressions: it escapes special characters and, thus, removes the special meaning. (How meta.)<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall(r'\\\\...', r'C:\\home\\usr\\dir\\hello\\world')\n['\\\\hom', '\\\\usr', '\\\\dir', '\\\\hel', '\\\\wor']<\/pre>\n<p>You can see that the resulting matches have escaped backslashes themselves. This is because the backslash character has a special meaning in normal strings. Thus, the Python interpreter escapes it automatically by itself when printing it on the shell. Note that you didn&#8217;t need to escape the backslash characters when writing the raw string <code>r'C:\\home\\usr\\dir\\hello\\world'<\/code> because the raw string already removes all the special meaning from the backslashed characters. 
But if you don&#8217;t want to use a raw string but a normal string, you need to escape the backslash character yourself:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall(r'\\\\...', 'C:\\\\home\\\\usr\\\\dir\\\\hello\\\\world')\n['\\\\hom', '\\\\usr', '\\\\dir', '\\\\hel', '\\\\wor']<\/pre>\n<p>In contrast to the backslash, the forward-slash doesn&#8217;t need to be escaped. Why? Because it doesn&#8217;t have a special meaning in Python strings and regular expressions. You can see this in the following example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\/...', '\/home\/usr\/dir\/hello\/world')\n['\/hom', '\/usr', '\/dir', '\/hel', '\/wor']<\/pre>\n<p>The result shows that even in a non-raw string, you can use the forward-slash without a leading escape character.<\/p>\n<h3>Python Regex Escape String Single Quotes<\/h3>\n<p>How to escape the single quotes <code>'<\/code> in Python regular expressions?<\/p>\n<p>Single quotes have no special meaning for the regex engine itself, but they do in Python source code: they open and close <a href=\"https:\/\/blog.finxter.com\/python-string-replace\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"strings (opens in a new tab)\">strings<\/a>. <\/p>\n<p>Inside a single-quoted string literal, you can get rid of this special meaning by using the backslash prefix: <code>\\'<\/code>. This way, you can match the single quote characters in a given string. 
Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\'.*\\'', \"hello 'world'\")\n[\"'world'\"]<\/pre>\n<p>The result shows a string that contains the &#8220;special&#8221; single quote characters. The example also shows an alternative that removes the special meaning of the single quotes: enclose them in double quotes: <code>\"hello 'world'\"<\/code>. <\/p>\n<h3>Python Regex Escape String Double Quotes<\/h3>\n<p>How to escape the double quotes <code>\"<\/code> in Python regular expressions?<\/p>\n<p>Double quotes have no special meaning for the regex engine itself, but they do in Python source code: they open and close <a rel=\"noreferrer noopener\" aria-label=\"strings (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-string-replace\/\" target=\"_blank\">strings<\/a>. <\/p>\n<p>You can get rid of the special meaning of double quotes by using the backslash prefix: <code>\\\"<\/code>. This way, you can match the double quote characters in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\\".*\\\"', 'hello \"world\"')\n['\"world\"']<\/pre>\n<p>The result shows a string that contains the &#8220;special&#8221; double quote characters. The example also shows an alternative that removes the special meaning of the double quotes: enclose them in single quotes: <code>'hello \"world\"'<\/code>. 
<\/p>\n<h3>Python Regex Escape Dot (Period)<\/h3>\n<p>How to escape the regex dot (or <em>period<\/em>) meta character <code>.<\/code> in Python regular expressions?<\/p>\n<p>The <a href=\"https:\/\/blog.finxter.com\/python-re-dot\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"dot character (opens in a new tab)\">dot character<\/a> has a special meaning in Python regular expressions: it matches an arbitrary character (except newline). <\/p>\n<p>You can get rid of the special meaning of the dot character by using the backslash prefix: <code>\\.<\/code>. This way, you can match the dot character in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('..\\.', 'my. name. is. python.')\n['my.', 'me.', 'is.', 'on.']<\/pre>\n<p>The result shows four strings that contain the &#8220;special&#8221; characters <code>'.'<\/code>. <\/p>\n<h3>Python Regex Escape Plus<\/h3>\n<p>How to escape the plus symbol <code>+<\/code> in Python regular expressions?<\/p>\n<p>The <a href=\"https:\/\/blog.finxter.com\/python-re-plus\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"plus symbol (opens in a new tab)\">plus symbol<\/a> has a special meaning in Python regular expressions: it&#8217;s the one-or-more quantifier of the preceding regex. <\/p>\n<p>You can get rid of the special meaning of the regex plus symbol by using the backslash prefix: <code>\\+<\/code>. This way, you can match the plus symbol characters in a given string. 
Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\++', '+++python+++rocks')\n['+++', '+++']<\/pre>\n<p>The result shows both usages: the plus symbol with and without leading escape character. If it is escaped <code>\\+<\/code>, it matches the raw plus character. If it isn&#8217;t escaped <code>+<\/code>, it quantifies the regex pattern just in front of it (in our case the plus symbol itself).<\/p>\n<h3>Python Regex Escape Asterisk<\/h3>\n<p>How to escape the asterisk symbol <code>*<\/code> in Python regular expressions?<\/p>\n<p>The <a rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-re-asterisk\/\" target=\"_blank\">asterisk symbol<\/a> has a special meaning in Python regular expressions: it&#8217;s the zero-or-more quantifier of the preceding regex. <\/p>\n<p>You can get rid of the special meaning of the regex asterisk symbol by using the backslash prefix: <code>\\*<\/code>. This way, you can match the asterisk symbol characters in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\**', '***python***rocks')\n['***', '', '', '', '', '', '', '***', '', '', '', '', '', '']<\/pre>\n<p>The pattern shows both usages: the asterisk symbol with and without leading escape character. Because the unescaped <code>*<\/code> also allows zero repetitions, the result contains an empty string for every position where no asterisk starts, in addition to the two <code>'***'<\/code> matches. If it is escaped <code>\\*<\/code>, it matches the raw asterisk character. 
If it isn&#8217;t escaped <code>*<\/code>, it quantifies the regex pattern just in front of it (in our case the asterisk symbol itself).<\/p>\n<h3>Python Regex Escape Question Mark<\/h3>\n<p>How to escape the question mark symbol <code>?<\/code> in Python regular expressions?<\/p>\n<p>The <a rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-re-question-mark\/\" target=\"_blank\">question mark symbol<\/a> has a special meaning in Python regular expressions: it&#8217;s the zero-or-one quantifier of the preceding regex. <\/p>\n<p>You can get rid of the special meaning of the question mark symbol by using the backslash prefix: <code>\\?<\/code>. This way, you can match the question mark symbol characters in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('...\\?', 'how are you?')\n['you?']<\/pre>\n<p>The result shows that the question mark symbol was matched in the given string.<\/p>\n<h3>Python Regex Escape Underscore<\/h3>\n<p>How to escape the underscore character <code>_<\/code> in Python regular expressions?<\/p>\n<p>The <a href=\"https:\/\/blog.finxter.com\/underscore-in-python\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"underscore (opens in a new tab)\">underscore <\/a>doesn&#8217;t have a special meaning in Python regular expressions or Python strings. <\/p>\n<p>Therefore, you don&#8217;t need to escape the underscore character&#8212;just use it in your regular expression unescaped. 
<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('..._', 'i_use_underscore_not_whitespace')\n['use_', 'ore_', 'not_']<\/pre>\n<p>However, it does no harm to escape it either:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall('...\\_', 'i_use_underscore_not_whitespace')\n['use_', 'ore_', 'not_']<\/pre>\n<p>In both cases, Python finds the underscore characters in the string and matches them in the result.<\/p>\n<h3>Python Regex Escape Pipe<\/h3>\n<p>How to escape the pipe symbol <code>|<\/code> (vertical line) in Python regular expressions?<\/p>\n<p>The pipe symbol has a special meaning in Python regular expressions: the<a href=\"https:\/\/blog.finxter.com\/python-regex-or\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\" regex OR operator (opens in a new tab)\"> regex OR operator<\/a>.<\/p>\n<p>You can get rid of the special meaning of the pipe symbol by using the backslash prefix: <code>\\|<\/code>. This way, you can match the pipe symbol in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('.\\|.', 'a|b|c|d|e')\n['a|b', 'c|d']<\/pre>\n<p>By escaping the pipe symbol, you get rid of the special meaning. The result is just the matched pipe symbol with one arbitrary character before and after it. 
<\/p>\n<p>If you don&#8217;t escape the pipe symbol, the result will be quite different:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall('.|.', 'a|b|c|d|e')\n['a', '|', 'b', '|', 'c', '|', 'd', '|', 'e']<\/pre>\n<p>In this case, the regex <code>.|.<\/code> matches<em> &#8220;an arbitrary character or an arbitrary character&#8221;<\/em>&#8212;quite meaningless!<\/p>\n<h3>Python Regex Escape Dollar<\/h3>\n<p>How to escape the dollar symbol <code>$<\/code> in Python regular expressions?<\/p>\n<p>The <a href=\"https:\/\/blog.finxter.com\/python-regex-start-of-line-and-end-of-line\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"dollar symbol (opens in a new tab)\">dollar symbol<\/a> has a special meaning in Python regular expressions: it matches at the end of the string. <\/p>\n<p>You can get rid of the special meaning by using the backslash prefix: <code>\\$<\/code>. This way, you can match the dollar symbol in a given string. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('\\$\\d+', 'Your house is worth $1000000')\n['$1000000']<\/pre>\n<p>Note that the <code>\\d+<\/code> regex matches one or more numerical digits between 0 and 9. <\/p>\n<h3>Python Regex Escape Greater Than and Smaller Than<\/h3>\n<p>How to escape the smaller than <code>&lt;<\/code> and greater than <code>&gt;<\/code> symbols in Python regular expressions?<\/p>\n<p>Greater and smaller than symbols don&#8217;t have a special meaning in Python regular expressions. 
Therefore, you don&#8217;t need to escape them.<\/p>\n<p>Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('&lt;.*>.*&lt;.*>', '&lt;div>hello world&lt;\/div>')\n['&lt;div>hello world&lt;\/div>']<\/pre>\n<p>The result shows that, even without escaping the HTML tag symbols, the regex matches the whole string. <\/p>\n<h3>Python Regex Escape Hyphen<\/h3>\n<p>How to escape the hyphen <code>-<\/code> in Python regular expressions?<\/p>\n<p><strong>Outside<\/strong> a <a rel=\"noreferrer noopener\" aria-label=\"character class (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-character-set-regex-tutorial\/\" target=\"_blank\">character set<\/a>, the hyphen doesn&#8217;t have a special meaning and you don&#8217;t need to escape it. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> re.findall('..-', 'this is-me')\n['is-']<\/pre>\n<p>The unescaped hyphen character in the regex matches the hyphen in the string.<\/p>\n<p>However, <strong>inside <\/strong>a <a rel=\"noreferrer noopener\" aria-label=\"character set (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-character-set-regex-tutorial\/\" target=\"_blank\">character set<\/a>, the hyphen stands for the range symbol (e.g. 
<code>[0-9]<\/code>) so you need to escape it if you want to get rid of its special meaning and match the hyphen symbol itself. Here&#8217;s an example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall('[a-z\\-]+', 'hello-world is one word')\n['hello-world', 'is', 'one', 'word']<\/pre>\n<p>Note that, in this case, if you don&#8217;t escape the hyphen in the character set, you get the same result:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall('[a-z-]+', 'hello-world is one word')\n['hello-world', 'is', 'one', 'word']<\/pre>\n<p>The reason is that the hyphen appears at the end of the character set where it can have only one meaning: the hyphen symbol itself. However, between two characters, as in <code>[a-z]<\/code>, the hyphen is read as the range operator, which can lead to unexpected matches. A good practice is, thus, to escape the hyphen in the character class by default.<\/p>\n<h3>Python Regex Escape Newline<\/h3>\n<p>In a recent <a href=\"https:\/\/stackoverflow.com\/questions\/14689531\/how-to-match-a-new-line-character-in-python-raw-string\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"StackOverflow (opens in a new tab)\">StackOverflow<\/a> article, I read the following question:<\/p>\n<p><em>I got a little confused about Python raw string. I know that if we use raw string, then it will treat <code>'\\'<\/code> as a normal backslash (ex. <code>r'\\n'<\/code> would be <code>'\\'<\/code> and <code>'n'<\/code>). However, I was wondering what if I want to match a new line character in raw string. I tried <code>r'\\n'<\/code>, but it didn&#8217;t work. 
Anybody has some good idea about this?<\/em><\/p>\n<p>The coder asking the question has understood that the Python interpreter doesn&#8217;t assume that the two characters <code>\\<\/code> and <code>n<\/code> do have any special meaning in raw strings (in contrast to normal strings). <\/p>\n<p>However, those two symbols have a special meaning for the regex engine! So if you use them as a regular expression pattern, they will indeed match the newline character:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> import re\n>>> text = '''This\nis\na\nmultiline\nstring'''\n>>> re.findall(r'[a-z]+\\n', text)\n['his\\n', 'is\\n', 'a\\n', 'multiline\\n']<\/pre>\n<p>Therefore, you don&#8217;t need to escape the newline character again to match it in a given string.<\/p>\n<h2>Python re.escape Method<\/h2>\n<p>If you know that your string has a lot of special characters, you can also use the convenience method <code>re.escape(pattern)<\/code> from Python&#8217;s <a rel=\"noreferrer noopener\" aria-label=\"re package. (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/python-regex\/\" target=\"_blank\">re module. <\/a><\/p>\n<p><strong>Specification<\/strong>: <code>re.escape(pattern)<\/code><\/p>\n<p><strong>Definition<\/strong>: escapes all special regex meta characters in the given <code>pattern<\/code>. 
<\/p>\n<p><strong>Example<\/strong>: you can escape all special symbols in one go:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.escape('https:\/\/www.finxter.com\/')\n'https:\/\/www\\\\.finxter\\\\.com\/'<\/pre>\n<p>The dot symbol has a special meaning in the string <code>'https:\/\/www.finxter.com\/'<\/code>. There are no other special symbols. Therefore, only the two dots are escaped.<\/p>\n<p>Note that &#8220;only characters that can have special meaning in a regular expression are escaped. As a result, <code>'!'<\/code>, <code>'\"'<\/code>, <code>'%'<\/code>, <code>\"'\"<\/code>, <code>','<\/code>, <code>'\/'<\/code>, <code>':'<\/code>, <code>';'<\/code>, <code>'&lt;'<\/code>, <code>'='<\/code>, <code>'&gt;'<\/code>, <code>'@'<\/code>, and <code>\"`\"<\/code> are no longer escaped&#8221; (<a href=\"https:\/\/docs.python.org\/3\/library\/re.html\">source<\/a>).<\/p>\n<h2>Python Regex Bad Escape<\/h2>\n<p>There are some common errors in relation to escaping in Python regular expressions.<\/p>\n<p>If you put a backslash in front of an ASCII letter that has no defined escape sequence, Python will throw a &#8220;bad escape&#8221; error: <\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">>>> re.findall('\\m', 'hello {world}')\nTraceback (most recent call last):\n  File \"&lt;pyshell#61>\", line 1, in &lt;module>\n    re.findall('\\m', 'hello {world}')\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\re.py\", line 223, in findall\n    return _compile(pattern, flags).findall(string)\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\re.py\", line 286, in 
_compile\n    p = sre_compile.compile(pattern, flags)\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\sre_compile.py\", line 764, in compile\n    p = sre_parse.parse(p, flags)\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\sre_parse.py\", line 930, in parse\n    p = _parse_sub(source, pattern, flags &amp; SRE_FLAG_VERBOSE, 0)\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\sre_parse.py\", line 426, in _parse_sub\n    not nested and not items))\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\sre_parse.py\", line 507, in _parse\n    code = _escape(source, this, state)\n  File \"C:\\Users\\xcent\\AppData\\Local\\Programs\\Python\\Python37\\lib\\sre_parse.py\", line 402, in _escape\n    raise source.error(\"bad escape %s\" % escape, len(escape))\nre.error: bad escape \\m at position 0<\/pre>\n<p>As the error message suggests, there&#8217;s no escape sequence <code>\\m<\/code>, so you need to remove the backslash to avoid the error.<\/p>\n<h2>Where to Go From Here<\/h2>\n<p>Wow, you either have read about a lot of escaped character sequences or you did a lot of scrolling to reach this point.<\/p>\n<p>In both cases, you have a great advantage over other coders: you&#8217;re a persistent guy or gal!<\/p>\n<p>Do you want to increase your advantage over your peers? Then join my <a rel=\"noreferrer noopener\" aria-label=\"Python email academy (opens in a new tab)\" href=\"https:\/\/blog.finxter.com\/subscribe\/\" target=\"_blank\">Python email academy<\/a>! I&#8217;ll teach you the ins and outs of Python coding&#8212;all free!<\/p>\n<p><a href=\"https:\/\/blog.finxter.com\/subscribe\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"Join Finxter Email Academy, become a better coder, and download your free Python cheat sheets! 
(opens in a new tab)\">Join Finxter Email Academy, become a better coder, and download your free Python cheat sheets!<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I don&#8217;t know how often I sat in front of my computer, writing regular expressions and wondering: how to escape this or that character? The problem is that some special characters have a special meaning in Python strings and regular expressions. If you want to remove the special meaning, you need to escape the characters [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[857],"tags":[73,468,528],"class_list":["post-109684","post","type-post","status-publish","format-standard","hentry","category-python-tut","tag-programming","tag-python","tag-tutorial"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/109684","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=109684"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/109684\/revisions"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=109684"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=109684"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=109684"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}