[Tut] How To Extract Numbers From A String In Python?


<div>
<p class="has-global-color-8-background-color has-background">The easiest way to extract numbers from a Python string <code>s</code> is to use the expression <code>re.findall(r'\d+', s)</code>. For example, <code>re.findall(r'\d+', 'hi 100 alice 18 old 42')</code> yields the list of strings <code>['100', '18', '42']</code>, which you can then convert to numbers using <code>int()</code> or <code>float()</code>.</p>
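<p>As a minimal sketch of that one-liner, the snippet below finds the digit runs and converts them with <code>int()</code>. The variable names are chosen here for illustration only.</p>

```python
import re

text = 'hi 100 alice 18 old 42'

# A raw string (r'...') passes \d to the regex engine unchanged.
digit_runs = re.findall(r'\d+', text)

# findall() returns strings, so convert each match explicitly.
numbers = [int(run) for run in digit_runs]

print(digit_runs)  # ['100', '18', '42']
print(numbers)     # [100, 18, 42]
```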
<p>There are some tricks and alternatives, so keep reading to learn about them. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f447.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<p>In particular, you’ll learn about the following methods to extract numbers from a given string in Python:</p>
<ul>
<li>Use the <code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex/" target="_blank">re</a></code> (regular expression) module.</li>
<li>Use <code><a href="https://blog.finxter.com/python-string-split/" data-type="post" data-id="26097">split()</a></code> and <code><a href="https://blog.finxter.com/python-list-append/" data-type="post" data-id="6605" target="_blank" rel="noreferrer noopener">append()</a></code> functions on a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-list-methods/" target="_blank">list</a>.</li>
<li>Use a <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/#:~:text=%E2%80%9CA%20list%20comprehension%20consists%20of,if%20clauses%20which%20follow%20it.%E2%80%9D&amp;text=In%20other%20words%2C%20here%20is%20the%20formula%20for%20list%20comprehension." target="_blank">List Comprehension</a> with <code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-isdigit/" data-type="post" data-id="26044" target="_blank">isdigit()</a></code> and <code><a href="https://blog.finxter.com/python-string-split/" data-type="post" data-id="26097" target="_blank" rel="noreferrer noopener">split()</a></code> functions.</li>
<li>Use the <code>nums_from_string</code> module.</li>
</ul>
<h2>Problem Formulation</h2>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="963" height="587" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-346.png" alt="" class="wp-image-1160759" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-346.png 963w, https://blog.finxter.com/wp-content/uplo...00x183.png 300w, https://blog.finxter.com/wp-content/uplo...68x468.png 768w" sizes="(max-width: 963px) 100vw, 963px" /></figure>
</div>
<p><em>Extracting digits or numbers from a given string might come up in your coding journey quite often. For instance, you may want to extract certain numerical figures from a CSV file, or you need to separate complex digits and figures from given patterns. </em></p>
<p>Having said that, let us dive into our mission-critical question: </p>
<p><strong>Problem: </strong>Given a string, how do you extract the numbers it contains in Python?</p>
<p><strong>Example: </strong>Consider that you have been given a string and you want to extract all the numbers from the string as given in the following example:</p>
<p>Given is the following string: </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">s = 'Extract 100, 1000 and 10000 from this string'</pre>
<p>This is your desired output:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[100, 1000, 10000]</pre>
<p>Let us discuss the methods that we can use to extract the numbers from the given string: </p>
<h2>Method 1: Using Regex Module</h2>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="588" height="879" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-347.png" alt="" class="wp-image-1160760" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-347.png 588w, https://blog.finxter.com/wp-content/uplo...01x300.png 201w" sizes="(max-width: 588px) 100vw, 588px" /></figure>
</div>
<p>One of the most flexible approaches to solving our problem is to leverage the power of the <code>re</code> module. You can use <a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex/" data-type="post" data-id="6210" target="_blank">Regular Expressions</a> (regex) to check whether a given string contains a specified pattern (be it a digit, a special character, or any other pattern). </p>
<p>Thus, to solve our problem, we import the <code>re</code> module, which is part of Python’s standard library, and then use its <code>findall()</code> function to extract the numbers from the given string. </p>
<p class="has-base-background-color has-background">◈ <strong>Learn More</strong>: <code>re.findall()</code> is an easy-to-use regex function that returns a list containing all matches. To learn more about <code>re.findall()</code> check out our <a rel="noreferrer noopener" title="Python re.findall() – Everything You Need to Know" href="https://blog.finxter.com/python-re-findall/" target="_blank">blog tutorial here.</a></p>
<p>Let us have a look at the following code to understand how we can use the <code>re</code> module to solve our problem:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

sentence = 'Extract 100 , 100.45 and 10000 from this string'
s = [float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)]
print(s)</pre>
<p><strong>Output</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[100.0, 100.45, 10000.0]</pre>
<p>This code uses the <code>re</code> module, which provides regular-expression support in Python, to extract numerical values from a string. </p>
<p><strong>Code explanation:</strong> <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f447.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<p>The line <code>s = [float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)]</code> uses the <code>re.findall()</code> function from the <code>re</code> module to search the <code>sentence</code> string for numerical values. </p>
<p>Specifically, it looks for substrings that match the regular expression pattern <code>r'-?\d+\.?\d*'</code>. This pattern matches an optional minus sign, followed by one or more digits, followed by an optional decimal point, followed by zero or more digits. </p>
<p>The <code>re.findall()</code> function returns a list of all the matching strings.</p>
<p>The list comprehension <code>[float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)]</code> takes the list of matching strings returned by <code>findall</code> and converts each string to a floating-point number using the <code>float()</code> function. This resulting list of floating-point numbers is then assigned to the variable <code>s</code>.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f449.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: <a href="https://blog.finxter.com/list-comprehension/" data-type="post" data-id="1171" target="_blank" rel="noreferrer noopener">Python List Comprehension</a></p>
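<p>To see what this pattern does and does not match, here is a small sketch. The sample string and the stricter alternative pattern <code>r'-?\d+(?:\.\d+)?'</code> are assumptions added for illustration; they are not part of the solution above.</p>

```python
import re

# Hypothetical sample: a negative float, a number followed by a
# sentence-ending dot, and a plain integer.
text = 'temp -3.5 and 7. then 42'

# Pattern from Method 1: the dot and the fractional digits are each
# optional, so a trailing dot is captured as part of the number.
loose = re.findall(r'-?\d+\.?\d*', text)

# Assumed stricter variant: the dot is matched only when digits follow it.
strict = re.findall(r'-?\d+(?:\.\d+)?', text)

print(loose)                      # ['-3.5', '7.', '42']
print(strict)                     # ['-3.5', '7', '42']
print([float(m) for m in loose])  # [-3.5, 7.0, 42.0]
```

<p>Both variants convert cleanly with <code>float()</code>, since <code>float('7.')</code> is valid. Which one fits better depends on whether a trailing dot in your data is part of the number or just punctuation.</p>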
<h2>Method 2: Split and Append The Numbers To A List using split() and append()</h2>
<p>Another way to solve our problem is to split the given string using the <code>split()</code> function, try to convert each piece to a number using the built-in <code>float()</code> function, and append every successfully converted number to a <a href="https://blog.finxter.com/python-lists/" target="_blank" rel="noreferrer noopener" title="The Ultimate Guide to Python Lists">list</a>. </p>
<p><strong>Note:</strong></p>
<ul>
<li><code><a href="https://blog.finxter.com/how-to-split-a-list-into-evenly-sized-chunks/" target="_blank" rel="noreferrer noopener" title="How to Split a List Into Evenly-Sized Chunks?">split()</a></code> is a built-in Python string method that splits a string into a list of substrings.</li>
<li><code><a href="https://blog.finxter.com/python-list-append/" title="Python List append() Method">append()</a></code> is a built-in Python list method that adds an item to the end of a list. </li>
</ul>
<p>Now that we have the necessary tools to solve our problem, let us dive into the code to see how it works:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sentence = 'Extract 100 , 100.45 and 10000 from this string'

s = []
for t in sentence.split():
    try:
        s.append(float(t))
    except ValueError:
        pass
print(s)</pre>
<p><strong>Output</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[100.0, 100.45, 10000.0]</pre>
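<p>One thing to keep in mind with this approach is that <code>float()</code> must accept the whole token, so a number glued to punctuation is skipped. Here is a minimal sketch using the string from the problem formulation. The helper name <code>floats_from_tokens</code> and the idea of stripping a fixed set of punctuation characters are assumptions added for illustration.</p>

```python
sentence = 'Extract 100, 1000 and 10000 from this string'

def floats_from_tokens(text, strip_chars=''):
    """Collect every whitespace-separated token that float() accepts.

    Hypothetical helper: strip_chars lists characters to remove from
    both ends of each token before conversion (none by default).
    """
    found = []
    for token in text.split():
        try:
            found.append(float(token.strip(strip_chars)))
        except ValueError:
            pass  # not a number, skip it
    return found

plain = floats_from_tokens(sentence)
stripped = floats_from_tokens(sentence, strip_chars=',.;:!?')

print(plain)     # [1000.0, 10000.0]  ('100,' is rejected by float())
print(stripped)  # [100.0, 1000.0, 10000.0]
```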
<h2>Method 3: Using isdigit() Function In A List Comprehension</h2>
<p>Another approach to solving our problem is to use the built-in string method <code>isdigit()</code> to pick out the pieces of the string that consist only of digits and then store them in a list using a <a href="https://blog.finxter.com/how-does-nested-list-comprehension-work-in-python/">list comprehension</a>. </p>
<p>The <code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-isdigit/" data-type="post" data-id="26044" target="_blank">isdigit()</a></code> method checks whether a string consists entirely of digits. It returns <code>True</code> only if the string is non-empty and every character in it is a digit. Otherwise, it returns <code>False</code>.</p>
<p>Let us have a look at the code given below to see how the above concept works:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="2" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sentence = 'Extract 100 , 100.45 and 10000 from this string'
s = [int(t) for t in sentence.split() if t.isdigit()]
print(s)</pre>
<p><strong>Output</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[100, 10000]</pre>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2622.png" alt="☢" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Alert</strong>! This technique is best suited to extract only positive integers. It won’t work for negative integers, floats, or hexadecimal numbers. </p>
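<p>The following sketch illustrates why: <code>isdigit()</code> rejects any token that contains a sign, a decimal point, or a letter. The sample tokens and the extended sentence are made up for this demonstration.</p>

```python
tokens = ['100', '-5', '100.45', '0x1F', '']

# str.isdigit() is True only for non-empty strings made up entirely of digits.
checks = {token: token.isdigit() for token in tokens}
print(checks)
# {'100': True, '-5': False, '100.45': False, '0x1F': False, '': False}

sentence = 'Extract 100 , -5 , 100.45 and 10000 from this string'
integers = [int(token) for token in sentence.split() if token.isdigit()]
print(integers)  # [100, 10000]
```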
<h2>Method 4: Using Numbers from String Library</h2>
<p>This is a quick hack if you want to avoid spending time typing explicit code to extract numbers from a string. </p>
<p>You can import a library known as <code>nums_from_string</code> and then use it to extract numbers from a given string. It contains several regex rules with comprehensive coverage and can be a very useful tool for NLP researchers.</p>
<p>Since the <code>nums_from_string</code> library is not a part of the standard Python library, you have to install it before use. Use the following command to install this useful library:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">pip install nums_from_string</pre>
<p>The following program demonstrates the usage of <code>nums_from_string</code> :</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import nums_from_string

sentence = 'Extract 100 , 100.45 and 10000 from this string'
print(nums_from_string.get_nums(sentence))</pre>
<p><strong>Output</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">[100.0, 100.45, 10000.0]</pre>
<h2>Conclusion</h2>
<p>As the discussion above shows, there are several ways to extract numbers from a given string in Python. </p>
<p>My personal favorite, though, would certainly be the regex module <code>re</code>. </p>
<p>You might argue that other methods, such as the <code>isdigit()</code> and <code>split()</code> functions, provide simpler, more readable, and faster code. However, as mentioned earlier, <code>isdigit()</code> does not return negative numbers or floats (in reference to Method 3), and the <code>split()</code>-based approach does not work for floats that have no space between them and other characters, like <code>'25.50k'</code> (in reference to Method 2). </p>
<p>Furthermore, speed is rarely the deciding metric when it comes to tasks like log parsing. Now you see why regex is my personal favorite in this list of solutions. </p>
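<p>To make the comparison concrete, here is a small side-by-side sketch of the three built-in approaches on one tricky input. The sample string is an assumption chosen to contain a negative number and a float with a trailing letter.</p>

```python
import re

text = 'paid -12 for 25.50k items'

# Method 1: regular expression
by_regex = [float(m) for m in re.findall(r'-?\d+\.?\d*', text)]

# Method 2: split() plus float()
by_split = []
for token in text.split():
    try:
        by_split.append(float(token))
    except ValueError:
        pass  # '25.50k' is not a valid float literal

# Method 3: split() plus isdigit()
by_isdigit = [int(token) for token in text.split() if token.isdigit()]

print(by_regex)    # [-12.0, 25.5]
print(by_split)    # [-12.0]
print(by_isdigit)  # []
```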
<p>If you are not yet comfortable with the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/re.html" target="_blank"><code>re</code> library</a>, perhaps because you find it difficult to get a firm grip on regular expressions (just like me in the beginning), here’s <a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex/" target="_blank">THE TUTORIAL</a> for you to become a regex master. <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f92f.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f92f.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<p>I hope you found this article useful and added some value to your coding journey. Please <a rel="noreferrer noopener" href="https://blog.finxter.com/subscribe" target="_blank">stay tuned</a> for more interesting stuff in the future. </p>
</div>


https://www.sickgaming.net/blog/2023/02/...in-python/