[Tut] Python | Split String Multiple Whitespaces


<div>
<p class="has-background" style="background-color:#99fbfb"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f34e.png" alt="🍎" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Summary:</strong> The most efficient way to split a string at runs of whitespace is to call the <code>split</code> method with no arguments, like so: <code>given_string.split()</code>. An alternative approach is to use functions from the <code>re</code> (regular expression) module, such as <code>re.split</code>, <code>re.findall</code>, or <code>re.sub</code>.</p>
<h3><strong>Minimal Example:</strong></h3>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re

text = "mouse\nsnake\teagle human"
# Method 1: str.split() with no arguments
print(text.split())
# Method 2: re.split on runs of whitespace
res = re.split(r"\s+", text)
print(res)
# Method 3: re.sub each run to a comma, then split on the comma
res = re.sub(r'\s+', ',', text).split(',')
print(res)
# Method 4: re.findall on runs of non-whitespace
print(re.findall(r'\S+', text))
# Each method prints ['mouse', 'snake', 'eagle', 'human']</pre>
<h2><strong>Problem Formulation</strong></h2>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4dc.png" alt="📜" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Problem: </strong>Given a string, how do you split it wherever one or more whitespace characters (spaces, tabs, newlines, carriage returns) occur?</p>
<h3 class="has-large-font-size"><strong>Example</strong></h3>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Input
text = "abc\nlmn\tpqr xyz\rmno"
# Output
['abc', 'lmn', 'pqr', 'xyz', 'mno']</pre>
<hr class="wp-block-separator has-alpha-channel-opacity" />
<p>There are numerous ways of solving the given problem. So, without further ado, let us dive into the solutions. </p>
<h2>Method 1: Using <a href="https://blog.finxter.com/python-regex/" target="_blank" rel="noreferrer noopener">Regex</a></h2>
<p>Regular expressions are a flexible way to deal with multiple delimiters. Python’s <code>re</code> module provides several functions that you can use to split the given string. Let’s go through them one by one.</p>
<h3><strong>1.1 Using re.split</strong></h3>
<p>The&nbsp;<code>re.split(pattern, string)</code>&nbsp;method matches all occurrences of the&nbsp;<code>pattern</code>&nbsp;in the&nbsp;<code>string</code>&nbsp;and divides the string along the matches resulting in a list of strings&nbsp;<em>between&nbsp;</em>the matches. For example,&nbsp;<code>re.split('a', 'bbabbbab')</code>&nbsp;results in the list of strings&nbsp;<code>['bb', 'bbb', 'b']</code>.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4da.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Recommended Read:  <strong><a rel="noreferrer noopener" href="https://blog.finxter.com/python-regex-split/" target="_blank">Python Regex Split</a></strong>.</strong></p>
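<p><strong>Approach: </strong>To split the string at runs of whitespace, use <code>re.split(r"\s+", text)</code>. Here, <code>\s</code> is a special sequence that matches any single whitespace character, and the <code>+</code> quantifier extends the match to one or more of them, so each run of consecutive whitespace acts as one delimiter. The <code>r</code> prefix makes the pattern a raw string, so the backslash reaches the regex engine unchanged.</p>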
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.split(r"\s+", text)
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']</pre>
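<p>One detail worth checking with <code>re.split</code> is what happens when the string starts or ends with whitespace. The sketch below uses a hypothetical input, <code>padded</code>, that is not part of the original example, and shows two possible ways to avoid the empty strings that appear at the edges.</p>

```python
import re

# Hypothetical input with leading and trailing whitespace (not from the article).
padded = "  abc\nlmn\t pqr  "

# The pattern matches at both edges, so re.split reports an empty string there.
raw_parts = re.split(r"\s+", padded)
print(raw_parts)  # ['', 'abc', 'lmn', 'pqr', '']

# Option 1: strip the edges before splitting.
stripped_parts = re.split(r"\s+", padded.strip())

# Option 2: drop the empty strings afterwards.
filtered_parts = [part for part in raw_parts if part]

print(stripped_parts)  # ['abc', 'lmn', 'pqr']
print(filtered_parts)  # ['abc', 'lmn', 'pqr']
```

<p>Which option fits better depends on whether the empty edge entries carry any meaning for your data.</p>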
<h3><strong>1.2 Using re.findall</strong></h3>
<p>The&nbsp;<code>re.findall(pattern, string)</code>&nbsp;method scans the&nbsp;<code>string</code>&nbsp;from&nbsp;<strong>left to right</strong>, searching for all&nbsp;<strong>non-overlapping matches</strong>&nbsp;of the&nbsp;<code>pattern</code>. It returns a&nbsp;<strong>list of strings</strong>&nbsp;in the matching order when scanning the string from left to right.</p>
<p class="has-base-background-color has-background"><strong><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4da.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" />Recommended Read: <a rel="noreferrer noopener" href="https://blog.finxter.com/python-re-findall/" target="_blank">Python re.findall() – Everything You Need to Know</a></strong></p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
text = "abc\nlmn\tpqr xyz\rmno"
print(re.findall(r'\S+', text)) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']</pre>
<p><strong>Explanation: </strong>In the expression <code>re.findall(r'\S+', text)</code>, every run of non-whitespace characters is found and stored in a list. Here, <code>\S+</code> matches one or more consecutive characters that are not whitespace (letters, digits, punctuation, and so on), so the whitespace between the runs is skipped.</p>
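<p>Because <code>\S+</code> collects the tokens themselves rather than cutting at delimiters, whitespace at the edges of the string does not produce empty entries. The following sketch illustrates this with hypothetical inputs, and also uses <code>re.finditer</code> to show where each token sits in the string.</p>

```python
import re

# Hypothetical input with whitespace at both ends (not from the article).
padded = "\tabc\nlmn  pqr \r"

# Only the runs of non-whitespace characters are returned.
tokens = re.findall(r"\S+", padded)
print(tokens)  # ['abc', 'lmn', 'pqr']

# A string made only of whitespace contains no tokens at all.
blank_tokens = re.findall(r"\S+", " \t\n")
print(blank_tokens)  # []

# re.finditer yields match objects, which also expose the token positions.
spans = [(match.start(), match.end()) for match in re.finditer(r"\S+", padded)]
print(spans)  # [(1, 4), (5, 8), (10, 13)]
```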
<h3> <strong>1.3 Using re.sub</strong></h3>
<p>The regex function <code>re.sub(P, R, S)</code> replaces all occurrences of the pattern <code>P</code> with the replacement <code>R</code> in string <code>S</code>. It returns a new string. For example, if you call <code>re.sub('a', 'b', 'aabb')</code>, the result will be the new string <code>'bbbb'</code> with all characters <code>'a'</code> replaced by <code>'b'</code>.</p>
<p><strong>Approach: </strong>Use the <code>re.sub</code> method to replace every run of whitespace characters in the given string with a comma. The string then contains commas instead of whitespace, and you can split it with the normal string <code>split</code> method by passing the comma as the delimiter. Note that this only works as intended if the text itself contains no commas.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.sub(r'\s+', ',', text).split(',')
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']</pre>
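<p>The comma placeholder can misfire when the text already contains commas. The sketch below uses a hypothetical input to show the problem, followed by one possible variation (not taken from the original article) that collapses each run to a single space instead, a character that cannot occur anywhere else once all whitespace runs have been replaced.</p>

```python
import re

# Hypothetical input that already contains a comma (not from the article).
text = "red, green\tblue\nteal"

# With a comma placeholder, the original comma becomes an extra split point.
comma_parts = re.sub(r"\s+", ",", text).split(",")
print(comma_parts)  # ['red', '', 'green', 'blue', 'teal']

# Variation: collapse each whitespace run to one space, then split on that space.
# strip() removes edge whitespace so no empty strings appear at the ends.
space_parts = re.sub(r"\s+", " ", text.strip()).split(" ")
print(space_parts)  # ['red,', 'green', 'blue', 'teal']
```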
<hr class="wp-block-separator has-alpha-channel-opacity" />
<h2>Method 2: Using <a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-split/" target="_blank">split()</a></h2>
<p>When you call the <code>split</code> method without passing a delimiter, it splits the string at whitespace. In this mode, a run of consecutive whitespace characters counts as a single separator, and leading and trailing whitespace is ignored. You can rely on this default behavior to split the given string at multiple whitespaces just by calling <code>split()</code>.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">text = "abc\nlmn\tpqr xyz\rmno"
print(text.split())
# ['abc', 'lmn', 'pqr', 'xyz', 'mno']</pre>
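<p>A short sketch of how the default behavior differs from passing an explicit separator may help here. The inputs below are hypothetical and chosen only to make the differences visible.</p>

```python
# Hypothetical input with edge whitespace and a doubled newline (not from the article).
text = "  abc\n\nlmn\t pqr xyz  "

# No separator: runs of whitespace count once, edges are ignored.
default_parts = text.split()
print(default_parts)  # ['abc', 'lmn', 'pqr', 'xyz']

# Explicit separator: every single space is a split point, so empty strings appear.
space_parts = "a  b".split(" ")
print(space_parts)  # ['a', '', 'b']

# maxsplit limits the number of splits; the remainder keeps its inner whitespace.
first_and_rest = text.split(maxsplit=1)
print(first_and_rest)  # ['abc', 'lmn\t pqr xyz  ']

# A whitespace-only string yields an empty list.
print("   ".split())  # []
```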
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4da.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Recommended Digest</strong>: <strong><a href="https://blog.finxter.com/python-string-split/"> Python String split()</a></strong> </p>
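<p>If you want to check the efficiency of the approaches on your own machine, a rough timing sketch along the following lines could be used. The input size, the repeat count, and the names <code>candidates</code> and <code>expected</code> are arbitrary choices for illustration, and the timings will vary between systems and Python versions.</p>

```python
import re
import timeit

# Hypothetical larger input built by repeating the article's example string.
text = "abc\nlmn\tpqr xyz\rmno " * 1000

# Precompile the patterns so that compilation is not part of the measurement.
whitespace_run = re.compile(r"\s+")
token_run = re.compile(r"\S+")

candidates = {
    "str.split": lambda: text.split(),
    # Filter out the empty string that re.split leaves for the trailing space.
    "re.split": lambda: [part for part in whitespace_run.split(text) if part],
    "re.findall": lambda: token_run.findall(text),
}

expected = text.split()
for name, func in candidates.items():
    # Make sure every approach produces the same tokens before timing it.
    assert func() == expected
    seconds = timeit.timeit(func, number=200)
    print(f"{name:10s} {seconds:.4f} s for 200 runs")
```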
<h2>Conclusion</h2>
<p>We have successfully solved the given problem using different approaches. Simply using split could do the job for you. However, feel free to explore and try out the other options mentioned above. I hope this <a rel="noreferrer noopener" href="https://blog.finxter.com/" target="_blank"><strong>article</strong></a> helped you in your Python coding journey. Please <a rel="noreferrer noopener" href="https://blog.finxter.com/subscribe" target="_blank"><strong>subscribe and stay tuned</strong></a> for more interesting articles.</p>
<p>Happy Pythoning!&nbsp;<img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f40d.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" />&nbsp;</p>
</div>


https://www.sickgaming.net/blog/2022/12/...itespaces/