[Tut] Python | Split String by Whitespace


<div>
<p class="has-background" style="background-color:#d2fbfd"><strong>Summary:&nbsp;</strong>Use&nbsp;<code>"given string".split()</code>&nbsp;to split the given string by whitespace and store each word as an individual item in a list.<br /><strong>Minimal Example:</strong><br /><code>print("Welcome Finxter".split())</code><br /><code># OUTPUT: ['Welcome', 'Finxter']</code></p>
<h2><strong>Problem Formulation</strong></h2>
<p><strong>Problem</strong>: Given a string, how will you split it into a list of words using whitespace as the separator/delimiter?</p>
<p>Let’s understand the problem with the help of a few examples:</p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td><p><strong>Example 1:</strong><br /><strong>Input:</strong> text = "Welcome to the world of Python"<br /><strong>Explanation: </strong>Split the string into a list of words using a space " " as the delimiter to separate the words from the given string. <br /><strong>Output: </strong><br />['Welcome', 'to', 'the', 'world', 'of', 'Python']</p>
<p><strong>Example 2: </strong><br /><strong>Input:</strong><br />text = """Item_1<br />Item_2<br />Item_3"""<br />print(text.split('\n'))<br /><strong>Explanation: </strong>Split the string into a list of items using a newline "\n" as the delimiter to separate the items from the given string. <br /><strong>Output:</strong> ['Item_1', 'Item_2', 'Item_3']</p>
<p><strong>Example 3: </strong><br /><strong>Input:</strong> text = "This is just a\trandom text:\nNew Line"<br /><strong>Explanation: </strong>The given string contains a mix of whitespace characters between the words: spaces, a tab, and a newline character. All of these whitespace characters have to be treated as delimiters while separating the words from the given string and storing them as items in a list. Here’s how the output looks: <br /><strong>Output:</strong><br />['This', 'is', 'just', 'a', 'random', 'text:', 'New', 'Line']</p></td>
</tr>
</tbody>
</table>
</figure>
<p>So, we have two situations at hand: one where a single whitespace character is used as the delimiter, and another where several different whitespace characters act as delimiters in the same string. Let’s dive into the ways of solving this problem. </p>
<h2><strong>Method 1: Using split()</strong> </h2>
<p><code>split()</code> is a built-in string method in Python that splits a string at a given separator and returns a list of substrings. Here’s a minimal example that demonstrates how <code>split()</code> works: <code>'finxterx42'.split('x')</code> splits the string with the character 'x' as the delimiter and returns the list <code>['fin', 'ter', '42']</code>. When no separator is passed, <code>split()</code> treats any run of consecutive whitespace characters (such as <code>'\n'</code>, <code>' '</code>, or <code>'\t'</code>) as a single separator and ignores leading and trailing whitespace.</p>
<p class="has-base-background-color has-background">Read more about the <code>split()</code> method in this blog tutorial: <strong><a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-split/" target="_blank">Python String split()</a></strong>.</p>
<p><strong>Approach: </strong>Thus, to split a string on one specific whitespace character, pass that character as the separator/delimiter to the <code>split('whitespace_character')</code> method. To split on any whitespace, call <code>split()</code> with no argument.</p>
<p><strong>Code: </strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="3,10,15" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Example 1: split on a single space
text = "Welcome to the world of Python"
print(text.split(' '))
# OUTPUT: ['Welcome', 'to', 'the', 'world', 'of', 'Python']

# Example 2: split on a newline
text = """Item_1
Item_2
Item_3"""
print(text.split('\n'))
# OUTPUT: ['Item_1', 'Item_2', 'Item_3']

# Example 3: no separator, so any whitespace is a delimiter
text = "This is just a\trandom text:\nNew Line"
print(text.split())
# OUTPUT: ['This', 'is', 'just', 'a', 'random', 'text:', 'New', 'Line']</pre>
<p>Note that to separate the words in the third example we did not specify any separator within the <code>split()</code> method. When you don’t specify a separator, Python treats every run of whitespace characters in the given string as a separator. </p>
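<p>The difference between an explicit space separator and no separator becomes visible as soon as the input has two spaces in a row or mixes in a tab. The snippet below is a small sketch of that behavior; the string <code>sample</code> is a made-up input and is not taken from the examples above.</p>

```python
# Hypothetical sample string with a double space and a tab (not from the article).
sample = "alpha  beta\tgamma"

# Explicit ' ' separator: every single space is a split point, so two
# spaces in a row produce an empty string, and the tab is left untouched.
by_space = sample.split(' ')

# No separator: any run of whitespace counts as one delimiter.
by_default = sample.split()

print(by_space)    # ['alpha', '', 'beta\tgamma']
print(by_default)  # ['alpha', 'beta', 'gamma']
```

<p>So if the input may contain irregular spacing, calling <code>split()</code> without an argument is usually the safer choice.</p>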
<h2><strong>Method 2: Using <a href="https://blog.finxter.com/python-regex/" target="_blank" rel="noreferrer noopener">regex</a></strong></h2>
<p>Another handy way of separating a string with whitespace characters as separators is to use Python’s built-in regex module, <code>re</code>. </p>
<p><strong>Approach 1: </strong>Import the <code>re</code> module and use its split function as <code>re.split(r'\s+', text)</code>, where the pattern <code>\s+</code> matches one or more consecutive whitespace characters. Therefore, the string is separated at every run of whitespace. The pattern is written as a raw string (<code>r'...'</code>) so that Python does not try to interpret <code>\s</code> as a string escape sequence. </p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="4, 11, 16" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
# Example 1:
text = "Welcome to the world of Python"
print(re.split(r'\s+', text))
# OUTPUT: ['Welcome', 'to', 'the', 'world', 'of', 'Python']

# Example 2:
text = """Item_1
Item_2
Item_3"""
print(re.split(r'\s+', text))
# OUTPUT: ['Item_1', 'Item_2', 'Item_3']

# Example 3:
text = "This is just a\trandom text:\nNew Line"
print(re.split(r'\s+', text))
# OUTPUT: ['This', 'is', 'just', 'a', 'random', 'text:', 'New', 'Line']</pre>
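<p>One caveat with <code>re.split()</code>: if the string starts or ends with whitespace, the result contains empty strings at the corresponding end. The sketch below shows this on a made-up, padded input (<code>padded</code> is a hypothetical name) and one possible fix, stripping the string first.</p>

```python
import re

# Hypothetical input with leading and trailing whitespace (not from the article).
padded = "  Welcome to Finxter \n"

# The pattern also matches at the very start and end of the string,
# which leaves an empty string on each side of the result.
raw_parts = re.split(r'\s+', padded)

# Stripping the outer whitespace first avoids the empty strings.
clean_parts = re.split(r'\s+', padded.strip())

print(raw_parts)    # ['', 'Welcome', 'to', 'Finxter', '']
print(clean_parts)  # ['Welcome', 'to', 'Finxter']
```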
<p class="has-base-background-color has-background"><strong>Related Tutorial: <a href="https://blog.finxter.com/python-regex-split/" target="_blank" rel="noreferrer noopener">Python Regex Split</a></strong></p>
<p><strong>Approach 2: </strong>Another way of using the <code>re</code> module to solve this problem is its <code>findall()</code> function. Import the module and use <code>re.findall(r'\S+', text)</code>, where the pattern <code>\S+</code> matches one or more consecutive non-whitespace characters. In other words, instead of looking for the separators, Python collects every continuous sequence of characters that contains no whitespace. As soon as a whitespace character is found, the current match ends, and the next match starts at the next non-whitespace character. The result is a list of all the words in the string. </p>
<p>Here’s a graphical representation of the above explanation:</p>
<figure class="wp-block-image size-full is-style-default"><img loading="lazy" width="715" height="456" src="https://blog.finxter.com/wp-content/uploads/2022/10/image-232.png" alt="" class="wp-image-831613" srcset="https://blog.finxter.com/wp-content/uploads/2022/10/image-232.png 715w, https://blog.finxter.com/wp-content/uplo...00x191.png 300w" sizes="(max-width: 715px) 100vw, 715px" /></figure>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="4,11,16" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
# Example 1:
text = "Welcome to the world of Python"
print(re.findall(r'\S+', text))
# OUTPUT: ['Welcome', 'to', 'the', 'world', 'of', 'Python']

# Example 2:
text = """Item_1
Item_2
Item_3"""
print(re.findall(r'\S+', text))
# OUTPUT: ['Item_1', 'Item_2', 'Item_3']

# Example 3:
text = "This is just a\trandom text:\nNew Line"
print(re.findall(r'\S+', text))
# OUTPUT: ['This', 'is', 'just', 'a', 'random', 'text:', 'New', 'Line']</pre>
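<p>As a quick sanity check, the three approaches from this tutorial can be run side by side on the same string. This is only a sketch for comparison; the variable names are illustrative, and <code>strip()</code> is applied before <code>re.split()</code> so that padded input would not produce empty strings.</p>

```python
import re

text = "This is just a\trandom text:\nNew Line"

with_split = text.split()                        # Method 1: str.split()
with_re_split = re.split(r'\s+', text.strip())   # Method 2, Approach 1
with_findall = re.findall(r'\S+', text)          # Method 2, Approach 2

# All three produce the same list of words for this input.
print(with_split == with_re_split == with_findall)  # True
```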
<p class="has-base-background-color has-background"><strong>Related Tutorial: <a href="https://blog.finxter.com/python-re-findall/" target="_blank" rel="noreferrer noopener">Python re.findall() – Everything You Need to Know</a></strong></p>
<p><strong><em>Do you want to master the regex superpower?</em></strong> Check out my new book <em><strong><a href="https://blog.finxter.com/ebook-the-smartest-way-to-learn-python-regex/" target="_blank" rel="noreferrer noopener" title="[eBook] The Smartest Way to Learn Python Regex">The Smartest Way to Learn Regular Expressions in Python</a></strong></em> with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video. </p>
<h2><strong>Conclusion</strong></h2>
<p>We solved the given problem with three approaches: <code>str.split()</code>, <code>re.split()</code>, and <code>re.findall()</code>. I hope you enjoyed this&nbsp;<a rel="noreferrer noopener" href="https://blog.finxter.com/" target="_blank">article</a>&nbsp;and that it helps you in your Python coding journey. Please&nbsp;<a rel="noreferrer noopener" href="https://blog.finxter.com/subscribe" target="_blank">subscribe and stay tuned</a>&nbsp;for more interesting articles!</p>
<p class="has-base-2-background-color has-background"><strong>Related Reads:</strong><br /><a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-split-a-string-and-keep-the-separators/" target="_blank">⦿</a>&nbsp;<a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-split-a-string-and-keep-the-separators/" target="_blank"><strong>How To Split A String And Keep The Separators?</strong></a><a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-cut-a-string-in-python/" target="_blank"><br />⦿</a>&nbsp;<a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-cut-a-string-in-python/" target="_blank"><strong>How To Cut A String In Python?</strong></a> <a rel="noreferrer noopener" href="https://blog.finxter.com/python-split-string-into-characters/" target="_blank"><br />⦿&nbsp;<strong>Python | Split String into Characters</strong></a></p>
</div>


https://www.sickgaming.net/blog/2022/10/...hitespace/