
Python | Split String with Regex


Summary: The different methods to split a string using regex are:

  • re.split()
  • re.sub()
  • re.findall()
  • re.compile()

Minimal Example

import re

text = "Earth:Moon::Mars:Phobos"

# Method 1
res = re.split("[:]+", text)
print(res)
# Method 2
res = re.sub(r':', " ", text).split()
print(res)
# Method 3
res = re.findall(r"[^:\s]+", text)
print(res)
# Method 4
pattern = re.compile(r"[^:\s]+").findall
print(pattern(text))

# Output
['Earth', 'Moon', 'Mars', 'Phobos']

Problem Formulation

📜Problem: Given a string and a delimiter, how will you split the string at that delimiter using different functions from the regular expressions library?

Example: In the following example, the given string has to be split using an underscore as the delimiter.

# Input
text = "abc_lmn_xyz"

# Expected Output
['abc', 'lmn', 'xyz']

Method 1: re.split

The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

Approach: Use the re.split function and pass [_]+ as the pattern, which splits the given string at every run of one or more underscores.

Code:

import re

text = "abc_lmn_xyz"
res = re.split("[_]+", text)
print(res)  # ['abc', 'lmn', 'xyz']
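To see why the pattern uses the + quantifier, here is a minimal sketch that compares a single-character pattern with [_]+ on an input containing repeated underscores. The input string is a made-up example, not part of the original problem.

```python
import re

# Hypothetical input with runs of two and three underscores.
text = "abc__lmn___xyz"

# A single-character pattern splits at every underscore, so empty
# strings appear between consecutive delimiters.
split_each = re.split("_", text)

# "[_]+" (equivalent to "_+") treats each run of underscores as one delimiter.
split_runs = re.split("[_]+", text)

print(split_each)  # ['abc', '', 'lmn', '', '', 'xyz']
print(split_runs)  # ['abc', 'lmn', 'xyz']
```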

🚀Related Read: Python Regex Split

Method 2: re.sub

The regex function re.sub(P, R, S) replaces all occurrences of the pattern P with the replacement R in string S. It returns a new string. For example, if you call re.sub('a', 'b', 'aabb'), the result will be the new string 'bbbb' with all characters 'a' replaced by 'b'.

Approach: The idea here is to use the re.sub function to replace all occurrences of underscores with a space and then use the split function to split the string at spaces.

Code:

import re

text = "abc_lmn_xyz"
res = re.sub(r'_', " ", text).split()
print(res)  # ['abc', 'lmn', 'xyz']
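One caveat of the replace-then-split idea: split() without arguments splits at any whitespace, so fields that already contain spaces get broken up as well. The sketch below illustrates this with a made-up input string and compares the result with re.split.

```python
import re

# Hypothetical input whose fields contain spaces.
text = "new york_los angeles"

# Replacing "_" with " " makes the original spaces indistinguishable
# from the delimiter, so every word becomes its own item.
via_sub = re.sub(r"_", " ", text).split()

# Splitting directly on the delimiter keeps the fields intact.
via_split = re.split(r"_", text)

print(via_sub)    # ['new', 'york', 'los', 'angeles']
print(via_split)  # ['new york', 'los angeles']
```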

🚀Related Read: Python Regex Sub

Method 3: re.findall

The re.findall(pattern, string) method scans string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

Approach: Use re.findall() with the pattern [^_\s]+ to collect every run of characters that are neither underscores nor whitespace.

Code:

import re

text = "abc_lmn_xyz"
res = re.findall(r"[^_\s]+", text)
print(res)  # ['abc', 'lmn', 'xyz']

🚀Related Read: Python re.findall()

Method 4: re.compile

The method re.compile(pattern) returns a regular expression object from the pattern that provides basic regex methods such as pattern.search(string), pattern.match(string), and pattern.findall(string). The explicit two-step approach of (1) compiling and (2) searching the pattern is more efficient than calling, say, re.search(pattern, string) at once if you match the same pattern multiple times, because it avoids redundant compilations of the same pattern.

Code:

import re

text = "abc_lmn_xyz"
pattern = re.compile(r"[^_\s]+").findall
print(pattern(text))  # ['abc', 'lmn', 'xyz']

Why use re.compile?

  • Efficiency: Using re.compile() pays off when the same expression has to be used more than once. The pattern object returned by the compile function can be applied to many different strings without having to rewrite or recompile the expression each time.
  • Readability: Another advantage of using re.compile is readability, as it lets you decouple the specification of the regex from the places where it is used.
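As a rough illustration of both points, the sketch below compiles the pattern once, gives it a descriptive name, and reuses it across several strings. The names field and rows are made up for this example. Keep in mind that the re module also caches recently used patterns internally, so the speed benefit may be small in short scripts.

```python
import re

# Compile once and give the pattern a descriptive name (hypothetical).
field = re.compile(r"[^_\s]+")

# Hypothetical inputs that all use an underscore as the delimiter.
rows = ["abc_lmn_xyz", "one_two", "solo"]

# Reuse the same compiled pattern for every string.
parsed = [field.findall(row) for row in rows]
print(parsed)  # [['abc', 'lmn', 'xyz'], ['one', 'two'], ['solo']]
```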

🚀Read: Is It Worth Using Python’s re.compile()?

Exercise

Problem: Split a string at spaces, commas, and periods, but not inside numbers such as 1,000 or 1.50.

Given:
my_string = "one two 3.4 25.6 seven.eight nine,ten"
Expected Output:
["one", "two", "3.4", "25.6", "seven", "eight", "nine", "ten"]

Solution

import re

my_string = "one two 3.4 25.6 seven.eight nine,ten"
res = re.split(r'\s|(?<!\d)[,.](?!\d)', my_string)
print(res)  # ['one', 'two', '3.4', '25.6', 'seven', 'eight', 'nine', 'ten']
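The pattern in the solution combines a negative lookbehind (?<!\d) and a negative lookahead (?!\d), so a comma or period only acts as a delimiter when it has no digit directly before it and no digit directly after it. Here is a small sketch on a made-up sample string that contains the 1,000 and 1.50 cases from the problem statement.

```python
import re

# Same pattern as in the solution, compiled for reuse.
pattern = re.compile(r"\s|(?<!\d)[,.](?!\d)")

# Hypothetical sample mixing word separators and numbers.
sample = "a,b 1,000 c.d 1.50"

result = pattern.split(sample)
print(result)  # ['a', 'b', '1,000', 'c', 'd', '1.50']
```

Note that a comma directly after a digit, as in "5, six", is not treated as a delimiter by this pattern either, so it may need adjusting for such inputs.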

Conclusion

We have now seen four different ways of splitting a string using the regular expressions package in Python. Use whichever technique fits your needs. The idea of this tutorial was to get you acquainted with the various ways of using regex to split a string, and I hope it helped you.

Please stay tuned and subscribe for more interesting discussions and tutorials in the future. Happy coding! 🙂


Do you want to master the regex superpower? Check out my new book The Smartest Way to Learn Regular Expressions in Python with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video.


How to Split a Multi-line String into Multiple Lines?


Summary: Use given_string.splitlines() to split a given multiline string into multiple lines.

Minimal Example:

text = 'Python\nJava\nC#'
print(text.splitlines())
# Output: ['Python', 'Java', 'C#']

Problem Formulation

📜Problem: Given a multiline string, how will you split it into a list of lines using the newline character as the separator/delimiter?

Example:

# Input
text = """abc
def
ghi"""
# Expected Output
['abc', 'def', 'ghi']

Let’s dive into the different ways of solving the given problem.

Method 1: Using splitlines

Approach: The most convenient and easiest way to split a given multiline string into multiple strings is to use the splitlines() method, i.e., simply use – 'given_string'.splitlines().

NOTE: splitlines() is a built-in method in Python that splits a string at line breaks such as '\n' and returns a split list of substrings (i.e., lines). For example, 'finxter\nis\ncool'.splitlines() will return the following list: ['finxter', 'is', 'cool'].

Code:

# Input
text = """Python is an Object Oriented programming language.
COBOL is an Object Oriented programming language.
F# is an Object Oriented programming language."""

print(text.splitlines())
# Output: ['Python is an Object Oriented programming language.', 'COBOL is an Object Oriented programming language.', 'F# is an Object Oriented programming language.']
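splitlines() recognizes more than just '\n'. The following sketch uses a made-up string with mixed line endings and also shows the optional keepends argument, which keeps the line break characters in the result.

```python
# Hypothetical string mixing Windows (\r\n), old Mac (\r) and Unix (\n) line endings.
mixed = "Python\r\nJava\rC#\n"

# All three kinds of line break are treated as boundaries.
lines = mixed.splitlines()
print(lines)  # ['Python', 'Java', 'C#']

# keepends=True retains the line break at the end of each item.
kept = mixed.splitlines(keepends=True)
print(kept)  # ['Python\r\n', 'Java\r', 'C#\n']
```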

🌎Related Read: Python String splitlines()

Method 2: Using split()

Approach: Use 'given_string'.split('\n') to split the given multiline string at line breaks.

Code:

# Input
text = """Python is an Object Oriented programming language.
COBOL is an Object Oriented programming language.
F# is an Object Oriented programming language."""

print(text.split('\n'))
# Output: ['Python is an Object Oriented programming language.', 'COBOL is an Object Oriented programming language.', 'F# is an Object Oriented programming language.']

Using “\n” ensures that whenever a new line occurs, the string is split.
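split('\n') and splitlines() differ in edge cases. A short sketch with a made-up string that ends in a newline, plus the empty string, shows where the two diverge.

```python
# Hypothetical input that ends with a newline.
text = "Python\nJava\nC#\n"

# split('\n') reports the empty string after the final newline.
with_split = text.split('\n')
# splitlines() does not produce a trailing empty item.
with_splitlines = text.splitlines()

print(with_split)       # ['Python', 'Java', 'C#', '']
print(with_splitlines)  # ['Python', 'Java', 'C#']

# The empty string behaves differently as well.
empty_split = "".split('\n')
empty_splitlines = "".splitlines()
print(empty_split)       # ['']
print(empty_splitlines)  # []
```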

🌎Related Read: Python String split()

Method 3: Using re.split in a List Comprehension

Another way to solve the given problem is to use the split function of the regex module. You can split the string at every line break by passing "\n" as the pattern to the re.split function. Blank lines and a trailing newline produce empty strings in the result, so you can use a list comprehension with an if clause that keeps only the non-empty strings, as shown in the solution below.

Code:

import re

text = """Python is an Object Oriented programming language.
COBOL is an Object Oriented programming language.
F# is an Object Oriented programming language."""

print([x for x in re.split("\n", text) if x != ''])
# Output: ['Python is an Object Oriented programming language.', 'COBOL is an Object Oriented programming language.', 'F# is an Object Oriented programming language.']
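The if clause only pays off when the input actually contains blank lines. Here is a minimal sketch with a made-up string that has an empty line, a whitespace-only line, and a trailing newline. Filtering with x.strip() is an alternative that is assumed here for illustration, not taken from the solution above.

```python
import re

# Hypothetical input with an empty line, a whitespace-only line, and a trailing newline.
text = "Python\n\n   \nJava\n"

parts = re.split("\n", text)
print(parts)  # ['Python', '', '   ', 'Java', '']

# The filter from the solution drops empty strings only.
non_empty = [x for x in parts if x != '']
print(non_empty)  # ['Python', '   ', 'Java']

# Testing x.strip() also drops lines that contain nothing but whitespace.
non_blank = [x for x in parts if x.strip()]
print(non_blank)  # ['Python', 'Java']
```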

NOTE: The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

🌎Read more here – Python Regex Split

List Comprehension: “A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it.”

🌎Read more here: List Comprehension in Python — A Helpful Illustrated Guide



Conclusion

We have successfully solved the given problem using three different approaches. I hope this article answered all your queries. Please subscribe and stay tuned for more interesting articles!

Happy coding! 🙂

Related Reads:
⦿ Python | Split String by Newline
⦿ Python | Split String by Whitespace
⦿ Python | Split String into Characters




Python | Split String into List of Substrings


🍎Summary: Use Python’s built-in split function to split a given string into a list of substrings. Other methods include using the regex library and the map function.

Minimal Example

import re

text = "Python Java Golang"

# Method 1
print(text.split())
# Method 2
print(re.split(r'\s+', text))
# Method 3
print(re.findall(r'\S+', text))
# Method 4
li = list(map(str.strip, text.split()))
res = []
for i in li:
    for j in i.split():
        res.append(j)
print(res)

# OUTPUTS: ['Python', 'Java', 'Golang']

Problem Formulation

📜Problem: Given a string containing numerous substrings, how will you split the string into a list of substrings?

Let’s understand the problem with the help of an example.

Example

# Input
text = "word1 word2 word3 word4 word5"

# Output
['word1', 'word2', 'word3', 'word4', 'word5']

Method 1: Using split

Approach: Use the split(sep) function, where sep is the separator. In our case the separator is a space, so you do not need to pass any separator at all: when called without arguments, split treats any run of whitespace as the separator. Therefore, whenever whitespace occurs the string will be split and the substrings will be stored in a list.

Code:

text = "word1 word2 word3 word4 word5"
print(text.split()) # ['word1', 'word2', 'word3', 'word4', 'word5']
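Calling split() without arguments is not the same as calling split(' '). The sketch below uses a made-up string with a double space, a tab, and a trailing newline to show the difference.

```python
# Hypothetical input with a double space, a tab, and a trailing newline.
text = "word1  word2\tword3\n"

# No argument: any run of whitespace is one separator, and
# leading/trailing whitespace is ignored.
default_split = text.split()
print(default_split)  # ['word1', 'word2', 'word3']

# Explicit ' ': only single spaces are separators, so the double space
# yields an empty string and the tab and newline stay inside an item.
space_split = text.split(' ')
print(space_split)  # ['word1', '', 'word2\tword3\n']
```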

Method 2: Using re.split

The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

Approach: Use the re.split('\s+', text) method, where text is the given string and '\s+' matches any run of one or more whitespace characters. Therefore, the string will be split at every run of whitespace.

Code:

import re

text = "word1 word2 word3 word4 word5"
print(re.split(r'\s+', text))  # ['word1', 'word2', 'word3', 'word4', 'word5']

🚀Related Read: Python Regex Split

Method 3: Using re.findall

The re.findall(pattern, string) method scans string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

🚀Related Read: Python re.findall() – Everything You Need to Know

Approach: Use the re.findall('\S+', text) method, where text is the given string and '\S+' matches any run of one or more non-whitespace characters. Each match therefore extends until the next whitespace character, where it ends, and the search resumes at the next non-whitespace character.

Code:

import re

text = "word1 word2 word3 word4 word5"
print(re.findall(r'\S+', text))  # ['word1', 'word2', 'word3', 'word4', 'word5']
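re.split and re.findall give the same result on the example above, but they differ when the string starts or ends with whitespace. A minimal sketch on a made-up padded string:

```python
import re

# Hypothetical input with leading and trailing spaces.
text = "  word1 word2  "

# re.split reports the empty strings before the first and after the last separator.
with_split = re.split(r'\s+', text)
print(with_split)  # ['', 'word1', 'word2', '']

# re.findall only returns the matches themselves, so no empty strings appear.
with_findall = re.findall(r'\S+', text)
print(with_findall)  # ['word1', 'word2']
```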


Method 4: Using map

Prerequisite: The map() function transforms one or more iterables into a new one by applying a “transformator function” to the i-th elements of each iterable. The arguments are the transformator function object and one or more iterables. If you pass n iterables as arguments, the transformator function must be an n-ary function taking n input arguments. The return value is an iterable map object of transformed, and possibly aggregated, elements.

🚀Related Read: Python map() — Finally Mastering the Python Map Function [+Video]

Approach: Use the map function with the split list of substrings as the iterable, i.e., as its second argument. Each item of this list is passed to the strip method, which removes any leading and trailing whitespace, and map returns a map object containing the stripped substrings. You can convert this map object to a list using the list constructor.

Code:

text = "word1 word2 word3 word4 word5"
li = list(map(str.strip, text.split()))
res = []
for i in li:
    for j in i.split():
        res.append(j)
print(res)  # ['word1', 'word2', 'word3', 'word4', 'word5']
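To make the effect of map more visible, the following sketch applies str.strip to a made-up list whose items really do carry surrounding whitespace, and then shows the n-ary case mentioned in the prerequisite with two iterables.

```python
# Hypothetical list whose items carry surrounding whitespace.
raw = [" word1 ", "word2\n", "\tword3"]

# One iterable: str.strip is applied to each item in turn.
cleaned = list(map(str.strip, raw))
print(cleaned)  # ['word1', 'word2', 'word3']

# Two iterables: the function receives the i-th element of each one.
pairs = list(map(lambda a, b: a + "-" + b, ["x", "y"], ["1", "2"]))
print(pairs)  # ['x-1', 'y-2']
```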

Exercise

Problem: Given a string containing numerous substrings separated by commas and spaces, how will you extract the substrings and store them in a list? Note that you have to eliminate the whitespace as well as the commas.

# Input
text = "One, Two, Three"
# Output
['One', 'Two', 'Three']

🔎Hint: Python | Split String by Comma and Whitespace

Solution:

text = "One, Two, Three"
print([x.strip() for x in text.split(',')])
# ['One', 'Two', 'Three']

Conclusion

With that, we come to the end of this tutorial. I hope the methods discussed in this article have helped you and answered your queries. Please stay tuned and subscribe for more solutions and discussions in the future.

Happy learning!🐍




A Comprehensive Guide to maxsplit in Python


Summary: maxsplit is one of the optional parameters used in the split() function. If maxsplit is specified within the split function, then the maximum number of splits done will be given by the specified maxsplit. Thus, the list will have at most maxsplit + 1 elements. If maxsplit is not specified or set to -1, then there is no limit on the number of splits (all the possible splits are made).

Minimal Example

cols = 'Red, Black, White, Yellow, Pink'

# Maxsplit 0
print(cols.split(', ', 0))
# ['Red, Black, White, Yellow, Pink']

# Maxsplit 1
print(cols.split(', ', 1))
# ['Red', 'Black, White, Yellow, Pink']

# Maxsplit 3
print(cols.split(', ', 3))
# ['Red', 'Black', 'White', 'Yellow, Pink']

# Maxsplit 5
print(cols.split(', ', 5))
# ['Red', 'Black', 'White', 'Yellow', 'Pink']

In this comprehensive guide, you will learn everything you need to know about maxsplit in Python.

What is Maxsplit Anyways?

Before you understand what maxsplit does, it is important to understand what the split function does. The split() function in Python splits the string at a given separator and returns a split list of substrings.

Syntax and Explanation:
str.split(sep = None, maxsplit = -1)

🚀 maxsplit is an optional parameter that defines the maximum number of splits (the list will have at most maxsplit + 1 elements). If maxsplit is not provided or defined as -1, then there is no limit on the number of splits (all the possible splits get made).

🦅Related Read: Python String split()

How Many Elements will The List Contain when Maxsplit is Specified?

When the maxsplit is specified, the list will have a maximum of maxsplit + 1 items. Look at the following examples to understand this better.

text = 'Python Java C Ruby'

# Example 1
print(text.split(' ', 0))
# ['Python Java C Ruby']

# Example 2
print(text.split(' ', 2))
# ['Python', 'Java', 'C Ruby']

# Example 3
print(text.split(' ', -1))
# ['Python', 'Java', 'C', 'Ruby']

Explanation: In the first example, maxsplit is set to 0, so no split is made and the list has exactly one item. In the second example, maxsplit is set to 2, so the resultant list has 2+1 = 3 items. In the third example, maxsplit is specified as -1, which is the default value, so all possible splits are made.
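The "at most maxsplit + 1" rule can be checked with a few lines. This is a small sketch; the dictionary name lengths is made up for the illustration. Note that a maxsplit larger than the number of separators simply yields all items.

```python
text = 'Python Java C Ruby'

# Map each maxsplit value to the number of items in the resulting list.
lengths = {m: len(text.split(' ', m)) for m in (0, 1, 2, 10)}
print(lengths)  # {0: 1, 1: 2, 2: 3, 10: 4}

# The list never has more than maxsplit + 1 items.
within_bound = all(count <= m + 1 for m, count in lengths.items())
print(within_bound)  # True
```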

Will the split() Function Work if You Don’t Specify Any Parameter?

Example:

txt = 'Welcome to the world of Python'
print(txt.split())
# ['Welcome', 'to', 'the', 'world', 'of', 'Python']

The split() function works perfectly fine even when no arguments are specified. In the above example, neither a separator nor maxsplit has been specified. The function then uses the default separator, which is any run of whitespace, and the default maxsplit value of -1. So the string gets split wherever whitespace is found, meaning the maximum number of splits is performed.
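You can also pass maxsplit as a keyword argument while keeping the default separator. The sketch below, on a made-up string with irregular whitespace, compares this with passing an explicit single space.

```python
# Hypothetical input with three spaces and a tab between words.
txt = 'Welcome   to\tthe world'

# Default separator, no limit.
all_words = txt.split()
print(all_words)  # ['Welcome', 'to', 'the', 'world']

# Default separator with a limit: the whitespace run after the first
# word is consumed, and the rest of the string is kept as one item.
first_rest = txt.split(maxsplit=1)
print(first_rest)  # ['Welcome', 'to\tthe world']

# Explicit ' ' with a limit: only the first single space is consumed.
explicit = txt.split(' ', 1)
print(explicit)  # ['Welcome', '  to\tthe world']
```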

Split a List up to a Maximum Number of Elements

Problem: Given a list of strings, how will you split each string up to a maximum number of elements?

Example: Let’s visualize the problem with a real question asked on Stack Overflow.

source: https://stackoverflow.com/questions/58952417/split-a-list-up-to-a-maximum-number-of-elements

Discussion: The question essentially requires you to split the first and the second item/row into 7 columns such that the expected output resembles the following: [['6697', '1100.0', '90.0', '0.0', '0.0', '6609', '!'], ['701', '0.0', '0.0', '83.9', '1.5', '000', '!AFR-AHS IndHS-AFR']]

Solution:

rot = ['6697 1100.0 90.0 0.0 0.0 6609 !',
       '701 0.0 0.0 83.9 1.5 000 !AFR-AHS IndHS-AFR']

for i in range(len(rot)):
    rot[i] = rot[i].split(maxsplit=6)

print(rot)
# [['6697', '1100.0', '90.0', '0.0', '0.0', '6609', '!'], ['701', '0.0', '0.0', '83.9', '1.5', '000', '!AFR-AHS IndHS-AFR']]

Explanation: Use the split() method and specify the maxsplit argument to cap the number of elements in each resulting list. In this case, you need seven elements per row, which takes six splits. Hence, maxsplit can be set to 6 to achieve the final output.
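The same result can be produced without modifying the list in place. Here is a minimal sketch using a list comprehension; the name rows is made up for the illustration.

```python
rot = ['6697 1100.0 90.0 0.0 0.0 6609 !',
       '701 0.0 0.0 83.9 1.5 000 !AFR-AHS IndHS-AFR']

# Six splits per row give seven columns; the remainder of the row,
# including its inner space, stays in the last column.
rows = [row.split(maxsplit=6) for row in rot]

for row in rows:
    print(len(row), row[-1])
```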

Exercise

Given:
text = "abc_kjh_olp_xyz"
Challenge: Split the given string only at the first occurrence of the underscore "_".
Expected Output:
['abc', 'kjh_olp_xyz']

Solution

text = "abc_kjh_olp_xyz"
print(text.split("_", maxsplit=1))
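As an alternative to maxsplit=1, the built-in str.partition method also splits at the first occurrence of a separator. The sketch below compares the two, including the case where the separator is missing; the second input string is made up for the illustration.

```python
text = "abc_kjh_olp_xyz"

# partition returns a 3-tuple: (before, separator, after).
head, sep, tail = text.partition("_")
print([head, tail])  # ['abc', 'kjh_olp_xyz']

# Hypothetical input without the separator: the two approaches differ.
missing = "no-delimiter"
print(missing.partition("_"))          # ('no-delimiter', '', '')
print(missing.split("_", maxsplit=1))  # ['no-delimiter']
```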

🦅Related Read: Python Split String at First Occurrence

Conclusion

That was all about the maxsplit parameter from the split() function in Python. I hope this article helped you to gain an in-depth insight into the maxsplit parameter. Please subscribe and stay tuned for more interesting articles! Happy coding.




Python Find in List [Ultimate Guide]


When Google was founded in 1998, Wall Street investors laughed at their bold vision of finding data efficiently on the web. Very few people actually believed that finding things could be at the heart of a sustainable business, let alone be a long-term challenge worth pursuing.

We have learned that searching — and finding — things is crucial wherever data volumes exceed processing capabilities. Every computer scientist knows about the importance of search.

And even non-coders don’t laugh about Google’s mission anymore!

⭐⭐⭐ This article will be the web’s most comprehensive guide on FINDING stuff in a Python list. ⭐⭐⭐

It’s a living document where I’ll update new topics as I go along — so stay tuned while this article grows to be the biggest resource on this topic on the whole web!

Let’s get started with the very basics of finding stuff in a Python list:

Finding an Element in a List Using the Membership Operator

You can use the membership keyword operator in to check if an element is present in a given list. For example, x in my_list returns True if element x is present in my_list. Python uses the equality == operator to compare the list elements against the element x to be found.

Here’s a minimal example:

my_list = ['Alice', 'Bob', 'Sergey', 'Larry', 'Eric', 'Sundar']

if 'Eric' in my_list:
    print('Eric is in the list')

The output is:

Eric is in the list

Here’s a graphical depiction of how the membership operator works on a list of numbers:

Figure 1: Check the membership of item 42 in the list of integers.
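To make the comparison explicit, here is a rough sketch of what the membership test does for a list: it walks through the elements and stops at the first one that is identical or equal to the target. The helper name contains is made up; in real code you would simply use the in operator.

```python
my_list = ['Alice', 'Bob', 'Sergey', 'Larry', 'Eric', 'Sundar']

def contains(items, target):
    """Hypothetical helper that mimics 'target in items' for a list."""
    for item in items:
        # Identity is checked first, then equality via ==.
        if item is target or item == target:
            return True
    return False

print(contains(my_list, 'Eric'))   # True
print(contains(my_list, 'Steve'))  # False

# Because == is used, 1.0 is found in a list of integers.
print(1.0 in [1, 2, 3])  # True

# list.index() tells you where the first match is.
position = my_list.index('Eric')
print(position)  # 4
```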



Creating an Advent Calendar App in Python with AI Image Creation

Example: AI-Generated Image

This blog post describes a fun mini Python project that uses AI image creation to produce artwork for an advent calendar implemented using PyGame.

Context

It’s December 1st, and my daughter has just opened the first door on her physical advent calendar, counting down the days of the festival of general consumerism that some people call Christmas. Out pops a small chocolate with a festive robin embossed onto it.

I don’t really recall when the chocolate treat became ubiquitous. Back in my childhood days, advent calendar doors just opened to reveal an image.

Example: AI-Generated Image

This article describes a mini project to generate a virtual old-school Advent calendar using royalty-free bespoke images generated from Python using a third-party AI service through an API.

All code and some AI-generated image samples are available on GitHub.

Painting the Images

There are now many AI engines for creating images from a text description. This project uses deepai. Creating an image with the deepai API is pretty simple. They provide code samples for several languages on their site.

I further selected the engine for producing art in the impressionism style but you can choose from a range of other styles at https://api.deepai.org.

The aipaint function below uses the requests library post method to send a description string e.g. "Christmas presents under tree" to the API endpoint https://api.deepai.org/api/impressionism-painting-generator with an API key included in the header.

Note that the AI artist style forms part of the endpoint URL.

An API key is used to control access to the API service. The quickstart api-key below is only good for about 10 free requests. You can pay $5 for a new API key and 100 more request calls if you so wish.

import requests

def aipaint(description):
    # Send the description to the deepai endpoint and return the parsed JSON reply
    r = requests.post(
        "https://api.deepai.org/api/impressionism-painting-generator",
        data={
            'text': description,
        },
        headers={'api-key': 'quickstart-QUdJIGlzIGNvbWluZy4uLi4K'}
    )
    ret = r.json()
    return ret

The aipaint() function will return a JSON object, including an element called output_url from which the image can be viewed. 

The following is an example AI painting created, in the impressionist style, for the description "Christmas presents under tree":

The output_url will look something like this https://api.deepai.org/job-view-file/eb654821-f89c-4065-9302-a702ad942971/outputs/output.jpg

Be warned when you read this – the above URL will no longer exist. The created image is only hosted by deepai for a few hours, so our advent calendar code will need to download to a local folder.  

To download the image we can use urlretrieve:

import urllib.request

painting = aipaint(description)
urllib.request.urlretrieve(painting['output_url'], filename)

Caching 25 Images

The original hope was to create images on the fly as an advent door is opened. However, there is a cost for each API request, and up to 30 seconds is required for each image to be generated by the AI engine and downloaded. An alternative approach was therefore adopted: a prepaint.py script that creates and downloads 25 images.

prepaint.py

import urllib.request
import config
import requests
from os.path import exists


def aipaint(description):
    r = requests.post(
        "https://api.deepai.org/api/impressionism-painting-generator",
        data={
            'text': description,
        },
        headers={'api-key': 'quickstart-QUdJIGlzIGNvbWluZy4uLi4K'}
    )
    ret = r.json()
    return ret


count=1
while count<=25:
    # pop off the first image description
    description=config.descriptions.pop(0)
    # push it back on at end (this ensures we cycle through descriptions and never run out )
    # obviously best if there are 25 descriptions though
    config.descriptions.append(description)

    filename='./images/image'+ (f"{count:02d}") +".jpg"

    if(exists(filename)):
        print(filename + " already exists")
    else:
        print("Painting: "+description)
        try:
            painting=aipaint(description)
            if(len(painting)<2):
                if(painting['status']):
                    print("You've probably run out of deepai (free) credits.")
                    print("Status returned "+painting['status'])
            else:
                print("Storing as "+filename)
                urllib.request.urlretrieve(painting['output_url'], filename)
                print("Paint now dried on "+filename)
        except Exception as ex:
            print("Paint "+filename+" failed")
            print(ex)
    count+=1

print("Paintings complete and paint has dried!")
print("Now run main.py to access the Advent calendar")

config.descriptions is a list of Christmas image descriptions.

Ideally, there should be 25 descriptions, but we treat the array as a circular queue so that every time we pop an item off the front, we push back at the end of the queue.

This just ensures we can generate 25 images even without 25 descriptions. My experimentation suggests you will not get the same image back from two requests with the same image description anyway!
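The circular-queue behaviour can be tried out without calling the API. The sketch below uses a made-up list of only three descriptions and shows that the pop-and-append loop hands out 25 descriptions by cycling. As a comparison, itertools.cycle is used here as an alternative that is not part of the original script.

```python
from itertools import cycle, islice

# Hypothetical short list; the real one lives in config.descriptions.
descriptions = ["snowman", "robin", "sleigh"]

# Pop from the front and push to the back, as prepaint.py does.
queue = list(descriptions)
popped = []
for _ in range(25):
    description = queue.pop(0)
    queue.append(description)
    popped.append(description)

# The same sequence, produced with itertools.cycle.
cycled = list(islice(cycle(descriptions), 25))

print(popped[:5])        # ['snowman', 'robin', 'sleigh', 'snowman', 'robin']
print(popped == cycled)  # True
```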

Images are cached in an images folder with 25 filenames, image01.jpg through image25.jpg:

filename='./images/image'+ (f"{count:02d}") +".jpg"
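The :02d format specifier pads the counter with a leading zero to two digits, which keeps the files in order when they are sorted by name. A small sketch that only builds the 25 names, without touching the file system:

```python
# Build the same names that prepaint.py uses, for counts 1 to 25.
filenames = [f"./images/image{count:02d}.jpg" for count in range(1, 26)]

print(filenames[0])    # ./images/image01.jpg
print(filenames[8])    # ./images/image09.jpg
print(filenames[-1])   # ./images/image25.jpg

# Thanks to the zero padding, alphabetical order equals numeric order.
print(filenames == sorted(filenames))  # True
```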

The script first checks whether an image already exists – if yes we just loop through to the next image.

With the code in GitHub I’ve included 25 images. If you want to generate your own just delete some or all from the images folder and edit the descriptions in config.py.

If you don’t yet want to purchase deepai credits – I recommend you just delete a few to experiment.

Sample output from the prepaint.py script

The Advent app

I chose to use the PyGame library to produce the virtual Advent app.

Running main.py launches a window with a grid of 25 doors.

Red doors indicate available doors to open based on the current date of the month. When a red labeled door is opened, a festive image is displayed along with a short promotion text that can be customized in config.py.

A click on the image re-displays the grid of calendar doors. 

A click on the lower portion of the screen opens a browser with the configured URL.

The PyGame script itself is fairly straightforward. Just one interesting snippet to highlight is how the doors are arranged in a ‘random’ order. A list of the numbers 1 to 25 is created (constants HEIGHT and WIDTH are both defined as 5 in config.py).

This array is shuffled using random.shuffle. By pre-setting the random seed to a set figure (here 1 was chosen), the same random shuffle is produced every time the main.py is run.

doormap=list(range(1,HEIGHT*WIDTH+1))
random.seed(1)
random.shuffle(doormap)
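The effect of the fixed seed can be checked in isolation. The following sketch is a variation, not the code from main.py: it uses a private random.Random instance (an assumption made here so that the global random state is left untouched), and the function name door_order is made up.

```python
import random

HEIGHT = WIDTH = 5  # same values as in config.py

def door_order(seed):
    """Hypothetical helper: return the door numbers shuffled with a fixed seed."""
    rng = random.Random(seed)  # private generator, independent of the global one
    doors = list(range(1, HEIGHT * WIDTH + 1))
    rng.shuffle(doors)
    return doors

first = door_order(1)
second = door_order(1)

print(first == second)                      # True: same seed, same order
print(sorted(first) == list(range(1, 26)))  # True: every door appears exactly once
```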

Appendix – Full Code

Here are the two code files:

config.py:

HEIGHT = 5
WIDTH = 5

descriptions=["Christmas presents under tree",
    "Santas Elf",
    "Santa in sleigh flying over a town delivering gifts on christmas eve",
    "candy cane on an christmas tree",
    "a festive robin redbreast",
    "a snowman with carrot nose and presents",
    "fairy on top of a christmas tree",
    "children playing with christmas presents",
    "christmas lunch",
    "christmas carol singers",
    "christmas bells in a christmas tree",
    "rudolph the red nosed reindeer",
    "Santa in sleigh",
    "two snowmen with presents",
    "snowy village scene",
    "christmas decoration",
    "christmas bells",
    "christmas snowflake",
    "christmas pudding with custard",
    "christmas feast with turkey",
    "father christmas laughing",
    "christmas baby in a manger",
    "christmas mulled wine",
    "children enjoying playing with a new toy train",
    ]

gifts=[('8 python cheat sheets','https://blog.finxter.com/python-cheat-sheets/'),
('the finxter academy','https://academy.finxter.com/'),
('the finxter app','https://app.finxter.com/learn/computer/science/'),
('the finxster freelancer course','https://finxter.gumroad.com/l/python-freelancer/'),
('The Ultimate Guide to Start Learning Python','https://blog.finxter.com/start-learning-python/'),
('Coffee Break Numpy book','https://www.amazon.com/gp/product/B07WHB8FWC/ref=as_li_tl?ie=UTF8&tag=finxter-20&camp=1789&creative=9325&linkCode=as2&creativeASIN=B07WHB8FWC&linkId=447a5019492081d1b2892d9470bb29fc'),
('Leaving the Rat Race with Python book','https://www.amazon.com/Leaving-Rat-Race-Python-Developing-ebook/dp/B08G1XLDNB/ref=sr_1_5?qid=1670149592&refinements=p_27%3AChristian+Mayer&s=digital-text&sr=1-5&text=Christian+Mayer'),
('Coffee break Python book','https://www.amazon.com/Coffee-Break-Python-Kickstart-Understanding-ebook/dp/B07GSTJPFD/ref=sr_1_4?qid=1670149592&refinements=p_27%3AChristian+Mayer&s=digital-text&sr=1-4&text=Christian+Mayer'),
('Coffee Break Python Slicing','https://www.amazon.com/Coffee-Break-Python-Slicing-Workouts-ebook/dp/B07KSHLLG5/ref=sr_1_8?qid=1670149592&refinements=p_27%3AChristian+Mayer&s=digital-text&sr=1-8&text=Christian+Mayer'),
('Coffee Break Pandas','https://www.amazon.com/Coffee-Break-Pandas-Puzzles-Superpower-ebook/dp/B08NG8QHW7/ref=sr_1_9?qid=1670149592&refinements=p_27%3AChristian+Mayer&s=digital-text&sr=1-9&text=Christian+Mayer'),
('',''),
]

main.py:

import pygame
import sys
import time
import random
import datetime
import webbrowser
import config

HEIGHT = config.HEIGHT
WIDTH = config.WIDTH
LINEHEIGHT = 30
CLICKABLEHEIGHT = 150
FONT = 'verdana'

# Colors
BLACK = (0, 0, 0)
GRAY = (200, 200, 200)
WHITE = (255, 255, 255)
RED = (200, 0, 0)

# use a seeded random shuffle on numbers 1 to 25
# specify seed to always generate the same random sequence of door labels for our 5x5 grid
doormap=list(range(1,HEIGHT*WIDTH+1))
random.seed(1)
random.shuffle(doormap)

#get current time
d = datetime.datetime.now()
#get the day of month
datemax=int(d.strftime("%d"))

#lambda function to convert eg 1 to 1st, 2 to 2nd
ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])

# Create game
pygame.init()
size = width, height = 500, 500
screen = pygame.display.set_mode(size)
pygame.display.set_caption('Finxter Advent Calendar')

#font_name = 'calibr'#pygame.font.get_default_font()
bigfont = pygame.font.SysFont(FONT, 20)
hugefont = pygame.font.SysFont(FONT, 40)

# Compute board size
BOARD_PADDING = 10
board_width = width - (BOARD_PADDING * 2)
board_height = height - (BOARD_PADDING * 2)
cell_size = int(min(board_width / WIDTH, board_height / HEIGHT))
halfcell_size=cell_size/2+10
board_origin = (BOARD_PADDING, BOARD_PADDING)

# utility function to add text to screen
def addText(text, position, color):
    giftText = bigfont.render(text, True, color)
    giftRect = giftText.get_rect()
    giftRect.center = position
    screen.blit(giftText, giftRect)

# start from the main grid when dooropen>0 it indicate the door/image to display
dooropen=0

while True:

    # Check if game quit
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            sys.exit()

    screen.fill(BLACK)

    if dooropen:
        # paint the correct image onto the screen
        filename='images/image'+ (f"{dooropen:02d}") +".jpg"
        image = pygame.image.load(filename)
        rect = image.get_rect()
        screen.blit(image, rect)

        # going to create a semi transparent clickable area which can link to a URL
        # the URL 'gifts' are stored in the imported config
        giftlabel,gifturl = config.gifts[dooropen%len(config.gifts)]
        s = pygame.Surface((width,CLICKABLEHEIGHT))
        s.set_alpha(200)
        s.fill(GRAY)
        screen.blit(s, (0,height-CLICKABLEHEIGHT))
        clickable = pygame.Rect(0,height-CLICKABLEHEIGHT,width,CLICKABLEHEIGHT)

        # add text to the clickable area
        if(dooropen==25):
            addText("It's Christmas!", ((width / 2), 4*cell_size+LINEHEIGHT), RED)
        else:
            addText("On the "+ ordinal(25-dooropen) +" night before xmas", ((width / 2), 4*cell_size), WHITE)
            addText("Finxter brought unto me:", ((width / 2), 4*cell_size+LINEHEIGHT), WHITE)
        addText(giftlabel, ((width / 2), 4*cell_size+(2*LINEHEIGHT)), RED)

        # open URL in browser if clickable area clicked
        # otherwise close the door by setting dooropen to 0
        click, _, _ = pygame.mouse.get_pressed()
        if click == 1:
            mouse = pygame.mouse.get_pos()
            if clickable.collidepoint(mouse) :
                time.sleep(0.2)
                webbrowser.open(gifturl, new=dooropen, autoraise=True)
            else:
                dooropen = 0
                time.sleep(0.2)

        pygame.display.flip()
        continue

    # Draw board
    cells = []
    for i in range(HEIGHT):
        row = []
        for j in range(WIDTH):

            # Draw rectangle for cell
            rect = pygame.Rect(
                board_origin[0] + j * cell_size,
                board_origin[1] + i * cell_size,
                cell_size, cell_size
            )
            pygame.draw.rect(screen, GRAY, rect)
            pygame.draw.rect(screen, WHITE, rect, 3)
            doornumber=doormap[(j*HEIGHT)+i]
            label = hugefont.render(str(doornumber), True, RED if doornumber<=datemax else WHITE)
            labelRect = label.get_rect()
            labelRect.center = (j * cell_size+halfcell_size, i * cell_size+halfcell_size)
            screen.blit(label, labelRect)

            row.append(rect)
        cells.append(row)

    left, _, right = pygame.mouse.get_pressed()

    if left:
        mouse = pygame.mouse.get_pos()
        for i in range(HEIGHT):
            for j in range(WIDTH):
                if cells[i][j].collidepoint(mouse) and dooropen==0:
                    dooropen=doormap[(j*HEIGHT)+i]
                    # did they attempt to open a door ahead of current date
                    # dont allow that!
                    if dooropen>datemax:
                        dooropen=0
                    time.sleep(0.2)

    pygame.display.flip()

prepaint.py:

import urllib.request
import config
import requests
from os.path import exists

def aipaint(description):
    r = requests.post(
        "https://api.deepai.org/api/impressionism-painting-generator",
        data={
            'text': description,
        },
        headers={'api-key': 'quickstart-QUdJIGlzIGNvbWluZy4uLi4K'}
    )
    ret = r.json()
    return ret

count=1
while count<=25:
    # pop off the first image description
    description=config.descriptions.pop(0)
    # push it back on at end (this ensures we cycle through descriptions and never run out )
    # obviously best is there are 25 descriptions though
    config.descriptions.append(description)
    filename='./images/image'+ (f"{count:02d}") +".jpg"
    if(exists(filename)):
        print(filename+" already exists")
    else:
        print("Painting: "+description)
        try:
            painting=aipaint(description)
            if(len(painting)<2):
                if(painting['status']):
                    print("You've probably run out of deepai (free) credits.")
                    print("Status returned "+painting['status'])
            else:
                print("Storing as "+filename)
                urllib.request.urlretrieve(painting['output_url'], filename)
                print("Paint now dried on "+filename)
        except Exception as ex:
            print("Paint "+filename+" failed")
            print(ex)
    count+=1

print("Paintings complete and paint has dried!")
print("Now run main.py to access the Advent calendar")

How to Convert an Octal Escape Sequence in Python – And Vice Versa?

This tutorial will show you how to convert an

  • octal escape sequence to a Python string, and a
  • Python string to an octal escape sequence.

But let’s quickly recap what an octal escape sequence is in the first place! 👇

What Is An Octal Escape Sequence?💡

An octal escape sequence is a backslash followed by one to three octal digits (0-7), such as \150, which encodes the ASCII character 'h'. Each octal escape sequence encodes one character; note that \000 encodes the non-printable NUL character. You can chain multiple octal escape sequences together to obtain a word.

Problem Formulation

💬 Question: How to convert an octal escape sequence to a string and vice versa in Python?

Examples

Octal                                                   String
\101 \102 \103                                          'ABC'
\101 \040 \102 \040 \103                                'A B C'
\141 \142 \143                                          'abc'
\150 \145 \154 \154 \157                                'hello'
\150 \145 \154 \154 \157 \040 \167 \157 \162 \154 \144  'hello world'

Python Octal to String Built-In Conversion

You don’t need to “convert” an octal escape sequence to a Unicode string if you already have it represented by a bytes object. Python automatically resolves the encoding.

See here:

>>> b'\101\102\103'
b'ABC'
>>> b'101\040\102\040\103'
b'101 B C'
>>> b'\101\040\102\040\103'
b'A B C'
>>> b'\141\142\143'
b'abc'
>>> b'\150\145\154\154\157'
b'hello'
>>> b'\150\145\154\154\157\040\167\157\162\154\144'
b'hello world'

Python Octal to String Explicit Conversion

The bytes.decode('unicode-escape') function converts a given bytes object represented by an (octal) escape sequence to a Python string. For example, br'\101'.decode('unicode-escape') yields the Unicode (string) character 'A'.

def octal_to_string(x):
    ''' Converts an octal escape sequence to a string'''
    return x.decode('unicode-escape')

Example: Convert the octal representations presented above:

octals = [br'\101\102\103',
          br'\101\040\102\040\103',
          br'\141\142\143',
          br'\150\145\154\154\157',
          br'\150\145\154\154\157\040\167\157\162\154\144']

for octal in octals:
    print(octal_to_string(octal))

This leads to the following expected output:

ABC
A B C
abc
hello
hello world

Python String to Octal

To convert a Python string to an octal escape sequence representation, iterate over each character c and convert it to an octal escape sequence using oct(ord(c)).

The result uses octal representation such as '0o123'. You can do string manipulation, such as slicing and string concatenation, to bring it into the final version '\123', for instance.

Here’s the function that converts string to octal escape sequence format:

def string_to_octal(x):
    ''' Converts a string to an octal escape sequence'''
    return '\\' + '\\'.join(oct(ord(c))[2:] for c in x)

✅ Background:

  • The ord() function takes a character (=string of length one) as an input and returns the Unicode number of this character. For example, ord('a') returns the Unicode number 97. The inverse function of ord() is the chr() function, so chr(ord('a')) returns the original character 'a'.
  • The oct() function takes one integer argument and returns an octal string with prefix "0o".

Let’s check how our strings can be converted to the octal escape sequence representation using this function:

strings = ['ABC', 'A B C', 'abc', 'hello', 'hello world']

for s in strings:
    print(string_to_octal(s))

And here’s the expected output:

\101\102\103
\101\40\102\40\103
\141\142\143
\150\145\154\154\157
\150\145\154\154\157\40\167\157\162\154\144

If you don’t like the one-liner solution provided above, feel free to use this multi-liner instead that may be easier to read:

def string_to_octal(x):
    ''' Converts a string to an octal escape sequence'''
    result = ''
    for c in x:
        result += '\\' + oct(ord(c))[2:]
    return result
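Note that the functions above emit \40 for a space, whereas the table at the top shows \040. As a minimal sketch, assuming ASCII input, the hypothetical variant string_to_octal_padded below always writes three octal digits per character and then checks that the result round-trips through the decode approach shown earlier. The function name is an illustration, not part of the original article.

```python
def string_to_octal_padded(text):
    """Hypothetical variant: always emit three octal digits per character."""
    # format(n, '03o') renders n in octal, zero-padded to width 3
    return ''.join('\\' + format(ord(c), '03o') for c in text)

def octal_to_string(octal_bytes):
    """Decode a bytes object containing octal escape sequences."""
    return octal_bytes.decode('unicode-escape')

escaped = string_to_octal_padded('A B C')
print(escaped)   # \101\040\102\040\103

# Round trip: encode the escaped text to bytes, then decode the escapes
restored = octal_to_string(escaped.encode('ascii'))
print(restored)  # A B C
```

Fixed-width output makes the sequences easier to compare by eye; both the padded and unpadded forms decode to the same string.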

If you want to train your Python one-liner skills instead, check out my book! 👇

Python One-Liners Book: Master the Single Line First!

Python programmers will improve their computer science skills with these useful one-liners.

Python One-Liners

Python One-Liners will teach you how to read and write “one-liners”: concise statements of useful functionality packed into a single line of code. You’ll learn how to systematically unpack and understand any line of Python code, and write eloquent, powerfully compressed Python like an expert.

The book’s five chapters cover (1) tips and tricks, (2) regular expressions, (3) machine learning, (4) core data science topics, and (5) useful algorithms.

Detailed explanations of one-liners introduce key computer science concepts and boost your coding and analytical skills. You’ll learn about advanced Python features such as list comprehension, slicing, lambda functions, regular expressions, map and reduce functions, and slice assignments.

You’ll also learn how to:

  • Leverage data structures to solve real-world problems, like using Boolean indexing to find cities with above-average pollution
  • Use NumPy basics such as array, shape, axis, type, broadcasting, advanced indexing, slicing, sorting, searching, aggregating, and statistics
  • Calculate basic statistics of multidimensional data arrays and the K-Means algorithms for unsupervised learning
  • Create more advanced regular expressions using grouping and named groups, negative lookaheads, escaped characters, whitespaces, character sets (and negative characters sets), and greedy/nongreedy operators
  • Understand a wide range of computer science topics, including anagrams, palindromes, supersets, permutations, factorials, prime numbers, Fibonacci numbers, obfuscation, searching, and algorithmic sorting

By the end of the book, you’ll know how to write Python at its most refined, and create concise, beautiful pieces of “Python art” in merely a single line.

Get your Python One-Liners on Amazon!!

Python | Split String Variable Spaces

⭐Summary: The most efficient way to split a string on a variable number of spaces is to call split() without arguments: given_string.split(). An alternative approach is to use functions from the re (regex) module to split the string at runs of whitespace.

Minimal Example

text = "a b c d"
# Method 1
print(text.split())
# Method 2
import re
print(re.split(r'\s+', text))
# Method 3
print([x for x in re.findall(r'\S+', text) if x != ''])
# Method 4
print(re.sub(r'\s+', ',', text).split(','))
# Method 5
print(list(filter(None, text.split(' '))))

# ['a', 'b', 'c', 'd']

Problem Formulation

📜Problem: Given a string. How will you split the string using multiple spaces?

Example

# Input
text = "abc   xyz  lmn pqr"
# Output
['abc', 'xyz', 'lmn', 'pqr']

The given input has multiple spaces between the substrings, i.e., there are three spaces after abc, two spaces after xyz, and a single space after lmn. So, not only are there multiple spaces between the substrings, but the number of spaces also varies. Can you split the string by varied and multiple spaces?

Although the question might look daunting at first, the solutions are easier than you might imagine once you get the hang of it. So, without further delay, let us dive into the different ways of solving the given problem.

Method 1: Using split()

The built-in split(sep) method allows you to split a string in Python based on a given delimiter. If you do not pass any delimiter, split() treats every run of consecutive whitespace characters as a single separator and discards leading and trailing whitespace.

You can use this default behavior to split the given string at multiple spaces simply by calling split() with no arguments.

Code:

text = "abc   xyz  lmn pqr"
print(text.split()) # ['abc', 'xyz', 'lmn', 'pqr']
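To see why calling split() without arguments matters here, the following small sketch contrasts it with an explicit single-space delimiter. The input string assumes three, two, and one space between the words, as described in the problem formulation.

```python
text = "abc   xyz  lmn pqr"

# No argument: runs of whitespace count as one separator
default_split = text.split()
print(default_split)   # ['abc', 'xyz', 'lmn', 'pqr']

# Explicit ' ': every single space is a separator, so empty strings appear
single_space_split = text.split(' ')
print(single_space_split)  # ['abc', '', '', 'xyz', '', 'lmn', 'pqr']
```

The empty strings in the second result are what the other methods in this article have to work around.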

📚Recommended Digest: Python String split()

Method 2: Using re.split

The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

Approach: To split the string on multiple space characters, use re.split(r"\s+", text). Here, \s is a special sequence that matches any whitespace character, and the + quantifier extends the match to one or more consecutive occurrences. So, whenever there is a single space or a run of several spaces, the string is split once at that position.

Code:

import re
text = "abc   xyz  lmn pqr"
print(re.split(r'\s+', text))
# ['abc', 'xyz', 'lmn', 'pqr']
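One difference from split() is worth knowing: if the string starts or ends with whitespace, re.split returns empty strings at the edges. The sketch below uses a made-up input with padding on both sides to illustrate this and one possible way around it.

```python
import re

padded = "  abc   xyz  "

# Leading and trailing whitespace produce empty strings at the edges
raw_result = re.split(r'\s+', padded)
print(raw_result)       # ['', 'abc', 'xyz', '']

# Stripping first avoids the empty edge items
stripped_result = re.split(r'\s+', padded.strip())
print(stripped_result)  # ['abc', 'xyz']

# str.split() without arguments handles this case on its own
print(padded.split())   # ['abc', 'xyz']
```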

📚Recommended Read:  Python Regex Split.

Method 3: Using re.findall

The re.findall(pattern, string) method scans the string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

📚Recommended Read: Python re.findall() – Everything You Need to Know

Code:

import re
text = "abc   xyz  lmn pqr"
print([x for x in re.findall(r'\S+', text) if x != ''])
# ['abc', 'xyz', 'lmn', 'pqr']

Method 4: Using re.sub

The regex function re.sub(P, R, S) replaces all occurrences of the pattern P with the replacement R in string S. It returns a new string. For example, if you call re.sub('a', 'b', 'aabb'), the result will be the new string 'bbbb' with all characters 'a' replaced by 'b'.

Approach: Use the re.sub method to replace every run of whitespace characters in the given string with a single comma. The string then has commas instead of spaces, and you can simply split it using the normal string split method by passing a comma as the delimiter. Note that this only works reliably if the original string contains no commas of its own.

Silly! Isn’t it? Nevertheless, it works.

Code:

import re
text = "abc   xyz  lmn pqr"
res = re.sub(r'\s+', ',', text).split(',')
print(res)
# ['abc', 'xyz', 'lmn', 'pqr']

Method 5: Using filter

Python’s built-in filter() function keeps only the elements that pass a filtering condition. It takes two arguments: function and iterable. The function assigns a Boolean value to each element in the iterable to check whether the element will pass the filter or not. It returns an iterator with the elements that pass the filtering condition.

📚Related Read: Python filter()

Approach: Split the string at every single space with split(' '). Consecutive spaces then produce empty strings in the resulting list. Pass None as the first argument and this list as the second argument to filter(): with None as the function, filter() drops every falsy element, which removes the empty strings and keeps only the non-empty substrings. As filter() returns an iterator, use list() to convert the result into a list.

Code:

text = "abc   xyz  lmn pqr"
print(list(filter(None, text.split(' '))))
# ['abc', 'xyz', 'lmn', 'pqr']

Conclusion

Hurrah! We have successfully solved the given problem using as many as five different ways. I hope you enjoyed reading this article and it helped you in your Python coding journey. Please subscribe and stay tuned for more interesting articles!

Happy coding! 🙂

📚Suggested Read: Python Regex Superpower [Full Tutorial]


Do you want to master the regex superpower? Check out my new book The Smartest Way to Learn Regular Expressions in Python with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video.

Python | Split String Multiple Whitespaces

🍎Summary: The most efficient way to split a string using multiple whitespaces is to use the split function like so given_string.split(). An alternate approach is to use different functions of the regex package to split the string at multiple whitespaces.

Minimal Example:

import re

text = "mouse\nsnake\teagle human"

# Method 1
print(text.split())

# Method 2
res = re.split(r"\s+", text)
print(res)

# Method 3
res = re.sub(r'\s+', ',', text).split(',')
print(res)

# Method 4
print(re.findall(r'\S+', text))

# ['mouse', 'snake', 'eagle', 'human']

Problem Formulation

📜Problem: Given a string. How will you split the string using multiple whitespaces?

Example

# Input
text = "abc\nlmn\tpqr xyz\rmno"
# Output
['abc', 'lmn', 'pqr', 'xyz', 'mno']

There are numerous ways of solving the given problem. So, without further ado, let us dive into the solutions.

Method 1: Using Regex

The best way to deal with multiple delimiters is to use the flexibility of the regular expressions library. There are different functions available in the regex library that you can use to split the given string. Let’s go through each one by one.

1.1 Using re.split

The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

📚Recommended Read:  Python Regex Split.

Approach: To split the string on multiple whitespace characters, use re.split(r"\s+", text), where \s is a special sequence that matches any whitespace character (space, tab, newline, carriage return, and so on) and the + quantifier extends the match to one or more consecutive occurrences.

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.split(r"\s+", text)
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']

1.2 Using re.findall

The re.findall(pattern, string) method scans the string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

📚Recommended Read: Python re.findall() – Everything You Need to Know

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
print(re.findall(r'\S+', text))
# ['abc', 'lmn', 'pqr', 'xyz', 'mno']

Explanation: In the expression re.findall(r'\S+', text), all runs of non-whitespace characters are found and stored in a list. Here, \S+ matches one or more consecutive characters that are not whitespace (letters, digits, punctuation, and so on).
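Because findall collects the pieces rather than cutting at the separators, it does not produce empty strings when the input begins or ends with whitespace. A short sketch with a made-up input shows the contrast with re.split:

```python
import re

sample = "\tabc\nlmn  "

# findall keeps only the runs of non-whitespace characters
found = re.findall(r'\S+', sample)
print(found)   # ['abc', 'lmn']

# re.split cuts at the whitespace, leaving empty strings at both ends
pieces = re.split(r'\s+', sample)
print(pieces)  # ['', 'abc', 'lmn', '']
```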

1.3 Using re.sub

The regex function re.sub(P, R, S) replaces all occurrences of the pattern P with the replacement R in string S. It returns a new string. For example, if you call re.sub('a', 'b', 'aabb'), the result will be the new string 'bbbb' with all characters 'a' replaced by 'b'.

Approach: Use the re.sub method to replace every run of whitespace characters in the given string with a single comma. The string then has commas instead of whitespace characters, and you can simply split it using the normal string split method by passing a comma as the delimiter.

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.sub(r'\s+', ',', text).split(',')
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']
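One caveat with the comma trick: it assumes the input contains no commas of its own. The sketch below uses a made-up input to show how an existing comma leads to an extra split, and how re.split avoids the problem.

```python
import re

data = "a,b\tc d"

# The comma already in the data becomes an extra split point
via_sub = re.sub(r'\s+', ',', data).split(',')
print(via_sub)    # ['a', 'b', 'c', 'd']

# Splitting on whitespace directly keeps 'a,b' together
via_split = re.split(r'\s+', data)
print(via_split)  # ['a,b', 'c', 'd']
```

If you do want the re.sub route, pick a placeholder character that cannot occur in your data.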


Method 2: Using split()

By default the split function splits a given string at whitespaces. Meaning, if you do not pass any delimiter to the split function then the string will be split at whitespaces. You can use this default property of the split function and successfully split the given string at multiple whitespaces just by using the split() function.

Code:

text = "abc\nlmn\tpqr xyz\rmno"
print(text.split())
# ['abc', 'lmn', 'pqr', 'xyz', 'mno']

📚Recommended Digest: Python String split()

Conclusion

We have successfully solved the given problem using different approaches. Simply using split could do the job for you. However, feel free to explore and try out the other options mentioned above. I hope this article helped you in your Python coding journey. Please subscribe and stay tuned for more interesting articles.

Happy Pythoning! 🐍 


Python | Split String by Number

✨Summary: To split a string by a number, use re.split() with the “\d+” pattern.

Minimal Example

my_string = "#@1abc3$!*5xyz"

# Method 1
import re
res = re.split(r'\d+', my_string)
print(res)

# Method 2
res = re.findall(r'\D+', my_string)
print(res)

# Method 3
from itertools import groupby
li = [''.join(g) for _, g in groupby(my_string, str.isdigit)]
res = [x for x in li if x.isdigit() == False]
print(res)

# Method 4
res = []
for i in my_string:
    if i.isdigit() == True:
        my_string = my_string.replace(i, ",")
print(my_string.split(","))

# Outputs:
# ['#@', 'abc', '$!*', 'xyz']

Problem Formulation

📜Problem: Given a string containing different characters. How will you split the string whenever a number appears?

Method 1: re.split()

The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

Code:

import re
my_string = "#@1abc3$!*5xyz"
res = re.split(r'\d+', my_string)
print(res) # ['#@', 'abc', '$!*', 'xyz']

Explanation: The \d special sequence matches any digit between 0 and 9. Because of the + quantifier, \d+ matches a maximal run of consecutive digits, so each run acts as a single delimiter and you split along the digit-word boundary.
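The effect of the + quantifier shows up as soon as the string contains multi-digit numbers. The following sketch uses made-up inputs to compare \d+ with \d, and to show what happens when the string starts or ends with a digit.

```python
import re

sample = "abc123def4ghi"

# \d+ treats the whole run '123' as one delimiter
run_split = re.split(r'\d+', sample)
print(run_split)    # ['abc', 'def', 'ghi']

# \d splits at every single digit, leaving empty strings between them
digit_split = re.split(r'\d', sample)
print(digit_split)  # ['abc', '', '', 'def', 'ghi']

# A digit at the start or end yields an empty string at that edge
edge_split = re.split(r'\d+', "1abc2")
print(edge_split)   # ['', 'abc', '']
```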

Method 2: re.findall()

The re.findall(pattern, string) method scans string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

Code:

import re
my_string = "#@1abc3$!*5xyz"
res = re.findall(r'\D+', my_string)
print(res) # ['#@', 'abc', '$!*', 'xyz']

Explanation: The \D special sequence matches any character that is not a digit between 0 and 9. Thus, \D+ finds all groups of consecutive non-digit characters, i.e., the substrings that sit between the digits.

Method 3: itertools.groupby()

Code:

from itertools import groupby
my_string = "#@1abc3$!*5xyz"
li = [''.join(g) for _, g in groupby(my_string, str.isdigit)]
res = [x for x in li if x.isdigit() == False]
print(res) # ['#@', 'abc', '$!*', 'xyz']

Explanation:

  • The itertools.groupby(iterable, key=None) function creates an iterator that returns tuples (key, group-iterator) grouped by each value of key. We use the str.isdigit() function as key function.
  • The str.isdigit() function returns True if the string consists only of numeric characters. Thus, you will have a list created by using numbers as separators. Note that this list will also contain the numbers as items within it.
  • In order to eliminate the numbers, use another list comprehension that checks if an element in the list returned previously is a digit or not with the help of the isdigit method. If it is a digit, the item will be discarded. Otherwise it will be stored in the list.
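The intermediate result described above can be made visible with a short sketch. The variable names groups and non_digit_chunks are chosen here for illustration; the logic follows the same groupby call as the code above.

```python
from itertools import groupby

my_string = "#@1abc3$!*5xyz"

# Each group is a (key, characters) pair; the key says whether the run is digits
groups = [(is_digit, ''.join(chars))
          for is_digit, chars in groupby(my_string, str.isdigit)]
print(groups)
# [(False, '#@'), (True, '1'), (False, 'abc'), (True, '3'),
#  (False, '$!*'), (True, '5'), (False, 'xyz')]

# Keep only the runs whose key is False, i.e. the non-digit chunks
non_digit_chunks = [chunk for is_digit, chunk in groups if not is_digit]
print(non_digit_chunks)  # ['#@', 'abc', '$!*', 'xyz']
```

Filtering on the group key avoids calling isdigit() a second time on the joined strings.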

Method 4: Replace Using a for Loop

Approach: Use a for loop to iterate through the characters of the given string and check whether each character is a digit. As soon as a digit is found, replace it with a delimiter string (we have used a comma here) with the help of the replace() method. In other words, you place a marker character in the string wherever a number appears. Once all the digits are replaced by the separator, split the string by passing the separator as the delimiter to the split() method.

Code:

my_string = "#@1abc3$!*5xyz"
res = []
for i in my_string:
    if i.isdigit():
        my_string = my_string.replace(i, ",")
print(my_string.split(","))
# ['#@', 'abc', '$!*', 'xyz']
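The loop above reassigns the string while iterating over it and returns empty strings for multi-digit numbers (for example, "abc12def" becomes "abc,,def"). As a hedged alternative, the hypothetical helper split_on_digits below builds the replaced string in one pass and drops empty pieces. It assumes the separator character does not occur in the input, and unlike the original loop it discards empty results.

```python
def split_on_digits(text, sep=","):
    """Hypothetical helper: replace each digit with sep, split, drop empty pieces.

    Assumes sep does not already occur in text.
    """
    replaced = ''.join(sep if ch.isdigit() else ch for ch in text)
    return [part for part in replaced.split(sep) if part]

single_digits = split_on_digits("#@1abc3$!*5xyz")
print(single_digits)  # ['#@', 'abc', '$!*', 'xyz']

multi_digits = split_on_digits("abc12def")
print(multi_digits)   # ['abc', 'def']
```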

Conclusion

Phew! We have successfully solved the given problem and managed to do so using four different ways. I hope you found this article helpful and it answered your queries. Please subscribe and stay tuned for more solutions and tutorials.

Happy coding! 🙂

🌐Related Read: How to Split a String Between Numbers and Letters?