Python – Page 55 – Sick Gaming

Posted on December 6, 2020 by sick skills

ASCII Table

The following ASCII table translates character codes, such as those accepted by Python’s chr() function, into their respective symbols. Note that you can write each number in the decimal, binary, octal, or hexadecimal system; it is always the same value.

Decimal	Binary	Octal	Hexadecimal	Symbol	Description
0	0	0	0	NUL	Null char
1	1	1	1	SOH	Start of Heading
2	10	2	2	STX	Start of Text
3	11	3	3	ETX	End of Text
4	100	4	4	EOT	End of Transmission
5	101	5	5	ENQ	Enquiry
6	110	6	6	ACK	Acknowledgement
7	111	7	7	BEL	Bell
8	1000	10	8	BS	Back Space
9	1001	11	9	HT	Horizontal Tab
10	1010	12	0A	LF	Line Feed
11	1011	13	0B	VT	Vertical Tab
12	1100	14	0C	FF	Form Feed
13	1101	15	0D	CR	Carriage Return
14	1110	16	0E	SO	Shift Out
15	1111	17	0F	SI	Shift In
16	10000	20	10	DLE	Data Link Escape
17	10001	21	11	DC1	Device Control 1 (often XON)
18	10010	22	12	DC2	Device Control 2
19	10011	23	13	DC3	Device Control 3 (often XOFF)
20	10100	24	14	DC4	Device Control 4
21	10101	25	15	NAK	Negative Acknowledgement
22	10110	26	16	SYN	Synchronous Idle
23	10111	27	17	ETB	End of Transmit Block
24	11000	30	18	CAN	Cancel
25	11001	31	19	EM	End of Medium
26	11010	32	1A	SUB	Substitute
27	11011	33	1B	ESC	Escape
28	11100	34	1C	FS	File Separator
29	11101	35	1D	GS	Group Separator
30	11110	36	1E	RS	Record Separator
31	11111	37	1F	US	Unit Separator
32	100000	40	20	SPACE	Space
33	100001	41	21	!	Exclamation mark
34	100010	42	22	"	Double quotes (or speech marks)
35	100011	43	23	#	Number
36	100100	44	24	$	Dollar
37	100101	45	25	%	Percent
38	100110	46	26	&	Ampersand
39	100111	47	27	'	Single quote
40	101000	50	28	(	Open parenthesis (or open bracket)
41	101001	51	29	)	Close parenthesis (or close bracket)
42	101010	52	2A	*	Asterisk
43	101011	53	2B	+	Plus
44	101100	54	2C	,	Comma
45	101101	55	2D	-	Hyphen
46	101110	56	2E	.	Period, dot or full stop
47	101111	57	2F	/	Slash or divide
48	110000	60	30	0	Zero
49	110001	61	31	1	One
50	110010	62	32	2	Two
51	110011	63	33	3	Three
52	110100	64	34	4	Four
53	110101	65	35	5	Five
54	110110	66	36	6	Six
55	110111	67	37	7	Seven
56	111000	70	38	8	Eight
57	111001	71	39	9	Nine
58	111010	72	3A	:	Colon
59	111011	73	3B	;	Semicolon
60	111100	74	3C	<	Less than (or open angled bracket)
61	111101	75	3D	=	Equals
62	111110	76	3E	>	Greater than (or close angled bracket)
63	111111	77	3F	?	Question mark
64	1000000	100	40	@	At symbol
65	1000001	101	41	A	Uppercase A
66	1000010	102	42	B	Uppercase B
67	1000011	103	43	C	Uppercase C
68	1000100	104	44	D	Uppercase D
69	1000101	105	45	E	Uppercase E
70	1000110	106	46	F	Uppercase F
71	1000111	107	47	G	Uppercase G
72	1001000	110	48	H	Uppercase H
73	1001001	111	49	I	Uppercase I
74	1001010	112	4A	J	Uppercase J
75	1001011	113	4B	K	Uppercase K
76	1001100	114	4C	L	Uppercase L
77	1001101	115	4D	M	Uppercase M
78	1001110	116	4E	N	Uppercase N
79	1001111	117	4F	O	Uppercase O
80	1010000	120	50	P	Uppercase P
81	1010001	121	51	Q	Uppercase Q
82	1010010	122	52	R	Uppercase R
83	1010011	123	53	S	Uppercase S
84	1010100	124	54	T	Uppercase T
85	1010101	125	55	U	Uppercase U
86	1010110	126	56	V	Uppercase V
87	1010111	127	57	W	Uppercase W
88	1011000	130	58	X	Uppercase X
89	1011001	131	59	Y	Uppercase Y
90	1011010	132	5A	Z	Uppercase Z
91	1011011	133	5B	[	Opening bracket
92	1011100	134	5C	\	Backslash
93	1011101	135	5D	]	Closing bracket
94	1011110	136	5E	^	Caret – circumflex
95	1011111	137	5F	_	Underscore
96	1100000	140	60	`	Grave accent
97	1100001	141	61	a	Lowercase a
98	1100010	142	62	b	Lowercase b
99	1100011	143	63	c	Lowercase c
100	1100100	144	64	d	Lowercase d
101	1100101	145	65	e	Lowercase e
102	1100110	146	66	f	Lowercase f
103	1100111	147	67	g	Lowercase g
104	1101000	150	68	h	Lowercase h
105	1101001	151	69	i	Lowercase i
106	1101010	152	6A	j	Lowercase j
107	1101011	153	6B	k	Lowercase k
108	1101100	154	6C	l	Lowercase l
109	1101101	155	6D	m	Lowercase m
110	1101110	156	6E	n	Lowercase n
111	1101111	157	6F	o	Lowercase o
112	1110000	160	70	p	Lowercase p
113	1110001	161	71	q	Lowercase q
114	1110010	162	72	r	Lowercase r
115	1110011	163	73	s	Lowercase s
116	1110100	164	74	t	Lowercase t
117	1110101	165	75	u	Lowercase u
118	1110110	166	76	v	Lowercase v
119	1110111	167	77	w	Lowercase w
120	1111000	170	78	x	Lowercase x
121	1111001	171	79	y	Lowercase y
122	1111010	172	7A	z	Lowercase z
123	1111011	173	7B	{	Opening brace
124	1111100	174	7C	|	Vertical bar
125	1111101	175	7D	}	Closing brace
126	1111110	176	7E	~	Equivalency sign – tilde
127	1111111	177	7F	DEL	Delete
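
Any row of the table can be reproduced with Python's built-in functions. The following is a minimal sketch; the helper name ascii_row is made up for this illustration, and unlike the table it does not pad hexadecimal values with a leading zero.

```python
def ascii_row(code):
    """Return (decimal, binary, octal, hexadecimal, symbol) for an ASCII code."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII codes range from 0 to 127")
    # format() without the '#' flag omits the 0b/0o/0x prefixes
    return (code, format(code, "b"), format(code, "o"), format(code, "X"), chr(code))

print(ascii_row(65))  # (65, '1000001', '101', '41', 'A')
print(ord("A"))       # 65, the inverse of chr(65)

# The same value written in four number systems
print(65 == 0b1000001 == 0o101 == 0x41)  # True
```

For control characters such as code 10, chr() returns the character itself (a line feed), not the abbreviation shown in the Symbol column.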


The post ASCII Table first appeared on Finxter.

Posted on December 5, 2020 by sick skills

Python bool() Function

Python’s built-in bool(x) function converts value x to a Boolean value True or False. It uses implicit Boolean conversion on the input argument x. Any Python object has an associated truth value. The bool(x) function takes only one argument, the object for which a Boolean value is desired.

Input : bool(1)
Output : True

Input : bool(0)
Output : False

Input : bool(True)
Output : True

Input : bool([1, 2, 3])
Output : True

Input : bool([])
Output : False

But before we move on, I’m excited to present you my brand-new Python book Python One-Liners (Amazon Link).

If you like one-liners, you’ll LOVE the book. It’ll teach you everything there is to know about a single line of Python code. But it’s also an introduction to computer science, data science, machine learning, and algorithms. The universe in a single line of Python!

The book is released in 2020 with the world-class programming book publisher NoStarch Press (San Francisco).

Link: https://nostarch.com/pythononeliners

Examples of the bool() Function

The following code shows you how to use the bool(x) function on different input arguments that all lead to True results.

#####################
# True Boolean Values
#####################

# All integers except 0
print(bool(1))
print(bool(2))
print(bool(42))
print(bool(-1))

# All collections except empty ones
# (lists, tuples, sets)
print(bool([1, 2]))
print(bool([-1]))
print(bool((-1, -2)))
print(bool({1, 2, 3}))

# All floats except 0.0
print(bool(0.1))
print(bool(0.0000001))
print(bool(3.4))

# Output is True for all previous examples

The following list of executions of the function bool(x) all result in Boolean values of False.

#####################
# False Boolean Values
#####################

# Integer 0
print(bool(0))

# Empty collections
# (lists, dicts, tuples)
print(bool([]))
print(bool({}))
print(bool(()))

# Float 0.0
print(bool(0.0))

# Output is False for all previous examples

You can observe multiple properties of the bool() function:

You can pass any object into it and it will always return a Boolean value, because every Python object has an associated truth value: bool() calls the object’s __bool__() method if it defines one, falls back to __len__() otherwise, and treats objects that define neither as True. You can therefore use any object to test a condition, for example in the conditional expression 0 if x else 1.
The vast majority of objects are converted to True. Semantically, this means that they’re non-empty or non-zero.
A minority of objects convert to False. These are the “empty” values, for example empty lists, empty sets, empty tuples, or the number 0.
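
To make the lookup order concrete, here is a small sketch with three made-up classes (Switch, Basket, and Plain are hypothetical names used only for this illustration). It relies on the documented truth-testing rule: __bool__() first, then __len__(), otherwise True.

```python
class Switch:
    """Defines __bool__ explicitly."""
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on


class Basket:
    """No __bool__, so bool() falls back to __len__."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    """Neither method is defined: instances are always truthy."""


print(bool(Switch(False)))  # False
print(bool(Basket([])))     # False
print(bool(Basket([1])))    # True
print(bool(Plain()))        # True

# Truth values drive conditions directly
x = []
print("empty" if not x else "non-empty")  # empty
```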

Summary

Python’s built-in bool(x) function converts value x to a Boolean value True or False.

It uses implicit Boolean conversion on the input argument x.

Any Python object has an associated truth value.

The bool(x) function takes only one argument, the object for which a Boolean value is desired.

Where to Go From Here?

Enough theory, let’s get some practice!

To become successful in coding, you need to get out there and solve real problems for real people. That’s how you can become a six-figure earner easily. And that’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?

Practice projects are how you sharpen your saw in coding!

Do you want to become a code master by focusing on practical code projects that actually earn you money and solve problems for people?

Then become a Python freelance developer! It’s the best way of approaching the task of improving your Python skills—even if you are a complete beginner.

Join my free webinar “How to Build Your High-Income Skill Python” and watch how I grew my coding business online and how you can, too—from the comfort of your own home.

Join the free webinar now!

The post Python bool() Function first appeared on Finxter.

Posted on December 4, 2020 by sick skills

Parsing XML Using BeautifulSoup In Python

Introduction

XML is a markup language used to store and transport data. It stands for eXtensible Markup Language. XML is quite similar to HTML and has almost the same kind of structure, but the two were designed to accomplish different goals.

XML is designed to transport data while HTML is designed to display data. Many systems use incompatible data formats. This makes data exchange between such systems a time-consuming task for web developers, as large amounts of data have to be converted. Further, there is a chance that incompatible data is lost. XML, however, stores data in plain text format, thereby providing a software- and hardware-independent method of storing and sharing data.

Another major difference is that HTML tags are predefined whereas XML tags are not.

❖ Example of XML:

<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Harry Potter</to>
  <from>Albus Dumbledore</from>
  <heading>Reminder</heading>
  <body>It does not do to dwell on dreams and forget to live!</body>
</note>

As mentioned earlier, XML tags are not pre-defined so we need to find the tag that holds the information that we want to extract. Thus there are two major aspects governing the parsing of XML files:

Finding the required tags.
Extracting the data after identifying the tags.

BeautifulSoup and LXML Installation

When it comes to web scraping with Python, BeautifulSoup is the most commonly used library. The recommended way of parsing XML files with BeautifulSoup is to use the lxml parser, which is a separate third-party package.

You can install both libraries using the pip installation tool. Please have a look at our BLOG TUTORIAL to learn how to install them if you want to scrape data from an XML file using Beautiful soup.

TUTORIAL: Installing BeautifulSoup and LXML

Note: Before we proceed with our discussion, please have a look at the following XML file that we will be using throughout this article. (Please create a file with the name sample.xml and copy-paste the code given below to practice further.)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Marsh Marigold</COMMON>
<BOTANICAL>Caltha palustris</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Sunny</LIGHT>
<PRICE>$6.81</PRICE>
<AVAILABILITY>051799</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Cowslip</COMMON>
<BOTANICAL>Caltha palustris</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$9.90</PRICE>
<AVAILABILITY>030699</AVAILABILITY>
</PLANT>
</CATALOG>

Searching The Required Tags in The XML Document

Since the tags are not pre-defined in XML, we must identify the tags and search them using the different methods provided by the BeautifulSoup library. Now, how do we find the right tags? We can do so with the help of BeautifulSoup's search methods.

Beautiful Soup has numerous methods for searching a parse tree. The two most popular and commonly used methods are:

find()
find_all()

We have an entire blog tutorial on the two methods. Please have a look at the following tutorial to understand how these search methods work.

Tutorial: Searching A Parse Tree

If you have read the above-mentioned article, then you can easily use the find and find_all methods to search for tags anywhere in the XML document.
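
As a quick illustration of the two methods on XML data, here is a minimal sketch. It inlines a shortened copy of the catalog instead of reading the file, and it uses Python's bundled html.parser rather than lxml so that it runs without extra packages; both choices are assumptions made for this example.

```python
from bs4 import BeautifulSoup

# A shortened copy of the sample catalog (inlined for this sketch)
xml = """<CATALOG>
<PLANT><COMMON>Bloodroot</COMMON><PRICE>$2.44</PRICE></PLANT>
<PLANT><COMMON>Cowslip</COMMON><PRICE>$9.90</PRICE></PLANT>
</CATALOG>"""

# html.parser lowercases tag names, so we search for lowercase names
soup = BeautifulSoup(xml, "html.parser")

first = soup.find("common")          # first match, or None if nothing matches
all_prices = soup.find_all("price")  # list of every match

print(first.text)                    # Bloodroot
print([p.text for p in all_prices])  # ['$2.44', '$9.90']
print(soup.find("zone"))             # None
```

Note that find() returns None when nothing matches, so it is worth checking the result before accessing attributes on it.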

Relationship Between Tags

It is extremely important to understand the relationship between tags, especially while scraping data from XML documents.

The three key relationships in the XML parse tree are:

Parent: The tag that directly encloses a given tag; it serves as the reference point for navigating to its child tags.
Children: The tags contained within the parent tag.
Siblings: As the name suggests these are the tags that exist on the same level of the parse tree.

Let us have a look at how we can navigate the XML parse tree using the above relationships.

Finding Parents

❖ The parent attribute allows us to find the parent/reference tag as shown in the example below.

Example: In the following code we will find the parent of the first common tag.

print(soup.common.parent.name)

Output:

plant

Note: The name attribute allows us to extract the name of the tag instead of extracting the entire content.

Finding Children

❖ The children attribute allows us to find the child tags as shown in the example below.

Example: In the following code we will find out the children of the plant tag.

for child in soup.plant.children:
    # Whitespace between tags shows up as strings whose name is None
    if child.name is not None:
        print(child.name)

Output:

common
botanical
zone
light
price
availability

Finding Siblings

A tag can have siblings before and after it.

❖ The previous_siblings attribute returns the siblings before the referenced tag, and the next_siblings attribute returns the siblings after it.

Example: The following code finds the previous and next sibling tags of the light tag of the XML document.

print("***Previous Siblings***")
for sibling in soup.light.previous_siblings:
    # Skip the whitespace strings between tags
    if sibling.name is not None:
        print(sibling.name)

print("\n***Next Siblings***")
for sibling in soup.light.next_siblings:
    if sibling.name is not None:
        print(sibling.name)

Output:

***Previous Siblings***
zone
botanical
common

***Next Siblings***
price
availability
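
The loops above filter out whitespace strings by checking the name attribute. As an alternative sketch, the methods find_previous_siblings() and find_next_siblings(), called without arguments, return only tags. The example inlines a single plant and uses html.parser so that it is self-contained; both are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# One plant from the sample file, inlined for this sketch
xml = """<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>"""
soup = BeautifulSoup(xml, "html.parser")

light = soup.find("light")
parent = light.parent.name
# Previous siblings are returned nearest first
before = [tag.name for tag in light.find_previous_siblings()]
after = [tag.name for tag in light.find_next_siblings()]

print(parent)  # plant
print(before)  # ['zone', 'botanical', 'common']
print(after)   # ['price', 'availability']
```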

Extracting Data From Tags

By now, we know how to navigate and find data within tags. Let us have a look at the attributes that help us to extract data from the tags.

Text And String Attributes

To access the text values within tags, you can use the text or string attribute.

Example: Let us extract the text of every common and botanical tag using the text and string attributes.

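# Collect the tags first (the same lookups are used in the final solution)
plant_name = soup.find_all('common')
scientific_name = soup.find_all('botanical')

print('***PLANT NAME***')
for tag in plant_name:
    print(tag.text)
print('\n***BOTANICAL NAME***')
for tag in scientific_name:
    print(tag.string)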

Output:

***PLANT NAME***
Bloodroot
Marsh Marigold
Cowslip

***BOTANICAL NAME***
Sanguinaria canadensis
Caltha palustris
Caltha palustris

The Contents Attribute

The contents attribute allows us to extract the entire content from the tags, that is the tag along with the data. The contents attribute returns a list, therefore we can access its elements using their index.

Example:

print(soup.plant.contents)
# Accessing content using index
print()
print(soup.plant.contents[1])

Output:

['\n', <common>Bloodroot</common>, '\n', <botanical>Sanguinaria canadensis</botanical>, '\n', <zone>4</zone>, '\n', <light>Mostly Shady</light>, '\n', <price>$2.44</price>, '\n', <availability>031599</availability>, '\n']

<common>Bloodroot</common>

Pretty Printing The Beautiful Soup Object

If you observe closely, the tags look rather messy when we print them on the screen. While this does not affect productivity directly, a cleaner, structured print style helps us read and parse the document more effectively.

The following code shows how the output looks when we print the BeautifulSoup object normally:

print(soup)

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><html><body><catalog>
<plant>
<common>Bloodroot</common>
<botanical>Sanguinaria canadensis</botanical>
<zone>4</zone>
<light>Mostly Shady</light>
<price>$2.44</price>
<availability>031599</availability>
</plant>
<plant>
<common>Marsh Marigold</common>
<botanical>Caltha palustris</botanical>
<zone>4</zone>
<light>Mostly Sunny</light>
<price>$6.81</price>
<availability>051799</availability>
</plant>
<plant>
<common>Cowslip</common>
<botanical>Caltha palustris</botanical>
<zone>4</zone>
<light>Mostly Shady</light>
<price>$9.90</price>
<availability>030699</availability>
</plant>
</catalog>
</body></html>

Now let us use the prettify method to improve the appearance of our output.

print(soup.prettify())

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html>
 <body>
  <catalog>
   <plant>
    <common>
     Bloodroot
    </common>
    <botanical>
     Sanguinaria canadensis
    </botanical>
    <zone>
     4
    </zone>
    <light>
     Mostly Shady
    </light>
    <price>
     $2.44
    </price>
    <availability>
     031599
    </availability>
   </plant>
   <plant>
    <common>
     Marsh Marigold
    </common>
    <botanical>
     Caltha palustris
    </botanical>
    <zone>
     4
    </zone>
    <light>
     Mostly Sunny
    </light>
    <price>
     $6.81
    </price>
    <availability>
     051799
    </availability>
   </plant>
   <plant>
    <common>
     Cowslip
    </common>
    <botanical>
     Caltha palustris
    </botanical>
    <zone>
     4
    </zone>
    <light>
     Mostly Shady
    </light>
    <price>
     $9.90
    </price>
    <availability>
     030699
    </availability>
   </plant>
  </catalog>
 </body>
</html>

The Final Solution

We are now well versed with all the concepts required to extract data from a given XML document. It is now time to have a look at the final code where we shall be extracting the Name, Botanical Name, and Price of each plant in our example XML document (sample.xml).

Please follow the comments along with the code given below to get an understanding of the logic used in the solution.

from bs4 import BeautifulSoup

# Open and read the XML file
with open("sample.xml", "r") as file:
    contents = file.read()

# Create the BeautifulSoup object and use the lxml parser
soup = BeautifulSoup(contents, 'lxml')

# Extract the contents of the common, botanical and price tags
plant_name = soup.find_all('common')  # store the name of the plant
scientific_name = soup.find_all('botanical')  # store the scientific name of the plant
price = soup.find_all('price')  # store the price of the plant

# Use a for loop along with the enumerate function that keeps count of each iteration
for n, title in enumerate(plant_name):
    # print the name of the plant using text
    print("Plant Name:", title.text)
    # use the counter to access the matching scientific name
    print("Botanical Name:", scientific_name[n].text)
    # use the counter to access the matching price
    print("Price:", price[n].text)
    print()

Output:

Plant Name: Bloodroot
Botanical Name: Sanguinaria canadensis
Price: $2.44

Plant Name: Marsh Marigold
Botanical Name: Caltha palustris
Price: $6.81

Plant Name: Cowslip
Botanical Name: Caltha palustris
Price: $9.90
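
The solution above pairs three separate lists by index, which works as long as every plant contains all three tags. As a variation, the sketch below walks over each plant and searches inside it, so a missing tag does not shift the other values. The helper name field, the "n/a" default, and the shortened data (the second plant's price is deliberately left out) are assumptions made for this illustration, as is the use of html.parser instead of lxml.

```python
from bs4 import BeautifulSoup

# Shortened sample data; the second plant has no <PRICE> on purpose
xml = """<CATALOG>
<PLANT><COMMON>Bloodroot</COMMON><BOTANICAL>Sanguinaria canadensis</BOTANICAL><PRICE>$2.44</PRICE></PLANT>
<PLANT><COMMON>Marsh Marigold</COMMON><BOTANICAL>Caltha palustris</BOTANICAL></PLANT>
</CATALOG>"""
soup = BeautifulSoup(xml, "html.parser")


def field(plant, name, default="n/a"):
    """Text of the first <name> tag inside plant, or default if it is missing."""
    tag = plant.find(name)
    return tag.get_text(strip=True) if tag is not None else default


# One tuple per plant: (common name, botanical name, price)
rows = [
    (field(p, "common"), field(p, "botanical"), field(p, "price"))
    for p in soup.find_all("plant")
]
for row in rows:
    print(row)
```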

Conclusion

XML documents are an important means of transporting data, and hopefully after reading this article you are well equipped to extract the data you want from these documents. You might also want to have a look at this video series where you can learn how to scrape webpages.

Please subscribe and stay tuned for more interesting articles in the future.


The post Parsing XML Using BeautifulSoup In Python first appeared on Finxter.

Posted on December 3, 2020 by sick skills

How to Install Python?

The post How to Install Python? first appeared on Finxter.

Posted on December 2, 2020 by sick skills

Premature Optimization is the Root of All Evil

This chapter draft is part of my upcoming book “From One to Zero” (NoStarch 2021). You’ll learn about the concept of premature optimization and why it hurts your programming productivity. Premature optimization is one of the main problems of poorly written code. But what is it anyway?

Definition Premature Optimization

Definition: Premature optimization is the act of spending valuable resources—such as time, effort, lines of code, or even simplicity—on unnecessary code optimizations.

There’s nothing wrong with optimized code.

The problem is that there’s no such thing as a free lunch. When you think you are optimizing a code snippet, what you’re really doing is trading one variable (e.g., complexity) against another variable (e.g., performance).

Sometimes you can obtain clean code that is also more performant and easier to read—but you must spend time to get to this state! Other times, you prematurely spend more lines of code on a state-of-the-art algorithm to improve execution speed. For example, you may add 30% more lines of code to improve execution speed by 0.1%. These types of trade-offs will screw up your whole software development process when done repeatedly.

Donald Knuth Quote Premature Optimization

But don’t take my word for it. Here’s what one of the most famous computer scientists of all time, Donald Knuth, says about premature optimization:

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97 % of the time: premature optimization is the root of all evil.” — Donald Knuth

Knuth argues that most of the time, you shouldn’t bother tweaking your code to obtain small efficiency gains. Let’s dive into six practical instances of premature optimization to see how it can get you.

Six Examples of Premature Optimization

There are many situations where premature optimization may occur. Watch out for those! Next, I’ll show you six instances—but I’m sure there are more.

Premature Optimization of Code Functions


First, you spend a lot of time optimizing a code function or code snippet that you just cannot stand leaving unoptimized. You argue that it’s bad programming style to use the naïve method, and you should use more efficient data structures or algorithms to tackle the problem. So, you dive into learning mode, and you find better and better algorithms. Finally, you decide on one that’s considered best, but it takes you hours and hours to make it work. The optimization was premature because, as it turns out, your code snippet is executed only seldom, and it doesn’t result in meaningful performance improvements.

Premature Optimization of Software Product’s Features

Second, you add more features to your software product because you believe that users will need them. You optimize for expected but unproven user needs. Say you develop a smartphone app that translates text into Morse code lights. Instead of developing the minimum viable product (MVP, see Chapter 3) that does just that, you add more and more features that you expect are necessary, such as a text-to-audio conversion and even a receiver that translates light signals to text. Later you find out that your users never use these features. Premature optimization has significantly slowed down your product development cycle and reduced your learning speed.

Premature Optimization of Planning Phase

Third, you prematurely optimize your planning phase, trying to find solutions to all kinds of problems that may occur. While it’s very costly to avoid planning, many people never stop planning, which can be just as costly! Only now the costs are opportunity costs of not taking action. Making a software product a reality requires you to ship something of value to the real world—even if this thing is not perfect, yet. You need user feedback and a reality check before even knowing which problems will hit you the hardest. Planning can help you avoid many pitfalls, but if you’re the type of person without a bias towards action, all your planning will turn into nothing of value.

Premature Optimization of Scalability

Fourth, you prematurely optimize the scalability of your application. Expecting millions of visitors, you design a distributed architecture that dynamically adds virtual machines to handle peak load if necessary. Distributed systems are complex and error-prone, and it takes you months to make your system work. Even worse, I’ve seen cases where the distribution has reduced an application’s scalability due to an increased overhead for communication and data consistency. Scalable distributed systems always come at a price. Are you sure you need to pay it? What’s the point of being able to scale to millions of users if you haven’t even served your first one?

Premature Optimization of Test Design

Fifth, you believe in test-driven development, and you insist on 100% test coverage. Some functions don’t lend themselves to unit tests because of their non-deterministic input (e.g., functions that process free text from users). Even though it has little value, you prematurely optimize for a perfect coverage of unit tests, and it slows down the software development cycle while introducing unnecessary complexity into the project.

Premature Optimization of Object-Orientated World Building

Sixth, you believe in object orientation and insist on modeling the world using a complex hierarchy of classes. For example, you write a small computer game about car racing. You create a class hierarchy where the Porsche class inherits from the Car class, which inherits from the Vehicle class. In many cases, these types of stacked inheritance structures add unnecessary complexity and could be avoided. You’ve prematurely optimized your code to model a world with more details than the application needs.

Code Example of Premature Optimization Gone Bad

Let’s consider a small Python application that serves as an example of a case where premature optimization went bad. Say three colleagues, Alice, Bob, and Carl, regularly play poker games in the evenings. During a game night, they need to keep track of who owes whom. As Alice is a passionate programmer, she decides to create a small application that tracks the balances of a number of players.

She comes up with the following code, which serves the purpose well.

transactions = []
balances = {}


def transfer(sender, receiver, amount):
    transactions.append((sender, receiver, amount))
    if sender not in balances:
        balances[sender] = 0
    if receiver not in balances:
        balances[receiver] = 0
    balances[sender] -= amount
    balances[receiver] += amount


def get_balance(user):
    return balances[user]


def max_transaction():
    return max(transactions, key=lambda x: x[2])


transfer('Alice', 'Bob', 2000)
transfer('Bob', 'Carl', 4000)
transfer('Alice', 'Carl', 2000)

print('Balance Alice: ' + str(get_balance('Alice')))
print('Balance Bob: ' + str(get_balance('Bob')))
print('Balance Carl: ' + str(get_balance('Carl')))

print('Max Transaction: ' + str(max_transaction()))

transfer('Alice', 'Bob', 1000)
transfer('Carl', 'Alice', 8000)

print('Balance Alice: ' + str(get_balance('Alice')))
print('Balance Bob: ' + str(get_balance('Bob')))
print('Balance Carl: ' + str(get_balance('Carl')))

print('Max Transaction: ' + str(max_transaction()))

Listing: Simple script to track transactions and balances.

The script has two global variables transactions and balances. The list transactions tracks the transactions as they occurred during a game night. Each transaction is a tuple of sender identifier, receiver identifier, and the amount to be transferred from the sender to the receiver. The dictionary balances tracks the mapping from user identifier to the number of credits based on the occurred transactions.

The function transfer(sender, receiver, amount) creates and stores a new transaction in the global list, creates new balances for users sender and receiver if they haven’t already been created, and updates the balances according to the transaction. The function get_balance(user) returns the balance of the user given as an argument. The function max_transaction() goes over all transactions and returns the one that has the maximum value in the third tuple element—the transaction amount.

The application works—it returns the following output:

Balance Alice: -4000
Balance Bob: -2000
Balance Carl: 6000
Max Transaction: ('Bob', 'Carl', 4000)
Balance Alice: 3000
Balance Bob: -1000
Balance Carl: -2000
Max Transaction: ('Carl', 'Alice', 8000)

But Alice isn’t happy with the application. She realizes that calling max_transaction() results in some inefficiencies due to redundant calculations—the script goes over the list transactions twice to find the transaction with the maximum amount. The second time, it could theoretically reuse the result of the first call and only look at the new transactions.

To make the code more efficient, she adds another global variable max_transaction that keeps track of the maximum transaction amount ever seen.

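transactions = []
balances = {}
max_transaction = ('X', 'Y', -9999999)


def transfer(sender, receiver, amount):
    # Without this declaration, the assignment below would raise UnboundLocalError
    global max_transaction
    …
    if amount > max_transaction[2]:
        max_transaction = (sender, receiver, amount)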

By adding more complexity, the code is now more performant, but at what cost? The added complexity results in no meaningful performance benefit for the small applications for which Alice is using the code. It makes the code more complicated and reduces maintainability. Nobody will ever notice the performance benefit in the evening gaming sessions. But Alice’s progress will slow down as she adds more and more global variables (e.g., for tracking the minimal transaction amount). The optimization was clearly premature: the concrete application had no need for it.
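
One way to check whether such an optimization is worth the effort is to measure before changing anything. The following sketch times the naive max_transaction() with the standard timeit module. The figure of 1,000 transactions is an assumption chosen to be far more than a game night would produce, and the measured time will differ from machine to machine.

```python
import random
import timeit

random.seed(0)
players = ["Alice", "Bob", "Carl"]

# Assumption: 1,000 transactions, far more than one game night produces
transactions = [
    (random.choice(players), random.choice(players), random.randint(1, 10_000))
    for _ in range(1_000)
]


def max_transaction():
    """Naive version: scan the whole list on every call."""
    return max(transactions, key=lambda x: x[2])


runs = 1_000
seconds = timeit.timeit(max_transaction, number=runs)
print(f"naive scan: {seconds / runs * 1e6:.1f} microseconds per call")
```

If the per-call time is negligible compared to how often the function is actually called, the simpler version is good enough.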

Do you want to develop the skills of a well-rounded Python professional—while getting paid in the process? Become a Python freelancer and order your book Leaving the Rat Race with Python on Amazon (Kindle/Print)!


The post Premature Optimization is the Root of All Evil first appeared on Finxter.

Posted on December 1, 2020 by sick skills

Searching The Parse Tree Using BeautifulSoup

Introduction

HTML (Hypertext Markup Language) consists of numerous tags and the data we need to extract lies inside those tags. Thus we need to find the right tags to extract what we need. Now, how do we find the right tags? We can do so with the help of BeautifulSoup's search methods.

Beautiful Soup has numerous methods for searching a parse tree. The two most popular and commonly used methods are:

find()
find_all()

The other methods are quite similar in terms of their usage. Therefore, we will be focusing on the find() and find_all() methods in this article.

The following example will be used throughout this document while demonstrating the concepts:

html_doc = """
<html><head><title>Searching Tree</title></head>
<body>
<h1>Searching Parse Tree In BeautifulSoup</h1>

<p class="Main">Learning <a href="https://docs.python.org/3/" class="language" id="python">Python</a>,
<a href="https://docs.oracle.com/en/java/" class="language" id="java">Java</a> and
<a href="https://golang.org/doc/" class="language" id="golang">Golang</a>;
is fun!</p>

<p class="Secondary"><b>Please subscribe!</b></p>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>
"""

from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, "html.parser")

Types Of Filters

Several kinds of filters can be passed into the find() and find_all() methods. It is crucial to understand them clearly, because they are used again and again throughout the search mechanism. A filter can match on:

a tag's name,
its attributes,
the text of a string,
or a mix of these.

❖ A String

When we pass a string to a search method, Beautiful Soup matches tags whose name is exactly that string. Let us have a look at an example and find the <h1> tags in the HTML document:

print(soup.find_all('h1'))

Output:

[<h1>Searching Parse Tree In BeautifulSoup</h1>]

❖ A Regular Expression

Passing a regular expression object allows Beautiful Soup to filter results according to that regular expression. In case you want to master the concepts of the regex module in Python, please refer to our tutorial here.

Note:

We need to import the re module to use a regular expression.
To get just the name of the tag instead of the entire content (tag + content within the tag), use the .name attribute.

Example: The following code finds all instances of the tags starting with the letter “b”.

import re

# find all tags whose name starts with "b"
for regular in soup.find_all(re.compile("^b")):
    print(regular.name)

Output:

body
b

❖ A List

Multiple tags can be passed into the search functions using a list, as shown in the example below:

Example: The following code finds all the <a> and <b> tags in the HTML document.

for tag in soup.find_all(['a', 'b']):
    print(tag)

Output:

<a class="language" href="https://docs.python.org/3/" id="python">Python</a>
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>
<b>Please subscribe!</b>
<b>copyright - FINXTER</b>

❖ A function

We can also define a function that takes an element (a tag) as its only argument. The function should return True if the element matches and False otherwise.

Example: The following code defines a function that returns True for every tag that has both a class and an id attribute. We then pass this function to the find_all() method to get the desired output.

def func(tag):
    return tag.has_attr('class') and tag.has_attr('id')

for tag in soup.find_all(func):
    print(tag)

Output:

<a class="language" href="https://docs.python.org/3/" id="python">Python</a>
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>

➠ Now that we have gone through the different kinds of filters that we use with the search methods, we are well equipped to dive deep into the find() and find_all() methods.

The find() Method

The find() method searches the parse tree and returns the first tag that matches the given name and filters.

Syntax:

find(name, attrs, recursive, string, **kwargs)

➠ find() returns an object of type bs4.element.Tag, or None if nothing matches.

Example:

print(soup.find('h1'), "\n")
print("RETURN TYPE OF find(): ",type(soup.find('h1')), "\n")
# note that only the first instance of the tag is returned
print(soup.find('a'))

Output:

<h1>Searching Parse Tree In BeautifulSoup</h1>

RETURN TYPE OF find(): <class 'bs4.element.Tag'>

<a class="language" href="https://docs.python.org/3/" id="python">Python</a>

➠ The above operation does the same as soup.h1 or soup.a, which also return the first instance of the given tag. So what's the difference? The find() method lets us find a particular instance of a given tag using key-value pairs, as shown in the example below:

print(soup.find('a',id='golang'))

Output:

<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>
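One practical detail when using find(): if nothing matches, it returns None rather than an empty result, so it is worth checking the result before accessing its attributes. A minimal sketch, assuming a cut-down stand-in for the example document (the names small_doc and small_soup are made up for this illustration):

```python
from bs4 import BeautifulSoup

# Cut-down stand-in for the example document (hypothetical, illustration only)
small_doc = '<p class="Main"><a href="https://golang.org/doc/" id="golang">Golang</a></p>'
small_soup = BeautifulSoup(small_doc, "html.parser")

match = small_soup.find('a', id='golang')   # a Tag object
missing = small_soup.find('a', id='rust')   # no such tag in the document, so None

print(match.get_text())
print(missing)

# Guard against None before touching attributes of the result
if missing is None:
    print("no <a> tag with id='rust'")
```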

The find_all() Method

We saw that the find() method searches for the first matching tag. What if we want to find all instances of a tag within the HTML document? The find_all() method searches for all tags that match the given filters and returns them in a bs4.element.ResultSet, which is a subclass of list. Since the items are returned in a list, they can be accessed by index.

Syntax:

find_all(name, attrs, recursive, string, limit, **kwargs)

Example: Searching all instances of the ‘a’ tag in the HTML document.

for tag in soup.find_all('a'):
    print(tag)

Output:

<a class="language" href="https://docs.python.org/3/" id="python">Python</a>
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>

Now there are numerous other arguments apart from the filters that we already discussed earlier. Let us have a look at them one by one.

❖ The name Argument

As stated earlier the name argument can be a string, a regular expression, a list, a function, or the value True.

Example:

for tag in soup.find_all('p'):
    print(tag)

Output:

<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>
<p class="Secondary"><b>Please subscribe!</b></p>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>

❖ The keyword Arguments

Just like the find() method, find_all() also allows us to find particular instances of a tag. For example, if the id argument is passed, Beautiful Soup filters against each tag’s ‘id’ attribute and returns the result accordingly.

Example:

print(soup.find_all('a',id='java'))

Output:

[<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>]

You can also pass the attributes as dictionary key-value pairs using the attrs argument.

Example:

print(soup.find_all('a', attrs={'id': 'java'}))

Output:

[<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>]

❖ Search Using CSS Class

Often we need to find a tag that has a certain CSS class, but the attribute name, class, is a reserved keyword in Python. Thus, using class as a keyword argument gives a syntax error. As of Beautiful Soup 4.1.2, we can search by CSS class using the keyword argument class_.

Example:

print(soup.find_all('p', class_='Secondary'))

Output:

[<p class="Secondary"><b>Please subscribe!</b></p>, <p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>]

❖ Note: The above search finds all instances of the p tag with the class “Secondary”. You can also filter on multiple attributes at once by passing a dictionary.

Example:

print(soup.find_all('p', attrs={'class': 'Secondary', 'id': 'finxter'}))

Output:

[<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>]

❖ The string Argument

The string argument allows us to search for strings instead of tags.

Example:

print(soup.find_all(string=["Python", "Java", "Golang"]))

Output:

['Python', 'Java', 'Golang']

❖ The limit Argument

The find_all() method scans through the entire HTML document and returns all the matching tags and strings. This can take a lot of time if the document is large. So, you can limit the number of results by passing in the limit argument; Beautiful Soup stops gathering results once it has found that many.

Example: There are three links in the example HTML document, but this code only finds the first two:

print(soup.find_all("a", limit=2))

Output:

[<a class="language" href="https://docs.python.org/3/" id="python">Python</a>, <a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a>]

Other Search Methods

We have successfully explored the most commonly used search methods, i.e., find() and find_all(). Beautiful Soup also has other methods for searching the parse tree, but they are quite similar to what we already discussed above. The only difference is which part of the tree they search. Let us have a quick look at these methods.

find_parents() and find_parent(): these methods are used to traverse the parse tree upwards and look for a tag’s/string’s parent(s).
find_next_siblings() and find_next_sibling(): these methods are used to find the next sibling(s) of an element in the HTML document.
find_previous_siblings() and find_previous_sibling(): these methods are used to find and iterate over the sibling(s) that appear before the current element.
find_all_next() and find_next(): these methods are used to find and iterate over the tags and strings that appear after the current element in the HTML document.
find_all_previous() and find_previous(): these methods are used to find and iterate over the tags and strings that appear before the current element in the HTML document.

Example:

current = soup.find('a', id='java')
print(current.find_parent())
print()
print(current.find_parents())
print()
print(current.find_previous_sibling())
print()
print(current.find_previous_siblings())
print()
print(current.find_next())
print()
print(current.find_all_next())
print()

Output:

<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>

[<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>, <body>
<h1>Searching Parse Tree In BeautifulSoup</h1>
<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>
<p class="Secondary"><b>Please subscribe!</b></p>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>
</body>, <html><head><title>Searching Tree</title></head>
<body>
<h1>Searching Parse Tree In BeautifulSoup</h1>
<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>
<p class="Secondary"><b>Please subscribe!</b></p>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>
</body></html>, <html><head><title>Searching Tree</title></head>
<body>
<h1>Searching Parse Tree In BeautifulSoup</h1>
<p class="Main">Learning <a class="language" href="https://docs.python.org/3/" id="python">Python</a>,
<a class="language" href="https://docs.oracle.com/en/java/" id="java">Java</a> and
<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>;
is fun!</p>
<p class="Secondary"><b>Please subscribe!</b></p>
<p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>
</body></html>]

<a class="language" href="https://docs.python.org/3/" id="python">Python</a>

[<a class="language" href="https://docs.python.org/3/" id="python">Python</a>]

<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>

[<a class="language" href="https://golang.org/doc/" id="golang">Golang</a>, <p class="Secondary"><b>Please subscribe!</b></p>, <b>Please subscribe!</b>, <p class="Secondary" id="finxter"><b>copyright - FINXTER</b></p>, <b>copyright - FINXTER</b>]
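The example above does not call find_next_sibling() or find_next_siblings(), although both appear in the list of methods. The following is a small sketch of how they could be used, assuming a made-up mini-document (mini_doc and mini_soup are hypothetical names) in which three links sit directly next to each other:

```python
from bs4 import BeautifulSoup

# Hypothetical mini-document: three sibling links inside one paragraph
mini_doc = '<p><a id="python">Python</a><a id="java">Java</a><a id="golang">Golang</a></p>'
mini_soup = BeautifulSoup(mini_doc, "html.parser")

current = mini_soup.find('a', id='java')
next_one = current.find_next_sibling('a')        # first following sibling that is an <a> tag
all_next = current.find_next_siblings('a')       # list of all following <a> siblings
prev_one = current.find_previous_sibling('a')    # closest preceding <a> sibling

print(next_one)
print(all_next)
print(prev_one)
```

Passing a tag name such as 'a' restricts the search to sibling tags with that name, which keeps the results easy to read.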

Conclusion

With that we come to the end of this article; I hope that after reading this article you can search elements within a parse tree with ease! Please subscribe and stay tuned for more interesting articles.

The post Searching The Parse Tree Using BeautifulSoup first appeared on Finxter.

Posted on November 30, 2020 by sick skills — Leave a comment

list.clear() vs New List — Why Clearing a List Rather Than Creating a New One?

Problem: You’ve just learned about the list.clear() method in Python. You wonder, what’s its purpose? Why not create a new list and overwrite the variable instead of clearing an existing list?

Example: Say, you have the following list.

lst = ['Alice', 'Bob', 'Carl']

If you clear the list, it becomes empty:

lst.clear()
print(lst)
# []

However, you could have accomplished the same thing by just assigning a new empty list to the variable lst:

lst = ['Alice', 'Bob', 'Carl']
lst = []
print(lst)
# []

The output is the same. Why does the list.clear() method exist in the first place?

The two variants behave differently as soon as several variables point to the same list object. Calling lst.clear() empties the one shared object, so every variable that refers to it sees an empty list. Assigning lst = [] only rebinds the name lst to a new, empty list; a second variable such as lst_2 still points to the original, non-empty list object!
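The difference can be reproduced in a few lines. This is a minimal sketch; the helper variables same_after_clear and same_after_rebind are added here only to make the object identity visible:

```python
# Variant 1: clear() mutates the shared list object in place
lst = ['Alice', 'Bob', 'Carl']
lst_2 = lst                       # second name for the same object
lst.clear()
print(lst, lst_2)                 # both names see the emptied list
same_after_clear = lst is lst_2   # still one and the same object

# Variant 2: assignment rebinds only the name lst
lst = ['Alice', 'Bob', 'Carl']
lst_2 = lst
lst = []                          # lst now refers to a brand-new list
print(lst, lst_2)                 # lst_2 still refers to the original list
same_after_rebind = lst is lst_2  # two different objects now
```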

So, there are at least two reasons why the list.clear() method can be superior to creating a new list:

Release Memory: If you have a large list that fills your memory (such as a huge data set or a large file read via readlines()) and you don’t need it anymore, you can release the memory held by its elements right away with list.clear(). Especially in interactive mode, Python doesn’t know which variables you still need, so it keeps all of them until the session ends. After list.clear(), the list no longer references its elements, so their memory can be reclaimed for other processing tasks (provided nothing else still refers to them).
Clear Multiple List Variables: Multiple variables may refer to the same list object. If you want all of them to reflect that the list is now empty, you can either call list.clear() on one variable, and all other variables will see the change, or you must assign var1 = [], var2 = [], ..., varn = [] to every variable individually. This can be a pain if you have many variables.
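The second point also matters when a list is passed to a function, because the parameter is just one more variable referring to the same object. A short sketch, with hypothetical helper names reset_in_place() and reset_by_rebinding():

```python
def reset_in_place(items):
    """Empty the caller's list by mutating the object that was passed in."""
    items.clear()

def reset_by_rebinding(items):
    """Rebind only the local name; the caller's list is left untouched."""
    items = []

names = ['Alice', 'Bob', 'Carl']

reset_by_rebinding(names)
after_rebinding = list(names)   # snapshot: the caller's list is unchanged
print(after_rebinding)

reset_in_place(names)
print(names)                    # the caller's list is now empty
```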

The post list.clear() vs New List — Why Clearing a List Rather Than Creating a New One? first appeared on Finxter.

Posted on November 29, 2020 by sick skills — Leave a comment

Python abs() Function

Python’s built-in abs(x) function returns the absolute value of the argument x, which can be an integer, a float, or an object implementing the __abs__() method. For a complex number, the function returns its magnitude. For any real number x, abs(-x) and abs(+x) give the same non-negative result.

Argument	`x`	int, float, complex, object with `__abs__()` implementation
Return Value	`|x|`	Returns the absolute value of the input argument. Integer input –> integer output; float input –> float output; complex input –> float output (the magnitude)
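To illustrate the last kind of argument, here is a minimal sketch of a user-defined class that supports abs() by implementing __abs__(). The Vector2D class is a made-up example, and defining its absolute value as the Euclidean length is a design choice for this illustration:

```python
import math

class Vector2D:
    """Hypothetical 2D vector used only to illustrate __abs__()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __abs__(self):
        # abs(v) is defined here as the Euclidean length of the vector
        return math.hypot(self.x, self.y)

v = Vector2D(3, 4)
print(abs(v))  # 5.0
```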

Example Integer abs()

The following code snippet shows you how to compute the absolute value of the positive integer 42.

# POSITIVE INTEGER
x = 42
abs_x = abs(x)
print(f"Absolute value of {x} is {abs_x}")
# Absolute value of 42 is 42

The following code snippet shows you how to compute the absolute value of the negative integer -42.

# NEGATIVE INTEGER
x = -42
abs_x = abs(x)
print(f"Absolute value of {x} is {abs_x}")
# Absolute value of -42 is 42

Example Float abs()

The following code snippet shows you how to compute the absolute value of the positive float 42.42.

# POSITIVE FLOAT
x = 42.42
abs_x = abs(x)
print(f"Absolute value of {x} is {abs_x}")
# Absolute value of 42.42 is 42.42

The following code snippet shows you how to compute the absolute value of the negative float -42.42.

# NEGATIVE FLOAT
x = -42.42
abs_x = abs(x)
print(f"Absolute value of {x} is {abs_x}")
# Absolute value of -42.42 is 42.42

Example Complex abs()

The following code snippet shows you how to compute the absolute value of the complex number (3+10j).

# COMPLEX NUMBER
complex_number = (3+10j)
abs_complex_number = abs(complex_number)
print(f"Absolute value of {complex_number} is {abs_complex_number}")
# Absolute value of (3+10j) is 10.44030650891055

Python abs() vs fabs()

Python’s built-in function abs(x) calculates the absolute value of the argument x. Similarly, the fabs(x) function of the math module calculates the same absolute value. The difference is that math.fabs(x) always returns a float, while Python’s built-in abs(x) returns an integer if the argument x is an integer as well. The name “fabs” is shorthand for “float absolute value”.

Here’s a minimal example:

import math

x = 42

# abs()
print(abs(x))
# 42

# math.fabs()
print(math.fabs(x))
# 42.0
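A further difference worth knowing, shown here as a small sketch: math.fabs() accepts only real numbers, so a complex argument raises a TypeError, whereas the built-in abs() returns the magnitude. The flag fabs_accepts_complex is just a helper variable for this demonstration.

```python
import math

print(type(abs(-7)))        # <class 'int'>
print(type(math.fabs(-7)))  # <class 'float'>

# math.fabs() only accepts real numbers; complex input raises TypeError
try:
    math.fabs(3 + 4j)
    fabs_accepts_complex = True
except TypeError:
    fabs_accepts_complex = False

print(fabs_accepts_complex)  # False
print(abs(3 + 4j))           # 5.0
```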

Python abs() vs np.abs()

Python’s built-in function abs(x) calculates the absolute value of the argument x. Similarly, NumPy’s np.abs(x) function calculates the same absolute value. There are two points to keep in mind: (1) like the built-in abs(x), np.abs(x) returns an integer if the argument x is an integer (as a NumPy scalar rather than a plain Python int), whereas np.fabs(x) always returns a float, and (2) np.abs(arr) can also be applied to a NumPy array arr, in which case it calculates the absolute values element-wise.

Here’s a minimal example:

import numpy as np

x = 42

# abs()
print(abs(x))
# 42

# numpy.abs()
print(np.abs(x))
# 42

# numpy.fabs()
print(np.fabs(x))
# 42.0

# numpy.abs() array
a = np.array([-1, 2, -4])
print(np.abs(a))
# [1 2 4]

np.abs and np.absolute are completely identical; np.abs is simply a shorthand alias. It doesn’t matter which one you use. The short name has two advantages: it is shorter, and it is familiar to Python programmers because it matches the name of the built-in Python function.

Summary

The abs() function is a built-in function that returns the absolute value of a number. The function accepts integers, floats, and complex numbers as input.

If you pass abs() an integer or float, n, it returns the non-negative value of n and preserves its type. In other words, if you pass an integer, abs() returns an integer, and if you pass a float, it returns a float.

# Int returns int
>>> abs(20)
20
# Float returns float
>>> abs(20.0)
20.0
>>> abs(-20.0)
20.0

The first example returns an int, the second returns a float, and the final example returns a float and demonstrates that abs() never returns a negative number.

Complex numbers are made up of two parts and can be written as a + bj where a and b are either ints or floats. The absolute value of a + bj is defined mathematically as math.sqrt(a**2 + b**2). Thus, the result is never negative and always a float (since math.sqrt() always returns a float).

>>> import math
>>> abs(3 + 4j)
5.0
>>> math.sqrt(3**2 + 4**2)
5.0

Here you can see that abs() returns a float for a complex argument and that the result of abs(a + bj) is the same as math.sqrt(a**2 + b**2).
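The same check can be written for an arbitrary complex number by reading its parts from the .real and .imag attributes. A brief sketch, where math.hypot() is used as an equivalent way to compute the same magnitude and the sample values are chosen for illustration:

```python
import math

z = 3 + 4j
by_abs = abs(z)
by_sqrt = math.sqrt(z.real**2 + z.imag**2)   # formula from the text
by_hypot = math.hypot(z.real, z.imag)        # same quantity via math.hypot()
print(by_abs, by_sqrt, by_hypot)             # 5.0 5.0 5.0

w = -1.5 + 2j
print(math.isclose(abs(w), math.hypot(w.real, w.imag)))  # True
```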

The post Python abs() Function first appeared on Finxter.

Posted on November 28, 2020 by sick skills — Leave a comment

Python Built-In Functions

Python comes with many built-in functions you can use without importing any library. Here they are in alphabetical order:

		Built-in Functions
`abs()`	`delattr()`	`hash()`	`memoryview()`	`set()`
`all()`	`dict()`	`help()`	`min()`	`setattr()`
`any()`	`dir()`	`hex()`	`next()`	`slice()`
`ascii()`	`divmod()`	`id()`	`object()`	`sorted()`
`bin()`	`enumerate()`	`input()`	`oct()`	`staticmethod()`
`bool()`	`eval()`	`int()`	`open()`	`str()`
`breakpoint()`	`exec()`	`isinstance()`	`ord()`	`sum()`
`bytearray()`	`filter()`	`issubclass()`	`pow()`	`super()`
`bytes()`	`float()`	`iter()`	`print()`	`tuple()`
`callable()`	`format()`	`len()`	`property()`	`type()`
`chr()`	`frozenset()`	`list()`	`range()`	`vars()`
`classmethod()`	`getattr()`	`locals()`	`repr()`	`zip()`
`compile()`	`globals()`	`map()`	`reversed()`	`__import__()`
`complex()`	`hasattr()`	`max()`	`round()`
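These names live in the standard builtins module, which is why they are available everywhere without an import. A small sketch that looks up a handful of them (the sample list is an arbitrary selection from the table above):

```python
import builtins

# A few names from the table above, looked up on the builtins module
sample = ['abs', 'chr', 'len', 'sorted', 'zip']
for name in sample:
    obj = getattr(builtins, name)
    print(name, callable(obj))

# Built-ins are ordinary names and can be used without any import
print(abs(-3), len('abc'), sorted([3, 1, 2]))
```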

The post Python Built-In Functions first appeared on Finxter.

Posted on November 27, 2020 by sick skills — Leave a comment

Exponential Fit with SciPy’s curve_fit()

In this article, you’ll explore how to generate exponential fits by exploiting the curve_fit() function from the SciPy library. SciPy’s curve_fit() allows building custom fit functions with which we can describe data points that follow an exponential trend.

In the first part of the article, the curve_fit() function is used to fit the exponential trend of the number of COVID-19 cases registered in California (CA).
The second part of the article deals with fitting histograms, characterized, also in this case, by an exponential trend.

Disclaimer: I’m not a virologist, and I suppose that the spread of a viral infection is described by more complicated and accurate models; however, the only aim of this article is to show how to apply an exponential fit to model (to a certain degree of approximation) the increase in the total number of COVID-19 infection cases.

Exponential fit of COVID-19 total cases in California

Data related to the COVID-19 pandemic were obtained from the official website of the “Centers for Disease Control and Prevention” (https://data.cdc.gov/Case-Surveillance/United-States-COVID-19-Cases-and-Deaths-by-State-o/9mfq-cb36) and downloaded as a .csv file. The first thing to do is to import the data into a Pandas dataframe. To do this, the Pandas functions pandas.read_csv() and pandas.DataFrame() are employed. The resulting dataframe is made up of 15 columns, among which we can find the submission_date, the state, the total cases, the confirmed cases and other related observables. To see the order in which these columns appear, we print the header of the dataframe; as can be noticed, the total cases are listed under the column “tot_cases”.

Since in this article we are only interested in the data for California, we create a sub-dataframe that contains only the rows for that state. To do that, we exploit Pandas' ability to index subsections of a dataframe. This dataframe is called df_CA (for California) and contains all the rows of the main dataframe for which the column “state” is equal to “CA”. After this step, we can build two arrays: one (called tot_cases) that contains the total cases (the name of the respective column is “tot_cases”) and one (called days) that contains the number of days elapsed since the first recording. Since the data were recorded daily, we build the “days” array simply as an array of equally spaced integers running from 0 up to the length of the “tot_cases” array minus one; in this way, each number is the number of days elapsed since the first recording (day 0).
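The filtering step can be tried out without downloading the CDC file. The sketch below assumes a tiny synthetic table with made-up numbers; only the column names “state” and “tot_cases” are taken from the article, and the variable names with the _demo suffix are hypothetical:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CDC table (made-up numbers, illustration only)
df_demo = pd.DataFrame({
    'submission_date': ['01/22/2020', '01/22/2020', '01/23/2020', '01/23/2020'],
    'state': ['CA', 'NY', 'CA', 'NY'],
    'tot_cases': [1, 0, 3, 2],
})

df_CA_demo = df_demo[df_demo['state'] == 'CA']       # boolean-mask indexing
tot_cases_demo = np.array(df_CA_demo['tot_cases'])   # total cases as a NumPy array
days_demo = np.arange(len(tot_cases_demo))           # 0, 1, 2, ... one entry per record

print(tot_cases_demo)
print(days_demo)
```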

At this point, we can define the function that will be used by curve_fit() to fit the created dataset. An exponential function is defined by the equation:

y = a*exp(b*x) +c

where a, b and c are the fitting parameters. We will hence define the function exp_fit(), which returns the exponential function, y, defined above. The curve_fit() function takes as required inputs the fitting function that we want to fit the data with, and the x and y arrays that store the values of the data points. It is also possible to provide initial guesses for each of the fitting parameters by passing them in a list via the argument p0 = […], as well as upper and lower bounds for these parameters (for a comprehensive description of the curve_fit() function, please refer to https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html ). In this example, we will only provide initial guesses for our fitting parameters. Moreover, we will only fit the total cases of the first 200 days; this is because for the subsequent days, the number of cases no longer followed an exponential trend (possibly due to a decrease in the number of new cases). To refer only to the first 200 values of the arrays “days” and “tot_cases”, we use array slicing (e.g. days[:200]).

curve_fit() returns two objects: an array with the optimal fitting parameters, presented in the same order in which they were defined in the fitting function, and the covariance matrix of those parameters. Keeping this in mind, we can use the first output to build the array that contains the fitted results, calling it “fit_eq”.
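To see the two return values in isolation, here is a minimal sketch that fits synthetic, noise-free data generated from known parameters. The data, the parameter values and the initial guesses are assumptions chosen for this illustration; they are not the COVID-19 data:

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_fit(x, a, b, c):
    return a * np.exp(b * x) + c

# Synthetic, noise-free data generated from known parameters (illustration only)
x_demo = np.arange(50)
true_params = (2.0, 0.1, 10.0)
y_demo = exp_fit(x_demo, *true_params)

# curve_fit returns the optimal parameters and their covariance matrix
popt, pcov = curve_fit(exp_fit, x_demo, y_demo, p0=[1.0, 0.08, 0.0])
a_fit, b_fit, c_fit = popt        # same order as in the function signature
y_fit = exp_fit(x_demo, *popt)    # evaluate the fitted curve

print(popt)
```

Unpacking the result into popt and pcov is equivalent to indexing fit[0] and fit[1], and calling the fit function with *popt avoids writing the equation a second time.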

Now that we built the fitting array, we can plot both the original data points and their exponential fit.

The final result will be a plot like the one in Figure 1:

Application of an exponential fit to histograms

Now that we know how to define and use an exponential fit, we will see how to apply it to the data displayed in a histogram. Histograms are frequently used to display the distributions of specific quantities like prices, heights, etc. The most common type of distribution is the Gaussian distribution; however, some types of observables follow a decaying exponential distribution. In a decaying exponential distribution, the frequency of the observables decreases following an exponential trend; a possible example is the amount of time that the battery of your car will last (i.e. the probability of having a battery lasting for long periods decreases exponentially).

The exponentially decaying array is generated with the NumPy function random.exponential(). According to the NumPy documentation, random.exponential() draws samples from an exponential distribution; it takes two inputs, the “scale”, which is a parameter defining the exponential decay, and the “size”, which is the length of the array that will be generated. Once we have obtained random values from an exponential distribution, we have to generate the histogram; to do this, we employ another NumPy function, histogram(), which computes a histogram from the data (we set the binning to “auto”, so that the width of the bins is computed automatically). The output of histogram() is a pair of arrays: the first contains the frequencies of the distribution, while the second contains the edges of the bins. Since we are only interested in the frequencies, we assign the first output to the variable “hist”. For this example, we generate the array containing the bin positions with the NumPy arange() function; the bins will have a width of 1 and their number will be equal to the number of elements in the “hist” array.

At this point, we have to define the fitting function and call curve_fit() on the values of the histogram just created. The equation describing an exponential decay is similar to the one defined in the first part; the only difference is that the exponent has a negative sign, which makes the values decrease exponentially. Since the elements of the “x” array, defined for the bin positions, are the coordinates of the left edge of each bin, we define another x array that stores the position of the center of each bin (called “x_fit”); this allows the fitting curve to pass through the center of each bin, leading to a better visual impression. This array is obtained by taking the left edges of the bins (the “x” array elements) and adding half the bin width, which equals half the value of the second bin position (the element of index 1). As in the previous part, we now call curve_fit(), generate the fitting array and assign it to the variable “fit_eq”.
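As a side note, the bin centers can also be computed from the bin edges that histogram() returns as its second output. This is an alternative to the unit-width positions used in the article, not the article's own approach; the seeded generator and the variable names are assumptions made for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)                # seeded generator, so the run is repeatable
data = rng.exponential(scale=5, size=10000)   # samples from an exponential distribution

hist, bin_edges = np.histogram(data, bins="auto")    # counts and bin edges
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2   # midpoint of every bin

print(len(hist), len(bin_edges))
```

There is always one more edge than there are bins, which is why the centers are built from the two shifted slices of bin_edges.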

Once the distribution has been fitted, the last thing to do is to check the result by plotting both the histogram and the fitting function. In order to plot the histogram, we will use the matplotlib function bar(), while the fitting function will be plotted using the classical plot() function.

The final result is displayed in Figure 2:

Summary

In these two examples, the curve_fit() function was used to apply two different exponential fits to specific data points. However, the power of the curve_fit() function is that it lets you define your own custom fit functions, be they linear, polynomial or logarithmic. The procedure is identical to the one shown in this article; the only difference is in the shape of the function that you have to define before calling curve_fit().
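As an illustration of that last point, the following sketch swaps in a straight-line model. The function linear_fit() and the synthetic data are hypothetical; everything else follows the same pattern as the exponential fits:

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_fit(x, m, q):
    """Hypothetical straight-line model y = m*x + q."""
    return m * x + q

x_lin = np.linspace(0, 10, 25)
y_lin = linear_fit(x_lin, 1.5, -2.0)   # synthetic, noise-free data

# Same call as before; only the model function has changed
popt_lin, pcov_lin = curve_fit(linear_fit, x_lin, y_lin)
print(popt_lin)
```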

Full Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

url = "United_States_COVID-19_Cases_and_Deaths_by_State_over_Time"  # url of the .csv file
file = pd.read_csv(url, sep=';', thousands=',')  # import the .csv file
df = pd.DataFrame(file)  # build up the pandas dataframe
print(df.columns)  # visualize the header
df_CA = df[df['state'] == 'CA']  # sub-dataframe storing only the values for California
tot_cases = np.array(df_CA['tot_cases'])  # array with the total n° of cases
days = np.arange(len(tot_cases))  # array containing the n° of days from the first recording

# ----DEFINITION OF THE FITTING FUNCTION----
def exp_fit(x, a, b, c):
    y = a*np.exp(b*x) + c
    return y

# ----CALL THE FITTING FUNCTION----
fit = curve_fit(exp_fit, days[:200], tot_cases[:200], p0=[0.005, 0.03, 5])
fit_eq = fit[0][0]*np.exp(fit[0][1]*days[:200]) + fit[0][2]  # y-array of the fit

# ----PLOTTING----
fig = plt.figure()
ax = fig.subplots()
ax.scatter(days[:200], tot_cases[:200], color='b', s=5)
ax.plot(days[:200], fit_eq, color='r', alpha=0.7)
ax.set_ylabel('Total cases')
ax.set_xlabel('N° of days')
plt.show()

# ----APPLY AN EXPONENTIAL FIT TO A HISTOGRAM----
data = np.random.exponential(5, size=10000)  # generating a random exponential distribution
hist = np.histogram(data, bins="auto")[0]  # generating a histogram from the exponential distribution
x = np.arange(0, len(hist), 1)  # array that contains the coordinates of the left edge of each bar

# ----DECAYING FIT OF THE DISTRIBUTION----
def exp_fit(x, a, b):  # defining a decaying exponential function
    y = a*np.exp(-b*x)
    return y

x_fit = x + x[1]/2  # the points of the fit will be positioned at the center of the bins
fit_ = curve_fit(exp_fit, x_fit, hist)  # calling the fit function
fit_eq = fit_[0][0]*np.exp(-fit_[0][1]*x_fit)  # building the y-array of the fit

# ----PLOTTING----
plt.bar(x, hist, alpha=0.5, align='edge', width=1)
plt.plot(x_fit, fit_eq, color='red')
plt.show()

The post Exponential Fit with SciPy’s curve_fit() first appeared on Finxter.