Do you want to learn Solidity and create your own dApps and smart contracts? This free online course gives you a comprehensive overview that is aimed to be more accessible than the Solidity documentation but still complete and descriptive.
Multimodal Learning: Each tutorial comes with a tutorial video that helps you grasp the concepts in a more interactive manner.
If you print all values from a dictionary in Python using print(dict.values()), Python returns a dict_values object, a view of the dictionary values. The representation prints the values enclosed in a somewhat awkward dict_values(...) wrapper, for example: dict_values([1, 2, 3]).
There are multiple ways to change the string representation of the values, so that the print() output doesn’t yield the strange dict_values view object.
Method 1: Convert to List
An easy way to obtain a pretty output when printing the dictionary values without the dict_values(...) representation is to convert the dict_values object to a list using the list() built-in function. For instance, print(list(my_dict.values())) prints the dictionary values as a simple list.
So far, so simple. Read on to learn or recap some important Python features and improve your skills. There are many paths to Rome!
Method 2: Unpacking
An easy and Pythonic way to print a dictionary without the dict_values prefix is to unpack all values into the print() function using the asterisk operator. This works because the print() function allows an arbitrary number of values as input. It prints those values separated by a single whitespace character per default.
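As a minimal sketch of the unpacking approach, assuming the same example dictionary that is used further below (the separator ' | ' is only an illustration):

```python
my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}

# Unpack the values as separate positional arguments of print()
print(*my_dict.values())
# Carl 42 100000

# The sep argument replaces the default single space between the values
print(*my_dict.values(), sep=' | ')
# Carl | 42 | 100000
```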
Do you need even greater flexibility than this? No problem! See here:
Method 3: String Join Function and Generator Expression
To convert the dictionary values to a single string object without 'dict_values' in it and with maximal control, you can use the string.join() function in combination with a generator expression and the built-in str() function.
Here’s an example:
my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(', '.join(str(x) for x in my_dict.values()))
# Carl, 42, 100000
Note: You can replace the comma ',' with your desired separator string and change the representation of each individual element by replacing the expression str(x) in the generator expression with something arbitrarily complicated.
See here for something crazy that wouldn’t make any sense:
my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(' | '.join('x' + str(x) + 'x' for x in my_dict.values()))
# xCarlx | x42x | x100000x
Note that you could also use the repr() function instead of the str() function in this example. The only visible difference is that repr() keeps the quotes around string values such as 'Carl'.
Finally, I’d recommend you check out this tutorial to learn more about how generator expressions work. Many Python beginners struggle with this concept even though it’s ubiquitous in expert coders’ code bases.
The most Pythonic way to print a dictionary except for one or multiple keys is to filter it using dictionary comprehension and pass the filtered dictionary into the print() function.
There are multiple ways to accomplish this and I’ll show you the best ones in this tutorial. Let’s get started!
Say, you have one or more keys stored in a variable ignore_keys that may be a list or a set for efficiency reasons.
Create a filtered dictionary without one or multiple keys using the dictionary comprehension {k:v for k,v in my_dict.items() if k not in ignore_keys} that iterates over the original dictionary’s key-value pairs and confirms for each key that it doesn’t belong to the ones that should be ignored.
Here’s a minimal example:
ignore_keys = {'x', 'y'}
my_dict = {'x': 1, 'y': 2, 'z': 3}
filtered_dict = {k:v for k,v in my_dict.items() if k not in ignore_keys}
print(filtered_dict)
# {'z': 3}
The dict.items() method creates an iterable of key-value pairs over which we can iterate.
The membership expression k not in ignore_keys tests if a given key doesn’t belong to the set.
The runtime complexity of the membership check is constant O(1) if you use a set for the ignore_keys data structure. It would be linear O(n) in the number of elements if you used a list which is not a good idea for that reason.
Note that you can also use this approach to print a dictionary except a single key by putting only one key into the ignore list.
A not-so-Pythonic but reasonably readable way to print a dict without one or multiple keys is to use a simple for loop with if condition to avoid all keys in the ignore list.
Here’s an example using three lines and directly printing the key-value pairs:
ignore_keys = {'x', 'y'}
my_dict = {'x': 1, 'y': 2, 'z': 3}
for k, v in my_dict.items():
    if k not in ignore_keys:
        print(k, v)
The output:
z 3
Of course, you can modify the output to your own needs. See the customizations of the built-in print() function and its awesome arguments:
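For instance, here is a minimal sketch of two such customizations, the sep and end arguments of print(). The extra key 'w' is added purely for illustration so that more than one pair survives the filter:

```python
ignore_keys = {'x', 'y'}
my_dict = {'x': 1, 'y': 2, 'z': 3, 'w': 4}

for k, v in my_dict.items():
    if k not in ignore_keys:
        # sep goes between the arguments, end replaces the trailing newline
        print(k, v, sep='=', end='; ')
print()  # finish the line
# z=3; w=4; 
```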
I could have listed many more ways to solve this problem of printing a dict except one or more keys.
I have seen super inefficient ways proposed on forums that use exclude_keys that are list types.
I have also seen elaborate schemes to use set difference operations or more.
But I don’t recommend anything else than dict comprehension if you want to create a filtered dictionary object first and the simple for loop if you want to print on the fly.
In this article, I’ll be going over the different types of state variables in Solidity and how to use them. State variables are one of the most important parts of any smart contract, as they allow us to store data that can change over time.
This article is mainly focused on value types of state variables, but I’ll be continuing with another two articles on reference and complex types as well as data location. Let’s dive in!
Basics – A Quick Review
Smart contracts are pieces of code that are deployed in blockchain nodes. They are immutable, meaning they cannot be changed once they have been deployed. This can make it necessary to redeploy the code as a new smart contract or redirect calls from an old contract to new ones.
A smart contract is initiated by a message embedded in a transaction. Ethereum enables these transactions, which may carry out more sophisticated operations like conditional transfers.
A conditional transfer, such as one that depends on the age of the buyer or the value of their bid, could be required.
Example: If the buyer is over 21 and their bid is greater than the minimum bid, then accept the bid. Otherwise reject it.
Smart contracts are executed when predetermined conditions are met to automate the execution of an agreement so that all parties can be immediately certain of the outcome without the need for an intermediary.
A smart contract bundles a collection of code (its functions or methods, with modifiers such as public or private, including getter and setter functions).
What is the structure of a smart contract?
As we have seen in other articles in Finxter, the structure of a smart contract is as follows:
Pragma directive;
Name of the contract;
Data or state variables that define the state of the contract;
Collection of functions to carry out the intent of the smart contract.
Note that the identifiers representing these elements are restricted to the ASCII character set. Make sure you select meaningful identifiers and follow camel case convention in naming them.
Variable Declaration
To declare a variable in Solidity, you must first specify its data type. This is followed by an access modifier and the variable name.
Structure
<type> <access modifier> <variable name> ;
Example:
What Categories of Variables Exist in Solidity?
Solidity supports three categories of variables:
(1) State Variables
State variables are variables whose values are permanently stored in a contract storage.
What does this mean?
State variables are an essential part of any contract. They are variables whose values are permanently stored in the contract storage. They can be thought of as a single slot in a database that you can query and alter by calling functions of the code that manages the database. The set and get functions can be used to modify and retrieve the value of the variables.
In other words, the data (state variables) are stored contiguously, item after item, starting with the first state variable, which is stored in slot 0. For each variable, the size in bytes is determined according to its type. Several contiguous items that require less than 32 bytes are packed into a single storage slot if possible.
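The packing rule can be sketched with a small simulation. This is a simplified model, not the compiler's actual algorithm: it only covers a handful of elementary value types and ignores structs, arrays, mappings, and inheritance. The names TYPE_SIZES and assign_slots are hypothetical and exist only for this illustration.

```python
# Sizes in bytes of a few Solidity value types (elementary types only)
TYPE_SIZES = {'bool': 1, 'uint8': 1, 'uint16': 2, 'uint128': 16,
              'address': 20, 'uint256': 32}

SLOT_SIZE = 32  # one storage slot holds 32 bytes


def assign_slots(declared_types):
    """Return (type, slot, byte_offset) for each variable, in declaration order."""
    layout = []
    slot, offset = 0, 0
    for type_name in declared_types:
        size = TYPE_SIZES[type_name]
        # A value that does not fit into the rest of the slot starts a new one
        if offset + size > SLOT_SIZE:
            slot, offset = slot + 1, 0
        layout.append((type_name, slot, offset))
        offset += size
    return layout


for entry in assign_slots(['uint8', 'uint16', 'uint256', 'bool', 'address']):
    print(entry)
# ('uint8', 0, 0)
# ('uint16', 0, 1)
# ('uint256', 1, 0)
# ('bool', 2, 0)
# ('address', 2, 1)
```

In this model, the order of declaration matters: placing the uint256 between the small variables forces three slots, whereas grouping the small variables together would need only two.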
To make it easier, if you use other languages and want to store user information for a long time, you would connect your application to a database server and then store the information in the database. In Solidity, however, you do not need to connect, you can simply store the data permanently using state variables.
(2) Local Variables
Local variables are variables whose values exist only while the function is executing; the context of local variables is within the function, and they cannot be accessed outside of it.
Typically, these variables are used to hold temporary values for processing or computing something. In the following example, “temp” is a local variable that cannot be used outside the “set” function.
(3) Global Variables
Global variables are special variables that exist in the global namespace and provide information about the blockchain.
Each function has its own scope, but state variables should always be defined outside the scope, like the attributes of a class.
They are permanently stored in the Ethereum blockchain, more precisely in the storage Merkle-Patricia tree, which is part of the information that forms the state of an account (that’s why we call them state variables).
What Types of Valid State Variables Exist?
Info: Solidity is a statically typed language, meaning each variable’s type must be specified at the time of its declaration.
“Undefined” or “null” values do not exist in Solidity, but newly declared variables always have a default value depending on their type, typically called “zero-state”.
For example, the default value for bool is false.
As in other languages (though not Python), there are two kinds of types in Solidity: value types and reference types.
A variable is of a value type if it stores its own data directly. If the variable instead contains the location of the data, it is of a reference type.
The reference types are discussed in a separate article.
For example, consider the integer variable int i = 100;
The system stores 100 in the memory location allocated for the variable i. The following image shows how 100 is stored in a hypothetical location in memory (0x239110) for “i”:
What are the Modifiers for the State Variables?
Visibility – access modifiers
Access modifiers are the keywords used to specify the declared accessibility of a state variable and functions.
Variables in Solidity have three types of visibility: public, private, and internal. If visibility is not explicitly declared, the compiler considers it internal.
For variables of type public, the compiler automatically creates a method to retrieve them through a call. This does not apply to private or internal variables.
Example:
uint256 public a;
is effectively the same thing as:
uint256 private a;
function a() public view returns (uint256) {
    return a;
}
When you create a public variable, it is stored the same way as a private variable, but the compiler automatically creates a getter function for it.
The difference between private and internal variables is that internal variables are inherited by child contracts, while private variables are not.
To learn more about private variables:
contract Addition {
    uint x; // internal variable (default visibility)
    uint public y;
}

contract Child is Addition {
    // no need to define x since the child contract inherits the variable
    function setX(uint _x) public {
        x = _x;
    }

    function getX() public view returns (uint) {
        return x;
    }
}
Note that the data location (memory, storage, or calldata) must be specified for variables of reference type. This is necessary when function arguments are involved. We will cover this in an article on data location.
Other keywords
The following keywords can be used for state variables to restrict changes to their state.
Constant (replaced by “view” and “pure” in functions)
The constant keyword disallows assignment (except at initialization): a constant variable must be initialized at the time of its declaration and cannot be changed afterwards.
Example:
uint private constant t = 40;
The variable t has been declared as constant and therefore cannot be changed.
It is interesting to note that the declaration of a constant variable without initialization is forbidden and the compiler displays an error, e.g.:
contract Addition {
    uint private x;
    uint public y;
    uint private constant z; // gives an error because constant variables must be initialized when declared
}
Immutable
These variables can be declared without being initialized, but they can be assigned only once, typically in the constructor. After that, the variable stays constant.
uint private immutable w;
// now we declare a constructor for the contract, using the constructor keyword
constructor() {
    w = 20; // initialize the variable
}
Override
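This keyword states that a public state variable overrides an external function of a base contract: the automatically generated getter takes the place of that function, provided the parameter and return types match.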
Value Types
These variables are passed by value. That is, they are copied when they are used either in an assignment or in a function argument.
If this sentence is not clear, you can check here.
Here we will see the basic value types.
Value types are booleans, integers, addresses, enums, and bytes.
Booleans
Boolean values can be true or false
An example of a boolean type:
contract ExampleBool {
    // example of a bool value type in solidity
    bool public IsVerified = false;
    bool public IsSent = true;
}
Integers
Integers come as int (signed) and uint (unsigned) types of various sizes, from int8, int16, … up to int256 in steps of 8 bits, and likewise from uint8 up to uint256. int256 is the same as int.
Note: uint256 is the same as uint.
The type uint stands for non-negative integers. The type int stands for both positive and negative integers.
The type uint8 has 8 bits, which corresponds to 1 byte. A bit is a binary digit, so one byte can represent 2^8 = 256 different numbers, from 0 to 2^8 - 1 = 255. This is the same reasoning that explains why a three-digit decimal number can represent the values 0 to 999.
The type uint256 accepts numbers between 0 and 2^256 - 1.
If we try to assign the value 256 to a variable of type uint8, the compiler will print an error.
A common practice for integers is to specify the number of bits at declaration so that the variable uses as little space as possible, which can reduce the cost of storage when several small variables are packed into one slot. So consider uint8 or uint16 instead of always using uint (uint256).
contract SimpleContract {
    uint32 public uidata = 1234567; // unsigned integer
    int32 public idata = -1234567; // signed integer
}
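The value ranges of the integer types follow directly from their bit widths. The following Python sketch computes them; the helper names uint_range and int_range are made up for this illustration and are not part of Solidity:

```python
def uint_range(bits):
    """Smallest and largest value of an unsigned integer with the given bit width."""
    return 0, 2**bits - 1


def int_range(bits):
    """Smallest and largest value of a signed (two's complement) integer."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1


print(uint_range(8))    # (0, 255)
print(int_range(8))     # (-128, 127)
print(uint_range(32))   # (0, 4294967295)
print(uint_range(256)[1] == 2**256 - 1)  # True
```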
Fixed Point Numbers
According to the Solidity documentation, fixed-point numbers are the type intended for fractional numbers. However, the official documentation states that “Fixed point numbers are not yet fully supported by Solidity”. They can be declared, but cannot be assigned to or from.
However, you can use fractional number literals in calculations, as long as the value resulting from the calculation is an integer.
Here is an example,
contract additionContract {
    uint8 result;

    function Addition(uint) public {
        result = 2/3; // error: the result is not an integer
        result = 3.5 + 1.5; // OK: the final result is an integer
    }
}
Let’s make a subtle change:
Address
The address data type is very specific to Solidity.
On the Ethereum blockchain, every account and smart contract has an address that is used to send and receive Ether from one account to another.
This is your public identity on the blockchain.
Also, when you deploy a smart contract on the blockchain, that contract is assigned an address that you can use to identify and call the smart contract.
There are two variants of the address type, which are largely identical:
address – stores a 20-byte value (the size of an Ethereum address or account). The default value for the address is 0x…followed by 40 0’s, or 20 bytes of 0’s.
address payable – like address, but with the additional members transfer and send.
The idea behind this distinction is that the address payable is an address you can send Ether to, while you should not send Ether to a plain address, as it could be a smart contract that was not built to accept Ether.
contract ExampleAddress {
    // a checksummed example address
    address public myAddress = 0xdCad3a6d3569DF655070DEd06cb7A1b2Ccd1D3AF;
}
Enums
An enum (short for enumeration) is a user-defined data type that restricts a variable to one of a set of predefined values.
The values listed in the enumeration are called enum members, and internally they are treated like numbers. Using enums makes the contract more readable and maintainable.
contract SampleEnum {
    // Creating an enumerator
    enum animal_classes { Mammals, Fish, Amphibians, Reptiles, Birds }

    function getFirstEnum() public pure returns (animal_classes) {
        return animal_classes.Mammals;
    }
    // result:
    // 0: uint8: 0
}
With enums, we can also set a default value:
// inside contract SampleEnum
animal_classes constant defaultValue = animal_classes.Reptiles;

function getDefaultValue() public pure returns (animal_classes) {
    return defaultValue;
}
// result:
// 0: uint8: 3
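To see how the members map to numbers, here is a rough Python analogy using IntEnum. It is only a sketch of the numbering scheme (members counted from 0 in declaration order), not of how Solidity implements enums; the class name AnimalClasses is chosen for this illustration.

```python
from enum import IntEnum


class AnimalClasses(IntEnum):
    # Members are numbered from 0 in declaration order, as in Solidity
    MAMMALS = 0
    FISH = 1
    AMPHIBIANS = 2
    REPTILES = 3
    BIRDS = 4


print(int(AnimalClasses.MAMMALS))   # 0
print(int(AnimalClasses.REPTILES))  # 3
print(AnimalClasses(4).name)        # BIRDS
```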
Bytes and Strings
A byte is a group of 8 bits. Everything in memory is stored in bits with binary values 0 and 1.
Solidity supports string literals that use both double quotes (") and single quotes ('). It provides string as a data type to declare a variable of type string.
Strings are unique in Solidity compared to Python or other programming languages in that there are no functions for manipulating strings, except that you can concatenate strings. The reason for this is that storing strings in a blockchain is very expensive.
Bytes and strings are easy to handle in Solidity because Solidity treats them similarly to an array. The two are very similar. (See Arrays in the Reference Type article).
Conclusion
Smart contracts reside at a specific address in the Ethereum blockchain. In this article, we learned about state variables in Solidity.
We looked at state, local variables, and the different types with a value type.
We tried to understand Booleans, Integers, Enums, Addresses, Bytes, and Strings (although the last two are treated in more depth in the article on reference types).
You can unpack all list elements into the print() function to print all values individually, separated by a single space per default (which you can override using the sep argument). For example, the expression print('[', *lst, ']') prints the elements in lst, space-separated, with the enclosing square brackets and without the separating commas!
Here’s an example:
lst = [1, 2, 3]
print('[', *lst, ']')
# [ 1 2 3 ]
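As a small sketch of overriding the separator with the sep argument (the separator strings are chosen only for illustration):

```python
lst = [1, 2, 3]

# A custom separator between the unpacked elements
print(*lst, sep=' | ')
# 1 | 2 | 3

# An empty separator also removes the spaces next to the brackets
print('[', *lst, ']', sep='')
# [123]
```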
You can learn about the ins and outs of the built-in print() function in the following video:
To master the basics of unpacking, feel free to check out this video on the asterisk operator:
Method 2: String Replace Method
A simple way to print a list without commas is to first convert the list to a string using the built-in str() function. Then modify the resulting string representation of the list by using the string.replace() method until you get the desired result.
Here’s an example:
my_list = [1, 2, 3]
# Convert List to String
s = str(my_list)
print(s)
# [1, 2, 3]
# Replace Separating Commas
s = s.replace(',', '')
# Print List Without Commas
print(s)
# [1 2 3]
The last line of the code snippet shows that the commas are removed from the output.
Method 3: String Join With Generator Expression
You can print a list without commas using the string.join() method on any separator string such as ' ' or '\t'. Pass a generator expression to convert each list element to a string using the str() built-in function.
Specifically, the expression print('[', ' '.join(str(x) for x in my_list), ']') prints my_list to the shell without separating commas.
my_list = [1, 2, 3]
print('[', ' '.join(str(x) for x in my_list), ']')
# Output: [ 1 2 3 ]
The str(object) built-in function converts a given object to its string representation.
Generator expressions or list comprehensions are concise one-liner ways to create a new iterable by reusing elements from another iterable.
You can dive deeper into generators in the following video:
Note: Combining the join() method with a generator expression and string concatenation is the recommended approach of choice if you want to convert a list to a string without commas instead of printing it.
Here’s an example:
my_list = [1, 2, 3]
s = '[' + ' '.join(str(x) for x in my_list) + ']'
print(s)
# Output: [1 2 3]
Method 4: Print NumPy Array
Sometimes it is sufficient to use the NumPy default output that is without separating commas. For example, if you print a list it yields [1, 2, 3]. And if you print an array it yields [1 2 3]. You can easily convert a list to a NumPy array using the np.array(lst) constructor.
Coders get paid six figures and more because they can solve problems more effectively using machine intelligence and automation.
To become more successful in coding, solve more real problems for real people. That’s how you polish the skills you really need in practice. After all, what’s the use of learning theory that nobody ever needs?
You build high-value coding skills by working on practical coding projects!
Do you want to stop learning with toy projects and focus on practical code projects that earn you money and solve real problems for people?
If your answer is YES!, consider becoming a Python freelance developer! It’s the best way of approaching the task of improving your Python skills, even if you are a complete beginner.
If you just want to learn about the freelancing opportunity, feel free to watch my free webinar “How to Build Your High-Income Skill Python” and learn how I grew my coding business online and how you can, too, from the comfort of your own home.
Per default, Python doesn’t truncate lists when printing them to the shell, even if they are large. For example, you can call print(my_list) and see the full list even if the list has one thousand elements or more!
Here’s an example:
However, Python may squeeze the text (e.g., in programming environments such as IDLE) so you would have to press the button before seeing the output. The reason is that showing the whole output could be time-consuming and visually cluttering.
Here’s an example:
How to Print a NumPy Array Without Truncating?
In many cases, large NumPy arrays when printed out are not truncated as well on the default Python programming environment IDLE:
However, in the interactive mode of the Python shell, a NumPy array may be truncated, unlike a Python list:
Welcome to the Finxter blog! My name is Chris, and I started this coding venture a couple of years ago.
Over the years, I have chatted with tens of thousands of Finxters who shared their stories and struggles with me.
See here and here to read a lot of feedback from the community.
Today, allow me to share my story about why I started teaching freelancing.
It may inspire you to take control of your life if you’re in a tough spot right now – for example, struggling with the economic, military, and energy crises that are happening right now.
If you’re not interested in my personal story, now would be the time to stop reading. I won’t blame you!
~~~
Once upon a time, when I was a timid and naive 20-year-old dreamer, my 18-year-old girlfriend unexpectedly got pregnant.
She was still in high school, and I had just started studying computer science.
At the time, we had zero income and maybe $900 in savings.
I was living in a cheap 15-square-meter room with a desk and a bed and not much else.
As young and poor parents without any education or degree, we constantly felt judgment and pity from society.
We couldn’t even rent a flat because no landlord was crazy enough to take us in.
During all the struggle, we had love and dreams and the belief that everything would get better eventually: I was going to be a computer scientist in five years.
That is if I found a way to support my family on a shoestring – and avoided screwing up my education.
The first ten years, money was tight as hell. Little time. Lots of hard work. No TV. No Games. No Saturday night partying.
Well, maybe a little…
I am not a wunderkind. But I have a good work ethic and long-term goals, and I don’t give up easily. Finally, after ten tough years, I got my Ph.D. in computer science “summa cum laude”.
I now had a steady paycheck from my government job. But I eventually learned that the academic degrees didn’t help in improving our financial situation.
People made far more money and had far more free time coding in the private sector and without academic degrees.
I decided to take matters into my own hands again by creating my own coding business as a freelance developer.
In little time, I reached six-figure income levels. And I had much more free time compared to my government job that I held before.
My second child – now five years old – knows his father to have infinite time playing soccer, video games, or watching the Tesla Bot taking his first steps on YouTube.
(He plans to become CEO of Tesla – stay tuned @Elon).
~~~
Becoming a freelancer was a pivot point in my life.
To share all I know about creating a thriving coding business online, I have set up our freelancer course.
It focuses on the fundamentals:
find your niche,
build your skills,
create value for your customers, and
take massive action.
Simple, but sometimes not so easy…
If you want more from life and you love coding, feel free to subscribe to my free email academy. I’d love to have you in our community of ambitious coders who have not yet lost their ability to dream of a better life!
It’s part of our long-standing tradition to make this (and other) articles a faithful companion or a supplement to the official Solidity documentation.
Contract Types
To quote the official Solidity documentation, “every contract defines its own type”.
This statement might seem a bit cryptic, and since we’re an efficient crowd, we’d surely like to know what it means.
We can all remember that some number of articles ago, we mentioned how Solidity has key elements of an object-oriented programming language (OOPL). We also emphasized how smart contracts in Solidity are very similar to classes in an OOPL.
Classes themselves are a mesh of custom data types, i.e. structs, and functions, which qualifies classes to be treated as types.
By extension, our contracts are also treated as types, and as every contract is unique in its own right, it defines its own type. Being a type, we can implicitly convert a specific contract to a contract it inherits from, i.e. if contract “Aa” inherits from contract A, it can also be converted to contract “A”.
Besides that, we can explicitly convert each contract to and from the address type. Even more, we can conditionally convert a contract to and from the address payable type (remember, that’s the same type as the address type, but predetermined to receive Ether).
The condition is that the contract type must have a receive or payable fallback function. If it does, we can make the conversion to address payable by using address(x).
However, if the contract type does not implement (a more professional way to say “have”) a receive or payable fallback function, then the conversion to address payable has to be even more explicit (no swearing!) by stating payable(address(x)).
A local variable obc of a contract type OurBeautifulContract is declared by OurBeautifulContract obc;.
Once we point our variable obc to an instantiated (newly created) contract, we’d be able to call functions on that contract.
In terms of its data representation, a contract is identical to the address type. This is important because the contract type is not directly supported by the ABI, but the address type, as its representative, is supported by the ABI.
In contrast to the types mentioned so far, contract types don’t support any operators.
The members of contract types are the external functions (the functions only available to other contracts) and state variables whose visibility is set to public.
When we need to access type information about a contract, like the OurBeautifulContract above, we’d use the expression type(OurBeautifulContract).
Fixed-Size Byte Arrays
The value type bytesN holds a sequence of N bytes, where N ranges from 1 up to 32, i.e., bytes1, …, bytes32.
The available operators for fixed-size byte arrays are:
Comparisons: <=, <, ==, !=, >=, > (evaluate to bool)
Bit operators: &, |, ^ (bitwise exclusive or), ~ (bitwise negation)
Shift operators: << (left shift), >> (right shift)
Index access: If x is of type bytesN, then x[k] for 0 <= k < N returns the k-th byte (read-only). In other words, x[0] up to (inclusive) x[N-1] is available for index access; if N = 1, then x is of type bytes1 and x[0] is the only byte accessible by index.
The shifting operator always uses an unsigned integer type as a right operand, which represents the number of bits to shift by, and returns the type of the left operand.
Let’s take a look at a simple example to illustrate:
bytes2 lo = 0x1234; // (lo is the left operand)
uint8 ro = 5; // (ro is the right operand variable, must be u... type)
lo << ro // will evaluate to an lo type, bytes2
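To get a feel for the result, the shift can be imitated in Python. This is only a sketch under the assumption that a bytes2 value behaves like a 16-bit quantity whose overflowing bits are discarded; the helper name shl_bytes2 is hypothetical.

```python
def shl_bytes2(value, shift):
    """Left shift as on a bytes2 value: bits pushed past 16 bits are dropped."""
    return (value << shift) & 0xFFFF


lo = 0x1234
ro = 5
print(hex(shl_bytes2(lo, ro)))  # 0x4680
```

Without the 16-bit mask, Python would keep the extra bits and print 0x24680, because Python integers grow as needed.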
A fixed-size byte array has only one member, .length, that holds the fixed length of the byte array. This member is accessible as the read-only value.
Warning: The type bytes1[] is an array of bytes1 values, i.e., an array of 1-byte sequences. However, each element of the array is padded with 31 bytes, due to padding rules for elements stored in memory, stack, and calldata, i.e., everywhere except in storage. Therefore, according to the official Solidity documentation, it’s better to use the bytes type instead of bytes1[].
Note: Value types in storage are packed/compacted together and share a storage slot, taking only as much space per value type as really needed. In contrast, the stack, memory, and calldata pad value types and store in separate slots, meaning that each variable uses a whole slot of 32 bytes, even if the value type is shorter than 32 bytes, effectively wasting the memory space.
Before Solidity v0.8.0, the keyword byte was an alias for bytes1.
Dynamically-Sized Byte Arrays
There are two dynamically-sized non-value types, namely bytes and string.
bytes is a dynamically-sized byte array, while
string is a dynamically-sized UTF-8-encoded string.
Address Literals
Address literals are hexadecimal literals that pass the address checksum test, e.g. 0xdCad3a6d3569DF655070DEd06cb7A1b2Ccd1D3AF.
Hexadecimal literals will produce an error if they are between 39 and 41 digits long and do not pass the checksum test.
However, we can remove the error by prepending zeros to integer types or appending zeros to bytesNN types.
The Ethereum Improvement Proposal EIP-55 defines the mixed-case address checksum.
Integer and Rational Literals
Integer Literals
Integer literals are created using a sequence of digits from the range 0-9, and each digit is interpreted (weighted) based on its position in the sequence: it is multiplied by a power of 10. For example, 217 is interpreted as two hundred and seventeen because, reading from right to left, we have 7 * 10**0 + 1 * 10**1 + 2 * 10**2.
A reminder: 10**0 = 1.
Octal literals don’t exist in Solidity and leading zeros are invalid.
Decimal Fractional Literals
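Decimal fractional literals consist of a dot . (in everyday writing, some locales use a comma instead) and at least one digit after the dot, e.g. .1 and 1.3. Older Solidity versions also accepted a form with digits only before the dot, such as 1., which current versions of the documentation explicitly exclude.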
Info: “A locale consists of a number of categories for which country-dependent formatting or other specifications exist” (source).
Scientific Notation
Solidity also supports scientific notation in the form of 2e10, where 2 (left of “e”) is called the mantissa (M), which may be fractional, and the exponent (E) must be an integer. In a general form, we would write it as MeE and it is interpreted as M * 10**E, e.g. 2e10, -2e10, 2e-10, 2.5e1.
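The interpretation M * 10**E can be checked with exact arithmetic. The sketch below uses Python's fractions module so that no rounding takes place; the helper name sci is made up for this illustration, and the mantissa is passed as a string to avoid floating-point inaccuracies.

```python
from fractions import Fraction


def sci(mantissa, exponent):
    """Exact value of mantissa * 10**exponent; mantissa is given as a string."""
    return Fraction(mantissa) * Fraction(10) ** exponent


print(sci('2', 10))    # 20000000000
print(sci('-2', 10))   # -20000000000
print(sci('2', -10))   # 1/5000000000
print(sci('2.5', 1))   # 25
```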
Readable Underscore Notation
We can also do a neat thing: separate the digits of a numeric literal for easier readability, such as in decimal 123_000, hexadecimal 0x2eff_abde, scientific decimal notation 1_2e345_678.
However, underscores can only be added between two digits: leading, trailing, or consecutive underscores are not allowed.
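As a quick way to test your understanding of the underscore rule, here is a sketch of a checker for decimal literals written with a regular expression in Python. It only covers the underscore placement described above (not hexadecimal or scientific notation, and not the leading-zero rule), and the name is_valid_decimal_literal is made up for this example.

```python
import re

# one or more digit groups, separated by exactly one underscore each
_DECIMAL_WITH_UNDERSCORES = re.compile(r"[0-9]+(_[0-9]+)*")


def is_valid_decimal_literal(text: str) -> bool:
    """Check that underscores appear only between two digits."""
    return _DECIMAL_WITH_UNDERSCORES.fullmatch(text) is not None


for candidate in ["123_000", "_123", "123_", "1__000"]:
    print(candidate, is_valid_decimal_literal(candidate))
# 123_000 True
# _123 False
# 123_ False
# 1__000 False
```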
Number Literal Expressions
Expressions containing number literals preserve their precision until they are converted to a non-literal type.
Such a conversion means an explicit conversion, or that the number literals are used with something else than a number literal expression, like boolean literals.
This behavior implies that computations don’t overflow and divisions don’t truncate in number literal expressions.
A very good example would be a number literal expression (2**800 + 1) - 2**800, which results in the constant 1 (of type uint8), although the intermediate results would not fit the capacity of the EVM word length of 32 bytes.
One more example shows that an integer 4 is produced by computing the expression .5 * 8, although the intermediary results are not integers.
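Both examples can be reproduced with exact rational arithmetic. The following Python sketch uses fractions.Fraction to mimic arbitrary-precision literal expressions; fits_uint is a hypothetical helper that checks whether a result could be converted to an unsigned integer type of a given width.

```python
from fractions import Fraction


def fits_uint(value: Fraction, bits: int) -> bool:
    """True if value is a non-negative integer that fits into `bits` bits."""
    return value.denominator == 1 and 0 <= value < 2 ** bits


huge = Fraction(2) ** 800
difference = (huge + 1) - huge   # intermediate results exceed 256 bits
product = Fraction(1, 2) * 8     # .5 * 8
quotient = Fraction(7, 2)        # 7 / 2 stays exact

print(fits_uint(huge, 256))                  # False
print(difference, fits_uint(difference, 8))  # 1 True
print(product)                               # 4
print(quotient, fits_uint(quotient, 256))    # 7/2 False
```

Only the final value matters: the difference fits into 8 bits even though neither operand would fit into a 256-bit word.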
More Operations
Warning: most operators produce a literal expression when applied to number literals, but there are also two exceptions:
Ternary operator (... ? ... : ...),
Array subscript (<array>[<index>]).
In other words, expressions like 255 + (true ? 1 : 0) or 255 + [1, 2, 3][0] are not equivalent to using the literal 256 (the result of these two expressions), as they are computed within the type uint8 and can lead to an overflow.
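The difference is easy to simulate. The sketch below models an addition that is carried out inside uint8, assuming checked arithmetic that fails on overflow (the default behavior since Solidity v0.8.0); checked_add_uint8 is a hypothetical helper written in Python for illustration.

```python
UINT8_MAX = 2 ** 8 - 1  # 255


def checked_add_uint8(a: int, b: int) -> int:
    """Add two uint8 values and fail if the result leaves the uint8 range."""
    for value in (a, b):
        if not 0 <= value <= UINT8_MAX:
            raise ValueError(f"{value} is not a uint8 value")
    result = a + b
    if result > UINT8_MAX:
        raise OverflowError(f"{a} + {b} does not fit into uint8")
    return result


print(255 + 1)  # 256: the pure literal expression is computed exactly

try:
    checked_add_uint8(255, 1)  # 255 + (true ? 1 : 0) is computed in uint8
except OverflowError as error:
    print("overflow:", error)
```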
Number literal expressions can use the same operators as integers, with two restrictions:
If either of the operands is fractional, bit operations are not allowed;
If the exponent is fractional, exponentiation is not allowed.
Shifts and exponentiation with a literal number as the left (or base) operand and an integer type as the right (or exponent) operand are always performed in uint256 for non-negative literals or in int256 for negative literals, regardless of the type of the right operand.
Warning: Since Solidity v0.4.0 division on integer literals produces a rational number, e.g. 7 / 2 = 3.5.
Solidity has a number literal type for each rational number. Integer literals and rational number literals both belong to number literal types.
All number literal expressions (expressions with only number literals and operators) also belong to number literal types, e.g. 1 + 2 and 2 + 1 belong to the same number literal type.
Note: When number literal types are used with non-literal expressions, they are converted into a non-literal type, e.g. uint128 a = 1; uint128 b = 2.5 + a + 0.5;
Here, 1 is converted into a non-literal type uint128, i.e. variable a, but a common type for both 2.5 and uint128 doesn’t exist and the compiler will reject the code.
Conclusion
In this article, we added even more data types in Solidity under our proverbial belt!
First, we introduced and learned about the contract type.
Second, we fixed our understanding of the fixed-size byte array type.
Third, the situation got dynamic by studying the dynamically-sized byte array type.
Fourth, we addressed the… what was it called… Aha – address literals!
Fifth, we came to the most rational decision and discovered what rational and integer literals are and, of course, how they can be put to good use.
To convert a given Boolean value to a string in Python, use the str(boolean) function and pass the Boolean value into it. This converts Boolean True to string "True" and Boolean False to string "False".
Here’s a minimal example:
>>> str(True)
'True'
>>> str(False)
'False'
Python Boolean Type is Integer
Booleans are represented by integers in Python, i.e., bool is a subclass of int. Boolean value True is represented with integer 1. And Boolean value False is represented with integer 0.
Here’s a minimal example:
>>> True == 1
True
>>> False == 0
True
Convert True to ‘1’ and False to ‘0’
To convert a Boolean value to a string '1' or '0', use the expression str(int(boolean)). For instance, str(int(True)) returns '1' and str(int(False)) returns '0'. This is because of Python’s use of integers to represent Boolean values.
Here’s a minimal example:
>>> str(int(True))
'1'
>>> str(int(False))
'0'
Convert List of Boolean to List of Strings
To convert a list of Booleans to a list of strings, use the list comprehension expression [str(x) for x in my_bools], assuming the Boolean list is stored in the variable my_bools. This converts each Boolean x to a string using the built-in str() function and repeats it for all x in the Boolean list.
Here’s a simple example:
my_bools = [True, True, False, False, True]
my_strings = [str(x) for x in my_bools]
print(my_strings)
# ['True', 'True', 'False', 'False', 'True']
Convert String Back to Boolean
What if you want to convert the string representation 'True' and 'False' (or: '1' and '0') back to the Boolean representation True and False?
You can convert a string value s to a Boolean value using the Python function bool(s).
For example, bool('True') and bool('1') return True.
However, bool('False') and bool('0') return True as well, which may come as a surprise.
This is because every Python object has an associated Boolean value, its “truthiness”. As a rule of thumb: empty values evaluate to Boolean False and non-empty values evaluate to Boolean True. So, only bool('') on the empty string '' returns False. All other strings return True!
So how do you convert the string correctly? Easy: first pass the string into the eval() function and then pass the result into the bool() function. In other words, the expression bool(eval(my_string)) converts a string to a Boolean, mapping 'True' and '1' to Boolean True and 'False' and '0' to Boolean False.
Finally – this behavior is as expected by many coders just starting out.
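Keep in mind that eval() executes whatever expression the string contains, so it is best reserved for strings you fully control. If the strings come from user input or files, a plain lookup is a safer route. Here is a minimal sketch; the function name str_to_bool and the choice to ignore case and surrounding whitespace are assumptions of this example.

```python
_TRUE_STRINGS = {"true", "1"}
_FALSE_STRINGS = {"false", "0"}


def str_to_bool(text: str) -> bool:
    """Map 'True'/'1' to True and 'False'/'0' to False without eval()."""
    normalized = text.strip().lower()
    if normalized in _TRUE_STRINGS:
        return True
    if normalized in _FALSE_STRINGS:
        return False
    raise ValueError(f"not a Boolean string: {text!r}")


print(str_to_bool("True"))   # True
print(str_to_bool("0"))      # False
print(bool("False"))         # True, the pitfall described above
```

Any other string raises a ValueError instead of silently turning into True.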
A time series is essentially tabular data with the special feature of having a time index. The common forecast task is ‘knowing the past (and sometimes the present), predict the future’. This task, taken as a principle, reveals itself in several ways: in how to interpret your problem, in feature engineering and in which forecast strategy to take.
This is the second article in our series. In the first article we discussed how to create features out of a time series using lags and trends. Today we follow the opposite direction by highlighting trends as something you want directly deducted from your model.
The reason is that Machine Learning models work in different ways. Some are good with subtractions, others are not.
For example, for any feature you include in a Linear Regression, the model will automatically detect whether to deduce it from the actual data or not. A Tree Regressor (and its variants) will not behave in the same way and usually will ignore a trend in the data.
Therefore, whenever using the latter type of models, one usually calls for a hybrid model, meaning, we use a Linear(ish) first model to detect global periodic patterns and then apply a second Machine Learning model to infer more sophisticated behavior.
The hybrid model part of this article is heavily based on Kaggle’s Time Series Crash Course, however, we intend to automate the process and discuss more in-depth the DeterministicProcess class.
DeterministicProcess aims at creating features to be used in a Regression model to determine trend and periodicity. It takes your DatetimeIndex and a few other parameters and returns a DataFrame full of features for your ML model.
A usual instance of the class will read like the one below. We use the sentic_mean column to illustrate.
from statsmodels.tsa.deterministic import DeterministicProcess
y = dataset['sentic_mean'].copy()
dp = DeterministicProcess(
    index=y.index, constant=True, order=2
)
X = dp.in_sample()
X
We can use X and y as features and target to train a LinearRegression model. In this way, the LinearRegression will learn whatever characteristics from y can be inferred (in our case) solely out of:
the number of elapsed time intervals (trend column);
the last number squared (trend_squared); and
a bias term (const).
Check out the result:
import pandas as pd
from sklearn.linear_model import LinearRegression
model = LinearRegression().fit(X, y)
predictions = pd.DataFrame(
    model.predict(X), index=X.index, columns=['Deterministic Curve']
)
Even the quadratic term seems ignorable here. The DeterministicProcess class also helps us with future predictions since it carries a method that provides the appropriate future form of the chosen features.
Specifically, the out_of_sample method of dp takes the number of time intervals we want to predict as input and generates the needed features for you.
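To make clear what these feature tables contain, here is a dependency-free sketch that rebuilds the const, trend and trend_squared columns by hand. It is not the statsmodels implementation: the function trend_features is hypothetical, and it assumes that the time counter starts at 1 and simply keeps counting for future rows.

```python
def trend_features(n_obs, order=2, constant=True, start=1):
    """Build one dict of polynomial trend features per time step."""
    names = ["trend", "trend_squared", "trend_cubed"]
    if not 0 <= order <= len(names):
        raise ValueError("this sketch supports orders 0 to 3 only")
    rows = []
    for t in range(start, start + n_obs):
        row = {}
        if constant:
            row["const"] = 1.0               # bias term, same value in every row
        for power in range(1, order + 1):
            row[names[power - 1]] = float(t ** power)
        rows.append(row)
    return rows


in_sample = trend_features(5)                                # steps 1..5
out_of_sample = trend_features(3, start=len(in_sample) + 1)  # steps 6..8

print(in_sample[-1])     # {'const': 1.0, 'trend': 5.0, 'trend_squared': 25.0}
print(out_of_sample[0])  # {'const': 1.0, 'trend': 6.0, 'trend_squared': 36.0}
```

Because the features depend only on the position in time, the future rows are known in advance, which is exactly what makes them usable for forecasting.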
Although the order-three polynomial fits the data better, use discretion in deciding whether the sentiment count will decrease so drastically in the next 60 days or not. Usually, trust short-time predictions rather than long ones.
DeterministicProcess accepts other parameters, making it a very interesting tool. Find a description of the almost full list below.
dp = DeterministicProcess(
    index,  # the DatetimeIndex of your data
    period: int or None,  # in case the data shows some periodicity, include the size of the periodic cycle here: 7 would mean 7 days in our case
    constant: bool,  # includes a constant feature in the returned DataFrame, i.e., a feature with the same value for every row. It is the equivalent of a bias term in Linear Regression
    order: int,  # order of the polynomial that you think best approximates your trend: the simpler the better
    seasonal: bool,  # make it True if you think the data has some periodicity. If you make it True and do not specify the period, the dp will try to infer the period from the index
    additional_terms: tuple of statsmodels' DeterministicTerms,  # we come back to this next
    drop: bool  # drops resulting features which are collinear to others. If you will use a linear model, make it True
)
Seasonality
As a hardened mathematician, I find seasonality my favorite part because it deals with Fourier analysis (and wave functions are just… cool!):
Do you remember your first ML course when you heard Linear Regression can fit arbitrary functions, not only lines? So, why not a wave function? We just did it for polynomials and didn’t even feel like it.
In general, for any expression f which is a function of a feature or of your DatetimeIndex, you can create a feature column whose ith row is the value of f corresponding to the ith index.
Then linear regression finds the constant coefficient multiplying f that best fits your data. Again, this procedure works in general, not only with Datetime indexes – the trend_squared term above is an example of it.
For seasonality, we use a second amazing statsmodels class: CalendarFourier. It is another statsmodels DeterministicTerm class (i.e., with the in_sample and out_of_sample methods) and is instantiated with two parameters, freq and order.
As freq, the class expects a string such as ‘D’, ‘W’, ‘M’ for day, week or month, respectively, or any of the quite comprehensive Pandas Datetime offset aliases.
The order is the Fourier expansion order, which should be understood as the number of waves you are expecting in your chosen frequency (count the number of ups and downs: one wave would be understood as one up and one down).
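To see what "order" means in terms of features, here is a hand-made sketch of Fourier terms using only the math module. It is an illustration, not the CalendarFourier code: fourier_features and its column names are assumptions of this example, and the position inside the cycle is passed in as a plain number.

```python
import math


def fourier_features(position, period, order):
    """Return the sin/cos pairs for one observation inside a cycle.

    position: how far into the cycle the observation is (e.g. day 3 of 7)
    period:   length of one full cycle
    order:    number of waves (harmonics) to include
    """
    features = {}
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * position / period
        features[f"sin({k})"] = math.sin(angle)
        features[f"cos({k})"] = math.cos(angle)
    return features


# a weekly cycle with two harmonics gives four feature columns per row
row = fourier_features(position=3, period=7, order=2)
print(list(row))  # ['sin(1)', 'cos(1)', 'sin(2)', 'cos(2)']
```

Each extra order adds one faster sine/cosine pair, so a linear model on these columns can fit a wave with more ups and downs per cycle.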
CalendarFourier integrates swiftly with DeterministicProcess by including an instance of it in the list of additional_terms.
If we take seasonal=True inside DeterministicProcess, we get a crispier line:
Including ax.set_xlim(('2022-08-01', '2022-10-01')) before plt.show() zooms the graph in:
Although I suggest using the seasonal=True parameter with care, it does find interesting patterns (with a huge RMSE, though).
For instance, look at this BTC percentage change zoomed chart:
Here period is set to 30 and seasonal=True. I also manually rescaled the predictions to make them more visible in the graphic. Although the predictions are far from the truth, thinking as a trader, isn’t it impressive how many peaks and valleys it gets right? At least for this zoomed month…
To maintain the workflow promise, I prepared code that does everything so far in one shot:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from statsmodels.tsa.deterministic import CalendarFourier, DeterministicProcess

def deseasonalize(df: pd.Series, season_freq='A', fourier_order=0,
                  constant=True, dp_order=1, dp_drop=True,
                  model=LinearRegression(),
                  fourier=None, dp=None,
                  **DeterministicProcesskwargs) -> (pd.Series, plt.Axes, DeterministicProcess):
    """
    Returns a deseasonalized and detrended df, a seasonal plot, and the fitted DeterministicProcess instance.
    """
    if fourier is None:
        fourier = CalendarFourier(freq=season_freq, order=fourier_order)
    if dp is None:
        dp = DeterministicProcess(
            index=df.index,
            constant=constant,
            order=dp_order,
            additional_terms=[fourier],
            drop=dp_drop,
            **DeterministicProcesskwargs
        )
    X = dp.in_sample()
    # fit the model passed in by the caller (a LinearRegression by default)
    model = model.fit(X, df)
    y_pred = pd.Series(
        model.predict(X),
        index=X.index,
        name=df.name + '_pred'
    )
    # plot the original series together with the fitted seasonal/trend curve
    ax = plt.subplot()
    df.plot(ax=ax, legend=True)
    y_pred.plot(ax=ax, legend=True)
    y_deseason = df - y_pred
    y_deseason.name = df.name + '_deseasoned'
    return y_deseason, ax, dp

The sentic_mean analyses get reduced to:
y_deseason, ax, dp = deseasonalize(y, season_freq='A', fourier_order=2, constant=True,
                                   dp_order=2, dp_drop=True, model=LinearRegression())
Cycles and Hybrid Models
Let us move on to a complete Machine Learning prediction. We use XGBRegressor and compare its performance among three instances:
Predict sentic_mean directly using lags;
Same prediction adding the seasonal/trending with a DeterministicProcess;
A hybrid model, using LinearRegression to infer and remove seasons/trends, and then applying an XGBRegressor.
The first part will be the bulkier since the other two follow from simple modifications in the resulting code.
Preparing the data
Before any analysis, we split the data into train and test sets. Since we are dealing with time series, this means we set the ‘present date’ as a point in the past and try to predict its respective ‘future’. Here we pick 22 days in the past.
s = dataset['sentic_mean']
s_train = s[:'2022-09-01']
We made this first split in order to not leak data while doing any analysis.
Next, we prepare target and feature sets. Recall our SentiCrypto’s data was set to be available every day at 8AM. Imagine we are doing the prediction by 9AM.
In this case, anything until the present date (the ‘lag_0‘) can be used as features, and our target is s_train‘s first lead (which we define as a -1 lag). To choose other lags as features, we examine statsmodels’ partial autocorrelation plot:
from statsmodels.graphics.tsaplots import plot_pacf
plot_pacf(s_train, lags=20)
We use the first four for sentic_mean and the first seven + the 11th for sentic_count (you can easily test different combinations with the code below.)
Now that we have finished choosing features, we go back to the full series for engineering. We apply to s_mean and s_count the make_lags function we defined in the last article (which we transcribe here for convenience).
def make_lags(df, n_lags=1, lead_time=1):
    """
    Compute lags of a pandas.Series from lead_time to lead_time + n_lags. Alternatively, a list can be passed as n_lags.
    Returns a pd.DataFrame whose ith column is either the i+lead_time lag or the ith element of n_lags.
    """
    if isinstance(n_lags, int):
        lag_list = list(range(lead_time, n_lags + lead_time))
    else:
        lag_list = n_lags
    # one shifted copy of the series per requested lag
    lags = {
        f'{df.name}_lag_{i}': df.shift(i) for i in lag_list
    }
    return pd.concat(lags, axis=1)

X = make_lags(s, [0, 1, 2, 3, 4])
y = make_lags(s, [-1])
display(X)
y
Now a train-test split with sklearn is convenient (Notice the shuffle=False parameter, that is key for time series):
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=22, shuffle=False)
X_train
(Observe that the final date is set correctly, in accordance with our analysis’ split.)
You can reduce overfitting by reducing the number of estimators, but the R2 test score remains negative.
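A negative R2 may look odd at first, so here is a small hand computation of the score in plain Python. The function r2 below follows the usual definition (one minus the ratio of squared errors to squared deviations from the mean); the toy numbers are invented purely for illustration.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1, 2, 3, 4]
print(r2(y_true, [1, 2, 3, 4]))          # 1.0  (perfect prediction)
print(r2(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0  (always predicting the mean)
print(r2(y_true, [4, 3, 2, 1]))          # -3.0 (worse than the mean)
```

In other words, a negative test score means the model does worse on unseen data than a constant guess at the mean of the test targets.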
We can replicate the process for sentic_count (or whatever you want). Below is a function to automate it.
import pandas as pd
import matplotlib.pyplot as plt
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from statsmodels.tsa.stattools import pacf

def apply_univariate_prediction(series, test_size, to_predict=1, nlags=20, minimal_pacf=0.1, model=XGBRegressor(n_estimators=50)):
    '''
    Starting from series, breaks it in train and test subsets;
    chooses which lags to use based on pacf > minimal_pacf;
    and applies the given sklearn-type model.
    Returns the resulting features and targets and the trained model.
    It plots the graph of the training and prediction, together with their r2_score.
    '''
    # choose the lags on the training part only, so no test data leaks in
    s = series.iloc[:-test_size]
    if isinstance(to_predict, int):
        to_predict = [to_predict]
    s_pacf = pd.Series(pacf(s, nlags=nlags))
    column_list = s_pacf[s_pacf > minimal_pacf].index
    X = make_lags(series, n_lags=column_list).dropna()
    y = make_lags(series, n_lags=[-x for x in to_predict]).loc[X.index]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, shuffle=False)
    model.fit(X_train, y_train)
    predictions_train = pd.DataFrame(
        model.predict(X_train),
        index=X_train.index,
        columns=['Train Predictions']
    )
    predictions_test = pd.DataFrame(
        model.predict(X_test),
        index=X_test.index,
        columns=['Test Predictions']
    )
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5), sharey=True)
    y_train.plot(ax=ax1, legend=True)
    predictions_train.plot(ax=ax1)
    ax1.set_title('Train Predictions')
    y_test.plot(ax=ax2, legend=True)
    predictions_test.plot(ax=ax2)
    ax2.set_title('Test Predictions')
    plt.show()
    # drop the last row of each set before scoring: the final test target is NaN
    # because the next day's value is not known yet
    print(f'R2 train score: {r2_score(y_train[:-1], predictions_train[:-1])}')
    print(f'R2 test score: {r2_score(y_test[:-1], predictions_test[:-1])}')
    return X, y, model

apply_univariate_prediction(dataset['sentic_count'], 22)
Since the features created by DeterministicProcess are only time-dependent, we can add them harmlessly to the feature DataFrame we automatically get from our univariate predictions.
The predictions, though, are still univariate. We use the deseasonalize function to obtain the season features. The data preparation is as follows:
s = dataset['sentic_mean']
X, y, _ = apply_univariate_prediction(s, 22)
s_deseason, _, dp = deseasonalize(s, season_freq='A', fourier_order=2, constant=True,
                                  dp_order=2, dp_drop=True, model=LinearRegression())
X_f = dp.in_sample().shift(-1)
X = pd.concat([X, X_f], axis=1, join='inner').dropna()
With a bit of copy and paste, we arrive at:
And we actually perform way worse!
Deseasonalizing
Nevertheless, the right-hand graphic illustrates the model’s inability to grasp trends. Our last shot is a hybrid model.
Here we follow three steps:
We use the LinearRegression to capture the seasons and trends, rendering the series y_s. Then we acquire a deseasonalized target y_ds = y-y_s;
Train an XGBRegressor on y_ds and the lagged features, resulting in deseasonalized predictions y_pred;
Finally, we incorporate y_s back to y_pred to compare the final result.
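The three steps can be sketched without any ML library. In the toy version below, a straight line fitted by least squares plays the role of the LinearRegression, and a simple "repeat the last residual" rule stands in for the XGBRegressor; that substitution, as well as the names fit_line, last_value and hybrid_forecast, are assumptions of this illustration only.

```python
def fit_line(y):
    """Least-squares line through the points (0, y[0]), (1, y[1]), ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    cov = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def last_value(residuals):
    """Stand-in second model: predict that the last residual repeats."""
    return residuals[-1]


def hybrid_forecast(y, residual_model=last_value):
    """One-step-ahead forecast: trend model plus residual model."""
    slope, intercept = fit_line(y)
    y_s = [intercept + slope * t for t in range(len(y))]   # step 1: trend part
    y_ds = [v - s for v, s in zip(y, y_s)]                 # detrended target
    residual_next = residual_model(y_ds)                   # step 2: model the rest
    trend_next = intercept + slope * len(y)                # trend one step ahead
    return trend_next + residual_next                      # step 3: add it back


series = [2 * t + 1 for t in range(10)]   # a perfectly linear toy series
print(hybrid_forecast(series))            # 21.0
```

The split of labor is the point: the first model extrapolates the trend, which tree-based models cannot do, and the second model only has to explain what is left over.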
Although Bitcoin-related data are hard to predict, there was a huge improvement on the r2_score (finally something positive!). The function we used is defined below.
Instead of going through every detail, we will also automate this code. In order to get the code running smoothly, we revisit the deseasonalize and the apply_univariate_prediction functions in order to remove the plotting part of them.
The final function only plots graphs and returns nothing. It intends to give you a baseline for a hybrid model score. Change the function at will to make it return whatever you need.
def get_season(series: pd.Series, test_size, season_freq='A', fourier_order=0,
               constant=True, dp_order=1, dp_drop=True,
               model1=LinearRegression(),
               fourier=None, is_seasonal=False, season_period=None, dp=None):
    """
    Decompose series in a deseasonalized and a seasonal part. The parameters are relative to the fourier and DeterministicProcess used.
    Returns y_ds and y_s.
    """
    # fit the seasonal/trend model on the training part only
    se = series.iloc[:-test_size]
    if fourier is None:
        fourier = CalendarFourier(freq=season_freq, order=fourier_order)
    if dp is None:
        dp = DeterministicProcess(
            index=se.index,
            constant=constant,
            order=dp_order,
            additional_terms=[fourier],
            drop=dp_drop,
            seasonal=is_seasonal,
            period=season_period
        )
    X_in = dp.in_sample()
    X_out = dp.out_of_sample(test_size)
    model1 = model1.fit(X_in, se)
    # seasonal/trend curve over the train and test dates
    X = pd.concat([X_in, X_out], axis=0)
    y_s = pd.Series(
        model1.predict(X),
        index=X.index,
        name=series.name
    )
    y_ds = series - y_s
    y_ds.name = series.name + '_deseasoned'
    return y_ds, y_s

def prepare_data(series, test_size, to_predict=1, nlags=20, minimal_pacf=0.1):
    '''
    Creates a feature dataframe by making lags and a target series by a negative to_predict-shift.
    Returns X, y.
    '''
    s = series.iloc[:-test_size]
    if isinstance(to_predict, int):
        to_predict = [to_predict]
    from statsmodels.tsa.stattools import pacf
    s_pacf = pd.Series(pacf(s, nlags=nlags))
    column_list = s_pacf[s_pacf > minimal_pacf].index
    X = make_lags(series, n_lags=column_list).dropna()
    y = make_lags(series, n_lags=[-x for x in to_predict]).loc[X.index].squeeze()
    return X, y

def get_hybrid_univariate_prediction(series: pd.Series, test_size, season_freq='A', fourier_order=0,
                                     constant=True, dp_order=1, dp_drop=True,
                                     model1=LinearRegression(),
                                     fourier=None, is_seasonal=False, season_period=None, dp=None,
                                     to_predict=1, nlags=20, minimal_pacf=0.1,
                                     model2=XGBRegressor(n_estimators=50)):
    """
    Apply the hybrid model method by deseasonalizing/detrending a time series with model1 and investigating the resulting series with model2.
    It plots the respective graphs and computes r2_scores.
    """
    y_ds, y_s = get_season(series, test_size, season_freq=season_freq, fourier_order=fourier_order,
                           constant=constant, dp_order=dp_order, dp_drop=dp_drop,
                           model1=model1, fourier=fourier, dp=dp,
                           is_seasonal=is_seasonal, season_period=season_period)
    X, y_ds = prepare_data(y_ds, test_size=test_size, to_predict=to_predict,
                           nlags=nlags, minimal_pacf=minimal_pacf)
    X_train, X_test, y_train, y_test = train_test_split(X, y_ds, test_size=test_size, shuffle=False)
    # target on the original scale: seasonal part plus the (shifted) deseasonalized target
    y = y_s.squeeze() + y_ds.squeeze()
    model2 = model2.fit(X_train, y_train)
    predictions_train = pd.Series(
        model2.predict(X_train),
        index=X_train.index,
        name='Prediction'
    ) + y_s[X_train.index]
    predictions_test = pd.Series(
        model2.predict(X_test),
        index=X_test.index,
        name='Prediction'
    ) + y_s[X_test.index]
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5), sharey=True)
    y_train_ps = y.loc[y_train.index]
    y_test_ps = y.loc[y_test.index]
    y_train_ps.plot(ax=ax1, legend=True)
    predictions_train.plot(ax=ax1)
    ax1.set_title('Train Predictions')
    y_test_ps.plot(ax=ax2, legend=True)
    predictions_test.plot(ax=ax2)
    ax2.set_title('Test Predictions')
    plt.show()
    print(f'R2 train score: {r2_score(y_train_ps[:-to_predict], predictions_train[:-to_predict])}')
    print(f'R2 test score: {r2_score(y_test_ps[:-to_predict], predictions_test[:-to_predict])}')
A note of warning: if you do not expect your data to follow time patterns, do focus on cycles! The hybrid model succeeds well for many tasks, but it actually decreases the R2 score of our previous Bitcoin prediction:
This article aims at presenting functions for your time series workflow, especially for lags and deseasonalization. Use them with care, though: apply them to have baseline scores before delving into more sophisticated models.
In future articles we will bring forth multi-step predictions (predict more than one day ahead) and compare performance of different models, both univariate and multivariate.