
Python Return String From Function


Do you need to create a function that returns a string but you don’t know how? No worries, in sixty seconds, you’ll know! Go! 🚀

A Python function can return any object, including a string. To return a string, create the string object within the function body, assign it to a variable my_string, and return it to the caller using the statement return my_string. Or simply create the string within the return expression, like so: return "hello world"

def f():
    return 'hello world'

print(f())
# hello world

Create String in Function Body

Let’s have a look at another example:

The following code creates a function create_string() that iterates over all numbers 0, 1, 2, …, 9, appends them to the string my_string, and returns the string to the caller of the function:

def create_string():
    ''' Function to return string '''
    my_string = ''
    for i in range(10):
        my_string += str(i)
    return my_string

s = create_string()
print(s)
# 0123456789

Note that you store the resulting string in the variable s. The local variable my_string that you created within the function body is only visible within the function but not outside of it.

So, if you try to access the name my_string, Python will raise a NameError:

>>> print(my_string)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    print(my_string)
NameError: name 'my_string' is not defined

To fix this, simply assign the return value of the function — a string — to a new variable and access the content of this new variable:

>>> s = create_string()
>>> print(s)
0123456789

There are many other ways to return a string in Python.

Return String With List Comprehension

For example, you can use a list comprehension in combination with the str.join() method, which is much more concise than the previous code but creates the same string of digits:

def create_string():
    ''' Function to return string '''
    return ''.join([str(i) for i in range(10)])

s = create_string()
print(s)
# 0123456789

For a quick recap on list comprehension, feel free to scroll down to the end of this article.

You can also add some separator strings like so:

def create_string():
    ''' Function to return string '''
    return ' xxx '.join([str(i) for i in range(10)])

s = create_string()
print(s)
# 0 xxx 1 xxx 2 xxx 3 xxx 4 xxx 5 xxx 6 xxx 7 xxx 8 xxx 9

Return String with String Concatenation

You can also use string concatenation and string multiplication to create a string dynamically and return it from a function.

Here’s an example of string multiplication:

def create_string():
    ''' Function to return string '''
    return 'ho' * 10

s = create_string()
print(s)
# hohohohohohohohohoho

String Concatenation of Function Arguments

Here’s an example of string concatenation that appends all arguments to a given string and returns the result from the function:

def create_string(a, b, c):
    ''' Function to return string '''
    return 'My String: ' + a + b + c

s = create_string('python ', 'is ', 'great')
print(s)
# My String: python is great

Concatenate Arbitrary String Arguments and Return String Result

You can also use dynamic argument lists to be able to add an arbitrary number of string arguments and concatenate all of them:

def create_string(*args):
    ''' Function to return string '''
    return ' '.join(str(x) for x in args)

print(create_string('python', 'is', 'great'))
# python is great

print(create_string(42, 41, 40, 41, 42, 9999, 'hi'))
# 42 41 40 41 42 9999 hi

Background List Comprehension

💡 Knowledge: List comprehension is a very useful Python feature that allows you to dynamically create a list by using the syntax [expression context]. You iterate over all elements in a given context, such as for i in range(10), and apply a certain expression, e.g., the identity expression i, before adding the resulting values to the newly created list.
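As a quick illustration of the [expression context] pattern, here is a minimal sketch. The variable names are chosen only for this example; the explicit loop is included for comparison.

```python
# Identity expression: collect the numbers 0..9 unchanged.
digits = [i for i in range(10)]

# A non-trivial expression: convert each number to a string first.
digit_strings = [str(i) for i in range(10)]

# The equivalent explicit loop, for comparison.
loop_result = []
for i in range(10):
    loop_result.append(str(i))

print(digits)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(''.join(digit_strings))
# 0123456789
```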

In case you need to learn more about list comprehension, feel free to check out my explainer video:

YouTube Video

Programmer Humor

Q: How do you tell an introverted computer scientist from an extroverted computer scientist? A: An extroverted computer scientist looks at your shoes when he talks to you.

Deep Forecasting Bitcoin with LSTM Architectures


Although Neural Networks do a tremendous job learning rules in tabular, structured data, they leave a great deal to be desired in terms of ‘unstructured’ data. This brings us to a new concept: Recurrent Neural Networks.

YouTube Video

Recurrent Neural Network

A Recurrent Neural Network is to a Feedforward Neural Network as a list is to a single object: it may be thought of as a set of interrelated feedforward networks, or as a looped network.

RNNs specialize in picking up and highlighting the main characteristics of your data (more on that in Andrej Karpathy’s Blog). They are often followed by a Feed Forward (Dense) layer, which weighs the output.

Long Short-Term Memory

Long Short-Term Memory (LSTM) units have the additional ability to deal with time (more on this can be found in Colah’s article).

As the term memory suggests, their greatest promise is to understand correlations between past and present events. In particular, they fit naturally into time series forecasting.

Here we aim at a hands-on introduction to several LSTM-based architectures (and more is to come 😉).

Article Overview

We use the Bitcoin daily closing price as a case study. Specifically, we use the Bitcoin price and sentiment analysis we gathered in a previous article. We use TensorFlow’s Keras API for the implementation.

In this article, we cover the following architectures:

  1. ‘Vanilla’ LSTM
  2. Stacked LSTM
  3. Bidirectional LSTM
  4. Encoder-Decoder LSTM-LSTM
  5. Encoder-Decoder CNN-LSTM

The last one is the most convoluted (pun intended).

One main issue when dealing with time series is how the problem is framed. Two situations are common: having only the historical target values (a univariate problem), or having them together with other information (a multivariate problem).

Moreover, you might be interested in a one-step prediction or a multi-step prediction, i.e., predicting only the next day or, say, all days in the next week. Although it may not sound like it, you have to adjust your model to whichever situation you are facing.

Think of how you would deal with a multivariate multi-step problem: should you train a one-step model and forecast all features in order to feed them back into your model to predict the following days? That would be crazy!

Kaggle’s time series course does a good job introducing the strategies available for multi-step prediction. Fortunately, setting up an LSTM network for a multi-step multivariate problem is as easy as setting it up for a univariate one-step problem: you just need to change two numbers.
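To make the one-step versus multi-step framing concrete, here is a minimal pure-Python sketch. The helper make_windows and the toy price list are hypothetical and used only for illustration; they are not the make_lags function used later in this article.

```python
def make_windows(series, n_lags, n_steps):
    """Split a univariate series into (past window, future targets) pairs.

    Hypothetical helper for illustration only.
    """
    X, y = [], []
    for t in range(n_lags, len(series) - n_steps + 1):
        X.append(series[t - n_lags:t])    # the n_lags most recent values
        y.append(series[t:t + n_steps])   # the next n_steps values
    return X, y

prices = [10, 11, 12, 13, 14, 15, 16]     # toy data, not real Bitcoin prices

# One-step problem: predict only the next value.
X1, y1 = make_windows(prices, n_lags=3, n_steps=1)
# Multi-step problem: predict the next two values.
X2, y2 = make_windows(prices, n_lags=3, n_steps=2)

print(X1[0], y1[0])
# [10, 11, 12] [13]
print(X2[0], y2[0])
# [10, 11, 12] [13, 14]
```

Only the two numbers n_lags and n_steps change between the two problems; the windowing logic stays the same.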

This is another advantage of Neural Networks, apart from their capacity for memory.

Of course, the architecture list above is not exhaustive. For instance, a new Attention layer was recently introduced, which has been working wonders. We shall come back to it in a future article, where we will walk through a hybrid Attention-CLX model.

Credits to ML Mastery blog for part of the code. 

🚫 Disclaimer: This article is a programming/data analysis tutorial only and is not intended to be any kind of investment advice.

How to Prepare the Data for LSTM?

We will use two sources of data, both described in our previous article: SentiCrypt’s Bitcoin sentiment analysis and Bitcoin’s daily closing price. (By following the steps in the previous article, you can do it differently, using minute-level data, for example.)

Let us load the already-saved sentiment analysis and download the Bitcoin price:

import pandas as pd
import yfinance as yf

sentic = pd.read_csv('sentic.csv', index_col=0, parse_dates=True)
sentic.index.freq = 'D'

# interval='1d' requests daily bars; start and end already fix the date range
btc = yf.download('BTC-USD', start='2020-02-14', end='2022-09-23', interval='1d')[['Close']]
btc.columns = ['btc']

data = pd.concat([sentic, btc], axis=1)
data

The LSTM layer expects a 3D array as input whose shape represents:

(data_size, timesteps, number_of_features).

Meaning, the first and last elements are the number of rows and columns of the input data, respectively. The timesteps value is the size of the time chunk you want your LSTM to process at a time. This is the time frame in which the LSTM looks for relations between past and present. It is essentially the size of its (long short-term) memory.
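The (data_size, timesteps, number_of_features) layout can be sketched with a small NumPy array. The sizes below are arbitrary toy values chosen for this illustration, and the column ordering is an assumption of the sketch.

```python
import numpy as np

n_samples, timesteps, n_features = 4, 3, 2

# A flat 2D table: each row holds timesteps * n_features values,
# ordered as [t0_f0, t0_f1, t1_f0, t1_f1, t2_f0, t2_f1].
flat = np.arange(n_samples * timesteps * n_features).reshape(n_samples, -1)

# Reshape to the (data_size, timesteps, number_of_features) layout.
cube = flat.reshape(n_samples, timesteps, n_features)

print(flat.shape)
# (4, 6)
print(cube.shape)
# (4, 3, 2)
print(cube[0].tolist())
# [[0, 1], [2, 3], [4, 5]]
```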

To decide how many time-steps, we recall our first time series article where we explored partial auto-correlations of Bitcoin price’s lags.

That is easily achieved through statsmodels:

from statsmodels.graphics.tsaplots import plot_pacf
import matplotlib.pyplot as plt

plot_pacf(data.btc, lags=20)
plt.show()

If you were there with me in the first article, you might remember our curious 10-lag correlation. Here we use this magic number: we feed the model a 10-day frame to make a 5-day prediction. I found the results with 10 days better than with 6 or 20 days (for most cases; see below for more about this). We also assume we have today’s data and try to forecast the next 5 days.

An easy way to accomplish the reshaping of the data is through a slight modification of our make_lags function together with NumPy’s reshape() method.

So, instead of a Series, we will take a DataFrame as input and will output a concatenation of the original frame with its respective lags. We use negative lags to prepare the target DataFrame. We will ignore observations with the produced NaN values and will use the align method to align their indexes. 

def make_lags(df, n_lags=1, lead_time=1):
    """
    Compute lags of a pandas.DataFrame from lead_time to lead_time + n_lags.
    Alternatively, a list can be passed as n_lags.
    Returns a pd.DataFrame resulting from the concatenation of df's shifts.
    """
    if isinstance(n_lags, int):
        lag_list = range(lead_time, n_lags + lead_time)
    else:
        lag_list = n_lags
    lags = list()
    for i in lag_list:
        df_lag = df.shift(i)
        if i != 0:
            df_lag.columns = [f'{col}_lag_{i}' for col in df.columns]
        lags.append(df_lag)
    return pd.concat(lags, axis=1)

# 10-day input window and 5-day target, matching the frame described above
X = make_lags(data, n_lags=10, lead_time=0).dropna()
y = make_lags(data[['btc']], n_lags=range(-5, 0)).dropna()
X, y = X.align(y, join='inner', axis=0)

Next, we train-test split the data with sklearn, taking 10% as test size. As usual for time series, we include shuffle=False as a parameter.

from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=.1, shuffle=False)

It is good practice to normalize the data before feeding it into a Neural Network. We do it now, before things get 3D.

from sklearn.preprocessing import MinMaxScaler

mms = MinMaxScaler().fit(X_train)
X_train, X_val = mms.transform(X_train), mms.transform(X_val)
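The reason the scaler is fitted on the training data only can be sketched in plain Python. The helpers fit_min_max and apply_min_max are hypothetical stand-ins for what a min-max scaler does to a single column, and the numbers are toy values.

```python
def fit_min_max(values):
    """Return the (min, max) pair learned from the training values."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Scale values with statistics learned elsewhere (here: the training set)."""
    return [(v - lo) / (hi - lo) for v in values]

train = [100.0, 150.0, 200.0]
val = [180.0, 250.0]            # contains a value above the training maximum

lo, hi = fit_min_max(train)     # statistics come from the training data only
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)

print(train_scaled)
# [0.0, 0.5, 1.0]
print(val_scaled)
# [0.8, 1.5]
```

Validation values can fall outside [0, 1]. That is expected: no information from the validation period leaks into the scaling.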

Finally, we use NumPy to reshape everything into 3D arrays. Observe that there is no such thing as a 3D pd.DataFrame.

import numpy as np

def add_dim(df, timesteps=5):
    """
    Transforms a pd.DataFrame into a 3D np.array with shape (n_samples, timesteps, n_features)
    """
    df = np.array(df)
    array_3d = df.reshape(df.shape[0], timesteps, df.shape[1] // timesteps)
    return array_3d

# one timestep per lag: number of lagged columns divided by the number of original features
timesteps = X_train.shape[1] // data.shape[1]
X_train, X_val = map(add_dim, [X_train, X_val], [timesteps] * 2)

Of course, you can always prepare a function to do everything in one shot:

def prepare_data(df, target_name, n_lags, n_steps, lead_time, test_size, normalize=True):
    '''
    Prepare data for LSTM.
    '''
    if isinstance(n_steps, int):
        n_steps = range(1, n_steps + 1)
    n_steps = [-x for x in list(n_steps)]

    X = make_lags(df, n_lags=n_lags, lead_time=lead_time).dropna()
    y = make_lags(df[[target_name]], n_lags=n_steps).dropna()
    X, y = X.align(y, join='inner', axis=0)

    from sklearn.model_selection import train_test_split
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=test_size, shuffle=False)

    if normalize:
        from sklearn.preprocessing import MinMaxScaler
        mms = MinMaxScaler().fit(X_train)
        X_train, X_val = mms.transform(X_train), mms.transform(X_val)

    if isinstance(n_lags, int):
        timesteps = n_lags
    else:
        timesteps = len(n_lags)

    return add_dim(X_train, timesteps), add_dim(X_val, timesteps), y_train, y_val

Note that one should give positive values to n_steps to have the right negative shifts. Fortunately, y_train, y_val are not reshaped, which makes life easier when comparing predictions with reality.

All set, let’s start with the most basic Vanilla model.

💡 Side note: We are keeping things simple here, but in a future post, we will prepare our own batches and explore the stateful parameter of an LSTM layer in more depth. More on its input and output can be found in Mohammad’s Git.

How to Implement Vanilla LSTM with Keras?

A model is called Vanilla when it has no additional structure apart from the output layer.

To implement it we add an LSTM and a Dense layer. We must pass the number of units of each and the input shape for the LSTM layer.

The input shape is exactly (n_timesteps, n_features), which can be inferred from X_train.shape. The number of units for the LSTM layer is a hyperparameter that should be tuned; for the Dense layer, it is the number of outputs we want, which is 5 here.

What follows is tuning-friendly code that specifies the main parameters in advance.

from keras.models import Sequential
from keras.layers import Dense, LSTM

# Data preparation
n_lags, n_steps, lead_time, test_size = 10, 5, 0, .2

# hyperparameters
epochs, batch_size, verbose = 50, 72, 0
model_params = {}

# preparing data
X_train, X_val, y_train, y_val = prepare_data(data, 'btc', n_lags, n_steps, lead_time, test_size)

# model architecture
vanilla = Sequential()
vanilla.add(LSTM(units=200, activation='relu', input_shape=(X_train.shape[1], X_train.shape[2])))
vanilla.add(Dense(n_steps))

The model_params dictionary will be useful for passing additional parameters to the fit method, such as an EarlyStopping callback.

We also write a function that fits the model, then plots and assesses the predictions. The present code does not return anything, so feel free to change it to do so. We fix the optimizer as Adam and the loss as Mean Squared Error.

def fit_model(model, learning_rate=0.001, time_distributed=False,
              epochs=epochs, batch_size=batch_size, verbose=verbose):
    y_ind = y_val.index
    if time_distributed:
        # TimeDistributed outputs are 3D, so the targets get a trailing axis
        y_train_0 = y_train.to_numpy().reshape((y_train.shape[0], y_train.shape[1], 1))
        y_val_0 = y_val.to_numpy().reshape((y_val.shape[0], y_val.shape[1], 1))
    else:
        y_train_0 = y_train
        y_val_0 = y_val

    # fit network
    from keras.optimizers import Adam
    adam = Adam(learning_rate=learning_rate)
    # pass the optimizer instance (not the string 'adam') so learning_rate takes effect
    model.compile(loss='mse', optimizer=adam)
    history = model.fit(X_train, y_train_0, epochs=epochs, batch_size=batch_size,
                        verbose=verbose, **model_params,
                        validation_data=(X_val, y_val_0), shuffle=False)

    # make a prediction
    if time_distributed:
        predictions = model.predict(X_val)[:, :, 0]
    else:
        predictions = model.predict(X_val)
    yhat = pd.DataFrame(predictions, index=y_ind,
                        columns=[f'pred_lag_{i}' for i in range(-n_steps, 0)])
    yhat_shifted = pd.concat([yhat.iloc[:, i].shift(-n_steps + i)
                              for i in range(len(yhat.columns))], axis=1)

    # calculate RMSE and R2 (sklearn metrics expect y_true first, then y_pred)
    from sklearn.metrics import mean_squared_error, r2_score
    rmse = np.sqrt(mean_squared_error(y_val, yhat))
    r2 = r2_score(y_val, yhat)

    import matplotlib.pyplot as plt
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 14))
    y_val.iloc[:, 0].plot(ax=ax2, legend=True)
    yhat_shifted.plot(ax=ax2)
    ax2.set_title('Prediction comparison')
    ax2.annotate(f'RMSE: {rmse:.5f} \n R2 score: {r2:.5f}',
                 xy=(.68, .93), xycoords='axes fraction')
    ax1.plot(history.history['loss'], label='train')
    ax1.plot(history.history['val_loss'], label='test')
    ax1.legend()
    plt.show()

The time_distributed parameter will be used in the last two architectures.

I opted to set the learning_rate manually because, at one point, the Stacked LSTM’s output was an array of NaNs. After figuring out that gradient descent was not converging, I fixed that by decreasing Adam’s learning rate.

Use verbose=1 as a global parameter to debug your network.

Without further ado:

fit_model(vanilla)

The performance is comparable to our XGBoost 1-day prediction in the last article:

Moreover, we are predicting 5 days, not only one, making the r2 score more impressive.
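For readers who want to see what the two reported metrics measure, here is a minimal pure-Python sketch of RMSE and the R2 score on toy numbers. The helper names and data are made up for this illustration and handle a single output column only. Note that R2, unlike RMSE, depends on which argument is treated as the ground truth.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]       # toy ground truth
predicted = [2.0, 2.0, 3.0, 3.0]    # toy predictions

print(rmse(actual, predicted))      # symmetric in its arguments
print(r2(actual, predicted))        # ground truth first
print(r2(predicted, actual))        # swapped arguments give a different value
```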

What bothers me, on the other hand, is the fact the predictions for all five days look identical. It requires further analysis to understand why that is happening, which we will not do here.

How to Build a Stacked LSTM?

We can also stack two LSTM layers.

To do so, we need to be careful to give a 3D input to the second LSTM layer, and that is the role the parameter return_sequences plays. We gain a slight increase in the training score in this case.

# model architecture
stacked = Sequential()
stacked.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))
stacked.add(LSTM(100, activation='relu'))
stacked.add(Dense(n_steps))

fit_model(stacked)

What is a Bidirectional LSTM Layer?

In general, any RNN layer that meets minimal requirements can be made bidirectional through Keras’ Bidirectional layer. It combines two copies of your RNN layer, one of which processes the sequence backward.

Image from AIM.

You can either specify the backward_layer as a second RNN layer or just wrap a single one, which will make the Bidirectional instance use a copy as the backward model. An implementation can be found below.

The score is comparable to the Stacked LSTM.

from keras.layers import Bidirectional

bilstm = Sequential()
bilstm.add(Bidirectional(LSTM(100, activation='relu'), input_shape=(X_train.shape[1], X_train.shape[2])))
bilstm.add(Dense(n_steps))

fit_model(bilstm)

Encoder-Decoder LSTM

An Encoder-Decoder structure is designed so that one network is dedicated to feature selection and a second one to the actual forecast. The two networks can be of different types; even recurrent/non-recurrent pairs are allowed.

Here we explore two pairs: LSTM-LSTM and CNN-LSTM. 

Compared to the previously presented architectures, the main difference is the inclusion of the RepeatVector layer and the TimeDistributed wrapper.

Although the RepeatVector is smoothly included, the TimeDistributed layer needs some care. It wraps a layer object and applies a copy of that layer to each temporal slice of its input. It considers the .shape[1] of the first input as the temporal dimension (our prepare_data is in accordance with that).

Moreover, one has to watch out since it outputs a 3D array; in particular, our model will output 3D predictions.

For this reason, we have to feed the model with reshaped y_val, y_train so that the loss functions can be computed. Fortunately, we already included the time_distributed parameter in the fit_model to deal with the reshaping.
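The target reshaping described above amounts to adding a trailing axis, as in this small NumPy sketch. The array sizes are toy values chosen for illustration.

```python
import numpy as np

n_samples, n_steps = 3, 5

# 2D targets: one row per sample, one column per forecast step.
y_2d = np.arange(n_samples * n_steps, dtype=float).reshape(n_samples, n_steps)

# Add a trailing axis so each forecast step becomes its own temporal slice.
y_3d = y_2d.reshape(n_samples, n_steps, 1)

# Dropping the trailing axis again recovers the original 2D layout.
restored = y_3d[:, :, 0]

print(y_2d.shape, y_3d.shape, restored.shape)
# (3, 5) (3, 5, 1) (3, 5)
```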

We also increase the number of epochs since these networks seem to take longer to find a minimum. We include an EarlyStopping callback, though. It already gives an astonishing score!

from keras.callbacks import EarlyStopping
from keras.layers import RepeatVector, TimeDistributed

# Data preparation
n_lags, n_steps, lead_time, test_size = 10, 5, 0, .2

# hyperparameters
epochs, batch_size, verbose = 300, 32, 0
model_params = {'callbacks': [EarlyStopping(monitor="val_loss", patience=20, mode="auto")]}

# preparing data
X_train, X_val, y_train, y_val = prepare_data(data, 'btc', n_lags, n_steps, lead_time, test_size, normalize=True)

# Encoder
lstmlstm = Sequential()
lstmlstm.add(LSTM(100, activation='relu', input_shape=(X_train.shape[1], X_train.shape[2])))
lstmlstm.add(RepeatVector(n_steps))

# Decoder
lstmlstm.add(LSTM(100, activation='relu', return_sequences=True))
lstmlstm.add(TimeDistributed(Dense(n_steps)))

# pass epochs and batch_size explicitly: fit_model's defaults were bound when it was defined
fit_model(lstmlstm, time_distributed=True, epochs=epochs, batch_size=batch_size)

This is the first time the outputs for the individual steps are visibly different from each other.

Nevertheless, it seems to be following some trend. In theory, the NN should be powerful enough to capture trends as well. In practice, however, detrending often gives better results. Still, 0.82 is a massive increase from our 0.32 XGBoost score.

Encoder-Decoder CNN-LSTM Network

The last architecture we present is the CNN-LSTM one.

Here a Convolutional Neural Network is used as a feature selector, a role in which it is well known to perform well for photos and videos.

The main reason they are so useful in this case is mathematical: the convolutional part of CNN’s name refers to the convolution operation in mathematics, which is used to emphasize translation-invariant features.

That makes complete sense when you have a photo, since you want your mobile phone to recognize Toto as a dog regardless of whether it is in the lower-left corner or in the upper-center of the picture (of course your dog’s name is Toto, right?). You may recognize the CNN action as the smoothed lines in the graph.
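The translation property can be seen in a tiny pure-Python sketch of a 1D sliding-window filter, which is the operation a Conv1D layer applies (without bias or activation). The kernel and the two toy signals are made up for this illustration.

```python
def conv1d(signal, kernel):
    """'Valid' 1D sliding-window product, as used in convolutional layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

kernel = [1, 0, -1]                      # hypothetical filter, chosen for illustration

bump_early = [0, 1, 2, 1, 0, 0, 0, 0]    # a pattern near the start
bump_late = [0, 0, 0, 1, 2, 1, 0, 0]     # the same pattern, shifted right by 2

out_early = conv1d(bump_early, kernel)
out_late = conv1d(bump_late, kernel)

print(out_early)
# [-2, 0, 2, 1, 0, 0]
print(out_late)
# [0, -1, -2, 0, 2, 1]

# The response to the pattern moves with the pattern; its peak value is unchanged.
print(max(out_early) == max(out_late))
# True
```

Shifting the input shifts the filter response by the same amount, and a pooling step such as taking the maximum then yields the same value wherever the pattern occurs.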

from keras.callbacks import EarlyStopping
from keras.layers import RepeatVector, TimeDistributed, Conv1D, MaxPooling1D, Flatten

# Data preparation
n_lags, n_steps, lead_time, test_size = 10, 5, 0, .2

# hyperparameters
epochs, batch_size, verbose = 300, 32, 0
model_params = {'callbacks': [EarlyStopping(monitor="val_loss", patience=20, mode="auto")]}

# preparing data
X_train, X_val, y_train, y_val = prepare_data(data, 'btc', n_lags, n_steps, lead_time, test_size)

# Encoder
cnn_lstm = Sequential()
cnn_lstm.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(X_train.shape[1], X_train.shape[2])))
cnn_lstm.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
cnn_lstm.add(MaxPooling1D(pool_size=2))
cnn_lstm.add(Flatten())
cnn_lstm.add(RepeatVector(n_steps))

# Decoder
cnn_lstm.add(LSTM(200, activation='relu', return_sequences=True))
cnn_lstm.add(TimeDistributed(Dense(100, activation='relu')))
cnn_lstm.add(TimeDistributed(Dense(n_steps)))

# pass epochs and batch_size explicitly: fit_model's defaults were bound when it was defined
fit_model(cnn_lstm, time_distributed=True, epochs=epochs, batch_size=batch_size)

Extra Perks

For the sake of completeness, we tweaked the code a bit.

Do you remember the seemingly significant correlation that popped up at the 20-day lag? Well, increasing from 10 to 20 timesteps actually increases the R2 score in the last model:

Funnily enough, it increases even more if you use unnormalized data, giving a stellar ~.94 score!

The last thing worth mentioning is the choice of the activation function. If you got the warning below and wonder why, Keras’ LSTM documentation provides an answer.

🛑 WARNING: tensorflow:Layer lstm_70 will not use cuDNN kernels since it doesn’t meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.

(No, I did not load 70 LSTM layers. I loaded around 210 😵‍💫)

The documentation says:

“The requirements to use the cuDNN implementation are:

  1. activation == tanh
  2. recurrent_activation == sigmoid
  3. recurrent_dropout == 0
  4. unroll is False
  5. use_bias is True
  6. Inputs, if use masking, are strictly right-padded.
  7. Eager execution is enabled in the outermost context.”

Changing the activation to 'tanh' is enough in our case to use cuDNN, which is much faster. However, tanh fits our problem poorly:

fit_model(cnn_lstm, time_distributed=True, learning_rate=1)

(You saw it right, the learning rate is 1000x larger than the default. Otherwise the loss curve does not even change.)

Main Takeaways

There are a few points we have to keep in mind about LSTM:

  • The shape of their input 
  • What time steps are
  • The shape of the layer’s output, especially when using return_sequences
  • Hyperparameter tuning is worth your time. For instance, the activation functions relu and tanh have their own pros and cons.
  • There are different architectures to play with (and many more to come: we will deal with Attention blocks and Multi-headed networks soon). Consider using them. I’ve become especially fond of the Encoder-Decoders.

Feel free to use and edit the code here. 



Solidity Deep Dive — Syllabus + Video Tutorial Resources


Do you want to learn Solidity and create your own dApps and smart contracts? This free online course gives you a comprehensive overview that is aimed to be more accessible than the Solidity documentation but still complete and descriptive.

▶ Multimodal Learning: Each tutorial comes with a tutorial video that helps you grasp the concepts in a more interactive manner.

Are you ready to build your skills as a highly sought-after Blockchain Developer or Solidity Engineer? Let’s dive right in! 👇

Basic Introduction and Overview

  1. Introduction to Smart Contracts and Solidity
  2. How to Create Your Own Token in Solidity – Easy Example
  3. Blockchain Basics of Smart Contracts and Solidity
  4. Mastering the Ethereum Virtual Machine (EVM)
  5. Ethereum Virtual Machine (EVM) Message Calls

Installation and Technical Requirements

  1. [Overview] Installing Solidity Compiler
  2. How to Install the Solidity Compiler with npm?
  3. How to Install the Solidity Compiler via Docker on Ubuntu?
  4. How to Install the Solidity Compiler via Source Code Compilation?
  5. How to Install the Solidity Compiler via Static Binary and Linux Packages

Guided Example Smart Contracts

  1. [Overview] Top 5 Solidity Smart Contract Examples for Learning
  2. Example 1: How Does the Solidity Voting Smart Contract Work?
  3. Example 2: Simple Open Auction (Explained)
  4. Example 3: Mastering the Solidity Blind Auction Contract
  5. Example 4: Safe Remote Purchase
  6. Example 5: Understanding Modular Contracts

Solidity Layout

  1. Solidity File Layout – SPDX License ID and Version Pragmas
  2. Solidity Layout – Pragmas, Importing, and Comments

Solidity Language Elements

  1. [Overview] Seven Simple Solidity Blocks to Build Your dApp (State Variables, Functions, Modifiers, Events, Errors, Structs, Enums)
  2. Boolean and Integer Types
  3. Fixed Point Numbers and Address Types
  4. Contract Types, Byte Arrays, and {Address, Int, Rational} Literals
  5. String Types, Unicode/Hex Literals, and Enums
  6. User-Defined Value Types
  7. Function Types

Python Print Dictionary Values Without “dict_values”


Problem Formulation and Solution Overview

If you print all values from a dictionary in Python using print(dict.values()), Python prints a dict_values object, a view of the dictionary values. The representation shows the values enclosed in a weird dict_values(...), for example: dict_values([1, 2, 3]).

Here’s an example:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(my_dict.values())
# dict_values(['Carl', 42, 100000])
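To see what "a view of the dictionary values" means in practice, here is a small sketch. The variable names values_view and snapshot are chosen only for this example.

```python
my_dict = {'name': 'Carl', 'age': 42}
values_view = my_dict.values()

print(type(values_view).__name__)
# dict_values

# The view is live: it reflects later changes to the dictionary.
my_dict['income'] = 100000
print(list(values_view))
# ['Carl', 42, 100000]

# A list made from the view is an independent snapshot.
snapshot = list(values_view)
my_dict['age'] = 43
print(snapshot)
# ['Carl', 42, 100000]
print(list(values_view))
# ['Carl', 43, 100000]
```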

There are multiple ways to change the string representation of the values, so that the print() output doesn’t yield the strange dict_values view object.

Method 1: Convert to List

An easy way to obtain a pretty output when printing the dictionary values without the dict_values(...) representation is to convert the dict_values object to a list using the list() built-in function. For instance, print(list(my_dict.values())) prints the dictionary values as a simple list.

Here’s an example:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(list(my_dict.values()))
# ['Carl', 42, 100000]

So far, so simple. Read on to learn or recap some important Python features and improve your skills. There are many paths to Rome! 🏛

Method 2: Unpacking

An easy and Pythonic way to print a dictionary without the dict_values prefix is to unpack all values into the print() function using the asterisk operator. This works because the print() function allows an arbitrary number of values as input. It prints those values separated by a single whitespace character per default.

Here’s an example:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(*my_dict.values())
# Carl 42 100000

It cannot get any more concise, frankly. 🙂

Of course, you can change the separator and end arguments accordingly to obtain more control of the output:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(*my_dict.values(), sep='\n', end='\nThe End')

Output:

Carl
42
100000
The End

Do you need even greater flexibility than this? No problem! See here: ⤵

Method 3: String Join Function and Generator Expression

To convert the dictionary values to a single string object without 'dict_values' in it and with maximal control, you can use the str.join() method in combination with a generator expression and the built-in str() function.

Here’s an example:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(', '.join(str(x) for x in my_dict.values()))
# Carl, 42, 100000

💡 Note: You can replace the comma ',' with your desired separator and change the representation of each individual element by replacing the expression str(x) in the generator expression with something arbitrarily complicated.

See here for something crazy that wouldn’t make any sense:

my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}
print(' | '.join('x' + str(x) + 'x' for x in my_dict.values()))
# xCarlx | x42x | x100000x

Note that you could also use the repr() function instead of the str() function in this example. The only difference would be that string values such as 'Carl' are printed with their surrounding quotes.

Finally, I’d recommend you check out this tutorial to learn more about how generator expressions work. Many Python beginners struggle with this concept even though it’s ubiquitous in expert coders’ code bases. 👇
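A quick side-by-side comparison of the two functions on the same dictionary (the variable names with_str and with_repr are chosen only for this sketch):

```python
my_dict = {'name': 'Carl', 'age': 42, 'income': 100000}

with_str = ', '.join(str(x) for x in my_dict.values())
with_repr = ', '.join(repr(x) for x in my_dict.values())

print(with_str)
# Carl, 42, 100000
print(with_repr)
# 'Carl', 42, 100000
```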

👉 Recommended Tutorial: Understanding One-Line Generators in Python


Python Print Dictionary Without One Key or Multiple Keys


The most Pythonic way to print a dictionary except for one or multiple keys is to filter it using dictionary comprehension and pass the filtered dictionary into the print() function.

There are multiple ways to accomplish this and I’ll show you the best ones in this tutorial. Let’s get started! 🚀

👉 Recommended Tutorial: How to Filter a Dictionary in Python

Method 1: Dictionary Comprehension

Say you have one or more keys stored in a variable ignore_keys, which may be a list or, for efficiency reasons, a set.

Create a filtered dictionary without one or multiple keys using the dictionary comprehension {k:v for k,v in my_dict.items() if k not in ignore_keys} that iterates over the original dictionary’s key-value pairs and confirms for each key that it doesn’t belong to the ones that should be ignored.

Here’s a minimal example:

ignore_keys = {'x', 'y'}
my_dict = {'x': 1, 'y': 2, 'z': 3}

filtered_dict = {k:v for k,v in my_dict.items() if k not in ignore_keys}
print(filtered_dict)
# {'z': 3}

The dict.items() method creates an iterable of key-value pairs over which we can iterate.

The membership operator k not in ignore_keys tests if a given key doesn’t belong to the set.

💡 The runtime complexity of the membership check is constant, O(1), if you use a set for the ignore_keys data structure. It would be linear, O(n), in the number of elements if you used a list, which is not a good idea for that reason.

Note that you can also use this approach to print a dictionary except for a single key by putting only one key into the ignore set.
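For the single-key case, a minimal sketch could look like this, reusing the names from the example above:

```python
ignore_keys = {'y'}   # a set containing the single key to skip
my_dict = {'x': 1, 'y': 2, 'z': 3}

filtered_dict = {k: v for k, v in my_dict.items() if k not in ignore_keys}
print(filtered_dict)
# {'x': 1, 'z': 3}
```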

👉 Recommended Tutorial: Dictionary Comprehension in Python

Method 2: Simple For Loop with If Condition

A not-so-Pythonic but reasonably readable way to print a dict without one or multiple keys is to use a simple for loop with if condition to avoid all keys in the ignore list.

Here’s an example using three lines and directly printing the key-value pairs:

ignore_keys = {'x', 'y'}
my_dict = {'x': 1, 'y': 2, 'z': 3}

for k, v in my_dict.items():
    if k not in ignore_keys:
        print(k, v)

The output:

z 3

Of course, you can modify the output to your own needs. See the customizations of the built-in print() function and its awesome arguments:

👉 Recommended Tutorial: Python print() and Separator and End Arguments


My Recommendation – Use This Method!

I could have listed many more ways to solve this problem of printing a dict except one or more keys.

I have seen super inefficient ways proposed on forums that use exclude_keys that are list types.

I have also seen elaborate schemes to use set difference operations or more.

But I don’t recommend anything else than dict comprehension if you want to create a filtered dictionary object first and the simple for loop if you want to print on the fly.

That’s it. 👌



State Variables in Solidity


In this article, I’ll be going over the different types of state variables in Solidity and how to use them. State variables are one of the most important parts of any smart contract, as they allow us to store data that can change over time.

This article is mainly focused on value types of state variables, but I’ll be continuing with another two articles on reference and complex types as well as data location. Let’s dive in!

Basics – A Quick Review

Smart contracts are pieces of code that are deployed in blockchain nodes. They are immutable, meaning they cannot be changed once they have been deployed. This can make it necessary to redeploy the code as a new smart contract or redirect calls from an old contract to new ones.

A smart contract is initiated by a message embedded in a transaction. Ethereum enables these transactions, which may carry out more sophisticated operations like conditional transfers.

A conditional transfer, such as one that depends on the age of the buyer or the value of their bid, could be required.

💡 Example: If the buyer is over 21 and their bid is greater than the minimum bid, then accept the bid. Otherwise reject it.

Smart contracts are executed when predetermined conditions are met to automate the execution of an agreement so that all parties can be immediately certain of the outcome without the need for an intermediary.

How Do You Write a Smart Contract?

Smart contracts are similar to a class definition in an object-oriented programming language.

A smart contract consists of:

  • data (its state);
  • a collection of code (its functions or methods, with modifiers such as public or private, including getter and setter functions).

What is the structure of a smart contract?

As we have seen in other articles in Finxter, the structure of a smart contract is as follows:

  • a pragma directive;
  • the name of the contract;
  • the data, or state variables, that define the state of the contract;
  • a collection of functions that carry out the intent of the smart contract.

Note that the identifiers representing these elements are restricted to the ASCII character set. Make sure you select meaningful identifiers and follow camel case convention in naming them.

Variable Declaration

To declare a variable in Solidity, you must first specify its data type. This is followed by an access modifier and the variable name.

Structure

<type> <access modifier> <variable name> ; 


What Categories of Variables Exist in Solidity?

Solidity supports three categories of variables:

(1) State Variables

State variables are variables whose values are permanently stored in a contract storage.

What does this mean?

State variables are an essential part of any contract. Contract storage can be thought of as a database in which each state variable occupies a slot that you can query and alter by calling functions of the code that manages the database. Set and get functions can be used to modify and retrieve the values of the variables.

More precisely, the state variables are stored contiguously, item after item, starting with the first state variable in slot 0. For each variable, the size in bytes is determined by its type. Several contiguous items that need less than 32 bytes are packed into a single storage slot if possible.
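The packing rule can be sketched in a few lines of Python. This is only a simplified model and not the compiler's actual algorithm: it assumes every variable is a plain value type of a known byte size and ignores structs, arrays, and mappings. The function name assign_slots and the sample variables are made up for illustration:

```python
SLOT_SIZE = 32  # bytes per storage slot


def assign_slots(variables):
    """Assign a (slot, offset) pair to each (name, size_in_bytes) item,
    in declaration order, packing items into a slot while they fit."""
    layout = {}
    slot, offset = 0, 0
    for name, size in variables:
        if offset + size > SLOT_SIZE:  # does not fit: start a new slot
            slot, offset = slot + 1, 0
        layout[name] = (slot, offset)
        offset += size
    return layout


# Hypothetical declarations: uint8 a; uint16 b; uint256 c; bool d;
layout = assign_slots([('a', 1), ('b', 2), ('c', 32), ('d', 1)])
print(layout)
# {'a': (0, 0), 'b': (0, 1), 'c': (1, 0), 'd': (2, 0)}
```

In this model, a and b share slot 0, while c needs a full slot of its own and pushes d into slot 2.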

To make it easier, if you use other languages and want to store user information for a long time, you would connect your application to a database server and then store the information in the database. In Solidity, however, you do not need to connect, you can simply store the data permanently using state variables.

(2) Local Variables

Local variables are variables whose values exist only while the function is executing; their scope is the function body, and they cannot be accessed outside of it.

Typically, these variables are used to hold temporary values for processing or computing something. In the following example, “temp” is a local variable that cannot be used outside the “set” function.

(3) Global Variables

Global variables are special variables that exist in the global namespace and provide information about the blockchain and the current transaction, such as msg.sender and block.timestamp.

Each function has its own scope, but state variables should always be defined outside of any function’s scope, like the attributes of a class.

State variables are permanently stored in the Ethereum blockchain, more precisely in the storage Merkle-Patricia trie, which is part of the information that forms the state of an account (that’s why we call them state variables).

What Types of Valid State Variables Exist?

💡 Info: Solidity is a statically typed language, meaning each variable’s type must be specified at the time of its declaration. 

“Undefined” or “null” values do not exist in Solidity; newly declared variables always have a default value that depends on their type, typically called the “zero state”.

For example, the default value for bool is false.

As in other languages (not Python 😀 ), there are two kinds of types in Solidity: value types and reference types.

  • A variable that stores its own data directly is of a value type.
  • A variable that stores the location of its data is of a reference type. Reference types are discussed in a separate article.

For example, consider the integer variable int i = 100;

The system stores 100 directly in the memory location allocated for the variable i, for example at a hypothetical address such as 0x239110.

What are the Modifiers for the State Variables?

Visibility – access modifiers

Access modifiers are the keywords used to specify the declared accessibility of state variables and functions.

State variables in Solidity have three kinds of visibility: public, private, and internal. If visibility is not explicitly declared, the compiler treats the variable as internal.

For public variables, the compiler automatically creates a getter function to retrieve them through a call. This does not apply to private or internal variables.

Example:

uint256 public a;

is conceptually the same thing as:

uint256 private a;
function a() public view returns (uint256) {
    return a;
}

When you create a public variable, it is stored the same way as a private variable, but the compiler automatically creates a getter function for it.

💡 The difference between private and internal variables is that internal variables can be accessed by child contracts, while private variables cannot.

The following example shows a child contract using an inherited internal variable:

contract Addition {
    uint x;        // internal variable (default visibility)
    uint public y;
}

contract Child is Addition {
    // no need to define x since the child contract inherits the variable
    // uint x;
    function setX(uint _x) public {
        x = _x;
    }

    function getX() public view returns (uint) {
        return x;
    }
}

Note that the data location (memory, storage, or calldata) must be specified for variables of reference type, for example when they are used as function arguments. We will cover this in an article on data location.

Other keywords

The following keywords can be used for state variables to restrict changes to their state.

Constant (replaced by “view” and “pure” in functions)

The constant keyword disallows assignment except at initialization: a constant variable must be initialized at the time of its declaration and cannot be changed afterwards.

Example:

uint private constant t = 40;

The variable t has been declared constant and therefore cannot be changed.

It is interesting to note that the declaration of a constant variable without initialization is forbidden and the compiler displays an error, e.g.:

contract Addition {
    uint private x;
    uint public y;
    uint private constant z; // gives an error because constant variables must be initialized when declared
}

Immutable 

These variables can be declared without being initialized, but they can be assigned only once, either at declaration or in the constructor. After that, the variable is constant.

uint private immutable w;

// now we declare a constructor for the contract, using the keyword constructor
constructor() {
    w = 20; // initialize the variable
}

Override 

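This keyword states that a public state variable overrides an external function of the same name declared in a base contract or interface: the variable’s automatically generated getter takes the place of that function.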

Value Types

These variables are passed by value. That is, they are copied when they are used either in an assignment or in a function argument.

👉 If this sentence is not clear, you can check here.

Here we will see the basic value types.

Value types are booleans, integers, addresses, enums, and bytes.

Booleans

Boolean values can be true or false.

An example of a boolean type:

contract ExampleBool {
    // example of a bool value type in solidity
    bool public IsVerified = false;
    bool public IsSent = true;
}

Integers

There are signed (int) and unsigned (uint) integer types of various sizes: int8, int16, … up to int256, and likewise uint8, uint16, … up to uint256, in steps of 8 bits. The type int256 is the same as int.

💡 Note: uint256 is the same as uint.

The type uint stands for non-negative integers. The type int stands for both positive and negative integers.

👉 Recommended Tutorial: Solidity Data Types – Integer and Boolean

The type uint8 has 8 bits, which corresponds to 1 byte. This means that it accepts numbers between 0 and 255: a bit is a binary digit, so one byte can hold 2^8 = 256 different numbers, from 0 to 2^8-1 = 255. This is the same reason why a three-digit decimal number can represent the values 0 to 999.

The type uint256 accepts numbers between 0 and 2^256-1.

If we try to assign the value 256 to a variable of type uint8, the compiler will print an error.
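As a quick sanity check of these ranges, here is a small Python sketch. The helper name int_range is made up for illustration; it only does the arithmetic and does not model the Solidity compiler:

```python
def int_range(bits, signed=False):
    """Return the (min, max) values representable with the given bit width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


print(int_range(8))               # (0, 255)    like uint8
print(int_range(8, signed=True))  # (-128, 127) like int8

# 256 lies outside the uint8 range, which is why the assignment is rejected
low, high = int_range(8)
print(low <= 256 <= high)         # False
```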

The best practice for integers is to specify the bit width at the declaration stage so that variables use as little space as possible, which can reduce the cost of storage when several small variables are packed into one slot. So use uint8 or uint16 where they suffice instead of always using uint (uint256).

contract SimpleContract {
    uint32 public uidata = 1234567; // unsigned integer
    int32 public idata = -1234567;  // signed integer
}

Fixed Point Numbers   

Solidity has no floating-point type; its fixed-point types are meant to represent fractional numbers. However, the official documentation states that “Fixed point numbers are not yet fully supported by Solidity”. They can be declared, but cannot be assigned to or from.

However, you can use fractional number literals in calculations, as long as the value resulting from the calculation is an integer.

Here is an example,

contract additionContract {
    uint8 result;

    function Addition(uint) public {
        result = 2/3;       // error: 2/3 is not an integer
        result = 3.5 + 1.5; // OK: the final result, 5, is an integer
    }
}


Address

The address data type is very specific to Solidity.

On the Ethereum blockchain, every account and smart contract has an address that is used to send and receive Ether from one account to another.

This is your public identity on the blockchain.

Also, when you deploy a smart contract on the blockchain, that contract is assigned an address that you can use to identify and call the smart contract.

There are two variants of the address type, which are largely identical:

  • address – stores a 20-byte value (the size of an Ethereum address or account). The default value for an address is 0x followed by 40 zeros, i.e. 20 zero bytes.
  • address payable – like address, but with the additional members transfer and send.

The idea behind this distinction is that the address payable is an address you can send Ether to, while you should not send Ether to a plain address, as it could be a smart contract that was not built to accept Ether.

contract ExampleAddress {
    address public myAddress = 0xc895t6ea1bc39595cf849612ffta7427f5792987;
}

Enums 

An enum, short for enumeration, is a user-defined data type that restricts a variable to exactly one of a set of predefined values.

The values in the enumerated list are called enums, and internally these enums are treated like numbers. This makes the contract more readable and maintainable.

contract SampleEnum {
    // Creating an enumerator
    enum animal_classes { Mammals, Fish, Amphibians, Reptiles, Birds }

    function getFirstEnum() public pure returns (animal_classes) {
        return animal_classes.Mammals;
    }
    // result:
    // 0: uint8: 0
}

With enums, we can also set a default value inside the contract:

    animal_classes constant defaultValue = animal_classes.Reptiles;

    function getDefaultValue() public pure returns (animal_classes) {
        return defaultValue;
    }
}
// result:
// 0: uint8: 3
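If the numbering of enum members feels abstract, the same idea can be mirrored in Python with IntEnum. This is only an analogy and not Solidity code; the class name AnimalClasses is made up, and the explicit values follow the zero-based order in which the members are listed:

```python
from enum import IntEnum


class AnimalClasses(IntEnum):
    # Members are numbered in declaration order, starting at 0
    MAMMALS = 0
    FISH = 1
    AMPHIBIANS = 2
    REPTILES = 3
    BIRDS = 4


print(int(AnimalClasses.MAMMALS))   # 0
print(int(AnimalClasses.REPTILES))  # 3
print(AnimalClasses(3).name)        # REPTILES
```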

Bytes and Strings

A byte is a unit of 8 bits. Everything in memory is stored in bits with the binary values 0 and 1.

Solidity supports string literals that use either double quotes (") or single quotes ('). It provides string as a data type to declare a variable of type string.

Strings are unique in Solidity compared to Python or other programming languages in that there are no functions for manipulating strings, except that you can concatenate strings. The reason for this is that storing strings in a blockchain is very expensive.

Bytes and strings are easy to handle in Solidity because Solidity treats them similarly to an array. The two are very similar. (See Arrays in the Reference Type article).

Conclusion

Smart contracts reside at a specific address in the Ethereum blockchain. In this article, we learned about state variables in Solidity.

We looked at state, local, and global variables, and at the different value types.

We tried to understand booleans, integers, addresses, enums, bytes, and strings (although the last two are treated in more depth in the article on reference types).




How to Print a List Without Commas in Python


Problem Formulation

Given a Python list of elements.

If you print the list to the shell using print([1, 2, 3]), the output is enclosed in square brackets and separated by commas like so:

[1, 2, 3]

But you want the list without commas like so:

[1 2 3]

print([1, 2, 3])
# Output: [1, 2, 3]
# Desired: [1 2 3]

How to print the list without separating commas in Python?

Note that this is slightly different from the following two problem variants. Feel free to follow the links to learn more about them:

🌍 Recommended Tutorial: How to Print a List Without Brackets and Commas in Python?

🌍 Recommended Tutorial: How to Print a List Without Brackets in Python?

Method 1: Unpacking Multiple Values into Print Function

The asterisk operator * is used to unpack an iterable into the argument list of a given function.

You can unpack all list elements into the print() function to print all values individually, separated by an empty space by default (that you can override using the sep argument). For example, the expression print('[', *lst, ']') prints the elements in lst, separated by empty spaces, with the enclosing square brackets and without the separating commas!

Here’s an example:

lst = [1, 2, 3]
print('[', *lst, ']')
# [ 1 2 3 ]
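Note the extra spaces right inside the brackets: print() puts the separator between all of its arguments, including the bracket strings. If you want exactly [1 2 3], one option is to print the brackets separately and suppress the newline with the end argument. The helper name print_bracketed is made up for this sketch:

```python
lst = [1, 2, 3]


def print_bracketed(values):
    """Print values space-separated inside brackets, without inner padding."""
    print('[', end='')      # opening bracket, no newline
    print(*values, end='')  # unpacked values, separated by a single space
    print(']')              # closing bracket plus the final newline


print_bracketed(lst)
# [1 2 3]
```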


Method 2: String Replace Method

A simple way to print a list without commas is to first convert the list to a string using the built-in str() function. Then modify the resulting string representation of the list by using the string.replace() method until you get the desired result.

Here’s an example:

my_list = [1, 2, 3]

# Convert List to String
s = str(my_list)
print(s)
# [1, 2, 3]

# Replace Separating Commas
s = s.replace(',', '')

# Print List Without Commas
print(s)
# [1 2 3]

The last line of the code snippet shows that the commas are removed from the output.
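One caveat worth keeping in mind: replace() removes every comma in the string representation, not only the separators. The following sketch uses a made-up list of strings to show the difference to a join-based approach:

```python
words = ['a,b', 'c']

# The comma inside the first element is removed as well
s = str(words).replace(',', '')
print(s)
# ['ab' 'c']

# Joining the elements leaves their content untouched
joined = '[' + ' '.join(str(x) for x in words) + ']'
print(joined)
# [a,b c]
```

For lists of numbers this doesn’t matter, so the replace approach is fine there.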

Method 3: String Join With Generator Expression

You can print a list without commas using the string.join() method on any separator string such as ' ' or '\t'. Pass a generator expression to convert each list element to a string using the str() built-in function.

Specifically, the expression print('[', ' '.join(str(x) for x in my_list), ']') prints my_list to the shell without separating commas.

my_list = [1, 2, 3]
print('[', ' '.join(str(x) for x in my_list), ']')
# Output: [ 1 2 3 ]

  • The string.join(iterable) method concatenates the elements in the given iterable.
  • The str(object) built-in function converts a given object to its string representation.
  • Generator expressions or list comprehensions are concise one-liner ways to create a new iterable by reusing elements from another iterable.


💡 Note: Combining the join() method with a generator expression and string concatenation is the recommended approach if you want to convert a list to a string without commas instead of printing it.

Here’s an example:

my_list = [1, 2, 3]
s = '[' + ' '.join(str(x) for x in my_list) + ']'
print(s)
# Output: [1 2 3]

Method 4: Print NumPy Array

Sometimes it is sufficient to use the NumPy default output that is without separating commas. For example, if you print a list it yields [1, 2, 3]. And if you print an array it yields [1 2 3]. You can easily convert a list to a NumPy array using the np.array(lst) constructor.

import numpy as np

my_list = [1, 2, 3]
print(np.array(my_list))
# Output: [1 2 3]

👉 Recommended Tutorial: How to Install NumPy?


Python Print List Without Truncating


How to Print a List Without Truncating?

By default, Python doesn’t truncate lists when printing them to the shell, even if they are large. For example, you can call print(my_list) and see the full list even if the list has one thousand elements or more!

Here’s an example:
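A minimal sketch to check this yourself. Instead of flooding the shell with a thousand numbers, it inspects the text that print(my_list) would write, which is the same string that str(my_list) returns:

```python
my_list = list(range(1000))

# print(my_list) writes str(my_list), so we can inspect that string
text = str(my_list)

print(text.startswith('[0, 1, 2'))  # True
print(text.endswith('998, 999]'))   # True
print('...' in text)                # False: nothing was replaced by an ellipsis
```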

However, some programming environments such as IDLE may squeeze long output into a button, so you have to press that button before you can see the output. The reason is that showing the whole output could be time-consuming and visually cluttering.


How to Print a NumPy Array Without Truncating?

In many cases, large NumPy arrays are also not truncated when printed in the default Python programming environment IDLE.

However, in the interactive mode of the Python shell, a NumPy array may be truncated, unlike a Python list:

>>> import numpy as np
>>> np.arange(10000)
array([   0,    1,    2, ..., 9997, 9998, 9999])

To print the NumPy array without truncating, simply raise the print threshold with set_printoptions():

>>> import sys
>>> import numpy as np
>>> np.set_printoptions(threshold=sys.maxsize)
>>> np.arange(10000)

The output shows the full array without converting it to a list first. It is not reproduced here to save some space. 🙂

Of course, you could also convert the NumPy array to a Python list first.

👉 Recommended Tutorial: How to Print a NumPy Array Without Truncating It?



About Me – When The Going Gets Tough, Keep Going


Welcome to the Finxter blog! My name is Chris, and I started this coding venture a couple of years ago.


Over the years, I have chatted with tens of thousands of Finxters who shared their stories and struggles with me.

👉 See here and here to read a lot of feedback from the community.

Today, allow me to share my story about why I started teaching freelancing.

It may inspire you to take control of your life if you’re in a tough spot right now – for example, struggling with the economic, military, and energy crises that are happening right now.

If you’re not interested in my personal story, now would be the time to stop reading. I won’t blame you!

~~~

Once upon a time, when I was a timid and naive 20-year-old dreamer, my 18-year-old girlfriend unexpectedly got pregnant. 🤰

She was still in high school, and I had just started studying computer science.

At the time, we had zero income and maybe $900 in savings.

I was living in a cheap 15-square-meter room with a desk and a bed and not much else.

As young and poor parents without any education or degree, we constantly felt judgment and pity from society.

We couldn’t even rent a flat because no landlord was crazy enough to take us in.

During all the struggle, we had love and dreams and the belief that everything would get better eventually: I was going to be a computer scientist in five years.

That is if I found a way to support my family on a shoestring – and avoided screwing up my education.

The first ten years, money was tight as hell. Little time. Lots of hard work. No TV. No Games. No Saturday night partying.

Well, maybe a little…

I am not a wunderkind. But I have a good work ethic and long-term goals, and I don’t give up easily. Finally, after ten tough years, I got my Ph.D. in computer science “summa cum laude”.

I now had a steady paycheck from my government job. But I eventually learned that the academic degrees didn’t help in improving our financial situation. 

People made far more money and had far more free time coding in the private sector and without academic degrees.

I decided to take matters into my own hands again by creating my own coding business as a freelance developer.

In little time, I reached six-figure income levels. And I had much more free time compared to my government job that I held before.

My second child – now five years old – knows his father to have infinite time for playing soccer or video games, or for watching the Tesla Bot take its first steps on YouTube.

(He plans to become CEO of Tesla – stay tuned @Elon).

~~~

Becoming a freelancer was a pivot point in my life.

To share all I know about creating a thriving coding business online, I have set up our freelancer course.

It focuses on the fundamentals:

  1. find your niche,
  2. build your skills,
  3. create value for your customers, and
  4. take massive action.

Simple, but sometimes not so easy…

If you want more from life and you love coding, feel free to subscribe to my free email academy. I’d love to have you in our community of ambitious coders who have not yet lost their ability to dream of a better life! ❤


Solidity Contract Types, Byte Arrays, and {Address, Int, Rational} Literals


With this article, we continue our journey through the realm of Solidity data types with today’s topics:

  • contract types,
  • fixed-size byte arrays,
  • dynamically-sized byte arrays,
  • address literals,
  • rational, and
  • integer literals.

It’s part of our long-standing tradition to make this (and other) articles a faithful companion or a supplement to the official Solidity documentation.


Contract Types

To quote the official Solidity documentation, “every contract defines its own type”.

This statement might seem a bit cryptic, and since we’re an efficient crowd, we’d surely like to know what it means.

We can all remember that some number of articles ago, we mentioned how Solidity has key elements of an object-oriented programming language (OOPL). We also emphasized how smart contracts in Solidity are very similar to classes in an OOPL.

Classes themselves are a mesh of custom data types, i.e. structs, and functions, which qualifies classes to be treated as types.

👉 By extension, our contracts are also treated as types, and as every contract is unique in its own right, it defines its own type. Since a contract is a type, we can implicitly convert a specific contract to a contract it inherits from, i.e. if contract “Aa” inherits from contract “A”, it can also be converted to contract “A”.

Besides that, we can explicitly convert each contract to and from the address type. Even more, we can conditionally convert a contract to and from the address payable type (remember, that’s the same type as the address type, but predetermined to receive Ether).

The condition is that the contract type must have a receive or payable fallback function. If it does, we can make the conversion to address payable by using address(x).

However, if the contract type does not implement (a more professional way to say “have”) a receive or payable fallback function, then the conversion to address payable has to be even more explicit (no swearing!) by stating payable(address(x)).

A local variable obc of a contract type OurBeautifulContract is declared by OurBeautifulContract obc;.

Once we point our variable obc to an instantiated (newly created) contract, we’d be able to call functions on that contract.

In terms of its data representation, a contract is identical to the address type. This is important because the contract type is not directly supported by the ABI, but the address type, as its representative, is supported by the ABI.

In contrast to the types mentioned so far, contract types don’t support any operators.

The members of contract types are the external functions (the functions that make up the contract interface and can be called from other contracts and via transactions) and the state variables whose visibility is set to public.

When we need to access type information about a contract, like the OurBeautifulContract above, we’d use the expression type(OurBeautifulContract).

Fixed-Size Byte Arrays

The value type bytesN holds a sequence of N bytes, where N goes from 1 up to 32, i.e., bytes1, …, bytes32.

The available operators for fixed-size byte arrays are:

  • Comparisons: <=, <, ==, !=, >=, > (evaluate to bool)
  • Bit operators: &, |, ^ (bitwise exclusive or), ~ (bitwise negation)
  • Shift operators: << (left shift), >> (right shift)
  • Index access: If x is of type bytesN, then x[k] for 0 <= k < N returns the k-th byte (read-only). In other words, x[0] up to (inclusive) x[N-1] are available for index access; if N = 1, i.e. x is of type bytes1, then x[0] is the only byte accessible by index.

The shifting operator always uses an unsigned integer type as a right operand, which represents the number of bits to shift by, and returns the type of the left operand.

Let’s take a look at a simple example to illustrate:

bytes2 lo = 0x1234; // (lo is the left operand)
uint8 ro = 5; // (ro is the right operand variable, must be of an unsigned integer type)
lo << ro // will evaluate to the type of lo, bytes2
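To see what such a shift produces, here is a rough Python model. It treats the bytes2 value as a 16-bit number and masks the result so that bits shifted past the left edge are dropped. The helper name shl_bytes is made up for this sketch; it models the arithmetic only, not the compiler:

```python
def shl_bytes(value, shift, n_bytes):
    """Left-shift value and keep only the lowest n_bytes * 8 bits."""
    mask = (1 << (8 * n_bytes)) - 1
    return (value << shift) & mask


lo = 0x1234  # the bytes2 value from the example
ro = 5       # number of bits to shift by

print(hex(shl_bytes(lo, ro, 2)))
# 0x4680
```

The result still fits into two bytes, which matches the rule that the shift returns the type of the left operand.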

A fixed-size byte array has only one member, .length, which holds the fixed length of the byte array. This member is read-only.

⚡ Warning: Since the type bytes1 is a sequence of 1 byte in length, the type bytes1[] is an array of 1-byte sequences. However, each element of the array is padded with 31 bytes, due to the padding rules for elements stored in memory, on the stack, and in calldata, i.e., everywhere except in storage. Therefore, according to the official Solidity documentation, it’s better to use the bytes type instead of bytes1[].

💡 Note: Value types in storage are packed/compacted together and share a storage slot, taking only as much space per value type as really needed. In contrast, the stack, memory, and calldata pad value types and store in separate slots, meaning that each variable uses a whole slot of 32 bytes, even if the value type is shorter than 32 bytes, effectively wasting the memory space.

Before Solidity v0.8.0, the keyword byte was an alias for bytes1.

Dynamically-Sized Byte Arrays

There are two dynamically-sized non-value types, namely bytes and string.

  • bytes is a dynamically-sized byte array, while
  • string is a dynamically-sized UTF-8-encoded string.

Address Literals

Address literals are hexadecimal literals that pass the address checksum test, e.g. 0xdCad3a6d3569DF655070DEd06cb7A1b2Ccd1D3AF.

Hexadecimal literals will produce an error if they are between 39 and 41 digits long and do not pass the checksum test.

However, we can remove the error by prepending zeros to integer types or appending zeros to bytesNN types.

The Ethereum Improvement Proposal EIP-55 defines the mixed-case address checksum.

Integer and Rational Literals

Integer Literals

Integer literals are created using a sequence of digits from the range 0-9, and each digit is interpreted (weighted) based on its position in the sequence.

Each digit is multiplied by a power of 10, e.g. 217 is interpreted as two hundred and seventeen, because, reading from right to left, we have 7 * 10**0 + 1 * 10**1 + 2 * 10**2.

A reminder, 10**0 = 1.
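The positional weighting is easy to replay in Python. The function name positional_value is made up for this sketch:

```python
def positional_value(digits):
    """Sum digit * 10**position, reading the digit string from right to left."""
    return sum(int(d) * 10 ** i for i, d in enumerate(reversed(digits)))


print(positional_value('217'))
# 217  (7 * 10**0 + 1 * 10**1 + 2 * 10**2)
```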

Octal literals don’t exist in Solidity and leading zeros are invalid.

Decimal Fractional Literals

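Decimal fractional literals consist of a dot . (the decimal separator, which in everyday notation depends on the locale) and, in recent compiler versions, at least one digit after the dot, e.g. .1 and 1.3. Older compiler versions also accepted a literal with nothing after the dot, such as 1. for the number one.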

💡 Info: “A locale consists of a number of categories for which country-dependent formatting or other specifications exist” (source).

Scientific Notation

Solidity also supports scientific notation in the form of 2e10, where 2 (left of “e”) is called mantissa (M) and the exponent (E) must be an integer. In a general form, we would write it as MeE and it is interpreted as M * 10**E, e.g. 2e10, -2e10, 2e-10, 2.5e1.
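The rule M * 10**E can be checked with exact rational arithmetic in Python. This sketch uses Fraction from the standard library so that fractional mantissas and negative exponents stay exact; the helper name sci_value is made up:

```python
from fractions import Fraction


def sci_value(mantissa, exponent):
    """Evaluate MeE as the exact rational number M * 10**E."""
    return Fraction(mantissa) * Fraction(10) ** exponent


print(sci_value('2', 10))    # 20000000000
print(sci_value('-2', 10))   # -20000000000
print(sci_value('2', -10))   # 1/5000000000
print(sci_value('2.5', 1))   # 25
```

Passing the mantissa as a string keeps values like 2.5 exact instead of going through a binary float.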

Readable Underscore Notation

We can also do a neat thing: separate the digits of a numeric literal for easier readability, such as in decimal 123_000, hexadecimal 0x2eff_abde, scientific decimal notation 1_2e345_678.

However, leading, trailing, or consecutive underscores are not allowed; underscores can only be placed between two digits.

Number Literal Expressions

Expressions containing number literals preserve their precision until they are converted to a non-literal type.

Such a conversion happens either through an explicit conversion or when the number literals are used together with anything other than a number literal expression, such as boolean literals.

This behavior implies that computations don’t overflow and divisions don’t truncate in number literal expressions.

A very good example would be a number literal expression (2**800 + 1) - 2**800, which results in the constant 1 (of type uint8), although the intermediate results would not fit the capacity of the EVM word length of 32 bytes.

One more example shows that an integer 4 is produced by computing the expression .5 * 8, although the intermediary results are not integers.
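Python integers and fractions are also unbounded and exact, so both examples can be replayed there. This is only an analogy for how number literal expressions keep their precision, not a model of the Solidity compiler:

```python
from fractions import Fraction

# The intermediate values need far more than 256 bits, yet the result is exact
a = (2**800 + 1) - 2**800
print(a)  # 1

# The intermediate value 0.5 is not an integer, but the final result is
b = Fraction('0.5') * 8
print(b)  # 4

# Division of two integers stays exact instead of truncating
print(Fraction(7) / 2)  # 7/2
```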

More Operations

⚡ Warning: most operators produce a literal expression when applied to number literals, but there are also two exceptions:

  • Ternary operator (... ? ... : ...),
  • Array subscript (<array>[<index>]).

In other words, expressions like 255 + (true ? 1 : 0) or 255 + [1, 2, 3][0] are not equivalent to using the literal 256 (the result of these two expressions), as they are computed within the type uint8 and can lead to an overflow.
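The contrast can be sketched in Python. The helper checked_add_uint8 is made up and assumes checked arithmetic, i.e. an error on overflow, as in Solidity 0.8 and later; older compiler versions would wrap around instead:

```python
UINT8_MAX = 2**8 - 1  # 255


def checked_add_uint8(a, b):
    """Add two values the way a checked uint8 addition would."""
    result = a + b
    if result > UINT8_MAX:
        raise OverflowError(f'{a} + {b} does not fit into uint8')
    return result


# A pure literal expression is evaluated with unlimited precision
print(255 + 1)  # 256

# The same addition computed within uint8 overflows
try:
    checked_add_uint8(255, 1)
except OverflowError as exc:
    print('overflow:', exc)
# overflow: 255 + 1 does not fit into uint8
```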

Number literal expressions can use the same operators as the integers, as long as the operands are integers.

  • If either of the operands is fractional, bit operations are not allowed;
  • If the exponent is a decimal fractional literal, the exponentiation operation is also not allowed.

Shifts and exponentiation with a literal number as the left (or base) operand and an integer type as the right (or exponent) operand are always performed in the type uint256 for non-negative literals or int256 for negative literals, regardless of the type of the right operand.

⚡ Warning: Since Solidity v0.4.0, division on integer literals produces a rational number, e.g. 7 / 2 = 3.5.

Solidity has a number literal type for each rational number; integer literals and rational number literals both belong to number literal types.

All number literal expressions (expressions with only number literals and operators) also belong to number literal types, e.g. 1 + 2 and 2 + 1 belong to the same number literal type.

💡 Note: When number literal types are used with non-literal expressions, they are converted into a non-literal type, e.g.  uint128 a = 1; uint128 b = 2.5 + a + 0.5;

Here, 1 is converted into a non-literal type uint128, i.e. variable a, but a common type for both 2.5 and uint128 doesn’t exist and the compiler will reject the code.

Conclusion

In this article, we put even more Solidity data types under our proverbial belt!

  • First, we introduced and learned about the contract type.
  • Second, we fixed our understanding of the fixed-size byte array type.
  • Third, the situation got dynamic by studying the dynamically-sized byte array type.
  • Fourth, we addressed the… what was it called… Aha – address literals!
  • Fifth, we came to the most rational decision and discovered what rational and integer literals are and, of course, how they can be put to good use.
