Back To Blog

Sierpinski’s Triangle PT 3

Alternate methods of generating Sierpinski’s Triangle: one using a recursive function, the other using the chaos game.

First: what is recursion? What is the chaos game?

Recursion is a fundamental concept in computer science and mathematics, in which a function or algorithm calls itself repeatedly in order to solve a problem. The principle of recursion is to break a complex problem down into smaller, more manageable sub-problems, continuing until it reaches a condition that can be solved without further recursion, usually referred to as the ‘base case’. One very common example of recursion is the factorial (the product of all positive integers up to and including a given number). Here it is in Python:

def factorial(n):
    # Base case: factorial of 0 or 1 is 1
    if n <= 1:
        return 1
    # Recursive case: n! = n * (n-1)!
    else:
        return n * factorial(n-1)

In this example, factorial(n-1) is the recursive call. When you call factorial(n), it calls factorial(n-1), which calls factorial(n-2), and so on, until it reaches factorial(1). This base case returns a simple, non-recursive result, which is 1 in this case.
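As a quick check, here is what calling the function defined above looks like, unwinding the chain by hand:

print(factorial(5))
# factorial(5) = 5 * factorial(4) = 5 * 4 * factorial(3) = ... = 5 * 4 * 3 * 2 * 1 = 120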

The Chaos Game is a mathematical procedure that generates a sequence of points in a space, often resulting in a fractal pattern. Invented by mathematician Michael Barnsley, it’s part of the study of chaotic dynamical systems.

Here is the basic process of the Chaos Game:

  1. Begin by defining a set of rules or transformations. Typically, these involve geometric operations like rotations, translations, and scaling.
  2. Choose an initial point in the space.
  3. Apply one of the transformations to the point at random. The result is a new point.
  4. Repeat step 3 for the new point, and continue doing so. The key is that the choice of transformation is random at each step, introducing an element of unpredictability or “chaos”.

The Chaos Game is fascinating because it demonstrates how complex and intricate structures can emerge from simple rules and random behavior. This makes it a useful tool in the study of chaos theory and fractals.

Let’s try a relatively simple example by creating the Barnsley Fern.

The Barnsley Fern is a fractal named after British mathematician Michael Barnsley, who first described it in his book “Fractals Everywhere”. The fern is created by plotting a sequence of points in the plane according to a set of four affine transformations chosen at random.

Here is a simple Python code using matplotlib and numpy:

import numpy as np
import matplotlib.pyplot as plt
# Define the transformation functions
def f1(x, y):
    return 0, 0.16*y
def f2(x, y):
    return 0.85*x + 0.04*y, -0.04*x + 0.85*y + 1.6
def f3(x, y):
    return 0.2*x - 0.26*y, 0.23*x + 0.22*y + 1.6
def f4(x, y):
    return -0.15*x + 0.28*y, 0.26*x + 0.24*y + 0.44
# Create an array of transformation functions
fs = [f1, f2, f3, f4]
# Choose an initial point
x, y = 0, 0
# Initialize arrays to hold x and y values
xs = [x]
ys = [y]
# Iterate the transformations
for i in range(100000):
    # Choose a random transformation function and call it
    f = np.random.choice(fs, p=[0.01, 0.85, 0.07, 0.07])
    x, y = f(x, y)
    xs.append(x)
    ys.append(y)
# Plot the points
plt.scatter(xs, ys, s=0.2, color='green', lw=0)
plt.show()

In this code, we define four transformation functions f1, f2, f3, and f4, which correspond to the four transformations used to generate the Barnsley Fern. We then apply one of these functions at each step, chosen at random according to specified probabilities. Despite the randomness, the sequence of points forms the recognizable shape of a fern.

  1. Recursive Function:

This function involves dividing a triangle into four smaller triangles and removing the center one, and then repeating this process for the remaining triangles. Here’s a simple way to do this using the turtle module in Python:

import turtle
def draw_sierpinski(length, depth):
    if depth == 0:
        for i in range(3):
            turtle.forward(length)
            turtle.left(120)
    else:
        draw_sierpinski(length / 2, depth - 1)
        turtle.forward(length / 2)
        draw_sierpinski(length / 2, depth - 1)
        turtle.backward(length / 2)
        turtle.left(60)
        turtle.forward(length / 2)
        turtle.right(60)
        draw_sierpinski(length / 2, depth - 1)
        turtle.left(60)
        turtle.backward(length / 2)
        turtle.right(60)
# Initial settings
turtle.speed(0)
turtle.penup()
turtle.goto(-200, -200)
turtle.pendown()
# Draw Sierpinski Triangle
draw_sierpinski(400, 4)
# End drawing
turtle.done()

In this code, draw_sierpinski is a recursive function where length is the side length of the triangle and depth is the recursion depth. At each level, the turtle draws the bottom-left sub-triangle, moves forward and draws the bottom-right one, then moves back, turns, and travels up the left edge to draw the top sub-triangle before returning to its original position and heading.

  2. Chaos Game:

The chaos game is a method of creating a Sierpinski triangle by randomly moving a point half the distance towards the corners of an initial triangle. Here’s a way to do this using matplotlib and numpy:

import numpy as np
import matplotlib.pyplot as plt
# Initialize the triangle corners
corners = np.array([[0, 0], [0.5, np.sqrt(3)/2], [1, 0]])
points = np.zeros((100000, 2))
points[0, :] = [0, 0]
for i in range(1, len(points)):
    # Randomly select a corner
    corner = corners[np.random.randint(3)]
    # Move half the distance to the chosen corner
    points[i] = (points[i-1] + corner) / 2
# Plot the result
plt.scatter(points[:, 0], points[:, 1], s=0.1, color='k')
plt.axis('equal')
plt.show()

In this code, corners contains the coordinates of the three corners of the initial triangle, and points is an array of points that will be plotted. In each step of the loop, a corner is randomly selected, and the next point is set to be halfway between the previous point and the selected corner. The result is a scatter plot of all the points, which forms a Sierpinski triangle.

Back To Blog

Learn (Basic) Natural Language Processing (NLP) in 6 easy steps

Recently, I started the ‘Intro To NLP’ Course on the 365 Data Science platform. I’ve noticed the platform scrambling in recent months to introduce AI content after focusing almost exclusively on Data Science topics. Since the two are related, this wasn’t a bad thing, but I’m glad to see it bringing in more high-quality courses focusing exclusively on AI/LLM topics, and I’m looking forward to seeing more.

The Intro to NLP course consists of seven sections of lessons with a Practical Task at the end of each section, then a section devoted to a whole project, finishing up with a ‘Future of NLP’ section before the Final Exam. So far it’s been a good course, about a subject I knew almost nothing about before I started, but which is central to LLMs like Chat GPT and to AI generally.

What is Natural Language Processing, or NLP? NLP is a technology that allows computers to understand, interpret, and respond to human language, both written and spoken. As it turns out, it has an astonishing number of uses, and it was already widely used in ways I never even thought about before the arrival of the LLMs:

  • Search engines: when we type a query into a search engine, NLP algorithms interpret our intent, understand the context, and deliver relevant search results. This involves processing the query, identifying key terms, and even understanding the context of our question.
  • Voice Assistants and Smart Home Devices: Devices like Amazon’s Alexa, Google Assistant, and Apple’s Siri use NLP to understand spoken commands. They can interpret our requests, respond to queries, control smart home devices, and even engage in casual conversation. Sort of.
  • Text Autocorrect and Predictive Text: The autocorrect and predictive text features on smartphones and other devices use NLP to understand the context of what we are typing, correct spelling errors, and predict the next word we might type. Sometimes this works, sometimes it doesn’t. I’ve sent many a garbled text because of Autocorrect.
  • Language Translation Services: Online translation tools like Google Translate utilize NLP to convert text or spoken words from one language to another, understanding grammatical nuances and context to provide accurate translations.
  • Chatbots and Customer Service: Many websites and customer service platforms employ chatbots that use NLP to understand and respond to customer inquiries. These bots can handle a range of tasks from answering FAQs to helping with online shopping or troubleshooting. Next time you’re frustrated by some generic response from a chatbot and wish you were talking to an actual human – thank NLP.
  • Email Filtering: Email services use NLP to filter out spam or categorize emails into different folders (like social, promotions, primary). This is done by analyzing the content of the emails and identifying certain patterns or keywords. This has gotten dramatically better in recent years.
  • Social Media Feeds: NLP algorithms help in personalizing our social media feeds. They analyze our interactions, the content we engage with, and use this data to curate a feed that is supposedly tailored to our interests. I would say they mostly ruin them, but you get the idea.
  • Sentiment Analysis: Businesses use NLP for sentiment analysis to gauge public opinion about their products or services. By analyzing social media posts, reviews, and comments, they can understand customer satisfaction and general sentiment.
  • Content Recommendations: Streaming services like Netflix or Spotify use NLP to recommend movies, shows, or music based on our previous viewing or listening habits, search history, and preferences. I’ve found these to be mostly . . . if not quite useless, close to it.
  • Accessibility Tools: NLP aids in creating tools for individuals with disabilities, such as text-to-speech and speech-to-text applications, which allow users with visual or hearing impairments to interact with technology more effectively. Potentially very useful.
  • Educational Tools: NLP is used in educational software to aid in language learning, provide automated grading of essays, and even give feedback on writing style and grammar.
  • Resume Screening: In the hiring process, NLP is used to screen resumes and applications to identify the most suitable candidates by matching job requirements with the skills and experiences listed in the resumes. Another instance where we’d probably be better off without this. Ever wonder why you can’t get an interview? Thank some unknowable NLP algorithm.

But with the arrival of Chat GPT and the other LLMs a year ago, NLP really came into its own. Suddenly we were able to engage with a machine in a way eerily similar to how we interact with human beings. We can argue whether this is, long term, a good thing, but seen purely as a technology, Chat GPT is amazing, even with all its flaws: hallucinations, errors, the dubious practice of scraping content without any permissions at all, and so on. Chat GPT, and the other LLMs struggling to catch up, really are a leap ahead on a technological scale, and NLP made them possible.

How is NLP used in the new Large Language Models?

  • Understanding Language: ChatGPT uses NLP to grasp the nuances of human language. When you type a sentence, NLP helps the model understand not just the words, but also the meaning and context behind them. This understanding is crucial for generating relevant and coherent responses.
  • Generating Text: Once ChatGPT understands our input, it uses its knowledge gained from NLP to construct a reply. NLP guides it in forming sentences that are not only grammatically correct but also contextually appropriate, maintaining a flow that resembles natural human conversation.
  • Learning from Large Datasets: ChatGPT has been trained on a vast array of text data. NLP is used to process and learn from this data, enabling the model to recognize patterns, understand various topics, and even mimic different writing styles.
  • Handling Different Tasks: Whether it’s answering questions, writing essays, or even creating ‘poetry’, ChatGPT uses NLP to tailor its responses to the specific task at hand. NLP provides the flexibility to switch between different types of language use, from formal to casual, technical to creative.
  • Continuous Learning: As ChatGPT interacts with users, it continually refines its understanding and use of language. NLP is key in this learning process, helping the model to adapt and improve over time based on new interactions and data.

In short, NLP is the ‘brain’ behind an LLM’s ability to communicate effectively with humans, using human language. It is what allows the model to understand our questions and respond in a way that is informative, engaging, and eerily human-like.

So now we’re going to learn the basics of NLP and how to use it. First stop: Text Preparation.

Back To Blog

Sierpinski’s Triangle PT 2

In Part I, I explained what the Sierpinski’s Triangle is, and how it is based on fractals. In this post, we will actually build the Sierpinski’s Triangle, reviewing the basic mathematical concepts behind the Triangle, then using Python to build it. This tutorial is based on the ‘Sierpinski’s Triangle’ section of Giles McCullen-Klein’s excellent ‘Python Programmer Bootcamp‘ course on the 365 Data Science platform (also on Udemy).

To review, the Sierpinski Triangle is a fractal, a self-replicating geometric pattern that exhibits intricate detail and self-similarity at different scales. It is named after the Polish mathematician Wacław Sierpiński, who first described the pattern in 1915.

The Sierpinski Triangle is formed by recursively subdividing an equilateral triangle into smaller equilateral triangles. The process begins with a single large equilateral triangle. Then, at each iteration, we:

  1. Divide the initial triangle into four smaller equilateral triangles by connecting the midpoints of each side.
  2. Remove the central triangle, leaving three smaller equilateral triangles that form a larger equilateral triangle.
  3. Repeat steps 1 and 2 for each of the remaining smaller triangles, continuing indefinitely.

As the number of iterations approaches infinity, the resulting pattern becomes an increasingly intricate set of triangles, with the final Sierpinski Triangle having an infinite number of triangles and a total area of zero.

The Sierpinski Triangle is an example of a deterministic fractal, meaning that it can be generated through a specific set of rules. It has been studied extensively in mathematics and has applications in areas such as computer graphics, geometry, and the study of complex systems.

To recap, here are our three basic mathematical formulas:

First Transformation:

x_{n+1} = 0.5x_n 
y_{n+1} = 0.5y_n

Second Transformation:

x_{n+1} = 0.5x_n + 0.5
y_{n+1} = 0.5y_n + 0.5

Third Transformation:

x_{n+1} = 0.5x_n + 1
y_{n+1} = 0.5y_n

Next, we take these formulas and express them in Python:

from random import choice

# Transformation 1: scale the point by half toward the origin
def trans_1(p):
    x = p[0]
    y = p[1]
    x1 = 0.5 * x
    y1 = 0.5 * y
    return x1, y1

# Transformation 2: scale by half, then shift up and to the right
def trans_2(p):
    x = p[0]
    y = p[1]
    x1 = 0.5 * x + 0.5
    y1 = 0.5 * y + 0.5
    return x1, y1

# Transformation 3: scale by half, then shift to the right
def trans_3(p):
    x = p[0]
    y = p[1]
    x1 = 0.5 * x + 1
    y1 = 0.5 * y
    return x1, y1

transformations = [trans_1, trans_2, trans_3]

# Starting point and lists to collect the generated coordinates
a1 = [0]
b1 = [0]
a, b = 0, 0

# Apply a randomly chosen transformation 100 times
for i in range(100):
    trans = choice(transformations)
    a, b = trans((a, b))
    a1.append(a)
    b1.append(b)

Then we use Matplotlib to visualize our triangle (we’ll have to import matplotlib first):

import matplotlib.pyplot as plt
# Jupyter magic so plots display inline in the notebook
%matplotlib inline
plt.rc('figure', figsize=(16, 16))
plt.plot(a1, b1, 'o')
plt.savefig('my_figure.png')

As we can see, at 100 points our triangle barely exists. Let’s try again at 1,000:

for i in range(1000):
    trans = choice(transformations)
    a,b = trans((a,b))
    a1.append(a)
    b1.append(b)

Ok, our triangle is starting to take shape. Let’s skip ahead and try again at one million:
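If you want to reproduce that final image yourself, here is a rough sketch, continuing with the transformations, lists, and imports from the snippets above (with a million points, a tiny marker size keeps the individual triangles visible instead of smearing into a blob):

# Keep appending to the same a1 and b1 lists from above
for i in range(1000000):
    trans = choice(transformations)
    a, b = trans((a, b))
    a1.append(a)
    b1.append(b)

plt.plot(a1, b1, 'o', markersize=0.05)
plt.savefig('my_figure_million.png')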

Ta-dah! There we have it, our Sierpinski’s Triangle. The first time I saw the triangle appear in Giles’ lesson, I thought it was a miracle.

Part 3: Alternative Ways to Generate the Sierpinski’s Triangle

Back To Blog

Sierpinski’s Triangle PT 1

Summertime: Number 9A (1948) by Jackson Pollock, Tate: http://www.tate.org.uk/art/work/T03977

A bit like the Random Walk Algorithm, but it embellishes it a bit.

Giles McCullen-Klein

Introduction

I first became aware of the Sierpinski’s Triangle while taking Giles McCullen-Klein’s excellent ‘Python Programmer Bootcamp‘ on the 365 Data Science platform. The Sierpinski Triangle is named after Polish mathematician Waclaw Sierpinski, who popularized the concept in the early 20th century. Giles used the Triangle in his section on Matplotlib to demonstrate the power of visualizations. He started with these formulas:

First Transformation:

x_{n+1} = 0.5x_n 
y_{n+1} = 0.5y_n

Second Transformation:

x_{n+1} = 0.5x_n + 0.5
y_{n+1} = 0.5y_n + 0.5

Third Transformation:

x_{n+1} = 0.5x_n + 1
y_{n+1} = 0.5y_n

then explained how the formulas are chosen at random to create the Triangle, and followed with their equivalent in Python code (outlined in Part 2 of this post, where we build the Sierpinski’s Triangle using Python). He tried inputting 10 points, then 100, then 1,000, where the first triangle began to take shape, albeit in outline. Finally, at a million points, we had it: the Sierpinski Triangle!

I was captivated by the appearance of the full Sierpinski’s Triangle, partly because of the ability of three simple mathematical formulas translated into Python to create such a compelling and beautiful image, but also by the design itself, with triangles inside triangles inside triangles on into infinity. I became even more intrigued when I discovered that the Sierpinski Triangle is based on fractals.

Fractals are geometric figures characterized by self-similarity: as you zoom in or out of a fractal, the pattern remains essentially the same, echoing itself over and over. What fascinated me about Sierpinski’s Triangle is that while it’s a fractal that exists in two dimensions, its fractal dimension – a measure of its complexity – is not an integer. Apparently, I am not alone: this quirk of fractal dimensions has perplexed and fascinated mathematicians since it was introduced. In the field of computer graphics, the Triangle’s algorithmic simplicity has lent itself to the generation of textures and patterns. The structure of the Triangle has relevance in fields as diverse as error-correcting codes, network design, dynamical systems, and cellular automata.

Fractals were also found to be present in Jackson Pollock’s so-called ‘drip’ paintings of the late 1940s and early 1950s. Though the concept of fractals wasn’t formalized until the 1970s, Pollock was discovered to have used them in his paintings, following the patterns of nature. I became aware of this when reading a plaque below his magnificent Summertime: Number 9A. I can’t find its contents online, but I did find this from the University of Oregon:

In 1999, Richard Taylor and his research team published the results of their scientific analysis showing Pollock’s poured patterns to be fractal. Consisting of patterns that recur at increasingly fine magnifications, fractals are the basic building blocks of nature’s scenery. Labelled as “Fractal Expressionism,” Pollock distilled the essence of natural scenery and expressed it on his canvases with an unmatched directness. By adopting nature’s pattern generation processes, the resulting paintings didn’t mimic nature but instead stood as examples of nature. The above images compare Pollock’s fractals to those found in nature. Remarkably, the analysis revealed a highly systematic fractal painting process perfected by Pollock over a decade.

Richard Taylor, University of Oregon blog

What is Sierpinski’s Triangle?

The Sierpinski’s Triangle, also known as the Sierpinski gasket, is a fractal named after the Polish mathematician Waclaw Sierpinski who described it in detail in 1915, though it’s worth noting that similar patterns were described by Italian mathematicians a few centuries earlier.

Wacław Franciszek Sierpiński was born on March 14, 1882, in Warsaw, Poland. Through his long and prolific career, he made significant contributions to set theory, number theory, the theory of functions, and topology, although he is best known for his work on set theory and the theory of numbers. Along with the triangle, he is credited with various other mathematical concepts, including Sierpinski’s conjecture, the Sierpinski arrowhead curve, the Sierpinski carpet, and the Sierpinski constant. His work had a vital impact on the field of mathematics, and his concepts continue to be studied today. Over the course of his career, he published over 700 papers and 50 books, before passing away on October 21, 1969, in his hometown of Warsaw. That he accomplished all this while surviving through Nazi and then Soviet occupation, and the leveling of Warsaw by the retreating Nazi army, makes his accomplishments all the more remarkable.

The Sierpinski Triangle is a fractal that is easy to construct and provides a clear example of self-similarity, a key property of fractals. Here’s a step-by-step guide to constructing a Sierpinski Triangle:

  1. Start with an equilateral triangle. This will be the base of your fractal. You can draw it on a piece of paper, or create it digitally.
  2. Divide the triangle into four smaller equilateral triangles. You can do this by connecting the midpoints of each side of the original triangle. This will create one triangle in the center and three triangles at the corners.
  3. Remove the middle triangle. This leaves you with three equilateral triangles. The shape now looks like a larger triangle made up of three smaller triangles.
  4. Repeat the process for each of the remaining smaller triangles. For each of the three smaller triangles, divide it into four even smaller triangles and remove the one in the center. This leaves you with nine small triangles.
  5. Continue this process indefinitely. Each time, you divide each remaining triangle into four smaller triangles and remove the one in the center. As you do this, the Sierpinski Triangle begins to take shape.

The Sierpinski Triangle is an example of a fractal because it is self-similar at all scales. If you zoom in on any part of the triangle, it looks the same as the whole triangle. This property is characteristic of fractals and is one of the things that makes them so interesting to mathematicians and scientists.

One of the more curious things about Sierpinski’s Triangle is that if you zoom in on the triangle, you will see the same pattern repeating over and over again, indefinitely. Despite this, the Sierpinski Triangle fits within a finite space. This is due to the nature of its construction: as we repeatedly remove the middle triangle from each smaller triangle in the figure, the total area of the triangles that make up the Sierpinski Triangle decreases, even as the number of triangles increases.

If we were to continue this process indefinitely, the total area of the Sierpinski Triangle would approach zero, even as it contains an infinite number of triangles. This is how the Sierpinski Triangle, and fractals generally, can exhibit infinite complexity while remaining within a finite space.

The mathematical principle behind the Sierpinski Triangle involves concepts from fractal geometry, particularly the idea of fractal dimension.

The Sierpinski Triangle is a fractal with a Hausdorff-Besicovitch dimension, also known as fractal dimension, of log(3)/log(2), which is approximately 1.585. This value indicates that the Sierpinski Triangle is more complex than a one-dimensional line (which has a fractal dimension of 1) but less complex than a two-dimensional shape (which has a fractal dimension of 2).
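A quick sketch of where that 1.585 comes from: the self-similarity dimension is log(N)/log(1/r), where a shape splits into N copies, each scaled down by a factor r. For the Sierpinski Triangle, N = 3 and r = 1/2:

import math

# N self-similar copies, each scaled down by factor r
N = 3      # copies kept at each step
r = 0.5    # each copy is half the size of the original
dimension = math.log(N) / math.log(1 / r)
print(dimension)  # approximately 1.585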

The area of the Sierpinski Triangle decreases with each iteration because with each step, we remove triangles, thereby reducing the total area. If we start with a triangle of area 1, after the first iteration, we remove the middle triangle, leaving 3/4 of the area. After the second iteration, we remove additional triangles, leaving (3/4)^2 of the area, and so on. So, with each iteration, the area of the Sierpinski Triangle is (3/4) to the power of the number of iterations, which approaches zero as the number of iterations goes to infinity.

On the other hand, the perimeter of the Sierpinski Triangle increases with each iteration. With each step, we add more edges, thereby increasing the total length of the boundary. If we start with a triangle of side length 1 (perimeter 3), after the first iteration we keep 3 triangles, each with sides of length 1/2, for a total perimeter of 3 × 3 × (1/2) = 4.5. After the second iteration we have 9 triangles with sides of length 1/4, for a total perimeter of 9 × 3 × (1/4) = 6.75. In general, each iteration multiplies the total perimeter by 3/2, so after n iterations it is 3 × (3/2)^n, which grows without bound as the number of iterations goes to infinity.

This paradoxical situation, where the area approaches zero while the perimeter goes to infinity, is one of the fascinating properties of fractals like the Sierpinski Triangle.
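To see the paradox numerically, here is a small sketch that tracks both quantities for the first ten iterations, using the formulas above (area shrinks by a factor of 3/4 each step, while the total perimeter of the remaining triangles grows by 3/2):

area = 1.0       # start with a triangle of area 1
perimeter = 3.0  # and side length 1, so perimeter 3
for n in range(1, 11):
    area *= 3 / 4        # keep 3 of the 4 sub-triangles
    perimeter *= 3 / 2   # 3 copies, each at half the scale
    print(f"iteration {n}: area = {area:.4f}, perimeter = {perimeter:.1f}")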

Sierpinski’s Triangle has many parallels in art and nature.

In nature, certain types of ferns exhibit a fractal pattern similar to the Sierpinski triangle. The leaves of the fern are self-similar, with each leaflet being a smaller copy of the whole leaf. This self-similarity is a key characteristic of fractals.

In terms of culture, the triadic structure found in the triskelion symbol, which is common in Celtic and Greek art, is reminiscent of the Sierpinski triangle. The triskelion consists of three interlocked spirals or three bent human legs, and its recursive, triadic structure is similar to the recursive, triadic structure of the Sierpinski triangle.

In the 1990s, physicist Richard Taylor used computer analysis to study Pollock’s paintings and found that they contain fractal patterns. According to Taylor, Pollock’s drip paintings achieved a level of complexity in their fractal dimensions that is similar to those found in natural landscapes.

While Pollock’s paintings may not contain specific fractal shapes like the Sierpinski triangle, the overall fractal nature of his work does draw a parallel with the properties of the Sierpinski triangle and other fractals. Both Pollock’s paintings and the Sierpinski triangle demonstrate how simple rules and processes can generate complex and infinitely detailed patterns.

It’s important to note that while this analysis provides a fascinating intersection of art and mathematics, Pollock himself likely did not consciously incorporate mathematical fractals into his work. Rather, his intuitive process and the physical properties of the paint and canvas resulted in patterns that have fractal-like properties.

Pollock’s work has been analyzed from various scientific perspectives, and parallels between his paintings and fractal geometry have been suggested. Fractal geometry, which includes figures like the Sierpinski Triangle, is a branch of mathematics that deals with complex patterns that are self-similar across different scales.

Art critics and scientists have suggested that Pollock’s drip paintings exhibit fractal properties. Richard Taylor and his team initially proposed their theory that Jackson Pollock’s paintings may contain fractal patterns in the late 1990s. Their research, published in the scientific journal “Nature” in 1999, argued that Pollock’s drip paintings exhibited fractal properties, and that Pollock seemed to intuitively grasp these complex mathematical principles, even if he did not consciously understand them as fractals.

This intersection of art and mathematics demonstrates how concepts from one field can resonate in an entirely different one. However, this interpretation of Pollock’s work remains somewhat controversial, with some critics suggesting that Pollock likely wasn’t consciously employing mathematical principles in his work. Regardless of the controversy, the suggested link between Pollock’s paintings and fractal geometry, like the Sierpinski Triangle, offers an intriguing perspective on his unique artistic process and its outcomes.

This research has led to an ongoing and intriguing dialogue between art and science, and while not everyone in the scientific or artistic community agrees with Taylor’s conclusions, it certainly opened up new ways of analyzing and understanding abstract expressionist works.

Part 2: We make Sierpinski’s Triangle

More on fractals in Jackson Pollock’s work: https://www.arthurimiller.com/colliding-worlds-blog/how-fractals-science-and-technology-helped-to-resolve-a-major-art-scandal/

The facts about Pollock’s fractals: https://blogs.uoregon.edu/richardtaylor/2017/01/04/the-facts-about-pollocks-fractals/

Back To Blog

A Gentle Guide to Chat GPT, THE ChatBot threatening to take over the world

Introduction: So What is Chat GPT?

Chat GPT is an AI model developed by OpenAI, which uses machine learning to generate human-like text based on a given input. The GPT stands for ‘Generative Pre-trained Transformer’. It was released in late November 2022. Within a week it had a million users; within two months, according to the London Guardian, 100 million, making it the fastest-growing app since apps were created.

Some have called it an auto-complete on steroids, but it is more than that because of its ability to remember ‘conversations’ (more on this later) and to ‘learn’. Chat GPT is a ‘Large Language Model’, trained on roughly 570GB of data over a period of a year and a half. Chat GPT uses a specific type of deep learning called a transformer, which makes it very good at understanding the context of language, and therefore very good at generating human-like responses. It should be noted, though, that because it took a year and a half to train GPT, Chat GPT’s world stops a year and a half before its release – mid to late 2021, depending on the model.

Chat GPT also has a cousin, Bing, and an offspring, Chat GPT4. Both Chat GPT (which is v3.5) and Bing are free, with Bing bundled both as an app and as part of the Microsoft Edge browser. Bing is basically a search engine built on top of the GPT technology that powers Chat GPT. Chat GPT4 requires a $20/month fee and offers not just a (much) superior service, both in the depth and accuracy of its responses and in availability (the free version sometimes goes down due to the sheer number of users), but also access to plugins and the ability to use Bing to browse the web in real time. The plugins allow Chat GPT to be extended in almost infinite directions – shopping, academia, travel, interacting with PDFs, data and stock analysis. Dozens of new plugins are being approved on the platform each week, with hundreds of apps and plugins being released outside of Open AI concurrently. As with any new technology, some are useful and . . . some are not.

Chat GPT shares some similarities with that other famous AI, IBM Watson, but there are some important differences as well. IBM Watson has been around for a while, and has been used in many applications, from diagnosing medical conditions to answering customer service inquiries to employee training. Watson was originally designed to understand and answer questions in natural language, which it famously demonstrated by competing on the game show Jeopardy. Chat GPT, on the other hand, is focused on text generation: it is all about creating human-like text based on the input it’s given. So while they are both AI tools, they are different tools for different tasks.

Another important difference: using IBM Watson ain’t cheap, so its (admittedly impressive) suite of applications is typically used by businesses and organizations rather than individuals. Chat GPT, on the other hand, can be accessed by anyone with a web browser and an email address to start an account, thus making AI accessible to anyone on the planet (until, of course, Open AI decides to eliminate the free option, so enjoy it while it lasts).

But What Does Chat GPT Actually DO?

In terms of text, and everything to do with text, which is a good deal of our civilization, the better question might be: What does it NOT do? To answer this question, I asked Chat GPT(4) itself:

What can Chat GPT be used for? Please generate a list of 10 examples

  • Content Generation: ChatGPT can be used to draft articles, blog posts, or other forms of written content. This can be helpful to start a writing project or to get past writer’s block.
  • Creative Writing: It can generate creative text, such as stories, poems, or scripts. This can be a fun way to create unique narratives or to get inspiration for your own creative writing.
  • Brainstorming Ideas: ChatGPT can be used to generate ideas for a variety of purposes, from business plans to creative projects.
  • Customer Service: It can be used in chatbot interfaces to interact with customers, answer frequently asked questions, and provide information about products or services.
  • Tutoring: ChatGPT can be used to explain complex topics or concepts in a simpler, more understandable way.

The other 5 were Language Translation, Email Drafting, Coding Assistance, Role-Playing and Simulation, and Personal Assistant. Anyone reading this can ask Chat GPT themselves, with a simple prompt like ‘What are you good for?’, and receive a reasonably thoughtful reply. I use GPT-4 to generate code, generate ideas, solve problems, build learning plans, explain mathematical or programming concepts I don’t understand, and sketch a basic structure for a cover letter or, for that matter, this article. I’ve used it to fill out a plot point or something about a character in a novel, and to find parallels in mythology or literature with an idea I’ve had. And I feel like I’ve barely scratched the surface of what it’s capable of.

One thing Chat GPT, 3.5 or 4, does NOT do well is write. It really sucks at writing and, moreover, unless we all want to live in Idiocracy in real time, it should suck at writing. Which of course hasn’t stopped thousands, even millions, of students from generating their essays, papers, what have you in Chat GPT and then running them through various tools to evade the other tools that detect AI-generated content. Nor, alas, has it stopped would-be bloggers from generating their content with AI, or online sites (already heavily reliant on SEO keyword-laden articles and listicles, and already using AI in one form or another to generate content for the last couple of years anyway) from using Chat GPT to generate content and then having editors ‘curate’ that content before publication. This is one truly depressing (and dangerous) aspect of Chat GPT and AI generally (more on this anon).

https://chat.openai.com/share/cd5fb547-f375-45e9-afd9-9c3982407b32

HOW TO ACCESS Chat GPT

Accessing Chat GPT (the free version) is as simple as creating an account after following the link from the Open AI landing page and punching in an email and password. Bing can be found at bing.com or bundled as the default search engine with Microsoft’s Edge browser. From there, you ask questions of the model the way you would with a search engine, with the major difference that with Chat GPT, and to some extent Bing, you’re not just submitting a search query, but having a conversation.

Some things to remember

The most important thing to remember about Chat GPT and, to a somewhat lesser extent, Bing, is that they are both iterative: unlike a pure search engine like Google, they remember the questions and responses in a thread and thus have a human-like quality in the call and response. Since Bing has access to the web, it can return real-time data – and provide sources, which Chat GPT does not. It’s also important to remember that Chat GPT is not always accurate. Sometimes it ‘hallucinates’ – gives answers, often with an almost gleeful confidence, when it doesn’t actually know the answer. GPT4 is much more reliable, but even GPT4 produces questionable results from time to time.

This has been hailed as proof that Chat GPT doesn’t actually work, but I think critics miss the point, both about the software and how to use it. No software is bug-free when it’s first rolled out, and for Chat GPT to go from zero to over 100 million users in just a few months understandably put incredible strain on its creator. But more importantly, it shouldn’t really be thought of as software.

Two other very important considerations: security and timeliness.

Security

Everything you put into Chat GPT stays on Chat GPT, very likely to be sucked into the data maw that is its training base. One of the more interesting facets of using Chat GPT is that it’s always learning, and it seems to learn to anticipate your needs and even your personality to some extent. And one of the more disturbing aspects of Chat GPT is . . . it’s always learning, and it’s learning from the data that you, me, and millions of other users (as of this writing) have prompted our way into it over the last six months of its public existence. So don’t put in personal details, or any other sensitive data. As soon as you do, it becomes public property and yet more training data for Chat GPT.

Timeliness

Chat GPT v3 was released in late November 2022, Chat GPT v4 in March 2023 (the free version is now Chat GPT v3.5 – v3 has been discontinued). But it took a year and a half up to that date to train the AI model that is Chat GPT on basically the entire internet – and because of that, Chat GPT’s awareness of the world stops somewhere around late summer 2021. So don’t use Chat GPT for current events, or for any knowledge released in the last year and a half; for that, use Bing, which can access the internet in real time, or a regular search engine.

CHAT GPT: Your new personal Assistant

Another reason to learn how to use Chat GPT and, more generally, AI: its development is moving at light speed. Some have compared its importance, and potential impact, to that of the arrival of the web (Bill Gates has gone further, saying the development of AI was as “fundamental as the creation of the microprocessor, the personal computer, the internet, and the mobile phone.” But since his old company funds Open AI, he would say that, wouldn’t he?). But the main reason I advocate learning how to use it is that, unlike the web, this is moving fast. Will AI eliminate jobs? In my opinion, in the long term, almost certainly yes – but in the short term, you might not lose your job directly to AI, but you could very well lose your job to someone who knows how to use AI.

So learn how to use AI. Stay tuned for more . . . .

Back To Blog

The Draughtsman Writer

A couple of years ago I saw The Draughtsman Writer at a larger exhibit at the Met, ‘Technology In the Age of the Court’. Most of the exhibits were clever, if sometimes dazzling: many immensely complicated clockworks, constructed with gold and other precious minerals, but nothing truly blew me away until the exhibits at the show’s end, and in particular ‘The Draughtsman Writer’.

Along with our Draughtsman was his more famous cousin The Chess Player (sometimes called ‘The Turk’ because of his garb), a replication of the original automaton built by Wolfgang von Kempelen in 1769, which ‘played’ chess with prominent figures across Europe, and which was eventually revealed as a fraud, manned by skilled, even famous, chess players hidden from view yet operating the Chess Player’s mechanical arms (I’m not entirely clear how they did this).

The Draughtsman Writer, however, needs no human intervention, except perhaps for a key to wind it up so that the Draughtsman’s profoundly intricate gears, hidden in the Draughtsman’s desk, can begin turning, and then a human hand to place the paper that the Draughtsman will fill with his exquisitely intricate drawings and poems. From the exhibit notes:

Maillardet hid the mechanics of the Draughtsman Writer in a cabinet rather than the figure. This allowed for larger machinery and greater memory than in earlier efforts . . . an unprecedented three poems and four drawings are drawn by the figure, through a technology that foretold the computer.

Incredibly, when Philadelphia’s Franklin Institute received the automaton in 1928, it was so damaged by a fire in the warehouse where it had been stored that they had no idea it was an automaton. They knew it had some mechanical function but, since it was in pieces, they had no idea what that function was. They didn’t even know the name of the inventor.

I was instantly captivated by the Draughtsman Writer. In part it was the sight of an early robot. No uncanny valley here – the automaton is only half-formed (many of its panels, its ‘skin’, possibly lost in the fire), with a young man’s dummy head, yet none of the creepiness we associate with a ventriloquist’s dummy. This is a benign, contemplative figure, eyes focused downward on its task, its transparently mechanical arm composed of brass strips, an almost human hand holding a pen, tracing delicate lines across the page fitted into the Draughtsman’s desk. As one of the curators says:

Normally we think of robots moving in very mechanical ways, very jerky movements. This machine is by far the most elegant in its movements.

The Draughtsman can compose four different pictures, including drawings of a Chinese temple and a ship, and write three poems, one in English and two in French. The ‘hard drive’ for these movements is a set of brass disks housed below the surface of the Draughtsman’s desk. The disks have hills and valleys on their surfaces, and a needle follows these contours up and down, the collection of disks allowing for the most extensive mechanical memory of any known automaton.

The moving automaton was confined to a video next to the exhibit (as with all the other exhibits, presumably because it is too old, too delicate for the repeat performances the exhibit would demand). Instead, in the actual exhibit, the Draughtsman peers straight ahead, eyes wide open, pen poised in its hand, waiting for human intervention to fulfill its function and begin drawing and writing again.

The Draughtsman Writer at the Metropolitan Museum

I was struck, watching the video again, by the beauty of the calligraphy, the detail in each drawing. What a watchmaker Maillardet must have been, to so precisely record each groove in his brass discs, long before the vinyl record would perform the same function. What might Maillardet have done with a computer, with modern computer languages?

What also struck me is the essential frivolity of the Draughtsman Writer. Maillardet likely built it to impress the court, as a sort of calling card. But essentially our Draughtsman exists to produce art, no more, no less. Sometimes I wonder if our machines shouldn’t be, at least partially, repurposed to do the same – not just for convenience, ‘communication’, ‘disruption’, but to produce beauty, wonder. When our machines, our AI, can achieve some of the pure wonder of the Draughtsman Writer, will we be able to say the Digital/AI revolution has matured, been absorbed into our human fabric (instead of threatening to run amok, as it is now)?

Back To Blog

Introduction to Chat GPT for High School Teachers

Since the release of Chat GPT3 in late November 2022, I’ve been wondering not just what impact it will have on everything from the job market to our media environment, but also how it can be used, both by myself and by people I know. Since many of the people I know are teachers, including members of my family, education is as good a place as any to start.

What is Chat GPT?

Well, let’s ask Chat GPT itself:

ChatGPT is an advanced Artificial Intelligence (AI) model that uses Natural Language Processing (NLP) technology to understand and generate human-like responses to written text inputs. It is designed to respond to any input text by generating appropriate and contextually relevant responses, just like a human would in a conversation.

The technology behind ChatGPT is based on the GPT (Generative Pre-trained Transformer) architecture, which is a deep learning model that has been trained on massive amounts of text data from the internet. The GPT model learns to recognize patterns and relationships between words and phrases in the data, enabling it to generate coherent and contextually relevant responses to text inputs.

To create ChatGPT, OpenAI has trained the GPT architecture on an enormous amount of text data, including books, articles, and web pages, to ensure that it has a broad understanding of human language. This means that when you input text into ChatGPT, it can draw on its vast knowledge to generate a response that is both accurate and relevant to the input.

Open AI, the company behind Chat GPT, collected data from across the internet and spent a year and a half training its incredibly powerful model to match patterns in human speech. Unfortunately, since it took a year and a half to train its LLM, Chat GPT’s knowledge stops, well, a year and a half ago. It CAN be coaxed, through careful prompting, into retrieving more recent information, but for completely up-to-date information you’ll have to look elsewhere. This is NOT true of Bing’s Chat, which can access the internet (and provides sources, which Chat GPT does not). Despite this, I generally find Chat GPT a more powerful tool than either Bing or the recently released Google Bard.

When Chat GPT3 was first released, the immediate fear was that the technology would be used (or mis-used) not by teachers, but by students, since a simple prompt can solve problem sets and even generate whole essays. The style may be, well, robotic and sometimes flat-out wrong (GPT3 was known to ‘hallucinate’ – that is, to generate random nonsense when it couldn’t find an answer) but the fear was that many students would give up writing their own essays and just generate all their school assignments via AI. While this has happened, right from the beginning many teachers recognized its great potential as a learning tool:

Don’t Ban ChatGPT in Schools. Teach With It – NYT

To sign up for Chat GPT3, go to https://chat.openai.com/chat. GPT3 is free and only requires an email address. Note that during the busiest times, availability may become limited, or drop out altogether. For that reason, when it became available a few weeks ago, I signed up for GPT Plus, which allows not only 24-hour access but also immediate access to the recently released GPT4, which is a considerable improvement on GPT3, with considerably higher accuracy (especially in mathematics and programming) and more in-depth answers to general questions. For me, it’s worth paying $20 per month, but many will be satisfied with the free version.

So . . . how can teachers use Chat GPT to assist them in the classroom? Since I’m not a teacher, I turned once again to my new best friend Chat GPT4:

  1. Personalized Learning Experiences: Chat GPT can be an invaluable tool for creating personalized learning experiences for students. By inputting a student’s individual learning needs and preferences, teachers can use Chat GPT to generate tailored educational materials, including lesson plans, quizzes, and project ideas. This level of customization ensures that each student receives an education that is specifically catered to their unique strengths and weaknesses.
  2. Supplementing Lesson Plans: Teachers can use Chat GPT to supplement lesson plans by generating additional content and resources on a given topic. By providing a brief summary or key points of a lesson, Chat GPT can generate detailed explanations, examples, and practice problems. This can save teachers time and help them expand on topics that may be difficult for students to grasp fully.
  3. Real-Time Feedback and Grading Assistance: Chat GPT can be used to provide real-time feedback to students on their assignments and quizzes. By inputting a student’s response to a question or a written assignment, Chat GPT can generate feedback and suggestions for improvement. This immediate feedback can help students learn more effectively and allow teachers to focus on providing individualized attention and guidance.
  4. Tutoring and Homework Assistance: Outside of the classroom, Chat GPT can be utilized as a tutoring tool, offering guidance and assistance to students as they work through assignments or prepare for exams. By answering questions and providing explanations, Chat GPT can be a helpful resource for students who require additional support or clarification on a particular topic.
  5. Enhancing Classroom Discussions and Debates: Chat GPT can help facilitate engaging classroom discussions and debates by providing diverse perspectives on a given subject matter. Teachers can input discussion prompts or debate topics into Chat GPT, and the AI will generate thought-provoking arguments and counterarguments, encouraging students to think critically and engage in meaningful conversations.
  6. Encouraging Creativity and Originality: Chat GPT can be used to spark creativity and originality in student projects and writing assignments. By providing a prompt or topic, Chat GPT can generate a wide range of ideas and examples that students can draw inspiration from. This can help students break free from conventional thinking and explore new ideas and perspectives.

These are just a few examples. I could prompt again, and generate many more.

For teachers who do want to use Chat GPT, it’s important to have a basic idea of how it works. I might write a whole blog post about this, because it’s an important and surprisingly nuanced skill, but for now, it’s important to understand that because the LLM recognizes speech patterns, how you ask it a question is very important. For the above, I wrote:

Can you write me a blog post describing how high school teachers can use Chat GPT?

I could have written a much more detailed query and received a much more detailed response. It’s also important to remember that, unlike a search engine, Chat GPT remembers ‘conversations’, so the process is iterative: you ask one question, then another, and Chat GPT will not only ‘remember’ the original question but also the answer it generated. Thus you can ask one question and drill down on the answer to get more detail, or request an expanded answer, and Chat GPT will, in a sense, be trained on the ‘conversation’ and tailor its responses appropriately. For example, if I then ask:

Can you tell me more about ‘Real-Time Feedback and Grading Assistance’?

I’ll receive an even more detailed answer (I’ll just include the titles – Chat GPT can be verbose):

The use of Chat GPT for real-time feedback and grading assistance presents an opportunity for teachers to enhance the learning experience for their students while also reducing their workload. By harnessing the power of artificial intelligence, teachers can offer students immediate and personalized feedback on their work, helping them to learn more effectively and efficiently. Here’s a deeper look into how Chat GPT can provide real-time feedback and grading assistance:

  1. Instantaneous Feedback
  2. Consistent Evaluation Criteria
  3. Personalized Feedback for Individual Students
  4. Reducing Teacher Workload
  5. Continuous Learning and Improvement

It’s important to note that while Chat GPT can provide valuable real-time feedback and grading assistance, it should not completely replace the role of the teacher. Human interaction, empathy, and understanding are still essential components of a successful learning environment. Teachers should continue to monitor the feedback provided by Chat GPT to ensure its accuracy and relevance and supplement it with their insights and guidance as needed.

In this video Ted Pickett, himself a teacher, outlines the basics of creating a good prompt. He follows a four-step rule put out by an organization called ‘AI In Education’:

Use ARTS to help you craft efficient prompts:

Audience: “This blog is from educators around the globe.”

Role: “I want you to act as a title generator for the blog posts I am writing.”

Target: “I will give you the topic and keywords from the article and you create five engaging titles.”

Specific: “Keep the titles concise. Less than 15 words.”
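Put together (my own assembly of the four pieces above, not a quote from the video), the full prompt would read something like:

This blog is from educators around the globe. I want you to act as a title generator for the blog posts I am writing. I will give you the topic and keywords from the article and you create five engaging titles. Keep the titles concise. Less than 15 words.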

This is the whole video: From Ted Pickett’s ‘AI for the Classroom’ channel.


It’s also important to remember that the same prompt can produce somewhat different answers. Since Chat GPT relies on pattern recognition, it will produce different answers for different users, and sometimes even variations on an answer for the same user with the same prompt. It also should not be seen as a replacement for human research: since it relies on information from the web, it can get things wrong (the so-called hallucinations).

Another useful Chat GPT function is its ability to summarize. Download the transcript of a video and Chat GPT will provide a summary. Chat GPT provided the following summary of the video below:

The video presents five ways teachers can use ChatGPT to enhance their teaching:

  1. creating lesson sequences with student discussion questions,
  2. designing well-being lessons,
  3. providing feedback to students,
  4. generating student reports,
  5. crafting song lyrics for young learners.

The speaker emphasizes that AI tools like ChatGPT can help educators focus on the process and stages of student learning, rather than just the end product.

From Liam Bassett’s YT channel

What’s truly amazing is the speed at which this technology is evolving. Just a couple of months after the initial release of Chat GPT3 came Chat GPT4, a significant improvement in both the accuracy and depth of its responses. Then Microsoft included a somewhat dumbed-down version of Chat GPT4 in its Bing search engine. Hundreds, even thousands, of apps built on the GPT API are being released weekly. And soon, Open AI will allow the use of plugins, which could revolutionize the technology even further, allowing for the customization of the core technology in every sphere imaginable, including (probably especially) education. Both Duolingo and Khan Academy have become early adopters (though Khan Academy is still in the testing phase – you can sign up for the waitlist of testers – I imagine teachers will get priority), using the chatbot as a sort of virtual tutor for their students.

What does the future hold? As with all technological change, it’s hard to know where this will end up, whether it will be a net benefit or loss. I think AI’s potential to help educators is very considerable indeed, but so is its capacity for misuse. For the time being, I think it’s up to everyone to learn how to use this properly – and to learn how to use it for good.

This is a vast and fast-growing field, so I’ll be posting more on the subject.