Learn (Basic) Natural Language Processing (NLP) in 6 easy steps

Recently, I started the ‘Intro To NLP’ course on the 365 Data Science platform. I’ve noticed the platform scrambling in recent months to introduce AI content after focusing almost exclusively on Data Science topics. Since the two are related, this isn’t a bad thing, and I’m glad to see it bringing in more high-quality courses focusing exclusively on AI/LLM topics. I’m looking forward to seeing more.

The Intro to NLP course consists of 7 sections of lessons, with a Practical Task at the end of each section, then a section devoted to a whole project, finishing up with a ‘Future of NLP’ section before the Final Exam. So far it’s been a good course about a subject I knew almost nothing about before I started, but which is central to LLMs like ChatGPT and to AI generally.

What is Natural Language Processing, or NLP? NLP is a technology that allows computers to understand, interpret and respond to human languages, both written and spoken. As it turns out, it has an astonishing number of uses, and was widely used in ways I never even thought about, even before the arrival of the LLMs:

  • Search engines: when we type a query into a search engine, NLP algorithms interpret our intent, understand the context, and deliver relevant search results. This involves processing the query, identifying key terms, and even understanding the context of our question.
  • Voice Assistants and Smart Home Devices: Devices like Amazon’s Alexa, Google Assistant, and Apple’s Siri use NLP to understand spoken commands. They can interpret our requests, respond to queries, control smart home devices, and even engage in casual conversation. Sort of.
  • Text Autocorrect and Predictive Text: The autocorrect and predictive text features on smartphones and other devices use NLP to understand the context of what we are typing, correct spelling errors, and predict the next word we might type. Sometimes this works, sometimes it doesn’t. I’ve sent many garbled texts because of Autocorrect.
  • Language Translation Services: Online translation tools like Google Translate utilize NLP to convert text or spoken words from one language to another, understanding grammatical nuances and context to provide accurate translations.
  • Chatbots and Customer Service: Many websites and customer service platforms employ chatbots that use NLP to understand and respond to customer inquiries. These bots can handle a range of tasks from answering FAQs to helping with online shopping or troubleshooting. Next time you’re frustrated by some generic response from a chatbot and wish you were talking to an actual human – thank NLP.
  • Email Filtering: Email services use NLP to filter out spam or categorize emails into different folders (like social, promotions, primary). This is done by analyzing the content of the emails and identifying certain patterns or keywords. This has gotten dramatically better in recent years.
  • Social Media Feeds: NLP algorithms help in personalizing our social media feeds. They analyze our interactions, the content we engage with, and use this data to curate a feed that is supposedly tailored to our interests. I would say they mostly ruin them, but you get the idea.
  • Sentiment Analysis: Businesses use NLP for sentiment analysis to gauge public opinion about their products or services. By analyzing social media posts, reviews, and comments, they can understand customer satisfaction and general sentiment.
  • Content Recommendations: Streaming services like Netflix or Spotify use NLP to recommend movies, shows, or music based on our previous viewing or listening habits, search history, and preferences. I’ve found these to be mostly . . . if not quite useless, close to it.
  • Accessibility Tools: NLP aids in creating tools for individuals with disabilities, such as text-to-speech and speech-to-text applications, which allow users with visual or hearing impairments to interact with technology more effectively. Potentially very useful.
  • Educational Tools: NLP is used in educational software to aid in language learning, provide automated grading of essays, and even give feedback on writing style and grammar.
  • Resume Screening: In the hiring process, NLP is used to screen resumes and applications to identify the most suitable candidates by matching job requirements with the skills and experiences listed in the resumes. Another instance where we’d probably be better off without this. Ever wonder why you can’t get an interview? Thank some unknowable NLP algorithm.
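To make one of these uses concrete, here is a minimal sketch of the simplest possible kind of sentiment analysis: counting words from a positive list and a negative list. The word lists and the function names `sentiment_score` and `label` are made up for illustration; real systems are far more sophisticated than this, but the basic idea of turning text into a score is the same.

```python
import re

# Hypothetical, hand-picked word lists, for illustration only.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "useless"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def label(text: str) -> str:
    """Turn the numeric score into a rough label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in ["I love this, it is great",
                   "Terrible and useless",
                   "It arrived on Tuesday"]:
        print(review, "->", label(review))
```

A word counter like this has no sense of context (it would score ‘not good’ as positive), which is exactly the gap that more advanced NLP techniques try to close.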

But with the arrival of ChatGPT and the other LLMs a year ago, NLP really came into its own. Suddenly we were able to engage with a machine in a way eerily similar to how we interact with human beings. We can argue whether this is, long term, a good thing, but seen purely as a technology, ChatGPT is amazing, even with all its flaws: the hallucinations, the errors, the dubious practice of scraping content without any permission at all, and so on. ChatGPT, and the other LLMs struggling to catch up, really are a leap ahead on a technological scale, and NLP made them possible.

How is NLP used in the new Large Language Models?

  • Understanding Language: ChatGPT uses NLP to grasp the nuances of human language. When you type a sentence, NLP helps the model understand not just the words, but also the meaning and context behind them. This understanding is crucial for generating relevant and coherent responses.
  • Generating Text: Once ChatGPT understands our input, it uses its knowledge gained from NLP to construct a reply. NLP guides it in forming sentences that are not only grammatically correct but also contextually appropriate, maintaining a flow that resembles natural human conversation.
  • Learning from Large Datasets: ChatGPT has been trained on a vast array of text data. NLP is used to process and learn from this data, enabling the model to recognize patterns, understand various topics, and even mimic different writing styles.
  • Handling Different Tasks: Whether it’s answering questions, writing essays, or even creating ‘poetry’, ChatGPT uses NLP to tailor its responses to the specific task at hand. NLP provides the flexibility to switch between different types of language use, from formal to casual, technical to creative.
  • Continuous Learning: ChatGPT does not rewrite itself in the middle of a conversation, but its makers periodically retrain and fine-tune it, drawing in part on feedback from earlier interactions. NLP is key in this learning process, helping the model to adapt and improve over time based on new interactions and data.
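As a rough feel for what ‘learning patterns from text’ and ‘generating text’ can mean, here is a toy next-word predictor that simply counts which word tends to follow which. This is only a sketch under heavy assumptions: ChatGPT does not work by counting word pairs, and the names `train_bigrams` and `predict_next` and the tiny corpus are invented for this example. Still, the core task, predicting the next word from what came before, is similar in spirit.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following: dict, word: str):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat slept"
    model = train_bigrams(corpus)
    print(predict_next(model, "the"))
    print(predict_next(model, "dog"))
```

With more text to learn from, the counts become a better guide to what usually comes next; an LLM replaces the simple counts with a very large neural network trained on vastly more data.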

In short, NLP is the ‘brain’ behind our LLM’s ability to communicate effectively with humans, using human language. It is what allows the model to understand our questions and respond in a way that is informative, engaging and eerily human-like.

So now we’re going to learn the basics of NLP and how to use it. First stop: Text Preparation.
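As a preview, text preparation usually means cleaning raw text into a list of useful words before any analysis happens. The sketch below shows three common steps (lowercasing, splitting into tokens, and dropping very common ‘stop words’). The function name `prepare_text` and the tiny stop-word list are assumptions for illustration, and the exact steps the course teaches may differ.

```python
import re

# A tiny, hypothetical stop-word list; real projects use much longer ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on"}


def prepare_text(text: str) -> list:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    lowered = text.lower()
    # Keep runs of letters and digits; punctuation is discarded.
    tokens = re.findall(r"[a-z0-9]+", lowered)
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(prepare_text("The cat is on the mat!"))
```

Stripping the text down this way throws information away on purpose, so that later steps can focus on the words that carry the most meaning.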

The Draughtsman Writer

A couple of years ago I saw The Draughtsman Writer at a larger exhibit at the Met, Technology In the Age of the Court. Most of the exhibits were clever, sometimes even dazzling: many immensely complicated clockworks, constructed with gold and other precious minerals, but nothing truly blew me away until the exhibits at the show’s end, in particular ‘The Draughtsman Writer’.

Along with our Draughtsman was his more famous cousin The Chess Player (sometimes called ‘The Turk’ because of his garb), a replica of the original automaton built by Wolfgang von Kempelen in 1769. The original ‘played’ chess with prominent figures across Europe and was eventually revealed as a fraud, manned by skilled, even famous chess players who were hidden from view yet operated the Chess Player’s mechanical arms (I’m not entirely clear how they did this).

The Draughtsman Writer, however, needs no human intervention, except perhaps for a key to wind it up so that the profoundly intricate gears, hidden in the Draughtsman’s desk, can begin turning, and then a human hand to place the paper that the Draughtsman will fill with his exquisitely intricate drawings and poems. From the exhibit notes:

Maillardet hid the mechanics of the Draughtsman Writer in a cabinet rather than the figure. This allowed for larger machinery and greater memory than in earlier efforts . . . an unprecedented three poems and four drawings are drawn by the figure, through a technology that foretold the computer.

Incredibly, when Philadelphia’s Franklin Institute received the automaton in 1928, it was so damaged by a fire in the warehouse where it had been stored that they had no idea it was an automaton. They knew it had some mechanical function but, since it was in pieces, they had no idea what that function was. They didn’t even know the name of the inventor.

I was instantly captivated by the Draughtsman Writer. In part it was the sight of an early robot. No uncanny valley here – the automaton is only half-formed (many of its panels, its ‘skin’, possibly lost in the fire), with a young man’s dummy head, yet none of the creepiness we associate with a ventriloquist’s dummy. This is a benign, contemplative figure, eyes focused downward on its task, its transparently mechanical arm composed of brass strips, an almost human hand holding a pen, tracing delicate lines across the page fitted into the Draughtsman’s desk. As one of the curators says:

Normally we think of robots moving in very mechanical ways, very jerky movements. This machine is by far the most elegant in its movements.

The Draughtsman can compose four different pictures, including drawings of a Chinese temple and a ship, and write three poems, one in English, two in French. The ‘hard drive’ for these movements is the stack of brass disks housed below the surface of the Draughtsman’s desk. The disks have hills and valleys on their surfaces, and a needle follows these grooves up and down, the collection of disks allowing for the most extensive mechanical memory of any known automaton.

The moving automaton was confined to a video next to the exhibit (as it was with all the other exhibits, presumably too old, too delicate for the repeat performances the exhibit would demand). Instead, in the actual exhibit, the Draughtsman peers straight ahead, eyes wide open, pen poised in its hand, waiting for the human intervention to fulfill its function and begin drawing and writing again.

The Draughtsman Writer at the Metropolitan Museum

I was struck, watching the video again, by the beauty of the calligraphy, the detail in each drawing. What a watchmaker Maillardet must have been, to so precisely record each groove in his brass disks, long before the plastic record album would perform the same function. What might Maillardet have done with a computer, with modern computer languages?

What also struck me is the essential frivolity of the Draughtsman Writer. Maillardet likely built it to impress the court, as a sort of calling card. But essentially our Draughtsman exists to produce art, no more, no less. Sometimes I wonder if our machines shouldn’t be, at least partially, repurposed to do the same – not just for convenience, ‘communication’, ‘disruption’ but to produce beauty, wonder. When our machines, our AI, can achieve some of the pure wonder of the Draughtsman Writer, will we be able to say the Digital/AI revolution has matured, been absorbed into our human fabric (instead of threatening to run amok as it is now)?