LLM Round-up

When I asked ChatGPT what type of AI it is, this is what it said:

ChatGPT is a type of language model, specifically a variant of a transformer model called the Generative Pretrained Transformer (GPT). It is an example of artificial general intelligence (AGI), which means that it is capable of performing a wide range of natural language processing tasks, such as text generation, summarization, translation, and question answering, among others.

I know some of these words 🤦‍♂️.

I’ll never be an AI researcher, but I do want to have a laymen’s understanding of what the tool is capable of and how it works. I’ve been searching for educational resources that lie somewhere in the sweet spot of accessibility and depth. I want to understand the basic terminology, like what a transformer is, and also understand a large language model’s shortcomings. And I want to understand where a language model fits into the broader landscape of AI.

To that end, I’m compiling a few resources that seem to be well regarded. I’ll treat this post as a digital bookshelf, and will update it with my thoughts and notes as I work through the material.

We’ll start with 35,000 words from Stephen Wolfram as a primer, then move into Andrej Karpathy’s Zero to Hero lecture series.

Stephen Wolfram: What Is ChatGPT Doing … and Why Does It Work? (2023-02-14)
Andrej Karpathy: Neural Networks: Zero to Hero
OpenAI Documentation
Rodrigo Girão Serrão: Your first recurrent neural network (RNN)

2023-03-02

Technology