LLM Round-up
When I asked ChatGPT what type of AI it is, this is what it said:
ChatGPT is a type of language model, specifically a variant of a transformer model called the Generative Pretrained Transformer (GPT). It is an example of artificial general intelligence (AGI), which means that it is capable of performing a wide range of natural language processing tasks, such as text generation, summarization, translation, and question answering, among others.
I know some of these words 🤦‍♂️.
I’ll never be an AI researcher, but I do want to have a layman’s understanding of what the tool is capable of and how it works. I’ve been searching for educational resources that lie somewhere in the sweet spot of accessibility and depth. I want to understand the basic terminology, like what a transformer is, and also understand a large language model’s shortcomings. And I want to understand where a language model fits into the broader landscape of AI.
To that end, I’m compiling a few resources that seem to be well regarded. I’ll treat this post as a digital bookshelf, and will update it with my thoughts and notes as I work through the material.
We’ll start with 35,000 words from Stephen Wolfram as a primer, then move into Andrej Karpathy’s Zero to Hero lecture series.
- Stephen Wolfram: What Is ChatGPT Doing … and Why Does It Work? (2023-02-14)
- Andrej Karpathy: Neural Networks: Zero to Hero
- OpenAI Documentation
- Rodrigo Girão Serrão: Your first recurrent neural network (RNN)