AI is weird
LLMs are unlike any other intelligence we've ever known
AI is strange. I can feed Bing a picture, and it can tell me where the picture was taken based on the landscape. ChatGPT can solve my son's A-level maths questions (useful for, err, checking answers). Udio can produce genuinely catchy music. A combination of ChatGPT, Copilot image creator, and Midjourney helped me create my own "Mr. Men" book - a unique birthday present for my wife (although I’m not sure she saw it that way…).
Yet, until recently, models couldn't even count the number of r's in "strawberry". Even o1 struggles with problems that require a basic understanding of the physical world. As humans, we delight in pointing out the simple (to us) things AI can't do while glossing over the remarkable feats it achieves.
AI is unlike any form of intelligence we've previously encountered. What it lacks in reasoning, it makes up for with sheer scale of knowledge. GPT-3, the last frontier model for which training data details were published, was trained on around 300 billion tokens - call it 300 billion words.
Let's put that into context. The average human reads about 200 words per minute. Imagine you start reading at age 5 and live until 85 - a generous lifespan by almost any standard. That gives you 80 years of reading, or roughly 80 * 365 * 24 * 60 = 42 million minutes - about 8.4 billion words. But we need to sleep; knock off eight hours a night and the number drops to 5.6 billion.
So 5.6 billion is the upper limit on how many words a human can read in a lifetime. We can't exceed it, and in reality none of us gets close. We don't just read - we eat, we play, we relax, we work. Various studies reckon we read, on average, somewhere between 30 and 60 minutes per day. Take the upper end and you get 80 * 365 * 60 * 200 = 350 million words.
But I nearly forgot! We don't remember everything we read. Back in the 1880s, psychologist Hermann Ebbinghaus introduced the concept of the forgetting curve - the loss of memory with time. The curve shows a rapid decline in memory retention shortly after learning, levelling off over time. After about 30 days, only 20% of new information is retained. Applying this, we’d remember roughly 70 million of those 350 million words.
And, of course, we need to be alive to use our knowledge - ideally around the midpoint of our lives, by which point we've read only half of that total. That brings us to 35 million words. Just 35 million. GPT-3 was trained on 300 billion: roughly 10,000 times more words than we can hope to retain. Four orders of magnitude.
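The whole back-of-the-envelope chain fits in a few lines of Python (the reading speed, retention figure, and lifespan are the article's own assumptions):

```python
WORDS_PER_MINUTE = 200   # average adult reading speed
YEARS_READING = 80       # reading from age 5 to age 85

# Theoretical ceiling: reading every waking minute (16 hours a day)
ceiling = YEARS_READING * 365 * 16 * 60 * WORDS_PER_MINUTE
print(f"Ceiling: {ceiling / 1e9:.1f} billion words")      # Ceiling: 5.6 billion words

# Realistic pace: about an hour of reading per day
realistic = YEARS_READING * 365 * 60 * WORDS_PER_MINUTE
print(f"Realistic: {realistic / 1e6:.0f} million words")  # Realistic: 350 million words

# Ebbinghaus forgetting curve: ~20% retained after a month,
# and by mid-life we've read only half of our lifetime total
usable = realistic * 0.20 / 2
print(f"Retained at mid-life: {usable / 1e6:.0f} million words")  # 35 million

GPT3_WORDS = 300e9
print(f"GPT-3 ratio: {GPT3_WORDS / usable:,.0f}x")        # GPT-3 ratio: 8,562x
```

The exact ratio comes out a touch under the round 10,000, but it's the same order of magnitude.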
With the commercialization of AI, frontier labs no longer publish detailed training information. But reasonable estimates put the latest models' training data at around ten trillion words. Thirty times more than GPT-3. Five orders of magnitude more than you or I or any human can hope to retain.
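Using the 35-million-word retention figure from above, the gap for an estimated ten-trillion-word training set works out like this:

```python
import math

TRAINING_WORDS = 10e12        # estimated training data for recent models
RETAINED_PER_LIFETIME = 35e6  # words a person retains, per the estimate above

ratio = TRAINING_WORDS / RETAINED_PER_LIFETIME
print(f"{ratio:,.0f}x - about {math.log10(ratio):.1f} orders of magnitude")
# 285,714x - about 5.5 orders of magnitude
```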
Put another way, reading an hour a day, you'd need more than two million years to get through that much. Two million years ago, Homo erectus was only just appearing, and our ancestors had yet to leave Africa. Even if I could live that long, I'd hate to think how wrinkly I'd be.
AI models aren’t like any form of intelligence we’ve ever known. They possess knowledge on a scale that dwarfs human capacity. They have the potential to spot connections across disciplines and unearth insights that humans just can’t. These systems challenge our understanding not because they resemble us, but because they are so fundamentally different. They are genuinely alien creations that we have little hope of ever understanding. We live in interesting times!
