The fifty-one percent coin
When I was in college, before the LLM hype, I took a course on computational neuroscience. It taught us about parallels between the human brain and computer algorithms. For example, it turns out that some of our computer models for visual object recognition have neural parallels.
Metaphors of the brain
However, one of the early readings (Brain Metaphor and Brain Theory, Daugman, 2001) was a warning that stuck with me: Humanity has always compared the brain to its most advanced technology.
- Ancient Greeks (5th century BC) - Hydraulic systems
  - Compared the brain to water technology (fountains, pumps, water clocks)
  - The four humors flowing through the body like water through pipes
- Enlightenment Era (17th-18th century) - Clockwork mechanisms
  - Hobbes and Descartes saw the brain as intricate clockwork
  - La Mettrie’s “L’Homme machine” - sophisticated mechanical automata
- Industrial Revolution (18th-19th century) - Steam engines
- Telegraph Era (19th century) - Electrical communication systems
  - Helmholtz used telegraph metaphors for neural communication
  - Neurons as electrical relays and circuits
  - Hebb’s reverberating circuits underlying memory
- Modern era (mid-20th century) - Electronic computers
  - McCulloch & Pitts, von Neumann, Turing - the brain as a computing machine
- Today - Parallel processing and neural networks
  - The brain as a massively parallel distributed processing system
  - Neural networks and connectionist models
  - Internet-like interconnected systems
All of these have their merits (and some may even be partially correct!), but as Daugman notes, we’re “too easily imprinted with the spectacle of the day”. Each generation thinks it has finally found the right metaphor, when it’s really just reflecting its most impressive contemporary technology.
LLMs are probably not how the brain works
Certainly, I think we are closer to understanding how we process language than we have been in the past. However, I also believe that history rhymes, and that our current understanding is lacking, just as all the previous attempts were.
We already have good evidence that we may be missing some key things:
- Humans generalize better than LLMs
- Humans exhibit better reasoning than LLMs
- Humans can learn new things faster than LLMs (without backprop)
So long as it’s 51%, it doesn’t matter
For a long time, I got hung up on that gap between humans and LLMs.
But here’s what I’ve realized: it might not matter if we’re wrong again.
To be reductionist: if humans are a coin that gives me the right answer to a question 80% of the time, and LLMs are a coin that only gives me the right answer 51% of the time, then it seems like LLMs can’t be generally intelligent.
What I missed is that brute force can cover for intelligence. So long as the coin is biased toward the right answer, even slightly, I can flip it more times and take the majority result to get to 80%.
For example, GPT-4 might only be 60% accurate on reasoning tasks, but if we run it through multiple reasoning chains and aggregate the answers (spending more compute at inference time, the way o1-style models do), we can boost that to 80%.
Hell, if the coin is only 50.00001% accurate, we can build massive data centers in the desert to flip the coin trillions of times. Because that may literally be cheaper than developing a better coin.
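To make the arithmetic concrete, here is a minimal sketch of the majority-vote version of this argument. It assumes each flip is an independent sample with a binary right/wrong outcome (real model samples aren’t truly independent, so treat it as a best case), and it reuses the illustrative 60% and 51% figures from above rather than any measured accuracy. It leans on scipy for the exact binomial tail.

```python
from scipy.stats import binom

def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that the majority of n independent flips of a coin that is
    right with probability p lands on the right answer (n assumed odd)."""
    return binom.sf(n // 2, n, p)  # P(more than half of the flips are correct)

def flips_needed(p: float, target: float = 0.80) -> int:
    """Smallest odd n at which majority voting reaches the target accuracy."""
    n = 1
    while majority_vote_accuracy(p, n) < target:
        n += 2
    return n

print(flips_needed(0.60))  # the 60% coin: fewer than two dozen flips
print(flips_needed(0.51))  # the 51% coin: on the order of a couple thousand flips
```

The number of flips you need grows roughly with the inverse square of your edge over 50%, so a 50.00001% coin pushes the count into the trillions, which is exactly the desert-data-center scenario above.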
And this is true whether it’s a coin, a 6-sided die, or a deck of cards with every possible next word in the English language. So long as the results are distributed non-randomly for the general class of problems you’ll ask the coin about, the gap doesn’t matter.
Generalizing intelligence
This applies to intelligence broadly (whatever that may be): whether it’s memory, reasoning, or generalization, it doesn’t matter whether the LLM is as good as a human, so long as we can scale those expressed behaviors through brute force.
That radically lowers the bar for what we need on the quest to AGI: we don’t need a system for memory that is as good as what humans have; we just need one that’s good enough. Specifically, one that:
- Gives us non-random distributions over the results
- For general problems in the domain we want to use it for
So long as those two are true, scale is all you need.
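To see those two conditions in action beyond a binary coin, here is a rough simulation of the deck-of-cards case from earlier: many possible answers, none anywhere near a majority, but the correct one is still the single most likely outcome. The five options and their probabilities are invented purely for illustration, and the code only uses the Python standard library.

```python
import random
from collections import Counter

# Hypothetical answer distribution: the right answer ("A") comes up only 30%
# of the time, but it is still the mode of the distribution.
ANSWERS = ["A", "B", "C", "D", "E"]
PROBS   = [0.30, 0.25, 0.20, 0.15, 0.10]

def plurality_answer(n_samples: int) -> str:
    """Draw n_samples answers and return whichever one appears most often."""
    draws = random.choices(ANSWERS, weights=PROBS, k=n_samples)
    return Counter(draws).most_common(1)[0][0]

# Estimate how often the plurality vote recovers "A" at different sample sizes.
for n in (1, 10, 100, 1000):
    trials = 2000
    hits = sum(plurality_answer(n) == "A" for _ in range(trials))
    print(f"{n:>4} samples per question -> right {hits / trials:.0%} of the time")
```

This is the same basic move as self-consistency-style sampling for LLM reasoning: sample many chains, keep the most common final answer. The per-sample accuracy never changes; only the amount of brute force does.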