Turns out that when large language models (“LLMs”) get larger, they get better at certain tasks and worse at others.
Researchers in a group called BigScience found that feeding LLMs more data made them better at solving difficult questions – likely those that drew on that greater trove of data and the commensurate prior learning – but at the cost of delivering reliably accurate answers to simpler ones.
The chatbots also got more reckless in their willingness to tee up those potentially wrong answers.
I can’t help but think of an otherwise smart human friend who gets more philosophically broad and sloppily stupid after a few cocktails.
The scientists can’t explain the cause of this degraded chatbot performance, as the machinations of ever more complex LLMs make such cause-and-effect assessments more inscrutable. They suspect it has something to do with user variables like query structure (wording, length, order), or maybe with how the results themselves are evaluated, as if a looser definition of accuracy or truth would improve our satisfaction with the outcomes.
The happyspeak technical term for such gyrations is “reliability fluctuations.”
So, don’t worry about the drunken friend’s reasoning…just smile at the entertaining outbursts and shrug at the blather. Take it all in with a grain of salt.
This sure seems to challenge the merits of gigantic, all-seeing, all-knowing AIs that will make difficult decisions for us.
It also raises questions about why the leading tech toffs are forever searching for more data to vacuum into their ever-bigger LLMs. There’s a mad dash to achieve artificial general intelligence (“AGI”) because it’s assumed there’s some point of hugeness and complexity that will yield a computer that thinks and responds like a human being.
Now we know that the faux person might be a loud drunk.
There’s a contrarian school of thought in AI research and development that suggests smaller is better: a model built for a simplified and shortened list of tasks needs less data, uses less energy, and spits out far more reliable results.
Your smart thermostat doesn’t need to contemplate Nietzsche; it just needs to sense and respond to the temperature. It’s also less likely to decide one day that it wants to annihilate life on the planet.
We already have this sort of AI distributed in devices and processes across our work and personal lives. Imagine if development were focused on making these smaller models smarter, faster, and more efficient, or on finding new ways to clarify and synthesize tasks so that connected AIs could find big answers by asking ever-smaller questions.
Humanity doesn’t need AGI or ever more garrulous chatbots to solve even our most seemingly intractable problems.
We know the answers to things like slowing or reversing climate change, for instance, but we just don’t like them. Our problems are social, political, economic, psychological…not really technological.
And the research coming from BigScience suggests that we’d need to take any counsel from an AI on the subject with that grain of salt anyway.
We should just order another cocktail.