Your grade school teacher probably didn’t show you how to add 20-digit numbers. But if you know how to add smaller numbers, all you need is paper and pencil and a bit of patience. Start with the ones place and work leftward step by step, and soon you’ll be stacking up quintillions with ease.
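That pencil-and-paper procedure is itself an algorithm, and it translates directly into code. Here is a minimal Python sketch of the grade-school method, working from the ones place leftward and carrying as needed (the function name and decimal-string representation are illustrative choices, not from any particular source):

```python
def add_by_hand(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal strings,
    one digit at a time from the ones place, carrying leftward --
    the same procedure taught in grade school."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        d = carry
        if i >= 0:
            d += int(a[i])
            i -= 1
        if j >= 0:
            d += int(b[j])
            j -= 1
        result.append(str(d % 10))  # write down the ones digit
        carry = d // 10             # carry the tens digit leftward
    return "".join(reversed(result))

print(add_by_hand("98765432109876543210", "12345678901234567890"))
# 111111111011111111100
```

Each pass through the loop handles one column of the sum, so 20-digit numbers take only 20 small steps of the kind anyone can do.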
Problems like this are easy for humans, but only if we approach them in the right way. “How we humans solve these problems is not ‘stare at it and then write down the answer,’” said Eran Malach, a machine learning researcher at Harvard University. “We actually walk through the steps.”
That insight has inspired researchers studying the large language models that power chatbots like ChatGPT. While these systems might ace questions involving a few steps of arithmetic, they’ll often flub problems involving many steps, like calculating the sum of two large numbers. But in 2022, a team of Google researchers showed that asking language models to generate step-by-step solutions enabled the models to solve problems that had previously seemed beyond their reach. Their technique, called chain-of-thought prompting, soon became widespread, even as researchers struggled to understand what makes it work.
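In practice, the technique can be as simple as putting a worked, step-by-step example into the prompt so the model imitates the reasoning rather than guessing the answer outright. Below is a hedged Python sketch of that idea; `query_model` is a hypothetical stand-in for whatever language-model API you use, not a call from any real library:

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a real language-model API call."""
    raise NotImplementedError("wire this up to your model of choice")

# A worked example showing the intermediate steps we want the model
# to imitate, rather than just a bare final answer.
EXEMPLAR = (
    "Q: What is 57 + 68?\n"
    "A: Add the ones: 7 + 8 = 15, so write 5 and carry 1. "
    "Add the tens: 5 + 6 + 1 = 12. The answer is 125.\n\n"
)

def chain_of_thought(question: str) -> str:
    # Prepending the step-by-step exemplar nudges the model to spell
    # out its own intermediate steps before stating an answer.
    return query_model(EXEMPLAR + f"Q: {question}\nA:")
```

The contrast with a bare prompt like `f"Q: {question}\nA:"` is the whole trick: the exemplar's visible reasoning steps are what let the model break a many-step problem into pieces it can handle.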
Now, several teams have explored the power of chain-of-thought reasoning by using techniques from an arcane branch of theoretical computer science called computational complexity theory. It’s the latest chapter in a line of research that uses complexity theory to study the intrinsic capabilities and limitations of language models. These efforts clarify where we should expect models to fail, and they might point toward new approaches to building them.