Imagine you had a friend who gave different answers to the same question, depending on how you asked it. “What’s the capital of Peru?” would get one answer, and “Is Lima the capital of Peru?” would get another. You’d probably be a little worried about your friend’s mental faculties, and you’d almost certainly find it hard to trust any answer they gave.
That’s exactly what’s happening with many large language models (LLMs), the ultra-powerful machine learning tools that power ChatGPT and other marvels of artificial intelligence. A generative question, which is open-ended, yields one answer, and a discriminative question, which involves having to choose between options, often yields a different one. “There is a disconnect when the same question is phrased differently,” said Athul Paul Jacob, a doctoral student at the Massachusetts Institute of Technology.
To make a language model’s answers more consistent — and make the model more reliable overall — Jacob and his colleagues devised a game where the model’s two modes are driven toward finding an answer they can agree on. Dubbed the consensus game, this simple procedure pits an LLM against itself, using the tools of game theory to improve the model’s accuracy and internal consistency.
To read more, click here.