Given the recent explosion of large language models (LLMs) that can make convincingly human-like statements, it makes sense that there's been a deepened focus on developing models that can explain how they make decisions. But how can we be sure that what they're saying is the truth?
In a new paper, researchers from Microsoft and MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) propose a novel method for measuring LLM explanations with respect to their "faithfulness"—that is, how accurately an explanation represents the reasoning process behind the model's answer.