A Definition of Good Explanations and the Challenges Explaining LLM Outputs

Agents & Inference | arXiv

Both summaries are AI-generated.

Summary A

Researchers propose a new definition of "good explanations" for AI outputs, emphasizing counterfactual reasoning and the role of an individual's prior beliefs. The study highlights why explaining large language model (LLM) outputs remains particularly challenging despite the growing need for AI transparency. The findings aim to improve explainability in AI systems to support broader adoption.

Summary B

Researchers propose a definition of a good explanation that draws on counterfactual reasoning while also accounting for the listener’s prior beliefs. They argue this framework has important implications for AI explainability and helps clarify why producing satisfying explanations for large language model outputs is especially difficult.
