As language models enter high-stakes domains such as law, medicine, and journalism, their capacity to distinguish belief from knowledge becomes paramount. An evaluation of 24 state-of-the-art models on KaBLE, a benchmark spanning epistemic reasoning tasks, reveals systematic failures when models process first-person false beliefs; even advanced models sometimes underperform dramatically. Overall, models exhibit a pronounced attribution bias, handling third-party beliefs far more accurately than beliefs stated by the user. They also apply the factual nature of knowledge inconsistently and show striking sensitivity to minor linguistic variations. These limitations carry immediate implications for legal practice, witness testimony analysis, evidence assessment, and the responsible governance of AI systems in judicial contexts.
Registration is mandatory but free of charge. Please register here by 3 December 2025.
We look forward to seeing many of you there!
Location:
SEM 10, Juridicum, Schottenbastei 10-16, 1010 Vienna
