Published: June 24, 2025
Name: 39th IFIP WG 11.3 Annual Conference on Data and Applications Security and Privacy, DBSec 2025
Dates: 06/23/2025 - 06/24/2025
Location: Gjøvik, Norway
Citation: Data and Applications Security and Privacy XXXIX, vol. 15722, pp. 116-133
Large language models (LLMs) have emerged as powerful tools for retrieving knowledge through seamless, human-like interactions. Despite their advanced text generation capabilities, LLMs exhibit hallucination tendencies, generating factually incorrect statements and fabricating knowledge, which undermines their reliability and trustworthiness. Multiple studies have explored methods to evaluate LLM uncertainty and detect hallucinations, but existing approaches are often probabilistic and computationally expensive, limiting their practical applicability.
In this paper, we introduce diversion decoding, a novel method for developing an LLM uncertainty heuristic by actively challenging model-generated responses during the decoding phase. Through diversion decoding, we extract features that capture the LLM’s resistance to producing alternative answers and use these features to train a machine-learning model that yields a heuristic measure of the LLM’s uncertainty. Our experimental results demonstrate that diversion decoding outperforms existing methods at significantly lower computational cost, making it an efficient and robust solution for hallucination detection.
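To make the abstract's idea concrete, the following is a minimal, hypothetical sketch of how a "resistance to alternative answers" feature could be computed with a Hugging Face causal language model. It is not the authors' implementation: the model name (gpt2), the function resistance_features, and the margin-based features are illustrative assumptions, meant only to show the kind of decoding-time signal that could feed a downstream machine-learning model.

```python
# Hypothetical sketch of a diversion-style uncertainty feature (not the paper's
# method): for each token of the model's own answer, compare its log-probability
# against the strongest competing token. Small margins mean the model is easily
# "diverted" to an alternative answer, which we treat as a sign of uncertainty.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, chosen only for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def resistance_features(prompt: str, answer: str) -> dict:
    """Return simple per-answer features: the mean and minimum margin between
    each chosen answer token and its best alternative at that decoding step."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    answer_ids = tokenizer(answer, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, answer_ids], dim=-1)

    with torch.no_grad():
        logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

    margins = []
    offset = prompt_ids.shape[-1]
    for i in range(answer_ids.shape[-1]):
        # Logits at position offset+i-1 predict the answer token at offset+i.
        log_probs = torch.log_softmax(logits[0, offset + i - 1], dim=-1)
        chosen = answer_ids[0, i]
        chosen_lp = log_probs[chosen]
        # Best competing ("diversion") token other than the one actually chosen.
        masked = log_probs.clone()
        masked[chosen] = float("-inf")
        margins.append((chosen_lp - masked.max()).item())

    return {
        "mean_margin": sum(margins) / len(margins),
        "min_margin": min(margins),
    }


# Example: features like these could be fed to a lightweight classifier that
# produces the uncertainty heuristic described in the abstract.
print(resistance_features("The capital of France is", " Paris"))
```

Under this reading, a low margin indicates that an alternative continuation is nearly as attractive as the model's own answer; the paper's actual feature set, diversion procedure, and classifier may differ from this sketch.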
Publication:
https://doi.org/10.1007/978-3-031-96590-6_7
Preprint (pdf)
Supplemental Material:
None available
Document History:
06/24/25: Conference Paper (Final)