Chainpoll: A high efficacy method for LLM hallucination detection

Article Status
Published
Authors/contributors
Friel, R.; Sanyal, A.
Title
Chainpoll: A high efficacy method for LLM hallucination detection
Abstract
Large language models (LLMs) have experienced notable advancements in generating coherent and contextually relevant responses. However, hallucinations - incorrect or unfounded claims - are still prevalent, prompting the creation of automated metrics to detect them in LLM outputs. Our contributions include introducing ChainPoll, an innovative hallucination detection method that excels compared to its counterparts, and unveiling RealHall, a refined collection of benchmark datasets for assessing the hallucination detection metrics proposed in recent studies. While creating RealHall, we assessed tasks and datasets from previous hallucination detection studies and observed that many are not suitable for the potent LLMs currently in use. To address this, we selected four datasets that remain challenging for modern LLMs and are pertinent to real-world scenarios. Using RealHall, we conducted a comprehensive comparison of ChainPoll with numerous hallucination detection metrics from recent studies. Our findings indicate that ChainPoll outperforms alternative metrics across all RealHall benchmarks, achieving an overall AUROC of 0.781. This surpasses the next best theoretical method by 11% and exceeds industry standards by over 23%. Additionally, ChainPoll is cost-effective and offers greater transparency than other metrics. We also introduce two novel metrics for assessing LLM hallucinations: Adherence and Correctness. Adherence is relevant to Retrieval-Augmented Generation workflows, evaluating an LLM's reasoning over the documents and context it is given, while Correctness identifies logical and reasoning errors.
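The abstract describes ChainPoll only at a high level. Below is a minimal, illustrative sketch of the polling-and-aggregation idea the method's name points to: prompt an LLM judge several times with a chain-of-thought hallucination-detection prompt and report the fraction of "yes" verdicts as the score. The prompt wording, the `ask_llm` callable, and the default of five polls are assumptions made for illustration, not details taken from this record.

```python
# Illustrative ChainPoll-style scoring sketch (assumptions noted above):
# poll an LLM judge n times and average the binary "hallucinated?" verdicts.
from typing import Callable

# Hypothetical judge prompt; the paper's actual prompt is not given in this record.
JUDGE_PROMPT = (
    "Does the following completion contain hallucinations?\n"
    "Think step by step, then answer 'yes' or 'no' on the last line.\n\n"
    "Context:\n{context}\n\nCompletion:\n{completion}\n"
)

def chainpoll_score(
    context: str,
    completion: str,
    ask_llm: Callable[[str], str],  # hypothetical: sends a prompt, returns the model's reply
    n_polls: int = 5,
) -> float:
    """Return the fraction of polls in which the judge answered 'yes' (hallucinated)."""
    prompt = JUDGE_PROMPT.format(context=context, completion=completion)
    yes_votes = 0
    for _ in range(n_polls):
        reply = ask_llm(prompt)
        # Take the verdict from the final line of the chain-of-thought answer.
        verdict = reply.strip().splitlines()[-1].lower()
        if "yes" in verdict:
            yes_votes += 1
    return yes_votes / n_polls

if __name__ == "__main__":
    # Stub judge that always answers "no", just to show the call shape.
    score = chainpoll_score("2 + 2 = 4", "The answer is 5.", lambda p: "Reasoning...\nno")
    print(score)  # 0.0 with the stub judge
```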
Repository
arXiv
Archive ID
arXiv:2310.18344
Date
2023-10-22
Accessed
06/05/2024, 21:41
Short Title
Chainpoll
Library Catalogue
Extra
arXiv:2310.18344 [cs] <Title>: Chainpoll: A high efficacy method for LLM hallucination detection Citation Key: friel2023
Citation
Friel, R., & Sanyal, A. (2023). Chainpoll: A high efficacy method for LLM hallucination detection (arXiv:2310.18344). arXiv. http://arxiv.org/abs/2310.18344