One LLM is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering
Article Status
Published
Authors/contributors
- Yang, Han (Author)
- Li, Mingchen (Author)
- Zhou, Huixue (Author)
- Xiao, Yongkang (Author)
- Fang, Qian (Author)
- Zhang, Rui (Author)
Title
One LLM is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering
Abstract
To enhance the accuracy and reliability of diverse medical question-answering (QA) tasks and to investigate efficient approaches to deploying large language model (LLM) technologies, we developed a novel ensemble learning pipeline that utilizes state-of-the-art LLMs, focusing on improving performance across diverse medical QA datasets.
Materials and Methods: Our study employs three medical QA datasets: PubMedQA, MedQA-USMLE, and MedMCQA, each presenting unique challenges in biomedical question answering. The proposed LLM-Synergy framework, which focuses exclusively on zero-shot use of LLMs, incorporates two primary ensemble methods. The first is a boosting-based weighted majority vote ensemble, in which decision-making is refined by assigning variable weights to different LLMs through a boosting algorithm. The second is cluster-based dynamic model selection, which dynamically selects the most suitable LLM votes for each query, based on the characteristics of the question context, using a clustering approach.
Results: Both the weighted majority vote and dynamic model selection methods outperform the individual LLMs across the three medical QA datasets. With the weighted majority vote, the accuracies are 35.84%, 96.21%, and 37.26% for MedMCQA, PubMedQA, and MedQA-USMLE, respectively. Dynamic model selection yields slightly higher accuracies of 38.01%, 96.36%, and 38.13%.
Conclusion: The LLM-Synergy framework with its two ensemble methods represents a significant advancement in leveraging LLMs for medical QA tasks and provides an innovative way of efficiently utilizing developments in LLM technologies, customizable for both existing and potential future challenge tasks in biomedical and health informatics research.
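The record does not include any implementation details beyond the abstract. The following is a minimal sketch, not taken from the paper, of how the two ensemble ideas described above could be implemented; all names (weighted_majority_vote, fit_cluster_selector, dynamic_select, val_correct, etc.), the weighting scheme, and the use of KMeans are hypothetical illustrations rather than the authors' actual method.

```python
# Hypothetical sketch of the two ensemble ideas from the abstract:
# (1) a weighted majority vote over LLM answers, (2) cluster-based
# dynamic model selection using question embeddings.
from collections import defaultdict
import numpy as np
from sklearn.cluster import KMeans

def weighted_majority_vote(votes, weights):
    """Combine answers from several LLMs using per-model weights.

    votes:   list of answer strings, one per LLM
    weights: list of floats, one per LLM (e.g., tuned on a validation
             split so that stronger models count more)
    """
    scores = defaultdict(float)
    for answer, w in zip(votes, weights):
        scores[answer] += w
    return max(scores, key=scores.get)

def fit_cluster_selector(val_embeddings, val_correct, n_clusters=5, seed=0):
    """Cluster validation questions and pick the best LLM per cluster.

    val_embeddings: (n_questions, dim) array of question embeddings
    val_correct:    (n_questions, n_models) boolean array, True where a
                    model answered that validation question correctly
    Returns the fitted KMeans model and a cluster -> model-index map.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(val_embeddings)
    best_model = {}
    for c in range(n_clusters):
        mask = labels == c
        acc = val_correct[mask].mean(axis=0)  # per-model accuracy in cluster
        best_model[c] = int(np.argmax(acc))
    return km, best_model

def dynamic_select(question_embedding, votes, km, best_model):
    """Route a new question to the model that did best on similar questions."""
    cluster = int(km.predict(question_embedding.reshape(1, -1))[0])
    return votes[best_model[cluster]]
```

At inference time one would collect each LLM's zero-shot answer for a question, then either call weighted_majority_vote with the learned weights or embed the question and call dynamic_select; the abstract's reported gains come from these ensemble decisions rather than from any single model.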
Date
2023-12-24
Accessed
02/07/2024, 18:13
Short Title
One LLM is not Enough
Language
en
Library Catalogue
Health Informatics
Extra
<Title>: One Large Language Model is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering
<AI Smry>: The proposed LLM-Synergy framework with two ensemble methods represents a significant advancement in leveraging LLMs for medical QA tasks and provides an innovative way of efficiently utilizing developments in LLM technologies, customizable for both existing and potential future challenge tasks in biomedical and health informatics research.
Citation Key: yang2023
Citation
Yang, H., Li, M., Zhou, H., Xiao, Y., Fang, Q., & Zhang, R. (2023). One LLM is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering. https://doi.org/10.1101/2023.12.21.23300380