Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting

Article Status

Published

Authors/contributors

Title

Abstract

There exist both scalable tasks, like reading comprehension and fact-checking, where model performance improves with model size, and unscalable tasks, like arithmetic reasoning and symbolic reasoning, where model performance does not necessarily improve with model size. Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions even on unscalable tasks. Unfortunately, despite their exceptional reasoning abilities, LLMs tend to internalize and reproduce discriminatory societal biases. Whether CoT can provide discriminatory or egalitarian rationalizations for the implicit information in unscalable tasks remains an open question. In this study, we examine the impact of LLMs' step-by-step predictions on gender bias in unscalable tasks. For this purpose, we construct a benchmark for an unscalable task where the LLM is given a list of words comprising feminine, masculine, and gendered occupational words, and is required to count the number of feminine and masculine words. In our CoT prompts, we require the LLM to explicitly indicate whether each word in the word list is a feminine or masculine before making the final predictions. With counting and handling the meaning of words, this benchmark has characteristics of both arithmetic reasoning and symbolic reasoning. Experimental results in English show that without step-by-step prediction, most LLMs make socially biased predictions, despite the task being as simple as counting words. Interestingly, CoT prompting reduces this unconscious social bias in LLMs and encourages fair predictions.

Repository

arXiv

Archive ID

arXiv:2401.15585

Date

2024-01-28

URL

http://arxiv.org/abs/2401.15585

Accessed

08/11/2024, 21:12

Library Catalogue

arXiv.org

Extra

arXiv:2401.15585 [cs] <标题>: 通过链式思维提示评估大型语言模型中的性别偏见 Read_Status: New Read_Status_Date: 2025-11-10T07:26:02.549Z Citation Key: kaneko2024

Citation

Kaneko, M., Bollegala, D., Okazaki, N., & Baldwin, T. (2024). Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting (arXiv:2401.15585). arXiv. http://arxiv.org/abs/2401.15585

Link to this record

https://aievidencehub.org/lib/W73SJJR2