The Llama 3 Herd of Models

Article Status
Published
Authors/contributors
Title
The Llama 3 Herd of Models
Abstract
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Repository
arXiv
Archive ID
arXiv:2407.21783
Date
2024-08-15
Accessed
01/10/2024, 14:01
Library Catalogue
Extra
arXiv:2407.21783 [cs] <AI Smry>: It is found that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks, and performs competitively with the state-of-the-art on image, video, and speech recognition tasks.
Citation
Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Yang, A., Fan, A., Goyal, A., Hartshorn, A., Yang, A., Mitra, A., Sravankumar, A., Korenev, A., Hinsvark, A., Rao, A., Zhang, A., … Zhao, Z. (2024). The Llama 3 Herd of Models (arXiv:2407.21783). arXiv. https://doi.org/10.48550/arXiv.2407.21783
Technical methods
Powered by Zotero and Kerko.