Meet RedPajama: An AI Project to Create Fully Open-Source Large Language Models Beginning with the Release of a 1.2 Trillion Token Dataset

Article Status
Published
Author/contributor
Singh, N.
Title
Meet RedPajama: An AI Project to Create Fully Open-Source Large Language Models Beginning with the Release of a 1.2 Trillion Token Dataset
Abstract
The most advanced foundation models for AI are only partially open-source and are only available through commercial APIs. This restricts their use and limits research and customization. However, a project called RedPajama now aims to create leading, fully open-source models. The first step of this project, reproducing the LLaMA training dataset, has been completed. Open-source models have made significant progress recently, and AI is experiencing a moment similar to the Linux movement. Stable Diffusion demonstrated that open-source models could compete with commercial offerings and encourage creativity through community participation. A similar movement has now emerged around large language models, with ...
Blog Title
MarkTechPost
Date
21/04/2023, 05:00
Accessed
26/04/2023, 16:16
Language
en-US
Short Title
Meet RedPajama
Citation
Singh, N. (2023, April 21). Meet RedPajama: An AI Project to Create Fully Open-Source Large Language Models Beginning with the Release of a 1.2 Trillion Token Dataset. MarkTechPost. https://www.marktechpost.com/2023/04/21/meet-redpajama-an-ai-project-to-create-fully-open-source-large-language-models-beginning-with-the-release-of-a-1-2-trillion-token-dataset/