Technical methods

1 resource

  • Tim Dettmers, Artidoro Pagnoni, Ari Holt... | May 23rd, 2023 | preprint

    We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quantized pretrained language model into Low Rank Adapters (LoRA). Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while...
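
    The abstract's core recipe, gradients flowing through a frozen 4-bit quantized base model into trainable LoRA weights, can be sketched with the Hugging Face transformers, peft, and bitsandbytes libraries. The sketch below is illustrative only: the checkpoint name, LoRA hyperparameters, and target modules are assumptions, not the paper's exact configuration.

        import torch
        from transformers import AutoModelForCausalLM, BitsAndBytesConfig
        from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

        # Load the pretrained weights quantized to 4 bits (NF4 with double
        # quantization, as the paper describes); compute runs in bfloat16.
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
        model = AutoModelForCausalLM.from_pretrained(
            "huggyllama/llama-7b",  # assumed example checkpoint, not the paper's 65B model
            quantization_config=bnb_config,
            device_map="auto",
        )

        # Freeze the quantized base model and attach trainable low-rank
        # adapters; gradients backpropagate through the frozen 4-bit weights
        # into the LoRA matrices, which are the only parameters that update.
        model = prepare_model_for_kbit_training(model)
        lora_config = LoraConfig(
            r=64,
            lora_alpha=16,
            lora_dropout=0.05,
            target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed set
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, lora_config)
        model.print_trainable_parameters()  # adapter weights only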
