2 resources

  • Ted Zadouri, Ahmet Üstün, Arash Ahmadian... | Sep 11th, 2023 | preprint

    The Mixture of Experts (MoE) is a widely known neural architecture where an ensemble of specialized sub-models optimizes overall performance with a constant computational cost. However, conventional MoEs pose challenges at scale due to the need to store all experts in memory. In this paper, we push MoE to the limit. We propose extremely parameter-efficient MoE by uniquely combining MoE architecture with lightweight experts. Our MoE architecture outperforms standard parameter-efficient...
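
    (A minimal code illustration of this mixture-of-lightweight-experts idea appears after the listing below.)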

  • David Ifeoluwa Adelani, Jessica Ojo, Isr... | Jun 5th, 2024 | preprint

    Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g. African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench -- a human-translated benchmark dataset for 16 typologically-diverse low-resource African...
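
The sketch below is a generic illustration of the mixture-of-lightweight-experts idea described in the first entry (Zadouri et al.): a frozen dense layer is augmented with a few low-rank adapter experts whose outputs are combined by a learned soft router, so only the small adapter and router parameters are trained. It is an assumption-laden sketch, not the paper's implementation; the class name, the choice of low-rank adapters, and all hyperparameters are hypothetical.

```python
# Illustrative sketch only: not the cited paper's code. A frozen linear layer is
# combined with a small set of lightweight (low-rank) experts mixed by a learned
# soft router; only the adapters and the router are trainable.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfLightweightExperts(nn.Module):
    """Frozen base linear layer plus a soft mixture of low-rank adapter experts."""

    def __init__(self, d_in: int, d_out: int, num_experts: int = 4, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # base model stays frozen
        self.base.bias.requires_grad_(False)
        # Each expert is a low-rank update; these small matrices and the router
        # are the only trainable parameters.
        self.down = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.02)
        self.up = nn.Parameter(torch.zeros(num_experts, rank, d_out))
        self.router = nn.Linear(d_in, num_experts)  # per-input soft routing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in)
        gates = F.softmax(self.router(x), dim=-1)               # (batch, num_experts)
        # Per-expert low-rank outputs: (batch, num_experts, d_out)
        expert_out = torch.einsum("bi,eir,erd->bed", x, self.down, self.up)
        mixed = torch.einsum("be,bed->bd", gates, expert_out)   # gated combination
        return self.base(x) + mixed


if __name__ == "__main__":
    layer = MixtureOfLightweightExperts(d_in=16, d_out=32)
    out = layer(torch.randn(4, 16))
    print(out.shape)  # torch.Size([4, 32])
```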
