3 resources

  • Jon Saad-Falcon, Omar Khattab, Christoph... | Dec 14th, 2024 | preprint

    Evaluating retrieval-augmented generation (RAG) systems traditionally relies on hand annotations for input queries, passages to retrieve, and responses to generate. We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems along the dimensions of context relevance, answer faithfulness, and answer relevance. By creating its own synthetic training data, ARES finetunes lightweight LM judges to assess the quality of individual RAG components. To mitigate potential...

  • Rishi Bommasani, Drew A. Hudson, Ehsan A... | Dec 14th, 2021 | journalArticle

    AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical...

  • Rishi Bommasani, Drew A. Hudson, Ehsan A... | Jul 12th, 2022 | preprint

    AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical...

Last update from database: 14/12/2025, 20:15 (UTC)