
1 resource

  • Yiqing Xie, Alex Xie, Divyanshu Sheth | Mar 31st, 2024 | preprint

    To facilitate evaluation of code generation systems across diverse scenarios, we present CodeBenchGen, a framework for creating scalable execution-based benchmarks that requires only light guidance from humans. Specifically, we leverage a large language model (LLM) to convert an arbitrary piece of code into an evaluation example, including test cases for execution-based evaluation. We illustrate the usefulness of our framework by creating a dataset, Exec-CSN, which includes 1,931 examples...
