---
title: BenchBenchTemp
emoji: 👁
colorFrom: yellow
colorTo: blue
sdk: streamlit
sdk_version: 1.36.0
app_file: app.py
pinned: false
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

```
@misc{perlitz2024benchmarkagreementtestingright,
      title={Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation},
      author={Yotam Perlitz and Ariel Gera and Ofir Arviv and Asaf Yehudai and Elron Bandel and Eyal Shnarch and Michal Shmueli-Scheuer and Leshem Choshen},
      year={2024},
      eprint={2407.13696},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.13696},
}
```