---
title: Zeno Evals Hub
emoji: 🏃
colorFrom: pink
colorTo: indigo
sdk: docker
pinned: false
license: mit
fullWidth: true
---
# Zeno + OpenAI Evals
![Github Actions CI tests](https://github.com/zeno-ml/zeno-openai-evals/actions/workflows/test.yml/badge.svg)
[![MIT license](https://img.shields.io/badge/License-MIT-blue.svg)](https://lbesson.mit-license.org/)
[![Discord](https://img.shields.io/discord/1086004954872950834)](https://discord.gg/km62pDKAkE)
OpenAI's [Evals library](https://github.com/openai/evals) is a great resource providing evaluation sets for LLMs.
This repo provides a hub for exploring these results using the [Zeno](https://zenoml.com) evaluation tool.
## Add New Evals
To add new evals, add a new entry to `evals/evals.yaml` with the following fields (a sketch of an entry follows the list):
- `results-file`: The first `.jsonl` result from `oaievals`
- `link`: A link to the evals commit for this evaluation
- `description`: A succinct description of what the evaluation is testing
- `second-results-file`: An optional second `.jsonl` result from `oaievals` to compare against. It must use the same dataset as the first one.
- `functions-file`: An optional Python file with [Zeno functions](https://zenoml.com/docs/api) for the evaluations.
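
For example, a new entry might look roughly like the sketch below. All file names, the link, and the description are placeholders, and the exact top-level layout should follow the existing entries in `evals/evals.yaml`:

```yaml
# Hypothetical entry; all values below are placeholders.
- results-file: results/my-eval.jsonl
  link: https://github.com/openai/evals/commit/<commit-hash>
  description: Checks whether the model answers simple arithmetic questions correctly.
  second-results-file: results/my-eval-second-run.jsonl # optional
  functions-file: functions/my_eval_functions.py # optional
```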
Make sure you test your evals locally before submitting a PR!
### Running
```bash
poetry install
python -m zeno-evals-hub evals/evals.yaml
```