Critique Models (CM) on the 🤗 Hub

Collection by alvarobartt

updated Sep 2

This collection contains some Critique Models (CM) for LLM evaluation available on the Hugging Face Hub.


openbmb/UltraCM-13b

Text Generation • Updated Oct 14, 2023 • 13 • 18

Note: UltraCM was released by OpenBMB as part of their UltraFeedback paper, and was trained on GPT-4 critiques and scores.

prometheus-eval/prometheus-7b-v1.0

Text2Text Generation • Updated Oct 14, 2023 • 186 • 30

Note: KAIST AI released Prometheus, a Critique Model (CM), as an alternative to using OpenAI's GPT-4 as a critique model, which is expensive and may have intra-model bias. This model is a fine-tune of `meta-llama/Llama-2-7b-chat-hf`; given an instruction, a response, and a reference answer, it generates the `feedback` and `score` for the response.

prometheus-eval/prometheus-13b-v1.0

Text2Text Generation • Updated Oct 14, 2023 • 4.31k • 127

Note: KAIST AI released Prometheus, a Critique Model (CM), as an alternative to using OpenAI's GPT-4 as a critique model, which is expensive and may have intra-model bias. This model is a fine-tune of `meta-llama/Llama-2-13b-chat-hf`; given an instruction, a response, and a reference answer, it generates the `feedback` and `score` for the response.
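
The input/output contract described in the Prometheus notes (instruction, response, and reference answer in; `feedback` and `score` out) can be sketched in plain Python. This is only an illustration. The prompt layout in `build_prompt`, the `[RESULT]` separator, and the 1 to 5 score range are assumptions not stated in this collection, and both function names are hypothetical. Check the model card for the exact template before using anything like this.

```python
import re
from typing import NamedTuple


class Critique(NamedTuple):
    feedback: str
    score: int


def build_prompt(instruction: str, response: str, reference_answer: str) -> str:
    """Join the three inputs into one prompt (hypothetical layout, not the official template)."""
    return (
        "###Instruction to evaluate:\n" + instruction + "\n\n"
        "###Response to evaluate:\n" + response + "\n\n"
        "###Reference answer:\n" + reference_answer + "\n\n"
        "###Feedback: "
    )


def parse_critique(generated: str) -> Critique:
    """Split '<feedback> [RESULT] <score>' into its two parts.

    Assumes the generated text ends with '[RESULT]' followed by an integer from 1 to 5.
    """
    match = re.search(r"^(.*)\[RESULT\]\s*([1-5])\s*$", generated, flags=re.DOTALL)
    if match is None:
        raise ValueError("no '[RESULT] <score>' marker found in the output")
    return Critique(feedback=match.group(1).strip(), score=int(match.group(2)))
```

In practice, the string returned by `build_prompt` would be passed to whatever text-generation backend is serving the model, and the generated continuation would be handed to `parse_critique`.
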
prometheus-eval/prometheus-7b-v2.0

Text2Text Generation • Updated Aug 14 • 23.5k • 82

Note: KAIST AI (now under the `prometheus-eval` org) recently released Prometheus 2. This model is the `mistralai/Mistral-7B-Instruct-v0.2` fine-tune; it generates both `feedback` and `score`, and supports either absolute grading (a single response) or relative grading (two responses), each with or without a reference answer, whereas Prometheus 1 always required a reference answer.

prometheus-eval/prometheus-8x7b-v2.0

Text2Text Generation • Updated May 3 • 5.13k • 44

Note: KAIST AI (now under the `prometheus-eval` org) recently released Prometheus 2. This model is the `mistralai/Mixtral-8x7B-v0.1` fine-tune; it generates both `feedback` and `score`, and supports either absolute grading (a single response) or relative grading (two responses), each with or without a reference answer, whereas Prometheus 1 always required a reference answer.
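
The two Prometheus 2 grading modes can be told apart by how many responses are supplied, with the reference answer left optional. The sketch below models only that choice. `EvalRequest` and `parse_result` are hypothetical names, and the `[RESULT]` marker, the 1 to 5 scale for absolute grading, and the `A`/`B` verdict for relative grading are assumptions that should be checked against the model card.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Union


@dataclass
class EvalRequest:
    instruction: str
    responses: Sequence[str]
    reference_answer: Optional[str] = None  # optional for Prometheus 2

    @property
    def mode(self) -> str:
        """One response means absolute grading, two mean relative grading."""
        if len(self.responses) == 1:
            return "absolute"
        if len(self.responses) == 2:
            return "relative"
        raise ValueError("expected one or two responses")


def parse_result(generated: str, mode: str) -> Union[int, str]:
    """Read the verdict after '[RESULT]': an int (absolute) or 'A'/'B' (relative)."""
    verdict_pattern = "[1-5]" if mode == "absolute" else "[AB]"
    match = re.search(r"\[RESULT\]\s*(" + verdict_pattern + r")\s*$", generated)
    if match is None:
        raise ValueError("no valid verdict found for mode " + repr(mode))
    verdict = match.group(1)
    return int(verdict) if mode == "absolute" else verdict
```

Keeping the mode as a derived property rather than a separate argument avoids requests where the declared mode and the number of responses disagree.
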
