
Model Details


Model Description

This model is based on teknium/OpenHermes-2.5-Mistral-7B, DPO fine-tuned on the H4rmony_dpo dataset. Its completions should be more ecologically aware than those of the base model.

Developed by: Jorge Vallego
Funded by: Neovalle Ltd.
Shared by: airesearch@neovalle.co.uk
Model type: mistral
Language(s) (NLP): Primarily English
License: MIT
Finetuned from model: teknium/OpenHermes-2.5-Mistral-7B
Methodology: DPO

Uses

Intended as a proof of concept (PoC) to show the effects of the H4rmony_dpo dataset when used for DPO fine-tuning.

Direct Use

For testing purposes, to gain insight that helps with the continuous improvement of the H4rmony_dpo dataset.

Downstream Use

Direct use in applications is not recommended, as this model is under testing for one specific task only (ecological alignment).

Out-of-Scope Use

Not meant to be used for anything other than testing and evaluation of the H4rmony_dpo dataset and ecological alignment.

Bias, Risks, and Limitations

This model might reproduce biases already present in the base model, as well as others unintentionally introduced during fine-tuning.

How to Get Started with the Model

It can be loaded and run in a Colab instance with High RAM.
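The card above only notes that a High-RAM runtime is needed; as a minimal sketch, the model can be loaded with the standard transformers API. The prompt text below is an illustrative example, not part of the card; the model id is taken from this repository.

```python
# Sketch: load and query the model with Hugging Face transformers.
# Assumes a High-RAM / GPU runtime (the model has ~7.24B parameters).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "neovalle/H4rmoniousAnthea"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # loads the FP16 weights as stored
    device_map="auto",    # places layers on the available device(s)
)

prompt = "How can I reduce the environmental impact of my daily commute?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```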

Training Details

Trained using DPO (Direct Preference Optimization).
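For readers unfamiliar with DPO: it optimizes the policy model directly on preference pairs (a chosen and a rejected completion), using log-probability differences against a frozen reference model. The sketch below illustrates the standard DPO loss for a single pair with made-up log-probability values; it is not the actual training code used for this model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    logits = beta * [(log pi(y_w|x) - log pi_ref(y_w|x))
                     - (log pi(y_l|x) - log pi_ref(y_l|x))]
    loss   = -log(sigmoid(logits))
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    logits = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Illustrative values: the policy favours the chosen answer more than
# the reference does, so the loss drops below -log(0.5) ~= 0.693.
loss = dpo_loss(-10.0, -30.0, -15.0, -25.0, beta=0.1)
print(round(loss, 4))  # -> 0.3133
```

Minimizing this loss pushes the policy to assign relatively more probability to the preferred (here, more ecologically aware) completions than the reference model does.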

Training Data

H4rmony Dataset - https://huggingface.co/datasets/neovalle/H4rmony_dpo
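The dataset can be inspected with the Hugging Face `datasets` library; a quick sketch (the split name printed is whatever the repository defines, so nothing beyond the dataset id is assumed here):

```python
# Sketch: download and inspect the H4rmony_dpo dataset from the Hub.
# Requires the `datasets` library and network access.
from datasets import load_dataset

ds = load_dataset("neovalle/H4rmony_dpo")
print(ds)  # shows the available splits, their columns, and row counts
```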

Model size: 7.24B parameters (Safetensors, FP16)
