---
language:
- nl
license: mit
tags:
- trl
- fietje
- alignment-handbook
- sft
base_model: BramVanroy/fietje-2
datasets:
- BramVanroy/ultrachat_200k_dutch
- BramVanroy/no_robots_dutch
- BramVanroy/belebele_dutch
pipeline_tag: text-generation
inference: false
model-index:
- name: fietje-2-instruct
results: []
---
<p align="center" style="margin:0;padding:0">
<img src="https://huggingface.co/BramVanroy/fietje-2-instruct/resolve/main/img/fietje-2b-banner-rounded.png" alt="Fietje banner" width="800" style="margin-left:auto; margin-right:auto; display:block"/>
</p>
<div style="margin:auto; text-align:center">
<h1 style="margin-bottom: 0">Fietje 2 Instruct</h1>
<em>An open and efficient LLM for Dutch</em>
</div>
<blockquote class="tip" style="padding: 1.5em; border: 0">
<p align="center" style="text-align: center; margin: 0">
<a rel="nofollow" href="https://huggingface.co/BramVanroy/fietje-2">👱♀️ Base version</a> -
<a rel="nofollow" href="https://huggingface.co/BramVanroy/fietje-2-instruct">🤖 Instruct version</a> (this one) -
<a rel="nofollow" href="https://huggingface.co/BramVanroy/fietje-2-chat">💬 Chat version</a> -
<a rel="nofollow" href="https://huggingface.co/BramVanroy/fietje-2-chat-GGUF">🚀 GGUF of Instruct</a>
</p>
<p align="center" style="text-align: center; margin: 0">
<a href="https://huggingface.co/spaces/BramVanroy/fietje-2b"><strong>Chat with Fietje here!</strong></a>
</p>
</blockquote>
This is the instruct version of Fietje, an SFT-tuned (instruction-tuned) variant of [the base model](https://huggingface.co/BramVanroy/fietje-2). Fietje is an adapted version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2), tailored to Dutch text generation by continued training on 28B tokens of Dutch text. At 2.7 billion parameters it is small and efficient, yet it performs almost on par with more powerful Dutch LLMs of more than twice its size, such as [GEITje 7B Ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra).
A thorough description of the creation and evaluation of Fietje as well as usage examples are available in [this Github repository](https://github.com/BramVanroy/fietje).
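The snippet below is a minimal usage sketch, not taken from the card itself: it assumes the tokenizer ships a chat template (standard for alignment-handbook SFT models) and that you have a GPU with enough memory for the 2.7B model in bfloat16. The [GitHub repository](https://github.com/BramVanroy/fietje) contains the author's own usage examples.

```python
# Minimal sketch: prompting Fietje 2 Instruct with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BramVanroy/fietje-2-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Apply the tokenizer's chat template (assumed present) to a single user turn.
messages = [{"role": "user", "content": "Wat is de hoofdstad van Friesland?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```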
## Intended uses & limitations
The same limitations as [phi-2](https://huggingface.co/microsoft/phi-2#limitations-of-phi-2), and LLMs in general, apply here. LLMs hallucinate, make mistakes, and should not be trusted. Use at your own risk!
## Training and evaluation data
Fietje 2 Instruct was finetuned from [the base model](https://huggingface.co/BramVanroy/fietje-2) on the following datasets. The number of training samples per dataset is given in parentheses, for a total of 201,579 samples; a quick way to load and inspect the data is sketched after the list.
- [BramVanroy/ultrachat_200k_dutch](https://huggingface.co/datasets/BramVanroy/ultrachat_200k_dutch): gpt-4-1106-preview; multi-turn; fully generated (192,598)
- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch): gpt-4-1106-preview; prompts translated, answers generated; some items have system messages (8,181)
- [BramVanroy/belebele_dutch](https://huggingface.co/datasets/BramVanroy/belebele_dutch): Dutch portion of [belebele](https://huggingface.co/datasets/facebook/belebele), formatted into SFT format (800)
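For orientation, here is a hedged sketch of inspecting one of these datasets with the `datasets` library. The `train_sft` split name and the `messages` column are assumptions based on the alignment-handbook convention; check the dataset cards for the actual schema.

```python
# Sketch: loading and inspecting the SFT data.
# Split name "train_sft" and column "messages" are assumed
# (alignment-handbook convention); verify on the dataset card.
from datasets import load_dataset

ds = load_dataset("BramVanroy/ultrachat_200k_dutch", split="train_sft")
print(ds)                     # features and number of rows
print(ds[0]["messages"][:2])  # first two turns of the first conversation
```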
## Training procedure
I am thankful to the [Flemish Supercomputer Center](https://www.vscentrum.be/) (VSC) for providing the computational power to accomplish this project. Including time spent waiting for jobs, training took around a day on four nodes of 4x A100 80GB each (16 GPUs in total). I can no longer find the exact runtime, and I do not think that the runtime in `all_results.json` accounts for interrupted-and-continued jobs.
Training was done with the wonderful [alignment-handbook](https://github.com/huggingface/alignment-handbook), using DeepSpeed as a back-end. Exact training recipes and SLURM script are given in the [Github repository](https://github.com/BramVanroy/fietje).
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-05
- train_batch_size: 42
- eval_batch_size: 42
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- total_train_batch_size: 672
- total_eval_batch_size: 672
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-07
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
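The authoritative recipes are the YAML files in the [GitHub repository](https://github.com/BramVanroy/fietje); purely as an illustration (an assumption, not the author's actual config), the values above map onto `transformers.TrainingArguments` roughly as follows. With 42 samples per device on 16 GPUs and no gradient accumulation, the effective batch size works out to 42 × 16 = 672.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# This is a sketch, not the recipe actually used for Fietje.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fietje-2-instruct",
    learning_rate=6e-5,
    per_device_train_batch_size=42,   # x 16 GPUs = 672 effective
    per_device_eval_batch_size=42,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-7,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=3.0,
    bf16=True,  # assumption: typical mixed precision on A100s
)
```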
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.9325 | 1.0 | 178 | 0.9060 |
| 0.8687 | 2.0 | 356 | 0.8850 |
| 0.8385 | 3.0 | 534 | 0.8818 |
### Framework versions
- Transformers 4.39.1
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
These results are for the (English-language) Open LLM Leaderboard; for results specific to Dutch, check out [ScandEval](https://scandeval.com/dutch-nlg/). Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_BramVanroy__fietje-2-instruct).
| Metric |Value|
|-------------------|----:|
|Avg. |10.20|
|IFEval (0-Shot) |27.90|
|BBH (3-Shot) |17.57|
|MATH Lvl 5 (4-Shot)| 0.53|
|GPQA (0-shot) | 0.00|
|MuSR (0-shot) | 2.91|
|MMLU-PRO (5-shot) |12.26|