---
library_name: peft
license: cc-by-nc-2.0
language:
- fr
- en
tags:
- pytorch
- llama
- code
---
# Aria 7B V3
We decided to build a V3 of Aria 7B based on Mistral Instruct instead of LLaMA 2. The base model was quantized with QLoRA to reduce its memory footprint and trained on a high-quality French dataset.
Base model: `Mistral-7B-Instruct-v0.1`
## Technical issues fixed & limits of the base model
We noticed that the base model tended to mix French and English when a request was made in French, in some cases but not all of them. This issue was more visible for prompts over 1,000 tokens. By training the base model on our dataset, we fixed this issue so that the model replies in the same language as the question. Fixing this pain point is a valuable upgrade for corporate users in non-English-speaking areas who want to deploy a model with increased quality and accuracy in French.
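As a sketch of how the resulting PEFT adapter could be loaded for French inference, assuming the usual transformers/peft APIs (the adapter id `your-org/aria-7b-v3` below is a placeholder, not the published repo name; the base model id comes from this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization matching the training setup described in this card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

# Placeholder adapter id — replace with the actual Aria 7B V3 adapter repo
model = PeftModel.from_pretrained(base_model, "your-org/aria-7b-v3")

# Mistral Instruct prompt format: the model should answer in French
prompt = "[INST] Explique le principe de la quantification 4 bits. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```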
## Training procedure
The following bitsandbytes quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
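The list above corresponds to the following keyword arguments (these are the standard bitsandbytes option names used by transformers; `bnb_4bit_compute_dtype` becomes `torch.bfloat16` when building an actual `BitsAndBytesConfig`):

```python
# Quantization settings from the list above, as plain keyword arguments.
quantization_kwargs = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",  # torch.bfloat16 in code
}

# Sketch of how they would be applied:
# from transformers import BitsAndBytesConfig
# config = BitsAndBytesConfig(**{**quantization_kwargs,
#                                "bnb_4bit_compute_dtype": torch.bfloat16})
```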
## Framework versions
- PEFT 0.5.0