Kukedlc committed on
Commit 7f39c11
1 Parent(s): 1b1cf19

Update README.md

Files changed (1)
  1. README.md +59 -54
README.md CHANGED
@@ -1,67 +1,72 @@
  ---
  license: apache-2.0
  ---

- Model trained with DPO

- Merge of two models

- Training code:

- # LoRA configuration
- peft_config = LoraConfig(
-     r=16,
-     lora_alpha=16,
-     lora_dropout=0.05,
-     bias="none",
-     task_type="CAUSAL_LM",
-     target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
- )

- # Model to fine-tune
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype=torch.float16,
-     load_in_4bit=True
- )
- model.config.use_cache = False

- # Reference model
- ref_model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype=torch.float16,
-     load_in_4bit=True
- )

- # Training arguments
- training_args = TrainingArguments(
-     per_device_train_batch_size=4,
-     gradient_accumulation_steps=4,
-     gradient_checkpointing=True,
-     learning_rate=5e-5,
-     lr_scheduler_type="cosine",
-     max_steps=200,
-     save_strategy="no",
-     logging_steps=1,
-     output_dir=new_model,
-     optim="paged_adamw_32bit",
-     warmup_steps=100,
-     bf16=True,
-     report_to="wandb",
- )

- # Create DPO trainer
- dpo_trainer = DPOTrainer(
-     model,
-     ref_model,
-     args=training_args,
-     train_dataset=dataset,
-     tokenizer=tokenizer,
-     peft_config=peft_config,
-     beta=0.1,
-     max_prompt_length=1024,
-     max_length=1536,
  )

- # Fine-tune model with DPO
- dpo_trainer.train()
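
The removed snippet presumes imports and objects (`model_name`, `new_model`, `dataset`, `tokenizer`) defined elsewhere in the notebook. A minimal preamble that would make it runnable might look like this, assuming `trl`'s `DPOTrainer` and `peft`'s `LoraConfig`; the checkpoint name and pad-token choice are illustrative assumptions, not part of the original:

```python
# Hypothetical preamble for the DPO snippet above; concrete names are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig
from trl import DPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base checkpoint
new_model = "NeuTrixOmniBe-DPO"           # output directory used by TrainingArguments

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # DPO training needs a pad token

# `dataset` must expose 'prompt', 'chosen', and 'rejected' columns;
# a concrete example with Intel/orca_dpo_pairs appears further down.
```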
 
 
  ---
+ tags:
+ - merge
+ - mergekit
+ - lazymergekit
+ - mlabonne/OmniBeagle-7B
+ - flemmingmiguel/MBX-7B-v3
+ - AiMavenAi/AiMaven-Prometheus
+ base_model:
+ - mlabonne/OmniBeagle-7B
+ - flemmingmiguel/MBX-7B-v3
+ - AiMavenAi/AiMaven-Prometheus
  license: apache-2.0
  ---

+ # NeuTrixOmniBe-DPO

+ NeuTrixOmniBe-DPO is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ ```yaml
+ MODEL_NAME = "NeuTrixOmniBe-DPO"
+ yaml_config = """
+ slices:
+   - sources:
+       - model: CultriX/NeuralTrix-7B-dpo
+         layer_range: [0, 32]
+       - model: paulml/OmniBeagleSquaredMBX-v3-7B-v2
+         layer_range: [0, 32]
+ merge_method: slerp
+ base_model: CultriX/NeuralTrix-7B-dpo
+ parameters:
+   t:
+     - filter: self_attn
+       value: [0, 0.5, 0.3, 0.7, 1]
+     - filter: mlp
+       value: [1, 0.5, 0.7, 0.3, 0]
+     - value: 0.5
+ dtype: bfloat16
+ """
+ ```

+ It was then trained with DPO using:
+ * Intel/orca_dpo_pairs
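
A sketch of how that preference dataset might be prepared for `trl`'s `DPOTrainer`; the prompt construction below is an assumption for illustration, not the author's exact preprocessing:

```python
from datasets import load_dataset

# Intel/orca_dpo_pairs ships 'system', 'question', 'chosen', 'rejected' columns;
# DPOTrainer expects 'prompt', 'chosen', and 'rejected'.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.map(
    lambda row: {"prompt": row["system"] + "\n" + row["question"]},
    remove_columns=["system", "question"],
)
```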
 
 
 
 
 
+ ## 🧩 Configuration
 
 
 
 
 
+
+ ## 💻 Usage
+
+ ```python
+ !pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "Kukedlc/NeuTrixOmniBe-DPO"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
  )

+ outputs = pipeline(prompt, max_new_tokens=128, do_sample=True, temperature=0.5, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```