Commit 20542e5
Parent(s): c75b992

Update readme.md

Files changed (1): README.md (+98 -7)
README.md CHANGED
@@ -1,11 +1,102 @@
  ---
  language:
  - ar
- library_name: peft
- base_model: unsloth/llama-3-8b-bnb-4bit
  tags:
- - unsloth
- - llama-3
- - torch
- license: apache-2.0
- ---
  ---
+ license: apache-2.0
  language:
  - ar
+ - en
  tags:
+ - alpaca
+ - llama3
+ - arabic
+ library_name: peft
+ ---
+
+ # 🚀 al-baka-llama3-8b
+
+ [<img src="https://i.ibb.co/fMsBM0M/Screenshot-2024-04-20-at-3-04-34-AM.png" width="150"/>](https://www.omarai.co)
+
+ Al Baka is a fine-tuned model based on the newly released LLaMA3-8B, trained on the Arabic version of the Stanford Alpaca dataset, [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct).
+
+ ## Model Summary
+
+ - **Model Type:** LLaMA3-8B fine-tuned model (LoRA only)
+ - **Language(s):** Arabic, English
+ - **Base Model:** [LLAMA-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
+ - **Dataset:** [Yasbok/Alpaca_arabic_instruct](https://huggingface.co/datasets/Yasbok/Alpaca_arabic_instruct)
+
+ ## Model Details
+
+ - The model was fine-tuned in 4-bit precision using [unsloth](https://github.com/unslothai/unsloth).
+ - The run was performed for only 1,000 steps on a single NVIDIA Tesla T4 GPU (Google Colab) with 15 GB of available memory; a sketch of this kind of run follows below.
+
+ <span style="color:red">This is an experimental fine-tune, meant to assess LLaMA-3's response to Arabic after a brief period of fine-tuning. Larger and more sophisticated models will be introduced soon.</span>
+
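+ For context, here is a minimal sketch of what such a LoRA run looks like with unsloth and TRL's `SFTTrainer`, modeled on unsloth's public example notebooks. The hyperparameters, dataset column names, and formatting function are illustrative assumptions rather than the exact recipe used for this model, and argument names can vary across unsloth/trl versions:
+
+ ```python
+ import torch
+ from unsloth import FastLanguageModel
+ from datasets import load_dataset
+ from trl import SFTTrainer
+ from transformers import TrainingArguments
+
+ # Load the 4-bit base model (see the Setup and Load sections below).
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name = "unsloth/llama-3-8b-bnb-4bit",
+     max_seq_length = 2048,
+     load_in_4bit = True,
+ )
+
+ # Attach LoRA adapters: only these small low-rank matrices are trained,
+ # which is what lets the run fit on a 15 GB T4.
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r = 16,
+     target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
+                       "gate_proj", "up_proj", "down_proj"],
+     lora_alpha = 16,
+     lora_dropout = 0,
+     bias = "none",
+     use_gradient_checkpointing = True,
+ )
+
+ # Turn each Alpaca-style row (instruction/input/output columns assumed)
+ # into one training string, using the same template shown further below.
+ alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ {}
+
+ ### Input:
+ {}
+
+ ### Response:
+ {}"""
+
+ def format_rows(batch):
+     texts = [alpaca_prompt.format(ins, inp, out) + tokenizer.eos_token
+              for ins, inp, out in zip(batch["instruction"], batch["input"], batch["output"])]
+     return {"text": texts}
+
+ dataset = load_dataset("Yasbok/Alpaca_arabic_instruct", split = "train")
+ dataset = dataset.map(format_rows, batched = True)
+
+ trainer = SFTTrainer(
+     model = model,
+     tokenizer = tokenizer,
+     train_dataset = dataset,
+     dataset_text_field = "text",
+     max_seq_length = 2048,
+     args = TrainingArguments(
+         per_device_train_batch_size = 2,
+         gradient_accumulation_steps = 4,
+         max_steps = 1000,  # matches the 1,000-step run described above
+         learning_rate = 2e-4,
+         fp16 = not torch.cuda.is_bf16_supported(),
+         bf16 = torch.cuda.is_bf16_supported(),
+         logging_steps = 10,
+         output_dir = "outputs",
+     ),
+ )
+ trainer.train()
+ ```
+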
+ ## How to Get Started with the Model
+
+ ### Setup
+ ```python
+ %%capture
+ # Install packages (unsloth first, then GPU-appropriate extras).
+ import torch
+ major_version, minor_version = torch.cuda.get_device_capability()
+ !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+ if major_version >= 8:
+     # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
+     !pip install --no-deps packaging ninja einops flash-attn xformers trl peft accelerate bitsandbytes
+ else:
+     # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
+     !pip install --no-deps xformers trl peft accelerate bitsandbytes
+ pass
+ ```
+
+ ### First, Load the Model
+ ```python
+ from unsloth import FastLanguageModel
+ import torch
+ max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
+ dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
+ load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name = "Omartificial-Intelligence-Space/al-baka-16bit-llama3-8b",
+     max_seq_length = max_seq_length,
+     dtype = dtype,
+     load_in_4bit = load_in_4bit,
+     # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
+ )
+ ```
+
+ ### Second, Try the Model
+ ```python
+ alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ {}
+
+ ### Input:
+ {}
+
+ ### Response:
+ {}"""
+
+ # alpaca_prompt is the template defined above
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
+ inputs = tokenizer(
+ [
+     alpaca_prompt.format(
+         "استخدم البيانات المعطاة لحساب الوسيط.", # instruction: "Use the given data to calculate the median."
+         "[2 ، 3 ، 7 ، 8 ، 10]", # input: "[2, 3, 7, 8, 10]"
+         "", # output - leave this blank for generation!
+     )
+ ], return_tensors = "pt").to("cuda")
+
+ outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
+ tokenizer.batch_decode(outputs)
+ ```
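+
+ Note that `tokenizer.batch_decode(outputs)` returns the prompt and the completion together. Here is a minimal sketch for isolating just the generated answer, assuming the Alpaca template above (the split marker is the template's own response header); for this example the correct answer is 7, the median of [2, 3, 7, 8, 10]:
+
+ ```python
+ # Decode the full sequence, then keep only the text after "### Response:".
+ decoded = tokenizer.batch_decode(outputs, skip_special_tokens = True)[0]
+ answer = decoded.split("### Response:")[-1].strip()
+ print(answer)
+ ```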
+
+ ### Recommendations
+
+ - Use [unsloth](https://github.com/unslothai/unsloth) for fine-tuning models: you get up to 2x faster fine-tuning, and the resulting model can be exported to any format or uploaded to Hugging Face; a sketch of the export step follows below.
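+
+ As an illustration of that last point, here is a minimal sketch of saving the trained LoRA adapters and pushing them to the Hub using the standard PEFT/Hugging Face calls. The repo id and token are placeholders; unsloth also ships its own merged-export helpers (e.g. `save_pretrained_merged`), so check its documentation for the current interface:
+
+ ```python
+ # Save the LoRA adapter weights locally (small compared to the full model).
+ model.save_pretrained("al-baka-lora")
+ tokenizer.save_pretrained("al-baka-lora")
+
+ # Upload to the Hugging Face Hub ("your-username/al-baka-lora" is a placeholder repo id).
+ model.push_to_hub("your-username/al-baka-lora", token = "hf_...")
+ tokenizer.push_to_hub("your-username/al-baka-lora", token = "hf_...")
+ ```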